The two talks I gave this summer at the International Workshop on Challenges and Visions in the Social Sciences are now available at videolectures.net.
Not the best talks of my career, and hopefully not the last either. But the guys at VL did a great job recording them.
One of the talks was about tags, and the second about Democracy of the 21st Century.
In the one about Continue reading My first 2 talks available online: Tags & 21st Century Democracy
It is now time to present the next project we have been working on: TagBay. And I say ‘we’ because in this project I am not alone. I did it with a friend of mine, Derek, who very patiently agreed to code some of the ideas I have been tinkering with for the last year or so. I am speaking about tags, tag clouds, the distance between tags, and so on.
So, in brief, we made a web site to tag material that is being sold on e-Bay. Anybody can tag any object that is being sold. Not only can any object be tagged, but you can tag sellers, too (oh, we are not responsible for offensive tags, eh!).
Tags on objects can be made private or public, and you can search among your own tags and among everybody else’s tags. Eventually (when we code it) it will also be possible to search among the tags of another user, like in del.icio.us.
Now that the summary for people with no time is done, let’s explain the idea in detail for those who have a bit more time.
Pages:
On TagBay, right now, there are 5 types of pages: e-Bay Search Page, TagBay Tag Search Page, TagBayUser Tag Search Page, Item Page, and Seller Page.
- Search Page: From inside TagBay it is possible to search e-Bay for specific keywords. The user can then add tags to each object that comes up, and store either all the added tags at once or the tags of a single object. The same thing can be done in the Tag Search Page.
- TagBay Tag Search Page: On this page the user gets all the results for a single tag that someone has used. Nothing fancy (for now). Items where the tag only appears as a private tag will not appear here.
- TagBayUser Tag Search Page: On this page the user gets all the results for a single tag that one particular user has used. If the user is logged in and is looking at his own tags, the items he tagged privately will appear as well.
- Item Page: Each object has its own page. From that page any user can see the public tags that other users have given that object. They can also add their personal tags for the object, choose whether those tags are going to be private, and tag the seller.
- Seller Page: And then there is the seller page, where any user can tag any seller. The use of tags for sellers is still limited, but it will grow in the future.
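The visibility rules of the two tag search pages can be sketched in a few lines of Python. This is only an illustration of the rules as described above, not TagBay's actual code: the `Tagging` record, both function names and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagging:
    """One user putting one tag on one item (hypothetical record, not TagBay's schema)."""
    user: str
    item: str
    tag: str
    private: bool = False


def tag_search(taggings, tag):
    """TagBay Tag Search Page: items that carry `tag` publicly at least once."""
    return sorted({t.item for t in taggings if t.tag == tag and not t.private})


def user_tag_search(taggings, owner, tag, viewer=None):
    """TagBayUser Tag Search Page: items that `owner` tagged with `tag`.

    Private taggings are shown only when the logged-in viewer is the owner.
    """
    show_private = viewer is not None and viewer == owner
    return sorted({
        t.item
        for t in taggings
        if t.user == owner and t.tag == tag and (show_private or not t.private)
    })


taggings = [
    Tagging("anna", "item-1", "lamp"),
    Tagging("anna", "item-2", "lamp", private=True),
    Tagging("bob", "item-2", "vintage"),
]

print(tag_search(taggings, "lamp"))                              # ['item-1']
print(user_tag_search(taggings, "anna", "lamp", viewer="bob"))   # ['item-1']
print(user_tag_search(taggings, "anna", "lamp", viewer="anna"))  # ['item-1', 'item-2']
```

Keeping the private flag on each tagging, rather than on the item, is one way to let the same object be public for one user and private for another.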
The natural use of the site
- For a seller or for a shop: A seller might want to use the site to tag all the objects that he is selling, giving each object all the related tags and thus increasing the chances that it will be found. We suggest listing the tags in order of importance, as we are soon going to use that order to weigh importance in the search page.
Also, if a person wants to make a cool list of objects, they can tag exactly those objects with a tag they have never used, and then link to the page for that tag in their directory, thus creating their list on the spot. Sellers will also want to tag their objects, and people making searches will tag objects to make lists of the ones they want to follow before jumping into a transaction. We think there is more than enough material to generate interesting behaviour. It doesn’t have to be exactly the same emergent behaviour that we are used to seeing. After all, we are just exploring the possibilities of social folksonomy.
- A shop: On top of the possibilities above, a shop that is selling on e-Bay might want to make sure that the shop itself (remember that you can tag sellers, and not only objects) has all the tags related to the merchandise it is selling.
- Someone buying: Our suggestion for someone who wishes to buy on e-Bay would be to first look under the tag search, to see if anybody has already tagged any object they are interested in. This does not necessarily have to be someone else who is buying; it could also be someone who is selling. Then they can tag the objects they are interested in themselves, to have them in their own list of objects. Then they could go to the e-Bay search page with the necessary keywords, and add the chosen tag to all the interesting objects. At that point a first selection has been made, and all the possible objects have been tagged. Now they could choose one of those objects, change the tags to private, and start bidding on it.
- Someone suggesting: And finally, if someone is just trying to suggest some possible objects, he could search e-Bay for those objects, tag them with a unique tag, and present the URL of the list to whoever is interested.
There are many other ways to use TagBay. In a sense TagBay is a toy, and not a game. And like every good toy it can be used for many different games. We suggest only some of them here. Also, TagBay itself is rapidly evolving. We have tons of stuff we are interested in including, and if you have been reading my blog, you know that my problem is always finding people to code my ideas, more than finding the ideas. And this is why I am so happy about Derek’s work!
Difficulties that we found:
A number of issues came up when we started developing this program.
- Public vs private tags:
Why would someone tag an object if they are interested in buying it? After all, aren’t they making it easier for others to find it by adding those tags?
This was a serious doubt that we had, and in the end we decided to let users tag objects privately. Yet there has to be a balance between private tags and public tags, as public tags are necessary to generate the emergent folksonomy that we wish to use. So we settled on a compromise: public tags can be added from the search page, but private tags require you to go to the specific object page. In our view (but we are ready to be proven wrong) someone would go to the search page and tag all the entries he might be interested in, then choose one and tag that one privately.
- Limitations due to the temporary nature of the objects
Considering that most objects exist on e-Bay for only a few weeks before being sold, wouldn’t that be too little time to build a tag cloud and let all the cool emergent properties that folksonomy induces appear?
Maybe, but sellers can also tag the objects they are selling, thus giving every object a head start. Also, side by side with tagging objects, we are making it possible to tag sellers. Those tags should survive each transaction and eventually build up an interesting tag cloud.
- I spoke about sellers tagging their own objects, but wouldn’t this invite people to spam your site? After all, wouldn’t it be much better for a seller to add many tags to be present in many searches?
Ah ha! You think tag clouds can be spammed. This is false. Tag clouds cannot be spammed, and no one understands this. And we shall use this site to prove it. We have nothing against spammers; they are absolutely welcome on our site and can spam it as much as they like, adding all the tags they want to each object they sell. It will make ABSOLUTELY NO DIFFERENCE in the search page. Tag clouds are unspammable, and our engine will use tag clouds as its base. Everybody else uses tag sets, and this makes them easily spammable. So no, we don’t fear spammers. In fact we hope that spammers will come to our TagBay site. They are just people trying to sell their stuff, and we are trying to make sellers meet buyers. Wouldn’t it be bad to single out spammers just because they are spammers?
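To make the difference between a tag set and a tag cloud concrete, here is a small Python sketch. It is an assumption about how such a ranking could work, not the TagBay engine: an item is scored for a tag by the share of its taggers who used that tag, so a seller piling tags on his own object counts as one voice per tag. The function names and the data are made up.

```python
def tag_set_match(item_tags, tag):
    """Tag set: an item matches as soon as anybody used the tag on it."""
    return any(tag in tags for tags in item_tags.values())


def tag_cloud_score(item_tags, tag):
    """Tag cloud: share of the item's taggers who used the tag (assumed scoring)."""
    if not item_tags:
        return 0.0
    users = sum(1 for tags in item_tags.values() if tag in tags)
    return users / len(item_tags)


# Each dict maps a user to the set of tags that user put on one item.
camera = {
    "seller": {"camera", "ipod", "laptop", "phone"},  # the seller adds everything
    "u1": {"camera"},
    "u2": {"camera", "nikon"},
    "u3": {"camera"},
}
ipod = {"u4": {"ipod"}, "u5": {"ipod", "apple"}, "u6": {"ipod"}}

# With tag sets the spammed camera is a full match for "ipod"...
print(tag_set_match(camera, "ipod"))    # True
# ...with a cloud it only gets a quarter of the weight of the real ipod.
print(tag_cloud_score(camera, "ipod"))  # 0.25
print(tag_cloud_score(ipod, "ipod"))    # 1.0
```

The sketch does not prove the claim above; it only shows why a per-user weight dilutes the tags of a single person as soon as other people tag the same object.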
TagBay is obviously still in beta, and there are many things that need to be coded. If you have any idea on how to make it better, please do not hesitate to contact me. If you want to make a difference to what the final product will be, now is the time to do it. Also, every new suggestion we implement should be listed on a special page with a link to the original suggester’s home page.
Some of you might remember my rant when del.icio.us was bought. And some others, who were with me from before, might remember the entries I wrote on tag clouds. Some time later I was contacted by an Italian developer, Fabio Vescarelli, who asked me for some help in developing algorithms to find the distance between users in a del.icio.us-like program. We had an exchange of emails first, and we met in chat some other time. He was building a del.icio.us clone, Smarking, but with some interesting differences. Continue reading Review: Smarking
I think the time has come to write my third, and hopefully last, contribution to the topic of tagclouds.
I have been hearing a lot of talk on how users should not use too many tags when linking to a URL. I am also the maintainer of the mindmap maker, and I often look at some of the maps generated (available to everybody). There are a number of people who tend to use an average of between one and two tags per URL. Their maps are often very ordered. No clustering, no hierarchy. (Forgive me if I don’t put a link to such a map, but since I am going to bash this way of using delicious, I’d rather bash a method than a specific human being. Just go to the list of maps and open a couple; odds are one of them will be of the type I am describing.) This way of using delicious uses tags as folders, just with the modification that every now and then you can put a URL in more than one folder at the same time. A bit like a big bookstore that might carry several copies of the same book and store them in more than one place (and the Tao Te Ching ends up in New Age, God knows why, and in Religion).
Of course tags tend not to fit exactly. My Tag Clouds and Cultural Change will be under Tags or Folksonomy or Sociology… Whatever you choose, you probably will not put it under Ajax. And yet most of the analysis was done studying the spread of the term Ajax.
Let’s make a few simple calculations. Continue reading Tag Clouds are hard to Spam
In the previous post I discussed how we can measure the relative importance of tags in a post, by calculating their weight, as
- weight of tag t= (number of people using t)/(total number of people)
I also said that:
Not only could we study a culture by studying the differences in the power law approximated by the tag clouds used by people of that culture. We could even measure a cultural earthquake by measuring the difference between the tag cloud generated before a certain event and the one generated after it.
Independently, Clay Shirky was coming to a similar conclusion, although he focused more on temporal changes that seem to be the signature of a particular subgroup of people all bookmarking a site at a certain time:
During a period of about 120 users’ additions of OIO, 20 or of them used the tag ‘ia’, putting it between #7 and #10 during that period. Now it is down to #17. This suggests that one or a few IA-oriented sites or mailing lists posted the link, and it got a flurry of attention from those taggers in a narrower window of time. This in turn suggests a conversationally tightly-knit IA community.
But let’s go back to the tag weight. Terrell Russell took the ball, and in one evening of programming presented a tool to actually see how the weights change in time.
Nothing to say about the tool. It works perfectly well, and although it can be enhanced in many little ways, it already is very useful. Not bad for one evening.
More interesting, from my point of view, is how through this tool we can see changes in the culture we are living in. We are used to feeling those changes, but generally we have never been able to measure them. Maybe now we can start to.
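As a rough sketch of what such a measurement could look like, the Python below computes the tag weight defined earlier (people using the tag divided by total people) for two periods and reports how much each weight moved. The function names and the data are invented for the example, and this is not how Terrell Russell's tool is implemented.

```python
from collections import Counter


def tag_weights(bookmarks):
    """weight of tag t = (number of people using t) / (total number of people)."""
    total = len(bookmarks)
    if total == 0:
        return {}
    # set(tags) makes sure a person counts once per tag.
    counts = Counter(tag for tags in bookmarks for tag in set(tags))
    return {tag: n / total for tag, n in counts.items()}


def cloud_shift(before, after):
    """Per-tag change in weight between two periods (after minus before)."""
    w0, w1 = tag_weights(before), tag_weights(after)
    return {t: w1.get(t, 0.0) - w0.get(t, 0.0) for t in set(w0) | set(w1)}


# Each inner list holds the tags one person gave the same URL (made-up data).
before = [["css", "design"], ["css"], ["css", "corners"], ["design"]]
after = [["css", "ajax"], ["ajax"], ["css", "ajax"], ["design", "ajax"]]

shift = cloud_shift(before, after)
for tag in sorted(shift, key=shift.get, reverse=True):
    print(f"{tag:8s} {shift[tag]:+.2f}")
```

A tag cloud that is not changing gives shifts close to zero for every tag; a little social-quake shows up as one or two tags with a large shift.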
No change
First of all I would like to show you the graph of a part of the culture where no changes are happening:
From the site: Nifty Corners. 1859 people have bookmarked it by now. The values soon converge to what we can expect to be their definitive value (for the culture we are in).
Little Social-Quake
Continue reading Tagclouds and cultural changes
Note: This entry is also connected to a mindmap. Some people were having problems opening the page because of that. As such the mindmap has been stored in a separate page, and can be viewed from here.
Introduction
As correctly pointed out by Jeffrey Zeldman, tag clouds are becoming more and more popular. Yet I keep seeing services that should be using tag clouds but keep on using tag sets. It is not just a problem of programming a tool that can only support tag sets, but also of programming tools that might in principle produce tag clouds, but where users are not invited to use a tag if it already exists, and which as such don’t generate a tag cloud.
Examples of the first type of tool are Flickr, 43things, consuMating and tagsurf; an example of the second is the tagged version of the BBC. In all those cases a tag set is used where a tag cloud would be more appropriate. Some of the differences between a tag cloud and a tag set were explained in Vanderwal.net: Explaining and Showing Broad and Narrow Folksonomies. Let’s go through them again, and look at some consequences of those differences, which should clarify when it is better to use one tool and when the other. Continue reading On Tag Clouds, Metric, Tag Sets and Power Laws
Some things are bound to happen. And they tend to happen at the right time. We have been using tags for years now, but the momentum has built up day after day, with more and more computer programs using them. Starting from del.icio.us and flickr. Then 43things.com, consumating.com, tagsurf.com and all the clones of the above (BTW if anybody can find me a small open source server program that emulates Flickr for personal use, I would be grateful). And of course technorati tags, and GutenTag, which gives rss feeds for technorati tags.
But something was missing. Something that some people might have noticed. The news was not playing with tags. News was still presented in the old top-down way: politics, economics, international…
On Google News, as well as CNN. On Yahoo News, as on BBC.
But finally something is starting to move over there too.
Two services were presented pretty much at the same time: Yahoo News with tags and BBC with tags.
But there are some serious differences between the two services. Yahoo content is automatically indexed by a program, which imposes the tags according to the keywords found in the text. As such, Yahoo tags is a top-down keyword classification of stories.
Instead (and here you can see the revolutionary spirit blowing through English news services), the BBC program is a truly bottom-up grassroots program. A program where everybody can add any tag to any article.
The difference is not a minor one: in the first case it is the user who has to adapt to the world view of Yahoo, while in the second it is the BBC that includes the user’s world view in its own wider one. In a sense it is a case of Tagsonomy vs. Folksonomy, or narrow folksonomy vs. broad folksonomy.
Of course both programs are still in their first days. Full of bugs, and of suggestions from us on how to make them better, smoother, and nearer to our personal desires.
Of course letting anybody add any tag to a copy of the BBC content is full of political dangers. What if stories about important politicians start to be tagged as ‘dictator’ or ‘wanker’? This is in fact inevitable, but politicians would do well to use this as an indication of their popularity, rather than as something to be changed.
At the moment anybody can add a tag on the BBC news page by logging in as ‘guest’/‘guest’. And already we have some people who have tagged some stories as ‘wanker’. But if we go to delicious we see that nearly no one has used such an epithet.
Why is that? My personal position is that people are more careful when tagging something for their own personal use. On delicious everybody has an account. And although you could have as many accounts as you like, they cost. They cost time and memory to set up. So we all tend to have just the minimum number of accounts needed. But on the BBC, at the moment, only BBC people are allowed to have their own account. We normal human beings can only be guests. And as such we might feel relieved of responsibility for what we write. So I think that, although the experiment is great, it will only work properly when everybody can set up his own account, and search his account, or the account of another, well-defined person.
Of course this also opens up all sorts of extra possibilities. After all, if anybody can tag any article with his own tags, then each article will end up with a set of tags. What if I want to receive (maybe on my mobile) all the articles tagged with a certain keyword? The possibilities are really endless.
And to look at those possibilities the BBC has started a whole new project, called BBC Backstage, where geeks are invited to collaborate with the BBC staff to develop the API that lets everybody reuse the BBC material. Cross this with the fact that much of this material is copyrighted with a copyleft copyright (copygotit?), and you see how the whole situation can positively explode.
Imagine, much of the material from BBC, offered for free, in the way wanted by the best geeks and hackers, to produce information in any noncommercial way they please.
Already many ideas are flowing: an RSS feed for the results of sports matches. Crossing google maps with BBC News.
The possibility of having BBC news accept trackbacks.
And many many others.
All this would mingle the BBC with the common people. Think: all the news, mixed and remixed. Commented, trackbacked. Until you can read an article from BBC news on any device (through rss), in any format you want (through your rss reader), filtered any way you want (through folksonomy), and seeing the world’s response to that article (through trackbacks and comments).
Thank you BBC
(and no, I am not paid by BBC)
Thanks also Wired for some inspiration.
This evening I played with calendars. In particular with the calendar published by Mozilla, SunBird. It is pretty amazing. Here too they managed to use an open standard with which anybody can write his own calendar. The program then lets you save it to a file or publish it on the web. You can also load calendars from other people, and they will appear superimposed on your events, so that you can see your events as well as the other calendar’s events.
Think about it: it is extremely easy, and extremely powerful. I can just write down the dates that are important for me, and people can use the info to schedule a meeting, set up an ambush, or find out when the campervan is unattended. Infinite power.
What’s more, it is possible to set up calendars for particular types of events. For example we could, at work, set up a calendar for all the conferences on artificial life, artificial chemistry, complex systems, etc. Or even a separate calendar for each type, and each person could just subscribe to the calendars that he is interested in.
The calendar is still very limited in many ways. For example events can be assigned to only one category. The whole idea of tags and folksonomy has yet to arrive here. Eventually people should be able to put each event in multiple categories, or even suggest categories for the events of others.
In any case, my calendar is at http://www.pietrosperoni.it/calendar/agenda.ics. If you have firefox with the calendar plugin installed you can just see it. If not, you probably need to wait until I integrate it with my blog, which will take quite a few centuries.
Update: Another thing that is definitely missing is an integration between this software and smart phone technologies. What’s the point of having a cool phone that can connect to the net wherever you are, and that holds all your appointments, if you cannot let this phone speak, over the net, with your calendar? It does not seem such a hard thing to obtain, although I would not know where to start, so I would predict that within 6 months, no, no, 4 months, a program should be around that lets me integrate the 2 things. If it isn’t already there.
This is going to be big. It’s called tagsurf. When we were setting up the taoist discussion board at Tao Bums, I was looking for a board that would let me tag individual messages with different tags. The reason is that over there we are now a group of friendly people, and every thread starts with one topic but often touches many separate ones. The board had to be in PHP for reasons known only to the web master, but that we were all happy to follow. So we started looking around, but no board with a tagging facility came up. Nothing. I had to admit that the idea was quite new, and I had not seen any such board around in any case. So we decided on phpBB which, being open source, would get new versions with any new cool geeky thing appearing every so often. Well. Now I have finally found the first tag-based discussion board. It’s called tagsurf. And it is very cool. You get to write messages and tag them. As a tag you can use any word of any length. Now, the result of this is that you can tag things with the URL of something. So immediately a series of utilities started appearing:
People (the first one I saw doing it was Russell Beattie) added a tagsurf button. In short, if you click on that button you get all the comments on tagsurf that use your permalink as a tag. In a sense it is outsourcing the discussion board.
Yes, I added it too; it is down near the little technorati bubble, and I just needed to add:
<a href="http://tagsurf.com/post?tag=<?php the_permalink() ?>">Tagsurf this</a>
in the template.
I also went back to see how tagsurf was doing on del.icio.us. It seems that, as often happens in other cases, the meme is 6 days old. At the beginning few people noticed it, and now it is starting to explode. I too found out through the delicious discussion board, which I would suggest to anybody who is interested in delicious OR folksonomy.
I think tagsurf can and will have a great impact. They already have some API defined.
I also took a look at their privacy policy. It seemed simple and clear. Yet now I cannot find it anymore. I suspect that they might be working on it right now.
I also made a small bookmarklet to post an entry on tagsurf about a specific page. Just drag the word bookmarklet onto the bar and it should work. Of course for it to work you have to be logged in to tagsurf.
Great points:
- trackback: every post is an entry point for trackbacks. In other words anything you say can receive trackbacks from anything else. You say something here, and it gets people in the blogosphere chatting. And you can follow their conversation. This is something very important that was missing in all the bulletin boards I have been using. In a sense many discussion boards only look in. This one also looks out.
- trackback 2: Every post that you make can send trackbacks to anything you want. The software to do this automagically for the other posts inside tagsurf is still missing, but I can’t imagine it not appearing very soon.
- possibility to mix different threads: since each post gets as many tags as the poster wants, it is quite easy for people to join different threads of discussion.
Problems I might see coming.
- Spam, spam, spam: I receive about 30 spam trackbacks a day. They get filtered by cool programs and finally deleted by me. Yet those programs need me to make the final judgement. Who will make the judgement for all the trackbacks on all those posts? Will the user have to? Can someone close trackbacks on his own posts? I see many problems and much discussion here.
- copyright: This is another big one. Let’s say that I post a cool entry in tagsurf, who gets the copyright of it? It might be important. Imagine that someone takes it, and wants to add some extra tags. But adding tags is not allowed at the moment. So he copies the post and just reposts it with the extra tags. Do I have a say on it?
All together I think this is a wonderful piece of new technology. When technorati started its tag pages I wasn’t very impressed, but this, I think, will have some huge effects. And I still can’t see all the implications.
ADDENDUM: just as I finished this post I read in full the great and very interesting post from Russell Beattie. And I found that he had made exactly the same bookmarklet. Oops. Well, I hope he will not sue me; I haven’t copied his code. I just reinvented the wheel.
ADDENDUM to the ADDENDUM: As I was looking at all the people commenting on the thread on Russell’s post I noticed another post with the same bookmarklet. And I thought I was the first. At least I get to see if trackbacks to posts over there actually work.
ADDENDUM to the ADDENDUM to the ADDENDUM: trackback does not seem to work, or the comment is being held back for security reasons.
Did some more debugging. Now any unicode the user used in the tags should be ok. Still, there is a big brick wall in terms of memory usage. And some users are not having any luck simply because their map takes so many resources that it goes beyond the ISP limit. I could work hard and rearrange the whole calculation so that all variables are stored on disk and the memory limit would never be hit, but honestly, it is not my top priority at the moment. I am here to help those users run the program on their own machine. And eventually we might solve that problem too. So, what are my top priorities:
- Add an rss feed. I would like to add an rss feed that gets updated every time a new map is done. It wouldn’t just give the name but all sorts of data, like the list of the Main Tags. So users could see if they might be interested in checking the new buddy’s map.
- Insert a way for users to delete their own maps. If I am going into the hosting business, I am not going to be one of those hosts where you can add info but cannot delete it. I am aware that the users’ info is ultimately adding value to my site; as such I want users to be happy to have their map here. Not forced.
- Insert a general log of all the maps that are being started, and ended. Right now such a log is absent, and there are about 200 maps completed, and more than twice as many that have been started. So about 300 have been dropped. I bet many of those users would succeed if they tried right now, after those 3 debugging sessions. Still, I want something that tells me: Warning warning warning, map dropped. Bug? OutOfMemoryError?
- Add the number of posts inside a tag. Just obvious.
- Probably add some of the MainTags as keywords to each single map. The problem is: which? All is too much. All the ones that contain more than x posts, y subtags is not flexible enough. The solution should be: if a MainTag is part of a ParetoFront of Delicious then the keyword should be there. The fact that this means writing a whole program that stores in a database the latest ParetoFront is just a small detail. And before you ask: no, I will not need anybody’s password to do that, and the data will all be public.
- Add a bookmarklet to save a map in your own delicious, with the keywords as tags
- Change the map, so that it can run on a single tag. Useful for big complex maps like mine, and others.
- Make it change the Title of the Map Page, to show the owner of the map. Useful if people want to add the maps to their delicious pages.
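The Pareto front idea from the keyword item can be sketched as follows. The criteria are a guess: the sketch assumes each Main Tag is scored by its number of posts and its number of subtags, and keeps a tag when no other tag is at least as good on both counts and better on one. The function name and the numbers are invented, and the version described above would work across all of Delicious rather than a single map.

```python
def pareto_front(tags):
    """Return the tags that are not dominated on (posts, subtags).

    `tags` maps a tag name to a (posts, subtags) pair. Tag b dominates tag a
    when b is at least as large on both counts and strictly larger on one.
    """
    def dominates(b, a):
        # With both counts >= and the pairs different, one count is strictly larger.
        return b[0] >= a[0] and b[1] >= a[1] and b != a

    return sorted(
        name
        for name, score in tags.items()
        if not any(dominates(other, score) for other in tags.values())
    )


# Made-up Main Tags with (posts, subtags).
main_tags = {
    "programming": (120, 4),
    "python": (60, 9),
    "css": (40, 3),
    "tao": (15, 1),
}
print(pareto_front(main_tags))  # ['programming', 'python']
```

Unlike a fixed "more than x posts, y subtags" rule, the front needs no thresholds: a tag with few posts can still make it if it has more subtags than anything bigger.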
And then there are some tests I would like to make, like:
- Check if it would make sense to show all the tags that appear with a single tag, and not the subtags.
Is there more? If you can think of other modifications, please drop a line in the comment section. Also, if you tried to run the map maker and it is not giving you satisfaction, let me know. I’ll whip it appropriately. HarHarHar. (I’ve always wanted to say that!)
The first person to use the tool (presented here) was Mike Harris, for his delicious entries. Note immediately how the time needed to compute the map has little to do with the number of posts, and much to do with the number of tags.
- WCityMike: 2029 Posts, 87 Tags and 81 Main Tags, calculated in 86.85 seconds.
- p.s.blog: 21 Posts, 43 Tags and 17 Main Tags, calculated in 0.23 seconds.
- pietrosperoni: 372 Posts, 400 Tags and 152 Main Tags, calculated in 377.40 seconds.
The Main Tags are the tags that will appear as main branches. And we can also see a difference between Mike’s maps and mine. In mine about 0.4 of the tags are Main Tags, while for Mike it is something nearer to 0.9. This is probably due to the fact that I tend to apply many tags to each post (four or five are common, but sometimes more), while Mike tends to use an average of one or two.
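As a quick check of those ratios, here is the arithmetic in Python, using only the numbers from the three maps listed above (the helper name is made up for this sketch):

```python
def main_tag_ratio(tags, main_tags):
    """Share of a user's tags that end up as Main Tags in the map."""
    return main_tags / tags


# (tags, main tags) copied from the three maps listed above.
maps = {
    "WCityMike": (87, 81),
    "p.s.blog": (43, 17),
    "pietrosperoni": (400, 152),
}

for name, (tags, main_tags) in maps.items():
    print(f"{name}: {main_tag_ratio(tags, main_tags):.2f}")
# WCityMike: 0.93
# p.s.blog: 0.40
# pietrosperoni: 0.38
```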
If we look at Mike’s map we can also see that there are fewer clusters than in my map. Note for example how in the small blog map nearly everything is clustered… and those are only 20 posts and 17 Main Tags.
If we look at the source code we can see that, on the 9th line some constants are set:
distances_constant= [0.333333,0.4,0.5,1]
Those constants define the minimum distance for entries to be in the same cluster.
The 1/3 means that if one third of the posts between two tags are in common then the tags should be in the same cluster. And so on. Tags that are farther apart, but have a path of tags between them such that you can go from one to the next without ever going above that distance, are in the same cluster too. A process that in the log is referred to as making the distances tables transitive.
Those numbers have been specifically tweaked for my delicious posts (and generally my style of bookmarking). It seems obvious that for Mike the numbers should be different. Since it is less common for his posts to share a tag, the numbers should probably be lower. Something like:
distances_constant= [0.1,0.333333,0.25,0.4,1]
The last 1 is just to make sure that tags that are synonyms are shown together.
I think eventually I will modify the program so that it is possible to insert your own constants from outside. But for now I am just grateful to Mike for giving me the material to understand better how to enhance the program.
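For readers who want to play with the constants, here is a minimal Python sketch of the clustering step for a single constant. It is not the map maker's source: it assumes the number is read as a minimum overlap, with overlap taken as shared posts divided by posts carrying either tag, and it makes the relation transitive by merging linked tags with a small union-find. All names and data are hypothetical.

```python
from itertools import combinations


def overlap(posts_a, posts_b):
    """Share of posts two tags have in common (assumed: shared / either)."""
    either = posts_a | posts_b
    return len(posts_a & posts_b) / len(either) if either else 0.0


def cluster_tags(tag_posts, threshold):
    """Group tags whose overlap reaches `threshold`, following chains of tags."""
    parent = {tag: tag for tag in tag_posts}

    def find(tag):
        # Walk up to the representative of the cluster, shortening the path.
        while parent[tag] != tag:
            parent[tag] = parent[parent[tag]]
            tag = parent[tag]
        return tag

    for a, b in combinations(tag_posts, 2):
        if overlap(tag_posts[a], tag_posts[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for tag in tag_posts:
        clusters.setdefault(find(tag), []).append(tag)
    return sorted(sorted(group) for group in clusters.values())


# Made-up data: each tag maps to the ids of the posts that carry it.
tag_posts = {
    "tags": {1, 2, 3, 4},
    "folksonomy": {2, 3, 4, 5},
    "delicious": {4, 5, 6},
    "tao": {7, 8},
}
print(cluster_tags(tag_posts, 1 / 3))
print(cluster_tags(tag_posts, 0.5))
```

With 1/3, "tags" and "delicious" land in the same cluster even though they share only one post, because "folksonomy" links them; with 0.5 that chain breaks. Lowering the constants, as suggested for Mike, makes such chains more likely.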