The two talks I gave this summer at the International Workshop on Challenges and Visions in the Social Sciences are now available at videolectures.net.
Not the best talks of my career, and hopefully not the last either. But the guys at VL did a great job in recording them.
One of the talks was about Tags, and the second about Democracy of the 21st Century.
In the one about Continue reading My first 2 talks available online: Tags & 21st Century Democracy
We seem to have made it.
The website is now hosted on different servers, at dreamhost.com.
Of course, what could not be done was to resurrect the wiki, which will now have to be recreated from scratch. And, much worse, the mindmapping tool: the page from which it was possible to make a mind map of a delicious account. Right now I unfortunately have no time for much, as I am in the last months of my Ph.D., yet I hope to find at least the time to put back the program that makes the mindmaps, so that people will be able to make the mindmaps themselves. Apologies to all the spammers who sent me spam in the last 2 weeks. Knowing that all the latest comments were going to be lost anyway, I did not bother to mark them (about a thousand messages) as spam. Please don’t feel ignored, just send your message again and I shall trash it asap.
I just came back from the vacations to discover that my website is now a mess.
All the delicious mindmap data has been deleted, and so has the MySQL data of the wiki.
I am not really sure how it could happen, but I am just not going to investigate.
I am, more pragmatically, moving my website to another host. It will take some days, and a lot of the data will in any case be lost.
If you came to my page looking for any of the previous services (the delicious mindmap, the wiki aggregator), please have patience.
As soon as the transfer has been made, I shall post a new message.
So if this current message is the last one, you know that we are still using the old host.
Pietro
It is now time to present the next project we have been working on: TagBay. And I say ‘we’ because in this project I am not alone. I did it with a friend of mine, Derek, who very patiently accepted to code some of the ideas I have been tinkering with over the last year or so. I am speaking about tags, and tag clouds, and distance between tags, and so on.
So, in brief we made a web site to tag material that is being sold on e-Bay. Anybody can tag any object that is being sold. Not only can any object be tagged but you can tag sellers, too (oh, we are not responsible for offensive tags, eh!).
Tags on objects can be made private or public, and you can also search among your tags, among everybody else’s tags, and eventually (when we code it) it will be possible to search among the tags of another user, like in del.icio.us.
Now that the summary for people who have no time is done, let’s try to explain the idea in detail for those who have a bit more time.
Pages:
On TagBay, right now, there are 5 types of pages: e-Bay Search Page, TagBay Tag Search Page, TagBayUser Tag Search Page, Item Page, and Seller Page.
- Search Page: From inside TagBay it is possible to search e-Bay for specific keywords. The user can then add tags to each object that comes up, and either store all the added tags at once or store the tags of a single object. The same thing can be done in the Tag Search Page.
- TagBay Tag Search Page: In this page the user gets all the results for a single tag that someone has used. Nothing fancy (for now). Items where the tag only appears as a private tag will not appear here.
- TagBayUser Tag Search Page: In this page the user gets all the results for a single tag that they themselves have used. If the user is logged in and is looking at their own tags, the items tagged in a private way will appear too.
- Item Page: Each object has its own page. From that page any user can see the public tags that other users have used for that object. They can also define their personal tags for the object, choose whether those tags are going to be private, and tag the seller.
- Seller Page: And then there is the seller page, where any user can tag any seller. The use of tags for sellers is still limited, but will be expanded in the future.
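The visibility rules for the two tag search pages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Tagging` record, its fields, and the `visible_items` function are hypothetical names invented for this sketch, not TagBay's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagging:
    """One tag applied by one user to one item (hypothetical record)."""
    user: str      # who applied the tag
    item: str      # item identifier (format assumed for illustration)
    tag: str
    private: bool  # True if only the tagging user may see it


def visible_items(taggings, tag, owner=None, viewer=None):
    """Return the items listed on a tag search page, in first-seen order.

    owner=None  -> TagBay Tag Search Page (everybody's tags)
    owner=name  -> TagBayUser Tag Search Page for that user
    A private tagging is shown only on its owner's page, and only
    when the viewer is that same user.
    """
    items = []
    for t in taggings:
        if t.tag != tag:
            continue
        if owner is not None and t.user != owner:
            continue
        if t.private and not (owner is not None and viewer == t.user):
            continue
        if t.item not in items:
            items.append(t.item)
    return items
```

With this rule, an item that carries a tag only privately never shows up in the global tag page, while its owner still finds it in their own list.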
The natural use of the site
- For a seller or for a shop: A seller might want to use the site to tag all the objects that they are selling, giving each object all the related tags, thus increasing the chance that it will be found. We suggest listing the tags in order of importance, as we are soon going to use the order to weigh importance in the search page.
Also, if a person wants to make a cool list of objects, they can tag exactly those objects with a tag they have never used, and then link to the page for this tag in their directory, thus creating their list on the spot. Sellers will also want to tag their objects, and people making searches will tag objects to make lists of objects they want to follow before jumping on a transaction. We think there is more than enough material to generate interesting behaviour. It doesn’t have to be exactly the same emergent behaviour that we are used to seeing. After all, we are just exploring the possibilities of social folksonomy.
- A shop: On top of the possibilities above, a shop that is selling on ebay might want to make sure that the shop itself (remember that you can tag sellers, and not only shops) has all the tags related to the merchandise it is selling.
- Someone buying: Our suggestion for someone who wishes to buy on e-Bay would be to first look under the tag search, to see if anybody has already tagged any object that they are interested in. This need not be someone else who is buying; it could also be someone who is selling. Then they can tag the objects they are interested in themselves, to have them in their own list of objects. Then they could go to the e-Bay search page with the necessary keywords, and add the chosen tag to all the interesting objects. At that point a first selection has been made, and all the possible objects have been tagged. Finally, they could choose one of those objects, change the tags to private, and start bidding on it.
- Someone suggesting: And finally, if someone is just trying to suggest some possible objects, they could search e-Bay for those objects, tag them with a unique tag, and present the url of the list to whoever is interested.
There are many other ways to use TagBay. In a sense TagBay is a toy, and not a game. And like every good toy it can be used for many different games. We suggest here only some of them. Also, TagBay itself is rapidly evolving. We have tons of stuff we are interested in including, and if you have been reading my blog, you know that my problem is always to find people to code my ideas, more than to find the ideas. And this is why I am so happy with Derek’s work!
Difficulties that we found:
There were a number of issues that came out when we started developing this program.
- Public vs private tags:
Why would someone tag an object if they are interested in buying it? After all, aren’t they making it easier for others to find it by adding those tags?
This was a serious doubt that we had, and finally we decided to give users the possibility to tag objects privately. Yet there has to be a balance between private tags and public tags, as public tags are necessary to generate the emerging folksonomy that we wish to use. So we decided on a compromise: public tags can be added from the search page, but private tags require you to go to the specific object page. In our view (but we are ready to be proven wrong) someone would go to the search page and tag all the entries they might be interested in, then choose one, and tag that one in a private way.
- Limitations due to the temporary nature of the objects
Considering that most objects exist on ebay only for a few weeks before being sold, wouldn’t this be too little time to build a tag cloud and let all the cool emergent properties that folksonomy induces appear?
Maybe, but sellers can also tag the objects they are selling, thus giving every object a head start. Also, side by side with tagging objects, we are giving the possibility to tag sellers, and those tags should eventually survive each transaction and build up an interesting tag cloud.
- I spoke about sellers tagging their own objects, but wouldn’t this invite people to spam your site? After all, wouldn’t it be much better for a seller to add many tags to be present in many searches?
Ah ha! You think tag clouds can be spammed. This is false. Tag clouds cannot be spammed, and no one understands this. And we shall use this site to prove it. We have nothing against spammers; they are absolutely welcome on our site, and can spam it as much as they like and add all the tags they want to each object they sell. It will make ABSOLUTELY NO DIFFERENCE in the search page. Tag clouds are unspammable. And our engine will use tag clouds as its base. Everybody else uses tag sets, and this makes them easily spammable. So no, we don’t fear spammers. In fact we hope that spammers will come to our TagBay site. They are just people trying to sell their stuff, and we are trying to make sellers meet with buyers. Wouldn’t it be bad to single out spammers just because they are spammers?
TagBay is obviously still in beta, and there are many things that need to be coded. If you have any ideas on how to make it better, please do not hesitate to contact me. If you want to make a difference to what the final product will be, now is the time to do it. Also, all new suggestions that get implemented should be listed in a special page with links to the original suggester’s home page.
Some of you might remember my rant when del.icio.us was bought. And some others, who were with me from before, might remember the entries I wrote on tag clouds. Some time later I was contacted by an Italian developer, Fabio Vescarelli, who asked me for some help in developing algorithms to find the distance between users in a del.icio.us-like program. We had an exchange of emails first, and later we met in chat. He was building a del.icio.us clone, Smarking, but with some interesting differences. Continue reading Review: Smarking
Finally the time has come. Although I have wanted to do this for a long time, only now did I find the time and the technical knowledge to do it:
I divided the blog.
I divided all the Italian posts from the English ones. I created a new blog at http://it.pietrosperoni.it, and my Italian posts will, from now on, be posted over there. And only over there. Most of the people (3) who read me (5) read either Italian or English posts. And I am sure it must have been very confusing to scroll through a page and find some posts in English and some in Italian. Plus I always had the sensation that I could not write too much in one language, or possible readers of the other language would just assume the blog contained no information at all for them, and dismiss it. This in time made me slow down posting, as I could not always follow particular threads that would have involved posting many times in one language.
But now all this has come to an end.
Of course if you want to read entries from both blogs you should add the rss from the Italian blog too. Some topics will remain confined to this blog (like tags, for example), others will remain there (like Italian politics), while others will span both media (like diet, which is already present in both). The wiki in this case should act like a glue, creating a space where entries from both are aggregated. Plus, being a wiki, I (and whoever wants to come and play) will use it to keep notes, aggregate extra content, and generally make some pages stand out while others will only show the blog entries, the bookmarks, and the context (i.e. the links from the delicious popular page, and from technorati).
Generally it is not a smart idea to come here every time to see if I have written something. I tend to write when I have something to say, so many days might pass before I say something, then for some days I might make one or more posts a day. The solution is to add my rss feeds to your feed reader. Bloglines is a good one. I am sure there are better ones. Feel free to suggest them (as I am always looking for ways to improve).
Now let’s get a bit more technical: making this change also meant getting my hands dirty with MySQL Continue reading The Italian blog is born: reasons and technicalities
I think it’s time to present what I have been doing in the last few days. A number of improvements have been added to this web site. In short, I have upgraded to WordPress 2.0. I also moved to the next version of wikka. Some of you might remember that I offered some money to whoever could write some code to get the tag plugin to generate an rss list. I didn’t, at the time, explain why. I will now.
WordPress 2.0 gives the possibility to start categories on the fly, just by listing them. Essentially this makes categories in WordPress work like tags (or keywords, for academics). But categories in WordPress also have an rss feed connected to them, albeit with some bugs, like linking to the whole blog and not to the particular category. So I spent most of the first of January adding to the entries the relative tags as categories. So now I have no need of an rss feed for the tag page, as the tag page has been replaced by the category pages.
You will also remember that I installed Wikka, the wiki engine. Now wikka is not only open source but also easy source. It is so simple that even I could hack the code. That is very simple! So I changed the code and inserted the possibility to have default pages. In short, before, if you were to look for the url http://wiki.pietrosperoni.it/someunexistingpage and there was no page in the wiki called “someunexistingpage”, the wiki would ask you to edit the page, and you would be redirected to http://wiki.pietrosperoni.it/someunexistingpage/edit.
Now it creates the page someunexistingpage on the fly, with the default content. And the default content I chose was the 4 rss feeds:
- the feed from my blog from the category: someunexistingpage
- the feed from my technorati from the tag: someunexistingpage
- the feed from my delicious bookmarks from the tag…
- and the feed from the popular pages in delicious, always from that tag
So for each tag I now have a wiki page with the most relevant rss appearing there. But being a wiki page I also can add other rss feeds, write definitions, comments, todo lists. In short modify it as I see fit.
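To make the idea of a default page concrete, here is a minimal Python sketch (not the actual Wikka hack, which lives inside the wiki engine) that builds the default text for a page named after a tag. The `{{rss ...}}` action syntax is the one Wikka uses; the four feed URL templates and the function name `default_page` are placeholders invented for this example, since the real feed addresses are not given here.

```python
# Hypothetical feed URL templates, one per feed in the list above.
# The example.org addresses are placeholders, not the real feed URLs.
FEED_TEMPLATES = [
    "http://blog.example.org/category/{tag}/feed",     # blog category
    "http://technorati.example.org/tag/{tag}.rss",     # technorati tag
    "http://delicious.example.org/rss/myuser/{tag}",   # my bookmarks
    "http://delicious.example.org/rss/popular/{tag}",  # popular bookmarks
]


def default_page(tag, cachetime=30):
    """Build the default wiki text for a page named after `tag`."""
    lines = []
    for template in FEED_TEMPLATES:
        url = template.format(tag=tag)
        lines.append('{{rss url="' + url + '" cachetime="' + str(cachetime) + '"}}')
    return "\n".join(lines)
```

Calling `default_page("idea")` yields four `{{rss ...}}` lines, one per feed, which the wiki would then render as the default content of the page.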
Still it is not perfect. It writes the page the first time, and from that moment the page is set. I can delete it, but I cannot, for example, change the default content for all the pages that only contain the default content. I tried to write a plugin to do that, but I failed when I confronted the fact that I needed to write a plugin, {{defaultpage}}, which would have had to activate other plugins: {{rss}}, for example. Something that I did not know how to do.
Also, having the same string work for delicious (as tag), WordPress (as category name) and wikka (as page name) puts some heavy constraints on what the string may contain. For example I am already running aground on all the tags that contain a dot (aaargh, del.icio.us!) or an accented letter (aargh, dear Italian).
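The constraint can be written down as a small check. The sketch below is only an assumption about what counts as "safe": it accepts names made of unaccented ASCII letters and digits, which rules out the two failure cases mentioned (dots and accented letters). The function names are invented for this example, and the `to_safe_name` workaround is one possible normalisation, not something the site actually does.

```python
import re
import unicodedata

# Assumption: a name is safe for all three systems if it contains
# only unaccented ASCII letters and digits.
SAFE_NAME = re.compile(r"[A-Za-z0-9]+")


def is_shared_name_safe(name):
    """True if `name` avoids dots, accents and other special characters."""
    return SAFE_NAME.fullmatch(name) is not None


def to_safe_name(name):
    """One possible workaround: strip accents, then drop other characters."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^A-Za-z0-9]", "", ascii_only)
```

The price of such a normalisation is that the wiki page name no longer matches the original tag exactly, so the feeds would still have to be requested with the original string.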
If you want to see what the pages look like, just see the idea page. But any link from the right column (provided it has no dots or accents inside) will work fine.
I wanted to start this entry by congratulating Joshua on the deal. But I won’t.
The facts: the web site delicious has been sold to Yahoo!.
I personally don’t dislike Yahoo. I positively hate them. For having eaten and raped startup websites, one after the other. For being totally obscure in terms of contact with the public. For refusing to answer e-mails. For being so big that they can just claim: “we are too big to answer your e-mails”. We can ignore you, and trample on you; we will not even notice. I have had something personal against them from the moment they deleted my web page back in 2003, and with it all the material inside, which included some preprints of academic papers I wrote, some of which I had only in a single copy. I hate Yahoo because they don’t get what web 2.0 is, and they try hard to copy it. And when they fail in copying it, they try to buy it. As if you could buy a community. As if you could own a community. As if you could buy a language and the agreement to keep the data open.
So maybe I should congratulate Joshua for having sold something which had no price for some real and tangible money. But I still will not. Because delicious was not only a community. It was also an experiment. A place for us geeks to meet and discuss. A place where we were changing the web. Yes, WE were changing the web through our ideas. And Joshua was good at picking the best ideas. Inviting us to give more. Now do you really think this will continue under Yahoo!’s reign? Forget it! At least for my part.
But this is not the reason why I shall not congratulate Joshua. No, I shall not congratulate him because he could have made it. Because delicious was clearly, and by general recognition, the best bookmarking service on the web. And with the whole community behind it giving suggestions, it was prosperous and growing. Because people have pleaded with him to start charging, or put up advertisements, or do something, but let us pay for it. Because we knew. We knew he could not possibly pay for it all by himself. And we were happy to join in. We were happy to pay. How many services are you aware of where the customers ask to pay for them? Few indeed!
Of all the people who have commented on the deal, I feel the person who best captures my feelings is Ronald Johnson, who comments:
Some lessons to learn here:
- Never trust a startup service to store your important data no matter how the owner seems honest to you. Sooner or later he/she will run away with the money and YOUR data.
- Never trust a corporate entity to continue storing your important data. Now that they stole your data, you are subjected to the user-specific ads and they abuse you no matter how strong you cry.
- Never act like a fanboy on services you don’t trust. Instead, invest your time and knowledge on open source projects to ensure your efforts are never sold to third party evils.
I have to add, one of the things I found most disturbing was the form in which Joshua announced it. Here is the announcement; the words that I found most disturbing are in evidence:
We’re proud to announce that del.icio.us has joined the Yahoo! family. Together we’ll continue to improve how people discover, remember and share on the Internet, with a big emphasis on the power of community. We’re excited to be working with the Yahoo! Search team – they definitely get social systems and their potential to change the web. (We’re also excited to be joining our fraternal twin Flickr!)
We want to thank everyone who has helped us along the way – our employees, our great investors and advisors, and especially our users. We still want to get your feedback, and we look forward to bringing you new features and more servers in the future.
I look forward to continuing my vision of social and community memory, and taking it to the next level with the del.icio.us community and Yahoo!
The post stinks of corporate declaration, and has already sealed the fate of delicious as just another piece in the Yahoo puzzle. A more honest post would have spoken of the money that changed hands. Of how they made an offer that could not be refused. Of the risks of the passage. It would still have made people upset, but we might have felt that it was coming from Joshua, and not through Joshua from the Yahoo P.R. office.
All this calls for some actions, for I really don’t want to support Yahoo; and if all I can do is passive resistance, then that’s what I shall do!
- I shall look for a good alternative to Yahoo, ehm, I mean del.icio.us. The folks at slashdot suggest Simpy.
- I want to look better at microformats, and in particular at rel-tag. It might be possible to install a small bookmarking service on site, and then have it send standard info to the community at large. In this way I would not be vulnerable anymore to the next Yahoo! acquisition.
- While I am there I should also look for ways to get out of Flickr (which has been acquired by Y! too). Don’t miss the wonderful description of the mess Yahoo is making of the Flickr signup page. There I also heard that 23hq might be a good alternative. Still I would prefer something on site that speaks a common language.
- I have to decide what to do with the Delicious Mind Map Maker. You see, I really don’t want to support Yahoo. Not even indirectly. So I am tempted to take it offline. But if I find a better service, and there is bound to be one now that other geeks will start migrating out of the belly of the beast, I might just modify it to support this other service. Nothing has been decided yet.
- And then I might instead develop my own service or help someone else develop theirs, using the tagcloud ideas I spoke about earlier.
- And last but not least, there is the possibility that I might develop the famous search utility I have been speaking about. Up to now, apart from the constraints of time, what really stopped me were ethical reasons. Joshua asked people not to screenscrape delicious, so I felt I should abide by his request. I surely did not want to tax the servers of a poor hacker. But now the ‘poor’ hacker has sold the hen that lays the golden eggs, and walked away with tons of cash. And I am sure Yahoo will not even notice if I start screenscraping them. At least until they start putting up all sorts of advertisements which might make it too hard to do. Hmm, active resistance might have some attraction!
So I probably should congratulate Joshua. He sold a bunch of quite simple and useless code to Yahoo. He held out to them the prospect of a great and creative community. Now all he has to do is walk away with the cash, start a delicious clone, and we will all be more than happy to join him in the new adventure. Hell! We will not even ask for our part of the booty. Although we might ask for a dinner in a good restaurant.
And I think that’s just fair.
ADDENDUM:
After reading all the comments on slashdot I found a link to a page comparing most bookmarking services. It is a bit old, so not totally up to date. Yet it gives some good overviews and can be used for some good pre-screening. Also the maintainer of Simpy, Otis, wrote a long comment explaining how he might even adapt the code to make the mindmap work for that too!
I have to say that I am very impressed with Wikka. Wikka is a wiki software that I just installed on my web page. It is simple, yet full of plugins. Open source (or I would not consider it). It also makes it possible to integrate freemind mind maps inside it. More than this: for each page the administrator (ehm, that is me!) can decide who is allowed to read, write and comment. I installed it about one week ago, and I avoided making it public until I had found a way to deal with wiki spam. I already have too much spam on this blog. Finally I found what I think is the perfect solution:
- only registered users can comment and modify the wiki. It might not make it very fast, but at least I know who said what.
- I inserted a plugin such that, to register, people must write a password in the ‘registration code’ field. But the password is written on the same log in page.
- To write spam in the wiki they have to register manually. Which I feel is fair. I have no anger toward those who spam manually. It is the mechanical ones that ought to be stopped.
- If the spammers write something that registers automatically, I will change the registration code.
- And if they write something that automatically grabs from the page the registration code I change the context (the phrase in which the code appears), making their software useless. I will move from:
- registration code:”pippo pluto” to
- registrati0n code:”pippo pluto”
Just as you cannot write code that blocks every permutation of the word “Viagra”, so you cannot write code that handles every permutation of the phrase “Registration Code”. Ah! And this is the revenge of the masses!
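To see why changing the context is enough, consider a hypothetical spam bot that grabs the code with a pattern written against the exact label on the page. The pattern, the function name, and the page snippets below are all invented for illustration (with straight quotes for simplicity); this is a sketch of the argument, not of any real bot.

```python
import re

# Hypothetical bot: it was written against the exact label on the page.
BOT_PATTERN = re.compile(r'registration code:\s*"([^"]+)"')


def bot_grab_code(page_text):
    """Return the registration code the bot would extract, or None."""
    match = BOT_PATTERN.search(page_text)
    return match.group(1) if match else None


old_page = 'To register, type the registration code:"pippo pluto" below.'
new_page = 'To register, type the registrati0n code:"pippo pluto" below.'
```

Here `bot_grab_code(old_page)` finds the code, while `bot_grab_code(new_page)` returns `None`, although a human reader has no trouble with either page. The bot author can of course loosen the pattern, but every loosening can be answered with another one-character change.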
I think the idea is so brilliant that I will look if I can find a similar plugin for wordpress.
The next thing that impressed me in Wikka was the use of rss. It is actually very easy to integrate an rss feed in a page. Maybe it is the same in other wiki engines, I don’t know. But on wikka it is absolutely trivial. You just need to write {{rss url="http://the.rss.net/address.rss" cachetime="30"}} and the rss gets fetched, shown, and cached for 30 minutes. Now a 30 minute cache is what del.icio.us requires from you if you are going to connect an rss feed to your homepage. So now I have started to integrate all sorts of rss feeds from delicious into my web page. Check for example my Tag Cloud page, with the rss from my personal bookmarks tagged with tagcloud, the rss from the popular page in delicious (delicious/popular/tagcloud), and the rss from technorati (i.e. people who have blogged on Tag Clouds).
And all this is in the floating right bar. So I still can use the rest of the page as place for me to write content, and notes…
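For the curious, the effect of a cachetime of 30 can be sketched as a small time-based cache. This is a generic Python illustration, not Wikka's implementation (Wikka is written in PHP and its internals are not shown here); the class name `FeedCache` and the injected `fetch` and `clock` functions are assumptions made so the sketch is self-contained.

```python
import time


class FeedCache:
    """Keep each fetched feed for `cachetime_minutes` before refetching."""

    def __init__(self, fetch, cachetime_minutes=30, clock=time.time):
        self._fetch = fetch                  # function: url -> feed content
        self._ttl = cachetime_minutes * 60   # lifetime in seconds
        self._clock = clock                  # injectable for testing
        self._store = {}                     # url -> (fetched_at, content)

    def get(self, url):
        now = self._clock()
        cached = self._store.get(url)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]                 # still fresh: no request is made
        content = self._fetch(url)
        self._store[url] = (now, content)
        return content
```

However many visitors load the page, the remote server sees at most one request per feed every 30 minutes, which is the point of the requirement.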
And a note taker is what this wiki is slowly becoming. I started moving my Reading List to the wiki. And I added to the reading list the rss of popular reading lists. You see how it all comes together.
But this is not all! Wikka (and they should pay me after a post like this!) gives the possibility to set the privacy for each page. That is, for each page you can choose who can read it, who can comment on it and who can change it. In this way I can use this not only for my personal notes but for the notes of projects that I might be sharing with other people.
Come and say hello: http://wiki.pietrosperoni.it
I have been Handelsblatt-ed.
Yeeee!
I think the time has come to write my third, and hopefully last, contribution to the topic of tagclouds.
I have been hearing a lot of talk on how users should not use too many tags when bookmarking a URL. I am also the maintainer of the mindmap maker, and I often look at some of the maps generated (available to everybody). There are a number of people who tend to use an average of between one and two tags per URL. Their maps are often very ordered. No clustering, no hierarchy. (Forgive me if I don’t put a link to such a map, but since I am going to bash this way of using delicious, I’d rather bash a method than a specific human being. Just go to the list of maps and open a couple, odds are one of them will be of the type I am describing). This way of using delicious uses tags as folders, just with the modification that every now and then you can put a URL in more than one folder at the same time. A bit like a big bookstore that might carry several copies of the same book, and store them in more than one place (and the Tao Te Ching ends up in New Age -God knows why- and in Religion).
Of course tags tend not to fit exactly. My Tag Clouds and Cultural Change will be under Tags or Folksonomy or Sociology… Whatever you choose, you probably will not put it under Ajax. And yet most of the analysis was done studying the spreading of the term Ajax.
Let’s make a few simple calculations. Continue reading Tag Clouds are hard to Spam
Terrell Russell asked for some suggestions on how to improve his tool, Cloudalicious. He asked for them 3 times: once on the del.icio.us mailing list, once in a comment on my previous entry, and once on his site (link missing as he wisely took this one off). Now I really think three times is too much, and Terrell should be heavily chastised for this. So I will write him a loong list of things that I think his tool should do, and maybe next time he will think better before asking people how he should employ his time as a programmer. I am always happy to give ideas to people, provided a) they remember me when the idea makes them incredibly rich b) they do the coding. Continue reading Cloudalicious suggestions
Note: This entry is connected also to a mindmap. Some people were having problems in opening the page because of that. As such the mindmap has been stored in a separate page, and can be viewed from here.
Introduction
As correctly pointed out by Jeffrey Zeldman, tag clouds are becoming more and more popular. Yet I keep seeing services which should be using tag clouds that keep on using tag sets. It is not just a problem of programming tools which can only support tag sets, but also of programming tools which might in principle produce tag clouds, but in which the users are not invited to use a tag if it already exists, and which as such don’t generate a tag cloud.
Examples of the first type of tool are Flickr, 43things, consuMating, tagsurf*; an example of the second is the tagged version of the BBC*. In all those cases a tag set is used where a tag cloud would be more appropriate. Some of the differences between a tag cloud and a tag set were explained in Vanderwal.net: Explaining and Showing Broad and Narrow Folksonomies. Let’s see them again, and see some consequences of those differences, which should clarify when it is better to use one tool and when it is better to use the other. Continue reading On Tag Clouds, Metric, Tag Sets and Power Laws
The mind map maker is at the moment out of order. This is due to Joshua, who is trying to fix a minor problem with the delicious API. For the time being, please DO NOT make new mind maps (and if you do, do not complain if they come up like mine). I will let you know when the situation is fine again.
In the meantime, why not spend some time reading other people’s mind maps? You can bookmark a mind map that you like using the tag ‘delimap’, adding as other tags the most important tags in that mindmap. Here are mine, for example.
Update: The mind map has been fixed, thanks to Joshua who promptly responded.
Thanks,
Pietro
I did some spring cleaning on the delicious mind map maker. I deleted some of the oldest maps, also from the period when the program was not working fine. If your map got deleted, please don’t be too angry, and just make it again. Unfortunately the only way I had to find the old maps was according to when the directory was created. This meant that if you kept on using the utility, and recreated the map more recently, your map could still be among the unlucky ones.
I am finally having a bit of free time (although it is rapidly filling up, as I take from my box the list of all the things I wanted to do and did not have the time for). There is quite a list of things I wish to do on this program, to make it more efficient. If any of you have special requirements, now would be the right time to ask.
Pietro
As I posted the previous entry, I went to technorati to check if it was being pulled. And what I discovered was that technorati was only pulling the first tag in the list.
I make quite an effort to add all the tags that I think might be relevant, both to improve visibility and to better categorise the content. I like to make a copy of the same tags in my p.s.blog delicious account, and then see the whole thing as a mindmap. But for the mindmap to really work, it is necessary that if two entries share some content they also share at least one tag. So I use many tags. And the mindmap comes out really nice.
Not only this, but I feel that each post belongs to multiple tags, and should be present in multiple pages. For example this entry belongs to both the tag ‘technorati’, and the tag ‘mindmap’, ‘delicious’ etc.
Investigating a bit further I discovered this post, where a similar problem was presented. In that case technorati was pulling the information from the list of categories in the rss feed. Now the problem is that, in wordpress (other tag!), the list of categories is defined before, while the tags are defined after. And although this might seem like a minor problem, it actually means that often we don’t add all the categories that we need. In a sense it should be possible to just ask that wordpress uses tags as categories.
And then post the tags as:
<category>firsttagname</category>
<category>secondtagname</category>
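To make the idea concrete, here is a minimal sketch in Python (not WordPress code) of a feed item that carries one category element per tag. The function name, the example title and the example link are invented for illustration.

```python
import xml.etree.ElementTree as ET


def item_with_tag_categories(title, link, tags):
    """Build an RSS <item> that lists every tag as its own <category>."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    for tag in tags:
        # One <category> element per tag, in the order the tags were given.
        ET.SubElement(item, "category").text = tag
    return item


# Hypothetical post data, used only to show the output shape.
item = item_with_tag_categories(
    "Technorati only pulls the first tag",
    "http://example.org/post",
    ["technorati", "mindmap", "delicious"],
)
print(ET.tostring(item, encoding="unicode"))
```

A feed reader that looks at categories would then see all three tags, not only the first one.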
So the end result of this is:
my posts are not appearing in the technorati pages where they should: tag, technorati…;
my posts are appearing in the technorati pages where they shouldn’t: General, English…;
And I haven’t got a clue how to fix it.
Pietro
UPDATE:
I did send a mail to technorati, and I got this answer:
Hi Pietro,
Your tags must occur within the boundaries of a post, a div of class of storycontent in your case. Technorati should treat your Dublin Core subjects in your Atom feed as tags.
SECOND UPDATE:
After various tests, I realized that technorati does not parse the html, and I understood what the mail meant with “Technorati should treat your Dublin Core subjects in your Atom feed as tags”. Since the author of the plugin explained that for a couple more months he is not going to be able to fix it, in the meantime I downloaded another plugin: Technotag. That gives me the possibility to add <tag>tagname</tag>, and that makes a tag automagically. Let’s hope that this works!
THIRD UPDATE: it works. And as I keep on making small hacks to the plugins that I use, I slowly learn how they work
FOURTH UPDATE: correction, it only worked for the first tag. But I hacked the code a bit and now it works fine on all of them. I shall send an email to the author, to pass him the change.
Before, it would make a tag on every <tag>tagname</tag>, but all the tags would point to the same address: the one generated by the first tag. Corrected. The new code is available here.
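The behaviour I wanted can be sketched in a few lines of Python. This is not the plugin's code (the plugin is a WordPress plugin), only an illustration of the idea: every tag must get a link built from its own name. The function name and the base address used as default are assumptions.

```python
import re
from urllib.parse import quote

# Matches <tag>name</tag>; the non-greedy group keeps each tag separate.
TAG_PATTERN = re.compile(r"<tag>(.*?)</tag>")


def link_tags(text, base_url="http://technorati.com/tag/"):
    """Replace every <tag>name</tag> with a link built from that tag's own name."""

    def to_link(match):
        name = match.group(1)
        # The address is computed per match, so no two tags share a link.
        return '<a href="%s%s" rel="tag">%s</a>' % (base_url, quote(name), name)

    return TAG_PATTERN.sub(to_link, text)


print(link_tags("<tag>mindmap</tag> and <tag>delicious</tag>"))
```

The bug described above corresponds to computing the address once, outside the replacement function, and reusing it for every match.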
Rss is somehow one of the best ideas. You can have your content, stripped of form BS being redirected all around. This gives a one to many structure. Now we need the opposite. We need to be able to pull the content from many sites in the same place, and check it. A many to one structure.
Most of you will say, “But we already have that, it’s called an aggregator. Just look at bloglines”.
Yes, and no, that’s part of it, but it’s not the whole story. We need to have a page that posts all the content from everywhere in a single page.
And again I can hear: “but we have that too: it’s called a technorati tag“.
Again I will repeat: Yes, and no, that’s part of it, but it’s not the whole story. We need to pull the information from the technorati pages to our aggregator.
This is the idea: we need an rss feed of a technorati tag. As we can get the rss feed of a del.icio.us tag, we need to have it for all the blogs. The time has passed when you could add to your friend list ALL the blogs that might have information of interest. We need to be able to add that rss to our bloglines.
So, either technorati will start releasing the rss, or I predict that:
- a) other services will start competing with technorati offering that info
- b) anonymous hackers will start scraping the info from technorati to offer this very valuable information.
See also: semanticweb, tags
This is going to be big. It’s called tagsurf. When we were setting up the taoist discussion board, at Tao Bums, I was looking for a board that permitted me to tag individual messages with different tags. The reason is that over there we are now a group of friendly people and every thread starts with a topic, but often touches many separate ones. The board had to be in PHP for reasons only known to the web master, but that we all were happy to follow. So we started looking around, but no board with a tagging facility came up. Nothing. I had to admit that the idea was quite new, and I had not seen any such board around in any case. And then we decided for phpBB, which being open source would have new versions with any new cool geeky thing appearing every so often. Well. Now I finally found the first tag based discussion board. It’s called tagsurf. And it is very cool. You get to write messages and tag them. As a tag you can use any word of any length. Now, the result of this is that you can tag things with the url of something. So immediately a series of utilities started appearing:
People (first one I saw doing it was Russell Beattie) added a tagsurf button. In short if you click on that button you get all the comments on tagsurf that uses your permalink as a tag. In a sense it is outsourcing the discussion board.
Yes, I added it too, it is down near the little technorati bubble, and I just needed to add:
<a href="http://tagsurf.com/post?tag=<?php the_permalink() ?>">Tagsurf this</a>
in the template.
I also went back to see how tagsurf was behaving in del.icio.us. It seems that, as often happens in other cases, the meme is 6 days old. At the beginning few people noticed it, and now it is starting to explode. I too found out because of the delicious discussion board, which I would suggest to anybody who is interested in delicious OR folksonomy.
I think this tagsurf will and can have great impact. They already have some API defined.
I also had a look at their privacy policy. It seemed simple and clear. Yet now I cannot find it anymore. I suspect that they might be working on it right now.
I also made a small bookmarklet to post an entry on tagsurf about a specific page. Just drag the word bookmarklet on the bar and it should work. Of course for it to work you have to be logged in in tagsurf.
Great points:
- trackback: every post is an entry point for trackback. In other words anything you say can receive trackbacks from anything else. You say something here, and it gets people in the blogosphere chatting. And you can follow their conversation. This is something very important that was missing in all the bulletin boards I have been using. In a sense many discussion boards are only looking in. This is also looking out.
- trackback 2: Every post that you make can send trackbacks to anything you want. The software to do this automagically with respect to the other posts inside tagsurf is still missing, but I can’t imagine it not appearing very soon.
- possibility to mix different threads: since each post gets as many tags as the poster wants it is quite easy for people to join different threads of discussion.
Problems I might see coming.
- Spam, spam, spam: I receive about 30 spam trackbacks a day. And they get filtered by cool programs and finally deleted by me. Yet those programs need me to make the final judgement. Who will make the judgement for all the trackbacks in all those posts? Will the user have to? Can someone close the trackback from his own posts? I see many problems and much discussion over here.
- copyright: This is another big one. Let’s say that I post a cool entry in tagsurf, who gets the copyright of it? It might be important. Imagine that someone takes it, and wants to add some extra tags. But adding tags is not allowed at the moment. So he copies the post and just reposts it with the extra tags. Do I have a say on it?
All together I think this is a wonderful piece of new technology. When technorati started its tag pages I wasn’t very impressed, but this, I think, will have some huge effects. And still I can’t see all the implications.
ADDENDUM: just as I ended this post I read fully the great and very interesting post from Russell Beattie. And I found that he had made exactly the same bookmarklet. Oops. Well, I hope he will not sue me, I haven’t copied his code. I just reinvented the wheel.
ADDENDUM to the ADDENDUM: As I was looking at all the people who were commenting on the thread on Russell’s post I noticed another post with the same bookmarklet. And I thought I would have been the first. At least I get to see if the trackback to posts over there actually works.
ADDENDUM to the ADDENDUM to the ADDENDUM: trackback does not seem to work, or the comment is being held back for security reasons
Did some more debugging. Now any unicode the user used in the tags should be ok. Still there is a big brick wall in terms of memory usage. And some users are not having any luck simply because their map takes so many resources that it goes beyond the ISP limit. I could work hard and distribute the whole calculation so that all variables are stored on disk, so the memory limit would never be hit, but honestly, it is not my top priority at the moment. I am here to help those users run the program on their own machine. And eventually we might solve that problem too. So, what are my top priorities:
- Add an rss feed. I would like to add an rss feed that gets updated every time a new map is done. It wouldn’t just tell the name but all sorts of data, like the list of the Main Tags. So the users could see if they might be interested in checking the new buddy’s map
- Insert a way for user to delete their own maps. If I am going to go into hosting business, I am not going to be one of those hosts where you can add info, but you cannot delete it. I am aware that users info ultimately is adding value to my site, as such I want users to be happy in having their map here. Not forced.
- Insert a general log of all the maps that are being started, and ended. Right now such a log is absent, and there are about 200 maps completed, and more than twice as many maps that have been started. So about 300 have been dropped. I bet many of those users would have success, if they tried right now, after those 3 debugging sessions. Still I want something that tells me: Warning warning warning, map dropped. Bug? OutOfMemoryError?
- Add the number of posts inside a tag. Just obvious
- Probably add some of the MainTags as keywords to each single map. The problem is: which? All is too much. All the ones that contain more than x posts, y subtags is not flexible enough. The solution should be: if a MainTag is part of a ParetoFront of Delicious then the keyword should be there. The fact that this means writing a whole program that stores in a database the latest ParetoFront is just a small detail. And before you ask: no, I will not need anybody’s password to do that, and the data will all be public.
- Add a bookmarklet to save a map in your own delicious, with the keywords as tags
- Change the map, so that it can run on a single tag. Useful for big complex maps like mine, and others.
- Make it change the Title of the Map Page, to show the owner of the map. Useful if people want to add the maps to their delicious pages.
And then there are some tests I would like to make, like:
- Check if it would make sense to show all the tags that appear with a single tag, and not the subtags.
Is there more? If you can think of other modifications, please drop a line in the comment section. Also if you tried to run the map maker and it is not giving you satisfaction let me know. I’ll whip it appropriately. HarHarHar. (I’ve always wanted to say that!)
Some people (few) were in the unfortunate situation that the tool would calculate their map, and would correctly add it to the make map page, but then the map could not be opened. If you were one of those people, I have good news. I tracked down the bug (this time only due to my stupidity) and nailed it. So, please try again. Insert your data again, calculate it again, and then open it. With this I have finished debugging the obvious big errors. If you try now and the tool does not work, please drop me a line with your username, and maybe send me by email (available via my homepage) your complete list of all the posts. And I will see what I can do for you. If you don’t contact me I have no way to know that the tool failed, and I will not be able to help you.
Things are progressing, more and more people are using the tool. Unfortunately it was not a successful experience for all. I could spot two separate bugs. In the first case the map would not be created at all, and the program would stop just after making the poststotag dictionary. In the second the map would be created but it was unreadable by the user. Yesterday evening (in my camper van!) I debugged the first issue. Essentially the program was downloading from delicious two different files, the (don’t click on it) list of all posts, and the list of all tags. Well, the two files were not coherent with one another, and the list of all tags would in some rare cases list tags that had no post associated with them. Of course as soon as the dictionary started being created the program would protest, and quite correctly so. I think the problem has something to do with how del.icio.us is at the moment handling the change tag name function. Maybe the problem has been solved by now, and what I ran into were some users that had used the function while it was still not completely bug free.
In any case I circumvented the problem by not downloading the tag list file at all, but recovering the list of tags directly from the posts/all page. It is obviously slower (my big map moved from 377 sec to 460 sec.) but more secure.
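The workaround can be sketched as follows. This is a minimal illustration and not the actual code of the map maker: it assumes that the posts/all file is XML with one post element per bookmark, carrying an href attribute and a space-separated tag attribute. The function name and the sample data are invented.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def tags_to_posts(all_xml):
    """Rebuild the tag list from the posts themselves.

    Every tag found this way has at least one post, so the mismatch
    between the two downloaded files cannot happen.
    """
    root = ET.fromstring(all_xml)
    mapping = defaultdict(set)
    for post in root.iter("post"):
        # Assumed format: tags are space separated inside the 'tag' attribute.
        for tag in post.get("tag", "").split():
            mapping[tag].add(post.get("href"))
    return dict(mapping)


# Invented sample in the assumed format.
sample = """<posts user="example">
  <post href="http://a.example/" description="A" tag="python mindmap"/>
  <post href="http://b.example/" description="B" tag="mindmap"/>
</posts>"""

print(sorted(tags_to_posts(sample)))
```

The price is the extra pass over all the posts, which fits with the slowdown mentioned above.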
So, if you tried to use the map maker before, and you did not have much luck: if it did not create the map at all, then try again, and now it should work. And if it doesn’t, please contact me, and maybe send me an email with your all.xml file.
If instead the map was created, your name was added but the map was unopenable, then keep having patience, and this evening I hopefully will kill that bug too.
And thanks to all who are using the tool, it is such an interesting project for me!
Pietro
The good news is that finally the Mind Map Maker is being used and tested. The bad news is that it does not always work. Somehow it would have been easier if it never worked. I think there are two problems: one problem is that it requires some heavy downloads from del.icio.us. No matter if the downloads are for different accounts, they are all coming from the same IP, so I would not be surprised to discover that del.icio.us has bashed the program on the head more than once. I can roughly halve the requests by making the program calculate the whole list of tags, instead of downloading it as a separate file. I had it already on my todo list, and I think I will do it tomorrow. So, if you have requested a password and it did not appear, then fear not, just try again in half an hour or so. (Alenahra, I’m speaking to you for example!)
But this is not the only reason why the map maker is failing. There have also been cases where the map maker made some ‘perfectly acceptable’ maps from my point of view, but that for some reason are unreadable by the mind map program. What am I referring to? To niels77 for example, for whom the program made what seems a perfectly acceptable .mm file but that for some reason neither the java program, nor the free mindmap on my computer seems able to read. This is the kind of mystery that is more easily unraveled in the morning.
But for the few maps that don’t make it, many did. Just go to the Make Map page and choose one, any one. And each will tell you a story, a point of view, a set of interests, and a suggestion on how that person sees the world. The more I use them the more I like them.
BTW the Make Map has also made it to the popular page. I feel so unprofessional in noting it
Update: I checked how many directories have been created with respect to how many maps have been completed. The ratio is about 110:70. That’s not that good. It means that if you ask for a map you have about a 1/3 probability that it will not make it. For now just wait some time, then try again.
The first person to use the tool (presented here) was Mike Harris, for his delicious entries. Note immediately how the time needed to compute the map has little to do with the number of posts, and much to do with the number of tags.
- WCityMike: 2029 Posts, 87 Tags and 81 Main Tags, calculated in 86.85 seconds.
- p.s.blog: 21 Posts, 43 Tags and 17 Main Tags, calculated in 0.23 seconds.
- pietrosperoni: 372 Posts, 400 Tags and 152 Main Tags, calculated in 377.40 seconds.
The Main Tags are the tags that will appear as main branches. And we can also see a difference between Mike’s maps and mine. In mine I tend to have about 0.4 of the tags as Main Tags, while Mike tends to have something nearer 0.9. This is probably due to the fact that I tend to apply many tags to each post (four or five are common, but sometimes more), while Mike tends to use an average of one or two.
If we look at the map we can also see that there are fewer clusters than in my map. Note for example how in the small blog map nearly everything is clustered… and those are only 20 posts and 17 Main Tags.
If we look at the source code we can see that, on the 9th line some constants are set:
distances_constant= [0.333333,0.4,0.5,1]
Those constants define the minimum distance for entries to be in the same cluster.
The 1/3 means that if one third of the posts between two tags are in common then the tags should be in the same cluster. And so on. Tags that are farther apart, but have a path of tags between them such that you can go from one to the next without ever going above that distance, are in the same cluster too. A process that in the log is referred to as making the distances tables transitive.
Those numbers have been specifically tweaked for my delicious posts (and generally my style of bookmarking). It seems obvious that for Mike the numbers should be different. Since it is less common for his posts to share a tag, probably the numbers should be lower. Something like:
distances_constant= [0.1,0.333333,0.25,0.4,1]
The last 1 is just to make sure that tags that are synonyms are shown together.
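To make the role of the constants concrete, here is a minimal sketch in Python. It reads each constant as the minimum fraction of shared posts at which two tags are put in the same cluster, as described above, and it deliberately ignores the transitive step (tags joined through a path of other tags). The function name is invented and this is not the program's own code.

```python
# The constants from the source code quoted above.
distances_constant = [0.333333, 0.4, 0.5, 1]


def levels_joined(similarity, thresholds=distances_constant):
    """Return the thresholds at which two tags with this share of
    common posts would be placed directly in the same cluster."""
    return [t for t in thresholds if similarity >= t]


print(levels_joined(0.45))  # joined at the two loosest levels only
print(levels_joined(1))     # synonyms: joined at every level
print(levels_joined(0.2))   # never joined directly
```

With lower constants, as suggested for a style of bookmarking with few tags per post, more pairs of tags would pass the first levels.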
I think eventually I will modify the program so that it is possible to insert your own constants from outside. But for now I am just grateful to Mike for giving me the material to understand better how to enhance the program.
You know, maybe because my father has been a journalist for so many years I have always been raised to appreciate the complexity of life. And yet I still can’t understand why do we need all this complex machinery.
I totally agree on the importance of the semantic web. And, boy, am I thrilled at the possibility that we might be generating the internet operating system. I am also aware of the cutting edge problem of who owns your data.
But what I just can’t get is why we need to turn things that can actually be quite simple into this amazing complexity. I might not be getting the whole picture, and I admit ignorance, above stupidity. But still, why do we need to build this whole house all at once? For example, del.icio.us has been amazing, and trivial at the same time. And amazing also because it was so trivial.
Now let’s expand the concept:
Instead of storing one single link let’s store two links, and a set of tags in the middle. Two links with their two titles and maybe their two descriptions. And one set of tags between them.
And people will naturally start using interesting tags.
Like:
‘explains’, ‘terrorises’, ‘defines’, ‘is’, ‘IsTerrorisedBy’, ‘embedds’, ‘uses’, …
It will also be fun.
And then get a page for all the links that use a certain URI as their first link (…/subj/…), and another for those that use a certain URI as their second link (…/obj/…).
Then you can use delicious to store those pages.
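Just to show how little is needed, here is a minimal sketch of the data structure in Python (the real thing would sit on a database). All names are invented for illustration, and titles and descriptions are left out to keep it short.

```python
from collections import defaultdict, namedtuple

# One entry: a first link, a set of tags in the middle, a second link.
Triple = namedtuple("Triple", "subject tags obj")


class TripleStore:
    """Hypothetical in-memory store for link-tags-link entries."""

    def __init__(self):
        self.by_subject = defaultdict(list)
        self.by_object = defaultdict(list)

    def add(self, subject, tags, obj):
        triple = Triple(subject, frozenset(tags), obj)
        self.by_subject[subject].append(triple)
        self.by_object[obj].append(triple)
        return triple

    def subj(self, uri):
        """Everything that has this URI as its first link."""
        return list(self.by_subject.get(uri, []))

    def obj(self, uri):
        """Everything that has this URI as its second link."""
        return list(self.by_object.get(uri, []))


store = TripleStore()
store.add("http://a.example/", ["explains"], "http://b.example/")
store.add("http://c.example/", ["uses", "defines"], "http://b.example/")
print(len(store.obj("http://b.example/")))
```

The two lookup methods correspond to the two kinds of pages described above.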
The bookmarks manager was del.icio.us? I can assure you that this will be at least org.asm.ic! And it will not cost the programmer more than 200 lines of code. LAMP, PHP, MySQL, keep it simple. And we will all use it.
I went on programming at my favourite Python program: Delimind.
In short: Made a new release of the Deli Mind program. Here is the source code (just remember to change it from a .txt to a .py). Now similar tags are clustered together.
- Here is what it looks like.
- Here is what the previous version looked like.
- The original from Brownhen (may he live long and prosper) used to be here, although now it is missing.
All on the same data. Mine, now.
Go and enjoy.
(Later addition: while the program works well for small databases of links, like mine at the time in which I wrote this entry, it doesn’t scale well on size. For this reason it crashes for most of the people who try to use it with more than 1000 bookmarks. For this reason I was forced to change the link on the cluster example to a database with fewer nodes.)
Now the technical stuff for those that have a bit more patience.
Tags are not all the same, some are more similar than others. So, for example, the tags “September11” and “GeorgeBush” have more links in common than “GeorgeBush” and “intelligence”. The idea behind this version of DeliMind was to cluster tags that had links in common. Distance is generally not a transitive property (if I am near to you, and you are near to Jim, I am not necessarily that near to Jim), while clustering is (if you and I are in the same cluster, and you and Jim are in the same cluster, then Jim and I have to be in the same cluster… unless people belong to different clusters, but that’s a complication).
So I started by making a matrix of relations among tags (all_dict). Each tag, respect to each other tag could either be
- One contained in the other
- Identical
- Disjoint
- With # bookmarks in common
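The four cases can be sketched with Python sets. This is a minimal illustration, not the code that fills all_dict; the function name and the return values are invented.

```python
def relation(links_a, links_b):
    """Classify how two tags relate, given the links filed under each."""
    a, b = set(links_a), set(links_b)
    if a == b:
        return "identical"
    if a < b or b < a:
        # One set of links is a proper subset of the other.
        return "contained"
    common = len(a & b)
    if common == 0:
        return "disjoint"
    # Otherwise report the number of bookmarks in common.
    return common


print(relation({"u1", "u2"}, {"u1", "u2", "u3"}))
print(relation({"u1", "u2"}, {"u2", "u3"}))
```

Computing this for every pair of tags gives the matrix of relations.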
Then, according to the number of links in each of the two tags and the number of links in common, I invented a measure of similarity. Let #A be the number of links in tag A, #B the number of links in tag B, and #AB the number of links in common.
Then the relative similarity (SAB) will be:
SAB= sqrt((#AB/#A)*(#AB/#B))
I actually played with various measures:
SAB= ((#AB/#A)+(#AB/#B))/2
SAB= Max(#AB/#A,#AB/#B)
They all went from 0 to 1, and were quite similar… (I am not going to discuss the relative properties)
But the first one just seemed the one that made the most sense, and in the end the resulting map was the one closest to my personal intuition of what should be in what cluster.
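For those who want to play with the three measures, here they are as a minimal Python sketch. The function names and the example numbers are invented; the formulas are the ones written above.

```python
from math import sqrt


def sim_geometric(n_a, n_b, n_ab):
    """SAB = sqrt((#AB/#A) * (#AB/#B)), the measure finally used."""
    return sqrt((n_ab / n_a) * (n_ab / n_b))


def sim_arithmetic(n_a, n_b, n_ab):
    """SAB = ((#AB/#A) + (#AB/#B)) / 2"""
    return ((n_ab / n_a) + (n_ab / n_b)) / 2


def sim_max(n_a, n_b, n_ab):
    """SAB = Max(#AB/#A, #AB/#B)"""
    return max(n_ab / n_a, n_ab / n_b)


# Made-up example: tag A has 4 links, tag B has 9, and 3 are shared.
for measure in (sim_geometric, sim_arithmetic, sim_max):
    print(measure.__name__, round(measure(4, 9, 3), 3))
```

All three give 1 for identical tags and 0 for tags with no link in common; they differ in how much they reward a small tag sitting mostly inside a large one.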
Once the similarity matrix was done I started studying the clusters. Generally for each triplet of tags A, B, C I would modify
SAC:=min (previous SAC, max (SAB, SBC))
And I would continue going through all possible triplets, and then start again from the beginning, until no new changes were happening.
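The repeat-until-nothing-changes loop can be sketched like this in Python. One assumption is made loudly: the numbers in the matrix are read as distances (small means close), which is the reading under which the min/max rule above shortens things; for numbers where large means close, min and max would swap roles. The function name and the example matrix are invented.

```python
def make_transitive(dist):
    """Repeat D[a][c] = min(D[a][c], max(D[a][b], D[b][c])) over all
    triplets until a full pass changes nothing. Returns a new matrix."""
    n = len(dist)
    d = [row[:] for row in dist]  # work on a copy
    changed = True
    while changed:
        changed = False
        for a in range(n):
            for b in range(n):
                for c in range(n):
                    # Longest hop on the path a -> b -> c.
                    candidate = max(d[a][b], d[b][c])
                    if candidate < d[a][c]:
                        d[a][c] = candidate
                        changed = True
    return d


# Made-up distances between three tags: 0 and 2 are far apart directly,
# but both are close to tag 1.
example = [
    [0.0, 0.2, 0.9],
    [0.2, 0.0, 0.3],
    [0.9, 0.3, 0.0],
]
print(make_transitive(example))
```

After the loop, the entry for tags 0 and 2 holds the longest hop needed on the best path between them, here 0.3 instead of 0.9.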
Why? The idea is that the similarity between two tags measures how easy it is to jump from one to the other. Visualise each tag as an island, and then you have an animal who can jump from one island to the other. But it can only jump up to a certain distance. So if it can find a succession of tags between two tags, A and B, where the similarity (the similarity is the inverse of the distance) is always above its jumping ability (that is, the distance is below its jumping ability), then the animal can move from A to B. If not, A and B are in different clusters. Effectively unreachable.
But we don’t know how far our beast can jump. So in this way we end up having a similarity number that says: somewhere between A and B it is possible to find a succession of tags such that the distance is never above x, so SAB is equal to the minimum between the original SAB and x.
If it does feel complicated don’t worry. I got confused a few (hundred) times programming it. And just could not understand why those damn tags were not clustering… until I got it right.
So, now you have this nice matrix, only between your main tags (the ones that are not contained in another tag, cf. the previous version), and you (or actually I) need to cluster the tags.
Note also that you don’t need to cluster the tags only one time. Once you have made a clustering (for an animal which can jump d), you can still partition inside each cluster for animals that can jump less than d.
The first time I just asked it to cluster at each possible number. That is, if a number was present, assume that someone was able to jump exactly that distance. In this way I got a heavily clustered map. It was a mess, but a promising mess. I then saw that most of the interesting things were happening between distances of 0.333333 and 0.6666.
That is, it made quite some sense to ask for the clusters generated by putting together tags that had one third of the links in common, and tags that had up to two thirds of the links in common.
This is how I got clusters:
- porno, sex and eros
- GeorgeBush, September11, politics, economy, historical, terrorism, usa
- green, sustainability
- …
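The jumping animal can be sketched in Python as follows. This is a minimal illustration, not the DeliMind code: the distances in the example are made up, pairs that are not listed are assumed to be as far apart as possible (1.0), and the function name is invented.

```python
def clusters_at(tags, dist, jump):
    """Group tags so that, inside a group, the animal can get from any
    tag to any other using only hops of length <= jump."""
    parent = {t: t for t in tags}

    def find(t):
        # Follow the chain up to the representative of the group.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for i, a in enumerate(tags):
        for b in tags[i + 1:]:
            # Unlisted pairs default to the maximum distance 1.0.
            if dist.get(frozenset((a, b)), 1.0) <= jump:
                parent[find(a)] = find(b)

    groups = {}
    for t in tags:
        groups.setdefault(find(t), []).append(t)
    return sorted(sorted(g) for g in groups.values())


# Made-up distances, only to show the behaviour.
tags = ["green", "sustainability", "politics", "usa"]
dist = {
    frozenset(("green", "sustainability")): 0.4,
    frozenset(("politics", "usa")): 0.5,
}
print(clusters_at(tags, dist, 0.6))
print(clusters_at(tags, dist, 0.45))
```

Running it again with a smaller jump splits the clusters found with the larger one, which is the nested partitioning described above.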

Then I just applied the same process in the subtags of each tag.
Ok, I can be satisfied, I can go and have something to eat.
As always, if you find it useful drop me a line, I appreciate it.
Pietro