The first person to use the tool (presented here) was Mike Harris, for his delicious entries. Notice right away how the time needed to compute the map has little to do with the number of posts, and much to do with the number of tags.
- WCityMike: 2029 Posts, 87 Tags and 81 Main Tags, calculated in 86.85 seconds.
- p.s.blog: 21 Posts, 43 Tags and 17 Main Tags, calculated in 0.23 seconds.
- pietrosperoni: 372 Posts, 400 Tags and 152 Main Tags, calculated in 377.40 seconds.
The Main Tags are the tags that appear as main branches. We can also see a difference between Mike's map and mine. In mine, about 0.4 of the tags are Main Tags, while in Mike's the figure is closer to 0.9. This is probably because I tend to apply many tags to each post (four or five is common, sometimes more), while Mike tends to use one or two on average.
If we look at Mike's map we can also see that there are fewer clusters than in my map. Note, for example, how in the small blog map nearly everything is clustered, and that is with only 21 posts and 17 Main Tags.
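To make the 0.4 and 0.9 figures concrete, here is a small sketch that recomputes the share of Main Tags from the numbers listed above. The names `maps` and `main_tag_ratio` are made up for this illustration and are not part of the tool.

```python
# (tags, main tags) for each map, copied from the list above.
maps = {
    "WCityMike": (87, 81),
    "p.s.blog": (43, 17),
    "pietrosperoni": (400, 152),
}


def main_tag_ratio(tags, main_tags):
    """Share of tags that end up as main branches of the map."""
    return main_tags / tags


# Round to two decimals for easier comparison.
ratios = {name: round(main_tag_ratio(t, m), 2) for name, (t, m) in maps.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}")
```

Mike's map comes out just above 0.9, while both of mine land around 0.4.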
If we look at the source code, we can see that some constants are set on the 9th line:
distances_constant= [0.333333,0.4,0.5,1]
Those constants define how close two tags must be to end up in the same cluster.
The 1/3 means that if one third of the posts of two tags are in common, then the tags should be in the same cluster, and so on for the other values. Tags that are farther apart, but connected by a path of tags in which every step from one tag to the next stays within that distance, are in the same cluster too. In the log this process is referred to as making the distances tables transitive.
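The idea of chaining tags into a cluster can be sketched as follows. This is only an illustration, not the code of the tool: the function names, the sample data, and the way overlap is measured (shared posts divided by posts carrying either tag) are all assumptions, and the real program may measure it differently.

```python
from itertools import combinations


def overlap(posts_a, posts_b):
    """Fraction of posts in common (assumed: shared / posts with either tag)."""
    union = posts_a | posts_b
    return len(posts_a & posts_b) / len(union) if union else 0.0


def cluster_tags(tag_posts, threshold):
    """Group tags so that any chain of close-enough tags shares one cluster."""
    parent = {tag: tag for tag in tag_posts}

    def find(tag):
        # Follow parent links up to the representative of the cluster.
        while parent[tag] != tag:
            parent[tag] = parent[parent[tag]]
            tag = parent[tag]
        return tag

    # Link every pair of tags whose overlap reaches the threshold.
    for a, b in combinations(tag_posts, 2):
        if overlap(tag_posts[a], tag_posts[b]) >= threshold:
            parent[find(a)] = find(b)

    # Collect the tags under their representative.
    clusters = {}
    for tag in tag_posts:
        clusters.setdefault(find(tag), set()).add(tag)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Made-up data: tag -> ids of the posts carrying that tag.
tag_posts = {
    "python": {1, 2, 3},
    "code": {2, 3, 4},
    "tools": {3, 4, 5},
    "food": {9},
}

print(cluster_tags(tag_posts, 0.333333))
```

Here "python" and "tools" share only one post out of five, which is below one third, but both are close enough to "code", so all three land in the same cluster while "food" stays alone. With a threshold of 1 only tags with identical sets of posts would be joined.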
The values in distances_constant have been specifically tweaked for my delicious posts (and for my style of bookmarking in general). It seems obvious that for Mike the numbers should be different. Since two of his tags appear on the same post less often, the numbers should probably be lower. Something like:
distances_constant= [0.1,0.333333,0.25,0.4,1]
The last 1 is just to make sure that tags that are synonyms are shown together.
I think I will eventually modify the program so that you can supply your own constants from outside. But for now I am just grateful to Mike for giving me the material to better understand how to enhance the program.