How Twitter, Google, Wolfram|Alpha and Wikipedia are not competing at all

It seems to me that Twitter, Google, and Wolfram|Alpha are really not competing at all, but are instead providing complementary services. I would go further by adding Wikipedia (and blogs?), and suggest that the 4 services really represent the digestive process of our information society. From the first Churning to the Backbone.

Wolfram|Alpha represents the deeper part. It includes only what our society really knows inside out. What has been fully digested. For something to be in W|A it needs to be known, semantically known, beyond doubt. And notice that I am speaking here about a deeper Wolfram|Alpha than what you have seen here. The Wolfram|Alpha as it should be, once we have learned how to interrogate it properly, and once it has expanded to cover the rest of the knowledge we have.

At a higher level we have Wikipedia. Wikipedia permits much more stuff to be present. You have actors, and theories, and stories, and a lot of other stuff. You also have discussions and points of view. In short you have a lot of stuff that is not being digested anymore, but is also not the bones of our society. It is more like the muscles. The limit of Wikipedia is that since it does not allow new research, by definition it is limited to what has already been discovered. Although not in as definite a way as in Wolfram|Alpha.

And then we have Google. Google is really part of the digestive process. It has new stuff coming in every few days. But it is also less clear. You need to work to get to the results using Google. But you can also find new threads. Things that are still not known. There is real food here, waiting to be digested.

And Twitter is the most superficial tool. Twitter has second-by-second updates. It has multiple links in different forms that point to the same resource. Information is not organised in any way, shape or form. But it is information nevertheless. It represents the edge of the knowledge wave of our civilisation. It is deeply alive, unpredictable, full of possibilities. You never know how it will react. It is the most alive part of the constant discussion that is going on in our civilisation. It is the civilisation's equivalent of the constant chit chat that is going on in our head. Although it has memory, it is not really good with it. Anything that is on Twitter can be true, can be false, can be anything in the middle, neither or both at the same time.

If you are an alive and creative mind that wants to participate in the constant flow of creation of this society you will probably end up interacting on Twitter in some way. But if you want your creation to be grounded in reality you need to use the other levels as well. They are really not competing.

And Blogs? Blogs are the way in which we store longer personal stories. The untwittable (as Chris Anderson from TED called his). They work between the Google level and the Twitter level. Letting information move between those levels, and letting complex information be churned before it is ready to go deeper. Similarly you have journal articles (and books) working to bring the information to the Wikipedia level.

Facebook as a spiritual tool

[crossposted on the moblog, and the facebook notes.]

One of the leitmotifs in spirituality is reaching an integration among the various parts of oneself. There are many important reasons for this, which I am not going to enter into right now. Becoming One is not seen in Taoism as a spiritual goal, but as a spiritual prerequisite. It is not school, it is preschool. Until you are one you cannot really get involved with spirituality. It is as if your family decided to build a house, but not everybody agreed on it. Then one part of you builds it in the morning, and someone else in the family destroys it in the evening. Maybe using the bricks for something else.

The idea that we are many, that each of us is many, is quite common. It is common in psychology: Jungian psychology, if I recall correctly. Again, in Taoism it even reaches the point of believing that this is literally true. Each of us is seen as a patchwork of different spirits (shen). And when you die each spirit will then go its own way. As such, in Taoism you cannot even have reincarnation until you have reached a real integration between the parts of yourself (your spirits), that is, until you have developed a unit which is integrated enough to go through the trauma of death without shattering into 1000 little pieces.

And another idea that is very common (you have it in Taoism, but also in Christianity, for example), is the idea that one day, one time, at some point we will all get together. Christians say “sit by the father”. In Taoism the idea is that any person who has shown a spark of interest in spiritual work will eventually join the others in some place beyond space and time, a sort of heaven. And the joke then is whether people are following the 1 lifetime program, the 10 lifetime program, the 100 or 1000 lifetime program, to reach it. And the faster it is, the rougher it is.

I have to say I am amazed by how well Facebook is helping in this integration work, for me. I have many friends on Facebook. But more importantly I have friends from different groups. Each friend knew a different Pietro. Some were from my spiritual life (taoism, tai chi, meditation, …), some from my academic world (artificial life, mathematics), some are Go-brothers, others are people I knew from childhood, or from high school, or middle school. And with each of them I was a different person. And now they are all together. All in the same place. And the internet does indeed feel a little bit like this place beyond space and time. And I read about many of them. But what is more important is that, as I write about my life, I am forced to write in a way that is acceptable for both my academic side and my spiritual side. I can only write in an integrated way, because I know that friends from both worlds will read me. In this sense Facebook is catalysing an integration in me. It is helping me to become one.

I know many people are having problems with Facebook. I think a lot of the problem is that they are not ready or willing to have this integration. For me Fb is pretty easy: to become my friend you need to know me. With very few exceptions I do not add anyone who is not someone I personally know. But if I have met you, and you want to befriend me, then you are in. I don’t keep people that I know out of the door. Because that would be equivalent to keeping some part of myself out of the door, the part of me that interacted with them. You are all invited to the party. I sometimes even go back in time, and look for people I once knew. People that were important in my life. Or people I wished I had the time to know better. Maybe now we have another occasion. But then on my status, in my notes, in the captions of my photos, I try not to speak thinking about one person in particular (I might have done it, but mostly I try to avoid it). I speak to all my friends at the same time. And if anyone comments, I answer that person, personally. The answer is personal, but anybody can see it, and thus the integration goes deeper. I write in English and in Italian, because those are the languages with which I live, work, chat, play and love. My inner dialogue is sometimes in Italian and sometimes in English, depending on where I am and what I am thinking of doing. And my Facebook reflects that.

Most of you know that I use Facebook pretty frequently. I update the status often, sometimes more than once a day. But what some of you have not realised is that I do not do much else on Facebook. I avoid Facebook applications. I only use the ones that are truly useful, that add functionality that was not there, and are truly helpful. If I want to wish my friend a Happy Chinese New Year, I will do it in person, or through the status. Not through an application. In this way the integration proceeds. I very rarely invite people to use applications. I only do so when I think an application is very very good. (The “skip this” button is my friend). I invited my friends to the geo tagging application. I would do it for the “cause” application. Maybe iRead could be another one, and the application to play Go online. Here you go, this makes 4. And when I invite people I only invite people I think will appreciate it (or should, whether they know it or not ;-) ). I consider the other applications to be equivalent to spam. I try not to spam my friends. When a new application arrives (elves, and pirates, etc…), I usually just block it. If an application requires me to send invitations before it lets me proceed, I report it (because it is breaking the TOS, and ruining the party for everybody), delete it and block it. With absolutely no pity, whatsoever.

I often see people who get tired of Facebook. But very often those are people who are not using Facebook as a tool to interact with friends that are far away (in space or time), but as a game. Those are the friends that use more of those useless Facebook applications. They get tired, but what they are really getting tired of are those useless applications. They are right in getting tired. They just need to use Facebook, instead of being used by it. And then fb will stop being a toy, and become an instrument. You will forget about Facebook, and think about your friend.

Keeping the applications to the minimum necessary.
Speaking to everybody. Inviting all your (real life) friends.
It is fairly easy to let facebook help you in the integration process.

Polyphasic Easter

It is 2.30 am and I just woke up. I went to bed at 2.05 am.

I am not crazy. Not yet, not anymore, at least not more than usual. I am just trying a new sleeping technique. It is called polyphasic sleep. I actually wanted to become polyamorous, but I got confused during the googling process, and now it is too late.

What follows is a brief intro to polyphasic sleep for the general bear audience.

wikitags

I think it’s time to present what I have been doing in the last few days. A number of improvements have been added to this web site. In short I have upgraded to wordpress 2.0. I also moved to the next version of wikka. Some of you might remember that I offered some money to whoever could write some code to get the tag plugin to generate an rss list. I didn’t, at the time, explain why. I will now.

Wordpress 2.0 gives the possibility to create categories on the fly, just by listing them. Essentially this makes categories in wordpress work like tags (or keywords, for academics). But categories in wordpress also have an rss feed connected to them. Albeit with some bugs, like linking to the whole blog and not to the particular category. So I spent most of the first of January adding the relevant tags to the entries as categories. So now I have no need of an rss feed for the tag page, as the tag page has been replaced by the category pages.

You will also remember that I installed Wikka, the wiki engine. Now Wikka is not only open source but also easy source. It is so simple that even I could hack the code. That is very simple! So I changed the code and added the possibility to have default pages. In short, before, if you were to look for the url http://wiki.pietrosperoni.it/someunexistingpage and there was no page in the wiki called “someunexistingpage”, the wiki would ask you to edit the page, and you would be redirected to http://wiki.pietrosperoni.it/someunexistingpage/edit.
Now it creates the page someunexistingpage on the fly, with the default content. And the default content I chose was the 4 rss feeds:

  • the feed from my blog from the category: someunexistingpage
  • the feed from my technorati from the tag: someunexistingpage
  • the feed from my delicious bookmarks from the tag…
  • and the feed from the popular pages in delicious, always from that tag

So for each tag I now have a wiki page with the most relevant rss appearing there. But being a wiki page I also can add other rss feeds, write definitions, comments, todo lists. In short modify it as I see fit.
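To make the mechanism concrete, here is a minimal sketch in Python (not the actual Wikka hack, which lives in the wiki's own code). The feed URL patterns, the `{{rss ...}}` markup and the names `FEED_TEMPLATES`, `default_content` and `get_page` are all assumptions made for illustration.

```python
# Hypothetical feed URL templates, one per source named in the list above.
# The real services use their own URL patterns; these are placeholders.
FEED_TEMPLATES = [
    "http://blog.example.org/category/{tag}/feed",     # blog category
    "http://technorati.example.org/tag/{tag}/rss",     # technorati tag
    "http://bookmarks.example.org/me/tag/{tag}/rss",   # my bookmarks
    "http://bookmarks.example.org/popular/{tag}/rss",  # popular bookmarks
]


def default_content(tag):
    """Build the default page text: one feed line per template."""
    lines = []
    for template in FEED_TEMPLATES:
        url = template.format(tag=tag)
        # Placeholder markup for a feed plugin; the exact syntax is assumed.
        lines.append('{{rss url="' + url + '"}}')
    return "\n".join(lines)


def get_page(wiki, name):
    """Return the page text, creating it from the default on first request."""
    if name not in wiki:
        wiki[name] = default_content(name)  # stored once, editable afterwards
    return wiki[name]


wiki = {}
print(get_page(wiki, "idea"))
```

The wiki is modelled here as a plain dictionary from page name to page text, which is enough to show the one design choice that matters: the default is written into the page on first access, so from then on it behaves like any other page.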

Still it is not perfect. Once it has written the page the first time, from that moment the page is set. I can delete it, but I cannot, for example, change the default content for all the pages that only contain the default content. I tried to write a plugin to do that, but I failed when I realised that I needed a plugin {{defaultpage}} that would itself activate other plugins: {{rss}}, for example. Something that I did not know how to do.
Also, having the same string work for delicious (as tag), wordpress (as category name) and wikka (as page name) puts some heavy constraints on what the string may contain. For example I am already running aground on all the tags that contain a dot (aaargh, del.icio.us!) or an accented letter (aargh, dear italian).
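One way to at least detect the troublesome tags is a small check like the one below. This is a sketch under my own assumption that only plain ASCII letters, digits, hyphen and underscore are safe in all three systems; the function names are hypothetical, and rewriting a tag changes the tag, so `portable_form` is a possible workaround, not what this site does.

```python
import unicodedata

SAFE_EXTRA = "-_"  # assumed safe in a tag, a category name and a page name


def is_portable_tag(tag):
    """True if the tag uses only ASCII letters, digits, '-' and '_'."""
    return bool(tag) and all(
        ch.isascii() and (ch.isalnum() or ch in SAFE_EXTRA) for ch in tag
    )


def portable_form(tag):
    """Strip accents and drop every character that is not considered safe."""
    # NFKD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", tag)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return "".join(
        ch for ch in without_marks
        if ch.isascii() and (ch.isalnum() or ch in SAFE_EXTRA)
    )


for tag in ["idea", "del.icio.us", "città"]:
    print(tag, is_portable_tag(tag), portable_form(tag))
```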

If you want to see what the pages look like just see the idea page. But any link from the right column (provided it has no dots or accents inside) will work fine.

Cloudalicious suggestions

Terrell Russell asked for some suggestions on how to improve his tool, Cloudalicious. He asked for it 3 times: once on the del.icio.us mailing list, once in a comment on my previous entry, and once on his site (link missing as he wisely took this one off). Now I really think three times is too much, and Terrell should be heavily chastised for this. So I will write him a loong list of things that I think his tool should do, and maybe next time he will think better before asking people how he should employ his time as a programmer. I am always happy to give ideas to people, provided a) they remember me when the idea makes them incredibly rich b) they do the coding.

BBC backstage and News in Folksonomy form

Some things are bound to happen. And they tend to happen at the right time. We have been using tags for years now, but the momentum has built up, day after day. We keep seeing more and more computer programs using them. Starting from del.icio.us and flickr. Then 43things.com, consumating.com, tagsurf.com and all the clones of the above (BTW if anybody can find me a small open source server program that emulates Flickr for personal use, I would be grateful). And of course technorati tags, and GutenTag that gives rss feeds to technorati tags.
But something was missing. Something that some people might have noticed. The news was not playing with tags. News was still presented in the old top down way: politics, economics, international…
On Google News, as well as CNN. On Yahoo News, as on BBC.

But finally something is starting to move over there too.
Two services, pretty much at the same time were presented: Yahoo News with tags and BBC with tags.

But there are some serious differences between the two services. Yahoo content is automatically indexed by a program, which imposes the tags according to what keywords are found in the text. As such Yahoo tags is a Top Down keyword classification of stories.

Instead (and here you can see the revolutionary spirit blowing through English news services), the BBC program is a truly bottom-up, grassroots program. A program where everybody can add any tag to any article.
The difference is not a minor one, as in the first case it is the user who has to adapt to the world view of Yahoo, while in the second it is BBC that includes the user’s world view in its own wider one. In a sense it is a case of Tagsonomy vs. Folksonomy, or narrow folksonomy vs. broad folksonomy.

Of course both programs are still in their first days. Full of bugs, and of suggestions from us on how to make them better, smoother, and nearer to our personal desires.

Of course letting anybody add any tag to a copy of the BBC content is full of political dangers. What if stories about important politicians start to be tagged as ‘dictator’ or ‘wanker’? This is in fact inevitable, but politicians would do well to use this as an indication of their popularity, rather than as something to be changed.

At the moment anybody can add a tag in the BBC news page by logging in as ‘guest’/‘guest’. And already we have some people who have tagged some stories as ‘wanker’. But if we go to delicious we see that nearly no one has used such an epithet.

Why is that? My personal position is that people are more careful when tagging something for their own personal use. On delicious everybody has an account. And although you could have as many accounts as you like, they cost. They cost time and memory to set up. So we all tend to have just the minimum number of accounts needed. But on BBC, at the moment, only BBC people are allowed to have their own account. We normal human beings can only be guests. And as such we might feel less responsible for what we write. So I think that, although the experiment is great, it will only work properly when everybody can set up their own account, and search their own account, or the account of another, well defined person.

Of course this also opens up all sorts of extra possibilities. After all, if anybody can tag any article with their own tags, then for each article a set of tags will be defined. What if I want to receive (maybe on my mobile) all the articles tagged with a certain keyword? The possibilities are really endless.

And to look at those possibilities BBC has started a whole new project, called BBC Backstage, where geeks are invited to collaborate with the staff of BBC to develop the API that lets everybody reuse the BBC material. Cross this with the fact that much of this material is copyrighted with a copyleft copyright (copygotit?), and you see how the whole situation can positively explode.
Imagine, much of the material from BBC, offered for free, in the way wanted by the best geeks and hackers, to produce information in any noncommercial way they please.

Already many ideas are flowing: an RSS feed for the results of sports matches. Crossing Google Maps with BBC News.

Possibility to have BBC news accepting trackbacks.

And many many others.

All this would mingle BBC with the common people. Think, all the news, mixed and remixed. Commented, trackbacked. Until you can read an article from BBC news from any device (through rss), in any format you want (through your rss reader). Filtered any way you want (through folksonomy), and seeing the world's response to that article (through trackbacks and comments).

Thank you BBC
(and no, I am not paid by BBC)

Thanks also Wired for some inspiration.

Technorati rss 2: Guten Tag

As I predicted a service on the net that offers an rss feed of blogs with a certain keyword / tag has appeared. It is It’s a Tag world from Stephane Lee.

The system still seems in its infancy (the name needs to be shorter, for a start, to be used as a tag itself), but by offering the rss feeds of blog entries they give a really valuable tool to the blogosphere.
Now, either Technorati starts offering the rss feed itself, or sues Creative Mobs (estranging a big part of the blogosphere), or accepts to coexist with another, potentially aggressive, competitor.

And indeed, it’s a tag world,
worse than that:
it’s a tag eat tag world!

Pietro

P.S. The system also has a certain humor and pragmatism: the general tag page is called ‘Guten Tag’ (German for ‘Good Day’). And each tag has, in its header, a link to the equivalent Wikipedia entry (which works fine only in English, unfortunately, but still…).

Thanks Stephane.

Visualizing the double hierarchical nature of entries.

I keep on being haunted by a nightmare:

Think about a post. You write a post, and this is in answer to some other posts, some other web pages, done by someone else. And your post will often be answered by other people. In a sense no post is an island. Given a post you can see all the posts that answered it, or reviewed it. This is through the trackback list. And they themselves have other posts that answered them. And so on. But this does not work only one way. You can also go backward in time (which in fact is what we usually do when we follow the links). You read a post, then you read the post that post is referring to, and so on. And in my dream this is a sort of tapestry, where each post is a node that links together different threads. So each post is not just contained in a thread, but connects many threads that run through it.

Now think about a discussion group. In a discussion group each post is part of a tree. Each post can be answered by many posts, but it has only one father. One post it is itself answering to. And because of this structure it is possible, and actually easy to generate the classical hierarchical structure, that you can see pretty much everywhere in discussion group. (i.e. the Healing Dao discussion group)

But if you look closely you will notice that discussion groups do not really have a tree structure. Posts do have one father, yes, but they refer to many other posts. They might not explicitly link to all the posts they refer to, but they surely refer to many posts. This is because in discussion groups there usually isn’t the need to link to all the relevant posts. After all the readers are generally a filtered group of people. Also a person will often use one post to answer a whole bunch of other posts, especially inside a closed community, where everybody reads everything.

Yet the hierarchical way in which posts are displayed in a discussion group is really useful. You can perceive in an instant how many people answered, what the threads departing from that post were, etc.

Now look at a post in the blogging world. It refers to many other posts. It explicitly links to them. And if it is successful it will itself have many posts linking to it. Now forget for a moment about the upward links. Each post has posts that link to it. In a sense they are replies to it. The links to those posts are saved in the trackback list. And each of those posts will itself have certain posts that refer to it.

Are you starting to see it?
Each post is in a sense the root of a tree, whose branches are the posts that refer to it, and whose sub-branches are all the posts that refer to the branch posts. In a sense nothing new. But now, if you see your posts in this way, you may also wish to display not just the immediate trackbacks, the posts that refer to your posts, but their trackbacks too.

And here is the first part of my idea. Since each post is available in feed format, it should be possible to fetch, for each post, not just the trackbacks, but the trackbacks’ trackbacks. The posts that refer to the posts that refer to your post. Which means seeing the tree starting from your post up to depth 2. And in theory it should be possible to iterate the process, and go deeper and deeper.

Why is this important? Well, when you read a discussion group, it is often useful to see the hierarchical view.

Example
Title of the post 0:
BLAH
Content of the post 0:
blah, blah, blah, blah,
blah, blah, blah, blah,
blah, blah, blah, blah,

blah
Blah.
-Trackback 1
--Trackback to the trackback 1
--Second trackback to the trackback 1
-Trackback 2
-Trackback 3
--Trackback to the trackback 3
---Trackback to the trackback to the trackback 3
-Trackback 4
… and so on.

It might seem an expensive search, but when we read a post, and it has a certain number of trackbacks, it is quite important to see which of those led to other posts and which didn’t.
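As a rough illustration of the first part of the idea, here is a sketch in Python that prints such a tree down to a chosen depth, one dash per level. It assumes the trackbacks have already been collected into a dictionary mapping each post to the posts that trackback to it; fetching them from feeds is left out, and the names are hypothetical.

```python
def render_trackback_tree(post, trackbacks, max_depth):
    """Return the post followed by its trackbacks, indented one dash per level."""
    lines = [post]

    def walk(current, depth, path):
        if depth > max_depth:
            return
        for reply in trackbacks.get(current, []):
            if reply in path:
                continue  # two posts linking to each other would loop forever
            lines.append("-" * depth + reply)
            walk(reply, depth + 1, path | {reply})

    walk(post, 1, {post})
    return lines


# Made-up sample data: each key lists the posts that sent it a trackback.
sample = {
    "post 0": ["trackback 1", "trackback 2", "trackback 3"],
    "trackback 1": ["reply to 1", "second reply to 1"],
    "trackback 3": ["reply to 3"],
    "reply to 3": ["reply to the reply to 3"],
}
print("\n".join(render_trackback_tree("post 0", sample, 2)))
```

With `max_depth` set to 2 the deepest entry of the sample is cut off, which is exactly the "tree up to depth 2" view; raising the limit goes deeper at the price of more lookups.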

And now we go to the second part of the idea.
In a sense there is no reason why the whole tree view structure should only work one way. I mean, each post links to many other posts. Each of those posts link themselves to other posts. And here we have another tree. This time a tree that goes backward in time.

So I think that for each post it should be possible to see both those views.

  • All the entries that are linked from it, and the entries that are linked to those entries, up to a specific depth.
  • All the entries that link to it, and the entries that link to those entries, up to a specific depth.
  • And maybe combine the two views, having the former entries, in the format of one entry per line, above it. The latter, again in the format of one entry per line, below.
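The two views are really the same walk over two graphs: the links as written, and the same links reversed. A minimal sketch, assuming the links are available as a dictionary from each entry to the entries it links to (all names here are hypothetical):

```python
from collections import defaultdict


def invert(links):
    """Turn 'A links to B' into 'B is linked from A'."""
    incoming = defaultdict(list)
    for source, targets in links.items():
        for target in targets:
            incoming[target].append(source)
    return dict(incoming)


def reachable(start, graph, max_depth):
    """Breadth-first walk: list of (depth, entry) pairs up to max_depth."""
    result = []
    seen = {start}
    frontier = [start]
    for depth in range(1, max_depth + 1):
        next_frontier = []
        for entry in frontier:
            for neighbour in graph.get(entry, []):
                if neighbour not in seen:
                    seen.add(neighbour)
                    result.append((depth, neighbour))
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return result


def both_views(entry, links, max_depth):
    """Entries linked from this one (back in time), and entries linking to it."""
    linked_from_it = reachable(entry, links, max_depth)
    linking_to_it = reachable(entry, invert(links), max_depth)
    return linked_from_it, linking_to_it


links = {"c": ["b"], "b": ["a"], "d": ["b"], "e": ["c"]}
print(both_views("b", links, 2))
```

This version lists each entry once, at the shallowest depth where it appears, which keeps the one-entry-per-line format short; a true tree display would repeat an entry under every parent.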

I think this view would greatly increase the ability to see the local structure of the blogosphere. Of course the brothers of a particular entry (the entries that share the same parents) should also be available on the side. As well as the entries that are generally linked from the same offspring. But this is making it unnecessarily complicated. So let’s forget it for the time being.

So, we have reached the conclusion that each post uniquely defines two trees of other posts. The tree generated by it, and the tree that generates it. And I claim that we should work to be able to visualize those trees.

Doing it on Tagsurf
So, where did the idea come to me? Essentially while working on tagsurf. Because, you see, tagsurf is maybe the first place where it would be really easy to visualize all this. You have many posts. There is the possibility (although I am not sure if it works right now) to send trackbacks from post to post. So each post does not need to have only one parent, but many. Many. It is true that, as it is now, trackbacks are not used inside the system. The reply is a different thing from the trackback. And each post only belongs to one thread, which started with the first post that was not written as a reply to something. So there are quite a few changes to be made to let this vision ground in that system. But it is possible, and comparatively easier to do than in the blogosphere at large.

Those are the changes that I see have to be made to make it possible:

  • Make sure that it is possible to send trackbacks between different posts.
  • Organize all the replies so that they also send a trackback
  • Make sure that each time a post A sends a trackback to another post B, this is also stored inside A
  • Add a view down in time page, that from each post gives you that post, and all the posts that reply (that is trackback) to that post, and so on
  • Hack this page so that the posts appear in a hierarchical way, where it is very clear who is answering what. Generally the way in which livejournal handles comments is a good way
  • Since you stored all the trackback in both directions, organize a page view up in time, that from that post shows you all the posts that entry was answering to. And since they were themselves sending trackback to other posts, add those other posts as subbranches.
  • Make it very easy, given a certain post to use those two views, and try taking away the usual thread view. All the information should still be there.

Once the idea is in place you can then cross it with the idea of the tag. You could, for example, investigate one tagsurf entry (blog entry) and one tag. Then only the entries that contain that tag will appear in the two trees. And if an entry does not have that tag, then all its subbranches would be excluded, even if they have the tag. (Thanks Andy for this idea)
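The pruning rule is easy to state in code. The sketch below assumes two dictionaries, one from each entry to its replies and one from each entry to its set of tags; both structures and the function name are made up for the example.

```python
def filtered_tree(root, children, tags, tag, max_depth):
    """List the descendants of root that carry the tag, one dash per level.

    An entry without the tag is skipped together with everything below it,
    even if some of its descendants do carry the tag.
    """
    lines = []

    def walk(current, depth, path):
        if depth > max_depth:
            return
        for child in children.get(current, []):
            if child in path:
                continue  # guard against entries that link in a circle
            if tag not in tags.get(child, set()):
                continue  # prune this entry and its whole subbranch
            lines.append("-" * depth + child)
            walk(child, depth + 1, path | {child})

    walk(root, 1, {root})
    return lines


children = {"root": ["a", "b"], "a": ["a1"], "b": ["b1"]}
tags = {"a": {"tao"}, "a1": {"tao"}, "b": {"go"}, "b1": {"tao"}}
print(filtered_tree("root", children, tags, "tao", 3))
```

In the sample, "b1" carries the tag but never shows up, because its parent "b" does not.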

Doing it on Technorati
Another one that has all the information to generate those views would be Technorati. Of course I would rather see it done in a decentralised way. But it would be so easy for them to do it, while doing it in a decentralised way might be such a nightmare, that I am absolutely hopeful that they might get there first. Think about it. A Technorati page: investigate the local structure of the blogosphere. You pass a url to this page, and the said structure appears. Up to depth… say 3.

Update: BN (in the comments) points to BlogPulse’s Conversation Tracker as a limited solution to what I was suggesting. It still has many limits, but it is surely a step in the right direction. Besides, it is good to be reminded that Technorati isn’t the only service observing the blogosphere.

technorati tag & rss

Rss is somehow one of the best ideas. You can have your content, stripped of form BS being redirected all around. This gives a one to many structure. Now we need the opposite. We need to be able to pull the content from many sites in the same place, and check it. A many to one structure.
Most of you will say, “But we already have that, it’s called an aggregator. Just look at bloglines.”

Yes, and no, that’s part of it, but it’s not the whole story. We need to have a page that posts all the content from everywhere in a single page.

And again I can hear: “but we have that too: it’s called a technorati tag“.

Again I will repeat: Yes, and no, that’s part of it, but it’s not the whole story. We need to pull the information from the technorati pages to our aggregator.

This is the idea: we need an rss feed of a technorati tag. Just as we can get the rss feed of a del.icio.us tag, we need to have it for all the blogs. The time has passed for adding to your friend list ALL the blogs that might have information of interest. We need to be able to add that rss to our bloglines.

So, either technorati will start releasing the rss, or I predict that:

  • a) other services will start competing with technorati offering that info
  • b) anonymous hackers will start scraping the info from technorati to offer the very valuable information.
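For what it is worth, the many-to-one structure asked for above is not hard to picture. Here is a minimal sketch in Python: entries from many blogs go in, one feed for a single tag comes out, newest first. The `Entry` class and the sample data are invented for the example, and no real service or feed format is involved.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Entry:
    blog: str
    title: str
    published: datetime
    tags: frozenset


def tag_feed(blogs, tag, limit=20):
    """Merge the entries of many blogs into one list for a single tag."""
    matching = [
        entry for entries in blogs for entry in entries if tag in entry.tags
    ]
    matching.sort(key=lambda entry: entry.published, reverse=True)
    return matching[:limit]


# Invented sample entries, only to show the shape of the data.
blogs = [
    [Entry("blog A", "Tags and feeds", datetime(2005, 1, 20), frozenset({"tags", "rss"}))],
    [
        Entry("blog B", "Cooking", datetime(2005, 1, 21), frozenset({"food"})),
        Entry("blog B", "More on tags", datetime(2005, 1, 22), frozenset({"tags"})),
    ],
]
for entry in tag_feed(blogs, "tags"):
    print(entry.published.date(), entry.blog, entry.title)
```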

See also:semanticweb, tags

Tagsurf first review

This is going to be big. It’s called tagsurf. When we were setting up the taoist discussion board, at Tao Bums, I was looking for a board that permitted me to tag individual messages with different tags. The reason is that over there we are now a group of friendly people and every thread starts with a topic, but often touches many separate ones. The board had to be in PHP for reasons known only to the web master, but that we were all happy to follow. So we started looking around, but no board with a tagging facility came up. Nothing. I had to admit that the idea was quite new, and I had not seen any such board around in any case. And then we decided on phpBB, which being open source would have new versions with any new cool geeky thing appearing every so often. Well. Now I have finally found the first tag based discussion board. It’s called tagsurf. And it is very cool. You get to write messages and tag them. As a tag you can use any word of any size. Now, the result of this is that you can tag things with the url of something. So immediately a series of utilities started appearing:
People (first one I saw doing it was Russell Beattie) added a tagsurf button. In short if you click on that button you get all the comments on tagsurf that uses your permalink as a tag. In a sense it is outsourcing the discussion board.
Yes, I added it too, it is down near the little technorati bubble, and I just needed to add:
<a href="http://tagsurf.com/post?tag=<?php the_permalink() ?>">Tagsurf this</a>

in the template.

I also went back to see how tagsurf was behaving on del.icio.us. It seems that, as often happens in other cases, the meme is 6 days old. At the beginning few people noticed it, and now it is starting to explode. I too found out because of the delicious discussion board, which I would suggest to anybody who is interested in delicious OR folksonomy.

I think this tagsurf will and can have great impact. They already have some API defined.

I also had a look at their privacy policy. It seemed simple and clear. Yet now I cannot find it anymore. I suspect that they might be working on it right now.

I also made a small bookmarklet to post an entry on tagsurf about a specific page. Just drag the word bookmarklet onto the bar and it should work. Of course for it to work you have to be logged in to tagsurf.
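All the bookmarklet has to do is build a link in the same form as the template snippet earlier in this post, with the address of the current page as the tag. A sketch of that step in Python (the bookmarklet itself runs in the browser, so this is only an illustration); percent-encoding the page address is my own assumption about what the service accepts, and `tagsurf_link` is a made-up name.

```python
from urllib.parse import quote

# Endpoint taken from the template link shown earlier in this post.
TAGSURF_POST = "http://tagsurf.com/post?tag="


def tagsurf_link(page_url):
    """Build the posting link that uses the page address as the tag."""
    # safe="" also encodes ':' and '/', so the whole address stays one value.
    return TAGSURF_POST + quote(page_url, safe="")


print(tagsurf_link("http://example.org/some post?id=1"))
```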

    Great points:
  • trackback: every post is an entry point for trackbacks. In other words anything you say can receive trackbacks from anything else. You say something here, and it gets people in the blogosphere chatting. And you can follow their conversation. This is something very important that was missing in all the bulletin boards I have been using. In a sense many discussion boards are only looking in. This one is also looking out.
  • trackback 2: Every post that you make can send trackbacks to anything you want. The software to do this automagically for the other posts inside tagsurf is still missing, but I can’t imagine it not appearing very soon.
  • possibility to mix different threads: since each post gets as many tags as the poster wants, it is quite easy for people to join different threads of discussion.
    Problems I might see coming.
  • Spam, spam, spam: I recieve about 30 spam trackbacks a day. And they get filtered by cool programs and finally deleted by me. Yet those programs need me to make the final judgement. Who will make the judgement for all the trackbacks in all those posts? Will the user have to? Can someone close the trackback from his own posts? I see many problem and much discussion over here.
  • copyright: This is another big one. Let’s say that I post a cool entry in tagsurf, who gets the copyright of it? It might be important. Imagine that someone takes it, and wants to add some extra tags. But adding tags is not allowed at the moment. So he copies the post and just reposts it with the extra tags. Do I have a say on it?

All together I think this is a wonderful piece of new technology. When Technorati started its tag pages I wasn’t very impressed, but this, I think, will have some huge effects. And still I can’t see all the implications.

ADDENDUM: just as I finished this post I read in full the great and very interesting post by Russell Beattie. And I found that he had made exactly the same bookmarklet. Oops. Well, I hope he will not sue me, I haven’t copied his code. I just reinvented the wheel.

ADDENDUM to the ADDENDUM: As I was looking at all the people who were commenting on the thread on Russell’s post I noticed another post with the same bookmarklet. And I thought I was the first ;) . At least I get to see if the trackback to posts over there actually works.

ADDENDUM to the ADDENDUM to the ADDENDUM: trackback does not seem to work, or the comment is being held back for security reasons.

Third Map Maker Debugging Session

Did some more debugging. Now any Unicode the user put in the tags should be OK. Still, there is a big brick wall in terms of memory usage, and some users are not having any luck simply because their map takes so many resources that it goes beyond the ISP limit. I could work hard and distribute the whole calculation so that all variables are stored on disk and the memory limit is never hit but, honestly, it is not my top priority at the moment. I am here to help those users run the program on their own machine, and eventually we might solve that problem too. So, what are my top priorities?

  • Add an RSS feed. I would like a feed that gets updated every time a new map is done. It wouldn’t just give the name but all sorts of data, like the list of the Main Tags, so users could see if they might be interested in checking the new buddy’s map.
  • Insert a way for users to delete their own maps. If I am going to go into the hosting business, I am not going to be one of those hosts where you can add info but cannot delete it. I am aware that the users’ info is ultimately adding value to my site, and as such I want users to be happy to have their map here. Not forced.
  • Insert a general log of all the maps that are being started and ended. Right now such a log is absent; there are about 200 maps completed, and more than twice as many have been started. So about 300 have been dropped. I bet many of those users would have success if they tried right now, after those 3 debugging sessions. Still, I want something that tells me: Warning warning warning, map dropped. Bug? OutOfMemoryError?
  • Add the number of posts inside a tag. Just obvious.
  • Probably add some of the MainTags as keywords to each single map. The problem is: which? All is too much. All the ones that contain more than x posts and y subtags is not flexible enough. The solution should be: if a MainTag is part of a ParetoFront of Delicious then the keyword should be there. The fact that this means writing a whole program that stores the latest ParetoFront in a database is just a small detail ;) . And before you ask: no, I will not need anybody’s password to do that, and the data will all be public.
  • Add a bookmarklet to save a map in your own delicious, with the keywords as tags.
  • Change the map so that it can run on a single tag. Useful for big complex maps like mine, and others.
  • Make it change the Title of the Map Page to show the owner of the map. Useful if people want to add the maps to their delicious pages.

And then there are some tests I would like to make, like:

  • Check if it would make sense to show all the tags that appear with a single tag, and not the subtags.

Is there more? If you can think of other modifications, please drop a line in the comment section. Also, if you tried to run the map maker and it is not giving you satisfaction, let me know. I’ll whip it appropriately. HarHarHar. (I’ve always wanted to say that!)

A house divided

As the price of houses rises, more and more people find that the best solution is to divide a house among friends. Usually each person gets a room. The problem then is: who gets which room, and how much should each person pay? Usually the total rent is fixed, and usually the rooms are not all exactly the same. Some might be bigger, some smaller. Some might have a better view, more privacy, closeness to the toilet, more silence, and so on. And what’s also important is that different people might value the various elements in different ways.

I present here two ways of splitting the rent and dividing a house. I personally favour (and have designed) the second, but while I was presenting this method to some friends to get some feedback, I was told about the other; it seemed simpler, yet interesting enough to add it. They both assume that:
a) the rent is fixed,
b) there is no favouritism among the housemates-to-be about who gets to choose first.

The ‘find the objective value first’ method.

Before the rooms are assigned, get together and agree on the objective value of each room (e.g. 20% of the rent for this one, 50% of the rent for that one). The values must of course add up to the whole rent. Then randomly select who gets which room (at the agreed price), and as a final step people are allowed to exchange rooms if they want to.
Positive element: it is simple and quite straightforward.
Negative element: it assumes that people can easily agree on the actual relative value of the rooms, and that this value does not change from person to person.

The ‘each person gets the best room’ method.

As I said, this is the method that I love most. First of all, let each person inspect all the rooms. Then each person secretly writes the relative value of each room on a piece of paper. The sum of the values must be equal to the requested rent. The idea is to divide the house so that each person gets a room and pays for that room the value THEY wrote on the piece of paper, while the sum of the values paid by all the people fully covers the requested rent.

Obviously, very often, the collected money would then be higher than the rent. Let’s call the collected money minus the monthly rent, the ‘extra money’.

Often there is more than one solution that leaves some extra money each month. When this happens, the solution that maximizes the extra money is chosen. The extra money is then used to pay for the electricity, any extra expenses, or whatever is needed for the house.

Sometimes there is more than one optimal solution: several solutions generate the same extra money, everybody pays the value they wrote for their room, and all other solutions collect less. In that case the adopted solution will be one of the optimal ones, randomly chosen.

Examples, examples:
Let’s suppose we have a house with 3 rooms (a, b, and c) and 3 persons (A, B, and C). Let’s suppose the total rent is 100.

Person A might find the three rooms equivalent, so he might just write (a: 33.3, b: 33.3, c: 33.3). Person B might instead favour room ‘b’, because it is sunnier and she likes to paint, and she thinks that room ‘a’ is slightly better than room ‘c’; in fact she would prefer not to be in room ‘c’ at all, so she would write: (a: 35, b: 40, c: 25). Person C instead does not care about the sun, but has noticed that room ‘a’ has more privacy, plus it is near the toilet, and since he likes to have his gf as a guest, he thinks that having room ‘a’ would be a better deal. So he votes (a: 40, b: 30, c: 30).

Then the papers are revealed.

Generally, when one person values a room more than everybody else does, and values that room more than all the other rooms, that room is taken by that person at the price he has chosen.

In our example we have:
A: (a: 33.3, b: 33.3, c: 33.3)
B: (a: 35, b: 40, c: 25)
C: (a: 40, b: 30, c: 30)
which would give us that A gets room ‘c’ paying one third of the rent, B gets room ‘b’ paying 40% of the rent, and C gets room ‘a’ for 40% of the rent. The money collected each month would be 33.3+40+40=113.3. The extra money would be 113.3-100=13.3 and would be used to pay for the electricity, water, gas, or whatever.
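
A minimal Python sketch of this second method, assuming a brute-force search over every way of handing out the rooms (fine for a handful of housemates). The function name best_assignments is made up for illustration. It keeps every assignment that collects the most money, so a tie can be settled by a random draw as described above.

```python
import random
from itertools import permutations

def best_assignments(bids, rooms):
    """Return (largest collected total, list of assignments reaching it).

    bids maps each person to {room: value written on their paper}.
    An assignment is a dict {person: room}.
    """
    people = list(bids)
    best_total, best = None, []
    for perm in permutations(rooms, len(people)):
        total = sum(bids[p][r] for p, r in zip(people, perm))
        if best_total is None or total > best_total + 1e-9:
            # Strictly better: forget the previous candidates.
            best_total, best = total, [dict(zip(people, perm))]
        elif abs(total - best_total) <= 1e-9:
            # Same total: remember it as an equally good option.
            best.append(dict(zip(people, perm)))
    return best_total, best

bids = {
    "A": {"a": 33.3, "b": 33.3, "c": 33.3},
    "B": {"a": 35, "b": 40, "c": 25},
    "C": {"a": 40, "b": 30, "c": 30},
}
total, options = best_assignments(bids, ["a", "b", "c"])
chosen = random.choice(options)   # random draw among the optimal solutions
extra_money = total - 100         # what is left after paying the rent
```

With the papers above there is a single optimal solution (A in ‘c’, B in ‘b’, C in ‘a’), so the random draw has nothing to decide.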

It is also possible to renormalise the prices, lowering them so that the total becomes exactly the cost of the rent while the relative ratios remain the same. In our example
A: (33.3/113.3)*100=29.4
B: (40/113.3)*100=35.3
C: (40/113.3)*100=35.3
and person A would pay 29.4% of the rent (since he took the room nobody wanted),
person B would pay 35.3% of the rent (and took the sunny room),
person C would pay 35.3% of the rent (and took the room with more privacy).
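
The same renormalisation as a small Python sketch (the function name renormalise is just an illustrative choice): each payment is scaled by rent / collected money, so the ratios between the payments stay the same.

```python
def renormalise(paid, rent):
    """Scale the payments so that they add up to exactly the rent."""
    collected = sum(paid.values())
    return {person: value / collected * rent for person, value in paid.items()}

paid = {"A": 33.3, "B": 40, "C": 40}
shares = renormalise(paid, 100)
print({person: round(value, 1) for person, value in shares.items()})
# prints {'A': 29.4, 'B': 35.3, 'C': 35.3}
```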

So, what if the situation is not that easy, and there isn’t one person who prefers each room? For example you could be in a situation like:
A: (a: 45, b: 45, c: 10)
B: (a: 40, b: 40, c: 20)
C: (a: 40, b: 30, c: 30)
Well, in this case it is obvious that person A will get either room a or room b, and that room c will go to person C. So C gets c at 30% of the rent. Both A and B value rooms a and b equally. But once the rooms are assigned person A will pay more than person B, so it seems fair to me that person A chooses between a and b and pays 45, and person B gets the remaining room but pays less (40).

But things can get even more complicated if some people value some rooms exactly the same:
A: (a: 45, b: 45, c: 10)
B: (a: 45, b: 45, c: 10)
C: (a: 40, b: 40, c: 20)
in which case A and B obviously have to choose randomly who gets what.

Or if the situation is symmetric among the rooms:
A: (a: 40, b: 30, c: 40)
B: (a: 40, b: 40, c: 30)
C: (a: 30, b: 40, c: 40)
In which case you randomly choose whether A gets a or c, and then the others follow obviously.

So here we have the first method again: when everybody chooses the values together, it is equivalent to the second method in the case where everybody agrees on the relative values:
A: (a: 35, b: 40, c: 25)
B: (a: 35, b: 40, c: 25)
C: (a: 35, b: 40, c: 25)
After which, also in this method, you would randomly pick who gets which room.

Please let me know if you have tried it and if it was successful.

wikipedia fast search

I added an extra bookmarklet. I was in this room with 25 great minds discussing molecular dynamics in quantum fields. I couldn’t understand an iota. Luckily, talks are now given in places with a wifi connection. So, to try to get up to speed with what was going on, I wrote a small Wikipedia fast search.

Now I could just type “w molecule” in the link bar and the browser would automatically go to http://en.wikipedia.org/wiki/Molecule.

So how do you do it?
Just copy the link into your bookmarks. Put it in the “Quick Search” folder, then edit the properties and add the keyword “w”. And voilà.
Following a talk with it is much faster.
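
To make the mechanism concrete, here is a rough Python sketch of what the browser does with a keyword bookmark: it takes what you typed after the keyword and drops it into the bookmark’s address in place of a %s placeholder. The template address and the capitalisation of the first letter are assumptions made for this sketch, and the names KEYWORDS and expand are invented.

```python
from urllib.parse import quote

# Hypothetical keyword table: keyword -> bookmark address with a %s placeholder.
KEYWORDS = {"w": "http://en.wikipedia.org/wiki/%s"}

def expand(typed):
    """Turn 'w molecule' into the address the browser would open."""
    keyword, _, query = typed.partition(" ")
    template = KEYWORDS.get(keyword)
    if template is None or not query.strip():
        return None  # not a known keyword, or nothing to search for
    title = query.strip().replace(" ", "_")
    title = title[:1].upper() + title[1:]  # assumed: article titles start uppercase
    return template.replace("%s", quote(title))

print(expand("w molecule"))  # prints http://en.wikipedia.org/wiki/Molecule
```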

More than del.icio.us: org.asm.ic

You know, maybe because my father has been a journalist for so many years, I have always been raised to appreciate the complexity of life. And yet I still can’t understand why we need all this complex machinery.

I totally agree on the importance of the semantic web. And, boy, am I thrilled by the possibility that we might be generating the internet operating system. I am also aware of the cutting-edge problem of who owns your data.

But what I just can’t get is why we need to turn things that can actually be quite simple into this amazing complexity. I might not be getting the whole picture, and I admit ignorance, above stupidity. But still, why do we need to build this whole house all at once? For example, del.icio.us has been amazing and trivial at the same time. And amazing also because it was so trivial.

Now let’s expand the concept:
Instead of storing one single link let’s store two links, and a set of tags in the middle. Two links with their two titles and maybe their two descriptions. And one set of tags between them.

And people will naturally start using interesting tags.

Like:
‘explains’, ‘terrorises’, ‘defines’, ‘is’, ‘IsTerrorisedBy’, ‘embeds’, ‘uses’, …

It will also be fun.

And then get a page for all the links that use a certain URI as their first link (…/subj/…), and another page for those that use a certain URI as their second link (…/obj/…).

Then you can use delicious to store those pages.

The bookmarks manager was del.icio.us? I can assure you that this will be at least org.asm.ic! And it will not cost the programmer more than 200 lines of code. LAMP, PHP, MySQL, keep it simple. And we will all use it.
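
Just to show how little is needed, here is a minimal in-memory sketch in Python (not the PHP and MySQL version suggested above). The class name LinkStore, its method names and the example.org addresses are all made up for illustration; titles and descriptions are left out to keep it short.

```python
from collections import defaultdict

class LinkStore:
    """Stores (first link, set of tags, second link) entries."""

    def __init__(self):
        self.by_subject = defaultdict(list)  # first link -> entries
        self.by_object = defaultdict(list)   # second link -> entries

    def add(self, subject, tags, obj):
        entry = (subject, frozenset(tags), obj)
        self.by_subject[subject].append(entry)
        self.by_object[obj].append(entry)
        return entry

    def subj(self, uri):
        """Everything a .../subj/... page for this URI would list."""
        return list(self.by_subject.get(uri, []))

    def obj(self, uri):
        """Everything a .../obj/... page for this URI would list."""
        return list(self.by_object.get(uri, []))

store = LinkStore()
store.add("http://example.org/tutorial", {"explains"}, "http://example.org/rdf")
store.add("http://example.org/faq", {"defines", "uses"}, "http://example.org/rdf")
about_rdf = store.obj("http://example.org/rdf")  # both entries point at this URI
```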

Partial translations

Have you ever tried Google’s translation service? I know, if you did you wish you hadn’t, unless you were bored and looking for ways to amuse yourself. But you know, translating text is a really daunting task. Generations of PhDs have been spent advancing the state of the art just a little bit every time. I know what I am talking about: I lived with some of them in COGS, at Sussex University. I remember reading somewhere that new, better automatic translators will soon be available. Good! We are waiting for them.

In the meantime…

I had this idea:
Have you ever tried to translate a page from a language you don’t know… quite well, but are not totally ignorant of either? Something in between. Here in Europe it is quite common. And the same is true when I read posts in Portuguese, or in American, from people on the other side of the ocean.

Yes, I can try to use Google’s translation mechanism, but it doesn’t give me something easy to chew. Look at this post, for example:

Depois do high vem o low. É uma lei do universo.
E no low todo mundo é feio e o mundo é triste e é tudo um saco.

E eu já nem sei o que me move.
From here

Google translates it as:

es low.? a law of the universe. E in low everybody? ugly and the world? sad e? everything a bag.

E I j? nor I know what it moves me.

From my darling Alenahra.

I suppose a better translation would be:

After a high comes a low. It is a law of the universe. And in a low everybody is ugly and the world is sad and everything is empty.

And I still don’t know what it is that moves me.

And Ale’ will tell me if I got it right.

My idea is that Google, instead of providing a tentative answer, should provide all the possible translations for each word. Those translated words should appear when we point at a word with the mouse. I know it is a slow way of reading a document, one word at a time, but soon the reader will pick up the most common words and will speed up.

What follows is an example. Move the mouse over the words to see the title appear. I used some simple translations that I could find. Obviously the tool I envision would have to be more professional.

Depois do high vem o low. É uma lei do universo. E no low todo mundo é feio e o mundo é triste e é tudo um saco.

E eu já nem sei o que me move.
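
A rough Python sketch of how such a page could be produced: every word that has an entry in a small glossary is wrapped in a span whose title attribute holds the candidate translations, so the browser shows them on mouse-over. The glossary below is a tiny hand-made sample, and the names GLOSSARY and annotate are invented for the sketch; a real tool would need a proper dictionary.

```python
import re
from html import escape

# Tiny hand-made sample glossary: word -> possible translations.
GLOSSARY = {
    "depois": ["after"],
    "vem": ["comes"],
    "lei": ["law"],
    "universo": ["universe"],
    "mundo": ["world"],
    "triste": ["sad"],
    "feio": ["ugly"],
}

def annotate(text, glossary):
    """Return HTML where known words carry their translations as a title."""
    pieces = []
    # The capturing group keeps both the words and what lies between them.
    for token in re.split(r"(\w+)", text):
        options = glossary.get(token.lower())
        if options:
            title = escape(", ".join(options), quote=True)
            pieces.append('<span title="%s">%s</span>' % (title, escape(token)))
        else:
            pieces.append(escape(token))  # unknown words are left as they are
    return "".join(pieces)

html_line = annotate("É uma lei do universo.", GLOSSARY)
```

Words missing from the glossary simply stay plain, which matches the idea of a partial translation.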

In Italy right now more and more people are getting comfortable with English. If you had come here only 10 years ago, most people would have refused to even try to speak English, even if they had studied it in school. Now, I believe thanks to the internet, people are reading English pages daily, the dictionary often in a corner of the desk, ready to be used. It would be helpful for them to have such a system.

And I would finally learn Portuguese!

Porto Alegre, aspettami! (Porto Alegre, wait for me!)

Special thanks to travlang.com for providing part of the translations.

Clustering Delicious Tags

I kept on programming my favourite Python program: DeliMind.

In short: I made a new release of the DeliMind program. Here is the source code (just remember to change it from a .txt to a .py). Now similar tags are clustered together.

  1. Here is what it looks like.
  2. Here is what the previous version looked like.
  3. The original from Brownhen (may he live long and prosper) used to be here, although now it is missing.

All on the same data. Mine, now.
Go and enjoy.
(Later addition: while the program works well for small databases of links, like mine at the time I wrote this entry, it doesn’t scale well with size, so it crashes for most of the people who try to use it with more than 1000 bookmarks. For this reason I was forced to change the link on the cluster example to a database with fewer nodes.)

Now the technical stuff for those that have a bit more patience.

Tags are not all the same, some are more similar than others. So, for example, the tags “September11” and “GeorgeBush” have more links in common than “GeorgeBush” and “intelligence”. The idea behind this version of DeliMind was to cluster tags that had links in common. The catch is that distance is generally not a transitive property (if I am near to you, and you are near to Jim, I am not necessarily that near to Jim), while clustering is (if you and I are in the same cluster, and you and Jim are in the same cluster, then Jim and I have to be in the same cluster… unless people belong to different clusters, but that’s a complication).

So I started by making a matrix of relations among tags (all_dict). Each tag, with respect to each other tag, could be either:

  1. One contained in the other
  2. Identical
  3. Disjointed
  4. With # bookmarks in common

Then, from the number of links in each of the two tags and the number of links they have in common, I invented a measure of similarity. Let #A be the number of links in tag A, #B the number of links in tag B, and #AB the number of links in common.
Then the relative similarity (SAB) will be:
SAB= sqrt((#AB/#A)*(#AB/#B))

I actually played with various measures:
SAB= ((#AB/#A)+(#AB/#B))/2
SAB= Max(#AB/#A,#AB/#B)
They all went from 0 to 1, and were quite similar… (I am not going to discuss their relative properties.)
But the first one just seemed the one that made the most sense and, in the end, the resulting map was the one closest to my personal intuition of what should be in what cluster.
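
For the curious, the three measures can be written down in a few lines of Python. This is only a sketch and not the code from DeliMind itself: the function name similarities and the dictionary keys are invented, and each tag is assumed to be given as the collection of links it contains.

```python
from math import sqrt

def similarities(links_a, links_b):
    """Compute the three candidate similarity measures between two tags."""
    a, b = set(links_a), set(links_b)
    if not a or not b:
        # An empty tag shares nothing with anything.
        return {"geometric": 0.0, "arithmetic": 0.0, "maximum": 0.0}
    common = len(a & b)          # #AB
    ratio_a = common / len(a)    # #AB/#A
    ratio_b = common / len(b)    # #AB/#B
    return {
        "geometric": sqrt(ratio_a * ratio_b),   # the one finally adopted
        "arithmetic": (ratio_a + ratio_b) / 2,
        "maximum": max(ratio_a, ratio_b),
    }

# Tag A has 4 links, tag B has 9 links, and they share 3 of them.
scores = similarities({1, 2, 3, 4}, {2, 3, 4, 5, 6, 7, 8, 9, 10})
```

In this made-up case the geometric version gives 0.5, the arithmetic one about 0.54 and the maximum 0.75.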

Once the similarity matrix was done I started studying the clusters. In general, for each triplet of tags A, B, C I would modify
SAC:=min (previous SAC, max (SAB, SBC))
And I would continue going through all possible triplets, and then start again from the beginning, until no new changes were happening.

Why? The idea is that the similarity between two tags measures how easy it is to jump from one to the other. Visualise each tag as an island, and then you have an animal that can jump from one island to the other. But it can only jump up to a certain distance. So if it can find a succession of tags between two tags, A and B, where the similarity (the similarity is the inverse of the distance) is always above its jumping ability (that is, the distance is below its jumping ability), then the animal can move from A to B. If not, A and B are in different clusters. Effectively unreachable.

But we don’t know how far our beast can jump. So in this way we end up with a similarity number that says: somewhere between A and B it is possible to find a succession of tags such that the distance is never above x, so SAB is equal to the minimum between the original SAB and x.
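
Here is a small Python sketch of that repeated pass over all triplets. It is written in terms of the jumping distance rather than the similarity, taking the distance as 1 minus the similarity, which is an assumption of this sketch and not necessarily what the program does. The function names symmetric and closure are invented too.

```python
def symmetric(pairs, tags):
    """Build a full distance table; pairs that are not listed are at distance 1."""
    d = {a: {b: (0.0 if a == b else 1.0) for b in tags} for a in tags}
    for (a, b), value in pairs.items():
        d[a][b] = d[b][a] = value
    return d

def closure(dist):
    """Replace each distance by the smallest 'longest jump' over any chain of tags."""
    tags = list(dist)
    d = {a: dict(dist[a]) for a in tags}
    changed = True
    while changed:                      # start again until nothing changes
        changed = False
        for a in tags:
            for b in tags:
                for c in tags:
                    # Going from a to c through b costs the longer of the two jumps.
                    via_b = max(d[a][b], d[b][c])
                    if via_b < d[a][c]:
                        d[a][c] = via_b
                        changed = True
    return d

# Made-up distances: X is close to Y, Y is close to Z, X is far from Z.
table = symmetric({("X", "Y"): 0.2, ("Y", "Z"): 0.3, ("X", "Z"): 0.9}, ["X", "Y", "Z"])
closed = closure(table)
# An animal that can jump 0.3 now gets from X to Z through Y.
```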

If it does feel complicated don’t worry. I got confused a few (hundred) times programming it. And just could not understand why those damn tags were not clustering… until I got it right.

So, now you have this nice matrix, only between your main tags (the ones that are not contained in another tag, cf. the previous version), and you (or actually I) need to cluster the tags.

Note also that you don’t need to cluster the tags only one time. Once you have made a clustering (for an animal which can jump d), you can still partition inside each cluster for animals that can jump less than d.
The first time I just asked it to cluster at each possible number. That is, if a number was present, assume that someone was able to jump exactly that distance. In this way I got a heavily clustered map. It was a mess, but a promising mess. I then saw that most of the interesting things were happening between distances of 0.333333 and 0.6666.

That is, it made sense to ask for the clusters generated by putting together tags that had one third of the links in common, and tags that had up to two thirds of the links in common.
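
A sketch of the clustering step in Python, under the assumption that two tags end up in the same cluster whenever a chain of tags joins them and every link of the chain has a similarity of at least the chosen threshold. The function name clusters_at and the similarity values in the example are invented.

```python
def clusters_at(similarity, threshold):
    """Group tags joined by chains of pairs with similarity >= threshold.

    similarity maps (tag, other_tag) pairs to a number between 0 and 1.
    """
    parent = {}

    def find(tag):
        parent.setdefault(tag, tag)
        while parent[tag] != tag:
            parent[tag] = parent[parent[tag]]   # shorten the chain as we go
            tag = parent[tag]
        return tag

    for (a, b), value in similarity.items():
        root_a, root_b = find(a), find(b)       # also registers both tags
        if value >= threshold and root_a != root_b:
            parent[root_a] = root_b             # merge the two clusters

    groups = {}
    for tag in list(parent):
        groups.setdefault(find(tag), []).append(tag)
    return sorted(sorted(group) for group in groups.values())

# Made-up similarities between a few tags.
sim = {
    ("GeorgeBush", "September11"): 0.5,
    ("September11", "politics"): 0.4,
    ("green", "sustainability"): 0.6,
    ("politics", "green"): 0.1,
}
coarse = clusters_at(sim, 1 / 3)   # one third of the links in common
fine = clusters_at(sim, 0.45)      # a stricter threshold splits clusters further
```

Raising the threshold only ever splits existing clusters, which is why the same procedure can be reapplied inside each cluster.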

This is how I got clusters:

  • porno, sex and eros
  • GeorgeBush, September11, politics, economy, historical, terrorism, usa
  • green, sustainability

Example of the Clustered Map
Then I just applied the same process to the subtags of each tag.

Ok, I can be satisfied, I can go and have something to eat.

As always, if you find it useful drop me a line, I appreciate it.

Pietro