Second International Workshop on Open Source Information Retrieval

June 28th, 2006

At the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 6-11 August 2006, Seattle, USA:

“The goal of the Open Source Information Retrieval Workshop (OSIR) is to bring together practitioners developing open source search technologies in the context of a premier information retrieval research conference to share their recent advances, and to coordinate their strategy and research plans. The intent is to foster community-based development, to promote distribution of transparent Web search tools, and to strengthen the interaction with the research community in information retrieval.”

Open Source Information Retrieval


June 20th, 2006

A public or open index such as the one I outline here needs examination regarding censorship.

If each node is capable of deciding whether to censor something or not, then universal censorship is impossible.

Censorship is applied per node. A node cannot control what another node chooses to censor. The idea is that this should give a level of censorship relative to, and possibly representative of, what exists out there in the real world.

Now that’s not really how things are – most of us passively accept (or aren’t aware of) censorship being done on our behalf, and yet in a system of independent and equal nodes, the people behind the nodes need to actively censor.

That’s the same sort of opt-in strategy that gets people pissed off when corporations do it.

Nevertheless, this scheme seems reasonable and democratic, if not justifiable. If each node censors according to the laws and views of the people who maintain it, then no one should be offended, no one should go to jail, and ultimately nothing should be censored – only diminished.

If each node can censor according to its own values, then it can be argued that in sum the Index represents the general attitudes of society, and that things will be more or less as censored as they are in the real world, with nothing completely excluded so long as there is someone who accepts it.

And that’s a problem: if some nodes don’t agree, then objectionable material will ‘leak out’ into the Index.

So if universal censorship is desired, it would be hard or impossible to censor the whole Index.
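To make the per-node idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Node` class, the `blocked_topics` set, and the `availability` function are hypothetical names, not part of any actual index software. It only shows that when each node filters independently, a topic is diminished rather than removed as long as at least one node accepts it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical index node that applies its own censorship policy."""
    name: str
    blocked_topics: set[str] = field(default_factory=set)

    def accepts(self, topic: str) -> bool:
        # Each node decides for itself; no node can overrule another.
        return topic not in self.blocked_topics


def availability(nodes: list[Node], topic: str) -> float:
    """Fraction of nodes that carry a topic (0.0 means fully excluded)."""
    if not nodes:
        return 0.0
    return sum(node.accepts(topic) for node in nodes) / len(nodes)


# Three made-up nodes with different policies; node "c" censors nothing.
nodes = [
    Node("a", {"gambling"}),
    Node("b", {"gambling", "satire"}),
    Node("c"),
]

for topic in ("news", "satire", "gambling"):
    print(topic, round(availability(nodes, topic), 2))
```

In this toy setup "gambling" is carried by only one node in three, so it is diminished but still reachable. It would take every node agreeing to block it to bring its availability to zero.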

An uncensored index empowers the individual, and is not necessarily beneficial to the community. It can threaten safety and public standards. Do individuals really have a higher claim to information?

P.S. I’d like to have quoted Jefferson with “Information is the currency of democracy,” but the Jefferson Library says they can’t attribute it to him. So instead I’ll give you “Each man is perfectly free in that which does not harm others” (Rousseau quoting the Marquis d’Argenson in The Social Contract).

Artificial Intelligence and the Web

June 14th, 2006

Part of The Twenty-First National Conference on Artificial Intelligence:

“The special track on “AI and the Web” features technical papers on the use of AI techniques, systems, and concepts involving the Web. The emphasis is on papers in two active research areas: (1) using text and language analysis to interpret and understand natural language text found on the web and (2) developing and exploiting “Semantic Web” languages and systems that explicitly encode knowledge using languages such as RDF and OWL. Innovative papers in other areas describing research involving both AI and the Web are included.”

AAAI-06 Artificial Intelligence and the Web Special Technical Track

OpenIndex Site Upgrade

April 7th, 2006

I’ve moved the site entirely into WordPress CMS, so you may find some anomalies here and there.

The RSS feeds should work (let me know!).

There is no new content – this is simply a structural change, but at least there is now a consistent look throughout.

Everything’s CSS (style sheets) now. Everybody hates tables for layout. It’s “Boo tables” and “Hiss tables”. Well, let me tell you, tables could make the sidebar drop right down to the footer. With CSS you have to resort to all sorts of tricks, but the fact is, CSS can’t do it. This is progress? I hope somebody’s happy, because I never had anything against tables. Tables rock!

Vernor Vinge and The Coming Technological Singularity

March 7th, 2006

Sometimes I make dark allusions to an Internet index being the germ of a sentient artificial being which will grow to enslave us all. I’m an admittedly pessimistic person, but I think it’s almost inevitable. As a proponent of an open index, I’m a shill for our future overlords, but if I’m going to get to select who becomes my master, I’d prefer a benign one dedicated to public service. The problem with a distributed index is that it has no obvious ‘off’ switch.

In 1993, science fiction author and mathematician Vernor Vinge wrote The Coming Technological Singularity.

‘Technological singularity’ refers to that point in time, perhaps in the next 20 years, when the technology we have invented becomes smart enough to surpass us in creation. At that point, we become obsolete.

I really do see a comprehensive index as being a significant component of a superhuman intelligence. After all, it is the point of access to human knowledge and experience; a magnificent database. Usually, I feel a little shiver of dread – I believe that an evolved intelligence owes us no more allegiance than we owe our evolutionary ancestors and companions.
