Archive for the 'General' Category

Wikiasari forum

Thursday, February 8th, 2007

Jimmy Wales of Wikipedia fame wants to start up an open source, for-profit search engine based on Nutch to compete with Google.

There’s a community forum available:

Forum:Index - search

Common Spidey Sense

Tuesday, December 5th, 2006

We’ve come to accept over time that spiders visit this site as much as humans. Judging by the number of spiders that seem to live here, you’d think it was a cave.

We recognise the majority of the spiders, and we appreciate the attention. Google and Yahoo! come here all the time. It makes us proud (hi guys!).

But along with those mechanical spiders, we also get visits from a variety of baby-bots and wanna-bots who rummage through the site heavily for a while and move on, and others who strike repeatedly. They don’t say who they are, or what they are doing.

Many of them are right not to advertise, because they’d get banned right off - those include spam e-mail harvesters from Brazil, and self-appointed cyber-cops like Cyveillance, who come to sniff around to see if we might be offending them.

Those we ban as quickly as we can - we don’t like spammers or bullies (bad bullybot!).

But we also get bots run by regular, decent folk who just want to keep up on what’s going on here.

The problem is that some do it hourly for months on end. If you are a regular reader of this site, you know one thing for sure - they are wasting their time, and our bandwidth, because this site only gets updated about once every three months.

There are web site managers who jealously guard their web sites, and who go through their stats looking for abusers. There are discussions and exchanges about who certain IPs are, and what they are up to - so bad bots can get a reputation.

A bot might be banned if it sticks its head up in the stats for things like bandwidth consumption, number of hits, frequency of visits, and so on.
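For the curious, here is a rough sketch of how a site manager might pick those heads out of the stats. It is not how we do it here, just an illustration: the record format (pairs of IP and bytes sent), the function name, and both thresholds are made up for the example.

```python
from collections import defaultdict

# Hypothetical thresholds; a real site would tune these to its own traffic.
MAX_HITS = 500
MAX_BYTES = 50_000_000


def flag_heavy_clients(records, max_hits=MAX_HITS, max_bytes=MAX_BYTES):
    """Return a sorted list of client IPs that exceed either threshold.

    `records` is an iterable of (ip, bytes_sent) tuples, an assumed
    format standing in for whatever the server log actually provides.
    """
    hits = defaultdict(int)   # number of requests per IP
    sent = defaultdict(int)   # total bytes served per IP
    for ip, nbytes in records:
        hits[ip] += 1
        sent[ip] += nbytes
    return sorted(
        ip for ip in hits
        if hits[ip] > max_hits or sent[ip] > max_bytes
    )
```

Frequency of visits could be handled the same way by counting hits per hour instead of per log file.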

A bot will especially attract notice if it doesn’t respect robots.txt, doesn’t introduce itself, falsifies information, comes from a bad neighbourhood, has a bad reputation, or drags a site down to a crawl.
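Respecting robots.txt is not hard, for what it's worth. Below is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt contents, the bot name "ExampleBot", and the example.com URLs are all invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# An invented robots.txt, standing in for whatever a site really publishes.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt rules let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite bot checks before every fetch and backs off when told no.
print(allowed("ExampleBot", "http://example.com/index.html"))
print(allowed("ExampleBot", "http://example.com/private/notes.html"))
```

A real bot would download the site's own robots.txt rather than use a hard-coded string, and would introduce itself with an honest User-Agent.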

Bot banning could become a bigger issue in the future as more and more bots are unleashed, and the Internet becomes clogged with spiders, pre-fetchers, harvesters, comment spammers, scrapers, and other critters.

It’s possible that there will come a time when bots are automatically banned at first sight, and the sub-über bot (come on, say it out loud) will need to beg for an invitation.

Predictions for a Web 2.0 social experience

Saturday, November 18th, 2006

Right on, brother!

“The next killer app isn’t an app.

It will be a new networking platform that builds on today’s world-wide web and makes possible new generations of more powerful and useful applications.”

What distributed open source search lacks in storage space and speed, it can make up for in processing power. What to do… what to do….

Predictions for a Web 2.0 social experience

Is Google working on an artificial intelligence?

Saturday, November 18th, 2006

You betcha they are.

Google Mind

The Singularity Institute for Artificial Intelligence

Saturday, November 18th, 2006

I’ve gone on before about Vernor Vinge’s singularity - the point where artificial intelligence takes over, and leaves us in the dust - but we need to hear more.

The Singularity Institute wants to make sure it doesn’t kill us and eat us.

The Singularity Institute for Artificial Intelligence