Tim Berners-Lee’s semantic web

The man who invented the WWW wants to give it a boost by making it easier for machines to understand what’s what. The ‘semantic web’ is a set of mark-up standards that build on XML to enable machines to identify and ‘understand’ relationships between web-based data and resources.

The idea has been in development for many years – it isn’t exactly news – and it hurts my head, but it merits a look.

Following are four links that may help explain the concept of the ‘semantic web’:

San Francisco Chronicle: Horizons expand for inventor of Web: Berners-Lee’s codes could revolutionize Internet searches

Nature: Scientific publishing on the ‘semantic web’

Business Week: The Next Web

Science & Technology at Scientific American.com: The Semantic Web — A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities

4 Responses to “Tim Berners-Lee’s semantic web”

  1. Tom Booth Says:

    The concept is good but the structure and implementation are completely unrealistic in my opinion.

    XML: Extensible Markup Language, OWL: Web Ontology Language, RDF: Resource Description Framework, URI: Uniform Resource Identifier and DAML: DARPA Agent Markup Language. And your average Joe posting a web page is supposed to learn and understand all this and add it to his web site? The articles state that Web publishers are supposed to add these coded tags and whatnot to their own websites for this to work.

    It (XML) has some usefulness and is already being used by some giant web portals that can afford a full-time IT staff to work on it and keep it going, but I’m afraid that 99.99% of webmasters will never comprehend it or use it to contribute their pages to the “semantic web” in its current proposed form using “standard”(?) XML.

    It would have to be greatly simplified and standardised so that just about anyone could learn to understand it and use it in no more time than it takes to learn basic HTML. I would hope much less time than that.

    As it is, I’ve been reading about and studying XML off and on for years and I still can’t quite grasp how someone is supposed to use it on their web page in the manner proposed by this “semantic web” idea.

  2. Paul Wilford Says:

    I think there is a ‘coolness’ factor that is not being sufficiently addressed. The semantic web (Internet) is meant to be understood by machines, and that’s cool because….

    …This is a route to AI on the web. It embeds a machine-readable explanation or description of the web into the web itself.

    It isn’t meant for humans to use directly. It needs applications which make use of it. Maybe some of those will reach the average user and his web page.

    The idea of what those applications might be (or become) excites the imagination, and accounts for the coolness factor. That doesn’t seem to be mentioned much, and the omission feels disingenuous.

  3. Tom Booth Says:

    The idea of the “Semantic Web” is certainly way “cool”. But I’m not sure that AI will ever be capable (any time in the near future, anyway) of embedding “a machine-readable explanation or description of the web into the web itself” without human direction.

    That is, the information being “embedded” would most probably, in my opinion, be in the form of metadata resulting from human evaluation of the content of a resource, with humans somehow encoding that evaluation in a “machine readable” form.

    Machines don’t actually “read” and understand anything. They are only capable of matching character strings. It is up to humans to program the computers and tell them what to do once they find a match.

    If “metadata” CONCEPTS were strictly formatted and standardized then computers would be able to recognize “concepts” if those concepts were represented in the form of unique character strings or codes (so they wouldn’t be confused with ordinary language or be otherwise ambiguous).

    The computer could then be programmed to do things that would appear to be quite intelligent but would actually just be following a program based upon matching a character string.
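
    A minimal sketch of that point, in Python. The `DDS:` prefix, the handler functions and the `process` helper are all hypothetical names made up for illustration; nothing here is an existing standard. The program looks "smart" only because a human decided in advance what each exact string should trigger.

```python
# Hypothetical concept codes and handlers, invented for this illustration.
def handle_physics(resource):
    return f"{resource}: file under Physics"


def handle_unknown(resource):
    return f"{resource}: no known concept code"


# A human decides ahead of time what each exact code string should trigger.
HANDLERS = {
    "DDS:530": handle_physics,
    "DDS:539": handle_physics,
}


def process(resource, concept_code):
    # Exact string match only; the program never inspects the resource itself.
    handler = HANDLERS.get(concept_code, handle_unknown)
    return handler(resource)
```

    Calling `process("bubble-chamber.jpg", "DDS:539")` selects the physics handler purely because the string matches a dictionary key. The ordinary word "physics" would fall through to `handle_unknown`, which is why the codes would need to be unambiguous.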

    For this to happen, people would need access to some sort of standard “language” or set of codes that represented “concepts” and these codes would have to be reasonably understandable to humans before there would be any major movement towards adding the appropriate metadata (concept codes) to their websites.

    In other words, using the Dewey Decimal System “codes” as an example.

    If people simply began adding something like:


    (Though this latter example is not within standard HTML specs that I know of)

    A computer could then be programmed to recognize this as relating to the concept of “Quantum Mechanics” within the concept of “Physics” (530) within the concept of “Science” (500), even if the resource did not mention any of these “KEY WORDS”, or any words at all. The resource might in fact be a picture of a particle interaction within a bubble chamber as part of a physics experiment, with no words at all, yet the computer would “know” or recognize from the metadata that the resource was related to quantum physics.

    Of course, in this example, it might be that one person used DDS for metadata while someone else used Library of Congress codes, someone else used subject terms (key words), and someone else used another system of categorization altogether… So for something like that to work, the “metadata” codes must be strictly defined and standardized.
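
    The nesting described here can be sketched in a few lines of Python. The `LABELS` table holds only the three labels used in this comment, and `concept_path` is a hypothetical helper; a real classification scheme would need a complete, authoritative table.

```python
# Labels exactly as used in the comment; illustrative, not an official table.
LABELS = {
    "500": "Science",
    "530": "Physics",
    "539": "Quantum Mechanics",
}


def concept_path(code):
    """Return known labels from broadest to narrowest for a three-digit code."""
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"expected a three-digit code, got {code!r}")
    # Broader concepts come from zeroing the trailing digits: 539 -> 530 -> 500.
    candidates = [code[0] + "00", code[:2] + "0", code]
    path = []
    for candidate in candidates:
        label = LABELS.get(candidate)
        if label is not None and label not in path:
            path.append(label)
    return path
```

    With this table, `concept_path("539")` walks from the hundreds class down to the code itself, so the broader concepts are recovered from the digits alone and the resource needs no descriptive words.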
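
    To see what the lack of a shared scheme costs, here is a rough Python sketch of the workaround software would need: a hand-maintained crosswalk table. This is a swapped-in illustration, not something proposed in the articles; the scheme names, the `concept:physics` identifier and the `normalize` helper are all assumptions.

```python
# Hypothetical crosswalk from (scheme, code) pairs to one shared concept ID.
CROSSWALK = {
    ("DDS", "530"): "concept:physics",
    ("LC", "QC"): "concept:physics",
    ("keyword", "physics"): "concept:physics",
}


def normalize(scheme, code):
    # Keywords are matched case-insensitively; classification codes exactly.
    key = (scheme, code.lower() if scheme == "keyword" else code)
    # None means nobody has agreed on a mapping for this pair yet.
    return CROSSWALK.get(key)
```

    Every new scheme or code means another row someone has to add and agree on, which is the burden a single standardized set of codes would remove.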

    And, of course, using DDS is probably out due to possible copyright issues. The LC system is not comprehensive enough to categorize internet resources. So probably some entirely new set of metadata codes would need to be standardized.

    XML goes some distance in this direction, but I’m not sure the potential XML “codes” are sufficiently standardized, nor is XML easy enough for the ordinary webmaster to understand and implement.

    In other words, whatever system of encoding data is used for this “semantic web”, ideally, it should be both computer readable as well as reasonably understandable to humans who must do the actual conceptualization and encoding so as to make the “concepts” computer readable.

  4. Tom Booth Says:

    Apparently the script for this Blog strips HTML tags or anything in angle brackets.

    So, trying another method, the examples in my last post should have been something like:

    <meta name="DDS" value="539">
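
    For what it's worth, reading such a tag back out of a page takes only a short program. The sketch below uses Python's standard `html.parser` module; the `ConceptMetaParser` class and `extract_codes` function are hypothetical names. Standard HTML `meta` elements carry their value in a `content` attribute, so the sketch accepts either `content` or the `value` attribute used in the example above.

```python
from html.parser import HTMLParser


class ConceptMetaParser(HTMLParser):
    """Collect codes from <meta name="DDS" ...> tags (hypothetical convention)."""

    def __init__(self):
        super().__init__()
        self.codes = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if attributes.get("name") == "DDS":
            # Accept the standard "content" attribute or the "value" used above.
            code = attributes.get("content") or attributes.get("value")
            if code:
                self.codes.append(code)


def extract_codes(html):
    parser = ConceptMetaParser()
    parser.feed(html)
    parser.close()
    return parser.codes
```

    Feeding it a page whose head contains the tag above should yield a list with the single code string, and pages without a `DDS` meta tag yield an empty list.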


