The Metaphors of the Net

September 25th, 2022

I. The Genetic Blueprint

A decade after the invention of the World Wide Web, Tim Berners-Lee is promoting the "Semantic Web". Until now, the Internet has been a repository of digital content. It has a rudimentary inventory system and very crude data location services. As a result, most of the content is invisible and inaccessible. Moreover, the Internet manipulates strings of symbols, not logical or semantic propositions. In other words, the Net compares values but does not know the meaning of the values it thus manipulates. It is unable to interpret strings, to infer new facts, to deduce, induce, derive, or otherwise comprehend what it is doing. In short, it does not understand language. Run an ambiguous term through any search engine and these shortcomings become painfully evident. This lack of understanding of the semantic foundations of its raw material (data, information) prevents applications and databases from sharing resources and feeding each other. The Internet is discrete, not continuous. It resembles an archipelago, with users hopping from island to island in a frantic search for relevance.
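
The point about comparing values without knowing their meaning can be illustrated with a small sketch. It is only a toy, not how any real search engine works: the two documents, the search term, and the function name `keyword_search` are all made up for the illustration. A search that merely matches strings returns both documents for an ambiguous term, because nothing in the comparison distinguishes the animal from the car.

```python
# Hypothetical mini-corpus: the same string, two unrelated meanings.
DOCUMENTS = {
    "doc1": "The jaguar is a large cat native to the Americas.",
    "doc2": "The Jaguar dealership unveiled a new sports car.",
}


def keyword_search(term, documents):
    """Return the ids of documents whose text contains the term.

    This is pure string comparison: the function has no notion of
    what the term means, only of whether the characters match.
    """
    needle = term.lower()
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if needle in text.lower()
    )


# Both documents match, whichever meaning the user had in mind.
results = keyword_search("jaguar", DOCUMENTS)
```

The user who wanted the animal and the user who wanted the car receive the same list and must do the interpreting themselves.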

Even visionaries like Berners-Lee do not contemplate an “intelligent Web”. They simply propose to let users, content creators, and web developers assign descriptive meta-tags (“name of hotel”) to fields, or to strings of symbols (“Hilton”). These meta-tags (arranged in semantic and relational “ontologies” – lists of meta-tags, their meanings, and how they relate to each other) will be read by various applications and allow them to process the associated strings of symbols correctly (place the word “Hilton” in your address book under “hotels”). This will make information retrieval more efficient and reliable, and the information retrieved is bound to be more relevant and more amenable to higher-level processing (statistics, the development of heuristic rules, etc.). The shift is from HTML (whose tags are concerned with visual appearances and content indexing) to languages such as the DARPA Agent Markup Language (DAML), OIL (Ontology Inference Layer or Ontology Interchange Language), or even XML (whose tags are concerned with content taxonomy, document structure, and semantics). This would bring the Internet closer to the classic library card catalogue.
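
The “Hilton” example can be sketched in a few lines. This is a minimal illustration under stated assumptions, not DAML, OIL, or any real ontology language: the dictionary `ONTOLOGY`, the second meta-tag (“name of person”), and the function `file_entry` are hypothetical names invented here. The idea shown is only that an application reads the meta-tag, looks it up in a shared ontology, and files the string accordingly, without understanding the string itself.

```python
from collections import defaultdict

# Hypothetical toy ontology: each meta-tag carries a meaning and the
# category it belongs to. A real ontology would be far richer.
ONTOLOGY = {
    "name of hotel": {
        "meaning": "the trading name of a hotel",
        "category": "hotels",
    },
    "name of person": {
        "meaning": "the name of an individual",
        "category": "people",
    },
}


def file_entry(address_book, value, meta_tag):
    """File a tagged string under the category its meta-tag maps to."""
    category = ONTOLOGY[meta_tag]["category"]
    address_book[category].append(value)
    return category


address_book = defaultdict(list)

# The same string of symbols is filed differently depending on its tag.
file_entry(address_book, "Hilton", "name of hotel")
file_entry(address_book, "Hilton", "name of person")
```

The application still does not know what a hotel is. It has simply been told, by whoever assigned the tag, which drawer of the card catalogue the string belongs in.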