Web 3.0: The Next Evolution of Internet Search—and How You Can Prepare for It Now by Publishing Your Content as Linked Data
by Deltina Hay
In the January issue of the Independent, the first of three articles about Web 3.0 focused on the semantic Web and how you can structure your data to make it easy for search engines to recognize. The series continues here by looking at how you can publish your content as linked data and add it to the ever-growing Linked Open Data cloud.
Recapping Part 1: Microformats are a set of markup conventions that help you classify and tag your content so “semantic” search engines can make more sense of it. Using microformats to mark up content about events, contact information, locations, products, reviews, and so forth helps you create “structured data” that is relatively easy for semantic search engines, and for search features such as Google’s new Rich Snippets, to recognize.
The term linked data also refers to a way of structuring data, but this way uses the Web to create links among data from many different datasets and to classify data using an established “data commons.” This means using a common reference to represent a piece of data so that data can be linked easily to and from other sources of data, creating what is referred to as a Web of Data.
The most impressive Web of Data is the Linked Open Data (LOD) cloud. Pictured below as of July 2009, this “cloud” is growing at a tremendous rate. Near its center you can see “DBpedia,” a dataset extracted from Wikipedia. DBpedia is only a small part of the cloud, which gives you an idea of the cloud’s size.

(Image published under the Creative Commons license. Generated by Richard Cyganiak at richard.cyganiak.de/2007/10/lod.)
Linked data is published to the Web using a specific model called the “RDF data model.” Under this model, data from different sources is interlinked through a special linking structure known as “RDF links.”
The resulting Web of Data can be accessed by semantic Web browsers that navigate among different data sources, in much the same way that traditional Web browsers navigate among HTML pages. But a semantic browser follows RDF links to related data, while a traditional browser follows hyperlinks to other Web pages.
For example, if someone is reading data about you and discovers a link to data about a book you’ve published, RDF links will let that person follow the link to a source that has information about reviews of the book or product information about it. So by discovering you, they discover your book, perhaps in the Amazon database, which then leads them to more information in the Google Book Search database, and further, to a review located in a data source somewhere else. The important thing to understand is that the user is surfing through data sources, not popping around on Web sites, which makes the surfing much more relevant and meaningful, and, of course, much less distracting.
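To make the idea of surfing through data sources a little more concrete, here is a minimal Python sketch. It assumes a handful of made-up data sources held in memory (every URI and property name below is hypothetical and does not come from Amazon, Google, or any real dataset) and walks from a person to a book, then on to a store listing and a review, roughly the way a semantic browser might follow RDF links.

```python
from collections import deque

# Hypothetical in-memory "data sources": each resource URI maps to the
# (predicate, object) pairs published about it. All names are illustrative.
DATA_SOURCES = {
    "http://example.org/people/author": [
        ("wrote", "http://example.org/books/my-book"),
    ],
    "http://example.org/books/my-book": [
        ("listedAt", "http://example.com/store/my-book"),
        ("reviewedIn", "http://example.net/reviews/my-book"),
    ],
    "http://example.com/store/my-book": [
        ("format", "paperback"),
    ],
    "http://example.net/reviews/my-book": [
        ("reviewer", "A. Reader"),
    ],
}


def follow_links(start_uri, sources):
    """Breadth-first walk: visit a resource, then every resource it links to."""
    visited = []
    seen = {start_uri}
    queue = deque([start_uri])
    while queue:
        uri = queue.popleft()
        visited.append(uri)
        for _predicate, obj in sources.get(uri, []):
            # Only objects that are themselves described resources can be
            # followed; plain values such as "paperback" end the trail.
            if obj in sources and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return visited


trail = follow_links("http://example.org/people/author", DATA_SOURCES)
for uri in trail:
    print(uri)
```

A real semantic browser would fetch each URI over the Web instead of looking it up in a dictionary, but the navigation pattern is the same: data leads to more data, with no Web pages in between.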
Think of this experience as going to a resource library, opening only one book on the topic you want to research, and being led from this one book to a wealth of other relevant resources without ever having to get up from your chair to search for another book. This is the dream that Tim Berners-Lee, the inventor of the World Wide Web, had when he first envisioned the Web. Through the concept of linked data and the Linked Open Data cloud, we are very close to realizing his dream.
A Quick and Dirty RDF Primer
Note: The following is a very basic explanation of how data is represented using RDF.
RDF—Resource Description Framework—is typically used to represent large datasets that are to be added to the Linked Open Data cloud. Unless you have a large database to add to the cloud, you probably won’t need to know more than what is explained here.
As its name implies, RDF is a way to define resources using a specific framework. That framework is based on the concept of “triples.” Each resource is represented by a number of triples.
A triple consists of a subject, a predicate, and an object, and it mirrors a simple sentence structure like:
Subject | Predicate | Object
Deltina | [has the] Web site | http://www.deltina.com
Deltina | [is] employed at | http://www.socialmediapower.com
Deltina | knows | John Smith
John Smith | [is] also known as | http://DrWho.com
RDF triples take the following forms:
The subject is a URI (a type of link) identifying the described resource.
The object can be the URI of another resource that is in some way related to the subject, or a value like a name, a number, or a date.
And, as in basic sentence structure, the predicate indicates what kind of relationship exists between the subject and the object.
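As a rough illustration of the three parts just described, the following Python sketch stores a few triples and prints them in N-Triples, a simple one-triple-per-line RDF syntax. The subject URI `http://www.deltina.com/foaf.rdf#me` is an assumption made for the example, and the `Triple` class and `to_ntriples` helper are hypothetical names, not part of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str      # URI of the resource being described
    predicate: str    # URI naming the relationship
    obj: str          # URI of a related resource, or a plain value
    obj_is_uri: bool  # True when obj is a URI, False when it is a value


FOAF = "http://xmlns.com/foaf/0.1/"
ME = "http://www.deltina.com/foaf.rdf#me"  # assumed URI, for illustration only

triples = [
    Triple(ME, FOAF + "homepage", "http://www.deltina.com", True),
    Triple(ME, FOAF + "workplaceHomepage", "http://www.socialmediapower.com", True),
    Triple(ME, FOAF + "name", "Deltina Hay", False),
]


def to_ntriples(triple):
    """Render one triple as an N-Triples line.

    URIs go in angle brackets and plain values in double quotes. This sketch
    does not escape special characters inside values.
    """
    if triple.obj_is_uri:
        obj = f"<{triple.obj}>"
    else:
        obj = f'"{triple.obj}"'
    return f"<{triple.subject}> <{triple.predicate}> {obj} ."


for t in triples:
    print(to_ntriples(t))
```

Note how the first two objects are URIs that a browser could follow to more data, while the third is just a value (a name) and leads nowhere further.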
The predicate is also a URI, but predicate URIs come from established “vocabularies”—collections of URIs that can be used to represent information about a broad topic. For example:
Friend-of-a-Friend (FOAF) is a vocabulary for describing people.
Music Ontology is a vocabulary for describing artists, albums, and tracks.
Review Vocabulary is a vocabulary for representing reviews.
GoodRelations is a standardized vocabulary for product, price, and company data.
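In RDF files, predicate URIs are usually written in a short prefixed form such as foaf:knows. The sketch below shows one way that shorthand could be expanded into a full URI. The namespace URIs are the ones these four vocabularies publish as far as I know, but please check each vocabulary's own documentation before relying on them. The `expand` helper is a hypothetical name for this example.

```python
# Prefix-to-namespace table. The URIs are believed correct for each
# vocabulary, but treat them as assumptions and verify before use.
VOCABULARIES = {
    "foaf": "http://xmlns.com/foaf/0.1/",        # Friend-of-a-Friend
    "mo": "http://purl.org/ontology/mo/",        # Music Ontology
    "rev": "http://purl.org/stuff/rev#",         # Review Vocabulary
    "gr": "http://purl.org/goodrelations/v1#",   # GoodRelations
}


def expand(prefixed_name, vocabularies=VOCABULARIES):
    """Turn a prefixed name such as 'foaf:knows' into a full predicate URI."""
    prefix, _, local_name = prefixed_name.partition(":")
    if prefix not in vocabularies or not local_name:
        raise ValueError(f"Unknown prefix or missing name: {prefixed_name!r}")
    return vocabularies[prefix] + local_name


print(expand("foaf:knows"))
print(expand("rev:rating"))
```

Because everyone who uses FOAF ends up with exactly the same URI for "knows," data from unrelated sources can be matched up and linked together.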
Getting Into the Web of Data
Even if you don’t have large sets of data, you still want to get yourself into the Linked Open Data cloud to be ready to take advantage of the next generation of Internet search. Luckily, there is an easy way to do that. You can create what is called a “static RDF file” and upload it to your Web site server.
The most popular of these static files is the Friend-of-a-Friend, or FOAF, file. This file uses the FOAF vocabulary to represent information about you: your name, your place of employment or business, your Web site, and so on. It can also contain information about people you know, links to your profiles on social networking sites, and links to resources that are associated with you, such as your books or other publications. You can create your own FOAF file using a tool like FOAF-a-Matic (see “Linked Data Resources” below).
Here is part of an example FOAF file. Keep in mind that the purpose of the semantic Web is to make content easier for machines to understand. As a result, these files are not necessarily intuitive to us. More and more tools like the FOAF-a-Matic are being developed, however, to help us technologically challenged humans produce our own RDF files to get our content into the Web of Data (for specifics, see “Linked Data Resources” below).
<rdf:RDF
…
  <foaf:Person rdf:ID="me">
    <foaf:name>Deltina Hay</foaf:name>
    <foaf:givenname>Deltina</foaf:givenname>
    <foaf:family_name>Hay</foaf:family_name>
    <foaf:mbox rdf:resource="mailto:deltina@deltina.com"/>
    <foaf:homepage rdf:resource="http://www.deltina.com"/>
    <foaf:workplaceHomepage rdf:resource="http://www.socialmediapower.com"/>
    <foaf:workInfoHomepage rdf:resource="http://www.socialmediapower.com/about"/>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>John Smith</foaf:name>
        <foaf:mbox rdf:resource="mailto:john@example.com"/>
        <rdfs:seeAlso rdf:resource="http://www.example.com/john/foaf.rdf"/>
      </foaf:Person>
    </foaf:knows>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Jane Doe</foaf:name>
        <foaf:mbox rdf:resource="mailto:jane@example.com"/>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>
What this file indicates is that I am a person with a name, an email address, a Web site, and a place of employment, and that I know some people—one of whom has his own RDF file that others can now access to learn more about him.
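If you are curious how a program might pull those same facts out of a FOAF file, here is a small Python sketch. It uses a shortened copy of the example with the namespace declarations filled in (the standard RDF, RDFS, and FOAF namespace URIs, which are an assumption here because the example above elides that part of the file). It treats the file as ordinary XML, so it is not a full RDF parser, and the `summarize` function is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Namespace URIs assumed for the declarations the printed example leaves out.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}
RDF_RESOURCE = "{" + NS["rdf"] + "}resource"

FOAF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:ID="me">
    <foaf:name>Deltina Hay</foaf:name>
    <foaf:homepage rdf:resource="http://www.deltina.com"/>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>John Smith</foaf:name>
        <rdfs:seeAlso rdf:resource="http://www.example.com/john/foaf.rdf"/>
      </foaf:Person>
    </foaf:knows>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Jane Doe</foaf:name>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>"""


def summarize(foaf_xml):
    """Extract the owner's name, home page, and acquaintances from FOAF XML."""
    root = ET.fromstring(foaf_xml)
    me = root.find("foaf:Person", NS)
    homepage = me.find("foaf:homepage", NS)
    friends = []
    for person in me.findall("foaf:knows/foaf:Person", NS):
        see_also = person.find("rdfs:seeAlso", NS)
        friends.append({
            "name": person.findtext("foaf:name", namespaces=NS),
            # A seeAlso link points to that person's own RDF file, if any.
            "see_also": see_also.get(RDF_RESOURCE) if see_also is not None else None,
        })
    return {
        "name": me.findtext("foaf:name", namespaces=NS),
        "homepage": homepage.get(RDF_RESOURCE) if homepage is not None else None,
        "knows": friends,
    }


info = summarize(FOAF_XML)
print(info["name"], info["homepage"])
for friend in info["knows"]:
    print("knows:", friend["name"], friend["see_also"])
```

The seeAlso link for John Smith is the part that makes this linked data: a program that reads my file can go on to read his.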
Once you have your basic FOAF file, save it in the root directory of your Web site server as foaf.rdf. This way, semantic Web browsers can find and recognize the file for what it is and display its content accordingly. You can continue to add content to your FOAF file, such as book information, links to your blogs, links to online communities like Facebook and LinkedIn, and much more. Check “Linked Data Resources” for tools that can help you do this.
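For readers who would rather script this step than fill in a form, the sketch below builds a very small FOAF file in Python and saves it as foaf.rdf. It is only an illustration under stated assumptions: the `build_foaf` and `save_foaf` helpers are hypothetical names, the file contains just three properties, and the example writes to a temporary folder, whereas on a real site you would save the file to your Web server's root directory.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Ask ElementTree to write the familiar rdf: and foaf: prefixes.
ET.register_namespace("rdf", RDF)
ET.register_namespace("foaf", FOAF)


def build_foaf(name, homepage, email):
    """Build a minimal FOAF document describing one person."""
    root = ET.Element(f"{{{RDF}}}RDF")
    person = ET.SubElement(root, f"{{{FOAF}}}Person", {f"{{{RDF}}}ID": "me"})
    ET.SubElement(person, f"{{{FOAF}}}name").text = name
    ET.SubElement(person, f"{{{FOAF}}}mbox", {f"{{{RDF}}}resource": f"mailto:{email}"})
    ET.SubElement(person, f"{{{FOAF}}}homepage", {f"{{{RDF}}}resource": homepage})
    return ET.ElementTree(root)


def save_foaf(tree, directory):
    """Write the document as foaf.rdf inside the given directory."""
    path = os.path.join(directory, "foaf.rdf")
    tree.write(path, encoding="utf-8", xml_declaration=True)
    return path


tree = build_foaf("Deltina Hay", "http://www.deltina.com", "deltina@deltina.com")
# A temporary folder stands in for the Web site's root directory here.
out_path = save_foaf(tree, tempfile.mkdtemp())
print("Saved", out_path)
```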
For Example: Benefits for Books
The Linked Open Data cloud is not just a bunch of interlinked databases that can be conveniently surfed. It provides ways to use data that have been linked in a consistent way. Given the right tools and knowhow, anyone can draw from this tremendous resource to create powerful applications. And because the cloud is “Open,” it makes the data available for all developers to access. Given the types of tools we have seen emerge from Web 2.0, imagine what those talented developers will serve up for us next!
A convenient example is the RDF Book Mashup. This mashup integrates Web 2.0 data sources such as Amazon, Google, and Yahoo, and makes integrated information about books, their authors, reviews, and online bookstores available on the semantic Web. RDF tools can use this information, and you can link to it from your own semantic Web data (that is, your FOAF file).
To add information about you as a publisher to your FOAF file, go to the RDF Book Mashup site (see “Linked Data Resources”) and follow the instructions. First, search the Book Mashup using your name to be certain all your books are listed. Here is the result from mine:

Next, add this line inside the foaf:Person element of your FOAF file (using your own name, of course):
<owl:sameAs rdf:resource="http://www4.wiwiss.fu-berlin.de/bookmashup/persons/Deltina+Hay"/>
Now, not only are you on the Web as a person who knows some other people and has a Web site and an email address, but you are interlinked with other sources in the Web of Data as a publisher of specific books. When others find your FOAF file, they can link to information on your book(s) as well. And they will be able to find information on you if they happen upon your book in the Book Mashup.
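If you maintain FOAF files for several authors, a few lines of Python could generate the sameAs line for each of them. This sketch assumes the Book Mashup builds its person URIs the way my own URI suggests, with the name appended and spaces turned into plus signs. That pattern is inferred from a single example, so confirm the URI on the Book Mashup site before using it. The `same_as_line` helper is a hypothetical name.

```python
from urllib.parse import quote_plus
from xml.sax.saxutils import quoteattr

# Base URI taken from the example in this article. The naming pattern
# (name appended, spaces as "+") is an assumption based on that one example.
BOOK_MASHUP_PERSONS = "http://www4.wiwiss.fu-berlin.de/bookmashup/persons/"


def same_as_line(author_name):
    """Build the owl:sameAs element linking a person to the Book Mashup."""
    uri = BOOK_MASHUP_PERSONS + quote_plus(author_name)
    # quoteattr wraps the value in quotes and escapes XML special characters.
    return f"<owl:sameAs rdf:resource={quoteattr(uri)}/>"


print(same_as_line("Deltina Hay"))
```

For the owl: prefix to be understood, the opening rdf:RDF tag of the FOAF file also needs to declare the OWL namespace, xmlns:owl="http://www.w3.org/2002/07/owl#".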
If much of what I’ve said seems overwhelming, focus on understanding the general principles. Once you have a general understanding of linked data and what the Linked Open Data cloud is, you can start to understand how to position yourself for the newest generation of Internet search.
Deltina Hay (linkedin.com/in/deltinahay), a veteran Web developer and publisher, is a pioneer in social media and Web 2.0, especially with respect to small business and the publishing industry. She is the owner of Dalton Publishing (daltonpublishing.com), Social Media Power (socialmediapower.com), and the innovative social media Web site service PlumbSocial (plumbsocial.com). Her book A Survival Guide to Social Media and Web 2.0 Optimization can be found or requested anywhere books are sold.
Linked Data Resources
FOAF-a-Matic: ldodds.com/foaf/foaf-a-matic
RDF Book Mashup: www4.wiwiss.fu-berlin.de/bizer/bookmashup
Google Rich Snippets: googlewebmastercentral.blogspot.com/2009/05/introducing-rich-snippets.html
Semantic Tools: readwriteweb.com/archives/top_10_semantic_web_products_of_2009.php
Linked Data Tutorial: www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial