Semantic Web Publishing with RDFa
Now, it's time to back up a bit. The term Semantic Web is used in this article to refer to the goal of a machine-parsable Web of structured data, as envisioned by Tim Berners-Lee in his 2001 Scientific American article by that name. Although there still is plenty of spirited debate over exactly how Web 3.0 will take shape, the W3C folks and others have been working diligently on a core set of technologies that has started to gain serious traction in the wild. Check out the layer cake diagram from the W3C (Figure 1).
RDF, the data model on which the whole thing is based, represents the world as a set of triples: subject, predicate and object. The subject can be a URI or a blank node (a kind of temporary variable), the predicate is always a URI, and the object can be a URI, a blank node or a literal. In practice, the predicate is usually a URI in a namespace created for the purpose, like cal:dtstart or cal:summary.
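To make the triple model concrete, here is a minimal Python sketch that holds two statements about the rodeo event as plain tuples. The Triple class, the objects_of helper and the _:event blank-node label are names I made up for illustration; only the cal: namespace URI comes from this article. Storing literals as bare strings is a simplification.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI or blank-node label
    predicate: str  # always a URI
    obj: str        # URI, blank-node label, or literal value (simplified to str)


CAL = "http://www.w3.org/2002/12/cal/ical#"
EVENT = "_:event"  # hypothetical blank-node label standing in for the event

triples = [
    Triple(EVENT, CAL + "dtstart", "20080222T1400-0700"),
    Triple(EVENT, CAL + "summary", "Bull Riding"),
]


def objects_of(graph, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in graph
            if t.subject == subject and t.predicate == predicate]
```

Asking objects_of(triples, EVENT, CAL + "summary") walks the list and returns the matching objects, which is essentially what a triple store does at a much larger scale.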
Vocabularies and ontologies form the backbone of the Semantic Web. You can define your own, and some tools like Semantic MediaWiki create an ontology for you automatically. When defining the terms in a specialized domain, or when creating a private within-enterprise application, creating your own ontology makes sense. For sharing data with the world, I prefer to reuse existing vocabularies as much as possible. (By vocabulary, I mean an RDF file that defines terms and properties; by ontology, I mean a vocabulary that also contains logical rules.) Some widely used vocabs include the following:
foaf: friend of a friend, for identifying people and other entities (xmlns.com/foaf/spec/20071002.rdf).
ical: based on the iCalendar W3 standard, for calendar and event data (www.w3.org/2002/12/cal/ical).
vcard: intended as an electronic business card, it has simple fields for contact information (www.w3.org/2001/vcard-rdf/3.0).
dc: Dublin Core, defining core properties like title and creator (purl.org/dc/elements/1.1).
cc: for Creative Commons licenses (creativecommons.org/ns).
rss: the RSS 1.0 namespace (purl.org/rss/1.0).
Note that in our document, we can choose our own shorthand name for each vocabulary when we list it in the <html> tag. Then, we can use that shorthand to write what is called a CURIE, or Compact URI, like dc:title or foaf:name. In RDFa, each CURIE expands to a full URI (the namespace bound to the prefix, followed by the part after the colon), and CURIEs are much easier to read once you get used to them. One of the core ideas of RDF is to be able to use URIs to refer to concepts and things outside cyberspace, and then use them to make logical statements. So, it helps if the URIs are human-readable.
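The expansion from CURIE to full URI is simple string concatenation, as this Python sketch shows. The expand_curie function and the PREFIXES table are hypothetical names of mine; the two namespace URIs are the cal and contact namespaces used elsewhere in this article. A real RDFa parser reads the prefix bindings from the xmlns: attributes instead of a hard-coded table.

```python
# Prefix bindings, as they would be declared with xmlns: attributes.
PREFIXES = {
    "cal": "http://www.w3.org/2002/12/cal/ical#",
    "contact": "http://www.w3.org/2001/vcard-rdf/3.0#",
}


def expand_curie(curie, prefixes):
    """Expand a CURIE such as 'cal:summary' into a full URI."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError("unknown or missing prefix in CURIE: %r" % curie)
    # The namespace URI and the reference are joined with no separator,
    # which is why namespace URIs normally end in '#' or '/'.
    return prefixes[prefix] + reference
```

Note the comment about the trailing character: if a namespace URI does not end in # or /, the expanded URI runs the two parts together.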
Going back to the rodeo schedule example, suppose we want to list the contestants in each event. Now, we get into the power of RDFa—the ability to connect different types of data together in a logical way right in an HTML file. The first step is to pick or create a vocabulary to describe the contestants. FOAF is the standard for referring to people, but I also want to specify that they are contestants in the rodeo. I did a search on Swoogle for the word contestant, and after a few minutes examining the available ontologies, I decided that http://smartweb.semanticweb.org/ontology/sportevent is the most apt. I also want to add a contact person for the rodeo as a whole, using the vCard vocabulary. So, I added foaf, contact and sportevent vocabularies to the list at the start of the document, which now looks like this:
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:foaf="http://xmlns.com/foaf/spec/20071002.rdf"
      xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#"
>
Zooming in on just the event itself, we can add some contestants:
<div rel="cal:Vevent">
  <span property="cal:dtstart"
        content="20080222T1400-0700">2:00PM</span> :
  <span property="cal:summary">Bull Riding</span>
  <ul>List of Contestants:
    <li rel="sportevent:Contestant" id="Marchi">
      <span property="foaf:name"
            about="#Marchi">Guilherme Marchi</span><br/>
      <a rel="foaf:weblog" about="#Marchi"
         href="http://example.com/~Marchi">Marchi's blog</a>
    </li>
    <li rel="sportevent:Contestant" id="Briscoe">
      <span property="foaf:name" about="#Briscoe">Travis Briscoe</span>
    </li>
  </ul>
</div>
And, at the bottom of the page, we add a footer with general contact information:
<p class="footer" about="/main/page/for/Rodeo">
  For general information or event questions, please call
  <span property="contact:phone">800-555-1212</span>
  or email
  <a rel="contact:email"
     href="mailto:email@example.com">firstname.lastname@example.org</a>
</p>
RDFa uses several existing HTML attributes and creates a few new ones. Recall that an RDF statement has three parts: subject, predicate and object. The about= or instanceOf= attribute of a tag can specify the subject. The rel=, rev= or property= attribute specifies the predicate. Then, the object may be the value of href= or content=, or the actual content enclosed by the tag pair. Note that the subject may be set in a parent tag and, if missing, defaults to the document itself. Refer to the RDFa Syntax Specification and Primer documents for a detailed explanation of all the ways that RDF can be embedded in HTML.
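As a rough illustration of those rules, the following Python sketch builds one triple from a single element's attributes. It is a deliberate simplification, not the processing model from the RDFa Syntax Specification: it ignores rev=, instanceOf= and nesting, and the function name, its parameters and the BASE constant are my own assumptions.

```python
from urllib.parse import urljoin

BASE = "http://abra.info/lj/rodeo.xhtml"  # URL of the example document


def triple_from_element(attrs, text, parent_subject=None, base=BASE):
    """Return (subject, predicate, object) for one element, or None.

    attrs is a dict of the element's attributes; text is its enclosed content.
    """
    # Subject: an explicit about= wins, then the parent's subject,
    # and finally the document itself.
    if "about" in attrs:
        subject = urljoin(base, attrs["about"])
    else:
        subject = parent_subject or base

    # rel= with href= links the subject to another resource.
    if "rel" in attrs and "href" in attrs:
        return (subject, attrs["rel"], urljoin(base, attrs["href"]))

    # property= yields a literal: content= if present, else the tag's text.
    if "property" in attrs:
        return (subject, attrs["property"], attrs.get("content", text))

    return None
```

Feeding it the attributes of the foaf:name span for Marchi, with "Guilherme Marchi" as the text, gives a subject of the document URL plus #Marchi, which matches the shape of the Distiller output. Predicates are left as CURIEs here to keep the sketch short.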
Re-verifying through the RDFa Distiller returns the necessary @prefix lines to specify the vocabularies, followed by the N3:
@prefix cal: <http://www.w3.org/2002/12/cal/ical#>
(...all the other prefixes...)

<http://abra.info/lj/rodeo.xhtml> cal:Vevent
    [ sportevent:Contestant
          <http://abra.info/lj/rodeo.xhtml#Briscoe>,
          <http://abra.info/lj/rodeo.xhtml#Marchi>;
      cal:dtstart "20080222T1400-0700";
      cal:summary "Bull Riding" ].

<http://abra.info/main/page/for/Rodeo>
    contact:email <mailto:email@example.com>;
    contact:phone "800-555-1212".

<http://abra.info/lj/rodeo.xhtml#Briscoe>
    foaf:name "Travis Briscoe".

<http://abra.info/lj/rodeo.xhtml#Marchi>
    foaf:name "Guilherme Marchi";
    foaf:weblog <http://example.com/~Marchi>.
It's just like that. Well, that's not exactly how it went. The RDFa Distiller fails tersely on less-than-valid XHTML, which means that one mismatched tag or missing quotation mark causes unexplained failure. So, what I really did was use the user-friendly W3C Validator service first, at validator.w3.org, which reminded me about some missing tags and also to save my example as .xhtml so it would be returned with the correct MIME type. After passing the validator, I renamed the file and ran it back through the RDFa Distiller to generate the above N3 output. (The Distiller also has some caching issues. It was designed as a check of the syntax specification, not as a user tool. I use it anyway because I like the N3 output format.)
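If you want a quick local check before reaching for the online validator, a well-formedness test catches the most common slip, such as a <span> closed with a mismatched tag. The sketch below uses Python's standard xml.etree.ElementTree parser; check_well_formed is a hypothetical helper of mine. It tests only XML well-formedness, not validity against the XHTML DTD, so it does not replace the W3C Validator.

```python
import xml.etree.ElementTree as ET


def check_well_formed(markup):
    """Return None if markup parses as XML, else the parser's error message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        # The message from the parser includes the line and column.
        return str(err)
    return None


good = '<p><span property="contact:phone">800-555-1212</span></p>'
bad = '<p><span property="contact:phone">800-555-1212</phone></p>'

for label, sample in (("good", good), ("bad", bad)):
    problem = check_well_formed(sample)
    print(label, "->", problem or "well-formed")
```

One caveat with this approach: named entities such as &nbsp; are rejected by this parser unless they are declared, so a page that uses them can fail here even though it is valid XHTML.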
Another useful tool for checking your triple logic is the GetN3 bookmarklet available from www.w3.org/2006/07/SWD/RDFa/impl/js. Once you've saved it as a bookmark, you can use it to extract the RDFa quickly as N3 of any page you have in the browser. It's also more forgiving than the Distiller, so you can use it as a quick logic check without worrying about valid XHTML.