What are microformats anyway? Well, the folks at microformats.org define them as:
“Simple conventions for embedding semantic markup for a specific problem domain in human-readable (X)HTML/XML documents, Atom/RSS feeds, and "plain" XML that normalize existing content usage patterns using brief, descriptive class names often based on existing interoperable standards to enable decentralized development of resources, tools, and services.”
That sure is a mouthful, but what does all that really mean? It means microformats are really not that complicated. They were designed from the beginning to be simple and readable by humans and machines. They are a simple and concise way to add meaning to web pages.
Right now you are probably asking yourself, “Why do we need microformats?” Before I answer that question we have to talk about the semantic web. The semantic web is an extension of the World Wide Web, one where content has meaning to both humans and machines. Currently, the web is full of static content. We use search engines to find web pages. Those engines match words, not meanings. After years of using sites like Google and Yahoo, we've all become pretty adept at finding what we want. Yet we still have to sift through countless results that do not help us with our goal. The semantic web is an attempt to solve that problem. If we could give web pages meaning that both humans and machines could read, we could search the web smarter and not harder. Imagine searching the web for information published in August of 2006 that the author felt had a relationship to microformats. Try that with your favorite search engine and the static content it indexes and you should have some idea of how useful the semantic web could be.
Back to your question. We need microformats because they are an evolutionary step in the direction of the semantic web. We can use them to embed meaning into our web pages. Meaning that a machine like an aggregator or web spider could understand. As is often the case with new technologies, we end up with the chicken and the egg paradox. No semantic data so no semantic tools and vice versa. Microformats allow us to quickly and cheaply add meaning to our sites right now. It's happening as we speak. Designers are embedding meaning into their web pages and developers are building tools to extract that meaning. There are other technologies targeting this problem, but none of them are ready to use today. Many of them require a significant change in both the type and way that web data is delivered.
With microformats in mind, I can imagine a web where I post to my blog a review for a book I just read, a classified ad for a 1946 Indian Chief and the fact that I have a mint condition 1887 Morgan dollar for sale. Well alright, I can do that right now, but what will it get me? Not much. With a semantic web it would be a different story. Review aggregators would index my book review for others to see. A web spider for a classifieds site would understand that I wanted to purchase a specific motorcycle and an auction bot would see that I have a rare coin for sale. This is decentralized data. I don't go to three separate sites to add all this information to huge databases. The consumers and producers come find me. I can imagine a web where my book review helps someone buy a book, I buy a motorcycle and I sell my rare coin, without having to do anything more than post some information to my blog.
Let's see if we can begin to unfold this mysterious semantic web. We'll start by examining a couple of microformats to see how they live and breathe.
We will begin with an elemental microformat. Elemental microformats typically make use of the rel and rev attributes and they do not contain child formats. For this example we'll use a microformat called XFN (XHTML Friends Network). Let's assume I place the following anchor on my home page.
<a href="http://juan.example.org/">Juan</a>
By using the rel attribute of the anchor tag I can include meta-data about the relationship between myself and Juan. According to specification, the rel attribute describes the relationship the href has to the source document. You can see how this microformat follows the specification's lead.
<a href="http://juan.example.org/" rel="met friend colleague">Juan</a>
Juan could include an XFN hyperlink back to my home page and to other people he has relationships with. Maybe something like this.
<a href="http://brucestockwell.net/" rel="met friend colleague">Bruce</a>
<a href="http://eric.example.org/" rel="met">Eric co-worker</a>
note: see http://gmpg.org/xfn/11 for a full list of XFN vocabulary options.
You can start to see this taking shape as a social map. The people are the nodes, and the data in the rel attribute defines the edges between them. Now imagine, if you will, a browser capable of parsing these relationships and displaying a social map of sorts. I could then see Juan's friends and maybe even find out we have mutual acquaintances. This social map would allow me to navigate the web in a whole new way.
The idea of the semantic web is starting to unfold. It's a web where documents have meaning that makes sense to both humans and machines. A web where browsing from page to page isn't the only way to get around.
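To make the social map a little more concrete, here is a minimal Python sketch of how a tool might turn XFN links into edges. It is only an illustration: the XFNParser class, the xfn_edges function and the short XFN_VALUES list are names I made up for this example, and a real tool would use the full vocabulary from the XFN profile.

```python
from html.parser import HTMLParser

# Only the XFN values used in this article; the full list lives in the XFN profile.
XFN_VALUES = {"met", "friend", "colleague", "co-worker"}


class XFNParser(HTMLParser):
    """Collects (source, target, relationship) edges from rel attributes."""

    def __init__(self, source_url):
        super().__init__()
        self.source_url = source_url
        self.edges = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        # rel holds a space-separated list of values; keep the XFN ones.
        for value in (attrs.get("rel") or "").split():
            if value in XFN_VALUES:
                self.edges.append((self.source_url, href, value))


def xfn_edges(source_url, html):
    """Return the XFN edges that lead away from the page at source_url."""
    parser = XFNParser(source_url)
    parser.feed(html)
    parser.close()
    return parser.edges


juan_page = """
<a href="http://brucestockwell.net/" rel="met friend colleague">Bruce</a>
<a href="http://eric.example.org/" rel="met">Eric co-worker</a>
"""

edges = xfn_edges("http://juan.example.org/", juan_page)
for source, target, relationship in edges:
    print(source, "-[" + relationship + "]->", target)
```

Each edge is one line of the map. Run the same function over every page it discovers and you have the beginnings of the social graph described above.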
Here is another elemental microformat called votelinks. Votelinks evolved from the idea that the number of hyperlinks to a web page does not fully quantify its popularity. Votelinks allow an author to express their mood regarding a hyperlink they are including on a page. For example, I might have a blog entry that discusses several different code editors and contains the following hyperlinks:
<a href="http://www.vim.org/">Vim</a>
<a href="http://www.gnu.org/software/emacs">Emacs</a>
<a href="http://en.wikipedia.org/wiki/Edit_(MS-DOS)">Dos Edit</a>
Of course I have my preference as to which editor is the best. I'm sure we all do. I express my preference through the use of the rev attribute. According to specification, the rev attribute describes the reverse relationship the href has to the source document. I choose from the controlled vocabulary options of the votelinks format, which are vote-for, vote-abstain and vote-against. The markup would look like this:
<a href="http://www.vim.org/" rev="vote-for">Vim</a>
<a href="http://www.gnu.org/software/emacs" rev="vote-abstain">Emacs</a>
<a href="http://en.wikipedia.org/wiki/Edit_(MS-DOS)" rev="vote-against">Dos Edit</a>
As web services crawl through this (X)HTML, they can interpret some meaning from the hyperlinks presented. These web services can now, more accurately, rank web pages to reflect the mood of the author.
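As a rough sketch of what such a web service might do, the Python below tallies votes across pages. The scoring weights (+1, 0, -1), the names VoteLinksParser and tally_votes, and the second blog page are all assumptions of mine for the sake of the example; the votelinks format itself only defines the three values.

```python
from html.parser import HTMLParser

# The three votelinks values, mapped to a score (the weights are an assumption).
VOTE_SCORES = {"vote-for": 1, "vote-abstain": 0, "vote-against": -1}


class VoteLinksParser(HTMLParser):
    """Adds each link's vote to a shared {href: score} dictionary."""

    def __init__(self, scores):
        super().__init__()
        self.scores = scores

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        for value in (attrs.get("rev") or "").split():
            if value in VOTE_SCORES:
                self.scores[href] = self.scores.get(href, 0) + VOTE_SCORES[value]


def tally_votes(pages):
    """Sum the votes found in a list of HTML documents."""
    scores = {}
    for html in pages:
        parser = VoteLinksParser(scores)
        parser.feed(html)
        parser.close()
    return scores


my_blog = """
<a href="http://www.vim.org/" rev="vote-for">Vim</a>
<a href="http://www.gnu.org/software/emacs" rev="vote-abstain">Emacs</a>
<a href="http://en.wikipedia.org/wiki/Edit_(MS-DOS)" rev="vote-against">Dos Edit</a>
"""
# A hypothetical second author who also likes Vim.
another_blog = '<a href="http://www.vim.org/" rev="vote-for">Vim</a>'

scores = tally_votes([my_blog, another_blog])
for href, score in scores.items():
    print(score, href)
```

A plain link counter would see two links to Vim and one each to the others. With votelinks, the crawler can also tell that the link to Dos Edit was not an endorsement.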
The last elemental microformat we'll cover today is rel-tag. It is used for folksonomy, or tagging as it is commonly referred to, and it is everywhere. Tagging is a type of user generated taxonomy, or more simply a system where users can categorize content as they see fit. Tagging is used by the likes of Flickr, del.icio.us and Last.fm. Every major blogging software uses it in some fashion. Here is a sample from my blog, trimmed down some to highlight the rel-tag features.
<a href="...blog/category/programming/" rel="tag">Programming</a>
<a href="...blog/category/xml/" rel="tag">XML</a>
<a href="...blog/category/web-design/" rel="tag">web design</a>
What we have here are three hyperlinks. Their tags are programming, xml and web-design, respectively. Notice how the anchor element text gives us a description of the tag, but the actual tag itself is determined by the last path segment of the href attribute.
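Here is a small Python sketch of that rule: the tag comes from the end of the href, not from the link text. The names tag_from_href and RelTagParser are mine, and the blog.example.org URLs are placeholders I invented, since the hrefs in my sample above are trimmed.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


def tag_from_href(href):
    """Return the last path segment of a URL, which rel-tag treats as the tag."""
    path = urlparse(href).path.rstrip("/")
    return unquote(path.rsplit("/", 1)[-1])


class RelTagParser(HTMLParser):
    """Collects the tags of every anchor whose rel attribute includes "tag"."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href and "tag" in (attrs.get("rel") or "").split():
            self.tags.append(tag_from_href(href))


# Placeholder URLs standing in for the trimmed hrefs in the sample above.
sample = """
<a href="http://blog.example.org/category/programming/" rel="tag">Programming</a>
<a href="http://blog.example.org/category/xml/" rel="tag">XML</a>
<a href="http://blog.example.org/category/web-design/" rel="tag">web design</a>
"""

parser = RelTagParser()
parser.feed(sample)
parser.close()
tags = parser.tags
print(tags)
```

Note that the third link reads "web design" on the page, while the tag a machine sees is web-design.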
We've seen how relationships can be built using XFN, how Votelinks can express our mood towards individual hyperlinks and how rel-tag can help us categorize or tag a web page or blog entry. Let us try our hand at a compound microformat called hCard. Compound microformats are built from elemental microformats and (X)HTML elements. The hCard microformat is built off the vCard (RFC 2426) specification. Here is an example vCard:
BEGIN:VCARD
VERSION:3.0
N:Stockwell;Bruce
FN:Bruce Stockwell
URL:http://brucestockwell.net/
ORG:Stockwell Information Services
END:VCARD
Microformats adopt existing standards wherever possible. You will recognize this philosophy in play with other microformats too. This is an hCard implementation of the previous vCard:
<div class="vcard">
<a class="url fn" href="http://brucestockwell.net/">Bruce Stockwell</a>
<div class="org">Stockwell Information Services</div>
</div>
Notice how the elements from the vCard map neatly onto the hCard. This philosophy of reuse lends itself well to easy adoption and leverages existing tools and technologies. Part of the beauty, and no doubt the popularity, of microformats has been their ability to leverage existing formats and standards to provide semantic solutions we can use today.
Back to our hCard. Compound microformats rely heavily on the class attribute, using what is called the class design pattern. If you are used to using CSS then you might be asking, “Won't all this use of the class attribute get in the way of my stylesheets?” The answer is a quick and easy no. Remember you can apply more than one class name to the class attribute. We could mark up my hCard data like this:
<div class="vcard fancyborder">
<a class="url fn" href="http://brucestockwell.net/">Bruce Stockwell</a>
<div class="org cooltext">Stockwell Information Services</div>
</div>
Where fancyborder and cooltext are two CSS classes that would provide some formatting. It's important to remember here that XHTML is XML and XML was designed not only for use in displaying data but for transferring data as well. It's easy to forget the class attribute has a broad semantic meaning that doesn't limit its use to just a selector for CSS.
Let's continue making an hCard. I went to http://microformats.org/code/hcard/creator and used their Creator to build a more realistic hCard:
<div id="hcard-Bruce-Stockwell" class="vcard">
<a class="url fn" href="http://brucestockwell.net">Bruce Stockwell</a>
<div class="org">Stockwell Information Services</div>
<a class="email" href="mailto:bruce.stockwell@gmail.com">bruce.stockwell@gmail.com</a>
<div class="adr">
<span class="locality">Virginia Beach</span>,
<span class="region">VA</span>
<span class="postal-code">23454</span>
</div>
<a class="url" href="xmpp:bruce.stockwell@gmail.com">IM</a>
<p style="font-size:smaller;">This <a href="http://microformats.org/wiki/hcard">hCard</a> created with the <a href="http://microformats.org/code/hcard/creator">hCard creator</a>.</p>
</div>
It displays like this:
This is great for showing us how to create an hCard, but I think it obscures some of the beauty of microformats. Microformats are at their best when they are free form. We can mark up our existing data with the correct meta-data and turn what looks like a business card into a pleasant introduction:
<p class="vcard">Hello, my name is <a class="url fn"
href="http://brucestockwell.net">Bruce Stockwell</a>. I work for <span
class="org">Stockwell Information Services</span>. I live in the Redwing
Neighborhood of <span class="adr"><span class="locality">Virginia Beach</span>,
<abbr class="region" title="Virginia">VA</abbr> <span
class="postal-code">23454</span></span>. If you'd like to contact me please feel free to <a
class="url" href="xmpp:bruce.stockwell@gmail.com">IM</a> me or <a class="email" href="mailto:bruce.stockwell@gmail.com">email</a> me.</p>
Which displays like this:
Hello, my name is Bruce Stockwell. I work for Stockwell Information Services. I live in the Redwing Neighborhood of Virginia Beach, VA 23454. If you'd like to contact me please feel free to IM me or email me.
See how easy it was to embed the hCard into the paragraph element? The anchor element pulls double duty as a container for both url and fn. The abbr element handles region perfectly. Many of the other hCard elements like org, adr and locality are defined using the span element. Span elements are great for giving semantic meaning to data that needs to be displayed in line. But don't jump too quickly at using spans or even divs for that matter, without first considering your alternatives. We're trying to move toward the semantic web. It only makes sense that our (X)HTML be as semantically correct as possible before we add or replace any markup to support microformats. Block elements like paragraphs and lists make great containers for microformats and they help maintain the semantic structure of the document.
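To show that a machine can still pull the card back out of that friendly paragraph, here is a simplified Python sketch. It is not a complete hCard parser. It assumes the fragment holds a single hCard, it only knows the handful of properties used in this article, and the HCardParser name and its rules (href for url and email, title for abbr, text otherwise) are my own shorthand.

```python
from html.parser import HTMLParser

# Only the hCard properties used in this article.
PROPERTIES = {"fn", "org", "url", "email", "locality", "region", "postal-code"}


class HCardParser(HTMLParser):
    """Collects hCard property values from class names (simplified sketch)."""

    def __init__(self):
        super().__init__()
        self.card = {}
        self.open_elements = []  # stack of (tag, text_properties, text_parts)

    def _add(self, name, value):
        self.card.setdefault(name, []).append(value)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        text_properties = []
        for name in sorted(classes & PROPERTIES):
            href = attrs.get("href")
            if name == "url" and href:
                self._add(name, href)
            elif name == "email" and href:
                if href.startswith("mailto:"):
                    href = href[len("mailto:"):]
                self._add(name, href)
            elif tag == "abbr" and attrs.get("title"):
                # The abbr title carries the machine-readable value.
                self._add(name, attrs["title"])
            else:
                text_properties.append(name)
        self.open_elements.append((tag, text_properties, []))

    def handle_data(self, data):
        for _tag, text_properties, parts in self.open_elements:
            if text_properties:
                parts.append(data)

    def handle_endtag(self, tag):
        # Pop until the matching start tag, recording any collected text.
        while self.open_elements:
            open_tag, text_properties, parts = self.open_elements.pop()
            text = " ".join("".join(parts).split())
            for name in text_properties:
                self._add(name, text)
            if open_tag == tag:
                break


introduction = """
<p class="vcard">Hello, my name is <a class="url fn"
href="http://brucestockwell.net">Bruce Stockwell</a>. I work for <span
class="org">Stockwell Information Services</span>. I live in the Redwing
Neighborhood of <span class="adr"><span class="locality">Virginia Beach</span>,
<abbr class="region" title="Virginia">VA</abbr> <span
class="postal-code">23454</span></span>. Feel free to <a class="url"
href="xmpp:bruce.stockwell@gmail.com">IM</a> me or <a class="email"
href="mailto:bruce.stockwell@gmail.com">email</a> me.</p>
"""

parser = HCardParser()
parser.feed(introduction)
parser.close()
card = parser.card
for name, values in card.items():
    print(name, values)
```

The words around the properties are simply ignored, which is exactly why the paragraph can read naturally for people and still work for software.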
Let's look at another compound microformat called hCalendar. The hCalendar format is built off of the iCalendar (RFC 2445) specification. Here is an example of iCalendar:
BEGIN:VCALENDAR
VERSION:2.0
BEGIN:VEVENT
URL:http://www.gordonbiersch.com/
DTSTART:20080324T230000Z
DTEND:20080325T070000Z
SUMMARY:My Fortieth Birthday Party
LOCATION:Gordon Biersch\, 4561 Virginia Beach Blvd\, Virginia Beach\, VA 23462
END:VEVENT
END:VCALENDAR
Here is an hCalendar implementation of the previous iCalendar:
<div class="vevent">
<a class="url" href="http://www.gordonbiersch.com/">Gordon Biersch</a>
<span class="summary">My Fortieth Birthday Party</span>:
<abbr class="dtstart" title="2008-03-24T18:00:00-05:00">
March 24th 6:00pm</abbr> - <abbr class="dtend" title="2008-03-25T02:00:00-05:00">
March 25th 2:00am EST</abbr>,
at <span class="location">Gordon Biersch, 4561 Virginia Beach Blvd, Virginia
Beach, VA 23462</span>
</div>
We are going to free form this very businesslike event into something a little more friendly:
<p class="vevent">Please join me at <a class="url" href="http://www.gordonbiersch.com/">Gordon Biersch</a> for
<span class="summary">my Fortieth Birthday Party</span>. The festivities will start on <abbr class="dtstart" title="2008-03-24T18:00:00-04:00">March 24th at 6:00pm</abbr> and continue until <abbr class="dtend" title="2008-03-25T02:00:00-04:00">
2:00am the next morning</abbr>. Gordon Biersch is located at <span class="location">4561 Virginia Beach Blvd, Virginia Beach, VA 23462</span>
</p>
Here is the result:
Please join me at Gordon Biersch for my Fortieth Birthday Party. The festivities will start on March 24th at 6:00pm and continue until 2:00am the next morning. Gordon Biersch is located at 4561 Virginia Beach Blvd, Virginia Beach, VA 23462
Once again we are able to mix the microformats into the free form flow of a normal looking paragraph. I can't stress enough how important I think this aspect of microformats is. You do not have to sacrifice your human semantics to allow the machine to have its semantics. While specifications for vCard and iCalendar both have very simple and easy to understand formats, neither format allows you to include your own personal semantic meaning. This is where microformats really fit in. They are a layer, if you will, on top of the specifications for HTML 4.0 and XHTML 1.0 and they give you the freedom to meet both the needs of humans and machines.
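Here is a sketch, in Python, of the machine's side of that bargain for the invitation above. The HCalendarParser name is mine, and the parser is deliberately naive: it assumes summary and location contain plain text with no nested elements, and it reads dates only from the title attribute.

```python
from datetime import datetime
from html.parser import HTMLParser

TEXT_PROPERTIES = {"summary", "location"}
DATE_PROPERTIES = {"dtstart", "dtend"}


class HCalendarParser(HTMLParser):
    """Pulls a few hCalendar properties out of one event (simplified sketch)."""

    def __init__(self):
        super().__init__()
        self.event = {}
        self.current = None  # text property being collected, if any
        self.parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        for name in classes & DATE_PROPERTIES:
            if attrs.get("title"):
                # The title holds the ISO 8601 date; the text is for humans.
                self.event[name] = datetime.fromisoformat(attrs["title"])
        if "url" in classes and attrs.get("href"):
            self.event["url"] = attrs["href"]
        for name in classes & TEXT_PROPERTIES:
            self.current, self.parts = name, []

    def handle_data(self, data):
        if self.current:
            self.parts.append(data)

    def handle_endtag(self, tag):
        # Assumes text properties have no nested elements.
        if self.current:
            self.event[self.current] = " ".join("".join(self.parts).split())
            self.current = None


invitation = """
<p class="vevent">Please join me at <a class="url" href="http://www.gordonbiersch.com/">Gordon Biersch</a> for
<span class="summary">my Fortieth Birthday Party</span>. The festivities will start on <abbr class="dtstart" title="2008-03-24T18:00:00-04:00">March 24th at 6:00pm</abbr> and continue until <abbr class="dtend" title="2008-03-25T02:00:00-04:00">
2:00am the next morning</abbr>. Gordon Biersch is located at <span class="location">4561 Virginia Beach Blvd, Virginia Beach, VA 23462</span>
</p>
"""

parser = HCalendarParser()
parser.feed(invitation)
parser.close()
event = parser.event
duration = event["dtend"] - event["dtstart"]
print(event["summary"], "at", event["location"])
print("Duration:", duration)
```

The reader sees "2:00am the next morning" while the program works with the exact timestamp in the title attribute, so neither has to compromise.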
So which browser do you use to view all of these microformats you've put in your home page or blog? Well the truth is there are no browsers right now that will parse microformats. Fortunately for early adopters, Mike Kaply has built a wonderful microformats add-on for Firefox called Operator. Operator parses a web document looking for microformats. When it finds a microformat it recognizes, Operator allows the user to take different actions with the data it has discovered. You can search Flickr for photos whose tags match tags on a specific page, download hCard information to use in your email client, locate addresses with Google or Yahoo maps and upload hCalendar data to your Google calendar. On the surface, these actions may seem very simple, but there is a big paradigm shift beginning to happen. With Operator, our browser is no longer just showing us web pages, our browser is brokering information for us!
You can download Operator 0.7 at https://addons.mozilla.org or live a little closer to the edge and download Operator 0.8a from Mike's site at http://www.kaply.com/weblog/category/operator/. With Operator installed, you can use the examples above to export my contact information, find my approximate location on Google Maps, and add my birthday to your Google Calendar. You can also visit any blog and quickly search Technorati, Flickr and YouTube for matching tags.
The semantic web keeps unfolding. We are giving meaning to our web pages that the browser can understand and use. The really good news is Mike is working closely with the Mozilla team to include microformats in Firefox 3.0. In the not so distant future, we will have a browser that is truly a broker of information. Microformats could add a significant connection between the web browser and desktop applications. In fact, if you'll allow me to stretch an analogy a bit, I don't think it will be too long before the browser handles microformats as easily as it handles other types of content.
I've made several references to examples of microformats that are in use today. As the microformat community would say, “the evolution is here”. Slowly, page by page, site by site, (X)HTML is undergoing an evolution. What once made sense only to humans is evolving. What once was static content is now coming alive with meaning. The semantic web is growing more and more every day. As it does, so does its value.
Some big hitters are getting behind microformats. Last year Yahoo Local added 15 million hCards to the scene. Yahoo Tech uses hReview and Yahoo's Upcoming supports hCalendar. Microsoft has even begun to get behind the idea of microformats. With sites like Flickr, del.icio.us, Cork'd, Jyte and LinkedIn all supporting one or more microformats, you can see a real trend toward the adoption of these open standards. Technorati now has http://kitchen.technorati.com/search/ where you can search for contacts, events and reviews. Now people can publish their microformats, via a blog or other web page, to web services that aggregate this information for others to search.
Decentralization has been at the core of the Internet since its inception. Of all the microformat adopters thus far, it is the decentralized services that interest me the most. It's through this concept that I believe the power of microformats really shines through. Of course this won't be the end of silos. Nor will it be the end of lock-ins. What it will mean is more options for those of us who desire a little more control over what we do. You and I will get the most out of microformats by advocating them for everyday use and by submitting them to aggregators like Technorati, Pingerati and the like. Certainly the web's greatest power has always been its ability to link people together, flatten markets and share information. Microformats have been created with this spirit in mind. They are a natural evolution in progress. A step forward in technology that can bring more of the power of the web to our fingertips. The semantic web is our chance to own our own information, to publish it anywhere and to move it freely whenever we desire.
So we've unfolded the semantic web using microformats as our guide. We've learned that microformats are designed to solve very specific problems. There is no complex design here, just problems being approached in the simplest way possible. The solutions we've seen were designed from the beginning to be human readable first, machine second. Existing standards and formats have been used in every situation possible. The vCard and iCalendar formats are mapped almost 1:1 onto hCard and hCalendar. With adr we've seen some modularity and reuse in action. Encompassing all of these principles is the theme of decentralization. Microformats at their core encourage the development of decentralized products and services. There is a growing community of supporters for this technology. Web designers are embedding microformats into their content. Web developers are building tools to utilize this data.
I said earlier there are other technologies with an eye toward the semantic web. Resource Description Framework (RDF) is certainly a heavyweight in this category. Yet as much of an underdog as microformats might appear to be, they are quickly becoming a stepping stone for technologies like RDF. Gleaning Resource Descriptions from Dialects of Languages (GRDDL) is proving to be a path for the transformation of microformats into RDF. As these specifications and standards mature, more interoperability will develop. Web designers and developers would then be able to choose what technology is right for their specific problem domain without sacrificing the ability to transform their data into other technology domains.
Open standards are a wonderful thing to see in action. Microformats are just one example of how sharing knowledge empowers us all to achieve more. In the bigger picture of the semantic web, microformats are quickly resolving the chicken and egg paradox by becoming the egg that begets the chicken. We certainly owe a thanks to the creators of microformats and others like them. These innovative people continue to make the web a place where ideas can flourish and be combined with other ideas to create things we have yet to think of.
Bruce Stockwell
bruce@brucestockwell.net