Which Semantic Markup Will You Choose?

Perhaps the first question should be, why bother to use semantic markup at all? What’s to be gained?

Semantic markup offers three major benefits:

  • It gives the search engines a great deal of information, which allows them to better determine your relevance to a search query;
  • It puts more information in the SERPs, so that users can better determine whether your page is likely to provide the information they seek;
  • It adds a high degree of classification to the data on your page, building out the graph of Linked Data.

For the Search Engines

Semantic markup, regardless of which style you eventually select, gives the search engines a great deal of additional information to draw on when responding to a search query. This information also becomes part of their index, contributing to the graph of Linked Data, which should gradually make searching more efficient as the Semantic Web develops.

For the Users

The search engines can display certain aspects of that expanded information in the SERPs, in the form of Rich Snippets. These snippets not only convey more information to users, enticing them to visit your page, they also tend to stand out from the more mundane entries, gaining more attention.

The Options

There are several options available for semantic markup: RDFa, microdata, microformats, Open Graph… Which you choose to use is up to you. Microformats has about run its race, and is almost certain to see less use as time goes on. Microdata is the heart of schema.org, with some refinements, and is likely to be around for some time. RDFa, while similar in function, is more scalable, and should also exist for some time. Open Graph is something of a special case, but doesn’t conflict with any of the others.

I think there are three primary criteria to be considered when selecting the language you want to use:

  • Which has the ontology best suited to your needs;
  • Which is receiving the widest acceptance;
  • Which you are most familiar with or accustomed to.

The vocabularies of both Open Graph and schema.org are somewhat limited, although for most sites they will serve well for what they’re intended. RDFa has the most expressive capabilities of any of the semantic markup languages. But then, since Google, Bing and Yahoo joined in lockstep behind schema.org, microdata is the one that seems to have the greatest acceptance, and its adoption is likely to continue.

I’ve been interested in RDFa for a few years now, and one of the most common arguments I’ve heard against it in discussions is that it’s so complicated. I’ll grant that it’s a little more complex than straight HTML, but then, it delivers SO much more, too!

Figure 1 Microdata – schema.org

Figure 2 RDFa 1.1

As you can see, there’s not a lot of difference… one calls for span itemprop and the other calls for span property. Hardly enough to warrant calling RDFa more complicated. To be fair, though, the RDFa 1.1 in the above example is the “Lite” version of RDFa, and is somewhat simpler than its predecessor.
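
To make the itemprop/property comparison concrete, here is a minimal sketch in Python. The two markup samples are hypothetical (the person and values are invented for illustration, not taken from the figures), and the small PropertyCollector helper is only an assumption for this example, not a real microdata or RDFa parser. It simply reads the itemprop or property attribute from each tag and shows that both versions carry the same data.

```python
from html.parser import HTMLParser

# Hypothetical sample markup: the same (invented) person described twice,
# once with microdata attributes and once with RDFa Lite 1.1 attributes.
MICRODATA = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Jane Doe</span>
  <span itemprop="jobTitle">Editor</span>
</div>
"""

RDFA_LITE = """
<div vocab="http://schema.org/" typeof="Person">
  <span property="name">Jane Doe</span>
  <span property="jobTitle">Editor</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect {property name: text} for tags that carry a given attribute.

    Deliberately simplistic: it ignores nesting, types, and URLs, which a
    real microdata or RDFa processor would have to handle.
    """

    def __init__(self, attr_name):
        super().__init__()
        self.attr_name = attr_name
        self.current = None  # property name of the tag being read, if any
        self.found = {}

    def handle_starttag(self, tag, attrs):
        value = dict(attrs).get(self.attr_name)
        if value is not None:
            self.current = value
            self.found[value] = ""

    def handle_data(self, data):
        if self.current is not None:
            self.found[self.current] += data.strip()

    def handle_endtag(self, tag):
        self.current = None


def collect(html, attr_name):
    """Return the properties found in `html` under the attribute `attr_name`."""
    parser = PropertyCollector(attr_name)
    parser.feed(html)
    return parser.found


microdata_props = collect(MICRODATA, "itemprop")
rdfa_props = collect(RDFA_LITE, "property")
print(microdata_props)
print(rdfa_props)
```

In this toy case the only differences are the attribute names: itemscope and itemtype on one side, vocab and typeof on the other, and itemprop versus property on the spans.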

The nice thing about RDFa is that it’s a lot more flexible. Depending upon your niche, that additional flexibility may or may not be of any use to you, of course. RDFa categorizes data with some XHTML attributes that aren’t available with the other markups, although the most common ones can be utilized with microdata.

Schema.org, having been adopted as the vocabulary endorsed by Google, Bing and Yahoo, may be the most likely to survive the test of time. All three major search engines presently can parse RDFa and microformats, as well as microdata, but whether they will continue to do so in the face of increased implementation of schema.org is a question.

Hopefully, schema.org will grow to envelop more classifications, but if your data is intricate enough that schema.org doesn’t offer you the granular segmentation you need, the vocabularies available to RDFa may be the quickest solution.

Right after schema.org was first introduced, Google said that whichever markup you choose, you should avoid mixing them on the same page, as their algorithms had difficulty parsing mixed languages. However, Kavi Goel, Google Product Manager, later retracted that statement, and in fact, RDFa and microdata have since been successfully combined on the same page, with no issues.
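
As a rough illustration of what combining the two on one page can look like, the hypothetical snippet below carries both sets of attributes on the same elements. The sample markup and the SyntaxCounter helper are assumptions made for this sketch, and the attribute sets are a simplified subset of each syntax. It says nothing about how any search engine actually parses such a page; it only sorts the attributes by the syntax they belong to.

```python
from html.parser import HTMLParser

# Hypothetical page fragment that carries microdata and RDFa Lite attributes
# side by side on the same elements (person and values invented).
COMBINED = """
<div vocab="http://schema.org/" typeof="Person"
     itemscope itemtype="http://schema.org/Person">
  <span property="name" itemprop="name">Jane Doe</span>
</div>
"""


class SyntaxCounter(HTMLParser):
    """Record which microdata and RDFa attributes appear, in document order."""

    # Simplified subsets of each syntax's attributes, assumed for this sketch.
    MICRODATA_ATTRS = {"itemscope", "itemtype", "itemprop"}
    RDFA_ATTRS = {"vocab", "typeof", "property"}

    def __init__(self):
        super().__init__()
        self.microdata = []
        self.rdfa = []

    def handle_starttag(self, tag, attrs):
        for name, _value in attrs:
            if name in self.MICRODATA_ATTRS:
                self.microdata.append(name)
            elif name in self.RDFA_ATTRS:
                self.rdfa.append(name)


counter = SyntaxCounter()
counter.feed(COMBINED)
print("microdata:", counter.microdata)
print("RDFa:", counter.rdfa)
```

The obvious cost is that every fact is stated twice, once per syntax, which makes the markup noticeably heavier.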

You’ll need to decide which is best for your site. There is a great deal of partisanship involved in the discussion of which markup is the most appropriate, and frankly, there is no single correct answer. What works best for one site might not be best for another.

Assuming that schema.org offers sufficient capability for the site in question, I would probably lean toward the microdata markup, based primarily on its somewhat higher adoption level. But then, that’s just me.

Which will you choose?

Editor’s Note: Thanks to Aaron Bradley for pointing out some poor wording that might have confused some readers. Schema.org is the vocabulary, whereas microdata is the actual markup used with it and embraced by Google, Bing and Yahoo.


4 thoughts on “Which Semantic Markup Will You Choose?”

  • Aaron Bradley

    You’ve made the common mistake here of conflating syntaxes and vocabularies.

    You cannot choose between “RDFa, schema.org, microdata”; RDFa and microdata are markup protocols, and schema.org is a vocabulary (as you, indeed, demonstrate by marking up code with RDFa using schema.org).

    One way or another you cannot, as you do, talk about RDFa and then say, “Schema.org, on the other hand, having been adopted as the markup endorsed by Google….” The markup protocol – or syntax – endorsed by Google was initially microdata, but they did indeed change later to “allow” RDFa. But what Google et al. have specifically supported is the schema.org *vocabulary*.

    • Wissam Dandan (post author)

      Hello, Aaron-

      You are right, of course. My reference to schema.org as the markup isn’t really accurate. I just wrote this post the way I think of it. But as it stands, I can see that it might mislead someone who’s not very familiar with it.
      Thanks for raising the point! I’ll edit it accordingly.

  • Manu Sporny

    Hi Wissam, I’m the current chair of the RDFa 1.1 Working Group at the W3C. Overall, good article. There are still some other issues/points:

    Open Graph
    ----------

    Open Graph is a vocabulary that uses RDFa. Open Graph isn’t a syntax and Facebook does not support Microdata.

    Support for RDFa has grown
    --------------------------

    Google (with their Knowledge Graph) and Facebook (with their Open Graph) have shown that machine-readability of documents is an important and very useful part of search and social. Google supports RDFa 1.0, RDFa 1.1, and RDFa 1.1 Lite[1]. Facebook only supports RDFa[2] and has recently expanded their support to RDFa not just in the head of the document, but the body as well. So, the support of RDFa markup among large search/social companies has grown by a very large degree in the past couple of years.

    Vocabulary Support
    ------------------

    There are hundreds of vocabularies for RDFa. Check out http://prefix.cc/ .
    There is only one major one for Microdata (schema.org).

    RDFa Lite 1.1 markup is technically equivalent to Microdata
    -----------------------------------------------------------

    RDFa Lite 1.1 was designed to be 100% technically equivalent to Microdata[7][8]. You can take a Microdata document, do a search/replace of Microdata keywords with RDFa Lite 1.1 keywords, and get /more accurate/ data out (RDFa adds type information for things like strings, URLs).

    RDFa is an official standard, Microdata is not
    ----------------------------------------------

    RDFa is the only markup format that is supported by all the major search engines and Facebook. RDFa 1.0 and RDFa 1.1 are also the only markup formats that are official W3C Recommendations[3][4] (world standards) at this point in time. HTML5+RDFa 1.1 is expected to become a Recommendation in the next couple of months. Microdata will not become a W3C Recommendation until 2014[5] (this means it could still change, although it’s unlikely to change at this point).

    Active Community
    ----------------

    RDFa has a community behind it[6] that is actively developing the specification and technology. Additionally, core members of the RDFa Working Group and Community have been actively involved in helping people integrate RDFa into their systems and websites. Microdata has no such community undertaking these sorts of endeavors (if someone claims it does, ask them to point to it – it doesn’t exist).

    Not Picking a Winner is Bad for the Web
    ---------------------------------------

    At present, many authors are adding both RDFa 1.1 and Microdata markup to their documents in an attempt to hedge against one of the markup formats going away. This leads to really ugly documents that duplicate information. Both formats are here to stay for the foreseeable future, but other organizations need to get involved and back a particular format if one is going to win out in the long run. Having two technologies that do the exact same thing, where both technologies are shoved into documents, is bad for the Web. Organizations need to start taking a leadership role and backing one of them in order to carry out what W3C was hoping would be a market-based approach to picking a winner.

    Additional Points
    -----------------

    Microdata is /only/ used for schema.org. It has no significant adoption outside of that use case. Granted, it’s a large use case, but implementers should keep in mind that the only reason Microdata got adoption is because Google said it was the only option in the beginning. Google has since changed their stance on RDFa (giving it equal footing with Microdata), but that isn’t as widely known.

    Closing
    -------

    This is not to say that Microdata is bad in and of itself. Just that RDFa is technically equivalent, just as simple to use, even more widely supported than Microdata,

    [1] http://semanticweb.com/breaking-schema-org-announces-intent-to-support-rdfa-lite_b24623
    [2] http://ogp.me/
    [3] http://www.w3.org/TR/2012/REC-rdfa-core-20120607/
    [4] http://www.w3.org/TR/2012/REC-rdfa-lite-20120607/
    [5] http://www.w3.org/2007/03/HTML-WG-charter#other
    [6] http://www.w3.org/2010/02/rdfa/
    [7] http://manu.sporny.org/2012/mythical-differences/
    [8] http://www.w3.org/TR/2013/WD-html-rdfa-20130207/#sotd

    • Wissam Dandan (post author)

      Thanks, Manu, for chiming in with some great clarification and references for our readers. I assure you, no introduction is necessary – many of us have been following you and the WG’s work for some time!
      Great stuff!
