Semantic SEO: What is the difference between schema.org and Microdata?

Developers who are new to schema.org and to semantic SEO techniques are often confused about the relationship between schema.org and Microdata, Microformats, RDFa, GoodRelations, and other standards.

Here is a quick explanation that I have given so often that I assume it may be useful for others:

When you expose structured data from within Web content by adding extra markup to HTML content, you have essentially two components:

1. A vocabulary (also known as a data schema, ontology, or data dictionary, depending on the background of the people you speak to): This provides global identifiers for types of things (“Product”, “Car”, “Restaurant” – often called “classes” or “types”) and for properties (e.g. “screen size”, “weight” – often called “properties” or “attributes”).

2. A syntax for publishing the data within Web pages in HTML. The syntax is the convention for the actual characters used to publish a piece of data. The relevant syntaxes here are RDFa, Microdata, and, more recently, JSON-LD.

Popular vocabularies on the Web are schema.org, GoodRelations, FOAF, SIOC, and a few others.
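
To make the idea of “global identifiers” concrete, here is a minimal Python sketch that expands a prefixed term into its full identifier. The namespace URIs are the published ones for the four vocabularies named above; the `expand` helper and the prefix names are only illustrative and not part of any standard library.

```python
# Namespace URIs of the vocabularies mentioned above.
# The short prefixes are conventional, but the choice is an assumption of this sketch.
NAMESPACES = {
    "schema": "http://schema.org/",
    "gr": "http://purl.org/goodrelations/v1#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
}


def expand(term: str) -> str:
    """Turn a prefixed term such as 'schema:Product' into a global identifier."""
    prefix, _, local_name = term.partition(":")
    return NAMESPACES[prefix] + local_name


print(expand("schema:Product"))  # prints http://schema.org/Product
print(expand("gr:Offering"))     # prints http://purl.org/goodrelations/v1#Offering
```

The point is that the identifier belongs to the vocabulary alone; nothing in it says how the data will later be written into a page.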

At Web scale, the absolutely dominant vocabulary for mainstream search engines is schema.org. GoodRelations is a special case, since 99% of the GoodRelations vocabulary is now integrated in schema.org, so you do not have to choose between the two. In other words, schema.org is now a new namespace for using GoodRelations. Additional vocabularies may have relevance on the long tail and can typically be used in addition to schema.org with no negative effect. Once they have gained sufficient popularity, search engines may start to care.

Now, you can use the same vocabulary in multiple syntaxes. For instance, you can publish schema.org in RDFa, Microdata, or JSON-LD. The most appropriate syntax depends on the purpose and on the target applications of your data. In theory, Microdata and RDFa in Web content should be equally well supported by search engines. In practice, support varies.
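
As an illustration of “same vocabulary, different syntax”, the following Python sketch renders one small record as JSON-LD, Microdata, and RDFa Lite. The product data and the three helper functions are made up for this example. It only handles flat text properties, so treat it as a sketch rather than a complete serializer.

```python
import json
from html import escape

# Hypothetical example record; name, description and sku are schema.org Product properties.
product = {
    "name": "Example Phone",
    "description": "A phone used for illustration only.",
    "sku": "EX-123",
}


def to_json_ld(item_type: str, props: dict) -> str:
    """Render the record as a JSON-LD island in a script element."""
    doc = {"@context": "http://schema.org", "@type": item_type, **props}
    # Escape "</" so that a value cannot close the script element early.
    body = json.dumps(doc, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


def to_microdata(item_type: str, props: dict) -> str:
    """Render the record as Microdata attributes on visible HTML."""
    lines = [f'<div itemscope itemtype="http://schema.org/{item_type}">']
    for name, value in props.items():
        lines.append(f'  <span itemprop="{escape(name)}">{escape(value)}</span>')
    lines.append("</div>")
    return "\n".join(lines)


def to_rdfa(item_type: str, props: dict) -> str:
    """Render the record as RDFa Lite attributes on visible HTML."""
    lines = [f'<div vocab="http://schema.org/" typeof="{item_type}">']
    for name, value in props.items():
        lines.append(f'  <span property="{escape(name)}">{escape(value)}</span>')
    lines.append("</div>")
    return "\n".join(lines)


for render in (to_json_ld, to_microdata, to_rdfa):
    print(render("Product", product))
    print()
```

All three outputs use the same type and property names from schema.org; only the characters around them change.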

As of now, I would recommend the following:

1. Microdata syntax for schema.org. RDFa works, but not all structural variants of the same data will be understood by search engines and you need to be a real expert to find out which ones work and which ones don’t.

2. RDFa for GoodRelations in the original namespace, since for historic reasons, search engines know well how to process it.

3. JSON-LD for schema.org in emails and other upcoming scenarios; see also https://developers.google.com/gmail/actions/reference/formats/json-ld.
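
For the email case, a rough Python sketch of what “JSON-LD in an email” can look like is given below. It builds a multipart message whose HTML part carries a JSON-LD island. The function name, the sample values, and the choice of the schema.org types `EmailMessage` and `ViewAction` are assumptions made for illustration; any further requirements that a mail provider imposes before it honors such markup are not covered here.

```python
import json
from email.message import EmailMessage
from html import escape


def build_email_with_markup(subject: str, text: str,
                            action_name: str, action_url: str) -> EmailMessage:
    """Build a message with a plain-text part and an HTML part that embeds JSON-LD."""
    markup = {
        "@context": "http://schema.org",
        "@type": "EmailMessage",
        "description": text,
        "potentialAction": {
            "@type": "ViewAction",
            "name": action_name,
            "url": action_url,
        },
    }
    html = (
        "<html><head>\n"
        '<script type="application/ld+json">\n'
        f"{json.dumps(markup, indent=2)}\n"
        "</script>\n"
        f"</head><body><p>{escape(text)}</p></body></html>"
    )
    msg = EmailMessage()
    msg["Subject"] = subject
    msg.set_content(text)                      # plain-text fallback
    msg.add_alternative(html, subtype="html")  # HTML part with the JSON-LD island
    return msg


# Hypothetical values, for demonstration only.
message = build_email_with_markup(
    "Your movie is ready",
    "Watch the movie online",
    "Watch movie",
    "https://example.com/watch",
)
print(message.get_body(preferencelist=("html",)).get_content())
```

Note that the visible text and the structured description come from the same variable, which is one simple way to keep the two from drifting apart.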

Microformats are a special case, since they combine syntax and vocabulary. For very simple data structures, this works well, and Microformats are widely understood by search engines. It is just my personal opinion, and I am sure advocates of Microformats will see things differently, but in the light of schema.org and generic syntaxes like Microdata, RDFa, and JSON-LD, Microformats will be limited to very basic usages, and likely fade out.

So in a nutshell, schema.org in Microdata is currently the most widely understood and recommended variant.

schema.org in RDFa and, in particular, in JSON-LD may become more important in the future, but you will have to monitor closely to what degree search engines can actually process data in those syntaxes.

4 thoughts on “Semantic SEO: What is the difference between schema.org and Microdata?”

  1. John Biundo

    Hi Martin,

    This is a very helpful breakdown of the concepts and terminology. Google’s big push for structured data markup will inevitably expose more and more people to this discipline, and there’s real benefit in translating these concepts into a vocabulary familiar to this new audience.

    I’m wondering if JSON-LD should be elevated to first-class citizenship in your “hierarchy”? It may still be early days, but we’re not far from this point, I suspect.

    There’s increasing evidence that Google is consuming, and in some cases preferring, the JSON-LD syntax. Case in point: their Structured Data Markup Helper (https://www.google.com/webmasters/markup-helper) offers a choice of Microdata or JSON-LD as the syntax for the generated structured data. The help page says “Microdata and JSON-LD are two different ways to mark up your data using the schema.org vocabulary. It’s best to choose either microdata or JSON-LD and avoid using both types on a single page or email. Google prefers microdata for web content.” We also now see JSON-LD syntax on the schema.org type pages, and of course it has taken center stage on the Gmail Actions in the Inbox enhancements.

    So yes, Google does “prefer” microdata, but often in Google-speak this can be translated to “if you don’t really know what you’re doing, the safest bet is X”. I have seen this phenomenon happen numerous times with Google. So I translate this to “we fully support JSON-LD”. I know that’s just an opinion, and while I am in the process of testing this hypothesis in the real world, I’m confident acting on this opinion.

    The reason I mention this is made clear by the points raised in your immediate previous blog post (“JSON-LD: Finally, Google Honors Invisible Data for SEO”), which I heartily concur with. Implementing in-page markup is much harder than dropping in an “island of code” via JSON-LD. Granted, it comes with an increased responsibility to ensure that the human visible data is tightly synchronized with the invisible code (and concerns about abuse here are another likely factor in Google’s tepid endorsement), but that is a solvable problem.

    Cheers,
    John

    1. heppresearch Post author

      Thanks – basically, I think that Microdata and JSON-LD have partly different target applications. The more data you have in your page, the more granular it is, and the more frequently it changes, the more attractive JSON-LD becomes. But of course, having access to both human-readable content and structured data is valuable for algorithms, so I do not think JSON-LD will completely supersede Microdata in the foreseeable future.

  2. Meraj A. Khan

    Martin,

    This is a nice summary; however, IMHO, adding Facebook’s Open Graph to the write-up would make it a much more useful reference.
