JSON-LD: Finally, Google Honors Invisible Data for SEO

For quite a while, I have been arguing that RDFa as a syntax for structured data in Web content is problematic when it comes to exposing more granular data than just a few property names. While many advocates of RDFa stressed that reusing the exact same visible elements for structured data markup, as in this example

<body> 
<div property="vcard:tel">+49-89-1234-0</div> 
</body>

was beneficial because it reduces redundancy, it also raises complexity for developers, since you violate the principle of “separation of concerns” – you have to align a given HTML tree structure with a given data structure, dictated by the vocabulary, like schema.org or GoodRelations.

As a consequence, I once developed and promoted the “RDFa in Snippets Style” approach, where the RDFa markup would reside in blocks of invisible <div> or <span> elements, like this:

<body> 
<!-- Content for humans --> 
<div>+49-89-1234-0</div> 
<!-- RDFa rich meta-data --> 
<div property="vcard:tel" content="+49-89-1234-0"></div>
</body>

This has been a big success – most of the GoodRelations extensions for shop software, running on at least 20,000 Web shops globally, use that approach and get their content honored by Google.

Now, one caveat has always been that Google indicated that invisible markup, i.e. RDFa elements that do not reuse visible content, would not be honored. The likely rationale for that guideline was that

  1. invisible markup invites spammers that try to manipulate the search engine,
  2. a link to human-readable content allows structured data and textual content to be combined for information extraction heuristics, and
  3. the data quality is likely higher for visible content (since humans will complain otherwise).

Now, in silence, RDFa in “Snippet Style” (and similar patterns in Microdata) has long been accepted by Google, as long as other quality indicators for the site were positive. But there was always a doubt, which was bad, since the development effort for weaving advanced data markup in RDFa or Microdata syntax into HTML templates in a form that combined visible content elements with data markup was, in my experience, 5 to 10 times higher than with RDFa in “Snippet Style”.

The big news is that this uncertainty is going away, since Google now openly moves to accepting data markup in JSON-LD syntax not tied to visual elements.
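
As a rough illustration, the phone number from the snippets above could be carried in a JSON-LD block like the following. This is only a sketch under a few assumptions: it uses the schema.org vocabulary (the `Organization` type and its `telephone` property) instead of the vCard property from the RDFa examples, and the company name is a made-up placeholder. It is not taken from Google's documentation.

```html
<body>
<!-- Content for humans -->
<div>+49-89-1234-0</div>
<!-- JSON-LD data block, independent of any visible element -->
<!-- Assumption: schema.org vocabulary; "Example Company" is a placeholder -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Company",
  "telephone": "+49-89-1234-0"
}
</script>
</body>
```

The data structure now lives entirely inside the script element, so the HTML tree for human readers can change without touching the markup for machines.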

Of course, this is just a first signal, but I personally think that in the future, we will see JSON-LD in script elements for all advanced data markup, and RDFa and Microdata only for the very simple use-cases.

That is a good sign towards a broader use of data markup for e-commerce, for sure.

8 thoughts on “JSON-LD: Finally, Google Honors Invisible Data for SEO”

  1. Mark Harrison

    Hi Martin,

    This announcement from Google is potentially very good news. In GS1 (the standards body behind barcodes, networked RFID, traceability, electronic data interchange in supply chains), we have a new project (GS1 Digital / GTIN+ on the Web) where we’re trying to encourage manufacturers and retailers to include structured data about products in their web pages. [ http://gs1.org/digital ] We’re already starting to see some major companies preparing to pilot this or deploy this. I agree totally that it’s going to be so much easier for them to drop in a single block of JSON-LD into their existing product pages than to try to do the semantic markup inline within the visible HTML markup – I’ve tried doing that with RDFa and Microdata and there are IMHO just too many tricky issues that will trip them up and result in a broken set of triples.

    However, the whole community needs really good testing tools for extracting and visualising the embedded structured data, whichever format is used for encoding it.
    I really like what http://rdfa.info/play does for visualising facts extracted from RDFa markup.

    Publicly, Google are saying that they accept JSON-LD but my experience of the Google Structured Data Testing Tool is that it is usually failing to extract any structured data from JSON-LD that was generated using their own Structured Data Markup Helper – or from JSON-LD examples that appear in the schema.org documentation.

    This current situation does not give anyone confidence that Google really accepts / ingests JSON-LD yet.
    Press releases and blog posts are good news – but I really think these have to be fully supported by corresponding improvements to the public-facing testing tools – ideally before the press release and blog posts.

    Maybe you have had a more positive experience of JSON-LD validation using Google’s existing tools?

    1. heppresearch Post author

      Hi Mark,
      First: Thanks for stopping by! Note that my post was about the strategic shift this implies. The Google validation tools, and the actual honoring of data in sites, often lag behind the strategic decisions by anything from a few weeks to several months or more. Note that Google and other search engines also need training data for their heuristics before activating features based on new types or sources of data, so innovations that are widely adopted by site owners will likely work in live Google systems faster than others.

      Having said that, I hope that the Google Structured Data Testing Tool and the Google live systems will implement schema.org changes faster in the future.

      Martin

  2. Ben Racicot

    Awesome post. It's really too bad how difficult it is to make use of this stuff. Do you think that we should be attempting to make use of JSON-LD at this point? Is it worth the time now, in July 2014?

    1. heppresearch Post author

      I think that for the moment, JSON-LD is recommended only for those uses where Google et al. explicitly instruct you to use it, e.g.

      1. in HTML emails, as described in

      and

      2. in music event markup, as described in .

      It will take a bit of time for the full search engine ecosystem to be able to consume JSON-LD data, but this day will come.

      Martin

      1. Ben Racicot

        Thanks for the reply, Martin. I just noticed that Schema.org now offers a JSON-LD version alongside its Microdata and RDFa examples.

      2. heppresearch Post author

        Yes, but that does not necessarily mean that the entire infrastructure of search engines can understand all syntaxes equally well.

  3. heppresearch Post author

    Actually, good news: by mid-June 2015, Google had updated the documentation for all usages of schema.org to include JSON-LD (and to list it as the first choice). It is safe to assume that you can now use JSON-LD even for breadcrumbs and Rich Snippets for Products.
