How to speak ‘Search Engine’

The challenge of how to ‘speak’ search engine and tell it how to surface our content is what Search Engine Optimisation is all about. But are we doing it as well as we could?

Christian J. Ward, partnerships lead at Yext, gave a webinar in partnership with Brighton SEO on ‘How to Speak Search Engine’, in which he looked at the current state of search and the problems inherent in how we produce the content that we expect search engines to find.

Search has changed dramatically since Google first began indexing the web in 1998, both in scale and in nature. Google alone executes more than two trillion searches every year – a scale that we can barely comprehend. Search, said Ward, is not just a process for a brand; it’s becoming the number one way that we interact with information generally.

But the way that we search has changed, too. At a recent CMA Digital Breakfast, digital journalist Adam Tinworth remarked that Google is becoming “much more of an answer engine” than a search engine – searches are increasingly phrased in the form of a question, and innovations like the Knowledge Graph and Featured Snippets aim to answer searchers’ questions without them needing to leave Google.

We all want Google’s ‘answer engine’ to surface our content in response to searcher queries. One way to help ensure this happens is to write content that will satisfy questions that users might have when coming to our websites.

But even once we have, how can we direct Google and other search engines to the content that will provide the best answer?

Feeding baby Google

To illustrate a problem inherent with the way that we approach content online, Ward used an image which has to be the best depiction of ‘peak content’ that I’ve seen so far.

A presentation slide featuring a photo of an unhappy looking baby being fed with a spoon. The baby is wearing a bib with the word "Googoo" made up of letters from Google's old logo. To the left is a list of content types: Blogs, Ad copy, Featured articles, Webpages, Product write-ups, Menus, blurbs, Services, Lists. Underneath this the text reads "Unstructured..." and then "YUCK!"

These days, brands and websites are churning out more content than ever before in an effort to keep up with each other: blogs, ad copy, sponsored content, product write-ups, ordinary webpages and lots more.

“We’re trying to feed Google – the baby – great content information that, to some degree, it doesn’t want,” said Ward.

At least, not in a form that it can’t easily interpret.

“We pump out so much content that it is very difficult for Google to analyse it and to know what we’re talking about. And it’s partially because it’s unstructured content.”

As an example of how confusing this can be in practice, Ward looked at the search term “tombstone”, which has a whole array of possible meanings: it’s the name of a popular 90s Western; the name of a town in Arizona (after which the film was named); a word meaning ‘headstone’ or ‘gravestone’; a brand of pizza; a Marvel comic book; and more. Which of these is going to be most relevant to the searcher?

A Google search results page for the keyword "tombstone". A drop-down list below the search bar shows the suggested searches "tombstone - film" and "tombstone - city in Arizona" as well as "tombstone cast" and "tombstone pizza". The search results mostly relate to the film Tombstone, and also include a Twitter user named TheLivingTombstone.

Of course, part of the game here is trying to guess what the searcher intends when they search for the word “tombstone”. But in our content, as well, we have to make it clear which “tombstone” we’re referring to, so that Google can more easily home in on the right content and serve it to the user.

If you have a webpage about tombstones, and Google can’t tell whether it’s about headstones or pizzas, it won’t be able to show it to a user who is searching for one or the other.

Search engines want to provide their users with richer data in search results: useful information like event dates, reviews, menus and other details that can answer their query at a glance, or at least help them decide which result will be the most relevant.

Ward quoted Sundar Pichai, the CEO of Google, who said in his keynote speech at Google I/O:

“It’s not just enough to give [users] links. We really need to help them get things done in the real world.”

Ward believes that Google is working towards a point where users will never have to open an app or website at all.

While this sounds like a very distant future (after all, there are bound to be some circumstances in which users are searching in order to find a website or app, not just an answer from Google), there’s no denying that Google has taken a huge step in this direction in recent years.

Putting definition around the cow

So what can content creators do to move with this trend, and set their websites apart from everything else in the vast sea of online content?

Ward showed a black-and-white photograph, which has been used by Ellen Langer in her work on mindfulness, and asked webinar attendees to volunteer what they thought it was a picture of.

Suggestions came back: a turtle, a skull, the Hindenburg. But when a few guiding lines were added to the image, the subject became clear: it is in fact a picture of a cow.

“Now that you see it, it’s impossible to unsee it,” said Ward. “There’s a lot of relationship around that, where just a little bit of definition can burn a pathway. And search works a lot like that.”

In other words, content creators need to put that bit of ‘definition’ around their content in a way that tells search engines what it represents, and what type of content it is. There’s a way to do this using search engine ‘language’, and it’s called structured data.

Structured data has been around for a few years now, and is known as a way to help search engines assess and understand content so that they can place it more accurately on the search engine results page (SERP). Yet in spite of this, a shockingly low proportion of website owners actually make use of it.

Take Schema.org, a shared vocabulary for structured data that is the result of a collaboration between Google, Bing, Yahoo! and Yandex, designed so that the same markup can be understood by all search engines.
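
To tie this back to the “tombstone” problem, here is a minimal sketch in Python of how a Schema.org type could tell three otherwise identical page titles apart. The three pages are hypothetical and the descriptions are deliberately bare; `Movie`, `City` and `Product` are real Schema.org types, but which type a real site should use is a judgment call for its owner.

```python
import json

# Three hypothetical pages that all share the title "Tombstone".
# The "@type" value is what states which meaning each page intends.
PAGES = {
    "film": {"@context": "https://schema.org", "@type": "Movie", "name": "Tombstone"},
    "town": {"@context": "https://schema.org", "@type": "City", "name": "Tombstone"},
    "pizza": {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": "Tombstone",
        "category": "Frozen pizza",
    },
}


def describe(page: dict) -> str:
    """Summarise a page as 'name (type)' to show how the type disambiguates."""
    return f'{page["name"]} ({page["@type"]})'


for key, page in PAGES.items():
    print(key, "->", describe(page))
    print(json.dumps(page))
```

The names are identical in all three cases; only the declared type differs, and that is the piece of ‘definition’ a search engine can act on.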

A study by Searchmetrics in 2014 found that 36.6% of Google search results incorporated Schema rich snippets, yet only 0.3% of websites actually made use of Schema markup at all.

The study also found that pages which used Schema ranked on average 4 places higher in search than pages which didn’t, although Searchmetrics was keen to emphasise that this might not be entirely down to structured data.

But search results which use Schema are widely agreed to result in higher click-through rate, as they include more useful, relevant and attractive information like pictures, reviews, opening hours, pricing information and more.

So since this study was conducted two years ago, has the number of pages marked up with Schema increased significantly?

Ward did some quick calculations. The Schema.org website proudly proclaims that “Over 10 million sites use Schema.org to markup their pages and email messages.”

A slide from Christian Ward's webinar with white text on a dark background. The title is "Really? 10 Million?" and the text reads, "We passed one billion websites in September of 2014, and it's closer to 1.08 billion today. 10,000,000 divided by 1,080,000,000 = 0.926%. Less than 1%. Nice work, everyone!"

While this figure might sound impressive, it becomes less so when you realise that we passed one billion websites in September 2014, and the number today is closer to 1.08 billion. 10 million as a percentage of 1.08 billion equals… 0.926%. That’s an increase of only 0.626 percentage points on the 0.3% reported in Searchmetrics’ study, and still less than 1% of the total websites out there.
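
For anyone who wants to check the sums, the calculation can be sketched in a few lines of Python. The inputs are simply the figures quoted above (the Schema.org claim, the estimate on Ward’s slide and Searchmetrics’ 2014 figure); the variable names are my own.

```python
sites_using_schema = 10_000_000    # figure quoted from the Schema.org website
total_websites = 1_080_000_000     # estimate quoted on Ward's slide
share_2014 = 0.3                   # Searchmetrics' 2014 figure, in percent

# Share of all websites that use Schema.org markup, in percent.
share_now = sites_using_schema / total_websites * 100

# Difference between the two shares, in percentage points.
change = share_now - share_2014

print(f"Share today: {share_now:.3f}%")
print(f"Change since 2014: {change:.3f} percentage points")
```

Note that the two figures were measured in different ways, so the comparison is a rough guide rather than a like-for-like trend.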

“It’s staggering,” said Ward of the low number, “when you think of the ramifications of how much better search does when we can explain it.”

It’s not easy speaking search engine

So then why do so few website owners and content creators use Schema markup on their sites? “There’s a good reason for this,” Ward said. “We all know this is hard work. I don’t think it’s that we mean to be lazy, I just think that ultimately this is very hard to do.”

Until quite recently, for example, all Schema markup code had to be added inline, around the individual elements of the page.

Every element, from addresses and opening hours to reviews, needed to be defined individually with Schema, resulting in a lot of coding legwork and no small amount of headaches when it came to fitting it in with all the other code already on the page.
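
To give a flavour of the inline approach, here is a minimal sketch in Python. The restaurant snippet is hypothetical (the name, phone number and cuisine are made up), and the small `ItempropCollector` class is my own illustration of how a parser might read the attributes, not how any search engine actually does it. The `itemscope`, `itemtype` and `itemprop` attributes themselves are standard HTML microdata.

```python
from html.parser import HTMLParser

# Hypothetical restaurant snippet, marked up inline with Schema.org microdata.
# Every fact needs its own itemprop attribute on the element that displays it.
INLINE_HTML = """
<div itemscope itemtype="https://schema.org/Restaurant">
  <h1 itemprop="name">Example Bistro</h1>
  <span itemprop="telephone">+44 1273 000000</span>
  <span itemprop="servesCuisine">French</span>
</div>
"""


class ItempropCollector(HTMLParser):
    """Collects the text inside each element that carries an itemprop attribute.

    Simplification: nested items (such as an address with its own itemscope)
    are not handled.
    """

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop name of the element being read, if any

    def handle_starttag(self, tag, attrs):
        self._current = dict(attrs).get("itemprop")

    def handle_data(self, data):
        if self._current and data.strip():
            self.properties[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


collector = ItempropCollector()
collector.feed(INLINE_HTML)
print(collector.properties)
```

Even in this tiny example the markup is woven through the visible HTML, which is why retrofitting it to an existing page template could be so fiddly.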

Just like any other language, learning to ‘speak’ search engine is going to require a lot of investment of time and effort. But, Ward maintains, it is definitely worth our while.

“This effort is a way to truly distinguish the work that you do and the work that our community does on behalf of our customers and clients. It just takes a lot of time.”

He pointed to the example of a search for the latitude and longitude of the Empire State Building, the answer to which is displayed in Google’s Knowledge Graph at the top of the search results page.

The website which provides this information uses Schema.org markup to point Google to where the relevant content is on its page, resulting in the “great user experience” of “one solid answer.”

And best practices around structured data are constantly evolving, making it easier for website owners to incorporate it into their code. Google used to only support Schema markup if it was written inline, insisting that the markup needed to be “visible to human users” as well as search engines.

But it has since reviewed this stance and expanded its support for a type of notation called JSON-LD, which allows structured data to be added in a single script block, typically in the head of the page, instead of inline around each element.

Google’s introduction to structured data on Google Developers now states outright that JSON-LD is the recommended markup format for structured data.
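
As a rough illustration of the JSON-LD approach, the sketch below builds a description of the Empire State Building in Python and wraps it in the script tag that JSON-LD uses. The helper name `to_json_ld_script` is my own, and the coordinates are approximate and for illustration only; `Place` and `GeoCoordinates` are real Schema.org types.

```python
import json


def to_json_ld_script(data: dict) -> str:
    """Wrap a Schema.org description in the script tag used for JSON-LD."""
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Approximate, illustrative coordinates for the Empire State Building.
empire_state = {
    "@context": "https://schema.org",
    "@type": "Place",
    "name": "Empire State Building",
    "geo": {
        "@type": "GeoCoordinates",
        "latitude": 40.7484,
        "longitude": -73.9857,
    },
}

print(to_json_ld_script(empire_state))
```

The whole description lives in one block that can be generated separately from the page template, rather than being threaded through the visible HTML. A production version would also need to escape any `</script>` sequence that might appear inside the data.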

“Schema, its use, and the taxonomies – they’re evolving constantly,” said Ward. “We have to get more involved in this process, as a community. We need to be working with Google, and with Yandex, and Yahoo! and Bing.

“Let’s start banding together to try and get some structure out there.”

A screencap of a Siri voice search asking "How big is the Serengeti?" Siri's answer pertains to the breed of cat, answering "Medium", rather than to the region in Africa.

If you need just one more reason to start incorporating structured data into your website markup, it should be the rise of voice search.

Ward cited statistics from Mary Meeker’s recently released Internet Trends report which show that the volume of Google voice search queries is now 7x what it was in 2010, with 65% of smartphone owners using voice assistants like Siri, Cortana and Google Now.

Users are getting used to being able to ask their voice assistants increasingly specific questions and get a single, definitive answer; but to make this possible, website owners need to add structured markup around their information that will tell the assistant where to look.

“In the end, I want to be able to ask Alexa to email me the logo of the local 7-11, or, ‘Can you tell me if this place is closed or open right now? Do they have any specials right now? What’s the number one item on their menu?’” said Ward.

“All of that data has to be incredibly well-structured in order for us to get the result we’re looking for.”
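
To see why, consider the “closed or open right now?” question. The sketch below, in Python, assumes a business has published its hours in the shape of Schema.org’s `OpeningHoursSpecification` (`dayOfWeek`, `opens`, `closes`). The hours themselves and the `is_open` helper are hypothetical, and the logic ignores time zones, public holidays and hours that run past midnight.

```python
from datetime import datetime, time

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Hypothetical opening hours, shaped like Schema.org's OpeningHoursSpecification.
OPENING_HOURS = [
    {"dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
     "opens": "09:00", "closes": "17:30"},
    {"dayOfWeek": ["Saturday"], "opens": "10:00", "closes": "16:00"},
]


def is_open(hours: list, when: datetime) -> bool:
    """Return True if `when` falls inside any of the listed opening periods."""
    day = DAYS[when.weekday()]  # weekday() returns 0 for Monday
    now = when.time()
    for spec in hours:
        if day in spec["dayOfWeek"]:
            opens = time.fromisoformat(spec["opens"])
            closes = time.fromisoformat(spec["closes"])
            if opens <= now < closes:
                return True
    return False


print(is_open(OPENING_HOURS, datetime.now()))
```

With the hours expressed as data, answering the question is a simple lookup. With the same hours buried in a paragraph of free text, an assistant would have to guess.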
