Beyond Keywords: Understanding Semantic Analysis
Semantics ~ The meaning of a word, phrase, sentence, or text
I spent quite a bit of time thinking about what I could best offer the world of copywriting from the “technical” SEO perspective. At the end of the day, it all comes down to words and the associations they convey. So let’s deal with the single most important concept that comes to mind: semantics.
Going down this road is important because far too often you will run into clients who insist that a given group of keywords be hammered on ad nauseam. This not only leads to poorly constructed content, but often fails to take advantage of how search engines actually evaluate that content.
You need some ammunition to combat this short-sighted approach, so that’s what we’re going to look at today!
No, We’re Not Talking Code
First things first, when we talk about “semantics” in this context, it’s not about the code that also bears the same name. (You know, the mark-up that is part of the world of web development and surfacing content.)
We are, in fact, talking about information retrieval and how search engines perform semantic analysis on content as they crawl and index it.
There are myriad flavours, including some you may or may not have heard of such as:
- Latent Semantic Analysis
- Probabilistic Latent Semantic Analysis
- Hidden Topic Markov Model
- Latent Dirichlet Allocation
- Phrase Based Information Retrieval
Yes, a whole bunch of fancy names to be certain. Feel free to research those, but we’ll avoid the uber-geeky definitions for now. They’re all just variants of natural language processing that search engines may or may not be using. None of this is related to the code-based approaches known as the “semantic web”. This is about words.
Keywords are Short Sighted
Now that we’re past that, let’s get back to the problem we looked at off the top: clients that are addicted to keywords. Sadly, the SEO world has yet to fully move past this. In the modern search world we want to target “phrases” more so than single keywords. One- and two-word searches are rare in comparison with the more complex search tasks performed by the end user. This is reason enough for us to consider using (“long-tail”) keyphrases over keywords.
The next issue that arises is that clients will want to stuff the copy with multiple instances of said keywords and, in an attempt to feed the perceived semantic engine, with synonyms as well. Again, this is short-sighted and doesn’t really embrace the concepts related to today’s semantic search capabilities.
You will need to educate clients to break that habit.
Identifying the Concepts
The good news is that most writers will naturally create the kind of content a search engine wants to dine upon. It is often the copywriter’s client who attempts to drag them in the wrong direction.
Let’s look at this in simplistic terms with my favorite example from over the years…
Consider the search query [jaguar]. It could refer to:
- A big cat
- A car
- A football team
- An operating system
- …etc…
While crafting the content on our page we want to flesh out the concept being expressed with related words, phrases and concepts to build upon the topicality.
Single terms and/or phrases might include:
- Automobiles
- Cars
- Autos
- Vehicle
- Auto
- Car
But these are mere synonyms, so we’d expand on that with other relations which might include:
- Engine
- Garage
- Tires
- Hood
- Spark plug
- Keys
- High Performance
Any guesses which [jaguar] this page is about? Once more, these are single terms; we’d also build out the core concepts with various phrases, as well as related entities.
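To make the [jaguar] example a little more concrete, here is a minimal Python sketch that guesses which sense a page is about by counting related terms. It is only an illustration, not how any search engine actually scores content. The function name guess_sense and the BASKETS dictionary are made up for this example; the car terms come from the lists above, while the cat and football terms are my own illustrative guesses.

```python
import re
from collections import Counter

# Hypothetical baskets of related terms for each sense of [jaguar].
# Only single words are matched here, so phrases such as "spark plug"
# are left out to keep the sketch short.
BASKETS = {
    "car": {"car", "cars", "auto", "autos", "automobiles", "vehicle",
            "engine", "garage", "tires", "hood", "keys"},
    "big cat": {"cat", "predator", "jungle", "prey", "habitat", "species"},
    "football team": {"team", "quarterback", "stadium", "season", "touchdown"},
}

def guess_sense(text):
    """Return the best-matching sense and the raw score for every sense."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {
        sense: sum(words[term] for term in terms)
        for sense, terms in BASKETS.items()
    }
    return max(scores, key=scores.get), scores

page = ("The Jaguar sat in the garage with its hood up "
        "while we checked the engine and tires.")
sense, scores = guess_sense(page)
print(sense, scores)
```

The word “jaguar” never tells us which meaning is intended; the supporting vocabulary around it does.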
In a very simplistic understanding, phrase-based approaches look at top ranking/performing pages for variants of related terms and phrases for scoring purposes. I would recommend reading this post on phrase-based IR (information retrieval) to get a better grip on that stuff.
This ain’t yer daddy’s keyword density myopic approach.
Query Classifications
Another area worth mentioning in combination with these concepts is “query classification” (more here). This looks at user intent (when searching), and it’s something we should be cognizant of when constructing concepts and terms to be included in any piece of content.
They generally break down into:
- Informational (seeking information)
- Transactional (performing an action)
- Navigational (finding a known entity)
While a given piece of content may offer multiple classification states, it is always important to understand the target, from an SEO perspective, when constructing the “semantic baskets” to be used for said piece of content. (Refer to the link above to learn more about that.)
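As a rough illustration of the three classes above, the following Python sketch sorts queries by looking for cue words. The cue lists and the classify_query function are hypothetical and deliberately crude; real query classification uses far richer signals than a handful of words.

```python
# Hypothetical cue words for each intent (assumptions made for this example).
TRANSACTIONAL_CUES = {"buy", "order", "download", "price", "cheap"}
NAVIGATIONAL_CUES = {"login", "homepage", "website", "official"}

def classify_query(query):
    """Label a query as transactional, navigational, or informational."""
    tokens = set(query.lower().split())
    if tokens & TRANSACTIONAL_CUES:
        return "transactional"
    if tokens & NAVIGATIONAL_CUES:
        return "navigational"
    # With no action or destination cue, assume the user wants information.
    return "informational"

for q in ["buy jaguar tires", "jaguar official website", "what do jaguars eat"]:
    print(q, "->", classify_query(q))
```

Knowing which of these a page is meant to serve helps decide which terms belong in its basket.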
Putting it All Together
Ok… so we want to consider phrases and terms that buff out the core targets of a given piece of content. Consider optimal occurrences of related phrases when crafting your semantic baskets for a given piece of content. What words, phrases, entities and concepts would a search engine expect to see on that page? (Don’t ever again think in terms of keyword density!)
Some things to consider, as a content manager/editor and/or as an SEO copywriter:
- While doing the keyword research, use various tools to also create a list of “related phrases”
- Lay out the content program and structural hierarchy
- Map out terms to pages
- Give your writers not only core/secondary target terms, but related phrases as well
- Review and tweak pages prior to launch
- Vary link texts when possible and remember themes/concepts as well as keyword phrases
- Understand the relations of concepts
I like to think in terms of semantic baskets when researching and preparing any important piece of content that will be used for targeting. As stated off the top, in most cases a good copywriter will do most of this naturally.
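For the “review and tweak pages prior to launch” step, a small sketch like the one below could flag which phrases from a basket have not made it into a draft. The basket_coverage function and the sample basket are assumptions made purely for illustration. Note that it only checks whether a phrase is present at all; it is not a density counter.

```python
import re

def basket_coverage(draft, basket):
    """Split basket phrases into those found in the draft and those missing."""
    text = draft.lower()
    found, missing = [], []
    for phrase in basket:
        # Match whole words or whole phrases only, ignoring case.
        pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
        if re.search(pattern, text):
            found.append(phrase)
        else:
            missing.append(phrase)
    return found, missing

basket = ["engine", "spark plug", "high performance", "tires", "garage"]
draft = ("Our high performance engine guide covers every spark plug "
         "and tune-up step.")
found, missing = basket_coverage(draft, basket)
print("found:", found)
print("missing:", missing)
```

A missing phrase is a prompt for the writer to ask whether the concept belongs on the page, not an instruction to wedge the words in.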
One Final Thought…
Search engines love words. They’re what users type into them. Words are used to convey concepts and are constructed into phrases, entities and intent. This is what you want to look at when building out your pages. But we’re moving into a world where it goes beyond that… into voice search.
Back in 2013 Google announced what they called “Hummingbird”. One of the elements within that was called “conversational search”, which treats a search as an ongoing journey through a given search task. This consideration also drags us away from the truly limited concepts around keyword density and simple synonyms. (For more on that, have a read here.)
The point being, copywriters need to stay on top of the ever-evolving world of search. If your clients haven’t? You need to educate them. They’ll thank you for it.
Oh and hey, if you’re feeling real adventurous, you can watch this session on it:
“The difference between the right word and the almost right word is the difference between lightning and the lightning bug.” – Mark Twain
Connect with David on Twitter, LinkedIn, and Google+
An informative article and really important topic. It is all about understanding the intent of the page and how people may want to search it on search engines. Effective keyword analysis with same intent is important plus Latent Semantic Indexing keywords are really crucial. Enjoyed reading this post and learned some of the things which I need to explore in coming days.
Regards
Susan
Glad you enjoyed the article, Susan! Thanks for chiming in :)
I do agree, other than the “LSI” bit. That is actually also an older approach that is far too often cited by those pesky SEO types. I noted a few other approaches that everyone is free to look into. That’s not the important part though, given that they generally work relatively the same.
For our purposes, we merely need to have a grasp on what semantic analysis does.
If you want to get into something among them, the “phrase based indexing and retrieval” concepts are a great place to start. Google had a whole series of patents on these, which may be telling as well. I cited a post from Bill, which is probably a journey worth taking: http://www.seobythesea.com/2008/09/google-phrase-based-indexing-patent-granted/
Again, thanks for the kind words… feel free to hit me up anytime for more reading
So if we drop the keyword density, our rankings won’t suffer? We have been carefully analyzing the competition’s keyword density and emulating the successful ones and it seems to work. It would be nice to drop it since the writing certainly suffers from too many keywords and sounds artificial and downright awkward at times.
Short answer: Yes. ESPECIALLY if your content is suffering. Keyword density hasn’t been a factor for a very long time. :) Instead, write pages that are so fantastic and resource-rich that your target audience (and Google) turn to them again and again.
Good luck!
Google’s organic algo is ‘very sophisticated’, much more than many SEOs think. Creating content that focuses on things, entities, concepts, categories, etc. is much more efficacious than worrying about a competitor’s KW count – thanks David for sharing :-)
Hi Cathie… Dave here
To echo what Heather intimated… indeed the whole “density” thing is a beast from the past. As I was saying in the post, a true copywriter likely already does enough to satisfy what a search engine would look at for semantic relevance. It’s about building relevant semantic baskets with supporting words, phrases and concepts.
It would also look for that same diversity in other graphs such as the link graph, social, demographic, etc…
Are the pages linking into that document also conceptually and semantically related to the topic?
Are the people sharing via social categorized relevant to the topic being shared?
You get the idea…
But, as far as the page you’re crafting, I believe copywriters need to educate clients… that KW density is a myth and that well-researched, thought-out and crafted content is going to be far more effective for a modern-day search engine.
Yup, content is king and kw density shouldn’t interfere with that.
It’s about providing the best experience possible these days. If you have the best content that fully answers the user’s query in the best possible way then you are going to win in the end.
Hey David,
I liked your piece a lot. Great job!
Your examples on how to use LSI keywords are also quite easy to get :-)
Do you use any tools to help you find more LSI keywords? Like LSIgraph for example?
Cheers,
Nick
LSI keywords are what some tool spits out and, from what I am led to understand from discussion with others, it doesn’t do a very good job. I took a similar approach as soon as I knew and understood what Hummingbird does. I saw little to no benefit, as it casts too broad a net, so it is kinda’ like targeting longtail phrases in the recent past. Sorry, hamster phrases are not what I am paid to rank; they take care of themselves if you are targeting the concept correctly.
Thinking in terms of “keywords” is missing the big picture. Basically “keywords” as we knew them are now only useful to identify a metric to quantify success. Beyond that it’s about concepts and putting the “target words” in context so that it is easy for Google to understand the words and classify it to a query. SEO is becoming more about optimizing the SERP much like it was with Universal Search. Often the easiest way to the top is via choosing the right type of content to target the concept by identifying if a query is temporal (QDF -> manipulated by CTR), a potential knowledge panel or answer box (Hummingbird) or none of the above and just plain old Universal Search.
Great read about semantics, Heather! Thanks for posting :)
Thanks for sharing this informative article. This is my first time hearing of LSI keywords. Can you please help with a more comprehensive article on how to use it for keyword search and more?