SEW presents our new weekend column. We have reserved Saturdays to invite people outside of 'the industry' to share their thoughts on our industry. For the next three weeks, sci-fi writer Laszlo Xalieri (a nom de plume) will share his thoughts on the future of search engines. Enjoy.
The world of search marketing is a strange beast. You have in mind designing a free search engine as a racehorse that will take users where they want to go quickly and elegantly, but there are so many competing and downright conflicting goals that the design quickly turns chaotic.
Given the need for your search engine to be armored against deliberate abuse and have enormous capacity to index about a hundred destinations people will never visit for every one that they will, your sleek racehorse ends up looking like something between a rhino and a camel and a platypus. Meanwhile, the end users -- barely a consideration in the design equation because, I remind you, it's free for them to use -- end up lying somewhere on the spectrum between commodity and victim.
Because if search is a business -- one that makes money -- then your end goal is to attract consumers and sell access to them to marketers. There's no other model that can work, really. You might as well give away free lollipops instead of search results.
The only reason search results work better than lollipops is that you are selling to the marketers some knowledge about what the users are clicking on -- which, after a fashion, is selling them knowledge about the chinks in your rhino's armor. Because nothing works better for a marketer than to have their product show up as a search result when it's not quite (but maybe close enough to) whatever the searcher was seeking. However, nothing will repel a user more than repeatedly being served something other than what they were looking for.
This is all remedial crap for readers at this site. The only reason I bring it up is to bring back to mind all the design flaws that have to be incorporated into the beast in order for it to be any kind of a money-maker.
Google, still number one among both marketers and users last time I checked, is still a bit of a freak show, beast-wise. If you don't believe me, now is a good time to go check out the sideshow exhibition at http://www.autocompleteme.com.
At some point you have to consider what it is that the users need. In fact, if search engines were purely charitable organizations and never needed to make money, you could achieve the largest user base ever by simply figuring out what people are looking for and giving it to them quickly and with no fuss. To maintain a large commercial base of users to sell down the river to your marketers, you still need to do this at least a little bit. That, or give them lollipops.
Google's autocomplete feature, useful as hell and yet still also worthy of all the hilarity it generates, shows the severely flawed yet somehow still functional strategy employed to try to satisfy the users.
Here is the first problem: the user. Somehow you must find out what the hell it is the user wants so you can give it to him/her/it. Yet, perhaps using YouTube comments as a basis, you know for a fact that users are inarticulate evil-minded bastards with less than even a nodding acquaintance with their own native language. Or they're toddlers, so, yeah, same thing.
Or, to put it a little more kindly, they could use a little help. Autocomplete helps by using the kind-hearted grade school teacher technique of turning a fill-in-the-blank question into a multiple-choice question.
There is one flaw in the concept here: the choices are presumably drawn from search terms other people have used historically, ranked by frequency. That's awesome if the seeker is actually looking for something that large numbers of other people have already looked for -- and is immune to the distraction of items showing up in that list along the lines of "My favorite color is ham."
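To make the multiple-choice trick concrete, here is a minimal sketch of frequency-ranked suggestions. It assumes nothing about how Google actually does it: the function name `suggest` and the query log are invented for illustration.

```python
from collections import Counter


def suggest(prefix, query_log, limit=3):
    """Return the most frequent past queries that start with the prefix."""
    counts = Counter(q.strip().lower() for q in query_log)
    prefix = prefix.strip().lower()
    matches = [(q, n) for q, n in counts.items() if q.startswith(prefix)]
    # Most frequent first; ties are broken alphabetically for stable output.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [q for q, _ in matches[:limit]]


# Hypothetical query log, made up for this example.
query_log = [
    "my favorite color is blue",
    "my favorite color is blue",
    "my favorite color is ham",
    "my favorite color is ham",
    "my favorite color is ham",
    "my favorite band broke up",
]

print(suggest("my favorite color", query_log))
# ['my favorite color is ham', 'my favorite color is blue']
```

Whatever the crowd typed most often wins the top slot, ham included. A seeker whose question nobody has asked before gets no help at all.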
Or is that just a lollipop?
This is really only an amplification of Google's principal strategic flaw: indexing pages in the first place by textual content rather than semantic content, thereby making the word choice more important than the meaning. Google's indexing is driven by keywords and Markov chains (short chains of words, typically three to five words in length, indexed as a group or phrase rather than individually). If the wording of the search doesn't contain a keyword from the page and/or an identifiable hit on one or more Markov chains indexed on that page, that possible page result is overlooked.
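A toy version of that kind of index shows how a page gets overlooked. This is a sketch under stated assumptions, not a description of Google's real pipeline. The "chains" here are plain runs of consecutive words (more commonly called word n-grams), and the stopword list, the function names, and the sample page are all hypothetical.

```python
import re

# Hypothetical stopword list so words like "for" don't count as keywords.
STOPWORDS = {"a", "an", "the", "of", "for", "to", "in", "on", "and", "is"}


def tokens(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def chains(words, n=3):
    """Every run of n consecutive words, as a set of tuples."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def index_page(text, n=3):
    """Index a page by its keywords and its word chains."""
    words = tokens(text)
    return {"keywords": set(words) - STOPWORDS, "chains": chains(words, n)}


def score(query, page, n=3):
    """Return (keyword hits, chain hits); (0, 0) means the page is overlooked."""
    words = tokens(query)
    keyword_hits = len((set(words) - STOPWORDS) & page["keywords"])
    chain_hits = len(chains(words, n) & page["chains"])
    return keyword_hits, chain_hits


page = index_page("Cheap used cars for sale in Atlanta")

print(score("used cars for sale", page))                  # (3, 2)
print(score("inexpensive secondhand automobiles", page))  # (0, 0)
```

The second query means nearly the same thing as the first, yet it shares no words with the page, so a purely textual index never considers it.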
Google could seriously improve service here by indexing a few online thesauri and using their Markov chain and keyword data to narrow down which of multiple possible meanings is intended, not just possible spellings, thereby returning more relevant results and wedging a foot in the door of the upcoming semantic web.
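As a rough illustration of the thesaurus idea, the sketch below widens a query with near-synonyms before matching it against a page's words. The `THESAURUS` table, the function names, and the sample page are all made up. It covers only the synonym half of the suggestion, not the harder job of using word chains to pick among several possible meanings.

```python
# Hypothetical thesaurus: each word maps to a set of near-synonyms.
THESAURUS = {
    "inexpensive": {"cheap", "affordable"},
    "secondhand": {"used"},
    "automobiles": {"cars", "autos"},
}


def expand(query_words, thesaurus):
    """Add every listed synonym of every query word to the query."""
    expanded = set(query_words)
    for word in query_words:
        expanded |= thesaurus.get(word, set())
    return expanded


def keyword_hits(query, page_text, thesaurus=None):
    """Count the words shared by the (optionally expanded) query and the page."""
    query_words = set(query.lower().split())
    page_words = set(page_text.lower().split())
    if thesaurus is not None:
        query_words = expand(query_words, thesaurus)
    return len(query_words & page_words)


page_text = "cheap used cars for sale in atlanta"
query = "inexpensive secondhand automobiles"

print(keyword_hits(query, page_text))             # 0
print(keyword_hits(query, page_text, THESAURUS))  # 3
```

The same page goes from invisible to a three-keyword hit without the searcher changing a single word.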
Additionally, web content managers could get an assist in keyword selection from semantic analysis of the terms used to reach their pages, rather than depending, frankly, on whether people can correctly spell the keywords they're already using. Nothing will help you if the pages being indexed contain spelling or usage or grammar errors, but, speaking as a writer first and a technologist second, I can get behind any Darwinian approach that strengthens the ability to communicate as a survival strategy.
Anyway, Markov chains are a brilliant analysis tool for determining a search hit and page relevance, but they're at least one step away from clinching the deal semantically. Google, et al., please keep moving. You're almost there.
There are a huge number of approaches that are right around the corner for continuing the process of finding out what it is the user wants and how to give it to them, and I intend to continue this analysis in the next two parts of this article.