SEW presents the weekend column. We have reserved Saturdays to invite people outside of ‘the industry’ to share their thoughts on our industry. For three weeks, sci-fi writer Laszlo Xalieri (a nom de plume) has been sharing his opinion on the future of search engines. Enjoy his final installment. (Read Act 1 here and Act 2 here.)
Two weeks ago I started this series by discussing the basics of why search engines want to help users in the first place — so they can sell advertisers and marketers access to those users. I also discussed a bit of how search engines make guesses, some good and some bad, about what you’re looking for, and how those guesses could be better if the search engine understood what you mean semantically instead of just working with the actual words and characters you type. What you type, after all, is frequently rife with the very reasons for conducting the search in the first place: simple ignorance, or perhaps a failure of memory. (This is the same problem you face when looking up how to spell a word: you need a pretty good idea of the spelling to begin with, or you don’t even know where to start looking.)
Last week I talked some about the tradeoffs between search engines knowing who someone is well enough to know how to give them the results they’re looking for versus the desire for the researcher to protect their online identity from predators. The more you know about someone, the more likely you are to be able to guess, when they input “car bomb” into a search query, whether they’re looking for a terrorist act or a drink recipe, for instance. However, if the system knows to deliver results for terrorist acts, your name might go on a list somewhere.
What Are We Thinking?
The reason we’re willing to flirt with opening ourselves up to identity predators is that our answer providers are more effective if they know who we are. If they know who we are, they can make better guesses about what we mean when we type something. But the true thing the search engines want to know in order to give us the best results isn’t who we are.
Knowing who we are is a second-best option. Instead, they need to know what we’re thinking.
There’s always the fMRI option — the cap that reads blood and oxygen flow in the brain. It’s getting close, but it’s still a little coarse. Such caps are approaching usability as lie detectors (and analogous devices are being designed for use as video game controllers).
Right now the best they could do would be to show us a stream of images and base an idea of what we’re interested in on how we respond to what we see — assuming the system can tell we haven’t been distracted by someone pretty walking into the room. In any case, there’d probably be a natural bias toward porn with that approach. Some people would be happy.
Or — our search providers could read everything we read and everything we write and, from knowing our input and output, could “read-ahead cache” what we might be likely to ask about or ask for clarification on. That could be problematic, because not everyone has a blog, or reads blogs, or has a presence on Facebook or Twitter.
75 Million Twitter Neurons
But for the growing number of people who use networks like Twitter (Twitter’s population estimated by Wolfram|Alpha as of January 2010: 75 million) — and are willing to hand over their Twitter name to a search engine — a huge amount of detail could be collected that could assist search engines with providing relevant responses. And Twitter streams are unlikely to store a lot of data that could be used for identity theft.
See, Twitter mimics the structure of a wad of neural tissue. Signals can be ignored for being “below threshold”, transmitted to a few select synapses or broadcast to all, or responded to with an inhibitory message on one or more channels. Certain channels are wired to the equivalent of external senses (journalism/news feeds, individual content generators, etc.) and those channels tend to have tons of followers who redistribute the incoming signal to their various networks based on merit and relevance.
This is a bit of an upgrade on the standard neural model, because the signal (largely) carries its meaning within itself, rather than the meaning having to be teased out after the fact based on which networks got activated. You can certainly get a lot of work done on a brain with 75 million neurons.
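To make the metaphor concrete, here is a toy sketch of signal propagation through such a network. The account names, thresholds, and per-hop attenuation factor are all illustrative assumptions, not anything Twitter actually implements:

```python
# Toy model of the "Twitter as neural tissue" metaphor: a message spreads
# from a "sensory input" (a news feed) through accounts that retransmit it
# only if it clears their personal threshold.
from dataclasses import dataclass, field

@dataclass
class Account:
    name: str
    threshold: float                       # signals weaker than this are ignored
    followers: list = field(default_factory=list)

    def receive(self, strength, message, seen):
        """Retransmit (retweet) a message to followers if it clears the threshold."""
        if self.name in seen or strength < self.threshold:
            return                         # below threshold or already fired: signal dies here
        seen.add(self.name)
        for f in self.followers:
            f.receive(strength * 0.9, message, seen)   # each hop attenuates a little

# A tiny network: one news feed wired to two hubs, one hub wired to a reader.
news = Account("news_feed", threshold=0.0)
hub_a = Account("hub_a", threshold=0.5)
hub_b = Account("hub_b", threshold=0.95)   # harder to excite
reader = Account("reader", threshold=0.3)
news.followers = [hub_a, hub_b]
hub_a.followers = [reader]

reached = set()
news.receive(1.0, "breaking story", reached)
print(sorted(reached))                     # hub_b's high threshold keeps it silent
```

The high-threshold account stays quiet, the way an inhibited or unimpressed neuron does, while the other hub relays the signal onward with a little attenuation at each hop.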
If the metaphor is holding, you should see Twitter stratifying into specialized sections like areas of the brain, with deep links from section to section, and all of them heavily wired into “sensory inputs” that would work much like an optic nerve. Also, you might begin to notice certain hints of diseases analogous to those specific to neural tissues and their functions.
If you really want to know how to give someone what they want, information-wise, all you need to do is determine their probable position in the Twitter networks (easier if they actually participate, since then all you have to do is get their name, read what they tweet, and read the tweet-streams they read), and then prioritize your standard results by recency and relevance to known interests, triggers, and thresholds. That analysis, applied in bulk, should be able to isolate the individuals who serve as information sources and assign them reputation scores based on the excitatory or inhibitory responses of their followers.
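A minimal sketch of that reputation bookkeeping might look like the following. The response labels and the plus-one/minus-one weights are invented for illustration; a real system would have to infer "excitatory" versus "inhibitory" from retweets, replies, and corrections:

```python
# Hypothetical reputation scoring: a source gains for excitatory responses
# (retweets, agreement) and loses for inhibitory ones (corrections, pushback).
from collections import defaultdict

def reputation_scores(responses):
    """responses: iterable of (source, kind) pairs, where kind is
    'excitatory' or 'inhibitory'. Returns a score per source."""
    scores = defaultdict(float)
    for source, kind in responses:
        scores[source] += 1.0 if kind == "excitatory" else -1.0
    return dict(scores)

observed = [
    ("alice", "excitatory"), ("alice", "excitatory"),
    ("alice", "inhibitory"), ("bob", "inhibitory"),
]
print(reputation_scores(observed))   # {'alice': 1.0, 'bob': -1.0}
```

Scores like these could then feed back into result ranking, boosting items that high-reputation sources have amplified.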
If you’re a search engine, this also allows you to charge advertisers more to expose their ads to high-reputation individuals and information sources — or to target those individuals on the sly for a little subtle product-placement.
As for Actual Prediction…
Mosey over to the Foresight Exchange and take a look. There you’ll find a wonderful little game I’ve been playing for years, whereby someone registers a prediction, people bet for or against the prediction using fake money that pretty much doubles as a reputation score, and then, as events occur or fail to occur by the set deadline, bets pay out.
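The payout mechanics are simple enough to sketch. This follows the general shape of play-money claim markets like the Foresight Exchange, where a claim trades between 0 and 100 points and winning shares pay out at 100; the specific players, prices, and share counts below are invented:

```python
# Illustrative settlement of a play-money prediction claim: YES shares pay
# 100 points if the event occurs by the deadline, 0 otherwise, so a player's
# profit is (payout - purchase price) per share.

def settle(positions, occurred):
    """positions: dict of player -> (shares, price_paid_per_share).
    Returns each player's profit in play money once the claim resolves."""
    payout = 100 if occurred else 0
    return {p: shares * (payout - price) for p, (shares, price) in positions.items()}

bets = {"early_bird": (10, 30), "latecomer": (5, 70)}
print(settle(bets, occurred=True))    # {'early_bird': 700, 'latecomer': 150}
print(settle(bets, occurred=False))   # everyone who bought YES loses their stake
```

The player who bought in early at a low price, i.e., before the crowd agreed the event was likely, earns the most when proven right, which is exactly why the standings work as a reputation score.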
At one point the Pentagon wanted to sponsor its own futures market run on real money — but apparently a couple of experts questioned the wisdom of allowing people to wager large amounts of real money on whether some prominent building somewhere would be the subject of a bombing, and, for once, wiser heads prevailed.
A model “futures market” could be built by reading archived tweets (now part of the U.S. Library of Congress archives), scanning for predictive statements, and using that information to determine who your best prognosticators, market or otherwise, might be. “Wagers” could be based on embedded emphasis, reputation and accuracy could be tracked, and the performance of these individuals could be used to scale the importance and rank of results for items they’ve discussed on Twitter.
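One hedged sketch of that accuracy tracking: treat each resolved prediction as a wager sized by its emphasis, and turn the fraction of staked points that paid off into a weight for ranking that person's future tips. Everything here, from the neutral default to the weighting scheme, is an assumption for illustration:

```python
# Illustrative prognosticator scoring: each prediction carries an
# emphasis-based "wager", and the fraction of staked points that came true
# becomes a rank multiplier for that account's future claims.

def accuracy_weight(predictions):
    """predictions: list of (wager, came_true) pairs for one account.
    Returns a weight in [0, 1] usable to scale that account's result rank."""
    staked = sum(w for w, _ in predictions)
    if staked == 0:
        return 0.5                 # no track record yet: neutral weight
    won = sum(w for w, hit in predictions if hit)
    return won / staked

history = [(3, True), (1, False), (2, True)]   # wagers scaled by emphasis
print(accuracy_weight(history))                # 5/6, about 0.83
```

Emphatic predictions that fail would drag a score down fastest, which discourages confident bluster in exactly the way a real-money market would.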
Twitter streams are already being used to predict how movies will perform at opening and in their first few weeks, what sales are likely to look like for new products, and public opinion of certain brands — now is a perfect time to check on how BP is doing, for instance — along with other miscellaneous market data.
So why not use Twitter’s swelling brain to predict what people might be searching for? Especially since access to the public stream is free to the public.