For a company that serves more than 300 million users, LinkedIn's search functionality was overdue for an upgrade. LinkedIn has traditionally taken a reactive approach to search and hopes to change that with Galene, its new search architecture.
Prior to building Galene, LinkedIn relied on a search stack centered on an open-source library called Lucene. According to LinkedIn, this solution was used to "build a search index, searching the index for matching entities, and determining the importance of these entities through relevance scores." The search index had two primary components:
- Inverted Index: Mapping search terms to a list of entities that contain those terms.
- Forward Index: Mapping entities to metadata about those entities.
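To make the two components concrete, here is a minimal Python sketch of an inverted index and a forward index working together. It is an illustration only and does not use Lucene or reflect LinkedIn's code; the `members` records, the field names, and the `build_indexes` and `search` helpers are all assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical member records; the ids and field names are illustrative only.
members = {
    1: {"name": "Ada Lovelace", "title": "software engineer"},
    2: {"name": "Grace Hopper", "title": "compiler engineer"},
    3: {"name": "Alan Turing", "title": "research scientist"},
}


def build_indexes(entities):
    """Build a toy inverted index and forward index from entity records."""
    inverted = defaultdict(set)  # search term -> ids of entities containing it
    forward = {}                 # entity id -> metadata about that entity
    for entity_id, fields in entities.items():
        forward[entity_id] = fields
        for value in fields.values():
            for term in value.lower().split():
                inverted[term].add(entity_id)
    return inverted, forward


def search(query, inverted, forward):
    """Return names of entities that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    # Intersect the entity sets for each term (inverted index lookup) ...
    ids = set.intersection(*(inverted.get(t, set()) for t in terms))
    # ... then fetch metadata for the matches (forward index lookup).
    return [forward[i]["name"] for i in sorted(ids)]


inverted, forward = build_indexes(members)
print(search("engineer", inverted, forward))  # ['Ada Lovelace', 'Grace Hopper']
```

The inverted index answers "which entities match?" and the forward index answers "what do we know about each match?", which is the division of labor the two bullets describe.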
Unfortunately, the index was too large to fit on a single computer. To work around that problem, LinkedIn split the index into "shards," each of which contained a portion of the index.
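As a rough sketch of how sharding works, the Python below splits a toy index across several shards and answers a query by asking every shard and merging the results. The modulo partitioning rule and the `shard_for`, `build_shards`, and `search_shards` names are assumptions for illustration; the article does not say how LinkedIn assigned entities to shards.

```python
def shard_for(entity_id, num_shards):
    # Assumption: simple modulo partitioning by entity id.
    return entity_id % num_shards


def build_shards(postings, num_shards):
    """Split a term -> entity-ids mapping into num_shards partial indexes."""
    shards = [dict() for _ in range(num_shards)]
    for term, ids in postings.items():
        for entity_id in ids:
            shard = shards[shard_for(entity_id, num_shards)]
            shard.setdefault(term, set()).add(entity_id)
    return shards


def search_shards(term, shards):
    """Query every shard for the term and merge the partial results."""
    hits = set()
    for shard in shards:
        hits |= shard.get(term, set())
    return sorted(hits)


# Hypothetical postings for two search terms.
postings = {"engineer": {1, 2, 5, 8}, "scientist": {3, 8}}
shards = build_shards(postings, num_shards=3)
print(search_shards("engineer", shards))  # [1, 2, 5, 8]
```

Each shard holds only its own slice of the index, so no single machine needs to store everything, but every query has to visit all shards.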
Beyond the index structure itself, LinkedIn also had to build processes to work around the system's limited ability to apply live updates, a limitation that proved very costly.
Pre-Galene Pain Points
LinkedIn's previous system was not proving to be scalable. The top search-related pain points the organization faced were:
- Difficulty rebuilding a complete index.
- Live updates were not efficient.
- Scoring was inflexible.
- The system did not support all necessary search requirements.
- Management of small open-source components was difficult.
Out With the Old & In With Galene
Below is a diagram from LinkedIn that depicts the new Galene search stack.
The End User Experience
Now that we've covered a small portion of the technical reasons LinkedIn upgraded their architecture, it's important to discuss what changes the end user will experience.
- Access: Users are no longer limited to searching just their first and second-degree connections. They now have access to all LinkedIn members.
- Relevance: The algorithm used for Instant Member Search now incorporates relevance features that were impossible to include in earlier versions, such as:
  - Offline static rank computation.
  - Personalization based on factors such as connection degree.
  - Approximate name matching.
- Speed: Instant Search is "more than twice as fast as the previous implementation" and uses "about a third of the hardware," according to LinkedIn.
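To show how the three relevance signals listed above might be combined, here is a hedged Python sketch. It is not LinkedIn's scoring function: the weights, the `DEGREE_BOOST` table, the candidate records, and the use of the standard library's `difflib.SequenceMatcher` as a stand-in for approximate name matching are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical candidates. static_rank stands in for a score computed offline;
# degree is the searcher's connection degree to each member.
candidates = [
    {"name": "Jon Smith", "static_rank": 0.4, "degree": 1},
    {"name": "John Smith", "static_rank": 0.9, "degree": 3},
    {"name": "Joan Smythe", "static_rank": 0.6, "degree": 2},
]

# Assumed personalization boost: closer connections score higher.
DEGREE_BOOST = {1: 1.0, 2: 0.5, 3: 0.1}


def name_similarity(query, name):
    """Approximate name match as a ratio in [0, 1], case-insensitive."""
    return SequenceMatcher(None, query.lower(), name.lower()).ratio()


def score(query, candidate):
    """Blend the three signals using made-up weights."""
    return (
        0.5 * name_similarity(query, candidate["name"])
        + 0.3 * candidate["static_rank"]
        + 0.2 * DEGREE_BOOST.get(candidate["degree"], 0.0)
    )


def rank(query, members):
    """Order candidates from highest to lowest blended score."""
    return sorted(members, key=lambda c: score(query, c), reverse=True)


for member in rank("John Smith", candidates):
    print(member["name"], round(score("John Smith", member), 3))
```

The point of the sketch is that an inexact name match from a close connection can compete with an exact match from a distant one. How the real system weighs these factors is not described in the article.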
Throw in Your Two Cents
As a digital marketer, do you think LinkedIn's upgrades will change your approach to marketing on the social network? What impact do you think they will have on your strategy now that you have access to more users?