Google has launched a new Google Scholar search service, which lets users search for scholarly literature located across the web.
"The goal is to allow and enable users to search over scholarly content," said Anurag Acharya, a Google engineer leading the project.
Much of this material has been added to Google over the past few months. However, the new service lets searchers restrict a search to just the academic material.
Opening Up Invisible Content
Google has worked with publishers to gain access to some material that wouldn't ordinarily be accessible to search spiders, because it is locked behind subscription barriers.
For example, in a search for "search engines," the current fifth result is a paper called "ProFusion: Intelligent fusion from multiple, distributed search engines." That paper is only available to those with password access to material within the Journal Of Universal Computer Science, which comes with a subscription.
Normally, such material would never get spidered by search engines such as Google, so the material would be "invisible" to web searchers. But Google's made arrangements with publishers to get into these password areas.
The advantage is that searchers suddenly have a much better ability to locate material that may be of interest. However, it also means that actually reading the full text of such documents -- which Google does index -- will only be possible for those who have relationships with the publishing sites. Google says, by the way, that it does not earn money from any new subscriptions generated between searchers and publishers.
This system may lead to problems for some searchers. In the example above, not only could I NOT read the paper, as I didn't have a subscription, but I also could not read even an abstract. Instead, a password-prompt continued to appear, even when I cancelled it, making it extremely difficult to finally close the window (and that's why I haven't linked to the actual paper, to save other people the problem).
This situation is probably unusual, however. One of Google's requirements for inclusion in Google Scholar is that publishers at least show abstracts to searchers.
The special access for publishers flies in the face of Google's anti-cloaking policy. Google is being shown material that regular users wouldn't normally see, which matches its own definition of cloaking. This is a GOOD thing for searchers, but the company needs to amend its cloaking policy so as not to be hypocritical.
Indeed, that's long overdue. This has been a problem since I first reported on a similar issue earlier this year. A sidebar piece for Search Engine Watch members, Google & The Approved Cloaking Problem, looks at the latest case and suggests some fixes for Google, including finally moving forward with formalizing such programs for ALL publishers.
Citation Extraction & Analysis
When spidering the content, Google has worked to understand who the authors of the papers are, as well as the formal titles of the papers and other documents that cite the material. These citations are a key part of the special ranking algorithm used by Google for Google Scholar.
Google says the citation extraction allows it to see the connections between papers even if these connections are not made through links. As a result, it can use citation analysis to try to put the best papers at the top of the results.
Next to each paper listed is a "Cited by" link. Clicking on this link shows the citation analysis in action -- all the pages pointing at the original one listed, through textual citations, will be shown. For example, the paper "A technique for measuring the relative size and overlap of public web search engines" has 135 citations that Google knows about through Google Scholar.
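To make the idea concrete, here is a minimal sketch of citation analysis over textual references. Google has not disclosed how Google Scholar actually extracts or weighs citations, so everything here is an assumption made for illustration: the sample papers, the title-normalization rule, and the function names are all hypothetical, and ranking purely by citation count is a simplification.

```python
import re
from collections import defaultdict

# Hypothetical sample data: each paper lists the titles it cites as plain text.
SAMPLE_PAPERS = [
    {"id": "a", "title": "Paper A: Basics", "references": []},
    {"id": "b", "title": "Paper B: Follow-up", "references": ["paper a - basics"]},
    {"id": "c", "title": "Paper C: Survey",
     "references": ["Paper A: Basics", "Paper B: Follow-Up"]},
]

def normalize(title):
    """Lowercase and collapse punctuation so differently typed citations match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def build_cited_by(papers):
    """Map each normalized cited title to the set of paper ids that cite it.

    No hyperlinks are needed: the connection comes from the reference text.
    """
    cited_by = defaultdict(set)
    for paper in papers:
        own_title = normalize(paper["title"])
        for reference in paper["references"]:
            key = normalize(reference)
            if key != own_title:  # ignore a paper citing itself
                cited_by[key].add(paper["id"])
    return cited_by

def rank_by_citations(papers, cited_by):
    """Order papers so the most-cited ones come first (assumed ranking rule)."""
    def citation_count(paper):
        return len(cited_by.get(normalize(paper["title"]), ()))
    return sorted(papers, key=citation_count, reverse=True)

cited_by = build_cited_by(SAMPLE_PAPERS)
for paper in rank_by_citations(SAMPLE_PAPERS, cited_by):
    count = len(cited_by.get(normalize(paper["title"]), ()))
    print(f'{paper["title"]} (Cited by {count})')
```

The `cited_by` mapping plays the role of the "Cited by" link: looking up a title returns every paper that references it in text. A real system would need far more robust matching than this simple normalization, since citations vary widely in format.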
The same paper may be hosted in more than one place, of course. In these instances, Google picks what it believes is the best version and provides links to other versions after the paper's description.
In some cases, the material is not actually online. Google may know about a paper only through references it has seen in other papers. In these cases, Library Search and Web Search links will appear next to the paper or book's title.
Library Search provides a means to see if there's a local library near you that carries the paper or book, through the Open WorldCat program. This is the same system that was recently integrated into a special version of the Yahoo Toolbar launched earlier this week.
Web Search generates a Google web search designed to help find more information about the material across the entire web.
More about the program can be found through Google Scholar's About page.
Driving New Traffic To Libraries?
On ResourceShelf, Shirl Kennedy and Gary Price have coauthored another look at the program that's well worth a read: Big News: "Google Scholar" is Born. They love the program, despite it containing some material they consider not quite scholarly. They also mention other citation tools and how much of this material is already available to the public, if only the public knew to go to libraries.
That's probably another key feature of Google Scholar. Sure, the material may have already been available, but if the public didn't realize this, it remained invisible. More and more, the public turns to search engines to access all types of information. But this move, ironically, may raise awareness and use of libraries as an important offline research resource.
That's even more likely given that we can expect Google's competition, such as Yahoo, to follow suit. Yahoo already has long-standing ties to gather material from academic publishers through its Content Acquisition Program. What Yahoo doesn't currently provide is a specialized way to search through just this material. It's quite likely, in my view, that this will come.
The Google & The Approved Cloaking Problem sidebar to this story for Search Engine Watch members looks at how programs like Google Scholar help searchers yet still violate Google's own policies on cloaking and leave the service open to accusations of hypocrisy. It also looks at how it's long overdue for Google to extend more support to regular web publishers.