Author
[email protected] [email protected]
Date published
September 14, 2005
Chris covered the launch of Google’s new blog search in today’s SearchDay
article,
Google Launches Industrial Strength Blog Search. In this post, I want to add
some of my own thoughts. I’ll also be working up a rundown on reaction from
others, and Gary may be adding his own thoughts as a postscript here or as a
separate post. Top line thoughts? It’s not spam-free. I wish it were "full text"
blog search to better represent the blog world. It’s got a short memory, not
going back past March 2005. But the backlink info looks good, certainly better
than you’ll get on Google itself.
- Chris mentioned this in his article, but I think it’s worth stressing:
technically, this is FEED SEARCH. You are only searching through the
feeds that Google has found. Some blogs don’t have feeds. Some feeds don’t come
from blogs. Google understands these issues and figures that, down the line, it
may have to make changes to turn this into a true blog search, if that’s what’s
intended.
- By default, sorting is by RELEVANCE, not DATE. If you are looking for the
latest posts on a particular topic, use the "Sort by date" link in the upper
right-hand corner. Unfortunately, you can’t save this as a preference.
However…
- As Chris noted, you can have results constantly sent to you via a feed
alert. The feed links are at the bottom of each page. So if you wanted to know
the latest blogs mentioning Google, you’d search for that word, sort by date,
then subscribe.
- Want to know the latest backlinks to your blog? Use the link: command,
such as link:blog.searchenginewatch.com,
sort by date, then subscribe to a feed of that search. That shows all links to
your domain, to any page anywhere on your blog and will send you the newest
ones.
- Want to know the latest backlinks to a particular post? Use the full page
address, such as
link:blog.searchenginewatch.com/blog/050831-091033. That brings back
matches linking just to that page.
- Don’t want to learn these commands? Just type a full URL, with or
without the http:// prefix, into the Blogger
version of Google Blog Search. It will automatically do the right thing
there and show backlinks.
- As Chris notes, Google says that for blog search backlinks, it’s not
suppressing any of the links it knows about. To spell that out, here are some
figures to contemplate:
Notice, a search across the ENTIRE web on Google brings back fewer
backlinks than across the much more limited feed database on Google. Why? The
third line shows the answer. A search on the ENTIRE web on MSN Search web
search brings back more results as well, despite MSN supposedly having a
slightly (very slightly) smaller database of pages based on self-reported
figures. Google simply doesn’t report all the backlinks it knows about for web
search, something it has said time and again when pressed on the issue and a
fact well known to many experienced search marketers.
- It’s not FULL TEXT blog search. Huh? If you post to a blog, you might not
send out the entire text of your post in a feed. We don’t, for instance. Our
reason is that we don’t want everyone assuming they can reprint our material.
Jason Calacanis of Weblogs has written of similar issues despite copyright
warnings in his full-text feed. But Google’s only currently searching what’s in
the feed, meaning that it actually may be ignorant of a huge amount of blog
content that’s not pushed in a feed. That produces some skewing, as I found with
PubSub back in June.
Ideally, I’d like to see Google do what Technorati does and grab the actual
full-text of the post, rather than depend just on the feed. For its part,
Google says this is something it’s pondering.
- The site: command is said to work, but I didn’t find that to be the case.
site:scripting.com came back with no matches, for example. But the new
blogurl:scripting.com seems to do the trick. However, compare that to
site:scripting.com on Google web search. Blog search gets about 414 matches,
while web search of that blog brings back 344,000 matches. It’s a huge
difference and shows the greater blog coverage Google web search actually gives.
The advanced search page highlights the issue. You’ll see that the earliest
date you can search back to is March 1, 2005. In other words, the feed database
has a much shorter history range than the web database, something that full
text indexing would solve, though you’d lose the ability to more accurately do
things like author and date range searching if you’re taking scraped data,
rather than delimited data in a feed.
- Spam clearly hasn’t been eliminated. A search for
google blog search brings up a series of "Related Blogs" that are all
spammy in nature to me. However, the main results below look fairly clean. But
for a query on
google, spam is back with a vengeance. The first result (on Google’s
Blogger service) tells me:
Resources To Acquire Stanley Power Tool Or Draper Power Tool On The
Internet Get your stanley power tool on the world wide web. The first thing
I thought of is how easy it is to get stanley power tool online. Google has
listings for many stanley power tool sites. There are lots of stanley power
tool that will help you.
In fact, the first four results when sorted by date are all similar in
terms of spammy, nonsensical copy. Doorway page spam on Google — it is 1999!
What we need is either better spam filtering or some type of super "sort by
date and relevancy" feature. PubSub’s got a feature that’s sort of like this,
but when I last looked, I still found spam and irrelevant content getting
through.
- Freshness or comprehensiveness seems to be an issue. For that query on
google, I get the latest post as being 40 minutes ago, with the one after
that an hour ago, then the next one two hours ago. That’s it? Over the past
two hours, there have only been three blog posts about Google?
While I don’t want all those poor selections where just anything mentioning
Google may come up, I also want to see the latest. What we need is either
better spam filtering or some type of super "sort by date and relevancy"
feature. PubSub’s got a feature that’s sort of like this, but when I last
looked, I still found spam and irrelevant content getting through.
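For anyone who wants to script the backlink and blogurl: queries described above, here is a minimal sketch in Python. It only builds the text you would type into the search box. It does not contact Google, and the function names (strip_scheme, backlink_query, blogurl_query) are my own hypothetical helpers, not anything Google provides. Dropping the http:// prefix is an assumption based on the examples in this post, which are all written without it.

```python
def strip_scheme(target: str) -> str:
    """Drop a leading http:// or https:// and any trailing slash."""
    target = target.strip()
    for prefix in ("http://", "https://"):
        if target.lower().startswith(prefix):
            target = target[len(prefix):]
            break
    return target.rstrip("/")


def backlink_query(target: str) -> str:
    """Build a link: query for a whole blog or for a single post.

    Pass a bare domain to match links to any page on the blog,
    or a full page address to match links to just that post.
    """
    return "link:" + strip_scheme(target)


def blogurl_query(blog: str) -> str:
    """Build a blogurl: query, which restricts results to one blog."""
    return "blogurl:" + strip_scheme(blog)


if __name__ == "__main__":
    # Hypothetical usage with the examples from this post.
    print(backlink_query("http://blog.searchenginewatch.com/"))
    print(backlink_query("blog.searchenginewatch.com/blog/050831-091033"))
    print(blogurl_query("scripting.com"))
```

Sorting by date and subscribing to the resulting feed remain manual steps on the results page, since this post doesn’t document any URL parameters for them.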
Want to discuss or comment? Visit our forum thread,
Google Blog Search Launched.