It seems difficult to imagine that there is anyone on the web who hasn’t heard about the continued rise of blogging these days. Google’s recent acquisition of a major blog tool vendor made it a certainty that this popular form of self publishing gained renewed attention as something special.
Will we see Google move to offer blog-specific searching, in the way that we can currently search through newsgroup discussions on Google Groups? My sidebar article, Loving Each Other More: Search Engines & Blogs, looks at this possibility and the reasons why it would be good for Google or another major search player to do so.
Even if the major search engines fail to make blog searching a reality, there are already a variety of ways you can do it now. Indeed, last month at least two new blog searching services were launched, Feedster and RSS Search.
To be accurate, both are actually RSS search engines. They accept content not by crawling the web but instead by receiving RSS feeds, a mechanism for site owners to allow others to easily learn about new articles and content they’ve posted.
Bloggers using popular blogging tools such as Radio UserLand, Movable Type or Blogger Pro have these feeds automatically created for their web sites. That’s why RSS search engines are almost like blogging search engines.
Almost, but not quite. Any web site can distribute its content via RSS, so RSS search engines are more than just blog search engines. In addition, not all blogs distribute their posts via RSS, so an RSS search engine may miss part of the blogosphere, or universe of weblogs, that exists.
In this article, we look more closely at how content from blogs, news sites and other sources is distributed via RSS feeds. These feeds can be a great way for anyone to receive customized news information from a growing number of sources.
What Is RSS?
If you’re unfamiliar with RSS feeds, my Making An RSS Feed sidebar article takes a close look at what exactly RSS is and how to create a feed. In short, RSS is a way for web site owners to let you know what new content they have available within their web sites.
There is a wide range of web sites that “syndicate” their content in this way. Among the top 100 most subscribed feeds reported by Radio UserLand (a news aggregator explained more below) are technology headlines from the New York Times, the daily Dilbert cartoon, PDABuzz.com and former MTV VJ Adam Curry’s weblog.
The beauty of these feeds is that you can effectively create your own custom newspaper or magazine of recent content. Indeed, that was the idea behind Netscape’s initial introduction of one form of RSS back in 1999. Portal users of Netscape could craft their own My Netscape “headlines” areas based on this content.
Netscape gave up on its support of external RSS feeds in 2001, but that hasn’t fazed RSS as a distribution network. Today, there are a number of news aggregators or RSS readers that allow you to subscribe to and read RSS feeds.
News Is Free is an excellent example of a long-standing web-based news aggregator. Using the free service, you can create customized “pages” for different topics, then have headlines from various resources automatically filled into those pages.
For example, let’s say you’d like a page to track news and content relating to the war in Iraq. You might call this page “Iraq News,” then add to it the feeds you are interested in.
To do so, you can keyword search through the over 5,000 feeds that News Is Free has categorized. Currently, that search brings back 17 different sources related to Iraq. Browsing through the results, with just a click, you can customize your page to have headlines from the Associated Press, the BBC or a news roundup from Yahoo News or Electronic Iraq.
Nice — but only 17 sources, I’m sure you’re thinking. Of course, there are many, many other resources distributed via RSS with rich content relating to the war in Iraq. We’ll come back to how to find them further below. However, the point is that RSS news aggregators like News Is Free allow you to assemble your own “channels” of information which will be broadcast to you by web sites.
How about some other news aggregators?
Radio UserLand, popularly known as a blog-building tool, is another long-standing news aggregator. Enter the URL of a news feed, and it will be added to your personal list.
FeedReader is a small, free software-based tool that I downloaded and tested. As with Radio UserLand, enter the URL of a feed, and headlines will be brought back and made viewable within the application.
NewsGator is a reader that works within Microsoft Outlook, which looked promising to me, since my life largely revolves around Outlook. The only reason I didn’t try it is that I work off a modem, and downloading the 20MB of Microsoft .NET framework files it requires would have taken forever.
NewsMonster was another tool that looked promising, especially the ability to detect spam in RSS and to rate articles for quality and use ratings provided by others. Unfortunately, it’s not for Internet Explorer users, working only with Mozilla 1.0 or Netscape 7.0 and higher.
Snarf is supposed to be a download-free RSS reader for Internet Explorer. I could get the Snarf window to open in my browser and add news feeds. Unfortunately, I couldn’t see any way to actually view these feeds, once added. Perhaps others will have more luck.
There are plenty of other news aggregators beyond the ones I’ve named, and I’d encourage you to check out some comprehensive lists of these programs. Aaron Swartz maintains a nice, short list of RSS readers, and Radio UserLand provides another short list. John Abbe has made a giant list of RSS readers which looks to be regularly maintained. You’ll even find readers for PDAs classified on it. A similar list is offered by Haiko Hebig.
Finally, how about the web’s major directories? What RSS readers do they list? LookSmart doesn’t provide a category for these, while Yahoo lists only a dozen. So, head ye over to the Open Directory’s RSS News Readers category, which has 40 listings.
Finding RSS Feeds Via RSS Directories
Got your RSS reader all fired up? Ready to use a web-based news aggregator to build a custom news page? Then all you need are some feeds! But how do you find them?
For a start, take a look at the home page of your favorite web sites or blogs. Increasingly, sites are promoting the fact that they provide information in RSS format. Often, this is done using little icons that say “XML” or “RSS.”
You can see an example of this on the Search Engine Watch home page, just below where our recent articles are shown. Clicking on the link to a feed with your browser will bring up what looks like a bunch of garbage. But copy and paste the link into your RSS reader, and you’ll see headlines and story descriptions, plus the ability to click through and read an entire story or have it pulled automatically into your reader.
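To make that “garbage” a little less mysterious, here is a minimal Python sketch of what an RSS reader roughly does with a feed. The sample feed, its titles and the example.com URLs are invented for illustration, and the parse_headlines helper is a hypothetical name. Real readers cope with several RSS versions and many quirks that this sketch ignores.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS 2.0 feed; a real reader would fetch this from a URL.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Site</title>
    <link>http://example.com/</link>
    <description>Recent articles from an example site</description>
    <item>
      <title>First Story</title>
      <link>http://example.com/first</link>
      <description>A short summary of the first story.</description>
    </item>
    <item>
      <title>Second Story</title>
      <link>http://example.com/second</link>
      <description>A short summary of the second story.</description>
    </item>
  </channel>
</rss>"""


def parse_headlines(feed_xml):
    """Return a list of (title, link, description) tuples, one per item."""
    root = ET.fromstring(feed_xml)
    headlines = []
    for item in root.iter("item"):
        headlines.append((
            item.findtext("title", default=""),
            item.findtext("link", default=""),
            item.findtext("description", default=""),
        ))
    return headlines


# Show each headline with its link and summary, as a reader might list them.
for title, link, description in parse_headlines(SAMPLE_FEED):
    print(f"{title} ({link}): {description}")
```

Each item in the feed carries a headline, a link back to the full story and a short description, which is the raw material a reader turns into a clickable list.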
Want a more comprehensive method of finding feeds? Then you may want to start with one of the two major RSS directories, where feeds are listed by category.
News Is Free, mentioned earlier as a news aggregator, is also a long-standing directory of feeds. Launched in June 2001, it organizes thousands of news feeds — what it calls “news channels” — into categories. You can browse the categories to find feeds of interest or do a search for feeds that mention your keywords in their titles and descriptions.
Want to get your new feed listed? Then visit the site’s contact page. Drop a short note, including the URL to your feed, and News Is Free will consider adding you to its directory.
Syndic8.com is the other major RSS directory, launched nearly a year ago. It lets you look through thousands of RSS feeds that users have submitted or which have been added by volunteers to its collection. In other words, it’s an Open Directory for RSS feeds.
Using the “Categories” link at the top of the Syndic8 home page, you can browse to find feeds organized into topics. By default, you’ll see feeds organized according to the category structure that the Open Directory uses. However, you can also view feeds organized into other category styles by clicking on links at the top of the Categories home page, such as the HV (Headline Viewer) or NIF (News Is Free categories) links.
You can also keyword search for feeds, but you’ll only find those that contain the words you searched for in the feed title, description and some other fairly limited data recorded about the feed. Thus, similar to News Is Free, a search for “iraq” brings up only a sparse 13 matches.
Adding your feed to Syndic8 is easy. Simply use the submission page for syndicated URLs. You don’t need to fill out the “User ID” box. Just insert into the “URL 0” box the URL of your RSS feed. Then sit back and wait until someone approves your feed for syndication. For Search Engine Watch, this happened within a week.
Finding RSS Feeds Via RSS Search Engines
RSS directories can be a helpful way to find feeds to add to your news aggregator. However, as seen, keyword searches against the directory listings may not reveal the true depth of content that’s available on a particular topic.
In contrast, the new breed of RSS search engines that are cropping up take you well beyond what the directories do. Instead of searching through the 25 or so words that describe a feed, the RSS search engines let you scan against the actual content within the feeds. Because of this, you may find feeds related to topics that you might have missed when using a directory.
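The difference can be sketched in a few lines of Python. Everything here is assumed for illustration: the feed records, their URLs and the two function names are made up, and neither function reflects how any particular directory or search engine is actually built.

```python
# Hypothetical data: each feed has a short directory description and
# the text of the items it currently carries.
FEEDS = [
    {
        "url": "http://example.com/world.xml",
        "description": "World news headlines",
        "items": ["Troops move toward Baghdad as Iraq war continues",
                  "Markets steady after early losses"],
    },
    {
        "url": "http://example.org/iraq.xml",
        "description": "Daily Iraq war coverage",
        "items": ["Iraq briefing transcript posted"],
    },
    {
        "url": "http://example.net/gadgets.xml",
        "description": "Handheld gadget reviews",
        "items": ["New PDA models compared"],
    },
]


def directory_search(feeds, keyword):
    """Match only against the short description, as a directory would."""
    keyword = keyword.lower()
    return [f["url"] for f in feeds if keyword in f["description"].lower()]


def content_search(feeds, keyword):
    """Match against the text of every item carried in each feed."""
    keyword = keyword.lower()
    return [
        f["url"] for f in feeds
        if any(keyword in item.lower() for item in f["items"])
    ]


print(directory_search(FEEDS, "iraq"))
print(content_search(FEEDS, "iraq"))
```

In this toy data, the directory-style search finds only the feed that mentions Iraq in its description, while the content search also turns up the general world news feed that happens to be carrying an Iraq story.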
For example, a search on new RSS search engine Feedster brings back over 15,000 posts or articles relating to Iraq. By clicking through on one of the listings, you can read the article and determine if the source site seems to be providing coverage that you are interested in. If so, you can return to Feedster and copy the RSS link shown near the article’s title, in order to add it to your RSS news reader.
Using RSS Search Engines To Find News & Information
It’s great that RSS search engines can help you locate feeds to subscribe to. However, I think the real power for many people will be that they offer on-demand searching of weblog, news and other informational content.
For traditional news, these are all excellent resources. Searches for “iraq” on them bring back articles from places such as Yahoo News, the BBC, USA Today, CNN, the Arab News, the New Zealand Herald and other outlets. However, these sources will largely lack the opinionated yet interesting current events information that’s often present in weblogs.
As mentioned earlier, many blogs have the ability to feed their content in RSS format. This means that RSS search engines may give you a more blog-centric view of what you’re searching for. Some “headlines” from these search engines in a search for “iraq” include items such as:
- US Military Spams Iraq
- American Right As Ridiculous As European Left
- Not In My Name Shock and Awe Analysis Published on the Web
- Here Are The Arguments On War Or No War
- A Warmonger Explains War To A Peacenik
The New RSS Search Engines
So who are these new RSS search engines? Here’s a rundown on what I’ve seen so far.
Feedster emerged originally as Roogle, a play on RSS+Google, in early March. Along with a name change, it has improved steadily day by day. Last week brought both a large jump in relevancy and a much-needed improvement in how search results are shown. The service is definitely one to keep watching. As for adding your feed, use the site’s add form to enter the URL of your RSS feed.
RSS Search is another March baby, showing up in the middle of the month. As with Feedster, it shows promise and is another one to keep an eye on. If you aren’t listed, there’s a submission form now available.
BlogDigger launched only a few days ago, and so far its content includes only sites that self-report changes to their RSS files via the Weblogs.com site. (An older article from RSS maven Tara Calishain, one I believe is still valid, explains a bit more about how Weblogs.com works.)
Snarf, mentioned earlier as an RSS reader, has also been described by others as an RSS search engine. If so, this is a capability I haven’t found. Snarf will show you a list of popular feeds that other Snarf users have subscribed to. You can scan the list and pick out feeds that seem of interest, but other than popularity, you have little else to go by. Only the feed’s file name is shown, with no topical category, title or description.
Fresh Search, also called Terrar, is another new service, I believe. Unfortunately, I didn’t have a chance to check further on it in time for this article.
Last month, Microdoc News also posted an article discussing how to turn Google into an RSS search engine. However, I think what it really does is turn Google into an RSS feed directory or discovery service.
In other words, you can discover feeds that may be about particular topics using the technique described. What you cannot do is search against the actual content of those feeds and read individual articles, in the way that Feedster and RSS Search do so well.
No doubt we’ll see more new RSS search engines emerge. To keep up, watch this list of RSS search engines being maintained by David Davies. He’s also written up a nice, short commentary touching on some of the issues involved with RSS search engines.
In particular, those behind these services are going to have to decide if they wish to seek out past content that they were never fed in the years before they existed.
This isn’t difficult, assuming they add a link crawler component. However, they might also decide that they’d rather present only the last 30 days or so worth of content. That’s how typical news search engines operate. In contrast, web search engines generally do not drop listings, unless those listings are no longer offered on the web.
Also be sure to visit Fagan Finder’s fantastic blog & RSS meta search page, where you can tap into many of the resources listed in this article from one location, as well as discover many others.
Permalink Surf With Technorati
Unlike the RSS search engines above, you can’t keyword search at Technorati. Nevertheless, there’s a lot to like from this wonderful site that launched earlier this year as a blog discovery and analysis tool.
Enter any URL into the box on the Technorati home page and push the “Link Cosmos” button. You’ll then see what bloggers are linking to that URL, as well as what they are saying in relation to it.
You can also contrast this with Google Labs’ Web Quotes service. With that, you can enter the name of a web site (not the URL), then see what other web pages (as opposed to weblogs) say about the site.
Also unlike the RSS search engines mentioned above, Technorati is firmly blog specific. It only spiders content from sites that meet its criteria of being blog-like, such as having RSS distribution, “permalinks” to posts and ease of constructing items. A background page at Technorati explains more about the criteria and the service itself.
Among the things to check out is Technorati’s Breaking News feature, where it assembles top headlines in an automated, Google News-like fashion. However, it is blogs that power the top headlines, not traditional news sources. A similar Current Events feature is also offered. Technorati also tries to create an “A-List” of the top 100 blogs based on linkage patterns.
More Blog Link Analysis From Blogdex
An older service that tries to leverage blog links for discovery purposes is Blogdex. Think of it as a “buzz index” according to blogs, where the goal is to show you which links anywhere on the web are currently getting the most references from bloggers. Some past history about Blogdex can be found in this Wired article and a Search Day article on Blogdex, both from mid-2001, when Blogdex launched.
Blogdex creator Cameron Marlow doesn’t depend solely on RSS feeds at the moment but instead uses services that monitor weblog changes, like Weblogs.com and blo.gs. He pulls links off the blogs that have been updated and analyzes them to see what’s hot.
Stats are updated every 10 minutes, but the currency of links is generally around a day. In other words, when you visit the site, you’ll pretty much see what’s hot that particular day, rather than over a week or month period.
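For a rough feel of how this kind of link counting might work, here is a small Python sketch. It is not Blogdex’s actual method, which isn’t spelled out here. The pages, the blog addresses and the LinkExtractor and rank_links names are all invented for illustration, and the sketch simply counts how many different blogs link to each URL.

```python
from collections import Counter
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)


def rank_links(blog_pages):
    """Count how many distinct blogs link to each URL, most-linked first.

    blog_pages maps a blog's URL to its HTML. Each blog counts once per
    target URL, however many times it repeats the link.
    """
    counts = Counter()
    for html in blog_pages.values():
        extractor = LinkExtractor()
        extractor.feed(html)
        counts.update(extractor.links)
    return counts.most_common()


# Made-up pages standing in for recently updated blogs.
PAGES = {
    "http://blog-a.example/": '<a href="http://example.com/story">x</a>'
                              '<a href="http://example.com/story">again</a>',
    "http://blog-b.example/": '<a href="http://example.com/story">y</a>'
                              '<a href="http://example.com/other">z</a>',
}
print(rank_links(PAGES))
```

Counting each blog only once per link is a design choice made for this sketch, so that a single enthusiastic blogger can’t push an item up the list alone.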
You can also keyword search at Blogdex, rather than browse through the top linked to items. Keyword searching using the text option (set this with the drop down box) means that you’ll find all the items discussed among blogs that contain the words you are looking for in the name of the item. I’m guessing text within the item’s description and links to the item are also considered. Items are likely ranked by link popularity, but this isn’t explicitly stated.
If you want to know how your own blog item or web page is doing, enter your URL or permalink into the search box and keep the default to search for text in “URLs.” When your page appears, click on the “track this site” or “track this weblog” link that should be shown. Then you’ll see a past history of everyone known within the blogging community to have referenced your page.
Finally, if you aren’t listed in Blogdex, use its submission form to let the service know about your web page or RSS feed. If it already knows it exists, you’ll be told.
Don’t Forget Daypop!
It’s great to see the new RSS search engines that are emerging, but RSS isn’t anything new to Daypop. The popular news search engine has been making RSS content searchable for almost a year. In addition, Daypop has been making weblog and news content searchable for even longer, since it launched back in August 2001.
By default, when you visit Daypop, your search will go against both current “news” and “weblog” content. You also have a choice to search for this content separately, as well as through RSS content. All options are available using a drop-down box on the Daypop home page, as follows:
- News & Weblogs
- RSS Headlines
What’s news content? Whatever site owner Daniel Chan has determined through his human review to be a news site. This includes many of the traditional sources you might expect, along with smaller industry sites that carry news. Obviously, if you want more traditional “story-oriented” content, then set your search option to news.
As for weblog content, Chan said that these are primarily sites that have submitted themselves to Daypop and listed themselves as weblogs, according to the submission page’s instructions (news sites can also submit). That means if you select the weblogs search option, you’ll probably get a more opinionated view of current events.
How about RSS Headlines? Daypop examines a list of feeds provided by the aforementioned News Is Free site. Daypop then visits all the content listed in the feeds, to allow users to search within it.
Why choose the RSS Headlines option? At Daypop, it’s hard to say. Choosing it means you will get back matches from both traditional news sites and weblogs, since both types of sites distribute via RSS. But Daypop also offers what seems to be a similar “News & Weblogs” search option. Would that cover the same material as the RSS Headlines choice? No.
Not all news sites and weblogs distribute via RSS, so when you take that option, you might miss some of the self-submitted or hand-selected sites that Daypop’s other options provide access to. But similarly, there may be some news and weblogs that are only being found via RSS, rather than in the other ways.
In summary, you might want to try them both. The ultimate solution would be to combine all the RSS feeds into the other content and drop any duplicate URLs. Unfortunately, it would still be difficult to know what’s a weblog and what’s a news site, just from the RSS feeds alone. Nevertheless, Chan says he’s considering options. To help him along, I’d encourage those who like the service to make a voluntary donation to support it.
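The “combine everything and drop duplicate URLs” idea mentioned above can be sketched in Python as follows. This is only an illustration under assumed inputs, not anything Daypop has said it does. The merge_results name and the sample results are hypothetical, and the trailing-slash normalization is a deliberately crude stand-in for real URL matching rules.

```python
def merge_results(*result_lists):
    """Merge lists of (url, title) results, keeping the first copy of each URL."""
    seen = set()
    merged = []
    for results in result_lists:
        for url, title in results:
            # Crude normalization so "http://x/b" and "http://x/b/" match;
            # a real service would need more careful rules.
            key = url.strip().rstrip("/")
            if key not in seen:
                seen.add(key)
                merged.append((url, title))
    return merged


# Made-up results from two search options that partly overlap.
news_and_weblogs = [
    ("http://example.com/a", "Story A"),
    ("http://example.com/b/", "Story B"),
]
rss_headlines = [
    ("http://example.com/b", "Story B via RSS"),
    ("http://example.com/c", "Story C"),
]
print(merge_results(news_and_weblogs, rss_headlines))
```

Note that merging like this removes duplicates but, as discussed above, still says nothing about whether a given result came from a weblog or a news site.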