The Once and Future P2P

P2P (peer to peer) technologies seem to be all the rage these days. While Napster - the most famous P2P system - appears to be in a death spiral, other P2P technologies have emerged that look truly interesting and viable, and some are likely to end up as essential tools in your web search arsenal.

Despite the breathless accolades peer to peer systems have received in the popular media, P2P technologies aren't new. In fact, the Internet itself was originally designed as a peer to peer system, with information on any net-connected computer directly accessible by any other, provided the two computers agreed to share data.

The web was also initially conceived as a P2P system. Early versions of web software included both a client (what we call a browser today) and a server. The goal was to create a distributed system where every user participated equally as both user and publisher of information.

The advent of the stand-alone browser changed the web from a true P2P system to one more like a traditional client-server system. As mainstream businesses, e-tailers and traditional media embraced the web, huge, highly trafficked sites (servers) became the norm. The vast majority of web users today are content to simply graze for information with their browser, and if they publish at all, it's generally to a large community area like Yahoo's Geocities, or via a weblog.

While the web was evolving, other P2P systems thrived on the Net. Email is by far the most widely used P2P system. Usenet newsgroups and instant messaging are also fundamentally peer to peer systems. So is the DNS (domain name system), the directory that keeps track of the names and addresses of all Net connected computers.

And then there's Napster, the controversial P2P system that allows direct file swapping between two users over an Internet connection. Napster isn't a "pure" P2P system, because at its heart is a huge central directory of all Napster users and the files they have available for sharing. When you perform a Napster search, you're searching this directory, not the hard drives of other Napster users.

Just like a search engine, Napster's centralized directory is tuned to provide speedy results. But it's not until you've selected a song to download that your computer actually establishes a direct peer to peer connection with the computer where the file resides (unless you use Napster's chat facility, but that's another story).
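The two-step flow described above (search a central directory, then fetch the file straight from another user's computer) can be illustrated with a small Python sketch. This is not Napster's actual protocol: the CentralDirectory and Peer classes, the download function and the sample addresses are all hypothetical names invented for illustration, and the "network" is simulated in memory.

```python
class CentralDirectory:
    """Hypothetical central index: knows who has which file, holds no files."""

    def __init__(self):
        self._index = {}  # title -> set of peer addresses offering it

    def register(self, peer_address, titles):
        # Called when a peer comes online and announces its shared files.
        for title in titles:
            self._index.setdefault(title, set()).add(peer_address)

    def search(self, term):
        # Case-insensitive substring match against the directory only;
        # no peer's hard drive is touched during a search.
        needle = term.lower()
        return {
            title: sorted(peers)
            for title, peers in self._index.items()
            if needle in title.lower()
        }


class Peer:
    """Hypothetical peer: holds the actual file contents."""

    def __init__(self, address, files):
        self.address = address
        self.files = dict(files)  # title -> file contents (bytes)

    def send_file(self, title):
        # Stands in for the direct peer-to-peer transfer.
        return self.files[title]


def download(directory, peers, term):
    """Search centrally, then fetch directly from the first peer listed."""
    hits = directory.search(term)
    if not hits:
        return None
    title = sorted(hits)[0]          # pick the first matching title
    address = hits[title][0]         # pick the first peer offering it
    return title, address, peers[address].send_file(title)


if __name__ == "__main__":
    alice = Peer("alice:6699", {"Blue Moon": b"aaa"})
    bob = Peer("bob:6699", {"Blue Moon": b"aaa", "Stardust": b"bbb"})
    peers = {p.address: p for p in (alice, bob)}

    directory = CentralDirectory()
    for p in peers.values():
        directory.register(p.address, p.files)

    print(directory.search("moon"))
    print(download(directory, peers, "star"))
```

In this sketch the directory never stores file contents, only a list of who has what. That is why searches are fast, and also why the directory is a single point that everything else depends on.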

Napster's central directory was an easy target for the music industry to aim at when they went after the company. Crippling the directory left Napster as little more than a simple file-swapping system with not a lot of interesting stuff left to swap.

True P2P systems like Gnutella and Freenet are far less vulnerable to wide scale control because they are decentralized and widely distributed. But Gnutella, Freenet and their kin have their own set of problems and weaknesses. We'll take a closer look at these systems in upcoming issues of SearchDay.

What are some of the other types of P2P systems you might find useful as a searcher?

Agent technologies. Agents are programs that do things on your behalf. Many are "autonomous," meaning they are smart enough to both act on their own (perhaps while you sleep), and adapt to new information that can supersede prior instructions.

Distributed search. There are several interesting P2P projects that aim to improve search by distributing tasks like crawling and indexing among dozens, hundreds, or even thousands of computers. In theory, distributed search can solve the dual problems of incomplete and out-of-date web indexes that bedevil even the best major search engines.
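One simple way to picture how crawling might be divided among many computers is to hash each URL's host name and use the result to pick a peer. The Python sketch below is only an illustration under that assumption, not the method of any particular project; the function names (assign_peer, partition) and the peer labels are hypothetical.

```python
import hashlib
from urllib.parse import urlsplit


def assign_peer(url, peers):
    """Pick the peer responsible for a URL, based on a hash of its host."""
    host = urlsplit(url).hostname or ""
    digest = hashlib.sha1(host.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap it
    # around the number of available peers.
    return peers[int.from_bytes(digest[:8], "big") % len(peers)]


def partition(urls, peers):
    """Split a list of URLs into one crawl queue per peer."""
    work = {peer: [] for peer in peers}
    for url in urls:
        work[assign_peer(url, peers)].append(url)
    return work


if __name__ == "__main__":
    peers = ["peer-a", "peer-b", "peer-c"]
    urls = [
        "http://example.com/a",
        "http://example.com/b",
        "http://example.org/",
        "http://example.net/x",
    ]
    for peer, queue in partition(urls, peers).items():
        print(peer, queue)
```

Hashing on the host name keeps all pages from one site with the same peer, so each site is visited by one crawler rather than many. A weakness of this simple modulo scheme is that adding or removing a peer reshuffles most assignments; a real system would likely need something more robust.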

Bear in mind that P2P isn't limited to search. Any type of peer to peer interchange between computers falls under the P2P rubric.

Other types of P2P technologies that you'll likely hear more about include distributed computing efforts, such as the SETI@home project, which harnesses the power of millions of volunteers' computers to analyze radio telescope data in the hunt for extraterrestrial life. Other distributed computing projects are helping analyze the human genome, find a cure for cancer, and solve some of the most intractable mathematics problems.

Perhaps the grandest P2P scheme of all was born at CERN, the very birthplace of the web. CERN is a high energy physics research laboratory, where subatomic particles are hurled together in massive high energy accelerators. Each year, CERN experiments generate 1 petabyte of data (equal to 1,000 terabytes, or about 1,000,000,000,000,000 bytes of information). By contrast, the entire Web amounts to only about 25-50 terabytes, which means a single year of CERN data is roughly 20 to 40 times the size of the Web.

CERN's newest accelerator, the Large Hadron Collider, is expected to generate 100 petabytes in just a single experiment! To keep up with this data onslaught, a team of researchers from Johns Hopkins, Microsoft Corp., the California Institute of Technology, Fermilab and CERN is working to build a massively distributed database that will be accessible via the web. We can only hope that some of the technology from this vast P2P system will "trickle down" to the comparatively puny search engines we rely on to find our way around the web.

In upcoming issues of SearchDay, we'll take a closer look at many of the P2P technologies that can be helpful to searchers. In the meantime, here are some other interesting sites where you can learn more about the world of P2P - a technology you can be sure will be an essential part of the Net, no matter what happens to individual tools or sites like the unfortunate Napster.

O'Reilly P2P Directory
This directory provides links to dozens of P2P projects and sites, as well as providing very readable articles about the technology (click the Article Archive link at the top of the page).

Bots are agents. Many bots use P2P technologies. Be sure to check out the "search bots" section of this site for some very interesting alternatives to large web search engines.

Will P2P Search Replace Search Engines?
Will Napster-style peer-to-peer searching mean an end to search engines? Despite the continuing hype, Danny Sullivan doubts this will be a replacement for web-wide searching.

New at

AltaVista Europe Debuts Helpful Search Features
AltaVista Europe has rolled out a helpful search management feature and new thumbnail images that appear next to some search results.

That's it for this issue. Thanks again for subscribing, and watch for tomorrow's issue where we look at how you can more precisely control your searches using phrase searching and the NEAR operator.