Log Analysis: Seeing The Keywords Used To Find Your Web Site

People often spend a lot of time trying to figure out how to be tops for particular keywords without ever examining how they are already doing. There is a very easy way to discover exactly what keywords you’re already successful for. This can keep you from changing pages and possibly losing a good position that you already have.

Web Site Logs

All web servers record visits to a web site in a log file. Each request or hit is represented by a line in the file. For example:

mysoftns.mysoftware.com - - [01/Jun/1997:02:04:44 -0600] "GET /access/ HTTP/1.0" 200 3075

That was someone visiting a site I run, the Access Providers Disk Report. The line shows the person’s domain, when they visited, what they requested, whether the request was successful and how much information was transferred, in that order.
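To make the field order concrete, here is a minimal Python sketch that splits one such line into its parts. The pattern and the names LOG_LINE and parse_line are my own illustration, not part of any particular statistics program, and the sample is the line above written with the plain hyphens, bracket and straight quotes a server would actually use.

```python
import re

# Pattern for one line in the basic log format described above:
# host, two placeholder fields, [timestamp], "request", status, bytes sent.
LOG_LINE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None

sample = ('mysoftns.mysoftware.com - - [01/Jun/1997:02:04:44 -0600] '
          '"GET /access/ HTTP/1.0" 200 3075')
fields = parse_line(sample)

# Print the fields in the same order the text lists them.
for name in ("host", "time", "request", "status", "size"):
    print(name, "=", fields[name])
```

A status of 200 means the request succeeded, and the size is the number of bytes sent back.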

Almost all hosting companies provide some sort of program that turns this information into an easier-to-read statistics report. I say easier, not easy, because some reports are pretty bad while others can be excellent.

Referrer Logging

Web servers can be configured to capture referrer information. This information shows how people are finding your web site. Lots of Internet hosting providers don’t provide referrer logging as a standard service, though they should. It is the best way to determine how well your online publicity efforts are working.

Sometimes, referrer information is written to a separate log file. However, it is usually appended to the standard log information, in what’s called extended log format.

Remember that line from above? When referrer logging is enabled, here’s what it looks like:

mysoftns.mysoftware.com - - [01/Jun/1997:02:04:44 -0600] "GET /access/ HTTP/1.0" 200 3075 "http://www.omnigroup.com/People/cirocco/kimberlyfaq.html" "Mozilla/3.0 (Win95; I)"

Now see the part that says:

http://www.omnigroup.com/People/cirocco/kimberlyfaq.html

That’s a referrer link to my site. If you go to that page, somewhere on it is a link to my site. Here’s another example:

squid.execpc.com - - [15/Jun/1997:21:00:34 -0600] "GET /meta.htm HTTP/1.0" 200 14792
search+engine"" "Mozilla/2.0 (compatible; MSIE 3.02; Windows 95)"

This is someone who found Search Engine Watch’s meta tag page via AltaVista. See the part that says:


After that, you can see:


That section holds the words used to find the page. Remove the coding AltaVista puts around the words (a good stat program does this), and you get the actual words used to find the page:

meta tags and search engine
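Here is a rough sketch of that decoding step in Python, using the standard urllib.parse module. The referrer address below is hypothetical, and so is the assumption that the search words travel in a parameter named q. Each search engine uses its own address and parameter names, so treat both as placeholders.

```python
from urllib.parse import urlsplit, parse_qs

def search_terms(referrer, param="q"):
    """Return the decoded search phrase from a referrer URL, or None."""
    query = urlsplit(referrer).query
    # parse_qs turns "+" and %XX escapes back into plain characters.
    values = parse_qs(query).get(param)
    return values[0] if values else None

# Hypothetical referrer: the host and the "q" parameter are assumptions.
referrer = "http://search.example.com/query?pg=q&q=meta+tags+and+search+engine"
print(search_terms(referrer))
```

A referrer with no query string, such as an ordinary link from another page, simply returns None.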

FYI, if the AltaVista address looks odd to you, that’s because the example is from logs in 1997. However, all the same principles shown are true today and for other search engines, such as Google.

How Do I Get This Stuff?

There are plenty of programs that will read your logs and produce reports showing the keywords and links people are using to find your site. But they can’t do this unless your hosting provider is capturing referrer information.

Ask your provider if referrer information is captured. If not, tell them you want this to begin immediately, and that you want the information written in extended log format. They’ll know exactly what this means and should be able to do this easily. In most cases, it involves making a very simple change to the server’s configuration file.

If they can’t or won’t make the change, consider moving. This is the most valuable information that your web site logs can record, so you definitely want it.

If your provider already captures the information, then all you need is a program to read it. Ask them if they have an online program that you can use. There are many low-cost ones that they can install. Otherwise, you can download the information and run it through an offline program. Again, there is a wide variety to choose from.
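If you are curious what such an offline program does at its core, the sketch below counts referring sites and search phrases from lines in extended log format. It is only an illustration under assumptions: the pattern EXTENDED_LINE, the function tally and the q parameter are my own choices, and the second sample line is made up. Real log analysis tools handle far more cases.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# The basic fields, followed by two quoted fields: referrer and browser.
EXTENDED_LINE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?:\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def tally(lines, param="q"):
    """Count referring hosts and search phrases across log lines."""
    hosts, phrases = Counter(), Counter()
    for line in lines:
        match = EXTENDED_LINE.match(line)
        if not match or match.group("referrer") in ("", "-"):
            continue  # line did not parse, or no referrer was recorded
        parts = urlsplit(match.group("referrer"))
        hosts[parts.netloc] += 1
        for phrase in parse_qs(parts.query).get(param, []):
            phrases[phrase.lower()] += 1
    return hosts, phrases

sample_lines = [
    # The referrer example from this article.
    'mysoftns.mysoftware.com - - [01/Jun/1997:02:04:44 -0600] '
    '"GET /access/ HTTP/1.0" 200 3075 '
    '"http://www.omnigroup.com/People/cirocco/kimberlyfaq.html" '
    '"Mozilla/3.0 (Win95; I)"',
    # Hypothetical visit from a search engine (host and "q" are assumptions).
    'visitor.example.net - - [15/Jun/1997:21:00:34 -0600] '
    '"GET /meta.htm HTTP/1.0" 200 14792 '
    '"http://search.example.com/query?q=meta+tags+and+search+engine" '
    '"Mozilla/2.0 (compatible; MSIE 3.02; Windows 95)"',
]

hosts, phrases = tally(sample_lines)
print(hosts.most_common())
print(phrases.most_common())
```

To run it on a downloaded log, pass an open file instead of the sample list, since tally only needs something it can read line by line.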

For more information about log analysis tools, see the SEM Tools: Measuring Tools & Web Analytics category of Search Topics in Search Engine Watch. It also covers tools that track and analyze your site traffic without using your logs, by having a third party collect and store the data remotely.
