Google News Study Finds Bias But Not Favoritism -- But Study Also Has Flaws

A study has found that Google News results are significantly more likely to show an ideological bias than Yahoo News results, though the bias falls on both ends of the spectrum.

Caveat alert! The study involved only one particular type of story -- those related to the 2004 US presidential election. Findings on one type of story do not necessarily indicate the situation with other types.

You can read a summary of the study in Non-traditional sources cloud Google News results from Online Journalism Review. The full study in PDF format is available here. Below, I'll highlight the findings and then give my own comments:

Findings

  • Bias was almost entirely attributed to "non-traditional" news sources. In other words, if all those non-traditional sites had been dropped, Google would have been seen as the same as Yahoo.
     
  • Stories coming up for searches on "George W. Bush" and "John Kerry" were analyzed.
     
  • Checks for stories were done every four hours in the two weeks before the actual election, resulting in 80 "snapshots."
     
  • Five snapshots were chosen randomly, then the first five articles in each were analyzed.
     
  • If articles required payment, a short "free" version was used if offered, otherwise the article was skipped and the "next highest" article was used.
     
  • Articles were analyzed sentence-by-sentence to check for bias in a particular direction. Reviewers were given a code to determine whether each sentence reflected bias.
     
  • Reviewers were also asked to rate stories overall, rather than on a sentence-by-sentence basis.
     
  • Despite acknowledging some weaknesses in using candidate names as queries (and in deciding exactly what form of the names to use), the study used them anyway, saying this emulated what an average user would do.
     
  • While Google was found to have more bias, that bias ran in both directions. In other words, Yahoo's results were more balanced overall, while Google had balance plus extremes at either end. It wasn't seen as slanted more or less toward liberal/conservative or Bush/Kerry.
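
To make the sampling procedure described above concrete, here's a minimal sketch in Python. It is not the study's actual code. The data layout (the `paywalled` and `free_version` fields, ten results per snapshot) and the function name `sample_articles` are assumptions made up for illustration; only the numbers 80, five and five come from the study as summarized here.

```python
import random

def sample_articles(snapshots, n_snapshots=5, n_articles=5, seed=None):
    """Pick snapshots at random, then take the top accessible articles from each."""
    rng = random.Random(seed)
    chosen = rng.sample(snapshots, n_snapshots)
    sample = []
    for snapshot in chosen:
        picked = []
        for article in snapshot:  # articles are assumed to be in ranked order
            if article["paywalled"] and not article["free_version"]:
                continue  # skipped, so the next-highest article takes its place
            picked.append(article)
            if len(picked) == n_articles:
                break
        sample.append(picked)
    return sample

# Hypothetical data: 80 snapshots of 10 ranked results, every third one paywalled
snapshots = [
    [{"rank": r, "paywalled": r % 3 == 0, "free_version": False} for r in range(1, 11)]
    for _ in range(80)
]
sample = sample_articles(snapshots, seed=0)
print(len(sample), [a["rank"] for a in sample[0]])  # 5 [1, 2, 4, 5, 7]
```

Note how the paywall rule changes which articles get reviewed: in this made-up data, the third-ranked result never makes it into the sample.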

My Observations

First, the study singles out Google for not listing its sources. As a reminder, neither does Yahoo, nor almost any other news search engine I can think of, as I've written before.

Next, the study doesn't show any data on how an "average user" might search for either candidate or, indeed, for information about the election at all. So when I'm told that the names were used the way typical people might use them, I'm not reassured unless I see some query logs.

Most important, the study doesn't seem to take into account the clustering of news stories that Google does. Google will "cluster" similar stories under each other, like this example generated by a query I did on the word bush at Google News:

Bush May Risk Court Deadlock With Unpopular Choice (Update1)
Bloomberg - 21 hours ago
May 19 (Bloomberg) -- President George W. Bush would risk a deadlocked US Supreme Court were he to choose someone ``way out of the mainstream'' to fill a ...
Reid: Bush, GOP Seek to Reinvent Reality ABC News
Reid: Bush, GOP Seek to Reinvent Reality Guardian Unlimited
Possible Supreme Court Vacancy Said Driving Senate Battle Over ... Black Enterprise
Savannah Morning News - San Francisco Chronicle - all 2,284 related »

Social Security adviser casts doubt on Bush plan
Chicago Tribune, IL - 6 hours ago
WASHINGTON -- Robert Pozen, the business executive who developed the theory behind President Bush's plan to trim Social Security benefits in the future, urged ...
How Bush Makes Sure They Agree Los Angeles Times
Investment chief questions Bush plan Boston Globe
Bush Committed to Private Accounts Plan ABC News
Kansas City Star - Washington Post - all 265 related »

Bush would veto House bill on stem cells
Reuters - 1 hour ago
WASHINGTON (Reuters) - President Bush said on Friday he would veto legislation that would loosen restrictions on embryonic stem cell research and expressed ...
Bush Vows Stem Cell Veto CBS News
Bush threatens veto on stem cell research bill CNN
Bush Says He'd Veto Bill Easing Stem Cell Fund Limits (Update1) Bloomberg
news4colorado.com - FXstreet.com - all 229 related »

Bush should have been told of plane scare - wife
Reuters - 45 minutes ago
AMMAN (Reuters) - Contradicting the White House line, US first lady Laura Bush said on Thursday the president should have been interrupted during a bike ride ...
Mrs. Bush's 5-Day Mideast Mission CBS News
Mrs. Bush: Trip Should've Been Interrupted Washington Post
Mrs. Bush Says President's Bike Trip Should Have Been Interrupted ... KOTV
Expressindia.com - all 85 related »

Bush cheers FCAT scores for reading
Sun-Sentinel.com, FL - 2 hours ago
... The results left Gov. Jeb Bush expressing confidence the state was moving in the right direction, despite problems in the upper grades. ...
FCAT scores show Dade closing gap Miami Herald
FCAT Scores Rise for Students in Grades 3 through 10 WJXX
Younger students fare best on FCAT Gainesville Sun
Tampa Tribune - Palm Beach Post - all 81 related »

Now which five links are you counting? The ones shown in bold (the lead headline of each cluster) are the links that are actually biggest on the Google News page. The other links are in a smaller font. Do you count the first five links you come to, or just the first five biggest links? From what I can tell, the study counted just the biggest ones.

That makes a big difference when comparing to Yahoo. Yahoo doesn't cluster results, so it will show less diversity at a glance. In other words, Yahoo might show 10 stories that say the same thing, keeping alternative views out. In my experience, Google is better at clustering all 10 similar stories under one major headline/link, allowing other stories on slightly different topics/angles to emerge.

Here are more examples, to show this better. Going back to the list above, this is what you get if you count only the biggest/bold links:

  1. Bush May Risk Court Deadlock With Unpopular Choice (Update1)
  2. Social Security adviser casts doubt on Bush plan
  3. Bush would veto House bill on stem cells
  4. Bush should have been told of plane scare - wife
  5. Bush cheers FCAT scores for reading

As you can see, there are five different stories involved (Judicial Appointments, Social Security, Stem Cell Research, Plane Scare & FCAT Scores).

Now compare to the first five stories listed at Yahoo News for bush:

  1. Bush says he does not fear violent reaction to Saddam photos
  2. Bush: I'll Veto Stem Cell Legislation
  3. Bush: Ideology Motivates Iraq Insurgents
  4. Bush would veto House bill on stem cells
  5. Bush threatens to veto bills easing ban on federal stem cell research funding

As you can see, there are essentially only two stories represented (Iraq, Stem Cell Research).

Now go back to Google. Let's say you took the first five news links -- not the biggest/bold news links, but literally the first five actual article links you came to, just as is the case with Yahoo:

  1. Bush May Risk Court Deadlock With Unpopular Choice (Update1)
  2. Reid: Bush, GOP Seek to Reinvent Reality
  3. Reid: Bush, GOP Seek to Reinvent Reality
  4. Possible Supreme Court Vacancy Said Driving Senate Battle Over ...
  5. Savannah Morning News (Isakson says filibuster will fail to stop Bush's judicial nominations)

Now you can see only one story is represented -- that of the fight over judicial appointments.

And the point is? Google's system allows more different stories to appear in response to a query, if you count the biggest links. That means you may end up with more diversity in views -- and yes, more bias. But count things differently, and that might go away.
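
The difference between the two counting methods can be tallied in a few lines of Python. This is only a sketch built from the listings shown in this article: the topic labels are mine, the link counts per cluster are read off the Google News example (lead headline, smaller headlines and source-name links), and the topic assigned to each Yahoo headline is my own reading of it.

```python
# Each Google cluster: (topic, number of article links shown in the cluster).
google_clusters = [
    ("Judicial appointments", 6),
    ("Social Security", 6),
    ("Stem cell research", 6),
    ("Plane scare", 5),
    ("FCAT scores", 6),
]

def topics_by_lead_links(clusters, n=5):
    """Count only the biggest (lead) link of each cluster."""
    return [topic for topic, _ in clusters[:n]]

def topics_by_reading_order(clusters, n=5):
    """Count every article link in the order a reader comes to it."""
    flat = [topic for topic, link_count in clusters for _ in range(link_count)]
    return flat[:n]

# Yahoo shows a flat list, so there is only one way to take the first five.
yahoo_topics = ["Iraq", "Stem cell research", "Iraq",
                "Stem cell research", "Stem cell research"]

print(len(set(topics_by_lead_links(google_clusters))))     # 5 distinct stories
print(len(set(topics_by_reading_order(google_clusters))))  # 1 distinct story
print(len(set(yahoo_topics)))                              # 2 distinct stories
```

Same Google page, same "first five" rule, and the number of distinct stories swings from five to one depending on which links you count.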

It's also somewhat troubling that if a story couldn't be read without paying, it was dismissed. Yahoo has agreements with major publishers so stories can be read right on its site. Google does not. By dismissing some inaccessible stories, further skewing or bias may have been brought into the study.

Overall, it's an interesting look, but I find it hard to feel that it concludes anything.

Want to comment? Please join our forum thread, Google News Unbiased When Blogs Left Out?