Vanishing Act: The U.S. Government's Disappearing Data

by Marylaine Block, Guest Writer

More than any other government, the U.S. government has used the web to make a wealth of information available to its citizens. But as we are now discovering, the dark side of web-based information is the ease with which it can be deleted.

Government-sponsored (which is to say, taxpayer-funded) information and research are disappearing from government web sites, much of it in the name of national security. Airport safety data vanished, and chemical plant risk-management plans were deleted from the Environmental Protection Agency's web site.

The Department of Energy removed environmental impact statements which alerted local communities to potential dangers from nearby nuclear energy plants, as well as information on the transportation of hazardous materials.

The U.S. Geological Survey asked depository libraries to destroy a CD-ROM database on surface water. As a result, University of Michigan researchers lost access to information vital to their three-year study of hazardous waste facilities, and community activists could no longer access data on chemical plants that violate pollution laws.

An entire database of unclassified technical reports was removed from the Los Alamos National Laboratory Web site since it would have taken too much time to examine each document in the database for its potential security risk.

The Defense Department removed over 6,000 documents from its web site. The Nuclear Regulatory Commission shut down its entire web site and brought it back up again, scrubbed of anything considered potentially useful to terrorists.

According to the American Library Association, the Department of Energy has removed 9,000 scientific research papers that contain keywords such as "nuclear" or "chemical" and "storage" from national laboratory web sites and is reviewing them to see if they pose security risks. The Defense Technical Information Center has removed thousands of documents.

But other information that has no relationship whatsoever to security issues is also vanishing, and there is some suspicion that an ideology test is being applied. The Centers for Disease Control removed reports from its web site on the effectiveness of condoms in AIDS prevention, and on effective programs for the prevention of tobacco use, pregnancy and sexually transmitted diseases among young people.

The National Cancer Institute removed a report debunking the claim that abortions increase the risk of breast cancer, and the Department of Education is, it says, "reevaluating" hundreds of research reports available on its web site.

Furthermore, state governments are also removing data from public access. Florida governor Jeb Bush signed measures closing public access to information on hospital security plans and information on emergency stockpiles of pharmaceuticals.

Florida lawmakers have also proposed restricting access to information about cropdusters and about state investigations of food-borne illnesses. Massachusetts legislators want to restrict access to records such as blueprints for the state's bridges, tunnels and airports.

Michigan and Tennessee legislators are considering barring access to state emergency response plans. In Oklahoma, among the sensitive materials legislators are considering restricting are the times of school board meetings and the location of high pressure gas lines.

Now, it is possible to make an honorable case for many of these deletions of public information. Maps revealing the location of gas pipelines might very well be useful to terrorists, though they would be even more useful to potential buyers of farm land crossed by those pipelines.

Information on safety flaws at chemical plants might indeed be useful to terrorists, though it would be even more useful to emergency workers in nearby communities. Educational and medical research are always in need of updating, but then again, since knowledge is built on mistakes of the past as well as successes, the traditional way of doing it is to add new data rather than to erase the old.

The problem is that the previous presumption, that publicly funded information is the rightful property of the public until proven otherwise, has been replaced by the presumption that the public has to prove to a suspicious government that it deserves the information. Gary Bass of OMB Watch, a private group that monitors government spending and legislation, says, "We are moving from a right to know to a need to know society."

Where former Freedom of Information Act policy put the burden on the federal agencies to justify withholding documents requested under FOIA, Attorney General John Ashcroft's October 12, 2001 memo to federal agencies instructed them to avoid releasing documents until after conducting a full review of any possible security implications of the disclosure.

Isn't that convenient for government, given that the natural tendency of government officials is toward secrecy? And if you don't believe that, see the Audits and Surveys of State Freedom of Information Laws (link below), which reports on a project in which a number of news agencies requested public information from a variety of agencies in 19 states. They were repeatedly denied access: in Colorado, local agencies failed to comply with state public records law a third of the time; in Connecticut, only 22% of agencies complied; and in Maryland, requesters had only "a one-in-four chance of immediately getting what they are looking for."

September 11 has become a blanket excuse for governments to conduct their business as they prefer to do -- in private, suppressing all kinds of information, whether or not it has even the most tangential relation to national security, and without any regard to valid public information needs.

Timothy Maier, in a story in the April 8 Insight Magazine, found that "even résumés of senior government officials are being censored in some agencies... When reporter Todd Carter obtained résumés of EPA political appointees to post on the Natural Resources News Service Website, the EPA directed him not to post them because of privacy concerns.

The EPA then sent another batch of résumés that blacked out education levels, awards, affiliations and even job experience. When asked for the return of the unredacted résumés, Carter refused and posted résumés on the news-service Website showing that EPA had brought in former Enron employees."

More information will presumably disappear when some government agencies cease to exist as their functions are folded into the new Department of Homeland Security. Among the agencies slated for extinction is the U.S. Immigration and Naturalization Service.

Will anybody in the reconstituted agency preserve the documents on their web pages? If not, will the University of North Texas librarians who operate a "CyberCemetery" of the documents of defunct government agencies preserve them?

What's more, not only is there no government policy stipulating procedures and determinants for the deletion of data from government web sites, no government agency, not even FirstGov.gov, can even tell you what has been deleted from what pages.

So who is keeping track of deleted data? As you would expect, government document librarians are monitoring the situation closely. Information on deletions and other threats to public information is available on the web site of the Government Documents Round Table (GODORT), which has also created a Task Force on Permanent Public Access to Government Information.

Other concerned groups include OMB Watch, which monitors the deletion of government web pages, and the Federation of American Scientists, which maintains a Project on Government Secrecy. The Project's director, Steven Aftergood, suggests that what we need is an oversight panel to review deletion decisions so that decisions to withhold public information could not be made "by some anonymous agency official" without the possibility for the public to challenge them.

It seems to me that GODORT is on the right track with its task force for permanent access to government documents, and there are plenty of willing organizations it can partner with, but the job is too big for them. There are just too many web documents to copy, and thanks to OMB's dictum that federal agencies should ignore the Government Printing Office, even finding them in the first place will be a massive, time-consuming project.

Librarians -- not just government documents librarians, but all of us -- are going to need to assume that information on government web pages is temporary. Just as librarians have worked cooperatively to make sure that last copies of printed works are not allowed to vanish, we will need to act cooperatively, and quickly, to preserve information from government web pages, whether by printing it out and cataloging it or by mirroring it on our own web sites.

Because it's not THEIR information. It's OUR information, and we can't let them get away with deleting it. We paid for it, and we need it, if we're to have any hope of knowing what our government is doing. Since giving people access to the information their taxes paid for has always been the job of librarians, we are the ones who are going to have to take on this challenge.

Marylaine Block, a former academic librarian, is now an internet trainer, speaker, writer, and editor of two e-zines. This article originally appeared in one of them, ExLibris, on December 6, 2002. Links to all her work are available at http://marylaine.com/.

Audits and Surveys of State Freedom of Information Laws
http://foi.missouri.edu/openrecseries.html
Results of a six-month survey by members of 19 media organizations and the journalism schools at Arizona State University and the University of Arizona auditing 187 state agencies to test Arizona's Public Records Law.

University of North Texas "CyberCemetery"
http://govinfo.library.unt.edu/
UNT librarians maintain an archive of documents of defunct government agencies.

Government Documents Round Table
http://sunsite.berkeley.edu/GODORT/
The Government Documents Round Table is a unit of the American Library Association intended to provide a forum for discussion of problems and concerns, and for the exchange of ideas by librarians working with government documents.

OMB Watch
http://ombwatch.org/
OMB Watch monitors the deletion of government web pages.

Federation of American Scientists
http://www.fas.org/sgp/
The FAS maintains this Project on Government Secrecy.

Updates From FAST and Google

FAST is now using a new relevance ranking algorithm that the company says improves results by 12% in internal testing. FAST's technology, available via its AlltheWeb.com site, Lycos, and numerous other portal sites, also now includes Microsoft Word documents in search results.

Separately, Google announced a new wireless image search service that enables Sprint PCS Vision customers to search and view Google's collection of nearly 400 million web images.

FAST Relevance Press Release
http://www.fastsearch.com/press/press_display.asp?pr_rel=210

Google Image Search Press Release
http://tinyurl.com/3nju

Search Headlines

NOTE: Article links often change. In case of a bad link, use the publication's search facility, which most have, and search for the headline.

Online marketing news
Bye Telemarketing, Hi More Spam?...
Wired News Dec 19 2002 11:21AM GMT
Online search engines news
Verity, Inktomi close deal...
CNET Dec 19 2002 4:37AM GMT
Domain name news
CNN wins massive domain dispute...
Demys Dec 19 2002 1:59AM GMT
Online portals news
The Problem With Portals...
CIO Insight Dec 18 2002 11:52PM GMT
Online search engines news
FAST Search Engine Now Indexes MS Word Documents...
URLwire Dec 18 2002 8:00PM GMT
Sprint, Google offer wireless image search...
CNET Dec 18 2002 6:50PM GMT
Online legal issues news
Bush OKs law to put government information online...
Freedom Forum Online Dec 18 2002 3:16PM GMT
Online portals news
Yahoo to charge for some news searches...
ZDNet Dec 18 2002 12:33PM GMT
Online marketing news
Are Pop-Ups Doomed?...
AtNewYork Dec 18 2002 9:59AM GMT
Internet: international news
New Study Finds Internet May Not Be Opening Up China...
VOA Dec 18 2002 8:45AM GMT
Online search engines news
Beeb website beats Britney in search engine top 10...
Independent Dec 18 2002 7:06AM GMT
Technology features
Copyright verdict, new technology are reasons to hope...
SiliconValley.com Dec 18 2002 5:40AM GMT
XML and metadata news
XML on a chip?...
ZDNet Dec 18 2002 5:35AM GMT