The Center for Democracy and Technology and OMB Watch issued a report last week detailing the difficulty that commercial search engines have finding information on federal Web sites.
The report’s authors found that searches on Google, Yahoo, Microsoft’s Live Search, Ask.com and USA.gov missed critical information because of the way agencies publish data online. For example, when the authors searched for “government telecommunications contracts,” the search engines didn’t turn up anything from FedBizOpps.gov, Export.gov, GovSales.gov, the Central Contractor Registration site or the Federal Procurement Data System.
In another case, the authors searched for “Smithsonian African mask collection.” The engines didn’t find relevant material in the Smithsonian Institution collection or the Library of Congress online catalog.
“It is unclear whether these agencies know that their information is not publicly searchable and have not taken adequate steps to change their practices or whether the agencies simply do not know that this important information is not being crawled,” said Ari Schwartz, deputy director of the Center for Democracy and Technology. “Our findings show that this is a systemic problem that should be addressed as soon as possible.”
Jason Miller
Five years after the E-Government Act’s implementation, a glaring shortcoming of the legislation has been agencies’ failure to make federal information more accessible. The legislation’s sponsors, led by Sen. Joseph Lieberman (I-Conn.), say they plan to address that issue in the bill reauthorizing the act.
The E-Government Act, which President Bush signed into law Dec. 17, 2002, led to many significant information technology advances in cybersecurity, privacy and governance, including establishing an executive position in the Office of Management and Budget to oversee IT issues. But agencies have been less successful in implementing the law’s Section 207 provisions for making agency data easier for users to find.
Agencies do not let commercial search engines index their sites, said Lieberman, chairman of the Homeland Security and Governmental Affairs Committee.
“Our intention is for everything, to the maximum extent possible, to be easily available, except for personal information and classified data,” Lieberman said during a hearing last week on the E-Government Reauthorization Act. “There are more than 2,000 federal government Web sites not included in commercial search engine results. Is it accidental, or is there a policy, or it is just laziness? I would like to know why.”
Lieberman’s follow-on e-government bill would help solve the problem by requiring agencies to annually review, report on and test the accessibility of their information via commercial search engines and to ensure that information is accessible in that manner two years from now. It also would require OMB to develop relevant guidance and best practices.
The reauthorization bill also would renew several other provisions of the original law through 2012, including the E-Government Fund and appropriations for developing protocols for geographic information systems.
There still is no House companion bill for the E-Government Reauthorization Act, but Lieberman’s staff members have been working with their counterparts on the House Oversight and Government Reform Committee to create one.
Even so, the search problem is far from solved, and it continues to frustrate commercial and federal search experts.
“The government produces a lot of information, and those databases cannot be navigated by Web crawlers,” said John Needham, Google’s manager for public-sector content partnerships. “Agencies are concerned more about how information is presented than if users are finding it.”