The deep Web, also called the
invisible Web and the hidden Web, has an aura of secrecy and mystery, conjuring
up images of private caches of supremely useful information beyond the reach of
mortal Web surfers.
The reality is much more
pedestrian.
The deep Web simply consists of
information accessible over the Web but not accessible through ordinary search
tools, including Google and Yahoo. Such search engines can’t find it for two
main reasons: It’s stored within databases and retrievable only with a
particular site’s search tool; or it resides at sites that require registration
or subscription.
How much information lies below
the Web’s gleaming surface? The answer depends on what you read. One estimate
is that the deep Web is about twice as big as the surface Web. Another—and
frequently repeated—estimate has the deep Web about 500 times bigger, but this
number comes from BrightPlanet Corp. (www.brightplanet.com), a company that
sells a program for accessing the deep Web.
What’s Down There
Deep Web information is usually
narrow and specialized. Nuclear Explosions Database (www.ga.gov.au/oracle/nukexp_form.jsp)
is typical. A free offering of the Australian government, it lets you search
for the location, time, and size of nuclear explosions worldwide since 1945.
Other examples of deep Web
information include data found in professional directories and phone books,
laws and patents, items for sale at a Web store or Web auction site such as
eBay, archived magazine and newspaper articles, job postings, and stock and
bond prices.
Maybe the best way to get a feel
for the deep Web and what it can do for you is to visit several of the
database sites that store much of its information. Unlike regular Web sites,
these database sites create pages on the fly based on what you search for.
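To see why such pages stay out of reach of link-following search engines, consider a minimal Python sketch. Everything in it is hypothetical: the in-memory database, the placeholder records, and the names build_database and render_results are invented for illustration and do not describe how any site mentioned here actually works.

```python
import html
import sqlite3


def build_database():
    """Create a tiny in-memory database holding made-up placeholder records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (topic TEXT, title TEXT)")
    conn.executemany(
        "INSERT INTO records VALUES (?, ?)",
        [
            ("patents", "Placeholder patent record"),
            ("jobs", "Placeholder job posting"),
            ("patents", "Another placeholder patent record"),
        ],
    )
    return conn


def render_results(conn, topic):
    """Build a results page for one query; no such page is stored anywhere."""
    rows = conn.execute(
        "SELECT title FROM records WHERE topic = ? ORDER BY title", (topic,)
    ).fetchall()
    # Escape user-supplied and stored text before placing it in the page.
    items = "".join(f"<li>{html.escape(title)}</li>" for (title,) in rows)
    return f"<h1>Results for {html.escape(topic)}</h1><ul>{items}</ul>"


if __name__ == "__main__":
    conn = build_database()
    print(render_results(conn, "patents"))
```

The results page is assembled only when someone submits a query. A search engine that simply follows links from page to page never types anything into the search box, so it never sees the page.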
CompletePlanet.com (www.completeplanet.com),
from BrightPlanet, is a directory of more than 70,000 searchable databases. You
can’t search through all the databases simultaneously, but you can search for
appropriate databases and then search through them individually.
Although it’s useful,
CompletePlanet.com hasn’t been updated since 2004. Much information on the Web
about the deep Web is even older, with many links no longer working and sites
mentioned no longer existing. (This is a common problem in general when using
the Web for research, of course. Always check for a “date last updated” notice
to help ensure that whatever page you’re reading doesn’t include old and
obsolete information.)
Another site frequently
recommended for accessing deep Web sites is InfoMine (infomine.ucr.edu),
an offering from the University of California at Riverside, with federal
government support. It’s maintained by librarians and designed for
university-level research. Many of the databases it accesses are fee-based
compilations of articles in scholarly journals.
Yahoo is currently testing a tool
to let you quickly get at information stored at multiple pay sites. With Yahoo
Subscriptions (search.yahoo.com/subscriptions),
you can now search through nine subscription sites, including
Consumer Reports, The Wall Street
Journal, the New England Journal of Medicine, and LexisNexis. You’ll
need a paid subscription to any given site, however, for full access to
its information.
Google has also made strides in
helping people access deep Web information. The information stored in PDF files
created by Adobe Acrobat used to be considered part of the deep Web, for
instance. But ever since Google started indexing such documents, this material
has migrated from the deep Web to the surface Web.
One of the more intriguing deep
Web tools is Turbo10 (turbo10.com).
It lets you search through nearly 1,000 deep Web and other sites by typing a
search query once, just as with Google or Yahoo. You have the option of
creating your own sublist of these sites and searching only through them, which
can be helpful if you repeatedly do similar types of searches. The brains
behind this advertising-supported site are Nigel and Megan Hamilton, a
brother-and-sister team in London.
Much deep Web information resides
in U.S. government databases, the U.S. government being the world’s largest
publisher. FirstGov (www.firstgov.gov)
is a searchable portal to such government data as economic forecasts, industry
reports, government regulations, and new legislation.
ScienceGov (science.gov),
a part of FirstGov, is a searchable portal to scientific papers and technical
data generated by 17 U.S. government science organizations within 12 different
federal agencies.
Depending on your purposes,
accessing the deep Web can be an important part of any given search strategy.
Reid Goldsborough is a
syndicated columnist and author of the book Straight Talk About the Information Superhighway.
He can be reached at reidgold@netaxs.com or members.home.net/reidgold.