Surface vs. Deep Web

We recently discussed search engines (http://www.langa.com/newsletters/2001/2001-01-04.htm#2 , http://www.langa.com/newsletters/2001/2001-01-08.htm#1 , http://www.winmag.com/columns/explorer/2001/01.htm ), and that prompted reader Rod Padrick to write about an amazing site he found:

http://www.completeplanet.com/ 

One of the pages, "Deep Web Sites," indicates that the 60 largest known deep Web sites contain about 750 terabytes of data (on an HTML-included basis), or roughly 40 times the size of the known surface Web. These sites span a broad array of domains, from science to law to images and commerce. The total number of records or documents within this group is about 85 billion.

Basically, the folks at BrightPlanet found that "Deep Web sources store their content in searchable databases that only produce results dynamically in response to a direct request." Ordinary "spider" indexing of "surface" web sites misses this content, which BrightPlanet says is truly vast:

  • Public information on the deep Web is currently 400 to 550 times larger than the commonly defined World Wide Web
  • The deep Web contains 7,500 terabytes of information, compared to 19 terabytes of information in the surface Web
  • The deep Web contains nearly 550 billion individual documents compared to the 1 billion of the surface Web
  • More than an estimated 100,000 deep Web sites presently exist
  • 60 of the largest deep Web sites collectively contain about 750 terabytes of information – sufficient by themselves to exceed the size of the surface Web by 40 times
  • On average, deep Web sites receive about 50% greater monthly traffic than surface sites and are more highly linked to than surface sites; however, the typical (median) deep Web site is not well known to the Internet search public
  • The deep Web is the largest growing category of new information on the Internet
  • Deep Web sites tend to be narrower with deeper content than conventional surface sites
  • Total quality content of the deep Web is at least 1,000 to 2,000 times greater than that of the surface Web
  • Deep Web content is highly relevant to every information need, market and domain
  • More than half of the deep Web content resides in topic specific databases

It’s amazing reading, and you’ll find the full report at http://www.completeplanet.com/tutorials/deepweb/index.asp .
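
As a rough sanity check, the size figures quoted above can be compared against one another with a few lines of arithmetic. The sketch below uses only the numbers cited from BrightPlanet; the variable names are made up for illustration, and it assumes the terabyte and document counts are directly comparable, which the report itself may qualify.

```python
# Figures as quoted from the BrightPlanet report (not independently verified).
DEEP_WEB_TB = 7_500          # total deep Web size, in terabytes
SURFACE_WEB_TB = 19          # total surface Web size, in terabytes
TOP_60_SITES_TB = 750        # the 60 largest deep Web sites, in terabytes
DEEP_WEB_DOCS = 550e9        # individual documents on the deep Web
SURFACE_WEB_DOCS = 1e9       # individual documents on the surface Web

# How many times larger is the deep Web than the surface Web, by size?
size_ratio = DEEP_WEB_TB / SURFACE_WEB_TB

# How do the 60 largest deep Web sites alone compare to the surface Web?
top60_ratio = TOP_60_SITES_TB / SURFACE_WEB_TB

# How many times more documents does the deep Web hold?
doc_ratio = DEEP_WEB_DOCS / SURFACE_WEB_DOCS

# What share of the deep Web do the 60 largest sites account for?
top60_share = TOP_60_SITES_TB / DEEP_WEB_TB

print(f"Deep vs. surface, by size:      about {size_ratio:.0f}x")
print(f"Top 60 sites vs. surface:       about {top60_ratio:.0f}x")
print(f"Deep vs. surface, by documents: about {doc_ratio:.0f}x")
print(f"Top 60 sites' share of deep:    about {top60_share:.0%}")
```

By size, the ratio works out to a bit under 400, and by document count to 550, which lines up with the "400 to 550 times larger" claim. The 60 largest sites come out at roughly 39 times the surface Web, in line with the "40 times" figure, and at about a tenth of the deep Web overall.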



Fred Langa

About Fred Langa

Fred Langa is senior editor. His LangaList Newsletter merged with Windows Secrets on Nov. 16, 2006. Prior to that, Fred was editor of Byte Magazine (1987 to 1991) and editorial director of CMP Media (1991 to 1996), overseeing Windows Magazine and others.