Interview with Michael K. Bergman of BrightPlanet.com
Information Overload

By Wendy Boswell, About.com

Interview with Michael K. Bergman, CTO of BrightPlanet.com

This interview references a study released by BrightPlanet.com about the problem of information retrieval and how much time and money we all spend trying to find our "stuff." For more background, please refer to page one of this article, or, if you are a visual person, we have a supplemental graph that puts this article in a tangible form.

Could you put in a small nutshell what this study is about?

The study is about the significant intellectual effort devoted to finding information and creating text documents within enterprises, and the waste that arises when that work cannot be found later and has to be re-created.

Why do you believe that this is such a huge problem?

I think this is one of those classic problems of documents and their creation being so ubiquitous that it is like putting your finger between your eyes one inch from your nose: It is so close it is not seen.

Everyone is well familiar with the general anxiety of information overload. That such a daunting problem can be systematically attacked for higher productivity and benefit almost seems too overwhelming.

We like the analogy we present in the paper to the document problem being akin to the development and growth of data warehousing for structured data. Major shifts will occur in both better use of existing documents and adding value (categorization, entity extraction, semantic Web-type stuff) in the next few years.

What are some ideas you have for addressing this problem?

IMHO, there are a few dimensions to the problem and areas that need to be addressed (most of which BrightPlanet is tackling):

  1. Scalable, affordable document management systems -- to date, existing approaches are too expensive, take too long to set up and maintain (therefore very high TCOs), and don't scale to the realistic document volumes of large enterprises, which run into the many millions.
  2. Document use and automation have many potential functional pieces -- search, harvest, categorize, entity ID, metadata, archiving, collaboration, versioning, languages, file formats, security and access rights, dynamic real-time access, text mining, subset extraction, etc., not all of which are of interest to every enterprise. Moreover, documents need to be connected with other data, including structured numeric data, streaming media, etc. No vendor has all of the pieces. Thus, componentized, interoperable functionality that plays nice in the sandbox with other tools is imperative. To date, too many approaches are proprietary and monolithic.
  3. Greater awareness. I know it is a kiss of death to say the market needs to be educated, but in this instance I think that is truly the case. The market needs to be able to move from a generalized situational anxiety about "document overload" to one where quantification, measurement, ROI and TCO can be calculated and justified. Papers such as the one we just issued, plus other efforts by ourselves and others, will continue to address this issue.

Where do you see this going in the next few years if not addressed?

I see the continuation of the huge wastes documented in the study. However, I (and others) will continue to beat the tom-toms on this, quantifying the problem to scales that warrant executive attention, to the point that I think we have likely seen the low-water mark on this issue. Over the next 10-15 years I think we will see a revolution in what we refer to internally as DIDIA -- document intelligence, document information automation.

Closing comments?

Yeah, there is a cool book by David Levy called "Scrolling Forward: The Role of Documents in the Digital Age." The first few chapters, especially, point to the ubiquity of documents and their role in our changing economy (the latter chapters are not as compelling).

If you'd care to, you may also want to occasionally check out my newly released blog, mkbergman.com, for additional analysis and thoughts on the document assets problem.

©2009 About.com, a part of The New York Times Company.

All rights reserved.