Enterprise search tools

Tools for searching through oceans of corporate information have been around since the dawn of the IT industry. Initial mechanisms for extracting fixed-length records from ‘flat files’ or hierarchical databases gave way in the 1980s to relational database querying through the Structured Query Language (SQL).

While that has since been the primary means of getting at structured data (ie data that can be held in tables), there have been parallel attempts to provide the ability to search through and extract data held in unstructured formats: text documents, emails, intranets and so on. Early and specialist text retrieval products, such as BASIS and STATUS, have given way to Internet-based technologies such as Autonomy and Verity.

But it is not just a matter of finding the documents: these enterprise search technologies endeavour to provide context to the mass of potentially valuable internal data, retrieving a combination of structured and unstructured data in order to address a specific business query.

Research group Ovum predicts that the enterprise search market is set to virtually double in the next five years, fuelled by a number of factors. Part of this growth will be driven by end-user dissatisfaction with current offerings. Increasingly, users want search facilities that are equivalent to Internet search services.

Indeed, Internet search engine companies are looking to move into the enterprise market. Microsoft has also been gearing up its assault; it expects to have its first products out by the end of 2004. At the same time, database vendors, such as IBM and Oracle, have been broadening their databases to handle structured and unstructured data.

A key driver behind much of this activity has been the increased regulatory pressures on organisations, requiring them to be able to search through and make sense of their vast information reserves.


Search terms glossary

  • Concept search: Finds terms conceptually related to the query words, rather than only the words themselves.

  • Fuzzy search: Finds matches even if terms are misspelled.

  • Keyword search: Finds exact matches for search terms.

  • Precision: Degree to which a search engine produces documents that match a query.

  • Relevance: How well a result provides the information the user is looking for.
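The difference between keyword and fuzzy search in the glossary above can be illustrated with a minimal sketch. It assumes a tiny, made-up index (`INDEX`, with invented terms and document IDs) and uses the standard-library `difflib` similarity ratio as a stand-in for whatever matching a commercial product actually applies.

```python
from difflib import get_close_matches

# Hypothetical mini-index: indexed words mapped to document IDs (illustrative only).
INDEX = {
    "compliance": ["doc1", "doc3"],
    "regulation": ["doc2"],
    "retrieval": ["doc3"],
}

def keyword_search(term):
    """Return documents only when the term matches an indexed word exactly."""
    return INDEX.get(term.lower(), [])

def fuzzy_search(term, cutoff=0.8):
    """Return documents for indexed words that closely resemble the term."""
    close = get_close_matches(term.lower(), list(INDEX), n=3, cutoff=cutoff)
    docs = []
    for word in close:
        for doc in INDEX[word]:
            if doc not in docs:  # keep each document once, in first-seen order
                docs.append(doc)
    return docs

print(keyword_search("complience"))  # misspelled: no exact match
print(fuzzy_search("complience"))    # close enough to "compliance" to match
```

The `cutoff` value is an arbitrary choice for the sketch; lowering it trades precision for tolerance of heavier misspellings.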

Quotes: Searching for the right words

“Enterprise search technology is all about allowing the user to leverage results in a meaningful way to either discover new insights or find the best answer to their question.”
John Felahi, senior director of product marketing at FAST.

“The challenge is to get the business world to see search as something above the utility level. The capabilities of intelligence or advanced search need to become more ubiquitous in the organisation.”
Eric Woods, research director at Ovum analysts.

“The enterprise search market is relatively small, so why does Google want to get into it? They would need to spend money to get their technology ready for the enterprise. And that does not seem to make good business to me.”
Simon Atkinson, managing director of Verity UK.

“This market has tremendous potential, but it has been under-served. Businesses are reluctant to spend the effort and the money to deploy new products. Our approach is to bring to the market a product that has a much lower total cost of ownership, is faster and easier to install, and provides much more satisfaction to employees.”
Dave Girouard, general manager of Google’s enterprise business.

   
 

Splintered market

For 2003, IT market research group Ovum valued the enterprise search software market at $460 million. It forecasts that demand will reach $800 million by 2008, a compound annual growth rate of about 12%. But the sector remains extremely fragmented. At the end of 2003, for example, analyst IDC reported that market leader Verity held a market share of 17%. Following Verity, in order of market share, were Autonomy, FAST, Convera, Open Text, Caatoosee, IBM and Hummingbird.
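As a quick check on the growth figures quoted above, the compound annual growth rate can be recomputed from the two market valuations. This is only the standard CAGR formula applied to the article's numbers; the function name `cagr` is chosen here for illustration.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.12 means 12%)."""
    return (end_value / start_value) ** (1 / years) - 1

# $460 million in 2003 growing to $800 million in 2008, i.e. over five years.
rate = cagr(460, 800, 5)
print(f"{rate:.1%}")
```

The result comes out at a little under 12%, consistent with the rounded figure in the forecast.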

Key vendors and their products:

Vendor: Main product
Autonomy: Enterprise Search
Caatoosee: DQ Server
Convera: RetrievalWare Search
Coveo: Enterprise Search
Fast Search & Transfer (FAST): ESP
Hummingbird: Search Server
IBM: Intelligent Miner for Text
IBM Lotus: Extended Search
ISYS Search Software: ISYS
Open Text: Livelink
Oracle: Oracle Text/Ultra Search
Verity: K2/Ultraseek

 

 
   

   
 

Search mechanics

Enterprise search technology provides results by analysing unstructured, semi-structured and structured information, automatically categorising this information and providing links. Context is then provided through summaries and statistical measurements of the retrieved information's relevance.

Classic search technology, such as an Internet search engine, uses mathematical algorithms to extract information from raw text. Typically it is based on pattern matching and on the probability theorem devised by the 18th-century English mathematician Thomas Bayes. Enterprise search technology combines Bayesian techniques with other algorithms to rank the relevance of retrieved information.

This ranking is based on two basic criteria: terms and links. The ‘term’ value is measured by where and how often the term occurs within documents. For example, the enterprise search software would consider whether the term is in the title of a document, how near the term appears to the beginning of the article, and the number of times the term appears in the article.
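The three term signals described above (title presence, position and frequency) might be combined along the following lines. This is a sketch under stated assumptions, not any vendor's actual formula: the function `term_score` and its weights `title_weight` and `position_weight` are invented for illustration.

```python
import re

def term_score(term, title, body, title_weight=3.0, position_weight=2.0):
    """Score one document for one term using title presence, position and frequency."""
    term = term.lower()
    words = re.findall(r"\w+", body.lower())
    positions = [i for i, w in enumerate(words) if w == term]

    score = 0.0
    if term in re.findall(r"\w+", title.lower()):
        score += title_weight  # the term appears in the title
    if positions:
        # An earlier first occurrence earns more: the full weight at the
        # first word, shrinking towards zero at the end of the text.
        score += position_weight * (1 - positions[0] / len(words))
        score += len(positions)  # number of occurrences in the body
    return score

a = term_score("search", "Enterprise search tools",
               "Search tools index documents so search is fast")
b = term_score("search", "Quarterly report",
               "Revenue grew and the search team expanded")
print(a > b)  # the first document scores higher on all three signals
```

Real products would normalise for document length and tune the weights; the additive form here is simply the easiest to read.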

The ‘link’ value is calculated by assessing the number of links to a particular document. The more links, the higher the information's ranking. Some of the latest enterprise search systems can recognise different languages and attempt a conceptual understanding of information so that ‘intelligent’ search results are produced. Other developments include the ability to restrict access to documents, so that users with insufficient privileges do not see those documents in any search results.
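The link value and the access restriction described above can be sketched together. The example assumes links are available as (source, target) pairs and that each document carries a set of permitted groups; the names `link_scores`, `visible_results` and `acl`, the sample data, and the deny-by-default rule for documents without an access list are all assumptions made for illustration.

```python
from collections import Counter

def link_scores(links):
    """Count inbound links per document from (source, target) pairs."""
    return Counter(target for _source, target in links)

def visible_results(doc_ids, links, acl, user_groups):
    """Rank documents by inbound links, hiding those the user may not read."""
    scores = link_scores(links)
    # Assumption: a document with no access list entry is hidden from everyone.
    allowed = [d for d in doc_ids if acl.get(d, set()) & user_groups]
    return sorted(allowed, key=lambda d: scores[d], reverse=True)

# Hypothetical intranet: three documents and the pages that link to them.
links = [("a", "policy"), ("b", "policy"), ("a", "payroll"),
         ("c", "handbook"), ("b", "payroll"), ("c", "payroll")]
acl = {"policy": {"staff"}, "payroll": {"hr"}, "handbook": {"staff", "hr"}}
docs = ["policy", "payroll", "handbook"]

print(visible_results(docs, links, acl, {"staff"}))  # payroll is filtered out
print(visible_results(docs, links, acl, {"hr"}))     # policy is filtered out
```

Filtering before ranking means a restricted document never appears in the result list at all, rather than appearing and then failing to open.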

 

 
   

   
 

Predictions: What comes next?

Analysts predict that organisations will expand their commitments to enterprise search, making it universally available to internal users.

  • Tighter regulatory compliance requirements, such as Sarbanes-Oxley and Basel II, will drive growth of the enterprise search market as companies increasingly need to be able to retrieve an array of information from a variety of document formats in a short time.

  • The amount of unstructured data in organisations is rising fast – in 2004 it is estimated that nearly 85% of all corporate data is unstructured. As companies are required to mine more unstructured information, the role of enterprise search technology in the organisation will become increasingly valued.

  • Enterprise search has found most favour in the pharmaceutical, academic and security services industries; increasingly it is being adopted within financial services companies.

  • The technological focus for vendors will be to combine usability with integration.

  • Desktop searching will become a major battleground, with Apple, Google and Microsoft all vying for market leadership. The battle is given added significance as analysts predict that the winner will be in a position to leverage their success in PC search to gain significant market share in the wider desktop market. It is likely that email searching will be an important factor in this fight.

 

 
   

   
 

Google’s extended search

Since launching its stand-alone corporate search tool in the US in 2000, Google has bided its time, progressively expanding the product's capabilities. Finally in autumn 2004, it began shipping the tool in Europe, and promised it will eventually be available globally.

Google's Search Appliance enables employees – and, potentially, customers – to search for information across the organisation's intranet as well as public web sites. It has the familiar look and feel of an Internet Google search.

One customer is merchant banking giant Morgan Stanley, which deployed the Search Appliance in 2003. It now provides intranet search for over 25,000 employees worldwide, and searches an index of 2.2 million documents from 200 different intranet services every day. Search traffic has increased by a factor of 11 in the past year.

Google is also in the initial phases of testing its Desktop Search product, which automatically records emails viewed and web pages visited in order to provide a ‘photographic memory’ of what has been viewed on the computer.

 

 
   

Ben Rossi

Ben was Vitesse Media's editorial director, leading content creation and editorial strategy across all Vitesse products, including its market-leading B2B and consumer magazines, websites, research and...
