Friday, December 5, 2008

Can you find what you're looking for?

Most archiving solutions do a great job of storing information efficiently for a long period of time. They vary in the ways they collect, store, and single-instance the data. Some buyers select an archiving solution for the way it collects and stores the data. Others select one for the user interface or for how easy it is to use and deploy. An area often overlooked is the technology behind the search that is used to find data within the archive.

Finding what you need within the archive in a timely manner is vital when that data is requested by the courts or other entities. When purchasing an archiving solution, most buyers use the search engine included with the product without even thinking about using another solution. Dedicated enterprise search solutions use advanced technology to analyze the language, meaning, and concepts in your data, and use that analysis to produce more relevant, better-ranked results.

Most would think that the best place to look for a good enterprise search solution is with the search leaders: start with Google and Yahoo, then maybe Microsoft. Although these players all offer excellent Internet search engines, they do not offer the level of advanced search that enterprise needs require.

A recent article posted on the ITPRO website in the UK talks about the differences between Internet search and enterprise search. For example, the relevancy rating in Google considers how many other pages link to a particular page. This does not apply when searching enterprise data stored on file servers, SharePoint sites, and in email archives. There, relevancy should be based on the meaning of the document's content along with the metadata stored with that document.
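To make the contrast with link-based ranking concrete, here is a minimal sketch of scoring documents by their content plus their metadata. It is only an illustration: it uses plain TF-IDF term weighting with an arbitrary boost for metadata matches, which is far simpler than the semantic analysis real enterprise search products perform. The function names, the `metadata_boost` value, and the sample documents are all made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, doc, corpus, metadata_boost=2.0):
    """Score one document against a query using content and metadata only.

    No link counts are involved. metadata_boost is an arbitrary,
    hypothetical weight for terms that also appear in the metadata.
    """
    body_counts = Counter(tokenize(doc["body"]))
    meta_terms = set(tokenize(" ".join(doc["metadata"].values())))
    total = 0.0
    for term in tokenize(query):
        # Rare terms across the corpus count for more (smoothed IDF).
        doc_freq = sum(1 for d in corpus if term in tokenize(d["body"]))
        idf = math.log((1 + len(corpus)) / (1 + doc_freq)) + 1.0
        total += body_counts[term] * idf
        if term in meta_terms:
            total += metadata_boost * idf
    return total


def search(query, corpus):
    """Return names of matching documents, most relevant first."""
    scored = [(score(query, d, corpus), d["name"]) for d in corpus]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


# Hypothetical archive contents, invented for this example.
corpus = [
    {"name": "retention-policy.doc",
     "body": "Email retention policy for legal hold requests",
     "metadata": {"author": "legal", "type": "policy"}},
    {"name": "lunch-menu.txt",
     "body": "Cafeteria lunch menu for the week",
     "metadata": {"author": "facilities", "type": "notice"}},
    {"name": "hold-notice.msg",
     "body": "Please preserve all email related to the contract",
     "metadata": {"author": "counsel", "type": "legal"}},
]

results = search("legal email", corpus)
print(results)
```

Note that the second message never uses the word "legal" in its body, yet it still ranks because its metadata marks it as a legal item. That is the kind of signal a link-counting approach has no way to use on archived files and email.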

Each solution described in the article uses a different technology to determine the meaning of the content and to base the relevancy of results on it. For example, Recommind uses statistical models built from the semantic analysis of existing data to produce relevant results, while Autonomy uses Bayesian statistical models that draw inferences from past searches to influence the relevancy of future results.

The article is a very interesting read if you are looking into search technology for your enterprise or to support your existing archive solution.

Here is the link: http://www.itpro.co.uk/608925/why-enterprise-search-is-not-internet-search
