
Semantic Search Goes To Work

Posted by Matt Williamson on August 3rd, 2008 in technology

We are increasingly tied to data. Business has always been about data: whether it is growing crops, trading money, or building cars, every business is really in the business of knowing how to do something.  What we are only now getting good at is making sense of the massive amounts of information that our corporations produce on a daily basis.  That data can take many forms: sales metrics, tracking information, emails, office documents, calendar entries, instant messages, presentations, or the continuous stream of analytical information from dashboards. But no matter the source, all of it needs to be captured, stored, and evaluated to some degree.

Losing this information to the dark corners of our networks, hard drives, and cold storage can no longer be tolerated. Enterprise search has been around in some form or another for more than a decade, but now, with semantic search engines on the horizon, we are poised to really gain some ground.

Semantic search differs greatly from the traditional Internet search engine.  Today we use Google, MSN, Yahoo, and the rest to find sites where we can then look for information; soon we will use semantic search to find the information itself.  Instead of running algorithms to make a plausible guess about the character of the information, a semantic search engine would attempt to ‘understand’ it.  To do so, the engine must discern what each word means in conjunction with the surrounding words, and then build out a contextual relationship of what the words, sentences, and paragraphs could mean.

Let’s look at this with a real-world example.  If I tell you that I am hungry and that I am looking for a good orange, you know that I mean the fruit.  But most of today’s search engines will pull up any reference to the word ‘orange’.  The fruit and the color use the same word, so how do we differentiate? A rule here could be that if the engine finds words related to food, it adds a point in that direction.  So there might be a list of known words associated with food.  If words like citrus, fresh, juice, mandarin, blood, or navel appear before the word, we are searching for the fruit.  If the word burnt is in front of it, then we mean the color and not the fruit.
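To make that rule concrete, here is a minimal Python sketch of the point-scoring idea. It is only an illustration, not how Powerset or any real engine works. The fruit list is the one given above and the color list holds just the word burnt; the function name, the three-word window, and the 'unknown' fallback are assumptions made for the example.

```python
import re

# Cue words taken from the post; a real engine would need far larger lists.
FRUIT_CUES = {"citrus", "fresh", "juice", "mandarin", "blood", "navel"}
COLOR_CUES = {"burnt"}


def guess_orange_sense(text, window=3):
    """Guess whether the first 'orange' in text is the fruit or the color.

    Looks at up to `window` words before 'orange' and adds one point per
    cue word found. The window size is an arbitrary choice for this sketch.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if "orange" not in words:
        return "unknown"
    i = words.index("orange")
    before = words[max(0, i - window):i]
    fruit_score = sum(1 for w in before if w in FRUIT_CUES)
    color_score = sum(1 for w in before if w in COLOR_CUES)
    if fruit_score > color_score:
        return "fruit"
    if color_score > fruit_score:
        return "color"
    return "unknown"


print(guess_orange_sense("I want a fresh navel orange"))
print(guess_orange_sense("The walls were painted burnt orange"))
```

Note that the sentence "I am hungry and looking for a good orange" contains none of the listed cue words, so this sketch would answer 'unknown' for it. That gap is exactly why a semantic engine needs broader context than a short word list.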

When Microsoft acquired Powerset last month, it added a whole new dimension to its Search Relevance team.  Powerset made a name for itself by allowing web users to search through Wikipedia.  I picked something that I thought would be a good trial: how old was Tesla when he died?  Google could not provide me with a result, while Powerset offered me 458 results.  On the very first entry I could see that Tesla died impoverished at the age of 86.  Good work.

Google is excellent at finding data and then letting me perform the research.  Powerset seems to be very good at doing some of that legwork for me. True, I could have just asked Google to find me anything at all about Tesla on the net, but that result set is 620,000 strong.  It is not too likely that I will read all of those pages to find my answer.

Semantic search will change how we interface with the Internet at the core, and that should change the way we think of data in the office as well.  As those same technologies roll into the enterprise we will see a different paradigm emerge.  No longer will we drop countless documents into directories on the shared network drives and forget them.  We will instead place them in that drive and let a local search engine index them and serve them back to us when we either request them or need them.

We are all familiar with the idea that we can perform the search ourselves to find data, but more and more we are able to set up rules and let the search engines notify us when something we care about is found on the Internet.  In the office we will one day set up some of those same rules, or use some that are set up for us by the search provider.  I have yet to see any products that are ready for prime time on the company network, so before you get close to a purchase, run that application on your data to see what happens.

I can see the CFO using a rule set that sends him an email alert when a document with expenditures is found and indexed.  Perhaps the marketing director wants to know when a document is found that contains user feedback about the corporation or even a competitor.  This is starting to get interesting.
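As a rough sketch of what such a rule set might look like, here is a small Python example that checks a newly indexed document against a list of alert rules. Everything in it is hypothetical: the AlertRule class, the email addresses, and the keyword lists are invented for illustration. It also uses plain keyword matching as a stand-in, where a semantic engine would presumably match on meaning.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    recipient: str        # who gets notified (made-up address)
    keywords: frozenset   # any one of these words triggers the rule


def rules_triggered(document_text, rules):
    """Return the recipients whose rules match a newly indexed document."""
    words = set(re.findall(r"[a-z]+", document_text.lower()))
    return [rule.recipient for rule in rules if words & rule.keywords]


# Hypothetical rules for the two roles mentioned in the post.
RULES = [
    AlertRule("cfo@example.com", frozenset({"expenditure", "expenditures"})),
    AlertRule("marketing@example.com", frozenset({"feedback", "competitor"})),
]

for recipient in rules_triggered("Q3 expenditures by department", RULES):
    print("notify", recipient)
```

In a real deployment the indexer would call something like rules_triggered each time it processes a file and hand the recipients to a mail system; that wiring is left out here.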


Matt
@mattwilliamson

