Semantic Aggregation And Hive Intelligence

The next iteration of the World Wide Web is, as always, on my mind. I wrote about the future of semantic search in my last post, Semantic Search Goes To Work, and since then I have been reading about little else. How will we collect and categorize information? Will we group whole sites, or sections of sites, into new meta-sites that span entire lanes of this Information Highway? Will we form virtual web sites by pulling together information found across the Web and give these virtual sites new names to make them accessible to people and computers around the world? I know that some companies, like Powerset and Google, are working toward this end. You might even argue that Google is already a semantic search company because of how it reads web pages and serves advertising on the AdWords/AdSense network. I think this is partially true, but they are not offering you and me a semantic experience on the Google search engine.

Last week I was contacted by the team at LOUD3R. This site, or collection of sister sites, is not a semantic search engine, but rather a network of collected, sorted and categorized sites in topical sections. As they said to me:

(We) are not really search-based, but are focused more on finding and aggregating news for particular subjects. (We’ve) built a network of sites, each dedicated to a different niche topic (sneakers, cricket, baseball, etc.) The real goal is to cut down on the noise and bring the best, fresh news to enthusiasts of a given topic.

It turns out that LOUD3R is using “semantic language processing and human editorial input” to restrict and direct the flow of information and make it more relevant. This means that humans are making sure that this information is pertinent to the goal of the site, as well as restricting any information from the sites that the LOUD3R team doesn’t care for. Nothing wrong with being selective when it comes to the information they are willing to serve.
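To make the idea of machine sorting plus human editorial input a little more concrete, here is a minimal sketch in Python. It is not LOUD3R's actual method, which I have not seen. It assumes a simple keyword match stands in for the semantic language processing, and a hand-maintained blocklist stands in for the editors. The names `TOPIC_KEYWORDS`, `BLOCKED_SOURCES`, `classify` and `aggregate`, along with the example sources, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    source: str


# Hypothetical topic vocabularies. A real system would use much richer
# language processing than bare keyword sets.
TOPIC_KEYWORDS = {
    "wine": {"wine", "vineyard", "merlot", "vintage"},
    "environment": {"climate", "recycling", "solar", "green"},
}

# Hypothetical editorial decision: sources the human editors have rejected.
BLOCKED_SOURCES = {"spam-blog.example"}


def classify(item):
    """Return the topic sharing the most keywords with the title, or None."""
    words = {w.strip(".,!?:;").lower() for w in item.title.split()}
    scores = {topic: len(words & kws) for topic, kws in TOPIC_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def aggregate(items):
    """Group headlines into per-topic sites, skipping blocked sources."""
    sites = {topic: [] for topic in TOPIC_KEYWORDS}
    for item in items:
        if item.source in BLOCKED_SOURCES:
            continue  # the human editorial veto comes before the machine
        topic = classify(item)
        if topic is not None:
            sites[topic].append(item.title)
    return sites
```

Even in a toy like this the division of labor shows: the machine does the bulk sorting, and the humans decide which sources are allowed in at all.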

A few examples of LOUD3R sites: DECANT3R.com is devoted to wine and the wine making world, GLACI3R.com is all about the environment and being green, and the timely SUMM3R.com focuses solely on the Summer Olympics in China. The network sites all follow the same basic look and feel principles, which only makes sense as the CEO and founder, Lowell Goss, was the User Interface chief at Yahoo once upon a time. I am sure we will see more networks like the LOUD3R system spring up.

Where LOUD3R aggregates news posts from around the Internet, classifies them and places them on various niche sites, other sites take a different approach. Both Alltop and popurls have built networks that grab headlines from around the Internet and present them to users in an organized format. Alltop began its life after the gang at Truemors watched the popurls site send them as many visitors as Google did. Suddenly they wanted to roll their own site of aggregated feeds.

Everything from Aviation to Mixed Martial Arts to Yoga gets categorized and fed via RSS streams from blogs, news sites and even applications like Twitter. Readers can hide sources they would rather not see, but other than that you are pretty limited in what you can and cannot do with the sites. Where the Alltop system really shines is in how the community has helped to build the lists of sites. In fact, the Alltop team specifically mentions the Twitter community in this light. I also appreciate the Alltop crew’s candor on the About page:

If you’ve gotten the impression that Alltop is not based on computer algorithms or popular voting, you’d be right. We are highly subjective and judgmental.

Nothing wrong with that at all. I know, you thought I would rail on and on about how we need to get the machine to offer us the unbiased view of the sites listed, right? Sure, that is exactly what I want, but I think sites like Alltop and popurls offer an excellent first step. We can’t get the machine anywhere near the correct rules unless we understand them ourselves. Letting the hive, that is, the Internet community itself, decide how information gets sorted and moved from here to there is the only way that we can eventually get the true semantic search engines to understand and provide us with real, tangible results.

Matt Williamson
http://twitter.com/mattwilliamson
matt at technologystory.com
