Xerox tool analyzes text to improve search results

Xerox researchers have developed a search tool, dubbed FactSpotter, that analyzes the underlying grammar of a text to infer additional information, such as whether an ambiguous word is being used as a noun or a verb, or to whom a pronoun refers. In other words, instead of basing results solely on keyword matches, FactSpotter can return results that relate more directly to the search term.

The research team developed their own metalanguage to describe the grammars of different human languages. So far, they have used it to build descriptions of Dutch, English, French, German, Italian, Portuguese, and Spanish. A joint Fujitsu-Xerox research team has also used it to describe Japanese grammar, showing that it can be used for languages using other writing systems.

FactSpotter itself is written in the C programming language, and the researchers have also developed modules in Java and Python, allowing the software to interface with other applications. Although the software only analyzes written language, it can be linked with audio transcription tools in order to search radio and TV archives, and the company is involved in joint research projects to do just that.

News source: InfoWorld


I wish/hope they will release a PHP module for this - it would definitely be used by much of the forum software written in PHP.

Anyway, it would be great to see the larger search engines adopt this technology too.

That is how I always wanted Google to be (well, to have as an option at least).
Too many of the pages it gives you lack the content I am searching for. Just because thousands of people link to a page doesn't make its content relevant or suitable for most people. The chances are higher, yes, but people looking for quality content have to be told where to get it at some point. Analyzing the text is a big step toward that. Maybe in the future we will see some sort of "AI" that understands well-written text, checks the facts stated in it against other pages on the web, and then reports whether the content seems usable.