Xerox researchers have developed a search tool, dubbed FactSpotter, that analyzes the underlying grammar of a text in order to infer additional information, such as whether an ambiguous word is being used as a noun or a verb, or to whom a pronoun refers. In other words, rather than relying solely on keyword matches, FactSpotter can return results that relate more directly to the search term.
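To make the difference concrete, here is a minimal toy sketch in Python. It is not FactSpotter's code or method: the documents, the hand-assigned part-of-speech tags, and the function names `keyword_search` and `pos_aware_search` are all hypothetical, chosen only to show how knowing a word's grammatical role can narrow a search.

```python
# Toy illustration only. The part-of-speech tags below are assigned by hand;
# a real system would derive them from grammatical analysis of the text.
DOCS = {
    "doc1": [("I", "PRON"), ("want", "VERB"), ("to", "PART"),
             ("book", "VERB"), ("a", "DET"), ("flight", "NOUN")],
    "doc2": [("She", "PRON"), ("wrote", "VERB"), ("a", "DET"),
             ("book", "NOUN"), ("about", "ADP"), ("grammar", "NOUN")],
}


def keyword_search(term):
    """Return IDs of documents containing the term, ignoring its grammatical role."""
    return sorted(
        doc_id
        for doc_id, tokens in DOCS.items()
        if any(word.lower() == term for word, _ in tokens)
    )


def pos_aware_search(term, tag):
    """Return IDs of documents where the term appears with the given part of speech."""
    return sorted(
        doc_id
        for doc_id, tokens in DOCS.items()
        if any(word.lower() == term and pos == tag for word, pos in tokens)
    )


if __name__ == "__main__":
    print(keyword_search("book"))            # matches both documents
    print(pos_aware_search("book", "NOUN"))  # matches only the noun usage
```

In this sketch a plain keyword match on "book" returns both documents, while the tag-aware query returns only the one in which "book" is a noun.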
The research team developed its own metalanguage to describe the grammars of different human languages. So far, the researchers have used it to build descriptions of Dutch, English, French, German, Italian, Portuguese, and Spanish. A joint Fujitsu-Xerox research team has also used it to describe Japanese grammar, showing that it also works for languages that use other writing systems.
FactSpotter itself is written in the C programming language, and the researchers have also developed modules in Java and Python, allowing the software to interface with other applications. Although the software only analyzes written language, it can be linked with audio transcription tools in order to search radio and TV archives, and the company is involved in joint research projects to do just that.
News source: InfoWorld