Crowdsourcing the semantics of numbers with True #



Numbers are the lifeblood of science and engineering, but it can be hard to appreciate what a number represents without extensive context. A company called True # wants to change that by providing a way to embed the semantic context of a number in commonly used documents.

Words have obvious meanings to most users, and even when knowledge fails, there's an obvious method to compensate: look it up. Numbers also have semantic values, but they're a lot harder to work with. Most of us can probably recognize 3.14159 and the conceptual baggage it carries, but how many of us would recognize 58.44? (That's a mole of sodium chloride, in grams, for the curious.) And the response that would work for words, "look it up," doesn't work so conveniently for numbers. Only one of the top 10 hits in Google refers to salt, and Bing fails entirely (though it does offer "Women's Sexy Mini Skirts by VENUS"). Clearly, we haven't figured out how to make the Web work for numbers in the same way it does for words.

Allen Razdow, who got his start developing MathCad, wants to change that, and he talked to Ars about his attempt, True #. Razdow said he was inspired by all the effort put into the semantic Web, which provides a variety of annotation and data exchange formats for information. The new company is providing a service that allows users to create HTML snippets that link back to a full description of the number. These snippets can be embedded in various document formats, like Word and PDF, and the company is offering plugins for embedding numbers from within Word, Acrobat, Eclipse, and Visual Studio. There will be a public database of numbers made available for free, or firms can pay to host a server for internal figures.

The service is currently in beta, but anyone can sign up for a free account that allows them to add numerical information to the system. The system has a real-time parser that tries to interpret your information as you type. This isn't a full natural language parser, as the differences in interpretation between the two images shown below clearly indicate. Still, the immediate feedback provides some indication of how well you're following the parser's expectations. Typing an "&" brings up a character picker that allows the entry of a wide variety of common mathematical and scientific symbols.

Finding a number turned out to be problematic, as the browsing and search functions don't work at all on a Mac. The default view into what the service calls the "numberspace" is a zoomable tree, with related concepts (like all the atomic weights, for example) grouped together on neighboring branches. This structure may work well during the developmental stages, but the tree is likely to get very crowded in short order. It is possible to zoom by plugging in various values, but to do that, you need to know how the service defines things like "context" and "property."

Assuming you can find the number of your choice, the site will happily place an HTML snippet on your clipboard, which can be pasted into most applications (although the process is a bit baroque in Word). In addition to a link back to the number's page on the True # site, the HTML contains some basic information about its context. So, in one example we viewed, the number "150 mi" was tagged with "Subject: Nissan EV-11" and "Property: approximate driving range (length)." This information should appear when you hover over a link embedded in a document; obviously, going to the site would provide more details, including the source of the figure.
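As a rough illustration of what such a snippet might look like, here is a minimal Python sketch that builds a link whose hover text carries the subject and property. The article does not show True #'s actual markup, so the function name, the use of the `title` attribute, and the URL are all assumptions made for illustration only.

```python
from html import escape


def number_snippet(value, subject, prop, url):
    """Build a link whose hover text carries the number's context.

    Hypothetical helper: the real True # markup is not shown in the article.
    """
    hover = f"Subject: {subject}; Property: {prop}"
    # Escape every piece so quotes or angle brackets cannot break the markup.
    return (
        f'<a href="{escape(url, quote=True)}" '
        f'title="{escape(hover, quote=True)}">{escape(value)}</a>'
    )


snippet = number_snippet(
    value="150 mi",
    subject="Nissan EV-11",
    prop="approximate driving range (length)",
    url="https://example.com/numbers/placeholder",  # placeholder, not a real True # URL
)
print(snippet)
```

Browsers generally show a link's `title` text as a tooltip on hover, which is one simple way to get the behavior described above; the real service may well do it differently.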

