When writing an article or blog post, I'll often note places where I should add a link and go back later to add them in. Sometimes searching for good resources to link to uncovers new information which I wish I had known while I was writing the article.
As I was walking around here at Mashup Camp, I saw a few web services which would allow me to do some data mining on raw text (like this blog post I'm writing) and search the web for relevant information.
I created a simple Ajax editor, hosted on Google App Engine. It takes the article's text and sends it securely (over HTTPS) to Open Calais, which extracts keywords, then disambiguates, dedupes, and scores them for relevance. Using this semantic web data from Calais, the editor then runs a web search on each keyword, requesting results from the Yahoo Search BOSS API in JSON format. The text analysis and web search run automatically when the editor detects that you have stopped typing. To avoid excessive and distracting search refreshes, the editor waits between three and eight seconds before performing the search.
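The "wait until typing stops" behavior is essentially a debounce. Here is a minimal sketch of that logic in Python, not the editor's actual code: the class name `IdleDebouncer`, its methods, and the fixed 3-second delay are all assumptions made for illustration (the real editor picks some delay between three and eight seconds, and runs in the browser). Timestamps are passed in explicitly so the logic is easy to follow and test.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IdleDebouncer:
    """Decides when a pause in typing is long enough to trigger a search.

    Hypothetical helper: the names and the default delay are assumptions.
    """
    delay_seconds: float = 3.0             # assumed; the post says 3 to 8 seconds
    last_keystroke: Optional[float] = None
    dirty: bool = False                    # True if text changed since the last search

    def on_keystroke(self, now: float) -> None:
        # Every keystroke restarts the idle timer and marks the text as changed.
        self.last_keystroke = now
        self.dirty = True

    def should_search(self, now: float) -> bool:
        # Nothing new to analyze: skip the search entirely.
        if not self.dirty or self.last_keystroke is None:
            return False
        # Still typing (or paused too briefly): keep waiting.
        if now - self.last_keystroke < self.delay_seconds:
            return False
        # Idle long enough: fire once, then wait for the next edit.
        self.dirty = False
        return True


debouncer = IdleDebouncer(delay_seconds=3.0)
debouncer.on_keystroke(now=0.0)
debouncer.on_keystroke(now=1.0)
print(debouncer.should_search(now=2.0))  # False: only 1 second idle
print(debouncer.should_search(now=4.5))  # True: 3.5 seconds idle
print(debouncer.should_search(now=9.0))  # False: nothing changed since the last search
```

Clearing the `dirty` flag after a search is what keeps the results from refreshing over and over while you sit and read them.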
You can try it out here:
Download the source code here.
This simple little mashup was well received, and I think part of the reason is its potential for growth. The fundamental idea is to have an environment in which relevant data is brought to you as you work. You don't even need to go info hunting; our computers can find relevant information and bring it to you automagically.
For this particular mashup app, I had envisioned a few additional features, which I leave as an exercise for the reader, should you so choose.
- A WYSIWYG editor to create rich HTML instead of just plain text. (I looked at TinyMCE but ran out of time.)
- Search on combinations of keywords instead of just individual items identified by Calais.
- Search other information sources, like news results, images, videos.
- Search on multiple search engines. (Google has an easy-to-use Ajax search API, but I wanted to try something that was totally new to me.)
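If you want to try the keyword-combination idea from the list above, here is one rough way it could look in Python. Everything in it is an assumption for illustration: the function name `combined_queries`, the shape of the input (a plain dict of keyword to relevance score, not Calais's real response format), the sample scores, and the choice to wrap each keyword in quotes so multi-word phrases stay together.

```python
from itertools import combinations
from typing import Dict, List


def combined_queries(scores: Dict[str, float], top_n: int = 4, size: int = 2) -> List[str]:
    """Build search queries from combinations of the most relevant keywords.

    Hypothetical helper: `scores` maps keyword -> relevance score.
    """
    # Keep only the top_n keywords, highest relevance first, so the number
    # of combinations (and therefore of searches) stays small.
    ranked = sorted(scores, key=lambda kw: scores[kw], reverse=True)[:top_n]
    # Quote each keyword so a phrase like "Mashup Camp" is searched as a unit.
    return [
        " ".join(f'"{kw}"' for kw in combo)
        for combo in combinations(ranked, size)
    ]


# Made-up scores, purely for demonstration.
sample = {"Mashup Camp": 0.9, "Google App Engine": 0.7, "Yahoo": 0.4}
for query in combined_queries(sample, top_n=3, size=2):
    print(query)
# "Mashup Camp" "Google App Engine"
# "Mashup Camp" "Yahoo"
# "Google App Engine" "Yahoo"
```

Capping the keyword count matters because the number of pairs grows quickly: 4 keywords give 6 pairs, but 10 keywords give 45, and each one is a separate search request.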