
How do you think the “Quick Add” feature in Google Calendar works?

Question:

I'm thinking about a project that might need functionality similar to "Quick Add": parsing natural language into a structured form that captures at least some of its semantics. I'd like to understand this better, and I'm wondering what your thoughts are on how it might be implemented.

<hr />

If you're unfamiliar with what "Quick Add" is, check out <a href="http://www.google.com/support/calendar/bin/answer.py?hl=en&answer=36604#text" rel="nofollow">Google's KB</a> about it.

<hr />

<strong>6/4/10 Update</strong><br /> Additional research on "Natural Language Processing" (NLP) yields results which are MUCH broader than what I feel is actually implemented in something like "Quick Add". Given that this feature expects specific types of input rather than true free-form text, I'm thinking this is a much narrower application of NLP. If anyone could suggest a narrower topic that I could research rather than the entire breadth of NLP, it would be greatly appreciated.

That said, I've found a nice <a href="http://www.aaai.org/AITopics/pmwiki/pmwiki.php/AITopics/NaturalLanguage" rel="nofollow">collection of resources about NLP</a> including this great <a href="http://www.faqs.org/faqs/natural-lang-processing-faq/" rel="nofollow">FAQ</a>.

Answer1:

I would start by deciding on a standard way to represent all the information I'm interested in: event name, start/end time (and date), guest list, location. For example, I might use an XML notation like this:

<event>
  <name>meet Sam</name>
  <starttime>16:30 07/06/2010</starttime>
  <endtime>17:30 07/06/2010</endtime>
</event>
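In code, the same record could be held in a small structure and serialised to that notation. This is only a minimal sketch: `Event`, `to_xml` and `TIME_FORMAT` are hypothetical names, and reading `07/06/2010` as day/month/year is an assumption, since the example does not say which order is meant.

```python
from dataclasses import dataclass
from datetime import datetime
import xml.etree.ElementTree as ET

# Assumption: the example's dates are day/month/year.
TIME_FORMAT = "%H:%M %d/%m/%Y"


@dataclass
class Event:
    """Hypothetical in-memory form of one annotated diary entry."""
    name: str
    start: datetime
    end: datetime


def to_xml(event: Event) -> str:
    """Serialise an Event to the <event> notation used in the example."""
    root = ET.Element("event")
    ET.SubElement(root, "name").text = event.name
    ET.SubElement(root, "starttime").text = event.start.strftime(TIME_FORMAT)
    ET.SubElement(root, "endtime").text = event.end.strftime(TIME_FORMAT)
    return ET.tostring(root, encoding="unicode")


meeting = Event("meet Sam", datetime(2010, 6, 7, 16, 30), datetime(2010, 6, 7, 17, 30))
print(to_xml(meeting))
```

Guest list and location would be further optional fields on the same structure.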

I'd then aim to build up a corpus of diary entries about dates, annotated with their XML forms. How would I collect the data? Well, if I was Google, I'd probably have all sorts of ways. Since I'm me, I'd probably start by writing down all the ways I could think of to express this sort of stuff, then annotating it by hand. If I could add to this by going through friends' e-mails and whatnot, so much the better.

Now that I've got a corpus, it can serve as a set of unit tests. I need to code a parser to fit the tests. The parser should translate a string of natural language into the logical form of my annotation. First, it should split the string into its constituent words. This is called tokenising, and there is off-the-shelf software available to do it. (For example, see <a href="http://www.nltk.org/" rel="nofollow">NLTK</a>.) To interpret the words, I would look for patterns in the data: for example, text following 'at' or 'in' should be tagged as a location; 'for X minutes' means I need to add that number of minutes to the start time to get the end time. Statistical methods would probably be overkill here - it's best to create a series of hand-coded rules that express your own knowledge of how to interpret the words, phrases and constructions in this domain.
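To make the idea concrete, here is a rough sketch of what a few hand-coded rules and a corpus check might look like in Python. Everything in it is an assumption for illustration: the function names, the three patterns, the 24-hour time format and the two corpus entries are made up, and it matches regular expressions against the raw string rather than running a real tokeniser such as NLTK's. It is not how Google does it.

```python
import re
from datetime import datetime, timedelta

# Hypothetical hand-coded rules; a real rule set would be grown from the corpus.
TIME_RE = re.compile(r"\bat (\d{1,2}):(\d{2})\b")    # "at 16:30" (24-hour, unvalidated)
DURATION_RE = re.compile(r"\bfor (\d+) minutes?\b")  # "for 60 minutes"
# 'at' or 'in' followed by capitalised words (later words may start with a digit).
LOCATION_RE = re.compile(r"\b(?:at|in) ([A-Z][\w']*(?: [A-Z0-9][\w']*)*)")


def _cut(text, match):
    """Remove the matched span and tidy the whitespace left behind."""
    return " ".join((text[:match.start()] + " " + text[match.end():]).split())


def parse_quick_add(text, day):
    """Apply each rule in turn; whatever text is left over becomes the name."""
    event = {"name": None, "start": None, "end": None, "location": None}

    m = TIME_RE.search(text)
    if m:
        event["start"] = day.replace(hour=int(m.group(1)), minute=int(m.group(2)))
        text = _cut(text, m)

    m = DURATION_RE.search(text)
    if m:
        # 'for X minutes' only yields an end time when a start time is known.
        if event["start"] is not None:
            event["end"] = event["start"] + timedelta(minutes=int(m.group(1)))
        text = _cut(text, m)

    # The time rule runs first, so 'at 16:30' is gone before this rule looks at 'at'.
    m = LOCATION_RE.search(text)
    if m:
        event["location"] = m.group(1)
        text = _cut(text, m)

    event["name"] = text
    return event


# A tiny hand-annotated corpus (made-up entries) used as unit tests.
DAY = datetime(2010, 6, 7)
CORPUS = [
    ("meet Sam at 16:30 for 60 minutes",
     {"name": "meet Sam", "start": datetime(2010, 6, 7, 16, 30),
      "end": datetime(2010, 6, 7, 17, 30), "location": None}),
    ("lunch in Room 12 at 12:00",
     {"name": "lunch", "start": datetime(2010, 6, 7, 12, 0),
      "end": None, "location": "Room 12"}),
]


def failing_entries(corpus, day):
    """Return the corpus strings whose parse differs from the annotation."""
    return [text for text, expected in corpus
            if parse_quick_add(text, day) != expected]


print(failing_entries(CORPUS, DAY))
```

Each new phrasing that the parser gets wrong goes into the corpus first, then a rule is added or adjusted until the list of failures is empty again.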

Answer2:

It would seem that there's really no narrow approach to this problem. I wanted to avoid having to pull along the entirety of NLP to figure out a solution, but I haven't found any alternative. I'll update this if I find a really great solution later.
