NLP for a Knowledge Database

(NLP here stands for Natural Language Processing[L1 ].)

My friends and I had a competition to see who could write the most interesting IRC bot[L2 ]. My entry (and the most interesting, in my opinion) was a bot that answered questions about the show Law & Order[L3 ] using show information downloaded from[L4 ]. To make the task more challenging, I designed the bot to answer queries phrased as proper English sentences. Below is a sample session. My queries begin with the phrase "lobot,"; the bot's replies begin with "L&O Bot:".

  lobot, who was james earl jones?
  L&O Bot: James Earl Jones played Horace Mccoy in "Profile"
  lobot, who was 'samuel l. jackson'?
  L&O Bot: Samuel L. Jackson played Louis Taggert in "The Violence
    of Summer"
  lobot, what was it about?
  L&O Bot: Episode 14: Greevey and Logan reopen an investigation to
    help Stone and Robinette build a stronger case against the person accused of
    assaulting a TV reporter.
  lobot, what was 'whose monkey'?
  L&O Bot: Whose Monkey is it Anyway?: The death of a lab
    technician and the abduction of 17 infected monkeys lead to a trial
    involving the treatment of research animals.
  lobot, who was in it?
  L&O Bot: Episode 239: Tricia Paoluccio (Maxine Walden) Kevin
    Isola (George Peavey) Dennis Creaghan (Liddle) Tresa Hughes (May Peavey)
    David Fonteno (Judge Derek Hafner) Ira Hawkins () Ron Nakahara (Dr. Tong)
    Catherine Hickland (Faye Ireland) Terrence Mann (Oyler) Richard Russell
    Ramos (Paul Serafien) Robert ... --More--(44%)
  lobot, which episode was about a horse?
  L&O Bot: Corpus Delicti: The death of a show horse leads to a
    trial involving insurance fraud, a sting operation and a wealthy woman's
  lobot, when did the first episode of season 11 air?
  L&O Bot: Episode 230: 18-Oct-2000

The language parser understands a variety of question types and can even handle some types of compound questions.
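To make the idea concrete, here is a minimal sketch of how a "who was ...?" question and its "who was in it?" follow-up from the session above might be answered. It is not the bot's actual code: the real program is written in Tcl with a generated lexer and parser, whereas this sketch uses Python and plain regular expressions. The names CAST, Session, and ask are hypothetical, and the sample data is limited to what appears in the session.

```python
import re

# Tiny hand-made sample taken from the session above. The real bot reads a
# separate database file whose format is not shown here (assumption).
CAST = [
    ("James Earl Jones", "Horace Mccoy", "Profile"),
    ("Samuel L. Jackson", "Louis Taggert", "The Violence of Summer"),
]


class Session:
    """Remembers the last episode mentioned so that "it" can be resolved."""

    def __init__(self):
        self.last_episode = None

    def ask(self, question):
        text = question.strip().lower()

        # Follow-up question: "it" refers to the most recently named episode.
        if text == "who was in it?":
            if self.last_episode is None:
                return "Invalid query"
            names = [f"{actor} ({role})"
                     for actor, role, episode in CAST
                     if episode == self.last_episode]
            return " ".join(names)

        # "who was NAME?" with optional single quotes around the name.
        match = re.fullmatch(r"who was '?(.+?)'?\?", text)
        if match:
            for actor, role, episode in CAST:
                if actor.lower() == match.group(1):
                    self.last_episode = episode
                    return f'{actor} played {role} in "{episode}"'

        return "Invalid query"
```

Keeping the last episode in a session object is one simple way to support pronouns; a fuller parser would presumably track more context than a single title.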

The engine is designed to work with any show, not just Law & Order. Additional query types and phrases can be added by modifying the source code. Be aware that because English is a Type 0 grammar [L5 ], you have no chance of enumerating all possible English sentences.
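As a rough illustration of what "adding a query type" could look like, the sketch below keeps a table of question patterns and handlers and takes the show data as a parameter, so nothing in it is specific to one show. This is an assumption about structure, not the author's design: the actual bot extends a Tcl grammar, while the names RULES, rule, answer, and SAMPLE_DB here are hypothetical and the sample data comes from the session shown earlier.

```python
import re

RULES = []  # (compiled pattern, handler) pairs, tried in registration order


def rule(pattern):
    """Register a handler for questions matching `pattern` (hypothetical helper)."""
    def register(handler):
        RULES.append((re.compile(pattern, re.IGNORECASE), handler))
        return handler
    return register


@rule(r"which episode was about (?:a |an |the )?(.+)\?")
def about(db, topic):
    # Return the titles of episodes whose summary mentions the topic.
    hits = [title for title, summary in db["summaries"].items()
            if topic.lower() in summary.lower()]
    return ", ".join(hits) or "Invalid query"


# A new query type is one more decorated function.
@rule(r"when did episode (\d+) air\?")
def air_date(db, number):
    return db["air_dates"].get(int(number), "Invalid query")


def answer(db, question):
    """Try each registered rule in turn; fall back to the bot's error text."""
    for pattern, handler in RULES:
        match = pattern.fullmatch(question.strip())
        if match:
            return handler(db, *match.groups())
    return "Invalid query"


# Sample data limited to facts visible in the session; any show could be used.
SAMPLE_DB = {
    "summaries": {
        "Corpus Delicti": "The death of a show horse leads to a trial "
                          "involving insurance fraud",
    },
    "air_dates": {230: "18-Oct-2000"},
}
```

For example, answer(SAMPLE_DB, "Which episode was about a horse?") looks up the summary table and returns the matching title. No finite rule table covers every phrasing, which is why unmatched questions fall through to the error reply.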

The program is written in Tcl using fickle (a lexer generator) and taccle (a parser generator). To compile the program yourself, you may first need to modify the Makefile. For your convenience, a pre-built package is also available.

For copyright reasons I cannot distribute the actual show database file. A sample is provided in the distributions. See the file database for usage.

After you have the Tcl code ready (either by running the Makefile or by downloading the pre-built version), execute the script knowbot.tcl. Then type in your questions, making sure to end each one with a question mark. If the parser understands your question it will quickly respond; otherwise it will give an ominous "Invalid query".