Version 1 of NLP for a Knowledge Database

Updated 2004-10-11 21:20:58 by jt

(NLP stands for Natural Language Processing[L1 ].)

My friends and I had a competition to see who could write the most interesting IRC bot[L2 ]. My entry (and the most interesting, in my opinion) was a bot that answered questions about the show Law & Order[L3 ] using show information downloaded from epguides.com[L4 ]. To make the task more challenging, I designed the bot to answer queries phrased as proper English sentences. Below is a sample session. My queries begin with the phrase "lobot,".

  lobot, who was james earl jones?
  L&O Bot: James Earl Jones played Horace Mccoy in "Profile"
  lobot, who was 'samuel l. jackson'?
  L&O Bot: Samuel L. Jackson played Louis Taggert in "The Violence of Summer"
  lobot, what was it about?
  L&O Bot: Episode 14: Greevey and Logan reopen an investigation to help Stone and Robinette build a stronger case against the person accused of assaulting a TV reporter.
  lobot, what was 'whose monkey'?
  L&O Bot: Whose Monkey is it Anyway?: The death of a lab technician and the abduction of 17 infected monkeys lead to a trial involving the treatment of research animals.
  lobot, who was in it?
  L&O Bot: Episode 239: Tricia Paoluccio (Maxine Walden) Kevin Isola (George Peavey) Dennis Creaghan (Liddle) Tresa Hughes (May Peavey) David Fonteno (Judge Derek Hafner) Ira Hawkins () Ron Nakahara (Dr. Tong) Catherine Hickland (Faye Ireland) Terrence Mann (Oyler) Richard Russell Ramos (Paul Serafien) Robert ... --More--(44%)
  lobot, which episode was about a horse?
  L&O Bot: Corpus Delicti: The death of a show horse leads to a trial involving insurance fraud, a sting operation and a wealthy woman's disappearance.

The language parser understands a variety of question types and can even handle some types of compound questions.
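
As a rough illustration of the kind of question matching and "it" tracking seen in the session above, here is a minimal Python sketch. It is not the bot's actual parser, which is built with fickle and taccle; the Bot class, the record fields, and the regular-expression rules are all assumptions made up for this example. The two sample records are copied from the session output, and the cast lists are deliberately incomplete.

```python
import re

# Sample records copied from the session above; the field names are hypothetical.
EPISODES = [
    {
        "title": "The Violence of Summer",
        "summary": "Greevey and Logan reopen an investigation to help Stone "
                   "and Robinette build a stronger case against the person "
                   "accused of assaulting a TV reporter.",
        "cast": {"Samuel L. Jackson": "Louis Taggert"},
    },
    {
        "title": "Corpus Delicti",
        "summary": "The death of a show horse leads to a trial involving "
                   "insurance fraud, a sting operation and a wealthy woman's "
                   "disappearance.",
        "cast": {},
    },
]


class Bot:
    """Toy question answerer that remembers the last episode it mentioned."""

    def __init__(self, episodes):
        self.episodes = episodes
        self.last = None  # the episode that a later "it" refers to

    def answer(self, question):
        # Normalize: drop the trailing question mark and ignore case.
        q = question.strip().rstrip("?").strip().lower()

        # Questions about "it" rely on the episode remembered from earlier.
        if q in ("what was it about", "who was in it"):
            if self.last is None:
                return "I don't know which episode you mean."
            if q == "what was it about":
                return f'{self.last["title"]}: {self.last["summary"]}'
            cast = " ".join(f"{actor} ({role})"
                            for actor, role in self.last["cast"].items())
            return f'{self.last["title"]}: {cast or "(no cast listed)"}'

        # "which episode was about a horse": keyword search over summaries.
        m = re.fullmatch(r"which episode was about (?:a |an |the )?(.+)", q)
        if m:
            for ep in self.episodes:
                if m.group(1) in ep["summary"].lower():
                    self.last = ep
                    return f'{ep["title"]}: {ep["summary"]}'
            return "No episode matches."

        # "who was <actor>": the actor's name may be wrapped in single quotes.
        m = re.fullmatch(r"who was '?(.+?)'?", q)
        if m:
            for ep in self.episodes:
                for actor, role in ep["cast"].items():
                    if actor.lower() == m.group(1):
                        self.last = ep
                        return f'{actor} played {role} in "{ep["title"]}"'
            return "No such actor."

        return "Sorry, I don't understand."


if __name__ == "__main__":
    bot = Bot(EPISODES)
    for line in ("who was 'samuel l. jackson'?", "what was it about?",
                 "which episode was about a horse?", "who was in it?"):
        print(bot.answer(line))
```

Checking the "it" questions before the general "who was ..." rule matters, because otherwise "who was in it?" would be read as a query about an actor named "in it". A grammar-based parser avoids this kind of rule ordering problem, which is one reason to prefer lexer and parser generators as the number of question types grows.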

The program is written using fickle and taccle, the lex-like and yacc-like generators for Tcl. To compile the program yourself, you may first need to modify the Makefile. For your convenience, a pre-built package is also available.

For copyright reasons, I cannot distribute the actual show database file. A sample is provided in the distributions. See the file database for usage.


Natural languages | Category Human Language