“Natural Language Processing” programs are programs that appear to understand English, or other human languages. These programs are also called “NLP” programs.
Some NLP programs are designed only to answer questions. These are called QA programs, or Question Answering programs.
Some NLP programs are designed to have trivial conversations, or “chats”, with a human. These programs are sometimes called “chatbots”.
Generally, NLP programs are concerned with processing natural language text, for example text entered via a user’s keyboard. Programs that recognize spoken language are categorized as Speech Recognition programs, and may use concepts and algorithms from NLP to improve their accuracy. Dragon NaturallySpeaking and ViaVoice are two examples of speech recognition programs.
You may like to investigate these NLP programs.
Eliza was the first chatbot. It is old and low-tech: a very simple program, yet it was found to be surprisingly good at fooling people. People would often think that it could really understand them. This was a complete illusion.
A version of Eliza is available on the web here : http://www.manifestation.com/neurotoys/eliza.php3 . This version is written in JavaScript, which is embedded in the web page, so you can read the source code.
This is how Eliza works. Eliza has a set of built-in responses. It chooses which response to give, according to keywords in the user’s input text. By choosing its response according to keywords, it gives a response that seems to follow from what the user wrote.
Sometimes it can’t see any keyword in the user’s input. In these cases, it gives a response that doesn’t refer to what the user said, such as “I see.”, or “What does that suggest to you?”. These responses encourage the user to keep the conversation going, and hide the fact that the program failed to understand anything at all.
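The keyword-and-fallback behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the code of any real Eliza version: the keyword table, the canned replies for each keyword, and the function name `respond` are all made up for the example. Only the two fallback replies are taken from the text above.

```python
import re

# Hypothetical keyword -> response table (an assumption for illustration;
# a real Eliza script is much larger and also rewrites parts of the input).
KEYWORD_RESPONSES = [
    ("mother", "Tell me more about your family."),
    ("sad", "I am sorry to hear that you are sad."),
    ("always", "Can you think of a specific example?"),
]

# Replies used when no keyword is found; they do not refer to the input at all.
FALLBACK_RESPONSES = ["I see.", "What does that suggest to you?"]


def respond(user_text, turn=0):
    """Return the canned response for the first table keyword found in the input."""
    words = set(re.findall(r"[a-z']+", user_text.lower()))
    for keyword, response in KEYWORD_RESPONSES:
        if keyword in words:
            return response
    # No keyword found: cycle through the content-free fallbacks.
    return FALLBACK_RESPONSES[turn % len(FALLBACK_RESPONSES)]


if __name__ == "__main__":
    print(respond("My mother never listens to me."))
    print(respond("The weather is nice today."))
```

Note that the program never analyses the meaning of the sentence; it only checks whether a known word is present, which is why the apparent understanding is an illusion.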
Alice works in a manner similar to Eliza, so it’s technically unsophisticated. Alice can be accessed through a website. Alice is open-source, so people can make their own programs using Alice code. An “AliceBot” is one individual copy of Alice, customised by changing a few details and giving it a new name. Several AliceBots exist, and can be accessed through the Web, which is probably why Alice is quite well-known. The Alice website is at http://www.alicebot.org/
Shakespearebot is an Alice-based chatbot, designed to answer questions on the life and works of William Shakespeare.
It uses Google to search for and return answers where the current knowledge base does not contain a suitable response.
The website includes a one-to-one chatterbot, a multiuser bot chatroom and a bot forum, and can return not only text answers but also multimedia and image results.
It is designed for educational purposes.
Shakespearebot is at http://www.shakespearebot.com .
START is a question-answering program that is accessed through a website. It is sophisticated, and it can answer quite a wide range of questions. It was created at MIT.
START can be accessed here : http://www.ai.mit.edu/projects/infolab/ailab .
AnswerBus is another question-answering program that is accessed through a website.
It is here : http://www.answerbus.com/index.shtml
SHRDLU is a classic program written around 1970. You can read about SHRDLU in the Wikipedia here : http://en.wikipedia.org/wiki/SHRDLU . A downloadable version is available, and source code in Lisp is included. If you want to download it, follow the links from the Wikipedia page.
More NLP programs can be found at the Chatterbox Challenge website, at http://web.infoave.net/~kbcowart/
Prolog and Lisp are the two best-known programming languages for A.I., but Prolog is said to be better than Lisp for writing NLP programs.
For information on free Prolog compilers and related resources, click on PrologResources.
There is a dictionary, containing almost all English words, which is available as a free download. It is called WordNet. It is available from here : http://www.cogsci.princeton.edu/~wn/.
WordNet tells you the PartOfSpeech of almost all English words. This makes it a useful resource for programmers writing NLP programs. It also has a dictionary definition for each word.
WordNet also has 80,000 facts of this form : “Dog isa canine”, “Canine isa carnivore”, “Carnivore isa mammal”. It also has some other facts, such as “Lid is-part-of box”, and “Receiver is-part-of telephone”.
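To show why facts of this form are useful to a program, here is a minimal Python sketch that chains “isa” facts together. It does not read the WordNet database: the two small dictionaries simply hold the example facts quoted above, and the names `ISA`, `PART_OF`, `ancestors` and `is_a` are invented for the illustration.

```python
# Toy facts copied from the examples above; NOT loaded from WordNet itself.
ISA = {"dog": "canine", "canine": "carnivore", "carnivore": "mammal"}
PART_OF = {"lid": "box", "receiver": "telephone"}


def ancestors(word):
    """Follow the 'isa' links upwards and return every category reached, in order."""
    chain = []
    while word in ISA:
        word = ISA[word]
        chain.append(word)
    return chain


def is_a(word, category):
    """True if 'category' can be reached from 'word' through one or more 'isa' links."""
    return category in ancestors(word)


if __name__ == "__main__":
    print(ancestors("dog"))       # ['canine', 'carnivore', 'mammal']
    print(is_a("dog", "mammal"))  # True
    print(PART_OF["lid"])         # box
```

With this kind of chaining, a program that is told only “Dog isa canine” can still conclude that a dog is a mammal.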
For more information, see the page on WordNet.
A fairly simple grammar, showing the structure of English sentences, can be found here : http://www.scientificpsychic.com/grammar/enggram1.html .
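As a rough idea of what a program can do with a simple grammar, the Python sketch below parses sentences using three made-up rules: S -> NP VP, NP -> Det N, and VP -> V NP. These rules and the tiny word list are assumptions chosen for the example; they are not taken from the grammar on the page linked above, which covers far more of English.

```python
# Hypothetical lexicon mapping each known word to its part of speech.
LEXICON = {
    "the": "Det", "a": "Det",
    "dog": "N", "cat": "N", "box": "N",
    "sees": "V", "opens": "V",
}


def parse_np(tags, i):
    """NP -> Det N. Return (tree, next_index), or None if the rule does not apply."""
    if i + 1 < len(tags) and tags[i][1] == "Det" and tags[i + 1][1] == "N":
        return ("NP", tags[i][0], tags[i + 1][0]), i + 2
    return None


def parse_vp(tags, i):
    """VP -> V NP. Return (tree, next_index), or None if the rule does not apply."""
    if i < len(tags) and tags[i][1] == "V":
        result = parse_np(tags, i + 1)
        if result:
            np_tree, j = result
            return ("VP", tags[i][0], np_tree), j
    return None


def parse_sentence(text):
    """S -> NP VP. Return a nested tuple for the whole sentence, or None."""
    words = text.lower().rstrip(".").split()
    if any(w not in LEXICON for w in words):
        return None  # unknown word: this toy parser gives up
    tags = [(w, LEXICON[w]) for w in words]
    np_result = parse_np(tags, 0)
    if not np_result:
        return None
    np_tree, i = np_result
    vp_result = parse_vp(tags, i)
    if not vp_result:
        return None
    vp_tree, j = vp_result
    # Succeed only if every word was consumed.
    return ("S", np_tree, vp_tree) if j == len(tags) else None


if __name__ == "__main__":
    print(parse_sentence("The dog sees a cat."))
    print(parse_sentence("Dog the sees."))
```

The first call returns a nested structure showing the noun phrase and verb phrase; the second returns None because the words do not fit the rules.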
There is something called “The Link Grammar Parser”. This is a program, running on a website, which lets you enter a sentence and then parses it. That is, it tells you the PartOfSpeech of each word, and shows you the grammatical structure of the sentence.
This parser uses a much bigger grammar than the simple one mentioned above.
There is a free downloadable version of this program. This program can be incorporated into A.I. programs written by other people. I haven’t downloaded it myself.
The Link Grammar website is here : http://hyper.link.cs.cmu.edu/link/
– MartinSondergaard, London.
For more NLP resources, see LinguisticData.
See also AssociationForComputationalLinguistics.