Computer program learns language rules, creates sentences
Aug. 31, 2005
By Susan S. Lang/Cornell University
and World Science staff
Researchers say they have developed a system that lets a computer scan text in various languages, work out the grammatical rules behind it, and produce simple but sensible sentences of its own.
The method also works for such data as sheet music or genetic code, the researchers
said, and has implications for speech recognition and genomics.
It could also provide new insights into language learning,
according to the scientists.
The program “can take a body of text, abstract from it a collection of recurring patterns or rules and then generate new material,” said Shimon Edelman, a computer scientist and professor of psychology at Cornell University in Ithaca, N.Y.
Edelman co-wrote a paper describing the findings in the Aug. 16 issue of the research journal Proceedings of the National Academy of Sciences.
The system, called Automatic Distillation of Structure (ADIOS), identifies complex patterns in raw texts by repeatedly aligning sentences and looking for overlapping parts.
For example, the sentences “I would like to book a first-class flight to Chicago,” “I want to book a first-class flight to Boston,” and “Book a first-class flight for me, please” might
tip off the computer that the following is a useable string of text: “book a first-class flight.”
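The alignment idea in this example can be illustrated with a short Python sketch. It is not the ADIOS algorithm itself, only a simplified stand-in: it lists every run of consecutive words in each sentence and keeps the longest run that appears in enough of them. The function names `tokenize` and `shared_sequences` and the `min_support` parameter are assumptions made for illustration.

```python
import re
from collections import Counter

def tokenize(sentence):
    """Lowercase a sentence and split it into words, keeping hyphenated words whole."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", sentence.lower())

def shared_sequences(sentences, min_support=None, min_len=2):
    """Return the longest word sequences found in at least min_support sentences.

    Simplified illustration only; not the published ADIOS procedure.
    """
    if min_support is None:
        min_support = len(sentences)  # by default, require every sentence to match
    counts = Counter()
    for sentence in sentences:
        words = tokenize(sentence)
        seen = set()  # count each sequence once per sentence
        for start in range(len(words)):
            for end in range(start + min_len, len(words) + 1):
                seen.add(tuple(words[start:end]))
        counts.update(seen)
    frequent = [seq for seq, n in counts.items() if n >= min_support]
    if not frequent:
        return []
    longest = max(len(seq) for seq in frequent)
    return sorted(" ".join(seq) for seq in frequent if len(seq) == longest)

sentences = [
    "I would like to book a first-class flight to Chicago,",
    "I want to book a first-class flight to Boston,",
    "Book a first-class flight for me, please",
]
common = shared_sequences(sentences)
print(common)  # prints ['book a first-class flight']
```

Lowering `min_support` to 2 lets the sketch pick up longer strings shared by only some of the sentences, which hints at how partially overlapping sentences can still contribute patterns.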
The researchers said that in one experiment they “trained” the system on thousands of ordinary sentences of this type, all relating to airline travel. The computer then generated sentences including the following: “I would like to book the first class”; “I plan to make a round trip”; “what kind of food would be served”; and “how many flights does Continental have.”
Because the program can identify patterns that contain other patterns, which in turn
contain still more, the computer’s knowledge grows “as a sort of forest of branching trees of possibilities,” said Edelman.
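To make the “branching trees” picture concrete, here is a minimal Python sketch of patterns that contain other patterns, and of how new sentences fall out of them. The `PATTERNS` table is hand-written and hypothetical (including the word “coach”); it is not output from ADIOS, and the `expand` function is an assumption made for illustration.

```python
from itertools import product

# Hypothetical, hand-written patterns. Each name maps to a list of alternatives;
# each alternative is a sequence of plain words or names of other patterns.
PATTERNS = {
    "REQUEST": [["i", "WISH", "to", "book", "a", "TICKET"]],
    "WISH": [["would", "like"], ["want"]],
    "TICKET": [["CLASS", "flight", "to", "CITY"]],
    "CLASS": [["first-class"], ["coach"]],
    "CITY": [["chicago"], ["boston"]],
}

def expand(symbol, patterns):
    """Return every word sequence that a word or pattern name can produce."""
    if symbol not in patterns:
        return [[symbol]]  # a plain word stands for itself
    results = []
    for alternative in patterns[symbol]:
        # Expand each item, then combine one choice per item, left to right.
        parts = [expand(item, patterns) for item in alternative]
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results

generated = [" ".join(words) for words in expand("REQUEST", PATTERNS)]
for sentence in generated:
    print(sentence)
```

With two choices at each of three branch points, the sketch yields eight sentences, some of which never appeared in the example sentences quoted in this article, such as “i want to book a coach flight to chicago”.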
The researchers said they also tested the method on translations of the Bible in several languages, again with successful results.
ADIOS relies on processes believed to occur in human language learning, Edelman added. “This may eventually help researchers understand how children, who learn language in a similar item-by-item fashion and with very little supervision, eventually master the full complexities of their native tongue.”
In biology research, he
said, the system can help scientists decipher sequences of chemical units in
genetic code. If the code contains information for producing a given molecule,
for instance, the system can help predict the structure and function of that
molecule.
The system has also been tested on musical notation, added Edelman, who developed it with researchers at Cornell and at Tel Aviv University.