"Long before it's in the papers"
May 26, 2015

RETURN TO THE WORLD SCIENCE HOME PAGE


Robot is designed to learn actions through trial and error

May 26, 2015
Courtesy University of California, Berkeley, and World Science staff

Researchers say they have programmed robots to learn mechanical tasks on their own through trial and error, in a process inspired by the way humans learn.

Computer engineers at the University of California, Berkeley demonstrated their technique, which they described as a type of “reinforcement learning,” by having a robot complete various tasks. These included putting a clothes hanger on a rack, assembling a toy plane and screwing a cap on a water bottle.

It’s “a new approach to empowering a robot to learn,” said Berkeley researcher Pieter Abbeel. “The key is that when a robot is faced with something new, we won’t have to reprogram it. The exact same software, which encodes how the robot can learn, was used to allow the robot to learn all the different tasks we gave it.”

Abbeel and colleagues plan to present the work on Thursday, May 28, in Seattle at the International Conference on Robotics and Automation.

“We still have a long way to go before our robots can learn to clean a house or sort laundry, but our initial results indicate that these kinds of deep learning techniques can have a transformative effect in terms of enabling robots to learn complex tasks entirely from scratch,” Abbeel said.

“Most robotic applications are in controlled environments where objects are in predictable positions,” added study collaborator Trevor Darrell. “The challenge of putting robots into real-life settings, like homes or offices, is that those environments are constantly changing. The robot must be able to perceive and adapt to its surroundings.”

The researchers turned to a new branch of artificial intelligence known as deep learning, which is loosely inspired by the cellular circuitry of the human brain.

“For all our versatility, humans are not born with a repertoire of behaviors that can be deployed like a Swiss army knife, and we do not need to be programmed,” said postdoctoral researcher Sergey Levine, another collaborator in the project.

“Instead, we learn new skills over the course of our life from experience and from other humans. This learning process is so deeply rooted in our nervous system that we cannot even communicate to another person precisely how the resulting skill should be executed. We can at best hope to offer pointers and guidance as they learn it on their own.”

In the world of artificial intelligence, deep learning programs create “neural nets” in which layers of artificial neurons, or brain cells, process overlapping raw sensory data, whether it be sound waves or image pixels. This helps the robot recognize patterns and categories among the data it is receiving.
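To make that idea concrete, here is a minimal Python sketch of how stacked layers of artificial neurons can turn raw pixel values into a compact feature vector. The layer sizes, the random untrained weights and the stand-in camera frame are illustrative assumptions for this sketch, not details of the Berkeley team’s network:

import numpy as np

rng = np.random.default_rng(0)

def layer(x, n_out):
    """One layer of artificial neurons: a weighted sum plus a nonlinearity."""
    w = rng.normal(scale=0.1, size=(x.size, n_out))  # untrained, random weights
    return np.maximum(0.0, x @ w)                    # ReLU: keep positive signals

pixels = rng.random(32 * 32)   # stand-in for one raw camera frame
h1 = layer(pixels, 128)        # early layer: simple local patterns
h2 = layer(h1, 32)             # deeper layer: broader categories
print(h2.shape)                # a 32-number summary the robot can act on

In a real system the weights are learned from data rather than drawn at random; the point of the sketch is only the layered structure, with each layer re-processing the output of the one below it.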

People who use Siri on their iPhones, Google’s speech-to-text program or Google Street View might already have benefited from deep learning, but applying it to robots “moving about in an unstructured 3D environment is a whole different ballgame,” said Ph.D. student Chelsea Finn, another member of the research team.

“There are no labeled directions, no examples of how to solve the problem in advance. There are no examples of the correct solution like one would have in speech and vision recognition programs.”

In the experiments, the UC Berkeley researchers worked with a device known as Willow Garage Personal Robot 2, or PR2, which they nicknamed BRETT, for Berkeley Robot for the Elimination of Tedious Tasks.

They presented BRETT with tasks such as placing blocks into matching openings or stacking Lego blocks. The program controlling the learning includes a “reward function” that provides a score based on how well the robot is doing with the task. The robot takes in the scene, including the position of its own arms and hands, as viewed by a camera. The program provides real-time feedback via the score based on the robot’s movements: movements that bring the robot closer to completing the task score higher than those that do not. The score feeds back through the neural net, so the robot can learn which movements are better for the task at hand.
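That description amounts to a score-guided search over movements. The toy Python loop below illustrates the idea under simplifying assumptions: the “movement” is just three numbers, the reward is closeness to a hypothetical target position, and the learner is plain random hill-climbing rather than the team’s actual neural-net method:

import random

TARGET = [0.7, -0.2, 0.4]      # hypothetical goal configuration

def reward(movement):
    """Score a movement: higher the closer it ends to the target."""
    return -sum((m - t) ** 2 for m, t in zip(movement, TARGET))

movement = [0.0, 0.0, 0.0]     # initial, uninformed guess
for trial in range(2000):
    # Try a small random variation of the current movement...
    candidate = [m + random.gauss(0.0, 0.05) for m in movement]
    # ...and keep it only if the reward function scores it higher.
    if reward(candidate) > reward(movement):
        movement = candidate

print([round(m, 2) for m in movement])   # ends near TARGET

The essential ingredient is the same as in the article: no labeled examples of the correct solution, only a running score that tells the learner whether each attempted variation is better or worse than the last.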

The robot could master a typical assignment in about 10 minutes, the researchers said, but it could take about three hours when the robot wasn’t given the locations of the objects in the scene and needed to learn vision and control together.
