+2  A: 

Hello again Drew,

I was working on a project not that dissimilar from this (making a robotic tuna) and one of the methods we were exploring was using a genetic algorithm to tune the performance of an artificial central pattern generator (in our case the pattern was a number of sine waves operating on each joint of the tail). It might be worth giving it a shot; genetic algorithms are another one of those tools that can be incredibly powerful, if you are careful about selecting a fitness function.
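For illustration, here is a minimal sketch of that kind of setup in Python: per-joint sine-wave parameters evolved by a simple genetic algorithm. The joint names and the stand-in fitness function are invented; a real fitness function would come from running the simulation.

    import math
    import random

    JOINTS = ["left_hip", "left_knee", "right_hip", "right_knee"]  # invented joint list

    def random_genome():
        # one (amplitude, frequency, phase) triple per joint
        return [(random.uniform(0.0, 1.0),
                 random.uniform(0.5, 3.0),
                 random.uniform(0.0, 2 * math.pi))
                for _ in JOINTS]

    def joint_angles(genome, t):
        # the pattern generator itself: one sine wave per joint
        return {joint: a * math.sin(2 * math.pi * f * t + p)
                for joint, (a, f, p) in zip(JOINTS, genome)}

    def fitness(genome):
        # stand-in for running the simulation; a real fitness function would
        # measure e.g. distance travelled before falling over
        return -sum(abs(a - 0.5) + abs(f - 1.5) for a, f, _ in genome)

    def mutate(genome, sigma=0.05):
        # Gaussian perturbation of every parameter
        return [(a + random.gauss(0, sigma),
                 f + random.gauss(0, sigma),
                 p + random.gauss(0, sigma))
                for a, f, p in genome]

    population = [random_genome() for _ in range(50)]
    for generation in range(100):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:10]                                  # truncation selection
        population = parents + [mutate(random.choice(parents)) for _ in range(40)]

    best = max(population, key=fitness)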

Bradley Powers
I think I'll investigate these CPGs as I've seen a few papers mention them. Neural oscillators are another biologically inspired model that seems to have yielded some success in this area. Only today I managed to implement an interface to the simulation that lets me place objects exactly and measure them in a god-like way, rather than as an agent (whose complex sensors and actuators introduce errors). So I have all the pieces in my library to stage some learning exercises. Can't wait!
Drew Noakes
Best of luck Drew!
Bradley Powers
+2  A: 

Here's a great paper from 1999 by Peter Nordin and Mats G. Nordahl that outlines an evolutionary approach to controlling a humanoid robot, based on their experience building the ELVIS robot:

Drew Noakes
+1 A very good paper... when I was envisioning it I only thought of 2 layers, but the paper shows 3 layers.
Lirik
Drew Noakes
+2  A: 

I've been thinking about this for quite some time now and I realized that you need at least two intelligent "agents" to make this work properly. The basic idea is that you have two types of intelligent activity here:

  1. Subconscious Motor Control (SMC).
  2. Conscious Decision Making (CDM).

Training for the SMC could be done on-line. If you really think about it, defining success within motor control is fairly simple: you provide a signal to your robot, it evaluates that signal, and it either accepts it or rejects it. If your robot accepts a signal and it results in a "failure", then your robot goes "offline" and can't accept any more signals. Defining "failure" and "offline" could be tricky, but I was thinking that it would be a failure if, for example, a sensor on the robot indicates that the robot is immobile (lying on the ground).

So your fitness function for the SMC might be something like: numAcceptedSignals/numGivenSignals + numFailure

The CDM is another AI agent that generates signals and the fitness function for it could be: (numSignalsAccepted/numSignalsGenerated)/(numWinGoals/numLossGoals)

So you run the CDM and all of its output goes to the SMC... at the end of a game you run your fitness functions. Alternatively, you can combine the SMC and the CDM into a single agent and make a composite fitness function based on the other two fitness functions. I don't know how else you could do it...
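Taken literally, the bookkeeping for these two fitness functions could look like the Python sketch below. The counter names are hypothetical; in practice you would need to decide whether each value is maximized or minimized and guard the divisions.

    class FitnessTracker:
        """Hypothetical counters for the SMC/CDM scheme described above."""

        def __init__(self):
            self.num_given_signals = 0      # signals the CDM sent to the SMC
            self.num_accepted_signals = 0   # signals the SMC accepted
            self.num_failures = 0           # times the robot went "offline"
            self.num_win_goals = 0
            self.num_loss_goals = 0

        def smc_fitness(self):
            # literal transcription of: numAcceptedSignals/numGivenSignals + numFailure
            if self.num_given_signals == 0:
                return 0.0
            return self.num_accepted_signals / self.num_given_signals + self.num_failures

        def cdm_fitness(self):
            # literal transcription of:
            # (numSignalsAccepted/numSignalsGenerated)/(numWinGoals/numLossGoals)
            # (here every signal the CDM generates is given to the SMC)
            if 0 in (self.num_given_signals, self.num_win_goals, self.num_loss_goals):
                return 0.0
            acceptance = self.num_accepted_signals / self.num_given_signals
            goal_ratio = self.num_win_goals / self.num_loss_goals
            return acceptance / goal_ratio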

Finally, you have to determine what constitutes a learning session: is it half a game, full game, just a few moves, etc. If a game lasts 1 minute and you have a total of 8 players on the field, then the process of training could be VERY slow!

Update

Here is a quick reference to a paper that used genetic programming to create "softbots" that play soccer: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.36.136&rep=rep1&type=pdf

With regards to your comments: I was thinking that for the subconscious motor control (SMC), the signals would come from the conscious decision maker (CDM). This way you're evolving your SMC agent to properly handle the CDM agent's commands (signals). You want to maximize the up-time of the SMC agent regardless of what the CDM agent says.

The SMC agent receives an input, for example a vector force on a joint, and runs it through its processing unit to determine whether to execute or reject it. The SMC should only execute inputs that it "thinks" it can recover from, and it should reject inputs that it "thinks" would lead to a "catastrophic failure".

Now the SMC agent has an output: accept or reject a signal (1 or 0). The CDM can use that signal for its own training... the CDM wants to maximize the number of signals that the SMC accepts, and it also wants to satisfy a goal: a high score for its own team and a low score for the opposing team. So the CDM has its own processing unit that is being evolved to satisfy both of those needs. Your reference provided a 3-layer design, while mine is only a 2-layer... I think mine was a step in the right direction towards the 3-layer design.
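One possible shape for that accept/reject interface, sketched in Python; the names are invented and the recoverability test is a stand-in for the evolved processing unit:

    class SMC:
        """Subconscious motor control: gates the CDM's commands."""

        def handle(self, joint, force_xyz):
            # output is 1 (accept) or 0 (reject); the CDM trains on this
            if self.can_recover(joint, force_xyz):
                self.execute(joint, force_xyz)
                return 1
            return 0

        def can_recover(self, joint, force_xyz):
            # placeholder policy: reject commands above a force threshold;
            # the real agent would evolve this decision
            return sum(abs(c) for c in force_xyz) < 10.0

        def execute(self, joint, force_xyz):
            pass  # forward the command to the simulated actuator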

One more thing to note here: is falling really a "catastrophic failure"? What if your robot falls, but the CDM makes it stand up again? I think that would be a valid behavior, so you shouldn't penalize the robot for falling... perhaps a better thing to do is penalize it for the amount of time it takes to perform a goal (not necessarily a soccer goal).

Lirik
Interesting ideas. I'm curious for more detail on your mention of signals being provided, and accepted/rejected. Where does this signal come from in the first place? At this point I'm mostly interested in simple locomotion rather than full-game scoring. I.e. my fitness function would be more likely to involve distance travelled before falling over! In the case where motion is continuous, the time taken to cover ground would be considered as well. The concept of accepted/rejected signals is foreign to me though. Could you discuss it a little further?
Drew Noakes
@Drew, I updated my answer... I think locomotion might be a good start. The SMC receives signals that it should process (i.e. move joint A with force F in direction x,y,z) and runs them through its processing unit to determine if the move could lead to a catastrophic failure. The SMC wants to maximize the up-time of the robot and its ability to respond to the CDM... maximizing the up-time might also take care of the issues with efficiency (i.e. power, idle-time, etc).
Lirik
@Drew, all in all, this is quite a complex task you're undertaking... which I'm sure you're realizing as you get more involved in it.
Lirik
@Lirik, thanks for your edit. I can see what you're getting at better now. I agree that layering is important and I'm still not sure how to do it. For example, let's say the agent is running forwards but starts leaning to the side slightly. The SMC should apply the appropriate tilt to the foot/hip/wherever to keep the alignment upright. But this can't be an all-out veto -- it must be merged into the current motion, otherwise the agent would freeze mid-stride and topple over.
Drew Noakes
Another issue I'm contemplating is that dynamic walking involves being in a constant state of falling. I can model the centre of mass and area of stability, but the agent is constantly going to be unstable by that definition. This kind of thinking moves further down the hard-math/physics approach, rather than the machine learning style approach. Different RoboCup teams use different approaches -- some use inverse kinematics, others genetic algos/neural nets. I imagine having both types of tools available for different scenarios might be useful.
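For what it's worth, that centre-of-mass test can be sketched as a point-in-convex-polygon check, something like the Python below (the coordinates are invented; the polygon is assumed convex with vertices in order):

    def point_in_convex_polygon(point, polygon):
        # polygon: list of (x, y) vertices in consistent winding order
        x, y = point
        sign = None
        n = len(polygon)
        for i in range(n):
            x1, y1 = polygon[i]
            x2, y2 = polygon[(i + 1) % n]
            cross = (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1)
            if cross != 0:
                if sign is None:
                    sign = cross > 0
                elif (cross > 0) != sign:
                    return False
        return True

    # statically stable if the CoM's ground projection lies inside the
    # support polygon; a dynamic walker violates this most of the time
    com_xy = (0.02, 0.01)  # hypothetical CoM ground projection
    support = [(0.1, 0.05), (-0.1, 0.05), (-0.1, -0.05), (0.1, -0.05)]
    print(point_in_convex_polygon(com_xy, support))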
Drew Noakes
BTW thanks for that paper, but it's for the 2D league, where agents are simplified to circles on a plane. That league is also much more mature, and the games are much more exciting to watch. RoboCup 2010 is happening right now in Singapore. The simulated 3D league final is tomorrow! If you find a video online you can see that the state of the art in 3D robot control still leaves a lot to be desired. So yes, quite a complex undertaking, but lots of fun.
Drew Noakes
@Drew, I was thinking about the walking part too... I think that with a GA/GP you can let the algorithm evolve its own walking "style": in the beginning it might be a crawl, a roll or a hop, but eventually it might learn to walk or run. I've been doing some research on AI and trading, but I'm also interested in robotics. If you want to discuss this in more detail, stop by my blog (mlai-lirik.blogspot.com) and drop me a comment with some contact info, or just send an e-mail to my temporary address: [email protected] (the e-mails only exist for a few hours)
Lirik
+3  A: 

There is a significant body of research literature on robot motion planning and robot locomotion.

General Robot Locomotion Control

For bipedal robots, there are at least two major approaches to robot design and control (whether the robot is simulated or physically real):

  • Zero Moment Point - a dynamics-based approach to locomotion stability and control.
  • Biologically-inspired locomotion - a control approach modeled after biological neural networks in mammals, insects, etc., that focuses on the use of central pattern generators, modified by other motor control programs/loops, to control overall walking and maintain stability (a minimal sketch of the CPG idea follows this list).
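As a minimal illustration of the CPG idea, here is a Python sketch of two phase oscillators coupled to stay half a cycle apart, as for a pair of hip joints. Real controllers add sensory feedback terms; the frequency, coupling strength and amplitude here are arbitrary:

    import math

    def cpg_step(phases, dt, freq=1.0, coupling=0.5):
        # each oscillator is pulled toward anti-phase with the other
        left, right = phases
        d_left = 2 * math.pi * freq + coupling * math.sin(right - left - math.pi)
        d_right = 2 * math.pi * freq + coupling * math.sin(left - right - math.pi)
        return (left + d_left * dt, right + d_right * dt)

    phases = (0.0, 1.0)  # arbitrary starting phases
    for _ in range(1000):
        phases = cpg_step(phases, dt=0.01)

    # joint angle commands (radians) derived from the oscillator phases
    hip_left = 0.4 * math.sin(phases[0])
    hip_right = 0.4 * math.sin(phases[1])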

Motion Control for a Bipedal Soccer Robot

There are really two aspects to handling the control issues for your simulated biped robot:

  1. Basic walking and locomotion control
  2. Task-oriented motion planning

The first part is just about handling the basic control issues for maintaining robot stability (assuming you are using some physics-based model with gravity), walking in a straight line, turning, etc. The second part is focused on getting your robot to accomplish specific tasks as a soccer player, e.g., run toward the ball, kick the ball, block an opposing player, etc. It is probably easiest to solve these separately and link the second part as a higher-level controller that sends trajectory and goal directives to the first part.
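To make that split concrete, here is one hypothetical way the two layers could be wired together in Python. All names are invented and the low-level details are omitted; the point is only the direction of the interface, with the planner issuing trajectory and goal directives downward:

    class WalkingController:
        """Layer 1: stability, straight-line walking, turning."""

        def step_towards(self, target_xy, heading):
            # compute one stride's joint targets toward target_xy while
            # keeping the robot balanced (details omitted in this sketch)
            pass

    class SoccerPlanner:
        """Layer 2: task-oriented motion planning."""

        def __init__(self, walker):
            self.walker = walker

        def act(self, world):
            # send trajectory/goal directives down to the walking layer
            if world["closest_to_ball"]:
                self.walker.step_towards(world["ball_xy"], heading=world["goal_dir"])
            else:
                self.walker.step_towards(world["home_xy"], heading=world["ball_dir"])

    planner = SoccerPlanner(WalkingController())
    planner.act({"closest_to_ball": True, "ball_xy": (3.0, 1.0),
                 "goal_dir": 0.0, "home_xy": (0.0, 0.0), "ball_dir": 0.5})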

There are many relevant papers and books that could be suggested, but I've listed some potentially useful ones below that you may wish to include in whatever research you have already done.

Reading Suggestions

LaValle, Steven Michael (2006). Planning Algorithms. Cambridge University Press.

Raibert, Marc (1986). Legged Robots that Balance. MIT Press.

Vukobratovic, Miomir and Borovac, Branislav (2004). "Zero-Moment Point - Thirty Five Years of its Life", International Journal of Humanoid Robotics, Vol. 1, No. 1, pp. 157–173.

Hirose, Masato and Takenaka, T. (2001). "Development of the humanoid robot ASIMO", Honda R&D Technical Review, Vol. 13, No. 1.

Wu, QiDi and Liu, ChengJu and Zhang, JiaQi and Chen, QiJun (2009). "Survey of locomotion control of legged robots inspired by biological concept", Science in China Series F: Information Sciences, Vol. 52, No. 10, pp. 1715–1729. Springer.

Wahde, Mattias and Pettersson, Jimmy (2002). "A brief review of bipedal robotics research", Proceedings of the 8th Mechatronics Forum International Conference, pp. 480–488.

Shan, J., Junshi, C. and Jiapin, C. (2000). "Design of central pattern generator for humanoid robot walking based on multi-objective GA", Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 1930–1935.

Chestnutt, J., Lau, M., Cheung, G., Kuffner, J., Hodgins, J., and Kanade, T. (2005). "Footstep planning for the Honda ASIMO humanoid", Proceedings of the 2005 IEEE International Conference on Robotics and Automation (ICRA 2005), pp. 629–634.

Joel Hoff
Thanks for the thorough answer. I'll definitely read through those papers you have linked to.
Drew Noakes
@Drew - I've added another paper reference for ASIMO and provided a Google Books weblink for LaValle's planning algorithms book.
Joel Hoff