views:

200

answers:

5

I am running a physics simulation and applying sets of movement instructions to a simulated skeleton. I have multiple sets of instructions for the skeleton, each consisting of force applications to the legs, arms, torso, etc., and the duration for which each force is applied to its respective bone. Each set of instructions (a behavior) is developed by testing its effectiveness at performing the desired behavior, recombining it with other similar behaviors via a genetic algorithm, and testing it again. The skeleton will have an array of behaviors in its set list.

I have fitness functions which test for stability, speed, minimization of entropy, and force on joints. The problem is that any given behavior only works in a specific context. One behavior works on flat ground, another works if there is a bump in front of the right foot, another if the bump is in front of the left, and so on. So the fitness of each behavior varies with the context. Picking a behavior simply on its previous fitness level won't work, because that fitness score doesn't apply to the current context.

My question is: how do I program the skeleton to pick the best behavior for the context? For example, picking the best walking behavior for a randomized bumpy terrain.

A: 

You're using a genetic algorithm to modify the behavior, so that must mean you have devised a fitness function for each combination of factors. Is that your question?

If yes, the answer depends on what metrics you use to define best walking behavior:

  1. Maximize stability
  2. Maximize speed
  3. Minimize forces on joints
  4. Minimize energy or entropy production

Or do you just try a bunch of parameters, record the values, and then let the genetic algorithm drive you to the best solution?

If each behavior works well in one context and not another, I'd try quantifying how to sense and interpolate between contexts and blend the strategies to see if that would help.

duffymo
I do have fitness functions which maximize stability and speed and minimize force on joints and energy use. My problem is that each behavior is only good in one situation. For example, one behavior is good at walking on flat ground. Another is good at walking if there is a bump directly in front of the right foot that needs to be stepped on. Another is good if that bump is higher, or tilted to the side, or in front of the left foot. A behavior that works for one situation won't work for another. I should edit the question to say that.
DrSammyD
+1  A: 
Beta
I can get the algorithm to traverse a bumpy terrain, but I'm trying to get that same behavior to work on another randomized bumpy terrain.
DrSammyD
A: 

It sounds like at this point you have just a classification problem. You want to map some knowledge about what you are currently walking on to one of a set of classes. Knowing the class of the terrain allows you to then invoke the proper subroutine. Is this correct?

If so, then there are a wide array of classification engines that you can use including neural networks, Bayesian networks, decision trees, nearest neighbor, etc. In order to pick the best fit, we will need more information about your problem.

First, what kind of input or sensory data do you have available to help you identify the behavior class you should invoke? Second, can you describe the circumstances in which you will be training this classifier and what the circumstances are during runtime when you deploy it, such as any limits on computational resources or requirements of robustness to noise?

EDIT: Since you have a fixed number of classes, and you have some parameterized model for generating all possible terrains, I would consider using k-means clustering. The principle is as follows. You cluster a whole bunch of terrains into k different classes, where each cluster is associated with one of your specialized subroutines that performs best for that cluster of terrains. Then when a new terrain comes in, it will probably fall near one of these clusters. You then invoke the corresponding specialized subroutine to navigate that terrain.

Do this offline: Generate enough random terrains to sufficiently sample the parameter space, map these terrains to your sensory space (but remember which points in sensory space correspond to which terrains), and then run k-means clustering on this sensory space corpus where k is the number of classes you want to learn. Your distance function between a class representative C and a point P in sensory space would be simply the fitness function of letting algorithm C navigate the terrain that generated P. You would then get a partitioning of your sensory space into k clusters, each cluster mapping to the best subroutine that you've got. Each cluster will have a representative point in sensory space.

Now during runtime: You will get some unlabeled point in sensory space. Use a different distance function to find the closest representative point to this new incoming point. That tells you what class the terrain is.

Note that the success of this method depends on the quality of the mapping from the parameter space of terrain generation to sensory space, from sensory space to your fitness functions, and the eventual distance function you use to compare points in sensory space.

Note also that if you had enough memory, instead of only using the k representative sensory points to tell you which class an unlabeled sensory point belongs to, you might go through your training set and label all points with the learned class. Then during runtime you pick the nearest neighbor, and conclude that your unlabeled point in sensory space is in the same class as that neighbor.
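A minimal sketch of the offline/runtime split described above, in Python. The feature dimensions, terrain count, and distances are made up for illustration, and plain Euclidean distance stands in for the fitness-based distance function during training:

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest centroid
        dists = np.linalg.norm(points[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        # move each centroid to the mean of its cluster
        for j in range(k):
            if (labels == j).any():
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids, labels

# Offline: sample random terrains and map each to a sensory feature
# vector (here a made-up 4-D vector per terrain), then cluster.
terrains = rng.normal(size=(500, 4))
centroids, labels = kmeans(terrains, k=3)

def classify(sensed):
    """Runtime: nearest representative point -> subroutine index."""
    return int(np.linalg.norm(centroids - sensed, axis=1).argmin())

behavior = classify(rng.normal(size=4))
assert 0 <= behavior < 3
```

In the full scheme, the cluster assignment step during training would use the fitness of letting each specialized subroutine navigate the terrain that generated the point, rather than geometric distance.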

Eric
I'm using the PhysX engine to simulate the world. My sensory data is all the collision shapes (boxes, spheres, capsules, and triangle meshes) in an area around the collider, and I collect their size, orientation, and velocity (linear and angular). I haven't implemented this yet, but I'm thinking I should let the GA collect these variables, let it randomly choose when to branch off of another behavior based on one of those variables being in a certain range, and then create another behavior on that variable. I don't know how to prune out unfit behaviors over multiple tests.
DrSammyD
It's also possible that I could simulate vision and just get the collision objects in front of the character. Also, the character knows where each of its body parts is, since it's made of the same parts it's colliding with. As for computational limits, I'm trying to get this to simulate on current to near-future consumer computer hardware.
DrSammyD
The end goal is to have two of these characters fight each other with some degree of martial arts ability. But first I need them to walk. I'm not sure if it's too ambitious, as I'm quickly getting the sense that it is for the hardware currently available. Still, I'd like to try. The more I can learn now, the better, since it might be possible further down the road.
DrSammyD
A: 

There are three aspects to my answer: (1) control theory, (2) sensing, and (3) merging sensing and action.

CONTROL THEORY

The answer to your problem depends partially on what kind of control scheme you are using: is it feed-forward or feedback control? If the latter, what simulated real-time sensors do you have other than terrain information?

Simply having terrain information and incorporating it into your control strategy would not mean you are using feedback control. It is possible to use such information to select a feed-forward strategy, which seems closest to the problem that you have described.

SENSING

Whether you are using feed-forward or feedback control, you need to represent the terrain information and any other sensory data as an input space for your control system. Part of training your GA-based motion controller should be moving your skeleton through a broad range of random terrain in order to learn feature detectors. The feature detectors classify the terrain scenarios by segmenting the input space into regions critical to deciding what is the best action policy, i.e., what control behavior to employ.

How to best represent the input space depends on the level of granularity of the terrain information you have for your simulation. If it's just a discrete space of terrain type and/or obstacles in some grid space, you may be able to present it directly to your GA without transformation. If, however, the data is in a continuous space such as terrain type and obstacles at arbitrary range/direction, you may need to transform it into a space from which it may be easier to infer spatial relationships, such as coarse-coded range and direction, e.g., near, mid, far and forward, left-forward, left, etc. Gaussian and fuzzy classifiers can be useful for the latter approach, but discrete-valued coding can also work.
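A small sketch of the coarse-coding idea for the continuous case, in Python. The class centers, widths, and units are assumptions chosen for illustration; Gaussian membership functions turn one obstacle's (range, bearing) reading into a fuzzy feature vector the GA can consume:

```python
import math

# Assumed coarse coding of one obstacle's (range, bearing) reading into
# fuzzy "near/mid/far" x "left/forward/right" activations.
RANGE_CENTERS = {"near": 1.0, "mid": 4.0, "far": 10.0}            # metres (assumed)
BEARING_CENTERS = {"left": -60.0, "forward": 0.0, "right": 60.0}  # degrees

def gaussian(x, center, width):
    return math.exp(-((x - center) / width) ** 2)

def coarse_code(range_m, bearing_deg, range_width=2.0, bearing_width=40.0):
    """Return fuzzy memberships, one per (range class, bearing class)
    pair -- 9 inputs for the controller instead of raw coordinates."""
    feats = {}
    for rname, rc in RANGE_CENTERS.items():
        for bname, bc in BEARING_CENTERS.items():
            feats[f"{rname}-{bname}"] = (gaussian(range_m, rc, range_width)
                                         * gaussian(bearing_deg, bc, bearing_width))
    return feats

feats = coarse_code(1.2, 5.0)
strongest = max(feats, key=feats.get)
# An obstacle 1.2 m away, nearly dead ahead, activates "near-forward" most.
assert strongest == "near-forward"
```

The soft overlap between classes means nearby terrain configurations produce nearby feature vectors, which makes the input space easier for a learner to generalize over than raw arbitrary-range coordinates.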

MERGING SENSING AND ACTION

Using one of the input-space-encoding approaches above, you have a few options for how to connect behavior selection search space and motion control search space:

  1. Separate the two spaces into two learning problems and use a separate GA to evolve the parameters of a standard multi-layer perceptron neural network. The latter would have your sensor data (perhaps transformed) as inputs and your set of skeleton behaviors as outputs. Instead of using back-propagation or some other ANN-learning method to learn the network weights, your GA could use some fitness function to evolve the parameters over a series of simulated trials, e.g., fitness = distance traveled in a fixed time period toward point B starting from point A. This should evolve over successive generations from completely random selection of behaviors to something more coordinated and useful.

  2. Merge the two search spaces (behavior selection and skeleton motor control) by linking a multi-layer perceptron network as described in (1) above into the existing GA-based controller framework that you have, using the skeleton behavior set as the linkage. The parameter space that will be evolved will be both the neural network weights and whatever your existing controller parameter space is. Assuming that you are using a multi-objective genetic algorithm such as NSGA-II (since you have multiple fitness functions), the objectives would be stability, speed, minimization of entropy, force on joints, etc., plus some fitness function(s) targeted at learning the behavior-selection policy, e.g., distance moved toward point B starting from point A in a fixed time period.

    The difference between this approach and (1) above is that you may be able to learn both better coordination of behaviors and finer-grain motor control, since the parameter space is likely to be better explored when the two problems are merged as opposed to being kept separate. The downside is that it may take much longer to converge on reasonable parameter solution(s), and not all aspects of motor control may be learned as well as they would if the two learning problems were kept separate.

Given that you already have working evolved solutions for the motor control problem, you are probably better off using approach (1) to learn the behavior-selection model with a separate GA. Also, there are many alternatives to the hybrid GA-ANN scheme I described above for learning the latter model, including not learning a model at all and instead using a path planning algorithm as described in a separate answer from me. I simply offered this approach since you are already familiar with GA-based machine learning.
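Approach (1) can be sketched roughly as follows, in Python. Everything here is a stand-in: the sensor and behavior counts are arbitrary, and the fitness function is a toy placeholder for running a simulated walking trial:

```python
import numpy as np

rng = np.random.default_rng(1)
N_SENSORS, N_HIDDEN, N_BEHAVIORS = 6, 8, 4
GENOME = N_SENSORS * N_HIDDEN + N_HIDDEN * N_BEHAVIORS

def select_behavior(weights, sensors):
    """Tiny MLP: sensors -> hidden (tanh) -> behavior scores; argmax picks."""
    w1 = weights[:N_SENSORS * N_HIDDEN].reshape(N_SENSORS, N_HIDDEN)
    w2 = weights[N_SENSORS * N_HIDDEN:].reshape(N_HIDDEN, N_BEHAVIORS)
    return int((np.tanh(sensors @ w1) @ w2).argmax())

def fitness(weights):
    """Stand-in for 'distance travelled toward B in a fixed time'.
    The real system would run a simulated trial per evaluation."""
    trials = rng.normal(size=(20, N_SENSORS))
    wanted = (trials[:, 0] > 0).astype(int)   # toy target behavior per trial
    chosen = np.array([select_behavior(weights, s) for s in trials])
    return (chosen == wanted).mean()

# Simple elitist GA loop over the network weights.
pop = rng.normal(size=(30, GENOME))
for gen in range(25):
    scores = np.array([fitness(g) for g in pop])
    elite = pop[scores.argsort()[-10:]]        # keep the 10 fittest genomes
    children = elite[rng.integers(0, 10, 20)] + 0.1 * rng.normal(size=(20, GENOME))
    pop = np.vstack([elite, children])         # elites + mutated offspring

best = pop[np.array([fitness(g) for g in pop]).argmax()]
```

The point of the sketch is the division of labor: the GA only searches the behavior-selection weights, while your existing evolved controllers remain fixed as the behavior set being selected among.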

The action selection problem is a robust area of research in both machine learning and autonomous robotics. It's probably well-worth reading up on this topic in itself to gain better perspective and insight into your current problem, and you may be able to devise a simpler strategy than anything I've suggested so far by viewing your problem through the lens of this paradigm.

Joel Hoff
I'm going for the arbitrary range. Each object can have different positions, angles, and even velocities (directional and angular), and they aren't mapped to a grid or enumerated. Can I have the input space integrated into the behavior space? How would I connect the two search spaces?
DrSammyD
@DrSammyD - If your world model is as detailed as you described in a recent comment to [this answer](http://stackoverflow.com/questions/3077380/ai-behavior-decision-making/3090297#3090297 "title"), then there is an entirely different approach that you may wish to consider: robot path planning. I've posted a [separate answer](http://stackoverflow.com/questions/3077380/ai-behavior-decision-making/3107448#3107448 "title") to explore this approach since it is fundamentally different than the one above.
Joel Hoff
@DrSammyD - I've responded to your question of how to connect the two search spaces by adding some discussion in my answer above.
Joel Hoff
+1  A: 

In a different answer I've given to this question, I assumed that the "terrain" information you have for your model was very approximate and large-grained, e.g., "smooth and flat", "rough", "rocky", etc. and perhaps only at a grid level. However, if the world model is in fact very detailed, such as from a simulated version of a 3-D laser range scanner, then algorithmic and computational path/motion planning approaches from robotics are likely to be more useful than a machine-learning classifier system.

PATH/MOTION PLANNING METHODS

There are a fairly large number of path and motion planning methods, including some perhaps more suited to walking/locomotion, but a few of the more general ones worth mentioning are:

  • Visibility graphs
  • Potential Fields
  • Sampling-based methods

The general solution approach would be to use a path planning method to determine the walking trajectory that your skeleton should follow to avoid obstacles, and then use your GA-based controller to achieve the appropriate motion. This is very much at the core of robotics: sense the world and determine the actions and motor control required to achieve some goal(s).
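To make the potential-field idea from the list above concrete, here is a deliberately minimal Python sketch. The goal and obstacle positions are invented, the repulsion constants are arbitrary, and a real planner would need to deal with local minima, which this greedy descent does not:

```python
import math

# Illustrative potential field: the goal attracts, obstacles repel,
# and the walker follows the (numeric) negative gradient.
GOAL = (9.0, 9.0)
OBSTACLES = [(3.0, 5.0), (7.0, 5.0)]

def potential(x, y):
    attract = math.hypot(x - GOAL[0], y - GOAL[1])
    repel = sum(1.0 / max(math.hypot(x - ox, y - oy), 0.3) ** 2
                for ox, oy in OBSTACLES)
    return attract + repel

def plan(start, step=0.2, iters=500):
    """Greedy fixed-step descent on the potential; returns the trajectory."""
    x, y = start
    path = [(x, y)]
    for _ in range(iters):
        eps = 1e-3   # central-difference numeric gradient
        gx = (potential(x + eps, y) - potential(x - eps, y)) / (2 * eps)
        gy = (potential(x, y + eps) - potential(x, y - eps)) / (2 * eps)
        norm = math.hypot(gx, gy) or 1.0
        x, y = x - step * gx / norm, y - step * gy / norm
        path.append((x, y))
        if math.hypot(x - GOAL[0], y - GOAL[1]) < step:
            break
    return path

path = plan((0.0, 0.0))
```

The resulting waypoints would be the trajectory handed to the GA-based motion controller; sampling-based methods such as RRTs are the usual remedy when potential fields trap the walker in a local minimum.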

Also, a quick literature search turned up the following papers and a book as a source of ideas and starting points for further investigation. The paper on legged robot motion planning may be especially useful as it discusses several motion planning strategies.

Reading Suggestions

Steven Michael LaValle (2006). Planning Algorithms, Cambridge University Press.

Kris Hauser, Timothy Bretl, Jean-Claude Latombe, Kensuke Harada, Brian Wilcox (2008). "Motion Planning for Legged Robots on Varied Terrain", The International Journal of Robotics Research, Vol. 27, No. 11-12, 1325-1349, DOI: 10.1177/0278364908098447

Guilherme N. DeSouza and Avinash C. Kak (2002). "Vision for Mobile Robot Navigation: A Survey", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 2, February, pp 237-267.

Joel Hoff
+1 for the first two reading suggestions. LaValle is a great reference for any real work related to planning, although it doesn't explicitly consider robust planning for closed kinematic chains.
Andrew Walker