I like the options offered by the _ast module; it's really powerful. Is there a way of getting the full AST from it?

For example, if I get the AST of the following code :


import os
os.listdir(".")

by using :


ast = compile(source_string,"<string>","exec",_ast.PyCF_ONLY_AST)

the body of the ast object will have two elements, an Import object and an Expr object. However, I'd like to go further and obtain the AST of import and listdir. In other words, I'd like to make _ast descend to the lowest level possible.

I think it's logical that this sort of thing should be possible. The question is how?

EDIT: by the lowest level possible, I didn't mean accessing what's "visible". I'd like to get the AST for the implementation of listdir as well: stat and any other function calls that may be executed for it.

+4  A: 
py> ast._fields
('body',)
py> ast.body
[<_ast.Import object at 0xb7978e8c>, <_ast.Expr object at 0xb7978f0c>]
py> ast.body[1]
<_ast.Expr object at 0xb7978f0c>
py> ast.body[1]._fields
('value',)
py> ast.body[1].value
<_ast.Call object at 0xb7978f2c>
py> ast.body[1].value._fields
('func', 'args', 'keywords', 'starargs', 'kwargs')
py> ast.body[1].value.args
[<_ast.Str object at 0xb7978fac>]
py> ast.body[1].value.args[0]
<_ast.Str object at 0xb7978fac>
py> ast.body[1].value.args[0]._fields
('s',)
py> ast.body[1].value.args[0].s
'.'

HTH

Martin v. Löwis
I know how to get that. The thing is, how do I get to listdir's AST? Not the function argument, but the implementation beneath.
Geo
There is no AST for listdir - it is implemented in C.
Martin v. Löwis
Then how would I go about obtaining all there is to obtain? How could I make _ast recurse into each Python-implemented function?
Geo
@Geo, as I mentioned in my answer: upgrade to 2.6, use module ast without leading underscore, and subclass ast.NodeVisitor as appropriate (or equivalently use ast.iter_child_nodes recursively and ast.iter_fields as needed).
Alex Martelli
You can't "recurse" into Python-implemented functions, either. First, if you have "foo.bar()", you can't even know which bar is being called, because of late binding - you actually have to run that code. The same holds for module-level functions: merely compiling a call to one does not look at the called function at all. It may not exist, and even if it does exist, you may have only the byte code of the function. If you want an AST of the function, you need to locate the module source and compile it yourself. Traversing from your code to the module "automatically" is just not possible.
Martin v. Löwis
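To illustrate "locate the module source and compile it yourself", here is a minimal sketch. It assumes a modern Python 3 interpreter with the ast and inspect modules (not the 2.5/2.6 setup discussed in this thread), and function_ast is a hypothetical helper name, not part of any library. It only works on a function object you already hold at runtime, so it does not get around the late-binding problem described above.

```python
import ast
import inspect
import os
import textwrap


def function_ast(func):
    """Return an AST for func's source, or None if no source is available.

    Hypothetical helper: it needs the live function object, so the code
    that binds the name must already have run.
    """
    try:
        source = inspect.getsource(func)
    except (TypeError, OSError):
        # TypeError: built-in implemented in C (e.g. os.listdir).
        # OSError: Python function whose source file cannot be found.
        return None
    # dedent so that methods and nested functions parse as top-level code
    return ast.parse(textwrap.dedent(source))


c_tree = function_ast(os.listdir)     # implemented in C, so no source
py_tree = function_ast(os.path.join)  # implemented in Python
```

Here c_tree is None because listdir has no Python source, while py_tree (on an installation that ships the standard library source) is a Module node whose first body element is the FunctionDef for join.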
+4  A: 

You do get the whole tree this way -- all the way to the bottom -- but it IS held as a tree, exactly... so at each level, to get the children, you have to explicitly visit the needed attributes. For example (I'm naming the compile result cf rather than ast, because the name ast would hide the standard library ast module -- I assume you only have 2.5 rather than 2.6, which is why you're using the lower-level _ast module instead?)...:

>>> cf.body[0].names[0].name
'os'

This is what tells you that the import statement is importing the name os (and only that one, because the .names field of .body[0], which is the import, has length 1).

In Python 2.6's module ast you also get helpers to let you navigate more easily on a tree (e.g. by the Visitor design pattern) -- but the whole tree is there in either 2.5 (with _ast) or 2.6 (with ast), and in either case is represented in exactly the same way.

To handily visit all the nodes in the tree, in 2.6, use module ast (no leading underscore) and subclass ast.NodeVisitor as appropriate (or equivalently use ast.iter_child_nodes recursively and ast.iter_fields as needed). Of course these helpers can be implemented in pure Python on top of _ast if you're stuck in 2.5 for some reason.
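As a rough sketch of both approaches, the following uses the question's two-line snippet. It assumes Python 3 syntax and the ast module; the names NodeCollector and walk_with_depth are made up for illustration. The exact node types you see (for example, how the "." literal is represented) depend on the Python version.

```python
import ast

SOURCE = 'import os\nos.listdir(".")\n'


class NodeCollector(ast.NodeVisitor):
    """Record every node type visited, plus the names of imported modules."""

    def __init__(self):
        self.node_types = []
        self.imported = []

    def visit_Import(self, node):
        # Called for each Import node; node.names is a list of alias nodes.
        self.imported.extend(alias.name for alias in node.names)
        self.generic_visit(node)

    def generic_visit(self, node):
        # Runs for every node: directly for types without a visit_X method,
        # and via the explicit call in visit_Import above.
        self.node_types.append(type(node).__name__)
        super().generic_visit(node)


def walk_with_depth(node, depth=0):
    """Equivalent manual recursion built on ast.iter_child_nodes."""
    yield depth, node
    for child in ast.iter_child_nodes(node):
        yield from walk_with_depth(child, depth + 1)


tree = ast.parse(SOURCE)
collector = NodeCollector()
collector.visit(tree)

# Print an indented outline of the whole tree, one node type per line.
for depth, node in walk_with_depth(tree):
    print("  " * depth + type(node).__name__)
```

After the visit, collector.imported holds the imported module names and collector.node_types lists the node types in visiting order, starting with Module. Either way the traversal stops at the nodes of this snippet: the Call node only records that os.listdir is being called, not what listdir does.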

Alex Martelli