I am working on a LaTeX document that requires typesetting a significant amount of Python source code. I'm using Pygments (the Python module, not the online demo) to wrap this Python in LaTeX, which works well except for long individual lines, which simply run off the page. I could wrap these lines manually, but that doesn't seem like an elegant solution, and I'd rather spend time puzzling over crazy automated solutions than on repetitive tasks.

What I would like is some way of processing the Python source code to wrap lines at a certain maximum character length while preserving functionality. I've played around with some Python, and the closest I've come is inserting \\\n at the last whitespace before the maximum line length. Of course, if this lands inside a string or a comment, things go wrong. Quite frankly, I'm not sure how to approach this problem.
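To make the failure mode concrete, here is a minimal sketch of that naive approach. It is not my original code: the names `naive_wrap` and `compiles`, the 40-character width, and the sample lines are made up for illustration.

```python
def naive_wrap(line, width=40, indent="    "):
    """Break at the last space before `width`, ending each piece with a backslash."""
    pieces = []
    while len(line) > width:
        start = len(line) - len(line.lstrip())  # never cut inside the indentation
        cut = line.rfind(" ", start, width)
        if cut == -1:
            break  # no space left to break at
        pieces.append(line[:cut] + " \\")
        line = indent + line[cut + 1:]
    pieces.append(line)
    return "\n".join(pieces)


def compiles(source):
    """Report whether Python still accepts the wrapped source."""
    try:
        compile(source, "<wrapped>", "exec")
        return True
    except SyntaxError:
        return False


call = "result = some_function(argument_one, argument_two, argument_three)"
comment = "x = compute(alpha, beta)  # a long trailing comment that runs past the limit"

print(compiles(naive_wrap(call)))     # True: plain code survives the break
print(compiles(naive_wrap(comment)))  # False: the backslash lands inside the comment
```

A break inside a string literal is arguably worse: the result can still compile, but the continuation line's indentation silently becomes part of the string.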

So, is anyone aware of a module or tool that can process source code so that no line exceeds a certain length, or at least of a good way to start coding something like that?

+1  A: 

I'd try the reformat tool in an editor like NetBeans.

When you reformat Java, it properly fixes line lengths both inside and outside of comments. If the same algorithm were applied to Python, it would work.

For Java it lets you set the wrapping width and a bunch of other parameters. I'd be pretty surprised if that didn't exist for Python, either natively or as a plugin.

Can't tell for sure just from the description, but it's worth a try:

http://www.netbeans.org/features/python/

Bill K
Reformatting Java is likely to be a bit easier than doing the same in Python. Can anyone confirm whether NetBeans (or any other editor) is able to do this correctly?
TokenMacGuy
A less intelligent but similar solution in Vim would be to visually select the data to "re-columnize" and press `gw` to allow columns to have a maximum width of whatever `textwidth` is currently set to. But this Vim trick really works better on plaintext than source code.
Mark Rushakoff
+3  A: 

You might want to extend your current approach a bit by using the tokenize module from the standard library to determine where to put your line breaks. That way you can see the actual tokens (COMMENT, STRING, etc.) of your source code rather than just the whitespace-separated words.

Here is a short example of what tokenize can do (a Python 2 session):

>>> from cStringIO import StringIO
>>> from tokenize import tokenize
>>> 
>>> python_code = '''
... def foo(): # This is a comment
...     print 'foo'
... '''
>>> 
>>> fp = StringIO(python_code)
>>> 
>>> tokenize(fp.readline)
1,0-1,1:    NL '\n'
2,0-2,3:    NAME 'def'
2,4-2,7:    NAME 'foo'
2,7-2,8:    OP '('
2,8-2,9:    OP ')'
2,9-2,10:   OP ':'
2,11-2,30:  COMMENT '# This is a comment'
2,30-2,31:  NEWLINE '\n'
3,0-3,4:    INDENT '    '
3,4-3,9:    NAME 'print'
3,10-3,15:  STRING "'foo'"
3,15-3,16:  NEWLINE '\n'
4,0-4,0:    DEDENT ''
4,0-4,0:    ENDMARKER ''
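
Building on that token listing, the following is a rough sketch of how the break points could be chosen. It is written for Python 3 (the session above is Python 2), and the names `scan` and `wrap_source`, the `width` and `extra_indent` parameters, and the sample code are all assumptions made for illustration. It only ever cuts where a token starts, so it never splits a string or a comment; a long trailing comment or a single oversized token is simply left as it is.

```python
import ast
import io
import tokenize

# Token types that never mark a useful place to break a line.
_SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
         tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}


def scan(source):
    """Return ({row: [token start columns]}, {row: comment start column})."""
    breaks, comments = {}, {}
    depth = 0  # > 0 while inside an f-string (tokenized in parts on Python 3.12+)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        row, col = tok.start
        name = tokenize.tok_name[tok.type]
        inside = depth > 0
        if name.endswith("STRING_START"):
            depth += 1
        elif name.endswith("STRING_END"):
            depth -= 1
        if tok.type == tokenize.COMMENT:
            comments[row] = col
        if not inside and tok.type not in _SKIP:
            breaks.setdefault(row, []).append(col)
    return breaks, comments


def wrap_source(source, width=79, extra_indent="    "):
    """Insert backslash continuations at token boundaries only."""
    breaks, comments = scan(source)
    out = []
    for row, line in enumerate(source.split("\n"), start=1):
        lead = line[:len(line) - len(line.lstrip())]
        prefix = lead + extra_indent  # indentation of continuation lines
        cols = breaks.get(row, [])
        # Only the code part counts; a trailing comment cannot be continued.
        code_end = len(line[:comments.get(row, len(line))].rstrip())
        while code_end > width:
            # Leave two columns for the " \" marker and insist on progress.
            usable = [c for c in cols if len(prefix) < c <= width - 2]
            if not usable:
                break  # e.g. one very long string literal
            cut = usable[-1]
            out.append(line[:cut].rstrip() + " \\")
            shift = len(prefix) - cut
            cols = [c + shift for c in cols if c > cut]
            line = prefix + line[cut:]
            code_end += shift
        out.append(line)
    return "\n".join(out)


code = (
    "def foo():  # This is a comment that is long enough to pass the limit\n"
    "    total = first_value + second_value + third_value + 'a string with spaces in it'\n"
)
wrapped = wrap_source(code, width=50)
print(wrapped)

# Sanity check: both versions parse to the same syntax tree.
assert ast.dump(ast.parse(code)) == ast.dump(ast.parse(wrapped))
```

Comparing the two syntax trees is a cheap way to confirm that the wrapping preserved functionality. Over-long comments would need separate handling, for example re-flowing them onto their own lines.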
Rick Copeland
Now that looks promising, I will look into the tokenize module. Thanks!
Markus
I imagine that this is the way to go. I will post code as another answer when I get a chance to get back to it.
Markus
+1  A: 

I use the listings package in LaTeX to insert source code; it handles syntax highlighting, line breaking, and more.

Put the following in your preamble:

\usepackage{listings}
%\lstloadlanguages{Python} % Load only these languages
\newcommand{\MyHookSign}{\hbox{\ensuremath\hookleftarrow}}

\lstset{
    % Language
    language=Python,
    % Basic setup
    %basicstyle=\footnotesize,
    basicstyle=\scriptsize,
    keywordstyle=\bfseries,
    commentstyle=,
    % Looks
    frame=single,
    % Linebreaks
    breaklines,
    prebreak={\space\MyHookSign},
    % Line numbering
    tabsize=4,
    stepnumber=5,
    numbers=left,
    firstnumber=1,
    %numberstyle=\scriptsize,
    numberstyle=\tiny,
    % Above and beyond ASCII!
    extendedchars=true
}

The package also has commands for inline code (\lstinline), for including entire files (\lstinputlisting), for showing listings as floating figures, and so on.

Morten Siebuhr
Good tip, I'll definitely give that a go.
Markus