When I do the following list comprehension I end up with nested lists:

channel_values = [x for x in [ y.split(' ') for y in
    open(channel_output_file).readlines() ] if x and not x == '\n']

Basically I have a file composed of this:

7656 7653 7649 7646 7643 7640 7637 7634 7631 7627 7624 7621 7618 7615
8626 8623 8620 8617 8614 8610 8607 8604 8600 8597 8594 8597 8594 4444
<snip several thousand lines>

Each line of the file is terminated by a newline.

I need to add each number (they are all separated by a single space) to a single flat list.

Is there a better way to do this via list comprehension?

+8  A: 

You don't need list comprehensions for this:

channel_values = open(channel_output_file).read().split()
Lukáš Lalinský
+1, You beat me to it
Nadia Alramli
+4  A: 

Just do this:

channel_values = open(channel_output_file).read().split()

split() with no arguments splits on any run of whitespace, including ' ', '\t' and '\n', so all the values end up in one flat list.
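To illustrate the difference between split() and split(' '), here is a small sketch. The sample string is made up for the example and only stands in for the file's contents:

```python
# Made-up sample text: a doubled space, a tab and newlines between numbers.
line_data = "7656 7653  7649\n8626\t8623\n"

# No argument: split on any run of whitespace; no empty strings are produced.
tokens_any_whitespace = line_data.split()

# Explicit ' ' separator: only single spaces split, so a doubled space yields
# an empty string and newlines or tabs stay attached to neighbouring tokens.
tokens_single_space = line_data.split(' ')

print(tokens_any_whitespace)
print(tokens_single_space)
```

The second form is what forces the extra filtering for '\n' in the question's code.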

If you want integer values you can do:

# list() is needed on Python 3, where map() returns an iterator
channel_values = list(map(int, open(channel_output_file).read().split()))

or with list comprehensions:

channel_values = [int(x) for x in open(channel_output_file).read().split()]
Nadia Alramli
A: 

Another problem is that you're leaving the file open. (In Python 2, open is an alias for file; Python 3 only has open, so it is the safer one to use.)

Try this:

f = open(channel_output_file)
channel_values = f.read().split()
f.close()

Note that they'll be string values, so if you want integers, change the second line to:

channel_values = [int(x) for x in f.read().split()]

int(x) will throw a ValueError if you have a non integer value in the file.
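A minimal sketch of the same idea using a with block, which closes the file even when int() raises. The helper name read_channel_values and the temporary sample file are assumptions made for this example, not part of the original code:

```python
import os
import tempfile

def read_channel_values(path):
    """Return every whitespace-separated token in the file as an int."""
    # The with block closes the file even if int() raises ValueError.
    with open(path) as f:
        return [int(token) for token in f.read().split()]

# Hypothetical sample file standing in for channel_output_file;
# it deliberately contains one non-integer token.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("7656 7653 7649\n8626 8623 oops\n")
    sample_path = tmp.name

try:
    values = read_channel_values(sample_path)
except ValueError as exc:
    values = None
    print("bad token in file:", exc)
finally:
    os.remove(sample_path)
```

Catching ValueError at the call site lets you decide whether a bad token should abort the read or just be reported.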

Bryan McLemore
I thought that the files were closed automatically once you left the scope of the list comprehension?
UberJumper
In CPython, the file object is closed when it's garbage collected, and it's garbage collected as soon as there are no references to it. So no, it doesn't leave the file open, because there are no references to it after the line executes.
Lukáš Lalinský
Thanks, I got worried for a second :)
UberJumper
+2  A: 

Also, the reason the original list comprehension had nested lists is because you added an extra level of list comprehension with the inner set of square brackets. You meant this:

channel_values = [x for x in y.split(' ') for y in
    open(channel_output_file) if x and not x == '\n']

The other answers are still better ways to write the code, but that was the cause of the problem.

Peter Westlake
You had this as `open(channel_output_file).readlines()` but all you really need is `open(channel_output_file)`. The file object returned by `open()` works as an iterator that returns lines; `readlines()` will slurp in every line, which is not needed here. I have edited your code to remove the "readlines()". Also voted you +1.
steveha
This will fail because you've mixed up the order of "for x in ..." and "for y in ...". Python's nested list comprehension syntax is counter-intuitive unless you remember that it mimics the order in which you'd write your for loops without comprehensions. Also, why not just split() and skip the test for newline?
Jeffrey Harris
(Hope this comment doesn't appear more than once - apologies if it does.) Point taken about the order of x and y, thanks! That's very confusing. The reason I left in split(), and the readlines() that Steve removed, was to show what caused the problem with nested lists by making the smallest possible change to the original code. Lukáš and Nadia had already shown how to make the code far better.
Peter Westlake
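For reference, a short sketch of the clause order discussed in these comments: the for clauses of a comprehension appear in the same outer-to-inner order as the equivalent nested loops. The list named lines is a made-up stand-in for the lines of the file:

```python
# Made-up stand-in for the lines that iterating over the file would yield.
lines = ["7656 7653 7649\n", "8626 8623 8620\n"]

# Explicit loops: outer loop over lines, inner loop over tokens.
expected = []
for y in lines:
    for x in y.split():
        expected.append(x)

# The comprehension lists its for clauses in that same order:
# "for y in lines" first, then "for x in y.split()".
flat = [x for y in lines for x in y.split()]

print(flat == expected)
```

Writing "for x in y.split() for y in lines" instead refers to y before it is bound, which is why the reversed order fails.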
A: 

Is there a better way to do this via list comprehension?

Sort of..

Instead of reading the file as a list of lines with .readlines(), you can read it as one string with .read() and call .split() with no arguments, which splits on any whitespace (spaces and newlines alike):

channel_values = [x for x in open(channel_output_file).read().split()]

If you need to do anything more complicated, particularly if it involves multiple list comprehensions, you're almost always better off expanding it into a regular for loop.

out = []
for y in open(channel_output_file):
    # strip() drops the trailing newline before splitting on single spaces
    for x in y.strip().split(' '):
        if x:  # skip empty strings left by repeated spaces
            out.append(x)

Or using a for loop and a list-comprehension:

out = []
for y in open(channel_output_file):
    out.extend(
        [x for x in y.strip().split(' ')
        if x])

Basically, if you can't do something simply with a list comprehension (or need to nest them), list-comprehensions are probably not the best solution.

dbr
A: 

If you don't care about dangling file references, and you really must have a list read into memory all at once, the one-liner mentioned in other answers does work:

channel_values = open(channel_output_path).read().split()

In production code, I would probably use a generator. Why read all those lines if you don't need them?

def generate_values_for_filename(filename):
    with open(filename) as f:
        for line in f:
            for value in line.split():
                yield value

You can always make a list later if you really need to do something other than iterate over values:

channel_values = list(generate_values_for_filename(channel_output_path))
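As a rough sketch of what the lazy version buys you, the example below pulls only the first few values and then stops. The temporary sample file and the use of itertools.islice are assumptions added for illustration; the generator is repeated so the block runs on its own:

```python
import itertools
import os
import tempfile

def generate_values_for_filename(filename):
    with open(filename) as f:
        for line in f:
            for value in line.split():
                yield value

# Hypothetical sample file standing in for channel_output_path.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("7656 7653 7649\n8626 8623 8620\n")
    sample_path = tmp.name

gen = generate_values_for_filename(sample_path)
# Take only the first four values; the remaining ones are never produced.
first_four = [int(v) for v in itertools.islice(gen, 4)]
gen.close()  # finishes the suspended generator, which closes the file
os.remove(sample_path)

print(first_four)
```

Calling gen.close() matters when you stop early: the with block inside the generator only exits once the generator finishes or is closed.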
Jeffrey Harris