I would like to call a Python script from within a Bash while loop, but I do not understand the Bash syntax for while loops (and perhaps variables) well enough. The behaviour I am looking for is this: while a file still contains lines (DNA sequences), I call a Python script that extracts a group of sequences so that another program (dialign2) can align them, and then I append the alignments to a result file. Note: I am not trying to iterate over the file. What should I change for the Bash while loop to work? I also want to be sure that the loop re-checks the changing file.txt on each pass. Here is my attempt:

#!/bin/bash
# Call a python script as many times as needed to treat a text file

c=1
while [ `wc -l file.txt` > 0 ] ; # Stop when file.txt has no more lines
do
    echo "Python script called $c times"
    python script.py # Uses file.txt and removes lines from it
    # The Python script also returns a temp.txt file containing DNA sequences
    c=$c + 1
    dialign -f temp.txt # aligns DNA sequences
    cat temp.fa >>results.txt # append DNA alignments to result file
done

Thanks!

+1  A: 

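Try -gt: inside [ ], an unquoted > is a shell redirection, not a comparison. Also feed the file to wc through standard input, so that wc prints only the count and not the file name (otherwise the test receives too many arguments).

while [ "$(wc -l < file.txt)" -gt 0 ]
do
    ...
    c=$((c + 1))
done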
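
To make the two pitfalls concrete, here is a minimal sketch, run in a throwaway directory. The sample contents of file.txt and the variable names (workdir, with_name, count_only, status) are assumptions made for illustration only.

```shell
# Work in a throwaway directory so nothing in the current one is touched.
workdir=$(mktemp -d)
printf 'seq1\nseq2\n' > "$workdir/file.txt"

# Inside [ ], an unquoted > is a redirection: this command creates a file
# named "0" and the test itself only checks that the string "1" is non-empty.
( cd "$workdir" && [ 1 > 0 ] )
ls "$workdir"

# With the file name as an argument, wc prints the count AND the name.
with_name=$(wc -l "$workdir/file.txt")
# Reading from standard input leaves only the count.
count_only=$(wc -l < "$workdir/file.txt")
echo "with name:  $with_name"
echo "count only: $count_only"

# The count alone is what an integer comparison such as -gt needs.
if [ "$count_only" -gt 0 ]; then
    status="file.txt still has lines"
fi
echo "$status"
```

The listing from ls should show the stray file named 0 next to file.txt, which is what the original `[ ... > 0 ]` test would leave behind on every pass.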
Joe Koberg
what exactly is the -gt operator doing?
Morlock
gt = "greater than"
vladr
@Vlad Romascanu Thank you!
Morlock
or you can use `>` inside `while (( ))` for integer comparisons like this: `while (( $(wc -l < file.txt) > 0 ))` (I learned from **ghostdog74** about the redirection with `wc` which eliminates the filename and only returns the count.)
Dennis Williamson
you don't need to scan the entire file to know whether it's empty; `-s` will of course work, as will (just for fun) `grep -lv '^ *$' file.txt >/dev/null`, which will detect (and stop after) the first non-blank line -- with status 0 for a file with one or more non-blank lines, and status 1 for all other cases (incl. empty/zero-byte files)
vladr
+2  A: 

No idea why you want to do this.

c=1
while [[ -s file.txt ]] ; # Stop when file.txt has no more lines
do
    echo "Python script called $c times"
    python script.py # Uses file.txt and removes lines from it
    c=$(($c + 1))
done
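
As a rough sketch of how this loop fits the workflow described in the question, the example below replaces the real tools with stand-ins so that it can run anywhere. The functions extract_group and align_group are hypothetical placeholders for `python script.py` and `dialign -f temp.txt`, and the sample sequences are made up. It uses `[ -s ... ]`, which performs the same size check as `[[ -s ... ]]`.

```shell
workdir=$(mktemp -d)
input="$workdir/file.txt"
# Four made-up FASTA records, two lines each.
printf '>a1\nACGT\n>a2\nACGA\n>b1\nTTGA\n>b2\nTTGC\n' > "$input"

# Hypothetical stand-in for "python script.py": moves the first two
# records (four lines) out of file.txt and into temp.txt.
extract_group() {
    head -n 4 "$input" > "$workdir/temp.txt"
    tail -n +5 "$input" > "$workdir/rest.txt"
    mv "$workdir/rest.txt" "$input"
}

# Hypothetical stand-in for "dialign -f temp.txt": just copies its input
# to temp.fa, the file the question appends to the results.
align_group() {
    cp "$workdir/temp.txt" "$workdir/temp.fa"
}

c=0
while [ -s "$input" ]; do        # the test is re-run before every pass
    c=$((c + 1))
    echo "Pass $c"
    extract_group
    align_group
    cat "$workdir/temp.fa" >> "$workdir/results.txt"
done
echo "Finished after $c passes"
```

Because the condition is a command that runs again before each pass, the loop sees whatever the previous pass left in file.txt; nothing is cached between iterations.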
MattH
`-s file` is True if the file exists and has a size greater than zero.
MattH
You just have to be careful that the python script doesn't leave any trailing newline or whatnot.
gnibbler
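
A small sketch of that caveat: a file holding nothing but a newline still has a size greater than zero, so `-s` reports it as non-empty and the loop would never stop. The grep pattern used as a stricter check is one possible alternative, not something taken from the answer, and the variable names are illustrative.

```shell
workdir=$(mktemp -d)
leftover="$workdir/file.txt"
printf '\n' > "$leftover"        # one byte: a lone newline

# -s only looks at the size, so the leftover newline counts as content.
if [ -s "$leftover" ]; then
    size_check="non-empty"
else
    size_check="empty"
fi

# grep -q exits with status 0 only if some line contains a
# non-whitespace character, so a blank-only file is treated as done.
if grep -q '[^[:space:]]' "$leftover"; then
    content_check="has content"
else
    content_check="blank"
fi

echo "size check:    $size_check"
echo "content check: $content_check"
```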
There's plenty about this question that calls for caution. I'm struggling to imagine why a Python script without arguments would need to be called multiple times by bash.
MattH
Thanks, that totally does the trick for me. The reason I want to do that (suggest another approach if you see a better one) is to recursively align DNA sequences using the program 'dialign2', while choosing which sequences to align with a Python script each time and removing those sequences from the input text file (in fasta format) for another pass. The choice of sequences to align is based on a portion of their names, which I feel very comfortable doing in Python. Thanks again!
Morlock
I added details in code to show a bit more about what I am trying to do.
Morlock
A: 

The following should do what you say you want:

#!/bin/bash

c=1
while read line; 
do
    echo "Python script called $c times"
    # $line contains a line of text from file.txt
    python script.py 
    c=$((c + 1))
done < file.txt

However, there is no need to use bash to iterate over the lines in a file. You can do that quite easily without ever leaving Python:

# "with" closes the file automatically when the block ends
with open('file.txt', 'r') as myfile:
    for count, line in enumerate(myfile):
        print('%i lines in file' % (count + 1))
        # the variable "line" contains the line of text from file.txt

        # Do your thing here.
vezult
UUOC (useless use of cat). There is no need for cat with a while loop.
ghostdog74
Hi, I am not trying to merely iterate over the lines of a file, but to create another temporary file on each pass to be used by another program, until the original file is empty. I added details to the question. Cheers
Morlock
@ghostdog74: True. But if you try while read x < file.txt, you get an infinite loop. I've improved my example.
vezult
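
The difference between the two placements of the redirection can be sketched as follows. The file contents and variable names are made up for the example, and the first loop is capped at three iterations so that it terminates.

```shell
workdir=$(mktemp -d)
printf 'first\nsecond\nthird\n' > "$workdir/file.txt"

# Redirection attached to read: the file is reopened on every iteration,
# so read returns the first line each time and the loop would never end.
# The counter caps this demonstration at three iterations.
seen_per_read=""
n=0
while read -r x < "$workdir/file.txt"; do
    seen_per_read="$seen_per_read$x "
    n=$((n + 1))
    [ "$n" -ge 3 ] && break
done

# Redirection attached to done: the file is opened once for the whole
# loop, so each read continues where the previous one stopped.
seen_per_loop=""
while read -r x; do
    seen_per_loop="$seen_per_loop$x "
done < "$workdir/file.txt"

echo "per read: $seen_per_read"
echo "per loop: $seen_per_loop"
```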
A: 

@OP: if you want to loop through a file, just use a while read loop. Also, you are not using the variable $c, or the line itself. Are you passing each line to your Python script? Or are you just calling your Python script whenever a line is encountered? (Your script is going to be slow if you do that.)

while true
do
    while read -r line
    do
       # if you are taking STDIN in myscript.py, then something must be passed to
       # myscript.py, if not i really don't understand what you are doing.

       echo "$line" | python myscript.py > temp.txt
       dialign -f temp.txt # aligns DNA sequences
       cat temp.txt >>results.txt
    done <"file.txt"
    if [ ! -s "file.txt" ]; then break; fi
done  

Lastly, you could have done everything in Python. The way to iterate over "file.txt" in Python is simply

# "with" closes the file automatically, so no explicit f.close() is needed
with open("file.txt") as f:
    for line in f:
        print("do something with line")
        print("or bring what you have in myscript.py here")
ghostdog74
Thanks, but I am not trying to loop through a file. I added details to the original question.
Morlock
Yes you are. You are using wc -l on file.txt in a while loop, which means you are iterating over the file. The only difference is whether or not you use the line variable that is iterated.
ghostdog74
I am merely trying to use a characteristic of the file (the fact that it still has information in it) as a criterion for continuing to use the Python script on it. I guess the 'wc' function IS iterating the file. Is this what you mean?
Morlock
yes, that's what i mean. because you are checking for count of lines
ghostdog74