I'm trying to write a script that breaks a VERY large file into smaller pieces, each of which is then passed to a script that runs in the background. The motivation is that with the processing running in the background, I can handle the pieces in parallel.
Here is my code. ./seq works just like the normal seq command (which my Mac doesn't have), and $1 is the huge file to be split.
echo "Splitting and Running Script"
for i in $(./seq 0 14000000 500000)
do
    awk ' { if (NR>='$i' && NR<'$(($i+500000))') { print $0 > "xPart'$i'" } }' $1
    python FastQ2Seq.py xPart$i &
done
wait
echo "Concatenating"
for k in *.out.seq
do
    cat $k >> original.seq
done
for j in *.out.qul
do
    cat $j >> original.qul
done
echo "Cleaning"
rm xPart*
My problem is that only xPart0 is created, and it contains only 499995 lines when the program hangs. I added some debugging echo statements to the script, so I know the awk statement is where it stops, but I can't figure out what is going wrong.
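For comparison, the split-then-process-in-parallel idea described above could be sketched with the standard split utility, which reads the input once instead of rescanning it for every piece. This is only a minimal sketch under stated assumptions: split_demo, big.txt and process_part are hypothetical names, the 25-line file stands in for the huge input, and process_part stands in for the python FastQ2Seq.py call.

```shell
#!/bin/sh
# Sketch only: split_demo, big.txt and process_part are hypothetical names.
mkdir -p split_demo

# Build a small 25-line stand-in for the huge input file.
awk 'BEGIN { for (i = 1; i <= 25; i++) print "line " i }' > split_demo/big.txt

# One pass over the input: 10 lines per piece, named xPartaa, xPartab, ...
split -l 10 split_demo/big.txt split_demo/xPart

# Stand-in for the real worker (python FastQ2Seq.py "$1" in the script above).
# It writes the piece's line count to <piece>.out.
process_part() {
    wc -l < "$1" | tr -d ' ' > "$1.out"
}

# Start one background job per piece, then wait for all of them.
for part in split_demo/xPart??
do
    process_part "$part" &
done
wait

# The glob expands in sorted order, so results are joined in input order.
cat split_demo/xPart??.out > split_demo/original.out
```

With the real file, the piece size would presumably be 500000 and the worker would be the FastQ2Seq.py call. Note that split names the pieces with letter suffixes (xPartaa, xPartab, ...) rather than line offsets.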