I'm trying to convert a large number of files from a plain text layout to CSV. The first few lines of one of the files look like this:

SLICE AT X= -0.25
   ELEM NO         XI-COORD               INWARD-NORMAL
 1     0     0.000   0.000   0.000     0.000   0.000   0.000
 2     0     0.000   0.000   0.000     0.000   0.000   0.000
 3     0     0.000   0.000   0.000     0.000   0.000   0.000

The number given in the first line (-0.25) needs to be inserted as a parameter in each of the data rows. Since this number varies in each of the hundreds of files, I can't provide it as a literal.

I've written the following sed program:

# Reduce line 1 to just a number.
s/SLICE AT X= //
# Store line 1 in hold space.
1h
# Clear the other header line.
2d
# Insert X coordinate from hold space.
/^\ \{1,\}/G
# Separate values with commas.
s/\ \{1,\}/,/g

It gets as far as producing this:

-0.25
,1,0,0.000,0.000,0.000,0.000,0.000,0.000
-0.25
,2,0,0.000,0.000,0.000,0.000,0.000,0.000
-0.25
,3,0,0.000,0.000,0.000,0.000,0.000,0.000
-0.25
,4,0,0.000,0.000,0.000,0.000,0.000,0.000
-0.25

Note that the first line of the output is the original first line.

Could anyone help me get the stored number onto the start of each data line?

Thanks in advance,

Ross

+1  A: 

Does it have to be sed? This does the trick for me:

$ perl -lane '$x=$1,next if m/^SLICE AT X= (.+)$/; next if $. == 2; print join "," => ($x, @F)' /tmp/so-1255443
-0.25,1,0,0.000,0.000,0.000,0.000,0.000,0.000
-0.25,2,0,0.000,0.000,0.000,0.000,0.000,0.000
-0.25,3,0,0.000,0.000,0.000,0.000,0.000,0.000
pilcrow
Thanks. This does a great job with the provided excerpt, but it runs into problems with multiple files as input. Although each of the files uses the same layout, the second line creeps into the output. I wonder if it might be due to the absolute line position?
rossmcf
$. is not reset when <> moves on to a new file. See eof in the perlfunc documentation.
William Pursell
Thanks. I ended up just wrapping the perl call in a little shell script, which did the trick. Thanks for the lateral thinking.
rossmcf
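The multi-file problem raised in the comments comes from $. counting lines across all input files. Here is a minimal sketch of one workaround, assuming every file has the same two header lines as the excerpt in the question: closing ARGV at the end of each file resets $., as described under eof in perlfunc. The function name slice_to_csv is made up for illustration.

```shell
# Hypothetical helper: convert one or more SLICE files to CSV rows on stdout.
slice_to_csv() {
    perl -lane '
        # Remember the X value from the "SLICE AT X=" header line.
        if (/^SLICE AT X=\s*(\S+)/) { $x = $1 }
        # Print every line after the two header lines, prefixed with X.
        elsif ($. > 2) { print join ",", $x, @F }
        # Closing ARGV at end of file resets $. for the next file.
        close ARGV if eof;
    ' "$@"
}
```

It could be called as slice_to_csv file1 file2 > all.csv, where the file names are placeholders.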
+1  A: 

Note that this is really best done with perl, but here's a sed solution.

#!/usr/bin/sed -f

# Reduce line 1 to just a number.
s/SLICE AT X= //
# Store line 1 in hold space.
1h
# Clear the other header line.
1,2d
# Insert X coordinate from hold space.
x
G
# Separate values with commas.
s/\ \{1,\}/,/g
s/\n//g
p
s/\([^,]*\),.*/\1/
h
d

The issue is that G appends the hold space to the pattern space, so the stored number ends up after the data row instead of before it. You need to use x first to swap the pattern and hold spaces, then G to append the hold space (which now contains the original pattern space), output your line, and then restore the hold space. Really, sed is not the right tool for this...
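To see the difference in isolation, here is a small sketch on a two-line input, with the letter A standing in for the stored X value and B for a data row. The variable names are only for illustration.

```shell
# Line 1 (A) is copied to the hold space; line 2 (B) plays the data row.
# G alone: the pattern space becomes "B\nA", so the held value lands after the row.
g_only=$(printf 'A\nB\n' | sed -n -e '1h' -e '2{G;p;}')

# x then G: the swap makes the pattern space "A", and G appends "B" after it.
x_then_g=$(printf 'A\nB\n' | sed -n -e '1h' -e '2{x;G;p;}')

printf 'G only:\n%s\nx then G:\n%s\n' "$g_only" "$x_then_g"
```

The first command prints B followed by A, the second prints A followed by B. After the swap the hold space contains B, which is why the script above has to put the number back with h before moving on to the next line.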

William Pursell
+2  A: 

I agree with William Pursell: you have not reached the limit of what this tool can do, but you've reached the limit of what should be done with this tool.

Anyway, here's another approach, still slightly kludgey.


# Reduce line 1 to just a number.
s/SLICE AT X= //
# Store line 1 in hold space.
1h
# Clear the other header line.
1,2d
# Insert X coordinate from hold space.
/^\ \{1,\}/G
# The \n added by G marks where to split/swap
s/\(.*\)\n\(.*\)/\2\1/
# Separate values with commas.
s/ \{1,\}/,/g
Beta
+1  A: 

You can use awk for tasks like this. Use sed only for very simple tasks.

awk '/SLICE AT X/{ num = $NF;print;next}
NR>2{
    $(NF+1) = num     
    $1=$1    
}1' OFS="," file

output

# more file
SLICE AT X= -0.25
   ELEM NO         XI-COORD               INWARD-NORMAL
 1     0     0.000   0.000   0.000     0.000   0.000   0.000
 2     0     0.000   0.000   0.000     0.000   0.000   0.000
 3     0     0.000   0.000   0.000     0.000   0.000   0.000
# ./shell.sh
SLICE AT X= -0.25
   ELEM NO         XI-COORD               INWARD-NORMAL
1,0,0.000,0.000,0.000,0.000,0.000,0.000,-0.25
2,0,0.000,0.000,0.000,0.000,0.000,0.000,-0.25
3,0,0.000,0.000,0.000,0.000,0.000,0.000,-0.25
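
If the layout asked for in the question is wanted instead (X value first, header lines dropped), a variant along these lines may be closer. It is only a sketch, assuming every file starts with the same two header lines. FNR is used rather than NR so that the line count restarts for each input file. The function name slice_awk is hypothetical.

```shell
# Hypothetical helper: print "X,col1,col2,..." for each data row of each file.
slice_awk() {
    awk -v OFS="," '
        # Header line 1: remember the X value (last field), print nothing.
        /^SLICE AT X=/ { num = $NF; next }
        # FNR restarts at 1 for every file, so this skips line 2 of each file.
        FNR > 2 {
            $1 = $1          # rebuild the record so fields are joined with OFS
            print num, $0
        }
    ' "$@"
}
```

With the sample file above, slice_awk file prints rows such as -0.25,1,0,0.000,0.000,0.000,0.000,0.000,0.000.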
ghostdog74