I have some files with fixed line size and fixed field size that I need to extract information from. Normally, I'd use Cygwin (cut et al), but that's not an option in this case due to (boneheaded) management policies I can't change. It has to be done using the standard XP toolset included with Windows.

I need to extract the 10 characters at offset 7 and 4 characters at offset 22 (zero-based), and output them to a file but with a slight twist:

  • The first field may have a negative, positive, or no sign (at the start or end). The sign should be moved to the front, or removed totally if it's positive.
  • The second field should have leading and trailing spaces removed.

For example:

          1         2         3          <- ignore (these lines not in file,)
0123456789012345678901234567890123456789 <- ignore ( here only for info.)
xxxxxxx    15.22-yyyyyABCDzzzzzzzzzzz...
xxxxxxx   122.00+yyyyy XX zzzzzzzzzzz...
xxxxxxx         9yyyyyYYY zzzzzzzzzzz...

should produce (< indicates end of line):

-15.22,ABCD<
122.00,XX<
9,YYY<
+2  A: 

This site has some pointers on how to extract substrings in cmd.exe: http://www.dostips.com/DtTipsStringManipulation.php

That site suggests that you can use

%varname:~2,3%

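to take a substring of a variable (here, 3 characters starting at zero-based offset 2). This seems to fit your needs, since your two fields would be %varname:~7,10% and %varname:~22,4%, except you now have to get each line into a variable.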

Next, you want to look at the ghastly for loop syntax, as well as if and branching (you can goto :labels in batch).

This stuff is all rather ugly, but if you really have to go there...

Here is a question on SO about looping through the lines of a file and doing stuff to them: http://stackoverflow.com/questions/155932/how-do-you-loop-through-each-line-in-a-text-file-using-a-windows-batch-file

Daren Thomas
Thanks, @Daren. It turns out that VBScript will be far easier. +1 for the answer anyway.
paxdiablo
yup. I would avoid cmd.exe wherever possible (I was tempted to see if I could post some source, but then... it's just plain ghastly!)
Daren Thomas
Daren: Hey, don't insult languages with unfamiliar syntax. You can do great things in batch files and for some people it's actually fun (admittedly, those are usually the same people who delight in solving problems with Brainfuck or SNUSP).
Joey
@Johannes: A language can be both useful (to get things done) and ghastly at the same time. These are orthogonal concepts. As to fun... there are a lot of really weird people out there (about one in twenty).
Daren Thomas
+1  A: 

If you are working with a modern Windows, you are not restricted to native cmd.exe commands; you can use VBScript. If your policy is not to use VBScript either, then I guess you should sack your management :)

Set objFS = CreateObject("Scripting.FileSystemObject")
strFile = "c:\test\file"
Set objFile = objFS.OpenTextFile(strFile)
strFirstLine = objFile.ReadLine ' reads and skips the first line; remove this if the file has no header line
Do Until objFile.AtEndOfStream
    strLine = objFile.ReadLine
    var1 = Mid(strLine, 10) ' substring from position 10 (counting from 1) to the end of the line
    ' var2 = Mid(strLine, <pos>, <length>) ' get the next field and save it to var2
    WScript.Echo var1 & var2 ' print them out
Loop

Basically, to "cut" characters out of a string, you use the Mid() function. Please look at the VBScript documentation to find out more.

Save the above as test.vbs and, on the command line, do

c:\test> cscript /nologo test.vbs > newfile

Of course, "substring" can also be done with pure cmd.exe, but I will leave it to others to guide you on that.
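One thing to watch with Mid(): it counts positions from 1, while the offsets in the question are zero-based, so each offset needs 1 added. A minimal sketch of that mapping follows. GetField is a hypothetical helper name used only for illustration, and the sample string is the first example line from the question.

```vbscript
Option Explicit

' Hypothetical helper (not from the answer above): extract a field
' given a zero-based offset, as the question specifies them.
Function GetField(strLine, intOffset, intLength)
    ' Mid() is one-based, so shift the zero-based offset by 1.
    GetField = Mid(strLine, intOffset + 1, intLength)
End Function

Dim strSample
strSample = "xxxxxxx    15.22-yyyyyABCDzzzzzzzzzzz"

' Brackets make any padding visible in the output.
WScript.Echo "[" & GetField(strSample, 7, 10) & "]"        ' first field, untrimmed
WScript.Echo "[" & Trim(GetField(strSample, 22, 4)) & "]"  ' second field, trimmed
```

Run it with cscript /nologo in the same way as test.vbs. The first line printed still shows the field's leading spaces and trailing sign, which is why trimming and sign handling are needed as separate steps.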

Update by Pax: Based on this answer, I came up with the following which will be a good start:

option explicit
dim objFs, objFile, strLine, value1, value2

if wscript.arguments.count < 1 then
    wscript.echo "Usage: process <input-file>"
    wscript.quit
end if

set objFs=createObject("Scripting.FileSystemObject")
set objFile = objFs.openTextFile(wscript.arguments.item(0))
do  until objFile.atEndOfStream
    strLine= objFile.readLine
    value1 = trim(mid(strLine, 8, 10))
    value2 = trim(mid(strLine, 23, 4))
    if right(value1,1) = "-" then value1 = "-" & left(value1,len(value1)-1)
    if right(value1,1) = "+" then value1 = left(value1,len(value1)-1)
    if left(value1,1) = "+" then value1 = mid(value1,2)
    wscript.echo value1 & "," & value2
loop

This matches all the requirements we had. We can make the offsets and lengths into command-line arguments later.

End update.
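The update leaves the offsets and lengths hard-coded. Here is a minimal sketch of taking them from the command line instead. The argument order and the NormalizeSign helper name are assumptions made for illustration, not something from the original post. Offsets are given zero-based, as in the question, and converted for Mid().

```vbscript
Option Explicit
Dim objFs, objFile, strLine, value1, value2
Dim pos1, len1, pos2, len2

' Assumed (hypothetical) usage:
'   cscript /nologo process.vbs <input-file> <off1> <len1> <off2> <len2>
If WScript.Arguments.Count < 5 Then
    WScript.Echo "Usage: process <input-file> <off1> <len1> <off2> <len2>"
    WScript.Quit 1
End If

' Command-line offsets are zero-based; Mid() is one-based.
pos1 = CLng(WScript.Arguments.Item(1)) + 1
len1 = CLng(WScript.Arguments.Item(2))
pos2 = CLng(WScript.Arguments.Item(3)) + 1
len2 = CLng(WScript.Arguments.Item(4))

Set objFs = CreateObject("Scripting.FileSystemObject")
Set objFile = objFs.OpenTextFile(WScript.Arguments.Item(0))
Do Until objFile.AtEndOfStream
    strLine = objFile.ReadLine
    value1 = NormalizeSign(Trim(Mid(strLine, pos1, len1)))
    value2 = Trim(Mid(strLine, pos2, len2))
    WScript.Echo value1 & "," & value2
Loop
objFile.Close

' Move a leading or trailing "-" to the front; drop a "+" entirely.
Function NormalizeSign(strValue)
    Dim strDigits, blnNegative
    strDigits = strValue
    blnNegative = False
    If Right(strDigits, 1) = "-" Or Right(strDigits, 1) = "+" Then
        blnNegative = (Right(strDigits, 1) = "-")
        strDigits = Left(strDigits, Len(strDigits) - 1)
    ElseIf Left(strDigits, 1) = "-" Or Left(strDigits, 1) = "+" Then
        blnNegative = (Left(strDigits, 1) = "-")
        strDigits = Mid(strDigits, 2)
    End If
    strDigits = Trim(strDigits)
    If blnNegative Then
        NormalizeSign = "-" & strDigits
    Else
        NormalizeSign = strDigits
    End If
End Function
```

For the layout in the question, the call would be something like cscript /nologo process.vbs input.txt 7 10 22 4 > newfile. Unlike the hard-coded version, this sketch also trims any spaces left between a leading sign and the digits. That is an extra assumption about the data, so drop that Trim if the sign is always adjacent to the number.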

ghostdog74
Unfortunately, not my management, this was for a friend. But vbscript is a good point, they're using XP. Perhaps I should clarify that.
paxdiablo
Thanks, @ghostdog, much obliged. I've added the code I'm sending through since it handles the other requirements as well. +1 and accept. Cheers.
paxdiablo