Every month, 4 or 5 text files are created. The data in the files is pulled into MS Access and used in a mailmerge.

Each file contains a header. This is an example: `HEADER|0000000130|0000527350|0000171250|0000058000|0000756600|0000814753|0000819455|100106`

The 2nd field is the number of records contained in the file (excluding the header line). The last field is the date in the form yymmdd.

Using gawk (for Windows), I've done ok with rearranging/modifying the data and writing it all out to a new file for importing into Access except for the following.

I'm trying to create a unique ID number for each record. The ID number has the form 1mmddyyXXXX, where XXXX is a number, padded with leading zeros. Using the header above, the first record in the output file would get the ID number 10106100001 and the last record would get the ID 10106100130.

I've tried putting the second field in the header into a variable, rearranging the last header field into the required date format, and then looping with "for" statements to append the XXXX part of the ID before outputting it all with printf, but so far I've been complete rubbish at it.

Thanks for your help! gary

+2  A: 

Invoke awk(1) with the option `-F "|"` (quoted, so the command line does not treat the bar as a pipe) and use the following statement to set the identifier: `id=sprintf("1%02d%02d%02d%04d", substr($9,3,2), substr($9,5,2), substr($9,1,2), NR)`
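
A minimal sketch of how that statement could fit into a complete program, assuming each file has exactly one header line followed by the data records. It reads the date from the header once, since the data lines do not carry it, and numbers records with `NR - 1` so the header is not counted. The function name `make_ids` and the choice to write the ID as a new first field are assumptions for illustration, not something from the original files.

```sh
# Sketch: prefix every data record with an ID of the form 1mmddyyXXXX.
# make_ids is a hypothetical name; the output layout is an assumption.
make_ids() {
    awk -F'|' '
        # Drop a trailing carriage return in case the file has Windows line endings.
        { sub(/\r$/, "") }

        # Header line: the last field holds the date as yymmdd.
        NR == 1 {
            yymmdd = $NF
            prefix = "1" substr(yymmdd, 3, 2) substr(yymmdd, 5, 2) substr(yymmdd, 1, 2)
            next    # do not copy the header to the output
        }

        # Data lines: NR - 1 leaves the header out of the count.
        { printf "%s%04d|%s\n", prefix, NR - 1, $0 }
    ' "$@"
}
```

With the example header from the question, the first data record would come out prefixed with 10106100001. No "for" loop is needed, because awk already runs the last rule once per input line. On Windows, where cmd.exe does not handle single quotes, one option is to save the part between the single quotes to a file (say `ids.awk`, a made-up name) and run `gawk -F "|" -f ids.awk input.txt > output.txt`.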

Steve Emmerson
`NR - 1` since you're not counting the header record.
Dennis Williamson
Thank you both! As usual, I was making the solution out to be more difficult than it was.
Gary