ansaurus

Question

Optimal way to replace characters in large string with Python?

Answer 1

+2 A:

You may find the answers here helpful.

http://stackoverflow.com/questions/1175540/iterative-find-replace-from-a-list-of-tuples-in-python/1175547

Unknown 2009-08-07 00:18:55

Thanks. I don't think this is exactly what I'm looking for, but it's good to know nonetheless.

Sean O'Hollaren 2009-08-07 00:36:30

Answer 2

+1 A:

Here's an untested outline. Basically, you process the file one line at a time:

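# infile and outfile are assumed to be file objects that are already open
for line in infile:
    data = line.rstrip("\n").split("|")
    kind = data[0]  # the first field identifies the line type
    # start of changes
    if kind == "OBR":
        data[7] += "0000"  # check that 7 is correct!
    # end of changes
    outrecord = "|".join(data)
    outfile.write(outrecord + "\n")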

The above assumes that you are selecting fix-up targets by line-type (example: "OBR") and column index (example: 7). If there are only a few such targets, just add more similar fix statements. If there are many targets, you could specify them like this:

fix_targets = {
    "OBR": [7],
    "XYZ": [1, 42],
    }

and the fix_up code would look like this:

if kind in fix_targets:
    for col_index in fix_targets[kind]:
        data[col_index] += "0000"
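
Putting the pieces above together, a minimal end-to-end sketch might look like the following. The function name fix_records, the suffix parameter, and the sample records are made up for illustration, and the column indexes are only the example values from above, so check them against real data. The bounds check is an extra assumption so that short records pass through unchanged.

```python
import io

# Line types mapped to the column indexes to fix (example values only).
FIX_TARGETS = {
    "OBR": [7],
}

def fix_records(infile, outfile, fix_targets=FIX_TARGETS, suffix="0000"):
    """Copy infile to outfile, appending suffix to the targeted columns."""
    for line in infile:
        data = line.rstrip("\n").split("|")
        kind = data[0]
        for col_index in fix_targets.get(kind, []):
            # Leave records that are too short untouched.
            if col_index < len(data):
                data[col_index] += suffix
        outfile.write("|".join(data) + "\n")

# Made-up sample records, for illustration only.
sample = io.StringIO("MSH|a|b\nOBR|1|2|3|4|5|6|20090807|x\n")
result = io.StringIO()
fix_records(sample, result)
print(result.getvalue())
```

With real files you would pass objects returned by open() instead of the io.StringIO stand-ins.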

In any case, you may want to add code that checks that data[col_index] really is a date in YYYYMMDD format before changing it.
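
One possible way to do that check is sketched below. The helper name is_yyyymmdd is hypothetical, and the sketch assumes Python 3.7 or later (for str.isascii) and that only real calendar dates should pass.

```python
from datetime import datetime

def is_yyyymmdd(value):
    """Return True if value is exactly 8 ASCII digits forming a valid date."""
    # strptime alone would also accept shorter strings such as "2009087",
    # so the length and digit checks come first.
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True
```

The fix-up line would then become something like: if is_yyyymmdd(data[col_index]): data[col_index] += "0000".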

None of the above addresses removing the unwanted headers, because you didn't show example data. I guess that applying your replacements to each line (and avoiding writing the line if it became only whitespace after replacements) would do the trick.
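
As a rough sketch of that idea, assuming the unwanted headers are fixed strings: the entries in UNWANTED and the function name strip_unwanted are invented here, since the real header text isn't shown in the question.

```python
import io

# Hypothetical header strings; the real ones are not shown in the question.
UNWANTED = ["REPORT HEADER", "Page 1"]

def strip_unwanted(infile, outfile, unwanted=UNWANTED):
    """Remove each unwanted string and drop lines left with only whitespace."""
    for line in infile:
        for text in unwanted:
            line = line.replace(text, "")
        # Write the line only if something other than whitespace remains.
        if line.strip():
            outfile.write(line)

# Made-up sample input, for illustration only.
source = io.StringIO("REPORT HEADER\nOBR|1|2\n   Page 1  \nXYZ|3\n")
cleaned = io.StringIO()
strip_unwanted(source, cleaned)
print(cleaned.getvalue())
```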

John Machin 2009-08-07 00:23:53

Perfect. I figured it might be something like this, but I couldn't really visualize it. Thanks for the answer.

Sean O'Hollaren 2009-08-07 00:37:20

Good. outfile.write(outrecord + '\n') has a built-in Python equivalent: print >> outfile, outrecord, which is simpler.

EOL 2009-08-07 08:02:35

@EOL - except that the "print chevron" syntax, which is not often used, is also deprecated.

JimB 2009-08-07 13:35:09
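
For readers on Python 3, where the print statement and its chevron form no longer exist, the equivalent is the print() function with its file argument. A small sketch, using an io.StringIO object as a stand-in for a real open file and a made-up record:

```python
import io

outfile = io.StringIO()  # stand-in for a real open file
outrecord = "OBR|1|2"    # made-up record, for illustration only

# Python 3 counterpart of the Python 2 statement: print >> outfile, outrecord
print(outrecord, file=outfile)
```

As with the chevron form, print() appends the newline itself.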
