ansaurus

Question

python: find and replace numbers < 1 in text file

Answer 1

+4 A:

typical technique would be:

read file line by line
split each line into a list of strings
convert each string to the float
compare converted value with 1
replace when needed
write back to the new file

As I don't see you having any code yet, I hope that this would be a good start

SilentGhost 2010-04-21 17:14:20

thanks, i was aware they would have to be converted to floats, but didn't know about splitting lines.

hjp 2010-04-22 16:42:51

Answer 2

+3 A:

def float_filter(input):
    for number in input.split():
        if float(number) < 1.0:
            yield "0.100000E+01"
        else:
            yield number

input = "0.259515E+03 0.235095E+03 0.208262E+03 0.230223E+03 0.267333E+03 0.217889E+03 0.156233E+03 0.144876E+03 0.136187E+03 0.137865E+00"
print " ".join(float_filter(input))

ephes 2010-04-21 17:19:57

Answer 3

+4 A:

I think when you are beginning programming, it's useful to see some examples; and I assume you've tried this problem on your own first!

Here is a break-down of how you could approach this:

contents='0.259515E+03 0.235095E+03 0.208262E+03 0.230223E+03 0.267333E+03 0.217889E+03 0.156233E+03 0.144876E+03 0.136187E+03 0.137865E+00'

The split method works on strings. It returns a list of strings. By default, it splits on whitespace:

string_numbers=contents.split()
print(string_numbers)
# ['0.259515E+03', '0.235095E+03', '0.208262E+03', '0.230223E+03', '0.267333E+03', '0.217889E+03', '0.156233E+03', '0.144876E+03', '0.136187E+03', '0.137865E+00']

The map command applies its first argument (the function float) to each of the elements of its second argument (the list string_numbers). The float function converts each string into a floating-point object.

float_numbers=map(float,string_numbers)
print(float_numbers)
# [259.51499999999999, 235.095, 208.262, 230.22300000000001, 267.33300000000003, 217.88900000000001, 156.233, 144.876, 136.18700000000001, 0.13786499999999999]

You can use a list comprehension to process the list, converting numbers less than 1 into the number 1. The conditional expression (1 if num<1 else num) equals 1 when num is less than 1, otherwise, it equals num.

processed_numbers=[(1 if num<1 else num) for num in float_numbers]
print(processed_numbers)
# [259.51499999999999, 235.095, 208.262, 230.22300000000001, 267.33300000000003, 217.88900000000001, 156.233, 144.876, 136.18700000000001, 1]

This is the same thing, all in one line:

processed_numbers=[(1 if num<1 else num) for num in map(float,contents.split())]

To generate a string out of the elements of processed_numbers, you could use the str.join method:

comma_separated_string=', '.join(map(str,processed_numbers))

unutbu 2010-04-21 17:22:12

this is great. i did spend ages today wondering why the conditional expression wasn't working, then realised i had to upgrade from 2.4

hjp 2010-04-22 16:44:21

Answer 4

A:

You could use regular expressions for parsing the string. I'm assuming here that the mantissa is never larger than 1 (ie, begins with 0). This means that for the number to be less than 1, the exponent must be either 0 or negative. The following regular expression matches '0', '.', unlimited number of decimal digits (at least 1), 'E' and either '+00' or '-' and two decimal digits.

0\.\d+E(-\d\d|\+00)

Assuming that you have the file read into variable 'text', you can use the regexp with the following python code:

result = re.sub(r"0\.\d*E(-\d\d|\+00)", "0.100000E+01", text)

Edit: Just realized that the description doesn't limit the valid range of input numbers to positive numbers. Negative numbers can be matched with the following regexp:

-0\.\d+E[-+]\d\d

This can be alternated with the first one using the (pattern1|pattern2) syntax which results in the following Python code:

result = re.sub(r"(0\.\d+E(-\d\d|\+00)|-0\.\d+E[-+]\d\d)", "0.100000E+00", subject)

Also if there's a chance that the exponent goes past 99, the regexp can be further modified by adding a '+' sign after the '\d\d' patterns. This allows matching digits ending in two OR MORE digits.

Mikko Rantanen 2010-04-21 17:22:39

Answer 5

A:

I've got the script working as I want now...thanks people. When writing the list to a new file I used the replace method to get rid of the brackets and commas - is there a simpler way?

ftext = open("C:\\Users\\hhp06\\Desktop\\out.grd", "r")
otext = open("C:\\Users\\hhp06\\Desktop\\out2.grd", "w+")

for line in ftext:
    stringnum = line.split()
    floatnum = map(float, stringnum)
    procnum = [(1.0 if num<1 else num) for num in floatnum]
    stringproc = str(procnum)
    s = (stringproc).replace(",", " ").replace("[", "  ").replace("]", "")
    otext.writelines(s + "\n")
otext.close()

hjp 2010-04-22 16:49:50

Hi hjp. It's better to add a comment to ask a person directly, or post a new question. That way more people will notice it. But, to answer your question, you definitely don't want to be using `replace` for this task. Try `s=', '.join(map(str,procnum))`. See http://docs.python.org/library/stdtypes.html#str.join

unutbu 2010-04-22 21:01:13

Answer 6

+2 A:

import numpy as np

a = np.genfromtxt('file.txt')  # read file
a[a<1] = 0.1                   # replace
np.savetxt('converted.txt', a) # save to file

J.F. Sebastian 2010-04-22 17:58:02

ansaurus

tags:

views:

answers:

python: find and replace numbers < 1 in text file

related questions