Please excuse the length of this reply.
If I understand you correctly, you want to do simple syntax
checking of parentheses to make sure they are balanced correctly.
In your question you specify a test based on simple counting, but
as others have pointed out, this does not catch things like
"([)]".
I'd also like to offer some constructive criticism on other
aspects of your code.
To start with, it is better to get the filename from the command
line, and not to prompt for it. This is so that you can easily
run the program repeatedly, when developing it, without having
to type in the filename all the time. This is your way:
$ python foo.py
Give a file name:data
[some output]
$ python foo.py
Give a file name:data
[some output]
$ python foo.py
Give a file name:data
[some output]
You need to type in the filename every time. You don't need to type
in the command to run the program more than once. After the first
time, you can use the arrow key to get it from the shell's command
history. If you get the filename from the command line, you can do
this instead:
$ python foo.py testfile
[some output]
$ python foo.py testfile
[some output]
$ python foo.py testfile
[some output]
This way, when you test the second time, you don't need to type more
than the up arrow and Enter keys. This is a small convenience, but
it is important: when you're developing software, even small
things can start annoying. It's like a large grain of sand under your
foot when going on a long walk: you won't even notice it for the first
couple of kilometers, but after a few more, you're bleeding.
In Python, to access the command line arguments, you need the
sys.argv
list. The relevant changes to your program:
import sys
filename = sys.argv[1]
If you do want to prompt, you should use something else than the
built-in input
function. It interprets whatever the user types
as a Python expression, and that causes all sorts of problems.
You could read using sys.stdin.readline
.
Anyway, we've now got the name of the file safely stored in
the filename
variable. It's time to do something with it.
Your parentheses
function does pretty much everything, and
experience has shown that that's often not the best way of
doing things. Every function should, instead, do just one
thing, but do that well.
I suggest that you should keep the parts of actually opening
and closing a file separate from the actual counting. This
will simplify the logic of the counting, since it does not
need to worry about the rest. In code:
import sys
def check_parentheses(f):
pass # we'll come to this later
def main():
filename = sys.argv[1]
try:
f = file(filename)
except IOError:
sys.stderr.write('Error: Cannot open file %s' % filename)
sys.exit(1)
check_parentheses(f)
f.close()
main()
I changed a couple of other things, too, in addition to rearranging
things. First, I write the error message to the standard error output.
This is the proper way to do it, and means fewer surprises to shell
users to redirect the output. (If that doesn't make any sense to you,
don't worry about it, just accept it as a given for now.)
Second, if there's an error, I exit the program with sys.exit(1)
.
This tells whoever started the program that it failed. In Unix
shell, this lets you do things like the following:
if python foo.py inputfile
then
echo "inputfile is OK!"
else
echo "inputfile is BAD!"
fi
The shell script might do something more interesting than just reporting
success or failure, of course. It might, for example, remove all
broken files, or e-mail whoever wrote them to ask them to fix them.
The beauty is that you, who write the checker program, do not need
to care. You just set the program exit code properly, and let whoever
writes the shell script to worry about the rest.
The next step is to actually read the contents of the file. This
can be done in various ways. The easiest way is to do it line-by-line,
like this:
for line in f:
# do something with the line
We then need to look at each character in the line:
for line in f:
for c in line:
# do something with the character
We're now ready to actually start checking parentheses. As suggested
by others, a stack is the appropriate data structure for this.
A stack is basically a list (or array) where you add items to one
end, and take them out in reverse order. Think of it as a stack of
coins: you can add a coin to the top, and you can remove the topmost
coin, but you can't remove one from the middle or bottom.
(Well, you can, and it's a neat trick if you do, but computers are
simple beasts and get upset by magic tricks.)
We will use a Python list as a stack. To add an item, we use
the list's append
method, and to remove we use the pop
method.
An example:
stack = list()
stack.append('(')
stack.append('[')
stack.pop() # this will return '['
stack.pop() # this will return '('
To look at the topmost item in the stack, we use stack[-1]
(in
other words, the last item in the list).
We use the stack as follows: when we find an opening parentheses ('('),
bracket ('['), or brace ('{'), we put it on the stack. When we
find a closing one, we check the topmost item on the stack, and
make sure that it matches the closing one. If not, we print an
error. Like this:
def check_parentheses(f):
stack = list()
for line in f:
for c in line:
if c == '(' or c == '[' or c == '{':
stack.append(c)
elif c == ')':
if stack[-1] != '(':
print 'Error: unmatched )'
else:
stack.pop()
elif c == ']':
if stack[-1] != '[':
print 'Error: unmatched ]'
else:
stack.pop()
elif c == '}':
if stack[-1] != '{':
print 'Error: unmatched }'
else:
stack.pop()
This now does find unmatched parentheses of various kinds. We
can improve it a little bit by reporting the line and column
where we find the problem. We need a line number and column number
counter.
def error(c, line_number, column_number):
print 'Error: unmatched', c, 'line', line_number, 'column', column_number
def check_parentheses(f):
stack = list()
line_number = 0
for line in f:
line_number = line_number + 1
column_number = 0
for c in line:
column_number = column_number + 1
if c == '(' or c == '[' or c == '{':
stack.append(c)
elif c == ')':
if stack[-1] != '(':
error(')', line_number, column_number)
else:
stack.pop()
elif c == ']':
if stack[-1] != '[':
error(']', line_number, column_number)
else:
stack.pop()
elif c == '}':
if stack[-1] != '{':
error('}', line_number, column_number)
else:
stack.pop()
Note also how I added a helper function, error
, to do the actual
printing of the error message. If you want to change the error message,
you now only need to do it in one place.
Another thing to notice is that the cases for handling the closing
symbols are all very similar. We could make that to a function, too.
def check(stack, wanted, c, line_number, column_number):
if stack[-1] != wanted:
error(c, line_number, column_number)
else:
stack.pop()
def check_parentheses(f):
stack = list()
line_number = 0
for line in f:
line_number = line_number + 1
column_number = 0
for c in line:
column_number = column_number + 1
if c == '(' or c == '[' or c == '{':
stack.append(c)
elif c == ')':
check(stack, '(', ')', line_number, column_number)
elif c == ']':
check(stack, '[', ']', line_number, column_number)
elif c == '}':
check(stack, '{', '}', line_number, column_number)
The program can be refined further, but this should suffice for now.
I'll include the whole code at the end.
Note that this program only cares about parentheses of various kinds.
If you really want to check a whole Python program for syntactic
correctness, you'll need to parse all of Python's syntax, and that
is pretty complicated, and too much for one Stack Overflow answer.
If that's what you really do want, please ask a followup question.
The whole program:
import sys
def error(c, line_number, column_number):
print 'Error: unmatched', c, 'line', line_number, 'column', column_number
def check(stack, wanted, c, line_number, column_number):
if stack[-1] != wanted:
error(c, line_number, column_number)
else:
stack.pop()
def check_parentheses(f):
stack = list()
line_number = 0
for line in f:
line_number = line_number + 1
column_number = 0
for c in line:
column_number = column_number + 1
if c == '(' or c == '[' or c == '{':
stack.append(c)
elif c == ')':
check(stack, '(', ')', line_number, column_number)
elif c == ']':
check(stack, '[', ']', line_number, column_number)
elif c == '}':
check(stack, '{', '}', line_number, column_number)
def main():
filename = sys.argv[1]
try:
f = file(filename)
except IOError:
sys.stderr.write('Error: Cannot open file %s' % filename)
sys.exit(1)
check_parentheses(f)
f.close()
main()