ansaurus

Question

Search backward through a string using a regex (in Python)?

Answer 1

A:

You can do look-behind assertions with (?<=...) or (?<!...), but in general you can only match forwards.

Ignacio Vazquez-Abrams 2010-03-19 21:55:40

In .NET you could do a lookahead for the function followed by a lookbehind for the comment. Unfortunately, in Python lookbehinds can only match fixed-length strings.
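A minimal sketch of the fixed-length restriction, assuming Python's standard `re` module; the sample string and variable names are made up for illustration:

```python
import re

# Fixed-width lookbehind ("*/" followed by one newline) is accepted.
fixed = re.compile(r"(?<=\*/\n)void\s+\w+")
sample = "/** doc */\nvoid function()\n"
fixed_match = fixed.search(sample)
print(fixed_match.group(0))

# A lookbehind whose width can vary (\s*) is rejected at compile time.
try:
    re.compile(r"(?<=\*/\s*)void")
    variable_width_error = None
except re.error as exc:
    variable_width_error = str(exc)
print(variable_width_error)
```

This is why a pure lookbehind for a whole comment of arbitrary length is not an option with `re`.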

Alan Moore 2010-03-19 23:36:37

Answer 2

A:

The real question is why these comments are not inside the function, where you could read them through `__doc__`.

But there is no easy way with regex.

evilpie 2010-03-19 21:59:18

He might be creating a Python app to read Doxygen comments in C or something.

Carson Myers 2010-03-20 06:11:58

Answer 3

+2 A:

The simplest way is to just use a group; you don't need to go backwards:

 (commentRegex)functionRegex

Then just extract group 1. You will need to run in multi-line mode to get it working. I don't know Python, so I can't be more helpful.

It's also possible with lookahead assertions, but this way is simpler.
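In Python the group approach might look like the sketch below. The two sub-patterns are hypothetical stand-ins for commentRegex and functionRegex, and the sample text is invented. Because the comment pattern here avoids `.`, no flag is needed; a pattern that relies on `.` to cross line breaks would need `re.DOTALL` (Python's `re.MULTILINE` only affects `^` and `$`).

```python
import re

text = """
/**
    @doxygen comment
*/
void function()
{
}
"""

# Hypothetical stand-ins for commentRegex and functionRegex.
comment_regex = r"/\*\*(?:[^*]|\*(?!/))*\*/"
function_regex = r"\s*\w+\s+function\s*\("

# Group 1 wraps the comment; the function part only anchors the match.
m = re.search("(" + comment_regex + ")" + function_regex, text)
comment = m.group(1) if m else None
print(comment)
```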

Paul Creasey 2010-03-19 21:59:44

Answer 4

+2 A:

I think you should use a regex that only matches the Doxygen documentation that's immediately before the function. Maybe something like this (simplified example):

import re

test = """

/**
    @doxygen comment
*/
void function()
{
}

"""

doxygenRegex = r"(?P<comment>/\*\*(?:[^/]|/(?!\*\*))*\*/)"
functionRegex = r"(?P<function>\s\w+\s+(?P<functionName>\w+)\s*\()"

match = re.search(doxygenRegex + functionRegex, test)
print(match.groupdict())

As long as this matches something, you can keep looping, starting each new search at test[match.end():]. Hope that makes sense to you...

BTW if you only want to extract the comment and nothing about the function, you can use lookahead - just replace functionRegex with r"(?=\s\w+\s+\w+\s*\()".
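A sketch of that loop, assuming `re.finditer` is an acceptable substitute for slicing at `match.end()` by hand (it resumes after each match in the same way). The two-function sample and the `\s+` prefix on the function pattern are assumptions for illustration:

```python
import re

source = """
/** first */
void alpha()
{
}

/** second */
int beta(int x)
{
}
"""

doxygen_regex = r"(?P<comment>/\*\*(?:[^/]|/(?!\*\*))*\*/)"
function_regex = r"(?P<function>\s+\w+\s+(?P<functionName>\w+)\s*\()"

# Every comment that is directly followed by a function header.
pattern = re.compile(doxygen_regex + function_regex)
found = [(m.group("functionName"), m.group("comment"))
         for m in pattern.finditer(source)]
print(found)

# Lookahead variant: the function is required but not part of the match.
lookahead = r"(?=\s+\w+\s+\w+\s*\()"
comments_only = [m.group("comment")
                 for m in re.finditer(doxygen_regex + lookahead, source)]
print(comments_only)
```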

AndiDog 2010-03-19 22:05:03

...the trick being to make sure the "comment" regex can't match more than one comment at a time. (You forgot to mention that, 'Dog.) BTW, shouldn't the "function" regex start with `\s+` or `\s*`?

Alan Moore 2010-03-19 23:53:32

Yes, it will only match the very last comment before a function. And it could be `\s+`, right. As said, it's a simplified example.

AndiDog 2010-03-20 10:04:37

Answer 5

+1 A:

Note that C isn't a regular language, so it cannot be parsed by regular expressions. Have you considered leveraging doxygen itself to parse this file?
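One small illustration of the limitation, as a sketch with an invented C snippet: a regex has no notion of string literals, so text that merely looks like a Doxygen comment is matched anyway.

```python
import re

# Hypothetical C line: the string literal only looks like a Doxygen comment.
c_source = 'const char *banner = "/** not a comment */";\n'

comment_regex = re.compile(r"/\*\*(?:[^*]|\*(?!/))*\*/")
false_positive = comment_regex.search(c_source)
print(false_positive.group(0))
```

A real parser would tokenize the string literal first and never report this as a comment.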

Mike Graham 2010-03-20 01:23:21

Answer 6

A:

Here's a non-regex approach: split on */ and check whether the function you are looking for is in the next item. For example:

test = """

/**
    @doxygen comment
*/
void function()
{
}

"""

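# Each piece ends where a comment ends; the text after it is the next piece.
t = test.split("*/")
for n, comm in enumerate(t):
    try:
        # If the following piece contains the function, this piece holds its comment.
        if "void" in t[n+1]:
            print(t[n])
    except IndexError:
        pass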

ghostdog74 2010-03-20 02:05:29

Answer 7

+1 A:

This can be achieved using a single regex.

The key is to capture only the comment just before the desired function. The obvious way is a non-greedy quantifier, for example /\*\*(.*?)\*/. In Python, however, `.` does not match newlines unless re.DOTALL is set (re.MULTILINE only changes how ^ and $ behave), and even with re.DOTALL the non-greedy match can stretch across several comments until the named function follows. So you need a little trick like this:

/\*\*((?:[^\*]|\*(?!/))*)\*/

This is to match:

1: the comment start /**.

2: everything that is not *, or a * that is not followed by /.

3: the comment end */.
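The difference between the two patterns can be checked side by side. This is a sketch only: the `target` pattern and variable names are assumptions, and the sample text mirrors the two-function layout used in this answer.

```python
import re

text = """
/**
    @doxygen comment1
*/
void test::function1()
{
}

/**
    @doxygen comment2
*/
void test::function2()
{
}
"""

target = r"\s*void\s+test::function2\s*\("

# Without re.DOTALL, "." stops at newlines, so a multi-line comment never matches.
no_dotall = re.search(r"/\*\*(.*?)\*/", text, re.MULTILINE)

# With re.DOTALL the lazy match starts at the FIRST /** and grows until function2 follows.
lazy = re.search(r"/\*\*(.*?)\*/" + target, text, re.DOTALL)
lazy_spans_both = "comment1" in lazy.group(1)

# The tempered pattern cannot step over a */, so it captures only the last comment.
tempered = re.search(r"/\*\*((?:[^*]|\*(?!/))*)\*/" + target, text)
tempered_body = tempered.group(1)

print(no_dotall, lazy_spans_both, repr(tempered_body))
```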

From this idea the code you want is:

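import re

function_name  = "function2"
# Comment body: any non-* character, or a * that is not followed by /.
regex_comment  = r"/\*\*((?:[^\*]|\*(?!/))*)\*/"
# Optional "Class::" qualifier in front of the function name.
regex_static   = r"(?:(\w+)\s*::\s*)?"
regex_function = r"(\w+)\s+" + regex_static + "(?:" + re.escape(function_name) + r")\s*\([^\)]*\)"
regex = re.compile(regex_comment + r"\s*" + regex_function, re.MULTILINE)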
text  = """
/**
    @doxygen comment1
*/
void test::function1()
{
}

/**
    @doxygen comment2
*/
void test::function2()
{
}
"""
match = regex.search(text)
if match is None:
    print("None")
else:
    print(match.group(1))

When run, this prints:


    @doxygen comment2

Variation: If you want to capture /** and */ too, use regex_comment = r"(/\*\*(?:[^\*]|\*(?!/))*\*/)".

Hope this helps.

NawaMan 2010-03-20 08:32:02
