Hi

Let's say I have three XML strings:

String logToSearch = "<abc><number>123456789012</number></abc>";

String logToSearch2 = "<abc><number xsi:type=\"soapenc:string\" /></abc>";

String logToSearch3 = "<abc><number /></abc>";

I need a pattern that finds the number tag only if the tag contains a value, i.e. a match should be found only in logToSearch.

I'm not looking for the number itself; I just need matcher.find() to return true only for the first string.

For now I have this: Pattern pattern = Pattern.compile("<(" + patternString + ").*?>", Pattern.CASE_INSENSITIVE); where patternString is simply "number". I tried changing it to "<(" + patternString + ")[^/>].*?>", but it didn't work because [^/>] is a character class that matches a single character that is neither / nor >, rather than excluding the "/>" sequence.

Thanks

+1  A: 

This is absolutely the wrong way to parse XML. In fact, if you need more than just the basic example given here, there's provably no way to solve the more complex cases with regex.

Use a simple XML parser such as XOM, then use XPath to query for the elements and filter out those without data. I can only imagine that this question is a precursor to future headaches unless you change your approach right now.
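To make the XPath suggestion concrete, here is a minimal sketch. It uses the JDK's built-in DOM parser and javax.xml.xpath rather than XOM, and the class and method names are made up for illustration. It assumes each log fragment is well-formed XML on its own, and it relies on the default parser not being namespace-aware, since the xsi: prefix in the second string is never declared.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NumberXPathCheck {

    // Hypothetical helper: true if the fragment has a <number> element with non-blank content.
    public static boolean hasNonEmptyNumber(String xml) {
        try {
            // The default factory is not namespace-aware, so the undeclared
            // "xsi:" prefix is treated as part of a plain attribute name.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Select <number> elements whose text is not empty after trimming,
            // and convert "any such node exists" to a boolean.
            return (Boolean) XPathFactory.newInstance().newXPath().evaluate(
                    "boolean(//number[normalize-space(.) != ''])",
                    doc,
                    XPathConstants.BOOLEAN);
        } catch (Exception e) {
            // Assumption: malformed fragments are reported to the caller as bad input.
            throw new IllegalArgumentException("Could not parse XML fragment", e);
        }
    }
}
```

Calling hasNonEmptyNumber on the three strings from the question should return true only for the first one. This only helps if the logged fragments really are parseable XML; for arbitrary log lines it will throw.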

Stefan Kendall
No worries, I'm not trying to parse the XML file; I just need to remove credit card numbers from my log files. To do that I need to search for the tags where the number is stored. I cannot search for the numbers themselves because there are other important numbers in the log file that may look like a card number (for example, a session ID).
lp3
A: 

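A search for "<number[^/>]*>" finds only the opening tag: because [^/>]* cannot match a slash, self-closing tags such as <number /> are skipped. To be sure the element isn't empty, try "<number[^/>]*>[^<]" (at least one character of content) or "<number[^/>]*>[0-9]" (content starting with a digit).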
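Here is a rough sketch of how that could look in Java, run against the three strings from the question. The class name, the method names, and the masking helper are made up for illustration; the masking step is only a guess at what removing the card numbers might involve. One caveat: since [^/>]* stops at any slash, an opening tag whose attribute value contains a / (a URL, for instance) would not match.

```java
import java.util.regex.Pattern;

public class NumberTagFinder {

    // Opening <number ...> tag that is not self-closing,
    // followed by at least one character that does not start another tag.
    private static final Pattern NON_EMPTY_NUMBER =
            Pattern.compile("<number[^/>]*>[^<]", Pattern.CASE_INSENSITIVE);

    // Same opening tag (captured as group 1), followed by a run of digits.
    private static final Pattern NUMBER_DIGITS =
            Pattern.compile("(<number[^/>]*>)[0-9]+", Pattern.CASE_INSENSITIVE);

    // True only when a number tag with some content is present.
    public static boolean hasNumberValue(String log) {
        return NON_EMPTY_NUMBER.matcher(log).find();
    }

    // Hypothetical helper: keeps the opening tag and replaces the digits with ****.
    public static String maskNumbers(String log) {
        return NUMBER_DIGITS.matcher(log).replaceAll("$1****");
    }

    public static void main(String[] args) {
        String logToSearch = "<abc><number>123456789012</number></abc>";
        String logToSearch2 = "<abc><number xsi:type=\"soapenc:string\" /></abc>";
        String logToSearch3 = "<abc><number /></abc>";

        System.out.println(hasNumberValue(logToSearch));
        System.out.println(hasNumberValue(logToSearch2));
        System.out.println(hasNumberValue(logToSearch3));
        System.out.println(maskNumbers(logToSearch));
    }
}
```

find() succeeds only for logToSearch, and maskNumbers leaves the two self-closing variants unchanged.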

Matt Kane
Hey Matt, your solution works great. Thanks a lot.
lp3
If you wouldn't mind upvoting and accepting it
Matt Kane
Sure, let me register first.
lp3
Voting up requires 15 reputation, which I do not have, but I accepted your answer. Thanks again.
lp3
Well, now you are a tiny bit closer.
Matt Kane