Hi there, I was wondering if anyone can help me with a regex.

I want to insert the following into a database:

(#text1#,#text2#,#text3#,#text4#,#text5#,#text6#, #text7#, #text8#, #text9#), (#text1#,#text2#,#text3#,#text4#,#text5#,#text6#, #text7#, #text8#, #text9#), (#text1#,#text2#,#text3#,#text4#,#text5#,#text6#, #text7#, #text8#, #text9#);

But sometimes a group will not have all 9 text fields to place into my database:

(#text1#,#text2#,#text3#,#text4#,#text5#,#text6#, #text7#, #text8#, #text9#),

(#text1#,#text2#,#text3#,#text4#,#), <<<--- the string breaks here and messes up my insert

(#text1#,#text2#,#text3#,#text4#,#text5#,#text6#, #text7#, #text8#, #text9#);

Is there a way to write a regex that deletes a line if it does not have a start tag ( and an end tag )? ** edit ** The lines will always have the start tag (# and always have a closing tag #)

I have tried /^(#.?#,#.?#,#.?#,#.?#,#.?#,#.?#,#.?#,#.?#,#.*?#)$/ig

but I do not really have an idea how to do it.

Can anyone please help me...

I have created a page where you can insert a regex to see if your solution works.

+2  A: 
Tim Sylvester
Thanks Codebender :-) I figure the problem was the (m) added white space - /^\((?:#.+#,\s*){8}(?:#.+#\s*)\)[,;]$/gm < this works and deletes the whole text - hmmm, but it also deletes the error ones? var htstring29 = ''+stripped28+''; var stripped29 = htstring29.replace(/^\((?:#.+#,\s*){8}(?:#.+#\s*)\)[,;]$/gm, '');
Gerald Ferreira
This one doesn't quite work due to + (probably) being a greedy quantifier. I would try `/^\((?:#.+?#,\s*){8}(?:#.+?#\s*)\)[,;]$/gm`
Sean Nyman
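One way to check the greedy-versus-lazy suggestion is to run both patterns against a line that has too many tokens. The sketch below is only illustrative: the sample lines and the `strict` variant using `[^#]+` are assumptions added here, not something posted in the thread. Because the pattern is anchored with `^` and `$`, switching `.+` to `.+?` changes which match is tried first, but it should not change whether a whole line can match.

```javascript
// Patterns from the comments above; `strict` is an assumed variant using [^#]+.
const greedy = /^\((?:#.+#,\s*){8}(?:#.+#\s*)\)[,;]$/;
const lazy = /^\((?:#.+?#,\s*){8}(?:#.+?#\s*)\)[,;]$/;
const strict = /^\((?:#[^#]+#,\s*){8}(?:#[^#]+#\s*)\)[,;]$/;

// Hypothetical sample lines with nine and ten tokens.
const nineTokens = "(#a#,#b#,#c#,#d#,#e#,#f#,#g#,#h#,#i#),";
const tenTokens = "(#a#,#b#,#c#,#d#,#e#,#f#,#g#,#h#,#i#,#j#),";

// Print, for each pattern, whether it accepts the nine- and ten-token line.
for (const [name, re] of [["greedy", greedy], ["lazy", lazy], ["strict", strict]]) {
  console.log(name, re.test(nineTokens), re.test(tenTokens));
}
```

With these samples, both `.+` and `.+?` still accept the ten-token line, since `.` is allowed to run across a `#`. Only the negated class rejects it.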
I have created the sample on this page and cannot get it to work - can you maybe see if you can get it to work? http://online-affiliate-programs.co.za/test/index.asp
Gerald Ferreira
Your page seems to be un-escaping the regex and breaking it. Probably something to do with accepting the string as a parameter. http://codebender.net/so/1218103.html
Tim Sylvester
Hi Codebender, thank you for the time that you spent assisting me with a solution - \((?:[^#\n]*?#[^#\n]*?#[,\s]?){0,8}\)[,;]\s* <<< works 100%. I think the thing I learned most from the exercise was the www.radsoftware.com.au builder :-) Thanks 100 Million - I appreciate it!
Gerald Ferreira
No problem. Actually I generally use Expresso, but it's not free and I only have it registered on my own computer.
Tim Sylvester
Will have a look at it - thanks. Another question regarding the solution that you provided: let's say I use the pattern to delete all the groups that do not match because they are smaller, but now I also want to match any group bigger than 9 and delete it too?
Gerald Ferreira
For example, /\((?:[^#\n]*?#[^#\n]*?#[,\s]?){0,8}\)[,;]\s*/ig removes everything smaller, so I can do that in one line, and in another line I want to delete everything bigger. This should leave me with only the results that pass the test? Any ideas on how to format it? Thanks
Gerald Ferreira
Unfortunately, there's no way to check for "less or more than 9". I would do it in two passes, once with "{0,8}" and again with "{10,}" (ten or more). You could also use {9} to match all the valid lines and join them back into a new string.
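The two options mentioned in this comment can be sketched side by side in JavaScript. This is a minimal sketch under assumptions that are not from the thread: the sample data is made up, a token is taken to be `#[^#\n]+#`, tokens are separated by a comma with optional white space, and the names `TOKEN`, `SEP`, `tooShort`, `tooLong` and `exactlyNine` are hypothetical. Note that the quantifiers count the tokens after the first one, so `{0,7}` means one to eight tokens and `{9,}` means ten or more.

```javascript
// Hypothetical input: groups with 9, 4, 10 and 9 tokens.
const input = [
  "(#a1#,#a2#,#a3#,#a4#,#a5#,#a6#, #a7#, #a8#, #a9#),",
  "(#b1#,#b2#,#b3#,#b4#),",
  "(#c1#,#c2#,#c3#,#c4#,#c5#,#c6#, #c7#, #c8#, #c9#, #c10#),",
  "(#d1#,#d2#,#d3#,#d4#,#d5#,#d6#, #d7#, #d8#, #d9#);",
].join("\n");

// Assumed building blocks: one token, and the separator between tokens.
const TOKEN = "#[^#\\n]+#";
const SEP = "\\s*,\\s*";

// Option 1: two passes, deleting groups that are too short, then too long.
const tooShort = new RegExp(`\\(${TOKEN}(?:${SEP}${TOKEN}){0,7}\\)[,;]\\s*`, "g");
const tooLong = new RegExp(`\\(${TOKEN}(?:${SEP}${TOKEN}){9,}\\)[,;]\\s*`, "g");
const twoPass = input.replace(tooShort, "").replace(tooLong, "");

// Option 2: keep only groups with exactly nine tokens and join them back.
const exactlyNine = new RegExp(`\\(${TOKEN}(?:${SEP}${TOKEN}){8}\\)`, "g");
const kept = input.match(exactlyNine) || [];
const rebuilt = kept.length ? kept.join(",\n") + ";" : "";

console.log(twoPass);
console.log(rebuilt);
```

The second option rebuilds the separators itself, so the result still ends in `;` even when the last group in the input is one of the rejected ones. The two-pass deletion does not guarantee that.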
Tim Sylvester
+1  A: 

You could try it with this:

/^\(([\s]*#[^#]+#,?){9}\)[,;]$/

edit:

In Perl, if you want to remove occurrences of any pattern of your above set that has fewer than 9 #\d#'s, you can use the following:

$string =~ s/\(([\s]*#[^#]+#[\s]*,?){0,8}\)[,;]*//g;

It allows for spaces at either end of the #\d#, an optional comma separating them within the parens, and either a comma or a semicolon after the group. Your resulting $string will be the list of 9-token groups from your input string, as they appear in the original.
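Since the question uses JavaScript's `replace`, the same substitution can be sketched there as well. This is a rough port rather than a tested drop-in: the sample string and the names `shortGroup`, `sample` and `result` are made up for illustration, and the capturing group is swapped for a non-capturing one because the capture is not used.

```javascript
// Port of the Perl pattern above; (?:...) replaces the capturing group.
const shortGroup = /\((?:\s*#[^#]+#\s*,?){0,8}\)[,;]*/g;

// Hypothetical sample: the middle group has only four tokens.
const sample =
  "(#a1#,#a2#,#a3#,#a4#,#a5#,#a6#, #a7#, #a8#, #a9#),\n" +
  "(#b1#,#b2#,#b3#,#b4#),\n" +
  "(#c1#,#c2#,#c3#,#c4#,#c5#,#c6#, #c7#, #c8#, #c9#);";

// Delete every group holding eight tokens or fewer.
const result = sample.replace(shortGroup, "");
console.log(result);
```

The pattern does not consume the line break after a removed group, so an empty line is left where the short group was.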

akf
I have created an interactive version of what I am looking for at http://online-affiliate-programs.co.za/test/index.asp - can you maybe see if you can figure it out? Thanks
Gerald Ferreira
A: 
/^\((#\w+#,?\s?){9}\)$/ matches exactly 9.
Cliff
with a potentially broken 10th element due to the ,?
Greg Domjan
A: 

Deleting from the string has some complications, such as what to do with the error line if it is the last one, i.e. the one that contains the ;

line = ^\s*\(.*\)[,;]\s*$
a string token = #[\w\s]*#
a list of tokens = token(?:\s*,\s*token)*
7 or fewer additional items = {0,7}
a list of 8 or fewer tokens = token(?:\s*,\s*token){0,7}

Making

^\s*\(#[\w\s]*#(?:\s*,\s*#[\w\s]*#){0,7}\)\s*[,;]\s*$

You then want to replace these lines with nothing, globally, treating the string as multiple lines: /match/replace/gm

/^\s*\(#[\w\s]*#(?:\s*,\s*#[\w\s]*#){0,7}\)\s*[,;]\s*$//gm

If you have set your string character to # for the purpose of the insert then the token could be simplified to #[^#]+#

In your example's short line, the last token has only the one #, which I have not allowed for here so far, nor have I allowed for an entirely empty element, which might be acceptable to your SQL parser.
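The step-by-step construction above can be sketched in JavaScript by assembling the pattern from string fragments. The variable names and the sample lines are hypothetical, and the clean-up of blank lines at the end is an extra step assumed here rather than part of the answer.

```javascript
// Pieces from the answer above, written as JavaScript string fragments.
const tokenSrc = "#[\\w\\s]*#";
const shortListSrc = `${tokenSrc}(?:\\s*,\\s*${tokenSrc}){0,7}`;
const shortLineRe = new RegExp(`^\\s*\\(${shortListSrc}\\)\\s*[,;]\\s*$`, "gm");

// Hypothetical input: the middle line has eight tokens, one short of nine.
const lines = [
  "(#a1#,#a2#,#a3#,#a4#,#a5#,#a6#, #a7#, #a8#, #a9#),",
  "(#b1#,#b2#,#b3#,#b4#,#b5#,#b6#, #b7#, #b8#),",
  "(#c1#,#c2#,#c3#,#c4#,#c5#,#c6#, #c7#, #c8#, #c9#);",
];

// Replace every line holding eight tokens or fewer with nothing.
const stripped = lines.join("\n").replace(shortLineRe, "");
// The replacement leaves an empty line behind, so drop blank lines afterwards.
const cleaned = stripped.split("\n").filter((l) => l.trim() !== "").join("\n");
console.log(cleaned);
```

Keeping the token and the list as separate fragments makes it easy to swap the token for `#[^#]+#` without touching the rest of the pattern.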

Greg Domjan
Hi Greg. Thanks for the comment. Yes, what I am trying to do is delete everything that is not ('1', '2', '3', '4', '5', '6', '7', '8', '9'), where the numbers are text, URLs, etc. I have used # instead of ' so that I can delete every stray ' (as in bob's) when inserting into the database; in a later step I replace # with '. The ( is always there and the group always ends with ), but the values inside are not always the same. I figured I could first delete all the ('1'), and then the ('1','2'), up to ('1', '2', '3', '4', '5', '6', '7', '8'), until it leaves only the ('1', '2', '3', '4', '5', '6', '7', '8', '9')
Gerald Ferreira