I've been trying to construct a Ruby regex which matches trailing spaces - but not indentation placeholders - so I can gsub them out.

I had this /\b[\t ]+$/ and it was working a treat until I realised it only works when the line ends in a word character ([a-zA-Z0-9_]). :-( So I evolved it into this /(?!^[\t ]+)[\t ]+$/ and it seems to be getting better, but it still doesn't work properly. I've spent hours trying to get this to work, to no avail. Please help.

Here's some test text so it's easy to throw into Rubular, but the whitespace on the indent-only lines is getting stripped, so they'll need a few spaces and/or tabs added back. Once lines 3 & 4 have spaces back in, it shouldn't match on lines 3-5, 7, 9.

some test test  
some test test      


  some other test (text)
  some other test (text)  
  likely here{ dfdf }
  likely here{ dfdf }        
  and this ;
  and this ;  

Alternatively, is there a simpler / more elegant way to do this?

A: 

Wouldn't this help?

/([^\t ])([\t ]+)$/

You need to do something with the matched last non-space character, though.
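A minimal sketch of one way to handle the captured character, assuming the text is processed one line at a time (the sample lines are made up for illustration): the backreference `\1` puts the last non-blank character back, so only the trailing run is removed.

```ruby
# Hypothetical sample lines; the second one is an indentation placeholder.
lines = ["some test test  \n", "    \n", "  likely here{ dfdf }\t \n"]

# Group 1 is the last character that is not a tab or space; '\1' writes it
# back, so only the trailing tabs and spaces are dropped.
cleaned = lines.map { |line| line.sub(/([^\t ])([\t ]+)$/, '\1') }

puts cleaned.inspect
```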

ndim
A: 

Edit: oh, you meant non-blank lines. Then you would need something like /([^\s])\s+$/ and replace each match with the first group.

I'm not entirely sure what you are asking for, but wouldn't something like this work if you just want to capture the trailing whitespace?

([\s]+)$

or if you only wanted to capture spaces and tabs

([ \t]+)$

Since regex quantifiers are greedy by default, they'll capture as much as they can. You don't really need to give them context beforehand if you know what you want to capture.

I still am not sure what you mean by trailing indentation placeholders, so I'm sorry if I'm misunderstanding.
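A quick sketch of how the context-free pattern behaves on a whitespace-only line (the sample string is invented for illustration): with nothing required before the whitespace run, `[ \t]+$` also matches lines that consist only of spaces or tabs.

```ruby
# Hypothetical input: a trailing-space line, an indent-only line, a clean line.
text = "foo  \n    \nbar\n"

# No context before the whitespace run, so the indent-only line matches too.
stripped = text.gsub(/[ \t]+$/, '')

puts stripped.inspect
```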

Xzhsh
Sorry, by 'indentation placeholders' I mean where your code editor automatically indents to keep your blocks aligned. These lines consist entirely of white space, and thus get stripped with the standard stripping regex that comes with TextMate: `[\t ]+$`
Tim
+1  A: 

Your first expression is close, and you just need to change the \b to a negated character class. This should work better:

/([^\t ])[\t ]+$/

In plain words, this matches the trailing tabs and spaces on a line, provided they follow a character that is not a tab or a space. That character is captured, so it has to be put back in the replacement.

Mike Pelley
Unfortunately this appears to match indentation placeholders - which are lines consisting entirely of whitespace.
Tim
It does not for me, so I'm not sure why it does for you. The line must have a character that is not a space or a tab for this to match. A slightly more generic version would be /([^\s])\s+$/, in case you have whitespace that is not a space or a tab.
Mike Pelley
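One possible explanation for the differing results, offered as a sketch rather than a diagnosis of either setup: in Ruby a negated class such as `[^\t ]` also matches a newline, and `$` matches at the end of every line. Run over a whole multi-line string, the pattern can therefore treat the newline before a whitespace-only line as the "non-blank" character. The sample string below is made up.

```ruby
# Hypothetical input; the middle line is whitespace only.
text = "foo  \n    \nbar\n"

# Whole string: the "\n" ending line 1 satisfies [^\t ], so line 2 is stripped.
whole = text.gsub(/([^\t ])[\t ]+$/) { $1 }

# Line by line: a whitespace-only line has nothing before its spaces to match.
per_line = text.each_line.map { |l| l.sub(/([^\t ])[\t ]+$/) { $1 } }.join

puts whole.inspect
puts per_line.inspect
```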
Looking above, mckeed's second answer is very similar to mine, although it is more concise and has the added benefit of being ruby'ized, so I'd suggest you accept his as the answer ;o)
Mike Pelley
+4  A: 

If you're using Ruby 1.9, you can use look-behind:

/(?<=\S)[\t ]+$/

but unfortunately, it's not supported in older versions of Ruby, so you'll have to handle the captured character:

str.gsub(/(\S)[\t ]+$/) { $1 }
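As a quick check of both forms, here is a sketch run against text shaped like the question's sample (the string is an assumed reconstruction, with a whitespace-only line in the middle). Because `\S` never matches a newline, whitespace-only lines are left alone even when the whole string is processed at once.

```ruby
# Assumed sample: trailing spaces on lines 1 and 4, indent-only line 2.
str = "some test test  \n  \n  and this ;\n  and this ;  \n"

# Ruby 1.9+: look-behind asserts a non-whitespace character without consuming it.
with_lookbehind = str.gsub(/(?<=\S)[\t ]+$/, '')

# Older Ruby: capture the character and put it back in the block.
with_capture = str.gsub(/(\S)[\t ]+$/) { $1 }

puts with_lookbehind.inspect
puts with_capture == with_lookbehind
```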
mckeed
Yep, not on 1.9 yet for most projects. The second one is the ticket - thanks very much!
Tim
A: 

perhaps this...

[\t|\s]+?$

or [ ]+$

Paul