I'm in the process of updating some old CSS files in our systems, and we have a bunch that have lots of empty classes simply taking up space in the file. I'd love to learn how to write regular expressions, but I just don't get them. I'm hoping the more I expose myself to them (with a little more cohesive explanation), the more I'll end up understanding them.

The Problem

That said, I'm looking for an expression that will identify text followed by a '{' (some have spaces in between, and some do not) and if there are no letters or numbers between that bracket and '}' (spaces don't count), it will be identified as a matching string.

I suppose I can trim the whitespace out of the doc before I run a regular expression through it, but I don't want to change the basic structure of the text. I'm hoping to return it into a large <textarea>.

Bonus points for explaining the characters and their meanings, and also for an expression identifying lines in the copy without any text or numbers. I will likely use the final expression in a PHP script.

tl;dr:

Regular Expression to match:

.a_class_or #an_id {
    /* if there aren't any alphanumerics in here, 
       this should be a matching line of text */
}
+1  A: 

This matches a pair of braces with zero or more whitespace characters between { and }:

\{\s*\}

You have to escape both { and }, as they are special chars in RegEx.

\s - means any whitespace char including tab, new line etc...

\s* - any number of whitespace
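
As a rough illustration of the pattern above, here is a minimal sketch in PHP. The sample strings and variable names are made up for the example; only the pattern itself comes from this answer.

```php
<?php
// Hypothetical sample inputs (not from the question).
$samples = [
    '.empty {}',
    ".spaced {\n   \t\n}",
    '.full { color: red; }',
];

// Single quotes keep the backslashes intact for the regex engine.
$pattern = '/\{\s*\}/';

$results = [];
foreach ($samples as $css) {
    // preg_match returns 1 when the pattern is found, 0 when it is not.
    $results[] = preg_match($pattern, $css);
}

print_r($results); // the first two samples match (1, 1), the third does not (0)
```

Only the brace pair itself is matched here, not the selector in front of it.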

Draco Ater
Actually, PCRE is quite permissive about {}'s; there is no need to escape them in expressions like this, where they cannot be treated in a special way.
stereofrog
+1  A: 

Here's a good start on one. Tested in PHP.

/[#\w\s\.\-\[\]\=\^\~\:]+\{[\s\n\r\t]*\}/

Now, breaking it into parts:

This part matches the selector. # is for IDs, \w matches alphanumerics and underscores, and the period is for classes. Then I threw in [ ] = ^ and ~ for some advanced CSS selectors, but I'd be surprised if they get used very often. The hyphen covers hyphenated names, and the colon covers pseudo-classes such as :hover.

[#\w\s\.\-\[\]\=\^\~\:]+

The second part looks for curly brackets with nothing but whitespace between them. We've got special characters here for spaces, newlines, returns, and tabs (\s on its own already covers the other three). If there is anything else, it won't match.

\{[\s\n\r\t]*\}

And here is the code I used to test it in PHP, if you're interested.

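<?php
// Sample rule whose body contains only a blank line.
$myString = <<<HERE
#myDiv a.awesome[href=google.com]{

}
HERE;
// Single quotes so PHP hands the backslashes to the regex engine untouched.
$regex = '/[#\w\s\.\-\[\]\=\^\~\:]+\{[\s\n\r\t]*\}/';
// Replace the whole match (selector plus empty braces) with a marker.
echo preg_replace($regex, "replaced!", $myString);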
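
The question also asked for a way to spot lines with no letters or numbers. A minimal sketch in the same style follows. The sample text and variable names are invented for illustration, and it assumes "text or numbers" means ASCII letters and digits.

```php
<?php
// Hypothetical sample text (not from the question).
$text = "first line\n   \n--- ***\nline 4";

// The m flag makes ^ and $ anchor at line boundaries.
// [^a-zA-Z0-9\r\n]* allows anything except letters, digits, and line breaks.
$pattern = '/^[^a-zA-Z0-9\r\n]*$/m';

preg_match_all($pattern, $text, $matches);

print_r($matches[0]); // holds the whitespace-only line and the "--- ***" line
```

Excluding \r and \n inside the character class keeps each match on a single line, so one match cannot swallow several lines at once.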
Greg W
Thanks. And thanks for being informative, too.
dclowd9901
I notice it only replaces the brackets. Is there any way to get it to replace the string from the . or # forward?
dclowd9901
I noticed it does this with classes/IDs with colons in them (such as with a ':visited' or ':hover' selector):

.sf_subnavigation ul li a:replaced!

Why does it get caught on the colon?

Update: I added a \: to the expression's first part, and it worked. Thanks! I'm learning already!
dclowd9901
Aha, good catch! I was worried I might miss something little like that. I'll update the post as well in case others try to copy it.
Greg W
I added new functionality that finds comments that don't pertain to any code; that is, they are comments followed immediately by other comments:

$regex2 = '#/\*\-+[\s|\w]*-*\*/\s*(?=/)#i'

Wrote that baby from scratch ;)
dclowd9901
+1  A: 

I don't recommend regular expressions for this, since I really doubt CSS is a regular language anyway.
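
To illustrate the kind of trouble I mean, here is a small sketch. The stylesheet is a made-up example, and the pattern is the simple \{\s*\} one from the answer above. The regex has no idea that the braces sit inside a quoted string, so it removes them anyway.

```php
<?php
// Hypothetical stylesheet: the only "{ }" sits inside a quoted string value.
$css = '.note::before { content: "{ }"; }';

// Naive removal of anything that looks like an empty block.
$stripped = preg_replace('/\{\s*\}/', '', $css);

echo $stripped, PHP_EOL; // .note::before { content: ""; }
```

The rule is still valid CSS afterwards, but its content value has been silently changed, which is the sort of mistake a real parser avoids.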

Download CSS Tidy for PHP - I just did it myself in 5 minutes and tested it. Works great.

require( 'csstidy/class.csstidy.php' );

$tidy = new csstidy();

$tidy->parse( "
.test {
  font-weight: bold;
}

.empty {
}

.test2 {
  font-style: italic;
}" );

echo '<pre>';
echo $tidy->print->plain();

Here's the output

.test {
font-weight:700;
}

.test2 {
font-style:italic;
}
Peter Bailey
This'll do. Thanks for the heads-up! But I'd still like to know how to do regular expressions.
dclowd9901
Then I recommend working through some tutorials.
Peter Bailey