Hi

I am going through a large website (1600+ pages) to make it pass Priority 1 W3C WAI. As a result, things like image tags need to have alt attributes.

What would be the regular expression for finding img tags without alt attributes? If possible, with a wee explanation, so I can use it to find other issues.

I am in an office with Visual Web Developer 2008. The Edit >> Find dialogue can use regular expressions.

+1  A: 

This is really tricky, because regular expressions are mostly about matching something that is there. With look-around trickery, you can do things like 'find A that is not preceded/followed by B', etc. But I think the most pragmatic solution for you wouldn't be that.
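For reference, the look-around idea mentioned above could be sketched roughly like this. It is written for Python's `re` module, not for the Visual Web Developer Find dialogue, whose regex dialect may not accept this syntax. The names `IMG_WITHOUT_ALT` and `find_imgs_without_alt` are made up for the illustration.

```python
import re

# Negative lookahead: after "<img", refuse to match if "alt=" appears
# anywhere before the closing ">" of the same tag.
IMG_WITHOUT_ALT = re.compile(r'<img\b(?![^>]*\balt\s*=)[^>]*>', re.IGNORECASE)

def find_imgs_without_alt(html):
    """Return every <img ...> tag in html that has no alt attribute."""
    return IMG_WITHOUT_ALT.findall(html)

sample = (
    '<p><img src="a.png" alt="A">'
    '<img src="b.png" width="10" />'
    '<IMG SRC="c.png"></p>'
)
print(find_imgs_without_alt(sample))
# ['<img src="b.png" width="10" />', '<IMG SRC="c.png">']
```

This is only a sketch: a `>` inside an attribute value, or the text `alt=` inside another attribute's value, would confuse it.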

My proposal relies a little bit on your existing code not doing anything too crazy, and you might have to fine-tune it, but I think it's a good shot, if you really want to use a RegEx search for your problem.

So what I suggest is to find all img tags that contain nothing but the other valid attributes of an img element (any of which may be present or absent). Since alt is left out of that list, any tag the expression matches has no alt attribute. Whether that is an approach you can work with is for you to decide.

Proposal:

/<img\s*((src|align|border|height|hspace|ismap|longdesc|usemap|vspace|width|class|dir|lang|style|title|id)="[^"]*"\s*)*\s*\/?>/

The current limitations are:

  1. It expects your attribute values to be delimited by double quotes,
  2. It doesn't take into account possible inline on*Event attributes,
  3. It doesn't find img elements with 'illegal' attributes.
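
To try the whitelist idea outside the editor, something like the following Python sketch might help. It assumes Python's `re` module rather than the Visual Web Developer Find dialogue, and it uses `[^"]*` so that attribute values of any length match. The names `ALLOWED`, `IMG_ONLY_LISTED_ATTRS` and `imgs_matching_whitelist` are invented for this example.

```python
import re

# Attributes that may appear; alt is deliberately left out of the list.
ALLOWED = ("src|align|border|height|hspace|ismap|longdesc|usemap|vspace|"
           "width|class|dir|lang|style|title|id")

# <img, then zero or more name="value" pairs from the list, then /> or >.
IMG_ONLY_LISTED_ATTRS = re.compile(
    r'<img\s*(?:(?:' + ALLOWED + r')="[^"]*"\s*)*\s*/?>'
)

def imgs_matching_whitelist(html):
    """Return img tags made up only of listed attributes (so no alt)."""
    return [m.group(0) for m in IMG_ONLY_LISTED_ATTRS.finditer(html)]

samples = "\n".join([
    '<img src="a.png" alt="A" />',       # has alt: not reported
    '<img src="b.png" width="10" />',    # no alt: reported
    '<img src="c.png" onclick="go()">',  # no alt, but missed (limitation 2)
    "<img src='d.png'>",                 # no alt, but missed (limitation 1)
])
print(imgs_matching_whitelist(samples))
# ['<img src="b.png" width="10" />']
```

The last two samples show the limitations in practice: tags with event attributes or single-quoted values lack alt but are not reported. The pattern is also case-sensitive as written.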
Thomas
It didn't work straight off, but it's a good enough shot to vote up: I have learnt things from your answer, for example, the need to finally get to grips with regex. I also think that VS's Find could allow iterative searches. That, combined with a facility for specifying a string that ISN'T there (e.g. alt="), would make this a doddle. Oh, well.
awrigley
@awrigley: Have you been able to pinpoint what's not working? For example, I've included the `/` slashes as RegEx delimiters. That might not be necessary at all. Including all (legal) event attributes isn't hard at all, I just didn't want to do the typing. As long as your current image tags are at least trying to be valid, this should be extendable to a useful search expression.
Thomas
@Thomas: I have tried without the delimiters as well, but no joy. I think I have located most of them manually, but even introducing a fault, the results returned are 0. Can't blame you for not wanting to type.
awrigley
@Thomas: up against the deadline now, will just have to research this later.
awrigley
@awrigley: Sure, if you've gone through manually there's no point in spending too much time on this. When you have a minute or two, it'd be lovely if you could post a couple of samples that it should find (but currently doesn't). I've actually tested it against a few cases I made up, and it seemed to perform nicely, so I'm hopeful that it could be fixed to work for you. However, no pressure. If you have to get your work done, do that. For me, it's of purely academic interest. ;)
Thomas