ansaurus

Question

Finding an HTML element ID based on the text displayed.

Answer 1

A:

I'm not sure what you mean by using the "Home telephone" string, but here are a couple of ways to do this:

/id=(.*?)\s+.*(?=Home telephone)/

where the (?=...) construct is a positive lookahead, if your programming language supports it.
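To see what that pattern captures, here is a minimal Python sketch. The sample markup is hypothetical (the question's real HTML isn't shown), and it assumes the id is followed by another attribute, since the pattern needs whitespace after the id value.

```python
import re

# Hypothetical markup; the question's real HTML is not shown here.
html = '<div id="home_phone" class="label">Home telephone</div>'

# Same pattern as above: capture what follows id= up to the next whitespace,
# then require "Home telephone" later on the same line (positive lookahead).
pattern = re.compile(r'id=(.*?)\s+.*(?=Home telephone)')

match = pattern.search(html)
# The captured group still includes the surrounding quotes, so strip them.
element_id = match.group(1).strip('"') if match else None
print(element_id)
```

Note that the capture group includes the quotes around the id, and that `.` does not cross line breaks by default, so the id and the text have to sit on the same line for this to match.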

Another way is to simply grep for "Home telephone" and then grab the id value using awk or sed.

ennuikiller 2009-12-09 18:55:53

Answer 2

A:

XPath is the easiest way to retrieve values from XML and HTML documents (provided that they are well-formed).

The expression you want is this:

//div[text() = 'Home telephone']/@id

Which reads, "Find all divs whose text value is equal to 'Home telephone', and return the id attribute for everything that matches."

Depending on your language, there are typically several built-in or third-party (and free) XPath interpreters that are available.
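As one illustration, here is a rough sketch using Python's built-in xml.etree.ElementTree. The sample markup is made up, and ElementTree only implements a subset of XPath, so the expression is adapted: the predicate uses `.` rather than text(), and the id is read from each matched element because `/@id` isn't supported there.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup (not taken from the question).
page = """<html><body>
  <div id="home_phone" class="label">Home telephone</div>
  <div id="work_phone" class="label">Work telephone</div>
</body></html>"""

root = ET.fromstring(page)

# ElementTree cannot return attributes directly (no /@id), so select the
# matching div elements and read the id attribute from each one.
ids = [div.get("id") for div in root.findall(".//div[.='Home telephone']")]
print(ids)
```

A full XPath engine should be able to evaluate the original expression, including `/@id`, as written.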

It's a bad idea to parse HTML using regular expressions because HTML isn't a regular language. Regular expressions can't deal with even the simplest of HTML edge cases because regular expressions can't properly deal with nesting. HTML is an inherently nested structure.

Welbog 2009-12-09 19:23:14

Thanks for the response. I am using JavaScript to write an extension for use within Selenium, and this seems to be the best way to do what I am looking for.

2009-12-10 10:46:21

Answer 3

A:

In C#, you'd set up a regex that looked like this:

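// requires: using System.Text.RegularExpressions;

// Use \s for spaces: IgnorePatternWhitespace drops literal whitespace from the pattern.
string elementText = "Home\\stelephone"; // you can change this as needed
Regex regex = new Regex(
    "id=\"(.*?)\"\\s+.*(?=" + elementText + ")",
    RegexOptions.IgnoreCase
    | RegexOptions.CultureInvariant
    | RegexOptions.IgnorePatternWhitespace
    | RegexOptions.Compiled
);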

// Capture all Matches in the InputText
MatchCollection ms = regex.Matches(InputText);

InputText would be a string containing the contents of your HTML file. For each match in ms, Groups[1].Value holds the captured id.

ddc0660 2009-12-09 19:36:32
