I would use Python and BeautifulSoup for the job. It is very solid at handling this kind of task.
For your case, you can use SoupStrainer to make BeautifulSoup parse only the DIVs that have the class you want, so it doesn't have to build a parse tree for the whole document in memory.
For example, say your document looks like this:
<div class="test">Hello World</div>
<div class="hello">Aloha World</div>
<div>Hey There</div>
You can write this:
>>> from BeautifulSoup import BeautifulSoup, SoupStrainer
>>> doc = '''
... <div class="test">Hello World</div>
... <div class="hello">Aloha World</div>
... <div>Hey There</div>
... '''
>>> findDivs = SoupStrainer('div', {'class':'hello'})
>>> [tag for tag in BeautifulSoup(doc, parseOnlyThese=findDivs)]
[<div class="hello">Aloha World</div>]
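The session above uses the older BeautifulSoup 3 API (the `BeautifulSoup` module and the `parseOnlyThese` argument). If you are on the newer `bs4` package instead, a rough equivalent might look like the sketch below. It assumes `bs4` is installed and uses the built-in `html.parser` backend; the variable names `find_divs`, `matches`, and `texts` are just illustrative.

```python
from bs4 import BeautifulSoup, SoupStrainer

doc = '''
<div class="test">Hello World</div>
<div class="hello">Aloha World</div>
<div>Hey There</div>
'''

# Only build tree nodes for <div> tags whose class is "hello";
# everything else in the document is skipped during parsing.
find_divs = SoupStrainer('div', {'class': 'hello'})

# In bs4 the keyword is parse_only (parseOnlyThese was the BS3 name).
# Assumption: the html.parser backend, which supports parse_only.
soup = BeautifulSoup(doc, 'html.parser', parse_only=find_divs)

# Collect the surviving tags and their text content.
matches = soup.find_all('div')
texts = [tag.get_text() for tag in matches]
print(texts)
```

Note that `parse_only` is not honored by the `html5lib` backend, so stick with `html.parser` or `lxml` if you want the filtering to happen at parse time.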