Hi,

Is anyone aware of an address parser plugin for Ruby?

I might have to use one of the paid web services, but thought there might be a plugin.

Another thought is to go down the NLP route, where I could build up a database over time.

Does anybody use an NLP plugin for Ruby?

I want to use it to logically parse and sanitise something like this from the HTML:

  <address><strong>HALL (J&amp;E) LTD</strong><br />Head Office<br />
    Questor House<br />
    191 Hawley Road<br />
    Dartford<br />
    Kent <br />
    DA1 1PU</address>
    <p class="tel"><strong>Tel:</strong> +44 (0)1322 223456</p>
    <p class="fax"><strong>Fax:</strong> +44 (0)1322 291458</p>
    <p><strong>Website:</strong> <a target="_blank" href="http://www.jehall.co.uk">www.jehall.co.uk</a></p>
    <p><strong>Email:</strong> <a href="mailto&#58;helpline&#64;jehall&#46;co&#46;uk?subject=Enquiry%20from%20Defence%20Suppliers%20Directory&amp;cc=defenceenquiries&#64;armedforces&#46;co&#46;uk">helpline&#64;jehall&#46;co&#46;uk</a></p>
</div>
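
Before any geocoding or NLP step, the raw lines have to come out of the markup. A minimal sketch of that extraction, using only the Ruby standard library and assuming the address always sits in a single `<address>` element with `<br />` separators, might look like the following. The names `address_lines`, `ENTITIES` and `UK_POSTCODE` are made up for illustration, and the postcode pattern only checks the general shape, not whether the postcode exists. For messier real-world HTML a proper parser would be a safer choice than regular expressions.

```ruby
# Hypothetical helper names; only a handful of entities are decoded here.
ENTITIES = { "&amp;" => "&", "&lt;" => "<", "&gt;" => ">", "&quot;" => '"' }.freeze

# Loose UK postcode shape check (format only, says nothing about validity).
UK_POSTCODE = /\A[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}\z/i

# Return the text lines found inside the first <address> element.
def address_lines(html)
  inner = html[%r{<address>(.*?)</address>}m, 1]
  return [] unless inner

  inner
    .split(%r{<br\s*/?>}i)                                        # one chunk per <br />
    .map { |chunk| chunk.gsub(/<[^>]+>/, "") }                    # drop tags such as <strong>
    .map { |chunk| chunk.gsub(/&(?:amp|lt|gt|quot);/, ENTITIES) } # decode basic entities
    .map(&:strip)
    .reject(&:empty?)
end

html = <<~HTML
  <address><strong>HALL (J&amp;E) LTD</strong><br />Head Office<br />
    Questor House<br />
    191 Hawley Road<br />
    Dartford<br />
    Kent <br />
    DA1 1PU</address>
HTML

lines = address_lines(html)
postcode = lines.find { |line| line.match?(UK_POSTCODE) }
```

Here `lines` holds one cleaned string per address line, and `postcode` picks out the line that looks like a UK postcode, which gives a later lookup step something structured to work with.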

Any suggestions would be gladly appreciated.

Paul

+1  A: 

You might have some success with Google's geocoding service, which can return structured addresses. There are Ruby gems for interfacing with Google's Maps API.
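
As a rough illustration of what "structured addresses" means here, the sketch below calls the Geocoding API's JSON endpoint directly with the standard library rather than through a gem. The helper names (`geocode_uri`, `structured_address`, `geocode`) are invented for this example, the API key is a placeholder you would have to supply, and error handling is kept to a bare status check, so treat it as a starting point rather than a finished client.

```ruby
require "json"
require "net/http"
require "uri"

GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"

# Build the request URI; api_key is a placeholder for your own key.
def geocode_uri(address, api_key)
  uri = URI(GEOCODE_ENDPOINT)
  uri.query = URI.encode_www_form(address: address, key: api_key)
  uri
end

# Flatten the first result's address_components into { type => long_name }.
# Returns nil when the service reports anything other than "OK".
def structured_address(response_body)
  data = JSON.parse(response_body)
  return nil unless data["status"] == "OK"

  first = data["results"].first
  components = first["address_components"].each_with_object({}) do |component, acc|
    component["types"].each { |type| acc[type] ||= component["long_name"] }
  end
  components.merge("formatted_address" => first["formatted_address"])
end

# Performs the actual network call; requires a valid key and uses your quota.
def geocode(address, api_key)
  structured_address(Net::HTTP.get(geocode_uri(address, api_key)))
end
```

Calling `geocode("191 Hawley Road, Dartford, DA1 1PU", "YOUR_API_KEY")` would then return a hash keyed by component type (for example `"postal_code"` or `"route"`), or `nil` if nothing matched. Keeping the parsing in `structured_address` separate from the HTTP call makes it possible to test the parsing against a saved response without hitting the service.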

Steve Weet
This has the advantage that you can verify that the address actually exists (in most cases), especially if you have postal or ZIP codes.
bjg
I think this will be my default approach if there is nothing available out of the box. It makes a whole lot of sense; it then just comes down to cost.
dagda1