I am working on a program that will automatically get your character's stats and whatnot from the WoW Armory. I already have the HTML, and I can identify where the string is, but I need to get the "this.effective" value, which in this case is 594. Since it is always changing (and so are the other values), I can't just take it from a fixed position. Any help would be GREATLY appreciated.

Thanks

Matt

This is the HTML snippet:

 function strengthObject() {
  this.base="168";
  this.effective="594";
  this.block="29";
  this.attack="1168";

this.diff=this.effective - this.base;


A: 

One way would be to use a regular expression to extract this value from the HTML source:

this\.effective="(\d+)"

Note that HTML scraping is not an ideal solution (for example, it may break when the format of the HTML changes). However, I don't know the "wow armory" or what other ways there are to get this information.

Daniel Fortunov
I don't think there are any other ways of retrieving this information programmatically.
Matt
You can get the data as XML documents if you set the user agent.
John Burton
+1  A: 

You can do it using regular expressions:

using System;
using System.Text.RegularExpressions;

class Program
{
    public static void Main()
    {
        string html = @"        function strengthObject() {
                this.base=""168"";
                this.effective=""594"";
                this.block=""29"";
                this.attack=""1168"";";

        // The dot is escaped so it matches a literal '.' rather than any character.
        string regex = @"this\.effective=""(\d+)""";

        Match match = Regex.Match(html, regex);
        if (match.Success)
        {
            int effective = int.Parse(match.Groups[1].Value);
            Console.WriteLine("Effective = " + effective);
            // etc..
        }
        else
        {
            // Handle failure...
        }
    }
}
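
If you need the other values as well, the same idea can be stretched to pull every assignment in one pass. This is only a sketch, assuming each stat is written as this.<name>="<digits>" exactly as in the snippet from the question; StatScraper and ExtractStats are made-up names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class StatScraper
{
    // Matches assignments such as this.effective="594". The dot is escaped so it
    // only matches a literal '.', and the property name is captured by name.
    private static readonly Regex StatPattern =
        new Regex(@"this\.(?<name>\w+)=""(?<value>\d+)""");

    // Hypothetical helper: returns every this.<name>="<digits>" pair found in the text.
    public static Dictionary<string, int> ExtractStats(string html)
    {
        var stats = new Dictionary<string, int>();
        foreach (Match m in StatPattern.Matches(html))
        {
            // If a name appears more than once, the last occurrence wins.
            stats[m.Groups["name"].Value] = int.Parse(m.Groups["value"].Value);
        }
        return stats;
    }

    public static void Main()
    {
        string html = @"function strengthObject() {
            this.base=""168"";
            this.effective=""594"";
            this.block=""29"";
            this.attack=""1168"";";

        foreach (var pair in ExtractStats(html))
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

Lines like this.diff=this.effective - this.base; are skipped because they have no quoted number. If the page holds several objects that reuse the same property names, you would probably want to cut out the text of the one function first and run ExtractStats on that piece only.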
Mark Byers
Thanks for this great idea, I never thought about using it this way.
Matt
+1  A: 

It's much easier to extract the information from the XML version of the website.

If you make a request to a URL like this one (with a valid character name), you get back an XML document, and you can use an XML parser to extract the data easily.

http://eu.wowarmory.com/character-sheet.xml?r=Nordrassil&cn=Someone

The URLs are the same as the ones you see in your web browser.

Please note, though, that you MUST set the User-Agent field of the request to that of a browser that supports the XML version of the page, or you get back HTML instead. I use "Mozilla/5.0 Firefox/2.0.0.1" as the user agent in my program and it works fine.
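
In C#, setting the user agent could look roughly like the sketch below. It assumes the URL above still answers the way I described; ArmoryXml and BuildRequest are made-up names. It does not assume anything about the element names inside the XML, so it only prints the root element; look at the returned document to see what you need to query.

```csharp
using System;
using System.Net.Http;
using System.Xml.Linq;

static class ArmoryXml
{
    // The user agent string quoted above.
    const string UserAgent = "Mozilla/5.0 Firefox/2.0.0.1";

    // Builds a GET request that carries the browser-like User-Agent header.
    public static HttpRequestMessage BuildRequest(string url)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.UserAgent.ParseAdd(UserAgent);
        return request;
    }

    public static void Main()
    {
        string url = "http://eu.wowarmory.com/character-sheet.xml?r=Nordrassil&cn=Someone";

        using (var client = new HttpClient())
        using (var response = client.SendAsync(BuildRequest(url)).Result)
        {
            response.EnsureSuccessStatusCode();
            string body = response.Content.ReadAsStringAsync().Result;

            // Parse the response as XML; this throws if HTML came back instead.
            XDocument doc = XDocument.Parse(body);
            Console.WriteLine("Root element: " + doc.Root.Name);
        }
    }
}
```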

Oh, also don't make more than a few requests in a second, or average more than one request every 3 or 4 seconds, or the site blocks your IP for a few hours ...
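
One simple way to stay under that limit is to force a minimum gap between requests. A minimal sketch, assuming a fixed 4 second gap is enough (the exact limits are only what I observed); RequestThrottle is a made-up helper and the character names are placeholders.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper: blocks until at least minInterval has passed since the previous call.
sealed class RequestThrottle
{
    private readonly TimeSpan minInterval;
    private readonly Stopwatch sinceLast = new Stopwatch();

    public RequestThrottle(TimeSpan minInterval)
    {
        this.minInterval = minInterval;
    }

    // Call this immediately before each request. The first call returns at once.
    public void Wait()
    {
        if (sinceLast.IsRunning)
        {
            TimeSpan remaining = minInterval - sinceLast.Elapsed;
            if (remaining > TimeSpan.Zero)
            {
                Thread.Sleep(remaining);
            }
        }
        sinceLast.Restart();
    }
}

static class ThrottleDemo
{
    public static void Main()
    {
        var throttle = new RequestThrottle(TimeSpan.FromSeconds(4));
        string[] characters = { "Someone", "SomeoneElse" }; // placeholder names

        foreach (string name in characters)
        {
            throttle.Wait();
            Console.WriteLine(DateTime.Now.ToString("T") + " requesting " + name);
            // ... send the request for this character here ...
        }
    }
}
```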

John Burton