views:

223

answers:

4

I'll post my entire class and maybe someone with MUCH more experience can help me design something better. I'm really new to doing things Asynchronously, so I'm really lost here. Hopefully my design isn't TOO bad. :P

IMDB Class:

public class IMDB
{
    WebClient WebClientX = new WebClient();
    byte[] Buffer = null;

    public string[] SearchForMovie(string SearchParameter)
    {
        //Format the search parameter so it forms a valid IMDB *SEARCH* url.
        //From within the search website we're going to pull the actual movie
        //link.
        string sitesearchURL = FindURL(SearchParameter);

        //Have a method download asynchronously the ENTIRE source code of the
        //IMDB *search* website, and save it to the byte[] "Buffer".
        WebClientX.DownloadDataCompleted += new DownloadDataCompletedEventHandler(WebClientX_DownloadDataCompleted);
        WebClientX.DownloadDataAsync(new Uri(sitesearchURL));

        //Convert the byte[] to a string so we can easily find the *ACTUAL*
        //movie URL.
        string sitesearchSource = Encoding.ASCII.GetString(Buffer);

        //Pass the IMDB source code to method FindInformation() to FIND the movie
        //URL.
        string MovieURL = FindMovieURL(sitesearchSource);


        //Download the source code from the recently found movie URL.
        WebClientX.DownloadDataAsync(new Uri(MovieURL));

        //Convert the source code to readable string for scraping of information.
        string sitemovieSource = Encoding.ASCII.GetString(Buffer);

        string[] MovieInformation = ScrapeInformation(sitemovieSource);
        return MovieInformation;
    }

    void WebClientX_DownloadDataCompleted(object sender, DownloadDataCompletedEventArgs e)
    {
        Buffer = e.Result;
        throw new NotImplementedException();
    }

    /// <summary>
    /// Formats a valid IMDB url for ease of use according to a search parameter.
    /// </summary>
    /// <param name="sitesearchSource"></param>
    /// <returns></returns>
    private string FindMovieURL(string sitesearchSource)
    {
        int Start = sitesearchSource.IndexOf("<link rel=\"canonical\" href=\"");
        string IMDBCode = sitesearchSource.Substring((Start + 28), 35);

        return IMDBCode;            
    }

    private string[] ScrapeInformation(string Source)
    {
        string[] Information = new string[5];

        Information[0] = FindTitle(Source);
        Information[1] = FindDirector(Source);
        Information[2] = FindYear(Source);
        Information[3] = FindPlot(Source);
        Information[4] = FindPoster(Source);

        return Information;
    }

    /*************************************************************************/

    private string FindURL(string Search)
    {
        string[] SearchArray = Search.Split(' ');
        string FormattedQuery = "";
        foreach (string X in SearchArray)
        {
            FormattedQuery += X + "+";
        }

        FormattedQuery.Remove((FormattedQuery.Length - 1), 1);

        string TheFormattedQuery = "http://www.imdb.com/find?s=all&amp;q=" + FormattedQuery + "&x=0&y=0";
        return TheFormattedQuery;
    }

    private string FindTitle(string Source)
    {
        //<title>Couples Retreat (2009)</title>

        int Start = Source.IndexOf("<title>");
        string Bookmark = Source.Substring((Start + 7), 400);

        int End = Bookmark.IndexOf("</title>");
        string Title = Bookmark.Substring(0, End - 7);

        return Title;
    }

    private string FindDirector(string Source)
    {
        int Start = Source.IndexOf("<h5>Director:</h5>");
        string Bookmark = Source.Substring((Start + 18), 250);

        Start = Bookmark.IndexOf(">");
        Bookmark = Bookmark.Substring(Start + 1, 100);

        int End = Bookmark.IndexOf("</a>");
        string Director = Bookmark.Substring(0, End - 1);

        return Director;
    }

    private string FindYear(string Source)
    {
        int Start = Source.IndexOf("<h5>Release Date:</h5>");
        string Bookmark = Source.Substring((Start + 22), 40);

        int End = Bookmark.IndexOf("<a class=");
        string ReleaseYear = Bookmark.Substring(0, End - 1);

        return ReleaseYear;
    }

    private string FindPlot(string Source)
    {
        int Start = Source.IndexOf("<h5>Plot:</h5>");
        string Bookmark = Source.Substring((Start + 14), 700);

        int End = Bookmark.IndexOf("<a class");
        string Plot = Bookmark.Substring(0, End - 1);

        return Plot;
    }

    private string FindPoster(string Source)
    {
        int Start = Source.IndexOf("<a name=\"poster\" href=");
        string Bookmark = Source.Substring((Start + 22), 700);

        Start = Bookmark.IndexOf("src=\"");
        string PosterURL = Bookmark.Substring((Start + 5), 103);

        return PosterURL;            
    }        
}

Form1.cs Class (My windows Form):

public partial class MainSearchForm : Form
{
    public MainSearchForm()
    {
        InitializeComponent();
    }

    SearchFunctions.IMDB IMDBClass = new QuickFlick.SearchFunctions.IMDB();
    int YPosition = 5;

    private void btnSearch_Click(object sender, EventArgs e)
    {
        string[] MovieInformation = IMDBClass.SearchForMovie(txtSearch.Text);
        LoadMovieInformation(MovieInformation);
    }

    public void LoadMovieInformation(string[] FoundInfo)
    {
        MovieItem TheMovieItem = new MovieItem();
        TheMovieItem.SetTitle(FoundInfo[0]);
        TheMovieItem.SetDirector(FoundInfo[1]);
        TheMovieItem.SetRelease(FoundInfo[2]);
        TheMovieItem.SetPlot(FoundInfo[3]);
        TheMovieItem.SetPoster(FoundInfo[4]);

        TheMovieItem.Location = new Point(5, YPosition);
        YPosition += 196;

        panel1.Controls.Add(TheMovieItem);
    }
}

Now, the gist of what my program is trying to accomplish.

A user will write the name down of a movie, and I'll pull up the information about it. Nothing else! :P It's mostly intended for me to learn Async functions etc. but I'm scared I might be approaching this the completely wrong way.

Once again, I'm not looking for much code, just design of the program. Methods, order of methods, unnecesary methods, etc. :D

Thanks a bunch SO, as always, you rock!

+2  A: 

I'm not going to comment on the entire design because, like @Vinko, I think you ought to focus your question a little more in that respect. I will say though that you are fundamentally misunderstanding how the asynchronous methods work. They will return immediately to the calling thread without completing first. That's the whole point with asynchronous threads - they don't wait until they have finished before they return.

For this to work, you must either use synchronous calls or wait on some event after the call and signal that event in the asynchronous callback handler so that your code doesn't attempt to access the returned data until after the call has completed. The benefit in this approach is that your thread is free to perform other tasks while waiting on the call to complete, but you must absorb the complexity of managing the waiting process (it's not much).

See the sample code for the DownloadDataCompleted event at MSDN. You'll need to do a wait for each of the web client downloads that you do. Note that you need to invoke the Set() method on the AutoResetEvent object passed to your handler in the example in your handler.

tvanfosson
+1  A: 

You need to have models.

For example the FindTitle() and FindDirector() methods should be in a separate class.. perhaps called 'Movie' or 'IMDBMovie'. This class should have attributes such as 'title' and 'director' with getters / setters.

This article could prove helpful to you: MVC

clownbaby
I am not entirely sure that MVC is required (it is a Windows application), however I +1'd for the gist of what you are saying. Definitely, the OP should look into a bit more logical separation. Breaking anything down into logical separation with clear-cut data objects (or domain objects, if you are going down a DDD route) and services would be a good point for refactoring. As I am hinting at in the original question, the approach will vary based upon the scope that this plays in the entire application.
joseph.ferris
I didn't say MVC is required. I was simply letting the OP know about a well known OO architecture.
clownbaby
+1  A: 

Better to split the model object from the data retrieval object.

  1. User sees View, uses btnSearch_Click
  2. Search starts on other thread, btnSearch_Click action finishes
  3. Search finishes, data may be available, new object creation and addition to view.
  4. View has new data event and redisplays content.

Some adjusted code to do the async calls.

public string[] SearchForMovie(string SearchParameter)
{
    //Format the search parameter so it forms a valid IMDB *SEARCH* url.
    //From within the search website we're going to pull the actual movie
    //link.
    string sitesearchURL = FindURL(SearchParameter);

    //Have a method download asynchronously the ENTIRE source code of the
    //IMDB *search* website, and save it to the byte[] "Buffer".
    WebClientX.DownloadDataCompleted += new DownloadDataCompletedEventHandler(WebClientX_DownloadSearchCompleted);
    WebClientX.DownloadDataAsync(new Uri(sitesearchURL));

}

void WebClientX_DownloadSearchCompleted(object sender, DownloadDataCompletedEventArgs e)
{
    Buffer = e.Result;

    //Convert the byte[] to a string so we can easily find the *ACTUAL*
    //movie URL.
    string sitesearchSource = Encoding.ASCII.GetString(Buffer);

    //Pass the IMDB source code to method FindInformation() to FIND the movie
    //URL.
    string MovieURL = FindMovieURL(sitesearchSource);


    //Download the source code from the recently found movie URL.
    WebClientX.DownloadDataCompleted += new DownloadDataCompletedEventHandler(WebClientX_DownloadMovieCompleted);
    WebClientX.DownloadDataAsync(new Uri(MovieURL));
}

void WebClientX_DownloadMovieCompleted(object sender, DownloadDataCompletedEventArgs e)
{
    Buffer = e.Result;
    //Convert the source code to readable string for scraping of information.
    string sitemovieSource = Encoding.ASCII.GetString(Buffer);

    // would create a movie object here rather than have the scrape function on this class
    string[] MovieInformation = ScrapeInformation(sitemovieSource);
    Model.LoadMovieInformation(MovieInformation);
}
Greg Domjan
+1  A: 

I would agree with clownbaby on the need for models, but would disagree on the need for FindTitle, FindDirectory etc. needing to be in separate classes.

I think that you should structure your code as follows:

  1. Have a class that takes in the search parameters in a generic way and then calls a search through some kind of factory class.
  2. Have the factory class internally create a specialized version of the search class based on either a specification by the user, or a specification in the configuration settings of the application.
  3. Have the search class format the search query and then call another class which simply makes the HTTP request and gets back the HTTP response contents to the search class.
  4. Have the search class then parse the HTTP response to generate the results to be fed back to the class that called the search in Item 1.

The advantages of this sort of approach are that you can then isolate the code that is specialized for processing website specific parsing routines into it's own class and isolate it from the UI as well as utilize a single class to make the HTTP request, regardless of how many websites you want to scrape.

Put another way, the View layer should not care how exactly the data is retrieved and parsed and should have the flexibility to call on various sources of information.

Likewise, the Data layer should not care how the data should be processed and should only be concerned with where to get the data from.

The Domain layer is the one that will sit in the middle, take raw data from the Data layer, parse it and make sense of it and then format it into a shape that the View can understand.

This way, if say you want to also target Yahoo! Movies, Wikipedia or any of the other sources of movie data, you can do that. All you would have to do is add another class in the Domain layer and inform the View that a new source is available in the Domain layer.

Good design is about limiting the impact of changes so that additional features can be built into the application.

Good design is also about being able to test individual components. A design such as the one I have described will allow you to test the various classes that comprise individual layers and ensure that they work as advertised.

Something I find very useful is that if a class cannot be tested very easily using NUnit and Rhino.Mocks, it's probably not very well-designed.

Umar Farooq Khawaja