views:

425

answers:

3

I am trying to build a c# console app that will monitor about 3000 urls (Just need to know that HEAD request returned 200, not necessarily content, etc.)

My attempt here was to build a routine the checks the web URLS, looping and creating threads each executing the routine. What's happening is if i run with <20 threads, it executes ok most of the time, but if i use >20 threads, some of the url's time out. I tried increasing the Timeout to 30 seconds, same occurs. The network I am running this on is more than capable of executing 50 HTTP HEAD requests (10MBIT connection at ISP), and both the CPU and network run very low when executing the routine.

When a timeout occurs, i test the same IP on a browser and it works fine, I tested this repeatedly and there was never a case during testing that a "timed out" url was actually timing out.

The reason i want to run >20 threads is that i want to perform this test every 5 minutes, with some of the URL's taking a full 10sec (or higher if the timeout is set higher), i want to make sure that its able to run through all URLs within 2-3 minutes.

Is there a better way to go about checking if a URL is available, or, should I be looking at the system/network for an issue.

MAIN

        while (rdr.Read())
        {
            Thread t = new Thread(new ParameterizedThreadStart(check_web));

            t.Start(rdr[0]);


        }


      static void check_web(object weburl)
      {
          bool isok;
          isok = ConnectionAvailable(weburl.ToString());
      }



      public static bool ConnectionAvailable(string strServer)
      {

          try
          {
              strServer = "http://" + strServer;
              HttpWebRequest reqFP = (HttpWebRequest)HttpWebRequest.Create(strServer);
              reqFP.Timeout = 10000;
              reqFP.Method = "HEAD";

              HttpWebResponse rspFP = (HttpWebResponse)reqFP.GetResponse();
              if (HttpStatusCode.OK == rspFP.StatusCode)
              {
                  Console.WriteLine(strServer + " - OK");
                  rspFP.Close();
                  return true;
              }
              else
              {
                  Console.WriteLine(strServer + " Server returned error..");
                  rspFP.Close();
                  return false;

              }

          }

          catch (WebException x)
          {
              if (x.ToString().Contains("timed out"))
              {
                  Console.WriteLine(strServer + " - Timed out");
              }
              else
              {
                  Console.WriteLine(x.Message.ToString());
              }

              return false;

          }

      }
+5  A: 

Just remember, you asked.

Very bad implementation.

  1. Do not go creating threads like that. It does very little good to have more threads than processor cores. The extra threads will pretty much just compete with each other, especially since they're all running the same code.

  2. You need to implement using blocks. If you throw an exception (and chances are you will), then you will be leaking resources.

  3. What is the purpose in returning a bool? Do you check it somewhere? In any case, your error and exception processing are a mess.

    • When you get a non-200 response, you don't display the error code.
    • You're comparing against the Message property to decide if it's a timeout. Microsoft should put a space between the "time" and "out" just to spite you.
    • When it's not a timeout, you display only the Message property, not the entire exception, and the Message property is already a string and doesn't need you to call ToString() on it.

Next Batch of Changes

This isn't finished, I don't think, but try this one:

public static void Main()
{
    // Don't mind the interpretation. I needed an excuse to define "rdr"
    using (var conn = new SqlConnection())
    {
        conn.Open();
        using (var cmd = new SqlCommand("SELECT Url FROM UrlsToCheck", conn))
        {
            using (var rdr = cmd.ExecuteReader())
            {
                while (rdr.Read())
                {
                    // Use the thread pool. Please.
                    ThreadPool.QueueUserWorkItem(
                        delegate(object weburl)
                            {
                                // I invented a reason for you to return bool
                                if (!ConnectionAvailable(weburl.ToString()))
                                {
                                    // Console would be getting pretty busy with all
                                    // those threads
                                    Debug.WriteLine(
                                        String.Format(
                                            "{0} was not available",
                                            weburl));
                                }
                            },
                            rdr[0]);
                }
            }
        }
    }
}

public static bool ConnectionAvailable(string strServer)
{
    try
    {
        strServer = "http://" + strServer;
        var reqFp = (HttpWebRequest)WebRequest.Create(strServer);
        reqFp.Timeout = 10000;
        reqFp.Method = "HEAD";

        // BTW, what's an "FP"?
        using (var rspFp = (HttpWebResponse) reqFp.GetResponse()) // IDisposable 
        {
            if (HttpStatusCode.OK == rspFp.StatusCode)
            {
                Debug.WriteLine(string.Format("{0} - OK", strServer));
                return true; // Dispose called when using is exited
            }

            // Include the error because it's nice to know these things
            Debug.WriteLine(String.Format(
                 "{0} Server returned error: {1}", 
                 strServer, rspFp.StatusCode));
            return false;
        }
    }
    catch (WebException x)
    {
        // Don't tempt fate and don't let programs read human-readable messages
        if (x.Status == WebExceptionStatus.Timeout)
        {
            Debug.WriteLine(string.Format("{0} - Timed out", strServer));
        }
        else
        {
            // The FULL exception, please
            Debug.WriteLine(x.ToString());
        }

        return false;
    }
}

Almost Done - Not Tested Late Night Code

public static void Main()
{
    using (var conn = new SqlConnection())
    {
        conn.Open();
        using (var cmd = new SqlCommand("", conn))
        {
            using (var rdr = cmd.ExecuteReader())
            {
                if (rdr == null)
                {
                    return;
                }

                while (rdr.Read())
                {
                    ThreadPool.QueueUserWorkItem(
                        CheckConnectionAvailable, rdr[0]);
                }
            }
        }
    }
}

private static void CheckConnectionAvailable(object weburl)
{
    try
    {
        // If this works, it's a lot simpler
        var strServer = new Uri("http://" + weburl);
        using (var client = new WebClient())
        {
            client.UploadDataCompleted += ClientOnUploadDataCompleted;
            client.UploadDataAsync(
                strServer, "HEAD", new byte[] {}, strServer);
        }
    }
    catch (WebException x)
    {
        Debug.WriteLine(x);
    }
}

private static void ClientOnUploadDataCompleted(
    object sender, UploadDataCompletedEventArgs args)
{
    if (args.Error == null)
    {
        Debug.WriteLine(string.Format("{0} - OK", args.UserState));
    }
    else
    {
        Debug.WriteLine(string.Format("{0} - Error", args.Error));
    }
}
John Saunders
yes, take a breath! There is so much more to be said. He's also probably running into a connection per second issue depending on the OS he is running.
jvanderh
It's midnight GMT-5, and past my bedtime, so I may leave the rest until tomorrow. I thought it would be nice to not block thread pool threads for no good reason, so I think that will be next. After all that, I'd better test this, or I'll be very embarrassed if it doesn't work.
John Saunders
+1  A: 

Use ThreadPool class. Don't spawn hundreds of threads like this. Threads have such a huge overhead and what happens in your case is that your CPU will spend 99% time on context switching and 1% doing real work.

lubos hasko
A: 

Don't use threads.

Asynch Call backs and queues. Why create a thread when the resource that they are all wanting is access to the outside world. Limit your threads to about 5, and then implement a class that uses a queue. split the code into two parts, the fetch and the process. One controls the flow of data while the other controls access to the outside world.

Use whatever language you like but you won't got wrong if you think that threads are for processing and number crunching and async call backs are for resource management.

IGA