I am using the .NET Chart Control library that comes with .NET 4.0 Beta 2 to create and save images to disk on a background thread. I am not showing the chart on screen; I simply create a chart, save it to disk, and destroy it. Something like this:

public void GeneratePlot(IList<DataPoint> series, Stream outputStream) {
    using (var ch = new Chart()) {
        ch.ChartAreas.Add(new ChartArea());
        var s = new Series();
        foreach (var pnt in series) s.Points.Add(pnt);
        ch.Series.Add(s);
        ch.SaveImage(outputStream, ChartImageFormat.Png);
    }
}

It was taking about 300 - 400 ms to create and save each chart. I have potentially hundreds of charts to create, so I thought I would use Parallel.For() to parallelize these tasks. I have an 8 core machine. However, when I try to create 4 charts at a time, my chart create/save time increases to anywhere from 800 to 1400 ms, almost all of which is consumed by Chart.SaveImage.

I thought this might be a limitation of disk I/O, so to test that I changed the last line to:

ch.SaveImage(Stream.Null, ChartImageFormat.Png);

Even when writing to a null stream, the performance is still about the same (800 - 1400 ms).

Am I not supposed to create images on background threads in parallel with this library, or am I doing something wrong?

Thanks

EDIT: Added Complete Code Sample

Simply change the flag passed to CreateCharts() to test parallel versus serial.

using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using System.Windows.Forms.DataVisualization.Charting;

namespace ConsoleChartTest
{
    class Program
    {
        public static void GeneratePlot(IEnumerable<DataPoint> series, Stream outputStream)
        {
            long beginTime = Environment.TickCount;

            using (var ch = new Chart())
            {
                ch.ChartAreas.Add(new ChartArea());
                var s = new Series();
                foreach (var pnt in series)
                    s.Points.Add(pnt);
                ch.Series.Add(s);

                long endTime = Environment.TickCount;
                long createTime = endTime - beginTime;

                beginTime = Environment.TickCount;
                ch.SaveImage(outputStream, ChartImageFormat.Png);
                endTime = Environment.TickCount;
                long saveTime = endTime - beginTime;

                Console.WriteLine("Thread Id: {0,2}  Create Time: {1,3}  Save Time: {2,3}",
                    Thread.CurrentThread.ManagedThreadId, createTime, saveTime);
            }
        }

        public static void CreateCharts(bool parallel)
        {
            var data = new DataPoint[20000];
            for (int i = 0; i < data.Length; i++)
            {
                data[i] = new DataPoint(i, i);
            }

            if (parallel)
            {
                Parallel.For(0, 10, (i) => GeneratePlot(data, Stream.Null));
            }
            else
            {
                for (int i = 0; i < 10; i++)
                    GeneratePlot(data, Stream.Null);
            }
        }

        static void Main(string[] args)
        {
            Console.WriteLine("Main Thread Id: {0,2}", Thread.CurrentThread.ManagedThreadId);

            long beginTime = Environment.TickCount;
            CreateCharts(false);
            long endTime = Environment.TickCount;
            Console.WriteLine("Total Time: {0}", endTime - beginTime);
        }
    }
}
A: 

Maybe you could save the image as a BMP, which would take more disk space but would cut down on the amount of processing needed. You should try different image formats to see whether the compression is causing the problem. If so, you might want to spin off some other threads to do the compression afterwards.

Kibbee
This does not answer why it doesn't appear to run in parallel.
Dykam
It may run in parallel, but if your computer is task switching too much, things could actually slow down. Perhaps the PNG encoder already runs in parallel under the hood, and running parallel instances of it creates too many threads, causing your machine to context switch too much. What does your processor usage look like when running with 1 thread, compared to many threads? Is it already at 100% with 1 thread?
Kibbee
I tried BMP (along with all the other formats), but it didn't really affect the parallel performance.
dewald
A: 

Remember that you have hyper-threaded cores as well as real cores. Be careful about this: a hyper-threaded core doesn't have the same performance as a real core.

Another thing: a good practice for parallel execution is to set a maximum number of threads when you create your thread pool, like

MaxThreads = Cores - 2;

When I say cores, I mean real cores, not hyper-threaded cores. The breakdown is:

1 - OS
1 - Main application
X - Cores to process

If you create too many threads, you are going to lose performance because they contend for the processor.
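
One way the "cores minus two" cap could be expressed with the Task Parallel Library is through `ParallelOptions.MaxDegreeOfParallelism`. This is only a sketch: `ThreadCap`, `MaxThreads`, and `RunCapped` are hypothetical names, the loop body is a placeholder for the chart work, and the physical core count is passed in by hand because `Environment.ProcessorCount` reports logical processors, hyper-threaded ones included.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ThreadCap
{
    // Hypothetical helper: applies the "cores - 2" rule, never going below 1.
    public static int MaxThreads(int physicalCores)
    {
        return Math.Max(1, physicalCores - 2);
    }

    // Runs body for i in [0, count) with at most maxThreads concurrent workers.
    public static void RunCapped(int count, int maxThreads, Action<int> body)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };
        Parallel.For(0, count, options, body);
    }
}

static class Program
{
    static void Main()
    {
        // Assumption: 8 physical cores, as on the asker's machine.
        int maxThreads = ThreadCap.MaxThreads(8);
        int done = 0;

        // Placeholder body: a real one would call GeneratePlot(...).
        ThreadCap.RunCapped(10, maxThreads, i => Interlocked.Increment(ref done));

        Console.WriteLine("Ran {0} items on at most {1} threads", done, maxThreads);
    }
}
```

A cap like this only limits how many iterations run at once; it does not help if the work inside each iteration is serialized by something else.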

Creating JPEG or PNG images is another good idea, because you'll spend less time on the hard disk while saving the image. Watch the quality setting of the JPEG and PNG too, because at 100% the files can be big.

Another relevant point: you are going to have contention on the hard disk, because a lot of threads will be creating files on it. What can you do about it? It's a harder problem to solve, because we don't have parallel hard disks. One option is to create a place to send the image buffers, such as another thread that doesn't process anything: it just receives the image buffers and stores them in an internal list, for example. Every 5 seconds (or on whatever condition you think is better), it starts writing the pictures to the hard disk. That way you have one thread working on the hard disk "without" contention, and the other threads only process the images.
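
A rough sketch of that single-writer idea, assuming a `BlockingCollection<byte[]>` as the hand-off between worker threads and one consumer task. `ImageWriterQueue` is a hypothetical name, the fake 1 KB buffers stand in for rendered images, and this version writes buffers as they arrive rather than on a 5-second timer.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical helper: workers hand finished image buffers to a queue,
// and a single consumer task is the only code that touches the output.
sealed class ImageWriterQueue : IDisposable
{
    private readonly BlockingCollection<byte[]> _queue = new BlockingCollection<byte[]>();
    private readonly Task _writer;

    public ImageWriterQueue(Action<byte[]> write)
    {
        // One long-running consumer drains the queue in arrival order.
        _writer = Task.Factory.StartNew(() =>
        {
            foreach (var buffer in _queue.GetConsumingEnumerable())
                write(buffer);
        }, TaskCreationOptions.LongRunning);
    }

    // Called from any worker thread; does no I/O itself.
    public void Enqueue(byte[] imageBytes)
    {
        _queue.Add(imageBytes);
    }

    // Signals "no more images" and waits until everything queued is written.
    public void Dispose()
    {
        _queue.CompleteAdding();
        _writer.Wait();
        _queue.Dispose();
    }
}

static class Program
{
    static void Main()
    {
        long written = 0;

        // Placeholder sink: a real one would write each buffer to a file.
        using (var queue = new ImageWriterQueue(bytes => written += bytes.Length))
        {
            Parallel.For(0, 10, i =>
            {
                // Stand-in for saving a chart to a MemoryStream and taking its bytes.
                queue.Enqueue(new byte[1024]);
            });
        }

        Console.WriteLine("Bytes handed to the writer: {0}", written);
    }
}
```

This only separates disk access from rendering; it would not speed up a test that already writes to `Stream.Null`.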

Regards,

SaCi
@SaCi - It is not a HD issue as I am testing it by writing to Stream.Null, so it is never touching the HD. I tried decreasing the maximum number of threads to 4 (since I am on an 8 core machine), but that didn't seem to help.
dewald
A: 

If I had to guess, I'd say it looks like the code inside SaveImage is protected by a CriticalSection or a lock that allows only one thread at a time to run in parts of that code.

So, you could be right about not being allowed to use background threads, but I think it's more likely that you aren't allowed to have more than 1 thread running inside SaveImage at a time. The documentation on this function is pretty sparse, but the timings are very suggestive: 4 charts take about 4 times as long as 1 chart.

If you save just 1 chart using a background thread - does it go at full speed?
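
To illustrate the kind of timing pattern a shared lock would produce, here is a small sketch that has nothing to do with the chart library itself. `LockDemo` is a hypothetical name and `Thread.Sleep` is only a stand-in for rendering work; the point is that when every parallel iteration must hold the same lock, the total time grows with the number of items instead of staying flat.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

static class LockDemo
{
    private static readonly object Gate = new object();

    // Stand-in for rendering work: blocks the calling thread for ms milliseconds.
    static void Work(int ms)
    {
        Thread.Sleep(ms);
    }

    // Runs count work items in parallel. When locked is true, each item must
    // hold a shared lock, so only one item can execute at a time.
    public static long Run(int count, int ms, bool locked)
    {
        var sw = Stopwatch.StartNew();
        Parallel.For(0, count, i =>
        {
            if (locked)
            {
                lock (Gate) { Work(ms); }
            }
            else
            {
                Work(ms);
            }
        });
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine("Unlocked: {0}ms", LockDemo.Run(4, 100, false));
        Console.WriteLine("Locked:   {0}ms", LockDemo.Run(4, 100, true));
    }
}
```

If the locked run takes roughly `count` times as long as a single item while the unlocked run does not, that matches the "4 charts take about 4 times as long" observation, though it does not prove where the lock is.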

John Knoeller
+2  A: 

You're running into problems with the System.Drawing namespace. There's some heavy thread locking in there which will serialize certain tasks. The image isn't actually rendered until you call Chart.SaveImage(), and that's what's eating all your time.

If you change your test program up a bit you can see parallelization is happening, but it is being severely hindered by the locking inside the Graphics drawing code.

Toy around with the count = 50 in the main method here. Seeing both outputs at the same time helps, I think: you can see that the parallel one is consistently faster, though it doesn't scale linearly because of the locking in the drawing namespace:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using System.Windows.Forms.DataVisualization.Charting;

namespace ConsoleChartTest
{
  class Program
  {
    static void Main(string[] args)
    {
      var count = 50;
      Console.WriteLine("Serial Test Start, Count: {0}", count);
      Console.WriteLine("Main Thread Id: {0,2}", Thread.CurrentThread.ManagedThreadId);

      var sw = new Stopwatch();
      sw.Start();
      CreateCharts(count, false);
      sw.Stop();
      Console.WriteLine("Total Serial Time: {0}ms", sw.ElapsedMilliseconds);

      Console.WriteLine("Parallel Test Start");
      Console.WriteLine("Main Thread Id: {0,2}", Thread.CurrentThread.ManagedThreadId);

      sw.Restart();
      CreateCharts(count, true);
      sw.Stop();
      Console.WriteLine("Total Parallel Time: {0}ms", sw.ElapsedMilliseconds);
    }

    public static void GeneratePlot(IEnumerable<DataPoint> series, Stream outputStream)
    {
      var sw = new Stopwatch();
      sw.Start();

      // Dispose the chart when done, as the question's version does.
      using (var ch = new Chart())
      {
        ch.ChartAreas.Add(new ChartArea());
        var s = new Series();
        foreach (var pnt in series) s.Points.Add(pnt);
        ch.Series.Add(s);

        sw.Stop();
        long createTime = sw.ElapsedMilliseconds;
        sw.Restart();

        // The chart is rendered here, not while it is being built.
        ch.SaveImage(outputStream, ChartImageFormat.Png);
        sw.Stop();

        Console.WriteLine("Thread Id: {0,2}  Create Time: {1,3}ms  Save Time: {2,3}ms",
            Thread.CurrentThread.ManagedThreadId, createTime, sw.ElapsedMilliseconds);
      }
    }

    public static void CreateCharts(int count, bool parallel)
    {
      var data = new DataPoint[20000];
      if (parallel)
      {
        Parallel.For(0, data.Length, (i) => data[i] = new DataPoint(i, i));
        Parallel.For(0, count, (i) => GeneratePlot(data, Stream.Null));
      }
      else
      {
        for (int i = 0; i < data.Length; i++)
          data[i] = new DataPoint(i, i);
        for (int i = 0; i < count; i++)
          GeneratePlot(data, Stream.Null);
      }
    }
  }
}

What's locking up is Chart.SaveImage() -> ChartImage.GetImage() -> ChartPicture.Paint()

Nick Craver
Is there any way that you know of to get around this bottleneck?
dewald
@dewald - Not directly, though there may be some third party alternatives that don't use `System.Drawing` (I think SoftwareFX uses their own engine, but it's not cheap). We can hope they improve threading in 4.0 RC/RTM, but this doesn't seem likely. Sorry the answer sucks, but from my view it's like the Drawing namespace got left in the dark ages.
Nick Craver