I currently have a multi-threaded downloader class that uses HttpWebRequest/HttpWebResponse. It all works fine and it's super fast, BUT the problem is that the data needs to be streamed to another app while it's downloading. That means it must be streamed in the right order: the first chunk first, then the next one in the queue. Currently my downloader class is synchronous and Download() returns byte[]. In my async multi-threaded class I create, for example, a list with 4 empty elements (slots) and pass the index of each slot to each thread that calls Download(). That simulates synchronization, but it's not what I need. How should I implement the queue so that the data is streamed as soon as the first chunk starts downloading?
To create a synchronized multi-threaded downloader, you will need the right data structure, and you'll need more than just a byte[] of data.
Steps:
- Break your download into multiple chunks, based either on the size of the content or on a fixed chunk size, e.g. about 500 KB downloaded by each thread.
- When starting each thread, specify its chunk index: 1st part, 2nd part, etc.
- When a chunk has been downloaded, place it in the final content according to its chunk index.
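A rough sketch of the first two steps, in case it helps. The names ChunkRange and ChunkPlanner are made up for illustration, and it assumes you already know the content length (for example from a HEAD request) and that the server honours range requests.

```csharp
using System;
using System.Collections.Generic;

// One contiguous byte range of the file (hypothetical helper type).
// Index is the chunk's position in the final output; Start and End are
// inclusive byte offsets.
public struct ChunkRange
{
    public int Index;
    public long Start;
    public long End;
}

public static class ChunkPlanner
{
    // Splits contentLength bytes into ranges of at most chunkSize bytes each.
    public static List<ChunkRange> Plan(long contentLength, long chunkSize)
    {
        if (contentLength < 0) throw new ArgumentOutOfRangeException("contentLength");
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        var ranges = new List<ChunkRange>();
        int index = 0;
        for (long start = 0; start < contentLength; start += chunkSize)
        {
            // The last chunk may be shorter than chunkSize.
            long end = Math.Min(start + chunkSize, contentLength) - 1;
            ranges.Add(new ChunkRange { Index = index++, Start = start, End = end });
        }
        return ranges;
    }
}
```

Each thread would then take one ChunkRange and ask for just those bytes, e.g. with HttpWebRequest.AddRange(range.Start, range.End), keeping range.Index alongside the downloaded bytes so the chunk can be put back in the right place.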
If interested, you may want to have a look at the code of prozilla (C, Linux based) or Axel.
Can you show the code where you do the downloads, and the code where you kick off the multiple async threads?
Maybe I am not understanding your scenario fully, but if I were you, I would use async reads (BeginRead on the response stream). Then I would do the following:
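using System;
using System.IO;

// StreamPump is just a placeholder name for whatever class holds these methods.
class StreamPump
{
    // State passed between the async callbacks (a minimal assumed definition).
    class Context
    {
        public byte[] Buffer;
        public Stream InputStream;
        public Stream OutputStream;
    }

    void StartReading(Stream responseStream)
    {
        byte[] buffer = new byte[1024];
        Context ctx = new Context();
        ctx.Buffer = buffer;
        ctx.InputStream = responseStream;
        ctx.OutputStream = new MemoryStream(); // change this to be your output stream
        responseStream.BeginRead(buffer, 0, buffer.Length, new AsyncCallback(ReadCallback), ctx);
    }

    void ReadCallback(IAsyncResult ar)
    {
        Context ctx = (Context)ar.AsyncState;
        try
        {
            int read = ctx.InputStream.EndRead(ar);
            if (read > 0)
            {
                ctx.OutputStream.Write(ctx.Buffer, 0, read);
                // kick off another async read
                ctx.InputStream.BeginRead(ctx.Buffer, 0, ctx.Buffer.Length, new AsyncCallback(ReadCallback), ctx);
            }
            else
            {
                // a read of 0 bytes means the end of the stream
                ctx.InputStream.Close();
                ctx.OutputStream.Close();
            }
        }
        catch
        {
            // the read or write failed; release both streams instead of leaking them
            ctx.InputStream.Close();
            ctx.OutputStream.Close();
        }
    }
}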
If your question is about how to determine which thread is downloading the first chunk and when that first chunk is ready for use, use an event per thread and keep track of which chunks you've assigned to which threads. Keep track of which event you pass to the first thread (that will be downloading the first chunk of data), the event you pass to the second thread (for the 2nd chunk of data) etc. Have the main thread, or another background thread (to avoid blocking the UI thread), wait on the first event. When the first thread finishes downloading its chunk, it sets/signals the first event. The thread that is waiting will then wake up and can use the first chunk of data.
The other download threads can do the same, signalling their respective events when they are done. Use a manual reset event so that the event will remain signaled even if nobody is waiting on it. When the consumer thread (the one that needs the data blocks in order) finishes processing the first data block, it can wait on the 2nd event. If the 2nd event has already been signalled, then the wait will return immediately and the thread can begin processing the 2nd data block.
For a very large download you can reuse the events and threads in a round-robin fashion. The order that they finish isn't important as long as the thread that consumes the data chunks consumes them in order and waits on the respective events in order.
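Here is a minimal sketch of the event-per-chunk idea, not a finished implementation. OrderedChunks and CopyInOrder are made-up names, the download delegate stands in for your existing synchronous Download(), and I'm assuming it returns a non-null byte[] on success. It uses the thread pool rather than dedicated threads, and it does not reuse events round-robin.

```csharp
using System;
using System.IO;
using System.Threading;

public static class OrderedChunks
{
    // Runs download(i) for every chunk on the thread pool and writes the
    // results to output strictly in index order.
    public static void CopyInOrder(int chunkCount, Func<int, byte[]> download, Stream output)
    {
        var chunks = new byte[chunkCount][];
        var ready = new ManualResetEvent[chunkCount];
        for (int i = 0; i < chunkCount; i++)
            ready[i] = new ManualResetEvent(false);

        for (int i = 0; i < chunkCount; i++)
        {
            int index = i; // copy, so each work item captures its own index
            ThreadPool.QueueUserWorkItem(delegate
            {
                try { chunks[index] = download(index); }
                catch (Exception) { /* slot stays null; the consumer reports it */ }
                finally { ready[index].Set(); } // always signal so the consumer never hangs
            });
        }

        // Consumer: wait on the events in order, whatever order the downloads finish in.
        for (int i = 0; i < chunkCount; i++)
        {
            ready[i].WaitOne();
            byte[] data = chunks[i];
            if (data == null)
                throw new IOException("Chunk " + i + " failed to download.");
            output.Write(data, 0, data.Length);
            chunks[i] = null; // forwarded; let the buffer be collected
            ready[i].Dispose();
        }
    }
}
```

You would call it from a background thread, something like OrderedChunks.CopyInOrder(4, i => downloader.Download(i), streamToOtherApp), where downloader and streamToOtherApp are your own objects. Note that nothing here stops the downloads from running far ahead of the consumer, so for a big file you would want to cap how many chunks are in flight.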
If you're clever and careful, you can probably do all of this using only one event: create a global array of data chunk pointers / objects initially set to null. Worker threads download chunks of data, assign the finished chunks to their respective slots in the global array and then signal the shared event. The consumer thread keeps a data chunk counter so it knows which data chunk it needs to handle next. It waits on the shared event, and when it is signalled it looks at the next slot in the global array to see if data has appeared there, and keeps going through the following slots for as long as they are filled (several workers may have signalled while it was busy). If there is still no data in the next slot in sequence, the consumer thread goes back to waiting on the event.
You'll also need a way for the worker threads to know which data block they should download next. A global counter protected by a mutex, or updated with an interlocked add/exchange (Interlocked.Increment in .NET), would suffice. Each worker thread increments the global counter, downloads that data chunk number, and assigns the result to the nth slot in the global list of data chunks.
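Something along these lines is what I have in mind for the single-event version. Treat it as a sketch under assumptions: SharedEventPipeline and Run are made-up names, download stands in for your synchronous Download() and is assumed to return a non-null byte[], and I picked an AutoResetEvent, which is why the consumer re-checks the slot before every wait.

```csharp
using System;
using System.IO;
using System.Threading;

public static class SharedEventPipeline
{
    public static void Run(int chunkCount, int workerCount, Func<int, byte[]> download, Stream output)
    {
        var slots = new byte[chunkCount][];           // one slot per chunk, null until it arrives
        var chunkArrived = new AutoResetEvent(false); // the single shared event
        int lastClaimed = -1;                         // shared counter of chunk numbers handed out
        Exception failure = null;                     // first download error, if any

        var workers = new Thread[workerCount];
        for (int w = 0; w < workerCount; w++)
        {
            workers[w] = new Thread(delegate()
            {
                while (Volatile.Read(ref failure) == null)
                {
                    // Atomically claim the next chunk number.
                    int index = Interlocked.Increment(ref lastClaimed);
                    if (index >= chunkCount) return;
                    try { Volatile.Write(ref slots[index], download(index)); }
                    catch (Exception ex) { Interlocked.CompareExchange(ref failure, ex, null); }
                    chunkArrived.Set(); // wake the consumer: new chunk or an error
                }
            });
            workers[w].IsBackground = true;
            workers[w].Start();
        }

        int next = 0; // the chunk the consumer needs next
        while (next < chunkCount)
        {
            byte[] data = Volatile.Read(ref slots[next]);
            if (data != null)
            {
                output.Write(data, 0, data.Length);
                slots[next++] = null; // forwarded; let the buffer be collected
            }
            else if (Volatile.Read(ref failure) != null)
            {
                throw new IOException("A chunk failed to download.", failure);
            }
            else
            {
                chunkArrived.WaitOne(); // nothing ready yet; sleep until a worker signals
            }
        }

        foreach (Thread t in workers) t.Join(); // no worker may touch the event after this
        chunkArrived.Dispose();
    }
}
```

The consumer only waits when the slot it needs is empty, so signals that get merged together while it is busy writing don't cause it to miss a chunk. As with any version of this, the workers can get ahead of the consumer, so you may want to bound how far ahead they are allowed to download.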