Can someone point me to an article or algorithm on how to read a long file at a certain rate? Say I do not want to exceed 10 KB/sec while issuing reads.
The crude solution is just to read a chunk at a time and then sleep, e.g. read 10 KB, then sleep for a second. But the first question I have to ask is: why? There are a couple of likely answers:
- You don't want to create work faster than it can be done; or
- You don't want to create too great a load on the system.
My suggestion is not to control it at the read level. That's kind of messy and inaccurate. Instead control it at the work end. Java has lots of great concurrency tools to deal with this. There are a few alternative ways of doing this.
I tend to like using a producer-consumer pattern for solving this kind of problem. It gives you good options for monitoring progress (by having a reporting thread, for example) and it can be a really clean solution.
Something like an ArrayBlockingQueue can provide the kind of throttling needed for both (1) and (2). With a limited capacity, the reader will eventually block when the queue is full, so it can't fill up too fast. The workers (consumers) can in turn be limited to a certain pace, which throttles the overall rate and covers (2).
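As a rough illustration of the bounded-queue idea, here is a minimal sketch. The class name BoundedPipeline, the empty-array end marker, and the fixed per-chunk delay standing in for "work" are all assumptions made for the example, not part of the suggestion above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BoundedPipeline {
    // Hypothetical end-of-data marker ("poison pill"), compared by identity.
    private static final byte[] EOF = new byte[0];

    /** Pushes chunks of `in` through a bounded queue; returns the bytes consumed. */
    static long run(InputStream in, int chunkSize, int capacity, long consumerDelayMillis)
            throws InterruptedException {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(capacity);

        Thread producer = new Thread(() -> {
            try {
                try {
                    byte[] buf = new byte[chunkSize];
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        // put() blocks while the queue is full, so the reader
                        // can never get more than `capacity` chunks ahead.
                        queue.put(Arrays.copyOf(buf, n));
                    }
                } catch (IOException e) {
                    e.printStackTrace(); // a real reader would report this properly
                }
                queue.put(EOF);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        long total = 0;
        for (byte[] chunk = queue.take(); chunk != EOF; chunk = queue.take()) {
            total += chunk.length;             // stand-in for the real work
            Thread.sleep(consumerDelayMillis); // the consumer's pace sets the overall rate
        }
        producer.join();
        return total;
    }
}
```

With a chunk size of 10 KB and a consumer delay of 1000 ms, the reader settles at roughly 10 KB/sec once the queue has filled, and memory use stays bounded by capacity times chunk size.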
It depends a little on whether you mean "don't exceed a certain rate" or "stay close to a certain rate."
If you mean "don't exceed", you can guarantee that with a simple loop:
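byte[] buffer = new byte[10 * 1024];      // 10K buffer
int n;
while ((n = in.read(buffer)) != -1) {     // in: the InputStream being read
    Thread.sleep(waitMillis);             // pause between reads
    out.write(buffer, 0, n);              // out: wherever the data is going
}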
The amount of time to wait is a simple function of the size of the buffer; if the buffer size is 10K bytes, you want to wait a second between reads.
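The "simple function" is just buffer size divided by the target rate. A small sketch, with the helper name ThrottleMath.sleepMillis made up for illustration:

```java
class ThrottleMath {
    /**
     * Milliseconds to wait per buffer so that the average rate stays at or
     * below bytesPerSecond. Assumes bytesPerSecond > 0.
     */
    static long sleepMillis(int bufferBytes, long bytesPerSecond) {
        // Integer ceiling division: rounding up errs on the slow side,
        // so rounding can never push the rate above the limit.
        return (bufferBytes * 1000L + bytesPerSecond - 1) / bytesPerSecond;
    }
}
```

For a 10 KB/sec limit, a 10 KB buffer gives 1000 ms and a 4 KB buffer gives 400 ms. Note that this ignores the time the read itself takes, so the actual rate ends up somewhat below the limit.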
If you want to get closer than that, you probably need to use a timer.
- create a Runnable to do the reading
- create a Timer with a TimerTask to do the reading
- schedule the TimerTask n times a second.
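The timer steps above could be sketched roughly as follows. The class name TimedReader and the latch used to wait for end of file are assumptions for the example. It uses Timer.schedule (fixed delay) rather than scheduleAtFixedRate, since fixed-rate scheduling runs late tasks back to back to catch up, which could briefly exceed the target rate.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

class TimedReader {
    /** Copies in to out, reading at most chunkBytes every periodMillis; returns at EOF. */
    static void copy(InputStream in, OutputStream out, int chunkBytes, long periodMillis)
            throws IOException, InterruptedException {
        Timer timer = new Timer("throttled-reader", true); // daemon timer thread
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<IOException> failure = new AtomicReference<>();
        byte[] buf = new byte[chunkBytes];

        TimerTask task = new TimerTask() {
            @Override
            public void run() {
                try {
                    int n = in.read(buf); // one bounded read per tick
                    if (n == -1) {
                        finish();
                    } else {
                        out.write(buf, 0, n);
                    }
                } catch (IOException e) {
                    failure.set(e);
                    finish();
                }
            }

            private void finish() {
                cancel();         // no further executions of this task
                done.countDown(); // release the waiting caller
            }
        };

        timer.schedule(task, 0, periodMillis); // fixed-delay execution
        try {
            done.await();
        } finally {
            timer.cancel();
        }
        if (failure.get() != null) {
            throw failure.get();
        }
    }
}
```

Reading 1 KB every 100 ms, i.e. copy(in, out, 1024, 100), corresponds to "n = 10 times a second" and caps the rate at about 10 KB/sec.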
If you're concerned about the speed at which you're passing the data on to something else, instead of controlling the read, put the data into a data structure like a queue or circular buffer, and control the other end; send data periodically. You need to be careful with that, though, depending on the data set size and such, because you can run into memory limitations if the reader is very much faster than the writer.
If you have used Java I/O then you should be familiar with decorating streams. I suggest an InputStream subclass that takes another InputStream and throttles the flow rate. (You could subclass FileInputStream but that approach is highly error-prone and inflexible.)
Your exact implementation will depend upon your exact requirements. Generally you will want to note the time your last read returned (System.nanoTime). On the current read, after the underlying read, wait until sufficient time has passed for the amount of data transferred. A more sophisticated implementation may buffer and return (almost) immediately with only as much data as the rate dictates (but be careful: you should only return a read length of 0 if the buffer is of zero length).
- while !EOF
- store System.currentTimeMillis() + 1000 (1 sec) in a long variable
- read a 10K buffer
- check if stored time has passed
- if it isn't, Thread.sleep() for stored time - current time
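Those steps translate fairly directly into a loop like the one below. This is only a sketch: the class name DeadlineCopier, the OutputStream destination, and the configurable chunk size and period (instead of a fixed 10K and 1 second) are assumptions for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class DeadlineCopier {
    /** Copies in to out, at most one chunk per period; returns the bytes copied. */
    static long copy(InputStream in, OutputStream out, int chunkBytes, long periodMillis)
            throws IOException, InterruptedException {
        byte[] buf = new byte[chunkBytes];
        long total = 0;
        while (true) {
            // Earliest time at which the next read may start.
            long deadline = System.currentTimeMillis() + periodMillis;
            int n = in.read(buf);
            if (n == -1) {
                break; // EOF
            }
            out.write(buf, 0, n);
            total += n;
            // Sleep only for whatever part of the period the I/O did not use.
            long remaining = deadline - System.currentTimeMillis();
            if (remaining > 0) {
                Thread.sleep(remaining);
            }
        }
        return total;
    }
}
```

Compared with sleeping for a fixed second after every read, this accounts for the time spent in the read itself, so the rate stays closer to the target instead of falling below it.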
Creating ThrottledInputStream that takes another InputStream as suggested would be a nice solution.
A simple solution is to create a ThrottledInputStream.
It should be used like this:
final InputStream slowIS = new ThrottledInputStream(new BufferedInputStream(new FileInputStream("c:\\file.txt"),8000),300);
300 is the number of kilobytes per second. 8000 is the block size for the BufferedInputStream.
This should of course be generalized by implementing read(byte b[], int off, int len), which will spare you a ton of System.currentTimeMillis() calls. As written, System.currentTimeMillis() is called once for each byte read, which can cause a bit of overhead. It should also be possible to store the number of bytes that can safely be read without calling System.currentTimeMillis().
Be sure to put a BufferedInputStream in between, otherwise the FileInputStream will be polled in single bytes rather than blocks. This will reduce the CPU load from 10% to almost 0. The cost is that you risk exceeding the data rate by the number of bytes in the block size.
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

public class ThrottledInputStream extends InputStream {
    private static final int BYTES_PER_KILOBYTE = 1024;
    private static final int MILLIS_PER_SECOND = 1000;

    private final InputStream rawStream;
    private final double ratePerMillis; // bytes per millisecond
    private long totalBytesRead;
    private long startTimeMillis;

    public ThrottledInputStream(InputStream rawStream, int kBytesPerSecond) {
        if (kBytesPerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        this.rawStream = rawStream;
        // floating point, so that integer division does not distort the rate
        ratePerMillis = (double) kBytesPerSecond * BYTES_PER_KILOBYTE / MILLIS_PER_SECOND;
    }

    @Override
    public int read() throws IOException {
        if (startTimeMillis == 0) {
            startTimeMillis = System.currentTimeMillis();
        }
        long interval = System.currentTimeMillis() - startTimeMillis;
        // elapsed time at which one more byte is within the rate (+1 because we are reading 1 byte)
        long dueMillis = (long) Math.ceil((totalBytesRead + 1) / ratePerMillis);
        // see if we are too fast..
        if (dueMillis > interval) {
            try {
                Thread.sleep(dueMillis - interval);
            } catch (InterruptedException e) {
                // restore the interrupt flag and let the caller know the read was cut short
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while throttling");
            }
        }
        int b = rawStream.read();
        if (b != -1) {
            totalBytesRead += 1; // do not count end-of-stream as a byte
        }
        return b;
    }

    @Override
    public void close() throws IOException {
        rawStream.close();
    }
}
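The generalization to read(byte b[], int off, int len) mentioned above might look something like the following. This is a sketch under a few assumptions rather than a drop-in replacement: the class name BulkThrottledInputStream is made up, it extends FilterInputStream, it takes the rate in bytes per second instead of kilobytes, and it uses System.nanoTime() with a read-then-wait strategy, so there is one clock call per block rather than per byte.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

class BulkThrottledInputStream extends FilterInputStream {
    private final long bytesPerSecond;
    private long totalBytesRead;
    private long startNanos;
    private boolean started;

    BulkThrottledInputStream(InputStream in, long bytesPerSecond) {
        super(in);
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            throttle(1);
        }
        return b;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = super.read(b, off, len); // delegate the whole block
        if (n > 0) {
            throttle(n);                 // then wait once for the block
        }
        return n;
    }

    /** Sleeps until the bytes read so far fit within the configured rate. */
    private void throttle(int justRead) throws IOException {
        long now = System.nanoTime();
        if (!started) {
            started = true;
            startNanos = now;
        }
        totalBytesRead += justRead;
        // Elapsed time by which totalBytesRead is allowed to have been delivered.
        long dueNanos = (long) (totalBytesRead * 1e9 / bytesPerSecond);
        long waitNanos = dueNanos - (now - startNanos);
        if (waitNanos > 0) {
            try {
                Thread.sleep(waitNanos / 1_000_000, (int) (waitNanos % 1_000_000));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while throttling");
            }
        }
    }
}
```

Because the wait happens after the read returns, the caller never receives data ahead of the schedule, even with large blocks; the price is that a large block makes each call pause for correspondingly longer.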