OK, so I'm reading a binary file into a char array I've allocated with malloc. (By the way, the code here isn't the actual code; I just wrote it on the spot to demonstrate, so any mistakes here are probably not mistakes in the actual program.) This method reads at about 50 million bytes per second.

main

char *buffer = (char*)malloc(file_length_in_bytes*sizeof(char));
memset(buffer, 0, file_length_in_bytes*sizeof(char));
//start time here
read_whole_buffer(buffer);   // defined below
//end time here
free(buffer);

read_whole_buffer

void read_whole_buffer(char* buffer)
{
  //file already opened
  fseek(_file_pointer, 0, SEEK_SET);                      // rewind to the start
  size_t a = sizeof(buffer[0]);                           // element size (1 for char)
  fread(buffer, a, file_length_in_bytes, _file_pointer);  // size * count = whole file
}

I've written something similar in Managed C++ that uses FileStream, I believe, and its ReadByte() function to read the entire file byte by byte, and it also reads at around 50 million bytes per second.

Also, I have a SATA and an IDE drive in my computer, and I've tried loading the file off both; it doesn't make any difference at all (which is weird, because I was under the assumption that SATA reads much faster than IDE).

Question

Maybe you can all understand why this doesn't make any sense to me. As far as I know, it should be much faster to fread a whole file into an array than to read it byte by byte. On top of that, through testing I've discovered that Managed C++ is slower (though that's only noticeable if you are benchmarking your code and you require speed).

SO

Why in the world am I reading at the same speed with both applications? Also, is 50 million bytes per second from a file into an array quick?

Maybe my motherboard is bottlenecking me? That just doesn't seem to make much sense either.

Is there maybe a faster way to read a file into an array?

Thanks.

My 'script timer'

Records start and end times with millisecond resolution... most importantly, it's not really a timer, it just records two timestamps.

#pragma once
#ifndef __Script_Timer__
#define __Script_Timer__

#include <sys/timeb.h>

extern "C"
{
    struct Script_Timer
    {
        unsigned long milliseconds;
        unsigned long seconds;
        struct timeb start_t;
        struct timeb end_t;
    };

    // Record the end timestamp and compute the elapsed seconds/milliseconds.
    void End_ST(Script_Timer *This)
    {
        ftime(&This->end_t);
        This->seconds = This->end_t.time - This->start_t.time;
        This->milliseconds = (This->seconds * 1000) + (This->end_t.millitm - This->start_t.millitm);
    }

    // Record the start timestamp.
    void Start_ST(Script_Timer *This)
    {
        ftime(&This->start_t);
    }
}
#endif
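
For reference, this is roughly how the timer wraps the read; treat it as a sketch rather than the exact benchmark code (the header file name Script_Timer.h, the unsigned long type of file_length_in_bytes, and the helper name timed_read are assumptions):

#include <cstdio>
#include "Script_Timer.h"                    // assumed name for the header above

extern unsigned long file_length_in_bytes;   // set when the file is opened (as in the question)
void read_whole_buffer(char *buffer);        // from the snippet above

void timed_read(char *buffer)
{
    Script_Timer st;
    Start_ST(&st);                           // record the start timestamp
    read_whole_buffer(buffer);               // the fread() being measured
    End_ST(&st);                             // record the end timestamp, compute elapsed ms
    double mb_per_sec = st.milliseconds
        ? (file_length_in_bytes / 1000.0) / st.milliseconds
        : 0.0;                               // avoid dividing by zero for very fast reads
    printf("read %lu bytes in %lu ms (%.1f MB/s)\n",
           file_length_in_bytes, st.milliseconds, mb_per_sec);
}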

Read buffer thing

char face = 0;
char comp = 0;
char nutz = 0;
for(int i = 0; i < (int)(_length*sizeof(char)); ++i)
{
    face = buffer[i];                  // touch every byte of the buffer
    if(face == comp)
        nutz = (face + comp)/(i + 1);  // throwaway arithmetic; i+1 avoids dividing by zero at i==0
    comp++;
}
A: 

I've done some tests on this, and beyond a certain point the benefit of a larger buffer tails off. There is usually an optimum buffer size you can find with a bit of trial and error.

Note also that fread() (or, more specifically, the C or C++ I/O library) will probably be doing its own buffering. If your system supports it, a plain read() may (or may not) be a bit faster.
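
For concreteness, a minimal sketch of reading in fixed-size chunks, with the chunk size as the knob to tune by trial and error (the helper name read_in_chunks is made up for illustration):

#include <stdio.h>
#include <stddef.h>

#define CHUNK 65536   // try 4096, 65536, 1 MB, ... and measure

// Fill 'buffer' from 'fp' in CHUNK-sized pieces; returns the number of bytes actually read.
size_t read_in_chunks(FILE *fp, char *buffer, size_t total)
{
    size_t done = 0;
    while (done < total) {
        size_t want = total - done;
        if (want > CHUNK)
            want = CHUNK;
        size_t got = fread(buffer + done, 1, want, fp);
        if (got == 0)          // EOF or read error
            break;
        done += got;
    }
    return done;
}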

anon
So in a sense I'd have to split the buffer into smaller buffers, such as buffer[5][4096]?
kelton52
I think you are right about the buffering, but is there a way to read a file without the read function buffering?
kelton52
As I said, try using the read() function rather than fread(), or try using other OS features, such as the Win32 ReadFile() API.
anon
I did see a slight improvement of about 1-5 million characters per second.
kelton52
+1  A: 

Transfers to or from main memory run at speeds of gigabytes per second; inside the CPU, data flows even faster. It is not surprising that, whatever you do on the software side, the hard drive itself remains the bottleneck.

Here are some numbers from my system, using PerformanceTest 7.0:

  • hard disk: Samsung HD103SI 5400 rpm: sequential read/write at 80 MB/s
  • memory: 3 * 2 GB at 400 MHz DDR3: read/write around 2.2 GB/s

So if your system is a bit older than mine, a hard drive speed of 50 MB/s is not surprising. The connection to the drive (IDE/SATA) is not all that relevant; it's mainly about the number of bits passing the drive heads per second, purely a hardware thing.

Another thing to keep in mind is your OS's filesystem cache. It could be that the second time round, the hard drive isn't accessed at all.

The 180 MB/s memory read speed that you mention in your comment does seem a bit on the low side, but that may well depend on the exact code. Your CPU's caches come into play here. Maybe you could post the code you used to measure this?

Thomas
Even though I have one SATA drive and one IDE drive? I should notice SOME difference if the hard drive were the bottleneck, correct?
kelton52
Also, the fastest I've been able to read bytes from an array was about 180 million bytes per second, far from gigabytes. I feel like I'm missing a lot of speed I could be using, and I need all that I can find. Any suggestions for that?
kelton52
Too much to say for a comment. I'll edit my answer, hang on...
Thomas
I'm pretty sure you're right about the hard drives being the bottleneck now. Thank you. And I rewrote a 'read from buffer' and it doesn't even seem to register on my timer. The biggest file I opened was 700 megs. Didn't even see a millisecond tick. Doesn't seem right. I'll post my code for that also.
kelton52
The bottleneck is not the hard drive, but the communication channel between the program and the hard drive. There are USB, SATA and other protocols to set up, as well as data and address bus sharing on the PC. Also, if the data file is fragmented, the drive will have to make more than one access.
Thomas Matthews
+1  A: 

The FILE* API uses buffered streams, so even if you read byte by byte, the API internally reads buffer by buffer. So your comparison will not show a big difference.

The low-level I/O API (open, read, write, close) is unbuffered, so using that will make a difference.

It may also be faster for you if you do not need the automatic buffering of the FILE* API!
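
For example, a minimal sketch of a whole-file read with the low-level API; this assumes a POSIX-style environment (with MSVC on Windows the rough equivalents are _open/_read/_close from <io.h>), and error handling is kept minimal:

#include <fcntl.h>     // open, O_RDONLY
#include <unistd.h>    // read, close, ssize_t

// Read up to 'length' bytes of 'path' into 'buffer' without stdio buffering.
// Returns the number of bytes read, or -1 if the file could not be opened.
long read_unbuffered(const char *path, char *buffer, long length)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    long done = 0;
    while (done < length) {
        ssize_t got = read(fd, buffer + done, length - done);
        if (got <= 0)          // 0 = end of file, negative = error
            break;
        done += got;
    }
    close(fd);
    return done;
}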

frunsi
Yeah, I tested that, and it gave me an extra 1-5 million bytes per second... relatively not much of a gain. The biggest gain I've had so far is about 10-15 MB/s, by slicing my buffer into smaller pieces, around 4096 bytes apiece. I also squared the size and noticed no definite change.
kelton52
4k is a good buffer size for various reasons. Also, you should use gettimeofday() for your timer on Unix/Linux and QueryPerformanceCounter on Windows. An upvote would be nice :)
frunsi
I explored both QueryPerformanceCounter and gettimeofday, and neither would fit my specific needs. As far as I know the solution I came up with for the timing is actually pretty sound and reliable.
kelton52
Well, it may be sufficient here, but the resolution of ftime() is about 10 ms on Windows (even though it looks like it should be 1 ms).
frunsi
Well, does gettimeofday() catch milliseconds? I've only run across examples for getting seconds.
kelton52
gettimeofday() should have _microsecond_ accuracy, even in practice. AFAIK it somewhat depends on CPU speed, but with a CPU of a few hundred MHz you already get microsecond accuracy. In any case, it's all much more accurate than ftime()! ;)
frunsi
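
To make the suggestion in this comment thread concrete, here is a minimal sketch of both higher-resolution timers, returning elapsed milliseconds (this is an illustration, not code from the thread; sample with QueryPerformanceCounter(&start) or gettimeofday(&start, NULL) before the read, sample again after it, and pass both to elapsed_ms()):

#ifdef _WIN32
#include <windows.h>

// Elapsed milliseconds between two QueryPerformanceCounter samples.
double elapsed_ms(LARGE_INTEGER start, LARGE_INTEGER end)
{
    LARGE_INTEGER freq;
    QueryPerformanceFrequency(&freq);   // ticks per second
    return (end.QuadPart - start.QuadPart) * 1000.0 / freq.QuadPart;
}
#else
#include <sys/time.h>

// Elapsed milliseconds between two gettimeofday() samples.
double elapsed_ms(struct timeval start, struct timeval end)
{
    return (end.tv_sec - start.tv_sec) * 1000.0
         + (end.tv_usec - start.tv_usec) / 1000.0;
}
#endif
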
A: 

The issue with block reading is the overhead between your program and the hard drive. Most of this overhead exists to maintain portability and ease of development, although some of it is there to manage multiple tasks or programs running concurrently.

Managed C++, or C#, has another layer between your program and the hard disk. This, I believe, is called the CLI. The CLI is written to be language-generic so that programs written in other languages (such as Visual Basic) can easily share data and resources. Supposedly, this should also shrink the size of the executable, since the OS now contains more of the code.

The read operation can be further optimized by diving deeper into either the C-style code or the C++ streams. If you really need performance, you will have to sacrifice portability and use platform-specific technologies. If you can tell the I/O card to dump a quantity of bytes directly into your array, you're doing great. Some computers can delegate I/O operations away from the main processor; however, there may be data bus or address bus sharing issues that slow things down.

I optimized a utility that processes 1 GB data files from 1 hour down to 2 minutes, primarily by reading data into huge buffers (5 MB to 10 MB). The performance bottleneck is in the I/O system. For example, when the data file was on the network, the processing time often doubled.
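
As a rough illustration of the platform-specific route (and of the Win32 ReadFile() API mentioned in an earlier comment), here is a sketch of a single large read with a sequential-access hint; the helper name read_win32 is made up and error handling is kept minimal:

#include <windows.h>

// Read up to 'length' bytes of 'path' into 'buffer' with one large ReadFile() call,
// hinting to the cache manager that the file will be read sequentially.
DWORD read_win32(const char *path, char *buffer, DWORD length)
{
    HANDLE h = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return 0;
    DWORD got = 0;
    ReadFile(h, buffer, length, &got, NULL);   // one big read instead of many small ones
    CloseHandle(h);
    return got;
}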

Thomas Matthews
Yeah, but my current read can read a gigabyte in 16.6 seconds. It also sounds a lot simpler than what you're doing.
kelton52
I can also read through >45 MB worth of source code and remove the comments (line and block) in ~1-10 ms (I don't know exactly, because it doesn't register with my timer, and my timer, as stated earlier, has a 10 ms margin of error)... When I actually get a reading, though, I'll translate it into how many gigabytes I can parse.
kelton52
Done. I can remove all the comments and copy the buffer to a new buffer for a 1 GB file in 8.31 seconds.
kelton52
Got it down to 5.75 seconds.
kelton52
A: 

Hello kelton,

Could you publish the piece of code you used to fill your buffer for a 1 GB file in 5.75 s?

Thanks in advance.

Sasfepu
You can check it out here: http://blog.skylabsonline.com/?p=53
kelton52