views:

260

answers:

5

When I run the following code, I get the output 1073741823.

#include <iostream>
#include <vector>
using namespace std;
int main()
{
  vector<int> v;
  cout << v.max_size();
  return 0;
}

However, when I try to resize the vector to 1,000,000,000 with v.resize(1000000000);, the program stops executing. How can I enable the program to allocate the required memory, when it seems that it should be able to?

I am using MinGW on Windows 7. I have 2 GB of RAM. Should it not be possible? If it is not possible, can't I just declare it as an array of integers and get away with it? But even that doesn't work.

Another thing: suppose I were to use a file (which can easily hold that much data). How can I read and write to it at the same time? Using fstream file("file.txt", ios::out | ios::in); doesn't create the file in the first place. But even supposing the file exists, I am unable to read and write simultaneously. What I mean is this: let the contents of the file be 111111. Then if I run:

#include <fstream>
#include <iostream>
using namespace std;
int main()
{
  fstream file("file.txt", ios::in | ios::out);
  char x;
  while (file >> x)
  {
    file << '0';
  }
  return 0;
}

Shouldn't the file's contents now be 101010? Read one character, then overwrite the next one with 0? Or, in case the entire contents were read at once into some buffer, should there not be at least one 0 in the file, i.e. 1111110? But the contents remain unaltered. Please explain. Thank you.

+10  A: 
  1. A 32-bit process can only address a 4 GB address space at any one time. Usually, much of this 4 GB address space is already used to map other things. Your vector would require too much contiguous address space (4 billion bytes), which is not likely to be available.

  2. You should memory map the file. See mmap.

Mehrdad Afshari
This discussion: http://stackoverflow.com/questions/2791330/which-is-better-in-general-map-or-vector-in-c/2791491#2791491 may be germane.
Clifford
Address space on Win32 is limited to 2GiB. Linux should be 3GiB I think.
Vincent Robert
@Vincent: there's also 3GB switch on Windows, although 2GB is the default: http://technet.microsoft.com/en-us/library/bb124810%28EXCHG.65%29.aspx
sbk
Because memory-mapped files rely on virtual memory, the address-range limit still remains.
Clifford
@Clifford: you'll note that the example is processing one byte at a time. I.e. vipersnake005 understands that the solution to "doesn't fit in memory" is "process one chunk at a time". He had problems in applying that principle to a modifying operation. Memory mapped files solve precisely that problem: They map one part of a file into the address space, and modifications to that address space are written back to file.
MSalters
@MSalters: You are correct; you can page sections of a larger file into memory by moving the offset. I think I misunderstood Mehrdad's post; it reads as if point (2) is the solution to point (1), since the OP did not number his distinct questions in that way.
Clifford
Note that the suggestion to "see mmap" would only apply to POSIX. Since vipersnake005 is using MinGW, he is implicitly using Win32. Therefore either the Win32 API must be used directly http://msdn.microsoft.com/en-us/library/ms810613.aspx or Boost http://www.boost.org/doc/libs/1_42_0/libs/iostreams/doc/classes/mapped_file.html.
Clifford
@Clifford: You're right. Direct usage of Windows API is, of course, preferred if you are going to only target Windows. I thought the OP is looking for a more portable approach by not using VC compiler.
Mehrdad Afshari
@Mehrdad Afshari: In that case the Boost library would suit. Under the hood Boost on Windows uses Windows API memory-mapped files for both this and shared memory. mmap would never have achieved portability since it is a POSIX API and Windows does not support POSIX. It may have been relevant had he perhaps been using Cygwin. MinGW uses Microsoft's library not GNU's libc.
Clifford
A: 

However when I try to resize the vector to 1,000,000,000, by v.resize(1000000000); the program stops executing. How can I enable the program to allocate the required memory, when it seems that it should be able to?

It can depend on the C++ standard library implementation, but resizing often causes the application to reserve considerably more memory than you ask for.

Alex Reynolds
A: 

An integer is four bytes, so 1,000,000,000 integers will take up around 3.72GB.

lunixbochs
No, it will take (at least) exactly 4 **GB** or roundabout 3.72 **GiB**. See [binary prefix](http://en.wikipedia.org/wiki/Binary_prefix) for details.
FredOverflow
@FredOverflow: Somewhat pedantic! The only place I have ever seen that terminology in frequent use is on Wikipedia. In that context (where the reader may not be a computer expert), it serves a useful purpose of disambiguating, but here I think we all knew what he meant entirely unambiguously.
Clifford
A: 

You are asking to allocate one billion integers in contiguous sequence. Apart from the difficulty of finding such a huge contiguous space you simply don't have that space at all. Recall that an integer on a common 32-bit system occupies 32 bits, or 4 bytes. Multiply that by one billion and you go far beyond the 2GB you have. In addition, a std::vector is allowed to reserve more than you ask for.

As for your second question, if you both read and write at the same time with the same fstream object, make sure you seekg() and seekp() before your read and write.

wilhelmtell
Makes sense, it shouldn't be able to allocate so much. But then why does `v.max_size()` allow me to allocate more than that?
shreedhar
And regarding seekg and seekp, I was under the impression that there is only one file pointer for both read and write? If there are two, can I write into the file and then read the entire contents without using `seekg(0)`?
shreedhar
`vector::max_size()` doesn't "allow" you to allocate that much. It only tells you what limits your standard library implementation imposes. Whether your hardware can reach this limit is a different story.
wilhelmtell
If you write to a file and then wish to read with the same object, then you need to seek to where you wish to read from. Similarly, if you wish to write into a file you read from using the same object, then you must seek to the position you wish to write to. Both writing and reading automatically change the get and put pointers, so you must adjust them yourself. I personally find it easier to just use an `ifstream` object for reading a file and an `ofstream` for writing to a file whenever I can.
wilhelmtell
+1  A: 

The maximum the STL implementation will cope with is one thing; the maximum amount of memory available from the OS is something else; you are hitting the latter.

You might for example be able to create a vector of that many char elements. Either way, don't expect blistering performance unless you physically have that much memory (plus whatever the OS and anything else running needs); accessing such a vector will no doubt result in much disk thrashing as the system pages memory in and out from disk.

Also, a processor with a 32-bit address space (or one running a 32-bit OS, regardless of physical address space) can only address 4 GB (physical or virtual), so there is an architectural limit. Moreover, some OSes limit the user space; for example, the user space on Win32 is fixed at 2 GB. Various versions of Win64 artificially limit the user space in order to allow Microsoft to charge different prices, so using Win64 is no guarantee of sufficient address space.

Clifford