i am using C++
say i want to store 40 records( usrnames) i will simply use array
if i want to store 40000 records ( usrnames) . i will search it sequentially
which data structure should i use?
any thoughts?
i am using C++
say i want to store 40 records( usrnames) i will simply use array
if i want to store 40000 records ( usrnames) . i will search it sequentially
which data structure should i use?
any thoughts?
You need to specify what the insertion and removal requirements are. Do things need to be removed and inserted at random points in the sequence?
Also, why the requirement to search sequentially? Are you doing searches that aren't suitable for a hash table lookup?
At the moment I'd suggest a deque
or a list
. Often it's best to choose a container with the interface that makes for the simplest implementation for your algorithm and then only change the choice if the performance is inadequate and an alternative provides the necessary speedup.
A vector
has two principle advantages, there is no per-object memory overhead, although vectors will over-allocate to prevent frequent copying and objects are stored contiguously so sequential access tends to be fast. These are also its disadvantages. Growing vectors require reallocation and copying, and insertion and removal from anywhere other than the end of the vector also require copying. Contiguous storage can produce problems for vectors with large numbers of objects or large objects as the contiguous storage requirements can be hard to satisfy even with only mild memory fragmentation.
A list
doesn't require contigous storage but list nodes usually have a per-object overhead of two pointers (in most implementation). This can be significant in list of very small objects (e.g. in a list of pointers, each node is 3x the size of the data item). Insertion and removal from the middle of a list is very cheap though and list nodes never need to me moved in memory once created.
A deque
uses chunked storage, so it has a low per-object overhead similar to a vector, but doesn't require contiguous storage over the whole container so doesn't have the same problem with fragmented memory spaces. It is often a very good choice for collections and is often overlooked.
std::vector and std::list seem good for this task. You can use an array if you know the maximum number of records beforehands.
If you need only sequentially search and storage, then list is the proper container.
Also, vector wouldn't be a bad choice.
As a rule of thumb, prefer vector
to list
or, diety forbid, C-style array.
After the vector is filled, make sure it is properly ordered using the sort
algorithm. You can then search for a particular record using either find
, binary_search
or lower_bound
. (You don't need to sort to use find
.)
Seriously unless you are in a resource constrained environment (embedded platform, phone, or other). Use a std::map
, save the effort of doing sorting or searching and let the container take care of everything. This will possibly be a sorted tree structure, probably balance (e.g. Red-Black), which means you will get good searching performance. Unless the size of you data is close to the size of one or two pointers, the memory overhead of whatever data structure you pick is negligable. You Graphics Card probably has more memory that you are going to use up for the data you are think about.
As others said there is very little good reason to use vanilla array, if you don't want to use a map
use std::vector
or std::list
depending on whether you need insert/delete data (=>list) or not (=>vector)
Also consider if you really need all that data in memory, how about putting it on disk via sqlite. Or even use sqlite for in memory access. It all depends on what you need to do with your data.