Hi developers!
I am working with a very large dataset. Essentially, I will be processing millions of records and storing a value from each one in a data structure.
Each time I store a value, I must first check whether it is already in the data structure. If it is, I must update the existing record (or remove it and add it back) so that its count goes up by one.
There are repeats within the dataset, and I don't want to pick a poor data structure and end up with an O(n) lookup on every insert. I'd like to be able to run this overnight and have it finished when I come in the next morning!
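To make the operation concrete, here is a rough sketch of what I mean, written in Python only for illustration. The function name `count_values` and the use of a built-in dict (a hash table, so lookups should be O(1) on average) are my own assumptions, not something I have settled on, and I have not tried this at the scale of millions of records.

```python
def count_values(values):
    """Return a dict mapping each distinct value to how many times it appears."""
    counts = {}  # hash table: average O(1) lookup and update per value
    for value in values:
        # get() returns 0 if the value has not been seen yet, so the
        # "check, then add or update" step is a single dict operation.
        counts[value] = counts.get(value, 0) + 1
    return counts


if __name__ == "__main__":
    # Tiny made-up sample, purely to show the shape of the result.
    sample = ["apple", "pear", "apple", "fig", "apple"]
    print(count_values(sample))
```

This assumes the values are hashable and that the set of distinct values fits in memory; if either assumption fails, something different would be needed.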
Any advice?