I have a fairly large hash (some 10M keys) and I would like to delete some elements from it.
I usually don't like to use delete or splice, and I wind up copying what I want instead of deleting what I don't. But this time, since the hash is really large, I think I'd like to delete directly from it.
So I'm doing something like this:
    foreach my $key (keys %hash) {
        if (should_be_deleted($key)) {
            delete($hash{$key});
        }
    }
And it seems to work OK. But what if I'd like to delete some elements even before iterating over them? I'll explain by example:
    foreach my $key (keys %hash) {
        if (should_be_deleted($key)) {
            delete($hash{$key});
            # if $key should be deleted, so should "$key.a", "kkk.$key" and some other keys
            # that I already know how to calculate. I would like to delete them now...
        }
    }
I thought of some possible solutions: checking whether a key still exists as the first step in the loop, or first looping to build a list of keys to delete (without actually deleting them) and then deleting them in a second loop.
What are your thoughts regarding this?
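To make the two options concrete, here is a rough sketch of what I have in mind. The body of should_be_deleted() and the whole of related_keys() are hypothetical stand-ins (my real rules are more involved), and the hash is passed by reference only to keep the sketch self-contained.

```perl
use strict;
use warnings;

# Hypothetical stand-ins: should_be_deleted() represents my real predicate,
# related_keys() returns the extra keys I can calculate from $key.
sub should_be_deleted { my ($key) = @_; return $key =~ /^del/ ? 1 : 0 }
sub related_keys      { my ($key) = @_; return ("$key.a", "kkk.$key") }

# Option 1: check that the key still exists before doing anything else.
# keys() builds its list up front, so deleting inside the loop does not
# disturb the iteration; it only leaves stale keys in the list.
sub delete_with_exists_check {
    my ($hash) = @_;
    foreach my $key (keys %$hash) {
        next unless exists $hash->{$key};       # removed earlier as a related key
        next unless should_be_deleted($key);
        delete @{$hash}{ $key, related_keys($key) };   # hash-slice delete
    }
    return;
}

# Option 2: first collect everything to delete, then delete in a second step.
sub delete_in_two_passes {
    my ($hash) = @_;
    my %doomed;
    foreach my $key (keys %$hash) {
        next unless should_be_deleted($key);
        $doomed{$_} = 1 for $key, related_keys($key);
    }
    delete @{$hash}{ keys %doomed };
    return;
}
```

Both would be called as delete_with_exists_check(\%hash) or delete_in_two_passes(\%hash). Deleting a key that is not in the hash is harmless, so related_keys() may return keys that do not exist.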
UPDATE
It seems that the double-pass approach has a consensus. However, it is quite inefficient in the sense that during the first pass I re-check keys that were already marked for deletion. This is kind of recursive: not only do I check the key again, I also recalculate the other keys that should be deleted, although they were already calculated from the original key.
Perhaps I need a more dynamic data structure for iterating over the keys, one that is updated as keys get deleted?
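One variation I am considering is to consult the set of already-marked keys during the first pass, so that a key marked through a related key is neither re-checked nor expanded again. This is only a sketch under the same assumptions as before: related_keys() is a hypothetical helper and the body of should_be_deleted() is a placeholder.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for my real logic.
sub should_be_deleted { my ($key) = @_; return $key =~ /^del/ ? 1 : 0 }
sub related_keys      { my ($key) = @_; return ("$key.a", "kkk.$key") }

# Two passes, but the first pass skips keys that are already marked,
# so the predicate and the related-key calculation run at most once per key.
sub delete_skipping_marked {
    my ($hash) = @_;
    my %doomed;
    foreach my $key (keys %$hash) {
        next if $doomed{$key};                  # already marked via another key
        next unless should_be_deleted($key);
        $doomed{$_} = 1 for $key, related_keys($key);
    }
    delete @{$hash}{ keys %doomed };            # second pass: one slice delete
    return;
}
```

How much work this saves depends on the order in which keys() happens to return the keys, since a related key that is visited before its original key is still checked. It also costs extra memory for %doomed, on top of the key list that keys() already builds.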