I know that Python dicts will "leak" when items are removed (because the item's slot will be overwritten with the magic "removed" value)… But will the set class behave the same way? Is it safe to keep a set around, adding and removing stuff from it over time?
Edit: Alright, I've tried it out, and here's what I found:
    >>> import gc
    >>> gc.collect()
    0
    >>> nums = range(1000000)
    >>> gc.collect()
    0
    ### rsize: 20 megs
    ### A baseline measurement
    >>> s = set(nums)
    >>> gc.collect()
    0
    ### rsize: 36 megs
    >>> for n in nums: s.remove(n)
    >>> gc.collect()
    0
    ### rsize: 36 megs
    ### Memory usage doesn't drop after removing every item from the set…
    >>> s = None
    >>> gc.collect()
    0
    ### rsize: 20 megs
    ### … but nulling the reference to the set *does* free the memory.
    >>> s = set(nums)
    >>> for n in nums: s.remove(n)
    >>> for n in nums: s.add(n)
    >>> gc.collect()
    0
    ### rsize: 36 megs
    ### Removing then re-adding keys uses a constant amount of memory…
    >>> for n in nums: s.remove(n)
    >>> for n in nums: s.add(n+1000000)
    >>> gc.collect()
    0
    ### rsize: 47 megs
    ### … but adding new keys uses more memory.
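For anyone who wants to repeat this without watching rsize, here is a rough sketch using sys.getsizeof. It assumes CPython, and getsizeof only reports the set's own hash table, not the objects stored in it, so the numbers will not match process memory. The variable names are mine, not from the session above.

```python
import sys

nums = range(100_000)

# Build a large set and record the size of its internal table.
s = set(nums)
size_full = sys.getsizeof(s)

# Remove all but the first ten items. On CPython the table is not
# shrunk by removals, so the reported size should stay the same.
for n in nums:
    if n >= 10:
        s.remove(n)
size_after_remove = sys.getsizeof(s)

# Copying into a fresh set allocates a table sized for the items
# that are actually left.
compact = set(s)
size_compact = sys.getsizeof(compact)

# clear() drops the big table instead of leaving it in place.
s.clear()
size_after_clear = sys.getsizeof(s)

print("full:        ", size_full)
print("after remove:", size_after_remove)
print("compact copy:", size_compact)
print("after clear: ", size_after_clear)
```

If this behaves the way I expect, a long-lived set that has shrunk a lot can be compacted with something like s = set(s), at the cost of one copy.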