I was surprised that sys.getsizeof(10000*[x])
is 40036 regardless of x, whether x is 0, "a", 1000*"a" or {}.
Is there a deep_getsizeof
that properly accounts for elements that share memory?
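To make the observation reproducible, here is a minimal sketch; `shallow_sizes` is a name made up for this illustration. It only shows that `sys.getsizeof` on a list counts the list object and its pointer slots, not the elements. The exact number depends on the build (the 40036 above is consistent with 4-byte pointers), so no particular value is assumed.

```python
import sys

def shallow_sizes(n=10000):
    """Return sys.getsizeof(n * [x]) for several very different x."""
    samples = [0, "a", 1000 * "a", {}]
    return [sys.getsizeof(n * [x]) for x in samples]

sizes = shallow_sizes()
# On CPython the four numbers are identical: only the list header
# and n pointer slots are measured, never the objects pointed to.
print(sizes)
```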
(The question came up while looking at in-memory database tables that map
range(1000000) -> province names: should that be a list or a dict?)
(Python is 2.6.4 on a PowerPC Mac.)
Added: 10000*["Mississippi"] is 10000 pointers to one "Mississippi", as several people have pointed out. Try this:
nstates = [AlabamatoWyoming() for j in xrange(N)]
where AlabamatoWyoming() returns one of the strings "Alabama" .. "Wyoming".
What is deep_getsizeof(nstates)?
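One cheap way to see how much sharing a list like nstates actually has is to count distinct objects by identity. The sketch below is only an illustration: `count_distinct_objects` is a hypothetical helper, and it assumes the argument is a real sequence (such as a list) that keeps its elements alive, since `id()` values are only unique among objects that exist at the same time.

```python
def count_distinct_objects(seq):
    """Number of separate objects referenced by seq, judged by identity."""
    return len(set(id(item) for item in seq))

shared = 5 * ["Mississippi"]        # five references to one string object
separate = [[] for _ in range(5)]   # five separate (empty) list objects

print(count_distinct_objects(shared))
print(count_distinct_objects(separate))
```

Applied to nstates, a result far below N would mean the strings are shared; a result equal to N would mean every entry is its own object.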
How can we tell?
- a proper deep_getsizeof: difficult, roughly a gc tracer
- estimate from total VM usage
- inside knowledge of the Python implementation
- guess.
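As a rough take on the first option, here is a minimal sketch of a deep_getsizeof, not a definitive implementation. It assumes CPython's `sys.getsizeof`, descends only into the built-in containers dict, list, tuple, set and frozenset, and counts each object once by its `id()`, so shared elements are not counted twice. It ignores instance attributes, `__slots__` and allocator overhead, and it recurses, so very deeply nested data could hit the recursion limit.

```python
import sys

def deep_getsizeof(obj, seen=None):
    """Approximate size of obj plus everything it references,
    counting each distinct object (by id) only once."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0                      # already counted: shared or cyclic
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_getsizeof(key, seen)
            size += deep_getsizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_getsizeof(item, seen)
    return size

s = 1000 * "a"
shared = 100 * [s]                    # 100 pointers to one string
print(deep_getsizeof(shared) - sys.getsizeof(shared))
```

For the shared list, the difference printed is the size of the one string, counted once rather than 100 times.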
Added 25jan: see also when-does-python-allocate-new-memory-for-identical-strings