Despite reading this article, I'm still confused about how data is represented in memory when symbols are used. If two symbols with the same name, held in different objects, exist in the same memory location, how can they contain different values? I'd have expected the same memory location to contain the same value. To quote from the link:

Unlike strings, symbols of the same name are initialized and exist in memory only once during a session of ruby

I just don't understand how it manages to differentiate the values contained in the same memory location.


EDIT

So let's consider the example:

patient1 = { :ruby => "red" }
patient2 = { :ruby => "programming" }

patient1.each_key {|key| puts key.object_id.to_s}
3918094
patient2.each_key {|key| puts key.object_id.to_s}
3918094

patient1 and patient2 are both hashes, that's fine. :ruby however is a symbol. If we were to output the following:

patient1.each_key {|key| puts key.to_s}

Then what will be output? "red", or "programming"?


FURTHER EDIT

I'm still really quite confused. I'm thinking a symbol is a pointer to a value. Let's forget hashes for a second. The questions I have are: can you assign a value to a symbol? Is a symbol just a pointer to a variable with a value in it? If symbols are global, does that mean a symbol always points to one thing?

+9  A: 

Consider this.

x = :sym
y = :sym 
(x.__id__ == y.__id__ ) && ( :sym.__id__ == x.__id__) # => true

x = "string"
y = "string"
(x.__id__ == y.__id__ ) || ( "string".__id__ == x.__id__) # => false

So however you create a symbol object, as long as its contents are the same, it will refer to the same object in memory. This is not a problem because a symbol is an immutable object. Strings, by contrast, are mutable.
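
A rough sketch of that difference, in case it helps. The helper name same_object? is made up for illustration, and String.new is used so the result does not depend on how a given Ruby version treats string literals:

```ruby
# Hypothetical helper: true when both references are the very same object.
def same_object?(a, b)
  a.equal?(b)
end

# Symbols with the same name are one shared, immutable object.
puts same_object?(:sym, :sym)   # true
puts :sym.frozen?               # true

# String.new builds a fresh, mutable String on every call.
s1 = String.new("string")
s2 = String.new("string")
puts same_object?(s1, s2)       # false

s1 << "!"                       # mutates s1 only
puts s1                         # string!
puts s2                         # string
```

Since a symbol can never be changed in place, sharing one copy everywhere is safe; sharing one mutable string would not be.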

Edit: (In response to the comment below)

In the original article, the value is not being stored in a symbol, it is being stored in a hash. Consider this:

hash1 = { "string" => "value"}
hash2 = { "string" => "value"}

This creates 6 objects in memory: 4 string objects and 2 hash objects.

hash1 = { :symbol => "value"}
hash2 = { :symbol => "value"}

This only creates 5 objects in memory: 1 symbol, 2 string objects and 2 hash objects.

anshul
The example in the link, however, shows the symbols containing _different_ values, but the symbol has the same name, and the same memory location. When they're output, they have _different_ values, that's the part I don't get. Surely they should contain the same value?
Kezzer
I've just made an edit to try and explain how I'm confused still. My brain cannot compute ;)
Kezzer
Symbols don't contain values, they _are_ values. Hashes contain values.
Mladen Jablanović
It's the `Hash` (created by {... => ...} in your code) that stores key/value pairs, not the `Symbol`s themselves. The `Symbol`s (e.g. `:symbol` or `:sym` or `:ruby`) are the keys in the pairs. Only as part of a `Hash` do they "point" to anything.
James A. Rosen
+1  A: 
patient1 = { :ruby => "red" }
patient2 = { :ruby => "programming" }

patient1.each_key {|key| puts key.object_id.to_s}
3918094
patient2.each_key {|key| puts key.object_id.to_s}
3918094

patient1 and patient2 are both hashes, that's fine. :ruby however is a symbol. If we were to output the following:

patient1.each_key {|key| puts key.to_s}

Then what will be output? "red", or "programming"?

Neither, of course. The output will be ruby. Which, BTW, you could have found out in less time than it took you to type the question, by simply typing it into IRB instead.

Why would it be red or programming? Symbols always evaluate to themselves. The value of the symbol :ruby is the symbol :ruby itself and the string representation of the symbol :ruby is the string value "ruby".

[BTW: puts always converts its arguments to strings, anyway. There's no need to call to_s on it.]
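
For anyone without IRB to hand, here is a minimal sketch of what such a session would show. The helper key_names is a made-up name, not part of the original example:

```ruby
patient1 = { :ruby => "red" }
patient2 = { :ruby => "programming" }

# Iterating over the keys yields the symbol itself, never the stored value.
patient1.each_key { |key| puts key }   # prints: ruby
patient2.each_key { |key| puts key }   # prints: ruby

# Hypothetical helper: the string form of every key in a hash.
def key_names(hash)
  hash.keys.map(&:to_s)
end

p key_names(patient1)                  # ["ruby"]
p :ruby.to_s                           # "ruby"
```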

Jörg W Mittag
I don't have IRB on my current machine and wouldn't be able to install it, hence the question, so my apologies for that.
Kezzer
Jörg W Mittag
Thanks, just playing around with it now. I'm still a little confused though so going over it again.
Kezzer
+1  A: 

The symbol :ruby does not contain "red" or "programming". The symbol :ruby is just the symbol :ruby. It is your hashes, patient1 and patient2, that each contain those values, in each case pointed to by the same key.

Think about it this way: suppose you go into the living room on Christmas morning and see two boxes, each with a tag that says "Kezzer". One has socks in it, and the other has coal. You're not going to get confused and ask how "Kezzer" can contain both socks and coal, even though it is the same name, because the name isn't containing the (crappy) presents. It's just pointing at them. Similarly, :ruby doesn't contain the values in your hash, it just points at them.
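
The analogy could be sketched in code roughly like this. The names box1, box2 and contents are invented for the illustration:

```ruby
# Two "boxes" carrying the same tag; the contents live in the box (the Hash).
box1 = { :kezzer => "socks" }
box2 = { :kezzer => "coal" }

# Hypothetical helper: look inside a box using the tag as the key.
def contents(box, tag)
  box[tag]
end

puts contents(box1, :kezzer)                   # socks
puts contents(box2, :kezzer)                   # coal

# The tag on both boxes is the very same symbol object.
puts box1.keys.first.equal?(box2.keys.first)   # true
```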

jcdyer
+4  A: 

I was able to grok symbols when I thought of it like this. A Ruby string is an object that has a bunch of methods and properties. People like to use strings for keys, and when a string is used for a key, all those extra methods aren't used. So they made symbols, which behave like string objects with all the functionality removed, except that which is needed to be a good key.

Just think of symbols as constant strings.
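
A small sketch of the "constant string" idea, assuming a current Ruby. The helper supported is a made-up name; note that Symbol is its own class rather than a kind of String, so this is a mental model, not the literal implementation:

```ruby
# Hypothetical helper: which of the given method names an object responds to.
def supported(obj, names)
  names.select { |name| obj.respond_to?(name) }
end

mutators = [:<<, :upcase!, :replace, :concat]

# A String offers methods that change it in place; a Symbol offers none of them.
p supported(String.new("ruby"), mutators)   # [:<<, :upcase!, :replace, :concat]
p supported(:ruby, mutators)                # []

# Read-only, string-like behaviour is still available on a symbol.
p :ruby.length                              # 4
p :ruby.to_s                                # "ruby"
```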

Segfault
Reading through the posts, this one probably makes the most sense to me. :ruby is just stored somewhere in memory; if I use "ruby" somewhere, then "ruby" again somewhere else, it's just duplication. So using symbols is a way to reduce duplication of common data. As you say, constant strings. I guess there's some underlying mechanism that will find that symbol again to use?
Kezzer
+4  A: 
patient1.each_key {|key| puts key.to_s}

Then what will be output? "red", or "programming"?

Neither, it will output "ruby".

You're confusing symbols and hashes. They aren't related, but they're useful together. The symbol in question is :ruby; it has nothing to do with the values in the hash, and its internal integer representation will always be the same, and its "value" (when converted to a string) will always be "ruby".

meagar
+6  A: 

You might be presuming that the declaration you've made defines the value of a Symbol to be something other than what it is. In fact, a Symbol is just an "interned" String value that remains constant. Symbols are frequently used because they are stored using a simple integer identifier, which is more efficient than managing a large number of variable-length strings.

Take the case of your example:

patient1 = { :ruby => "red" }

This should be read as: "declare a variable patient1 and define it to be a Hash, and in this store the value 'red' under the key (symbol 'ruby')"

Another way of writing this is:

patient1 = Hash.new
patient1[:ruby] = 'red'

puts patient1[:ruby]
# red

As you are making an assignment, it is hardly surprising that the result you get back is identical to what you assigned in the first place.

The Symbol concept can be a little confusing as it's not a feature of most other languages.

Each String object is distinct even if the values are identical:

[ "foo", "foo", "foo", "bar", "bar", "bar" ].each do |v|
  puts v.inspect + ' ' + v.object_id.to_s
end

# "foo" 2148099960
# "foo" 2148099940
# "foo" 2148099920
# "bar" 2148099900
# "bar" 2148099880
# "bar" 2148099860

Every Symbol with the same value refers to the same object:

[ :foo, :foo, :foo, :bar, :bar, :bar ].each do |v|
  puts v.inspect + ' ' + v.object_id.to_s
end

# :foo 228508
# :foo 228508
# :foo 228508
# :bar 228668
# :bar 228668
# :bar 228668

Converting strings to symbols maps identical values to the same unique Symbol:

[ "foo", "foo", "foo", "bar", "bar", "bar" ].each do |v|
  v = v.to_sym
  puts v.inspect + ' ' + v.object_id.to_s
end

# :foo 228508
# :foo 228508
# :foo 228508
# :bar 228668
# :bar 228668
# :bar 228668

Likewise, converting from Symbol to String creates a distinct string every time:

[ :foo, :foo, :foo, :bar, :bar, :bar ].each do |v|
  v = v.to_s
  puts v.inspect + ' ' + v.object_id.to_s
end

# "foo" 2148097820
# "foo" 2148097700
# "foo" 2148097580
# "bar" 2148097460
# "bar" 2148097340
# "bar" 2148097220

You can think of Symbol values as being drawn from an internal Hash table and you can see all values that have been encoded to Symbols using a simple method call:

Symbol.all_symbols

# => [:RUBY_PATCHLEVEL, :vi_editing_mode, :Separator, :TkLSHFT, :one?, :setuid?, :auto_indent_mode, :setregid, :back, :Fail, :RET, :member?, :TkOp, :AP_NAME, :readbyte, :suspend_context, :oct, :store, :WNOHANG, :@seek, :autoload, :rest, :IN_INPUT, :close_read, :type, :filename_quote_characters=, ...

As you define new symbols either by the colon-notation or by using .to_sym this table will grow.
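
A hedged sketch of watching that table grow, using Symbol.all_symbols. The helper interned? and the generated name are made up for the demonstration, and the name is built at run time because symbols written literally in the source are already registered when the file is parsed:

```ruby
# Hypothetical helper: is a symbol with this name already in the table?
# Comparing by string avoids creating the symbol as a side effect.
def interned?(name)
  Symbol.all_symbols.any? { |sym| sym.to_s == name }
end

# Build the name at run time so the parser cannot register it in advance.
name = "demo_symbol_#{Process.pid}_#{rand(1_000_000)}"

puts interned?(name)           # expected false: nothing has created it yet
sym = name.to_sym              # the symbol is registered here
puts interned?(name)           # expected true
puts sym.equal?(name.to_sym)   # true: a second to_sym finds the same symbol
```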

tadman
A: 

I would recommend reading the Wikipedia article on hash tables - I think it will help you get a sense of what {:ruby => "red"} really means.

Another exercise that might help your understanding of the situation: consider {1 => "red"}. Semantically, this doesn't mean "set the value of 1 to 'red'", which is impossible in Ruby. Rather, it means "create a Hash object, and store the value 'red' for the key 1".
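
As a quick sketch of that exercise (lookup and colours are names invented here):

```ruby
# Hypothetical helper: look up a key, returning nil when it is absent.
def lookup(hash, key)
  hash[key]
end

colours = { 1 => "red" }

puts lookup(colours, 1)   # red
puts 1 + 1                # 2   (the integer 1 itself is unaffected)
p lookup(colours, 2)      # nil (nothing was stored under the key 2)
```

The same reading applies when the key is :ruby instead of 1.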

Greg Campbell
+1  A: 

Symbols are not pointers. They do not contain values. Symbols simply are. :ruby is the symbol :ruby and that's all there is to it. It doesn't contain a value, it doesn't do anything, it just exists as the symbol :ruby. The symbol :ruby is a value just like the number 1 is. It doesn't point to another value any more than the number 1 does.
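
One way to convince yourself of this, sketched with a made-up helper valid_ruby? that simply asks the interpreter whether a piece of source text is acceptable:

```ruby
# Hypothetical helper: does Ruby accept this source text as valid code?
def valid_ruby?(source)
  eval(source)
  true
rescue SyntaxError
  false
end

p valid_ruby?('x = :ruby')       # true:  a variable can hold a symbol
p valid_ruby?(':ruby = "red"')   # false: a symbol cannot be assigned to
p valid_ruby?('1 = "red"')       # false: neither can the integer 1

# A symbol is simply equal to itself, like any other value.
p :ruby == :ruby                 # true
p [:ruby, 1].include?(:ruby)     # true
```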

Chuck