R provides two different methods for accessing the elements of a list or data.frame: the [] and [[]] operators.
What is the difference between the two? In what situations should I use one over the other?
Double brackets access a single list element, while single brackets give you back a list containing that single element.
lst <- list('one','two','three')
a <- lst[1]
class(a)
## returns "list"
a <- lst[[1]]
class(a)
## returns "character"
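The question also mentions data.frames, where the same distinction applies. A minimal sketch, using a made-up data frame (df, its columns x and y, and the variables one_col and col_vec are names chosen here for illustration only):

```r
# Hypothetical example data; not from the original post.
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

one_col <- df["x"]     # single brackets: still a data.frame, with one column
class(one_col)
## returns "data.frame"

col_vec <- df[["x"]]   # double brackets: the column itself
class(col_vec)
## returns "integer"
```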
The two methods differ in two significant ways: the class of the object they return when used for extraction, and whether they accept a range of indices or just a single index during assignment.
Consider the case of data extraction on the following list:
foo <- list( str='R', vec=c(1,2,3), bool=TRUE )
Say we would like to extract the value stored by bool from foo and use it inside an if() statement. This will illustrate the differences between the return values of [] and [[]] when they are used for data extraction. The [] method returns objects of class list (or data.frame if foo was a data.frame) while the [[]] method returns objects whose class is determined by the type of their values.
So, using the [] method results in the following:
if( foo[ 'bool' ] ){ print("Hi!") }
Error in if (foo["bool"]) { : argument is not interpretable as logical
class( foo[ 'bool' ] )
[1] "list"
This is because the [] method returned a list, and a list is not a valid object to pass directly into an if() statement. In this case we need to use [[]] because it will return the "bare" object stored in 'bool', which will have the appropriate class:
if( foo[[ 'bool' ]] ){ print("Hi!") }
[1] "Hi!"
class( foo[[ 'bool' ]] )
[1] "logical"
The second difference is that the [] operator may be used to access a range of slots in a list or columns in a data frame, while the [[]] operator is limited to accessing a single slot or column. Consider the case of value assignment using a second list, bar:
bar <- list( mat=matrix(0,nrow=2,ncol=2), rand=rnorm(1) )
Say we want to overwrite the last two slots of foo with the data contained in bar. If we try to use the [[]] operator, this is what happens:
foo[[ 2:3 ]] <- bar
Error in foo[[2:3]] <- bar :
more elements supplied than there are to replace
This is because [[]] is limited to accessing a single element. We need to use []:
foo[ 2:3 ] <- bar
print( foo )
$str
[1] "R"
$vec
     [,1] [,2]
[1,]    0    0
[2,]    0    0
$bool
[1] -0.6291121
Note that while the assignment was successful, the slots in foo kept their original names.
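If the names should follow the new values, one option is to assign them in a separate step. The sketch below rebuilds foo and bar as above, except that rand is given a fixed value (1.5, an arbitrary stand-in for rnorm(1)) so the result is reproducible; sub is a name introduced here for illustration:

```r
foo <- list(str = "R", vec = c(1, 2, 3), bool = TRUE)
# rand uses a fixed, arbitrary value instead of rnorm(1) for reproducibility
bar <- list(mat = matrix(0, nrow = 2, ncol = 2), rand = 1.5)

# Single brackets can also extract a range of slots; the result is still a list.
sub <- foo[2:3]
class(sub)
## returns "list"
names(sub)
## returns "vec" "bool"

# Assignment through [] replaces the values but keeps the old names.
foo[2:3] <- bar
names(foo)
## returns "str" "vec" "bool"

# Copy the names over explicitly if they should match bar.
names(foo)[2:3] <- names(bar)
names(foo)
## returns "str" "mat" "rand"
```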
Never use single bracket [ ] indexing for a list. It's almost always a bug when you do it.
This really is a language misfeature.