ansaurus

Question

Answer 1

+6 A:

Regarding your questions, let me address them in order and give some examples:

1) A function returns a list only when the value it returns is a list, for example because the return statement builds one. Consider

 R> retList <- function() return(list(1,2,3,4)); class(retList())
 [1] "list"
 R> notList <- function() return(c(1,2,3,4)); class(notList())
 [1] "numeric"
 R>

2) Names are simply not set:

R> retList <- function() return(list(1,2,3,4)); names(retList())
NULL
R>
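For contrast, here is a minimal sketch of how names show up once they are supplied, either in the list() call or afterwards. The element names (a, b, first, second) are made up for illustration.

```r
# Names exist only when supplied, at construction time or later.
named <- list(a = 1, b = 2)
names(named)                              # "a" "b"

unnamed <- list(1, 2)
names(unnamed)                            # NULL
names(unnamed) <- c("first", "second")    # attach names after the fact
unnamed$second                            # 2
```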

3) They do not return the same thing. Your example gives

R> x <- list(1,2,3,4)
R> x[1]
[[1]]
[1] 1
R> x[[1]]
[1] 1

where x[1] returns a list of length one that contains the first element of x, so it is still a list, just a shorter one. On the other hand x[[1]] returns the first element itself. Every scalar is a vector of length one, so that element is a numeric vector of length one.
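A quick way to check this yourself is to inspect the class of each result. This is only a sketch; the variable names sub and elem are mine.

```r
x <- list(1, 2, 3, 4)

sub  <- x[1]     # single bracket: a list of length one
elem <- x[[1]]   # double bracket: the element itself

class(sub)                 # "list"
class(elem)                # "numeric"
identical(sub, list(1))    # TRUE
identical(sub, x)          # FALSE, x has four elements
```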

4) Lastly, the two are different because they create, respectively, a list containing four scalars and a list with a single element (that happens to be a vector of four elements).

Dirk Eddelbuettel 2010-01-12 17:33:15

Very helpful, thanks. (Re item #1 in your answer--i agree, but what i had in mind was built-ins like 'strsplit', not user-created functions). In any event, +1 from me.

doug 2010-01-12 23:14:45

@doug About item #1: I think the only way is to check the help page for the specific function, section `Value`. For example, `?strsplit` says: "A list of the same length as x". Keep in mind, though, that a function may return different types depending on its arguments (e.g. sapply can return a list or a vector).
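A small sketch of the sapply case, assuming nothing beyond base R: the same call shape gives a vector when every result has length one and a list when the lengths differ. The names same_len and diff_len are just for illustration.

```r
# Same function, two return types, depending on what FUN produces.
same_len <- sapply(1:3, function(i) i * 2)       # every result has length 1
diff_len <- sapply(1:3, function(i) seq_len(i))  # results have lengths 1, 2, 3

class(same_len)   # "numeric"
class(diff_len)   # "list"

# vapply() states the expected result type up front and errors otherwise.
fixed <- vapply(1:3, function(i) i * 2, numeric(1))
```

If the return type matters to the calling code, vapply is probably the safer choice, since it fails loudly instead of silently switching type.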

Marek 2010-01-13 09:05:54

Answer 2

+2 A:

Just to take a subset of your questions:

This article on indexing addresses the question of the difference between [] and [[]].

In short, [[]] selects a single item from a list and [] returns a list of the selected items. In your example, x = list(1, 2, 3, 4), item 1 is a single number, so x[[1]] returns a single 1 and x[1] returns a list with only one value.

> x = list(1, 2, 3, 4)
> x[1]
[[1]]
[1] 1

> x[[1]]
[1] 1
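The difference is easier to see with more than one index. A minimal sketch (variable names are mine):

```r
x <- list(1, 2, 3, 4)

several <- x[c(2, 4)]   # single bracket can select any number of items
length(several)         # 2
class(several)          # "list"

rest <- x[-1]           # negative index drops items, result is still a list
length(rest)            # 3

one <- x[[2]]           # double bracket extracts exactly one item
one                     # 2
```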

JD Long 2010-01-12 17:35:20

Answer 3

+12 A:

Just to address the last part of your question, since that really points out the difference between a list and vector in R:

Why do these two expressions not return the same result?

x = list(1, 2, 3, 4); x2 = list(1:4)

A list can contain any other class as each element. So you can have a list where the first element is a character vector, the second is a data frame, etc. In this case, you have created two different lists. x has four vectors, each of length 1. x2 has 1 vector of length 4:

> length(x[[1]])
[1] 1
> length(x2[[1]])
[1] 4

So these are completely different lists.

R lists are very much like a hash map data structure in that each index value can be associated with any object. Here's a simple example of a list that contains 3 different classes (including a function):

> complicated.list <- list("a"=1:4, "b"=1:3, "c"=matrix(1:4, nrow=2), "d"=search)
> lapply(complicated.list, class)
$a
[1] "integer"
$b
[1] "integer"
$c
[1] "matrix"
$d
[1] "function"

Given that the last element is the search function, I can call it like so:

> complicated.list[["d"]]()
[1] ".GlobalEnv" ...
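To illustrate the hash-map comparison, here is a rough sketch of using a named list as a small key-value store. The list name store and its keys are invented for the example.

```r
# Treating a list as a small key-value store (keys here are made up).
store <- list(alpha = 1:3, beta = "text")

store[["gamma"]] <- sum            # add a new key holding a function
total <- store$gamma(store$alpha)  # call the stored function: 6

has_beta <- "beta" %in% names(store)   # TRUE, test for a key
store[["beta"]] <- NULL                # assigning NULL removes the key
names(store)                           # "alpha" "gamma"
```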

As a final comment on this: it should be noted that a data.frame is really a list (from the data.frame documentation):

A data frame is a list of variables of the same number of rows with unique row names, given class ‘"data.frame"’

That's why columns in a data.frame can have different data types, while columns in a matrix cannot. As an example, here I try to create a matrix with numbers and characters:

> a <- 1:4
> class(a)
[1] "integer"
> b <- c("a","b","c","d")
> d <- cbind(a, b)
> d
     a   b  
[1,] "1" "a"
[2,] "2" "b"
[3,] "3" "c"
[4,] "4" "d"
> class(d[,1])
[1] "character"

Note how I cannot change the data type in the first column to numeric because the second column has characters:

> d[,1] <- as.numeric(d[,1])
> class(d[,1])
[1] "character"
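For comparison, a sketch of the same two vectors combined into a data.frame, where each column keeps its own type. I pass stringsAsFactors = FALSE explicitly because older R versions would otherwise turn b into a factor.

```r
a <- 1:4
b <- c("a", "b", "c", "d")

# stringsAsFactors = FALSE keeps b as character on R versions before 4.0 too.
df <- data.frame(a, b, stringsAsFactors = FALSE)

is.list(df)              # TRUE, a data.frame is a list of columns
col_classes <- sapply(df, class)
col_classes              # a: "integer", b: "character"
```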

Shane 2010-01-12 18:01:18

This helps, thanks. (By the way, your example re 'complicated list', as you might already know, is the standard way to replicate the 'switch' statement in C++, Java, etc. in languages that don't have one; probably a good way to do this in R when i need to). +1

doug 2010-01-12 23:30:27

Right, although there is a useful `switch` function in R that can be used for that purpose (see `help(switch)`).
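A minimal sketch of what that looks like; describe() is a hypothetical helper, not something from the question.

```r
# describe() is a hypothetical helper, used only for illustration.
describe <- function(kind) {
  switch(kind,
         a = "first branch",
         b = "second branch",
         "no match")          # unnamed final argument acts as the default
}

describe("b")   # "second branch"
describe("z")   # "no match"
```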

Shane 2010-01-13 01:21:31

Answer 4

+2 A:

One reason lists work as they do (ordered) is to address the need for an ordered container that can contain any type at any node, which vectors do not do. Lists are re-used for a variety of purposes in R, including forming the base of a data.frame, which is a list of vectors of arbitrary type (but the same length).

Why do these two expressions not return the same result?

x = list(1, 2, 3, 4); x2 = list(1:4)

To add to @Shane's answer, if you wanted to get the same result, try:

x3 = as.list(1:4)

Which coerces the vector 1:4 into a list.
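One caveat worth checking: 1:4 is an integer vector, so the elements of x3 are integers, while list(1, 2, 3, 4) holds doubles. The sketch below shows the structural match and the type difference; x4 is my own addition for a strict match.

```r
x  <- list(1, 2, 3, 4)
x3 <- as.list(1:4)

length(x3)          # 4, same as length(x)
class(x[[1]])       # "numeric"
class(x3[[1]])      # "integer"
identical(x, x3)    # FALSE, because the storage types differ

# Coercing to double first gives an exact match.
x4 <- as.list(as.numeric(1:4))
identical(x, x4)    # TRUE
```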

Alex Brown 2010-01-12 18:19:46

Answer 5

+1 A:

You say:

For another, lists can be returned from functions even though you never passed in a List when you called the function, and even though the function doesn't contain a List constructor, e.g.,

x = strsplit(LETTERS[1:10], "") # passing in an object of type 'character'
class(x)
# => 'list'

You seem to suggest that this is a problem(?). I'm here to tell you why it's not a problem :-). Your example is a bit simple, in that when you do the string split, you get a list whose elements are each 1 element long, so you know that x[[1]] is the same as unlist(x)[1]. But what if strsplit returned results of a different length in each bin? Simply returning a vector (vs. a list) won't do at all.

For instance:

stuff <- c("You, me, and dupree","You me, and dupree",
           "He ran away, but not very far, and not very fast")
x <- strsplit(stuff, ",")
xx <- unlist(strsplit(stuff, ","))

In the first case (x, which is a list), you can tell what the 2nd "part" of the 3rd string was, e.g. x[[3]][2]. How could you do the same using xx now that the results have been "unraveled" (unlist-ed)?
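To make the point concrete, here is a sketch with a made-up input (parts) where the pieces are easy to count. Recovering the grouping from the flat vector is possible, but only by rebuilding the bookkeeping that the list gave you for free.

```r
parts <- c("a,b,c", "d,e", "f,g,h,i")   # made-up input
x  <- strsplit(parts, ",")
xx <- unlist(x)

sapply(x, length)   # 3 2 4
x[[3]][2]           # "g"

# After unlist(), the grouping has to be rebuilt by hand:
origin <- rep(seq_along(x), sapply(x, length))  # which string each piece came from
xx[origin == 3][2]  # "g"
```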

Steve Lianoglou 2010-01-12 21:56:59

Answer 6

A:

Just to add one more point to this:

R does have a data structure equivalent to the Python dict in the hash package. You can read about it in this blog post from the Open Data Group. Here's a simple example:

> library(hash)
> h <- hash( keys=c('foo','bar','baz'), values=1:3 )
> h[c('foo','bar')]
<hash> containing 2 key-value pairs.
  bar : 2
  foo : 1

In terms of usability, the hash class is very similar to a list, but key lookups should scale better for large collections, since the hash package is built on R environments.
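If you would rather not depend on a package, a plain environment gives similar key-value behaviour in base R. This is a sketch of that alternative, not the hash package's API; the variable e and the keys are illustrative.

```r
# Base-R sketch of a hash-like store; not the hash package's API.
e <- new.env(hash = TRUE)

assign("foo", 1, envir = e)    # insert with assign()
e[["bar"]] <- 2                # or with double-bracket assignment

foo_val <- get("foo", envir = e)                        # 1
has_baz <- exists("baz", envir = e, inherits = FALSE)   # FALSE
keys    <- sort(ls(e))                                  # "bar" "foo"
both    <- mget(c("foo", "bar"), envir = e)             # named list of both values
```

Note that environments have reference semantics: modifying e inside a function changes it for the caller too, which is unlike a list.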

Shane 2010-02-17 19:14:40

I'm aware of the hash package--it is mentioned in my original question as a suitable proxy for the traditional hash type.

doug 2010-02-17 19:25:36


How to Correctly Use Lists in R?
