ansaurus

Question

How to use R's ellipsis feature when writing your own function?

Answer 1

+3 A:

You gave half the answer already. Consider

R> my_ellipsis_function <- function(...) {
+   input_list <- as.list(substitute(list(...)))
+ }
R> print(my_ellipsis_function(a=1:10, b=2:20))
[[1]]
list

$a
1:10

$b
2:20

R>

So this took the two arguments a and b from the call and converted them to a list. Wasn't that what you asked for?

Dirk Eddelbuettel 2010-06-16 21:45:09

Not quite what I want. That actually appears to return a list of lists. Notice the `[[1]]`. Also, I'd like to know how the magic incantation `as.list(substitute(list(...)))` works.

Ryan Thompson 2010-06-16 22:05:48

The inner `list(...)` is the call that would create a `list` object from the arguments. `substitute()` then returns the parse tree for that unevaluated expression; see the help for this function, as well as a good advanced text on R (or S). This is not trivial stuff.

Dirk Eddelbuettel 2010-06-16 22:23:33

Ok, what about the `[[-1L]]` part (from my question)? Shouldn't it be `[[1]]`?

Ryan Thompson 2010-06-16 22:57:57

You need to read up on indexing. The minus means 'exclude', i.e. `print(c(1:3)[-1])` will print 2 and 3 only. The `L` is a new-fangled way to ensure the value ends up as an integer; this is done a lot in the R sources.
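
A small sketch of both points, using only base R (the vector `x` is purely illustrative):

```r
x <- c(10, 20, 30)

# A negative index drops that position instead of selecting it.
x[-1]

# The L suffix makes the literal an integer rather than a double.
is.integer(1L)   # TRUE
is.integer(1)    # FALSE

# For indexing purposes, both spellings drop the same element.
identical(x[-1L], x[-1])
```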

Dirk Eddelbuettel 2010-06-16 23:24:54

I don't need to read up on indexing, but I *do* need to pay closer attention to the output of the commands that you show. The difference between the `[[1]]` and the `$a` indices made me think that nested lists were involved. But now I see that what you actually get is the list I want, but with an extra element at the front. So then the `[-1L]` makes sense. Where does that extra first element come from, anyway? And is there any reason I should use this instead of simply `list(...)`?

Ryan Thompson 2010-06-18 00:50:02

Answer 2

+4 A:

You can convert the ellipsis into a list with list(), and then perform your operations on it:

> test.func <- function(...) { lapply(list(...), class) }
> test.func(a="b", b=1)
$a
[1] "character"

$b
[1] "numeric"

So your get_list_from_ellipsis function is nothing more than list.

A valid use case for this is when you want to pass in an unknown number of objects to operate on (as in your example of c() or data.frame()). It's not a good idea to use ... when you know each parameter in advance, however, as it adds ambiguity and complication to the argument list and makes the function signature unclear to other users. The argument list is an important piece of documentation for function users.

It is also useful when you want to pass parameters through to a subfunction without exposing them all in your own function's arguments. This can be noted in the function documentation.
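
As a minimal sketch of the pass-through case (the wrapper name `summarise_values` is hypothetical, chosen only for illustration), everything after `x` is forwarded untouched to `mean()`:

```r
# Hypothetical wrapper: only `x` is a named parameter; anything else
# supplied by the caller is handed on to mean() through `...`.
summarise_values <- function(x, ...) {
  c(mean = mean(x, ...), n = length(x))
}

values <- c(1, 2, 3, NA, 100)

# Without extra arguments the missing value propagates into the mean.
summarise_values(values)

# na.rm is not a parameter of the wrapper; mean() receives it via `...`.
res <- summarise_values(values, na.rm = TRUE)
res
```

The cost is the one described above: `na.rm` appears nowhere in the wrapper's signature, so it has to be mentioned in the documentation.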

Shane 2010-06-16 21:45:32

I know about using the ellipsis as a pass-through for arguments to subfunctions, but it is also common practice among R primitives to use the ellipsis in the way I have described. In fact, both the `list` and `c` functions work in this way, but both are primitives, so I can't easily inspect their source code to understand how they work.

Ryan Thompson 2010-06-16 22:07:59

Ok, well using `list()` does exactly what you want, right?

Shane 2010-06-17 00:08:07

`rbind.data.frame` uses it this way.

Marek 2010-06-17 23:06:16

If `list(...)` is sufficient, why do R builtins such as `data.frame` use the longer form `as.list(substitute(list(...)))[-1L]` instead?

Ryan Thompson 2010-06-18 00:29:51

As I didn't create `data.frame`, I don't know the answer to that (that said, I'm sure that there *is* a good reason for it). I use `list()` for this purpose in my own packages and have yet to encounter a problem with it.

Shane 2010-06-18 13:03:06

Answer 3

+3 A:

Just to add to Shane and Dirk's responses: it is interesting to compare

get_list_from_ellipsis1 <- function(...) list(...)
get_list_from_ellipsis1(a = 1:10, b = 2:20)

$a
 [1]  1  2  3  4  5  6  7  8  9 10

$b
 [1]  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

with

get_list_from_ellipsis2 <- function(...) as.list(substitute(list(...)))[-1L]
get_list_from_ellipsis2(a = 1:10, b = 2:20)

$a
1:10

$b
2:20

As it stands, either version appears suitable for your purposes in my_ellipsis_function, though the first is clearly simpler.
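
One place where the two versions do behave differently is when an argument cannot be evaluated. The sketch below assumes a symbol `undefined_name` that is deliberately not defined anywhere; the helper names `get_values` and `get_exprs` are shortened stand-ins for the two functions above.

```r
get_values <- function(...) list(...)
get_exprs  <- function(...) as.list(substitute(list(...)))[-1L]

# The substitute() version only captures the expressions, so an
# undefined symbol is not a problem at this point.
exprs <- get_exprs(a = 1:10, b = undefined_name + 1)
class(exprs$a)   # "call": the expression 1:10, not the numbers
eval(exprs$a)    # evaluating it explicitly yields the integer vector

# list(...) forces every argument, so the same call fails.
res <- tryCatch(get_values(b = undefined_name + 1),
                error = function(e) "error")
res
```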

Richie Cotton 2010-06-17 15:37:19

Answer 4

+4 A:

I read the answers and comments and I see that a few things weren't mentioned.

data.frame does use the list(...) version. A fragment of its code:
```
object <- as.list(substitute(list(...)))[-1L]
mrn <- is.null(row.names)
x <- list(...)
```
object is used to do some magic with column names, but x is used to create the final data.frame.
For a use of the unevaluated ... argument, look at the write.csv code, where match.call is used.
As you write in your comment, the result in Dirk's answer is not a list of lists. It is a flat list whose elements are language objects: the first element is the symbol list, the second is the expression 1:10, and so on. That explains why [-1L] is needed: it removes the leading list symbol, which is always present, so that only the arguments supplied in ... remain.
As Dirk states, substitute returns the "parse tree for the unevaluated expression".
When you call my_ellipsis_function(a=1:10,b=11:20,c=21:30), ... "creates" a list of arguments, list(a=1:10,b=11:20,c=21:30), and substitute makes it a list of four elements:
```
List of 4
$  : symbol list
$ a: language 1:10
$ b: language 11:20
$ c: language 21:30
```
The first element doesn't have a name, and this is the [[1]] in Dirk's answer. I obtained these results using:
```
my_ellipsis_function <- function(...) {
  input_list <- as.list(substitute(list(...)))
  str(input_list)
  NULL
}
my_ellipsis_function(a=1:10,b=11:20,c=21:30)
```

As above, we can use str to check what objects we have inside a function.

my_ellipsis_function <- function(...) {
    input_list <- list(...)
    output_list <- lapply(X=input_list, function(x) {str(x);summary(x)})
    return(output_list)
}
my_ellipsis_function(a=1:10,b=11:20,c=21:30)
 int [1:10] 1 2 3 4 5 6 7 8 9 10
 int [1:10] 11 12 13 14 15 16 17 18 19 20
 int [1:10] 21 22 23 24 25 26 27 28 29 30
$a
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00    3.25    5.50    5.50    7.75   10.00 
$b
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   11.0    13.2    15.5    15.5    17.8    20.0 
$c
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   21.0    23.2    25.5    25.5    27.8    30.0

That works. Let's see the substitute version:

   my_ellipsis_function <- function(...) {
       input_list <- as.list(substitute(list(...)))
       output_list <- lapply(X=input_list, function(x) {str(x);summary(x)})
       return(output_list)
   }
   my_ellipsis_function(a=1:10,b=11:20,c=21:30)
    symbol list
    language 1:10
    language 11:20
    language 21:30
   [[1]]
   Length  Class   Mode 
        1   name   name 
   $a
   Length  Class   Mode 
        3   call   call 
   $b
   Length  Class   Mode 
        3   call   call 
   $c
   Length  Class   Mode 
        3   call   call

This isn't what we needed. You will need additional tricks to deal with these kinds of objects (as in write.csv).
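
A rough sketch of what such tricks can look like, using only base R. This is not how write.csv does it (that function works with match.call); the helper name `describe_args` is hypothetical and the approach is just one possibility.

```r
describe_args <- function(...) {
  caller <- parent.frame()   # where the arguments were written
  exprs <- as.list(substitute(list(...)))[-1L]

  # deparse() turns each captured expression back into text,
  # which is handy for building labels or column names.
  labels <- vapply(exprs,
                   function(e) paste(deparse(e), collapse = ""),
                   character(1))

  # eval() in the caller's frame recovers the actual values.
  values <- lapply(exprs, eval, envir = caller)

  list(labels = labels, values = values)
}

out <- describe_args(a = 1:10, b = 11:20)
out$labels
out$values$b
```

If only the values are needed, none of this is necessary.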

If you want to use ..., then you should use it as in Shane's answer, via list(...).

Marek 2010-06-21 08:50:26

+1 Fantastic discussion of the issues.

Shane 2010-06-21 15:51:47

Thank you for explaining the differences.

Ryan Thompson 2010-06-21 15:55:37
