ansaurus

Question

Why do vector indices in R start with 1, instead of 0?

Answer 1

+13 A:

FORTRAN is one language that starts arrays at 1. Mathematicians deal with vectors that always start with component 1 and go through N. Linear algebra conventions start with row and column numbered 1 and go through N as well.

C started with zero because of the pointer arithmetic that was implicit underneath. Java, JavaScript, C++, and C# followed suit from C.

duffymo 2010-06-28 19:05:10

Exactly. C's 0 indexing always seemed utterly reasonless to me until I learned a little bit about pointer arithmetic. Then it made sense as a design choice.

Sharpie 2010-07-01 02:16:12

Answer 2

A:

You're doing it wrong. If you want to store additional attributes in an object, use attr:

> foo <- 1:20
> attr(foo, "created") <- Sys.time()               # just as an example
> str(foo)
 atomic [1:20] 1 2 3 4 5 6 7 8 9 10 ...
 - attr(*, "created")= POSIXct[1:1], format: "2010-06-28 14:07:15"    # our time
> summary(foo)                                     # object works as usual
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00    5.75   10.50   10.50   15.20   20.00 
>

Dirk Eddelbuettel 2010-06-28 19:09:00

What am I doing wrong? I wasn't trying to store any additional information in my object.

Frank 2010-06-28 19:11:49

I misread the last line of your question. To answer your question: R isn't C. That's all.

Dirk Eddelbuettel 2010-06-28 19:30:08

Answer 3

+2 A:

0 is only "usual" because that's what C did, and a lot of later languages slavishly copied C syntax. By default in Fortran arrays are 1-based.

In Ada there is no default and you have to pick the beginning and end of the range yourself. Interestingly, most code I've come across picks '1' for the lower bound. I think that's a pretty good indication of where folks would have gone given a free choice.

T.E.D. 2010-06-28 19:19:06

Answer 4

+6 A:

Vectors in math are often represented as n-tuples, elements of which are indexed from 1 to n. I suspect that R wanted to stay true to this notation.

Jan Gorzny 2010-06-28 19:19:11

Answer 5

+2 A:

R is a "platform for experimentation and research". Its aim is to enable "statisticians to use the full capabilities of such an environment" without rethinking the way they usually deal with statistics. So people use formulas to make regression models, and people start counting at 1.

wok 2010-06-29 06:30:55

Answer 6

+1 A:

Frank, I think you were misinterpreting what you saw when you typed arr[0]. The numeric(0) just means that the result is a numeric vector with no elements. It does not mean that the type of the vector is being "stored" in element 0. You would have gotten the same result if you had typed, for example, arr[arr > 30]. No element meets that condition, so the result vector has no elements. Likewise, no element has index 0. This is intentional, and has nothing to do with the 0 space being used for something else.

goodside 2010-06-30 08:09:41

I think that is [what Dirk tried to explain](http://stackoverflow.com/questions/3135325/why-do-vector-indices-in-r-start-with-1-instead-of-0/3135372#3135372), but you got the point. +1

Marek 2010-06-30 08:23:32

Answer 7

A:

The way this question is worded, it strikes me as the programming equivalent of the "Ugly American".

Pierreten 2010-07-02 04:37:37

But why did you post it as an answer and not as a comment?

Marek 2010-07-02 07:51:05

Answer 8

+1 A:

Actually, I think the C-style convention of starting at 0 is very logical when you look at the way memory is organized. In C++ (the same holds for plain C pointers) we can write the following:

int* T = new int[10];

The first element of the array is *T. This is perfectly "logical" because T holds the address of the first memory cell of the array, and *T is the value stored there. The second element sits in the next cell, so it is *(T+1): we move forward by one "sizeof(int)".

To make the code more readable, C provides an alias: T[i] for *(T+i). To access the first element you have to access *T, that is, T[0]. That's perfectly natural.
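A minimal sketch of that equivalence, assuming a plain int array. The function names are made up for illustration and are not part of any library:

```cpp
#include <cstddef>

// Read element i with explicit pointer arithmetic: step i elements
// past the base address, then dereference.
int element_via_pointer(const int* base, std::size_t i) {
    return *(base + i);
}

// Read the same element with subscript syntax. For raw pointers the
// language defines base[i] as *(base + i).
int element_via_subscript(const int* base, std::size_t i) {
    return base[i];
}

// Distance in bytes from the start of the array to element i.
// Element 0 sits at offset 0, which is why the first index is 0.
std::size_t byte_offset(const int* base, std::size_t i) {
    return static_cast<std::size_t>(
        reinterpret_cast<const char*>(base + i) -
        reinterpret_cast<const char*>(base));
}
```

For int a[3] = {10, 20, 30}, both accessors return 30 for i = 2, and byte_offset(a, 2) equals 2 * sizeof(int).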

This idea is extended by iterators :

std::vector<int> T(10);
int val = *(T.begin()+3);

Here T[i] is equivalent to *(T.begin()+i).
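The same comparison can be sketched for std::vector. The two helper names below are hypothetical; only begin() and operator[] come from the standard library:

```cpp
#include <cstddef>
#include <vector>

// Read element i by advancing the begin iterator i positions.
int vector_at_iterator(const std::vector<int>& v, std::size_t i) {
    return *(v.begin() + static_cast<std::vector<int>::difference_type>(i));
}

// Read element i with the subscript operator.
int vector_at_subscript(const std::vector<int>& v, std::size_t i) {
    return v[i];
}
```

Both functions return the same value for every valid i, and i = 0 corresponds to begin() itself with no offset.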

In Fortran/R, we usually start with 1 because of mathematical conventions, but there are certainly other good choices (cf this link for example). Do not forget that Fortran can easily use arrays that start with 0:

PROGRAM ZEROARRAY
REAL T(0:9)
T(0) = 3.14
END
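For comparison, the lower bound can be made a parameter in C++ as well. This is only a rough sketch: BoundedArray is a hypothetical helper written for this post, not a standard type, and it simply subtracts the chosen lower bound before indexing a zero-based std::vector.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: an array of double whose valid indices run
// from `lower` to `upper` inclusive, similar in spirit to REAL T(0:9).
class BoundedArray {
public:
    BoundedArray(long lower, long upper)
        : lower_(lower), data_(checked_size(lower, upper), 0.0) {}

    // Access by user-facing index; throws if outside the declared bounds.
    double& at(long index) { return data_[offset(index)]; }

    long lower() const { return lower_; }
    long upper() const { return lower_ + static_cast<long>(data_.size()) - 1; }

private:
    static std::size_t checked_size(long lower, long upper) {
        if (upper < lower) throw std::invalid_argument("upper < lower");
        return static_cast<std::size_t>(upper - lower + 1);
    }

    // Translate the user-facing index into a zero-based storage offset.
    std::size_t offset(long index) const {
        if (index < lower_ || index > upper())
            throw std::out_of_range("index outside declared bounds");
        return static_cast<std::size_t>(index - lower_);
    }

    long lower_;
    std::vector<double> data_;
};
```

With BoundedArray t(0, 9), t.at(0) is the first element; with BoundedArray u(1, 10), u.at(1) is the first element and u.at(0) throws std::out_of_range. The storage is the same either way, only the index translation differs.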

Elenaher 2010-07-19 12:21:02
