The use of 1 or 0 for indexing

tristan · December 11, 2018, 10:32pm

I’ll keep my usual rant to just, starting at 1 is the “mathematical” way of indexing :).

OvermindDL1 · December 11, 2018, 10:47pm

Not that I’ve ever seen?
The arguments I’ve seen for 0-based indexing come from math; the arguments for 1-based have been purely about human convenience, and people tend to prefer 0 once they start dealing with the math of it. ^.^

How about one of the most basic examples: would you prefer to write (C/C++ syntax to keep it much shorter) arr[i + j + k], or arr[i + j + k - 2]? The latter is what you have to do with 1-based indexing (and yes, I know it’s actually an offset, but ‘index’ is the common term now; English changes over time, annoyingly). Of course there is Dijkstra’s paper on it from a mathematical perspective (ignoring the hardware performance aspects). There are studies showing that fencepost errors are significantly more common with 1-based indexing. There are fewer calculations for the processor to do with 0-based indexing (which can be substantial on architectures where you can index from memory to memory without needing to bring the value to the CPU to add 1 to it first). Etc… etc… etc… I’ve never seen a good argument for 1-based indexing as of yet (some people show things like for(int i=1;i<=len;++i)..., when even the 0-based for(int i=0;i<len;++i)... is still shorter by a character and matches the math better). And somehow +1’s get scattered all over the place when I look at 1-based programming languages (Lua especially) in a way that just wouldn’t happen with 0-based indexing. 0 is the natural base number.
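As a rough sketch of the arr[i + j + k] versus arr[i + j + k - 2] point, here are both conventions written in C. The function names are made up for illustration, and the 1-based version only emulates a 1-based array on top of C's 0-based storage.

```c
#include <assert.h>
#include <stddef.h>

/* 0-based: the element position is simply the sum of the three offsets. */
int get_zero_based(const int *arr, size_t i, size_t j, size_t k)
{
    return arr[i + j + k];
}

/* 1-based emulation: i, j and k each count from 1, and so does the array.
 * Each index carries an extra 1 (three in total) and the 1-based array
 * absorbs one of them, so 2 has to be subtracted to get the 1-based index. */
int get_one_based(const int *arr, size_t i, size_t j, size_t k)
{
    size_t one_based_index = i + j + k - 2;
    /* The final -1 only maps the 1-based index onto C's 0-based storage. */
    return arr[one_based_index - 1];
}

/* Hypothetical self-check: both conventions should reach the same element. */
void check_offsets(void)
{
    const int data[] = {10, 11, 12, 13, 14, 15, 16};
    assert(get_zero_based(data, 1, 2, 3) == get_one_based(data, 2, 3, 4));
}
```

Calling check_offsets() from a main function exercises both versions; the correction term in the 1-based one is exactly the part that is easy to forget.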

So yes, 1-based array indexing is inherently broken by design and mathematically unnatural.

tristan · December 11, 2018, 11:02pm

That is all programming.

I just have ingrained in my memory from childhood my dad, a mathematician, ranting about this because I, having started with C, would start everything from 0.

And holy crap… that is what happens at the end of this Dijkstra story! I can’t wait to show him this, haha.

OvermindDL1 · December 11, 2018, 11:26pm

Lol, my father was a first-generation programmer (old pre-unix hacker days) and he used languages that started numbering with 1, so we had similar rants but he eventually ended up coming around to the 0-based point of view in time. ^.^

Dijkstra’s was math, especially in regards to sequences and limits and how to represent number ranges in math. ^.^

sribe · December 11, 2018, 11:36pm

Haha, remember HP Basic? You could specify in array declarations whether to use 0- or 1-based indexing, thus had to go back to the declaration to know what was being used. What a “feature”!

OvermindDL1 · December 11, 2018, 11:40pm

Heh, in Fortran you can list both your starting and ending indices, so you could do integer a(-5:5) and it goes from -5 to 5. ^.^

dwahyudi · December 12, 2018, 2:56am

What about -1?

mjadczak · December 12, 2018, 9:25am

The only good one I’ve seen, and used myself, is when transcribing mathematical formulae (especially involving matrices) from a paper into something like Matlab (which indeed uses 1-based indexing, same as matrix maths). Other than that, I agree that 0 is the way to go, and mixing the two is just asking for trouble.

NobbZ · December 12, 2018, 10:20am

Indexes are natural numbers, and according to Peano, those start with 0¹. So indexes are zero-based by definition.

If though, you talk about the nth element of something, of course, this has to be one-based by convention, as by human understanding there is no zeroth element.

So after understanding the difference between the nth element and the element at a given index, this is not an issue anymore. One of the biggest problems is that many language/library designers haven’t understood the difference.

¹: First Peano-axiom: “0 is a natural number”.

peerreynders · December 12, 2018, 11:38am

Qqwy · December 12, 2018, 1:52pm

I always thought there was a deep mathematical meaning behind mathematical matrices and other collections starting at 1.

As far as I’ve been able to ascertain, there is not. Starting at 1 is a human convention without any fundamental reason behind it.

Starting at 0, however, has the advantages that Dijkstra mentioned: It is easier to reason about ranges when they are half-open with the lower limit being the inclusive one.

Personally, I find the pointer-arithmetic rules to be a weaker argument, because this is a prime reason to introduce an intermediate abstraction so you do not have to remember the index = y + w*x vs index = y + w*x + 1, or index = y % 7 vs index = (y % 7) + 1 stuff.

However, the half-open interval argument in and of itself is by far strong enough to always prefer 0-based indexing!
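To make the half-open argument a bit more concrete, here is a minimal sketch in C. The struct range type and its helper functions are hypothetical names invented for this example, not from any library.

```c
#include <assert.h>
#include <stddef.h>

/* A half-open range [lo, hi): lo is included, hi is excluded. */
struct range { size_t lo, hi; };

/* The length is a plain difference, with no +1 correction. */
size_t range_len(struct range r) { return r.hi - r.lo; }

/* The empty range is just lo == hi, with no need for hi < lo. */
int range_is_empty(struct range r) { return r.lo == r.hi; }

/* Splitting at mid gives two ranges that share the boundary value:
 * [lo, mid) and [mid, hi) neither overlap nor leave a gap. */
void range_split(struct range r, size_t mid,
                 struct range *left, struct range *right)
{
    left->lo = r.lo;
    left->hi = mid;
    right->lo = mid;
    right->hi = r.hi;
}

/* Iterating uses i < hi, matching the usual 0-based loop shape. */
long range_sum(const int *arr, struct range r)
{
    long total = 0;
    for (size_t i = r.lo; i < r.hi; ++i)
        total += arr[i];
    return total;
}

/* Hypothetical self-check: the lengths of the two halves add up exactly. */
void check_split(void)
{
    struct range whole = {0, 5}, left, right;
    range_split(whole, 2, &left, &right);
    assert(range_len(left) + range_len(right) == range_len(whole));
}
```

With a 0-based collection of n elements the whole thing is the range [0, n), so the upper bound is the length itself and the empty collection is [0, 0).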

@OvermindDL1 Before computers existed, the word index already meant:

A movable finger on a gauge, scale, etc.

And this is actually very close to what we mean when indexing a (programmatical) collection.
There is no confusion with ‘offset vs index’ at all: Both start at zero.
Confusion only starts when we are talking in a natural language that works with ordinals: These always start at one; index 0 indicates the first ordinal element, index 1 the second ordinal element, and so on. (Coincidentally, this is exactly why nth is a bad name for a function, because it introduces an ordinal name into an otherwise index-based system, making it unclear which numbering scheme is used.)

tristan · December 12, 2018, 3:42pm

Eh, it is still almost all about programming. The one part that tries to give a more generic mathematical reason I don’t really understand: “inclusion of the upper bound would then force the latter to be unnatural by the time the sequence has shrunk to the empty one. That is ugly, so for the upper bound we prefer < as in a”

Qqwy · December 12, 2018, 4:04pm

I believe Dijkstra is here talking about what happens to your bounds when you are doing mathematical induction, which is analogous to (or, to be exact, it proves) recursion.

peerreynders · December 12, 2018, 4:39pm

I view it as a (sloppy) language issue (which is only far too common in English). This discussion wouldn’t even be happening if

index was always understood to be 1-based while offset was understood to be 0-based AND
everybody was using those terms correctly and consistently without resorting to “you know what I mean” (which I obviously don’t because otherwise we wouldn’t regularly be having these misunderstandings/disagreements).

OvermindDL1 · December 12, 2018, 5:42pm

Even then if you look at the base algorithms then you will always see 0-based indexing pop up somewhere in the expansion, it may get canceled out to look like 1-based at times, but 0-based always seems to appear. Kind of like how pi is not the circle constant but is only half the circle constant. pi should have been 6.28... and because of that we have 2*pi in *EVERY* single formula that uses pi, the 2 can of course get canceled out sometimes (like some circle algorithms in 2d, since 2/2d ends up at 1) but the original root equations are always 2*pi.

Ah not linked those together in my mind yet, thanks!

I actually do use “Zeroth” in my daily language… Though I use a few oddities in my daily language. ^.^;

Like base 10… Base 10 is such a horrible base. We should be using base 6 or 60 or so, not 10… >.<

Base 6 is simple enough that you can keep all ‘unit values’ in your mind (easier than 10!) and it has the nice effect of even divisibility by 2 and 3, which is much more useful than 2 and 5 in almost everything. You also don’t get infinitely repeating digits after the point when you divide by 2, 3, 4, 6, 8, 9, 12, or anything else built only from factors of 2 and 3 (5, 7, 10, and 11 do still repeat), unlike base 10 where you get infinitely repeating digits on even the trivially tiny 3, as well as 6, 7, 9, 11, etc…

Ah, good to know! Even more showing how English’s definitions keep changing. ^.^;

It’s ugly as in it requires you to add +1’s all over the place, where that and only that representation does not. Being required to add things makes them very easy to forget, like forgetting the +C when integrating.

Apparently as above index originally started at 0 as well. ^.^

peerreynders · December 12, 2018, 5:54pm

I was being hypothetical.

(I see no reason for the term “newbie” to exist other than the originator of the term not being aware of “neophyte” - of course it’s entirely possible that I’m missing some nuanced difference. I’ve always assumed that offset was used for items organized in a linear and contiguous manner, while the same isn’t necessarily true for index which implied an ordering scheme not necessarily based on spatial arrangement and proximity).

tristan · December 12, 2018, 5:57pm

What extra +1 is needed? You mean instead of doing <=? You could say instead that starting from 0 means you have to go to len - 1

OvermindDL1 · December 12, 2018, 7:36pm

I was using +1 as a stand-in for all manner of +/-'s that you have to do in such cases. Like if you are indexing into a multi-dimensional flat array, then you need to do arr[x*ySize*zSize + y*zSize + z] if 0-based; otherwise you have to do arr[(x-1)*ySize*zSize + (y-1)*zSize + z], or if you did the potentially very confusing thing and used 0-based coordinates into a 1-based array then you could do arr[x*ySize*zSize + y*zSize + z + 1]. There are endless examples of such things, not just with arrays but also with looping style, equations, and so much more. 0 is the natural base number. ^.^
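A small C sketch of the flat-array case, under the assumption of a row-major layout with dimensions x_size by y_size by z_size, so the stride for x is y_size*z_size and the stride for y is z_size. The struct dims type and both function names are hypothetical and only here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed row-major 3D layout; field names are illustrative only. */
struct dims { size_t x_size, y_size, z_size; };

/* 0-based coordinates into 0-based storage: no correction terms at all. */
size_t flat_index0(struct dims d, size_t x, size_t y, size_t z)
{
    return x * d.y_size * d.z_size + y * d.z_size + z;
}

/* 1-based coordinates producing a 1-based flat position: every coordinate
 * that is multiplied by a stride needs its own -1; only z escapes, because
 * its -1 cancels against the +1 of the 1-based result. */
size_t flat_index1(struct dims d, size_t x, size_t y, size_t z)
{
    return (x - 1) * d.y_size * d.z_size + (y - 1) * d.z_size + z;
}

/* Hypothetical self-check: the two conventions differ by exactly 1. */
void check_flat_index(void)
{
    struct dims d = {2, 3, 4};
    assert(flat_index1(d, 2, 3, 4) == flat_index0(d, 1, 2, 3) + 1);
}
```

For a 2 by 3 by 4 array, flat_index0 maps the last element (1, 2, 3) to 23, and flat_index1 maps the same element, written as (2, 3, 4), to 24.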

tristan · December 12, 2018, 7:58pm

Ah ok, right, again programming based on offsets

OvermindDL1 · December 12, 2018, 9:30pm

That’s the same in mathematics as well, like when doing integration across dimensions or matrix work or so.