1. 23
  1. 25

    It’s not that big of a deal. Just do as documented in whatever language you are using. If one is better than the other, it won’t be more than a small detail advantage. Let it go.

    I can totally see why non-tech people find it amusing that nerds care about such small, silly things.

    1. 17

      We should have a tag for bikeshedding. I think most fields, tech or non-tech, have similar silly arguments. Get into a climbing group and ask them about the best cams out there and why they are using hexes or tricams, and you can be sure they will argue for a while. Meanwhile this leaves the whole crag for you to climb! Caring about pointless details is always easier than caring about getting actual things done, and I have my fair share of guilt at that!

    2. 14

      0-indexing is better because 0 is a natural number, dangit

      Yeah I know lots of people find it unintuitive, but so much math works out so much better when you start counting from 0. I will fite people over this

      1. 7

        There is no consensus on whether 0 is part of the natural numbers. There is an ISO standard, but some definitions specify that the natural numbers contain all positive whole numbers, and since 0 is neither positive nor negative, it’s not part of the natural numbers under those definitions.

        1. 11

          Peano arithmetic, which I think can be considered fairly standard, is defined in terms of a Zero and a Successor function.

          The existence of a zero among the naturals means they can be considered a group, as zero is the identity for addition.

          Integers are generally defined as two-tuples of natural numbers where (m,n) represents the difference m-n. The existence of a zero among the naturals means that every integer has a canonical representation as (m,0), (0,n), or (0,0); the first form is simply m, the second is -n, and the third is 0 (which satisfies both, as a bonus showing clearly that 0=-0). This is clearly more elegant than the alternative, where the integral representation of natural m is (m+1,1).
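
          A rough sketch of that canonical form in Python, assuming pairs (m, n) of naturals (0 included) that stand for m - n. The helper name `canonical` is made up for illustration and does not come from any library:

          ```python
          def canonical(m: int, n: int) -> tuple[int, int]:
              """Reduce a pair of naturals (m, n), read as m - n, to canonical form."""
              if m < 0 or n < 0:
                  raise ValueError("both components must be naturals (>= 0)")
              k = min(m, n)          # the part shared by both components
              return (m - k, n - k)  # at least one component is now 0


          print(canonical(5, 2))  # prints (3, 0), read as 3
          print(canonical(2, 5))  # prints (0, 3), read as -3
          print(canonical(4, 4))  # prints (0, 0), read as 0
          ```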

          Defining S_0 as the base case of a sequence means that S_n results from n applications of the inductive rule. The simplest example of this is actually peano arithmetic itself, where the number n self-evidently results from n applications of the Successor function to the Zero.

          These are just a few examples. I’m unaware of any arguments in favour of excluding zero from the naturals.

          1. 8

            The existence of a zero among the naturals means they can be considered a group, as zero is the identity for addition.

            Not a group but a monoid. Without zero they’d be a semigroup, i.e. lacking identity. (Maybe you had some group in mind, but with respect to addition the naturals are a monoid. Still, if people think multiplication is somehow more natural, then starting at 1 also gives you a monoid, so this argument can go both ways.)

            I agree with the rest of your comment.

            Edit: I guess I’ll put some thoughts here.

            The difference between counting the position of something (what is the first number?) and the quantity of something (what is the smallest quantity?) is the difference that matters w.r.t. this indexing question. The reason 0-indexing is more natural is that in a positional number system (base whatever, where whatever is the number of symbols in the system), the first symbol always means 0: when you run out of symbols, as with suc(9) in decimal, you start again in the next position and get 10.

            Just to drive the point home: a list in a programming language is a base for a positional number system, and the elements in the list are the digits you are using. If you ever find yourself doing index arithmetic of the form x / (length list) combined with x % (length list), then you are working with a base (length list) number system.
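
            A minimal sketch of that in Python, where `to_digits` is a hypothetical helper of mine and the list of symbols plays the role of the digits:

            ```python
            def to_digits(x: int, symbols: list[str]) -> list[str]:
                """Write x >= 0 in base len(symbols), using symbols as digits."""
                base = len(symbols)
                digits = []
                while True:
                    x, r = divmod(x, base)     # x // base and x % base at once
                    digits.append(symbols[r])  # r is in 0..base-1: a 0-based index
                    if x == 0:
                        break
                return digits[::-1]            # most significant digit first


            print(to_digits(10, list("0123456789")))  # prints ['1', '0']
            print(to_digits(5, ["a", "b"]))           # prints ['b', 'a', 'b']
            ```

            The remainder lands in 0..base-1, so it indexes the list directly only when the list is 0-indexed.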

            Another way to define the natural numbers is in terms of homotopies: if you imagine \mathbb{C} \setminus \{0\}, then a path that doesn’t circle around the origin is contractible and defined as 0 (a path that has a homotopy to the point it originates from and ends at is contractible), which allows you to prove that 0 = -0, while other numbers can be obtained by looking at how many times you circle around the origin and in which direction the spiral goes (you define one direction as positive and the other as negative, but I said we’re defining the naturals so you’d ignore directionality; however, you’d have to start at 0 because otherwise extending to the whole numbers would make your earlier system incompatible). This may seem kinda ridiculous, but the point is that this kind of inductive number system with a base case is often bidirectional, and so it makes sense for the base case to be 0.

            1. 5

              Peano arithmetic, which I think can be considered fairly standard, is defined in terms of a Zero and a Successor function.

              While this is commonly the case today, Peano himself originally defined it for positive integers only.

            2. 3

              I will also fite anybody who says 0 isn’t a natural number

              1. 1

                In addition (pun intended) to the rest of the thread, I’ll point out that the Von Neumann ordinals give a correspondence between natural numbers and set cardinalities. The number 1 corresponds to the sets with exactly one element. Similarly, the number 0 corresponds to the empty sets.

                Think of the natural numbers as the numbers for counting discrete things. (This is a decategorification!) Counting sheep in a field ala Baez, we count zero sheep, one sheep, two sheep, three sheep, etc. The fact that a field can have zero sheep is still countable with natural numbers.
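
                A small Python sketch of that correspondence, using the usual construction where 0 is the empty set and the successor of k is k ∪ {k}. The function name `von_neumann` is just an illustrative choice:

                ```python
                def von_neumann(n: int) -> frozenset:
                    """Build the von Neumann ordinal for n as nested frozensets."""
                    ordinal = frozenset()              # 0 is the empty set
                    for _ in range(n):
                        ordinal = ordinal | {ordinal}  # successor: k ∪ {k}
                    return ordinal


                print(len(von_neumann(0)))  # prints 0
                print(len(von_neumann(3)))  # prints 3
                ```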

                1. 3

                  Think of the natural numbers as the numbers for counting discrete things. (This is a decategorification!) Counting sheep in a field ala Baez, we count zero sheep, one sheep, two sheep, three sheep, etc. The fact that a field can have zero sheep is still countable with natural numbers.


                  This is what I meant (elsewhere down-thread) when I said that zero is a generalisation of magnitude. Without zero, the question ‘how many [sheep in a field, for example] are there’ has two different kinds of answers: ‘there are n’ or ‘there aren’t any’. That is, Either None PositiveInt. If zero can be a magnitude, then the answer always takes the form ‘there are n’, with the former ‘none’ case replaced by n=0.

                  (There are additional generalisations to be had: integers let you reason about deficits, and rationals let you reason about fractional quantities (‘how much’ instead of ‘how many’). But those generalisations come at the cost of added complexity, where zero is effectively free.)

            3. 13

              Most of the article recaps the fairly well known indexes as offsets argument. But this part at the end I’ve never heard before:

              What’s funny to think about is that if instead C had not done that and used 1-based indexing, people today would certainly be claiming how C is superior for providing both 1-based indexing with p[i] and 0-based pointer offsets with p + i.

              I wholeheartedly agree!

              1. 4

                The author forgets to mention that in C p[1] also dereferences the pointer. So it is equivalent to *(p + 1), not to p + 1. Using 1-based indexing for bracket notation would simultaneously assign two new concepts to one operator. I can see how that could have been an unfortunate choice.

                1. 1

                  The author forgets to mention that in C p[1] also dereferences the pointer.

                  I thought that was just implied. Nevertheless, you have a good point.

              2. 7

                For many centuries, the existence of zero was not even acknowledged. Even after it was, it was highly controversial and took a long time to come into vogue.

                People are culturally prejudiced against 0. It makes sense that people find the use of 0 as an index less ‘natural’ than 1, but the reasons for that aren’t inherent to our psyche or to the world; they’re purely cultural.

                Zero was an excellent innovation and I whole-heartedly welcome it: into the domain of mathematics, into the domain of programming languages, everywhere else. It’s a generalisation of magnitude, and a more natural way to count.

                1. 2

                  We totally should use Roman numerals for indexing.

                  1. 2

                    INTERCAL didn’t go far enough, huh?

                2. 6

                  The post zealously dashes directly past the question of what happens with fencepost errors. In Lua, an off-by-one error towards zero fails silently: looking up the zeroth element of a table returns nil instead of raising an error. When combined with the fact that lots of Lua code will need to add 1 to each index in order to be correct, this combination of design choices is a footgun which invites folks to make indexing mistakes.

                  1. 5

                    Prolog has both nth0/3 and nth1/3 so you can use both zero- and one-based indexing. I find having both actually pretty nice - in different circumstances, different ones seem more natural, so one can just use whatever makes sense at the time.

                    1. 4

                      I think Ada did the only correct thing and allowed indices to start at both 0, 1 and 56.

                      1. 3

                        I think there is a strawman in there.

                        1. 4

                          I don’t even see an argument for 1-based indexing here. It just says “people like 0-based indexing because of Dijkstra and C” but not why 1-based indexing would be better.

                          1. 2

                            Granted there is no “why 1-based indexing is ‘better’” section, maybe that isn’t what the author is going for anyways, but I noticed two points:

                            It really shows how conditioned an entire community can be when they find the statement “given a list x, the first item in x is x[1], the second item in x is x[2]” to be unnatural. :)


                            … and I think even proponents of 0-based indexing would agree, in spite of the fact that most of them wouldn’t even notice that they don’t call it a number zero reason.

                        2. 4

                          And nowadays, all arguments that say that indexes should be 0-based are actually arguments that offsets are 0-based, indexes are offsets, therefore indexes should be 0-based. That’s a circular argument.

                          This is not a circular argument.

                          The rest seems to be mostly straw men, with no argument for 1-based indexing.

                          1. 2

                            Here is my very simple argument: p[N] means the Nth element of an array, that is p + N. p is a pointer to data, so the actual memory access is p + (N * sizeof(*p)) unless you 1-index, in which case it’s p + ((N - 1) * sizeof(*p)). It doesn’t take a genius to see the lost perf unless you contort your data structures in unnatural ways.

                            There are plenty of other cases where it goes wrong: plenty of maths bounds are [x,y) not (x,y], offsets don’t work automatically, what was previously x[a-b] becomes x[a - b + 1], etc.
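
                            A toy Python model of both points. The base address and element size are made-up numbers and the function names are mine; this only mimics the arithmetic and says nothing about what a real compiler emits:

                            ```python
                            def address_0(base: int, n: int, size: int) -> int:
                                """Byte address of element n with 0-based indices."""
                                return base + n * size


                            def address_1(base: int, n: int, size: int) -> int:
                                """Same element with 1-based indices: one extra subtraction."""
                                return base + (n - 1) * size


                            base, size = 1000, 4             # hypothetical values
                            print(address_0(base, 0, size))  # prints 1000
                            print(address_1(base, 1, size))  # prints 1000

                            # The element a - b steps from the start of the list.
                            xs = ["p", "q", "r", "s"]
                            a, b = 3, 1
                            print(xs[a - b])             # 0-based, prints r
                            one_based = [None] + xs      # pad slot 0 to fake 1-based
                            print(one_based[a - b + 1])  # 1-based, prints r
                            ```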

                            The argument that people count from one is also cultural - floor numbers in America start at one, but zero in at least NZ, and I can’t imagine NZ is alone there.

                            People know that babies aren’t born 1 year old (although, again, some cultures disagree), and here too offset math is broken by 1-based counting: current year - age only gives the birth year with zero-based counting.

                            1. 3

                              Why should the programmer care about the underlying definition (p is a pointer)? Also I question the lost performance: that’s what, one extra operation, if that?

                              1. 1

                                Why should the programmer care: A programmer should know how their computing environment works. But it’s not just that: even higher-level implementations end up needing to contend with these issues, because you have to compensate for the difference between how memory actually behaves and the design of your language.

                                One extra op: Code size bloat, and runtime of one of the hottest instruction sequences. IIRC one of the most significant amounts of engineering effort in Wirth languages (the pascals, oberons, modulas, etc) is converting one based indexing to zero based, it was vastly more important than the bounds checking people blame for perf problems in safe languages today.

                            2. 1

                              Again on clocks starting at midnight instead of 1AM. Again on unit circles starting at 0 radians instead of 1/π. Again on scales being tared at 0kg instead of 1kg. Again on rulers starting at 0cm instead of 1cm.

                              I suppose my argument is that the index of the nth thing isn’t as useful as the bounds of the nth thing … any 1-based language needs to either compile-time desugar the index by subtracting 1 or incur the runtime cost of subtracting 1 from the index any time it’s actually used – because under the hood there’s just offsets – while gaining only a very mild degree of comfort for primates that first learn to think in fingers.

                              Also I’m not sure an appeal to tradition is any more valid than an appeal to authority.

                              1. 1

                                APL lets you pick, which provides the advantage of increased clarity based on problem domain and the disadvantage of having to figure out which one is being used in a given namespace.

                                Half-open zero-based sequence indexing (as in python) usually imposes the lowest number-fiddling overhead in the types of problems I work on in array-based scientific computing. Having to get the N+1th term from a length-N list always feels impossibly wrong to me.
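
                                A few lines of plain Python showing the kind of bookkeeping I mean (the values are arbitrary):

                                ```python
                                data = list(range(10))   # [0, 1, ..., 9]
                                start, stop = 3, 7

                                chunk = data[start:stop]  # includes index 3, excludes 7
                                print(len(chunk) == stop - start)  # prints True

                                # Adjacent half-open slices tile the list exactly.
                                mid = 5
                                print(data[:mid] + data[mid:] == data)  # prints True

                                # Block k covers [k * size, (k + 1) * size).
                                size = 4
                                blocks = [data[k * size:(k + 1) * size] for k in range(3)]
                                print(blocks)  # prints [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
                                ```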

                                1. 1

                                  Maybe a dumb question, but in terms of implementations, do there have to be a bunch of +1s or -1s in the machine code that actually runs to handle 1-based stuff? Or I guess you just end up storing a shifted base pointer?

                                  Maybe I’m just used to it, but I feel like 0-indexing ends up working well when you’re combining indices and operating across multiple arrays (i+j style stuff). Like a lot of what I do is offsets. And for the rest … I mean that’s why foreach and iterators are nice. Keep indices out of usual code.

                                  “The first element is 1 and the last is at list.length” is … pretty nice though. Just aesthetically speaking
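
                                  Here is the sort of i+j thing I have in mind, as a Python sketch. The grid and the helper names `at_0` and `at_1` are invented for the example, and it only shows source-level arithmetic, not generated machine code:

                                  ```python
                                  WIDTH = 3
                                  flat = ["a", "b", "c",
                                          "d", "e", "f"]   # a 2 x 3 grid, row by row


                                  def at_0(row: int, col: int) -> str:
                                      """0-based: the two offsets simply add."""
                                      return flat[row * WIDTH + col]


                                  def at_1(row: int, col: int) -> str:
                                      """1-based: each index needs its own -1 first."""
                                      return flat[(row - 1) * WIDTH + (col - 1)]


                                  print(at_0(1, 2))  # prints f
                                  print(at_1(2, 3))  # prints f
                                  ```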