Why Are Arrays Indexed From 0?
There is a good chance this question has occurred to you if you are new to programming (and perhaps even if you aren’t). It seems weird. When we count on our fingers or in our heads, we’re used to counting from 1, yet the first item in an array is at index 0. There is a good reason for this and I didn’t pick up on it until I started working with C++ and began to understand what an array is from the computer’s perspective.
How an array is stored in C++
If you’re coming from JavaScript or another high-level language, you’re used to arrays being written with brackets, with anywhere from zero to many elements separated by commas in between. If you call console.log() and pass an array into it, it will print the entire array, brackets and all.
In C++, however, the equivalent array declaration and output statement print something very different: a single hexadecimal number.
This is the memory address of the first element of the array. That’s it. When the name array is used in an expression like an output statement, C++ converts it to a pointer to the first element, so in effect the variable just points to that address.
So where is the rest of the array?
In C++ (and many other low-level languages) the elements of an array are stored contiguously in memory. Memory addresses can be easy to shrug off as meaningless because they are difficult-to-read hexadecimal numbers, and our computers have tons of memory. But they are numbered, and if we declare an array of five integers, the computer will reserve five contiguous blocks of memory and store those integers in them. If I change my output statement to print array + 1, it will give me the address of the next element in memory.
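You’ll notice that this address is 4 higher than the previous address, rather than 1 higher. This is because addresses count individual bytes, and on my machine each integer takes up a block of 4 8-bit bytes, which may differ on yours. Adding 1 to the array moves forward by one whole element, not by one byte. To be clear, no other element sits between these two addresses; the first integer fills all of the space in between. The pattern holds if I add 2 to the variable array: the address is 8 higher than the first.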
What you should be beginning to see here is that the address of the first element in the array is all we need to access the rest of it. You can think of an array’s index not as a direct link to an address, as you would a key in a hash map, but rather as an offset from the array’s start. So if I rewrite my output statements to dereference those addresses, as in *(array + 0), *(array + 1), and *(array + 2) (and forgive me if you’re not familiar with pointers and dereferencing them), we can start to see something that looks pretty similar to bracket notation indexing.
And we’ll see our array values printed in order in the command line.
I’ve added 0 to array in the first output statement not because it needs to be there, but so you can see that an index of 0 just means we’re offsetting the first element’s address by 0. That’s all there is to it: an offset or index of 0 means we’re starting at the address of the first element, and every other index is in relation to that first address.