I thought I had learned well enough what row-major and column-major order were from my last confusion, but I hadn’t worked out the implications of moving from 2 indices to 3.
After learning image processing in MATLAB, I assumed that since a single image is indexed by (row, column), (row, column, image number) was the most sensible order when dealing with multiple images.
I had been confused by the first few Python tutorials I saw (like this one for TensorFlow), which use the first index, rather than the last, for the image number in a stack of images.
At first I assumed it was just a difference in convention between academic communities.
The insight came when I saw the following alternate explanation in the numpy.reshape docs:
‘C’ means to read / write the elements using C-like index order, with the last axis index changing fastest, back to the first axis index changing slowest. ‘F’ means to read / write the elements using Fortran-like index order, with the first index changing fastest, and the last index changing slowest
With this version of row-major vs column-major, picking the first index for images in Python seems completely natural.
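One way to check this is to look at how the elements sit in memory. The following is a minimal sketch, not code from the tutorials: the array names and the choice of two 3 x 4 images are assumptions made for illustration. With the default C order, putting the image number first keeps each image in one unbroken run of memory, while putting it last interleaves the two images.

```python
import numpy as np

# Two 3 x 4 images stored image-first: (image, row, column).
image_first = np.arange(24).reshape(2, 3, 4)
# The same 24 values laid out image-last: (row, column, image).
image_last = np.arange(24).reshape(3, 4, 2)

# Step size in memory (in elements, not bytes) when each index grows by 1.
steps_first = tuple(s // image_first.itemsize for s in image_first.strides)
steps_last = tuple(s // image_last.itemsize for s in image_last.strides)
print(steps_first)  # (12, 4, 1): the last index moves one element at a time
print(steps_last)   # (8, 2, 1): the image index is now the fastest one

# A single image is one contiguous block only in the image-first layout.
print(image_first[0].flags['C_CONTIGUOUS'])       # True
print(image_last[:, :, 0].flags['C_CONTIGUOUS'])  # False
```

The step of 12 for the first index is exactly one 3 x 4 image, which is why stepping the image number last (slowest) matches C order.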
Let’s say we have two images, each 3 x 4.
On the left we’ll show image 1, and on the right image 2.
Using the function numpy.unravel_index in Python, you can find what the 3 indices are as you step through the linear indices 0-23.
When we use this along with a FuncAnimation, we get a nice visual of how the pixels are ordered in the 3D stack:
If we watch the figure title, which shows how you would index the stack to access the current element, we see that the last index changes fastest and the first index changes slowest.
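The same stepping can be printed without the animation. This is a small sketch rather than the animation code itself: the helper name `c_order_index` is hypothetical, and the shape (2, 3, 4) assumes the (image, row, column) layout described above.

```python
import numpy as np

shape = (2, 3, 4)  # assumed layout: (image, row, column)

def c_order_index(linear_index):
    """Return the 3D index for a linear index in C order, as plain ints."""
    return tuple(int(k) for k in np.unravel_index(linear_index, shape))

for i in (0, 1, 2, 3, 4, 11, 12, 23):
    print(i, c_order_index(i))
# 0 (0, 0, 0)
# 1 (0, 0, 1)
# 2 (0, 0, 2)
# 3 (0, 0, 3)
# 4 (0, 1, 0)   <- column wrapped around, row advanced
# 11 (0, 2, 3)  <- last pixel of image 1
# 12 (1, 0, 0)  <- first pixel of image 2
# 23 (1, 2, 3)
```

The image index only changes once, at linear index 12, after every pixel of the first image has been visited.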
The unravel_index function also takes an order keyword that allows you to specify ‘F’ for Fortran column-major order (instead of the default ‘C’ for C/Python row-major order).
Using this, we can also visualize what the order is in MATLAB/Fortran:
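As a rough text version of that ordering, the sketch below prints the first few Fortran-order indices. The helper name `f_order_indices` is hypothetical, and the two shapes are assumptions: (2, 3, 4) reuses the image-first stack from above, while (3, 4, 2) is the MATLAB-style (row, column, image number) layout. Note that NumPy indices start at 0, whereas MATLAB's start at 1.

```python
import numpy as np

def f_order_indices(shape, count):
    """Return the first `count` indices of `shape` in Fortran order, as plain ints."""
    return [
        tuple(int(k) for k in np.unravel_index(i, shape, order='F'))
        for i in range(count)
    ]

# Image-first shape read in Fortran order: the image index flips on every step.
same_shape = f_order_indices((2, 3, 4), 4)
print(same_shape)    # [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]

# MATLAB-style shape: walk down the rows of a column first, image index last.
matlab_style = f_order_indices((3, 4, 2), 4)
print(matlab_style)  # [(0, 0, 0), (1, 0, 0), (2, 0, 0), (0, 1, 0)]
```

In both cases the first index changes fastest, so keeping each image together in Fortran order means putting the image number last.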
The two clear differences are
If you’re interested in the animation code to make this, it’s located here.