numpy 2-D Array Axis Ordering & meshgrid Behavior[ ]
I’ve been working lately with data generated from an optical profilometer (this one), which consists of an array of values mapped across a 2-D grid in and . I’m needing to manipulate it in various ways in Python—plotting, leveling, fitting, etc.—and many of these operations require the underlying grid to be explicitly specified.
numpy.meshgrid is a handy function for this, but its axis ordering assumptions have been somewhat awkward to keep straight.
The axes of
numpy arrays are identified by integer indices; for a 2-D array, thus, there is
numpy’s intrinsic behavior is for indexing along
axis=0 to refer to the row number in the array, and for indexing along
axis=1 to refer to the column number. Thus, in
arr[i0, i1], the indices
i1 are specifing the element position along
axis=0 (how many rows down) and
axis=1 (how many columns across), respectively. This is nicely intuitive in the context of linear algebra, as in the array element the first index, , refers to the row number of the element, and the second, , to the column number.
Where problems have typically started to arise for me is when contemplating plotting data that’s mapped across such a 2-D array. Which
axis should correspond to the -axis of the plot? On one hand, since 2-D data is usually plotted as , one might think that
axis=0 would correspond to . On the other hand, since is usually plotted along the horizontal axis (i.e., as the column number changes along a grid), it might make more sense to identify
axis=1 as the -axis.
My inclination has always been to keep to the former convention, as it’s more natural for me to associate the first index to the array with the -axis. It turns out, though, that in certain cases
numpy takes the latter convention by default. Where this has historically confused me most significantly is in the use of the
meshgrid function. Consider the following:
>>> import numpy as np >>> a = np.arange(6).reshape((3,2)) >>> a array([[0, 1], [2, 3], [4, 5]]) >>> a.shape (3, 2) >>> x, y = np.meshgrid(*map(np.arange, a.shape)) >>> x array([[0, 1, 2], [0, 1, 2]]) >>> x.shape (2, 3) >>> y array([[0, 0, 0], [1, 1, 1]]) >>> y.shape (2, 3)
Thus, the default
meshgrid behavior is to produce mesh arrays whose shapes and axis associations are transposed with respect to that of the input array. This behavior corresponds to the default value of the
indexing argument to
'xy', which indicates to map the first axis of the input along the rows of the output matrices. I’ve just recently found the
indexing='ij' mode, which yields meshing matrices that are much more intuitive, for me at least:
>>> x, y = np.meshgrid(*map(np.arange, a.shape), indexing='ij') >>> x array([[0, 0], [1, 1], [2, 2]]) >>> y array([[0, 1], [0, 1], [0, 1]])
The shapes match, and the -values advance along
Time to trawl back through all my various profilometry analysis code and clean up this confusion….