53834

Applying functions to multidimensional numpy arrays without loops

I am working with raster data with numpy (after reading from GDAL), which represents elevation. My goal is calculate water flow direction for every pixel in the array using numpy, determined primarily from the difference in elevation between a given pixel and it's 8 neighbours.

I have already implemented a rolling window technique to generate a multidimensional array with each pixel and it's neighbours, which works as below:

def rolling_window(array, window_size): itemsize = array.itemsize shape = (array.shape[0] - window_size + 1, array.shape[1] - window_size + 1, window_size, window_size) strides = (array.shape[1] * itemsize, itemsize, array.shape[1] * itemsize, itemsize) return np.lib.stride_tricks.as_strided(array, shape=shape, strides=strides) array = np.arange(100) array = array.reshape(10, 10) w = rolling_window(array, 3) # produces array with shape (8, 8, 3, 3) - edge cases are not currently dealt with.

So, a series of 3 x 3 arrays, centred around the study pixel at 1,1, each within another dimension of the array for the raster "rows" e.g., from one pixel of the input, the array representing it could be as below, where the pixel valued 4 is the study pixel, and the other values are it's immediate neighbours.

array([[[[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8]]]])

A simplified version of my current method for working with this multidimensional array is the following function:

def flow_dir(array): # Value to assign output based on element index. flow_idx_dict = {0: 32, 1: 64, 2: 128, 3: 16, 5: 1, 6: 8, 7: 4, 8: 2} # Generates the rolling window array as mentioned above. w = rolling_window(array, 3) # Iterate though each pixel array. for x, i in enumerate(w, 1): for y, j in enumerate(i, 1): j = j.flatten() # Centre pixel value after flattening. centre = j[4] # Some default values. idx = 4 max_drop = 0 # Iterate over pixel values in array. for count, px in enumerate(j): # Calculate difference between centre pixel and neighbour. drop = centre - px # Find the maximum difference pixel index. if count != 4: if drop > max_drop: max_drop = drop idx = count # Assign a value from a dict, matching index to flow direction category. value = flow_idx_dict[idx] # Update each pixel in the input array with the flow direction. array[x, y] = value return array

Understandably, all these for loops and if statements are very slow. I know there must be a vectorized numpy way to do this, but I'm struggling to find the exact functions(s) I need, or perhaps have not understood how to implement them properly. I have tried np.apply_along_axis, np.where, np.nditer, and others, but to no avail so far. What I think I need is:

<ol> <li>

A way to apply a function to each of these pixel arrays produced by the rolling window without using for loops to access them.

</li> <li>

Find the maximum drop index value, without using if statements and enumerate.

</li> <li>

To be able to update the input array in a batch, not by individual element.

</li> </ol>

Answer1:

I think rolling windows can be avoided here; It is easier and more readable to vectorize on NxN array than NxNx3x3.

Consider this data :

array = np.array([[78, 72, 69, 71, 58, 49], [74, 67, 56, 49, 46, 50], [69, 53, 44, 37, 38, 48], [64, 58, 55, 22, 33, 24], [68, 61, 47, 21, 16, 19], [74, 53, 34, 12, 11, 12]]) N=6

First, compute the 8 gradients and codes this way :

gradient = np.empty((8,N-2,N-2),dtype=np.float) code = np.empty(8,dtype=np.int) for k in range(8): theta = -k*np.pi/4 code[k] = 2**k j, i = np.int(1.5*np.cos(theta)),-np.int(1.5*np.sin(theta)) d = np.linalg.norm([i,j]) gradient[k] = (array[1+i: N-1+i,1+j: N-1+j]-array[1: N-1,1: N-1])/d

It is fast because there is few external loops (8). (-gradient).argmax(axis=0) give for each pixel the direction of the flow.

Finally, take the code :

direction = (-gradient).argmax(axis=0) result = code.take(direction)

result :

array([[ 2, 2, 4, 4], [ 1, 2, 4, 8], [128, 1, 2, 4], [ 2, 1, 4, 4]])

Recommend

  • Extract matched text and its neighboring context n words upstream n words downstream (nth occurrence
  • How do I control the viewport of a Grid in wxPython?
  • line and circle intersection [duplicate]
  • NSWindowController: loadWindow loads window from nib but showWindow: does nothing
  • BizTalk force empty elements to be created without using xslt call template
  • Why does this simple MySQL query not return the row?
  • How to compute daily average for 15 years?
  • SqlBulkCopy Ignore Duplicate Records of Datatable From DataBase
  • Need clear idea about timespec structure
  • When to use clone() and how actually addAll() and add() works
  • Using meshgrid to convert X,Y,Z triplet to three 2D arrays for surface plot in matplotlib
  • Google Analytics - Using Client ID as a custom dimension
  • numpy array divide column by vector
  • Assignment of Allocatables of Different Shapes in Fortran [duplicate]
  • spring data neo4j 3.0.0 - why two labels set by default
  • Numpy: How to get rid of the minima along axis=1, given the indices - in an efficient way?
  • Add Jpanel to Jframe NetBeans
  • Basic defensive programming [duplicate]
  • In C what exactly happens if i use () to initialize a double dimension array instead of the {}?
  • Get the number 18437736874454810627
  • What Makes These Two Array Adds Different?
  • How to create a 2D image by rotating 1D vector of numbers around its center element?
  • Retrieving specified columns from a list of csv files to create a data data frame in R
  • Laravel: Getting Session ID oddly truncates when using foreach
  • GAE: Way to get reference to an HttpSession from its ID?
  • Yii2: Config params vs. const/define
  • Java Scanner input dilemma. Automatically inputs without allowing user to type
  • vba code to select only visible cells in specific column except heading
  • Convert array of 8 bytes to signed long in C++
  • what is the difference between the asp.net mvc application and asp.net web application
  • How do you troubleshoot character encoding problems?
  • Do I've to free mysql result after storing it?
  • Transpose CSV data with awk (pivot transformation)
  • Matrix multiplication with MKL
  • JTable with a ScrollPane misbehaving
  • Understanding cpu registers
  • Recursive/Hierarchical Query Using Postgres
  • Running Map reduces the dimensions of the matrices
  • Sorting a 2D array using the second column C++
  • Binding checkboxes to object values in AngularJs