Question:

I am trying to convert part of a native Python function to Cython to improve the compute time. I would like to write a Cython function just for the loop component that is taking up the time (as IPython's lprun kindly told me). However, this function takes in variably sized matrices, and I can't see how to bring that across easily to statically typed Cython.

```
for index1 in range(0, num_products):
    for index2 in range(0, num_products):
        cond_prob = (data[index1] * data[index2]).sum() / max(col_sums[index1], col_sums[index2])
        prox[index1][index2] = cond_prob
```

The issue is that num_products changes year to year, so the matrix (data) size is variable.

What is the best strategy here?

<ol><li>Should I write two C functions: one to create a matrix of a certain dimension using malloc, and one to do the loops over the created matrix?</li> <li>Is there some fancy Cython/NumPy wizardry to help in this scenario? Can I write a C function that takes in a variably sized NumPy array in memory and pass the size?</li> </ol>

Answer1:Cython code is (strategically) statically typed, but that doesn't mean that arrays must have a fixed size. In straight C, passing a multidimensional array to a function can be a little awkward, but in Cython you should be able to do something like the following:

Note I took the function and variable names from your <a href="https://stackoverflow.com/q/22853837/2379410" rel="nofollow">follow-up question.</a>

```
import numpy as np
cimport numpy as np
cimport cython

@cython.boundscheck(False)
@cython.cdivision(True)
def cooccurance_probability_cy(double[:,:] X):
    cdef int P, N, i, j, k
    P = X.shape[0]  # number of rows (products)
    N = X.shape[1]  # number of columns (length of each row vector)
    cdef double item
    cdef double [:] CS = np.sum(X, axis=1)  # per-row sums
    cdef double [:,:] D = np.empty((P, P), dtype=np.float64)
    for i in range(P):
        for j in range(P):
            # inner product of rows i and j
            item = 0
            for k in range(N):
                item += X[i,k] * X[j,k]
            D[i,j] = item / max(CS[i], CS[j])
    return D
```

On the other hand, using just NumPy should also be quite fast for this problem, if you use the right functions and some broadcasting. In fact, as the calculation complexity is dominated by the matrix multiplication, I found the following is much faster than the Cython code above (`np.inner` uses a highly optimized BLAS routine):

```
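import numpy as np

def new(X):
    # Row sums as a (P, 1) column so they broadcast against their transpose
    CS = np.sum(X, axis=1, keepdims=True)
    # np.inner(X, X)[i, j] is the dot product of rows i and j
    D = np.inner(X, X) / np.maximum(CS, CS.T)
    return D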
```
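As a sanity check, here is a minimal sketch that compares the vectorized expression against the plain double loop from the question on a small random matrix. The names `loop_version` and `vectorized_version`, the matrix size, and the seed are assumptions made for illustration, not part of the original code.

```python
import numpy as np

def loop_version(X):
    # Straight port of the question's double loop (hypothetical helper name)
    P = X.shape[0]
    row_sums = X.sum(axis=1)
    prox = np.empty((P, P))
    for i in range(P):
        for j in range(P):
            prox[i, j] = (X[i] * X[j]).sum() / max(row_sums[i], row_sums[j])
    return prox

def vectorized_version(X):
    # Same computation with np.inner and broadcasting
    CS = np.sum(X, axis=1, keepdims=True)
    return np.inner(X, X) / np.maximum(CS, CS.T)

rng = np.random.default_rng(0)
# Non-square on purpose: 6 products, 10 observations each (assumed sizes)
X = rng.integers(0, 2, size=(6, 10)).astype(float)
X[:, 0] = 1.0  # guarantee every row sum is nonzero to avoid division by zero

agree = np.allclose(loop_version(X), vectorized_version(X))
print(agree)
```

Using a non-square matrix in the check helps catch indexing mistakes that a square test input would hide.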

Answer2:Have you tried getting rid of the for loops in numpy?

For the first part of your equation (the sum of products), you could, for example, try:

```
(data[np.newaxis, :] * data[:, np.newaxis]).sum(2)
```

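This builds a full three-dimensional intermediate array before summing, so if memory is an issue you can use the np.einsum() function instead. For the second part (the max of the two column sums), one could probably also cook up a NumPy expression (a bit more difficult) if you've not already tried that.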
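One possible way to put both parts together is sketched below. It assumes `data` is a 2-D float array with one row per product and that `col_sums` holds the per-row sums, as the indexing in the question suggests; the function name `prox_einsum` and the tiny input are made up for illustration. `np.einsum` computes the pairwise row products without the large intermediate, and `np.maximum.outer` is one option for the pairwise maximum.

```python
import numpy as np

def prox_einsum(data):
    # Hypothetical helper: pairwise proximity for all rows of `data`
    col_sums = data.sum(axis=1)  # one sum per product (row)
    # numer[i, j] = sum_k data[i, k] * data[j, k]
    numer = np.einsum('ik,jk->ij', data, data)
    # denom[i, j] = max(col_sums[i], col_sums[j])
    denom = np.maximum.outer(col_sums, col_sums)
    return numer / denom

# Small made-up example: 3 products, 3 observations
data = np.array([[1.0, 0.0, 1.0],
                 [1.0, 1.0, 0.0],
                 [0.0, 1.0, 1.0]])
prox = prox_einsum(data)
print(prox)
```

Here every row sums to 2 and each pair of distinct rows shares exactly one nonzero entry, so the diagonal is 1.0 and the off-diagonal entries are 0.5.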