I have a large sparse matrix, call it P:
> str(P)
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  ..@ i       : int [1:7868093] 4221 6098 8780 10313 11102 14243 20570 22145 24468 24977 ...
  ..@ p       : int [1:7357] 0 0 269 388 692 2434 3662 4179 4205 4256 ...
  ..@ Dim     : int [1:2] 1303967 7356
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x       : num [1:7868093] 1 1 1 1 1 1 1 1 1 1 ...
  ..@ factors : list()
I'd like to row-normalize it (say, with the L2 norm). Taking advantage of vector recycling, the straightforward approach would be something like:
> row_normalized_P <- P / sqrt(rowSums(P^2))
But this causes a memory allocation error, since it appears the rowSums result is being recycled into a dense matrix with dimensions equal to those of P. Given that P is known to be sparse (or at the very least is stored in sparse format), does anyone know of a non-iterative approach to achieve the desired row_normalized_P shown above?
(That is, the resulting matrix should be as sparse as P itself, and I'd like to avoid ever allocating a dense matrix during the normalization steps.)
The only semi-efficient workaround I've found is to apply across rows (more precisely, across blocks of rows coerced into dense sub-matrices) of P, but I'd like to remove the looping logic from my codebase if I can. Is there perhaps a built-in in the Matrix package, one that I'm just not aware of, that helps with this particular type of computation?
Cheers and thanks for any help!
I figured out a nice solution (as usual, about 15 minutes after posting :-/ )...
> row_normalized_P <- Matrix::Diagonal(x = 1 / sqrt(Matrix::rowSums(P^2))) %*% P
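For anyone who wants to sanity-check the idea on something small first, here is a minimal sketch on a toy matrix. The names P_small and row_scale are made up for illustration, and the guard for all-zero rows is my own assumption about what you'd want there (leave them as zero rows rather than divide by zero); it isn't part of the one-liner above.

```r
library(Matrix)

# Toy stand-in for P: a 4 x 3 sparse matrix whose third row is all zero.
P_small <- sparseMatrix(
  i = c(1, 1, 2, 4, 4),
  j = c(1, 3, 2, 1, 2),
  x = c(3, 4, 2, 1, 1),
  dims = c(4, 3)
)

# L2 norm of every row; P_small^2 stays sparse, so nothing is densified.
row_norms <- sqrt(rowSums(P_small^2))

# Assumption: all-zero rows should stay zero, so scale them by 1
# instead of 1 / 0 = Inf.
row_scale <- ifelse(row_norms > 0, 1 / row_norms, 1)

# Left-multiplying by a sparse diagonal matrix rescales each row of P_small.
P_small_normalized <- Diagonal(x = row_scale) %*% P_small

# Row norms after normalization (non-empty rows should come out as 1).
new_norms <- sqrt(rowSums(P_small_normalized^2))
print(new_norms)

# The result has the same number of stored non-zeros as the input.
print(c(nnzero(P_small), nnzero(P_small_normalized)))
```

The reason this avoids the allocation problem is that the diagonal matrix only stores its n diagonal entries, and a sparse-times-sparse product only touches the entries that are actually stored in P.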