Einstein summation in Numpy
Einstein summation is a convention in tensor algebra in which repeated indices are implicitly summed. For example, imagine we have a matrix (a tensor of rank 2) $A_{ij}$. To compute the trace, i.e. the sum of the diagonal elements, one has to compute
\begin{equation} \textrm{Tr}(A) = \sum_{i=1}^n A_{ii} \end{equation}
In the Einstein summation convention, the $\sum$ symbol is dropped and the trace of the matrix is simply written $A_{ii}$. The convention extends to much more complex computations with tensors of higher rank. For example, you may want to compute the trace over the last two dimensions of a rank-4 tensor $B_{i,j,k,l}$. This is equivalent to:
\begin{equation} B^\star_{ij} = \sum_{k=1}^n B_{i,j,k,k} \end{equation}
The result is a tensor of rank 2, as the last two indices have been contracted.
This is done in numpy with the np.einsum function. The input subscripts 'ijkk' repeat the index k, which is therefore summed over, while the output subscripts 'ij' indicate the indices that remain in the result.
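As a minimal sketch, both the ordinary trace and the trace over the last two dimensions can be expressed with np.einsum subscripts:

```python
import numpy as np

# Trace of a rank-2 tensor: the repeated index 'ii' picks the
# diagonal and sums it, exactly like np.trace.
A = np.arange(9).reshape(3, 3)
trace_A = np.einsum('ii', A)       # 0 + 4 + 8 = 12

# Trace over the last two dimensions of a rank-4 tensor:
# the repeated index k in 'ijkk' is summed, leaving indices 'ij'.
B = np.random.rand(2, 3, 4, 4)
B_star = np.einsum('ijkk->ij', B)  # shape (2, 3)
```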
Batched matrix operations via np.einsum
My greatest interest in numpy's Einstein summation is for operations on batched square matrices. By batched matrix, here I mean an array of square matrices, hence an array with three indices.
Batched outer products
To start, suppose we have a tensor with shape [batch_size, N], where the first dimension batch_size is the number of vectors of length $N$ for which we want to compute the outer product with themselves. The result should be a tensor of shape [batch_size, N, N], where each slice along the first dimension is the outer product of the corresponding length-N vector with itself.
This is equivalent to computing, row by row, the outer product of each vector with itself:
\begin{align} x \in \mathbb{R}^{b \times n} \end{align}
\begin{align} X \in \mathbb{R}^{b \times n \times n} \end{align}
\begin{align} X_{ijk} = x_{ij}\, x_{ik} \end{align}
where $b$ is the batch_size and $n$ is N. Note that no index is summed here: the repeated batch index $i$ appears in the output as well.
This operation can be done via the appropriate np.einsum call, as in the following example:
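A minimal sketch of the batched outer product: the subscripts 'ij,ik->ijk' keep the batch index i in the output, so nothing is summed.

```python
import numpy as np

batch_size, N = 5, 3
x = np.random.rand(batch_size, N)

# For every batch index i, take the outer product of row x[i]
# with itself; since i, j, k all appear in the output, no sum occurs.
X = np.einsum('ij,ik->ijk', x, x)  # shape (5, 3, 3)

# The same result via explicit broadcasting:
X_broadcast = x[:, :, None] * x[:, None, :]
```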
Batched graph matrices
Similarly to what was discussed above, complex operations on graph-related matrices can be done in batched form. Here I propose batched graph Laplacians and batched modularity matrices.
Here are some example functions that I wrote for networkqit:
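The original functions are not reproduced here; below is an illustrative sketch (not the actual networkqit code) of how batched Laplacians and modularity matrices could be computed with np.einsum, assuming A is a [batch_size, N, N] array of adjacency matrices:

```python
import numpy as np

def batched_laplacian(A):
    """Graph Laplacian L = D - A for a batch of adjacency matrices.

    A has shape [batch_size, N, N]; D is the diagonal degree matrix
    built from the row sums of each adjacency matrix.
    """
    degrees = np.einsum('ijk->ij', A)  # node degrees, shape [batch_size, N]
    # Place the degrees on the diagonal of each matrix in the batch.
    D = np.einsum('ij,jk->ijk', degrees, np.eye(A.shape[-1]))
    return D - A

def batched_modularity(A):
    """Modularity matrix B = A - k k^T / (2m) for each graph in the batch."""
    k = np.einsum('ijk->ij', A)    # degree vector per graph
    two_m = np.einsum('ij->i', k)  # total degree 2m per graph
    # Batched outer product of the degree vectors, as in the section above.
    kk = np.einsum('ij,ik->ijk', k, k)
    return A - kk / two_m[:, None, None]
```

The same einsum patterns reappear: 'ijk->ij' contracts the last index (row sums), and 'ij,ik->ijk' is the batched outer product from the previous section.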