Low rank estimation of sparse observed ratios
Reccomender systems are a class of algorithms to deal with missing information. Given that we have some available rate about the relations of a set of objects, and these informations are specified by real numbers, how can we estimate the relations between another subset of objects that we did not observe?
Let us take for example a sparse set of observations $\mathbf{R} = { R_{ij} }$ between a set of objects, from the same set $V$. We denote these objects as $i$ and object $j$. Unfortunately, we only have a partial information about the relations of these objects, and this information may be contaminated by noise. We only know the pairs specified by the subset $E$ of the complete set ${ (i,j) | (i,j) \in V^2 }$.
This sparse graph can be cast in a dense square matrix, if the set of object that $i$ and $j$ belongs are the same, but a matrix with a lot of missing entries, so not exactly a matrix, but something like this:
\begin{align}
\begin{pmatrix}
1 & \color{red}{?} & 2 & \color{red}{?} & \color{red}{?}\
\color{red}{?} & \color{red}{?} & \color{red}{?} & \color{red}{?} & 4\
2 & \color{red}{?} & 4 & 5 & \color{red}{?}\
\color{red}{?} & \color{red}{?} & 3 & \color{red}{?} & \color{red}{?}\
\color{red}{?} & 1 & \color{red}{?} & 3 & \color{red}{?}\
\end{pmatrix}
\end{align}
Now we make an important hypothesis. The nature of the non-missing numbers in our sparse observation is one of ratio of positive real numbers. They can be, for example, currency exchange rates.
Hence, the relations between the observed objects $i$ and $j$ can be written as the ratios of an unknown load variable $u_i$ and $u_j$.
On the diagonal, the matrix $R$ has ones, as the ratio of any number with itself is one.
An example of such matrix, is the following:
\begin{align}
\mathbf{R}= \begin{pmatrix}
1 & \color{red}{?} & 2 & \color{red}{?} & \color{red}{?}\
\color{red}{?} & 1 & \color{red}{?} & \color{red}{?} & 4\
1/2 & \color{red}{?} & 1 & 1/5 & \color{red}{?}\
\color{red}{?} & \color{red}{?} & 5 & 1 & 1/3\
\color{red}{?} & 1/4 & \color{red}{?} & 3 & 1\
\end{pmatrix}
\end{align}
If this is the case, our sparse observation matrix has a very specific structure that we can exploit.
We know, in the currency rates case, that the symmetric elements must be related by an inverse relation.
In other words, switching the row and column indices, makes the rate become its inverse:
\begin{equation}
R_{ij} = R_{ji}^{-1}
\end{equation}
For example, in the case above described, starting
How can we estimate the remaining missing numbers in the matrix $R$?
Moreover, if we only can describe this matrix as a ratio of the elements of a vector, how can we be sure that the ratios are fair? The simplest option is to transform this problem into one of least-squares estimation. We can estimate a load vector $\mathbf{u}={u_1, \ldots, u_n }$ by means of nonlinear optimization of the squares residual, as specified in the following equation:
\begin{equation} \underset{\mathbf{u}\in \mathbf{R^+}}{\textrm{argmin}} \sum_{(i,j) \in E }\left( R_{ij} - \frac{u_i}{u_j} \right)^2 \end{equation}
However, in the case that the ratios take very big or very small values, this problem may become numerically badly behaved. The terms in the residual of squares with the largest ratios are those that dominate and local optimization may become stuck at some local-optima.
A trick to tackle this problem, is to transform ratios into subtractions, via logarithmic transformation. If indeed, we take the logarithm of the non-missing entries in $R_{ij}$, then the range of the values gets stretched and numerically this can be solved more easily. The log-transformed matrix becomes a skew-symmetric matrix $\mathbf{L}$ with zero diagonal elements. The corresponding problem of optimization becomes:
\begin{equation} \underset{\mathbf{v}\in \mathbf{R}^n}{\textrm{argmin}} \sum_{(i,j) \in E }\left( \log(R_{ij}) - (v \cdot \mathbf{1}^T - v^T \cdot \mathbf{1}) \right)^2 \end{equation}
where the quantity $\mathbf{V} = (v \cdot \mathbf{1}^T - v^T \cdot \mathbf{1})$ is a rank 2 matrix that has elements $V_{ij} = v_i - v_j$. Also, the problem becomes much simpler, as the positivity constraint is lost, and we can look over the unbounded real domain. Once solved, the estimated load vector $\mathbf{u}$ is obtained by taking the exponential of the elements of $\mathbf{v}$.
Why this problem is interesting
Currency markets are often handled by exchanges. They may offer only some specific pairs to trade, as for example USD $ to EUR, or USD to CAD. However one would like to know, if given a set of specific exchange rates, others, missing exchange rates are fairly traded. In the case they are not, an arbitrage possibility emerges, i.e. a way to buy and sell a series of assets, in order to profit from their inbalance in the market. In a stationary market, all the prices are fair, meaning that it is not possible to perform risk-free arbitrage, as high frequency trading system are already doing it.
However, there are markets where non-stationarity conditions are prolonged in time, or the exchanges are fraudolent, and the rates are not fair. In this case, the sum of squares of the errors can become non-zero, and the optimization procedure that I’ve described is of help in finding these situations.