I need to take the matrix product of two NumPy matrices (or other 2-D arrays) containing log probabilities. The naive way

    np.log(np.dot(np.exp(a), np.exp(b)))

is not an option: exponentiating log probabilities underflows to zero for strongly negative entries, so the result loses precision or becomes -inf outright.
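For instance, a minimal sketch of the failure mode (the values are made up for illustration; anything much below about -745 underflows in float64):

    import numpy as np

    # Hypothetical log probabilities small enough that np.exp underflows to 0.0.
    a = np.full((2, 2), -1000.0)
    b = np.full((2, 2), -1000.0)

    naive = np.log(np.dot(np.exp(a), np.exp(b)))
    print(naive)  # every entry is -inf (plus a divide-by-zero warning);
                  # the true answer is log(2) - 2000, roughly -1999.3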
Using

    from scipy.special import logsumexp  # scipy.misc.logsumexp in older SciPy versions

    res = np.zeros((a.shape[0], b.shape[1]))
    for n in range(b.shape[1]):
        # broadcast b[:,n] over rows of a, sum columns
        res[:, n] = logsumexp(a + b[:, n].T, axis=1)

works, but runs about 100 times slower than np.log(np.dot(np.exp(a), np.exp(b))).
Using

    logsumexp((np.tile(a, (b.shape[1], 1)) + np.repeat(b.T, a.shape[0], axis=0))
              .reshape(b.shape[1], a.shape[0], a.shape[1]), axis=2).T

or other combinations of tile, repeat and reshape also works, but runs even slower than the loop above due to the prohibitively large amount of memory required for realistically sized input matrices.
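For reference, here is a more readable broadcasting version of the same idea (a sketch; the name logdot_broadcast is mine, not code from the post). It makes the memory problem explicit, because it materialises the full (M, N, K) array of pairwise sums before reducing it:

    from scipy.special import logsumexp

    def logdot_broadcast(a, b):
        # a: (M, K) log probabilities, b: (K, N) log probabilities.
        # Builds every a[m, k] + b[k, n] in one (M, N, K) array, then reduces over k.
        stacked = a[:, None, :] + b.T[None, :, :]   # shape (M, N, K)
        return logsumexp(stacked, axis=2)           # shape (M, N)

For the (1000, 2000) and (2000, 1500) matrices timed below, that intermediate alone holds 1000 * 1500 * 2000 = 3e9 float64 values, i.e. roughly 24 GB.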
I am currently considering writing a NumPy extension in C to compute this, but of course I'd rather avoid that. Is there an established way to do this, or does anybody know of a less memory-intensive way of performing this computation?
EDIT:
Thanks to larsmans for this solution (see below for derivation):
    def logdot(a, b):
        # Shift both matrices by their global maxima so that np.exp cannot overflow
        # (the largest exponentiated entry is exp(0) = 1), then undo the shift in log space.
        max_a, max_b = np.max(a), np.max(b)
        exp_a, exp_b = a - max_a, b - max_b
        np.exp(exp_a, out=exp_a)
        np.exp(exp_b, out=exp_b)
        c = np.dot(exp_a, exp_b)
        np.log(c, out=c)
        c += max_a + max_b
        return c
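As a quick sanity check, here is a sketch using small, well-conditioned inputs where the naive method cannot underflow, so it can serve as a reference:

    # Small random log-probability matrices; exp() is safe here.
    a = np.log(np.random.rand(5, 4))
    b = np.log(np.random.rand(4, 3))

    naive = np.log(np.dot(np.exp(a), np.exp(b)))
    print(np.allclose(logdot(a, b), naive))  # True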
A quick comparison of this method with the loop-based method posted above (logdot_old) using IPython's %timeit magic yields the following:
    In [1]: a = np.log(np.random.rand(1000, 2000))
    In [2]: b = np.log(np.random.rand(2000, 1500))
    In [3]: x = logdot(a, b)
    In [4]: y = logdot_old(a, b)  # this takes a while
    In [5]: np.any(np.abs(x - y) > 1e-14)
    Out[5]: False
    In [6]: %timeit logdot_old(a, b)
    1 loops, best of 3: 1min 18s per loop
    In [7]: %timeit logdot(a, b)
    1 loops, best of 3: 264 ms per loop
Obviously larsmans' method obliterates mine!
Question from: https://stackoverflow.com/questions/23630277/numerically-stable-way-to-multiply-log-probability-matrices-in-numpy