I’m trying to port some code from Python to Elixir. The Python code generates tensors from BERT embeddings, then does some form of similarity comparison between them. From what I’ve found online, it looks like cosine similarity is the calculation I’m looking for, but I don’t understand it well enough to implement it in Nx.
The formula is listed as
(A ⋅ B) / (||A|| ||B||). I have two tensors with the shape
#Nx.Tensor<f32...>. So far this is all I’ve come up with:
for i <- 0..5, j <- 0..5 do
  t1 = tensor1[i]
  t2 = tensor2[j]
  Nx.dot(t1, t2) / ???
end
I found another formula on the Wikipedia page, written as NumPy code:
np.sum(a*b)/(np.sqrt(np.sum(a**2)) * np.sqrt(np.sum(b**2)))
I think that converts to:
defmodule CosSim do
  import Nx.Defn

  defn cosine_similarity(a, b) do
    left = Nx.sqrt(Nx.sum(a**2))
    right = Nx.sqrt(Nx.sum(b**2))
    Nx.sum(a * b) / (left * right)
  end
end
If I do a quick test:
a = Nx.tensor([1,2,3])
b = Nx.tensor([4,5,6])
CosSim.cosine_similarity(a, b)
#Nx.Tensor<
  f32
  EXLA.Backend<host:0, 0.528063503.4042653716.111775>
  0.9746317863464355
>
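As a hand check on that value, the formula can be worked through for these two vectors (plain arithmetic, nothing Nx-specific):

$$
\frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}
= \frac{1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6}{\sqrt{1^2 + 2^2 + 3^2} \, \sqrt{4^2 + 5^2 + 6^2}}
= \frac{32}{\sqrt{14} \, \sqrt{77}}
= \frac{32}{\sqrt{1078}}
\approx 0.9746
$$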
If I try to validate it in Python:
>>> a = np.matrix([1,2,3])
>>> b = np.matrix([4,5,6])
>>> np.sum(a*b)/(np.sqrt(np.sum(a**2)) * np.sqrt(np.sum(b**2)))
ValueError: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)
So something is off. Admittedly I’m very new to all of this ML/Nx stuff, so maybe I’m way off, or maybe I’m close. Any tips?
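One possible explanation, offered as a sketch rather than a confirmed diagnosis: `np.matrix` treats `*` as matrix multiplication, so `a*b` on two 1×3 matrices fails before the formula is ever evaluated, whereas with `np.array` the `*` operator is elementwise. The plain-Python check below avoids NumPy entirely and computes the same formula for the test vectors; the function name `cosine_similarity` is only illustrative.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length sequences of numbers."""
    # Elementwise product summed up, i.e. the dot product a . b
    dot = sum(x * y for x, y in zip(a, b))
    # Euclidean norms ||a|| and ||b||
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

result = cosine_similarity([1, 2, 3], [4, 5, 6])
print(result)
```

If this agrees with the Nx value to float32 precision, the Elixir port is probably fine and the Python validation was the part that was off.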