Word Morphing – an original idea

From now on, I’ll overload a node symbol n to be its associated word’s embedding vector.

To use the word2vec embeddings, we’ll download Google’s pre-trained embeddings and use the gensim package to access them.

Given the cosine similarity distance function, we can define our f function to be:

[Equation: definition of the weight function using cosine similarity, limited to nearest neighbors]

where neighbors(n₁) denotes the nearest nodes in the graph to n₁ in terms of cosine similarity.

However, we’ll use the observation that if the vectors have length 1, then the cosine similarity can be obtained using a monotonic transformation over the Euclidean distance: for unit vectors a and b, ‖a − b‖² = 2(1 − cos(a, b)), so ranking by Euclidean distance gives the same order as ranking by cosine similarity.

[Equation: definition of the weight function using Euclidean distance]

So to sum up, we’ll normalize the word embeddings, use the Euclidean distance as a means to find semantically similar words, and use the same Euclidean distance to direct the A* search process in order to find the optimal path.

I chose neighbors(n) to include its 1000 nearest nodes. However, in order to make the search more efficient, we can dilute these using a dilute_factor of 10: we pick the nearest neighbor, the 10th nearest neighbor, the 20th, and so on, until we have 100 nodes. The intuition behind it is that the best path from some intermediate node to E might pass through its nearest neighbor. If it doesn’t, it probably won’t pass through the second neighbor either, since the first and second neighbors might be almost the same. So to save some computations, we just skip some of the nearest neighbors.

And here comes the fun part:

The results:

Implementing the word morphing project was fun, but not as fun as playing around and trying this tool on whatever pair of words I could think of. I encourage you to go ahead and play with it yourself.
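If you want a starting point for experimenting, the recipe described above (normalize, rank neighbors by Euclidean distance, dilute them, then run A*) can be sketched roughly as follows. This is not the code from the original post: the function names `normalize`, `euclidean`, `diluted_neighbors` and `morph`, and the toy words `w0` to `w4`, are all made up for illustration, and a tiny hand-built 2D vocabulary stands in for the real word2vec embeddings. It uses a brute-force sort where a real run would use a proper nearest-neighbor lookup.

```python
import heapq
import math

def normalize(v):
    """Scale a vector to length 1."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def diluted_neighbors(word, vectors, n_nearest=1000, dilute_factor=10):
    """Rank all other words by distance, keep the n_nearest closest,
    then keep only every dilute_factor-th of those (indices 0, 10, 20, ...)."""
    ranked = sorted((w for w in vectors if w != word),
                    key=lambda w: euclidean(vectors[word], vectors[w]))
    return ranked[:n_nearest][::dilute_factor]

def morph(start, end, vectors, n_nearest=1000, dilute_factor=10):
    """A* search from start to end. Edge weight and heuristic are both the
    Euclidean distance; only diluted nearest neighbors are reachable."""
    goal = vectors[end]
    # Heap entries: (cost so far + heuristic, cost so far, word, path).
    frontier = [(euclidean(vectors[start], goal), 0.0, start, [start])]
    best_cost = {start: 0.0}
    while frontier:
        _, cost, word, path = heapq.heappop(frontier)
        if word == end:
            return path
        if cost > best_cost.get(word, math.inf):
            continue  # stale entry: a cheaper route to this word was found
        for nxt in diluted_neighbors(word, vectors, n_nearest, dilute_factor):
            new_cost = cost + euclidean(vectors[word], vectors[nxt])
            if new_cost < best_cost.get(nxt, math.inf):
                best_cost[nxt] = new_cost
                priority = new_cost + euclidean(vectors[nxt], goal)
                heapq.heappush(frontier, (priority, new_cost, nxt, path + [nxt]))
    return None  # end is not reachable through the neighbor graph

# Toy vocabulary (hypothetical): five words spaced 20 degrees apart on a
# circle of radius 3, then normalized to unit length.
raw = {f"w{i}": [3 * math.cos(math.radians(20 * i)),
                 3 * math.sin(math.radians(20 * i))] for i in range(5)}
vectors = {word: normalize(v) for word, v in raw.items()}

# With only 2 neighbors per word, w4 cannot be reached from w0 in one step,
# so the search has to morph through intermediate words.
path = morph("w0", "w4", vectors, n_nearest=2, dilute_factor=1)
print(path)  # ['w0', 'w2', 'w3', 'w4']
```

The neighbor restriction is what makes the morphing interesting: without it, the triangle inequality would make the direct hop from the start word to the end word the cheapest path every time.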
Let me know in the comments what interesting and surprising morphings you have found :)

This post was originally posted at www.anotherdatum.com.

[1] https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality
[2] https://www.cs.auckland.ac.nz/courses/compsci709s2c/resources/Mike.d/astarNilsson.pdf

Bio: Yoel Zeldes is an Algorithms Engineer at Taboola and is also a Machine Learning enthusiast, who especially enjoys the insights of Deep Learning.
