Word Embeddings

Word embeddings were a breakthrough finding in the field of NLP. They allow huge amounts of rich, contextual word-relationship information to be stored in a condensed and efficient manner. Since their genesis with Word2Vec, there have been a number of advances in the field, arriving at BERT today.

Word2Vec-style embeddings are trained with one of two architectures: Skip-Gram (SG) or Continuous Bag of Words (CBOW). The difference between them is illustrated below:

(Figure: the CBOW and Skip-Gram architectures)

CBOW's output is a word prediction based on its surrounding words. The order of the context words does not matter.

SG's output is a set of predicted surrounding (context) words based on the input word.

CBOW is faster to train, while SG is slower but does a better job for infrequent words.

In all cases, cosine similarity between vectors is used to calculate how similar one word is to another.
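As a minimal illustration, cosine similarity can be computed directly with NumPy; the 3-dimensional vectors below are made-up toy values, not real embeddings:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = orthogonal."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "word vectors" purely for illustration
king = np.array([0.8, 0.3, 0.1])
queen = np.array([0.75, 0.35, 0.2])
cat = np.array([0.1, 0.9, 0.4])

print(cosine_similarity(king, queen))  # high: the vectors point in a similar direction
print(cosine_similarity(king, cat))    # lower: the vectors are less aligned
```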

Word2Vec was the first popularized neural word embedding method. It takes advantage of only local contexts and is a predictive model. It comes in both CBOW and SG forms.
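A minimal sketch of training Word2Vec with gensim (assuming gensim 4.x; the tiny corpus is just for illustration). The `sg` flag switches between the CBOW and SG forms described above:

```python
from gensim.models import Word2Vec

# Toy corpus: each sentence is a list of tokens
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "chased", "the", "cat"],
    ["dogs", "and", "cats", "are", "pets"],
]

# sg=0 -> CBOW, sg=1 -> Skip-Gram
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=100)

print(model.wv["cat"])                      # the 50-dimensional vector for "cat"
print(model.wv.most_similar("cat", topn=3)) # nearest words by cosine similarity
```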

fastText builds on Word2Vec by learning the vector representation of each word and of the character n-grams within each word. The word vector and its sub-word vectors are then averaged into one vector at each training step. This allows it to infer the meaning of words not seen in the training vocabulary by breaking them down into sub-words. FastText vectors have been shown to be more accurate than Word2Vec vectors by a number of different measures, but take a lot longer to train.
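A rough sketch of the same idea with gensim's FastText implementation (again assuming gensim 4.x); the point of interest is that an unseen word still receives a vector built from its character n-grams:

```python
from gensim.models import FastText

sentences = [
    ["natural", "language", "processing", "is", "fun"],
    ["word", "embeddings", "capture", "word", "meaning"],
]

# min_n / max_n control the lengths of the character n-grams (sub-words)
model = FastText(sentences, vector_size=50, window=3, min_count=1,
                 min_n=3, max_n=5, epochs=100)

# "processor" never appears in the corpus, but FastText composes a vector for it
# from the n-grams it shares with seen words such as "processing"
print(model.wv["processor"])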

Global Vectors for Word Representation

GloVe, unlike Word2Vec, is not a predictive model. It leverages the same intuition behind the co-occurrence matrix used for distributional embeddings, but uses neural methods to decompose the co-occurrence matrix into more expressive and dense word vectors. It hasn't been proven to outperform Word2Vec, but it is faster to train. Both should be experimented with in any case.
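Pre-trained GloVe vectors can be loaded and queried directly; a minimal sketch using gensim's downloader (the model name below is one of the standard pre-trained sets it ships, downloaded on first use):

```python
import gensim.downloader as api

# Pre-trained 100-dimensional GloVe vectors trained on Wikipedia + Gigaword
glove = api.load("glove-wiki-gigaword-100")

print(glove.most_similar("frog", topn=5))  # nearest neighbours by cosine similarity
print(glove.similarity("ice", "steam"))    # cosine similarity between two words
```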

Embeddings from Language Models

Bidirectional Encoder Representations from Transformers
