Gender-Preserving Embeddings

Word embeddings attempt to represent the meaning of words using lower-dimensional real-valued vectors such that words that have similar meaning are embedded close to each other in the embedding space. Counting-based approaches that create word embeddings in a top-down fashion have been overtaken by prediction-based methods, which (randomly) initialise the embeddings and subsequently update them such that the words in their co-occurrence contexts can be accurately predicted. Arguably word2vec¹ is the most popular suite of algorithms for learning word embeddings but the history of neural embeddings dates back to the work by Bengio (2008)....

March 9, 2019 · 4 min · Danushka

Selection Bias

The problem of domain adaptation concerns with adapting a model (e.g. a classifier) on one particular domain (source domain) to be applicable in a different domain (target domain). The main difficulty in doing so is that the distributions from which data points $(x, y)$ are sampled differ between the source and the target domains. This can be a severe problem in many practical applications such as developing systems to diagnose diseases where it is relatively easy to obtain samples from patients diagnosed with that particular disease but difficult to obtain samples from healthy individuals....

March 9, 2019 · 4 min · Danushka