AKA Story

Covering rare words

Goal

This week’s blog post treats a new network architecture, called pointer models, for handling rare words. We will dive into some details of the implementation and give a short analysis of the benefits of this kind of model.

Motivation

Motivation for the introduction of new architectures comes directly from shortcomings of RNN language models, as well as of encoder-decoder frameworks. Rare words, especially named entities, do not receive good word embeddings and hence do not lead to appropriate sentence embeddings, which might be used to initialize a decoder component for predicting an output sequence. Furthermore, the […]
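To make the idea concrete, here is a minimal sketch of the kind of copy mechanism pointer models rely on: a gate mixes the usual vocabulary softmax with an attention distribution over the input tokens, so a rare word can be produced by copying it from the source rather than generating it. All names and shapes below are illustrative, not taken from the post.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def pointer_mixture(vocab_logits, attn_scores, source_ids, gate):
    """Mix a vocabulary softmax with a pointer distribution over source tokens.

    `gate` in [0, 1] decides how much probability mass goes to generating
    from the fixed vocabulary vs. copying a token from the input sequence.
    (Illustrative sketch; `source_ids` are assumed to be in-vocabulary ids.)
    """
    p_vocab = softmax(vocab_logits)          # distribution over the vocabulary
    p_copy = softmax(attn_scores)            # attention over source positions
    p = gate * p_vocab
    for pos, tok in enumerate(source_ids):   # scatter copy mass onto token ids
        p[tok] += (1 - gate) * p_copy[pos]
    return p
```

A named entity that appears in the input thus receives extra probability mass from the copy branch even when its vocabulary embedding is poor.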

Beam Search

Goal

In this short summary we will have a look at the beam search algorithm, which is applied in NLP to optimize the generation of sequences of words or characters.

Motivation

Most recurrent neural networks are optimized to predict the next most probable output based on the history of some input sequence. However, in general this does not lead to the most probable sequence.

Ingredients

RNN, decoder, greedy search, conditioning

Steps

Decoder architectures in the form of recurrent neural networks, LSTMs or GRUs, as used for generating sequences of words or characters, are optimized to predict the next word. Hence during training, the network sees a certain input sequence and should learn to predict the next […]
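The algorithm described above can be sketched as follows: at each step we expand every partial hypothesis, score extensions by summed log-probability, and keep only the `beam_width` best. The toy model at the bottom is invented for illustration; it shows a case where greedy decoding picks the locally best first token but beam search recovers the globally more probable sequence.

```python
import math

def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=10):
    """Keep the `beam_width` highest-scoring partial sequences at each step.

    `step_fn(seq)` returns a dict mapping next tokens to probabilities.
    Scores are summed log-probabilities of the whole sequence.
    """
    beams = [([start_token], 0.0)]  # (sequence, log-prob score)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == end_token:          # finished hypotheses are kept as-is
                candidates.append((seq, score))
                continue
            for tok, p in step_fn(seq).items():
                candidates.append((seq + [tok], score + math.log(p)))
        # prune to the best `beam_width` hypotheses
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq[-1] == end_token for seq, _ in beams):
            break
    return beams[0][0]

# Toy next-token distributions: greedy would pick "a" first (p = 0.6),
# but the best full sequence goes through "b" (0.4 * 0.9 > 0.6 * 0.5).
toy = {
    ("<s>",): {"a": 0.6, "b": 0.4},
    ("<s>", "a"): {"x": 0.5, "</s>": 0.5},
    ("<s>", "b"): {"</s>": 0.9, "x": 0.1},
    ("<s>", "a", "x"): {"</s>": 1.0},
}
best = beam_search(lambda seq: toy[tuple(seq)], "<s>", "</s>", beam_width=2)
# → ['<s>', 'b', '</s>']
```

With `beam_width=1` this degenerates to greedy search, which would commit to "a" and miss the more probable sequence.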

Memory Neural Networks: MemNN

Goal

This summary tries to provide a rough explanation of memory neural networks. In particular, we focus on the existing architectures with external memory components.

Motivation

Many tasks, such as the bAbI tasks, require a long-term memory component in order to understand longer passages of text, like stories. More generally, QA tasks demand accessing memories in a wider context, such as past utterances which date back several days or even weeks.

Ingredients

External memory, RNN, LSTM, embedding model, scoring function, softmax, hops

Steps

Neural networks in general rely on storing information about training data in the weights of their hidden layers. However, current architectures, such as RNNs and LSTMs, limit access to information seen in the past […]
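The ingredients above (embedding model, scoring function, softmax, hops) combine roughly as in this simplified sketch of memory-network-style addressing: the query is scored against every memory slot, a softmax turns the scores into attention weights, and the weighted read is folded back into the query for the next hop. Real models learn separate embedding matrices per hop; here the embeddings are taken as given, so this is only an illustration of the control flow.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_hop(query, memories, outputs):
    """One hop over an external memory.

    `query` (d,) is matched against each row of `memories` (n, d) via a dot
    product; the softmax of the scores weights the rows of `outputs` (n, d),
    and the weighted read is added to the query for the next hop.
    """
    scores = memories @ query          # dot-product scoring against each slot
    weights = softmax(scores)          # attention over memory slots
    read = weights @ outputs           # weighted sum of output embeddings
    return query + read                # refined query for the next hop

def read_memory(query, memories, outputs, hops=3):
    """Stack several hops, as in multi-hop memory networks."""
    for _ in range(hops):
        query = memory_hop(query, memories, outputs)
    return query
```

Each additional hop lets the model condition its next memory lookup on what it has already retrieved, which is what allows reasoning over several supporting facts in a story.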