AKA Story

Autoencoder

Goal
Autoencoders have long been proposed as a way to tackle the problem of unsupervised learning. In this week's summary we take a look at their ability to provide features that can be successfully used in supervised tasks, and we sketch their architecture.

Motivation
In supervised learning, deeper architectures used to require some kind of pretraining of their layers before the actual supervised task could be pursued. Autoencoders came in handy for this: they allowed training one layer after the other and were able to find useful features for the supervised learning stage.

Ingredients
unsupervised learning, features, representation, encoder, decoder, denoising

Steps
Let us start by looking at the general architecture. An autoencoder consists of two basic parts: the encoder and […]
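To make the encoder/decoder split concrete, here is a minimal sketch of an autoencoder in PyTorch. The layer sizes, the single hidden layer, and the reconstruction loss are illustrative assumptions, not details taken from the post.

```python
# Minimal autoencoder sketch (sizes and framework are assumptions).
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=32):
        super().__init__()
        # Encoder: compresses the input into a low-dimensional representation.
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        # Decoder: reconstructs the input from that representation.
        self.decoder = nn.Sequential(nn.Linear(hidden_dim, input_dim), nn.Sigmoid())

    def forward(self, x):
        code = self.encoder(x)       # the learned feature representation
        return self.decoder(code)    # the reconstruction of the input

model = Autoencoder()
x = torch.rand(16, 784)                      # toy batch of inputs
loss = nn.functional.mse_loss(model(x), x)   # reconstruction error to minimize
```

After training, the output of the encoder can serve as the feature representation that is handed to a supervised model.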

Beam Search

Goal
In this short summary we take a look at the beam search algorithm, which is applied in NLP to optimize the generation of sequences of words or characters.

Motivation
Most recurrent neural networks are optimized to predict the next most probable output given the history of some input sequence. In general, however, this does not lead to the most probable sequence overall.

Ingredients
RNN, decoder, greedy search, conditioning

Steps
Decoder architectures in the form of recurrent neural networks, LSTMs or GRUs, as they are used for generating sequences of words or characters, are optimized to predict the next word. Hence, during training the network sees a certain input sequence and should learn to predict the next […]
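The following sketch shows the core of beam search in Python. The interface next_probs(seq), which returns next-token probabilities given the sequence so far, is a hypothetical stand-in for an RNN decoder; the beam width and length limit are arbitrary illustrative choices.

```python
import math

def beam_search(next_probs, start_token, end_token, beam_width=3, max_len=20):
    """Keep the beam_width most probable partial sequences at each step,
    instead of greedily committing to the single best next token."""
    beams = [([start_token], 0.0)]  # (sequence so far, accumulated log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == end_token:
                candidates.append((seq, score))  # finished hypotheses are kept as-is
                continue
            # expand the hypothesis by every possible next token
            for token, prob in next_probs(seq).items():
                candidates.append((seq + [token], score + math.log(prob)))
        # prune: keep only the beam_width highest-scoring hypotheses
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0][0]  # best full sequence found
```

With beam_width = 1 this reduces to greedy search; larger widths trade computation for sequences with higher overall probability.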

Attention/Memory in Deep Learning

Goal
Attention mechanisms in neural networks are a fairly recent development, and we provide some background on them here.

Motivation
Generally speaking, attention mechanisms allow the network to focus on only a certain subset of the data provided for a given task. Being able to single out the information that is necessary at a specific step of a task further reduces the amount of information that has to be processed.

Ingredients
recurrent neural networks, convolutional neural networks, encoder, decoder, embedding, weights, memory, reinforcement learning

Steps
The idea behind attention mechanisms is clearly inspired by observations of human visual attention. Instead of processing the whole visual input at the same time, humans rather pay attention to small regions one after […]
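As a rough illustration of how attention weights select a subset of the available information, here is a small dot-product attention sketch in NumPy. The dimensions and the scaled dot-product scoring function are assumptions for illustration; the post may discuss other attention variants.

```python
import numpy as np

def attention(query, keys, values):
    """Weight each value by how well its key matches the query,
    then return the weighted sum (the attended context)."""
    scores = keys @ query / np.sqrt(query.shape[0])  # one relevance score per memory slot
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                         # softmax -> attention weights summing to 1
    return weights @ values                          # weighted combination of values

keys = np.random.randn(5, 8)     # 5 memory slots, each of dimension 8
values = np.random.randn(5, 8)
query = np.random.randn(8)       # e.g. the current decoder state
context = attention(query, keys, values)
```

The attention weights concentrate on the memory slots whose keys resemble the query, so the network effectively reads only the relevant part of its memory at each step.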

Sequence-to-Sequence

In this summary I would like to provide a rough overview of sequence-to-sequence neural network architectures and the purposes they serve.

Motivation
A key observation when dealing with neural networks is that they can only handle objects of a fixed size. This means the architecture has to be adapted if sequences such as sentences are to be processed. The same problem with objects of variable length also appears at the dialog level, where a certain number of utterances and responses string together. Besides dialog modeling, speech recognition and machine translation also demand such advanced neural networks.

Ingredients
Deep neural network, hidden layer, recurrent neural network, encoder, decoder, LSTM, back-propagation, word embedding, sentence embedding

Steps
As already stated, standard neural networks cannot deal with […]
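A minimal sequence-to-sequence sketch in PyTorch, assuming a GRU-based encoder and decoder, illustrates how a variable-length input sequence is compressed into a fixed-size state that conditions the output sequence. Vocabulary and layer sizes are made-up illustrative values.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Encoder-decoder sketch: the encoder summarizes the input sequence
    into a fixed-size hidden state (a sentence embedding), which initializes
    the decoder that emits the output sequence token by token."""
    def __init__(self, vocab_size=1000, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src, tgt):
        _, state = self.encoder(self.embed(src))      # fixed-size summary of the input
        dec_out, _ = self.decoder(self.embed(tgt), state)
        return self.out(dec_out)                      # next-token logits per position

model = Seq2Seq()
src = torch.randint(0, 1000, (2, 7))   # batch of 2 input sequences of length 7
tgt = torch.randint(0, 1000, (2, 5))   # corresponding output sequences of length 5
logits = model(src, tgt)               # shape (2, 5, 1000)
```

The fixed-size hidden state is what lets the architecture bridge input and output sequences of different, variable lengths.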