{"id":997,"date":"2016-03-04T01:05:55","date_gmt":"2016-03-04T01:05:55","guid":{"rendered":"http:\/\/ec2-52-79-86-100.ap-northeast-2.compute.amazonaws.com\/?p=997"},"modified":"2024-05-01T11:54:26","modified_gmt":"2024-05-01T02:54:26","slug":"sequence-to-sequence","status":"publish","type":"post","link":"https:\/\/blog.themusio.com\/?p=997","title":{"rendered":"Sequence-to-Sequence"},"content":{"rendered":"<p>In this summary I like to provide a rough overview of Sequence-To-Sequence neural network architectures and what purposes these serve.<\/p>\n<p><strong>Motivation<\/strong><br \/>\nA key observation when dealing with neural networks is that these can only handle objects of a fixed size.<br \/>\nThis means that the architecture has to be adopted if sequences like sentences should be process-able.<br \/>\nThe same problems with objects of variable length also appear on the dialog level, where a certain number of utterances and responses string together.\u00a0Besides dialog modeling, speech recognition and machine translation demand for advanced neural networks.<\/p>\n<p><strong>Ingredients<\/strong><br \/>\nDeep neural network, hidden layer, recurrent neural network, encoder, decoder, LSTM, back-propagation, word embedding, sentence embedding<\/p>\n<p><strong>Steps<\/strong><br \/>\nAs already stated standard neural networks can not deal with sequences of variable lengths.<br \/>\nMoreover they have no knowledge of the previous input.<br \/>\nHowever this is of big importance for understanding sentences for example.<\/p>\n<p>For this reasons, altered neural networks architectures where proposed and pursued.<br \/>\nA first step are recurrent neural networks which can process variable sequences of fixed size objects, as words in sentences.<br \/>\nThey also solve the problem of keeping knowledge about the previous input by passing a state in the hidden layer.<br \/>\nFor certain mathematical shortcomings of standard neurons, this does not allow for endless stretching back memories about previous inputs.<br \/>\nA proposal to deal with these memory issues are Long-Short-Term Memory and GRU cells replacing neurons.<\/p>\n<p>Only within recent years, such tuned recurrent neural networks were used as encoders and decoders in the framework of Sequence-To-Sequence architectures.<br \/>\nThe encoder part maps an input, e.g. a sequence of words to a fixed-size vector by processing word by word.<br \/>\nThe vector output can then be considered as an sentence embedding, which abstractly stores the meaning.<br \/>\nIn a second step the decoder maps the abstract vector into an output sequence, by spitting out word by word.<br \/>\nIn this way the network architecture is able to respond to an utterance with an response.<br \/>\nLast year this concept was generalized to including a dialog encoder layer on top of the standard encoder.<br \/>\nThis might further enhance the architecture to keep track of previous utterances in a full dialog.<\/p>\n<p>The Sequence-To-Sequence architectures as every machine learning system has to undergo a certain training process.<br \/>\nHere, the encoder and the decoder are trained together by presenting corresponding sequence pairs to them.<br \/>\nOptimization methods using back-propagation algorithms can be borrowed from standard neural networks.<\/p>\n<p>As for every deep neural network the amount of available training data is crucial for achieving a good performance.<br \/>\nInteresting data sets for dialog modeling exist as Subtitles of Movie corpora or Scripts for theater plays and TV series.<\/p>\n<p>Latest results show that such architectures are able to model dialogs of the previous form well.<br \/>\nFor specific data sets from IT help-desk discussions a given technical problem can sometimes be addressed properly.<\/p>\n<p>With focus on Musio and it&#8217;s emotional abilities it is of importance to generate specific datasets that capture proper emotional responses.<\/p>\n<p><strong>Resources<br \/>\n<\/strong>&#8220;<a href=\"http:\/\/arxiv.org\/pdf\/1409.3215v3.pdf\" target=\"_blank\" rel=\"noopener\">Sequence to Sequence Learning with Neural Networks<\/a>&#8220;(PDF). <em>Sequence to Sequence Learning with Neural Networks. <\/em>December 2014.\u00a0Retrieved Feburary 26, 2016.<br \/>\n&#8220;<a href=\"http:\/\/arxiv.org\/pdf\/1406.1078.pdf\" target=\"_blank\" rel=\"noopener\">Learning Phrase Representations using RNN Encoder\u2013Decoder for Statistical Machine Translation<\/a>&#8220;(PDF). Learning Phrase Representations using RNN Encoder\u2013Decoder for Statistical Machine Translation. September 2014.\u00a0Retrieved Feburary\u00a026, 2016.<br \/>\n&#8220;<a href=\"http:\/\/arxiv.org\/pdf\/1507.04808v2.pdf\" target=\"_blank\" rel=\"noopener\">Building End-To-End Dialogue Systems:\u00a0Using Generative Hierarchical Neural Network Models<\/a>&#8220;(PDF).\u00a0<em>Building End-To-End Dialogue Systems:\u00a0Using Generative Hierarchical Neural Network Models<\/em>. November 2015.\u00a0Retrieved Feburary\u00a026, 2016.<br \/>\n&#8220;<a href=\"http:\/\/www.aclweb.org\/anthology\/D13-1176\" target=\"_blank\" rel=\"noopener\">Recurrent Continuous Translation Models<\/a>&#8221; (PDF). <em>Recurrent Continuous Translation Models<\/em>. October 2013.\u00a0Retrieved Feburary\u00a026, 2016.<br \/>\n&#8220;<a href=\"http:\/\/arxiv.org\/pdf\/1308.0850.pdf\" target=\"_blank\" rel=\"noopener\">Generating Sequences With Recurrent Neural Networks<\/a>&#8221; (PDF). <em>Generating Sequences With Recurrent Neural Networks.\u00a0<\/em>June\u00a02014.\u00a0Retrieved Feburary\u00a026, 2016.<br \/>\n&#8220;<a href=\"http:\/\/arxiv.org\/pdf\/1409.0473.pdf\" target=\"_blank\" rel=\"noopener\">NEURAL MACHINE TRANSLATION\u00a0BY JOINTLY LEARNING TO ALIGN AND TRANSLATE<\/a>&#8221; (PDF).\u00a0<em>NEURAL MACHINE TRANSLATION\u00a0BY JOINTLY LEARNING TO ALIGN AND TRANSLATE. <\/em>April 2015.\u00a0Retrieved Feburary\u00a026, 2016.<br \/>\n&#8220;<a href=\"http:\/\/www.machinelearning.org\/proceedings\/icml2006\/047_Connectionist_Tempor.pdf\" target=\"_blank\" rel=\"noopener\">Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks<\/a>&#8221; (PDF). Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks. 2006.\u00a0Retrieved Feburary\u00a026, 2016.<br \/>\n&#8220;<a href=\"https:\/\/github.com\/julianser\/hed-dlg\" target=\"_blank\" rel=\"noopener\">Hierarchical Encoder Decoder for Dialog Modelling<\/a>&#8221; (GIT).\u00a0<em>Hierarchical Encoder Decoder for Dialog Modelling<\/em>.\u00a0Retrieved February 26, 2016.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this summary I like to provide a rough overview of Sequence-To-Sequence neural network architectures and wh [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[3642,3640],"tags":[3650,3652,3760,4304,3656,3762,3658,3700,3664,4002,4306,4006,4308,4274,4278,3710,4310,4312,3932,4010,3840,3906,3854,3712,3714,4314,4266],"class_list":["post-997","post","type-post","status-publish","format-standard","hentry","category-ai-en","category-all-en","tag-ai-ja-en","tag-aka-ja-en","tag-artificial-intelligence-en","tag-back-propagation-en","tag-baggage-en","tag-children-book-ja-en","tag-christmas-en","tag-cmos-en","tag-crowd-funding-en","tag-decoder-en","tag-deep-neural-network-en","tag-encoder-en","tag-hidden-layer-en","tag-hinoki-en","tag-lstm-en","tag-musio-en","tag-n-gram-en","tag-naive-bayesian-classifier-en","tag-natural-conversation-ja-en","tag-natural-interaction-en","tag-natural-language-processing-ja-en","tag-negative-sampling-en","tag-neural-network-en","tag-neural-networks-en","tag-recurrent-neural-network-en","tag-sentence-embedding-en","tag-word-embedding-en"],"aioseo_notices":[],"jetpack_sharing_enabled":true,"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/blog.themusio.com\/index.php?rest_route=\/wp\/v2\/posts\/997","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.themusio.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.themusio.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.themusio.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.themusio.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=997"}],"version-history":[{"count":2,"href":"https:\/\/blog.themusio.com\/index.php?rest_route=\/wp\/v2\/posts\/997\/revisions"}],"predecessor-version":[{"id":10902,"href":"https:\/\/blog.themusio.com\/index.php?rest_route=\/wp\/v2\/posts\/997\/revisions\/10902"}],"wp:attachment":[{"href":"https:\/\/blog.themusio.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=997"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.themusio.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=997"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.themusio.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=997"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}