Performance Evaluation of Bach’s Retrieval & Scoring System
Overview
Most current applications of automated dialogue systems rely on narrowly focused language understanding and simple models of dialogue interaction. Understanding language and generating natural dialogue are important for building friendly dialogue system interfaces, and they are particularly critical in settings where the speaker is focused on a 1D situation. Real human conversation is highly context-dependent, and speakers jointly build contributions to the shared context. That is, human dialogue has a complex structure of its own and also exhibits a complex network of relations to other dialogues. AKA has continuously worked to build friendly dialogue interfaces and to understand situation- and context-dependent interpretation of speaker utterances, including across multiple situations. Bach, a multiple-linked dialogue data platform for AKA’s dialogue system, is our solution for appropriate information retrieval and more flexible dialogue with robust understanding.
Data Retrieval System
Bach’s retrieval system gives many users concurrent, efficient access to multiple dialogue collections covering different topics and multimodal character viewpoints. In general, the effectiveness of a retrieval system is measured by comparing response time and the number of extracted datapoints on a common set of queries and responses. The shorter the response time and the more datapoints extracted, the better the retrieval system. The improvements in response time and extracted datapoints are shown in the figures below.
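As a rough illustration of this kind of measurement, the sketch below runs a retriever over a fixed query set and reports mean response time and mean number of returned datapoints. It is not Bach's evaluation harness: the `evaluate` function, the toy corpus, and `toy_retrieve` are all hypothetical names introduced here for illustration.

```python
import time
from statistics import mean


def evaluate(retrieve, queries):
    """Run `retrieve` on each query; return mean latency (seconds) and mean result count."""
    latencies, counts = [], []
    for query in queries:
        start = time.perf_counter()
        results = retrieve(query)
        latencies.append(time.perf_counter() - start)
        counts.append(len(results))
    return {"mean_latency_s": mean(latencies), "mean_datapoints": mean(counts)}


# Hypothetical toy corpus and retriever, standing in for a real dialogue collection.
CORPUS = [
    "how to reset the device",
    "battery charging guide",
    "reset password steps",
]


def toy_retrieve(query):
    """Return every document that shares at least one term with the query."""
    terms = set(query.lower().split())
    return [doc for doc in CORPUS if terms & set(doc.split())]


# The same query set should be reused when comparing two retrievers.
QUERIES = ["reset", "battery guide"]
report = evaluate(toy_retrieve, QUERIES)
print(report)
```

Running two retrievers through `evaluate` with the same `QUERIES` list gives directly comparable numbers, which is the point of using a common query set.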
Scoring System
Large-scale retrieval systems can be regarded as a composition of multiple phases; two of them, filtering and ranking, are particularly important for an IR scoring system to generate the best response. Bach has its own filtering and ranking techniques that understand the content of documents and queries and infer probable relationships between them. AKA has developed a confirmation strategy that finds the speaker’s intended item among the retrieval results by making use of a synthetic tree structure of the manual and rating each utterance. We also make use of the distribution of document statistics, knowledge bases, and BERT-based pre-trained models. The improvement in IR quality from the new scoring system is shown in the figures below.
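The two-phase idea can be sketched in a few lines. The example below is only a minimal stand-in under stated assumptions: filtering keeps documents that share a term with the query, and ranking uses a simple smoothed TF-IDF score in place of Bach's actual techniques (document statistics, knowledge bases, and BERT-based models), which are not described here. All function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def filter_phase(query, docs):
    """Phase 1: keep only documents sharing at least one term with the query."""
    query_terms = set(tokenize(query))
    return [doc for doc in docs if query_terms & set(tokenize(doc))]


def rank_phase(query, candidates, docs):
    """Phase 2: order candidates by a smoothed TF-IDF score (illustrative only)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(tokenize(doc)))

    def score(doc):
        term_freq = Counter(tokenize(doc))
        return sum(
            term_freq[term] * math.log((n_docs + 1) / (doc_freq[term] + 1))
            for term in set(tokenize(query))
        )

    return sorted(candidates, key=score, reverse=True)


def search(query, docs):
    return rank_phase(query, filter_phase(query, docs), docs)


# Hypothetical toy documents.
DOCS = ["reset the device", "reset reset password", "battery guide"]
ranked = search("reset password", DOCS)
print(ranked)
```

Keeping the cheap filter in front of the scorer means the more expensive ranking step only sees a small candidate set, which is the usual motivation for splitting retrieval into phases.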
Context-dependent Model
A query is usually a poor expression of an information need. Many relevant terms may be absent from a query, and the terms it does include may be ambiguous. Therefore, AKA has continuously developed context-dependent deep learning networks that understand query-specific context and infer the best response to a given query. To model contextual relevance between document-query pairs, we used attention-based neural networks. Attention learns to weight sub-units within documents according to their importance and extracts the most relevant information for prediction. We found that our attention-based models achieve competitive performance compared with other context-dependent pretrained DNNs, and that the resulting IR system outperformed the previous one.
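To make the weighting step concrete, here is a minimal sketch of scaled dot-product attention pooling over document sub-units, written in plain Python. It assumes the query and each sub-unit are already represented as fixed-length vectors; the vectors below are made up, and the function `attend` is a hypothetical name, not AKA's model, which would learn these representations during training.

```python
import math


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query_vec, unit_vecs):
    """Weight each sub-unit by its scaled dot product with the query, then pool."""
    scale = math.sqrt(len(query_vec))
    scores = [
        sum(q * u for q, u in zip(query_vec, unit)) / scale for unit in unit_vecs
    ]
    weights = softmax(scores)
    dim = len(unit_vecs[0])
    # Weighted sum of sub-unit vectors: the query-specific document summary.
    pooled = [
        sum(w * unit[i] for w, unit in zip(weights, unit_vecs)) for i in range(dim)
    ]
    return weights, pooled


# Made-up 2-dimensional vectors: one query and three document sub-units.
query = [1.0, 0.0]
units = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]
weights, pooled = attend(query, units)
print(weights, pooled)
```

The sub-unit most aligned with the query receives the largest weight, so it dominates the pooled vector passed on to the prediction step.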
In future work, we plan to investigate integrating contextual factors into the IR system, such as the speaker’s domain of interest, knowledge about the subject, preferences, document recency, and AKA’s unique information. AKA’s quest for a perfect conversational engine continues.