Meet Musio

Open Domain Dialogue Dataset Comparison Report

Bach vs. Others. This document compares curated, publicly available open-domain dialogue datasets with the data produced by AKA’s Bach data platform. The current report focuses on quantitative measurements that can be made transparently and that reflect objective differences in the data. The analysis used the following criteria:

Total Number of Tokens: The number of tokens measures the overall size of the dataset, which matters greatly when training modern deep-learning models. The Bach dataset is clearly ahead of the others on this measure. Higher is better.

Vocabulary Size: Vocabulary size is the number of unique tokens appearing in the dataset. It represents the variety of speech in the dialogues. Our dataset […]
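As a rough illustration of the two criteria named in the excerpt, the sketch below counts total tokens and unique tokens for a tiny made-up dataset. The excerpt does not say how the report tokenizes text, so the lowercase whitespace split used here is an assumption, and the `dataset_stats` helper and the `toy` data are hypothetical names introduced only for this example.

```python
from collections import Counter


def dataset_stats(dialogues):
    """Return (total_tokens, vocabulary_size) for a list of dialogues.

    Each dialogue is a list of utterance strings. Tokenization is a
    lowercase whitespace split, which is an assumption made for this
    sketch and not necessarily what the report uses.
    """
    counts = Counter()
    for dialogue in dialogues:
        for utterance in dialogue:
            counts.update(utterance.lower().split())
    total_tokens = sum(counts.values())   # overall dataset size
    vocabulary_size = len(counts)         # number of unique tokens
    return total_tokens, vocabulary_size


# Hypothetical toy data: two dialogues with two utterances each.
toy = [
    ["Hi there", "hi how are you"],
    ["I am fine", "how about you"],
]

total, vocab = dataset_stats(toy)
print(total, vocab)  # prints "12 9"
```

A different tokenizer (for example, one that separates punctuation or uses subwords) would change both numbers, so figures are only comparable across datasets when the same tokenization is applied to all of them.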

AKA’s Paper (ReSmart) is accepted by HIMS 2020

AKA’s paper has been accepted by the international conference HIMS (Health Informatics and Medical Systems): //americancse.org/events/csce2020/conferences/hims20 (July 2020)