AE-HCN Datasets (ICASSP 2019)

Data for the paper "Contextual Out-of-Domain Utterance Handling with Counterfeit Data Augmentation" by Sungjin Lee and Igor Shalyminov [Paper] [Slides]

Datasets:

babi_task6 - clean version of bAbI Dialog Task 6 for Hybrid Code Network training
babi_task6_ood_0.2_0.4 - bAbI Dialog Task 6 augmented with out-of-domain (OOD) turns. OOD turns are distributed as follows: an OOD turn sequence starts with probability p_start=0.2 and continues with probability p_cont=0.4. Every OOD sequence ends with a segment-level OOD turn. For more detail on the data augmentation, see our papers: 1 and 2
Google datasets - coming soon
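
The p_start / p_cont scheme described for babi_task6_ood_0.2_0.4 can be illustrated with a minimal Python sketch. This is not the authors' augmentation code. It assumes that an OOD sequence may be inserted before each in-domain turn and that the last turn of each sequence is the segment-level one. The function name `augment_dialog`, the three labels, and the placeholder utterance text are hypothetical and do not reflect the format of the released files.

```python
import random

# Hypothetical labels, used for illustration only.
IN_DOMAIN = "in_domain"
OOD = "ood"
OOD_END = "ood_segment_end"
PLACEHOLDER = "<ood utterance>"


def augment_dialog(turns, p_start=0.2, p_cont=0.4, rng=None):
    """Insert OOD turn sequences into a list of in-domain turns.

    Assumption: before every in-domain turn, a sequence starts with
    probability p_start and is extended by one more turn with
    probability p_cont each time. The last turn of a sequence is
    relabelled as the segment-level OOD turn.
    """
    rng = rng or random.Random()
    augmented = []
    for turn in turns:
        if rng.random() < p_start:
            augmented.append((OOD, PLACEHOLDER))
            while rng.random() < p_cont:
                augmented.append((OOD, PLACEHOLDER))
            # Close the sequence with a segment-level OOD turn.
            augmented[-1] = (OOD_END, PLACEHOLDER)
        augmented.append((IN_DOMAIN, turn))
    return augmented


dialog = ["hello", "i want cheap food", "thank you"]
for label, text in augment_dialog(dialog, rng=random.Random(0)):
    print(label, text)
```

Under these assumptions the length of a started sequence is geometric, so with p_cont=0.4 a sequence averages 1 / (1 - 0.4), or about 1.67 turns.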

Data augmentation code can be found in this repo
