
ICARUS

(Interruption-CApable Replies Using Seq2Seq)

This is the code for "When can I Speak? Predicting initiation points for spoken dialogue agents."

1. Installation

pip install librosa torchaudio torch transformers pandas
  • Install fairseq for the wav2vec 1.0 model.
    • Then edit the trainer files to point to wherever you placed your wav2vec_large.pt file (see the loading sketch after this list).
  • You will also need Meticulous if you want to train your own GMM / Heatmap / MSE-based models.
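For reference, loading a wav2vec 1.0 checkpoint with fairseq usually follows the pattern below (mirroring the fairseq wav2vec example); the exact integration lives in the trainer files, and the path here is just an example:

import torch
from fairseq.models.wav2vec import Wav2VecModel

# Example path; point this at wherever you placed the checkpoint.
cp = torch.load("wav2vec_large.pt", map_location="cpu")
model = Wav2VecModel.build_model(cp["args"], task=None)
model.load_state_dict(cp["model"])
model.eval()

# Feature extraction on 16 kHz audio: z from the encoder,
# c from the context network.
wav = torch.randn(1, 16000)
z = model.feature_extractor(wav)
c = model.feature_aggregator(z)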

2. Data

The Switchboard dataset is not included in this repository and must be downloaded separately.

To reproduce the training, eval, and test data used in this work, you will need our time-aligned version of the Switchboard Dialogue Act corpus (SwDA) and the original Switchboard audio files and transcripts (the .mrk files).

For easier replication, we include our timed version of SwDA in this repository (timed_swda_files.zip).

We have code in this repository to align individual utterances from SwDA .utt.csv files to word timestamps in the Switchboard LDC release, and we are working on releasing this timestamped version.

Then, extract the train, validation, and test set data in the following manner.

python3 swda_data_split.py --part="train" --seq --size=200
python3 swda_data_split.py --part="val" --seq --size=20
python3 swda_data_split.py --part="test" --seq --size=20

Using the --seq flag processes the dialogues sequentially. Omitting it processes the dialogues in parallel; the size of the thread pool can be adjusted in swda_data_split.py.
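The parallel path is a standard thread-pool pattern; as a minimal sketch (the worker and conversation IDs below are placeholders, not the script's actual identifiers):

from concurrent.futures import ThreadPoolExecutor

def process_dialogue(conv_id):
    # Placeholder worker: stands in for whatever per-conversation
    # extraction swda_data_split.py performs.
    print(f"processing {conv_id}")

conversations = ["sw2001", "sw2005", "sw2008"]  # example conversation IDs
with ThreadPoolExecutor(max_workers=8) as pool:  # adjust pool size here
    list(pool.map(process_dialogue, conversations))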

The code expects a full_data directory at the same level as the swda_data_split.py script, with the following structure:

|- full_data
|---- [Switchboard conversation]
|-------- audio .wav file
|-------- transcript .mrk file with word-level timestamps
|-------- time-aligned SwDA transcript .utt.ts.csv file
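A quick way to sanity-check this layout before running the split is to scan each conversation directory for the three expected file types; a minimal sketch, assuming one subdirectory per conversation:

import os

root = "full_data"
for conv in sorted(os.listdir(root)):
    conv_dir = os.path.join(root, conv)
    if not os.path.isdir(conv_dir):
        continue
    names = os.listdir(conv_dir)
    for ext in (".wav", ".mrk", ".utt.ts.csv"):
        if not any(n.endswith(ext) for n in names):
            print(f"{conv}: missing *{ext}")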

Extracting the data should result in three .hdf5 files in a new processed_data/ directory.
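To confirm the extraction worked, the resulting files can be inspected with h5py; a minimal sketch ("train.hdf5" is an assumed file name, and the internal dataset names are whatever swda_data_split.py writes):

import h5py

# File name is an assumption; use whichever of the three files was produced.
with h5py.File("processed_data/train.hdf5", "r") as f:
    def show(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(name, obj.shape, obj.dtype)
    f.visititems(show)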

3. Run Training

You can run training on three different types of models:

  1. Gaussian Mixture Model (icarus_gmm_trainer.py)
  2. Heatmap (icarus_heatmap_trainer.py)
  3. MSE-based Regression Model (icarus_min_trainer.py)

Running icarus_gmm_trainer.py directly should reproduce our GMM-WGR results on an A100 GPU.
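For intuition about what the GMM trainer optimizes, the usual loss for a mixture-output model is the negative log-likelihood of the target under a predicted Gaussian mixture; the following is a generic mixture-density sketch, not the repository's actual loss code:

import math
import torch
import torch.nn.functional as F

def gmm_nll(logits, means, log_sigmas, target):
    """NLL of scalar targets under a K-component 1-D Gaussian mixture.

    logits, means, log_sigmas: (batch, K) mixture parameters from the model.
    target: (batch,) ground-truth values (e.g. initiation offsets).
    """
    log_pi = F.log_softmax(logits, dim=-1)      # log mixture weights
    t = target.unsqueeze(-1)                    # (batch, 1) for broadcasting
    log_comp = (-0.5 * ((t - means) / log_sigmas.exp()) ** 2
                - log_sigmas - 0.5 * math.log(2 * math.pi))
    return -torch.logsumexp(log_pi + log_comp, dim=-1).mean()

# Toy usage: batch of 4 examples, 3 mixture components.
logits, means, log_sigmas = (torch.randn(4, 3) for _ in range(3))
loss = gmm_nll(logits, means, log_sigmas, torch.randn(4))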

4. Evaluation

icarus_models_preds.py evaluates trained models for their MAE-True and MAE-Pred values. icarus_silent_preds.py evaluates the silence baseline.

Assuming that the trained model is saved as experiments/[NUM]/[MODEL_NAME].pt, model performance can be evaluated like this:

python3 icarus_models_preds.py --model_path=experiments/NUM/MODEL_NAME.pt --exp_num=NUM

If you want to see the actual predictions from the models, you can add the --add_pred flag; the prediction results should then be stored under experiments/NUM/predictions.csv.
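The exported predictions can then be checked with pandas; a minimal sketch (the "pred" and "true" column names are assumptions, so adjust them to the actual header of predictions.csv):

import pandas as pd

df = pd.read_csv("experiments/NUM/predictions.csv")
# Column names are hypothetical; check the file's actual header.
mae = (df["pred"] - df["true"]).abs().mean()
print(f"MAE: {mae:.3f}")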
