diff --git a/README.md b/README.md
index 6c0355c739e4dc8160dae0b205bbfbf1b85e0a84..1abbcd47d5d77317c2135f989d4657423405767f 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,74 @@
-# To Complete
\ No newline at end of file
+# A Recurrent Variational Autoencoder for Speech Enhancement
+
+This repository contains the implementation of the speech enhancement method proposed in:
+
+>S. Leglaive, X. Alameda-Pineda, L. Girin, R. Horaud, [A Recurrent Variational Autoencoder for Speech Enhancement](https://hal.archives-ouvertes.fr/hal-02329000/document), in Proc. of the IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2020.
+
+Audio examples are available [here](https://sleglaive.github.io/demo-icassp2020.html).
+
+If you use this code, please cite the above-mentioned paper ([Bibtex](https://hal.archives-ouvertes.fr/hal-02329000v1/bibtex)).
+
+## Repository structure
+
+```bash
+.
+├── audio
+│   ├── mix_qut_wsj0.wav
+│   ├── mix_sunrise.wav
+│   └── mix_thierry_roland.wav
+├── environment.yml
+├── LICENSE.txt
+├── main.py
+├── README.md
+├── saved_model
+│   ├── WSJ0_2019-07-15-10h01_RVAE_BRNNenc_BRNNdec_latent_dim=16
+│   │   ├── final_model_RVAE_epoch145.pt
+│   │   ├── loss.pdf
+│   │   ├── loss_RVAE.pckl
+│   │   ├── parameters.pckl
+│   │   └── parameters.txt
+│   ├── WSJ0_2019-07-15-10h14_RVAE_RNNenc_RNNdec_latent_dim=16
+│   │   ├── final_model_RVAE_epoch121.pt
+│   │   ├── loss.pdf
+│   │   ├── loss_RVAE.pckl
+│   │   ├── parameters.pckl
+│   │   └── parameters.txt
+│   └── WSJ0_2019-07-15-10h21_FFNN_VAE_latent_dim=16
+│       ├── final_model_RVAE_epoch65.pt
+│       ├── loss.pdf
+│       ├── loss_RVAE.pckl
+│       ├── parameters.pckl
+│       └── parameters.txt
+├── SE_algorithms.py
+└── training
+    ├── speech_dataset.py
+    ├── train_BRNN_WSJ0.py
+    ├── train_FFNN_WSJ0.py
+    ├── train_RNN_WSJ0.py
+    └── VAEs.py
+```
+
+## Python files
+
+* ```main.py```: Main script to run the speech enhancement algorithms. If you just want to test the method quickly, run this script. Input and output audio files are located in the ```audio``` folder.
+
+* ```SE_algorithms.py```: Implementation of the speech enhancement algorithms (MCEM, PEEM, VEM).
+
+* ```./training/speech_dataset.py```: Custom Pytorch dataset for training.
+
+* ```./training/VAEs.py```: Pytorch implementation of the FFNN, RNN and BRNN variational autoencoders (VAEs).
+
+* ```./training/train_FFNN_WSJ0.py```: Script to train the FFNN VAE.
+
+* ```./training/train_RNN_WSJ0.py```: Script to train the RNN VAE.
+
+* ```./training/train_BRNN_WSJ0.py```: Script to train the BRNN VAE.
+
+
+## Conda environment
+
+```environment.yml``` describes the conda environment that was used for the experiments.
+
+## License
+
+GNU Affero General Public License (version 3), see ```LICENSE.txt```.
\ No newline at end of file
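
Note on the `saved_model` layout added by this diff: each model folder name ends in `latent_dim=<n>` and holds one `final_model_RVAE_epoch<k>.pt` file. The sketch below shows one way to enumerate those checkpoints from the names alone. It is not part of the repository: `list_checkpoints` and both regular expressions are hypothetical, and it assumes the naming pattern visible in the tree holds for every folder. It does not load the weights, since the diff does not show how `main.py` does that.

```python
import re
from pathlib import Path

# Hypothetical patterns, inferred only from the names shown in the README tree.
CHECKPOINT_RE = re.compile(r"final_model_RVAE_epoch(\d+)\.pt$")
LATENT_RE = re.compile(r"latent_dim=(\d+)$")


def list_checkpoints(saved_model_dir):
    """Return one dict per final checkpoint found under saved_model_dir."""
    root = Path(saved_model_dir)
    if not root.is_dir():
        # Nothing to list if the folder is missing.
        return []

    found = []
    for folder in sorted(root.iterdir()):
        if not folder.is_dir():
            continue
        latent = LATENT_RE.search(folder.name)
        for path in sorted(folder.glob("final_model_RVAE_epoch*.pt")):
            epoch = CHECKPOINT_RE.search(path.name)
            if epoch is None:
                continue
            found.append({
                "folder": folder.name,
                "checkpoint": path,
                "epoch": int(epoch.group(1)),
                # None when the folder name does not end in latent_dim=<n>.
                "latent_dim": int(latent.group(1)) if latent else None,
            })
    return found


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for entry in list_checkpoints("saved_model"):
        print(entry["folder"], entry["epoch"], entry["latent_dim"])
```

Run from the repository root, this would print one line per model folder with its final epoch and latent dimension, which can help when choosing which pretrained model to pass to the enhancement script.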