Commit 59de1e87 authored by Simon

typo readme

parent b2d8405d
@@ -71,7 +71,7 @@ In order to guide you in this project, you have access to the following Jupyter
 * `3-feature-extraction.ipynb`: In this notebook, you will extract the log-Mel spectrograms for the 3068 audio files in the SONYC-UST dataset. This may take a significant amount of time, so plan ahead! (A minimal extraction sketch appears after the diff.)
-* `4-model-training.ipynb`: In this notebook, you will build and train a convolutional neural network (CNN) to perform urban sound tagging with [Keras](https://keras.io/). Using transfer learning, your CNN will build upon a model called [VGGish](https://github.com/tensorflow/models/tree/master/research/audioset/vggish). It was trained on [AudioSet](https://github.com/tensorflow/models/tree/master/research/audioset), a dataset of over 2 million human-labeled 10-second YouTube video soundtracks, with labels taken from an ontology of more than 600 audio event classes. This represents more than 5 thousand hours of audio. The method you will implement is based on ["Convolutional Neural Networks with Transfer Learning for Urban Sound Tagging"](http://dcase.community/documents/challenge2019/technical_reports/DCASE2019_Kim_107.pdf) that was proposed by Bongjun Kim (Department of Computer Science, Northwestern University, Evnaston, Illinois, USA) and obtained the 3rd best score at the [DCASE 2019 Challange, task 5](http://dcase.community/challenge2019/task-urban-sound-tagging).
+* `4-model-training.ipynb`: In this notebook, you will build and train a convolutional neural network (CNN) to perform urban sound tagging with [Keras](https://keras.io/). Using transfer learning, your CNN will build upon a model called [VGGish](https://github.com/tensorflow/models/tree/master/research/audioset/vggish). It was trained on [AudioSet](https://github.com/tensorflow/models/tree/master/research/audioset), a dataset of over 2 million human-labeled 10-second YouTube video soundtracks, with labels taken from an ontology of more than 600 audio event classes. This represents more than 5 thousand hours of audio. The method you will implement is based on ["Convolutional Neural Networks with Transfer Learning for Urban Sound Tagging"](http://dcase.community/documents/challenge2019/technical_reports/DCASE2019_Kim_107.pdf), which was proposed by Bongjun Kim (Department of Computer Science, Northwestern University, Evanston, Illinois, USA) and obtained the 3rd best score at the [DCASE 2019 Challenge, task 5](http://dcase.community/challenge2019/task-urban-sound-tagging). (A minimal transfer-learning sketch appears after the diff.)
 * `5-model-testing.ipynb`: In this notebook, you will evaluate the performance of your trained CNN using standard metrics for [multi-label classification](https://en.wikipedia.org/wiki/Multi-label_classification). While developing your model, you should only use the validation set of the [SONYC-UST dataset](https://zenodo.org/record/2590742#.XIkTPBNKjuM). Once you are satisfied with the performance on the validation set, you can evaluate the model on the test set. Never evaluate the model on the test set while developing: if you do, you will effectively start fitting your model to the test set. (A minimal evaluation sketch appears after the diff.)
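For reference, here is a minimal sketch of the log-Mel extraction step described in `3-feature-extraction.ipynb`. It is not taken from the notebook itself: it assumes `librosa` is available and uses VGGish-style parameters (16 kHz sampling, 64 mel bands, 25 ms window, 10 ms hop), which may differ from the notebook's exact settings.

```python
# Minimal log-Mel extraction sketch (assumed parameters, not the notebook's code).
import numpy as np
import librosa

def log_mel_spectrogram(path, sr=16000, n_mels=64):
    """Load an audio file and return a (frames, n_mels) log-Mel spectrogram."""
    y, sr = librosa.load(path, sr=sr, mono=True)      # resample to 16 kHz mono
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr,
        n_fft=int(0.025 * sr),       # 25 ms analysis window
        hop_length=int(0.010 * sr),  # 10 ms hop
        n_mels=n_mels,
    )
    return np.log(mel + 1e-6).T      # log compression with a small offset
```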
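Likewise, a hedged sketch of the transfer-learning idea behind `4-model-training.ipynb`: a small Keras head trained on precomputed 128-dimensional VGGish embeddings, with sigmoid outputs and binary cross-entropy because urban sound tagging is multi-label. The layer sizes and the assumption of 8 coarse SONYC-UST tags are illustrative, not the notebook's actual architecture.

```python
# Illustrative classification head on top of VGGish embeddings (assumptions noted).
from tensorflow import keras

NUM_COARSE_TAGS = 8  # coarse SONYC-UST categories (illustrative assumption)

model = keras.Sequential([
    keras.Input(shape=(128,)),                  # one 128-D VGGish embedding
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dropout(0.5),
    # Sigmoid rather than softmax: tags are not mutually exclusive.
    keras.layers.Dense(NUM_COARSE_TAGS, activation="sigmoid"),
])
model.compile(
    optimizer=keras.optimizers.Adam(1e-4),
    loss="binary_crossentropy",                 # one binary decision per tag
    metrics=[keras.metrics.AUC(curve="PR", name="auprc")],
)
model.summary()
```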
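Finally, a minimal sketch of the multi-label evaluation done in `5-model-testing.ipynb`, using scikit-learn (an assumption; the notebook may use other tooling). Here `y_true` and `y_score` are toy placeholders for ground-truth tags and sigmoid outputs of shape `(n_samples, n_classes)`.

```python
# Micro-averaged multi-label metrics with scikit-learn (toy data, for illustration).
import numpy as np
from sklearn.metrics import average_precision_score, f1_score

y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])            # toy ground truth
y_score = np.array([[0.9, 0.2, 0.6], [0.1, 0.8, 0.3], [0.7, 0.6, 0.1]])  # toy outputs

# Micro-averaged AUPRC over all (sample, tag) decisions.
auprc = average_precision_score(y_true, y_score, average="micro")
# Micro-averaged F1 after thresholding the scores at 0.5.
f1 = f1_score(y_true, (y_score >= 0.5).astype(int), average="micro")
print(f"micro-AUPRC = {auprc:.3f}, micro-F1 = {f1:.3f}")
```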