diff --git a/.gitignore b/.gitignore
index d9aca805166aa859d12a51e3fcf603a95b66a407..980cbeefc079cdd24b56ec2fba79bacf557ac5f2 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,5 +1,5 @@
 transformers
-debug_squad
 runs
+debug_squad_*
 cached_*
 squad_train_*
diff --git a/README.md b/README.md
index 094b6bf7c4ab1b84a19db91f324cbbc2ff487fd5..6fdd7a4ffd90cb2628ea0326220e006311ce1661 100644
--- a/README.md
+++ b/README.md
@@ -14,6 +14,7 @@ fusion-output -f <jobid> # watch the job logs during
 
 
 Fusion supercomputer documentation : https://mesocentre.pages.centralesupelec.fr/user_doc/
+Transformers on github : https://github.com/huggingface/transformers
 The documentation for `run_squad.py` can be found here : https://huggingface.co/transformers/examples.html#squad
 
 ## Configure environment
@@ -39,13 +40,17 @@ Run the network training
 ```bash
 qsub <pbs_script>.pbs
 ```
-> Some temporary data is written in directory `--output_dir` (`./debug_squad/`). You may have to clean the directory manually before relaunching the training `rm -r ./debug_squad/`
 
-Two training examples :
+Two training examples are provided :
 
 - `single_gpu_training.pbs` : train the network on a single GPUs
 - `dual_gpu_training.pbs` : train the network on a two GPUs
 
+Notes :
+
+- Some temporary data is written in directory `--output_dir` (`./debug_squad/`). You may have to clean the directory manually before relaunching the training `rm -r ./debug_squad/`
+- During the TP sessions, you can use the reservation `isiaq` instead of the `gpuq` by commenting/decommenting lines beginning with `#PBS -q`)
+
 ## Misc notes
 
 ### Squad dataset location
diff --git a/dual_gpu_training.pbs b/dual_gpu_training.pbs
index 74dc12a502b7b2e565d3385f478a631a6bed59b7..b968b57a4028ac026570cf39042bd94e8343779f 100644
--- a/dual_gpu_training.pbs
+++ b/dual_gpu_training.pbs
@@ -4,7 +4,8 @@
 #PBS -l walltime=02:00:00
 #PBS -l select=1:ncpus=24:ngpus=2:mem=20gb
 #PBS -q gpuq
-#PBS -P test
+##PBS -q isiaq
+#PBS -P isia
 
 # Go to the current directory
 cd $PBS_O_WORKDIR
diff --git a/single_gpu_training.pbs b/single_gpu_training.pbs
index 1c8ee2e6affbff0107dffeffb1a5a7b086899cae..460d289b4870ecc652394150ede7ef608fc6998f 100644
--- a/single_gpu_training.pbs
+++ b/single_gpu_training.pbs
@@ -4,7 +4,8 @@
 #PBS -l walltime=02:00:00
 #PBS -l select=1:ncpus=12:ngpus=1:mem=20gb
 #PBS -q gpuq
-#PBS -P test
+##PBS -q isiaq
+#PBS -P isia
 
 # Go to the current directory
 cd $PBS_O_WORKDIR