Commit c7717d8c authored by Pradat Yoann

remove gitlab ci; add badges

parent b4784d84
image: "python:3.9"

default:
  tags:
    - docker

stages:
  - test

variables:
  CODECOV_TOKEN: "c8bcf054-f0cb-4bf1-8866-1b862905ef89"

before_script:
  - make install

test:
  stage: test
  script:
    - make test
    - bash <(curl -s https://codecov.io/bash)
  only:
    refs:
      - master
......@@ -24,6 +24,9 @@ test:
$(PIP) install --upgrade pytest pytest-cov
$(PYTEST) --cov-config=.coveragerc --cov-report term-missing --cov . .
codecov:
bash <(curl -s https://codecov.io/bash) -t c8bcf054-f0cb-4bf1-8866-1b862905ef89
build:
@echo "---------------Build variant_annotator-----------------"
$(PYTHON) -m pip install --user --upgrade setuptools wheel
......
# Biotool for annotating variants from a VCF file.
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![codecov](https://codecov.io/gh/ypradat/VariantAnnotator/branch/master/graph/badge.svg?token=H821S3WZHS)](https://codecov.io/gh/ypradat/VariantAnnotator)
The tool is divided in 3 steps
- Manual parsing of the VCF
- Run [vcf2maf](https://github.com/mskcc/vcf2maf) to extract standard information
......@@ -7,7 +10,7 @@ The tool is divided in 3 steps
## 1. What is the tool doing ?
VEP annotates variants with information from multiple external databases and can be configure for to answer a lot of
VEP annotates variants with information from multiple external databases and can be configured to answer a lot of
specific needs. For more details, see [VEP's options
page](https://www.ensembl.org/info/docs/tools/vep/script/vep_options.html). VEP does not however extract information
like number of reads or somatic status from the VCF file. vcf2maf is supposed to perform these tasks but failed to do on
......
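The manual parsing step listed in the README is about recovering per-sample fields, such as read counts, that VEP does not extract. A minimal sketch of what that could look like in Python, assuming a standard VCF data line where the FORMAT column lists the keys and each sample column lists the matching values. The function name `parse_sample_fields` and the example line are hypothetical and are not taken from the repository:

```python
def parse_sample_fields(vcf_line, sample_index=0):
    """Map FORMAT keys to the values of one sample column of a VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    # Columns 0-7 are fixed (CHROM..INFO), column 8 is FORMAT, samples follow.
    format_keys = fields[8].split(":")
    sample_values = fields[9 + sample_index].split(":")
    return dict(zip(format_keys, sample_values))


# Hypothetical line with a normal sample column followed by a tumor one.
line = "1\t16386305\t.\tG\tGC\t50\tPASS\t.\tGT:DP:AD\t0/0:11:11,0\t0/1:6:4,2"
tumor = parse_sample_fields(line, sample_index=1)

# AD is assumed to hold "ref_count,alt_count" for a biallelic site.
ref_count, alt_count = (int(x) for x in tumor["AD"].split(","))
alt_fraction = round(alt_count / (ref_count + alt_count), 3)
print(tumor["GT"], tumor["DP"], alt_fraction)
```

Real VCF files vary by caller in which FORMAT keys they provide, so a production parser would need to handle missing keys and multi-allelic sites.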
Chromosome Position dbSNP_RS Tumor_Seq_Allele1 Tumor_Seq_Allele2 Variant_Quality Filter_VCF Hugo_Symbol Variant_Classification Variant_Type Transcript_ID n_GT n_SS n_FA n_DP n_AD t_GT t_SS t_FA t_DP t_AD
1 16386305 rs143272992 G GC 50 PASS FAM131C Intron INS ENST00000375662.4 0/0 2 0.000 11 11,0 0/1 2 0.333 6 4,2
3 147121629 ATC A 50 PASS ZIC4 Intron DEL ENST00000491672.1 0/0 2 0.000 6 6,0 0/1 2 0.333 6 4,2
3 184043925 rs112208190 AAC A 50 PASS EIF4G1 Intron DEL ENST00000392537.2 0/0 2 0.000 7 7,0 0/1 2 0.667 6 2,4
7 22533451 rs116873396 TCA T 50 PASS STEAP1B Frame_Shift_Del DEL ENST00000404369.4 0/0 2 0.000 22 22,0 0/1 2 0.333 6 4,2
11 112042479 CT C 50 PASS TEX12 Intron DEL ENST00000280358.4 0/0 2 0.000 31 31,0 0/1 2 0.333 6 4,2
12 49431403 G GT 50 PASS KMT2D Frame_Shift_Ins INS ENST00000301067.7 0/0 2 0.000 48 48,0 0/1 2 0.212 33 26,7
13 33332313 CA C 50 PASS PDS5B Frame_Shift_Del DEL ENST00000315596.10 0/0 2 0.000 47 47,0 0/1 2 0.333 6 4,2
17 38712160 CT C 50 PASS CCR7 Intron DEL ENST00000246657.2 0/0 2 0.000 5 5,0 0/1 2 0.333 6 4,2
20 50342306 TTC T 50 PASS ATP9A Intron DEL ENST00000338821.5 0/0 2 0.000 22 22,0 0/1 2 0.400 15 9,6
Chromosome Position dbSNP_RS Tumor_Seq_Allele1 Tumor_Seq_Allele2 Variant_Quality Filter_VCF Hugo_Symbol Variant_Classification Variant_Type Transcript_ID n_GT n_SS n_FA n_DP n_AD t_GT t_SS t_FA t_DP t_AD
1 44476442 C T 43 PASS SLC6A9 5'UTR SNP ENST00000372307.3 0/0 2 0.000 69 69,0 0/1 2 0.276 58 42,16
1 244583577 G T 6 PASS ADSS Missense_Mutation SNP ENST00000366535.3 0/0 2 0.000 77 77,0 0/1 2 0.083 36 33,3
2 25678299 C T 24 PASS DTNB Missense_Mutation SNP ENST00000406818.3 0/0 2 0.000 33 33,0 0/1 2 0.471 17 9,8
3 85932472 C T 56 PASS CADM2 Silent SNP ENST00000383699.3 0/0 2 0.000 50 50,0 0/1 2 0.514 37 18,19
6 7986778 G A 27 PASS BLOC1S5-TXNDC5 Intron SNP ENST00000539054.1 0/0 2 0.000 23 23,0 0/1 2 0.435 23 13,10
7 75609837 C G 13 PASS POR Intron SNP ENST00000394893.1 0/0 2 0.000 8 8,0 0/1 2 0.455 11 6,5
7 149129243 G A 16 PASS ZNF777 Missense_Mutation SNP ENST00000247930.4 0/0 2 0.000 18 18,0 0/1 2 0.222 27 21,6
7 150840441 C T 26 PASS AGAP3 Missense_Mutation SNP ENST00000463381.1 0/0 2 0.000 28 28,0 0/1 2 0.375 24 15,9
10 116247760 T C 26 PASS ABLIM1 Missense_Mutation SNP ENST00000392952.3 0/0 2 0.000 72 72,0 0/1 2 0.244 45 34,11
12 43944926 T C 44 PASS ADAMTS20 Missense_Mutation SNP ENST00000389420.3 0/0 2 0.000 50 50,0 0/1 2 0.469 32 17,15
13 50464902 T C 8 PASS IGR SNP 0/0 2 0.000 75 75,0 0/1 2 0.077 65 60,5
14 65266493 T C 20 PASS SPTB Missense_Mutation SNP ENST00000556626.1 0/0 2 0.000 21 21,0 0/1 2 0.400 20 12,8
15 91043489 C T 9 PASS IQGAP1 3'UTR SNP ENST00000268182.5 0/0 2 0.000 18 18,0 0/1 2 0.500 6 3,3
16 88790292 T C 20 PASS PIEZO1 Missense_Mutation SNP ENST00000301015.9 0/0 2 0.000 34 34,0 0/1 2 0.280 25 18,7
17 40272381 G A 99 PASS KAT2A Silent SNP ENST00000225916.5 0/0 2 0.000 35 35,0 0/1 2 0.525 61 29,32
19 42585066 G A 17 PASS ZNF574 Missense_Mutation SNP ENST00000600245.1 0/0 2 0.000 19 19,0 0/1 2 0.280 25 18,7
20 16730581 G A 21 PASS OTOR Missense_Mutation SNP ENST00000246081.2 0/0 2 0.000 52 52,0 0/1 2 0.186 43 35,8
22 23040479 C G 41 PASS IGLV2-23 RNA SNP ENST00000390306.2 0/0 2 0.000 23 23,0 0/1 2 0.520 25 12,13
X 51076024 rs143435240 G A 6 PASS NUDT10 Silent SNP ENST00000376006.3 0/0 2 0.013 78 77,1 0/1 2 0.068 44 41,3
X 77160816 A G 7 PASS COX7B 3'UTR SNP ENST00000481445.1 0/0 2 0.000 51 51,0 0/1 2 0.130 23 20,3
X 77160852 T A 7 PASS COX7B 3'UTR SNP ENST00000481445.1 0/0 2 0.000 35 35,0 0/1 2 0.176 17 14,3
X 78216689 C T 26 PASS P2RY10 Silent SNP ENST00000171757.2 0/0 2 0.000 63 63,0 0/1 2 0.185 54 44,10
X 122757148 A T 6 PASS THOC2 Intron SNP ENST00000245838.8 0/0 2 0.000 48 48,0 0/1 2 0.231 13 10,3
X 152684244 T G 6 PASS ZFP92 Missense_Mutation SNP ENST00000338647.5 0/0 2 0.000 62 62,0 0/1 2 0.059 51 48,3
## ENSEMBL VARIANT EFFECT PREDICTOR v102.0
## Output produced at 2021-02-14 10:37:55
## Using cache in /Users/ypradat/.vep/homo_sapiens/101_GRCh37
## Using API version 101, DB version ?
## ensembl-funcgen version 101.b918a49
## ensembl-variation version 101.851c7e0
## ensembl version 101.856c8e8
## ensembl-io version 101.943b6c2
## COSMIC version 90
## sift version sift5.2.2
## regbuild version 1.0
## polyphen version 2.2.2
## genebuild version 2011-04
## dbSNP version 153
## ESP version 20141103
## assembly version GRCh37.p13
## ClinVar version 201912
## 1000genomes version phase3
## gnomAD version r2.1
## HGMD-PUBLIC version 20194
## gencode version GENCODE 19
## Column descriptions:
## Uploaded_variation : Identifier of uploaded variant
## Location : Location of variant in standard coordinate format (chr:start or chr:start-end)
## Allele : The variant allele used to calculate the consequence
## Gene : Stable ID of affected gene
## Feature : Stable ID of feature
## Feature_type : Type of feature - Transcript, RegulatoryFeature or MotifFeature
## Consequence : Consequence type
## cDNA_position : Relative position of base pair in cDNA sequence
## CDS_position : Relative position of base pair in coding sequence
## Protein_position : Relative position of amino acid in protein
## Amino_acids : Reference and variant amino acids
## Codons : Reference and variant codon sequence
## Existing_variation : Identifier(s) of co-located known variants
## Extra column keys:
## IMPACT : Subjective impact classification of consequence type
## DISTANCE : Shortest distance from variant to transcript
## STRAND : Strand of the feature (1/-1)
## FLAGS : Transcript quality flags
## SYMBOL : Gene symbol (e.g. HGNC)
## SYMBOL_SOURCE : Source of gene symbol
## HGNC_ID : Stable identifer of HGNC gene symbol
## BIOTYPE : Biotype of transcript or regulatory feature
## CANONICAL : Indicates if transcript is canonical for this gene
## MANE : MANE (Matched Annotation by NCBI and EMBL-EBI) Transcript
## TSL : Transcript support level
## APPRIS : Annotates alternatively spliced transcripts as primary or alternate based on a range of computational methods
## CCDS : Indicates if transcript is a CCDS transcript
## ENSP : Protein identifer
## SWISSPROT : UniProtKB/Swiss-Prot accession
## TREMBL : UniProtKB/TrEMBL accession
## UNIPARC : UniParc accession
## UNIPROT_ISOFORM : Direct mappings to UniProtKB isoforms
## SIFT : SIFT prediction and/or score
## PolyPhen : PolyPhen prediction and/or score
## EXON : Exon number(s) / total
## INTRON : Intron number(s) / total
## HGVSc : HGVS coding sequence name
## HGVSp : HGVS protein sequence name
## HGVS_OFFSET : Indicates by how many bases the HGVS notations for this variant have been shifted
## AF : Frequency of existing variant in 1000 Genomes combined population
## AFR_AF : Frequency of existing variant in 1000 Genomes combined African population
## AMR_AF : Frequency of existing variant in 1000 Genomes combined American population
## EAS_AF : Frequency of existing variant in 1000 Genomes combined East Asian population
## EUR_AF : Frequency of existing variant in 1000 Genomes combined European population
## SAS_AF : Frequency of existing variant in 1000 Genomes combined South Asian population
## AA_AF : Frequency of existing variant in NHLBI-ESP African American population
## EA_AF : Frequency of existing variant in NHLBI-ESP European American population
## gnomAD_AF : Frequency of existing variant in gnomAD exomes combined population
## gnomAD_AFR_AF : Frequency of existing variant in gnomAD exomes African/American population
## gnomAD_AMR_AF : Frequency of existing variant in gnomAD exomes American population
## gnomAD_ASJ_AF : Frequency of existing variant in gnomAD exomes Ashkenazi Jewish population
## gnomAD_EAS_AF : Frequency of existing variant in gnomAD exomes East Asian population
## gnomAD_FIN_AF : Frequency of existing variant in gnomAD exomes Finnish population
## gnomAD_NFE_AF : Frequency of existing variant in gnomAD exomes Non-Finnish European population
## gnomAD_OTH_AF : Frequency of existing variant in gnomAD exomes other combined populations
## gnomAD_SAS_AF : Frequency of existing variant in gnomAD exomes South Asian population
## MAX_AF : Maximum observed allele frequency in 1000 Genomes, ESP and ExAC/gnomAD
## MAX_AF_POPS : Populations in which maximum allele frequency was observed
## CLIN_SIG : ClinVar clinical significance of the dbSNP variant
## SOMATIC : Somatic status of existing variant
## PHENO : Indicates if existing variant(s) is associated with a phenotype, disease or trait; multiple values correspond to multiple variants
## PUBMED : Pubmed ID(s) of publications that cite existing variant
## MOTIF_NAME : The stable identifier of a transcription factor binding profile (TFBP) aligned at this position
## MOTIF_POS : The relative position of the variation in the aligned TFBP
## HIGH_INF_POS : A flag indicating if the variant falls in a high information position of the TFBP
## MOTIF_SCORE_CHANGE : The difference in motif score of the reference and variant sequences for the TFBP
## TRANSCRIPTION_FACTORS : List of transcription factors which bind to the transcription factor binding profile
#Uploaded_variation Location Allele Gene Feature Feature_type Consequence cDNA_position CDS_position Protein_position Amino_acids Codons Existing_variation Extra
rs143272992 1:16386305-16386306 C ENSG00000185519 ENST00000375662.4 Transcript intron_variant - - - - - rs372070031 IMPACT=MODIFIER;STRAND=-1;SYMBOL=FAM131C;SYMBOL_SOURCE=HGNC;HGNC_ID=26717;BIOTYPE=protein_coding;CANONICAL=YES;CCDS=CCDS41270.1;ENSP=ENSP00000364814;SWISSPROT=Q96AQ9;UNIPARC=UPI000022B016;INTRON=5/6;HGVSc=ENST00000375662.4:c.451+58dup;AFR_AF=0.2731;AMR_AF=0.3573;EAS_AF=0.3581;EUR_AF=0.3757;SAS_AF=0.5051;MAX_AF=0.5051;MAX_AF_POPS=SAS
3_147121630_TC/- 3:147121630-147121631 - ENSG00000174963 ENST00000525172.2 Transcript intron_variant - - - - - rs142316820 IMPACT=MODIFIER;STRAND=-1;SYMBOL=ZIC4;SYMBOL_SOURCE=HGNC;HGNC_ID=20393;BIOTYPE=protein_coding;CANONICAL=YES;CCDS=CCDS54652.1;ENSP=ENSP00000435509;SWISSPROT=Q8N9L1;TREMBL=C9JZU7,C9JD04,C9J6T3,B3KPI4;UNIPARC=UPI0001914D88;INTRON=1/4;HGVSc=ENST00000525172.2:c.135+120_135+121del
rs112208190 3:184043926-184043927 - ENSG00000114867 ENST00000424196.1 Transcript intron_variant - - - - - rs34901174 IMPACT=MODIFIER;STRAND=1;SYMBOL=EIF4G1;SYMBOL_SOURCE=HGNC;HGNC_ID=3296;BIOTYPE=protein_coding;CANONICAL=YES;CCDS=CCDS54687.1;ENSP=ENSP00000416255;SWISSPROT=Q04637;TREMBL=Q96I65,C9JWW9,C9JWH7,C9JSU8,C9J987,C9J6B6,C9J556;UNIPARC=UPI00015E0966;INTRON=20/31;HGVSc=ENST00000424196.1:c.3243+217_3243+218del
rs116873396 7:22533452-22533453 - ENSG00000105889 ENST00000404369.4 Transcript frameshift_variant,splice_region_variant 503-504 87-88 29-30 HE/QX caTGag/caag - IMPACT=HIGH;STRAND=-1;SYMBOL=STEAP1B;SYMBOL_SOURCE=HGNC;HGNC_ID=41907;BIOTYPE=protein_coding;CANONICAL=YES;CCDS=CCDS56469.1;ENSP=ENSP00000384370;TREMBL=C9JL51,C9JE84,B5MCI2;UNIPARC=UPI000173A267;EXON=3/5;HGVSc=ENST00000404369.4:c.87_88del;HGVSp=ENSP00000384370.4:p.His29GlnfsTer24
11_112042480_T/- 11:112042480 - ENSG00000150783 ENST00000280358.4 Transcript intron_variant - - - - - rs1225064086 IMPACT=MODIFIER;STRAND=1;SYMBOL=TEX12;SYMBOL_SOURCE=HGNC;HGNC_ID=11734;BIOTYPE=protein_coding;CANONICAL=YES;CCDS=CCDS31679.1;ENSP=ENSP00000280358;SWISSPROT=Q9BXU0;UNIPARC=UPI00001377E3;INTRON=4/4;HGVSc=ENST00000280358.4:c.228-9del;gnomAD_AF=1.711e-05;gnomAD_AFR_AF=8.319e-05;gnomAD_AMR_AF=0;gnomAD_ASJ_AF=0;gnomAD_EAS_AF=8.285e-05;gnomAD_FIN_AF=0;gnomAD_NFE_AF=1.133e-05;gnomAD_OTH_AF=0;gnomAD_SAS_AF=0;MAX_AF=8.319e-05;MAX_AF_POPS=gnomAD_AFR
12_49431404_-/T 12:49431403-49431404 T ENSG00000167548 ENST00000301067.7 Transcript frameshift_variant 9735-9736 9735-9736 3245-3246 -/X -/A - IMPACT=HIGH;STRAND=-1;SYMBOL=KMT2D;SYMBOL_SOURCE=HGNC;HGNC_ID=7133;BIOTYPE=protein_coding;CANONICAL=YES;CCDS=CCDS44873.1;ENSP=ENSP00000301067;SWISSPROT=O14686;TREMBL=Q6PIA1,Q59FG6,F8VWW4;UNIPARC=UPI0000EE84D6;EXON=34/54;HGVSc=ENST00000301067.7:c.9735dup;HGVSp=ENSP00000301067.7:p.Pro3246ThrfsTer5
13_33332314_A/- 13:33332314 - ENSG00000083642 ENST00000315596.10 Transcript frameshift_variant 3332 3146 1049 Q/X cAa/ca - IMPACT=HIGH;STRAND=1;SYMBOL=PDS5B;SYMBOL_SOURCE=HGNC;HGNC_ID=20418;BIOTYPE=protein_coding;CANONICAL=YES;CCDS=CCDS41878.1;ENSP=ENSP00000313851;SWISSPROT=Q9NTI5;UNIPARC=UPI000006D4A9;EXON=27/35;HGVSc=ENST00000315596.10:c.3148del;HGVSp=ENSP00000313851.10:p.Thr1050GlnfsTer12;HGVS_OFFSET=2
17_38712161_T/- 17:38712161 - ENSG00000126353 ENST00000246657.2 Transcript intron_variant - - - - - rs372297045 IMPACT=MODIFIER;STRAND=-1;SYMBOL=CCR7;SYMBOL_SOURCE=HGNC;HGNC_ID=1608;BIOTYPE=protein_coding;CANONICAL=YES;CCDS=CCDS11369.1;ENSP=ENSP00000246657;SWISSPROT=P32248;TREMBL=J3KTN5,J3KSS9,A0N0Q0;UNIPARC=UPI0000001C2F;INTRON=2/2;HGVSc=ENST00000246657.2:c.61-91del;AFR_AF=0;AMR_AF=0;EAS_AF=0.004;EUR_AF=0;SAS_AF=0;MAX_AF=0.004;MAX_AF_POPS=EAS
20_50342307_TC/- 20:50342307-50342308 - ENSG00000054793 ENST00000338821.5 Transcript intron_variant - - - - - - IMPACT=MODIFIER;STRAND=-1;SYMBOL=ATP9A;SYMBOL_SOURCE=HGNC;HGNC_ID=13540;BIOTYPE=protein_coding;CANONICAL=YES;CCDS=CCDS33489.1;ENSP=ENSP00000342481;SWISSPROT=O75110;TREMBL=Q2NLD0,B4DR18;UNIPARC=UPI000004D334;INTRON=3/27;HGVSc=ENST00000338821.5:c.327+50_327+51del
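The `Extra` column in the rows above packs its annotations as `KEY=VALUE` pairs separated by semicolons, using the keys listed under "Extra column keys". A small sketch of unpacking it in Python follows. `parse_extra` is a hypothetical helper name, treating `-` as "no annotations" is an assumption, and the sample string is shortened from the first row:

```python
def parse_extra(extra):
    """Split a VEP Extra column into a dict of key -> string value."""
    # Assumption: an empty field or a lone "-" means no annotations.
    if extra in ("", "-"):
        return {}
    annotations = {}
    for item in extra.split(";"):
        # partition keeps any "=" inside the value intact.
        key, _, value = item.partition("=")
        annotations[key] = value
    return annotations


extra = "IMPACT=MODIFIER;STRAND=-1;SYMBOL=FAM131C;INTRON=5/6;MAX_AF=0.5051;MAX_AF_POPS=SAS"
annotations = parse_extra(extra)
print(annotations["SYMBOL"], float(annotations["MAX_AF"]))
```

Values are kept as strings; numeric keys such as `MAX_AF` need an explicit conversion, as in the last line.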
Chromosome Position dbSNP_RS Tumor_Seq_Allele1 Tumor_Seq_Allele2 Variant_Quality Filter_VCF n_GT n_SS n_TIR n_TAR n_DP n_DP4 n_AD n_depth n_ref_count n_alt_count t_GT t_SS t_TIR t_TAR t_DP t_DP4 t_AD t_depth t_ref_count t_alt_count
1 24486218 T TTTG 0/0 24 24.0 24.0 0.0 0/1 2 22 0,22,0,6 22.0 22.0 6.0
1 27107272 CGT C 0/0 0,0 21,21 26 0 26.0 21.0 0.0 0/1 2 3,3 14,14 18 4,14,2,1 4 18.0 14.0 3.0
1 117122285 G GTCC 0/0 8 8.0 8.0 0.0 0/1 2 16 4,12,1,6 16.0 16.0 7.0
1 171607569 CAG C 0/0 0,0 12,12 15 12 15.0 12.0 0.0 0/1 2 4,4 7,7 11 10,1,3,1 7 11.0 7.0 4.0
1 175116046 C CT,CTT 0/0 38 0 38.0 38.0 0.0 0/2 2 56 57,2,11,1 7 56.0 59.0 12.0
1 201981005 C CA 0/0 0 0.0 0/1 2 3
1 223284687 G GA 0/0 0,0 91,91 93 91 93.0 91.0 0.0 0/1 2 37,37 35,36 78 32,46,18,19 35 78.0 35.0 37.0
1 226553491 AAAC A,AAACA 0/0 12 12,0,0,0 12.0 12.0 0.0 0/1 2 9 4,0,5,0 9.0 4.0 5.0
2 11273407 GTC G,GTCTC 0/0 9 0 9.0 9.0 0.0 0/2 2 23 24,2,7,1 4 23.0 26.0 8.0
2 42996759 AAGAG A 0/0 5 5.0 5.0 0.0 0/1 2 11 11,0,7,0 11.0 11.0 7.0
2 55181022 TA TAA,T 0/0 20 0 20.0 20.0 0.0 0/2 2 20 25,0,4,0 3 20.0 25.0 4.0
2 153471255 T TCAAAA,TCAAAACAAAACAAAACAAAA 0/0 8 0 8.0 8.0 0.0 0/2 2 8 8,0,3,0 4 8.0 8.0 3.0
2 176988290 C CGCA 0/0 0 0.0 0/1 2 12
3 38355176 T TGCGCGCGCGCGC,TGCGCGCGTGTGTGTGTGTGTGTGCGCGC 0/0 20 1 20.0 20.0 0.0 0/1 2 18 18,2,5,0 8 18.0 20.0 5.0
3 148711906 G GT 0/0 26 0 26.0 26.0 0.0 0/1 2 22 21,2,1,1 4 22.0 23.0 2.0
3 164776647 T TACACAC,TACAC,TACACACAC 0/0 44 0 44.0 44.0 0.0 0/2 2 19 25,0,9,0 3 19.0 25.0 9.0
3 195792288 CGGGG C 0/0 7 0 7.0 7.0 0.0 0/1 2 11 11,0,5,0 5 11.0 11.0 5.0
3 195956726 AAG A 0/0 50 0 50.0 50.0 0.0 0/1 2 75 70,5,9,1 8 75.0 75.0 10.0
4 100472247 T TA 0/0 10 0,10,0,0 10.0 10.0 0.0 0/1 2 16 0,11,0,5 16.0 11.0 5.0
4 108608100 CTCTAACACT C 0/0 8 34 8.0 8.0 0.0 1/1 2 48 48,0,39,0 42 48.0 48.0 39.0
4 121739411 GCACA G 0/0 0 0.0 0/1 2 4
5 1093609 G GGGGCGGGGACT 0/0 0 0.0 0/1 2 18
5 174940299 TAAAAAAAAAAAAA T 0/0 0 0.0 0/1 2 3
6 32487441 A AC 0/0 4 4.0 4.0 0.0 1/1 2 20 2,18,2,18 20.0 20.0 20.0
6 32548732 C CT 0/0 31 31.0 31.0 0.0 0/1 2 106 1,87,0,18 106.0 88.0 18.0
6 32557600 T TC 0/0 42 42.0 42.0 0.0 0/1 2 68 1,57,0,10 68.0 58.0 10.0
6 64289938 ATT AT,ATTT,A 0/0 36 0 36.0 36.0 0.0 0/3 2 22 29,2,4,1 3 22.0 31.0 5.0
6 75950109 TA TAA,T,TAAA 0/0 45 0 45.0 45.0 0.0 0/2 2 27 6,32,1,6 5 27.0 38.0 7.0
6 90577711 TCTTTGCCCAGACATGGA T 0/0 36 36.0 36.0 0.0 0/1 2 71 18,37,3,13 71.0 55.0 16.0
6 105300084 G GTT 0/0 6 9 6.0 6.0 0.0 1/1 2 30 26,3,20,3 8 30.0 29.0 23.0
6 117631463 T TTAA 0/0 21 1 21.0 21.0 0.0 0/1 2 25 5,20,2,5 5 25.0 25.0 7.0
6 152765726 GA GAA,G 0/0 0 0.0 0/1 2 5
6 158484691 CAAAAAAAAAAA C 0/0 0 0.0 0/1 2 3
6 163899794 CT CTT,C 0/0 36 0 36.0 36.0 0.0 0/2 2 22 24,2,3,0 3 22.0 26.0 3.0
7 122269207 CT C 0/0 0 0.0 0/1 2 4
7 135099044 TA T,TAA 0/0 0 0.0 0/1 2 9
7 144532829 A AG 0/0 0 0.0 0/1 2 3
7 150937074 CT C 0/0 0 0.0 0/1 2 3
8 55542730 CTTTGAAATGCTTGGTCAA C 0/0 0,0 73,73 58 73 58.0 73.0 0.0 0/1 2 7,7 27,27 30 21,9,3,4 27 30.0 27.0 7.0
8 124382376 TA T 0/0 0 0.0 0/1 2 3
9 37305829 GAT G 0/0 7 0 7.0 7.0 0.0 0/1 2 12 0,12,0,2 3 12.0 12.0 2.0
9 95039956 TA T 0/0 6 0 6.0 6.0 0.0 0/1 2 6 6,0,2,0 3 6.0 6.0 2.0
9 130950344 ACCC A 0/0 7 0,7,0,0 7.0 7.0 0.0 0/1 2 7 0,4,0,3 7.0 4.0 3.0
9 136249406 GATAATG GATG,GATATATG,GATA 0/0 21 21.0 21.0 0.0 0/1 2 14 12,0,8,0 14.0 12.0 8.0
9 140161631 TGTGGGGCTGAG T,TGTGGAG 0/0 15 0 15.0 15.0 0.0 0/2 2 20 7,11,1,4 4 20.0 18.0 5.0
10 8115955 A ACC 0/0 0,0 58,58 69 58 69.0 58.0 0.0 0/1 2 30,30 65,66 107 24,72,8,20 65 107.0 65.0 30.0
10 36811687 G GGT 0/0 77 0 77.0 77.0 0.0 0/1 2 95 97,3,24,0 11 95.0 100.0 24.0
10 75672607 C CA,CAA 0/0 16 0 16.0 16.0 0.0 0/2 2 13 14,2,9,2 4 13.0 16.0 11.0
10 78839127 A AGT,AGTGT 0/0 0 0.0 0/2 2 7
11 120348771 G GT 0/0 0 0.0 0/1 2 3
12 58187016 CT C 0/0 24 0 24.0 24.0 0.0 0/1 2 19 1,19,0,3 3 19.0 20.0 3.0
12 75893478 CA C 0/0 0 0.0 0/1 2 3
12 100930180 CT C 0/0 21 0 21.0 21.0 0.0 0/1 2 16 16,0,3,0 3 16.0 16.0 3.0
12 110463769 C CT 0/0 0 0.0 0/1 2 3
13 95227137 AAAAG A,AA 0/0 18 3 18.0 18.0 0.0 0/2 2 44 3,31,2,8 0 44.0 34.0 10.0
14 35515606 TGG T 0/0 4 0 4.0 4.0 0.0 0/1 2 7 6,0,6,0 4 7.0 6.0 6.0
14 72941206 G GA 0/0 9 9.0 9.0 0.0 0/1 2 105 67,1,39,1 105.0 68.0 40.0
14 92563254 AT ATT,A 0/0 24 0 24.0 24.0 0.0 0/2 2 32 0,35,0,6 3 32.0 35.0 6.0
15 28473275 T TA 0/0 0 0.0 0/1 2 3
15 81187286 TA T,TAA 0/0 0 0.0 0/2 2 4
15 88690736 CCTTCTTCTTCTTCTTCTTCTT C 0/0 0,0 12,12 18 0 18.0 12.0 0.0 0/1 2 5,5 5,6 27 0,28,0,4 4 27.0 5.0 5.0
16 460578 CT C 0/0 0 0.0 0/1 2 3
16 628994 TGGGC T 0/0 17 0 17.0 17.0 0.0 0/1 2 41 0,41,0,19 7 41.0 41.0 19.0
16 4700318 CA C 0/0 15 0 15.0 15.0 0.0 0/1 2 17 18,2,7,0 3 17.0 20.0 7.0
16 58200464 ACT A 0/0 142 7 142.0 142.0 0.0 0/1 2 126 96,30,9,4 13 126.0 126.0 13.0
16 84230067 GCAACCCCTTCGCT G 0/0 0 0.0 0/1 2 4
16 84230082 AACCCCTTC A 0/0 10 10.0 10.0 0.0 0/1 2 12 8,0,4,0 12.0 8.0 4.0
16 84230091 GCTCAA G 0/0 12 12.0 12.0 0.0 0/1 2 11 7,0,4,0 11.0 7.0 4.0
16 89167075 C CCCCAGGAGGCTCCCGGGAG 0/0 0 0.0 0/1 2 3
17 1264611 TA T 0/0 26 0 26.0 26.0 0.0 0/1 2 32 4,28,1,9 5 32.0 32.0 10.0
17 2297571 GCA G 0/0 0 0.0 0/1 2 7
17 3352494 C CA 0/0 26 0 26.0 26.0 0.0 0/1 2 20 0,20,0,11 3 20.0 20.0 11.0
17 4802255 GGCCTCTGCCTCGCTCCACCC G 0/0 5 0 5.0 5.0 0.0 0/1 2 13 4,9,1,3 6 13.0 13.0 4.0
17 55075670 CA CAA,C 0/0 31 0 31.0 31.0 0.0 0/2 2 36 41,0,7,0 8 36.0 41.0 7.0
17 58525216 GA G 0/0 0 0.0 0/1 2 3
17 67101527 TC T 0/0 0 0.0 0/1 2 3
17 76456454 GAGTGTA G,GAGTGTGCA 0/0 34 34.0 34.0 0.0 0/2 2 38 0,37,0,7 38.0 37.0 7.0
18 43666280 TAGTTAATATATTAATACCTTAAGA T,TAGTTAATATATTAATACCTTAAGAT 0/0 25 36 25.0 25.0 0.0 0/2 2 15 4,6,3,2 35 15.0 10.0 5.0
19 2901114 CGCCGAAGTCT C 0/0 4 0 4.0 4.0 0.0 0/1 2 18 2,16,1,5 5 18.0 18.0 6.0
19 4199809 C CA 0/0 0,0 3,3 3 0 3.0 3.0 0.0 0/1 2 7,7 1,1 8 5,2,5,2 5 8.0 1.0 7.0
19 16513357 CT C 0/0 0 0.0 0/1 2 4
19 41062902 AC A 0/0 0,0 13,13 20 0 20.0 13.0 0.0 0/1 2 13,13 14,14 46 9 46.0 14.0 13.0
20 17928175 CCTG C 0/0 0,0 249,249 255 249 255.0 249.0 0.0 0/1 2 69,70 239,239 330 240,90,49,17 239 330.0 239.0 69.0
20 30060720 GCA G 0/0 0,0 37,37 40 37 40.0 37.0 0.0 0/1 2 9,9 40,41 54 49,5,7,1 40 54.0 40.0 9.0
20 30354257 GGT GGTGT,G 0/0 48 1 48.0 48.0 0.0 0/1 2 41 48,0,9,0 3 41.0 48.0 9.0
20 46279833 GCAA G 0/0 70 70.0 70.0 0.0 0/1 2 108 82,25,8,3 108.0 107.0 11.0
21 30338153 C CA 0/0 0 0.0 0/1 2 3
21 34726106 A AT 0/0 0,0 191,191 229 191 229.0 191.0 0.0 0/1 2 57,57 148,148 253 47,204,14,36 148 253.0 148.0 57.0
21 41384834 C CTT 0/0 4 0 4.0 4.0 0.0 0/1 2 6 6,0,2,0 3 6.0 6.0 2.0
21 45712357 AC A 0/0 3 0 3.0 3.0 0.0 0/1 2 8 0,7,0,4 4 8.0 7.0 4.0
22 31301792 A AGCCACC 0/0 0 0.0 0/1 2 4
22 41918653 GT G 0/0 0,0 2,2 2 0 2.0 2.0 0.0 0/1 2 4,4 0,0 4 4 4.0 0.0 4.0
X 23724675 CAAA C,CA 0/0 28 0 28.0 28.0 0.0 0/1 2 20 26,0,3,0 3 20.0 26.0 3.0
X 54972154 C CGT 0/0 13 13.0 13.0 0.0 0/1 2 5 0,6,0,2 5.0 6.0 2.0
X 55027964 AT A 0/0 0 0.0 0/1 2 5
X 70361651 CA CAA,C 0/0 39 0 39.0 39.0 0.0 0/2 2 35 38,7,20,2 5 35.0 45.0 22.0
X 106461963 CA C 0/0 23 0 23.0 23.0 0.0 0/1 2 13 14,0,3,0 3 13.0 14.0 3.0
X 129203198 C CA 0/0 11 0 11.0 11.0 0.0 0/1 2 9 12,0,5,0 3 9.0 12.0 5.0
## ENSEMBL VARIANT EFFECT PREDICTOR v101.0
## Output produced at 2020-08-28 13:09:12
## Using cache in /Users/ypradat/.vep/homo_sapiens/101_GRCh37
## Using API version 101, DB version ?
## ensembl-funcgen version 101.b918a49
## ensembl-variation version 101.50e7372
## ensembl version 101.856c8e8
## ensembl-io version 101.943b6c2
## ESP version 20141103
## ClinVar version 201912
## polyphen version 2.2.2
## COSMIC version 90
## dbSNP version 153
## assembly version GRCh37.p13
## HGMD-PUBLIC version 20194
## sift version sift5.2.2
## regbuild version 1.0
## 1000genomes version phase3
## gencode version GENCODE 19
## gnomAD version r2.1
## genebuild version 2011-04
## Column descriptions:
## Uploaded_variation : Identifier of uploaded variant
## Location : Location of variant in standard coordinate format (chr:start or chr:start-end)
## Allele : The variant allele used to calculate the consequence
## Gene : Stable ID of affected gene
## Feature : Stable ID of feature
## Feature_type : Type of feature - Transcript, RegulatoryFeature or MotifFeature
## Consequence : Consequence type
## cDNA_position : Relative position of base pair in cDNA sequence
## CDS_position : Relative position of base pair in coding sequence
## Protein_position : Relative position of amino acid in protein
## Amino_acids : Reference and variant amino acids
## Codons : Reference and variant codon sequence
## Existing_variation : Identifier(s) of co-located known variants
## Extra column keys:
## IMPACT : Subjective impact classification of consequence type
## DISTANCE : Shortest distance from variant to transcript
## STRAND : Strand of the feature (1/-1)
## FLAGS : Transcript quality flags
## SYMBOL : Gene symbol (e.g. HGNC)
## SYMBOL_SOURCE : Source of gene symbol
## HGNC_ID : Stable identifer of HGNC gene symbol
## BIOTYPE : Biotype of transcript or regulatory feature
## CANONICAL : Indicates if transcript is canonical for this gene
## MANE : MANE (Matched Annotation by NCBI and EMBL-EBI) Transcript
## TSL : Transcript support level
## APPRIS : Annotates alternatively spliced transcripts as primary or alternate based on a range of computational methods
## CCDS : Indicates if transcript is a CCDS transcript
## ENSP : Protein identifer
## SWISSPROT : UniProtKB/Swiss-Prot accession
## TREMBL : UniProtKB/TrEMBL accession
## UNIPARC : UniParc accession
## SIFT : SIFT prediction and/or score
## PolyPhen : PolyPhen prediction and/or score
## EXON : Exon number(s) / total
## INTRON : Intron number(s) / total
## HGVSc : HGVS coding sequence name
## HGVSp : HGVS protein sequence name
## HGVS_OFFSET : Indicates by how many bases the HGVS notations for this variant have been shifted
## AF : Frequency of existing variant in 1000 Genomes combined population
## AFR_AF : Frequency of existing variant in 1000 Genomes combined African population
## AMR_AF : Frequency of existing variant in 1000 Genomes combined American population
## EAS_AF : Frequency of existing variant in 1000 Genomes combined East Asian population
## EUR_AF : Frequency of existing variant in 1000 Genomes combined European population
## SAS_AF : Frequency of existing variant in 1000 Genomes combined South Asian population
## AA_AF : Frequency of existing variant in NHLBI-ESP African American population
## EA_AF : Frequency of existing variant in NHLBI-ESP European American population
## gnomAD_AF : Frequency of existing variant in gnomAD exomes combined population
## gnomAD_AFR_AF : Frequency of existing variant in gnomAD exomes African/American population
## gnomAD_AMR_AF : Frequency of existing variant in gnomAD exomes American population
## gnomAD_ASJ_AF : Frequency of existing variant in gnomAD exomes Ashkenazi Jewish population
## gnomAD_EAS_AF : Frequency of existing variant in gnomAD exomes East Asian population
## gnomAD_FIN_AF : Frequency of existing variant in gnomAD exomes Finnish population
## gnomAD_NFE_AF : Frequency of existing variant in gnomAD exomes Non-Finnish European population
## gnomAD_OTH_AF : Frequency of existing variant in gnomAD exomes other combined populations
## gnomAD_SAS_AF : Frequency of existing variant in gnomAD exomes South Asian population
## MAX_AF : Maximum observed allele frequency in 1000 Genomes, ESP and ExAC/gnomAD
## MAX_AF_POPS : Populations in which maximum allele frequency was observed
## CLIN_SIG : ClinVar clinical significance of the dbSNP variant
## SOMATIC : Somatic status of existing variant
## PHENO : Indicates if existing variant(s) is associated with a phenotype, disease or trait; multiple values correspond to multiple variants
## PUBMED : Pubmed ID(s) of publications that cite existing variant
## MOTIF_NAME : The stable identifier of a transcription factor binding profile (TFBP) aligned at this position
## MOTIF_POS : The relative position of the variation in the aligned TFBP
## HIGH_INF_POS : A flag indicating if the variant falls in a high information position of the TFBP
## MOTIF_SCORE_CHANGE : The difference in motif score of the reference and variant sequences for the TFBP
## TRANSCRIPTION_FACTORS : List of transcription factors which bind to the transcription factor binding profile
#Uploaded_variation Location Allele Gene Feature Feature_type Consequence cDNA_position CDS_position Protein_position Amino_acids Codons Existing_variation Extra