vcf_files

This output folder contains text files that pair each original mutation with its SigProfilerMatrixGenerator classification, organized by chromosome. The files are separated into folders for dinucleotide substitutions (DBS), multinucleotide substitutions (MNS), small insertions/deletions (ID), and single nucleotide variants (SNV). These files are generated only when seqInfo=True. Similarly, if exome=True, an additional file is written that contains all of the original mutations that occurred within the exome.

Overview

[Figure: overview of the output folder structure]

The internal structure of each folder is the same. Each mutation is re-saved in the appropriate MNS, DBS, ID, or SNV folder, in the file for its chromosome.
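
As a rough illustration of this layout, the sketch below builds the path to one chromosome-based file and lists the files found in each mutation folder. It assumes every folder follows the `<chromosome>_seqInfo.txt` naming seen for `1_seqInfo.txt` in the DBS folder; the helper names `seqinfo_path` and `list_seqinfo_files` are hypothetical and not part of SigProfilerMatrixGenerator.

```python
from pathlib import Path

# Folder names taken from the description above.
MUTATION_FOLDERS = ("DBS", "ID", "MNS", "SNV")


def seqinfo_path(vcf_files_dir, mutation_type, chromosome):
    """Build the assumed path <vcf_files_dir>/<mutation_type>/<chromosome>_seqInfo.txt."""
    if mutation_type not in MUTATION_FOLDERS:
        raise ValueError(f"unknown mutation folder: {mutation_type!r}")
    # Assumption: every folder uses the same file naming as DBS/1_seqInfo.txt.
    return Path(vcf_files_dir) / mutation_type / f"{chromosome}_seqInfo.txt"


def list_seqinfo_files(vcf_files_dir):
    """Map each mutation folder that exists to its sorted seqInfo file names."""
    root = Path(vcf_files_dir)
    return {
        folder: sorted(p.name for p in (root / folder).glob("*_seqInfo.txt"))
        for folder in MUTATION_FOLDERS
        if (root / folder).is_dir()
    }
```

For example, `seqinfo_path("output/vcf_files", "DBS", 1)` points at the chromosome 1 file in the DBS folder, provided the output was generated with seqInfo=True.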

DBS, SNV, MNS

The output format is identical across these files. For example, the sample table below shows the contents of the 1_seqInfo.txt file in the DBS folder. The headers are the same for every file except the MNS files, which contain neither a matrix classification nor a strand classification {1, 0, -1}. An MNS file simply contains the mutations as they appear in the original VCF file.

Sample    Chromosome  Position  SBS6144 classification  Strand
MELA_006  1           18915081  N:T[CC>TT]C             -1
MELA_006  1           57243769  U:T[CC>TT]T             -1

The second row describes a dinucleotide mutation found in the MELA_006 sample on chromosome 1 at position 57243769, classified as U:T[CC>TT]T (untranscribed T[CC>TT]T) on the non-reference strand.
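
To make the row layout concrete, here is a minimal parsing sketch. It assumes the fields are whitespace-delimited, that sample names contain no spaces, and that a classification has the form `label:5'base[ref>alt]3'base`, as in the sample rows above. `SeqInfoRow`, `parse_seqinfo_line`, and `split_classification` are hypothetical names for illustration, not functions provided by the tool.

```python
import re
from typing import NamedTuple

# Assumed shape of a classification such as "U:T[CC>TT]T":
# one-letter label, 5' base, [reference>alternate], 3' base.
CLASSIFICATION = re.compile(
    r"^(?P<label>[A-Z]):(?P<five>[ACGT])"
    r"\[(?P<ref>[ACGT]+)>(?P<alt>[ACGT]+)\](?P<three>[ACGT])$"
)


class SeqInfoRow(NamedTuple):
    sample: str
    chromosome: str
    position: int
    classification: str
    strand: int


def parse_seqinfo_line(line):
    """Split one data row into typed fields (header rows are not handled)."""
    sample, chromosome, position, classification, strand = line.split()
    return SeqInfoRow(sample, chromosome, int(position), classification, int(strand))


def split_classification(classification):
    """Break a classification string into its label, context, and change."""
    match = CLASSIFICATION.match(classification)
    if match is None:
        raise ValueError(f"unrecognized classification: {classification!r}")
    return match.groupdict()


row = parse_seqinfo_line("MELA_006 1 57243769 U:T[CC>TT]T -1")
parts = split_classification(row.classification)
```

Here `row.position` is the integer 57243769 and `parts["ref"]` and `parts["alt"]` are `"CC"` and `"TT"`. MNS files would need a different parser, since they carry no matrix or strand classification.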

DBS

A dinucleotide substitution (DBS) is a pair of adjacent nucleotides that have both mutated, producing a different dinucleotide.

[Screenshot: example of a generated DBS file]

MNS

Multinucleotide substitutions (MNS) are classified as a series of mutations occurring within 5 base pairs of each other. DBSs are not included in this classification.
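
The definition above can be illustrated with a small grouping sketch. This is not the tool's implementation: it assumes positions come from one sample and one chromosome, that "within 5 base pairs" means a gap of at most 5 between neighboring mutations, and that a group of exactly two adjacent positions counts as a DBS. The function names and the `max_gap` parameter are hypothetical.

```python
def group_nearby_positions(positions, max_gap=5):
    """Group positions so that neighbors in a group are at most max_gap apart."""
    groups = []
    for pos in sorted(set(positions)):
        if groups and pos - groups[-1][-1] <= max_gap:
            groups[-1].append(pos)  # close enough to extend the current group
        else:
            groups.append([pos])  # too far away: start a new group
    return groups


def classify_group(group):
    """Label a group under the assumptions stated above."""
    if len(group) == 1:
        return "SNV"
    if len(group) == 2 and group[1] - group[0] == 1:
        return "DBS"  # two adjacent bases are excluded from MNS
    return "MNS"


groups = group_nearby_positions([100, 101, 200, 203, 207, 300])
labels = [classify_group(g) for g in groups]
```

With these inputs, positions 100 and 101 form an adjacent pair, 200, 203, and 207 form one nearby series, and 300 stands alone.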

[Screenshot: example of a generated MNS file]

SNV

A single nucleotide variant (SNV) is a single base pair mutated to another base pair.

[Screenshot: example of a generated SNV file]