vcf_files
This output folder contains text-based files that pair each original mutation with its SigProfilerMatrixGenerator classification, with one file per chromosome. The files are separated into folders for dinucleotide substitutions (DBS), multinucleotide substitutions (MNS), small insertions/deletions (ID), and single nucleotide variants (SNV). These files are only generated when seqInfo=True. Similarly, if exome=True, an additional file is generated that contains all of the original mutations that occurred within the exome.
Overview
The internal structure is the same for each of the MNS, DBS, ID, and SNV folders. Each mutation is saved again in the appropriate folder, in the file for its chromosome.
The format of each file is identical. For example, the sample table below represents the output of the 1_seqInfo.txt file in the DBS folder. The headers are the same for every file except the MNS files, which contain neither a matrix classification nor a strand classification {1, 0, -1}. An MNS file simply contains the mutation as it appears in the original VCF file.
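As a minimal sketch of this layout, the helper below builds the path to one per-chromosome file. The function name `seqinfo_path` and the root directory argument are hypothetical; the `<chromosome>_seqInfo.txt` pattern is inferred from the `1_seqInfo.txt` example in this section and should be checked against your own output.

```python
from pathlib import Path

# Folder names as listed in this section.
MUTATION_FOLDERS = ("SNV", "DBS", "ID", "MNS")


def seqinfo_path(vcf_files_dir, mutation_type, chromosome):
    """Return the assumed path of one per-chromosome seqInfo file.

    vcf_files_dir is wherever your run wrote its vcf_files folder
    (a placeholder here, not a fixed location).
    """
    if mutation_type not in MUTATION_FOLDERS:
        raise ValueError(f"unknown mutation folder: {mutation_type!r}")
    # Assumed naming pattern: <chromosome>_seqInfo.txt
    return Path(vcf_files_dir) / mutation_type / f"{chromosome}_seqInfo.txt"


print(seqinfo_path("vcf_files", "DBS", 1).as_posix())
```

Keeping the folder names in one tuple makes it easy to loop over every mutation type and chromosome when collecting the files.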
| Sample | Chromosome | Position | SBS6144 classification | Strand |
|---|---|---|---|---|
| MELA_006 | 1 | 18915081 | N:T[CC>TT]C | -1 |
| MELA_006 | 1 | 57243769 | U:T[CC>TT]T | -1 |

The second row describes a dinucleotide mutation found in the MELA_006 sample on chromosome 1 at position 57243769, classified as U:T[CC>TT]T (Untranscribed T[CC>TT]T) on the non-reference strand.
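The columns above can be read programmatically. The following is a rough sketch that parses one data line and splits its classification string into parts. It assumes the fields are whitespace-separated and contain no embedded spaces, which is not confirmed here; `SeqInfoRow`, `parse_row`, and `split_classification` are hypothetical names, not part of SigProfilerMatrixGenerator.

```python
import re
from typing import NamedTuple


class SeqInfoRow(NamedTuple):
    sample: str
    chromosome: str
    position: int
    classification: str
    strand: int


# <bias letter>:<5' base>[<ref>><alt>]<3' base>, e.g. U:T[CC>TT]T
CLASSIFICATION_RE = re.compile(r"^([A-Z]):([ACGT])\[([ACGT]+)>([ACGT]+)\]([ACGT])$")


def parse_row(line):
    """Parse one data line (assumed whitespace-separated, five fields)."""
    sample, chromosome, position, classification, strand = line.split()
    return SeqInfoRow(sample, chromosome, int(position), classification, int(strand))


def split_classification(classification):
    """Split a classification string into its labelled parts."""
    match = CLASSIFICATION_RE.match(classification)
    if match is None:
        raise ValueError(f"unrecognised classification: {classification!r}")
    bias, five_prime, ref, alt, three_prime = match.groups()
    return {"bias": bias, "five_prime": five_prime, "ref": ref,
            "alt": alt, "three_prime": three_prime}


row = parse_row("MELA_006\t1\t57243769\tU:T[CC>TT]T\t-1")
print(row.position, split_classification(row.classification))
```

MNS files would need a different parser, since they carry neither the classification nor the strand column.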
DBS
A dinucleotide substitution (DBS) is a pair of adjacent nucleotides that have both mutated, producing a different dinucleotide.
Below is a screenshot of what the generated file should look like.
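The DBS definition can be expressed as a small check on a reference/alternate allele pair. This is only an illustration of the definition, not the tool's detection logic; the function name `is_dbs` is hypothetical, and it assumes both alleles are given as plain base strings.

```python
def is_dbs(ref, alt):
    """True when ref and alt are both two bases long and both bases changed."""
    ref, alt = ref.upper(), alt.upper()
    return (
        len(ref) == 2
        and len(alt) == 2
        and ref[0] != alt[0]  # first base changed
        and ref[1] != alt[1]  # second base changed
    )


print(is_dbs("CC", "TT"))  # both adjacent bases changed
print(is_dbs("CC", "CT"))  # only one base changed
```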
MNS
Multinucleotide substitutions are classified as a series of mutations occurring within 5 base pairs of each other. DBSs are not included in this classification.
Below is a screenshot of what the generated file should look like.
SBS
A single nucleotide variant is a single base pair mutated to another base pair.
Below is a screenshot of what the generated file should look like.