Quick Start Example
This section walks through a short example to help users get started with the SigProfilerAssignment tool. The example uses somatic mutational data from breast cancer samples from Nik-Zainal et al. 2012 Cell and shows how to run SigProfilerAssignment with different types of input files containing somatic mutations, including Variant Call Format (VCF) files and mutational matrices.
Prerequisites
This tutorial requires that you have completed all steps in the installation guide, specifically:
- Installed SigProfilerAssignment
- Downloaded the GRCh37 reference genome using SigProfilerMatrixGenerator
Downloading input example data
This example uses somatic mutational data from a breast cancer genome. Download the example dataset BRCA.zip from the following location, or use the command line:
ftp://alexandrovlab-ftp.ucsd.edu/pub/tools/SigProfilerAssignment/Example_data/
If using the command line, enter the following command in bash on macOS or Unix systems:
$ wget ftp://alexandrovlab-ftp.ucsd.edu/pub/tools/SigProfilerAssignment/Example_data/BRCA.zip
Once BRCA.zip has been downloaded, unzip the file. The unzipped BRCA folder contains the file BRCA.txt and the folder BRCA_vcf. BRCA.txt is a mutational matrix defined using the SBS-96 classification (created by SigProfilerMatrixGenerator), and BRCA_vcf contains the corresponding VCF file for the sample.
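If you prefer to unzip from Python rather than with a desktop tool, the step above can be sketched with the standard library alone. The helper name `extract_example` and the assumption that BRCA.zip sits in the current directory are illustrative only and are not part of SigProfilerAssignment.

```python
import zipfile
from pathlib import Path


def extract_example(zip_path, dest_dir="."):
    """Extract the archive into dest_dir and return the files it contains."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        # Directory entries end with "/"; keep only regular files.
        return sorted(name for name in archive.namelist() if not name.endswith("/"))


if __name__ == "__main__":
    zip_file = Path("BRCA.zip")  # assumption: downloaded into the current directory
    if zip_file.exists():
        for name in extract_example(zip_file):
            print(name)
    else:
        print("BRCA.zip not found; download it first.")
```

The printed listing makes it easy to confirm that BRCA.txt and the BRCA_vcf folder are where the later commands expect them.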
Running SigProfilerAssignment from VCF
You will be assigning reference mutational signatures from COSMIC v3.5 to the breast cancer sample in the subfolder BRCA_vcf used as input for this example.
First, start a Python interactive shell and import the SigProfilerAssignment library.
$ python
>>> from SigProfilerAssignment import Analyzer as Analyze
Next, assign reference COSMIC signatures by running the following command. Note: Update "path/to/BRCA_vcf" with the actual path to the BRCA_vcf folder.
Analyze.cosmic_fit(samples="path/to/BRCA_vcf",
                   output="output_vcf",
                   input_type="vcf",
                   context_type="96",
                   genome_build="GRCh37")
You can also run the SigProfilerAssignment cosmic_fit function from the command line:
$ SigProfilerAssignment cosmic_fit "path/to/BRCA_vcf" "output_vcf" --input_type "vcf" --context_type "96" --genome_build "GRCh37"
After SigProfilerAssignment has finished running, an output directory named output_vcf will be created. This directory will contain the output files and is located in the directory where the Python instance was started. To learn more about the output produced by SigProfilerAssignment, please refer to the Using the Tool - Output section.
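Because the output name is a relative path, it resolves against the current working directory. As a rough sketch, the following standard-library helper shows where the directory ended up and what it contains. The function name `list_output_files` is hypothetical, and no particular output file names are assumed.

```python
from pathlib import Path


def list_output_files(output_dir):
    """Return every file under output_dir as a sorted list of relative paths."""
    # A relative name such as "output_vcf" resolves against the current working directory.
    root = Path(output_dir).resolve()
    if not root.is_dir():
        raise FileNotFoundError(f"No output directory at {root}")
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )
```

Calling `list_output_files("output_vcf")` from the same directory in which the analysis was started should then list the generated files. If it raises `FileNotFoundError`, the run was probably started from a different directory.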
Running SigProfilerAssignment (Mutational matrix)
You will be assigning reference mutational signatures from COSMIC v3.5 to BRCA.txt, the mutational matrix defined using the SBS-96 classification that serves as input for this example.
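Before running the assignment, it can be useful to sanity-check the matrix. The sketch below assumes the file is tab-separated, with a header row whose first column labels the mutation type and whose remaining columns are sample names. That layout is an assumption about the file rather than something stated in this guide, and `summarize_matrix` is a hypothetical helper.

```python
import csv


def summarize_matrix(path):
    """Return (number of mutation-type rows, {sample name: total mutation count})."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        samples = header[1:]  # assumption: first column holds the mutation type
        totals = {sample: 0 for sample in samples}
        n_rows = 0
        for row in reader:
            if not row:
                continue  # skip blank lines
            n_rows += 1
            for sample, count in zip(samples, row[1:]):
                totals[sample] += int(float(count))
    return n_rows, totals
```

For an SBS-96 matrix, `summarize_matrix("path/to/BRCA.txt")` would be expected to report 96 rows; a different number suggests the file uses another classification or a different layout.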
First, start a Python interactive shell and import the SigProfilerAssignment library.
$ python
>>> from SigProfilerAssignment import Analyzer as Analyze
Next, assign reference COSMIC signatures by running the following command. Note: Update "path/to/BRCA.txt" with the actual path to the BRCA.txt file.
Analyze.cosmic_fit(samples="path/to/BRCA.txt",
                   output="output_mm",
                   input_type="matrix")
You can also run the SigProfilerAssignment cosmic_fit function from the command line:
$ SigProfilerAssignment cosmic_fit "path/to/BRCA.txt" "output_mm" --input_type "matrix"
After SigProfilerAssignment has finished running, an output directory named output_mm will be created. This directory will contain the output files and is located in the directory where the Python instance was started. To learn more about the output produced by SigProfilerAssignment, please refer to the Using the Tool - Output section.
Running SigProfilerAssignment (Multi-sample segmentation)
You will be assigning reference mutational signatures from COSMIC v3.5 to all.breast.ascat.summary.sample.tsv, a multi-sample segmentation file produced by a copy number calling tool, which serves as input for this example.
First, start a Python interactive shell and import the SigProfilerAssignment library.
$ python
>>> from SigProfilerAssignment import Analyzer as Analyze
Next, assign reference COSMIC signatures by running the following command. Note: Update "path/to/all.breast.ascat.summary.sample.tsv" with the actual path to the all.breast.ascat.summary.sample.tsv file.
Analyze.cosmic_fit(samples="path/to/all.breast.ascat.summary.sample.tsv",
                   output="example_sf",
                   input_type="seg:ASCAT_NGS",
                   cosmic_version=3.5,
                   collapse_to_SBS96=False)
You can also run the SigProfilerAssignment cosmic_fit function from the command line:
$ SigProfilerAssignment cosmic_fit "path/to/all.breast.ascat.summary.sample.tsv" "example_sf" --input_type "seg:ASCAT_NGS" --cosmic_version "3.5" --collapse_to_SBS96 False
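When scripting several runs, it may be convenient to assemble the command line programmatically. The following is only a sketch: `build_cosmic_fit_command` is a hypothetical helper, and it simply mirrors the `--name value` pattern of the commands shown in this guide without validating the options.

```python
import shlex


def build_cosmic_fit_command(samples, output, **options):
    """Build the argument list for the cosmic_fit command-line call."""
    command = ["SigProfilerAssignment", "cosmic_fit", str(samples), str(output)]
    for name, value in options.items():
        # Each keyword becomes "--name value", as in the commands shown in this guide.
        command += [f"--{name}", str(value)]
    return command


command = build_cosmic_fit_command(
    "path/to/all.breast.ascat.summary.sample.tsv",
    "example_sf",
    input_type="seg:ASCAT_NGS",
    cosmic_version=3.5,
    collapse_to_SBS96=False,
)
print(shlex.join(command))  # shell-quoted form, handy for logging
```

The resulting list can be passed to `subprocess.run(command, check=True)`, which avoids shell quoting problems when paths contain spaces.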
After SigProfilerAssignment has finished running, an output directory named example_sf will be created. This directory will contain the output files and is located in the directory where the Python instance was started. To learn more about the output produced by SigProfilerAssignment, please refer to the Using the Tool - Output section.
Additional Information
In the above examples, all parameters that are not specified take their default values. All of the function arguments and their types are explained in detail in the Using the Tool - Input section. To learn more about the files that were produced, refer to the Using the Tool - Output section.
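To see which defaults apply in a given installation, one option is to inspect a function's signature with the standard `inspect` module. The sketch below demonstrates the idea on a stand-in function. Both `default_arguments` and `stand_in` are hypothetical, and the stand-in's parameters do not reflect the real cosmic_fit defaults.

```python
import inspect


def default_arguments(func):
    """Map each parameter of func that has a default to its default value."""
    return {
        name: parameter.default
        for name, parameter in inspect.signature(func).parameters.items()
        if parameter.default is not inspect.Parameter.empty
    }


def stand_in(required, optional="default", flag=False):
    """Placeholder used only to demonstrate the helper."""


print(default_arguments(stand_in))  # {'optional': 'default', 'flag': False}
```

With SigProfilerAssignment installed, passing `Analyze.cosmic_fit` (after the import shown in the examples above) instead of `stand_in` should list the defaults of the installed version, which may differ between releases.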