Installation (R-wrapper)
This section will help you set up the software and packages required to run SigProfilerMatrixGeneratorR. Note that you must first install the Python package for SigProfilerMatrixGenerator, as described below.
Prerequisites
- Internet Connection
- python - v3.4+
- pandas - any version
  This will automatically be downloaded when you install SigProfilerMatrixGenerator.
- Wget - v1.9
- SigProfilerPlotting - latest version
  This will automatically be downloaded when you install SigProfilerMatrixGenerator. Installation of matplotlib is necessary for the plotting tool to be installed.
- Reference genomes - latest version
  At least one of the reference genome files must be downloaded and installed on the system prior to use of the SigProfilerMatrixGenerator tool. These are not automatically downloaded when the tool is installed and require additional steps.
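The prerequisite list above can be checked from Python before installing anything. The following is a minimal sketch, not part of the tool: the function name `check_prerequisites` is hypothetical, and it assumes that Wget is used as a command-line executable found on the PATH.

```python
import importlib.util
import shutil


def check_prerequisites():
    """Return a dict mapping each prerequisite to True (found) or False."""
    return {
        # Python modules: found if the current interpreter can import them.
        "pandas": importlib.util.find_spec("pandas") is not None,
        "matplotlib": importlib.util.find_spec("matplotlib") is not None,
        # Assumption: wget is needed as an executable on the PATH.
        "wget": shutil.which("wget") is not None,
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print("{:<12} {}".format(name, "found" if found else "MISSING"))
```

A missing pandas or matplotlib entry is not necessarily a problem at this stage, since pandas is pulled in when SigProfilerMatrixGenerator itself is installed.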
Upgrades
If an updated version of the tool has been released, run the following command in Terminal or the Command Line: pip install SigProfilerMatrixGenerator --upgrade.
This upgrades the tool to its latest version.
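When several Python installations coexist, the bare pip command may belong to a different interpreter than the one you intend to use. As a hedged sketch, the same upgrade can be launched through a specific interpreter with `python -m pip`; the helper names `upgrade_command` and `upgrade` are hypothetical.

```python
import subprocess
import sys


def upgrade_command(package="SigProfilerMatrixGenerator"):
    """Build the pip upgrade command for the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package, "--upgrade"]


def upgrade(package="SigProfilerMatrixGenerator"):
    """Run the upgrade; raises CalledProcessError if pip exits with an error."""
    subprocess.check_call(upgrade_command(package))
```

Calling `upgrade()` needs an internet connection, so nothing is executed when the file is merely loaded.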
Mac/Unix
For Mac/OSX systems, the use of a package manager like Conda is recommended to simplify environment setup.
To install SigProfilerMatrixGenerator, first check if you have python installed.
python
Check that you have the required Python version by opening Terminal (⌘ + Space, type terminal, and hit return to open the application) and asking the interpreter to report its version.
The system Python is located at /usr/bin/python. This system version of Python is currently Python 2. If you have multiple versions of Python installed, check the version of each one separately.
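The version check can also be scripted. The sketch below looks up the interpreter names `python` and `python3` on the PATH and asks each for its version. Those two names and the function name `interpreter_versions` are assumptions, and the code requires Python 3.5 or later for `subprocess.run`.

```python
import shutil
import subprocess
import sys


def interpreter_versions(names=("python", "python3")):
    """Map the path of each interpreter found on PATH to its version string."""
    versions = {}
    for name in names:
        path = shutil.which(name)
        if path is None:
            continue  # no executable with this name on the PATH
        # Python 2 prints its version to stderr, so both streams are merged.
        result = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        versions[path] = result.stdout.strip()
    return versions


if __name__ == "__main__":
    for path, version in interpreter_versions().items():
        print(path, "->", version)
    # The interpreter running this script must itself satisfy v3.4+.
    print("current interpreter is 3.4+:", sys.version_info >= (3, 4))
```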
Follow these instructions to download the most recent version of Python for your operating system if you do not have v3.4 or higher: Python Installation.
Installation instructions for our recommended python package manager, Conda, through the Anaconda distribution can be found here: Anaconda Installation.
If you are installing python for the first time, pip is automatically installed in the same location.
pip
If necessary, separate installations of pandas, Wget, and SigProfilerPlotting can be achieved via pip.
pandas and SigProfilerPlotting are automatically installed with the SigProfilerMatrixGenerator tool. Thus, you only need to install wget separately.
Check whether pip is installed on your operating system, and which version, by asking pip to report its version in Terminal.
The output tells you which version of pip is currently installed and which version of Python it is set up to install packages for. This is especially helpful if you have more than one version of Python installed on your system.
If pip is missing, follow the instructions here to download and install pip for your operating system: PIP Installation I, PIP Installation II.
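To see which pip is tied to a particular interpreter, the version query can be run through that interpreter. This is a minimal sketch assuming Python 3.5 or later; `pip_version` is a hypothetical helper name.

```python
import subprocess
import sys


def pip_version():
    """Return pip's version line for this interpreter, or None if pip is absent."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    # A non-zero exit code means this interpreter has no usable pip module.
    return result.stdout.strip() if result.returncode == 0 else None


if __name__ == "__main__":
    print(pip_version() or "pip is not installed for " + sys.executable)
```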
To install wget via pip, refer here: wget pip.
To install wget via Conda, refer here: wget conda.
SigProfilerMatrixGeneratorR
Now that you've successfully downloaded all the required software, you can install SigProfilerMatrixGenerator using pip: pip install SigProfilerMatrixGenerator.
This starts the installation process. Once installation is complete, pip prints a confirmation on the command line, and the package appears in the folders where your Python framework and packages are saved.
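To confirm where the package ended up, the install location can be queried from Python. The sketch below is illustrative only: the helper names are hypothetical, and the reported site-packages folder can differ for user-level or virtual-environment installs.

```python
import importlib.util
import sysconfig


def site_packages_dir():
    """Directory where pip saves pure-Python packages for this interpreter."""
    return sysconfig.get_paths()["purelib"]


def package_location(name="SigProfilerMatrixGenerator"):
    """Path of the installed package's entry file, or None if not installed."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    print("site-packages:", site_packages_dir())
    print("SigProfilerMatrixGenerator:", package_location() or "not installed")
```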
Installing R Dependencies
You must first install the devtools and reticulate libraries:
$ R
>> install.packages("devtools")
>> install.packages("reticulate")
Once these are installed, you can install SigProfilerMatrixGeneratorR:
$ R
>> library("reticulate")
>> use_python("path_to_your_python3")
>> py_config()
>> library("devtools")
>> install_github("AlexandrovLab/SigProfilerMatrixGeneratorR")
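The use_python() call above needs the path of the Python 3 interpreter in which SigProfilerMatrixGenerator was installed. One way to find it, sketched here under the assumption that you run the script with that same interpreter, is to print `sys.executable`; the function name `python_path_for_reticulate` is hypothetical.

```python
import importlib.util
import sys


def python_path_for_reticulate():
    """Absolute path of the interpreter running this script."""
    return sys.executable


if __name__ == "__main__":
    print("Pass this path to use_python():", python_path_for_reticulate())
    # The R wrapper can only work if this interpreter has the Python package.
    installed = importlib.util.find_spec("SigProfilerMatrixGenerator") is not None
    print("SigProfilerMatrixGenerator importable:", installed)
```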
Reference Genome
Prior to use of the SigProfilerMatrixGeneratorR tool, the reference genome files need to be installed. Install your desired reference genome from the command line using the install command.
For instance, you can install the custom human 37 assembly reference files, or any other of the available genome assemblies. The installation uses bash commands by default.
If the server has a firewall in place, wget may not work. In that case, set the install command's additional rsync parameter to True; rsync then acts as a wget equivalent.
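The genome installation described above can be sketched from the Python side as follows. This assumes that the Python package exposes `install.install(genome, rsync=..., bash=...)` and that the human 37 assembly is named "GRCh37"; both are assumptions to verify against the package's own documentation, and the wrapper function names are hypothetical.

```python
def install_options(behind_firewall=False):
    """Keyword arguments for the install call; rsync replaces wget if True."""
    return {"rsync": bool(behind_firewall), "bash": True}


def install_reference_genome(genome="GRCh37", behind_firewall=False):
    """Download and install one reference genome (requires network access)."""
    # Imported lazily so this file loads even before the package is installed.
    # Assumption: the package provides an `install` module with `install()`.
    from SigProfilerMatrixGenerator import install as genInstall

    genInstall.install(genome, **install_options(behind_firewall))
```

Calling `install_reference_genome(behind_firewall=True)` switches the download to rsync, matching the firewall case described above.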
The installation process saves the custom reference files for all chromosomes in the genome assembly, so ~3 GB of storage must be available for each genome you download. You can find all the downloaded reference files in the main SigProfilerMatrixGenerator folder. Because the custom files are so large, this step can take some time.
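Given the storage estimate above, it can be worth checking free disk space before starting a download. A minimal sketch, treating ~3 GB per genome as the budget; the constant and function names are hypothetical.

```python
import shutil

# ~3 GB per genome assembly, following the estimate in this section.
REQUIRED_BYTES_PER_GENOME = 3 * 1024 ** 3


def has_room_for(n_genomes, path="."):
    """True if the disk holding `path` can fit `n_genomes` reference genomes."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= n_genomes * REQUIRED_BYTES_PER_GENOME


if __name__ == "__main__":
    print("Room for one genome:", has_room_for(1))
```

Pass the folder where the package is installed as `path` to check the disk that will actually receive the files.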