# Table of Contents
* [Hardware](#Hardware)
	* [Disk space](#Disk-space)
	* [RAM memory](#RAM-memory)
	* [CPUs](#CPUs)
* [Software](#Software)
	* [Conda](#Conda)
	* [Python dependencies](#Python-dependencies)
	* [GEM Mapper](#GEM-Mapper)
	* [DSRC FASTQ compressor](#DSRC-FASTQ-compressor)
	* [TADbit](#TADbit)


# Hardware

## Disk space

Processing a "typical" Hi-C experiment of about 200 M reads takes around __100 GB of disk space per experiment__. After the analysis many of the intermediate files can be compressed or erased, but each experiment/replicate will probably still occupy at least 50 GB on disk.

## RAM memory

The more the better. RAM is especially important for loading matrices at high resolution, but __32 GB__ of RAM is usually enough to deal with 50 kb resolution matrices on a human genome.
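
To get a feeling for why RAM matters at high resolution, the arithmetic can be sketched as follows. This is a back-of-the-envelope estimate only: it assumes a genome of about 3.1 Gb and a dense square matrix of 8-byte values, both chosen for illustration. The function name `dense_matrix_gb` is hypothetical, and this is not a description of how TADbit actually stores matrices.

```python
def dense_matrix_gb(genome_size_bp, resolution_bp, bytes_per_cell=8):
    """Memory (in GB, 1 GB = 1e9 bytes) of a dense square contact matrix."""
    # number of bins along one axis, rounding up the last partial bin
    n_bins = -(-genome_size_bp // resolution_bp)
    return n_bins ** 2 * bytes_per_cell / 1e9


if __name__ == "__main__":
    human_genome_bp = 3100000000  # assumed approximate size of the human genome
    for resolution in (1000000, 100000, 50000, 10000):
        gb = dense_matrix_gb(human_genome_bp, resolution)
        print("%8d bp bins: %8.1f GB" % (resolution, gb))
```

Under these assumptions a genome-wide 50 kb matrix has 62,000 bins per side and needs about 31 GB, and going from 50 kb to 10 kb multiplies the number of cells by 25. Sparse or per-chromosome storage typically needs far less, so treat these numbers as an upper bound.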

## CPUs

No limitations here, just time. An __8-core computer__ should be able to process a single Hi-C experiment (200 M reads, analyzed at 50 kb) in __3-4 days__. This includes all the steps: mapping, filtering, normalization, and detection of TADs and compartments.

The time needed for 3D modeling will depend on the size of the regions to be modeled.
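
The figures quoted in this section (100 GB of disk, 32 GB of RAM, 8 cores) can be compared against the machine at hand with a few lines of Python. The snippet below is a minimal sketch assuming a Linux/Unix system; the function names `hardware_summary` and `check_requirements` are hypothetical helpers written for this example, and the thresholds are simply the suggestions from the text, not hard limits.

```python
import multiprocessing
import os


def hardware_summary(path="."):
    """Return free disk space under `path`, total RAM (both in GB) and CPU count."""
    stats = os.statvfs(path)  # Unix only
    free_disk_gb = stats.f_bavail * stats.f_frsize / 1e9
    try:
        ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
        ram_gb = ram_bytes / 1e9 if ram_bytes > 0 else None
    except (ValueError, OSError):
        ram_gb = None  # not reported on this platform
    return {"free_disk_gb": free_disk_gb,
            "ram_gb": ram_gb,
            "cpus": multiprocessing.cpu_count()}


def check_requirements(summary, min_disk_gb=100, min_ram_gb=32, min_cpus=8):
    """List the resources that fall below the suggested figures."""
    below = []
    if summary["free_disk_gb"] < min_disk_gb:
        below.append("disk")
    if summary["ram_gb"] is not None and summary["ram_gb"] < min_ram_gb:
        below.append("ram")
    if summary["cpus"] < min_cpus:
        below.append("cpus")
    return below


if __name__ == "__main__":
    summary = hardware_summary(os.path.expanduser("~"))
    print(summary)
    print("below the suggested figures: %s" % (check_requirements(summary) or "nothing"))
```

A resource listed as below the suggestion does not prevent the analysis from running; with fewer cores, for example, it will just take longer.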

# Software

## Conda

Conda (http://conda.pydata.org/docs/index.html) is a package manager, mainly hosting Python programs, that is very useful when no root access is available and the software has complicated dependencies.

To install it, just download the installer from http://conda.pydata.org/miniconda.html:

In [None]:
! wget https://repo.continuum.io/miniconda/Miniconda2-latest-Linux-x86_64.sh

Then run it, accepting all the default options. The installer will create a `miniconda2` folder in your home directory where all the programs that you need will be stored (including Python).

## Python dependencies

With conda you can install the needed dependencies:

In [9]:
! conda install -y scipy # scientific computing in python
! conda install -y numpy # scientific computing in python
! conda install -y matplotlib # to produce plots
! conda install -y jupyter # this notebook :)
! conda install -y -c https://conda.anaconda.org/bcbio pysam # to deal with SAM/BAM files
! conda install -y -c https://conda.anaconda.org/salilab imp # for 3D modeling
! conda install -y pip # yet another python package manager
! conda install -y -c bioconda mcl # for clustering

Fetching package metadata .......
Solving package specifications: ..........

Package plan for installation in environment /home/fransua/.miniconda2:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    scipy-0.18.1               |      np111py27_0        30.9 MB

The following packages will be UPDATED:

    scipy: 0.17.1-np111py27_1 --> 0.18.1-np111py27_0

Fetching packages ...
scipy-0.18.1-n 100% |################################| Time: 0:00:08   4.05 MB/s
Extracting packages ...
[      COMPLETE      ]|###################################################| 100%
Unlinking packages ...
[      COMPLETE      ]|###################################################| 100%
Linking packages ...
[      COMPLETE      ]|###################################################| 100%
Fetching package metadata .......
Solving package specifications: ..........

Package plan for installation in environment /home/f
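
Once the installs have finished, a quick way to confirm that the Python packages are usable is to try importing them. The sketch below assumes the usual import names (`scipy`, `numpy`, `matplotlib`, `pysam`, and `IMP` for the modeling platform); `missing_modules` is a hypothetical helper written for this example. `mcl` is a command-line program rather than a Python module, so it is not covered here.

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # import names assumed for the packages installed above
    wanted = ["scipy", "numpy", "matplotlib", "pysam", "IMP"]
    not_found = missing_modules(wanted)
    if not_found:
        print("could not import: %s" % ", ".join(not_found))
    else:
        print("all Python dependencies can be imported")
```

Run it with the Python interpreter installed by conda, otherwise the check looks at a different set of packages.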

## GEM Mapper

In this course we will use GEM to map the reads, but any other read mapper is just fine.

To install GEM, go to the download page: https://sourceforge.net/projects/gemlibrary/files/gem-library/Binary%20pre-release%202/
and download the `i3` version (the other version is for older computers, and you usually won't have to use it).

In [1]:
! wget -O GEM.tbz2 https://sourceforge.net/projects/gemlibrary/files/gem-library/Binary%20pre-release%202/GEM-binaries-Linux-x86_64-core_i3-20121106-022124.tbz2/download

--2018-09-17 12:28:40--  https://sourceforge.net/projects/gemlibrary/files/gem-library/Binary%20pre-release%202/GEM-binaries-Linux-x86_64-core_i3-20121106-022124.tbz2/download
Resolving sourceforge.net... 216.105.38.13
Connecting to sourceforge.net|216.105.38.13|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://downloads.sourceforge.net/project/gemlibrary/gem-library/Binary%20pre-release%202/GEM-binaries-Linux-x86_64-core_i3-20121106-022124.tbz2?r=&ts=1537183721&use_mirror=netcologne [following]
--2018-09-17 12:28:41--  https://downloads.sourceforge.net/project/gemlibrary/gem-library/Binary%20pre-release%202/GEM-binaries-Linux-x86_64-core_i3-20121106-022124.tbz2?r=&ts=1537183721&use_mirror=netcologne
Resolving downloads.sourceforge.net... 216.105.38.13
Connecting to downloads.sourceforge.net|216.105.38.13|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://netcologne.dl.sourceforge.net/project/gemlibrary/gem-librar

Uncompress the archive: 

In [2]:
! tar -xjvf GEM.tbz2

GEM-binaries-Linux-x86_64-core_i3-20121106-022124/
GEM-binaries-Linux-x86_64-core_i3-20121106-022124/gem-indexer
GEM-binaries-Linux-x86_64-core_i3-20121106-022124/gem-indexer.man
GEM-binaries-Linux-x86_64-core_i3-20121106-022124/gem-map-2-map.man
GEM-binaries-Linux-x86_64-core_i3-20121106-022124/gem-mapper.man
GEM-binaries-Linux-x86_64-core_i3-20121106-022124/gem-indexer_fasta2meta+cont
GEM-binaries-Linux-x86_64-core_i3-20121106-022124/gem-2-sam
GEM-binaries-Linux-x86_64-core_i3-20121106-022124/gem-map-2-map
GEM-binaries-Linux-x86_64-core_i3-20121106-022124/gem-indexer_generate
GEM-binaries-Linux-x86_64-core_i3-20121106-022124/gem-2-sam.man
GEM-binaries-Linux-x86_64-core_i3-20121106-022124/gem-mapper
GEM-binaries-Linux-x86_64-core_i3-20121106-022124/gem-indexer_bwt-dna
GEM-binaries-Linux-x86_64-core_i3-20121106-022124/LICENSE


Then remove the files that are not needed (man pages and license) and copy the binaries to a directory in your PATH, for example:

In [3]:
! rm -f GEM-binaries-Linux-x86_64-core_i3-20121106-022124/*man
! rm -f GEM-binaries-Linux-x86_64-core_i3-20121106-022124/LICENSE

In [6]:
! cp GEM-binaries-Linux-x86_64-core_i3-20121106-022124/* ~/miniconda2/bin/
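
To confirm that the copied binaries can really be found, the PATH lookup performed by the shell can be reproduced in a few lines. This is a minimal sketch: `find_on_path` and `missing_programs` are hypothetical helpers written for this example, and the program names are the ones listed when the archive was extracted.

```python
import os


def find_on_path(program):
    """Return the full path of `program` if an executable with that name is on PATH."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_programs(programs):
    """Return the programs that are not found in any PATH directory."""
    return [name for name in programs if find_on_path(name) is None]


if __name__ == "__main__":
    gem_programs = ["gem-mapper", "gem-indexer", "gem-2-sam", "gem-map-2-map"]
    for name in gem_programs:
        print("%-15s %s" % (name, find_on_path(name) or "NOT FOUND"))
```

The same function can be reused for any other command-line tool installed in this notebook by passing its name.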

## DSRC FASTQ compressor

DSRC is a FASTQ compressor. It is not strictly needed, but we use it because the resulting files are significantly smaller than with gzip (>30%) and, more importantly, access to them can be parallelized, which makes it much faster than any other alternative.

DSRC is hosted at https://github.com/lrog/dsrc; here we download a precompiled binary:

In [8]:
! wget http://sun.aei.polsl.pl/dsrc/download/2.0rc/dsrc

wget: /home/dcastillo/miniconda2/lib/libuuid.so.1: no version information available (required by wget)
--2018-09-16 17:44:10--  http://sun.aei.polsl.pl/dsrc/download/2.0rc/dsrc
Resolving sun.aei.polsl.pl (sun.aei.polsl.pl)... 157.158.77.3
Connecting to sun.aei.polsl.pl (sun.aei.polsl.pl)|157.158.77.3|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1761768 (1.7M) [text/plain]
Saving to: ‘dsrc’


2018-09-16 17:44:11 (1.95 MB/s) - ‘dsrc’ saved [1761768/1761768]



In [9]:
! chmod +x dsrc

And move it to the miniconda `bin` directory:

In [10]:
! mv dsrc ~/miniconda2/bin/

## TADbit

For now TADbit is not available through the conda or pip package managers, so to install it we have to clone the repository and compile the binaries:

In [None]:
! git clone https://github.com/3DGenomes/TADbit.git -b dev  # HTTPS works without a GitHub SSH key
! cd TADbit; python setup.py install