{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"## Modeling a region of the chromatin"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__running time__: < 60 min\n",
"\n",
"After the optimization step, we create an ensemble of models using the set of parameters that maximizes the correlation with the input data.\n",
"\n",
"### PSC cell"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
",-----------------.\n",
"| MODELED_REGIONs |\n",
",----.-------.-------.--------.------------.--------.-------.-------.\n",
"| Id | JOBid | Type | PATHid | PARAM_md5 | RESO | BEG | END |\n",
"|----+-------+-------+--------+------------+--------+-------+-------|\n",
"| 1 | 26 | OPTIM | 62 | 49e20a90c8 | 10,000 | 3,395 | 3,545 |\n",
"'----^-------^-------^--------^------------^--------^-------^-------'\n"
]
}
],
"source": [
"%%bash\n",
"\n",
"tadbit describe -w ../results/PSC_rep1/ -t 13"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"From all the models produced (`n_models`, set with `--nmodels`), we tell TADbit to keep only the subset (`n_keep`, set with `--nkeep`) that best satisfies the imposed restraints."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The final ensemble is built with `tadbit model` and its `--model` flag. To tell TADbit which matrix to use, we pass the `jobid` of the previously generated text matrix. The following command takes around 20 minutes with 8 CPUs:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
" o Loading Hi-C matrix\n",
"\n",
" Modeling\n",
"********\n",
"\n",
" - Region: Chromosome chr3 from 33950000 to 35450000 at resolution 10 kb (150 particles)\n",
"\n",
" o Loading optimized parameters\n",
"Loaded UpFreq: 0.6 LowFreq: -0.8 MaxDist: 250 scale: 0.01 cutoff: 2 Correlation:0.72\n",
" 1/1 0.6 -0.8 250 0.01 2.0 | 0.7301\n"
]
}
],
"source": [
"%%bash\n",
"\n",
"tadbit model -w ../results/PSC_rep1/ \\\n",
" --reso 10000 \\\n",
" --crm chr3 \\\n",
" --beg 33950000 --end 35450000 \\\n",
" --nmodels 2000 --nkeep 1000 \\\n",
" --cpu 8 \\\n",
" --jobid 25 \\\n",
" --model"
]
},
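{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sanity check of the numbers in the log above (a sketch, assuming each particle spans exactly one 10 kb bin and that the `scale` of 0.01 is expressed in nm/bp):\n",
"\n",
"$$N = \\frac{35\\,450\\,000 - 33\\,950\\,000}{10\\,000} = 150 \\text{ particles}$$\n",
"\n",
"$$10\\,000 \\text{ bp} \\times 0.01 \\text{ nm/bp} = 100 \\text{ nm of chromatin per particle}$$\n",
"\n",
"The first value matches the 150 particles reported by `tadbit model`."
]
},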
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The result is the generated ensemble of models, stored in a format that can be loaded with the Python TADbit library, in the `07_model/\n",
"In the modelling we used a scale of 0.01 nm/bp; that is, we expect 100 bp/nm of chromatin in each bead and between two consecutive beads.\n",
"\n",
"\n",
"Walking_angle plots the angle between triplets of contiguous particles. The higher these values, the straighter the models.\n",
"
" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "\n", "Interactions plot (particles closer than the given cutoff)\n", "
" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "\n", "The accessibility is calculated by considering a mesh surface around the model and checking if each point of this mesh could be replaced by an object (i.e. a protein) represented as a sphere of a given radius.\n", "
\n", "\n", "The outer part of the model can be excluded from the estimate of the accessible surface because contacts between this outer part and particles outside the model are unknown. To exclude the outer contour, a sphere with a larger radius (superradius) is first tested on the mesh before proceeding to the accessibility calculation.\n", "
" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "