
Counting amino acids

# insulin [Homo sapiens] GI:386828

for amino_acid in "ACDEFGHIKLMNPQRSTVWY":
    number = insulin.count(amino_acid)
    print amino_acid, number
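The loop above assumes that a variable named insulin already holds the protein sequence as a string. As a minimal sketch of the same pattern, the example below uses a short made-up peptide (the name peptide and its sequence are hypothetical, not the insulin sequence) and also collects the counts in a list. It calls print with parentheses and a single argument, so it should run under both Python 2 and Python 3.

```python
# Hypothetical short peptide, used only for illustration (not insulin).
peptide = "MKTLLVAGA"

counts = []
for amino_acid in "ACDEFGHIKLMNPQRSTVWY":
    # Count how often this one-letter code occurs in the peptide.
    number = peptide.count(amino_acid)
    counts.append(number)
    print("%s %d" % (amino_acid, number))
```

Since every residue in this peptide is one of the 20 standard letters, the counts add up to the length of the peptide, which is a quick way to check the result.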

Repetitive tasks

Consider these two solutions to challenge #4 from the previous session:

dna = "AGCTTCGA"

print dna.count("A")
print dna.count("C")
print dna.count("T")
print dna.count("G")

dna = "AGCTTCGA"
elem = dna.count("A")
print elem
elem = dna.count("C")
print elem
elem = dna.count("T")
print elem
elem = dna.count("G")
print elem

Loops with for

The for command repeats other commands:

dna = "AGCTTCGA"

for base in "ACTG":
    print dna.count(base)

The commands that are repeated must be indented (shifted right by four spaces).
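To make the effect of indentation concrete, here is a small sketch that reuses the dna string from above. The variable total is introduced only for this illustration. The indented line runs once for each base, while the unindented print runs a single time after the loop has finished. It is written with print(...) so that it should run under both Python 2 and Python 3.

```python
dna = "AGCTTCGA"
total = 0

for base in "ACTG":
    # Indented: repeated once per base, four times in all.
    total = total + dna.count(base)

# Not indented: executed only once, after the loop ends.
print(total)
```

Because the four bases together account for every character of dna, total ends up equal to the length of the string.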


dna = "AGCTTCGA"
for base in "ACTG":
    print dna.count(base)

Would you prefer this implementation?

dna = "AGCTTCGA"

print dna.count("A")
print dna.count("C")
print dna.count("T")
print dna.count("G")

Why or why not?

Challenge #1

Retrieve the 1132-residue sequence of human telomerase reverse transcriptase isoform 1 from the NCBI protein database. Choose the FASTA format. Copy the sequence to a text file (telomerase.txt). Write a program that reads the telomerase.txt file and prints first the whole sequence and then the sequence residue by residue.

See the Solution to challenge #1

Use a for loop to read a file line by line

input_file = open("my_file.txt")  # the file name must be a quoted string
for line in input_file:
    print line
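The following is a minimal, self-contained sketch of the same idea. It first writes a small example file (the file name example_sequence.txt and its content are hypothetical, chosen only so the example can run on its own), then reads it back line by line. Each line read from a file still ends with a newline character, so the sketch removes it with strip() before printing. It uses print(...) so that it should run under both Python 2 and Python 3.

```python
import os
import tempfile

# Hypothetical file name and content, created here only so the sketch is self-contained.
file_name = os.path.join(tempfile.gettempdir(), "example_sequence.txt")
output_file = open(file_name, "w")
output_file.write(">example header\nAGCT\nTCGA\n")
output_file.close()

input_file = open(file_name)
lines = []
for line in input_file:
    # strip() removes the trailing newline (and any surrounding whitespace).
    clean_line = line.strip()
    lines.append(clean_line)
    print(clean_line)
input_file.close()
```

Without strip(), each printed line would be followed by an extra blank line, because print adds its own newline after the one already present in the text.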

Look how beautiful it can be…

import urllib
url = '\
src_human = urllib.urlopen(url)
for line in src_human:
    print line,

Challenge #2

Write a program that reads the telomerase.txt file and prints its content line by line.

See the Solution to challenge #2

Manipulating strings


Challenge #3

Which amino acid is the most frequent in the sequence of the telomerase reverse transcriptase isoform 1?

See the Solution to challenge #3



