Sequence simulations are useful in several kinds of bioinformatics studies. Earlier posts on this blog covered such simulations with scripts written in R. In this note, we will use Python/Biopython to simulate DNA sequences. The objective is to simulate DNA sequences after taking the following inputs from the user:
1) Length of the desired sequence
2) Desired number of sequences
3) IUPAC code or Standard code (ATGC)
The script takes the above inputs and stores the sequences, in FASTA format, in the folder where the code is executed. The output file name carries a date and time stamp.
============================================
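# Bio.Alphabet was removed in Biopython 1.78; Bio.Data.IUPACData provides the same letter sets
from Bio.Data import IUPACData
import random
from datetime import datetime

n = int(input("type the length of the sequence: "))
count = int(input("type the number of sequences: "))
k = input("choose S for standard code and I for IUPAC: ").strip().upper()
if k == "S":
    letters = IUPACData.unambiguous_dna_letters  # G, A, T, C
elif k == "I":
    letters = IUPACData.ambiguous_dna_letters  # adds R, Y, W, S, M, K, H, B, V, D, N
else:
    raise SystemExit("please answer S or I")
# the output file is created in the current working directory with a date and time stamp
filename = "dnasequence_" + datetime.now().strftime("%Y%m%d_%H%M%S") + ".fa"
with open(filename, "a") as dnafile:
    for j in range(count):
        # draw n letters uniformly at random from the chosen alphabet
        my_seq = "".join(random.choice(letters) for _ in range(n))
        seq_id = "seq " + str(j + 1)
        dnafile.write(">" + seq_id + "\n" + my_seq + "\n")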
============================================
Please note that Python is particular about indentation, so make sure the indentation is correct when you copy the script.
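For use inside a larger pipeline, the same idea can be wrapped in a function that takes its inputs as arguments instead of prompting for them. The following is only a minimal sketch under stated assumptions: the names simulate_fasta, STANDARD_DNA and IUPAC_DNA and the seed parameter are hypothetical and not part of the script above, and the two alphabets are typed out by hand so that the sketch runs without Biopython. As in the script, every letter is drawn with equal probability, so in IUPAC mode an ambiguity code such as N is as likely as A, C, G or T.

```python
import random

# Assumed alphabets, written out by hand so the sketch needs no Biopython install.
STANDARD_DNA = "ACGT"
IUPAC_DNA = "ACGTRYSWKMBDHVN"


def simulate_fasta(length, count, letters=STANDARD_DNA, seed=None):
    """Return `count` random sequences of `length` letters as FASTA text."""
    rng = random.Random(seed)  # private generator; a fixed seed gives repeatable output
    records = []
    for number in range(1, count + 1):
        sequence = "".join(rng.choice(letters) for _ in range(length))
        records.append(f">seq {number}\n{sequence}\n")
    return "".join(records)


# Example call: three IUPAC sequences of length 12, repeatable because of the seed.
fasta_text = simulate_fasta(length=12, count=3, letters=IUPAC_DNA, seed=1)
print(fasta_text, end="")
```

Passing a seed makes a simulation repeatable, which helps when the simulated sequences are used as test data. Leaving it as None gives a different set of sequences on every run, like the interactive script.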