Given these inputs:
my $init_seq = "AAAAAAAAAA" #length 10 bp
my $sub_rate = 0.003;
my $nof_tags = 1000;
my @dna = qw( A C G T );
I want to generate:
One thousand length-10 tags
Substitution rate for each position in a tag is 0.003
Yielding output like:
AAAAAAAAAA
AATAACAAAA
.....
AAGGAAAAGA # 1000th tags
Is there a compact ...
Hello,
I'm trying to make a glob-like expansion of a set of DNA strings that have multiple possible bases.
The base of my DNA strings contains the letters A, C, G, and T. However, I can have special characters like M which could be an A or a C.
For example, say I have the string:
ATMM
I would like to take this string as input and o...
I need a bit of help with is this code. I know the sections that should be recursive, or at least I think I do but am not sure how to implement it. I am trying to implement a path finding program from an alignment matrix that will find multiple routes back to the zero value. For example if you excute my code and insert CGCA as the first ...
Task:
to cluster a large pool of short DNA fragments in classes that share common sub-sequence-patterns and find the consensus sequence of each class.
Pool: ca. 300 sequence fragments
8 - 20 letters per fragment
4 possible letters: a,g,t,c
each fragment is structured in three regions:
5 generic letters
8 or more positions of g's...
How can I modify the Smith-Waterman algorithm using a substitution matrix to align proteins in Perl?
[citations needed]
...
Hi,
I have been trying to wrap my head around this for a while now but have not been able to come up with a good solution. Here goes:
Given a number of sets:
set1: A, T
set2: C
set3: A, C, G
set4: T
set5: G
I want to generate all possible sequences from a list of sets. In this example the length of the sequence is 5, but it can be a...
I'd like to create an rRNA sequence database with a web front end for the lab I work in. It seems common in biology to want to search a large number of sequences using alignment algorithms such as BLAST and HMMER, so I wondered if there is any existing php/python/rails projects that allow easy creation of a generic sequence database with...
Hi
Which commercial databases are adept in storing biological sequences like Protein/DNA sequence? Are there any which were designed specifically to store such sequences?
cheers
...
I have tried google with no luck. I have seen some weak references to robust multi-array averaging done with python but no code. I am not so interested in reinventing the wheel. Any suggestions on a python module, script ....
If I could find a nice explanation or example of the algorithm I would write a python implementation to share.
...
I have an interesting genetics problem that I would like to solve in native Python (nothing outside the standard library). This in order for the solution to be very easy to use on any computer, without requiring the user to install additional modules.
Here it is. I received 100,000s of DNA sequences (up to 2 billion) from a 454 new gene...
I am working with DNA sequences of length 25 (see examples below). I have a list of 230,000 and need to look for each sequence in the entire genome (toxoplasma gondii parasite) I am not sure how large the genome is but much more that 230,000 sequences.
I need to look for each of my sequences of 25 characters example(AGCCTCCCATGATTGAACAG...
(This is a database / R commands question)
I wish (for my thesis work), to import tRNA data into R and have it aligned.
My questions are:
1) What resources can I use for the data.
2) What commands might help me with the import/alignment.
So far, I found two nice repositories that holds such data:
tRNAdb at the University of Leipzig
...
Hi, I'd like to do some kind of "search and replace" algorithm which will, in an efficient manner if possible, identify a substring of a string which occurs more than once and replace all occurrences of that substring with a token.
For example, given a string "AbcAdAefgAbijkAblmnAbAb", notice that "A" recurs, so reduce in pass one to "#...
I need to compare the DNA sequences of X and Y chromosomes, and find patterns (composed of around 50-75 base pairs) that are unique to the Y chromosome. Note that these sequence parts can repeat in the chromosome. This needs to be done quickly (BLAST takes 47 days, need a few hours or less). Are there any algorithms or programs in partic...
I'm attempting to write a basic dna sequencer. In that, given two sequences of the same length, it will output the strings which are the same, with a minimal length of 3.
So input of
abcdef dfeabc
will return
1 abc
I am not sure how to go about solving the problem.
I can compare the two strings, and see if they are completely equal...