(This is a database / R commands question)
For my thesis work, I want to import tRNA data into R and align it.
My questions are:
1) What resources can I use for the data?
2) What commands might help me with the import/alignment?
So far, I have found two nice repositories that hold such data:
tRNAdb at the University of Leipzig
...
I'm looking for a faster way to calculate GC content for DNA strings read in from a FASTA file. This boils down to taking a string and counting the number of times that the letter 'G' or 'C' appears. I also want to specify the range of characters to consider.
I have a working function that is fairly slow, and it's causing a bottleneck...
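A minimal sketch of the counting step described above, assuming the sequence is already in memory as a plain string. The function name `gc_content` and the 0-based, end-exclusive `start`/`end` parameters are illustrative choices, not taken from the question. It relies on the built-in `str.count`, which avoids a per-character loop in Python.

```python
def gc_content(seq, start=0, end=None):
    """Return the fraction of G/C bases in seq[start:end].

    Coordinates are 0-based and end-exclusive (an assumption; the
    question does not say which convention it uses).
    """
    window = seq[start:end].upper()
    if not window:
        return 0.0
    # str.count scans the string in C, so this is two fast passes.
    return (window.count("G") + window.count("C")) / len(window)


if __name__ == "__main__":
    print(gc_content("ATGCGC"))        # whole string
    print(gc_content("ATGCGC", 2, 6))  # only the last four bases
```

Ambiguity codes such as N are counted in the denominator here; whether that is desired depends on the analysis.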
I'm trying to do a simple genomic track intersection in R, and running into major performance problems, probably related to my use of for loops.
In this situation, I have pre-defined windows at intervals of 100bp and I'm trying to calculate how much of each window is covered by the annotations in mylist. Graphically, it looks somethi...
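One way to avoid a loop over every base is to merge the annotations first and then touch only the windows each merged interval overlaps. The sketch below is in Python rather than R and makes several assumptions: annotations are `(start, end)` pairs with 0-based, end-exclusive coordinates, and the names `merge_intervals` and `window_coverage` are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def window_coverage(annotations, total_length, window=100):
    """Return the number of covered bases in each fixed-size window."""
    n_windows = (total_length + window - 1) // window
    covered = [0] * n_windows
    for start, end in merge_intervals(annotations):
        # Clip the interval to the region being tiled.
        start, end = max(start, 0), min(end, total_length)
        if start >= end:
            continue
        # Visit only the windows this interval actually overlaps.
        for w in range(start // window, (end - 1) // window + 1):
            w_start = w * window
            w_end = min(w_start + window, total_length)
            covered[w] += min(end, w_end) - max(start, w_start)
    return covered


if __name__ == "__main__":
    # Two overlapping annotations across three 100 bp windows.
    print(window_coverage([(50, 150), (120, 180)], 300))  # [50, 80, 0]
```

Merging first means overlapping annotations are not counted twice, and the work scales with the number of annotations rather than the number of bases.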
I'm making a large number of seqlogos programmatically. They are hundreds of columns wide and so running a seqlogo normally creates letters that are too thin to see. I've noticed that I only care about a few of these columns (not necessarily consecutive columns) ... most are noise but some are highly conserved.
I use something like this...
I work as support staff in a Biology research institute as a student, and Perl seems to be used everywhere. Not for every single project, but it seems that more than half the people here have a few Perl books in/on their office/desk.
Why is Perl used so much in Biology?
...
EDIT: Please close this question.
I asked and got an answer for it on BioStar here.
In BioPerl, a sequence object can have any number of features, and each of these can have subfeatures nested within them. For example, a feature may be a complete coding sequence of a gene, and its subfeatures might be individual exons that ar...
How can I extract DNA sequences from the UCSC Genome Browser with a Perl script, given their coordinates?
...
I have a dense matrix where the indices correspond to genes. While gene identifiers are often integers, they are not contiguous integers. They could be strings instead, too.
I suppose I could use a boost sparse matrix of some sort with integer keys, and it wouldn't matter if they're contiguous. Or would this still occupy a great deal of...
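A common alternative to a sparse matrix is to keep the dense storage and add a lookup table from gene identifier to a contiguous position. The sketch below illustrates the idea in Python rather than C++; the class name `LabelledMatrix` and its methods are hypothetical, and it assumes a square matrix indexed by the same set of genes on both axes.

```python
class LabelledMatrix:
    """Dense square matrix addressed by arbitrary hashable labels."""

    def __init__(self, labels, fill=0.0):
        # Drop duplicate labels while preserving first-seen order.
        unique = list(dict.fromkeys(labels))
        # Map each label (int or str) to a contiguous 0-based position.
        self.index = {label: i for i, label in enumerate(unique)}
        n = len(unique)
        self.data = [[fill] * n for _ in range(n)]

    def get(self, row, col):
        return self.data[self.index[row]][self.index[col]]

    def set(self, row, col, value):
        self.data[self.index[row]][self.index[col]] = value


if __name__ == "__main__":
    m = LabelledMatrix([1007, "BRCA1", 42])
    m.set(1007, "BRCA1", 3.5)
    print(m.get(1007, "BRCA1"))
```

Memory use stays at n x n for n distinct genes regardless of how large or sparse the identifiers are; the lookup adds one hash access per index.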
I'm storing contacts between different elements. I want to eliminate elements of certain type and store new contacts of elements which were interconnected by the eliminated element.
Problem background
Imagine this problem. You have a water molecule which is in contact with other molecules (if the contact is a hydrogen bond, there can b...
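The operation described, removing an element and recording contacts between everything it bridged, can be sketched as node elimination on an undirected graph. This is only one reading of the truncated description. The function name `eliminate` and the adjacency representation (a dict mapping each element to the set of its contacts) are assumptions for illustration.

```python
from itertools import combinations


def eliminate(contacts, to_remove):
    """Remove the given elements and connect their former neighbours.

    contacts: dict mapping element -> set of elements it touches
              (undirected, so each contact appears on both sides).
    """
    graph = {node: set(nbrs) for node, nbrs in contacts.items()}
    for node in to_remove:
        neighbours = graph.pop(node, set())
        neighbours.discard(node)  # ignore self-contacts
        for other in neighbours:
            graph.setdefault(other, set()).discard(node)
        # Every pair bridged by the removed element is now in contact.
        for a, b in combinations(sorted(neighbours), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


if __name__ == "__main__":
    # A water molecule W bridging two other molecules A and B.
    contacts = {"A": {"W"}, "B": {"W"}, "W": {"A", "B"}}
    print(eliminate(contacts, ["W"]))
```

Because elements are removed one at a time, chains of removed elements (A-W1-W2-B) also end up connecting A and B.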
Hey guys this is my first question on here. I'm trying to make a local copy of the UniprotKB in SQL.
The UniprotKB is 2.1GB, and it comes in XML and in a special text format used by SwissProt.
Here are my options:
1) Use a SAX parser (XML) - I chose Ruby, and Nokogiri. I started writing the parser, but my initial reaction: how would I map...
My friend suggested I apply for a job at EMBL. I'm not a bioinformatician in any way, but my friend (who, by the way, is a biologist working at EMBL) insists that I could adapt to the new environment as long as I have an interest in the subject and am generally good at learning new things.
But here is a catch. For the last 4 years I've been wor...
I have done a couple of research jobs in bioinformatics, and I used Matlab for them. Matlab had a lot of powerful tools and was easy to use. I did things with genome sequencing and predicting metabolic pathways. I am wondering what other people think is best. Or there might not be one specific language but a few that lend themselves be...
I have a scientific data management problem which seems general, but I can't find an existing solution or even a description of it, which I have long puzzled over. I am about to embark on a major rewrite (python) but I thought I'd cast about one last time for existing solutions, so I can scrap my own and get back to the biology, or at l...
Hi there. I have a FASTA file containing several protein sequences. The format is like
----------------------
>protein1
MYRALRLLARSRPLVRAPAAALASAPGLGGAAVPSFWPPNAAR
MASQNSFRIEYDTFGELKVPNDKYYGAQTVRSTMNFKIGGVTE
RMPTPVIKAFGILKRAAAEVNQDYGLDPKIANAIMKAADEVAE
GKLNDHFPLVVWQTGSGTQTNMNVNEVISNRAIEMLGGELGSK
IPVHPNDHVNKSQ
>protein2
MRSRPAGPALLLLLLF...
Hi all,
I am new to image processing, which I will be using for medical images. I am looking for video lectures or any other good learning resources. Any help is appreciated. Thanks in advance.
Regards,
Saghar Ayyaz
...
I've written a bookmarklet to open a user-defined web link, in this case a specific genomic location in the UCSC genome browser.
javascript:d=%22%22+(window.getSelection?window.getSelection():document.getSelection?document.getSelection():document.selection.createRange().text);d=d.replace(/%5Cr%5Cn%7C%5Cr%7C%5Cn/g,%22%20,%22);if(...
How can I fetch genomic sequence efficiently using Python? For example, from a .fa file or some other easily obtained format? I basically want an interface fetch_seq(chrom, strand, start, end) which will return the sequence [start, end] on the given chromosome on the specified strand.
Analogously, is there a programmatic python interf...
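A minimal pure-Python sketch of the `fetch_seq(chrom, strand, start, end)` interface asked about above. It assumes 1-based, inclusive coordinates (the question writes `[start, end]` but does not state the convention), takes an extra `genome` argument, and uses a hypothetical helper `read_fasta` that loads the whole file into memory. For large genomes an indexed FASTA reader would be more efficient than this.

```python
# Translation table for complementing DNA, keeping case and N.
_COMPLEMENT = str.maketrans("ACGTacgtNn", "TGCAtgcaNn")


def read_fasta(lines):
    """Parse FASTA lines into a dict of {name: sequence}."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Use the first word of the header as the sequence name.
            name = (line[1:].split() or [""])[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def fetch_seq(genome, chrom, strand, start, end):
    """Return bases start..end (1-based, inclusive) on the given strand."""
    seq = genome[chrom][start - 1:end]
    if strand == "-":
        # Reverse complement for the minus strand.
        seq = seq.translate(_COMPLEMENT)[::-1]
    return seq


if __name__ == "__main__":
    genome = read_fasta([">chr1 example", "ACGTAC", "GGTT"])
    print(fetch_seq(genome, "chr1", "+", 2, 5))  # CGTA
    print(fetch_seq(genome, "chr1", "-", 2, 5))  # TACG
```

With a real file, `read_fasta(open(path))` would work the same way, since the function only iterates over lines.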
I want to use the P4 Python Package on a windows machine.
from: [http://www.bmnh.org/~pf/p4.html][1]
I have python 2.6 installed and working with numpy ready and realines.py installed.
There is a win32-gbu version of GSL installed on my windows machine, from
gnuwin32.sourceforge.net/packages/gsl.htm
When I try to install P4, using set...
I'm trying to get the mean length of FASTA sequences using Erlang. A FASTA file looks like this:
>title1
ATGACTAGCTAGCAGCGATCGACCGTCGTACGC
ATCGATCGCATCGATGCTACGATCGATCATATA
ATGACTAGCTAGCAGCGATCGACCGTCGTACGC
ATCGATCGCATCGATGCTACGATCTCGTACGC
>title2
ATCGATCGCATCGATGCTACGATCTCGTACGC
ATGACTAGCTAGCAGCGATCGACCGTCGTACGC
ATCGATCGCATCGATGCTACGATC...
I'm writing a Clojure implementation of this coding challenge, attempting to find the average length of sequence records in Fasta format:
>1
GATCGA
GTC
>2
GCA
>3
AAAAA
For more background see this related StackOverflow post about an Erlang solution.
My beginner Clojure attempt uses lazy-seq to attempt to read in the file one record a...
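For comparison, the same record-at-a-time idea can be sketched with a Python generator instead of Clojure's lazy-seq. This is an illustration of the approach, not the poster's code. The function names `record_lengths` and `mean_record_length` are hypothetical, and only a running total and count are kept, so the file never has to fit in memory.

```python
def record_lengths(lines):
    """Lazily yield the sequence length of each FASTA record."""
    length = None  # None until the first header is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def mean_record_length(lines):
    """Average record length, computed in a single streaming pass."""
    total = count = 0
    for n in record_lengths(lines):
        total += n
        count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    example = [">1", "GATCGA", "GTC", ">2", "GCA", ">3", "AAAAA"]
    print(list(record_lengths(example)))  # [9, 3, 5]
    print(mean_record_length(example))
```

Passing an open file object in place of the list streams the file line by line.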