Hi,

I am working in R and have a DNA sequence, for example "cgtcgctgtttgtcaaagtcg....", that may be 1000+ letters long. However, I only want to look at letters 5 to 200, for example, and to define this subset of the string as a new object. I tried looking at the nchar function, but haven't found anything that would do this.

Thank you for your help.

Chris

A: 

Could you first just make a temporary string that's trimmed from the long one?

lod3n
How do I trim it? Sorry for the naive question (I am a new user).
C_BioInfo
I think you would use substr
lod3n
+8  A: 

Try

substr("cgtcgctgtttgtcaa[...]", 5, 200)

See substr().
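
As a rough sketch of how this could store the subset as a new object, as the question asks: the sequence below is made up for illustration (a 16-letter motif repeated 100 times), and the names dna and dna_sub are hypothetical. It relies on substr() using 1-based positions with both the start and stop included.

```r
# Hypothetical example sequence: a 16-letter motif repeated to give 1600 letters
dna <- paste(rep("cgtcgctgtttgtcaa", 100), collapse = "")

# Keep letters 5 to 200 (positions are 1-based and both ends are included)
dna_sub <- substr(dna, 5, 200)

nchar(dna)      # length of the full sequence
nchar(dna_sub)  # length of the subset: 200 - 5 + 1 letters
```

The original string is left unchanged, so dna and dna_sub can be used side by side.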

Artelius
Thanks a lot! Chris
C_BioInfo
That link for substr documentation appears to be dead. How about this one: http://stat.ethz.ch/R-manual/R-patched/library/base/html/substr.html
Argalatyr
+4  A: 

Use the substr function:

> tmp.string <- paste(LETTERS, collapse="")
> tmp.string <- substr(tmp.string, 4, 10)
> tmp.string
[1] "DEFGHIJ"
Shane
+2  A: 

See also the Bioconductor package Biostrings, which is a good choice if you need to handle large biological sequences or sets of sequences.

# To install (current Bioconductor releases use BiocManager instead of biocLite):
# install.packages("BiocManager"); BiocManager::install("Biostrings")
library(Biostrings)
s <- paste(rep("gtcgctgtttgtcaac", 20), collapse = "")
d <- DNAString(s)
d[5:200]
as.character(d[5:200])
Paolo