SDF Import
The following gives an overview of the most important import/export
functionalities for small molecules provided by
ChemmineR
. The given example creates an instance of the
SDFset
class using as sample data set the first 100
compounds from this PubChem SD file (SDF):
Compound_00650001_00675000.sdf.gz
(ftp://ftp.ncbi.nih.gov/pubchem/Compound/CURRENT-Full/SDF/).
SDFs can be imported with the read.SDFset
function:
sdfset <- read.SDFset("http://faculty.ucr.edu/ tgirke/Documents/R_BioCond/Samples/sdfsample.sdf")
data(sdfsample) # Loads the same SDFset provided by the library
sdfset <- sdfsample
valid <- validSDF(sdfset) # Identifies invalid SDFs in SDFset objects
sdfset <- sdfset[valid] # Removes invalid SDFs, if there are any
Import SD file into SDFstr
container:
sdfstr <- read.SDFstr("http://faculty.ucr.edu/ tgirke/Documents/R_BioCond/Samples/sdfsample.sdf")
Create
SDFset
from SDFstr
class:
sdfstr <- as(sdfset, "SDFstr")
sdfstr
## An instance of "SDFstr" with 100 molecules
as(sdfstr, "SDFset")
## An instance of "SDFset" with 100 molecules
SMILES Import
The read.SMIset
function imports one or many molecules
from a SMILES file and stores them in a SMIset
container. The input file is expected to contain one SMILES string per
row with tab-separated compound identifiers at the end of each line. The
compound identifiers are optional.
Create sample SMILES file and then import it:
data(smisample); smiset <- smisample
write.SMI(smiset[1:4], file="sub.smi")
smiset <- read.SMIset("sub.smi")
Inspect content of SMIset
:
data(smisample) # Loads the same SMIset provided by the library
smiset <- smisample
smiset
## An instance of "SMIset" with 100 molecules
view(smiset[1:2])
## $`650001`
## An instance of "SMI"
## [1] "O=C(NC1CCCC1)CN(c1cc2OCCOc2cc1)C(=O)CCC(=O)Nc1noc(c1)C"
##
## $`650002`
## An instance of "SMI"
## [1] "O=c1[nH]c(=O)n(c2nc(n(CCCc3ccccc3)c12)NCCCO)C"
Accessor functions:
cid(smiset[1:4])
## [1] "650001" "650002" "650003" "650004"
smi <- as.character(smiset[1:2])
Create SMIset
from named character vector:
as(smi, "SMIset")
## An instance of "SMIset" with 2 molecules