The drug-related expression data are downloaded from the CMAP web site. The getCmap function downloads the CMAP rank matrix along with the compound annotations, and getCmapCEL downloads the corresponding 7,056 CEL files. Both functions write the downloaded files to the data and data/CEL directories within the current working directory of the user's R session. Because some of the raw data sets are large, the functions rerun the download only if the argument rerun is set to TRUE. If the raw data are not needed, users can skip this time-consuming download step and work with the preprocessed data obtained in the next section.
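The rerun guard described above can be illustrated with a minimal sketch. The helper `download_if_requested` and its arguments `url` and `dest` are hypothetical names introduced here for illustration only. This is not the implementation of getCmap or getCmapCEL, just one way such a guard might look.

```r
# Hypothetical helper (an assumption, not part of the package): fetch `url`
# into `dest` only when `rerun` is TRUE, so large files are not re-downloaded
# by accident.
download_if_requested <- function(url, dest, rerun = FALSE) {
  if (!rerun) {
    message("Skipping download of ", basename(dest),
            "; set rerun = TRUE to fetch it.")
    return(invisible(FALSE))
  }
  # Create the target directory (for example data or data/CEL) if missing.
  dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
  # mode = "wb" keeps binary files intact on Windows.
  utils::download.file(url, destfile = dest, mode = "wb")
  invisible(TRUE)
}
```

With the default rerun = FALSE the call returns without touching the network or the file system, which is the behaviour one would want when the raw data have already been fetched or are not needed.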
The experimental design of the CMAP project is defined in the file cmap_instances_02.xls. Note that this file required some cleaning in LibreOffice (Excel would also work). It was then saved as a tab-delimited text file named cmap_instances_02.txt. The following count statistics are extracted from this file.
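As a rough sketch of how such counts might be derived from a tab-delimited design table, the two helpers below read the file and report the number of rows plus the number of distinct values in selected columns. The function names `read_design` and `count_summary`, the file location, and the column name in the commented usage are assumptions made for illustration and should be checked against the actual file.

```r
# Read a tab-delimited design table; check.names = FALSE keeps the
# original column headers unchanged.
read_design <- function(path) {
  read.delim(path, check.names = FALSE, stringsAsFactors = FALSE)
}

# Return the row count and the number of distinct values per chosen column.
count_summary <- function(design, columns) {
  missing_cols <- setdiff(columns, colnames(design))
  if (length(missing_cols) > 0) {
    stop("Columns not found: ", paste(missing_cols, collapse = ", "))
  }
  distinct <- vapply(design[columns],
                     function(x) length(unique(x)),
                     integer(1))
  c(rows = nrow(design), distinct)
}

# Usage (path and column name are assumptions):
# design <- read_design("data/cmap_instances_02.txt")
# count_summary(design, "cmap_name")
```

Stopping on unknown column names, rather than silently returning NA, makes header changes introduced during the manual cleaning step easier to spot.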