Some annotation data in Bioconductor is distributed as .db packages, e.g. mouse4302.db, org.Hs.eg.db etc. It is often useful to know what is inside such a package and how each db works. Please note that all the information inside a db package can be retrieved with commands that resemble SQL statements with slight modifications (e.g. joins, select etc.). The following points are written with mouse4302.db in mind.
The logic of extracting information from mouse4302.db is as follows:
1) select(Bioconductor database, what you have, what you want, data type of what you have)
2) In Bioconductor parlance:
select(db package, keys, columns in the database, key type)
db package = mouse4302.db = provided by Bioconductor
keys = what you have = Affymetrix probe IDs (here), e.g. 1425683_at, 1425684_at
columns = what you want from the database
keytype = the type of data you have = PROBEID (Affymetrix probe set IDs)
3) Code:
a) select(mouse4302.db, keys=ids, columns=c("SYMBOL"), keytype = "PROBEID")
ids = a character vector of Affymetrix probe IDs from the mouse4302 chip
In this command, we supply a vector of Affymetrix probe IDs (keys=ids), request the corresponding gene symbols (columns=c("SYMBOL")) and tell the program that the keys we supplied are probe IDs (keytype="PROBEID").
The output will contain the probe set IDs and the corresponding gene symbols.
b) select(mouse4302.db, keys=c("1425683_at", "1425684_at"), columns=c("SYMBOL"), keytype = "PROBEID")
In this command, we supply two Affymetrix probe IDs (keys=c("1425683_at", "1425684_at")), request the corresponding gene symbols (columns=c("SYMBOL")) and tell the program that the keys we supplied are probe IDs (keytype="PROBEID").
The output will contain the two probe set IDs and the corresponding gene symbols.
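Put together, commands a) and b) can be run as a short script. This is only a minimal sketch: it assumes that Bioconductor and the mouse4302.db package are already installed, and the variable name res is just chosen for illustration.

```r
# Assumption: mouse4302.db is installed, e.g. via BiocManager::install("mouse4302.db").
library(mouse4302.db)

# 'ids' is a character vector of probe set IDs (the two IDs used in the text).
ids <- c("1425683_at", "1425684_at")

# keys = what we have, columns = what we want, keytype = type of what we have.
res <- select(mouse4302.db, keys = ids, columns = c("SYMBOL"), keytype = "PROBEID")

# The result is a data.frame: one column for the keys, one for each requested column.
colnames(res)
res
```

If dplyr is attached in the same session, its own select() can mask this one; writing AnnotationDbi::select(...) explicitly avoids the clash.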
Next, we would like to know what kinds of keys the package accepts, i.e. the key types.
The corresponding command is: keytypes(mouse4302.db)
This displays the keytypes that can be used to query mouse4302.db. Remember that these are the only keytypes allowed in this package. In the example above we used the "PROBEID" keytype because we wanted the gene symbols for the Affymetrix probes we had.
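As a small sketch of this step (again assuming mouse4302.db is installed; kt is an illustrative variable name), the keytypes can be stored and checked before they are used. The keys() function, which belongs to the same interface, lists the valid keys of a given keytype.

```r
library(mouse4302.db)

# All keytypes that may be passed as 'keytype' to select().
kt <- keytypes(mouse4302.db)
kt

# "PROBEID" should be among them, since it was used as the keytype earlier.
"PROBEID" %in% kt

# keys() returns the valid keys of one keytype; show the first few probe IDs.
head(keys(mouse4302.db, keytype = "PROBEID"))
```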
Now, we would like to know what kind of information we can get from this db. The available information can be listed with the following command:
command: columns(mouse4302.db)
Running the above command lists the columns held in mouse4302.db, each of which can be extracted using an appropriate keytype.
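A short sketch of how the column list might be used in practice, assuming mouse4302.db is installed: before calling select(), one can check that the columns of interest are actually offered. The names cols and wanted are only illustrative.

```r
library(mouse4302.db)

# All columns that may be requested through the 'columns' argument of select().
cols <- columns(mouse4302.db)
cols

# Columns requested in these notes; setdiff() shows any that are NOT available.
wanted <- c("SYMBOL", "GO", "ONTOLOGY")
setdiff(wanted, cols)
```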
Let us extract the mouse probe IDs and the Gene Ontology annotation (GO and ONTOLOGY) for the gene symbol Akr1b10. Please remember that the resulting list can be very long.
command: select(mouse4302.db, keys=c("Akr1b10"), columns=c("PROBEID", "GO", "ONTOLOGY"), keytype = "SYMBOL")
Please note that the key is a gene symbol and the keytype is SYMBOL. We wanted to extract the Affymetrix probes for that gene on the mouse4302 chip, the Gene Ontology IDs and the GO category (Molecular Function, Cellular Component or Biological Process). By default, mouse4302.db also returns the evidence code alongside the GO annotation.
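Because this result has one row per combination of probe and GO term, it helps to summarise or filter it. The following is a minimal sketch under the same assumption that mouse4302.db is installed; the variable names go and bp are illustrative, and the ontology codes BP, CC and MF stand for Biological Process, Cellular Component and Molecular Function.

```r
library(mouse4302.db)

# Key is a gene symbol, so the keytype is "SYMBOL".
go <- select(mouse4302.db, keys = c("Akr1b10"),
             columns = c("PROBEID", "GO", "ONTOLOGY"), keytype = "SYMBOL")

# Size of the result and a first look at it.
dim(go)
head(go)

# Number of rows per ontology (BP, CC, MF).
table(go$ONTOLOGY)

# Keep only the Biological Process annotations, dropping rows without an ontology.
bp <- go[!is.na(go$ONTOLOGY) & go$ONTOLOGY == "BP", ]
head(bp)
```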