| makeTxDbFromBiomart {GenomicFeatures} | R Documentation |
The makeTxDbFromBiomart function allows the user
to make a TxDb object from transcript annotations
available on a BioMart database.
makeTxDbFromBiomart(biomart="ensembl",
dataset="hsapiens_gene_ensembl",
transcript_ids=NULL,
circ_seqs=DEFAULT_CIRC_SEQS,
filters="",
id_prefix="ensembl_",
host="www.biomart.org",
port=80,
miRBaseBuild=NA)
getChromInfoFromBiomart(biomart="ensembl",
dataset="hsapiens_gene_ensembl",
id_prefix="ensembl_",
host="www.biomart.org",
port=80)
biomart |
which BioMart database to use.
Get the list of all available BioMart databases with the
listMarts function from the biomaRt package. |
dataset |
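which dataset from BioMart. For example: "hsapiens_gene_ensembl".
Use the listDatasets function from the biomaRt package to discover
which datasets are available in a given BioMart database. |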
transcript_ids |
optionally, only retrieve transcript annotation data for the specified set of transcript ids. If this is used, then the meta information displayed for the resulting TxDb object will say 'Full dataset: no'. Otherwise it will say 'Full dataset: yes'. |
circ_seqs |
a character vector to list out which chromosomes should be marked as circular. |
filters |
Additional filters to use in the BioMart query. Must be
a named list whose names are BioMart filter names and whose
elements are the corresponding filter values. |
host |
The host URL of the BioMart. Defaults to www.biomart.org. |
port |
The port to use in the HTTP communication with the host. |
id_prefix |
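Specifies the prefix used in BioMart attributes. For
example, some BioMarts may have an attribute specified as
"ensembl_transcript_id" whereas others have the same attribute
specified as "transcript_id". Defaults to "ensembl_". |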
miRBaseBuild |
specify the string for the appropriate build
information from mirbase.db to use for microRNAs. This can be
learned by calling supportedMiRBaseBuildValues. |
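As a rough sketch of the filters argument described above, a named list can be built and checked before it is passed to makeTxDbFromBiomart. The filter names "chromosome_name" and "biotype" are assumptions used for illustration only. Which filters exist depends on the BioMart database and dataset, and they can be listed with the listFilters function from the biomaRt package.

```r
## Names are BioMart filter names, elements are the values to keep.
## "chromosome_name" and "biotype" are assumed to be valid filters for
## the chosen dataset; check with biomaRt::listFilters() first.
my_filters <- list(chromosome_name = c("21", "22"),
                   biotype = "protein_coding")

## Sanity checks: must be a list, and every element must be named.
stopifnot(is.list(my_filters),
          !is.null(names(my_filters)),
          all(nzchar(names(my_filters))))

## Not run here (requires network access to the BioMart host):
## txdb <- makeTxDbFromBiomart(filters = my_filters)
```

Checking the list locally first avoids sending a malformed query to the remote BioMart host.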
makeTxDbFromBiomart is a convenience function that feeds
data from a BioMart database to the lower level
makeTxDb function.
See ?makeTxDbFromUCSC for a similar function
that feeds data from the UCSC source.
The listMarts function from the biomaRt package can be
used to list all public BioMart databases.
Not all databases returned by this function contain datasets that
are compatible with (i.e. understood by) makeTxDbFromBiomart.
Here is a list of datasets known to be compatible (updated on Sep 24, 2014):
All the datasets in the main Ensembl database:
use biomart="ensembl".
All the datasets in the Ensembl Fungi database:
use biomart="fungi_mart_XX" where XX is the release
version of the database e.g. "fungi_mart_22".
All the datasets in the Ensembl Metazoa database:
use biomart="metazoa_mart_XX" where XX is the release
version of the database e.g. "metazoa_mart_22".
All the datasets in the Ensembl Plants database:
use biomart="plants_mart_XX" where XX is the release
version of the database e.g. "plants_mart_22".
All the datasets in the Ensembl Protists database:
use biomart="protists_mart_XX" where XX is the release
version of the database e.g. "protists_mart_22".
All the datasets in the Gramene Mart:
use biomart="ENSEMBL_MART_PLANT".
Not all these datasets have CDS information.
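The versioned mart names listed above follow a simple pattern, which can be sketched as below. The helper ensembl_genomes_mart is hypothetical (it is not part of GenomicFeatures or biomaRt), and release 22 is simply the example release used above. The release actually served by a given host should be checked with listMarts.

```r
## Hypothetical helper, for illustration only: build the mart name
## "<division>_mart_<release>" for one of the divisions listed above.
ensembl_genomes_mart <- function(division, release) {
    division <- match.arg(division,
                          c("fungi", "metazoa", "plants", "protists"))
    stopifnot(length(release) == 1L, !is.na(release))
    paste0(division, "_mart_", release)
}

biomart_name <- ensembl_genomes_mart("fungi", 22)
biomart_name

## Not run here (requires network access, and the release must exist
## on the host being queried):
## txdb <- makeTxDbFromBiomart(biomart = biomart_name, dataset = ...)
```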
A TxDb object.
M. Carlson and H. Pages
makeTxDbFromUCSC, makeTxDbFromGRanges,
and makeTxDbFromGFF, for convenient ways to make a
TxDb object from UCSC online resources, or from a
GRanges object, or from a GFF or GTF file.
The listMarts, useMart,
and listDatasets functions in the
biomaRt package.
The supportedMiRBaseBuildValues function for
listing all the possible values for the miRBaseBuild
argument.
The TxDb class.
makeTxDb for the low-level function used by the
makeTxDbFrom* functions to make the TxDb object
returned to the user.
## Discover which datasets are available in the "ensembl" BioMart
## database:
library(biomaRt)
head(listDatasets(useMart("ensembl")))
## Retrieving an incomplete transcript dataset for Human from the
## "ensembl" BioMart database:
transcript_ids <- c(
"ENST00000013894",
"ENST00000268655",
"ENST00000313243",
"ENST00000435657",
"ENST00000384428",
"ENST00000478783"
)
txdb <- makeTxDbFromBiomart(transcript_ids=transcript_ids)
txdb # note that these annotations match the GRCh38 genome assembly
## Now what if we want to use another mirror? We might make use of the
## new host argument. But wait! If we use biomaRt, we can see that
## this host has named the mart differently!
listMarts(host="uswest.ensembl.org")
## Therefore we must also change the name passed to the "biomart"
## argument:
try(
txdb <- makeTxDbFromBiomart(biomart="ENSEMBL_MART_ENSEMBL",
transcript_ids=transcript_ids,
host="uswest.ensembl.org")
)
txdb