This page describes how to load Ensembl core data into your InterMine-bio database.
First you will need the data from Ensembl, which is available as MySQL databases. Download the Ensembl MySQL database and create the database locally.
For example, download homo_sapiens_core_70_37 to a local directory, decompress all the .gz files, and load the data into your MySQL database:
# create a new db in MySQL
$ mysql -u DB_USER -p
mysql> create database homo_sapiens_core_70;
# load data into db
$ mysql -u DB_USER -p homo_sapiens_core_70 < homo_sapiens_core_70_37.sql
$ mysqlimport -u DB_USER -p homo_sapiens_core_70 -L *.txt -v
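Before running the load commands above, it can help to confirm that the download directory is actually ready. The following is a minimal sketch, not part of InterMine or Ensembl: `check_dump_dir` is a hypothetical helper name, and it assumes the schema file ends in `.sql` and that any remaining `.gz` file means the decompression step is unfinished.

```shell
# Hypothetical helper (not an InterMine or Ensembl tool):
# succeeds only if DIR holds a schema .sql file and no .gz files remain.
check_dump_dir() {
    dir="$1"
    # The schema file, e.g. homo_sapiens_core_70_37.sql
    if ! ls "$dir"/*.sql >/dev/null 2>&1; then
        echo "no .sql schema file in $dir"
        return 1
    fi
    # Leftover compressed files mean the decompression step is incomplete
    if ls "$dir"/*.gz >/dev/null 2>&1; then
        echo "compressed files remain in $dir; run gunzip first"
        return 1
    fi
    echo "ready: $dir"
}
```

Calling `check_dump_dir .` from the directory containing the dump prints a one-line status and returns a non-zero exit code if something is missing.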
Add the location of the downloaded Ensembl MySQL databases to your mine properties file, for example:
# core database
db.ensembl.9606.core.datasource.serverName=SERVER_NAME
# port: uncomment the next line if your server uses a port other than 3306
# db.ensembl.9606.core.datasource.port=PORT_NUMBER
db.ensembl.9606.core.datasource.databaseName=homo_sapiens_core_70
db.ensembl.9606.core.datasource.species=homo_sapiens
db.ensembl.9606.core.datasource.user=DB_USER
db.ensembl.9606.core.datasource.password=DB_PASSWORD
These properties are used by the Perl script.
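To check what the script will see for a given organism, the properties can be read back from the file. This is only an illustrative sketch: `ensembl_prop` is a hypothetical helper, the file path is whatever your mine uses, and the key layout `db.ensembl.TAXON_ID.core.datasource.KEY` is taken from the example above.

```shell
# Hypothetical helper: print the value of one Ensembl datasource property.
# Usage: ensembl_prop PROPERTIES_FILE TAXON_ID KEY
ensembl_prop() {
    file="$1"; taxon="$2"; key="$3"
    prefix="db.ensembl.${taxon}.core.datasource.${key}="
    while IFS= read -r line; do
        case "$line" in
            # Commented-out lines start with '#' and so never match the prefix
            "$prefix"*)
                printf '%s\n' "${line#"$prefix"}"
                return 0
                ;;
        esac
    done < "$file"
    # Property not set (or still commented out)
    return 1
}
```

For example, `ensembl_prop PROPERTIES_FILE 9606 databaseName` would print `homo_sapiens_core_70` for the properties shown above, while the commented-out `port` property yields no output and a non-zero exit code.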
InterMine’s Ensembl converter uses Ensembl’s Perl API. Follow Ensembl’s instructions to install the necessary Perl modules.
Run this command in /bio/scripts to generate the XML files:
$ ./ensembl.pl [Release Version] MINE_NAME TAXONID DATA_DESTINATION
For example:
$ ./ensembl.pl flymine 7165 /data/ensembl/current
Add the Ensembl source to your project.xml file. The entry should look something like:
<source name="ensembl" type="ensembl-core">
<property name="src.data.dir" location="/MY_DATA_DIR/ensembl"/>
</source>
When you run a database build, every XML file in the specified directory will be processed and loaded into the database.
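Because the build loads whatever XML files it finds, it may be worth listing them first. The sketch below is an assumption-laden convenience, not an InterMine command: `list_source_xml` is a hypothetical helper, and it only looks at files ending in `.xml` directly inside the directory given in `src.data.dir`.

```shell
# Hypothetical helper: list the XML files found directly in DIR.
# Returns non-zero if there are none.
list_source_xml() {
    dir="$1"
    found=0
    for f in "$dir"/*.xml; do
        [ -e "$f" ] || continue   # the glob matched nothing
        printf '%s\n' "$f"
        found=1
    done
    [ "$found" -eq 1 ]
}
```

Running `list_source_xml /MY_DATA_DIR/ensembl` before the build shows which files are present; an empty listing suggests the Perl script wrote its output somewhere else.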
Run a build. The entry in project.xml will instruct the build process to load the XML files you created in the previous step into the database. For example, run this command in MINE_NAME/integrate:
$ ant -v -Dsource=ensembl