Ensembl Compara
================================
Download data from BioMart
----------------------------
#. go to http://www.ensembl.org/biomart/martview/
#. select the database for the primary organism, eg. `Ensembl Genes`
#. select the dataset for the primary organism, eg. `Drosophila melanogaster features (BDGP5.25)`
#. select FILTERS

   #. click on "FILTERS" in the left panel in BioMart (this populates the main panel with filter options)
   #. select `MULTI SPECIES COMPARISONS`
   #. check the checkbox next to `Homolog filters`
   #. select the organism of interest in the dropdown, eg. `Orthologous Caenorhabditis elegans Genes`
   #. make sure that `Only` is checked next to the dropdown

#. select ATTRIBUTES

   #. check the `Homologs` radio button at the top of the center panel
   #. uncheck the `Ensembl Transcript ID` option, so that `Ensembl Gene ID` is the only output
   #. click on `ORTHOLOGS (Max select 6 orthologs):` to open that section of the form
   #. select the Gene ID for the organism of interest, eg. Drosophila Ensembl Gene ID

#. run the query

   #. select the `[Results]` button at the top of the page
   #. export the results as a `TSV` file and check the box next to `Unique results only`
   #. when prompted, save the file as `TAXONID1_TAXONID2`

Add entry to project XML file
------------------------------------
.. code-block:: xml
Run build
------------
Data file
~~~~~~~~~~~~~~
Tab-delimited files should be named `TAXONID1_TAXONID2`, eg. `9606_10090` for a file with human genes and mouse orthologues.

=============== ==================
Gene ID         Homologue ID
=============== ==================
ENSG00000253023 ENSMUSG00000088328
ENSG00000238364 ENSMUSG00000088728
=============== ==================

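
Before running a build it can help to check that a downloaded file follows the layout described above. The following is a minimal sketch, not part of the data source itself: the helper names `parse_filename` and `read_pairs` are hypothetical, and it assumes the file name has no extension and that the file has no header row.

```python
import csv
import io
import re

# Hypothetical helpers for illustration only.
# Assumption: file names look like 9606_10090, with no extension.
FILENAME_PATTERN = re.compile(r"^(\d+)_(\d+)$")


def parse_filename(name):
    """Return (gene_taxon_id, homologue_taxon_id) from a name such as '9606_10090'."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"expected TAXONID1_TAXONID2, got {name!r}")
    return int(match.group(1)), int(match.group(2))


def read_pairs(handle):
    """Yield (gene_id, homologue_id) tuples from a tab-delimited file handle."""
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and rows where either column is missing or empty.
        if len(row) < 2 or not row[0] or not row[1]:
            continue
        yield row[0], row[1]


# The two rows from the table above, as they would appear in the file.
sample = (
    "ENSG00000253023\tENSMUSG00000088328\n"
    "ENSG00000238364\tENSMUSG00000088728\n"
)
print(parse_filename("9606_10090"))
print(list(read_pairs(io.StringIO(sample))))
```

For a real file, pass `open(path, newline="")` to `read_pairs` in place of the `io.StringIO` sample.
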
Download script
~~~~~~~~~~~~~~~~~
When you have created your query, you can export it as a Perl script or as XML, so that the same query can be run automatically next time, eg:
.. code-block:: xml
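
As an illustration of rerunning an exported query, here is a rough Python sketch that assembles a BioMart query document and the URL used to submit it. Several things here are assumptions rather than values taken from this page: the `martservice` endpoint, the attributes on the `Query` element, and the dataset, filter and attribute names (`dmelanogaster_gene_ensembl`, `with_celegans_homolog`, `celegans_homolog_ensembl_gene`). Check each of them against the XML that BioMart exports for your own query.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: the query service sits next to the martview page used above.
MARTSERVICE_URL = "http://www.ensembl.org/biomart/martservice"


def build_query(dataset, filter_name, attributes):
    """Return a BioMart query document as a string."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "1",  # intended to mirror "Unique results only"
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # excluded="0" is intended to mirror the "Only" option of the homolog filter.
    ET.SubElement(ds, "Filter", {"name": filter_name, "excluded": "0"})
    for attribute in attributes:
        ET.SubElement(ds, "Attribute", {"name": attribute})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE Query>\n' + body


def query_url(query_xml):
    """Return the URL that submits the query document to the mart service."""
    return MARTSERVICE_URL + "?" + urllib.parse.urlencode({"query": query_xml})


def download(query_xml, path):
    """Fetch the query results and write them to path (needs network access)."""
    with urllib.request.urlopen(query_url(query_xml)) as response:
        with open(path, "wb") as out:
            out.write(response.read())


# Assumed names for a Drosophila to C. elegans orthologue query.
example = build_query(
    "dmelanogaster_gene_ensembl",
    "with_celegans_homolog",
    ["ensembl_gene_id", "celegans_homolog_ensembl_gene"],
)
print(example)
```

Calling `download(example, "TAXONID1_TAXONID2")`, with the placeholder replaced by the real taxon IDs, would then save the results under the file name convention described in the Data file section.
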
.. index:: Ensembl Compara