InterMine has two InterPro data sources: one loads the protein domains themselves (e.g. name and description), and the other loads the relationships between proteins and domains.
protein domains, e.g. name and description
project XML example
<source name="interpro" type="interpro">
<property name="src.data.dir" location="/data/interpro"/>
</source>
This source queries for proteins already in the database and loads their related protein domains, so it must be run after UniProt.
protein domains, their relationship to the protein and protein domain region
ftp://ftp.ebi.ac.uk/pub/databases/interpro/protein2ipr.dat.gz
ftp://ftp.ebi.ac.uk/pub/databases/interpro/match_complete.dat.gz
project XML example
<!-- must come after UniProt because it only loads protein domains for proteins that are already loaded -->
<source name="protein2ipr" type="protein2ipr">
<property name="src.data.dir" location="/data/interpro"/>
<property name="src.data.dir.includes" value="protein2ipr.dat"/>
<property name="protein2ipr.organisms" value="9606"/>
<property name="osAlias" value="os.production"/>
</source>
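The ordering requirement can be illustrated with a minimal sketch of the sources section of a project XML file. It assumes that sources are integrated in the order they are listed. The UniProt entry is an assumption for illustration only: the `/data/uniprot` location and the `uniprot.organisms` value are hypothetical and should be replaced with your own UniProt configuration.

```xml
<sources>
  <!-- Assumed UniProt source (illustrative values): loads the proteins first -->
  <source name="uniprot" type="uniprot">
    <property name="uniprot.organisms" value="9606"/>
    <property name="src.data.dir" location="/data/uniprot"/>
  </source>

  <!-- Listed after UniProt so that the proteins it looks up already exist -->
  <source name="protein2ipr" type="protein2ipr">
    <property name="src.data.dir" location="/data/interpro"/>
    <property name="src.data.dir.includes" value="protein2ipr.dat"/>
    <property name="protein2ipr.organisms" value="9606"/>
    <property name="osAlias" value="os.production"/>
  </source>
</sources>
```

Using the same taxon ID (here 9606) in both sources keeps the loaded domain relationships aligned with the proteins that are actually present in the database.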