InterMine is an open source data warehouse system for the integration and analysis of complex biological data, developed over the last 10 years by the Micklem Lab at the University of Cambridge. InterMine has been used to build data warehousing solutions for a number of projects, including the storage and analysis of modENCODE data, and it serves as a data mining platform for a number of major model organism databases as part of the InterMOD project.
InterMine has been developed with the support of the Wellcome Trust, as well as support from the National Human Genome Research Institute [R01HG004834]. The Wellcome Trust also recently granted a further 5 years of funding for the development of InterMine and of HumanMine, a data warehouse of human genetic, genomic and proteomic data, ensuring continued development of InterMine as a framework.
The publicly available InterMine instances include:
- FlyMine - a data warehouse of integrated fruit fly genetic, genomic and proteomic data
- modMine - a data warehouse including a repository for modENCODE project fly and worm data, alongside analysis tools
- YeastMine - an integrated data warehouse of yeast genomic data, developed by SGD
- RatMine - an integrated data warehouse of rat genomic data, developed by RGD
- MouseMine - an integrated data warehouse of mouse genomic data, developed by MGI
- metabolicMine - a data warehouse targeted at the metabolic disease community, containing relevant datasets from rat, mouse and human
- TargetMine - a data warehouse for candidate gene prioritisation and drug target discovery, developed at NIBIO, Japan
- MitoMiner - a data warehouse of mitochondrial proteomics data for a range of organisms
How to cite us
If you use the InterMine framework in your research, we would appreciate it if you cite the following publication:
- InterMine: a flexible data warehouse system for the integration and analysis of heterogeneous biological data. Smith RN et al. (2012). Bioinformatics, in press.
Individual data warehouses also have their own publications. To cite a specific data warehouse in a paper, please use its publication where one is available:
- FlyMine: an integrated database for Drosophila and Anopheles genomics. Lyne, R. et al. Genome Biol. 8, R129 (2007).
- MitoMiner: a data warehouse for mitochondrial proteomics data. Smith, A. C., Blackshaw, J. A. & Robinson, A. J. Nucleic Acids Res. 40, D1160-1167 (2012).
- modMine: flexible access to modENCODE data. Contrino, S. et al. Nucleic Acids Res. 40, D1082-1088 (2012).
- TargetMine, an integrated data warehouse for candidate gene prioritisation and target discovery. Chen, Y. A., Tripathi, L. P. & Mizuguchi, K. PLoS One 6, e17844 (2011).
- YeastMine - an integrated data warehouse for Saccharomyces cerevisiae data as a multipurpose tool-kit. Balakrishnan, R. et al. Database 2012, bar062 (2012).
All InterMine code is freely available under the open source LGPL license.