The ABCC has developed ToTeM, a Tool for Text Mining, to perform full-text searches and query expansion on MEDLINE abstracts. We use the Lucene/Solr search engine for fast, efficient queries. ToTeM is designed to help users do text mining and visualization on full-text searches based on pair-wise queries or phrase queries whose terms are not adjacent to each other. Visualization and sophisticated phrase queries based on word distance are currently in development. These queries will also be linked to synonyms, so that users who search for a gene name will also get its associated transcript IDs.
MEDLINE is a freely available resource containing journal citations for biomedical literature. ToTeM allows you to search all of MEDLINE on specific parameters. The website offers basic queries, which can be extended using our Web services. Currently, multiple searches can be run on the fields listed below:
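A phrase query on non-adjacent terms can be written in Lucene's standard proximity syntax, `"term1 term2"~N`, which matches documents where the two terms occur within N positions of each other. The sketch below only builds such a query string. The helper name `proximity_query` is hypothetical, and whether ToTeM uses this exact syntax internally is an assumption.

```python
def proximity_query(term_a: str, term_b: str, max_distance: int) -> str:
    """Build a Lucene proximity query string for two terms.

    Hypothetical helper for illustration; not part of ToTeM.
    """
    if max_distance < 0:
        raise ValueError("max_distance must be non-negative")
    # Escape embedded double quotes so the quoted phrase stays well-formed.
    a = term_a.replace('"', '\\"')
    b = term_b.replace('"', '\\"')
    return f'"{a} {b}"~{max_distance}'


print(proximity_query("BRCA1", "ovarian", 10))  # prints "BRCA1 ovarian"~10
```

A larger distance broadens the match at the cost of precision, so the value is best tuned per query.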
- PubMed ID
- Abstract Text
- MeSH Terms
- Journal Name
- Publication Year
Using Solr/Lucene, we indexed many of the fields in the MEDLINE XML files, but stored only the fields above. You can search across all fields without specifying which ones. Queries are extended using bioDBnet, which supplies gene-to-disease associations, protein/miRNA interactants, GO annotations, and compounds.
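As a rough illustration of fielded versus all-field searching, the sketch below builds a URL for Solr's standard `/select` handler using the common `q`, `rows`, and `wt` parameters. The host, core name (`medline`), and field name (`abstract`) are assumptions made for the example and do not reflect ToTeM's actual Web service or schema.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; ToTeM's real service URL is not given in this document.
SOLR_SELECT_URL = "http://localhost:8983/solr/medline/select"


def build_query_url(terms, field=None, rows=10):
    """Return a Solr select URL that requires all terms.

    If `field` is None, the query is left unfielded so Solr searches
    its configured default field(s). Field names are assumptions.
    """
    q = " AND ".join(terms)
    if field:
        # Restrict the whole boolean expression to a single field.
        q = f"{field}:({q})"
    params = {"q": q, "rows": rows, "wt": "json"}
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


# Fielded search on a hypothetical "abstract" field.
print(build_query_url(["p53", "apoptosis"], field="abstract"))
# Unfielded search across the default field(s).
print(build_query_url(["p53", "apoptosis"]))
```

Keeping URL construction in one function means query terms are always percent-encoded consistently before being sent to the server.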
Solr - MEDLINE Stats
The complete MEDLINE was last downloaded on January 25, 2017. Updates are performed on a regular basis (whenever new MEDLINE XML files are released). We will rebuild the complete index each year.
Total Records: 26,910,009 as of January 25, 2017.
If you have a problem with a query or questions that were not answered in the FAQs, please contact us!