Unexplored Opportunities in the Human Druggable Genome

February 3, 2017

Tudor Oprea

The Illuminating the Druggable Genome Knowledge Management Center (IDG KMC) evaluates, organizes and distils more than 80 protein-centric and 20 gene-centric resources for over 20,000 curated human proteins. The IDG KMC currently focuses on four protein families: G-protein-coupled receptors (GPCRs), nuclear receptors, ion channels and kinases. Through the IDG KMC, emergent properties and knowledge of target-disease associations are derived from text mining of biomedical literature, drug labels and patents, as well as from human curation and semantic development, coupled with algorithmic processing of Big Data.
Of the 20,186 proteins, 37.6% are categorized as “Tdark”, i.e., proteins that lack adequate information regarding function and disease relevance [1]. Another 601 proteins (3%) have a confirmed Mechanism of Action for FDA-approved drugs (“Tclin”) [2]. Two other categories reflect levels of literature, functional and disease annotations (“Tbio”) and knowledge about (potent) small molecules (“Tchem”), respectively [1, 3]. Pharos, the IDG KMC interface portal, supports mining and interactive browsing of this multi-dimensional data collection, providing informative summaries for the broader scientific community [3].
This integrative effort supports the following observations: i) there is a Knowledge Deficit, i.e., we lack an understanding of protein function for over 37% of the human proteome; and ii) only 3% of the human proteome is therapeutically addressed by drugs, although this value is higher for the four target families outlined here. These observations are confirmed by mining the patent corpus and by examining R01 funding patterns and disease associations. Global drug sales analyses confirm that GPCRs, ion channels, kinases and nuclear receptors are very lucrative targets [1].
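The percentages quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in this abstract (20,186 proteins, 601 Tclin, 37.6% Tdark); the variable and function names are illustrative, and the Tdark count is an estimate back-calculated from the rounded percentage, not a published number.

```python
# Figures quoted in the abstract.
TOTAL_PROTEINS = 20186   # curated human proteins
TCLIN_COUNT = 601        # proteins with a confirmed Mechanism of Action
TDARK_FRACTION = 0.376   # share of proteins categorized as Tdark


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of the proteome that is therapeutically addressed (Tclin).
tclin_percent = percent(TCLIN_COUNT, TOTAL_PROTEINS)

# Approximate Tdark count implied by the rounded 37.6% figure (an estimate).
tdark_count_estimate = round(TDARK_FRACTION * TOTAL_PROTEINS)

# Whatever is neither Tclin nor Tdark falls into Tbio or Tchem combined.
tbio_tchem_estimate = TOTAL_PROTEINS - TCLIN_COUNT - tdark_count_estimate

print(f"Tclin: {tclin_percent:.2f}% of the proteome")
print(f"Tdark: about {tdark_count_estimate} proteins")
print(f"Tbio + Tchem: about {tbio_tchem_estimate} proteins")
```

The Tclin share comes out just under 3%, which matches the rounded value used in observation ii).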
[1] Oprea, T. I. et al. Unexplored opportunities in the human druggable genome. Nat. Rev. Drug Discov. Poster (2016).
[2] Santos, R. et al. A comprehensive map of molecular drug targets. Nat. Rev. Drug Discov. 16, 19–34 (2017).
[3] Nguyen, D.-T. et al. Pharos: Collating protein information to shed light on the druggable genome. Nucleic Acids Res. 45, D995–D1002 (2017). Available at