About the Gadgets & Data


Gadgets

Name Entity Linker
Description The Entity Linker gadget facilitates the natural language processing task of entity linking on biographical data. Entity linking connects keywords in a text with their corresponding entries in a knowledge base. Given English text (e.g. the abstract of an academic paper), the Entity Linker outputs the keywords found in the text in the form in which they are registered in the knowledge base.
Reference http://prm-ezcatdb.cbrc.jp/entity_linking/
Institute National Institute of Advanced Industrial Science and Technology (AIST)
Contributors Masami Ikeda & Hiroya Takamura
Name NamedEntityRecognizer
Description Named Entity Recognizer is a gadget that facilitates the natural language processing task of Named Entity Recognition (NER) on literature information. NER is used to extract and classify keywords such as disease names, cell names, and pharmacological substances found within texts. When English text is input, the gadget will find keywords in the text that match one or more of 37 pre-defined criteria (including names of diseases, cells, pharmacological substances, and other proper nouns relevant to the field of drug discovery).
Reference http://prm-ezcatdb.cbrc.jp/named_entity_recognition/
Institute National Institute of Advanced Industrial Science and Technology (AIST)
Contributors Masami Ikeda & Hiroya Takamura
Name JaMIE
Description Relation extraction is the extraction of semantic relations between keywords in a text. When a Japanese medical text (e.g. the findings of a CT image reading report) is input into this gadget, it outputs the relationships between the keywords in the text and their associated keywords in the knowledge base.
Reference https://github.com/racerandom/JaMIE
Institute Kyoto University
Contributors Fei Cheng & Sadao Kurohashi
Name Semantic Search
Description This is a document retrieval system that presents similar documents when given medical documents such as electronic medical records and radiological findings. The search target is a group of medical documents annotated by the PRISM project. An example application of this gadget would be searching for existing cases in a hospital.
Reference https://aoi.naist.jp/prism-search/
Institute Nara Institute of Science and Technology (NAIST)
Contributors
Name HeaRT
Description When medical documents such as electronic medical records and findings are input, a Gantt-like chart is created that illustrates the information in chronological order. This can be used to facilitate information sharing among health professionals.
Reference https://aoi.naist.jp/prism-heart/
Institute Nara Institute of Science and Technology (NAIST)
Contributors
Name SFM, bST
Description Space-efficient feature maps for string alignment kernels (SFMEDM) takes a set of input strings and outputs a set of feature vectors. Using the features produced by SFMEDM, a support vector machine (SVM) can be used to perform predictive tasks such as string classification and regression. One example of this gadget's utility is prediction tasks that use amino acid sequences as training data. Because strings are mapped into a nonlinear feature space, prediction with SFMEDM is both accurate and memory efficient.
Reference https://github.com/tb-yasu/SFMEDM
https://github.com/kampersanda/integer_sketch_search
Institute RIKEN
Contributors Yasuo Tabei
Name kGCN Network Prediction
Description Graph convolutional neural networks (GCNs) allow structural information of small molecule compounds to be input as graphs and have been reported to perform well on many types of prediction tasks. kGCN is an open-source, GCN-based gadget that provides the necessary preprocessing for building prediction models. It also supports Bayesian optimization for model tuning and visualization of the atoms that contribute significantly to a prediction, which helps with interpreting results. Given a dataset, nodes, and trained models as input, this gadget predicts and outputs new links that may exist between nodes.
Reference https://github.com/clinfo/kGCN
Institute Kyoto University
Contributors Ryosuke Kojima & Yasushi Okuno
Name Molenc
Description One approach to using information about a compound in machine learning is to generate fingerprints, which are vectors that indicate how many times specific substructures are present in the compound. There are various ways to generate fingerprints; this gadget generates Signature Molecular Descriptors (SMDs), which were originally published by J.L. Faulon et al. in 2003. When a list of structures (SMILES) for the compounds of interest is input into the gadget, a correspondence table of features (substructures) and the SMD fingerprints are generated and output. A preexisting correspondence table can be uploaded by ticking the "encoding.dix" checkbox and pressing Run. Users who do not have a correspondence table should ensure that this box is not checked before pressing Run.
Reference https://github.com/UnixJunkie/molenc
Institute Kyushu Institute of Technology
Contributors Francois Berenger & Yoshihiro Yamanishi
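To make the idea of a counted fingerprint and its correspondence table in the Molenc entry more concrete, here is a minimal Python sketch. It is not Molenc's implementation: the substructure labels are hypothetical and assumed to be already extracted from each compound, the function name `encode` is invented for illustration, and skipping unknown features when an existing table is reused is an assumption rather than documented Molenc behavior.

```python
from collections import Counter

def encode(molecules, table=None):
    """Turn lists of substructure labels into sparse count fingerprints.

    molecules: list of lists of substructure labels (hypothetical, pre-extracted).
    table: optional existing feature -> index correspondence table.
           If None, a new table is built from the input.
    """
    reuse = table is not None
    table = dict(table) if reuse else {}
    fingerprints = []
    for features in molecules:
        counts = Counter()
        for feature in features:
            if feature not in table:
                if reuse:
                    continue  # assumption: unseen features are skipped when a table is reused
                table[feature] = len(table)
            counts[table[feature]] += 1
        fingerprints.append(dict(counts))
    return table, fingerprints

# Toy input: two compounds described by made-up substructure labels.
mols = [["C-C", "C-O", "C-C"], ["C-O", "C=O"]]
table, fps = encode(mols)

# Reusing the table on a new compound keeps the feature indices stable.
_, fps_new = encode([["C=O", "N-H"]], table)
```

Keeping the table separate from the fingerprints is what allows compounds encoded at different times to share one feature indexing.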
Name Vanishing Ranking Kernels
Description Ligand-based virtual screening is performed by learning a classification model of activity strength from a set of compounds and using it to predict the activity of compounds whose activity is unknown, based on vanishing kernels and intermolecular Tanimoto coefficients. The resulting model defines an applicability domain (AD) for the activity. This AD is used to improve screening efficiency. The input file contains the features (descriptors) of the compounds rather than their structures. Please refer to https://github.com/UnixJunkie/rankers for details.
Reference https://github.com/UnixJunkie/rankers
Institute Kyushu Institute of Technology
Contributors Francois Berenger & Yoshihiro Yamanishi
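The Tanimoto coefficient mentioned in the Vanishing Ranking Kernels entry can be illustrated with a short Python sketch. This shows only the standard set-based definition on binary fingerprints (shared features divided by total distinct features); how the gadget itself computes and uses the coefficient is not described here, and the feature indices below are made up.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two binary fingerprints given as sets of feature indices."""
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen for this sketch when both fingerprints are empty
    return len(a & b) / union

# Hypothetical fingerprints: sets of indices of the features present in each compound.
fp_a = {1, 2, 3, 4}
fp_b = {3, 4, 5}

similarity = tanimoto(fp_a, fp_b)  # 2 shared features out of 5 distinct ones
```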
Name Modified Diet Networks
Description With ultra-high dimensional (n<<p) data, such as genomic data, it is difficult to avoid overfitting even with regularization and other methods. Diet Networks is a deep learning method designed to train on high dimensional data, and Modified Diet Networks is an improved version of Diet Networks that provides stable and accurate predictions. This gadget is equipped with a pre-trained model that uses Modified Diet Networks to classify lung cancer patients into lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) based on their somatic mutation profiles. When patient information is provided to the model as a vector of the number of somatic mutations present in each gene for a patient (multiple patients can be entered as a matrix), the model outputs a prediction of whether the patient has LUAD or LUSC.
Reference https://www.mdpi.com/2218-273X/10/9/1249
Institute National Cancer Center
Contributors Ken Asada & Ryuji Hamamoto
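The input described in the Modified Diet Networks entry, a per-patient vector of somatic mutation counts per gene, stacked into a matrix for several patients, might be assembled along the following lines. This is a sketch under assumptions: the record layout, the patient IDs, the short gene list, and the function name `mutation_count_matrix` are all hypothetical, and the file format the gadget actually expects is not specified here.

```python
def mutation_count_matrix(records, genes):
    """Build a patients x genes matrix of somatic mutation counts.

    records: iterable of (patient_id, gene) pairs, one per observed mutation.
    genes: ordered list of genes defining the matrix columns.
    """
    patients = sorted({patient for patient, _ in records})
    row = {patient: i for i, patient in enumerate(patients)}
    col = {gene: j for j, gene in enumerate(genes)}
    matrix = [[0] * len(genes) for _ in patients]
    for patient, gene in records:
        if gene in col:  # mutations in genes outside the column list are ignored
            matrix[row[patient]][col[gene]] += 1
    return patients, matrix

# Hypothetical mutation calls for two patients.
records = [("P1", "TP53"), ("P1", "TP53"), ("P1", "KRAS"), ("P2", "EGFR")]
genes = ["EGFR", "KRAS", "TP53"]

patients, matrix = mutation_count_matrix(records, genes)
# Each row is one patient's count vector, in the column order given by `genes`.
```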
Name Multiomics Analyzer
Description In recent years, multi-omics data analysis has attracted attention for various applications, but its methodology has not yet been fully established. One of the biggest challenges in omics data analysis is how to handle high-dimensional data. The Multi-omics Analyzer is equipped with models created by applying unsupervised deep learning methods to miRNA and mRNA data acquired from lung cancer patients in The Cancer Genome Atlas (TCGA). When miRNA/mRNA data is input, it outputs feature vectors with reduced dimensionality. The feature vectors obtained from the Multi-omics Analyzer can be used for prediction tasks such as classification and regression.
Reference https://www.mdpi.com/2218-273X/10/4/524/htm
Institute National Cancer Center
Contributors Kazuma Kobayashi & Ryuji Hamamoto
Name Subset Binder
Description This gadget uses an algorithm called subset binding to find groups of items that are linked across two datasets. For example, when medical information and omics data are input, the gadget outputs a group of molecules that fluctuate together and a corresponding group of medical information items that change in conjunction with them. For checking that the gadget operates correctly, Miné includes two types of sample data, hepatotoxicity phenotype data and gene expression profiles, which reflect the hepatotoxicity observed when high concentrations of acetaminophen are administered to rats. Subset binding is based on association rule mining technology, so the parameters used are generally the same as those used in association rule mining.
Reference https://www.researchsquare.com/article/rs-405195/latest.pdf
Institute National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN) & RIKEN
Contributors Yayoi Natsume-Kitatani & Naonori Ueda
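The Subset Binder entry notes that its parameters are generally those of association rule mining. The entry does not list them, so as an assumption this Python sketch shows the two standard ones, support and confidence, on toy data. It is not the subset binding algorithm itself, and the item labels are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    s_antecedent = support(transactions, antecedent)
    if s_antecedent == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / s_antecedent

# Hypothetical samples: each set mixes a phenotype label with gene expression changes.
samples = [
    {"ALT_up", "geneA_up", "geneB_up"},
    {"ALT_up", "geneA_up"},
    {"geneB_up"},
    {"ALT_up", "geneA_up", "geneB_up"},
]

s = support(samples, {"geneA_up", "ALT_up"})         # 3 of 4 samples
c_a = confidence(samples, {"geneA_up"}, {"ALT_up"})  # every geneA_up sample also has ALT_up
c_b = confidence(samples, {"geneB_up"}, {"ALT_up"})  # 2 of the 3 geneB_up samples
```

Minimum thresholds on values like these are what association rule mining typically uses to decide which item groups to report.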
Name RPPA
Description This gadget is a prognostic system for lung squamous cell carcinoma and lung adenocarcinoma that uses a deep autoencoder. It can predict prognosis using reverse phase protein array (RPPA) data alone, or using six types of omics data (RNA sequencing data, miRNA sequencing data, DNA methylation data, copy number variation, somatic mutation DNA sequencing data, and RPPA data).
Reference https://www.mdpi.com/2218-273X/10/10/1460
Institute National Cancer Center
Contributors Ken Asada, Ryuji Hamamoto
Name PathoGN
Description Upon input of mutation information and gene relationship networks, this gadget presents information such as the pathogenicity of the mutation and the predicted relevant genes present in the biomolecular network. This is a novel method implemented as an extension of kGCN.
Reference
Institute Kyoto University
Contributors Ryosuke Kojima, Yasushi Okuno
Name INGOR
Description INGOR is an implementation of a Bayesian network estimation algorithm. When provided with measured biomolecular profiles, it creates a causal network between biomolecules such as genes and proteins. This gadget can be used to elucidate intermolecular regulatory causal mechanisms and to search for new drug target candidates.
Reference https://ytlab.jp/clinfo/ingor/tutorialja.html
Institute Kyoto University
Contributors Yoshinori Tamada, Yasushi Okuno
Name INGOR ECv
Description Given a measured biomolecular profile and a Bayesian network estimated with INGOR, INGOR ECv outputs the edge contribution value (ECv) for each edge in the network and the subnetwork extracted using these values. This tool is for academic use only.
Reference https://doi.org/10.1038/s41598-021-02394-w
Institute Kyoto University
Contributors Nakazawa, M.A., Tamada, Y., Tanaka, Y., Ikeguchi, M., Higashihara, K., Okuno, Y.
Name INGOR RC
Description Given the measured biomolecular profiles and the Bayesian network estimated by INGOR, INGOR RC outputs the relative contribution value (RC) for each edge, which is needed to visualize the network for each sample. This tool is for academic use only.
Reference https://doi.org/10.1038/s41598-021-90556-1
Institute Kyoto University
Contributors Tanaka, Y., Higashihara, K., Nakazawa, M.A., Yamashita, F., Tamada, Y., Okuno, Y.
Name INGOR Network
Description Given measured biomolecular profiles, this gadget applies a Bayesian network estimation algorithm and outputs the causal network among biomolecules such as genes and proteins, the ECv, and the results of stratifying and grouping samples based on the causal network. It can be used for elucidating intermolecular regulatory causal mechanisms, searching for novel drug target candidates, and patient stratification. This tool is for academic use only.
Reference
Institute Kyoto University
Contributors Yoshinori Tamada, Yasushi Okuno
Name TargetMine
Description TargetMine is a data warehouse that integrates more than 30 internationally used public data sources to support early-stage drug discovery research, especially target discovery, and to enable efficient knowledge discovery. TargetMine covers a wide range of data, from genes, proteins, and pathways to 3D structures and interactions with compounds. Presently, the data incorporated in TargetMine focuses primarily on the most studied organisms in the field of drug discovery: humans, rats, and mice.
Reference https://doi.org/10.1093/bioinformatics/btac507
Institute National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN)
Contributors Yi-An Chen, Kenji Mizuguchi

Data

There is no data available for release.