⌬ cdd-toolbox ⌬
Welcome to the Computational Drug Discovery Toolbox repository! This curated collection provides a comprehensive set of resources for Computational Drug Discovery (CDD) and related fields. Whether you are a researcher, student, or professional in the field of computational chemistry, this toolbox aims to serve as a valuable resource.
Table of Contents
- Natural Compounds Libraries
- Chemical Bioactivity Databases
- 3D Protein Structures Databases
- Protein Engineering
- Binding Site Detection
- Pharmacophore Screening Tools
- Molecular docking
- Molecular interaction visualization
- Pharmacokinetics parameters prediction tools
- QSAR modeling
- Quantum chemistry
- Molecular dynamics simulations
- Topology Preparation
- Normal Mode Analysis for Predicting Protein Motions
- Virtual Screening Server for Drug Repurposing
- Peptide Design Tools
- PROTAC Database and Ternary Complex Modelling
- Machine learning for drug discovery
- Artificial intelligence for drug discovery
- Retrosynthesis prediction
- Miscellaneous tools
- Cheminformatics Free Courses
- Blogs
Natural Compounds Libraries
- COCONUT Natural Products: A comprehensive database boasting over 400,000 natural products aggregated from more than 50 diverse sources.
- LOTUS Natural Products: Similar to COCONUT, LOTUS focuses on documented pairs of natural product structures and the organisms that produce them, making it useful for tracing the biological source of a compound.
- ZINC15 Natural Products: This repository contains a vast collection of over 200,000 commercially available natural compounds, primarily utilized for virtual screening purposes.
- Collective Molecular Activities of Useful Plants: This database emphasizes the molecular activities of beneficial plants, aiding in the exploration of their pharmacological potential.
- Natural Product Activity & Species Source Database (NPASS): NPASS connects 94,000 natural product activities with their respective species sources, facilitating research in natural product discovery.
- Cannabis Compound Database: A specialized database housing information on over 6,000 compounds found in cannabis plants, catering to cannabis-related research endeavors.
- SuperNatural III: Offering insights into natural compounds and their biological activities, SuperNatural III serves as a valuable resource for researchers in various fields.
- FooDB: With a comprehensive compilation of over 70,000 food components, FooDB aids researchers in exploring the chemical composition of diverse foods.
- AfroDB: Highlighting more than 4,000 compounds sourced from African medicinal plants, AfroDB significantly contributes to ethnopharmacological studies and drug discovery efforts.
- Comprehensive Marine Natural Products Database: This resource provides extensive information on over 31,000 marine-derived natural products, supporting marine biotechnology and pharmaceutical research.
- SistematX Secondary Metabolites Database: Focused on over 8,000 secondary metabolites, SistematX serves as a valuable tool for researchers studying natural product chemistry.
- Eximed Natural-Product-Based Library: With more than 5,000 natural product-like compounds tailored for high-throughput screening, this library aids in drug discovery efforts.
- CoumarinDB: Specifically targeting approximately 900 naturally occurring coumarin compounds, CoumarinDB is a specialized resource for researchers in this field.
- ArtemisiaDB: Dedicated to compounds from the genus Artemisia, this database offers valuable insights for researchers interested in this plant group.
- OTAVA Natural Product-Like Library: Housing over 1,000 natural product-like compounds designed for high-throughput screening, this library is a valuable resource in drug discovery.
- BIAdb: Providing information on bioactive peptides and proteins with therapeutic potential, BIAdb supports research in biomedicine and pharmacology.
- IMPPAT: This database offers digitized data from over 100 traditional Indian medicine books, more than 7,000 research articles, and other sources, making it an extensive repository of phytochemicals found in Indian medicinal plants.
- NP-MRD: Focused on over 280,000 NMR studies, NP-MRD provides a detailed exploration of natural products through NMR spectroscopy.
- IBS Natural Compounds: Offering information on over 60,000 natural compounds, IBS Natural Compounds is a valuable resource for researchers in natural product chemistry and drug discovery.
- Phytochemicals: This comprehensive resource provides extensive information on phytochemicals, supporting research in plant chemistry and pharmacology.
Chemical bioactivity databases
- ChEMBL database - A large-scale bioactivity database that focuses on bioactive molecules and their targets. It contains information on the binding, functional, and ADMET properties of drugs.
- BindingDB - A comprehensive database that provides information on the binding affinities of drugs to their target biomolecules.
- PubChem database - Offers information on the biological activities of small molecules, including chemical structures, properties, and bioassay data.
- PDBbind database - A database focusing on the experimentally measured binding affinities of biomolecular complexes. It includes data on protein-ligand complexes derived from the PDB, providing insights into the structures and energetics of protein-ligand interactions.
- BRENDA Enzymes database - A comprehensive enzyme information system that collects and provides data on enzyme function, structure, and properties. It covers a wide range of enzyme-related information.
- ExCAPE-DB: ExCAPE chemogenomics database - A chemogenomics database that integrates data on chemical compounds, protein targets, and biological activities.
- DrugBank - Provides comprehensive data on approved and investigational drugs, including details on chemical structures, pharmacology, and therapeutic indications.
- ZINC - A platform for researchers to explore and obtain compounds for various computational and experimental studies.
- ChemSpider - A free chemical database offering information on chemical structures, properties, and associated data.
- DrugSpaceX - A database designed to explore chemical and biological spaces related to drug discovery.
- Therapeutics Data Commons - An AI foundation for therapeutic science, offering an intuitive interface for various learning tasks in the drug discovery field.
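Potency values pulled from bioactivity databases such as those above are usually reported as concentrations (IC50, Ki, Kd), and a common first step before modeling is to convert them to a negative log10 molar scale. The snippet below is a minimal sketch of that conversion, assuming the input is in nanomolar; the function name `to_p_value` is illustrative and is not part of any database client.

```python
import math


def to_p_value(value_nm: float) -> float:
    """Convert a potency in nanomolar (e.g. IC50) to -log10 of its molar value (e.g. pIC50)."""
    if value_nm <= 0:
        raise ValueError("potency must be a positive concentration")
    molar = value_nm * 1e-9  # nM -> M
    return -math.log10(molar)


if __name__ == "__main__":
    # Illustrative inputs only, not values taken from any database.
    for ic50_nm in (1.0, 100.0, 10_000.0):
        print(f"IC50 = {ic50_nm:g} nM -> pIC50 = {to_p_value(ic50_nm):.2f}")
```

On this scale a tenfold gain in potency adds one unit, which tends to make activity values easier to compare and to fit.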
3D protein structures databases
- RCSB Protein Data Bank - Global repository for biological macromolecule structures, facilitating research in structural biology and drug discovery.
- Protein Data Bank in Europe - European counterpart to RCSB PDB, providing deposition, retrieval, and analysis of macromolecular structures.
- Orientation of proteins in membranes database - Database focusing on spatial arrangements of integral membrane proteins, aiding understanding of their structural features.
- UniProt - Comprehensive resource combining protein sequences, structures, functions, and interactions, with links to 3D structures.
- InterPro - Integrated resource classifying proteins into families, predicting domains, and incorporating information from various protein databases.
- AlphaFold protein structure database - Database housing predicted protein structures generated by the AlphaFold deep learning system.
- Proteopedia - Collaborative web-based resource with interactive 3D visualizations providing information on protein structure and function.
Protein engineering
- DynaMut - A web tool for the analysis and prediction of protein stability changes upon mutation using Normal Mode Analysis.
Binding site detection
- ProteinsPlus - A tool designed for the identification and analysis of protein binding sites. It facilitates the exploration of protein-ligand interactions.
- PrankWeb - A web-based platform specializing in the prediction and analysis of protein binding sites.
- CASTp - A resource for the detection and characterization of protein binding sites. It offers insights into the volume and area of cavities on protein surfaces, contributing to the study of ligand binding and functional sites.
- CavityPlus - A web server designed for the identification and analysis of protein cavities and binding sites. It provides tools for the characterization of binding pockets, assisting researchers in studying protein-ligand interactions and structure-based drug design.
- CaverWeb: Identification of Tunnels and Channels in Proteins and Analysis of Ligand Transport - A tool for identifying tunnels and channels in protein structures. It supports the analysis of ligand transport pathways.
Pharmacophore screening tools
- ZINCPharmer - A web-based pharmacophore screening tool that facilitates the exploration of chemical databases. It aids researchers in identifying potential ligands based on pharmacophoric features.
- Pharmit - A pharmacophore modeling and virtual screening platform. It enables users to define and search for pharmacophoric patterns within chemical databases, assisting in the identification of compounds with specific bioactive properties for drug development.
- PharmMapper - A web server designed for pharmacophore mapping with statistical methods.
Molecular docking
- OpenBabel - An open-source chemical toolbox designed for the interconversion of chemical file formats. Useful for batch preparing ligands for molecular docking.
- MGLTools - A software package designed for visualization and analysis of molecular structures including preparation of protein and ligand structures.
- AutoDockTools - A GUI for setting up and analyzing molecular docking simulations using the AutoDock suite. It provides a user-friendly environment for preparing input files and visualizing docking results.
- AutoDock Vina - A molecular docking program that efficiently predicts the binding modes of small molecules to target proteins. It utilizes an advanced scoring function and optimization algorithm, making it widely used for virtual screening and drug discovery.
- EasyDockVina2 - A user-friendly tool built on top of AutoDock Vina, streamlining the molecular docking process. It simplifies the setup of docking simulations, making it accessible to users with varying levels of expertise.
- AutoDock Vina web server - A web-based interface for molecular docking simulations. It aids in predicting the binding modes of ligands to target proteins.
- Smina - A fork of AutoDock Vina with additional features and optimizations. It focuses on improving scoring and minimization.
- Gnina - A fork of Smina with integrated support for scoring and optimizing ligands using convolutional neural networks.
- EasyDock - A fully automated pipeline for molecular docking based on Vina, with full support for Smina and Gnina.
- HADDOCK - An information-driven docking platform for biomolecular complexes, covering protein, ligand, peptide, and nucleic acid docking.
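Programs in the Vina family search for binding modes inside a box defined by a center and a size in ångströms. The sketch below shows one common way to derive such a box, by enclosing the coordinates of a reference ligand with some padding, and to write it out in the key-value form used by Vina configuration files. The helper names, the 5 Å default padding, and the file names `receptor.pdbqt` and `ligand.pdbqt` are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

Coord = Tuple[float, float, float]


def docking_box(coords: Iterable[Coord], padding: float = 5.0) -> Tuple[Coord, Coord]:
    """Return (center, size) of an axis-aligned box enclosing coords, padded on every side."""
    pts: List[Coord] = list(coords)
    if not pts:
        raise ValueError("need at least one coordinate")
    mins = [min(p[i] for p in pts) for i in range(3)]
    maxs = [max(p[i] for p in pts) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(mins, maxs))
    return center, size


def vina_config(center: Coord, size: Coord,
                receptor: str = "receptor.pdbqt",   # hypothetical file name
                ligand: str = "ligand.pdbqt") -> str:  # hypothetical file name
    """Format the box as Vina-style 'key = value' configuration text."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    lines += [f"center_{axis} = {c:.3f}" for axis, c in zip("xyz", center)]
    lines += [f"size_{axis} = {s:.3f}" for axis, s in zip("xyz", size)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder coordinates standing in for a reference ligand.
    reference = [(10.0, 12.0, 8.0), (14.0, 15.0, 11.0), (12.0, 13.5, 9.0)]
    center, size = docking_box(reference)
    print(vina_config(center, size))
```

A box that is too small can exclude valid poses, while a very large one increases the search cost, so the padding is worth tuning per target.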
Molecular interaction visualization
- PLIP - Protein ligand interaction profiler - A tool for profiling interactions between proteins and ligands. It helps analyze and visualize protein-ligand interactions, providing insights into molecular binding mechanisms.
- LigPlot+ - A Java-based program that automatically generates 2D ligand-protein interaction diagrams.
- Discovery Studio Visualizer - A powerful molecular visualization tool that allows users to explore and analyze complex biological and chemical information. It provides an intuitive interface for visualizing molecular structures, protein-ligand interactions, and conducting various analyses.
Pharmacokinetics parameters prediction tools
- SwissADME - A web tool for predicting pharmacokinetic parameters and drug-like properties. It assists researchers in assessing the drug-likeness and pharmacokinetic profile of small molecules. Supports submitting multiple molecules at once.
- pkCSM - A platform offering predictions for various physicochemical, pharmacokinetic and toxicity properties with the theory behind each prediction. Suitable for ADMET screening.
- ADMETlab 2.0 - A comprehensive tool for predicting absorption, distribution, metabolism, excretion, and toxicity properties of chemical compounds.
- ProTox-II - A predictive tool for assessing the toxicity endpoints of drugs.
- PreADMET - A web service providing predictions for pharmacokinetic properties. Does not support submitting multiple molecules at once.
- FAF-Drugs - A program designed to filter extensive compound libraries based on ADMET properties before in silico screening or modeling studies.
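Drug-likeness filtering of the kind mentioned above is often introduced through Lipinski's rule of five (molecular weight ≤ 500, logP ≤ 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The following is a minimal sketch of such a filter over precomputed descriptors. It is not how any of the listed tools implement their filters, and the compounds and descriptor values are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    name: str
    mol_weight: float   # g/mol
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five criteria are violated."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """A compound is commonly accepted if it violates at most one criterion."""
    return lipinski_violations(d) <= max_violations


if __name__ == "__main__":
    # Hypothetical compounds with made-up descriptor values.
    library = [
        Descriptors("compound_a", 320.4, 2.1, 2, 5),
        Descriptors("compound_b", 812.0, 6.3, 7, 12),
    ]
    for d in library:
        print(d.name, lipinski_violations(d), passes_rule_of_five(d))
```

In practice the descriptors would come from a cheminformatics toolkit, and the threshold on allowed violations is a project-level choice.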
QSAR modeling
- QSAR Toolbox - Free software for transparent chemical hazard assessment, offering tools for data retrieval, metabolism simulation, and property profiling. It helps identify analogues and chemical categories for read-across and trend analysis in order to fill data gaps.
- OCHEM - An online platform offering tools for building QSAR models for predictions of chemical properties.
- ChemMaster - A general cheminformatics software used to handle chemical data, in particular for drug design purposes including QSAR modeling.
- 3D-QSAR - A compilation of tools available online for 3D-QSAR modeling.
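At its simplest, a QSAR model is a regression of a measured property on one or more molecular descriptors. The sketch below fits a single-descriptor linear model by ordinary least squares using only the standard library. It is a toy illustration rather than the method used by any tool listed here, and the descriptor and activity values are synthetic numbers chosen to lie on a straight line.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x: Sequence[float], y: Sequence[float],
              slope: float, intercept: float) -> float:
    """Coefficient of determination of the fitted line on (x, y)."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if ss_tot == 0:
        raise ValueError("activity values must not all be identical")
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    descriptor = [1.0, 2.0, 3.0, 4.0]   # synthetic descriptor values
    activity = [3.0, 5.0, 7.0, 9.0]     # synthetic activities (y = 2x + 1)
    slope, intercept = fit_line(descriptor, activity)
    print(slope, intercept, r_squared(descriptor, activity, slope, intercept))
```

Real QSAR work adds many descriptors, held-out validation, and an applicability-domain check, which is what the platforms above are meant to help with.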
Quantum chemistry
- Gaussian - A widely used software suite for electronic structure modeling and quantum chemistry calculations.
- ORCA - A widely used computational chemistry package for electronic structure calculations, including Density Functional Theory (DFT) calculations. While it is not open source, it is available free of charge for academic use.
- Quantum ESPRESSO - An integrated suite of open-source computer codes for electronic-structure calculations and materials modeling at the nanoscale.
Molecular dynamics simulations
- GROMACS - A versatile package to perform molecular dynamics, scalable and efficient in performing large-scale simulations.
- LAMMPS - A classical molecular dynamics simulation code, designed to run efficiently on parallel computers.
- NAMD - A parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.
- AMBER - A suite of biomolecular simulation programs that includes several force fields for the simulation of proteins, nucleic acids, and carbohydrates.
- Desmond - Developed by D. E. Shaw Research and distributed by Schrödinger, Desmond is a high-performance molecular dynamics simulation program with a focus on drug discovery.
Topology preparation
- CGenFF - Provides a force field parameterization platform for small organic molecules within the CHARMM force field. It assists researchers in preparing molecular topologies for use in molecular dynamics simulations.
- SwissParam - A tool for generating force field parameters for small molecules.
- Automated Topology Builder - A web-based tool for generating molecular topologies and force field parameters.
- CHARMM-GUI - A user-friendly interface for CHARMM, providing tools for the generation of molecular topologies and input files for simulations.
Normal mode analysis for predicting protein motions
- iMod Server - A web-based platform dedicated to Normal Mode Analysis (NMA) for predicting protein motions. It allows researchers to analyze and simulate the vibrational modes of proteins, providing valuable insights into their dynamic behavior and structural flexibility.
Virtual screening server for drug repurposing
- DrugRep - A virtual screening server designed for drug repurposing.
Peptide design tools
- PepDraw - A web-based tool for designing and visualizing peptide structures.
- PepSite - A peptide design tool that aids researchers in identifying potential binding sites on protein surfaces.
- Peptimap - A tool dedicated to the analysis and visualization of peptide structures.
PROTAC database and ternary complex modelling
- PROTAC-db - A database focused on Proteolysis Targeting Chimeras (PROTACs). It provides a resource for researchers interested in PROTACs by offering information on their design, targets, and associated experimental data, aiding in the exploration of targeted protein degradation.
- PROsettaC - A platform for ternary complex modeling, specifically focusing on the structural modeling of protein-protein interactions. It provides tools for predicting the three-dimensional structures of protein complexes, contributing to the understanding of PROTAC-induced ternary complexes and their implications in drug discovery.
Machine learning for drug discovery
- RDKit - An open-source toolkit for cheminformatics and medicinal chemistry, RDKit facilitates tasks like molecular structure representation, substructure searching, and descriptor calculation.
- Google Colab - A cloud-based platform that provides free access to Jupyter notebooks along with GPU support. It allows users to run and share Python code collaboratively, making it particularly useful for data analysis, machine learning, and research projects.
- Anaconda - A distribution platform for Python and R programming languages, designed for data science, machine learning, and scientific computing.
- Pandas - An open-source data manipulation and analysis library for Python. It provides easy-to-use data structures, such as dataframes, and a plethora of functions for data cleaning, exploration, and transformation.
- Numpy - A library for numerical computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays.
- Scikit-Learn - An open-source machine learning library for Python. It offers a simple and efficient toolkit for various machine learning tasks, including classification, regression, clustering, and dimensionality reduction.
- Matplotlib - A popular 2D plotting library for Python.
- Seaborn - A statistical data visualization library based on Matplotlib. It simplifies the creation of attractive and informative statistical graphics in Python.
- MoleculeNet - Offers datasets for benchmarking predictive models of various chemical drug properties.
- Kaggle - A large ML and AI community. Can be useful for sourcing and sharing datasets and models related to drug discovery.
- Hugging Face - A growing AI community for sharing models and datasets, similar in spirit to Kaggle.
- Code Ocean - A cloud-based platform designed to facilitate the sharing, collaboration, and reproducibility of research code and data.
- Zenodo - An open-access repository for research outputs. It is an initiative developed by CERN to provide a platform for researchers to share and preserve various types of scholarly content, including datasets, software, publications, and other research-related materials.
- ChemML - ChemML empowers users to perform various data science tasks and ML workflows, making modern data science accessible in the broader chemistry and materials community.
- Datagrok - Datagrok offers robust support for small molecules and popular building blocks in cheminformatics. It understands various chemical notations like SMILES and SMARTS. Users can visualize molecules in 2D or 3D, sketch them, and extract properties.
Artificial intelligence for drug discovery
- Keras - An open-source high-level neural networks API written in Python. It provides a user-friendly interface for building and experimenting with deep learning models.
- TensorFlow - An open-source machine learning framework developed by Google. It offers a comprehensive set of tools for building and deploying machine learning models, with a particular emphasis on deep learning.
- DeepChem - An open-source library for deep learning in drug discovery and cheminformatics.
- TorchDrug - A cheminformatics toolkit based on PyTorch. It provides functionalities for molecular property prediction, compound screening, and deep learning-based analyses.
- DEEPScreen - A Python-based tool for virtual screening studies with deep convolutional neural networks using compound images.
- GraphINVENT - A Python-based platform developed by the MolecularAI group at AstraZeneca for molecular design based on graph generation models. It employs graph neural networks to generate molecular graphs with desired properties.
Retrosynthesis prediction
- Spaya AI-powered retrosynthesis platform - An AI-powered platform for retrosynthetic analysis. It assists chemists in designing synthetic routes for target compounds.
- AiZynthFinder - A tool based on Monte Carlo tree search, designed for retrosynthetic planning in organic chemistry. It aids chemists in generating synthetic routes by exploring reaction databases and proposing viable retrosynthetic steps.
- ASKCOS - An AI-driven platform developed for computer-assisted organic synthesis planning. Utilizing machine learning algorithms, ASKCOS assists chemists in predicting viable reaction pathways and proposing synthetic routes for target molecules.
- IBM RoboRXN - An AI-driven platform for automated reaction prediction and planning. Using advanced machine learning models, it enables researchers to predict the outcomes of chemical reactions and design synthetic routes for desired products.
Miscellaneous tools
- OPSIN: Open Parser for Systematic IUPAC nomenclature - Useful for converting chemical names into structured representations.
- OSRA: Optical Structure Recognition - Converts graphical representations of chemical structures, such as images, into connection tables.
- MetaPredict - A tool designed for predicting various molecular properties and activities. It supports computational chemistry by providing predictive models for diverse chemical entities.
- RPBS Web Portal - A platform offering a range of tools for computational biology and chemistry.
- AI based scoring function platform - Employs artificial intelligence to develop scoring functions for evaluating molecular interactions. It aids in predicting binding affinities and guiding drug discovery efforts.
- ChemPlot: A Tool For Chemical Space Visualization - A tool dedicated to visualizing chemical space.
- ChemDB Chemoinformatics Portal - A chemoinformatics portal offering various tools for chemical data analysis. It supports tasks such as compound search, similarity analysis, and property prediction.
- Open Targets Platform - The Open Targets Platform integrates genetic, genomic, and chemical data for target identification in drug discovery.
- Screening Explorer - A tool designed for compound screening and analysis.
- LigRMSD - A tool for calculating the root-mean-square deviation (RMSD) between ligand structures.
- BoBER: web interface to the base of bioisosterically exchangeable replacements - A web interface providing bioisosteric replacements for chemical structures.
- Disease List Automatically Derived For You - This tool automatically generates a list of diseases based on relevant data and criteria.
- Python Code Examples - Provides code snippets and examples for common tasks in the Python programming language. It’s a valuable resource for learning and implementing Python in scientific computing.
- NERDD New E-Resource for Drug Discovery - A new e-resource focused on drug discovery. It provides information and tools to support researchers in the drug development process.
- MetaChemiBio - A tool designed for predicting molecular properties and activities.
- The Utrecht Biomolecular Interactions software portal - This portal provides software tools for studying biomolecular interactions.
- LigBuilder3 - A tool for building ligand structures and conducting virtual screening.
- MolAiCal - Useful for binding free energy calculations.
- ChemMine Tools - Provides a collection of cheminformatics and computational chemistry tools.
- MayaChemTools - A growing collection of Perl and Python scripts, modules, and classes designed to support diverse daily computational discovery needs.
- SCBDD - An online platform containing a series of software and web servers that can assist in cheminformatics and drug discovery.
- Click2Drug - Maintained by the Swiss Institute of Bioinformatics, Click2Drug presents an extensive compilation of CADD software, databases, and web services. The collection categorizes tools by application field, aiming to cover the entire drug design pipeline.
- Galaxy Europe - A Galaxy instance focused on Cheminformatics.
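Several of the tools above compare ligand poses by root-mean-square deviation, RMSD = sqrt((1/N) Σ |r_i − r'_i|²). The sketch below computes that quantity for two coordinate lists. It assumes the atoms are already matched one-to-one, listed in the same order, and expressed in the same reference frame; atom matching and symmetry handling, which dedicated tools take care of, are out of scope here.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """RMSD between two equally ordered coordinate sets, without any superposition."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


if __name__ == "__main__":
    # Placeholder coordinates standing in for a reference pose and a docked pose.
    reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    docked = [(0.1, 0.0, 0.0), (1.4, 0.2, 0.0), (1.6, 1.5, 0.1)]
    print(f"RMSD = {rmsd(reference, docked):.3f} Å")
```

For redocking studies, a threshold of about 2 Å between the docked and crystallographic pose is a frequently used rule of thumb for calling a pose reproduced.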
Cheminformatics free courses
- Computational chemistry lectures by TMP Chem - A YouTube playlist offering computational chemistry lectures by TMP Chem. It covers various aspects of computational chemistry, providing valuable insights for learners interested in the field.
- Strasbourg Summer School in Chemoinformatics, 2022 - A YouTube playlist featuring lectures from the Strasbourg Summer School in Chemoinformatics. The content covers topics related to cheminformatics.
- BIGCHEM - The BIGCHEM project provides free courses on big data in chemistry. It offers online resources and training materials to explore the intersection of big data and chemistry, fostering understanding and skills in cheminformatics.
- Geometric Deep Learning Course - A course on Geometric Deep Learning delivered as part of the African Master’s in Machine Intelligence.
- Drug Discovery Course by StereoElectronics - Fundamentals and principles of methods used in the drug discovery pipeline.
- drugdesign.org - A collection of free courses on drug design, cheminformatics, molecular modeling, property prediction, and QSAR.
Blogs
- Practical Fragments - A blog focusing on fragment-based drug discovery. It provides practical insights, case studies, and discussions on the use of small fragments in drug design.
- avrilomics - A blog covering topics in genomics, bioinformatics, and related fields. It offers insights, updates, and discussions on advancements and practical applications in these areas.
- Practical Cheminformatics - A blog dedicated to practical aspects of cheminformatics. It explores tools, techniques, and applications in the field of chemical informatics and computational chemistry.
- Cheminformania - A compilation of articles focused on valuable insights and hands-on demonstrations detailing diverse cheminformatic tasks, delving progressively deeper into the realm of deep learning and AI applications.
- Daily Dose of Data Science - A blog that consolidates captivating frameworks, libraries, technologies, and insights to streamline the entire process of a Data Science project.
- Machine Learning Mastery - A valuable online resource that offers tutorials, guides, and practical insights into machine learning concepts, algorithms, and techniques. Authored by Jason Brownlee, a machine learning expert and educator, the site covers a wide range of topics, from fundamental concepts for beginners to advanced techniques for seasoned practitioners.
- Chem-Workflows - A collection of Jupyter Notebook-based tutorials written by Dr. Angel J. Ruiz Moreno for chemical data exploration and visualization.
- Structural Bioinformatics - A guide to structural biology and structure-based drug design.
- Bioinformatics Answers - An online platform designed for bioinformatics researchers, computational biologists, and individuals involved in life sciences. It serves as a community-driven question and answer platform, where users can seek help, share knowledge, and engage in discussions related to bioinformatics and computational biology.
- McConnellsMedChem - A medicinal chemistry blog serving as a resource for professionals, researchers, and students in the field of medicinal chemistry.
- DrugDiscovery.NET - A blog about AI and machine learning in drug discovery.