Blog | Yassir Boulaamane

Artificial Intelligence

Beyond Parameter Counts: The Shift Toward Rigorous Evaluation in Scientific AI

Limitations of scale-based evaluation in structural biology, and the transition toward physical grounding, out-of-distribution generalization, and biophysical developability validation.

Yassir Boulaamane

• Jul 21, 2026

Protein Descriptors

Protein Descriptors for Machine Learning in Drug Discovery

A practical map of protein and protein-ligand descriptors for machine learning modeling, from sequence-only features to MD-derived structural and interaction features.

Yassir Boulaamane

• Jul 20, 2026

Graph Neural Networks

Graph Neural Networks and DGL: A Beginner's Guide

Introduction to graph representation learning, neighborhood aggregation, and Deep Graph Library (DGL) workflows, including a glossary and tutorial reference map.

Yassir Boulaamane

• Jul 10, 2026

Ensemble Docking

Ensemble Docking for Binding and Activity Prediction

Limitations of static single-structure docking for flexible targets, and an integrated workflow combining molecular dynamics, clustering, and machine learning.

Yassir Boulaamane

• Jul 5, 2026

Artificial Intelligence

You Are Not an Impostor: Agentic Coding and the New Computational Scientist

A pipeline I had not written a single line of worked flawlessly on the first try, and it left me with impostor syndrome. Here is why relying on AI agents is not cheating but a permanent shift in what scientific expertise actually means.

Yassir Boulaamane

• Jun 23, 2026

Machine Learning

Reproducible ≠ Robust: Why One UMAP Seed Isn't Enough to Trust a Split

Pinning random_state=42 makes a UMAP-based train/test split perfectly reproducible — and that is exactly why it can lull you into a false sense of rigor. Reproducibility guarantees you get the same answer every run; it says nothing about whether that answer is typical. Here's the distinction, why it matters for evaluating GNNs on chemical-domain shifts, and how to fix it.

Yassir Boulaamane

• Jun 23, 2026

Molecular Docking

Choosing the Right Partial Charges for Molecular Docking: AM1-BCC, PM6 and Beyond

Partial charges quietly drive the electrostatics behind every docking score. This guide compares the common charge models: the semi-empirical AM1-BCC, PM6 and AM1/PM3, the empirical Gasteiger scheme, and ab initio-derived RESP, with a practical recommendation for when to use each.

Yassir Boulaamane

• Jun 23, 2026

Cheminformatics

Building a 3D Pharmacophore Model from PDB Data: A Free Python Workflow

A step-by-step, fully open-source pipeline that turns raw Protein Data Bank structures into a ligand-based 3D pharmacophore — mining the PDB, aligning binding pockets, clustering ligands, fixing bond orders, and distilling a consensus feature map ready for virtual screening.

Yassir Boulaamane

• Jun 23, 2026

Artificial Intelligence

Beyond Static Models: Agentic AI and Multi-Agent Systems in Drug Discovery

AI in drug discovery is moving past one-shot predictions toward autonomous agents that plan experiments, write their own analysis code, call docking and QM tools, and critique their results. Here's what the shift means, the design patterns behind it, and five open-source agents worth examining — with an honest look at the caveats.

Yassir Boulaamane

• Jun 23, 2026

Machine Learning

Does data leakage really inflate binding-affinity GNNs? A laptop-scale reproduction

I tried to reproduce the well-known PDBbind data-leakage effect with a small 3D GNN on a laptop — and couldn't. Two independent diagnostics show why: leakage only inflates models strong enough to memorize.

Yassir Boulaamane

• Jun 19, 2026