Principal Component Analysis and Free Energy Landscape Mapping Using GROMACS
2 minute read
This guide outlines how to perform Principal Component Analysis (PCA) and compute Free Energy Landscapes (FEL) from molecular dynamics (MD) simulations using GROMACS. These analyses are useful to capture dominant motions and identify energetically favorable states in biomolecular systems. —
Part 1: Principal Component Analysis (PCA)
Step 1: Covariance Matrix and Eigenvector Calculation
Run the following command to compute the covariance matrix and extract eigenvectors:
gmx covar -s md_0_100.tpr -f md_0_100.xtc -o eigenvalues.xvg -v eigenvectors.trr -xpma covar.xpm
Note: If the command fails using
md_0_100.gro
, usemd_0_100.tpr
.
Select protein and ligand when prompted.
Step 2: Determine the Number of Principal Components
Analyze eigenvalues.xvg
to compute:
Total Variance
Total Variance = Σ λᵢ
Explained Variance (%)
Explained Variance (%) = (λᵢ / Σ λᵢ) × 100
Cumulative Variance
Cumulative Variance = Σ (λᵢ / Σ λᵢ) × 100
Select the minimum number of components that cumulatively explain more than 50% of the total variance.
Step 3: Plot Principal Component Projections
For projecting motion along PC1 to PC5, run:
gmx anaeig -f md_0_100.xtc -s md_0_100.tpr -v eigenvectors.trr -first 1 -last 5 \
-proj pc15_lovastatina.xvg -2d project2d_s100_lovastatina.xvg -tu ns
-proj
: 1D projection along each selected eigenvector-2d
: Combined 2D projection using the first 5 components
Part 2: Free Energy Landscape (FEL) Calculation
Step 1: Generate Principal Component Projections
gmx anaeig -f md_0_100.xtc -s md_0_100.tpr -v eigenvectors.trr -last 1 -proj pc1.xvg
gmx anaeig -f md_0_100.xtc -s md_0_100.tpr -v eigenvectors.trr -first 2 -last 2 -proj pc2.xvg
Merge PC1 and PC2 into a single file:
paste pc1.xvg pc2.xvg | awk '{print $1, $2, $4}' > PC1PC2.xvg
Step 2: Compute Free Energy Surface
Use GROMACS SHAM module:
gmx sham -f PC1PC2.xvg -ls FES.xpm
Convert .xpm
to .dat
using a Python script:
python2.7 xpm2txt.py -f FES.xpm -o fel.dat
Python script xpm2txt.py
converts the XPM matrix to a 3-column text file (X, Y, Energy).
Step 3: Extract Minimum Energy Conformation
After plotting fel.dat
, identify the time of the minimum energy, then extract the conformation:
gmx trjconv -s md_0_100.tpr -f md_0_100.xtc -o min_energy.pdb -dump 520
Replace 520
with the appropriate time (in ps) corresponding to the energy minimum.
Summary
This workflow provides a systematic approach to:
- Identify dominant motions in your trajectory via PCA
- Visualize structural variation through projection plots
- Map the free energy landscape based on PC space
- Extract the most stable conformations for further analysis
These methods are widely used for post-simulation analysis in structural biology and drug discovery.