GROMACS Tutorial: Coarse-Grained Simulations with Martini 3 Force Field

 

In today’s blog post, we’re going to explore an interesting technique called coarse-grained (CG) simulation and we are going to see how to run them using GROMACS.

Specifically, we’ll focus on the Martini force field, which is rapidly gaining popularity to perform CG simulations of biomolecular systems.

In this article, we won’t dive into the intricate details of the coarse-graining technique itself, as that warrants a discussion of its own. Instead, the primary focus will be on guiding you through the process of running a CG simulation using GROMACS.

Now, before we dive in, it’s important to note that this tutorial assumes a basic familiarity with GROMACS. If you’re new to GROMACS, I encourage you to explore some of the previous blog posts to familiarize yourself with the software’s functionalities.

. Once you have a grasp of the basics, you’ll be ready to start.

 

 

Standard MD simulations, while incredibly valuable, can be computationally intensive and time-consuming, often limiting our ability to study long-timescale processes in biology.

To overcome these limitations, coarse-grained (CG) simulations have emerged as a powerful solution.

The basic principle behind CG simulations is the mapping of multiple atoms to a single representative entity, often referred to as a bead. By doing so, we have different advantages over atomistic simulations. Here I will only mention the two that are more intuitive:

  1. Reduced Atom Count: In CG simulations, multiple atoms are mapped to a single representative entity, resulting in a significantly reduced number of atoms in the system. This reduction in atom count allows for more efficient computations and faster simulations.

  2. Larger Timesteps: The simplification of the system in CG simulations also enables the use of larger timesteps compared to atomistic simulations. With fewer degrees of freedom to consider, CG simulations can effectively explore longer timescales, potentially up to 20 fs or even more. This larger timestep further contributes to the computational speedup of CG simulations.

 

The reduction of degrees of freedom significantly accelerates the simulation while still capturing the essential dynamics of the system. This efficiency enables the investigation of biologically relevant phenomena occurring over longer timescales that would otherwise be challenging to simulate using atomistic models.

 

 

The general workflow is not particularly different from a standard MD simulation. The only thing we need to change is the initial step since the protein will need to be converted from an atomistic structure to a coarse-grained one but the rest will be pretty much similar to a standard MD simulation except for the fact that we will have to twist some of the parameters in the mdp file.

 

 

 

 

First of all, we can download an atomistic structure from the Protein Data Bank. As an example, we are going to use the famous lysozyme (PDB ID: 1AKI)

1
wget http://www.pdb.org/pdb/files/1AKI.pdb

 

The first thing to do as always is to make sure to clean the protein from other molecules. You can use the grep command

1
grep -v HETATM 1AKI.pdb > 1aki_clean.pdb

 

To convert the clean atomistic structure of the protein to CG and generate the corresponding topology file we can use the Martinize python script.

Download martinize

You can download Martinize by following the procedure reported here. Or you can simply run these two:

1
pip install vermouth
1
pip install git+https://github.com/marrink-lab/vermouth-martinize.git#vermouth

To check whether the installation was successful you can try:

1
martinize2 -h

 

To convert an atomistic structure into a coarse-grained (CG) representation using the martinize2 tool, we need to provide various flags and parameters. Let’s break down the command and explain each flag in detail:

1
martinize2 -f 1aki_clean.pdb -dssp /home/user/anaconda3/envs/martinize/bin/mkdssp -x 1aki_cg.pdb -o topol.top -ff martini3001 -scfix -cys auto -p backbone -elastic -ef 700.0 -el 0.5 -eu 0.9
Syntax
  1. The command begins with martinize2, which is the executable for the martinize2 tool.
  2. The -f flag is used to specify the input atomistic structure file in PDB format. In this case, it is 1aki_clean.pdb.
  3. Provide the secondary structure via the -dssp flag (More info below).
  4. The -x flag is used to specify the output file name for the converted CG structure. Here, the name 1aki_cg.pdb is chosen for the coarse-grained structure file.
  5. Select the -o flag and name the topology (topol.top). The topology file will contain all the information about the system
  6. Select the Martini 3 force field with -ff martini3001
  7. The -scfix flag applies side-chain corrections.
  8. The -cys flag is set to auto, which allows the program to automatically detect disulfide bonds in the protein and create appropriate constraints.
  9. The -p backbone option defines position restraints on the backbone.
  10. The -elastic flag activates the elastic network model. Elastic networks are used to introduce harmonic constraints between pairs of atoms within a certain cutoff distance, simulating the connectivity and flexibility of the protein.
  11. The -ef, -el, and -eu flags control the parameters for the elastic network. -ef 700.0 sets the force constant for the elastic network springs to 700 • kJ/mol/nm$^2$, -el 0.5 defines the lower cutoff for interactions, and -eu 0.9 specifies the upper cutoff. These parameters determine the strength and range of the harmonic constraints in the elastic network.
DSSP

Note that this also requires as input the secondary structure of the protein. A convenient way to pass it to Martinize is to download the dssp tool and give the executable via the -dssp flag.

More info on the tool is here. You can download it via conda.

1
conda install -c salilab dssp

You will need to give the full path to dssp to the martinize2. To get it you can type one of the following two commands:

1
2
which dssp
which mkdssp

It should return a path. In my case was something like this

1
/home/user/anaconda3/envs/martinize2_env/bin/mkdssp

This will be the path to use for the script

 

This will generate three files:

  • the CG structure (1aki_cg.pdb)
  • The corresponding topology (topol.top)
  • The parameters file for the protein (molecule_0.itp). Here you will also find the command you used to generate the parameters.

 

By comparing the initial all-atom structure (1aki_clean.pdb) and the obtained CG structure (1aki_cg.pdb), you will easily notice a difference.

In the all-atom structure, each atom is individually represented, retaining the level of atomistic detail. In contrast, the coarse-grained structure simplifies the representation by grouping atoms into beads, reducing the complexity of the system and enabling faster simulations.

All-atom to Coarse-grained structure comparison (Serine amino acid)

The image showcases the comparison between the all-atom structure and the coarse-grained structure of a serine residue in the protein. In this example, the serine residue is mapped onto two beads,

 

 

After obtaining our coarse-grained (CG) structure, we need to prepare the system for the simulation. This process typically involves three essential steps: solvation, ion addition for system neutrality, and optionally embedding the protein in a membrane environment.

To streamline this process, one convenient tool we can utilize is the insane tool to build CG systems. It offers a range of functionalities, including generating a solvated system and embedding the protein in a membrane if desired. You can obtain the insane.py script from here.

Once you have acquired the script, you can utilize it to solvate the system with water and add the necessary ions for neutralization. Let’s examine the following Python command:

1
python2.7 insane.py -f 1aki_cg.pdb -o system.gro -p topol.top -d 7 -sol W:100 -salt 0.15 
Syntax

Let’s break down the options and parameters of this command:

  • -f 1aki_cg.pdb: This option specifies the input CG structure file in PDB format, which in this case is 1aki_cg.pdb.

  • -p topol.top: The -p flag is used to provide the topology file (topol.top).

  • -o system.gro: The -o flag sets the name of the output file for the solvated system in the gro format (system.gro)

  • -d 7: builds the periodic box such that the distance between two periodic images is greater than 7 nm.

  • -sol W: The -sol flag specifies the solvent type to add to the system. Here, W represents water molecules. By specifying -sol W, we indicate that we want to solvate the system with water.

  • -salt 0.15: The -salt flag is used to add salt ions to the system. The value 0.15 indicates the desired salt concentration in moles per liter (M).

 

By executing this Python command, the insane.py script will generate a solvated system, with the specified number of water molecules and added ions to neutralize the system. The resulting structure will be saved as monomer_6cm4_memb_cg.gro and can be further used for subsequent steps in the simulation.

 

 

The itp files for the Martini 3 force field can be downloaded here. I generally store them in a directory named martini_itp. Inside here you will find the general parameters of the force field, as well as the ones for the water and the ions.

1
tar -xvf martini_itp.tar.gz

 

Now you can also move the previously generated itp file for the protein inside this folder.

1
mv molecule_0.itp martini_itp

 

Lastly, we need to modify the topology file (topol.top) to include all the parameters we need for our simulation. This is what the file should be like:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
#include "martini_itp/martini_v3.0.0.itp"
#include "martini_itp/martini_v3.0.0_solvents_v1.itp"
#include "martini_itp/martini_v3.0.0_ions_v1.itp"
#include "martini_itp/molecule_0.itp"

[ system ]
; name
Insanely solvated protein.

[ molecules ]
; name  number
molecule_0       1
W            10672
NA+            113
CL-            121

Now we are ready to go.

 

 

Once we have prepared the solvated system, we can proceed with the simulation, which follows a similar protocol to a standard molecular dynamics (MD) simulation. As always, the key steps include minimization, equilibration, and the production run.

The only differences you will find are the mdp files for the runs, and the difference in performance with respect to an atomic simulation.

 

 

Let’s start by creating a directory dedicated to the minimization phase and navigating into it:

1
2
mkdir minim
cd minim

 

Next, you will need to download the mdp file for the minimization run and copy that into this directory.

This file contains the specifications for the minimization run. For example, let’s assume the file is named minim.mdp. To generate the binary input file (tpr) needed for the minimization run, execute the following command:

1
gmx_mpi grompp -f minim.mdp -c ../system.gro -r ../system.gro -p ../topol.top -o em.tpr

 

With the em.tpr file generated, we can proceed to run the minimization simulation using the following command:

1
gmx_mpi mdrun -v -deffnm em

 

 

After the minimization step, we can proceed with the equilibration phase of our coarse-grained (CG) simulation.

Due to the increased stability of CG models, a more rough equilibration is typically sufficient. In this case, we will focus on a 50 ns NPT (constant number of particles, pressure, and temperature) simulation.

Now we can create a directory for the NPT equilibration and navigate into it.

1
2
mkdir npt
cd npt

 

Find the mdp file for the NPT run here. Next, we can generate the binary input file (tpr) for the NPT equilibration run using the following command:

1
gmx_mpi grompp -f npt.mdp -c ../minim/em.gro -r ../minim/em.gro -p ../topol.top -o npt.tpr

 

With the npt.tpr file generated, we can now run the NPT equilibration simulation. To execute the simulation in the background, use the following command:

1
nohup gmx_mpi mdrun -v -deffnm npt &

 

 

Finally, we will run a 1 μs production run following the usual procedure. Create a directory and move into it.

1
2
mkdir md
cd md

 

Download the mdp file for the production run here, then you can use it to generate the md.tpr file.

1
gmx_mpi grompp -f md.mdp -c ../npt/npt.gro -t ../npt/npt.cpt -p ../topol.top -o md.tpr -n ../index.ndx

 

Run the simulation in background:

1
nohup gmx_mpi mdrun -v -deffnm md &

 

 

Coarse-grained (CG) simulations offer the advantage of significantly reducing the computational time required for molecular dynamics simulations.

However, it is crucial to exercise caution and ensure the reliability of the results obtained from CG simulations. Comparing the fluctuations of the CG system with those of the same system at the all-atom level can help validate the simulation and assess its quality.

To perform this comparison, you can utilize the tools available in GROMACS.

 

  1. gmx rms to compute the root-mean-square deviation (RMSD)

The RMSD analysis allows you to quantify the deviation between the CG and all-atom structures by calculating the average distance between corresponding atoms in the two systems. If the RMSD value is within an acceptable range, it indicates that the CG model accurately captures the behavior of the all-atom system.

 

  1. gmx rmsf to compute the root-mean-square deviation (RMSF)

The RMSF analysis measures the fluctuation of individual atoms in the CG system. By comparing the RMSF profiles of the CG and all-atom systems, you can identify any significant discrepancies in the dynamics and fluctuations of the two representations. Large differences in the RMSF values may indicate potential inaccuracies or limitations in the CG model.

If the fluctuations are too high you may have to increase the value of the elastic network provided to the martinize tool with -ef, and vice versa if the fluctuations are too low you may want to decrease the strength of the elastic network.

 

Backbone in CG structures

We generally compute the RMSD backbone to backbone but, since we mapped the original atomistic structure to CG GROMACS will not be able to recognize the backbone by default. For this reason, you will have to create a specific index file (index.ndx):

1
gmx_mpi make_ndx -f system.gro -o index.ndx

and select only the backbone bead (BB).

1
2
a BB
q

Now you can use this group of atoms to compute the RMSD and RMSF.