How to run a Molecular Dynamics simulation using GROMACS

GROMACS basic functions and commands

 

During your journey in Computational Chemistry, you will most likely come across different software used for molecular simulation. One of the most widely used is GROMACS, a free and open source software used to perform Molecular Dynamics (MD) simulations and to analyze the resulting trajectories.

If you are new to this world it can be frightening to go through the documentation and try to figure out how this program works. We have all been there.

Luckily for you, on this website, we have a detailed guide explaining to you how to perform simulation using GROMACS. Here I will give you a brief overview of the software as well as a general introduction to its basic commands.

However, before going through the tutorials I highly recommend you to check out my articles on Molecular Dynamics (MD) so that you can get a general idea of the technique, as well as the different steps involved in a simulation. Also, I will consider that you are quite comfortable with the concepts of:

  1. Force field
  2. Ensembles

 

 

GROMACS is a versatile, high-performance molecular dynamics simulation package that can be used to study the behavior of molecules and biomolecules in various environments. It is particularly well-suited for simulating biomolecular systems, such as proteins, nucleic acids, and lipids, and has been widely used in a variety of fields, including chemistry, biology, and materials science.

One of the key features of GROMACS is its ability to perform highly efficient simulations using advanced algorithms and techniques making it possible to simulate large systems with millions of atoms, and to do so with high precision and accuracy.

GROMACS allows users to easily set up and run simulations, analyze the results, and also modify and optimize their simulations as needed. This makes it an ideal tool for researchers and scientists who are looking to study complex biomolecular systems and gain a better understanding of their behavior and interactions at the molecular level.

In addition to its powerful simulation capabilities, GROMACS also offers a wide range of advanced features and tools, such as support for free energy calculations and the ability to perform enhance sampling methods. These features make it possible to study a wide range of phenomena and to tackle challenging research problems such as ligand binding or protein-protein interactions.

If you are interested in this software and you have no idea how to get started you are in the right place. Here is a guide on how to install GROMACS if you haven’t done that already.

Let’s begin.

 

 

The general idea of GROMACS is not different from any other way of performing an MD simulation.

The first thing that we need to do is to prepare our system for the simulation. The system preparation phase is composed by three steps.

  1. Create a simulation box
  2. Solvate the system (e.g. using water)
  3. Neutralize the overall system using counterions ($Na^+$, $Cl^-$)

 

 

Once our system is ready we can proceed with the simulation phase.

  1. Energy minimization
  2. NVT equilibration
  3. NPT equilibration
  4. Production run

For each of these steps, we have specific commands to use and they require specific file formats.

Below you can find a list of the most common commands needed to carry out a simple MD simulation. The list is of course not exhaustive as the number of things you can do with GROMACS is immense.

 

 

Let’s start with the basics.

Generally speaking, a GROMACS command can be seen as composed of four components:

 

1
gmx module -flag option

 

  • gmx tells the computer we want to use GROMACS and is the starting point for each GROMACS command.
  • The module specifies the operation we want to compute
  • Within each action, we have different flags (-flag) that we can use
  • After the flag, we need to give GROMACS an option, which can be a value or a file with some specific extension
Warning
If you have installed an MPI version of GROMACS you should invoke the program with gmx_mpi instead of the standard gmx.

For each module you have different available flags and file formats you need to provide to the software. You can always refer to the official documentation to check for all the possible options within a command.

In addition you can get the available options for each module by typing;

1
gmx module -h 

 

As you may notice, the syntax for GROMACS is quite easy. Let’s see how it works in practice.

 

 

In the initial phase of our experiment, we need to set up our system. The workflow involves four commands:

  • pdb2gmx
  • editconf
  • solvate
  • genion

 

System preparation workflow

Let’s consider the commands involved in more detail.

 

 

It is very likely that you have the starting structure for your minimization as a PDB file (More info on PDB files here). However, you will need to convert it into files that can be used by GROMACS.

The first command we are going to need is pdb2gmx, which is needed to generate three files.

  1. topol.top : The topology file will contain force field parameters information for each atom in the structure. This includes bonding as well as non bonding parameters.
  2. posre.itp: A position restraint file that will be useful in the equilibration step of our simulation.
  3. protein.gro : An output structure in the gro format. The result is a coordinate file that will have x, y, and z coordinates for each atom of the structure in it, much like a simple PDB file.

This will be the starting point for our GROMACS simulation. If you are interested, you can find more info on these GROMACS file formats here.

The command is the following:

 

1
gmx pdb2gmx -f protein.PDB -o protein.gro -water tip3p
Syntax
  1. Call the program with gmx
  2. Select the pdb2gmx module
  3. Call the -f flag and provide the starting PDB structure protein.pdb
  4. Select the -o flag and decide how you want to name the output file
  5. The -water flag allows us to select the water model we want to use
Warning
pdb2gmx will give you an error if you have an incomplete structure that lacks some residues. Make sure to reconstruct the missing residues via homology modeling.

Once you launched the command you will be asked to choose the Force Field you desire.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
Select the Force Field:
From '/usr/local/gromacs/share/gromacs/top':
 1: AMBER03 protein, nucleic AMBER94 (Duan et al., J. Comp. Chem. 24, 1999-2012, 2003)
 2: AMBER94 force field (Cornell et al., JACS 117, 5179-5197, 1995)
 3: AMBER96 protein, nucleic AMBER94 (Kollman et al., Acc. Chem. Res. 29, 461-469, 1996)
 4: AMBER99 protein, nucleic AMBER94 (Wang et al., J. Comp. Chem. 21, 1049-1074, 2000)
 5: AMBER99SB protein, nucleic AMBER94 (Hornak et al., Proteins 65, 712-725, 2006)
 6: AMBER99SB-ILDN protein, nucleic AMBER94 (Lindorff-Larsen et al., Proteins 78, 1950-58, 2010)
 7: AMBERGS force field (Garcia & Sanbonmatsu, PNAS 99, 2782-2787, 2002)
 8: CHARMM27 all-atom force field (CHARM22 plus CMAP for proteins)
 9: GROMOS96 43a1 force field
10: GROMOS96 43a2 force field (improved alkane dihedrals)
11: GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)
12: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)
13: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)
14: GROMOS96 54a7 force field (Eur. Biophys. J. (2011), 40,, 843-856, DOI: 10.1007/s00249-011-0700-9)
15: OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)

 

 

The editconf command defines the size and shape of the box for our simulation.

 

1
gmx editconf -f protein.gro -o protein_box.gro -c -d 1.0 -bt cubic
Syntax
  1. Call the program with gmx
  2. Select the editconf command
  3. Select the -f flag and provide the starting structure (protein.gro)
  4. Call the -o flag and decide how you want to name the output file (protein_box.gro)
  5. The -c places our system in the center of the box
  6. Choose the -d flag and place our system at the selected distance from the box (at least 1.0 nm in our case)
  7. Decide the shape of the simulation box with the -bt flag
Warning
The size of the box needs to be selected with great care. When using periodic boundary conditions we need to make sure that we respect the minimum image convention.

 

 

The name says it all. We use this command to solvate the box.

 

1
gmx solvate -cp protein_box.gro -cs spc216.gro -o protein_solv.gro -p topol.top
Syntax
  1. Call the program with gmx
  2. Select the solvate command
  3. Select the -cp flag and provide the starting structure for the solute (protein_box.gro)
  4. Specify the solvent with the -cs flag. The default solvent is Simple Point Charge (spc) water
  5. Call the -o flag and decide how you want to name the output file (protein_solv.gro)
  6. Finally, we use the -p flag and provide the previously generated topology file

 

 

In the final step of the system preparation, we use the genion command to replace solvent molecules with ions. By adding ions we are able to neutralize the system.

 

1
gmx genion -s ions.tpr -o system.gro -p topol.top -pname NA -nname CL -neutral
Syntax
  1. Call the program with gmx
  2. Select the genion command
  3. Select the -s flag and provide the .tpr file for the ions (ions.tpr)
  4. Call the -o flag and decide how you want to name the output file (system.gro)
  5. Use the -p flag and provide the topology file (topol.top)
  6. Choose the -pname flag to decide the name of the positive ions name
  7. The -nname flag is used to decide the name of the negative charge ions
  8. The -neutral flag tells GROMACS to neutralize the net charge of the system
Warning
To use this command we need to generate a tpr file for the ions. This is done via the grompp command which is discussed below.

Once you launched the command you will be asked to choose the atoms you want to replace with ions. Of course, you will have to select the solvent.

 

 

At this stage, we have our system ready to be simulated.

As already mentioned, the simulation phase involves multiple steps. A typical MD protocol consists of different simulations carried out within different ensembles.

All of them are simply based on two commands, regardless of the type of simulation we want to run:

  1. grompp: we use it to generate an input file in the tpr format
  2. mdrun: we run the simulation using the previously generated tpr file

 

GROMACS Simulation workflow

Overall, the workflow to perform a simulation is extremely intuitive. Let’s analyze the commands more in-depth.

 

 

The grompp command is necessary to assemble:

  • A coordinate file (system.gro)
  • A topology file (topol.top)
  • An mdp file containing the parameters for our simulation (simulation.mdp)

As a result, we get a .tpr file with coordinates, topology, and parameters for minimization. Now we can use this file to run the simulation.

Simple right?

 

1
gmx grompp -f simulation.mdp -c system.gro -r system.gro -t system.cpt -p topol.top -o simulation.tpr
Syntax
  1. Call the program with gmx
  2. Select the grompp command
  3. Select the -f flag and provide the mdp file with the parameters for the run (simulation.mdp)
  4. The -c flag is used to provide the coordinates of atoms
  5. The -r flag is needed if we are using position restraints. Mostly during the equilibration steps of your simulation.
  6. In some cases we may want to continue our simulation starting from a previous one. To this aim, we can use the -t flag coupled with a previously generated cpt file.
  7. Use the -p flag and provide the topology file (topol.top)
  8. Call the -o flag and decide how you want to name the output file (simulation.tpr)

 

 

Finally, we are ready to launch the simulation with the mdrun command.

 

1
gmx mdrun -v -deffnm simulation
Syntax
  1. Call the program with gmx
  2. Select the mdrun command
  3. The -v makes the command verbose. It is useful to make clearer what the program is actually doing while running.
  4. The -deffnm option is followed by the prefix of the tpr file we are using. This will generate files having the same prefix.

                    (simulation.tpr $\Rightarrow$ -deffnm simulation)

After completion you will generally get five files:

  • simulation.log containing information about the run
  • simulation.gro with the final structure of the system
  • simulation.edr with energies that GROMACS collects during the simulation
  • simulation.xtc a “light” weight trajectory with the coordinates of our system in low precision
  • simulation.trr with the higher precision trajectory of positions, velocities, and forces during the simulation

You can use these files to perform a variety of different analysis.

Check out the article where I show you how you can extract thermodynamic properties from an edr file.

 

 

Now that you know the basics of this software you can check the tutorials on how to run an MD simulation using GROMACS. Two of them are available for now. I suggest that you run them in the specified order.

  1. GROMACS simulation of a protein in water environment.
  2. GROMACS simulation of a protein in membrane environment.