GROMACS mdp file parameters

 

A Molecular Dynamics (MD) simulation can be carried out within different conditions, for different time lengths, using different algorithms to solve the equations of motion, and so on.

To make sure that you can perform all the different kinds of simulation according to your needs GROMACS allows you to specify several parameters.

These parameters need to be inserted in a so-called Molecular Dynamics parameters file (mdp file), a simple text file in the mdp extension.

This file is then going to be used as input in the gmx grompp module to assemble a tpr file needed to run the simulation.

 

 

As you may know, setting up an MD simulation is not a trivial task. It requires different steps, and some of them may differ depending on your system.

Almost in every case, you will need to perform different simulations to bring your system to equilibrium before running a definitive simulation to extract relevant data.

During this process, you will need an mdp file for each simulation you want to run. Furthermore, you will need to change the parameters in every file to adjust each simulation according to your needs.

The list of possible parameters is huge. In this article, I will only discuss the most important ones and I will give you a few templates that you can use for some general protein simulations.

You can always refer to the official guide for more information.

 

Warning

In most cases, you will use fairly standard values for your simulations and you won’t have to change them significantly.

However, always keep in mind that the parameters may need to be adjusted depending on your system. These templates are just intended as a starting point that you can modify and adapt to your specific case.

Let’s briefly discuss the mdp files for the most common kinds of simulation.

 

 

The first mdp file we are going to consider is the one for the energy minimization.

During this procedure, we are simply interested in minimizing the potential energy of the system to avoid steric clashes.

A typical mdp file for a minimization run is the one you can find in the code box below.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
; MINIMIZATION RUN (minim.mdp)
;(used as input into grompp to generate em.tpr)
;
; Parameters describing what to do, when to stop, and what to save
integrator  = steep         ; A steepest descent algorithm for energy minimization
emtol       = 1000.0        ; Stop minimization when the maximum force < 1000.0 kJ/mol/nm
emstep      = 0.01          ; Minimization step size
nsteps      = 50000         ; Maximum number of (minimization) steps to perform

;ELECTROSTATIC AND VDWAALS
; Parameters describing how to find the neighbors of each atom and how to calculate the interactions
nstlist         = 1         ; Frequency to update the neighbor list and long range forces
cutoff-scheme   = Verlet    ; Buffered neighbor searching
ns_type         = grid      ; Method to determine neighbor list (simple, grid)
coulombtype     = PME       ; Treatment of long range electrostatic interactions
rcoulomb        = 1.0       ; Short-range electrostatic cut-off (nm)
rvdw            = 1.0       ; Short-range Van der Waals cut-off (nm)
pbc             = xyz       ; Periodic Boundary Conditions in all 3 dimensions

 

The file already contains some comments (specified by the semicolumn “;”) with information about each parameter.

As you can see the file is composed of two paragraphs.

The first one is telling GROMACS some basic info about the run such as:

  • The algorithm we want to use for the minimization. integrator=steep means that we are using the well-known steepest descent method. You can also use more sophisticated algorithms such as the conjugate gradient (cg) or the l-bfgs. This will not be necessary for the majority of the systems.

  • The maximum number of steps for the minimization (nsteps=50000). If it does not converge within this number of steps the run will end anyway.

  • The convergence is given by emtol=1000. This means that the run will be considered as converged when the maximum force ($F_{max}$) is smaller than 1000 kJ mol$^{-1}$ nm$^{-1}$.

  • The step size of the minimization is given by emstep=0.01 (nm)

 

The second paragraph is dedicated to the treatment of:

  1. Electrostatic and Van der Waals (VDW) interactions
  2. Implementation of Periodic Boundary Conditions (PBC)

For this part, we usually have some standard values that are conserved in the other mdp files that you will find in this article. For instance, it is generally recommendable to use a cut-off value in the range of 1.0-1.2 nm for both Electrostatic (rcoulomb = 1.0) and VDW (rvdw=1.0) interactions, and to use the Particle Mesh Ewald (PME) method to treat the long-range non-bonding interactions.

 

 

The second mdp file we will consider is a template for an equilibration in the NVT ensemble.

This is usually required to bring the system to the desired temperature.

Find the mdp file for a general NVT equilibration in the code box below.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
define                  = -DPOSRES  ; position restrain the protein
; Run parameters
integrator              = md        ; leap-frog integrator
nsteps                  = 50000     ; 2 * 50000 = 100 ps
dt                      = 0.002     ; 2 fs
; Output control
nstxout                 = 500       ; save coordinates every 1.0 ps
nstvout                 = 500       ; save velocities every 1.0 ps
nstenergy               = 500       ; save energies every 1.0 ps
nstlog                  = 500       ; update log file every 1.0 ps
; Bond parameters
continuation            = no        ; first dynamics run
constraint_algorithm    = lincs     ; holonomic constraints 
constraints             = h-bonds   ; bonds involving H are constrained
lincs_iter              = 1         ; accuracy of LINCS
lincs_order             = 4         ; also related to accuracy
; Nonbonded settings 
cutoff-scheme           = Verlet    ; Buffered neighbor searching
ns_type                 = grid      ; search neighboring grid cells
nstlist                 = 10        ; 20 fs, largely irrelevant with Verlet
rcoulomb                = 1.0       ; short-range electrostatic cutoff (in nm)
rvdw                    = 1.0       ; short-range van der Waals cutoff (in nm)
DispCorr                = EnerPres  ; account for cut-off vdW scheme
; Electrostatics
coulombtype             = PME       ; Particle Mesh Ewald for long-range electrostatics
pme_order               = 4         ; cubic interpolation
fourierspacing          = 0.16      ; grid spacing for FFT
; Temperature coupling is on
tcoupl                  = V-rescale             ; modified Berendsen thermostat
tc-grps                 = Protein Non-Protein   ; two coupling groups - more accurate
tau_t                   = 0.1     0.1           ; time constant, in ps
ref_t                   = 300     300           ; reference temperature, one for each group, in K
; Pressure coupling is off
pcoupl                  = no        ; no pressure coupling in NVT
; Periodic boundary conditions
pbc                     = xyz       ; 3-D PBC
; Velocity generation
gen_vel                 = yes       ; assign velocities from Maxwell distribution
gen_temp                = 300       ; temperature for Maxwell distribution
gen_seed                = -1        ; generate a random seed

 

Let’s go through some of the most important parameters in the text file.

In the first line, we apply position restraints to the protein via the define=-DPOSRES keyword. Position restraints are contained in the posre.itp file that you create via the gmx pdb2gmx module, and it allows us to equilibrate the solvent around the protein without causing significant structural changes in the protein.

 

Then we have a section dedicated to the Run parameters

1
2
3
4
; Run parameters
integrator              = md        ; leap-frog integrator
nsteps                  = 50000     ; 2 * 50000 = 100 ps
dt                      = 0.002     ; 2 fs

Here we specify:

1. The type of algorithm we want to use to solve the equations of motion. The leap-frog integrator was selected in this case (integrator=md)

 

2. The duration of the simulation is given by the time step we select (dt) as well as the number of steps to perform (nsteps). In this specific case, we selected a time step of 2 fs (dt=0.002) and we decided to perform 50.000 steps (nsteps=50000). The overall simulation time ($t$) is given by the time step multiplied by the number of steps.

 

$$ t = dt \cdot nsteps = 0.002 \cdot 50000=100000fs = 100ps = 0.1ns $$

 

Based on the complexity of your system the simulation time required to equilibrate may vary a lot. There is no rule of thumb to know a priori how much time your system will need to equilibrate. A little bit of experience can help. You can check if the system has equilibrated by looking at the RMSD value.

 

The following paragraph is telling GROMACS how frequently to save information to the output files.

1
2
3
4
5
; Output control
nstxout                 = 500       ; save coordinates every 1.0 ps
nstvout                 = 500       ; save velocities every 1.0 ps
nstenergy               = 500       ; save energies every 1.0 ps
nstlog                  = 500       ; update log file every 1.0 ps
  • The nstxout=500 and nstvout=500 parameters mean that we are writing the coordinates and the velocities to the trajectory file in the trr format every 500 time steps.

 

$$ t = dt \cdot nsteps = 0.002 \cdot 500 = 1000 fs = 1 ps $$

 

If you want the trajectory in a less-consuming format (xtc) you can use the nstxout-compressed parameter instead of the ones shown in the template.

 

  • The nstenergy=500 parameter is needed to communicate to GROMACS how often we want to save the energies in the edr output file. You can then use this file to extract thermodynamic properties from the simulation via the gmx energy module.

 

  • The nstlog=500 specifies how frequently to save information to the log file, which provides a summary of the run.

 

Let’s consider the next section.

1
2
3
4
5
6
; Bond parameters
continuation            = no        ; first dynamics run
constraint_algorithm    = lincs     ; holonomic constraints 
constraints             = h-bonds   ; bonds involving H are constrained
lincs_iter              = 1         ; accuracy of LINCS
lincs_order             = 4         ; also related to accuracy

The purpose of this box is to mainly determine two things:

1. We are not continuing this simulation starting from another run (continuation=no). This is often the case for an NVT equilibration.

2. We want to constraint the bonds of the protein using the LINCS algorithm.

 

I am going to skip the sections related to the electrostatic and VdW interactions that were already discussed in the previous mdp file. Let’s focus on the section where we specify the conditions for the experiment.

1
2
3
4
5
6
7
; Temperature coupling is on
tcoupl                  = V-rescale             ; modified Berendsen thermostat
tc-grps                 = Protein Non-Protein   ; two coupling groups - more accurate
tau_t                   = 0.1     0.1           ; time constant, in ps
ref_t                   = 300     300           ; reference temperature, one for each group, in K
; Pressure coupling is off
pcoupl                  = no        ; no pressure coupling in NVT

To bring the system to the desired temperature we need an algorithm to simulate a thermostat. GROMACS provides you with different algorithms. You can decide the algorithm to use by setting the tcoupl parameter. The V-rescale thermostat is appropriate for a general equilibration. However, for more complex systems you may need to use less sophisticated algorithms such as the berendsen.

You can create separate groups for the thermostat with the tc-groups parameter. It is generally recommendable to create at least two groups, consisting of the protein (Protein) and the rest of the system (Non-protein). You can pass any GROMACS index you want.

Finally, we can choose the desired temperature for both groups with the ref_t parameter and the coupling constant with tau_t.

Note
In an NVT simulation, we don’t adjust the pressure of the system so the box of the system stays fixed (pcoupl=no). We will discuss this in more detail in the following mdp file.

 

In the last section of the file, we simply need to decide if we want to assign initial velocities to our system.

1
2
3
4
; Velocity generation
gen_vel                 = yes       ; assign velocities from Maxwell distribution
gen_temp                = 300       ; temperature for Maxwell distribution
gen_seed                = -1        ; generate a random seed

Since the NVT equilibration is generally the first step we need to assign initial velocities using the keyword gen_vel=yes. The velocities will be selected from a Maxwell-Boltzmann distribution at the temperature that you choose for the gen_temp parameter.

 

 

The NPT equilibration is required to bring the system to the desired pressure.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
define                  = -DPOSRES  ; position restrain the protein
; Run parameters
integrator              = md        ; leap-frog integrator
nsteps                  = 50000     ; 2 * 50000 = 100 ps
dt                      = 0.002     ; 2 fs
; Output control
nstxout                 = 500       ; save coordinates every 1.0 ps
nstvout                 = 500       ; save velocities every 1.0 ps
nstenergy               = 500       ; save energies every 1.0 ps
nstlog                  = 500       ; update log file every 1.0 ps
; Bond parameters
continuation            = yes       ; Restarting after NVT 
constraint_algorithm    = lincs     ; holonomic constraints 
constraints             = h-bonds   ; bonds involving H are constrained
lincs_iter              = 1         ; accuracy of LINCS
lincs_order             = 4         ; also related to accuracy
; Nonbonded settings 
cutoff-scheme           = Verlet    ; Buffered neighbor searching
ns_type                 = grid      ; search neighboring grid cells
nstlist                 = 10        ; 20 fs, largely irrelevant with Verlet scheme
rcoulomb                = 1.0       ; short-range electrostatic cutoff (in nm)
rvdw                    = 1.0       ; short-range van der Waals cutoff (in nm)
DispCorr                = EnerPres  ; account for cut-off vdW scheme
; Electrostatics
coulombtype             = PME       ; Particle Mesh Ewald for long-range electrostatics
pme_order               = 4         ; cubic interpolation
fourierspacing          = 0.16      ; grid spacing for FFT
; Temperature coupling is on
tcoupl                  = V-rescale             ; modified Berendsen thermostat
tc-grps                 = Protein Non-Protein   ; two coupling groups - more accurate
tau_t                   = 0.1     0.1           ; time constant, in ps
ref_t                   = 300     300           ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl                  = Parrinello-Rahman     ; Pressure coupling on in NPT
pcoupltype              = isotropic             ; uniform scaling of box vectors
tau_p                   = 2.0                   ; time constant, in ps
ref_p                   = 1.0                   ; reference pressure, in bar
compressibility         = 4.5e-5                ; isothermal compressibility of water, bar^-1
refcoord_scaling        = com
; Periodic boundary conditions
pbc                     = xyz       ; 3-D PBC
; Velocity generation
gen_vel                 = no        ; Velocity generation is off 

 

Most of the parameters in this file were already discussed in the previous parts of this article.

Here I am just going to highlight the main difference with respect to the previous file for the NVT equilibration.

1
2
3
4
5
6
7
; Pressure coupling is on
pcoupl                  = Parrinello-Rahman     ; Pressure coupling on in NPT
pcoupltype              = isotropic             ; uniform scaling of box vectors
tau_p                   = 2.0                   ; time constant, in ps
ref_p                   = 1.0                   ; reference pressure, in bar
compressibility         = 4.5e-5                ; isothermal compressibility of water, bar^-1
refcoord_scaling        = com

As you can see in this case the pressure coupling is on, meaning that we adjust the size of the box to reach the desired pressure value. In other words, we are going to simulate a barostat.

Different barostat algorithms are available in GROMACS. In this case, we selected via the pcoupl parameter one of the most common, the Parrinello-Rahman. For the equilibration phase, you may also use less sophisticated barostats such as the Berendsen.

We can select the ideal pressure value in bars with ref_p=1.0, and the time constant with tau_p=2.0. Furthermore, the size of the box is going to be scaled uniformly in all three dimensions due to pcoupltype=isotropic. When simulating a protein in membrane environment it is recommendable to use a semiisotropic scaling (isotropic scale in x,y axis and different in z axis)

 

Note

The NPT simulation is usually the second equilibration step. For this reason, we are going to select continuation=yes so that we can continue from the previous simulation in the NVT ensemble.

Moreover, we don’t need to generate new velocities as they have already been assigned in the previous step (gen_vel=no). Velocities are read from the previous file

 

 

Last but not least we have the mdp file for a production run.

This is generally a simulation within the NPT ensemble with no restraints on the protein.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
; Run parameters
integrator              = md        ; leap-frog integrator
nsteps                  = 500000    ; 2 * 500000 = 1000 ps (1 ns)
dt                      = 0.002     ; 2 fs
; Output control
nstxout                 = 0         ; suppress bulky .trr file by specifying 
nstvout                 = 0         ; 0 for output frequency of nstxout,
nstfout                 = 0         ; nstvout, and nstfout
nstenergy               = 5000      ; save energies every 10.0 ps
nstlog                  = 5000      ; update log file every 10.0 ps
nstxout-compressed      = 5000      ; save compressed coordinates every 10.0 ps
compressed-x-grps       = System    ; save the whole system
; Bond parameters
continuation            = yes       ; Restarting after NPT 
constraint_algorithm    = lincs     ; holonomic constraints 
constraints             = h-bonds   ; bonds involving H are constrained
lincs_iter              = 1         ; accuracy of LINCS
lincs_order             = 4         ; also related to accuracy
; Neighborsearching
cutoff-scheme           = Verlet    ; Buffered neighbor searching
ns_type                 = grid      ; search neighboring grid cells
nstlist                 = 10        ; 20 fs, largely irrelevant with Verlet scheme
rcoulomb                = 1.0       ; short-range electrostatic cutoff (in nm)
rvdw                    = 1.0       ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype             = PME       ; Particle Mesh Ewald for long-range electrostatics
pme_order               = 4         ; cubic interpolation
fourierspacing          = 0.16      ; grid spacing for FFT
; Temperature coupling is on
tcoupl                  = V-rescale             ; modified Berendsen thermostat
tc-grps                 = Protein Non-Protein   ; two coupling groups - more accurate
tau_t                   = 0.1     0.1           ; time constant, in ps
ref_t                   = 300     300           ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl                  = Parrinello-Rahman     ; Pressure coupling on in NPT
pcoupltype              = isotropic             ; uniform scaling of box vectors
tau_p                   = 2.0                   ; time constant, in ps
ref_p                   = 1.0                   ; reference pressure, in bar
compressibility         = 4.5e-5                ; isothermal compressibility of water, bar^-1
; Periodic boundary conditions
pbc                     = xyz       ; 3-D PBC
; Dispersion correction
DispCorr                = EnerPres  ; account for cut-off vdW scheme
; Velocity generation
gen_vel                 = no        ; Velocity generation is off 

 

Production runs are usually longer than the previous simulations. For this reason, it is preferable to save the coordinates in a less memory-consuming trajectory file format (xtc instead of trr).

This is specified in the “Output control” section.

1
2
3
4
5
6
7
8
; Output control
nstxout                 = 0         ; suppress bulky .trr file by specifying 
nstvout                 = 0         ; 0 for output frequency of nstxout,
nstfout                 = 0         ; nstvout, and nstfout
nstenergy               = 5000      ; save energies every 10.0 ps
nstlog                  = 5000      ; update log file every 10.0 ps
nstxout-compressed      = 5000      ; save compressed coordinates every 10.0 ps
compressed-x-grps       = System    ; save the whole system

The output parameters that we previously saw (nstxout, nstvout) are all set to zero.

We write the coordinated in the xtc file every 5000 steps (nstxout-compressed=5000). We also need to specify that we want to save the whole system by passing the compressed-x-grps=System parameter. You can also decide to only save the protein (compressed-x-grps=Protein).

Note
We don’t have the define=-DPOSRES line as we don’t need position restraints on the protein anymore.