Align structures in PyMOL

 

If you are a beginner in computational chemistry, learning to align molecular structures is a key skill for comparative protein analysis. There are several ways to align two molecular structures; here I will show you how to do it from the PyMOL command line.

Indeed, PyMOL offers powerful alignment capabilities that allow you to compare and contrast different structures, identify structural similarities and differences, and gain insights into protein function.

 

 Two ligands docked in the same protein and aligned with PyMOL

 

Example
Aligning structures can be useful when you want to compare two ligands after a docking experiment.

 

Description
The align command superimposes two molecules, or parts of them.

 

The way this command works is quite intuitive.

Let’s say that you have two structures (object_1, object_2) and you want to align them. You only need to type the following in the PyMOL console.

align object_1, object_2

 

Note that the first object is moved onto the second one: the first argument is the “mobile” structure, while the second is the reference, which stays in place.

align mobile, reference

 

PyMOL uses a two-step approach for aligning structures: first, it performs a sequence alignment, and then it minimizes the Root Mean Square Deviation (RMSD) between the aligned atoms, rejecting outlier pairs over several refinement cycles. As a result, the structures are superimposed in the display and output like the following is printed to the console.

PyMOL>align object_1, object_2  
 Match: read scoring matrix.  
 Match: assigning 451 x 1631 pairwise scores.  
 MatchAlign: aligning residues (451 vs 1631)...  
 MatchAlign: score 1275.000  
 ExecutiveAlign: 2246 atoms aligned.  
 ExecutiveRMS: 89 atoms rejected during cycle 1 (RMSD=14.98).  
 ExecutiveRMS: 153 atoms rejected during cycle 2 (RMSD=10.83).  
 ExecutiveRMS: 65 atoms rejected during cycle 3 (RMSD=5.87).  
 ExecutiveRMS: 113 atoms rejected during cycle 4 (RMSD=2.96).  
 ExecutiveRMS: 78 atoms rejected during cycle 5 (RMSD=2.13).  
 Executive: RMSD =    1.885 (1714 to 1714 atoms)
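
The rejection cycles in the log can be pictured with a short Python sketch. It is only an illustration of the idea, not PyMOL's actual implementation: the function names are hypothetical, the coordinates are assumed to be already superposed, and the re-fitting that a real refinement performs after each cycle is left out. The default values mirror the cutoff (2.0) and cycles (5) arguments of align.

```python
import math


def rmsd(pairs):
    """Root mean square deviation over (mobile_xyz, reference_xyz) pairs."""
    if not pairs:
        raise ValueError("at least one atom pair is required")
    total = sum(math.dist(a, b) ** 2 for a, b in pairs)
    return math.sqrt(total / len(pairs))


def reject_outliers(pairs, cutoff=2.0, cycles=5):
    """Drop pairs farther apart than cutoff * RMSD, for at most `cycles` rounds.

    Simplification: a real refinement would re-fit the remaining atoms
    after every round; here the coordinates are left untouched.
    """
    for _ in range(cycles):
        current = rmsd(pairs)
        kept = [(a, b) for a, b in pairs if math.dist(a, b) <= cutoff * current]
        if len(kept) == len(pairs):
            break  # nothing was rejected, so the refinement has converged
        pairs = kept
    return pairs, rmsd(pairs)
```

Each round removes the atom pairs that deviate most, which is why the RMSD reported in the log drops from cycle to cycle while the number of aligned atoms shrinks.
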
Note
The molecules you want to align need to be in two different objects; otherwise, PyMOL will give you an error. To avoid this, you can create temporary objects from the molecules you want to align.
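
As a minimal sketch of the temporary-object idea, the helper below uses PyMOL's Python API (cmd.create and cmd.align). The function name and the object names tmp_mobile and tmp_reference are my own placeholders, and the cmd object is passed in as a parameter so the snippet does not depend on how PyMOL was started.

```python
def align_within_object(cmd, obj, mobile_sel, reference_sel):
    """Copy two parts of one object into temporary objects and align them.

    `cmd` is PyMOL's command module (from pymol import cmd).
    The temporary object names are arbitrary placeholders.
    """
    cmd.create("tmp_mobile", f"{obj} and ({mobile_sel})")
    cmd.create("tmp_reference", f"{obj} and ({reference_sel})")
    result = cmd.align("tmp_mobile", "tmp_reference")
    return result[0]  # first element: RMSD after refinement
```

Inside PyMOL you could call it as align_within_object(cmd, "complex", "chain A", "chain B"), where "complex" stands for whatever object holds both molecules. The temporary objects are left in the session so you can inspect the result and delete them afterwards.
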

 

 

By using PyMOL’s alignment capabilities, you can perform various alignments to gain insights into the structural characteristics of biomolecules and their interactions. Now let’s look at some use cases you may encounter during your research. Most of them require some familiarity with the PyMOL selection tool, so I suggest you read this article if you want to get an idea of how it works.

 

 

Let’s say you have two protein structures in the PDB format, protein_A.pdb and protein_B.pdb, and you want to align them to analyze their structural similarities and differences. You can follow these three steps:

  1. Load the protein structures into PyMOL: You can use the load command in PyMOL to load the protein structures into separate objects. For example:
load protein_A.pdb, mobile 
load protein_B.pdb, reference

 

This will load the protein structures from the PDB files protein_A.pdb and protein_B.pdb into two separate objects named mobile and reference in the PyMOL GUI.

  2. Now you can customize the color and appearance of the proteins as you wish. For instance:
hide all
show sticks, mobile
show sticks, reference

 

  3. Proceed to align them as previously shown using the align command:
align mobile, reference
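
If you prefer to script these steps, the same workflow can be sketched with PyMOL's Python API. This is an illustration under assumptions: load_and_align and the AlignResult field names are my own labels (cmd.align returns a plain tuple of seven values), and cmd is passed in as a parameter.

```python
from collections import namedtuple

# Field names are illustrative labels for the seven values cmd.align returns:
# RMSD and atom count after refinement, number of refinement cycles,
# RMSD and atom count before refinement, alignment score, residues aligned.
AlignResult = namedtuple(
    "AlignResult",
    "rmsd atoms cycles rmsd_before atoms_before score residues",
)


def load_and_align(cmd, mobile_path, reference_path):
    """Load two structures, show them as sticks, and align mobile onto reference."""
    cmd.load(mobile_path, "mobile")
    cmd.load(reference_path, "reference")
    cmd.hide("everything")
    cmd.show("sticks", "mobile or reference")
    return AlignResult(*cmd.align("mobile", "reference"))
```

Inside PyMOL, from pymol import cmd provides the cmd object, so result = load_and_align(cmd, "protein_A.pdb", "protein_B.pdb") runs the three steps and result.rmsd holds the final RMSD.
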

 

You can also use the fetch command to download the structures directly from the Protein Data Bank.

# Load structures
fetch 7lcj, mobile
fetch 7lck, reference

# Modify appearance
hide all
show cartoon, mobile
show cartoon, reference

# Align
align mobile, reference

 

 Two proteins aligned using PyMOL

 

 

Sometimes you may notice that the standard alignment is not great. If that is the case, you can improve it by restricting the alignment to a subset of residues or atoms.

Let’s take as an example a situation where you have two proteins and you want to align one protein to a specific chain or to certain residues of the other protein. You can use PyMOL’s align command along with the select command to specify the chains or residues of interest. Here’s an example:

  1. Load the two proteins
load protein_A.pdb, object_1
load protein_B.pdb, object_2

 

  2. Select the chain or residues of interest: You can use the select command to create a named selection for the chain or residues that you want to align.
# Select chain A of object_1 (first protein loaded)
select chain_A, object_1 and chain A

# Select a range of residues (100-150) from object_2
select region, object_2 and resi 100-150

 

The first command selects chain A of object_1; the second selects residues 100 to 150 of object_2.

 

  3. Align the selected chain or residues: You can use the align command to align the other protein structure to the selected chain or residues.
# Align object_1 to a range of residues in object_2
align object_1, region

# Align object_2 to chain A of object_1
align object_2, chain_A
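
Named selections are optional here, because align also accepts a selection expression directly. When scripting, a small helper can compose such expressions; the sketch below is plain Python, and build_selection is a hypothetical name of my own, not a PyMOL function.

```python
def build_selection(obj, chain=None, resi=None, name=None):
    """Compose a PyMOL selection expression from optional filters.

    resi is a (start, end) tuple of residue numbers; name is an atom name.
    """
    parts = [obj]
    if chain is not None:
        parts.append(f"chain {chain}")
    if resi is not None:
        start, end = resi
        parts.append(f"resi {start}-{end}")
    if name is not None:
        parts.append(f"name {name}")
    return " and ".join(parts)
```

For example, build_selection("object_2", resi=(100, 150)) produces the string "object_2 and resi 100-150", which can be passed as the second argument of cmd.align in a PyMOL Python script.
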

 

 

Similarly, we can align two proteins based on their backbones by first selecting the alpha carbons of each protein with the select command.

# Load structures
load protein1.pdb, object1 
load protein2.pdb, object2

# Select the alpha carbons as named selections ca_protein1 and ca_protein2
select ca_protein1, object1 and name CA
select ca_protein2, object2 and name CA

 

Align the two alpha-carbon selections:

align ca_protein1, ca_protein2

 

 

Let’s say that you run an MD simulation using GROMACS or any other software and you want to observe how the position of the ligand changes over the course of the simulation.

One option is to load both the initial and final frames of the simulation (you can extract them with gmx trjconv) and align the two ligands, selecting them by residue name (resn) with the select command in PyMOL.

# Load the initial and final frames
load protein1.pdb, object1
load protein2.pdb, object2

# Select the ligands (residue name "LIG") as named selections ligand1 and ligand2
select ligand1, object1 and resn LIG
select ligand2, object2 and resn LIG

 

Align the two ligand selections:

align ligand1, ligand2

This aligns the ligand atoms selected by the residue name “LIG”, minimizing the RMSD between them. Because align moves the whole object that contains the mobile selection, the complete first frame is transformed along with its ligand. After aligning the ligands, you can visually compare the superimposed structures in PyMOL to see whether something changed.
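
If you also want a number for how far the ligand moved, one possible variation (not the ligand-on-ligand alignment above) is to superimpose only the protein atoms and then measure the ligand RMSD in place. The sketch below assumes that both frames come from the same topology, so the ligand atoms carry identical identifiers, and that the ligand residue is named LIG; ligand_displacement is a hypothetical helper name.

```python
def ligand_displacement(cmd, start="object1", end="object2", resn="LIG"):
    """Fit the proteins of two frames, then return the in-place ligand RMSD.

    `cmd` is PyMOL's command module (from pymol import cmd).
    """
    # Superimpose the polymer atoms only, so the ligand does not bias the fit.
    cmd.align(f"{start} and polymer", f"{end} and polymer")
    # rms_cur reports the RMSD of the current coordinates without any fitting.
    return cmd.rms_cur(f"{start} and resn {resn}", f"{end} and resn {resn}")
```

A value close to zero would suggest the ligand kept its pose relative to the protein, while a larger value points to a shift worth inspecting visually.
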

 

 

PyMOL also allows you to align multiple structures. To do that, you have to load all the structures and then perform consecutive alignments against the same reference. It is a little cumbersome, but it gets the job done.

# Load PDB files 
load protein1.pdb, reference 
load protein2.pdb, object2 
load protein3.pdb, object3 

# Align object2 and object3 to the reference object
align object2, reference 
align object3, reference

In this example, we have loaded three PDB files (protein1.pdb, protein2.pdb, and protein3.pdb) into three separate objects (reference, object2, and object3).

We then used the align command to align object2 and object3 to the same reference.
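
With more than a handful of structures, typing one align command per object gets tedious, so a loop in a PyMOL Python script may help. The following is a minimal sketch: align_all is a hypothetical helper name, cmd is PyMOL's command module passed in as a parameter, and the objects are assumed to be loaded already.

```python
def align_all(cmd, mobiles, reference):
    """Align every object in `mobiles` onto `reference`.

    Returns a dict mapping each object name to its RMSD after refinement
    (the first element of the tuple returned by cmd.align).
    """
    rmsds = {}
    for name in mobiles:
        rmsds[name] = cmd.align(name, reference)[0]
    return rmsds
```

For the example above, align_all(cmd, ["object2", "object3"], "reference") performs both alignments and returns the two RMSD values, which makes it easy to spot a structure that fits the reference poorly.
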