GSOC2018: Final Report
It’s the final day of GSOC 2018… It has been a great experience - I learned a lot about programming and development, and it also helped me understand some of the tools that I use every day without thinking.
So, here's a summary of what I have done during GSOC.
First evaluation
During the past 3 months, I helped implement on-the-fly transformations in MDAnalysis, as part of the GSOC 2018 program with NumFOCUS. These transformations greatly enhance the functionality of MDAnalysis, and are a great step towards making it a complete package for treatment and analysis of molecular simulation data. A detailed discussion on how this project could unfold can be read in MDAnalysis issue #1215.
In the first month - the time until the first evaluation - I added and modified the functions to add and apply transformations in the ProtoReader class. It was crucial to have a solid foundation for the transformations API, and this step was the cornerstone of the project.
- The code and its evolution over the 4 weeks can be seen in this pull request. In this PR, a very simple translation function was also added, as it helped with testing and would serve as a template for future transformations.
- An issue was opened by my mentor Jonathan Barnoud to add the newly implemented transformations to the MDAnalysis online docs - issue #947. It was fixed by me in this PR.
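The idea behind the API can be sketched roughly as follows. This is only an illustration: MiniTimestep and apply_transformations are hypothetical stand-ins written for this post, not the MDAnalysis classes, and the real reader does more bookkeeping than this.

```python
import numpy as np


class MiniTimestep:
    """Hypothetical stand-in for a trajectory frame: it only holds positions."""

    def __init__(self, positions):
        self.positions = np.asarray(positions, dtype=float)


def translate(vector):
    """Build a transformation that shifts every atom by `vector`."""
    vector = np.asarray(vector, dtype=float)

    def wrapped(ts):
        # A transformation takes a frame, modifies it, and returns it.
        ts.positions = ts.positions + vector
        return ts

    return wrapped


def apply_transformations(ts, transformations):
    """Apply each transformation in order, as a reader would on every frame."""
    for transform in transformations:
        ts = transform(ts)
    return ts


frame = MiniTimestep([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]])
workflow = [translate([1.0, 0.0, 0.0]), translate([0.0, 0.0, -3.0])]
frame = apply_transformations(frame, workflow)
print(frame.positions)
```

Because each transformation is just a function from frame to frame, a workflow is an ordered list, and the order matters. In MDAnalysis itself the workflow is attached to the trajectory reader through its add_transformations method.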
Second evaluation
Coordinate rotation
After making sure the transformations API was solid, it was time to add a library of commonly used transformations. First was a rotation transform - rotateby. This transformation performs a simple rotation of the whole coordinate system using a given directional vector and a given point as the axis of rotation.
- The PR for this function can be checked here.
- Later, I found an error in the rotation transformation calculation when using a custom point and vector. The PR correcting this issue can be seen here.
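The geometry of a rotation about an axis defined by a direction and a point can be sketched with Rodrigues' rotation formula. The function name rotate_about_axis and its signature are made up for this example; this is not the rotateby code itself.

```python
import numpy as np


def rotate_about_axis(positions, angle_deg, direction, point=(0.0, 0.0, 0.0)):
    """Rotate positions by angle_deg around the axis through `point` along `direction`."""
    positions = np.asarray(positions, dtype=float)
    point = np.asarray(point, dtype=float)
    k = np.asarray(direction, dtype=float)
    k = k / np.linalg.norm(k)  # unit vector along the rotation axis
    theta = np.deg2rad(angle_deg)

    # Cross-product matrix of k, so that K @ v == np.cross(k, v).
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    # Rodrigues' formula for the rotation matrix.
    R = np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

    # Move the axis point to the origin, rotate, then move back.
    return (positions - point) @ R.T + point


coords = np.array([[2.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
rotated = rotate_about_axis(coords, 90.0, direction=[0, 0, 1], point=[1.0, 0.0, 0.0])
print(rotated)
```

Shifting to the axis point before rotating and shifting back afterwards is the step that matters when the point is not the origin: any atom sitting on the axis must stay where it is.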
AtomGroup centering
After the rotation, AtomGroup centering was the next challenge. The first centering function to be added was center_in_box. As the name says, this function centers a given AtomGroup in the unit cell by default. A point argument (changed later to center_to) can be given, and the AtomGroup will be centered on that coordinate.
- the PR for the center_in_box transformation can be seen here
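A minimal sketch of the centering step, assuming an orthorhombic box, an unweighted (geometric) center, and that the whole system is translated along with the group. The helper center_group_in_box is hypothetical and only mirrors the behaviour described above.

```python
import numpy as np


def center_group_in_box(positions, group_indices, box_lengths, center_to=None):
    """Translate all atoms so that the chosen group sits at the box center or at center_to."""
    positions = np.asarray(positions, dtype=float)
    box = np.asarray(box_lengths, dtype=float)

    # Default target is the middle of an orthorhombic unit cell.
    target = box / 2.0 if center_to is None else np.asarray(center_to, dtype=float)

    # Geometric center of the selected atoms only.
    group_center = positions[group_indices].mean(axis=0)

    # The same shift is applied to every atom, so relative positions are kept.
    return positions + (target - group_center)


coords = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [5.0, 5.0, 5.0]])
centered = center_group_in_box(coords, [0, 1], box_lengths=[10.0, 10.0, 10.0])
print(centered[[0, 1]].mean(axis=0))
```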
PBC treatment - take 1
Another very important transformation - one that can be considered the most used when working with molecular dynamics simulation trajectories - is unwrapping. This transformation is used to correct artifacts caused by periodic boundary conditions (PBC). When molecules partly cross the boundaries of the unit cell, their bonds are broken (visually only) and, in some molecular visualization packages, they appear stretched across the unit cell. This is corrected with unwrapping functions. The make_whole function of MDAnalysis was capable of making molecules that were broken due to PBC whole again. However, it had a severe limitation: it only supported orthogonal unit cells. Thus, I set out to extend the functionality of make_whole to triclinic unit cells.
- the PR that describes this attempt can be checked here
Despite being very fast, my implementation failed on more complex systems like the C20 fullerene. It later became obsolete when Richard Gowers ported make_whole to Cython and extended its functionality to triclinic unit cells (here is his pull request).
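To illustrate what making a molecule whole means, here is a simplified sketch for an orthorhombic box. It walks the bond graph from a starting atom and places each bonded neighbour at its minimum-image distance. The function make_whole_orthorhombic is written for this post only. It is neither my triclinic attempt nor the MDAnalysis implementation.

```python
from collections import deque

import numpy as np


def make_whole_orthorhombic(positions, bonds, box_lengths, start=0):
    """Return a copy of positions in which bonded atoms are no longer split by PBC."""
    positions = np.array(positions, dtype=float)  # work on a copy
    box = np.asarray(box_lengths, dtype=float)

    # Build an adjacency list from the list of (i, j) bonds.
    neighbours = {i: [] for i in range(len(positions))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)

    visited = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for other in neighbours[current]:
            if other in visited:
                continue
            # Minimum-image bond vector: remove whole box lengths from it.
            delta = positions[other] - positions[current]
            delta -= box * np.round(delta / box)
            positions[other] = positions[current] + delta
            visited.add(other)
            queue.append(other)
    return positions


# A three-atom chain whose first bond crosses the box edge along x.
broken = [[9.0, 5.0, 5.0], [0.5, 5.0, 5.0], [2.0, 5.0, 5.0]]
whole = make_whole_orthorhombic(broken, bonds=[(0, 1), (1, 2)], box_lengths=[10.0, 10.0, 10.0])
print(whole)
```

The rounding trick only works when the box vectors are aligned with the coordinate axes. That is the same limitation described above, and the reason triclinic cells need more care.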
Final Evaluation
More centering functions
After adding the first centering function - center_in_box - two other useful centering functions were implemented: center_in_axis and center_in_plane. center_in_axis centers a given AtomGroup on a given axis x, y or z. The center_in_plane transformation centers the weighted center of the AtomGroup in the unit cell, but only within a given plane. These transformations are useful when working with membrane systems.
- the PR that describes both of these transformations can be accessed here. In this PR some changes are made to the argument variable names and documentation of center_in_box.
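As a rough sketch of the in-plane centering idea, the example below moves the mass-weighted center to the middle of an orthorhombic box along two axes and leaves the third untouched. The plane names "xy", "xz" and "yz", the function name and its arguments are assumptions made for illustration. They are not the signature used in the PR.

```python
import numpy as np

# Assumed mapping from plane name to the two coordinate axes it spans.
PLANE_AXES = {"xy": [0, 1], "xz": [0, 2], "yz": [1, 2]}


def center_in_plane_sketch(positions, masses, box_lengths, plane):
    """Center the mass-weighted center of the atoms in the box, within `plane` only."""
    positions = np.asarray(positions, dtype=float)
    box = np.asarray(box_lengths, dtype=float)
    axes = PLANE_AXES[plane]

    # Mass-weighted center of the group.
    weighted_center = np.average(positions, axis=0, weights=masses)

    # Only the in-plane components get shifted; the third stays at zero.
    shift = np.zeros(3)
    shift[axes] = box[axes] / 2.0 - weighted_center[axes]
    return positions + shift


coords = np.array([[1.0, 1.0, 1.0], [3.0, 1.0, 1.0]])
moved = center_in_plane_sketch(coords, masses=[1.0, 3.0], box_lengths=[10.0, 10.0, 10.0], plane="xy")
print(moved)
```

Leaving the out-of-plane coordinate alone is what makes this kind of centering convenient for membranes, where only the lateral drift is usually unwanted.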
Fitting transformations
With the centering transformations now done, it was time to focus on molecule fitting. Transformations that perform the fitting of a given molecule to a reference structure are useful when analyzing and visualizing conformational changes over time. I added two new transformation functions: fit_translation and fit_rot_trans. The first one removes the translations of a given AtomGroup, either in all directions or in a given plane (XY, XZ or YZ), by aligning its weighted center to the center of a given reference structure. The latter removes both rotation and translation by performing a least squares alignment to the reference. This function can also take a plane as an argument.
- the PR for the fitting transformations can be checked here
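The least squares alignment can be illustrated with the Kabsch algorithm, which finds the optimal rotation from a singular value decomposition. The sketch below uses unweighted centers and no plane restriction, and fit_rot_trans_sketch is a name invented for this example. It shows the technique in general, not the code in the PR.

```python
import numpy as np


def fit_rot_trans_sketch(mobile, reference):
    """Superimpose `mobile` onto `reference` by a least squares rotation and translation."""
    mobile = np.asarray(mobile, dtype=float)
    reference = np.asarray(reference, dtype=float)

    # Remove the translation by centering both structures on the origin.
    mobile_center = mobile.mean(axis=0)
    reference_center = reference.mean(axis=0)
    P = mobile - mobile_center
    Q = reference - reference_center

    # Kabsch: SVD of the covariance matrix gives the optimal rotation.
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (a reflection).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate the centered coordinates, then move them onto the reference center.
    return P @ R.T + reference_center


reference = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
theta = np.deg2rad(40.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta), np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
# Same structure, rotated about z and displaced.
mobile = reference @ Rz.T + np.array([5.0, -2.0, 1.0])
fitted = fit_rot_trans_sketch(mobile, reference)
print(np.abs(fitted - reference).max())
```

Stopping after the centering step, without the rotation, corresponds to the translation-only fit described above.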
PBC treatment - take 2
The last transformations to be added were the wrap and unwrap functions. With the PR for the porting of make_whole to Cython now merged, I added the transformations that dealt with PBC. The wrap function translates all the atoms to the interior of the unit cell - by default it does the reverse of make_whole. unwrap simply uses the make_whole function, but in the context of the on-the-fly transformations API.
- Here is the PR for these two transformations
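For an orthorhombic box, moving every atom back inside the unit cell comes down to removing whole box lengths from each coordinate. The helper wrap_orthorhombic below is a hypothetical illustration of that step and assumes the box origin is at (0, 0, 0).

```python
import numpy as np


def wrap_orthorhombic(positions, box_lengths):
    """Map every atom into the range [0, box length) along each axis."""
    positions = np.asarray(positions, dtype=float)
    box = np.asarray(box_lengths, dtype=float)
    # floor() counts how many whole boxes each coordinate is away from the cell.
    return positions - box * np.floor(positions / box)


outside = np.array([[-1.0, 5.0, 12.0], [3.0, 10.0, 0.0]])
inside = wrap_orthorhombic(outside, [10.0, 10.0, 10.0])
print(inside)
```

Since this treats each atom independently, a molecule sitting on the box edge is split by it. Wrapping therefore undoes what making a molecule whole achieves.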
Demos and blog posts
With some of the transformations now merged into the develop branch of MDAnalysis, and others being reviewed, I was asked to write a post for the MDAnalysis blog and to make a Jupyter notebook showing what someone described as the “pure magic” of the on-the-fly transformations.
- the PR for the blog post can be found [here](https://github.com/MDAnalysis/MDAnalysis.github.io/pull/87)
- the Jupyter notebook demonstrating how transformations work can be accessed in this PR
Closing words
The past 3 months have been a very rewarding experience. I have learned a lot about programming, GitHub and software development on bigger projects. And, more importantly, I’m proud of my contribution to MDAnalysis.
My mentors - Jonathan Barnoud and Richard Gowers - were very helpful in the development of this project. I’m also thankful to Max Linke and Oliver Beckstein for their reviews and comments, and to Manuel Melo for motivating me to apply to GSOC2018 and helping me with the project and coding.