The web server I worked on can be found here: http://membranome.org/proteins.php. When I began development, the server already had a functional database and a site for browsing approximately 6,000 unique proteins. The proteins in the database were all called single dimers. These dimers contain only a single helix, not two. All the models looked something like what is presented below.
At the time I began work, the Membranome web server didn’t contain any embedded visualization of the protein models. In order to embed visualizations, I first needed to investigate the existing PHP web architecture, which required intensive exploration of the site.
Each protein model was contained in what is called a Protein Data Bank (PDB) file (*.pdb). This is a standard format for storing three-dimensional molecular structures. The documentation and style guides for reading and writing PDB files can be found here. Each PDB file contains many lines, and the first word of each line indicates what type of record the line represents. In these files the first record is MODEL, which allows the creator of the PDB file to save multiple models in the same .pdb file. Next come the REMARK lines, which the creator can use to write any sort of comments or values they wish. The REMARK lines do not affect the visualization, but can be important to parse and present to a viewer in a table below it. Each HELIX record describes one helix, so the HELIX records indicate how many helix ribbons the model will contain. The goal of my work was to parse these files correctly and support any number of HELIX records. Finally, the bulk of the PDB file is the ATOM and HETATM lines. These lines give the 3D coordinates (x, y, z) of each atom so that the reader can present the model in 3D space. They also identify the atom and the residue it belongs to. An example of the PDB file is shown below.
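To make the record layout concrete, here is a minimal sketch of reading individual PDB lines in JavaScript. It is not the code running on the Membranome site. The column ranges follow the fixed-width layout of the published PDB format, while the function names (`recordName`, `parseAtom`, `parseHelix`, `parseLine`) and the two sample lines are made up for illustration.

```javascript
// Sketch only: helper names and sample lines are illustrative assumptions.
// PDB records are fixed-width, so fields are read by column, not by splitting on spaces.

function recordName(line) {
  // Columns 1-6 hold the record name (MODEL, REMARK, HELIX, ATOM, HETATM, ...).
  return line.slice(0, 6).trim();
}

function parseAtom(line) {
  return {
    hetero: recordName(line) === 'HETATM',
    serial: parseInt(line.slice(6, 11), 10),   // columns 7-11
    name: line.slice(12, 16).trim(),           // columns 13-16, atom name
    resName: line.slice(17, 20).trim(),        // columns 18-20, residue name
    chain: line.slice(21, 22).trim(),          // column 22
    resSeq: parseInt(line.slice(22, 26), 10),  // columns 23-26
    x: parseFloat(line.slice(30, 38)),         // columns 31-38
    y: parseFloat(line.slice(38, 46)),         // columns 39-46
    z: parseFloat(line.slice(46, 54)),         // columns 47-54
  };
}

function parseHelix(line) {
  return {
    serial: parseInt(line.slice(7, 10), 10),   // columns 8-10
    id: line.slice(11, 14).trim(),             // columns 12-14
    chain: line.slice(19, 20).trim(),          // column 20
    startRes: parseInt(line.slice(21, 25), 10),// columns 22-25
    endRes: parseInt(line.slice(33, 37), 10),  // columns 34-37
  };
}

function parseLine(line) {
  // Dispatch on the record name; records this sketch does not handle return null.
  switch (recordName(line)) {
    case 'ATOM':
    case 'HETATM':
      return { type: 'atom', ...parseAtom(line) };
    case 'HELIX':
      return { type: 'helix', ...parseHelix(line) };
    default:
      return null;
  }
}

// Made-up sample lines, padded to the fixed columns.
const atomLine = 'ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00';
const helixLine = 'HELIX    1   1 ALA A    1  LEU A   25  1';

const atom = parseLine(atomLine);
const helix = parseLine(helixLine);
```

Because every HELIX line becomes its own object, a file with one helix or with many is handled by the same loop over lines.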
Some complicated data parsing was required to get the final visualization working properly. These PDB files contain several models: some contain 3 models, others contain 11. Each model listed in the PDB file has its own REMARK lines with different values. I needed to parse out this information and present it to the end user reviewing the models. The end result is a table that lets the user click through the models and update the visualization in the Three.js canvas.
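The grouping step described above can be sketched roughly as follows. This is an illustration rather than the site's actual parser: it assumes each model's REMARK lines sit between its MODEL and ENDMDL records, it keeps the remark text raw because the exact layout of the Membranome REMARK values is not shown here, and `splitModels` and the sample text are hypothetical.

```javascript
// Sketch only: splitModels and the sample file below are illustrative assumptions.
function splitModels(pdbText) {
  const headerRemarks = []; // REMARK lines that appear outside any model
  const models = [];
  let current = null;

  for (const line of pdbText.split(/\r?\n/)) {
    const record = line.slice(0, 6).trim();

    if (record === 'MODEL') {
      // Start a new model; the serial number follows the record name.
      current = { serial: parseInt(line.slice(6).trim(), 10), remarks: [], atomLines: [] };
      models.push(current);
    } else if (record === 'ENDMDL') {
      current = null;
    } else if (record === 'REMARK') {
      (current ? current.remarks : headerRemarks).push(line.slice(6).trim());
    } else if (record === 'ATOM' || record === 'HETATM') {
      if (!current) {
        // A file without MODEL records is treated as a single implicit model.
        current = { serial: 1, remarks: [], atomLines: [] };
        models.push(current);
      }
      current.atomLines.push(line);
    }
  }
  return { headerRemarks, models };
}

// Made-up two-model file; the REMARK text is placeholder content.
const sampleText = [
  'REMARK file-level note',
  'MODEL        1',
  'REMARK placeholder value for model one',
  'ATOM      1 (rest of line omitted)',
  'ATOM      2 (rest of line omitted)',
  'ENDMDL',
  'MODEL        2',
  'REMARK placeholder value for model two',
  'HETATM    3 (rest of line omitted)',
  'ENDMDL',
].join('\n');

const parsed = splitModels(sampleText);
```

Keeping the atom lines per model means the viewer only has to rebuild the scene from one model's lines when the user selects a different row.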
I created the table seen below the visualizer to allow the user to dynamically click through the different models contained in the PDB file. You can also see that I parsed out all the data from the REMARK lines and presented it clearly in the rows. Because a protein PDB file can contain any number of models, this table is created dynamically using jQuery.
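A dynamic table of this kind could be built along the following lines. This is a sketch under assumptions, not the production code: the `#model-table` id, the `data-model-index` attribute, the shape of the `models` array, and the `showModel` callback are all hypothetical names. The jQuery calls used (`.html()`, `.on()` with a delegated selector, `.attr()`) are standard.

```javascript
// Sketch only: element ids, attribute names, and showModel are illustrative assumptions.

function escapeHtml(text) {
  // REMARK text comes from a file, so escape it before inserting it as HTML.
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

function modelRowsHtml(models) {
  // One row per model, however many models the file contains.
  return models.map((model, index) =>
    '<tr data-model-index="' + index + '">' +
    '<td>' + escapeHtml(model.serial) + '</td>' +
    '<td>' + escapeHtml(model.remarks.join('; ')) + '</td>' +
    '</tr>'
  ).join('');
}

function attachModelTable($, models, showModel) {
  // Browser only: fill the table body, then use one delegated click handler
  // so rows added later still respond to clicks.
  $('#model-table tbody').html(modelRowsHtml(models));
  $('#model-table').on('click', 'tr[data-model-index]', function () {
    const index = Number($(this).attr('data-model-index'));
    showModel(models[index]); // e.g. redraw the Three.js scene for this model
  });
}

const exampleRows = modelRowsHtml([
  { serial: 1, remarks: ['first value', 'second value'] },
  { serial: 2, remarks: ['a<b'] },
]);
```

In the page itself this would be called once after parsing, for example `attachModelTable(jQuery, parsed.models, showModel)`, where `parsed` and `showModel` stand in for whatever the parser and viewer actually expose.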
I am still working for Andrei Lomize on improving his laboratory’s web presence. I am currently implementing a queuing system for his FORTRAN application. In order to generate these advanced PDB files, Andrei has a FORTRAN program that estimates the organization of the molecules. This program is open to the public and can be run here.
If you were to run it, it would take approximately 15 minutes to complete. The output is a PDB file like the ones shown above. That is too long to ask visitors to keep the page open while they wait. My current task is to create a queuing system that generates temporary links, which visitors can use to retrieve their results after the application has finished running.
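Since this part is still in progress, the following is only a rough sketch of the idea, not the design that will ship. It keeps jobs in memory, hands back a token that a temporary link could carry, and expires finished results after a time limit. The `JobQueue` class, its method names, and the token scheme are all assumptions, and the worker here is a stand-in function rather than the real FORTRAN program, which would run as a separate long-running process on the server.

```javascript
// Sketch only: JobQueue and its methods are illustrative assumptions.
class JobQueue {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;   // how long a finished result stays retrievable
    this.now = now;       // injectable clock, which makes expiry easy to test
    this.jobs = new Map();
    this.pending = [];
    this.counter = 0;
  }

  submit(input) {
    // The token would be embedded in the temporary link given to the visitor.
    // Math.random is NOT suitable for real, unguessable links; it only keeps the sketch short.
    this.counter += 1;
    const token = this.counter.toString(36) + '-' + Math.random().toString(36).slice(2, 10);
    this.jobs.set(token, { status: 'queued', input, result: null, expiresAt: null });
    this.pending.push(token);
    return token;
  }

  runNext(worker) {
    // Take the oldest queued job (first in, first out) and run it.
    const token = this.pending.shift();
    if (token === undefined) return null;
    const job = this.jobs.get(token);
    job.result = worker(job.input);
    job.status = 'done';
    job.expiresAt = this.now() + this.ttlMs;
    return token;
  }

  lookup(token) {
    const job = this.jobs.get(token);
    if (!job) return { status: 'unknown' };
    if (job.status === 'done' && this.now() >= job.expiresAt) {
      this.jobs.delete(token); // the temporary link has expired
      return { status: 'expired' };
    }
    return { status: job.status, result: job.result };
  }
}
```

A visitor would submit a job, leave the page, and later open the link; the page behind that link calls something like `lookup(token)` and either shows the PDB output or reports that the job is still queued or has expired.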