The Protein Structure Database (PSdb)

David W. Deerfield II, Joe Geigel
Pittsburgh Supercomputing Center, 4400 Fifth Ave, Pittsburgh, PA 15213, USA


Contents:

  • Abstract
  • Introduction
  • Previous Work
  • Data Components of the PSdb
  • The PSdb Viewer
  • Availability
  • Future Work
  • Acknowledgements
  • References

    Abstract

    The classification of protein structural information, especially using the overall structure of the protein (the fold) as the basis for the classification [1], is an exciting area of research in structural biology. We were interested in the complementary question: could we develop a classification scheme based on the primary structure (residue based) that would allow us to understand, at the molecular level, the intricacies of protein structure?

    In this paper we present the Protein Structure Database (PSdb), a new protein database that relates secondary (e.g., helix, sheet, turn, random coil), supersecondary (e.g., helix-helix interactions), and tertiary information (e.g., solvent accessibility, internal relative distances, and ligand interactions) to the primary structure. The data for each protein are supplied on a residue by residue basis and encoded in a series of flat ASCII files.

    Relationships between the various levels of structure (primary, secondary, tertiary) can be investigated visually using PSdbView, a graphical tool provided to view the information within the PSdb. This tool allows for side by side comparison of residue based data and includes a variety of standard mechanisms for visualizing protein data including Ramachandran plots, C(alpha)-C(alpha) distance plots, and differences in solvent accessible molecular surface area graphs (e.g., differences in the exposed surface with and without including either the ligands, metal ions or buried waters in the computations). PSdbView is written in Java, thus providing a platform independent means for exploring PSdb entries over the Internet.

    Introduction

    Two of the most important and difficult problems in molecular biology, the protein folding problem and the structure-function relationship in proteins, are directly concerned with relating the three-dimensional structure of a protein to the primary sequence. The richest source of information about protein structure is the Protein Data Bank (PDB) maintained by the Brookhaven National Laboratory [2]. The PDB is a collection of individual "flat" text files, each of which contains the three-dimensional coordinates of one of the several thousand macromolecular structures determined by various experimental techniques (e.g. X-ray crystallography or NMR). While these files hold massive amounts of crucial information, that information is difficult to access and use in a systematic fashion.

    There are many efforts that aim to unlock the information contained within the PDB [3 - 21], but there is no one database currently available that provides the flexibility to allow for a variety of questions to be asked. In practice, this has resulted in each investigator's developing specific single-use programs to determine the desired information. These programs are often problem-specific, hastily written and not easily adapted to related problems. This lack of readily available query tools places profound limitations on current research. Many researchers could profit from using structural information to guide and interpret their results, but few have either the skills or time to design and implement database programs.

    In this paper, we describe the Protein Structure Database (PSdb), a database which extends the information available in current databases and allows researchers to quickly explore relationships between the primary, secondary, supersecondary, tertiary, and quaternary structures. The extensions to the currently available databases include a larger number of identified secondary structure elements, the ability to use a supersecondary structure for searches, and a greater number of environmental descriptors for each residue.

    Previous Work

    There have been a limited number of tools available to researchers for the examination and correlation of the structural information within the PDB. The NRL_3D database [22 - 24] was created to address whether the structure of a specific sequence had been reported. It is a collection of the primary sequences, annotations and author assigned secondary structure of the crystal structures reported in the PDB with a resolution of less than 3 Angstroms. In a series of companion files, the authors, keywords, species from which the sample was taken, and other information are maintained. Chris Sander and his co-workers at EMBL [3] created one of the first structural databases based upon the secondary structure assignments made by the program DSSP. This database consists of a series of flat files with primary sequence and secondary structure information. "Molecular Structure in Biology", a commercial program available from Oxford University Press, contains at least three different images of each of the PDB entries along with the capability to rotate wireframe images of the protein. The researcher can browse through the information contained within the header of the PDB file and perform searches through this information.

    Several groups have incorporated protein structure information into relational databases. PKB [25] is a user interface combining a relational database and a series of user accessible macros. The information contained in this database is taken directly from the header information of each PDB file (e.g., secondary structure elements (helix, sheet or turn), resolution, R-value, refinement technique, and cell description). Macros have been added to the system to allow the user to perform relatively complex mathematical modeling and threading [25]. SESAM, developed by Wodak et al. [26 - 35] at the Universite Libre de Bruxelles, Belgium, follows similar ideas. The core of the database is a series of structural determinants and solvent accessibility computations by DSSP [3]. Isis (Integrated Sequence/Integrated Structures) is a commercial protein sequence/structural database developed by the Protein Engineering Club Database Group [36, 37]. Primary sequences (e.g., NBRF-PIR and SWISS-PROT) are contained in the OWL sequence database and the structural information is contained within the BIPED relational database. The structural information includes structural domains (sheet, helix, and turn, presumably taken from the file header), torsional angles, solvent accessibility and hydrogen bonds. The database is provided as either flat files or Oracle tables.

    Data Components of the PSdb

    PSdb entries are derived from coordinate data found in the PDB. The information in each entry is compiled and stored on a residue by residue basis in a series of flat ASCII files. The data for each residue include:
  • Secondary Structure and Phi,Psi Classifications.
  • Solvent Accessible Molecular Surface Area.
  • Environmental Classification (both Eisenberg and ours).
  • Helical axis and sheet information
  • C(alpha)-C(alpha) distances
  • C(alpha) Chirality, Bond Length, Bond Angle and Dihedral Angles
  • Ligands, Metal Ion and structural water contacts
  • Hydrogen bonds and Disulfide Partners.

    We developed software for determining the various values reported in the PSdb. In the following sections, we briefly describe the algorithms used in these computations. In the case of multiple conformations, we always used the first conformation reported in the PDB file.

    Secondary Structure Classification

    Many existing databases base their secondary structure classification on the author assignments found in PDB files. The problem with relying on these definitions is that the assignment of the secondary structure is highly idiosyncratic. Helix, sheet, and turn structures are determined using the investigator's personal criteria. While this may cause few problems in the central regions of helix and sheet structures, it can lead to very different classifications at the ends of these structures [4]. Unfortunately, the ends of these secondary structural features may be critical to our understanding of protein structure [4] and function [38].

    To address this problem, a new classification has been developed that provides a single, consistent definition for secondary structure. The classification is derived using an algorithm developed by Deerfield [39]. The algorithm initially performs a classification of the secondary structure for each residue based upon the Phi, Psi and Omega angles. This is followed by a more stringent examination of secondary structure for the residues (e.g., whether there are enough consecutive residues to define an element or whether the necessary hydrogen bonds are present). The variety of secondary structural elements identified in the current implementation of this algorithm is greater than in other databases (e.g., C7(eq) and all beta turns are identified). The amount of helical structure identified by this methodology is slightly greater than that from DSSP, but similar to the percent helix determined by CD spectroscopy.
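
    The two-pass idea described above can be illustrated with a minimal sketch. It is not the algorithm of [39], which is unpublished: the Phi, Psi windows, the minimum run lengths, and the function names below are all assumptions chosen for illustration, and the Omega angle and hydrogen bond checks are omitted.

```python
from itertools import groupby

# Hypothetical Phi/Psi windows in degrees; illustrative only, not the PSdb boundaries.
REGIONS = {
    "H": ((-100.0, -30.0), (-80.0, -10.0)),  # helix-like region
    "E": ((-180.0, -45.0), (90.0, 180.0)),   # extended (sheet-like) region
}
# Assumed minimum number of consecutive residues needed to call an element.
MIN_RUN = {"H": 4, "E": 2}


def classify_residue(phi, psi):
    """First pass: label a single residue from its backbone angles alone."""
    for label, ((phi_lo, phi_hi), (psi_lo, psi_hi)) in REGIONS.items():
        if phi_lo <= phi <= phi_hi and psi_lo <= psi <= psi_hi:
            return label
    return "C"  # everything else is treated as coil


def assign_secondary_structure(angles):
    """Second pass: keep a label only where enough consecutive residues share it."""
    first_pass = [classify_residue(phi, psi) for phi, psi in angles]
    result = []
    for label, run in groupby(first_pass):
        length = len(list(run))
        if label != "C" and length < MIN_RUN[label]:
            label = "C"  # run too short to define an element
        result.extend(label * length)
    return "".join(result)


if __name__ == "__main__":
    demo = [(-60, -45)] * 5 + [(-120, 130), (60, 60)] + [(-60, -45)] * 2
    print(assign_secondary_structure(demo))
```

    With these assumed thresholds the demonstration input yields HHHHHCCCC: the isolated extended residue and the two-residue helical run are both demoted to coil by the second pass.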

    The PSdb also records the author-assigned secondary structure as reported in the PDB file and the Phi, Psi definitions as classified by Garnier [40], Scheraga [41 - 43], and Thornton [44]. The Phi, Psi definitions are provided for completeness and comparison. In addition, these definitions are available for use as the background for PSdbView Ramachandran plots (see below).

    Solvent Accessible Molecular Surface Area and Environment Computations

    We used Connolly's algorithm [45] for computing the solvent accessible molecular surface area (SAMSA) using van der Waals radii (Table I) based upon the atom and its hybridization [46]. We computed the SAMSA for all heavy atoms in a single residue and for all of the side-chain atoms (in this context, side-chain atoms are all heavy atoms EXCEPT for N, C and O). We kept track of the contribution of the polar and nonpolar atoms for both the entire residue and the side-chain. The SAMSA is reported as either the absolute value or as a percentage of the theoretical maximum SAMSA for the specific residue. We computed a total buried area by subtracting the computed SAMSA for the side-chain from the theoretical maximum SAMSA for the side-chain. We systematically increased the contacts used in the computation of the SAMSA (Table II). The abbreviations used for this data in the PSdb are summarized in Table III.
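
    The two derived quantities described above follow directly from the computed and theoretical maximum areas. The short sketch below shows the arithmetic; the function names and the numerical values in the example are hypothetical and are not taken from the PSdb.

```python
def percent_exposed(samsa, max_samsa):
    """SAMSA expressed as a percentage of the theoretical maximum for the residue."""
    if max_samsa <= 0:
        raise ValueError("theoretical maximum SAMSA must be positive")
    return 100.0 * samsa / max_samsa


def total_buried_area(side_chain_samsa, max_side_chain_samsa):
    """Buried area: theoretical maximum side-chain SAMSA minus the computed value."""
    return max_side_chain_samsa - side_chain_samsa


if __name__ == "__main__":
    # Hypothetical side-chain with a maximum of 150 square Angstroms, 30 of them exposed.
    print(percent_exposed(30.0, 150.0))
    print(total_buried_area(30.0, 150.0))
```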

    PFracc was computed using Eisenberg's [47] algorithm (Eq 1). We computed Eisenberg's environmental descriptors, but because of differences in the algorithm used to compute the SAMSA (Eisenberg used the Lee and Richards algorithm) and different van der Waals radii (Table I), we reproduced his environmental descriptors in only about 95% of the tested cases. We have recently reported [48] a new approach to defining the environmental regions based upon the theoretical features in the environmental plot. The region boundaries (AMH Classification, Figure 1a) are defined by a series of radial lines starting at the upper left hand corner of the environment plot and a series of arcs representing the distance from that corner. The upper left hand corner of the environment plot represents a fully exposed residue side-chain. It should be noted that Eisenberg used Method #1 (Table II) for computing his environment classification, whereas we feel that Method #4 (Table II) is potentially more appropriate. A comparison of the Eisenberg and AMH environmental classifications is illustrated in Figure 1.
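
    Boundaries built from radial lines and arcs about one corner amount to a polar coordinate system centered on that corner. The sketch below illustrates the geometry only. It assumes that buried area runs along the horizontal axis, that polar fraction runs along the vertical axis with the corner at a polar fraction of 1.0, and that the buried area is scaled by a maximum value; the boundary radii and angles in the example are invented, and the actual AMH boundaries are those reported in [48].

```python
import bisect
import math


def corner_polar_coords(buried_area, polar_fraction, max_buried_area,
                        corner_polar_fraction=1.0):
    """Return (radius, angle in degrees) measured from the upper left hand corner.

    Assumed layout: x is buried area scaled to 0..1, y is polar fraction, and the
    corner sits at x = 0 (fully exposed) and y = corner_polar_fraction.
    """
    dx = buried_area / max_buried_area            # 0 at the left edge
    dy = corner_polar_fraction - polar_fraction   # 0 at the top edge
    radius = math.hypot(dx, dy)
    angle = math.degrees(math.atan2(dy, dx))      # 0 along the top edge, 90 down the left edge
    return radius, angle


def region_index(radius, angle, arc_radii, ray_angles):
    """Locate the cell bounded by sorted arc radii and sorted radial line angles."""
    ring = bisect.bisect_right(arc_radii, radius)
    sector = bisect.bisect_right(ray_angles, angle)
    return ring, sector


if __name__ == "__main__":
    r, a = corner_polar_coords(60.0, 0.7, max_buried_area=120.0)
    # Hypothetical boundaries: two arcs and one radial line.
    print(region_index(r, a, arc_radii=[0.4, 0.8], ray_angles=[45.0]))
```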

    We have had a long standing interest in the interaction of divalent metal ions with peptides and proteins [49 - 55]. Indeed, one of the driving forces in the development of the PSdb is to understand the role of metal ions in the stabilization of specific tertiary structures and the propensities of the various side-chains to interact with specific metal ions [56].

    Metal Ion Coordination

    All ligands in the first coordination sphere of each metal ion are identified. This includes ligands from the protein (e.g., backbone oxygen and side-chain groups), other ligands, and the water of hydration about the metal ion. The water of hydration is treated as a member of this site because of its slow exchange rate and was used as a part of the metal ion for the surface area computations (Method #4, Table II). The distance criteria are a function of the metal ion and ligand. For example, Zn(II)-O distances are required to be less than 3.0 Angstroms while Zn(II)-S distances are required to be less than 4.0 Angstroms.
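
    A pair-specific distance test of this kind can be sketched as a lookup table keyed on the metal and ligand elements. Only the two Zn(II) cutoffs quoted above come from the text; the table layout, the function name, and the example coordinates are assumptions, and other metal ion and ligand pairs would need their own entries.

```python
import math

# Cutoffs in Angstroms. Only these two values are quoted in the text.
CUTOFFS = {
    ("ZN", "O"): 3.0,
    ("ZN", "S"): 4.0,
}


def first_coordination_sphere(metal_element, metal_xyz, atoms):
    """Return the names of atoms closer to the metal than the pair-specific cutoff.

    atoms is an iterable of (name, element, (x, y, z)) tuples. Atoms whose element
    has no cutoff defined for this metal are skipped.
    """
    shell = []
    for name, element, xyz in atoms:
        cutoff = CUTOFFS.get((metal_element, element))
        if cutoff is not None and math.dist(metal_xyz, xyz) < cutoff:
            shell.append(name)
    return shell


if __name__ == "__main__":
    # Hypothetical coordinates around a zinc ion placed at the origin.
    candidates = [
        ("HOH O", "O", (2.1, 0.0, 0.0)),
        ("CYS SG", "S", (0.0, 3.5, 0.0)),
        ("ASP OD1", "O", (0.0, 0.0, 3.2)),
    ]
    print(first_coordination_sphere("ZN", (0.0, 0.0, 0.0), candidates))
```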

    Bound Water Molecules

    All water molecules were checked for possible hydrogen bond interactions with the protein and the ligands about the protein. A distance criterion was used: the heavy atom-heavy atom distance was required to be less than 3.5 Angstroms. We did not use an angle dependence in this assignment; one will be included in future releases. Waters with three or more contacts to the protein and ligands were treated as structural waters and were included in the surface computations (Method #5, Table II).
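
    The distance-only test for structural waters can be sketched as a simple contact count. The 3.5 Angstrom cutoff and the three-contact threshold are taken from the text; the function names, the assumption that the caller supplies only candidate hydrogen bonding heavy atoms, and the example coordinates are hypothetical.

```python
import math


def count_contacts(water_oxygen, partner_atoms, max_distance=3.5):
    """Count candidate hydrogen bond partners within max_distance of the water oxygen."""
    return sum(1 for xyz in partner_atoms if math.dist(water_oxygen, xyz) < max_distance)


def is_structural_water(water_oxygen, partner_atoms, max_distance=3.5, min_contacts=3):
    """A water with at least min_contacts contacts is treated as structural."""
    return count_contacts(water_oxygen, partner_atoms, max_distance) >= min_contacts


if __name__ == "__main__":
    # Hypothetical heavy atoms from the protein and ligands near a water at the origin.
    partners = [(2.8, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 3.4), (5.0, 5.0, 5.0)]
    print(is_structural_water((0.0, 0.0, 0.0), partners))
```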

    The PSdb Viewer

    In order to visually inspect and investigate the entries of the PSdb, we provide PSdbView, a tool for visualizing individual PSdb entries. The tool is written in Java, thus providing a platform independent viewer as well as a means of viewing the PSdb over the Internet.

    PSdbView consists of a number of screens, each providing a visual overview of a specific set of PSdb data. These screens are summarized below.

    Basic Info Screen (Figure 2)
    This screen provides basic textual information about the protein (including protein description, method of determination, resolution, R-value, and keywords).

    Seqbar Screen (Figure 3)
    This screen allows side by side comparison of the data values for categories of residue based PSdb data. It consists of a number of sequence bars, one for each data category, each color coded according to the value of that category for each residue.

    Sequence bars for given categories can be added to or deleted from the screen, thus giving the user the capability to view only those data categories of interest.

    The sequence bars themselves are categorized within groups corresponding to the residue based data in the PSdb. Categories include Residue Charge and Polarity, Disulfide Bonds, Amide/Chiral (e.g., whether the backbone amide bond (Omega) is cis or trans, and whether C(alpha) is D or L), Secondary Structure and Phi/Psi Classifications, Solvent Accessible Molecular Surface Data (Table III), Metal Ion Interactions, and Structural Water Molecules Bound.

    Legend Screens (Figure 4)
    These screens display color mappings for each PSdb category.

    C(alpha)-C(alpha) Distance Plot/Hydrogen Bond Screen (Figure 5)
    This 2D plot color codes and displays the C(alpha) to C(alpha) distances for each residue pair above the diagonal and the location of the inter-residue hydrogen bonds present within the protein below the diagonal. Included with the 2D graph is a set of sequence bars along the top for side by side comparison of PSdb residue data with the distance plot.
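
    The data behind such a plot can be sketched as a single square grid, with distances stored on one side of the diagonal and hydrogen bond flags on the other. This is an illustration of the layout only and not the PSdbView implementation (which is written in Java); the function names and the input format are assumptions.

```python
import math


def ca_distance_matrix(ca_coords):
    """Symmetric matrix of C(alpha)-C(alpha) distances for a list of (x, y, z) tuples."""
    n = len(ca_coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(ca_coords[i], ca_coords[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


def distance_hbond_grid(ca_coords, hbond_pairs):
    """Cells with i < j hold distances; cells with i > j hold True where the residue
    pair (i, j) appears in hbond_pairs; diagonal cells are None."""
    n = len(ca_coords)
    distances = ca_distance_matrix(ca_coords)
    bonded = {frozenset(pair) for pair in hbond_pairs}
    grid = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i < j:
                grid[i][j] = distances[i][j]
            elif i > j:
                grid[i][j] = frozenset((i, j)) in bonded
    return grid


if __name__ == "__main__":
    coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
    for row in distance_hbond_grid(coords, hbond_pairs=[(0, 2)]):
        print(row)
```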

    Ramachandran Plot (Figure 6)
    On this 2D graph, individual residues are plotted based on their phi and psi dihedral angles. The background of the plot can be color coded to reflect structural region classifications as defined by either Garnier [40], Scheraga [41 - 43], or Thornton [44].
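
    For readers unfamiliar with how the plotted angles are obtained, the sketch below computes a dihedral angle from four atomic positions using the standard atan2 formulation. It is a generic illustration, not code from the PSdb; the function names are hypothetical. Phi uses the atoms C(i-1), N(i), C(alpha)(i), C(i), and psi uses N(i), C(alpha)(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond, in the range -180 to 180."""
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal to the plane of p0, p1, p2
    n2 = _cross(b1, b2)  # normal to the plane of p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Four points with a right-angle twist about the central bond.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```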

    Structural Environment Plot (Figure 7)
    This screen consists of a 2D plot describing the structural environment of the protein (i.e., Polar Fraction vs. Total Buried Area). Each residue plotted is color coded based on its charge and polarity class. The background of the plot can be rendered to reflect the AMH or Eisenberg environmental classifications.

    Surface Area Screens (Figure 8)
    This series of screens is used to investigate the solvent accessible surface area values (Table III) and their relationships to interchain interactions and interactions with ligands, ions, and structural waters. Each screen consists of a series of sequence bars and graphs. The sequence bars display color coded solvent accessible molecular surface data, and the graphs plot differences in the numerical surface data due to interchain interaction, ligands, ions, metals, and structural waters for each residue.

    Surface area plots are currently available for exposed surface area values, buried surface area values, and polar fraction values.

    Availability

    The PSdb can now be accessed via the Internet at http://www.psc.edu/~geigel/PSdb. The PSdb viewer is available as an applet which can be run via a Java enabled WWW browser. It is also available for download as an application that can be run locally using a local Java interpreter.

    Future Work

    We are in the process of developing a PSdb searching engine with associated GUI that will allow researchers to query the database and easily pose questions via the Internet. We are designing the engine to respond to questions such as:
  • What is the environment for a given peptide sequence (e.g., Gly-Gly)? [57]
  • What residues are present in specific secondary structures (e.g., Turns)?
  • What are the sequences within a specific structure (e.g., amphiphilic helices)?

    Visualization beyond the two-dimensional representation of proteins is essential. We will therefore extend PSdbView to work in conjunction with molecular graphics packages, enabling the color coded seqbar data to be mapped onto the displayed 3D structure. This integration will also allow for the exploration of alternative methods for the representation of the results from the database queries.

    Lastly, we will extend PSC's efforts in context sequence analysis to include the primary, secondary, and tertiary structure information contained within the PSdb. The alignment of sequences has traditionally been based upon pairwise alignments using gross structural or evolutionary measures. Context sequence alignment has only recently been used successfully. From the multiple sequence alignment of homologous proteins, one can determine areas of high mutability. If one of these sequences is in the PSdb, then the relationship between structure and mutability can be examined.

    Acknowledgements

    This work was funded by a grant from the NIH-NCRR (1 P41 RR06009). We gratefully acknowledge help during the early stages of this work from Catherine P. Milligan, Joseph C. Lappa, Alexander J. Ropelewski and Amanda M. Holland-Minkley. We would also like to thank Hugh B. Nicholas Jr. for many helpful discussions.

    References

    [1] Murzin et al., J. Mol. Biol. 1995, 247, 536-540.

    [2] Abola, E.W., Bernstein, F.C., Bryant, S., Koetzle, T.F. and Weng, J. (1987) Crystallographic Databases - Information Content, Software Systems, Scientific Applications, eds., Allen, F.H., Bergerhoff, G., and Sievers, R., Data Commission of the International Union of Crystallography, Bonn, Cambridge, Chester, pp 107-132.

    [3] Kabsch, W. and Sander, C. (1983) Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features. Biopolymers, 22, pp. 2577-2637.

    [4] Richardson, J.S. and Richardson, D.C. (1988) Amino Acid Preferences for Specific Locations at the Ends of alpha Helices. Science, 240, pp. 1648-1652.

    [5] Rooman, M.J. and Wodak, S.J. (1988) Identification of predictive sequence motifs limited by protein structure database size. Nature, 335, pp. 45-49.

    [6] Janin, J., Wodak, S., Levitt, M. and Maigret, B. (1978) Conformation of Amino Acid Side-chains in Proteins. J. Mol. Biol., 125, pp. 357-386.

    [7] Huysmans, M., Richelle, J., and Wodak, S.J. (1991) SESAM: A relational Database for Structure and Sequence of Macromolecules. Proteins,11, pp. 59-76.

    [8] Vasquez, M., Nemethy, G. and Scheraga, H.A. (1983) Computed Conformational States of the 20 Naturally Occurring Amino Acid Residues and of the Prototype Residue alpha-Aminobutyric Acid. Macromolecules, 16, pp. 1043-1049.

    [9] Lewis, P.N., Momany, F.A. and Scheraga, H.A. (1971) Folding of Polypeptide Chains in Proteins: A Proposed Mechanism for Folding. Proc. Natl. Acad. Sci. USA, 68, pp. 2293-2297.

    [10] Kuntz, I.D. (1972) Protein Folding. J. Am. Chem. Soc, 94, pp. 4009-4012.

    [11] Crawford, J.L., Lipscomb, W.N. and Schellman, C.G. (1973) The Reverse Turn as a Polypeptide Conformation in Globular Proteins. Proc. Natl. Acad. Sci. USA, 70, pp. 538-542.

    [12] Levitt, M. and Greer, J. (1977) Automatic Identification of Secondary Structure in Globular Proteins. J. Mol. Biol. , 114, pp. 181-239.

    [13] Rose, G.D. and Seltzer, J.P. (1977) A New Algorithm for Finding the Peptide Chain Turns in a Globular Protein. J. Mol. Biol., 113, pp. 153-164.

    [14] Chou, P.Y. and Fasman, G.D. (1977) beta-Turns in Proteins. J. Mol. Biol., 115, pp. 135-175.

    [15] Kolaskar, A.S., Ramabraham, V. and Soman, K.W., (1980) Reversal of Polypeptide Chain in Globular Proteins. Int J. Peptide Protein Res., 16, pp. 1-11.

    [16] Ramakrishnan, C. and Soman, K.V. (1982) Identification of secondary structures in globular proteins - a new algorithm. Int J. Peptide Protein Res., 20, pp. 218-237.

    [17] Hohne and Kretschmer (1985) Stud. Biophys., 108, 165-186.

    [18] Taylor, W.R. and Thornton, J.M. (1984) Recognition of Super-secondary Structure in Proteins. J. Mol. Biol., 173, pp. 487-514.

    [19] Taylor, W.R. and Thornton, J.M. (1983) Prediction of super-secondary structure in proteins. Nature, 301, pp. 540-542.

    [20] Staden, R. (1988) Method to define and locate patterns of motifs in sequences. Cabios, 4, pp. 53-60.

    [21] Richards, F.M. and Kundrot, C.E. (1988) Identification of Structural Motifs From Protein Coordinate Data: Secondary Structure and First-Level Supersecondary Structure. Proteins, 3, pp. 71-84.

    [22] Namboodiri, K., Pattabiraman, N., Lowrey, A. and Gaber, B.P. (1988) J. Mol Graphics, 6, 211-212

    [23] George, D.G., Barker, W.C. and Hunt, L.T. (1986) The Protein identification resource. Nucl. Acids Res., 14, pp. 11-15.

    [24] Pattabiraman, N., Namboodiri, K., Lowrey, A. and Gaber, B.P. (1989) Protein Sequences and Data Analysis (communicated).

    [25] Bryant, S.H. (1989) PKB: A Program System and Data Base for Analysis of Protein Structure. Proteins, 5, pp. 233-247.

    [26] Morffew, A.J., Todd, S.J.P., and Snelgrove, M. (1983) J. Comput. Chem., 7, 9-16.

    [27] McGregor, M.J., Islam, S.A. and Sternberg, M.J.E. (1987) Analysis of the Relationship Between Side-chain Conformation and Secondary Structure in Globular Proteins. J. Mol. Biol., 198, pp. 295-310.

    [28] Morffew, A.J. and Todd, S.J.P. (1986) Computers Chem., 10, 9-14.

    [29] Pabo, C.O. and Suchanek, E.G. (1986) Computer-Aided Model-Building Strategies for Protein Design. Biochemistry, 25, pp. 5987-5991.

    [30] Kanehisa, M., Klein, P., Greif, P. and DeLisi, C. (1984) Computer analysis and structure prediction of nucleic acids and proteins. Nucl. Acid Res., 12, pp. 417-428.

    [31] Morffew, A.J., Todd, S.J.P. and Snelgrove, M.J. (1983) Computers Chem.,7, 9-16.

    [32] Pabo, C. (1987) Nature, 327, 467.

    [33] Glen, R.C. and Rose, V.S. (1987) J. Mol. Graphics, 5, 79-86.

    [34] Frey, P.M.D., Paton, N.W., Kemp, G.J.L., Fothergill, J.E. (1990) Protein Engr.,3, 235-243.

    [35] Pongor, S. (1988) Nature,332, 24.

    [36] Akrigg, D., Bleasby, A.J., Dix, N.I.M., Findlay, J.B.C., Worth, A.C. T, Parry-Smith, D., and Wootton, J.C. (1988) Nature, 335, 745-746.

    [37] Blundell, T.L., Sibanda, B.L., Sternberg, M.J.E. and Thornton, J.M. (1987) Knowledge-based prediction of protein structures and the design of novel molecules. Nature, 326, pp. 347-352.

    [38] Presta, L.G. and Rose, G.D. (1988) Helix Signals in Proteins. Science, 240, pp. 1632-1641.

    [39] Deerfield, D.W., II, manuscript in preparation.

    [40] Garnier J., Robson B., 1989. "Prediction of Protein Structure and the Principles of Protein Conformation", Ed. Gerald D. Fasman, Plenum Publishers, NY, NY, pp 417-467.

    [41] Vasquez M., Nemethy G., Scheraga H.A., 1983. Macromolecules 16:1043-1049. Chou P.Y., Fasman G.D., 1977. J. Mol. Biol., 115:135-175.

    [42] Venkatachalam C.M., 1968. Biopolymers, 6:1425-1436.

    [43] P.N. Lewis, F.A. Momany and H.A. Scheraga, 1973. Biochim. Biophys. Acta, 303, 211-229.

    [44] Laskowski, R.A., MacArthur, M.W., Moss, D.S., Thornton, J.M. (1993), PROCHECK: a program to check the stereochemical quality of protein structures, J. Appl. Cryst., 26, pg 283-291.

    [45] Connolly, M.L (1983), Analytical Molecular Surface Calculation, J. Appl Cryst, 16, pg 548-558.

    [46] Francl, M.M., Hunt, R.F.,Jr., Hehre, W.J., (1984) J. Am Chem Soc., 106, pg 563-570.

    [47] Bowie, J.U., Luthy R., Eisenberg, D. (1991), A Method to Identify Protein Sequences That Fold into a Known Three-Dimensional Structure, Science, 253, 164-170.

    [48] Deerfield, D.W. II, Holland-Minkley, A.M., Geigel, J, Nicholas, H.B., Jr., (1996), Classification of the Environment of Protein Residues, submitted J. Prot. Chem

    [49] Deerfield, D.W., II, Fox, D.J., Head-Gordon, M., Hiskey, R.G., Pedersen, L.G., (1995), The First Solvation Shell of Magnesium Ions in a Model Protein Environment with Formate, Water and X=NH3, H2S, Imidazole, Formaldehyde and Chloride as Ligands: An ab initio Study. Proteins, 21:244-255, 1995.

    [50] Cabaniss, S., Deerfield, D.W., II, Monroe, D.M., Hiskey, R.G., Pedersen, L.G., (1995), Is Ca(II) ion binding to Prothrombin Fragment 1 Intrinsically Cooperative or is Cooperative Binding Accounted for by Self Association? Blood Coagulation Fibrinolysis, 6:464-473, 1995.

    [51] Deerfield II, D.W., Fox, D.J., Head-Gordon, M., Hiskey, R.G., Pedersen, L.G., (1991), The Interaction of Calcium and Magnesium Ions with Malonate and the Role of the Waters of Hydration: A Quantum Mechanics Study. J. Am. Chem. Soc., 113:1892-1899, 1991.

    [52] Deerfield, D.W., II, Nicholas, H.B., Jr., Hiskey, R.G., Pedersen, L.G., (1989), Salt or Ion Bridges in Biological Systems: A Study Employing Quantum and Molecular Mechanics. Proteins, 6, 168-192, 1989.

    [53] Maynard, A.T., Eastman, M.A., Darden, T., Deerfield, D.W., II, Hiskey, R.G., Pedersen, L.G., (1988), The Effect of Calcium(II) and Magnesium(II) Ions on the 18-23 (gamma)-Carboxyglutamic Acid containing Cyclic Peptide Loop of Bovine Prothrombin: An AMBER Molecular Mechanics Study. Int. J. Peptide Protein Res., 31:137-149, 1988.

    [54] Deerfield, D.W., II, Olson, D.L., Berkowitz, P., Byrd, P.A., Koehler, K.A., Pedersen, L.G., Hiskey, R.G., (1987), Mg(II) Binding by Bovine Prothrombin Fragment 1 via Equilibrium Dialysis and the Relative Roles of Mg(II) and Ca(II) in Blood Coagulation. J. Biol. Chem., 262:4017-4023, 1987.

    [55] Deerfield, D.W., II, Berkowitz, P., Olson, D.L., Wells, S., Hoke, R.A., Koehler, K.A., Pedersen, L.G., Hiskey, R.G., (1986), The Effect of Divalent Metal Ions on the Electrophoretic Mobility of Bovine Prothrombin and Bovine Prothrombin Fragment 1. J. Biol. Chem., 261:4833-4839, 1986.

    [56] Deerfield II, D.W., Nicholas Jr., H.B., (1996) Classification of Metal Ion Binding Sites within Protein Structures, J. Prot. Chem., to be submitted, 1996.

    [57] Deerfield, D.W., II, Holland-Minkley, A., Hempel, J.D., Nicholas, H.B., Jr. (1994), Conformational Flexibility of the Gly-Gly dipeptide within protein structures. J. Protein. Chem., 13:526.