AUTOBA: Automated Backbone Assignment

Introduction:

Backbone assignment is the first and key requirement in structural proteomics research by NMR. A new and simple protocol for this step has been presented, based on the correlation peaks and the distinctive peak patterns observable in the F1-F3 and F2-F3 projection planes of the 3D-HN(C)N spectrum. AUTOBA automates the backbone assignment of spectra obtained through this protocol.

AUTOBA is written in C and uses only standard C libraries. Its working is described here in brief. First, the protein primary sequence is transformed so that all amino acids are replaced by the letter X, except that (i) prolines are replaced by P and (ii) residues used as check points are replaced by their own letters: glycines by G, alanines by A, serines/threonines by B, and so on. Next, the program reads (i) the HSQC peak list for backbone amide chemical shifts, (ii) the 2D-hNcnH peak list for self and sequential correlations, and (iii) the check point information (note: check points here are HSQC peaks identified as belonging to a particular amino acid type, and they are required by AUTOBA). The self peaks in the 2D-hNcnH spectrum are first distinguished from the sequential peaks by comparing them with the chemical shifts of the (backbone-only) HSQC peaks; the program then finds all possible sequential connectivities between the HSQC peaks using the 2D-hNcnH peak list. The stretches of connected HSQC peaks are then converted into a sequence format such as --XXXBXXGXX-- using the check point information. These stretches are compared with the transformed primary sequence, and when there is a match the program takes the set of peaks corresponding to the stretch as assigned. The process is repeated, excluding the peaks already assigned. Finally, the complete assignment is converted into the common BMRB format.
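The sequence transformation and the comparison step described above can be illustrated with a minimal Python sketch. This is not AUTOBA's own C code: the function names, the example sequence, and the rule that a stretch counts as matched only when it occurs exactly once in the transformed sequence are assumptions made for illustration. The sketch also assumes that every glycine, alanine, serine and threonine is used as a check point.

```python
# Illustrative sketch only; names and the uniqueness rule are assumptions,
# not taken from the AUTOBA source code.

# Check point residues and the letters that replace them (S and T share B).
CHECK_POINT_LETTERS = {"G": "G", "A": "A", "S": "B", "T": "B"}


def transform_sequence(sequence):
    """Map a one-letter protein sequence onto the X/P/G/A/B alphabet."""
    letters = []
    for residue in sequence.upper():
        if residue == "P":
            letters.append("P")  # prolines are kept as P
        else:
            # check point residues keep a letter, everything else becomes X
            letters.append(CHECK_POINT_LETTERS.get(residue, "X"))
    return "".join(letters)


def find_unique_match(stretch, transformed):
    """Return the 1-based residue number where the stretch starts,
    or None if the stretch occurs zero times or more than once."""
    positions = []
    start = transformed.find(stretch)
    while start != -1:
        positions.append(start)
        start = transformed.find(stretch, start + 1)  # allow overlaps
    return positions[0] + 1 if len(positions) == 1 else None


sequence = "MQIFVKTLTGKTITLEV"  # arbitrary example sequence
transformed = transform_sequence(sequence)
print(transformed)                             # XXXXXXBXBGXBXBXXX
print(find_unique_match("BXBG", transformed))  # 7
print(find_unique_match("XX", transformed))    # None (ambiguous)
```

In this sketch a short stretch such as XX is rejected because it fits at several places, whereas a stretch containing check point letters can be placed unambiguously. This is why the check points matter for the comparison.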

The presented approach is well suited to small, well-folded proteins that show a well-dispersed 15N-1H HSQC spectrum; for these, the whole assignment process (including data acquisition and analysis) can be completed in less than a day. The automation approach has been further extended to the analysis of 3D HN(C)N spectra using the same kind of check points. This makes it applicable to (i) higher molecular weight proteins (>12 kDa and up to 20 kDa) and (ii) unstructured/unfolded/denatured states of proteins, where the spectral dispersion in the 2D spectra may not be adequate. These features have been incorporated in the same version of AUTOBA. This analysis requires the protein primary sequence, the 3D-HN(C)N peak list, and the check points.



To know more about the AUTOBA protocol, please see the references listed below.

References:

1. Panchal S C, Bhavesh N S, Hosur R V. “Improved 3D triple resonance experiments, HNN and HN(C)N, for HN and 15N sequential correlations in (13C, 15N) labeled proteins: application to unfolded proteins” J Biomol NMR 2001; (20): 135-147.

2. Neel S. Bhavesh, Sanjay C. Panchal, and R. V. Hosur. “An efficient high throughput resonance assignment procedure for structural genomics and protein folding research by NMR” Biochemistry 40, 14727-14735 (2001).

3. Amarnath Chatterjee, Neel S. Bhavesh, Sanjay C. Panchal and R. V. Hosur. “A novel protocol based on HN(C)N for rapid resonance assignment in (15N, 13C) labeled proteins: implications to structural genomics” Biochem. Biophys. Res. Commun. 293, 427-432 (2002).

4. Amarnath Chatterjee, Ashutosh Kumar and Ramakrishna V. Hosur. “Alanine check points in HNN and HN(C)N spectra” J Magn Reson 181(1), 21-28 (2006).

5. Jeetender Chugh, Dinesh Kumar and Ramakrishna V. Hosur. “Tuning the HNN experiment: Generation of serine-threonine check points” J Biomol NMR 40(2), 145-152 (2008).

6. Jeetender Chugh and Ramakrishna V. Hosur. “Spectroscopic labeling of A, S/T in the 1H-15N HSQC spectrum of uniformly (15N-13C) labeled proteins” J Magn Reson 194(2), 289-294 (2008).

7. Dinesh Kumar, Jeetender Chugh, and Ramakrishna V. Hosur. “Generation of Serine/Threonine Check Points in HN(C)N spectra” J. Chem. Sci. 121(6), 955-964 (2009).

8. Dinesh Kumar, Subhradip Paul, and Ramakrishna V. Hosur. “BEST-HNN and 2D-(HN)NH experiments for rapid backbone assignment in proteins” J Magn Reson 2010; (204): 111-117.

9. Dinesh Kumar, Jithender G Reddy, and Ramakrishna V. Hosur. “hnCOcaNH and hncoCANH pulse sequences for rapid and unambiguous backbone assignment in (13C, 15N) labeled proteins” J Magn Reson 2010 (In Press).

10. Dinesh Kumar and Ramakrishna V. Hosur. “An efficient high-throughput protocol based on 2D-HN(C)N for unambiguous HN and 15N backbone assignment in small folded proteins in less than a day” Current Science: Research Communication (2010).

Instructions for using the AUTOBA online server:

1. Provide the necessary information and upload the input files in the AUTOBA sub-panels displayed below.

2. Before submitting, make sure that the input files are in the right format. The input file formats have been kept very simple and can be checked in the individual AUTOBA sub-panels.

3. Click on the "Submit your Input" button. If your files are in the right format, you will get a successful upload message; if the format of the input files does not match, you will be asked to "Check your Input File format".

4. The next step is to submit the check points. There are two ways to do this. In the first, the program identifies the check points from data provided from variants of the 2D-hncNH or 2D-(HN)NH spectra, by comparing the intensities and signs of the 15N-1H cross peaks in these variants. There are three variants of each experiment: 2D-hncNH-G/2D-(HN)NH-G, 2D-hncNH-A/2D-(HN)NH-A and 2D-hncNH-ST/2D-(HN)NH-ST, which identify glycines, alanines and serines/threonines, respectively. In addition, the variants of 2D-hncNH identify the residues that are next to these special residues (G, A, and S/T). In the second, the user identifies the check points and provides them directly to the server in the form of a file.

5. Finally, click on the "Execute AUTOBA" button. The program will run and the backbone assignment will be e-mailed to the user (as text as well as in BMRB format).

6. For a test run, demo data can be downloaded from here: DEMO Data on Ubiquitin

Note: AUTOBA can also perform the assignment from a 3D-HN(C)N peak list and a few check points. The 3D-HN(C)N peak list in this case also includes the intensities of the self and sequential correlation peaks, which lead to the identification of triplet-specific peak patterns and hence provide internal checks for glycines, and for the residues next to glycines, during the assignment process.

AUTOBA

Protein Primary Sequence
(FASTA Format)



Sample Protein Primary Sequence

HSQC Peak List

For a peak list of backbone-only amide chemical shifts, use the 1H-15N projection plane of the 3D-CBCACONH experiment.

Sample HSQC Peak List

2D-hNcnH Peak List
(Mandatory)



Sample 2D-hNcnH Peak List

2D-hnCOcanH Peak List
(Optional; For extracting 13CO assignment)


Sample 2D-hnCOcanH Peak List

2D-hncoCAnH Peak List
(Optional; For extracting 13CA assignment)

Sample 2D-hncoCAnH Peak List

3D-HN(C)N Peak List
(Mandatory for 3D-HN(C)N based backbone assignment)

Sample 3D-HN(C)N Peak List

3D-hnCOcaHN Peak List
(Optional)

Sample 3D-hnCOcaHN Peak List

3D-hncoCAHN Peak List
(Optional)

Sample 3D-hncoCAHN Peak List

Cut-off Threshold
1H ppm


15N ppm


13C ppm

User Email ID

Submit Check Point Data

HSQC peaks identified for amino acid types


Sample Checkpoint File


HSQC peaks identified for particular amino acid types and the following (i+1) residues

Sample Checkpoint File


Amino acid type Identification from Variants of 2D-hncNH or 2D-(HN)NH or 2D-(HC)NH spectrum

Intensity of the Checkpoint peak (Here G, A and S/T)



Check Point File



Sample Checkpoint File




Information about input files and upload formats:

The main inputs required for AUTOBA are:

          (i) the peak list for backbone-only amide chemical shifts, which can be derived from the 1H-15N projection plane of the CBCACONH experiment,
          (ii) the peak list for self (HiNi) and sequential (HiNi+1) correlations derived from the 2D-hNcnH spectrum,
          (iii) the protein primary sequence (FASTA format), and finally
          (iv) a few check points (i.e. the HSQC peaks identified for particular amino acid types).
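How inputs (i), (ii) and (iv) might be combined can be sketched in Python as follows. This is an assumption-laden illustration, not AUTOBA's implementation: the function names, the tolerance defaults (0.02 ppm for 1H and 0.2 ppm for 15N, standing in for the cut-off thresholds entered in the form) and the example shifts are all hypothetical. Peaks are represented as (1H ppm, 15N ppm) tuples rather than being read from CARA peak list files.

```python
# Illustrative sketch only; names, tolerances and data are assumptions.

def close(a, b, tol):
    """True when two chemical shifts (ppm) agree within the tolerance."""
    return abs(a - b) <= tol


def classify_peaks(hsqc, hncnh, tol_h=0.02, tol_n=0.2):
    """Split 2D-hNcnH peaks into self peaks and sequential links.

    Returns (self_peaks, links). A link (i, j) means that the 1H shift of
    HSQC peak i is correlated with the 15N shift of HSQC peak j (HiNi+1).
    """
    self_peaks, links = [], []
    for k, (h, n) in enumerate(hncnh):
        for i, (h_i, n_i) in enumerate(hsqc):
            if not close(h, h_i, tol_h):
                continue  # peak is not on the 1H shift of HSQC peak i
            if close(n, n_i, tol_n):
                self_peaks.append(k)  # both shifts match: self peak (HiNi)
                continue
            for j, (_, n_j) in enumerate(hsqc):
                if j != i and close(n, n_j, tol_n):
                    links.append((i, j))  # 15N of another HSQC peak
    return self_peaks, links


def build_stretches(n_peaks, links):
    """Chain links into stretches, following only unambiguous connections."""
    successors, predecessors = {}, {}
    for i, j in links:
        successors.setdefault(i, set()).add(j)
        predecessors.setdefault(j, set()).add(i)
    # keep a link only if i has one successor and j has one predecessor
    nxt = {i: next(iter(s)) for i, s in successors.items()
           if len(s) == 1 and len(predecessors[next(iter(s))]) == 1}
    has_prev = set(nxt.values())
    stretches = []
    for start in range(n_peaks):
        if start in has_prev or start not in nxt:
            continue  # not the first peak of a stretch
        chain = [start]
        while chain[-1] in nxt:
            chain.append(nxt[chain[-1]])
        stretches.append(chain)
    return stretches


hsqc = [(8.10, 120.0), (8.45, 115.3), (7.90, 124.8)]
hncnh = [(8.10, 120.0), (8.10, 115.3), (8.45, 115.3),
         (8.45, 124.8), (7.90, 124.8)]
check_points = {1: "B", 2: "G"}  # HSQC peak index -> check point letter

self_peaks, links = classify_peaks(hsqc, hncnh)
stretches = build_stretches(len(hsqc), links)
labels = ["".join(check_points.get(i, "X") for i in s) for s in stretches]
print(links)   # [(0, 1), (1, 2)]
print(labels)  # ['XBG']
```

Following only connections with a single successor and a single predecessor is a deliberately conservative choice for this sketch: degenerate chemical shifts produce branching links, and leaving those unresolved avoids building a wrong stretch.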

The peak lists follow the CARA peak list format. Example peak lists are provided in each sub-panel of AUTOBA.
The peak lists can be uploaded to the server via the form elements displayed in each AUTOBA sub-panel.
The various pulse sequences described here (for Bruker spectrometers only) can be downloaded from here: Pulse-Sequences

Cite our work
AUTOBA: AUTOmated Backbone Assignment based on sequential amide correlations and G/A/ST-check points