Cleavage sites in the family Potyviridae

These data have been prepared by Mike Adams, Rothamsted Research, using all sequences available on June 29th 2008. The pages were previously hosted on the Rothamsted Research web site but have now been moved here to provide a single site for information on, and analysis of, plant virus sequences. This information is based on published data and our own analyses as reported in the paper: Adams, Antoniw & Beaudoin (2005). Overview and analysis of the polyprotein cleavage sites in the family Potyviridae. Molecular Plant Pathology 6, 471-487.




Introduction

The ssRNA genome of members of the family Potyviridae encodes a polyprotein that is subsequently processed by virus-encoded proteases to generate functional proteins. The cleavage sites show considerable sequence conservation, but with some variations that have phylogenetic significance. The genome organisation of a typical member of the family is shown in Figure 1, which indicates the ten mature proteins and the nine cleavage sites (arrowed). In some members of the genus Ipomovirus, there appear to be two copies of the P1 protein (P1a and P1b) and the HC-Pro is absent.

Figure 1. Potyvirus genome map

The only departure from this genome organisation is in the genus Bymovirus, where the genome is bipartite. The longer bymovirus RNA (RNA1) corresponds to the 3'-portion of the genome of other members of the family, but the two proteins produced from RNA2 are rather different (Figure 2).

Figure 2. Bymovirus genome map

The following tables provide summaries of the polyprotein cleavage sites within the family. Usually four amino acids are shown before the cleavage site and one after, as this includes most of the sequence conservation. In most cases these sequences are deduced from peptide alignments. In a few cases, the sites differ in position from those given in the sequence file header and so represent my estimates rather than those of the original authors. All available sequences in the family Potyviridae were used, whether complete or partial (see top of page for date of latest update). The accuracy of these data is only as good as that of the underlying sequence files, and some of the apparently unusual cleavage sites (especially those detected only once) may arise from errors in sequencing.
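As a small illustration of the notation used in the tables, the following Python sketch splits a site string such as IXHY/S into the residues before and after the cleavage point. It assumes that the slash marks the point of cleavage and that X stands for a variable position; the names `CleavageSite` and `parse_site` are hypothetical and are not taken from the paper or from any published tool.

```python
from typing import NamedTuple


class CleavageSite(NamedTuple):
    before: str  # residues upstream of the cut (usually four in these tables)
    after: str   # residue(s) downstream of the cut (usually one)

    @property
    def dipeptide(self) -> str:
        """Residues immediately flanking the cut, e.g. 'Y/S'."""
        return f"{self.before[-1]}/{self.after[0]}"


def parse_site(site: str) -> CleavageSite:
    """Split a site string such as 'IXHY/S' at the slash."""
    before, sep, after = site.strip().upper().partition("/")
    if not sep or not before or not after:
        raise ValueError(f"not a cleavage site: {site!r}")
    return CleavageSite(before, after)


site = parse_site("IXHY/S")
print(site.before, site.after, site.dipeptide)  # IXHY S Y/S
```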


(1). P1/HC-Pro cleavage site (cut by the P1 serine proteinase)


Site  No. of sequences  No. of viruses
AXHF/S 2 1
FXWY/G 2 1
IXEF/S 10 3
IXFY/A 3 2
IXFY/C 1 1
IXHF/S 86 5
IXHY/A 18 4
IXHY/S 128 19
IXHY/T 4 1
IXLY/G 1 1
IXLY/S 1 1
IXQF/S 7 3
IXQY/A 1 1
IXQY/S 2 2
IXYY/S 2 2
LXHY/S 6 3
LXWY/C 1 1
LXWY/G 2 1
LXWY/S 1 1
LXYF/S 1 1
MXEY/S 2 1
MXHF/S 2 2
MXHY/S 11 6
MXQF/A 1 1
MXQF/S 103 5
MXQY/N 15 1
MXQY/S 9 4
TXHY/S 10 2
VXFY/S 1 1
VXHY/S 25 5
YXQY/S 1 1
     
Dipeptide summary    
F/A 1 1
F/S 210 20
Y/A 22 7
Y/C 2 2
Y/G 5 3
Y/N 15 1
Y/S 198 45
Y/T 4 1
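The sequence counts in the dipeptide summary can be reproduced by grouping the full sites on the two residues that flank the cut. The Python sketch below does this for a subset of the rows above (those ending in Y/A or Y/G); the helper name `dipeptide` is hypothetical. Virus counts are deliberately not summed, because the same virus can presumably contribute to more than one row.

```python
from collections import Counter

# (site, number of sequences) copied from the P1/HC-Pro table above;
# only the rows ending in Y/A or Y/G are included to keep the example short.
rows = [
    ("IXFY/A", 3),
    ("IXHY/A", 18),
    ("IXQY/A", 1),
    ("FXWY/G", 2),
    ("IXLY/G", 1),
    ("LXWY/G", 2),
]


def dipeptide(site: str) -> str:
    """Return the residues flanking the cut, e.g. 'IXHY/A' -> 'Y/A'."""
    before, after = site.split("/")
    return f"{before[-1]}/{after[0]}"


totals = Counter()
for site, n_sequences in rows:
    totals[dipeptide(site)] += n_sequences

print(dict(totals))  # {'Y/A': 22, 'Y/G': 5}
```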



(2). HC-Pro/P3 cleavage site (cut by the HC-Pro cysteine proteinase)


Site  No. of sequences  No. of viruses
CXVG/G 1 1
FXVG/G 1 1
LXCG/G 1 1
YXIG/G 11 6
YXMG/G 1 1
YXVG/G 408 62
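One way a motif of this kind might be used is to scan a polyprotein sequence for candidate positions. The Python sketch below turns a site string into a regular expression, treating X as any residue (an assumption about the notation), and reports where the cut would fall. The function names and the sequence fragment are invented for illustration, and a motif match is only a candidate, not evidence of cleavage.

```python
import re


def site_to_regex(site: str) -> re.Pattern:
    """Compile 'YXVG/G' to a pattern matching YXVG followed by G."""
    before, after = site.split("/")

    def to_re(residues: str) -> str:
        # X is assumed to mean "any residue" and becomes a regex wildcard.
        return "".join("." if c == "X" else re.escape(c) for c in residues)

    # The lookahead keeps the downstream residue out of the match, so
    # match.end() is the index of the first residue after the cut.
    return re.compile(f"{to_re(before)}(?={to_re(after)})")


def candidate_cuts(sequence: str, site: str) -> list[int]:
    """Return 0-based indices of the residue just after each candidate cut."""
    return [m.end() for m in site_to_regex(site).finditer(sequence)]


# Hypothetical fragment, not a real viral sequence.
fragment = "MKTAYNVGGSLLQA"
print(candidate_cuts(fragment, "YXVG/G"))  # [8]
```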



(3). Bymovirus P1-Pro/P2 cleavage site (cut by the P1-Pro cysteine proteinase)


Virus                        Site     No. of sequences
Barley mild mosaic virus     YXVG/A   11
Barley yellow mosaic virus   GXVG/S   7
Oat mosaic virus             FXTG/N   1
Wheat yellow mosaic virus    GXVG/S   4



(4). Dipeptide summary of all other cleavage sites (cut by the NIa-Pro cysteine proteinase)



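For each site, Seqs is the number of sequences and Viruses the number of viruses. A dash indicates that the dipeptide was not found at that site.

Site:     P3/6K1        6K1/CI        CI/6K2        6K2/VPg       VPg/NIa-Pro   NIa-Pro/NIb   NIb/CP
Dipeptide Seqs Viruses  Seqs Viruses  Seqs Viruses  Seqs Viruses  Seqs Viruses  Seqs Viruses  Seqs Viruses
E/A       3    1        -    -        -    -        314  6        110  20       -    -        17*  3
E/G       13   2        1    1        -    -        55   17       61   21       -    -        2    1
E/H       2    1        -    -        -    -        -    -        -    -        -    -        -    -
E/I       -    -        -    -        -    -        -    -        1    1        -    -        -    -
E/M       -    -        -    -        -    -        1    1        -    -        -    -        -    -
E/N       -    -        -    -        1    1        -    -        5    2        -    -        -    -
E/R       1    1        -    -        -    -        -    -        -    -        -    -        -    -
E/S       2    2        -    -        13   4        21   1        455  33       1    1        85   4
E/T       -    -        -    -        -    -        -    -        2    1        -    -        -    -
G/A       -    -        -    -        -    -        -    -        1    1        1    1        -    -
G/G       -    -        -    -        1    1        -    -        -    -        -    -        -    -
G/K       -    -        -    -        -    -        1    1        -    -        -    -        -    -
H/A       -    -        1    1        -    -        -    -        -    -        -    -        16   1
H/G       1    1        -    -        -    -        1    1        -    -        1    1        1    1
H/S       -    -        -    -        -    -        -    -        -    -        -    -        1    1
H/V       -    -        -    -        -    -        -    -        1    1        -    -        -    -
Q/A       269  45       33   11       73   6        13   7        1    1        90   14       793  67
Q/C       -    -        -    -        -    -        -    -        -    -        24   1        -    -
Q/D       5    2        -    -        3    2        -    -        -    -        -    -        8    1
Q/E       -    -        -    -        1    1        -    -        -    -        -    -        3    2
Q/F       -    -        -    -        -    -        -    -        -    -        -    -        1    1
Q/G       10   6        24   5        36   10       231  50       -    -        55   21       177  24
Q/H       1    1        -    -        -    -        -    -        -    -        4    4        -    -
Q/I       -    -        -    -        -    -        -    -        -    -        1    1        -    -
Q/K       2    2        -    -        2    2        -    -        -    -        -    -        -    -
Q/M       -    -        2    1        1    1        -    -        -    -        4    3        12   6
Q/N       1    1        4    3        269  4        -    -        -    -        4    3        7    2
Q/P       -    -        -    -        1    1        -    -        -    -        1    1        2    2
Q/R       57   2        -    -        -    -        -    -        -    -        2    2        -    -
Q/S       64   11       270  58       183  47       -    -        -    -        153  38       945  84
Q/T       2    2        106  1        44   6        -    -        -    -        115  2        8    2
Q/V       8    2        -    -        2    1        -    -        2    2        5    1        32   6
Q/Y       1    1        -    -        -    -        -    -        -    -        -    -        -    -
R/S       -    -        -    -        -    -        -    -        -    -        -    -        1    1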

*, exclusively Rymoviruses
, exclusively Tritimoviruses