dbacp02314
General Description
Peptide name : Cathepsin B
Source/Organism : Mouse
Linear/Cyclic : Not found
Chirality : Not found
Sequence Information
Sequence : MWWSLILLSCLLALTSAHDKPSFHPLSDDLINYINKQNTTWQAGRNFYNVDISYLKKLCGTVLGGPKLPGRVAFGEDIDLPETFDAREQWSNCPTIGQIRDQGSCGSCWAFGAVEAISDRTCIHTNGRVNVEVSAEDLLTCCGIQCGDGCNGGYPSGAWSFWTKKGLVSGGVYNSHVGCLPYTIPPCEHHVNGSRPPCTGEGDTPRCNKSCEAGYSPSYKEDKHFGYTSYSVSNSVKEIMAEIYKNGPVEGAFTVFSDFLTYKSGVYKHEAGDMMGGHAIRILGWGVENGVPYWLAANSWNLDWGDNGFFKILRGENHCGIESEIVAGIPRTDQYWGRF
Peptide length : 339
C-terminal modification : Not found
N-terminal modification : Not found
Non-natural peptide information: None
Activity Information
Assay type : Not specified
Assay time : Not found
Activity : Not found
Cell line : Not found
Cancer type : Not found
Other activity : Not found
Physicochemical Properties
Molecular mass : 37279.4147 Dalton
Aliphatic index : 0.687
Instability index : 23.9065
Hydrophobicity (GRAVY) : -0.305
Isoelectric point : 5.5657
Charge (pH 7) : -8.7456
Aromaticity : 0.118
Molar extinction coefficient (cysteine, cystine): (88350, 89350)
Hydrophobic/hydrophilic ratio : 1.14556962
Hydrophobic moment : 0.0344
Missing amino acid : None
Most occurring amino acid : G
Most occurring amino acid frequency : 41
Least occurring amino acid : M
Least occurring amino acid frequency : 4
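The composition-derived fields above (length, most and least occurring residue, missing residues, aromaticity, GRAVY) can be recomputed from the Sequence field. The following is a minimal sketch, not the database's own pipeline: it assumes aromaticity is the fraction of F, W and Y residues and that GRAVY is the mean Kyte-Doolittle hydropathy, which are common conventions but are not stated in this record. The function name `summarize` is hypothetical.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes).
STANDARD_AA = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale (assumed here to be the scale behind GRAVY).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def summarize(seq):
    """Return simple composition statistics for a protein sequence.

    Raises KeyError if the sequence contains a non-standard residue.
    """
    seq = seq.strip().upper()
    counts = Counter(seq)
    # Only residues that actually occur compete for most/least frequent.
    present = [(aa, counts[aa]) for aa in STANDARD_AA if counts[aa] > 0]
    missing = [aa for aa in STANDARD_AA if counts[aa] == 0]
    return {
        "length": len(seq),
        "most_occurring": max(present, key=lambda p: p[1]),
        "least_occurring": min(present, key=lambda p: p[1]),
        "missing": missing,
        # Assumption: aromaticity = (F + W + Y) / length.
        "aromaticity": sum(counts[aa] for aa in "FWY") / len(seq),
        # Assumption: GRAVY = mean Kyte-Doolittle hydropathy.
        "gravy": sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq),
    }
```

Calling `summarize` on the Sequence field gives values that can be checked against the entries listed above. Small differences are possible if the database uses other definitions or rounding.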
Structural Information
Secondary structure fraction (Helix, Turn, Sheet): (0.2, 0.3, 0.3)
SMILES Notation: CC[C@H](C)[C@H](NC(=O)CNC(=O)[C@H](CS)NC(=O)[C@H](CS)NC(=O)[C@@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](NC(=O)[C@H](CCCNC(=N)N)NC(=O)CNC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](NC(=O)[C@H](Cc1c[nH]cn1)NC(=O)[C@@H](NC(=O)[C@H](CS)NC(=O)[C@@H](NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](C)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H](NC(=O)[C@H](C)NC(=O)CNC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](C)NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@H](CS)NC(=O)[C@H](CO)NC(=O)CNC(=O)[C@H](CS)NC(=O)[C@H](CO)NC(=O)CNC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@@H](NC(=O)[C@H](CCC(N)=O)NC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](NC(=O)[C@@H]1CCCN1C(=O)[C@H](CS)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CO)NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](C)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](NC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H](NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)CNC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@H](CCCNC(=N)N)NC(=O)CNC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(C)C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1C(=O)CNC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CS)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCCNC(=N)N)NC(=O)CNC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@@H](NC(=O)[C@@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C
O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1C(=O)[C@H](Cc1c[nH]cn1)NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](Cc1c[nH]cn1)NC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CS)NC(=O)[C@H](CO)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CO)NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@@H](N)CCSC)[C@@H](C)CC)[C@@H](C)O)[C@@H](C)CC)[C@@H](C)CC)[C@@H](C)O)[C@@H](C)O)C(C)C)[C@@H](C)CC)[C@@H](C)O)C(C)C)C(C)C)[C@@H](C)CC)[C@@H](C)O)[C@@H](C)O)[C@@H](C)CC)[C@@H](C)CC)C(C)C)[C@@H](C)CC)[C@@H](C)O)[C@@H](C)CC)[C@@H](C)O)C(C)C)C(C)C)C(C)C)[C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CS)C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(=O)NCC(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)NCC(=O)N[C@H](C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@H](C(=O)N[C@H](C(=O)N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)N[C@H](C(=O)NCC(=O)N[C@@H](CCC(=O)O)C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)N[C@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](C)C(=O)NCC(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCCCN)
C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CO)C(=O)N[C@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@H](C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(=O)N1CCC[C@H]1C(=O)N[C@H](C(=O)N[C@@H](CCC(=O)O)C(=O)NCC(=O)N[C@@H](C)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@H](C(=O)N[C@H](C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)NCC(=O)N[C@H](C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](C)C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCSC)C(=O)NCC(=O)NCC(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@@H](C)C(=O)N[C@H](C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)NCC(=O)N[C@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(=N)N)C(=O)NCC(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@@H](CS)C(=O)NCC(=O)N[C@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@H](C(=O)N[C@H](C(=O)N[C@@H](C)C(=O)NCC(=O)N[C@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[
C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)NCC(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](Cc1ccccc1)C(=O)O)[C@@H](C)O)[C@@H](C)CC)C(C)C)[C@@H](C)CC)[C@@H](C)CC)[C@@H](C)CC)C(C)C)C(C)C)[C@@H](C)CC)[C@@H](C)CC)C(C)C)[C@@H](C)O)C(C)C)[C@@H](C)O)C(C)C)[C@@H](C)CC)[C@@H](C)CC)C(C)C)C(C)C)[C@@H](C)O)[C@@H](C)O)[C@@H](C)O)C(C)C)[C@@H](C)CC)[C@@H](C)O)C(C)C)C(C)C)C(C)C)[C@@H](C)O
Secondary Structure :
| Method | Prediction |
|---|---|
| GOR | HHHHHHHHHHHHHHHHTTTCTTCCCCTTTHEEEETTTCCCEETTTTCEEECCHHTTTETEEEETCCCCCCEEEECCCCTCCHHHHHHHTTTTCCCECCEEETTTTTTEEHTHHHHHHHHHEEEECTTCEEEEECCTTHEEEEECETCTTCTTCCCCTCEEHHHTTTEEEEEEEEEEETCCCTCCCTTEEEETTCCCTTCCTTCCCTTTTTTTTTCCTTTTTTTEEEEEEEEETTTHHHHHHHHHTTCCCTTEEEEEHHTHHTTTTEHHHHTHHHHTCHHHEEEEEEETTTCCEEEHHTTCCTTTTCTTHHHHETTTTTTTHHHHEEECCCCCCTTTTEE |
| Chou-Fasman (CF) | EEEEEEEEECCCEECCCCCCCCCCCHHHHEEEECCCEEEECCCCCCEEEEEEEHHHHEEEEECCCCCCEEEEHHHHHCCCCHHHHHHHCCCCEEEEECCCCCCCCCHHHHCHHHHHCCEEEEECCEEEEEEHHHHHHEEEEEEECCCCCCCCCCCCCCEEECCCEEEEEEEECEEEECEEEEECCCCCCEECCCCCCCCCCCCCCCCCHHHHCCCCCHHHHHHEEEEEEEEEECHHHHHHHHCCCCCCCCEEEEEECEEEECEEEEHHHHHHHCCCCCEEEEEEECCCCEEEECCCCCCCCCCCCCCCEECCCCCCCCCCCCCCEECEECCEEEEECCC |
| Neural Network (NN) | HHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCHHCCCCCCCCCCCCCCCCCCHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCECCCCCCCCCCCCCCHHHCCCCEEECCCCCEEEEHHHHHHHHCCCCCCCCCCCCCCCCCCCEEHCCCCCECCCCEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEECCCCCHHHHHHHCCCCCCCCCEEEEECCEECCCCCCCCCCCCHHCCCCEEEEECCCCCCCCEEEHHCCCCCCCCCCCCEEEECCCCCCCCCCEEECCCCCCCCCCCCC |
| Joint/Consensus | HHHHHHHHHHHHHHCCCCCCCCCCCCCCCCEEECCCCCCCCCCCCCCEEECCHHHHCEEEEECCCCCCCCEECCCCCCCCCHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCHHHHHCCCCEEECCCCCEEEEHHHHHHEEEEECCCCCCCCCCCCCCCEEECCCCCEEEEEEEEEEECCCCCCCCCCCCEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEEEEECCCHHHHHHHHCCCCCCCCEEEEECCCEECCCCCHHHHCHHHHCCCCEEEEEECCCCCCCEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCEECCCCCCCCCCCCC |
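The per-residue prediction strings in the table can be summarized numerically. The sketch below assumes the usual state letters (H = helix, E = strand, T = turn, C = coil) and computes two things: the fraction of residues in each state, and the position-wise agreement between two methods. The function names are hypothetical. This is not how the "Secondary structure fraction" field above was derived, and that field may use a different method.

```python
from collections import Counter


def state_fractions(prediction, states="HETC"):
    """Fraction of residues assigned to each secondary-structure state."""
    counts = Counter(prediction)
    total = len(prediction)
    return {state: counts[state] / total for state in states}


def agreement(pred_a, pred_b):
    """Fraction of positions where two prediction strings agree."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction strings must have the same length")
    matches = sum(1 for a, b in zip(pred_a, pred_b) if a == b)
    return matches / len(pred_a)
```

For example, passing the GOR and Joint/Consensus strings to `agreement` shows how much the consensus follows that one method. The strings must be copied without the surrounding table delimiters or spaces.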
Molecular Descriptors and ADMET Properties
Molecular Descriptors: Not available.
ADMET Properties: Not available.
Cross Referencing databases
Pubmed Id : 2012677 2226854 3463996 1746902 16141072 15489334 1889751 12782676 21183079 23806337
Uniprot : Click here
PDB : Not available
CancerPPD : Not available
ApIAPDB : Not available
CancerPPD2 ID : Not available
Reference
1 : Gerhard DS, et al. The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC). Genome Res. 2004; 14:2121-7. doi: 10.1101/gr.2596504
2 : Qian F, et al. Characterization of multiple cathepsin B mRNAs in murine B16a melanoma. Anticancer Res. 1991; 11:1445-51.
3 : Park J, et al. SIRT5-mediated lysine desuccinylation impacts diverse metabolic pathways. Mol Cell. 2013; 50:919-30. doi: 10.1016/j.molcel.2013.06.001
4 : Qian F, et al. The structure of the mouse cathepsin B gene and its putative promoter. DNA Cell Biol. 1991; 10:159-68. doi: 10.1089/dna.1991.10.159
5 : Friedrichs B, et al. Thyroid functions of mouse cathepsins B, K, and L. J Clin Invest. 2003; 111:1733-45. doi: 10.1172/JCI15990
6 : Huttlin EL, et al. A tissue-specific atlas of mouse protein phosphorylation and expression. Cell. 2010; 143:1174-89. doi: 10.1016/j.cell.2010.12.001
7 : Chan SJ, et al. Nucleotide and predicted amino acid sequences of cloned human and mouse preprocathepsin B cDNAs. Proc Natl Acad Sci U S A. 1986; 83:7721-5. doi: 10.1073/pnas.83.20.7721
8 : Ferrara M, et al. Gene structure of mouse cathepsin B. FEBS Lett. 1990; 273:195-9. doi: 10.1016/0014-5793(90)81083-z
9 : Freimert C, et al. Isolation of a cathepsin B-encoding cDNA from murine osteogenic cells. Gene. 1991; 103:259-61. doi: 10.1016/0378-1119(91)90283-h
10 : Carninci P, et al. The transcriptional landscape of the mammalian genome. Science. 2005; 309:1559-63. doi: 10.1126/science.1112014
Literature
Paper title : The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC).
Doi : https://doi.org/10.1101/gr.2596504
Abstract : The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline.
Paper title : Characterization of multiple cathepsin B mRNAs in murine B16a melanoma.
Doi : Not available
Abstract : We have previously shown that the highly metastatic murine B16a melanoma expresses a high level of cathepsin B mRNA which is associated with three transcripts of 2.2, 4.0 and 5.0 kb, while in contrast only a single 2.2 kb cathepsin B RNA was detected in normal murine tissues. Using recombinant DNA techniques, cDNAs corresponding to these three transcripts have been isolated from a B16a melanoma cDNA library. Sequence analysis indicates that all three mRNA transcripts contain identical coding sequences for normal preprocathepsin B. However, the 4.0 and 5.0 kb transcripts contain unusually long extended 3' untranslated regions. These results suggest that the post-transcriptional processing pathway of the cathepsin B gene is modified in B16 melanomas. The results also indicate that the increased extracellular secretion of larger forms of cathepsin B by tumors is most likely due to post-translational mechanisms and does not involve alternative splicing or a coding mutation in the gene.
Paper title : SIRT5-mediated lysine desuccinylation impacts diverse metabolic pathways.
Doi : https://doi.org/10.1016/j.molcel.2013.06.001
Abstract : Protein function is regulated by diverse posttranslational modifications. The mitochondrial sirtuin SIRT5 removes malonyl and succinyl moieties from target lysines. The spectrum of protein substrates subject to these modifications is unknown. We report systematic profiling of the mammalian succinylome, identifying 2,565 succinylation sites on 779 proteins. Most of these do not overlap with acetylation sites, suggesting differential regulation of succinylation and acetylation. Our analysis reveals potential impacts of lysine succinylation on enzymes involved in mitochondrial metabolism; e.g., amino acid degradation, the tricarboxylic acid (TCA) cycle, and fatty acid metabolism. Lysine succinylation is also present on cytosolic and nuclear proteins; indeed, we show that a substantial fraction of SIRT5 is extramitochondrial. SIRT5 represses biochemical activity of, and cellular respiration through, two protein complexes identified in our analysis, pyruvate dehydrogenase complex and succinate dehydrogenase. Our data reveal widespread roles for lysine succinylation in regulating metabolism and potentially other cellular functions.
Paper title : The structure of the mouse cathepsin B gene and its putative promoter.
Doi : https://doi.org/10.1089/dna.1991.10.159
Abstract : The mouse cathepsin B gene and its flanking regions were cloned and characterized. The gene contains 10 exons and 9 introns spanning about 20 kb. Although the exon-intron organization of the mouse cathepsin B gene showed some similarity to the rat cathepsin H and L genes, significant differences were found. In particular, the highly conserved sequence that contains the catalytically active cysteine in these genes is split at different sites by an intron. As with other thiol proteinases, there is no obvious correspondence between the coding exons and structural or functional units within preprocathepsin B. These results suggest that the lysosomal thiol proteinase genes are evolutionarily ancient and that intron shifting has occurred subsequent to their divergence from a common ancestral form. The 5'-flanking region and exon 1 sequences in the mouse cathepsin B gene have a high GC content of approximately 72%. The 5'-flanking region also contains several potential Sp1 binding sites, but lacks TATA and CAAT motifs. These characteristics suggest that cathepsin B is a "housekeeping" gene and its transcription may be controlled by multiple transcription factors, including Sp1.
Paper title : Thyroid functions of mouse cathepsins B, K, and L.
Doi : https://doi.org/10.1172/JCI15990
Abstract : Thyroid function depends on processing of the prohormone thyroglobulin by sequential proteolytic events. From in vitro analysis it is known that cysteine proteinases mediate proteolytic processing of thyroglobulin. Here, we have analyzed mice with deficiencies in cathepsins B, K, L, B and K, or K and L in order to investigate which of the cysteine proteinases is most important for proteolytic processing of thyroglobulin in vivo. Immunolabeling demonstrated a rearrangement of the endocytic system and a redistribution of extracellularly located enzymes in thyroids of cathepsin-deficient mice. Cathepsin L was upregulated in thyroids of cathepsin K(-/-) or B(-/-)/K(-/-) mice, suggesting a compensation of cathepsin L for cathepsin K deficiency. Impaired proteolysis resulted in the persistence of thyroglobulin in the thyroids of mice with deficiencies in cathepsin B or L. The typical multilayered appearance of extracellularly stored thyroglobulin was retained in cathepsin K(-/-) mice only. These results suggest that cathepsins B and L are involved in the solubilization of thyroglobulin from its covalently cross-linked storage form. Cathepsin K(-/-)/L(-/-) mice had significantly reduced levels of free thyroxine, indicating that utilization of luminal thyroglobulin for thyroxine liberation is mediated by a combinatory action of cathepsins K and L.
Paper title : A tissue-specific atlas of mouse protein phosphorylation and expression.
Doi : https://doi.org/10.1016/j.cell.2010.12.001
Abstract : Although most tissues in an organism are genetically identical, the biochemistry of each is optimized to fulfill its unique physiological roles, with important consequences for human health and disease. Each tissue's unique physiology requires tightly regulated gene and protein expression coordinated by specialized, phosphorylation-dependent intracellular signaling. To better understand the role of phosphorylation in maintenance of physiological differences among tissues, we performed proteomic and phosphoproteomic characterizations of nine mouse tissues. We identified 12,039 proteins, including 6296 phosphoproteins harboring nearly 36,000 phosphorylation sites. Comparing protein abundances and phosphorylation levels revealed specialized, interconnected phosphorylation networks within each tissue while suggesting that many proteins are regulated by phosphorylation independently of their expression. Our data suggest that the "typical" phosphoprotein is widely expressed yet displays variable, often tissue-specific phosphorylation that tunes protein activity to the specific needs of each tissue. We offer this dataset as an online resource for the biological research community.
Paper title : Nucleotide and predicted amino acid sequences of cloned human and mouse preprocathepsin B cDNAs.
Doi : https://doi.org/10.1073/pnas.83.20.7721
Abstract : Cathepsin B is a lysosomal thiol proteinase that may have additional extralysosomal functions. To further our investigations on the structure, mode of biosynthesis, and intracellular sorting of this enzyme, we have determined the complete coding sequences for human and mouse preprocathepsin B by using cDNA clones isolated from human hepatoma and kidney phage libraries. The nucleotide sequences predict that the primary structure of preprocathepsin B contains 339 amino acids organized as follows: a 17-residue NH2-terminal prepeptide sequence followed by a 62-residue propeptide region, 254 residues in mature (single chain) cathepsin B, and a 6-residue extension at the COOH terminus. A comparison of procathepsin B sequences from three species (human, mouse, and rat) reveals that the homology between the propeptides is relatively conserved with a minimum of 68% sequence identity. In particular, two conserved sequences in the propeptide that may be functionally significant include a potential glycosylation site and the presence of a single cysteine at position 59. Comparative analysis of the three sequences also suggests that processing of procathepsin B is a multistep process, during which enzymatically active intermediate forms may be generated. The availability of the cDNA clones will facilitate the identification of possible active or inactive intermediate processive forms as well as studies on the transcriptional regulation of the cathepsin B gene.
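The region lengths given in this abstract (17 + 62 + 254 + 6 = 339) can be turned into 1-based residue ranges. The sketch below is plain arithmetic on those lengths. The resulting boundaries are only what the abstract's numbers imply, not curated annotation for this entry, and the name `region_bounds` is hypothetical.

```python
# Region lengths as stated in the abstract, in N- to C-terminal order.
REGIONS = [
    ("prepeptide", 17),
    ("propeptide", 62),
    ("mature chain", 254),
    ("C-terminal extension", 6),
]


def region_bounds(regions):
    """Convert consecutive region lengths to 1-based inclusive (start, end) ranges."""
    bounds = []
    start = 1
    for name, length in regions:
        end = start + length - 1
        bounds.append((name, start, end))
        start = end + 1
    return bounds


if __name__ == "__main__":
    for name, start, end in region_bounds(REGIONS):
        print(f"{name}: residues {start}-{end}")
```

With 0-based Python slicing, a region `(start, end)` corresponds to `sequence[start - 1:end]`.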
Paper title : Gene structure of mouse cathepsin B.
Doi : https://doi.org/10.1016/0014-5793(90)81083-z
Abstract : The structure of a genomic DNA fragment encoding mouse cathepsin B was characterized. The genomic insert spans 15 kbp and contains 9 exons encoding the 339 amino acid residues of mouse preprocathepsin B. Intron break-points are not found at the junctions of the pre-peptide, pro-peptide and mature enzyme. Like other cysteine proteinase genes, the region around the cysteinyl active site is split by an intron, but in contrast with cathepsins L and H the intron break-point is located immediately after the active site.
Paper title : Isolation of a cathepsin B-encoding cDNA from murine osteogenic cells.
Doi : https://doi.org/10.1016/0378-1119(91)90283-h
Abstract : Cathepsin B-encoding cDNA (CTSB) clones have been isolated from a lambda gt10 library of a murine osteosarcoma by differential screening during a search for genes which are typically expressed during osteogenic differentiation in mouse mandibular condyles in vitro. Sequencing of the CTSB 3' end revealed that the isolated sequence contained an 825-bp 3'-noncoding region, the polyadenylation signal and the poly(A) tail. The enhanced CTSB expression during the early stages of the enchondral ossification-like process in mandibular condyles in vitro suggests that CTSB participates in the degradation of cartilage matrix prior to the synthesis of bone matrix proteins.
Paper title : The transcriptional landscape of the mammalian genome.
Doi : https://doi.org/10.1126/science.1112014
Abstract : This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development.