dbACP: A Comprehensive Database of Anti-Cancer Peptides

dbacp02314

General Description

Peptide name : Cathepsin B

Source/Organism : Mouse

Linear/Cyclic : Not found

Chirality : Not found

Sequence Information

Sequence : MWWSLILLSCLLALTSAHDKPSFHPLSDDLINYINKQNTTWQAGRNFYNVDISYLKKLCGTVLGGPKLPGRVAFGEDIDLPETFDAREQWSNCPTIGQIRDQGSCGSCWAFGAVEAISDRTCIHTNGRVNVEVSAEDLLTCCGIQCGDGCNGGYPSGAWSFWTKKGLVSGGVYNSHVGCLPYTIPPCEHHVNGSRPPCTGEGDTPRCNKSCEAGYSPSYKEDKHFGYTSYSVSNSVKEIMAEIYKNGPVEGAFTVFSDFLTYKSGVYKHEAGDMMGGHAIRILGWGVENGVPYWLAANSWNLDWGDNGFFKILRGENHCGIESEIVAGIPRTDQYWGRF

Peptide length: 339

C-terminal modification: Not found

N-terminal modification : Not found

Non-natural peptide information: None

Activity Information

Assay type : Not specified

Assay time : Not found

Activity : Not found

Cell line : Not found

Cancer type : Not found

Other activity : Not found

Physicochemical Properties

Amino acid composition bar chart : [figure not reproduced]

Molecular mass : 37279.4147 Dalton

Aliphatic index : 0.687

Instability index : 23.9065

Hydrophobicity (GRAVY) : -0.305

Isoelectric point : 5.5657

Charge (pH 7) : -8.7456

Aromaticity : 0.118

Molar extinction coefficient (cysteine, cystine): (88350, 89350)

Hydrophobic/hydrophilic ratio : 1.14556962

Hydrophobic moment : 0.0344

Missing amino acid : None

Most occurring amino acid : G

Most occurring amino acid frequency : 41

Least occurring amino acid : M

Least occurring amino acid frequency : 4
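
Several of the fields above (most and least occurring residue, aromaticity, GRAVY, extinction coefficients) can be recomputed from the sequence alone. The following is a minimal Python sketch, not the database's own pipeline: the record does not say which tool produced its values, and all function names here are illustrative. It assumes the Kyte-Doolittle hydropathy scale for GRAVY, aromaticity as the fraction of F, W and Y, and the common 5500/1490/125 per-residue coefficients (Trp, Tyr, cystine at 280 nm) for the extinction pair, read as (all cysteines reduced, all cysteines paired).

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale (assumed here for GRAVY).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Count the 20 standard residues; other characters are ignored."""
    counts = Counter(seq.upper())
    return {aa: counts.get(aa, 0) for aa in KYTE_DOOLITTLE}


def most_and_least(seq):
    """Return (most frequent, least frequent present, sorted missing residues)."""
    comp = composition(seq)
    present = {aa: n for aa, n in comp.items() if n > 0}
    if not present:
        raise ValueError("sequence contains no standard residues")
    most = max(present, key=present.get)
    least = min(present, key=present.get)
    missing = sorted(aa for aa, n in comp.items() if n == 0)
    return most, least, missing


def gravy(seq):
    """Mean Kyte-Doolittle hydropathy over the standard residues."""
    comp = composition(seq)
    total = sum(comp.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return sum(KYTE_DOOLITTLE[aa] * n for aa, n in comp.items()) / total


def aromaticity(seq):
    """Fraction of residues that are Phe, Trp or Tyr."""
    comp = composition(seq)
    total = sum(comp.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return (comp["F"] + comp["W"] + comp["Y"]) / total


def extinction_coefficients(seq):
    """Return (reduced, oxidized) molar extinction coefficients at 280 nm."""
    comp = composition(seq)
    reduced = 5500 * comp["W"] + 1490 * comp["Y"]
    oxidized = reduced + 125 * (comp["C"] // 2)  # one cystine per Cys pair
    return reduced, oxidized
```

Passing the Sequence field of this record to these functions gives values to compare against the entries above. Differences in rounding, or in how a tool treats ties and non-standard residues, are possible.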

Structural Information

3D structure : [interactive viewer not reproduced]

Secondary structure fraction (Helix, Turn, Sheet): (0.2, 0.3, 0.3)

SMILES Notation: CC[C@H](C)[C@H](NC(=O)CNC(=O)[C@H](CS)NC(=O)[C@H](CS)NC(=O)[C@@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](NC(=O)[C@H](CCCNC(=N)N)NC(=O)CNC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](NC(=O)[C@H](Cc1c[nH]cn1)NC(=O)[C@@H](NC(=O)[C@H](CS)NC(=O)[C@@H](NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](C)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H](NC(=O)[C@H](C)NC(=O)CNC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](C)NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@H](CS)NC(=O)[C@H](CO)NC(=O)CNC(=O)[C@H](CS)NC(=O)[C@H](CO)NC(=O)CNC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@@H](NC(=O)[C@H](CCC(N)=O)NC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](NC(=O)[C@@H]1CCCN1C(=O)[C@H](CS)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CO)NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](C)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](NC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H](NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)CNC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@H](CCCNC(=N)N)NC(=O)CNC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(C)C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1C(=O)CNC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CS)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCCNC(=N)N)NC(=O)CNC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@@H](NC(=O)[C@@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C
O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1C(=O)[C@H](Cc1c[nH]cn1)NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](Cc1c[nH]cn1)NC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CS)NC(=O)[C@H](CO)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CO)NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@@H](N)CCSC)[C@@H](C)CC)[C@@H](C)O)[C@@H](C)CC)[C@@H](C)CC)[C@@H](C)O)[C@@H](C)O)C(C)C)[C@@H](C)CC)[C@@H](C)O)C(C)C)C(C)C)[C@@H](C)CC)[C@@H](C)O)[C@@H](C)O)[C@@H](C)CC)[C@@H](C)CC)C(C)C)[C@@H](C)CC)[C@@H](C)O)[C@@H](C)CC)[C@@H](C)O)C(C)C)C(C)C)C(C)C)[C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CS)C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(=O)NCC(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)NCC(=O)N[C@H](C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@H](C(=O)N[C@H](C(=O)N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)N[C@H](C(=O)NCC(=O)N[C@@H](CCC(=O)O)C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)N[C@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](C)C(=O)NCC(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCCCN)
C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CO)C(=O)N[C@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@H](C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(=O)N1CCC[C@H]1C(=O)N[C@H](C(=O)N[C@@H](CCC(=O)O)C(=O)NCC(=O)N[C@@H](C)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@H](C(=O)N[C@H](C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)NCC(=O)N[C@H](C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](C)C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCSC)C(=O)NCC(=O)NCC(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@@H](C)C(=O)N[C@H](C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)NCC(=O)N[C@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(=N)N)C(=O)NCC(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@@H](CS)C(=O)NCC(=O)N[C@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@H](C(=O)N[C@H](C(=O)N[C@@H](C)C(=O)NCC(=O)N[C@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[
C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)NCC(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](Cc1ccccc1)C(=O)O)[C@@H](C)O)[C@@H](C)CC)C(C)C)[C@@H](C)CC)[C@@H](C)CC)[C@@H](C)CC)C(C)C)C(C)C)[C@@H](C)CC)[C@@H](C)CC)C(C)C)[C@@H](C)O)C(C)C)[C@@H](C)O)C(C)C)[C@@H](C)CC)[C@@H](C)CC)C(C)C)C(C)C)[C@@H](C)O)[C@@H](C)O)[C@@H](C)O)C(C)C)[C@@H](C)CC)[C@@H](C)O)C(C)C)C(C)C)C(C)C)[C@@H](C)O

Secondary Structure :

Method Prediction
GOR HHHHHHHHHHHHHHHHTTTCTTCCCCTTTHEEEETTTCCCEETTTTCEEECCHHTTTETEEEETCCCCCCEEEECCCCTCCHHHHHHHTTTTCCCECCEEETTTTTTEEHTHHHHHHHHHEEEECTTCEEEEECCTTHEEEEECETCTTCTTCCCCTCEEHHHTTTEEEEEEEEEEETCCCTCCCTTEEEETTCCCTTCCTTCCCTTTTTTTTTCCTTTTTTTEEEEEEEEETTTHHHHHHHHHTTCCCTTEEEEEHHTHHTTTTEHHHHTHHHHTCHHHEEEEEEETTTCCEEEHHTTCCTTTTCTTHHHHETTTTTTTHHHHEEECCCCCCTTTTEE
Chou-Fasman (CF) EEEEEEEEECCCEECCCCCCCCCCCHHHHEEEECCCEEEECCCCCCEEEEEEEHHHHEEEEECCCCCCEEEEHHHHHCCCCHHHHHHHCCCCEEEEECCCCCCCCCHHHHCHHHHHCCEEEEECCEEEEEEHHHHHHEEEEEEECCCCCCCCCCCCCCEEECCCEEEEEEEECEEEECEEEEECCCCCCEECCCCCCCCCCCCCCCCCHHHHCCCCCHHHHHHEEEEEEEEEECHHHHHHHHCCCCCCCCEEEEEECEEEECEEEEHHHHHHHCCCCCEEEEEEECCCCEEEECCCCCCCCCCCCCCCEECCCCCCCCCCCCCCEECEECCEEEEECCC
Neural Network (NN) HHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCHHCCCCCCCCCCCCCCCCCCHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCECCCCCCCCCCCCCCHHHCCCCEEECCCCCEEEEHHHHHHHHCCCCCCCCCCCCCCCCCCCEEHCCCCCECCCCEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEECCCCCHHHHHHHCCCCCCCCCEEEEECCEECCCCCCCCCCCCHHCCCCEEEEECCCCCCCCEEEHHCCCCCCCCCCCCEEEECCCCCCCCCCEEECCCCCCCCCCCCC
Joint/Consensus HHHHHHHHHHHHHHCCCCCCCCCCCCCCCCEEECCCCCCCCCCCCCCEEECCHHHHCEEEEECCCCCCCCEECCCCCCCCCHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCHHHHHCCCCEEECCCCCEEEEHHHHHHEEEEECCCCCCCCCCCCCCCEEECCCCCEEEEEEEEEEECCCCCCCCCCCCEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEEEEECCCHHHHHHHHCCCCCCCCEEEEECCCEECCCCCHHHHCHHHHCCCCEEEEEECCCCCCCEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCEECCCCCCCCCCCCC
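
The per-residue strings above use one letter per position (H helix, E strand, C coil, and T turn in the GOR row). A short Python sketch of two ways to work with them follows: it tallies state fractions and combines several predictions by per-position majority vote. The record does not describe how its Joint/Consensus row was derived, so the voting rule, the coil fallback and the function names are assumptions made for illustration only.

```python
from collections import Counter


def state_fractions(prediction):
    """Fraction of positions assigned to each state letter."""
    total = len(prediction)
    if total == 0:
        return {}
    return {state: n / total for state, n in Counter(prediction).items()}


def majority_vote(predictions, fallback="C"):
    """Combine equal-length prediction strings position by position.

    A state is kept only when more than half of the methods agree;
    otherwise the position falls back to coil (an assumed convention).
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    consensus = []
    for column in zip(*predictions):
        state, votes = Counter(column).most_common(1)[0]
        consensus.append(state if votes * 2 > len(column) else fallback)
    return "".join(consensus)
```

For example, `majority_vote(["HHEC", "HECC", "HCET"])` keeps H at the first position, where all three methods agree, and falls back to C at the second, where they all differ.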

Molecular Descriptors and ADMET Properties

Molecular Descriptors: Not available.

ADMET Properties: Not available.

Cross Referencing databases

CancerPPD : Not available

ApIAPDB : Not available

CancerPPD2 ID : Not available

Reference

1 : Gerhard DS, et al. The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC). Genome Res. 2004; 14:2121-7. doi: 10.1101/gr.2596504

2 : Qian F, et al. Characterization of multiple cathepsin B mRNAs in murine B16a melanoma. Anticancer Res. 1991; 11:1445-51.

3 : Park J, et al. SIRT5-mediated lysine desuccinylation impacts diverse metabolic pathways. Mol Cell. 2013; 50:919-30. doi: 10.1016/j.molcel.2013.06.001

4 : Qian F, et al. The structure of the mouse cathepsin B gene and its putative promoter. DNA Cell Biol. 1991; 10:159-68. doi: 10.1089/dna.1991.10.159

5 : Friedrichs B, et al. Thyroid functions of mouse cathepsins B, K, and L. J Clin Invest. 2003; 111:1733-45. doi: 10.1172/JCI15990

6 : Huttlin EL, et al. A tissue-specific atlas of mouse protein phosphorylation and expression. Cell. 2010; 143:1174-89. doi: 10.1016/j.cell.2010.12.001

7 : Chan SJ, et al. Nucleotide and predicted amino acid sequences of cloned human and mouse preprocathepsin B cDNAs. Proc Natl Acad Sci U S A. 1986; 83:7721-5. doi: 10.1073/pnas.83.20.7721

8 : Ferrara M, et al. Gene structure of mouse cathepsin B. FEBS Lett. 1990; 273:195-9. doi: 10.1016/0014-5793(90)81083-z

9 : Freimert C, et al. Isolation of a cathepsin B-encoding cDNA from murine osteogenic cells. Gene. 1991; 103:259-61. doi: 10.1016/0378-1119(91)90283-h

10 : Carninci P, et al. The transcriptional landscape of the mammalian genome. Science. 2005; 309:1559-63. doi: 10.1126/science.1112014

Literature

Paper title : The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC).

Doi : https://doi.org/10.1101/gr.2596504

Abstract : The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline.

Paper title : Characterization of multiple cathepsin B mRNAs in murine B16a melanoma.

Doi : Not available

Abstract : We have previously shown that the highly metastatic murine B16a melanoma expresses a high level of cathepsin B mRNA which is associated with three transcripts of 2.2, 4.0 and 5.0 kb, while in contrast only a single 2.2 kb cathepsin B RNA was detected in normal murine tissues. Using recombinant DNA techniques, cDNAs corresponding to these three transcripts have been isolated from a B16a melanoma cDNA library. Sequence analysis indicates that all three mRNA transcripts contain identical coding sequences for normal preprocathepsin B. However, the 4.0 and 5.0 kb transcripts contain unusually long extended 3' untranslated regions. These results suggest that the post-transcriptional processing pathway of the cathepsin B gene is modified in B16 melanomas. The results also indicate that the increased extracellular secretion of larger forms of cathepsin B by tumors is most likely due to post-translational mechanisms and does not involve alternative splicing or a coding mutation in the gene.

Paper title : SIRT5-mediated lysine desuccinylation impacts diverse metabolic pathways.

Doi : https://doi.org/10.1016/j.molcel.2013.06.001

Abstract : Protein function is regulated by diverse posttranslational modifications. The mitochondrial sirtuin SIRT5 removes malonyl and succinyl moieties from target lysines. The spectrum of protein substrates subject to these modifications is unknown. We report systematic profiling of the mammalian succinylome, identifying 2,565 succinylation sites on 779 proteins. Most of these do not overlap with acetylation sites, suggesting differential regulation of succinylation and acetylation. Our analysis reveals potential impacts of lysine succinylation on enzymes involved in mitochondrial metabolism; e.g., amino acid degradation, the tricarboxylic acid (TCA) cycle, and fatty acid metabolism. Lysine succinylation is also present on cytosolic and nuclear proteins; indeed, we show that a substantial fraction of SIRT5 is extramitochondrial. SIRT5 represses biochemical activity of, and cellular respiration through, two protein complexes identified in our analysis, pyruvate dehydrogenase complex and succinate dehydrogenase. Our data reveal widespread roles for lysine succinylation in regulating metabolism and potentially other cellular functions.

Paper title : The structure of the mouse cathepsin B gene and its putative promoter.

Doi : https://doi.org/10.1089/dna.1991.10.159

Abstract : The mouse cathepsin B gene and its flanking regions were cloned and characterized. The gene contains 10 exons and 9 introns spanning about 20 kb. Although the exon-intron organization of the mouse cathepsin B gene showed some similarity to the rat cathepsin H and L genes, significant differences were found. In particular, the highly conserved sequence that contains the catalytically active cysteine in these genes is split at different sites by an intron. As with other thiol proteinases, there is no obvious correspondence between the coding exons and structural or functional units within preprocathepsin B. These results suggest that the lysosomal thiol proteinase genes are evolutionarily ancient and that intron shifting has occurred subsequent to their divergence from a common ancestral form. The 5'-flanking region and exon 1 sequences in the mouse cathepsin B gene have a high GC content of approximately 72%. The 5'-flanking region also contains several potential Sp1 binding sites, but lacks TATA and CAAT motifs. These characteristics suggest that cathepsin B is a "housekeeping" gene and its transcription may be controlled by multiple transcription factors, including Sp1.

Paper title : Thyroid functions of mouse cathepsins B, K, and L.

Doi : https://doi.org/10.1172/JCI15990

Abstract : Thyroid function depends on processing of the prohormone thyroglobulin by sequential proteolytic events. From in vitro analysis it is known that cysteine proteinases mediate proteolytic processing of thyroglobulin. Here, we have analyzed mice with deficiencies in cathepsins B, K, L, B and K, or K and L in order to investigate which of the cysteine proteinases is most important for proteolytic processing of thyroglobulin in vivo. Immunolabeling demonstrated a rearrangement of the endocytic system and a redistribution of extracellularly located enzymes in thyroids of cathepsin-deficient mice. Cathepsin L was upregulated in thyroids of cathepsin K(-/-) or B(-/-)/K(-/-) mice, suggesting a compensation of cathepsin L for cathepsin K deficiency. Impaired proteolysis resulted in the persistence of thyroglobulin in the thyroids of mice with deficiencies in cathepsin B or L. The typical multilayered appearance of extracellularly stored thyroglobulin was retained in cathepsin K(-/-) mice only. These results suggest that cathepsins B and L are involved in the solubilization of thyroglobulin from its covalently cross-linked storage form. Cathepsin K(-/-)/L(-/-) mice had significantly reduced levels of free thyroxine, indicating that utilization of luminal thyroglobulin for thyroxine liberation is mediated by a combinatory action of cathepsins K and L.

Paper title : A tissue-specific atlas of mouse protein phosphorylation and expression.

Doi : https://doi.org/10.1016/j.cell.2010.12.001

Abstract : Although most tissues in an organism are genetically identical, the biochemistry of each is optimized to fulfill its unique physiological roles, with important consequences for human health and disease. Each tissue's unique physiology requires tightly regulated gene and protein expression coordinated by specialized, phosphorylation-dependent intracellular signaling. To better understand the role of phosphorylation in maintenance of physiological differences among tissues, we performed proteomic and phosphoproteomic characterizations of nine mouse tissues. We identified 12,039 proteins, including 6296 phosphoproteins harboring nearly 36,000 phosphorylation sites. Comparing protein abundances and phosphorylation levels revealed specialized, interconnected phosphorylation networks within each tissue while suggesting that many proteins are regulated by phosphorylation independently of their expression. Our data suggest that the "typical" phosphoprotein is widely expressed yet displays variable, often tissue-specific phosphorylation that tunes protein activity to the specific needs of each tissue. We offer this dataset as an online resource for the biological research community.

Paper title : Nucleotide and predicted amino acid sequences of cloned human and mouse preprocathepsin B cDNAs.

Doi : https://doi.org/10.1073/pnas.83.20.7721

Abstract : Cathepsin B is a lysosomal thiol proteinase that may have additional extralysosomal functions. To further our investigations on the structure, mode of biosynthesis, and intracellular sorting of this enzyme, we have determined the complete coding sequences for human and mouse preprocathepsin B by using cDNA clones isolated from human hepatoma and kidney phage libraries. The nucleotide sequences predict that the primary structure of preprocathepsin B contains 339 amino acids organized as follows: a 17-residue NH2-terminal prepeptide sequence followed by a 62-residue propeptide region, 254 residues in mature (single chain) cathepsin B, and a 6-residue extension at the COOH terminus. A comparison of procathepsin B sequences from three species (human, mouse, and rat) reveals that the homology between the propeptides is relatively conserved with a minimum of 68% sequence identity. In particular, two conserved sequences in the propeptide that may be functionally significant include a potential glycosylation site and the presence of a single cysteine at position 59. Comparative analysis of the three sequences also suggests that processing of procathepsin B is a multistep process, during which enzymatically active intermediate forms may be generated. The availability of the cDNA clones will facilitate the identification of possible active or inactive intermediate processive forms as well as studies on the transcriptional regulation of the cathepsin B gene.

Paper title : Gene structure of mouse cathepsin B.

Doi : https://doi.org/10.1016/0014-5793(90)81083-z

Abstract : The structure of a genomic DNA fragment encoding mouse cathepsin B was characterized. The genomic insert spans 15 kbp and contains 9 exons encoding the 339 amino acid residues of mouse preprocathepsin B. Intron break-points are not found at the junctions of the pre-peptide, pro-peptide and mature enzyme. Like other cysteine proteinase genes, the region around the cysteinyl active site is split by an intron, but in contrast with cathepsins L and H the intron break-point is located immediately after the active site.

Paper title : Isolation of a cathepsin B-encoding cDNA from murine osteogenic cells.

Doi : https://doi.org/10.1016/0378-1119(91)90283-h

Abstract : Cathepsin B-encoding cDNA (CTSB) clones have been isolated from a lambda gt10 library of a murine osteosarcoma by differential screening during a search for genes which are typically expressed during osteogenic differentiation in mouse mandibular condyles in vitro. Sequencing of the CTSB 3' end revealed that the isolated sequence contained an 825-bp 3'-noncoding region, the polyadenylation signal and the poly(A) tail. The enhanced CTSB expression during the early stages of the enchondral ossification-like process in mandibular condyles in vitro suggests that CTSB participates in the degradation of cartilage matrix prior to the synthesis of bone matrix proteins.

Paper title : The transcriptional landscape of the mammalian genome.

Doi : https://doi.org/10.1126/science.1112014

Abstract : This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development.