dbACP: A Comprehensive Database of Anti-Cancer Peptides

dbACP Help Center

Field Information about dbACP

Field Name | Description
Peptide Name | Name of the anticancer peptide.
Source/Organism | The origin or source organism of the peptide.
Linear/Cyclic | Specifies whether the peptide is linear or cyclic in structure.
Chirality | Indicates the stereochemistry of the peptide’s amino acids (D or L configuration).
Sequence | The amino acid sequence of the peptide.
C-terminal modification | Describes any chemical modifications present at the C-terminus of the peptide.
N-terminal modification | Describes any chemical modifications present at the N-terminus of the peptide.
Assay type | The type of assay or method used to determine the peptide's biological activity.
Assay time | The duration for which the peptide was tested in the assay.
Activity | The measured biological activity of the peptide, reported as an effective concentration (e.g., LC50).
Mechanism of action | Describes the biological mechanism through which the peptide exerts its activity.
Cell line | The specific cell line on which the peptide’s activity was tested.
Cancer type | The type of cancer targeted by the peptide during testing.
Other activity | Additional biological activities displayed by the peptide (e.g., antibacterial, antifungal).
PDB file format | A link to download the peptide structure in PDB format.
Peptide ADMET properties | A downloadable file providing the peptide's predicted ADMET properties.
Peptide molecular descriptors | A downloadable file providing the peptide's QSAR descriptors.
PubMed ID | The PubMed ID (PMID) of the scientific publication that describes the peptide’s properties and activities.
Amino acid percentages | The percentage of each amino acid in the peptide sequence.
Amino acid count | A count of each amino acid within the peptide sequence.
Missing amino acid | Amino acids that are absent from the peptide sequence.
Most occurring amino acid | The amino acid that appears most frequently in the peptide sequence.
Most occurring amino acid frequency | The number of times the most occurring amino acid appears in the sequence.
Least occurring amino acid | The amino acid that appears least frequently in the peptide sequence.
Least occurring amino acid frequency | The number of times the least occurring amino acid appears in the sequence.
Hydrophobic/Hydrophilic amino acid ratio | The ratio of hydrophobic to hydrophilic amino acids in the peptide sequence.
Molecular mass | The calculated molecular mass of the peptide.
Aliphatic index | A measure of the relative volume occupied by aliphatic side chains (alanine, valine, isoleucine, leucine) in the sequence.
Instability index | A computed value that estimates the stability of the peptide in a test tube environment.
Hydrophobicity (GRAVY) | The grand average of hydropathy (GRAVY) score, indicating the overall hydrophobicity of the peptide.
Isoelectric point | The pH at which the peptide carries no net electrical charge.
Hydrophobic moment | A measure of the amphipathicity of the peptide, i.e., how unevenly hydrophobic residues are distributed along the sequence.
Charge (pH:7) | The net charge of the peptide at a neutral pH of 7.
Aromaticity | The relative frequency of aromatic amino acids in the peptide sequence.
Molar extinction coefficient (cysteine, cystine) | The molar extinction coefficients of the peptide, one computed with cysteines reduced (cysteine) and one with cysteines paired as disulfides (cystine).
Secondary Structure fraction (Helix, Turn, Sheet) | The estimated fraction of each secondary structure element in the peptide.
SMILES Notation | The Simplified Molecular Input Line Entry System (SMILES) representation of the peptide.
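Several of the sequence-derived fields above follow directly from residue counts. The sketch below shows one way to reproduce them for a one-letter sequence. It is not necessarily the pipeline dbACP uses. The function name `sequence_summary`, the dictionary keys, and the example sequence are illustrative assumptions. GRAVY uses the Kyte-Doolittle hydropathy scale, the aliphatic index uses Ikai's formula, and aromaticity is taken as the fraction of F, W, and Y residues.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, averaged to obtain the GRAVY score.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sequence_summary(seq):
    """Composition-based properties of a peptide (standard residues only)."""
    seq = seq.upper()
    if not seq or any(aa not in KYTE_DOOLITTLE for aa in seq):
        raise ValueError("sequence must contain only standard one-letter codes")
    n = len(seq)
    counts = Counter(seq)
    percent = {aa: 100.0 * counts[aa] / n for aa in STANDARD}
    present = [aa for aa in STANDARD if counts[aa] > 0]
    # Ties are broken by alphabetical order of the one-letter code.
    most = max(present, key=lambda aa: counts[aa])
    least = min(present, key=lambda aa: counts[aa])
    return {
        "counts": {aa: counts[aa] for aa in STANDARD},
        "percentages": percent,
        "missing": [aa for aa in STANDARD if counts[aa] == 0],
        "most_common": most,
        "least_common": least,
        # GRAVY: mean Kyte-Doolittle hydropathy over all residues.
        "gravy": sum(KYTE_DOOLITTLE[aa] for aa in seq) / n,
        # Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
        # where X is the mole percent of each residue.
        "aliphatic_index": percent["A"] + 2.9 * percent["V"]
        + 3.9 * (percent["I"] + percent["L"]),
        # Aromaticity: fraction of Phe, Trp, and Tyr residues.
        "aromaticity": (counts["F"] + counts["W"] + counts["Y"]) / n,
    }


summary = sequence_summary("KLAKLAK")  # illustrative sequence
print(summary["most_common"], round(summary["gravy"], 3))
```

Values stored in dbACP may differ slightly if a different hydropathy scale or tie-breaking rule was used.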

FAQs (Frequently Asked Questions)

What is dbACP?
dbACP is a manually curated repository of experimentally validated anticancer peptides.
What are anticancer peptides?
Anticancer peptides are amphipathic and mainly cationic, often originating from antimicrobial peptides. They selectively target cancer cells by disrupting membranes or inducing apoptosis.
Why was dbACP created?
To catalog anticancer peptides that offer high specificity and tumor penetration and that are not impacted by resistance mechanisms.
What is unique about dbACP?
It includes detailed experimental data, 3D structures, and analysis tools related to anticancer peptides.
Does this database include all known anticancer peptides?
It is a first-round curated resource and is continuously updated with new literature findings.
Why search dbACP?
To access validated peptide data including activity, modifications, sources, and cell line targets.
How do I search dbACP?
Search by peptide name, sequence, assay type, activity, cell line, PMID, etc.
To whom can I report a discrepancy?
Visit the Contact page to directly reach out to the dbACP development team.

Abbreviations

Abbreviation | Full Form
sp. | Species
D amino acid | D-configuration amino acid (mirror image of the L form; the label denotes stereochemical configuration, not the direction of optical rotation)
L amino acid | L-configuration amino acid (the form found in ribosomally synthesized proteins)
IC50 | Half-maximal inhibitory concentration
MIC | Minimum inhibitory concentration
LD50 | Lethal dose 50 / Median lethal dose
LC50 | Lethal concentration 50 / Median lethal concentration
IC50 ± SD | IC50 value ± Standard deviation
EC50 | Half-maximal effective concentration
CC50 | 50% cytotoxic concentration
Kd | Dissociation constant
ED50 | Median effective dose
ID50 | Median infective dose
MTT assay | 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide assay
MTS assay | 3-(4,5-dimethylthiazol-2-yl)-5-(3-carboxymethoxyphenyl)-2-(4-sulfophenyl)-2H-tetrazolium assay
WST-1 assay | Water-soluble tetrazolium salt assay
LDH leakage assay | Lactate dehydrogenase leakage assay
CCK-8 assay | Cell Counting Kit-8 assay
TUNEL assay | Terminal deoxynucleotidyl transferase dUTP nick-end labeling assay
PES colorimetric assay | Phenazine ethosulfate (PES) colorimetric assay
SDS-PAGE assay | Sodium dodecyl sulfate–polyacrylamide gel electrophoresis assay
CNF assay | Cell nucleus fragmentation assay
SRB assay | Sulforhodamine B assay
ELISPOT assay | Enzyme-linked immunosorbent spot assay
XTT assay | 2,3-bis(2-methoxy-4-nitro-5-sulfophenyl)-2H-tetrazolium-5-carboxanilide assay
PI-uptake assay | Propidium iodide uptake assay
PMS assay | Phenazine methosulfate assay
GFP assay | Green fluorescent protein-based assay
EIA | Enzyme immunoassay
GST Pull-down assay | Glutathione S-transferase pull-down assay
PKA Kinase assay | cAMP-dependent protein kinase A assay
GST Competition assay | Glutathione S-transferase competition assay

ADMET Properties

What are ADMET properties and what kind of data does the file contain?
ADMET stands for Absorption, Distribution, Metabolism, Excretion, and Toxicity. These descriptors predict a compound's pharmacokinetic and toxicological profiles, which are essential in drug development. The ADMET file provides both the raw values of these properties and their percentiles relative to known drugs (typically DrugBank-approved compounds). The data can be used to analyze how “drug-like” a molecule is, as well as to predict its pharmacokinetics and potential toxicity.
What does the 'molecular_weight' value represent?
Molecular weight is the total mass of a molecule, measured in Daltons (g/mol). A higher value usually means a larger, potentially more complex molecule. It is important because drugs with very high molecular weights may have poor absorption and distribution in the body.
What does 'logP' indicate?
LogP is the logarithm of the partition coefficient between n-octanol and water. It is a measure of lipophilicity, indicating how hydrophobic (fat-loving) or hydrophilic (water-loving) a compound is. Negative values suggest good water solubility, while high positive values indicate high lipid solubility. LogP influences absorption, distribution, and the ability to cross cell membranes.
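As a small worked illustration of the definition, the sketch below computes logP from equilibrium concentrations in the two phases. The function name `log_p` and the example concentrations are hypothetical. The value in the dbACP file is a computed prediction, not a measurement of this kind.

```python
import math


def log_p(conc_octanol, conc_water):
    """log10 of the partition coefficient P = [solute]octanol / [solute]water."""
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# A solute 100 times more concentrated in octanol than in water is lipophilic.
print(log_p(100.0, 1.0))
# A solute 10 times more concentrated in water is hydrophilic (negative logP).
print(log_p(1.0, 10.0))
```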
What is the significance of 'hydrogen_bond_acceptors' and 'hydrogen_bond_donors'?
These values represent the count of hydrogen bond acceptors (typically nitrogen or oxygen atoms) and donors (hydrogen atoms attached to electronegative atoms like nitrogen or oxygen) in the molecule. These features affect solubility, permeability, and the molecule’s ability to form interactions with biological targets.
What does the 'Lipinski' value mean?
This is the count of Lipinski’s Rule of Five violations. Lipinski’s rules are guidelines for predicting oral bioavailability in humans. Ideally, a drug-like molecule has zero or one violation of these rules, so a value of 1 here indicates a single violation.
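The violation count described above can be sketched from the four physicochemical fields in the same file. This is a minimal illustration that uses the commonly cited Rule of Five thresholds. The function name `lipinski_violations` and the example values are hypothetical, and how the 'Lipinski' field in the file is actually computed is not specified here.

```python
def lipinski_violations(molecular_weight, logp, h_donors, h_acceptors):
    """Count Rule of Five violations from four physicochemical properties."""
    checks = [
        molecular_weight > 500,  # molecular weight above 500 Da
        logp > 5,                # logP above 5
        h_donors > 5,            # more than 5 hydrogen bond donors
        h_acceptors > 10,        # more than 10 hydrogen bond acceptors
    ]
    return sum(checks)


# Hypothetical small molecule: no violations.
print(lipinski_violations(180.2, 1.2, 1, 4))
# Hypothetical peptide-sized molecule: weight, donors, and acceptors all exceed the limits.
print(lipinski_violations(1200.0, -3.0, 18, 20))
```

Peptides frequently exceed several of these thresholds because of their size and backbone amide groups, so a nonzero count is expected for many entries.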
What is 'QED' in this context?
QED stands for Quantitative Estimate of Drug-likeness. It combines various molecular properties into a single score ranging from 0 (poor drug-likeness) to 1 (ideal drug-likeness). A low value suggests the molecule is less similar to typical drugs.
What does 'stereo_centers' mean?
This field represents the number of stereocenters (chiral centers) in the molecule. A chiral center is an atom, most often a carbon, attached to four different groups; the arrangement around it can influence a molecule’s biological activity and metabolism.
What is 'tpsa'?
TPSA stands for Topological Polar Surface Area, which is a measure of the molecule’s polar surface area in square angstroms. A higher TPSA generally correlates with lower cell membrane permeability and poorer absorption.
What is the 'AMES' value?
The AMES value reflects the result of an Ames test prediction, which assesses the mutagenic potential of a compound. It’s usually a probability or score, with higher values indicating higher mutagenicity risk.
What does 'BBB_Martins' mean?
BBB_Martins is a predictor for blood-brain barrier (BBB) penetration, using the Martins model. Values closer to 1 indicate a higher probability that the compound can cross the BBB and affect the central nervous system.
What does 'Bioavailability_Ma' indicate?
This is the predicted bioavailability score using the Ma model, usually reflecting the fraction of the drug that reaches systemic circulation unchanged. Higher scores suggest better oral bioavailability.
What are the 'CYP' fields (CYP1A2_Veith, CYP2C19_Veith, etc.)?
These values predict the likelihood that the molecule will interact with or be metabolized by various cytochrome P450 enzymes, which are critical for drug metabolism. The Veith models predict whether the compound is likely to inhibit each enzyme, while the substrate models predict whether it is likely to be a substrate. Both are important for predicting drug-drug interactions and metabolism.
What does 'Carcinogens_Lagunin' signify?
This value represents the predicted carcinogenicity (cancer-causing potential) of the molecule, based on the Lagunin model. Higher values may indicate a greater likelihood of carcinogenicity.
What does 'ClinTox' mean?
ClinTox is a prediction score for clinical toxicity, indicating the likelihood that the molecule will cause toxic effects in humans during clinical trials.
What is 'DILI'?
DILI stands for Drug-Induced Liver Injury. The value estimates the risk of the compound causing liver toxicity, which is a major reason for drug withdrawal from the market.
What does 'HIA_Hou' represent?
HIA_Hou is a prediction score for Human Intestinal Absorption, calculated using the Hou model. It indicates how well the compound can be absorbed from the gut after oral administration.
What do 'NR-AR', 'NR-AhR', 'NR-Aromatase', etc., represent?
These fields indicate predicted interactions with various nuclear receptors such as Androgen Receptor (AR), Aryl hydrocarbon Receptor (AhR), Aromatase, Estrogen Receptor (ER), and Peroxisome Proliferator-Activated Receptor gamma (PPAR-gamma). Positive predictions suggest potential off-target hormonal or toxic effects.
What is 'PAMPA_NCATS'?
PAMPA_NCATS refers to the Parallel Artificial Membrane Permeability Assay result as predicted by NCATS. It gives a proxy for passive cell membrane permeability, with higher values indicating better permeability.
What does 'Pgp_Broccatelli' indicate?
This value is the predicted probability that the compound inhibits P-glycoprotein (an efflux transporter), using the Broccatelli model. P-glycoprotein pumps many drugs out of cells, so an inhibitor can change the absorption and distribution of co-administered drugs.
What do the 'SR-ARE', 'SR-ATAD5', 'SR-HSE', 'SR-MMP', and 'SR-p53' fields indicate?
These values represent predicted activity on various stress response pathways: Antioxidant Response Element (ARE), ATPase family AAA domain-containing protein 5 (ATAD5, a marker of genotoxic stress), Heat Shock Element (HSE), Mitochondrial Membrane Potential (MMP), and the p53 tumor suppressor. They indicate potential mechanisms of toxicity.
What does 'Skin_Reaction' mean?
This is a prediction of the likelihood that the compound will cause a skin reaction, which is important for assessing safety in topical drugs.
What does 'hERG' stand for?
hERG refers to the human Ether-à-go-go-Related Gene, which encodes a potassium channel that is critical to cardiac safety. A compound that blocks this channel can cause dangerous cardiac side effects, such as arrhythmia.
What is 'Caco2_Wang'?
Caco2_Wang is the predicted permeability of the molecule in Caco-2 cell lines (human epithelial colorectal adenocarcinoma), which are used as an in vitro model of the intestinal barrier. Values indicate how well the molecule may be absorbed in the gut.
What do 'Clearance_Hepatocyte_AZ' and 'Clearance_Microsome_AZ' mean?
These fields represent predicted drug clearance rates in human hepatocytes and microsomes (from the AstraZeneca model). Clearance is a measure of how quickly a drug is removed from the body by metabolism.
What is 'Half_Life_Obach'?
Half_Life_Obach gives the predicted half-life of the compound, using the Obach model. The half-life is the time it takes for half of the drug to be eliminated from the bloodstream.
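To make the definition concrete, the sketch below computes the fraction of a drug remaining after a given time, assuming simple first-order elimination. The function name `fraction_remaining` and the example numbers are hypothetical, and real elimination kinetics can be more complex.

```python
def fraction_remaining(elapsed, half_life):
    """Fraction of drug left after `elapsed` time, given first-order elimination.

    `elapsed` and `half_life` must be in the same time unit.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (elapsed / half_life)


# With a 3-hour half-life, one quarter of the drug remains after 6 hours.
print(fraction_remaining(6.0, 3.0))
```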
What does 'HydrationFreeEnergy_FreeSolv' represent?
This value predicts the hydration free energy, indicating the energy change when a molecule moves from the gas phase into water. It affects solubility and bioavailability.
What is 'LD50_Zhu'?
LD50_Zhu is a predicted lethal dose (LD50) value for the compound, using the Zhu model. LD50 is the dose required to kill half the members of a tested population and is a standard measure of acute toxicity.
What does 'Lipophilicity_AstraZeneca' measure?
This field gives the lipophilicity value from the AstraZeneca model, similar to logP, helping predict absorption, distribution, and membrane penetration.
What does 'PPBR_AZ' mean?
PPBR_AZ stands for Plasma Protein Binding Rate predicted by AstraZeneca. A higher value means more of the drug binds to plasma proteins, leaving less free drug available for action.
What is 'Solubility_AqSolDB'?
Solubility_AqSolDB is the predicted solubility of the compound in water (aqueous solubility). Poor solubility can hinder absorption and effectiveness.
What does 'VDss_Lombardo' represent?
VDss_Lombardo is the predicted volume of distribution at steady-state, as estimated by the Lombardo model. It describes how extensively the drug spreads into body tissues relative to the plasma.
What do the fields ending with '_drugbank_approved_percentile' indicate?
These percentile fields show how the property of your molecule compares to approved drugs in the DrugBank database. For example, a value of 99 means the property is higher than 99% of approved drugs, while a value near 0 means it’s lower than almost all drugs. This helps contextualize whether a molecule is typical or unusual compared to known drugs.
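The idea of a percentile relative to a reference set can be sketched as follows. The function name `percentile_rank` and the reference values are hypothetical, and the exact convention used to build the file (for example, how ties are counted) is an assumption.

```python
from bisect import bisect_left


def percentile_rank(value, reference_values):
    """Percentage of reference values that are strictly below `value`."""
    ordered = sorted(reference_values)
    if not ordered:
        raise ValueError("reference_values must not be empty")
    # bisect_left returns how many sorted entries are smaller than `value`.
    return 100.0 * bisect_left(ordered, value) / len(ordered)


# Hypothetical molecular weights standing in for a reference drug set.
reference = [100.0, 200.0, 300.0, 400.0, 500.0]
print(percentile_rank(450.0, reference))  # higher than 4 of the 5 reference values
```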
How can this file be used in drug design and research?
By analyzing these descriptors and predictions, researchers can quickly assess whether a new molecule has favorable drug-like properties, potential toxicity, metabolic liabilities, or absorption issues. Comparing percentile ranks can also reveal how similar a molecule is to already approved drugs, guiding optimization and selection.

Peptide Molecular Descriptors

What are molecular descriptors?
Molecular descriptors numerically represent chemical information like structure, electronic distribution, and topology, crucial for QSAR modeling and compound screening.
What is MaxAbsEStateIndex?
MaxAbsEStateIndex stands for the maximum absolute electrotopological state (E-State) index among all atoms in the molecule. E-State indices quantify the electronic environment of atoms, helping in understanding reactivity and functional group properties.
What does MaxEStateIndex represent?
MaxEStateIndex is the highest E-State value observed for any atom in the molecule. It indicates which atom has the greatest electronic influence, which can be relevant for understanding how the molecule might interact chemically or biologically.
What is MinAbsEStateIndex?
MinAbsEStateIndex is the smallest absolute value of the E-State index among all atoms in the molecule. It highlights the atom that is electronically least distinct from a neutral environment within the molecule.
What does MinEStateIndex mean?
MinEStateIndex gives the lowest (most negative) E-State index in the molecule, pointing to atoms that might be electron-deficient or in a unique chemical environment.
What is qed?
qed stands for Quantitative Estimate of Drug-likeness. It combines several molecular descriptors into a single score between 0 and 1, with higher values indicating more drug-like properties.
What is SPS?
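SPS refers to the Spacial Score, an empirical measure of a molecule's three-dimensional structural complexity based on features such as atom hybridization, stereocenters, ring membership, and the number of heavy-atom neighbors. The reported value is normalized by the number of heavy atoms so that molecules of different sizes can be compared.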
What does MolWt stand for?
MolWt is the molecular weight of the compound, representing the sum of the atomic weights of all atoms in the molecule. It is usually measured in Daltons (g/mol).
What is HeavyAtomMolWt?
HeavyAtomMolWt is the sum of the atomic weights of all non-hydrogen atoms in the molecule. This helps focus on the core structure and properties that are less influenced by hydrogen atoms.
What does ExactMolWt mean?
ExactMolWt gives the exact molecular mass based on the most abundant isotopes of each atom, useful for precise mass spectrometry and chemical identification.
What is NumValenceElectrons?
NumValenceElectrons is the total count of valence electrons in the molecule, reflecting how many electrons are available for bonding and chemical reactions.
What is NumRadicalElectrons?
NumRadicalElectrons represents the number of unpaired electrons in the molecule, which is important for identifying radicals and understanding reactivity.
What does MaxPartialCharge represent?
MaxPartialCharge is the highest partial (atomic) charge in the molecule, showing which atom is most electron-poor or carries the most positive charge.
What does MinPartialCharge indicate?
MinPartialCharge is the lowest partial (atomic) charge in the molecule, indicating the atom that is most electron-rich or carries the most negative charge.
What is MaxAbsPartialCharge?
MaxAbsPartialCharge is the largest absolute value of atomic partial charge, pointing to the atom with the strongest charge (regardless of sign) within the molecule.
What does MinAbsPartialCharge mean?
MinAbsPartialCharge is the smallest absolute value of partial charge among the atoms, indicating an atom with a charge closest to neutral.
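The four partial-charge descriptors are simple summaries of the per-atom charges. The sketch below assumes the atomic partial charges have already been computed by some charge model. The function name `partial_charge_summary` and the example charges are hypothetical.

```python
def partial_charge_summary(charges):
    """Summarize a list of per-atom partial charges into four descriptors."""
    if not charges:
        raise ValueError("charges must not be empty")
    magnitudes = [abs(q) for q in charges]
    return {
        "MaxPartialCharge": max(charges),        # most positive atom
        "MinPartialCharge": min(charges),        # most negative atom
        "MaxAbsPartialCharge": max(magnitudes),  # strongest charge, either sign
        "MinAbsPartialCharge": min(magnitudes),  # atom closest to neutral
    }


# Hypothetical partial charges for a four-atom fragment.
print(partial_charge_summary([0.25, -0.40, 0.05, -0.10]))
```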
What are FpDensityMorgan1, FpDensityMorgan2, and FpDensityMorgan3?
FpDensityMorgan1, FpDensityMorgan2, and FpDensityMorgan3 are densities of Morgan fingerprints at different radii (1, 2, and 3), i.e., the number of distinct fingerprint features relative to the number of heavy atoms. Morgan fingerprints encode the local environment around each atom and are widely used in cheminformatics and virtual screening.
What are BCUT2D descriptors (BCUT2D_MWHI, BCUT2D_MWLOW, etc.)?
BCUT2D descriptors are the highest and lowest eigenvalues of a Burden connectivity matrix whose diagonal is weighted by an atomic property (mass, partial charge, logP contribution, or molar refractivity contribution). They are used in molecular similarity calculations and QSAR modeling.
What does AvgIpc mean?
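AvgIpc is the averaged form of Ipc, the information content of the coefficients of the characteristic polynomial of the adjacency matrix of the hydrogen-suppressed molecular graph. It measures the complexity and information density of a molecular structure and is often used in cheminformatics for comparing molecular complexity.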
What is BalabanJ?
BalabanJ is the Balaban index, a topological index measuring the connectivity of a molecule’s atoms. It is used for assessing molecular branching and complexity.
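For intuition, the sketch below computes the Balaban index on a plain, unweighted, hydrogen-suppressed graph using the textbook formula J = m / (γ + 1) · Σ (s_i · s_j)^(-1/2), where m is the number of bonds, γ is the number of rings (cyclomatic number), and s_i is the sum of topological distances from atom i to all other atoms. The function name `balaban_j` and the graph encoding are illustrative assumptions. Cheminformatics toolkits may additionally weight bonds and atoms, so values in the descriptor file can differ from this simplified version.

```python
from collections import deque
from math import sqrt


def balaban_j(n_atoms, bonds):
    """Balaban J for a connected, unweighted graph with atoms 0..n_atoms-1."""
    neighbors = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    def distance_sum(start):
        # Breadth-first search gives shortest path lengths in bonds.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        return sum(dist.values())

    sums = [distance_sum(i) for i in range(n_atoms)]
    m = len(bonds)
    cyclomatic = m - n_atoms + 1  # number of independent rings
    edge_terms = sum(1.0 / sqrt(sums[a] * sums[b]) for a, b in bonds)
    return m / (cyclomatic + 1) * edge_terms


# Three-atom chain (propane skeleton) and six-membered ring (cyclohexane skeleton).
print(balaban_j(3, [(0, 1), (1, 2)]))
print(balaban_j(6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]))
```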
What is BertzCT?
BertzCT is the Bertz complexity index, which provides a quantitative measure of a molecule’s structural complexity based on atom types and connections.
What are Chi indices (Chi0, Chi1, Chi2, etc.)?
Chi indices, including Chi0, Chi1, Chi2, and their n and v variants, are connectivity indices that capture molecular branching and electronic structure. They are widely used in QSAR and structure-activity modeling.
What is HallKierAlpha?
HallKierAlpha is the Hall-Kier alpha value, a correction term that accounts for the size and hybridization of each atom relative to an sp3 carbon. It is used to adjust the Kappa shape indices for heteroatoms and unsaturation.
What does Ipc stand for?
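Ipc is the information content of the coefficients of the characteristic polynomial of the adjacency matrix of the hydrogen-suppressed molecular graph. It is another metric for molecular complexity, reflecting the diversity and arrangement of atoms in a molecule; unlike AvgIpc, it is a total rather than an averaged value.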
What are Kappa indices (Kappa1, Kappa2, Kappa3)?
Kappa indices measure molecular shape, flexibility, and size. Kappa1, Kappa2, and Kappa3 provide information on the linearity or cyclicity of the molecule’s structure.
What is LabuteASA?
LabuteASA stands for Labute’s Approximate Surface Area, an estimation of the molecule’s solvent-accessible surface area, useful in modeling interactions with biological molecules.
What do PEOE_VSA, SMR_VSA, SlogP_VSA descriptors represent?
These descriptors (PEOE_VSA, SMR_VSA, SlogP_VSA) quantify the surface areas of the molecule associated with particular atomic properties, such as partial charge (PEOE), molar refractivity (SMR), and logP (SlogP). They are useful for modeling molecular recognition and predicting ADMET properties.
What is TPSA?
TPSA stands for Topological Polar Surface Area, which measures the surface area of polar atoms (typically oxygen and nitrogen) and their attached hydrogens. High TPSA can indicate poor cell membrane permeability.
What are EState_VSA and VSA_EState descriptors?
EState_VSA and VSA_EState descriptors combine surface area calculations with electrotopological state indices, helping to capture how the distribution of electronic properties influences molecular interactions.
What is FractionCSP3?
FractionCSP3 is the fraction of carbon atoms in the molecule that are sp3 hybridized, which is related to molecular saturation and three-dimensionality.
What is HeavyAtomCount?
HeavyAtomCount is the number of non-hydrogen atoms in the molecule, a fundamental measure of size and complexity.
What is NHOHCount?
NHOHCount gives the total number of -NH or -OH groups in the molecule, which are important for hydrogen bonding and solubility.
What does NOCount mean?
NOCount represents the number of nitrogen and oxygen atoms present in the molecule, indicating the presence of heteroatoms which often play key roles in biological activity.
What are NumAliphaticCarbocycles, NumAromaticHeterocycles, and related ring descriptors?
These fields enumerate the number of different ring systems in a molecule, including aliphatic carbocycles, aromatic heterocycles, saturated rings, and so on. These properties influence molecular rigidity, planarity, and bioactivity.
What do NumHAcceptors and NumHDonors mean?
NumHAcceptors is the number of hydrogen bond acceptor atoms (such as N or O), while NumHDonors is the number of hydrogen bond donor groups (-NH or -OH). Both are critical for predicting solubility and molecular interactions in biological systems.
What is NumHeteroatoms?
NumHeteroatoms counts the number of non-carbon and non-hydrogen atoms in the molecule. Heteroatoms (like N, O, S) often confer specific chemical or biological properties.
What is NumRotatableBonds?
NumRotatableBonds is the count of bonds that allow free rotation, reflecting the molecule’s flexibility. More rotatable bonds typically mean greater conformational freedom but can also reduce drug-likeness.
What is MolLogP?
MolLogP is the calculated octanol-water partition coefficient, another measure of lipophilicity, which influences membrane permeability and solubility.
What does MolMR stand for?
MolMR stands for molecular molar refractivity, which is related to the molecule’s size and polarizability and can impact how it interacts with light and with biological targets.
What do the 'fr_' descriptors represent?
The 'fr_' descriptors indicate the presence or count of specific functional groups or substructures in the molecule, such as amide, aldehyde, benzene, pyridine, and others. They are helpful for rapidly profiling molecular features relevant to biological activity or synthetic accessibility.
How does this file help in drug design and research?
This file serves as a comprehensive molecular fingerprint, summarizing a wide array of physicochemical and structural properties for a compound. In drug design and research, these descriptors are vital for rapidly screening and comparing molecules, predicting their pharmacokinetic behaviors, and assessing their suitability as drug candidates. By analyzing features like molecular weight, lipophilicity, polar surface area, ring systems, hydrogen bond capacity, and specific substructures, researchers can estimate a compound’s solubility, membrane permeability, metabolic stability, and likelihood of off-target effects. This systematic profiling accelerates the selection of molecules with optimal drug-like properties, helps avoid liabilities such as toxicity or poor bioavailability, and supports machine learning models for virtual screening, lead optimization, and quantitative structure–activity relationship (QSAR) studies. Ultimately, the information contained in this file enables more efficient, data-driven decision-making throughout the drug discovery pipeline.