Design and screening of HepG2 cancer cell line inhibitors from Triterpenoid derivatives of Paramignya Trimera
- Faculty of Pharmacy, Ho Chi Minh City University of Technology, 475A Dien Bien Phu Street, Binh Thanh District, Ho Chi Minh City, Viet Nam
- Institute of Pharmaceutical Education and Research, Binh Duong University, 504 Binh Duong Avenue, Thu Dau Mot City, Binh Duong, Viet Nam
- Faculty of Chemical Engineering, Industrial University of Ho Chi Minh City, 12 Nguyen Van Bao Street, Go Vap District, Ho Chi Minh City, Viet Nam
Abstract
Artificial intelligence (AI) now supports work in almost every field. In the pharmaceutical industry, and in drug development in particular, in silico models have emerged as powerful platforms for designing new drugs. The aim of this project is to develop new anti-cancer agents by designing novel Triterpenoid derivatives from Paramignya Trimera and predicting their efficacy against the Bcl-2 target receptor. The project used three main in silico models: QSARMLR, QSARPCR and QSARANN. The models were used to estimate IC50 values for the novel derivatives and for Escin extracted from Paramignya Trimera. Finally, the derivatives with favorable predicted values were docked to the Bcl-2 receptor to assess their binding. As a result, 196 new compounds were designed by combining the Triterpenoid structural framework with potential substituents. Screening by Veber's rule then identified 138 substances that met the drug-likeness requirements. The QSARMLR, QSARPCR and QSARANN models were built successfully, with the following statistics: R2 = 0.849, R2adj = 0.826, Q2LOO = 0.789 for the QSARMLR model; R2 = 0.860, R2adj = 0.831, Q2LOO = 0.805 for the QSARPCR model; and the best results for the QSARANN model, with R2train = 0.941, R2test = 0.915, R2cv = 0.912. These models can help predict the effectiveness of newly designed compounds. In this study, 20 compounds were predicted to be more active than Escin. Molecular docking on the Bcl-2 receptor found that T.new7 gave the most promising results, with a binding energy E_binding = -7.933 (kcal.mol-1) and RMSD = 1.915 (Å). The research achieved its goal by identifying T.new7, a newly designed compound with better predicted anti-cancer ability than natural Escin.
INTRODUCTION
According to GLOBOCAN 2020, liver cancer is one of the five deadliest cancers, with a high number of new cases and deaths. Figure 1 shows that liver cancer caused the third highest number of cancer deaths in the world and the highest number in Vietnam [1]. Worldwide, liver cancer is the third most common cause of cancer death, accounting for 8.3% of cancer deaths, following lung and colorectal cancer. In Vietnam, liver cancer is the leading cause of cancer death, accounting for 20.6% of all cancer deaths. Liver cancer not only causes hundreds of thousands of deaths annually but also imposes a significant socioeconomic burden. The search for compounds active against liver cancer is therefore urgent, both for patients and for society in general.

Estimated number of deaths by cancer in 2020; World & Vietnam; both sexes; all ages. Source: GLOBOCAN
Liver cancer is cancer that starts in the cells of the liver. The liver is a football-sized organ located in the upper right quadrant of the abdomen, beneath the diaphragm and above the stomach. Several types of cancer can develop in the liver. Hepatocellular carcinoma (HCC), the most common type, begins in the main type of liver cell (the hepatocyte). Cancer that spreads to the liver is more common than cancer that begins in liver cells. Cancer that develops in another part of the body, such as the colon, lung, or breast, and spreads to the liver is referred to as metastatic cancer rather than liver cancer. Such cancer is named after the organ in which it began; for example, metastatic colon cancer describes cancer that begins in the colon and travels to the liver.
Triterpenes are a class of terpenes made up of six isoprene units, with the chemical formula C30H48. They can alternatively be thought of as three terpene units. Triterpenes are produced by animals, plants, and fungi and include squalene, the precursor to all steroids. Triterpenes have a wide range of structures: almost 200 distinct skeletons have been identified. These skeletons can be roughly classified by the number of rings present, and pentacyclic structures (5 rings) generally predominate. One of the uses of Triterpenoids in the human body is to help prevent and treat cancer and to combat cancer metastasis. According to a 2011 study by Wachtel-Galor and colleagues, triterpenoids from Ganoderma mushrooms have anticancer effects in vivo in animal (mouse) studies. The study also indicated that these ingredients contain active substances that help prevent cancer cells from growing in vitro (in the test tube). Triterpenoids thus help inhibit many types of cancer cells, such as lung cancer, breast cancer, and skin cancer cells. Cancer metastasis itself is a complicated process: cancer cells separate from the primary tumor and move to other parts of the body, where small secondary tumors form [2].
Triterpenoids are of interest because of their anti-inflammatory and analgesic properties and especially because of their activity against cancer cell lines, including HepG2 cells. Artificial intelligence facilitates the creation of virtual screening models for derivative compounds. This study was designed to explore a synthetic compound with superior cancer-fighting properties compared to the natural substance found in Paramignya trimera. Escin (Figure 3), a triterpenoid of the oleanolic acid (OA) subgroup, is extracted from Paramignya trimera (Figure 2), a member of the family Rutaceae. OA has various benefits, including anti-inflammatory, antiviral, and hypoglycemic effects, and has potential for use against cancer cells. A large number of Triterpenoids are active against various human cancer cell lines, such as HepG2 and SMMC-7721 (hepatocellular carcinoma), HL-60 (leukemia), A549 (lung carcinoma), MCF-7 (breast cancer), and SW-480 (colon carcinoma) [3].

(a) Image of
Furthermore, oleanolic acid (OA) affects cancer cells via many routes. Inhibiting the Bcl-2 receptor is one strategy that promotes the death of OA-treated HepG2 cancer cells. The Bcl-2 receptor was therefore chosen as the target of interest. This research used predictive algorithms to identify novel synthetic compounds that are more effective at inhibiting HepG2 cells. The work focused on developing three models: QSARMLR, QSARPCR, and QSARANN. Virtual screening procedures save time, money, and human resources. Compared to experimental trials, this approach is expected to speed up compound screening in research and in the development of new medicines.
METHODOLOGY
Data mining from experiments
The data collected from the experiments were divided into 2 datasets: the training subset and the external evaluation subset. The two subsets are completely independent. Compounds were included only if they have a Triterpenoid framework and were tested on HepG2 hepatocellular carcinoma cells with a reported IC50 value.
Design of new compounds
Two sites, R1 and R2 (Figure 3), in the structural frame were selected for the attachment of substituents via the maximum design method. The substituent set comprises 14 groups, labeled T1 to T14, that have been shown to have anticancer activity. In total, 196 novel compounds were designed using the maximum design technique. This multilevel design method is used in drug design to generate a list of candidate compounds from a set of factors and the levels available for each factor [6]. Enumerating every combination limits the possibility of missing significant compounds, and it provides a complete set of possible designs when combining the factors (the 2 positions selected on the Escin structural frame) with the corresponding levels (the 14 functional groups T1 to T14), that is, 14 × 14 = 196 compounds.
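The enumeration described above can be sketched in a few lines. This is only an illustration of the counting, assuming that both sites may carry the same group; the labels T1 to T14 come from the text, while the variable names are hypothetical and no chemical structures are encoded.

```python
from itertools import product

# Substituent labels T1..T14 as named in the text (structures are not encoded here).
substituents = [f"T{i}" for i in range(1, 15)]

# Every (R1, R2) pair, including pairs where both sites carry the same group.
designs = [{"R1": r1, "R2": r2} for r1, r2 in product(substituents, repeat=2)]

print(len(designs))  # 196
```

Each dictionary stands for one designed molecule; in practice the labels would be mapped to actual substituent structures before optimization.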

Structural framework for designing new compounds

The substitution groups used in the design of new molecules
Optimization of the structures
New derivatives were drawn using the ChemDraw program. All novel and experimental derivatives were then structurally optimized in two stages: molecular mechanics in HyperChem with the MM+ force field and a gradient level of 0.05, followed by quantum mechanics in MOPAC with the semiempirical PM7 method. This procedure determines the most stable structure of each molecule and provides descriptive variables, including partial charge, HOMO, LUMO, MW, DH, and so on.
Calculation of molecular descriptors
After the structural optimization procedure, MOE software was used to determine the molecular descriptors for all the datasets. Descriptors from 0D to 3D were selected and calculated. Once the results were obtained, a variable screening procedure was used to exclude variables that were not important. Combined with the descriptor variables produced by the structural optimization process, this gives a dataset of descriptive characteristics for each molecule, from which the QSAR models can be built.
Estimation of QSAR models
This study focused on the development of three QSAR models: multivariate linear regression (MLR), principal component regression (PCR), and artificial neural network (ANN) models.
QSARMLR model
The QSAR model predicts the dependent variable Y based on the values of two or more independent variables X. The model is represented as follows:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + e \quad (1)$$
Here, Y is the dependent variable, and β0, β1, β2, …, βk are the regression parameters of the model. X1, …, Xk are the independent variables (k is the number of variables), and e is the random error. In this study, the dependent variable was the IC50 value and the independent variables were the molecular descriptors [22]. Regression 2008 [23] software was used to construct the QSARMLR model.
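As an illustration of how a model of the form (1) is fitted and scored, the following minimal sketch estimates the regression parameters by ordinary least squares and computes R2, R2adj, PRESS and the leave-one-out Q2. It is not the Regression 2008 implementation; the function names and the small toy dataset are hypothetical, and the statistics use their standard textbook definitions.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_mlr(X, y):
    """Least-squares estimates [b0, b1, ..., bk] from the normal equations."""
    Z = [[1.0] + list(row) for row in X]  # prepend the intercept column
    p = len(Z[0])
    ZtZ = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    Zty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    return solve(ZtZ, Zty)

def predict(beta, row):
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))

def mlr_statistics(X, y):
    """Return (beta, R2, R2adj, Q2LOO, PRESS) for an MLR model."""
    n, k = len(y), len(X[0])
    beta = fit_mlr(X, y)
    mean_y = sum(y) / n
    sst = sum((yi - mean_y) ** 2 for yi in y)
    sse = sum((yi - predict(beta, row)) ** 2 for row, yi in zip(X, y))
    press = 0.0
    for i in range(n):  # leave one observation out, refit, predict it
        b_loo = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(b_loo, X[i])) ** 2
    r2 = 1.0 - sse / sst
    r2adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    q2 = 1.0 - press / sst
    return beta, r2, r2adj, q2, press

# Hypothetical toy data: two descriptors, y close to 1 + 2*x1 + 3*x2.
X_toy = [[0, 1], [1, 0], [2, 1], [3, 3], [4, 2], [5, 5]]
y_toy = [4.1, 2.9, 8.2, 15.8, 15.1, 25.9]
beta, r2, r2adj, q2, press = mlr_statistics(X_toy, y_toy)
print([round(b, 3) for b in beta], round(r2, 4), round(r2adj, 4), round(q2, 4))
```

Because each left-out point is predicted by a model that never saw it, Q2LOO cannot exceed R2, which is why it serves as the more conservative measure of predictive ability.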
QSARPCR model
Consider the set {X, Y}, where X is a data matrix with m observations and n variables and Y is the dependent variable. The data are used as collected, without prior processing. In principal component regression, the outcome Y is not regressed directly on X but on the principal components of X [22]. The XLSTAT 2016 [24] program was used to create the QSARPCR model.
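For readers unfamiliar with the method, the relationship between Y, the principal components and the original variables can be written as follows. This is the standard textbook formulation, not taken from the XLSTAT documentation; the symbols a (number of retained components), P_a, T and γ are notation introduced here, and X is assumed to be column-centered.

$$
\begin{aligned}
T &= X P_a, \qquad X \approx T P_a^{\top},\\
Y &= \gamma_0 + T\gamma + e, \qquad \hat{\gamma} = (T^{\top}T)^{-1}T^{\top}Y,\\
\hat{\beta} &= P_a\,\hat{\gamma}.
\end{aligned}
$$

Here P_a (n × a) holds the first a eigenvectors of XᵀX, T (m × a) holds the corresponding scores, and the last line maps the component coefficients back to the original descriptors, so the fitted model can still be reported as an equation in the descriptor variables.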
QSARANN model
Artificial neural networks (ANNs) mimic the learning process of the human brain [22]. The structure of an artificial neural network I()-HL()-O() includes the following: the input layer I() holds the descriptive variables of the QSAR model, the output layer O() is the IC50 value, and the hidden layer HL() is varied to determine the best QSARANN model [25]. The QSARANN model was trained with the MATLAB 2016 [26] tool.
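To make the I()-HL()-O() notation concrete, the sketch below runs one forward pass through a network with a single hidden layer. It is a minimal illustration, not the MATLAB implementation: the transfer function names follow the Logsig, Tansig and Purelin functions mentioned later in the text, the output neuron is assumed to be linear, and all weights, biases and inputs are hypothetical placeholders rather than trained values.

```python
import math

def logsig(x):
    """Logistic sigmoid transfer function, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tansig(x):
    """Hyperbolic tangent transfer function, output in (-1, 1)."""
    return math.tanh(x)

def purelin(x):
    """Linear transfer function."""
    return x

def forward(x, w_hidden, b_hidden, w_out, b_out, transfer):
    """One forward pass: inputs -> hidden layer -> single linear output."""
    hidden = [
        transfer(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Hypothetical I(8)-HL(6)-O(1) network with placeholder weights.
n_in, n_hidden = 8, 6
w_hidden = [[0.1] * n_in for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [0.1] * n_hidden
b_out = 0.0

descriptors = [1.0] * n_in  # placeholder descriptor vector
y_purelin = forward(descriptors, w_hidden, b_hidden, w_out, b_out, purelin)
y_logsig = forward(descriptors, w_hidden, b_hidden, w_out, b_out, logsig)
print(y_purelin, y_logsig)
```

Training consists of adjusting the weights and biases so that the output matches the experimental IC50 values; only the evaluation step is shown here.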
Drug-likeness
Lipinski's rule of five, the earliest and best-known rule for identifying substances with good oral absorption, was proposed in 1997 [27]. Since then, several analogous rules based on molecular characteristics, such as those given by Ghose [28] and Veber [29], have been established. According to Veber's rule, substances in this study must meet two criteria: rotatable bonds (nRB) ≤ 10 and polar surface area (tPSA) ≤ 140 Å². Screening by this rule aims to find compounds that have the potential to become effective oral drugs. Reduced molecular flexibility, as measured by the number of rotatable bonds, and a low polar surface area or total hydrogen bond count (sum of donors and acceptors) are important predictors of good oral bioavailability. A reduced polar surface area correlates better with an increased permeation rate than does lipophilicity (C log P), and an increased rotatable bond count has a negative effect on the permeation rate [29].
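The two Veber criteria translate directly into a filter. The sketch below assumes that nRB and tPSA have already been computed by a descriptor package; the compound names and values are hypothetical placeholders, not data from this study.

```python
def passes_veber(n_rot_bonds, tpsa):
    """Veber's rule as stated above: nRB <= 10 and tPSA <= 140 (in square angstroms)."""
    return n_rot_bonds <= 10 and tpsa <= 140.0

# Hypothetical descriptor values for three placeholder compounds.
candidates = [
    {"name": "cpd_A", "nRB": 6, "tPSA": 95.2},
    {"name": "cpd_B", "nRB": 12, "tPSA": 110.0},  # too flexible
    {"name": "cpd_C", "nRB": 8, "tPSA": 152.4},   # too polar
]

drug_like = [c["name"] for c in candidates if passes_veber(c["nRB"], c["tPSA"])]
print(drug_like)  # ['cpd_A']
```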
Bioactivity prediction
The three QSAR models were used to predict the bioactivity, expressed as IC50, of the newly designed compounds and of Escin, a natural triterpenoid derived from Paramignya trimera. Then, using Escin as a reference, we looked for derivatives with greater predicted biological activity than Escin. In this way, the study predicts and identifies derivatives with higher bioactivity than the natural compound and with the potential to become medications.
Molecular Docking
The main goal of molecular docking is to understand and predict molecular recognition, both structurally (finding likely binding modes) and energetically (predicting binding affinity). The applications of molecular docking are very diverse and include structure-activity studies, optimization, and searches for potential molecules via virtual screening [30]. In this study, we used the MOE 2019 package to perform the molecular docking process.
Escin, a Triterpenoid derived from Paramignya trimera, belongs to the oleanolic acid group. Compounds of this group affect cancer cells in a variety of ways, including inducing programmed cell death, controlling the cell cycle, and killing cancer cells. Bcl-2 normally prevents programmed cell death (apoptosis). Bcl-2 was therefore chosen as the target in this investigation, since inhibiting the Bcl-2 receptor promotes the programmed death of cancer cells [31]. The Bcl-2 receptor structure, PDB code 4D2M, was obtained from the Protein Data Bank (PDB) [32].
RESULTS
The training and test datasets
Seventy-four compounds with experimental half-maximal inhibitory concentration (IC50) values were gathered from articles published in reputable journals and PubMed. The data were used to develop the models and to evaluate them externally. The training set of 60 compounds was used to develop the QSAR models, and the external validation set of 14 compounds was used to assess the ability of the models to predict biological activity.

Summary of the 60 experimental IC50 values in the training datasets
Design of the new compound
Using the multilevel design method and the ChemDraw tool, a total of 196 novel molecules were obtained. All of these derivatives were optimized first by molecular mechanics and then by PM7 quantum mechanics. Their descriptors were calculated in the following phase.
Optimization of the structure and calculation of descriptors
All the compounds, both experimental and newly designed, were subjected to structural optimization and molecular descriptor computation. This yielded 310 descriptors for each molecule, which were used to construct the QSAR models.
Construction of models
Results of building the QSARMLR model
| k | Variables | R2 | R2adj | Q2LOO | SE | Fstat | PRESS |
|---|---|---|---|---|---|---|---|
| 1 | x1 | 0.250 | 0.233 | 0.192 | 4.139 | 18.890 | 1064.026 |
| 2 | x1, x2 | 0.626 | 0.613 | 0.569 | 2.938 | 47.792 | 568.351 |
| 3 | x1, x2, x3 | 0.710 | 0.694 | 0.651 | 2.613 | 45.626 | 460.183 |
| 4 | x1, x2, x3, x4 | 0.756 | 0.738 | 0.693 | 2.418 | 42.577 | 403.963 |
| 5 | x1, x2, x3, x4, x5 | 0.785 | 0.765 | 0.720 | 2.291 | 39.363 | 368.313 |
| 6 | x1, x2, x3, x4, x5, x6 | 0.814 | 0.793 | 0.752 | 2.150 | 38.628 | 326.307 |
| 7 | x1, x2, x3, x4, x5, x6, x7 | 0.829 | 0.805 | 0.764 | 2.084 | 35.894 | 310.581 |
| 8 | x1, x2, x3, x4, x5, x6, x7, x8 | 0.849 | 0.826 | 0.789 | 1.972 | 35.944 | 277.383 |

Notation of molecular descriptors

| Descriptor | Meaning | Notation | Descriptor | Meaning | Notation |
|---|---|---|---|---|---|
| LUMO | Lowest unoccupied molecular orbital | x1 | vsurf_CW4 | Capacity factor at -2.0 | x5 |
| PEOE_RPC- | Relative negative partial charge | x2 | SlogP_VSA3 | Bin 3 SlogP (0.00, 0.10] | x6 |
| 21C | Partial charge of C position number 21 | x3 | vsurf_EWmin1 | Lowest hydrophilic energy | x7 |
| vsurf_DD13 | vsurf_EDmin1, vsurf_EDmin3 distance | x4 | SlogP_VSA4 | Bin 4 SlogP (0.10, 0.15] | x8 |
The model with R2 = 0.849, R2adj = 0.826, and Q2LOO = 0.789 yields the following equation:
$$\mathrm{IC}_{50}\ (\mu\mathrm{M}) = 3.739 + 2.247\,x_1 + 85.94\,x_2 + 24.36\,x_3 + 0.156\,x_4 + 5.24\,x_5 - 0.03695\,x_6 + 0.933\,x_7 - 0.131\,x_8 \quad (2)$$
Construction of the QSARPCR model
The QSARPCR model was built from the variables of the QSARMLR model and yielded the following results: R2 = 0.860, R2adj = 0.831, and Q2LOO = 0.805. The equation is as follows:
$$\mathrm{IC}_{50}\ (\mu\mathrm{M}) = 8.406 + 2.48\,x_1 + 73.319\,x_2 + 25.836\,x_3 + 0.135\,x_4 + 3.965\,x_5 - 0.037\,x_6 + 1.250\,x_7 - 0.130\,x_8 \quad (3)$$
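Equations (2) and (3) can be applied to any compound once its eight descriptors x1 to x8 are known. The sketch below encodes the coefficients exactly as reported; the descriptor vector is a hypothetical placeholder chosen only to exercise the code, and the function and variable names are illustrative.

```python
# Coefficients as reported: intercept followed by the weights of x1..x8.
COEFFS = {
    "QSARMLR": [3.739, 2.247, 85.94, 24.36, 0.156, 5.24, -0.03695, 0.933, -0.131],
    "QSARPCR": [8.406, 2.48, 73.319, 25.836, 0.135, 3.965, -0.037, 1.250, -0.130],
}

def predict_ic50(model, x):
    """Evaluate the linear equation of the chosen model; x holds x1..x8 in order."""
    if len(x) != 8:
        raise ValueError("expected 8 descriptor values (x1..x8)")
    intercept, *weights = COEFFS[model]
    return intercept + sum(w * xi for w, xi in zip(weights, x))

# Hypothetical descriptor vector (placeholder values, not a real compound).
x_example = [1.0] * 8
for name in COEFFS:
    print(name, round(predict_ic50(name, x_example), 3))
```

Both equations share the same sign pattern, with negative weights only on x6 and x8 (the two SlogP_VSA descriptors), so the two linear models respond in the same direction to each descriptor.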
Construction of the QSARANN models
The QSARANN model was built using the QSARMLR model descriptors from equation (2). The models were trained with a back-propagation algorithm and transfer functions such as Logsig, Tansig and Purelin. The architecture of the ANN models in this scenario is therefore I(8)-HL()-O(1). The QSARANN models were developed in two stages. First, using the training dataset, multiple MLP network designs with different hidden-layer sizes and transfer functions were identified; the results are shown in the following table.
The results of initial screening for the ANN architecture
| Ord. | QSARANN model | Transfer function | R2train | R2test | R2cv | Training error | Test error | Validation error | Training algorithm |
|---|---|---|---|---|---|---|---|---|---|
| 1 | I(8)-HL(6)-O(1) | Logsig | 0.986 | 0.988 | 0.988 | 1.201 | 3.367 | 0.983 | BFGS 42 |
| 2 | I(8)-HL(6)-O(1) | Tansig | 0.957 | 0.984 | 0.998 | 0.961 | 2.739 | 1.186 | BFGS 63 |
| 3 | I(8)-HL(10)-O(1) | Logsig | 0.965 | 0.987 | 0.97 | 0.948 | 2.084 | 1.752 | BFGS 39 |
| 4 | I(8)-HL(10)-O(1) | Tansig | 0.975 | 0.992 | 0.99 | 2.249 | 2.589 | 0.721 | BFGS 13 |
| 5 | I(8)-HL(5)-O(1) | Purelin | 0.907 | 0.967 | 0.973 | 2.790 | 2.246 | 1.003 | BFGS 17 |
| 6 | I(8)-HL(6)-O(1) | Purelin | 0.941 | 0.916 | 0.912 | 0.822 | 1.389 | 0.795 | BFGS 36 |
| 7 | I(8)-HL(10)-O(1) | Purelin | 0.911 | 0.921 | 0.975 | 1.658 | 2.326 | 1.896 | BFGS 43 |
| 8 | I(8)-HL(4)-O(1) | Purelin | 0.917 | 0.977 | 0.936 | 2.567 | 2.238 | 1.326 | BFGS 69 |
Second, using the same external evaluation dataset (Table 5) as for the QSARMLR and QSARPCR models, the best network was chosen based on the MARE (%) and Q2EX values. As a result, the best model was I(8)-HL(6)-O(1) with the Purelin transfer function (Figure 5a), with the statistical parameters Q2EX = 0.866 and MARE = 62.1%, as shown in Figure 5 and Figure 12, respectively.

The architecture of the QSARANN I(8)-HL(6)-O(1) model
The external validation
External evaluation is considered a test to validate the predictive ability of the built models. From there, the model that gives the closest prediction results to the experimental results is selected. The external evaluation dataset includes 14 compounds obtained from experiments and is an independent set from the set used to construct the QSAR model. Detailed information on these derivatives is presented in Figure 7.

The experimental values of IC50,exp in the external evaluation dataset
The predicted values of the QSAR models for the 14 substances in the external evaluation dataset are presented in the following table.
The predicted IC50,pred values of the three models in the external evaluation set
| Symbol | IC50,exp (µM) | IC50,pred (µM) QSARMLR | IC50,pred (µM) QSARPCR | IC50,pred (µM) QSARANN6 |
|---|---|---|---|---|
| TPN1 | 8.900 | 5.777 | 6.196 | 5.311 |
| TPN2 | 29.700 | 10.388 | 10.573 | 11.069 |
| TPN3 | 3.300 | 3.754 | 3.613 | 2.562 |
| TPN4 | 17.660 | 6.408 | 6.462 | 7.848 |
| TPN5 | 11.600 | 5.995 | 6.276 | 4.827 |
| TPN6 | 15.500 | 8.868 | 9.065 | 7.713 |
| TPN7 | 17.700 | 7.547 | 7.519 | 7.264 |
| TPN8 | 22.400 | 8.867 | 9.065 | 7.714 |
| TPN9 | 25.000 | 9.353 | 10.069 | 8.772 |
| TPN10 | 23.700 | 7.547 | 7.519 | 7.264 |
| TPN11 | 22.300 | 8.485 | 9.184 | 7.403 |
| TPN12 | 21.000 | 10.116 | 10.320 | 10.313 |
| TPN13 | 26.700 | 9.860 | 10.266 | 10.054 |
| TPN14 | 0.660 | 3.750 | 4.015 | 1.5943 |
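The MARE (%) used to compare the models can be recomputed from the external evaluation values listed above. The sketch assumes that MARE is the mean of |IC50,exp − IC50,pred| / IC50,exp expressed as a percentage; the text does not give the formula, but this definition reproduces the reported 62.1% for the QSARANN model. The variable and function names are illustrative.

```python
# Experimental and predicted IC50 values (µM) for TPN1..TPN14, from the table above.
ic50_exp = [8.900, 29.700, 3.300, 17.660, 11.600, 15.500, 17.700,
            22.400, 25.000, 23.700, 22.300, 21.000, 26.700, 0.660]
ic50_pred = {
    "QSARMLR": [5.777, 10.388, 3.754, 6.408, 5.995, 8.868, 7.547,
                8.867, 9.353, 7.547, 8.485, 10.116, 9.860, 3.750],
    "QSARPCR": [6.196, 10.573, 3.613, 6.462, 6.276, 9.065, 7.519,
                9.065, 10.069, 7.519, 9.184, 10.320, 10.266, 4.015],
    "QSARANN": [5.311, 11.069, 2.562, 7.848, 4.827, 7.713, 7.264,
                7.714, 8.772, 7.264, 7.403, 10.313, 10.054, 1.5943],
}

def mare_percent(observed, predicted):
    """Mean absolute relative error in percent (assumed definition)."""
    total = sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)

mare = {name: mare_percent(ic50_exp, values) for name, values in ic50_pred.items()}
for name, value in mare.items():
    print(f"{name}: MARE = {value:.1f}%")
```

TPN14 dominates the sum because its experimental IC50 is small, so a modest absolute error becomes a large relative error.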
Drug-likeness
After screening the drug likeness of all 196 compounds using Veber's criteria, we found 138 compounds that met these criteria.
Bioactivity prediction
The study predicted the IC50 values of the 138 newly designed substances and of Escin. Twenty compounds with predicted IC50 values lower than that of Escin were then obtained, selected and ordered on the basis of the QSARANN model, which has the best predictive ability. The structures of the 20 potential substances are presented in Figure 8.

The structures of the 20 new compounds
The detailed prediction results of the 20 compounds from each model are presented in the following table.
The predicted IC50,pred values of new derivatives from three QSAR models
All values are IC50,pred (µM).

| Symbol | QSARMLR | QSARPCR | QSARANN | Symbol | QSARMLR | QSARPCR | QSARANN |
|---|---|---|---|---|---|---|---|
| T.new1 | 1.754 | 2.15 | 2.675 | T.new11 | 4.025 | 4.277 | 1.817 |
| T.new2 | 2.848 | 2.7 | 1.532 | T.new12 | 5.571 | 5.589 | 2.502 |
| T.new3 | 2.905 | 4.159 | 1.477 | T.new13 | 4.11 | 4.786 | 1.672 |
| T.new4 | 0.807 | 0.973 | 2.094 | T.new14 | 3.906 | 4.175 | 1.955 |
| T.new5 | 5.013 | 5.616 | 2.254 | T.new15 | 5.042 | 5.325 | 2.201 |
| T.new6 | 4.48 | 4.687 | 2.056 | T.new16 | 1.229 | 2.004 | 2.279 |
| T.new7 | 0.938 | 1.764 | 1.187 | T.new17 | 5.299 | 5.871 | 2.686 |
| T.new8 | 6.115 | 6.803 | 2.719 | T.new18 | 2.863 | 3.642 | 1.573 |
| T.new9 | 2.369 | 3.11 | 1.496 | T.new19 | 1.818 | 2.746 | 1.306 |
| T.new10 | 6.277 | 7.156 | 1.356 | T.new20 | 3.685 | 4.609 | 1.957 |
| Escin | 3.391 | 3.534 | 2.752 | | | | |
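The comparison with Escin can be reproduced from the QSARANN column above. This sketch simply filters the listed predictions against the Escin reference value and ranks them from lowest to highest IC50; it illustrates the selection step and is not the authors' script.

```python
# QSARANN-predicted IC50 values (µM) for T.new1..T.new20, from the table above.
ann_values = [2.675, 1.532, 1.477, 2.094, 2.254, 2.056, 1.187, 2.719, 1.496, 1.356,
              1.817, 2.502, 1.672, 1.955, 2.201, 2.279, 2.686, 1.573, 1.306, 1.957]
ann_pred = {f"T.new{i}": v for i, v in enumerate(ann_values, start=1)}
escin_ann = 2.752  # QSARANN prediction for Escin

# Keep compounds predicted to be more potent than Escin (lower IC50), most potent first.
better = sorted(
    ((name, value) for name, value in ann_pred.items() if value < escin_ann),
    key=lambda item: item[1],
)

print(len(better))  # 20
print(better[0])    # ('T.new7', 1.187)
```

A lower IC50 means that less compound is needed for half-maximal inhibition, so the first entry of the ranking is the most potent candidate according to this model.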
Molecular Docking
To test the inhibitory ability of the compounds on HepG2 cancer cells, the 20 compounds with favorable predicted IC50 values were docked to the Bcl-2 receptor. The docking process evaluates the binding ability of each compound to the Bcl-2 target receptor by simulating the 3D structures of both the receptor and the compound. Substances are considered well bound when RMSD < 2.0 Å and E_binding < -7.0 kcal/mol. The results for the 6 compounds with good binding energies and RMSD values are presented in Figure 5.

Docking results of six compounds with the 3U6J–Bcl-2 system
DISCUSSION
QSAR models

The variation in the SE, R2train, and Q2LOO values in response to the
For the QSARMLR model, R2 = 0.849 > 0.6 [47] shows that the model explains 84.9% of the variation in biological activity in the dataset. R2adj = 0.826 > 0.5 corresponds to 82.6% of the variation after adjusting for the number of variables, and Q2LOO = 0.789 > 0.5 [47]. Based on these findings, the model produces relatively good predictions.
Similarly, for the QSARPCR model, R2 = 0.860 > 0.6 [47] shows that the model explains 86% of the variation in biological activity in the dataset. R2adj = 0.831 > 0.5 corresponds to 83.1% of the variation after adjusting for the number of variables, and Q2LOO = 0.805 > 0.5 [47]. Based on these findings, the model produces accurate predictions.
The ANN architecture I(8)-HL(6)-O(1) with the Purelin transfer function, with R2train = 0.941, R2test = 0.916 and R2cv = 0.912, has good predictability with high correlation values. Its result on the external evaluation set, Q2EX = 0.866, shows that the predictions of this model are the closest to the experimental values.
For these reasons, the three QSAR models were chosen to predict the activity of the newly designed compounds and of Escin.
The contributions of the variables in the model were also investigated, and the results are presented in Figure 11. All the descriptors contributed, to various degrees. The most significant contributor was PEOE_RPC- and the least significant was vsurf_EWmin1, with contributions of 42.3% and 1.6%, respectively. The descriptors contributed to the QSARMLR model in the following order: PEOE_RPC- > vsurf_CW4 > SlogP_VSA3 > LUMO > 21C > vsurf_DD13 > SlogP_VSA4 > vsurf_EWmin1.

The contributions of the variables to the QSARMLR model
The external validation
As mentioned above, external evaluation was used to validate the MLR and PCR models. In addition, the best ANN model was identified from the initial survey models, as shown in the following figure.

The MARE (%) and Q2EX values of the QSAR models
Figure 13 shows the relationship between the predicted and experimental IC50 values; for the QSARMLR model, Q2EX = 0.840. Similarly, on the external evaluation set the QSARPCR model gives Q2EX = 0.846 and the QSARANN model gives Q2EX = 0.866. All three models therefore correlate well with the experimental values on the external evaluation set, which indicates that their predictions are reliable and can be used for a wide range of designed substances.
Furthermore, one-way ANOVA showed that the differences between the experimental and predicted values from the three models, QSARMLR, QSARPCR, and QSARANN, were not significant, with F = 0.0269 < Fcrit = 3.2381. The predictive ability of the three models is therefore appropriate.
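For reference, the F statistic of a one-way ANOVA is the ratio of the between-group to the within-group mean square, and it is compared with a critical value for the corresponding degrees of freedom. The sketch below shows the calculation on hypothetical toy groups; it does not attempt to reproduce the F values reported here, because the exact grouping of the data used by the authors is not stated.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical toy data: three groups of three values each.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(f_stat, df_b, df_w)  # 3.0 2 6
```

With three groups of 14 values each, the degrees of freedom would be (2, 39); the null hypothesis of equal group means is retained when F stays below the critical value for those degrees of freedom.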

Correlations of experimental and predicted values on the external dataset of QSAR models
Bioactivity prediction
Under the same calculation conditions and with the same models, the 20 selected substances had lower QSARANN-predicted IC50 values than Escin. The present study used Escin as a reference for selecting compounds with better biological activity, in order to show that the new candidates have superior predicted cancer cell inhibitory ability compared to the natural active substance.
The predictions for the new molecules and for Escin from the three QSAR models, QSARMLR, QSARPCR, and QSARANN, were compared by analysis of variance. There was no significant difference (F = 0.71595 < Fcrit = 3.07606). The predictive ability of the three models is therefore consistent and reliable.
Molecular Docking
The docking results of the six new compounds on the Bcl-2 receptor are summarized as follows.
T.new1 binds to the Bcl-2 receptor via a hydrogen acceptor bond to ARG74 (distance = 2.97 Å, energy = -1.7 kcal/mol), with E_binding = -7.158 kcal/mol and RMSD = 1.539 Å.
T.new4 binds to the Bcl-2 receptor via a pi-cation bond to ARG154 (distance = 3.58 Å, energy = -1.6 kcal/mol), with E_binding = -7.817 kcal/mol and RMSD = 1.696 Å.
T.new7 binds to the Bcl-2 receptor via a hydrogen donor bond to CYS174 (distance = 3.37 Å, energy = -1.2 kcal/mol), with E_binding = -7.933 kcal/mol and RMSD = 1.915 Å.
T.new11 binds to the Bcl-2 receptor via a pi-H bond to TYR79 (distance = 3.78 Å, energy = -0.9 kcal/mol), with E_binding = -7.166 kcal/mol and RMSD = 1.388 Å.
T.new12 binds to the Bcl-2 receptor via a hydrogen donor bond to CYS174 (distance = 3.23 Å, energy = -0.8 kcal/mol), with E_binding = -7.869 kcal/mol and RMSD = 1.279 Å.
T.new19 binds to the Bcl-2 receptor via a pi-H bond to ILE146 (distance = 3.65 Å, energy = -0.9 kcal/mol), with E_binding = -7.367 kcal/mol and RMSD = 1.846 Å.
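The acceptance criteria for a well-bound compound (RMSD < 2.0 Å and E_binding < -7.0 kcal/mol) can be checked against the six results listed above. The sketch below uses only the binding energies and RMSD values quoted in the text; the data structure and names are illustrative.

```python
# (E_binding in kcal/mol, RMSD in angstroms) for the six compounds listed above.
docking = {
    "T.new1": (-7.158, 1.539),
    "T.new4": (-7.817, 1.696),
    "T.new7": (-7.933, 1.915),
    "T.new11": (-7.166, 1.388),
    "T.new12": (-7.869, 1.279),
    "T.new19": (-7.367, 1.846),
}

def is_well_bound(e_binding, rmsd, e_max=-7.0, rmsd_max=2.0):
    """Apply the thresholds stated in the text: lower energy and lower RMSD are better."""
    return e_binding < e_max and rmsd < rmsd_max

accepted = {name: vals for name, vals in docking.items() if is_well_bound(*vals)}
best = min(accepted, key=lambda name: accepted[name][0])  # most negative binding energy

print(sorted(accepted))
print(best)  # T.new7
```

Ranking by binding energy alone selects T.new7, while T.new12 has a slightly weaker energy but the lowest RMSD of the six.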
The full interaction results of the six new compounds on the Bcl-2 receptor are presented in the following table.
Detailed interaction results of new compounds on the Bcl-2 receptor
| Compounds | Ligand | Receptor | Interaction | Distance (Å) | E (kcal/mol) |
|---|---|---|---|---|---|
| T.new1 | O 58 | NE ARG 74 (A) | H-acceptor | 2.94 | -1.7 |
| T.new4 | 5-ring | NH2 ARG 154 (A) | pi-cation | 3.58 | -1.6 |
| T.new7 | O 20 | SG CYS 174 (A) | H-donor | 3.37 | -1.2 |
| T.new11 | 6-ring | CD1 TYR 79 (A) | pi-H | 3.78 | -0.9 |
| T.new12 | O 48 | SG CYS 174 (A) | H-donor | 3.23 | -0.8 |
| T.new19 | 5-ring | CG2 ILE 146 (A) | pi-H | 3.65 | -0.9 |
T.new7 had the best results: it binds to the Bcl-2 receptor via a hydrogen donor bond to CYS174 (distance = 3.37 Å, energy = -1.2 kcal/mol), with E_binding = -7.933 kcal/mol and RMSD = 1.915 Å.
On the Bcl-2 receptor, the amino acids considered essential are those contacted by the 6 most promising compounds: ARG74, ARG154, CYS174 (contacted by both T.new7 and T.new12), TYR79, and ILE146.
In summary, the QSARMLR, QSARPCR, and QSARANN models were successfully constructed to predict the activity of newly designed substances. Finally, the T.new7 compound was selected as a candidate inhibitor of HepG2 cancer cells.
CONCLUSION
This study applied QSAR models to screen and design new drug candidates, specifically triterpenoids, for use against HepG2 cancer cells. The final selected compound, T.new7, showed better predicted bioactivity than the naturally occurring substance and met the conditions for use as a drug according to Veber's rule. The biological activity predictions, based on the screening process and statistical evaluation, are fair and reliable. T.new7 was designed on the Escin structural framework, with R1 as morpholine and R2 as coumarin. T.new7 has a predicted IC50 (µM) of 0.938, 1.764, and 1.187 according to the 3 models QSARMLR, QSARPCR, and QSARANN, respectively. In this case, QSAR produces the best prediction results, and all three values are lower than those of the natural parent compound Escin. Molecular docking gave E_binding = -7.933 kcal/mol and RMSD = 1.915 Å. The T.new7 compound binds to the Bcl-2 receptor via an H-donor interaction: T.new7 donates a hydrogen to amino acid CYS174 (distance d = 3.37 Å, energy = -1.2 kcal/mol). T.new7 was therefore selected as the best candidate inhibitor of HepG2 cancer cells. This study is limited by the fact that it involved only virtual screening. Despite its exploratory nature, it provides a foundation for future work, and further experimental studies are necessary to confirm the effectiveness of T.new7 against HepG2 cancer cells.
ABBREVIATIONS
ANOVA: Analysis of variance
ANN: Artificial neural network
Bcl-2: B-cell lymphoma 2
GLOBOCAN: Global Cancer Statistics
HepG2: Human liver cancer cell line
HOMO: Highest occupied molecular orbital
IC50: Half-maximal inhibitory concentration
LUMO: lowest unoccupied molecular orbital
MLR: Multiple linear regression
OA: Oleanolic Acid
PCR: Principal component regression
PDB: Protein Data Bank
PM7: Parameterized Model 7
QSAR: Quantitative structure-activity relationship
COMPETING INTERESTS
AUTHOR CONTRIBUTIONS
All authors participated in study design, coordination, and manuscript drafting