
Methods

In the present study, a 3D descriptor is introduced that is based on the autocorrelation of properties at distinct points on the molecular surface. The distances between surface points are sorted into preset intervals (d_lower, d_upper). The autocorrelation coefficient A(d_lower, d_upper) is obtained by summing the products of the property values p at points i and j whose distance d falls within the interval (d_lower, d_upper) and normalizing the sum by the total number L of distances in that interval:

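$$A(d_{\mathrm{lower}}, d_{\mathrm{upper}}) = \frac{1}{L} \sum_{i,j} p_i \, p_j \qquad \text{for all pairs with } d_{ij} \in (d_{\mathrm{lower}}, d_{\mathrm{upper}}) \tag{4}$$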

The autocorrelation vectors exhibit some interesting properties. First, they are unique for a given molecular geometry. Second, they are invariant to translation and rotation since only spatial distances instead of Cartesian coordinates are used (Figure 3).
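The definition of eq. 4 and the invariance property can be illustrated with a minimal NumPy sketch. The function name `surface_autocorrelation`, the half-open treatment of the interval boundaries, the value 0 for empty intervals, and the random points on a sphere standing in for a molecular surface are all assumptions made for illustration; they are not taken from the original work.

```python
import numpy as np

def surface_autocorrelation(points, props, d_min=1.0, d_max=13.0, width=1.0):
    """Sketch of eq. 4: mean of p_i * p_j over all point pairs per distance interval."""
    points = np.asarray(points, dtype=float)
    props = np.asarray(props, dtype=float)
    i, j = np.triu_indices(len(points), k=1)            # every pair counted once
    d = np.linalg.norm(points[i] - points[j], axis=1)   # pair distances d_ij
    prod = props[i] * props[j]                          # property products p_i * p_j
    edges = np.arange(d_min, d_max + width / 2, width)  # 1, 2, ..., 13 by default
    vec = np.zeros(len(edges) - 1)
    for k in range(len(edges) - 1):
        # Assumption: intervals are treated as (lower, upper]
        mask = (d > edges[k]) & (d <= edges[k + 1])
        n_dist = mask.sum()                             # L in eq. 4
        if n_dist:                                      # empty intervals stay 0 (assumption)
            vec[k] = prod[mask].sum() / n_dist
    return vec

rng = np.random.default_rng(0)
# Hypothetical "surface": 200 random points on a sphere of radius 5 with random property values
pts = rng.normal(size=(200, 3))
pts = 5.0 * pts / np.linalg.norm(pts, axis=1, keepdims=True)
props = rng.normal(size=200)

# Random orthogonal transformation (rotation, possibly with reflection) plus a translation
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
moved = pts @ Q.T + np.array([10.0, -3.0, 7.0])

vec = surface_autocorrelation(pts, props)
vec_moved = surface_autocorrelation(moved, props)
```

Because only the pair distances enter the calculation, `vec` and `vec_moved` agree to within floating-point rounding.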

Figure 3. The dependence of the autocorrelation vector of corticosterone on six different parameters of the calculation scheme (eq. 4): (a) six different spatial orientations; (b) seven different conformations of the side chain at position 17; (c) five different point densities; (d) four different distance intervals dij; (e) five different sets of atomic radii; (f) comparison of the Connolly surface with the van der Waals surface. See text for the values of the different parameters and their default values.

The first application of the surface autocorrelation vector uses a well-known dataset: 31 steroids (Chart 3) that bind to the corticosteroid binding globulin (CBG) (Table 1).

Chart 3


Table 1. CBG Binding Affinity Data from Ref 4.

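| compd | CBG affinity (pK) | activity class^a | compd | CBG affinity (pK) | activity class^a |
|---|---|---|---|---|---|
| 1 | -6.279 | 2 | 17 | -5.225 | 3 |
| 2 | -5.000 | 3 | 18 | -5.000 | 3 |
| 3 | -5.000 | 3 | 19 | -7.380 | 1 |
| 4 | -5.763 | 3 | 20 | -7.740 | 1 |
| 5 | -5.613 | 3 | 21 | -6.724 | 2 |
| 6 | -7.881 | 1 | 22 | -7.512 | 1 |
| 7 | -7.881 | 1 | 23 | -7.553 | 1 |
| 8 | -6.892 | 2 | 24 | -6.779 | 2 |
| 9 | -5.000 | 3 | 25 | -7.200 | 1 |
| 10 | -7.653 | 1 | 26 | -6.144 | 2 |
| 11 | -7.881 | 1 | 27 | -6.247 | 2 |
| 12 | -5.919 | 2 | 28 | -7.120 | 2 |
| 13 | -5.000 | 3 | 29 | -6.817 | 2 |
| 14 | -5.000 | 3 | 30 | -7.688 | 1 |
| 15 | -5.000 | 3 | 31 | -5.797 | 2 |
| 16 | -5.225 | 3 |  |  |  |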

a 1, high; 2, intermediate; 3, low; this classification was obtained by dividing the dataset into three classes of comparable size.
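As an illustration of the footnote, the following sketch sorts the pK values of Table 1 and cuts the sorted list into groups of 10, 10, and 11 compounds. This particular split reproduces the class labels listed in the table; whether the original classification was derived in exactly this way is an assumption. The variable names are hypothetical.

```python
import numpy as np

# pK values of compounds 1-31, copied from Table 1
pK = np.array([-6.279, -5.000, -5.000, -5.763, -5.613, -7.881, -7.881, -6.892,
               -5.000, -7.653, -7.881, -5.919, -5.000, -5.000, -5.000, -5.225,
               -5.225, -5.000, -7.380, -7.740, -6.724, -7.512, -7.553, -6.779,
               -7.200, -6.144, -6.247, -7.120, -6.817, -7.688, -5.797])

# Activity classes of compounds 1-31, copied from Table 1
table_class = np.array([2, 3, 3, 3, 3, 1, 1, 2, 3, 1, 1, 2, 3, 3, 3, 3,
                        3, 3, 1, 1, 2, 1, 1, 2, 1, 2, 2, 2, 2, 1, 2])

# Sort from the most negative pK (class 1, "high") to the least negative (class 3, "low")
order = np.argsort(pK, kind="stable")

derived_class = np.empty(len(pK), dtype=int)
derived_class[order[:10]] = 1     # first 10 compounds in sorted order
derived_class[order[10:20]] = 2   # next 10 compounds
derived_class[order[20:]] = 3     # remaining 11 compounds

matches_table = np.array_equal(derived_class, table_class)
```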

3D models of the structures were generated with the 3D structure generator Corina (refs 24, 25). Partial atomic charges were calculated by the PEOE method (ref 27) and its extension to conjugated systems (ref 28). Autocorrelation vectors were then calculated for each molecule with the electrostatic potential as the surface property, using distance intervals of 1 Å width between 1 and 13 Å, which gives 12 coefficients per molecule.

In Figure 5, a plot of the first two principal components is shown.

Figure 5. Principal components plot of the steroid data set: squares, high activity; asterisks, intermediate activity; crosses, low activity.
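A projection onto the first two principal components, as plotted in Figure 5, can be sketched with a singular value decomposition. The random matrix below is only a placeholder for the 31 autocorrelation vectors of length 12, and mean-centering without further scaling is an assumption; the preprocessing used in the original study is not stated in this section.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder data: 31 molecules x 12 autocorrelation coefficients (not the real descriptors)
X = rng.normal(size=(31, 12))

# Center each coefficient on its mean (assumed preprocessing)
Xc = X - X.mean(axis=0)

# Rows of Vt are the principal axes, ordered by decreasing singular value
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

scores = U[:, :2] * S[:2]              # coordinates of each molecule on PC1 and PC2
explained = S**2 / np.sum(S**2)        # fraction of the variance carried by each component
```

Plotting `scores[:, 0]` against `scores[:, 1]`, with one marker per activity class, gives a plot of the kind described in the caption.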

A Kohonen network with a toroidal topology was used for the nonlinear mapping of the data from the twelve-dimensional space spanned by the autocorrelation vectors into two dimensions. Figure 6 shows the resulting map.

Figure 6. Kohonen map of the steroid data set: squares, high activity; asterisks, intermediate activity; crosses, low activity. The Kohonen network has a toroidal topology. Thus, the upper and lower neurons, as well as those at the left- and right-hand side, are directly connected as indicated by the arrows.

Figure 7 shows a 4-fold replication of Figure 6 in order to illustrate the continuous nature of a toroidal surface.

Figure 7. The 4-fold replication of the Kohonen map of Figure 6. The three different clusters of compounds with high, intermediate, and low activity are highlighted by shaded areas.
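The toroidal topology and the 4-fold replication can be sketched as follows. This is a generic self-organizing map written for illustration, not the implementation used in the study: the 10 x 10 map size, the learning-rate and radius schedules, the Gaussian neighborhood, and the random placeholder data are all assumptions.

```python
import numpy as np

def torus_sq_dist(rows, cols, br, bc):
    """Squared grid distance of every neuron to neuron (br, bc) with wrap-around."""
    r, c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    dr = np.abs(r - br)
    dc = np.abs(c - bc)
    dr = np.minimum(dr, rows - dr)   # crossing the top/bottom edge may be shorter
    dc = np.minimum(dc, cols - dc)   # crossing the left/right edge may be shorter
    return dr**2 + dc**2

def winner(weights, x):
    """Grid position of the neuron whose weight vector is closest to x."""
    d = np.linalg.norm(weights - x, axis=2)
    return np.unravel_index(np.argmin(d), d.shape)

def train_som(X, rows=10, cols=10, epochs=20, seed=0):
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(rows, cols, X.shape[1]))
    n_steps = epochs * len(X)
    step = 0
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            frac = step / n_steps
            lr = 0.5 * (1.0 - frac)                                    # decaying learning rate
            radius = 1.0 + (max(rows, cols) / 2 - 1.0) * (1.0 - frac)  # shrinking neighborhood
            br, bc = winner(weights, x)
            h = np.exp(-torus_sq_dist(rows, cols, br, bc) / (2.0 * radius**2))
            weights += lr * h[:, :, None] * (x - weights)              # pull neighbors toward x
            step += 1
    return weights

rng = np.random.default_rng(1)
X = rng.normal(size=(31, 12))          # placeholder for the 31 autocorrelation vectors
weights = train_som(X)

# Count how many molecules fall on each neuron, then replicate the map 2 x 2
occupancy = np.zeros(weights.shape[:2], dtype=int)
for x in X:
    occupancy[winner(weights, x)] += 1
tiled = np.tile(occupancy, (2, 2))     # 4-fold replication of the toroidal map
```

On the torus, neuron (9, 9) is a diagonal neighbor of neuron (0, 0), which is why clusters that appear split across the edges of a single map become contiguous in the tiled view.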

A feedforward multilayer neural network was used to obtain a predictive model of the biological activity of the 31 steroid molecules based on their autocorrelation vectors. Figure 8 shows the topology of the network used.

Figure 8. Multilayer neural network topology.

Figure 9 shows the ability of the trained network to reproduce the data used for training.

Figure 9. Plot of the experimental pK values against the pK values reproduced by the trained network.

In order to estimate the predictive power of the model, cross-validation following the leave-one-out scheme was performed.

Figure 10. Plot of the experimental pK values against the cross-validated values: (a) entire dataset (molecule 31 marked by a circle); (b) dataset of 30 molecules without 31.

For comparison, Cramer et al. (ref 4) obtained a CoMFA model with a cross-validated r² of 0.66 for the first 21 steroid molecules of the same dataset.
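The leave-one-out scheme itself can be sketched independently of the model type. In the example below, a ridge regression stands in for the multilayer neural network purely to keep the code short, the data are synthetic, and the cross-validated statistic is computed as 1 - PRESS/SS, which is one common convention and not necessarily the definition used for the values quoted above. All function names are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, alpha=1e-3):
    """Stand-in model: ridge regression with an intercept column (also penalized, for brevity)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def predict(w, X):
    return np.hstack([X, np.ones((len(X), 1))]) @ w

def leave_one_out(X, y, fit=fit_ridge):
    """Predict each sample from a model trained on all other samples."""
    preds = np.empty(len(y))
    for k in range(len(y)):
        keep = np.arange(len(y)) != k          # drop sample k from the training set
        w = fit(X[keep], y[keep])
        preds[k] = predict(w, X[k:k + 1])[0]   # predict the held-out sample
    return preds

def cross_validated_r2(y, preds):
    press = np.sum((y - preds) ** 2)           # predictive residual sum of squares
    ss = np.sum((y - y.mean()) ** 2)           # total sum of squares
    return 1.0 - press / ss

# Synthetic example: 31 samples, 12 descriptors, linear signal plus small noise
rng = np.random.default_rng(0)
X = rng.normal(size=(31, 12))
y = X @ rng.normal(size=12) + 0.1 * rng.normal(size=31)

preds = leave_one_out(X, y)
q2 = cross_validated_r2(y, preds)
```

Replacing `fit_ridge` and `predict` with the training and prediction steps of a neural network leaves the cross-validation loop unchanged.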

In a second example, the affinities of 78 polyhalogenated aromatic compounds for binding to the cytosolic Ah receptor were studied. The dataset consisted of 25 chlorinated and brominated dibenzo-p-dioxins (refs 8, 9), 39 chlorinated dibenzofurans (refs 8, 9), and 14 chlorinated biphenyls (ref 10) (Tables 2-4).

Table 2. Binding Affinities of Polychlorinated and Polybrominated Dibenzo-p-dioxins (for the Numbering See Chart 4)

| substitution position | pEC50 | substitution position | pEC50 |
|---|---|---|---|
| 2,3,7,8-Cl4 | 8.000 | 1-Cl | 4.000 |
| 1,2,3,7,8-Cl5 | 7.102 | 2,3,7,8-Br4 | 8.824 |
| 2,3,6,7-Cl4 | 6.796 | 7,8-Cl-2,3-Br2 | 8.830 |
| 2,3,6-Cl3 | 6.658 | 3,7-Cl-2,8-Br2 | 9.350 |
| 1,2,3,4,7,8-Cl6 | 6.553 | 3,7,8-Cl-2-Br | 7.939 |
| 1,3,7,8-Cl4 | 6.102 | 1,3,7,8,9-Br5 | 7.032 |
| 1,2,4,7,8-Cl5 | 5.959 | 1,3,7,8-Br4 | 8.699 |
| 1,2,3,4-Cl4 | 5.886 | 1,2,4,7,8-Br5 | 7.770 |
| 2,3,7-Cl3 | 7.149 | 1,2,3,7,8-Br5 | 8.180 |
| 2,8-Cl2 | 5.495 | 2,3,7-Br3 | 8.932 |
| 1,2,3,4,7-Cl5 | 5.194 | 2,7-Br2 | 7.810 |
| 1,2,4-Cl3 | 4.886 | 2-Br | 6.530 |
| 1,2,3,4,6,7,8,9-Cl8 | 5.000 |  |  |

Table 3. Binding Affinities of Polychlorinated Dibenzofurans (for the Numbering See Chart 4)

| Cl positions | pEC50 | Cl positions | pEC50 |
|---|---|---|---|
| 2 | 3.553 | 1,2,4,7,8 | 5.886 |
| 3 | 4.377 | 2,3,4,7,8 | 7.824 |
| 4 | 3.000 | 1,2,3,4,7,8 | 6.638 |
| 2,3 | 5.326 | 1,2,3,6,7,8 | 6.569 |
| 2,6 | 3.609 | 1,2,4,6,7,8 | 5.081 |
| 2,8 | 3.590 | 2,3,4,6,7,8 | 7.328 |
| 1,3,6 | 5.357 | 2,3,6,8 | 6.658 |
| 1,3,8 | 4.071 | 1,2,3,6 | 6.456 |
| 2,3,4 | 4.721 | 1,2,3,7 | 6.959 |
| 2,3,8 | 6.000 | 1,3,4,7,8 | 6.699 |
| 2,6,7 | 6.347 | 2,3,4,7,9 | 6.699 |
| 2,3,4,6 | 6.456 | 1,2,3,7,9 | 6.398 |
| 2,3,4,8 | 6.699 |  | 3.000 |
| 1,3,6,8 | 6.658 | 2,3,4,7 | 7.602 |
| 2,3,7,8 | 7.387 | 1,2,3,7 | 6.959 |
| 1,2,4,8 | 5.000 | 1,3,4,7,8 | 6.699 |
| 1,2,4,6,7 | 7.169 | 2,3,4,7,9 | 6.699 |
| 1,2,4,7,9 | 4.699 | 1,2,3,7,9 | 6.398 |
| 1,2,3,4,8 | 6.921 | 1,2,4,6,8 | 5.509 |
| 1,2,3,7,8 | 7.128 |  |  |

Table 4. Binding Affinities of Polychlorinated Biphenyls (for the Numbering See Chart 4)

| Cl positions | pEC50 | Cl positions | pEC50 |
|---|---|---|---|
| 3,3´,4,4´ | 6.149 | 2,3,3´,4,4´,5 | 5.301 |
| 3,4,4´,5 | 4.553 | 2,3´,4,4´,5,5´ | 4.796 |
| 3,3´,4,4´,5 | 6.886 | 2,3,3´,4,4´,5´ | 5.149 |
| 2´,3,4,4´,5 | 4.854 | 2,2´,4,4´ | 3.886 |
| 2,3,3´,4,4´ | 5.367 | 2,2´,4,4´,5,5´ | 4.102 |
| 2,3´,4,4´,5 | 5.041 | 2,3,4,5 | 3.854 |
| 2,3,4,4´,5 | 5.387 |  |  |

Since these compounds are highly hydrophobic, the hydrophobicity potential was used as molecular surface property.

Twelve autocorrelation coefficients were calculated per molecule.

Figure 11 shows the 4-fold replicated map.

Figure 11. The 4-fold replication of the 20 x 20 Kohonen map of the polyhalogenated aromatic compounds: squares, high affinity; asterisks, medium affinity; crosses, low affinity. The area occupied by one replication of the whole dataset is enclosed by a black line. Areas of compounds with high and low affinity are shaded.

Figure 12 shows a plot of the experimental pEC50 values against the cross-validated values obtained by a multilayer neural network.

Figure 12. Plot of the experimental pEC50 values of the polyhalogenated dibenzo-p-dioxins (rhombs), dibenzofurans (crosses), and biphenyls (squares) against the cross-validated pEC50 values.
