J. Zupan, J. Gasteiger
Neural Networks in Chemistry and Drug Design: An Introduction

Accompanying Data Sets

Italian Olive Oils
  This data set is described in Chapter 10 (pages 176-189) of the book and in the following publication:
J. Zupan, M. Novic, X. Li, J. Gasteiger,
Classification of multicomponent analytical data of olive oils using different neural networks
The data set comprises 572 olive oil samples from nine different regions of Italy; for each sample, the normalized concentrations of eight fatty acids are given. We thank Prof. Michele Forina, University of Genova, Italy, for making this data set available.
Download the complete dataset
Steroids Binding to the CBG Receptor
  This data set contains 31 steroids that bind to the corticosteroid binding globulin (CBG) receptor. You can read the detailed description or directly download the MDL SDFile and the biological activity data.
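For readers who want to inspect the SDFile without a cheminformatics toolkit, the following is a minimal sketch of a record splitter. It relies only on the general SD file layout (a molfile block ending in `M  END`, optional `> <NAME>` data items, and a `$$$$` line closing each record). The function names and any file path passed to `count_records` are hypothetical, and the page does not say which data items, if any, the file contains.

```python
from typing import Dict, Iterator, List, Tuple


def split_record(lines: List[str]) -> Tuple[str, Dict[str, str]]:
    """Split one SD record into its molfile block and its data items."""
    # The molfile part ends at the "M  END" line; fall back to the whole
    # record if that line is missing.
    end = next(
        (i for i, line in enumerate(lines) if line.startswith("M  END")),
        len(lines) - 1,
    )
    molblock = "\n".join(lines[: end + 1])

    fields: Dict[str, str] = {}
    name = None
    for line in lines[end + 1:]:
        if line.startswith(">"):
            # Data header such as "> <ACTIVITY>": take the text in <...>.
            lo, hi = line.find("<"), line.rfind(">")
            name = line[lo + 1:hi] if 0 < lo < hi else None
            if name is not None:
                fields[name] = ""
        elif name is not None:
            if line.strip() == "":
                name = None  # a blank line closes the current data item
            else:
                fields[name] = f"{fields[name]}\n{line}" if fields[name] else line
    return molblock, fields


def iter_sd_records(text: str) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield (molfile_block, data_fields) for every record in SD file text."""
    record: List[str] = []
    for line in text.splitlines():
        if line.strip() == "$$$$":  # record delimiter
            yield split_record(record)
            record = []
        else:
            record.append(line)


def count_records(path: str) -> int:
    """Count the records in an SD file at a (hypothetical) local path."""
    with open(path, "r", encoding="latin-1") as handle:
        return sum(1 for _ in iter_sd_records(handle.read()))
```

Since the page states that the set contains 31 steroids, `count_records` applied to the downloaded file should presumably return 31; a different count would suggest a truncated download. The sketch does not handle data values that themselves start with `>`.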
Combinatorial Library
  To also provide a larger data set to the scientific community, the combinatorial libraries studied in Sections 20.5 - 20.6 comprising derivatives are made accessible. Because of the size of the data set, we refrain from giving the connection tables as Molfile (ASCII) data sets. Instead, we provide for each compound the 12 autocorrelation coefficients obtained from the molecular electrostatic potential on the van der Waals surface, as described in
J. Sadowski, M. Wagener, J. Gasteiger,
Angew. Chem. Int. Ed. Engl., 1995, 34, 2674-2677.

All descriptor files are compressed with the GNU compression program gzip. Each descriptor file contains one line per library compound. Each line consists of the 12 autocorrelation coefficients of the molecular electrostatic potential on the van der Waals surface and a unique label.

Each label is composed of four letters that specify, by their one-letter symbols, the four amino acids attached to the core molecule (dimethylxanthene, cubane, or adamantane). The position of an amino acid on the scaffold of the core molecule is given by the position of its one-letter symbol in the label string: the first position corresponds to group R1, the second position to group R2, and so on. The positions of the groups R1 to R4 are shown in the depictions below. For example, the label of the most active xanthene derivative identified in the original experimental paper is IPKV.
T. Carell, E.A. Wintner, J. Rebek, Jr.,
Angew. Chem. Int. Ed. Engl., 1994, 33, 2062-2064
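
The descriptor file layout described above can be read with a short script. The following is a minimal sketch, not a reference implementation: it assumes that the fields on each line are separated by whitespace and that the label comes after the 12 coefficients, neither of which is stated explicitly on this page. The names `Compound`, `parse_line`, `substituents`, and `read_descriptors` are hypothetical.

```python
import gzip
from typing import Dict, Iterator, NamedTuple, Tuple

N_COEFFS = 12  # autocorrelation coefficients per compound, as stated in the text


class Compound(NamedTuple):
    label: str                  # four one-letter amino acid symbols, e.g. "IPKV"
    coeffs: Tuple[float, ...]   # the 12 autocorrelation coefficients


def parse_line(line: str) -> Compound:
    """Parse one descriptor line.

    ASSUMPTION: whitespace-separated fields with the label in last position.
    """
    fields = line.split()
    if len(fields) != N_COEFFS + 1:
        raise ValueError(f"expected {N_COEFFS + 1} fields, got {len(fields)}")
    *values, label = fields
    if len(label) != 4:
        raise ValueError(f"expected a four-letter label, got {label!r}")
    return Compound(label, tuple(float(v) for v in values))


def substituents(label: str) -> Dict[str, str]:
    """Map the label positions to the groups R1..R4 (first letter is R1)."""
    return {f"R{i}": amino_acid for i, amino_acid in enumerate(label, start=1)}


def read_descriptors(path: str) -> Iterator[Compound]:
    """Yield one Compound per non-empty line of a gzip-compressed file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.strip():
                yield parse_line(line)
```

With this mapping, `substituents("IPKV")` assigns I to R1, P to R2, K to R3, and V to R4. If the downloaded files turn out to use a different delimiter or field order, only `parse_line` needs to change.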

dimethylxanthenes, cubanes, adamantanes
[Depictions: dimethylxanthene derivatives, cubane derivatives, adamantane derivatives]
Flavonoid compounds (Section 13.9)
  This data set is described in Section 13.9 of the book and consists of 55 flavonoids and the associated biological activity data (IC50 values for PTK inhibition). You can view the structures online or directly download the MDL SDFile and the biological activity data.