
Organic Reactions Classified by Neural Networks: Michael Additions, Friedel-Crafts Alkylations by Alkenes, and Related Reactions

Lingran Chen and Johann Gasteiger*

Organic reactions are influenced by many factors: the structures of the starting materials, reagents, and catalysts as well as reaction conditions such as temperature, solvent, pressure, and light. Each of these factors can be considered as a separate coordinate spanning a multidimensional space, and in this sense a chemical reaction is an event in such a multidimensional space. Chemists have largely gained their knowledge of organic reactions from observations on series of reactions; common and differentiating features were sought in order to assign reactions to classes, which were frequently named after their principal discoverers, for example the Wittig reaction, the Michael addition, and the Beckmann rearrangement. Clearly, such a one-dimensional classification scheme can account only insufficiently for the variety of observations on chemical reactions.
We will show here that a set of chemical reactions can be projected by a self-organizing neural network into a two-dimensional map. Each reaction is represented by a point in such a map, and the distance between two points reflects how similar the two reactions are. Different types of similarity between reactions can be represented by projection into different directions of the two-dimensional map.
The human brain generates two-dimensional maps in the visual, auditory, and somatosensory cortex from corresponding sensory information obtained from the environment. The self-organizing neural network method developed by Kohonen [1] models this feature of the human brain [2]. In this network the basic processing units, the artificial neurons, are arranged in two dimensions, and thus "maps" of the analyzed information are obtained.
Clearly, an enormous range of variations exists for organic reactions, and a map reflecting this broad spectrum of possibilities would be large indeed. To illustrate the potential of our method we will therefore limit the discussion to a set of reactions having an important feature in common: the reaction center, the set of atoms and bonds directly involved in the bond rearrangement during reaction. The reaction scheme chosen is the addition of a C-H bond to a C=C bond (Table 1) and includes such important reaction types as Michael additions, Friedel-Crafts alkylations by alkenes, and free radical additions to alkenes.
A set of 120 reactions was obtained by a search with the reaction retrieval system ISIS Host [3] in the 1991 version of the ChemInform-RX data base [4]. As changes in the structures of the starting materials are the most decisive influences on a chemical reaction, we concentrated in this investigation only on these structural influences. The question is then, how should the structures be coded? Clearly, a list of the functional groups around the reaction center cannot be the best method, because such a list would be quite extensive. In fact, chemists have already tackled the problem of having to compare diverse functional groups and have generalized the influence of functional groups with concepts such as partial charges and inductive/field, resonance, and polarizability effects. Methods for the empirical calculation of these effects developed in our group [5]-[8] were used to calculate the influence of functional groups on the atoms of the reaction center. Specifically, the σ- and π-electronegativities, χσ and χπ, were considered at atoms C-1 and C-3, the total charges, qtot, on atoms C-2 and C-3, and the effective polarizability, αi, on C-3 (see Table 1).
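As a minimal sketch of this coding, each reaction can be held as a fixed-length vector of the seven descriptors. The class name, the field names, and the numbers below are hypothetical and serve only as illustration; the electronegativities are read here as σ and π values, and no descriptor values from the data set are reproduced.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class ReactionDescriptors:
    """Seven electronic descriptors of the reaction center (illustrative names)."""
    chi_sigma_c1: float  # sigma-electronegativity at C-1
    chi_pi_c1: float     # pi-electronegativity at C-1
    chi_sigma_c3: float  # sigma-electronegativity at C-3
    chi_pi_c3: float     # pi-electronegativity at C-3
    q_tot_c2: float      # total charge on C-2
    q_tot_c3: float      # total charge on C-3
    alpha_c3: float      # effective polarizability on C-3

    def as_vector(self):
        """Return the descriptors as a plain list in a fixed order."""
        return list(astuple(self))


# Placeholder numbers only; these are NOT values from the paper.
example = ReactionDescriptors(8.1, 5.2, 7.9, 4.8, -0.05, 0.02, 6.3)
vector = example.as_vector()
```

Keeping the order of the fields fixed matters, because each position of the vector is later matched against the corresponding weight of a neuron.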

These seven variables were used to describe each individual reaction of the data set and served as input to the Kohonen neural network. This network projects the seven-dimensional space into a map consisting of 12 × 12 neurons. Each neuron has as many weights as there are input variables for each object; in our case there are seven. A reaction s is mapped into that neuron c whose weights w_ji are most similar to the input variables x_si of the reaction considered [Eq. (a)].

(a)
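The body of Eq. (a) is not reproduced in this text. A common form of the matching criterion in Kohonen networks is the smallest squared Euclidean distance between input and weights; this is assumed here and is not taken from the original equation. With j running over the 144 neurons and i over the seven input variables, the winning neuron c for reaction s would then be:

```latex
c = \arg\min_{j} \sum_{i=1}^{7} \left( x_{si} - w_{ji} \right)^{2}
```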

After each input of a reaction the weights of all neurons are adjusted so as to make them more similar to the input variables. However, this adjustment is largest for the winning neuron c and decreases with increasing distance of a neuron from this central neuron. Reactions that have similar electronic variables will thus be mapped into the same or adjacent neurons. In our case quite a few neurons obtained several reactions (up to five), indicating a high degree of similarity for those reactions. A number of neurons (78) did not obtain any reaction at all.
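The adjustment step described above can be sketched as follows. This is a minimal illustration and not the authors' program: the squared Euclidean distance for finding the winner, the Gaussian decay with grid distance, the learning rate, the width, and all function names are assumptions. Grid positions are 0-based here, whereas the text numbers neurons from 1.

```python
import math
import random


def make_map(rows, cols, n_inputs, rng):
    """Create a rows x cols grid of neurons with random initial weights."""
    return [[[rng.random() for _ in range(n_inputs)] for _ in range(cols)]
            for _ in range(rows)]


def sq_dist(x, w):
    """Squared Euclidean distance between an input vector and a weight vector."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))


def winner(weights, x):
    """Return the (row, col) of the neuron whose weights are closest to x."""
    best, best_d = None, math.inf
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sq_dist(x, w)
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_step(weights, x, rate=0.5, width=1.5):
    """Pull every neuron toward x; the pull shrinks with distance from the winner."""
    win_r, win_c = winner(weights, x)
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            grid_d2 = (r - win_r) ** 2 + (c - win_c) ** 2
            # Assumed Gaussian neighborhood: factor equals `rate` at the winner.
            factor = rate * math.exp(-grid_d2 / (2.0 * width ** 2))
            for i, xi in enumerate(x):
                w[i] += factor * (xi - w[i])
    return win_r, win_c


rng = random.Random(0)
som = make_map(12, 12, 7, rng)  # 12 x 12 neurons, seven weights each
```

In a full training run, `train_step` would be called repeatedly over all reactions, usually with `rate` and `width` shrinking over time; that schedule is not specified in the text and is left out here.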
To visualize the results of the self-organization of the 120 reactions in the Kohonen network, the reactions were inspected by chemists and assigned to appropriate reaction types, which were identified by symbols (Table 2). The neurons were labeled with these symbols to indicate which reaction was projected into which neuron. The results are shown in Figure 1.

Fig. 1. Kohonen feature map obtained for the classification of 120 reactions. White boxes represent empty neurons; boxes with an x denote conflict neurons. The symbols are defined in Table 2.

Neurons that obtained several reactions almost always held reactions of the same type, indicating the power of the Kohonen network in perceiving similarities between different reactions. Only one neuron, neuron (12,1), stored conflicting information: it contained both a condensation reaction and a reaction that had been coded in the data base with a wrong reaction center.
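The labeling of the map, including the detection of conflict neurons, can be sketched as below. The function name and the one-letter type symbols are hypothetical placeholders (the actual symbols are those of Table 2), and the neuron positions in the example are arbitrary apart from (12,1).

```python
from collections import defaultdict


def label_map(positions, reaction_types):
    """Label each occupied neuron with its reaction type, or "x" on conflict.

    positions      -- winning neuron (row, col) for each reaction
    reaction_types -- type symbol assigned to each reaction by the chemist
    Empty neurons simply do not appear in the returned dictionary.
    """
    hits = defaultdict(set)
    for pos, rtype in zip(positions, reaction_types):
        hits[pos].add(rtype)
    labels = {}
    for pos, types in hits.items():
        # One distinct type: use it as the label; several: conflict neuron.
        labels[pos] = next(iter(types)) if len(types) == 1 else "x"
    return labels


# Hypothetical example: two reactions of type "M" share one neuron, while
# neuron (12, 1) receives two different types and becomes a conflict neuron.
labels = label_map(
    [(1, 1), (1, 1), (12, 1), (12, 1)],
    ["M", "M", "C", "W"],
)
```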
Michael additions comprised the bulk of the reactions, and consequently the Kohonen network reserved the largest area for this reaction type. The more specialized reaction types and those with only a few members were pushed to the edges of the map. It is quite remarkable that the reaction types assigned by a chemist were also perceived by the Kohonen network, which used only the specified physicochemical variables of the reaction center as criteria for assigning reactions; in the Kohonen network the individual reactions of one type were collected in the same region of the map.
Even within the area of one reaction type, the site at which a reaction is located carries considerable chemical information. Figure 2 shows the part of the map of Figure 1 containing Michael additions. Additional labels give further information on the individual reactions. Reactions that have only one (no label), two (2Z), or three (3Z) electron-withdrawing groups at the reacting H-C bond are quite well separated. By the same token, reactions that have two strongly electron-withdrawing groups (*) at the C=C bond are separated from those that have only one (no label). An important advantage of a two-dimensional map is that structural variations at both reacting bonds, the C-H and the C=C bonds, can be indicated simultaneously. Furthermore, within a reaction type such as the Michael additions, the more special examples (S) are again found at the edges of the map.

Fig. 2. Detailed analysis of the cluster of Michael additions in the Kohonen feature map in Fig. 1. The neurons marked with 2Z and 3Z were mapped by Michael additions in which the reacting H-C bonds are activated by two or three strongly electron-withdrawing groups, respectively. The neurons marked with * indicate Michael additions in which the reacting C=C bonds are activated by two strongly electron-withdrawing groups. The neurons marked with S were mapped by special Michael additions.

We will now discuss some of the special Michael additions in more detail. Scheme 1 shows the reactions mapped into neurons (6,1), (3,7), (3,12), and (12,12). Whereas in most Michael additions a H-C bond is activated by groups that exert a -M effect, the three reactions mapped into neuron (6,1) show that the carbanion initiating a Michael addition can also be stabilized by three groups exerting a -I effect [9]. The two reactions in neuron (3,7) are unusual: the reacting CH3 group is activated by an ester group that exerts its influence through conjugation across a double bond [10]. A CH3 group also reacts in the reaction stored in neuron (3,12), but it is activated by an ortho-nitro substituent on the phenyl group [11]. Clearly, a common feature of the last two reactions is that a CH3 group is activated by an electron-withdrawing group in conjugation; however, this effect is transmitted across different systems. Thus, it is gratifying that these two special Michael additions end up in similar regions of the map (neurons (3,7) and (3,12)).

Scheme 1. Michael additions mapped into neurons (6,1), (3,7), (3,12), and (12,12). The bonds denoted by dashed lines are those broken or made in the reaction.

The reaction in neuron (12,12) is the only Michael addition [12] in this set of reactions in which a H-C(sp) bond reacts; all other cases involved a H-C(sp3) bond. This reaction thus extends the scope of Michael additions and is stored at the borderline of the area of this reaction type.
One of the exciting features of the Kohonen network is that in classifying the reactions, the network automatically gains chemical knowledge from those reaction instances. This "trained" network can then be used to predict the reaction types of unknown reactions. To illustrate this point, the 120 reactions of the previous data set were divided into two groups. The 60 reactions with odd indices (see Table 2) were used as the training set, the other 60 reactions as the test set. The result is quite impressive: the reaction types of 95% (57 instances) of the reactions in the test set were correctly predicted. The reason for the two undecided cases and the one wrong case is very simple: these three special reactions are not represented in the training set. In fact, all three of these reactions were mapped into empty neurons at the edges of the map.
The two-dimensional Kohonen feature maps can show the relationships between the chemical reactions under investigation, point out the major reaction types, present their scopes, and also indicate unusual reactions. Thus, they allow the chemist to order observations on chemical reactions in an intuitively more appealing and chemically more significant manner.
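The split and the bookkeeping behind the quoted figures can be sketched as below. The function names are hypothetical, and treating an empty or conflict neuron as "undecided" is an assumption; the text does not state how the undecided and wrong cases were determined.

```python
def split_by_index(n):
    """Odd-numbered reactions form the training set, even-numbered the test set."""
    indices = range(1, n + 1)
    train = [i for i in indices if i % 2 == 1]
    test = [i for i in indices if i % 2 == 0]
    return train, test


def predict(labeled_map, position):
    """Return the type stored at the winning neuron, or None if undecided.

    Assumption: an empty neuron (no entry) or a conflict neuron ("x")
    gives no prediction.
    """
    label = labeled_map.get(position)
    return None if label in (None, "x") else label


def tally(predictions, truths):
    """Count correct, undecided, and wrong predictions."""
    correct = sum(1 for p, t in zip(predictions, truths) if p == t)
    undecided = sum(1 for p in predictions if p is None)
    wrong = len(truths) - correct - undecided
    return correct, undecided, wrong


train_ids, test_ids = split_by_index(120)
# 57 correct predictions out of 60 test reactions, as reported in the text.
accuracy_percent = 100 * 57 / len(test_ids)
```

With 57 correct, 2 undecided, and 1 wrong prediction, the three counts add up to the 60 test reactions, and 57 of 60 corresponds to the stated 95%.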

Received: October 6, 1995 [Z8450IE]
German version: Angew. Chem. 1996, 108, 844-846

Keywords: Kohonen maps · Michael additions · neural networks · reaction classification


Johann.Gasteiger@chemie.uni-erlangen.de