
5 Known Limitations and Further Development

EROS is still under active development, particularly with regard to the knowledge base. There are therefore a number of known problems and limitations that we are trying to resolve. Users, and particularly system managers, of EROS should contact us to obtain the most recent developments.

5.1 Reaction Rules

5.1.1 Number of Reaction Rules

The single most serious limitation is that the number and scope of the reaction types available as coded reaction rules are still rather limited. The ones presently available have mainly been coded to highlight the range of applications open to EROS.
Thus, the coverage of organic reactions presently is rather fragmentary.
In this situation, users may code their own reaction rules. The present set-up requires knowledge of the scripting language Tcl, which can only be expected from experienced users or system managers.

5.1.2 Automatic Rule Generation

For many years we have been working on automatic reaction rule generation. The previous version, EROS 6, had such a facility. Since then we have applied machine learning techniques such as conceptual clustering methods to perceive reaction types and thus extract knowledge from reaction databases.[29] The system HORACE used topological and physicochemical criteria to perform this task.
Our present answer to this problem is self-organizing neural networks such as the one proposed by Kohonen. Methods have been developed for clustering reactions extracted from databases into landscapes of reactions that allow the perception of reaction types.[30][31][32] This approach has been further developed to explore the feasibility and scope of a reaction type. Mechanisms have already been provided for accessing a Kohonen network from a reaction rule. Thus, we are on the way to automatically extracting knowledge on chemical reactions from reaction databases and making such knowledge available to EROS.

5.1.3 Number of Neural Networks

The evaluation of reaction types can be supported by neural networks such as backpropagation (BPG) networks, Kohonen networks, or counterpropagation (CPG) networks. For each rule written in Tcl, only one BPG network and one Kohonen/CPG network can be provided. The network must be initialized in the RULE_INFO part of the reaction rule that uses it.

5.1.4 Floating Point Values and Integer Numbers

Floating point values and integer numbers that are transferred to the core system must not contain + signs. For an example of a resulting problem and how to avoid it, see chapter 3.5.
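As an illustration of this restriction, the following Python sketch renders a number as text without any + character. The helper name format_without_plus is hypothetical and not part of EROS. Whether the core system accepts exponent notation such as 1e20 is not stated in this manual, so that part is an assumption; chapter 3.5 remains the authoritative description.

```python
def format_without_plus(value):
    """Return a textual form of an int or float that contains no '+'.

    Hypothetical helper, not part of EROS. Python writes large floats
    as e.g. '1e+20'; the '+' in the exponent is simply dropped, which
    leaves a string that still denotes the same number.
    """
    text = repr(value) if isinstance(value, float) else str(value)
    return text.replace("+", "")


if __name__ == "__main__":
    for number in (42, 1.5, 1e20, -2.5e-7):
        print(format_without_plus(number))
```

Negative numbers and negative exponents are left untouched, since only the + character is removed.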

5.2 Number of Intermediates and Products

The current version of EROS keeps all reaction intermediates and products in memory and does not save the molecules on disk when the required storage space exceeds memory size. Storage requirements may become particularly critical in the simulation of combinatorial chemistry experiments, when a large number of products is generated. The memory necessary for an EROS run depends on the number and size of the different structures produced and on the contents of the reaction rules, such as the number of rules and the use of physicochemical variables. The EROS 7 system itself needs slightly more than 20 MB. Depending on the size of the molecules and the physicochemical variables used, about 5 to 250 kB are needed per molecule.
Presently, care has to be taken that enough memory is available. Otherwise, EROS 7 may crash uncontrollably. When the development of EROS was initiated, not all C++ compilers implemented exception handling, and thus a check for insufficient memory at important steps in the program could not be made.
The next version of EROS will provide for an automatic save onto disk if the number of molecules exceeds a given threshold. Thus, this limitation will have been overcome.
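The figures quoted in this section (slightly more than 20 MB for the system itself, about 5 to 250 kB per molecule) allow a rough estimate of the memory range for a planned run. The following Python sketch is only such an estimate; the function name and the conversion of 1 MB = 1024 kB are assumptions, and the real requirement depends on the molecules and rules used.

```python
BASE_MB = 20.0                   # "slightly more than 20 MB" for EROS 7 itself
PER_MOLECULE_KB = (5.0, 250.0)   # quoted range per molecule


def estimate_memory_mb(n_molecules):
    """Return a (low, high) estimate in MB for n_molecules structures.

    Hypothetical helper based only on the figures in this section;
    assumes 1 MB = 1024 kB.
    """
    low = BASE_MB + n_molecules * PER_MOLECULE_KB[0] / 1024.0
    high = BASE_MB + n_molecules * PER_MOLECULE_KB[1] / 1024.0
    return low, high


if __name__ == "__main__":
    low, high = estimate_memory_mb(10000)
    print(f"10000 molecules: roughly {low:.0f} to {high:.0f} MB")
```

Because the upper bound is fifty times the lower bound per molecule, the high value is the safer one to plan with when physicochemical variables are used heavily.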

5.3 Integration of Kinetics

It is recommended to use the GEAR algorithm for the integration of the differential equations. The other integration methods (Runge-Kutta and Runge-Kutta-Merson) may fail in some cases. The multi-time dosage of starting materials works only with the GEAR algorithm, because the other integration methods are less stable.
The file with the data for the concentration-time curves is presently only written with the GEAR algorithm. The end concentrations are calculated by all three integration methods. If minimal_concentration is set to 0.0, the concentration values are written into the file PS1.prd. If minimal_concentration is set to a value higher than 0.0, a file PS#.prd is written for each reaction level, where # is the number of the level. The concentration values for the entire reaction network are in the file PS#.prd with the highest level number #. All these files have the same file format. The file names currently cannot be changed.
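Since the file for the entire reaction network is the PS#.prd file with the highest level number, it can be picked out of a directory listing mechanically. The following Python sketch shows one way to do this; the function name is hypothetical, and the sketch assumes that # is written as a plain decimal number without padding.

```python
import re

# Matches the fixed naming scheme PS#.prd described above.
PRD_NAME = re.compile(r"^PS(\d+)\.prd$")


def network_file(filenames):
    """Return the PS#.prd name with the highest level number, or None.

    Hypothetical helper: filenames is any iterable of file names,
    e.g. the result of os.listdir() on the output directory.
    """
    best_level, best_name = -1, None
    for name in filenames:
        match = PRD_NAME.match(name)
        if match and int(match.group(1)) > best_level:
            best_level, best_name = int(match.group(1)), name
    return best_name


if __name__ == "__main__":
    print(network_file(["PS1.prd", "PS2.prd", "PS10.prd", "notes.txt"]))
```

The level numbers are compared as integers, so PS10.prd is correctly ranked above PS2.prd.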
The files PS#.prd are column oriented. The first column gives the time in seconds, followed by columns for molecule 0, molecule 1, and so on until the last molecule. The molecule numbers are the same as in the structure file. Because all molecules are stored with numbers starting at zero and are copied to new molecule numbers before they take part in a reaction, the concentration values for molecule 0 (and for the following starting molecules, if the simulation starts with more than one molecule) are all 0.0.
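The column layout described above can be read with a few lines of Python. This is a minimal sketch under two assumptions that the manual does not confirm: columns are separated by whitespace, and the file contains no header line. The function name parse_prd is hypothetical.

```python
def parse_prd(text):
    """Split the text of a PS#.prd file into times and per-molecule curves.

    Assumes whitespace-separated columns and no header line.
    Returns (times, concentrations), where concentrations maps the
    molecule number (0, 1, ...) to its list of concentration values.
    """
    times = []
    concentrations = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        times.append(float(fields[0]))  # first column: time in seconds
        for molecule, field in enumerate(fields[1:]):
            concentrations.setdefault(molecule, []).append(float(field))
    return times, concentrations


if __name__ == "__main__":
    sample = "0.0 0.0 1.0 0.0\n1.0 0.0 0.5 0.5\n"
    times, conc = parse_prd(sample)
    print(times, conc[1], conc[2])
```

In the sample, molecule 0 stays at 0.0 throughout, which mirrors the behaviour of the starting molecule numbers described above.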
The total reaction time in these files may be less than the chosen reaction time because these files are limited to 4999 time intervals of the integration. The size of the intervals is chosen by an internal algorithm, so the number of intervals needed may exceed 4999. In that case the concentration values for all compounds end at the time at which the maximum of 4999 intervals is reached.
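A simple way to notice this cut-off is to compare the last time value in the file with the reaction time that was requested. The Python sketch below does only that; the function name and the tolerance parameter are hypothetical, and the sketch deliberately does not rely on the exact number of rows, since the manual does not state how 4999 intervals map to lines in the file.

```python
def looks_truncated(times, requested_time, rel_tol=1e-9):
    """Return True if the recorded times stop before requested_time.

    times is the list of values from the first column of a PS#.prd
    file, in seconds. rel_tol guards against rounding in the last
    written time value.
    """
    if not times:
        return False  # nothing recorded, nothing to compare
    return times[-1] < requested_time * (1.0 - rel_tol)


if __name__ == "__main__":
    print(looks_truncated([0.0, 1.0, 2.0], 10.0))   # stops early
    print(looks_truncated([0.0, 5.0, 10.0], 10.0))  # reaches the end
```

If the check reports a truncated file, the end concentrations calculated by the integration are still valid; only the stored curve is incomplete.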

5.4 Physicochemical Variables

Most of the physicochemical variables are still calculated from a connection table representation of molecules. The structures in the MOSES format are automatically converted to one reasonable connection table. The calculation of physicochemical descriptors starts from this structure and therefore still suffers from the limitations of a connection table. In particular, it cannot be controlled which mesomeric structure will be generated in the conversion of a MOSES representation.

5.5 Manual

The manual still needs extensions in various chapters. In particular, chapter 4, Writing Your Own Reaction Rules, still has to be written in English. A German version is contained in the dissertation by Dr. Robert Höllering, University Erlangen-Nürnberg, 1998, which can be accessed over the internet at http://www2.chemie.uni-erlangen.de/services/dissonline/dissertation/Robert_Hoellering/html/
(Some of the functionality described there is not yet working: combine_elsys, some group handling functionality, and the handling of the internal error flag.)

This dissertation is a rich source of additional information on the EROS system. However, all this information is in German.
Detailed information on the MOSES data structure and its implementation in C++ can be obtained from the dissertation of Dr. Susanne Bauerschmidt, which can also be accessed online at http://www2.chemie.uni-erlangen.de/services/dissonline/data/dissertation/Susanne_Bauerschmidt/html
Again, however, all this information is in German.



Prof. Dr. J. Gasteiger
Computer Chemie Centrum, Org. Chem., Uni. Erlangen
Nägelsbachstraße 25
D-91052 Erlangen

Gasteiger@CCC.Chemie.Uni-Erlangen.DE