Introduction to Generate3D

Inside Generate3D

Fragments

Build tree

Geometry optimization

Diversity limit

Hyperfine

Usage - conformational analysis

Usage - verbose

Verbose in molconvert

Switch verbose using environmental variable

Forum links

ChemAxon’s Generate3D is the molecular coordinate generation component used by the Structure/Clean3D function of the Marvin GUI, the Conformers calculator plugin, and the molconvert command line tool. This document gives a brief introduction to the inner workings, usage and tuning of Generate3D.

Inside Generate3D

This part introduces the key concepts used in Generate3D.

Fragments

To generate coordinates/conformers in the default configuration, the input molecule is divided into small fragments. One or more conformers are generated for every fragment in an atom-by-atom manner (a rough introduction is available at http://www.chemaxon.com/library/advanced-automatic-generation-of-3d-molecular-structures/ ; see the downloadable PDF file: http://www.chemaxon.com/conf/Advanced_automatic_generation_of_3D_molecular_structures.pdf ).

Build tree

As a next step, the fragments are organized into a build tree structure according to their overlaps. Each tree node represents a fragment of the input structure: the individual fragments mentioned above are associated with the bottom level nodes (leaves). Every other tree node represents a bigger fragment that results from fusing the fragments represented by its child nodes. This way the input structure is represented by the highest level (root) node of the build tree.

Coordinates and conformers are generated in a demand driven fashion by the build tree: every node can be commanded to generate one or more conformers for its represented fragment.

Bottom level nodes in the build tree (leaves, representing the individual fragments) achieve this by building the fragment in an atom-by-atom manner, determining multiple energetically favorable atom placement variations.

Other nodes in the build tree, including the root node, represent fragments that result from fusing together other fragments (represented by their child nodes). When a new conformer is requested from such a node, it first explores the conformational space resulting from the various rigid fuses of the fragment conformers already generated by its child nodes. If no more conformers can be generated this way, one of the child nodes is asked to provide one additional conformer and the pairwise fuse restarts.

Fragment fusing sequence

The build tree determines the order in which the fragments are fused. The order in which the conformational space is explored, however, is not fully determined. Currently a top-down approach is used: a lower level node in the build tree will not explore its conformational space further as long as a higher level fuse can still yield further conformers.
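The demand-driven, top-down behaviour described above can be illustrated with a toy model. The following Python sketch is not the Generate3D implementation: the class names (Leaf, FuseNode), the string representation of conformers and the strict alternation between the two child nodes are assumptions made for illustration only. Real fuses are geometric operations that may yield several arrangements per conformer pair and may be filtered or optimized.

```python
class Leaf:
    """Bottom level node: hands out prebuilt fragment conformers one at a time."""

    def __init__(self, variants):
        self._pending = list(variants)  # conformers not handed out yet
        self.conformers = []            # conformers generated so far

    def next_conformer(self):
        if not self._pending:
            return None
        conf = self._pending.pop(0)
        self.conformers.append(conf)
        return conf


class FuseNode:
    """Inner node: fuses the conformers of its two child nodes on demand."""

    def __init__(self, left, right):
        self.children = (left, right)
        self.conformers = []   # fused conformers handed out so far
        self._pending = []     # fused conformers waiting to be handed out
        self._turn = 0         # which child is asked next (assumption: alternate)

    def _ask_child(self):
        """Ask one child for one more conformer and queue the new fuses.

        Returns False when neither child can provide a new conformer.
        """
        for _ in range(2):
            index = self._turn % 2
            self._turn += 1
            child, other = self.children[index], self.children[1 - index]
            new = child.next_conformer()
            if new is None:
                continue
            for partner in other.conformers:
                pair = (new, partner) if index == 0 else (partner, new)
                self._pending.append("-".join(pair))  # stand-in for a rigid fuse
            return True
        return False

    def next_conformer(self):
        # Top-down: children are only asked when this node has nothing left.
        while not self._pending:
            if not self._ask_child():
                return None
        conf = self._pending.pop(0)
        self.conformers.append(conf)
        return conf


a = Leaf(["a1", "a2"])
b = Leaf(["b1", "b2"])
root = FuseNode(a, b)

first = root.next_conformer()
generated_by_a_after_first = len(a.conformers)

rest = []
while True:
    conf = root.next_conformer()
    if conf is None:
        break
    rest.append(conf)
```

In this toy run the first request makes each leaf build only a single conformer; further leaf conformers are requested only once the root has no fuses left to hand out.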

Geometry optimization

Fragment conformers resulting from the fusing process are optimized when required before they are passed to a higher level node in the build tree. This happens when the rigid fuse results in a high-energy conformation (strained bonds or angles, atoms in close proximity).

Conformers generated for the input structure (by the root node in the build tree) are always optimized. Optionally an additional MMFF94[1] based optimization and energy calculation can be executed on these final structures.

Internally, the building process uses a proprietary extended version of the Dreiding[2] force field. Our extensions

Diversity limit

Conformers in the build tree are separated by a predefined diversity limit: two fragment conformers are considered different only if their RMSD difference is higher than this limit. The diversity limit for the input structure (represented by the root node in the build tree) can be set externally (using parameter [diversity] in API/molconvert, -d in cxcalc).
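The idea of a diversity limit can be sketched in a few lines of Python. This is an illustration only, under simplifying assumptions: the function names (rmsd, select_diverse) are hypothetical, the conformers are compared without any superposition or symmetry handling, and the greedy keep-first strategy is a choice made for the sketch, not a description of Generate3D internals.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation of two coordinate lists with the same atom order.

    Assumption of this sketch: no alignment is done before the comparison.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("conformers must have the same number of atoms")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def select_diverse(conformers, diversity_limit):
    """Keep a conformer only if it differs from every kept one by more than the limit."""
    kept = []
    for conf in conformers:
        if all(rmsd(conf, other) > diversity_limit for other in kept):
            kept.append(conf)
    return kept


conf_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
conf_b = [(0.0, 0.0, 0.0), (1.0, 0.05, 0.0)]  # nearly identical to conf_a
conf_c = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]   # clearly different

diverse = select_diverse([conf_a, conf_b, conf_c], diversity_limit=0.1)
```

With a limit of 0.1, conf_b is treated as the same conformer as conf_a and is dropped, while conf_c is kept.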

Hyperfine

Conformers that do not correspond to a valid local energy minimum can be eliminated by an optional post-processing step called hyperfine. In this step all generated conformers are processed with multiple short, low temperature molecular dynamics runs followed by strict geometry optimization. Only these processed conformers are returned by Generate3D.
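The effect of such a perturb-then-optimize step can be demonstrated on a one-dimensional toy potential. The sketch below is only an analogy: the double-well energy function, the fixed displacements that stand in for the molecular dynamics runs, and all function names are assumptions of this example and have nothing to do with the force field or the dynamics used by Generate3D.

```python
def energy(x):
    """Toy double-well potential: minima at x = -1 and x = 1, a maximum at x = 0."""
    return (x * x - 1.0) ** 2


def gradient(x):
    return 4.0 * x * (x * x - 1.0)


def optimize(x, step=0.05, tolerance=1e-10, max_iterations=10000):
    """Plain gradient descent; stops when the gradient is practically zero."""
    for _ in range(max_iterations):
        g = gradient(x)
        if abs(g) < tolerance:
            break
        x -= step * g
    return x


def hyperfine_like(conformers, kicks=(-0.05, 0.05), same_tolerance=1e-3):
    """Perturb every input, re-optimize it and keep the distinct results only.

    The fixed kicks are a deterministic stand-in for short dynamics runs.
    """
    refined = []
    for start in conformers:
        for kick in kicks:
            x = optimize(start + kick)
            if all(abs(x - known) > same_tolerance for known in refined):
                refined.append(x)
    return refined


# x = 0 is a stationary point (zero gradient) but not a minimum:
stuck = optimize(0.0)
refined = hyperfine_like([0.0, 1.0, -1.0])
```

Optimization alone leaves the input at x = 0 untouched because the gradient vanishes there. After a small perturbation it relaxes into one of the two real minima, so only the two valid minima remain in the result.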

Properties of the conformational analysis

Conformational space coverage

Conformational flexibility assigned to higher level (later) fragment-fragment fuses is explored more thoroughly.

Deterministic execution

The fragmentation algorithm might depend on the atom index order and on structure standardization. During the execution, the short molecular dynamics runs might introduce non-deterministic components. Geometry optimization, which is used during the building process, is deterministic; however, it can be sensitive to minute (infinitesimal) variations of its input (which can be introduced by a previous MD run).

Since the execution speed of software might vary, the time-out feature might introduce non-deterministic success/failure for corner case inputs (those whose execution time is close to the allowed time limit).

Usage - conformational analysis

Generate3D expects two conformer count parameters:

Defaults:

Usage - tips for conformational analysis

time cxcalc -Xmx4g conformers -m 1000 -y false -l 9000 "C(N)(O)C" | tee out.sdf | molconvert smiles - | sort -u

Usage - verbose

A verbose mode (released in Marvin version 5.7) is available. It prints progress information to the console, on the standard error, at various levels of detail[3]. Verbose mode can be set by a parameter string passed to the molconvert command line tool or by setting the “chemaxon_clean3d_options_verbose” environment variable.

Verbosity level can be set for multiple printers, however typically all printers are set to a high verbosity level when required.

Please note that the example verbose printouts in this document are not updated automatically: the actual printouts produced by newer Marvin versions might differ.

Verbose in molconvert

Pass the [verbose] option to turn on minimal verbosity, or [verbose]{9} to turn on verbosity level 9, in the Generate3D option string:

Turn on minimal verbosity (verbose level 1):

$ ./molconvert sdf -3:[verbose] -s C1CCCCCCCCCC1 > out.sdf

[GENERAL:1] 3D generation invoked, opts=[verbose]

[GENERAL:1] Start coordinate generation

[CONFANAL:1]   0 conformers available, needed 1 more.

[CONFANAL:1]   0 conformers available, needed 1 more.

[GENERAL:1] Coordinate generation finished

Turn on high verbosity (verbose level 9):

$ ./molconvert sdf -3:[verbose]{9} -s C1CCCCCCCCCC1 > out.sdf

[GENERAL:1] 3D generation invoked, opts=[verbose]{9}

[GENERAL:2] Input structure: C1CCCCCCCCCC1

[GENERAL:2] Verbose level configuration for different verbose printers:

[GENERAL:2]         printer: GENERAL level: 9

[GENERAL:2]         printer: HYPERFINE level: 9

[GENERAL:2]         printer: CONFANAL level: 9

[GENERAL:1] Start coordinate generation

[CONFANAL:1]   0 conformers available, needed 1 more.

[CONFANAL:2]         Invoke light build

[CONFANAL:2]           returned 81 conformer(s) (0 after filtering)

[CONFANAL:1]   0 conformers available, needed 1 more.

[CONFANAL:2]         Invoke optimization on soft errors

[CONFANAL:2]           1 new conformer(s) found

[GENERAL:1] Coordinate generation finished

[GENERAL:2]   Generated conformer count: 1
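Because the verbose messages go to the standard error while the generated structure goes to the standard output, the two streams can be captured separately when molconvert is launched from a script. The Python sketch below assumes that a molconvert executable is available on the PATH; the helper names (molconvert_3d_command, run_molconvert_3d) are hypothetical. The command line arguments are the ones used in the examples above.

```python
import subprocess


def molconvert_3d_command(smiles, verbose_level, executable="molconvert"):
    """Build the argument list for a verbose 3D generation run.

    Level 1 uses the bare [verbose] option, other levels use [verbose]{N},
    mirroring the two command lines shown above.
    """
    if verbose_level == 1:
        option = "-3:[verbose]"
    else:
        option = "-3:[verbose]{%d}" % verbose_level
    return [executable, "sdf", option, "-s", smiles]


def run_molconvert_3d(smiles, verbose_level, executable="molconvert"):
    """Run molconvert and return (sdf_text, verbose_text).

    Assumption: the executable exists; it is not called by this sketch itself.
    """
    result = subprocess.run(
        molconvert_3d_command(smiles, verbose_level, executable),
        capture_output=True,  # keep stdout (SDF) and stderr (verbose) apart
        text=True,
        check=True,
    )
    return result.stdout, result.stderr


command = molconvert_3d_command("C1CCCCCCCCCC1", 9)
```

Passing the arguments as a list avoids any shell interpretation of the brackets and braces in the option string. The captured verbose text is meant for reading only, since its format might change between releases.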

Switch verbose using environmental variable

Generate3D checks for the presence of the environmental variable “chemaxon_clean3d_options_verbose”. If it is found, its value is processed as the argument part of the option “[verbose]”. If verbose mode is defined both by this environmental variable and in the option string, the higher of the two specified verbose levels is used.
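The rule for combining the two sources of the verbose level can be written down as a short Python sketch. It only restates the rule given above; the function name, the use of 0 for "verbose mode off" and the assumption that the variable holds a plain integer are choices made for this illustration, not a description of the actual code.

```python
import os

ENV_NAME = "chemaxon_clean3d_options_verbose"


def effective_verbose_level(option_level=None, environ=None):
    """Return the higher of the level from the option string and the environment.

    option_level: level requested in the option string, or None when the
                  [verbose] option is absent.
    environ:      mapping to read the variable from (defaults to os.environ).
    """
    if environ is None:
        environ = os.environ
    levels = [0]  # 0 stands for "verbose mode off" in this sketch
    if option_level is not None:
        levels.append(option_level)
    env_value = environ.get(ENV_NAME)
    if env_value is not None:
        levels.append(int(env_value))  # assumption: the value is an integer
    return max(levels)


level_env_wins = effective_verbose_level(1, {ENV_NAME: "9"})
level_option_wins = effective_verbose_level(9, {ENV_NAME: "1"})
level_env_only = effective_verbose_level(None, {ENV_NAME: "1"})
level_off = effective_verbose_level(None, {})
```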

Specifying the verbosity with this environmental variable can be useful when the Generate3D option string is not accessible, for example in the MarvinSketch/MarvinView GUIs or in the cxcalc command line tool.

Please note that the format and amount of verbose messages might be modified in the future, so they are not suitable for parsing or further machine processing.

Turn on minimal verbosity (verbose level 1):

$ export chemaxon_clean3d_options_verbose=1

$ ./msketch

(draw a structure in the launched MarvinSketch and press CTRL-3 or choose Structure - Clean 3D - Clean in 3D)

[GENERAL:1] 3D generation invoked, opts=S{fine}[prehydrogenize]

[GENERAL:1] Start coordinate generation

[CONFANAL:1]   0 conformers available, needed 1 more.

[GENERAL:1] Coordinate generation finished

$ echo C1CCCCC1 | ./cxcalc conformers > out.sdf

[GENERAL:1] 3D generation invoked, opts=c3[prehydrogenize][ca]{100}{100}[timelimit]{900}L{1}[diversity]{0.1}E

[GENERAL:1] Start coordinate generation

[CONFANAL:1]   0 conformers available, needed 100 more.

[GENERAL:1] Coordinate generation finished

Turn on high verbosity (verbose level 9):

$ export chemaxon_clean3d_options_verbose=9

$ ./msketch

(draw a structure in the launched MarvinSketch and press CTRL-3 or choose Structure - Clean 3D - Clean in 3D)

[GENERAL:1] 3D generation invoked, opts=S{fine}[prehydrogenize]

[GENERAL:2] Input structure: C1CCC2CCCCC2C1

[GENERAL:2] Verbose level configuration for different verbose printers:

[GENERAL:2]         printer: GENERAL level: 9

[GENERAL:2]         printer: HYPERFINE level: 9

[GENERAL:2]         printer: CONFANAL level: 9

[GENERAL:1] Start coordinate generation

[CONFANAL:1]   0 conformers available, needed 1 more.

[CONFANAL:2]         Invoke light build

[CONFANAL:2]           returned 16 conformer(s) (1 after filtering)

[GENERAL:1] Coordinate generation finished

[GENERAL:2]   Generated conformer count: 1

$ echo C1CCCCC1 | ./cxcalc conformers > out.sdf

[GENERAL:1] 3D generation invoked, opts=c3[prehydrogenize][ca]{100}{100}[timelimit]{900}L{1}[diversity]{0.1}E

[GENERAL:2] Input structure: C1CCCCC1

[GENERAL:2] Verbose level configuration for different verbose printers:

[GENERAL:2]         printer: GENERAL level: 9

[GENERAL:2]         printer: HYPERFINE level: 9

[GENERAL:2]         printer: CONFANAL level: 9

[GENERAL:1] Start coordinate generation

[CONFANAL:1]   0 conformers available, needed 100 more.

[CONFANAL:2]         Invoke light build

[CONFANAL:2]           returned 6 conformer(s) (5 after filtering)

[CONFANAL:2]         Invoke light build

[CONFANAL:2]           returned 0 conformer(s) (0 after filtering)

[CONFANAL:2]         Invoke heavier build

[CONFANAL:2]           returned 0 conformer(s) (0 after filtering)

[GENERAL:1] Coordinate generation finished

[GENERAL:2]   Generated conformer count: 5

High verbosity when using hyperfine:

$ export chemaxon_clean3d_options_verbose=9

$ echo C1CCCCC1 | ./cxcalc conformers -e > out.sdf

[GENERAL:1] 3D generation invoked, opts=c3[prehydrogenize][hyperfine][ca]{100}{100}[timelimit]{900}L{1}[diversity]{0.1}E

[GENERAL:2] Input structure: C1CCCCC1

[GENERAL:2] Verbose level configuration for different verbose printers:

[GENERAL:2]         printer: GENERAL level: 9

[GENERAL:2]         printer: HYPERFINE level: 9

[GENERAL:2]         printer: CONFANAL level: 9

[GENERAL:1] Start coordinate generation

[CONFANAL:1]   0 conformers available, needed 100 more.

[CONFANAL:2]         Invoke light build

[CONFANAL:2]           returned 6 conformer(s) (5 after filtering)

[CONFANAL:2]         Invoke light build

[CONFANAL:2]           returned 0 conformer(s) (0 after filtering)

[CONFANAL:2]         Invoke heavier build

[CONFANAL:2]           returned 0 conformer(s) (0 after filtering)

[HYPERFINE:1] Starting hyperfine on 5 initial conformers

[HYPERFINE:2]   Process initial conformer 0 initial E[0]= 10.97

[HYPERFINE:2]         Identified new conformer. E'[0]= 10.957

[HYPERFINE:2]         (Already identified conformer found with lower energy) E'[0]= 10.956

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 0

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 0

[HYPERFINE:2]         (Already identified conformer found with lower energy) E'[0]= 10.956

[HYPERFINE:2]   Process initial conformer 1 initial E[1]= 18.692

[HYPERFINE:2]         Identified new conformer. E'[1]= 18.679

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 1

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 1

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 1

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 1

[HYPERFINE:2]   Process initial conformer 2 initial E[2]= 18.696

[HYPERFINE:2]         Identified new conformer. E'[2]= 18.678

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 2

[HYPERFINE:2]         (Already identified conformer found with lower energy) E'[2]= 18.678

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 2

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 2

[HYPERFINE:2]   Process initial conformer 3 initial E[3]= 19.326

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 2

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 1

[HYPERFINE:2]         Identified new conformer. E'[3]= 19.361

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 3

[HYPERFINE:2]         (Already identified conformer found with lower energy) E'[3]= 19.359

[HYPERFINE:2]         (Already identified conformer found with lower energy) E'[3]= 19.357

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 2

[HYPERFINE:2]   Process initial conformer 4 initial E[4]= 22.679

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 2

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 2

[HYPERFINE:2]         (Already identified conformer found with lower energy) E'[2]= 18.678

[HYPERFINE:3]         (No new conformer) Equivalent with conformer 2

[HYPERFINE:1] Identified 4 conformers

[GENERAL:1] Coordinate generation finished

[GENERAL:2]   Generated conformer count: 4

Forum links

Tracker topic:

https://www.chemaxon.com/forum/ftopic8016.html (Generate3D related documents)

This document is referenced in the following topics:

https://www.chemaxon.com/forum/ftopic7961.html (Lower than lowest energy conformers)

https://www.chemaxon.com/forum/ftopic8026.html (Diversity limit causes Marvin conformer calculation to hang)

https://www.chemaxon.com/forum/ftopic10904.html (Regarding the conformational analysis wtih Clean 3D)


[1] Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94, Thomas A. Halgren, J. Comp. Chem.; 1996; 17 (5-6), pp 490-519, DOI: 10.1002/(SICI)1096-987X(199604)17:5/6<490::AID-JCC1>3.0.CO;2-P

[2] DREIDING: a generic force field for molecular simulations, Stephen L. Mayo, Barry D. Olafson, William A. Goddard, J. Phys. Chem.; 1990; 94 (26), pp 8897–8909, DOI: 10.1021/j100389a010

[3] Please note that the printed verbose info is not intended for further machine processing (parsing): its messages, its formatting and the scope of information emitted on the various levels might change with a new release.