Tudor I. Oprea, Johan Gottfries, Vladimir Sherbukhin, Peder Svensson, and Thomas C. Kühler
Chemical information management in drug discovery: Optimizing the computational and combinatorial chemistry interfaces

Color Plate I. Spotfire can display data of various types in several ways, including 2D and 3D scatter plots, bar charts, and pie charts. Filters can be applied to one or several columns in the data table by range sliders, item sliders, or check boxes, depending on the data type in each column. Color coding, size and shape of markers, and labeling or flagging of objects make it possible to view data at a higher resolution than normal. Highlighting one object (molecule) in one graph immediately highlights the same object in all other graphs, and at the same time a list of all available data for that object becomes available. The structure visualization plug-in provides instantaneous access to structural data simply by clicking on the object of interest. It also allows additional filters to be applied via substructure or similarity searches. The search results are immediately added to the data table and appear as a check box query device. Panels a and b illustrate how the properties and structural features of a set of reagents (or enumerated products) could be visualized. They show the ability to view relationships in two- and three-dimensional projections, the possibility to home in on the neighborhood of any of the selected objects (compounds), and the capability to use multidimensional distance filters, which make it easy to identify and structurally inspect neighbors. Obviously, it is difficult to show the real power of the application with just a few static screen-shots.
[Panel annotation: zooming in on the region around a selected compound (right mouse button).]
(a) Scatter plot based on 2,801 aliphatic halides projected in the two most important property dimensions of a 10-dimensional PCA model.
The current screen-shot is centered around one structure (highlighted in blue) out of 20 that have been identified by a space-filling selection procedure in the 10 dimensions. The structure associated with the highlighted data point is shown in the structure visualizer window, in this case the compound with ID = 29. The letter H does not stand for hydrogen but codes for a halogen, in this case either a chlorine or a bromine. A Euclidean distance filter (over all 10 dimensions) to the selected compound (ID = 29) can be applied by adjusting the range slider (red ellipse). The current screen-shot shows compound ID = 29 and its 310 closest neighbors out of the initial 2,801 compounds (green ellipse).
(b) Scatter plot based on 2,801 aliphatic halides projected in the three most important property dimensions of a 10-dimensional PCA model. The halides containing an aldehyde functionality (in this case 20) have been identified by a substructure search and converted into a convenient check box filter (blue box and bars). The current screen-shot is centered around one of the aldehydes (MFCD00799494) by a novel 3D neighborhood zooming functionality (green ellipse). The current screen-shot shows 152 compounds out of the initial 2,801. The aldehydes found are colored in red and tagged with an optional label, here the MFCD number.
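The two operations named in the caption, a space-filling selection and a Euclidean distance filter in the 10 PCA dimensions, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the caption does not say which space-filling algorithm was used, so a greedy max-min pick is substituted here; the score vectors are random placeholders, not the published PCA scores; and the function names `distance_filter` and `maxmin_selection` are hypothetical and are not part of Spotfire.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length score vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_filter(scores, ref_id, max_dist):
    """Return the IDs lying within max_dist of the reference compound,
    nearest first. The reference itself (distance 0) is included."""
    ref = scores[ref_id]
    hits = sorted((euclidean(ref, vec), cid) for cid, vec in scores.items())
    return [cid for dist, cid in hits if dist <= max_dist]


def maxmin_selection(scores, n_pick, seed_id):
    """Greedy max-min pick (an assumed stand-in for the space-filling
    procedure): each new compound is the one farthest from all compounds
    chosen so far."""
    n_pick = min(n_pick, len(scores))
    picked = [seed_id]
    # Distance from every compound to its nearest already-picked compound.
    nearest = {cid: euclidean(vec, scores[seed_id])
               for cid, vec in scores.items()}
    while len(picked) < n_pick:
        nxt = max(nearest, key=nearest.get)
        picked.append(nxt)
        for cid, vec in scores.items():
            nearest[cid] = min(nearest[cid], euclidean(vec, scores[nxt]))
    return picked


# Placeholder data: 2,801 compounds, 10 random "PCA scores" each.
rng = random.Random(0)
scores = {cid: [rng.gauss(0.0, 1.0) for _ in range(10)]
          for cid in range(1, 2802)}

selected = maxmin_selection(scores, n_pick=20, seed_id=29)
neighbors = distance_filter(scores, ref_id=29, max_dist=3.0)
print(len(selected), "compounds picked;", len(neighbors), "within the radius")
```

In this sketch the `max_dist` argument plays the role of the range slider: widening it admits more neighbors of the reference compound, and narrowing it leaves only the closest ones.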
[Figure annotations: inspection of selection; distance filter in 10 dimensions]
J. Mol. Graphics Mod., 2000, Vol. 18, August-October
541