On the intrinsic dimensionality of chemical structure space

Chemosphere, Vol. 17, No. 8, pp. 1617-1630, 1988. Printed in Great Britain.

0045-6535/88 $3.00 + .00 Pergamon Press plc

by

G.D. Veith, B. Greenwood¹, R.S. Hunter², G.J. Niemi, and R.R. Regal³

Environmental Research Laboratory-Duluth, 6201 Congdon Boulevard, Duluth, MN 55804

¹Computer Sciences Corporation, Falls Church, VA

²Center for Data Systems and Analysis, Montana State University, Bozeman, MT 59717

³Department of Mathematical Sciences, University of Minnesota-Duluth, Duluth, MN 55812

INTRODUCTION

An important chemical

structures

chemicals, "lead"

rates

and therapeutics

can be estimated chemicals

analogous structures

similarity are

is

is often

considered

simultaneously

similarity

which all

can be identified.

important variables,

comprehend what this is one of the

large

universe

problem.

tools

first

attempts

and scaling

In these

for many chemicals,

is

exploring

space

in

the

from it.

This

space for a

space has been activity

systematically

is transformed

to a smaller

between chemicals

for cluster

analysis

are

problem that

available

modeling and validating

in this

compiled set

of

coordinates

intuitive systematic

complex phenomena.

in a

techniques.

selection

of analogs,

data

of essential

small

in a multivariate

in

space are used as a

of other multivariate

for a relatively

A small number of points

of

for most of the variation

components are used as orthogonal

by the practical

properties

chemicals,

We have

we need to

structure

structure

data are

approach may be an improvement to

limited

chemical

is accomplished,

chemical

chemicals

so many potentially reduce

and the biological

approaches,

space and distances

measure of similarity

it

to

components which account

The principal

While this

are necessary

of chemical

and information

or principal

structure

words,

of chemicals.

molecules³,⁴,⁵,⁶

set.

are

where

chemical

a structure

Because there

to define

sought using thermodynamic properties

the

to define

to

chemical

from a perspective

space means and what can be predicted

The dimensionality

variables,

in research,

in other

the

Despite the

from many perspectives.

When this

and

chemical

One reason is that

by attempting

multivariate

of this

by comparing the

problem or,

of

Moreover,

data are available.

simultaneously.

and different

properties

homologs¹,²

evaluated

a multivariate

similar

chemicals

modifications

Chemical

interpretation

approached chemical

paper

behavior.

similar

New industrial

subtle

such as "homolog" and "analogs"

inherently

dimensionality

are often

from suitable

has evaded quantitative

chemicals are

and behavior.

for which toxicological

widespread use of terms

all

and pharmacology is that

properties

with known chemical

of untested

similarity

in chemistry

have similar

pesticides,

structures

reaction safety

expectation

sets

number of

space precludes Multivariate

analyses

can be accurate be encountered derived. large,

only are

chemicals

included

One of the best representative

system may change To explore have selected

the

chemical

set

data

set.

of eight

properties

minicomputers to graphically through user-selected

to

space

is

structure

is to compile a stability

of the

structure

space,

from registries are available

from graph theory.

components.

likely

of chemical

for

less

A set

computed for

of the data

the

than

of more than 90 each of the

set has been reduced to

Computer programs were developed

display

we

to methods of quantitating

have been systematically

principal

the

of chemical

structures

and the dimensionality

diversity

are added.

we have turned

derived

indices

chemicals,

on chemical

space

Otherwise,

dimensionality

chemicals,

variations

of the

from which the

new kinds of structures

intrinsic

of the

graph-theoretic 19,972

in the data

all

ways to produce a stable

Because data

one percent structural

if

representing

a set of 19,972 chemical

production.

a set

if

"universe"

of chemical

for

structures

windows.

METHODS

Molecular topology treats a chemical structure as a group of vertices (atoms) connected by edges (bonds)⁷. Methods for computing molecular connectivity indices from chemical sub-graphs have been discussed at length⁸,⁹,¹⁰ and will not be discussed in detail here. The indices are classified as framework, bond, and valence indices. Framework indices are derived from structures reduced to only carbon atoms and single bonds. The bond indices provide a mechanism to look a step beyond framework in that all the vertices are assumed to be carbon and the vertex corrections differentiate the local bonding of each vertex. The correction factor is the number of non-hydrogen bonds at the vertices. The valence indices use vertex values which are adjusted for both bonding and heteroatom electronegativity⁹.
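As a concrete illustration of a connectivity index computed from a hydrogen-suppressed graph, the sketch below evaluates the first-order path term in the form introduced by Randić (reference 8): each edge contributes the reciprocal square root of the product of the degrees of its two vertices. It treats every vertex as carbon and every bond as single, so it corresponds only to the framework case; the function name and the two example graphs are illustrative assumptions, not the authors' program.

```python
from math import sqrt


def first_order_connectivity(edges):
    """Sum (d_u * d_v) ** -0.5 over the edges of a hydrogen-suppressed graph.

    `edges` is a list of (u, v) vertex pairs; d_u is the number of edges
    meeting at vertex u. Heteroatoms and bond orders are ignored here.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sum(1.0 / sqrt(degree[u] * degree[v]) for u, v in edges)


# Carbon skeletons of two C4 isomers (hypothetical vertex numbering).
n_butane = [(0, 1), (1, 2), (2, 3)]    # unbranched chain
isobutane = [(0, 1), (0, 2), (0, 3)]   # one central, triply bonded vertex

chi_n_butane = first_order_connectivity(n_butane)
chi_isobutane = first_order_connectivity(isobutane)
```

The branched skeleton gives the smaller value, which is the sense in which such indices encode branching as well as size.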

A graph is a finite set of vertices and a finite set of edges in which edges connect two of the vertices. A connected subgraph of a graph is a sub-graph that has all the vertices connected by some combinations of the edges.

C !

Subgraphs are classified

into

paths

(-C-C-),

C

clusters

(-C-k-C-),

I

C

I

C

/\

path-clusters

(C-C-C-C) and cycles⁹ (-C-C).

subgraph that

has only one or two edges to each vertex.

non-cyclic

subgraph that

path/cluster

has only three

is a non-cyclic

cluster.

A subgraph that

a chain.

The order

Indices

of

systematic

of higher

adjacent

of vertex

type

one cyclic

algorithm

structures

identified

indices

of accurate

to compute the

of only connected

and the

in the

indices

as

subgraph. However,

for multicyclic

molecules has

subgraph enumeration.

first

10 orders

subgraphs.

subgraphs to generate

vertices

A

subgraph is defined

by hand calculation.

due to the difficulties

using computer data

includes

order

is a

is composed of both a path and a

least

can be generated

We developed an efficient

uses

A cluster

of a subgraph is the number of edges

calculation

efficiently

at

is a non-cyclic

or four edges to each vertex.

subgraph that

contains

low order

not been reported

A path

of

indices

The program

new subgraphs which

are computed by simple bookkeeping

and number of edges at each vertex.
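The subgraph types named in this section (path, cluster, path/cluster, and the cyclic type called a chain) can be told apart from vertex degrees alone. The sketch below is one reading of those definitions, following the usual Kier and Hall convention: a path has no vertex with more than two edges, a cluster has a branch vertex and no two-edge vertices, and a path/cluster has both. It assumes the input is a connected subgraph without repeated edges; the function names are illustrative and this is not the authors' enumeration algorithm.

```python
def classify_subgraph(edges):
    """Classify a connected subgraph given as a list of (u, v) edges."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # A connected graph is acyclic only when it has one edge fewer than vertices.
    if len(edges) >= len(degree):
        return "chain"
    has_branch_vertex = any(d >= 3 for d in degree.values())
    has_two_edge_vertex = any(d == 2 for d in degree.values())
    if not has_branch_vertex:
        return "path"
    return "path/cluster" if has_two_edge_vertex else "cluster"


def subgraph_order(edges):
    """The order of a subgraph is its number of edges."""
    return len(edges)


propane_path = [(0, 1), (1, 2)]
star = [(0, 1), (0, 2), (0, 3)]
branched_tail = [(0, 1), (0, 2), (0, 3), (3, 4)]
three_ring = [(0, 1), (1, 2), (2, 0)]
```

Counting subgraphs of each type and order is what produces one variable per type and order in the index set.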

The graph enumeration program was developed on a VAX-11/780 computer at Montana State University.

In an effort to gain more insight into the nature of molecular connectivity indices, particularly as a tool to determine structural and chemical similarity in molecules, we generated connectivities for 19,972 chemicals selected from the U.S. EPA Toxic Substances Initial Inventory. The selected data base includes only discrete organic molecules with less than 60 non-hydrogen atoms and at least one carbon atom. Generating all the indices for these chemicals took approximately 20 hours of CPU computer time on the VAX-11/780. The 0th to 9th order terms for paths, the 3rd to 9th order terms for clusters, the 4th to 9th order terms for path/clusters, and the 3rd to 9th order terms for cycles for the framework, bond, and valence indices comprise 90 structural variables.

Principal component analysis (PCA) was used to

explore set

the covariance

of orthogonal

large

part

variables

the

principal

the 90 variables

the

in the

variation

principal

for

19,972

Sciences¹¹

set,

Because all

influence

data points

as much information the

data

set

derived

from

were skewed

to the majority

were log-transformed of these

display

(hereafter

of the

to stabilize

large molecules

patterns

further,

rotation

in the

resolved

color

and magnification

from different

Five years

ago the

excess of $100,000. graphics

were created

graphics

and is capable grid

position

particular

consisting

several

determine what color gun on the monitor is

sophisticated

chip driving

To distinguish

were also specific

developed

segments of

setting

resolved

in conjunction

choices

herein

of display

Each bit-plane each bit

is obtained

for each color at that

representing

bits

in the color

on the

or 256 parts.

for the 512 addresses

the

nine planes

gun is stored.

coordinate

memory,

is a cartesian

By scanning all

address

The

with a NEC 7220

nine bit-planes

screen.

to eight

medium-to-high resolution The images presented

of 672 x 480 bits,

is displayed

of 16.8 million

and scaling

device with a IBM PC AT host.

512 colors.

a nine-bit

where a numerical

data

lookup table.

capabilities

below $5000.

of a dot on the

coordinate,

of the

and dimensions.

8088 microprocessor

of displaying

We began

through an expanded

examination at

VX384 graphics

controller

We wanted

of such a computer system would have been in

are priced

on a Vectrix

device

coordinate

Today,

devices

system uses an Intel

angles

cost

structure-space

is a challenge.

representation

so we were able to "zoom in" for a closer universe

of a chemical

termed the universe)

spatial

dimension over a highly

palette

matrix

a

We

in as many dimensions as possible.

a fourth

table

indices.

the variables

in many dimensions

window of a three-dimensional

spatial

retained

chemicals using the Statistical

the variables

and reduce the

by 19,972

exploring

color

connectivity

of some large molecules relative

data

still

them to a

component space.

to display

the

and to reduce

components) that

original

Designing a computer generated defined

variables

components from the correlation

for the Social

chemicals

in the

calculated

due to the presence

of these

(principal

of the variation

calculated

Package

structure

of the

at a

lookup

These settings grid.

Each color

This provides color

lookup

a

table.
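The display figures quoted in this passage (nine bit-planes, a nine-bit address, 512 colors, eight bits or 256 parts per color gun, a palette of 16.8 million) are mutually consistent, as the following sketch shows. It assumes three color guns with eight bits each and an arbitrary most-significant-plane-first bit ordering; neither detail is stated in the recoverable text, and the function name is illustrative.

```python
def lookup_address(plane_bits):
    """Combine one bit from each of nine bit-planes into a color-table address."""
    if len(plane_bits) != 9:
        raise ValueError("expected one bit from each of nine planes")
    address = 0
    for bit in plane_bits:  # assumed ordering: most significant plane first
        address = (address << 1) | (bit & 1)
    return address


TABLE_ENTRIES = 2 ** 9             # addresses reachable with nine bits
LEVELS_PER_GUN = 2 ** 8            # eight bits per color gun
PALETTE_SIZE = LEVELS_PER_GUN ** 3  # assumed three guns: red, green, blue
```

Under these assumptions the table holds 512 entries, each chosen from 16,777,216 possible gun settings, which rounds to the 16.8 million quoted.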

The three-dimensional transformations, rotations, and magnifications are all firm-ware implementations which greatly reduce the computational complexity of the program running on the host computer. This means sophisticated graphics programs that once could be run only on large mainframes are now within the capabilities of smaller mini- and microcomputer systems.

Three-dimensional windowing was implemented on the host IBM PC AT. A program called PROPCA allows the user to assign any three variables to the X, Y, and Z axes. A fourth variable is then selected to be mapped over the 512 colors of a linear spectral lookup table. A three-dimensional virtual window can be defined in terms of a minimum and maximum for the X, Y, and Z axes. This feature allows close examination of small sections of the image that are not available through the scaling options of the program. For example, one could select different ranges and endpoints for each axis. Also, plot time is substantially reduced by selectively viewing only the areas of interest through the windowing option. PROPCA and its associated binary driver routines and graphics primitives were written in FORTRAN 77 to provide the benefits of a broad base of software compatibility and reasonably fast execution times.
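The two operations described for PROPCA, selecting the points that fall inside a minimum/maximum window on the X, Y, and Z axes and mapping a fourth variable linearly onto 512 colors, can be sketched as follows. This is a minimal illustration in Python rather than the FORTRAN 77 of the original program; the function names, the inclusive window bounds, and the clamping of out-of-range values are assumptions.

```python
def in_window(point, lower, upper):
    """True when every coordinate lies within [lower, upper] on its axis."""
    return all(lo <= c <= hi for c, lo, hi in zip(point, lower, upper))


def color_index(value, vmin, vmax, n_colors=512):
    """Map `value` linearly from [vmin, vmax] onto table entries 0..n_colors-1."""
    if vmax == vmin:
        return 0
    fraction = (value - vmin) / (vmax - vmin)
    index = int(fraction * n_colors)
    return max(0, min(index, n_colors - 1))  # clamp values outside the range


# Hypothetical (x, y, z, fourth-variable) records.
records = [(0.0, 0.0, 0.0, 0.0), (1.0, 1.0, 1.0, 0.5), (2.0, 0.0, 0.0, 1.0)]
lower, upper = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)

visible = [r for r in records if in_window(r[:3], lower, upper)]
colors = [color_index(r[3], 0.0, 1.0) for r in records]
```

Filtering before plotting is what shortens plot time for a narrow window, since points outside the limits are never drawn.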

RESULTS

Principal components from the 19,972 x 90 data matrix were used as the variables in this study and were retrieved from rapid traversals of an inverted file in the host computer. The principal components were plotted as single pixels as they were read. A counter is also provided to inform the user of how far the plotting operation has progressed in terms of the data set traversal. After the traversal is complete, the counter is updated to display the number of points within the window limits.

Figure 1 presents a first glimpse of the chemical universe and structure-space. A reference wire-frame cube describes the defined virtual-window, providing orientation when rotating or scaling. A color legend is displayed at the far right of the screen which allows subtle hues to be correlated with component values. Because the use of single pixels rather than filled polygons lessens the capture of depth along the Z axis, PROPCA provides for projections onto the sides of the user-defined virtual window and concurrent viewing of the spatial and color information in the X-Y, X-Z and Y-Z planes.

The PCA resulted in eight principal components with eigenvalues > 1 and they explained 93.5% of the variation in the original data (Table 1). PC 1 was positively correlated with all variables except for the cyclic variables. In contrast, PC 2 was positively correlated with all cluster and path/cluster variables that indicate the degree a molecule is branched, but negatively correlated with all path and cyclic variables. Similarly, PC 3 was positively correlated with all cyclic variables and negatively correlated with all other variables except the valence-corrected variables. The first three principal components all convey generalized information on chemical structure: size (PC 1), degree of branchness (PC 2), and number of cycles (PC 3). The remaining five principal components identified more specific differences between chemicals. For example, PC 4 had positive correlations (r > .58) with the 3rd and 4th order cyclic variables, but negative correlations (r < .18) with the 7th and 9th order cyclic variables. PC 5 to PC 8 convey additional differences in branching, bonding, cyclicness, valency (presence of heteroatoms such as halogens and oxygen), and combinations of these structural attributes.

Figure 1 presents the chemical structure-space for the first three principal components on the X, Y, and Z axes, respectively. Color scales the fourth principal component with red to designate small values and blue to designate large values. The principal components are axes that represent gradients of differences between chemicals. For example, on the extreme left in Figure 1 is carbon monoxide, the smallest molecule in the data base, while the largest molecule in the data base (CAS # 1356089) has the largest value for PC 1. The "string" of structures or linear cluster in the lower left corner of Figure 1 is a group of nearly 1200 unbranched, non-cyclic structures which are separated from the universe of branched structures.

Figure 2 is a different view of these same sets of principal components. Both views illustrate that structures which are close (similar) to each other in three dimensions may actually be far apart (dissimilar) in a fourth dimension.

Figure 3 presents another view of the universe from the perspective where the 4th, 5th, and 6th principal components present the X, Y, and Z axes respectively and the 7th principal component is scaled in color. This view contains 19,584 chemicals and many homologous series of structures are apparent in these dimensions.
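The analysis reported in this section was run in the Statistical Package for the Social Sciences on log-transformed variables, using the correlation matrix. The sketch below reproduces only the shape of that calculation in plain Python on toy data: log transform, Pearson correlation matrix, and the leading eigenvalue by power iteration. The `log(x + 1)` offset for zero-valued indices, the power-iteration method, the function names, and the toy numbers are all assumptions made for illustration.

```python
import math


def log_transform(rows):
    """Apply log(x + 1) to every entry (the +1 offset is an assumption)."""
    return [[math.log(x + 1.0) for x in row] for row in rows]


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows`."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    norms = [math.sqrt(sum(row[j] ** 2 for row in centered)) for j in range(p)]
    return [[sum(row[i] * row[j] for row in centered) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]


def leading_component(corr, iterations=1000):
    """Largest eigenvalue and its eigenvector, found by power iteration."""
    p = len(corr)
    vector = [1.0 / (i + 1) for i in range(p)]  # deliberately non-uniform start
    for _ in range(iterations):
        product = [sum(corr[i][j] * vector[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    product = [sum(corr[i][j] * vector[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(v * w for v, w in zip(vector, product))
    return eigenvalue, vector


# Hypothetical toy data: three molecules, two index columns that rise together.
toy = [[1.0, 2.0], [3.0, 5.0], [10.0, 18.0]]
eigenvalue, loadings = leading_component(correlation_matrix(log_transform(toy)))
explained = eigenvalue / len(toy[0])  # trace of a correlation matrix = variable count
```

Because the trace of a correlation matrix equals the number of variables, an eigenvalue greater than 1 marks a component that carries more variation than any single standardized variable, which is the retention rule quoted above.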

DISCUSSION

This approach similarity analogs this

is being used for

defined

paper

components,

using

neighbors

include

nearest

2-ethyl,

The second use is to attempt association example,

with chemicals

Veith

that

in food chains

(Log P).

Chemicals with Log P values

Figure 4.

bioaccumulation

1 in which the color

The figure

increases the

substantial

axis that

with molecular

weight

of the universe

new structures

which fall

4 also at

shows that

(blue

data

right-center).

other

structures,

large

but non-accumulative

molecular

contains areas

are

Even though these

they are widely

separate

chemicals

partition

For

coefficient

considered a close-up

to Log P instead classes

with

large

reasonably

with

such as sulfonic

of

of to PC

Therefore,

Log P¹³ and

be presumed to have is unknown.

low Log P values

structures

in other

to have

the Log P value

volume (X axis).

molecules

by

bioaccumulative

even if the Log P value

large

chemicals

behavior.

4 presents

chemicals could

oxirane.

harmful

than 4.0 are

Figure

If

the nearest

or biological

greater

and/or

of biphenyl.

phenoxymethyl

the highly

is

many of the

inserted,

for many chemical

potential

there

is

has been scaled

in these

bioaccumulation

include

n-octanol/water

potential.

illustrates

red regions

Figure

have a large

scope of

principal

potentially

of known chemical

chemicals

substantial

ether)

identify

et al.¹² demonstrated

suitable

such as diphenylamine

and 4-nitro

to

identify

distances.

measure and eight

neighbors

glycidyl

of chemical

is beyond the

and methylamino derivatives

(phenyl

2-methyl,

It

is to

for measuring

if a molecule

its

definition

The first

dimensions.

distance

that

hydroxyl,

oxirane

multivariate

algorithms

a Euclidean

in the universe,

phenoxymethyl

in eight

possible

we can report

4,4'-hydroxyamino,

a stable,

two purposes.

by nearness

to present

Nonetheless,

inserted

to developing

appear

dimensions acid

immersed in

and constitute

and azo dyes.
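The discussion refers to a Euclidean distance measure over eight principal components for finding the nearest neighbors of an inserted molecule. A minimal sketch of such a search is given below. The labels and score vectors are invented placeholders, not values from the study, and the exhaustive sort is only one simple way to rank neighbors; the paper itself leaves distance algorithms out of scope.

```python
import math


def nearest_neighbors(query, scores, k=3):
    """Return the k labels whose score vectors lie closest to `query`.

    `scores` maps a label to a tuple of principal component scores;
    distance is ordinary Euclidean distance over all components.
    """
    ranked = sorted(scores, key=lambda label: math.dist(query, scores[label]))
    return ranked[:k]


# Hypothetical eight-component scores for three unnamed structures.
scores = {
    "structure A": (0.0,) * 8,
    "structure B": (0.1,) * 8,
    "structure C": (1.0,) * 8,
}
query = (0.02,) * 8
closest = nearest_neighbors(query, scores, k=2)
```

Two structures can sit close together on the first three components and still rank far apart here, because all eight components enter the distance.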

Table 1. Interpretations and examples of extremes for 8 principal components calculated from 90 variables based on connectivity indices for 19,972 industrial chemicals.

Principal component | Eigenvalue | Variation explained (%) | Low values of principal component | High values of principal component
--------------------|------------|-------------------------|-----------------------------------|-----------------------------------
1 | 47.36 | 52.6 | small molecules | large molecules
2 | 12.14 | 13.5 | few branches on molecule | multi-branched molecules
3 | 10.53 | 11.7 | non-cyclic molecules | multi-cyclic molecules
4 | 5.19 | 5.8 | 7th to 8th order cycles | 3rd to 4th order cycles
5 | 3.13 | 3.5 | molecules with single bonds and simple branching patterns | multi-branched molecules with double or triple bonds and/or with many heteroatoms
6 | 2.83 | 3.2 | complex branching patterns and multi-cyclic molecules with few heteroatoms | complex 3rd and 4th order cyclic molecules
7 | 1.74 | 1.9 | 5th to 7th order cycles | complex valence-corrected branches, chemicals with many heteroatoms
8 | 1.22 | 1.4 | short chain molecules with complex heteroatom branches | long chain molecules with few heteroatoms
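The percentages in Table 1 follow from the eigenvalues: with 90 standardized variables the total variance is 90, so each component explains its eigenvalue divided by 90. The short check below uses the eigenvalues as read from the table, where the fifth and eighth values (3.13 and 1.22) are readings of partly illegible scan characters and should be treated as assumptions.

```python
# Eigenvalues as read from Table 1; 3.13 and 1.22 are reconstructed readings.
EIGENVALUES = [47.36, 12.14, 10.53, 5.19, 3.13, 2.83, 1.74, 1.22]
N_VARIABLES = 90  # total variance of 90 standardized variables

percent_explained = [100.0 * value / N_VARIABLES for value in EIGENVALUES]
retained = [value for value in EIGENVALUES if value > 1.0]
total_percent = sum(percent_explained)
```

Summed over the eight retained components this gives about 93.5%, in line with the figure quoted in the Results.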

In summary, we have developed a stable chemical structure space based on graph theoretic indices and molecular topology. The coordinates for the location of any chemical can be computed from structure. A computer graphics system permits the exploration of the space around the structure, and nearest neighbors to a given chemical appear to be structurally similar. Regions of the multi-dimensional space are associated with chemical properties such as free-energy; however, substantial work compiling systematic property databases must be completed before structure space can be tested for predictive power.

REFERENCES

1. L.P. Hammett, Physical Organic Chemistry, Second Edition (McGraw-Hill Book Company, New York, 1970), 420 pp.

2. R.F. Gould, Biological Correlations - The Hansch Approach, Advances in Chemistry Series No. 114 (ACS, Washington, D.C., 1972), 340 pp.

3. R.D. Cramer, J. Am. Chem. Soc. 102(6), 1837-1849 (1980).

4. R.D. Cramer, J. Am. Chem. Soc. 102(6), 1849-1859 (1980).

5. R.C. Reid, Fluid Phase Equilibria, 13, 1-14 (1983).

6. W.J. Dunn and S. Wold, Bioorg. Chem. 9, 505-523 (1980).

7. P.E. Long, An Introduction to General Toxicology (Merrill Publ. Co., Columbus, Ohio, 1971), 281 pp.

8. M. Randic, J. Am. Chem. Soc. 97, 6609-6613 (1975).

9. L.B. Kier and L.H. Hall, Molecular Connectivity in Chemistry and Drug Research (Academic Press, New York, 1976), 257 pp.

10. A. Sabljic and N. Trinajstic, Acta Pharm. Jugosl. 31, 189-214 (1981).

11. N.H. Nie, C.H. Hull, J.G. Jenkins, K. Steinbrenner, and D.H. Bent, Statistical Package for the Social Sciences (McGraw-Hill, New York, 1975), p. 675.

12. G.D. Veith, D.L. DeFoe, and B.V. Bergstedt, J. Fish. Res. Board Can. 36, 1040-1048 (1979).

13. A. Leo, Log P values computed via CLOGP, Pomona College MEDCHEM Project, Claremont, CA.