Chemosphere, Vol.17, No.8, pp 1617-1630, 1988
Printed in Great Britain
0045-6535/88 $3.00 + .00
Pergamon Press plc
On the Intrinsic Dimensionality of Chemical Structure Space
by
G.D. Veith, B. Greenwood 1, R.S. Hunter 2, G.J. Niemi, and R.R. Regal 3
Environmental Research Laboratory-Duluth, 6201 Congdon Boulevard, Duluth, MN 55804
1 Computer Sciences Corporation, Falls Church, VA
2 Center for Data Systems and Analysis, Montana State University, Bozeman, MT 59717
3 Department of Mathematical Sciences, University of Minnesota-Duluth, Duluth, MN 55812
INTRODUCTION
An important chemical
structures
chemicals, "lead"
rates
and therapeutics
can be estimated chemicals
analogous structures
similarity are
is
is often
considered
simultaneously
similarity
which all
can be identified.
important variables,
comprehend what this is one of the
large
universe
problem.
tools
first
attempts
and scaling
In these
for many chemicals,
is
exploring
space
in
the
from it.
This
space for a
space has been activity
systematically
is transformed
to a smaller
between chemicals
for cluster
analysis
are
problem that
available
modeling and validating
in this
compiled set
of
coordinates
intuitive systematic
complex phenomena.
in a
techniques.
selection
of analogs,
data
of essential
small
in a multivariate
in
space are used as a
of other multivariate
for a relatively
A small number of points
of
for most of the variation
components are used as orthogonal
by the practical
properties
chemicals,
We have
we need to
structure
structure
data are
approach may be an improvement to
limited
chemical
is accomplished,
chemical
chemicals
so many potentially reduce
and the biological
approaches,
space and distances
measure of similarity
it
to
components which account
The principal
While this
are necessary
of chemical
and information
or principal
structure
words,
of chemicals.
molecules 3,4,5,6
set.
are
where
chemical
a structure
Because there
to define
sought using thermodynamic properties
the
to define
to
chemical
from a perspective
space means and what can be predicted
The dimensionality
variables,
in research,
in other
the
Despite the
from many perspectives.
When this
and
chemical
One reason is that
by attempting
multivariate
of this
by comparing the
problem or,
of
Moreover,
data are available.
simultaneously.
and different
properties
homologs 1,2
evaluated
a multivariate
similar
chemicals
modifications
Chemical
interpretation
approached chemical
paper
behavior.
similar
New industrial
subtle
such as "homolog" and "analogs"
inherently
dimensionality
are often
from suitable
has evaded quantitative
chemicals are
and behavior.
for which toxicological
widespread use of terms
all
and pharmacology is that
properties
with known chemical
of untested
similarity
in chemistry
have similar
pesticides,
structures
reaction safety
expectation
sets
number of
space precludes Multivariate
analyses
can be accurate be encountered derived. large,
only are
chemicals
included
One of the best representative
system may change To explore have selected
the
chemical
set
data
set.
of eight
properties
minicomputers to graphically through user-selected
to
space
is
structure
is to compile a stability
of the
structure
space,
from registries are available
from graph theory.
components.
likely
of chemical
for
less
A set
computed for
of the data
the
than
of more than 90 each of the
set has been reduced to
Computer programs were developed
display
we
to methods of quantitating
have been systematically
principal
the
of chemical
structures
and the dimensionality
diversity
are added.
we have turned
derived
indices
chemicals,
on chemical
space
Otherwise,
dimensionality
chemicals,
variations
of the
from which the
new kinds of structures
intrinsic
of the
graph-theoretic 19,972
in the data
all
ways to produce a stable
Because data
one percent structural
if
representing
a set of 19,972 chemical
production.
a set
if
"universe"
of chemical
for
structures
windows.
METHODS

Molecular topology treats a chemical structure as a group of vertices (atoms) connected by edges (bonds) 7. Methods for computing molecular connectivity indices from chemical sub-graphs have been discussed at length 8,9,10 and will not be discussed in detail here. The indices are classified as framework, bond, and valence indices. Framework indices are derived from structures reduced to only carbon atoms and single bonds. The bond indices provide a mechanism to look a step beyond framework indices in that all the vertices are assumed to be carbon and the vertex corrections differentiate the local bonding of each vertex. The correction factor for valence is the number of non-hydrogen bonds at the vertices. The valence indices use vertex values which are adjusted for both bonding and heteroatom electronegativity 9.
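As a rough illustration of how a connectivity index follows from the hydrogen-suppressed graph, the sketch below computes the first-order path index in the form introduced by Randic 8, taking the number of non-hydrogen neighbors as the vertex value (the framework case). The function names and the `deltas` argument are illustrative assumptions and not the authors' program; bond or valence indices would supply adjusted vertex values through `deltas`.

```python
from math import sqrt

def vertex_degrees(n_vertices, edges):
    """Count the non-hydrogen neighbors of each vertex (framework vertex values)."""
    degrees = [0] * n_vertices
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    return degrees

def first_order_chi(n_vertices, edges, deltas=None):
    """First-order path index: sum over edges of (delta_i * delta_j) ** -0.5."""
    if deltas is None:
        # Assumption: fall back to the framework values (plain vertex degrees).
        deltas = vertex_degrees(n_vertices, edges)
    return sum(1.0 / sqrt(deltas[i] * deltas[j]) for i, j in edges)

# Carbon skeletons written as edge lists over numbered vertices.
n_butane = [(0, 1), (1, 2), (2, 3)]    # unbranched chain
isobutane = [(0, 1), (0, 2), (0, 3)]   # one vertex carrying three branches

chi_butane = first_order_chi(4, n_butane)
chi_isobutane = first_order_chi(4, isobutane)
print(round(chi_butane, 3), round(chi_isobutane, 3))  # 1.914 1.732
```

The branched skeleton gives the smaller value, which is the sense in which such indices encode branching.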
A graph is a finite set of vertices and a finite set of edges in which edges connect two of the vertices. A connected subgraph of a graph is a sub-graph that has all the vertices connected by some combinations of the edges. Subgraphs are classified into paths (-C-C-), clusters, path-clusters and cycles 9.
A path is a non-cyclic subgraph that has only one or two edges to each vertex. A cluster is a non-cyclic subgraph that has only three or four edges to each vertex. A path/cluster is a non-cyclic subgraph that is composed of both a path and a cluster. A subgraph that contains at least one cyclic subgraph is defined as a chain. The order of a subgraph is the number of edges in the subgraph. Indices of low order can be generated by hand calculation. However, systematic calculation of indices of higher order for multicyclic molecules has not been reported due to the difficulties of accurate subgraph enumeration. We developed an efficient algorithm to compute the first 10 orders of indices using computer data structures. The program enumerates only connected subgraphs and uses identified subgraphs to generate new subgraphs which include adjacent vertices, and the indices are computed efficiently by simple bookkeeping of vertex type and number of edges at each vertex.

The graph enumeration program was developed on a VAX-11/780 computer at Montana State University.
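The idea of growing subgraphs of a given order by adding adjacent vertices can be sketched for the simplest class, paths. The depth-first routine below is a generic illustration written for this purpose; it is not the authors' enumeration algorithm, and the names `build_adjacency` and `enumerate_paths` are assumptions.

```python
def build_adjacency(n_vertices, edges):
    """Adjacency lists for an undirected, hydrogen-suppressed molecular graph."""
    adjacency = [[] for _ in range(n_vertices)]
    for i, j in edges:
        adjacency[i].append(j)
        adjacency[j].append(i)
    return adjacency

def enumerate_paths(n_vertices, edges, order):
    """Return every path subgraph with `order` edges as a tuple of vertices."""
    adjacency = build_adjacency(n_vertices, edges)
    if order == 0:
        return [(v,) for v in range(n_vertices)]  # zero-order paths are the vertices
    found = []

    def extend(path):
        if len(path) == order + 1:
            if path[0] < path[-1]:  # keep one orientation of each undirected path
                found.append(tuple(path))
            return
        for neighbor in adjacency[path[-1]]:
            if neighbor not in path:  # a path never revisits a vertex
                path.append(neighbor)
                extend(path)
                path.pop()

    for start in range(n_vertices):
        extend([start])
    return found

# 2-methylbutane skeleton: chain 0-1-2-3 with a branch (vertex 4) on vertex 1.
isopentane = [(0, 1), (1, 2), (2, 3), (1, 4)]
path_counts = [len(enumerate_paths(5, isopentane, m)) for m in range(5)]
print(path_counts)  # [5, 4, 4, 2, 0]
```

Clusters, path/clusters and cyclic subgraphs need additional bookkeeping of the number of edges at each vertex, which this sketch leaves out.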
In an effort to gain more insight into the nature of molecular connectivity indices, particularly as a tool to determine structural and chemical similarity in molecules, we generated the indices for 19,972 selected chemicals from the U.S. EPA Toxic Substances Initial Inventory. The selected data base includes only discrete organic molecules with fewer than 60 non-hydrogen atoms and at least one carbon atom. Generating all indices for these chemicals took approximately 20 hours of CPU computer time on the VAX-11/780. The 0th to 9th order terms for paths, the 3rd to 9th order terms for clusters, the 4th to 9th order terms for path/clusters, and the 3rd to 9th order terms for cycles for the framework, bond, and valence indices comprise 90 structural variables.

Principal component analysis (PCA) was used to
explore set
the covariance
of orthogonal
large
part
variables
the
principal
the 90 variables
the
in the
variation
principal
for
19,972
Sciences 11
set,
Because all
influence
data points
as much information the
data
set
derived
from
were skewed
to the majority
were log-transformed of these
display
(hereafter
of the
to stabilize
large molecules
patterns
further,
rotation
in the
resolved
color
and magnification
from different
Five years
ago the
excess of $100,000. graphics
were created
graphics
and is capable grid
position
particular
consisting
several
determine what color gun on the monitor is
sophisticated
chip driving
To distinguish
were also specific
developed
segments of
setting
resolved
in conjunction
choices
herein
of display
Each bit-plane each bit
is obtained
for each color at that
representing
bits
in the color
on the
or 256 parts.
for the 512 addresses
the
nine planes
gun is stored.
coordinate
memory,
is a cartesian
By scanning all
address
The
with a NEC 7220
nine bit-planes
screen.
to eight
medium-to-high resolution The images presented
of 672 x 480 bits,
is displayed
of 16.8 million
and scaling
device with a IBM PC AT host.
512 colors.
a nine-bit
where a numerical
data
lookup table.
capabilities
below $5000.
of a dot on the
coordinate,
of the
and dimensions.
8088 microprocessor
of displaying
We began
through an expanded
examination at
VX384 graphics
controller
We wanted
of such a computer system would have been in
are priced
on a Vectrix
device
coordinate
Today,
devices
system uses an Intel
angles
cost
structure-space
is a challenge.
representation
so we were able to "zoom in" for a closer universe
of a chemical
termed the universe)
spatial
dimension over a highly
palette
matrix
a
We
in as many dimensions as possible.
a fourth
table
indices.
the variables
in many dimensions
window of a three-dimensional
spatial
retained
chemicals using the Statistical
the variables
and reduce the
by 19,972
exploring
color
connectivity
of some large molecules relative
data
still
them to a
component space.
to display
the
and to reduce
components) that
original
Designing a computer generated defined
variables
components from the correlation
for the Social
chemicals
in the
calculated
due to the presence
of these
(principal
of the variation
calculated
Package
structure
of the
at a
lookup
These settings grid.
Each color
This provides color
lookup
a
table.
The three-dimensional transformations, rotations, and magnifications are all firm-ware implementations which greatly reduce the computational complexity of the program running on the host computer. This means sophisticated graphics programs that once could be run only on large mainframes are now within the capabilities of smaller mini- and microcomputer
systems.

A program called PROPCA and its associated graphics driver routines were written in FORTRAN 77 to provide the benefits of a broad base of software compatibility and reasonably fast execution times. Three-dimensional windowing was implemented in the binary primitives on the host IBM PC AT. PROPCA allows the user to assign any three variables to the X, Y, and Z axes. A fourth variable is then selected to be mapped over the linear spectral lookup table of 512 colors. A three-dimensional virtual window can be defined in terms of a minimum and maximum for the X, Y, and Z axes.
This feature allows close examination of small sections of the image that are not available through the scaling options of the program. For example, one could select different ranges and endpoints for each axis. Also plot time is substantially reduced by selectively viewing only the areas of interest through the windowing option.
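The virtual window and the color mapping described for PROPCA can be sketched in a few lines. This is an illustration only and not the FORTRAN 77 program: the function names, the linear mapping and the clamping of out-of-range values are assumptions, and the sample points are invented.

```python
def in_window(point, window):
    """True when an (x, y, z) point lies inside the min/max limits of every axis."""
    return all(low <= value <= high for value, (low, high) in zip(point, window))

def color_index(value, vmin, vmax, n_colors=512):
    """Map a fourth variable linearly onto a lookup-table index in 0..n_colors-1."""
    if vmax == vmin:
        return 0
    fraction = (value - vmin) / (vmax - vmin)
    fraction = min(max(fraction, 0.0), 1.0)  # assumption: clamp values outside the range
    return min(int(fraction * n_colors), n_colors - 1)

# Hypothetical points: ((x, y, z), value of the fourth variable).
points = [((0.1, 0.2, 0.3), 5.0), ((2.0, 0.0, 0.0), 1.0), ((0.5, 0.5, 0.5), 10.0)]
window = [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)]  # minimum and maximum for X, Y and Z

visible = [(xyz, color_index(v, 0.0, 10.0)) for xyz, v in points if in_window(xyz, window)]
print(visible)  # [((0.1, 0.2, 0.3), 256), ((0.5, 0.5, 0.5), 511)]
```

Only points inside the window are passed on for plotting, which is why restricting the window also shortens plot time.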
RESULTS
Principal variables inverted single
the
in this file
pixels
chemical
components from the
study and were retrieved
in the host computer. as they were read.
universe
defined
virtual-window,
is also
operation
has progressed
traversal
is complete,
the
screen which allows
1 presents A reference
orientation
the counter A color subtle
use of single
data
a first
pixels
rather
glimpse of the cube describes
when rotating
or scaling.
the plotting
set traversal.
is displayed
hues to be correlated than filled
as
wire-frame
is updated to display legend
of an
components were plotted
inform the user of how far
in terms of the
the window limits.
Because the
Figure
providing
provided to
from rapid traversals
The principal
and structure-space.
A counter
within
19,972 x 90 data matrix were used as
After
the
the number of points
at the
far
right
of
with component values. polygons lessens
the
capture of depth along the Z axis, PROPCA provides for projections onto the sides of the user-defined virtual window and concurrent viewing of the spatial and color information in the X-Y, X-Z and Y-Z planes.
The PCA resulted in eight principal components with eigenvalues > 1 and they explained 93.5% of the variation in the original data (Table 1). PC 1 was positively correlated with all variables except for the cyclic variables. PC 2 was positively correlated with all cluster and path/cluster variables that indicate the degree a molecule is branched, but negatively correlated with all path and cyclic variables. In contrast, PC 3 was positively correlated with all cyclic variables and negatively correlated with all other variables except the valence-corrected cluster variables. The first three principal components all convey generalized information on chemical structure: size (PC 1), degree of branchness (PC 2), and number of cycles (PC 3). The remaining five principal components identified more specific differences between chemicals. For example, PC 4 had positive correlations (r > .58) with the 3rd and 4th order cyclic variables, but negative correlations (r < .18) with the 7th and 9th order cyclic variables. Similarly, PC 5 to PC 8 convey additional differences in branching, bonding, cyclicness, valency (presence of heteroatoms such as halogens and oxygen), and combinations of these structural attributes.

Figure 1 presents the chemical structure-space for the first three principal components on the X, Y, and Z axes, respectively. Color scales the fourth principal component with red to designate small values and blue to designate large values. The principal components are axes that represent gradients of differences between chemicals. For example, on the extreme left in Figure 1 is carbon monoxide, the smallest molecule in the data base, while the largest molecule in the data base (CAS # 1356089) has the largest value for PC 1. The "string" or linear cluster of structures in the lower left corner of Figure 1 is a group of nearly 1200 unbranched, non-cyclic structures which are separated from the universe of branched structures.
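The extraction of principal components from the correlation matrix of log-transformed variables, keeping components with eigenvalues > 1, can be sketched with NumPy as follows. This is a minimal illustration on synthetic data, not the SPSS procedure the authors used; the `log(x + 1)` form of the transform, the function name and the random test data are assumptions.

```python
import numpy as np

def pca_from_correlation(data):
    """PCA of log-transformed variables (columns) via their correlation matrix."""
    logged = np.log1p(data)                    # assumption: log(x + 1), so zeros are allowed
    z = (logged - logged.mean(axis=0)) / logged.std(axis=0, ddof=1)
    corr = np.corrcoef(logged, rowvar=False)   # variables are in columns
    eigvals, eigvecs = np.linalg.eigh(corr)    # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # re-sort from largest to smallest
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                       # retain components with eigenvalue > 1
    percent = 100.0 * eigvals / corr.shape[0]  # each variable contributes unit variance
    scores = z @ eigvecs[:, keep]              # coordinates of each row in component space
    return eigvals, percent, scores

# Synthetic stand-in for skewed structural variables: six columns that all
# scale with one hypothetical "molecule size" factor.
rng = np.random.default_rng(0)
size = rng.lognormal(mean=1.0, sigma=1.0, size=(500, 1))
data = size * rng.uniform(0.5, 1.5, size=(500, 6))

eigvals, percent, scores = pca_from_correlation(data)
print(eigvals.round(2), percent.round(1), scores.shape)
```

Because the correlation matrix has ones on its diagonal, the eigenvalues sum to the number of variables, which is why an eigenvalue divided by the variable count gives the fraction of variation explained.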
Figure 2 is a different view of these same sets of principal components. Both views illustrate that structures which are close (similar) to each other in three dimensions may actually be far apart (dissimilar) in a fourth dimension.
Figure 3 presents another view of the universe from the perspective where the 4th, 5th, and 6th principal components present the X, Y, and Z axes respectively and the 7th principal component is scaled in color. This view contains 19,584 chemicals and many homologous series of structures are apparent in these dimensions.
DISCUSSION
This approach similarity analogs this
is being used for
defined
paper
components,
using
neighbors
include
nearest
2-ethyl,
The second use is to attempt association example,
with chemicals
Veith
that
in food chains
(Log P).
Chemicals with Log P values
Figure 4.
bioaccumulation
1 in which the color
The figure
increases the
substantial
axis that
with molecular
weight
of the universe
new structures
which fall
4 also at
shows that
(blue
data
right-center).
other
structures,
large
but non-accumulative
molecular
contains areas
are
Even though these
they are widely
separate
chemicals
partition
For
coefficient
considered a close-up
to Log P instead classes
with
large
reasonably
with
such as sulfonic
of
of to PC
Therefore,
Log P 13 and
be presumed to have is unknown.
low Log P values
structures
in other
to have
the Log P value
volume (X axis).
molecules
by
bioaccumulative
even if the Log P value
large
chemicals
behavior.
4 presents
chemicals could
oxirane.
harmful
than 4.0 are
Figure
If
the nearest
or biological
greater
and/or
of biphenyl.
phenoxymethyl
the highly
is
many of the
inserted,
for many chemical
potential
there
is
has been scaled
in these
bioaccumulation
include
n-octanol/water
potential.
illustrates
red regions
Figure
have a large
scope of
principal
potentially
of known chemical
chemicals
substantial
ether)
identify
et al. 12 demonstrated
suitable
such as diphenylamine
and 4-nitro
to
identify
distances.
measure and eight
neighbors
glycidal
of chemical
is beyond the
and methylamino derivatives
(phenyl
2-methyl,
It
is to
for measuring
if a molecule
its
definition
The first
dimensions.
distance
that
hydroxyl,
oxirane
multivariate
algorithms
a Euclidean
in the universe,
phenoxymethyl
in eight
possible
we can report
4,4'-hydroxyamino,
a stable,
two purposes.
by nearness
to present
Nonetheless,
inserted
to developing
appear
dimensions acid
immersed in
and constitute
and azo dyes.
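The use of a Euclidean distance measure over eight principal components to find the nearest neighbors of an inserted molecule can be sketched as below. The paper does not present its distance algorithms, so this is only a brute-force illustration; the function name, the chemical labels and all coordinates are invented for the example.

```python
from math import dist

def nearest_neighbors(query, scores, k=3):
    """Return the k entries whose component scores lie closest to `query`."""
    ranked = sorted(scores.items(), key=lambda item: dist(query, item[1]))
    return [(name, dist(query, coords)) for name, coords in ranked[:k]]

# Hypothetical eight-dimensional principal component scores.
scores = {
    "chem_a": (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
    "chem_b": (1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
    "chem_c": (0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
    "chem_d": (3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0),
}
query = (0.9, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)  # coordinates of the inserted molecule

neighbors = nearest_neighbors(query, scores, k=2)
print([name for name, _ in neighbors])  # ['chem_b', 'chem_a']
```

A full sort is adequate for a sketch; for roughly 20,000 structures a spatial index would be a natural refinement, though nothing in the text indicates which approach was taken.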
Table 1. Interpretations and examples of extremes for 8 principal components calculated from 90 variables based on connectivity indices for 19,972 industrial chemicals.

Principal    Eigenvalue    Variation
component                  explained (%)

1            47.36         52.6
    Low values:  small molecules
    High values: large molecules

2            12.14         13.5
    Low values:  few branches on molecule
    High values: multi-branched molecules

3            10.53         11.7
    Low values:  non-cyclic molecules
    High values: multi-cyclic molecules

4             5.19          5.8
    Low values:  7th to 8th order cycles
    High values: 3rd to 4th order cycles

5             3.13          3.5
    Low values:  molecules with single bonds and simple branching patterns
    High values: multi-branched molecules with double or triple bonds and/or with many heteroatoms

6             2.83          3.2
    Low values:  complex branching patterns and multi-cyclic molecules with few heteroatoms
    High values: complex 3rd and 4th order cyclic molecules

7             1.74          1.9
    Low values:  5th to 7th order cycles
    High values: complex valence-corrected branches, chemicals with many heteroatoms

8             1.22          1.4
    Low values:  short chain molecules with complex heteroatom branches
    High values: long chain molecules with few heteroatoms
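The percentages in Table 1 follow from the eigenvalues: with 90 standardized variables the total variation is 90, so each component explains its eigenvalue divided by 90. The short check below uses the eigenvalues as read from the table; the variable names are illustrative.

```python
eigenvalues = [47.36, 12.14, 10.53, 5.19, 3.13, 2.83, 1.74, 1.22]  # Table 1, PC 1 to PC 8
n_variables = 90  # total variation of a 90-variable correlation matrix

percent = [100.0 * value / n_variables for value in eigenvalues]
cumulative = sum(percent)

print([round(p, 1) for p in percent])
print(round(cumulative, 1))  # 93.5
```

The computed values agree with the tabulated percentages to within about 0.1, and their sum matches the 93.5% of variation reported for the eight components.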
In summary, we have developed a stable structure space based on graph theoretic indices and molecular topology. The coordinates for the location of any chemical can be computed from structure. A computer graphics system permits the exploration of the space around the chemical structure, and nearest neighbors appear to be structurally similar to a given chemical. Regions of the multi-dimensional space are associated with chemical properties such as free-energy; however, substantial work compiling systematic property data bases must be completed before the space can be tested for predictive power.
REFERENCES

1. L.P. Hammett, Physical Organic Chemistry, Second Edition (McGraw-Hill Book Company, New York, 1970), 420 pp.
2. R.F. Gould, Biological Correlations - The Hansch Approach, Advances in Chemistry Series No. 114 (ACS, Washington, D.C., 1972), 340 pp.
3. R.D. Cramer, J. Am. Chem. Soc. 102(6), 1837-1849 (1980).
4. R.D. Cramer, J. Am. Chem. Soc. 102(6), 1849-1859 (1980).
5. R.C. Reid, Fluid Phase Equilibria, 13, 1-14 (1983).
6. W.J. Dunn and S. Wold, Bioorg. Chem. 9, 505-523 (1980).
7. P.E. Long, An Introduction to General Topology (Merrill Publ. Co., Columbus, Ohio, 1971), 281 pp.
8. M. Randic, J. Am. Chem. Soc. 97, 6609-6613 (1975).
9. L.B. Kier and L.H. Hall, Molecular Connectivity in Chemistry and Drug Research (Academic Press, New York, 1976), 257 pp.
10. A. Sabljic and N. Trinajstic, Acta Pharm. Jugosl. 31, 189-214 (1981).
11. N.H. Nie, C.H. Hull, J.G. Jenkins, K. Steinbrenner, and D.H. Bent, Statistical Package for the Social Sciences (McGraw-Hill, New York, 1975), p. 675.
12. G.D. Veith, D.L. DeFoe, and B.V. Bergstedt, J. Fish. Res. Board Can. 36, 1040-1048 (1979).
13. A. Leo, Log P values computed via CLOGP, Pomona College MEDCHEM Project, Claremont, CA.