Computer Methods and Programs in Biomedicine 25 (1987) 157-168
Elsevier CPB 00856
Radiologic automated diagnosis (RAD)

Gordon Banks, John K. Vries and Sean McLinden

Decision Systems Laboratory, University of Pittsburgh, Pittsburgh, PA, U.S.A.
RAD is a program currently being developed to interpret neuroimages. Given the clinical information usually available on the imaging request, RAD will analyze the scan directly from the data generated by the scanning machine to produce a differential diagnostic list explaining any lesions it discovers. RAD uses a computerized three-dimensional stereotaxic atlas of the nervous system as a model of normal structures in the analysis of scans.

Keywords: Diagnosis; Neuroimaging; Radiology; Artificial intelligence
1. Background

During the past ten years, new techniques for imaging the brain have had a revolutionary effect on the practice of neurology and neurosurgery. Formerly, many structural abnormalities in the brain were inaccessible to radiologic investigation. Arteriography and pneumoencephalography were valuable procedures, but they are invasive. Moreover, lesions that were avascular or too small to distort the ventricular system could only be studied through the functional changes which they caused in the clinical neurologic examination. Computed tomography (CT) has allowed direct imaging of the brain with a high degree of resolution, and is currently one of the most useful diagnostic tools of modern medicine. Yearly improvements in resolution and speed have continued to enhance its usefulness. Recently, magnetic resonance imaging (MRI) has reached the point in its development where it has begun to complement CT scanning. The data gathered by these imaging techniques
Correspondence: G. Banks, Decision Systems Laboratory, University of Pittsburgh, 1360 Scaife Hall, Pittsburgh, PA 15261, U.S.A.
consists of arrays of numeric values which are associated with regions in three-dimensional space inside the object being imaged. In the case of CT, these values, called Hounsfield numbers, are related to the x-ray attenuating properties of the matter occupying the region. In order to facilitate interpretation by radiologists, this data is displayed using computer graphics techniques in the form of two-dimensional slices through the object, with points in space represented as 'pixels' of intensity proportional to the Hounsfield numbers. This produces an image similar in appearance to conventional radiographs. To facilitate the generation of these images, a dedicated computer is part of each imaging system.
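As a concrete illustration of this display step, the sketch below maps a row of Hounsfield numbers to 8-bit pixel intensities with a linear window. The function name and the default window level/width are illustrative assumptions, not values from any particular scanner.

```python
def window_to_pixels(hounsfield_row, level=35, width=80):
    """Map Hounsfield numbers to 8-bit display intensities.

    Values below (level - width/2) clamp to 0, values above
    (level + width/2) clamp to 255; in between the mapping is
    linear. The level/width defaults are illustrative only.
    """
    lo = level - width / 2.0
    pixels = []
    for h in hounsfield_row:
        frac = (h - lo) / width        # position within the window
        frac = max(0.0, min(1.0, frac))  # clamp outside the window
        pixels.append(int(round(frac * 255)))
    return pixels
```

Air (about -1000) maps to black, bone (well above the window) to white, and soft tissue spreads across the intermediate gray levels.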
0169-2607/87/$03.50 © 1987 Elsevier Science Publishers B.V. (Biomedical Division)

2. Neuroimage interpretation

In order to successfully interpret neuroimages, the radiologist must have knowledge of the location, morphology, and x-ray attenuating properties (radiodensity and radiolucency) of many anatomic structures. He must know the relationships of anatomic regions to vascular territories, and how lesions in one anatomic location may produce morphologic changes in another region (for example, dilation of the lateral ventricles resulting from obstruction of the aqueduct of Sylvius by a posterior fossa tumor). While the images alone contain considerable differential diagnostic information, the radiologist is greatly aided by clinical information about the patient [1]. The patient's symptoms and signs can suggest a set of anatomic sites for closer scrutiny, as well as allow the radiologist to prioritize his differential diagnosis among those etiologic entities that present a similar radiologic picture to the one he has under current consideration.

Ideally, these anatomically based neuroimages are interpreted by radiologic subspecialists (neuroradiologists) in conjunction with clinical information supplied by neurologists or neurosurgeons. These specialists characteristically have the knowledge of correlative neuroanatomy necessary for optimum interpretation of these scans. Because of the usefulness of CT, scanners are now present in many hospitals which do not have neuroradiologists. In these hospitals, general radiologists are responsible for scan interpretation. Often, the community hospitals have the latest in CT hardware, but lack even neurological or neurosurgical support. Many times, the radiologist responsible for reading CT scans was trained before the advent of CT, has to rely on continuing education and reading to achieve his competency with these techniques, and may have had little instruction in neuroanatomy beyond medical school.

We believe that an expert system could be of value to physicians performing neuroradiologic diagnosis, whatever their training or competence. The general radiologist would benefit by reviewing a program-generated differential diagnosis for those diseases which he might not have considered. In addition, the explanatory features of an expert system could have value from the continuing education standpoint.
The neurologist, neurosurgeon, or neuroradiologist might find the program useful for screening scans for special attention, or for prompting consideration of the more arcane possibilities in difficult cases. In addition, such a system could facilitate analysis of the images in ways that are difficult for humans. For example, a program that could successfully extract regions automatically could make quantitative measurements of the volume of lesions or ventricular size with more facility than can be done from the film. Dynamic processes could be investigated by studying serial changes in lesions over time. Currently, these studies require tedious human interaction, which precludes their widespread use. Also, standardized objective methods might be developed to perform cross-modality correlation (e.g. CT vs. MRI vs. PET). At the present state of the art, there is no reason to believe that a computer capable of handling such an expert system could not be an integral part of the next generation of CT scanners without adding significantly to their cost.
3. Image analysis

The application of artificial intelligence (AI) techniques to image analysis is currently a topic of great interest in the fields of computer vision and robotics. The general field of AI applications to computer vision and image understanding is too broad to be covered here. The field is well surveyed in the textbook of Ballard and Brown [2], as well as in chapters on image understanding in Winston [3] and Cohen and Feigenbaum [4]. One of the most difficult steps in image understanding is segmenting the objects of interest from the background. A wide variety of techniques have been developed for this purpose, but all have had some drawbacks. Recently, AI techniques have been considered for this type of low-level processing, where a priori knowledge of the expected properties of objects can be used to guide edge, surface, and boundary detectors. These techniques may also have a role to play in region merging and texture analysis. In the field of medical imaging, Ballard and coworkers [5,6] and Lesgold [7] have applied AI techniques to the extraction of features from digitized chest films. In addition, AI techniques open the possibility of evaluating images with respect to analogic models, and for providing procedural knowledge for top-level control of analytical processes. AI methods can also be useful in choosing forms for representing higher-level knowledge about normal and pathologic features that may be present in the
image. Kumar [8] demonstrates how a rule-based system can be used to analyze objects which have been extracted from CT scans using conventional image analysis techniques such as edge detectors.
4. RAD

The goal of the RAD project is to develop a program using AI techniques which will take data from medical imaging devices along with a succinct list of the clinical findings and demographic information, such as would be provided on a request for an imaging study, and produce a ranked differential diagnostic list. The list should be as short as possible, and it should contain the diagnosis that eventually proves to be the correct one. The basic approach used in RAD is to analyze the scan data by comparing the pixel densities of regions of the scan with an analogic atlas-based three-dimensional model which is normalized to the patient. A similar approach using two-dimensional models is being pursued simultaneously and independently by Newell [9]. Focussing is done by using clinical manifestations and also by identifying asymmetries and regions of the image which are statistically abnormal in terms of pixel histogram. Initial hypotheses are identified based on the location and histogram characteristics of the abnormality and the clinical information. Hypotheses are tested against the image data, and those that cannot be ruled out are reported on the differential diagnostic list. The program is reported as work-in-progress. In the following sections, we will report in detail on the parts of the system that are operational, some of which have interesting applications, and outline the planned approach for future work.
5. Image representation

Raw data is entered into the system by means of digital tapes produced by imaging devices. These tapes contain processed information in archive format with image data represented as arrays of binary numbers. The first step in analysis is to convert these arrays into Lisp vectors to make the
image information accessible to RAD. In the case of the GE 9800 CT scanner, programs to accomplish this step have been completed. For this scanner, the images are stored as 512 × 512, 384 × 384, or 256 × 256 arrays of 12-bit numbers representing radiodensities (Hounsfield numbers) corresponding to the coordinates {x, y} in each CT slice. The various CT slices may be thought of as taken at different points along a z axis.

Udupa [10] has surveyed the large number of image representation schemes in current use, including chain codes, boundary and surface representations, B-spline functions, generalized cylinders, and quadtree and octree encoding. For our application, we have chosen the latter technique, which we feel offers distinct advantages.

Two-dimensional image representation using quadtree encoding was first proposed by Klinger [11] and elaborated by Samet [12]. In this scheme, binary images extracted from raw data are represented by a tree of order (or degree) 4 (quadtree). The root level can be conceptualized as a block large enough to enclose all of the pixels making up the image or region. A single number can be assigned to the root node, denoting its 'state'. If it is 0, the entire image is background (white). If it is 2, the entire image is object (black). If the entire image cannot be classified as all object or all background, the number is 1 (gray) and the tree is then subdivided into quadrants. For each of these subquadrants, the same process is reapplied, characterizing their states as 0, 1, or 2. This process may thus be applied recursively. When the process terminates, we are left with a quadtree structure whose terminal nodes are all 0 or 2. One computational advantage immediately becomes apparent when it is considered that large sections of a typical image will be all background or all object, and processing need only continue for those subregions which contain borders.

The quadtree concept can easily be extended to three dimensions (or indeed, generally into n dimensions) [13]. The three-dimensional representation, octree (tree of order 8) encoding, was elaborated by Meagher [14]. Here the three-dimensional regions are divided recursively into octants until the desired resolution is obtained. Fig. 1 shows how a simple object (a rectangular parallelepiped) in a region can be completely represented by a three-level octree.

Fig. 1. Octree representation.

Some of the advantages of quadtree and octree encoding are:
(1) The encoding process can be carried to any level of resolution, including the single-pixel level.
If high resolution is not necessary, examination of the top few levels of the tree saves orders of magnitude of computer processing time.
(2) Image regions are naturally organized into hierarchical relationships, which facilitates context analysis.
(3) An extensive library of algorithms for conversion between quadtrees and other forms of image representation, and for calculating the geometric properties of images represented as quadtrees, has been developed by Samet.
(4) Three-dimensional octree images can easily be constructed from two-dimensional slices in quadtree format.
(5) The operations of translation, scaling, rotation, and hidden-surface elimination, which are computationally intensive for most image representation schemes, can be accomplished without multiplication or division. Rotations at angles of 90° simply require changing the order of traversal of the tree. Arbitrary rotations are more complicated but still more economical than with the conventional trigonometric methods [14].
(6) Composite images can be quickly constructed by logical union or intersection of source images. In this fashion, octree-encoded images of anatomic structures from the atlas can be combined with images from CT scans.
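The recursive 0/1/2 classification, and the logical union by parallel traversal mentioned in advantage (6), can be sketched in a few lines. This is an illustrative Python rendering with names of our own choosing, not the project's Lisp code.

```python
WHITE, GRAY, BLACK = 0, 1, 2  # node states, as in the text

def build_quadtree(image, x=0, y=0, size=None):
    """Recursively encode a square binary image (list of lists of 0/1).

    Returns a state code for uniform blocks, or (GRAY, [subtrees])
    for mixed blocks, with quadrants ordered NW, NE, SW, SE.
    """
    if size is None:
        size = len(image)
    vals = {image[y + dy][x + dx] for dy in range(size) for dx in range(size)}
    if vals == {0}:
        return WHITE
    if vals == {1}:
        return BLACK
    h = size // 2  # subdivide the mixed block into quadrants
    return (GRAY, [build_quadtree(image, x, y, h),
                   build_quadtree(image, x + h, y, h),
                   build_quadtree(image, x, y + h, h),
                   build_quadtree(image, x + h, y + h, h)])

def union(a, b):
    """Logical OR of two quadtrees by parallel traversal."""
    if a == BLACK or b == BLACK:
        return BLACK
    if a == WHITE:
        return b
    if b == WHITE:
        return a
    kids = [union(sa, sb) for sa, sb in zip(a[1], b[1])]
    # Collapse to a uniform node when the union fills the block.
    return BLACK if all(k == BLACK for k in kids) else (GRAY, kids)
```

Uniting the quadtree of a left-half-black image with that of a right-half-black image collapses to a single all-black root node, which is the economy the text describes.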
6. Regional segmentation

The CT cuts correspond to tomographic slices through the head at a given level and a preset angle with respect to the base. The data in each CT cut is a two-dimensional array of Hounsfield numbers in the x and y axes. The depth of the stack of slices representing all the tomographic cuts corresponds to the z coordinate. The Hounsfield number therefore represents the radiodensity of a region of space located at a given set of {x, y, z} coordinates. The approximate Hounsfield numbers for various substances found in the head are shown in Table 1. It should be noted that CT scanners are often calibrated with respect to cerebrospinal fluid (csf) density, and that these ranges may vary considerably in individual cases. Nevertheless, they serve as useful initial values prior to histogram optimization. For each substance shown in Table 1, the number of pixels falling into each range can be computed for the whole scan or for any subdivision thereof. Image segmentation is begun by classifying each pixel in a given CT cut according to its Hounsfield number. Performing simple statistical analysis on the tabulated results of the pixel counts
and their {x, y} coordinates for each slice can yield useful focussing information for the image analysis process. Table 2 is a tabulation of the pixels for a CT slice containing a contrast-enhancing tumor (glioblastoma) which is in the left parietal-occipital region. The origin of coordinates is in the upper left corner. For each Hounsfield partition, the x and y coordinates of the 'center of gravity' for that partition are tabulated. Inspection of the table shows that the bone partition is symmetric, while 26% of the pixels lie in the hyperdense (contrast-enhancing) range, indicating gross abnormality. These hyperdense pixels are centered in the left lower quadrant (the center of the slice is at {252, 248}). Gray and white matter densities are shifted to the right, while there are more hypodense pixels in the left lower quadrant than there ought to be. This almost instantaneous analysis can focus attention on the lower left quadrant as well as give some idea of the characteristics of the lesion (large, enhancing, with associated hypodense areas).

TABLE 1
Approximate Hounsfield ranges

Tissue                       Range
Fat                          -100 to  -10
Cerebrospinal fluid           -10 to   15
White matter                   28 to   32
Gray matter (- contrast)       33 to   38
Gray matter (+ contrast)       39 to   45
Tumor                          50 to  150
Blood (- contrast)             50 to   80
Blood (+ contrast)             50 to  100
Bone                          100 to 4000

TABLE 2
Statistical analysis of Hounsfield ranges

Density       Area     %    cgx   cgy
Air             32     0    243   436
Hypodense       16     0    242   346
White          527    12    268   231
Gray          2041    49    268   226
Hyperdense    1090    26    217   267
Heme           176     4    100   250
Bone           263     6    250   236
Total         4145   100    252   248
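The per-partition tabulation of Table 2 amounts to a single histogram pass over the slice. A simplified sketch follows; the function name is ours, and the partition ranges passed in would be taken from Table 1 (or from individualized thresholds).

```python
def tabulate(slice_, partitions):
    """Count pixels per Hounsfield partition and compute each
    partition's 'center of gravity' (mean x, mean y), as in Table 2.

    `slice_` is a 2-D array of Hounsfield numbers; `partitions`
    maps a name to an inclusive (low, high) range.
    """
    stats = {name: [0, 0, 0] for name in partitions}  # count, sum x, sum y
    for y, row in enumerate(slice_):
        for x, h in enumerate(row):
            for name, (lo, hi) in partitions.items():
                if lo <= h <= hi:
                    s = stats[name]
                    s[0] += 1
                    s[1] += x
                    s[2] += y
    return {name: (c, (sx / c, sy / c)) if c else (0, None)
            for name, (c, sx, sy) in stats.items()}
```

Comparing a partition's center of gravity against the slice center, as the text does with {252, 248}, is then a simple coordinate comparison.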
7. Computerized stereotaxic atlas

Understanding of an image of the nervous system requires the interpreter (whether human or machine) to have a knowledge of neuroanatomy. Then regions of the scan corresponding to neuroanatomic structures which are visualized by the technique can be identified. We use structures represented in a computerized three-dimensional neuroanatomic atlas as models for comparison with extracted objects, as well as tools for facilitating the extracting process. These models are represented internally in the same octree-encoded format as the objects extracted from CT. The octree method facilitates comparisons since it is simple to make logical ANDs and ORs (intersections and unions) between octrees representing atlas objects and extracted objects.

The system can normalize objects derived from atlases to individual CT scans and correct for skew of the patient in the scanner. Three-dimensional images combining arbitrary atlas objects and objects extracted from scans can be produced. The three-dimensional representations of atlas objects can be sectioned in any arbitrary plane, and these sections can be projected upon a two-dimensional CT slice made in any plane. This component of the RAD system is useful in its own right for planning stereotaxic neurosurgical procedures and teaching neuroanatomy. Fig. 2 shows the modules in RAD for generating and manipulating three-dimensional images extracted from CT scans and atlases.
Fig. 2. Image manipulation and display.

Image files from the GE 9800 scanner are extracted from tape and converted to 512 × 512 arrays of Hounsfield numbers. These arrays are averaged and interpolated to produce a 256 × 256 × 256 cube of 16-bit numbers. The ORTHO and CTVIEW routines are used to obtain arbitrary two-dimensional slices in the axial, coronal, or sagittal planes. Octree data structures are generated by recursively applying the routines in OCTAGEN until the daughter octants are entirely full or empty. If this has not occurred by the time the octree is eight levels deep, the pixels are forced to be full or empty. Octrees are created for each appropriate Hounsfield range (bone, csf, white matter, etc.). Although average values for these ranges are known, individualized ranges can be obtained prior to octree generation by manually thresholding using CTVIEW. These octree data structures are the basis for the remaining operations.

The XFRM routine is used to translate, scale, and rotate the octree objects as well as to perform hidden surface elimination. The output of this routine is a 256 × 256 array of 16-bit numbers that represent the distance from the surface of the viewing plane to the object at each point (z-buffer). The SHADER routine generates pseudo three-dimensional shading from the z-buffer. The z-buffer is interpreted as a contour map, and imaginary light vectors are reflected from its surface. The intensity of reflected light is calculated according to Lambert's law of geometrical optics. Color map indexing is controlled by the STOA and ATOS routines. Composite images are generated using the LOGICAL_OR and LOGICAL_AND routines. These programs perform a parallel traversal of two objects in octree format to produce a third octree object which is the logical OR or logical AND of the original two objects.

Fig. 3 shows a view of the base of the skull from above. This picture was made from the bone-window octree of a CT scan using the technique described above.

Fig. 3. Base of skull viewed from above.

Fig. 4 shows the tools for extracting and manipulating objects for the computerized atlas. Sources may be conventional printed atlases, CT scans displayed on the video monitor, or interactively edited existing data structures. Plates from printed atlases are placed on the graphics tablet (HIPAD) and structures are outlined using a mouse-like device. Alternatively, CT scans can serve as a source for objects. Their borders can be traced from CT sections on the Sun computer
Fig. 4. Tools for extracting and manipulating objects from an atlas.
graphic display screen using the mouse. The FM module records the z coordinate of the two-dimensional outlines. The stacks of outlines from serial atlas or CT sections are processed by the POLYHEDRON module to form three-dimensional objects. It performs linear interpolation to create a uniform series of outlines at 1-mm intervals from the bottom of an object to the top. These outlines are then processed to yield a set of {x, y, z} coordinates for all points in a three-dimensional region belonging to the object. The program ATLAS takes these points as input and produces an octree representation of the object in the same format as the octrees used for the objects extracted from the CT scans. This means that all of the octree transformation operations used on CT scans can also be used on these atlas-derived objects. This includes scaling, translation, and rotations as well as the logical OR and logical AND operations. Composite objects may be built from subunit objects, and atlas objects may be combined with regions extracted from CT scans. Fig. 5 shows a three-dimensional view of the thalami as seen from the top. These thalami were derived from Schaltenbrand's stereotaxic atlas [15] by tracing with the HIPAD.
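The text describes POLYHEDRON only as performing linear interpolation between serial outlines; the sketch below fills in one plausible reading, under the simplifying assumption that successive outlines carry the same number of corresponding points. The function name is ours.

```python
def interpolate_outlines(outlines):
    """Given {z_mm: [(x, y), ...]} outlines with matching point
    counts, linearly interpolate to produce an outline at every
    1-mm level from the bottom of the object to the top.
    """
    levels = sorted(outlines)
    filled = {}
    for z0, z1 in zip(levels, levels[1:]):
        a, b = outlines[z0], outlines[z1]
        for z in range(z0, z1):
            t = (z - z0) / (z1 - z0)  # fractional distance between slices
            filled[z] = [(ax + t * (bx - ax), ay + t * (by - ay))
                         for (ax, ay), (bx, by) in zip(a, b)]
    filled[levels[-1]] = outlines[levels[-1]]  # keep the topmost outline
    return filled
```

The filled stack of outlines can then be rasterized into the {x, y, z} point set that ATLAS converts to an octree.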
8. Normalization
Fig. 5. Thalami extracted from an atlas.
The normalization problem may be broken down into three parts. The first involves transformation of all the structures in the computerized atlas into a standard projection, regardless of the source of their derivation. The second involves normalizing components in a neuroanatomic database to structures in patient CT scans. This requires the establishment of stable reference points that are relatively invariant from patient to patient. The third part involves correction in three dimensions for the obliquity of the alignment of the patient with the scanner.

We have found that it is easier to normalize the neuroanatomic database to the scan than the converse. Since there are fewer data points in the database than in the scan, we apply a homogeneous coordinate transformation to the three-dimensional atlas in order to fit it to the coordinates of the CT under consideration. To normalize for the different orientations of the stereotaxic atlas plates or CT projections used to generate the neuroanatomic database, the axial view as seen from the top was defined as the standard view. Defining the orientation of the atlas plates for the ATLAS program (i.e. axial, coronal, sagittal) automatically generates a transform matrix to allow a homogeneous coordinate transformation into the standard view.

The normalization of structures from the neuroanatomic database to patient CT scans is based on the line from the anterior commissure to the posterior commissure (AC-PC line). This line has been used extensively by stereotaxic neurosurgeons and has been found to be relatively invariant [16]. It has been used as a reference line for the construction of coordinate systems for several stereotaxic atlases. To determine the AC-PC line, the midline sagittal CT cut is generated from the interpolated three-dimensional CT scan using the ORTHO program. This view is displayed on the video monitor, and the positions of the anterior and posterior commissures are marked using a mouse. The midpoint of the AC-PC line determines the origin of the x and y coordinates. The z coordinate is determined by the slice selected to represent the midline in the sagittal plane.
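The translation component of such a normalization can be sketched directly: a homogeneous 4 × 4 matrix that moves the AC-PC midpoint to the origin. The function names are ours, and rotation and scaling toward the standard view would compose with further matrices not shown here.

```python
def acpc_origin_transform(ac, pc):
    """Homogeneous 4x4 transform moving the midpoint of the
    AC-PC line to the origin (translation step only)."""
    mx, my, mz = (0.5 * (a + p) for a, p in zip(ac, pc))
    return [[1, 0, 0, -mx],
            [0, 1, 0, -my],
            [0, 0, 1, -mz],
            [0, 0, 0, 1]]

def apply(m, point):
    """Apply a homogeneous transform to an {x, y, z} point."""
    x, y, z = point
    v = (x, y, z, 1)
    return tuple(sum(m[i][j] * v[j] for j in range(4)) for i in range(3))
```

Marking the two commissures thus suffices to generate the matrix automatically, as the text describes.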
The transformation matrix necessary to normalize the neuroanatomic database with respect to the origin of the {x, y, z} coordinate system is automatically generated when the AC-PC line is marked. Correction for obliquity involves the determination of an orientation vector in the sagittal, coronal, and axial planes. In the case of the sagittal view, the orientation of the AC-PC line is used.
In the coronal plane, the line connecting the floor and the roof of the third ventricle is traced with the mouse on a coronal view produced by ORTHO. The axial vector is obtained by mousing the line between the aqueduct of Sylvius and the most anterior point of the third ventricle on an axial section containing both. While these operations are currently done by hand, we hope to be able to automate the process. This will require the program to be able to locate the third ventricle as well as to determine the landmarks for the commissures.

Fig. 6 shows the results of combining the octree of the normalized central ganglionic masses (caudate, putamen, globus pallidus, and thalamus) extracted from Schaltenbrand's atlas with the skull octree from a patient CT scan using the LOGICAL_OR routine. The front of the skull has been cut away to reveal those gray-matter structures within. The PROJECT module allows two-dimensional back projections of the objects in the atlas to be made upon any arbitrary CT slice generated with the ORTHO routine. Fig. 7 shows a back-projected section of the central ganglionic masses extracted from the atlas on a coronal section of a patient CT. Fig. 7 is a black-and-white reproduction of the color display of the Sun, with each component of the ganglionic masses assigned a
Fig. 6. Atlas-derived central ganglionic masses ORed with patient skull.
different number in the color map. This technique is useful in stereotaxic surgery. Fig. 8 shows a projection of the VL nucleus of the thalamus and the whole thalamus from the atlas on an axial section of a patient CT. The VL nucleus is a frequent target for the placing of stereotaxic lesions. The ability to superimpose the target from the atlas upon the patient's CT scan can help the surgeon plan his approach. Stereotaxic surgery is usually done in an operating room which is fitted with a CT scanner, and so the computerized stereotaxic atlas has potential for becoming a useful tool in this realm.

Fig. 7. Back projection of coronal section of ganglionic masses.

Fig. 8. Thalamus and VL nucleus projected in axial section.

9. Clinical correlation
Clinical information is requested as a part of every requisition for neuroimaging studies. This clinical information is useful to the neuroradiologist in focussing his attention on anatomic regions where involvement would explain the clinical manifestations. The demographic information (sex, age, race, etc.) is useful because the different etiologic entities causing the radiographic lesions have varying predilections for certain patient 'ecologies'. Taken together, the ecology and the neurologic manifestations serve to focus attention on a few probable explanations and to narrow the differential diagnosis. For example, ataxia and vomiting in a 6-year-old will bring to mind processes involving the posterior fossa, likely medulloblastoma or astrocytoma of the cerebellum. In an adult, cerebellar hemorrhage or metastatic carcinoma would be more likely with the same manifestations. Subdural hematomas are more likely in the very young and the very old, as well as in alcoholics. The scan of a patient with a left hemiparesis needs special attention to the right motor cortex, internal capsule, and corticospinal tracts.

We have constructed a database of neuroanatomic objects linked to the symptoms and signs produced when the objects are involved with pathologic processes. The neuroanatomic objects are also linked into the atlas representations, so the spatial geometric relations are known. The vascular territories are represented so that constellations of symptoms and signs can be localized by connectivity (tracts), spatial, or vascular relationship. An example of a manifestation data structure is the following:
(setq |ALEXIA WITHOUT AGRAPHIA|
  (make-manf
    :comment 'NIL
    :definition "Ability to write, but not read, even what the patient has written."
    :source "Kertesz, Chap. 13, 14."
    :anatomy '(|any 1 of|
                (|L ANGULAR GYRUS|
                 (|all of|
                   (|*SPLENIUM OF CORPUS CALLOSUM|
                    (|any 1 of|
                      (|*L OCCIPITAL CORTEX|
                       |*L OCCIPITAL WHITE MATTER|))))))
    :diagnosticanatomy '((ANATOMIC
                           (|any 1 of| (|L OCCIPITAL LOBE|)))
                         (VASCULAR
                           (|any 1 of| (|L POSTERIOR CEREBRAL|))))
    :aka '(|PURE ALEXIA| |PREANGULAR ALEXIA|)))
Manifestations implicate involvement of neuroanatomic objects. These objects (e.g. thalamus) are represented by Lisp structures which contain their spatial coordinates and extent, pointers to the data structures necessary for imaging their models, their subunits (for thalamus, these include the different thalamic nuclei, such as VL), their superunits, and their vascular supply. Through the use of these data structures, a set of manifestations can be explained by hypotheses involving anatomic structures, regions, or vascular territories. Such information will serve to focus the attention of the image analysis on certain regions as well as help order the differential diagnostic list.
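How such records could focus attention can be illustrated with a toy Python rendering. The two manifestation entries and their linked structures below are illustrative stand-ins (only "alexia without agraphia" appears in the source), and the localization rule is a deliberately simple intersection.

```python
# Hypothetical rendering of manifestation records: each lists the
# anatomic objects whose involvement could explain it, plus the
# vascular territory supplying them. Entries are illustrative only.
MANIFESTATIONS = {
    'alexia without agraphia': {
        'anatomy': {'L angular gyrus', 'splenium of corpus callosum'},
        'vascular': {'L posterior cerebral'},
    },
    'right hemianopia': {
        'anatomy': {'L occipital cortex', 'L optic radiation'},
        'vascular': {'L posterior cerebral'},
    },
}

def localize(findings):
    """Intersect vascular territories across the findings: a single
    territory that explains every manifestation is a strong focusing
    hypothesis for the image analysis."""
    territories = None
    for f in findings:
        t = MANIFESTATIONS[f]['vascular']
        territories = t if territories is None else territories & t
    return territories or set()
```

The surviving territories (or, analogously, anatomic regions) would then steer both the region-selection step and the ordering of the differential list.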
10. Identification of imaged regions

While this part of the program is still developmental, we can outline the approach. The basic problem is to first identify known neuroanatomic structures which image well, based on comparison with their models. The regions which are candidates for identification with a specific anatomic structure can be selected by expected volume, tissue density, and the coordinates of the expected
location. We have found experimentally in two-dimensional CT cuts, using a crude triangle shape of proper orientation as a model, that the putamen-pallidal complex is easily identified, even when only the central region of the scan is specified as the location. Once the more easily identified structures (ventricles, ganglionic masses) are located, they can be used to assist in the identification of more obscure structures.

Several approaches are useful in the analysis of these regions, given a model. It is not to be expected that simply normalizing the model to the patient CT will always ensure congruency between model and corresponding patient structures in the image. The model may have to be deformed in certain ways in order to match the patient [17,18]. Certain structures, for example the ventricles, are quite variable. However, the variability is usually manifest in a few simple stereotyped ways, which can be captured in a knowledge base, or in a set of models. Pizer and coworkers [19,20] are attempting to develop a method based on symmetric axis transforms for defining neuroanatomic objects by hierarchical shape descriptors. While their method is mathematically intensive, there are similarities (conceptually, at least) between the construction of their Gaussian-blurred hierarchic shapes and the octree representation. Another tool that may be quite useful in comparing regions in the scan with the model is the Levenshtein distance [21]. This technique is a method of comparing strings for closeness of fit. String-like structures can be generated from the images and the models for any arbitrary point by sending out radial vectors from that point and computing a histogram of the radially encountered changes of Hounsfield number.
For example, if we restrict ourselves for simplicity to densities of CSF (C), white matter (W), gray matter (G), and bone (B), and plot the histogram at 2-mm intervals along some imaginary radial vector from the center of the pallidum through the capsule, caudate, ventricle, and frontal lobe to the skull, we might have something like: G G W G G C C C C W W W W W W G G G G B B B B.
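The construction of such a density string can be sketched as follows, assuming a two-dimensional slice already classified into tissue labels. The grid, function name, and step convention are illustrative assumptions:

```python
# Sketch of building a radial density string from a 2-D slice of tissue
# labels ('C' = CSF, 'W' = white matter, 'G' = gray matter, 'B' = bone),
# sampling one label per fixed step along a ray from a chosen point.
def radial_string(grid, start, direction, step=1.0, max_steps=50):
    """Walk from `start` along `direction` over a grid of tissue labels,
    collecting one label per step until the grid edge is reached."""
    x, y = start
    dx, dy = direction
    labels = []
    for _ in range(max_steps):
        xi, yi = int(round(x)), int(round(y))
        if not (0 <= yi < len(grid) and 0 <= xi < len(grid[0])):
            break  # ray has left the imaged volume
        labels.append(grid[yi][xi])
        x += dx * step
        y += dy * step
    return ''.join(labels)
```

A real implementation would sample Hounsfield numbers from the scan data and bin them into tissue classes, but the resulting string has exactly the form of the example above.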
Sets of vectors about a point might then be compared to the expected vectors from the database to determine how close the unknown was to the model.
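The Levenshtein distance cited above [21] counts the insertions, deletions, and substitutions needed to turn one string into another; a small distance between a region's radial string and the model's expected string indicates a close fit. A minimal dynamic-programming sketch:

```python
# Levenshtein (edit) distance between two density strings: the minimum
# number of single-character insertions, deletions, and substitutions
# needed to transform string a into string b.
def levenshtein(a, b):
    prev = list(range(len(b) + 1))  # distances from a[:0] to each prefix of b
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # substitution (or match)
        prev = curr
    return prev[-1]
```

For example, a radial string whose ventricle run is one sample shorter than the model's would differ by a distance of 1, while a string crossing an unexpected mass would diverge sharply.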
11. Neuroradiologic knowledge base

Abnormalities in a scan consist of regions which represent known structures that are distorted, displaced, or missing, and regions which represent objects for which there is no normal analogue, such as hematomas, tumors, aneurysms, and bullets. This second class of lesion cannot be handled exclusively by a model-based approach and requires a knowledge base of neuroradiology. In order to decide whether a given diagnosis is consistent or inconsistent, the program must be able to access knowledge of the radiologic features which are explainable by that diagnosis. The neuroradiologic knowledge base will contain a profile of each etiologic lesion which is detectable using neuroradiologic imaging techniques. Each profile will have knowledge of the radiographic features which may be observed with each lesion, demographics, favored anatomic sites, and specific radiologic features such as presence of calcifications, characteristic densities, edema, multiplicity, cavitation, etc. There will also be procedural knowledge for determining the presence or absence of the characteristic features, in the form of functions that know how to determine the presence of calcifications within the lesion, surrounding edema, and so on. Both the classic appearance of a particular lesion and the frequency of known variations in the appearance of each lesion will be entered into the database. It is estimated that between 500 and 1000 lesions will be needed for this database.
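One way such a profile might be organized is sketched below, pairing declarative feature frequencies with hooks for the procedural feature tests the text describes. All field names, the scoring rule, and the example values are illustrative assumptions, not the RAD knowledge base:

```python
# Hypothetical sketch of a lesion profile: declarative radiographic
# features (with observed frequencies) plus slots for procedural tests.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class LesionProfile:
    name: str
    favored_sites: List[str]
    demographics: str
    classic_features: Dict[str, float]  # feature -> frequency of occurrence
    feature_tests: Dict[str, Callable] = field(default_factory=dict)

    def consistency(self, findings: Dict[str, bool]) -> float:
        """Score how well observed findings match the expected frequencies:
        a present feature contributes its frequency, an absent one its
        complement; the total is averaged over the profile's features."""
        score = 0.0
        for feat, freq in self.classic_features.items():
            if feat in findings:
                score += freq if findings[feat] else (1.0 - freq)
        return score / max(len(self.classic_features), 1)

# Example entry (values are placeholders for illustration only)
meningioma = LesionProfile(
    name='meningioma',
    favored_sites=['parasagittal', 'sphenoid wing', 'convexity'],
    demographics='adults; more common in women',
    classic_features={'calcification': 0.2,
                      'contrast enhancement': 0.9,
                      'surrounding edema': 0.6},
)
```

A diagnosis would be judged consistent when its profile's score against the detected features exceeds those of competing profiles, with the `feature_tests` functions invoked on the scan data to supply the findings.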
References

[1] P. Doubilet and P.G. Herman, Interpretation of radiographs: effect of clinical history, AJR 137 (1981) 1055-1058.
[2] D.H. Ballard and C.M. Brown, Computer Vision (Prentice-Hall, Englewood Cliffs, NJ, 1982).
[3] P.H. Winston, Artificial Intelligence, 2nd edn. (Addison-Wesley, Reading, MA, 1984).
[4] P.R. Cohen and E.A. Feigenbaum, Handbook of Artificial Intelligence (W. Kaufman, Los Altos, CA, 1982).
[5] D.H. Ballard, Hierarchic Recognition of Tumors in Chest Radiographs (Birkhäuser-Verlag, Basel, 1976).
[6] D.H. Ballard, U. Shani and R.B. Schudy, Anatomical models for medical images, in: Proc. 3rd COMPSAC, IEEE Computer Society International Computer Software and Applications Conference (1979) 565-570.
[7] A.M. Lesgold, Acquiring expertise, in: Tutorials in Learning and Memory: Essays in Honor of Gordon Bower, J.R. Anderson and S.M. Kosslyn, eds. (W.H. Freeman, San Francisco, CA, 1984).
[8] R. Kumar and S. Srihari, An expert system for the interpretation of cranial CT scan images, in: Proceedings of the Expert Systems in Government Symposium, K. Karna, ed., McLean, VA, October 1985.
[9] J.A. Newell and E. Sokolowska, Model based recognition of CT scan images, in: MEDINFO 86 Proceedings, R. Salamon, B. Blum and M. Jorgensen, eds. (Elsevier, Amsterdam, 1986) pp. 619-623.
[10] J.K. Udupa, Display of 3D information in discrete 3D scenes produced by computerized tomography, Proc. IEEE 71 (1983) 420-431.
[11] A. Klinger and C.R. Dyer, Experiments in picture representation using regular decomposition, Comput. Graph. Image Proc. 5 (1976) 68-105.
[12] H. Samet, Region representation: quadtrees from boundary codes, Commun. ACM 23 (1980) 163-169.
[13] J. Udupa, S.N. Srihari and G.T. Herman, Boundary detection in multidimensions, IEEE Trans. PAMI 4 (1982) 41-50.
[14] D. Meagher, Geometric modelling using octree encoding, Comput. Graphics Image Proc. 19 (1982) 129-147.
[15] G. Schaltenbrand and W. Wahren, Atlas for Stereotaxy of the Human Brain (Thieme, Stuttgart, 1977).
[16] J. Talairach, VI Congreso Latino Americano de Neurochirurgia, Montevideo (1955) 865-925.
[17] F.L. Bookstein, Size and shape spaces for landmark data in two dimensions, Stat. Sci. 1 (1986) 181-242.
[18] R. Bajcsy and C. Broit, Matching of deformed images, Proc. Int. Conf. Pattern Recogn., München, October 1982, pp. 351-353.
[19] L.R. Nackman and S.M. Pizer, Three-dimensional shape description using the symmetric axis transform. I. Theory, IEEE Trans. PAMI 7 (1985) 187-202.
[20] S.M. Pizer, W.R. Oliver, J.M. Gauch and S.H. Bloomberg, Hierarchical figure-based shape description for medical imaging, Technical Report 86-026 (University of North Carolina, Department of Computer Science, Chapel Hill, NC, 1986).
[21] V.I. Levenshtein, Binary codes capable of correcting deletions, insertions, and reversals, Soviet Physics-Doklady 10 (1966) 707-710.
Copyright notice This article is based on "Radiologic automated diagnosis (RAD)", by G. Banks, J.K. Vries and S. McLinden, Decision Systems Laboratory, appearing in Tenth Annual Symposium on Computer Applications in Medical Care, Washington, DC, 25-26 October 1986, pp. 228-239, © 1986 IEEE.