Mathl. Comput. Modelling Vol. 17, No. 12, pp. 49-58, 1993
Printed in Great Britain. All rights reserved
0895-7177/93 $6.00 + 0.00
Copyright © 1993 Pergamon Press Ltd

AN X-MODEL FOR VISUAL PERCEPTION
SILVIO BEZERRA AND YVES CHERRUAULT
Laboratoire MEDIMAT, Université Pierre et Marie Curie
15, rue de l'École de Médecine, 75270 Paris Cedex 06, France

ARMAND CARON
Laboratoire M.S.I., Université de Limoges et M.A.S.I., C.N.R.S., U.A. no 818
7, rue Jules-Vallès, 19100 Brive La Gaillarde, France

(Received November 1992; accepted December 1992)

Abstract. We propose a model of visual perception based on the responses of X-type ganglion cells. This model is divided into two parts: the reconstruction stage and the recognition-learning stage. The model could explain some properties of the Lateral Geniculate Nucleus.
1. INTRODUCTION

The originality of the proposed model lies in a neuronal architecture based on the responses of the X-type cells, represented by the matrix (x_ij). The aim of this model is to reconstruct a pattern and to compare the reconstructed pattern with a set of previously learned patterns, in order either to recognise it or to include it in this set. The model represents a neural network and will be developed in two parts: the reconstruction mode and the recognition-learning mode.

2. THE MODEL BASED ON THE X-CELLS
The model is based on the responses of the X-type ganglion cells, principally contained in the fovea [1,2] of the retina. These cells are specialised in detecting [3] the contours of patterns, and they permit a reconstruction of these patterns. The visual stimulus P reaches the neuronal system in structure X, where the X-type responses are computed and processed. Information is then transmitted to the structure S, which computes the output from the responses computed in X. In the structure X the X-type responses are computed by the following equation [4,5]:

$$ (x_{ij}) = X(i,j,t_0) = C(i,j,t_0) - S(i,j,t_0 - \delta), $$

where

$$ C(i,j,t_0) = I_c(i,j,t_0) * \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} G_c(i,j,x,y)\, P(i,j,t_0)\, dx\, dy = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} G_c(i,j,x,y)\, P(i,j,t_0)\, I_c(i,j,t_0 - t)\, dx\, dy\, dt $$

is the response for the central receptive field, and

$$ S(i,j,t_0) = I_s(i,j,t_0) * \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} G_s(i,j,x,y)\, P(i,j,t_0)\, dx\, dy = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} G_s(i,j,x,y)\, P(i,j,t_0)\, I_s(i,j,t_0 - t)\, dx\, dy\, dt $$

represents the response for the surrounding receptive field.
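The centre-minus-surround structure of the X-type response can be illustrated with a small numerical sketch. It is not the authors' TURBO BASIC program: it assumes a stimulus held constant in time and temporal filters of unit gain, so that the temporal factors and the delay drop out and only the two spatial Gaussian terms remain. The widths `sigma_c` and `sigma_s`, the kernel `radius`, and the disk size are hypothetical values chosen for illustration.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Normalised 2-D Gaussian sampled on a (2*radius+1) x (2*radius+1) grid."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax, indexing="ij")
    g = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return g / g.sum()

def spatial_response(pattern, kernel):
    """Gaussian-weighted sum of the pattern around every cell (zero padding)."""
    r = kernel.shape[0] // 2
    padded = np.pad(pattern, r)
    n, m = pattern.shape
    out = np.zeros((n, m), dtype=float)
    for di in range(kernel.shape[0]):
        for dj in range(kernel.shape[1]):
            out += kernel[di, dj] * padded[di:di + n, dj:dj + m]
    return out

def x_response(pattern, sigma_c=1.0, sigma_s=3.0, radius=9):
    """Steady-state centre minus surround (assumed widths, not from the paper)."""
    centre = spatial_response(pattern, gaussian_kernel(sigma_c, radius))
    surround = spatial_response(pattern, gaussian_kernel(sigma_s, radius))
    return centre - surround

# Hypothetical stimulus: a filled disk of radius 20 on the 85 x 85 grid.
ii, jj = np.indices((85, 85))
disk = ((ii - 42) ** 2 + (jj - 42) ** 2 <= 20 ** 2).astype(float)
resp = x_response(disk)
```

Under these assumptions the response vanishes in uniform regions (inside the disk and far outside it) and is concentrated along the contour, which is the contour-detecting behaviour the text attributes to the X-type cells.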
G_c and G_s are the two-dimensional Gaussian curves for the central and surrounding receptive fields, respectively. I_c and I_s are the temporal responses of the central and surrounding receptive fields; they are modelled by the impulse responses of low-pass filters. For more details we refer to previous papers [6-8].

These X-type cell responses allow us to recognize the contour, but they do not permit us to "see" the two-dimensional pattern. After the compression of the visual information realised by the retina, a reconstruction of the pattern is necessary because each Lateral Geniculate Nucleus "sees" the visual field [9]. Moreover, we note that when the pattern is reconstructed, the ganglion cells' responses (X, Y, W) can be used together to reconstruct the pattern and to complete the visual information. This elementary analysis allows the connection with other layers in the cortex [9].

Using the notion of a hierarchy [10] of neurons which represents different levels of abstract representation of the visual world, we next propose the recognition-learning of the pattern. We note that in [11], recognition is both a parallel and a hierarchical multi-step process. Vision consists in producing the image in successive stages from the pattern on the retina: the first stage concerns the representation of the local properties of the pattern, the next is related to visible surfaces in a viewer-centred coordinate system, and the last concerns a centred pattern of the three-dimensional structure of the viewed shape [11]. In our model, this stage is called pattern reconstruction.

The problem of the recognition-learning stage may be set as follows: how is the information distributed a priori in the brain? This information is organised in such a way that it is accessible in particular to recognition, but also to reasoning or other forms of mental activity.
We suppose the existence of a set of learned patterns without presupposing a structure of learning: the recognition will consist of a test between the reconstructed pattern and the patterns in this set. References [11-14] may be consulted for more elaborate models of the recognition of objects and persons. Let us now introduce the notion of a neural network working in two modes: the reconstruction mode and the recognition-learning mode.

3. THE RECONSTRUCTION MODE
In the reconstruction mode (Figure 1) we have devised a reconstruction law for our model. We use the X-type cells' response (computed in X) and a double operator that detects vertical and horizontal lines (centred in the neighbourhood of each point of the discretized pattern in structure X). With these we reconstruct the isodensity zones of the illuminated parts of the pattern.
Figure 1. Schematic of the Reconstruction mode.
The visual stimulus is given in the P-structure and the X responses are computed in the X-structure. The responses of the total system are given in the S-structure.
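A double operator of this kind can be sketched as two line sums centred on each point of the discretized X response, one along the row and one along the column. This is only an illustration under stated assumptions: the segment `half_width` and the default normalisations `n_h` and `n_v` (here the segment length) are hypothetical, and the sketch does not claim to reproduce the authors' implementation.

```python
import numpy as np

def double_operator(x, half_width=2, n_h=None, n_v=None):
    """Sum the X responses along a horizontal and a vertical segment
    centred on every point, each divided by a normalisation parameter.

    half_width, n_h and n_v are illustrative assumptions.
    """
    k = 2 * half_width + 1
    n_h = k if n_h is None else n_h
    n_v = k if n_v is None else n_v
    padded = np.pad(x, half_width)  # zero padding at the borders
    n, m = x.shape
    horiz = np.zeros((n, m), dtype=float)
    vert = np.zeros((n, m), dtype=float)
    for d in range(k):
        # same row, neighbouring columns
        horiz += padded[half_width:half_width + n, d:d + m]
        # same column, neighbouring rows
        vert += padded[d:d + n, half_width:half_width + m]
    return horiz / n_h, vert / n_v

# A horizontal line excites the horizontal detector far more than the vertical one.
x = np.zeros((7, 7))
x[3, :] = 1.0
h, v = double_operator(x)
```

For this test line, the horizontal output at the centre of the line is 1.0 while the vertical output is 0.2, so the pair of outputs discriminates line orientation at each point.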
This double operator (Figure 2) was inspired by work on the line-specific responses of the cells in zone V3 of the visual cortex, whatever the colour. On the one hand, the X-type cells' response is isolated from other kinds of visual detection such as motion (zone V5) and colour (zone V4) [10,15]. On the other hand, the association of colour and isodensity zones permits the reconstruction of the original pattern. This association of colour and zones could be representative of the structure V1. In agreement with V1, we remark that cells belonging to the structure V1 prefer low spatial frequencies and are selective for wavelengths concentrated in the "blobs" [10,15], which are especially present in layers 2 and 3.

Figure 2. The double operator is computed in structure S.

Figure 3a. The Frobenius matrix norm for 70 iterations. This norm gives the stability of the pattern built at the output of the structure S in Reconstruction mode. [Plot: norm (vertical axis, 0 to 30) against the number of iterations p (horizontal axis), with one curve each for the square, lozenge, disk and triangle.]

Figure 3b. The Euclidean norm in Recognition-Learning mode.

Recognition   square   disk   lozenge   triangle
square        0.0      6.68   6.8       7.43
disk          6.68     0.0    0.12      0.75
lozenge       6.8      0.12   0.0       0.63
triangle      7.43     0.75   0.63      0.0

The reconstructed pattern is computed in structure S; it is represented by the matrix (s_ij(p)) and is calculated by the following equation:
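For all (i, j) such that x_ij > 0 we have, starting from the initial matrix (s_ij(0)):

$$ (s_{ij}(p)) = (s_{ij}(p-1)) + \frac{1}{N_h} \sum_{(l,k) \in \mathrm{vois}_h(i,j)} s_{l,k}\, x_{l,k} + \frac{1}{N_v} \sum_{(k,l) \in \mathrm{vois}_v(i,j)} s_{k,l}\, x_{k,l}, $$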
where vois_h(i, j) is the horizontal neighbourhood of the point (i, j), represented by a horizontal line, and vois_v(i, j) is the vertical neighbourhood of the point (i, j), represented by a vertical line. N_h is the normalization parameter in the horizontal direction and N_v the normalization parameter in the vertical direction. The reconstruction is obtained when ||(s_ij(p))||_F is stable (Figure 3a), i.e., bounded when p goes to infinity. A schematic flow-chart of our computer program in reconstruction mode is shown in Figure 3. It will be shown in Section 5 that the matrix (s_ij(p)) gives a stable image pattern at the output of the structure S in about 30 iterations. The numerical results show that the network rebuilds the pattern. This pattern will be denoted S_i. In Recognition-learning mode (Figure 3), the pattern S_i will either be recognized, if it belongs to the set of learned patterns, or become a candidate to join this set.

4. THE RECOGNITION-LEARNING MODE
We suggest the following algorithm:

1. Show the restored pattern S_k.
2. Compute i* such that J = ||S_k - W_i*||_F = min_i ||S_k - W_i||_F.
3. If J < ε (fixed noise threshold), then S_k is recognized as the cluster i*; otherwise we compute K.
4. Compute K = | ||S_k||_F - ||W_i*||_F | = min_i | ||S_k||_F - ||W_i||_F |.
5. If K < ε, then the classification is possible, but it is necessary to move the eyes in order to conclude.
6. Else choose another neuron index which does not belong to the clusters' set.
7. Learning:

$$ w_i(t+1) = w_i(t) + \alpha(t)\,[x_k - w_i(t)], \quad \text{if } i \in \mathrm{vois}(i^*), $$
$$ w_i(t+1) = w_i(t), \quad \text{else}, $$

where vois(i*) is the neighbourhood of the i* cell in the output layer, and where we have introduced a noise ε.
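The steps above can be sketched as follows. This is a minimal illustration, not the authors' program: the function names, the string labels for the three outcomes, the choice of `len(clusters)` as the "other index" in Step 6, and the `neighbourhood` callable standing in for vois(i*) are all assumptions made for the example.

```python
import numpy as np

def recognise(s_k, clusters, eps):
    """Steps 2 to 6 for a restored pattern s_k.

    clusters: list of 2-D arrays W_i; eps: noise threshold.
    Returns (decision, index); the decision labels are illustrative.
    """
    dists = [np.linalg.norm(s_k - w, "fro") for w in clusters]
    i_star = int(np.argmin(dists))                    # Step 2
    if dists[i_star] < eps:                           # Step 3
        return "recognised", i_star
    norm_s = np.linalg.norm(s_k, "fro")
    gaps = [abs(norm_s - np.linalg.norm(w, "fro")) for w in clusters]
    i_star = int(np.argmin(gaps))                     # Step 4
    if gaps[i_star] < eps:                            # Step 5
        return "move_eyes", i_star
    return "new_cluster", len(clusters)               # Step 6 (assumed index choice)

def learn(clusters, x_k, i_star, alpha, neighbourhood=lambda i: {i}):
    """Step 7: move the weights in vois(i*) towards x_k, leave the others unchanged.

    neighbourhood is a hypothetical stand-in for vois(i*).
    """
    near = neighbourhood(i_star)
    return [w + alpha * (x_k - w) if i in near else w.copy()
            for i, w in enumerate(clusters)]

# Toy 4 x 4 patterns (hypothetical data).
zeros, ones = np.zeros((4, 4)), np.ones((4, 4))
corner_a = np.zeros((4, 4)); corner_a[0, 0] = 1.0
corner_b = np.zeros((4, 4)); corner_b[3, 3] = 1.0
```

With these toy patterns, `recognise(ones + 0.01, [zeros, ones], 0.1)` falls in the "recognised" branch, while `recognise(corner_b, [corner_a], 0.5)` has a large distance but equal norms and therefore ends in the "move the eyes" branch of Step 5.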
A schematic flow-chart of our computer program in recognition-learning mode is shown in Figure 3. In our model we use two norms, the Euclidean norm and the Frobenius matrix norm:

$$ \|x\|_2 = \Big( \sum_{i=1}^{n} x_i^2 \Big)^{1/2} $$

is the Euclidean norm, and

$$ \|A\|_F = \Big( \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^2 \Big)^{1/2} $$

is the Frobenius matrix norm.

The Euclidean norm is used in Step 4 for computing the recognition (Figure 3b), and the Frobenius matrix norm gives the possibility of recognition after eye movement. Noise may be introduced in our model through the parameter ε.
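The quantity compared in Step 4 is an absolute difference of pattern norms, and a table of such differences for a set of stored patterns can be computed as in the sketch below. The entries of Figure 3b are additive along a chain (6.68 + 0.12 = 6.80 and 6.80 + 0.63 = 7.43), which is what a table of norm differences would produce; reading the figure this way is an interpretation, not something the text states. The 1 x 1 stand-in matrices are hypothetical and only carry offsets relative to the square, not the true pattern norms.

```python
import numpy as np

def norm_gap_table(patterns):
    """Matrix of | ||W_i||_F - ||W_j||_F | for a list of 2-D patterns."""
    norms = np.array([np.linalg.norm(p, "fro") for p in patterns])
    return np.abs(norms[:, None] - norms[None, :])

# Hypothetical stand-ins (order: square, disk, lozenge, triangle) whose norms
# differ from the square's by the offsets read off the first row of Figure 3b.
stand_ins = [np.array([[v]]) for v in (0.0, 6.68, 6.8, 7.43)]
table = norm_gap_table(stand_ins)
```

The resulting table is symmetric with a zero diagonal, and with these offsets its remaining entries match the disk, lozenge and triangle rows of Figure 3b.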
Figure 3. Schematic flow-chart of our program. [Flow-chart boxes: Pattern; X response; Reconstruction mode; Recognition-learning mode; Learning; Introduction of information (movement, by the Y-type cells' response); END.]
Figure 4a. X-type ganglion cells' response. Results of computer simulation in Reconstruction mode. P(x, y, t) is the stimulus pattern (disk) presented in the X-structure. It is represented by an 85 x 85 matrix. Each component represents the response from a single cell.

Figure 4b. X-type ganglion cells' response. Results of computer simulation. Computed pattern represented in surface form.

Figure 5a. X-type ganglion cells' response. Results of computer simulation in Reconstruction mode. P(x, y, t) is the stimulus pattern (triangle) presented in the X-structure. It is represented by an 85 x 85 matrix. Each component represents the response from a single cell.

Figure 5b. X-type ganglion cells' response. Results of computer simulation. Computed pattern represented in surface form.

5. NUMERICAL EXPERIMENTS
The distribution of the modelled X-type cells in our 85 x 85 matrix corresponds to the distribution of the real cells in the primate retina. The program was written in TURBO BASIC and runs on a PC 486-33 computer for pattern images. In the reconstruction mode we have used a two-dimensional pattern (Figure 6) in structure P, and the results of structures X and S have been represented in:

- a matrix form, where each component of the matrix (X_ij) or (S_ij) (i = 1, ..., 85 and j = 1, ..., 85) represents the intensity of the response of the X-cells (structure X) or the reconstructed pattern (structure S), respectively. A graphic representation of the matrix form is shown, where each component of the matrix (X_ij) or (S_ij) is represented by a circle whose radius is proportional to this component.

We have presented the same computed responses by means of a three-dimensional surface.

Figure 6a. X-type ganglion cells' response. Results of computer simulation in Reconstruction mode. P(x, y, t) is the stimulus pattern (lozenge) presented in the X-structure. It is represented by an 85 x 85 matrix. Each component represents the response from a single cell.

Figure 6b. X-type ganglion cells' response. Results of computer simulation. Computed pattern represented in surface form.

Figure 7a. X-type ganglion cells' response. Results of computer simulation in Reconstruction mode. P(x, y, t) is the stimulus pattern (square) presented in the X-structure. It is represented by an 85 x 85 matrix. Each component represents the response from a single cell.

Figure 7b. X-type ganglion cells' response. Results of computer simulation. Computed pattern represented in surface form.
The computed X-type ganglion cells' responses (structure X) are given in Figures 4-7 and correspond to a disk, a triangle, a lozenge and a square, respectively. The results of the structure S corresponding to the reconstruction mode are given in Figures 8-11. In the recognition-learning mode, we have chosen S_k as a disk, given in Figure 8, and we have built the clusters' set {W_i, i = disk, triangle, lozenge, square} shown in Figures 8 to 11, which represents the patterns in the memory.
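For readers who wish to repeat experiments of this kind, the four stimulus patterns can be generated on an 85 x 85 grid as in the sketch below. The sizes and positions of the shapes in the original simulations are not given in the text, so the half-size `r`, the centring, and the binary (0/1) intensities are assumptions made for illustration.

```python
import numpy as np

def make_stimuli(n=85, r=20):
    """Binary disk, square, lozenge and triangle centred on an n x n grid.

    r (half-size in cells) is a hypothetical parameter.
    """
    c = n // 2
    i, j = np.indices((n, n))
    di, dj = i - c, j - c
    return {
        "disk": (di**2 + dj**2 <= r**2).astype(float),
        "square": ((np.abs(di) <= r) & (np.abs(dj) <= r)).astype(float),
        "lozenge": (np.abs(di) + np.abs(dj) <= r).astype(float),
        # apex at the top row of the shape, base of half-width r at the bottom
        "triangle": ((di >= -r) & (di <= r) & (2 * np.abs(dj) <= di + r)).astype(float),
    }

stimuli = make_stimuli()
```

Each entry of `stimuli` is an 85 x 85 array that can play the role of the pattern P presented to structure X.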
Figure 8a. Reconstruction mode. Results of computer simulation in Reconstruction mode. S_disk corresponds to the pattern rebuilt in the S-structure. It is represented by an 85 x 85 matrix. Each component represents the response to the lines of cells in zone V3 in visual cortex.

Figure 8b. Reconstruction mode. Results of computer simulation. Computed pattern represented in surface form.

Figure 9a. Reconstruction mode. Results of computer simulation in Reconstruction mode. S_triangle corresponds to the pattern rebuilt in the S-structure. It is represented by an 85 x 85 matrix. Each component represents the response to the lines of cells in zone V3 in visual cortex.

Figure 9b. Reconstruction mode. Results of computer simulation. Computed pattern represented in surface form.
6. CONCLUSION

We have proposed a neural network model for visual perception whose originality is to use the ganglion cells' responses as the input of the system for rebuilding the original pattern. In this model, we have only used the X-type ganglion cells' responses. We observed that all patterns are quickly rebuilt in the Reconstruction mode, and we also noticed that we can introduce a noise parameter ε. This parameter can be used for improving the Recognition-learning mode in our model in order to give a qualitative representation of vision.
Figure 10a. Reconstruction mode. Results of computer simulation in Reconstruction mode. S_lozenge corresponds to the pattern rebuilt in the S-structure. It is represented by an 85 x 85 matrix. Each component represents the response to the lines of cells in zone V3 in visual cortex.

Figure 10b. Reconstruction mode. Results of computer simulation. Computed pattern represented in surface form.

Figure 11a. Reconstruction mode. Results of computer simulation in Reconstruction mode. S_square corresponds to the pattern rebuilt in the S-structure. It is represented by an 85 x 85 matrix. Each component represents the response to the lines of cells in zone V3 in visual cortex.

Figure 11b. Reconstruction mode. Results of computer simulation. Computed pattern represented in surface form.
This model would benefit from information coming from the responses of other ganglion cells, such as the Y-type ganglion cells' response to motion, which are involved in the visual perception process. It is clear that the present model does not claim to reflect the complexity of the real visual system. Here, we present only a sketch of this complexity. This sketch will be improved in future work.
REFERENCES
1. J.E. Dowling, The Retina: An Approachable Part of the Brain, The Belknap Press of Harvard University Press, (1987).
2. Y. Galifret, VISION (physiologie), Encyclopaedia Universalis, pp. 95&964.
3. C. Enroth-Cugell and J.G. Robson, The contrast sensitivity of retinal ganglion cells of the cat, J. Physiol. 187, 517-552 (1966).
4. C. Koch and I. Segev, Methods in Neuronal Modelling from Synapses to Networks, MIT Press, (1989).
5. J. Richter and S. Ullman, A model for the temporal organization of X- and Y-type receptive fields in the primate retina, Biol. Cybern. 43, 127-145 (1982).
6. S.J. Bezerra, A. Caron and Y. Cherruault, A study of the responses of bipolar cells applied to bidimensional pattern, 8th International Congress of Cybernetics and Systems, New York, (1990).
7. S.J. Bezerra, A. Caron and Y. Cherruault, Une étude des réponses intracellulaires des cellules bipolaires appliquée à l'image bidimensionnelle, 12ème Congrès International de Cybernétique, 1989, Namur-Belgique.
8. S. Bezerra, A. Caron and Y. Cherruault, A mathematical model of visual perception, International Journal of Bio-Medical Computing (IJBC) 32, 181-195 (1993).
9. M. Imbert, La neurobiologie de l'image, La Recherche 144, 600-613 (mai 1983).
10. S. Zeki, La construction des images par le cerveau, La Recherche 222, 712-721 (juin 1990).
11. Y. Burnod, An Adaptative Neural Network: the cerebral cortex, Collection Biologie théorique, Masson, (1988).
12. A. Caron, Une machine neuromimétique pour l'image, l'acquisition de connaissance, et la reconnaissance. Réalisation matérielle et expériences, Cybernetica XXXII (2) (1989).
13. A. Caron and P. Barral, Un algorithme sous contrainte pour la résolution de systèmes linéaires et son application à l'étude d'une mémoire associative, C. R. Acad. Sci. Paris, t. 313, Série I, pp. 791-796, (1991).
14. F. Fogelman-Soulié, Apport du connexionnisme à la fusion du renseignement, Science et Défense, DGA, Nouvelles Avancées scientifiques et techniques, pp. 274-289, (1992).
15. S. Zeki, Parallelism and functional specialization in human visual cortex, Cold Spring Harbor Symposia on Quantitative Biology, Volume LV, pp. 651-661, (1990).