-Tti3E%:: in Biome!!Iicine ELSEVIER
Computer
Methods
and Programs
in Biomedicine
52 (1997)
203-211
The biological toolbox: a computer program for simulating basic biological and pathological processes A. Vawer, J. R.ashbass* Department
of Histopathologja, Received
University
4 March
of Cambridge,
1996; revised
AddenbrookeS
:I September
Hospital,
1996; accepted
Cambridge
14 October
CB2 ZQQ.
UK
1996
Abstract The program described here has been written to enable pathologists and biologists with almost no computer experience to design complex models of cell interactions. The program although simulating in only two dimensions allows the user to define the individual rules governing cell behaviour using a language called Cell Description Language, then simulates the multiple interactions between the cells to produce a dynamic visual interpretation representing tissue growth and differentiation. The program has been developed using the World Wide Web, thereby giving access to anyone with an Internet connection. The Web technology allows others to use our powerful computers to perform the complex calculations that are necessary and effectively eliminates the problems of modifying and compiling the program to run on more than one hardware platform. The changes that take place during the simulation are presented as a video using the MPEG video format; they may then be viewed on many different types of computers. The toolbox provides a novel approach to computer-based biological simulations and .an excellent resource for teaching. 0 1997Elsevier ScienceIreland Ltd. All rights reserved Keywords:
Computer
simulation;
Internet;
Language;
World
I. Introduction The rules that govern cell growth, differentklion, division and migra.tion are common to all cells, but it is how these are combined and COIItrolled that varies between tissues. Understanding how these mechanisms contribute to the architec* Corresponding author. Tel.: + 44 1223 217163; 1223 216980; e-mail:
[email protected] 0169-2607:97/$17.00 RIISO169-2607(96)01796-8
0 1997 Elsevier
Science
Ireland
fax:
Ltd.
+ 44
All rights
Wide Web
ture which defines individual tissues is one of the main aims of modern biology. Changes in either individual or combinations of these processesalter the architecture of the tissues in ways that characterise the underlying pathological processes and it is the change in morphological appearance that the histopathologist
most often uses to define
the pathological entity. There has been considerable advance in the last few years in our understanding of the basic bioreserved
204
A. Vawer,
J. Rashbass
/ Computer
Methods
logical and pathobiological processes which characterise individual diseases. In many cases, although we know the basic cellular defect which produces the disease, it is difficult, if not impossible, to predict how the single change interacts with other cellular processes to alter the final tissue architecture. One approach to this problem is to analyse each bic,logical mechanism mathematically and then to (combine them to simulate the potential interactions. This method has been applied successfully to several biological processes, e.g. wound healing, and can provide insights into the underlying biology [1,2]. The limitation of a purely mathematical analysis is that the answers are often in the form of equations which can be difficult for the majority of biologists to appreciate or interpret [3]. To increase the accessibility by biologists to the mathematical approach, we have combined the mathematics with a computer simu!ation i4:]” Computers are ideal for this type of problem since they can easily perform the complex and numerous calculations required to represent biological processes. By appropriate programming, the answer can be displayed in a graphical form that is much more easily understood by the biologist. Our simulation is only in two dimensions. This is largely due to the limitations in processing power available and although this causes some restrictions to the comparisons that can be made in the model system, (for example, if one were to simulate organogenesis) it does not prevent the user generating simulations of biological pro,cesses. It is a property shared by many other models described in the literature [2,5-91 all of which have provided valuable insights into biolog ical mechanisms.
24, Background
Our initial work concentrated on a graphical simulation of a stratified epithelium resembling epidermis [4]. The properties of individual cells including their growth rate, adhesive properties and factors that they respond to and secrete are al.1 defined in the model. The simulation will then
and Programs
in Biomedicine
52 (1997)
203-211
grow from a single stem cell to reach a steady state, will regenerate itself if part of the model is removed and shows a clonal structure similar to that seen in normal epithelia. Introducing alterations into the rules affecting individual stem cells simulates ‘mutations’, allowing us to mimic morphological changes such as those seen in pathological conditions such as viral warts or Bowen’s disease [lo]. Other approaches in the literature have often taken a more restrictive approach to the cells in the model. Many place the cells within a restricting lattice which may be honeycombed in form [5] or square or triangular [9,11,12]. This is limited because the cells are often constrained during the simulations, a process not applicable biologically. By confining cells to a fixed geometrical lattice, the range of possible patterns produced by such mlodels is also limited. For this reason, in our models we have taken the alternative approach to produce simulations which do not rely on a preconceived; lattice array. We therefore model a large number of biological processes and use the interactions ibetween these to simulate the tissue architecture [4]. 3. Design considerations
Despite the usefulness of the approach described above, the computer simulation requires considerable programming experience to create the initial model. Modifications of iadividual rules can only be produced by altering the program, making it hard for the non-programmer to introdu.ce changes. In order to overcome this problem, the concept of a ‘biological toolbox’ was conceived to provide a tool for the experimental biologist who might not have enough computer experience or the facilities to write the whole computer program themselves. The aim of the biological toolbox is to provide a range of basic biological properties which the user can allocated to individual cells. Cells may then be combined to produce simulations of more complicated cellular activities to generate tissue. To facilitate the use of the progr,am an easily understood user interface was also necessary.
A. Vawer, J. Rashbass / Computer Methods and Programs in Biomedicine 52 (1997) 203-211
In order to provide the required flexibility, it was decided that the creation of a new programming language, Cell Description Language (CDL), would be the most straightforward approach, while the use of the World Wide Web would provide the most appropriate interface. The World Wide Web was chosen for several reasons: firstly, it is an interface familiar to many users of the Internet, secondly it provides an easily accessible graphical environment for the programmer and finally, it can be accessed from anywhere connected to the Internet, rather than solely the computers on which the program lhas been installed.
4. System description
The software is executed on a Sun SparcStatSon 20 acting as a World Wide Web server. For the generation of Moving Picture Experts Group (MPEG) movies, the data is transferred to a Sun SparcClassic. The model can be accessed on the Internet at http://www.his.path.cam.ac.uk/ biotoo!box.html The system is written in a combination of the scripting language Perl, which is primarily involved in handling user interface, and C which is used for processing the simulations themselves. The language which the system implements for simulating biological situations is CDL. An overviezw of the processes involved in setting up and using the model is shown in Fig. 1.
205
4.2. Cellular functions In order to build up a description in CDL, the individual cell types are first named. Individual properties can then be assigned to the cells using an editor provided in the Web interface. This lists the current instructions which have been assigned to the cell, and further properties can be added to this list by clicking on an icon representing the desired property and then providing further details as requested. The following properties are defined in the model. 4.2. I. Flags and variables We have used the concept of a ‘flag’ to define both a single state of a cell (which can be either on or off) or a variable that affects a cell, for example, flags store numerical information important to the cell, such as its age 0.r size. They can be set and modified as the simulation progresses using the flag command. For example, the cell’s size can be changed by modification of the
4.1. Data structures Each, cell has a data structure associated with it. These are stored in memory as a linked list, allowing new cells to be added and dead cells to be removed without having to move the rest of the data in memory. The details of the structure are shown in Table 1. The array of variables pointed to by this data structure contains the current values of all of the flags, gradient thresholds and amounts of compound being released by the current cell.
Fig. 1. Schematic Iuser to configure
diagram showing the steps followed the biological toolbox.
by the
206
A. Vuwer,
Table 1 Contents
of the data
int cell-type int cell _ remove
int anchored
double size double
ceil . .x
double
j
double
force-.x
double
force-!:
struct cell-data *next struct cell-data “prev
structure
J. Rnsl~bnss / Computer
Methods
for a cell
An integer representing the type of the cmrent cell. A flag for cell removal. If it is set; the ceil will be removed from the simulation during the next physical iteration. If this flag is set, the cell is anchored for the current iteration. It is cleared at the end of each physical iteration. The radius of the cell. The .r co-ordinate of the cell within the model The y co-ordinate of the cell within the model. The horizontal component of the force acting on the cell. The vertical component of the force acting on the cell. A pointer to the next cell in the linked list. A pointer to the previous cell in the linked list.
Abbreviations: int, signed integer; double, 44 bit floating number; struct cell data, data structure shown here.
point
CELL_AREA flag, simulating growth or shrinkage. At the start of the simulation, all user-defined flags and variables are set to zero unless the initial. flag command has been used. A number of Stan-. dard flags and variables exist, as shown in Table -.3 4.32. Chemical gradients
Locally acting diffusible factors are included. Individual cells can be assigned one or more substances which they can release into the surrounding medium. These can act on other cellsa paracrine mechanism-or the releasing cell itself-autocrine-or there can be a combination of both. The effects can be either inhibitory or stimulatory. It is possible to set a threshold concentratioc. beneath which an individual cell is unable to determine the presence of a compound. The relevant CDL commands are shown in Table 3. Unlike the programs used in the initial work on stratified epidermis, the concentration of a chemical comp,sund in the simulation is not simply
and Progmrns
in Biomedicine
5.2 (1997)
203-211
calculated as being inversely proportional to the distance from the source. Instead, the simulation area is divided into a 100 x 100 grid. The squares in the grid are approximately the same size as the cells, but cells often straddle individual grid squares. This degree of approximation and linear interpolation used in the grid allows sufficient information about the direction of a gradient and the concentration to simulate a number of biological mechanisms that can occur as a result of the concentration. The model does not take into account the effect on diffusion of obstruction by cells. It is unclear from the published literature how significant this “blocking effect’ might be. It may be different in different situations, some cells (e.g. gut and renal epithelial cells] may actually be able to produce steps in a gradient by actively pu.mping substances across the cell, while preventing diffusion between them. Each square of the grid can have an amount of chemical compound allocated to it. During each Tdale 2 Pm-defined
flags and variables
GL.UE_RANGE Flag CELL-AGE CELL-AREA GLUE -RANGE MODEL -ITERATIONS CELL-X CELL-Y CONTACT _CIELLTYPE CONTACT -DLR OTHER-X OTHER-
Y
OT HER-DIST
used by CDL
Meaning The age of the current cell. The area of the current cell. The range within which o’ne cell may be attracted to another. The number of iterations which have been completed. The cell’s X co-ordinate within the model. The cell’s Y co-ordinate within the model, During a global or contacts loop, it shows the type of cell under consideration. As above, but provides the direction of the remote cell. During a global loop, this provides the X co-ordinate of the remote cell. As above, but provides the Y co-ordinate. As above, but provides the distance between the two cells,
Cha.nging the values of these serves no useful in the case of CELL-AREA where it modifies cell.
purpose, except the area of the
A. Vawer,
J. Rashbass
/ Computer
Me!hods
Table 3 CDL commands concerned with chemical gradients, cell movement and cell contacts Parameter
Commands
Gradient
Release Stop Threshold Diffuse Sink
Commands
Contacts {} Global {} Remote {} Commands
Migrate Anchor Repel
Effect
concerned
with the use of chemical
gradients
Defines a compound which can be released by a cell, and determines the magnitude of the effect caused when this cell detects that compound. Causes the current cell to release a given amount of a compound. Ceases the release of a compound. Defines the level below which a named compound cannot be detected by the cell. Defines the diffusion constant for the named compound. Causes the cell to act as a sink, removing the given amount of a compound from its part of the model, if possible. concerning
neighbouring
cells
and .Pvograms
in Biomedicine
207
203-211
the diffusion algorithm, because they are a separate property that the user can define for each cell. Each cell can remove X units of a substance at a time from its locality. The change produced by this is then taken into account in the next iteration of the diffusion calculations,, The same principle applies to the release of a compound by a cell. The power of this approach is that it allows sinks and sources to arise in different positions in the model with time as the properties in the model change. The nature of the gradient is determined by the balance between rate of production by the source and loss at the sink and also the nature of the diffusion constant for that substance. All these variables can be defined separately. 4.2.3.
Cell division
Cells can divide. This is implemented using the mitose command. The daughter cells can either
Produces a loop during which all cells in contact with the current cell are considered. Produces a loop as above, but all cells in the model are considered. Applies a sea of commands to a remote cell, for example killing it off. concerned
52 (1997)
r
I
I
with cell movement
Causes the cell to migrate in a given direction, if possible. Locks the cell in position for the current phase only. Repels all cells of a given type away from the current cell.
.* C,, is the concentration N is
Where parameters are required, appropriate prompts are made after clicking on the corresponding icon on the Web page.
physical iteration of the model, the chemical compound is able to diffuse through the model by calculating the change in concentration which will result for each square, as shown in Fig. 2, and then updating the original matrix. To improve the resolution of the grid, the concentration at a point between grid points is determined by interpolation. This is shown in Fig. 3. Any compounds which diffuse out of the grid are lost to the model. The substances released from the individual cells can be consumed by other cells which act as a sink. The sinks are not implemented directly in
of conzpo,und in square n
Fick’s first law of diffusion:
ACs=-
Flux
of matter
DC g
the number density of particles z is the distance diffused
n=4 D(Cn- C5) ?I=‘) D(Cn- C5) Z z c cn=l n=6 where z=l for n=2,4,6,8 andz=JZ forn=1,3,7,9 D is the constant of di&sibility
Fig. 2. Diagram showing the method used to evaluate the changes in the concentration of diffusible substances in the model. The square under consideration is labelled as C, in the diagram and the process is then repeated for each square of the grid in turn.
A. khww, J. Rashbuss I
208
Computer
A4ethods
arm’
Programs in Biomedicine52 (1997) 203-21 i 4.1?.5.Cell death
a ‘7X
#Cells can be lost from the model as a result of cell death. All cells will remain in the model unless the properties assigned to the cell instruct it to die by use of the remove command. The type of conditions which lead to cell death are age, position or the level of a substance detected by the cell. The model allows the user to determine if cells shrink before they die, leave a hole or secrete a substance just before they die. The effects of a cell dying can vary and be different for each cell in the model, if the user wishes.
c
b
a,b,c,darepoints of known concentration Q is thepoint of unknown concentration C, is the concentrationat point n cpzca+(cb-c,)x cz = cc + (Cd - Cc)” q=c,+(c,-c,)y a.v q* = co + (Cb - CJX
+ (Cc + (Cd - C&
- c, - (Cb - C&}Y
Fig. 3. ‘The apparent resolution of the concentration grid in the model IS improved by interpolation as shown in the diagram. This allows the concentration at points not directly on the grid TObe determined.
have the same properties as the parent cell or one can differentiate immediately into a cell with a different set of rules, and therefore simulate a type of stem cell. 4.I?.4. Cell &(ferentintion
Any cell that acquires or loses a property can be said to have differentiated. Cells in the model can be made to change their properties. This can be regarded as a phenotypic or genotypic change or both. 19nthe genotypic change, the cell acquires new properties, but does not alter its colour in the display. In. the phenotypic change, using the become command, cells alter their colour and take on all of the rules of another cell type.
42.6. Cell contact
Cells that come into contact with other cells can alter their properties. Contact requires direct apposition between at least two cells, but the total number of cells required to media&e an effect can be defined. For example, a cell might differentiate if it is in contact with five cells of another type. Table 3 shows the CDL commands governing cell contact. 42.7. Cell movement
Cells may be caused to move by external forces acting on them. Together with this, the cell may be instructed to move up or down a chemical gradient at a particular angle and with a particular force. If this migratory force is strong enough, the cells moving in this way will push obstructing cells from their path. The CDL commands available for these cellular functions are shown in Table 3. Individual cells can be given the ability to repel other cells of a given type with a certain force. The magnitude of the force obeys the inverse-square law, so that cells distant from the source of the repulsion will experienice very little effect. If necessary, cells can anchor themselves in place in the model for the current iteration. This negates the effect of any repulsion or forces from other cells. 42.8. Decisions
All of the cell functions can be assigned to a cell on the basis of other properties. For example, one might wish to have a cell start prodttcing a compound if it finds itself next to another cell type. This type of effect is slightly different to the one
A. Vawer,
J. Rashbass
/ Computer
Methods
and Programs
in Biomedicine
52 (1997)
203-211
209
in which a cell differentiates to another cell type because it is transitory and dependent only on the local #environment. For decision-making, C.DL implements an if statement. It also allows logr,cal Boolean operators (AND, OR and NOT) to be included in the cell descriptions as well as a range of mathematical functions. These are shown in Table 4. 4.3. Dejking
the initial spatial arrangement
Each cell type in the model can be allocated a position at the start of the simulation. Cells can be distributed within the bounds of a box representing the environment, which can be defined as being circular or rectangular, with edges which can act as fixed barriers or allow cells to be lost from the edge of the model. Cells can be placed either singly or in groups. Alternatively, the user can import a histological image into the model and use this as a template on which to place the individual cells. Table 4 Expression evaluation in CDL
End
Fig. 4. Flow chart showing the basic structure of the simulation while the program runs.
,4.4. Program Symbol or expression
Description
() + .’ I * &
Normal maths functions Logical AND Logical OR Logical NOT Comparisons
! =>= <= rand Pi int (expr ) flag Jag
name
effect effict name dir ejfec? name touching cell cell t.vpe
Random number from 0 to 1. The mathematical constant n. The integral part of the evaluated expression. The value of a flag or variable. The magnitude of the given effect experienced by the cell. The direction of the effect, pointing down the gradient. The number of cells in contact with this one. Returns a number reflecting the cell type.
The following operators may be used where expressions are accepted as parameters of a command. Note that there is no operator priority at all. Thus 2+2*2 gives 8 and not 6, and must be written 2+ (2*2), instead.
of simulation
execution
The structure of the program i:nvolved in processing the simulations is shown in Fig. 4. The compilation step involves a conversion of the textual cell description language to a form of internal code. All of the commanid keywords are tokenised and the variable names are replaced with indexes into the data area reserved for each cell. This improves the efficiency of the actual running of the simulation as the command syntaxes have been checked and variable and command names do not need to be located in look-up tables. The initial state of the model is then read into the model, including the initial positions of cells, the boundary conditions for the model and the initial values of flags. The prograrn then enters a cycle of biological and physical iterations. The biological iteration consists of the execution of the commands associated with each individual cell in existence in the model in turn. This is immediately followed by a physical iteration. During this, the following processes take place.
210
A. Vaawer, J. Rashhass
/ Computer
Methods
- All
or‘ the cells are moved according to the forces acting upon them. Forces between cells are calculated as shown by Stekel et al. [4]. - The concentrations of compounds involved in chemical gradients are recalculated. -. The boundary conditions for the model are inspected. Any cells which are about to move outside the user-defined border are either removed from the model or are prevented from moving further towards the edge. - Cells which have died are removed from the model Section 4.2.5 If there are no cells remaining in the model, the simulation terminates. If the maximum number of iterations has not been reached, then another biological iteration is performed. 4.5. Dater presentation
Once the simulation has been completed, the data produced may be retrieved in several forms. The first possibility is the creation of an MPEG movie from the data generated. Once this has been created it can be downloaded and viewed., providing a dynamic visual reproduction of the simulation. It is also possible to request only the image of the final iteration as this is much quicker than the production of a full movie. Examples of both of t.hese forms of output can be seen on the Internet: http://www.his.path.cam.ac.uk/toolboxmovie.html The system can generate data in a format suitable to import into any graphics package, in the form of the number of each individual cell type at each iteration. Alternatively the raw data, consisting of th’e co-ordinates, size and cell type for each cell in each iteration, can be downloaded and imported into whichever package the user wishes to use.
and Programs
in Biomedicine
52 (1997)
203-211
data for normal skin growth and differentiation covered in Stekel et al. [4] The modelling process is of necessity limited and restricted by our knowledge of biology. The computer facilities are not sufficiently powerful to allow us to model in three dimensions or to permit cells to assume different sh.apes in the basis of the physical forces they experience. However, we anticipate that we will be able to incorporate these changes in the future. The strengths of the CDL system lie in its flexibility and accessibility. It is able to maintain a good level of performance regardless of the personal computer used to set up the model. Should it be necessary to extend the scope of the simulation, for example to provide three dimensional modelling, the majority of the principles of the actual language can remain unchanged as most of the modification lies in the mathematical aspects hidden to a large extent from the user. The modular nature of the program allows new rules to be easily added to the current simulation.
6. Lessonslearned
In producing graphical simulations of this form, it is essential that there is input from biologists. It is reasonably straightforward to generate images which look quite acceptable, but have been produced in a way which is either biologically unlikely or known to be incorrect; more than one possible set of rules applied to the simulation might produce the same end result. Further to this point, testing models developed within the system requires comparison with published biological data in order to confirm the approaches taken in the production of the model [9,11,12].
5. Status report
7. Future plans
In order to test the program’s capabilities, the original skin epithelial model was completely rewritten using the toolbox. This produced the expected results when compared with the original version of the skin model and to the biological
Our current model, due to the available computer processing power, only defines biological mechanisms in two dimensions. This is, of course, only an approximation to reality and therefore restricts the comparisons that can be made. How-
A. Vawer, J. Rashbass / Computer A4ethods and Programs in Biomedicine 52 (1997) 203-21 I
ever, even with this limitation, which is o:ne shared by other biological models described in the literature, [2,5-91 stimulating discussion can be had of considerable insights gained into basic biological mechanisms. The model will be developed further to allow pathological changes to be introduced during simulations rather than at the start. For example, it would be particularly valuable to allow the model to run a simulation until a steady state is reached and then introduce a change into the rules governing one cell type. This would mimic the biological process of a somatic mutation and could 'be used to investigate the mechanisms of malignanc:y. In this way it would be possible to test whether rules incorporated into the model were capable of correcting the effects of the mutation during a dynamic simulation. In order to improve the performance of the model further, we propose to optimise some of the calculations made during the physical iteration. These changes would include examining only the cells near to the one under consideration when calculating forces, rather than examining every cell in the model individually. Furthermore, in certain situations it may be possible to allow the option of wrap-around when rectangular boundaries are used. This would permit better consideration of repetitive structures such as those seen in the skin. We propose to make this program widely available to the academic research community to encourage feed back and exchange of ideas using this type of model environment. The biological toolbox is an extremely powerful program that we anticipate will be a valuable resource for the experimental biologist.
Acknowledgements
We are grateful to Sun Microsystems t%r the donation of the SparcStation 20 and to
211
our colleagues in the Department of Pathology, Cambridge University, for their helpful comrnents on this project. A.V. acknowledges the generous support of an undergraduate bursary from the Pathological Society of Great Britain.
References
PI J.A. Sherratt and J.D. Murray, Models of epidermal
wound healing. Proc. R. Sot. Lond. Eiol. Sci 241 (1990) 29-36.
131J.A. Sherratt and J.D. Murray, Mathematical analysis
of a basic model for epidermal wound healing. J. Math. Biol. 29 (1991) 389-404. [31 J.A. Sherratt, P. Martin, J.D. Murray and J. Lewis, Mathematical models of wound healing in embryonic and adult epidermis. Ima. J. Math. Appl. Med. Biol. 9 (1992) 177-196. [41 D. Stekel, J. Rashbass and E.D. Williams, A computer graphic simulation of squamous epithelium. J. Theor. Biol. 175 (1995) 2833293. [51 T. Williams and R. Bjerknes, Stocha:stic model for abnormal clone spread through epithelial basal layer. Nature 236 (1972) 19-21. I61 L. Pilz, W. Rittgen and P. Tautu, Modelling cellular morpho- and carcinogenesis. In: Stochastic spatial processes,ed. P. Tautu (Springer-Verlag, Berlin, 1993). [71 C.S. Potten, The epidermal proliferative unit: the possible role for the central basal cell. Cell Tissue Kinet. 7 (1974) 77-86. 181L. Bodenstein, A dynamic simulation model of tissue growth and cell patterning. Cell Differ. 19 (1986) 1933.
P. Meakin, A new model of biological pattern formation. J. Theor. Biol. 118 (1986) 101-113. UOI J. Rashbass, D. Stekel and E.D. Williams. The use of a computer model to simulate epithelial pathologies. J. Path01 179 (1996) 3333339. Pll J. Smolle, J.F. Smolle: H. Stettner and H. Kerl, Relationship of tumor cell motility and morphologic patterns. Part 1. Melanocytic skin tumors. Am. J. Dermatopathol. 14 (1992) 23 l-237. [12] J. Smolle, S. Taniguchi and H. Kerl, Relationship of tumor cell motility and morphologic patterns. Part 2. Analysis of tumor cell sublines with different motility in vitro. Am. J. Dermatopathol. 14 (1992) 315318. [91