Modeling optimization problems in the unstructured world of spreadsheets

Modeling optimization problems in the unstructured world of spreadsheets

~ Omega, Int. J. Mgmt Sci. Vol. 25, No. 3, pp. 313-322, 1997 Pergamon Plh S0305-0483(97)00004-2 © 1997 Elsevier Science Ltd. All rights reserved P...

1MB Sizes 5 Downloads 30 Views

~

Omega, Int. J. Mgmt Sci. Vol. 25, No. 3, pp. 313-322, 1997

Pergamon

Plh S0305-0483(97)00004-2

© 1997 Elsevier Science Ltd. All rights reserved Printed in Great Britain 0305-0483/97 $i7.00 + 0.00

Modeling Optimization Problems in the Unstructured World of Spreadsheets DG CONWAY CT RAGSDALE

1

Virginia Polytechnic Institute and State University, Blacksburg, VA, USA (Received April 1996; accepted after revision January 1997) Electronic spreadsheets are the most common software tool managers use to analyze data and model quantitative problems. Increasingly, these software packages are being used in introductory O R / M S courses to introduce students to a variety of quantitative modeling tools. Because spreadsheets are inherently free-form, they impose no particular guidelines or structure on the way problems may be modeled. Thus, academics and practitioners accustomed to solving problems using very structured, dedicated O R [ M S software packages are facing the challenge of dealing with these problems in the unstructured spreadsheet environment where there is often a variety of ways to implement and solve the same problem. This challenge is particularly acute in the case of optimization problems. Some are responding to this challenge by devising rules for implementing models that impose an artificial structure on spreadsheets, sometimes resembling the operation of dedicated O R [ M S optimization packages. This paper offers a critique of this approach and provides some guidelines we believe to be more helpful in creating effective spreadsheet models for optimization problems. © 1997 Elsevier Science Ltd

Key words

spreadsheet, design, optimization, education

1. INTRODUCTION

opportunity. The opportunity is exciting in that spreadsheets bring a new dimension of relevance to introductory OR/MS courses as students see how various quantitative techniques can be applied using popular commercial software packages. It is also expected that exposing students to these techniques via spreadsheets will increase the likelihood of students actually using some of these techniques in their future business careers[2]. However, several new challenges also await those who adopt the spreadsheet approach to teaching OR/MS. As the number of new textbooks addressing this approach to teaching increases, so do the questions about the "correct" or "best" way to build spreadsheet models for various types of OR/MS problems [3-7]. Spreadsheets are inherently free-form and impose no particular structure on the way problems may be modeled. Traditional dedicated OR/MS software packages (e.g., Lindo, Storm, AB:QM, QSB) have

SINCE THEIR INTRODUCTION over 10 years ago, electronic spreadsheet programs such as Excel, Lotus 1-2-3, and Quattro Pro have become the most common tool business people use to model and analyze quantitative problems [1]. Indeed, a spreadsheet package is often included in the software bundle that comes standard with many new personal computer systems. The latest versions of these spreadsheet packages contain powerful analytical tools accessible through a familiar and easy to use interface that we could only dream about a decade ago. Today, literally millions of business people have many of the tools of operations research and management science (OR/MS) available at their fingertips through the spreadsheet software on their desktop computers. For educators in the field of OR/MS, this represents both an exciting and challenging 313

314

Conway, Ragsdale--Modeling Optimization Problems

quite the opposite characteristic and impose fairly rigid rules or structures for modeling problems. Thus, spreadsheets are both a blessing and a curse for the OR/MS instructor. They are a blessing in that they deliver access to the tools of OR/MS in a product that is appealing to the masses. They are a curse in that instructors accustomed to teaching students to solve problems using very structured, dedicated OR/MS software packages are faced with the challenge of dealing with these problems in the unstructured spreadsheet environment where there is often a variety of ways to implement and solve the same problem [8]. This challenge is particularly acute with regard to optimization problems. In spite of their immense popularity, relatively little has been written about how one should develop a spreadsheet model. And most of what has been written simply attempts to apply principles from traditional information systems (IS) design standards to the spreadsheet environment. Some in the OR/MS community are adopting these ideas and devising rules for implementing models that impose an artificial structure on spreadsheets [3]. While these efforts are well-intentioned and may be helpful in some cases, we find this approach to be largely inappropriate and ill-advised when using spreadsheets to model optimization problems. Because spreadsheets are inherently freeform, it is difficult to identify one set of rules for constructing all spreadsheet models. Spreadsheet models can be built for a broad range of purposes and for a wide variety of types of users. Thus, we believe that domain-specific spreadsheet design guidelines are likely to apply to different types of spreadsheet applications. And given the growing interest in using spreadsheets in OR/MS education, we also believe it is important to give serious consideration to identifying guidelines that may apply to specific OR/MS modeling techniques. This paper takes a first step in this direction by considering the implementation of optimization models in spreadsheets. We explain why we believe it is ill-advised to impose artificial structures on spreadsheets models of optimization problems. We then present some guidelines and suggestions we believe to be helpful in creating more effective spreadsheet models for optimization problems.

2. T H E O R I G I N S O F A R T I F I C I A L STRUCTURE

Historically, programs developed for mainframe computer systems relied heavily on programming standards to ensure software reliability. Standards in program design, testing and documentation have evolved over the past 30 years and are considered by IS professionals to be an integral aspect of software development[9]. The proliferation of spreadsheet software represents somewhat of a nightmare to corporate IS professionals as many non-IS managers have adopted a "Do-It-Yourself" approach to their own systems and modeling needs. Such independence allows managers to create systems in whatever way they choose [8]. Thus, it is not surprising that some IS professionals have called for standards to guide spreadsheet development. This sentiment is expressed by Kee [9, p. 56], who states, "Reliable spreadsheet software begins with a standard format for developing spreadsheet applications. A standard format imposes a common structure on spreadsheet developers and, hence, a greater degree of planning and documentation. More importantly, it forces programmers to develop their applications within a logical framework designed to simplify spreadsheet construction and enhance reliability. It also provides users with a common format, thereby making it easier to learn and apply the firm's spreadsheet software." The "standard format" or "common structure" for spreadsheet design referred to above and suggested by many other writers usually involves allocating separate areas of the spreadsheet (or pages in a workbook) for assumptions, data, parameters/decision variables, and formulas or outputs. (We will use the terms "standard format" and "common structure" interchangeably in this paper.) The use of a standard format is not without merit and for many types of problems is completely appropriate. For instance, we find this approach to be very useful for carrying out Monte Carlo simulation in the spreadsheet environment. However, contrary to Kee's sentiments (given above), for many optimization problems we find that the forced use of a standard format results in spreadsheets models that are more difficult to

Omega, Vol. 25, No. 3

construct, less reliable, and more difficult to understand. In the past (when DOS-based twodimensional, two-color spreadsheets were most common), physically separating the data, parameters and outputs of a spreadsheet model may have been the only effective way of distinguishing these different elements of the problem to the spreadsheet user. However, with Windows-based spreadsheets the difference between data, parameters, and outputs can be easily distinguished by using different colors, shading, and/or borders for different types of cells. In essence, these different formatting options add new dimensions to a spreadsheet that allow various elements of a model to be clearly distinguished regardless of their physical location. In many ways, these formatting options eliminate the need for a "standard format" and give us freedom to design a spreadsheet model in the way that most logically represents the problem at hand. Of course, freedom of design does not guarantee logical design. To achieve the end result of a logical spreadsheet design, our modeling efforts must be directed toward specific goals.

3. G O A L S O F S P R E A D S H E E T

DESIGN

We believe that, in most cases, a spreadsheet's primary business purpose is that of reporting and communicating information to managers. As such, the primary design objective in any spreadsheet modeling task should be to communicate the relevant aspects of the problem at hand in as clear and intuitively appealing a manner as possible. When this can be accomplished using a "standard format" the use of such an approach is appropriate. However, we believe it is inappropriate to sacrifice communicative effectiveness solely for the sake (or convenience) of conforming to a "standard format". Careful planning of the spreadsheet's layout may require extra time and effort, but it aids effective communication [10]. In addition to the issue of communicating clearly, spreadsheet designers should also give consideration to the following design goals [11]: • Reliability--the output a spreadsheet gen-

erates should be correct and consistent.

315

This has an obvious impact on the degree of confidence the user places in the model. Auditability--the user should be able to retrace the steps followed to generate the different outputs from the model in order to understand the model and verify results. Modifiability--the ability to change or enhance the spreadsheet in order to meet dynamic user requirements. In most cases, we believe the spreadsheet design which communicates its purpose most clearly will also be the most reliable, auditable and modifiable design. As the following examples illustrate, forcing a spreadsheet model of an optimization problem into a "standard format" can actually impair the reliability, auditability, and/or maintainability of the model. 4. A P R O D U C T M I X P R O B L E M STANDARD FORMAT

IN

Figure 1 illustrates the "standard format" approach to implementing a typical product mix problem. This problem (adapted from[6]) involves determining the most profitable product mix for a company that manufacturers two different type of hot tubs: Aqua-Spas and Hydro-Luxes. The "Problem Data" section of the spreadsheet summarizes the number of pumps and amount of labor and tubing required by each hot tub model (in cells C4 through D6). The total numbers of pumps, labor hours and tubing available are shown in the column labeled "Amount Available" (cells E4 through E6). The unit profit associated with each type of hot tub is shown in the row labeled "Unit Profit" (cells C7 and D7). The decision variables for the problem are clearly marked in cells C l l and Dll. While this model is perfectly capable of being used to generate the optimal solution to the problem, it can be criticized as having a number of shortcomings. For instance, in the "Problem Data" section, the casual observer may infer that the values in the "Amount Available" column and "Unit Profit" rows are in some way associated with the resource requirements listed in the adjacent cells. The last value in a column or row of data frequently represents a sum or some other summary measure of the data in the

Conway, Ragsdale--Modeling Optimization Problems

316

Problem Data ::Resources: .................... ....................

Pump=.............. .L.~..or...(..h.r=)

Aqua-Spa

Hydro.Lux

1

1

9 12 $365

.....

:: Tubin ::Unit Profit:



6

16 $300

Available 200 '

,

1566

2880

Decision Variables ................................................

6g..u..a-se.a.s.....i...

;Number Produced:

Outputs Resources: ....................

P.u.mp=........

...................

.~=bor.lh._r,)

................... T u b i n g (ft)

122

#y...a.r..o.:..L. u ~..,..s.. ....................................

i

78 Maximize: C19 By Changing:Cl1:Dll

Amount Used i 200 1,566 2,712

::Total Profit:

Subject To:

C15:C17<=E'I:E6 Cl1:D11)=0

$67,930

i!!i!i!i i i i i i i i i !i!!ii!i i i i i i i i i Cell: C15 C19

Formula:

Copied to:

=C4*$C$11+D4*$D$11 = C 7 " C 1 I + D 7 * D 11

(316:C17 --

Fig. 1. The product mix problem implemented in a standard format.

row or column. Thus, the placement of these values may be inconsistent with the casual observer's intuition and impair the auditability of the model. In Fig. 1, the formulas in the "Output" section representing resource usage (cells C15 through C17, representing the LHS values of the constraints) can be implemented conveniently by entering and copying a single formula. However, the physical displacement of these values from the coefficients in the "Problem Data" section tends to mask the relation between the inputs and outputs. That is, it may not be immediately obvious what will be affected if the value of six in cell D5 is changed. This has a negative impact on the auditability of the model. Similarly, after solving the problem the user would likely be interested in comparing the

amounts of the resources used (cells C15 through C17) with the amounts originally available (cells E4 through E6) to make sure the answer is feasible (or that the model is reliable). The physical displacement of the constraint LHS values (in cells C15 through C17) away from their corresponding RHS values (in cells E4 through E6) makes verifying the results of the model somewhat difficult. The physical displacement of these ranges also makes communicating the model to the solver software a bit more difficult and error prone (negatively impacting the reliability of the model). Finally, the "Total Profit" function representing the objective (cell C19) could be mistaken as a sum of the values in C15 through C17. And the relation of the objective (cell C19) to the unit profits in cells C7 through D7 and the decision variables in cells C11 through D11 is certainly

317

Omega, Vol. 25, No. 3

not suggested by its physical proximity to these cells, reducing the auditability of the model.

5. T H E P R O D U C T M I X P R O B L E M I N AN ALTERNATE FORMAT

An alternate way of implementing the model for the same product mix problem is shown in Fig. 2. In this model, the ~roblem data is shown in white (unshaded) cells, the decision variables are indicated by shaded cells with dashed borders (cells B3 through C3), constraint cells are indicated by shaded cells with solid borders (cells D8 through D10), and the cell representing the objective function is shaded with a double border (cell D4). The English-reading human eye tends to scan text and numbers from left to right and top to bottom [12, p. 384]. So when our eyes run across

i~N~i

i Ag.ua-Spas

i

Hgdro-L.uxes

:N~l}i, Number Produced: i

iii~iii Resources:

i

Resourcei

~::-~'-',.~;..........................................

"!.

ii~N~

i

IN~'.'.:~

i

ii!~Eli ::!i~iii

"i ~" i

!" •

ii !

Req'd per:

i

A~i~ili'.~i~i'""!"""iq~B~iiJ"iix

~li~lii" Liii>iir (h~i) ...............

i2i

"Cell': D4

i

::

iiiiNii~iiN"Tot'ai'i~rofit:

/i~'.;.:-~/!

~,'~!

the "Total Profit" value (cell D4) in Fig. 2, we intuitively assume it is related to the numbers we have scanned before reaching this cell--namely, the cells representing the unit profits (cells B4 through C4) and the quantities produced (cells B3 through C3). Thus, locating the "Total Profit" cell in close proximity to the cells that determine its value enhances the auditability of the model. Similarly, in this model the cells labeled "Amount Used" (ceils D8 through D10) are in close proximity to the cells that influence their values. Locating the RHS values of the constraints (i.e., the column labeled "Amount Available") immediately next to the cells representing the LHS values of the constraints makes it easy for the user to check the solution for feasibility, identify binding resources, and communicate the model to the solver software in a reliable and easily auditable manner. If we compare Figs 1 and 2, a key difference

}

i

Amount

i.... iLiii'$~i""

Amount

i

i

A~/iii~i~ii;:

"i

!'

9 ...............i................ 6 ................. .

.

.

lSS~; ......

: ............

i

i

" .......

~"

r .........

~

;222222 22

!,

i

i

.................... i .........

i

i

"i

i

i

i

Me=ximize: DI By Changing: a3:C3

[ •i

i i

i

i

~

~

.

!,

.

+

i ii I ii i!

ISubjectTo:

;

De:DIQ,c-Ee:E10

B3:C3I.=D

i"

ii! !i! ii ii!ii! iiiii! i i!i!iiii !!i iIi i iiiiii iiiiiiI

Form'ulal .........................................'""'""''"'""'"""""'"130p i ; d toi""" I I =B4*$B$3+C4*$C$3 I D8:D10 I

Fig. 2. The product mix problem implemented in an alternate format.

. .

~r~

318

Conway, Ragsdale--Modeling Optimization Problems

to note is that Fig. 1 implements the elements of the model (i.e., the decision variables, constraints and objective) 'away' from the data while Fig. 2 fits the elements of the model (i.e., decision variables, constraints, and objective) 'around' the data. Thus, it is expected that fitting a model around a logical representation of the data will result in a more effective spreadsheet design in many instances. 6. A TRANSPORTATION PROBLEM IN STANDARD FORMAT As another example, Fig. 3 illustrates the "standard format" approach to implementing a typical transportation problem. This problem

(adapted from [6]) involves determining the least costly (shortest distance) plan for shipping fruit from orange groves in the cities of Mt. Dora, Eustis and Clermont to processing plants in the cities of Ocala, Orlando and Leesburg. The "Problem Data" section of the spreadsheet summarizes the distances between each grove and processing plant (in cells C5 through E7), the bushels of fruit available at each grove needing to be processed (in cells F5 through F7), and the capacity available at each processing plant (in cells C8 through E8). The decision variables for the problem are clearly listed in cells C14 through El6. The total number of bushels being shipped out of each grove are computed by formulas in cells C20 through C22,

Problem Data Distances From Groves To Plant at: Ocala Orlando Lees~:

Groves Mt Dora ::Eustis ::Clermont :=Capacity

21 50 40 35 30 22 55 20 25 i 200 000 i 600 000 :: 225 000

i

....................................................

:T .......................

i Bushels

i121211N

i 275,000 .........409,000 ::..................i i i

300,000

-: ............................

Decislon Variables

N 211121111]111111111

Bushels Shlpeed..Fr.om............. Groves To Plant at: Ocals i Odando Leesburg

::Groves ::Mr Dora

200 000 0 75 000 i Eu=U=........................g................250,.g00....::....150,g£0......i ::Clermont i o i 3oo,ooo ' o i

::Bushels S h i p .)ed } . _ _From:i _~

::Mr Dora

i 275,000

Minimize: E24 By Ch,,nging: C11:E16 SubjectTo: C2D:C22=FS:F/ F20:F22<=Cg:E6 C14:E16>=0

Bushels Received At:

Ocala

~200,000

............................ Eustis i '400](?O(J.......................................................... !Orlando I 55(J,(J00 Clermont 300,000 .......................~Leesburg I 225,000 ::Total Distance in bushel-miles): .

...~

. . . . . . . . . .

......................................................................................................

:: 24,000,000 i .. . . . . . . . . . . .

, ........................

l .......................

!!Niiiiiiiiii!iiiiiiiiiiiiiiiii!ii!iii!

iiiiiiii!iii!ii!i!ii!iiN!iiNiiNili!iiiiil

Cell:

Formula:

Copied

C20 F20 F21 F22 E24

=SUM(C14:E14) =SUM(C14:C16) =SUM(D14:D16) =SUM(E14:E16) =SUMPRODUCT(C5:E7,C14:E16)

C21:C22

Fig. 3. The transportation problem implemented in a standard format.

to:

Omega, Vol. 25, No. 3

while the total number of bushels being received at each processing plant are computed by formulas in cells F20 through F22. Finally, the objective function value for the problem is computed by the formula in cell E24. While this model is again perfectly capable of solving the problem, it can be criticized as having many of the shortcomings identified earlier for the product mix problem, plus some new ones. First, in the "Problem Data" section, the casual observer may once again confuse the values in the "Bushels Available" column (cells F5 through F7) and "Capacity" row (cells C8 through E8) as having some relationship to the distances listed adjacent to these values. Second, in the "Outputs" section of the worksheet, the cells labeled "Bushels Shipped From:" (cells C20 through C22) actually compute the sum of each of the rows in the "Decision Variables" section. But this relationship is masked somewhat by the physical separation of the "Decision Variables" and "Outputs" sections (reducing auditability). Similarly, the cells labeled "Bushels Received At:" (cells F20 through F22) actually compute the sum of each of the columns in the "Decision Variables" section. This relationship is also masked by the physical separation of the "Decision Variables" and "Outputs" sections and is further complicated by a difference in orientation (reducing auditability). That is, the column sums that we might expect to see listed 'across' row 17 (in cells C17 through El7) are actually listed 'down' column F (in cells F20 through F22). This difference in orientation introduces another major complication in that each of the formulas in cells F20 through F22 must be entered individually. Notice that while a single formula can be entered in cell C20 and copied down to generate the formulas for cells C21 and C22 the same is not true for cells F20 through F22. Having to enter each of these formulas individually obviously increases the risk of errors being made (reducing reliability) and would also become unduly burdensome if there were, say, twenty processing plants rather than only three. Another problem introduced by the physical separation of data, variables and outputs in this problem is that it makes it difficult for the user to compare the left-hand sides (LHS) of constraints to their right-hand side (RHS)

319

values. For instance, after solving the problem the user might be interested in comparing the total number of bushels being shipped from each grove (cells C20 through C22) with the total bushels originally available at each grove (cells F5 through F7) to verify (or audit) that all the fruit is being processed. A similar comparison of the total bushels being received at each processing plant (cells F20 through F22) with the capacity at each plant (cells C8 through E8) might be desired to verify (or audit) that no processing capacities are being exceeded. The physical separation of these ranges makes such comparisons somewhat difficult. Again, having the LHS and RHS values of the constraints in different areas on the spreadsheet again makes it more difficult to describe the constraints of the model to the spreadsheet solver software and makes it harder to audit the solver settings when attempting to verify (or debug) the model. A final problem with the model shown in Fig. 3 relates to the issue of modifiability. Suppose a new processing plant becomes available. We could easily insert a new column between the existing columns D and E to accommodate the data and decision variables for the new plant. However, special care must be taken to add a row in the output section only in the area labeled "Bushels Received At" and not in the area labeled "Bushels Received From". Of course, this is not a major problem, but it does illustrate that adding a grove or processing plant in this model is not as simple as adding a row or column to the spreadsheet. 7. T H E T R A N S P O R T A T I O N P R O B L E M IN AN ALTERNATE F O R M A T

An alternate way of implementing the model for the same transportation problem is shown in Fig. 4. In this model, the problem data is shown in white (unshaded) cells, the decision variables are indicated by shaded cells with dashed borders (cells C12 through El4), constraint cells are indicated by shaded cells with solid borders (cells C15 through El5 and F12 through F14), and the cell representing the objective function is shaded with a double border (cell El8). Notice that the cells in the column labeled "Bushels Shipped" (cells F12 through F14) contain formulas that sum the rows in which

320

Conway, Ragsdale--Modeling Optimization Problems

i i !

I

Distances From Groves To Plant at: Ocala ~ Orlando i Leesbur

21

i

5o

i

4o

36 i 30 22 ...........................................
...........I clerrn°nt

I

...........

' ...................................

55

20

!

25

i

........Bu..s..h..e.!..s..Sh.!PP.e.d...F..r.o..m.. .......... ! ~ ; a ~ ; . ;

.................

........... I Mt. Dora

Groves To Plant at: Ocala i Orlando . Leesburg

Bushels Bushels Shipped ::Available

|i::il~iii~i~i::ii![;iil::iiiii::it

.......... ..................Nilii i iN ! il; iiii ............................ 200,000

,

600,000

225,000

iiiiiii

iTotal Distance (in bushel-miles):

I By Ckanging: C12:E14 ......... ] S u b j e c t T o : F12.F14-GI

............

t......................................................................

........... I l

............

i...................................................................... !

. . . . . . . . . . . ~. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N

~

~ ...............................................

2:G1,1

C15:E15(=C16:E16 C12:E14)-0

. . . . . . . . . . .

.:. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

; .......................

: .......................

:............... i

-

iiiiiiiiiiiiiiiiNiiiiiNiiiii!i!N!i!iN!iii!i!iiii!ii!!i!i!i!!iiiNiiiJ!!Niii!iiiiii!i!i!i!iii!i!i!ii iii!i!iiiiiiiii!iii!iiii!iiiiiN!iiiiiiiiiiiiiiiiiiiiiii!iiiiiii iiiii Cell:

Formula:

Copied

F12 C15 E18

=SUM(C12:E12) =SUM(C12:C14) =SUMPRODUCT(C5:E7,C12:E14

F13:F14 D15:E15 --

)

to:

Fig. 4. The transportation problem implemented in an alternate format•

they appear. The physical location of these cells suggests and supports their purpose. Also, to implement these calculations, a single formula was entered in cell F12 and copied down the column. Thus, once the user understands the purpose of the formula in F12, the remaining formulas are also understood. Similarly, the cells in the row labeled "Bushels Rec'd" (cells C15 through El5) contain formulas providing totals of the columns in which they appear• Again, this is intuitively consistent with their physical location and these formulas can be completed by entering a single formula in cell C15 and copying it across the remaining columns. Locating the "Bushels Available" values next to the "Bushels Shipped" values and placing the "Capacity" values just beneath the "Bushels

Rec'd" values allows the user to easily verify the feasibility of this solution (i.e., these placements allow for a direct comparison of R H S and L H S values). These placements also make it easier (and less error prone) to communicate the LHS and R H S values of the constraints to the spreadsheet solver software. We also note that there are no differences in orientation when comparing these values. Finally, modifying this model to accommodate another processing plant or grove is a simple process of inserting, respectively, a new column or row and filling in the resulting cells with appropriate data and formulas. Thus, the model shown in Fig. 4 avoids all of the problems associated with Fig. 3 and also occupies less screen space as fewer labels have to be repeated in multiple areas of the spreadsheet.

Omega, Vol. 25, No. 3

8. USING TEXT BOXES, NOTES AND VOICE

9. GUIDELINES FOR GOOD SPREADSHEET DESIGN

MESSAGES

All of the previous figures included text boxes that appear as an electronic post-it note in the spreadsheet to document the details of the model being solved. A simple macro can be implemented to display or hide these text boxes with the click of a button. While the use of shading and borders to identify cells representing the variables, constraints and objective function can be effective, at first glance, the meaning of the shading and borders will not be obvious. To alleviate this problem, notes can be used to describe any cell in a spreadsheet. Figure 5 shows a note that is associated with the "Total Distance" cell El8. Notes like this can be made to appear and disappear automatically as the user simply moves the cell pointer over various cells in the spreadsheet. In Excel 7.0 it is also easy to record and associate spoken (voice) messages with these notes that will play automatically when the cell pointer hovers over noted cells.

. . . . . . . . . . .

t •i. !Groves jML Dora

.......... 1 eiii~i° .......... [ e i ~ ~ o n t

The spreadsheet designs presented here are not particularly new and have probably been used by others to implement similar models. However, our discussion and analysis of these examples highlights several important guidelines that we believe should be considered when evaluating alternatives for implementing models of optimization problems in a spreadsheet. All of these suggestions have the effect of enhancing the reliability, auditability and/or modifiability of a spreadsheet model. Do not embed numeric constants in formulas. Numeric constants should be placed in individual cells and labeled appropriately. This enhances the reliability and modifiability of the model. Organize the data, then build the model around the data. Once the data is arranged in a visually appealing manner, logical locations for decision variables, constraints and objective function tend to naturally

~ Distances From ~ Groves To Plant at: ! Ocala i Odando Leesburg [ 21 i 50 40 [

................ I ..........3~ ......... ".":....... ~ ...............: ............. ~ ............

............................................

I......... ~g .......... ,[........ ~ ~ ....................

..............i[ i

~ g ..........

i............. giii~eii siiibi;e a i=rom ............ i

............ i..................................................Gioi/~i%~iafi~i~i................... Bushels Bushels ~ed iAvailable ...........iG'iov'ei................T'""iSciT;;"'"r"i:iriando'"'"i";;sb'u'rg"":"

t Jt iDora : : i : : i i i : : ~i i i::ii !:i~:8~N i i M~i:i:i~!i!i:!Ni~t~!i:ii!ii~i~i~i:ig:~:iEl~

Bushels Rec'd Capacity

200,000

600,000

225,000

Minimize: E18 By Changing: CIZ:E14 Subject To: F12:FI 4=G12:G14 C15:E15<=C16:E16 t-0

L

. . . . . . . . . . .

............

iTotal Distance ,in bushel-miles):

i .................................................................................

321

+ .............

with the given s ~ u l i ~ . ~'e wish to minimize this value.

Fig. 5. Illustration of the use of text boxes and cell notes.

322













Conway, Ragsdale--Modeling

suggest themselves. This also tends to enhance the reliability, auditability and maintainability of the model. Things which are logically related (e.g., left-hand-sides and right-hand-sides of constraints) should be arranged in close physical proximity and in the same columnar or row orientation. This enhances reliability and auditability of the model. A design that results in formulas that can be copied is probably better than one that does not. A model with formulas that can be copied to complete a series of calculation in a range is less prone to error (or more reliable) and tends to be more understandable (or auditable) (i.e., once the user understands the first formula in a range, he/she understands all the formulas in a range). Column or row totals should be in close proximity to the columns or rows being totaled. Spreadsheet users often expect columns and/or rows to be totalled. Numbers at the ends of columns or rows that do not represent totals can easily be misinterpreted (reducing auditability). The English-reading h u m a n eye scans left to right, top to bottom. This fact should be considered and reflected in the spreadsheet design to enhance the auditability of the model. Use shading, borders and protection to distinguish changeable parameters from other elements of the model. This enhances the reliability and modifiability of the model. Use text boxes and cell notes to document various elements of the model. These devises can be used to provide greater detail about a model or particular cells in a model than labels on a spreadsheet may allow. 10. CONCLUSIONS

M a n y believe that spreadsheet modeling is an a r t - - o r at least an acquired skill [6, 10]. And m a n y spreadsheet users have already acquired the above guidelines (as well as others) and applied them at a subconscious or intuitive level

Optimization Problems .

for several years. But as the importance and prevalence of spreadsheets increase, we believe there is great value in explicitly identifying domain-specific guidelines that academics, practitioners and students can use to ease the transition from structured, dedicated O R / M S packages to the unstructured world of spreadsheets. While there probably is no one 'right' way to build a spreadsheet model for a given problem, we believe some ways are more effective than others. This paper provides some suggestions and guidelines we believe to be helpful in designing effective spreadsheets for optimization problems. We hope this paper will provoke further thought and debate within the O R / M S community regarding the issue of effective spreadsheet design. REFERENCES 1. Mason, D. and Keane, D., Spreadsheets: solution or problem?. Interface, 1989, 82-84. 2. Cornford, T. and Doukidis, G. I., An investigation of the use of computers within OR. European Journal of Information Systems, 1991, 1, 131-140. 3. Carom, J. D. and Evans, J. R., Management Science: Modeling, Analysis and Interpretation. South-Western, Cincinnati, 1996. 4. Clauss, F. J., Applied Management Science and Spreadsheet Modeling. Duxbury Press, Belmont, CA, 1996. 5. Plane, D., Management Science: A Spreadsheet Approach. Boyd and Fraser, Danvers, Massachusetts, 1994. 6. Ragsdale, C. T., Spreadsheet Modeling and Decision Analysis: A Practical Introduction to Management Science. Course Technology Inc., Boston, 1995. 7. Winston, W. L. and Albright, S. C., Practical Management Science. Duxbury Press, Belmont, CA, 1997. 8. Cragg, P. B. and King, M., Spreadsheet modeling abuse: an opportunity for OR? Journal of the Operational Research Society, 1993, 44, 743-752. 9. Kee, R., Programming standards for spreadsheet software. CMA Magazine, 1988, 55-60. 10. Dhebar, A., Managing the quality of quantitative analysis. Sloan Management Review, 1993, 34, 69-75. 11. Ronen, B., Palley, M. A. and Lucas, H. C. Jr., Spreadsheet analysis and design. Communications of the ACM, 1989, 32, 84~93. 12. The Windows Interface Guidelines for Software Design. Microsoft Press, Redmond, Washington, 1995. FOR CORRESPONDENCE: Cliff T Ragsdale, Department of Management Science and Information Technology, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061-0235, USA.

ADDRESS