A STRUCTURE FOR ENHANCING USER PARTICIPATION IN MODEL DEVELOPMENT
Timothy J. VanEpps Armco Inc. Middletown Works Industrial Engineering Middletown, Ohio 45043
ABSTRACT
Mathematical modeling is the peaceful coexistence of theory and practice. If either of these aspects becomes dominant, the results may be detrimental to the acceptance and usefulness of the model. When the gap between theory and practice is effectively bridged, the chances for success of the model are greatly enhanced. Often the client (end user) is treated as a resource outside the scope of the problem solution. This isolation of the client from the solution technique makes it difficult for him or her to understand, defend, and implement the answers produced by the model. Making the client an integral part of linking the physical problem to the model solution raises his or her understanding and participation to a level at which the modeler and client interact more productively. This gap is being bridged at Armco, specifically in the forecasting of energy supply and demand. The method developed borrows an important concept from the area of Artificial Intelligence, in that the storage and manipulation of the energy system "knowledge" has been separated from the actual model program code. Using a VAX 11/780 and an "off the shelf" query language, the energy engineer can manipulate the variables which define the energy systems. Software has been developed to retrieve information from the knowledge base. Using this information, the software creates source code and manual data input programs for a complete model package. This paper describes the development and implementation of this system in the context of a comprehensive Energy Management Computer System at an integrated steelmaking facility.
KEYWORDS Mathematical modeling, Integrated steel mill, Energy management, Energy modeling
INTRODUCTION
The job of the Industrial Engineer is to collect or measure specific data, to relate that data mathematically in order to determine any underlying cause and effect relationships which may be present, to use these relationships to forecast future events, and finally to collect data to evaluate our predictions. Whether we call it by name or not, the determination of the underlying cause and effect relationships is a model building process. Much of our time is spent in the creation of models, where a "model" is defined as a convenient way of representing our knowledge of a specific subject.

Model building, under the cloak of "operations research", has become a discipline all by itself. Those of us who consider ourselves practicing operations researchers are often guilty of shutting out those who have not benefitted from our extensive training in the formal techniques of operations research. As a result, we sometimes build models which may be mathematically or statistically elegant but which have limited usefulness to anyone else.

The model building task makes two separate and distinct contributions in its areas of application, and the total benefit of model building is distributed between them. Firstly, the act of creating a model requires a structured and systematic investigation of the system at hand. The result very often is not only the organization of existing knowledge but the acquisition and incorporation of new knowledge as well. Secondly, after its creation the model may be used to predict the effects of proposed changes to the system being modeled. The application of the model is an act of faith: the faith that relationships determined in the past will continue to hold true in the future. The foundation of this faith lies in the understanding of and belief in the "structured and systematic investigation" required in building the model.
When the model building function is considered separate from the remaining engineering functions, the client or end user may be isolated from the modeler, potentially leading to disbelief, distrust, and nonacceptance of the model generated solutions. This situation effectively negates the benefits which may be derived from application of the model to specific problems. Furthermore, since the user has not been included in the model development, he or she may not have taken part in the learning process within the model development. When the modeler moves on to his or her next assignment, the new knowledge may leave as well.
ENERGY SUPPLY AND DEMAND
Unlike many other industries, steel production requires large-scale equipment and huge quantities of raw materials. A large portion of the cost of making steel is in the energy requirements of the steelmaking process. The energy flow within an integrated steelmaking facility differs according to variations in the production schedules at each of the steps in the steelmaking process. Many of these process steps also produce by-product energy. The quantity of by-product energy generated and its subsequent use vary significantly over time. When this by-product energy is unavailable or its supply is insufficient, it must be supplemented through expensive purchased energy sources. The goal of energy management is to balance the generation and use of this by-product energy to minimize the purchased energy requirements.

During 1984, a model was developed at Armco's Middletown Works to aid engineers with this task. Subsequently, both the energy and operating structures of Middletown Works have changed, requiring frequent modification of the original model. These changes have required a high level of availability of the modeler to make the necessary modifications. The modeling system described here is a solution to this problem.
ENERGY MODELING
At the heart of all modeling endeavors must be a complete and clearly defined problem wherein both the response and predictor variables are well defined. In the case of the energy model, the responses or outputs are the specific quantities of energy consumed at a given level of steel production, and the predictors or inputs are the energy usage rates which define an operating unit's energy needs as a function of its production. These lists of predictors and responses represent approximately 2000 variables and in the past have not been maintained for the purposes of documentation outside of the model program code.

The actual development of the model requires that the cause and effect relationships between the predictors and the response variables be expressed mathematically. These are the actual equations which must be solved by the model during its execution. The form of these equations, once again, is typically buried in model program code and is not readily available to the nonprogramming user to examine and evaluate.

It is at this point in the model building activity that the client/modeler relationship is most critical. The representation of the cause and effect relationships of the predictor and response variables is where the real learning takes place. The client must be able to communicate his knowledge of the physical system to the modeler, and the modeler must communicate his interpretation of that knowledge back to the client. A high level of trust must be developed between the client and the modeler, and a substantial amount of time is necessary to nurture such a relationship.

A systematic method is required to keep track of these predictors and responses, as well as the method by which the predictors are combined to derive the responses (i.e. the equations). Beyond this bookkeeping function, the user should have the capability to modify the equations in order to experiment with new ideas and evaluate their implications.
After all, a major reason for building a model is that experimenting with the actual system could be disastrous, and the model is a means of replacing "trial and catastrophe" with "trial and error". To fulfill this need the notion of a "variable list" was developed, wherein each of the predictors and responses is stored along with all of its associated background information. Each variable in the list is defined in the following terms:

- ID = an 8 character code unique to each variable which is recognizable by the model program.
- NAME = a 35 character text field unique to each variable which is recognizable by the model user.
- UNITS = a 15 character text field which defines the units of measure for the given variable.
- SOURCE = a 4 character text field which designates the variable as either a predictor (input) or a response (output).
For predictor variables, information is stored which tells where representative values for the variable may be found in the historical data files. For response variables, a 150 character field is provided in which a FORTRAN equation, written in terms of other variables defined in the variable list, describes how the given variable is to be calculated.
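The record layout described above can be pictured as a small data structure. The following is only a sketch in Python, not the query-language schema used at Armco: the class name, the attribute names, the "IN"/"OUT" source codes, and the `history_ref` field are all assumptions made for illustration. Only the field widths (8, 35, 15, 4, and 150 characters) come from the text.

```python
from dataclasses import dataclass

# Field widths are the ones quoted in the text; all names here are illustrative.
WIDTHS = {"id": 8, "name": 35, "units": 15, "source": 4, "equation": 150}


@dataclass(frozen=True)
class VariableEntry:
    """One record of the variable list (hypothetical layout)."""
    id: str                # code recognized by the model program
    name: str              # description recognized by the model user
    units: str             # units of measure
    source: str            # "IN" = predictor, "OUT" = response (assumed codes)
    equation: str = ""     # FORTRAN expression; responses only
    history_ref: str = ""  # where historic values are found; predictors only

    def __post_init__(self):
        # Reject any field longer than its stated width.
        for field_name, width in WIDTHS.items():
            if len(getattr(self, field_name)) > width:
                raise ValueError(f"{self.id}: {field_name} exceeds {width} characters")
        if self.source not in ("IN", "OUT"):
            raise ValueError(f"{self.id}: source must be IN or OUT")
        if self.source == "OUT" and not self.equation:
            raise ValueError(f"{self.id}: a response needs an equation")
        if self.source == "IN" and self.equation:
            raise ValueError(f"{self.id}: a predictor has no equation")


# Two entries using IDs from the boilerhouse example; the history reference
# "PRODUCTION FILE" is a made-up placeholder.
prd_ton = VariableEntry("PRD_TON", "TOTAL TONS OF STEEL PRODUCED", "NET TONS", "IN",
                        history_ref="PRODUCTION FILE")
stm_dmd = VariableEntry("STM_DMD", "TOTAL DEMAND FOR STEAM", "MMBTUS", "OUT",
                        equation="PRD_TON * STM_USE")
```

Keeping the width and source checks in one place means that a malformed entry is rejected when it is entered rather than when the model is built.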
PROCEEDINGS OF THE 8TH ANNUAL CONFERENCE ON COMPUTERS AND INDUSTRIAL ENGINEERING
In order to make the variable list easily accessible for the user, a standard "off the shelf" query and report language has been used to store, retrieve, and manipulate the data contained in the variable list. The relatively simple command set and ease of use have proved to be very valuable for this application. The variable list provides the means by which a user with little or no formal computer programming background can begin to define not only the fundamental variables in a model but also the ways in which these variables interact. Now that the parts of a model can be defined and manipulated by the user, the parts must be gathered together in such a way as to allow the user to actually run the model and produce predicted responses. To do this, software has been created which performs the following functions.
Building the Model Program
Once the response variable equations have been defined, they must be retrieved from the variable list and combined to produce computer program code which will actually perform the model calculations. This function is itself a multi-step process which includes:

- searching the equations to verify the validity of each of the independent variables. To be considered valid, each independent variable must be previously defined in the variable list.
- verifying that no dependent variable is a function of itself, i.e.
      A = (B + D) * A
  is an invalid definition.
- verifying that no circular dependencies exist in the equation definitions, i.e.
      A = B + C
      B = D + E
      C = F + A
  is an invalid definition.
- determining the proper precedence of calculation of the response variables, i.e. if
      A = B + C
      B = D + E
      D = F + G
  then we must be assured that "D" is calculated before "B", which must be calculated before "A".
- creation of the actual program code with all equations in the proper order.
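The steps listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the software developed at Armco. The paper does not say which algorithm the equation precedence calculator uses, so a depth-first search stands in for it. The function names, the set of intrinsic function names, and the six-space indentation of the generated statements are likewise assumptions.

```python
import re

IDENT = re.compile(r"\b[A-Za-z]\w*")  # variable or function names in an expression
FUNCTIONS = {"MIN", "MAX"}            # intrinsic names that are not variables (assumed set)


def operands(expr):
    """Return the variable IDs referenced by an expression string."""
    return {tok.upper() for tok in IDENT.findall(expr)} - FUNCTIONS


def order_equations(predictors, equations):
    """Validate the response equations and return response IDs in calculation order.

    predictors: set of predictor IDs; equations: dict of response ID -> expression.
    """
    defined = set(predictors) | set(equations)
    deps = {}
    for var, expr in equations.items():
        used = operands(expr)
        unknown = used - defined
        if unknown:  # step 1: every operand must be in the variable list
            raise ValueError(f"{var}: undefined variable(s) {sorted(unknown)}")
        if var in used:  # step 2: no variable may be a function of itself
            raise ValueError(f"{var}: defined in terms of itself")
        deps[var] = {u for u in used if u in equations}  # only responses need ordering

    ordered, state = [], {}  # state: 1 = being visited, 2 = finished

    def visit(var, path):
        if state.get(var) == 2:
            return
        if state.get(var) == 1:  # step 3: reached a variable already on the current path
            cycle = path[path.index(var):] + [var]
            raise ValueError("circular dependency: " + " -> ".join(cycle))
        state[var] = 1
        for dep in sorted(deps[var]):
            visit(dep, path + [var])
        state[var] = 2
        ordered.append(var)  # step 4: a variable is emitted after all it depends on

    for var in sorted(equations):
        visit(var, [])
    return ordered


def generate_source(predictors, equations):
    """Step 5: emit assignment statements in the proper order."""
    order = order_equations(predictors, equations)
    return "\n".join(f"      {var} = {equations[var]}" for var in order)


print(generate_source({"C", "E", "F", "G"},
                      {"A": "B + C", "B": "D + E", "D": "F + G"}))
```

For the precedence example in the list, the generated statements put "D" first, then "B", then "A". The two invalid definitions in the list each raise an error instead of producing code.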
Running the Model
The user has the ability to initialize the model with actual measurements of the predictor variables taken for a historic period of time. The variable list contains information pertaining to how each individual predictor variable may either be directly retrieved or derived from values in the historic data base. Manual data editing of individual predictor variables is provided so that specific "what if" questions can be addressed. The predictor variables are logically grouped according to either the operating unit or the fuel to which they pertain. Any or all of these groups may be accessed and altered to produce a new scenario of predictor variables whose effects can be calculated by the model.
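The following Python sketch pictures how a scenario might be assembled from historic values plus manual edits made group by group. It is not the data input program generated at Armco. The function name, the group names, and every numeric value are made up for illustration; the variable IDs are borrowed from the boilerhouse example in this paper.

```python
def apply_edits(baseline, groups, edits):
    """Return a new scenario: baseline predictor values with manual edits applied.

    baseline: dict of predictor ID -> value taken from a historic period
    groups:   dict of group name -> list of predictor IDs in that group
    edits:    dict of group name -> dict of predictor ID -> new value
    """
    scenario = dict(baseline)  # copy, so the historic values stay untouched
    for group, changes in edits.items():
        members = groups[group]  # raises KeyError for an unknown group
        for var, value in changes.items():
            if var not in members:
                raise KeyError(f"{var} is not in group {group}")
            scenario[var] = value
    return scenario


# Illustrative values only; none of these numbers come from plant data.
baseline = {"PRD_TON": 1000.0, "STM_USE": 2.0, "BH1_CAP": 1500.0, "BH2_CAP": 800.0}
groups = {"PRODUCTION": ["PRD_TON"], "STEAM": ["STM_USE", "BH1_CAP", "BH2_CAP"]}

# "What if" the capacity of #1 Boilerhouse were reduced?
what_if = apply_edits(baseline, groups, {"STEAM": {"BH1_CAP": 1125.0}})
```

Working on a copy keeps the historic initialization intact, so that several scenarios can be derived from the same period.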
A Variable List Driven Model: An Example
There are two boilerhouses at the Middletown Works which produce high pressure steam for both process and steam space heating. Typical operation of these facilities consists of operating one boilerhouse at full capacity as the "baseload" facility and running the other boilerhouse as a "peaking" facility fulfilling the remaining demand for steam over and above the baseload. To represent this situation for modeling purposes, the following variables would need to be defined:
ID        NAME                             UNITS        SOURCE   EQUATION
PRD_TON   TOTAL TONS OF STEEL PRODUCED     NET TONS     INPUT
STM_USE   STEAM USAGE RATE/TON STEEL       MMBTUS/TON   INPUT
BH1_CAP   #1 BOILERHOUSE STEAM CAPACITY    MMBTUS       INPUT
BH2_CAP   #2 BOILERHOUSE STEAM CAPACITY    MMBTUS       INPUT
STM_DMD   TOTAL DEMAND FOR STEAM           MMBTUS       OUTPUT   (PRD_TON * STM_USE)
BH1_STM   TOTAL STEAM PRODUCED @ BH1       MMBTUS       OUTPUT   (MIN(STM_DMD,BH1_CAP))
BH2_STM   TOTAL STEAM PRODUCED @ BH2       MMBTUS       OUTPUT   (MIN(STM_DMD - BH1_STM,BH2_CAP))
In this example the model will expect that the total steel production, the steam usage rate, and the capacities for making steam at each of the two boilerhouses have been previously determined and are inputs to the model. The source for each of these variables may be either the historic data base or manual entry at model run time. The output or response variables are derived from the given inputs or predictors as shown in their equations. The #1 Boilerhouse is designated as the "baseload" facility and the quantity of steam produced there will be equal to either the total steam demand or the total capacity of the boilerhouse, whichever is less. The #2 Boilerhouse is designated the "peaking" facility and the amount of steam produced there is calculated as either the remaining steam demand (STM_DMD - BH1_STM) or the total capacity of the #2 Boilerhouse, whichever is less. It should be noted that the steam produced at the #2 Boilerhouse under the current assumptions is a function of what has already been produced at #1 Boilerhouse. The response BH2_STM must, therefore, be calculated after the response BH1_STM. The equation precedence calculator takes care of this condition for the user.

The user may now wish to examine the effects of different "what if" type conditions. For example, the effect of taking one of the four boilers at #1 Boilerhouse out of service would be reflected by the user changing "BH1_CAP", the capacity for that boilerhouse. Similarly, the effect of winter weather conditions could be examined by changing the steam usage rate. These represent relatively simple changes to the numeric inputs of the model but may not be trivial questions to ask the model to consider. Examining the effect of exchanging the roles of the boilerhouses, however, making #2 Boilerhouse the "baseload" facility and #1 Boilerhouse the "peaking" facility, requires a change in the structure of the model program itself.
This would require the following changes in the equations previously defined:

ID        NAME                             UNITS        SOURCE   EQUATION
BH2_STM   TOTAL STEAM PRODUCED @ BH2       MMBTUS       OUTPUT   (MIN(STM_DMD,BH2_CAP))
BH1_STM   TOTAL STEAM PRODUCED @ BH1       MMBTUS       OUTPUT   (MIN(STM_DMD - BH2_STM,BH1_CAP))
In this case the quantity of steam produced at the #2 Boilerhouse is equal to either the total steam demand or the total capacity of the boilerhouse, whichever is less, and the amount of steam produced at #1 Boilerhouse is calculated as either the remaining steam demand (STM_DMD - BH2_STM) or the total capacity of the #1 Boilerhouse, whichever is less. In contrast to the first set of conditions, the steam produced at the #1 Boilerhouse is now a function of what has already been produced at #2 Boilerhouse. The response BH1_STM must, therefore, be calculated after the response BH2_STM. The equation precedence calculator will automatically take care of this condition for the user when the model is rebuilt to incorporate these changes before the next model run. With the variable list driven model, the user not only has numeric input parameters to manipulate but may also make changes in the basic assumptions of how the response variables are calculated without needing programming assistance.
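To make the two role assignments concrete, the sketch below evaluates both sets of equations in Python for one set of inputs. It is a hedged illustration only. The input values are invented, `run_model` is a hypothetical helper, and a simple "retry until every operand is known" loop stands in for the equation precedence calculator, whose actual method the paper does not describe. The expressions are the ones from the variable lists, with FORTRAN's MIN mapped to Python's `min`.

```python
def run_model(inputs, equations):
    """Evaluate response equations, deferring any whose operands are not yet known."""
    values = dict(inputs)
    pending = dict(equations)
    while pending:
        progressed = False
        for var, expr in list(pending.items()):
            try:
                values[var] = eval(expr, {"__builtins__": {}, "MIN": min}, values)
            except NameError:
                continue  # an operand has not been calculated yet; retry next pass
            del pending[var]
            progressed = True
        if not progressed:
            raise ValueError(f"cannot resolve: {sorted(pending)}")
    return values


# Illustrative inputs only; none of these numbers come from plant data.
inputs = {"PRD_TON": 1000, "STM_USE": 2.0, "BH1_CAP": 1500, "BH2_CAP": 800}

# #1 Boilerhouse as baseload (listed out of order on purpose).
bh1_base = run_model(inputs, {
    "BH2_STM": "MIN(STM_DMD - BH1_STM, BH2_CAP)",
    "BH1_STM": "MIN(STM_DMD, BH1_CAP)",
    "STM_DMD": "PRD_TON * STM_USE",
})

# #2 Boilerhouse as baseload: only the two equations change, not the inputs.
bh2_base = run_model(inputs, {
    "STM_DMD": "PRD_TON * STM_USE",
    "BH1_STM": "MIN(STM_DMD - BH2_STM, BH1_CAP)",
    "BH2_STM": "MIN(STM_DMD, BH2_CAP)",
})

print(bh1_base["BH1_STM"], bh1_base["BH2_STM"])  # 1500 and 500.0
print(bh2_base["BH1_STM"], bh2_base["BH2_STM"])  # 1200.0 and 800
```

With these made-up inputs the total demand is 2000 MMBTUS in both cases. Exchanging the roles moves 300 MMBTUS of production from #1 Boilerhouse to #2 Boilerhouse without any change to the numeric inputs.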
CONCLUSION
As stated earlier, the use of a model is an act of faith, a faith which must be founded in the understanding and acceptance of the way in which the model represents our knowledge of a specific subject. The development of a model must therefore include the knowledge itself, provided by the client, and the representation of that knowledge, provided by the modeler. The variable list method cannot replace either of these functions but rather supplies a common means for both to work on the same problem. Hopefully, the modeler learns that model building is not solely the province of "operations research". At the same time the client learns that formal methods for representing his knowledge exist and that through the use of these methods new knowledge may be inferred from old knowledge.