Visualization of multivariate data with radial plots using SAS

Computers & Industrial Engineering 41 (2001) 17-35

www.elsevier.com/locate/dsw

Quinton J. Nottingham*, Deborah F. Cook, Christopher W. Zobel

Department of Management Science and Information Technology, Virginia Polytechnic Institute and State University, 1007 Pamplin Hall, Blacksburg, VA 24061-0235, USA

Abstract

Data visualization tools can provide very powerful information and insight when performing data analysis. In many situations, a set of data can be adequately analyzed through data visualization methods alone. In other situations, data visualization can be used for preliminary data analysis. In this paper, radial plots are developed as a SAS-based data visualization tool that can improve one's ability to monitor, analyze and control a process. Using the program developed in this research, we present two examples of data analysis using radial plots; the first example is based on data from a particle board manufacturing process and the second example is a business process for monitoring the time-varying level of stock return data. © 2001 Elsevier Science Ltd. All rights reserved.

Keywords: Radial plot; Visualization; SAS

* Corresponding author. Tel.: +1-540-231-7843; fax: +1-540-231-3752. E-mail address: [email protected] (Q.J. Nottingham).
0360-8352/01/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0360-8352(01)00040-7

1. Introduction

Visualization can be described as a process of 'representing data as a visual image' (Latham, 1995), where the image is developed using a combination of points, lines, coordinate systems, numbers, symbols, words, shading, and color to represent different measured quantities (Tufte, 1983). Visualization is often used to make apparent any trends and patterns in data that indicate underlying relationships among variables (Colet & Aaronson, 1997). This can help not only with the process of creating new models to analyze these relationships, but also with the validation of existing models.

Business managers and process operators spend considerable amounts of time and effort studying the relationships between various process factors and parameters. Statistics, mathematics, simulation, and operations research are tools for developing models that represent business processes, with these models then being used to assist managers and operators in controlling and adjusting the business process. There

Fig. 1. Weather map symbol.

are many business processes, however, that rely heavily on real-time plots of parameters and on process manager expertise for process analysis and control. For example, in producing its watches, Timex Corporation generates not only the watch-face components but also the machine parts such as the gears, springs, and molded plastic parts. If any one of these items is not within specifications, then the quality of the current product and any future product may be adversely affected; it is therefore important for the problem to be quickly and clearly identified and for some associated corrective action to be taken. Improvements in the amount and type of information communicated visually in real time would allow for improved control of processes such as this.

Colet and Aaronson (1997) note that most current visualization tools support only the task of presenting results and do not support the entire process involved in data analysis. Recent advances in computer and visualization technology, however, have provided the capability to develop tools that can be useful in studying, monitoring, and controlling a process. Various software packages that can be utilized for visualization are commercially available, and while the theory behind most visualization tools is not new, these tools are now much more accessible to all sizes and types of businesses because of the advances in computer technology; SAS System for Windows (1998) is one example of such a software application.

The purpose of this paper is to develop a SAS-based visualization tool that can enhance a business process owner's ability to monitor, analyze, and control the process by providing a more complete picture of process conditions.
Utilizing some of the built-in capabilities of SAS, the tool creates radial plots that can simultaneously display multiple process parameter values as the process evolves through time, allowing users of the system to identify potential trends, patterns, or relationships in the data before performing complex statistical or mathematical analyses. This can potentially lead to a much better understanding of the process being considered. To illustrate this potential, two examples of the radial plots' effectiveness are included: the first is based on a particle board manufacturing process and uses actual data collected from a study mill, and the second is a business process which involves examining the time-varying level of stock return data.

2. Data visualization

There are many examples in the literature that graphically display multivariate data using symbols. Blazek, Bradley and Scott (1987), for example, use polyplots that are viewed as univariate control charts

Fig. 2. Sales volume radial plot.

and Kleiner and Hartigan (1981) use symbols, such as trees and castles, to represent multivariate data. A well-known application of using symbols for multivariate plots is a weather map symbol as shown in Fig. 1 (Chambers, Cleveland, Kleiner & Tukey, 1983). The wind direction and speed for a given observation are shown by the direction of the flag and the number of bars, and the cloud cover is shown by the level of shading.

Work by Chambers et al. (1983) and later by Friendly (1991) introduced a symbolic star plot for graphing multivariate data in which the values of the variables are coded into symbols. Instead of using the values of two variables as x- and y-coordinates for each observation, they coded each of the remaining variables as an individual ray anchored at some arbitrary location. The radial plots presented in this paper are a variation of this symbolic star plot. Fig. 2 shows an example of a simple radial plot of two observations which has time (the x-coordinate) and sales volume (the y-coordinate) plotted as the vertex of each observation, and which represents the additional variables of Price, Advertising Expense, and Profit as rays extending from that origin. Changes in the values of the additional variables can easily be observed by comparing the change in the length of these rays from one radial plot to the next. For example, Fig. 2 clearly shows the increase in both price and profit from April 22, 2000 to May 22, 2000, and the resulting increase in sales.

3. Radial plots for process analysis and control

Radial plots have several distinct uses in the context of process monitoring and control. They can be used as precursors to the development of statistical or mathematical models by allowing a preliminary analysis of relationships among parameters. They can also be used alongside these mathematical and statistical tools to study, monitor, and control many types of business processes.
Many business processes rely on the use of statistical process control (SPC) and the corresponding control charts for process analysis and control. However, a univariate control chart can be misleading, especially when a process can be affected by several variables. It is useful to relate the variation of the process to other variables that are being observed simultaneously with the variable that is charted. This type of display enables the user to analyze the process, as a whole, while visualizing other control process variables. This can be accomplished with a radial plot. In this context, it is appropriate to use a radial plot in conjunction with control charts. For example,

Table 1
Example of SAS code used to produce radial plots

/* Program 1: read the flat data file into a SAS data set */
Data Sales;
    Infile "c:/my documents/research/Data Visualization/SalesData.prn";
    Input Time Price AIP Diff AdvExp Demand;
run;

/* Program 2: load the macro and draw the radial plots */
Options linesize = 132 ps = 44 nodate nonumber;
Options Mprint;
filename data 'c:\my documents\research\data visualization';
%include data(SalesData);
%include data(stars-2a);
%stars(data = Sales, var = Price AIP Diff AdvExp, id = Time, across = 10, down = 1, color = 'Black', yvar = Demand);
Run;
Quit;

one can plot the control limits of, say, a Shewhart control chart, and then plot the ray lengths of the auxiliary variables along with the primary process variable. Therefore, one can view the radial plot as a univariate control chart enhanced by rays representing the auxiliary variables of the data.

4. Generation of the radial plots

Prior to constructing the radial plots, the values of the auxiliary variables need to be rescaled. To construct a radial plot, the following algorithm is appropriate.

1. The value of each variable is used as the length of the ray for that variable. Therefore, the value of the variable should be nonnegative, with magnitude between some value $c$ and 1, where $c$ is the length of the smallest ray relative to the largest ($c$ may be zero). So, if $x_{ij}$ is the value of the $j$th variable for the $i$th observation, then the rescaled value is given by

$$x^{*}_{ij} = c + (1 - c)\,\frac{x_{ij} - \min_i(x_{ij})}{\max_i(x_{ij}) - \min_i(x_{ij})},$$

where $\min_i(x_{ij})$ is the smallest value of variable $j$ and $\max_i(x_{ij})$ is the largest value of variable $j$.

2. It is desired that the directions of the rays be equally spaced around the circle, so that the $j$th ray is at an angle from the horizontal of

$$\theta_j = \frac{2\pi (j - 1)}{p}, \qquad j = 1, 2, \ldots, p,$$

with $p$ representing the number of variables.

Once the original observations have been rescaled, and the ray length and angle from the horizontal have been calculated for each auxiliary variable, the data can be plotted for visual analysis. That is, for each observation $i$ the user has the bivariate pair $(x, y)$ of basic variables, together with the auxiliary variables of interest, consisting of the $p$-tuple $x^{*}_{i1}, x^{*}_{i2}, \ldots, x^{*}_{ip}$.

The Proc Shewhart procedure in SAS (1998) allows the user to create radial plots with control charts, but it is not as user friendly as the approach presented in this paper. The Proc Shewhart procedure

Fig. 3. Particleboard radial plots with all process parameters.

requires the user to specify and create a data set with the transformed auxiliary variables, as well as a data set with variables representing the control limits, ray sizes, and standard deviations. The program used in this research simply requires the user to read in the data via the SAS data step.

The code given in Table 1 represents two programs that need to be executed to produce radial plots. The first four lines are the code needed to read the data into a SAS data set from a flat data file. The second set of code in Table 1 calls the macro (see Appendix A) that creates the radial plot. In this second set of code, the macro allows the user to specify the auxiliary variables to be included in the radial plot (var = Price AIP Diff AdvExp), the number of observations on each plot (across = 10), and the variables to use as the x and y coordinates (id = Time, yvar = Demand).

The use of colors for the rays of the auxiliary variables allows the user to visually identify the magnitude of the auxiliary variables with respect to the basic variables in the process under examination. That is, one can use colors to represent whether an auxiliary variable is within two standard deviations of the mean, between two and three standard deviations of the mean, or beyond three standard deviations of the mean. This color scheme has been employed in all radial plots in this paper, with green, blue and red, respectively, for the ranges described above. The standard deviations are currently calculated a posteriori, as full knowledge of the observations over the entire data set is available. They could also easily be calculated on a real-time basis, using all observations currently available.

5. Particleboard example

Particleboard is a composite wood product used in various furniture and building applications. The raw material input of planer shavings is steamed, refined, dried, and then blended with binding agents.
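
As an aside on the color scheme of Section 4, the rule (green within two standard deviations of the mean, blue between two and three, red beyond three) can be sketched in a few lines. The sketch below is written in Python rather than SAS purely for illustration; the function names `ray_colors` and `ray_colors_running` are hypothetical and do not appear in the macro. The second function is one possible reading of the "real-time" variant, in which each observation is judged only against the observations available up to that point; treating the very first observation as green is an assumption of this sketch.

```python
from statistics import mean, stdev


def ray_colors(values):
    """Color each value by its distance from the sample mean, measured in
    sample standard deviations (n - 1 denominator). Needs at least 2 values."""
    m, s = mean(values), stdev(values)
    colors = []
    for v in values:
        # Assumption of this sketch: a constant variable is shown as green.
        z = abs(v - m) / s if s > 0 else 0.0
        if z <= 2:
            colors.append("green")   # within two standard deviations
        elif z <= 3:
            colors.append("blue")    # between two and three
        else:
            colors.append("red")     # beyond three
    return colors


def ray_colors_running(values):
    """Same rule, but each value is judged against only the observations
    seen so far (a simple, non-incremental real-time variant)."""
    colors = []
    for i in range(len(values)):
        seen = values[: i + 1]
        if len(seen) < 2:
            # Assumption: with a single observation there is no spread to judge.
            colors.append("green")
        else:
            colors.append(ray_colors(seen)[-1])
    return colors


if __name__ == "__main__":
    readings = [10] * 20 + [100]
    print(ray_colors(readings)[-1])
    print(ray_colors_running(readings)[-1])
```

The running version recomputes the mean and standard deviation from scratch at every step, which keeps the sketch short; an incremental update would be the natural choice for a long-running process.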
Following this blending operation, the stock material is formed into individual mats that are pressed, cut, and sanded into final product form. Various process parameters are measured and process settings recorded during the particleboard manufacturing process. The parameter measurements, which include moisture content of the original particle material, resin content, in-process moisture contents, and bulk density, are the basis for process adjustments. The process settings include information about general press conditions represented by temperature and cycle time and line conditions including line speed and dryer conditions. Also, bonding treatment is calculated from the measurements representing the amount of resin (bonding agent) and wood, as well as the moisture content of the mixture.

Strength of the final board is a key quality characteristic of the product. One measure of the board strength is internal bond (IB). IB is an overall measure of the integrity of the board that defines how well the internal core material bonded together and is determined using destructive testing methods on the final manufactured product.

A data set of all measured process parameters and settings was collected from a particleboard study mill. Radial plots of all process parameters, with IB as the response variable, were developed and analyzed. Fig. 3 shows the value of all process parameters at three specific points in time. This plot allows an operator to monitor all 10 process parameters on one plot, as opposed to ten separate univariate plots. The addition of color to highlight values enables an operator to quickly identify significant changes in the process levels for all 10 parameters. Drill-down capabilities would give the operator access to the individual plots of the process parameters from the radial plots. As only general information about process parameter levels is easily discernible from Fig. 3, an

Fig. 4. Particleboard radial plots with critical process parameters.

effective process analysis would require additional radial plots with fewer process parameter values. With this in mind, critical process parameters were chosen (face and core bulk density, face and core resin content, and face and core infeed moisture content) from the particleboard data set and radial plots were developed (Fig. 4).

A preliminary visual analysis to identify which process parameters are contributing to the changes in IB can be accomplished using these radial plots. For example, it is likely that the low value of IB for the sample at observation 98 is at least partially due to the low value of core resin content. Similarly, the higher values of IB occurring at observations 101, 104, and 106 are likely being impacted by the higher values of core resin content.

The radial plots also allow the impact of multiple parameter values to be analyzed and compared simultaneously. For example, it may be difficult to determine the cause of the low value of IB when looking only at the values at observation 103. Core bulk density and core resin treatment are the parameters expected to most impact IB; however, the values of those parameters at observation 103 are similar to those that resulted in higher IB values at other times, such as observation 102. Additional information about interactions between all parameters must then be gleaned by analyzing and comparing the parameter values at different times, in this case observation 99 and observation 103. The value of core bulk density at observation 99 is higher than at 103 (as denoted by the colors of the rays), as is the value of IB. When comparing the two points, one can quickly determine that the higher IB at observation 99 occurred with lower moisture content values. The combination of high core bulk density and lower moisture contents resulted in a higher IB value than that observed at observation 103. This comparison allows the operator to further his or her understanding of the process.
From the radial plots, the operator can determine that a high value of core bulk density can result in a higher IB level when the moisture content values are low.

The interpretation of the radial plots facilitates the increased understanding of a complex process, as the interactions between parameters can be analyzed and efforts to determine the optimal parameter combinations can be initiated. The gain of this type of process information and knowledge is invaluable in a company's efforts to improve process control and reduce process variability. Process managers can improve their control of the manufacturing system as a whole as their level of understanding of parameter inter-relationships improves. Additionally, the process knowledge gained through the radial plots could be used as a basis for experimental design or for the development of a process simulation model.

6. Stock market example

The financial world is a rich environment for data analysis and there is ongoing interest in both the public and private sectors in developing techniques for predicting and/or characterizing the behavior of the stock market over time (see Pesaran and Timmermann (1995) or Glosten, Jagannathan and Runkle (1993)). Although historical data can easily capture the movement of the market and the values of parameters believed to have a direct effect on this movement, it can be difficult to transform this data into usable information about the fundamental relationships in place within the system. Radial plots can help to address this difficulty by simultaneously displaying on the same graph both the values of the chosen input parameters and the corresponding level of the market response. We demonstrate the use of radial plots in this context by applying them to a data set composed of monthly

observations of the S & P 500 stock index between January 1980 and December 1982. This is a subset of a larger data set created by Pesaran and Timmermann (1995) and originally used to identify the external factors most likely to have an effect on the level of excess returns associated with this index. This same data set was later used by Qi and Maddala (1999) to demonstrate that the relationships between the inputs and the response are non-linear in nature, making them difficult to characterize analytically. An artificial neural network approach was used in this second paper to address this non-linearity and to predict the changing value of the response variable based on changes in the input variables.

The radial plots shown in Figs. 5-7 incorporate the six variables chosen as inputs for the neural network model of Qi and Maddala: the dividend yield (divyield), the one-month treasury bill rate (intrate), the monthly change in this interest rate (cir), the change in monetary base (cmb), the change in industrial production (cip), and the inflation rate (inflatn). As discussed above, the primary advantage of the radial plots lies in their ability to visually identify the relationships between the input variables being considered and the response variable. For example, observations 313, 314, and 315 (January through March, 1980) in Fig. 5 illustrate a rapid increase in the monthly interest rate and the corresponding rapid decrease of the excess return response. When the interest rate rebounds back to nominal levels in observation 316 (April, 1980), so does the excess return. Similarly, the level of the response changes when the interest rate increases and then decreases again from observations 321 to 324 (September to December, 1980). During this period, however, inflation is also higher than normal and the combination of the two factors appears to affect the stability of the response.
The possible relationship between the different input variables, as well as between the inputs and the response variable, can also be illustrated by such radial plots. Take, for example, observations 332-335 (August to November, 1981) in Fig. 6, where a decrease in both interest rate and inflation leads to a sharp increase in the excess return and a corresponding increase in the dividend yield. Later on, in observations 343-346 (July through October, 1982), shown in Fig. 7, a large decrease in the change in interest rate parameter and a simultaneous increase and then decrease in the dividend yield identify a period of significant instability in the response. It is interesting to note that the change in industrial production remains steady throughout this entire period, suggesting that it had little effect on the observed instability.

Pesaran and Timmermann (1995) make the observation that the significance of the different input variables' effects on the response value is time dependent and often corresponds to shocks to the economy. They also point out that the early 1980s were a period of instability, due to the change in the Federal Reserve's operating procedures with respect to the targeting of interest rates. The radial plots not only show this instability, but also clearly show how each of the independent variables is changing with respect to it. This allows for easy identification not only of potentially significant input parameters, but also of the manner in which they are significant. In this context, the radial plots can serve as a very powerful descriptive tool with the ability to support, as well as to suggest, more detailed types of analyses.

7. Conclusions

Radial plots offer a simple, yet powerful, visualization tool for business process analysis and control. A preliminary analysis of the impact of multiple variables on a response variable can be conducted.
This preliminary analysis can then guide further process analysis and modeling and consequently lead to an

Fig. 5. Radial plots for excess returns on the S & P 500 (Jan-Dec, 1980) with significant input parameters.

Fig. 6. Radial plots for excess returns on the S & P 500 (Jan-Dec, 1981) with significant input parameters.

Fig. 7. Radial plots for excess returns on the S & P 500 (Jan-Dec, 1982) with significant input parameters.

improved understanding of the business process. In this paper we have presented a tool for generating and displaying radial plots that incorporates the use of colors to represent changes in the different variable values being considered. This new tool uses some of the built-in functionality in SAS for Windows and greatly simplifies the process of generating these radial plots. In order to use the procedure, the operator must simply record the measurements of interest into a single data set and ensure that the data set is saved as a flat data file. The program/macro can then be executed using the code provided in Table 1. Although some knowledge of the SAS programming language is an advantage, it is not necessary in order to use the macro. This ease of use, together with the widespread availability of SAS as an analysis tool, can ultimately help make visualization of business processes more accessible to a greater variety of potential users.

Appendix A

Following is the code used to generate the radial plots. The macro below is called using the second set of code listed in Table 1. The code may also be obtained in electronic form by request.

%macro STARS(
    data = _LAST_,       /* Data set to be displayed */
    var = _NUMERIC_,     /* Variables, as ordered around the star */
    id = ,               /* Observation identifier (char) */
    std = ,              /* Standardize first: M [SD] */
    minray = .1,         /* Minimum ray length, 0 <= MINRAY < 1 */
    across = 6,          /* Number of stars across a page */
    down = 1,            /* Number of stars down a page */
    color = 'BLACK',     /* Star color, quoted string or variable name */
    cfill = '',          /* Background color */
    rayline = 1,         /* Line style(s) of rays */
    raythick = 1,        /* Line thickness of rays */
    htvlabel = 2.5,      /* Height of variable labels in key */
    a0 = 0,              /* Angle offset - angle for first ray */
    circle = 0,          /* Draw circle around each star at 100%? */
    missing = .,         /* Value assigned to missing data, or . to delete */
    name = stars,        /* Name for graphic catalog entry */
    keypage = 1,         /* Draw variable key on separate page? */
    yvar =               /* Response variable for the y axis (passed in Table 1, used below as &yvar) */
    );
%if %bquote(&data) = %bquote(_LAST_) %then %let data = &syslast;
%if %length(&color) = 0 %then %let COLOR = 'BLACK';

/* Count the variables to be plotted and store the count in &NV */
data _null_;
    file log;
    array p{*} &var;
    point = 1;
    set &data point = point nobs = nobs;
    k = dim(p);

    put @7 'Number of variables = ' k;
    call symput('NV', left(put(k, 2.)));
    stop;
run;

/* Minimum, maximum, standard deviation and mean of every variable */
proc univariate noprint data = &data;
    var &var;
    output out = _range_
        min = mn1-mn&nv max = mx1-mx&nv
        std = std1-std&nv mean = mean1-mean&nv;
run;

/* Count the observations (&NOBS) and the ray line styles (&NL) */
data _null_;
    point = 1;
    set &data point = point nobs = nobs;
    put "NOTE: Radial plot for data set &DATA";
    put @7 'Number of observations = ' nobs /;
    call symput('NOBS', put(nobs, 5.));
    lines = "&rayline";
    this = scan(lines,1);
    do i = 2 by 1 until(this = ' ');
        this = scan(lines,i);
    end;
    call symput('NL', left(put(i-1, 2.)));
    stop;
run;

/* Code each ray by its distance from the mean of its variable:
   1 = within 2 SD, 2 = between 2 and 3 SD, 3 = beyond 3 SD */
data &data;
    set &data;
    if _n_ = 1 then set _range_;
    array raycolor{*} raycolor1-raycolor&nv;
    array std{*} std1-std&nv;
    array mean{*} mean1-mean&nv;
    array vars{*} &var;
    do i = 1 to &nv;
        if vars(i) < (mean(i)-3*std(i)) OR vars(i) > (mean(i)+3*std(i)) then raycolor(i) = 3;
        if vars(i) le (mean(i)+2*std(i)) AND vars(i) ge (mean(i)-2*std(i)) then raycolor(i) = 1;
        if vars(i) le (mean(i)+3*std(i)) AND vars(i) gt (mean(i)+2*std(i)) then raycolor(i) = 2;
        if vars(i) ge (mean(i)-3*std(i)) AND vars(i) lt (mean(i)-2*std(i)) then raycolor(i) = 2;
    end;

* The block below rescales the variables on a scale of minray to 1;
data _scaled_;

    set &data;
    if _n_ = 1 then set _range_;
    drop i keep mn1-mn&nv mx1-mx&nv;
    array vars{*} &var;
    array mn{*} mn1-mn&nv;
    array mx{*} mx1-mx&nv;
    keep = 1;
    do i = 1 to &nv;
        if vars(i) = . then do;
            vars(i) = &missing;
            keep = &missing;
        end;
        else vars(i) = &minray + (1-&minray)*(vars(i)-mn(i))/(mx(i)-mn(i));
    end;
    if keep = . then delete;
%put &DATA dataset variables scaled to range &MINRAY to 1;

/* Map a ray angle in degrees to an Annotate label position */
proc format;
    value posn
        0-22.5      = '6'   /* left, centered */
        22.6-67.5   = 'C'   /* left, above */
        67.6-112.5  = '2'   /* centered, above */
        112.6-157.5 = 'A'   /* right, above */
        157.6-202.5 = '4'   /* right, centered */
        202.6-247.5 = '7'   /* right, below */
        247.6-292.5 = 'E'   /* centered, below */
        other       = 'F';  /* left, below */
run;

/* Build the Annotate data set that draws the rays */
data stars;
    length function varname color $8;
    array p(k) &var;
    array raycolor{*} raycolor1-raycolor&nv;
    retain s1-s&nv c1-c&nv;
    retain cols &across   /* number of observations per row */
           rows &down     /* number of rows per page */
           xsys '2'       /* use data coordinates */
           ysys '2'       /* for both X and Y */
           lx ly page 1   /* cell X,Y and page counters */
           rx ry r        /* cell radii */
           a0;

    array s(k) s1-s&nv;   /* sines of angle */
    array c(k) c1-c&nv;   /* cosines of angle */
    array l{&nl} _temporary_ (&rayline);
    drop cols rows rx ry cx cy s1-s&nv c1-c&nv &var x0 y0 k save;
    drop varname showvar r ang a0 i;
    if _n_ = 1 then do;
        a0 = &a0 * (3.1415926/180);
        do k = 1 to &nv;
            ang = a0 + (2 * 3.1415926 * (k-1)/&nv);
            s = sin(ang);
            c = cos(ang);
            p = 1.0;
        end;
        rx = (100/cols)/2;
        ry = (100/rows)/2;
        text = 'Variable Assignment Key';
        showvar = 1;
        %if &keypage > 0 %then %do;
            x0 = 50; y0 = 50; r = 30;
            size = &htvlabel;
            x = x0; y = 8;
            function = 'LABEL'; output;
            link DrawPlot;
            page + 1;
            lx = 0; ly = 0;
        %end;
    end;
    set _scaled_ end = lastobs;
    showvar = 0;
    r = 0.75;
    * (x0, y0) represents the origin of the plot;
    x0 = time;
    y0 = &yvar;
    link DrawPlot;
    %if &circle > 0 %then %do;
        size = .;
        color = 'yellow';
        line = &circle;
        link circle;
    %end;
    if (lastobs) then do;
        call symput('PAGES',trim(left(page)));
        put 'STARS plot will produce ' page 'page(s).';
    end;
    lx + 1;   /* next column */

    if lx = cols then do; lx = 0; ly + 1; end;              /* next row */
    if ly = rows then do; lx = 0; ly = 0; page + 1; end;    /* next page */
    return;

DrawPlot:
    *-- draw rays from the origin to each point;
    *-- label with the variable name if showvar = 1;
    do k = 1 to &nv;
        x = x0; y = y0;
        function = 'MOVE'; output;
        x = x0 + p * r * c;
        y = y0 + p * r * s;
        line = l{1 + mod(k-1, &nl)};
        save = size;
        size = &raythick;
        function = 'DRAW';
        if raycolor{k} = 1 then color = 'GREEN';
        if raycolor{k} = 2 then color = 'BLUE';
        if raycolor{k} = 3 then color = 'RED';
        output;
        size = save;
        if showvar = 1 then do;
            ang = a0 + (2 * 3.1415926 * (k-1)/&nv);
            varname = ' ';
            call vname(p,varname);
            text = trim(left(varname));
            text = substr(text,1,1) || lowcase(substr(text,2));
            position = left(put(180*ang/3.14159,posn.));
            function = 'LABEL'; output;
        end;
    end;
    return;

circle:
    do i = 0 to 2*3.1415926 by (2*3.1415926/36);
        if i = 0 then function = 'move';

        else function = 'draw';
        x = x0 + r*cos(i);
        y = y0 + r*sin(i);
        output;
    end;
    return;
run;

/* Page 1: the variable assignment key */
%do pg = 1 %to 1;
    data slide&pg;
        set stars;
        if page = &pg;
        xvar = trim(left(put(text,1.)));
    proc gslide annotate = slide&pg   /* Plot current page */
        name = "&name&pg"
        des = "Radial Plot of &data";
    run;
    proc datasets nolist lib = work;
        delete slide&pg;
%end;

/* Remaining pages: the radial plots drawn over a plot of &yvar against time */
%do pg = 2 %to &pages;
    data slide&pg;
        length color $8;
        set stars;
        array raycolor{*} raycolor1-raycolor&nv;
        if page = &pg;
    proc univariate noprint data = slide&pg;
        var time &yvar;
        output out = _axes_ min = minx miny max = maxx maxy;
    run;
    data axes;
        set _axes_;
        call symput('minx',minx-1);
        call symput('miny',miny-1);
        call symput('maxx',maxx + 1);
        call symput('maxy',maxy + 1);
    run;
    axis1 order = (&minx to &maxx by 1);
    axis2 order = (&miny to &maxy);
    symbol value = point;
    proc gplot data = &data;
        plot &yvar*time / annotate = slide&pg vaxis = axis2 haxis = axis1;
    run;
    proc datasets nolist lib = work;
        delete slide&pg;
%end;
%mend;
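
For readers who want to check the geometry without running SAS, the two numerical steps of the macro (the rescaling to the range minray to 1, and the DrawPlot computation of each ray's endpoint) can be sketched as below. This is a Python illustration, not a translation of the macro: the function names `rescale` and `ray_endpoints` are hypothetical, the default radius of 0.75 and the 0.1 minimum ray length simply mirror the values used above, and giving a constant variable full-length rays is an assumption made here to avoid a division by zero.

```python
import math


def rescale(column, minray=0.1):
    """Map the values of one variable linearly onto [minray, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption of this sketch: a constant variable gets full-length rays.
        return [1.0 for _ in column]
    return [minray + (1 - minray) * (v - lo) / (hi - lo) for v in column]


def ray_endpoints(x0, y0, lengths, r=0.75, a0_deg=0.0):
    """Endpoints of the rays for one observation whose vertex is (x0, y0).

    lengths holds the rescaled values of that observation, one per variable;
    ray k (counting from 0) points at angle a0 + 2*pi*k / len(lengths).
    """
    nv = len(lengths)
    a0 = math.radians(a0_deg)
    points = []
    for k, p in enumerate(lengths):
        ang = a0 + 2 * math.pi * k / nv
        points.append((x0 + p * r * math.cos(ang), y0 + p * r * math.sin(ang)))
    return points


if __name__ == "__main__":
    price = rescale([2, 4, 6])            # one variable, three observations
    advexp = rescale([30, 10, 20])
    # Rays of the first observation, drawn around the vertex (1, 50).
    for point in ray_endpoints(1.0, 50.0, [price[0], advexp[0]]):
        print(point)
```

Each variable is rescaled over all observations first; the rays of a single observation are then built from that observation's rescaled values, which matches the order of the data steps in the macro.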

References

Blazek, L. W., Bradley, N., & Scott, D. M. (1987). Displaying multivariate data using polyplots. Journal of Quality Technology, 19 (2), 69-74.
Chambers, J. M., Cleveland, W. S., Kleiner, B., & Tukey, P. A. (1983). Graphical methods for data analysis. Boston, MA: Duxbury Press.
Colet, E., & Aaronson, D. (1997). Visualization of multivariate data: human-factors considerations. Behavior Research Methods, Instruments, and Computers, 27 (2), 257-263.
Friendly, M. (1991). SAS system for statistical graphics. Cary, NC: SAS Institute.
Glosten, L. R., Jagannathan, R., & Runkle, D. E. (1993). On the relationship between the expected return and the volatility of the nominal excess return on stocks. Journal of Finance, XLVIII (5), 1779-1801.
Kleiner, B., & Hartigan, J. A. (1981). Representing points in many dimensions by trees and castles. Journal of the American Statistical Association, 76 (374), 260-276.
Latham, R. (1995). The dictionary of computer graphics and virtual reality (2nd ed.). New York: Springer.
Pesaran, M. H., & Timmermann, A. (1995). Predictability of stock returns: Robustness and economic significance. Journal of Finance, L (4), 1201-1228.
Qi, M., & Maddala, G. S. (1999). Economic factors and the stock market: a new perspective. Journal of Forecasting, 18, 151-166.
SAS System for Windows, Version 7. (1998). Cary, NC: SAS Institute.
Tufte, E. R. (1983). The visual display of quantitative information. Cheshire, CT: Graphics Press.