Incremental Updateable Control Systems on an FPGA

T. Parkkila*, S. D. Glaser**

*VTT Technical Research Centre of Finland, Oulu, FI-90571 Finland (Tel: +358-40-521-7949; e-mail: [email protected]). **University of California, Berkeley, CA 94720 USA (e-mail: [email protected]).

Abstract: Component-based development of embedded system software has proven to be a cost-effective methodology, and many general component technologies are available from both industry and academia. Even so, such modular updating and upgrading can be too expensive (in cost, manpower, and down time) to execute. This is especially true for safety- and time-critical applications, where rigorous requalification processes are necessary after a software change. A theoretical approach to incremental or modular certification would provide the ability to use particular subsystems in a modular fashion, but this requires both functional and time behaviour independence of the module. In this research a component-based software model was used to develop an incrementally updateable multiprocessor architecture for reconfigurable FPGA computing technology. The developed system architecture was designed in the MATLAB/Simulink environment and evaluated through a case study of an embedded home automation control system. A new functionality was added to the initial system, and later reconfigured into the FPGA chip as a new processor module. Different memory architectures of the multiprocessor system and their relations to changes in the temporal behaviour of the system were tested and measured. With the best memory configuration, the average change in the internal temporal behaviour of the system processors of the initial functionality after system upgrading was smaller than 0.5 µs.

Keywords: Multiprocessor systems, system architectures, real-time systems, safety-critical, closed-loop control, certification, validation, verification.

1. INTRODUCTION

Industrial embedded systems, especially safety-critical or Hard Real-Time (HRT) systems, must be certified before being put into service.
Certification is a very expensive process, including extensive verification and validation testing, and a design and development process that must satisfy a set of tightly defined rules as to how the development work is organized (Bouyssounouse and Sifakis, 2005). If the embedded system design is changed during the development phase, rigorous qualification procedures, e.g., reverification, revalidation, and recertification, are necessary to assess and reduce the risk of malfunction or unanticipated system failure. The additional costs of a design improvement can negate any savings that might otherwise have been realized. System modularization and systematic incremental updatability are very attractive solutions to reduce system design and updating costs (Rushby, 2002; Nicholson, Bate and McDermid, 2000; Fenn et al., 2007). Partitioning, error containment, and the encapsulation services of modules are important properties for a modular system architecture to address. By modularization, a system can be partitioned to minimize the number of subsystems that need to be retested or recertified when changes occur - "modular certification." Modular certification allows certification arguments that

have been developed for a particular subsystem to be re-used in a modular fashion (Rushby, 2002). In addition to modular certification, an incremental certification approach describes the ability to integrate and qualify new applications and maintain existing applications without a need to recertify or requalify the entire system (Nicholson, Bate and McDermid, 2000). System modularization is natural in distributed federated systems, but it is more complicated to implement in single-chip embedded real-time systems. Software can be designed as modular, but shared computing resources impede the desired independence of each software module (Hammett, 2003). Today's multi-core computing technology offers new possibilities to implement the modularity features of physically distributed systems in single-chip implementations. Kopetz et al. (2007) have researched the implementation of the integrated DECOS system architecture on a single ASIC (Application Specific Integrated Circuit) SoC (System-on-Chip). The idea was to distribute different functionalities, such as measurement, control, and communication controllers, onto different cores on the chip, and to time its behavior with an internal TTA (Time-Triggered Architecture) based TT-NoC (Time-Triggered Network-on-Chip) network. They described a method for programming multi-core SoCs and demonstrated the advantages of multi-core SoC technology in

achieving better composability and error containment properties than conventional microprocessor-based systems.

ASIC-based SoCs are only one-time configurable multi-core chips. In contrast to Kopetz's research, the aim of our program was to improve modular system updateability by utilizing the multiprocessor capabilities available in reconfigurable FPGA (Field Programmable Gate Array) technology. With FPGA technology, the entire hardware digital logic, including soft-core processors, can be remotely reprogrammed, in addition to updating the operational software.

This paper reports on a component-based embedded system design, and its implementation within an FPGA computing platform. The design was evaluated in a case study of an embedded home automation control system. The initial control system functionality was updated with a new control loop, and the temporal behaviours of the system components during the initial phase I and the updated phase II were measured and compared. Both phases were tested for six different processor memory configurations, and trigger output signals were captured for further statistical analysis of temporal behaviour.

2. NOVEL MODULAR SYSTEM ARCHITECTURE

The SaveCCM (Åkerholm et al., 2007) software component model was used as the ideological basis of our multiprocessor system architecture. SaveCCM was developed for designing and developing functionally independent software components (modules), and for composing applications from them. Because of its strict "read-execute-write" semantics, SaveCCM was suitable for our purpose. It ensures that once a component is triggered, its execution is functionally independent from any concurrent activity, i.e., a component produces the same output under both preemptive and non-preemptive scheduling (Åkerholm et al., 2007).

Fig. 1. A basic system building block.

The basic building block of the novel modular system architecture, a processor module, is itself a compound of two parts – a software component and its trigger component. An example MATLAB/Simulink block diagram representation is shown in Fig. 1. The trigger component is responsible for firing the software component upon an external trigger event. After firing, the trigger component remains inactive until reset by the software component. In the inactive state, it blocks all external trigger events. Therefore, the trigger component is responsible for ensuring the software component's functional independence of any concurrent activity.

A software component, e.g., the AD_converter component in Fig. 1, can have six types of signals: clock input, data input, data output, trigger input, trigger output, and trigger reset. In addition to a compulsory clock input signal, the software component has to include at least one trigger input and one trigger reset signal. Depending on its internal functionality, the software component can have different numbers of data input, data output, and trigger output signals. Besides the compulsory clock and trigger signals, the AD_converter component has two data input signals, tempHWater and tempOutside, for reading sensor data, and two data output signals, temp_hwater and temp_outside, to transmit sampled sensor data outside the module. There is also the trigger output signal data_ready to trigger the next module in a modular system model, and, for our case study, the data output signal moduleActive for pointing out the temporal behavior of the module. The signals of the AD_converter component are listed in Table 1.

Table 1. Signals of "AD_converter" module

Name:         Type:           Width:
clk           clock input     1 bit
tempHWater    data input      16 bits
tempOutside   data input      16 bits
trigger       trigger input   1 bit
temp_hwater   data output     16 bits
temp_outside  data output     16 bits
data_ready    trigger output  1 bit
triggerReset  trigger reset   1 bit
moduleActive  data output     1 bit

Fig. 2. State flow diagrams of the initial and updated AD_converter software.

Software component functionality can be implemented with the MATLAB/Simulink state flow design tools. Fig. 2 shows the state flow implementations of the initial and updated software structures of the AD_converter component.
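The behaviour of a processor module – a blocking trigger component in front of a software component whose state flow alternates between IDLE and one or more EXECUTE states – can be sketched as follows. This is an illustrative Python sketch with our own class and handler names, not the C code generated from the Simulink model:

```python
class ProcessorModule:
    """Sketch of a processor module: a blocking trigger component in
    front of a software component with IDLE/EXECUTE states."""

    def __init__(self, execute_states):
        self.execute_states = execute_states  # ordered EXECUTE handlers
        self.armed = True                     # trigger component state
        self.trace = []                       # visited states, for inspection

    def trigger(self, inputs):
        # An inactive trigger component blocks all external trigger events.
        if not self.armed:
            return None
        self.armed = False
        self.trace.append("IDLE->EXECUTE")
        # Read: snapshot the inputs so concurrent writers cannot
        # influence this execution (functional independence).
        snapshot = dict(inputs)
        outputs = {}
        # Execute: run each EXECUTE state in order. An incremental
        # update only appends handlers; the initial ones are untouched.
        for name, handler in self.execute_states:
            self.trace.append(name)
            outputs.update(handler(snapshot))
        # Write the outputs, issue triggerReset, and return to IDLE.
        self.armed = True
        self.trace.append("EXECUTE->IDLE")
        return outputs

# Phase I AD_converter: one EXECUTE state sampling two temperatures.
phase1 = [("EXECUTE", lambda s: {"temp_hwater": s["tempHWater"],
                                 "temp_outside": s["tempOutside"]})]
# Phase II appends a new EXECUTE state for the new tempSWater input.
phase2 = phase1 + [("EXECUTE2", lambda s: {"temp_swater": s["tempSWater"]})]

ad = ProcessorModule(phase2)
print(ad.trigger({"tempHWater": 550, "tempOutside": -120, "tempSWater": 480}))
```

The point of the sketch is the update pattern: the phase II list reuses the phase I EXECUTE handler unchanged and only appends a new state, mirroring how the AD_converter software was upgraded in Fig. 2.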

Software components have two states: IDLE and EXECUTE. In the IDLE state the component is waiting for a trigger event, and once the component is triggered the state changes to EXECUTE. Component software can be updated without losing the initial time functionality of the model. An update can be managed by adding new EXECUTE states to the initial component state flow diagram. In Fig. 2, the AD_converter software component is updated with a new EXECUTE state in order to handle the new tempSWater data.

3. CASE STUDY: EMBEDDED HOME AUTOMATION CONTROL SYSTEM

In this case study an incrementally updateable home automation control system was designed and implemented. The research was executed in two phases. In the first phase, a simple embedded control system for a heating-water mixer valve (Fig. 3) was designed using the presented system architecture, then implemented on an FPGA-based computing platform. In the second phase the initial system was updated by adding a new control system for a fresh water mixer valve. The objective was to measure the time behaviour of the system before and after the update and to find a solution that keeps the initial time behaviour of the control system as stable as possible. Therefore, the focus of the study was more on designing and testing system independence than on refining the control system application.

Fig. 3. The home automation control system, phase I and phase II.

3.1 Design and Implementation Phase I

The phase I control system should measure outdoor air and heating water temperatures, and calculate values to control the heating water mixer valve. A simple user interface was required, which was implemented using the Ethernet bus and a computer. The control system was divided into two principal domains – non-critical and critical. The non-critical domain was responsible for user interface services. The critical domain was responsible for the actual time- and safety-critical control loops, which were designed within the developed modular system architecture.

The phase I system design was a compound of four processor modules: the System_module in the non-critical domain, and the AD_module, Control_module and DA_module in the critical domain. The AD_module was responsible for taking measurements, i.e., reading temperature sensor values from the analog-to-digital converter, and relaying the values to the next module of the system, the Control_module. The Control_module calculated a simple PI (Proportional-Integral) control function based on the measurements and previous control output values. It also relayed the new control value to the DA_module. The DA_module converted the given control values to an analogue voltage to control the mixer valve. The critical domain was timed with a system trigger logic, which fed the ExTrigger signal to the DA_module. The triggering period of the system trigger logic was adjustable by the non-critical domain through the user interface.

The run-time engine was an Altera Cyclone II FPGA (Altera, 2008) on a VTT ReDNeC-P0 FPGA research platform, Fig. 4. Each processor module in the critical domain was implemented as a Nios II (Altera, 2009a) economy softcore processor using Altera's SOPC Builder design tool (Altera, 2009b). The system processor in the non-critical domain was implemented as a Nios II standard softcore processor. Signals in and out of the softcore processors were implemented using general parallel I/O port configurations. The functionality of the trigger component (Fig. 1) was created with the Nios II softcore processor's general I/O port edge-capture function. With the edge-capture function it was possible to implement the desired blocking interrupt service for a trigger input signal. The software for the processor modules was generated from the MATLAB/Simulink model. The only software additions required when attaching the generated code to the real application were the digital interfaces between the processor modules and the software drivers interfacing with the world through the AD- and DA-converters.

Fig. 4. System evaluation platform.
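The PI control function of the Control_module is not listed in the paper; the following is a minimal discrete PI sketch of the kind described, in Python for illustration (the gains, setpoint, and crude plant response are made-up placeholders, not the values used in the case study):

```python
def pi_step(setpoint, measurement, integral, kp=0.8, ki=0.2, dt=1.0):
    """One discrete PI step: u = Kp*e + Ki*sum(e)*dt.
    Returns the control output and the updated integral term."""
    error = setpoint - measurement
    integral += error * dt
    return kp * error + ki * integral, integral

# Example: drive the heating-water temperature toward a 55.0 C setpoint.
integral = 0.0
temp = 40.0
for _ in range(3):
    u, integral = pi_step(55.0, temp, integral)
    temp += 0.05 * u  # crude stand-in for the valve/plant response
print(round(temp, 2))
```

In the actual system, the previous control output and the integral state would be kept in the Control_module between trigger events, exactly as the paper describes the module using "the measurements and previous control output values".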

3.2 Design and Implementation Phase II

In the second phase of the experiment, the system was updated with a new control system for the service water mixer valve. The desired upgrade was made by integrating a new temperature measurement into the AD_module, a new control calculation processor module, Control_module2, and a new actuator control output into the DA_module of the initial system design. Even though the ReDNeC-P0 research platform with the VTT SANTACRUZ I/O card supported two AD-converters, the use of one shared AD-converter for all measurements was required to better simulate a real situation. Similarly, the DA_module was shared by both water mixer valve control systems. Use of the shared processor modules required a software update, adding a new EXECUTE state to the initial software state flow diagram, as illustrated in Fig. 2. The phase II configuration was designed incrementally from the phase I configuration block diagram, but both were compiled anew using Altera's Quartus II (Altera, 2009b) FPGA system designer and compiler. The Quartus II software supports incremental compilation, in which the initial configuration can be partitioned into smaller divisions that are "frozen" when updating the system configuration; in this case, the incremental compilation feature was not used. Fig. 5 shows the FPGA floor plans of both implementation phases. The ovals in the figures bound the logic usage and locations of the softcore processors, which differ between the implementation phases.

The Nios II softcore processor has five primary memory sections for its software, which are listed in Table 2. Each processor module has an interface to its internal on-chip RAM, and to the shared external SDRAM chip on the FPGA board. Therefore, different memory configurations could be tested by dividing the memory sections between these two RAM memories.

Table 2. Memory sections

Section:    Section purpose:
(.text)     Actual executable code memory
(.rodata)   Read-only data memory
(.rwdata)   Read/write data memory
Heap        Dynamic allocation memory
Stack       Temporary parameter memory

Table 3 describes the various memory configurations tested. Tests were made for six different memory configurations, mem1-mem6, for both implementation phases I and II. In each test, every processor module in the critical domain had the same memory configuration, e.g., in the mem3 memory configuration each processor module had three memory sections located in the external SDRAM memory and two sections in the on-chip RAM.

Table 3. Memory configurations

      (.text)   (.rodata)  (.rwdata)  Heap      Stack
mem1  on-chip   on-chip    on-chip    on-chip   on-chip
mem2  external  on-chip    on-chip    on-chip   on-chip
mem3  external  external   external   on-chip   on-chip
mem4  on-chip   external   external   on-chip   on-chip
mem5  on-chip   external   external   external  external
mem6  external  external   external   external  external
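The configurations of Table 3 can also be captured directly as data; the following small sketch (the helper name is our own) records which sections each configuration places in the external SDRAM and counts them:

```python
SECTIONS = (".text", ".rodata", ".rwdata", "heap", "stack")

# Table 3 as data: the sections placed in external SDRAM per configuration.
MEM_CONFIGS = {
    "mem1": set(),
    "mem2": {".text"},
    "mem3": {".text", ".rodata", ".rwdata"},
    "mem4": {".rodata", ".rwdata"},
    "mem5": {".rodata", ".rwdata", "heap", "stack"},
    "mem6": set(SECTIONS),
}

def external_count(name):
    """Number of memory sections placed in external SDRAM."""
    return len(MEM_CONFIGS[name])

for name in sorted(MEM_CONFIGS):
    print(name, external_count(name), "external section(s)")
```

For example, mem3 places three sections externally, matching the description in the text, while mem1 and mem6 are the all-on-chip and all-external extremes.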

Fig. 5. Phase I FPGA floor plan on the left and phase II floor plan on the right.

4. MEASUREMENTS

For testing the time behaviour of the system, each softcore processor in the critical domain emitted a cpuActive signal, which was connected to a physical I/O pin of the VTT SANTACRUZ I/O card. Each processor set the cpuActive signal immediately after the processor module was triggered, and reset the signal just before polling the next trigger event. All trigger signals from the internal trigger logic and from the processor modules (data_ready signals) of the critical domain were assigned to physical I/O pins of the VTT SANTACRUZ I/O card as well. These trigger signals were used to launch the next processor into action.

All signals were measured and recorded with a Yokogawa DL750 eight-channel digital oscilloscope, sampling at 2×10^6 samples per second (Δt = 500 ns). Each cpuActive signal and data_ready trigger signal was measured using the built-in pulse width measurement tool of the oscilloscope, and the results were captured in ASCII format for later statistical analysis. The signal waveforms were also recorded for each memory configuration mem1-mem6 and for both implementation phases.
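Captured waveforms of this kind can be reduced to pulse widths in a few lines; a sketch of the post-processing (assuming the ASCII capture is parsed into a list of 0/1 samples at Δt = 500 ns; the function name is our own):

```python
DT_NS = 500  # sampling interval at 2x10^6 samples per second

def pulse_widths(samples, dt_ns=DT_NS):
    """Return the widths (ns) of all high pulses in a 0/1 sample list."""
    widths, run = [], 0
    for s in samples:
        if s:
            run += 1        # still inside a high pulse
        elif run:
            widths.append(run * dt_ns)  # pulse just ended
            run = 0
    if run:                 # trace ended while still high
        widths.append(run * dt_ns)
    return widths

# Example: a synthetic cpuActive trace with two high pulses.
trace = [0, 1, 1, 1, 0, 0, 1, 1, 0]
print(pulse_widths(trace))  # -> [1500, 1000]
```

The same routine applies to both the cpuActive and data_ready traces, after which the per-configuration averages can be compared between phases I and II.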

Figures 6 and 7 present the combined graphs of the cpuActive signal waveforms from the phase I and phase II implementations. Fig. 6 is from the implementations with the memory configuration mem1, and Fig. 7 is from the implementations with the memory configuration mem6. The graphs above the time axis are from the phase I implementation before the reconfiguration, and the graphs below the time axis are from the phase II implementation. Values on the time axis are in µs. Note that the charts are not on the same time scale. When comparing these two figures, it is obvious that the execution time is much longer using external RAM memory than internal RAM memory. The addition of the new processor, sharing the same external RAM memory, led to significant deterministic errors. From Fig. 6 we can point out that the active times of the AD processor and DA processor almost doubled from the phase I to the phase II configuration, whereas the activity time of the Control processor remained the same.

Fig. 6. Waveforms of "cpuActive" signals from the memory configuration mem1.

Fig. 7. Waveforms of "cpuActive" signals from the memory configuration mem6.

The addition of the new sensor reading and actuator control was the source of the differences, which were expected and acceptable changes in time behaviour. The critical issue was maintaining the start time of the processor modules at the event trigger. There was a 6 µs time difference for the start of the AD processor between the phase I and phase II configurations. Using the internal on-chip RAM (mem1 configuration) with the phase II configuration resulted in a quicker start, explained by the conventional FPGA compiling method and the change in the software structure of the AD processor. There is a very significant difference in starting times when the processors used the external RAM (mem6 configuration), illustrated in Fig. 7. Adding the Control_module2 processor into the system resulted in a 37.5 µs delay in the start of the AD processor.

Maintaining the timing of the signal outputs from the processor modules was another system requirement. Figures 8 and 9 are the graphs of the same four configurations shown in the two previous figures, but for the trigger output (data_ready) signals of the processor modules. AD, CONTROL and DA are the initial data_ready signals, and AD2 is the new data_ready signal from the AD_module to the Control_module2. The differences in trigger signal time behaviour were under 10 µs using the internal RAM memories, and in the tens to hundreds of µs using the external memory.

Fig. 8. Waveforms of "data_ready" signals from the memory configuration mem1.

Fig. 9. Waveforms of "data_ready" signals from the memory configuration mem6.

Fig. 10 is a summary of the changes in average pulse widths of the data_ready trigger signals after the system update. Graphs AD+ and AD- describe the changes in the positive and negative pulse widths of the data_ready signal of the AD_module, and likewise CTR+/- of the CONTROL_module and DA+/- of the DA_module. The results are arranged by the length of the average change; therefore the columns of the charts appear in the order mem1, mem4, mem5, mem2, mem3 and mem6.

Fig. 10. Summary of changes in average pulse widths of "data_ready" signals.

The measurements point out that the memory configurations mem1, mem4 and mem5, which had the actual executable code linked to the internal on-chip memory, had the least change in system time behaviour. In particular the memory configurations mem1 and mem4, which also had the Heap and Stack memory sections linked to the on-chip memory, showed only very minor changes. For the mem1 memory configuration, the average change in the internal time behaviour of the system processors of the initial functionality after the system upgrade was 404.8 ns.
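The summary statistic behind Fig. 10 is simply a mean of per-signal pulse-width changes; a sketch of that calculation (the numbers below are made-up placeholders for illustration, not the measured values):

```python
# Hypothetical average pulse widths (ns) before and after the upgrade.
phase1 = {"AD+": 120_000, "CTR+": 310_000, "DA+": 95_000}
phase2 = {"AD+": 120_400, "CTR+": 310_300, "DA+": 95_500}

def mean_abs_change(before, after):
    """Mean absolute pulse-width change over matching signals, in ns."""
    diffs = [abs(after[k] - before[k]) for k in before]
    return sum(diffs) / len(diffs)

print(mean_abs_change(phase1, phase2), "ns")
```

Applied to the real mem1 captures, this kind of average is what yields the reported 404.8 ns figure.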

This experiment provides a good test of the interdependence and running-time differences between uniform and non-uniform memory architectures. Just adding multiple processor cores does not guarantee independent execution of software on the processors – the memory architecture is a key factor for the incrementally updateable modular system architecture. The time-critical home automation control system application was partitioned into independent processor modules: AD/measure, Control/calculate and DA/actuator control. While the primary design goal of this partitioning was module independence rather than performance, the multiprocessor implementation also performed well. According to the captured runtimes of the processors (Fig. 6), the multiprocessor took approximately 1600 µs for a control cycle in phase II. This was due to the overlapping in time of the runtimes of the processors. Without overlapping, the sum of the processor runtimes is approximately 3300 µs (AD 500 µs, CONTROL 1150 µs, CONTROL2 1150 µs, DA 500 µs). Due to the narrow duty cycle of the processor modules, the multiprocessor system ran 52% faster than a similar but single-processor system. There is more time to make calculations for other processes than in a single-processor implementation. This allows an incremental system upgrade to be made by adding an independent code section into an existing processor module. This requires an advanced multi-trigger interface for each processor module to allow different application code sections to be triggered independently.

5. CONCLUSIONS

This paper presented an approach for using reconfigurable FPGA computing technology as a multiprocessor system implementation platform for an incrementally updateable modular system. The modular embedded home automation control system was designed in the MATLAB/Simulink environment and then implemented in an FPGA as a multiprocessor system.
Component-based architecture offered functional independence, and the FPGA-based multiprocessor implementation of the architecture ensured temporal independence for the system components. With the best memory configuration, the average change in the internal time behaviour of the system processors of the initial functionality after the system upgrade was 404.8 ns. The independent multiprocessor system also gave 52% better computing performance compared to a similar but single-processor system. Most importantly, the inactive time of the processors can be used for other measurements or calculations. For example, in this study we demonstrated the addition of a new control system as a processor added to the initial system, but in some cases it might be possible to add additional control systems as new application code sections in existing processors, provided there is sufficient available time. This requires a new multi-trigger interface for the processor modules to allow application code sections to be triggered independently. It also requires

internal system self-validation mechanisms and algorithms to measure and calculate the system's temporal behaviour while keeping the timing under control. Both refinements are points of ongoing research. The embedded home automation control system was the first application field where the modular FPGA-based system architecture was evaluated. Our next step is to evaluate the modular system architecture in the field of distributed synchronous sensing.

6. ACKNOWLEDGEMENTS

This work has been done in the EU ITEA2 USENET – Ubiquitous M2M Service Networks project. We would like to acknowledge TEKES, the Finnish Funding Agency for Technology and Innovation, for support of the USENET project in the Ubicom research programme.

REFERENCES

Åkerholm, M., Carlson, J., Fredriksson, J., Hansson, H., Håkansson, J., Möller, A., Pettersson, P. and Tivoli, M. (2007). The SAVE approach to component-based development of vehicular systems. Journal of Systems and Software, vol. 80, no. 5, pp. 655-667.

Altera (2008). Cyclone II Device Handbook, Volume 1, Altera, San Jose, CA, USA.

Altera (2009a). Nios II Processor Reference Handbook, Version NII5V1-9.0, Altera, San Jose, CA, USA.

Altera (2009b). Quartus II Handbook, Version 9.0, Altera, San Jose, CA, USA.

Bouyssounouse, B. and Sifakis, J. (2005). Embedded Systems Design, Volume 3436, Springer, Berlin / Heidelberg, Germany.

Fenn, J.L., Hawkins, R.D., Williams, P.J., Kelly, T.P., Banner, M.G. and Oakshott, Y. (2007). The Who, Where, How, Why and When of Modular and Incremental Certification. Proceedings of the 2nd Institution of Engineering and Technology International Conference on System Safety, pp. 135-140, London, UK.

Hammett, R. (2003). Flight-critical distributed systems: design considerations [avionics]. IEEE Aerospace and Electronic Systems Magazine, vol. 18, no. 6, pp. 30-36.

Kopetz, H., Obermaisser, R., El Salloum, C. and Huber, B. (2007). Automotive Software Development for a Multi-Core System-on-a-Chip. Proceedings of the 4th International Workshop on Software Engineering for Automotive Systems, pp. 2-9, IEEE Computer Society, Minneapolis, USA.

Nicholson, M., Bate, I. and McDermid, J. (2000). Generating and Maintaining a Safety Argument for Integrated Modular Systems. Proceedings of the 5th Australian Workshop on Safety Critical Systems and Software, pp. 31-41, Australian Computer Society, Melbourne, Australia.

Rushby, J. (2002). Modular Certification, SRI International, Menlo Park, CA, USA.