Design of fault-tolerant microprocessors for space applications


Accepted Manuscript

M.S. Gorbunov

PII: S0094-5765(18)32179-9
DOI: https://doi.org/10.1016/j.actaastro.2019.04.029
Reference: AA 7462
To appear in: Acta Astronautica
Received Date: 29 December 2018
Revised Date: 26 March 2019
Accepted Date: 11 April 2019

Please cite this article as: M.S. Gorbunov, Design of fault-tolerant microprocessors for space applications, Acta Astronautica (2019), doi: https://doi.org/10.1016/j.actaastro.2019.04.029. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT

Design of fault-tolerant microprocessors for space applications*

M.S. Gorbunov^a

^a Scientific Research Institute of System Analysis of the Russian Academy of Sciences (SRISA), Nakhimovsky prosp. 36-1, Moscow, Russia, 117218

Abstract


We discuss the main design concepts for fault-tolerant microprocessors, the Instruction Set Architectures (ISA) of microprocessors for space applications, and the achievable characteristics, considering the KOMDIV microprocessors designed by SRISA. The trade-off between fault-tolerance, performance and power consumption is considered for microprocessors designed using silicon-on-insulator (SOI) and bulk CMOS technologies.

Keywords: microprocessor, fault-tolerance, CMOS, Single-Event Effects

Abbreviations


CMOS: Complementary Metal-Oxide Semiconductor (technology).
SOI: Silicon on Insulator.
SOS: Silicon on Sapphire.
DICE: Dual Interlocked CEll.
SEU: Single Event Upset.
SET: Single Event Transient.
SEL: Single Event Latchup.
RHBD: Radiation Hardened By Design.
RHBP: Radiation Hardened By Process.
VLSI: Very Large-Scale Integration (circuit).
ISA: Instruction Set Architecture.
ITAR: International Traffic in Arms Regulations.
VME: VersaModule Eurocard bus.
SIMD: Single Instruction, Multiple Data.
LET: Linear Energy Transfer.
EDAC: Error Detection And Correction.
ECC: Error Correcting Codes.
TID: Total Ionizing Dose (effects).
SEE: Single Event Effects.
RISC: Reduced Instruction Set Computer.
TMR: Triple Modular Redundancy.

* The study was supported by the state fundamental research program No. 0065-2019-0008.
Email address: [email protected] (M.S. Gorbunov)
Preprint submitted to Acta Astronautica, March 26, 2019

1. Introduction


The safety of space systems depends on the reliability of the electronic components used in their control and data processing subsystems. The space environment causes various radiation effects that lead to errors or even failures of electronic components [1, 2, 3]. If the system remains functional after an error, the error is called soft; otherwise, it is a hard error. Caused by a single particle strike in a sensitive volume of an integrated circuit, Single Event Effects (SEE) are the most important factor defining the reliability of modern space-borne electronic systems and must be taken into account at the design stage [4]. Single Event Upsets (SEU) and Single Event Transients (SET) are soft errors in memory and combinational elements, respectively. Single Event Latchups (SEL) are usually hard errors caused by the activation of the parasitic thyristor in the bulk CMOS structure [4, 5]. SEL can be mitigated by using current limiters and/or supply voltage monitors: the thyristor can be deactivated by switching the supply voltage off, and if the switching is quick enough, the component can remain functional after the restart. In bulk CMOS, SEL can also be avoided by design using guard rings in the layout. SOI/SOS technologies do not have such parasitic structures and hence are SEL free. Soft errors can be eliminated using a particular cell topology or layout and also by EDAC.

Total Ionizing Dose (TID) effects are less critical for components manufactured using modern CMOS technologies with very thin gate oxides. Unlike TID, SEE have become more critical in small elements, because a single particle can induce multiple upsets there. Unlike high-performance system design, which has a short execution time as its primary goal [6], the design of a space system usually gives the highest priority to reliability, power consumption, and weight, so the designer tries to implement all possible and reasonable solutions to protect the design from SEU and SET.
The solutions include different kinds of modular redundancy, error correcting codes (ECC) [7, 8] and particular cell designs (e.g., the well-known DICE cells [9]) following the radiation hardening by design (RHBD) concept [10]. The high price of fault-tolerance and radiation hardness [11] is usually believed to be inevitable, and the optimality of the resulting design is disputable.

In this paper, we consider the main Instruction Set Architectures (ISA) of microprocessors for space applications and then discuss the achievable characteristics, considering the microprocessors designed by the Scientific Research Institute of System Analysis of the Russian Academy of Sciences (SRISA).

The choice of ISA for a space system is a complex problem influenced by tradition, the availability and/or convenience of software development, and ITAR considerations, but the main argument for choosing a microprocessor is its reliability and fault-tolerance. Fig. 1 shows the history and prospects of ISAs used in space programs (based on the original diagram from [12]). The spacecraft computer system designer has to choose between specially designed, expensive rad-hard parts and commercial off-the-shelf (COTS) products that have passed extensive hardness assurance tests [13, 14].

The Intel x86 ISA is widely used not only in personal computers but also in space systems. For example, the Space Shuttles used 80386 microprocessors, and the Hubble Space Telescope (HST) still runs on an 80486 [15]. Although some x86 microprocessors were reported to be sensitive to SEL [16], the potential problems with radiation effects in HST are expected from other parts [17].

Microprocessors are often chosen for space missions after their reliability has been confirmed in industrial applications. For example, microprocessors based on MIL-STD-1750A, the military standard 16-bit computer ISA, were first used in aircraft and then in various NASA and ESA programs, such as the Cassini probe (launched in 1997), Mars Express (2003), Venus Express (2005) and Rosetta (2004) [18]. A remarkable example of this ISA is the MA31750 microprocessor (up to 3 MIPS at a 12 MHz clock frequency), designed using silicon on sapphire (SOS) technology [18], which makes it latchup free (SEL free).

Figure 1: History and perspectives of Instruction Set Architectures (ISAs) used in space programs (based on the diagram from [12]).

A large number of space programs have used PowerPC-based microprocessors, which were radiation hardened either by process (RHBP concept) or by design (RHBD concept). In all cases, the special microprocessors for space applications had a lower clock frequency and lower performance than their commercial analogs (e.g., because they omitted a large L2 cache that could be a source of errors in the space environment [19]). PowerPC computing systems were used in such NASA programs as Deep Impact (2004), the Mars (2006) and Lunar (2009) Reconnaissance Orbiters, the Opportunity (2003), Spirit (2003) and Curiosity (2011) Mars rovers, and many others. As early as 2007, commercial, civil and Department of Defense (DoD) satellites based on 150 BAE Systems RAD750 single-board computers were expected to be launched within 2 years [20]. Like the MIL-STD-1750A microprocessors, PowerPC systems were initially designed for commercial and industrial applications. For example, the central computer of Orion, based on the IBM PowerPC 750FXA (first launched in 2002), is a Honeywell International Inc. flight computer originally built for Boeing's 787 jet airliner, and the Orion spaceship is now planned to travel to an asteroid in the 2020s and to carry astronauts to and from Mars in the 2030s [21]. Remarkably, the same microprocessor was used in the Apple iBook G3 in 2003. The newest BAE Systems PowerPC microprocessor family, the 55xx, is based on 45 nm Silicon-on-Insulator technology from a trusted foundry and also includes some RHBD solutions [22]. The performance has changed from 1800 MIPS per board in 2004 to 3.0 Dhrystone MIPS/MHz per chip in 2017, with a throughput of 5.6 billion operations per second (GOPS) / 3.7 GFLOPS.

MIPS microprocessors are also used in space systems. In the early 1990s, when commercial hardware and software support for the MIPS R3000 was widespread [23], Harris Corporation and LSI Logic Corporation developed Mongoose I, the first 32-bit radiation-hardened commercially compatible microprocessor [24]. The Mongoose I based RH3000 VME module was then developed.
It used both the RHBP and RHBD concepts: the microprocessor was produced using SOI/SOS technology and included EDAC memory protection and other fault tolerance techniques [23]. It is noteworthy that the SEU test results showed nearly an order of magnitude improvement over the 80386 [24]. The Synova Inc. Mongoose V MIPS chip was produced using SOI technology and achieved 10 MIPS at 12 MHz. It was used for the Earth Observer 1 (EO-1) mission (2000), the Wilkinson Microwave Anisotropy Probe (WMAP) (2001), Space Technology 5 (ST-5) (2006) and New Horizons, launched in January 2006 to study Pluto [24].

Although ESA funded the development of a successor for the MIL-STD-1750 based processors, such chips are dual-use products and are under strict US export regulations (ITAR) [18]. This led to the choice of SPARC, the open, scalable processor architecture developed by Sun Microsystems. The first chip was the SPARC V7 ERC32 32-bit microprocessor designed by Gaisler Research AB (now Cobham Gaisler AB) and manufactured by Temic GmbH (now Atmel) [18]. Systems-on-chip are represented by the SPARC V8 fault-tolerant (FT) UT699 LEON3FT chip (manufactured using 250 nm technology), which demonstrated up to 89 DMIPS performance and was used, for example, in the Chandrayaan-1 mission (2008) [18, 24].

The ARM ISA is planned for use in space systems in the USA and EU. DAHLIA (Deep sub-micron microprocessor for spAce rad-Hard appLIcation Asic) is a Very High-Performance microprocessor System on Chip (SoC) based on the STMicroelectronics European 28 nm Fully Depleted SOI (FDSOI) technology with multi-core ARM processors for real-time applications [25]. Radiation hardness test results were published recently for the 28 nm Ultra-Thin Body and Box (UTBB) FDSOI technology, demonstrating promising results [26]. The performance is expected to be 20 to 40 times that of the existing SoCs for space. The DAHLIA consortium includes Thales Alenia Space, Airbus Defense&Space and other European companies and institutions. At the same time, NASA is developing the High-Performance Spaceflight Computing (HPSC) Processor Chiplet. This four-year project is expected to deliver a rad-hard space ARM processor that provides optimal power-to-performance for upgradeability, software availability, ease of use, and cost [27]. The HPSC will use RHBD standard cell libraries and the ARM A53 processor with its internal NEON single instruction, multiple data (SIMD) design.

The last ISAs shown in Fig. 1 are RISC-V and the MIPS-like KOMDIV. The prospects of the former for space applications are beyond the scope of this paper. Microprocessors based on the latter ISA are discussed in the next section.

As one can see, there is almost always a gap of ∼10 years between the design of a microprocessor and its first use in space. A fault-tolerant microprocessor may remain in use in space for 20 years or more, and its commercial analogs achieve much better performance during such a period. Another significant trend is the continuously growing complexity of systems-on-chip: modern space systems require not only high-speed computation but also high-speed interfaces and complex functionality.

2. Fault-tolerant KOMDIV microprocessors for space systems


The Scientific Research Institute of System Analysis of the Russian Academy of Sciences (SRISA) is a Russian research and development organization. In 2003, the 1890VM1T [28] became the first 32-bit RISC microprocessor designed and manufactured in the Russian Federation. It is a MIPS-like microprocessor with the Russian ISA KOMDIV (a Russian acronym meaning “Pipelined Single-Chip Microprocessor for Intensive Computations”). The project gave rise to the two main development lines of SRISA microprocessors: the 32-bit KOMDIV32 and the 64-bit KOMDIV64. A brief genealogy of KOMDIV microprocessors and the corresponding technology nodes is shown in Fig. 2. Note that “SOI” means a Partially Depleted SOI process and “CMOS” means bulk CMOS technologies.

Figure 2: The brief “genealogy” of KOMDIV32 and KOMDIV64 microprocessors designed by the Scientific Research Institute of System Analysis of the Russian Academy of Sciences. “IC/SC” stands for “interface/system controller”; SoC and SiP are system-on-chip and system-in-package, respectively.

Although manufactured using a relatively old technology process, the 1890VM1T chip (with a clock of up to 33 MHz) became relatively widespread in Russian industry, including space systems. Its early modification is still used in the TsVM-101 onboard digital computer of the Soyuz manned spaceship and the Progress cargo ship [29, 30]. Its descendant, the 1890VM2T, is also used in space, e.g. in the TsVM20 quadruple modular redundant onboard computers of the Meteor-M weather observation satellites [31]. As the chips are manufactured using bulk CMOS technology, SEL is possible and, although the threshold LET is above the upper LET limit achievable in Cf-252 experiments [32, 33], redundancy must be applied at the system level. The needed fault-tolerance is also achieved using TMR in these cases.

The second generation of KOMDIV32 microprocessors (5890VE1T, 5890VM1T, 1900VM2T) for space applications used SOI technology to prevent SEL and special RHBD techniques, such as SEU-tolerant SRAM cell design, parity checks, transistor layout, and TMR, to increase the fault-tolerance [33, 34]. The 5890VM1T and 1900VM2T are software, electrically and mechanically compatible with the 1890VM1T and 1890VM2T, respectively, which makes the former chips attractive as substitutes for the latter, improving the mass and size characteristics of a system [28].

The 1907VM01A4 and 1907VM044 (referred to below as “VM01A4” and “VM044” for simplicity) are the third generation of KOMDIV32 microprocessors for space, with an 8-stage pipeline. These microprocessors are intended for the control modules of space systems. Other SOI KOMDIV32 and KOMDIV64 SoCs were designed for complex space systems operating on large amounts of data that must be quickly transferred to ground stations for subsequent processing. The GAMMA-400 space telescope is a remarkable example of such an application [35].


Both designs considered here are based on the same standard cell library, except for the DICE versions of the cache memory cells, which were designed for VM044 only. The chips are pin-compatible and have the same topology, shown in Fig. 4. Both microprocessors have the same design of CPU and FPU register files (RFs). The 16×48 bit RFs have two read and two write ports (2r2w) and use DICE cells protected by a Hamming (12,8) code [7] per byte. Unprotected register files have a threshold Linear Energy Transfer (LET) of >90 MeV·cm²/mg and an SEU cross section of <6.3·10⁻¹¹ cm²/bit. Static RAM blocks have a decoder and control logic protected by Double Modular Redundancy (DMR). All blocks of 1907VM044 (except those in the CPU core) use Distributed Triple Modular Redundancy (DTMR), which triplicates both logic and flip-flops but keeps the clock common to all copies.
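As an illustration of the per-byte protection mentioned above, the following is a minimal Python sketch of a textbook Hamming (12,8) single-error-correcting code. The bit layout (parity bits at positions 1, 2, 4 and 8) and all function names are assumptions made for this example; the actual encoder used in the KOMDIV register files is not described here. Four such 12-bit codewords occupy 48 bits, which would be consistent with the stated RF width for a 32-bit word.

```python
# Textbook Hamming (12,8) sketch. Positions are numbered 1..12; bit (k - 1)
# of the integer codeword holds position k. This layout is an assumption.
PARITY_POSITIONS = (1, 2, 4, 8)
DATA_POSITIONS = (3, 5, 6, 7, 9, 10, 11, 12)


def hamming_12_8_encode(byte: int) -> int:
    """Encode one data byte into a 12-bit codeword."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in range 0..255")
    code = 0
    # Place the 8 data bits at the non-power-of-two positions.
    for i, pos in enumerate(DATA_POSITIONS):
        if (byte >> i) & 1:
            code |= 1 << (pos - 1)
    # Each parity bit makes the XOR over the positions it covers equal to 0.
    for p in PARITY_POSITIONS:
        parity = 0
        for pos in range(1, 13):
            if pos & p and (code >> (pos - 1)) & 1:
                parity ^= 1
        if parity:
            code |= 1 << (p - 1)
    return code


def syndrome(code: int) -> int:
    """XOR of the positions of all set bits; 0 for a valid codeword."""
    s = 0
    for pos in range(1, 13):
        if (code >> (pos - 1)) & 1:
            s ^= pos
    return s


def hamming_12_8_decode(code: int) -> tuple[int, bool]:
    """Return (data byte, corrected flag); fixes any single-bit upset."""
    s = syndrome(code)
    corrected = False
    if 1 <= s <= 12:
        code ^= 1 << (s - 1)  # the syndrome is the position of the flipped bit
        corrected = True
    elif s > 12:
        raise ValueError("uncorrectable (multi-bit) error")
    byte = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (code >> (pos - 1)) & 1:
            byte |= 1 << i
    return byte, corrected
```

A single upset in any of the 12 stored bits is corrected on read. A plain Hamming code like this one cannot reliably detect double-bit errors, which is one reason to combine it with upset-tolerant cells.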


The CPU core of VM044 uses Block TMR (BTMR), as previously described in [34] and shown in Fig. 3.


Figure 3: KOMDIV microprocessor cores (a — VM01A4, b — VM044 with BTMR): MBIST is Memory Built-in Self-test, TREU is Typical Reliability Ensuring Unit [34], CPU is the integer central processing unit, ALU is the integer arithmetic logic unit, I-Cache and D-Cache are instruction and data caches correspondingly, CP0 is system controlling co-processor.

The DTMR principle is illustrated in Fig. 5: flip-flops, logic and majority voters (marked “V”) [36] are triplicated, while the clock is common.


Figure 4: Topology of the microprocessors under study.

Figure 5: Distributed TMR principle: flip-flops, logic and voters are triplicated, the clock is common.
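The principle in Fig. 5 can be sketched behaviorally as follows. This Python model is an illustration only: the function names, the 8-bit counter used as the "logic", and the way an upset is injected are assumptions made for the example, not the actual VM044 netlist.

```python
from typing import Callable, Optional, Tuple

State = Tuple[int, int, int]


def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote."""
    return (a & b) | (a & c) | (b & c)


def dtmr_clock(
    state: State,
    logic: Callable[[int], int],
    upset: Optional[Tuple[int, int]] = None,
) -> State:
    """Advance a triplicated register stage by one (common) clock edge.

    state -- outputs of the three flip-flop copies
    logic -- combinational next-state function, instantiated three times
    upset -- optional (lane, bit mask) modelling an SEU in one flip-flop
    """
    # Three voters, each fed by all three flip-flop outputs.
    voted = [majority(*state) for _ in range(3)]
    # Three independent copies of the logic, one per lane.
    next_state = [logic(v) for v in voted]
    if upset is not None:
        lane, mask = upset
        next_state[lane] ^= mask  # flip the struck bits in one copy only
    return (next_state[0], next_state[1], next_state[2])


def counter(value: int) -> int:
    """Hypothetical 8-bit counter used as the example logic."""
    return (value + 1) & 0xFF


if __name__ == "__main__":
    state = dtmr_clock((0, 0, 0), counter, upset=(1, 0x10))
    print(state)                       # lane 1 now holds a corrupted value
    print(dtmr_clock(state, counter))  # the voters mask it on the next clock
```

Because every lane re-votes on each clock, an upset in one flip-flop copy is masked and overwritten one cycle later, provided that no second copy is struck before then.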


3. Results for 1907VM01A4 and 1907VM044


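The common fault-tolerance RHBD techniques are well known, but the optimality of the resulting design is also essential. To check the optimality, it is useful to compare two modern fault-tolerant microprocessors that are very close to each other in architecture and microarchitecture. As we show below, some RHBD techniques might be overkill even for space applications.

Table 1 shows the differences between the considered microprocessors. The data are taken from previous studies and datasheets. One can see from the table that the application of DTMR, DICE cells, etc. (provided that the chip area is constant) leads to a significant loss in performance.

Table 1: Microprocessors Under Test.

| Parameter | VM01A4 | VM044 |
| Core parameters | | |
| Clock frequency, MHz | 100 | 66 |
| Power consumption, W | <5 | <9 |
| CPU performance, MIPS | ~85 | ~48 |
| FPU performance, MFLOPS | ~20 | ~14 |
| Cache size (Data+Instructions), KB | 8+8 | 4+4 |
| Cache associativity | 2-way set associative | direct mapped |
| Cache memory cell type | 6T | DICE |
| Cache memory protection | parity check | parity check |
| Cache memory cell (unprotected) threshold LET, MeV·cm²/mg | ~5 | ~90 |
| Cache memory cell (unprotected) SEU cross section, ×10⁻¹³ cm²/bit | 5.0·10⁴ | 7.2 |
| TMR type | None | Block, Distributed |
| MIL-STD-1553 Memory | | |
| Size | 2K×44 | 2K×44 |
| Memory cell type | 2-port 8T | 2-port DICE |
| Cache memory protection | Hsiao (22,16) [8] | Hsiao (22,16) |
| Threshold LET, MeV·cm²/mg | ~90 (not protected by EDAC) | ~65 (protected by EDAC) |
| SEU cross section (unprotected), ×10⁻¹² cm²/bit | 5.2 | 1.3 |

The upset cross section strongly depends on the executed test program: different programs activate different sensitive nodes [37]. The situation is even more complicated for fault-tolerant systems because their sensitivity depends on the performance of the built-in Error Detection And Correction (EDAC) subsystems. This fact makes “traditional” fault-tolerance metrics (e.g., the cross section vs. LET dependence) impractical. The authors of [38] suggest a rough way to calculate the cross section of a complicated system:

$$\sigma_i = \sum_j D_j \, \sigma_{ij}, \tag{1}$$

where $\sigma_i$ is the cross section for system event type $i$, the sum is over all device subsystems $j$, $D_j$ is the duty cycle for device subsystem $j$, and $\sigma_{ij}$ is the cross section for subsystem $j$ to exhibit system event type $i$. Unfortunately, this complex cross section concept is impractical due to the large numbers of $i$ and $j$ in a modern SoC. Even if the value from Eq. (1) is obtained experimentally, it cannot be used for soft error rate (SER) calculation, because it has no geometrical meaning and does not describe any single sensitive volume.

For this reason we introduced the mean fluence between failures (MFBF) metric [39] and used it to compare the fault-tolerant microprocessors. The physical meaning of MFBF is the mean fluence that must be accumulated after the last error to produce the next error. It can be compared to the fluence accumulated during some critical operation (the critical fluence), e.g., during the transfer of accumulated data from the spacecraft (satellite) to the ground data center. If the MFBF exceeds the critical fluence, on average no more than one error is expected to occur during the critical operation.

The heavy-ion tests were performed at the IS OI 400-N heavy ion test facility designed by the Branch of JSC “United Rocket and Space Corporation” - “Institute of Space Device Engineering” (URSC-ISDE) with the support of the Government Corporation “Roscosmos”. The facility is based on the U-400 cyclotron of the Flerov Nuclear Reaction Laboratory of the Joint Institute for Nuclear Research (JINR), Dubna, Moscow region, Russia. It was operated by specialists of URSC-ISDE, Moscow, Russia. Heavy ions were used at normal incidence and at room temperature only. The Linear Energy Transfer (LET) was calculated using the SRIM [40] tool, taking the metal layers into account. All parts under study were decapsulated before the tests. Table 2 shows the parameters of the heavy ion beams.

Table 2: Heavy Ions Characteristics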


| Heavy ion | Energy, MeV/nucl. | LET, MeV·cm²/mg | Range in Si, µm |
| ¹³²Xe | 3.8 | 69 | 41 |
| ²⁰Ne | 3.4 | 6.6 | 37 |

We used the combined “Paranoia” [41] test with a test of the MIL-STD-1553 memory. The “Paranoia” test executes the double-precision Paranoia FPU validation test bench [42]. Numerous tests are performed to validate the floating-point performance of a microprocessor: FPU calculations are verified against values calculated in the integer unit. It is important for heavy ion tests that the software cross-checking detects any errors in the FPU calculations that would otherwise go undetected. The program reports such errors as “FPU data error”. A cross-check between the integer unit and the FPU is also performed, so undetected errors in the integer unit cause a reported failure. Operating at maximum frequency, each iteration takes 29 ms for VM01A4 and 84 ms for VM044.

During the MIL-STD-1553 memory tests, the standard patterns 0x00000000 or 0xFFFFFFFF are written, after which the reading cycles start. If a value read from memory differs from the current pattern, the erroneous data are reported and the correct value is written back. The pattern is changed after 300 cycles or after 1000 detected errors.

It is important that all built-in fault mitigation techniques were enabled in both microprocessors during the experiment. If an error was detected (but not corrected) in the cache memory, the microprocessor used the external SRAM to obtain the correct data. Note that the packaged on-board elements (including the external SRAMs and ROMs) were not affected by heavy ions because of the relatively small ion energies (with ranges in silicon of ∼30 µm).

The test board is designed for the vacuum chamber of the Roscosmos heavy ion facility. The board carries the microprocessor, the external SRAM and the ROM. A UART provides the output. Two boards were installed in the chamber at the same time, which is possible due to the very low beam flux non-uniformity (∼1%).

4. Experimental Results


Four chips (two of each microprocessor type) were used in the experiments: one pair of VM01A4 and VM044 was used in the odd-numbered sessions, and another pair in the even-numbered sessions. Table 3 shows the MFBF results. MFBF values are calculated for any error, without attributing it to a particular kind (SEU in memory, I/O errors, etc.). An extensive analysis of possible errors is provided in [39]. We detected no errors with ²⁰Ne beams at fluences from 1.2×10⁶ cm⁻² (flux ∼10³ cm⁻²·s⁻¹) to 1.3×10⁷ cm⁻² (flux ∼10⁴ cm⁻²·s⁻¹). Although the fluence between errors is larger for VM044 than for VM01A4 at all fluxes (as could be expected), it is still high enough even for the latter microprocessor.

Table 3: MFBF

| No. | Flux, cm⁻²·s⁻¹ | MFBF (VM01A4), ×10⁶ cm⁻² | MFBF (VM044), ×10⁶ cm⁻² |
| 1 | 10⁴ | 1.6 | 1.9 |
| 2 | 10⁴ | 0.6 | 3.9 |
| 3 | 10³ | 1.9 | 4.0 |
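To make the Table 3 numbers easier to compare, the short Python sketch below computes the VM044/VM01A4 MFBF ratio per session and the beam time needed to accumulate one MFBF at the listed flux. The estimator `estimate_mfbf` (accumulated fluence divided by the number of observed errors) is an assumption used for illustration; the exact procedure is the one given in [39]. All names are hypothetical.

```python
from typing import NamedTuple


class Session(NamedTuple):
    number: int
    flux: float         # cm^-2 s^-1
    mfbf_vm01a4: float  # cm^-2
    mfbf_vm044: float   # cm^-2


# Values transcribed from Table 3.
TABLE_3 = [
    Session(1, 1e4, 1.6e6, 1.9e6),
    Session(2, 1e4, 0.6e6, 3.9e6),
    Session(3, 1e3, 1.9e6, 4.0e6),
]


def estimate_mfbf(total_fluence: float, error_count: int) -> float:
    """Assumed estimator: accumulated fluence per observed error."""
    if error_count <= 0:
        raise ValueError("at least one error is needed for an estimate")
    return total_fluence / error_count


def gain(session: Session) -> float:
    """How many times larger the VM044 MFBF is than the VM01A4 MFBF."""
    return session.mfbf_vm044 / session.mfbf_vm01a4


def seconds_to_accumulate(mfbf: float, flux: float) -> float:
    """Beam time needed to accumulate one MFBF at a constant flux."""
    return mfbf / flux


def exceeds_critical_fluence(mfbf: float, critical_fluence: float) -> bool:
    """Criterion from the text: the MFBF should exceed the critical fluence."""
    return mfbf > critical_fluence


if __name__ == "__main__":
    for s in TABLE_3:
        t = seconds_to_accumulate(s.mfbf_vm01a4, s.flux)
        print(f"session {s.number}: gain {gain(s):.2f}, VM01A4 time {t:.0f} s")
```

The gain from the heavier mitigation in VM044 varies noticeably between sessions (for example, a factor of 6.5 in session 2), which is in line with the observation that the benefit of radical techniques depends on the conditions and should be weighed against their cost.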

The results show that radical techniques like TMR do increase the mean fluence between failures (MFBF), but the ability of both microprocessors to recover after errors makes it possible to choose the appropriate design for the required system parameters, including power consumption and performance requirements. Applying radical mitigation techniques that significantly decrease performance and increase power consumption might be ineffective.

5. Discussion: future KOMDIV32 and KOMDIV64 Microprocessors for Space

The requirements of modern space systems for fault-tolerant high-speed interfaces and high-performance computation at minimal power consumption lead to the transfer of solutions from the commercial and industrial sectors to highly reliable space projects, as one can see from the previous sections. Modern KOMDIV64 microprocessors are designed using commercial 65 nm general-purpose bulk CMOS technology, as can be seen from Fig. 2. To satisfy the reliability requirements, dedicated standard cell and input/output libraries and IP blocks were designed by SRISA. The primary objective of applying the RHBD concept is to eliminate the possibility of SEL and to decrease the error probability and error rate as much as possible. The solution is to combine known RHBD techniques, such as guard rings, with EDAC. The application of TMR and/or SEU-tolerant SRAM cells is useful only in functionally critical nodes, i.e. in register files. Preliminary results show that such solutions can keep the performance high enough while the radiation and fault tolerance is also kept at the desired level. Therefore, prospective applications of KOMDIV32 are considered only for TMR-based parts for mission-critical control circuits of a satellite or the low-power control systems of small spacecraft (like CubeSats), while high-performance processing of large amounts of data can be done by KOMDIV64 65 nm SoCs with various high-speed interfaces such as RapidIO, SpaceWire, etc.

6. Conclusions

The constantly growing complexity of space system functionality requires new systems-on-chip, i.e., microprocessors with various high-speed interfaces. To satisfy the space requirements, such systems-on-chip must be effective and optimal in terms of energy consumption, fault-tolerance, and performance. Future reliable space systems will likely be designed using sub-100 nm CMOS technology and RHBD concepts, but the design of such parts requires new metrics and a methodology that take the optimality concerns into account.

References

[1] N. Smirnov, Ensuring safety of space flights, Acta Astronautica 135 (2017) 1 – 5 (2017). doi:https://doi.org/10.1016/j.actaastro.2017.03.028.

[2] K. Gustafsson, L. Sihver, D. Mancusi, T. Sato, K. Niita, Simulations of the radiation environment at iss altitudes, Acta Astronautica 65 (1) (2009) 279 – 288 (2009). doi:https://doi.org/10.1016/j.actaastro.2009.01.040.

ACCEPTED MANUSCRIPT

REFERENCES

13

RI PT

[3] S. Avakyan, V. Kovalenok, V. Savinykh, A. Ivanchenkov, N. Voronin, A. Trchounian, L. Baranova, The role of a space patrol of solar X-ray radiation in the provisioning of the safety of orbital and interplanetary manned space flights, Acta Astronautica 109 (2015) 194 – 202 (2015). doi:https://doi.org/10.1016/j.actaastro.2014.10.025.

[4] P. E. Dodd, M. R. Shaneyfelt, J. R. Schwank, J. A. Felix, Current and future challenges in radiation effects on cmos electronics, IEEE Transactions on Nuclear Science 57 (4) (2010) 1747–1763 (Aug 2010). doi: 10.1109/TNS.2010.2042613.

M AN U

SC

[5] T. Tomioka, Y. Okumura, H. Masui, K. Takamiya, M. Cho, Screening of nanosatellite microprocessors using californium single-event latch-up test results, Acta Astronautica 126 (2016) 334–341 (Sep. 2016). doi:10.1016/ j.actaastro.2016.05.004. [6] V. Fratin, D. Oliveira, P. Navaux, L. Carro, P. Rech, Energy-delay-fit product to compare processors and algorithm implementations, Microelectronics Reliability 84 (2018) 112 – 120 (2018). doi:https://doi.org/10.1016/ j.microrel.2018.03.019. [7] R. W. Hamming, Error detecting and error correcting codes, The Bell System Technical Journal 29 (2) (1950) 147–160 (April 1950). doi: 10.1002/j.1538-7305.1950.tb00463.x.

TE D

[8] M. Y. Hsiao, A class of optimal minimum odd-weight-column sec-ded codes, IBM Journal of Research and Development 14 (4) (1970) 395–401 (July 1970). doi:10.1147/rd.144.0395. [9] T. Calin, M. Nicolaidis, R. Velazco, Upset hardened memory design for submicron CMOS technology, IEEE Transactions on Nuclear Science 43 (6) (1996) 2874–2878 (December 1996).

AC C

EP

[10] D. G. Mavis, D. R. Alexander, Employing radiation hardness by design techniques with commercial integrated circuit processes, in: 16th DASC. AIAA/IEEE Digital Avionics Systems Conference. Reflections to the Future. Proceedings, Vol. 1, 1997, pp. 2.1–15–22 vol.1 (Oct 1997). doi:10.1109/DASC.1997.635027. [11] R. Ginosar, Survey of Processors for Space, in: DASIA 2012 - DAta Systems In Aerospace, Vol. 701 of ESA Special Publication, 2012, p. 10 (Aug. 2012). [12] J. Gaisler, The SPARC history in SPACE, in: 11th ESA Workshop on Avionics, Data, Control and Software Systems (ADCSS), 2017 (Accessed 13.11.2018), https://indico.esa.int/event/182/contributions/1526/ (2017 (Accessed 13.11.2018)). [13] J. A. Felix, J. R. Schwank, M. R. Shaneyfelt, J. Baggio, P. Paillet, V. Ferlet-Cavrois, P. E. Dodd, S. Girard, E. W. Blackmore, Test procedures for proton-induced single event latchup in space environments,

ACCEPTED MANUSCRIPT

REFERENCES

14

IEEE Transactions on Nuclear Science 55 (4) (2008) 2161–2165 (Aug 2008). doi:10.1109/TNS.2008.2000773.

RI PT

[14] M. Pignol, COTS-based applications in space avionics, in: 2010 Design, Automation Test in Europe Conference Exhibition (DATE 2010), 2010, pp. 1213–1219 (March 2010). doi:10.1109/DATE.2010.5456992.

SC

[15] Lockheed Martin Missiles & Space, Hubble Space Telescope Servicing Mission 3A Media Reference Guide for NASA, 1999 (Accessed 13.11.2018), https://asd.gsfc.nasa.gov/archive/hubble/ap df /news/SM 3A − M ediaGuide.pdf (1999(Accessed13.11.2018)).

M AN U

[16] A. H. Johnson, G. M. Swift, L. D. Edmonds, Latchup in integrated circuits from energetic protons, IEEE Transactions on Nuclear Science 44 (6) (1997) 2367–2377 (Dec 1997). doi:10.1109/23.659064. [17] M. A. Xapsos, C. Stauffer, T. Jordan, C. Poivey, D. N. Haskins, G. Lum, A. M. Pergosky, D. C. Smith, K. A. LaBel, How long can the hubble space telescope operate reliably? - a total dose perspective, IEEE Transactions on Nuclear Science 61 (6) (2014) 3356–3362 (Dec 2014). doi:10.1109/ TNS.2014.2360827. [18] J. Eickhoff, Onboard Computers, Onboard Software and Satellite Operations: An Introduction, Springer Publishing Company, Incorporated, 2011 (2011).

TE D

[19] R. Hillman, G. Swift, P. Layton, M. Conrad, C. Thibodeau, F. Irom, Space Processor Radiation Mitigation and Validation Techniques for an 1800 MIPS Processor Board, in: ESA Special Publication, Vol. 536 of ESA Special Publication, 2004, p. 347 (Oct. 2004).

EP

[20] J. Keller, National geospatial-intelligence agency selects bae systems’ radiation-hardened computers for worldview-1 satellite, Military & Aerospace Electronics 18 (12), 2007 (December 2007).


[21] S. Gaudin, The Orion spacecraft is no smarter than your phone, 2014 (Accessed 13.11.2018), https://www.computerworld.com/article/2855604/theorion-spacecraft-is-no-smarter-than-your-phone.html (2014 (Accessed 13.11.2018)).

[22] R. Berger, S. Chadwick, E. Chan, R. Ferguson, P. Fleming, J. Gilliam, M. Graziano, M. Hanley, A. Kelly, M. Lassa, B. Li, R. Lapihuska, J. Marshall, H. Miller, D. Moser, D. Pirkl, D. Rickard, J. Ross, B. Saari, D. Stanley, J. Stevenson, Quad-core radiation-hardened system-on-chip Power Architecture processor, in: 2015 IEEE Aerospace Conference, 2015, pp. 1–12 (March 2015). doi:10.1109/AERO.2015.7119114.

[23] M. Iacoponi, D. Roderick, The RH-3000 MIPS compatible space processor, in: 9th Computing in Aerospace Conference Proceedings, 1993 (1993).


[24] J. Cressler, H. Mantooth, Extreme Environment Electronics, Industrial Electronics, CRC Press, 2017 (2017).


[25] J.-L. Poupat, M. Mattavelli, DAHLIA: Very High Performance Microprocessor for Space Applications, in: 11th ESA Workshop on Avionics, Data, Control and Software Systems (ADCSS2017), 2017 (Accessed 13.11.2018), http://dahlia-h2020.eu/wp-content/uploads/2018/02/ADCSS-2017DAHLIA.pdf (2017 (Accessed 13.11.2018)).


[26] R. Liu, A. Evans, L. Chen, Y. Li, M. Glorieux, R. Wong, S. Wen, J. Cunha, L. Summerer, V. Ferlet-Cavrois, Single event transient and TID study in 28 nm UTBB FDSOI technology, IEEE Transactions on Nuclear Science 64 (1) (2017) 113–118 (Jan 2017). doi:10.1109/TNS.2016.2627015.


[27] J. Keller, Air Force, NASA to develop radiation-hardened ARM processor for next-generation space computing, Military & Aerospace Electronics 27 (7), 2016 (July 2016).

[28] P. Osipenko, The devices designed by SRISA for space applications, Nauchniye eksperimenty na malyh kosmicheskih apparatah. Apparatura, sbor dannyh i upravlenie, elektronnaya komponentnaya baza. Trudy nauchno-tehnicheskogo seminara. (in Russian) (2012) 139–148 (May 2012).

[29] T. Peake, Ask an Astronaut: My Guide to Life in Space (Official Tim Peake Book), Random House, 2017 (2017).


[30] J. Oberg, A Digital Soyuz, 2010 (Accessed 13.11.2018), https://spectrum.ieee.org/aerospace/space-flight/a-digital-soyuz (September 2010 (Accessed 13.11.2018)).


[31] A. V. Gorbunov, A. L. Churkin, D. A. Pavlov, Space complex with "Meteor-3M" apparatus for hydrometeo and oceanic provision, Voprosy elektromehaniki (in Russian) (105) (2008) 17–28 (2008).


[32] B. V. Vasilegin, V. V. Emeliyanov, K. Tapero, A. I. Ozerov, M. V. Kamensky, P. N. Osipenko, The study of microprocessor sensitivity to SEU induced by Cf-252, Voprosy atomnoy nauki i tehniki (in Russian) (3-4) (2006) 59–63 (2006).

[33] M. S. Gorbunov, B. V. Vasilegin, A. A. Antonov, P. N. Osipenko, G. I. Zebrev, V. S. Anashin, V. V. Emeliyanov, A. I. Ozerov, R. G. Useinov, A. I. Chumakov, A. A. Pechenkin, A. V. Yanenko, Analysis of SOI CMOS microprocessor’s SEE sensitivity: Correlation of the results obtained by different test methods, IEEE Transactions on Nuclear Science 59 (4) (2012) 1130–1135 (Aug 2012). doi:10.1109/TNS.2012.2183147.

[34] P. N. Osipenko, A. A. Antonov, A. V. Klishin, B. V. Vasilegin, M. S. Gorbunov, P. S. Dolotov, G. I. Zebrev, V. S. Anashin, V. V. Emeliyanov, A. I. Ozerov, A. I. Chumakov, A. V. Yanenko, A. L. Vasiliev, Fault-tolerant SOI microprocessor for space applications, IEEE Transactions on Nuclear Science 60 (4) (2013) 2762–2767 (Aug 2013). doi:10.1109/TNS.2013.2241453.


[35] O. V. Serdin, A. A. Antonov, A. G. Dubrovsky, E. A. Novogilov, A. L. Zuev, The special radiation-hardened processors for new highly informative experiments in space, Journal of Physics: Conference Series 798 (1) (2017) 012010 (2017).


[36] I. A. Danilov, M. S. Gorbunov, A. A. Antonov, SET tolerance of 65 nm CMOS majority voters: A comparative study, IEEE Transactions on Nuclear Science 61 (4) (2014) 1597–1602 (Aug 2014). doi:10.1109/TNS.2014.2311297.


[37] R. Velazco, S. Karoui, T. Chapuis, SEU testing of 32-bit microprocessors (for space application), in: Workshop Record 1992 IEEE Radiation Effects Data Workshop, 1992, pp. 16–20 (1992). doi:10.1109/REDW.1992.247330.

[38] M. Bagatin, S. Gerardin, Ionizing Radiation Effects in Electronics: From Memories to Imagers, Devices, Circuits, and Systems, CRC Press, 2015 (2015). URL https://books.google.ru/books?id=IJPwCgAAQBAJ


[39] M. S. Gorbunov, A. A. Antonov, P. A. Monakhov, V. S. Anashin, A. A. Klyayn, A. E. Koziukov, E. F. Imametdinov, E. V. Marina, Direct experimental performance comparison of two microprocessors for the efficiency evaluation of single event effects mitigation techniques, in: Proceedings of RADECS-2018 (In Press), 2018 (2018).

[40] J. Ziegler, Interactions of ions with matter, http://www.srim.org/ (2018).

[41] R. Karpinski, Paranoia: A floating-point benchmark, Byte (1985) 223–235 (February 1985).


[42] F. Sturesson, J. Gaisler, R. Ginosar, T. Liran, Radiation characterization of a dual core LEON3-FT processor, in: 2011 12th European Conference on Radiation and Its Effects on Components and Systems, 2011, pp. 938–944 (Sept 2011). doi:10.1109/RADECS.2011.6131334.

• Instruction Set Architectures used for space applications are reviewed and discussed
• The main fault-tolerance enhancement techniques are discussed
• The fault tolerance of KOMDIV microprocessors is discussed for the first time
• The trade-off between fault tolerance, performance and power is considered
• The characteristics of 1907VM01A4 and 1907VM044 are discussed as an example