Abstract—Trends in terrestrial neutron-induced soft errors in SRAMs from a
250 nm to a 22 nm process are reviewed and predicted using the Monte-Carlo
simulator CORIMS, which is validated to have less than 20% variations from
experimental soft-error data on 180–130 nm SRAMs in a wide variety of neutron
fields like field tests at low and high altitudes and accelerator tests in
LANSCE, TSL, and CYRIC. The following results are obtained: 1) Soft-error
rates per device in SRAMs will increase by a factor of 6–7 from the 130 nm to
the 22 nm process; 2) As SRAM is scaled down to a smaller size, the soft-error
rate is dominated more significantly by low-energy neutrons (< 10 MeV); and
3) The area affected by one nuclear reaction spreads over 1 M bits and the bit
multiplicity of multi-cell upset becomes as high as 100 bits or more.
Index Terms—Bit multiplicity, cosmic ray impact simulator (CORIMS),
multi-cell upset (MCU), multi-node upset (MNU), scaling, single event upset
(SEU), static random access memories (SRAMs).
Manuscript received October 20, 2009; revised March 18, 2010; accepted
March 24, 2010. Date of publication May 20, 2010; date of current version
June 23, 2010. The review of this paper was arranged by Editor G.-T. Jeong.
The authors are with the Production Engineering Research Laboratory,
Hitachi, Ltd., Yokohama 244-0817, Japan (e-mail: hidefumi.ibe.hf@hitachi.com;
hitoshi.taniguchi.dn@hitachi.com; yasuo.yahagi.rg@hitachi.com;
kenichi.shimbo.tu@hitachi.com; tadanobu.toba.ee@hitachi.com).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TED.2010.2047907
I. INTRODUCTION
SCALING down of semiconductor devices to sub-100 nm technology encounters a
wide variety of technical challenges like Vth variation [1], negative bias
temperature instability (NBTI) [2], short-channel effect [3], gate leakage [4],
and so on. Terrestrial neutron-induced single event upset (SEU) is one key
issue that can be a major setback in scaling. In particular, “multi-cell
upsets (MCUs),” which are defined as simultaneous errors in more than one
memory cell induced by a single event, have been under close scrutiny [5]–[9].
The concept of the MCU, therefore, contains both upsets that can be corrected
by error detection/correction code (EDAC/ECC) and those that cannot. The
latter is called “multiple bit upset” or “multi-bit upset” (MBU) of memory
cells in the same word, and can lead, for example, to hang-ups of computer
systems. Though MBUs can be avoided by a combination of ECC and the
interleaving technique [9], MCUs that can be corrected by EDAC/ECC can still
be problematic in high-performance devices such as content addressable
memories (CAMs) [10] used in network processors and routers. In system design,
it is therefore very important to evaluate MCUs as well as soft-error rates
(SERs) of the device in the design phase.
In addition, MCU can be a threat in mission-critical systems with an extreme
number of logic devices that are mainly protected by spatial or time
redundancies. Typically, redundancy circuits such as triple module redundancy
(TMR) [11], duplication [12], replication [13], and redundant-node
latches/FFs [14]–[18] like the dual interlocked storage cell (DICE) [19]
cannot be effective when relevant nodes are corrupted simultaneously by an
MCU [20], [21]. Since such redundancy systems in electronic systems are
strictly relevant to the international standard IEC61508 [22] that defines
the functional safety of electrical/electronic/programmable electronic
safety-related systems, protection technologies against MCUs may have to be
consistent with the scope of the standard.
Historically, MCUs are understood as taking place as a result of the
collection of charges produced by secondary ions from a nuclear spallation
reaction in a device. As device scaling proceeds, novel MCU modes are being
reported, such as charge sharing among memory storage nodes in the vicinity
[8], [23] or bipolar effects in the p-well [9], [24], [25]. Ibe et al. have
proposed multi-coupled bipolar interaction (MCBI) for one of the bipolar MCU
mechanisms that is regarded as a parasitic thyristor effect triggered by a
single event snapback (SES) in the p-well and causes MCU multiplicity of more
than 10 bits [9]. It is also reported that the MCU physical address pattern
differs depending on written data patterns, typically between the groups ALLX
(All “1” or All “0”) and Checkerboard (CB or its complement CBc).
In this paper, the statistics of SEUs and MCUs in static random access
memories (SRAMs) are predicted down to the 22 nm process by using the
Monte-Carlo simulator CORIMS [26]. The bipolar effects may be much more fatal
in practical applications, but in the present paper they are not included in
the physical model in CORIMS because of their complexity. The effects of the
bipolar actions will be modeled and reported elsewhere.
In Section II, the physical model, major algorithms, and statistical
parameters are reviewed. In Section III, simulation results are presented and
discussed in conjunction with impacts on logic devices. Section IV concludes
the insights from the simulation results.
II. MODEL DESCRIPTION
A. Overall Microscopic Soft-Error Model
Fig. 1 depicts a schematic of a microscopic soft-error model for an SRAM
cell, which has two n+ nodes in the p-well and two p+ storage nodes in the
n-well. Two sets of adjacent n+ and p+ nodes correspond to two potential
states: “high” or “low.” The
memory data “1” or “0” is assigned to the side (right or left) that has high
potential. Once a ballistic neutron penetrates into the SRAM, a nuclear
spallation reaction may take place between the neutron and a nucleus (mostly
Si) in the device. As a prompt reaction, nucleons (protons and neutrons)
collide with each other in the nucleus. Some of the nucleons may escape from
the nucleus when they have enough kinetic energy. This process is called the
Intra-Nuclear Cascade (INC) [27]. After this prompt process, light nuclei may
be “evaporated” from the residual excited nucleus [28]. As a consequence,
nucleons, light nuclei, and the residual nucleus run inside the SRAM cell,
producing electron-hole pairs along the ion track. The energy necessary to
produce one electron-hole pair is 3.6 eV in Si. When one of such secondary
ions hits the storage nodes, some of the charges are collected to the storage
node mainly through the funneling effect [29] and the drift/diffusion process.
If the amount of the charges exceeds the critical charge that can flip the
logical state of the SRAM, a soft-error takes place in the SRAM.
Fig. 1. Macroscopic mechanism of neutron-induced soft-error in static random
access memory (SRAM). The physical sequence proceeds from 1 to 5.
B. Nuclear Spallation Reaction Models
The Monte-Carlo single-event simulator CORIMS [26] is equipped with numerical
solutions for nuclear spallation reactions of silicon, ion track analysis in
an infinite layout of memory cells in a semiconductor device, and charge
collection to the diffusion layer of the device. The model of the nuclear
spallation reaction is based on the intra-nuclear cascade (INC) model and the
evaporation model by Weisskopf and Ewing [28]. The INC model is applied to the
prompt collision process, where many-body collisions among nucleons (neutrons
and protons) are treated numerically as cascades of relativistic binary
collisions between two nucleons in the target nucleus. The evaporation model
of light particles from an excited nucleus is also applied for a delayed
nuclear reaction process, where nucleons (n and p), deuterons (²H or D),
tritons (³H or T), helium, and the residual nucleus are released into the
substrate. The inverse reaction cross section necessary for the determination
of an evaporation channel (a set of evaporated light particle and residual
nucleus) is calculated based on the GEM model [30]. The nucleus type, energy,
and direction of each secondary ion produced in a spallation reaction are thus
determined, and reaction locations are randomly set in the device model.
The accuracy of the nuclear reaction model is validated through the comparison
of nuclear reaction data of high-energy protons and aluminum [31]. SER in the
device under any neutron spectrum can be simulated. In the case of a
simulation at a specific location at ground level on Earth, the terrestrial
neutron spectrum at the location is corrected in accordance with the
geomagnetic latitude and the altitude based on the standard neutron spectrum
at sea level in New York City as shown in Fig. 2 [32].
Fig. 2. Neutron energy spectrum at sea level in New York City under medium
solar activity [32].
Fig. 3. Calculated energy spectra of secondary ions produced in Si substrate
at sea level in New York City.
Fig. 3 shows an example of outputs from CORIMS for the energy spectra of
secondary ions produced directly from the Si substrate with the neutron
spectrum in Fig. 2. It is noteworthy that:
(i) light particles such as protons and helium (or alpha particles) have high
production rates and high energies up to a few hundreds to 1000 MeV; while
(ii) heavier particles such as Mg and Al also have relatively high production
rates but do not have high energies, with maximum energies of 10–100 MeV.
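The charge-generation step described in Section II-A can be illustrated with
a short calculation. The sketch below only converts an energy deposited in Si
into charge, using the 3.6 eV per electron-hole pair stated in the text, and
compares a collected charge with a critical charge. It is a minimal
illustration, not the CORIMS model: the function names and the example values
are hypothetical, and the fraction of deposited charge that is actually
collected (funneling, drift/diffusion) is not modeled.

```python
# Minimal sketch: deposited energy -> generated charge in Si.
# PAIR_ENERGY_EV comes from the text (3.6 eV per electron-hole pair);
# everything else here is an illustrative assumption.

ELECTRON_CHARGE_C = 1.602176634e-19  # elementary charge in coulombs
PAIR_ENERGY_EV = 3.6                 # energy per electron-hole pair in Si


def deposited_charge_fc(energy_mev: float) -> float:
    """Return the charge (in fC) generated by depositing energy_mev in Si."""
    pairs = energy_mev * 1.0e6 / PAIR_ENERGY_EV
    return pairs * ELECTRON_CHARGE_C * 1.0e15


def causes_upset(collected_charge_fc: float, critical_charge_fc: float) -> bool:
    """A soft error occurs when the collected charge exceeds the critical charge."""
    return collected_charge_fc > critical_charge_fc


if __name__ == "__main__":
    q = deposited_charge_fc(1.0)  # 1 MeV deposited, hypothetical example
    print(f"1 MeV deposited in Si generates about {q:.1f} fC")
    # Hypothetical critical charge of 2 fC, assuming all charge is collected.
    print("upset:", causes_upset(q, critical_charge_fc=2.0))
```

Under these assumptions, 1 MeV deposited in Si corresponds to roughly 44.5 fC
of generated charge, which shows why even the lower-energy secondary ions
listed above can matter once the critical charge becomes small.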
IBE et al.: IMPACT OF SCALING ON NEUTRON-INDUCED SOFT ERROR IN SRAMs 1529
Fig. 6. Simplified structure of the CMOS SRAM unit cell for the single event
simulator CORIMS. (a) Top view. (b) A-A’ cross section.
Fig. 9. Examples of MCU categories and codes.
WL (category “w”), and cluster (an MCU that has two or more bits along both
the BL and WL directions; category “c”).
2) The MCU code, which can be almost uniquely related to a physical address
pattern in an MCU, is given as:
C_N1_N2_N3_N4_P
TABLE I
ASSUMED ROADMAP OF SCALING IN SRAM
I. Roadmap
Table I summarizes the typical roadmap parameters in 20–130 nm SRAM, assumed
based on ITRS2007 [37]. Lateral 2-D scaling is assumed to reduce the area by a
factor of 2 in each generation. The depth profile is assumed to be constant
because the roadmap lacks the relevant information and also because of the
difficulty in making a shallow profile. As parasitic capacitance is basically
in proportion to the device area, the critical charge is also assumed to
decrease by a factor of 2 in each generation. Although reduction in the supply
voltage Vdd is preferable for reducing power consumption, it is actually being
limited in order to ensure enough margin from the upper bound of Vth
variation [1] and is therefore assumed to be constant. The critical charge
will decrease more rapidly if Vdd is reduced with each generation, leading to
an increase in SER.
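The roadmap assumptions (cell area and critical charge both halved in each
generation, depth profile and Vdd constant) can be tabulated with a few lines
of code. The sketch below is only an illustration of that arithmetic: the list
of generations is an assumption made here, since Table I itself is not
reproduced, and all values are relative to the first generation rather than
absolute areas or charges.

```python
# Minimal sketch of the scaling assumptions: area and critical charge
# are each reduced by a factor of 2 per generation.
# GENERATIONS_NM is an assumed sequence, not taken from Table I.

GENERATIONS_NM = [130, 90, 65, 45, 32, 22]


def relative_scaling(generations):
    """Return (node, relative_area, relative_critical_charge) per generation.

    Both quantities are normalized to 1.0 at the first generation.
    """
    rows = []
    for step, node in enumerate(generations):
        factor = 0.5 ** step
        rows.append((node, factor, factor))
    return rows


if __name__ == "__main__":
    print("node(nm)  rel. area  rel. Qcrit")
    for node, area, qcrit in relative_scaling(GENERATIONS_NM):
        print(f"{node:>8}  {area:>9.5f}  {qcrit:>10.5f}")
```

With the assumed six generations, the 22 nm cell has 1/32 of the area and 1/32
of the critical charge of the 130 nm cell, so the number of cells covered by a
given ion track grows while each cell becomes easier to flip.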
TABLE III
MAJOR SIMULATION RESULTS FOR THE DATA PATTERN ALL “1”
Fig. 10. Comparison of measured and simulated SER in 130 and 180 nm
SRAM with quasi-monoenergetic neutron facilities [26].
TABLE II
MAJOR SIMULATION RESULTS FOR THE DATA PATTERN CB
TABLE IV
PREDICTED TRENDS IN MCU CATEGORIES AND MCU CODES
(iv) There are only minor differences between CB and FF data patterns.
Table IV summarizes the trends in MCU categories for data patterns (a) CB and
(b) FF. Typical MCU codes and the number of unique codes are also shown in the
table. The figures in the cells are the ratio to the total MCUs in percentage.
Most MCU error patterns for MCU codes are shown in Fig. 9. Some substantial
differences can be seen between the data patterns:
(i) The ratios of the category W (on single WL) for CB patterns are higher
than those for FF patterns by a factor of about 2. This is due to the two
“high” nodes located in the same p-well of two adjacent bits in the WL
direction for CB patterns, so that two adjacent bits in the WL direction are
easily corrupted. This is also seen in the ratios of the MCU code
W_2_2_1_2_any parity.
(ii) The ratios of the code C_4_2_2_2_A1 for FF patterns are substantially
higher than those for CB patterns.
(iii) The differences between the ratios of categories seem to be clear for
larger generations (180 and 130 nm). This has been clearly observed in our
former work for 180 nm SRAMs [45]. The differences become less clear for
smaller generations. The reason for this, perhaps, is that SRAM cells are
easily corrupted by the charge deposited only in the depletion layer as the
critical charge becomes smaller and the memory cells are more tightly packed
in the smaller generations. The directional effects become weak for smaller
generations since the contribution from charge collection by the directional
funneling effects becomes smaller.
When MCU bits align along a single word line, a multi-bit upset (MBU), which
defeats error correction by ECC and thus degrades the reliability of
electronic systems, may take place. Such alignment can take place in the MCU
categories (W) and (C). Table V summarizes the ratio of such alignment in MCU
to the total SEU. In Table V, “Total” includes both the categories (W) and
(C), and the ratios are shown for bit widths of 2, 3, 4–8, and more than
8 bits in the (W) MCUs. It is seen that almost all word line alignments take
place in the category (W) MCUs, and the bit widths for almost all (W) MCUs are
less than 8 bits. This means that an ECC with an 8-bit interleave can
eliminate MBUs with only a slight risk in 22 and 32 nm design rules. Results
for AllX are not shown here because they are similar to those for CB.
The fact that the MCU ratio drastically increases as scaling proceeds means
that multi-node upset (MNU), in which multiple logical nodes of sequential or
combinational logic devices are corrupted, must increase as well. This may
cause serious effects in the reliability design of logic devices since MNUs
1534 IEEE TRANSACTIONS ON ELECTRON DEVICES, VOL. 57, NO. 7, JULY 2010
TABLE V
PERCENT RATIO OF POSSIBLE MBU TO TOTAL SEU (CHECKER BOARD)
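The interleaving argument made with Table V can be illustrated with a small
check of when upset bits on one word line fall into the same logical word. The
sketch below is a simplified model under stated assumptions, not the authors'
procedure: upset bits are taken as a contiguous run of physical columns,
physical column c is assumed to belong to logical word c mod k for an
interleave factor k, and the ECC is assumed to correct one bit per word. The
function names are hypothetical.

```python
# Minimal sketch: does a run of upset bits on one word line produce an MBU
# (two or more errors in the same logical word) for a given interleave?
# Assumed mapping: physical column c -> logical word (c % interleave).

from collections import Counter


def errors_per_word(start_col: int, width: int, interleave: int) -> Counter:
    """Count upset bits per logical word for a contiguous run of upsets."""
    if width < 1 or interleave < 1:
        raise ValueError("width and interleave must be positive")
    return Counter(col % interleave for col in range(start_col, start_col + width))


def is_mbu(start_col: int, width: int, interleave: int) -> bool:
    """True if any logical word receives more than one upset bit."""
    return max(errors_per_word(start_col, width, interleave).values()) > 1


if __name__ == "__main__":
    for width in (2, 8, 9):
        print(f"width {width}, 8-bit interleave -> MBU: {is_mbu(5, width, 8)}")
```

In this model a run of up to 8 bits never places two errors in one word when
the interleave is 8, which is consistent with the statement that an 8-bit
interleave removes MBUs as long as the (W) MCU bit widths stay within 8 bits.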
Fig. 15. Total charge deposition spectra for 22 nm and 130 nm SRAMs.
Fig. 17. Failed bit map for each generation with CB pattern.
Fig. 18. Changes in an SEU cross section in SRAM with scaling.
Fig. 19. Changes in an MCU cross section in SRAM with scaling.
Fig. 20. Changes in the SBU cross section with scaling and neutron energy.
E. Failed Bit Map (FBM)
Fig. 17 shows the distribution of total failed bits in the BL (perpendicular
axis) and WL (vertical axis) address space when about 58 000 nuclear reactions
take place in the four bits near the origin for the data pattern CB. It is
seen that the densely affected area drastically increases from 130 nm (about
50 × 50 bits) to 22 nm (about 500 × 500). The automatic MCU classification
tool MUCEAC has been introduced to make the statistical calculations from a
number of MCUs and demonstrated for mainly 130 nm SRAM test results [9]. The
extremely widened range of FBMs, however, would make the statistical
calculations for MCU in neutron accelerated testing for 45–22 nm SRAM
extremely laborious or almost impossible unless an ultra-high-speed automatic
classification tool is developed.
F. Energy Dependency of SEU/MCU Cross Section
SEU and MCU cross sections for each generation are shown as a function of
neutron energy in Figs. 18 and 19, respectively. As scaling proceeds, the
contribution of neutrons with energy lower than 10 MeV drastically increases
due to an increase in the contribution of lighter particles. Recent
experimental results with low-energy protons showed trends quite consistent
with the predicted ones, where an SEU cross section has a sharp peak for
protons with energies lower than 10 MeV [46]–[48]. This implies that two
essential changes may be needed in the standard methods, including JESD89A, to
estimate SER from accelerator-based testing. These two changes are the
following:
1) Include the contribution of neutrons with energy lower than 10 MeV to avoid
large errors in SER estimation when spallation neutron sources are used.
2) Modify the ordinary excitation function with the saturated cross section to
have a sharp peak at low neutron energy when (quasi-)monoenergetic neutron
sources are used.
By contrast, there are no essential changes in the MCU cross section shapes.
This can be attributed to the relatively low contributions of lighter
particles to the MCU. The sharp peak, meanwhile, is understood to originate
from single bit upset (SBU), as shown in Fig. 20. The cross section curve for
SBU can be obtained by subtracting the MCU cross section in Fig. 19 from the
SEU cross section in Fig. 18.
G. Trends in MCU Ratio
Fig. 21 shows the trends in MCU ratio to the total SEU. The ratio generally
increases as neutron energy increases and scaling proceeds. When the neutron
energy increases, heavy ions with higher energy are produced, flipping
multiple memory cells. If the memory cells are packed more densely, the number
of flipped MCU bits is naturally increased. The maximum ratio
exceeds 0.5 for 22 nm SRAM, indicating that the impact of MCU and MNU has
become more serious.
Fig. 21. Changes in MCU ratio with scaling and neutron energy.
H. Trends in MCU Multiplicity Distribution
Fig. 22 shows the changes in MCU multiplicity distributions. It is seen that
the multiplicity shifts to a larger number of bits as scaling proceeds. The
ratios of SBU and lower multiplicity MCUs reduce correspondingly. As mentioned
before, the maximum multiplicity is well beyond tens of bits when scaling
proceeds beyond 32 nm.
Fig. 22. Changes in MCU multiplicity distributions with scaling.
I. Validity of Simulated Results
In the present model, the depth profile of impurities and the maximum
funneling length are fixed for all generations. But in reality the depth
profile will be shallower. The funneling length will also be shorter as the
concentration of impurities becomes higher. These effects would work to
suppress SER. On the other hand, the operation voltage may be reduced in
reality as scaling proceeds. This works to worsen SER.
Changes in the material in the device would make wider variations in the
prediction. If a high-k material like HfO is used for gate oxides, the
critical charge is increased, resulting in lower SER. Meanwhile, if a low-k
material is used for interlayer oxide, parasitic capacitance is reduced,
resulting in lower critical charge and higher SER.
It is reported by several researchers that SER in logic devices will have
serious impacts on components, including the PLL [49], clock line [8], or
global control line (SET/RESET) [50] in electronic systems. As mentioned
before, MNU is apparently a major setback in reliability design of logic
devices and systems, in particular, those with redundancy cells and modules.
In order to establish any valid countermeasures for logic systems, the
following approaches should be established first:
(i) Techniques to identify vulnerable components. Neutron irradiation tests at
the system or component level [51], [52] would be useful. Identification of
vulnerable parts can also be done by using high-energy ion beams [53].
Broad-beam neutron irradiation facilities such as FNL, ANITA [54], and
TRIUMF [55] can be utilized for such system/component-level tests.
(ii) Techniques to quantify the vulnerability at the gate (flip flops,
latches, AND, NAND, NOR, OR, etc.) level. Gate chain irradiation tests [56],
[57] would be useful. Simulation and design tools like CORIMS, in which
realistic device structures are implemented, can also be applied.
With this kind of knowledge, intrinsically immune devices or more effective
redundancy techniques with low power and costs have to be established before
the 32–22 nm era.
IV. CONCLUSION
Trends in terrestrial neutron-induced soft-errors in SRAMs down to the 22 nm
process are predicted by using the Monte-Carlo simulator CORIMS, which is
validated to have less than 20% variations from experimental data in a wide
variety of neutron fields like the low- and high-altitude field tests and the
accelerator tests in LANSCE, TSL, and CYRIC.
The following results are obtained:
1) Soft-error rates per device in SRAMs will increase by a factor of 6–7 from
the 130 nm to the 22 nm process.
2) As SRAM is scaled down to a smaller size, SEU is dominated more
significantly by low-energy neutrons (< 10 MeV). The MCU, however, does not
change drastically.
3) The area affected by one nuclear reaction spreads well beyond a 1 M bit
area and the multiplicity of multi-cell upset becomes as high as 100 bits or
more.
The discussions are extended to the MNUs of logic devices/systems and
countermeasures to them.
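Two relations used in the discussion of the energy dependency and the MCU
ratio can be written down compactly: the SBU cross section is the SEU cross
section minus the MCU cross section, and the MCU ratio is the MCU cross
section divided by the SEU cross section, both evaluated at the same neutron
energies. The sketch below is a minimal illustration of these two relations
only. The function names are hypothetical and the numbers in the usage example
are placeholders in arbitrary units, not simulation results from this paper.

```python
# Minimal sketch of the two relations described in the text:
#   sigma_SBU(E) = sigma_SEU(E) - sigma_MCU(E)
#   MCU ratio(E) = sigma_MCU(E) / sigma_SEU(E)
# Inputs are lists of cross sections sampled at the same neutron energies.


def sbu_cross_section(seu, mcu):
    """Subtract the MCU cross section from the SEU cross section pointwise."""
    if len(seu) != len(mcu):
        raise ValueError("seu and mcu must be sampled at the same energies")
    return [s - m for s, m in zip(seu, mcu)]


def mcu_ratio(seu, mcu):
    """Return the ratio of MCU to total SEU at each energy (0.0 where SEU is 0)."""
    if len(seu) != len(mcu):
        raise ValueError("seu and mcu must be sampled at the same energies")
    return [m / s if s > 0 else 0.0 for s, m in zip(seu, mcu)]


if __name__ == "__main__":
    # Placeholder values in arbitrary units, for illustration only.
    seu_example = [4.0, 2.0, 2.0]
    mcu_example = [1.0, 1.0, 1.5]
    print("SBU:", sbu_cross_section(seu_example, mcu_example))
    print("MCU ratio:", mcu_ratio(seu_example, mcu_example))
```

A ratio above 0.5 at some energy, as reported for the 22 nm SRAM, means that
at that energy more than half of the upset events involve more than one cell.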
[46] R. K. Lawrence, J. F. Ross, N. Haddad, D. Albrect, R. A. Reed, and
M. A. McMahan-Norris, “Soft error sensitivities in 90 nm Bulk CMOS SRAMs,”
in Proc. NSREC, Quebec, QC, Canada, Jul. 20–24, 2009, No. W-4.
[47] B. D. Sierawski, J. A. Pellish, R. A. Reed, R. D. Schrimpf,
K. M. Warren, R. A. Weller, M. H. Mendenhal, A. D. Tipton, M. A. Xapsos,
R. C. Baumann, X. Deng, M. J. Campola, M. R. Friendlich, H. S. Kim,
A. M. Phan, and C. M. Seidleck, “Impact of low-energy proton induced upsets
on test methods and rate predictions,” IEEE Trans. Nucl. Sci., vol. 56,
no. 6, pp. 3085–3092, Dec. 2009.
[48] D. F. Heidel, P. W. Marshall, J. A. Pellish, K. P. Rodbell, K. A. LaBe,
J. R. Schwank, S. E. Rauch, M. C. Hakey, M. D. Berg, C. M. Castaneda,
P. E. Dodd, M. R. Friendlich, A. D. Phan, C. M. Seidleck, M. R. Shaneyfelt,
and M. A. Xapsos, “Single-event upsets and multiple-bit upsets on a 45 nm
SOI SRAM,” IEEE Trans. Nucl. Sci., vol. 56, no. 6, pp. 3499–3504, Dec. 2009.
[49] T. D. Loveless, L. W. Massengil, B. L. Bhuva, W. T. Holman, R. A. Reed,
D. McMorrow, J. S. Melinger, and P. Jenkins, “A single-event-hardened
phase-locked loop fabricated in 130 nm CMOS,” IEEE Trans. Nucl. Sci.,
vol. 54, no. 6, pp. 2012–2020, Dec. 2007.
[50] M. Cabanas-Holmen, E. Cannon, A. Kleinosowski, J. Killens, J. Ballast,
and J. Socha, “Clock and reset transients in a 90 nm RHBD single-core tilera
processor,” in Proc. NSREC, Quebec, QC, Canada, Jul. 20–24, 2009, No. PG-3.
[51] L. Borucki, G. Schindlbeck, and C. Slayman, “Comparison of accelerated
DRAM soft error rates measured at component and system level,” in Proc. IRPS,
Phoenix, AZ, Apr. 27–May 1, 2008, pp. 482–487, No. 5A.4.
[52] A. V. Prokofiev, J. Blomgren, M. Majerle, R. Nolte, S. Rottger,
S. P. Platt, C. X. Xiao, and A. N. Smirnov, “Characterization of the ANITA
neutron source for accelerated SEE testing at The Svedberg Laboratory,” in
Proc. IEEE Radiation Effects Data Workshop, Quebec City, QC, Canada,
Jul. 20–24, 2009, pp. 166–173.
[53] D. G. Mavis, P. H. Eaton, and M. D. Sibley, “SEE characterization and
mitigation in ultra-deep submicron technologies,” in Proc. ICICDT, Austin,
TX, May 18–20, 2009, pp. 105–112.
[54] A. V. Prokofiev, J. Blomgren, M. Majerle, R. Nolte, S. Rottger,
S. P. Platt, and A. N. Smirnov, “Characterization of the ANITA neutron source
for accelerated SEE testing at the Svedberg laboratory,” in Proc. NSREC,
Quebec, QC, Canada, Jul. 20–24, 2009, No. W-25.
[55] E. W. Blackmore, “Development of a large area neutron beam for system
testing at TRIUMF,” in Proc. IEEE Radiation Effects Data Workshop, Quebec,
QC, Canada, Jul. 20–24, 2009, pp. 157–160.
[56] E. H. Cannon and M. Cabanas-Holmen, “Heavy ion and high energy
proton-induced single event transients in 90 nm inverter, NAND and NOR
gates,” IEEE Trans. Nucl. Sci., vol. 56, no. 6, pp. 3511–3518, Dec. 2009.
[57] T. Makino, D. Kobayashi, K. Hirose, D. Takahashi, S. Ishii, M. Kusano,
S. Onoda, T. Hirao, and T. Ohshima, “Soft-error rate in a logic LSI estimated
from SET pulse-width measurements,” IEEE Trans. Nucl. Sci., vol. 56, no. 6,
pp. 3180–3184, Dec. 2009.
Hitoshi Taniguchi is currently a Researcher in the Production Engineering
Research Laboratory, Hitachi, Ltd., Yokohama, Japan. He studies fault
diagnosis technology and terrestrial neutron-induced soft-error. His major
theme is simulation of SER in SoCs.
Yasuo Yahagi received the B.S. and M.S. degrees from the University of Tokyo,
Tokyo, Japan, in 1991 and 1993, respectively, and the Ph.D. degree in quantum
science and engineering from Tohoku University, Sendai, Japan, in 2005.
He joined Hitachi, Ltd., Yokohama, Japan, in 1993. He was a Visiting Lecturer
at Tokyo Institute of Technology from 2004 to 2005. He has carried out
systematic experimental works to establish testing standards and validate
simulation results by using worldwide accelerator facilities. He contributed
to establishing the Japanese standard of testing methods of environmental
radiation-induced soft-error, EDR-4705, published in 2005 by the Japan
Electronics and Information Technology Industries Association. His current
research activity is devoted to the field of electromagnetic compatibility
(EMC). He is currently a Senior Researcher in the Production Engineering
Research Laboratory, Hitachi, Ltd.
Ken-ichi Shimbo is currently a Researcher in the Production Engineering
Research Laboratory, Hitachi, Ltd., Yokohama, Japan. He studies fault
diagnosis technology and terrestrial neutron-induced soft-error analysis in
FPGAs and network components.