



# **CHIP AGEING**





# **PART I: CONTEXT**




## A BIG TREND (HYPE?): CPS AND IOT



Editorial board: Marc Duranton, Koen De Bosschere, Christian Gamrat, Jonas Maebe, Harm Munk, Olivier Zendra

#### **IOT SYSTEMS VERSUS CYBER-PHYSICAL SYSTEMS**

There are many definitions of *Internet of Things* and *cyber-physical systems*, and a lot of controversy. We choose to define a CPS as a system characterized by an actuator that directly impacts the physical world (a screen is not considered an actuator in this definition), while an *IoT* system is distributed and composed of elements that communicate, typically via the Internet. With our definition, CPS and IoT are not mutually exclusive. For example, a self-driving car that is not connected and makes all its decisions locally is a CPS device, but not an IoT device. It only becomes an IoT device (while remaining a CPS) if it is connected, e.g. to get maps from a server. A smart sensor transmitting the local temperature to a smartphone is an IoT device, but it is not part of a CPS. If it is connected to a thermostat that controls heating, the combination (i.e. the system composed of the sensor, the various servers and the thermostat) becomes a CPS (and the sensor is still an IoT device).
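The two definitions above reduce to two independent yes/no properties, which the following minimal sketch makes explicit. The class and field names are illustrative assumptions, not part of the original material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str
    has_physical_actuator: bool      # a screen does not count as an actuator here
    communicates_over_network: bool  # distributed elements, typically via the Internet


def classify(system: System) -> set[str]:
    """Return the labels ("CPS", "IoT") that apply under the definition in the text."""
    labels = set()
    if system.has_physical_actuator:
        labels.add("CPS")
    if system.communicates_over_network:
        labels.add("IoT")
    return labels


# The four examples discussed in the text.
offline_car = System("self-driving car, not connected", True, False)
connected_car = System("self-driving car fetching maps from a server", True, True)
temp_sensor = System("temperature sensor reporting to a smartphone", False, True)
heating_loop = System("sensor + servers + thermostat", True, True)

for s in (offline_car, connected_car, temp_sensor, heating_loop):
    print(f"{s.name}: {sorted(classify(s))}")
```

Because the two properties are independent, a system can carry both labels, one, or neither.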



# **Computing Distribution for "Cognitive" systems**






# Embedded intelligence needs local high-end computing



Systems should be autonomous to make good decisions in all conditions

Safety requires that basic autonomous functions not rely on being "always connected" or "always available"

Cloud and HPC cannot support many cyber-physical applications.



# Embedded intelligence needs local high-end computing



Example: detecting elderly people falling in their home

# **Privacy** will impose that some **processing should be done locally** and not be sent to the cloud.



# Embedded intelligence needs local high-end computing



**Bandwidth** will require more **local processing**

signal (voice) recognition

# **CPS CHALLENGES IN TRANSPORT: SELF-DRIVING CAR**



Ceatech



## **CPS CHALLENGES IN TRANSPORT: SELF-DRIVING CAR**

**ENVIRONMENT** 

# **SENSING - PERCEPTION - CONTROL/DECISION**

VEHICLE FEEDBACK





### **5 KEY ENABLING TECHNOLOGIES**



# **APPLICATIONS/ALGORITHMS**









Chronocam

Mobileye

Argo AI (Ford)



Cruise automation (GM)



Drive.ai



Nutonomy

And others ....

**COPYRIGHT CEA 2017** 



## **COMMUNICATION CHALLENGES**





# **ENABLING DATA PRIVACY : OVERVIEW**

# CINGULATA: Secured data processing on unsecured servers

"Homomorphic encryption (HE) allows to process encrypted information without decrypting it..."

To design and translate the native application in HE domain

To help integrate HE in the system architecture

In-house & standard crypto-systems

**Application to industrial use cases:** 






Hidden navigation

Biometric



Intrusion detection




## **ENABLING DATA PRIVACY : OVERVIEW**

"An architecture for practical confidentiality-strengthened face authentication embedding homomorphic cryptography", CloudComm'16



## **COMPUTING CHALLENGES**






## **COMPUTING CHALLENGES**

**Decentralized vs. centralized**

Challenge areas: IT computing & communication technologies; SW & tools; Safety and Cybersecurity

Example platforms: NVIDIA (Drive PX2), MOBILEYE (EyeQ5), RENESAS (R-Car V3H), INTEL (Xeon D1529), KALRAY (MPPA Bostan), QUALCOMM (Snapdragon 820), and others...

ARCHI 2017 | Olivier HERON | March 8th 2017



## **COMPUTING CHALLENGES**

**Decentralized vs. centralized** 

IT computing & communication technologies

SW & tools

Safety and Cybersecurity



Tools



# **DYNAMIC MANAGEMENT OF FAMP**

- In SMP architectures, extensions accelerate some specific tasks from 4x to 1000x
  - But are used less than 5% of time
  - May consume up to 25% of the processor area

# Functionally Asymmetric Multi-core Processor (FAMP)

- Objectives are
  - To maintain a reduced silicon area
  - To limit performance degradation
  - To reduce the energy consumption
- Techniques
  - Limit the use of costly extensions for critical sections
  - Optimize task placement according to performance



Cortex-A9 dual-core floorplan, from Osprey: 1.9 W TDP, 2 GHz (6.7 mm<sup>2</sup>)






28nm FDSOI results

800MHz, <2mm<sup>2</sup>, <1W







Smart Management Controller (SMC) Ageing, body-bias, temperature...

C-BB zone




# PART II: FDSOI & RELIABILITY (OVERVIEW)




ITRS requirements (normalized vs. 180 nm)

Source: ITRS PIDS2000&2011



CONTEXT


Source: V. Huard, IRPS2013



# WHAT IS FDSOI

Bulk technology limits due to short channel effects

Fully-Depleted SOI transistor

Adjustable transistors Vt through body biasing (BB)

Improved variability

Low cost process technology

Fully compliant with IP designs already available in Bulk

# FD-SOI Transistor



Improving power Efficiency - Bringing high flexibility in SoC integration



Source: "Advances in Applications and Ecosystem for FD-SOI technology", Philippe Magarshack, STMicroelectronics



## **KEY AUTOMOTIVE CHALLENGES**

- **Temperature**
  - Large spatial & temporal gradients in a vehicle
  - Heat dissipation solutions are limited
  - Must be compliant with the AEC-Q100 standard
- **Reliability**
  - Autonomous vehicles reinforce the mandatory requirement of continuity of service
  - 10 to 1 FIT objective (are still?) over 30 years
- **Low Power & Energy**
  - Thermal dissipation (~10<sup>1</sup> W)
  - When the vehicle is switched off, some electronics remain powered on (door opening, OTA, ...)
- **Performance**
  - ADAS & autonomous vehicles (AI) → growing demand for embedded computing solutions in vehicles
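As a rough sanity check on the FIT objective quoted above, the following arithmetic uses the standard definition of 1 FIT as one failure per $10^{9}$ device-hours. Continuous operation over the full 30 years is an assumption made here for simplicity, not a figure from the slides.

$$T = 30 \times 8760\ \text{h} = 262\,800\ \text{h}$$

$$N_{10\,\text{FIT}} = \frac{10 \times 262\,800}{10^{9}} \approx 2.6 \times 10^{-3}, \qquad N_{1\,\text{FIT}} = \frac{1 \times 262\,800}{10^{9}} \approx 2.6 \times 10^{-4}$$

Under these assumptions, $N$ is the expected number of failures per device over the 30-year period, i.e. roughly 3 devices per 1000 at 10 FIT and 3 per 10 000 at 1 FIT.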

Needs

# **PERFORMANCE AND POWER**

FDSOI enables a 32% speed boost compared to Bulk at 1 V (no BB)

Adaptive body biasing: wider bias range available vs. Bulk


BB can be adjusted dynamically by application

- FBB: Vt decreases → +10% performance vs. no BB, but static power increases by 3×. Suited to high-performance applications.
- RBB: Vt increases → static power halved vs. no BB, but performance decreases by 10%. Suited to low-power applications.

Performance-power tradeoff => one technology covering two application objectives



Measurements on a ring oscillator



Source: "28nm FDSOI Technology Platform for High-Speed Low-Voltage Digital Applications", N. Planes et al., STMicroelectronics
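The forward/reverse body-bias tradeoff quoted on this slide can be sketched as a small selection rule. The relative figures (+10% performance and 3× static power for FBB, -10% performance and half the static power for RBB) come from the slide. The selection policy itself, and all names, are illustrative assumptions rather than the method actually used on the chip.

```python
# Relative figures from the slide, normalized to "no body bias" = 1.0.
BODY_BIAS_MODES = {
    "no_bb": {"performance": 1.00, "static_power": 1.0},
    "fbb": {"performance": 1.10, "static_power": 3.0},  # forward body bias
    "rbb": {"performance": 0.90, "static_power": 0.5},  # reverse body bias
}


def pick_mode(min_performance: float) -> str:
    """Return the mode with the lowest static power that still meets min_performance.

    Hypothetical policy: a real controller would also account for dynamic
    power, temperature and ageing.
    """
    candidates = [
        (figures["static_power"], name)
        for name, figures in BODY_BIAS_MODES.items()
        if figures["performance"] >= min_performance
    ]
    if not candidates:
        raise ValueError(f"no body-bias mode reaches performance {min_performance}")
    return min(candidates)[1]


for target in (0.85, 1.0, 1.05):
    print(f"required performance {target:.2f} -> {pick_mode(target)}")
```

The same silicon can thus serve a low-power profile (RBB) or a high-performance profile (FBB), which is the "one technology covering two application objectives" point of the slide.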




# Bulk technology suffers from ageing and radiation

Ageing causes variability during operation

Radiation causes bit flipping in storage cells

FDSOI does not introduce new failure mechanisms

FDSOI improves transistor reliability vs. Bulk

- BTI: Bias Temperature Instability
- HCI: Hot Carrier Injection
- TDDB: Time-Dependent Dielectric Breakdown

Circuit performance decreases

SRAM reliability decreases



Source: V. Huard, Workshop DATE-MEDIAN15, ST Microelectronics



- Better neutron immunity vs. Bulk: upset rate < 10 FIT/Mb
- Single Event Latchup immunity
- Alpha particle quasi-immunity: upset rate < 1 FIT/Mb; no need for ultra-pure alpha packaging
- Very small error clusters (99% single bits): single error correction codes are sufficient

FDSOI: 100× to 1000× more reliable than Bulk at sea level and in space (also better than FinFET)

#### Experimental Failure-in-Time (FIT) test data



| Gain w.r.t. Bulk     | UTBB FD-SOI                  | FinFET                                 |
|----------------------|------------------------------|----------------------------------------|
| Alpha                | 1000×                        | 15×                                    |
| Neutron              | 100×                         | 10×                                    |
| Latchup              | immune                       | not reported                           |
| Multiple Cell Upsets | 99% single bit, max. 2 cells | max. 4 cells, worse according to INTEL |

Sources: ST, Roche, CISCO SER workshop, Oct'14; TSMC, Fang, CISCO SER workshop, Oct'14

Source: Ph. Roche, ST Microelectronics
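To put the FIT/Mb figures above in perspective, the sketch below converts an upset rate into an expected number of upsets, using the standard definition of 1 FIT as one failure per 10^9 device-hours. The 64 Mb memory size and the one-year window are hypothetical values chosen only for illustration.

```python
HOURS_PER_FIT_UNIT = 1e9  # 1 FIT = 1 failure per 10^9 device-hours


def expected_upsets(fit_per_mb: float, memory_mb: float, hours: float) -> float:
    """Expected number of upsets for a memory of memory_mb megabits over `hours`."""
    return fit_per_mb * memory_mb * hours / HOURS_PER_FIT_UNIT


def mean_hours_between_upsets(fit_per_mb: float, memory_mb: float) -> float:
    """Average time between upsets, assuming a constant failure rate."""
    return HOURS_PER_FIT_UNIT / (fit_per_mb * memory_mb)


# Hypothetical example: 64 Mb of SRAM at the 10 FIT/Mb neutron bound, one year of operation.
MEMORY_MB = 64.0
ONE_YEAR_HOURS = 8760.0

print(expected_upsets(10.0, MEMORY_MB, ONE_YEAR_HOURS))
print(mean_hours_between_upsets(10.0, MEMORY_MB))
```

Since the slide quotes upper bounds (< 10 FIT/Mb, < 1 FIT/Mb), the numbers computed this way are upper bounds as well.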



HCI + BTI: degradation is 2× to 7× slower than in Bulk

TDDB: degradation is 2× slower than in Bulk

At VDD, FBB increases degradation vs. no BB

AVS+ABB can fully compensate for ageing, with a performance increase



Even with FBB active, ageing also depends significantly on workload

Source: X. Federspiel, IEEE 2012, ST Microelectronics

### **RELIABILITY: TEMPERATURE INDUCED AGEING**



Source : Saliva et al., DATE, 2015




- Embedded applications need >100 GOPS
- Multi-Processor (many-core) System on Chip
- Transistor shrinking
  - (+) integration
  - (+) performance
  - (+) power
  - (-) variability
  - (-) leakage
  - (-) temperature
  - (-) reliability



- Reliability sign-off is already done in back-end
- Industry needs solutions at higher abstraction levels



# PART III: AGEING AT ARCHITECTURE LEVEL




SOME ADDRESSED ISSUES

# Ageing device models?

# How to scale from device to gate, then to circuit?

# How to simulate at register-transfer level? At transaction level?



# To understand and prevent ageing induced defects in digital ICs as early as possible at design time













# **RELIABILITY SIMULATION**








# **RELIABILITY SIMULATION: OVERVIEW**

| Level        | Reliability monitors           | Main factors | Existing solutions (not exhaustive) | +/- |
|--------------|--------------------------------|--------------|-------------------------------------|-----|
| Device       | Shift of electrical parameters | <ul> <li>Transistor/wire geometry</li> <li>Currents, voltages, T°C, simulation time</li> </ul> | <ul> <li>BERT (Berkeley)</li> <li>HOTRON (TI)</li> <li>RelXpert (Cadence)</li> <li>ELDO Premier (Mentor Graphics + STMicroelectronics)</li> <li>Etc.</li> </ul> | <ul> <li>+ Accurate (sign-off)</li> <li>+ Silicon proof</li> <li>- Slow simulation</li> <li>- Too late for design exploration</li> </ul> |
| Gate         | Shift of propagation delay     | <ul> <li>Cell nature</li> <li>Stress time (toggling rate or duty cycle)</li> <li>T°C</li> </ul> | <ul> <li>ILLIADS (Univ. of Illinois)</li> <li>GLACIER (BTA techn.)</li> <li>In-house solutions (STMicroelectronics, Infineon)</li> <li>OFFIS</li> </ul> | <ul> <li>+ Silicon proof</li> <li>+ Early reliability projections</li> <li>- Too slow for design exploration</li> </ul> |
| Architecture | MTTF, CFR<br>Delay degradation | <ul> <li>Area</li> <li>Workload</li> <li>Power</li> <li>T°C</li> <li>Process params</li> </ul> | <ul> <li>RAMP (IBM + Univ. of Illinois)</li> <li>Univ. of California San Diego + EPFL</li> <li>RAAPS, DAPHNE (CEA LIST)</li> <li>DAC &amp; DATE 2016</li> </ul> | <ul> <li>+ Generic models</li> <li>+ Fast design exploration (virtual prototyping)</li> <li>- Model accuracy?</li> </ul> |



• E.g. Eldo Premier flow for reliability simulation





#### FAILURE RATE ESTIMATION AT TRANSACTION LEVEL IN A NUTSHELL

#### POWER-TEMPERATURE SIMULATION (Intel/Docea Power)



SESAM Simulator


CATRENE/RELY

#### **RELIABILITY SIMULATION AT REGISTER TRANSFER LEVEL**

• Path propagation delay

- Tp: path propagation delay
- Ts: setup time
- Th: hold time
- ST: slack time
- Tclk: clock period
- R: register (flip-flop)
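The quantities defined above are related by the usual setup-timing constraint between two registers. The slide's figure is not recoverable here, so the sketch below assumes the textbook relation ST = Tclk - (Tp + Ts) and ignores clock skew and the hold check (Th). The numeric values are hypothetical.

```python
def setup_slack(t_clk: float, t_p: float, t_s: float) -> float:
    """Slack time ST of a register-to-register path; negative means a setup violation."""
    return t_clk - (t_p + t_s)


def aged_slack(t_clk: float, t_p: float, t_s: float, relative_drift: float) -> float:
    """Slack after the path delay has drifted by relative_drift (dtp / tp)."""
    return setup_slack(t_clk, t_p * (1.0 + relative_drift), t_s)


def max_tolerable_drift(t_clk: float, t_p: float, t_s: float) -> float:
    """Largest relative drift dtp / tp before the slack reaches zero."""
    return setup_slack(t_clk, t_p, t_s) / t_p


# Hypothetical path, times in ns: 500 MHz clock, 1.7 ns path, 0.1 ns setup time.
T_CLK, T_P, T_S = 2.0, 1.7, 0.1

print(f"fresh slack: {setup_slack(T_CLK, T_P, T_S):.3f} ns")
print(f"slack after 15% drift: {aged_slack(T_CLK, T_P, T_S, 0.15):.3f} ns")
print(f"max tolerable drift: {max_tolerable_drift(T_CLK, T_P, T_S):.1%}")
```

Paths with the smallest fresh slack are the first to fail as ageing increases Tp, which is why the analysis focuses on critical paths.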







#### **RELIABILITY SIMULATION AT REGISTER TRANSFER LEVEL**

• Gate degradation model (golden reference)




$$dtg / tg = S \ast (SP \ast POT)^{n}$$

[Huard et al., IRPS]

• Path degradation model at gate level

$$dtp / tp = \sum_i S(g_i) \ast (SP(g_i) \ast POT)^{n}$$

- dtg: delay drift of timing arc
- tg: fresh delay
- S: sensitivity
- n: time exponent
- SP: static probability, $SP = \frac{\text{time at 1}}{POT}$
- dtp: delay drift of physical path
- tp: fresh delay









• Register-to-Register path model at RTL

$$dtp / tp \approx S_{AVG} \ast \sum_i (SP'(g_i) \ast POT)^{n}$$

- $S_{AVG}$: average sensitivity in the whole circuit
- $SP'$: approximated SP of the "parent" of the gate
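The three formulas of this section (gate, path at gate level, register-to-register path at RTL) can be compared side by side with a short sketch. Function names and all numeric values are hypothetical. POT is taken here to be the total operating time over which SP is measured, which the slides do not spell out. For simplicity the RTL call reuses the exact SP values, whereas the slide uses an approximated SP' taken from each gate's "parent".

```python
from typing import Sequence, Tuple


def gate_drift(s: float, sp: float, pot: float, n: float) -> float:
    """Relative delay drift dtg / tg of one timing arc: S * (SP * POT)^n."""
    return s * (sp * pot) ** n


def path_drift_gate_level(gates: Sequence[Tuple[float, float]], pot: float, n: float) -> float:
    """dtp / tp as the sum of per-gate terms; each gate is a (S, SP) pair."""
    return sum(gate_drift(s, sp, pot, n) for s, sp in gates)


def path_drift_rtl(s_avg: float, parent_sps: Sequence[float], pot: float, n: float) -> float:
    """RTL approximation: one average sensitivity applied to the summed stress terms."""
    return s_avg * sum((sp * pot) ** n for sp in parent_sps)


# Hypothetical values, for illustration only.
POT_HOURS = 10 * 8760.0  # ten years of operation
N_EXP = 0.2              # time exponent
gates = [(1.0e-3, 0.5), (1.2e-3, 0.9), (0.8e-3, 0.1)]

golden = path_drift_gate_level(gates, POT_HOURS, N_EXP)
s_avg = sum(s for s, _ in gates) / len(gates)
approx = path_drift_rtl(s_avg, [sp for _, sp in gates], POT_HOURS, N_EXP)
print(f"gate-level dtp/tp = {golden:.4f}, RTL approximation = {approx:.4f}")
```

When every gate has the same sensitivity and SP' equals SP, the two path models coincide; the gap between them grows with the spread of per-gate sensitivities, which is the accuracy question the RTL-versus-gate-level comparison addresses.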










#### **RELIABILITY SIMULATION AT RTL**












#### MIPS R4000 like processor (AntX)



Figure 5. Instruction distribution of benchmarks after simulation with an ISS.





#### Workload profiling at RTL

Median drift vs. benchmarks

ST automotive bulk 40 nm, 125 °C, 1.05 V (WC)







RTL vs. Gate-level (golden)

## AntX in SYLVESTERM40 testchip (2015)





#### **OXIDE BREAKDOWN ASSESSMENT**

- Voltage
- Temperature
- Workload is not a major driver

# **Oxide Breakdown**

- Dielectric breakdown (TDDB) is enhanced
- EOT reduction:
  - drives the lifetime reduction
  - but makes BD event more progressive



**TDDB** degradation

[Figures. Recoverable labels: soft BD event; Lifetime vs. Technology nodes (90 nm, 65 nm, 40 nm, 28 nm); Current ratio (Istat/Istat0) vs. Stress time [s]; Patterns 1 to 4; NMOS, PMOS. Vincent Huard / STMicroelectronics]

Sources: Saliva et al., SETS, 2015; Saliva et al., IRPS, 2014



#### **OXIDE BREAKDOWN ASSESSMENT**

Assessment at RTL







#### **OXIDE BREAKDOWN ASSESSMENT**

- AntX processor
- Fault injection on 180 flip-flops => single-fault propagation probability
  - Flip-flops belonging to the 100,000 paths (out of 2,000,000) with the shortest slack time
- 361 simulations

#### memtest86

|                 | pmos | nmos | tot |
|-----------------|------|------|-----|
| No errors       | 71   | 71   | 142 |
| Silent errors   | 36   | 25   | 61  |
| Single error    | 0    | 1    | 1   |
| Multiple errors | 73   | 83   | 156 |

#### bitcount

|                 | pmos | nmos | tot |
|-----------------|------|------|-----|
| No errors       | 63   | 63   | 126 |
| Silent errors   | 56   | 73   | 129 |
| Single error    | 1    | 0    | 1   |
| Multiple errors | 60   | 44   | 104 |
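The fault-injection counts above are easier to compare as shares of all injections. The sketch below recomputes them from the table values. It assumes that the table without an inline label belongs to memtest86 and the other to bitcount, which is how the labels appear to map in the slide; the dictionary layout and function names are illustrative.

```python
# (pmos, nmos) counts per outcome, copied from the two tables.
RESULTS = {
    "memtest86": {  # assumed label for the first table
        "No errors": (71, 71),
        "Silent errors": (36, 25),
        "Single error": (0, 1),
        "Multiple errors": (73, 83),
    },
    "bitcount": {
        "No errors": (63, 63),
        "Silent errors": (56, 73),
        "Single error": (1, 0),
        "Multiple errors": (60, 44),
    },
}


def total_injections(benchmark: str) -> int:
    """Total number of classified fault injections for one benchmark."""
    return sum(p + n for p, n in RESULTS[benchmark].values())


def outcome_shares(benchmark: str) -> dict[str, float]:
    """Fraction of injections that ended in each outcome category."""
    total = total_injections(benchmark)
    return {outcome: (p + n) / total for outcome, (p, n) in RESULTS[benchmark].items()}


for name in RESULTS:
    shares = outcome_shares(name)
    summary = ", ".join(f"{k}: {v:.1%}" for k, v in shares.items())
    print(f"{name} ({total_injections(name)} injections): {summary}")
```

Each benchmark totals 360 classified injections, i.e. 180 flip-flops times two transistor types, which matches the 180 flip-flops stated in the bullet list.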


#### **MODELLING BY STOCHASTIC PROCESS: OXIDE BREAKDOWN**


[Heron, Guérin et al., ALT'12]



### MODELLING BY STOCHASTIC PROCESS: BTI

J. Fang et al., "Understanding the impact of transistor-level BTI variability", IRPS 2012








# **PART IV: AGEING MONITORING & ESTIMATION**




# How to build accurate circuit ageing models independently of specific device models?

# How to derive accurate ageing information from PVT sensors?

How to estimate actual workload for ageing estimation?


#### **MONITORING & ESTIMATION ROADMAP**

On-line workload estimation

Characterization based path ageing modelling

> BTI/HCI [Patent]

On-line failure probability estimation

IRPS2016

Hardware security

#### **ON-CHIP MONITORING FOR AGEING CHARACTERIZATION OF DEVICES**

- (19) United States
- (12) Patent Application Publication, Mikkola
- (10) Pub. No.: US 2012/0245879 A1
- (43) Pub. Date: Sep. 27, 2012
- (54) PROGRAMMABLE TEST CHIP, SYSTEM AND METHOD FOR CHARACTERIZATION OF INTEGRATED CIRCUIT FABRICATION PROCESSES
- (75) Inventor: Esko O. Mikkola, Tucson, AZ (US)
- (73) Assignee: Ridgetop Group, Inc.
- (21) Appl. No.: 13/424,025
- (22) Filed: Mar. 19, 2012
- Related U.S. Application Data
  - (60) Provisional application No. 61/465,463, filed on Mar. 21, 2011.
- Publication Classification
  - (51) Int. Cl.: G06F 19/00 (2011.01)
  - (52) U.S. Cl.: 762/117

(57) ABSTRACT A test chip, system and method for testing large numbers of test devices on a single test chip decreases the time and complexity required to characterize the variation and reliability of the IC fabrication process. A remotely configurable test chip can be programmed with varying bias conditions for testing of process variation or numerous failure modes on large sample sizes. An on-chip addressing technique allows large numbers of test devices to be tested simultaneously and the measurement signals read out serially for different test devices. The test chip may be configured for wafer, die or package-level testing.





#### **ON-CHIP MONITORING FOR AGEING CHARACTERIZATION OF CIRCUIT**

Ready to be inserted in a CAD flow (RTL)


- Not for sign-off but for early design decisions
- No need for modelling of device physics of failure



SAAT: Software Aging Analysis Tool



### **ON-CHIP MONITORING FOR AGEING CHARACTERIZATION OF CIRCUIT**

- Preliminary results on AntX processor
- FDSOI 28 nm (Vb = 0)
- Critical path analysis (fresh frequency 500 MHz at 0.6 V, 125 °C)



Fig. 6. Accuracy between the proposed method and the observation of the delay degradation on the path with  $\Delta V_{th}$  constant



Fig. 7. Accuracy between the proposed method and the observation of the delay degradation on the path with  $\Delta V_{th}$  variable

#### **ON-LINE ESTIMATION OF HCI/BTI**

Mechanism to on-line assess the circuit reliability by estimating the degradation of its critical paths under the actual stress conditions.


Critical path delay

**Creating accurate but nonetheless simplified circuit-level aging models** from existing device-level models and using in-situ monitors to follow dynamic (slow) variations





## SUMMARY

- **Development time and cost** 
  - Early analysis  $\rightarrow$  re-design effort is reduced
- **Engineer productivity** 
  - Models help predict the parameter shift per device
- **Risk management** 
  - Design risk issues can be evaluated earlier in the design flow
- **Repeatability**
  - A change in the design specification can be taken into account immediately
- **Technology versatility** 
  - TSMC 45n, ST bulk40n, ST FDSOI28n and others
  - HW-assisted ageing modelling
- **Market segments** 
  - Safety critical applications: small/medium size ICs (e.g. microcontroller)
  - Telecom, wireless, consumer: High-end ICs (e.g. multicore)



#### **AGEING INDUCED SECURITY THREATS**

#### MAGIC: Malicious Aging in Circuits/Cores [Karimi et al, TACO 2015]

- Attack that accelerates the effects of NBTI
  - Errors observed after 1 month
- OpenSPARC T1 processor, 45 nm technology
- Existing mitigation and correction techniques are not sufficient







### **AGEING INDUCED SECURITY THREATS**

- Scenario 1: Warranty attack
  - Device exchanged before the end of the warranty
- Scenario 2: Planned obsolescence
  - Ageing program sent by the manufacturer as an update
- Scenario 3: Hardware backdoor
  - Device remotely put out of service
- Scenario 4: Attack against security
  - Disclosure of keys












