Pre- and post-silicon techniques to deal with large-scale process variations
Jaeyong Chung, Ph.D.
Department of Electronic Engineering
Incheon National University

Outline
Introduction to Variability
Pre-silicon Techniques
– Basics of traditional static timing
– OCV
– AOCV/LOCV
– SSTA
– POCV/SOCV
Post-silicon Techniques
– Compressed Sensing
– Compressed Silicon Sensing (CSS)
– Virtual Probe
– Our Proposed Framework
– Application of CSS

Timing Uncertainty
Add Timing Margins For Delay Uncertainty
– Process Variation
– Voltage Variation
– Temperature Variation
– Aging Effects
Associated Costs
– Area, Power, Design Efforts/Time
[Figure: clock period divided into actual circuit delay and timing margin; margin sources: process, temperature, voltage, aging]

Classification of variability
[Figure]

Sources of variation
FEOL (Front end of line) variation
BEOL (Back end of line) variation
Random Dopant Fluctuation
Lithography-induced variation / Proximity effects
Erosion and dishing in CMP process

CD (Lgate) Variation
Critical dimension (a.k.a. Lgate, Leff, ...)
– The effective channel length of transistors
– Affects delay and leakage substantially
– Varies systematically across the wafer and within the chip
A reduction of 1 nm in the standard deviation of CD → $7.5/chip for a high-end product
[Figure, panels a) to f): wafer-to-wafer, across-wafer, die-to-die, across-die, pattern dependent, local random noise]
http://www.eecs.berkeley.edu/~bora/Conferences/2009/SPIE09-Qian.pdf

CD (Lgate) Variation
Across-wafer CD variation
– Post Exposure Bake (PEB) is the largest source of variation
– In areas where the bake plate is relatively cool, CD is larger than average
http://bcam.berkeley.edu/ARCHIVE/theses/Friedberg_PhD.pdf

CD (Lgate) Variation
Within-chip CD variation
– Lens aberration induces spatially correlated variation
– Different layouts lead to different spatial patterns due to the optical proximity effect
http://www.bioee.ee.columbia.edu/courses/upload/Bibliography/orshansky2004.pdf

Voltage, temperature, weather, …
[Figure; source: IBM]

Spatial Correlation
[Figure]

Traditional Static Timing
Setup check
[Figure]

Traditional Static Timing
Hold check
[Figure]

Traditional Static Timing
Use worst/best corners for setup/hold checks
Launch Path DL (launch clock path + data path)
Capture Path DC
Setup Check: DL < DC + Tclock
[Figure: launch and capture paths between two flip-flops driven by CLOCK; process space with corners p1 and p2]

Traditional Static Timing
GBA (Graph-Based Analysis) takes time linear in circuit size
PBA (Path-Based Analysis) takes time exponential in circuit size
Setup Check: DL < DC + Tclock
[Figure: launch and capture paths with a delay of 1 on each segment]

Traditional Static Timing
GBA finds an upper bound on the worst path delay in linear time through the graph
GBA is more pessimistic than PBA
[Figure: gates a, b, c between two flip-flops; the upper bound of the worst path delay]

On-Chip Variation (OCV)
Setup Check: DL < DC + Tclock
[Figure: launch and capture paths with a derate of [0.95,1.05] on each segment]

On-Chip Variation (OCV)
Clock Reconvergence Pessimism Removal (CRPR) (a.k.a. Common Path Pessimism Removal, CPPR)
Setup Check: DL < DC + Tclock
[Figure: launch and capture paths with a derate of [0.95,1.05] on each segment]

Statistical Cancellation (OCV)
[Figure]

Statistical Cancellation
[Figure]

Advanced OCV (AOCV) (a.k.a. LOCV)
Stage-based AOCV
[Figure; source: Synopsys Whitepaper]

Advanced OCV (AOCV) (a.k.a. LOCV)
[Figure; source: Synopsys Whitepaper]

Advanced OCV (AOCV) (a.k.a. LOCV)
Cells get different derates depending on the logic depth and the cell type
In Design-specific OCV (CLK DA), the derate also depends on the loads and slews
Setup Check: DL < DC + Tclock
[Figure: launch and capture paths with derates of [0.95,1.05] (5%) and [0.97,1.03] (3%)]

RSS credit in setup/hold check
Method 1 (OCV)
Setup Check: DL < DC + Tclock
RSS credit
If perfectly correlated, variation will be canceled

Advanced OCV (AOCV) (a.k.a. LOCV)
Distance-based AOCV
DL < DC + Tclock
[Figure; source: Synopsys Whitepaper]

Advanced OCV (AOCV) (a.k.a. LOCV)
0.05 = 0.03 + 0.02
[Figure: a 5% derate, [0.95,1.05], shown as the sum of a 3% and a 2% component]

Advanced OCV (AOCV) (a.k.a. LOCV)
Derating table for each cell type
[Figure; source: Synopsys Whitepaper]

Advanced OCV (AOCV) (a.k.a. LOCV)
AOCV requires a lot of library characterization effort
– Derating values for each cell type, each depth, each location, each slew, each load
Worst-case derating is selected across each load and each slew
– A source of pessimism
AOCV tables don't have much information
– Can be predicted by a simple analytic model
Path credit is mapped into the credit of segment delays
– Paths do not consist of a single type of gate
– Not graph-based analysis (GBA) friendly

Statistical Static Timing Analysis (SSTA)
[Figure]

Statistical Static Timing Analysis (SSTA)
Canonical Form
All timing quantities are computed and propagated in a parameterized form
– ATs, RATs, slacks, slews, delays, etc.
Need to characterize sensitivities for each cell type, each delay, each slew

Statistical Static Timing Analysis (SSTA)
[Figure; source: "Statistical Timing: Where's the tofu?", ICCAD 2009, IBM]

Statistical Static Timing Analysis (SSTA)
Use 7 variables in the Canonical Form

Statistical Static Timing Analysis (SSTA)
SSTA benefits
Chip-to-chip variation
– No corners
– Safe RSS credit
Within-chip variation
– RSS credit down a path
– RSS credit in setup/hold check

Statistical Static Timing Analysis (SSTA)
[Four figure slides; source: "Statistical Timing: Where's the tofu?", ICCAD 2009, IBM]

Statistical Static Timing Analysis (SSTA)
Apples-to-apples comparison of statistical flow to:
– 2 corner foundry-like timing with derating
– 'n' corner industry-standard flow
– Exhaustive corner timing
380ps total
– 200ps from RSS credit in chip-to-chip variation
– 80ps from RSS credit in on-chip variation
Source: "Statistical Timing: Where's the tofu?", ICCAD 2009, IBM

Parametric OCV (POCV) (a.k.a. SOCV)
Use SSTA for within-chip variation only
– Eliminates much of the characterization burden of SSTA, giving up the benefits in chip-to-chip variation
– Uses only a few variables in the canonical form
Statistical OCV (SOCV) is a similar technique
In theory, POCV/SOCV is clearly better engineering than AOCV
– Better accuracy and less characterization effort
"A parametric approach for handling local variation effects in timing analysis", A. Mutlu (Extreme DA), DAC 2009

Remaining Pessimism in SSTA/POCV
Refactoring: CRPR for Combinational Networks
Launch Path DL = Dcd + max(a + b, a + c) + d + e
               = Dcd + a + max(b, c) + d + e
using distributivity of + over max
Setup Check: DL < DC + Tclock
[Figure: launch path through gates a, b, c, d, e between two flip-flops driven by CLOCK; capture path DC]
[Chung and Abraham, ICCAD 2009] (Best Paper Award Nomination)
[Chung and Abraham, TCAD 2012]

Compressed Sensing
JPEG Encoder IP
[Figure: JPEG encoder pipeline: Acquisition (Sampling), RGB2YCbCr Color Converter, 8x8 Block 2D Discrete Cosine Transform, Quantization, Huffman Encoding; labels: f, Sparse]

Compressed Sensing
Tremendous impact on signal processing, machine learning, statistics, ...
The original groundbreaking paper [Donoho 2004] has been cited 8769 times (200+ papers in the last 3 years)
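The sparsity that the JPEG pipeline exploits can be illustrated with a small sketch. It assumes a 1-D signal and a hand-written, unnormalized DCT-II rather than the 8x8 2-D transform of a real encoder; the function names `dct2` and `idct2` and the test signal are hypothetical and chosen only for illustration. A smooth signal built from two low-frequency cosines ends up with only two non-negligible transform coefficients, and those two are enough to restore it.

```python
import math

def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi * (n + 0.5) * k / N)."""
    n_samples = len(x)
    return [sum(x[n] * math.cos(math.pi * (n + 0.5) * k / n_samples)
                for n in range(n_samples))
            for k in range(n_samples)]

def idct2(coeffs):
    """Inverse of dct2 (a scaled DCT-III)."""
    n_samples = len(coeffs)
    return [coeffs[0] / n_samples
            + (2.0 / n_samples) * sum(
                coeffs[k] * math.cos(math.pi * (n + 0.5) * k / n_samples)
                for k in range(1, n_samples))
            for n in range(n_samples)]

N = 16
# Hypothetical smooth signal: two low-frequency cosines (illustration only).
signal = [3.0 * math.cos(math.pi * (n + 0.5) * 2 / N)
          + 1.0 * math.cos(math.pi * (n + 0.5) * 5 / N)
          for n in range(N)]

coeffs = dct2(signal)

# Indices of coefficients that are not numerically zero.
significant = [k for k, c in enumerate(coeffs) if abs(c) > 1e-9]

# Keep only the significant coefficients and reconstruct the signal.
kept = [c if abs(c) > 1e-9 else 0.0 for c in coeffs]
restored = idct2(kept)
max_error = max(abs(a - b) for a, b in zip(signal, restored))

print(significant)       # [2, 5]
print(max_error < 1e-9)  # True
```

Sixteen samples carry only two numbers' worth of information here. Real images are only approximately sparse, which is why JPEG follows the transform with quantization.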
[Figure: linear measurements or non-uniform sampling; decoding or recovery]
Classical answer: Underdetermined → cannot solve
– We have k equations and 2m unknowns; if k > 2m, we may have a unique solution
New answer: Information on the 2m unknowns is encoded into k measurements, and we can recover it perfectly and efficiently
– In practice, around 4m measurements are needed

Compressed Sensing
Images and sounds have continuity
– Samples adjacent in time or space are highly correlated (high energy at low frequencies)
Conventional measurements are not efficient
– After acquisition at t0: lower entropy, less information
CS recovers/predicts unobserved quantities from a few observations

Compressed Silicon Sensing
In IC manufacturing, measurements are expensive
– IC cost = die cost + test cost + package cost
Could be applicable to pre-silicon as well (where some simulations are expensive or interpolation is used)
[Figure: wafer and Automatic Test Equipment (ATE)]

Virtual Probe
Framework for wafer characterization
Many wafer test results are spatially correlated across the wafer
[Figure panels: spatially correlated data (282 measurements); random 50 measurements; predicted from 50 samples (1.8% error)]

Our CSS Framework
Test items are also strongly correlated
VP recovers the results of each test item independently
Our approach recovers them simultaneously
[Figure: synthetic wafer, 50 samples/item; panels: normalized flushed delay, normalized log(IDDQ); VP's prediction (12% error); our prediction (0.4% error)]

Our CSS Framework
Can decompose the variation into correlated variation and random variation
[Figure: synthetic wafer, 50 samples/item]

Applications of CSS
[Figure]

Conclusions
Robustness is the key to success in nanometer technologies
– Margins are the easiest way to obtain robustness
– Margins eat up competitiveness
– Margining needs sophisticated engineering (OCV, AOCV, POCV, ...)
Post-silicon engineering (silicon debug, characterization, etc.) is very important under large-scale process variations
Compressed Silicon Sensing
– CS is a revolutionary theory
– Let's take advantage of it in IC design and manufacturing!

Q/A
Thank you!
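Backup example: the counting argument from the Compressed Sensing slides (2m unknowns, k measurements) can be checked on a toy problem. The sketch below is an illustration under stated assumptions, not the recovery method used in the talk: the measurement matrix is a Vandermonde matrix (any 2m of its columns are linearly independent), recovery is an exhaustive search over supports using exact rational arithmetic, and the names `measure`, `solve`, and `recover` are hypothetical. With k = 2m measurements the m-sparse vector is found exactly; with only k = m measurements a different sparse vector fits the data equally well.

```python
from fractions import Fraction
from itertools import combinations

def measure(x, nodes, k):
    """k linear measurements y_i = sum_j nodes[j]**i * x[j] (Vandermonde rows)."""
    return [sum(Fraction(t) ** i * xj for t, xj in zip(nodes, x))
            for i in range(k)]

def solve(a, b):
    """Solve the square system a z = b exactly by Gauss-Jordan elimination."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def recover(y, nodes, m):
    """Return the first m-sparse vector consistent with every measurement in y."""
    k, n = len(y), len(nodes)
    for support in combinations(range(n), m):
        # Fit the m values on this support from the first m measurements...
        a = [[Fraction(nodes[j]) ** i for j in support] for i in range(m)]
        values = solve(a, y[:m])
        x = [Fraction(0)] * n
        for j, v in zip(support, values):
            x[j] = v
        # ...and accept the support only if all k measurements agree.
        if measure(x, nodes, k) == y:
            return x
    return None

nodes = list(range(1, 9))            # 8 unknowns, distinct Vandermonde nodes
x_true = [0, 0, 5, 0, 0, 0, -2, 0]   # m = 2 nonzero entries (hypothetical data)
y = measure(x_true, nodes, 4)        # k = 2m = 4 measurements

print(recover(y, nodes, 2) == x_true)      # True: 2m measurements pin x down
print(recover(y[:2], nodes, 2) == x_true)  # False: m measurements are ambiguous
```

The exhaustive search grows combinatorially with the number of unknowns, so it is only a way to see why 2m measurements can suffice in principle. Efficient recovery methods trade this search for extra measurements, consistent with the "around 4m" figure quoted on the slide.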
