# PowerPoint Template

An Adaptive Parallel Flow for Power Distribution Network Simulation Using Discrete Fourier Transform Xiang Hu1, Wenbo Zhao2, Peng Du2, Amirali Shayan2, Chung-Kuan Cheng2 1 ECE Dept., 2CSE Dept. University of California, San Diego 01/19/2010 Outline Motivation Adaptive simulation flow using discrete Fourier transform (DFT) Basic DFT flow Problems with the basic DFT flow Adaptive DFT flow Experimental results Error analysis Time efficiency of the adaptive flow Time efficiency of parallel processing Conclusions Page 2 Outline Motivation Adaptive simulation flow using discrete Fourier transform (DFT) Basic DFT flow Problems with the basic DFT flow Adaptive DFT flow Experimental results Error analysis Time efficiency of the adaptive flow Time efficiency of parallel processing Conclusions Page 3 PDN Simulation: Why Frequency Domain? Huge PDN netlists Time-domain simulation: serial - slow Frequency-domain simulation: parallel fast Frequency dependent parasitics Simulation results Time-domain: voltage drops, simultaneous switching noise (SSN) input dependent Frequency-domain: impedance, anti-resonance peaks input independent Page 4

Transform Operations: Why Discrete Fourier Transform Laplace Transform [Wanping 07] Input: Series of ramp functions Output: Rational expressing via vector fitting Vector fitting may introduce large errors Choice of frequency samples is case dependent Discrete Fourier Transform (DFT) Input: periodic signal Inverse DFT is straightforward: vector fitting is not needed Frequency sample points with uniform steps Page 5 Outline Motivation Adaptive simulation flow using discrete Fourier transform (DFT) Basic DFT flow Problems with the basic DFT flow Adaptive DFT flow Experimental results Error analysis Time efficiency of the adaptive flow Time efficiency of parallel processing Conclusions Page 6 Basic DFT Simulation Flow Page 7 Problem with Basic DFT Flow Wrap-around effect requires long padding zeros at the end of the input Periodicity nature of DFT Small uniform time steps are needed to cover the input frequency range T x 10 8 T -3 6 x 10 -3 7 5

output 5 4 3 Correct 4 Voltage (V) 6 Current (A) Large number of simulation points! 2T 3T 4T 3 2 2 1 1 0 0 T -1 0 0.2 0.4 0.6 0.8 Time (sec) x 10 8 DFT repetition 7 6

Current (A) Current (A) 5 4 3 x 10 -3 -7 -1 Wrap-around 6 7 output 5 4 3 2 1 1 0 0 0.5 1 1.5 2 x 10 2.5 x 10 -3 5 6 2

0 Time (sec) -8 Distorted! 4 Voltage (V) 8 -3 1 x 10 3 2 1 -1 0 0.5 Page 8 1 1.5 Time (sec) 2 2.5 x 10 -8 -1 0 0 0.2 0.4 0.6 Time (sec) 0.8

1 x 10 -7 -1 0 0.5 1 1.5 Time (sec) 2 2.5 x 10 -8 Adaptive DFT Simulation x 10 4T - -3 5 Voltage (V) 4 3 2 T 6 x 10 -3 6 5 5 4 4

3 3 Voltage (V) 6 2T 3T Voltage (V) T 2 1 1 0 0 0 0 0.2 0.4 0.6 Time (sec) Correct 0.8 -1 1 x 10 -7 0 0.2 0.4 0.6 Time (sec) Distorted 0.8

1 x 10 -3 2 1 -1 x 10 -1 0 0.5 1 1.5 2 Time (sec) -7 2.5 x 10 -8 Correct! Basic ideas of the adaptive DFT flow: cancel out the wrap-around effect by subtracting the tail from the main part of the output Main part of the output: obtained with small time step and small period; distorted by the wrap-around effect Tail of the output: low frequency oscillation; can be captured with large time steps Total number of simulation points is reduced significantly! Page 9 Adaptive DFT Flow Period[i]: the input period at each iteration Interval[i]: the simulation time step at each iteration FreqUpBd[i]: the upper bound of the input frequency range at each iteration vi(t): tentative time-domain output within the frequency range [0, FreqUpBd] at each iteration Iteration #1: obtain the main part of the output Iteration #2~k: capture the oscillations in the

tail of the output (high, middle, and low resonant frequencies) For each iteration #i, i=k, k-1, , 2, subtract the captured tail from the outputs at iteration #j, j

1 0 0.5 10 Page 12 Original Input -3 5 Current (A) Impedance ( ) 4 x 10 4 10 6 Frequency (Hz) 10 8 -1 0 0.5 1 Time (sec) 1.5 2 x 10 -8 Experimental Results: Adaptive Flow Process T1 x 10 10 Iteration #1: v1(t)

-3 v1(t), t1=20ps, T1=20.48ns 8 t1=20ps Voltage (V) 6 4 2 v1 (t ) 0 0 10 Voltage (V) v (mT : (m 1)T ) 2 1 1 m 1 -2 -4 T1=20.48ns 3 x 10 0.5 -3 1 1.5 2 Time (sec) 3T1 2T1 2.5

x 10 -8 Iteration #2: v2(t) 4T1 8T1 t2 = 64t1 v2(t), t2=640ps, T2=163.84ns 5 = 640ps T2 = 8T1 0 = 163.84ns -5 0 10 x 10 1 2 3 4 5 6 7 9 x 10 -8 5 v1 (t ) v (mT : (m 1)T ) 2 1 m 1

Final output-v1(t) Tail: 0 -5 Final output: 3 Main part: Final output v1(t) Final output Voltage (V) 8 Time (sec) -3 v2 (T1 : 8T1 ) 0 Page 13 1 2 3 4 Time (sec) 5 6 7 8 9 x 10 -8 1 Experimental Results: DFT Flow vs. SPICE 10

x 10 -3 DFT flow SPICE transient simulation 8 Voltage (V) 6 4 2 Relative error: 0.25% 0 -2 -4 0 0.5 1 Time (sec) Page 14 1.5 2 x 10 -7 Error Analysis: Error Caused by Wrap-around Effect 6 x 10 -3 Output comparison 1.5 DFT flow, T=20.48ns DFT flow, T=163.84ns SPICE transient simulation 5 -4 Error relative to SPICE Difference between DFT and SPICE, T=20.48ns Difference between DFT and SPICE, T=163.84ns

1 4 Relative error: 0.12% 0.5 3 Voltage (V) Voltage (V) x 10 2 0 -0.5 1 0 -1 -1 -1.5 Relative error: 2.09% 0 0.5 1 1.5 Time (sec) 2 2.5 x 10 -8 0 0.5 1 1.5 Time (sec)

2 2.5 x 10 Theorem 1: Let be the initial value of the output voltage. Suppose for some , then the mean square error, i.e., is bounded by . Page 15 -8 Error Analysis: Error Caused by Different Interpolation Methods -3 6 x 10 Output comparison 2 DFT flow, t=20ps DFT flow, t=2.5ps SPICE transient simulation 5 x 10 -5 Error relative to SPICE Difference between DFT and SPICE, t=20ps Difference between DFT and SPICE, t=2.5ps 1.5 1 4 Relative error: 0.12% 0.5 Voltage (V) Voltage (V) 3 2 0 -0.5

1 0 -1 -1.5 0 0.2 0.4 0.6 0.8 Time (sec) x 10 Voltage (V) 1 x 10 -7 1.31 1.3 1.29 2.294 0.2 0.4 0.6 0.8 Time (sec) DFT: sinusoidal interpolation DFT flow, t=20ps DFT flow, t=2.5ps SPICE transient simulation Page 16 -2 0 SPICE: PWL interpolation -4 1.32

1.28 Relative error: 0.09% -1 2.2945 2.295 Time (sec) 2.2955 2.296 x 10 -8 1 x 10 -7 Time Complexity Analysis: Adaptive vs. Non-adaptive k Adaptive flow time complexity: O( Ti / ti ) i 1 Ti: simulation period at iteration #i, T1 T2 Tk ti: simulation time step at iteration #i, t1 t2 tk Non-adaptive flow time complexity: O(Tk / t1 ) 2.5 Non-adaptive method Adaptive method Relative error to Hspice (%) 2 1.5 1 0.5 Page 17 0 0 100 200 300

400 Time (sec) 500 600 700 Parallel Processing 16000 Test case: 3D PDN Basic DFT Adpative DFT 14000 ~ 0.17 million nodes The adaptive DFT flow has more advantage when the number of available processors is limited. Simulations between each iteration of the adaptive flow need to be processed serially Simulation time (sec) 12000 10000 8000 6000 4000 2000 0 0 50 100 150 200 Number of processors 250 Simulation time with different number of processors (sec) 1 prc 4 prcs 8 prcs 16 prcs 64 prcs

128 prcs 256 prcs Hspice 21374 NA NA NA NA NA NA Basic DFT 15976 6635 3225 1647 425 218 143 Adaptive DFT 4947 2096 904 490 180 120 115 Page 18 300 Parallel Processing: DFT Flow vs. SPICE

DFT error compared to Hspice Simulation result 8 x 10 -3 3 Adaptive DFT Hspice 7 x 10 -5 2 6 1 Voltage (V) Voltage (V) 5 4 3 2 0 -1 -2 1 -3 0 -1 0 Page 19 0.5 1 1.5 Time (sec) 2 2.5 x 10 -8

-4 0 0.5 1 1.5 Time (sec) 2 2.5 x 10 -8 Outline Motivation Adaptive simulation flow using discrete Fourier transform (DFT) Basic DFT flow Problems with the basic DFT flow Adaptive DFT flow Experimental results Error analysis Time efficiency of the adaptive flow Time efficiency of parallel processing Conclusions Page 20 Conclusions Implemented an adaptive flow for large PDN simulation using DFT Total number of simulation points is reduced significantly compared to the basic DFT flow Achieved a relative error of the order of 0.1% compared to SPICE 10x speed up with a single processor compared to SPICE. Parallel processing is incorporated to reduce the simulation time even more significantly Page 21 Page 22

## Recently Viewed Presentations

• From the Founder's Desk:. Shreyans. Chopra is the Founder and CEO at RTB Demand Group. Having graduated from University of Pennsylvania, he started working as a Technology Analyst in Goldman Sachs.
• Enfermedades repetidas o crónicas (especialmente si es hereditaria) "Donde haya un virus, lo pesco yo" "Me siento enfermo y cansado…" "Es un mal de familia, así que ahora me toca a mí" Esterilidad, tendencia al aborto o problemas femeninos relacionados...
• Variations in CERES-Terra Fluxes and Cloud Properties with SST Anomalies Zach Eitzen (SSAI/NASA-LaRC) Kuan-Man Xu (NASA-LaRC) Takmeng Wong (NASA-LaRC) Acknowledgments Thanks to David Doelling and Michele Nordeen for help with CERES cloud property data.
• Scientists wouldn't cheat or cut corners to get ahead Online free access journals wouldn't blindly accept a paper Publishers won't mess up a paper in printing
• Cavendish School Lincoln Close, Runcorn, Cheshire, England Whole data using Durham University Data base CEM school Performance descriptions in English The following are descriptions of pupils' performance that indicate the emergence of skills, knowledge and understanding in English.
• (4 marks) AO1 Likely studies include: Paulescu et al (1993), Logie et al (1989), Robbins et al (1996), Hunt (1980), Brooks (1968), Baddeley, Thomson and Buchanan (1975), Hoosain and Salili (1988), Hulme, Thomson, Muir and Lawrence (1984), Levey, Aldaz, Watts...
• The passécomposé. The passé composéexpresses what happened in the past (sometimes called the past perfect tense). It's not the only French past tense. It has 2 parts: helping (auxiliary) verb. and a . past participle. The helping verb for most...
• Thus far we have focused on product markets, where households demand goods and services are supplied by firms (upper loop). We now turn to the resource markets, where firms demand factors of production which are supplied by households in exchange...