# Addressing DDR5 design challenges with IBIS-AMI modeling techniques

Todd Westerhoff, SiSoft Doug Burns, SiSoft Eric Brock, SiSoft

DesignCon 2018 IBIS Summit Santa Clara, California February 2, 2018



# Agenda

- Traditional DDR analysis
- Eyes and probabilities
- IBIS-AMI models
- DDR5 topologies and transactions
- Optimizing terminations
- Pulse response analysis
- Applying equalization
- Summary



#### Traditional DDR Analysis







Signal Integrity + Flight Time Extraction + Timing Analysis

- Combined signal integrity + timing analysis
  - → voltage and timing margins
- SI analysis < 100 data bits</li>
- Margins computed based on worst case data



#### **Eyes and Probabilities**



- SerDes analysis techniques used to predict eye characteristics > 1M bits
- Eye margins measured against reference mask
  - Timing / skew adjustments are assumed



#### **IBIS-AMI** Models

#### Goals

- Interoperable: different vendor models work together
- Portable: one model runs in multiple simulators
- Flexible: supports Statistical and Time-Domain simulation
- High Performance: simulates a million bits per CPU minute
- Accurate: high correlation to simulations / measurement

#### **Assumptions**

- "High impedance node" between analog I/O & equalization circuitry
- Analog I/O operates in linear region
- EQ behavior modeled by code linked into simulator
- EQ models meet AMI API



#### Serial Channels vs. DDR Topologies

#### Serial Channels

- Differential
- Long and lossy
- Unidirectional
- Continuous operation
- Simple impedance control
- Point to point topology

#### **DDR Topologies**

- Single-ended & differential
- Short and reflective
- Bidirectional
- Communicate in bursts
- Complex impedance control
- Multiple sources / loads with long branches



Feb 2018

#### DDR5 Topologies and Transactions



- 1. (Write) Controller to DIMM1
- 2. (Write) Controller to DIMM2
- 3. (Read) DIMM1 to Controller
- 4. (Read) DIMM2 to Controller



#### **Optimizing Terminations**

#### **Driver settings:**

```
    DQ_ZO34 Generic DDR5 34 Ohm Driver
    DQ_ZO40 Generic DDR5 40 Ohm Driver
    DQ_ZO48 Generic DDR5 48 Ohm Driver
```

#### Receiver / terminator settings:

```
DQ_IN_ODTOFF Generic DDR5 Receiver with No ODT
DQ_IN_ODT34 Generic DDR5 Receiver with 34 ohm ODT
DQ_IN_ODT40 Generic DDR5 Receiver with 40 ohm ODT
DQ_IN_ODT48 Generic DDR5 Receiver with 48 ohm ODT
DQ_IN_ODT60 Generic DDR5 Receiver with 60 ohm ODT
DQ_IN_ODT80 Generic DDR5 Receiver with 80 ohm ODT
DQ_IN_ODT120 Generic DDR5 Receiver with 120 ohm ODT
DQ_IN_ODT240 Generic DDR5 Receiver with 240 ohm ODT
```

#### Each transaction:

```
(Driver) * (Receiver) * (Terminator) * (Terminator)
= 3 * 8 * 8 * 8 * 8
= 12,288 combinations per transaction
* 4 transactions
= 49,152 total termination settings (!)
```

Sectings (:)

# Finding The Best Case

|                         |            | _       |               |                        |   |              |   |              |   |              | $\overline{}$ |             |   |              |   |              |   |
|-------------------------|------------|---------|---------------|------------------------|---|--------------|---|--------------|---|--------------|---------------|-------------|---|--------------|---|--------------|---|
|                         |            |         | Variation     |                        |   |              |   |              |   |              |               |             |   |              |   |              |   |
| Variable:               | Type:      | Format: | Group:        | Value 1:               |   | Value 2:     |   | Value 3:     |   | Value 4:     |               | Value 5:    |   | Value 6:     |   | Value 7:     |   |
| \$controller:t1:model   | IBIS Model | List    | <none></none> | DQ_Z034                | ¥ | DQ_Z040      | ¥ | DQ_Z048      | v |              | v             |             | ~ |              | v |              | Ţ |
| \$dimm1_sdram1:t1:model | IBIS Model | List    | <none></none> | DQ_IN_ODT34            | v | DQ_IN_ODT40  | v | DQ_IN_ODT48  | v | DQ_IN_ODT60  | v             | DQ_IN_ODT80 | v | DQ_IN_ODT120 | v | DQ_IN_ODT240 | v |
| \$dimm1_sdram2:t1:model | IBIS Model | List    | <none></none> | DQ_IN_ODT120           | v | DQ_IN_ODT240 | ¥ | DQ_IN_ODTOFF | - |              | v             |             | v |              | ¥ |              | V |
| \$dimm2_sdram1:t1:model | IBIS Model | List    | dimm2         | DQ_IN_ODT60            | ¥ | DQ_IN_ODT80  | ¥ | DQ_IN_ODT120 | v | DQ_IN_ODT240 | v             |             | v |              | ¥ |              | v |
| \$dimm2_sdram2:t1:model | IBIS Model | List    | dimm2         | DQ_IN_ODT60            | ¥ | DQ_IN_ODT80  | v | DQ_IN_ODT120 | - | DQ_IN_ODT240 | v             |             | • |              | • |              | - |
| \$UI                    | UI         | List    | <none></none> | 0.313ns - ddr5_dq_3200 | v |              | v |              | v |              | v             |             | v |              | v |              | - |

252 cases tested (out of a possible 12,288)



Voltage margin distribution



#### Best case conditions



Voltage and timing margin against mask



# Limiting Transaction vs. Speed



- Data rate: 3200 to 6400MT/s, increments of 400 MT/s
- Voltage margin against mask is plotted
- Two questions
  - Why is 3600 MT/s so much worse than 3200 MT/s?
  - How can 4400 MT/s be better than 3600 MT/s?



### Pulse Response Analysis

3200 MT/s (Blue) vs 3600 MT/s (Red)



- Vertical lines denote ISI at sampling times
- 3600 MT/s maximizes ISI at sample point



#### Pulse Response Analysis

4400 MT/s (Cyan) vs 3600 MT/s (Red)



- 4400 MT/s has less pulse height & width
- 4400 MT/s ISI is much better than 3600 MT/s



#### 3600 MT/s vs. 4400 MT/s



#### 3600 MT/s







### **Applying Equalization**



- Problem here isn't loss, it's <u>ringing</u>
- TX FIR and RX CTLE filters deal with loss
- RX DFE is best suited to correct ringing
- RX ISI voltages can be read directly off the plot



### Eye Height vs. DFE Taps



- We compute theoretical eye opening from the pulse response
- We also compute how DFE taps could improve eye height
  - Assumes tap range, granularity, training efficiency



# Impact of 2 DFE Taps



3600 MT/s @ die pad







### DFE Taps vs. Speed



- Data rate: 3200 to 6400MT/s, increments of 400 MT/s
- Voltage margin against mask is plotted
- DFE taps: 0 (Red), 2 (Blue), 4 (Green), 8 (Yellow)



# Modeling DRAM DFE Training





- DDR5 controllers set DRAM DFE taps by training at system startup
- Modeling training sequences with AMI models is difficult & time consuming
- Manually optimizing DRAM DFE settings through simulation is difficult & time consuming
- Approach: let DRAM DFE taps adapt, then lock settings when "Ignore\_Bits" is reached



# Modeling DRAM DFE Training





- How well does a "regular" adapting DFE model approximate an optimally trained, fixed-tap model?
  - Pretty well, in this example
  - Adapting (top eye, red contour)
  - Fixed (bottom eye, blue contour)
- In a pinch, an adapting DFE will probably give a reasonable estimate



# Modeling DQS Clock Forwarding





- DDR5 forwards the sampling clock via the DQS signal
- AMI models typically recover the clock from the incoming data
- "Locking the clock" in a typical AMI model approximates forwarded clock behavior
- Note that the "locked clock" contour (blue) is <u>larger</u> than its counterpart



#### Modeling Tx and Rx Jitter





- Previous simulation results have omitted Tx/Rx jitter for simplicity
- AMI jitter / noise modeling
  - TX jitter directly modulates TX output
  - RX jitter modulates sampling clock
  - RX noise affects sampling latch input
- Effect RX eye contour
  - No jitter (red)
  - TX Dj (blue)
  - RX Dj (green)



### Summary

- Modern DDR analysis uses techniques that predict behavior over millions of data bits
- Optimizing driver / receiver termination settings is essential to ensuring margin
- Pulse response analysis provides useful insight into how ringing affects DDR design margins
- AMI models can be used to predict how Tx/Rx EQ will improve design margins
- This presentation only covered <u>part</u> of the full methodology needed for DDR5 analysis





Todd Westerhoff twesterh@sisoft.com

Doug Burns dburns@sisoft.com

Eric Brock ebrock@sisoft.com

