Modeling and Simulation of Single-ended PAM4 Signals in Memory Interfaces

Fangyi Rao, Keysight Technologies

Virtual Asian IBIS Summit (China)
November 19, 2021
[Previously given November 12, 2021 (Japan)]
Introduction

- Data rate in GDDR6X reaches 19-21 GT/s
- PAM4 signaling is adopted to mitigate channel bandwidth limitations at these data rates
- Every 2 bits are mapped to one PAM4 level (symbol)

\[
\begin{array}{ccc}
\text{Level} & \text{Linear} & \text{Gray} \\
3 & 11 & 10 \\
2 & 10 & 11 \\
1 & 01 & 01 \\
0 & 00 & 00
\end{array}
\]

- PAM4 symbol rate is half of the bit rate
- PAM4 requires half the bandwidth of an NRZ signal at the same bit rate
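
The bit-to-symbol mapping can be sketched in a few lines of Python. This is an illustration only: it assumes the bottom row of the mapping is the lowest level (symbol 0) and that the Gray column reads 10, 11, 01, 00 from top to bottom. The function names are hypothetical.

```python
# Bit-pair to PAM4 symbol tables (symbol 0 is the lowest level, 3 the highest).
LINEAR = {"00": 0, "01": 1, "10": 2, "11": 3}
GRAY = {"00": 0, "01": 1, "11": 2, "10": 3}


def bits_to_pam4(bits, table):
    """Map a bit string of even length to a list of PAM4 symbols (0..3)."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [table[bits[i:i + 2]] for i in range(0, len(bits), 2)]


def adjacent_level_bit_errors(table):
    """Bit errors caused by mistaking level k for level k+1, for k = 0, 1, 2."""
    inverse = {level: pair for pair, level in table.items()}
    return [
        sum(a != b for a, b in zip(inverse[k], inverse[k + 1]))
        for k in range(3)
    ]


bits = "11100100"
linear_symbols = bits_to_pam4(bits, LINEAR)
gray_symbols = bits_to_pam4(bits, GRAY)
```

Eight bits become four symbols, which is the halved symbol rate. With the Gray table, every adjacent-level decision error flips exactly one bit, whereas the linear table flips two bits between levels 1 and 2.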

![Graph showing comparison of NRZ and PAM4 signals](image)
Introduction (cont’d)

- Level separation in PAM4 is 1/3 of that in NRZ → 9.5 dB SNR penalty
- Level separation loss must be offset by
  - Sufficient dynamic range and linearity
  - Powerful equalization capability
  - Efficient tuning and optimization
  - Reduced jitter and noise impairments
- Comprehensive modeling and simulation methodology is essential
- IBIS-AMI is a proven, versatile and high-performance modeling and simulation framework for serial and parallel link analyses
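
As a quick check of the penalty figure, assume NRZ and PAM4 share the same peak-to-peak swing $V_{pp}$ and the same noise level (assumptions not stated on the slide). The ratio of level separations then gives:

\[
20 \log_{10}\!\left(\frac{V_{pp}/3}{V_{pp}}\right) = 20 \log_{10}\!\left(\frac{1}{3}\right) \approx -9.54\ \text{dB}
\]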
Challenges in PAM4 Memory Interface Modeling and Simulation

• PAM4 signaling in single-ended interfaces
• Common mode in single-ended PAM4 signals
• Asymmetric rise and fall edges in single-ended I/O
• Source-synchronous clocking
• Jitter tracking and the impact of unmatched receiver
• …
Single-ended PAM4 Signaling in AMI Simulation

- Stimulus input waveform to Tx GetWave is differential and has four levels at -1/2, -1/6, 1/6 and 1/2 V, representing PAM4 symbols 0, 1, 2 and 3, respectively.

- The single-ended signal at the Rx input is decomposed into common and differential components

\[ v_{in}(t) = V_{DC\_offset} + v_{diff}(t) \]

- The four-level differential component \( v_{diff}(t) \) is the result of convolution between Tx GetWave output and the channel impulse response.

- \( v_{diff}(t) \) is the input waveform to Rx GetWave

- The common mode \( V_{DC\_offset} \) is assumed to be a constant and defined as the midpoint between level 0 and level 3 static state voltages at Rx input
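
A minimal sketch of this decomposition follows. The static voltages are illustrative values, not taken from the slides, and the helper names are hypothetical.

```python
def dc_offset(v_level0_static, v_level3_static):
    """Common mode: midpoint between the level 0 and level 3 static voltages."""
    return 0.5 * (v_level0_static + v_level3_static)


def split_single_ended(v_in, offset):
    """Return v_diff(t) = v_in(t) - V_DC_offset for each sample."""
    return [v - offset for v in v_in]


def recover_single_ended(v_diff, offset):
    """Rebuild v_in(t) by adding the constant common mode back."""
    return [v + offset for v in v_diff]


# Illustrative static voltages at the Rx input, in volts (assumed values).
offset = dc_offset(0.6, 1.2)
v_in = [0.6, 0.8, 1.0, 1.2]
v_diff = split_single_ended(v_in, offset)
v_back = recover_single_ended(v_diff, offset)
```

Because the common mode is treated as a constant, the split is lossless: adding the offset back to `v_diff` reproduces the single-ended samples.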
Single-ended PAM4 Signaling in AMI Simulation (cont’d)

- $V_{DC\_offset}$ is characterized by EDA tools and passed into Rx Init through parameter DC_Offset
- Rx GetWave can choose to internally recover the single-ended $v_{in}(t)$ by adding $V_{DC\_offset}$ to its input waveform $v_{diff}(t)$

![Simulated single-ended waveform and eye diagram at DQ Rx input (package) pin](image)
Modeling Vref

• The Rx comparator subtracts Vref from the input single-ended signal
• Vref is calibrated to one of a set of discrete values during the system training phase
• Vref normally is close to but not exactly equal to $V_{DC\_offset}$, leaving a small residual DC bias in the comparator output
• The comparator output $v_{out}$ can be modeled as

$$v_{out}(t) = v_{in}(t) - V_{ref} = V_{DC\_offset} + v_{diff}(t) - V_{ref}$$
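
The residual bias can be illustrated with a short sketch. It assumes, purely for illustration, that training picks the discrete Vref code closest to the common mode. The grid of Vref codes and the voltages are invented values, and the function names are hypothetical.

```python
def nearest_vref(target, vref_codes):
    """Pick the discrete Vref setting closest to the target voltage."""
    return min(vref_codes, key=lambda v: abs(v - target))


def comparator_output(v_diff, dc_offset, vref):
    """v_out(t) = V_DC_offset + v_diff(t) - Vref."""
    residual_bias = dc_offset - vref
    return [v + residual_bias for v in v_diff]


# Assumed Vref grid: 0.85 V to 0.95 V in 20 mV steps.
vref_codes = [0.85 + 0.02 * k for k in range(6)]
dc_offset = 0.903
vref = nearest_vref(dc_offset, vref_codes)
v_out = comparator_output([-0.3, -0.1, 0.1, 0.3], dc_offset, vref)
```

Here the closest code is about 0.91 V, so every comparator output sample is shifted by roughly -7 mV relative to `v_diff`.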
Asymmetric Rise and Fall Edges in Single-ended Signal

• In a single-ended I/O, the pull-up and pull-down slew rates are usually noticeably different, leading to asymmetric rise and fall edges

• With asymmetric edges the upper and lower PAM4 eyes are asymmetric

• Advanced AMI simulation algorithms are developed to capture the difference between rise and fall waveforms

Simulated eye diagram at DQ Rx input pin
Transmitter Nonlinearity ($R_{LM}$)

- Stimulus input waveform to Tx GetWave has four ideal levels of $-1/2$, $-1/6$, $1/6$ and $1/2$ V, representing four PAM4 levels linearly separated by a uniform step of $1/3$ V.
- Tx GetWave can internally map these levels to non-ideal values to model Tx nonlinearity.
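
The level remapping can be sketched as below. The slide does not define $R_{LM}$, so this example assumes the level-separation-mismatch-ratio definition commonly used in IEEE 802.3 PAM4 specifications. The non-ideal level values are invented for illustration.

```python
IDEAL_LEVELS = (-1 / 2, -1 / 6, 1 / 6, 1 / 2)


def map_levels(symbols, levels):
    """Replace each PAM4 symbol (0..3) with its, possibly non-ideal, voltage."""
    return [levels[s] for s in symbols]


def rlm(levels):
    """Level separation mismatch ratio (assumed IEEE 802.3 style definition)."""
    v0, v1, v2, v3 = levels
    v_mid = (v0 + v3) / 2
    es1 = (v1 - v_mid) / (v0 - v_mid)
    es2 = (v2 - v_mid) / (v3 - v_mid)
    return min(3 * es1, 3 * es2, 2 - 3 * es1, 2 - 3 * es2)


# Hypothetical compressed inner levels modeling Tx nonlinearity.
nonideal_levels = (-0.5, -0.2, 0.15, 0.5)
tx_wave = map_levels([0, 1, 2, 3], nonideal_levels)
```

Under this definition the ideal levels give $R_{LM} = 1$, and the hypothetical non-ideal set above gives about 0.8.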

Simulated eye diagram at DQ Rx input pin.
Signal-to-Noise-and-Distortion-Ratio (SNDR)

$$SNDR = 10 \log_{10} \left( \frac{p_{\text{max}}^2}{\sigma_e^2 + \sigma_n^2} \right)$$

- $p_{\text{max}}$: maximum Tx output signal amplitude
- $\sigma_e$: RMS of Tx output nonlinear distortion
- $\sigma_n$: RMS of Tx output noise
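
The formula translates directly into code. The sketch below also inverts it to find the noise RMS that yields a target SNDR. The numeric values are illustrative only and are not the settings used for the eye diagrams in the slides.

```python
import math


def sndr_db(p_max, sigma_e, sigma_n):
    """SNDR = 10 log10(p_max^2 / (sigma_e^2 + sigma_n^2)), in dB."""
    return 10 * math.log10(p_max ** 2 / (sigma_e ** 2 + sigma_n ** 2))


def sigma_n_for_sndr(p_max, sigma_e, target_db):
    """Noise RMS needed to reach target_db for a given distortion RMS."""
    total_sq = p_max ** 2 / 10 ** (target_db / 10)
    noise_sq = total_sq - sigma_e ** 2
    if noise_sq < 0:
        raise ValueError("distortion alone already exceeds the target SNDR")
    return math.sqrt(noise_sq)


# Illustrative case: 0.5 V amplitude, no distortion, 12 dB target.
sigma_n = sigma_n_for_sndr(0.5, 0.0, 12.0)
check_db = sndr_db(0.5, 0.0, sigma_n)
```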

Simulated eye diagram at DQ Rx input pin

Without Tx noise

With 12dB Tx noise
Transmitter Equalization

Simulated eye diagram at DQ Rx input pin
Receiver Equalization: CTLE and DFE

DQ Rx input

DQ Rx output after CTLE and 3-tap DFE
Modeling Clock Forwarding Architecture

• In SerDes interfaces, the clock is embedded in the data, so Rx GetWave has only one input waveform: the data signal.

• In memory interfaces, the data Rx uses the strobe (WCK in GDDR) as a forwarded clock for the DFE and data sampling.

• In practice, the data Rx in a memory interface therefore has two inputs: data and strobe.

• To enable modeling of clock forwarding, the Rx GetWave API is extended to take both data and clock input waveforms.
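
The idea of a two-input receiver can be sketched as follows. This is not the IBIS-AMI function signature: `rx_getwave_with_clock` is a hypothetical name, and sampling the data at every threshold crossing of the clock waveform is an assumed, simplified behavior.

```python
def crossing_times(wave, dt, threshold=0.0):
    """Linearly interpolated times at which wave crosses threshold."""
    times = []
    for i in range(1, len(wave)):
        a = wave[i - 1] - threshold
        b = wave[i] - threshold
        if (a < 0) != (b < 0):  # sign change between consecutive samples
            times.append((i - 1 + a / (a - b)) * dt)
    return times


def sample_at(wave, dt, t):
    """Linearly interpolate wave at time t, or None if t is out of range."""
    x = t / dt
    i = int(x)
    if x < 0 or i + 1 >= len(wave):
        return None
    frac = x - i
    return wave[i] * (1 - frac) + wave[i + 1] * frac


def rx_getwave_with_clock(data_wave, clock_wave, dt):
    """Hypothetical two-input Rx: sample data at each forwarded-clock edge."""
    pairs = []
    for t in crossing_times(clock_wave, dt):
        value = sample_at(data_wave, dt, t)
        if value is not None:
            pairs.append((t, value))
    return pairs


samples = rx_getwave_with_clock(
    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0], [-1.0, 1.0, 1.0, -1.0, -1.0, 1.0], 1.0
)
```

Because the sampling instants come from the clock waveform rather than from recovered data edges, any timing shift on the clock input moves the data sampling points with it.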
Simulation Flow

- Step 1: Calculate analog channel output (crosstalk is taken into account)
- Step 2: Calculate output of all WCK Rx
- Step 3: Use WCK Rx output as DQ Rx clock input and calculate output of all DQ Rx
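
The ordering of the three steps can be sketched as below. All names are hypothetical, the Rx models are passed in as plain callables, and the channel is reduced to a discrete convolution per Tx-to-Rx path, so this is a structural illustration rather than a simulator.

```python
def convolve(x, h, dt):
    """Discrete approximation of the convolution integral, truncated to len(x)."""
    return [
        dt * sum(h[k] * x[n - k] for k in range(min(n + 1, len(h))))
        for n in range(len(x))
    ]


def channel_outputs(tx_waves, impulse_responses, dt):
    """Step 1: each Rx input sums every Tx wave through its path.

    impulse_responses[(tx, rx)] is the through path when tx == rx and a
    crosstalk path otherwise.
    """
    length = len(next(iter(tx_waves.values())))
    outputs = {rx: [0.0] * length for (_, rx) in impulse_responses}
    for (tx, rx), h in impulse_responses.items():
        for i, v in enumerate(convolve(tx_waves[tx], h, dt)):
            outputs[rx][i] += v
    return outputs


def simulate(tx_waves, impulse_responses, wck_rx, dq_rx, dq_clock, dt):
    rx_in = channel_outputs(tx_waves, impulse_responses, dt)         # Step 1
    wck_out = {n: model(rx_in[n]) for n, model in wck_rx.items()}    # Step 2
    return {                                                         # Step 3
        n: model(rx_in[n], wck_out[dq_clock[n]])
        for n, model in dq_rx.items()
    }


# Toy example: one DQ, one WCK, 10% WCK-to-DQ crosstalk, trivial Rx models.
result = simulate(
    tx_waves={"DQ0": [1.0, 0.0, 0.0], "WCK0": [0.0, 1.0, 0.0]},
    impulse_responses={
        ("DQ0", "DQ0"): [1.0],
        ("WCK0", "WCK0"): [1.0],
        ("WCK0", "DQ0"): [0.1],
    },
    wck_rx={"WCK0": lambda w: [2.0 * v for v in w]},
    dq_rx={"DQ0": lambda d, c: [a + b for a, b in zip(d, c)]},
    dq_clock={"DQ0": "WCK0"},
    dt=1.0,
)
```

The key design point is the dependency order: every WCK Rx output must exist before any DQ Rx runs, because the DQ Rx takes it as its clock input.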
Jitter Tracking and Unmatched Rx Effect

• Correlated jitter in data and strobe can be tracked and canceled in data Rx by clock forwarding

• Unmatched data and strobe Rx paths reduce data-strobe jitter correlation and degrade data Rx jitter tracking and DFE performance
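
A simple model shows why a path mismatch weakens tracking. Assume data and strobe carry the same sinusoidal jitter (SJ) of amplitude $A$ and frequency $f$, and that the strobe reaches the sampler with an extra delay $\tau$. The jitter seen by the sampler is then the difference of the two, with peak value $2A\,|\sin(\pi f \tau)|$. The function names and numeric values below are illustrative assumptions.

```python
import math


def residual_sj_amplitude(amplitude_ui, freq_per_ui, delay_ui):
    """Peak of A sin(2 pi f t) - A sin(2 pi f (t - tau)), in closed form."""
    return 2.0 * amplitude_ui * abs(math.sin(math.pi * freq_per_ui * delay_ui))


def residual_sj_numeric(amplitude_ui, freq_per_ui, delay_ui, step_ui=0.05, n=4000):
    """Same quantity found by brute force over sampled time points."""
    worst = 0.0
    for k in range(n):
        t = k * step_ui
        dq = amplitude_ui * math.sin(2 * math.pi * freq_per_ui * t)
        wck = amplitude_ui * math.sin(2 * math.pi * freq_per_ui * (t - delay_ui))
        worst = max(worst, abs(dq - wck))
    return worst


# Assumed SJ: 0.1 UI amplitude at 0.05 cycles per UI.
matched = residual_sj_amplitude(0.1, 0.05, 0.0)
delayed = residual_sj_amplitude(0.1, 0.05, 6.0)
delayed_numeric = residual_sj_numeric(0.1, 0.05, 6.0)
```

With zero delay the jitter cancels completely. With a 6 UI delay at this assumed SJ frequency, the residual exceeds the original amplitude, so the forwarded clock no longer tracks the data jitter.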
Jitter Tracking and Unmatched Rx Effect (cont’d)

- DQ Rx input without DQ & WCK Tx SJ
- DQ Rx input with DQ & WCK Tx SJ (0 WCK-to-DQ delay)
- DQ Rx input with DQ & WCK Tx SJ (6UI WCK-to-DQ delay)
- DQ Rx post-DFE without DQ & WCK Tx SJ
- DQ Rx post-DFE with DQ & WCK Tx SJ (0 WCK-to-DQ delay)
- DQ Rx post-DFE with DQ & WCK Tx SJ (6UI WCK-to-DQ delay)

- Tx SJ is canceled in DQ Rx by DQ-WCK jitter tracking
- Jitter tracking is less effective with WCK-to-DQ delay

Rx input eye is closed by Tx SJ
Summary

• IBIS-AMI methodology is applied to model and simulate single-ended PAM4 signals in memory interfaces

• Stimulus input waveform to Tx GetWave is differential and has four levels at -1/2, -1/6, 1/6 and 1/2 V, representing PAM4 symbols 0, 1, 2 and 3, respectively

• Common mode of single-ended signal is included in simulation by decomposing a single-ended signal into common and differential components

• Asymmetric rise and fall edges are captured in advanced IBIS-AMI simulation algorithms

• $R_{LM}$, SNDR and EQ (Tx and Rx) can be included in model and simulation

• Modeling of clock forwarding is enabled by extending the Rx GetWave API to take two input waveforms, one for data, and the other for forwarded clock

• The extended API naturally captures the jitter tracking behavior and impacts of unmatched Rx