



# **Energy-Efficient CMOS Optical Receiver for Short-Reach Data Center Application**

**Chongyun Zhang** 

Thesis supervisor: Prof. C. Patrick Yue

June 13, 2025

Optical Wireless Lab

Department of Electronic and Computer Engineering
The Hong Kong University of Science and Technology (HKUST)

### **Outline**

- Background
- PAM-4 Optical Receiver Data Path
- PAM-4 Optical Receiver Front End
- Conclusion

# Research Background

- Data center networks keep scaling in BW and physical size
- Optical interconnects offer enhanced traffic capacity and reduced power consumption
- Further scaling of efficiency and density remains challenging due to limited integration in optical modules



#### **Network switch arrays with E/O interfaces**



(\*Revenue in the Data Center market for different segments Worldwide from 2018 to 2029 [Graph], Statista Market Insights, July 22, 2024. [Online]. Available: <a href="https://www.statista.com/forecasts/1441973/revenue-data-center-market-for-different-segments-worldwide">https://www.statista.com/forecasts/1441973/revenue-data-center-market-for-different-segments-worldwide</a>.)

# **Development Trend in Optical Interconnect**

#### Intensity-Modulation Direct-Detection (IMDD) Optical Link





**Retimed Pluggable Optics (RPO)** 

- High power consumption from DSP chips
- Increase of frequency-dependent losses in PCB
- Heavy cost and power from SiGe BiCMOS front end





**Linear Pluggable Optics (LPO)** 

### **Design Challenges**



#### **Integration of front-end transceivers in CMOS**

- Lower  $f_T$  and intrinsic gain
- Worse noise performance
- Limited supply voltage



**PAM-4 Design Tradeoffs** 



# Adoption of four-level pulse amplitude modulation (PAM-4)

- ~9.5 dB worse signal-to-noise ratio (SNR)
- Higher linearity to preserve four symbols
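The ~9.5 dB figure follows from splitting a fixed peak-to-peak swing into three stacked eyes, so each eye is one third the height of the NRZ eye. A minimal sketch of that arithmetic, assuming equal swing and equal noise for both formats (the function name is illustrative):

```python
import math

def pam_snr_penalty_db(levels: int) -> float:
    """SNR penalty of PAM-N relative to NRZ at the same peak-to-peak swing."""
    # The outer swing is divided into (levels - 1) stacked eyes,
    # so each eye amplitude shrinks by that factor.
    return 20.0 * math.log10(levels - 1)

penalty_pam4 = pam_snr_penalty_db(4)
print(f"PAM-4 penalty: {penalty_pam4:.2f} dB")
```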

# Research Scope







### **Outline**

- Background
- PAM-4 Optical Receiver Data Path
  - System Architecture
  - Implementation
  - Measurement Result
- PAM-4 Optical Receiver Front End
- Conclusion

# **Data Center Optical Interconnect**



- Integrating TIA and sampler (deserializer) reduces the overhead and electrical connections
- Low power consumption and low cost for 50-Gb/s link

# **Design Challenges**



TIA: transimpedance amplifier; VGA: variable gain amplifier; CTLE: continuous-time linear equalizer

### **RX Architectural Consideration**

- CTLE is avoided to save power
- Passive inductors are avoided to save area
- Transadmittance-stage transimpedance-stage (TAS-TIS) topology is employed



Equalizers are integrated at sampler to mitigate the residual ISI of TIA

### **RX Architectural Consideration**



- ORX link model to evaluate post-TIA equalization
- TIA model with a second-order flat response

### **RX Architectural Consideration**



- BW<sub>TIA</sub> < 0.45x baud rate, sensitivity is limited by ISI
- BW<sub>TIA</sub> > 0.45x baud rate, sensitivity is limited by noise
- A combination of a 2-tap FFE and a 2-tap DFE delivers the best overall sensitivity
- Optical receiver (ORX) is designed with a BW<sub>TIA</sub> of ~0.5x baud rate, followed by a 2-tap FFE and a 2-tap DFE

### **Proposed Architecture**



#### Linear TIA

- >73 dBΩ maximum gain
- >20 dB dynamic range
- Compact and energy-efficient

#### PAM-4 sampler

- Half-rate structure
- 2-tap FFE + 2-tap DFE

#### Clock path

- External differential clock
- Voltage-controlled delay line
- Divider (DIV)

### **Outline**

- Background
- PAM-4 Optical Receiver Data Path
  - System Architecture
  - Implementation
  - Measurement Result
- PAM-4 Optical Receiver Front End
- **Conclusion**

### PD Interface and TIS



#### **Direct connection scheme**

- ✓ Simple and saves pads
- Noise at VSS<sub>TIA</sub> and PD bias affects the single-ended input signal



#### **On-chip connection scheme**

- ✓ Enhanced ground noise rejection by ac-coupled VSS<sub>TIA</sub> and PD cathode
- ✓ R<sub>D0</sub> and C<sub>D0</sub> provide on-chip filtering for noise2



#### **Schematic of TIS with DCOC**

- Pseudo-differential push-pull TIS provides single-to-differential conversion
- Tail current source for better supply noise rejection


### Gilbert-Cell-Based VGA



- 5-bit current DAC for Gm control
- Gain = Gm\*R<sub>L</sub>, with a fixed load impedance R<sub>L</sub>
- Shunt peaking required to expand BW



- Two 580-pH inductors are required to expand the BW to 32 GHz

# **VGA Employing TAS-TIS Topology**

### **Cherry-Hooper Amp**

- $Z_X \approx \frac{1}{g_{m2}}$
- $Z_{out} \approx \frac{1}{g_{m2}}$

#### **TAS-TIS** topology

- Modified Cherry-Hooper Amplifier
- Split into two stages: TAS and TIS





#### **VGA** employing a TAS-TIS topology



TAS: transadmittance stage; TIS: transimpedance stage

# **VGA Employing TAS-TIS Topology**





- Load impedance of TAS:  $R_F/A_{TIS}$ , higher BW
- Output impedance:  $1/Gm_{TIS}$ , larger driving capacity
- Variable  $R_F$  causes BW variations over gain variations
- Switches for  $R_F$  control bring extra parasitics



- 3-bit TAS-TIS VGA
- 8-dB gain tuning range
- ~25-GHz BW variation

### **Proposed Gilbert-TIS VGA**





- Load impedance of TAS: R<sub>F</sub>/A<sub>TIS</sub>, higher BW
- Output impedance:  $1/Gm_{TIS}$ , larger driving capacity
- Fixed  $R_F$  maintains a constant BW over gain variations
- CML-based TAS to get fully differential signal

- High gain-BW product
- <0.2-GHz BW variation

# Post-Amp Design and TIA Frequency Response



- TAS-TIS topology
- Two differential pairs to achieve high  $Gm_{TAS}$
- R<sub>P</sub>, M<sub>P1</sub>, M<sub>P2</sub> form active inductors to expand BW



- TIA achieves 73.6-dBΩ max. gain with 14.2-GHz BW
- <0.3-GHz BW variation

# Sampler with Integrated Equalizer




- Dummy  $M_{P3}/M_{P4}$  to mitigate clock feedthrough from  $M_{P1}/M_{P2}$

# **FFE and Summer Timing Diagram**



- Data is sampled and held for 1 UI by S/H circuits using CK\_SH and CKB\_SH alternately
- Data in the even path experiences a 1-UI delay relative to the odd path
- 0.5-UI precursor of D<sub>1</sub> is cancelled by subtracting D<sub>2</sub> from D<sub>1</sub>

# **DFE and Summer Timing Diagram**



- Before D<sub>1</sub> is sliced by the rising edge of CK\_CMP, D<sub>0</sub> must be regenerated and subtracted from D<sub>1</sub>
- Stringent timing constraint to close the decision feedback loop for the first tap
- 1 UI < 42 ps for 48-Gb/s PAM-4 operation
- The delay performance of slicers is critical: reduce T<sub>CKQ</sub>

UI: unit interval

# **Track-and-Regenerate Slicer**





- Track-and-regenerate slicer
- CLK=0, CLKB=1, input tracked by Vp and Vn, latch is charged to  $V_{\text{DD}}$
- CLK=1, CLKB=0, Vp and Vn discharged to V<sub>SS</sub>, latch regenerates signal
- Optimized clock-to-Q delay, < 17 ps
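The first-tap timing constraint can be put into numbers. The sketch below assumes a simplified budget in which the 1-UI feedback loop is spent on the slicer clock-to-Q delay plus everything else (feedback DAC and summer settling, slicer setup); that split and the names are illustrative assumptions, not the design's actual timing analysis.

```python
def unit_interval_ps(bit_rate_gbps: float, bits_per_symbol: int) -> float:
    """Unit interval in ps for a given line rate and modulation order."""
    baud_gbd = bit_rate_gbps / bits_per_symbol   # PAM-4 carries 2 bits per symbol
    return 1e3 / baud_gbd                        # 1 / GBd = ns, so scale to ps

ui_ps = unit_interval_ps(48.0, 2)      # 24 GBd
t_ckq_ps = 17.0                        # upper bound quoted for the slicer
remaining_ps = ui_ps - t_ckq_ps        # left for summer settling and setup (assumed split)

print(f"UI = {ui_ps:.2f} ps, remaining after T_CKQ = {remaining_ps:.2f} ps")
```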

### **Clock Path**



- CML to CMOS clock buffer amplifies sinusoidal clock signals to rail-to-rail
- C<sup>2</sup>MOS frequency divider is used to provide clock signals for PAM-4 decoder
- Delay line controlled by a 6-bit R2R ladder is used to accommodate delay variations

# Simulated Eye Diagram at Summer Output



- 48-Gb/s PAM-4 input with 220-μA amplitude, 70-fF PD, 1.5-ps RJ
- Eye diagrams of half-rate 24-Gb/s PAM-4 at summer output before decoding
- Coefficients of FFE, first-tap DFE, second-tap DFE: 0.07, 0.08, 0.01

### **Outline**

- Background
- PAM-4 Optical Receiver Data Path
  - System Architecture
  - Implementation
  - Measurement Result
- PAM-4 Optical Receiver Front End
- Conclusion

### Die Photo and Power Breakdown





- Fabricated in a 28-nm bulk CMOS technology
- ~0.06 mm<sup>2</sup> core area for ORX
- Wire-bonded to a 27-GHz PD with 0.75-A/W responsivity
- 61.4 mW at 48-Gb/s in total, TIA contributing 13.1 mW
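The energy-efficiency figures quoted elsewhere in the deck (1.28 pJ/bit for the receiver, 0.27 pJ/bit for the TIA) follow directly from these power numbers. A minimal check, with an illustrative helper name:

```python
def energy_per_bit_pj(power_mw: float, rate_gbps: float) -> float:
    """Energy per bit in pJ/bit: mW / (Gb/s) = 1e-3 W / 1e9 b/s = 1e-12 J/bit."""
    return power_mw / rate_gbps

rx_pj = energy_per_bit_pj(61.4, 48.0)    # full receiver
tia_pj = energy_per_bit_pj(13.1, 48.0)   # TIA alone

print(f"RX: {rx_pj:.2f} pJ/bit, TIA: {tia_pj:.2f} pJ/bit")
```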

### **Measurement Setup**



- 1308-nm light source is coupled to PD through a single-mode fiber
- Optical power level is adjusted by an internal optical attenuator
- Deserialized MSB and LSB are sent to off-chip BER testing

### **NRZ BER Bathtub Curves**



#### **NRZ** input signal

- All data slicers are enabled without PAM-4 threshold voltages
- 28-Gb/s NRZ input with -8.0-dBm input OMA
- 30-Gb/s NRZ input with -7.7-dBm input OMA
- 1e-12 BER at 30 Gb/s validates slicer design and DFE operation

### **PAM-4 BER Bathtub Curves**



#### **PAM-4** input signal

- 48-Gb/s PAM4 with -4.6-dBm input OMA
- Only enabling FFE, BER is higher than pre-FEC limit
- After enabling both FFE and DFE, BER is improved to < 1e-5

# **ORX Sensitivity**

#### 48-Gb/s PAM4 Optical Input



**Decoded 6-Gb/s Output** 





- PAM-4 optical input with 4.8-dB extinction ratio
- Under 1e-12 BER target, -8.2-dBm sensitivity at 28 Gb/s NRZ is achieved
- Under 2.4e-4 BER target, -5.1-dBm sensitivity at 48 Gb/s PAM-4 is achieved
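To relate these OMA sensitivities to the signal seen by the TIA, the optical amplitude can be converted to a peak-to-peak photocurrent. The sketch below assumes the 0.75-A/W responsivity quoted for the photodiode and ignores coupling loss; the helper name is illustrative.

```python
def oma_dbm_to_current_ua(oma_dbm: float, responsivity_a_per_w: float) -> float:
    """Convert an OMA in dBm to a peak-to-peak photocurrent in uA."""
    oma_mw = 10.0 ** (oma_dbm / 10.0)             # dBm -> mW
    return oma_mw * responsivity_a_per_w * 1e3    # mW * A/W = mA, then mA -> uA

i_pam4_ua = oma_dbm_to_current_ua(-5.1, 0.75)   # 48-Gb/s PAM-4 sensitivity
i_nrz_ua = oma_dbm_to_current_ua(-8.2, 0.75)    # 28-Gb/s NRZ sensitivity

print(f"PAM-4: {i_pam4_ua:.1f} uApp, NRZ: {i_nrz_ua:.1f} uApp")
```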

# **Comparison Table**

|                                        | JSSC'21 [1]    |               | OJCAS'21[2] | JSSC'22 [3]              |             | RFIC'23 [4]  | VLSI'24 [5]         |               | This Work                 |              |
|----------------------------------------|----------------|---------------|-------------|--------------------------|-------------|--------------|---------------------|---------------|---------------------------|--------------|
| Technology                             | 65nm CMOS      |               | 40nm CMOS   | 28nm CMOS                |             | 28nm CMOS    | 22nm FinFET         |               | 28nm CMOS                 |              |
| Data Rate (Gb/s)                       | 16 (Duobinary) |               | 36 (PAM-4)  | 100 (PAM-4)              |             | 42.7 (NRZ)   | 50 (NRZ)            |               | 48 (PAM-4)                |              |
| PD Capacitance (fF)                    | 180            |               | 100         | 100 70                   |             | N/A          | 100                 |               | 60                        |              |
| PD Responsivity (A/W)                  | 0.8            |               | 8.0         | 1                        |             | 0.8          | 0.48                |               | 0.75                      |              |
| NRZ OMA Sens. at<br>BER 1e-12 (dBm)    | -11.6          |               | N/A         | -11.1<br>@56Gb/s         |             | -3.6         | -6                  |               | -8.2 @28Gb/s              |              |
| PAM-4 OMA Sens. at<br>BER 2.4e-4 (dBm) | N/A            |               | -4.8*       | -8.9                     |             | N/A          | N/A                 |               | -5.1                      |              |
| RX EQ Capabilities                     | N/A            |               | 2-tap DFE   | 2-tap FFE +<br>2-tap DFE |             | CTLE         | CTLE + 2-tap<br>FFE |               | 2-tap FFE + 2-<br>tap DFE |              |
| Area (mm²)                             | 0.09           |               | 0.23        | 0.45                     |             | 0.11**       | 0.32 (TIA + RX)     |               | 0.06                      |              |
| Power (mW)                             | 4.0<br>(TIA)   | 11.2<br>(ORX) | 128.8 (RX)  | 117<br>(TIA)             | 381<br>(RX) | 145.2** (RX) | 15.8<br>(TIA)       | 75.9<br>(ORX) | 13.1<br>(TIA)             | 61.4<br>(RX) |
| Efficiency (pJ/bit)                    | 0.25           | 0.7           | 4.0         | 1.17                     | 3.9         | 3.4          | 0.38                | 1.5           | 0.27                      | 1.28         |
| FoM (Gbps/mm²/mW)                      | 2570           |               | 473         | 1725                     |             | 889          | 622                 |               | 2589                      |              |

<sup>\*</sup>Estimated from reported sensitivity curve

\*\*CDR included

### **Outline**

- Background
- PAM-4 Optical Receiver Data Path
- PAM-4 Optical Receiver Front End
  - System Architecture
  - Implementation
  - Measurement Result
- Conclusion

### **Motivation**



### **Architectural Consideration**



$$R_{F} = \frac{(A_{0} + 1)\omega_{A}}{C_{T}BW_{TIS}^{2}} \approx \frac{GBW_{A}}{2\pi C_{T}BW_{TIS}^{2}}$$

$$\overline{i_{n,TIS|CTLE}^{2}}(f) = \underbrace{\frac{4kT}{nR_{F}} + \frac{4kT\gamma}{g_{m}R_{F}^{2}n^{2}} + 4kT\gamma \times \frac{(2\pi C_{T})^{2}}{g_{m}}f^{2}}_{\overline{i_{n,TIS}^{2}}(f)} + \frac{4kT\gamma}{g_{m,eq}R_{F}^{2}n^{2}} + \frac{4kT\gamma}{g_{m,eq}R_{F}^{2}} \left(\frac{f}{BW_{TIS}}\right)^{4}$$

- Low-BW TIS + CTLE is used to break the BW-noise trade-off
- Increasing n reduces the white noise terms, while the  $f^2$  and  $f^4$  colored noise terms remain unchanged
- A large scaling factor n reduces  $BW_{TIS}$, necessitating higher peaking from the CTLE
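The effect of the scaling factor n on the white and colored noise terms can be illustrated numerically. This is a sketch only: all component values are hypothetical placeholders, the third TIS term is taken as proportional to $f^2$ as the bullet above states, and $BW_{TIS}$ is assumed to be the bandwidth at n = 1.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def noise_terms(f_hz, n, r_f, g_m, g_m_eq, c_t, bw_tis, temp_k=300.0, gamma=1.0):
    """Return (white, colored) input-referred noise PSD in A^2/Hz at frequency f_hz."""
    kt4 = 4.0 * K_B * temp_k
    # White terms: feedback resistor, TIS transistor, CTLE transistor (all shrink with n).
    white = (kt4 / (n * r_f)
             + kt4 * gamma / (g_m * (n * r_f) ** 2)
             + kt4 * gamma / (g_m_eq * (n * r_f) ** 2))
    # Colored terms: f^2 from the TIS input capacitance, f^4 from the CTLE (no n dependence).
    colored = (kt4 * gamma * (2.0 * math.pi * c_t) ** 2 / g_m * f_hz ** 2
               + kt4 * gamma / (g_m_eq * r_f ** 2) * (f_hz / bw_tis) ** 4)
    return white, colored

# Hypothetical values chosen only to show the trend.
params = dict(r_f=200.0, g_m=50e-3, g_m_eq=20e-3, c_t=150e-15, bw_tis=30e9)
for n in (1, 2, 4):
    white, colored = noise_terms(10e9, n, **params)
    print(f"n={n}: white={white:.3e} A^2/Hz, colored={colored:.3e} A^2/Hz")
```

With these placeholder values the white component falls as n grows while the colored component is identical for every n, which is the trend the bullets describe.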



- $I_{n,R_F}^2$  is split at TIS input and output

TIS transfer function

$$\mathbf{Z}_{TIS}(\mathbf{s}) = \frac{-R_o(g_m \mathbf{n} R_F - 1 - s C_{gd} \mathbf{n} R_F)}{1 + g_m n R_F + s K_1 + s^2 K_2}$$

TIS output impedance

$$\mathbf{Z}_{o}(\mathbf{s}) = \frac{R_{o}[1 + s\mathbf{n}R_{F}(C_{gd} + C_{IN})]}{1 + g_{m}\mathbf{n}R_{F} + sK_{1} + s^{2}K_{2}}$$

$$K_1 = C_{IN}(R_o + nR_F) + C_oR_o + C_{gd}nR_F(1 + g_mR_o)$$
  
$$K_2 = nR_FR_o(C_{IN}C_o + C_{IN}C_{gd} + C_{gd}C_o)$$

Noise PSD at TIS output

$$S_{TIS,out}(s) = I_{n,R_F}^2 |Z_{TIS} - Z_o|^2 + I_{n,g_m}^2 |Z_o|^2$$



Transfer function of an ideal unity-gain CTLE stage that recovers the full BW

$$H_{CTLE}(s) = \frac{1 + g_m \mathbf{n} R_F + s K_1 + s^2 K_2}{(1 + g_m R_a) \left(1 + \frac{s}{\mathbf{n}Q2\pi f_{TIS}} + \frac{s^2}{(\mathbf{n}2\pi f_{TIS})^2}\right)}$$

Zeros of CTLE cancel the poles of TIS

$$\mathbf{Z_{TIS}}(\mathbf{s}) \times \mathbf{H_{CTLE}}(\mathbf{s}) = \frac{-R_o \left( g_m \mathbf{n} R_F - 1 - s C_{gd} \mathbf{n} R_F \right)}{(1 + g_m R_a) \left( 1 + \frac{s}{\mathbf{n} Q 2\pi f_{TIS}} + \frac{s^2}{(\mathbf{n} 2\pi f_{TIS})^2} \right)}$$

Noise PSD at CTLE output

$$S_{CTLE,out}(s) = I_{n,R_F}^2 |(Z_{TIS} - Z_o) \times H_{CTLE}|^2 + I_{n,g_m}^2 |Z_o \times H_{CTLE}|^2$$

Thermal noise of feedback resistor  $R_F$ 

$$I_{n,R_F}^2 = 4kT/\mathbf{n}R_F$$

Channel thermal noise

$$I_{n,g_m}^2 = 4kT\gamma g_m$$

RMS noise at CTLE output

$$V_{noise,out} = \sqrt{\int_0^\infty S_{CTLE,out}(s)df}$$

SNR at CTLE-equalized TIA output

$$SNR = 20\log_{10} \left( \frac{V_{ISI}}{V_{noise,out}} \right)$$

- Worst eye opening  $V_{ISI}$  is calculated from the main cursor  $V_0$  and the  $i$-th pre/post cursors  $V_i$

$$V_{ISI} = |V_0| - 3\sum_{i \neq 0} |V_i|$$
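The worst-case eye opening and the resulting SNR can be evaluated from a sampled pulse response. The sketch below is a direct transcription of the two expressions; the cursor values and the noise level in the example are made up for illustration.

```python
import math

def worst_case_eye(cursors, main_index):
    """V_ISI = |V0| - 3 * sum(|Vi|, i != 0) for a PAM-4 pulse response."""
    v0 = abs(cursors[main_index])
    isi = sum(abs(v) for i, v in enumerate(cursors) if i != main_index)
    return v0 - 3.0 * isi

def snr_db(v_isi, v_noise_rms):
    """SNR in dB from the worst-case eye opening and the rms output noise."""
    return 20.0 * math.log10(v_isi / v_noise_rms)

# Hypothetical normalized pulse response: one precursor, main cursor, two postcursors.
pulse = [0.02, 1.0, 0.10, 0.03]
v_isi = worst_case_eye(pulse, main_index=1)
print(f"V_ISI = {v_isi:.2f}, SNR = {snr_db(v_isi, 0.02):.1f} dB")
```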

- SNR improves as n increases
- For n > 3, the colored noise component dominates
- White noise is suppressed, while the colored noise is not affected

#### SNR as a function of $R_F$ scaling factor n





CTLE over/under-peaking affects TIA noise

#### SNR as a function of quality factor Q



- Q variation < 15%, degradation < 2.5 dB

## **Proposed Architecture**



- $n \approx 3.5$  to obtain a large TIS gain and eliminate the post-amplifier
- Gain and middle/high-frequency peaking of the CTLE are tunable
- S2D conversion is put after the single-ended CTLE instead of the TIS
- T-coils are integrated to optimize return loss and relieve BW degradation from ESD

### **Outline**

- Background
- PAM-4 Optical Receiver Data Path
- PAM-4 Optical Receiver Front End
  - System Architecture
  - Implementation
  - Measurement Result
- Conclusion

# **TIS with Multi-Peaking Input Network**



- Multi-peaking input network to distribute parasitic capacitance
- Multi-layer stacked T-coil and inductor are custom designed
- Good broadband impedance matching under heavy capacitive loading achieved

# Single-Ended Inverter-Based CTLE



- Gm-C filter creates one pole at  $g_{m1}/(C_{H1}+C_{H2})$
- CTLE response

$$\begin{split} H(s) &= [H_{M}(s) - H_{L}(s)] \cdot L_{Active} \\ &= \left[ g_{m} - \frac{g_{m0}g_{m2}}{g_{m1}} \cdot \frac{1}{1 + s\left(C_{H1} + C_{H2}\right)/g_{m1}} \right] \cdot \frac{sRC_{gs}}{2g_{m3}} \end{split}$$

- High  $f_T$  of PMOS in deep sub-micron CMOS technologies
- CTLE engages two parallel paths: main and low-pass paths
- Peaking is created by subtracting the low-pass path from the main path
- 3-bit dc gain control and 2-bit middle-frequency (MF) tuning implemented

# **CTLE with Q-Shaping Inductor**



- Accommodate BW variation of TIA caused by bond wires and input capacitance
- Tunable Q: programmable transmission gate R<sub>TG</sub> in parallel with a 670-pH inductor L<sub>s</sub>
- S2D circuit is implemented by a unity gain buffer with active inductor load

## Simulated Frequency Response

### **Post-layout simulation result**



#### At TIS output:

- 53.6 dB $\Omega$  with BW<sub>3dB</sub> of 7.1-GHz
- BW<sub>6dB</sub> of 15.1 GHz, 23.8 GHz and 33.7 GHz

#### **CTLE** response:

- 6-dB peaking at 31.3 GHz

#### At S2D output:

- 59.9 dB$\Omega$  with BW<sub>3dB</sub> of 33.6 GHz

# VGA Design



- NMOS- or PMOS-only input pairs exhibit compromised linearity
- CMOS input pairs of both TAS and TIS improve linearity
- Combine the tunability of both Gm and R<sub>F</sub>: 3-bit Gm control and 2-bit R<sub>F</sub> control
- Tail current sources to improve immunity to supply variations and CM rejection
- Source degeneration resistor to further enhance linearity
- Gain: -2.4 dB to 7.3 dB

# **Simulated Eye Diagrams**





- TIA input amplitude of 600 μA<sub>pp</sub>
- Simulated 100-Gb/s PAM-4 eye diagrams at VGA input and output
- VGA output: ratio of level mismatch (RLM) > 96%
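The slide does not state how RLM is computed. One commonly used definition, assumed here, takes three times the smallest spacing between adjacent PAM-4 levels divided by the outer level separation, so that equally spaced levels give 1. The level values in the example are hypothetical.

```python
def rlm(levels):
    """Level-mismatch ratio for four PAM-4 levels (assumed definition)."""
    v = sorted(levels)
    if len(v) != 4:
        raise ValueError("PAM-4 needs exactly four levels")
    min_step = min(b - a for a, b in zip(v, v[1:]))   # smallest adjacent spacing
    return 3.0 * min_step / (v[3] - v[0])             # 1.0 when equally spaced

print(f"ideal levels:      RLM = {rlm([-0.3, -0.1, 0.1, 0.3]):.2f}")
print(f"compressed levels: RLM = {rlm([0.0, 0.30, 0.65, 1.0]):.2f}")
```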

# **Output BUF Design**



- Two cascaded differential pairs with shunt-inductive peaking to drive 50-Ω off-chip load
- Multi-layer stacked T-coils to accommodate ESD capacitance
- Buffer with T-coil and ESD diodes provides 0-dB gain with 42-GHz BW

### **Outline**

- Background
- PAM-4 Optical Receiver Data Path
- PAM-4 Optical Receiver Front End
  - System Architecture
  - Implementation
  - Measurement Result
- Conclusion

### Die Photo and Power Breakdown





- Fabricated in a 28-nm bulk CMOS technology
- $0.69 \times 0.53 \text{ mm}^2$  area defined by the pad frame
- 32 mW power consumption including output buffer from a 1.2-V supply

# **Small Signal Measurement: S-Parameter**





- Four-port S-parameter measurement up to 50 GHz
- S21: maximum gain of 21.1 dB with a BW of 30.5 GHz
- S11 and S22 lower than -6 dB up to 40 GHz

# **Small Signal Measurement: Transimpedance**



Figure panels: Current Tuning and Resistor Tuning, plotted against Freq (GHz)

- ZT: max. gain of 65 dBΩ with a 28 GHz BW
- 9-dB gain control range with an average step of 0.3 dB, overall BW variation < 3 GHz
- CTLE: dc gain control range of 6.8 dB



### **Noise Measurement**





#### Single-ended output noise distribution

- 80-GHz sampling oscilloscope
- Noise from oscilloscope de-embedded

#### Input-referred current noise

$$i_{n,in}(rms) = \frac{2 \times \sqrt{(2.65mV)^2 - (1.16mV)^2}}{10^{(\frac{65}{20})}} = 2.68 \,\mu A_{rms}$$

#### **Average input-referred current noise density**

$$2.68 \, \mu A_{rms} / \sqrt{28 \, GHz} = 16 \, pA / \sqrt{Hz}$$
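The two noise figures can be reproduced from the measured values. The sketch below follows the expressions above: the oscilloscope noise is removed in quadrature, the single-ended result is doubled, and the voltage is divided by the 65-dBΩ transimpedance. Variable names are illustrative.

```python
import math

v_total_mv = 2.65     # measured single-ended rms output noise (TIA + oscilloscope)
v_scope_mv = 1.16     # oscilloscope rms noise
gain_dbohm = 65.0     # transimpedance gain
bw_hz = 28e9          # 3-dB bandwidth

# De-embed the oscilloscope noise in quadrature (uncorrelated sources).
v_tia_mv = math.sqrt(v_total_mv ** 2 - v_scope_mv ** 2)

# Factor of 2 as in the expression above (single-ended to differential).
i_n_rms = 2.0 * v_tia_mv * 1e-3 / 10.0 ** (gain_dbohm / 20.0)   # A rms
density = i_n_rms / math.sqrt(bw_hz)                            # A/sqrt(Hz)

print(f"i_n = {i_n_rms * 1e6:.2f} uA rms, density = {density * 1e12:.1f} pA/sqrt(Hz)")
```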

### **THD Measurement**



#### Single-ended total harmonic distortion (THD)

- 67-GHz spectrum analyzer
- 1-GHz fundamental frequency
- 10 harmonics counted

#### Within a THD of 5%

- At max. gain: 280 μA<sub>pp</sub> input, ~500 mV<sub>pp</sub> output
- At min. gain: 640 μA<sub>pp</sub> input, ~400 mV<sub>pp</sub> output

### **Time Domain Measurement**



#### Time domain measurement setup

- 64-Gbaud bit error rate tester (BERT)
- 10-dB attenuator at data input
- TIA differential output combined by a balun
- More than 10k UI of PRBS-9 pattern



### **BER Measurement: NRZ**



#### ■ BER at different amplitudes of electrical input signals is measured

- Assume a PD responsivity of 0.75 A/W
- Estimated BER versus input OMA sensitivity

#### ■ Under 1e-12 BER

- -6.5-dBm sensitivity at 64 Gb/s
- -8.8-dBm sensitivity at 56 Gb/s

### **BER Measurement: PAM-4**



#### ■ BER at different amplitudes of electrical input signals is measured

- Assume a PD responsivity of 0.75 A/W
- Estimated BER versus input OMA sensitivity

#### ■ Under 2.4e-4 pre-FEC limit

- -7.8-dBm sensitivity at 100 Gb/s
- With -4-dBm input at 100 Gb/s, 1.5e-5 BER achieved with an RLM of 0.89

# **Comparison Table**

| Reference                                   | ESSCIRC'18 [6]                  | JSSC'22 [7]                                          | JSSC'22 [3]  | JSSC'23 [8]                                          | VLSI'23 [9]                          | SSCL'23 [10]                                          | SSCL'24 [11]                                            | This Work                                            |
|---------------------------------------------|---------------------------------|------------------------------------------------------|--------------|------------------------------------------------------|--------------------------------------|-------------------------------------------------------|---------------------------------------------------------|------------------------------------------------------|
| Technology                                  | 28nm CMOS                       | 22nm<br>FinFET                                       | 28nm<br>CMOS | 16nm<br>FinFET                                       | 12nm<br>FinFET                       | 22nm FD-<br>SOI                                       | 28nm<br>CMOS                                            | 28nm<br>CMOS                                         |
| Data Rate (Gb/s)                            | 112*                            | 128*                                                 | 100          | 112                                                  | 90                                   | 106.25                                                | 85*                                                     | 100*                                                 |
| Gain (dBΩ)                                  | 65                              | 59.3                                                 | 68.6         | 63                                                   | 65                                   | 74                                                    | 65                                                      | 65                                                   |
| BW (GHz)                                    | 60                              | 45.5                                                 | 20.8         | 32                                                   | 25                                   | 28                                                    | 24                                                      | 28                                                   |
| THD (@ Input<br>Current, Output<br>Amplitude) | <5%@ 1mA<sub>pp</sub>,<br>N/A | <5%@<br>330µA<sub>pp</sub>,<br>304mV<sub>pp</sub> | N/A          | <8%@<br>670µA<sub>pp</sub>,<br>336mV<sub>pp</sub> | <9%@<br>600µA<sub>pp</sub>,<br>N/A | <4%@2.46<br>mA<sub>pp</sub>,<br>550mV<sub>pp</sub> | <1.77%@<br>330µA<sub>pp</sub>,<br>660mV<sub>pp</sub> | <5%@<br>640µA<sub>pp</sub>,<br>400mV<sub>pp</sub> |
| Noise ( $pA/\sqrt{Hz}$ )                    | 19.3                            | 12.6                                                 | 17           | 16.9                                                 | 13.4                                 | 11                                                    | 10.4                                                    | 16                                                   |
| Input/Output ESD                            | No                              | No                                                   | No           | Yes (80f)                                            | Yes                                  | Yes                                                   | No                                                      | Yes (90f)                                            |
| Power (mW)                                  | 107                             | 11.2                                                 | 117          | 77                                                   | 29.2                                 | 155                                                   | 56                                                      | 32                                                   |
| Efficiency (pJ/bit)                         | 0.96                            | 0.09                                                 | 1.17         | 0.69                                                 | 0.32                                 | 1.46                                                  | 0.66                                                    | 0.32                                                 |
| FoM**                                       | 997                             | 3748                                                 | 478          | 739                                                  | 1522                                 | 905                                                   | 762                                                     | 1556                                                 |

<sup>\*</sup>Electrical Measurement

\*\*FoM = $\frac{Gain(\Omega) \times BW_{3dB}(GHz)}{P_{DC}(mW)}$
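The FoM row of the table can be checked by converting the gain from dBΩ to Ω before applying the definition. A minimal sketch with an illustrative helper name:

```python
def tia_fom(gain_dbohm: float, bw_ghz: float, power_mw: float) -> float:
    """FoM = Gain (ohm) x BW_3dB (GHz) / P_DC (mW)."""
    gain_ohm = 10.0 ** (gain_dbohm / 20.0)   # dB-ohm -> ohm
    return gain_ohm * bw_ghz / power_mw

fom_this_work = tia_fom(65.0, 28.0, 32.0)      # This Work column
fom_esscirc18 = tia_fom(65.0, 60.0, 107.0)     # ESSCIRC'18 [6] column

print(f"This work: {fom_this_work:.0f}, ESSCIRC'18: {fom_esscirc18:.0f}")
```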

### **Outline**

- Background
- PAM-4 Optical Receiver Data Path
- PAM-4 Optical Receiver Front End
- **Conclusion**

### Conclusion

#### ■ PAM-4 ORX Data Path

- A 48-Gb/s PAM-4 ORX is proposed with a linear TIA and a PAM-4 sampler integrated
- The TIA avoids CTLE and passive inductors, and the sampler incorporates a 2-tap FFE and a 2-tap DFE to mitigate ISI from TIA
- Under 2.4e-4 BER, -5.1-dBm sensitivity is achieved with 1.28-pJ/bit (0.27-pJ/bit for TIA alone) efficiency

#### ■ PAM-4 ORX Front End

- A 100-Gb/s PAM-4 linear TIA is proposed with high inductance density
- A single-ended inverter-based CTLE is implemented, and a current reuse VGA based on a TAS-TIS topology provides a large gain-BW product and high linearity
- 28-GHz BW with a 65-dBΩ gain is achieved, consuming only 32 mW

### **Publication**

- [1] **Chongyun Zhang**, Fuzhan Chen, Li Wang, Lin Wang, and C. Patrick Yue, "Recent advances of high-speed short-reach optical interconnects for data centers," *IEEE Open J. Solid-State Circuits Soc.*, vol. 5, pp. 86-100, 2025.
- [2] **Chongyun Zhang**, Li Wang, Zilu Liu, Fuzhan Chen, Quan Pan, Xianbo Li, and C. Patrick Yue, "A 48-Gb/s half-rate PAM4 optical receiver with 0.27-pJ/bit TIA efficiency, 1.28-pJ/bit RX efficiency, and 0.06-mm<sup>2</sup> area in 28-nm CMOS," in *Proc. IEEE Symp. VLSI Technol. Circuits* (VLSI Technol. Circuits), Jun. 2024, pp. 1–2.
- [3] **Chongyun Zhang**, Fuzhan Chen, and C. P. Yue, "A 56-Gb/s PAM-4 transmitter using silicon photonic microring modulator in 40nm CMOS," in *Proc. IEEE Int. Midwest Symp. Circuits Syst. (MWSCAS)*, Aug. 2022, pp. 1-4.
- [4] **Chongyun Zhang**, Xinyi Liu, and C. Patrick Yue, "A compact VCSEL model for high-speed optical interconnect design," in *Proc. Laser Congr. (ASSLLAC)*, Jan. 2021, JTu1A.30.
- [5] Abdekhoda Johar, **Chongyun Zhang**, Li Wang, Reza Sarvari, Reza Navid, and C. Patrick Yue, "A 56-Gb/s PAM-4 injection-locked CDR," in Proc. IEEE Eur. Solid State Circuits Conf. (ESSCIRC), Sep. 2025, accepted.
- [6] Fuzhan Chen, **Chongyun Zhang**, Li Wang, Quan Pan, and C. Patrick Yue, "A 56-Gb/s PAM-4 VCSEL transmitter with piecewise compensation scheme in 40-nm CMOS," *IEEE J. Solid-State Circuits*, early access.
- [7] Fuzhan Chen, **Chongyun Zhang**, Li Wang, Quan Pan, and C. Patrick Yue, "A 2.05-pJ/b 56-Gb/s PAM-4 VCSEL transmitter with piecewise nonlinearity compensation and asymmetric equalization in 40-nm CMOS," in *Proc. IEEE Eur. Solid State Circuits Conf. (ESSCIRC)*, Sep. 2023, pp. 373-376.
- [8] **Chongyun Zhang**, Li Wang, Fuzhan Chen, Quan Pan, Xianbo Li, and C. Patrick Yue, "A 48-Gb/s inductorless PAM4 optical receiver with 1.28-pJ/bit efficiency in 28-nm CMOS," *IEEE J. Solid-State Circuits*, major revision.

# **Acknowledgement**

■ Supervisor: Prof. C. Patrick Yue

**■** Committee Members

- Chairperson: Prof. Wenjing Ye
- External: Prof. Chao Wang
- ECE: Prof. Howard Cam Luong, Prof. Man Hoi Wong
- CSE: Prof. Song Guo

■ All Optical Wireless Lab Members

Dr. Li Wang

Dr. Fuzhan Chen

### Reference

- [1] M. G. Ahmed, D. Kim, R. K. Nandwana, A. Elkholy, K. R. Lakshmikumar and P. K. Hanumolu, "A 16-Gb/s -11.6-dBm OMA sensitivity 0.7-pJ/bit optical receiver in 65-nm CMOS enabled by duobinary sampling," *IEEE J. Solid-State Circuits*, vol. 56, no. 9, pp. 2795–2803, Sep. 2021.
- [2] W. Ho, Y. Hsieh, B. Murmann and W. Chen, "A 32 Gb/s PAM-4 optical transceiver with active back termination in 40 nm CMOS technology," *IEEE Open J. Circuits Syst.*, vol. 2, pp. 56–64, 2021.
- [3] H. Li, C. Hsu, J. Sharma, J. Jaussi and G. Balamurugan, "A 100-Gb/s PAM-4 optical receiver with 2-tap FFE and 2-tap direct-feedback DFE in 28-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 57, no. 1, pp. 44–53, Jan. 2022.
- [4] H. Kang, I. Kim, R. Liu, et al., "A 42.7Gb/s Optical Receiver with Digital CDR in 28nm CMOS," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, Jun. 2023, pp. 9–12.
- [5] S. Krishnamurthy et al., "A 4×50Gb/s NRZ 1.5pJ/b co-packaged and fiber-terminated 4-channel optical RX," in *Proc. IEEE Symp. VLSI Technol. Circuits* (VLSI Technol. Circuits), Jun. 2024, pp. 1-2.
- [6] H. Li, G. Balamurugan, J. Jaussi and B. Casper, "A 112 Gb/s PAM4 linear TIA with 0.96 pJ/bit energy efficiency in 28 nm CMOS," in *Proc. IEEE Eur. Solid-State Circuits Conf. (ESSCIRC)*, Sep. 2018, pp. 238–241.

### Reference

- [7] S. Daneshgar, H. Li, T. Kim and G. Balamurugan, "A 128 Gb/s, 11.2 mW single-ended PAM4 linear TIA with 2.7 μArms input noise in 22 nm FinFET CMOS," *IEEE J. Solid-State Circuits*, vol. 57, no. 5, pp. 1397-1408, May 2022.
- [8] D. Patel, A. Sharif-Bakhtiar and T. C. Carusone, "A 112-Gb/s 8.2-dBm sensitivity 4-PAM linear TIA in 16-nm CMOS with co-packaged photodiodes," *IEEE J. Solid-State Circuits*, vol. 58, no. 3, pp. 1–14, Mar. 2023.
- [9] M. Kashani, H. Shakiba and A. Sheikholeslami, "A 0.32pJ/b 90Gbps PAM4 optical receiver front-end with automatic gain control in 12nm CMOS FinFET," in *Proc. IEEE Symp. VLSI Technol. Circuits (VLSI Technol. Circuits),* June 2023, pp. 1-2.
- [10] M. Parvizi et al., "A 112-Gb/s, -10 dBm sensitivity, +5 dBm overload, and SiPh-based receiver frontend in 22-nm FDSOI," *IEEE Solid-State Circuits Lett.*, vol. 7, pp. 263-266, 2024.
- [11] S. Ma et al., "A 85-Gb/s PAM-4 TIA With 2.2-mApp Maximum Linear Input Current in 28-nm CMOS," *IEEE Solid-State Circuits Lett.*, vol. 7, pp. 50-53, 2024.





# Thank you

Optical Wireless Lab

Department of Electronic and Computer Engineering
The Hong Kong University of Science and Technology (HKUST)

# **Back-up: Calibration Logic**



#### **Slicers with calibration circuits**

- Calibration logic
- 6-bit calibration DACs, DAC\_Cal

# **Back-up: Calibration Logic**



- If the comparator does not generate '0' within 8 clock cycles, calibration logic output increases by 1 and DAC\_Cal increases by 1 LSB
- The process continues until a transition from '1' to '0' happens at slicer output and the calibration ends

# Back-up: V<sub>ISI</sub> Calculation

- Worst eye opening  $V_{ISI}$  for PAM-4 is calculated from

$$V_{ISI} = |V_0| - 3\sum_{i \neq 0} |V_i|$$

- $V_0$  is the main cursor
- $V_i$  is the  $i$-th pre/post cursor



# Back-up: Response of Proposed VGA in 100-Gb/s TIA



- 7.3 dB with 23.6 GHz BW
- -2.4 dB with 35.3 GHz BW
- 9.7 dB dynamic range, 11.7 GHz BW variation

# Back-up: High-Frequency PCB Design



- 10-mil RO4350B material for laminates
- PCB is trenched to accommodate the TIA die, reducing the length of bond wires

# Back-up: High-Frequency PCB Design

#### **Simulated S parameters of PCB traces**



#### Input trace:

- S11 < -20 dB up to 35 GHz
- S21 > -0.4 dB at 25 GHz

#### **Output trace:**

- S11 < -10 dB up to 35 GHz
- S21 > -0.6 dB at 25 GHz