# Optoelectronic Receivers in Standard CMOS for Short-Range Optical Communications

by

# **Quan PAN**

A Thesis Submitted to

The Hong Kong University of Science and Technology
in Partial Fulfillment of the Requirements for
the Degree of Doctor of Philosophy
in the Department of Electronic and Computer Engineering

April 2014, Hong Kong

# **Authorization**

I hereby declare that I am the sole author of the thesis.

I authorize the Hong Kong University of Science and Technology to lend this thesis to other institutions or individuals for the purpose of scholarly research.

I further authorize the Hong Kong University of Science and Technology to reproduce the thesis by photocopying or by other means, in total or in part, at the request of other institutions or individuals for the purpose of scholarly research.

Quan PAN

April 2014

# Optoelectronic Receivers in Standard CMOS for Short-Range Optical Communications

by

# Quan PAN

This is to certify that I have examined the above PhD thesis and have found that it is complete and satisfactory in all respects, and that any and all revisions required by the thesis examination committee have been made.

| Name and department                     | Role                                  |
|-----------------------------------------|---------------------------------------|
| Prof. C. Patrick YUE, ECE Department    | Thesis Supervisor                     |
| Prof. Nevin L. ZHANG, CSE Department    | Thesis Examination Committee Chairman |
| Prof. Wing-Hung KI, ECE Department      | Thesis Examination Committee Member   |
| Prof. Andrew W. O. POON, ECE Department | Thesis Examination Committee Member   |
| Prof. Jingshen WU, MAE Department       | Thesis Examination Committee Member   |

Professor Ross D. MURCH (Head of Department)

Department of Electronic and Computer Engineering

April 2014

# To my family

# **Acknowledgements**

First of all and most importantly, I would like to express my sincere gratitude to my Ph.D. supervisor, Prof. C. Patrick Yue for his encouragement, patience, valuable support, and guidance throughout my research. It has been a valuable experience to be Prof. Yue's Ph.D. student. From him I have learned how to broaden my thoughts, how to conduct research, and how to pursue a successful life. This training has brought me a brand new world that I had never experienced before.

I would like to sincerely thank Prof. Nevin L. Zhang, Prof. Wing-Hung Ki, Prof. Andrew W. O. Poon, Prof. Fujiang Lin, Prof. Jingshen Wu, and Prof. Ross D. Murch for serving on my thesis committee.

I would like to thank the ECE lab technicians, Mr. Frederick Kwok, Mr. Siu Fai Luk, and Mr. Allen F. L. Ng for their wonderful technical support on PCB bonding, design tools, and chip tape-outs. I would like to thank Mr. Kwok Wai Chan for equipment purchase and all kinds of equipment guidance. I enjoyed the discussion and knowledge sharing between us.

I would like to thank my colleagues in the High Speed Silicon Laboratory (HS2L), Dr. Liang Wu, Dr. Li Sun, Mr. Yipeng Wang, Mr. Zhengxiong Hou, Mr. Fengyu Che, Mr. Salahuddin Raju, Mr. Duona Luo, Mr. Xianbo Liu, Miss Liwen Jing, Mr. Babar Hussain, and Mr. Khawaja Qasim Maqbool, who shared numerous valuable discussions with me. In the high-speed circuit group especially, we learned from each other, helped each other, and encouraged each other.

I would like to thank my friends in the Sensor and Instrumentation Laboratory (SIL), Dr. Rongxiang Wu, Dr. Xiaodong Huang, Dr. Xianda Zhou, Dr. Lulu Peng, Mr. Xiangming Fang, Mr. Hao Feng, Miss Jie Ren, and Mr. Mingyang Yin, for having fun together and making my life colorful.

I would also like to thank my friends in the Photonic Device Laboratory (PDL), Dr. Shaoqi Feng, Mr. Yu Zhang, and Miss Yu Li, for their help with device measurements and discussions; Dr. Sujiang Rong, Dr. Alan Ng, Dr. Jun Yin, Dr. Shiyuan Zheng, Mr. Alvin Li, and Mr. Yue Chao in the Analog Research Laboratory (ARL); Dr. Chenzhang Zhang, Dr. Xiaocheng Jing, Dr. Vincent Chan, Dr. Yan Lu, Dr. Yonggen Liu, Mr. Cheng Huang, Mr. Erick Lai, Mr. YK Teh, Mr. Lin Cheng, Mr. Min Tan, and Mr. Fan Yang in the Integrated Power Electronics Laboratory (IPEL); Dr. Bing Liu, Dr. Ruoyu Xu, Dr. Jing Guo, and Mr. Jiageng Huang in the Mixed-Signal Bio-Medical Integrated Circuits Laboratory (MIXIC); and Mr. Qimeng Jiang, Mr. Yunyou Lu, Mr. Zhikai Tang, Mr. Xi Tang, Mr. Cheng Liu, Mr. Shenghou Liu, and Miss Hanxing Wang in the Wide-Bandgap Semiconductor Electronics Laboratory (WISE-LAB).

I would like to express my sincere gratitude to my family for their unconditional love and encouragement. I appreciate my parents' support and sacrifice to put me through college. Finally, I thank my fiancée, Yuyin Liang, for her unlimited support and love.

# **Table of Contents**

| Section    | Title                                                                              |
|------------|------------------------------------------------------------------------------------|
|            | Title Page                                                                         |
|            | Authorization Page                                                                 |
|            | Signature Page                                                                     |
|            | Acknowledgements                                                                   |
|            | Table of Contents                                                                  |
|            | List of Figures                                                                    |
|            | List of Tables                                                                     |
|            | Abstract                                                                           |
| Chapter 1  | Introduction to CMOS Optoelectronic Receivers                                      |
| 1.1        | Research History                                                                   |
| 1.2        | Research of state-of-the-art CMOS optical receivers                                |
| 1.3        | Scope of this Research                                                             |
| 1.4        | Organization of the Thesis                                                         |
| Chapter 2  | 30-Gb/s OEIC Architecture                                                          |
| 2.1        | Introduction                                                                       |
| 2.2        | Design Challenges                                                                  |
| 2.3        | The Proposed 30-Gb/s OEIC Architecture                                             |
| Chapter 3  | 30-Gb/s OEIC Circuits Design                                                       |
| 3.1        | Transimpedance Amplifier                                                           |
| 3.2        | Low Dropout Regulator                                                              |
| 3.3        | DC Offset Cancellation Buffer                                                      |
| 3.4        | Main Amplifier                                                                     |
| 3.5        | Cascaded Continuous-Time Linear Equalizer                                          |
| 3.6        | Limiting Amplifier                                                                 |
| 3.7        | DC Offset Cancellation Feedback Loop                                               |
| 3.8        | Output Driver                                                                      |
| 3.9        | Bias Circuits and 64-bit Shift Register                                            |
| Chapter 4  | 30-Gb/s OEIC Measurement                                                           |
| 4.1        | Measurement Setup                                                                  |
| 4.2        | Measurement Results                                                                |
| 4.2.1      | Electrical Measurement Results                                                     |
| 4.2.2      | Optical Measurement Results with a 30-Gb/s Off-Chip PD                             |
| 4.2.3      | Optical Measurement Results with a 14-Gb/s Off-Chip PD                             |
| 4.3        | Conclusion                                                                         |
| Chapter 5  | 18-Gb/s Fully Integrated OEIC Architecture                                         |
| 5.1        | Introduction                                                                       |
| 5.2        | Design Challenges                                                                  |
| 5.3        | The Proposed Optical Receiver Architecture                                         |
| Chapter 6  | 18-Gb/s Fully Integrated OEIC Circuits Design                                      |
| 6.1        | On-chip PW/DNW PD                                                                  |
| 6.1.1      | Introduction                                                                       |
| 6.1.2      | The Proposed Topology                                                              |
| 6.2        | Inductive Cascode Inverter-based Transimpedance Amplifier                          |
| 6.3        | Cascaded Continuous-Time Linear Equalizer                                          |
| 6.4        | Adaptive Equalization Loop                                                         |
| 6.4.1      | Introduction                                                                       |
| 6.4.2      | Variable-Gain Low Pass Filter                                                      |
| 6.4.3      | Differential Power Detector                                                        |
| Chapter 7  | 18-Gb/s Fully Integrated OEIC Measurement                                          |
| 7.1        | Measurement Setup                                                                  |
| 7.2        | Measurement of the Proposed PW/DNW PD                                              |
| 7.3        | Measurement Results of the 18-Gb/s Fully Integrated OEIC                           |
| 7.3.1      | Electrical Measurement Results                                                     |
| 7.3.2      | Optical Measurement Results                                                        |
| 7.4        | Conclusion                                                                         |
| Chapter 8  | Technology Options and Physical Implementation Techniques for High Frequency Amplifiers |
| 8.1        | High-Speed Transistor $V_t$ Options and Layout                                     |
| 8.1.1      | Introduction                                                                       |
| 8.1.2      | LNA Test Circuit with Different Layout and $V_t$ Options                           |
| 8.1.3      | Measurement Results                                                                |
| 8.1.4      | Conclusion                                                                         |
| 8.2        | Differential Stacked Spiral Inductor Design                                        |
| 8.2.1      | Introduction                                                                       |
| 8.2.2      | The Differential Stacked Spiral Inductor                                           |
| 8.2.3      | Measurement Results                                                                |
| 8.2.4      | Conclusion                                                                         |
| Chapter 9  | Conclusions and Future Work                                                        |
| 9.1        | Conclusions                                                                        |
| 9.2        | My Own Contributions                                                               |
| 9.3        | Future Work                                                                        |
|            | References                                                                         |
|            | List of Publications                                                               |
|            | List of Patents                                                                    |
| Appendix A | MATLAB Program for Curve Fitting                                                   |
| Appendix B | MATLAB Program for Gain Conversion                                                 |
| Appendix C | Labjack Control for the 64-bit Shift Register                                      |

# **List of Figures**

| Figure      | Caption |
|-------------|---------|
| Fig. 1. 1.  | Total traffic bandwidth increase estimation by Cisco, 2012-2017 [1] |
| Fig. 1. 2.  | Bandwidth and distance comparison for commercial electrical and optical links [7] |
| Fig. 1. 3.  | 100-GbE module evolution [12] |
| Fig. 1. 4.  | Block diagram of the second generation 100-GbE system [13] |
| Fig. 2. 1.  | Conventional optical receiver architecture |
| Fig. 2. 2.  | Architecture of the proposed 30-Gb/s OEIC receiver with cascaded equalization |
| Fig. 3. 1.  | Schematic of the inverter-based TIA [19] |
| Fig. 3. 2.  | (a) Input peaking network and (b) small signal model of the core TIA |
| Fig. 3. 3.  | Simulated frequency response of input peaking network |
| Fig. 3. 4.  | Simulated frequency response of TIA input impedance |
| Fig. 3. 5.  | Simulated frequency response of the TIA |
| Fig. 3. 6.  | Schematic of the tri-loop LDO [22] |
| Fig. 3. 7.  | DOC buffer: (a) schematic, and (b) simulated frequency response |
| Fig. 3. 8.  | Conventional CTLE topology |
| Fig. 3. 9.  | CTLE with inductive peaking |
| Fig. 3. 10. | (a) Schematic of 1-stage CTLE, and (b) simulated frequency response of 3-stage CTLE |
| Fig. 3. 11. | (a) Schematic of the conventional CH amplifier, and (b) simplified block diagram |
| Fig. 3. 12. | (a) Schematic of the modified CH amplifier, and (b) simplified block diagram |
| Fig. 3. 13. | DOC feedback loop |
| Fig. 3. 14. | Schematic of the output driver |
| Fig. 4. 1.  | The full-view of the custom-designed chip-on-board test fixture |
| Fig. 4. 2.  | (a) Electrical S-Parameter, and (b) electrical data eye/BER measurement setup |
| Fig. 4. 3.  | Optical data eye/BER measurement setup for the OEIC |
| Fig. 4. 4.  | Measurement setup |
| Fig. 4. 5.  | The chip-on-board microphotograph |
| Fig. 4. 6.  | Measured frequency response with different CTLE settings |
| Fig. 4. 7.  | Measured electrical data eye (PRBS-15) with (a) CTLE disabled, and (b) CTLE over-equalized |
| Fig. 4. 8.  | Measured PRBS-15 optical data eye with a 30-Gb/s off-chip PD: (a) 27 Gb/s, CTLE disabled, (b) 28 Gb/s, CTLE disabled, and (c) 30 Gb/s, CTLE enabled (0011) |
| Fig. 4. 9.  | Measured BER bathtub curves |
| Fig. 4. 10. | Measured BER versus optical input power |
| Fig. 4. 11. | Measured optical eye with a 14-Gb/s off-chip PD: (a) 28 Gb/s, and (b) 30 Gb/s |
| Fig. 4. 12. | Measured BER bathtub curves |
| Fig. 4. 13. | Measured BER versus optical input power |
| Fig. 5. 1.  | Architecture of the proposed 18-Gb/s fully integrated OEIC receiver with on-chip PD and adaptive equalizer |
| Fig. 6. 1.  | (a) Top-down, and (b) cross-section views of the proposed PW/DNW PD |
| Fig. 6. 2.  | Schematic of the inductive cascode inverter-based TIA |
| Fig. 6. 3.  | Schematic of the first stage CTLE |
| Fig. 6. 4.  | Cascaded 3-stage CTLE |
| Fig. 6. 5.  | (a) Power spectral density of the random data bit stream, and (b) block diagram of AEL [24] |
| Fig. 6. 6.  | Schematic of the variable-gain LPF |
| Fig. 6. 7.  | Simulated gain and bandwidth for variable-gain LPF |
| Fig. 6. 8.  | Schematic of the DPD |
| Fig. 7. 1.  | (a) Microphotograph of the proposed PW/DNW PD, and (b) measurement setup |
| Fig. 7. 2.  | (a) Measured photocurrent with and without illumination light, and (b) measured bias dependency of the responsivity. The inset is the ratio of illumination to dark current vs. $V_{PD}$ |
| Fig. 7. 3.  | Measured reflection coefficients of the PW/DNW PD at 0.5-V $V_{PD}$ |
| Fig. 7. 4.  | Measured vs. fitted optical frequency response of the PD |
| Fig. 7. 5.  | PD model with both intrinsic and extrinsic sub-models |
| Fig. 7. 6.  | CoB testing fixture for (a) electrical measurement, and (b) optical measurement |
| Fig. 7. 7.  | Measured electrical frequency response with different CTLE settings |
| Fig. 7. 8.  | Measured optical PRBS-15 data eyes at the maximum DR in standard and avalanche mode with the CTLE enabled and disabled |
| Fig. 7. 9.  | Measured BER bathtub curves with PD in different modes: (a) 0.5-V standard mode, and (b) 12.3-V avalanche mode |
| Fig. 7. 10. | Measured BER versus optical input power |
| Fig. 8. 1.  | Simplified schematic of the 5-GHz LNAs |
| Fig. 8. 2.  | Layout view (to scale) of (a) design split #1 and #2: Merged_Nor$V_t$ and Merged_L$V_t$; (b) design split #3 and #4: PDK_Nor$V_t$ and PDK_L$V_t$; and (c) design split #5: PDK_L$V_t$_1.7:1 |
| Fig. 8. 3.  | Die photo of one LNA test circuit used in this study |
| Fig. 8. 4.  | Measured results of power gain (S21) for all 5 LNAs |
| Fig. 8. 5.  | Measured results of input matching (S11) for all 5 LNAs |
| Fig. 8. 6.  | Measured results of output matching (S22) for all 5 LNAs |
| Fig. 8. 7.  | Measured results of reverse isolation (S12) for all 5 LNAs |
| Fig. 8. 8.  | Testing setup of the differential NF measurement |
| Fig. 8. 9.  | Measured results of the NF |
| Fig. 8. 10. | Measured results of IIP3 (a) Merged_Nor$V_t$, and (b) PDK_L$V_t$_1.7:1 |
| Fig. 8. 11. | The custom-designed DSSI [68] |
| Fig. 8. 12. | HFSS simulation results of (a) inductance, and (b) quality factor for single-layer and presented customized DSSI |
| Fig. 8. 13. | Microphotograph of the DSSI |
| Fig. 8. 14. | Comparison of measured and simulated inductance of the DSSI |
| Fig. 8. 15. | Comparison of measured and simulated quality factor of the DSSI |

# **List of Tables**

| Table        | Title |
|--------------|-------|
| Table 2. 1:  | Breakdown of the receiver gain and IRN |
| Table 3. 1:  | Parameter summary of simulated input peaking network |
| Table 3. 2:  | Parameter summary of simulated core TIA input impedance |
| Table 4. 1:  | Comparison to recently published CMOS optical receivers |
| Table 6. 1:  | Gain-bandwidth-$V_{LPF}$ table for different data rates |
| Table 7. 1:  | Parameters for the PD's intrinsic model under the standard mode |
| Table 7. 2:  | Comparison with recently published CMOS PDs for $V_{PD} \leq$ supply voltage |
| Table 7. 3:  | Comparison to published CMOS 850-nm optical receivers |
| Table 8. 1:  | Parameters of the 5-GHz LNAs |
| Table 8. 2:  | Summary of LNA test circuit design splits |
| Table 8. 3:  | Post-simulation and measurement result summary |
| Table 8. 4:  | Recommended layout and $V_t$ usage guidelines |
| Table 8. 5:  | Simulation comparison of the single-layer inductor and the customized DSSI |

# Optoelectronic Receivers in Standard CMOS for Short-Range Optical Communications

by

## **Quan PAN**

Department of Electronic and Computer Engineering

The Hong Kong University of Science and Technology

#### **ABSTRACT**

Short-range optical communications with data rates above 10 Gb/s have drawn significant research efforts in recent years as conventional copper cables have become less competitive with respect to weight, energy efficiency, limited channel bandwidth, crosstalk, and electromagnetic interference (EMI). Thus, complementary metal-oxide-semiconductor (CMOS) optoelectronic integrated circuits (OEICs) have become extremely attractive since they can be extensively adopted in short-range optical communications, such as local area networks (LANs), board-to-board interconnects, and data center-to-data center links. In this thesis, two large OEIC systems are designed for different configurations. First, the key challenges and bottlenecks of the two OEICs are discussed and analyzed at the system level. Second, different methodologies are presented to solve them. Third, the two OEICs are designed, fabricated in Taiwan Semiconductor Manufacturing Company (TSMC) standard 1-V 65-nm CMOS technology, and measured.

A 41-mW 30-Gb/s CMOS digitally-controlled OEIC with an off-chip 14-Gb/s Global Communication Semiconductors (GCS) PIN 850-nm photodetector (PD) is achieved with the proposed cascaded equalization approach. The presented OEIC consists of an inverter-based transimpedance amplifier (TIA), a DC offset cancellation (DOC) buffer, a main amplifier (MA), a 3-stage continuous-time linear equalizer (CTLE), a 2-stage limiting amplifier (LA), a DOC feedback loop, an on-chip low dropout (LDO) regulator, a 64-bit shift register, and a 50-$\Omega$ output driver (OD). The electrical measurement results demonstrate that it achieves the highest transimpedance gain (83 dB$\Omega$) and the widest bandwidth (24 GHz) at the lowest power consumption (41 mW) among the CMOS OEICs published to date. The 3-stage CTLE offers 16-dB adjustable low-frequency gain to overcome channel loss and compensate for process, voltage, and temperature (PVT) variations. Furthermore, the optical measurement results show that with a 30-Gb/s PD, the receiver achieves $10^{-12}$ BER for 30-Gb/s, $2^{15}-1$ pseudo-random binary sequence (PRBS) inputs at -5.6-dBm sensitivity and 1.37-pJ/bit efficiency. With a 14-Gb/s PD, the receiver can still reach 30 Gb/s at $10^{-12}$ BER with only 0.6-dB degradation in sensitivity, demonstrating the effectiveness of the proposed receiver design and the cascaded CTLE. With a 1/1.2-V voltage supply, the core area is 0.26 mm$^2$.

A 48-mW 18-Gb/s CMOS fully integrated OEIC with an on-chip PD and an adaptive cascaded equalization approach for 850-nm short-range optical communications provides a single-chip solution with many advantages over hybrid systems, such as low cost and the elimination of bonding and packaging at the key input node. However, CMOS on-chip PDs have extremely limited bandwidth and much smaller responsivity, which makes them the bottleneck of high-speed OEIC systems. To improve the limited bandwidth and responsivity of conventional CMOS on-chip PDs, a new PD topology is proposed, fabricated, and measured. To extend the system's overall bandwidth, a robust slow roll-up CTLE topology, which compensates for both PVT variations and the lossy frequency response of the on-chip PD, is proposed and designed. This OEIC consists of a CMOS P-well/Deep N-well (PW/DNW) on-chip PD, an inductive cascode inverter-based TIA, a DOC buffer, an MA, a 3-stage tunable CTLE, a 2-stage LA, a DOC feedback loop, an adaptive equalization loop (AEL), an on-chip LDO, and an output open-drain buffer (OB). The electrical measurement results show a transimpedance gain of 102 dB$\Omega$ and a bandwidth of 12.5 GHz at a power consumption of 48 mW. Furthermore, the optical measurement results demonstrate a fully integrated solution under (1) standard mode (0.5-V PD reverse bias voltage ($V_{PD}$)), with a record data rate of 9 Gb/s for $2^{15}-1$ PRBS with $10^{-12}$ BER, -4.2-dBm optical input sensitivity, and 5.33-pJ/bit efficiency, and (2) avalanche mode (12.3-V $V_{PD}$), with another record data rate of 18 Gb/s for $2^{15}-1$ PRBS with $10^{-12}$ BER, -4.9-dBm optical input sensitivity, and 2.7-pJ/bit efficiency. With a 1/1.2-V voltage supply, the core area is 0.23 mm$^2$.
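The energy-efficiency and transimpedance-gain figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the thesis appendices: the function names are hypothetical, and it assumes the usual convention that dB$\Omega$ is defined as $20\log_{10}(Z_T / 1\,\Omega)$ and that efficiency is power divided by data rate.

```python
def energy_per_bit_pj(power_mw: float, rate_gbps: float) -> float:
    """Energy per bit in pJ/bit. Units work out as mW / (Gb/s) = pJ/bit."""
    return power_mw / rate_gbps


def dbohm_to_ohm(gain_dbohm: float) -> float:
    """Convert a transimpedance gain in dBOhm to ohms (assumes 20*log10 convention)."""
    return 10 ** (gain_dbohm / 20)


if __name__ == "__main__":
    # (power in mW, data rate in Gb/s) pairs quoted in the abstract
    cases = {
        "30-Gb/s OEIC": (41, 30),
        "18-Gb/s OEIC, standard mode": (48, 9),
        "18-Gb/s OEIC, avalanche mode": (48, 18),
    }
    for name, (p_mw, r_gbps) in cases.items():
        print(f"{name}: {energy_per_bit_pj(p_mw, r_gbps):.2f} pJ/bit")

    # Transimpedance gains quoted in the abstract
    for g in (83, 102):
        print(f"{g} dBOhm is about {dbohm_to_ohm(g) / 1e3:.1f} kOhm")
```

Under these assumptions, the three efficiency values come out at roughly 1.37, 5.33, and 2.67 pJ/bit, in line with the quoted 1.37, 5.33, and 2.7 pJ/bit.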

# **Chapter 1**

# **Introduction to CMOS Optoelectronic Receivers**

For the past few decades, a great surge in data traffic has emerged from the exponential growth of multimedia consumer applications. However, data transmission is becoming the performance-limiting factor, because the data transmission time is becoming longer than the data processing time. According to the Cisco Visual Networking Index (VNI) Forecast, global IP traffic will reach 120.6 exabytes per month by 2017 [1]. Therefore, scientists are working hard on the design of ultrafast communication systems.



Fig. 1. 1. Total traffic bandwidth increase estimation by Cisco, 2012-2017 [1].

### 1.1 Research History

In the early 1840s, fiber optics first came into view during research on light refraction by Daniel Colladon and Jacques Babinet in Paris. In 1870, John Tyndall wrote about the property of "total internal reflection" of light [2]. By the late 1800s, scientists had found that light could travel inside bent rods made of quartz, and the "fiber" was born as a flexible, transparent rod of glass or plastic. In 1954, research on fiber bundles for image transmission was carried out independently by Abraham van Heel of the Technical University of Delft in Holland, and by Harold Horace Hopkins and Narinder Singh Kapany of Imperial College in London [3]. The latter two scientists achieved light transmission through a 75-cm long bundle of several thousand fibers and published their work in Nature in the same year [4].

"Bare" fiber lost too much energy to the surrounding air, as recognized by Brian O'Brien of the American Optical Company. This motivated Abraham van Heel to wrap the fiber core with a cladding, which had a lower refractive index and trapped light in the core according to the "total internal reflection" theory published in 1870. However, the fiber loss was still too high, around 1,000 dB/km, so the fiber could only be used in internal medical examinations.

In the 1950s and 1960s, the invention of the laser as the light source played a critical role in modern fiber optics. Its broadband modulation capability provided great potential for data transmission, although at that time it seemed that no suitable propagation medium was available [3]. In 1966, Charles K. Kao and George A. Hockham of the Standard Telecommunication Laboratory (STL) in Britain were the first to propose the idea that the huge attenuation in fibers at that time was caused by impurities rather than by fundamental physical effects such as scattering.

The light-loss properties of optical fibers were studied systematically, and the right material to use, silica glass with high purity, was finally pointed out. Their theory showed that the attenuation of optical fibers could be reduced below 20 dB/km. The discovery earned Kao the Nobel Prize in Physics in 2009 [4], [5]. In 1970, four years after Kao and Hockham's proposal, researchers Robert D. Maurer, Donald Keck, Peter C. Schultz, and Frank Zimar of Corning Glass Works (CGW) together demonstrated an optical fiber achieving a 17-dB/km attenuation by doping silica glass with titanium. Five years later, by using germanium dioxide as the core dopant, an optical fiber with only 4-dB/km attenuation was produced, and the attenuation was further reduced to 0.2 dB/km in 1979.

With a new optical light source and a suitable propagation medium both available, optical communication systems showed great potential for a new revolution. Fiber-optic links are significantly superior to conventional electrical cable links and wireless links in terms of cost, bandwidth, channel loss per kilometer, crosstalk, and electromagnetic interference (EMI). For example, the bandwidth of optical fiber is roughly 25 to 50 GHz and the loss is around 0.15 to 0.2 dB/km. By comparison, the loss of twisted-pair cables is 200 dB/km at 100 MHz, and the loss of low-cost coaxial cables is 500 dB/km at 1 GHz. Wireless transmission with carrier frequencies of several gigahertz suffers from an attenuation of tens of decibels per few meters while supporting data rates of only up to 100 Mb/s [3].
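To make the loss figures above more tangible, the following sketch converts an attenuation in dB/km into the fraction of power that survives a given link length, and into the length that a given loss budget allows. It is an illustration only: the function names are hypothetical, and the 20-dB loss budget and the 100-m example length are assumptions chosen for the comparison, not values from this thesis.

```python
def remaining_power_fraction(atten_db_per_km: float, length_km: float) -> float:
    """Fraction of the launched power left after length_km of the medium."""
    total_loss_db = atten_db_per_km * length_km
    return 10 ** (-total_loss_db / 10)


def max_length_km(atten_db_per_km: float, loss_budget_db: float) -> float:
    """Longest link whose total loss stays within loss_budget_db."""
    return loss_budget_db / atten_db_per_km


if __name__ == "__main__":
    # Attenuation figures quoted in the text (dB/km)
    media = {
        "optical fiber": 0.2,
        "twisted pair at 100 MHz": 200.0,
        "low-cost coax at 1 GHz": 500.0,
    }
    length_km = 0.1        # assumed example length: 100 m
    budget_db = 20.0       # assumed loss budget: 1% of the power remains
    for name, atten in media.items():
        frac = remaining_power_fraction(atten, length_km)
        reach = max_length_km(atten, budget_db)
        print(f"{name}: {frac:.3g} of power left after 100 m, "
              f"{reach:.3g} km reach for a {budget_db:.0f}-dB budget")
```

With the assumed 20-dB budget, the fiber reaches on the order of 100 km, while the twisted pair and coaxial cable are limited to about 100 m and 40 m, respectively.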

In practical applications, cost is always one of the most important concerns. For long-haul data communications, high-speed fiber-optic links implemented in expensive III-V materials, such as GaAs and InP-InGaAs, have already replaced the electrical alternative. These expensive materials are mainly used in the conversion components between optics and electronics, and in the driver/transimpedance amplifier stages. Since a large number of users share the long-haul fiber-optic links, the cost per user is relatively low [6]. For example, for long-haul data communications between continents or countries, commercial fiber-optic links have completely replaced electrical links.



Fig. 1. 2. Bandwidth and distance comparison for commercial electrical and optical links [7].

However, for short-haul communications, the so-called last mile of the internet, such as Local Area Networks (LANs) and board-to-board, chip-to-chip, and data center-to-data center links, channels cannot be shared, and thus fiber-optic links based on these expensive III-V materials can no longer be afforded. Therefore, the cost consideration dominates in short-haul applications. That is why copper cable technology, such as coaxial cable and power line communication, still dominates short-haul communications today. Copper cable technology is a decent solution as it can support each subscriber with bandwidth up to several gigahertz. However, great difficulties have emerged as subscribers begin to ask for higher bandwidth. As shown in Fig. 1.2, the bandwidth-distance product of copper cable can no longer meet the requirement of 100 Gb/s·m, whereas optical fiber can. The trend is therefore for optical fiber to find a cost-effective way to replace copper cable in short-haul communications [6].

#### 1.2 Research of state-of-the-art CMOS optical receivers

With the rapid development of complementary metal-oxide-semiconductor (CMOS) technology, the chip fabrication cost has been reduced significantly. Many electronic circuits have been moved from various expensive technologies to the CMOS platform. Therefore, implementing optical communication systems in CMOS technology has great potential to provide both higher performance and lower cost [3], [8].

However, CMOS technology has its own drawbacks. Compared to expensive III-V technologies, not only is it much noisier, since all circuits share the same substrate, but its speed is also much lower due to limited device characteristics. Moreover, as the technology scales down to the sub-micrometer level, the supply voltage also drops, which introduces headroom problems and a more stringent requirement on the signal-to-noise ratio (SNR) performance of CMOS circuits.

Furthermore, the conversion circuits between optics and electronics and the driver/transimpedance amplifier stages are performance-limited in CMOS technology, especially the CMOS photodetectors (PDs). Compared with commercial PDs made in expensive III-V technologies with tens of GHz bandwidth and several hundreds of mA/W responsivity, the CMOS alternative can normally achieve only several-MHz bandwidth and about 30-mA/W responsivity [9]. Therefore, to satisfy the ever-growing demand for data transmission, new CMOS PD topologies and matching CMOS receiver architectures are needed to boost the data rate.

For CMOS optical receivers with integrated CMOS PDs, no industry standard has been established yet. Under the standard PD reverse bias ($V_{PD}$) condition ($V_{PD} \le V_{DD}$), the maximum data rate and efficiency achieved are 8.5 Gb/s and 5.53 pJ/bit, respectively [10]. Under the avalanche $V_{PD}$ condition ($V_{PD} > V_{DD}$), the maximum data rate and efficiency are 12.5 Gb/s and 4.72 pJ/bit, respectively [11].

For CMOS optical receivers with off-chip PDs made in expensive III-V technologies, industry standards are already available for different applications. For example, 100-Gbit Ethernet (100-GbE) has been a very active topic in the past few years and has already begun to evolve from the $1^{st}$ generation standard of $10 \times 10$ Gb/s to the $2^{nd}$ generation of $4 \times 25$ Gb/s, as shown in Fig. 1.3 [12].



Fig. 1.3. 100-GbE module evolution [12].

Fig. 1.4 shows the block diagram of the 2<sup>nd</sup> generation 100-GbE system. The 100-GbE system takes full advantage of advanced sub-micrometer technology and reduces the I/O density. It consists of two main parts: the optical transmitter and the optical receiver. The data pass through the transmitter clock and data recovery (CDR) circuit, the laser driver, and the laser, and are transferred to the optical receiver via the optical links (single-mode fibers (SMFs) or multi-mode fibers (MMFs)). The photodiode/photodetector then converts the input optical light into the output electrical photocurrent. After data cleanup and compensation by the optical receiver and CDR, the desired data signal goes to the DSP at the receiver end.



Fig. 1.4. Block diagram of the second generation 100-GbE system [13].

Moreover, many academic research works above 25-Gb/s data rate have been carried out in recent years. T. Takemoto from Hitachi has demonstrated a state-of-the-art optical receiver with 25–28-Gb/s data rate and 1.78-pJ/bit efficiency in 2013 [14].

In this thesis, CMOS optical receivers both with and without CMOS on-chip PDs are investigated, designed, fabricated, and measured, and the results are summarized.

## 1.3 Scope of this Research

In this thesis, one 30-Gb/s CMOS optoelectronic integrated circuit (OEIC) system with digitally-tunable cascaded equalization and one 18-Gb/s CMOS OEIC system with on-chip PD and adaptive continuous-time linear equalizer (CTLE) are proposed, designed, fabricated, and measured in Taiwan Semiconductor Manufacturing Company (TSMC) standard 1-V 65-nm CMOS technology. The design flow is as follows:

- 1) First, Verilog-A tools are used to build system behavior models for these two OEIC systems. Through this procedure, the main challenges and bottlenecks are studied carefully.
- 2) Based on the Verilog-A behavior model, new device, circuit, and system topologies are proposed to solve the limitations.
- 3) With the TSMC 1-V 65-nm technology process design kit (PDK), these two OEICs are realized through circuit design, layout, and post-layout simulation.
- 4) With high-performance equipment, these two OEICs are measured successfully.
- 5) Performance comparisons with published OEICs are made and future work is recommended.

# 1.4 Organization of the Thesis

In Chapter 1, an introduction of optical communications is provided. First, the research history is depicted from the 1800s. Second, the advantages and disadvantages of both electrical links (copper cables) and optical links (SMFs or MMFs) are discussed and compared. Moreover, the CMOS technology and the expensive III-V technology are also compared. Third, the state-of-the-art CMOS optical receivers with and without CMOS on-chip PDs are summarized, respectively.

In Chapter 2, the architecture and system level design of a 41-mW 30-Gb/s CMOS OEIC with digitally-tunable cascaded equalization are proposed and analyzed. The order of the gain and CTLE stages in the receiver chain is optimized to achieve a minimum input referred noise (IRN) for the best input sensitivity.

In Chapter 3, each building block of a 30-Gb/s CMOS OEIC is characterized. To achieve the required low IRN, sufficient transimpedance gain, and adequate bandwidth, novel circuit topologies and different bandwidth enhancement techniques are adopted.

In Chapter 4, the measurement setups and measurement results of the proposed 30-Gb/s OEIC are shown in detail. The electrical measurement results achieve the highest transimpedance gain (83 dB$\Omega$) and the widest bandwidth (24 GHz) at the lowest power consumption (41 mW) among the CMOS optical receivers above 25 Gb/s published to date.

In Chapter 5, the architecture and system level design of a 48-mW 18-Gb/s fully integrated CMOS optoelectronic integrated circuit (OEIC) with an on-chip PD and an adaptive equalizer are proposed and analyzed. The bottleneck of this system is discussed. Novel device and circuit topologies are proposed to further boost the bandwidth and sensitivity.

In Chapter 6, the unique building blocks of the 18-Gb/s OEIC are analyzed. The novel P-well/Deep N-well PD, the novel low-roll up frequency response of the cascaded 3-stage CTLE, and the adaptive equalization loop are discussed in detail.

In Chapter 7, the measurement results of the proposed 18-Gb/s OEIC, which achieves both new record data rates under standard mode and avalanche mode, are demonstrated.

In Chapter 8, the technology options and physical implementation techniques for high frequency amplifiers are studied using a set of 5-GHz differential cascode low noise amplifiers (LNAs). Moreover, to reduce the size of the inductor used for shunt/series peaking in the high frequency amplifiers, a differential stacked spiral inductor (DSSI) is presented to increase the inductance density by 3 times. Layout guidelines are recommended for optimum high-speed amplifier designs.

Finally, conclusions, my own contributions, and suggestions for future work are presented in Chapter 9.

# Chapter 2

#### 30-Gb/s OEIC Architecture

#### 2.1 Introduction

The conventional optical receiver architecture is shown in Fig. 2.1. The modulated light from the optical transmitter is channeled to the optical receiver via a multi-mode fiber (MMF) or single-mode fiber (SMF). The on-chip or off-chip photodetector (PD) then converts the incoming modulated light into photocurrent. As the following stage, the preamplifier magnifies the received  $\mu$ A-level photocurrent to an mV-level voltage signal, and then the post amplifier further boosts the signal swing to several hundreds of mV to drive the analog/digital interface.



Fig. 2.1. Conventional optical receiver architecture.

However, this optical receiver architecture is only sufficient for systems with low data rate, up to several hundred MHz. To accommodate the emerging 100-Gbit Ethernet (100-GbE), new receiver architectures are required. As shown in Chapter 1, the 100-GbE system has created the need for optical transceiver ICs operating at 25–28 Gb/s. In this chapter, we propose a state-of-the-art 30-Gb/s complementary metal-oxide-semiconductor (CMOS) optoelectronic integrated circuit (OEIC) architecture.

#### 2.2 Design Challenges

In 90-nm and 65-nm CMOS processes, recent publications have shown an ability to achieve power efficiency of around 2–3 pJ/bit [14]–[16]. To further improve power efficiency, more advanced processes with higher device speed can be adopted but this increases cost. Alternatively, equalization circuits such as a continuous-time linear equalizer (CTLE) and a decision feedback equalizer (DFE) [17] can be utilized to boost the data rate with the existing 90-nm or 65-nm technology. In this work, a new record of 1.37-pJ/bit power efficiency is achieved with the proposed cascaded CTLE equalization topology.
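The power-efficiency figure quoted above is the energy spent per received bit, that is, power divided by data rate. A minimal Python sketch of that arithmetic follows, using the 41-mW and 30-Gb/s figures of this work; the function name is illustrative only.

```python
def energy_per_bit_pj(power_w: float, data_rate_bps: float) -> float:
    """Return energy per bit in pJ/bit for a given power (W) and data rate (b/s)."""
    return power_w / data_rate_bps * 1e12


# 41 mW at 30 Gb/s, the figures quoted for the proposed receiver.
efficiency = energy_per_bit_pj(41e-3, 30e9)
print(f"{efficiency:.2f} pJ/bit")
```

The same function can be applied to the other receivers cited in this chapter to compare them on an equal footing.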

To achieve a bit error rate (BER) of $10^{-12}$, a stringent signal-to-noise ratio (SNR) is required at the input of the OEIC system. Since the desired photocurrent generated by the off-chip PD is at the $\mu$A level, the total input-referred integrated noise of the OEIC system should be ultra-low, especially that of the first circuit stage: the transimpedance amplifier (TIA).
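As a rough illustration of how stringent this requirement is, the sketch below uses the standard Gaussian-noise approximation for a binary signal, BER = 0.5·erfc(Q/√2). This relation is a textbook assumption and is not a formula stated in this thesis; the function names are illustrative. Under this assumption, a BER of $10^{-12}$ corresponds to a Q factor of about 7, so the peak-to-peak signal current must be roughly 14 times the rms input-referred noise current.

```python
import math


def ber_from_q(q: float) -> float:
    """BER of a binary signal in Gaussian noise for a given Q factor (assumed model)."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))


def q_for_ber(target_ber: float) -> float:
    """Find Q such that ber_from_q(Q) equals target_ber, by bisection."""
    lo, hi = 0.0, 20.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if ber_from_q(mid) > target_ber:
            lo = mid  # BER still too high: a larger Q is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)


q_required = q_for_ber(1e-12)
print(f"Q for BER 1e-12: {q_required:.2f}")
# Peak-to-peak signal current relative to rms noise current is 2 * Q.
print(f"i_pp / i_n,rms: {2 * q_required:.1f}")
```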

Moreover, the circuit bandwidth limitations also cause design difficulties within the existing 65-nm CMOS process. Therefore, different bandwidth enhancement techniques are utilized in this work, including series inductive peaking, parallel inductive peaking, and negative capacitance compensation (NCC) techniques.

### 2.3 The Proposed 30-Gb/s OEIC Architecture

Fig. 2.2 shows the block diagram of the proposed 30-Gb/s CMOS OEIC. The optical receiver consists of an inverter-based TIA, a DC offset cancellation (DOC) buffer, a main amplifier (MA), a 3-stage CTLE, a 2-stage limiting amplifier (LA), a DOC feedback loop, an on-chip low dropout (LDO) regulator, a 64-bit shift register, and a $50-\Omega$ output driver (OD). To balance the input loading capacitance to the pseudo-differential TIA, a pad and an on-chip dummy diode are added.



Fig. 2.2. Architecture of the proposed 30-Gb/s OEIC receiver with cascaded equalization.

For a target input sensitivity of -5-dBm optical power, the minimum input photocurrent from a commercial PD with 0.6-A/W responsivity is 190 $\mu$A. Thus, to deliver a 300-mV differential peak-to-peak output swing, the receiver needs to provide at least 64 dB$\Omega$ of gain. To ensure that the limiting amplifier still limits at the minimum receiver gain, an additional 3 dB of gain is allocated.
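The gain budget above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the variable and function names are not from the thesis, and the numbers are the ones quoted in the paragraph.

```python
import math


def dbm_to_watts(p_dbm: float) -> float:
    """Convert optical power from dBm to watts."""
    return 1e-3 * 10 ** (p_dbm / 10.0)


sensitivity_dbm = -5.0       # target input sensitivity (from the text)
responsivity_a_per_w = 0.6   # commercial PD responsivity (from the text)
output_swing_v = 0.3         # required differential peak-to-peak swing (from the text)

photocurrent_a = dbm_to_watts(sensitivity_dbm) * responsivity_a_per_w
gain_ohm = output_swing_v / photocurrent_a
gain_dbohm = 20.0 * math.log10(gain_ohm)

print(f"photocurrent: {photocurrent_a * 1e6:.0f} uA")
print(f"minimum gain: {gain_dbohm:.1f} dBOhm")
print(f"with 3-dB margin: {gain_dbohm + 3.0:.1f} dBOhm")
```

The result with margin corresponds to the lower end of the total gain range targeted for the receiver.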

Since the input signal from the off-chip PD is single-ended, a pseudo-differential TIA is designed. However, its differential outputs have a large offset. If the TIA outputs were fed directly into the following circuit stages, they would saturate the receiver. Therefore, a DOC buffer is added between the TIA and the MA stage to suppress the DC offset before further amplification.

As listed in Table 2.1, the TIA is designed to maximize its gain (42 dB$\Omega$) and BW (21 GHz) while keeping its input referred noise (IRN) (15.8 pA/$\sqrt{\rm Hz}$) to less than half of the total IRN (33.6 pA/$\sqrt{\rm Hz}$) when the CTLE is enabled and has a low-frequency gain of -6 dB. To meet the overall gain requirement, the MA and LA together provide 31 dB of gain. Simulations reveal that with the MA placed before the CTLE, the receiver integrated IRN over 25 GHz is 6.9 $\mu$A, whereas if the placement of the MA and CTLE is reversed, the IRN increases significantly to 11.7 $\mu$A. When the CTLE is disabled, the order of the MA and CTLE does not affect the total IRN because the CTLE then provides a flat 10-dB gain up to 26 GHz and hence is able to suppress most of the noise contribution from the subsequent 2-stage LA. To remove the accumulated offset voltages from the MA, CTLE, and LA stages, a DOC feedback loop is utilized. Simulations show that the DOC buffer and feedback loop together provide 50 dB of offset cancellation.
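The gain and power budget of Table 2.1 can be cross-checked by summing the per-stage entries. The sketch below assumes the table columns are read as TIA, DOC buffer, MA, CTLE (enabled or disabled), and LA, and that the CTLE power is counted once; the dictionary names are illustrative.

```python
# Per-stage gain in dB (dBOhm for the TIA) and power in mW, taken from Table 2.1.
gain_db = {"TIA": 42, "DOC": 0, "MA": 10, "LA": 21}
ctle_gain_db = {"enabled": -6, "disabled": 10}
power_mw = {"TIA": 5, "DOC": 4, "MA": 8, "CTLE": 8, "LA": 15, "LDO+bias": 1}

fixed_gain = sum(gain_db.values())
total_gain = {mode: fixed_gain + g for mode, g in ctle_gain_db.items()}
total_power = sum(power_mw.values())

print(total_gain)   # total transimpedance gain in dBOhm for each CTLE mode
print(total_power)  # total power in mW
```

The two totals span the gain range listed in the last column of the table, and the MA and LA entries add up to the 31 dB mentioned in the text.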

Table 2.1: Breakdown of the receiver gain and IRN.

|                        | TIA                   | DOC  | MA   | CTLE (enabled) | CTLE (disabled) | LA    | Total |
|------------------------|-----------------------|------|------|----------------|-----------------|-------|-------|
| Gain (dBΩ, dB)         | $42 \text{ dB}\Omega$ | 0    | 10   | -6             | 10              | 21    | 67–83 |
| −3-dB BW (GHz)         | 21                    | 24   | 28   | N/A            | 26              | 25    | N/A   |
| Power (mW)             | 5                     | 4    | 8    | 8              | 8               | 15    | 41^   |
| IRN (pA/$\sqrt{Hz}$)   | 15.8                  | 23.8 | 28.2 | 30.4           | 28.8            | 33.6# | 33.6# |

<sup>^</sup>Including 1 mW for LDO and bias. # With CTLE enabled. When the CTLE is disabled, the total IRN is reduced to $28.9 \text{ pA}/\sqrt{\text{Hz}}$.

# Chapter 3

# 30-Gb/s OEIC Circuits Design

### 3.1 Transimpedance Amplifier

As the first circuit block in the receiver chain, the transimpedance amplifier (TIA) typically determines the noise and bandwidth performance of an optoelectronic integrated circuit (OEIC). The large parasitic capacitance of the photodetector (PD) requires a low TIA input impedance to support a high bandwidth. Meanwhile, the transimpedance gain of the TIA ($Z_{TIA}$) should be sufficiently large to suppress the input-referred noise (IRN) contributed by subsequent stages. To satisfy the above requirements while maintaining low power and low distortion, several techniques have been reported. As one of the mainstream TIA topologies, the regulated cascode (RGC) [14], [18] reduces the TIA input impedance by using local feedback at the expense of introducing extra noise. Closed-loop TIAs using a common-source amplifier with inductive feedback and a complementary metal-oxide-semiconductor (CMOS) inverter amplifier with pure resistive feedback have been reported in [15] and [16], respectively. However, none of these designs can provide both sufficient gain and sufficient bandwidth with the existing process and a 1-V supply.

In this work, an inverter-based TIA employing input series peaking and shunt-shunt inductive feedback is designed, as shown in Fig. 3.1. The 4-bit binary-weighted (1:2:4:8) resistor $R_f$ is utilized to digitally control the transimpedance gain of the TIA. For this design, $R_f$ consists of poly-type resistors and NMOS transistors. The 4-bit digital control provides a total tuning range of 4 dB to compensate for different PD loadings and process, voltage, and temperature (PVT) variations. If more than 4 bits were used, more NMOS transistors would be added, introducing more capacitance at the input node and consequently reducing the bandwidth of the whole receiver.



Fig. 3.1. Schematic of the inverter-based TIA [19].

The TIA is a self-biased circuit through the feedback path ( $R_f$  and  $L_f$ ). At low frequency, the TIA transimpedance gain  $Z_{TIA}$  and the TIA bandwidth  $f_{TIA}$  can be expressed as

$$Z_{TIA} = R_f \cdot \frac{A_{inv}}{1 + A_{inv}} \tag{3.1}$$

$$f_{TIA} = A_{inv} \cdot \frac{1}{2\pi R_f(C_{PD} + C_{in})} \tag{3.2}$$

$$A_{inv} = G_m \cdot (r_o \parallel R_f) \tag{3.3}$$

where $A_{inv}$ is the open-loop gain of the TIA, $C_{PD}$ is the capacitance of the photodetector, and $C_{in}$, $G_m$, and $r_o$ represent the total input capacitance, small-signal transconductance, and output resistance of the CMOS inverter, respectively. The TIA transistor sizes are optimized for the trade-off among gain, bandwidth, and input referred noise (IRN) [19].
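To make the low-frequency expressions (3.1) to (3.3) concrete, the following sketch evaluates them for one set of component values. Only the 180-fF photodetector capacitance appears in this chapter; the remaining values ($R_f$, $G_m$, $r_o$, $C_{in}$) are hypothetical and chosen purely for illustration, so the printed numbers do not represent the fabricated design.

```python
import math


def parallel(r1: float, r2: float) -> float:
    """Resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def tia_low_frequency(r_f, g_m, r_o, c_pd, c_in):
    """Return (A_inv, Z_TIA in ohms, f_TIA in Hz) from Eqs. (3.1)-(3.3)."""
    a_inv = g_m * parallel(r_o, r_f)                        # Eq. (3.3)
    z_tia = r_f * a_inv / (1.0 + a_inv)                     # Eq. (3.1)
    f_tia = a_inv / (2.0 * math.pi * r_f * (c_pd + c_in))   # Eq. (3.2)
    return a_inv, z_tia, f_tia


# Hypothetical values except C_PD = 180 fF, which is quoted in this chapter.
a_inv, z_tia, f_tia = tia_low_frequency(
    r_f=300.0, g_m=40e-3, r_o=200.0, c_pd=180e-15, c_in=120e-15
)
print(f"A_inv = {a_inv:.2f}")
print(f"Z_TIA = {z_tia:.1f} ohm ({20 * math.log10(z_tia):.1f} dBOhm)")
print(f"f_TIA = {f_tia / 1e9:.2f} GHz")
```

The sketch shows the trade-off captured by the equations: a larger $R_f$ raises $Z_{TIA}$ but lowers $f_{TIA}$ for a fixed open-loop gain and input capacitance.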

At high frequency, the TIA gain and bandwidth are much more difficult to analyze. To understand the full transfer function, a $\pi$ input peaking network and a small-signal model of the core TIA are shown in Fig. 3.2. The total transfer function $Z_{TIA}$ is given by

$$Z_{TIA} = H_1(s) \cdot Z_{core}(s) \tag{3.4}$$

where  $H_1(s)$  is the current mode transfer function of the  $\pi$  input peaking network, and  $Z_{core}(s)$  is the transimpedance gain of the core TIA. As shown in Fig. 3.2 (a), a series peaking inductor  $L_{bondwire}$  is adopted at the input of the TIA, to isolate the capacitor of the photodetector  $C_{PD}$  from the input capacitor of the core TIA  $C_{in}$ , which delays the current flowing into the core TIA and reduces the rising time at the input node. The function of  $H_1(s)$  can be derived as

$$H_{1}(s) = \frac{I_{in}}{I_{PD}} = \frac{sC_{in}R_{in} + 1}{s^{3}C_{in}C_{PD}R_{in}L_{bondwire} + s^{2}C_{PD}L_{bondwire} + s(C_{PD} + C_{in})R_{in} + 1}$$
(3.5)

where  $R_{in}$  is the equivalent input resistance of the core TIA. With the existing large capacitance at the input node, to ensure an acceptable -3-dB bandwidth for the receiver,  $R_{in}$  is designed to be 50  $\Omega$  to achieve a low RC product at the system input.



Fig. 3.2. (a) Input peaking network and (b) small signal model of the core TIA.

To better understand the transfer function of  $H_1(s)$ , a simulation method in [20] is adopted to determine the parameters of the input peaking network. Therefore,  $H_1(s)$  is re-written as,

$$H_1(s) = \frac{\frac{s}{\omega_o}k + 1}{\left(\frac{s}{\omega_o}\right)^3 \frac{k}{m}(1 - k) + \left(\frac{s}{\omega_o}\right)^2 \frac{(1 - k)}{m} + \frac{s}{\omega_o} + 1}$$
(3.6)

where $k = C_{in}/(C_{in} + C_{PD})$, $m = R_{in}^2(C_{in} + C_{PD})/L_{bondwire}$, and $\omega_o = 1/(R_{in}(C_{in} + C_{PD}))$. An optimal bandwidth enhancement ratio (BWER) of 2.5, with only 1-dB ripple, is achieved when $m = 1.75$, as illustrated in Fig. 3.3 and Table 3.1.
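The normalized form of Eq. (3.6) is easy to explore numerically. The sketch below evaluates $|H_1|$ on a grid of normalized frequencies $x = \omega/\omega_o$ and reports the peaking and the first −3-dB crossing. The helper names and the way ripple and bandwidth are measured here are assumptions made for illustration, so the printed numbers are not claimed to reproduce Table 3.1, whose reference bandwidth is not restated in this section.

```python
import math


def h1(x: float, k: float, m: float) -> complex:
    """Evaluate Eq. (3.6) at s = j*x*omega_o, where x is normalized frequency."""
    s = 1j * x
    numerator = s * k + 1.0
    denominator = (s**3) * k * (1.0 - k) / m + (s**2) * (1.0 - k) / m + s + 1.0
    return numerator / denominator


def peaking_db(k: float, m: float, x_max: float = 10.0, steps: int = 5000) -> float:
    """Largest rise of |H1| above its DC value (0 dB) on a linear grid, in dB."""
    peak = max(abs(h1(x_max * i / steps, k, m)) for i in range(steps + 1))
    return 20.0 * math.log10(peak)


def bandwidth_3db(k: float, m: float, x_max: float = 10.0, steps: int = 5000):
    """First normalized frequency where |H1| falls below 1/sqrt(2), or None."""
    threshold = 1.0 / math.sqrt(2.0)
    for i in range(1, steps + 1):
        x = x_max * i / steps
        if abs(h1(x, k, m)) < threshold:
            return x
    return None


k = 0.4  # value of k used in Table 3.1
for m in (3.50, 1.75, 0.85):
    print(f"m = {m:4.2f}: peaking = {peaking_db(k, m):.2f} dB, "
          f"-3-dB point at x = {bandwidth_3db(k, m):.2f}")
```

Sweeping `m` in this way shows the qualitative trade-off discussed above: a smaller `m` (larger bond-wire inductance) extends the bandwidth at the cost of more peaking.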



Fig. 3.3. Simulated frequency response of input peaking network.

Table 3.1 Parameter summary of simulated input peaking network.

| $k = C_{in}/(C_{PD} + C_{in})$ | Ripple (dB) | $m = R_{in}^2(C_{in} + C_{PD})/L_{bondwire}$ | BWER |
|--------------------------------|-------------|----------------------------------------------|------|
| 0.4                            | 1.7         | 3.50                                         | 2.0  |
| 0.4                            | 1.0         | 1.75                                         | 2.5  |
| 0.4                            | 2.3         | 0.85                                         | 3.8  |

To investigate the transfer function of the core TIA,  $Z_{core}(s)$ , the small signal model of the CMOS inverter amplifier including the shunt-shunt feedback network is depicted in Fig 3.2 (b). The loading effect of the feedback network is incorporated into the input and output impedance of the inverter amplifier, respectively. These inductive loadings  $(Y_{II,fb})$  and  $(Y_{II,fb})$  an

The core TIA's input impedance,  $Z_{in}(s)$ , is depicted as follows,

$$Z_{in}(s) = \frac{V_g}{I_{IN}} = \frac{\frac{s}{\omega_o m_1} + 1}{\left(\frac{s}{\omega_o}\right)^4 \frac{k}{m_1 m_2} (1 - k) + \left(\frac{s}{\omega_o}\right)^3 \frac{k(1 - k)}{m_2} + \left(\frac{s}{\omega_o}\right)^2 \left(\frac{(1 - k)}{m_2} + \frac{1}{m_1}\right) + \frac{s}{\omega_o} + 1}$$
(3.7)

where $k = C_{in}/(C_{in} + C_G)$, $m_1 = R_f^2(C_{in} + C_G)/L_f$, $m_2 = R_f^2(C_{in} + C_G)/L_{S2}$, and $\omega_o = 1/(R_f(C_{in} + C_G))$. With the same technique used in the analysis of $H_1(s)$, the simulated frequency response of the core TIA input impedance is shown in Fig. 3.4. As listed in Table 3.2, when $m_1 = 6.25$ and $m_2 = 2$, an optimal BWER of 2.7 is achieved with 1-dB ripple.



Fig. 3.4. Simulated frequency response of TIA input impedance.

The transimpedance function of the core TIA is

$$Z_{core}(s) = \frac{a(s)}{1 + a(s) \cdot f(s)} \tag{3.8}$$

$$a(s) = Z_{in}(s) \cdot G_m \cdot Z_{out}(s) \tag{3.9}$$

where $G_m = g_{m,n} + g_{m,p}$ is the equivalent transconductance of the CMOS inverter amplifier, $Z_{out} = r_o \parallel 1/(sC_L) \parallel (R_f + sL_f)$ is the output impedance of the core amplifier, and $f(s) = 1/(R_f + sL_f)$ is the feedback factor.

Finally, using equation (3.4), the simulated performance with both inductive peaking networks is shown in Fig. 3.5, achieving a total BWER of 2.5 and a ripple of 1 dB.

Table 3.2 Parameter summary of simulated core TIA input impedance.

| $k = C_{in}/(C_G + C_{in})$ | Ripple (dB) | $m_1 = R_f^2(C_{in} + C_G)/L_f$ | $m_2 = R_f^2(C_{in} + C_G)/L_{S2}$ | BWER |
|-----------------------------|-------------|---------------------------------|------------------------------------|------|
| 0.6                         | 3           | 6.25                            | 4                                  | 3.5  |
| 0.6                         | 1           | 6.25                            | 2                                  | 2.7  |
| 0.6                         | 2.2         | 6.25                            | 1.25                               | 2.1  |

Due to the large $C_{PD}$ (180 fF), a bond wire $L_{bondwire}$ is placed in series at the input of the TIA. With the help of $L_{bondwire}$, $C_{PD}$ is charged and discharged faster, which sharpens the rising and falling edges of the input data, accelerates the signal transitions, and boosts the bandwidth. In this design, $L_{bondwire}$ sets the input series peaking factor to limit the maximum ripple in the gain frequency response. Its inductance is chosen to be 0.75 nH ± 25%, which is feasible for real implementation. The second and third peaking at the input are introduced by on-chip inductors $L_{s2}$ and $L_f$, respectively [19].

Given a 180-fF $C_{PD}$, the simulation results shown in Fig. 3.5 reveal that the TIA bandwidth is improved from 7.5 GHz to 21 GHz. The simulated average IRN current density is 15.8 $pA/\sqrt{Hz}$ within 25 GHz.



Fig. 3.5. Simulated frequency response of the TIA.

# 3.2 Low Dropout Regulator

To suppress the wideband noise from the power supply and to alleviate the loading effect due to the supply bond-wires, a tri-loop fully-integrated low dropout (LDO) regulator is proposed to provide the 1-V voltage supply of the TIA and DC offset cancellation (DOC) buffer, as shown in Fig. 3.6 [22]. To support the TIA and DOC buffer running at 25 Gb/s, the LDO should have an ultra-fast response and also a large output capacitance. Typical fully-integrated LDOs cannot provide full-spectrum power supply rejection (PSR) and good transient response since they set the dominant pole at the internal node and use only a small output capacitance [21]. Since the loading current of this LDO is small (about 9 mA) compared to that of general-purpose LDOs (about 100 mA), it is advantageous to place the LDO's dominant pole at its output node for high PSR and fast response. Most of the limited available capacitance (silicon area) is allocated to its output node (dominant pole), while the internal poles are pushed to frequencies higher than the unity-gain frequency by using buffer impedance attenuation (BIA) and flipped voltage follower (FVF) techniques [22].

Fig. 3.6. Schematic of the tri-loop LDO [22].

#### 3.3 DC Offset Cancellation Buffer

Due to device mismatches or the single-ended input, DC offset is unavoidable. The offset from previous stages and within each amplifier stage would be amplified by the following stages and push the circuits away from their normal DC operating points. Conventionally, direct AC coupling is a simple and straightforward solution, but the large coupling capacitors would occupy a large chip area and their parasitic capacitance would dramatically degrade the bandwidth, which is not desirable for this design [23].

To remove the large DC offset from the pseudo-differential TIA, a DOC buffer from [10] is inserted between the TIA stage and the MA stage, as shown in Fig. 3.7(a). When determining the DOC buffer's corner frequency, there is a trade-off between the IRN and the DOC capability. If more DOC capability is allocated, which means a higher DOC corner frequency, there will be more IRN at the system input due to the loss at low frequency. In this work, to achieve a good trade-off between DOC capability and IRN, the corner frequency is designed to be the optimal value of 15 kHz, as shown in Fig. 3.7(b).



Fig. 3.7. DOC buffer: (a) schematic, and (b) simulated frequency response.

### 3.4 Main Amplifier

As discussed in Chapter 2, the main amplifier (MA) stage is inserted between the DOC buffer and the 3-stage CTLE to decrease the IRN contributed by the subsequent circuit stages. The MA is very important in the system because its previous stage, the DOC buffer, has a voltage gain of only 0 dB, and its subsequent stage, the 3-stage continuous-time linear equalizer (CTLE), has a voltage gain of -6 dB when equalization is enabled. Without the MA, the noise from the amplifiers after the DOC buffer will be amplified and deteriorate the input sensitivity. A modified Cherry Hooper topology with a shunt-peaking differential inductor, cross-coupled capacitors, and a 4-bit binary-weighted control to compensate for PVT variations is adopted, which is the same topology as that of the limiting amplifier (LA). This topology will be analyzed in detail in Section 3.6.

### 3.5 Cascaded Continuous-Time Linear Equalizer

To ensure the signal integrity over PVT variations, lossy optical fibers, PDs, and previous amplifier stages with limited bandwidth, high-speed equalizers have found extensive usage in modern broadband data communications. Different equalizer topologies have been proposed to compensate for the channel loss. The passive equalizer in [24] consumes no power, but it reduces the DC gain from 0 dB to negative values, introducing too much noise. It is acceptable for wire-line receivers since the input SNR is high enough to tolerate the gain loss. The decision feedback equalizer (DFE) in [25] remains a challenge for low-power OEIC designs. Especially for data rates above 25 Gb/s, the DFE is so power-hungry that it might not be suitable for next-generation high-speed optical receivers. In this work, the CTLE topology is utilized since it consumes significantly less power than the DFE and at the same time introduces much less noise than the passive equalizer. Admittedly, the CTLE amplifies crosstalk and high-frequency noise, but this drawback can be eliminated by adding an MA stage before the CTLE, as discussed in Chapter 2.

The conventional CTLE provides a wide bandwidth and the capability of boosting in high frequencies with the capacitive degeneration technique, as shown in Fig. 3.8 [26] [27].



Fig. 3.8. Conventional CTLE topology.

The transfer function is:

$$\frac{V_{out}(s)}{V_{in}(s)} = \frac{g_{m1}R_D}{1 + \frac{g_{m1}R_s}{2}} \frac{1 + \frac{s}{\omega_{z1}}}{(1 + \frac{s}{\omega_{p1}})(1 + \frac{s}{\omega_{p2}})} \tag{3.10}$$

where $\omega_{z1}=1/(R_sC_s)$, $\omega_{p1}=(1+g_{m1}R_s/2)/(R_sC_s)$, $\omega_{p2}=1/(R_DC_L)$, and $g_{m1}$ is the transconductance of $M_1$. As shown above, this topology can place $\omega_{p1}$ above $\omega_{z1}$ by a factor of $(1+g_{m1}R_s/2)$, but at the same time it reduces the DC gain by the same factor. Because of this limitation, this topology cannot achieve a high gain-bandwidth product (GBW) even with multiple cascaded stages [26].
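The trade-off described above can be checked numerically. The sketch below follows the text's statement that the DC gain is reduced by the factor $(1+g_{m1}R_s/2)$, which is the same factor that separates $\omega_{p1}$ from $\omega_{z1}$. All component values are hypothetical and chosen only for illustration; the function names are not from the thesis.

```python
import math


def ctle_parameters(g_m1, r_d, r_s, c_s, c_l):
    """Return DC gain and the zero/pole angular frequencies of the conventional CTLE."""
    degeneration = 1.0 + g_m1 * r_s / 2.0   # factor separating the pole from the zero
    dc_gain = g_m1 * r_d / degeneration     # DC gain is reduced by the same factor
    w_z1 = 1.0 / (r_s * c_s)
    w_p1 = degeneration / (r_s * c_s)
    w_p2 = 1.0 / (r_d * c_l)
    return dc_gain, w_z1, w_p1, w_p2


def ctle_gain(w, g_m1, r_d, r_s, c_s, c_l):
    """Complex gain of the conventional CTLE at angular frequency w."""
    dc_gain, w_z1, w_p1, w_p2 = ctle_parameters(g_m1, r_d, r_s, c_s, c_l)
    s = 1j * w
    return dc_gain * (1 + s / w_z1) / ((1 + s / w_p1) * (1 + s / w_p2))


# Hypothetical component values, for illustration only.
params = dict(g_m1=20e-3, r_d=100.0, r_s=200.0, c_s=100e-15, c_l=20e-15)
dc_gain, w_z1, w_p1, w_p2 = ctle_parameters(**params)
boost_db = 20 * math.log10(abs(ctle_gain(w_p1, **params)) / dc_gain)

print(f"DC gain: {20 * math.log10(dc_gain):.1f} dB")
print(f"zero at {w_z1 / (2 * math.pi) / 1e9:.1f} GHz, "
      f"first pole at {w_p1 / (2 * math.pi) / 1e9:.1f} GHz")
print(f"boost at the first pole relative to DC: {boost_db:.1f} dB")
```

Increasing `r_s` widens the zero-to-pole spacing and therefore the achievable boost, but it lowers the DC gain by the same amount, which is the limitation noted above.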

To solve this problem, the inductive peaking technique is adopted, as shown in Fig. 3.9. The transfer function is:

$$\frac{V_{out}(s)}{V_{in}(s)} = \frac{g_{m1}R_D}{1 + \frac{g_{m1}R_s}{2}} \frac{1 + \frac{s}{\omega_{z1}}}{(1 + \frac{s}{\omega_{p1}})} \frac{1 + \frac{s}{\omega_{z2}}}{(1 + \frac{2\xi s}{\omega_n} + \frac{s^2}{\omega_n^2})} \tag{3.11}$$

where $\omega_{z2}=2\xi\omega_n$, $\xi=(R_D/2)\sqrt{C_L/L_P}$, $\omega_n=1/\sqrt{L_PC_L}$, and $\omega_{z1}$ and $\omega_{p1}$ remain unchanged. The parallel inductive peaking introduces an extra zero $\omega_{z2}$, extending the frequency boosting by cancelling the first pole $\omega_{p1}$ [26].
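Substituting the definitions of $\xi$ and $\omega_n$ given above shows that the extra zero depends only on the load resistor and the peaking inductor. This is a one-line check of the stated definitions rather than an additional result:

$$\omega_{z2} = 2\xi\omega_n = 2 \cdot \frac{R_D}{2}\sqrt{\frac{C_L}{L_P}} \cdot \frac{1}{\sqrt{L_P C_L}} = \frac{R_D}{L_P}$$

Hence, for a given $R_D$, the value of $L_P$ determines where $\omega_{z2}$ falls relative to $\omega_{p1}$.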



Fig. 3.9. CTLE with inductive peaking.

With the same CTLE topology shown above, a modified CTLE is presented in Fig. 3.10(a). It features a 4-bit binary-weighted tunable source degeneration RC network to adjust the zero frequencies in the CTLE gain response. Moreover, in this work the three cascaded CTLE stages used in the 30-Gb/s OEIC utilize inductive shunt peaking at different frequencies (5, 12, and 20 GHz) to enable broadband compensation.

The simulated frequency response of the 3-stage CTLE is depicted in Fig. 3.10(b). With different digital control settings, it can accommodate different frequency responses of PDs and previous amplifier stages. When all 3 stages are disabled, it has a flat response with a DC gain of 10.5 dB; when all the stages are enabled, it achieves the maximum equalization of 18 dB.





Fig. 3.10. (a) Schematic of 1-stage CTLE, and (b) simulated frequency response of 3-stage CTLE.

## 3.6 Limiting Amplifier

The LA is indispensable for providing enough gain for the whole receiver. At the same time, the bandwidth of the LA should not degrade the system bandwidth while its power efficiency should be maximized [28]–[30]. Therefore, in high-speed applications, the difficulty in designing an LA lies in achieving high gain and wide bandwidth simultaneously at acceptable power consumption. Many design techniques have been proposed to achieve broadband amplifiers, including inverse scaling [31], series peaking [15], negative resistance and capacitance [18], and shunt peaking [32]. The inverse scaling technique has advantages when there is a large difference between the input and the output capacitance, but in the proposed 30-Gb/s OEIC, the LA's loading capacitance is almost the same as its input capacitance. The series peaking inductor is inserted between two circuit blocks, directly in the signal chain, which leads to longer interconnects. Although the negative resistance varies the output resistance, to which the gain is proportional, and the negative capacitance (NC) enhances the bandwidth, this technique increases the IR drop, which is not suitable under a low supply voltage.

In this work, the Cherry Hooper (CH) topology [33] is selected to implement the LA stage. The conventional CH amplifier is shown in Fig. 3.11(a). The corresponding simplified block diagram is shown in Fig. 3.11(b). The gain and bandwidth expressions are shown as follows,

$$Gain = g_{m1}R_f \propto I_{BIAS1} \tag{3.12}$$

$$\omega_{-3dB} = g_{m2} / C_B \propto I_{BIAS2} \tag{3.13}$$

which means that the gain and the bandwidth can be set independently of each other by using two tail currents. As these two equations show, increasing the tail currents improves both the gain and the bandwidth.



Fig. 3. 11. (a) Schematic of the conventional CH amplifier, and (b) simplified block diagram.



Fig. 3. 12. (a) Schematic of the modified CH amplifier, and (b) simplified block diagram.

In this work, the CH topology with a shunt peaking inductor, shown in Fig. 3.12(a), is selected to implement the LA stage. With the additional signal path ($L_2$, $R_3$, and $R_4$), more voltage headroom can be achieved. PVT variations affect the LA's robust high-speed operation. An automatic constant-gain bias circuit can counteract this effect by regulating the bias current [32], but that approach adds considerable noise and power. So in our design, the digital tuning scheme [34] depicted in Fig. 3.12(a) is used to compensate for the effects of PVT variations at no extra power cost.

The transfer function is shown as follows,

$$A(s) = \frac{V_{out}}{V_{in}} = \frac{A_0 \omega_n^2}{s^2 + 2\xi \omega_n s + \omega_n^2} \tag{3.14}$$

where

$$A_{0} = \frac{g_{m1}g_{m2}R_{1}R_{2}\left(1 + \frac{1}{g_{m2}R_{f}}\right)}{1 + \frac{R_{1} + R_{2}}{R_{f}} + g_{m2}R_{1}R_{2} / R_{f}} \tag{3.15}$$

$$\xi = \frac{1}{2} \frac{R_2 C_A + R_1 C_B + R_1 R_2 (C_A + C_B) / R_f}{\sqrt{R_1 C_A R_2 C_B (1 + g_{m2} R_1 R_2 / R_f)}} \tag{3.16}$$

$$\omega_n = \sqrt{\frac{1 + \frac{R_1 + R_2}{R_f} + g_{m2}R_1R_2 / R_f}{R_1C_AR_2C_B}} \tag{3.17}$$

When $\xi = \frac{\sqrt{2}}{2}$, the maximally flat frequency response is achieved and $f_{-3dB} = \omega_n/2\pi$. The gain-bandwidth product (GBW) is then calculated as,

$$GBW = A_0 f_{-3dB} = \frac{g_{m1} g_{m2}}{C_A C_B} \frac{1}{f_{-3dB}} \frac{1}{4\pi^2} \left( 1 + \frac{1}{g_{m2} R_f} \right) \tag{3.18}$$

Assuming $f_{T} = g_{m1}/2\pi C_A = g_{m2}/2\pi C_B$,

$$A_0 f_{-3dB} = \frac{f_T}{f_{-3dB}} \left( 1 + \frac{1}{g_{m2} R_f} \right) f_T \tag{3.19}$$

As equation (3.19) shows, the GBW is extended beyond the technology $f_T$ by a factor of $\frac{f_T}{f_{-3dB}} \left( 1 + \frac{1}{g_{m2}R_f} \right)$. The post-layout simulation shows that this modified CH topology achieves a DC gain of 10 dB and a BW of 28 GHz.
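
As a rough numerical illustration of (3.14) and (3.19), the following Python sketch evaluates the magnitude of the second-order response and the GBW extension factor. The function names and the values chosen for $f_T$, $g_{m2}$, and $R_f$ are illustrative assumptions, not design data. Only the 10-dB gain and the 28-GHz bandwidth come from the simulation result quoted above, and the natural frequency is simply set equal to that bandwidth.

```python
import math


def second_order_gain(f_hz, a0, fn_hz, xi):
    """Magnitude of A(s) = A0*wn^2 / (s^2 + 2*xi*wn*s + wn^2) at s = j*2*pi*f."""
    s = 1j * 2.0 * math.pi * f_hz
    wn = 2.0 * math.pi * fn_hz
    return abs(a0 * wn**2 / (s**2 + 2.0 * xi * wn * s + wn**2))


def gbw_extension_factor(f_t_hz, f_3db_hz, gm2, rf):
    """Factor by which the GBW exceeds f_T according to (3.19)."""
    return (f_t_hz / f_3db_hz) * (1.0 + 1.0 / (gm2 * rf))


A0 = 10 ** (10.0 / 20.0)   # 10-dB DC gain expressed as a voltage ratio
FN = 28e9                  # natural frequency, assumed equal to the 28-GHz BW
XI = math.sqrt(2.0) / 2.0  # maximally flat damping factor

# With xi = sqrt(2)/2 the gain at f = fn drops to A0/sqrt(2), so fn is the -3-dB point.
ratio_at_fn = second_order_gain(FN, A0, FN, XI) / A0

# Hypothetical values: f_T = 150 GHz, gm2 = 20 mS, Rf = 200 ohm.
factor = gbw_extension_factor(150e9, FN, 20e-3, 200.0)
print(f"|A(fn)|/A0 = {ratio_at_fn:.4f}, GBW extension factor = {factor:.2f}")
```

The first result confirms that the $-3$-dB frequency coincides with $\omega_n/2\pi$ for the maximally flat case, which is the relation used to obtain (3.18).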

## 3.7 DC Offset Cancellation Feedback Loop

To remove the accumulated offset voltages from the MA, CTLE, and LA stages, a DOC feedback loop is utilized, which consists of an RC low-pass filter and a conventional amplifier. In this work, the fully differential MA, CTLE, and LA stages provide a total DC gain of 41 dB, while the DOC loop is designed for only -30-dB offset cancellation capability as a result of system optimization.



Fig. 3. 13. DOC feedback loop.

The DOC feedback loop exhibits a high-pass characteristic. The low cut-off frequency  $f_c$  is determined by,

$$f_c = \frac{A_{MA} A_{CTLE} A_{LA} A_F + 1}{2\pi C_F R_F} \tag{3.20}$$

where $A_{MA}$, $A_{CTLE}$, $A_{LA}$, and $A_F$ are the DC gains of the MA, 3-stage CTLE, 2-stage LA, and feedback amplifier, respectively, and $R_F$ and $C_F$ are the on-chip feedback resistance and the off-chip feedback capacitance, respectively.
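
To give a feel for (3.20), the Python sketch below computes the low cut-off frequency from the forward gain and the feedback network. The 41-dB forward gain is the value quoted above, whereas $A_F$, $R_F$, and $C_F$ are hypothetical placeholders, since their actual values are not listed here.

```python
import math


def db_to_ratio(gain_db):
    """Convert a voltage gain in dB to a linear ratio."""
    return 10.0 ** (gain_db / 20.0)


def doc_low_cutoff_hz(forward_gain_db, a_f, r_f_ohm, c_f_farad):
    """Low cut-off frequency of the DOC loop per (3.20).

    forward_gain_db is the combined DC gain A_MA * A_CTLE * A_LA in dB.
    """
    loop_gain = db_to_ratio(forward_gain_db) * a_f
    return (loop_gain + 1.0) / (2.0 * math.pi * c_f_farad * r_f_ohm)


# A_F = 1, R_F = 1 Mohm and C_F = 100 nF are assumed values for illustration only.
fc = doc_low_cutoff_hz(forward_gain_db=41.0, a_f=1.0, r_f_ohm=1e6, c_f_farad=100e-9)
print(f"f_c = {fc:.1f} Hz")
```

The sketch makes the trade-off visible: the loop gain multiplies the RC corner frequency, so a high forward gain requires a large $R_F C_F$ product to keep $f_c$ low.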

The accumulated offset voltages can be minimized by careful analog layout techniques. The differential paths should be fully symmetrical so that in post-layout simulation, even without the DOC feedback loop, the accumulated offset voltage of the six cascaded amplifiers is below 1 mV. However, the DOC feedback cannot be omitted since PVT variations are unpredictable.

## 3.8 Output Driver



Fig. 3. 14. Schematic of the output driver.

The design of the output driver (OD) is critical as it must overcome the impedance loading due to the bond-wires and the 50-$\Omega$ PCB transmission line. As depicted in Fig. 3.14, the OD consists of two stages: the first stage is a differential amplifier, while the second stage is a 6-bit digitally-controlled source-degenerated amplifier. A 4-bit digital control is used to compensate for high-frequency losses in the PCB transmission line, while the remaining 2-bit control is used to provide 50-$\Omega$ output impedance matching.

The transfer function of the output driver is,

$$\frac{V_{out}(s)}{V_{in}(s)} = g_{m1}R_1 \frac{g_{m2}R_D}{1 + \frac{g_{m2}R_D}{2}} \frac{1 + \frac{s}{\omega_{z1}}}{(1 + \frac{s}{\omega_{p1}})} \frac{1 + \frac{s}{\omega_{z2}}}{(1 + \frac{2\xi s}{\omega_n} + \frac{s^2}{\omega_n^2})} \tag{3.21}$$

where $\omega_{z1}=1/(R_sC_s)$, $\omega_{p1}=(1+g_{m2}R_s/2)/(R_sC_s)$, $\omega_{z2}=2\xi\omega_n$, $\xi=(R_D/2)\sqrt{C_L/L_P}$, $\omega_n=1/\sqrt{L_PC_L}$, and $C_L$ is the loading capacitance at the output nodes. With the driving capability of the output driver, the losses of the PCB transmission line are compensated successfully [68].
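
The role of the source degeneration network can be illustrated with a short Python sketch that evaluates $\omega_{z1}$ and $\omega_{p1}$ as defined above. The element values are hypothetical and chosen only for illustration, and the boost figure is the asymptotic ratio $\omega_{p1}/\omega_{z1}$, which ignores the second-order term of (3.21).

```python
import math


def degeneration_zero_pole(gm2, r_s, c_s):
    """Return (f_z1, f_p1) in Hz for the RC source-degenerated stage.

    Uses w_z1 = 1/(Rs*Cs) and w_p1 = (1 + gm2*Rs/2)/(Rs*Cs) from the text.
    """
    w_z1 = 1.0 / (r_s * c_s)
    w_p1 = (1.0 + gm2 * r_s / 2.0) / (r_s * c_s)
    return w_z1 / (2.0 * math.pi), w_p1 / (2.0 * math.pi)


def max_boost_db(gm2, r_s):
    """Asymptotic high-frequency boost, 20*log10(w_p1 / w_z1)."""
    return 20.0 * math.log10(1.0 + gm2 * r_s / 2.0)


# Assumed values: gm2 = 20 mS, Rs = 100 ohm, Cs = 100 fF.
f_z1, f_p1 = degeneration_zero_pole(gm2=20e-3, r_s=100.0, c_s=100e-15)
boost = max_boost_db(gm2=20e-3, r_s=100.0)
print(f"f_z1 = {f_z1 / 1e9:.1f} GHz, f_p1 = {f_p1 / 1e9:.1f} GHz, boost = {boost:.2f} dB")
```

Changing $R_s$ or $C_s$ through the digital control moves the zero and thereby the frequency at which the high-frequency boost begins.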

## 3.9 Bias Circuits and 64-bit Shift Register

The bias currents and voltages for all building blocks are generated on chip with basic current mirrors. Simulation results show that under different corners, the bias circuits can supply the core circuits with the desired values. The bias circuits draw $500 \, \mu A$.

A 64-bit shift register (SR) is designed to digitally control the whole system. Specifically, 4 bits are for the TIA, 12 bits are for the 3-stage CTLE, 15 bits are for the MA and the 2-stage LA, and 6 bits are for the OD.
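
The bit allocation above can be sketched in Python as a simple packing routine. The field widths are taken from the text, but the field order, the bit ordering, and the handling of the bits that are not itemized here are assumptions made purely for illustration.

```python
# Field widths follow the text; the ORDER of the fields and the serialization
# direction are hypothetical and do not describe the actual chip.
FIELDS = [("tia", 4), ("ctle", 12), ("ma_la", 15), ("od", 6)]
REGISTER_BITS = 64


def pack_control_word(settings):
    """Pack per-block settings into one integer, first field starting at bit 0."""
    word = 0
    offset = 0
    for name, width in FIELDS:
        value = settings.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} setting {value} does not fit in {width} bits")
        word |= value << offset
        offset += width
    return word


def to_bitstream(word):
    """Serialize the word MSB-first as a list of bits to clock into the register."""
    return [(word >> i) & 1 for i in reversed(range(REGISTER_BITS))]


used_bits = sum(width for _, width in FIELDS)
word = pack_control_word({"tia": 0b1010, "ctle": 0b0011, "od": 0b110000})
bits = to_bitstream(word)
print(used_bits, len(bits), hex(word))
```

In such a scheme, the resulting bit list would be shifted into the register one bit per clock cycle by the external controller.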

# **Chapter 4**

# 30-Gb/s OEIC Measurement

## 4.1 Measurement Setup

The 4-layer 1.6-mm Rogers 4003–FR4 printed circuit board (PCB) shown in Fig. 4.1 is custom-designed with the PCB tool DesignSparkPCB, and fabricated in Shenzhen, China.



Fig. 4. 1. The full-view of the custom-designed chip-on-board test fixture.

The PCB is divided into 3 main parts. The core area includes the input and output 50-$\Omega$ PCB traces and 1-$\mu$F bypass capacitors for each pad. The on-board bias area provides the chip under test with the different bias voltage, current, and control signals. The on-board low-dropout regulator (LDO) area provides the whole chip-on-board (CoB) with stable supplies.

The electrical measurement setups are shown in Fig. 4.2. Fig. 4.2(a) depicts the S-parameter measurement setup using the 40-GHz R&S ZVB8 network analyzer. Fig. 4.2(b) shows the electrical data eye measurement setup. The pseudo-random binary sequence (PRBS) signal generated by the SDG Model 12070 30-Gb/s programmable pattern generator (PPG) is fed to the proposed optoelectronic integrated circuit (OEIC) through a 40-GHz radio-frequency (RF) cable. The output signal is then captured by the DCA-X 86100D oscilloscope for data eye measurement and by the SDA Model 13020 32-Gb/s programmable error detector for bit error rate (BER) measurement.



Fig. 4. 2. (a) Electrical S-Parameter, and (b) electrical data eye/BER measurement setup.

After the electrical measurement, the OEIC's optical characteristics are measured with the setup shown in Fig. 4.3. The PPG and the Photline optical transmitter together form the optical source, which feeds the proposed OEIC with the required PRBS signal. A 1.5-m lensed SMF with a core diameter of 9 µm is used to couple the input light precisely onto the off-chip or on-chip photodetector (PD).



Fig. 4. 3. Optical data eye/BER measurement setup for the OEIC.

## 4.2 Measurement Results

Fig. 4.4 shows the on-site photo of the typical OEIC measurement setup. The E4440A spectrum analyzer is used here to ensure the alignment of the SMF with the off-chip or on-chip PD. The 64-bit shift register is controlled by a Labjack and a computer. The Photline optical transmitter is controlled by a Modbox Bias Control (MBC) interface to adjust the power of the input light.



Fig. 4. 4. Measurement setup.

The 30-Gb/s OEIC CoB microphotograph is shown in Fig. 4.5. As depicted in Fig. 4.5(a), the input light travels through the 9-$\mu$m lensed SMF, and the output electrical signal is extracted through the Rogers 50-$\Omega$ PCB trace and the Southwest end-launch connector. Fig. 4.5(b) shows the zoomed-in photo. Both the 30-Gb/s and 14-Gb/s commercial off-chip PDs have a ground-signal-ground (GSG) pad configuration. To ensure the same ground voltage between the off-chip PD and the OEIC chip, and also to prevent noise from coupling into the signal path, two ground bond-wires are utilized to connect the off-chip PD and the OEIC chip. The core area is 0.26 mm², and it consumes 41 mW.






Fig. 4. 5. The chip-on-board microphotograph.

#### 4.2.1 Electrical Measurement Results

Fig. 4.6 shows the measured electrical frequency response of the 30-Gb/s OEIC with different CTLE settings, which is tested by direct probing on the CoB without an off-chip commercial PD. The measured adjustable gain is 67–83 dB$\Omega$ with a 24-GHz BW. The measured electrical 30-Gb/s data eye with the CTLE disabled exhibits an RMS jitter of 1.52 ps, as shown in Fig. 4.7(a). The data eye with the CTLE over-equalized is shown in Fig. 4.7(b). The measured jitter includes 1.3 ps contributed by the measurement equipment.



Fig. 4. 6. Measured frequency response with different CTLE settings.




Fig. 4. 7. Measured electrical data eye (PRBS-15) with (a) CTLE disabled, and (b) CTLE over-equalized.

#### 4.2.2 Optical Measurement Results with a 30-Gb/s Off-Chip PD

The optical performance is measured with different off-chip commercial PDs. First, to investigate the characteristics of the OEIC circuits, a commercial 30-Gb/s GaAs-based PIN vertically integrated systems (VIS) PD with a 0.4-A/W responsivity is utilized. With the CTLE disabled, the 27-Gb/s and 28-Gb/s optical data eyes are shown in Fig. 4.8(a) and (b). With the CTLE optimally tuned (4-bit tuning: 0011), as depicted in Fig. 4.8(c), the 30-Gb/s optical data eye achieves an RMS jitter of 2.58 ps and a sensitivity of -5.6 dBm for a $2^{15}$-1 PRBS at a BER of $10^{-12}$.

The measured BER bathtub curves and BER versus optical input power curves with different data rates are depicted in Fig. 4.9 and Fig. 4.10, respectively.








Fig. 4. 8. Measured PRBS-15 optical data eye with a 30-Gb/s off-chip PD: (a) 27 Gb/s, CTLE disabled, (b) 28 Gb/s, CTLE disabled, and (c) 30 Gb/s, CTLE enabled (0011).



Fig. 4. 9. Measured BER bathtub curves.



Fig. 4. 10. Measured BER versus optical input power.

#### 4.2.3 Optical Measurement Results with a 14-Gb/s Off-Chip PD

To demonstrate the effectiveness of the proposed cascaded equalization technique, a commercial 14-Gb/s GaAs-based PIN Global communication Semiconductor (GCS) PD with a 0.6-A/W responsivity is bonded to the proposed OEIC. As shown in Fig. 4.11, when the 3-stage CTLE is disabled, the data eye at 28 Gb/s is severely degraded, and the data eye at 30 Gb/s is completely closed. When the CTLE is enabled, the OEIC successfully compensates for the limited BW of the 14-Gb/s PD and operates at both 28 Gb/s and 30 Gb/s. For the 30-Gb/s $2^{15}$–1 PRBS data input, the corresponding data eye achieves an RMS jitter of 2.78 ps and a sensitivity of –5 dBm at a BER of $10^{-12}$.





Fig. 4. 11. Measured optical eye with a 14-Gb/s off-chip PD: (a) 28 Gb/s, and (b) 30 Gb/s.

Fig. 4.12 depicts the measured BER bathtub curves with the 14-Gb/s PD for different data rate inputs. Without the 3-stage CTLE enabled, the 28-Gb/s data eye cannot achieve BER$\leq$10<sup>-10</sup>, and the BER bathtub curve for the 30-Gb/s data eye is too poor to be depicted in the plot. When the 3-stage CTLE is enabled, both the 28-Gb/s and 30-Gb/s data eyes have much better eye openings with BER$\leq$10<sup>-12</sup>.

Fig. 4.13 shows the measured BER versus the input optical power. For the 30-Gb/s PRBS-15 data input, the input sensitivity achieving BER $\leq$ 10<sup>-12</sup> is -5 dBm, and for the 28-Gb/s PRBS-15 data input, the corresponding sensitivity is -5.4 dBm.



Fig. 4. 12. Measured BER bathtub curves.



Fig. 4. 13. Measured BER versus optical input power.

## 4.3 Conclusion

With the 30-Gb/s PD, the 30-Gb/s eye opening at BER≤10<sup>-10</sup> (1-UI) reaches 24%, while with the 14-Gb/s PD the eye opening degrades by only 6 percentage points, to 18%. The measured 30-Gb/s BER versus the optical input power also indicates only a 0.6-dB degradation in sensitivity. In summary, both the measured eye opening and the sensitivity demonstrate the effectiveness of the proposed OEIC design and the cascaded CTLE techniques.

Table 4.1 compares the performance of this work with recently published CMOS optical receivers. With an efficiency of 1.37 pJ/bit, this work achieves the highest transimpedance gain (83 dB $\Omega$ ) and the widest bandwidth (24 GHz) at the lowest power consumption (41 mW) among the CMOS optical receivers published to date.

Table 4. 1: Comparison to recently published CMOS optical receivers.

|                                             | [14]  | [15]  | [35]  | This Work (30-Gb/s PD) | This Work (14-Gb/s PD) |
|---------------------------------------------|-------|-------|-------|------------------------|------------------------|
| CMOS Technology                             | 65-nm | 65-nm | 90-nm | 65-nm                  | 65-nm                  |
| Supply Voltage (V)                          | 3.3/1 | 1     | 1.2   | 1 (1.2 for LDO)        | 1 (1.2 for LDO)        |
| Gain (dB-Ohm)                               | 76.8  | 72.5  | 78.3  | 83                     | 83                     |
| BW (GHz)                                    | 21.4  | 21    | 20    | 24                     | 24                     |
| Power (mW)                                  | 90.9  | 48.8  | 44.4  | 41                     | 41                     |
| Data Rate (Gb/s)                            | 25–28 | 25    | 25    | 30                     | 30                     |
| Sensitivity (dBm)                           | -9.7  | -6.8  | -4    | -5.6                   | -5.0                   |
| Eye Opening at BER<10 <sup>-10</sup> (1-UI) | 65%   | N/A   | 22%   | 24%                    | 18%                    |
| Efficiency (pJ/bit)                         | 3.25  | 1.95  | 1.78  | 1.37                   | 1.37                   |
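
The efficiency row of Table 4.1 follows directly from the power and data-rate rows, since 1 mW divided by 1 Gb/s equals 1 pJ/bit. The Python sketch below reproduces that arithmetic. For [14], the upper data rate of 28 Gb/s is assumed, as the table lists a range.

```python
def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Energy efficiency in pJ/bit: mW divided by Gb/s gives pJ/bit."""
    return power_mw / data_rate_gbps


# (power in mW, data rate in Gb/s) as listed in Table 4.1.
receivers = {
    "[14]": (90.9, 28.0),  # assumption: the upper end of the 25-28 Gb/s range
    "[15]": (48.8, 25.0),
    "[35]": (44.4, 25.0),
    "This work": (41.0, 30.0),
}

efficiency = {
    name: round(energy_per_bit_pj(power, rate), 2)
    for name, (power, rate) in receivers.items()
}
print(efficiency)
```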

# Chapter 5

# 18-Gb/s Fully Integrated OEIC Architecture

## 5.1 Introduction

The rising demand for short-range optical links using 850-nm wavelength has generated strong interest in complementary metal-oxide-semiconductor (CMOS) optoelectronic integrated circuits (OEICs) [36]–[41]. Compared to existing hybrid solutions employing off-chip GaAs-based avalanche photodetectors (PDs), single-chip OEICs with on-chip CMOS PDs can lower optical module assembly cost and eliminate the parasitic effects due to the bond-wire and ESD protection circuits at the receiver input. In the following three chapters, a new 18-Gb/s fully integrated OEIC with on-chip CMOS PD and adaptive equalizer is proposed, designed, fabricated, and measured.

## 5.2 Design Challenges

Although the on-chip CMOS PD has the advantages described above, it has severe drawbacks of its own: much lower responsivity and bandwidth than commercial PDs built with III–V materials. On the one hand, the on-chip PD responsivity, almost one tenth of that of its III–V counterpart, means that for the same input light the desired signal is much smaller, deteriorating the signal-to-noise ratio (SNR) at the receiver input. On the other hand, the limited on-chip PD bandwidth, about one fortieth of that of its counterpart, results in a severely degraded frequency response if no further equalization is adopted.

To solve the challenge of the low responsivity and the corresponding small SNR, first of all, a novel on-chip PD topology will be proposed. Moreover, the system input referred noise (IRN) will be further reduced compared to the 30-Gb/s OEIC, by utilizing an inductive cascode inverter-based transimpedance amplifier (TIA) with higher gain and smaller bandwidth.

To settle the challenge of the small on-chip PD bandwidth, first of all, the characteristics of its frequency response will be investigated with a standalone on-chip PD testing chip. After that, a novel cascaded equalization approach having the opposite frequency response is proposed to compensate for the limited on-chip PD bandwidth.

## 5.3 The Proposed Optical Receiver Architecture

Fig. 5.1 shows the proposed architecture of the 18-Gb/s fully integrated OEIC with an on-chip PD and an adaptive equalizer which is fabricated in Taiwan Semiconductor Manufacturing Company (TSMC) 1-V 65-nm CMOS technology. The whole receiver consists of the proposed CMOS on-chip P-well/Deep N-well (PW/DNW) PD, an inductive cascode inverter-based TIA, a DC offset cancellation (DOC) buffer, a main amplifier (MA), a 3-stage continuous-time linear equalizer (CTLE), a 2-stage modified Cherry-Hooper limiting amplifier (LA), a DOC feedback loop, an adaptive equalization loop (AEL), and an output buffer (OB).



Fig. 5. 1. Architecture of the proposed 18-Gb/s fully integrated OEIC receiver with on-chip PD and adaptive equalizer.

Since the system IRN analysis follows the same procedure as that used for the 30-Gb/s OEIC system presented in Chapter 2, it is omitted here to avoid repetition. The next chapter therefore focuses on the circuit blocks unique to the 18-Gb/s fully integrated OEIC.

# Chapter 6

# 18-Gb/s Fully Integrated OEIC Circuits Design

The 30-Gb/s optoelectronic integrated circuit (OEIC) and the 18-Gb/s OEIC share the same topologies for several circuit blocks, including the on-chip low-dropout (LDO) regulator, the DC offset cancellation (DOC) buffer, the main amplifier (MA), the 2-stage modified Cherry-Hooper limiting amplifier (LA), and the DOC feedback loop. Although their circuit parameters, such as DC gain, bandwidth (BW), corner frequency, and power consumption, have been changed to accommodate different specifications, this chapter focuses only on the circuit blocks unique to the 18-Gb/s OEIC.

## 6.1 On-chip PW/DNW PD

#### 6.1.1 Introduction

For short-range (<100 m) optical links over 5 Gb/s, 850 nm is a promising wavelength because vertical-cavity surface-emitting lasers (VCSELs) provide a cost-effective light source. Therefore, low-cost CMOS-based optical receivers with on-chip photodetectors (PDs) have been actively pursued in recent years to take advantage of silicon PN junctions' ability to detect 850-nm light. However, the responsivity and the operating bandwidth of CMOS PDs are limited by several factors, including the shallow junction depth (<1 µm) in standard CMOS, the deep absorption length in silicon (>15 µm at 850 nm), and the large carrier transit time in the substrate absorption layer [41], [42]. Typical CMOS PDs can only achieve a low responsivity of around 30 mA/W with bandwidth limited to about 100 MHz, whereas standard GaAs-based PDs can deliver hundreds of mA/W and up to a few tens of GHz [43]–[46]. Nevertheless, CMOS PDs remain attractive because of their compatibility with system-on-chip integration, which can avoid packaging parasitics and offer sophisticated receiver equalization to compensate for the PD's limited performance.
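
A back-of-the-envelope Python sketch helps to see why the shallow junction limits the responsivity. It applies the Beer-Lambert law to estimate the fraction of light absorbed within the junction depth and converts it into an ideal responsivity. The 1-µm depth and 15-µm absorption length are the bounds quoted above, used here as point values. The model ignores surface reflection and assumes that every carrier generated within that depth is collected, so it is only an order-of-magnitude estimate.

```python
import math

Q = 1.602176634e-19  # elementary charge (C)
H = 6.62607015e-34   # Planck constant (J*s)
C = 2.99792458e8     # speed of light (m/s)


def absorbed_fraction(depth_um, absorption_length_um):
    """Fraction of light absorbed within depth_um (Beer-Lambert law)."""
    return 1.0 - math.exp(-depth_um / absorption_length_um)


def ideal_responsivity(wavelength_m, quantum_efficiency):
    """Responsivity in A/W for a given quantum efficiency: eta*q*lambda/(h*c)."""
    return quantum_efficiency * Q * wavelength_m / (H * C)


# Point values taken from the bounds in the text (assumption: treated as exact).
eta = absorbed_fraction(depth_um=1.0, absorption_length_um=15.0)
resp = ideal_responsivity(850e-9, eta)
print(f"absorbed fraction = {eta:.3f}, responsivity = {resp * 1e3:.0f} mA/W")
```

Under these simplifying assumptions, the estimate lands in the tens of mA/W, the same order of magnitude as the typical CMOS PD responsivity mentioned above.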

To improve the responsivity and the bandwidth of CMOS PDs, two major approaches have been investigated – the spatially modulated PD (SMPD) [10], [43] and the avalanche PD (APD) [42], [45], [48], [49]. The SMPD employs a differential, symmetrical layout with half of the PD blocked from the light source to cancel out the slow diffusion current in the substrate absorption region. This technique is effective for all CMOS PD designs using the substrate as the P-side of the junction such as the P-sub/N+, P-sub/N-well, and P-sub/Deep N-well. The drawback of SMPDs is that half of the PD exposure area is wasted for a given fiber's input aperture. The other approach is to operate the PD in avalanche mode, known as APD, but this requires the system to provide a rather large bias voltage, typically in excess of 10 to 14 V, which increases system cost and generates reliability issues. Lee and Choi have reported good CMOS APD performance using a P+/N-well junction [49] and a P-well/N+ junction [51]. Since the P+/N-well or P-well/N+ junction is surrounded by shallow trench isolation (STI), the edge curvature effect is eliminated to achieve a higher breakdown field compared to a P-sub/N-well junction. Using the P+/N-well or P-well/N+ junction also avoids having to bias the P-sub at a large negative voltage which can cause various reliability issues.

In this work, a vertical P-well/Deep N-well (PW/DNW) PD is investigated exploiting the additional PW region available in advanced CMOS technologies without process modification. The lateral alternative has been studied in [48] and [49]. The measured dc responsivity, ac response, modeling formula and received data eye diagram will be presented. A performance comparison to published CMOS PD in standard mode (reverse bias voltage ( $V_{PD}$ )  $\leq$  voltage supply ( $V_{DD}$ )) is made to provide design guidelines for future work on CMOS optical receiver with on-chip PD.

#### 6.1.2 The Proposed Topology





Fig. 6. 1. (a) Top-down, and (b) cross-section views of the proposed PW/DNW PD.

Fig. 6.1 shows the top-down and cross-section views of the vertical PW/DNW PD. The PW region is completely covered with P+ to reduce the extrinsic series resistance. The silicide blocking layer, resist protection oxide (RPO), is used to keep the P+ region unsilicided, except under the metal contact area, which minimizes any reflection of the incident light by the silicide layer and thus avoids degrading the responsivity.

The proposed PW/DNW PD offers three key advantages over other CMOS PN junctions. First, it eliminates the use of the P-sub to reduce the slow substrate diffusion current [50]. Second, it has a deeper junction depth and lighter doping concentration than the P+/N-well (or PW/N+) so that with the same  $V_{PD}$ , more light is absorbed by the deeper and wider depletion region.

Finally, using the PW instead of the P-sub makes the PD biasing compatible with the optical receiver when operating in the avalanche mode. The PW region can be directly connected to the receiver's transimpedance amplifier (TIA) input (which is typically at half of $V_{DD}$, around 0.5 V for a 65-nm design) while a large positive voltage is applied to the DNW region to establish the required $V_{PD}$. This offers better reliability than APDs using the P-sub as the P-side, because in that case the P-sub must be the ac common ground, with the N-side of the PD connected to the TIA input, and a large negative voltage must be applied to the P-sub [43]. As illustrated in Fig. 6.1, in addition to the desired PW/DNW PD, there is a byproduct P-sub/DNW junction. By biasing and bypassing the DNW (at $V_{PD}$) and P-sub (at 0 V) to the ac common ground, the operation of the PW/DNW PD is not affected by the byproduct junction.

## 6.2 Inductive Cascode Inverter-based Transimpedance Amplifier

The transimpedance amplifier (TIA) is a critical building block of the optical receiver. Its sensitivity essentially depends on the input-node capacitance, the feedback resistor, and the transconductance of the input transistors of the amplifier. Inverter-based TIAs, which have been studied extensively in recent years [51], show better noise performance than the simple common-source (CS) TIA.

Given the same bias current, a large input transistor is required to maximize the TIA gain. However, it introduces a large gate-source capacitance directly at the input node. Moreover, through the Miller effect, the gate-drain capacitance also affects the TIA performance. These two capacitances become the bandwidth-limiting factor. Fig. 6.2 shows the schematic diagram of the proposed inductive cascode inverter-based TIA. With the cascode topology, a higher voltage gain can be achieved with a much smaller input transistor, which means a much smaller input gate-source capacitance. Furthermore, the Miller gate-drain capacitance of the input transistors is reduced significantly, boosting the bandwidth of the whole system.



Fig. 6. 2. Schematic of the inductive cascode inverter-based TIA.

## 6.3 Cascaded Continuous-Time Linear Equalizer

To save chip area, the first stage uses a negative capacitance compensation (NCC) circuit instead of an inductor, as illustrated in Fig. 6.3. The second and third stages employ an on-chip inductor for the shunt peaking load to trade off chip area for power, as shown in Fig. 3.9. Each CTLE stage also features a tunable RC source degeneration circuit to adjust the zeros and poles in the frequency response. The three stages are tuned simultaneously by a single control voltage ($V_{CTLE}$) from an adaptive control loop.



Fig. 6. 3. Schematic of the first stage CTLE.

Each CTLE stage has a roll-up slope of 20 dB/decade starting from its first zero until it reaches the high-frequency second pole. However, an on-chip CMOS PD normally has a slow roll-off frequency response with a slope of 5–10 dB/decade. To compensate for the slow roll-off in the PD responsivity and achieve an overall flat frequency response of the receiver, a three-stage CTLE is designed with different peaking frequencies at 2.5 GHz, 6 GHz, and 12 GHz, respectively, as shown in Fig. 6.4.



Fig. 6. 4. Cascaded 3-stage CTLE.

The transfer function is depicted as,

$$\begin{aligned}
\frac{V_{out}(s)}{V_{in}(s)} = {} & \frac{g_{m1,EQ1}R_{D,EQ1}}{1 + \frac{g_{m1,EQ1}R_{D,EQ1}}{2}} \frac{g_{m1,EQ2}R_{D,EQ2}}{1 + \frac{g_{m1,EQ2}R_{D,EQ2}}{2}} \frac{g_{m1,EQ3}R_{D,EQ3}}{1 + \frac{g_{m1,EQ3}R_{D,EQ3}}{2}} \\
& \times \frac{1 + \frac{s}{\omega_{z1,EQ1}}}{(1 + \frac{s}{\omega_{p1,EQ1}})(1 + \frac{s}{\omega_{p2,EQ1}})} \frac{1 + \frac{s}{\omega_{z1,EQ2}}}{(1 + \frac{s}{\omega_{p1,EQ2}})(1 + \frac{s}{\omega_{p2,EQ2}})} \frac{1 + \frac{s}{\omega_{z1,EQ3}}}{(1 + \frac{s}{\omega_{p1,EQ3}})(1 + \frac{s}{\omega_{p2,EQ3}})}
\end{aligned} \tag{6.1}$$

By interpolating poles and zeros, a slow roll-up frequency response of 5–10 dB/decade has been achieved to compensate for the signal loss due to the on-chip CMOS PD.
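
The effect of interleaving the zeros and poles can be illustrated with a short Python sketch of the frequency-dependent part of (6.1). The zero and pole frequencies below are hypothetical values chosen only to show the principle; they are not the values used in this design, and the DC gain terms are normalized to 0 dB.

```python
import math


def stage_gain_db(f_hz, f_z, f_p1, f_p2):
    """Magnitude in dB of (1 + s/wz) / ((1 + s/wp1)(1 + s/wp2)) at s = j*2*pi*f."""
    num = abs(1 + 1j * f_hz / f_z)
    den = abs(1 + 1j * f_hz / f_p1) * abs(1 + 1j * f_hz / f_p2)
    return 20.0 * math.log10(num / den)


def cascade_gain_db(f_hz, stages):
    """Sum of the per-stage gains in dB."""
    return sum(stage_gain_db(f_hz, *stage) for stage in stages)


def average_slope_db_per_decade(f_lo, f_hi, stages):
    """Average roll-up slope of the cascade between f_lo and f_hi."""
    rise = cascade_gain_db(f_hi, stages) - cascade_gain_db(f_lo, stages)
    return rise / math.log10(f_hi / f_lo)


# Hypothetical (zero, first pole, second pole) frequencies in Hz for the three stages.
STAGES = [
    (0.5e9, 1e9, 40e9),
    (2e9, 4e9, 40e9),
    (8e9, 16e9, 40e9),
]

slope = average_slope_db_per_decade(0.5e9, 16e9, STAGES)
print(f"average roll-up slope = {slope:.1f} dB/decade")
```

Because each stage contributes its 20-dB/decade rise over only part of the band, the staggered cascade produces an average slope well below 20 dB/decade, which is the behavior needed to match the slow roll-off of the PD.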

## 6.4 Adaptive Equalization Loop

#### 6.4.1 Introduction

To accommodate and overcome different input data rates, channel loss, and process, voltage and temperature (PVT) variations, an adaptive equalization loop (AEL) has been utilized in previous publications. D. Lee proposed an AEL based on a slope-detection algorithm which compared the difference of the slopes between the limiting amplifier input and the output [10]. J. Lee proposed an AEL with a power detector that consisted of a low-pass filter and a high-pass filter, so that it compared the low and high frequency power of the limiting amplifier output [26]. In this work, the high-pass filter is removed to further reduce the power consumption.

Fig. 6.5(a) shows that the power spectrum of a random data bit stream can be described by a $sinc^2(f)$ function [52]. The ratio of the power density at any two frequencies is therefore known. To generate the desired CTLE control voltage $V_{CTLE}$, a differential power detector (DPD) is utilized to compare the signal power of the all-pass path with that of the low-pass path, as depicted in Fig. 6.5(b). A variable-gain low-pass filter (LPF) is utilized to filter out the high-frequency energy and, at the same time, amplify the low-frequency energy. Since the full-spectrum power $P_{Total}$ is $f_0/f_1$ times the low-pass frequency power $P_I$ ($f_0$ is the fundamental frequency of the signal, and $f_1$ is the low-pass corner frequency), the power of the low-pass path must be amplified by $f_0/f_1$ so that ideally it equals the power of the all-pass path [24].
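
The power ratio between the two paths can be checked numerically with the Python sketch below, which integrates the $sinc^2$ spectrum of ideal random NRZ data up to the low-pass corner. Several assumptions are made for illustration: the filter is treated as an ideal brick-wall filter, $f_0$ is taken as half the bit rate, and the 180-MHz corner frequency is a hypothetical value.

```python
import math


def nrz_psd(f_hz, bit_rate_hz):
    """Normalized PSD of random NRZ data, Tb * sinc^2(f*Tb), with unit total power."""
    tb = 1.0 / bit_rate_hz
    x = math.pi * f_hz * tb
    sinc = 1.0 if x == 0.0 else math.sin(x) / x
    return tb * sinc * sinc


def low_pass_power_fraction(f1_hz, bit_rate_hz, steps=2000):
    """Fraction of the total power below f1, integrated with the trapezoid rule."""
    df = f1_hz / steps
    area = 0.0
    for k in range(steps):
        left = nrz_psd(k * df, bit_rate_hz)
        right = nrz_psd((k + 1) * df, bit_rate_hz)
        area += 0.5 * (left + right) * df
    return 2.0 * area  # the factor 2 accounts for negative frequencies


BIT_RATE = 18e9      # 18-Gb/s data
F1 = 180e6           # hypothetical low-pass corner frequency
F0 = BIT_RATE / 2.0  # assumption: f0 is taken as half the bit rate

power_ratio = 1.0 / low_pass_power_fraction(F1, BIT_RATE)
print(f"P_total/P_low = {power_ratio:.2f}, f0/f1 = {F0 / F1:.2f}")
```

For a corner frequency far below the bit rate, the spectrum is nearly flat within the pass band, which is why the computed power ratio approaches $f_0/f_1$ under these assumptions.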



Fig. 6. 5. (a) Power spectral density of the random data bit stream, and (b) block diagram of AEL [24].

#### 6.4.2 Variable-Gain Low Pass Filter

Fig. 6.6 shows the variable-gain LPF used in the AEL. The diode-connected PMOS loads $M_{8-9}$ together with the cross-coupled pair $M_{6-7}$ are adopted to fix the common-mode voltage of the output nodes, eliminating the need for a common-mode feedback (CMFB) circuit. The NMOS transistor $M_5$ in the linear region is used as a variable source degeneration resistor to change the LPF gain ($A_{LPF}$) and bandwidth ($f_{-3dB\_LPF}$). The required $A_{LPF}$ and $f_{-3dB\_LPF}$ for different data rates are plotted in Fig. 6.7. The simulated $A_{LPF}$ and $f_{-3dB\_LPF}$ are also plotted as $V_{LPF}$ changes from 0.4 to 1 V with a 0.05-V step. The crossing points yield the LPF's gain-bandwidth for different data rates.



Fig. 6. 6. Schematic of the variable-gain LPF.



Fig. 6. 7. Simulated gain and bandwidth for variable-gain LPF.

Table 6. 1: Gain-bandwidth- $V_{LPF}$  table for different data rates.

| Data Rate (Gb/s) | LPF Gain (dB) | LPF Bandwidth | $V_{LPF}$ (V) |
|------------------|---------------|---------------|---------------|
| 5                | 12.9          | 81.6          | 0.43          |
| 8                | 14.2          | 96.8          | 0.46          |
| 12               | 15.1          | 118.0         | 0.52          |
| 18               | 16.0          | 143.9         | 0.56          |

#### 6.4.3 Differential Power Detector

Fig. 6.8 shows the schematic of the DPD used in the AEL. By connecting together the drains and the sources of each differential pair ($M_{1a}+M_{1b}$ and $M_{2a}+M_{2b}$), with fully differential inputs, the odd harmonics at the output are ideally eliminated. Thus, a super differential pair is formed, and its output current is proportional to the difference between the squares of the two input voltages:

$$I_{out} = I_{out2} - I_{out1} = K \times (V_{in1\_dm}^2 - V_{in2\_dm}^2) \tag{6.2}$$



Fig. 6. 8. Schematic of the DPD.

In this work, K is designed to be  $4.5 \text{ mA/V}^2$ .
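
A minimal Python sketch of (6.2) is given below. The value of $K$ is the design value stated above, while the differential input amplitudes are hypothetical and serve only to show how the output current responds to a power imbalance between the two paths.

```python
K_DPD = 4.5e-3  # A/V^2, the design value quoted in the text


def dpd_output_current(v_in1_dm, v_in2_dm, k=K_DPD):
    """Differential output current of the DPD per (6.2), in amperes."""
    return k * (v_in1_dm**2 - v_in2_dm**2)


# Hypothetical differential amplitudes in volts on the two inputs.
i_balanced = dpd_output_current(0.2, 0.2)    # equal powers on both paths
i_unbalanced = dpd_output_current(0.2, 0.1)  # more power on input 1
print(i_balanced, i_unbalanced)
```

When the two paths carry equal power, the output current is zero, which is the condition the adaptive loop drives toward.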

# Chapter 7

# 18-Gb/s Fully Integrated OEIC Measurement

## 7.1 Measurement Setup

Both the electrical and optical measurement setups of the 18-Gb/s fully integrated OEIC are the same as those for the 30-Gb/s OEIC, as depicted in Fig. 4.2, 4.3, and 4.4. Moreover, since the on-chip P-well/Deep N-well (PW/DNW) photodetector (PD) is proposed in the fully integrated optoelectronic integrated circuit (OEIC), additional setups are utilized for the measurement of the standalone on-chip PD testing chip, shown in section 7.2.

## 7.2 Measurement of the Proposed PW/DNW PD

Since the on-chip PW/DNW PD is the bottleneck of the 18-Gb/s OEIC system, a separate PD test structure is fabricated to investigate its performance, as shown in Fig. 7.1. The octagonal PD layout is customized for light exposure by directly shining a commercial single-mode fiber (SMF) or multi-mode fiber (MMF) on the sample. In this measurement, a lensed SMF with a core diameter of 9 µm is used at a working distance of a few µm to ensure that the input light falls well within the sensing area of the 50-µm PD. A dummy PD is added to ensure that the capacitive loadings at the pseudo-differential TIA inputs are balanced. The PD output current is collected through an RF signal-ground-signal (SGS) probe for both the responsivity and S-parameter measurements. The S-parameters are measured with the R&S ZVB8 network analyzer to extract the PD's impedance for developing a simulation model [53].

Fig. 7. 1. (a) Microphotograph of the proposed PW/DNW PD, and (b) measurement setup.

Fig. 7.2 shows the measured illumination and dark currents as a function of the PD's reverse bias voltage ( $V_{PD}$ ). The illumination current is generated by incident photons with energy larger than the bandgap energy of silicon. At low  $V_{PD}$ , the measured output illumination current is 16  $\mu$ A for a -5-dBm input power (Pin) whereas the measured background dark current is about 100 to 200 pA. Both currents have a weak dependence on  $V_{PD}$  until the PD begins to enter avalanche mode at about 11 V. In avalanche mode, both currents increase rapidly due to impact ionization and carrier multiplication under the high electric field in the depletion region [21].





Fig. 7. 2. (a) Measured photocurrent with and without illumination light, and (b) measured bias dependency of the responsivity. The inset is the ratio of illumination to dark current vs.  $V_{PD}$ .

As the depletion width widens with further increase of  $V_{PD}$ , the thermally-induced dark current increases more rapidly than the optically-excited illumination current. Eventually, the noisy dark current dominates the PD output current. Fig. 7.2(b) shows the measured responsivity. The definition of responsivity is shown as:

$$Responsivity = \frac{Illumination\ current - Dark\ current}{Input\ light\ power} \tag{7.1}$$

In the standard mode, where  $V_{PD}$  is 0.5 V in this work (65-nm standard CMOS process), a responsivity of 51 mA/W is obtained. The peak responsivity is 1.03 A/W at the 12.8-V breakdown voltage. When the PD is used in the avalanche mode, it is essential to bias  $V_{PD}$  below the breakdown voltage in order to meet the signal-to-noise ratio required for a given bit error rate. In this work, the optimal  $V_{PD}$  for the avalanche mode is 12.3 V, at which a responsivity of 272 mA/W is achieved.
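As a quick check of Eq. (7.1), the following minimal sketch recomputes the low-bias responsivity from the quoted 16-µA illumination current and -5-dBm input power. The 150-pA dark current is taken as the midpoint of the quoted 100 to 200 pA range, and the 850-nm wavelength used for the ideal (unity quantum efficiency, no gain) reference is an assumption borrowed from the link wavelength named in Section 7.4. The function names are illustrative only.

```python
def dbm_to_watts(p_dbm: float) -> float:
    """Convert an optical power in dBm to watts."""
    return 1e-3 * 10 ** (p_dbm / 10)


def responsivity(i_illum_a: float, i_dark_a: float, p_in_w: float) -> float:
    """Eq. (7.1): net photocurrent divided by input light power, in A/W."""
    return (i_illum_a - i_dark_a) / p_in_w


def ideal_responsivity(wavelength_m: float) -> float:
    """Responsivity of a detector with unity quantum efficiency and no gain."""
    q = 1.602176634e-19   # electron charge (C)
    h = 6.62607015e-34    # Planck constant (J*s)
    c = 299792458.0       # speed of light (m/s)
    return q * wavelength_m / (h * c)


p_in = dbm_to_watts(-5.0)                   # quoted input power
r_std = responsivity(16e-6, 150e-12, p_in)  # dark current: assumed midpoint
r_ideal = ideal_responsivity(850e-9)        # wavelength: assumption

print(f"Responsivity at low bias: {r_std * 1e3:.1f} mA/W")
print(f"Ideal responsivity at 850 nm: {r_ideal * 1e3:.0f} mA/W")
print(f"Implied quantum efficiency: {100 * r_std / r_ideal:.1f} %")
```

Under these assumptions, the dark current is negligible in the numerator, and any responsivity above the ideal reference (such as the peak value near breakdown) can only be explained by avalanche gain.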

Next, the procedure to obtain the PD model under the standard mode is discussed. Since the PD model under the avalanche mode can be obtained with the same procedure, it is not repeated here. Fig. 7.3 shows the measured electrical reflection coefficients from 20 MHz to 10 GHz at  $0.5\text{-V}\ V_{PD}$  on a Smith chart. The measured PD capacitance is around 480 fF. The real part of the measured S-parameter decreases slowly as the operating frequency increases. An equivalent electrical extrinsic sub-model (consisting of a variable resistor and a capacitor) is proposed to fit the measured S-parameters, as shown in Fig. 7.3. One goal of this research is to characterize both the electrical extrinsic and optical intrinsic parameters of the proposed PD [55] so that a PD model can be built. With this model, the PD and the subsequent receiver circuits can be co-simulated seamlessly.
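The extraction step can be sketched as follows. This is an illustration rather than the procedure used in the thesis: it assumes a one-port measurement referenced to 50 Ω and a series R-C topology for the extrinsic sub-model (the text names a variable resistor and a capacitor but does not state how they are connected). The 20-Ω resistance in the round-trip check is a hypothetical value.

```python
import math


def gamma_to_impedance(gamma: complex, z0: float = 50.0) -> complex:
    """Convert a one-port reflection coefficient to an impedance."""
    return z0 * (1 + gamma) / (1 - gamma)


def impedance_to_gamma(z: complex, z0: float = 50.0) -> complex:
    """Convert an impedance to a one-port reflection coefficient."""
    return (z - z0) / (z + z0)


def series_rc_from_gamma(gamma: complex, freq_hz: float, z0: float = 50.0):
    """Fit a series R-C (assumed topology) to one reflection measurement."""
    z = gamma_to_impedance(gamma, z0)
    r = z.real
    c = -1.0 / (2 * math.pi * freq_hz * z.imag)  # capacitive: Im(z) < 0
    return r, c


# Round-trip check with a hypothetical 20-ohm resistor and the 480-fF capacitance
freq = 1e9
z_model = complex(20.0, -1.0 / (2 * math.pi * freq * 480e-15))
r_fit, c_fit = series_rc_from_gamma(impedance_to_gamma(z_model), freq)
print(f"R = {r_fit:.1f} ohm, C = {c_fit * 1e15:.0f} fF")
```

Repeating the fit at each measured frequency point would expose the slow frequency dependence of the real part that motivates the variable resistor.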

To investigate the PD's optical intrinsic characteristic and build the corresponding intrinsic submodel, the optical frequency response is measured. Fig. 7.4 shows the optical frequency response measured at 0.5-V  $V_{PD}$ . The plots are normalized to the low-frequency responsivity of 51 mA/W for  $V_{PD}$  at 0.5 V. The measured -3-dB bandwidth is 500 MHz. A MATLAB program is utilized to form a polynomial as a fitting curve for the measured data, as shown in Fig. 7.4.



Fig. 7. 3. Measured reflection coefficients of the PW/DNW PD at 0.5-V  $V_{PD}$ .



Fig. 7. 4. Measured vs. fitted optical frequency response of the PD.

Next, as shown in Fig. 7.5, the optical intrinsic sub-model  $I_{PD,intrinsic}(s)$  is obtained by deducting the extrinsic sub-model and equipment's input impedance  $Z_{In,\ equip}$ . Again, with the fitting MATLAB program, the intrinsic sub-model is as follows:



Fig. 7. 5. PD model with both intrinsic and extrinsic sub-models.

$$I_{PD,intrinsic}(s) = \frac{a_0 + a_1 s + a_2 s^2}{b_0 + b_1 s + b_2 s^2} \tag{7.2}$$

Table 7. 1: Parameters for the PD's intrinsic model under the standard mode.

| $a_0$     | $a_1$          | $a_2$                 | $b_0$     | $b_1$                 | $b_2$          |
|-----------|----------------|-----------------------|-----------|-----------------------|----------------|
| 2.4×10^17 | 3.7×10^9       | 0.17                  | 2.4×10^17 | 3.8×10^9              | 1              |
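To see what the fitted intrinsic sub-model implies, the sketch below evaluates Eq. (7.2) with the Table 7.1 coefficients and searches for its -3-dB frequency. It assumes that s = j2πf is in rad/s (the source does not state the units of the coefficients) and that the magnitude crosses -3 dB only once inside the search bracket. The result describes the intrinsic sub-model alone, not the full PD with its extrinsic network.

```python
import math

# Coefficients from Table 7.1 (standard mode)
A = (2.4e17, 3.7e9, 0.17)  # a0, a1, a2
B = (2.4e17, 3.8e9, 1.0)   # b0, b1, b2


def h_intrinsic(freq_hz: float) -> complex:
    """Evaluate Eq. (7.2) at s = j*2*pi*f (units of s are an assumption)."""
    s = 2j * math.pi * freq_hz
    num = A[0] + A[1] * s + A[2] * s * s
    den = B[0] + B[1] * s + B[2] * s * s
    return num / den


def gain_db(freq_hz: float) -> float:
    """Magnitude of the intrinsic response in dB."""
    return 20 * math.log10(abs(h_intrinsic(freq_hz)))


def find_3db_frequency(f_lo: float = 1e6, f_hi: float = 1e11, iterations: int = 80) -> float:
    """Bisect on a log-frequency axis for the -3-dB point relative to DC."""
    target_db = 20 * math.log10(abs(h_intrinsic(0.0)) / math.sqrt(2))
    lo, hi = math.log10(f_lo), math.log10(f_hi)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gain_db(10 ** mid) > target_db:
            lo = mid   # still above -3 dB: move the lower bound up
        else:
            hi = mid
    return 10 ** (0.5 * (lo + hi))


f_3db = find_3db_frequency()
print(f"Intrinsic -3-dB frequency: {f_3db / 1e6:.0f} MHz")
print(f"High-frequency floor: {gain_db(1e13):.1f} dB")
```

Because $a_0 = b_0$, the model is normalized to unity at DC, which matches the normalization of the plotted response; the ratio $a_2/b_2$ sets the high-frequency floor.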

With both the electrical extrinsic and the optical intrinsic sub-models obtained above, the co-simulation of the on-chip PD and the subsequent circuits can be performed in the Cadence Spectre simulator. Table 7.2 summarizes the PD performance and compares it with recently published results.

Table 7. 2: Comparison with recently published CMOS PDs for  $V_{PD} \le$  Supply voltage.

|                                              | [10]                           | [38]                           | [56]                             | [57]         | This Work                    |
|----------------------------------------------|--------------------------------|--------------------------------|----------------------------------|--------------|------------------------------|
| CMOS Tech Node                               | 130-nm                         | 180-nm                         | 180-nm                           | 65-nm        | 65-nm                        |
| Junction                                     | P+/N-well                      | P-sub/N-well                   | P-sub/N-well (SMPD)              | P-sub/N-well | P-well/DN-well               |
| Supply Voltage (V)                           | 1.5                            | 1.8                            | 3.3                              | 1.0          | 1.0                          |
| $V_{PD}$ (V)                                 | ~0.5                           | <1.8                           | 2.3                              | 0.3          | 0.5                          |
| Responsivity (mA/W)                          | 50                             | n/a                            | 20                               | n/a *        | 51                           |
| Measured BW (MHz)                            | 348                            | n/a                            | 1100                             | 75 - 150     | 500                          |
| Maximum Data Rate with OEIC and Equalization | 8.5 Gb/s, BER 10<sup>-12</sup> | 3 Gb/s, BER 10<sup>-11</sup>   | 2 Gb/s, BER 10<sup>-12</sup> #   | 3.125 Gb/s   | 9 Gb/s, BER 10<sup>-12</sup> |
| $C_{PD}$ (fF)                                | 3,000                          | 1,600                          | 416                              | 14,000       | 480                          |
| Area (µm²)                                   | 4900                           | 2500                           | 10000                            | 62500        | 2500                         |

<sup>\*</sup> No measured responsivity value. #No equalization circuits in the receiver.

## 7.3 Measurement Results of the 18-Gb/s Fully Integrated OEIC




(a)



(b)

Fig. 7. 6. CoB testing fixture for (a) electrical measurement, and (b) optical measurement.

The 18-Gb/s OEIC microphotograph is shown in Fig. 7.6. To investigate the OEIC's circuit characteristics, a standalone OEIC chip without the on-chip PD is fabricated for electrical S-parameter measurement, as shown in Fig. 7.6(a). Fig. 7.6(b) shows the corresponding chip-on-board (CoB) testing fixture with the on-chip PD for optical measurement. The core area is 0.26 mm<sup>2</sup>, and the chip consumes 48 mW.

#### 7.3.1 Electrical Measurement Results



Fig. 7. 7. Measured electrical frequency response with different CTLE settings.

Fig. 7.7 demonstrates the measured electrical frequency response of the 18-Gb/s OEIC with different CTLE settings, which is tested by directly probing on the CoB. The system achieves a transimpedance gain of 102 dB $\Omega$  with a BW of 12.5 GHz. When the CTLE is enabled, it provides 33-dB adjustable gain with a slow frequency roll-up to compensate for the gradual roll-off of the on-chip PD responsivity.
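For readers less familiar with the dBΩ unit, the short sketch below converts the quoted 102-dBΩ transimpedance gain to ohms using the usual 20·log10 definition. The 16-µA input current in the usage line is only an illustrative value (the low-bias photocurrent quoted in Section 7.2), and the helper names are hypothetical.

```python
def dbohm_to_ohm(gain_dbohm: float) -> float:
    """Convert a transimpedance gain from dB-ohm (20*log10 of ohms) to ohms."""
    return 10 ** (gain_dbohm / 20)


def output_swing_v(i_in_a: float, gain_dbohm: float) -> float:
    """Small-signal output voltage for a given input current (linear region only)."""
    return i_in_a * dbohm_to_ohm(gain_dbohm)


z_t = dbohm_to_ohm(102.0)
print(f"102 dBOhm = {z_t / 1e3:.1f} kOhm")
print(f"Ideal linear output for 16 uA: {output_swing_v(16e-6, 102.0):.2f} V")
```

The second number ignores any limiting in the amplifier chain, so it should be read as a small-signal figure rather than an actual output swing.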

#### 7.3.2 Optical Measurement Results



Fig. 7. 8. Measured optical PRBS-15 data eyes at the maximum DR in standard and avalanche mode with the CTLE enabled and disabled.

To demonstrate the effectiveness of the 3-stage CTLE and AEL, Fig. 7.8 shows the data eyes measured at the maximum data rate achieved with the CTLE disabled and enabled. As shown in the plot, the CTLE and AEL improve the maximum data rate from 3 to 9 Gb/s and 7 to 18 Gb/s under 0.5-V standard and 12.3-V avalanche mode, respectively.

The measured optical BER bathtub curves with different PD operation modes are depicted in Fig. 7.9. The measured optical BERs versus optical input power with different PD operation modes are shown in Fig. 7.10.



Fig. 7. 9. Measured BER bathtub curves with PD under different modes: (a) 0.5-V standard mode, and (b) 12.3-V avalanche mode.



Fig. 7. 10. Measured BER versus optical input power.

## 7.4 Conclusion

In summary, the 18-Gb/s OEIC measurement results demonstrate a fully integrated solution for short-range 850-nm optical communications. Under the standard mode (0.5-V  $V_{PD}$ ), a record data rate of 9 Gb/s for  $2^{15}$ -1 PRBS with  $10^{-12}$  BER, -4.2-dBm optical input sensitivity, and 5.33-pJ/bit efficiency is presented. Under the avalanche mode (12.3-V  $V_{PD}$ ), a record data rate of 18 Gb/s for  $2^{15}$ -1 PRBS with  $10^{-12}$  BER, -4.9-dBm optical input sensitivity, and 2.7-pJ/bit efficiency is achieved.
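The efficiency figures follow directly from the 48-mW power consumption and the two maximum data rates, as the minimal sketch below shows. The function name is illustrative.

```python
def energy_per_bit_pj(power_mw: float, data_rate_gbps: float) -> float:
    """Energy efficiency in pJ/bit; mW divided by Gb/s gives pJ/bit directly."""
    return power_mw / data_rate_gbps


for mode, rate_gbps in (("standard", 9.0), ("avalanche", 18.0)):
    print(f"{mode}: {energy_per_bit_pj(48.0, rate_gbps):.2f} pJ/bit")
```

Since the same chip power is used in both modes, doubling the data rate in the avalanche mode halves the energy per bit.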

Table 7.3 compares the measured performance to other 850-nm optical receivers with integrated PDs. This work achieves the fastest data rate, best efficiency, and highest FoM [6].

Table 7. 3: Comparison to published CMOS 850-nm optical receivers.

|                | [6]               | [10]              | [11]              | [37]              | This Work         | This Work         |
|----------------|-------------------|-------------------|-------------------|-------------------|-------------------|-------------------|
| Technology     | 130-nm CMOS       | 130-nm CMOS       | 250-nm BiCMOS     | 180-nm CMOS       | 65-nm CMOS        | 65-nm CMOS        |
| PD Op. Mode    | Std.              | Std.              | Avalan.           | Avalan.           | Std.              | Avalan.           |
| PD Bias (V)    | 1.2               | 0.5               | 12                | 14.2              | 0.5               | 12.3              |
| PD Area (µm²)  | 3600              | 4900              | 100               | 2505              | 2071              | 2071              |
| Resp. (mA/W)   | 5                 | 50                | 70                | 29                | 51                | 272               |
| PD BW (GHz)    | 0.5               | 0.348             | 5                 | 6.9               | 0.5               | 1.1               |
| Supply (V)     | 1.2               | <1.5              | 2.5               | 1.8               | 1.0               | 1.0               |
| Gain (dB-Ohm)  | 105               | 120               | 68.4#             | 88                | 102               | 102               |
| Power (mW)     | 74                | 47                | 59                | 118*              | 48                | 48                |
| Max. DR (Gb/s) | 4.5               | 8.5               | 12.5              | 10                | 9                 | 18                |
| BER            | 10<sup>-12</sup>  | 10<sup>-12</sup>  | 10<sup>-12</sup>  | 10<sup>-11</sup>  | 10<sup>-12</sup>  | 10<sup>-12</sup>  |
| Sens. (dBm)    | -3.4              | -3.2              | -7                | -6                | -4.2              | -4.9              |
| Eff. (pJ/bit)  | 16.44             | 5.53              | 4.72              | 11.8              | 5.33              | 2.7               |
| FoM^           | 55                | 242               | 2                 | 42                | 471               | 1089              |

#Simulation result. ^FoM: Eq. 21 in [6].

$$FOM = \frac{bit \ rate \left[\frac{Gbit}{s}\right] \cdot |log(BER)| \cdot |sensitivity[dBm]| \cdot gain[dBOhm] \cdot PD \ diameter^2[\mu m^2]}{power[mW] \cdot technology^2[nm^2]} \tag{7.3}$$
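Eq. (7.3) can be evaluated directly from the rows of Table 7.3, as in the sketch below. Two assumptions are made here and are not stated in the source: the logarithm is base 10, and the "PD diameter squared" term is taken to be the "PD Area" row of the table. With these choices the tabulated FoM values for [6], [10], and [11] are reproduced to within rounding, and the two entries for this work are matched to within a few percent.

```python
import math


def fom(bit_rate_gbps, ber, sens_dbm, gain_dbohm, pd_area_um2, power_mw, tech_nm):
    """Evaluate Eq. (7.3); base-10 log and PD area as diameter^2 are assumptions."""
    numerator = (bit_rate_gbps * abs(math.log10(ber)) * abs(sens_dbm)
                 * gain_dbohm * pd_area_um2)
    return numerator / (power_mw * tech_nm ** 2)


# (bit rate, BER, sensitivity, gain, PD area, power, technology) from Table 7.3
ROWS = {
    "[6]": (4.5, 1e-12, -3.4, 105.0, 3600.0, 74.0, 130.0),
    "[10]": (8.5, 1e-12, -3.2, 120.0, 4900.0, 47.0, 130.0),
    "[11]": (12.5, 1e-12, -7.0, 68.4, 100.0, 59.0, 250.0),
    "This work (std.)": (9.0, 1e-12, -4.2, 102.0, 2071.0, 48.0, 65.0),
    "This work (avalan.)": (18.0, 1e-12, -4.9, 102.0, 2071.0, 48.0, 65.0),
}

for label, row in ROWS.items():
    print(f"{label}: FoM = {fom(*row):.1f}")
```

The technology node enters squared in the denominator, so the 65-nm entries benefit strongly from scaling, while the small PD of [11] is penalized by the area term.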

# **Chapter 8**

# Technology Options and Physical Implementation Techniques for High Frequency Amplifiers

The technology options and physical layout implementations are becoming more and more crucial for high-speed amplifiers. On the one hand, for the active device, the  $V_t$  options and transistor layout in Taiwan Semiconductor Manufacturing Company (TSMC) 1-V 65-nm CMOS technology are much more critical than before, since the parasitic resistor, inductor, and capacitor (RLC) become dominant with the shrinking of the technology node. On the other hand, for the passive device, the on-chip inductor is widely deployed for bandwidth improvement in high-speed amplifiers by shunt peaking or series peaking. However, it occupies a huge amount of chip area, which not only means much higher cost, but also an increase in the length of interconnections and more capacitive loadings.

## 8.1 High-Speed Transistor $V_t$ Options and Layout

#### 8.1.1 Introduction

The high-speed cell-based transistor layout methodology was first attempted in a 3-µm CMOS process nearly two decades ago [58], [59]. More recently, a radio frequency (RF) cell-based modeling platform for parameterized sub-circuit cell layout was proposed based on a 0.13-µm CMOS process [60]. Three fundamental sub-circuit cells were proposed in [60] for the cascode stage, differential pair, and cross-coupled pair. In practice, the choice of which devices to merge is not so clear in terms of the impact of layout parasitics on circuit performance. For example, when drawing the layout of a differential cascode low-noise amplifier, the designer needs to select from: (1) drawing four individual devices for the cascode and main transistors; (2) merging the cascode and main transistor; or (3) merging the main devices and the cascode devices as two merged differential pairs. In conventional analog design below a few hundred MHz, merging the diffusion area of adjacent devices is a common practice to save silicon area, reduce diffusion parasitic capacitance, and improve transistor matching. However, such a layout practice tends to incur more interconnect wiring parasitics due to the inter-digitated layout style, which can be detrimental to RF design. In nano-scale processes beyond the 65-nm technology node, the potential problem worsens as the crosstalk capacitance increases drastically due to the narrower contact and line spacing. As a result, it is unclear whether two-transistor sub-circuit cells with a merged diffusion or an individual transistor layout with external routing will provide better RF performance. Moreover, when designing in deeply scaled processes, the performance impact of using normal threshold voltage and low threshold voltage devices must also be accounted for. This chapter answers these design questions based on the experimental results obtained from a set of RF low-noise amplifiers.

#### 8.1.2 LNA Test Circuit with Different Layout and $V_t$ Options

The differential cascode low noise amplifier (LNA) topology employed in this study is shown in Fig. 8.1 [61]–[63]. The cascode devices ( $M_2$ ) are used to achieve high gain and good reverse isolation. With the series inductor ( $L_g$ ) at the gate of the input transistor and the source degeneration inductor ( $L_s$ ) at the source node, input noise and impedance matching are realized. However,  $L_g$  has to be relatively large due to the small gate-source capacitance of  $M_1$ . To solve this problem, metal-insulator-metal (MIM) capacitors,  $C_d$ , are inserted in parallel with the input transistors as an extra design parameter, which also reduces the gate-induced current noise [62].



Fig. 8. 1. Simplified schematic of the 5-GHz LNAs.

Table 8. 1: Parameters of the 5-GHz LNAs.

| Device | W : L         | $I_{bias}$ (µA/µm) |
|--------|---------------|--------------------|
| $M_1$  | 80 µm : 60 nm | 94                 |
| $M_2$  | 80 µm : 60 nm | 94                 |

| $L_g$        | $L_s$        | $L_d$        | $C_d$  | $C_L$  |
|--------------|--------------|--------------|--------|--------|
| 4.3 nH, Q=11 | 0.5 nH, Q=14 | 7.7 nH, Q=11 | 160 fF | 190 fF |
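The role of $C_d$ can be illustrated with the textbook input-resonance condition of an inductively degenerated stage, $\omega_0^2 = 1/[(L_g + L_s)(C_{gs} + C_d)]$. The sketch below is a first-order estimate under that assumption only: pad, Miller, and routing capacitances are ignored, the values are treated per half circuit, and the gate-source capacitance it infers is not a figure reported in the source.

```python
import math


def required_input_capacitance(f0_hz: float, l_g: float, l_s: float) -> float:
    """Total gate-source capacitance that resonates with Lg + Ls at f0."""
    return 1.0 / ((2 * math.pi * f0_hz) ** 2 * (l_g + l_s))


def required_gate_inductance(f0_hz: float, c_total: float, l_s: float) -> float:
    """Gate inductance needed to resonate with c_total at f0, given Ls."""
    return 1.0 / ((2 * math.pi * f0_hz) ** 2 * c_total) - l_s


F0 = 5e9                              # LNA centre frequency
L_G, L_S, C_D = 4.3e-9, 0.5e-9, 160e-15  # from Table 8.1

c_total = required_input_capacitance(F0, L_G, L_S)
c_gs_implied = c_total - C_D          # first-order inference, not a measured value
l_g_without_cd = required_gate_inductance(F0, c_gs_implied, L_S)

print(f"Total capacitance for 5-GHz resonance: {c_total * 1e15:.0f} fF")
print(f"Implied C_gs: {c_gs_implied * 1e15:.0f} fF")
print(f"L_g needed without C_d: {l_g_without_cd * 1e9:.1f} nH")
```

Under these assumptions, removing $C_d$ would call for a gate inductor several times larger than the 4.3 nH used, which is the trade-off the paragraph above describes.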

To compare the performance impact of different layout styles and  $V_t$  options, five different versions of the LNA test circuit are designed and characterized. The design splits and key attributes are summarized in Table 8.2. The first two design splits adopt a merged diffusion layout for the cascode ( $M_2$ ) and main device ( $M_1$ ), which is similar in layout style to the RF sub-circuit cells proposed in [60]. Designs using normal  $V_t$  (~450 mV) and low  $V_t$  (~250 mV) are included in the study. Due to the inter-digitated layout style, the wiring is so congested that the transistor gate can only be contacted from one side without introducing excessive extra parasitics from routing overlaps and crossovers. The third and fourth splits employ the standard single-transistor RF cell from the PDK, which has more room to accommodate double-sided gate contacts. Again, both normal  $V_t$  and low  $V_t$  samples are included. The fifth split differs from the fourth in its cascode-to-main device width ratio. This layout flexibility is offered by the individual transistor cell layout, rather than by merged diffusion sub-circuit cells.

Table 8. 2: Summary of LNA test circuit design splits.

| # | Design Split Label        | Cascode to Main Device Ratio | $V_t$ Option | Device Layout                                              | Gate Contact |
|---|---------------------------|------------------------------|--------------|------------------------------------------------------------|--------------|
| 1 | Merged\_Nor$V_t$ (RBC)    | 1 : 1                        | Normal       | Merged cascode and main device                             | Single-sided |
| 2 | Merged\_L$V_t$ (RBC)      | 1 : 1                        | Low          | Merged cascode and main device                             | Single-sided |
| 3 | PDK\_Nor$V_t$             | 1 : 1                        | Normal       | Individual cascode and main device using PDK RF transistors | Double-sided |
| 4 | PDK\_L$V_t$               | 1 : 1                        | Low          | Individual cascode and main device using PDK RF transistors | Double-sided |
| 5 | PDK\_L$V_t$\_1.7:1        | 1.7 : 1                      | Low          | Individual cascode and main device using PDK RF transistors | Double-sided |



Fig. 8. 2. Layout view (to scale) of (a) design split #1 and #2: Merged\_Nor $V_t$  and Merged\_L $V_t$ ; (b) design split #3 and #4: PDK\_Nor $V_t$  and PDK\_L $V_t$ ; and (c) design split #5: PDK\_L $V_t$ \_1.7:1.

#### 8.1.3 Measurement Results

One die photo of the LNA test circuit is shown in Fig. 8.3. The test circuits are fabricated with TSMC 1-V 65-nm CMOS technology. Each LNA occupies an area of 0.52 mm<sup>2</sup>. The top metal is used to design the four on-chip inductors with a quality factor ranging from 11 to 14 at 5 GHz. On-wafer probing techniques are adopted to measure the S-parameter, NF and IIP3 performance.



Fig. 8. 3. Die photo of one LNA test circuit used in this study.

A performance summary of the post-simulation (SS MOS at 85 °C) and measurement results is shown in Table 8.3. To facilitate fair comparisons, the LNAs are biased at a fixed power consumption of 18 mW from a 1.2-V  $V_{DD}$ . The measurement and simulation results show consistent trends. Design splits #3, #4, and #5 using an individual device layout with separated diffusion area, low  $V_t$ , and double-sided gate contact provide better gain and noise performance. Specifically, the power gain and NF are improved by 1.5 dB and 0.3 dB, respectively. On the other hand, using normal  $V_t$  devices yields better linearity, with a ~4-dB enhancement in IIP3. Merging the diffusion at the cascode node also slightly improves the linearity as the nonlinear junction capacitance is reduced.

Table 8. 3: Post-simulation and measurement result summary.

| Performance Summary |          | Merged\_Nor$V_t$ | Merged\_L$V_t$ | PDK\_Nor$V_t$ | PDK\_L$V_t$ | PDK\_L$V_t$\_1.7:1 |
|---------------------|----------|------------------|----------------|---------------|-------------|--------------------|
| S21 (dB)            | Post-sim | 17.1             | 17.1           | 17.4          | 18.1        | 18.1               |
| S21 (dB)            | Measured | 15.4             | 16.2           | 16.3          | 16.8        | 16.9               |
| S11 (dB)            | Post-sim | -9.3             | -9.4           | -9.1          | -9.3        | -10.3              |
| S11 (dB)            | Measured | -10.2            | -9.4           | -10.1         | -8.4        | -9.7               |
| S22 (dB)            | Post-sim | -12.6            | -12.8          | -14.7         | -15.8       | -11.7              |
| S22 (dB)            | Measured | -8.2             | -8.7           | -8.8          | -8.3        | -8.1               |
| S12 (dB)            | Post-sim | -36.5            | -36.5          | -35.4         | -35.4       | -35.5              |
| S12 (dB)            | Measured | -32.8            | -31.6          | -32.3         | -31.7       | -31.5              |
| NF (dB)             | Post-sim | 2.57             | 2.58           | 2.35          | 2.28        | 2.27               |
| NF (dB)             | Measured | 2.86             | 2.81           | 2.76          | 2.59        | 2.59               |
| IIP3 (dBm)          | Pre-sim  | -8.0             | NA             | -8.4          | -9.2        | -10.1              |
| IIP3 (dBm)          | Measured | -5.6             | -9.3           | -6.5          | -10.3       | -11.4              |
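The improvements quoted in the text (1.5 dB in gain, 0.3 dB in NF, and about 4 dB in IIP3) can be traced to the measured rows of Table 8.3, as the following sketch shows. The dictionary keys and helper names are illustrative; only the numbers come from the table.

```python
# Measured rows of Table 8.3
MEASURED = {
    "Merged_NorVt":  {"s21": 15.4, "nf": 2.86, "iip3": -5.6},
    "Merged_LVt":    {"s21": 16.2, "nf": 2.81, "iip3": -9.3},
    "PDK_NorVt":     {"s21": 16.3, "nf": 2.76, "iip3": -6.5},
    "PDK_LVt":       {"s21": 16.8, "nf": 2.59, "iip3": -10.3},
    "PDK_LVt_1.7:1": {"s21": 16.9, "nf": 2.59, "iip3": -11.4},
}


def spread(metric: str) -> float:
    """Difference between the best and worst split for one metric."""
    values = [row[metric] for row in MEASURED.values()]
    return max(values) - min(values)


def vt_delta(metric: str, normal_split: str, low_split: str) -> float:
    """Normal-Vt value minus low-Vt value for the same layout style."""
    return MEASURED[normal_split][metric] - MEASURED[low_split][metric]


print(f"Gain spread: {spread('s21'):.1f} dB")
print(f"NF spread:   {spread('nf'):.2f} dB")
print(f"IIP3 gain from normal Vt (merged): {vt_delta('iip3', 'Merged_NorVt', 'Merged_LVt'):.1f} dB")
print(f"IIP3 gain from normal Vt (PDK):    {vt_delta('iip3', 'PDK_NorVt', 'PDK_LVt'):.1f} dB")
```

The comparison pairs splits that share the same layout style, so each IIP3 difference isolates the $V_t$ option.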

The S-parameters are measured using an R&S ZVB8 4-port network analyzer. SOLT (SHORT, OPEN, LOAD, and THRU) calibration is performed using a GGB CS-8 calibration substrate. Fig. 8.4 plots the measured power gain (S21) of the five LNAs. Design split #5 (PDK\_L$V_t$\_1.7:1) achieves the highest gain, 1.5 dB higher than the lowest gain, obtained by design split #1 (Merged\_Nor$V_t$). The low  $V_t$  splits all demonstrate slightly higher power gain than their normal  $V_t$  counterparts. The measured input and output matching are shown in Fig. 8.5 and Fig. 8.6, respectively. The measured reverse isolation is shown in Fig. 8.7. Since a cascode topology is used in this design, the reverse isolation is excellent.



Fig. 8. 4. Measured results of power gain (S21) for all 5 LNAs.

The NF of the differential LNAs has to be measured in a single-ended to single-ended configuration due to equipment limitations, as shown in Fig. 8.8 [64]. To work around this limitation, two Krytar 4020180 28GHz 180° hybrid couplers are adopted for single-ended to differential conversion. The non-ideality of the hybrid couplers is evaluated using an R&S ZVB8 4-port network analyzer. A maximum imbalance of 0.5 dB in amplitude and ±8 degrees in phase is observed. An Agilent N8975A noise figure analyzer and an N4002A noise source are used to perform the NF measurement [65], [66]. For calibration, the noise source is first connected directly to the noise figure analyzer. Next, a differential THRU structure on the calibration substrate is measured to de-embed the pad parasitics. The THRU structure is then replaced with the device under test (DUT) for the actual measurement. The measured NF performance is plotted in Fig. 8.9. Design split #5 (PDK\_L $V_t$ \_1.7:1) achieves the best NF performance. The splits using normal  $V_t$  and single-sided gate contact exhibit the worst NF performance.



Fig. 8. 5. Measured results of input matching (S11) for all 5 LNAs.



Fig. 8. 6. Measured results of output matching (S22) for all 5 LNAs.



Fig. 8. 7. Measured results of reverse isolation (S12) for all 5 LNAs.



Fig. 8. 8. Testing setup of the differential NF measurement.



Fig. 8. 9. Measured results of the NF.

The IIP3 performance is measured with two Agilent signal generators and one E4440A spectrum analyzer. The two-tone signal, with components at 5 GHz and 5.01 GHz, is produced using a Krytar 4020180 28GHz 180° hybrid coupler and injected into the test circuits. The IIP3 measurement results for design split #1 (Merged\_Nor $V_t$ ) and split #5 (PDK\_L $V_t$ \_1.7:1) are plotted in Fig. 8.10 (a) and (b), respectively. The slopes of the fundamental tone and the third-order intermodulation tone in both cases are close to the theoretical values of 1 and 3, respectively. The superior IIP3 attained by Merged\_Nor $V_t$  suggests that merged diffusion and normal  $V_t$  help the linearity performance.
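When the two slopes are close to 1 and 3, IIP3 can be extrapolated from a single two-tone measurement with the standard relation IIP3 = P_in + (P_fund − P_IM3)/2, all in dB units. The sketch below illustrates this; the power levels in the usage line are hypothetical and are not read from Fig. 8.10.

```python
def iip3_dbm(p_in_dbm: float, p_fund_dbm: float, p_im3_dbm: float) -> float:
    """Extrapolate IIP3 from one two-tone point, assuming ideal 1:1 and 3:1 slopes."""
    return p_in_dbm + (p_fund_dbm - p_im3_dbm) / 2.0


def intercept_from_lines(p_in_dbm: float, p_fund_dbm: float, p_im3_dbm: float) -> float:
    """Same intercept found by equating the slope-1 and slope-3 lines explicitly."""
    # fundamental: p_fund + 1 * (x - p_in);  IM3: p_im3 + 3 * (x - p_in)
    # equal when 2 * (x - p_in) = p_fund - p_im3
    return p_in_dbm + (p_fund_dbm - p_im3_dbm) / (3 - 1)


# Hypothetical per-tone levels: -30 dBm in, -14 dBm fundamental out, -62 dBm IM3 out
example = iip3_dbm(-30.0, -14.0, -62.0)
print(f"Extrapolated IIP3: {example:.1f} dBm")
```

If the measured slopes deviate noticeably from 1 and 3, a line fit over several input powers is more reliable than this single-point estimate.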





Fig. 8. 10. Measured results of IIP3 (a) Merged\_Nor $V_t$ , and (b) PDK\_L $V_t$ \_1.7:1.

#### 8.1.4 Conclusion

Based on the measurement results from a set of 5-GHz LNAs, this section evaluates the performance impact of different layouts and  $V_t$  options [67]. For higher gain, higher bandwidth, and better NF, an individual transistor layout and low  $V_t$  are preferred. When linearity and matching are more important, normal  $V_t$  devices with merged diffusion area can yield better performance. The recommended layout and  $V_t$  usage guidelines for high-speed amplifier design are summarized in Table 8.4. Further study on the mixing of normal and low  $V_t$  cascode devices should be carried out.

Table 8. 4: Recommended layout and  $V_t$  usage guidelines.

|              | For higher gain and bandwidth | For better linearity and matching |
|--------------|-------------------------------|-----------------------------------|
| $V_t$        | Low $V_t$                     | Normal $V_t$                      |
| Layout style | Individual device             | Merged diffusion area             |

These guidelines determine how the high-speed amplifiers in the two OEIC systems are laid out: low  $V_t$  devices and individual layouts for differential pairs, cross-coupled pairs, and cascode transistors are used to pursue higher gain and bandwidth. Linearity is not important in the two optoelectronic integrated circuits (OEICs), and the matching can be improved by intentionally balancing the differential signal paths.

## 8.2 Differential Stacked Spiral Inductor Design

#### 8.2.1 Introduction

The inductors provided by the TSMC 1-V 65-nm CMOS process design kit (PDK) library are usually targeted at high quality factor (Q) applications, and most of them only use the top metal layer with 3.23-µm thickness [68]. This inevitably leads to a large silicon area, since the inductance is determined only by the geometric dimensions [69]. For example, the smallest differential inductor with 1.0-nH low-frequency inductance that can be found in the PDK library used in this work measures 110 µm × 116 µm. Such an inductor  $L_0$  has 4 turns with an inner diameter  $d_{\rm in}$ = 30 µm, metal width w = 3 µm, metal spacing s = 2 µm, and a 25-µm-thick guard ring on the outside for extra protection. It achieves a Q of 16 at 18 GHz, but an inductance density of only 0.08 pH/µm². In high-speed amplifier designs, high Q is not a critical design parameter for shunt-peaking loads, since the inductors are already in series with resistors. Instead, a smaller area is more desirable. Hence, self-shielded differential stacked spiral inductors with two metal layers are presented in this section. They sacrifice Q but obtain a higher inductance density [68].

#### 8.2.2 The Differential Stacked Spiral Inductor

For the design of a 1.0-nH differential inductor, a single-layer differential inductor  $L_1$  is chosen as a starting point.  $L_1$  is provided by the PDK library with the same  $d_{in}$, w, and s as  $L_0$, but has only 2 turns. In this work,  $L_1$  has 243-pH inductance at low frequency and uses the top metal layer (metal 9 in this case). Next, the layout of  $L_1$  is copied to a lower metal layer (metal 8) to form another single-layer inductor  $L_2$. Then the center taps of  $L_1$  and  $L_2$  are separated and reconnected to form a 4-turn differential stacked spiral inductor, as shown in Fig. 8.11.  $L_1$  and  $L_2$  are connected so that they are positively coupled to each other. Since the adjacent metal layers have the smallest potential difference under this condition, the effect of the coupling capacitance is minimized. The two top metal layers are chosen to minimize the parasitic capacitance to ground as well.

The differential stacked spiral inductor (DSSI) presented in this section is compared in simulation with its smallest single-layer counterpart provided by the foundry. High frequency structure simulator (HFSS) is used to verify the L and Q of the DSSI. Both inductors have 4 turns, the same  $d_{in}$, w, and s, and achieve 1.0-nH differential inductance at low frequency. However, the DSSI used in this work occupies an area of only 60 µm × 66 µm. The inductance density is improved by 3.2 times and reaches 0.25 pH/µm². The DSSI also achieves a higher self-resonance frequency, as depicted in Fig. 8.12(a). These improvements are achieved by sacrificing the quality factor, as shown in Fig. 8.12(b). A detailed comparison between the DSSI and its single-layer counterpart is shown in Table 8.5.
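The inductance-density figures follow from the quoted footprints, as the minimal sketch below shows. It uses the nominal 1.0-nH low-frequency inductance for both inductors; the function name is illustrative.

```python
def inductance_density_ph_per_um2(l_nh: float, width_um: float, height_um: float) -> float:
    """Inductance per unit layout area in pH/um^2."""
    return l_nh * 1e3 / (width_um * height_um)


single_layer = inductance_density_ph_per_um2(1.0, 110.0, 116.0)
dssi = inductance_density_ph_per_um2(1.0, 60.0, 66.0)

print(f"Single-layer: {single_layer:.2f} pH/um^2")
print(f"DSSI:         {dssi:.2f} pH/um^2")
print(f"Improvement:  {dssi / single_layer:.1f}x")
```

Because both inductors have the same inductance, the density improvement equals the ratio of the two layout areas.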



Fig. 8. 11. The custom-designed DSSI [68].



Fig. 8. 12. HFSS simulation results of (a) inductance, and (b) quality factor for single-layer and presented customized DSSI [68].

Table 8. 5: Simulation comparison of the single-layer inductor and the customized DSSI.

|                          | Single-layer inductor | DSSI          |
|--------------------------|-----------------------|---------------|
| Low-frequency inductance | 1.0 nH                | 1.0 nH        |
| Area                     | 110 μm × 116 μm       | 60 μm × 66 μm |
| Self-resonance frequency | 34.7 GHz              | 41.5 GHz      |
| Peaking quality factor   | 16                    | 11            |
| Number of turns          | 4                     | 4             |

#### 8.2.3 Measurement Results

The presented DSSI is fabricated in TSMC 1-V 65-nm CMOS technology. The chip microphotograph is shown in Fig. 8.13. The measured and simulated inductances agree within 5% at low frequency. Due to the actual parasitic capacitance, the measured self-resonance frequency drops to 37.5 GHz, compared with the simulated 41.5 GHz.
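The drop in self-resonance frequency can be translated into an equivalent parasitic capacitance with a simple lumped L-C model, $f_{SR} = 1/(2\pi\sqrt{LC})$. The sketch below is only a first-order estimate: it assumes the nominal 1.0-nH low-frequency inductance in both cases and lumps all parasitics into a single capacitor, which the source does not claim.

```python
import math


def equivalent_capacitance(f_sr_hz: float, l_h: float) -> float:
    """Capacitance that resonates with l_h at f_sr in a lumped L-C model."""
    return 1.0 / ((2 * math.pi * f_sr_hz) ** 2 * l_h)


L_DSSI = 1.0e-9  # nominal low-frequency inductance (assumed for both cases)

c_sim = equivalent_capacitance(41.5e9, L_DSSI)   # simulated self-resonance
c_meas = equivalent_capacitance(37.5e9, L_DSSI)  # measured self-resonance

print(f"Simulated equivalent C: {c_sim * 1e15:.1f} fF")
print(f"Measured equivalent C:  {c_meas * 1e15:.1f} fF")
print(f"Extra parasitic C:      {(c_meas - c_sim) * 1e15:.1f} fF")
```

Under this model, only a few femtofarads of additional capacitance are enough to account for the 4-GHz shift.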



Fig. 8. 13. Microphotograph of the DSSI.



Fig. 8. 14. Comparison of measured and simulated inductance of the DSSI.



Fig. 8. 15. Comparison of measured and simulated quality factor of the DSSI.

#### 8.2.4 Conclusion

Based on the single-layer inductor provided in TSMC 1-V 65-nm CMOS technology, a fast design methodology of a DSSI is adopted to accommodate the high-speed amplifier design. The presented DSSI increases the inductance density by over 3 times, which can save much chip area and reduce the long interconnections and capacitive loadings. Moreover, the low-Q characteristics of the DSSI can be further utilized to absorb the series resistance at the output of high-speed amplifiers into the on-chip inductor, which can greatly ease the high-speed circuit layout.

# **Chapter 9**

# Conclusions and Future Work

## 9.1 Conclusions

Nowadays, with the growth of multimedia consumer applications, such as applications in Apple and Android cell phones, data traffic has increased so dramatically that conventional electrical copper links have met their performance limitations, in terms of bandwidth, channel loss, electromagnetic interference (EMI), reflection, and crosstalk. Therefore, high-speed fiber-optic links have begun to replace these electrical copper links. For long-haul communications, optical links that use III-V materials have already replaced electrical interconnections since a large number of end-users can share the expense of these materials. However, for short-range communications, this solution cannot be adopted since the communication channels cannot be shared by multiple users. In this thesis, two short-range complementary metal-oxide-semiconductor (CMOS) optoelectronic integrated circuit (OEIC) systems for different applications are proposed, designed, fabricated, and measured.

The 41-mW 30-Gb/s CMOS optical receiver with digitally-tunable cascaded equalization is targeted at high-speed optical communications above 25 Gb/s. Using the existing Taiwan Semiconductor Manufacturing Company (TSMC) 1-V 65-nm CMOS technology, its key design challenges are how to achieve a high data rate and a superior input sensitivity with minimum power consumption while compensating for the loss of the signal channel and for process, voltage, and temperature (PVT) variations. To meet these challenges, first, the system-level architecture is analyzed to break down the system's gain, power, and input-referred noise (IRN) into individual circuit blocks. Second, novel circuits are proposed, including an inverter-based transimpedance amplifier (TIA) employing shunt-shunt inductive feedback and input series peaking, a tri-loop low-dropout (LDO) regulator providing full-spectrum power supply rejection (PSR) and good transient response, and a 3-stage cascaded continuous-time linear equalizer (CTLE) utilizing inductive shunt peaking at different frequencies. Finally, both electrical and optical measurements demonstrate the effectiveness of the proposed low-power 30-Gb/s optical receiver and the proposed cascaded CTLE technique.

The 48-mW 18-Gb/s fully integrated CMOS optical receiver with an on-chip photodetector (PD) and an adaptive equalizer is targeted at a single-chip receiver solution for short-range optical communications. Its bottleneck and main challenge are the severe limitations of the on-chip PD's responsivity and bandwidth. To boost the data rate, innovations in both device and circuits are proposed, including a P-well/Deep N-well (PW/DNW) PD, a cascode TIA with shunt-shunt feedback, input series peaking, and negative capacitance compensation (NCC), and a cascaded CTLE with different peaking frequencies. Moreover, an adaptive equalization loop (AEL) based on the analysis of the power spectral density of a random data bit stream is utilized to auto-tune the OEIC. Measurement results reveal that the proposed single-chip solution achieves new record performance under both PD operation modes.

Moreover, technology options and physical implementation techniques have become increasingly crucial for the design of high-speed amplifiers and systems, since parasitic resistance, inductance, and capacitance (RLC) have begun to dominate as the technology node shrinks. Both active and passive devices in CMOS technology, including transistor $V_t$ options, transistor and cell-based sub-circuit layouts, and a differential stacked spiral inductor (DSSI), are studied with a set of 5-GHz low noise amplifiers (LNAs). Finally, a design and layout guideline for high-speed circuits is proposed and applied to the two OEICs.

## 9.2 My Own Contributions

My individual contributions mainly include:

- 1) System architecture designs for both of the two OEIC systems, and specification calculations for each building block;
- 2) The novel on-chip CMOS PW/DNW PD design;
- 3) The novel slow roll-up cascaded 3-stage CTLE design;
- 4) The novel cascode TIA with shunt-shunt feedback, input series peaking, and NCC;
- 5) Integration of the two OEIC systems;
- 6) A guideline for high-speed circuit layout and technology $V_t$ options, for which five 5-GHz high-speed amplifiers with different design considerations are designed, measured, and compared. Moreover, a differential stacked spiral inductor (DSSI) is designed to meet the needs of high-speed circuits with much smaller chip area and lower quality factor.

To excel over existing architectures:

For the 30-Gb/s OEIC with fully-digital control, the performance specifications are distributed among the circuit blocks. A novel 3-stage cascaded CTLE is proposed to compensate for the frequency-dependent loss due to the 14-Gb/s off-chip PD and to boost the final data rate up to 30 Gb/s, with only 0.6-dB degradation in input sensitivity.

For the 18-Gb/s fully integrated OEIC, first, the bottleneck of the whole system, i.e., the on-chip PD, is investigated. To alleviate the limitation imposed by the PD, a novel PW/DNW PD is proposed to provide better responsivity and -3-dB bandwidth. However, the novel PD alone is far from sufficient to reach a data rate of 10 Gb/s. Second, therefore, a slow roll-up cascaded 3-stage CTLE with adaptive equalization is proposed to boost the data rate up to 9 Gb/s in standard mode and 18 Gb/s in avalanche mode.
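To illustrate why peaking at different frequencies gives a more gradual ("slow roll-up") response, the following sketch compares three identical stages with three staggered stages. It is a simplified model, not the thesis circuit: each stage is assumed to be an ideal one-zero, one-pole section $H(f) = (1 + jf/f_z)/(1 + jf/f_p)$, stages are assumed not to load each other, and all zero and pole frequencies are hypothetical values chosen only for illustration.

```python
import numpy as np


def stage_gain_db(f_hz, f_zero_hz, f_pole_hz):
    """Magnitude (dB) of one idealized stage H(f) = (1 + jf/fz) / (1 + jf/fp)."""
    h = (1 + 1j * f_hz / f_zero_hz) / (1 + 1j * f_hz / f_pole_hz)
    return 20 * np.log10(np.abs(h))


def cascade_gain_db(f_hz, stages):
    """Gain of an ideal cascade: the dB gains of the stages add."""
    return sum(stage_gain_db(f_hz, fz, fp) for fz, fp in stages)


def max_slope_db_per_decade(f_hz, gain_db):
    """Steepest rise of the magnitude response on a log-frequency axis."""
    return np.max(np.gradient(gain_db, np.log10(f_hz)))


f = np.logspace(8, 10.5, 2000)  # 100 MHz to about 31.6 GHz

# Hypothetical (zero, pole) placements in Hz; not taken from the thesis.
identical = [(3e9, 12e9)] * 3
staggered = [(1e9, 4e9), (3e9, 12e9), (9e9, 36e9)]

for name, stages in (("identical", identical), ("staggered", staggered)):
    g = cascade_gain_db(f, stages)
    print(f"{name}: max slope = {max_slope_db_per_decade(f, g):.1f} dB/decade")
```

In this model both cascades have the same pole-to-zero ratio per stage, so they offer the same asymptotic boost, but the staggered placement spreads that boost over a wider frequency range and the response rises less steeply.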

## 9.3 Future Work

On the one hand, OEICs with data rates above 25 Gb/s have become very attractive with the rapid growth of the second generation of 100-Gbit Ethernet (100-GbE). However, to achieve such performance, expensive off-chip commercial PDs fabricated in III-V materials are required. On the other hand, although single-chip OEICs with on-chip PDs can only achieve a lower data rate, they provide many advantages over hybrid implementations. If a combination of these two types of OEICs could be realized, i.e., a single-chip solution with a data rate above 25 Gb/s, it would be a great breakthrough for short-range optical communications.

#### REFERENCES

- [1] The zettabyte era-trends and analysis, Cisco Visual Networking Index (VNI) Forecast, May 2013.
- [2] R. J. Bates, Optical switching and networking handbook, New York: McGraw-Hill, 2001.
- [3] B. Razavi, Design of integrated circuits for optical communications, McGraw-Hill, 2003.
- [4] H. H. Hopkins and N. S. Kapany, *A flexible fibrescope, using static scanning*, Nature 173, 1954.
- [5] K. C. Kao and G. A. Hockham, "Dielectric-fibre surface waveguides for optical frequencies," in *Proc. IEEE*, vol. 113, no. 7, pp. 1151–1158, Jul. 1966.
- [6] F. Tavernier and M. S. J. Steyaert, "High-speed optical receivers with integrated photodiode in 130 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 10, pp. 2856–2867, Oct. 2009.
- [7] A. V. Krishnamoorthy *et al.*, "Progress in low-power switched optical interconnects," *IEEE Journal of Selected Topics in Quantum Electronics*, vol. 17, no. 2, pp. 357–376, Mar./Apr. 2011.
- [8] S. B. Alexander, *Optical communication receiver design*, SPIE Optical Engineering Press, 1997.
- [9] A. C. Carusone, H. Yasotharan, and T. Kao, "Progress and trends in multi-Gbps optical receivers with CMOS integrated photodetectors," in *IEEE Custom Integrated Circuits Conference*, Sep. 2010.

- [10] D. Lee, J. Han, G. Han, and S. M. Park, "An 8.5-Gb/s fully integrated CMOS optoelectronic receiver using slope-detection adaptive equalizer," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 12, pp. 2861–2873, Dec. 2010.
- [11] J. Youn et al., "An integrated 12.5-Gb/s optoelectronic receiver with a silicon avalanche photodetector in standard SiGe BiCMOS technology," *Optics Express*, vol. 20, no. 27, 2012.
- [12] G. Hankins, 100 GbE and beyond, http://www.nanog.org/meetings/nanog52/presentations/Tuesday/hankins-100-gbe-and-beyond.pdf, Jun. 2011.
- [13] C. Cole, P. Anslow, and J. King, "Update to Adopted 100GE 10km SMF PMD Baseline," IEEE 802.3ba Task Force, Jul. 2008.
- [14] T. Takemoto et al., "A 4 × 25-to-28Gb/s 4.9mW/Gb/s -9.7dBm High-Sensitivity Optical Receiver Based on 65nm CMOS for Board-to-Board Interconnects," in *IEEE Int. Solid-State Circ. Conf. Dig. Tech. Papers*, pp. 118–119, Feb. 2013.
- [15] J. Jiang *et al.*, "100Gb/s Ethernet chipsets in 65nm CMOS technology," in *IEEE Int. Solid-State Circ. Conf. Dig. Tech. Papers*, pp. 120–121, Feb. 2013.
- [16] J. Proesel *et al.*, "25Gb/s 3.6pJ/b and 15Gb/s 1.37pJ/b VCSEL-Based Optical Links in 90nm CMOS," in *IEEE Int. Solid-State Circ. Conf. Dig. Tech. Papers*, pp. 418–419, Feb. 2012.
- [17] J. Proesel *et al.*, "A 20-Gb/s, 0.66-pJ/bit Serial Receiver with 2-Stage Continuous-Time Linear Equalizer and 1-Tap Decision Feedback Equalizer in 45nm SOI CMOS," in *IEEE Symp. on VLSI Circuits*, pp. 206–207, 2011.
- [18] S. M. Park and H. J. Yoo, "1.25-Gb/s Regulated Cascode CMOS Transimpedance Amplifier for Gigabit Ethernet Applications," *IEEE Journal of Solid-State Circuits*, vol. 39, pp. 112–121, 2004.

- [19] Y. Wang, Y. Lu, Q. Pan, Z. Hou, L. Wu, W. H. Ki, and C. P. Yue, "A 3-mW 25-Gb/s CMOS transimpedance amplifier with fully integrated low-dropout regulator for 100GbE systems," in *IEEE Radio Frequency Integrated Circuits Symposium Digest of Papers*, Jun. 2014.
- [20] S. Shekhar, J. S. Walling, and D. J. Allstot, "Bandwidth Extension Techniques for CMOS Amplifiers," *IEEE Journal of Solid-State Circuits*, vol. 41, pp. 2424–2439, Nov. 2006.
- [21] E. N. Y. Ho and P. K. T. Mok, "Wide-Loading-Range Fully Integrated LDR With a Power-Supply Ripple Injection Filter," *IEEE Trans. Circuits Syst. II: Express Briefs*, vol. 59, no. 6, pp. 356–360, Jun. 2012.
- [22] Y. Lu, W. H. Ki, and C. P. Yue, "A 0.65ns-Response-Time 3.01ps FOM Fully-Integrated Low-Dropout Regulator with Full-Spectrum Power-Supply-Rejection for Wideband Communication Systems," in *IEEE Int. Solid-State Circ. Conf. Dig. Tech. Papers*, Feb 2014.
- [23] Z. Hou, Q. Pan, Y. Wang, L. Wu, and C. P. Yue, "A 23-mW 30-Gb/s digitally programmable limiting amplifier for 100GbE optical receivers," in *IEEE Radio Frequency Integrated Circuits Symposium Digest of Papers*, Jun. 2014.
- [24] R. Sun, J. Park, F. Manony, and C. P. Yue, "A tunable passive filter for low-power high-speed equalizers," in *IEEE Symp. on VLSI Circuits*, 2006.
- [25] S. Elhadidy and S. Palermo, "A 10 Gb/s 2-IIR-tap DFE receiver with 35 dB loss compensation in 65-nm CMOS," in *IEEE Symp. on VLSI Circuits*, pp. 272–273, 2013.
- [26] J. Lee, "A 20-Gb/s adaptive equalizer in 0.13-μm CMOS technology," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 9, pp. 2058–2066, Sep. 2006.

- [27] H. Wang and J. Lee, "A 21-Gb/s 87-mW transceiver with FFE/DFE/Analog equalizer in 65-nm CMOS technology," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 4, pp. 909–919, Apr. 2010.
- [28] J. Weiss *et al.*, "A DC to 44-GHz, 19-dB gain amplifier in 90-nm CMOS using capacitive bandwidth enhancement," in *IEEE Int. Solid-State Circ. Conf. Dig. Tech. Papers*, pp. 514–515, Feb. 2006.
- [29] C. H. Lee and S. I. Liu, "A 35-Gb/s limiting amplifier in 0.13-um CMOS technology," in *IEEE Symp. on VLSI Circuits*, pp. 152–153, Jun. 2006.
- [30] J. R. M. Weiss, M. L. Schmatz, and H. Jaeckel, "A 40 Gb/s, digitally programmable peaking limiting amplifier with 20-dB differential gain in 90 nm CMOS," in *IEEE Radio Frequency Integrated Circuits Symposium Digest of Papers*, 2006.
- [31] E. Sackinger and W. C. Fischer, "A 3-GHz, 32-dB CMOS limiting amplifier for SONET OC-48 receiver," *IEEE Journal of Solid-State Circuits*, vol. 35, no. 12, pp. 1884–1888, Dec. 2000.
- [32] K. Wu *et al.*, "A 2 x 25Gb/s receiver with 2:5 DMUX for 100Gb/s Ethernet," *IEEE Journal of Solid-State Circuits*, vol. 45, pp. 2421–2432, Nov. 2010.
- [33] E. M. Cherry and D. E. Hooper, "The Design of Wide-band Transistor Feedback Amplifiers," in *Proc. Inst. Elec. Eng.*, vol. 110, pp. 375–389, 1963.
- [34] Z. Hou, Y. Wang, Q. Pan, and C. P. Yue, "A 25-Gb/s 32.1-dB CMOS Limiting Amplifier for Integrated Optical Receivers," in *Proceedings of IEEE International Conference on ASIC*, 2013.
- [35] J. Proesel *et al.*, "Optical receiver using DFE-IIR Equalization," in *IEEE Int. Solid-State Circ. Conf. Dig. Tech. Papers*, pp. 130–131, Feb. 2013.

- [36] T. Kao *et al.*, "A 5-Gbps optical receiver with monolithically integrated photodetector in 0.18-µm CMOS," in *IEEE Radio Frequency Integrated Circuits Symposium Digest of Papers*, Jun. 2009.
- [37] S.-H. Huang *et al.*, "A 10-Gb/s OEIC with meshed spatially-modulated photo detector in 0.18-µm CMOS technology," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 5, pp. 1158–1169, May 2011.
- [38] S. Radovanovic, A. J. Annema, and B. Nauta, "A 3-Gb/s optical detector in standard CMOS for 850-nm optical communication," *IEEE Journal of Solid-State Circuits*, vol. 40, pp. 1706–1717, Aug. 2005.
- [39] W. Z. Chen, S. H. Huang, G. W. Wu, C. C. Liu, Y. T. Huang, C. F. Chin, W. H. Chang, and Y. Z. Juang, "A 3.125 Gbps CMOS fully integrated optical receiver with adaptive analog equalizer," in *IEEE Asian Solid-State Circuits Conference Proceedings of Technical Papers*, pp. 396–399, Nov. 2007.
- [40] W. Z. Chen and S. H. Huang, "A 2.5 Gbps CMOS fully integrated optical receiver with lateral PIN detector," in *IEEE Custom Integrated Circuits Conference*, pp. 293–296, Sep. 2007.
- [41] J. S. Youn, H. S. Kang, M. J. Lee, K. Y. Park, and W. Y. Choi, "High-speed CMOS integrated optical receiver with an avalanche photodetector," *IEEE Photonics Technology Letters*, vol. 21, no. 20, pp. 1553–1555, Oct. 2009.
- [42] M. K. Lee, H. S. Kang, and W. Y. Choi, "Equivalent circuit model for Si avalanche photodetectors fabricated in standard CMOS process," *IEEE Electron Device Letters*, vol. 29, no. 10, pp. 1115–1117, Oct. 2008.

- [43] S. H. Huang, W. Z. Chen, Y. W. Chang, and Y. T. Huang, "A 10-Gb/s OEIC with meshed spatially-modulated photo detector in 0.18-μm CMOS technology," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 5, pp. 1158–1169, May 2011.
- [44] W. K. Huang, Y. C. Liu, and Y. M. Hsin, "A high-speed and high-responsivity photodiode in standard CMOS technology," *IEEE Photonics Technology Letters*, vol. 19, no. 4, pp. 197–199, Feb. 2007.
- [45] M. J. Lee and W. Y. Choi, "Area-dependent photodetection frequency response characterization of silicon avalanche photodetectors fabricated with standard CMOS technology," *IEEE Transactions on Electron Devices*, vol. 60, no. 3, pp. 998–1004, Mar. 2013.
- [46] J. S. Youn, M. J. Lee, K. Y. Park, H. Rucker, and W. Y. Choi, "An integrated 12.5-Gb/s optoelectronic receiver with a silicon avalanche photodetector in standard SiGe BiCMOS technology," *Optics Express*, vol. 20, no. 27, Dec. 2012.
- [47] M. J. Lee and W. Y. Choi, "A silicon avalanche photodetector fabricated with standard CMOS technology with over 1 THz gain-bandwidth product," *Optics Express*, vol. 18, pp. 24189–24194, 2010.
- [48] F. P. Chou, C. W. Wang, Z. Y. Li, Y. C. Hsieh, and Y. M. Hsin, "Effect of deep N-well bias in an 850-nm Si photodiode fabricated using the CMOS process," *IEEE Photonics Technology Letters*, vol. 25, no. 7, pp. 659–662, Apr. 2013.
- [49] F. P. Chou, G. Y. Chen, C. W. Wang, Z. Y. Li, Y. C. Liu, W. K. Huang, and Y. M. Hsin, "Design and analysis for a 850 nm Si photodiode using the body bias technique for low-voltage operation," *Journal of Lightwave Technology*, vol. 31, no. 6, pp. 936–941, Mar. 2013.

- [50] Z. Hou, Q. Pan, Y. Li, S. Feng, A. W. Poon, and C. P. Yue, "Integrated CMOS photodetectors for short-range optical communication," in *IEEE Electron Devices and Solid-State Circuits*, Jun. 2013.
- [51] J. H. Mun, S. M. Park, and M. R. Nam, "Four-Channel CMOS Photoreceiver Array for Parallel Optical Interconnects," in *IEEE International Symposium on Circuits and Systems*, vol. 2, pp. 1529–1532, 2005.
- [52] D. Shin *et al.*, "A 1-mW 12-Gb/s continuous-time adaptive passive equalizer in 90-nm CMOS," in *IEEE Custom Integrated Circuits Conference*, pp. 117–120, Sep. 2009.
- [53] Q. Pan, Z. Hou, Y. Wang, and C. P. Yue, "A 65-nm CMOS P-well/Deep N-well avalanche photodetector for integrated 850-nm optical receivers," in *IEEE 10th International Conference on ASIC*, Oct. 2013.
- [54] B. G. Streetman and S. K. Banerjee, *Solid State Electronic Devices*, 6th Edition, Chapter 5, pp. 198, Pearson Prentice Hall, 2006.
- [55] A. C. Carusone, H. Yasotharan, and T. Kao, "CMOS technology scaling considerations for multi-Gbps optical receivers with integrated photodetector," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 8, pp. 1832–1842, Aug. 2011.
- [56] M. Jutzi, M. Grozing, E. Gaugler, W. Mazioschek, and M. Berroth, "2-Gb/s CMOS optical integrated receiver with a spatially modulated photodetector," *IEEE Photonics Technology Letters*, vol. 17, no. 6, pp. 1268–1270, Jun. 2005.
- [57] Y. Dong and K. Martin, "A monolithic 3.125 Gbps fiber optic receiver front-end for POF applications in 65 nm CMOS," in *IEEE Custom Integrated Circuits Conference*, 2011.
- [58] T. Pletersek *et al.*, "High-Performance Designs with CMOS Analog Standard Cells," *IEEE Journal of Solid-State Circuits*, vol. SC-21, no. 2, pp. 215–222, Apr. 1986.

- [59] C. A. Laaber *et al.*, "Design Considerations for a High-Performance 3-μm CMOS Analog Standard-Cell Library," *IEEE Journal of Solid-State Circuits*, no. 2, pp. 181–189, Apr. 1987.
- [60] D. H. Shin and C. P. Yue, "A Unified Modeling and Design Methodology for RFICs Using Parameterized Sub-Circuit Cells," in *IEEE Radio Frequency Integrated Circuits Symposium Digest of Papers*, Jun. 2006.
- [61] D. H. Shin, J. Park, and C. P. Yue, "A Low-Power, 3-5-GHz CMOS UWB LNA Using Transformer Matching Technique," in *IEEE Asian Solid-State Circuits Conference Proceedings of Technical Papers*, Nov. 2007.
- [62] P. Andreani and H. Sjöland, "Noise Optimization of an Inductively Degenerated CMOS Low Noise Amplifier," *IEEE Trans. on CAS-II: Analog and Digital Signal Processing*, vol. 48, no. 9, pp. 835–841, Sep. 2001.
- [63] R. Fujimoto, "A 7-GHz 1.8dB NF CMOS Low-Noise Amplifier," *IEEE Journal of Solid-State Circuits*, vol. 37, pp. 852–856, Jul. 2002.
- [64] Y. C. Chang, "On-Wafer Differential Noise Figure and Large Signal Measurements of Low-Noise Amplifier," in *Proceedings of the 39th European Microwave Conference*, pp. 699–702, 2009.
- [65] Agilent Technologies, "Noise Figure Measurement Accuracy-The Y-Factor Method," *Application Note 57–2*, Mar. 2004.
- [66] Agilent Technologies, "Advanced Measurement and Modeling of Differential Devices," Application Note 5989–4518EN, Jan. 2006.
- [67] Q. Pan, T. J. Yeh, C. Jou, F. L. Hsueh, H. Luong, and C. P. Yue, "A Performance Study of Layout and V<sub>t</sub> Options for Low Noise Amplifier Design in 65-nm CMOS," in *IEEE Radio Frequency Integrated Circuits Symposium Digest of Papers*, pp. 535–538, Jun. 2012.

- [68] L. Sun, Q. Pan, K. C. Wang, and C. P. Yue, "A 26–28-Gb/s Full-rate Clock and Data Recovery Circuit with Embedded Equalizer in 65-nm CMOS," *IEEE Trans. Circuits Syst. I: Express Briefs*, 2014.
- [69] S. S. Mohan, M. Hershenson, S. P. Boyd, and T. H. Lee, "Simple Accurate Expressions for Planar Spiral Inductances," *IEEE Journal of Solid-State Circuits*, vol. 34, no. 10, pp. 1419–1425, Oct. 1999.

### **List of Publications**

#### **Published Works**

#### **Conference Papers**

- [1] **Quan Pan**, Zhengxiong Hou, Yipeng Wang, Yan Lu, Wing-Hung Ki, Keh Chung Wang, and C. Patrick Yue, "A 48-mW 18-Gb/s Fully Integrated CMOS Optical Receiver with Photodetector and Adaptive Equalizer," in *IEEE Symposium on VLSI Circuits*, Jun. 2014.
- [2] **Quan Pan**, Tzu-Jin Yeh, Chewnpu Jou, Fu-Lung Hsueh, Howard Luong, and C. Patrick Yue, "A performance study of layout and Vt options for low noise amplifier design in 65-nm CMOS," in *IEEE Radio Frequency Integrated Circuits Symposium Digest of Papers*, pp. 535–538, Jun. 2012.
- [3] **Quan Pan**, Zhengxiong Hou, Yipeng Wang, and C. Patrick Yue, "A 65-nm CMOS P-well/Deep N-well avalanche photodetector for integrated 850-nm optical receivers," in *IEEE 10th International Conference on ASIC*, Oct. 2013.
- [4] Li Sun, **Quan Pan**, Keh Chung Wang, and C. Patrick Yue, "A 25-28 Gbps clock and data recovery system with embedded equalization in 65-nm CMOS," in *International Conference on Solid-State and Integrated Circuit Technology*, Oct. 2012.
- [5] Li Sun, Yipeng Wang, **Quan Pan**, Zhengxiong Hou, Yan Lu, and C. Patrick Yue, "A 26-Gb/s Optical Receiver Front-end in 65nm CMOS," in *International Solid-State Circuits Conference Student Research Preview*, Feb. 2013.

- [6] Zhengxiong Hou, **Quan Pan**, Yu Li, Shaoqi Feng, A. W. Poon, and C. Patrick Yue, "Integrated CMOS Photodetectors for Short-range Optical Communication," in *IEEE International Conference on Electron Devices and Solid-State Circuits*, Jun. 2013.
- [7] Zhengxiong Hou, **Quan Pan**, Yipeng Wang, Liang Wu, and C. Patrick Yue, "A 23-mW 30-Gb/s digitally programmable limiting amplifier for 100GbE optical receivers," in *IEEE Radio Frequency Integrated Circuits Symposium Digest of Papers*, Jun. 2014.
- [8] Yipeng Wang, Yan Lu, **Quan Pan**, Zhengxiong Hou, Liang Wu, Wing-Hung Ki, and C. Patrick Yue, "A 3-mW 25-Gb/s CMOS transimpedance amplifier with fully integrated low-dropout regulator for 100GbE systems," in *IEEE Radio Frequency Integrated Circuits Symposium Digest of Papers*, Jun. 2014.

#### **Journal Papers**

- [1] **Quan Pan**, Zhengxiong Hou, Yu Li, Andrew W. Poon, and C. Patrick Yue, "A 65-nm CMOS P-well/Deep N-well photodetector for fully integrated 850-nm optical receivers," *IEEE Photonics Technology Letters*, accepted.
- [2] Li Sun, **Quan Pan**, Keh-Chung Wang, and C. Patrick Yue, "A 26-28-Gb/s clock and data recovery circuit with embedded equalizer in 65-nm CMOS," *IEEE Transactions on Circuits and Systems-I*, accepted.

#### **Works Submitted or Under Preparation**

#### **Conference Papers**

- [1] **Quan Pan**, Yipeng Wang, Zhengxiong Hou, Li Sun, Yan Lu, Liang Wu, Wing-Hung Ki, Patrick Chiang, and C. Patrick Yue, "A 41-mW 30-Gb/s CMOS Optical Receiver with Digitally-Tunable Cascaded Equalization," in *Proc. European Solid-State Circuits Conference (ESSCIRC)*, submitted.

#### **Journal Papers**

- [1] **Quan Pan**, Yipeng Wang, Zhengxiong Hou, Li Sun, Yan Lu, Liang Wu, Wing-Hung Ki, Patrick Chiang, and C. Patrick Yue, "A 41-mW 30-Gb/s CMOS Optical Receiver with Digitally-Tunable Cascaded Equalization," *IEEE Journal of Solid-State Circuits*, under preparation.
- [2] **Quan Pan**, Zhengxiong Hou, Yipeng Wang, Yan Lu, Wing-Hung Ki, Keh Chung Wang, and C. Patrick Yue, "A 48-mW 18-Gb/s Fully Integrated CMOS Optical Receiver with Photodetector and Adaptive Equalizer," *IEEE Journal of Solid-State Circuits*, under preparation.
- [3] Yan Lu, Wing-Hung Ki, C. Patrick Yue, **Quan Pan**, and Yipeng Wang, "A Fully-Integrated Low-Dropout Regulator with Ultra-Fast Transient Response and Full-Spectrum Power Supply Rejection," *IEEE Journal of Solid-State Circuits*, submitted.

# **List of Patents**

- [1] **Quan Pan** and C. Patrick Yue, "A cascaded equalization topology with slow roll-up frequency response," under preparation.

## **Appendix A: MATLAB Program for Curve Fitting**

```
NZ = 5;   % number of ZEROS in the filter to be designed
NP = 5;   % number of POLES in the filter to be designed
NG = 30;  % number of gain measurements
f = 2*pi*1e6*[measured frequency information]; % measurement frequency axis
% Gain measurements (synthetic example = triangular amp response):
Gdb = [measured gain response information]; %
% Must decide on a dc value.
% Either use what is known to be true or pick something "maximally
% smooth". Here we do a simple linear extrapolation:
dc_amp = 0;
fmin = f(1);
fmax = f(NG);
fs = 4*fmax; % discrete-time sampling rate
Nfft = 1024000; % FFT size to use
% Must also decide on a value at half the sampling rate.
% Use either a realistic estimate or something "maximally smooth".
% Here we do a simple linear extrapolation. While zeroing it
% is appealing, we do not want any zeros on the unit circle here.
Gdb_last_slope = (Gdb(NG) - Gdb(NG-1)) / (f(NG) - f(NG-1));
nyq_amp = Gdb(NG) + Gdb_last_slope * (fs/2 - f(NG));
Gdbe = [dc_amp, Gdb, nyq_amp];
fe = [0, f, fs/2];
NGe = NG+2;
% Resample to a uniform frequency grid, as required by ifft.
% We do this by fitting cubic splines evaluated on the fft grid:
Gdbei = spline(fe,Gdbe); % say `help spline'
fk = fs*[0:Nfft/2]/Nfft; % fft frequency grid (nonneg freqs)
Gdbfk = ppval(Gdbei,fk); % Uniformly resampled amp-resp
figure(1);
semilogx(fk(2:end-1),Gdbfk(2:end-1),'-k'); grid('on');
axis([fmin/2 fmax*2 Gdb(NG)-10 Gdb(1)+10]);
hold('on'); semilogx(f,Gdb,'ok');
xlabel('Frequency (Hz)'); ylabel('Magnitude (dB)');
title(['Measured and Extrapolated/Interpolated/Resampled ',...
'Amplitude Response']);
Ns = length(Gdbfk); if Ns~=Nfft/2+1, error('confusion'); end
Sdb = [Gdbfk,Gdbfk(Ns-1:-1:2)]; % install negative-frequencies
S = 10 .^ (Sdb/20); % convert to linear magnitude
s = ifft(S); % desired impulse response
s = real(s); % any imaginary part is quantization noise
tlerr = 100*norm(s(round(0.9*Ns:1.1*Ns)))/norm(s);
disp(sprintf(['Time-limitedness check: Outer 20%% of impulse ' ...
'response is %0.2f %% of total rms'],tlerr));
% = 0.02 percent
if tlerr>1.0 % arbitrarily set 1% as the upper limit allowed
  error('Increase Nfft and/or smooth Sdb');
end
figure(2);
plot(s, '-k'); grid('on'); title('Impulse Response');
xlabel('Time (samples)'); ylabel('Amplitude');
c = ifft(Sdb); % compute real cepstrum from log magnitude spectrum
% Check aliasing of cepstrum (in theory there is always some):
caliaserr = 100*norm(c(round(Ns*0.9:Ns*1.1)))/norm(c);
disp(sprintf(['Cepstral time-aliasing check: Outer 20%% of ' ...
'cepstrum holds %0.2f %% of total rms'],caliaserr));
% = 0.09 percent
if caliaserr>1.0 % arbitrary limit
error('Increase Nfft and/or smooth Sdb to shorten cepstrum');
end
% Fold cepstrum to reflect non-min-phase zeros inside unit circle:
% If complex:
% cf = [c(1), c(2:Ns-1) + conj(c(Nfft:-1:Ns+1)), c(Ns), zeros(1,Nfft-Ns)];
cf = [c(1), c(2:Ns-1)+c(Nfft:-1:Ns+1), c(Ns), zeros(1,Nfft-Ns)];
Cf = fft(cf); % = dB_magnitude + j * minimum_phase
Smp = 10 .^ (Cf/20); % minimum-phase spectrum
Smpp = Smp(1:Ns); % nonnegative-frequency portion
wt = 1 ./ (fk+1); % typical weight fn for audio
wk = 2*pi*fk/fs;
[B,A] = invfreqz(Smpp,wk,NZ,NP,wt);
Hh = freqz(B,A,Ns);
figure(3);
plot(fk,db([Smpp(:),Hh(:)])); grid('on');
xlabel('Frequency (Hz)');
ylabel('Magnitude (dB)');
title('Magnitude Frequency Response');
legend('Desired','Filter');
```
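For readers without MATLAB, the central step of the listing above, building a minimum-phase spectrum from a dB magnitude response by folding the real cepstrum, can be sketched in Python with NumPy. This is only an illustrative counterpart under stated assumptions: the function name and the synthetic low-pass magnitude are hypothetical, the input is assumed to be already resampled onto a uniform grid from 0 to fs/2, and the final rational fit done by `invfreqz` is not reproduced.

```python
import numpy as np


def minimum_phase_spectrum(mag_db_half):
    """Minimum-phase spectrum (nonnegative frequencies) from a dB magnitude
    sampled on a uniform grid covering 0..fs/2 inclusive."""
    mag_db_half = np.asarray(mag_db_half, dtype=float)
    ns = len(mag_db_half)                  # Ns = Nfft/2 + 1
    nfft = 2 * (ns - 1)
    # Install negative frequencies so the log spectrum is real and even.
    sdb = np.concatenate([mag_db_half, mag_db_half[-2:0:-1]])
    c = np.fft.ifft(sdb).real              # real cepstrum of the dB magnitude
    # Fold the anti-causal half of the cepstrum onto the causal half.
    cf = np.zeros(nfft)
    cf[0] = c[0]
    cf[1:ns - 1] = c[1:ns - 1] + c[nfft - 1:ns - 1:-1]
    cf[ns - 1] = c[ns - 1]
    cf_spec = np.fft.fft(cf)               # dB magnitude + j * phase (dB-scaled)
    return (10.0 ** (cf_spec / 20.0))[:ns]


# Synthetic example (hypothetical data): first-order low-pass magnitude in dB.
fk = np.linspace(0.0, 0.5, 513)            # normalized frequency, 0..fs/2
gdb = -10.0 * np.log10(1.0 + (fk / 0.05) ** 2)
smp = minimum_phase_spectrum(gdb)
err_db = np.max(np.abs(20 * np.log10(np.abs(smp)) - gdb))
print(f"max magnitude error after folding: {err_db:.2e} dB")
```

Folding changes only the phase, so the magnitude of the returned spectrum should match the input to within rounding error; the printed error is a convenient sanity check before any filter fitting.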

# **Appendix B: MATLAB Program for Gain Conversion**

```
% This program converts measured S-parameters (S21) to transimpedance.
% Load the csv file as "data": 1st column is frequency; 2nd is S21 (dB); 3rd is S11 (dB).

freq=data(:,1); % get freq as x-axis

S11_in=data(:,3);

S21_in=data(:,2);

S11_lin=db2mag(S11_in)+0.05;

S21_lin=db2mag(S21_in);

Zt=(50.*S21_lin)./(1-S11_lin); % convert s21 to Zt

Ztlog=20.*log10(Zt);

%semilogx(freq,Z20log,'k-',freq,Z50log,'b--');

semilogx(freq,Ztlog,'k-');
```
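The same magnitude-only conversion, $Z_T = Z_0\,|S_{21}| / (1 - |S_{11}|)$ with $Z_0 = 50\ \Omega$, can be sketched in Python as follows. The function name and the sample values are hypothetical. The 0.05 offset that the MATLAB listing adds to the linear S11 magnitude is left out here because its purpose is not stated.

```python
import numpy as np


def s21_to_transimpedance_db(s21_db, s11_db, z0=50.0):
    """Return 20*log10(Zt) in dB-ohm, with Zt = z0 * |S21| / (1 - |S11|).

    Inputs are S-parameter magnitudes in dB; phase is ignored, as in the
    MATLAB listing above.
    """
    s21_lin = 10.0 ** (np.asarray(s21_db, dtype=float) / 20.0)
    s11_lin = 10.0 ** (np.asarray(s11_db, dtype=float) / 20.0)
    zt = z0 * s21_lin / (1.0 - s11_lin)
    return 20.0 * np.log10(zt)


# Hypothetical measurement points: S21 = 20 dB and 14 dB, S11 = -10 dB.
zt_db = s21_to_transimpedance_db([20.0, 14.0], [-10.0, -10.0])
print(np.round(zt_db, 2))
```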

# **Appendix C: Labjack Control for the 64-bit Shift Register**

DB15 connector pin assignment:

| DB15# | Color  | Name | Function             |
|-------|--------|------|----------------------|
| 1     | black  | 5V   | USB 5V               |
| 9     | white  | CIO0 | clock (rising edge)  |
| 2     | gray   | CIO1 | clock (falling edge) |
| 10    | violet | CIO2 | reserved             |
| 3     | blue   | CIO3 | reserved             |
| 11    | green  | GND  | GND                  |
| 4     | yellow | EIO0 | bit 0 (LSB)          |
| 12    | orange | EIO1 | bit 1                |
| 5     | red    | EIO2 | bit 2                |
| 13    | brown  | EIO3 | bit 3                |
| 6     | black  | EIO4 | bit 4                |
| 14    | white  | EIO5 | bit 5                |
| 7     | gray   | EIO6 | bit 6                |
| 15    | violet | EIO7 | bit 7 (MSB)          |
| 8     | blue   | GND  | GND                  |

## Shift Register A:

```
10011 // CLK RST DIN SEL1 SEL0
00011
11011 //RST
01011 //RST
11011 //RST
01011 //RST
10011 //D32-->SR_OUT
00011
10111 //D31
00111
10111 //D30
00111
10011 //D29
00011
10011 //D28
00011
10011 //D27
00011
10011 //D26
00011
10011 //D25
00011
10011 //D24
00011
10011 //D23
00011
10011 //D22
00011
10011 //D21 TIA_B3
00011
10111 //D20 TIA_B2
00111
10011 //D19
00011
10011 //D18
00011
10011 //D17 CH3_B2
00011
10111 //D16 CH3_B3
00111
10111 //D15 CH3_B4
00111
10011 //D14 CH2_B2
00011
10111 //D13 CH2_B3
00111
10111 //D12 CH2_B4
00111
10011 //D11
00011
10111 //D10 EQ3_B3
00111
10111 //D9 EQ3_B2
00111
10111 //D8 EQ3_B1
00111
10111 //D7 EQ3_B0
00111
10111 //D6 EQ1_B3
00111
10111 //D5 EQ1_B2
00111
10111 //D4 EQ1_B1
00111
10111 //D3 EQ1_B0
00111
10011 //D2
00011
10011 //D1 CH1_B0
00011
10011 //D0 CH1_B1
00011
10011 //D--No use
00011
10011 //D--No use
00011
00011 //No clock --> Address:SR
00011 //No clock
00000 //No clock --> Address:Reg A last number changed
00000 //No Clock last number changed
10000 //Clock --> Address:RegA
00000
10000
00000 //Data in RegA
00000
00000
00000
00000
```

## Shift Register B:

```
10011 // CLK RST DIN SEL1 SEL0
00011
11011 //RST
01011 //RST
11011 //RST
01011 //RST
10011 //D32-->SR_OUT
00011
10111 //D31
00111
10111 //D30
00111
10011 //D29
00011
10011 //D28
00011
10011 //D27
00011
10011 //D26
00011
10011 //D25
00011
10011 //D24
00011
10011 //D23
00011
10011 //D22
00011
10011 //D21 TIA_B3
00011
10111 //D20 TIA_B2
00111
10011 //D19
00011
10011 //D18
00011
10011 //D17 CH3_B2
00011
10111 //D16 CH3_B3
00111
10111 //D15 CH3_B4
00111
10011 //D14 CH2_B2
00011
10111 //D13 CH2_B3
00111
10111 //D12 CH2_B4
00111
10011 //D11
00011
10111 //D10 EQ3_B3
00111
10111 //D9 EQ3_B2
00111
10111 //D8 EQ3_B1
00111
10111 //D7 EQ3_B0
00111
10111 //D6 EQ1_B3
00111
10111 //D5 EQ1_B2
00111
10111 //D4 EQ1_B1
00111
10111 //D3 EQ1_B0
00111
10011 //D2
00011
10011 //D1 CH1_B0
00011
10011 //D0 CH1_B1
00011
10011 //D--No use
00011
10011 //D--No use
00011
00011 //No clock --> Address:SR
00011 //No clock
00000 //No clock --> Address:Reg B last number changed
00000 //No Clock last number changed
10000 //Clock --> Address:RegB
00000
10000
00000 //Data in RegB
00000
00000
00000
00000
```
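The data-shifting part of the listings follows a regular pattern: each data bit appears as two 5-bit words in the order CLK RST DIN SEL1 SEL0, first with the clock high and then with the clock low while DIN is held. A minimal sketch of a generator for that pattern is given below. The function names are hypothetical, the select bits are assumed to stay at 11 while shifting (as in the listings), and the reset and register-address phases are not covered.

```python
def word(clk, rst, din, sel1, sel0):
    """Pack one control word in the listing's bit order: CLK RST DIN SEL1 SEL0."""
    return f"{clk}{rst}{din}{sel1}{sel0}"


def shift_in(bits, sel1=1, sel0=1):
    """Yield two words per data bit: clock high, then clock low, DIN held."""
    for din in bits:
        yield word(1, 0, din, sel1, sel0)
        yield word(0, 0, din, sel1, sel0)


# First three data bits of the listings: D32 = 0, D31 = 1, D30 = 1.
for w in shift_in([0, 1, 1]):
    print(w)
```

Generating the words from a bit list keeps the clock phases paired automatically, which makes it harder to drop a clock-high line by hand.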