Volume 407, Verlagsschriftenreihe des Heinz Nixdorf Instituts. Prof. Dr.-Ing. J. Christoph Scheytt (Ed.), Schaltungstechnik (Circuit Technology)

Abdul Rehman Javed

Mixed-Signal Baseband Circuit Design for High Data Rate Wireless Communication in Bulk CMOS and SiGe BiCMOS Technologies

HEINZ NIXDORF INSTITUT UNIVERSITÄT PADERBORN

Entwurf von Mixed-Signal-Basisbandschaltungen für drahtlose Kommunikation mit hoher Datenrate in Bulk-CMOS- und SiGe-BiCMOS-Technologien

#### Bibliographic Information of the Deutsche Bibliothek

The Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.ddb.de.

Volume 407 of the Verlagsschriftenreihe des Heinz Nixdorf Instituts

© Heinz Nixdorf Institut, Universität Paderborn – Paderborn – 2022

ISSN (Online): 2365-4422 ISBN: 978-3-947647-26-2

This work, including all of its parts, is protected by copyright. Any use outside the narrow limits of the German Copyright Act (Urheberrechtsgesetz) without the consent of the editors and the author is prohibited and punishable by law. This applies in particular to reproduction, translation, microfilming, and storage and processing in electronic systems.

Available free of charge as an electronic version from the Digital Collections (Digitale Sammlungen) of the Paderborn University Library.

Typesetting and design: Abdul Rehman Javed

# Mixed-Signal Baseband Circuit Design for High Data Rate Wireless Communication in Bulk CMOS and SiGe BiCMOS Technologies

Dissertation approved by the Faculty of Electrical Engineering, Computer Science and Mathematics of Universität Paderborn for the award of the academic degree of DOKTOR DER INGENIEURWISSENSCHAFTEN (Dr.-Ing.)

by M.Sc. Abdul Rehman Javed, Paderborn

- Date of colloquium: 31 August 2022
- Referee: Prof. Dr.-Ing. J. Christoph Scheytt
- Co-referee: Prof. Dr.-Ing. Rolf Kraemer

#### Prior Publications

- [SJ15] J. C. SCHEYTT, AND A. R. JAVED: 100 Gigabit pro Sekunde und mehr für das drahtlose Hochgeschwindigkeits-Internet. In: ForschungsForum Paderborn, Mar. 2015.
- [SJ15] J. C. SCHEYTT, AND A. R. JAVED: Mixed-Signal Baseband Processing for 100 Gbit/s Communications. In: European Microwave Week 2015, Paris, France, Sep. 2015.
- [SJ15] J. C. SCHEYTT, AND A. R. JAVED: Shifting the Analog-Digital Boundary in Signal Processing: Should We Use Mixed-Signal "Approximate" Computing? In: Workshop on Approximate Computing, Paderborn, Germany, Oct. 2015.
- [JS15] A. R. JAVED, AND J. C. SCHEYTT: System Design and Simulation of a PSSS Based Mixed Signal Transceiver for a 20 Gbps Bandwidth Limited Communication Link. In: 2015 1st URSI Atlantic Radio Science Conference (URSI AT-RASC), pp. 1-1, 2015.
- [JS15] A. R. JAVED, J. C. SCHEYTT, K. KRISHNEGOWDA, AND R. KRAEMER: System Design Considerations for a PSSS Transceiver for 100Gbps Wireless Communication with Emphasis on Mixed-Signal Implementation. In: IEEE Wireless and Microwave Technology Conference (WAMICON), Florida, 2015.
- [JS16] A. R. JAVED, J. C. SCHEYTT, AND U. V. D. AHE: Linear Ultra-Broadband NPN-Only Analog Correlator at 33 Gbps in 130 nm SiGe BiCMOS Technology. In: IEEE Bipolar/BiCMOS Circuits and Technology Meeting (BCTM), 2016, New Brunswick, 2016.
- [JS17] A. R. JAVED, J. C. SCHEYTT, K. KRISHNEGOWDA, AND R. KRAEMER: System Design of a Mixed Signal PSSS Transceiver Using a Linear Ultra-Broadband Analog Correlator for the Receiver Baseband Designed in 130 nm SiGe BiCMOS Technology. In: IEEE EUROCON 2017-17th International Conference on Smart Technologies, pp. 228-233, 2017.
- [JS17] J. C. SCHEYTT, A. R. JAVED, ET AL.: 100 Gbps Wireless System and Circuit Design Using Parallel Spread-Spectrum Sequencing. In: Frequenz, vol. 71, no. 9-10, pp. 399-414, 2017.
- [KJ18] K. KRISHNEGOWDA, A. R. JAVED, L. WIMMER, A. C. WOLF, J. C. SCHEYTT AND R. KRAEMER: PSSS Transmitter for a 100 Gbps Data Rate Communication in THz Frequency Band. In: 2018 26th Telecommunications Forum (TELFOR), pp. 1-5, 2018.
- [KWJ18] K. KRISHNEGOWDA, L. WIMMER, A. R. JAVED, A. C. WOLF, J. C. SCHEYTT, AND R. KRAEMER: Analysis of PSSS Modulation for Optimization of DAC Bit Resolution for 100 Gbps Systems. In: 15th International Symposium on Wireless Communication Systems (ISWCS), Lisbon, 2018.
- [JS20] A. R. JAVED, AND J. C. SCHEYTT: M-Sequence Radar for High Resolution Ranging with Mixed-Signal Radar Receiver Baseband Using 130nm SiGe BiCMOS Technology. In: 2020 17th European Radar Conference (EuRAD), Utrecht, 2020.
- [JS20] A. R. JAVED, J. C. SCHEYTT, ET AL.: Real100G.com. In: Wireless 100 Gbps and Beyond, R. Kraemer and S. Scholz, Eds., Frankfurt (Oder), Germany, IHP - Innovations for High Performance Microelectronics, 2020, pp. 231-294.
- [SJ20] J. C. SCHEYTT, A. R. JAVED, ET AL.: Real100G Ultrabroadband Wireless Communication at High mm-Wave Frequencies. In: Wireless 100 Gbps and Beyond, R. Kraemer and S. Scholz, Eds., Frankfurt (Oder), Germany, IHP - Innovations for High Performance Microelectronics, 2020, pp. 213-230.
- [JS20] A. R. JAVED, AND J. C. SCHEYTT: Mixed-Signal Receiver Baseband Slice for High-Data-Rate Communication Using 130 nm SiGe BiCMOS Technology. In: 64th International Midwest Symposium on Circuits and Systems (MWSCAS 2021), East Lansing, 2021.

### Summary (Zusammenfassung)

The conventional digital baseband architecture, which relies on high-performance digital signal processing and on data converters with wide bandwidth, high data rate, and high resolution, leads to high power dissipation in high data rate wireless communication systems. This dissertation investigates analog signal processing in a mixed-signal baseband architecture as a way to reduce power dissipation and circuit complexity. To this end, parallel-sequence spread spectrum (PSSS) modulation is examined because it is well suited to a mixed-signal baseband implementation. The proposed mixed-signal baseband circuit has a modular, symbol-sliced architecture and enables channel equalization by weighting the chips of the decoding sequence. A complete unit slice of the receiver baseband was fabricated in 130 nm SiGe BiCMOS technology. The measurement results show very good linearity and high-frequency performance for both BPSK and PAM-4 data at a PSSS chip rate of 20 Gcps. However, the use of current mode logic with high supply voltages of 1.2 V and -4 V led to high power dissipation. For the transmitter baseband circuit, a CMOS technology was preferred so that both the digital baseband core and the high-speed analog components could be realized in a single integrated circuit. The transmitter baseband components were fabricated and tested in 65 nm bulk CMOS technology. The 65 nm technology, however, was not fast enough for the high-speed broadband analog multiplexer circuit of the transmitter baseband, which pointed to a migration to a more scaled CMOS technology. To reduce the power dissipation of the mixed-signal receiver baseband relative to the SiGe BiCMOS implementation, and to enable the high-speed operation required by the transmitter analog multiplexer, the proposed mixed-signal architecture was implemented in a 28 nm bulk CMOS technology, which reduced the power dissipation considerably. Owing to limited time and financial resources, however, no integrated circuits were fabricated in the 28 nm bulk CMOS technology. An interesting application of the receiver baseband chip as a high-resolution ranging radar is also discussed, together with measurement results that enable a range resolution down to 7.5 mm.

# Abstract

The conventional digital baseband architecture, which employs high-performance digital signal processing together with wide-bandwidth, high data rate, high-resolution data converters, results in large power dissipation for high data rate wireless communication systems. This dissertation investigates analog signal processing in a mixed-signal baseband architecture as a way to reduce power dissipation and circuit complexity. For this purpose, parallel-sequence spread spectrum (PSSS) modulation is investigated owing to its suitability for a mixed-signal baseband implementation. The proposed mixed-signal baseband circuit has a symbol-sliced architecture and allows channel equalization by weighting the chips of the decoding sequence. A complete unit slice of the receiver baseband was fabricated in 130 nm SiGe BiCMOS technology. The measurement results show very good linearity and high-speed performance for both BPSK and PAM-4 data at a PSSS chip rate of 20 Gcps. However, the use of current mode logic with large supply voltages of 1.2 V and -4 V results in large power dissipation. For the transmitter baseband, CMOS technology was preferred to allow the realization of both the digital baseband core and the high-speed analog components on a single chip. Transmitter baseband components were fabricated and tested in 65 nm bulk CMOS technology. The 65 nm technology was, however, not fast enough for the design of the high-speed, broadband analog multiplexer component of the transmitter baseband, which necessitated a migration to a more scaled CMOS technology. Therefore, to reduce the power dissipation of the mixed-signal receiver baseband compared with the SiGe BiCMOS implementation, and to enable the high-speed operation required for the transmitter analog multiplexer, the proposed mixed-signal baseband was implemented in a 28 nm bulk CMOS technology, which significantly reduced the power dissipation. However, the circuit was not fabricated in 28 nm bulk CMOS technology owing to limited time and budgetary resources. An interesting application of the receiver baseband chip as a high-resolution ranging radar is also discussed, along with measured results that allow a distance resolution down to 7.5 mm.
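As a plausibility check on the 7.5 mm figure, the textbook range-resolution relation for a pulse-compression radar, ΔR = c / (2B), can be evaluated numerically. The sketch below assumes that the occupied bandwidth B equals the 20 Gcps chip rate quoted above; that equality, and the helper name `range_resolution_m`, are illustrative assumptions and are not taken from the thesis.

```python
# Minimal sketch: range resolution of a pulse-compression radar.
# Assumption (not from the thesis): bandwidth B is set equal to the chip rate.

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact SI value


def range_resolution_m(bandwidth_hz: float) -> float:
    """Return the range resolution c / (2 * B) in metres."""
    if bandwidth_hz <= 0:
        raise ValueError("bandwidth_hz must be positive")
    return SPEED_OF_LIGHT_M_PER_S / (2.0 * bandwidth_hz)


chip_rate_hz = 20e9  # 20 Gcps, as stated in the abstract
resolution_mm = range_resolution_m(chip_rate_hz) * 1e3
print(f"Range resolution: {resolution_mm:.2f} mm")
```

Under this assumption the result rounds to 7.5 mm, which is consistent with the value reported in the abstract. Halving the bandwidth doubles the resolution cell.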

# Contents

| 1 | Introduction |
|---|---|
| 1.1 | Motivation |
| 1.1.1 | Possible System Approaches for 100 Gbps Wireless Communication |
| 1.1.2 | Proposed Solution |
| 1.2 | State of the Art |
| 1.2.1 | State of the Art in High-Speed Data Converters |
| 1.2.2 | State of the Art in Ultra-Broadband Wireless Systems |
| 1.2.3 | State of the Art in Analog Signal Processing for Ultra-Broadband Wireless Systems |
| 1.2.4 | State of the Art for PSSS Transceiver Implementations |
| 1.3 | Organization of Thesis |
| 1.4 | List of Relevant Publications of the Author |
| 1.5 | Summary |

| 2 | Parallel Sequence Spread Spectrum (PSSS) System Analysis and Design |
|---|---|
| 2.1 | Spread Spectrum Communication |
| 2.2 | Fundamentals of PSSS |
| 2.2.1 | Mathematical Overview |
| 2.2.2 | PSSS System Analysis |
| 2.2.3 | Receiver Sensitivity Analysis |
| 2.3 | PSSS System Architecture |
| 2.3.1 | Digital vs. Mixed-Signal Baseband Architecture |
| 2.3.2 | Transmitter Baseband Architecture |
| 2.3.3 | Receiver Baseband Architecture |
| 2.4 | PSSS Mixed-Signal System Design |
| 2.4.1 | System Design Parameters |
| 2.4.2 | System Model and Specifications |
| 2.4.3 | PSSS Amplitude Distribution |
| 2.5 | Channel Equalization by Chip Weighting |
| 2.5.1 | Choice of the Training Sequence Data |
| 2.5.2 | Calculation of Chip Weights |
| 2.5.3 | Practical Limitations in Determination of Chip Weights |
| 2.6 | List of Relevant Publications of the Author |
| 2.7 | Summary |

| 3 | Design and Validation of Critical Components for a Mixed-Signal PSSS Baseband |
|---|---|
| 3.1 | Semiconductor Technology |
| 3.1.1 | IHP SG13S 130 nm BiCMOS Technology |
| 3.1.2 | TSMC 65 nm Bulk CMOS Technology |
| 3.1.3 | GlobalFoundries 28 nm Bulk CMOS Technology |
| 3.2 | |
| 3.3 | |
| 3.3.1 | |
| 3.3.2 | |
| 3.3.3 | |
| 3.3.4 | |
| 3.3.5 | |
| 3.3.6 | |
| 3.4 | |
| 3.4.1 | Comparison with the State of the Art |
| | Broadband Correlator |
| | Multiplier Circuit |
| | Integrator Circuit |
| | Generation of the Integrate or Reset Command Signal |
| | Characterization of the Broadband Correlator Circuit |
| | Linearity Test |
| | Correlation Test |
| | Comparison with the State of the Art |
| | Receiver Baseband Simulations |
| | Transmitter Baseband |
| | DAC |
| | Digital Baseband Core |
| | Analog Multiplexer |
| 3.9 | Power Dissipation of the SiGe BiCMOS Implementation |
| 3.10 | Comparison with 28 nm CMOS Implementation |
| 3.11 | List of Relevant Publications of the Author |
| 3.12 | Summary |

| 4 | Design of a Complete Mixed-Signal PSSS Receiver Baseband Unit-Slice |
|---|---|
| 4.1 | Post-Correlation Receiver Components |
| 4.2 | Sample-and-Hold Circuit |
| 4.2.1 | |
| 4.2.2 | |
| 4.3 | Inter-Chip Phase Alignment (Voltage Controlled Delay Line) |
| 4.4 | Multioutput Stage |
| 4.5 | Layout Design |
| 4.6 | Concept for a Complete PSSS Receiver Baseband Circuit |
| 4.6.1 | Automatic Gain Control |
| 4.6.2 | Active Power Divider |
| 4.6.3 | Automatic Offset Adjustment |
| 4.6.4 | Digital Controls for the Chip |
| 4.7 | Summary |

| 5 | |
|---|---|
| 5.1 | High Speed Printed Circuit Board (PCB) Design |
| 5.1.1 | RF Connectors |
| 5.1.2 | Layer Stack-Up |
| 5.1.3 | Bond Wire Modeling |
| 5.1.4 | Transmission Line Structure |
| 5.1.5 | S-Parameter Measurement Results |
| 5.2 | Standalone Receiver Baseband Unit-Slice Measurements |
| 5.2.1 | |
| 5.2.2 | Testing with BPSK Data |
| 5.2.3 | Testing with PAM-4 Data |
| 5.3 | System Integration |
| 5.4 | Application as a High-Resolution Ranging Radar |
| 5.4.1 | Measurement Results |
| 5.5 | Channel Equalization Test |
| 5.6 | List of Relevant Publications of the Author |
| 5.7 | Summary |

| 6 | Conclusions and Outlook |
|---|---|
| 6.1 | No Reset Correlator |
| 6.2 | Multiple Resettable Correlators with Flags |
| 6.3 | Larger Codes with Reconfigurable Hardware |
| 6.4 | Reduction of the Power Dissipation |
| 6.5 | Summary |

Bibliography

# **Table of Figures**

| Fig. 1-1 | Average mm-wave atmospheric absorption spectrum [2]                                   |
|----------|---------------------------------------------------------------------------------------|
| Fig. 1-2 | Figurative explanation of the parallel sequence spread spectrum (PSSS)                |
|          | technique. [10]                                                                       |
| Fig. 1-3 | Comparison between the proposed analog/ mixed-signal transceiver                      |
|          | architecture and the conventional digital transceiver architecture                    |
| Fig. 1-4 | Proposed architecture for the 100 Gbps wireless transmitter with mixed-signal         |
|          | PSSS TX baseband [12]                                                                 |
| Fig. 1-5 | Proposed architecture for the 100 Gbps wireless receiver with the mixed-              |
|          | signal PSSS receiver baseband [12] 10                                                 |
| Fig. 2-1 | $E_b/N_o$ in dB vs. bit error rate (BER) for PAM-16 using bertool from                |
|          | MATLAB                                                                                |
| Fig. 2-2 | Proposed system architecture for PSSS based communication link using a                |
|          | Terahertz radio frequency frontend [13]                                               |
| Fig. 2-3 | Figurative representation of the proposed PSSS mixed-signal baseband                  |
|          | architecture [13]                                                                     |
| Fig. 2-4 | Proposed mixed-signal PSSS transmitter baseband architecture [12] [73]. 38            |
| Fig. 2-5 | Proposed mixed-signal PSSS receiver baseband architecture [12] [73] 39                |
| Fig. 2-6 | Proposed architecture of the mixed-signal PSSS baseband [73] 44                       |
| Fig. 2-7 | PSSS amplitude distribution for BPSK data with a single m-sequence of                 |
|          | length 7 as the bipolar coding sequence {1, 1, 1, -1, -1, 1, -1}. To obtain the       |
|          | complete amplitude distribution for all 7 coding vectors combined, all the            |
|          | number of occurrences have to be multiplied by 7                                      |
| Fig. 2-8 | Normalized distribution of PSSS amplitudes for PAM-16 data encoded with               |
|          | the bipolar version of m-sequences of length 15 [75]                                  |
| Fig. 2-9 | Bit error rate (BER) vs. the number of PSSS amplitudes clipped-off [75]. 50           |
| Fig. 3-1 | Variation of the transit frequency of the transistor with the collector current $I_c$ |
|          | and the collector to emitter voltage $V_{ce}$ for the emitter length of 480 nm (left) |
|          | and 840 nm (right). The $V_{ce}$ variation is as follows: 0.4 V (red), 0.8 V (blue),  |
|          | 1.2 V (magenta), and 1.6 V (green) [76]                                               |
| Fig. 3-2 | Simplified block diagram of the mixed-signal receiver baseband [69] 58                |
| Fig. 3-3 | Details of the unit-slice architecture                                                |
| Fig. 3-4 | The detailed block diagram of the mixed-signal weighted code generator                |
|          | circuit                                                                               |
| Fig. 3-5 | Schematic of the CMOS R-2R DAC                                                        |
| Fig. 3-6 | Schematic of the differential voltage-to-current conversion circuit                   |
| Fig. 3-7 | One-hot pulse generation circuit                                                      |
| Fig. 3-8 | D flip-flop schematic with set or reset functionality                                 |
| Fig. 3-9 | Clock tree for providing clock to the 18 DFFs. The schematic values of the            |
|          | resistive and capacitive peaking are different for different stages                   |

| Fig. 3-10           | Schematic of the selectable feedback delay circuit with a possibility to select one out of 4 delay options. The CMOS ring register is not shown $66$ |
|---------------------|------------------------------------------------------------------------------------------------------------------------------------------------------|
| Fig. 3-11           | Simulation results of the selectable feedback delay selection circuit showing                                                                        |
| 11g. 5-11           | the four possible delayed feedback pulses with either the 16 <sup>th</sup> or the 17 <sup>th</sup> DEE                                               |
|                     | output pulses as the input                                                                                                                           |
| $E_{\alpha} = 2.12$ | Schemetic diagram of the high around differential surrout switch                                                                                     |
| F1g. 5-12           | Schematic diagram of the mgn-speed differential current switch                                                                                       |
| F1g. 3-13           | Schematic of the 6-to-1 current summation cell to combine 6 out of 18 current                                                                        |
|                     | outputs                                                                                                                                              |
| Fig. 3-14           | Schematic of the transadmittance stage to convert the sum of 6 voltage signals                                                                       |
|                     | to voltage signals. Three copies of this circuit are needed to combine the 18                                                                        |
|                     | current output signals                                                                                                                               |
| Fig. 3-15           | Combining the 3 current summation cell outputs to generate the final mux                                                                             |
|                     | output voltage signal70                                                                                                                              |
| Fig. 3-16           | Simulation result of the weighted code generator circuit with a 30 GHz clock |
|                     | signal used as the select signal for the analog current switching mux |
| Fig. 3-17           | Measured output signal showing the integral of the output generated by the                                                                           |
|                     | weighted code generator circuit                                                                                                                      |
| Fig. 3-18           | Schematic of the four-quadrant multiplier circuit                                                                                                    |
| Fig. 3-19           | Integrator core with fast reset and manual offset correction circuits                                                                                |
| Fig. 3-20           | Simulation results showing the alignment of the reset/integrate command |
|                     | signal with the input of the correlator |
| Fig. 3-21           | Block diagram of the test-chip with the external inputs and outputs indicated |
| Fig. 3-22           | Measurement setup to characterize the correlator circuit |
| Fig. 3-23           | Chip microphotographs of the broadband fast resettable correlator test-chip |
| Fig. 3-24           | Step-response of the correlator for all 4 combinations of multiplier inputs, i.e. |
|                     | {++, +-, -+, --} |
| Fig. 3-25           | Correlator output (single-ended) with bipolar m-sequences applied at both |
|                     | inputs (with and without time shift) with a data rate of 28 Gbps |
| Fig. 3-26           | Correlator output (single-ended) with bipolar m-sequences applied at both |
| Fig. 3-27           | Eye diagram of the decoded PSSS waveform at the output of the S/H circuit |
|                     | for the case of PSSS data generated by the encoding of BPSK data with a                                                                              |
|                     | PSSS chip rate of 30 Gcps                                                                                                                            |
| Fig. 3-28           | Eye diagram of the decoded PAM-4 waveform at the output of the S/H circuit                                                                           |
|                     | for the case of PSSS data generated by the encoding of PAM-4 data with a                                                                             |
|                     | PSSS chip rate of 30 Gcps                                                                                                                            |
| Fig. 3-29           | Block diagram of the proposed C-2C DAC circuit connected to an op-amp                                                                                |
|                     |                                                                                                                                                      |
| Fig. 3-30           | Integral non-linearity (INL) of the C-2C DAC calculated using simulated                                                                              |
|                     | results                                                                                                                                              |

| Fig. 3-31 | Differential non-linearity (DNL) of the C-2C DAC calculated using simulated results |
|-----------|--------------------------------------------------------------------------------------|
| Fig. 3-32 | Schematic diagram of the 7-bit current steering DAC |
| Fig. 3-33 | Differential non-linearity (DNL) of the current-steering DAC calculated using simulated results |
| Fig. 3-34 | Integral non-linearity (INL) of the current steering DAC calculated using simulated results |
| Fig. 3-35 | Chip layout of the current steering DAC test-chip |
| Fig. 3-36 | Measured single-ended output result of the DAC test-chip at a clock rate of 1.667 GHz |
| Fig. 3-37 | PSSS signal generation for a single data vector using matrix operations for digital implementation |
| Fig. 3-38 | Block diagram of the digital baseband core test-chip with an on-chip bit-sequence generator for testing purposes |
| Fig. 3-39 | Snapshot of the digital baseband core test-chip layout (left) and its chip microphotograph (right) |
| Fig. 3-40 | Measured results of the digital baseband test-chip showing two selected outputs generating the same output bit sequence. The results correspond exactly with the expected bit sequence on the two outputs |
| <ul> <li>Fig. 3-42 Eye diagram of the decoded PSSS waveform at the output of the S/H circuit for the case of PSSS data generated by the encoding of BPSK data with a PSSS chip rate of 30 Gcps</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                         | Fig. 3-41 | Block diagram of the programmable weighted code-generator as implemented in the 28 nm CMOS technology                                                                                                                                                                                                                                      |
| <ul> <li>Fig. 3-43 Eye diagram of the decoded PSSS waveform at the output of the S/H circuit for the case of PSSS data generated by the encoding of PAM-4 data with a PSSS chip rate of 30 Gcps</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
                                                                                                         | Fig. 3-42 | Eye diagram of the decoded PSSS waveform at the output of the S/H circuit<br>for the case of PSSS data generated by the encoding of BPSK data with a<br>PSSS chip rate of 30 Gcps                                                                                                                                                          |
| <ul> <li>Fig. 3-44 Simulation result of the analog mux designed in CMOS 28 nm technology. The upper trace shows the differential outputs of the DFF chain serving as the select signal for the mux. Note that the output of the mux during the (n+1)<sup>th</sup> cycle is different (here inverted version) from that during the n<sup>th</sup> cycle 98</li> <li>Fig. 4-1 Block diagram of the complete mixed-signal PSSS receiver baseband 101</li> <li>Fig. 4-2 Block diagram of the mixed-signal PSSS receiver baseband unit-slice 102</li> <li>Fig. 4-3 The digital interface from the output of the S/H circuit to the FPGA transceiver input including the generation of the required clock signals for the interface</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
                                                                                                         | Fig. 3-43 | Eye diagram of the decoded PSSS waveform at the output of the S/H circuit<br>for the case of PSSS data generated by the encoding of PAM-4 data with a<br>PSSS chip rate of 30 Gcps                                                                                                                                                         |
| <ul> <li>Fig. 4-1 Block diagram of the complete mixed-signal PSSS receiver baseband 101</li> <li>Fig. 4-2 Block diagram of the mixed-signal PSSS receiver baseband unit-slice 102</li> <li>Fig. 4-3 The digital interface from the output of the S/H circuit to the FPGA transceiver input including the generation of the required clock signals for the interface</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
                                                                                                         | Fig. 3-44 | Simulation result of the analog mux designed in CMOS 28 nm technology. The upper trace shows the differential outputs of the DFF chain serving as the select signal for the mux. Note that the output of the mux during the $(n+1)$ <sup>th</sup> cycle is different (here inverted version) from that during the n <sup>th</sup> cycle 98 |
| <ul> <li>Fig. 4-2 Block diagram of the mixed-signal PSSS receiver baseband unit-slice 102</li> <li>Fig. 4-3 The digital interface from the output of the S/H circuit to the FPGA transceiver input including the generation of the required clock signals for the interface</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
                                                                                                         | Fig. 4-1  | Block diagram of the complete mixed-signal PSSS receiver baseband 101                                                                                                                                                                                                                                                                      |
| <ul> <li>Fig. 4-3 The digital interface from the output of the S/H circuit to the FPGA transceiver input including the generation of the required clock signals for the interface.</li> <li>Fig. 4-4 The block diagram of the sample-and-hold circuit preceded by a limiting amplifier.</li> <li>Fig. 4-5 Schematic of the limiter circuit.</li> <li>105</li> <li>Fig. 4-6 Schematic of the track and hold circuit.</li> </ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                         | Fig. 4-2  | Block diagram of the mixed-signal PSSS receiver baseband unit-slice 102                                                                                                                                                                                                                                                                    |
| <ul> <li>Fig. 4-4 The block diagram of the sample-and-hold circuit preceded by a limiting amplifier</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
                                                                                                         | Fig. 4-3  | The digital interface from the output of the S/H circuit to the FPGA transceiver input including the generation of the required clock signals for the interface                                                                                                                                                                            |
| Fig. 4-5 Schematic of the limiter circuit                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
                                                                                                         | Fig. 4-4  | The block diagram of the sample-and-hold circuit preceded by a limiting amplifier                                                                                                                                                                                                                                                          |
| Fig. 4-6 Schematic of the track and hold circuit                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                         | Fig 4-5   | Schematic of the limiter circuit 104                                                                                                                                                                                                                                                                                                       |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                         | Fig. 4-6  | Schematic of the track and hold circuit                                                                                                                                                                                                                                                                                                    |

| Fig. 4-7  | Simulation results of the S/H circuit. Differential signals: input signal (red), output of the limiter (yellow), output of the T/H (green), output of the S/H (turquoise), and the T/H and S/H command signal (blue) |
|-----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Fig. 4-8  | M-sequence correlation parametric simulation with skew as the independent variable                                                                                                                                   |
| Fig. 4-9  | Block diagram of the VCDL with the schematic design of the VCDL part circuit                                                                                                                                         |
| Fig. 4-10 | Phase delay variation in degrees as a function of the Steer_Ctrl voltage input (simulated results)                                                                                                                   |
| Fig. 4-11 | Duty cycle in percentage as a function of the Steer_Ctrl voltage input (simulated results)                                                                                                                           |
| Fig. 4-12 | The schematic of the multioutput stage. Only one of the multiple outputs can be selected at any given time using a CMOS ring register initiated with the inverse of a one-hot pulse code (not shown here)            |
| Fig. 4-13 | The chip layout with the important circuit components highlighted |
| Fig. 4-14 | Block diagram of an automatic gain control amplifier                                                                                                                                                                 |
| Fig. 4-15 | Schematic of the variable gain amplifier (VGA)                                                                                                                                                                       |
| Fig. 4-16 | 4-stage binary tree for distribution of PSSS signals to the 15 RX unit-slices. The last node should be terminated in a dummy load impedance similar to the load impedance connected at the other nodes |
| Fig. 4-17 | The active signal distribution network for distribution of PSSS signal to the 15 receiver unit-slices                                                                                                                |
| Fig. 4-18 | The block diagram for automatic offset correction circuit                                                                                                                                                            |
| Fig. 4-19 | General idea for the implementation of a digital potentiometer (top). Using NMOS transistors as switches for the current application (bottom) |
| Fig. 5-1  | Microphotograph of the chip wire-bonded to the PCB substrate                                                                                                                                                         |
| Fig. 5-2  | Pictures of the 2.92 mm (K-type) prototype PCB connectors [94]                                                                                                                                                       |
| Fig. 5-3  | Layer stack-up for the PCB. Not drawn to scale                                                                                                                                                                       |
| Fig. 5-4  | Measurement of the depth of the cavity using a 3D microscope                                                                                                                                                         |
| Fig. 5-5  | Parameterized HFSS structure for FEM simulation with the chip on the left side and the PCB on the right side of the gap |
| Fig. 5-6  | HFSS S-parameter sweep simulation results                                                                                                                                                                            |
| Fig. 5-7  | A picture of the RF-PCB on the left side and the FR-4 PCB on the right that is mechanically mounted on the back side of the RF-PCB |
| Fig. 5-8  | Mixed-mode S-parameter measurement results with the help of a vector network analyzer (VNA) |
| Fig. 5-9  | Measurement setup for the standalone measurements                                                                                                                                                                    |
| Fig. 5-10 | Test sequence for the detection and correction of offset                                                                                                                                                             |
| Fig. 5-11 | Inter-chip alignment between the external PSSS input signal and the on-chip generated integrate/reset signal. In the misaligned case on the top, the external PSSS symbol is not correlated correctly whereas, in the aligned case on the bottom, the whole PSSS symbol will be used for correlation |

| Fig. 5-12 | The alignment sequence used for checking the alignment of the PSSS signal with the decoding sequence                                                                                                                                                   |
|-----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Fig. 5-13 | Eye diagram for the BPSK data as input for the PSSS stream                                                                                                                                                                                             |
| Fig. 5-14 | Eye diagram for the recovered data for the case of PAM-4 data as input for the PSSS sequences                                                                                                                                                          |
| Fig. 5-15 | Complete measurement setup for system integration of mixed-signal PSSS receiver baseband with RF-frontend and Costas loop for carrier recovery |
| Fig. 5-16 | Broadband m-sequence radar architecture                                                                                                                                                                                                                |
| Fig. 5-17 | Proposed m-sequence radar architecture with mixed-signal radar receiver baseband for high-resolution ranging                                                                                                                                           |
| Fig. 5-18 | Measurement setup with the microphotograph of the m-sequence radar receiver BB test-chip ( $1.85$ mm $\times 2$ mm) in the inset                                                                                                                       |
| Fig. 5-19 | The instantaneous differential output of the correlator                                                                                                                                                                                                |
| Fig. 5-20 | The sampled differential output of the correlator (S/H output) |
| Fig. 5-21 | Instantaneous output of the correlator circuit as a function of the sub-T <sub>chip</sub> skew                                                                                                                                                         |
| Fig. 5-22 | Measurement results of the bipolar m-sequence correlation with and without channel equalization                                                                                                                                                        |
| Fig. 5-23 | The sampled differential output of the correlator (S/H output) showing the correlation of bipolar m-sequences with channel equalization using an on-chip weighted code generator |
| Fig. 6-1  | Proposed schematic diagram of the no reset integrator circuit which forms a part of the no reset correlator circuit                                                                                                                                    |
| Fig. 6-2  | Block diagram of the multiple resettable correlators with flags |
| Fig. 6-3  | Idea for generation of larger (weighted) codes with reconfigurable hardware.                                                                                                                                                                           |
| Fig. 6-4  | The proposed idea to reduce the power dissipation by keeping the i <sup>th</sup> differential current input disconnected except for the current (i <sup>th</sup> ) selection pulse, and the selection pulse before and after the i <sup>th</sup> pulse |

## 1 Introduction

The demand for high data rate wireless communication systems has increased tremendously in the last quarter-century. The advancements in chip fabrication and packaging, and the printed circuit board design have enabled new applications like multigigabit point-to-point or chip-to-chip communication, wireless local area networks, kiosk downloads, wireless High-Definition Multimedia Interface (HDMI) video streaming, etc. The Long-Term Evolution (LTE)-A standard specifies a 100 Mb/s link from the base station to the mobile end-user and the IEEE 802.11ad Wireless Gigabit (WiGig) standard specifies a peak transmission rate of 8 Gbps for short-range (1-10 m) communication in wireless local area networks.

This project deals with the design and development of a wireless communication system with an ultra-high data rate of up to 100 Gb/s. The development of such a communication system presents implementation challenges at multiple levels. One of the most important system design parameters for ultra-high data rate wireless communication is bandwidth. For a fixed spectral efficiency, the bandwidth needs to increase in proportion to the data rate. The frequency spectrum up to 40 GHz is sliced into a multitude of narrow channels to encourage competition and to permit interference-free communication. Larger contiguous chunks of bandwidth are only available higher up in the spectrum. An additional important consideration is the presence of the atmospheric absorption peaks caused by molecular oxygen $(O_2)$ and water vapor $(H_2O)$. These peaks occur at the following frequencies: 60 GHz, 119 GHz, 183 GHz, 325 GHz, and 380 GHz, as seen in Fig. 1-1. Any communication system that uses one of these frequencies as its carrier frequency will suffer from very high attenuation caused by atmospheric absorption in addition to the usual free-space loss. While the frequencies around 90 GHz are used for military purposes, the bands around 80 GHz and 140 GHz and the large window in the 210-315 GHz region are good candidates for high-data-rate long-range communication.

The wireless Local Area Network (LAN) and Wide Area Network (WAN) standards (e.g., cellular, Wi-Fi, WiMAX, etc.) with data rates of up to 10 Gbps use the sub-10 GHz spectrum. For example, the IEEE 802.11ax standard for WLAN, officially marketed by the Wi-Fi Alliance as Wi-Fi 6 (using 2.4 GHz and 5 GHz) and Wi-Fi 6E (using 6 GHz), supports maximum data rates from 600 Mbps to 9608 Mbps [1]. The microwave band frequencies between 6 and 40 GHz have low atmospheric absorption and are commonly used for long-distance communication, e.g., as the backhaul connection for cellular networks with data rates of up to a few hundred Mbps. The unlicensed frequency bands at 60 GHz and the licensed bands at 70 and 80 GHz are suited for wireless communication up to 10 Gbps [2], [3], whereas for data rates above 10 Gbps, the frequency bands above 100 GHz are more commonly employed [4], [5].



Fig. 1-1 Average mm-wave atmospheric absorption spectrum [2].

One of the most important system design parameters is the spectral efficiency, i.e., the transmission rate per unit bandwidth measured in units of bits per second per Hertz (bps/Hz). To achieve a target data rate, there is always a trade-off between the available bandwidth and the spectral efficiency. For a given bandwidth, an increase in the modulation complexity means a higher spectral efficiency, e.g., the spectral efficiency of 16-PAM is 4 bps/Hz whereas that of 256-QAM is 8 bps/Hz.
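The trade-off between bandwidth and spectral efficiency can be illustrated with a minimal sketch. It assumes the idealized case in which an M-ary modulation carries $\log_2 M$ bits per symbol and one symbol is transmitted per second per Hertz, which reproduces the figures quoted above. The function names are hypothetical and chosen for illustration only.

```python
import math

def bits_per_symbol(modulation_order: int) -> float:
    """Idealized spectral efficiency in bps/Hz, assuming one symbol per second per Hz."""
    return math.log2(modulation_order)

def required_bandwidth_hz(data_rate_bps: float, efficiency_bps_per_hz: float) -> float:
    """Bandwidth needed to carry a data rate at a given spectral efficiency."""
    return data_rate_bps / efficiency_bps_per_hz

# Figures quoted in the text: 16-ary -> 4 bps/Hz, 256-ary -> 8 bps/Hz.
print(bits_per_symbol(16), bits_per_symbol(256))

# Bandwidth for 100 Gbps at 4 bps/Hz and at 8 bps/Hz.
for eff in (4.0, 8.0):
    bw = required_bandwidth_hz(100e9, eff)
    print(f"{eff:.0f} bps/Hz -> {bw / 1e9:.1f} GHz")
```

Halving the spectral efficiency doubles the required bandwidth, which is the reason why a low-complexity modulation is only viable where large contiguous bandwidth is available.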

A common method to increase the data throughput in a given area is to employ spatial multiplexing. However, at higher frequencies, this approach becomes quite complex. The size of the antenna decreases with the increase in frequency, but the radiation profile is relatively omnidirectional. The increase of the free-space path loss at higher frequencies (an apparent effect, since the effective aperture of an antenna with fixed gain shrinks with the square of the frequency) requires the use of multiple antenna elements to create pencil-shaped beams with high gain and directivity. Making use of multiple such highly directional high-gain antennas, a line-of-sight (LOS) multiple-input multiple-output (MIMO) system can be realized in which the transmitter and receiver use either $1 \times n$ linear or $n \times n$ rectangular antenna arrays whose elements are separated by a distance $D = \sqrt{R\lambda/n}$, where $R$ is the link range and $\lambda$ is the wavelength in free space [6]. For a $2 \times 2$ rectangular LOS spatial multiplexing system at 60 GHz, the antenna element separation distance $D$ can be limited to 5 cm for a maximum range of 1 m. This would be sufficiently small to be of interest for many LOS scenarios such as kiosk download to consumer devices like smartphones. However, the use of LOS multiplexing makes the alignment and positioning of the antennas very critical. An alternative option is to employ polarization multiplexing. Here again, if linearly polarized antennas are used, accurate alignment and positioning become very critical. This alignment requirement can be eliminated by using left and right circularly polarized antennas. Polarization multiplexing without the requirement of highly accurate alignment and positioning is a good method to increase the spectral efficiency by a factor of 2. However, a further extension of the LOS MIMO concept makes the implementation very complex and renders the use of a higher bandwidth necessary to achieve the target data rate of 100 Gbps [7].
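The antenna separation quoted above follows directly from $D = \sqrt{R\lambda/n}$. The short sketch below evaluates the expression for the 60 GHz example. The function name and the constant `C` are introduced here for illustration and are not part of the original work.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def element_separation_m(link_range_m: float, carrier_hz: float, n: int) -> float:
    """Antenna element separation D = sqrt(R * lambda / n) for LOS MIMO."""
    wavelength_m = C / carrier_hz
    return math.sqrt(link_range_m * wavelength_m / n)

# 2 x 2 LOS MIMO at 60 GHz over a 1 m link, as in the text.
d = element_separation_m(link_range_m=1.0, carrier_hz=60e9, n=2)
print(f"D = {d * 100:.2f} cm")  # about 5 cm
```

Because $D$ grows with the square root of the link range, the same array geometry becomes impractically large for long links, which is one reason why the LOS MIMO concept is considered here only for short-range scenarios.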

Based on the discussion above, the following conclusions can be drawn: the use of a large bandwidth allows the spectral efficiency to be reduced to a small enough value, which considerably reduces the implementation complexity. The requirement for a higher bandwidth calls for the use of the Terahertz spectrum, i.e., above 100 GHz. Multiple antenna elements can be used to improve the directivity and gain of the TX and RX antennas. Additionally, some limited spatial multiplexing gain can be achieved by line-of-sight (LOS) MIMO multiplexing, e.g., by using left and right circularly polarized antennas.

## 1.1 Motivation

The research work carried out in this thesis is part of a project titled Real100G.com, which is itself part of a priority research program (Schwerpunktprogramm) of the German Research Foundation (Deutsche Forschungsgemeinschaft), SPP1655, titled "Wireless 100 Gbps and beyond", which deals with the design and development of ultra-high-speed wireless communication systems for mobile internet access. The research in this thesis is concerned with the baseband circuit design for such a communication system. The research and development activities concerning the radio frequency (RF)-frontend, carrier recovery, and media access control are undertaken in separate projects in the framework of Real100G.com and SPP1655. Despite the clear segregation of the research activities of each of the constituent projects, there was a close collaboration among the partner projects to define interfaces and specifications leading to the common goal of 100 Gbps wireless communication.

Based on the discussion in the previous section, the current project explores the use of a large contiguous chunk of bandwidth around a carrier frequency of 240 GHz. The approach is to use a very large radio frequency (RF) bandwidth of 50 GHz with a smaller spectral efficiency to simplify the implementation complexity. The use of Double Sideband Suppressed Carrier (DSB-SC) modulation for the RF-frontend makes the design of the RF-frontend simpler and allows for a baseband bandwidth of 25 GHz.

A conventional transceiver baseband architecture uses digital signal processors to perform signal processing and uses data converters to transform the data between digital and analog formats. Nyquist rate analog-to-digital converters (ADC) and digital-to-analog converters (DAC) for the above-mentioned data rates are not only difficult to realize but also cause high power dissipation. Moreover, the digital signal processors (DSPs) used to perform baseband signal processing, packet and chip-level synchronization, data error correction, automatic gain control, and adaptive channel equalization need a high operating performance in the order of several hundred giga-floating point operations per second (GFLOPS) which makes the traditional transceiver implementation very inefficient in terms of power dissipation. Thus, the focus of this project is to develop a transceiver baseband architecture in which most of the baseband signal processing is performed in the analog domain which avoids using high-performance DSPs as well as wide bandwidth, high resolution, and high sampling-rate data converters.

An important investigation pertains to the efficient implementation of the baseband system with regard to chip area, implementation complexity, and power dissipation. In general, our design considerations aim at making an analog signal processing implementation possible. Additionally, spread spectrum techniques are investigated to add robustness to channel fading and interference. The different options for spread spectrum communication are briefly discussed below:

Frequency-hopping spread spectrum (FHSS) is a method for spread spectrum communication in which the spectrum is divided into several narrowband channels and the transmission takes place by switching the carrier frequencies using a pre-defined pattern. FHSS is also specified in IEEE 802.11. In comparison to the direct sequence spread spectrum (DSSS) technique, FHSS has some disadvantages. It is prone to burst errors because of narrowband frequency selective channel fading. It requires a fast, digitally switchable local oscillator, and the synchronization between the receiver and the transmitter is quite challenging for fast hopping rate FHSS. Detection of FHSS communication is easier as compared to DSSS because of the larger signal amplitude. FHSS requires a flat frequency response for the entire bandwidth whereas DSSS often employs spectral shaping to aid with electromagnetic interference (EMI) compliance. This results in a larger noise bandwidth (or equivalently a smaller input SNR) at the receiver in the case of FHSS as compared to DSSS [8].

Direct sequence spread spectrum (DSSS) is a communication technique that uses pseudorandom spreading sequences to spread the information bits. The individual bits of the spreading sequences, called chips, are much shorter in duration than the symbol duration. Encoding the data using a spreading sequence scrambles and spreads the data and thereby results in a bandwidth nearly identical to that of the spreading sequence. For multiple users simultaneously accessing the channel in asynchronous code division multiple access (CDMA), each user requires an individual pseudorandom spreading sequence or code which must be orthogonal to the codes used by other users. Using a longer code increases the spreading gain of the code, defined as the ratio of the symbol (or code) duration to the chip duration. Moreover, the possible number of orthogonal codes increases with the length of the code [8].

Chirp spread spectrum (CSS) is another spread spectrum technique. It uses wideband frequency-modulated chirp signals to encode information. Chirp signals are sinusoidal signals whose frequency ramps up or down linearly during the symbol period. CSS is not a preferred choice because of the design complexity of the frequency synthesizer phase-locked loop (PLL) required for generating the chirp signals. The rule of thumb for achieving a stable loop in a PLL system is to make the loop bandwidth (LBW) less than the update rate of the PLL by at least a factor of 10. A tuning range of 25 GHz along with a settling time smaller than 10 ps is quite challenging to design. Moreover, at least 16-PSK (phase shift keying) modulation is required for the target data rate of 100 Gbps, which further complicates the design.

Parallel sequence spread spectrum (PSSS) is a spread spectrum technique that can be considered a parallelized version of the direct sequence spread spectrum (DSSS) technique [9]. In DSSS, N transmitters spread their information symbols by encoding them with their respective individual pseudorandom codes, whereas in PSSS, a single transmitter encodes N information symbols in parallel using N cyclically shifted versions of a single pseudorandom code, provided that the shifted versions of the code are orthogonal to each other.

Fig. 1-2 presents a figurative explanation of the Parallel Sequence Spread Spectrum (PSSS) technique [10]. On the transmitter side, the N data symbols are multiplied with their respective cyclically shifted copies of a pseudorandom code and the encoded chips of each data symbol are added up to form the PSSS sequence chips i.e., the baseband signal that is up-converted and sent using an RF-frontend. After down-conversion at the receiver end, the transmitted symbols are recovered by correlation (i.e., multiplication followed by integration) of the received PSSS sequence with the respective decoding sequences. PSSS is well suited for analog or mixed-signal baseband implementation as also discussed in [11] and is used as the preferred spread spectrum technique for the Real100G.com project.
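The encoding and decoding steps described above can be sketched numerically. The example below is a minimal illustration under several assumptions that are not taken from this section: the length-15 m-sequence is generated from the recurrence $a_n = a_{n-3} \oplus a_{n-4}$, the transmitter uses the unipolar (0/1) form of the code, the receiver correlates with the bipolar ($\pm 1$) form, the channel is ideal and noiseless, and no guard interval is added. The function names are hypothetical.

```python
import numpy as np

def m_sequence_15() -> np.ndarray:
    """Unipolar (0/1) m-sequence of length 15 from a[n] = a[n-3] XOR a[n-4]."""
    a = [1, 0, 0, 0]
    for n in range(4, 15):
        a.append(a[n - 3] ^ a[n - 4])
    return np.array(a)

def psss_encode(symbols: np.ndarray, code: np.ndarray) -> np.ndarray:
    """Weight each cyclic shift of the code by one symbol and sum chip-wise."""
    shifts = np.array([np.roll(code, i) for i in range(len(code))])
    return symbols @ shifts

def psss_decode(chips: np.ndarray, code: np.ndarray) -> np.ndarray:
    """Correlate (multiply and integrate) with each shifted bipolar code."""
    bipolar = 2 * code - 1
    shifts = np.array([np.roll(bipolar, i) for i in range(len(code))])
    # The correlation peak equals the number of ones in the unipolar code.
    return (shifts @ chips) / code.sum()

code = m_sequence_15()
rng = np.random.default_rng(seed=0)
data = rng.choice([-3, -1, 1, 3], size=len(code))  # 15 PAM-4 symbols in parallel

tx_chips = psss_encode(data, code)   # multi-level PSSS chip sequence
recovered = psss_decode(tx_chips, code)
print(np.allclose(recovered, data))
```

In this sketch, the cyclic cross-correlation between a shifted unipolar code and a differently shifted bipolar code is exactly zero, so the 15 symbols are separated without mutual interference. The summed chip sequence is multi-level, which is why the transmitted PSSS signal is an analog-like waveform rather than a binary one.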

The project Real100G.com discusses the concept, design, and implementation of a high data rate wireless communication system based on a mixed-signal baseband circuit. The proposed baseband circuit uses parallel sequence spread spectrum (PSSS) modulation with a mixed-signal implementation and uses a single RF-carrier with double sideband (DSB) RF transmission. The idea is to develop a communication system architecture that is flexible and whose net data rate can be extended by the use of techniques like multiple-input multiple-output (MIMO) and I/Q modulation. The baseband concept of the Real100G.com project is versatile and applies to many application scenarios, ranging from short-range device-to-device links to medium-range local or personal area networks.

Fig. 1-2 Figurative explanation of the parallel sequence spread spectrum (PSSS) technique [10].

#### 1.1.1 Possible System Approaches for 100 Gbps Wireless Communication

After having discussed the basic system design parameters for a 100 Gbps wireless communication system, the different plausible approaches for the implementation are presented below.

#### 1.1.1.1 Single-Carrier vs. Multi-Carrier

Wireless systems can be divided into two main categories based on the radio frequency carriers used for transmission of the digitally modulated waveform, i.e., single-carrier or multi-carrier systems. Single-carrier systems transmit the broadband information signals centered around a single carrier frequency. This requires the RF-frontend circuits to have a flat frequency response over a wide bandwidth. Due to the availability of wide bandwidth, a moderate spectral efficiency can suffice, which reduces the design complexity of the system. An important consideration for single-carrier wireless communication is frequency selective fading and narrowband interference. This can occur due to destructive interference of the signal arriving at the receiver end from multiple paths (multipath fading) or due to strong reflections or absorption by an obstacle. One solution to this problem is the use of spread spectrum communication. Parallel sequence spread spectrum (PSSS) is a logical extension of DSSS to support a higher data rate while still maintaining a low enough spectral efficiency and is the preferred choice of spread spectrum communication in this project.

On the other hand, the systems based on multi-carriers use either frequency hopping spread spectrum (FHSS), or orthogonal frequency division multiplexing (OFDM). Using FHSS with a 25 GHz baseband bandwidth would require a very high SNR. It is prone to burst errors because of narrowband frequency selective channel fading and thus requires a very good forward error correction (FEC). Moreover, the design of a fast, digitally switchable local oscillator with a very low settling time is quite challenging for fast hopping rate FHSS.

OFDM is a multiplexing technique rather than a spread spectrum technique. It works by dividing the bandwidth into several narrowband carriers. The 25 GHz bandwidth can be divided into 1000 subcarriers each having 25 MHz bandwidth. For the target data rate of 100 Gbps, a spectral efficiency of 4 bps/Hz, e.g., 16-QAM, is needed in this case. The design of a digital demodulator for this scenario would require broadband *I* and *Q* Nyquist rate data converters with a sampling rate of at least 50 GS/s, which are both power-hungry and difficult to design. The digital signal processor required to calculate the Fast Fourier Transform (FFT) as well as the other communication system-related tasks will need to execute several hundred Giga Floating Point Operations per Second (GFLOPS), which will add a significant amount to the overall power dissipation of an OFDM based solution. The state of the art for high-speed data converters required for OFDM is presented in section 1.2.1.
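The numbers used in the OFDM example can be reproduced with a few lines of arithmetic. The sketch below assumes an idealized system without coding or guard overhead and takes the minimum sampling rate as twice the 25 GHz bandwidth. The variable names are chosen for illustration.

```python
import math

total_bandwidth_hz = 25e9   # baseband bandwidth from the text
n_subcarriers = 1000
target_rate_bps = 100e9

subcarrier_bw_hz = total_bandwidth_hz / n_subcarriers      # 25 MHz per subcarrier
efficiency = target_rate_bps / total_bandwidth_hz           # 4 bps/Hz
modulation_order = 2 ** math.ceil(efficiency)                # 16, i.e., 16-QAM
min_sampling_rate = 2 * total_bandwidth_hz                   # Nyquist rate, 50 GS/s

print(f"subcarrier bandwidth: {subcarrier_bw_hz / 1e6:.0f} MHz")
print(f"spectral efficiency:  {efficiency:.0f} bps/Hz ({modulation_order}-QAM)")
print(f"minimum sampling rate: {min_sampling_rate / 1e9:.0f} GS/s")
```

Any overhead for pilots, guard intervals, or channel coding would push the required modulation order above this idealized value.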

#### 1.1.1.2 High Carrier Frequency, High Bandwidth, and High Spectral Efficiency

One possibility to achieve the target data rate of 100 Gbps is to use the high bandwidth (up to a few GHz) available at the higher end of the V-band (40-75 GHz). Globally, the 57-66 GHz band is split into 4 channels, each having 2.16 GHz bandwidth. Europe allows the use of all 4 channels simultaneously. The approximate maximum data rate is 6.8 Gbps per channel, which makes the total net data rate equal to $4 \times 6.8 = 27.2$ Gbps by making use of channel bonding [7]. Accommodating 6.8 Gbps in a 2.16 GHz wide channel implies a spectral efficiency of 6.8/2.16 = 3.15 bps/Hz. Since only a subset of the sub-carriers is used for orthogonal frequency division multiplexing (OFDM) and the coding rate is less than 1, a large part of the spectral efficiency is used up for overhead functionalities in the physical layer such as channel coding, pilot symbols, guard intervals, and guard bands. As an example, using 64-quadrature amplitude modulation (QAM) to modulate 336 out of 512 sub-carriers with a coding rate of 13/16, only about half $(336/512 \times 13/16 = 0.53)$ of the spectral efficiency is available for the payload data, whereas the rest is used up for the overhead functionalities in the physical layer. Increasing the modulation order to 256-QAM increases the spectral efficiency by 33 percent, whereas an increase by a factor of about 3.7 (100/27.2) is required to achieve the target data rate of 100 Gbps. Moreover, the transmitter power would have to be increased considerably since every two-fold increase of the spectral efficiency requires an additional 12 dB in the link budget to maintain a bit error rate of $10^{-3}$. In addition, this puts extreme demands on the linearity of the amplifiers, which would make the implementation unrealistically expensive [7].
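The figures in the V-band example can be checked with a short calculation. The sketch below only restates the arithmetic of the paragraph above. The variable names are chosen for illustration, and the 256-QAM gain is taken as the ratio of bits per symbol relative to 64-QAM.

```python
import math

channel_rate_bps = 6.8e9   # approximate maximum rate per 2.16 GHz channel
channel_bw_hz = 2.16e9
n_channels = 4
target_rate_bps = 100e9

net_efficiency = channel_rate_bps / channel_bw_hz            # about 3.15 bps/Hz
payload_fraction = (336 / 512) * (13 / 16)                   # about 0.53
bonded_rate_bps = n_channels * channel_rate_bps              # 27.2 Gbps
gain_256qam = math.log2(256) / math.log2(64) - 1             # about 33 percent
required_factor = target_rate_bps / bonded_rate_bps          # about 3.7

print(f"net spectral efficiency: {net_efficiency:.2f} bps/Hz")
print(f"payload fraction:        {payload_fraction:.2f}")
print(f"bonded data rate:        {bonded_rate_bps / 1e9:.1f} Gbps")
print(f"gain from 256-QAM:       {gain_256qam * 100:.0f} %")
print(f"required rate increase:  x{required_factor:.2f}")
```

The gap between the 33 percent gained from a higher modulation order and the required factor of about 3.7 shows why a higher spectral efficiency alone cannot close the distance to 100 Gbps in this band.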

The spectral efficiency can be increased by using some sort of spatial multiplexing or MIMO approach. As discussed earlier, at high frequencies like 60 GHz, only LOS MIMO is possible, and only with a limited multiplexing ratio, e.g., $2 \times 2$. A big disadvantage of LOS MIMO is the need for accurate alignment and positioning of the pencil beam antennas. This problem can be resolved by using polarization multiplexing as an alternative. The use of linear polarization poses the same problem of requiring very accurate placement and alignment of the antenna elements. A more practical option is to employ left and right circularly polarized antennas. This ensures a multiplexing gain of 2 at high frequencies as well. A further extension of the LOS-MIMO concept is difficult, which makes it necessary to use the higher bandwidth that is available in the terahertz spectrum.

#### 1.1.1.3 Ultra-High Carrier Frequency, Ultra-High Bandwidth, and Moderate Spectral Efficiency

Having discussed the different options for the carrier, i.e., single or multi-carrier, and the trade-off between the bandwidth and the spectral efficiency, the most promising approach for the target data rate of 100 Gbps is the one employing the ultra-high bandwidth available in the atmospheric low absorption window from 200 GHz to 310 GHz. Taking advantage of the contiguous chunk of available bandwidth, a single carrier centered around 240 GHz can be used with double sideband analog modulation that allows for a baseband bandwidth of 25 GHz, for which a spectral efficiency of 4 bps/Hz is sufficient to reach the target data rate of 100 Gbps. As discussed previously, parallel sequence spread spectrum (PSSS) is well suited as a spreading technique in this scenario and can be well adapted for a mixed-signal implementation. Instead of using wide-bandwidth, high-sampling-rate ADCs in the receiver, followed by a digital signal processor to perform demodulation, synchronization, and channel equalization, the demodulation and channel equalization can be performed using analog/mixed-signal circuits, and the data conversion is performed at the end of the correlation period, which is equal to the symbol duration. This reduces the requirements on the ADC circuit. Similarly, on the transmitter side, most of the signal processing can be performed in the analog/mixed-signal domain. Assuming a chip rate of 25 GHz for code sequences of length 15 (as explained in the later chapters) for the mixed-signal PSSS implementation, the data converters will have their input bandwidth and sampling rates reduced by a factor of 15, as indicated in Fig. 1-3. Note that the PSSS implementation requires 15 copies of the hardware to encode, transmit, and decode 15 symbols in parallel.



Fig. 1-3 Comparison between the proposed analog/mixed-signal transceiver architecture and the conventional digital transceiver architecture.

#### 1.1.2 Proposed Solution

The conclusion from the discussion in section 1.1.1 is that a single carrier with spread spectrum communication is a better choice as compared to a multi-carrier approach.

Moreover, instead of using a high spectral efficiency with a few GHz of bandwidth available in the V-band, it is better to shift the carrier frequency into the THz range, i.e., to around 240 GHz, where a large contiguous chunk of bandwidth is available from 200 to 310 GHz. The use of the large bandwidth allows the spectral efficiency to be reduced to a smaller value but makes a digital transceiver baseband implementation inefficient and power-hungry. The solution is to perform the baseband signal processing in the analog domain, which in turn requires investigating and using a modulation scheme that lends itself to an efficient analog implementation. Parallel sequence spread spectrum (PSSS) is a spread spectrum communication technique that suits the requirements outlined above well and forms the basis of further discussion in this thesis.

As mentioned before, the project Real100G.com deals with the transceiver baseband circuit design and carrier recovery circuit for 100 Gbps wireless communication. It is part of a priority research program of the German Research Foundation to develop solutions for 100 Gbps wireless data transmission for mobile internet access. The project Real100G.RF deals with the RF-frontend for the ultra-high bandwidth with a center frequency of 240 GHz. The transmitter architecture with the proposed mixed-signal baseband is shown in Fig. 1-4 [12]. The inner details about the PSSS transmitter baseband circuit are not discussed here and will be discussed in the later chapters. The important thing here is to highlight the overall architecture of the transmitter circuit and the interface to the RF-frontend circuit. Note that the proposed transmitter uses I and Q PSSS signals that are generated using a mixed-signal PSSS transmitter baseband clocked with a clock frequency of 30 GHz. The 30 GHz clock for the baseband circuit is generated using a 15 GHz reference clock which also serves as the reference clock for the 240 GHz local oscillator signal for the RF-frontend. The I and Q signals are combined and up-converted to generate the RF signal. The important thing to note in Fig. 1-4 is that the digital to analog conversion happens at the symbol rate  $f_{sym} = f_{chip}/18$ . The denominator 18 refers to the code length 15 plus a guard interval length of 3 as explained later in section 2.4.1.3.



Fig. 1-4 Proposed architecture for the 100 Gbps wireless transmitter with mixed-signal PSSS TX baseband [12].

The receiver architecture with the mixed-signal PSSS receiver baseband is shown in Fig. 1-5 [12]. The received I and Q signals from the RF-frontend are divided using an active power divider. One-half of the signal goes to a QPSK Costas loop to recover the I and Q carrier signals whereas the other half goes to the mixed-signal PSSS receiver baseband circuit. The output of the Costas loop acts as the control signal for the voltage-controlled oscillator that provides the reference signal for the local oscillator of the RF-frontend as well as the clock signal for the receiver baseband. The mixed-signal receiver baseband performs the data demodulation and the channel equalization in the analog/mixed-signal domain and uses low sampling rate analog-to-digital converters working at the symbol rate $f_{sym} = f_{chip}/18$. Note that the details about the mixed-signal architecture, the selection of the system design parameters, and other implementation details follow in later chapters. For now, the purpose is to show the overall receiver architecture and the baseband circuit interface to the carrier recovery and RF-frontend circuits. The design of the proposed carrier recovery circuit and of the RF-frontend circuits is dealt with in separate projects, and further discussion about their respective implementations and challenges is beyond the scope of this work. For more details, the readers are referred to [13], [14], and [15].

An important point to note is that the proposed transmitter and receiver architectures use I and Q data to increase spectral efficiency. Additionally, the LOS MIMO approach using left and right circular polarization that was proposed earlier can be used to further increase the spectral efficiency by a factor of 2. This allows reducing the modulation complexity by a factor of 4 for the target data rate of 100 Gbps if both I and Q data streams are used along with a 2 × 2 LOS-MIMO.



Fig. 1-5 Proposed architecture for the 100 Gbps wireless receiver with the mixed-signal PSSS receiver baseband [12].

However, in the course of this thesis, the use of the I and Q data streams as well as the LOS-MIMO are considered as additional means of increasing the spectral efficiency, and the basic transceiver baseband circuit design will only assume a single I data stream without any LOS-MIMO. Importantly, this requires no change in the baseband circuit design and hence, the extension to I and Q channels would only require the use of two copies of the baseband circuit for the I and the Q data paths.

## 1.2 State of the Art

This section presents the state of the art in high-speed data converters, and ultra-broadband, high-data-rate wireless communication systems. The state of the art concerning analog signal processing for ultra-broadband, high data-rate wireless communication systems, and the state of the art for PSSS transceiver implementations are also presented.

#### 1.2.1 State of the Art in High-Speed Data Converters

The state of the art for the ADCs prior to the start of this thesis (in 2012) is represented with the following references: in [16], a 6-bit interleaved successive approximation (SAR) ADC with 18 GHz of input signal bandwidth sampling at 40 GS/s is presented in 65 nm CMOS technology that dissipates less than 1.5 W of power. The effective number of bits (ENOB) with the calibration drops below 4 for an input signal bandwidth of 14 GHz with the sampling rate of 40 GS/s. In [17], a 20 GS/s 5-bit dual Nyquist flash ADC with sampling capability up to 35 GS/s designed in 130 nm SiGe BiCMOS technology is presented that dissipates 4.8 W. Note that two copies of the ADCs will be needed, i.e., one each for the *I* and the *Q* branches in the receiver.

In recent years there have been tremendous advancements in ADC design and measurement, as indicated by the following notable publications. In [18], a single-core 4-bit Nyquist-rate flash ADC with a sampling rate of 40 GS/s is presented in 130 nm SiGe BiCMOS technology that dissipates 2.3 W of power excluding the output buffers or the output DAC. The ENOB is 2.8 bits for the input bandwidth of 20 GHz. Other high sampling rate ADCs include a 6-bit, 36 GS/s SAR ADC featuring 18.1 GHz input signal bandwidth [19], and a 90 GS/s, 8-bit, 667 mW, 64× interleaved SAR ADC with 19.9 GHz input bandwidth [20], both designed in 32 nm digital SOI CMOS. In [21], an 8-bit time-interleaved SAR ADC is reported to have an input signal bandwidth of 20 GHz. The ADC is implemented in 14 nm CMOS FinFET technology and is measured at 48 GS/s and 72 GS/s with 97 mW and 235 mW of power dissipation respectively. Another time-interleaved SAR ADC designed in 40 nm CMOS technology is reported in [22], which has a sampling rate of 65 GS/s. The input 3 dB bandwidth is 20 GHz, and it has a power dissipation of 1.2 W. In [23], 40-97 GS/s 8-bit DACs and ADCs with 40 GHz adaptive feedback equalizer (AFE) bandwidth, fabricated in a 7 nm FinFET process are presented for 400 Gbps coherent optical applications. For a more comprehensive list of high-performance ADCs, the reader is referred to the spreadsheet in [24] which contains an updated list of the notable ADC publications since 1997.

The state of the art for DACs prior to the start of this thesis (in 2012) is represented by the DAC in [25] that operates with a sampling rate up to 28 GS/s and has a power consumption of 2.25 W using a -2.5 V power supply. The DAC has an output bandwidth of at least 14 GHz and an estimated ENOB of 5.5 bits at DC and 4.6 bits at the Nyquist frequency. In recent years, high-performance DACs have been reported as indicated by the following publications: in [23], 40-97 GS/s 8-bit DACs are reported that are designed in 7 nm FinFET CMOS technology for 400 Gbps coherent optical communication. In [26], a fully integrated receiver and transmitter are presented in a 100 Gbps coherent DSP chip, using 4×64 GS/s ADCs and DACs with 8-bit resolution, fabricated in a standard 20 nm CMOS process. In [27], an 8-bit 100 GS/s digital-to-analog converter (DAC) using a distributed output topology designed in 28 nm low-power CMOS for optical communications is presented. The DAC has 13 GHz of output bandwidth and has an ENOB of 3.2 bits below 24.9 GHz. The 6-bit DAC in [28] has a conversion rate of 72 GS/s with a 13 GHz output bandwidth. The DAC is fabricated in SiGe BiCMOS technology and has a power dissipation of 19.5 W. The DAC in [29] has a resolution of 8 bits and operates with a sampling rate of 56-65 GS/s featuring an output 3 dB bandwidth of 13 GHz. The DAC is designed in 40 nm CMOS technology. It has an ENOB of 6.5 bits at 8 GHz and dissipates 0.75 W of power.

#### 1.2.2 State of the Art in Ultra-Broadband Wireless Systems

The IEEE 802.15.3c [30] and IEEE 802.11ad [31] standards define multi-gigabit transmission in the 60 GHz ISM band with personal area network (PAN) or local area network (LAN) approaches with transmission rates up to 7 Gbps. In [11], [32] very short range (<1 m) as well as moderate range (18 m) ultra-high-speed communication systems of up to 10 Gb/s have been demonstrated using either a complex OFDM based approach [33] in a 2 GHz channel or using a bonded channel of 4 GHz and a single carrier DQPSK [34]. An un-coded data rate of 10 Gbps has been achieved using a dual-polarization mechanism. In addition, the ECMA-387 standard [35] defines a system that can transmit up to 25 Gb/s using QAM-16 modulation on four 2 GHz channels bonded together. The Wireless-HD consortium has specified similar specifications for systems with data rates of up to 27 Gb/s [36]. However, there are no existing ECMA implementations that achieve the proposed 25 Gb/s data rate.

The IEEE P802.15.3d [37] standard deals with the media access control (MAC) layer design, PHY layer design, and RF-frontend design in the terahertz band, i.e., 252 GHz to 325 GHz. The IEEE P802.15.3d task group's technical requirements document [38] mentions the required data rates, bit error rate values, and transmission range values for the different application scenarios, e.g., chip-to-chip communication, wireless front-haul communication, etc. The task group recommends using a single carrier approach for the whole bandwidth or dividing the whole channel into sub-channels that can be realized in an OFDM system or by several RF frontends with smaller bandwidths. However, there are no recommendations regarding the modulation type to be used.

The system reported in [39] uses wireless transmission at 237.5 GHz to achieve 100 Gbps over a distance of 20 m using a photonic THz transmitter and an electronic THz receiver with 16 QAM modulation. External system components are used to perform optical to electrical conversion in the transmitter. The system in [40] makes use of the same electronic receiver as in [39] with a fully electronic transmitter Monolithic Microwave Integrated Circuit (MMIC), to achieve 96 Gbps with 8-PSK modulation. The same electronic only wireless link was appended with high gain parabolic antennas to transmit 64 Gbps over a distance of 850 m as reported in [41]. The receiver synchronization is performed in the digital domain but there is no mention of the automatic gain control, pre-distortion, filtering, etc. The wireless communication system in [42] operates with a carrier frequency of 400 GHz and uses QPSK and ultra-dense wavelength division multiplexing to achieve the data rate of 60 Gbps over a distance of 50 cm. For the transmitter, an arbitrary waveform generator (AWG) is used which drives a photonic transmitter. The signal is down-converted using a THz mixer and data demodulation is performed using a real-time sampling oscilloscope.

For the chip-to-chip communication application, a non-coherent on-off-keying (OOK) transmitter chip and an OOK receiver chip designed in 40 nm CMOS technology were demonstrated to communicate at the data rate of 10.7 Gbps using a 210 GHz transceiver over a distance of 1 cm with a BER better than 10<sup>-12</sup> [43]. In an alternate system with a 340 GHz transceiver frontend, the data transmission at 3 Gbps was demonstrated over a distance of 30 cm [44]. 16-QAM data symbols are generated offline and sent to a 12-bit DAC which drives an RF-frontend. At the receiver end, a 10-bit ADC with a sampling rate of 3 GS/s is used to sample the signal whereas the data demodulation is performed offline.

Importantly, none of the above-mentioned approaches describes the complete wireless system that includes carrier synchronization, real-time demodulation of the received data, channel equalization, and generation of baseband signal for the transmitter for the target data rate of 100 Gbps.

#### 1.2.3 State of the Art in Analog Signal Processing for Ultra-Broadband Wireless Systems

As discussed earlier, analog signal processing in wireless transceivers can be used for baseband signal processing i.e., modulation/ demodulation, synchronization, adaptive equalization, and decoding/ error correction.

In systems with moderate spectral efficiency, there have been investigations on avoiding the use of data converters by carrying out signal modulation and demodulation entirely in the analog frontend. However, these approaches are limited to the use of simpler modulation schemes e.g., QPSK. The demodulated bits are represented as binary *IQ signals*, and the ADC is replaced with simple threshold circuits. No approach is currently known where a complex baseband system is realized in the analog domain. Earlier investigations by Moerz [45] and Soler [46] showed that complex components like a Viterbi decoder or a MIMO baseband can be realized completely in an analog fashion. These approaches showed that the complexity of the circuit in terms of the number of transistors could be reduced to approximately a tenth and the power dissipation could be reduced to approximately a hundredth of a comparable circuit in standard digital hardware with comparable process parameters. The disadvantage of a complete analog realization is the lack of appropriate automated Computer-Aided Engineering (CAE)-tools to support the design.

Different ultra-broadband wireless transceivers using single-carrier modulation in conjunction with analog demodulation have been demonstrated. An on-off-keying (OOK) transceiver using a carrier frequency of 220 GHz was demonstrated in [47] to transmit at a data rate of 30 Gbps in metamorphic InP HEMT technology. In [48], a SiGe BiCMOS transceiver for the 70-80 GHz band achieved a data rate of 18 Gb/s using QPSK modulation. The CMOS-transceiver reported in [49] for the 60 GHz band uses analog synchronization and an analog channel equalization circuit to communicate at 10 Gbps. Another 60 GHz communication system with a 10 Gbps data rate is reported in [50] that uses analog forward carrier synchronization. Apart from the transceiver from [49] none of the other reported transceivers implement any form of channel equalization. Moreover, none of the receivers have reported a wireless data rate of 100 Gbps so far.

A short review of the reported broadband adaptive analog equalizers (AE) for wireless communications is as follows: A 60 GHz receiver using minimum shift keying (MSK) modulation and an adaptive decision-feedback equalizer (DFE) demonstrated 1 Gbps data transmission [51]. The use of the analog equalizer allowed a reduction in the ADC resolution down to only 4 bits. The communication system in [49] mentioned above, with reported data transmission at 10 Gb/s over 60 GHz using QPSK modulation, makes use of a DFE in 65 nm CMOS. Presumably, the DFE uses two ADCs with a resolution of 1 bit. Note that no Feed-Forward Equalizers (FFE) such as adaptive continuous-time Finite Impulse Response (FIR) filters have been reported. This is because, in contrast to the more complex continuous-time FIR filter implementation, the signal delays in DFEs can be implemented simply by using a cascade of registers clocked with the symbol frequency. Since DFEs only compensate post-cursor ISI [52], DFE-only equalizers are well suited for line-of-sight (LOS) links but may be of limited use in non-line-of-sight (NLOS) links.
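The DFE principle described above (a chain of symbol-rate registers holding past decisions, used to cancel post-cursor ISI) can be illustrated with a small behavioral model. This is a minimal sketch under stated assumptions, not a model of any of the cited circuits: the two post-cursor tap values, the binary symbol alphabet, and the function names are all hypothetical, and the feedback weights are assumed to be known rather than adapted.

```python
def sign(x):
    return 1 if x >= 0 else -1


def channel(symbols, post_cursor_taps):
    """Main cursor of weight 1 plus post-cursor ISI from earlier symbols."""
    received = []
    for n, s in enumerate(symbols):
        isi = sum(w * symbols[n - 1 - k]
                  for k, w in enumerate(post_cursor_taps) if n - 1 - k >= 0)
        received.append(s + isi)
    return received


def dfe(received, feedback_taps):
    """Decision-feedback equalizer with known (non-adaptive) feedback taps."""
    registers = [0] * len(feedback_taps)  # past decisions, newest first
    decisions = []
    for r in received:
        # Subtract the ISI predicted from the already decided symbols.
        y = r - sum(w * d for w, d in zip(feedback_taps, registers))
        d = sign(y)
        decisions.append(d)
        registers = [d] + registers[:-1]  # shift once per symbol clock
    return decisions


symbols = [1, -1, -1, 1, 1, -1, 1, -1]
taps = [1.2, -0.4]                         # assumed post-cursor taps
received = channel(symbols, taps)

slicer_only = [sign(r) for r in received]  # no equalization
equalized = dfe(received, taps)

print(slicer_only == symbols, equalized == symbols)
```

With the strong first post-cursor tap assumed here, a plain slicer makes errors while the DFE recovers the sequence. Pre-cursor ISI is not modeled, which mirrors the limitation noted in the text.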

An alternative to single-carrier modulation with adaptive equalization is the use of OFDM and an analog DFT processor for frequency-domain equalization. Broadband DFT processors have been investigated in the context of UWB transceivers [53] and software-defined radio [54]. A 1 GS/s analog DFT processor for UWB was presented in [55]. However, only an 8-point DFT was implemented. In [56], a 64-tap analog DFT processor for up to 2 GHz sampling frequency in 65 nm CMOS is reported.

#### 1.2.4 State of the Art for PSSS Transceiver Implementations

The most significant result regarding PSSS transmission experiments is provided by [57] where a Hardware in the Loop (HiL) setup is used to transmit up to 80 Gbps data with a spectral efficiency of 4 bit/s/Hz. The digital PSSS baseband processor has been realized offline in MATLAB/Simulink. The baseband (BB) data (i.e., PSSS modulated symbols) used for the transmission is stored in an Arbitrary Waveform Generator (AWG). The PSSS symbols are transmitted using the 230 GHz RF frontend. A Real-Time Oscilloscope (RTO) samples and stores the received signal. The received PSSS baseband data is post-processed for synchronization, channel estimation, reference calculation for channel equalization, decoding, and Bit Error Rate (BER) calculation. Similar hardware-in-the-loop setups were used in [58] and [59], albeit with a nonlinear power amplifier, to demonstrate PSSS signal transmission at 20 Gbps with a 240 GHz carrier using adaptive cross-correlation-based equalization. Previously, an analog PSSS baseband implemented as a MATLAB/Simulink model has been used together with 60 GHz radios and low-gain PCB antennas for a radio link experiment [60] to demonstrate 4 Gbps data transmission over 60 GHz at a distance of 3 m (spectral efficiency of 2.3 bit/s/Hz). It was demonstrated that the PSSS baseband provides comparable performance to OFDM at 60 GHz. Other preliminary work on PSSS is related to carrier-less PSSS modulation and cross-correlation-based equalization for a USB 3.0 cable in [61] that shows that 20 Gbps and more is possible with BPSK modulation in a duplex communication scenario. In [62], a PSSS transmission experiment is performed with offline baseband processing to demonstrate 15 Gbps communication using a USB 3.0 cable. Note that no baseband circuits have been used in any of the above transmission experiments and that the PSSS baseband processing has been done offline using MATLAB/Simulink. These experiments serve as a proof of concept for the effectiveness of the PSSS modulation with real RF-frontend circuits.

As examples of PSSS baseband hardware: in [63], a rapid prototype is used to evaluate the PSSS performance in industrial communication with a small bandwidth of 20 MHz at a carrier frequency of 5.8 GHz. The implementation of a digital baseband core of a PSSS transmitter on a Xilinx Virtex UltraScale Field Programmable Gate Array (FPGA) is presented in [64] along with the results of Register Transfer Level (RTL) synthesis using a 28 nm bulk CMOS technology. However, the broadband analog components required for the transmitter have not been implemented in [64].

The state of the art in PSSS cross-correlation-based equalization is given by the following publications: system-level investigations on adaptive cross-correlation-based equalization using PSSS modulation are presented in [60], [65], [58] addressing wireless communication applications. A PSSS transceiver IC implementation for the IEEE 802.15.4-2006 standard which operates in the 868 MHz band and achieves 250 kb/s is reported in [66].

PSSS is in some respects similar to High-Speed Packet Access (HSPA) [67]. In HSPA, a single spreading factor of 16 was chosen to create a fast data downlink channel. Using the 'variable factor spreading codes' 16 orthogonal codes are available. With a frame interval of 2 ms and 16 codes, a maximum of 15 Mbps can be achieved. The correlation is performed completely in the digital domain. No time-domain channel equalization is performed. Transceiver ICs with a higher data rate than HSPA or IEEE 802.15.4-2006 have neither been published by academia nor advertised by industry so far. Neither a broadband PSSS transceiver for more than 100 MHz of bandwidth nor any mixed-signal PSSS transceiver exists. Furthermore, for PSSS-based wireless systems, many open questions exist that have already been answered for conventional single-carrier or multi-carrier modulation schemes but not for PSSS. For example, PSSS performance degradation in non-linear transmitters, the influence of LO jitter, transmitter spectrum shaping, and synchronization techniques have not been systematically investigated. Thus, PSSS transceiver implementation remains an open yet interesting topic for research that has not been addressed extensively in the past.

This thesis presents the circuit design and measurement results of a mixed-signal PSSS transceiver baseband circuit. First, a system model is developed from the mathematical overview, and then, the system design considerations for a mixed-signal transceiver implementation are used to define the important circuit parameters for the baseband circuit components. The important circuit components of the transmitter and receiver baseband are characterized as standalone components followed by the characterization of the overall receiver baseband circuit. This is the first-ever implementation and characterization of a broadband mixed-signal PSSS transceiver baseband circuit. Previously reported results were either obtained with hardware-in-the-loop test setups with offline baseband signal processing or used digital signal processing for the baseband at low data rates. This is also the first reported implementation of the channel equalization based on the weighting of the coding sequences for a broadband PSSS receiver baseband.

## 1.3 Organization of Thesis

This thesis is organized as follows. Chapter 1 explains the motivation for the design of the mixed-signal architecture for the baseband circuit by listing the different viable options to realize the target 100 Gbps wireless communication system and presenting a proposed solution that forms the basis for the research in the remainder of the thesis.

Chapter 2 discusses the mathematical model of the PSSS modulation scheme based on which a system model is developed, and the system design parameters are chosen. A mathematical overview of the channel equalization process as well as amplitude clipping is also presented.

Chapter 3 discusses the sliced architecture of the mixed-signal PSSS receiver baseband by discussing a single unit slice circuit in detail. The detailed circuit design of the most important circuit components in the unit-slice using 130 nm SiGe BiCMOS technology along with the measurement results for the characterization of those components is presented in this chapter. The circuit design and characterization of the transmitter baseband circuits are also discussed. A comparison of the baseband circuit implementation in 28 nm bulk CMOS is presented to highlight the advantages of migrating to scaled CMOS technology.

Chapter 4 discusses the design of a complete mixed-signal PSSS receiver baseband slice based on the important circuit components described in chapter 3. The schematic design of the important accessory components, as well as the layout of the receiver baseband unit-slice test-chip, is discussed in detail.

Chapter 5 of this thesis discusses the design of the high-speed printed circuit board (PCB) to characterize the test-chip as well as the measurement setup required for the characterization of the complete mixed-signal PSSS receiver baseband unit-slice test-chip. The important measurement results for the characterization of the standalone receiver baseband unit slice test-chip are presented.

The last chapter of the thesis i.e., chapter 6 presents a summary of the thesis along with the outlook for further research and the possibilities for improvements.

## 1.4 List of Relevant Publications of the Author

- [12] J. C. Scheytt, A. R. Javed, et al., "Real100G Ultrabroadband Wireless Communication at High mm-Wave Frequencies," in *Wireless 100 Gbps and Beyond*, R. Kraemer and S. Scholz, Eds., Frankfurt (Oder), Germany, IHP - Innovations for High Performance Microelectronics, 2020, pp. 213-230.
- [13] J. C. Scheytt, A. R. Javed, et al., "100 Gbps Wireless System and Circuit Design Using Parallel Spread-Spectrum Sequencing," *Frequenz*, vol. 71, no. 9-10, pp. 399-414, 2017.
- [68] A. R. Javed, J. C. Scheytt, K. KrishneGowda, and R. Kraemer, "System Design Considerations for a PSSS Transceiver for 100Gbps Wireless Communication with Emphasis on Mixed-Signal Implementation," in *IEEE Wireless and Microwave Technology Conference (WAMICON)*, Florida, 2015.
- [69] A. R. Javed, et al., "Real100G.com," in *Wireless 100 Gbps and Beyond*, R. Kraemer and S. Scholz, Eds., Frankfurt (Oder), Germany, IHP - Innovations for High Performance Microelectronics, 2020, pp. 231-294.

## 1.5 Summary

The research in this thesis is concerned with the baseband circuit design for an ultra-wideband, high-speed wireless communication system for mobile internet access. From the viewpoint of the baseband circuit design, the use of a single RF carrier with spread spectrum communication is a better choice than a multi-carrier approach like OFDM. The current project explores the use of a large contiguous chunk of 50 GHz RF bandwidth around the carrier frequency of 240 GHz with Double Sideband Suppressed Carrier (DSB-SC) modulation. The large bandwidth allows the required spectral efficiency to be reduced to a smaller value but makes a digital transceiver baseband implementation inefficient and power-hungry. The solution is to perform the baseband signal processing in the analog domain, which in turn requires investigating and using a modulation scheme that lends itself to an efficient analog implementation. Parallel sequence spread spectrum (PSSS) is a spread spectrum communication technique that suits the requirements outlined above well and forms the basis of further discussion in this thesis.

# 2 Parallel Sequence Spread Spectrum (PSSS) System Analysis and Design

This chapter deals with the basics of spread spectrum communication using parallel sequence spread spectrum (PSSS). Starting with the mathematical overview, the system architecture of a PSSS system is developed and system analysis is presented. The most important part of the communication system with regards to the PSSS modulation is the baseband circuit. The potential advantages of using a mixed-signal PSSS baseband architecture are discussed. The important system design parameters of the system model are discussed and the most suitable values for the system parameters are determined for the current application i.e., wireless data communication at the data rates of 100 Gbps. The effect of clipping some of the PSSS amplitudes on the bit error rate is discussed. A big advantage of using PSSS modulation is the possibility to merge the channel equalization process with the data decoding process. This is discussed in the context of the proposed mixed-signal architecture for the PSSS baseband.

## 2.1 Spread Spectrum Communication

One of the biggest issues with wireless data communication is fading which means variation of the attenuation of the signal with time, location, and/ or the frequency of the signal. The changes in the attenuation may be random or have a deterministic pattern. Common causes of channel fading are multipath propagation, weather-related effects e.g., rain, snow, etc., or shadowing caused by physical objects that affect the direct line of sight propagation of the signal. Fading deteriorates the performance of the wireless communication system because of the reduction of the signal-to-noise ratio at the receiver end. This is caused by the reduction of the signal power while the noise power remains unaffected. Depending on the source of the fading, the reduction of the signal power may occur over some part or the whole of the signal bandwidth. In the case of multipath propagation, the signals reflected by different reflecting surfaces arrive at the receiver in addition to the signal arriving via the direct line of sight path. The destructive interference of the received signals causes a severe reduction in the signal-to-noise ratio. In the case of shadowing, the direct line of sight path is blocked, and the signal arrives mainly from a reflected path with additional attenuation. Note that both these effects can be considerably reduced if the carrier frequency is varied.

The effects of fading can be combated by using some sort of diversity while transmitting the signal. For example, if the signal is transmitted over a composite of multiple narrowband channels with independent fading profiles and the signal is coherently combined at the receiver, then the probability of experiencing fading in the composite channel is equal to the probability that all the constituent narrowband channels simultaneously experience fading, which is statistically an unlikely outcome. The use of frequency diversity to avoid fading can be useful as long as the fading is *slow*, i.e., the magnitude and phase change imposed by the channel varies slowly enough that the coherence time of the channel is long compared with the duration of the transmitted symbols. The coherence time of the channel is defined as the time during which the channel impulse response may be considered unvaried. In other words, it is the time required for the magnitude or phase change to become uncorrelated from its previous value. For slow fading channels, adaptive frequency diversity schemes can be quite effective, but for fast fading channels, time diversity, i.e., redundancy, can be used along with error-correcting codes to improve the reliability of the link. In the discussion that follows, and in the remainder of this dissertation, a slow fading channel is assumed because the desired application is an indoor point-to-point direct line-of-sight link for which the channel parameters do not change considerably with time.
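The diversity argument above can be made concrete with a short calculation. The sketch below assumes that each narrowband channel fades independently with the same probability; the function name and the example probability of 0.1 are hypothetical and only serve as an illustration.

```python
def composite_fade_probability(p_single, n_channels):
    """Probability that all n independent narrowband channels fade at once.

    Assumes statistically independent channels with identical fade
    probability p_single (an idealization used only for illustration).
    """
    return p_single ** n_channels


# Example: each narrowband channel is in a fade 10 % of the time.
for n in (1, 2, 4, 8):
    print(n, composite_fade_probability(0.1, n))
```

Under the independence assumption the probability of a fade in the composite channel falls geometrically with the number of constituent channels.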

For slow fading channels, spread-spectrum communication is a common method of combating the effects of channel fading. In spread-spectrum communication, the narrowband signal to be transmitted is spread in the frequency domain to change it to a wideband signal. In addition to avoidance of the catastrophic narrowband fading, spread spectrum communication offers advantages like secure communication i.e., low probability of detection and interception of communication, increasing resistance to electromagnetic interference caused by noise and intentional jamming signals, and allowing multiple access communication on the channel. There are different forms of spread spectrum communication: frequency hopping spread spectrum (FHSS), direct sequence spread spectrum (DSSS), chirp spread spectrum (CSS), and time-hopping spread spectrum (THSS). Moreover, the above forms can also be combined to suit a given application. A brief overview of the different spread spectrum techniques is presented below.

Frequency-hopping spread spectrum (FHSS) is a method of transmitting wireless signals by rapidly hopping (changing) the carrier frequency between the many distinct frequencies made available as slices or bands of a wideband channel. The frequency hops are governed by a predefined pattern known to both the transmitter and the receiver. Since the FHSS signal constantly hops to different frequency bands within the wideband channel, the narrowband fading of a particular frequency band will only slightly lower the overall SNR of the signal. Note that the FHSS communication is secure because the hopping pattern is known only to the intended user and any foreign listener will not be able to decode the information. FHSS transmissions appear as random narrowband disturbances to an unintended user and for this reason, they can share the frequency bands with other conventional transmissions with minimal mutual interference. Similarly, the FHSS transmission will also be affected very little by the narrowband communication in the shared frequency band.
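The role of the shared hopping pattern can be sketched as follows. This is an illustrative model only: the seed value, the number of bands, and the function name are assumptions, and Python's general-purpose `random.Random` generator stands in for whatever pseudorandom pattern generator a real system would use (it is not cryptographically secure).

```python
import random


def hop_pattern(shared_seed, n_bands, n_hops):
    """Pseudorandom sequence of band indices derived from a shared seed."""
    rng = random.Random(shared_seed)
    return [rng.randrange(n_bands) for _ in range(n_hops)]


N_BANDS = 16      # assumed number of narrowband slices of the wideband channel
N_HOPS = 1000

tx_pattern = hop_pattern(2022, N_BANDS, N_HOPS)
rx_pattern = hop_pattern(2022, N_BANDS, N_HOPS)  # same seed, same pattern

# Fraction of hops that land in one particular band suffering a deep fade.
faded_band = 3
affected_fraction = tx_pattern.count(faded_band) / N_HOPS

print(tx_pattern == rx_pattern, affected_fraction)
```

Because transmitter and receiver derive the pattern from the same seed, they stay aligned, while a single faded band affects only the hops that happen to land in it.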

Chirp spread spectrum (CSS) uses chirp signals to encode data. Chirp signals are wideband sinusoidal signals whose frequencies ramp up or ramp down during the duration of the chirp pulse. Since the chirp signals are wideband they are resistant to fading. The chirp signals are inherently resistant to the Doppler effect because the chirp signals consist of sinusoids with frequency ramps. Due to this fact, the chirp signals are a good choice for precision distance measurement and ranging applications. Chirp spread spectrum is suitable for low power and low data rate wireless communications. Time hopping spread spectrum (THSS) uses pseudorandom coding sequences to modulate the period and duty cycle of a pulsed radio frequency carrier. The data symbols are transmitted in a time-multiplexed manner, whereby the time slots for the individual transmission channels are varied. The time duration between the slots of the transmission channel is pseudorandom. For the receiver to rearrange the received pulses to their correct order, the pseudorandom coding sequences must be known to the receiver. For any non-intended receiver having no information about the coding sequence, the received signal appears as a noise signal. THSS is the basis for the time division multiple access (TDMA) method of governing multiuser access on a shared channel.
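For the chirp signals mentioned above, a linear frequency ramp is the simplest case. The following sketch generates such a chirp; the linear ramp, the parameter names, and the example frequencies are assumptions chosen for illustration and are not taken from any particular CSS standard.

```python
import math


def linear_chirp(f_start_hz, f_stop_hz, n_samples, sample_rate_hz):
    """Samples of a cosine whose frequency ramps linearly over the pulse."""
    duration_s = n_samples / sample_rate_hz
    ramp_hz_per_s = (f_stop_hz - f_start_hz) / duration_s
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz
        # Phase is the integral of the instantaneous frequency.
        phase = 2.0 * math.pi * (f_start_hz * t + 0.5 * ramp_hz_per_s * t * t)
        samples.append(math.cos(phase))
    return samples


def instantaneous_frequency(f_start_hz, f_stop_hz, duration_s, t):
    """Instantaneous frequency of the linear chirp at time t."""
    return f_start_hz + (f_stop_hz - f_start_hz) * t / duration_s


up_chirp = linear_chirp(1e3, 5e3, n_samples=1000, sample_rate_hz=1e5)
down_chirp = linear_chirp(5e3, 1e3, n_samples=1000, sample_rate_hz=1e5)

print(len(up_chirp), instantaneous_frequency(1e3, 5e3, 0.01, 0.005))
```

An up-chirp and a down-chirp occupy the same band but are distinguishable by the direction of the ramp, which is one hypothetical way of mapping data onto chirps.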

Direct sequence spread spectrum (DSSS) is used as the basis for the code division multiple access (CDMA) standard of mobile phone networks, for the IEEE 802.11b specification used in Wi-Fi networks, and for the Global Positioning System (GPS) [70], [71]. An advantage of DSSS over FHSS is the following: in FHSS the data is sent over different carrier frequencies, so the channel frequency response needs to be uniform over the whole bandwidth. In DSSS, however, the carrier frequency remains the same, so the channel frequency response does not need to be uniform and can be shaped as a Gaussian/bell-shaped envelope centered at the carrier frequency.
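The basic DSSS operation (multiplying each data bit by a chip sequence and correlating at the receiver) can be sketched in a few lines. The example uses the 11-chip Barker sequence known from IEEE 802.11 DSSS; the constant interferer amplitude and the function names are assumptions made for illustration.

```python
# 11-chip Barker sequence as used for spreading in IEEE 802.11 DSSS.
BARKER_11 = [1, -1, 1, 1, -1, 1, 1, 1, -1, -1, -1]


def spread(bits, code):
    """Map bits to +1/-1 and multiply each bit by the whole chip sequence."""
    return [(1 if b else -1) * c for b in bits for c in code]


def despread(chips, code):
    """Correlate each block of chips with the code and take the sign."""
    n = len(code)
    bits = []
    for start in range(0, len(chips), n):
        block = chips[start:start + n]
        correlation = sum(x * c for x, c in zip(block, code))
        bits.append(1 if correlation > 0 else 0)
    return bits


data_bits = [1, 0, 1, 1, 0]
tx_chips = spread(data_bits, BARKER_11)

# Assumed disturbance: a constant offset three times the chip amplitude.
rx_chips = [x + 3.0 for x in tx_chips]

print(despread(rx_chips, BARKER_11) == data_bits)
```

The correlation over 11 chips gives the wanted signal a peak of magnitude 11, while the constant interferer contributes only the small sum of the code chips, so the bits are still recovered.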

Parallel sequence spread spectrum (PSSS) is another spread spectrum communication method that can be considered a parallel extension of the direct sequence spread spectrum (DSSS). PSSS is used as the modulation method for wireless data transmission in this dissertation and will be discussed at length in the remainder of this chapter.

## 2.2 Fundamentals of PSSS

After having discussed the different variants of the spread spectrum communication, a detailed mathematical analysis of the parallel sequence spread spectrum (PSSS) communication is presented in this section. A simple matrix representation of the PSSS modulation scheme is developed followed by the time domain signal representation. In the next step, to simplify the analysis of the proposed PSSS communication system, the continuous-time domain model is simplified to a discrete-time version. This is followed by the calculation of receiver sensitivity to determine the advantage of using PSSS as a spread spectrum technique.

#### 2.2.1 Mathematical Overview

In this section, a mathematical overview of the PSSS modulation scheme is presented. The basic matrix representation of PSSS in [72] is extended from a single data vector to define a whole frame of multiple data vectors. This simplifies the conversion to a time-domain signal representation. Finally, the sampled value of the correlators is represented with a discrete-time model for the system analysis. Maximum length sequences (MLS or m-sequences) are commonly used as coding sequences for PSSS [9] and are used as the basis of the following analysis unless specified otherwise. More about the choice of coding sequences for PSSS follows later in section 2.4.1.1.

#### 2.2.1.1 Matrix Representation

To explain the working principle of a PSSS based communication system, a matrix representation can be developed as explained below.

A pseudorandom coding sequence  $C_i$  of length N, can be represented as a vector of coding chips  $c_{mi}$  where  $m \in \{1, 2, ..., N\}$  and each coding chip has a duration of  $T_c$ .  $c_{mi} \in \{0, 1\}$  for unipolar codes or  $c_{mi} \in \{+1, -1\}$  for bipolar codes as explained in section 2.4.1.1.

$$C_i = [c_{1i} \ c_{2i} \ \dots \ c_{ki} \ \dots \ c_{Ni}]^T$$
(2-1)

where *T* represents the matrix transpose operator. For PSSS, a total of  $U \le N$  cyclically shifted versions of the chosen pseudorandom code are used to encode parallel data symbols, provided the cyclically shifted versions of the code are orthogonal to each other. A maximum of *N* cyclically shifted versions of the code  $C_i$  can be used. A better signal-to-noise ratio can be achieved by using fewer than the *N* possible cyclically shifted versions of the code at the cost of reduced net data transmission rate. The collection of the cyclically shifted versions of the coding sequences can be written in the form of a coding matrix as follows:

$$C = \begin{bmatrix} C_1 & C_2 & \dots & C_k & \dots & C_U \end{bmatrix} = \begin{bmatrix} c_{11} & \dots & c_{1k} & \dots & c_{1U} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ c_{k1} & \dots & c_{kk} & \dots & c_{kU} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ c_{N1} & \dots & c_{Nk} & \dots & c_{NU} \end{bmatrix}$$
(2-2)

The data is represented as the symbol vector $S_i$, which consists of symbols $s_{mi} \in \{\pm 1, \pm 3, \dots, \pm (M-1)\}$, where $M = 2^k$; $k \in \{1, 2, \dots\}$ represents the order of the digital modulation of the data and $m \in \{1, 2, \dots, U\}$ with $U \leq N$ indexes the data symbols encoded in parallel. Each symbol has a duration of $T_S = NT_C$. The symbols $s_{mi}$ are independent random variables, uniformly distributed over the sample space $\{\pm 1, \pm 3, \dots, \pm (M-1)\}$.

$$S_i = [s_{1i} \ s_{2i} \ \dots \ s_{ki} \ \dots \ s_{Ui}]^T$$
(2-3)

For a data frame of length L, the data matrix S can be written as a matrix consisting of L column vectors:

$$S = \begin{bmatrix} S_1 & S_2 & \dots & S_k & \dots & S_L \end{bmatrix} = \begin{bmatrix} s_{11} & \dots & s_{1k} & \dots & s_{1L} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ s_{k1} & \dots & s_{kk} & \dots & s_{kL} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ s_{U1} & \dots & s_{Uk} & \dots & s_{UL} \end{bmatrix}$$
(2-4)

The length L of the data frame depends on the channel coherence time after which the filter coefficients for the channel equalization filter must be recalculated to ensure reliable communication.

...

The PSSS vector  $P_i$  for a single data symbol vector  $S_i$  can be obtained by multiplying the data vector  $S_i$  with the coding matrix C. The elements of the PSSS vector  $P_i$  when represented sequentially as a signal constitute one PSSS sequence.

$$P_i = C.S_i \tag{2-5}$$

The PSSS matrix for the whole data frame of length L can be obtained similarly by multiplying the coding matrix C with the data symbol matrix S.

$$P = \begin{bmatrix} p_{11} & \dots & p_{1k} & \dots & p_{1L} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ p_{k1} & \dots & p_{kk} & \dots & p_{kL} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ p_{N1} & \dots & p_{Nk} & \dots & p_{NL} \end{bmatrix} = C.S$$

$$= \begin{bmatrix} c_{11} & \dots & c_{1k} & \dots & c_{1U} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ c_{k1} & \dots & c_{kk} & \dots & c_{kU} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ c_{N1} & \dots & c_{Nk} & \dots & c_{NU} \end{bmatrix} \cdot \begin{bmatrix} s_{11} & \dots & s_{1k} & \dots & s_{1L} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ s_{k1} & \dots & s_{kk} & \dots & s_{kL} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ s_{U1} & \dots & s_{Uk} & \dots & s_{UL} \end{bmatrix}$$
(2-6)

To recover the encoded symbols, the PSSS vectors are multiplied with the decoding matrix D. In the most general form, the decoding matrix D is the inverse of the coding matrix. For a coding matrix consisting of orthogonal codes, the decoding matrix is equal to the transpose (represented with <sup>T</sup>) of the coding matrix (since $A \cdot A^T = A^T \cdot A = I$ by definition, for orthogonal matrices). In the current case, however, the coding matrix consists of cyclically shifted versions of one m-sequence, and the condition $A \cdot A^T = A^T \cdot A = I$ does not directly hold. Instead, the decoding matrix can be obtained by taking the transpose of the coding matrix with complementary polarity, i.e., a bipolar decoding matrix for a unipolar coding matrix and vice versa, which gives favorable cyclic correlation properties as discussed in section 2.4.1.1. Additionally, the decoding matrix can be weighted to perform channel equalization as discussed in section 2.5. For the remainder of the discussion, it is assumed that the decoding matrix has complementary polarity to that of the coding matrix and is obtained by taking its transpose. The individual chips of the decoding matrix D are denoted as $c'_{ij}$, where ' represents some modification (complementary polarity and/or weighting) of the original coding chips $c_{ij}$.

$$D = C'^{T} = \begin{bmatrix} c'_{11} & \dots & c'_{1k} & \dots & c'_{1U} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ c'_{k1} & \ddots & c'_{kk} & \ddots & c'_{kU} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ c'_{N1} & \cdots & c'_{Nk} & \cdots & c'_{NU} \end{bmatrix}^{T} = \begin{bmatrix} c'_{11} & \dots & c'_{k1} & \dots & c'_{N1} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ c'_{1k} & \ddots & c'_{kk} & \ddots & c'_{Nk} \\ \vdots & & \vdots & & \vdots \\ c'_{1U} & \cdots & c'_{kU} & \cdots & c'_{NU} \end{bmatrix}$$
(2-7)

For m-sequences, using the complementary polarity of the coding and decoding sequences has the advantage that the two matrices are inverse of each other except for a scalar multiple. Using the principle of mathematical induction, it can be proved that for the case of m-sequences with complementary polarity, the product of the coding C and the decoding D matrices is an identity matrix with a scalar multiple equal to (N + 1)/2 [68], [72].

$$D_{bipolar} = \left(\frac{N+1}{2}\right) C_{unipolar}^{-1} = 2 \times C_{unipolar}^{T} - 1 = C_{bipolar}^{T}$$
(2-8)

$$D_{unipolar} = \left(\frac{N+1}{2}\right)C_{bipolar}^{-1} = \frac{C_{bipolar}^{T}+1}{2} = C_{unipolar}^{T}$$
(2-9)

Note that adding a scalar to a matrix, or subtracting it from a matrix, is understood here as adding or subtracting the scalar multiplied by an all-ones matrix of the same dimensions.

The conversion between unipolar and bipolar versions of the code follows according to the following formula:

$$x_{bipolar} = 2 \times x_{unipolar} - 1 \tag{2-10}$$

The recovered symbol matrix S' consists of the recovered symbol vectors $S'_i$, each of which consists of the symbols $s'_{mi}$ where $m \in \{1, 2, ..., U\}$ with $U \le N$. It is obtained by multiplying the decoding matrix with the PSSS matrix, i.e., S' = D.P

$$S' = \begin{bmatrix} s'_{11} & \dots & s'_{1k} & \dots & s'_{1L} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ s'_{k1} & \dots & s'_{kk} & \dots & s'_{kL} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ s'_{U1} & \dots & s'_{Uk} & \dots & s'_{UL} \end{bmatrix} = D.P$$

$$= \begin{bmatrix} c'_{11} & \dots & c'_{k1} & \dots & c'_{N1} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ c'_{1k} & \dots & c'_{kk} & \dots & c'_{Nk} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ c'_{1U} & \dots & c'_{kU} & \dots & c'_{NU} \end{bmatrix} \cdot \begin{bmatrix} p_{11} & \dots & p_{1k} & \dots & p_{1L} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ p_{k1} & \dots & p_{kk} & \dots & p_{kL} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ p_{N1} & \dots & p_{Nk} & \dots & p_{NL} \end{bmatrix}$$
(2-11)
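The matrix relations (2-2) to (2-11) can be checked numerically with a minimal sketch. It assumes N = U = 15, a unipolar coding matrix and a bipolar decoding matrix. The LFSR recurrence a[n] = a[n-1] XOR a[n-4], the helper names `m_sequence` and `matmul`, and the example symbol values are illustrative choices that are not taken from the text.

```python
# Sketch of the PSSS matrix model for one data vector (L = 1), N = U = 15.

def m_sequence(taps=(1, 4), degree=4):
    """Return one period (2**degree - 1 chips) of a unipolar m-sequence.

    Assumption: recurrence a[n] = a[n-1] XOR a[n-4], all-ones seed.
    """
    seq = [1] * degree
    n_chips = 2 ** degree - 1
    while len(seq) < n_chips:
        seq.append(seq[-taps[0]] ^ seq[-taps[1]])
    return seq

def matmul(a, b):
    """Plain nested-list matrix product a . b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

seq = m_sequence()
N = len(seq)          # code length
U = N                 # number of symbols encoded in parallel

# Coding matrix C (N x U): column i is the sequence cyclically shifted by i chips.
C = [[seq[(m - i) % N] for i in range(U)] for m in range(N)]

# Decoding matrix D (U x N): transpose of C with complementary (bipolar) polarity.
D = [[2 * C[m][i] - 1 for m in range(N)] for i in range(U)]

gain = (N + 1) // 2   # scalar multiple expected in D . C
DC = matmul(D, C)

# One data vector of PAM-16 symbols (arbitrary example values).
symbols = [15, -15, 1, -1, 3, -3, 5, -5, 7, -7, 9, -9, 11, -11, 13]
S = [[s] for s in symbols]            # U x 1 symbol vector
P = matmul(C, S)                      # N x 1 PSSS vector, as in (2-5)
S_rec = matmul(D, P)                  # U x 1 recovered vector, as in (2-11)
recovered = [row[0] // gain for row in S_rec]

print("D.C is (N+1)/2 times identity:",
      DC == [[gain if r == c else 0 for c in range(U)] for r in range(U)])
print("recovered symbols:", recovered)
```

Under these assumptions the product D.C is a scaled identity matrix, so dividing the decoder output by (N + 1)/2 returns the transmitted symbols.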

#### 2.2.1.2 Signal Representation


Following the basic matrix representation, the time domain signal representation of the system can be developed. Assuming that the amplitude of the signal remains constant for the duration of the chip i.e.,  $T_c$ , the signal can be represented using rectangular pulses or rect(t) function.

$$rect(t) = \begin{cases} 1; & |t| \le 1/2 \\ 0; & |t| > 1/2 \end{cases}$$
(2-12)

The individual symbols of the data symbol vector can be represented as rect(t) signals. The  $i^{th}$  symbol  $s_i$  of each data vector  $S_m$  of a data frame with L data vectors can be represented as a sequence of weighted rect(t) functions each having a duration of  $T_s$ .

$$s_{i_{Frame}}(t) = \sum_{j=1}^{L} s_{ij} \cdot rect\left(\frac{t - \frac{T_S}{2} - (j - 1)T_S}{T_S}\right)$$
(2-13)

The *i*<sup>th</sup> coding sequence corresponding to the column vector  $C_i$  in (2-1) can be represented as a sum of  $N \operatorname{rect}(t)$  functions each with a duration of  $T_C$  making up a total coding sequence length of  $T_S = NT_C$  as follows:

$$C_{i}(t) = \sum_{k=1}^{N} c_{ki} \cdot rect\left(\frac{t - \frac{T_{c}}{2} - (k - 1)T_{c}}{T_{c}}\right)$$
(2-14)

For a data frame of *L* data vectors with duration  $T_L = LT_S$ , the same coding sequence is repeated every  $T_s = NT_C$  seconds. The periodic signal  $C_{i_{Frame}}(t)$  consists of *L* repetitions of the coding sequence signal  $C_i(t)$  with a period of  $T_S = NT_C$ .

$$\begin{aligned} C_{i_{Frame}}(t) &= \sum_{j=1}^{L} C_i(t - (j - 1)T_S) \\ &= \sum_{j=1}^{L} \sum_{k=1}^{N} c_{ki} \cdot rect \left( \frac{(t - (j - 1)T_S) - \frac{T_C}{2} - (k - 1)T_C}{T_C} \right) \end{aligned} \tag{2-15}$$

To encode the $i^{th}$ symbol of each data vector, the signal $s_{i_{Frame}}(t)$ is multiplied with the periodic coding sequence signal $C_{i_{Frame}}(t)$. The resulting product is the DSSS encoded signal $p_{i_{Frame}}(t)$ corresponding to the data signal $s_{i_{Frame}}(t)$.

$$\begin{aligned} p_{i_{Frame}}(t) &= s_{i_{Frame}}(t) \cdot C_{i_{Frame}}(t) \\ &= \sum_{j=1}^{L} s_{ij} \cdot rect \left(\frac{t - \frac{T_{S}}{2} - (j - 1)T_{S}}{T_{S}}\right) \cdot \sum_{j=1}^{L} \sum_{k=1}^{N} c_{ki} \cdot rect \left(\frac{(t - (j - 1)T_{S}) - \frac{T_{C}}{2} - (k - 1)T_{C}}{T_{C}}\right) \end{aligned} \tag{2-16}$$

Since $T_S = NT_C$, a rect(t) function of duration $T_S$ can be written as a sum of N time-shifted rect(t) functions of duration $T_C$, i.e.,

$$rect\left(\frac{t-\frac{T_{S}}{2}}{T_{S}}\right) = \sum_{k=1}^{N} rect\left(\frac{t-\frac{T_{C}}{2}-(k-1)T_{C}}{T_{C}}\right)$$
(2-17)

The two rect(t) functions in the expression for  $p_{i_{Frame}}(t)$  can thus be combined into one single rect(t) function.

$$p_{i_{Frame}}(t) = \sum_{j=1}^{L} \sum_{k=1}^{N} s_{ij} c_{ki} \cdot rect \left( \frac{t - \frac{T_C}{2} - (k-1)T_C - (j-1)T_S}{T_C} \right)$$
(2-18)

The above expression is a complete time-domain description of DSSS encoding of data i.e., a series of data symbols  $s_{i_{Frame}}(t)$  in a frame of length *L* each encoded with the same coding sequence signal  $C_i(t)$ .

To obtain the PSSS signal  $P_{Frame}(t)$  for the entire frame of length L (i.e., L data vectors), the encoded chips from all the U parallel DSSS encoded signals are added up. The resulting PSSS signal will have (non-binary) multiple amplitude levels which depend on the summation of the different DSSS encoded chip components. The possible PSSS chip amplitude levels for m-sequences are mentioned in Table 2-2.

$$\begin{aligned} P_{Frame}(t) &= \sum_{i=1}^{U} p_{i_{Frame}}(t) \\ &= \sum_{i=1}^{U} \sum_{j=1}^{L} \sum_{k=1}^{N} s_{ij} c_{ki} \cdot rect \left( \frac{t - \frac{T_C}{2} - (k - 1)T_C - (j - 1)T_S}{T_C} \right) \end{aligned} \tag{2-19}$$

The above expression is a complete time-domain description of PSSS encoding of data i.e., a frame of *L* data vectors with each data vector consisting of *U* symbols that are encoded in parallel (simultaneously) with *U* coding vectors of length *N*. The chip duration is  $T_c$  and the symbol duration is  $T_s = N \times T_c$  which makes the frame duration equal to  $L \times N \times T_c$ . Note that to simplify the discussion, only one data vector instead of the whole frame can be considered. This can be done easily by setting j = 1 in (2-13)-(2-19). For example, the PSSS signal for a single data vector can be written by setting j = 1 in (2-19).

$$\begin{aligned} P(t) &= \sum_{i=1}^{U} p_i(t) \\ &= \sum_{i=1}^{U} s_i(t) \cdot C_i(t) = \sum_{i=1}^{U} \sum_{k=1}^{N} s_i c_{ki} \cdot rect \left(\frac{t - \frac{T_C}{2} - (k - 1)T_C}{T_C}\right) \end{aligned} \tag{2-20}$$

The PSSS signal P(t) is the baseband signal that is up-converted and transmitted using the RF-frontend. After down-conversion at the receiver end, the received signal r(t) contains additional noise and the distortions caused by the non-linearities of the RF-frontend circuits. To decode (recover) the information sent by the transmitter, the received signal r(t) is correlated with the corresponding decoding sequences. The correlation process involves multiplying the received signal with the U cyclically shifted decoding sequences and integrating the respective products over the symbol duration $T_S = NT_C$ in parallel to recover all U data symbols simultaneously. At the end of the integration interval, the output of each correlator is sampled using a sample-and-hold circuit (S/H) to recover the $i^{th}$ transmitted symbol $s'_i$. The integrator is then reset to prepare it for the next PSSS sequence signal of the frame $P_{Frame}(t)$.

The $i^{th}$ decoding sequence can be represented as a sum of $N \operatorname{rect}(t)$ functions each having the duration $T_C$ and the amplitude value $c'_{ik}$ that is the transposed, (possibly) weighted, and complementarily polarized version (i.e., unipolar, or bipolar version) of the coding sequence chip $c_{ki}$. Note the transpose relation to the coding sequence.

$$D_{i}(t) = \sum_{k=1}^{N} c_{ik}' \cdot rect\left(\frac{t - \frac{T_{c}}{2} - (k-1)T_{c}}{T_{c}}\right)$$
(2-21)

The *i*<sup>th</sup> symbol of the data vector  $s'_i$  can be recovered by multiplying the received signal r(t) with the decoding sequence signal  $D_i(t)$ . The received PSSS signal r(t) contains noise in addition to the PSSS signal P(t). For all practical purposes, the noise can be considered to be additive with a zero-mean Gaussian amplitude distribution and with equal noise power at all frequencies i.e., with a two-sided power spectral density  $\phi_n(f) = N_o/2$ . This type of noise signal is commonly called additive white Gaussian noise (AWGN) represented as n(t) below.

$$r(t) = P(t) + n(t) = \sum_{i=1}^{U} p_i(t) + n(t) = \sum_{i=1}^{U} s_i(t) \cdot C_i(t) + n(t)$$
(2-22)

The sampled value after the completion of the correlation of the received signal r(t) with the  $i^{th}$  decoding sequence signal gives the  $m^{th}$  symbol  $s'_{mi}$  of the data vector  $S_i$ . In terms of vector signal space, the sampled value of the correlation of the received signal r(t) with the decoding sequence results in the projection  $s'_{mi}$ .

$$\begin{aligned} s'_{mi} &= \frac{1}{T_S} \int_{mT_S}^{(m+1)T_S} r(t) D_i(t) dt = \frac{1}{T_S} \int_{mT_S}^{(m+1)T_S} (P(t) + n(t)) D_i(t) dt \\ &= \frac{1}{T_S} \int_{mT_S}^{(m+1)T_S} \sum_{i=1}^{U} p_i(t) D_i(t) dt + \frac{1}{T_S} \int_{mT_S}^{(m+1)T_S} n(t) D_i(t) dt \end{aligned} \tag{2-23}$$

The noise signal n(t) is a zero-mean random Gaussian process, whose projection on the orthogonal coordinates of the signal space, represented by the second (noise) term of Equation (2-23), can be represented by a zero-mean random variable $N_i$. Similarly, the correlation of the received PSSS signal P(t) with the decoding sequence, represented by the first (signal) term of Equation (2-23), can be represented as a vector P with components $A_p \widehat{P_i}$. Thus,

$$s'_{mi} = A_p \widehat{P_i} + N_i \tag{2-24}$$

The expected value of the random variable  $N_i$  is given by,

$$\boldsymbol{E}[\boldsymbol{N}_{i}] = \boldsymbol{E}\left[\frac{1}{T_{S}}\int_{mT_{S}}^{(m+1)T_{S}}n(t).D_{i}(t)\,dt\right] = \frac{1}{T_{S}}\int_{mT_{S}}^{(m+1)T_{S}}\boldsymbol{E}[n(t)].D_{i}(t)\,dt = 0 \qquad (2-25)$$

where E[n(t)] = 0, since n(t) is a zero-mean random process.

The expected value of the random variable  $s'_{mi}$  is,

$$\boldsymbol{E}[\boldsymbol{s}_{mi}'] = \boldsymbol{E}[A_p \widehat{\boldsymbol{P}}_i + \boldsymbol{N}_i] = A_p \boldsymbol{E}[\widehat{\boldsymbol{P}}_i] + \boldsymbol{E}[\boldsymbol{N}_i] = A_p \boldsymbol{E}[\widehat{\boldsymbol{P}}_i]$$
(2-26)

The second (noise) term of Equation (2-23) can be ignored here because its expected value is zero, so it does not contribute to the expected value of the random variable $s'_{mi}$ [8]. Using the expressions for $s_i(t)$ and $C_i(t)$, the first (signal) term of Equation (2-23) can be expanded to,

$$s'_{mi} = \frac{1}{T_S} \int_{mT_S}^{(m+1)T_S} \sum_{i=1}^{U} \sum_{k=1}^{N} s_i \cdot c_{ki} \cdot c'_{ki} \cdot rect \left(\frac{t - \frac{T_C}{2} - (k-1)T_C}{T_C}\right) dt \qquad (2-27)$$

A row vector $s'_{mFrame}$ consisting of the $m^{th}$ recovered symbol of each data vector in a frame of L data vectors can be recovered by multiplying the received signal r(t) with the periodic decoding vector signal $D_{i_{Frame}}(t)$ which consists of L repetitions of the $i^{th}$ decoding vector signal $D_i(t)$, similar to the expression for the periodic coding vector signal $C_{i_{Frame}}(t)$ in (2-15), resulting in

$$s'_{mFrame} = \frac{1}{T_S} \int_{mT_S}^{(m+1)T_S} \sum_{j=1}^{L} \sum_{i=1}^{U} \sum_{k=1}^{N} s_{ij} \cdot c_{ki} \cdot c'_{ki} \cdot rect \left( \frac{t - \frac{T_C}{2} - (k-1)T_C - (j-1)T_S}{T_C} \right) dt \tag{2-28}$$

Finally, to recover all U transmitted symbols of each data vector simultaneously, U copies of the correlator hardware are used in parallel, each having an individual decoding sequence.
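The correlation receiver described by (2-21) to (2-23) can be illustrated with a discrete-time sketch in which each chip is held for a few samples and the integral becomes a sum. The length-15 example sequence, the oversampling factor of 8, the noise standard deviation of 0.5, the choice U = 5 and all function names are assumptions made for illustration only.

```python
import random

# Assumed unipolar m-sequence of length 15 (illustrative choice).
SEQ = [1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
N = len(SEQ)
OSR = 8  # samples per chip, standing in for the rect pulse of width T_C

def coding_waveform(i):
    """Samples of C_i(t): i-th cyclic shift, each chip held for OSR samples."""
    return [SEQ[(k - i) % N] for k in range(N) for _ in range(OSR)]

def decoding_waveform(i):
    """Samples of D_i(t): same shift with complementary (bipolar) polarity."""
    return [2 * c - 1 for c in coding_waveform(i)]

def psss_waveform(symbols):
    """Samples of P(t) for one data vector: sum of the U encoded signals."""
    waves = [coding_waveform(i) for i in range(len(symbols))]
    return [sum(s * w[n] for s, w in zip(symbols, waves))
            for n in range(N * OSR)]

def correlate(received, i):
    """Integrate-and-dump: (1/T_S) * integral of r(t) * D_i(t) over one symbol."""
    d = decoding_waveform(i)
    return sum(r * x for r, x in zip(received, d)) / len(received)

symbols = [3, -1, 7, -5, 1]                  # U = 5 parallel PAM symbols
tx = psss_waveform(symbols)

rng = random.Random(0)                       # fixed seed for repeatability
rx = [p + rng.gauss(0.0, 0.5) for p in tx]   # additive Gaussian noise samples

# With unipolar coding and bipolar decoding the normalised correlator output
# carries the factor (N + 1) / (2N), which is divided out here.
scale = (N + 1) / (2 * N)
clean = [correlate(tx, i) / scale for i in range(len(symbols))]
noisy = [correlate(rx, i) / scale for i in range(len(symbols))]

print("noise-free estimates:", [round(v, 6) for v in clean])
print("noisy estimates:     ", [round(v, 3) for v in noisy])
```

Each call to `correlate` plays the role of one of the U parallel correlators; in hardware these run simultaneously and are reset after every symbol period.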

### 2.2.2 PSSS System Analysis

After having discussed the time domain signal representation of the PSSS signals in section 2.2.1.2, in this section, the receiver sensitivity and peak to average power ratio (PAPR) of the PSSS based communication system are discussed. The following assumptions are used to simplify the analysis:

The PSSS signals are only considered for a single PSSS sequence rather than for a complete frame consisting of several PSSS sequences. This removes the time dependency from the data symbol variable i.e.,  $S_i(t)$  can be written as  $S_i$  because  $S_i(t)$  remains constant for the symbol duration  $T_s$ . Moreover, the chip rate is assumed to be arbitrarily high which causes the width of the  $rect(t/T_c)$  signals to asymptotically approach 0 s, thereby resulting in them being equal to unit impulse functions  $\delta(t)$ . Even for finite chip rates, the rect(t) functions can be replaced with unit impulse functions  $\delta(t)$  to simplify the analysis. Lastly, since the sampled value of the correlator output at the end of the symbol duration  $T_s$  and not the instantaneous value is relevant for decoding, the independent continuous time variable t can be replaced with  $nT_s$ . This helps to reduce the continuous time-domain analysis to discrete-time analysis.

$$\lim_{T_C \to 0} \operatorname{rect}\left(\frac{t - t_0}{T_C}\right) = \delta(t - t_0) \stackrel{\text{\tiny def}}{=} \delta(nT_C - n_0T_C) \Rightarrow \delta[n - n_0]$$
(2-29)

$S_i$ and $S_j$ are independent random variables with an equally likely distribution. The sample space of $S_i$ and $S_j$ is the symbol vector i.e., $S_i, S_j \in \{\pm 1, \pm 3, \dots, \pm (M - 1)\}$ where $M = 2^k$; $k \in \{1, 2, \dots\}$ represents the order of the digital modulation used. The coding sequences are represented by $C_i[n]$, where the index *i* denotes one of the $U \leq N$ pseudorandom codes, *U* is the number of data symbols encoded in parallel, and *N* is the length of the coding sequences. M-sequences are commonly used as the coding sequences.

$$P[n] = \sum_{i=1}^{U} S_i C_i[n]$$
(2-30)

To obtain the power of the PSSS sequence, the voltage term is squared.

$$P^{2}[n] = \left[\sum_{i=1}^{U} S_{i}C_{i}[n]\right]^{2} = \sum_{i=1}^{U} S_{i}^{2}C_{i}^{2}[n] + \sum_{i\neq j} S_{i}C_{i}[n]S_{j}C_{j}[n]$$
(2-31)

Since the coding sequences are a function of time, we need to calculate the time-average power,

$$\begin{aligned} \langle P^{2}[n] \rangle &= \sum_{i=1}^{U} S_{i}^{2} \langle C_{i}^{2}[n] \rangle + \sum_{i \neq j} S_{i} S_{j} \langle C_{i}[n] C_{j}[n] \rangle \\ &= \langle C_{i}^{2}[n] \rangle \sum_{i=1}^{U} S_{i}^{2} + \langle C_{i}[n] C_{j}[n] \rangle \sum_{i \neq j} S_{i} S_{j} \end{aligned} \tag{2-32}$$

Note that $S_i$ and $S_j$ remain constant for the duration of the symbol $T_S$. The terms $\langle C_i^2[n] \rangle$ and $\langle C_i[n]C_j[n] \rangle$ assume different values depending on the choice of coding sequences.

#### 2.2.2.1 Case 1: Unipolar Coding

$$\langle C_i^2[n] \rangle = \frac{1}{U} \sum_{i=1}^U C_i^2[n] = \frac{1}{U} \left( \frac{U+1}{2} \right) = \frac{U+1}{2U}$$
 (2-33)

$$\langle C_i[n]C_j[n] \rangle = \frac{1}{U} \sum_{i=1}^{U} C_i[n] C_j[n] = \frac{1}{U} \left(\frac{U+1}{4}\right) = \frac{U+1}{4U}$$
 (2-34)

#### 2.2.2.2 Case 2: Bipolar Coding

$$\langle C_i^2[n] \rangle = \frac{1}{U} \sum_{i=1}^{U} C_i^2[n] = \frac{1}{U} (U) = 1$$
 (2-35)

$$\langle C_i[n]C_j[n] \rangle = \frac{1}{U} \sum_{i=1}^{U} C_i[n]C_j[n] = \frac{1}{U}(-1) = -\frac{1}{U}$$
 (2-36)
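The time averages in (2-33) to (2-36) can be reproduced for U = N = 15 with exact rational arithmetic. The sketch below assumes an example length-15 m-sequence and compares one arbitrarily chosen pair of shifts (0 and 3) for the cross term; the sequence and the helper names are illustrative and not taken from the text.

```python
from fractions import Fraction

# Assumed unipolar m-sequence of length 15 (illustrative choice).
SEQ = [1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
N = len(SEQ)

def shifted(i, bipolar=False):
    """Cyclic shift i of the sequence, optionally mapped to +1/-1 chips."""
    chips = [SEQ[(n - i) % N] for n in range(N)]
    return [2 * c - 1 for c in chips] if bipolar else chips

def time_average(a, b):
    """Average of a[n] * b[n] over one code period, as an exact fraction."""
    return Fraction(sum(x * y for x, y in zip(a, b)), N)

uni_auto = time_average(shifted(0), shifted(0))               # <C_i^2>
uni_cross = time_average(shifted(0), shifted(3))              # <C_i C_j>, i != j
bip_auto = time_average(shifted(0, True), shifted(0, True))
bip_cross = time_average(shifted(0, True), shifted(3, True))

print("unipolar:", uni_auto, uni_cross)
print("bipolar: ", bip_auto, bip_cross)
```

For this example the unipolar averages equal (U + 1)/(2U) and (U + 1)/(4U), and the bipolar averages equal 1 and -1/U, in agreement with the four expressions above.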

 $S_i$  and  $S_j$  are random numbers so we need to find the (statistical) expected value of these variables to calculate the average power of the PSSS sequence.

$$P_{avg,PSSS} = E\langle P^2[n] \rangle = \langle C_i^2[n] \rangle E\left[\sum_{i=1}^U S_i^2\right] + \langle C_i[n]C_j[n] \rangle E\left[\sum_{i\neq j} S_iS_j\right]$$
(2-37)

The sample space of  $S_i$  and  $S_j$  is the symbol vector i.e.,  $S_i, S_j \in \{\pm 1, \pm 3, \dots, \pm (M-1)\}$ where  $M = 2^k$ ;  $k \in \{1, 2, \dots\}$  represents the order of the digital modulation used.  $S_i$  and  $S_j$ are independent and equally likely random variables. This results in,

$$\boldsymbol{E}[S_i] = \boldsymbol{E}[S_j] = 0 \tag{2-38}$$

Moreover, for any two independent random variables X and Y

$$\boldsymbol{E}(XY) = \boldsymbol{E}(X)\boldsymbol{E}(Y) \tag{2-39}$$

which leads to

$$\boldsymbol{E}\left[\sum_{i\neq j} S_i S_j\right] = \sum_{i\neq j} \boldsymbol{E}[S_i S_j] = \sum_{i\neq j} \boldsymbol{E}[S_i] \boldsymbol{E}[S_j] = 0$$
(2-40)

$$P_{avg,PSSS} = \langle C_i^2[n] \rangle \boldsymbol{E}\left[\sum_{i=1}^U S_i^2\right] = \langle C_i^2[n] \rangle \boldsymbol{U} \boldsymbol{E}[S_i^2] = \langle C_i^2[n] \rangle \boldsymbol{U} \sum_{k=1}^M \frac{S_k^2}{M}$$
(2-41)
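Equation (2-41) can be cross-checked by exhaustive enumeration on a deliberately small system. The sketch below assumes N = U = 3 with the length-3 m-sequence 1, 1, 0 and PAM-4 symbols, so that all 4^3 data vectors can be visited; these parameters and the function name `avg_power` are illustrative assumptions and differ from the N = 15, PAM-16 system discussed in the text.

```python
from fractions import Fraction
from itertools import product

SEQ = [1, 1, 0]            # assumed length-3 m-sequence, small enough to enumerate
N = U = len(SEQ)
M = 4                      # PAM-4 for the exhaustive check
ALPHABET = list(range(-(M - 1), M, 2))   # [-3, -1, 1, 3]

def avg_power(bipolar):
    """Average of <P^2[n]> over all equally likely data vectors."""
    codes = [[(2 * SEQ[(n - i) % N] - 1) if bipolar else SEQ[(n - i) % N]
              for n in range(N)] for i in range(U)]
    total = Fraction(0)
    for data in product(ALPHABET, repeat=U):
        chips = [sum(s * c[n] for s, c in zip(data, codes)) for n in range(N)]
        total += Fraction(sum(p * p for p in chips), N)   # time average
    return total / M ** U                                  # statistical average

mean_sq = Fraction(sum(s * s for s in ALPHABET), M)        # E[S_i^2]
predicted_unipolar = Fraction(N + 1, 2 * N) * U * mean_sq  # (2-41) with (2-33)
predicted_bipolar = 1 * U * mean_sq                        # (2-41) with (2-35)

print("unipolar:", avg_power(False), "predicted:", predicted_unipolar)
print("bipolar: ", avg_power(True), "predicted:", predicted_bipolar)
```

The enumeration agrees with the closed-form prediction because the cross terms average to zero over the symmetric symbol alphabet, which is the step taken in (2-40).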

Peak to average power ratio (PAPR) is one of the most important parameters for the system design of transceivers employing complex modulation schemes. The *PAPR* value without the use of any spreading technique is given by,

$$PAPR_{non-PSSS} = \frac{P_{peak,non-PSSS}}{P_{avg,non-PSSS}} = \frac{(M-1)^2}{\sum_{k=1}^{M} \left(\frac{S_k^2}{M}\right)} = \frac{M(M-1)^2}{\sum_{k=1}^{M} S_k^2}$$
(2-42)

To calculate the PAPR of the PSSS modulation scheme we need to find the maximum possible power of the transmitted PSSS sequence. The peak amplitude of the PSSS sequence depends on the type of coding sequences used i.e.,

$$A_{peak,PSSS_{unipolar-coding}} = \frac{U+1}{2} A_{peak,non-PSSS} = \frac{U+1}{2} (M-1)$$
(2-43)

$$A_{peak,PSSS_{bipolar-coding}} = U A_{peak,non-PSSS} = U (M-1)$$
(2-44)

To obtain the signal power, the amplitude term is squared:

$$P_{peak,PSSS} = A_{peak,PSSS}^2 \tag{2-45}$$

The PAPR value for the case of unipolar coding is calculated to be:

$$PAPR_{PSSS_{unipolar-coding}} = \frac{\left(\frac{U+1}{2}\right)^2 (M-1)^2}{U\left(\frac{U+1}{2U}\right) \sum_{k=1}^M \frac{S_k^2}{M}} = \left(\frac{U+1}{2}\right) \frac{M(M-1)^2}{\sum_{k=1}^M S_k^2}$$
(2-46a)

$$PAPR_{PSSS_{unipolar-coding}} = \left(\frac{U+1}{2}\right) PAPR_{non-PSSS}$$
(2-46b)

The PAPR value for the case of bipolar coding is calculated as follows:

$$PAPR_{PSSS_{bipolar-coding}} = \frac{(U)^2 (M-1)^2}{U \sum_{k=1}^M \frac{S_k^2}{M}} = U \frac{M(M-1)^2}{\sum_{k=1}^M S_k^2}$$
(2-47a)

$$PAPR_{PSSS_{bipolar-coding}} = U \times PAPR_{non-PSSS}$$
(2-47b)

The PAPR value of the PSSS system employing unipolar coding is 2U/(U + 1) times smaller than that of the PSSS system employing bipolar coding.

For the current system, the number of data streams encoded in parallel is 15 i.e., U = N = 15 whereas the digital modulation used is PAM-16 i.e., M = 16. The PAPR values for the current system compare as follows:

$$PAPR_{non-PSSS} = 2.65 = 4.23 \text{ dB}$$
 (2-48a)

$$PAPR_{PSSS_{unipolar-coding}} = 21.20 = 13.26 \text{ dB}$$
(2-48b)

$$PAPR_{PSSS_{bipolar-coding}} = 39.75 = 15.99 \text{ dB}$$
(2-48c)
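The values in (2-48a) to (2-48c) follow directly from (2-42), (2-46b) and (2-47b). A short sketch of the arithmetic for M = 16 and U = 15 is given below; the variable names are illustrative. Without intermediate rounding the linear unipolar and bipolar values come out near 21.18 and 39.71, marginally below the figures quoted above, which appear to be scaled from the rounded value 2.65; the dB values are unaffected at two decimal places.

```python
import math

M, U = 16, 15
alphabet = range(-(M - 1), M, 2)               # -15, -13, ..., 13, 15

mean_sq = sum(s * s for s in alphabet) / M     # average symbol power
papr_plain = (M - 1) ** 2 / mean_sq            # (2-42)
papr_unipolar = (U + 1) / 2 * papr_plain       # (2-46b)
papr_bipolar = U * papr_plain                  # (2-47b)

def db(x):
    """Power ratio expressed in decibels."""
    return 10 * math.log10(x)

for name, value in [("non-PSSS", papr_plain),
                    ("PSSS unipolar", papr_unipolar),
                    ("PSSS bipolar", papr_bipolar)]:
    print(f"{name}: {value:.2f} = {db(value):.2f} dB")
```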

### 2.2.3 Receiver Sensitivity Analysis

In this section, the receiver sensitivity is calculated with and without the use of PSSS. If PAM-16 is chosen as the digital modulation, then data communication at the rate of 100 Gbps requires a symbol rate of 25 GS/sec. If the average energy per bit is represented by  $E_b$  and the (one-sided) noise spectral density is represented by  $N_o$  then the receiver sensitivity  $P_{RX_{min}non-PSSS}$ , defined as the minimum signal power required at the receiver end for a given bit error rate (BER) specification, can be calculated as follows:

$$\frac{P_{RX_{min}non-PSSS}}{P_N} = \left(\frac{E_b}{N_o}\right) \times \left(\frac{f_b}{B}\right)$$
(2-49)

where $f_b$ is the bit rate, and *B* represents the noise bandwidth of the system assuming a brick-wall filter response. The $E_b/N_o$ vs. BER curves, commonly called waterfall curves owing to their shape, can be plotted for different digital modulation schemes.

$P_N = kTFB$ is the total noise power at the input of the receiver, where $k = 1.38 \times 10^{-23} \, J/K$ is the Boltzmann constant, *T* is the temperature in kelvin and *F* is the noise factor defined as the ratio of the input signal to noise ratio (SNR) to the output SNR as shown in Equation (2-51).

$$P_{RX_{min}non-PSSS} = P_N \times \left(\frac{E_b}{N_o}\right) \times \left(\frac{f_b}{B}\right) = kTFB \times \left(\frac{E_b}{N_o}\right) \times \left(\frac{f_b}{B}\right) = kTFf_b \left(\frac{E_b}{N_o}\right) \quad (2-50)$$

The noise factor value F in the dB scale is commonly known as the noise figure NF.

$$F = \left(\frac{SNR_{in}}{SNR_{out}}\right); \quad NF = 10 \log(F) \tag{2-51}$$

The value of  $E_b/N_o$  required to achieve  $BER = 10^{-3}$  with PAM-16 can be read from Fig. 2-1 to be 19.35 dB.

$$P_{RX_{min}non-PSSS(dBm)} = 10 \log(kT) + NF + \left(\frac{E_b}{N_o}\right)_{dB} + 10 \log(f_b)$$
(2-52)

Assuming room temperature i.e., T = 290 K, the term  $10 \log(kT)$  evaluates to -174 dBm/Hz. Using a receiver noise figure NF value of 12 dB as an example, we can calculate the receiver sensitivity for the target data rate of  $f_b = 100 Gbps = 1 \times 10^{11} bps$ .

$$P_{RX_{min}non-PSSS(dBm)} = -174 + 12 + 10 \log(10^{11}) + 19.35 = -32.65 \, dBm \quad (2-53)$$
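The sensitivity calculation in (2-52) and (2-53) can be scripted as a small helper. The sketch below uses the values stated in the text (T = 290 K, NF = 12 dB, $E_b/N_o$ = 19.35 dB, $f_b$ = 100 Gbps); the function name `sensitivity_dbm` is an illustrative choice. Because the thermal noise density is evaluated without rounding to -174 dBm/Hz, the result differs from (2-53) by a few hundredths of a dB.

```python
import math

K_BOLTZMANN = 1.38e-23  # J/K

def sensitivity_dbm(bit_rate, nf_db, ebno_db, temperature=290.0):
    """Minimum receiver input power in dBm according to (2-52)."""
    # kT in W/Hz, converted to mW/Hz before taking the logarithm.
    noise_density_dbm_hz = 10 * math.log10(K_BOLTZMANN * temperature * 1e3)
    return noise_density_dbm_hz + nf_db + ebno_db + 10 * math.log10(bit_rate)

p_rx_min = sensitivity_dbm(bit_rate=100e9, nf_db=12.0, ebno_db=19.35)
print(f"non-PSSS sensitivity: {p_rx_min:.2f} dBm")
```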

Using PSSS, a lower receiver sensitivity can be achieved. The advantage stems from the use of multiple receivers with a lower symbol rate. The symbol rate, as well as the receiver sensitivity, is reduced by a factor equal to the spreading gain of the PSSS system that is equal to the length of the coding sequences used. Assuming coding sequences of length N = 15, the spreading gain is equal to $10 \log (N) = 10 \log (15) = 11.76 \, dB$. Note that the total transmitter power remains the same, but the receiver sensitivity is reduced by using multiple copies of hardware to process lower data rate streams in parallel.

Fig. 2-1 E<sub>b</sub>/N<sub>o</sub> in dB vs. bit error rate (BER) for PAM-16 using bertool from MATLAB.

Another important consideration is that the coding and decoding sequences are chosen with complementary polarity i.e., if unipolar coding sequences are used then the decoding sequences are chosen as bipolar and vice versa to obtain good cyclic correlation properties as discussed in section 2.4.1.1. The use of complementary polarity for coding and decoding sequences, however, causes a loss as compared to the traditional matched filter approach for the detection of data. For the case of m-sequences, this loss is equal to (N + 1)/2N owing to the 0's in the unipolar code, where N represents the length of the coding sequence. The required receiver power is, therefore, higher by a factor of 2N/(N + 1) than the spreading gain alone would suggest. For the current PSSS system in discussion with N = 15, the value of the sensitivity can be calculated as follows:

$$\begin{aligned} P_{RX_{min}PSSS(dBm)} &= P_{RX_{min}non-PSSS(dBm)} - 10 \log\left(\frac{N+1}{2N}\right) - 10 \log(N) \\ &= P_{RX_{min}non-PSSS(dBm)} - 10 \log\left(\frac{N+1}{2}\right) = -41.68 \, dBm \end{aligned} \tag{2-54}$$

Thus, despite the 2.73 *dB* of implementation loss due to the use of complementary coding and decoding sequence polarity, the use of PSSS still gives ~9 *dB* improvement in the receiver sensitivity. This is the minimum receiver power that is required to guarantee a bit error rate of  $10^{-3}$  with PAM-16 using PSSS.
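As a worked check of (2-54) for N = 15, starting from the non-PSSS sensitivity in (2-53) and rounding each term to two decimal places:

$$10 \log\left(\frac{N+1}{2N}\right) = 10 \log\left(\frac{16}{30}\right) \approx -2.73 \, dB, \qquad 10 \log(N) = 10 \log(15) \approx 11.76 \, dB$$

$$P_{RX_{min}PSSS(dBm)} \approx -32.65 + 2.73 - 11.76 = -41.68 \, dBm$$

The net improvement of $11.76 - 2.73 = 9.03 \, dB$ equals $10 \log((N+1)/2) = 10 \log(8)$.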

To calculate the range of a wireless communication system, the Friis equation can be used which relates the transmitter power  $P_{TX(dBm)}$  with the receiver power  $P_{RX(dBm)}$ , the transmitter and receiver antenna gains  $G_{TX(dBi)}$ ,  $G_{RX(dBi)}$  respectively, and the free space path loss given by  $10 \log(\lambda/4\pi r_{max})^2$  where  $\lambda = 3 \times 10^8/f_{RF}$  is the wavelength of the RF signal in the air with the carrier frequency  $f_{RF}$  and  $r_{max}$  is the range of the communication link.

$$P_{RX_{(dBm)}} = P_{TX_{(dBm)}} + G_{TX(dBi)} + G_{RX(dBi)} + 10 \log(\lambda/4\pi r_{max})^2$$
(2-55a)

$$P_{RX} = P_{TX} G_{TX} G_{RX} \left(\frac{\lambda}{4\pi r_{max}}\right)^2$$
(2-55b)

Using (2-50) and (2-55b), the value of the maximum range of the wireless communication link  $r_{max}$  (for a non-PSSS communication link) can be calculated to be,

$$r_{max,non-PSSS} = \left(\frac{\lambda}{4\pi}\right) \sqrt{\frac{P_{TX,non-PSSS} G_{TX} G_{RX}}{f_b \ k \ T \ F \ (E_b/N_o)_{\min}}}$$
(2-56)

All parameters in Equation (2-56) are known except for the transmitter power $P_{TX}$. To find the maximum range of the communication link, the $P_{TX}$ can be defined as the maximum power that ensures linear operation of the TX power amplifier. The PAPR value in Equation (2-48a) can be used to determine the required power back-off of the transmitter RF-power amplifier for linear operation [12]. The peak value of the modulated signal is set at the output 1-dB compression point $P_{1dB_{out}}$ of the RF-power amplifier characteristic and the PAPR value is used to define the power back-off i.e., $P_{TX(dBm)} = P_{1dB_{out}(dBm)} - PAPR_{(dB)}$. To make calculations easier for comparisons below, the $P_{1dB_{out}}$ is assumed to be 0 dBm which makes $P_{TX} = 1/PAPR$ (in mW) or $P_{TX(dBm)} = -PAPR_{(dB)}$.

For PSSS (with unipolar coding and bipolar decoding sequences), the Friis equation gets modified to:

$$P_{RX,PSSS} = P_{RX,non-PSSS} / \left(\frac{N+1}{2}\right) = P_{TX,PSSS} G_{TX} G_{RX} \left(\frac{\lambda}{4\pi r_{max,PSSS}}\right)^2$$
(2-57)

where (2-54) was used in the non-logarithmic form. Using the value of  $P_{RX}$  from the above equation in the receiver sensitivity formula of (2-50) and re-arranging the terms gives the range of the PSSS communication system as follows:

$$r_{max,PSSS} = \left(\frac{\lambda}{4\pi}\right) \sqrt{\frac{\left(\frac{N+1}{2}\right) P_{TX,PSSS} G_{TX} G_{RX}}{f_b \ k \ T \ F \ (E_b/N_o)_{\min}}}$$
(2-58)

The peak value of the PSSS modulated signal is set at the output 1-dB compression point  $P_{1dB_{out}}$  of the RF-power amplifier characteristic and the PAPR value is used to define the power back-off i.e.,  $P_{TX,PSSS} = P_{1dB_{out}}/PAPR_{PSSS}$ . To make calculations easier, the  $P_{1dB_{out}}$  is again assumed to be 0 dBm i.e.,  $P_{TX,PSSS} = 1/PAPR_{PSSS}$ .

From (2-46b), $PAPR_{PSSS} = \left(\frac{U+1}{2}\right) \cdot PAPR_{non-PSSS}$ for unipolar coding, so that

$$P_{TX,PSSS} = \frac{1}{PAPR_{PSSS}} = \frac{\left(\frac{U+1}{2}\right)^{-1}}{PAPR_{non-PSSS}} = \left(\frac{U+1}{2}\right)^{-1} P_{TX,non-PSSS}$$
(2-59)

where  $P_{TX,non-PSSS} = 1/PAPR_{non-PSSS}$  follows from the assumption  $P_{1dB_{out}} = 0 \ dBm$ .

$$\begin{aligned} r_{max,PSSS} &= \left(\frac{\lambda}{4\pi}\right) \sqrt{\frac{\left(\frac{N+1}{2}\right) \left(\frac{U+1}{2}\right)^{-1} P_{TX,non-PSSS} G_{TX} G_{RX}}{f_b \ k \ T \ F \ (E_b/N_o)_{\min}}} \\ &= \sqrt{\left(\frac{N+1}{2}\right) \left(\frac{U+1}{2}\right)^{-1}} \ r_{max,non-PSSS} \end{aligned}$$
(2-60)

Thus, the range of the PSSS communication link depends on the number  $U \le N$  of cyclically shifted versions of the coding sequence that are used to encode and transmit  $U$  data symbols in parallel. For U = N, the maximum range of the wireless communication link is the same with and without PSSS modulation. Still, the advantage of using PSSS compared to no spreading is that, for the same range and transmit power, the receiver sensitivity of a PSSS receiver baseband is reduced by a factor of ((N + 1)/2) while providing immunity against multipath fading. On the other hand, for the case of U < N, i.e., if fewer than the N possible cyclically shifted versions of the chosen pseudorandom code are used to encode U parallel data symbols (instead of N possible symbols), a corresponding increase in range is obtained according to (2-60).
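The dependence of the range on U can be made concrete with a short numerical sketch. The function name `psss_range_ratio` is hypothetical; it only evaluates the square-root factor of (2-60) for unipolar coding and assumes that all other link parameters are identical for the PSSS and non-PSSS cases.

```python
import math


def psss_range_ratio(code_len: int, used_codes: int) -> float:
    """Return r_max,PSSS / r_max,non-PSSS according to (2-60).

    code_len   -- N, length of the coding sequence
    used_codes -- U, number of cyclic shifts actually used (1 <= U <= N)
    """
    if not 1 <= used_codes <= code_len:
        raise ValueError("U must satisfy 1 <= U <= N")
    return math.sqrt(((code_len + 1) / 2) / ((used_codes + 1) / 2))


if __name__ == "__main__":
    n = 15
    for u in (15, 7, 3, 1):
        print(f"N = {n}, U = {u:2d}: range ratio = {psss_range_ratio(n, u):.3f}")
```

For N = 15 the ratio is 1 for U = 15 and grows to 2 for U = 3, i.e., the range doubles when only 3 of the 15 possible code shifts carry data.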

In Table 2-1, the maximum range of a wireless communication link is calculated with and without the use of PSSS for different digital data modulation schemes [12]. The following assumptions are used in the calculations: T = 290 K; NF = 12 dB; the output 1-dB compression point of the RF transmitter power amplifier is  $P_{1dB_{out}} = 0 \ dBm$ ; the transmitter and receiver antenna gains,  $G_{TX}$  and  $G_{RX}$  respectively, are 25 dBi each; the carrier center frequency is 240 GHz; and double sideband RF modulation is assumed with an ideal brick-wall filter response for the noise bandwidth calculation. For calculations with PSSS modulation, a code length of 15 is assumed, which results in a coding or spreading gain of  $10 \log(15) = 11.76 \ dB$ . An implementation loss of  $10 \log(8/15) = -2.73 \ dB$  is assumed because of the choice of unipolar coding and bipolar decoding [12]. Note that the maximum range is the same with and without the use of PSSS modulation for corresponding digital data modulation schemes. The  $P_{TX}$  (dBm) entry for the PSSS case is defined as  $\left(\frac{N+1}{2}\right) P_{TX,PSSS} = \left(\frac{N+1}{2}\right) \times \left(\frac{1}{PAPR_{PSSS}}\right)$ .

|                  | No Spreading |       |       |        | PSSS – Unipolar coding, implementation loss = -2.73 dB, coding gain = 10 log(15) |       |       |        |
|------------------|--------------|-------|-------|--------|------|-------|-------|--------|
|                  | BPSK         | PAM-4 | PAM-8 | PAM-16                                                                            | BPSK | PAM-4 | PAM-8 | PAM-16 |
| S (bps/Hz)       | 1            | 2     | 3     | 4                                                                                 | 1    | 2     | 3     | 4      |
| $B_{BB}$ (GHz)   | 100          | 50    | 33.33 | 25                                                                                | 100  | 50    | 33.33 | 25     |
| B (GHz)          | 200          | 100   | 66.66 | 50                                                                                | 200  | 100   | 66.66 | 50     |
| PAPR (dB)        | 0            | 2.55  | 3.68  | 4.23                                                                              | 9.03 | 11.58 | 12.71 | 13.26  |
| $P_{TX}$ (dBm)   | 0            | -2.55 | -3.68 | -4.23                                                                             | 0    | -2.55 | -3.68 | -4.23  |
| $(E_b/N_o)$ (dB) | 6.77         | 10.49 | 14.75 | 19.35                                                                             | 6.77 | 10.49 | 14.75 | 19.35  |
| $r_{max}$ (m)    | 5.73         | 2.78  | 1.50  | 0.83                                                                              | 5.73 | 2.78  | 1.50  | 0.83   |

Table 2-1 Range calculation for a wireless communication link with and without the use of PSSS modulation for different digital data modulation schemes.
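The range values in Table 2-1 can be cross-checked with the following sketch, which solves the Friis equation for the distance at which the received power equals the receiver sensitivity. Several points are assumptions of this sketch rather than statements from the text: the function names are hypothetical, the bit rate is taken as 100 Gbps for every column (the product of S and $B_{BB}$), the  $(E_b/N_o)$  values are copied from the table instead of being derived, and the PAPR of M-ary PAM is computed with the standard closed form  $3(M-1)/(M+1)$ , which reproduces the PAPR row of the table.

```python
import math

BOLTZMANN = 1.380649e-23        # J/K
SPEED_OF_LIGHT = 299_792_458.0  # m/s

# (E_b/N_o)_min in dB, copied from Table 2-1 (not derived in this sketch).
EBN0_MIN_DB = {2: 6.77, 4: 10.49, 8: 14.75, 16: 19.35}


def pam_papr_db(order: int) -> float:
    """PAPR in dB of equiprobable M-ary PAM with levels +-1, +-3, ..., +-(M-1)."""
    return 10 * math.log10(3 * (order - 1) / (order + 1))


def max_range_m(p_tx_dbm: float, ebn0_db: float, bit_rate: float = 100e9,
                nf_db: float = 12.0, gain_dbi: float = 25.0,
                f_carrier: float = 240e9, temp_k: float = 290.0) -> float:
    """Distance at which the Friis received power equals the sensitivity."""
    noise_density_dbm = 10 * math.log10(BOLTZMANN * temp_k / 1e-3)  # about -174 dBm/Hz
    sensitivity_dbm = noise_density_dbm + 10 * math.log10(bit_rate) + nf_db + ebn0_db
    allowed_path_loss_db = p_tx_dbm + 2 * gain_dbi - sensitivity_dbm
    wavelength = SPEED_OF_LIGHT / f_carrier
    return wavelength / (4 * math.pi) * 10 ** (allowed_path_loss_db / 20)


def table_range_m(order: int, code_len: int = 0) -> float:
    """Range for one column of Table 2-1; code_len = 0 means no spreading."""
    papr_db = pam_papr_db(order)
    if code_len:
        spread_db = 10 * math.log10((code_len + 1) / 2)
        # PSSS with U = N: the extra back-off of (N+1)/2 and the factor
        # (N+1)/2 in the modified Friis equation cancel each other.
        p_tx_dbm = -(papr_db + spread_db) + spread_db
    else:
        p_tx_dbm = -papr_db  # P_1dB,out = 0 dBm minus the PAPR back-off
    return max_range_m(p_tx_dbm, EBN0_MIN_DB[order])


if __name__ == "__main__":
    for m in (2, 4, 8, 16):
        print(f"M = {m:2d}: no spreading {table_range_m(m):.2f} m, "
              f"PSSS-15 {table_range_m(m, 15):.2f} m")
```

With the exact Boltzmann constant and speed of light, the computed ranges agree with the table to within about 0.01 m; the small residual comes from rounding of the intermediate dB values.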

## 2.3 PSSS System Architecture

The PSSS transmitter baseband essentially requires the generation of pseudorandom coding sequences to encode parallel symbols of data, whereas the PSSS receiver baseband generates local copies of the decoding sequences to recover the transmitted symbols. However, additional functional blocks are required for the reliable functioning of the communication link. The proposed general system-level architecture of a PSSS based communication system using a Terahertz radio frequency frontend is shown in Fig. 2-2.

The input data stream is packed into frames whose length is determined, among other factors, by the channel coherence time, after which the channel equalization has to be performed again. Each frame contains start-delimiters to mark the start of the frame, training data for channel equalization, the payload data, and end-delimiters to mark the end of the frame. The chosen digital modulation scheme is then used to map the data onto M-ary symbols, where M is the modulation order. Using pulse amplitude modulation (PAM) instead of quadrature amplitude modulation (QAM) allows the use of a Costas loop to recover the carrier signal, with a limiting amplifier in front of it to convert the PSSS waveform to a BPSK waveform. The conversion to a BPSK waveform would not be possible with a QAM modulated PSSS waveform. The symbols are arranged in sets of N symbols to encode them in parallel using N cyclically shifted versions of a single coding sequence. The encoded chips from each symbol are added up to make the PSSS chips of the PSSS sequence. Some additional chips in the form of zero-padding or cyclic extension are added at the end of each PSSS sequence to protect against inter-sequence interference. The interference within the chips of a single PSSS sequence, i.e., inter-chip interference, can be reduced using pulse-shaping filters. The PSSS chips are up-converted and transmitted after pulse-shaping.

The received PSSS waveform is downconverted at the receiver end. A limiting amplifier converts the PSSS stream to a BPSK stream, which is used by a Costas loop to recover the carrier signal. The recovered carrier signal is synchronous with the incoming PSSS waveform and is used as the system clock for the receiver baseband to generate the local copies of the decoding sequences. The PSSS baseband signal is correlated with the decoding sequences using integrate and dump correlator (IDC) circuits, which consist of a multiplier followed by a resettable integrator. During the training phase, the transmitter sends the training data, which reaches the receiver after being distorted by the channel. The receiver compares the correlator output corresponding to the received training data with the ideal calculated response of the correlator to perform channel equalization. Instead of an additional filter to perform channel equalization, the binary values of the decoding sequences are weighted using the DAC circuits. This allows the PSSS decoding and channel equalization operations to be performed simultaneously. The output of the IDC is sampled periodically with a period equal to the symbol duration. An analog to digital converter (ADC) is used to convert the sampled output of the correlator to a digital representation to recover the original symbol. *N* parallel copies of the IDC and ADC circuits are used to simultaneously recover the *N* symbols transmitted in parallel. The recovered symbols are used to find the start and end of frame delimiters to correctly recover the training and the payload data [12], [68].

Fig. 2-2 Proposed system architecture for PSSS based communication link using a Terahertz radio frequency frontend [13].

### 2.3.1 Digital vs. Mixed-Signal Baseband Architecture

The system architecture described in Fig. 2-2 lists all the required components for a reliable communication system based on PSSS. However, the individual system components can be implemented in different ways. The baseband circuit is one of the most important system components that can be optimized for the given system specifications and the chosen system architecture.

The baseband circuits for high data rate communication, i.e., up to a few tens of Gbps, are commonly based on a digital architecture, i.e., most of the baseband signal processing takes place in the digital discrete-time domain. The transmitter baseband circuit uses a high data rate, wide bandwidth digital to analog converter (DAC) circuit to convert the digitally modulated information data to an analog waveform to be transmitted using the radio frequency (RF) frontend. On the receiver end, the received signal is downconverted and applied to a high resolution, wide bandwidth, and high sampling rate analog to digital converter (ADC) circuit to convert it into a digital format suitable for digital signal processing. The demodulation of the data to recover the information signal is also performed in the digital domain using digital signal processors (DSPs).

Fig. 2-3 Figurative representation of the proposed PSSS mixed-signal baseband architecture [13].

Repartitioning the digital/analog signal processing in the baseband with more focus on analog processing has the potential to outperform purely digital baseband processors in terms of power dissipation, complexity, and cost. As an example, assuming a baseband bandwidth of 25 GHz, a spectral efficiency of 4 bps/Hz is required to achieve the target data rate of 100 Gbps. To make a fair comparison, PAM-16 modulation is assumed for both digital and mixed-signal baseband signal processors, which requires a 4-bit ADC in the receiver baseband. With a traditional digital baseband architecture, the 4-bit ADC has to process the incoming baseband signal with a bandwidth of 25 GHz at a Nyquist sampling rate of at least 50 GS/s. If a spreading scheme like DSSS or PSSS is used with a mixed-signal baseband implementation, the requirements for the input bandwidth and the sampling rate of the ADC are reduced by a factor equal to the spreading gain of the system, which corresponds to the length of the coding sequences, i.e., the sampling frequency at the input of each of the 15 parallel ADCs is 1.667 GS/s. A figurative representation of the proposed PSSS mixed-signal baseband architecture is shown in Fig. 2-3. This figure is a modification of Fig. 1-2 showing the proposed changes in the architecture for a mixed-signal implementation.

### 2.3.2 Transmitter Baseband Architecture

The proposed mixed-signal transmitter baseband architecture is shown in Fig. 2-4 below. The transmitter baseband circuit uses XOR (exclusive OR) gates to encode the *N* parallel data symbols with individual pseudorandom coding sequences simultaneously. The coding sequence generators are clocked with a frequency of  $f_{chip}$ . The first encoded chips of each of the *N* data symbols are added up to make the first chip of the PSSS sequence. Continuing similarly, the *i*<sup>th</sup> encoded chips of the *N* data symbols are added up to make the *i*<sup>th</sup> PSSS chip until all *N* PSSS chips are generated. The PSSS chips during the guard interval of length *G* are copies of the first *G* PSSS chips. The PSSS chips are converted from digital to analog form using digital to analog converters (DACs) with the outputs in the form of currents. Note that the sampling rate of the DACs is equal to  $f_{sym} = f_{chip}/(N+G)$ . The current mode outputs from the DACs are applied at the inputs of the broadband analog multiplexer clocked with a clock rate of  $f_{chip}$ , with each input representing one chip of the PSSS sequence. The multiplexer selects the PSSS chips one by one in successive order. The broadband analog output of the multiplexer is the required PSSS signal corresponding to the data set  $D_1 - D_N$  [12], [73].

Fig. 2-4 Proposed mixed-signal PSSS transmitter baseband architecture [12] [73].
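The encoding steps described for the transmitter can be illustrated with a minimal behavioral sketch. It is not a model of the circuit: it assumes unipolar data and unipolar coding sequences, uses arithmetic multiplication in place of the XOR gates, and uses the length-7 m-sequence {1, 1, 1, -1, -1, 1, -1} as the code. The function names are hypothetical.

```python
# Length-7 m-sequence in bipolar form; the unipolar form maps -1 -> 0.
MLS7_BIPOLAR = [1, 1, 1, -1, -1, 1, -1]


def cyclic_shift(seq, k):
    """Return a copy of seq cyclically shifted to the right by k positions."""
    k %= len(seq)
    return seq[-k:] + seq[:-k] if k else list(seq)


def psss_encode(data_bits, guard_chips):
    """Encode N unipolar data bits into N PSSS chips plus a cyclic extension.

    Symbol i is spread with the i-th cyclic shift of the unipolar code,
    the chip-wise contributions of all symbols are summed, and the first
    `guard_chips` chips are repeated at the end as the guard interval.
    """
    n = len(MLS7_BIPOLAR)
    if len(data_bits) != n:
        raise ValueError(f"expected {n} data bits")
    unipolar = [(c + 1) // 2 for c in MLS7_BIPOLAR]
    codes = [cyclic_shift(unipolar, i) for i in range(n)]
    chips = [sum(d * code[k] for d, code in zip(data_bits, codes))
             for k in range(n)]
    return chips + chips[:guard_chips]


if __name__ == "__main__":
    print(psss_encode([1, 0, 1, 1, 0, 0, 1], guard_chips=3))
```

In this sketch every chip value lies between 0 and (N + 1)/2 = 4, because each chip position is covered by exactly (N + 1)/2 ones across the N cyclic shifts of the unipolar code.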

### 2.3.3 Receiver Baseband Architecture

The proposed receiver baseband architecture is shown in Fig. 2-5. On the receiver side, the incoming PSSS stream is correlated with the decoding sequences (locally generated copies of the coding sequences). The local copies of the decoding sequences can be generated using linear feedback shift registers (LFSR) similar to those used in the transmitter. However, to perform channel equalization simultaneously with the decoding process, the decoding sequences have to be weighted. In this case, the decoding sequences are generated using programmable analog weighted code generators instead of LFSRs. A clock recovery circuit is used to synchronize the local system clock with the PSSS signal. The recovered clock signal with the frequency of  $f_{chip}$  is used to clock the analog multiplexers which output the chips of the decoding sequences successively. The chip-wise product of the PSSS chips with the decoding sequence chips is integrated until all chips of the PSSS sequence have been applied. The output of the correlator is sampled at the end of the correlation period  $T_{sym} = (N + G) \times T_{chip}$  before the integrator is reset and is converted to the digital form using analog to digital converters (ADCs). The ADCs have a sampling rate of  $f_{sym} = f_{chip}/(N+G)$  with an output resolution that is at least equal to the bit loading. To recover the N symbols simultaneously, N copies of the above decoding circuit are needed [12].



Fig. 2-5 Proposed mixed-signal PSSS receiver baseband architecture [12] [73].
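The decoding by correlation can be sketched in the same behavioral style. The sketch assumes an ideal channel, unipolar data with unipolar coding and bipolar decoding sequences, unweighted decoding sequences (no equalization), and a correlation over the N code chips only, with the guard chips ignored. The length-7 m-sequence and all function names are illustrative choices.

```python
MLS7_BIPOLAR = [1, 1, 1, -1, -1, 1, -1]


def all_shifts(seq):
    """Return the N cyclic right-shifts of seq (shift 0 first)."""
    n = len(seq)
    return [seq[n - i:] + seq[:n - i] for i in range(n)]


def psss_encode(data_bits, guard_chips):
    """Sum the data-weighted unipolar code shifts and append the guard chips."""
    unipolar = [(c + 1) // 2 for c in MLS7_BIPOLAR]
    codes = all_shifts(unipolar)
    chips = [sum(d * code[k] for d, code in zip(data_bits, codes))
             for k in range(len(unipolar))]
    return chips + chips[:guard_chips]


def integrate_and_dump(rx_chips, decoding_seq):
    """Multiply chip-wise, integrate over one sequence, return the final value."""
    accumulator = 0
    for rx, weight in zip(rx_chips, decoding_seq):
        accumulator += rx * weight
    return accumulator  # the integrator would be reset after this sample


def psss_decode(rx_chips):
    """Run N parallel correlators and slice their outputs to bits."""
    n = len(MLS7_BIPOLAR)
    threshold = (n + 1) / 4  # midway between the two CCC values 0 and (N+1)/2
    outputs = [integrate_and_dump(rx_chips[:n], dec)
               for dec in all_shifts(MLS7_BIPOLAR)]
    return [1 if value > threshold else 0 for value in outputs], outputs


if __name__ == "__main__":
    data = [1, 0, 1, 1, 0, 0, 1]
    bits, raw = psss_decode(psss_encode(data, guard_chips=3))
    print("correlator outputs:", raw)
    print("recovered bits:    ", bits)
```

Each correlator output is either 0 or (N + 1)/2 = 4 in this setting, so a single threshold per correlator is enough to recover the bit.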

## 2.4 PSSS Mixed-Signal System Design

The system architecture of a mixed-signal PSSS baseband has been discussed in the previous section. The block diagrams in Fig. 2-4 and Fig. 2-5 show the general sketch of the baseband that can be used for a mixed-signal baseband using PSSS as the modulation scheme. In this section, the system design of such a baseband will be discussed in detail to fill out the details in the diagram. The important system parameters will be defined in the next sub-section, followed by a system model and specifications for the system architecture. The effect of amplitude clipping on the system performance will be discussed at the end.

### 2.4.1 System Design Parameters

An important system-level decision for PSSS based data communication systems is the type and length of pseudorandom codes. Maximum length sequences (MLS or m-sequences) are frequently used as pseudorandom codes with PSSS [9], [62], but other pseudorandom codes such as Barker codes may prove better suited to mixed-signal implementation as discussed below. Regardless of the choice of the code, the combination of unipolar  $\{0,1\}$  or bipolar  $\{-1,1\}$  variants of data, coding, and decoding sequences results in different sets of possible PSSS and cyclic cross-correlation (CCC) result values, thus allowing a certain degree of freedom for system design. In [61], MATLAB/Simulink simulations are used to analyze the influence of different system parameters on the net data rate of a wireline communication link using PSSS. Simulation results show a net data rate of more than 20 Gbps with full-duplex communication using BPSK modulation with a spectral efficiency of less than 1 bps/Hz at a chip rate of 25 Gcps [61].

#### 2.4.1.1 Selection of Pseudorandom Codes

In the discussion that follows in this sub-section, the code length will be written along with the name of the code, e.g., MLS with code length 7 will be written as MLS-7, etc. The code length of MLS is  $N = 2^m - 1$  for  $m \ge 3$ , whereas the code length of Barker codes is 2, 3, 4, 5, 7, 11, or 13. Since smaller codes are not very useful and since Barker-7 is the same as MLS-7, the remaining Barker codes of interest are Barker-11 and Barker-13. Table 2-2 lists all possible PSSS values and cyclic cross-correlation (CCC) values for any MLS of length *N*. For Barker-11 and Barker-13, a shift in the mean value (DC shift) of the decoding sequence is required to use them for PSSS. The required DC shift and the resulting PSSS and CCC values for Barker codes are listed in Table 2-3. From the tables, it can be seen that a combination of unipolar coding sequences with bipolar decoding sequences or vice versa results in a two-valued cyclic cross-correlation (CCC) result that is a favorable outcome for realizing decision circuits, e.g., ADCs.

| Data           | Coding seq. | PSSS values                                                     | Decoding seq. | CCC values                         | Integrator input dynamic range                                  | Max. instantaneous output of correlator \*      |
|----------------|-------------|-----------------------------------------------------------------|---------------|------------------------------------|-----------------------------------------------------------------|-------------------------------------------------|
| Unipolar {0,1} | Unipolar    | $\left\{0, 1, 2, \dots, \frac{N+1}{2}\right\}$                  | Bipolar       | $\left\{0, \frac{N+1}{2}\right\}$  | $\left\{0,\pm 1,\pm 2,\ldots,\pm \frac{N+1}{2}\right\}$         | $\left(\frac{N+1}{2}\right) \times \log_2(N+1)$ |
| Unipolar {0,1} | Bipolar     | $\left\{0,\pm 1,\ldots,\pm \frac{N-1}{2},\frac{N+1}{2}\right\}$ | Unipolar      | $\left\{0, \frac{N+1}{2}\right\}$  | $\left\{0,\pm 1,\ldots,\pm \frac{N-1}{2},\frac{N+1}{2}\right\}$ | $N-1$                                           |
| Bipolar {-1,1} | Unipolar    | $\left\{0, \pm 1, \pm 2, \dots, \pm \frac{N+1}{4}\right\}$      | Bipolar       | $\left\{\pm \frac{N+1}{4}\right\}$ | $\left\{0,\pm 1,\pm 2,\ldots,\pm \frac{N+1}{4}\right\}$         | $\left(\frac{N+1}{4}\right) \times \log_2(N+1)$ |
| Bipolar {-1,1} | Bipolar     | $\{\pm 1, \pm 3, \dots, \pm N\}$                                | Unipolar      | $\left\{\pm\frac{N+1}{2}\right\}$  | $\{0, \pm 1, \pm 3, \dots, \pm N\}$                             | $9_{(N=7)}, 23_{(N=15)}, \dots$                 |

Table 2-2 PSSS, CCC Values, and Integrator Inputs for MLS-N of Length N [68].

\* The maximum instantaneous value of the correlator output during the correlation cycle.
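The two-valued CCC property for unipolar coding with bipolar decoding can be checked numerically for MLS-15. In the sketch below the m-sequence is produced by a 4-bit linear feedback shift register; the feedback polynomial  $x^4 + x + 1$  and the seed are assumptions of this sketch, since the text does not specify them, and the function names are hypothetical.

```python
def lfsr_mls15():
    """One period of a length-15 m-sequence from a 4-bit Fibonacci LFSR.

    The register implements a[n+4] = a[n+1] XOR a[n], i.e., the primitive
    polynomial x^4 + x + 1 (an assumed choice). Any non-zero seed works.
    """
    state = [1, 0, 0, 0]  # newest bit first, oldest bit last
    sequence = []
    for _ in range(15):
        sequence.append(state[3])           # output the oldest bit
        feedback = state[3] ^ state[2]
        state = [feedback] + state[:3]      # shift the feedback bit in
    return sequence


def ccc_values(coding, decoding):
    """Set of cyclic cross-correlation values over all pairs of shifts."""
    n = len(coding)
    return {
        sum(coding[(k - i) % n] * decoding[(k - j) % n] for k in range(n))
        for i in range(n)
        for j in range(n)
    }


if __name__ == "__main__":
    unipolar = lfsr_mls15()
    bipolar = [2 * c - 1 for c in unipolar]
    print("MLS-15 (unipolar):", unipolar)
    print("unipolar coding, bipolar decoding:", sorted(ccc_values(unipolar, bipolar)))
    print("bipolar coding, bipolar decoding: ", sorted(ccc_values(bipolar, bipolar)))
```

For unit-amplitude data the unipolar/bipolar combination yields only the values 0 and (N + 1)/2 = 8, whereas correlating a bipolar code with itself gives -1 off the peak, so unwanted symbols would leak into every correlator output.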

The amplitude distribution of the PSSS amplitude set exhibits a non-uniform distribution for both unipolar and bipolar coding variants. The overall PSSS amplitude distribution has a bell-like shape with larger amplitudes occurring much less frequently as compared to the ones with smaller amplitudes. The mean value of the PSSS amplitude set is nonzero for unipolar data and zero for bipolar data. Moreover, the output CCC values for bipolar data have twice the difference as compared to the output CCC values for unipolar data, for the same number of elements in the PSSS amplitude set, which makes bipolar data preferable to unipolar data. Out of the two options using bipolar data, the one using unipolar coding sequences with bipolar decoding sequences generates the smallest set of PSSS amplitudes which allows for the lowest linearity requirements for the transmitter circuit.

The lower the number of elements in the PSSS amplitude set, the lower will be the peak to average power ratio (PAPR) of the transmitter. Hence, the optimum combination for a mixed-signal PSSS transceiver for a wireless communication link will be the one using bipolar data with unipolar coding sequences and bipolar decoding sequences.

A very important consideration for mixed-signal baseband design is the linearity of the analog integrator and correlator circuit. The integrator input is the product of the incoming PSSS stream and the (weighted) decoding sequence. Table 2-2 and Table 2-3 list the possible integrator input amplitudes (i.e., dynamic range) for different codes and their combinations. These values have been obtained by taking all possible PSSS vectors and multiplying them with a single decoding sequence vector. Not only the input dynamic range but also the output dynamic range of the correlator is very important. The value of the correlator is sampled at the end of the correlation cycle. During integration, the instantaneous value of the correlator can become larger than the final correlation result. This can have important implications on the design of the correlator circuit as discussed in section 3.5. From Table 2-2, it can be seen that bipolar data with unipolar coding sequences and bipolar decoding sequences put the smallest input linear dynamic range requirement on the integrate and dump correlator circuit. Although the output dynamic range requirement is higher than for the case in the last row in Table 2-2, it is still the most suitable combination for mixed-signal baseband realization for wireless communication links using PSSS.

| Code type | Data seq.      | Coding seq. | PSSS values                       | Decoding seq.  | CCC values | Possible amplitudes of the product of the decoding seq. with the PSSS amplitudes |
|-----------|----------------|-------------|-----------------------------------|----------------|------------|----------------------------------------------------------------------------------|
| Barker-11 | Bipolar {-1,1} | Unipolar    | $\{\pm 1, \pm 3, \pm 5\}$         | Bipolar + 0.2  | {-6,6}     | $\{0, \pm 1, \pm 2, \pm 3, \pm 4, \pm 6\}$                                       |
| Barker-11 | Bipolar {-1,1} | Bipolar     | $\{\pm 1, \pm 3, \dots, \pm 11\}$ | Unipolar - 1.0 | {-6,6}     | $\{0, \pm 1, \pm 3, \dots, \pm 11\}$                                             |
| Barker-13 | Bipolar {-1,1} | Unipolar    | $\{\pm 1, \pm 3, \dots, \pm 9\}$  | Bipolar - 0.33 | {-6,6}     | $\{-12, -10, -7, -5, -4, -2, -1, 0, 1, 2, 3, 4, 6, 9, 12\}$                      |
| Barker-13 | Bipolar {-1,1} | Bipolar     | $\{\pm 1, \pm 3, \dots, \pm 13\}$ | Unipolar - 0.6 | {-6,6}     | $\{0, \pm 1, \pm 2, \pm 3, \pm 4, \pm 6, -7, -8\}$                               |

Table 2-3 PSSS, CCC Values, and Integrator Inputs for Barker-11 and Barker-13 [68].

#### 2.4.1.2 Code Length

The longer the code, the larger is the spreading gain, defined as the ratio of the symbol duration  $T_{sym}$  to the chip duration  $T_{chip}$ , i.e.,  $(T_{sym}/T_{chip})$ , and the larger is the number of symbols that can be transmitted in parallel. Another advantage of using longer codes is improved link utility as explained later. However, for mixed-signal implementation, the migration from one MLS to the next longer MLS translates to roughly doubling the transistor count and the overall chip size. The increase in the transistor count and the chip size poses additional problems like excessive heat dissipation and the need to consider the wave properties of the signals traveling on-chip. This complicates the design by requiring transmission line structures with matched loads to avoid reflections and distortions. Thus, for mixed-signal implementation with a high chip rate of around 25 Gcps, the preferred pseudorandom code sequences for PSSS (w.r.t. code length) are MLS-7 or MLS-15 or Barker codes with length 7, 11, or 13 [68].

#### 2.4.1.3 Guard Interval Length

A PSSS stream consists of PSSS sequences, each followed by some additional chips that are used as the guard interval. The guard interval can be implemented either in the form of zero-padding or as a cyclic prefix. Both forms of guard interval help to eliminate ISI between adjacent PSSS sequences in a PSSS stream; however, a cyclic prefix additionally makes the linear convolution with a frequency selective multipath channel appear as though it were a circular convolution. The length of the guard interval is dictated by the delay spread of multipath wireless channels, the interface impedance mismatch, the bandwidth limitation (or group delay dispersion) of the cables, etc. For a given code length, the shorter the guard interval, the higher the link utility (defined as the ratio of the channel capacity used for payload data to the channel capacity used for payload data plus that wasted during the guard interval). The loss in link utility can be compensated by increasing the chip rate. For example, for a code length of 15 chips and a guard interval of 3 chips, the link utility reduces to 15/18 = 83.33 %. Increasing the chip rate from 25 to 30 Gcps restores the net data rate to 100 Gbps while keeping the bit loading at 4 bits/symbol.
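The trade-off between guard interval length and chip rate can be expressed in a few lines. The sketch assumes that every one of the N code shifts carries one symbol, so that the net data rate equals  $f_{chip}/(N+G) \times N \times$ (bits per symbol); the function names are hypothetical.

```python
def link_utility(code_len: int, guard_len: int) -> float:
    """Fraction of the chip slots that carries payload, N / (N + G)."""
    return code_len / (code_len + guard_len)


def required_chip_rate(net_rate_bps: float, bits_per_symbol: int,
                       code_len: int, guard_len: int) -> float:
    """Chip rate needed so that N symbols per sequence reach the net rate."""
    symbol_rate = net_rate_bps / (code_len * bits_per_symbol)  # per correlator
    return symbol_rate * (code_len + guard_len)


if __name__ == "__main__":
    for guard in (0, 1, 3):
        rate = required_chip_rate(100e9, 4, 15, guard)
        print(f"G = {guard}: utility = {link_utility(15, guard):.2%}, "
              f"chip rate = {rate / 1e9:.2f} Gcps")
```

For N = 15 and 4 bits/symbol this gives 25 Gcps without a guard interval and 30 Gcps with a guard interval of 3 chips.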

The use of a guard interval in the form of cyclic extension or zero padding is inevitable to avoid *inter-PSSS-sequence* interference. If the sum of code length and guard interval (measured as the number of chips) is a power of two, then the clock and data signal distribution paths in the transmitter and receiver circuits can be implemented as binary trees, which allows a symmetrical layout and equal path lengths at each node. For example, if the required guard interval is one chip duration (i.e.,  $1 T_{chip}$ ) long, then an MLS code (length  $2^m - 1$ ) is a good choice because it makes the sum of the number of PSSS and guard interval chips equal to a power of two, i.e.,  $2^m$ . If a longer guard interval is required, then another code, e.g., a Barker code, can be employed [68].

#### 2.4.1.4 Chip Rate

The chip rate is one of the most important circuit design parameters for PSSS based communication systems. The chip rate is dictated by the available baseband bandwidth and the capability of the integrated circuit design technology used to implement the system. The large RF bandwidth of 50 GHz around the carrier frequency of 240 GHz provides 25 GHz of baseband bandwidth with double sideband (DSB) RF modulation. This allows the chip rate to be from 25 to 30 Gcps. For the chip rate of 25 Gcps, the spectral efficiency of 4 bps/Hz is required to achieve a data rate of 100 Gbps.

The chip rate also depends on the capabilities of the semiconductor technology used to implement the baseband integrated circuit. Among the IC design technologies available at the institute, a chip rate of 25-30 Gcps is realizable using modern CMOS technologies with gate lengths smaller than 65 nm or sub-quarter micron SiGe BiCMOS technologies. More details about the semiconductor technologies and circuit design follow in chapter 3.

### 2.4.2 System Model and Specifications

Based on the discussion in section 2.4.1, the code length is one of the first system parameters to be defined for a mixed-signal PSSS baseband because it defines the number of copies of the unit circuit architecture required for the complete implementation, as highlighted in Fig. 2-4 and Fig. 2-5. Since it directly defines the complexity of the baseband circuit, the choice of this parameter is very important. The code length is chosen as 15 to suit mixed-signal baseband implementation.

In [74], spatial and temporal channel properties have been investigated at 300 GHz in an indoor office scenario for THz WLAN systems. The angular and RMS delay spreads have been determined by evaluating ray-tracing simulations. Angular spreads of up to  $50^{\circ}$  and RMS delay spreads of up to 3.5 ns have been reported for omnidirectional antennas, whereas values of up to  $4^{\circ}$  and 0.06 ns have been reported for clusters of multiple-input multiple-output (MIMO) arrays with several single elements and very narrow beams at THz frequencies.

The maximum delay spread is assumed to be 100 ps, which is equal to  $3 T_{chip}$  for the chosen chip rate of 30 Gcps. This assumption is good enough for indoor direct line of sight links with the carrier frequency of 240 GHz. Moreover, the correlator circuit is reset during the guard interval. The chosen value of 100 ps is also sufficient for the reset operation of the correlator.

The transmitted data during the guard interval is a replica of the first 3 PSSS sequence chips. Using a guard interval of 3 chips reduces the throughput to 83.33%. To restore the net data rate to 100 Gbps, the chip rate must be increased by 20%, i.e., to 30 Gcps.

The transmitter and receiver baseband architectures are combined in the block diagram in Fig. 2-6. The data encoding processes of XOR and summation can be combined in a single digital CMOS baseband working with a clock rate of 1.667 GHz. The CMOS digital baseband performs matrix multiplication of the incoming data stream with a predefined coding matrix of size  $15 \times 15$ . From Table 2-2, it can be seen that for bipolar BPSK data, and unipolar coding matrix, the PSSS chips can have 1 + (N + 1)/2 different amplitude values where N is the length of the code used. Instead of BPSK if PAM-16 modulated data is used to encode the PSSS sequences with N = 15, it results in  $1 + 15 \times (N + 1)/2 = 121$  possible amplitude levels for the PSSS sequence chips. Thus, for each chip of the PSSS sequence, a 7-bit DAC is required. The inputs to the DACs come from the CMOS digital baseband core.
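The number of PSSS amplitude levels, and hence the DAC resolution, can be verified by enumerating all reachable chip sums. The sketch assumes bipolar M-ary PAM symbols with levels ±1, ±3, ..., ±(M-1) and a unipolar m-sequence, for which exactly (N + 1)/2 of the N symbols contribute to each chip; the function names are hypothetical.

```python
def psss_chip_levels(order: int, active_codes: int) -> list:
    """All sums of `active_codes` PAM symbols with levels +-1, ..., +-(order-1)."""
    symbol_levels = range(-(order - 1), order, 2)
    sums = {0}
    for _ in range(active_codes):
        sums = {s + level for s in sums for level in symbol_levels}
    return sorted(sums)


def dac_bits(num_levels: int) -> int:
    """Smallest number of bits that can address num_levels distinct codes."""
    return max(1, (num_levels - 1).bit_length())


if __name__ == "__main__":
    code_len = 15
    active = (code_len + 1) // 2  # ones per chip position in the unipolar code
    for order in (2, 16):
        levels = psss_chip_levels(order, active)
        print(f"PAM-{order}: {len(levels)} levels, "
              f"{dac_bits(len(levels))}-bit DAC")
```

For N = 15 the enumeration gives 9 levels for BPSK and 121 levels for PAM-16, so a 7-bit DAC (128 codes) is the smallest that covers the PAM-16 case.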

The outputs of the DACs are in the form of currents, which form the inputs of the analog multiplexer (mux) that sequentially selects one out of 18 inputs corresponding to the different chips of the PSSS sequence. The analog mux then selects the next chip of the PSSS sequence and continues in this fashion until all the chips of the PSSS sequence have been transmitted. The inputs to the analog mux are updated at the latest after 15  $T_{chip}$  to correspond to the chips of the next PSSS sequence. The DACs have a time of 3  $T_{chip}$  to update their outputs before the chips for the next PSSS sequence start to be sent sequentially to the output of the analog mux. The output signal of the analog mux is a broadband signal with a bandwidth of around 25 GHz. A pulse-shaping filter may be used to improve the waveform characteristics before it is up-converted and transmitted by the RF-frontend.

Fig. 2-6 Proposed architecture of the mixed-signal PSSS baseband [73].

The down-converted PSSS baseband signal from the RF-frontend is applied to a Costas loop to recover the carrier signal which serves as the clock signal for the baseband circuit. Additionally, a static phase adjustment circuit is needed to shift the phase of the clock signal to match the PSSS waveform phase. The static phase delay adjustment must cover the inter-chip as well as the fine intra-chip phase adjustment with the PSSS waveform. This can be added both as a passive phase delay line component as well as a voltage-controlled delay line circuit implemented as part of the baseband chip. Additionally, a variable gain amplifier (VGA) with manual or automatic gain control can be used to ensure that the PSSS signal amplitude remains within the required sensitivity limits despite the change of the distance between the transmitter and receiver antennas.

On the receiver side, the PSSS waveform is decoded by correlating it with the decoding sequences. To recover the *N* symbols transmitted in parallel as one PSSS sequence, *N* integrate and dump correlators are required in total, where each correlator uses a copy of the coding sequence having complementary polarity (i.e., unipolar or bipolar) and weighting to perform channel equalization along with the decoding of the data. An 18-to-1 multiplexer is used to sequentially output the weighted chips of the decoding sequence that are multiplied chip-wise with the incoming PSSS chips. The correlation takes 15  $T_{chip}$ , after which there is a reset period of 3  $T_{chip}$ .

At the end of the correlation period, the output of the correlator is sampled and an analog-to-digital converter (ADC) converts it into digital data. The minimum resolution of the ADC is 4 bits at a sample rate of 1.667 GSa/s. The resulting output interface at this level consists of 15 such 4-bit lanes, each bit line transmitting data at 1.667 Gbps. To simplify the output interface, the 4-bit output of each ADC is serialized to generate a single data stream that is 4 times faster. The final output interface is thus a 15-lane bus with each lane transmitting data at 6.667 Gbps [73].
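The rates quoted above follow from simple arithmetic, sketched below. The sketch assumes that 1.667 GHz stands for exactly 5/3 GHz and that one PSSS frame lasts 18 chip periods (15 for correlation plus 3 for reset); the resulting chip rate is derived under these assumptions and is not quoted from the text.

```python
from fractions import Fraction

FRAME_RATE_GHZ = Fraction(5, 3)  # assumed exact value behind the rounded 1.667 GHz
CHIPS_PER_FRAME = 15 + 3         # correlation period plus reset period
N_SLICES = 15                    # parallel correlators / ADCs
ADC_BITS = 4

chip_rate_gcps = FRAME_RATE_GHZ * CHIPS_PER_FRAME  # chips per nanosecond
lane_rate_gbps = FRAME_RATE_GHZ * ADC_BITS         # one serialized ADC output
total_rate_gbps = lane_rate_gbps * N_SLICES        # whole output interface

print(float(chip_rate_gcps), round(float(lane_rate_gbps), 3), float(total_rate_gbps))
```

Under these assumptions, the 15 serialized lanes together carry 100 Gbps.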

# 2.4.3 PSSS Amplitude Distribution

The PSSS matrix as defined in Equation 2-6 consists of multi-valued entries. The type of the coding sequences and the chosen polarity of the data and coding sequences determine the set of amplitudes for the PSSS chips. The possible PSSS amplitudes for BPSK data are listed in Table 2-2. The amplitudes for higher-order PAM can be calculated similarly.

To calculate the amplitude distribution of the PSSS chips, one needs to multiply the coding matrix with a data matrix containing an exhaustive list of all possible data vectors for the chosen modulation scheme. The amplitude counts for the different PSSS amplitudes are then enumerated and plotted as a bar graph. For a coding matrix size of $15 \times 15$, the required order of such a data matrix for BPSK data is $15 \times 2^{15}$. Extending this scheme to PAM-16 data requires a data matrix of order $15 \times 16^{15}$, which is far too large for practical computation. A better solution is to calculate the PSSS amplitude distribution analytically.

The analytical method for the calculation of the PSSS amplitude distribution is explained here for the case of BPSK data with an m-sequence of length 7 i.e., $\{1\ 1\ 1\ -1\ -1\ 1\ -1\}$ as the coding sequence. For generality, the data as well as the coding sequences are considered bipolar. The m-sequence has $(N + 1)/2$ occurrences of 1's and $(N - 1)/2$ occurrences of -1's, where $N = 7$ is the length of the m-sequence. For the given case of a $7 \times 7$ coding matrix, the size of the data matrix containing all possible data combinations is $N \times M^N = 7 \times 2^7$, where M stands for the modulation order, which is 2 for BPSK data. Out of the total $M^N = 2^7 = 128$ data vectors, only ${}^{N+M-1}C_N = {}^{7+2-1}C_7 = {}^8C_7 = 8$ combinations are unique i.e., all other data vectors can be obtained by rearranging the order of the bits in the unique data vectors. For each possible PSSS amplitude, the total number of possible arrangements of the unique data vectors that produce this amplitude is enumerated. Note that the PSSS amplitude distribution is symmetric along the amplitude axis i.e., the number of occurrences for the PSSS amplitude values +X and -X is the same. For BPSK data, the PSSS amplitude set consists of $\{\pm 1, \pm 3, \pm 5, \pm 7\}$. In this case, the number of occurrences of the amplitude +1 will be the same as that of -1 and so on. Thus, the calculations are performed only for the positive amplitudes and the final values can be used for the negative amplitude values as well.

Consider the coding vector $\{1\ 1\ 1\ -1\ -1\ 1\ -1\}$. For this analysis, the m-sequence vector can be broken down into two sub-vectors $\{1\ 1\ 1\ 1\}$ and $\{-1\ -1\ -1\}$. Since the BPSK data is restricted to the values $\pm 1$, the sum-of-the-products of the data and the vector $\{1\ 1\ 1\ 1\}$ can only have the values $\{-4, -2, 0, 2, 4\}$, whereas the sum-of-the-products of the data and the vector $\{-1\ -1\ -1\}$ can only have the values $\{-3, -1, 1, 3\}$. Now, for each possible amplitude of the PSSS amplitude set, the values from these two sum-of-the-products sets are added to see if they sum up to the required PSSS amplitude. For the PSSS amplitude value +7, the only valid combination is 4 from the first sum-of-the-products set and 3 from the second sum-of-the-products set, since 4 + 3 = 7. No other combination of the sum-of-the-products results in +7. For the PSSS amplitude value +5, there are 2 valid combinations: 4 from the first set and 1 from the second set, since 4 + 1 = 5, and 2 from the first set and 3 from the second set, since 2 + 3 = 5. No other combinations from the two sets result in the PSSS amplitude value +5. The valid combinations for the PSSS amplitude value +3 are 4 - 1, 2 + 1, and 0 + 3. The valid combinations for the PSSS amplitude value +1 are 4 - 3, 2 - 1, 0 + 1, and -2 + 3 (see Table 2-4). The valid combinations for the negative PSSS amplitudes are the same as those for the positive PSSS amplitudes but with opposite signs.

| PSSS amplitude value | Sum of the products with {1 1 1 1} | No. of ways to get this sum of products | Sum of the products with {-1 -1 -1} | No. of ways to get this sum of products | No. of ways to obtain the combination |
|----------------------|------------------------------------|------------------------------------------|-------------------------------------|------------------------------------------|---------------------------------------|
| 7 = 4 + 3            | 4                                  | 1                                        | 3                                   | 1                                        | $1 \times 1 = 1$                      |
| 5 = 4 + 1            | 4                                  | 1                                        | 1                                   | 3                                        | $1 \times 3 = 3$                      |
| 5 = 2 + 3            | 2                                  | 4                                        | 3                                   | 1                                        | $4 \times 1 = 4$                      |
| 3 = 4 - 1            | 4                                  | 1                                        | -1                                  | 3                                        | $1 \times 3 = 3$                      |
| 3 = 2 + 1            | 2                                  | 4                                        | 1                                   | 3                                        | $4 \times 3 = 12$                     |
| 3 = 0 + 3            | 0                                  | 6                                        | 3                                   | 1                                        | $6 \times 1 = 6$                      |
| 1 = 4 - 3            | 4                                  | 1                                        | -3                                  | 1                                        | $1 \times 1 = 1$                      |
| 1 = 2 - 1            | 2                                  | 4                                        | -1                                  | 3                                        | $4 \times 3 = 12$                     |
| 1 = 0 + 1            | 0                                  | 6                                        | 1                                   | 3                                        | $6 \times 3 = 18$                     |
| 1 = -2 + 3           | -2                                 | 4                                        | 3                                   | 1                                        | $4 \times 1 = 4$                      |

Table 2-4 PSSS amplitude distribution calculation for BPSK data encoded with a single m-sequence vector of length 7 as the bipolar coding sequence {1 1 1 -1 -1 1 -1}.

After defining the valid combinations of the *sum-of-the-products* terms for each PSSS amplitude, the number of ways that a certain *sum-of-the-products* term can be obtained is listed against it. There is only one way to obtain the *sum-of-the-products* value of +4 i.e., $\{+1\ +1\ +1\ +1\}$. To obtain the *sum-of-the-products* value of +2, however, there are 4 ways i.e., $\{+1\ +1\ +1\ -1\}$, $\{+1\ +1\ -1\ +1\}$, $\{+1\ -1\ +1\ +1\}$, and $\{-1\ +1\ +1\ +1\}$. The same result is obtained using the combinations formula (binomial coefficient) i.e., ${}^{4}C_{3} = 4!/(3! \times 1!) = 4$, where 4 denotes the number of places to be filled with 4 numbers, 3 of which are identical. Using the same method, it can be calculated that there are 6 ways to obtain the *sum-of-the-products* value of 0 from the first sub-vector, and that the second sub-vector offers 3 ways to obtain the *sum-of-the-products* value of +1 and 1 way to obtain the value of +3. The number of ways to obtain a negative *sum-of-the-products* value is the same as that for the corresponding positive value e.g., as for +2, there are 4 ways to obtain the *sum-of-the-products* value of -2.

Any given PSSS amplitude is obtained by adding two *sum-of-the-products* terms. Hence, the number of ways to obtain a given combination e.g., 1 = 2 + (-1) is found by *multiplying* the number of ways to get the relevant *sum-of-the-products* terms i.e., +2 and -1. In this case, the number of ways to obtain the combination 1 = 2 + (-1) is $4 \times 3 = 12$ (see Table 2-4). The numbers of ways to obtain the different combinations for a given PSSS amplitude are then summed up e.g., the total number of ways that the PSSS amplitude +1 can be obtained is 1 + 12 + 18 + 4 = 35.
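The counting procedure of Table 2-4 can be written compactly as a short sketch. The function names are hypothetical; the only inputs are the number of +1's and -1's in the coding vector, and BPSK data is assumed.

```python
from math import comb

def sum_counts(n):
    """Map each reachable sum of n BPSK (+/-1) products to the number of
    data patterns that produce it."""
    # If k of the n products equal +1 and the rest -1, the sum is 2k - n,
    # and this happens for comb(n, k) data patterns.
    return {2 * k - n: comb(n, k) for k in range(n + 1)}

def amplitude_counts(n_plus, n_minus):
    """Combine the two sub-vectors: sums add, numbers of ways multiply."""
    plus = sum_counts(n_plus)
    minus = sum_counts(n_minus)
    counts = {}
    for s1, w1 in plus.items():
        for s2, w2 in minus.items():
            counts[s1 + s2] = counts.get(s1 + s2, 0) + w1 * w2
    return dict(sorted(counts.items()))

# Length-7 m-sequence: four +1's and three -1's.
counts = amplitude_counts(4, 3)
print(counts)
```

For the length-7 case, the positive amplitudes 7, 5, 3, and 1 occur 1, 7, 21, and 35 times respectively, and the counts over all amplitudes add up to the 128 possible data vectors.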



Fig. 2-7 PSSS amplitude distribution for BPSK data with a single m-sequence of length 7 as the bipolar coding sequence {1, 1, 1, -1, -1, 1, -1}. To obtain the complete amplitude distribution for all 7 coding vectors combined, all the number of occurrences have to be multiplied by 7.

The values for the negative amplitudes of the PSSS amplitude set are the same as those for the positive amplitudes. The calculated values of the PSSS amplitude distribution agree with the plot obtained by multiplying the coding vector $\{1\ 1\ 1\ -1\ -1\ 1\ -1\}$ with all $2^7$ possible data vectors and counting the occurrences of each amplitude, as shown in Fig. 2-7. The distribution of the PSSS amplitudes is non-uniform and has a bell-like shape i.e., smaller amplitudes are more common than larger ones. The results are obtained by considering a single coding vector out of the 7 possible coding vectors. To obtain the complete amplitude distribution for all 7 coding vectors combined, all the numbers of occurrences have to be multiplied by 7.
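The exhaustive enumeration used for the comparison can be reproduced with a few lines. This is a minimal sketch with a hypothetical function name, not the script used to generate Fig. 2-7.

```python
from collections import Counter
from itertools import product

CODE = (1, 1, 1, -1, -1, 1, -1)  # bipolar m-sequence of length 7

def brute_force_counts(code):
    """Count how often each PSSS amplitude occurs over all BPSK data vectors."""
    counts = Counter()
    for data in product((-1, 1), repeat=len(code)):
        counts[sum(c * d for c, d in zip(code, data))] += 1
    return dict(sorted(counts.items()))

counts = brute_force_counts(CODE)
print(counts)
```

The enumeration visits all $2^7 = 128$ data vectors, so this approach is only practical for short sequences and low modulation orders.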

The analytical methodology described here becomes complex for higher-order digital modulations of the data e.g., PAM-16 because instead of 2 options of filling each data symbol as in the case of BPSK, there are 16 options to fill each data symbol in case of PAM-16 before adding 15 such data symbols each encoded with its respective coding sequence. Since each encoded data symbol has 16 possible values instead of 2, the multinomial distribution formula is used to find the probability of occurrence of different amplitude values in the resulting PSSS amplitudes set as also explained in [75].

Assume a set of *k* possible outcomes denoted by $(X_1, X_2, ..., X_k)$ with the associated probabilities $(p_1, p_2, ..., p_k)$ such that $\sum_{i=1}^k p_i = 1$. If the experiment is repeated *n* times, then the probability of getting a specific outcome $(X_1 = x_1, X_2 = x_2, ..., X_k = x_k)$, where $x_i$ denotes the number of times that the result $X_i$ occurs, with $0 \le x_i \le n$ and $\sum_{i=1}^k x_i = n$, is given by the multinomial distribution formula:

$$P(X_1 = x_1, X_2 = x_2, \dots, X_k = x_k) = \left(\frac{n!}{x_1! \times x_2! \times \dots \times x_k!}\right) \cdot \left(p_1^{x_1} \times p_2^{x_2} \times \dots \times p_k^{x_k}\right) \tag{2-62}$$
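Equation (2-62) translates directly into a small function. The sketch below uses a hypothetical function name and exact fractions; the check at the end reuses the BPSK case from Table 2-4, where three of four products equal +1.

```python
from fractions import Fraction
from math import factorial, prod

def multinomial_pmf(x, p):
    """Probability of observing outcome i exactly x[i] times in sum(x) trials,
    where p[i] is the probability of outcome i in a single trial."""
    if len(x) != len(p):
        raise ValueError("x and p must have the same length")
    coeff = factorial(sum(x))
    for xi in x:
        coeff //= factorial(xi)  # exact: every partial quotient is an integer
    return coeff * prod(pi ** xi for pi, xi in zip(p, x))

# BPSK check: 4 products, 3 of them +1 and 1 of them -1, each with probability 1/2.
p_bpsk = (Fraction(1, 2), Fraction(1, 2))
prob = multinomial_pmf((3, 1), p_bpsk)
print(prob)
```

The result 1/4 corresponds to the 4 ways of obtaining the sum-of-the-products value +2 out of the $2^4 = 16$ equally likely data patterns.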



Fig. 2-8 Normalized distribution of PSSS amplitudes for PAM-16 data encoded with the bipolar version of m-sequences of length 15 [75].

The multinomial distribution is calculated for 2 different cases i.e., once for the +1's in the m-sequence, for which n = 8, and again for the -1's in the m-sequence, for which n = 7. The individual probabilities of occurrence $p_1, \dots, p_k$ of the outcomes are all equal to 1/M, where M = 16 denotes the modulation order for PAM-16, assuming a uniform probability distribution of the input data. The partial probabilities for the chosen outcome $(X_1 = x_1, X_2 = x_2, ..., X_k = x_k)$ for the two cases i.e., for the +1's (n = 8) and the -1's (n = 7) in the m-sequence are then multiplied to get the final probability for the outcome $(X_1 = x_1, X_2 = x_2, ..., X_k = x_k)$. This process is repeated for all unique combinations of the input data vectors, the total number of which is given by ${}^{N+M-1}C_N = {}^{15+16-1}C_{15} = {}^{30}C_{15}$. The calculated normalized probability distribution function for the PAM-16 data encoded with the bipolar version of the m-sequences is shown in Fig. 2-8. Note that bipolar coding sequences are assumed here for generality; unipolar coding sequences are better suited for mixed-signal baseband design, as discussed in section 2.4.1. The PSSS amplitude set for the case of bipolar m-sequences of length 15 with PAM-16 data has the following possible amplitudes: $\{\pm 1, \pm 3, \dots, \pm (15 \times 15)\}$. The analysis for the case of unipolar coding is simpler because the entries for the -1's case are absent and hence the multinomial distribution only needs to be calculated for the case of the +1's (n = 8). Additionally, the PSSS amplitude set is smaller for this case i.e., $\{0, \pm 2, \pm 4, \dots, \pm 120\}$.
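As a cross-check of such a distribution, the sketch below uses repeated convolution instead of the multinomial enumeration; this is a swapped-in technique and not the method of [75]. It assumes independent, uniformly distributed PAM-16 symbols and a bipolar code, so that every product of a coding chip and a data symbol is uniform over the PAM levels. All names are hypothetical.

```python
from fractions import Fraction

def pam_levels(m):
    """Bipolar PAM levels, e.g. m = 16 gives -15, -13, ..., 13, 15."""
    return list(range(-(m - 1), m, 2))

def convolve(pmf_a, pmf_b):
    """Distribution of the sum of two independent discrete random variables."""
    out = {}
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            out[a + b] = out.get(a + b, 0) + pa * pb
    return out

def psss_amplitude_pmf(n_chips=15, m=16):
    # The level set is symmetric about zero, so multiplying a symbol by a
    # coding chip of +1 or -1 leaves its distribution uniform over the levels.
    single = {level: Fraction(1, m) for level in pam_levels(m)}
    pmf = {0: Fraction(1)}
    for _ in range(n_chips):
        pmf = convolve(pmf, single)
    return dict(sorted(pmf.items()))

pmf = psss_amplitude_pmf()
print(min(pmf), max(pmf), float(pmf[1]), float(pmf[225]))
```

Calling `psss_amplitude_pmf(7, 2)` reproduces the BPSK counts of Table 2-4 after multiplication by 128, which gives some confidence that the sketch matches the analytical method for the bipolar case.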

An important observation from the normalized distribution of the PSSS amplitudes in Fig. 2-8 is that the amplitudes with large magnitudes occur very infrequently. Despite their rarity, they substantially increase the peak to average power ratio (PAPR). The PAPR value for this case was calculated in Equation (2-48c) to be 39.75 (15.99 dB), which is quite large.

### 2.4.3.1 Amplitude Clipping

A logical question to ask is: what happens if the PSSS amplitudes with higher magnitudes are clipped off? To investigate this, a simulation model for PSSS encoding and decoding was implemented in MATLAB Simulink and the effect of the amplitude clipping on the bit error rate (BER) was observed in [75]. Clipping has the same effect as setting insufficient back-off for the transmitter power amplifier, which then saturates at the higher amplitudes. The MATLAB Simulink simulations and the BER calculations in Fig. 2-9 show that the BER increases with the number of amplitudes clipped [75]. Clipping 50 of the highest amplitudes from each side (100 PSSS amplitudes in total) reduces the number of amplitudes in the PSSS amplitude set from 225 to 125. This reduces the required DAC resolution from 8 bits to 7 bits while worsening the BER to $3.9 \times 10^{-3}$, which is still acceptable for wireless communication.
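The relation between the number of amplitudes and the DAC resolution quoted above can be checked with a short sketch. It assumes that the DAC needs one distinct code per amplitude; the function name is hypothetical.

```python
def dac_bits(n_levels):
    """Smallest number of bits b such that 2**b >= n_levels."""
    return (n_levels - 1).bit_length()

full_set = 225                       # amplitudes before clipping, as stated in the text
clipped_set = full_set - 2 * 50      # 50 amplitudes removed from each side

print(dac_bits(full_set), dac_bits(clipped_set))
```

The clipped set of 125 amplitudes just fits into the 128 codes of a 7-bit DAC.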

The MATLAB Simulink simulations and the BER calculations in Fig. 2-9 do not contain any non-ideal effects like channel impairments, bandwidth limitations, electromagnetic interference, noise, circuit nonlinearities, etc. The actual effect of clipping off the higher amplitudes from the PSSS amplitude set on the BER may therefore be worse than that suggested by the ideal simulation results in Fig. 2-9 [75]. Note that the MATLAB Simulink simulations to investigate the effect of amplitude clipping on the BER in [75] (see Fig. 2-9) were done by the first two co-authors of the publication referenced in [75].



Fig. 2-9 Bit error rate (BER) vs. the number of PSSS amplitudes clipped-off [75].

# 2.5 Channel Equalization by Chip Weighting

Channel equalization is used to reverse the effects of channel impairments on the signal to improve the SNR at the detector. Channel equalization aims to render a flat frequency response from the transmitter end to the detector end. Channel equalization is usually performed with digital filters whose coefficients are calculated using linear or maximum likelihood sequence estimation on training data, or blind/semi-blind data. The channel equalization for the current PSSS baseband implementation is based on the use of training data. The idea is to use pre-defined data sequences for the training phase for which the system output is known. The data is passed through the channel, which modifies it according to its unit impulse response. The data distorted by the channel impairments is correlated with ideal decoding sequences, which generates a non-ideal output. Matrix operations are performed to find new non-binary weighted decoding sequences which, when used for correlation with the actual training data, generate a close-to-ideal output.

### 2.5.1 Choice of the Training Sequence Data

Referring to Fig. 2-6, 15 correlators are used in parallel to determine the 15 symbols sent in parallel. M-sequences of length 15 are used as coding and decoding sequences. For determining the chip weights, bipolar data, bipolar coding, and bipolar decoding are assumed here. Unipolar data or code is not used since some of the information content important for chip weight calculation would be lost due to the zero entries, leading to sub-optimum chip weighting. For channel equalization (or training), the simplest correlation output (recovered data) for the baseband in Fig. 2-6 is a +1 for one of the correlators and -1 (for bipolar data) for all others:

$$S_i' = [\,+1 \ \ {-1} \ \ {-1} \ \ \cdots \ \ {-1}\,]^T \tag{2-63}$$

where T represents the transpose function. The PSSS vector $P_i$ corresponding to this output decoded data vector can be obtained by using the inverse of the decoding matrix D.

$$S_i' = D.P_i \tag{2-64a}$$

$$P_i = D^{-1} . S_i' \tag{2-64b}$$

The vector expression in (2-64b) represents 15 equations, whereas the decoding matrix D has $15 \times 15$ unknown variables i.e., the decoding chips for which the weights have to be calculated. To determine all $15 \times 15$ unknown decoding chip weights, 15 training sequences (cyclically shifted versions of the vector in (2-63)) have to be sent through the channel during the training phase, which leads to:

$$P = D^{-1}.S' \tag{2-65}$$

where *P* represents the  $15 \times 15$  PSSS matrix which when applied at the input of the PSSS receiver results in the correlation output (recovered data) matrix *S'* of order  $15 \times 15$ ; filled with cyclically shifted versions of the vector in (2-63).

#### 2.5.2 Calculation of Chip Weights

So far, no channel impairments have been considered in the PSSS data. The actual response of the channel is difficult to predict; however, it is safe to assume that it behaves like a low-pass filter. The training data gets distorted while passing through the channel. Channel equalization can be used to correct the linear distortions in the data. The nonlinear distortions are difficult to model and correct. After passing through the channel, the ideal PSSS training data P gets distorted to $P_{channel \, distorted}$, which results in a correlation output (recovered data) matrix $S'_{channel \, distorted}$ instead of the ideal correlation output (recovered data) matrix S'. Using the recovered data matrix $S'_{channel \, distorted}$ and the ideal decoding matrix D in (2-65), the corresponding channel distorted PSSS training data $P_{channel \, distorted}$ can be determined:

$$P_{channel\ distorted} = D^{-1} \cdot S'_{channel\ distorted} \tag{2-66}$$

The goal is to find a new weighted decoding matrix  $D_{weighted}$  such that the correlation result (recovered data matrix) is restored to the ideal correlation output (recovered data) matrix S'.

$$P_{channel \, distorted}. \, D_{weighted} = S' \tag{2-67a}$$

$$D_{weighted} = P_{channel \ distorted}^{-1} S' = S'^{-1}_{channel \ distorted} D S' \tag{2-67b}$$

Note that the ideal decoding matrix D is used during the training phase to calculate the chip weights.
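The matrix relations (2-65), (2-66), and (2-67b) can be exercised numerically. The following sketch assumes NumPy, a length-15 m-sequence generated by $x^4 + x + 1$, and a hypothetical three-tap circular low-pass channel acting on the chips; none of these choices comes from the measured system. The multiplication order follows the equations as printed.

```python
import numpy as np

N = 15

def m_sequence_15():
    """Bipolar length-15 m-sequence, LFSR recurrence a[n+4] = a[n+1] XOR a[n]
    (polynomial x^4 + x + 1, an assumed choice)."""
    a = [1, 0, 0, 0]
    while len(a) < N:
        a.append(a[-3] ^ a[-4])
    return 2 * np.array(a) - 1

seq = m_sequence_15()
D = np.array([np.roll(seq, k) for k in range(N)])  # ideal decoding matrix, rows are cyclic shifts
S_ideal = 2 * np.eye(N) - np.ones((N, N))          # +1 on the diagonal, -1 elsewhere

# (2-65): ideal PSSS training matrix
P = np.linalg.solve(D, S_ideal)

# Hypothetical channel: circular FIR low-pass applied along the chip index.
taps = [0.7, 0.2, 0.1]
H = sum(h * np.roll(np.eye(N), k, axis=0) for k, h in enumerate(taps))
P_ch = H @ P

# What the ideal decoder recovers from the distorted training data.
S_ch = D @ P_ch

# (2-66): the distorted PSSS matrix recovered from the correlator outputs.
P_ch_est = np.linalg.solve(D, S_ch)

# (2-67b), both forms.
D_w_from_P = np.linalg.solve(P_ch_est, S_ideal)  # P_ch^-1 S'
D_w_from_S = np.linalg.solve(S_ch, D @ S_ideal)  # S'_ch^-1 D S'

print(np.allclose(D_w_from_P, D_w_from_S), np.allclose(P_ch @ D_w_from_P, S_ideal))
```

Both forms of (2-67b) give the same weighted matrix in this idealized, noise-free and unquantized setting; the effect of quantization is a separate matter.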

### 2.5.3 Practical Limitations in Determination of Chip Weights

The calculation of the chip weights for the decoding sequences i.e., $D_{weighted}$ depends on the recovered output data (correlation result) $S'_{channel \, distorted}$ as given by Equation (2-67b). Referring to Fig. 2-2, the analog correlator outputs are quantized using analog-to-digital converters (ADCs), whereas the chip values in the correlator output matrix $S'_{channel \, distorted}$ in Equation (2-67b) are non-quantized analog voltages. If the quantized chip values $\hat{S}'_{channel \, distorted}$ (quantization is denoted by the ^ sign) are used instead, the resulting weighted decoding matrix $\widehat{D}_{weighted}$ calculated using Equation (2-67b) will satisfy the condition in Equation (2-67b) only approximately. The resolution of the ADC in Fig. 2-2 therefore determines how accurate the calculated chip weights for the decoding matrix, and the consequent channel equalization, will be. When PAM-16 is used as the digital modulation, the minimum resolution of the ADC is 4 bits, and the full 4-bit resolution of the ADC can be used for the correlator output during the training phase. The quantized digital output of the ADC is multiplied with the ideal decoding matrix to get the weighted coefficients for the decoding matrix. The resulting channel equalization will have limited capability to correct channel distortions.

One solution to this problem would be to sub-sample the output of the first sample-and-hold circuit (S/H) operating at Nyquist rate with a second S/H circuit. A low-speed high-resolution ADC can be used to convert the sub-sampled output of the second S/H to a digital format for the calculation of chip weights. This improvement in the chip weight calculation comes at the cost of increased hardware complexity and higher power dissipation and will not be considered in the current investigation. Another important limitation in the channel equalization stems from the fact that the entries of the weighted decoding matrix $D_{weighted}$ calculated using Equation (2-67b) will be analog values. To apply the calculated weights to the decoding sequences, digital-to-analog converters (DACs) are used to convert the digital coefficients to analog weights. The resolution of the DACs plays an important role in channel equalization. Ideally, the resolution should be as high as possible, but a higher resolution increases the design complexity and the power dissipation. A good resolution for the chip weighting DAC is around 4-6 bits. In the current implementation, 4-bit DACs are used to provide the weighted decoding sequence chips, as explained later in Chapter 3.
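The resolution trade-off can be made concrete with a sketch of a uniform quantizer. It assumes a mid-rise characteristic with $2^b$ equally spaced levels over a normalized full-scale range; the actual transfer characteristic of the implemented ADCs and DACs is not specified here, and the function names are hypothetical.

```python
from math import floor

def quantize(x, bits, full_scale=1.0):
    """Uniform mid-rise quantizer with 2**bits levels in [-full_scale, +full_scale]."""
    levels = 2 ** bits
    step = 2 * full_scale / levels
    k = floor(x / step)
    k = max(-levels // 2, min(levels // 2 - 1, k))  # clip to the available codes
    return (k + 0.5) * step

def worst_case_error(bits, n_points=201):
    """Largest quantization error over a grid of inputs in [-1, +1]."""
    xs = [2 * i / (n_points - 1) - 1 for i in range(n_points)]
    return max(abs(quantize(x, bits) - x) for x in xs)

for bits in (4, 5, 6):
    print(bits, worst_case_error(bits))
```

For inputs inside the full-scale range, the error is bounded by half a step, so each additional bit halves the worst-case error of an individual chip weight.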

### 2.6 List of Relevant Publications of the Author

- [12] J. C. Scheytt, A. R. Javed, et al., "Real100G Ultrabroadband Wireless Communication at High mm-Wave Frequencies," in *Wireless 100 Gbps and Beyond*, R. Kraemer and S. Scholz, Eds., Frankfurt (Oder), Germany, IHP – Innovations for High Performance Microelectronics, 2020, pp. 213-230.
- [13] J. C. Scheytt, A. R. Javed, et al., "100 Gbps Wireless System and Circuit Design Using Parallel Spread-Spectrum Sequencing," *Frequenz*, vol. 71, no. 9-10, pp. 399-414, 2017.
- [68] A. R. Javed, J. C. Scheytt, K. KrishneGowda, and R. Kraemer, "System Design Considerations for a PSSS Transceiver for 100Gbps Wireless Communication with Emphasis on Mixed-Signal Implementation," in *IEEE Wireless and Microwave Technology Conference (WAMICON)*, Florida, 2015.
- [69] A. R. Javed, et al., "Real100G.com," in *Wireless 100 Gbps and Beyond*, R. Kraemer and S. Scholz, Eds., Frankfurt (Oder), Germany, IHP – Innovations for High Performance Microelectronics, 2020, pp. 231-294.
- [73] A. R. Javed, J. C. Scheytt, K. KrishneGowda, and R. Kraemer, "System Design of a Mixed Signal PSSS Transceiver Using a Linear Ultra-Broadband Analog Correlator for the Receiver Baseband Designed in 130 nm SiGe BiCMOS Technology," in *IEEE EUROCON 2017 - 17th International Conference on Smart Technologies*, pp. 228-233, 2017.
- [75] K. KrishneGowda, L. Wimmer, A. R. Javed, A. C. Wolf, J. C. Scheytt, and R. Kraemer, "Analysis of PSSS Modulation for Optimization of DAC Bit Resolution for 100 Gbps Systems," in *15th International Symposium on Wireless Communication Systems (ISWCS)*, Lisbon, 2018.

# 2.7 Summary

In this chapter, a mathematical overview of the parallel sequence spread spectrum (PSSS) communication is presented. The system architecture of a mixed-signal PSSS system is developed based on the mathematical overview and system analysis is presented [13] [68] [73]. The receiver sensitivity and the peak to average power ratio (PAPR) of the PSSS communication system are derived and calculations for the range of a communication link with and without the use of PSSS are made [14]. The most important system design parameters for a mixed-signal PSSS baseband circuit are discussed [61], [68] and the most suitable values of the system parameters are determined for a mixed-signal implementation of a high-data-rate communication system using PSSS [68] [73]. The amplitude distribution of the PSSS signal set is bell-shaped which leads to the idea that some of the higher magnitudes can be clipped-off because they occur much less frequently as compared to the amplitudes having smaller magnitudes. This has been investigated using system simulations led by the first three co-authors in [75]. The use of a mixed-signal architecture for the PSSS baseband circuit offers the possibility to merge the channel equalization process with the data decoding process. This idea has been previously implemented in the form of post-processing on the stored samples of the correlator output data captured using high sampling rate oscilloscopes etc. The applicability of this idea to the proposed mixed-signal PSSS baseband was discussed analytically and has been incorporated in the circuit design of the receiver baseband circuit.
# 3 Design and Validation of Critical Components for a Mixed-Signal PSSS Baseband

The general architecture of a mixed-signal PSSS baseband has been outlined in section 2.3. The mixed-signal baseband (BB) has a sliced architecture with each slice representing a unit architecture that is repeated *N* number of times where *N* represents the number of data symbols transmitted in parallel. The proposed system uses 15 parallel streams of data employing a mixed-signal architecture. To investigate the PSSS system design based on a mixed-signal baseband, a single slice of the receiver (RX) baseband (BB) was implemented as a test-chip. The circuit design of the critical components of the RX BB unit-slice and their characterization are presented in this chapter. For the characterization of the RX BB, the transmitter (TX) baseband circuit is emulated using an arbitrary waveform generator (AWG) with a sufficient effective number of bits (ENOB). In addition, the circuit design and characterization of the critical components of the TX unit slice architecture are also discussed in this chapter.

# 3.1 Semiconductor Technology

The realization of the analog/mixed-signal baseband integrated circuits with ultra-wide bandwidth requires the use of state-of-the-art semiconductor technologies. The baseband architecture proposed in section 2.4.2 uses broadband signals with bandwidth in the range of 25-30 GHz. For the selection of the semiconductor technology, the available design kits in the institute were explored for their high-speed capabilities. The most suitable ones were the following: the compound Silicon-Germanium Bipolar Complementary Metal Oxide Semiconductor (BiCMOS) technology with 130 nm minimum feature length, and the bulk Complementary Metal Oxide Semiconductor (CMOS) technologies with minimum feature lengths of 65 nm or 28 nm. The transit frequency $f_T$ of a transistor is the frequency at which the small-signal current gain $i_c/i_b$ (for a bipolar junction transistor, BJT) or $i_d/i_g$ (for a MOS transistor) drops to unity. As a rule of thumb, the switching circuits made in a given technology can operate up to a frequency equal to one-half of the $f_T$ of the transistors in that technology. Therefore, a high transit frequency is an important criterion for the selection of the technology. For amplifier circuits, the more important figure of merit is the unity unilateral power gain frequency $f_{max}$, at which the maximum achievable gain of the circuit reduces to unity.

# 3.1.1 IHP SG13S 130 nm BiCMOS Technology

Among the available semiconductor technologies, the compound Silicon-Germanium Bipolar Complementary Metal Oxide Semiconductor (BiCMOS) technology with the minimum feature length of 130 nm from the Institute of High-Performance Microelectronics (IHP Microelectronics GmbH) was the most promising one because it offered some of the fastest NPN HBT transistors worldwide as well as a reliable CMOS process. It was suggested as the technology of choice in the project proposal. Moreover, the institute had extensive IC fabrication experience with this technology. There are two process variants in the 130 nm technology i.e., the SG13S and the SG13G2. The SG13S process offers very fast NPN transistors with a very reliable CMOS process. The SG13G2 process, on the other hand, offers even faster NPN transistors, but the reliability of its CMOS transistors was not known to be as good as that of the SG13S variant at that time. Since the target integrated circuit was supposed to have both high-speed analog components and low-speed digital parts, the SG13S variant was selected.

The SG13S process offers NPN Heterojunction-Bipolar Transistors (HBT) with very high  $f_T$  of 250 GHz and  $f_{max}$  of 340 GHz. It is a self-aligned, single poly-silicon technology with 130 nm minimum lithographic emitter width and seven aluminum metallization layers (5 thin and 2 thick metal layers) [76]. The breakdown voltage BVCEO (breakdown voltage between the collector and the emitter with the base terminal open) value is 1.7 V. Additionally, it offers CMOS transistors with BVDSS (breakdown voltage between drain and source with gate shorted to source) value of 1.2 V for core logic design and 3.3 V for the thick oxide devices for inputs and outputs (IOs). The list of passives includes salicided and un-salicided resistors, metal-insulator-metal (MIM) capacitors, inductors, and varactors.



Fig. 3-1 Variation of the transit frequency of the transistor with the collector current  $I_c$  and the collector to emitter voltage  $V_{ce}$  for the emitter length of 480 nm (left) and 840 nm (right). The  $V_{ce}$  variation is as follows: 0.4 V (red), 0.8 V (blue), 1.2 V (magenta), and 1.6 V (green) [76].

### 3.1.2 TSMC 65 nm Bulk CMOS Technology

The 130 nm BiCMOS technology described above is very suitable for high-speed broadband mixed-signal circuit design, but its CMOS transistors are not fast enough for digital standard-cell design for the transmitter digital baseband core working at the clock rate of 1.667 GHz. For this purpose, the 65 nm bulk CMOS technology from TSMC (Taiwan Semiconductor Manufacturing Company, Limited) was a good option. It provides high-speed CMOS transistors with $f_T$ and $f_{max}$ values of 200 GHz and 230 GHz respectively. A rich standard-cell design library was also available for this technology. One of the reasons for the selection of this technology for the TX BB circuit design was the possibility of test-chip fabrication by sharing the spare chip area from other projects in the institute. TSMC 65 nm CMOS is also a good candidate for the high-speed mixed-signal BB circuit design, since it offers the possibility of digital and analog circuit co-design on a single integrated circuit. It was chosen as the primary candidate for the mixed-signal transmitter baseband circuit design, as explained in section 3.8.

### 3.1.3 GlobalFoundries 28 nm Bulk CMOS Technology

The 28 nm bulk CMOS technology from GlobalFoundries was one of the most scaled technologies available at the institute. The technology offers super low power (SLP) RF-CMOS transistors with an  $f_T$  value of 310 GHz as well as a very rich standard-cell design library for digital design. The core voltage for the technology is 1 V whereas the I/O device voltage is 1.5 V or 1.8 V. It includes value-added RF devices for RF system-on-chip design. The technology was used for post-layout simulations of the mixed-signal receiver baseband circuit design to compare it with the implementation in the 130 nm BiCMOS technology. The implementation details of the circuit in the 28 nm technology are discussed in section 3.10.

## 3.2 Receiver Baseband Unit-Slice

The starting point for the design of the high-speed circuits is the DC bias point (collector current  $I_{CE}$ , collector to emitter voltage  $V_{CE}$ ) of the transistor because it directly affects the voltage-dependent intrinsic parasitic capacitances of the transistors. The optimum bias point can be obtained by varying the bias point of the transistor and observing the change in the transit frequency  $f_T$  of the transistor. The collector current  $I_{C,f_T}$  for which the transit frequency  $f_T$  is maximum for a given  $V_{CE}$  gives the optimum bias point. Two different emitter lengths are available i.e., 480 nm and 840 nm with the minimum emitter width of 120 nm. The width of the transistors (and consequently the current-carrying capability) can be increased by increasing the number of emitter fingers. Fig. 3-1 shows the variation of the transit frequency  $f_T$  of the transistor with the variation in the collector current  $I_c$  as well as the collector to emitter voltage  $V_{ce}$  for the two emitter lengths i.e., 480 nm (on the left) and 840 nm (on the right).

The architecture of the PSSS receiver baseband based on the discussion in section 2.4.2 is shown in Fig. 3-2. It consists of a sliced architecture where each slice corresponds to a unit set of circuits that takes the PSSS stream as the input and returns a stream of data corresponding to the decoded symbol of data. Each slice uses an individual decoding sequence that is correlated with the incoming PSSS stream.



Fig. 3-2 Simplified block diagram of the mixed-signal receiver baseband [69].

The internal architecture of the unit-slice is shown in Fig. 3-3. One of the most important components is the programmable broadband weighted code generator circuit that generates the weighted decoding sequence chips in the form of current signals that are multiplied with the incoming PSSS chips. The weighted code generator consists of programmable differential current sources that are provided as inputs to a high-speed broadband analog current switching mux. The mux sequentially routes the current input signals to the output one after the other. The output of the weighted code generator is multiplied by the PSSS stream using a broadband analog multiplier circuit. The output of the multiplier is integrated for a duration of 15  $T_{chip}$  that is followed by a reset period of 3  $T_{chip}$  for the integrator circuit. At the end of the correlation phase, the output of the integrator is sampled using a sample-and-hold circuit. The reset or integrate command signal for the integrator circuit also acts as the sample/ hold signal for the sample-and-hold circuit. The



Fig. 3-3 Details of the unit-slice architecture

duty cycle of the reset or integrate command signal (or the sample/ hold command signal) is 16.667 %, i.e., the reset phase occupies 3 of the 18 chip periods  $T_{chip} = 1/f_{chip}$  in each frame. The output of the sample-and-hold circuit is converted to a 4-bit digital output using an analog-to-digital converter.
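As a rough behavioural illustration of the unit-slice timing described above, the following Python sketch models ideal multiply, integrate, sample, and quantize steps. The function names, the ideal mid-rise quantizer, and its full-scale value are assumptions made for illustration; none of them describe the implemented circuits.

```python
# Behavioural sketch of one receiver unit-slice (ideal blocks, no circuit effects).
CORRELATE_CHIPS = 15   # integration window in chip periods (from the text)
RESET_CHIPS = 3        # integrator reset window in chip periods (from the text)
FRAME_CHIPS = CORRELATE_CHIPS + RESET_CHIPS


def reset_duty_cycle():
    """Fraction of each 18-chip frame spent in the reset phase."""
    return RESET_CHIPS / FRAME_CHIPS


def correlate_frame(psss_chips, decode_weights):
    """Multiply and integrate over the correlation window, then sample.

    Both arguments hold one 18-chip frame; only the first 15 entries
    contribute because the integrator is reset during the last 3 chips.
    """
    if len(psss_chips) != FRAME_CHIPS or len(decode_weights) != FRAME_CHIPS:
        raise ValueError("expected one 18-chip frame")
    integrator = 0.0
    for k in range(CORRELATE_CHIPS):
        integrator += psss_chips[k] * decode_weights[k]
    return integrator  # value held by the sample-and-hold stage


def quantize_4bit(value, full_scale):
    """Ideal 4-bit quantizer over [-full_scale, +full_scale] (assumed model)."""
    levels = 16
    step = 2 * full_scale / levels
    code = int((value + full_scale) / step)
    return max(0, min(levels - 1, code))


# Example frame: the length-15 decoding sequence followed by three zero chips.
frame = [-1, -1, -1, -1, +1, +1, +1, -1, +1, +1, -1, -1, +1, -1, +1, 0, 0, 0]
peak = correlate_frame(frame, frame)
```

Correlating the example frame with itself gives the maximum correlation value of 15, which the assumed quantizer maps to the top code.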

In the next sub-sections, the circuit design of the mixed-signal weighted code generator circuit, as well as the fast-resettable correlator circuit, is discussed. The direct output of the correlator (i.e., at the output of the integrator) as well as the sampled output of the correlator (i.e., at the output of the sample-and-hold circuit) are made available as chip outputs and are discussed later in this chapter. The above two outputs suffice for the verification of the concept for a mixed-signal PSSS receiver baseband unit slice. The circuit design and the output of the 4-bit ADC are not discussed in this chapter.

## 3.3 Mixed-Signal Programmable Weighted Code Generator Circuit

The most important and complex component of the unit slice shown in Fig. 3-3 is the broadband, fully differential, high-speed, mixed-signal weighted code generator. It is implemented as a fully differential, broadband, analog multiplexer (mux) with 18 static programmable inputs in the form of differential scalable currents where each input represents a weighted chip of the decoding sequence. The analog mux uses a select signal to route one of the 18 differential current inputs to a pair of output resistors forming a differential voltage output. The architecture of the mixed-signal weighted code generator is shown in Fig. 3-4.



Fig. 3-4 The detailed block diagram of the mixed-signal weighted code generator circuit.

Going from the bottom to the top of Fig. 3-4, the code generator consists of the following components [77] [78]:

- Programmable differential static CMOS current sources whose values are defined by the contents of a CMOS shift register.
- One-hot pulse generation circuit to provide the control signals for the current switches. The one-hot pulse generation circuit consists of a delay flip-flop (DFF) chain with a clock tree to supply synchronous clock to the DFFs, and a feedback network to route the output of the last DFF to the first DFF.
- High-speed differential current switches.
- 6-to-1 current summation transimpedance stage circuit to combine the current signals of 6 inputs. Three copies of this circuit are needed in total to cater to the 18 input current signals.
- Transadmittance stage to convert the voltage output of the transimpedance stage to a current signal to perform the wired-OR operation with outputs of 2 additional similar transadmittance stages.
- A final transimpedance stage to combine the outputs of 3 of the above transadmittance stages to generate the differential output voltage.
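The signal flow through the blocks listed above can be sketched behaviourally as follows. This is a minimal model under idealized assumptions (unit gains, no delay or bandwidth limits); the function names are hypothetical and the numeric weights are placeholders, not values from the design.

```python
N_INPUTS = 18    # static current inputs (from the text)
GROUP_SIZE = 6   # inputs combined per 6-to-1 summation stage (from the text)


def one_hot_states(n=N_INPUTS):
    """Yield successive states of an n-stage one-hot ring, starting at 100...0."""
    state = [1] + [0] * (n - 1)
    while True:
        yield list(state)
        state = [state[-1]] + state[:-1]   # last stage feeds the first stage


def mux_output(weights, select):
    """Two-level summation: three 6-to-1 stages followed by one 3-to-1 stage."""
    switched = [w * s for w, s in zip(weights, select)]   # current switches
    partial = [sum(switched[i:i + GROUP_SIZE])            # 6-to-1 summation
               for i in range(0, len(switched), GROUP_SIZE)]
    return sum(partial)                                   # 3-to-1 summation


def run_mux(weights, n_clocks):
    """Output sample per clock cycle for a fixed set of programmed weights."""
    states = one_hot_states(len(weights))
    return [mux_output(weights, next(states)) for _ in range(n_clocks)]


# Placeholder weights 1..18 make the routing order easy to see.
example_weights = list(range(1, 19))
example_output = run_mux(example_weights, 20)
```

Because exactly one select line is active per clock cycle, the model output reproduces the programmed weights in order and wraps around after 18 cycles.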

### 3.3.1 Programmable Differential Current Sources

The analog multiplexer part of the mixed-signal weighted code generator circuit shown in Fig. 3-3 selects one out of the 18 scalable differential current inputs and routes it to the output. This sub-section of the chapter discusses the circuit components used to generate the scalable differential current signals which act as the inputs to the analog current switching mux. Fig. 3-4 shows the building blocks of the input unit-cell (i.e., the programmable differential current source) of the weighted code generator circuit. It is important to note that the inputs to the weighted code generator circuit are static, differential, scalable current signals which remain constant after their values are set using a static CMOS shift register. The individual circuit components of the programmable differential current source are explained in the following sub-sections.

#### 3.3.1.1 Static CMOS Shift Register

The values of the differential current inputs are set using 4-bit digital inputs. The values of the digital input are stored using a static CMOS shift register with a total length equal to  $18 \times 4 = 72$  bits. The shift register consists of CMOS DFFs connected in series, i.e., the output of one DFF feeds the input of the next DFF and so on. The input to the first DFF is made externally accessible to allow a user-defined code to be stored in the shift register. 72 clock cycles are required to fill the shift register by shifting in the 72 data bits of the external input signal. The shift register is a low-speed component and is, therefore, implemented using components from the CMOS standard-cell library.
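The loading procedure can be illustrated with a short Python sketch. The MSB-first bit order and the helper names are assumptions for illustration only; the text does not specify the serialization order.

```python
def weights_to_bitstream(weights, bits_per_weight=4):
    """Serialise 4-bit weight codes, MSB first (the bit order is an assumption)."""
    stream = []
    for w in weights:
        if not 0 <= w < 2 ** bits_per_weight:
            raise ValueError("weight code out of range")
        stream.extend((w >> b) & 1 for b in reversed(range(bits_per_weight)))
    return stream


def shift_in(register, bit):
    """One clock cycle: the new bit enters stage 0, every stage moves on by one."""
    return [bit] + register[:-1]


def load_register(weights, bits_per_weight=4):
    """Shift all weight bits into an initially cleared register."""
    register = [0] * (len(weights) * bits_per_weight)
    stream = weights_to_bitstream(weights, bits_per_weight)
    for bit in stream:
        register = shift_in(register, bit)
    return register, len(stream)


# 18 placeholder weight codes; loading them takes 18 * 4 = 72 clock cycles.
demo_weights = [i % 16 for i in range(18)]
demo_register, demo_cycles = load_register(demo_weights)
```

After the last clock cycle the first bit shifted in sits in the last stage, so the register content is the input stream in reverse order.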

The  $V_{DD}$  bias voltage for the bulk CMOS process is 1.2 V. It is important to note that the NMOS transistors used in the standard-cell library have their body terminals tied to the global p-substrate whereas the global p-substrate of the analog/mixed-signal circuitry in the baseband receiver chip is connected to the  $V_{EE}$  supply voltage of -4 V. Placing the standard-cell library components directly on the substrate will cause the transition point of the logic gates to shift from the optimum transition point of  $V_{DD}/2$ . The solution is to place the standard-cell library components in an isolated local p-substrate or p-well. A buried n-doped layer or deep n-well is used to isolate the local p-substrate of the standard-cell library components from the global p-substrate. The *isolbox* component in the IHP design kit provides a parameterized cell for defining isolated p-wells. Usually, only the isolated NMOS devices (i.e., those whose body terminal is not shorted to the global p-substrate) are enclosed in p-wells. However, if the standard-cell library components are enclosed in the p-well, both NMOS and PMOS devices will be inside the local p-substrate. Post-layout transient simulations of the standard-cell library components, e.g., DFF, logic gates, etc., were performed with and without the enclosing p-well structure and showed no noticeable differences in performance. Note that a p-well creates two parasitic diodes, i.e., one between the local p-substrate and the enclosing n-well (buried deep n-well layer enclosed with vertical n-well walls) and one between the n-well and the global p-substrate. Both parasitic diodes must be reverse biased to avoid unnecessary power dissipation, which in turn increases the electrical noise and the substrate temperature due to heat dissipation. With the n-well connected to the  $V_{DD}$  voltage of 1.2 V and the global p-substrate connected to -4 V, the reverse bias voltage across the parasitic diode was 5.2 V. This is below the 10.8 V magnitude of the reverse breakdown voltage of the PN-junction diode in the technology.

#### 3.3.1.2 CMOS DAC with Differential Current Outputs

The 4-bit digital weight value provided by the static CMOS shift register is applied to a DAC to convert it into an analog voltage. The R-2R DAC architecture is preferred over the alternatives because of its simplicity and smaller footprint. The schematic of the CMOS R-2R DAC is shown in Fig. 3-5. Two copies of the R-2R DAC are used to generate the differential rail-to-rail output voltage. The first DAC gets the direct output of the static CMOS shift register whereas the second DAC gets the inverted version of it. An NMOS pass gate is used for the 0 V connection and a PMOS pass gate is used for the 1.2 V connection. The gates of both pass gates are shorted, which makes sure that there is a connection from the base of the 2R resistor to either 0 V or 1.2 V. The output of the CMOS R-2R circuit is a rail-to-rail voltage around a center value of 0.6 V.

Fig. 3-5 Schematic of the CMOS R-2R DAC
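The complementary-DAC arrangement can be illustrated with an ideal ladder model. This is a sketch that assumes an unloaded, perfectly matched R-2R ladder with output $V_{DD} \cdot code / 2^4$; the function names are hypothetical and the pass-gate resistances are ignored.

```python
VDD = 1.2  # V, CMOS supply voltage from the text


def r2r_dac(code, bits=4, vref=VDD):
    """Ideal unloaded R-2R ladder output: vref * code / 2**bits (assumed model)."""
    if not 0 <= code < 2 ** bits:
        raise ValueError("code out of range")
    return vref * code / 2 ** bits


def differential_output(code, bits=4):
    """The second DAC receives the bitwise-inverted code of the first DAC."""
    inverted = (2 ** bits - 1) - code
    return r2r_dac(code, bits) - r2r_dac(inverted, bits)


# Differential output for every 4-bit code, in volts.
transfer = [differential_output(code) for code in range(16)]
```

In this idealized model the differential output is proportional to $2 \cdot code - 15$, which is always odd. The two codes closest to mid-scale (7 and 8) therefore give outputs of half a differential LSB below and above zero rather than an exact zero.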

The two complementary rail-to-rail voltage outputs from the CMOS R-2R DAC circuits form a differential voltage output which must be converted to a differential current signal. For this purpose, an HBT differential pair with active PMOS loads is used as shown in Fig. 3-6. The rail-to-rail differential voltage is applied first to a pair of NMOS common drain amplifiers to step-down the voltage before applying it to an HBT differential pair with PMOS transistors as the active loads. The body terminals of the NMOS transistors used as the common drain amplifiers are connected to their source terminals to enable a linear output. The output from the active PMOS loads is applied at the inputs of (a pair of) PMOS common source stages with NMOS transistors as active loads. The NMOS current is mirrored using much wider NMOS transistors at the output to obtain the required magnitude of the current. At the (NMOS open-drain) output nodes of the circuit, metal-insulator-metal (MIM) type capacitors are added to reduce the glitches at these nodes. The simulation results showed very good linearity of the circuit. All the MOS devices used in the circuit are thick oxide devices because the V<sub>DS</sub> voltage drop is larger than 1.2 V.



Fig. 3-6 Schematic of the differential voltage-to-current conversion circuit.

### 3.3.2 One-Hot Selection Pulse Generation

The starting point of the analog mux design is the generation of the select signal. This is achieved by using a chain of DFFs that is initialized with a one-hot code (i.e., a single 1 followed by all 0's). Each subsequent clock cycle shifts the one-hot pulse forward by one DFF until it reaches the last DFF, after which it is fed back to the first DFF using a feedback network. The feedback pulse must be sampled by the first DFF at the falling edge of the clock. If the output of the  $18^{th}$  (last) DFF is fed back to the first DFF, it arrives later than the falling edge of the clock, so the pulse is missed and cannot be propagated. The minimum horizontal length of each DFF in the layout is 56 µm. The output of the  $18^{th}$  DFF therefore needs to travel at least the combined length of 18 DFFs, i.e., 1008 µm.

With  $SiO_2$  ( $\epsilon_r = 4.1$ ) as the dielectric medium for the electrical signal on the chip, the speed of the electrical signal is reduced by a factor of  $\sqrt{\epsilon_r}$  compared to that in vacuum. The speed of the electrical signal in vacuum is roughly  $c = 3 \times 10^8$  m/sec and that in  $SiO_2$  is  $1.48 \times 10^8$  m/sec. The path length of 1008 µm translates to a propagation delay of 6.81 ps, which is more than 20 % of the clock period, i.e., 33.33 ps for a clock rate of 30 GHz.
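The delay estimate above can be reproduced with a few lines of Python. This is a plain arithmetic sketch using the values quoted in the text; the variable and function names are illustrative only.

```python
import math

C_VACUUM = 3.0e8     # m/s, rounded speed of light as used in the text
EPS_R_SIO2 = 4.1     # relative permittivity of SiO2 from the text
DFF_LENGTH = 56e-6   # m, minimum horizontal length of one DFF
N_DFF = 18           # number of DFFs in the chain
F_CLK = 30e9         # Hz, clock rate


def signal_velocity(eps_r=EPS_R_SIO2):
    """Propagation velocity in a homogeneous dielectric."""
    return C_VACUUM / math.sqrt(eps_r)


def feedback_delay(n_dff=N_DFF, dff_length=DFF_LENGTH):
    """Time of flight over the minimum feedback path length."""
    return n_dff * dff_length / signal_velocity()


delay = feedback_delay()             # about 6.8 ps
fraction_of_period = delay * F_CLK   # slightly more than 0.2
```

The result differs from the quoted 6.81 ps only in the last digit because the text rounds the velocity to $1.48 \times 10^8$ m/sec before dividing.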

The DFF output pulses are broadband signals with a width of 1/30 GHz or 33.33 ps. A good design goal for the rise/ fall time of the 33.33 ps wide pulse is 7.5 ps. This allows for a duration of ~18 ps for the flat part of the pulse. Using a first-order RC filter approximation for the tracks used to connect the DFF output to the next DFF input, the bandwidth of the signal can be calculated from the rise/ fall time of the signal using the following rule of thumb:

$$BW = 0.35/t_{rise} \tag{3-1}$$

where  $t_{rise}$  is the 10%–90% rise or fall time of the pulse. Using this rule of thumb, the required bandwidth of the DFF pulse is 46.667 GHz. The wavelength of such a signal on the chip (with  $SiO_2$  as the dielectric) can be calculated to be 3200 µm.
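As a worked check of Equation 3-1 and the resulting on-chip wavelength, the following Python sketch repeats the calculation with the values from the text. The function names are illustrative, and a homogeneous $SiO_2$ dielectric is assumed.

```python
import math


def bandwidth_from_rise_time(t_rise):
    """Rule of thumb for a first-order RC response: BW = 0.35 / t_rise."""
    return 0.35 / t_rise


def on_chip_wavelength(freq, eps_r=4.1, c=3.0e8):
    """Wavelength in a homogeneous dielectric with relative permittivity eps_r."""
    return c / (math.sqrt(eps_r) * freq)


T_RISE = 7.5e-12          # s, design goal for the rise/ fall time
PULSE_WIDTH = 1 / 30e9    # s, pulse width at a 30 GHz clock rate

bw = bandwidth_from_rise_time(T_RISE)     # about 46.67 GHz
wavelength = on_chip_wavelength(bw)       # about 3.2 mm
flat_top = PULSE_WIDTH - 2 * T_RISE       # about 18.3 ps
```

The computed wavelength is close to 3.17 mm, which the text rounds to 3200 µm.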

Moreover, the feedback path length of ~1 mm is roughly equal to one-third of the 3.2 mm signal wavelength calculated above for the 46.667 GHz signal bandwidth, which makes it necessary to consider the wave properties of the signal. A driver cell driving a matched load on either end of a transmission line is used to transmit the broadband pulse from the last DFF to the first DFF as shown in Fig. 3-7. The propagation delay of the driver cell added to the propagation delay of the signal through the long feedback path makes the total delay so large that the feedback pulse is not sampled by the falling edge of the clock. Thus, instead of the 18<sup>th</sup> DFF, the output of the 17<sup>th</sup> (or 16<sup>th</sup>) DFF is used to generate the input pulse for the first DFF. Additionally, a selectable delay cell is used to add a delay (selectable in 3 steps with each step adding a delay equal to one-third of the pulse width at 30 GHz) to the selected pulse (16<sup>th</sup> or 17<sup>th</sup>). More details about the selectable feedback delay circuit follow in section 3.3.2.3.



Fig. 3-7 One-hot pulse generation circuit.

The clock signal to the DFFs is gated to allow the use of the set and reset inputs of the DFFs to provide the desired initial conditions for the DFFs. The *external\_clock\_enable* signal is first applied to the D input of a D latch whose active-low enable (E) signal is the external clock signal. This makes sure that the *internal\_clock\_enable* signal (i.e., the output of the D latch) changes to logic 1 only during the negative edge of the clock and is stable before the rising edge of the next clock cycle. As a result, the output of the AND gate has no glitches or short pulses, which could be present if the external clock enable signal were applied directly to the AND gate.

#### 3.3.2.1 D Flip-flop

D flip-flops are used to propagate the one-hot pulse forward at each clock cycle. The negative edge-triggered D flip-flops are based on master-slave latches designed using current mode logic (CML). The flip-flops are equipped with a set or reset mechanism to provide the required initial condition to enable the one-hot pulse generation. The *initial\_condition* input is applied with the clock input disabled, i.e., by setting the *clock\_enable* signal to logic 0 in Fig. 3-7. When the *clock\_enable* signal is low, the differential clock signal remains at logic 0, i.e., the signal *clk\_n* has a higher voltage than *clk\_p*. With the clock signal set to logic 0, the *initial\_condition* input can be set to logic 1 to define the initial condition of the DFF by pulling the set or reset input terminal high (i.e., 0 V) in Fig. 3-8 using a logic level conversion circuit (not shown in the figure) that converts the external *initial\_condition* CMOS level signal (0 V to 1.2 V) to the required voltage levels, i.e., -0.4 V to 0 V respectively. For the normal operation of the DFF, the *initial\_condition* signal is reset to logic 0 (-0.4 V), and the *clock\_enable* signal is set to logic 1. The DFFs sample the pre-defined initial conditions on their inputs at the next falling edge of the clock. Emitter followers (EFs) are used as buffer circuits in the DFF. They are used between the master and slave latch circuits, at the output of the DFF, and for the clock input. The emitter followers used at the output of the DFF are biased with a larger DC current (1.75 mA) than the emitter followers between the master and slave latches, which are biased with only 400 µA of current. This is because, in addition to driving the master latch of the next DFF, the output emitter followers have to drive an additional EF pair load that controls the high-speed differential current switch explained in section 3.3.3, thereby serving as the select signal for the analog mux. The EF pair for the clock input is also biased with a DC current of  $400 \,\mu$ A to reduce the power dissipation.

Fig. 3-8 D flip-flop schematic with set or reset functionality.

#### 3.3.2.2 Clock Tree for the DFF Chain

A clock tree is required to supply the high-speed gated clock signal to the 18 DFFs synchronously. The clock tree should be designed to provide the same total delay at each output node. A binary tree cannot be used because the number of output nodes is not a power of two. The clock tree is, therefore, designed with the non-binary branching ratio  $2 \times 3 \times 3$ . Note that a ternary branch, i.e., one with three output nodes, has a smaller path length to the center output node than to the nodes at the edges. The difference between the path lengths depends on the distance between the left and right output nodes. Choosing the first branching ratio for the clock tree as 2 instead of 3 makes the path length difference smaller for the next branches, i.e., the branching ratio  $2 \times 3 \times 3$  is a better choice than the branching ratios  $3 \times 2 \times 3$  or  $3 \times 3 \times 2$ .



Fig. 3-9 Clock tree for providing clock to the 18 DFFs. The schematic values of the resistive and capacitive peaking are different for different stages.

The clock tree branches add significant capacitive loading as the differential tracks for the clock tree are located on metal layer 4 and metal layer 5 of the layer stack-up whereas the ground layer is located on metal layer 3 of the layer stack-up as explained later in section 4.5. The repeater circuit at each node of the clock tree, therefore, requires some sort of peaking to be able to drive the capacitive load. Electromagnetic simulation of the clock tree segment tracks is employed to perform the parasitic extraction of the long signal tracks. Capacitive and resistive peaking techniques are used to ensure an amplitude of approximately 400  $mV_{diff}$  for the clock signal at each branching node of the clock tree. The values of the resistive and capacitive peaking elements are selected based on post-layout simulations and they vary for each stage of the clock tree repeater cells.

#### 3.3.2.3 Selectable Feedback Pulse Delay Circuit

As illustrated in Fig. 3-7, a selectable delay circuit is used to delay the feedback pulse to the first DFF. This is done to add post-fabrication troubleshooting and flexibility for the operation of the one-hot pulse generation circuit despite the process, voltage, and temperature (PVT) changes. Additionally, it enables operation at different clock frequencies and even allows for a change in the code length. The feedback pulse delay circuit can choose from 4 different delay options where each successive delay cell adds a third of the pulse width at 30 GHz or one-sixth of the pulse width at 15 GHz.

The block diagram of the selectable delay circuit is shown in Fig. 3-10. There are four delay cells connected in series with a switchable output stage connected at the output of each delay cell. The delay cell consists of a cross-coupled pair with resistive loads. The output stage consists of a differential trans-admittance stage with open-collector outputs.



Fig. 3-10 Schematic of the selectable feedback delay circuit with a possibility to select one out of 4 delay options. The CMOS ring register is not shown.

The open-collector outputs of all four output stages are tied together to a pair of resistors to convert the current output to a voltage signal. The output stages have a switchable current source at the bottom. Only one output stage can have its switch closed at any given time while all the other switches are open meaning only one output stage has its current routed to the output resistor pair whereas all the other output stages have no current flowing through them and hence do not contribute to the output signal.

The digital logic used to control the switches of the output stages is controlled using a CMOS ring register. CMOS DFFs with set and reset inputs are used in a ring connection. An external initial condition signal is applied such that the first DFF is set while all others are reset i.e., 1000. This configures the circuit to generate the output after passing through the first delay cell only. If more delay is required, the initial condition signal is removed (or disabled) and a push-button is used to generate a pulse signal that is connected to the clock input of all the CMOS DFFs. With each additional pulse to the clock input of the DFFs in the ring register, the one-hot pulse moves forward by one place thus shifting the output to the next stage (0100, 0010, 0001, ...) while incorporating additional delay.

The simulation results of the feedback pulse delay circuit are shown in Fig. 3-11 for all four delay options for the clock frequency of 30 GHz. Each delay cell adds  $\sim 11$  ps of additional delay. Note that the input to the first delay cell is the output of either the 16<sup>th</sup> or the 17<sup>th</sup> DFF of the one-hot selection pulse generation circuit explained in section 3.3.2, which is selectable with an external input signal 16*or*17.
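The selection mechanism can be summarized with a small behavioural sketch in Python. It assumes an ideal ring register and a uniform delay of 11 ps per cell, as approximated from the simulation value above; the function names are hypothetical.

```python
DELAY_PER_CELL_PS = 11.0   # approximate delay per cell, from the simulation results


def initial_state():
    """State after applying the initial condition: first DFF set, others reset."""
    return [1, 0, 0, 0]


def push_button(state):
    """One clock pulse to the ring register moves the one-hot bit by one place."""
    return [state[-1]] + state[:-1]


def selected_delay_ps(state):
    """Total delay of the selected tap, which sits after (index + 1) delay cells."""
    if sum(state) != 1:
        raise ValueError("ring register must hold a one-hot code")
    return (state.index(1) + 1) * DELAY_PER_CELL_PS


# Step through the four options and wrap around once.
state = initial_state()
delays = []
for _ in range(5):
    delays.append(selected_delay_ps(state))
    state = push_button(state)
```

Because the register is connected as a ring, a fifth pulse returns the selection to the first delay option.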

Fig. 3-11 Simulation results of the selectable feedback delay circuit showing the four possible delayed feedback pulses with either the 16<sup>th</sup> or the 17<sup>th</sup> DFF output pulses as the input.

The output of the selectable delay circuit shown in Fig. 3-10 goes to the first DFF that is located on the other end of the DFF chain. The minimum length that the signal has to travel is equal to the length of 18 DFFs, i.e., 1008  $\mu$m as mentioned before. To ensure that there are no reflections and consequent degradation of the broadband pulse signal, the output of the selectable delay circuit goes to a transmission line (TL) driver circuit which drives a load consisting of 80  $\Omega$  resistors connected on both ends of a microstrip line having a characteristic impedance of 80  $\Omega$ . A pair of emitter follower circuits, with 80  $\Omega$  loads connected from the base of the transistor to the ground, are used at the other end of the transmission line. Note that 80  $\Omega$  is the maximum impedance of the microstrip line with the signal track on Top Metal 2 and the ground plane on Metal 3. More details about the choice of metal layers follow later in chapter 4. The simulation results in Fig. 3-11 show all possible selectable delay pulses that can be used as input for the first DFF. This offers a great deal of flexibility for the prototype circuit and makes it possible to correct post-fabrication issues.

### 3.3.3 High-Speed Differential Current Switches

The output of the one-hot pulse generation circuit serves as the selection input for the analog mux. The differential outputs of the DFFs in the one-hot pulse generation circuit are applied to EFs before applying them to individual differential current switches consisting of current switching quads. The current switching quad consists of two differential pairs. The input to the switching quad is a linearly scalable differential current signal that is described in section 3.3.1. The differential current signal can either be routed to the output (when the DFF output is high) or be routed to an alternate dummy branch that is not routed to the output (when the DFF output is low) as shown in Fig. 3-12. The differential pairs in the switching quad are made unbalanced by adding degeneration resistors of 15  $\Omega$  to only one side of the differential pair. This allows a smooth transition of the differential current output of the analog mux as the one-hot selection pulse switches to the next DFF. The DC voltage at the I\_in nodes is -2.9 V whereas the voltage at the I\_out nodes is -1.45 V, which keeps the  $V_{CE}$  drop of the transistors connected to the outputs below the breakdown voltage  $BV_{CEO}$  of 1.7 V (section 3.1.1) regardless of whether the DFF output is high or low. However, when the DFF output is low (i.e., the DFF\_n signal is higher than the DFF\_p signal), the voltage drop across the transistors connected to the dummy loads would exceed the breakdown voltage, so two diodes are used to drop down the voltage at the collectors.



Fig. 3-12 Schematic diagram of the high-speed differential current switch.

### 3.3.4 6-to-1 Current Summation Transimpedance Stage

The differential current outputs from the 18 differential current switches have to be combined to form the current output of the mux. A transimpedance stage is required to convert the current output signal to a voltage equivalent. The simplest method to achieve both these goals is to connect all the left-hand side (or p) open-collector outputs to one resistor and all the right-hand side (or n) open-collector outputs to another resistor. There are two problems with this approach. Firstly, combining the 18 open-collector outputs to one resistor increases the RC time constant at the node significantly. Secondly, the current pulses are broadband signals with a width of 1/30 GHz or 33.33 ps. A good approximation of the rise/ fall time of the 33.33 ps wide pulse is 7.5 ps, similar to the DFF output pulses. This allows for a duration of 18.33 ps for the flat part of the pulse. Using a first-order RC filter approximation for the tracks which will be used to connect the open-collector outputs to the resistors, the bandwidth of the signal can be calculated from the rise/ fall time of the signal using the rule of thumb given in Equation 3-1. According to this, the required bandwidth of the broadband multivalued current pulse is 46.667 GHz. The wavelength of such a signal on-chip (with  $SiO_2$  as the dielectric) can be calculated to be 3200 µm. If all 18 open-collector outputs are connected at the geometric mid-point, then the minimum length of the longest track connecting the leftmost (1<sup>st</sup>) and the rightmost (18<sup>th</sup>) output to the midpoint has been measured in the layout to be approximately 1000 µm, which is 31.25% of the wavelength of the signal. Thus, wave properties of the signal must be considered, which requires the use of transmission lines with matched loads to avoid reflections. Also, note that a distance of 1000 µm is equivalent to a propagation delay of 6.66 ps, which means that there will be a skew of 6.66 ps (20 % of the pulse width) between the 1<sup>st</sup> and the 18<sup>th</sup> current outputs.

The solution to the first problem is to use a common base stage between the node connecting the open collector outputs and the output resistor. This reduces the Miller capacitance between the collector and base terminals of the differential current switches. The solution to the second problem is to connect 6 open collector outputs rather than all 18 of them together. This reduces the difference in track lengths between the farthest output



Fig. 3-13 Schematic of the 6-to-1 current summation cell to combine 6 out of 18 current outputs.

nodes (1<sup>st</sup> and 6<sup>th</sup> or 7<sup>th</sup> and 12<sup>th</sup> or 13<sup>th</sup> and 18<sup>th</sup>) from 1000  $\mu$ m to 333.33  $\mu$ m and the difference in the maximum skew value from 6.66 ps to 2.22 ps. The 6-to-1 current summation cell to combine the output of the six currents is shown in Fig. 3-13.
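The skew figures quoted above can be checked with a short calculation. The sketch below assumes a propagation velocity rounded to $1.5 \times 10^8$ m/sec, since this rounding reproduces the quoted 6.66 ps and 2.22 ps values; the names are illustrative only.

```python
V_SIGNAL = 1.5e8         # m/s, rounded on-chip velocity (assumption, see lead-in)
PULSE_WIDTH = 1 / 30e9   # s, pulse width at a 30 GHz select rate


def skew(track_length):
    """Propagation delay over a track of the given length in metres."""
    return track_length / V_SIGNAL


full_track = 1000e-6             # m, longest track when combining all 18 outputs
grouped_track = full_track / 3   # m, longest track when combining 6 outputs

skew_18 = skew(full_track)       # about 6.67 ps
skew_6 = skew(grouped_track)     # about 2.22 ps
relative_skew_18 = skew_18 / PULSE_WIDTH
relative_skew_6 = skew_6 / PULSE_WIDTH
```

Under this assumption, grouping the outputs in sets of six reduces the worst-case skew from about 20 % to about 6.7 % of the pulse width.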

### 3.3.5 Transadmittance Stage

Three copies of the 6-to-1 current summation transimpedance cells are needed to combine the 18 current outputs. The voltage outputs of the transimpedance stages need to be converted into currents to combine them in the next step using wired-OR operation. Fig. 3-14 shows the open collector differential amplifier transadmittance stage.



Fig. 3-14 Schematic of the transadmittance stage to convert the voltage output of a 6-to-1 current summation stage into a current signal. Three copies of this circuit are needed to combine the 18 current output signals.

### 3.3.6 3-to-1 Current Summation Transimpedance Stage

The differential current output signals of the three transadmittance stages are wired-ORed to generate the final output of the mux. The output of each of the local current summation cells is connected to a transmission line with a characteristic impedance of 80  $\Omega$  that is terminated on the other end with a matched load of 80  $\Omega$ . Fig. 3-15 shows how 3 copies of the current summation cells are combined to generate the final output of the mux.



Fig. 3-15 Combining the 3 current summation cell outputs to generate the final mux output voltage signal.

A common base stage is used as a current buffer. Note that the DC bias current of all 3 current summation cells flows through the mux output resistors which lowers the DC voltage of the resistors. To make sure that the collector bias voltages of the common base stage are enough to keep the transistors out of saturation the resistors are biased at 0.3 V. This is achieved by using a diode to drop down the voltage from 1.2 V.

## 3.4 Characterization of the Mixed-Signal Weighted Code Generator

The simulation results of the mixed-signal weighted code generator circuit are shown in Fig. 3-16. The inputs to the mux in the simulation correspond to the following vector where the first 15 bits correspond to an m-sequence of length 15 (during the correlation phase) followed by three (near) zero levels (during the guard interval). Note that a true zero level is not possible with an R-2R DAC and therefore, signals with levels  $\pm LSB/2$  around the mid-scale are used.

$$\{-1, -1, -1, -1, +1, +1, +1, -1, +1, +1, -1, -1, +1, -1, +1\}, \{0, 0, 0\}$$

The frequency of the select signal of the analog multiplexer, which defines the output data rate, is 30 GHz. The simulation results show a very good slew rate of the output signal.
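The properties of the test vector can be checked with a short Python sketch. The generator recurrence used below (corresponding to the polynomial $x^4 + x + 1$), the all-ones seed, and the bit-to-chip mapping are choices made here for illustration; the text does not state how the sequence was generated.

```python
# The 15-chip decoding sequence quoted in the text.
CODE = [-1, -1, -1, -1, +1, +1, +1, -1, +1, +1, -1, -1, +1, -1, +1]


def lfsr_bits(seed=(1, 1, 1, 1), length=15):
    """Bits of the recurrence a[n+4] = a[n] xor a[n+1] (assumed generator)."""
    bits = list(seed)
    while len(bits) < length:
        bits.append(bits[-4] ^ bits[-3])
    return bits


def bits_to_chips(bits):
    """Map bit 1 to chip -1 and bit 0 to chip +1 (assumed mapping)."""
    return [-1 if b else +1 for b in bits]


def periodic_autocorrelation(chips, shift):
    """Correlation of the sequence with a cyclically shifted copy of itself."""
    n = len(chips)
    return sum(chips[i] * chips[(i + shift) % n] for i in range(n))


generated = bits_to_chips(lfsr_bits())
autocorr = [periodic_autocorrelation(CODE, s) for s in range(len(CODE))]
```

With these assumptions the generated chips match the quoted vector, and the periodic autocorrelation shows the two-valued behaviour expected of an m-sequence: 15 at zero shift and -1 at every other shift.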

An important point to note is that the direct output of the mixed-signal weighted code generator is not available as a chip output; the only way to characterize the output of the weighted code generator is to observe the output of the correlator with the PSSS input set to a constant DC level. When the PSSS input is fixed to logic level 1 (constant DC level), the output of the multiplier is the same as the output of the code generator circuit, and the output of the correlator is the integrated version of the code generator output. The measured result of the integrated version of the weighted code generator is shown in Fig. 3-17. The PSSS input is set to a constant DC level and the weighted code generator is programmed to generate an m-sequence of length 15 i.e.,  $\{-1, -1, -1, -1, +1, +1, +1, -1, +1, +1, -1, -1, +1, -1, +1\}$  followed by 3 zeros



Fig. 3-16 Simulation result of the weighted code generator circuit with a 30 GHz clock signal used as the select signal for the analog current switching mux.

 $\{0,0,0\}$  during the reset phase of the correlator. The measured result shows not only very good linearity but also the quick resetting of the correlator as required.

Note that the measurement setup and the details about the measurements are provided in chapter 5 along with the characterization of the complete mixed-signal PSSS receiver baseband unit-slice circuit.



Fig. 3-17 Measured output signal showing the integral of the output generated by the weighted code generator circuit.

# 3.4.1 Comparison with the State of the Art

There are two circuit design approaches for implementing a programmable weighted code generator circuit. One approach is based on the analog current switching multiplexer as used in the current implementation. The idea is to perform the wired-OR operation of the input signals with switches to disconnect all but one of the input signals. Commercial analog multiplexers are available as analog switches with input signal bandwidth up to 4.5 GHz. They are used in applications that require the selection of one input out of several inputs based on the select signal. No repetitive or high-speed switching of the signals is required. For the fastest commercial analog switches, the delay between the application of the digital control signal (50% of the full scale) and the output switching on or off is in the nanosecond range i.e., up to 10 ns. The commercially available microelectromechanical (MEM) switches have an input 3 dB bandwidth up to 14 GHz, but their switching times are in the microsecond range.

Analog multiplexers with wideband inputs and high switching speeds have been reported in publications e.g., in [79], a 2:1 analog mux is designed in InP DHBT technology with an input 3 dB cut-off frequency of 40 GHz. The multiplexer is used to multiplex the 25 GHz outputs of two 25 GS/s DACs to implement a 50 GS/s DAC. The power consumption of the multiplexer is 1.35 W using a supply voltage of 5.5 V. In [80], a fully differential analog 2:1 mux is presented that features an input signal bandwidth exceeding 67 GHz. The mux is used for the time-interleaving of two DACs each having an analog output bandwidth of 36 GHz. The mux is designed in 130 nm SiGe BiCMOS technology and has a static power dissipation of 1.06 W using -4.5 V for mux core and -3.4 V for clock drivers. The analog mux in [81] has a 3 dB input bandwidth of 97 GHz. The mux is designed in a 0.25  $\mu$ m InP DHBT technology and dissipates 1 W of power using a -4.6 V supply. In [82], a 2:1 analog mux is presented with measured 3 dB bandwidth of 110 GHz for both data and clock paths. The measurements show time-domain large-signal sampling operations of up to 180 GS/s. The circuit is designed in 0.25  $\mu$ m InP DHBT technology and dissipates 0.9 W of power.

Apart from the notable examples mentioned above, analog multiplexers are also reported in other publications. It is important to note that the number of analog inputs to be multiplexed is limited to small numbers. In most cases, there are only two inputs to be multiplexed. The analog mux designed in the current application has 18 inputs to select from using 33.33 ps selection pulses. There are no examples of analog multiplexers with a nonbinary number of inputs let alone with such a large number of inputs.
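The timing figures quoted above follow directly from the select clock frequency and the number of mux inputs. The Python sketch below reproduces this arithmetic; the variable names are illustrative, and only the 30 GHz select frequency and the 18 inputs are taken from the text.

```python
SELECT_CLOCK_HZ = 30e9   # select-signal frequency given in the text
NUM_INPUTS = 18          # number of analog mux inputs given in the text

# Width of one selection pulse, in picoseconds.
slot_ps = 1e12 / SELECT_CLOCK_HZ

# Time needed to cycle once through all inputs, in picoseconds.
frame_ps = NUM_INPUTS * slot_ps

# Rate at which each individual input is selected, in GHz.
per_input_rate_ghz = SELECT_CLOCK_HZ / NUM_INPUTS / 1e9

print(f"selection pulse width: {slot_ps:.2f} ps")
print(f"frame period:          {frame_ps:.1f} ps")
print(f"per-input rate:        {per_input_rate_ghz:.3f} GHz")
```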

The other circuit design approach to implement the programmable weighted code generator circuit is using a 4-bit DAC (with a sampling rate of 30 GS/s and an analog output bandwidth of 30 GHz) with a 4-bit 18:1 digital mux. An example of a 30 GS/s 4-bit DAC with 30 GHz of output bandwidth designed in 0.25  $\mu$ m SiGe BiCMOS technology is presented in [83] which dissipates 445 mW of power using a 3.5 V supply with an area of 1.87 mm<sup>2</sup>. Another example is a 32 GS/s 6-bit double sampling DAC in InP HBT technology that dissipates 1.4 W with a -4 V supply with an area of 9 mm<sup>2</sup> [84]. The design of the 4-bit 18:1 digital mux with an output data rate of 30 GBaud is quite challenging in terms of layout and will also dissipate a lot of power.

### 3.5 Broadband Correlator

The weighted code generator circuit provides the weighted decoding vector that is correlated with the incoming PSSS waveform. To perform the correlation, a broadband analog correlator is required that multiplies the chips of the PSSS waveform with the on-chip weighted code generator output. The cross-correlation between two signals u(t) and v(t) describes the similarity between the signals u(t) and v(t) and is defined as

$$u(t)\otimes v(t) \triangleq \int_{-\infty}^{\infty} u^*(\tau)v(\tau+t)d\tau \triangleq \int_{-\infty}^{\infty} u^*(\tau-t)v(\tau)d\tau$$
(3-2)

where  $u^*(\tau)$  denotes the complex conjugate of the signal  $u(\tau)$  that is equal to the signal  $u(\tau)$  for real-valued signals. The process of correlation consists of multiplication and integration and is performed using an analog integrate and dump correlator with a reset mechanism to reset the integrator after completion of the correlation.
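For sampled, real-valued signals, the integral in (3-2) becomes a sum of products, which is exactly the multiply-and-accumulate operation performed by the multiplier and the integrator. The following Python sketch is a discrete-time illustration only (the function names and the test vectors are assumptions, not taken from the design); it evaluates both forms of (3-2) and shows that they agree.

```python
def cross_correlation(u, v, lag):
    """First form of (3-2) for real sequences: sum over n of u[n] * v[n + lag]."""
    total = 0.0
    for n, u_n in enumerate(u):
        m = n + lag
        if 0 <= m < len(v):  # samples outside the record are taken as zero
            total += u_n * v[m]
    return total


def cross_correlation_shifted(u, v, lag):
    """Second form of (3-2): sum over m of u[m - lag] * v[m]."""
    total = 0.0
    for m, v_m in enumerate(v):
        n = m - lag
        if 0 <= n < len(u):
            total += u[n] * v_m
    return total


# Arbitrary example vectors (illustrative values only).
u_example = [1.0, 2.0, 3.0]
v_example = [0.0, 1.0, 0.5]
for lag in (-1, 0, 1):
    print(lag,
          cross_correlation(u_example, v_example, lag),
          cross_correlation_shifted(u_example, v_example, lag))
```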

#### 3.5.1 Multiplier Circuit

A four-quadrant multiplier circuit based on a Gilbert cell is used as the analog multiplier. The schematic of the multiplier circuit shown in Fig. 3-18 has two inputs i.e., the external PSSS input (*In1*) and the output of the weighted code-generator circuit explained in section 3.3 generated on-chip (*In2*). The PSSS input of the multiplier has a linear dynamic range larger than 1 V<sub>diff</sub> whereas the weighted code generator input has a linear dynamic range of roughly 250 mV<sub>diff</sub>. The 1 V<sub>diff</sub> dynamic range for PSSS is based on the maximum output dynamic range of M8195 arbitrary waveform generator (AWG) from Keysight which was the available test source for the PSSS signal for the baseband receiver chip characterization. The 250 mV<sub>diff</sub> dynamic range for the mux input corresponds to the maximum output linear dynamic range of the mux.



Fig. 3-18 Schematic of the four-quadrant multiplier circuit

#### 3.5.2 Integrator Circuit

For the realization of the broadband integrator, a  $G_m$ -C cell is used. The transconductance  $G_m$ -cell is realized using an HBT differential amplifier which has a pair of resistors as the passive load. A negative resistance generator circuit realized as a cross-coupled HBT differential pair is connected in parallel to the resistive load as shown in the dashed box in Fig. 3-19. The integrating capacitor is connected between the two output nodes of the differential  $G_m$ -cell. The capacitor is realized as a pair of cross-coupled metal-insulator-metal (MIM) type capacitors connected in parallel. The circuit is realized using only NPN transistors because the IHP design-kit does not offer any PNP transistors. A fast reset circuit and a manual offset correction circuit are also included. The integrator is not only required to be linear but it must also allow fast resetting before the start of the next integration cycle.

The dynamic range of the integrator is a very important parameter for both the input and the output of the integrator. The input dynamic range of the  $G_m$ -C integrator depends on the linearity of the  $G_m$  cell. The  $G_m$  cell being an HBT differential pair in the current design can be linearized with resistive degeneration. The linearity of the output of the integrator circuit depends on the RC time constant of the load. There are, however, two conflicting requirements for the RC time constant in this circuit. During the integration phase, the RC time constant must be very large to allow linear operation whereas during reset operation it must be small enough to allow a swift discharge. The required change in the RC time constant during the two phases is obtained by changing the resistance R of the RC load while keeping the capacitance constant during the integration and reset phases.
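The influence of the RC time constant on the linearity of the output ramp can be illustrated with a first-order model in which a constant current charges the capacitor in parallel with the load resistance. The Python sketch below is a simplified model under stated assumptions: the capacitance and the resistance values are hypothetical and do not represent the fabricated circuit; only the integration time of 15 chips at 30 Gcps (500 ps) is taken from the text.

```python
import math

C_INTEG = 500e-15          # hypothetical integrating capacitance (F)
T_INTEG = 15 / 30e9        # 15 chips at 30 Gcps = 500 ps (from the text)
R_VALUES = (200.0, 2e3, 20e3)  # hypothetical equivalent load resistances (ohm)


def relative_ramp_error(r_load, c_integ, t):
    """Relative deviation of the lossy RC response from the ideal ramp.

    Lossy response: i * R * (1 - exp(-t / RC)); ideal ramp: i * t / C.
    The input current cancels in the ratio.
    """
    x = t / (r_load * c_integ)
    return 1.0 - (1.0 - math.exp(-x)) / x


errors = [relative_ramp_error(r, C_INTEG, T_INTEG) for r in R_VALUES]
for r, err in zip(R_VALUES, errors):
    print(f"R = {r:8.0f} ohm -> ramp error at end of integration: {100 * err:.1f} %")
```

In this model, a large resistance gives a nearly ideal ramp, while a small resistance discharges the capacitor quickly. This is the reason for switching the resistance between the integration and reset phases.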

The magnitude of the negative resistance is designed to be equal to the positive load resistance ( $R_{C1}+R_{C2}$ ), which results in a very large equivalent resistance in parallel to the integrating capacitor during the integration phase. Ignoring the output resistance  $r_o$  and the input resistance  $r_{\pi}$  of the common-emitter stage, the negative resistance of the cross-coupled differential pair ( $Q_5$ ,  $Q_6$ ) with emitter degeneration ( $R_E$ ) is given by  $-R_E - 2/g_m$  where  $g_{m5} = g_{m6} = g_m = I_C/V_T$  is the transconductance of the transistors  $Q_5$  and  $Q_6$ .  $V_T$  is the thermal voltage given by kT/q where k is Boltzmann's constant which has the value  $k = 1.381 \times 10^{-23}$  Joules/Kelvin, T is the temperature in Kelvin and q is the elementary charge i.e., the charge on a single proton or the magnitude of the charge on an electron which has the value  $q = 1.602 \times 10^{-19}$  Coulombs. The collector current  $I_{C1} = I_{C2} = I_C = I_{neg}/2$ , which is set by the current sources and can be controlled externally, can be used to adjust the  $-2/g_m$  part and thereby tune the negative resistance to be equal in magnitude to the collector resistance ( $R_{C1}+R_{C2}$ ) of the main differential pair [85].
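The tuning condition can be made concrete with a small numerical example. The Python sketch below uses the standard bipolar relation $g_m = I_C/V_T$ together with the expression  $-R_E - 2/g_m$  from the text to compute the tail current  $I_{neg}$  at which the negative resistance cancels the load resistance. The resistor values and the temperature are hypothetical and are not the design values; the function names are likewise chosen for illustration.

```python
K_BOLTZMANN = 1.381e-23   # J/K, value quoted in the text
Q_ELEMENTARY = 1.602e-19  # C, value quoted in the text


def thermal_voltage(temp_k):
    """V_T = kT/q."""
    return K_BOLTZMANN * temp_k / Q_ELEMENTARY


def negative_resistance(i_c, r_e, temp_k):
    """-(R_E + 2/g_m) with g_m = I_C / V_T."""
    g_m = i_c / thermal_voltage(temp_k)
    return -(r_e + 2.0 / g_m)


def tail_current_for_cancellation(r_load, r_e, temp_k):
    """I_neg = 2 * I_C such that |negative resistance| equals r_load."""
    if r_load <= r_e:
        raise ValueError("r_load must exceed r_e for cancellation to be possible")
    i_c = 2.0 * thermal_voltage(temp_k) / (r_load - r_e)
    return 2.0 * i_c


# Hypothetical example: R_C1 + R_C2 = 200 ohm, R_E = 100 ohm, T = 300 K.
i_neg = tail_current_for_cancellation(200.0, 100.0, 300.0)
print(f"V_T   = {1e3 * thermal_voltage(300.0):.2f} mV")
print(f"I_neg = {1e3 * i_neg:.3f} mA")
print(f"R_neg = {negative_resistance(i_neg / 2.0, 100.0, 300.0):.1f} ohm")
```

Since  $V_T$  depends on temperature, the required current also varies with temperature, which is consistent with making  $I_{neg}$  externally adjustable.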



Fig. 3-19 Integrator core with fast reset and manual offset correction circuits.

The maximum output voltage of the integrator is limited by the base to collector voltage of the cross-coupled pair transistors (Q<sub>5</sub>, Q<sub>6</sub>); because by increasing it to a large value, one of the two transistors goes into saturation. On the other hand, the negative resistance generated by the cross-coupled pair is a function of the differential voltage across its terminals thereby changing the effective RC constant of the integrator for large output amplitudes. Note that during the correlation process, the output of the integrator can go up to  $log_2(N + 1)$  times the final correlation result where *N*=length of the m-sequence (see Table 2-2). For the chosen case of *N*=15, the instantaneous output of the integrator during the correlation can go up to 4 times the final output of the correlator. The final correlator output is, therefore, limited to ±120 mV<sub>diff</sub> to restrict the maximum instantaneous output voltage amplitude to ±480 mV<sub>diff</sub>. The 120 mV<sub>diff</sub> output corresponds to the maximum amplitude symbol i.e., (±15) of the PAM-16 input data symbol set. The minimum amplitude symbol i.e., (±1) of the PAM-16 input data symbol set corresponds to the final output voltage amplitude of ±8 mV<sub>diff</sub>.
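The amplitude budget described above can be reproduced with a few lines of arithmetic. In the Python sketch below, the sequence length, the ±480 mV<sub>diff</sub> limit, and the PAM-16 symbol amplitudes are taken from the text; the variable names are illustrative.

```python
import math

N = 15                     # length of the m-sequence
PEAK_LIMIT_MV = 480.0      # maximum instantaneous integrator output (mV diff)
MAX_SYMBOL = 15            # largest PAM-16 symbol amplitude
MIN_SYMBOL = 1             # smallest PAM-16 symbol amplitude

# Ratio of the worst-case instantaneous output to the final correlation result.
overshoot_factor = math.log2(N + 1)

# Largest final correlator output that keeps the peak within the limit.
final_limit_mv = PEAK_LIMIT_MV / overshoot_factor

# Final output voltage per unit of symbol amplitude.
step_mv = final_limit_mv / MAX_SYMBOL

print(f"overshoot factor:        {overshoot_factor:.0f}")
print(f"final output, symbol 15: {final_limit_mv:.0f} mV diff")
print(f"final output, symbol 1:  {MIN_SYMBOL * step_mv:.0f} mV diff")
```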

For the reset operation, a current switch is used that connects both terminals of the integrating capacitor ( $C_{integ}$ ) to a low impedance discharge path ( $Q_7 - Q_9$ ). When the current is steered to transistor  $Q_9$  by the *reset* signal, the transistors  $Q_7$ ,  $Q_8$  operate as diodes forcing the voltages at the nodes *Integ\_p* and *Integ\_n* to become equal, and hence  $C_{integ}$  is discharged. Note that the common-mode voltage of the nodes *Integ\_p* and *Integ\_n* remains constant during the reset and integration phases.

The transition between the integration and reset phases is governed by the differential pair formed by  $Q_9$ ,  $Q_{10}$  with inputs *intgrt* and *reset*. The inputs *intgrt* and *reset* form a differential signal generated by taking the logical OR of the 16<sup>th</sup> and 17<sup>th</sup> DFF outputs with a delayed version of the 15<sup>th</sup> DFF output from the one-hot pulse generation circuit. Note that the *intgrt* and *reset* differential signal has a long path length in the layout and therefore, the 18<sup>th</sup> DFF output cannot be used.

A manual offset correction circuit is added to correct any post-fabrication offset voltages in the integrator circuit ( $Q_{13}$ ,  $Q_{14}$ ,  $I_{offset}$ ). The circuit allows control of magnitude and sign of the offset correction. Emitter follower stages are added as buffers at both input ( $Q_1$ ,  $Q_2$ ) and output ( $Q_{11}$ ,  $Q_{12}$ ) of the integrator core. The current sources  $I_{neg}/2$  are implemented as current mirrors and their value can be adjusted externally to match the value of the negative resistance to that of the collector resistance ( $R_{C1}+R_{C2}$ ).

#### 3.5.3 Generation of the Integrate or Reset Command Signal

The generation of the synchronous integrate or reset command signal is achieved by making use of the DFF outputs from the one-hot pulse generation circuit described in section 3.3.2. The outputs of the 15<sup>th</sup>, 16<sup>th</sup>, and 17<sup>th</sup> DFF are OR'ed together using a CML OR gate. Note that the same signal acts as the track or hold command signal for the track-and-hold circuits comprising the sample-and-hold circuit. In the layout, the broadband output signal of the OR gate has to travel a distance of roughly 600  $\mu$m. To avoid reflections and signal degradation, a microstrip line with matched loads on both sides is used. At the starting point, a differential amplifier with 80  $\Omega$  loads is used to drive a microstrip line with a characteristic impedance of 80  $\Omega$ . On the other side, a common collector stage is used with an 80  $\Omega$  resistor connected from the base to the ground. While the physical path delay of the microstrip line cannot be changed, the propagation delay through the transmission line driver circuit and the common collector stage is adjusted to position the *intgrt* or *reset* command pulse at the correct position i.e., from the rising edge of the 16<sup>th</sup> DFF pulse to the falling edge of the 18<sup>th</sup> DFF pulse of the multiplier output (or integrator input). The simulation result in Fig. 3-20 shows the alignment of the reset or integrate command signal with the code generator output. The PSSS input was set at a constant DC level of -1. The multiplier output (or integrator input) and the integrator output are also shown in the figure.
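The timing of the command signal can be illustrated with a slot-level behavioral model. The Python sketch below is a strong simplification and not a description of the actual CML implementation: it assumes an 18-stage one-hot ring with one stage active per chip slot, and it assumes that the total path delay (driver, microstrip line, and common collector stage) amounts to exactly one chip slot. All function names are hypothetical.

```python
NUM_TAPS = 18               # assumed number of DFFs in the one-hot ring
RESET_TAPS = (15, 16, 17)   # 1-based DFF outputs OR'ed together (from the text)


def one_hot_frame():
    """One frame of the ring: in slot k only tap k is high (1-based)."""
    return [[1 if tap == slot else 0 for tap in range(1, NUM_TAPS + 1)]
            for slot in range(1, NUM_TAPS + 1)]


def reset_command(frame, delay_slots=0):
    """OR of the selected taps, circularly delayed by a whole number of slots."""
    raw = [int(any(state[tap - 1] for tap in RESET_TAPS)) for state in frame]
    n = len(raw)
    return [raw[(i - delay_slots) % n] for i in range(n)]


def high_slots(command):
    """1-based slot numbers during which the command is high."""
    return [i + 1 for i, level in enumerate(command) if level]


frame = one_hot_frame()
print("at the OR gate output:", high_slots(reset_command(frame, 0)))
print("after one-slot delay: ", high_slots(reset_command(frame, 1)))
```

Under these assumptions, the delayed pulse covers slots 16 to 18, i.e., the guard interval that follows the 15 chips of the correlation phase.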



Fig. 3-20 Simulation results showing the alignment of the reset/ integrate command signal with the input of the correlator.

### 3.6 Characterization of the Broadband Correlator Circuit

To characterize the performance of the broadband, fast-resettable correlator circuit, a test-chip was manufactured which includes the four-quadrant multiplier and the fast-resettable integrator described in section 3.5. A sample-and-hold (S/H) circuit is also included in the test-chip but its circuit and the measurement results are not explained here. The circuit design of the S/H circuit is explained in chapter 4 whereas the results are discussed in chapter 5 with the overall measurement results for the receiver baseband unit slice. The block diagram of the test-chip with the external inputs and outputs is shown in Fig. 3-21. To characterize the chip, a high-speed printed circuit board (PCB) was fabricated with a cavity in the middle to place the chip. The height of the chip according to the foundry is 250-300 µm. The cavity reduced the height difference between the chip and the



Fig. 3-21 Block diagram of the test-chip with the external inputs and outputs indicated.

PCB surface. A ground ring connected to the bottom ground plane of the RF PCB surrounded the chip, which allowed very short, low-inductance connections to the ground plane on the backside as shown in Fig. 3-23. Although the bond wires for the ground connections were quite short, the RF tracks had a larger distance to the RF pads due to the ground ring, which lengthened the RF bond wires and increased their inductance. The RF board was fitted with sub-miniature push-on (SMP) prototype connectors. The SMP connectors were very sensitive to any vertical or lateral movement of the cables and required very good fixation to



Fig. 3-22 Measurement setup to characterize the correlator circuit.

avoid damaging the PCB. More details about the PCB with improvements in the design and better high-frequency characteristics are discussed in chapter 5.

The measurement setup for the characterization of the chip is shown in Fig. 3-22. The two inputs to the correlator as well as the reset/integrate command signal were generated using a bit pattern generator (BPG) with 4 sub-rate channels (up to 32 Gbps) and 2 full-rate channels (up to 56 Gbps) that are the multiplexed versions of the sub-rate channels. The output was connected to a high-bandwidth (70 GHz) sampling oscilloscope with two single-ended input channels. A true differential measurement was not possible. The sampling oscilloscope requires a trigger reference signal for sampling the signals. The trigger reference signal should be a clock signal synchronous to the sampled signal. This signal is provided using the clock output signal of the BPG. The output data rate of the BPG signals is defined using an external clock input signal that is provided using a high-frequency programmable signal generator (up to 40 GHz). A photograph of the measurement setup including the PCB containing the chip as well as the peripheral devices is shown in Fig. 3-22. The mechanical instability of the RF connectors, the poor surface profile of the manufactured PCB, and the long RF bond wires necessitated thorough investigations and a new design approach to optimize the high-frequency characteristics of the PCB for the receiver baseband unit-slice test-chip, which is explained in detail in chapter 5.



Fig. 3-23 Chip microphotographs of the broadband, fast resettable correlator test-chip.

### 3.6.1 Linearity Test

To investigate the linearity of the step response of the correlator circuit, square wave signals with different amplitudes were applied at the multiplier inputs of the correlator. When the reset signal is low, the differential output signal of the correlator is a ramp signal, as shown in Fig. 3-24, that resets to 0 V<sub>diff</sub> after the reset signal is activated. Fig. 3-24 shows integrator ramp outputs for all four combinations of multiplier inputs i.e.,  $\{++, --, -+, +-\}$  generating both positive and negative ramps. Note that the negative



Fig. 3-24 Step-response of the correlator for all 4 combinations of multiplier inputs i.e., {++, --, -+, +-}

ramp has a slightly larger amplitude than the positive ramp because of a small DC offset voltage in the output driver circuit. The offset is observed even if 0  $V_{diff}$  is applied at both inputs of the correlator. Using the peak values of the positive and negative ramps for different amplitudes of the two inputs, the output 1 dB compression point was calculated. The worst-case value for the output 1 dB compression point (P1dB) was found to be 9.9 dBm (700 mV<sub>diff</sub>) for both inputs.

# 3.6.2 Correlation Test



Fig. 3-25 Correlator output (single-ended) with bipolar m-sequences applied at both inputs (with and without time shift) with a data rate of 28 Gbps.

and the integrator output is reset already within 3  $T_{chip}$  or 107 ps. The results for the bit rate of 33 Gbps are shown in Fig. 3-26. Note that the data rate of 33 Gbps is beyond the



Fig. 3-26 Correlator output (single-ended) with bipolar m-sequences applied at both inputs (with and without time shift) with a data rate of 33 Gbps.

correlation peak of  $\pm 33$  mV or 66 mV<sub>diff</sub> can be seen which can be easily detected using a suitable detector circuit designed in 130 nm SiGe BiCMOS technology. The correlator circuit shows excellent linearity and high-frequency performance.

# 3.6.3 Comparison with the State of the Art

A comparison with the state of the art for the analog correlator circuits is shown in Table 3-1. Broadband analog correlators with bandwidth above 10 GHz have been demonstrated for heterodyne spectrometers for astronomical and remote sensing applications. The maximum reported bandwidth for an analog correlator is 18 GHz, achieved in InGaP/GaAs technology for an analog lag correlator for radio interferometry [86]. The lag correlators in [86] use wideband multiplier chips with bandwidth up to 20 GHz with lag spacings produced by different lengths of microstrip line splitter tree segments. The lag correlator architecture described in [86] is, however, unsuitable for integration on a single integrated chip.

The fastest analog correlator integrated circuit published so far was designed for a 22-29 GHz ultrawideband (UWB) vehicular radar system using 90 nm CMOS technology [87]. It uses a four-quadrant transconductance multiplier circuit with a broadband  $G_m$ -C integrator with a reset time of roughly 500 ps. However, in [87] only simulation results are presented, and no information is provided regarding a chip realization or measurement results thereof. Further published results about broadband analog correlators are limited to UWB applications in the frequency bands from 3.1-10.6 GHz. An example is the broadband multiplier-based correlator in [88] for a 3.1-10.6 GHz full-band UWB receiver in 0.18 µm CMOS, which consumes 52 mW in the correlator core. In [89], the post-layout simulation results of an analog correlator for UWB pulse radar at 3.1-10.6 GHz implemented in 90 nm CMOS are discussed, whereas in [90], simulation results of a 3-10 GHz analog correlator using 0.24 µm SiGe BiCMOS are presented. Broadband analog correlators have also been demonstrated for UWB receivers operating in the lower UWB band at 3.1–5 GHz. For example, in [91] a fully integrated pulse-based UWB transceiver is demonstrated in 0.18 µm CMOS for 50 Mbps communication which dissipates 41.4 mW. In [92], an inductor-less wideband multiplier is demonstrated with an integrator and dynamic bias control circuit in 0.18 µm CMOS for the lower UWB band dissipating

|                                 | This work                            | [86]                                        | [87]                             | [88]                      | [90]                      | [89]                           | [91]                   | [92]                   |
|---------------------------------|--------------------------------------|---------------------------------------------|----------------------------------|---------------------------|---------------------------|--------------------------------|------------------------|------------------------|
| Bandwidth                       | 24 GHz<br>(33 Gbps NRZ)              | 18 GHz<br>(2-20 GHz)                        | 7 GHz<br>(22-29 GHz)             | 7.5 GHz<br>(3.1-10.6 GHz) | 7.5 GHz<br>(3.1-10.6 GHz) | 7.5 GHz<br>(3.1-10.6 GHz)      | 1.9 GHz<br>(3.1-5 GHz) | 1.9 GHz<br>(3.1-5 GHz) |
| Fabrication<br>Technology       | 130 nm<br>SiGe BiCMOS                | InGaP/GaAs                                  | 90 nm<br>CMOS                    | 0.18 μm<br>CMOS           | 0.24 μm<br>SiGe BiCMOS    | 90 nm<br>CMOS                  | 0.18 μm<br>CMOS        | 0.18 μm<br>CMOS        |
| Reset Time                      | 120 ps                               | -                                           | 500 ps                           | -                         | -                         | -                              | -                      | -                      |
| Power Dissipation               | 122.5 mW                             | -                                           | 131 mW                           | 52 mW                     | 30 mW                     | 2.3 mW<br>(for multiplier)     | 41.4 mW                | 5.2 mW                 |
| Results<br>(Simulated/Measured) | Measured results (IC mounted on PCB) | Measured results (not an integrated circuit) | Simulation only (for correlator) | Measured results          | Simulation results        | Post-layout simulation results | Measured results       | Measured results       |

Table 3-1 Comparison of the broadband correlator circuit with the state of the art.

5.2 mW. As stated above, the correlator test-chip operates at an input data rate of 33 Gbps and can be reset within an ultra-short time of only 120 ps which exceeds the state of the art. This is the fastest correlator circuit published so far with the largest bandwidth and a very fast reset time.

# 3.7 Receiver Baseband Simulations

The programmable weighted code generator and the resettable integrator circuit described in the previous sections can be used to perform the correlation of the received PSSS waveform with the local copy of the decoding sequence. A sample-and-hold (S/H) circuit can be added to sample the result of the correlation at the end of the integration phase. Details about the S/H circuit, the voltage-controlled delay line (VCDL) circuit to adjust the skew of the received PSSS waveform with the reference decoding sequence, and other peripheral circuits required for the design of a complete mixed-signal PSSS receiver baseband are given in chapter 4. In this sub-section, the simulation results of the PSSS receiver baseband (i.e., the programmable code-generator circuit and the resettable integrator followed by an S/H circuit) are shown with PSSS data generated by the encoding of BPSK and PAM-4 data. The corresponding measurement results are presented in chapter 5.

Note that for BPSK data, the PSSS amplitude set has the following values  $\{0, \pm 1, \pm 2, \pm 3, \pm 4\}$  according to Table 2-2. The PSSS chip rate was set at 30 Gcps. The output of the S/H circuit was used to plot an eye diagram, which is shown in Fig. 3-27. The eye diagram shows a clear vertical opening at 30 Gcps.



Fig. 3-27 Eye diagram of the decoded PSSS waveform at the output of the S/H circuit for the case of PSSS data generated by the encoding of BPSK data with a PSSS chip rate of 30 Gcps.

In the case of PAM-4 data, the PSSS amplitude set has the following values  $\{0, \pm 1, \pm 2, ..., \pm 12\}$  according to Table 2-2. The PSSS chip rate was set at 30 Gcps and the output of the S/H circuit was used to plot an eye diagram, which is shown in Fig. 3-28. The eye diagram shows clear vertical eye-openings for all three PAM-4 sub-eyes at 30 Gcps with good linearity.



Fig. 3-28 Eye diagram of the decoded PAM-4 waveform at the output of the S/H circuit for the case of PSSS data generated by the encoding of PAM-4 data with a PSSS chip rate of 30 Gcps.

# 3.8 Transmitter Baseband

The architecture of the mixed-signal PSSS transmitter baseband is discussed in section 2.3.2. The transmitter has a sliced architecture as shown in Fig. 2-4. Each slice consists of XOR gates that are used to encode the symbol with the respective coding vector. The first encoded chips of all N symbols are added up to generate the first chip of the PSSS sequence and the process continues for the next encoded chips. The process of encoding the data vector with the coding chips and summing the encoded chips can be represented as matrix multiplication which can be computed using a digital baseband core as shown in Fig. 2-6.
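The encode-and-sum operation described above can be written as a vector-matrix product. The Python sketch below illustrates only this structure: the 3×3 coding matrix and the data vector are toy placeholders chosen for readability and are not the coding vectors of the actual PSSS scheme (see Table 2-2 and Fig. 2-6 for those).

```python
def encode_chips(data, coding_matrix):
    """Chip k of the output is the sum over all symbols i of data[i] * coding_matrix[i][k]."""
    num_chips = len(coding_matrix[0])
    return [sum(d * row[k] for d, row in zip(data, coding_matrix))
            for k in range(num_chips)]


# Toy example (placeholder values): 3 symbols, 3 chips per coding vector.
toy_coding_matrix = [
    [+1, +1, -1],   # coding vector of symbol 0
    [-1, +1, +1],   # coding vector of symbol 1
    [+1, -1, +1],   # coding vector of symbol 2
]
toy_data = [+1, -1, +1]

toy_chips = encode_chips(toy_data, toy_coding_matrix)
print(toy_chips)
```

Each output chip is a multi-level value, which is why the digital baseband core is followed by DACs rather than by a binary output driver.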

In [64], the proposed PSSS transmitter architecture was verified as a proof-of-concept on a Virtex UltraScale FPGA (VCU108). The digital baseband core was synthesized using the standard cell library from GlobalFoundries 28 nm FDSOI technology. The static timing analysis showed that the timing constraints were met with a clock rate of 1.785 GHz to achieve above 100 Gbps. The transmitter baseband core design of the 15 parallel encoders requires a chip area of 0.0073 mm<sup>2</sup> with a power dissipation of 20.9 mW and has an energy efficiency of 0.21 pJ/bit.
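The quoted throughput and energy efficiency can be cross-checked with simple arithmetic. The Python sketch below assumes 4 bits per data symbol and 15 parallel encoders, as used elsewhere in this chapter; the observation that the 0.21 pJ/bit figure matches a nominal rate of 100 Gbps is an inference made here, not a statement from [64].

```python
NUM_ENCODERS = 15        # parallel encoders
BITS_PER_SYMBOL = 4      # assumed: 4 bits per data symbol
CLOCK_HZ = 1.785e9       # clock rate from static timing analysis
POWER_W = 20.9e-3        # reported power dissipation

# Aggregate bit rate at the reported clock rate.
throughput_bps = NUM_ENCODERS * BITS_PER_SYMBOL * CLOCK_HZ

# Energy per bit at a nominal 100 Gbps and at the maximum clock rate.
energy_at_100g_pj = POWER_W / 100e9 * 1e12
energy_at_max_clock_pj = POWER_W / throughput_bps * 1e12

print(f"throughput at 1.785 GHz: {throughput_bps / 1e9:.1f} Gbps")
print(f"energy/bit at 100 Gbps:  {energy_at_100g_pj:.3f} pJ")
print(f"energy/bit at max clock: {energy_at_max_clock_pj:.3f} pJ")
```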

The output of the digital baseband core is a set of 7-bit vectors where each 7-bit binary number represents the digital equivalent of the PSSS chips. The digital vectors are converted to analog chip values using 7-bit DACs. The DAC outputs are in the form of current signals so that they can be applied as inputs to an analog current switching multiplexer similar to the analog current switching multiplexer that is part of the weighted code generator circuit described in section 3.3.

The transmitter architecture in Fig. 2-6 is a mixed-signal circuit that consists of a digital baseband core working at the symbol rate of 1.667 Gbps, 7-bit DAC circuits working with a clock rate of 1.667 GHz, and a broadband analog multiplexer circuit working with a clock rate of 30 GHz. Moreover, the input to the digital baseband core circuit is a 60-bit digital input with a data rate of 1.667 Gbps. To reduce the width of the digital interface, the 4 bits of each data symbol are serialized to a single stream with a 4 times faster data rate i.e., 6.667 Gbps. This reduces the number of digital input lanes from 60 to 15. Each of the 15 lanes of data requires a clock and data recovery (CDR) circuit for the 6.667 Gbps input data rate.
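The effect of the serialization on the interface width can be summarized numerically. The Python sketch below uses the figures from the text (15 symbols, 4 bits per symbol, 1.667 Gbps per parallel bit); the variable names are illustrative.

```python
NUM_SYMBOLS = 15            # data symbols per PSSS frame
BITS_PER_SYMBOL = 4         # bits per data symbol
PARALLEL_RATE_GBPS = 1.667  # data rate of each bit of the parallel input

# Parallel interface: one lane per bit.
parallel_lanes = NUM_SYMBOLS * BITS_PER_SYMBOL

# Serialized interface: one lane per symbol, running 4 times faster.
serialized_lanes = NUM_SYMBOLS
serialized_rate_gbps = BITS_PER_SYMBOL * PARALLEL_RATE_GBPS

# The aggregate data rate is the same for both interfaces.
aggregate_gbps = serialized_lanes * serialized_rate_gbps

print(f"lanes: {parallel_lanes} -> {serialized_lanes}")
print(f"rate per serialized lane: {serialized_rate_gbps:.3f} Gbps")
print(f"aggregate rate:           {aggregate_gbps:.1f} Gbps")
```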

The technology for the implementation of the transmitter baseband circuit should be fast enough for both the high-speed analog mux part as well as the digital baseband core circuit. IHP's 130 nm SG13S technology is a very good candidate for the analog mux circuit implementation; however, it is not suitable for the digital design using the standard-cell design libraries working at the clock rate of 1.667 GHz. The next available technology was the 65 nm bulk CMOS technology from Taiwan Semiconductor Manufacturing Company (TSMC). The technology provides fast RF CMOS transistors for high-speed analog circuit design as well as a rich standard-cell library for digital design. Using a more scaled technology would also mean lower power dissipation. Another important reason for the selection of the TSMC 65 nm bulk CMOS technology for transmitter baseband circuit design was the possibility for circuit fabrication by sharing the spare chip area from other projects. The circuit design of the transmitter baseband components is explained in the next sub-sections.

# 3.8.1 DAC

For the proposed system architecture in Fig. 2-6, the required DAC resolution is 7-bits with a sampling rate of 1.667 GHz. Since the complete receiver baseband would require 15 copies of the unit-slice circuits containing the DACs, the footprint, as well as the power dissipation of the DAC, should be quite small. For the given set of requirements, two DAC architectures were investigated i.e., the C-2C DAC and the current steering DAC. The design of the two DAC architectures, the comparison of their performances, and the measurement results are discussed in the next sub-sections.

### 3.8.1.1 C-2C DAC

The capacitor-based DACs use the loading and unloading of the charges on the capacitor plates to generate an analog output voltage corresponding to the charge stored on the capacitor network. The charge stored on the capacitors depends on the switch connections for the different bits. The C-2C network is preferred over the binary-weighted capacitor network because it requires only two values of the capacitors, which considerably reduces the matching issues as compared to the binary-weighted capacitors. As compared to an R-2R DAC, the C-2C DAC has smaller power dissipation and better matching behavior. The output of the capacitor-based DACs has to be connected to high input impedance readout circuits. A common readout circuit is an operational amplifier or an operational transconductance amplifier (OTA).

Fig. 3-29 Block diagram of the proposed C-2C DAC circuit connected to an op-amp.

A common problem with the C-2C DACs is the presence of glitches in the direct output of the DAC i.e., at the input to the op-amp or OTA circuit. The glitches distort the output signal and can increase the settling time of the DAC. The cause of the glitches is the simultaneous switching of the switch transistors and the differences in the low-to-high transition times $T_{LH}$ and the high-to-low transition times $T_{HL}$ between the different input bits of the DAC. The larger the variance in the transition times, the larger is the amplitude of the glitches. A solution to the above problem is the use of a buffer circuit close to the actual DAC circuit to store the digital inputs and to apply them synchronously on the rising edge of the clock signal. Another solution is to use a decoder circuit for the bits with higher significance because the switching of the higher significance bits causes larger glitches. A 3-bit to 7-bit decoder is used to convert the upper 3 bits to a 7-bit thermometer code. This increases the number of C-2C branches from 7 to 11 but it significantly reduces the glitches in the output because an increase in the DAC inputs by a value of 1 would only cause a single transistor switch to change its state as compared to the binary case where up to 3 consecutive switches can change their states simultaneously e.g., the transition from 011 to 100.
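The benefit of thermometer-coding the upper 3 bits can be illustrated with a short sketch that counts how many switches toggle between adjacent codes. The function names `binary_bits`, `thermometer_bits`, and `toggles` are hypothetical helpers introduced only for this illustration; they do not describe the actual decoder circuit.

```python
def binary_bits(value, n_bits=3):
    """Plain binary representation, MSB first."""
    return [(value >> k) & 1 for k in reversed(range(n_bits))]


def thermometer_bits(value, n_bits=3):
    """Thermometer code with 2**n_bits - 1 lines; the lowest `value` lines are 1."""
    lines = (1 << n_bits) - 1
    return [1 if i < value else 0 for i in range(lines)]


def toggles(a, b):
    """Number of switch positions that differ between two code words."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    for code in range(7):
        nb = toggles(binary_bits(code), binary_bits(code + 1))
        nt = toggles(thermometer_bits(code), thermometer_bits(code + 1))
        print(f"{code} -> {code + 1}: binary toggles = {nb}, thermometer toggles = {nt}")
```

For the transition from 011 to 100 the binary representation toggles three switches, whereas the thermometer representation toggles exactly one switch for every unit step of the upper 3 bits. The 4 remaining binary-weighted branches plus the 7 thermometer branches give the 11 branches mentioned above.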

The output of the C-2C DAC is a single-ended voltage signal with an LSB value of $V_{FS}/2^7$ where $V_{FS}$ represents the full-scale output of the DAC. To maximize the size of the LSB voltage, the voltage $V_{REF}$ is set to $V_{DD}=1.2$ V which allows the output of the C-2C DAC to range from $V_{SS}=0$ V to $V_{DD}=1.2$ V covering the full rail-to-rail voltage range (LSB = 9.375 mV). This requires the op-amp circuit in Fig. 3-29 to have a rail-to-rail input voltage range. A common technique to achieve a rail-to-rail input voltage range is the use of both NMOS and PMOS differential pairs in the input stage. The problem with this approach is that the overall transconductance $G_m$ is not constant over the input range: it increases in the mid-input voltage range because there both the NMOS and the PMOS differential pairs are in saturation. To correct this behavior, half of the current at the mid-input range needs to be shunted away. The shunt circuit can be based on a Zener diode or an electronic version of it based on MOS transistors as proposed in [93]. The op-amp uses a folded cascode current summation stage at the output similar to that in [93].

Fig. 3-30 Integral non-linearity (INL) of the C-2C DAC calculated using simulated results.

Fig. 3-31 Differential non-linearity (DNL) of the C-2C DAC calculated using simulated results.

The average power dissipation of the C-2C DAC circuit including the op-amp is 2.42 mW and the output voltage swing is 450 mV. To characterize the performance of the DAC, an up or down counter is used as the digital input data source. The differential nonlinearity (DNL) of a DAC is the deviation of the difference between the analog output values for two adjacent digital input codes from the ideal step size of 1 LSB. The calculated value of the DNL using the simulation results is shown in Fig. 3-31. The worst-case DNL value is 0.12 LSB.

The integral nonlinearity (INL) of a DAC is a measure of the deviation between the ideal analog output value and the actual analog output value for a given digital input code. The calculated value of the INL using the simulation results is shown in Fig. 3-30 with the worst-case INL value of 1.76 LSB.
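The two definitions above can be sketched as a small routine that computes the DNL and INL from a list of output levels recorded while a counter ramps through all input codes. The sketch assumes an end-point fit (the LSB is derived from the first and last output levels); the thesis does not state which reference line was used, and the function name `dnl_inl` is hypothetical.

```python
def dnl_inl(levels):
    """Return (dnl, inl) in LSB for DAC output levels ordered by input code.

    Assumption: end-point fit, i.e. the ideal transfer curve is the straight
    line through the first and the last output level.
    """
    n = len(levels)
    if n < 2:
        raise ValueError("at least two output levels are required")
    lsb = (levels[-1] - levels[0]) / (n - 1)
    # DNL: deviation of each step between adjacent codes from the ideal 1 LSB.
    dnl = [(levels[k + 1] - levels[k]) / lsb - 1.0 for k in range(n - 1)]
    # INL: deviation of each level from the ideal straight line.
    inl = [(levels[k] - (levels[0] + k * lsb)) / lsb for k in range(n)]
    return dnl, inl


if __name__ == "__main__":
    dnl, inl = dnl_inl([0.0, 1.0, 2.5, 3.0])
    print("worst-case |DNL| =", max(abs(d) for d in dnl), "LSB")
    print("worst-case |INL| =", max(abs(i) for i in inl), "LSB")
```

With an end-point fit, the INL at the first and last codes is zero by construction, and the INL at any code equals the running sum of the DNL values below it.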

#### 3.8.1.2 Current Steering DAC

Fig. 3-32 Schematic diagram of the 7-bit current steering DAC.

The alternate architecture of choice for the DAC circuit is the current steering DAC which uses differential amplifier stages to steer the current to the left or the right branch. The currents are wired-OR'ed and applied to a pair of resistors to convert the current output signal to a voltage signal. A total of 7 binary-weighted current steering stages are used in the DAC. The tail current of each current steering stage is a power-of-2 multiple of the LSB current. Reducing the amplitude of the LSB current helps to reduce the power dissipation of the DAC circuit but choosing too small a value makes the DAC circuit sensitive to PVT changes. The value of the current for the LSB is chosen to be 25 $\mu$A and that for the MSB is $2^6$ times the LSB current i.e., 1.6 mA. An improved current mirror architecture is used for the current sources to reduce the sensitivity to PVT variations. The single-ended open-drain outputs (i.e., the *p* and the *n* outputs separately) of all the current steering stages can be connected to perform the wired-OR operation. If the wired-OR outputs (i.e., summed-up currents) are directly fed to resistors to get a voltage output, the RC time constant of the output nodes becomes quite large. The solution is to use a common gate stage as a buffer for the current signals to compensate for the Miller effect, which reduces the RC time constant at the output node. The disadvantage is that the addition of the common gate stage requires a higher supply voltage to provide the required voltage headroom. Thus, the supply voltage is increased to 1.45 V. The average power dissipation of the circuit is 3.67 mW, and it produces a differential output voltage with an output swing of ±400 mV<sub>diff</sub>.
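The binary weighting of the tail currents can be sketched as follows, using the 25 µA LSB current stated above. Which branch (*p* or *n*) receives the current for a logic 1 is an assumption made here for illustration, and the names `tail_currents_ua` and `steer` are hypothetical.

```python
I_LSB_UA = 25  # LSB tail current in microamperes (value from the text)
N_BITS = 7     # DAC resolution


def tail_currents_ua():
    """Tail current of each steering stage, LSB first, in microamperes."""
    return [I_LSB_UA * (1 << k) for k in range(N_BITS)]


def steer(code):
    """Return the summed (i_p, i_n) branch currents in microamperes.

    Assumption: a bit value of 1 steers the stage current to the p branch,
    a bit value of 0 steers it to the n branch.
    """
    if not 0 <= code < (1 << N_BITS):
        raise ValueError("code out of range for a 7-bit DAC")
    currents = tail_currents_ua()
    i_p = sum(cur for k, cur in enumerate(currents) if (code >> k) & 1)
    i_n = sum(currents) - i_p
    return i_p, i_n


if __name__ == "__main__":
    print("MSB tail current:", tail_currents_ua()[-1], "uA")
    print("total tail current:", sum(tail_currents_ua()), "uA")
    print("code 64 ->", steer(64))
```

The sum of both branch currents is constant for every code, which is what allows the outputs to be summed by simply wiring them together.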

Note that the DAC outputs go to an analog multiplexer (as shown in Fig. 2-6) which selects one of the 18 inputs and routes it to the output. The mux circuit is similar in architecture to the analog mux circuit in the programmable weighted code generator circuit explained in section 3.3, except that the programmable differential current sources in Fig. 3-4 are replaced with the open-drain outputs of the common gate stage in Fig. 3-32. The analog mux requires the outputs of the DACs to be in the form of currents. To obtain the output of the DAC in the form of current, the load resistors have to be removed to use the open-drain outputs of the DAC as the output nodes.

Fig. 3-33 Differential non-linearity (DNL) of the current-steering DAC calculated using simulated results.

Fig. 3-34 Integral non-linearity (INL) of the current steering DAC calculated using simulated results.

The simulated values of the DNL and INL are shown in Fig. 3-33 and Fig. 3-34. The worst-case DNL is 0.15 LSB and the worst-case INL is 0.24 LSB. The DNL value is comparable to that of the C-2C DAC whereas the INL value is much better than that of the C-2C DAC. The settling time for the current steering DAC was approximately 200 ps. The spurious-free dynamic range (SFDR) of the DAC is another important parameter that shows the spectral purity of the DAC output waveform. It is measured as the ratio of the amplitude of the fundamental tone to the amplitude of the highest spurious tone in the output (considering all the harmonics, noise, and distortions). The SFDR value of the C-2C DAC is 32 dB whereas that for the current steering DAC is 42 dB. Thus, the current steering DAC architecture was selected for the implementation. For this purpose, a test-chip was fabricated that contained the current steering DAC described above along with a 7-bit synchronous up-counter circuit to generate a simple test signal to characterize the performance of the DAC circuit.



Fig. 3-35 Chip layout of the current steering DAC test-chip.

As evident from Fig. 3-32, the transistor sizes of the differential pairs, as well as the current mirrors, change with the order of the bit. The layout of the current steering stage doubles in size for each higher-order bit. The input capacitance of each current steering stage is also different which poses a challenge when the external digital inputs are interfaced with the DAC core. It causes changes in the low-to-high and high-to-low transition times of the different bits. The counter design is made complicated by the fact that each bit of the counter presents a different load for the drive circuit. Moreover, the inputs to the current steering stages are differential signals. To solve these problems, a customized buffer circuit is designed for each current steering stage which takes the single-ended digital output of the counter (0 V to 1.2 V) as its input and drives its respective differential current steering stage (800 mV<sub>DC</sub> $\pm$ 200 mV<sub>diff</sub>) shown in Fig. 3-32. The buffer circuit for each current steering stage needs to be placed close to the respective current steering stage. The chip layout is shown in Fig. 3-35 with all the important sections highlighted in the layout. A microphotograph of the chip can also be seen in Fig. 3-35. The DC connections were wire-bonded to a PCB substrate and the high-speed connections were made using RF probes on a wafer prober. The measured results of the current steering DAC test-chip are shown in Fig. 3-36. The circuit shows good high-speed performance and linearity. The measured DAC output shows a non-monotonic behavior in the middle of the output ramp. This is due to a slight variation in the MSB current, as also predicted by the Monte Carlo simulations, which showed a 1.24 LSB variation in the MSB current and a 0.66 LSB variation in the current for the 6<sup>th</sup> bit for a supply voltage of 1.45 V.

Fig. 3-36 Measured single-ended output result of the DAC test-chip at a clock rate of 1.667 GHz.

One solution to improve the mismatch behavior is to increase the $V_{DS}$ drop across the current mirror and the CMOS differential pair transistors. This can be done by increasing the supply voltage. As an example, if the supply voltage is increased to 1.8 V, then according to Monte Carlo simulations, the current variation for the 6<sup>th</sup> bit is reduced to 0.33 LSB and that for the MSB is reduced to 0.57 LSB, which is a significant improvement. However, this increases the power dissipation of the circuit.
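The link between the MSB current error and the non-monotonic step at mid-scale can be sketched with a simple calculation. At the major-carry transition (0111111 to 1000000) the ideal step is $64 - 63 = 1$ LSB, so an MSB current that is low by more than 1 LSB (relative to the sum of the lower bits) produces a negative step. The sketch treats the reported Monte Carlo variations as worst-case deviations of the MSB current, which is an assumption made only for illustration; the function name `major_carry_step` is hypothetical.

```python
def major_carry_step(msb_error_lsb, lower_bits_error_lsb=0.0):
    """Step size in LSB at the 0111111 -> 1000000 transition of a 7-bit DAC.

    msb_error_lsb:        deviation of the MSB current from its nominal 64 LSB
    lower_bits_error_lsb: deviation of the summed lower six bits from 63 LSB
    """
    nominal_step = 64 - 63  # ideal step of 1 LSB
    return nominal_step + msb_error_lsb - lower_bits_error_lsb


if __name__ == "__main__":
    for supply, error in (("1.45 V", -1.24), ("1.8 V", -0.57)):
        step = major_carry_step(error)
        state = "non-monotonic" if step < 0 else "monotonic"
        print(f"supply {supply}: mid-scale step = {step:+.2f} LSB ({state})")
```

Under this assumption, a 1.24 LSB deficit gives a negative mid-scale step, whereas a 0.57 LSB deficit keeps the step positive.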

### 3.8.2 Digital Baseband Core

Fig. 3-37 PSSS signal generation for a single data vector using matrix operations for digital implementation.

The generation of the PSSS signals can be divided into two parts i.e., the generation of the digital values for the PSSS chips and their conversion to analog values using DACs. The process is equivalent to matrix multiplication as discussed in section 2.2.1.1. The inputs are PAM-16 symbols which have to be represented in two's complement form, which increases the number of bits from 4 to 5. The coding matrix elements (i.e., code chips) are 1-bit binary numbers, which simplifies the multiplication process. The input baud rate is 1.667 Gbaud which defines the clock rate for the digital processor as 1.667 GHz. The output of the multiplication is a matrix of order $15 \times 5$ where each of the 5 vectors has to be multiplied with its respective scalar weighting factor and the row elements of the product have to be added up to get the PSSS chips as shown in Fig. 3-37. The task is to perform the matrix multiplication in a manner that is hardware efficient. Recognizing the fact that the scalar weighting factors for the vectors are powers of 2, the multiplications are performed by simple left-shift operations. The multiplication with the factor $-2^4$ requires weighting by the factor $2^4$ followed by taking the two's complement of the result. Taking the two's complement involves the inversion of all the bits followed by the addition of 1 to the result. The addition of the weighted columns of the matrix is a combinational logic task and represents a bottleneck for the processing speed. The addition process is implemented using full-adder circuit blocks with a pipelined approach. The required pipeline length is 7 i.e., the results for the current data inputs appear after 7 clock cycles. The delay due to pipelining is, however, not critical for the current application.
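The shift-and-add scheme described above can be sketched in software and checked against a direct matrix multiplication. This is a minimal behavioral sketch, not the RTL implementation: the 15-chip base sequence, the construction of the coding matrix from its cyclic shifts, the use of unipolar (0/1) chips, and all function names are assumptions introduced for illustration.

```python
# Illustrative 15-chip binary sequence (assumption; not taken from the thesis).
BASE_CODE = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1]
N = len(BASE_CODE)  # 15 parallel symbols
WIDTH = 5           # PAM-16 symbols in 5-bit two's complement


def code_matrix():
    """Row r is the base sequence cyclically shifted by r chips (assumption)."""
    return [BASE_CODE[-r:] + BASE_CODE[:-r] for r in range(N)]


def to_twos_complement_bits(value, width=WIDTH):
    """Bits of `value` in two's complement form, MSB first."""
    if not -(1 << (width - 1)) <= value < (1 << (width - 1)):
        raise ValueError("value does not fit into the given width")
    word = value & ((1 << width) - 1)
    return [(word >> k) & 1 for k in reversed(range(width))]


def psss_chips_bitplane(symbols):
    """Shift-and-add evaluation: one bit-plane (column) at a time."""
    bits = [to_twos_complement_bits(s) for s in symbols]  # 15 x 5 bit matrix
    chips = []
    for row in code_matrix():
        total = 0
        for col in range(WIDTH):
            # A 1-bit by 1-bit product is a logical AND.
            plane_sum = sum(c & bits[j][col] for j, c in enumerate(row))
            weighted = plane_sum << (WIDTH - 1 - col)  # multiply by 2^k
            if col == 0:
                weighted = -weighted  # MSB column carries the weight -2^4
            total += weighted
        chips.append(total)
    return chips


def psss_chips_direct(symbols):
    """Reference: direct multiplication of the coding matrix and the symbols."""
    return [sum(c * s for c, s in zip(row, symbols)) for row in code_matrix()]


if __name__ == "__main__":
    data = list(range(-15, 15, 2))  # 15 example PAM-16 levels
    print(psss_chips_bitplane(data))
```

Both functions return the same chip values for any valid input vector, which reflects the fact that the weighting of the bit-planes with $-2^4, 2^3, \dots, 2^0$ reconstructs the two's complement value of each symbol.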

To test the digital baseband circuit, a test-chip was fabricated which contained the above-mentioned circuit for matrix multiplication as well as an on-chip bit-sequence generator circuit to generate the test data for the digital baseband core circuit. The bit-sequence generator is pre-programmed to generate 8 selectable patterns for the test. The block diagram of the digital baseband core test-chip is shown in Fig. 3-38.



Fig. 3-38 Block diagram of the digital baseband core test-chip with an on-chip bit-sequence generator for testing purposes.

The on-chip bit-sequence generator circuit, as well as the baseband processor, are designed at the register-transfer level (RTL) using the Verilog hardware description language (HDL). For the test-chip, some randomly selected output bits of the baseband core circuit are made available as chip outputs. The chip layout and microphotograph are shown in Fig. 3-39. The input clock signal has a clock rate of 1.667 GHz and the high-speed output signals have a data rate of 1.667 Gbps. To interface the high-speed differential clock input, a 50 $\Omega$ input buffer circuit is designed which converts the external differential input signal to a CMOS rail-to-rail signal. The output buffer circuits convert the single-ended rail-to-rail CMOS signals to differential output signals with 50 $\Omega$ output load resistors. Some additional randomly selected output bits of the baseband core circuit are made available for testing as well but without any high-speed 50 $\Omega$ output buffers. Instead, these bits are made available for low-speed verification of the circuit behavior. Two of the high-speed outputs of the digital baseband test-chip are shown in Fig. 3-40. The output data streams for the two high-speed data outputs were verified to match the simulated outputs. The circuit could be tested successfully up to 1.13 GHz.



Fig. 3-39 Snapshot of the digital baseband core test-chip layout (left) and its chip microphotograph (right).

Here it is important to note that the design narrowly met the static timing analysis constraints for the clock frequency of 1.667 GHz during the design phase. The post-layout simulations of the circuit with typical device models were used to verify the performance of the circuit which showed no problems at the desired frequency of operation.



Fig. 3-40 Measured results of the digital baseband test-chip showing two selected outputs generating the same output bit sequence. The results correspond exactly with the expected bit sequence on the two outputs.

### 3.8.3 Analog Multiplexer

The other important circuit is the analog multiplexer which takes the current mode outputs of the DACs as inputs and routes one of the inputs to the output in a sequential manner i.e., one after the other. The proposed architecture of the broadband analog mux is similar to the mux used in the mixed-signal weighted code generator circuit. The starting point of the circuit is the generation of a one-hot pulse signal using a circuit similar to that explained in section 3.3.2. However, the DFF chain could not operate at the desired frequency of 30 GHz to generate the one-hot pulse signal. The analog mux circuit could not be designed using the chosen 65 nm bulk CMOS technology. Thus, the transmitter circuit could not be realized using the chosen technology, and a more scaled CMOS technology of 28 nm was tried. However, no chip fabrication was possible due to time and budget constraints and only post-layout simulations were performed to verify the working of the analog mux part of the transmitter baseband circuit.

The design details of the circuit implementation using 28 nm bulk CMOS technology are discussed in section 3.10. For the measurement of the receiver baseband circuit, an arbitrary waveform generator circuit was used to emulate the PSSS transmitter circuit. The reader is referred to chapter 5 for the measurement results of the PSSS receiver baseband.

## 3.9 Power Dissipation of the SiGe BiCMOS Implementation

The sum of the power dissipation of the weighted code generator circuit, the broadband fast-resettable correlator, the sample-and-hold circuit, and the VCDL circuit is 2.78 W using -4 V and 1.2 V voltage supplies. For BPSK data, the correlator works with lower supply voltages down to -3.5 V with reduced power dissipation. The high power dissipation is due to the use of heterojunction bipolar transistors with current mode logic (CML) topology for most of the circuit components. An interesting idea to reduce power dissipation is discussed in chapter 6.

## 3.10 Comparison with 28 nm CMOS Implementation

Fig. 3-41 Block diagram of the programmable weighted code-generator as implemented in the 28 nm CMOS technology.

The circuit design and characterization of the critical components of the mixed-signal PSSS baseband using 130 nm SiGe BiCMOS were discussed earlier in this chapter. The circuits are based on the CML topology for high-speed operation, which results in large power dissipation (see section 3.9). One way to reduce the power dissipation is to switch to a CMOS technology smaller than 65 nm (see section 3.8.3). An additional advantage would be the possibility of an on-chip digital standard cell design for the transmitter digital baseband. The 28 nm CMOS technology from GlobalFoundries was chosen for this investigation. The programmable weighted code generator circuit and the broadband resettable correlator circuit were designed, and post-layout simulation results were used to compare their performance with the SiGe BiCMOS implementation. Note that neither an actual IC fabrication nor the design of additional circuit components as explained in section 4.1 was possible because of time and budget constraints, and only the post-layout simulations of the critical components of the mixed-signal baseband were done. Below is a summary of the circuit implementation in 28 nm CMOS technology.

The programmable weighted code generator circuit has a repetitive structure (i.e., a unit cell) consisting of a DFF with a clock driver circuit, a programmable differential current source, and a high-speed differential current switch (see Fig. 3-41). 18 copies of the above unit cells are needed to form the 18 selectable inputs of the mux. The power dissipation and layout of each of these repeating circuit components add up to the total power dissipation and the overall layout size respectively. Using the floor plan for the SiGe BiCMOS implementation (shown in Fig. 4-13) as a guideline, the unit cells of the CMOS implementation are also arranged in a horizontal array which necessitates that the horizontal extent of the layout of unit cells must be kept at a minimum. Regarding the layout, the DFF has the largest overall size and horizontal extent among the unit cell components, so it plays a decisive role in determining the horizontal dimension of the chip. The DFF in the 28 nm CMOS technology has a horizontal extent of 12 µm as compared to 56 µm for 130 nm SiGe BiCMOS technology i.e., the horizontal extent of the baseband chip in 28 nm will be 78.6 % shorter. A big advantage of the reduction in the horizontal extent is the fact that no transmission lines with matched loads would be required to connect the outputs of the high-speed differential current switches. Moreover, the transadmittance stage of section 3.3.5 would also not be required. This results in significant savings in power dissipation. However, similar to the SiGe BiCMOS implementation, instead of adding all 18 current signals at one node, the currents of six current switches are summed up with a common gate stage as shown in Fig. 3-41. A total of 3 common gate stages are required for the 18 current signals. The outputs of all 3 common gate stages are summed up at a pair of resistors which generates the final output of the mux.

The circuit design of the four-quadrant multiplier circuit, as well as the resettable integrator, was similar to that of the SiGe BiCMOS implementation (see section 3.5). A Verilog-A behavioral model for the S/H circuit was used to plot the eye diagrams for the case of PSSS data generated by the encoding of BPSK and PAM-4 data. The programmable weighted code generator circuit explained above was used to generate the decoding sequence for the receiver baseband slice. The eye diagram for the case of BPSK data is shown in Fig. 3-42 which shows a clear vertical eye-opening for PSSS waveform with a chip rate of 30 Gcps.



Fig. 3-42 Eye diagram of the decoded PSSS waveform at the output of the S/H circuit for the case of PSSS data generated by the encoding of BPSK data with a PSSS chip rate of 30 Gcps.

The output eye diagram for the case of PAM-4 data is shown in Fig. 3-43. The eye diagram shows clear vertical eye openings for the three sub-eyes as well as good linearity. IC fabrication of a test-chip or further investigations and optimization of the baseband circuits in 28 nm CMOS technology was not possible owing to limited time and resources.

To make a fair comparison of the power dissipation of the programmable weighted code generator circuit in both SiGe BiCMOS technology and the 28 nm CMOS technology, the following assumptions are made: no input or output buffer circuits are considered, no circuitry for the feedback path of the DFF pulse is considered, no circuit for the generation of the reset/integrate command is considered, and no other periphery or control circuits are considered. Under these assumptions, the SiGe BiCMOS implementation draws 60.9 mA and 375.3 mA currents from 1.2 V and -4 V supplies, resulting in overall power dissipation of 1574.3 mW. The circuit implementation in 28 nm CMOS technology draws 25.1 mA, 79.7 mA, and 1.0 mA from 1.0 V, 1.3 V, and 1.7 V supplies respectively, resulting in overall power dissipation of 130.4 mW which is 91.72 % less as compared to that of the SiGe BiCMOS implementation.



Fig. 3-43 Eye diagram of the decoded PSSS waveform at the output of the S/H circuit for the case of PSSS data generated by the encoding of PAM-4 data with a PSSS chip rate of 30 Gcps.

The multiplier and integrator circuits are designed similarly to the implementation in SiGe BiCMOS technology. The power dissipation of the multiplier circuit in the two technologies compares as follows: the SiGe BiCMOS implementation draws 2.8 mA and 7.4 mA from 1.2 V and -4 V supplies respectively, resulting in 33.0 mW of power dissipation. The CMOS implementation draws 7 mA and 0.5 mA from 1.5 V and 1.7 V supplies respectively, resulting in 11.4 mW of power dissipation (65.4 % reduction).

For the integrator circuit including the offset correction circuit and the output buffer circuit, the SiGe BiCMOS implementation draws 7.3 mA and 11.8 mA from 1.2 V and -4 V supplies respectively, resulting in total power dissipation of 56.0 mW. The CMOS implementation draws 4.2 mA from the 1.7 V supply with an additional 1 mA from the 1.5 V supply for the output buffer, resulting in 8.6 mW of power dissipation (84.6 % saving).
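The power figures quoted above follow from the supply currents and voltages as $P = \sum_k I_k |V_k|$. The short sketch below reproduces this arithmetic from the values given in the text; the dictionary layout and the function names are only illustrative.

```python
def power_mw(rails):
    """Total power in mW from (current in mA, supply magnitude in V) pairs."""
    return sum(current * voltage for current, voltage in rails)


def saving_percent(p_reference, p_new):
    """Relative power saving of p_new with respect to p_reference."""
    return 100.0 * (1.0 - p_new / p_reference)


# Supply currents and voltage magnitudes as quoted in the text.
BLOCKS = {
    "weighted code generator": {
        "sige": [(60.9, 1.2), (375.3, 4.0)],
        "cmos": [(25.1, 1.0), (79.7, 1.3), (1.0, 1.7)],
    },
    "multiplier": {
        "sige": [(2.8, 1.2), (7.4, 4.0)],
        "cmos": [(7.0, 1.5), (0.5, 1.7)],
    },
    "integrator": {
        "sige": [(7.3, 1.2), (11.8, 4.0)],
        "cmos": [(4.2, 1.7), (1.0, 1.5)],
    },
}

if __name__ == "__main__":
    for name, rails in BLOCKS.items():
        p_sige = power_mw(rails["sige"])
        p_cmos = power_mw(rails["cmos"])
        print(f"{name}: SiGe {p_sige:.1f} mW, CMOS {p_cmos:.1f} mW, "
              f"saving {saving_percent(p_sige, p_cmos):.1f} %")
```

Small differences in the last digit of the percentages can arise depending on whether the rounded or the unrounded power values are used.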

Apart from the savings in the power dissipation and a smaller footprint for the critical components of the receiver baseband IC, another very important consideration was to investigate whether the analog mux (i.e., the programmable weighted code generator circuit excluding the programmable differential current sources at the inputs) could be used for the transmitter baseband as well. The main difference is that the inputs to the analog mux in this case would be the outputs of 7-bit DACs which unlike the programmable code generator circuit are allowed to change before the start of each new selection cycle of the analog mux. The simulated differential output of the analog mux designed in 28 nm CMOS technology with changing inputs is shown in the bottom trace of Fig. 3-44. For this simulation, the first 15 inputs during the  $(n+1)^{th}$  cycle were changed (inverted in this example) from that during the  $n^{th}$  cycle. The differential output of the DFF chain which serves as the select signal of the analog mux can be seen in the upper trace of Fig. 3-44. The inputs of the R-2R DACs which provide the input differential currents for the analog mux were applied using CMOS buffer circuits. The settling time of the differential current input of the mux is 131.6 ps with a rise time of 90.4 ps. The settling time is low enough for the current signals to settle to their new value before it's their turn during the next cycle.



Fig. 3-44 Simulation result of the analog mux designed in CMOS 28 nm technology. The upper trace shows the differential outputs of the DFF chain serving as the select signal for the mux. Note that the output of the mux during the  $(n+1)^{th}$  cycle is different (here inverted version) from that during the  $n^{th}$  cycle.

## 3.11 List of Relevant Publications of the Author

- [64] K. KrishneGowda, A. R. Javed, L. Wimmer, A. C. Wolf, J. C. Scheytt, and R. Kraemer, "PSSS Transmitter for a 100 Gbps Data Rate Communication in THz Frequency Band," 2018 26th Telecommunications Forum (TELFOR), pp. 1-5, 2018.
- [68] A. R. Javed, J. C. Scheytt, K. KrishneGowda, and R. Kraemer, "System Design Considerations for a PSSS Transceiver for 100Gbps Wireless Communication with Emphasis on Mixed-Signal Implementation," in *IEEE Wireless and Microwave Technology Conference (WAMICON)*, Florida, 2015.
- [69] A. R. Javed, et al., "Real100G.com," in Wireless 100 Gbps and Beyond, R. Kraemer and S. Scholz, Eds., Frankfurt (Oder), Germany, IHP Innovations for High Performance Microelectronics, 2020, pp. 231-294.
- [73] A. R. Javed, J. C. Scheytt, K. KrishneGowda, and R. Kraemer, "System Design of a Mixed Signal PSSS Transceiver Using a Linear Ultra-Broadband Analog Correlator for the Receiver Baseband Designed in 130 nm SiGe BiCMOS Technology," *IEEE EUROCON 2017 - 17th International Conference on Smart Technologies*, pp. 228-233, 2017.
- [78] A. R. Javed, and J. C. Scheytt, "Mixed-Signal Receiver Baseband Slice for High-Data-Rate Communication Using 130 nm SiGe BiCMOS Technology," in 64th International Midwest Symposium on Circuits and Systems (MWSCAS 2021), East Lansing, 2021.
- [85] A. R. Javed, J. C. Scheytt, and U. v. d. Ahe, "Linear Ultra-Broadband NPN-Only Analog Correlator at 33 Gbps in 130 nm SiGe BiCMOS Technology," in *IEEE Bipolar/BiCMOS Circuits and Technology Meeting (BCTM)*, 2016, New Brunswick, 2016.

## 3.12 Summary

A brief introduction to the different semiconductor fabrication technologies used for the circuit design in this thesis and the reasons for their selection are presented. The transmitter and receiver baseband have a sliced architecture where each slice of hardware represents the required set of components to transmit one of the *N* parallel symbols (transmitter baseband unit-slice) or to recover one of the *N* symbols transmitted in parallel (receiver baseband unit-slice) [14] [68] [73]. The detailed circuit design and measurement results of the most important circuit components of the mixed-signal PSSS receiver baseband circuit were discussed which include the mixed-signal weighted code-generator circuit (to generate the weighted local copies of the coding sequences at the receiver end) [69] [77] [78] and the broadband fast-resettable correlator circuit (to perform correlation of the PSSS sequence with the weighted local copy of the coding sequences at the receiver end) [69] [73] [85]. The circuit design and measurement results of the important circuit components of the transmitter baseband circuit i.e., DAC and the digital baseband core circuit are also discussed. The chapter ends with a comparison of the baseband circuit implementation in 28 nm bulk CMOS technology highlighting the huge savings in power dissipation and reduced footprint of the baseband circuits when using a scaled and more advanced CMOS technology.

# 4 Design of a Complete Mixed-Signal PSSS Receiver Baseband Unit-Slice

The design and characterization of the most important components of the mixed-signal PSSS receiver baseband unit-slice i.e., the integrate and dump correlator and the programmable weighted-code generator were discussed in chapter 3. The correlation is the most important process in the PSSS receiver circuit. However, some additional circuit components are required for the proper working of the transmission link. In this chapter, the additional circuit components required in the PSSS receiver baseband unit-slice after performing the correlation are described. These components include the sample-and-hold circuit to sample the output of the correlator, a 4-bit analog to digital converter (ADC) to convert the sampled value to a digital output with thermometer code format, and a digital interface that converts the thermometer code to a suitable format for interface to the FPGA. A simplified block diagram of the complete mixed-signal PSSS receiver baseband circuit is shown in Fig. 4-1.

The output of the sample-and-hold circuit is converted to a 4-bit digital output using an analog to digital converter (ADC). The sampling clock for the ADC is a pulse signal from the one-hot pulse generation circuit described in section 3.3.2. The pulse duration is  $T_{chip}$  whereas the period of the pulse train is  $18 T_{chip}$ . Due to the high sampling rate of the ADC, a flash architecture is used which generates a 15-bit thermometer code output. A thermometer-to-binary conversion circuit along with a bubble error correction circuit converts the output of the ADC to a 4-bit binary output that is serialized using a 4 to 1 multiplexer to get a 6.667 Gbps stream. The multiplexer requires a 6.667 GHz clock. Details about the circuit components mentioned above are presented in the following sections of this chapter.



Fig. 4-1 Block diagram of the complete mixed-signal PSSS receiver baseband.

## 4.1 Post-Correlation Receiver Components

The output of the correlator at the end of the correlation period (15 $T_{chip}$) is the decoded symbol. The output of the correlator needs to be sampled at this instant before the integrator is reset in preparation for the next correlation cycle. An analog to digital converter (ADC) circuit converts the sampled output of the correlator to an equivalent digital representation. The minimum resolution required for the ADC is equal to the bit-loading which is 4 in the current case. As indicated in Fig. 4-2, the sample rate of the ADC is only $f_{chip}/18$. The direct conversion or flash type ADC architecture is well-suited for the given specifications of high sampling rate and low resolution. Flash ADCs often require a track-and-hold (T/H) circuit at the front to hold (or store) the value of the input signal before the comparators start the comparisons with predefined reference voltages to determine the digital output. However, a T/H circuit is not always required and can be avoided if the comparators are fast enough to switch, regenerate and latch their values within the sampling clock period. Making use of state-of-the-art bipolar or BiCMOS semiconductor technologies, it has been demonstrated in, e.g., [4.1]-[4.4] that high-speed analog to digital conversion with sampling rates up to 35 GS/s can be achieved with a single ADC core conversion scheme, that is, without any T/H and without time interleaving. However, in the current project, the output of the correlator is sampled with a sample-and-hold (S/H) circuit before applying it to the ADC. The S/H circuit is used for the following reasons: the sampled output of the correlator is required for the Costas loop for carrier synchronization and for plotting the direct eye diagram for bit error rate testing. The direct output of the integrator is not large enough to cause the ADC comparators to switch their output signals while sampling the input waveform, so the integrator output is amplified before the analog to digital conversion using the ADC.

The output of the flash ADC is a 15-bit wide thermometer code generated every $1/(1.667 \times 10^9)$ s = 600 ps. The 15-bit thermometer code is passed through a bubble error correction circuit before it is converted into a 4-bit Gray code using a thermometer-to-binary encoder based on multiplexer circuits. Each of the 15 receiver slices generates 4-bit data at 1.667 Gbps per bit, which would make the total interface 60 data lanes wide, with each lane having a data rate of 1.667 Gbps. To reduce the number of high-speed connections to the FPGA at the output, the 4-bit word is serialized to a 4-times faster data stream, i.e., 6.667 Gbps, using a 4-to-1 multiplexer. This reduces the required number of high-speed connections to the FPGA from 60 to 15, as shown in Fig. 4-1. The data rate of 6.667 Gbps is well within the capabilities of the high-speed serial transceivers of modern FPGAs. A reference clock signal is also provided for the PLLs inside the serial transceivers of the FPGA.

Fig. 4-2 Block diagram of the mixed-signal PSSS receiver baseband unit-slice.
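The thermometer-to-Gray conversion described above can be illustrated with a small behavioral sketch. It is not the multiplexer-based encoder of the chip: the 3-input majority vote used for bubble correction and the function names `correct_bubbles` and `thermometer_to_gray` are assumptions made only for this illustration.

```python
def correct_bubbles(thermo):
    """Assumed bubble correction: majority vote over each bit and its two neighbours."""
    # Pad with a 1 below the lowest comparator and a 0 above the highest one.
    padded = [1] + list(thermo) + [0]
    return [1 if padded[i] + padded[i + 1] + padded[i + 2] >= 2 else 0
            for i in range(len(thermo))]


def thermometer_to_gray(thermo):
    """Map a 15-bit thermometer code (lowest comparator first) to a 4-bit Gray word."""
    level = sum(correct_bubbles(thermo))   # number of comparators that tripped
    return level ^ (level >> 1)            # binary-reflected Gray code


clean = [1] * 5 + [0] * 10                 # ideal thermometer code for level 5
bubbled = [1, 1, 1, 0, 1] + [0] * 10       # same region with one bubble error

print(format(thermometer_to_gray(clean), "04b"))
print(correct_bubbles(bubbled))
```

A Gray-coded output changes only one bit between adjacent levels, so a one-level decision error corrupts a single bit of the 4-bit word.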

The detailed block diagram of the mixed-signal PSSS receiver baseband unit-slice is shown in Fig. 4-2. The programmable mixed-signal weighted code-generator and the broadband analog correlator circuits were discussed in chapter 3. The output of the integrator is sampled using a sample-and-hold (S/H) circuit before the integrator is reset. The reset command signal for the integrator also serves as the sample command for the S/H circuit. This signal is generated by performing the logical-OR operation of the 15<sup>th</sup>, 16<sup>th</sup>, and 17<sup>th</sup> DFF outputs inside the one-hot pulse generation circuit described in section 3.3.2. The resulting reset/sample command signal has a duty cycle of 3/18.

The data inputs to the 4-to-1 multiplexer have a data rate of 1.667 Gbps. To generate the multiplexed output, clock signals with frequencies 1.667 GHz and 3.333 GHz are required. The 3.333 GHz signal is generated by dividing the 30 GHz clock signal using a cascade of two divide-by-3 static frequency dividers. The other clock signal frequency required by the multiplexer i.e., 1.667 GHz is generated using a static frequency divide-by-2 circuit. The output signal of the multiplexer with the data rate of 6.667 Gbps is scrambled with a 7-bit long m-sequence to ensure that the clock and data recovery (CDR) PLLs inside the FPGA have enough transitions to maintain the lock without a very large drift. The clock signal for the data scrambler is generated using a frequency doubler circuit with the 3.333 GHz clock signal at the input. The 1.667 GHz signal is divided by a factor of 8 to generate a 208.33 MHz reference input clock for the CDR PLL.
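The clock plan of the interface can be checked with a few lines of exact arithmetic. The sketch below only restates the divider and doubler ratios given in the text; the variable names are chosen for this illustration.

```python
from fractions import Fraction

F_CLK_GHZ = Fraction(30)              # chip-rate clock input

f_mux_fast = F_CLK_GHZ / 3 / 3        # two cascaded divide-by-3 stages
f_mux_slow = f_mux_fast / 2           # divide-by-2, equal to the word rate f_chip/18
f_scrambler = f_mux_fast * 2          # frequency doubler for the scrambler clock
f_ref = f_mux_slow / 8                # divide-by-8 reference for the CDR PLL

word_period_ps = 1000 / f_mux_slow    # time between two ADC output words
serial_rate_gbps = 4 * f_mux_slow     # 4:1 serializer output rate per slice

for name, value in [("mux fast clock [GHz]", f_mux_fast),
                    ("mux slow clock [GHz]", f_mux_slow),
                    ("scrambler clock [GHz]", f_scrambler),
                    ("reference clock [MHz]", f_ref * 1000),
                    ("word period [ps]", word_period_ps),
                    ("serial data rate [Gbps]", serial_rate_gbps)]:
    print(f"{name:26s} {float(value):9.3f}")
```

The serial data rate and the scrambler clock frequency coincide, which is consistent with the scrambler operating on the serialized stream.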

The circuit design of the sample-and-hold circuit is described in the next section. The remaining circuit components in Fig. 4-3, e.g., the ADC, the thermometer-to-binary encoder, and the scrambler, serve the digital interface to the FPGA. Although their design involves many challenges and their layout takes up a large part of the baseband unit-slice chip area, their circuit design is not discussed in detail because they do not present a fundamental research challenge.

*Fig. 4-3* The digital interface from the output of the S/H circuit to the FPGA transceiver input including the generation of the required clock signals for the interface.

## 4.2 Sample-and-Hold Circuit

The sample-and-hold circuit is made using two track-and-hold (T/H) circuits in series as shown in Fig. 4-4. The sampling clock signals to the two track-and-hold (T/H) circuits are complementary to each other i.e., when the first T/H is tracking (or following) the input waveform, the second T/H is in the hold mode and vice versa. This is done by inverting the track or hold command signals (here *Integrate* or *Reset*) on the two track-and-hold circuits. Apart from the two track-and-hold circuits, there is also a limiter circuit in Fig. 4-4. The purpose and design of the limiter circuit are explained in the next subsection.



Fig. 4-4 The block diagram of the sample-and-hold circuit preceded by a limiting amplifier.

## 4.2.1 Limiter Circuit

The instantaneous output of the correlator during the correlation cycle can be up to $\log_2(N+1)$ times larger than the final output of the correlator at the completion of the correlation cycle for the case of unipolar coding and bipolar decoding, as mentioned in Table 2-2. As mentioned in chapter 3, the final output of the correlator is limited to $\pm 120 \text{ mV}_{\text{diff}}$, which implies that the instantaneous output of the correlator during the correlation cycle can be up to 4 times larger, i.e., $\pm 480 \text{ mV}_{\text{diff}}$. For the T/H circuit, the signal feedthrough during the hold mode increases with the magnitude of the input signal. Furthermore, the T/H circuit goes into saturation if the magnitude of the input signal becomes too large. Thus, the input to the T/H has to be limited to the necessary input dynamic range to avoid these disadvantages. The schematic diagram of the limiter circuit is shown in Fig. 4-5. An HBT differential pair with emitter degeneration is used as a limiting amplifier, with the linear input range of the differential pair set to a little more than $\pm 120 \text{ mV}_{\text{diff}}$. The linear input range is made larger than $\pm 120 \text{ mV}_{\text{diff}}$ to ensure good linearity over at least the required input range of $\pm 120 \text{ mV}_{\text{diff}}$. The limiting value of the output signal of the limiter circuit is $\pm 200 \text{ mV}_{\text{diff}}$, as shown in the simulation results in Fig. 4-7. The gain of the limiter circuit can be adjusted externally using the *Gain\_Ctrl* signal. A differential pair quad is used to provide the gain control. The inputs to the gain control circuit are tied to DC reference voltages generated using a voltage divider, and one of the two inputs is made externally accessible to enable the gain control.

Fig. 4-5 Schematic of the limiter circuit.
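The numbers behind the limiter requirement can be reproduced with a short sketch. The peak-to-final ratio and the voltage levels come from the text; the hard-clipping function `ideal_limiter` and its unity small-signal gain are assumptions, since the real differential pair compresses gradually and its gain is set by *Gain\_Ctrl*.

```python
import math

N_CHIPS = 15          # code length (correlation period of 15 chips)
FINAL_MV = 120.0      # final correlator output, mV differential
LIMIT_MV = 200.0      # limiting value of the limiter output, mV differential

peak_ratio = math.log2(N_CHIPS + 1)    # worst-case instantaneous/final ratio
peak_mv = peak_ratio * FINAL_MV        # worst-case instantaneous amplitude


def ideal_limiter(v_mv, gain=1.0, limit_mv=LIMIT_MV):
    """Idealized limiter (assumption): linear up to the limit, then hard clipping."""
    return max(-limit_mv, min(limit_mv, gain * v_mv))


print(peak_ratio, peak_mv)
for v in (-480.0, -120.0, 0.0, 120.0, 480.0):
    print(v, ideal_limiter(v))
```

In this idealization, inputs inside the required range pass unchanged, while the worst-case instantaneous peaks are held at the limiting value before they reach the T/H.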

### 4.2.2 Track-and-Hold Circuit

As can be seen in Fig. 4-4, the sample-and-hold (S/H) circuit consists of two track-and-hold (T/H) circuits in series. Both T/H circuits are similar in design except for a gain stage at the output of the second T/H to increase the gain of the S/H output signal. The gain stage is similar in construction to the limiter circuit explained above (albeit with a higher gain). The external gain control input signal of the S/H circuit is applied to a voltage divider similar to the limiter gain control input signal in Fig. 4-5. The design of the first T/H circuit from the S/H circuit in Fig. 4-4 is explained below.

The schematic diagram of the track-and-hold circuit is shown in Fig. 4-6. The track-and-hold circuit is based on the switched emitter follower topology. The input signal is applied to a cascode differential pair with emitter degeneration to ensure linear operation over the desired input dynamic range. The cascode stage is used to reduce the Miller feedback capacitance and to reduce the $V_{CE}$ voltage drop across the differential pair transistors. The differential output of the cascode stage is applied to a pair of switchable emitter followers (common-collector amplifiers). The switched emitter follower stages use two differential pairs to control (switch) the currents through the emitter followers individually. During the track mode, the current is steered toward the emitter followers, which buffer the input signal from the cascode differential stage to a pair of storage capacitors. The output impedance of the emitter follower stage is $1/g_m + R_{source}/\beta_0$, where $g_m$ is the transconductance of the transistor, $R_{source}$ is the resistance connected from the base terminal to the small-signal ground, and $\beta_0$ is the small-signal current gain of the transistor, i.e., $I_C/I_B$. During the hold mode, the current is steered away from the emitter followers, which leaves their respective storage capacitors floating: there is no path for the capacitors to discharge, and thus they hold their last stored values. The voltages at the bases of the common-collector transistors are pulled low during the hold mode to make sure that the emitter followers are switched off. To read the stored voltage across the capacitors, a differential high-impedance stage is required. A pair of emitter followers is a good candidate for this purpose because it has a high input impedance, i.e., $r_{\pi} + \beta_0 R_E$, where $R_E$ is the resistance connected between the emitter of the transistor and the small-signal ground. Finally, a cascode differential pair with emitter degeneration is used as an output stage to add some gain (not shown in Fig. 4-6).

Fig. 4-6 Schematic of the track and hold circuit.

Note that during the hold mode, the base-emitter capacitance $C_{\pi}$ of the common-collector stage causes feedthrough of the input signal to the output node. The base-emitter capacitance $C_{\pi}$ is the sum of the base charging capacitance $C_b$ (also known as the base diffusion capacitance $C_d$) and the emitter-base junction depletion layer capacitance $C_{je}$. The base charging capacitance $C_b$ can be derived to be equal to $\tau_F I_C / V_T$, where $\tau_F = W_{base}^2 / 2D_n$ is the base forward transit time ($W_{base}$ is the base width, and $D_n$ is the electron diffusivity constant for the base), $I_C$ is the collector current, and $V_T = kT/q$ is the thermal voltage. Note that when the transistor is switched off, the base charging capacitance $C_b$ reduces to zero and only the junction depletion layer capacitance $C_{je}$ is relevant. A common method to suppress the hold mode feedthrough is to add an out-of-phase version of the signal to the fed-through signal. In a differential circuit implementation, the out-of-phase signal is already available on the opposite branch of the differential pair. A diode bridge is used to add the out-of-phase voltage signal from the opposite branch of the differential pair to the storage capacitor. The diodes in the diode bridge are implemented using HBTs with their base terminals shorted to their collector terminals and with the same dimensions as the common-collector stage transistor. The first T/H circuit has unity gain.
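The dependence of the base-emitter capacitance on the collector current can be made concrete with a small numerical sketch. The relations are those stated above; the values of `TAU_F`, `C_JE`, and `I_TRACK` are hypothetical placeholders and are not taken from the thesis or from the process design kit.

```python
K_BOLTZMANN = 1.380649e-23     # J/K
Q_ELECTRON = 1.602176634e-19   # C


def thermal_voltage(temp_k=300.0):
    """V_T = kT/q in volts."""
    return K_BOLTZMANN * temp_k / Q_ELECTRON


def base_charging_cap(tau_f, i_c, temp_k=300.0):
    """C_b = tau_F * I_C / V_T in farads."""
    return tau_f * i_c / thermal_voltage(temp_k)


def c_pi(tau_f, i_c, c_je, temp_k=300.0):
    """C_pi = C_b + C_je in farads."""
    return base_charging_cap(tau_f, i_c, temp_k) + c_je


# Hypothetical illustration values (assumptions, not measured data).
TAU_F = 0.5e-12    # forward transit time, s
C_JE = 10e-15      # emitter-base depletion capacitance, F
I_TRACK = 1e-3     # collector current in track mode, A

print("track mode C_pi [fF]:", c_pi(TAU_F, I_TRACK, C_JE) * 1e15)
print("hold mode  C_pi [fF]:", c_pi(TAU_F, 0.0, C_JE) * 1e15)
```

With the emitter follower switched off, only the depletion capacitance remains as the feedthrough path, which is the residual that the diode bridge is meant to cancel.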

The simulation result of the S/H circuit is shown in Fig. 4-7. The input signal has an amplitude of 350 mV<sub>diff</sub>, which is larger than the linear input dynamic range of the limiter circuit. Thus, the limiter circuit limits the output to approximately 200 mV<sub>diff</sub>. The output of the first T/H circuit follows the limiter output except for the brief hold period. The second T/H circuit tracks (follows) the hold mode output of the first T/H. The track mode output of the second T/H is an amplified version of the hold mode value of the first T/H. The sampling clock input of the T/H circuits is the same signal that sets the integrate and reset modes of the correlator circuit, as explained in section 3.5.3. The *intgrt* command signal of the integrator serves as the track signal of the first T/H and as the hold signal of the second T/H. The *reset* command signal of the integrator serves as the hold signal of the first T/H and as the track signal of the second T/H. The output of the S/H is fed to the analog-to-digital converter circuit as shown in Fig. 4-2.
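The interplay of the two complementary T/H stages can be sketched with an idealized discrete-time model. It assumes ideal switches, unity gain in both stages, one sample per chip, and an 18-chip symbol with 15 integrate and 3 reset chips; the function names are introduced only for this illustration.

```python
def track_and_hold(samples, track_flags, gain=1.0):
    """Ideal T/H: follow the input while tracking, otherwise keep the last value."""
    out, held = [], 0.0
    for x, track in zip(samples, track_flags):
        if track:
            held = gain * x
        out.append(held)
    return out


def sample_and_hold(samples, integrate_flags):
    """Two T/H stages in series, driven by complementary command signals."""
    first = track_and_hold(samples, integrate_flags)
    second = track_and_hold(first, [not f for f in integrate_flags])
    return first, second


# Two symbols: the integrator ramps for 15 chips and is reset for 3 chips.
integrate = ([True] * 15 + [False] * 3) * 2
ramp = list(range(1, 16)) + [0, 0, 0]
correlator = ramp + [-v for v in ramp]     # second symbol with opposite sign

first, second = sample_and_hold(correlator, integrate)
print(first[15:18])    # first T/H holds the final ramp value during reset
print(second[18:33])   # S/H output stays constant during the next integration
```

The S/H output therefore presents the final correlation value of one symbol for the entire integration period of the next symbol, which gives the following stage a full symbol period to settle.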



Fig. 4-7 Simulation results of the S/H circuit. Differential signals: input signal (red), output of the limiter (yellow), output of the T/H (green), output of the S/H (turquoise), and the T/H and S/H command signal (blue).

## 4.3 Inter-Chip Phase Alignment (Voltage Controlled Delay Line)

A very important aspect of the system design is the exact phase alignment between the external PSSS input signal and the locally generated decoding sequence. The phase alignment involves both *inter-chip* and *intra-chip* phase alignment. The inter-chip phase alignment is a system-level synchronization problem and is discussed in section 5.2.1.2 as part of the baseband system measurements. In this section, the *intra-chip* phase alignment is discussed. This can be considered as the fine-tuning of the phase alignment after the *inter-chip* alignment has been performed. Fig. 4-8 shows the parametric simulation results of the m-sequence correlation. In the simulations, both inputs of the correlator are bipolar m-sequences, where one input m-sequence is fixed, and the other input m-sequence has a positive or negative skew with reference to the first input m-sequence. When the two inputs have no skew, the output of the multiplier is a constant DC level except for some small dips caused by the transitions of the input m-sequences. As observed in the simulation results in Fig. 4-8, the integrator output is then a ramp signal corresponding to a constant DC input. With increasing skew between the two correlator inputs, the output deviates from the ideal ramp signal and the final amplitude gets smaller. The final integrator amplitude can therefore be used to find the optimum phase alignment between the external PSSS signal and the on-chip generated decoding sequence. The phase delay between the external PSSS signal and the on-chip decoding sequence can be adjusted using a manually controlled on-chip voltage-controlled delay line (VCDL) circuit as indicated in Fig. 4-1. The circuit design of the VCDL is explained below.

Fig. 4-8 M-sequence correlation parametric simulation with skew as the independent variable.

Fig. 4-9 Block diagram of the VCDL with the schematic design of the VCDL part circuit.
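The loss of final correlation amplitude with skew can be reproduced with an idealized model. The sketch assumes a length-15 m-sequence generated by the primitive polynomial $x^4 + x^3 + 1$, rectangular chips, periodic (cyclic) repetition of the sequence, and an ideal integrator; the generator polynomial and the oversampling ratio are choices made for this illustration and are not taken from the thesis.

```python
def m_sequence(taps=(4, 3), length=15):
    """Length-15 m-sequence from a 4-bit Fibonacci LFSR (x^4 + x^3 + 1)."""
    state = [1, 0, 0, 0]
    seq = []
    for _ in range(length):
        seq.append(state[-1])
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]
    return seq


def final_correlation(skew_samples, osr=8):
    """Integrator value after one period, in units of one perfectly aligned chip."""
    chips = [2 * b - 1 for b in m_sequence()]          # bipolar +/-1 chips
    wave = [c for c in chips for _ in range(osr)]      # rectangular chip pulses
    n = len(wave)
    total = sum(wave[i] * wave[(i - skew_samples) % n] for i in range(n))
    return total / osr


for skew in (-8, -4, -2, 0, 2, 4, 8):                  # skew in 1/8 chip steps
    print(f"skew {skew / 8:+.3f} chip -> final value {final_correlation(skew):+.1f}")
```

In this model the final value falls linearly from 15 at zero skew to -1 at a skew of one full chip, so the maximum of the sampled amplitude marks the aligned position.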

The VCDL consists of six identical copies of a delay interpolation cell, named VCDL part, as shown in Fig. 4-9. Each of the six VCDL part circuits contributes $60^{\circ}$ of phase delay variation at the clock frequency of 30 GHz, leading to a total delay variation of 360°, which corresponds to one complete chip of the decoding sequence. To achieve the required delay variation, the circuit makes use of a fast path and a slow path for the input signal. The fast path consists of a single amplifier stage and adds only a small delay to the input signal, whereas the slow path has an additional stage to increase its phase delay. To further increase the phase delay of the slow path, a cross-coupled differential pair is attached in parallel to the load, which increases the RC time constant at the output nodes of the stage. The transition between the fast and slow paths is controlled with the help of a linear differential amplifier. The linear differential amplifier helps to reduce the tuning sensitivity of the circuit. The inputs of this differential amplifier are equipped with reference voltages defined with the help of voltage divider circuits. If the external *Steer\_Ctrl* signal is set to -2.0 V, all the current flows through the fast path and hence the circuit contributes the minimum phase delay. When the external *Steer\_Ctrl* signal is set to -1.5 V, the current is steered to the slow path and the circuit contributes the maximum phase delay. For any value of the *Steer\_Ctrl* signal between -2.0 V and -1.5 V, the current flows partially through both the fast and the slow paths, and the output signal is the sum of the fast and the slow versions of the signal, which produces a delay between the two extreme values depending on the value of the *Steer\_Ctrl* signal. The variation of the output signal phase delay as a function of the *Steer\_Ctrl* control voltage is shown in Fig. 4-10. The results show a full 360° phase variation for the input frequency of 30 GHz and 180° phase variation for the input frequency of 15 GHz. The characteristic in Fig. 4-10 shows a slightly non-linear behavior, but the linearity is good enough for external manual control or even automatic phase calibration.

Fig. 4-10 Phase delay variation in degrees as a function of the Steer\_Ctrl voltage input (simulated results).

Fig. 4-11 Duty cycle in percentage as a function of the Steer\_Ctrl voltage input (simulated results).
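Two aspects of the delay interpolation can be illustrated with a simple phasor model: how a weighted sum of a fast and a slow signal produces an intermediate phase, and why a fixed delay range corresponds to half the phase range at half the frequency. The model assumes ideal sinusoids of equal amplitude and a linear current split `alpha`, neither of which is claimed to describe the actual circuit.

```python
import cmath
import math


def interpolate(alpha, slow_extra_deg=60.0):
    """Phasor sum of the fast path (weight 1 - alpha) and the slow path (weight alpha).

    Returns (phase delay in degrees, output amplitude relative to a single path).
    """
    fast = 1.0 + 0.0j
    slow = cmath.exp(-1j * math.radians(slow_extra_deg))
    out = (1.0 - alpha) * fast + alpha * slow
    return -math.degrees(cmath.phase(out)), abs(out)


def phase_range_deg(delay_ps, freq_ghz):
    """Phase shift produced by a fixed time delay at a given frequency."""
    return 360.0 * freq_ghz * delay_ps * 1e-3


for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    delay_deg, amplitude = interpolate(alpha)
    print(f"alpha {alpha:.2f}: delay {delay_deg:5.1f} deg, amplitude {amplitude:.3f}")

full_range_ps = 1000.0 / 30.0              # one 30 GHz period, i.e., 360 degrees
print(phase_range_deg(full_range_ps, 30.0), phase_range_deg(full_range_ps, 15.0))
```

In this idealization the amplitude dips at mid-range settings because the two phasors partially cancel, and the phase is not an exactly linear function of the current split.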

A disadvantage of the delay interpolation circuit is the degradation of the signal quality and the duty cycle of the output signal when the *Steer\_Ctrl* signal is not at one of the extreme values. The simulation results in Fig. 4-11 show the worst-case duty cycle for the *Steer\_Ctrl* voltage value of -1.6 V, where it drops to slightly lower than 44% for the input signal frequency of 30 GHz. The circuit behaves better for the 15 GHz clock frequency because the stages have enough gain to allow limiting behavior of the amplifier stages, which significantly improves the output duty cycle of the circuit as well. The worst-case duty cycle for 15 GHz is better than 47%. For the prototype circuit, the VCDL circuit is controlled manually. However, a simple feedback loop can be devised that makes use of the sampled output amplitude of the correlator with the training data to determine the optimum phase delay that generates the maximum correlator output amplitude. Moreover, for the prototype circuit, an additional passive phase shifter module was also used for additional phase delay control.
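One possible form of the calibration loop mentioned above is a plain sweep of the control voltage that keeps the setting with the largest sampled correlator amplitude. The sketch below is only an assumption about how such a loop could look; `measure_amplitude` stands for a hypothetical routine that applies a *Steer\_Ctrl* value and returns the sampled correlator amplitude for the training data, and `fake_correlator` is a stand-in used to exercise the code.

```python
def calibrate_phase(measure_amplitude, v_min=-2.0, v_max=-1.5, steps=51):
    """Sweep Steer_Ctrl over its range and return (best voltage, best amplitude)."""
    best_v, best_amp = v_min, float("-inf")
    for k in range(steps):
        v = v_min + (v_max - v_min) * k / (steps - 1)
        amp = measure_amplitude(v)
        if amp > best_amp:
            best_v, best_amp = v, amp
    return best_v, best_amp


def fake_correlator(v, v_opt=-1.8):
    """Hypothetical stand-in: amplitude falls off linearly around an optimum."""
    return 1.0 - abs(v - v_opt) / 0.5


best_v, best_amp = calibrate_phase(fake_correlator)
print(f"best Steer_Ctrl = {best_v:.2f} V, amplitude = {best_amp:.3f}")
```

An exhaustive sweep is robust against the slightly non-linear tuning characteristic; a hill-climbing search would need fewer measurements but is not shown here.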

## 4.4 Multioutput Stage

Ideally, as many on-chip nodes as possible should be accessible off-chip so that the outputs at different points in the circuit can be observed and any problems can be traced. However, the RF bond pads are prohibitively large, so only the necessary outputs are made accessible. An interesting idea is to multiplex several signals at the output stage and to select one of them at any given time. This way, multiple points on the chip can be probed without the need for additional RF bond pads. The idea is shown with the help of the schematic diagram in Fig. 4-12.

Fig. 4-12 The schematic of the multioutput stage. Only one of the multiple outputs can be selected at any given time using a CMOS ring register initiated with the inverse of a one-hot pulse code (not shown here).

Each input signal is applied to an emitter follower circuit that drives a linear cascode stage with open-collector outputs. The open-collector outputs from all the cascode stages are tied together to perform the wired-OR operation. The summed-up currents are converted to a differential voltage by load resistors, and this voltage is fed to the final linear differential amplifier with 50 $\Omega$ output loads connected to the RF bond pads. The tail current sources are implemented using long-channel NMOS transistors. The reference current for the current mirrors is provided by an appropriately sized PMOS transistor, which can be turned on or off using digital CMOS-level inputs. Note that only one of the 3 input linear differential amplifier stages has its output routed to the 50 $\Omega$ output stage. This is controlled with the help of a CMOS ring register initialized with the inverse of a one-hot pulse code, i.e., 011 (not shown in the figure). This makes sure that only one of the differential amplifier stages has the required current while all the others are turned off. Emitter degeneration resistors with appropriate values are used to linearize the amplifier output characteristics. To select a different input signal as the source, a push-button can be used to generate a low-speed clock signal that shifts the ring register contents by one position, thereby selecting the next input as the source for the output stage. Note that this concept is quite generic and can be extended to any number of stages as required. The only important consideration is the input common-mode voltage of the input signals.
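The selection mechanism can be modeled as a rotating register that contains exactly one 0. The sketch assumes that the 0 enables the stage (an active-low enable is inferred from the PMOS reference transistor and the inverted one-hot code, but it is not stated explicitly in the text); the function names are introduced for this illustration.

```python
def rotate(reg):
    """One clock pulse: shift the ring register contents by one position."""
    return reg[-1:] + reg[:-1]


def selected_input(reg):
    """Index of the enabled stage; a 0 is assumed to turn on the PMOS reference."""
    if reg.count(0) != 1:
        raise ValueError("register must hold the inverse of a one-hot code")
    return reg.index(0)


reg = [0, 1, 1]                      # initial contents 011
selection = []
for _ in range(4):                   # each push-button press produces one clock pulse
    selection.append(selected_input(reg))
    reg = rotate(reg)

print(selection)
```

Because the register is a ring, the pattern can never contain more than one 0 after initialization, so two stages cannot drive the shared load resistors at the same time.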

The multioutput stage was used to probe the output of the limiting amplifier (after the integrator), the output of the track-and-hold circuit, and the output of the sample-and-hold circuit. The output of the S/H stage was applied to an amplifier stage with adjustable gain to adjust the magnitude of the signal at the input of the ADC to enable the use of the full-scale range of the ADC.

## 4.5 Layout Design

The IHP SG13S technology provides a 7-layer metal stack with aluminum as the metal. The top two layers are thick metal layers, whereas the remaining layers are thin. The thick top layers have better conductivity than the thin layers. The first metal layer, *Metal 1*, is used mainly for connections to the lowest supply voltage in the circuit, i.e., -4 V. This is also the layer to which the substrate connections of all the transistors are made. The second metal layer, *Metal 2*, is reserved for the positive power supply voltage, i.e., 1.2 V, which also serves as the $V_{DD}$ supply voltage for the CMOS circuits. The ground connections, i.e., 0 V, are routed on the third metal layer, *Metal 3*, which also serves as the ground plane for the microstrip line connections on the chip. The signal layer for the microstrip line is routed on the top-most metal layer, *Top Metal 2*. For a microstrip line structure, no metal tracks may be placed between the signal and the ground layers of the microstrip structure. Having the ground layer on *Metal 3* allows the use of *Metal 1* and *Metal 2* layers



Fig. 4-13 The chip layout with the important circuit components highlighted.

for any necessary connections that have to cross the microstrip line structure because they will be shielded by the RF-ground layer on *Metal 3*. The maximum realizable microstrip line impedance is achieved by using the minimum line width for *Top Metal 2*, i.e., 2 $\mu m$, which results in a characteristic impedance of 80 $\Omega$.

The layout of the mixed-signal receiver baseband unit-slice test-chip is shown in Fig. 4-13. The important I/Os, as well as the important circuit components, are highlighted. The size of the chip is $1.85 \text{ mm} \times 2 \text{ mm}$. The clock input can be seen at the bottom of the chip, along with the passive clock tree that supplies a synchronous clock to a chain of 18 DFFs. The outputs of the DFFs act as control signals for the differential current switches. The other inputs of the differential current switches are the scalable differential current sources. The outputs of the current switches are wired-OR'ed to perform the multiplexing operation, which is divided into two steps. In the first step, the outputs of 6 current switches are combined and the summed-up current is converted to a voltage signal that is routed using a microstrip line towards the mid-point, where it is combined with the remaining voltage signals. In the second step, the voltage signals are converted to currents that are wired-OR'ed together at a pair of load resistors to generate the code-generator voltage output.

The code generator output is applied at one input of the multiplier circuit, whereas the other input (i.e., the PSSS input signal) is applied externally. The output of the multiplier goes to the integrator. The reset signal for the integrator is generated by taking the OR operation of the DFF pulses. The reset signal is routed using a microstrip line. The propagation delay caused by the microstrip line requires the use of the 15<sup>th</sup>, 16<sup>th</sup>, and 17<sup>th</sup> DFF output pulses instead of the 16<sup>th</sup>, 17<sup>th</sup>, and 18<sup>th</sup> DFF output pulses. The same signal acts as the track or hold control signal for the two successive track-and-hold circuits forming the sample-and-hold circuit that samples the correlator output. The output of the sample-and-hold circuit is then applied at the input of the analog-to-digital converter (ADC). The thermometer output of the ADC is converted to a binary format that is serialized and scrambled to generate an output data stream suitable for the FPGA. The clock signals for the serializer (4:1 multiplexer) and the scrambler circuit are derived from the clock input at the bottom of the chip and have to be routed to the top of the chip, where the serializer and scrambler circuits are located. For important high-speed signal interconnects, the Momentum electromagnetic field solver from Keysight Advanced Design System (ADS) was used to obtain the S-parameter values. Post-layout simulations were used to verify that the clock signals had good amplitude and slew rate at the different clock input nodes in the circuit. The reader is referred to chapter 5 for details about the characterization of the baseband chip and the actual measurement results.

## 4.6 Concept for a Complete PSSS Receiver Baseband Circuit

The circuit components discussed so far are required for the PSSS receiver baseband unit-slice circuit. The complete PSSS receiver baseband circuit consists of 15 copies of the unit-slice structure as shown in Fig. 4-1. In the following sub-sections, the important circuit components and considerations for a complete PSSS receiver baseband circuit are discussed.

### 4.6.1 Automatic Gain Control

For reliable operation of the wireless link, an automatic gain control circuit is required directly after the RF-frontend. It maintains the required signal-to-noise ratio despite changes in the distance between the transmitter and the receiver. The block diagram of the automatic gain control amplifier is shown in Fig. 4-14. A detector circuit is used to determine the amplitude of the received RF-signal. Since the PSSS signal is a multi-amplitude signal, it is averaged over a long time to determine the change in the average signal power due to a change in the distance between the transmitter and the receiver. A negative feedback control circuit compares the detected level with a preset value determined by the receiver sensitivity and generates the gain control signal, which sets the gain of the variable gain amplifier (VGA). A common circuit topology for the VGA is based on a transistor quad as shown in Fig. 4-15. This is similar in function to a Gilbert cell mixer, albeit with DC control inputs at one of the inputs to control the current steering between the transistors in the upper quad. The lower differential pair is linearized with the help of emitter degeneration to reduce the sensitivity of the circuit. The default gain setting is defined using voltage divider circuits (similar to pull-up/pull-down networks), one of which is made externally accessible to control the gain of the circuit.
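The principle of the negative feedback gain control can be sketched with a simple discrete-time behavioral model. It assumes that the detector already delivers the long-term average amplitude and that the loop adjusts the gain multiplicatively; the parameters `target`, `loop_gain`, and the input levels are hypothetical and do not describe the circuit in Fig. 4-14.

```python
def agc_loop(input_amplitudes, target=1.0, loop_gain=0.2, gain=1.0):
    """Return the VGA output amplitude for each averaged input amplitude."""
    outputs = []
    for amplitude in input_amplitudes:
        detected = gain * amplitude                 # averaged detector output
        outputs.append(detected)
        # Negative feedback: raise the gain if the level is too low, and vice versa.
        gain *= (target / detected) ** loop_gain
    return outputs


# Link distance increases after 60 averaging intervals: the input drops by a factor of 4.
levels = [1.0] * 60 + [0.25] * 60
out = agc_loop(levels)
print(f"before step: {out[59]:.4f}, right after step: {out[60]:.4f}, settled: {out[-1]:.4f}")
```

In this model the error shrinks by the factor `1 - loop_gain` per interval on a logarithmic scale, so a smaller `loop_gain` gives a slower but smoother response.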



Fig. 4-14 Block diagram of an automatic gain control amplifier.



Fig. 4-15 Schematic of the variable gain amplifier (VGA).

### 4.6.2 Active Power Divider

The mixed-signal PSSS receiver baseband has a sliced architecture that requires 15 copies of the unit-slice circuit. The clock and PSSS signals need to be routed to all 15 unit-slices simultaneously. This requires the use of active power divider circuits. The clock and PSSS signals can be routed using a binary tree with 16 output nodes. The 16<sup>th</sup> output node can be connected to a dummy load to present equal impedance at all output nodes of the tree structure. All the connections between the nodes should be made using microstrip lines terminated with matched loads on either end to prevent reflections and signal degradation. The left and right arms of the tree after the first binary division, i.e., after stage-1, are drawn in the vertical direction to reduce the total length of the chip by half. The resulting binary tree is shown in Fig. 4-16. The stages of the tree represent active amplifier stages, which are shown in Fig. 4-17 for the clock and the PSSS data trees. The tree structure is the same for the clock and the PSSS data distribution trees. The active node of the differential binary tree is also shown in Fig. 4-17. It consists of a linear differential amplifier with capacitive peaking. The microstrip line with the maximum realizable characteristic impedance of 80 $\Omega$ is terminated with two 160 $\Omega$ resistors (left and right) in parallel.
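The tree dimensions and the termination value follow from simple arithmetic, as the sketch below shows. It assumes one splitting node per branch point of the 4-stage binary tree; the helper `parallel` is introduced for this illustration.

```python
def parallel(*resistors):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistors)


STAGES = 4
N_SLICES = 15

outputs = 2 ** STAGES                                  # output nodes of the tree
split_nodes = sum(2 ** s for s in range(STAGES))       # 1 + 2 + 4 + 8 branch points
dummy_outputs = outputs - N_SLICES                     # outputs tied to a dummy load
termination = parallel(160.0, 160.0)                   # two 160 ohm loads, left and right

print(outputs, split_nodes, dummy_outputs, termination)
```

The two 160 $\Omega$ resistors in parallel present 80 $\Omega$ to the line, which matches its characteristic impedance.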



Fig. 4-16 4-stage binary tree for distribution of PSSS signals to the 15 RX unit-slices. The last node should be terminated in a dummy load impedance similar to the load impedance connected at the other nodes.



Fig. 4-17 The active signal distribution network for distribution of PSSS signal to the 15 receiver unit-slices.

### 4.6.3 Automatic Offset Adjustment

A manual offset correction circuit was used to correct the post-fabrication offset in the output of the integrator. While the manual offset correction works fine for the prototype version of the baseband chip, an automatic offset correction circuit should be incorporated for system-level integration into a complete baseband receiver circuit. The differential outputs of the circuit with the offset problem are applied to low-pass filters to determine the average DC value of the single-ended p and n outputs, and an op-amp compares the two filtered values. The cut-off frequency of the low-pass filters should be far lower (i.e., more than 20 times lower) than the frequency of operation of the circuit to ensure that the high-speed transient waveforms of the p and n outputs have no influence on the output of the op-amp and consequently on the offset correction. The feedback circuit settles to a steady-state solution when the p and n outputs have the same average DC value. The offset correction circuit makes use of a differential pair with open-collector outputs connected to the output load resistors of the circuit whose output offset voltage needs to be corrected, as shown in Fig. 4-18. The current source at the bottom of the offset correction differential pair determines the magnitude of the offset correction. The inputs of the differential pair are fitted with voltage divider circuits to provide an initial reference voltage. The output of the op-amp steers the current $I_{correction}$ from one side of the differential pair to the other by controlling the input voltage of one of the transistors of the differential pair, while the other remains fixed at the reference voltage generated by the voltage divider. Note that the proposed concept is quite general and can be applied to all offset-sensitive circuit blocks in the baseband circuit, e.g., multiplier, integrator, ADC input, etc.
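The settling behavior of such a loop, and the limit set by the correction current, can be sketched with a simple integrating feedback model. All numerical values (`i_max_ua`, `r_load_ohm`, `loop_gain`, and the offsets) are hypothetical, and the low-pass filters and the op-amp are idealized as a single integrator.

```python
def offset_correction(offset_mv, i_max_ua=100.0, r_load_ohm=100.0,
                      loop_gain=0.1, steps=200):
    """Return the residual differential offset (mV) after the loop has settled."""
    # Largest correction when all of I_correction flows in one branch (uA * ohm -> mV).
    max_corr_mv = i_max_ua * r_load_ohm * 1e-3
    steer = 0.0                       # -1 .. +1: current split between the two branches
    for _ in range(steps):
        residual = offset_mv + steer * max_corr_mv
        steer -= loop_gain * residual / max_corr_mv    # integrate the filtered error
        steer = max(-1.0, min(1.0, steer))             # current is fully steered
    return offset_mv + steer * max_corr_mv


print(offset_correction(4.0))     # within the correction range: residual close to zero
print(offset_correction(20.0))    # beyond the range: limited by I_correction
```

The second call illustrates why the tail current of the correction pair has to be sized for the largest expected offset.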



Fig. 4-18 The block diagram for automatic offset correction circuit.

### 4.6.4 Digital Controls for the Chip

The analog control voltages of the chip, for example, the gain control values for the different stages as well as the controls for the offset correction, should be provided as digital values rather than as analog potentiometer voltages. This reduces the number of input bond pads because a separate bond pad is no longer required for each control voltage. It requires the use of digital potentiometers, which are switch-operated resistor elements whose resistance can be controlled by changing the contents of a digital register. Using digital potentiometers offers the advantage that the values of all the control voltages can be configured using just a pair of bond pads, i.e., the serial data input and serial clock input bond pads. The values of all the digital potentiometers are transferred serially using this interface. The serial data values are first stored in a temporary buffer before they are transferred to the respective registers of the different digital potentiometers at the last clock edge. A common method to implement a digital potentiometer is to use a chain of resistors whose taps are connected to the middle terminal of the potentiometer through switches controlled by registers, as shown in Fig. 4-19 (top). The larger the number of switches, the finer the granularity of the digital potentiometer. However, this increases the register length considerably. A common remedy is to use a serial binary code as input and a decoder to generate the register values for the switches.

Fig. 4-19 General idea for the implementation of a digital potentiometer (top). Using NMOS transistors as switches for the current application (bottom).
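The decoder-based digital potentiometer can be illustrated with a short behavioral sketch. It assumes a chain of equal resistors between 0 V and a negative supply of -4 V, one tap switch per code value, and a 4-bit code; these values and the function names are assumptions for illustration only.

```python
def decode_one_hot(code, n_bits):
    """Binary code -> switch register contents with exactly one closed switch."""
    if not 0 <= code < 2 ** n_bits:
        raise ValueError("code out of range")
    return [1 if i == code else 0 for i in range(2 ** n_bits)]


def wiper_voltage(code, n_bits=4, v_top=0.0, v_bottom=-4.0):
    """Voltage at the middle terminal for a chain of 2**n_bits equal resistors."""
    switches = decode_one_hot(code, n_bits)
    tap = switches.index(1)                  # tap k sits k resistors above v_bottom
    return v_bottom + (v_top - v_bottom) * tap / 2 ** n_bits


print(decode_one_hot(5, 4))
for code in (0, 4, 8, 15):
    print(code, wiper_voltage(code))
```

With the decoder, a 4-bit serial word replaces a 16-bit switch register, which is the reduction in register length mentioned above.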

The actual implementation of the switches depends on the application as well as the voltage levels at the end terminals of the potentiometer. For the baseband chip, the potentiometers are used to provide control voltages at the bases of the transistors by forming a voltage divider connected between the ground and the negative $V_{EE}$ supply voltage. For this application, a simple high-voltage (HV) NMOS transistor can work fine as a switch. The source of the transistor will always be at a voltage between 0 V and the negative $V_{EE}$ supply voltage. With 0 V at the gate of the NMOS, the switch is open, and with 1.2 V the NMOS will be fully on. Thus, a simple NMOS transistor suffices as the switch for this application. For other applications, a PMOS transistor or even a parallel combination of PMOS and NMOS transistors can be used as the switch.

## 4.7 Summary

The design and characterization of the most critical components of the mixed-signal PSSS transmitter and receiver baseband circuits were discussed in chapter 3. This chapter deals with the design details of the additional components required for the proper functioning of the PSSS receiver baseband circuit. The most important of these components include the limiter circuit, the sample-and-hold circuit, and the voltage-controlled delay line circuit. The chip layout and the floor plan of the mixed-signal PSSS receiver baseband unit-slice test-chip are discussed. Combining the components mentioned above into a single baseband unit-slice circuit requires additional circuit components and system-level design considerations that are also discussed in this chapter. For the characterization of the mixed-signal PSSS receiver baseband unit-slice test-chip, the PSSS transmitter baseband circuit is emulated using an arbitrary waveform generator for the remainder of the thesis and the PSSS transmitter design is, therefore, not discussed further.

# 5 Characterization of a Complete Mixed-Signal PSSS Receiver Baseband Unit-Slice

The measurement setup and characterization of the mixed-signal PSSS receiver (RX) baseband (BB) slice are discussed in this chapter. The block diagram of the RX BB slice and the design and characterization of the most important components were discussed in chapter 3. An arbitrary waveform generator is used as the source for the PSSS waveform. To characterize the RX BB test-chip, a high-speed printed circuit board (PCB) was designed, and the chip was wire-bonded to the PCB. The design of the PCB, the modeling of the bond wires, the overall measurement setup, and the measurement results are discussed in this chapter.

## 5.1 High Speed Printed Circuit Board (PCB) Design

The RX BB unit-slice test-chip has a size of roughly 2.1 mm $\times$ 1.9 mm. The chip requires a high-frequency clock signal, a broadband data input (PSSS), multiple DC power supply connections, adjustable DC control voltage signals, and low-speed digital serial I/Os. The high-speed outputs of the chip include the direct output of the correlator, the sampled output of the correlator (i.e., S/H output), the output of the one-hot pulse generation circuit, and a high-speed serial data output stream to the FPGA along with a reference clock signal for the FPGA multi-gigabit transceivers. The usual method for testing high-speed ASICs is to use direct wafer probing for the high-speed I/Os and to use a printed circuit board for connecting the low-speed I/Os and DC connections. This scheme is usually suitable for chips with at most 3 high-speed I/Os so that the DC and low-speed I/Os can be connected on the 4<sup>th</sup> side of the chip. However, in the current case, the number of high-speed I/Os is 6 and there is no spare side of the chip where all the DC connections and low-speed I/Os can be routed, as can be seen in Fig. 5-1. Using the wafer prober to make connections to the high-speed I/Os would require leaving a lot of space on either side of the high-speed I/O pads (i.e., to avoid making contact with the bond wires connected to the adjacent bond pads), which would require a very large chip size and a complicated design for the PCB.

The solution to the above problems is to use a printed circuit board to interface both the high-speed I/Os and the DC or low-speed I/O connections. The problem with this approach is the limited bandwidth of the high-speed signal tracks on the PCB. Thus, the goal of the PCB design is to optimize the frequency response of the passive transmission line structures to achieve the required bandwidth of at least 30 GHz. Note that all the high-speed I/Os are differential signals that are terminated with on-chip 50 $\Omega$ loads. The ideal connection should consist of straight, low-loss, single-ended transmission lines with a characteristic impedance of 50 $\Omega$. Each discontinuity in the ideal 50 $\Omega$ transmission line increases the reflection coefficient $\Gamma$ of the transmission line structure, which is given by the following formula

$$\Gamma = (Z_L - Z_c) / (Z_L + Z_c)$$
(5-1)

where  $Z_c$  denotes the characteristic impedance of the transmission line and  $Z_L$  denotes the load impedance.
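As a small numerical illustration of Eq. (5-1), the following sketch evaluates the reflection coefficient and the corresponding return loss for a given load impedance. The function names are hypothetical, and the conversion of $|\Gamma|$ to return loss in dB, $-20\log_{10}|\Gamma|$, is the usual definition rather than something stated in the text.

```python
import math


def reflection_coefficient(z_load, z_c=50.0):
    """Reflection coefficient per Eq. (5-1); works for real or complex impedances."""
    return (z_load - z_c) / (z_load + z_c)


def return_loss_db(gamma):
    """Return loss in dB (positive number); infinite for a perfect match."""
    magnitude = abs(gamma)
    if magnitude == 0:
        return math.inf
    return -20.0 * math.log10(magnitude)


if __name__ == "__main__":
    for z_load in (50.0, 75.0, 100.0):
        gamma = reflection_coefficient(z_load)
        print(z_load, gamma, return_loss_db(gamma))
```

A matched 50 $\Omega$ load gives $\Gamma = 0$, whereas a 100 $\Omega$ load on a 50 $\Omega$ line gives $\Gamma = 1/3$, i.e., a return loss of about 9.5 dB.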

To characterize the performance of the receiver baseband unit-slice test-chip, a high-speed PCB is designed. For the PCB, a 127 $\mu$m high-frequency Isola Astra substrate is used with a 1 mm thick copper plate at the bottom for heat dissipation (the layer stack-up is explained in section 5.1.2). A cavity is created in the center of the PCB that is shorted to both the RF ground plane on the bottom layer of the PCB as well as to the copper plate. The bottom layer (RF ground plane) of the PCB is at 0 V, whereas the chip substrate is at -4 V. The chip is, therefore, glued to the PCB using a thermally conductive but electrically insulative glue. The depth of the cavity is designed such that the top of the glued chip is at approximately the same height as the top layer of the PCB itself. The PCB surface is treated with an electroless nickel immersion gold (ENIG) coating to allow wedge-type wire-bonding with aluminum bond wires with a diameter of 25 $\mu$m. 2.92 mm prototype board connectors are used to connect the PCB to external measurement devices. The DC and control signals are supplied using a separate PCB whereas the low-speed digital I/Os, e.g., the contents of the decoding vector, are supplied using an FPGA board. The chip uses the following power supply voltages: 1.2 V for digital and some analog mixed-signal bias connections, and -4 V for the high-speed analog and mixed-signal circuits.



Fig. 5-1 Microphotograph of the chip wire-bonded to the PCB substrate.

Moving away from the load side (i.e., the on-chip 50  $\Omega$  terminations) the main instances of the discontinuity in the transmission line structure are the following: the bond wires used to connect the on-chip bond pads to the PCB tracks, the coupled region of the differential transmission lines near the bond pads, the bend structures on the PCB tracks (if used), the transition of the coupled transmission line segment to two single-ended transmission lines, the transition from the planar transmission line on the PCB (i.e., microstrip lines, etc.) to the coaxial transmission line (i.e., the coaxial cables), and any adapters or connectors, etc. used. The main sources of the discontinuities i.e., bond wires and the RF connectors are discussed in the following sub-sections.

### 5.1.1 RF Connectors

The RF connectors provide the transition from the coaxial transmission line to the planar transmission line structure on the PCB. The transition represents a discontinuity in the ideal 50 $\Omega$ transmission line system and therefore plays a very important role in the RF characteristics. Apart from the good high-frequency characteristics up to at least 30 GHz, the RF connectors should offer a low-cost solution for prototyping i.e., they should be easy to install and should be reusable. The 2.92 mm prototype connectors shown in Fig. 5-2 have good high-frequency characteristics up to 40 GHz and are easy to install and reuse [94].



Fig. 5-2 Pictures of the 2.92 mm (K-type) prototype PCB connectors [94].

### 5.1.2 Layer Stack-Up

As mentioned before, both the high-frequency signals as well as the low-speed digital I/Os and DC connections need to be routed on the PCB. The PCB is split into 2 separate boards i.e., an RF board that is mechanically fixed on another single-sided FR-4 board. The chip is glued on the RF substrate and the chip is wire-bonded to the RF PCB. The bottom layer of the RF substrate serves as the ground plane for the microstrip lines.

A high-frequency substrate is required for the RF PCB for which Isola Astra MT-77 is chosen with a thickness of 127  $\mu$ m. The Isola Astra has a dielectric constant of 3.00 and a low loss tangent of 0.0017 constant up to a frequency of 20 GHz. To aid with the large



Fig. 5-3 Layer stack-up for the PCB. Not drawn to scale.

heat dissipation a 1 mm thick Copper plate is attached below the bottom layer of the RF board. A cavity is made in the RF substrate to house the chip. This is done to make the height of the chip equal to that of the PCB substrate. This helps to reduce the length of the bond wires significantly. The layer stack-up for the PCB is shown in Fig. 5-3.

The depth of the cavity is designed carefully to make sure that the top layer of the PCB is almost at the same height as that of the chip itself. This ensures that the length of the bond wire is not increased because of the difference in the heights. The height of the chip as specified by the foundry is 250-300  $\mu$ m. The depth of the cavity is equal to the sum of the heights of the RF-substrate, the top and the bottom Copper layers of the RF-PCB, and the pre-preg layer i.e., 293  $\mu$ m. As shown in Fig. 5-4, the depth of the cavity is measured to be around 296  $\mu$ m. The chip sits atop a thin layer of glue which makes the height of the chip almost level with the top layer of the RF-PCB.

To reduce the horizontal length of the bond wires, the dimensions of the cavity are chosen not more than 100-150  $\mu$ m larger than the dimensions of the chip on each side. This allows easier placement of the chip using a manual pick and place suction tool. One of the most important features is the vertical plating of the cavity walls. The vertical plating of the walls significantly reduces the inductance of the ground connections as compared to the case where only vias are used to connect the ground connections on the top layer to the RF ground plane on the bottom layer. A common technique to reduce the ground inductance is to draw a metal ring around the cavity. The bond wires for the ground connections are connected to the ground ring which allows a low impedance path to the RF ground plane at the bottom layer of the PCB. Although it does reduce the inductance of the ground bond wire connections, it increases the bond wire inductance of the RF signal tracks because the RF tracks start after a gap of  $S_{min}$  from the edge of the ground ring



Fig. 5-4 Measurement of the depth of the cavity using a 3D microscope.

where  $S_{min}$  is the minimum separation required between any two non-similar tracks as specified by the PCB manufacturer. The width of the ground ring plus the  $S_{min}$  is the unnecessary increase in the length of the bond wires connected to the signal tracks. Thus, in an effort to reduce the bond wire length of the ground connections by using a ground ring outside the cavity, the length of the RF tracks gets increased unnecessarily.

The solution to reducing the bond wire length of both the RF tracks as well as the ground connections is to avoid a complete ground ring as described above. The idea is to extend the RF tracks directly into the plated cavity, which initially creates a short circuit between the RF tracks and the RF ground at the cavity wall. This can be seen in Fig. 5-4, where the RF track on the right side (i.e., the track intersected by the blue marker line) can be seen going directly into the vertical wall of the cavity. The short circuit is removed post-fabrication by running a scalpel along the vertical wall of the cavity under the microscope. This significantly reduces the distance of the RF tracks to the chip edge. The ground ring is only interrupted in the region where the RF tracks merge with the vertical wall of the cavity and continues otherwise to provide a low impedance path for the ground connection. A necessary condition for this kind of connection is the GSSG configuration of the RF connections i.e., the differential signal tracks should be surrounded on both sides by ground connections as seen in Fig. 5-1.

The power supply connections, low-speed I/Os, as well as analog control voltages, are provided using a separate FR-4 PCB that is mechanically mounted on the back side of the RF-PCB. The connections between the boards are made using manual vias made with small pieces of copper wires soldered at both ends. A hole in the middle of the FR-4 PCB allows the placement of a copper heat sink attached to the back side of the RF PCB for heat dissipation. A picture of the FR-4 PCB mounted on the back side of the RF PCB with the heat sink in the middle can be seen in Fig. 5-7.

### 5.1.3 Bond Wire Modeling

The bond wires are used to connect the PCB tracks to the bond pads on the chip. They represent a discontinuity in the ideal 50 $\Omega$ transmission line because they are much thinner (25 $\mu$m in diameter) than the microstrip line signal tracks, have no defined ground plane around or underneath like the microstrip lines, and are surrounded by air as the dielectric medium instead of the substrate dielectric in the case of microstrip lines. The bond wires are usually modeled as inductors using the rule of thumb of approximately 1 nH of inductance per mm of bond wire length. The inductance contributed by the bond wires has the most detrimental effect on the reflection coefficient and hence it needs to be properly modeled. While simple analytical models can be used as proposed in [95], [96], [97], the analytical models have limitations and the best way is to use a 3-dimensional (3D) simulator to model the bond wires along with the PCB and the chip



Fig. 5-5 Parameterized HFSS structure for FEM simulation with the chip on the left side and the PCB on the right side of the gap.

layouts to mimic the real-world scenario as closely as possible. The 3D finite element method (FEM) solver for electromagnetic structures from Ansys Inc. i.e., High-Frequency Structure Simulator (HFSS) was used to simulate the effects of the bond wires. A parameterized structure was drawn which was excited with differential excitation. The bond wires are drawn flat with a small loop height. This corresponds well with the shape of the bond wires that can be drawn with good repeatability and good reliability using the wire-bonding machine available in the institute. The 3D structure for the FEM simulation is shown in Fig. 5-5. The S-parameter simulation results with the *gap* between the cavity edge and the chip edge (and effectively the length of the bond wire) as the sweep parameter are shown in Fig. 5-6. The results show that the length of the bond wires needs to be as short as possible to improve the bandwidth of the high-frequency signal tracks. Note that the length of the bond wire is 65 $\mu$m larger than the parameter *gap*. According to the





Fig. 5-6 HFSS S-parameter sweep simulation results.

simulation results, the return loss is 15 dB at 30 GHz and 20 dB at 15 GHz for a bond wire length of 350 $\mu$m, which is also the expected length of the actual bond wires. For the simulations, the length of the PCB tracks was set to approximately 5 mm, which also corresponds well to the actual drawn length of the coupled section of the coplanar ground microstrip line structures as can be seen in Fig. 5-1.

The simulations model the bond wire interface from the bond pads on the chip to the coupled transmission line segments on the PCB. The transition from the coupled to single-ended transmission line segments and the bend structures visible in Fig. 5-7 were simulated using the 2.5-D Momentum simulator of Keysight's Advanced Design System (ADS).
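For orientation, the 1 nH/mm rule of thumb mentioned at the beginning of this section can be turned into a first-order lumped estimate. The sketch below is only an illustration under stated assumptions: the bond wire is treated as a single ideal series inductor between two 50 $\Omega$ terminations, for which $S_{11} = jX / (2Z_0 + jX)$ with $X = 2\pi f L$. Pad and track capacitances, mutual coupling between the wires, and the differential GSSG arrangement are ignored, and the function names are hypothetical.

```python
import math


def bond_wire_length_um(gap_um, extra_um=65.0):
    """Bond wire length from the swept gap; the text states length = gap + 65 um."""
    return gap_um + extra_um


def bond_wire_inductance_nh(length_um, nh_per_mm=1.0):
    """Rule-of-thumb inductance: about 1 nH per mm of wire length."""
    return nh_per_mm * length_um / 1000.0


def series_inductor_return_loss_db(l_nh, f_ghz, z0=50.0):
    """Return loss of an ideal series inductor between two z0 terminations."""
    reactance = 2.0 * math.pi * f_ghz * l_nh      # GHz * nH gives ohms
    gamma_mag = reactance / math.hypot(2.0 * z0, reactance)
    return -20.0 * math.log10(gamma_mag)


if __name__ == "__main__":
    length = bond_wire_length_um(285.0)           # 285 um gap -> 350 um wire
    inductance = bond_wire_inductance_nh(length)  # about 0.35 nH
    for f_ghz in (15.0, 30.0):
        print(f_ghz, series_inductor_return_loss_db(inductance, f_ghz))
```

For a 350 $\mu$m wire this crude model predicts a return loss of only about 5 dB at 30 GHz, which is considerably more pessimistic than the full-wave results quoted above. The numbers are not expected to agree, since the lumped model omits everything except the wire inductance; this is in line with the stated limitations of simple analytical models.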

### 5.1.4 Transmission Line Structure

For the transmission line structures, microstrip lines and grounded coplanar waveguides (GCPW) were considered. Microstrip lines are easier to design and are often used for moderate bandwidth connections up to microwave frequencies, but they have higher radiation loss at higher millimeter-wave frequencies and poor isolation to neighboring tracks. The grounded coplanar waveguides (GCPW) are the board equivalents of the co-axial transmission lines which feature coplanar ground connections on either side of the signal conductor as well as a ground plane at the bottom. The coplanar ground stripes are connected to the bottom ground plane with the help of closely spaced vias. They have a much-reduced radiation loss at higher frequencies as compared to the microstrip lines and have much better isolation to adjacent high-speed tracks. This makes them well-suited for applications at and above 30 GHz. The downside is the higher insertion loss as compared to the microstrip lines.

The sensitivity of the transmission line characteristics to the variations in the fabrication process is an important consideration regarding the choice of the transmission line structure. For a given substrate height, the only variable for a microstrip line is the width of the signal conductor which defines its characteristic impedance whereas for the GCPW the width of the central conductor, the distance to the coplanar ground stripes as well as the spacing of the vias play a role in the frequency response of the GCPW structure. The surface roughness of the copper layer on the PCB substrates affects both the conductor losses as well as the propagation constant of the transmission line structures. The effective dielectric constant increases with the increase in the surface roughness of the copper layer measured as the root mean square (RMS) roughness value. Moreover, it affects the insertion loss of the transmission line structures by affecting the electric field distribution in the structure. For a tightly coupled GCPW transmission line structure (i.e., with coplanar ground conductors closer to the signal conductor), the electric field is maintained within the coplanar GSG structure and hence the surface roughness has less effect on the characteristic impedance of the GCPW structure as compared to the microstrip lines, for which the electric field tends to concentrate at the bottom of the copper layer, i.e., at the interface with the substrate layer, which is the region where the roughness is located.

The general mode of propagation for microstrip lines is the quasi transverse-electromagnetic (TEM) mode but additional undesired spurious wave propagation modes like hybrid transverse-electric (TE) and hybrid transverse-magnetic (TM) modes are also possible. The mode propagation behavior of the GCPW structure is similar to the microstrip lines [98]. GCPW has a zero cutoff frequency (suitable for wideband applications), but its lowest-order propagation mode is referred to as a quasi-TEM mode because it is not a true TEM mode. At higher frequencies, the field distribution becomes less TEM and more TE in nature with an elliptically polarized magnetic field. GCPW is a printed circuit analog of the three-wire transmission lines [99]. A good comparative study between the microstrip line and GCPW structures with measured results is available in [100], [101].

The overall size of the PCB is $7 \text{ cm} \times 7 \text{ cm}$ and the total length of the high-speed GCPW tracks is approximately 35-40 mm each. The coupled GCPW tracks near the cavity end separate out to single-ended GCPW tracks as can be seen in Fig. 5-7. Note that the coupled GCPW tracks go directly into the cavity, causing a short connection to the plated cavity that is scraped away with a scalpel after fabrication as explained before.



Fig. 5-7 A picture of the RF-PCB on the left side and the FR-4 PCB on the right that is mechanically mounted on the back side of the RF-PCB.

### 5.1.5 S-Parameter Measurement Results

To characterize the performance of the high-frequency tracks, a direct through connection was made between two neighboring differential RF connections. For the current measurement, the left-middle differential track was wire bonded to the top left differential track (only one connector is shown soldered in Fig. 5-7). A 4-port vector network analyzer with true mode stimulus capability was used to measure the mixed-mode S-parameters. The measured S-parameter data was used in an S-parameter simulation in Keysight ADS with differential 100 $\Omega$ ports connected between the two ports forming one differential port. The results are shown in Fig. 5-8. Note that this measurement involves two complete differential RF signal paths in series i.e., instead of measuring one differential path from RF-connectors to the chip, the chip end of the PCB track is wire-bonded to another RF signal path. Moreover, the length of the bond wires is also roughly doubled because instead of going from the RF track to the chip they are going to another RF track which increases their length. The $S_{DD11}$ remains close to -10 dB for most of the frequency range. The $S_{DD21}$ is above -6 dB till 37 GHz except for two small dips at 28 GHz and 32 GHz. The insertion loss, as well as the return loss, is quite good up to the desired frequency of operation of 30 GHz.
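The differential-mode quantities $S_{DD11}$ and $S_{DD21}$ discussed above can also be obtained from single-ended 4-port data using the standard mixed-mode conversion. The sketch below shows this conversion for one frequency point. It is an illustration only and not the procedure used for the measurements, which relied on the true mode stimulus of the VNA and on ADS. The port assignment (single-ended ports 1 and 2 forming differential port 1, ports 3 and 4 forming differential port 2) and the function names are assumptions.

```python
import math


def mixed_mode_dd(s, pos1=0, neg1=1, pos2=2, neg2=3):
    """Differential-to-differential terms from a 4x4 single-ended S-matrix.

    s[i][j] is the single-ended S-parameter from port j+1 to port i+1.
    Returns (S_DD11, S_DD21) for one frequency point.
    """
    sdd11 = 0.5 * (s[pos1][pos1] - s[pos1][neg1] - s[neg1][pos1] + s[neg1][neg1])
    sdd21 = 0.5 * (s[pos2][pos1] - s[pos2][neg1] - s[neg2][pos1] + s[neg2][neg1])
    return sdd11, sdd21


def to_db(value):
    """Magnitude of a (complex) S-parameter in dB."""
    return 20.0 * math.log10(abs(value))


if __name__ == "__main__":
    # Hypothetical example: two uncoupled lines, each with |S21| = 0.5
    s = [[0j] * 4 for _ in range(4)]
    s[2][0] = s[0][2] = 0.5 + 0j
    s[3][1] = s[1][3] = 0.5 + 0j
    sdd11, sdd21 = mixed_mode_dd(s)
    print(abs(sdd11), to_db(sdd21))
```

With the single-ended ports referenced to 50 $\Omega$, the resulting differential ports are referenced to 100 $\Omega$, consistent with the differential 100 $\Omega$ ports mentioned above.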



Fig. 5-8 Mixed-mode S-parameter measurement results with the help of a vector network analyzer (VNA).

## 5.2 Standalone Receiver Baseband Unit-Slice Measurements

This section deals with the measurements of the receiver baseband unit-slice circuit as a standalone integrated circuit without the RF-frontend and any data-sink devices.

### 5.2.1 Measurement Setup

The two main input signals required for the characterization of the receiver baseband test-chip are the PSSS input and the clock input signals. Both signals need to be synchronous, so they are generated using the two synchronous outputs of the Keysight M8194 arbitrary waveform generator (AWG). It has 2 output channels and 2 marker channels. The AWG has a sampling rate of 120 GSa/s and an analog bandwidth of 50 GHz. The AWG has a vertical resolution of 8 bits with an ENOB of 4.7 at 15 GHz and 5.5 at 30 GHz [102]. The measurement setup is shown in Fig. 5-9.

Apart from the PSSS and clock input signals, a synchronous *clock\_enable* signal is also required. This signal is generated using the marker output from the AWG. A sampling oscilloscope (Keysight 86100D DCA-X) with 70 GHz input bandwidth was used to observe the direct output of the correlator and a real-time oscilloscope (Keysight UXR0402) with 40 GHz input bandwidth and 10-bit ADC was used to observe the sampled output of the correlator (after the sample-and-hold circuit) and to plot the eye diagrams. Note that the direct output of the weighted code generator is not available and the only way to test the output of the programmable weighted code generator is to use a constant DC level for the PSSS input and to observe the direct output.



Fig. 5-9 Measurement setup for the standalone measurements.

#### 5.2.1.1 Offset Correction

The first step in the measurements is to correct the offsets in the correlator circuit. As mentioned before, the circuit has a manual offset correction circuit that is adjustable externally using a potentiometer. If the offset is not adjusted properly the integrator output ramps to one or the other extreme output value over time. This will corrupt the correlator output when used with the PSSS input.

The correlator has 2 inputs i.e., the external PSSS input supplied by the arbitrary waveform generator (AWG) and the internally generated code generator output. If the PSSS input is set to all 0's, and the code-generator is programmed to generate all 1's or -1's, the output should ideally remain perfectly horizontal. However, if there is an offset, the output will ramp slowly to the positive or the negative side. The offset can be corrected using the manual offset correction circuit explained in section 3.5.2. While observing the direct output of the correlator on an oscilloscope, the offset is adjusted manually until a horizontal output waveform is obtained.

After this correction, the code-generator is programmed to generate all 0's (i.e., $\pm LSB/2$) and the PSSS input is set to all 1's or -1's. The correlator output should again be a perfectly horizontal signal. However, if the output ramps to the positive or the negative side, the differential offset setting of the AWG can be used to adjust the offset until a horizontal signal is obtained.

The measured direct output waveform of the correlator in Fig. 5-10 shows the correlator output with the code-generator set to all 1's and the PSSS input set to positive and negative m-sequences with sequences of 0's in between. The horizontal parts in the output signal show that the offset has been corrected. Notice that there is a slight difference in the peak signal amplitudes in the positive and the negative m-sequences. This is due to a small offset in the output stage for which there is no compensation possible in the chip.
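The effect of an uncorrected offset on the integrator output can be illustrated with a simple behavioral model. The sketch below is a strongly simplified assumption and not a model of the actual circuit: the correlator is represented as an ideal discrete-time integrator of the product of the PSSS input and the decoding sequence, with a single additive offset at the integrator input. All names and numerical values are hypothetical.

```python
def correlator_output(psss, code, offset=0.0):
    """Running integral of psss*code plus a constant offset (behavioral model)."""
    acc = 0.0
    trace = []
    for p, c in zip(psss, code):
        acc += p * c + offset
        trace.append(acc)
    return trace


def ramp_slope(trace):
    """Average slope per chip of the output trace."""
    return (trace[-1] - trace[0]) / (len(trace) - 1)


if __name__ == "__main__":
    n_chips = 100
    psss = [0.0] * n_chips        # PSSS input set to all 0's
    code = [1.0] * n_chips        # code generator set to all 1's
    raw = correlator_output(psss, code, offset=0.01)
    trim = -ramp_slope(raw)       # correction that cancels the observed ramp
    corrected = correlator_output(psss, code, offset=0.01 + trim)
    print(ramp_slope(raw), ramp_slope(corrected))
```

In this model the slope of the output ramp equals the offset, so adjusting the correction until the trace is horizontal corresponds to the manual procedure described above.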



Fig. 5-10 Test sequence for the detection and correction of offset

#### 5.2.1.2 Inter-Chip Synchronization

A very important aspect of the system design is the exact phase alignment between the external PSSS input signal and the locally generated decoding sequence. The phase alignment involves both the *inter-chip* and *intra-chip* phase alignment. The inter-chip phase alignment means the alignment of the external PSSS input with the on-chip generated *integrate/ reset* command signal that defines the correlation and reset phases of the correlator circuit. Note that the position of the integrate/ reset signal depends on the external *clk\_enable\_cmd* signal which should be synchronous to the external PSSS signal. If the system is not aligned in the beginning, the PSSS signal should be cyclically shifted by 1 chip before transmission. This process has to be repeated until the PSSS signal is aligned with the on-chip *integrate/ reset* command signal.

If the PSSS input and the on-chip generated integrate/ reset command signals are not aligned properly as shown in the top part of Fig. 5-11, the PSSS sequence is only partly correlated with the decoding sequence. Note that the guard interval also contains the same content as the start of the PSSS sequence. Some parts of the PSSS sequence will, however, not be correlated as it will land on the reset phase of the correlator. Thus, inter-chip alignment is a system-level synchronization issue that is commonly dealt with by using certain alignment sequences before transmitting the payload or the training data. The alignment sequences can take many different forms depending on the hardware capabilities of the link.

For the prototype chip, the alignment was achieved by observing the direct output of the correlator using a sampling oscilloscope. The alignment sequence of choice for the prototype chip was the signal  $\{1 -1 \ 0 \ 0 \ 0 \ 0 \ 1 -1 \ 0 \ 0 \ 1 \ 0 \ 0 \ 0\}$  followed by  $\{0 \ 0 \ 0\}$  during the guard interval applied as the external PSSS signal. The weighted code generator was programmed to generate all 1's. The product of the decoding sequence with the PSSS



Fig. 5-11 Inter-chip alignment between the external PSSS input signal and the on-chip generated integrate/ reset signal. In the misaligned case on the top, the external PSSS symbol is not correlated correctly whereas, in the aligned case on the bottom, the whole PSSS symbol will be used for correlation.



Fig. 5-12 The alignment sequence used for checking the alignment of the PSSS signal with the decoding sequence.

sequence, in this case, is the same alignment sequence whose integrated version is available at the output as shown in Fig. 5-12. This constitutes the testing for the inter-chip alignment. The intra-chip alignment was used to fine-tune the alignment by observing the output on the oscilloscope. The intra-chip alignment was performed by using the on-chip VCDL circuit discussed in section 4.3 as well as an external static phase shifter.
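The expected correlator trace for the alignment test can be sketched as a running sum. The example below uses the alignment sequence exactly as printed in the text, followed by the three guard-interval chips, and a decoding sequence of all 1's. It assumes an ideal integrator with unit gain, and the helper names are hypothetical. A cyclic shift by one chip is included to show that a misaligned symbol produces a different trace.

```python
ALIGN_SEQ = [1, -1, 0, 0, 0, 0, 1, -1, 0, 0, 1, 0, 0, 0]  # as printed in the text
GUARD = [0, 0, 0]                                          # guard interval chips


def integrate(psss, code=None):
    """Ideal running integral of psss*code; code defaults to all 1's."""
    if code is None:
        code = [1] * len(psss)
    acc = 0
    trace = []
    for p, c in zip(psss, code):
        acc += p * c
        trace.append(acc)
    return trace


def cyclic_shift(seq, chips=1):
    """Rotate the sequence to the right by the given number of chips."""
    chips %= len(seq)
    return seq[-chips:] + seq[:-chips] if chips else list(seq)


if __name__ == "__main__":
    aligned = integrate(ALIGN_SEQ + GUARD)
    shifted = integrate(cyclic_shift(ALIGN_SEQ, 1) + GUARD)
    print(aligned)
    print(shifted)
```

The integrated trace consists of short pulses where a 1 is immediately followed by a -1 and a step where a single 1 occurs, which gives a pattern that is easy to identify on the oscilloscope.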

### 5.2.2 Testing with BPSK Data

After the offset correction and synchronization and following the functional verification of the PSSS receiver baseband by observing the correlator output with the PSSS input set to a constant DC level, the circuit performance with the actual PSSS data was tested. For this purpose, BPSK modulated data was encoded with unipolar m-sequences to generate the PSSS waveform which was then used as the input for the AWG and the programmable code generator was configured to generate a bipolar m-sequence. The decoded BPSK data can be seen at the output of the correlator by observing the sampled value of the correlator at the output of the S/H circuit. Note that for the BPSK data, the PSSS amplitude set has the following values  $\{0, \pm 1, \pm 2, \pm 3, \pm 4\}$ . An eye diagram for the said case can be seen in Fig. 5-13. The circuit works very well with PSSS data generated using BPSK encoded data up to 20 Gbps [78]. The circuit was observed to work fine with a lower supply voltage down to -3.5 V from the nominal value of -4V and with lower PSSS signal amplitudes (from 800 mV<sub>diff</sub> down to 200 mV<sub>diff</sub>) as well.
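The encoding and decoding principle used in this test can be sketched as follows. This is a minimal sketch under stated assumptions: a length-15 m-sequence is hard-coded (the specific sequence is hypothetical and need not be the one used on the chip), each data symbol weights one cyclic shift of the unipolar sequence, and the receiver correlates with the corresponding cyclic shift of the bipolar sequence. The guard interval and all analog effects are ignored.

```python
# One period of a length-15 m-sequence in unipolar form (8 ones, 7 zeros).
# Assumption: this particular sequence is only an example.
M_SEQ = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1]
N = len(M_SEQ)


def psss_encode(symbols):
    """Sum of cyclically shifted unipolar m-sequences weighted by the symbols."""
    if len(symbols) != N:
        raise ValueError("expected one symbol per cyclic shift")
    return [sum(symbols[k] * M_SEQ[(n - k) % N] for k in range(N))
            for n in range(N)]


def psss_decode(chips):
    """Correlate with every cyclic shift of the bipolar m-sequence."""
    bipolar = [2 * u - 1 for u in M_SEQ]
    return [sum(chips[n] * bipolar[(n - k) % N] for n in range(N))
            for k in range(N)]


if __name__ == "__main__":
    bpsk = [1, -1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1, -1]
    chips = psss_encode(bpsk)
    decoded = psss_decode(chips)
    print(chips)
    print([d // 8 for d in decoded])  # correlation peak is 8 times the symbol
```

Because the cross-correlation between the unipolar and the bipolar m-sequence is 8 at zero shift and 0 at all other shifts, each correlator recovers its own symbol without interference from the others. For BPSK symbols the chip values are even integers between -8 and 8, which after normalization by 2 gives the amplitude set $\{0, \pm 1, \pm 2, \pm 3, \pm 4\}$ stated above.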



Fig. 5-13 Eye diagram for the BPSK data as input for the PSSS stream.

### 5.2.3 Testing with PAM-4 Data

When PAM-4 modulated data is encoded with unipolar m-sequences and the programmable code generator is configured to generate a bipolar m-sequence, then the decoded PAM-4 data can be seen at the output of the correlator by observing the sampled value of the correlator at the output of the S/H circuit. Note that for the PAM-4 data, the PSSS amplitude set has the following values  $\{0, \pm 1, \pm 2, \dots, \pm 12\}$ . An eye diagram for the said case can be seen in Fig. 5-14. Note that the Keysight M8194 AWG has an ENOB of 4.7 at 15 GHz and 5.5 at 30 GHz [102]. For PAM-8, the PSSS amplitude set has the following values  $\{0, \pm 1, \pm 2, \dots, \pm 28\}$  which cannot be generated with the current AWG owing to the limited ENOB. The PSSS receiver baseband unit-slice circuit was observed to work very well with PAM-4 data up to 20 Gbps. This verifies not only the very good linearity of the circuit but also the good high-frequency performance of the circuit and the PCB. Note that a big improvement in the performance is expected if proper fixture de-embedding is performed by using the AWG de-embedding feature to equalize the frequency response degradation caused by the cables, connectors, and bond wires. The proper way to de-embed is to connect the AWG output to the PCB connectors on one end and to connect the oscilloscope inputs using the RF-probe tips to some dummy pad structures that are wire-bonded to the PCB tracks. This way the whole fixture can be de-embedded which will ensure that the signal output at the dummy pad structures is properly compensated. This would further improve the output eye diagram even with reduced supply voltage.
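The amplitude sets quoted above for BPSK, PAM-4, and PAM-8 can be reproduced with a short calculation. The sketch assumes a length-15 m-sequence with 8 ones per period, so that every PSSS chip is the sum of 8 data symbols, symbol alphabets of the form $\pm 1, \pm 3, \dots, \pm(M-1)$, and a normalization of the chip values by 2. The comparison with the ENOB of the AWG is only a rough plausibility check, since the ENOB is not identical to the number of resolvable amplitude levels. The function names are hypothetical.

```python
import math


def psss_peak_level(pam_levels, ones_per_code=8):
    """Largest normalized PSSS amplitude for M-level PAM symbols."""
    return ones_per_code * (pam_levels - 1) // 2


def psss_level_count(pam_levels, ones_per_code=8):
    """Number of distinct PSSS amplitude levels including zero."""
    return 2 * psss_peak_level(pam_levels, ones_per_code) + 1


def bits_for_levels(level_count):
    """Resolution in bits needed to distinguish the given number of levels."""
    return math.log2(level_count)


if __name__ == "__main__":
    for name, m in (("BPSK", 2), ("PAM-4", 4), ("PAM-8", 8)):
        count = psss_level_count(m)
        print(name, psss_peak_level(m), count, round(bits_for_levels(count), 2))
```

Under these assumptions, PAM-4 needs 25 levels (about 4.6 bits) whereas PAM-8 needs 57 levels (about 5.8 bits), which exceeds both ENOB values quoted for the AWG.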

The circuit could be operated reliably and with very high repeatability with a clock frequency of up to 20 GHz. The measured eye diagram for the case of PAM-4 data is shown in Fig. 5-14 which shows good linearity of the PAM-4 eye as well as a clear vertical eye opening [78]. For clock frequencies above 20 GHz, the output of the one-hot pulse generation circuit (available as the output of the $18^{th}$ DFF, see Fig. 4-13) was observed to be dying out i.e., the one-hot pulse could be seen on the oscilloscope initially for some time after the clock signal was enabled but after some time it could not consistently propagate cyclically and the output was seen to be stuck at zero. Since the one-hot pulse serves as the select signal for the analog mux, no repeatable measurements could be made for clock frequencies above 20 GHz. Using PAM-4 data with a chip rate of 20 GHz allows for a data rate of 2.22 Gbps per unit-slice. Using 15 copies of the unit-slice circuits as proposed in the system architecture, a total data rate of 33.33 Gbps can be achieved. With the I-Q receiver baseband as proposed in chapter 1, the net data rate will be doubled i.e., 66.66 Gbps. Lastly, the use of a 2×2 LOS MIMO discussed in chapter 1, increases the net data rate by another factor of 2 to 133.33 Gbps.
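The data rate figures above follow from a short calculation, sketched below. The value of 18 chip periods per PSSS symbol is an assumption inferred from the 15 unit-slices, the three guard-interval chips, and the 18-stage one-hot pulse generator. PAM-4 carries 2 bits per symbol. The function names are hypothetical.

```python
def unit_slice_rate(chip_rate_hz, bits_per_symbol, chips_per_symbol=18):
    """Data rate of one unit-slice: one symbol per PSSS symbol period."""
    return chip_rate_hz / chips_per_symbol * bits_per_symbol


def system_rate(chip_rate_hz, bits_per_symbol, slices=15, iq=True, mimo_streams=2):
    """Aggregate rate for parallel slices, optional I/Q, and MIMO streams."""
    rate = unit_slice_rate(chip_rate_hz, bits_per_symbol) * slices
    if iq:
        rate *= 2
    return rate * mimo_streams


if __name__ == "__main__":
    chip_rate = 20e9
    print(unit_slice_rate(chip_rate, 2) / 1e9)                            # one slice
    print(system_rate(chip_rate, 2, iq=False, mimo_streams=1) / 1e9)      # 15 slices
    print(system_rate(chip_rate, 2, iq=True, mimo_streams=1) / 1e9)       # I and Q
    print(system_rate(chip_rate, 2, iq=True, mimo_streams=2) / 1e9)       # 2x2 MIMO
```

With these assumptions the calculation gives approximately 2.22 Gbps, 33.33 Gbps, 66.67 Gbps, and 133.33 Gbps, in agreement with the values stated in the text up to rounding.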



Fig. 5-14 Eye diagram for the recovered data for the case of PAM-4 data as input for the PSSS sequences.

## 5.3 System Integration

The baseband circuit works very well as a standalone circuit, as demonstrated by the measurement results in section 5.2. The next step towards a complete PSSS communication system is to perform transmission experiments with the RF-frontend and carrier recovery circuits. The integration of the baseband circuit with the RF-frontend and the carrier recovery circuit is presented in Fig. 5-15. In [57], [58] transmission experiments are explained which make use of real RF frontends in a hardware-in-the-loop (HiL) scenario to implement a PSSS-based communication system. The transmission experiment in [58] uses the 240 GHz RF frontend from the project Millilink [103] to achieve 20 Gbps with a bit error rate (BER) of $5.4 \times 10^{-5}$. The experiment uses an AWG to generate the PSSS waveform and a digital storage oscilloscope (DSO) as the receiver, using PSSS modulation with a spectral efficiency of 1 bps/Hz. The chip-level synchronization, channel equalization, and demodulation are performed offline using MATLAB/Simulink. In [57], the 230 GHz RF frontend from the project Real100G.RF is used to transmit 80 Gbps using PSSS with a spectral efficiency of 4 bps/Hz. In [57] also, an AWG is used to generate the PSSS waveform, whereas the demodulation of the data, the channel equalization, and the synchronization are performed offline after sampling and storing the data using a high sampling rate DSO. Thus, using these RF-frontends with the mixed-signal PSSS receiver baseband unit-slice chip, a complete system interface can be developed to perform real-time transmission experiments using carrier recovery and baseband components. The complete measurement setup for the system integration of the mixed-signal PSSS receiver baseband with the RF-frontend and the Costas loop is sketched in the block diagram in Fig. 5-15.

The outputs from the RF-frontend are single-ended signals that are converted to differential signals using balanced-to-unbalanced (balun) circuits. The proposed carrier recovery system is based on a BPSK Costas loop and, therefore, allows the use of only the I channel instead of both I and Q channels. The differential I signal is divided into two parts using an active differential power splitter circuit, which routes one part of the signal to a BPSK Costas loop that generates the single-ended 20 GHz local oscillator reference signal for the RF-frontend as well as a differential 30 GHz clock signal for the mixed-signal PSSS receiver baseband. The other output from the active power splitter circuit goes to the mixed-signal PSSS receiver baseband. The PSSS baseband signal needs to be amplified before it is applied to the baseband circuit. For the amplification, an off-the-shelf broadband amplifier module can be used. However, no differential broadband amplifier modules were available with the required specifications and hence baluns were used to perform the required differential to single-ended and reverse conversions.

Fig. 5-15 Complete measurement setup for system integration of mixed-signal PSSS receiver baseband with RF-frontend and Costas loop for carrier recovery.

Based on the transmission experiment results from [57] and [58], it can reasonably be expected that the mixed-signal PSSS receiver baseband will work well with the RF-frontend if a synchronous clock signal can be recovered and provided to the mixed-signal PSSS RX-BB. The interface from the baseband circuit to the FPGA was tested separately using a test-chip. The test-chip for the FPGA interface includes two copies of an on-chip bit-sequence generator, a scrambler circuit, and a 4:1 mux to emulate two unit-slices of the baseband circuit, each transmitting data at a rate of 6.667 Gbps. The connection to the FPGA could be established successfully and the two lanes of data could be aligned perfectly. The complete PSSS system integration and characterization were not undertaken due to limited time and budgetary resources.

## 5.4 Application as a High-Resolution Ranging Radar

An interesting idea is to use the mixed-signal PSSS receiver baseband circuit as a high-resolution ranging radar circuit. M-sequence radars are commonly used for distance and ranging applications. They offer the advantages of accuracy and high resolution. The resolution of the m-sequence radar is related to the m-sequence signal bandwidth $B$ according to

$$\Delta d = \frac{c_o}{2\sqrt{\epsilon_r}\,B} \tag{5-2}$$

where $c_o = 3 \times 10^8$ m/s is the speed of light in vacuum, and $\epsilon_r$ is the relative dielectric constant of the medium, e.g., air [104]. The m-sequences are broadband signals with bandwidth $B$ up to $f_{chip}$. To improve the resolution of the m-sequence radar, the chip rate $f_{chip}$ of the m-sequences must be increased. The well-known architecture of the broadband m-sequence radar with a sub-sampling receiver is shown in Fig. 5-16 [104]. The TX consists of an m-sequence generator, an optional power amplifier (PA), and a TX antenna. The RX consists of an RX antenna followed by a low-noise amplifier (LNA). A low-pass filter (LPF) is used for anti-aliasing and to remove high-frequency noise. Since the signal (m-sequence) is periodic, a sub-sampling analog-to-digital converter (ADC) operating at $f_{chip}/N$ can be used. The correlation is performed using a digital signal processor (DSP) to calculate the delay [104].

Fig. 5-16 Broadband m-sequence radar architecture.

In the proposed radar ranging system in Fig. 5-17, a short m-sequence of length 15 is used with a high chip rate of 20 Gcps with a mixed-signal receiver (RX) baseband (BB). The use of such a high chip rate requires up-conversion to a higher frequency band. The required bandwidth of approximately 20 GHz is available in the V-band (40-75 GHz), or terahertz frequency bands e.g., around 240 GHz. An RF frontend is required to provide the necessary up and down-conversion. The down-converted signal is correlated with the reference m-sequence in the receiver baseband circuit. The use of the large bandwidth of 20 GHz available in the V-band (40-75 GHz) or at 240 GHz allows reducing the detection resolution down to 7.5 mm according to Equation (5-2).
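As a quick check of the 7.5 mm figure, Equation (5-2) can be evaluated under the assumption that the full chip rate is available as signal bandwidth, $B = f_{chip} = 20$ GHz, and that the medium is air with $\epsilon_r \approx 1$:

$$\Delta d = \frac{c_o}{2\sqrt{\epsilon_r}\,B} = \frac{3 \times 10^{8}\ \text{m/s}}{2 \times \sqrt{1} \times 20 \times 10^{9}\ \text{Hz}} = 7.5\ \text{mm}$$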

If the architecture in Fig. 5-16 is used, the high chip rate of the m-sequence would require extremely high BW and a moderately high resolution for the ADC. An alternate approach is to perform the correlation in the analog domain using a broadband analog correlator. An example is the system in [105], where the correlation is split into the multiplicative part at the RF circuit level, the integration part using the analog BB, and the frequency correction in the DSP. The chip rate of 3.2 Gcps is used in [105] resulting in a resolution of 93 mm.

The transmission of m-sequences through the RF channel causes signal distortions due to RF channel impairments and non-idealities of the RF frontend. To perform channel equalization, the m-sequence generated in the RX BB can be weighted, i.e., instead of binary values $\{\pm 1\}$ the chip values can be weighted coefficients determined using some training data. The reader is referred to the measurement results in section 5.5 to evaluate the performance of the channel equalization.

Fig. 5-17 Proposed m-sequence radar architecture with mixed-signal radar receiver baseband for high-resolution ranging.

In the proposed radar system in Fig. 5-17, a weighted code generator in the RX BB provides a reference m-sequence to perform the correlation with the received (reflected) m-sequence. The correlator output is sampled at the end of the correlation period, which is equal to the length of the coding sequence. If the received m-sequence is perfectly aligned with the locally generated m-sequence in the receiver, the sampled output of the correlator is maximized. Thus, the aim is to monitor the sampled correlator output for the maximum output amplitude. In each successive transmission, the transmitter sends a version of the m-sequence that is cyclically shifted by 1 $T_{chip}$, until the received m-sequence perfectly aligns with the reference m-sequence in the RX BB and the maximum correlator output is obtained. Since the m-sequence inside the receiver is not shifted, the total number of shifts required to obtain the peak correlator output plus the delay incurred in receiving the reflected signal (measurable with a digital counter clocked at $f_{chip}$) corresponds to twice the distance (or an integer multiple of it) to the target object. After determining the number of shifts required to obtain the peak output, an external static phase delay circuit or the on-chip voltage-controlled delay line (VCDL) circuit can be used for the sub-$T_{chip}$ alignment of the received m-sequence with the on-chip reference m-sequence. Thus,

$$\text{Total distance} = (\text{TX m-seq shifts} + \text{Time of flight} + \text{VCDL delay value}) \times \frac{c_o}{2\sqrt{\epsilon_r}}$$
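A minimal sketch of this distance calculation is given below. It assumes that the number of TX shifts and the counter reading are both expressed in chip periods and are converted to time by multiplying with $T_{chip} = 1/f_{chip}$, and that the VCDL contribution is already known as a time delay. These unit conventions, the function name, and the example values are assumptions made for illustration only.

```python
import math

C0 = 3e8  # speed of light in vacuum in m/s


def target_distance(tx_shifts, counter_cycles, vcdl_delay, chip_rate=20e9, eps_r=1.0):
    """Distance to the target in metres from coarse and fine delay readings.

    tx_shifts      -- number of 1-T_chip cyclic shifts applied at the transmitter
    counter_cycles -- time-of-flight counter reading in chip-clock cycles (assumed unit)
    vcdl_delay     -- fine delay set by the VCDL in seconds (assumed to be calibrated)
    """
    t_chip = 1.0 / chip_rate
    total_delay = (tx_shifts + counter_cycles) * t_chip + vcdl_delay
    # The factor of 2 accounts for the round trip to the target and back.
    return total_delay * C0 / (2.0 * math.sqrt(eps_r))


# Hypothetical readings: 3 shifts, 10 counter cycles, 25 ps of VCDL delay.
print(f"{target_distance(3, 10, 25e-12) * 1e3:.2f} mm")
```

A single chip period at 20 Gcps corresponds to 7.5 mm in this model, so the VCDL term provides the resolution below that step.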

### 5.4.1 Measurement Results

The two input signals required for the characterization of the m-sequence radar (using the PSSS RX BB test-chip) are the (reflected) m-sequence input and the clock input for the on-chip reference code generator. Both signals were generated using the Keysight M8194 arbitrary waveform generator (AWG), which has 2 output channels, a high sampling rate of 120 GSa/s, an analog bandwidth of 50 GHz, and a vertical resolution of 8 bits with an ENOB of 4.7 at 15 GHz. The measurement setup for the radar system is shown in Fig. 5-18. A wide-bandwidth (70 GHz) sampling oscilloscope (Keysight DCA-X 86100D) was used to observe both the instantaneous and the sampled output of the correlator. Note that the direct output of the weighted code generator is not available and cannot be tested directly. To verify the correlation behavior of the RX BB test-chip as an m-sequence radar, m-sequences delayed by integer multiples of $T_{chip}$ were applied at the radar baseband input using the AWG. When both m-sequences are perfectly aligned, i.e., no skew, the multiplier output is a series of fifteen 1's, which generates a ramp output upon integration. When the two m-sequences are skewed by a non-zero multiple of $T_{chip}$ w.r.t. each other, the output is a small number corresponding to -1/15 of the aligned peak. The direct instantaneous output of the correlator is shown in Fig. 5-19 and the sampled output of the S/H circuit is shown in Fig. 5-20.

Fig. 5-18 Measurement setup with the microphotograph of the m-sequence radar receiver BB test-chip (1.85 mm $\times$ 2 mm) in the inset.
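The ideal correlation values described above can be reproduced with a small behavioral model. The sketch assumes a length-15 bipolar m-sequence generated by the primitive polynomial $x^4 + x + 1$ (the polynomial used on the chip is not stated here) and an ideal multiplier and integrator without bandwidth limits. The function names are hypothetical.

```python
def bipolar_m_sequence():
    """Length-15 bipolar m-sequence; polynomial x^4 + x + 1 is assumed."""
    seq = [1, 0, 0, 0]
    while len(seq) < 15:
        seq.append(seq[-4] ^ seq[-3])
    return [2 * bit - 1 for bit in seq]  # map {0, 1} to {-1, +1}


def correlator_ramp(skew):
    """Ideal integrator output after each chip for a given skew in chip periods."""
    ref = bipolar_m_sequence()
    n = len(ref)
    ramp, acc = [], 0
    for i in range(n):
        # Multiply the skewed input chip with the reference chip, then integrate.
        acc += ref[(i - skew) % n] * ref[i]
        ramp.append(acc)
    return ramp


for skew in range(15):
    final = correlator_ramp(skew)[-1]  # value sampled by the S/H circuit
    print(f"skew {skew:2d} T_chip -> sampled output {final:+d}")
```

For zero skew the running sum is a linear ramp that ends at 15, whereas every non-zero integer skew ends at -1, i.e., -1/15 of the peak.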

The numbers in the two figures indicate the skew (in multiples of $T_{chip}$) between the external (reflected) m-sequence and the on-chip m-sequence generated by the code generator. It can be seen that only for the case of 0 $T_{chip}$ skew between the received m-sequence and the on-chip generated m-sequence is the output a monotonic linear ramp that reaches a large amplitude compared to the amplitudes for non-zero $T_{chip}$ skew (see Fig. 5-19).



Fig. 5-19 The instantaneous differential output of the correlator.

Note that what matters is the final amplitude at the end of the correlation period and not the instantaneous value of the correlator. The final amplitude at the end of the correlation period is available as a sampled value at the output of the S/H circuit, as shown in Fig. 5-20. Here again, it can be seen that only for the case of 0 $T_{chip}$ skew does the sampled output have a large amplitude, whereas for any other case the output has a small amplitude. Thus, the delay encountered by the reflected m-sequence can be calculated based on the number of shifts required by the on-chip code generator to produce a large output. Note that even in the case of a sub-$T_{chip}$ skew the peak output is still easily discernible, and it can be maximized with the help of the on-chip VCDL circuit. This is shown by the instantaneous correlator output as a function of the sub-$T_{chip}$ skew in Fig. 5-21.

Fig. 5-20 The sampled differential output of the correlator (S/H output).

Fig. 5-21 Instantaneous output of the correlator circuit as a function of the sub-$T_{chip}$ skew.

The skew between the two signals is changed with the help of the VCDL circuit. The result in Fig. 5-21 is an overlay of five results plotted in the same window. In any given radar measurement, only one of the outputs will be obtained. The VCDL output signal is adjusted until a large amplitude is obtained. Using the control voltage value of the VCDL circuit required for obtaining the largest amplitude, the total delay encountered by the reflected m-sequence signal can be calculated as the sum of the coarse delay in multiples of $T_{chip}$ and the fine delay value based on the VCDL control voltage.

## 5.5 Channel Equalization Test

One of the key features of the mixed-signal PSSS receiver baseband architecture is channel equalization using chip weighting. To evaluate the efficiency of the channel equalization, observe the measured output signal of the S/H circuit in Fig. 5-20. The result shows a high cross-correlation value for the case when the external m-sequence (from AWG) is perfectly aligned (i.e., 0  $T_{chip}$  skew) with the on-chip reference m-sequence signal generated using the programmable weighted code generator circuit.

The cross-correlation result for the case of non-zero  $T_{chip}$  skew is small but still quite significant. This is caused by the signal impairments due to low-pass filtering of the RF cables, the adapters, and the coplanar microstrip lines on the PCB, as well as due to the reflections caused by discontinuities such as the coplanar-to-microstrip transition on the PCB, the bond wires, and the transition to microstrip lines on the chip. To perform the channel equalization, the matrix operations listed in section 2.5.2 were used with cyclically shifted bipolar m-sequences as the training sequences. The measurement results in Fig. 5-22 show the normalized amplitude of the sampled correlation result of two bipolar m-sequences with and without channel equalization.



Fig. 5-22 Measurement results of the bipolar m-sequence correlation with and without channel equalization.

To see the effectiveness of the channel equalization, the measurement in Fig. 5-20 was repeated, with weighted m-sequences used as the reference m-sequence (or the decoding sequence). The measurement result in Fig. 5-23 shows improved correlation results using channel equalization. This is the first reported result for channel equalization using chip weighting for a mixed-signal PSSS receiver baseband.

[Oscilloscope screenshot: sampled S/H output, with the traces annotated by the number of shifts (0 to 14, in multiples of $T_{chip}$) between the external input m-sequence and the m-sequence generated on chip using the weighted code generator.]

Fig. 5-23 The sampled differential output of the correlator (S/H output) showing the correlation of bipolar m-sequences with channel equalization using an on-chip weighted code generator.
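The principle of channel equalization by chip weighting can be illustrated with a generic zero-forcing sketch. This is not the set of matrix operations from section 2.5.2, which is not reproduced here. The sketch assumes a cyclic three-tap FIR channel with invented tap values, a bipolar m-sequence from the polynomial $x^4 + x + 1$, and training with all 15 cyclic shifts. The weights are chosen so that the correlation of each received training sequence with the weighted reference equals the ideal m-sequence autocorrelation. All names are hypothetical.

```python
from fractions import Fraction


def bipolar_m_sequence():
    """Length-15 bipolar m-sequence; polynomial x^4 + x + 1 is assumed."""
    seq = [1, 0, 0, 0]
    while len(seq) < 15:
        seq.append(seq[-4] ^ seq[-3])
    return [2 * bit - 1 for bit in seq]


def channel(seq, taps):
    """Cyclic FIR channel model (a stand-in for cables, PCB lines, and bond wires)."""
    n = len(seq)
    return [sum(t * seq[(i - d) % n] for d, t in enumerate(taps)) for i in range(n)]


def correlate(rx, reference):
    """Sampled correlator output: sum of chip-wise products."""
    return sum(r * c for r, c in zip(rx, reference))


def solve(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination (exact fractions)."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[-1] for row in m]


def equalizing_weights(taps):
    """Return the received training sequences and the weighted reference chips."""
    ref = bipolar_m_sequence()
    n = len(ref)
    # Training data: every cyclic shift of the m-sequence sent through the channel.
    received = [channel(ref[-k:] + ref[:-k], taps) for k in range(n)]
    # Target: ideal autocorrelation, 15 for zero skew and -1 otherwise.
    target = [Fraction(n)] + [Fraction(-1)] * (n - 1)
    return received, solve(received, target)


taps = [Fraction(1), Fraction(2, 5), Fraction(1, 5)]  # invented channel taps
received, weights = equalizing_weights(taps)
reference = bipolar_m_sequence()
for skew, rx in enumerate(received):
    raw = float(correlate(rx, reference))
    eq = float(correlate(rx, weights))
    print(f"skew {skew:2d}: unweighted {raw:+6.2f}, weighted {eq:+6.2f}")
```

In this model the unweighted reference shows sizeable off-peak correlation values for some skews, while the weighted reference restores the ideal two-valued correlation. A real channel is neither cyclic nor noise-free, so the exact inversion shown here should be read as an idealization.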

## 5.6 List of Relevant Publications of the Author

- [69] A. R. Javed et al., "Real100G.com," in Wireless 100 Gbps and Beyond, R. Kraemer and S. Scholz, Eds., Frankfurt (Oder), Germany, IHP - Innovations for High Performance Microelectronics, 2020, pp. 231-294.
- [77] A. R. Javed and J. C. Scheytt, "M-Sequence Radar for High Resolution Ranging with Mixed-Signal Radar Receiver Baseband Using 130nm SiGe BiCMOS Technology," in 2020 17th European Radar Conference (EuRAD), Utrecht, 2020.
- [78] A. R. Javed and J. C. Scheytt, "Mixed-Signal Receiver Baseband Slice for High-Data-Rate Communication Using 130 nm SiGe BiCMOS Technology," in 64th International Midwest Symposium on Circuits and Systems (MWSCAS 2021), East Lansing, 2021.

## 5.7 Summary

The measurement setup and characterization of the mixed-signal PSSS receiver (RX) baseband (BB) slice are discussed in this chapter. To characterize the RX BB test-chip, a high-speed printed circuit board (PCB) was designed and the chip was wire-bonded to the PCB. The design of the PCB, the modeling of the bond wires, the overall measurement setup, the power dissipation, and the measurement results are discussed in this chapter.

For the standalone tests of the baseband test-chip, an arbitrary waveform generator (AWG) was used as the PSSS waveform source. The AWG has a vertical resolution of 8 bits with an effective number of bits (ENOB) specification of 4.7 at 15 GHz and 5.5 at 30 GHz. The eye diagram for the PAM-4 case is symmetric and has clear vertical eye-openings, which demonstrates the very good linearity and high-speed performance of the circuit. Note that for the chosen system parameters, the PSSS amplitude set corresponding to PAM-4 data is $\{0, \pm 1, \pm 2, \dots, \pm 12\}$. PAM-8 and higher constellations cannot be tested owing to the limited ENOB of the AWG. The circuit could be operated reliably and with very high repeatability with a clock frequency of up to 20 GHz, beyond which the analog mux did not work reliably and the one-hot pulse generation circuit, which generates the select signal for the analog mux, did not propagate the pulse cyclically. Using PAM-4 with a chip rate of 20 GHz allows for a data rate of 2.22 Gbps. Using 15 copies of the unit-slice circuit as proposed in the system architecture, a total data rate of 33.33 Gbps can be achieved. With the I-Q receiver baseband as proposed in chapter 1, the net data rate is doubled, i.e., 66.66 Gbps. Lastly, the use of a 2×2 LOS MIMO, discussed in chapter 1, increases the net data rate by another factor of 2 to 133.33 Gbps. A complete setup for the system integration of the baseband circuit with the RF-frontend and carrier recovery circuit is presented at the end.

An interesting application of the receiver baseband chip as a high-resolution ranging radar is presented along with measured results that would allow distance resolution down to 7.5 mm.

# 6 Conclusions and Outlook

The system design of a high-speed wireless communication system for mobile internet access is discussed in this thesis, with emphasis on the circuit design of an ultrawideband baseband circuit. From the viewpoint of the baseband circuit design, the use of a single RF carrier with spread spectrum communication is a better choice than a multi-carrier approach such as OFDM. The current project explores the use of a large contiguous chunk of 50 GHz RF bandwidth around the carrier frequency of 240 GHz with Double Sideband Suppressed Carrier (DSB-SC) modulation. The use of the large bandwidth allows the spectral efficiency to be reduced to a smaller value, but it makes a fully digital transceiver baseband difficult to implement, which motivates baseband signal processing in the analog domain. Parallel sequence spread spectrum (PSSS) is well suited to efficient analog baseband circuit implementation and is used as the basis of further research in this thesis.

A big advantage of using a mixed-signal architecture for the PSSS baseband circuit is the possibility to merge the channel equalization process with the data decoding process. This idea has been previously implemented in the form of post-processing on the stored samples of the correlator output data captured using high sampling rate oscilloscopes etc. The applicability of this idea to the proposed mixed-signal PSSS baseband was discussed analytically and has been incorporated in the circuit design of the mixed-signal PSSS receiver baseband circuit.

The transmitter and receiver basebands have a sliced architecture in which each slice of the hardware represents the set of components required to transmit one of the *N* parallel symbols (transmitter baseband unit-slice) or to recover one of the *N* symbols transmitted in parallel (receiver baseband unit-slice). The detailed circuit design and measurement results of the most important circuit components of the mixed-signal PSSS receiver baseband circuit were discussed; these include the mixed-signal weighted code generator circuit (to generate the weighted local copies of the coding sequences at the receiver end) and the broadband fast-resettable correlator circuit (to perform correlation). The design details of the additional components required for the proper functioning of the PSSS receiver baseband circuit were also discussed. The most important of these components are the limiter circuit, the sample-and-hold circuit, and the voltage-controlled delay line circuit.

The important circuit components of the transmitter baseband circuit, i.e., the DAC and the digital baseband core circuit, were implemented in 65 nm bulk CMOS technology, and their circuit design and measurement results were discussed. The implementation of the analog mux was not feasible in 65 nm bulk CMOS technology, so a more scaled technology was used for the analog mux circuit. The post-layout simulations of the circuit in 28 nm bulk CMOS technology show very good linearity and high-speed performance.

The measurement results of the mixed-signal PSSS receiver baseband unit-slice show that the proposed mixed-signal receiver baseband architecture works well for PSSS data generated by PSSS encoding of BPSK and PAM-4 data. For these standalone tests of the baseband test-chip, an arbitrary waveform generator (AWG) was used as the PSSS waveform source. The output of the sample-and-hold circuit was used to plot the eye diagrams. The eye diagram for the BPSK case has a clear vertical eye-opening. Similarly, for the PAM-4 case, the eye diagram has clear vertical eye-openings and good vertical symmetry, which demonstrates the very good linearity and high-speed performance of the circuit. Note that for the chosen system parameters, the PSSS amplitude set corresponding to PAM-4 data is $\{0, \pm 1, \pm 2, \dots, \pm 12\}$. PAM-8 and higher constellations cannot be tested owing to the limited effective number of bits (ENOB) of the AWG. The AWG has a vertical resolution of 8 bits with an ENOB specification of 4.7 at 15 GHz and 5.5 at 30 GHz. Note that the above measurements were made without complete de-embedding of the high-frequency structures from the output of the AWG to the bond pads of the test-chip. A considerable improvement in linearity and high-speed performance of the circuit and even better eye diagrams are expected with a complete fixture de-embedding as explained in section 5.2.3.

The circuit could be operated reliably and with very good repeatability with a clock frequency of up to 20 GHz. At frequencies higher than 20 GHz, the analog mux was not working reliably and the one-hot pulse generation circuit which generates the select signal for the analog mux was not able to consistently propagate the one-hot pulse signal cyclically i.e., the one-hot pulse signal was dying out after some time at higher frequencies. Using PAM-4 with a chip rate of 20 GHz allows for a data rate of 2.22 Gbps. Using 15 copies of the unit-slice circuits as proposed in the system architecture, a total data rate of 33.33 Gbps can be achieved. With the I-Q receiver baseband as proposed in chapter 1, the net data rate is doubled i.e., 66.66 Gbps. Lastly, the use of a  $2\times2$  LOS MIMO increases the net data rate by another factor of 2 to 133.33 Gbps. A complete setup for the system integration of the baseband circuit with the RF-frontend and the carrier recovery circuit is also presented.

The components discussed so far form a single baseband unit-slice circuit. The extension of the unit-slice circuit to a complete PSSS receiver baseband circuit consisting of 15 unit-slices requires additional circuit components and system-level design considerations as discussed in chapter 4. The comparison of the receiver baseband circuit using 28 nm CMOS technology in terms of power dissipation, layout size, and post-layout simulations is presented in chapter 3. The migration to a smaller CMOS technology e.g., 28 nm offers substantial savings in the power dissipation and the layout of the receiver baseband components. A discussion of the savings in power dissipation and layout of the important receiver baseband components is presented in chapter 3.

The rest of this chapter discusses some ideas for furthering the research on mixed-signal PSSS receiver baseband circuits. Two different ideas about the design of correlator circuits are presented: one is a correlator circuit without any reset functionality, and the other consists of multiple resettable correlators with flag bits that indicate the crossing of preset threshold voltage levels. An interesting idea to increase the length of the coding sequences without increasing the size of the chip is to reconfigure the contents of the weighted code generator circuit using multiple CMOS shift registers instead of only one shift register. Lastly, an idea for reducing the power dissipation is presented, according to which the input current sources of the weighted code generator should be disabled (disconnected) after the current signal is routed to the output and should remain disabled for the rest of the multiplexing cycle. The input current signal should be re-enabled (reconnected) one clock period in advance to ensure that it has sufficient time to settle to its new value before the select signal routes it to the output. These ideas are discussed in detail below.

## 6.1 No Reset Correlator

The switching between the reset and integrate modes of the correlator causes a small DC shift in the output if not designed carefully, which distorts the output signal. Moreover, during the transition period between the integrate and reset modes, some part of the input signal may not get integrated, which causes linearity issues for the output signal. To avoid these problems, a no reset correlator circuit can be used. A big advantage of the no reset correlator lies in the fact that the circuit does not need the *integrate or reset command* signal, whose width, proper routing, and correct positioning are challenging. The proposed architecture of the no reset correlator is shown in Fig. 6-1. The idea is to use a sample-and-hold (S/H) circuit to hold the value of the correlation result at the end of the correlation cycle. The inverted version of the S/H output is applied as a second input to the correlator circuit for the next PSSS input sequence. The broadband G<sub>m</sub>-C correlator circuit explained in section 3.5.2 uses a short reset phase to reset the correlator to prepare it for the next correlation cycle. In the proposed architecture of the no reset correlator circuit, a secondary G<sub>m</sub> stage (with the inverted version of the S/H output as the input) is added in parallel to the primary G<sub>m</sub> stage (with the external PSSS signal as input). The transconductance of the secondary G<sub>m</sub> stage is adjusted as follows: the PSSS input is set to 0 V<sub>diff</sub>, which disables the primary G<sub>m</sub> stage, leaving the secondary G<sub>m</sub> stage as the only input to the G<sub>m</sub>-C integrator circuit. The input to the secondary G<sub>m</sub> stage is the inverted value of the S/H output. The most important design goal is to adjust the transconductance of the secondary G<sub>m</sub> stage such that the final value of the integral at the end of the correlation cycle becomes 0 V<sub>diff</sub>. This means that the initial value of the integrator circuit (which was equal to the final value of the correlator circuit at the end of the previous correlation cycle) has been subtracted from the current integration result. This way the correlator circuit can operate without requiring any reset.

Fig. 6-1 Proposed schematic diagram of the no reset integrator circuit which forms a part of the no reset correlator circuit.

The fact that the S/H circuit is anyway a part of the system architecture makes this a very attractive approach. The disadvantage is that, since the integrator is not reset after each correlation cycle, the circuit is very sensitive to the precision of the S/H output, which must have very little droop and feedthrough during the hold mode. If there is a problem in the S/H output, e.g., offset, droop, or feedthrough, the error accumulates with each new correlation cycle. Thus, the design constraints of the S/H circuit become more stringent. Another very important point is the increase in the output dynamic range of the correlator as compared to the resettable correlator. However, the increase is not very large and would require some small adjustments to the architecture in Fig. 6-1.

## 6.2 Multiple Resettable Correlators with Flags

The output dynamic range of the correlator circuit directly affects the linearity of the circuit. A lower output dynamic range not only results in better linearity of the correlator circuit but also reduces the magnitude of the hold-mode feedthrough of the S/H circuit connected at the output of the correlator. An interesting approach to reduce the output dynamic range of the correlator circuit is the use of multiple resettable correlator circuits with flag outputs that indicate whether the correlator output crosses preset threshold values, as shown in Fig. 6-2. The input signal is applied to all the correlator circuits in parallel, but the tail currents of all except one correlator circuit are disabled. All correlators are configured to be in the *integrate* mode. As soon as the tail current is enabled for a correlator, the correlator starts the correlation, which continues until the correlator output reaches the positive or the negative threshold value. As soon as a threshold crossover is detected by the comparator circuit, a +1 or -1 flag bit is set depending on whether the positive or the negative threshold is crossed, and the current source for the next correlator is switched on, which indicates that the next correlator circuit should take over. The final correlation result can be read at the end of the correlation cycle by summing up the flag bits and the sampled value of the correlator that was in use and whose output was routed to the sample-and-hold circuit. This way it is always ensured that the S/H input is within the defined threshold value. The minimum number of correlators required is given by the duration of the correlation cycle divided by the reset duration of the correlator, both expressed in chip intervals $T_{chip}$. For the current PSSS system, with a correlation cycle duration of 15 $T_{chip}$ and a correlator reset duration of 3 $T_{chip}$, a total of 5 correlator circuits will be required.
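Written as a formula, and assuming that a non-integer ratio would be rounded up (the ceiling is an assumption, and $N_{corr}$, $T_{corr}$, and $T_{reset}$ are symbols introduced here for illustration), the correlator count for the present system is

$$N_{corr} = \left\lceil \frac{T_{corr}}{T_{reset}} \right\rceil = \left\lceil \frac{15\,T_{chip}}{3\,T_{chip}} \right\rceil = 5$$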



Fig. 6-2 Block diagram of the multiple resettable correlators with flags.
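The bookkeeping of the flag-based scheme can be illustrated with a small behavioural model. The following Python sketch is not part of the original design; the function names `min_correlators` and `correlate_with_flags`, the per-chip increments, and the threshold value are hypothetical. It assumes an ideal comparator, so that each hand-over happens exactly at the threshold and each flag bit is worth one threshold value in the final sum. Circuit effects such as comparator delay or the reset time of the correlators are not modelled.

```python
import math
from typing import List, Sequence, Tuple


def min_correlators(cycle_chips: int, reset_chips: int) -> int:
    """Correlation cycle divided by the reset duration (both in chip intervals), rounded up."""
    return math.ceil(cycle_chips / reset_chips)


def correlate_with_flags(
    increments: Sequence[int], threshold: int, n_correlators: int
) -> Tuple[List[int], int, int, int]:
    """Behavioural model of multiple resettable correlators with flags.

    increments    : per-chip contribution integrated by the active correlator
                    (hypothetical values standing in for input x code)
    threshold     : magnitude of the positive/negative threshold (assumed symmetric)
    n_correlators : number of correlators used in round-robin order

    Returns (flags, active_index, residual, reconstructed_result).
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")

    flags: List[int] = []
    active = 0   # index of the correlator whose tail current is enabled
    acc = 0      # output of the active correlator

    for inc in increments:
        acc += inc
        # Idealisation: the crossing is detected exactly at the threshold,
        # so the part beyond the threshold is integrated by the next correlator.
        while abs(acc) >= threshold:
            flag = 1 if acc > 0 else -1
            flags.append(flag)
            acc -= flag * threshold
            active = (active + 1) % n_correlators  # next correlator takes over

    # Assumption: each flag bit represents one threshold value in the final sum.
    reconstructed = threshold * sum(flags) + acc
    return flags, active, acc, reconstructed


n = min_correlators(cycle_chips=15, reset_chips=3)  # 5 for the values quoted in the text
chips = [1, 1, 1, -1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 1]  # 15 hypothetical chip increments
flags, active, residual, result = correlate_with_flags(chips, threshold=4, n_correlators=n)
print(n, flags, active, residual, result)
```

In this model the value presented to the S/H circuit (`residual`) always stays below the threshold in magnitude, while the reconstructed result equals the plain sum of all increments.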

#### 6.3 Larger Codes with Reconfigurable Hardware

One of the most important parameters for the design of a mixed-signal PSSS baseband circuit is the length of the coding sequences. The use of larger codes increases the spreading gain of the system, increases the link utility, and reduces the sensitivity of the receiver. However, for a mixed-signal implementation, it means an increase in the number of copies of the hardware components used, which increases the hardware complexity and thus calls for the use of smaller codes up to a length of 15. An interesting approach to increasing the length of the code without increasing the number of copies of the hardware components is to use the fixed-length programmable weighted code generator explained in section 3.3 but, instead of using static CMOS programmable differential current sources, to make the inputs to the programmable weighted code generator circuit reconfigurable. The idea is to use multiple storage buffers (implemented as shift registers) instead of a single shift register to store the values of the programmable current sources. The operation starts with the first shift register connected to the input of the DAC circuits, generating the first 15 weighted bits of the coding sequence. The DAC inputs are then connected to the second shift register to generate the next 15 weighted bits, and so on. The important consideration here is the settling time of the DAC circuit, because the inputs to the analog mux in the weighted code generator circuit must be stable before the analog mux selects and routes the signal to the output. A block diagram of the proposed idea is shown in Fig. 6-3. The DAC and the transconductance stages are similar to those in Fig. 3-4. The only difference is the input to the DAC circuits, which is a multiplexed version of the outputs of the multiple CMOS shift registers. The length of the code depends on the number of CMOS shift registers used. As in Fig. 3-4, the output of the DAC is a voltage waveform that is converted to a current signal by the transconductance stage. The rest of the operation is similar to that of the circuit in Fig. 3-4. The select signal for the multiplexer shown in Fig. 6-3 is the output of the next DFF of the one-hot pulse generation circuit explained in section 3.3.2. For example, when the output of the first DFF (of the one-hot pulse generation circuit) is high, the output of the first current source is routed to the output of the analog mux (weighted code generator). At the next clock edge, the output of the second DFF changes to high, which routes the output of the second current source to the output of the analog mux (weighted code generator). The output of the second DFF also acts as the select signal of the input selection mux in Fig. 6-3. This means that as soon as the analog mux switches from the first current input to the second current input, the first current input starts to change its value based on the content read from the next CMOS shift register. By the time the analog mux completes its cycle of 18 inputs and it is again the turn of the first current input to be routed to the output, the first current source has settled to its new value.

Fig. 6-3 Idea for generation of larger (weighted) codes with reconfigurable hardware.
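The reload sequence described above can be sketched as a simple behavioural model. The following Python code is only an illustration under stated assumptions: the name `generate_long_code`, the register contents, and the use of plain integers for the weighted values are hypothetical, and the number of DAC slots is simply taken from the register length (the difference between the 15 weighted bits per register and the 18-input mux cycle mentioned in the text is not modelled). Each slot is reloaded from the next shift register right after it has been routed to the output, which leaves it one full mux cycle minus one chip interval to settle.

```python
from typing import List, Sequence


def generate_long_code(registers: Sequence[Sequence[int]]) -> List[int]:
    """Concatenate the contents of several shift registers through one set of DAC slots.

    registers : one sequence of weighted values per shift register (equal lengths)
    Returns the sequence seen at the output of the analog mux.
    """
    if not registers:
        raise ValueError("at least one shift register is required")
    n_slots = len(registers[0])
    if any(len(reg) != n_slots for reg in registers):
        raise ValueError("all shift registers must have the same length")

    n_regs = len(registers)
    slots = list(registers[0])  # DAC inputs start connected to the first register
    output: List[int] = []

    for t in range(n_regs * n_slots):
        i = t % n_slots             # slot selected by the one-hot pulse
        output.append(slots[i])     # analog mux routes slot i to the output
        # When the mux moves on to slot i+1, slot i starts loading the value
        # stored at the same position of the next shift register.
        next_reg = (t // n_slots + 1) % n_regs
        slots[i] = registers[next_reg][i]

    return output


def settling_time_chips(n_slots: int) -> int:
    """Chip intervals available for a reloaded slot to settle before it is selected again."""
    return n_slots - 1


# Three hypothetical registers of 15 values each give a code of length 45.
regs = [[100 * r + k for k in range(15)] for r in range(3)]
code = generate_long_code(regs)
print(len(code), settling_time_chips(15))
```

In this model the output is exactly the concatenation of the register contents, so the code length scales with the number of shift registers while the number of DAC slots stays fixed.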

#### 6.4 Reduction of the Power Dissipation

The weighted code generator circuit has the largest power dissipation of all components in the mixed-signal PSSS receiver baseband circuit. The large power dissipation stems from the fact that the inputs to the analog mux are current signals. The analog mux has a current-mode logic topology that requires a constant current flow whether or not the current signal is routed to the output. One way to reduce the power dissipation is to disconnect the currents of all inputs that are not routed to the output, i.e., all except the one that is currently selected. The current of a disconnected input has to be turned on again before it is that input's turn to be routed to the output in the next round of the analog mux operation. The design of the switch that turns the current on or off and the generation of its control signals are discussed below.

In Fig. 6-4, the connection between a programmable CMOS differential current source and the high-speed differential current switch is made through an NMOS transistor. The programmable CMOS differential current source and the high-speed differential current switch are similar to those used in the mixed-signal weighted code generator circuit explained in section 3.3. The connection between the two circuits exists only when the NMOS transistor is turned on. If the transistor is turned off, there is no connection between the current source and the switch that routes the signal to the output of the analog mux. The idea is to make sure that the $i^{th}$ current source is connected to the current switch when the output of the $i^{th}$ DFF of the one-hot pulse generation circuit is high. This is the normal operation of the weighted code generator circuit as explained in section 3.3. At the next clock edge, the output of the $i^{th}$ DFF goes low and that of the $(i+1)^{th}$ DFF goes high. The $i^{th}$ current source can no longer be routed to the output of the analog mux because $sel(i)_p$ is now lower than $sel(i)_n$. This current source should therefore be disconnected to avoid the current flow and the consequent power dissipation. The $i^{th}$ current source remains disconnected until one DFF pulse before the $i^{th}$ pulse, i.e., until the $(i-1)^{th}$ pulse. Reconnecting it one pulse early ensures that the current reaches its steady-state value before the $i^{th}$ pulse arrives. Note that a logic level shifter may be required to adjust the DC level of the signal that is applied to the NMOS transistors. In terms of the schematic, this is a simple scheme to reduce the power, but the layout becomes quite complex because of the additional DFF connections and the placement of the 3-input OR-gate. The settling time of the NMOS current switch and of the 3-input OR-gate output must be checked carefully in post-layout simulations to identify possible issues with the settling time of the current signal and the overall signal quality. Since the current sources are now active for only 3 $T_{chip}$ instead of 18 $T_{chip}$, the power dissipation is reduced by a factor of approximately 6.

Fig. 6-4 The proposed idea to reduce the power dissipation by keeping the $i^{th}$ differential current input disconnected except for the current ($i^{th}$) selection pulse and the selection pulses before and after the $i^{th}$ pulse.
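The enable logic and the resulting duty cycle can be checked with a short behavioural sketch. The following Python code is an illustration only: the function names are hypothetical, and the three inputs of the OR-gate are assumed to be the outputs of the $(i-1)^{th}$, $i^{th}$ and $(i+1)^{th}$ DFFs, following the caption of Fig. 6-4. Power is approximated as proportional to the number of chip intervals during which a current source is connected, which ignores the overhead of the switches, the OR-gates and any level shifters.

```python
from typing import List


def one_hot_ring(n: int, t: int) -> List[int]:
    """DFF outputs of an n-stage one-hot pulse generator at chip interval t."""
    return [1 if k == t % n else 0 for k in range(n)]


def enable_signals(q: List[int]) -> List[int]:
    """3-input OR per current source: previous, own and next DFF output (assumed wiring)."""
    n = len(q)
    return [q[(i - 1) % n] | q[i] | q[(i + 1) % n] for i in range(n)]


def active_chips_per_cycle(n: int) -> List[int]:
    """Number of chip intervals per mux cycle during which each source is connected."""
    counts = [0] * n
    for t in range(n):
        en = enable_signals(one_hot_ring(n, t))
        for i in range(n):
            counts[i] += en[i]
    return counts


n_inputs = 18                        # mux cycle length quoted in the text
counts = active_chips_per_cycle(n_inputs)
reduction = n_inputs / counts[0]     # always-on duration divided by gated duration
print(counts[0], reduction)
```

With 18 inputs, every source is connected for 3 chip intervals per cycle, which corresponds to the approximate factor of 6 stated above.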

#### 6.5 Summary

A summary of the research work carried out in this project has been presented in this chapter. Additionally, some suggestions for further research in the area of broadband mixed-signal PSSS baseband circuits are provided. Two different ideas for the design of correlator circuits are presented: one is a correlator circuit that operates without requiring any reset, and the other consists of multiple resettable correlators with flag bits that indicate the crossing of preset threshold voltage levels. An interesting idea for increasing the length of the coding sequences without increasing the size of the chip is to reconfigure the contents of the weighted code generator circuit using multiple CMOS shift registers instead of only one shift register. Lastly, an idea for reducing the power dissipation is presented, according to which the input current sources of the weighted code generator are disabled (disconnected) after the current signal has been routed to the output and remain disabled for the rest of the multiplexing cycle. The input current signal is re-enabled (reconnected) one clock period in advance to ensure that it has sufficient time to settle to its new value before the select signal routes it to the output.

# Bibliography

- [1] E. Khorov, A. Kiryanov, A. Lyakhov and G. Bianchi, "A tutorial on IEEE 802.11ax high efficiency WLANs," 2019.
- [2] J. Wells, "Faster than Fiber: The Future of Multi-Gb/s Wireless," *IEEE Microwave Magazine*, vol. 10, no. 3, pp. 104-112, May 2009.
- [3] V. Dyadyuk, J. Bunton, J. Pathikulangara, R. Kendall, O. Sevimli, L. Stokes and D. A. Abbott, "A Multigigabit Millimeter-Wave Communication System With Improved Spectral Efficiency," *IEEE Transactions on Microwave Theory and Techniques*, vol. 55, no. 12, pp. 2813-2821, Dec. 2007.
- [4] A. Hirata, T. Kosugi, H. Takahashi, R. Yamaguchi, F. Nakajima, T. Furuta, H. Ito, H. Sugahara, Y. Sato and T. Nagatsuma, "120-GHz-band millimeter-wave photonic wireless link for 10-Gb/s data transmission," *IEEE Transactions on Microwave Theory and Techniques*, vol. 54, no. 5, pp. 1937-1944, May 2006.
- [5] K. Greene, "Wireless at Fiber Speeds: New Millimeter-Wave Technology Sends Data at 10 Gigabits per Second," [Online]. Available: https://www.technologyreview.com/2008/10/03/268740/wireless-at-fiberspeeds/. [Accessed 09 Oct. 2020].
- [6] C. Sheldon et al., "A 60GHz line-of-sight 2x2 MIMO link operating at 1.2 Gbps," *IEEE Antennas and Propagation Society International Symposium*, pp. 1-4, Jul. 2008.
- P. Smulders, "The road to 100 Gb/s wireless and beyond: basic issues and key directions," *IEEE Communications Magazine*, vol. 51, no. 12, pp. 86-91, Dec. 2013.
- [8] A. M. J. Goiser, Handbuch der Spread-Spectrum-Technik, Wien Newyork: Springer, 1998.
- [9] A. Wolf, "PSSS Patents EP 04701288.5-1515/1584151, DE 10 2004 033 581, US 20060256850".
- [10] a. A. R. J. J. C. Scheytt, "100 Gigabit pro Sekunde und mehr für das drahtlose Hochgeschwindigkeits-Internet," *ForschungsForum Paderborn*, March 2015.
- [11] E. G. R. Kraemer, "EASY-A: Gigabit Wireless Communications in the 60 GHz ISM Band," *Proc. 16th International OFDM-Workshop*, vol. 71, Sep. 2011.

- [12] J. C. Scheytt, A. R. Javed, et. al., "Real100G Ultrabroadband Wireless Communication at High mm-Wave Frequencies," in *Wireless 100 Gbps and Beyond*, R. Kraemer and S. Scholz, Eds., Frankfurt (Oder), Germany, IHP – Innovations for High Performance Microelectronics, 2020, pp. 213-230.
- [13] J. C. Scheytt, A. R. Javed, et. al., "100 Gbps Wireless System and Circuit Design Using Parallel Spread-Spectrum Sequencing," *Frequenz*, vol. 71, no. 9-10, p. 399 - 414, 2017.
- [14] C. Scheytt, R. Kraemer and I. Kallfass, "Real100G.COM Mixed-Mode Baseband for 100 Gbit/s Wireless Communication," [Online]. Available: https://www.wireless100gb.de/project\_9\_en.html. [Accessed 10 May 2020].
- [15] U. Pfeiffer and T. Zwick, "Real100G.RF Fully Integrated Radio-Front-End Module for Wireless 100 Gbps Communication," [Online]. Available: https://www.wireless100gb.de/project\_10\_en.html. [Accessed 10 May 2020].
- [16] Y. M. Greshishchev et al., "A 40GS/s 6b ADC in 65nm CMOS," *IEEE International Solid-State Circuits Conference (ISSCC)*, pp. 390-391, 2010.
- [17] R. A. Kertis et al., "A 20 GS/s 5-Bit SiGe BiCMOS Dual-Nyquist Flash ADC With Sampling Capability up to 35 GS/s Featuring Offset Corrected Exclusive-Or Comparators," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 9, pp. 2295-2311, Sep. 2009.
- [18] X. Du, M. Grözing, M. Buck and M. Berroth, "A 40 GS/s 4 bit SiGe BiCMOS flash ADC," *IEEE Bipolar/BiCMOS Circuits and Technology Meeting (BCTM)*, pp. 138-141, 2017.
- [19] L. Kull, J. Pliva, T. Toifl, M. Schmatz, P. A. Francese, C. Menolfi, M. Brandli, M. Kossel, T. Morf, T. M. Andersen und Y. Leblebici, "Implementation of Low-Power 6–8 b 30–90 GS/s Time-Interleaved ADCs With Optimized Input Bandwidth in 32 nm CMOS," *IEEE Journal of Solid-state Circuits*, Bd. 51, Nr. 3, pp. 636-648, 2016.
- [20] L. Kull, T. Toifl, M. Schmatz, P. A. Francese, C. Menolfi, M. Braendli, M. Kossel, T. Morf, T. M. Andersen und Y. Leblebici, "22.1 A 90GS/s 8b 667mW 64× interleaved SAR ADC in 32nm digital SOI CMOS," in 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014.
- [21] L. Kull, D. Luu, C. Menolfi, M. Braendli, P. A. Francese, T. Morf, M. Kossel, A. Cevrero, I. Ozkaya und T. Toifl, "A 24-to-72GS/s 8b time-interleaved SAR ADC with 2.0-to-3.3pJ/conversion and >30dB SNDR at nyquist in 14nm CMOS

FinFET," in 2018 IEEE International Solid - State Circuits Conference - (ISSCC), 2018.

- [22] Fujitsu Semiconductor, Europe, "Analog to Digital Converter—Factsheet LUKE-ES 55–65 GSa/s 8 bit ADC, Mar. 2012," [Online]. Available: https://www.fujitsu.com/downloads/MICRO/fme/documentation/c63.pdf. [Accessed 04 October 2021].
- [23] R. L. Nguyen, A. M. Castrillon, A. Fan, A. Mellati, B. T. Reyes, C. Abidin, E. Olsen, F. Ahmad, G. Hatcher, J. Chana, L. Biolato, L. Tse, L. Wang, M. Azarmnia, M. Davoodi, N. Campos, N. Fan, P. Prabha, Q. Lu, S. Cyrusian, S. Dallaire, S. Ho, S. Jantzi, T. Dusatko und W. Elsharkasy, "8.6 A Highly Reconfigurable 40-97GS/s DAC and ADC with 40GHz AFE Bandwidth and Sub-35fJ/conv-step for 400Gb/s Coherent Optical Applications in 7nm FinFET," in 2021 IEEE International Solid- State Circuits Conference (ISSCC), 2021.
- [24] B. Murmann and Stanford University, "ADC Performance Survey 1997-2021,"
   [Online]. Available: https://web.stanford.edu/~murmann/adcsurvey.html.
   [Accessed 04 October 2021].
- [25] T. Alpert, F. Lang, D. Ferenci, M. Grozing und M. Berroth, "A 28GS/s 6b pseudo segmented current steering DAC in 90nm CMOS," in 2011 IEEE MTT-S International Microwave Symposium, 2011.
- [26] J. Cao, D. Cui, A. Nazemi, T. He, G. Li, B. Catli, M. Khanpour, K. Hu, T. Ali, H. Zhang, H. Yu, B. Rhew, S. Sheng, Y. Shim, B. Zhang und A. Momtaz, "29.2 A transmitter and receiver for 100Gb/s coherent networks with integrated 4×64GS/s 8b ADCs and DACs in 20nm CMOS," in 2017 IEEE International Solid-State Circuits Conference (ISSCC), 2017.
- [27] H. Huang, J. Heilmeyer, M. Grozing, M. Berroth, J. Leibrich und W. Rosenkranz, "An 8-bit 100-GS/s Distributed DAC in 28-nm CMOS for Optical Communications," *IEEE Transactions on Microwave Theory and Techniques*, Bd. 63, Nr. 4, pp. 1211-1218, 2015.
- [28] S. Randel, S. Corteselli, P. J. Winzer, A. Adamiecki, A. Gnauck, S. Chandrasekhar, A. Bielik, L. Altenhain, T. Ellermeyer, U. Dümler, H. Langenhagen und R. Schmid, "Generation of a digitally shaped 55-GBd 64-QAM single-carrier signal using novel high-speed DACs," in *Optical Fiber Communications Conference and Exhibition (OFC), 2014*, 2014.
- [29] L. G. Fujitsu Semicond. Europe, "Digital to analog converter—Factsheet LEIA 55–65 GSa/s 8-bit DAC," Fujitsu Semicond. Europe, Langen, Germany.

[Online]. Available: https://www.fujitsu.com/downloads/MICRO/fme/doc umentation/c60.pdf. [Accessed 04 October 2021].

- [30] IEEE, "IEEE Standard for Information technology--Local and metropolitan area networks--Specific requirements--Part 15.3: Amendment 2: Millimeter-wavebased Alternative Physical Layer Extension," *IEEE Std 802.15.3c-2009* (Amendment to IEEE Std 802.15.3-2003), pp. 1-200, 12 Oct. 2009.
- [31] IEEE, "IEEE Draft Standard for Information technology--Telecommunications and information exchange b/w systems Local and metropolitan area networks--Specific requirements Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications," *IEEE P802.11-REVmc/D4.0, January* 2015 (Revision of IEEE Std 802.11-2012), pp. 1-3730, 2015.
- [32] S. Krone, F. Guderian, G. Fettweis, M. Petri, M. Piz, M. Marinkovic, M. Peter, R. Felbecker and W. Keusgen., "Physical Layer Design, Link Budget Analysis, and Digital Baseband Implementation for 60 GHz Short-Range Applications.," *International Journal of Microwave and Wireless Technologies*, vol. 3, no. 2, pp. 189-200, 2011.
- [33] S. Glisic, J. C. Scheytt, Y. Sun, F. Herzel, R. Wang, K. Schmalz, M. Elkhouly and C.-S. Choi, "Fully Integrated 60 GHz Transceiver in SiGe BiCMOS, RF Modules, and 3.6 Gbit/s OFDM Data Transmission," *International Journal of Microwave and Wireless Technologies*, vol. 3, no. 2, pp. 139-145, 2011.
- [34] A. Ulusoy, G. Liu, A. Trasser and H. Schumacher, "Hardware Efficient Receiver for Low-Cost Ultra-High Rate 60 GHz Wireless Communications.," *International Journal of Microwave and Wireless Technologies*, vol. 3, no. 2, pp. 121-129, 2011.
- [35] ECMA International, "Standard ECMA-387: High Rate 60 GHz PHY, MAC and PALS," Dec. 2010. [Online]. Available: https://www.ecmainternational.org/publications/files/ECMA-ST/ECMA-387.pdf. [Accessed 10 May 2020].
- [36] Wireless HD, "Wireless HD Specification Version 1.1 Overview," May 2010.
   [Online]. Available: https://www.yumpu.com/en/document/read/29774870/ wireless hd-specification-version-11-overview-may-2010. [Accessed 10 May 2020].
- [37] IEEE, "IEEE P802.15.3d Documents," [Online]. Available: https://mentor.ieee.org/802.15/dcn/16/15-16-0595-03-003d-proposal-forieee802-15-3d-thz-phy.docx. [Accessed 10 May 2020].

- [38] T. Kürner, "IEEE P802.15.3d TG3d (100G) Technical Requirements Document,"
   [Online]. Available: https://mentor.ieee.org/802.15/dcn/14/15-14-0309-15-003d-technical-requirements-document.docx. [Accessed 10 May 2020].
- [39] W. Freude, S. Koenig, D. Lopez-Diaz, J. Antes, F. Boes, R. Henneberger, A. Leuther, A. Tessmann, R. Schmogrow, D. Hillerkuss, R. Palmer, T. Zwick, C. Koos, O. Am-bacher, J. Leuthold and I. Kallfass, "Wireless Communications on THz Carriers Takes Shape," *16th International Conference on Transparent Optical Networks (ICTON)*, pp. 1-4, 2014.
- [40] F. Boes, T. Messinger, J. Antes, D. Meier, A. Tessmann, A. Inam and I. Kallfass, "Ultra-Broadband MMIC-Based Wireless Link at 240 GHz Enabled by 64GS/s DAC," 39th International Conference on Infrared, Millimeter, and Terahertz waves (IRMMW-THz), pp. 1-2, 2014.
- [41] E. Grass, H. Schumacher and V. Ziegler, "EuMA special issue on 60 GHz communication systems," *International Journal of Microwave and Wireless Technologies*, vol. 3, no. 2, pp. 87-88, 2011.
- [42] X. Yu, R. Asif, M. Piels, D. Zibar, M. Galili, T. Morioka, P. U. Jepsen and a. L. K. Ox-enlowe, "60 Gbit/s 400 GHz wireless transmission," 2015 International Conference on Photonics in Switching (PS), pp. 4-6, 2015.
- [43] S. Moghadami, F. Hajilou, P. Agrawal and S. Ardalan, "A 210 GHz Fully-Integrated OOK Transceiver for Short-Range Wireless Chip-to-Chip Communication in 40 nm CMOS Technology," *IEEE Transactions on Terahertz Science and Technology*, vol. 5, no. 5, pp. 737-741, Sep. 2015.
- [44] B. Lu, W. Huang, C. Lin and C. Wang, "A 16QAM Modulation Based 3Gbps Wireless Communication Demonstration System at 0.34 THz Band," 2013 38th International Conference on Infrared, Millimeter, and Terahertz Waves (IRMMW-THz), pp. 1-2, 2013.
- [45] M. Mörz, "Analog Signal Processing in Forward Error Correction Decoders," *PhD Thesis, Technical University of Munich,* 2007.
- [46] J. V. Soler, "Analog VLSI Architecticture for Multiantenna Wireless Systems," *PhD thesis, University of Bristol,* 2009.
- [47] J. Antes, S. Konig, A. Leuther, H. Massler, J. Leuthold, O. Ambacher and I. Kallfass, "220 GHz wireless data transmission experiments up to 30 Gbit/s," *IEEE/MTT-S International Microwave Symposium Digest*, pp. 1-3, 2012.
- [48] I. Sarkas, S. T. Nicolson, A. Tomkins, E. Laskin, P. Chevalier, B. Sautreuil and S. P. Voinigescu., "An 18-Gb/s, Direct QPSK Modulation SiGe BiCMOS

Transceiver for Last Mile Links in the 70–80 GHz Band," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 10, pp. 1968-1980, Oct. 2010.

- [49] C. Thakkar, L. Kong, K. Jung, A. Frappe and E. Alon, "A 10 Gb/s 45 mW Adaptive 60 GHz Baseband in 65 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 4, pp. 952-968, Apr. 2012.
- [50] A. C. Ulusoy, G. Liu, M. Peter, R. Felbecker, H. Y. Abdine and H. Schumacher, "A BPSK/QPSK Receiver Architecture Suitable for Low-Cost Ultra-High Rate 60 GHz Wireless Communications," *The 40th European Microwave Conference*, pp. 381-384, 2010.
- [51] D. A. Sobel and R. W. Brodersen, "A 1 Gb/s Mixed-Signal Baseband Analog Front-End for a 60 GHz Wireless Receiver," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 4, pp. 1281-1289, Apr. 2009.
- [52] E. Saeckinger, Broadband Circuits for Optical Fiber Communication., 2005, John Wiley & Sons Inc., 2005.
- [53] N. Sadeghi, V. C. Gaudet and C. Schlegel, "Analog DFT Processors for OFDM Receivers: Circuit Mismatch and System Performance Analysis," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 56, no. 9, pp. 2123-2131, Sep. 2009.
- [54] F. Rivet, Y. Deval, J. Begueret, D. Dallet, P. Cathelin and D. Belot, "The Experimental Demonstration of a SASP-Based Full Software Radio Receiver," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 5, pp. 979-988, May 2010.
- [55] M. Lehne and S. Raman, "A 0.13um 1-GS/s CMOS Discrete-Time FFT Processor for Ultra-Wideband OFDM Wireless Receivers," *IEEE Transactions* on Microwave Theory and Techniques, vol. 59, no. 6, pp. 1639-1650, Jun 2011.
- [56] Y. Abiven, F. Rivet, Y. Deval, D. Dallet, D. Belot and T. Taris, "A low-power 2 GHz discrete time weighting system dedicated to Sampled Analog Signal Processing," 18th IEEE International Conference on Electronics, Circuits, and Systems, 57-60 2011.
- [57] K. KrishneGowda, P. Rodríguez-Vázquez, A. C. Wolf, J. Grzyb, U. R. Pfeiffer und R. Kraemer, "100 Gbps and beyond: Hardware in the Loop experiments with PSSS modulation using 230 GHz RF frontend," 2018 15th Workshop on Positioning, Navigation and Communications (WPNC), pp. 1-5, 2018.
- [58] K. KrishneGowda, T. Messinger, A. C. Wolf, R. Kraemer, I. Kallfass and J. C. Scheytt, "Towards 100 Gbps Wireless Communication in THz Band with PSSS Modulation: A Promising Hardware in the Loop Experiment," 2015 IEEE
International Conference on Ubiquitous Wireless Broadband (ICUWB), pp. 1-5, 2015.

- [59] T. Messinger, K. KrishneGowda, F. Boes, D. Meier, A. Wolf, A. Tessmann, R. Kra-emer and I. Kallfass, "Multi-Level 20 Gbit/s PSSS Transmission Using a Linearity-Limited 240 GHz Wireless Frontend," 2015 IEEE International Conference on Microwaves, Communications, Antennas and Electronic Systems (COMCAS), pp. 1-3, 2015.
- [60] A. Wolf, R. Kraemer and a. J. C. Scheytt, "Ultra High Speed Wireless Communication with Low Complexity Transceiver," 2012 International Symposium on Signals, Systems, and Electronics (ISSSE), pp. 1-6, 2012.
- [61] A. R. Javed and J. C. Scheytt, "System Design and Simulation of a PSSS Based Mixed Signal Transceiver for a 20 Gbps Bandwidth Limited Communication Link," 2015 1st URSI Atlantic Radio Science Conference (URSI AT-RASC), pp. 1-1, 2015.
- [62] A. C. Wolf and C. Scheytt, "15 Gbps Communication Over an USB3.0 Cable and Even More," *International Multi-Conference on Systems, Signals & Devices*, pp. 1-3, 2012.
- [63] L. Underberg, R. Croonenbroeck, R. Kays and R. Kraemer, "ParSec: Wireless industrial communication first PSSS measurements in industrial environment," 2017 IEEE 13th International Workshop on Factory Communication Systems (WFCS), pp. 1-8, 2017.
- [64] K. KrishneGowda, A. R. Javed, L. Wimmer, A. C. Wolf, J. C. Scheytt and R. Kraemer, "PSSS Transmitter for a 100 Gbps Data Rate Communication in Thz Frequency Band," 2018 26th Telecommunications Forum (TELFOR), pp. 1-5, 2018.
- [65] K. KrishneGowda, A. Wolf, R. Kraemer, J. C. Scheytt and I. Kallfass, "Wireless 100 Gb/s: PHY Layer Overview and Challenges in the THz Freqency Band," WAMICON 2014, pp. 1-4, 2014.
- [66] A. C. Wolf and M. Mahlig, "Benchmarking of WSN Solutions and IEEE 802.15.4-2006 PSSS Based Solutions," Proc. 9th GI/ITG KuVS Fachgespräch Sensornetze, p. 13, 2010.
- [67] H. Holma and A. Toskala, HSDPA/HSUPA for UMTS: High Speed Radio Access for Mobile Communications, John Wiley & Sons, 2007.
- [68] A. R. Javed, J. C. Scheytt, K. KrishneGowda and R. Kraemer, "System Design Considerations for a PSSS Transceiver for 100Gbps Wireless Communication

with Emphasis on Mixed Signal Implementation," in *IEEE Wireless and Microwave Technology Conference (WAMICON)*, Florida, 2015.

- [69] A. R. Javed, et. al., "Real100G.com," in Wireless 100 Gbps and Beyond, R. Kraemer and S. Scholz, Eds., Frankfurt (Oder), Germany, IHP Innovations for High Performance Microelectronics, 2020, pp. 231-294.
- [70] T. S. Rappaport, Wireless Communications: Principles and Practice, 2nd ed., Prentice Hall PTR, 2002, p. 707.
- [71] P. Misra and P. Enge, Global Positioning System: Signals, Measurements, and Performance, 2nd ed., Ganga-Jamuna Press, 2006, p. 569.
- [72] E. Marx, "Entwurf eines Übertragungssystems mit hoher Datenrate auf Basis von Parallel Spread Spectrum Sequencing (PSSS) unter Verwendung eines kostengünstigen USB 3.0 Kabels," Masterarbeit, Universität Paderborn, 2013.
- [73] A. R. Javed, J. C. Scheytt, K. KrishneGowda and R. Kraemer, "System Design of a Mixed Signal PSSS Transceiver Using a Linear Ultra-Broadband Analog Correlator for the Receiver Baseband Designed in 130 nm SiGe BiCMOS Technology," *IEEE EUROCON 2017 -17th International Conference on Smart Technologies*, pp. 228-233, 2017.
- [74] S. Priebe, M. Jacob and T. Kürner, "Angular and RMS delay spread modeling in view of THz indoor communication systems," *Radio Science*, vol. 49, pp. 242-251, March 2014.
- [75] K. KrishneGowda, L. Wimmer, A. R. Javed, A. C. Wolf, J. C. Scheytt and R. Kraemer, "Analysis of PSSS Modulation for Optimization of DAC Bit Resolution for 100 Gbps Systems," in 15th International Symposium on Wireless Communication Systems (ISWCS), Lisbon, 2018.
- [76] H. Ruecker, B. Heinemann, W. Winkler, R. Barth, J. Borngraber, J. Drews and G. Fischer, et. al., "A 0.13 um SiGe BiCMOS Technology Featuring fT / fmax of 240/330 GHz and Gate Delays Below 3 ps," *IEEE Journal of Solid State Circuits*, vol. 45, pp. 1678-1686, Oct. 2010.
- [77] A. R. Javed and J. C. Scheytt, "M-Sequence Radar for High Resolution Ranging with Mixed-Signal Radar Receiver Baseband Using 130nm SiGe BiCMOS Technology," in 2020 17th European Radar Conference (EuRAD), Utrecht, 2020.
- [78] A. R. Javed and J. C. Scheytt, "Mixed-Signal Receiver Baseband Slice for High-Data-Rate Communication Using 130 nm SiGe BiCMOS Technology," in 64th

International Midwest Symposium on Circuits and Systems (MWSCAS 2021), East Lansing, 2021.

- [79] D. Ferenci, M. Grozing und M. Berroth, "A 25 GHz Analog Multiplexer for a 50GS/s D/A-Conversion System in InP DHBT Technology," in 2011 IEEE Compound Semiconductor Integrated Circuit Symposium (CSICS), 2011.
- [80] T. Tannert, X.-Q. Du, D. Widmann, M. Grozing, M. Berroth, C. Schmidt, C. Caspar, J. H. Choi, V. Jungnickel und R. Freund, "A SiGe-HBT 2:1 analog multiplexer with more than 67 GHz bandwidth," in 2017 IEEE Bipolar/BiCMOS Circuits and Technology Meeting (BCTM), 2017.
- [81] R. Hersent, A. Konczykowska, F. Jorge, M. Riet, C. Mismer, V. Nodjiadjim, B. Duval und J.-Y. Dupuy, "Analog-Multiplexer (AMUX) circuit realized in InP DHBT technology for high order electrical modulation formats (PAM-4, PAM-8)," in 2020 23rd International Microwave and Radar Conference (MIKON), 2020.
- [82] H. Yamazaki, M. Nagatani, H. Wakita, M. Nakamura, S. Kanazawa, M. Ida, T. Hashimoto, H. Nosaka und Y. Miyamoto, "160-GBd (320-Gb/s) PAM4 Transmission Using 97-GHz Bandwidth Analog Multiplexer," *IEEE Photonics Technology Letters*, Bd. 30, Nr. 20, pp. 1749-1751, 2018.
- [83] S. Halder und H. Gustat, "A 30 GS/s 4-Bit Binary Weighted DAC in SiGe BiCMOS Technology," in 2007 IEEE Bipolar/BiCMOS Circuits and Technology Meeting, 2007.
- [84] M. Nagatani, H. Nosaka, S. Yamanaka, K. Sano und K. Murata, "A 32-GS/s 6-Bit Double-Sampling DAC in InP HBT Technology," in 2009 Annual IEEE Compound Semiconductor Integrated Circuit Symposium, 2009.
- [85] A. R. Javed, J. C. Scheytt and U. v. d. Ahe, "Linear Ultra-Broadband NPN-Only Analog Correlator at 33 Gbps in 130 nm SiGe BiCMOS Technology," in *IEEE Bipolar/BiCMOS Circuits and Technology Meeting (BCTM), 2016*, New Brunswick, 2016.
- [86] C. M. Holler, M. E. Jones, A. C. Taylor, A. I. Harris and S. A. Maas, "A 2–20-GHz Analog Lag Correlator for Radio Interferometry,", " in IEEE Transactions on Instrumentation and Measurement, vol. 61, no. 8, pp. 2253-2261, Aug. 2012.
- [87] N. Karandikar, S. Jung, S. C. Lee, P. Gui and Y. Joo, "Design of an Analog Correlator for 22-29GHz UWB Vehicular Radar System Using Improved High Gain Multiplier Architecture, Seattl," *53rd IEEE International Midwest Symposium on Circuits and Systems*, pp. 930-933, 2010.

- [88] H. Xie, X. Wang, A. Wang, B. Zhao, L. Yang and Y. Zhou, "A Broadband CMOS Multiplier-Based Correlator for IR-UWB Transceiver SoC," 2007 IEEE Radio Frequency Integrated Circuits (RFIC) Symposium, pp. 493-496, 2007.
- [89] M. Mincica, D. Pepe, A. Giordano and D. Zito, "CMOS Correlation Receiver for UWB Pulse Radar," 2009 Ph.D. Research in Microelectronics and Electronics, pp. 356-359, 2009.
- [90] C. Tu, B. Liu and H. Chen, "An Analog Correlator for Ultra-Wideband Receivers," *EURASIP Journal on Applied Signal Processing*, vol. 3, pp. 455-461, 2005.
- [91] Y. Zheng, Y. Tong, J. Yan, Y.-P. Xu, W. G. Yeoh and F. Lin, "A Low Power Noncoherent CMOS UWB Transceiver ICs," *IEEE Radio Frequency integrated Circuits (RFIC) Symposium Digest of Papers*, 2005.
- [92] D. Shen, F. Lin and W. G. Yeoh, "An Analog Correlator with Dynamic Bias Control for Pulse Based UWB Receiver in 0.18um CMOS Technology," *IEEE Radio Frequency Integrated Circuits (RFIC) Symposium*, 2006.
- [93] R. Hogervorst, J. P. Tero and J. H. Huijsing, "Compact CMOS Constant-gm Railto-Rail Input Stages with gm-Control by an Eectronic Zener Diode," *ESSCIRC* '95: Twenty-first European Solid-State Circuits Conference, pp. 78-81, 1995.
- [94] Rosenberger, "02K243-40ME3 Right Angle Jack PCB," [Online]. Available: https://products.rosenberger.com/radio-frequency/connectors/146193/02k243-40me3-right-angle-jack-pcb. [Accessed 03 June 2020].
- [95] Q. Xiaoning, "High Frequency Characterization and Modeling of On-Chip Interconnects and RF IC Wirebonds," *Dissertation, Stanford University*, June 2001.
- [96] H. Xue, C. R. Benedik, X. Zhang, S. Li and S. Ren, "Numerical Solution for Accurate Bondwire Modeling," *IEEE Transactions on Semiconductor Manufacturing*, vol. 31, no. 2, pp. 258-265, May 2018.
- [97] K. Mouthaan, R. Tinti, M. d. Kok, H. C. d. Graaff, J. L. Tauritz and J. Slotboom, "Microwave Modelling and Measurement of the Self- and Mutual Inductance of Coupled Bondwires," *Proceedings of the 1997 Bipolar/BiCMOS Circuits and Technology Meeting*, pp. 166-169, 1997.
- [98] J. Coonrod and B. Rautio, "https://www.rogerscorp.cn/documents/2311/acs/artic les /Comparing-Microstrip-and-CPW-Performance.pdf," *Microwave Journal*, vol. 55, no. 7, pp. 74-8, July 2012.

- [99] I. Rosu, "Microstrip, Stripline, CPW, and SIW Design," [Online]. Available: https://www.qsl.net/va3iul/Microstrip\_Stripline\_CPW\_Design/Microstrip\_Strip line\_and\_CPW\_Design.pdf. [Accessed 03 June 2020].
- [100] Southwest Microwave Inc., "Optimizing Test Boards for 50 GHz End Launch Connectors," [Online]. Available: https://mpd.southwestmicrowave.com/wpcontent/uploads/2018/07/Optimizing-Test-Boards-for-50-GHz-End-Launch-Connectors.pdf. [Accessed 03 June 2020].
- [101] Southwest Microwave Inc., "The Design & Test of Broadband Launches up to 50 GHz on Thin & Thick Substrates," [Online]. Available: https://mpd.southwestmicrowave.com/wp-content/uploads/2018/07/The-Design -and-Test-of-Broadband-Launches-up-to-50-GHz-on-Thin-and-Thick-Substrat es.pdf. [Accessed 03 June 2020].
- [102] Keysight Technologies, "M8194A 120 GSa/s Arbitrary Waveform Datasheet,"
  [Online]. Available: https://www.keysight.com/us/en/assets/7018-06341/data-sheets/5992-3361.pdf. [Accessed 01 June 2020].
- [103] S. Koenig, D. Lopez-Diaz, et. al. "Wireless Sub-THz Communication System with High Data Rate," *Nature Photon 7*, pp. 977-981, 13 Oct. 2013.
- [104] S. D. Brückner, "Maximum Length Sequences for Radar and Synchronization," *PhD Thesis, TU Braunschweig,* 2016.
- [105] C. Meier, A. Terzis and S. Lindenmeier, "Accurate Distance Measurement with a Wideband High Resolution Pseudo Noise Coded Radar," *European Radar Conference (EuRAD 2007)*, Oct. 2007.
- [106] T. Messinger, K. KrishneGowda, F. Boes, D. Meier, A. Wolf, A. Tessmann, R. Kra-emer and I. Kallfass, "Multi-Level 20 Gbit/s PSSS Transmission Using a Linearity-Limited 240 GHz Wireless Frontend," 2015 IEEE International Conference on Microwaves, Communications, Antennas and Electronic Systems (COMCAS), pp. 1-3, 2015.
- [107] M. Nagatani, H. Wakita, H. Yamazaki, M. Mutoh, M. Ida, et. al., "An Over-110-GHz-Bandwidth 2:1 Analog Multiplexer in 0.25-µm InP DHBT Technology," in 2018 IEEE/MTT-S International Microwave Symposium - IMS, 2018.
- [108] Y. M. Greshishchev, J. Aguirre, M. Besson, R. Gibbins, C. Falt, P. Flemke, N. Ben-Hamida, D. Pollex, P. Schvan und S.-C. Wang, "A 40GS/s 6b ADC in 65nm CMOS," in 2010 IEEE International Solid-State Circuits Conference (ISSCC), 2010.

## Heinz Nixdorf Institute – Interdisciplinary Research Centre for Computer Science and Technology

The Heinz Nixdorf Institute is a research centre within Paderborn University. It was founded in 1987 on the initiative and with the support of Heinz Nixdorf, who wanted to bring computer science and engineering together in order to provide critical impetus for new products and services. This includes interactions with the social environment.

Our research is aligned with the program "Dynamics, Mobility, Integration: En route to the technical systems of tomorrow." In teaching, the Heinz Nixdorf Institute is involved in many programs of study at Paderborn University. The overriding goal in education and training is to convey competencies that are critical in tomorrow's economy.

Today, eight professors and 130 researchers work at the Heinz Nixdorf Institute. Approximately 15 young researchers receive a doctorate each year.

## Recently Published Volumes in the Publication Series of the Heinz Nixdorf Institute

- Bd. 380 JUNG, D.: Local Strategies for Swarm Formations on a Grid. Dissertation, Fakultät für Elektrotechnik, Informatik und Mathematik, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 380, Paderborn, 2018 – ISBN 978-3-942647-99-1
- Bd. 381 PLACZEK, M.: Systematik zur geschäftsmodellorientierten Technologiefrühaufklärung. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 381, Paderborn, 2018 – ISBN 978-3-947647-00-2
- Bd. 382 KÖCHLING, D.: Systematik zur integrativen Planung des Verhaltens selbstoptimierender Produktionssysteme. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 382, Paderborn, 2018 – ISBN 978-3-947647-01-9
- Bd. 383 KAGE, M.: Systematik zur Positionierung in technologieinduzierten Wertschöpfungsnetzwerken. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 383, Paderborn, 2018 – ISBN 978-3-947647-02-6
- Bd. 384 DÜLME, C.: Systematik zur zukunftsorientierten Konsolidierung variantenreicher Produktprogramme. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 384, Paderborn, 2018 – ISBN 978-3-947647-03-3
- Bd. 385 GAUSEMEIER, J. (Hrsg.): Vorausschau und Technologieplanung. 14. Symposium für Vorausschau und Technologieplanung, Heinz Nixdorf Institut, 8. und 9. November 2018, Berlin-Brandenburgische Akademie der Wissenschaften, Berlin, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 385, Paderborn, 2018 – ISBN 978-3-947647-04-0
- Bd. 386 SCHNEIDER, M.: Spezifikationstechnik zur Beschreibung und Analyse von Wertschöpfungssystemen. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 386, Paderborn, 2018 – ISBN 978-3-947647-05-7

- Bd. 387 ECHTERHOFF, B.: Methodik zur Einführung innovativer Geschäftsmodelle in etablierten Unternehmen. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 387, Paderborn, 2018 – ISBN 978-3-947647-06-4
- Bd. 388 KRUSE, D.: Teilautomatisierte Parameteridentifikation für die Validierung von Dynamikmodellen im modellbasierten Entwurf mechatronischer Systeme. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 388, Paderborn, 2019 – ISBN 978-3-947647-07-1
- Bd. 389 MITTAG, T.: Systematik zur Gestaltung der Wertschöpfung für digitalisierte hybride Marktleistungen. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 389, Paderborn, 2019 – ISBN 978-3-947647-08-8
- Bd. 390 GAUSEMEIER, J. (Hrsg.): Vorausschau und Technologieplanung. 15. Symposium für Vorausschau und Technologieplanung, Heinz Nixdorf Institut, 21. und 22. November 2019, Berlin-Brandenburgische Akademie der Wissenschaften, Berlin, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 390, Paderborn, 2019 – ISBN 978-3-947647-09-5
- Bd. 391 SCHIERBAUM, A.: Systematik zur Ableitung bedarfsgerechter Systems Engineering Leitfäden im Maschinenbau. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 391, Paderborn, 2019 – ISBN 978-3-947647-10-1
- Bd. 392 PAI, A.: Computationally Efficient Modelling and Precision Position and Force Control of SMA Actuators. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 392, Paderborn, 2019 – ISBN 978-3-947647-11-8
- Bd. 393 ECHTERFELD, J.: Systematik zur Digitalisierung von Produktprogrammen. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 393, Paderborn, 2020 – ISBN 978-3-947647-12-5

Available from: Heinz Nixdorf Institut, Universität Paderborn, Fürstenallee 11, 33102 Paderborn

- Bd. 394 LOCHBICHLER, M.: Systematische Wahl einer Modellierungstiefe im Entwurfsprozess mechatronischer Systeme. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 394, Paderborn, 2020 – ISBN 978-3-947647-13-2
- Bd. 395 LUKEI, M.: Systematik zur integrativen Entwicklung von mechatronischen Produkten und deren Prüfmittel. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 395, Paderborn, 2020 – ISBN 978-3-947647-14-9
- Bd. 396 KOHLSTEDT, A.: Modellbasierte Synthese einer hybriden Kraft-/Positionsregelung für einen Fahrzeugachsprüfstand mit hydraulischem Hexapod. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 396, Paderborn, 2021 – ISBN 978-3-947647-15-6
- Bd. 397 DREWEL, M.: Systematik zum Einstieg in die Plattformökonomie. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 397, Paderborn, 2021 – ISBN 978-3-947647-16-3
- Bd. 398 FRANK, M.: Systematik zur Planung des organisationalen Wandels zum Smart Service-Anbieter. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 398, Paderborn, 2021 – ISBN 978-3-947647-17-0
- Bd. 399 KOLDEWEY, C.: Systematik zur Entwicklung von Smart Service-Strategien im produzierenden Gewerbe. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 399, Paderborn, 2021 – ISBN 978-3-947647-18-7
- Bd. 400 GAUSEMEIER, J. (Hrsg.): Vorausschau und Technologieplanung. 16. Symposium für Vorausschau und Technologieplanung, Heinz Nixdorf Institut, 2. und 3. Dezember 2021, Berlin-Brandenburgische Akademie der Wissenschaften, Berlin, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 400, Paderborn, 2021 – ISBN 978-3-947647-19-4

- Bd. 401 BRETZ, L.: Rahmenwerk zur Planung und Einführung von Systems Engineering und Model-Based Systems Engineering. Dissertation, Fakultät für Elektrotechnik, Informatik und Mathematik, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 401, Paderborn, 2021 – ISBN 978-3-947647-20-0
- Bd. 402 WU, L.: Ultrabreitbandige Sampler in SiGe-BiCMOS-Technologie für Analog-Digital-Wandler mit zeitversetzter Abtastung. Dissertation, Fakultät für Elektrotechnik, Informatik und Mathematik, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 402, Paderborn, 2021 – ISBN 978-3-947647-21-7
- Bd. 403 HILLEBRAND, M.: Entwicklungssystematik zur Integration von Eigenschaften der Selbstheilung in Intelligente Technische Systeme. Dissertation, Fakultät für Elektrotechnik, Informatik und Mathematik, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 403, Paderborn, 2021 – ISBN 978-3-947647-22-4
- Bd. 404 OLMA, S.: Systemtheorie von Hardware-in-the-Loop-Simulationen mit Anwendung auf einem Fahrzeugachsprüfstand mit parallelkinematischem Lastsimulator. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 404, Paderborn, 2022 – ISBN 978-3-947647-23-1
- Bd. 405 FECHTELPETER, C.: Rahmenwerk zur Gestaltung des Technologietransfers in mittelständisch geprägten Innovationsclustern. Dissertation, Fakultät für Elektrotechnik, Informatik und Mathematik, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 405, Paderborn, 2022 – ISBN 978-3-947647-24-8
- Bd. 406 OLEFF, C.: Proaktives Management von Anforderungsänderungen in der Entwicklung komplexer technischer Systeme. Dissertation, Fakultät für Maschinenbau, Universität Paderborn, Verlagsschriftenreihe des Heinz Nixdorf Instituts, Band 406, Paderborn, 2022 – ISBN 978-3-947647-25-5

The conventional digital baseband architecture relies on high-performance digital signal processing and on wide-bandwidth, high data rate, high-resolution data converters, which leads to large power dissipation in high data rate wireless communication systems. This dissertation investigates analog signal processing in a mixed-signal baseband architecture as a way to reduce power dissipation and circuit complexity. For this purpose, parallel-sequence spread spectrum (PSSS) modulation is investigated owing to its suitability for a mixed-signal baseband implementation. The proposed mixed-signal baseband circuit has a symbol-sliced architecture and allows channel equalization by weighting the chips of the decoding sequence. A complete unit slice of the receiver baseband was fabricated in 130 nm SiGe BiCMOS technology. The measurement results show very good linearity and high-speed performance for both BPSK and PAM-4 data at a PSSS chip rate of 20 Gcps. However, the use of current-mode logic with large supply voltages of 1.2 V and -4 V results in large power dissipation. For the transmitter baseband, CMOS technology was preferred because it allows both the digital baseband core and the high-speed analog components to be realized on a single chip. Transmitter baseband components were fabricated and tested in 65 nm bulk CMOS technology. The 65 nm technology was, however, not fast enough for the high-speed, broadband analog multiplexer of the transmitter baseband, which necessitated a migration to a more scaled CMOS technology. Therefore, to reduce the power dissipation of the mixed-signal receiver baseband relative to the SiGe BiCMOS implementation, and to enable the high-speed operation required for the transmitter analog multiplexer, the proposed mixed-signal baseband was implemented in a 28 nm bulk CMOS technology, which significantly reduced the power dissipation. However, the circuit was not fabricated in 28 nm bulk CMOS technology due to limited time and budget. An application of the receiver baseband chip as a high-resolution ranging radar is also discussed, along with measured results that show a distance resolution as fine as 7.5 mm.
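
Two ideas from this summary can be made concrete with a short numerical sketch. It is an illustration only, not a model of the circuits described in the dissertation. The Python code below assumes a length-15 m-sequence generated from the polynomial x^4 + x^3 + 1, a bipolar encoding sequence at the transmitter, and a unipolar decoding sequence at the receiver. It also assumes that the range resolution of a pseudo-noise coded radar follows the usual relation ΔR = c / (2 · f_chip). The sequence length, the polynomial, and all function names are hypothetical choices made for this example.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def m_sequence(length=15):
    """Return a 0/1 m-sequence from a 4-bit Fibonacci LFSR (x^4 + x^3 + 1)."""
    state = [1, 0, 0, 0]
    bits = []
    for _ in range(length):
        bits.append(state[3])            # output the last register stage
        feedback = state[3] ^ state[2]   # taps at stages 4 and 3
        state = [feedback] + state[:3]   # shift the register by one stage
    return np.array(bits)


def psss_encode(data, bits):
    """Sum cyclically shifted bipolar sequences, each weighted by one symbol."""
    bipolar = 2 * bits - 1               # map 0/1 to -1/+1
    return sum(d * np.roll(bipolar, k) for k, d in enumerate(data))


def psss_decode(chips, bits):
    """Correlate the multi-level chips with each shift of the 0/1 sequence."""
    n_ones = bits.sum()                  # correlation peak height
    corr = [np.dot(chips, np.roll(bits, k)) for k in range(len(bits))]
    return np.array(corr) / n_ones


def range_resolution(chip_rate_hz):
    """Range resolution of a PN-coded radar: one chip of two-way delay."""
    return C / (2.0 * chip_rate_hz)


bits = m_sequence()
# Fifteen PAM-4 symbols, one per cyclic shift of the spreading sequence.
data = np.array([1, -1, 3, -3, 1, 1, -1, 3, -1, -3, 1, -1, 3, 1, -1])
chips = psss_encode(data, bits)
decoded = psss_decode(chips, bits)
resolution_mm = range_resolution(20e9) * 1e3

print(decoded)
print(f"{resolution_mm:.2f} mm")
```

Because a bipolar m-sequence correlated with its unipolar counterpart gives a single peak and zero at every other cyclic shift, the decoded values equal the transmitted symbols in this noise-free setting. At a chip rate of 20 Gcps, the resolution formula evaluates to roughly 7.5 mm, which is consistent with the figure quoted above.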