#### University of New Mexico UNM Digital Repository

**Electrical and Computer Engineering ETDs** 

**Engineering ETDs** 

Spring 2-24-2017

# Intelligent ROIC for Real-time In-pixel Image Processing

Mohammad J. Ghasemibenhangi

Follow this and additional works at: https://digitalrepository.unm.edu/ece\_etds

Part of the Electrical and Computer Engineering Commons

**Recommended Citation**

Ghasemibenhangi, Mohammad J.. "Intelligent ROIC for Real-time In-pixel Image Processing." (2017). https://digitalrepository.unm.edu/ece\_etds/347

This Dissertation is brought to you for free and open access by the Engineering ETDs at UNM Digital Repository. It has been accepted for inclusion in Electrical and Computer Engineering ETDs by an authorized administrator of UNM Digital Repository. For more information, please contact disc@unm.edu.

Mohammad Javad GhasemiBenhnagi
Candidate

Electrical and Computer Engineering *Department* 

This dissertation is approved, and it is acceptable in quality and form for publication:

Approved by the Dissertation Committee:

Payman Zarkesh-Ha , Chairperson

Sanjay Krishna

Majeed M. Hayat

Shuang Luan

Biliana Paskaleva

## Intelligent ROIC for Real-time In-pixel Image Processing

by

#### Mohammad Javad GhasemiBenhnagi

#### DISSERTATION

Submitted in Partial Fulfillment of the Requirements for the Degree of

> Doctor of Philosophy Engineering

The University of New Mexico

Albuquerque, New Mexico

May, 2017

©2017, Mohammad Javad GhasemiBenhnagi

# Dedication

I dedicate this thesis to my spouse, Sara, for her remarkable patience and unwavering love and support.

– M. Javad Ghasemibenhangi

## Acknowledgments

I would like to express my deepest appreciation to Prof. Payman Zarkesh-Ha, whose support and guidance went beyond the limits of this research. Without his persistent encouragement and help, this research would not have been possible.

My deep appreciation also goes to Prof. Sanjay Krishna, Prof. Majeed M. Hayat, and Steven Brueck, who have greatly supported this work. I would not have been able to complete this work without their energy and enthusiasm in the development of the ideas.

I would also like to thank my committee members Biliana Paskaleva and Shuang Luan, who generously offered their time and guidance throughout the preparation and review of this document. Their valuable feedback made this work much stronger.

The completion of this research would not have been possible without the participation and assistance of many friends, too many to list by name. Here I sincerely acknowledge my colleagues Alexander Neumann, Glauco Fiorente, Manish Bhattarai, John Montoya, and Alireza Kazemi for their contributions where I needed them.

This work was supported in part by the National Science Foundation (ECCS-0925757) and the Smart Lighting Engineering Research Center (EEC-0812056).

## Intelligent ROIC for Real-time In-pixel Image Processing

by

#### Mohammad Javad GhasemiBenhnagi

#### ABSTRACT OF DISSERTATION

Submitted in Partial Fulfillment of the Requirements for the Degree of

> Doctor of Philosophy Engineering

The University of New Mexico

Albuquerque, New Mexico

May, 2017

## Intelligent ROIC for Real-time In-pixel Image Processing

by

#### Mohammad Javad GhasemiBenhnagi

PhD, Engineering, University of New Mexico, 2017

#### Abstract

As the resolution of image sensors increases and readout electronics get faster, the amount of data produced by these imagers is becoming too large to store, transmit, and analyze. As a result, sparse data representation has become an active research topic in the last few years.

The human eye contains millions of photoreceptors; however, only sparse data is transmitted to the optic nerve. Image processing techniques implemented on chip and at the pixel level are the main building blocks of retina-like sensors. In this way, instead of transmitting raw images that require massive storage and offline processing, the imager transmits only the vital information that is relevant to the application of interest.

Inspired by the human eye and the way the retina handles data, we have implemented two different readout integrated circuits (ROICs). In the first method, we have modified a conventional capacitive transimpedance amplifier (CTIA) unit cell to implement in-pixel multispectral classification in the analog domain. The ROIC is designed to use a spectrally tunable dot-in-a-well (DWELL) infrared photodetector to exploit the possibility of real-time on-chip multispectral imaging for classification. The unit cells are designed to include all the elements needed for spectral classification, including support for high-voltage time-varying positive and negative biases, bipolar integration, and selective sample-and-hold circuits. A test chip was designed and fabricated using TSMC's 350 nm high-voltage CMOS process (CL035HV-DDD). Comprehensive pre-silicon verification confirmed the functionality of the chip. A custom reconfigurable PCB and flexible FPGA firmware were developed to test the chip under cryostat conditions. Initial test results validate the design specifications.

In the second method, we report a readout integrated circuit featuring fine control over the bias of individual pixels, independently in every frame. To exploit multispectral spatiotemporal imaging, the hardware is designed to be compatible with DWELL infrared photodetectors and supports a large voltage swing, extended linearity, and high charge capacity. However, in the current setting, for simplicity, a silicon photodetector is used instead of DWELL. A PCB is designed to support high signal integrity, and flexible image-grabbing firmware is developed to explore the full potential of the designed test chip. The spatiotemporal biasing scheme of the test chip, along with the versatile firmware, enabled the following applications:

1. Compressed-domain image acquisition: The increasing demand for image quality requires a higher pixel count and a sophisticated post-processing mechanism to efficiently store, transmit, and analyze the resulting large volume of data. An inherent trade-off between the generation of big data by such imaging systems and the inefficiency of extracting useful information in real time limits the efficacy of such sensors in real-time decision making.

   A hardware implementation of a real-time compressed-domain image acquisition system is demonstrated. The system performs front-end computational imaging, whereby the inner product between an image and an arbitrarily specified mask is implemented directly in silicon. The modulated pixels are summed to generate the compressed samples, namely aperture-coded coefficients of an image. The functionality of the hardware is demonstrated in compressed-domain transform coding and silicon-level compressive sampling.

2. Nonuniformity correction: Many different sources contribute to the nonuniformity of the acquired image. Defects in the growth or fabrication of the photodetectors, process variation in the fabrication of the chip, nonuniformity in the illumination, and variation in the power supply distribution are only a few of the many sources of non-ideality. In this research, spatiotemporal bias tunability is employed to cancel the nonuniformity caused by all these sources. The test chip can be configured to eliminate the nonuniformities dynamically.
3. Standalone image sensor: The test chip can be configured as a stand-alone imager with a conventional biasing scheme when the modulation of the bias is disabled.
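
The compressed-domain acquisition in the first application (each compressed sample is the inner product between the image and a mask, formed by summing the modulated pixels) can be illustrated with a small software model. The sketch below is only an assumption-laden illustration, not the chip's implementation: the masks are taken to be ideal, real-valued 2-D DCT-II basis functions, the 4 × 4 synthetic image is made up, and the function names `dct_mask`, `acquire_sample`, and `reconstruct` are hypothetical. How the hardware realizes signed mask weights through the pixel bias is not modeled here.

```python
import math

def dct_mask(u, v, n):
    """n x n orthonormal 2-D DCT-II basis function (u, v), used as a per-pixel weight mask."""
    def scale(k):
        return math.sqrt((1.0 if k == 0 else 2.0) / n)
    return [
        [scale(u) * scale(v)
         * math.cos((2 * x + 1) * u * math.pi / (2 * n))
         * math.cos((2 * y + 1) * v * math.pi / (2 * n))
         for y in range(n)]
        for x in range(n)
    ]

def acquire_sample(image, mask):
    """One compressed sample: weight each pixel by its mask value, then sum over the array."""
    return sum(p * w
               for image_row, mask_row in zip(image, mask)
               for p, w in zip(image_row, mask_row))

def reconstruct(samples, n):
    """Rebuild an image as the sum of the sampled masks weighted by their coefficients."""
    out = [[0.0] * n for _ in range(n)]
    for (u, v), c in samples.items():
        mask = dct_mask(u, v, n)
        for x in range(n):
            for y in range(n):
                out[x][y] += c * mask[x][y]
    return out

def max_abs_error(a, b):
    """Largest absolute pixel difference between two images."""
    return max(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

n = 4
# Synthetic smooth scene (assumed test data, not measured from the chip).
image = [[10.0 + x + 2.0 * y for y in range(n)] for x in range(n)]

# Acquire one sample per mask, then keep only the three lowest-frequency samples.
all_samples = {(u, v): acquire_sample(image, dct_mask(u, v, n))
               for u in range(n) for v in range(n)}
low_samples = {k: c for k, c in all_samples.items() if k[0] + k[1] <= 1}

err_full = max_abs_error(reconstruct(all_samples, n), image)
err_low = max_abs_error(reconstruct(low_samples, n), image)
print(f"{len(all_samples)} samples: max error {err_full:.2e}")
print(f"{len(low_samples)} samples: max error {err_low:.2e}")
```

With all 16 masks the image is recovered up to floating-point error, because the DCT basis is orthonormal. Keeping only a few low-frequency samples gives an approximation of the scene from far fewer readouts, which is the trade-off that compressed-domain acquisition exploits.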

## Contents

- List of Figures
- List of Tables
- 1 Introduction
  - 1.1 Background and Motivation
  - 1.2 Our proposed in-pixel imaging schemes
    - 1.2.1 Contributions of "In-pixel Multispectral Classification" scheme
    - 1.2.2 Contributions of "In-pixel Compressive Sensing" scheme
  - 1.3 Publications
  - 1.4 Organization of the dissertation
- 2 In-Pixel Multi-Spectral Classification
  - 2.1 Sparse imaging
  - 2.2 Bias-tunable photodetector for multi-spectral classification
  - 2.3 Spectral-tuning algorithm
  - 2.4 Conclusions
- 3 Design of ROIC for Infrared Imaging
  - 3.1 The unit-cell
  - 3.2 Process technology
  - 3.3 Design of peripherals
    - 3.3.1 Row/column select
    - 3.3.2 The DFF
    - 3.3.3 The row/column-select shift registers
    - 3.3.4 The level translators
    - 3.3.5 The output amplifier
    - 3.3.6 ESD protection
    - 3.3.7 Optical leakage
    - 3.3.8 Optical isolation
    - 3.3.9 PAD
  - 3.4 Post-silicon validation
  - 3.5 The EDA tools
  - 3.6 Image grabber
  - 3.7 Conclusion
- 4 Continuous Time-varying Biasing in a Chip
  - 4.1 Integration in dual polarity
    - 4.1.1 Design of the max-identifier
    - 4.1.2 The optimized unit-cell
  - 4.2 Test firmware
  - 4.3 Experimental setup
  - 4.4 Top view of the prototyped chip
  - 4.5 Experimental results
  - 4.6 Future plans
    - 4.6.1 Post-silicon processing and flip-chip bonding
    - 4.6.2 Testing at cryostat condition
  - 4.7 Conclusions
- 5 A ROIC for Spatiotemporal Bias Tunability
  - 5.1 Background and previous work
  - 5.2 Design of the pixel
  - 5.3 Experimental setup
  - 5.4 Conclusion
- 6 Compressed-domain Image Processing Applications
  - 6.1 Functioning as a stand-alone camera
  - 6.2.2 On-chip compressive sensing
  - 6.3 Nonuniformity correction
  - 6.4 Compressed-domain image acquisition
    - 6.4.1 Discrete cosine transform
    - 6.4.2 Bias selection algorithm
    - 6.4.3 DCT-based image compression
    - 6.4.4 DCT-based image reconstruction
    - 6.4.5 Compressive sensing implementation
    - 6.4.6 Performance comparison between naïve DCT, LMS DCT, and CS reconstruction
  - 6.5 Conclusions
- 7 Conclusions and future directions
- Appendices

| 1.1 | Three example applications of infrared imaging, a) driver vision                                                                           |    |
|-----|--------------------------------------------------------------------------------------------------------------------------------------------|----|
|     | enhancer, b) product inspection, and $c)$ thermal heat loss                                                                                |    |
|     | inspection of a building                                                                                                                   | 2  |
| 1.2 | Conventional imaging, storage, and processing scheme                                                                                       | 4  |
| 1.3 | A system level block diagram of the intelligent readout integrated<br>circuit we proposed for on-chip image acquisition and classification | 5  |
| 14  | A system level block diagram of the intelligent readout integrated                                                                         | 0  |
| 1.1 | circuit we proposed for on-chip image acquisition and classification.                                                                      | 6  |
| 1.5 | a) Readout integrated circuit proposed for hyperspectral classification. b) Readout integrated circuit for on-chip                         |    |
|     | nonuniformity correction and compressive sensing. $\ldots$ $\ldots$ $\ldots$ $\ldots$                                                      | 7  |
| 2.1 | A multispectral camera that employs seven bandpass filters to                                                                              |    |
|     | separate different bands on the left and an iconic representation of                                                                       |    |
|     | the internal block diagram on the right [33] $\ldots$ $\ldots$ $\ldots$ $\ldots$                                                           | 16 |

| 2.2 | Two pictures that have taken using the same DWELL infrared FPA.<br>While the bias of the detectors are uniform across the FPA, in each<br>picture, the bias voltage is selected to optimize the responsivity at                                                                     |                     |
|-----|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------|
|     | MWIR in the left or LWIR on the right $[36]$                                                                                                                                                                                                                                        | 17                  |
| 2.3 | a) Conventional ROIC-based multispectral classification imaging<br>system, which requires extra processing units to extract the class<br>identification of each object and, b) the ROIC we proposed for this<br>application, which integrates the acquisition and classification in |                     |
|     | the same chip                                                                                                                                                                                                                                                                       | 18                  |
| 2.4 | a) A sample growth structure of the DWELL photodetector at<br>center for high technology materials (CHTM), and b)<br>Bias-voltage-dependent spectral responses of the DWELL<br>photodetector showing potential for multispectral sensing. Pictures<br>adopted from [26]             | 19                  |
| 2.5 | Demonstration of the weighted superposition algorithm to implement<br>the ideal narrow-band filter by the mean of linear combination of                                                                                                                                             |                     |
|     | wide-band overlapping filters [26]                                                                                                                                                                                                                                                  | 20                  |
| 2.6 | A preliminary high level schematic of the unit-cell we propose for the rock type classification.                                                                                                                                                                                    | 22                  |
| 2.7 | A transistor level circuit implementation of the weighted<br>superposition algorithm. Because the circuit is composed of 41<br>transistors, it is not likely to fit in the limited area of the unit-cell                                                                            | 23                  |
| 2.8 | Demonstration of the weighed superposition algorithm integration<br>vs the refined integration scheme that embeds the weights in the<br>integration time and eliminates the need for the multiplication block                                                                       | 24                  |
|     | moshoush unit and chimnates the need for the multiplication block.                                                                                                                                                                                                                  | <b>∠</b> – <b>1</b> |

| 3.1  | Schematic of a conventional CTIA preamplifier, S&H switch and S&H capacitor and the multiplexers used or the readout.                                                                                                                                               | 29 |
|------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----|
| 3.2  | A possible solution to raster scan the pixels using a chain of DFFs<br>embedded in pixels. In this figure, for the sake of demonstration,<br>the pixel array is composed of 64 pixels, which are distributed over<br>8 rows and 8 columns                           | 31 |
| 3.3  | An schematic of the DFF, which is used for the row/column select part of iROIC project.                                                                                                                                                                             | 32 |
| 3.4  | A sample waveform demonstration the functionality of DFF designed for row/column selection of the ROIC. $\ldots$                                                                                                                                                    | 33 |
| 3.5  | The DFFs chained to make the row-select and column select circuit.                                                                                                                                                                                                  | 37 |
| 3.6  | Demonstration of a sample waveform generated by a chain of DFFs,<br>which is used for row-select and column select circuit.                                                                                                                                         | 38 |
| 3.7  | The low to high level translators circuitry and the row/column driver.                                                                                                                                                                                              | 40 |
| 3.8  | The switch level circuit diagram for the output amplifier. Voltage $V_{Bias}$ is fed from outside of the chip, so depending on the operating temperature or the nominal power supply it can be adjusted to provide the best performance in the field.               | 41 |
| 3.9  | a) A schematic of the ESD protection circuit used in this project,<br>which is a NMOS and a PMOS transistor, which their gate are<br>short circuited to their source so they are normally off, and b) the<br>equivalent circuit, which is two reverse biased diodes | 43 |
| 3.10 | Demonstration of the leakage current, which is a result of photons                                                                                                                                                                                                  |    |
|      | getting to the side of the detector.                                                                                                                                                                                                                                | 45 |

| 3.11 | a) A picture of the geometry of different metal layers in the layout                                                              |    |
|------|-----------------------------------------------------------------------------------------------------------------------------------|----|
|      | of the ESD protection, b) the net area of the ESD protection that                                                                 |    |
|      | is covered by metal is shown in black, and c) individual metal layers                                                             |    |
|      | over the unit-cell on the left as well as the net area, which is covered                                                          |    |
|      | by metal on the right side                                                                                                        | 47 |
| 3.12 | a) A screen-capture of the layer selection window in Tanner-EDA and<br>an example asymmetric NMOS transistor from the TSMC CL035- |    |
|      | DDD technology, the technology, which is used in this project, b)                                                                 |    |
|      | and a screen-capture of the chip that is designed for multispectral                                                               |    |
|      | classification. The picture is taken after the design is imported to                                                              |    |
|      | Cadence Virtuoso for the extra pre-silicon verification                                                                           | 51 |
| 3.13 | a) Demonstration of timing diagram used in RS-170 standard for                                                                    |    |
|      | black and white video [46]. b) A sample waveform demonstration of                                                                 |    |
|      | m RS170 timing signals. $ m RS170$ is a protocol for which is used in many                                                        |    |
|      | image-grabbers, including NI PCI-1410, one of national instruments'                                                               |    |
|      | image grabbers, which is used in our lab.                                                                                         | 52 |
| 4.1  | A CTIA preamplifier featured with the integration in both positive                                                                |    |
|      | and negative polarities by using extra switches that can flip the                                                                 |    |
|      | integration capacitor. Reseting of the capacitor happens when all                                                                 |    |
|      | the switches are shorted simultaneously                                                                                           | 56 |
| 4.2  | A CTIA unit-cell capable of multispectral classification, which                                                                   |    |
|      | models the configuration of the integration capacitor and the four                                                                |    |
|      | switches to control the integration polarity and the three S&H $$                                                                 |    |
|      | capacitor and the analog comparison block                                                                                         | 57 |
| 4.3  | Revised block diagram of the unit-cell proposed for multispectral                                                                 |    |
|      | classification.                                                                                                                   | 58 |

| 4.4  | Demonstration of the circuit controlling the S&H switch, which is<br>triggered either using the S&H signal or when the arbiter decides<br>that the recent integrated value is greater than the value already                                                                                                                                                                        |    |
|------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----|
|      | stored in the S&H capacitor                                                                                                                                                                                                                                                                                                                                                         | 60 |
| 4.5  | Switch level demonstration of the unit-cell proposed for multispectral classification and the switches for the video signal.                                                                                                                                                                                                                                                        | 62 |
| 4.6  | A sample waveform, showing different states the circuit traverse to implement the feature extraction algorithm.                                                                                                                                                                                                                                                                     | 63 |
| 4.7  | a) A picture from the experimental setup including the custom reconfigurable PCB designed to host the test-chip on the right and a Spartan-3E Development board, which is used to generate the timing signals on the left, b) a micro-photograph of the chip, which is wire-bonded to a $25 \times 25$ socket and soldered to PCB, and c) a picture of the layout of the unit-cell. | 65 |
| 4.8  | Block diagram of the proposed readout circuit for multispectral classification.                                                                                                                                                                                                                                                                                                     | 66 |
| 4.9  | The video signal and the row/column pulse that shifts out of the last row/column. $\ldots \ldots \ldots$                                                                                                                                                            | 69 |
| 4.10 | The output image generated by the MSC-ROIC as a result of a laser beam shining to the ROIC. While no detector was installed, the change in the measured value comes from the variation in the operating point of the readout circuit's pixels under illumination                                                                                                                    | 70 |
| 4.11 | An iconic demonstration of the DWELL FPA flip-chipped over the next generation MSC-ROIC. The contacts of the MSC-ROIC side are gold plated, and the bonding is made using indium.                                                                                                                                                                                                   | 72 |

| 4.12 | Demonstration of different processing steps for DWELL FPA and flipchip bonding to MSC-ROIC.                                                                                                                                                                                                                                                                                                                                        | 74 |
|------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----|
| 4.13 | Demonstration of the alignment marks designed over the MSC-ROIC,<br>which are used for aligning in the flip-chip bonding stage. The picture<br>also shows the contacts designed for the active DWELL detector<br>array, and the test detectors that are accessible directly using direct<br>outputs on the PAD ring. The ring around the FPA provides three<br>rows of substrate contacts, which are short circuited using metal4. | 75 |
| 4.14 | a) A picture of the dewar, which is modified for testing of the multispectral classification MSC-ROIC at cryostat condition, and b) the internal breadboard inside the dewar and the modification diagram.                                                                                                                                                                                                                         | 76 |
| 5.1  | An iconic representation of a compressive sensing setup with a bolometer for the sensor, and a micro-mirror array to implement the projection [55, 49]                                                                                                                                                                                                                                                                             | 79 |
| 5.2  | Block diagram of the individual pixel bias tunable readout integrated circuit, and the CTIA-based unit-cell at the extended view. The extra circuitry added to the CTIA-based unit-cell enables setting independent bias voltages for each pixel while the previous integrated voltage is being read out                                                                                                                           | 81 |
| 5.3  | a) A cross-section of the n+/nwell/psub photodetector. b) The measured photoresponse of the n+/nwell/psub photodetector as a function of the applied bias voltage at different illumination levels. c) The measured photocurrents, which are normalized to one. | 82 |

| 5.4  | a) Switch level implementation of iROIC unit-cell. The unit-cell is composed of 15 transistors, and three capacitors. b) The video switches, and c) the row/column select peripherals                                                                                                                                                                                       | 83  |
|------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----|
| 5.5  | Demonstration of the normalized responsivity of the system to a uniform illumination level. The transfer function gives a hint of how the system reacts to the modulation of the detector's bias. | 85  |
| 5.6  | a) A sample growth structure of DWELL infrared detector, the detector that is used as a base in designing current iROIC to ease the transition to multispectral imaging in the next generation of the hardware, and b) a sample spectral response of DWELL detector as a function of the applied bias voltage. c) Iconic demonstration of a DWELL FPA hybridized over ROIC. | 86  |
| 5.7  | A microphotograph of the fabricated ROIC, the row and column select, and the test devices. The unit-cell is shown in the extended view. The total area of the fabricated chip is $5140\,\mu m \times 5140\,\mu m$. | 87  |
| 5.8  | a) A photo of the old setup. b) The PCB board designed for the CS project that is connected to an FPGA board for the timing signals. c) The schematic of the designed PCB board for the CS project. | 88  |
| 5.9  | The new characterization system developed for the IPBT-ROIC embeds the image grabber inside the chip                                                                                                                                                                                                                                                                        | 89  |
| 5.10 | A block diagram of the experimental setup, which includes a Raspberry Pi board as the main controller of the system, an ADC to grab the readout of the imager, and a DAC to set the bias voltage of the detectors. All communication between the controller and a remote machine is over SSH. | 90  |

| 6.1 | Four images that are taken using the iROIC camera in normal mode: a) phantom, b) a cell, c) some rice grains, and d) UNM logo. | 93 |
|-----|-----|-----|
| 6.2 | a) Original white-matter image used for imaging. b) Image taken using iROIC with a uniform biasing for all pixels, where some of the pixels are saturated due to the high intensity. c, d, e, and f) The same scene imaged using proper biasing for different areas that normally are at the noise floor of the imager. | 95 |
| 6.3 | a) Block diagram of DCT coding, and b) decoding and inverse DCT coding. | 96 |
| 6.4 | a) The result of imaging a white paper with uniform biasing, while the illumination is not uniform. Defects and other sources of nonuniformity also contribute to the variation across the image. The stack of three graphs demonstrates (I) camera output image, (II) illumination contour, and (III) a 3D view of the intensities. b) Another white paper is imaged with the same illumination condition using the implemented nonuniformity correction. The graph has the same scale as part (a) and the legend in the middle is for part (II). Figures c) and d) show the histogram for the measured results of parts (a) and (b), respectively. | 100 |

| 6.5 | Acquisition and compression processes, which include mapping $k$ mask matrices to their corresponding bias voltages. The mapping is based on the system's transfer function shown in Fig. 5.5. Then, the bias matrices that are sitting in the Raspberry Pi memory are loaded to the imager and projected to the object's reflection function. The resultant dot product is optionally summed up in the hardware, and the $k$ resulting coefficients are sent to the remote computer for reconstruction. | |
|-----|-----|-----|
| 6.6 | The resulting images reconstructed using a) naïve DCT, b) least-mean-square error based DCT, and c) compressive sensing. d) The performance of different methods is compared in terms of the mean square error between the reconstructed image and the original image. | 108 |

# List of Tables

| 3.1 | Comparison between different configurations for the pre-amplifier used in an imager. Due to the need for good bias control, high injection efficiency, and sufficient charge storage, we have selected the CTIA configuration for iROIC. | 28 |
|-----|-----|----|
| 4.1 | Measurements over the standalone unit-cell using a Spartan-3E for the generation of the timing signals. The measurements are done using a Sourcemeter 236, and the current is applied using a Sourcemeter 2400. | 68 |
| 4.2 | Specifications of the chip designed for on-chip multispectral classifications. | 71 |

## Chapter 1

## Introduction

### 1.1 Background and Motivation

Film-based cameras long dominated the photography industry, but digital cameras have steadily replaced them. A solid-state camera, which relies only on electrons traveling short distances and has no moving parts, is faster, less expensive, more reliable, and smaller. However, it took many years and a great deal of effort to arrive at this revolutionary technology [1].

The key milestone behind this now-ubiquitous technology dates to 1947, when the effort of John Bardeen and Walter Brattain on the fourth floor of Bell Labs in Murray Hill, NJ, to reduce the effect of surface states led to the discovery of the transistor [2]. The invention of the solid-state rectifier in 1874, although that point-contact device saw only limited production, also qualifies as a great achievement for the electronics industry [3]. Nowadays a digital camera can be found in almost every electronic device, including cell phones, security cameras, industrial cameras, and smart watches [4, 5].

Speed, resolution, and other design criteria of a camera depend on its target application. The resolution can vary from one pixel to millions of pixels, and the speed can reach a few hundred frames per second [1, 6].

Despite the differences between cameras designed for different applications, all digital cameras are composed of two distinct parts: the sensing part and the readout part. The sensing part converts photons to electron-hole pairs. The readout part applies the proper bias voltage to the sensor, integrates or samples the generated charge, converts it to a voltage, and passes it to the output circuit [7, 8, 9, 10].

The design of the sensing part of a digital camera is a function of the desired spectral response, speed, and parameters like quantum efficiency and biasing scheme. The sensors can be integrated with the readout part on the same silicon chip or hybridized to the chip using flip-chip bonding technology. Among the different spectral regions, infrared can reveal information that cannot be seen when we confine ourselves to the visible spectrum. Thermal heat loss inspection of a building, power plant boiler fuel gas leak detection, product inspection, driver vision enhancement, and pipeline leak detection are only a few of the countless infrared applications [10, 11, 12, 13]. Three of these applications are shown in Fig. 1.1.



Figure 1.1: Three example applications of infrared imaging, a) driver vision enhancer, b) product inspection, and c) thermal heat loss inspection of a building.

The readout part, on the other hand, varies considerably with the target speed, the dynamic range, and the nature of the information being captured. The ever-increasing demand for high-speed, high-resolution image sensors requires transmitting or storing a large amount of data, which in turn consumes considerable power and adds cost for developing and maintaining the system. The growing big data paradigm poses a great challenge for the efficient storage, transmission, and analysis of very large data sets [14]. At the same time, in certain imaging modalities it is possible to relax the need for big data through intelligent selection of samples in the acquisition process, followed by a smart reconstruction algorithm. Examples are compressive-sensing (CS) based magnetic resonance imaging (MRI) and computed tomography (CT), where the scan time has been remarkably reduced [15, 16].

The conventional imaging system suffers from inefficient data transmission, added latency, and large power consumption, which are undesirable in real-time and medical applications [17, 18, 16, 19]. If the data can be compressed during the acquisition phase itself, rather than by post-processing and compressing big data afterwards, the result is a saving of power and time and a more efficient use of the hardware resources [20, 21]. Compressive image acquisition relaxes the requirements dictated by the Nyquist theorem and gives flexibility in representing information with fewer projection coefficients, under the assumption that the original signal is sparse in some domain [22]. Instead of reading the intensity values sampled at the individual pixels, a compressive-sampling sensor loads a set of orthogonal gain matrices into the pixel array, and the image sensor output is directly the inner product between each gain matrix and the sampled image [14].
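The inner-product readout described above can be illustrated with a short numerical sketch. It is not the chip's implementation: the 8 x 8 scene, the choice of a 2-D DCT basis for the gain matrices, and the function names `dct_basis` and `project` are all assumptions made for illustration, and the reconstruction is a plain linear sum over the orthonormal masks rather than a sparsity-promoting compressive-sensing solver.

```python
import numpy as np

def dct_basis(n):
    """Return the n x n orthonormal DCT-II matrix (rows are basis vectors)."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    basis = np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    basis[0, :] *= np.sqrt(1.0 / n)
    basis[1:, :] *= np.sqrt(2.0 / n)
    return basis

def project(image, masks):
    """One coefficient per gain matrix: the inner product <mask, image>."""
    return np.array([np.sum(m * image) for m in masks])

n = 8
d = dct_basis(n)

# Hypothetical smooth scene that is sparse in the 2-D DCT domain.
image = 5.0 * np.outer(d[0], d[0]) + 2.0 * np.outer(d[1], d[2])

# Orthonormal gain matrices built from the lowest 3 x 3 DCT frequencies.
pairs = [(u, v) for u in range(3) for v in range(3)]
masks = [np.outer(d[u], d[v]) for u, v in pairs]

coeffs = project(image, masks)                      # what the sensor would output
recon = sum(c * m for c, m in zip(coeffs, masks))   # linear reconstruction

print(len(coeffs), "coefficients instead of", image.size, "pixels")
print("max reconstruction error:", np.max(np.abs(recon - image)))
```

Because the assumed scene lies entirely within the span of the nine masks, nine coefficients are enough to recover all 64 pixel values; a real scene would only be approximately sparse, and the number of masks would trade off against reconstruction error.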

To address the big data problem, we propose an efficient system that shifts from the tradition of grabbing and transmitting raw images to compressed-domain image acquisition. This method reduces acquisition time, which is one of the critical requirements for real-time systems [23]. The common image grabbing technique is shown in Fig. 1.2.

Inspired by what the human eye transfers to the optic nerve [24], what we need to store or transmit is concise, sparse data that, depending on the application, may correspond to different types of classes. Processing the image information inside the chip and sending out only the abstract information is the main idea behind this project [23]. This scheme assumes that individual pixels carry independent information and that their content does not correlate with what they will hold in the next frame. Although these assumptions are not ideal, they simplify the on-chip image processing techniques to a great extent. Figure 1.3 shows a schematic of the system with the intelligent readout integrated circuit (iROIC) that we propose for integrated acquisition and processing inside a single chip.

The proposed scheme is beneficial in terms of cost, because it removes the need for extra image processing units, and it also consumes less power and introduces less delay than the conventional post-processing scheme [23].



Figure 1.2: Conventional imaging, storage, and processing scheme.

### 1.2 Our proposed in-pixel imaging schemes

Typically, a ROIC consists of a two-dimensional array of unit-cells, each of which is responsible for reading and integrating the photocurrent of one photodetector in the array. To access all pixels, a row decoder and a column decoder address individual pixels and enable switches that transfer the sampled data to a video amplifier at the output. A block diagram of the image sensor is shown in Fig. 1.4. A row/column driver provides the signal strength needed to drive all the pixels in a row or column. In the iconic block diagram shown in Fig. 1.4, the timing signals are generated outside the ROIC. However, generation of the timing signals can also be embedded in the same chip as the ROIC. The operating point of the ROIC can also be hard coded inside the chip, but the ability to set it from outside enables fine-tuning of the operating point in the field.

In the schematic shown in Fig. 1.4, we assume the chip outputs an analog video signal. Alternatively, an analog-to-digital converter (ADC) can be embedded in every pixel, or a high-speed global ADC can be laid out inside the chip, to output digital information. While digital information can be directed to a general-purpose DSP for processing and for extracting information of different kinds, the big data problem is still not solved in a digital ROIC when dealing with megapixel imaging systems. As a result, digital ROICs are not scalable [25].

Figure 1.3: A system level block diagram of the intelligent readout integrated circuit we proposed for on-chip image acquisition and classification.

To address the big data problem, we propose two distinct methods for on-chip imaging systems, shown in Figs. 1.5(a) and (b). In the first pixel-level image processing hardware, demonstrated in Fig. 1.5(a), we add some basic computation components to the unit-cell to extract the spectral features of the object in the front-end of the image sensor [26, 27, 28]. We also propose a max identifier component that can identify the type of the object among the database that the algorithm has been programmed for. In the next three chapters, we show the potential of this readout integrated circuit to implement multispectral classification when it is used in conjunction with the quantum dot-in-a-well infrared photodetector [29, 30].



Figure 1.4: A system level block diagram of the intelligent readout integrated circuit we proposed for on-chip image acquisition and classification.

In the second in-pixel imaging system, shown in Fig. 1.5(b), the unit-cell has the extra feature of controlling the gain of individual pixels while it acquires the photocurrent. The fact that the unit-cell's gain can be programmed per pixel and can vary dynamically over time enables a number of features, such as region of interest enhancement, nonuniformity correction, and compressive sampling. We explain the details of this ROIC in Chapters 5-7. Below we outline the two main contributions of this dissertation [31].

#### 1.2.1 Contributions of "In-pixel Multispectral Classification" scheme

The focus of this scheme is in-pixel image processing with the intention of reducing the multispectral output data. The design, modeling, and characterization of the chip have been done as a part of this dissertation. The main contributions of the work presented in this report are as follows:

Figure 1.5: a) Readout integrated circuit proposed for hyperspectral classification. b) Readout integrated circuit for on-chip nonuniformity correction and compressive sensing.

- 1. The unit-cell is designed, simulated, fabricated, and tested to work with DWELL photodetectors. DWELL sensors are low quantum efficiency, high dark current devices that show their best performance at cryogenic temperatures. Comprehensive simulation, careful layout, and extensive post-layout simulation over the extracted data ensure the functionality of the designed unit-cell at both room and cryogenic temperatures.
- 2. The unit-cell is designed to cover the large voltage swing that is needed for a DWELL photodetector. It is also capable of applying both positive and negative bias voltages to the sensor and in this way utilizes the asymmetric bandgap structure of the DWELL sensors, which produces the spectral shift arising from the quantum-confined Stark effect (QCSE) [32]. The photocurrent can be integrated in both positive and negative directions, which allows charging or discharging the integrator capacitor. It is also possible to change the bias voltage during the integration and add to or subtract from what has been integrated previously.
- 3. The integrated photocurrent in the unit-cell can be either stored in the sample-and-hold (S&H) capacitor, or compared with what is already stored in the S&H capacitor and selectively transferred to it. In this way the unit-cell can recognize the type of the object in front of it and report the spectral feature that is stronger. The number of comparisons is unlimited in principle, so it is possible to extend the classification algorithm to larger databases without any extra hardware.
- 4. In an analog chip, signal integrity is of the highest importance because there is no noise margin (like the one available for digital signals) to protect the signal. The careful customized design of the chip keeps the noise low. During the design, all the metal wires are shielded with a ground metal. A wide and extensive power grid extends all the way over the chip to ensure the minimum possible IR drop across the chip and guarantees the highest uniformity in the response of all the pixels.

- 5. The Process Design Kit (PDK) of the foundries did not come with Electrostatic Discharge (ESD) protection for analog chips. ESD protection is of vital importance for the designed chip because it has to go through extra fabrication processes, namely indium bump deposition, flip-chip bonding to a DWELL FPA, substrate polishing, and testing. To protect the chip from electrostatic discharge, an efficient ESD protection circuit has been modeled and designed for the fabricated chip. Comprehensive testing confirms that the designed protection circuit is functional.
- 6. Input/Output pins are the best (if not the only) way to access the functions of a chip. Having more IOs means the ability to implement more functions or to add extra test-points in the chip. We have customized the PADs with the goal of minimizing their pitch. In this way we could implement many test cells that are helpful during debugging.
- 7. Precise control over the timing signals is of vital importance in a readout integrated circuit. Concurrent programming in an FPGA environment provides fine resolution for the events that must happen when implementing a multispectral imaging system. In this project, we have implemented a standalone operating system that sits in the FPGA memory and provides the timing signals for both the intelligent ROIC and the image grabber.
- 8. Tanner-EDA tools, a software package well suited to small and mid-size projects, are generally not supported by the foundries, and usually no process design kit (PDK) is provided for Tanner L-Edit users. We developed our own DRC and extraction commands for Tanner L-Edit, and comparison against the PDK released for Cadence confirms the functionality of our work.

9. Testing a chip while it is bonded to a DWELL FPA requires working at cryogenic temperature, which means a dewar customized for the design under test is needed. For commercial ROICs a dewar is usually designed by the company. For this project we have customized the internal daughter board of a dewar from SEIR to provide the connections needed for our design.
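Contributions 2 and 3 above (bipolar integration followed by a selective sample-and-hold) can be pictured with a small behavioral sketch. This is a simplified numerical model, not the circuit: the photocurrent values, the per-class weight vectors, and the names `integrate` and `classify_pixel` are hypothetical, and the comparator is modeled as an ideal greater-than test.

```python
import numpy as np

def integrate(photocurrents, weights, dt=1.0):
    """Bipolar integration: positive weights charge, negative weights discharge."""
    return float(np.sum(np.asarray(weights) * np.asarray(photocurrents)) * dt)

def classify_pixel(photocurrents, class_weights):
    """Keep the largest integrated feature, as a selective sample-and-hold would."""
    held_value = -np.inf   # value on the (modeled) S&H capacitor
    held_class = None
    for label, weights in class_weights.items():
        value = integrate(photocurrents, weights)
        if value > held_value:   # comparator decides whether to transfer
            held_value, held_class = value, label
    return held_class, held_value

# Hypothetical photocurrents of one pixel at three bias steps (arbitrary units).
photocurrents = [1.0, 0.2, 0.5]

# Hypothetical signed weights, one vector per class in the database.
class_weights = {
    "class_a": [1, -1, 0],
    "class_b": [0, 1, -1],
    "class_c": [-1, 0, 1],
}

label, value = classify_pixel(photocurrents, class_weights)
print(label, round(value, 3))
```

Adding a class to the database only adds one more pass through the loop, which mirrors the point that more comparisons need no extra hardware in the pixel.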

#### 1.2.2 Contributions of "In-pixel Compressive Sensing" scheme

The main focus of this part of the project is to design the firmware needed to implement on-chip compressed-domain image processing. The main contributions of this work are as follows:

- 1. In this design the unit-cell benefits from the capacitive trans-impedance amplifier (CTIA) structure with the extra feature of having an analog memory inside every pixel that can individually control the gain of every pixel.
- 2. The individual pixel bias tunability of the ROIC makes it a strong candidate for algorithms that rely on controlling individual pixels, such as nonuniformity correction, automatic gain correction, and compressive sampling.
- 3. The timing signals are generated using a Raspberry Pi (RPI) board, which supports a high-speed SPI communication protocol to control an analog-to-digital converter (ADC) and a digital-to-analog converter (DAC). The RPI board generates the timing signals and, at the same time, controls a DAC chip to generate the analog bias for different pixels during the readout. It also drives an ADC to read and store the video signal generated by the ROIC. The SD card memory on the RPI board provides a large amount of space to store both the bias voltages that are to be loaded to the board and the grabbed images.

- 4. Prototyping a chip usually means working either with breadboards, which put signal integrity at risk, or with a PCB, for which it is hard to foresee all possible scenarios. To resolve this issue we have designed a PCB that mounts on the Raspberry Pi and hosts an open cavity LCC chip carrier. The board offers all the components needed to test a ROIC, such as opamps, level shifters, buffers, and biasing circuits. The board is designed with some degree of customizability, so that it can be used for a number of ROICs.
- 5. The versatile design of the chip along with the flexible test setup has enabled many applications, including operation in standalone mode, in-chip region of interest enhancement, silicon-level nonuniformity correction, and compressed-domain image acquisition techniques like compressive sensing.
- 6. Functional validation of the designed ROIC with PN junction photodetectors, together with the wide input/output dynamic range of the pixels, is a promising proof of concept for next-generation multispectral spatio-temporal imaging.
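As an illustration of how per-pixel gain control supports nonuniformity correction (contribution 2 above), the following sketch derives a gain map from one flat-field frame and applies it to a second frame. It assumes an idealized linear pixel model in which the gain can be set directly; in the actual ROIC the gain is set through the detector bias, so a mapping from desired gain to bias voltage would sit in between. The array size, response spread, and function name `gain_map_from_flat_field` are hypothetical.

```python
import numpy as np

def gain_map_from_flat_field(flat_frame):
    """Per-pixel gains that equalize the response to a uniform scene."""
    flat = np.asarray(flat_frame, dtype=float)
    return flat.mean() / flat

rng = np.random.default_rng(0)

# Hypothetical fixed-pattern response of a 4 x 4 array (unitless multipliers).
pixel_response = rng.uniform(0.7, 1.3, size=(4, 4))

# Calibration step: image a uniform target and derive the gain map.
flat_frame = pixel_response * 100.0
gains = gain_map_from_flat_field(flat_frame)

# Imaging step: a second uniform scene at a different level.
scene = np.full((4, 4), 40.0)
raw = pixel_response * scene    # uncorrected output shows the fixed pattern
corrected = gains * raw         # per-pixel gain removes it

print("raw spread:      ", raw.std())
print("corrected spread:", corrected.std())
```

Under this linear model a single calibration frame is sufficient; offset nonuniformity or a nonlinear gain-versus-bias curve would require additional calibration points.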

### 1.3 Publications

Below are the publications during the course of my PhD studies.

1. G. Fiorante, P. Zarkesh-Ha, J. Ghasemi, and S. Krishna, "Spatio-temporal tunable pixels for multi-spectral infrared imagers," 2013 IEEE 56th International Midwest Symposium on Circuits and Systems (MWSCAS), pp. 317-320, 2013.
2. J. Ghasemi, P. Zarkesh-Ha, G. Fiorante, and S. Krishna, "A new CMOS readout circuit approach for multispectral imaging," Photonics Conference (IPC), 2013 IEEE, pp. 592-593. IEEE, 2013.
3. M. M. Hossain, J. Ghasemi, P. Zarkesh-Ha, and M.M. Hayat, "Design, modeling, and fabrication of a CMOS compatible pn junction avalanche photodiode," IEEE Photonics Conference, Bellevue, Washington, 2013.
4. J. Ghasemi, P. Zarkesh-Ha, S. Krishna, S. E. Godoy, and M.M. Hayat, "A novel readout circuit for on-sensor multispectral classification," 2014 IEEE 57th International Midwest Symposium on Circuits and Systems (MWSCAS), pp. 386-389, 2014. IEEE.
5. J. Ghasemi, A. Neumann, S. Nezhadbadeh, X. Nie, P. Zarkesh-Ha, and S.R. Brueck, "A CMOS-compatible plenoptic sensor for smart lighting applications," in CLEO: 2015, OSA Technical Digest, Optical Society of America, paper STh1I.6.
6. A. Neumann, J. Ghasemi, S. Nezhadbadeh, X. Nie, P. Zarkesh-Ha, and S.R. Brueck, "CMOS-compatible plenoptic detector for LED lighting applications," Optics Express, vol. 23, no. 18, pp. 23208-23216, 2015.
7. J. Ghasemi, A. J. Chowdhury, A. Neumann, B. Fahs, M. Hella, S.R. Brueck, and P. Zarkesh-Ha, "A novel blue-enhanced photodetector using honeycomb structure," IEEE SENSORS 2015, pp. 01-04 Nov 2015, Busan, South Korea.
8. A. Kazemi, X. He, J. Ghasemi, S.H. Alaie, N.M. Dawson, B. Klein, K. Kiesow, D. Wozniak, T. Habteyes, S.R. Brueck, and S. Krishna, "Graphene nano-objects tailored by interference lithography," SPIE NanoScience+ Engineering, pp. 91680B-91680B. International Society for Optics and Photonics, 2014.
9. A. Kazemi, X. He, S.H. Alaie, J. Ghasemi, N.M. Dawson, F. Cavallo, T. Habteyes, S.R. Brueck, S. Krishna, "Large-area semiconducting graphene nanomesh tailored by interferometric lithography," Scientific Reports, vol. 5, p. 11463, doi: 10.1038/11463, 2015.
10. G. Fiorante, J. Ghasemi, P. Zarkesh-Ha, and S. Krishna, "Spatio-temporal bias-tunable readout circuit for on-chip intelligent image processing," IEEE Transactions on Circuits and Systems I: Regular Papers 63.11, pp. 1825-1832, 2016.
11. M. Bhattarai, J. Ghasemi, G. Fiorante, P. Zarkesh-Ha, S. Krishna, and M.M. Hayat, "Intelligent bias-selection method for computational imaging on a CMOS imager," Photonics Conference (IPC), IEEE, pp. 244-245. IEEE, 2016.
12. B. Fahs, A. Chowdhury, Y. Zhang, J. Ghasemi, P. Zarkesh-Ha, and M. Hella, "Blue-enhanced and bandwidth-extended photodiode in standard 0.35-μm CMOS," SENSORS, 2016 IEEE, pp. 1-3. IEEE, 2016.

### 1.4 Organization of the dissertation

The rest of this report is organized as follows. In Chapter 2 we discuss some background and general requirements for pixel domain image compression. We briefly discuss the design of a readout integrated circuit for infrared imaging in Chapter 3 and follow with the design of an "on-chip continuous time-varying biasing" algorithm proposed for multispectral classification in Chapter 4. In Chapter 5 we discuss some background and prior work in the area of sensor-level compression and then we propose a novel method for compressed-domain image acquisition. Different applications, including nonuniformity correction and compressive sensing, are discussed in Chapter 6 along with the experimental results. Finally, we outline conclusions and future work in Chapter 7.

# Chapter 2

# **In-Pixel Multi-Spectral Classification**

Multispectral imaging and classification is normally performed by utilizing a broadband detector with a set of narrow-band filters that are physically placed in front of the detector. As an example, Fig. 2.1 shows a multispectral camera, which internally uses a Sony XCD-SX900 CCD camera. The mechanical parts required for holding and switching the filters, the speed of switching, and the cost associated with such filters are limiting factors for this approach [33, 34].

To circumvent these drawbacks, our group presented a novel algorithm that performs multispectral imaging and classification by utilizing the continuous bias tunability of the dot-in-well (DWELL) infrared photodetector and exploits the possibility of real-time on-chip multispectral imaging for classification in the analog domain [26, 27, 28].

Figure 2.2 illustrates two images that are taken of the same object using a dual color focal plane array (FPA). As shown, because of the Stark effect, changing the bias voltage of the DWELL infrared photodetectors can shift their spectral response from Medium Wavelength Infrared (MWIR) to Long Wavelength Infrared (LWIR), and in this way the infrared photodetectors reveal features that were not visible at the other bias voltage. The pictures have been sampled using a conventional readout integrated circuit [35], and the bias voltage for all the pixels across the FPA is the same in each case [36].

### 2.1 Sparse imaging

The traditional way of multispectral imaging is to take images at different frequencies, send the spectral information to post-processing units, and implement the classification algorithm on a remote machine. This method of imaging, which is also demonstrated in Fig. 2.3(a), suffers from a number of issues:

1. Processing the captured image data in separate hardware introduces extra latency to the classification process. This is not within the constraints of a real-time application.
2. Because a large number of subsystems are involved in the imaging and classification, the power consumption of the system is high, and for the same reason the system suffers from extra costs.
3. The bandwidth of the transmission medium is also limited, which limits the frame rate of the imager. A classified image requires less data transmission per frame, which reduces the impact of the bandwidth limitation for the same frame rate.

Figure 2.1: A multispectral camera that employs seven bandpass filters to separate different bands on the left and an iconic representation of the internal block diagram on the right [33].

As mentioned in the previous chapter, our alternative approach, shown in Fig. 2.3(b), is to integrate the image classification with data acquisition inside each individual pixel and so develop an on-chip multispectral classification system. In this way, instead of converting the raw analog information to digital and sending it to a post-processing classification unit outside the chip, we implement the processing and classification in the analog domain, inside the readout chip. The output of the readout integrated circuit is therefore the abstract data, that is, the concise multispectral classified information [23, 30, 37].

Figure 2.2: Two pictures taken using the same DWELL infrared FPA. The bias of the detectors is uniform across the FPA; in each picture, the bias voltage is selected to optimize the responsivity at MWIR (left) or LWIR (right) [36].

## 2.2 Bias-tunable photodetector for multi-spectral classification

In this section, we review the main algorithms and hardware we employed to realize the on-chip multi-spectral imaging presented in Fig. 2.3(b). The new readout integrated circuit presented for multispectral classification is composed of all the elements needed to continuously tune the bias of the DWELL infrared photodetector to implement multi-spectral classification. These elements include high-voltage swing, time-varying positive and negative bias tunability, bipolar integration, and selective sample-and-hold circuits.

Figure 2.3: a) Conventional ROIC-based multispectral classification imaging system, which requires extra processing units to extract the class identification of each object, and b) the ROIC we proposed for this application, which integrates the acquisition and classification in the same chip.

#### Chapter 2. In-Pixel Multi-Spectral Classification

Figure 2.4(a) depicts the grown structure of a single DWELL photodetector, and Fig. 2.4(b) shows the impact of the applied bias voltage on the spectral response of the DWELL detector. However, the spectral response of the DWELL photodetector at each bias voltage is 1-2 µm wide and overlaps significantly with the spectral responses at other bias voltages. This creates the need for a spectral-tuning algorithm that delivers narrow-band, non-overlapping spectral filters.



Figure 2.4: a) A sample growth structure of the DWELL photodetector grown at the Center for High Technology Materials (CHTM), and b) bias-voltage-dependent spectral responses of the DWELL photodetector, showing the potential for multispectral sensing. Pictures adapted from [26].


## 2.3 Spectral-tuning algorithm

The feature selection algorithm ideally requires a set of narrow-band, non-overlapping filters [38]. To address the spectral overlap, the spectral-tuning algorithm reported in [26] forms a weighted superposition of photocurrents obtained at different biases. The weights are chosen so that the superposition optimally estimates (in the least-squares sense) the ideal narrow-band photocurrent. The result is similar to using a broadband detector to probe the same target of interest through a physical narrow-band spectral filter. Figure 2.5 depicts the idea behind the weighted superposition algorithm, which implements an ideal narrow-band filter algorithmically [26].

Figure 2.5: Demonstration of the weighted superposition algorithm, which implements the ideal narrow-band filter by means of a linear combination of wide-band overlapping filters [26].
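As a minimal sketch of the least-squares weighting described above (in our own notation, ignoring noise, and not necessarily the exact formulation of [26]), let $r_i(\lambda)$ be the spectral response at bias $V_i$, $i = 1, \dots, N$, and let $f(\lambda)$ be the desired narrow-band filter. The weights minimize the squared approximation error:

$$\mathbf{w}^{*} = \arg\min_{\mathbf{w}} \int \Big( f(\lambda) - \sum_{i=1}^{N} w_i \, r_i(\lambda) \Big)^{2} d\lambda$$

Sampling the spectra on a wavelength grid, with the responses stacked as the columns of a matrix $\mathbf{R}$ and the target as a vector $\mathbf{f}$, gives the familiar closed form (assuming $\mathbf{R}$ has full column rank):

$$\mathbf{w}^{*} = (\mathbf{R}^{T}\mathbf{R})^{-1}\mathbf{R}^{T}\mathbf{f}$$

The synthesized narrow-band photocurrent is then $\hat{I} = \sum_{i=1}^{N} w_i^{*} I_i$, where $I_i$ is the photocurrent measured at bias $V_i$.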

Our group also developed a refinement of the algorithm that identifies a minimal set of only four biases, which is enough to sense only the spectral information relevant to a specific remote-sensing application of interest. For the purpose of this thesis, the application of interest is the classification of three types of rocks: granite, hornfels, and limestone. From our previous experiments [33], we know that these rocks can be correctly classified by computing the synthesized feature vector, which is the linear combination of the incoming test photocurrents with the optimal pre-computed sets of weights (one set for each rock type).

Because the set of weights is optimally matched to the spectra of each rock type, the feature component with the maximum value determines the assigned class [33]. Based on our previous results, the minimal set of bias voltages is $[-3.0, -0.8, +1.0, +2.8]$ V, and the three weight vectors (one for each type of rock) are $W_1 = [+15, -109, +32, +10]$, $W_2 = [+24, -63, -5, -8]$, and $W_3 = [+11, +3, -128, +24]$. The hardware implementation of this algorithm requires a processing unit to multiply each photocurrent by each of the weights. Based on the algorithm discussed above [33], Fig. 2.6 illustrates a schematic of a unit-cell proposed for rock-type separation.
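The classification rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bias voltages and weight vectors are the ones quoted in the text, but the function names, the neutral labels `W1` to `W3` (the text does not state which vector belongs to which rock), and the sample photocurrents are hypothetical.

```python
from typing import Dict, Sequence, Tuple

# Bias voltages (volts) and weight vectors quoted in the text.
BIASES_V: Tuple[float, ...] = (-3.0, -0.8, 1.0, 2.8)
WEIGHTS: Dict[str, Tuple[float, ...]] = {
    "W1": (15, -109, 32, 10),
    "W2": (24, -63, -5, -8),
    "W3": (11, 3, -128, 24),
}


def feature_vector(photocurrents: Sequence[float]) -> Dict[str, float]:
    """Weighted superposition: one dot product per weight vector."""
    if len(photocurrents) != len(BIASES_V):
        raise ValueError("expected one photocurrent per bias voltage")
    return {
        name: sum(w * i for w, i in zip(weights, photocurrents))
        for name, weights in WEIGHTS.items()
    }


def classify(photocurrents: Sequence[float]) -> str:
    """Return the label of the largest feature component."""
    features = feature_vector(photocurrents)
    return max(features, key=features.get)


# Hypothetical photocurrents (arbitrary units), one per bias in BIASES_V.
sample = (1.0, 0.2, 0.5, 0.8)
print(feature_vector(sample))
print(classify(sample))
```

Each class needs four multiplications and one summation per pixel, which is the arithmetic that the unit-cell hardware has to provide.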

The block diagram in Fig. 2.6 confirms that, compared to a conventional ROIC, the only extra blocks needed to classify the rock type are a multiplication unit and a summation unit, both of which can be implemented in hardware.

Figure 2.7 shows a circuit that can be employed for the weighted superposition part of the classification algorithm. Although the circuit would offer good linearity and real-time response, it is composed of more than 41 transistors and additional capacitors. Unfortunately, this unit requires so much unit-cell area that, in practice, it is impossible to lay out the circuit within the limited area of the pixel.

Another refinement of the spectral-tuning algorithm [28] embeds the multiplication (by weights) and the addition in the photocurrent integration process by appropriately adjusting the bias scheme of the DWELL continuously in time. In the next chapter, we discuss the hardware design and implementation of the continuous time-varying biasing approach reported in [30]. In the refined algorithm, instead of integrating for a constant integration time $\tau$ and then multiplying by $\omega_1$, $\omega_2$, $\omega_3$, and $\omega_4$, the weights are embedded in the integration time, so that the new integration times are $\omega_1 \tau$, $\omega_2 \tau$, $\omega_3 \tau$, and $\omega_4 \tau$.

In this way, using the improved integration scheme, the weight-multiplication block can be safely removed and the hardware is greatly simplified. Figure 2.8(a) shows the concept of the earlier weighted superposition algorithm, and Fig. 2.8(b) depicts the refined scheme of the algorithmic weighted superposition.

Figure 2.6: A preliminary high-level schematic of the unit-cell we propose for the rock-type classification.
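The equivalence between the two schemes can be checked numerically. The sketch below assumes that the photocurrent is constant during each integration interval and that a negative weight is realized by reversing the integration polarity (bipolar integration); the function names and the sample currents are hypothetical.

```python
import math
from typing import Sequence


def weighted_sum_charge(currents_a: Sequence[float],
                        weights: Sequence[float],
                        tau_s: float) -> float:
    """Earlier scheme: integrate each current for tau, then multiply by w."""
    return sum(w * (i * tau_s) for i, w in zip(currents_a, weights))


def time_weighted_charge(currents_a: Sequence[float],
                         weights: Sequence[float],
                         tau_s: float) -> float:
    """Refined scheme: integrate each current for |w| * tau.

    Assumption: the sign of a weight sets the integration polarity,
    because an integration time cannot be negative.
    """
    total = 0.0
    for i, w in zip(currents_a, weights):
        polarity = math.copysign(1.0, w)
        total += polarity * i * (abs(w) * tau_s)
    return total


def total_integration_time(weights: Sequence[float], tau_s: float) -> float:
    """Total time spent integrating in the refined scheme."""
    return sum(abs(w) for w in weights) * tau_s


w1 = (15, -109, 32, 10)                   # weight vector W_1 from the text
currents = (1e-9, 2e-10, 5e-10, 8e-10)    # hypothetical photocurrents (A)
tau = 1e-6                                # hypothetical base time (s)

print(weighted_sum_charge(currents, w1, tau))
print(time_weighted_charge(currents, w1, tau))
print(total_integration_time(w1, tau))
```

Both functions return the same accumulated charge, but the refined scheme integrates for $\sum_i |\omega_i| \tau$ in total rather than $4\tau$, which is the price paid for removing the multiplier.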



Figure 2.7: A transistor level circuit implementation of the weighted superposition algorithm. Because the circuit is composed of 41 transistors, it is not likely to fit in the limited area of the unit-cell.

## 2.4 Conclusions

In this chapter, we briefly outlined an algorithmic spectrometer based on the bias-dependent spectral response of the DWELL detector. The spectral tunability of DWELL infrared detectors stems from the quantum-confined Stark effect (QCSE) [39], and the algorithmic spectrometer observes the object repeatedly with the DWELL detector at different operating bias voltages. The algorithm provides a set of optimal voltages at which the photodetector must be biased and a set of corresponding optimal weights by which the sampled photocurrents must be multiplied. There is one set of weights for each wavelength of interest.

In the second stage, an improvement to the algorithm embeds the weights in the integration time, which removes the need for a hardware multiplier block and significantly simplifies the circuit. While the space limitation of the unit-cell rules out a bulky multiplier and/or adder, the improved algorithm simplifies the hardware at the cost of a longer integration time.

Figure 2.8: Demonstration of the weighted superposition algorithm versus the refined integration scheme, which embeds the weights in the integration time and eliminates the need for the multiplication block.

In the next chapter, we discuss the main building blocks of a ROIC for multispectral imaging, and in Chapter 5 we propose an implementation of the "Continuous time-varying biasing approach for spectrally tunable infrared detectors" [40] algorithm, which we briefly reviewed in this chapter.

# Chapter 3

# Design of ROIC for Infrared Imaging

A readout integrated circuit capable of real-time multi-spectral imaging has captured the attention of many research groups around the world [23, 41, 42]. A ROIC typically consists of a two-dimensional array of unit-cells, each of which is responsible for reading/integrating the photocurrent of one photodetector in the array and converting the integrated/read value to a voltage. The ROIC also has to bias the detector during the integration. To access all the pixels, a row decoder and a column decoder raster-scan the individual pixels and enable switches that transfer the sampled data to a video amplifier at the output.

Of the two intelligent ROICs introduced in Chapter 1, one is designed to work with an infrared photodetector such as the DWELL, which can be tuned to different spectral regions by modulating its bias voltage. The other is intended to work in the visible region, with the exploration of multispectral imaging planned for another fabrication run.

Therefore, from a circuit point of view, the designed ROIC must satisfy the following requirements:

- 1. The multispectral imaging algorithm imposes the requirement to apply bias voltages in the range -5 to +5 volts to the DWELL photodetector. Therefore, the readout circuit must be able to provide a minimum swing range of 10 volts.
- 2. The ROIC must also offer enough storage for the charge injected by the detector; otherwise, the dynamic range of the imager would be very low.
- 3. Keeping the low quantum efficiency of DWELL detectors in mind, the injection efficiency of the circuit must be high so that the circuit collects all the charge injected by the detector.
- 4. The circuit must offer fine control over the bias voltage applied to the detector so that it can support the spectral-tuning algorithm discussed in the previous chapter.
- 5. The imager also must provide the support for both positive and negative bias voltages, which is needed by the DWELL detector.

The rest of this chapter explains in detail the design and implementation of the most important building blocks of a readout circuit for multispectral applications.

## 3.1 The unit-cell

The unit-cell is the main component of a ROIC. It provides the proper bias voltage to the photodetector, integrates the photocurrent over the defined integration time, and samples and holds (S&H) the integrated photocurrent on an S&H capacitor.

Table 3.1 compares the most common configurations of pre-amplifiers that are in-use for different applications.

Table 3.1: Comparison of different pre-amplifier configurations used in imagers. Because good bias control, high injection efficiency, and sufficient charge storage are needed, we selected the CTIA configuration for the iROIC.

| Structure | Injection efficiency | Detector bias | Power dissipation | Pixel area | Charge storage |
|-----------|----------------------|---------------|-------------------|------------|----------------|
| SF        | Low                  | No control    | Low               | Small      | Very low       |
| DI        | Moderate             | No control    | Low               | Small      | Low            |
| BDI       | High                 | Good          | High              | Large      | Moderate       |
| GMI       | Moderate             | Moderate      | Moderate          | Small      | Moderate       |
| CTIA      | High                 | Good          | High              | Large      | Moderate       |

For the purpose of this thesis, because high injection efficiency, large voltage swing, and high charge storage are of vital importance, we selected the capacitive trans-impedance amplifier (CTIA) configuration for the preamplifier. In this way, the circuit also delivers high linearity and a large dynamic range, which best fit the design criteria for our real-time in-pixel image processing ROIC.

An example schematic of a conventional CTIA unit-cell is shown in Fig. 3.1. As illustrated in the figure, the unit-cell is composed of an integrator, a S&H capacitor and a source-follower transistor to buffer the charge that is stored in the S&H capacitor in the form of voltage. Two sets of analog multiplexers, one for row and the other for column, make the electrical connections between every pixel and the video amplifier.

## 3.2 Process technology

Due to the large-swing bias voltage required by the DWELL photodetector, a high-voltage 0.35 µm CMOS (CL035-DDDD) process technology node from TSMC was selected for the implementation of this project. The CL035-DDDD process technology supports two poly layers and four metal layers. While the four metal layers are enough for wiring all the different signals across the ROIC, the two poly layers make it possible to implement inter-poly capacitors, which are a must in every analog design. The poly-inter-poly (PIP) structure enables large capacitors that can be used to compensate the two-stage differential operational amplifier or to store analog values on a sample-and-hold capacitor. Many other process technologies do not provide this type of capacitor and force the designer to use metal-insulator-metal (MIM), metal-oxide-metal (MOM), or metal-oxide-semiconductor (MOSCAP) capacitors, which suffer from low capacitance density, consume metal layers, or are too non-linear.



Figure 3.1: Schematic of a conventional CTIA preamplifier, S&H switch, S&H capacitor, and the multiplexers used for the readout.

## 3.3 Design of peripherals

Using the CL035-DDDD technology, the designer can choose between low-voltage (3.3 V) and high-voltage (15 V) devices. In terms of area, the feature size of the low-voltage devices is 350 nm, whereas the gate length of the high-voltage devices must be at least 1.5 µm. In our design, we used a combination of low-voltage and high-voltage devices to minimize the power consumption where possible and to save area. The high-voltage transistors are well suited to providing the large bias voltage needed by the QDIP devices, integrating and amplifying the photocurrent, and switching the analog signals on and off. The low-voltage transistors, on the other hand, are ideal for implementing the timing signals, where they offer higher density and lower power.

While the ability to combine high voltage and low voltage devices in the same design offers significant improvement in terms of area and power, isolating different power domains is critical. The extra space that is needed to implement the guard ring is the reason we avoided over-mixing the voltage domains.

#### 3.3.1 Row/column select

Acquiring the image requires addressing all the individual pixels sequentially and reading the data out while the pixels are scanned. There are two main methods to implement the raster-scan circuit:

1. Addressing the pixels using an array of DFFs that are connected serially to form a shift register, and shifting a "1" through all the pixels to select them sequentially. Figure 3.2 demonstrates a possible implementation of this method.
2. Implementing separate shift registers for rows and columns. In this way, at each instant only one pixel, located at the crossing of the selected row and column, is selected.

In this project, we used the second method, and the row/column selection circuits are based on chaining DFFs in series. The reason for choosing the row/column selection method is that all the transistors in the unit-cell have to be high-voltage devices. This means that if the DFF were part of the unit-cell, as in the first method, around 22 extra transistors would have to fit in the $60 \,\mu\text{m} \times 30 \,\mu\text{m}$ area of the unit-cell, which is impossible.



Figure 3.2: A possible solution to raster scan the pixels using a chain of DFFs embedded in pixels. In this figure, for the sake of demonstration, the pixel array is composed of 64 pixels, which are distributed over 8 rows and 8 columns.

Our primary criterion is to build the unit-cell from high-voltage transistors only (rather than mixing low-voltage and high-voltage devices); otherwise, the required clearance between the two voltage domains would strain the area constraints. Additionally, the extra guard ring that might be needed to protect the analog part from the switching noise injected by DFFs switching at high frequency could consume too much of the limited area. Furthermore, when the DFF is integrated into the unit-cell, a higher number of transistors (all the transistors that are part of the DFFs inside the unit-cells) switch at the pixel clock, which introduces additional crosstalk into the wiring of the design.

#### 3.3.2 The DFF

Figure 3.3 shows a schematic of the DFF used in this project, which is composed of two latches connected in a master-slave configuration. Figure 3.4 shows a sample waveform illustrating the operation of the DFF.

Figure 3.3: A schematic of the DFF used for the row/column select part of the iROIC project.

To save area, all the DFFs are laid out using low-voltage transistors, and the output of each shift register is connected to a level shifter, which translates the signal level to 15 V and also improves the driving capability of the signal so that it can drive all the pixels in a row or a column.

In a digital system-on-chip (SoC) integrated circuit composed of millions of standard cells, a formal characterization process extracts and reports the input capacitances of all the cells in the library. Various delay parameters are also reported as functions of the power supply voltage and the output load. The synthesis tool then uses the reported information to translate the circuit from a behavioral description to a gate-level implementation. The synthesis tool goes through a time-intensive algorithm to pick the combination of gates that results in the best speed, area, and power performance. Many constraints can also be defined in Synopsys Design Constraints (SDC) scripts.

Figure 3.4: A sample waveform demonstrating the functionality of the DFF designed for the row/column selection of the ROIC.

In this project, however, because we are dealing with a mixed-signal circuit with multiple supply voltages, and because of the tight restrictions on the floor plan, a synthesis tool would have a limited choice of components from a library. Therefore, instead of relying on a mixed-signal modeling language such as Verilog-AMS, all the analog cells are manually laid out and placed in the chip. The DFF is designed to deliver the best performance in terms of speed, power, and area. In the simulation, the target load considered for each DFF is the sum of the input capacitance of the next DFF in the shift register and the input capacitance of the level translator.

The following process was used to characterize the DFF:

• **Clock input capacitance:** The clock input capacitance of the DFF shown in Fig. 3.3 is measured by connecting a 10 $\text{k}\Omega$ resistor to the clock input of the DFF and measuring the time constant of the RC circuit formed by the resistor and the capacitance seen at the clock input. The measured low-to-high delay ($t_{PLH}$) and high-to-low delay ($t_{PHL}$) for the clock input are as follows:

$$t_{PLH} = 27.3 \ \text{ps}$$
$$t_{PHL} = 15.2 \ \text{ps}$$

Therefore, the input delay is:

$$t_{p-clk} = \frac{t_{PLH} + t_{PHL}}{2} = 21.25 \ \text{ps}$$

and the nominal clock input capacitance is:

$$C_{in-clk} = \frac{21.25 \ \text{ps}}{0.69 \times 10 \ \text{k}\Omega} = 3.1 \ \text{fF}$$

Because the clock input is driven from outside of the chip, an amplifier with sufficient driving capability could be employed to operate the circuit at the nominal working frequency.

• **Reset input capacitance:** The same method is used to characterize the input capacitance of the reset pin, and the extracted numbers are as follows:

$$\begin{array}{rcl} t_{PLH} &=& 46.4 \ \text{ps} \\ t_{PHL} &=& 15.2 \ \text{ps} \end{array} \quad \Rightarrow \quad t_{p-reset} = \frac{t_{PLH} + t_{PHL}}{2} = 30.8 \ \text{ps}$$

Therefore, the nominal reset input capacitance is:

$$C_{in-reset} = \frac{30.8 \ \text{ps}}{0.69 \times 10 \ \text{k}\Omega} = 4.46 \ \text{fF}$$

Although the input capacitance of the reset pin is almost 1.5 times that of the clock pin, the reset pin is not supposed to work at high frequency, so the extra capacitance does not limit the operating frequency of the circuit.

• **Data input capacitance:** Because the data input pin is driven by the output of the previous DFF, a low capacitance at the data input is extremely important. In our design, the data input is connected to only two transistors, one NFET and one PFET of minimum size, which is 2.36 µm in width and 1.5 µm in length.

With these transistors at the minimum size, the measured delay and capacitance at the data input pin are as follows:

$$\begin{array}{rcl} t_{PLH} &=& 32.7 \ \text{ps} \\ t_{PHL} &=& 15.2 \ \text{ps} \end{array} \quad \Rightarrow \quad t_{p-data} = \frac{t_{PLH} + t_{PHL}}{2} \approx 23.9 \ \text{ps}$$

Therefore, the nominal data input capacitance is:

$$C_{in-data} = \frac{23.9 \ \text{ps}}{0.69 \times 10 \ \text{k}\Omega} = 3.5 \ \text{fF}$$

The same method was used to characterize the capacitance at the output of the DFF and to measure the clock-to-Q (C2Q) delay of the flip-flop. In the characterization, the only point to take into account is to load the output of the DFF with the capacitance that will be connected to it under the worst-case operating condition.
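The capacitance extraction used for the three input pins can be reproduced with a short script. This is a sketch of the arithmetic only, with delays taken in picoseconds: it relies on the relation $t_p = 0.69\,RC$ for the 50% delay of an RC circuit and on the 10 kΩ series resistor mentioned above; the function name is hypothetical.

```python
from typing import Dict, Tuple

LN2_FACTOR = 0.69  # approximation of ln(2) used in the text


def input_capacitance_f(t_plh_ps: float, t_phl_ps: float,
                        r_ohm: float = 10_000.0) -> float:
    """Estimate the pin capacitance (farads) from the average RC delay."""
    t_p_s = 0.5 * (t_plh_ps + t_phl_ps) * 1e-12  # average delay in seconds
    return t_p_s / (LN2_FACTOR * r_ohm)


# Measured (t_PLH, t_PHL) pairs in picoseconds, as quoted in the text.
pins: Dict[str, Tuple[float, float]] = {
    "clock": (27.3, 15.2),
    "reset": (46.4, 15.2),
    "data": (32.7, 15.2),
}

for name, (t_plh, t_phl) in pins.items():
    c_ff = input_capacitance_f(t_plh, t_phl) * 1e15  # convert F to fF
    print(f"{name}: {c_ff:.2f} fF")
```

The results agree with the values quoted above to within rounding.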

#### 3.3.3 The row/column-select shift registers

The row/column select circuit is a shift register made by chaining DFFs together, as shown in Figure 3.5. The RST signal resets all the DFFs and initializes the row/column select blocks for scanning the row/column pixels. Every pulse on the CLK input shifts the content of the shift register by one bit. The first, extra DFF generates the initial pulse that is shifted into the previously reset shift register and makes the one-row/column-at-a-time selection possible.

When wiring the column-select shift register, the CLK and RST inputs are connected to the pixel-clock and line-sync signals, respectively, so that the active column selection changes with every pulse on the pixel clock and the line-sync moves the scan to the next line. The CLK and RST inputs of the row-select shift register, on the other hand, are wired to the line-sync and frame-sync signals, respectively, so that the active row selection changes with every pulse on the line-sync and the frame-sync resets the readout. A sample waveform generated by the row/column-select shift registers is shown in Figure 3.6.
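The scan order produced by this wiring can be illustrated with a small behavioral model. The sketch below is not the transistor-level design: it assumes that a reset loads a "1" into the extra first DFF and clears all other stages, and that the line-sync both clocks the row register and resets the column register. The class and function names are hypothetical.

```python
from typing import List, Optional, Tuple


class OneHotShiftRegister:
    """Behavioral model of a DFF chain with one extra seed flip-flop."""

    def __init__(self, length: int) -> None:
        self.length = length
        self.reset()

    def reset(self) -> None:
        # Assumption: reset sets the extra first DFF and clears the rest.
        self.seed = 1
        self.stages = [0] * self.length

    def clock(self) -> None:
        # Every clock pulse shifts the content by one bit.
        self.stages = [self.seed] + self.stages[:-1]
        self.seed = 0

    def selected(self) -> Optional[int]:
        return self.stages.index(1) if 1 in self.stages else None


def raster_scan(rows: int, cols: int) -> List[Tuple[Optional[int], Optional[int]]]:
    """Return the (row, column) pairs in the order they are selected."""
    row_sr = OneHotShiftRegister(rows)
    col_sr = OneHotShiftRegister(cols)
    order = []
    row_sr.reset()                 # frame-sync resets the row register
    for _ in range(rows):
        row_sr.clock()             # line-sync clocks the row register ...
        col_sr.reset()             # ... and resets the column register
        for _ in range(cols):
            col_sr.clock()         # pixel clock selects the next column
            order.append((row_sr.selected(), col_sr.selected()))
    return order


print(raster_scan(2, 3))
```

At any time exactly one row and one column are active, so only the pixel at their crossing is connected to the output.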

#### Metastability

While it is very tempting to push the limits and design the system for the highest operating frequency, designing for reliability is very important. Any metastability in the DFFs could cause the whole system to fail. The design must be carefully evaluated against the reliability metrics to make sure that it meets all the timing requirements.

In a digital system, the setup and hold times define the minimum timing window during which the data signal must remain constant before and after the active edge of the clock. Process variation can affect the timing, and the power distribution across the die can worsen the operating condition and shrink the timing window, which must be taken into account [43].

To make sure that logic "0" and logic "1" can be safely sampled by the DFF and passed to the next stage, the circuit must stay clear of setup and hold time violations. A violation of these timing parameters could result in latching erroneous data.

Figure 3.5: The DFFs chained to make the row-select and column-select circuits.

Metastability, which occurs when a signal is sampled before it has settled to a valid logic level, is a potential disaster. Once a DFF enters the metastable condition, the probability that it remains in that state decays exponentially with time; in principle, however, the unit may remain in the metastable condition indefinitely. To reduce the likelihood of this condition, the signal must be allowed to propagate completely to the destination sampling point before the target DFF samples its value. The mean time between failures (MTBF) of a flip-flop is inversely proportional to the operating frequency:

$$MTBF \propto \frac{1}{f_{clk}} \tag{3.1}$$

Therefore, the reliability of the system decays exponentially with the ratio of the operating time to the MTBF, as follows:

$$Reliability = e^{-\frac{Time}{MTBF}} \tag{3.2}$$

Figure 3.6: Demonstration of a sample waveform generated by a chain of DFFs, which is used for the row-select and column-select circuits.
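The effect of the clock frequency on reliability can be illustrated numerically. The sketch below uses the standard exponential model $R(t) = e^{-t/MTBF}$, in which reliability decreases with operating time, together with the inverse proportionality between MTBF and clock frequency stated in Eq. (3.1). The function names and all numerical values are hypothetical and are not measurements from this design.

```python
import math


def scale_mtbf(mtbf_ref_s: float, f_ref_hz: float, f_new_hz: float) -> float:
    """Scale a reference MTBF to a new clock frequency (MTBF ~ 1/f_clk)."""
    return mtbf_ref_s * f_ref_hz / f_new_hz


def reliability(time_s: float, mtbf_s: float) -> float:
    """Probability of no failure up to time_s for a given MTBF."""
    return math.exp(-time_s / mtbf_s)


# Hypothetical reference point: an MTBF of 10 years at a 1 MHz clock.
SECONDS_PER_YEAR = 365.0 * 24.0 * 3600.0
mtbf_1mhz = 10.0 * SECONDS_PER_YEAR
mtbf_4mhz = scale_mtbf(mtbf_1mhz, 1e6, 4e6)

for label, mtbf in (("1 MHz", mtbf_1mhz), ("4 MHz", mtbf_4mhz)):
    print(label, reliability(SECONDS_PER_YEAR, mtbf))
```

Quadrupling the clock frequency divides the MTBF by four, which lowers the one-year reliability in this hypothetical case.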

The ideal way to approach this problem is Monte Carlo simulation. The Monte Carlo simulation uses the wafer-to-wafer and die-to-die process variations measured by the fabrication house (TSMC in this case) to find the worst-case timing condition that the circuit might face.

To verify the functionality of the row-select and column-select circuits in the desired operating condition, a back-of-the-envelope estimate relies on the setup time $t_{su}$, the hold time $t_{hold}$, the clock-to-Q delay of the flip-flop $t_{c2q}$, and the propagation delay of the path between flip-flops $t_{p,comb}$. The estimate requires that the hold time satisfy:

$$t_{hold} \le t_{c2q} + t_{p,comb} \tag{3.3}$$

However, because the circuit must work at temperatures as low as cryostat conditions (70 K) and up to 320 K, the equation above must hold at different temperature/voltage (TV) corners otherwise the circuit may fail at the operating condition.

#### 3.3.4 The level translators

As mentioned before, the row/column select shift registers are implemented using low-voltage devices. However, the unit-cell is constrained to the power supply needed by the DWELL infrared photodetectors. A low-voltage signal at the output of the row/column selection circuit may not even reach the threshold level of the switches in the unit-cell. The level shifter translates the voltage levels so that the signals can drive these switches. In addition to translating the selection signals from 3.3 V to 15 V, the level translators also provide the proper driving strength, so that all the switches in the active row or column can be driven by the corresponding level translator. Figure 3.7 depicts the schematic of the level-translator block. The four transistors on the left are all of minimum, fixed size. Because their load is minimal, their reliable operation is guaranteed. The sizes of the transistors in the next two stages are selected based on the load they have to drive, which is the cumulative input capacitance of all the switches connected to each selection signal.

#### 3.3.5 The output amplifier

The output amplifier plays a critical part in the performance of the image sensor. The quality of the image depends on how well the output amplifier delivers the analog signal and how much noise this stage adds. There are many different configurations for the output amplifier. One option is a two-stage differential amplifier, which is well suited when the amplitude of the signal must be changed, since the opamp provides the required gain. On the other hand, the opamp needs to be stabilized using an internal or external compensation capacitor.

Figure 3.7: The low-to-high level translator circuitry and the row/column driver.

In this project, because the output amplifier only has to buffer the video signal and no amplification is needed, we chose the simplest possible design, which is a single-stage amplifier that works as a buffer. In this way, we drive the capacitance associated with the PAD and the input capacitance of the image grabber, and at the same time we avoid the complications of designing a more sophisticated amplifier. The schematic in Fig. 3.8 presents the transistor-level implementation of the output amplifier.

#### 3.3.6 ESD protection

Electrostatic discharge (ESD) protection has a critical role in every electronic integrated circuit. The primary function of the ESD protection is to bypass accidental electrostatic charge to the VDD or VSS power rail and to protect the sensitive internal circuits that can be damaged by the discharge.

Figure 3.8: The switch-level circuit diagram of the output amplifier. The voltage $V_{Bias}$ is fed from outside the chip, so it can be adjusted according to the operating temperature or the nominal power supply to provide the best performance in the field.

Many different structures are available for the ESD protection circuit, and most process design kits (PDKs) come with a few ready-to-use, pre-laid-out ESD protection designs. However, most of the available designs cover the needs of digital circuits and are not suitable for analog inputs/outputs. Therefore, in this project we designed the ESD protection devices ourselves and verified their functionality through simulations and experiments.

The ESD protection used in this project consists of a set of wide NMOS and PMOS transistors with their gates connected to the bulk and the power rail, respectively. This configuration ensures that the ESD transistors are off in normal operation; when an external component tries to pull the pin above VDD or below VSS, the protection clamps the input like a forward-biased diode.

An important special case is pins that must work at high speed, which have very low tolerance for load capacitance. Pins that connect to current sources should be excluded from the ESD protection; otherwise, because of the leakage current of the ESD circuitry, it is not possible to determine the exact current delivered to the actual device under test. Figure 3.9(a) shows the ESD protection designed in this project. Our design is based on the ESD PAD frame provided in the TSMC 0.35 µm technology PDK.

As the schematic suggests, because the gate-source terminals of the transistors are short circuited, both of the transistors are off in the normal operation, and the current passing through the transistors are limited to their leakage current. As soon as the voltage on the pin tries to go over the power rail (VDD) or below the bulk voltage (VSS), either the PMOS or the NMOS will start to conduct and will form to a diode configured MOSFET in forward biased. This will direct the surge current

or any other external sources to the voltage rail; instead of the transistors inside the circuit. All the layers are redesigned to fit higher number of PADs around the chip and as a result having access to higher number of test-cells.

We had some chips wire-bonded by MOSIS, and we were advised to keep a minimum pitch of 130 µm between PADs to make in-house wire bonding possible. Figure 3.9(b) shows the equivalent circuit of Fig. 3.9(a).

#### 3.3.7 Optical leakage

Figure 3.9: a) A schematic of the ESD protection circuit used in this project, consisting of an NMOS and a PMOS transistor whose gates are shorted to their sources so that they are normally off, and b) the equivalent circuit, which is two reverse-biased diodes.

A common mistake in the design of a light sensor is to assume that only the part of the semiconductor intended to react to light is photosensitive, and that the rest of the circuit will operate normally because it is not illuminated. This mistake can lead to a wrong measured value for the responsivity or quantum efficiency of a detector or, much worse, can cause the whole chip to fail in the field. As an example, we ran some measurements to characterize the responsivity of a silicon-based photodetector. The setup was composed of a monochromator as the light source, which can scan the wavelength with a resolution of 0.1 nm, and a source-meter to measure the photocurrent at each wavelength. The monochromator was coupled to the detector using a multi-mode optical fiber with a 1 mm core.

Initially, we measured the optical power density at the output of the fiber, and then we measured the photocurrent of the detector under illumination from the monochromator. Using the customized setup we built for this measurement, we repeated the characterization 10 times to reduce the statistical noise, measuring the optical power and photocurrent at each wavelength. We calculated the responsivity using the following equation:

$$R(\lambda) = \frac{I_{Ph_{avg}}(\lambda)}{P_{avg}(\lambda) \times A_{PD}} , \tag{3.4}$$

where  $R(\lambda)$  is the responsivity of the detector at wavelength  $\lambda$  and  $P_{avg}(\lambda)$  is the average optical power density reaching the active area of the detector. The area of the detector is denoted by  $A_{PD}$ , and  $I_{Ph_{avg}}(\lambda)$  is the average measured photocurrent of the detector resulting from the incident  $P_{avg}(\lambda)$ . Because we use a monochromatic illumination source that scans over wavelength, $R$,  $P_{avg}$ , and  $I_{Ph_{avg}}$  in the above equation are functions of the wavelength. The subscript *avg* on $P$ and  $I_{Ph}$  indicates that the measurement was repeated 10 times and averaged to reduce the statistical noise. Physics sets an absolute upper limit on the responsivity of any detector, given by the following equation:

$$R_{ideal}(\lambda) = \frac{q}{hc}\lambda\tag{3.5}$$

This equation gives the responsivity, denoted by  $R_{ideal}(\lambda)$ , that is obtained if all the incident optical power is converted to electron-hole pairs, if the semiconductor is perfect with no defects, if all the electron-hole pairs are generated in the depletion region, and if we can collect all of them. To confirm that light falling outside the intended active area contributes to the photocurrent, we built another setup using a narrow laser beam as the light source. We developed a LabVIEW program to control the position of the laser pointer using a stepper with a precision of 0.1 µm. As the laser pointer crossed the edge of the detector, the LabVIEW program measured the photocurrent. Our initial expectation was to see a step function. However, we could see some photocurrent even when the laser pointer was 100 µm away from the edge of the detector. This observation leads us to define an effective collection area that is determined by the optical penetration depth of the photons and the time constant of the diffusing carriers. Figure 3.10 demonstrates the result of scanning the laser pointer over the edge of the detector. The issue discussed above has led designers to consider an optical window over the detector to optimize the responsivity [44, 45].
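The relationship between a measured responsivity and the ideal limit of Eq. (3.5) can be sketched numerically. The snippet below is a minimal illustration, not the analysis code used in this work: the function names are hypothetical, and the measured responsivity is computed as the average photocurrent divided by the average optical power incident on the nominal detector area, which gives A/W, the same units as Eq. (3.5).

```python
# Physical constants in SI units
Q_E = 1.602176634e-19    # elementary charge, C
H_PLANCK = 6.62607015e-34  # Planck constant, J*s
C_LIGHT = 299792458.0    # speed of light, m/s


def ideal_responsivity(wavelength_m):
    """Eq. (3.5): upper bound on responsivity, in A/W."""
    return Q_E * wavelength_m / (H_PLANCK * C_LIGHT)


def measured_responsivity(photocurrents_a, power_densities_w_m2, area_m2):
    """Responsivity at one wavelength from repeated readings.

    photocurrents_a and power_densities_w_m2 hold the repeated
    measurements (10 repetitions in the text); both are averaged first.
    """
    i_avg = sum(photocurrents_a) / len(photocurrents_a)
    p_avg = sum(power_densities_w_m2) / len(power_densities_w_m2)
    return i_avg / (p_avg * area_m2)


def apparent_quantum_efficiency(responsivity_a_w, wavelength_m):
    """Ratio of a measured responsivity to the ideal limit."""
    return responsivity_a_w / ideal_responsivity(wavelength_m)
```

For a detector without internal gain, an apparent quantum efficiency above one cannot come from the nominal area alone, so such a result would suggest that carriers are also being collected from outside it.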



Figure 3.10: Demonstration of the leakage current, which is a result of photons getting to the side of the detector.

#### 3.3.8 Optical isolation

When designing a circuit such as a readout integrated circuit that will be exposed to light, it is very important to keep in mind that every part of the silicon that is exposed to light will behave differently than when it is not illuminated. The light changes the density of electron-hole pairs and, as a result, the operating point of every transistor. Therefore, the parts that are not supposed to react to light must be properly blocked using metal layer(s), and if possible the metal blockage must extend beyond the edge of the diffusion by at least 100 µm; otherwise, some photons can reach the junction and modulate the conductivity and channel properties.

A good example of the application of this blockage is the ESD protection. In this project, all four metal layers offered by TSMC CL035-DDD are used to block the incident light. At the same time, the metal layers are used in the form of a grid to distribute power and ground around the chip. Figures 3.11(a) and (b) demonstrate the optical blockage over the ESD protection devices made using metal layers. The optical leakage blockage over the unit-cell is shown in Fig. 3.11(c). Note that the circuit is intended to work in the infrared region and a focal plane array (FPA) is going to be fabricated over the chip; therefore, the small openings between the metal layers should not be an issue.

Additionally, because the readout circuit is going to work in the infrared region, where the responsivity of silicon is lower than in the visible region, the effect of infrared photons that might pass the FPA and reach the silicon is limited. However, if the readout circuit is intended to work in the visible region, and if the FPA is integrated in silicon and is part of the ROIC, the optical leakage must be carefully taken into consideration. Otherwise, the optical power will change the operating point and the performance will be compromised.

#### 3.3.9 PAD

Figure 3.11: a) A picture of the geometry of different metal layers in the layout of the ESD protection, b) the net area of the ESD protection that is covered by metal, shown in black, and c) individual metal layers over the unit-cell on the left as well as the net area covered by metal on the right.

The PAD is a critical part when it comes to bringing signals into and out of any chip. There are two main aspects to consider in designing every PAD: 1) mechanical and 2) electrical properties. The mechanical consideration recommends stacking all the metal layers on top of each other and connecting them using a sufficient number of VIAs (if there is no thick metal layer reserved for PADs). Paying no attention to this consideration might cause the PAD to peel off during wire-bonding, which results in failure of the chip in the lab.

The PAD must also be large enough to make the wire-bond and packaging of the chip possible. However, in order to work at higher frequencies, the parasitic capacitance must be as low as possible, which translates to having the smallest possible geometries for the PAD.

The PAD we designed in this project is based on TSMC CL035-DDDD PDK. However, we redesigned almost all the layers and VIAs to match the design rules of the new process design kit.

## 3.4 Post-silicon validation

To implement multispectral classification supported by the spectral tuning algorithm discussed earlier in this chapter, the readout integrated circuit fabricated at the foundry has to be processed and prepared for flip-chip bonding to the DWELL FPA. The first step is to test the chip electrically and make sure that it is functional and meets the specifications it was designed for. In this step, before flip-chip bonding, a number of specifications can be verified: for example, that the row select and column select are working, and that the chip is not failing because of a short circuit between the power rails or elsewhere in the chip that was not discovered by the extraction commands or not covered during functional verification. Any miss in the verification of the features could potentially lead to a bug, in which case either the chip has to be redesigned or there might be some room to fix it using focused ion beam (FIB) technology.

Regardless of where a failure originates (e.g., verification gaps in the design or fabrication defects), the designer must invest in the considerations required for test. To test the chip in the lab, two requirements must be met: 1) controllability and 2) observability. Controllability is the ability to set the value of different points of the design from the primary input pins, and observability is the ability to observe the value of any point of the design.

For a digital design there is a formal way of field-testing, which benefits from the fact that every node can hold a state of either "1" or "0". The synthesis tool converts the RTL design to its gate-level equivalent circuit, finds all the DFFs, and replaces them with scan flip-flops to create a shift register. In test mode, the state of the chained flip-flops can be changed using a bit-stream that is shifted into them from a primary test input, or their values can be read out using a primary output that is intended for the purpose of testing.
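The scan-chain idea described above can be illustrated with a small behavioral model. This is a minimal sketch with hypothetical class and method names; it models only the shift behavior in test mode, not a real synthesis flow or the capture of functional logic values.

```python
class ScanChain:
    """Behavioral model of flip-flops stitched into one shift register."""

    def __init__(self, length):
        # state[0] is the flip-flop nearest the scan input
        self.state = [0] * length

    def shift(self, bit_in):
        """One test-mode clock: shift by one and return the bit leaving the chain."""
        bit_out = self.state[-1]
        self.state = [bit_in] + self.state[:-1]
        return bit_out

    def load(self, pattern):
        """Controllability: shift a full pattern in so that state == pattern."""
        # The flip-flop farthest from the input must receive its bit first.
        for bit in reversed(pattern):
            self.shift(bit)

    def unload(self):
        """Observability: shift the current state out, filling with zeros."""
        captured = [self.shift(0) for _ in range(len(self.state))]
        # Bits leave farthest-first, so reverse to match the state ordering.
        return captured[::-1]
```

Loading a pattern and then unloading it returns the same bits, which is the basic property that makes every chained flip-flop both controllable and observable through only two pins.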

For an analog circuit, on the other hand, the state of different nodes is not limited to "0" and "1". In fact, each node at any point in time can have any value between VSS and VDD. In addition, to gain better insight into the design, for any node we must be able to measure the input/output capacitance, frequency response, swing range, slew rate, leakage current, and noise.

It is obvious that having test-points for every node of the design is not possible and the number of pins on the chip does not allow for having access to each point of the chip. Additionally, the circuit that is designed to set the state of a node or to sample the design will load the circuit and will compromise the functionality itself. Therefore, the trade-off is to select the optimum number of points to control/observe the functionality of the chip and still have enough insight to root-cause the problem.

Fortunately, the number of different cells that make up the ROIC is limited. Therefore, if we have a proper number of test-points to validate the functionality of all these different building blocks, we can verify the design functionally.

## 3.5 The EDA tools

The EDA tools employed in this project to design and simulate the chip are T-Spice and L-Edit from Tanner EDA. The Tanner EDA tools are a Windows-based software package that is fast, handy, and powerful. It can handle even a large design at the complexity of a full chip. The only issue with using these tools is the lack of support by foundries. Although the foundries usually do not provide Tanner users with a Process Design Kit (PDK), the H-Spice simulation library is completely compatible with T-Spice, and it satisfied our simulation needs without any need to modify the model files.

The missing components were the design rule checks (DRCs) and the extraction commands, which we developed at UNM. For extra verification, the final design was imported to Cadence Virtuoso to verify all the DRC rules. The Tanner EDA tools do have some limitations in the synthesis of RTL code. However, in the design of the readout integrated circuit in this project, we did not need the synthesis features, because the design has to be done manually. Recently, Tanner EDA was acquired by Mentor Graphics, and the tools gained some additional features that allow designers to use the DRC and extraction rules directly from the Calibre tool. This new capability gives designers a higher degree of confidence. Figure 3.12(a) depicts the layer selection window of Tanner L-Edit, and Fig. 3.12(b) shows a screen-capture of the readout integrated circuit that we designed, after it was imported to Cadence Virtuoso.

## 3.6 Image grabber

Figure 3.13(a) presents the waveform of the RS-170 video signal, the standard black-and-white video format. Because a single line carries both the timing signals and the video information, a combination of amplitude and frequency is used to mark the start of a new frame or a new line. The point to emphasize is that a receiver is sensitive to the level of the signals when marking an active line or frame [46].

Figure 3.12: a) A screen-capture of the layer selection window in Tanner EDA and an example asymmetric NMOS transistor from the TSMC CL035-DDD technology, which is used in this project, and b) a screen-capture of the chip designed for multispectral classification. The picture is taken after the design is imported to Cadence Virtuoso for the extra pre-silicon verification.

Similar timing signals are used in an image grabber, such as the NI PCI-1410, with the only difference that the timing signals are provided on separate lines. This should be considered when developing an image sensor and building the test circuit. The RS-170 timing requirements, or those of any other standard used by the image grabber, should be considered in developing the firmware. Figure 3.13(b) demonstrates the timing signals we implemented in the FPGA board to work with the NI PCI-1410.



Figure 3.13: a) Timing diagram used in the RS-170 standard for black-and-white video [46]. b) A sample waveform of the RS-170 timing signals. RS-170 is a protocol used in many image grabbers, including the NI PCI-1410, one of National Instruments' image grabbers, which is used in our lab.

## 3.7 Conclusion

In this chapter, we presented the basic requirements for the on-chip implementation of multispectral imaging. Then we outlined the main building blocks of an integrated circuit targeting infrared imaging. All the critical components, including the row/column select, the ESD protection, and the output amplifier, were discussed. Additionally, the EDA tools and the image grabber employed in this thesis were reported. We also described the process technology node in which the chip was fabricated, and outlined some important design criteria that have to be considered when the chip is to be exposed to light.

In the next chapter, we will use the foundation developed in this chapter to design a custom ROIC for "Continuous Time-varying Biasing in a Chip", which is used for on-chip multispectral classification.

## Chapter 4

# Continuous Time-varying Biasing in a Chip

In Chapter 2, we outlined the multispectral classification algorithm that is based on the bias voltage dependence and the spectral tunability of DWELL photodetectors. The proposed technique generates a linear combination of the DWELL's photoresponse taken at an optimal set of bias voltages to deliver a narrow-band spectral filter. In Chapter 3, we discussed the requirements for a readout integrated circuit that must work with an infrared photodetector such as the DWELL, which has low quantum efficiency, needs a large voltage swing, and requires a modulated bias. Then, we presented the main building blocks needed to read out an image from such a detector.

In this chapter, we extend the hardware implementation for the multispectral classification technique that was discussed earlier [40]. The hardware presented in this chapter is a readout integrated circuit optimized for multispectral classification (MSC-ROIC), which aims to output the class of the object that is being imaged by the hardware. The MSC-ROIC is optimized to implement multispectral classification in each pixel independently, and builds up a map showing the class at each point. To implement the multispectral classification, the circuit must be able to apply both positive and negative bias voltages to the photodetector. This is guaranteed by an opamp with negative feedback in the CTIA configuration, which ensures that the proper bias voltage is applied to the photodetector.

Based on the weighted superposition algorithm reviewed earlier, the hardware required to implement multispectral classification must follow the procedures below:

- 1. Apply the proper bias voltage to the DWELL photodetector. Using the CTIA configuration while the integration switch is open, the integration capacitor closes the opamp loop, and the negative feedback forces the negative input of the opamp to follow whatever voltage is applied to its positive input.
- 2. Integrate the photocurrent corresponding to the applied bias voltage for a specific time that is proportional to the first weight (i.e., multiplication is implemented by means of integration time).
- 3. Apply the second, third, and fourth bias voltages to the DWELL photodetector and integrate the corresponding photocurrents over time periods that are proportional to the associated weights.
- 4. Transfer the overall integrated photocurrent to the first S&H capacitor.
- 5. Repeat the procedure above for the second and third weight vectors and record the resultant charge in the second and third S&H capacitors.
- 6. Compare the integrated voltages and recognize the rock type based on the relative magnitude of the voltages for classification.

The process described above has to be repeated continuously so that the class of a new object is found whenever the scene changes.
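The procedure listed above can be summarized in a short behavioral model. The sketch below is illustrative only: the function names, the unit integration time, and the capacitor value are hypothetical parameters, the photocurrent at each bias is supplied by the caller, and the class is assumed to be the index of the largest integrated voltage.

```python
def superposition_voltage(photocurrents_a, weights, unit_time_s, c_int_f):
    """Net integrator output for one weight vector.

    Each bias step contributes I_k * |w_k| * unit_time_s of charge, and the
    sign of w_k selects the integration polarity.
    """
    charge = sum(i * w * unit_time_s for i, w in zip(photocurrents_a, weights))
    return charge / c_int_f


def classify(photocurrents_a, weight_vectors, unit_time_s, c_int_f):
    """Return (winning_index, voltages) over all weight vectors."""
    voltages = [
        superposition_voltage(photocurrents_a, w, unit_time_s, c_int_f)
        for w in weight_vectors
    ]
    best = max(range(len(voltages)), key=lambda k: voltages[k])
    return best, voltages


# Hypothetical photocurrents at four bias voltages and three weight vectors.
example_currents = [1.0e-9, 2.0e-9, 1.5e-9, 0.5e-9]
example_weights = [[2, 1, 3, -2], [-2, 1, 2, -3], [-1, 2, 2, -3]]
winner, volts = classify(example_currents, example_weights, 1e-3, 1e-12)
```

Because the weight enters only through the integration time and its sign, the model needs no multiplier hardware, which mirrors the motivation for implementing multiplication by integration time.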

## 4.1 Integration in dual polarity

The suggested optimal set of weight vectors for the spectral tuning of DWELL infrared detectors [40] requires multiplication of the photocurrent by both positive and negative weights. Negative integration is normally not available in a conventional preamplifier. A possible solution is to use two sets of current mirrors and then selectively multiply the injected photocurrent by either +1 or -1. However, current mirrors normally use wide transistors, which conflicts with the area constraint of the system.

As an alternative approach, the hardware must be able to integrate the photocurrent in both polarities. Figure 4.1(a) shows a revised CTIA that performs integration in both polarities. The four switches control the polarity of integration. When the switches labeled (1) are connected, as shown in Fig. 4.1(b), the photocurrent charges/discharges the capacitor, and when the switches labeled (2) are connected, as shown in Fig. 4.1(c), the polarity of integration is reversed. The reset functionality is provided by closing all four switches simultaneously.

Figure 4.1: A CTIA preamplifier capable of integration in both positive and negative polarities, using extra switches that can flip the integration capacitor. The capacitor is reset when all the switches are closed simultaneously.

#### 4.1.1 Design of the max-identifier

Applying three sets of weight vectors to the photocurrents while the photodetector is biased properly results in three different voltages that must be compared with one another to produce the class of rock that is being imaged by the sensor. Three S&H capacitors are needed to hold the superimposed values for the comparison, and each must be controlled independently using its own switch. Figure 4.2 depicts a possible configuration for the three S&H capacitors and the arbiter. The arbiter reports the type of the rock based on the relative magnitude of the integrated values.



Figure 4.2: A CTIA unit-cell capable of multispectral classification, showing the integration capacitor with the four switches that control the integration polarity, the three S&H capacitors, and the analog comparison block.

#### 4.1.2 The optimized unit-cell

To implement the block diagram above in silicon, about 35 transistors and five capacitors (one compensation capacitor for the opamp, one integration capacitor, and three S&H capacitors) would be needed. The arbiter would also need to be laid out in the unit-cell. Because of the large number of transistors and capacitors, the block diagram shown in Fig. 4.2 is not suitable for unit-cell designs. Instead, a revised version of the block diagram is proposed in Fig. 4.3, which uses less area than the one shown in Fig. 4.2 and offers better functionality. The revised unit-cell has only one compensation capacitor, one S&H capacitor, and an integration capacitor that also serves the S&H purpose.

Figure 4.3: Revised block diagram of the unit-cell proposed for multispectral classification.

In the revised block diagram, the first integrated sample, corresponding to the first weighted superposition, is transferred to the S&H capacitor. The second and third samples, while each is held in the integration capacitor, are compared against what is stored in the S&H capacitor. The integrated charge is transferred to the S&H capacitor only if it is larger than the stored S&H value; otherwise it is discarded. The comparator is located outside the unit-cell, where the new value stored in the integration capacitor is compared against the old value stored in the S&H capacitor. The proposed design has the extra benefit that the number of weighted superpositions is not restricted to three. Because the hardware keeps only the largest of the recent samples, it is practically capable of performing any number of comparisons. The S&H capacitor is updated under one of the following conditions:

- 1. After the integration of the first set of photocurrents corresponding to the first set of weight vectors. In this case, the S&H signal is asserted and the S&H capacitors in all the pixels are updated with their corresponding integrated photocurrents.
- 2. At the end of the integration corresponding to the second and the third set of weight vectors. The condition for this update is that the net integrated value is greater than what is stored in the S&H capacitor. To handle this situation, the comparison is done outside the pixels, and the pixels are updated sequentially as they are compared. The *RS.LD.CS* logic implements the selective update of the pixels. *RS.LD* is implemented per row, outside the pixel, and is combined with the *CS* signal in the unit-cell. A complementary CMOS implementation of *SH* + *RS.LD.CS* is shown in Fig. 4.4(a), which is composed of 12 transistors. The depicted circuit exhibits superior performance in terms of power and speed. However, to meet the area restriction of the unit-cell, we have used the circuit depicted in Fig. 4.4(b), which is composed of only four transistors with the same functionality.
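The two update conditions above amount to a running-maximum tracker gated by the *SH* + *RS.LD.CS* expression. The following is a minimal behavioral sketch, not a circuit model. It assumes that *RS* and *CS* are the row-select and column-select signals and that *LD* is the load decision produced by the external comparator; the text does not spell out the meaning of *LD*, so that reading is an assumption, as are the function names.

```python
def sh_switch(sh, rs, ld, cs):
    """Boolean form of SH + RS.LD.CS; returns 1 when the S&H switch closes."""
    return int(bool(sh) or (bool(rs) and bool(ld) and bool(cs)))


def track_maximum(samples):
    """Model one pixel keeping the largest weighted superposition.

    samples holds the integrated values in the order they are produced.
    Returns (held_value, index_of_winning_sample).
    """
    held, winner = None, None
    for index, value in enumerate(samples):
        first = index == 0
        # Condition 1: the global S&H pulse follows the first integration.
        sh = 1 if first else 0
        # Condition 2: the external comparator asserts LD (assumed meaning)
        # only when the new value exceeds the stored one.
        ld = 0 if first else int(value > held)
        # The pixel is assumed to be addressed (RS = CS = 1) when compared.
        if sh_switch(sh, rs=1, ld=ld, cs=1):
            held, winner = value, index
    return held, winner
```

Since only the current maximum is stored, the same loop works for any number of samples, which reflects the remark that the number of weighted superpositions is not restricted to three.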

The complete switch-level schematic of the unit-cell, the column select, and the switches for the output video signal is shown in Fig. 4.5. In this schematic, the opamp is a two-stage differential amplifier composed of eight transistors and one compensation capacitor. To meet the area restriction imposed by the pixel pitch of the FPA, all the transistors are at the minimum size dictated by DRC, which is W = 2.36  $\mu$m and L = 1.5  $\mu$m.

The bias current for the opamp and the operating point of the source follower are controlled from outside of the chip. A 10 to 1 current mirror is implemented to help with controlling and applying low currents in the range of 100 nA to 5  $\mu$ A.



Figure 4.4: Demonstration of the circuit controlling the S&H switch, which is triggered either using the S&H signal or when the arbiter decides that the recent integrated value is greater than the value already stored in the S&H capacitor.

Because the opamp output voltage starts from  $V_{DET-Com}$  after the reset, the reference is not zero. To provide a zero-volt reference for all the pixels during the readout, the input to the Q1 source follower is taken from N2 (rather than from the output of the opamp), as shown in Fig. 4.5. Switch S3 is connected to GND to set the reference for the integration capacitor.

The PMOS transistors connected to the outputs of the unit-cell are active loads that pull up the unit-cell's output. The operating point of these transistors is also set from outside the chip, which provides extra knobs to optimize the circuit for the target type of photodetector and/or operating temperature/voltage. One pull-up transistor and one output video switch are used per column.

Figure 4.6 depicts an example waveform demonstrating the operation of the circuit. In the waveform, the bias vector is [+3, +5, -4.5, 2] and the weight vectors are W1 = [+2, +1, +3, -2], W2 = [-2, +1, +2, -3] and W3 = [-1, +2, +2, -3]. Note that the weight vectors and bias voltages listed here are not the result of an optimization algorithm and are selected only for demonstration. Different regions of operation are labeled with R, P, N or S to indicate a *Reset*, a *Positive integration*, a *Negative integration* or an *S&H* transfer to the S&H capacitor, respectively.

The waveform consists of three resets of the integration capacitor, each performed by closing both the *Int\_Mode\_1* and *Int\_Mode\_2* switches simultaneously. Each reset is followed by modulating the detector bias through the four values listed above, with the integration time at each bias taken from the corresponding weight vector.
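The sequence of regions in such a waveform can be derived mechanically from the bias vector and the weight vectors. The sketch below is a simplified illustration using the demonstration values quoted above. The unit integration time is a hypothetical parameter, the function names are not from the original firmware, and the model places an S step after every weight vector without modeling the comparison that gates it.

```python
BIASES_V = [3.0, 5.0, -4.5, 2.0]
WEIGHTS = [[2, 1, 3, -2], [-2, 1, 2, -3], [-1, 2, 2, -3]]


def build_schedule(biases_v, weight_vectors, unit_time_s):
    """Return a list of (region, bias_v, duration_s) tuples.

    Regions use the labels from the text: R, P, N and S. Reset and S&H
    steps carry no bias or duration in this simplified model.
    """
    schedule = []
    for weights in weight_vectors:
        schedule.append(("R", None, None))
        for bias, w in zip(biases_v, weights):
            if w == 0:
                continue  # a zero weight contributes no integration
            region = "P" if w > 0 else "N"
            schedule.append((region, bias, abs(w) * unit_time_s))
        schedule.append(("S", None, None))
    return schedule


def region_string(schedule):
    """Compact view of the region labels, one letter per step."""
    return "".join(region for region, _, _ in schedule)


# Hypothetical unit time of 1 ms per unit of weight.
print(region_string(build_schedule(BIASES_V, WEIGHTS, 1e-3)))
```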




Figure 4.5: Switch level demonstration of the unit-cell proposed for multispectral classification and the switches for the video signal.

## 4.2 Test firmware

Figure 4.6: A sample waveform showing the different states the circuit traverses to implement the feature extraction algorithm.

To test the MSC-ROIC, several analog and digital signals must be driven by the test system. A MicroBlaze Development Kit, Spartan-3S1600E from Xilinx, was chosen to generate the required timing signals. This approach has a number of benefits, including its low cost and stand-alone operation. Additionally, the 50 MHz clock rate provides precise timing signals with sufficient resolution for the clock dividers. The  $2 \times 16$  LCD display of the board helps to display the internal state of the operating software written for the imaging system.

The FPGA-based control system is a flexible tool for signal generation as well as for FPA testing and characterization. Its features facilitate operation, offer flexibility in the connection and measurement of all signals, and improve the online visual analysis of the system's state. The main features include:

- Phase and duty-adjustable pulse generation for testing of specific blocks with multiple digital and analog inputs.
- Synthesized ramp-signal generation for analog response and clock feed-through analysis.
- Repetitive pulse train to access a specific pixel of the matrix for characterization and testing of the related analog circuitry.
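As an illustration of the phase- and duty-adjustable pulse generation listed above, the following sketch converts a pulse specification into counter values for the 50 MHz clock mentioned in the text, which corresponds to a 20 ns timing resolution. The function names and the counter scheme are assumptions made for illustration and do not describe the actual firmware.

```python
CLOCK_HZ = 50_000_000  # board clock rate stated in the text


def to_ticks(seconds, clock_hz=CLOCK_HZ):
    """Round a duration to the nearest whole number of clock cycles."""
    return round(seconds * clock_hz)


def pulse_edges(period_s, duty, phase_s=0.0, clock_hz=CLOCK_HZ):
    """Counter values at which a periodic pulse rises and falls.

    Returns (period_ticks, rise_tick, fall_tick) for a free-running
    counter that wraps at period_ticks.
    """
    if not 0.0 < duty < 1.0:
        raise ValueError("duty must be strictly between 0 and 1")
    period_ticks = to_ticks(period_s, clock_hz)
    high_ticks = to_ticks(period_s * duty, clock_hz)
    rise = to_ticks(phase_s, clock_hz) % period_ticks
    fall = (rise + high_ticks) % period_ticks
    return period_ticks, rise, fall
```

For example, a 200 ms integration window corresponds to `to_ticks(200e-3)`, that is, ten million clock cycles, so integration times of the scale used in the experiments can be set with very fine relative resolution.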

## 4.3 Experimental setup

We designed a custom PCB to test the chip and to deliver high signal integrity. The PCB was designed with enough reconfigurability to be adaptable to any future design. A picture of the PCB connected to the Xilinx FPGA board is shown in Fig. 4.7(a). A micro-photograph of the chip is shown in Fig. 4.7(b). The chip has  $32\times36$  PADs. However, the socket we used supports only  $25\times25$  PADs, so only the PADs that are meant to test the main chip are wire-bonded. The other PADs, which are connected to the test-chips, are left floating or are wire-bonded to the cavity of the chip-carrier to be grounded. The total dimension of the die is  $6052 \,\mu\text{m} \times 5452 \,\mu\text{m}$ .

Figure 4.7(c) shows a picture of the unit-cell that is designed in L-Edit. The dimension of each pixel in the designed readout circuit is  $60 \,\mu\text{m} \times 30 \,\mu\text{m}$ . The reason for having a nonsymmetric dimension for the unit-cell is to have a consistent pitch with the focal-plane arrays (FPAs) and also to have enough room to fit the large number of transistors and capacitors needed to implement the classification algorithm.



Figure 4.7: a) A picture from the experimental setup including the custom reconfigurable PCB designed to host the test-chip on the right and a Spartan-3E Development board, which is used to generate the timing signals on the left, b) a micro-photograph of the chip, which is wire-bonded to a  $25 \times 25$  socket and soldered to PCB, and c) a picture of the layout of the unit-cell.

## 4.4 Top view of the prototyped chip

A block diagram depicting the top view of the proposed readout integrated circuit for multispectral classification is shown in Fig. 4.8. The row/column decoders generate the timing signals to raster-scan all the pixels. The driver shifts the signal level and improves the driving capability, which in turn enables controlling all the pixels in a row/column. The driving strength is designed based on the net active capacitance at the inputs of the transistors that are connected to each individual signal.



Figure 4.8: Block diagram of the proposed readout circuit for multispectral classification.

## 4.5 Experimental results

The readout integrated circuit designed for multispectral classification was successfully tested in the lab. Below, we review three experiments that we conducted to validate the functionality of the MSC-ROIC.

• In the first experiment, a single standalone unit-cell is used to validate the weighted superposition algorithm discussed in the previous chapter. The standalone unit-cell is biased, and the timing signals are generated using the Spartan-3E evaluation board discussed earlier. We modulated the integration time in the positive and negative integration modes and measured the unit-cell's output using a Sourcemeter 236. The photodetector current is held constant during each experiment and is supplied by a Sourcemeter 2400. The measured values are reported in Table 4.1.

Each of the experiments was repeated 100 times, and the values, which are reported in Table 4.1, are the average of the measured values. The column labeled "Expected value" is what we are expecting based on the equation

$$Q = I \cdot \Delta t = C \cdot \Delta V , \tag{4.1}$$

which results in:

$$\Delta V = \frac{I \cdot \Delta t}{C} \,. \tag{4.2}$$

As seen in Table 4.1, there is an average of 15 of errors between the measured values and the one forecasted using equation (4.2). We believe the source for this deviation is charge sharing, nonlinearity of the transistors in the unitcell and different sources of noise in measurement tools. The value of the

Table 4.1: Measurements over the standalone unit-cell using Spartan-3E for the generation of the timing signals and the measurements are done using a Sourcemeter 236, and the current is applied using a Sourcemeter 2400.

#### $I_{PD} = 3\,\mathrm{nA}$

|               |        |        |                   | PDBias = -2 V     |       | PDBias = 0 V      |       | PDBias = +2 V     |       |
|---------------|--------|--------|-------------------|-------------------|-------|-------------------|-------|-------------------|-------|
|               | Int #1 | Int #2 | Expected<br>value | Measured<br>value | Error | Measured<br>value | Error | Measured<br>value | Error |
| Experiment #1 | 0 ms   | 200 ms | -3.02 V           | -2.7 V            | 9.0%  | -2.7 V            | 9.0%  | -2.8 V            | 7.0%  |
| Experiment #2 | 200 ms | 0 ms   | 3.02 V            | 2.7 V             | 12.0% | 3.1 V             | 4.0%  | 2.7 V             | 10.0% |
| Experiment #3 | 40 ms  | 280 ms | -3.62 V           | -4.1 V            | 12.0% | -3.9 V            | 7.0%  | -3.9 V            | 7.0%  |
| Experiment #4 | 280 ms | 40 ms  | 3.62 V            | 3.7 V             | 3.0%  | 4.0 V             | 10.0% | 3.9 V             | 7.0%  |
| Experiment #5 | 40 ms  | 200 ms | -2.41 V           | -2.4 V            | 1.0%  | -2.6 V            | 8.0%  | -2.4 V            | 1.0%  |
| Experiment #6 | 200 ms | 40 ms  | 2.41 V            | 2.4 V             | 0.0%  | 2.5 V             | 4.0%  | 2.5 V             | 2.0%  |

#### $I_{PD} = 5\,\mathrm{nA}$

|                |        |        |                   | PDBias = -2 V     |       | PDBias = 0 V      |       | PDBias = +2 V     |       |
|----------------|--------|--------|-------------------|-------------------|-------|-------------------|-------|-------------------|-------|
|                | Int #1 | Int #2 | Expected<br>value | Measured<br>value | Error | Measured<br>value | Error | Measured<br>value | Error |
| Experiment #7  | 0 ms   | 200 ms | -5.03 V           | -5.0 V            | 0.0%  | -4.7 V            | 6.0%  | -5.4 V            | 8.0%  |
| Experiment #8  | 200 ms | 0 ms   | 5.03 V            | 5.4 V             | 8.0%  | 4.6 V             | 8.0%  | 5.1 V             | 2.0%  |
| Experiment #9  | 40 ms  | 280 ms | -6.03 V           | -5.6 V            | 7.0%  | -6.6 V            | 9.0%  | -6.1 V            | 1.0%  |
| Experiment #10 | 280 ms | 40 ms  | 6.03 V            | 5.3 V             | 12.0% | 6.2 V             | 3.0%  | 6.3 V             | 4.0%  |
| Experiment #11 | 40 ms  | 200 ms | -4.02 V           | -3.7 V            | 9.0%  | -4.0 V            | 1.0%  | -4.1 V            | 2.0%  |
| Experiment #12 | 200 ms | 40 ms  | 4.02 V            | 3.6 V             | 10.0% | 4.4 V             | 10.0% | 3.9 V             | 3.0%  |

capacitor, which is used in the calculation of the expected value, is based on extraction from the layout, which has some degree of inaccuracy. We also could not see a strong correlation between the bias voltage applied to the detector during integration and the value measured at the output of the unit-cell. This is what we expected, because we used a silicon photodetector for this experiment, whose photocurrent is a very weak function of the applied bias voltage. It is worth mentioning that, for the silicon photodetector, the bias voltage reported in Table 4.1 is referenced to the backplane of the FPA, so a "+2 V" entry is actually -7.5 V + 2 V = -5.5 V, which means the detector is reverse biased in all the reported experiments.
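As a sanity check on the "Expected value" column, equation (4.2) can be evaluated directly. The sketch below is illustrative only and rests on two assumptions that are not stated with the table: that $C$ is the 199 fF integration capacitor listed in Table 4.2, and that the effective integration time is the difference between the two intervals, interpreted in microseconds (the scale at which a nanoampere current produces a volt-level swing on a capacitor of this size). The function name `expected_delta_v` is hypothetical.

```python
# Illustrative check of equation (4.2): delta_V = I * delta_t / C.
# Assumptions (not stated in the table): C is the 199 fF integration
# capacitor, and the interval values are interpreted as microseconds.

C_INT = 199e-15  # integration capacitance in farads (assumed)


def expected_delta_v(i_pd_na, int1, int2, c=C_INT):
    """Return the expected output change in volts.

    i_pd_na: photodiode current in nanoamperes
    int1, int2: the two integration intervals (assumed microseconds);
                the second interval integrates with the opposite polarity.
    """
    net_time_s = (int1 - int2) * 1e-6
    return i_pd_na * 1e-9 * net_time_s / c


if __name__ == "__main__":
    for label, i_pd, t1, t2 in [
        ("Experiment #1", 3, 0, 200),
        ("Experiment #4", 3, 280, 40),
        ("Experiment #12", 5, 200, 40),
    ]:
        print(f"{label}: {expected_delta_v(i_pd, t1, t2):+.2f} V")
```

Under these assumptions the three cases evaluate to -3.02 V, +3.62 V, and +4.02 V, which agree with the corresponding entries in the "Expected value" column.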

• Figure 4.9 shows a picture of the video signal and the row/column-select pulses, which are serially shifted out after all the rows/columns have been selected. The results shown in Fig. 4.9, which were taken while the chip was mounted in a dewar and cooled down to cryostat condition, demonstrate that the designed ESD protection, the row/column decoders and their corresponding drivers, the video signal buffers, and the unit-cell are entirely functional for the desired application.

If any of the components listed above were not functional, the chip would fail and would not generate the timing signal shown in Fig. 4.9. For example, if the column-select DFFs were not properly connected, or if there were a setup/hold violation in the column-select circuitry, the single pulse labeled as the $(64 + 1)^{th}$ selection signal would not be generated. Additionally, this single pulse indicates that there is no stuck-at-one/zero fault in the row/column-select circuitry.
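The reasoning above can be illustrated with a small behavioral model. The sketch below is not the chip's circuit: it assumes a plain chain of 64 D flip-flops that shifts a single select token by one stage per clock, and it models a stuck-at fault by forcing the output of one stage to a constant. All names are hypothetical.

```python
# Behavioral sketch of a one-hot select chain (assumed: 64 D flip-flops,
# one token injected on the first clock). Not the actual chip netlist.


def shift_out(n_stages=64, n_clocks=70, stuck=None):
    """Simulate the chain and return the bit shifted out on each clock.

    stuck: optional (stage_index, value) pair that forces one flip-flop
           output to a constant, modelling a stuck-at-zero/one fault.
    """
    regs = [0] * n_stages
    outputs = []
    for clock in range(n_clocks):
        outputs.append(regs[-1])             # bit leaving the last stage
        token_in = 1 if clock == 0 else 0    # single select token
        regs = [token_in] + regs[:-1]        # each DFF samples its neighbour
        if stuck is not None:
            stage, value = stuck
            regs[stage] = value              # apply the fault
    return outputs


if __name__ == "__main__":
    good = shift_out()
    print("fault-free pulses:", sum(good), "at clock index", good.index(1))
    print("stuck-at-0 pulses:", sum(shift_out(stuck=(10, 0))))
    print("stuck-at-1 high clocks:", sum(shift_out(stuck=(10, 1))))
```

In this model a fault-free chain produces exactly one output pulse, on the 65th clock (index 64). A stuck-at-zero stage produces no pulse at all, and a stuck-at-one stage produces a continuous high level instead of a single pulse, which is why observing one clean pulse is informative.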



Figure 4.9: The video signal and the row/column pulse that shifts out of the last row/column.

• In the third experiment, the ROIC with no detector is connected to an image grabber, and the output is captured while a laser pointer shines on the surface of the ROIC. Although no detector array was installed over the ROIC, the laser beam changes the carrier density of the pixels under the beam, and as a result, the operating point of those pixels changes. This results in the image shown in Fig. 4.10, which clearly shows the functionality of the system.

With the chip comprehensively validated, the next step is to hybridize the DWELL FPA to the readout circuit, followed by testing the chip in real-time rock classification. The prerequisite for flip-chip bonding is the successful growth and fabrication of the DWELL detectors. The multi-spectral classification algorithm must also be recompiled based on the characteristics of the new growth. Table 4.2



Figure 4.10: The output image generated by the MSC-ROIC as a result of a laser beam shining to the ROIC. While no detector was installed, the change in the measured value comes from the variation in the operating point of the readout circuit's pixels under illumination.

| Parameter                              | Value                                                 |
|----------------------------------------|-------------------------------------------------------|
| Number of transistors in the unit-cell | 26                                                    |
| Number of capacitors in the unit-cell  | 3                                                     |
| Technology                             | TSMC CV035-DDD                                        |
| Photodetectors                         | DWELL                                                 |
| Integration capacitor                  | 199 fF                                                |
| S&H capacitor                          | 228 fF                                                |
| Compensation capacitor                 | 34 fF                                                 |
| Input signal swing                     | 13.0 V                                                |
| Output signal swing                    | 12.5 V                                                |
| Power supply                           | 15 V/3.3 V                                            |
| Transistors bias current               | 1 μA                                                  |
| Unit-cell dimension                    | $30\,\mu\mathrm{m} \times 60\,\mu\mathrm{m}$          |
| Unit-cell array size                   | $128 \times 64$                                       |
| Chip dimension                         | $6052\,\mu\mathrm{m} \times 5452\,\mu\mathrm{m}$      |

Table 4.2: Specifications of the chip designed for on-chip multispectral classifications.

summarizes some specifications of the chip.

## 4.6 Future plans

With the chip successfully passing the post-silicon validation, the next step is to exploit multispectral imaging in the lab. However, the prerequisite for such an experiment is the growth and fabrication of a spectrally tunable photodetector such as DWELL, and flip-chip bonding of the DWELL FPA over the MSC-ROIC. In the following subsections, we outline the steps that must be followed to continue this project. We also explain some considerations regarding the test-points and alignment marks that we have embedded for hybridization and post-silicon processing.

#### 4.6.1 Post-silicon processing and flip-chip bonding

Once the initial testing is done and the functionality of the MSC-ROIC is validated, the DWELL FPA must be mounted over the chip to enable infrared multispectral imaging. Figure 4.11 shows a block diagram of the MSC-ROIC integrated with the DWELL FPA. The 2D array of balls forms the electrical connection between the MSC-ROIC and the FPA. The balls, which are usually made of indium, bond the two pieces together in a flip-chip bonder.

We now outline the steps that the chip must undergo to be hybridized to the FPA. These steps are also shown in Fig. 4.12.

1. The first post-processing step would probably be the deposition of metal on the photodetector contacts. This is necessary because it helps to make a better bond with the indium bumps during flip-chip bonding.



Figure 4.11: An iconic demonstration of the DWELL FPA flip-chipped over the next generation MSC-ROIC. The contacts of the MSC-ROIC side are gold plated, and the bonding is made using indium.

2. Because indium melts at 156.6°C, the flip-chip bonding is most likely safe for the chip. However, alignment marks are needed in the hybridization step to align the FPA over the chip. Figure 4.13 shows the alignment marks we embedded in the layout. Although the MSC-ROIC will go through chemical-mechanical polishing (CMP) and the surface of the chip will be planarized, it is important to have a minimum number of features under the alignment PADs. Otherwise, the planarization step of flip-chip bonding might fail.
3. Under-fill epoxy is applied to fill the area between the chip and the FPA, which provides the mechanical strength needed for the bumps and protects the indium from oxidation. The chip's PADs must be at least 0.5 mm away from the sides of the FPA; otherwise, they will be covered with epoxy.
4. Back-side polishing is the step in which the chip is mounted on a plate and polished for a long time to remove the extra supporting material on the back of the FPA. The handling of the chip in this step and the previous steps might be a source of failure. The only recommendation for the designer might be to include proper ESD protection to save the internal logic from failing in the event of an electrostatic discharge.

#### 4.6.2 Testing at cryostat condition

The dark current of DWELL FPAs is so high at room temperature that it surpasses the photocurrent, which means the measurement must be made at cryostat condition only. The challenge for testing at low temperature is the number of pins that must be provided by the dewar. Because typical dewars are designed for conventional readout schemes, which need very few timing signals and power rails, a dewar must be customized for testing the chip. To keep the cost under control, we found that the best solution was to take an ISC9705 dewar and add the extra wiring needed to access all the test-points, set all the operating points, and provide all of the timing signals. A picture of the ISC9705 dewar and the modification we made to the breadboard are shown in Fig. 4.14(a). To retain the compatibility of the dewar with the MSC-ROIC that it was originally designed for (provided mainly by the SEIR Co.), none of the existing connections was touched, and only the pins with no connection were routed to the outside. Figure 4.14(b) shows a picture of the breadboard inside the dewar and the modification diagram.



Figure 4.12: Demonstration of different processing steps for DWELL FPA and flipchip bonding to MSC-ROIC.

## 4.7 Conclusions

In this section, we discussed the design, outlined the fabrication, and presented the testing results of a single chip CMOS readout integrated circuit for multispectral



Figure 4.13: Demonstration of the alignment marks designed over the MSC-ROIC, which are used for aligning in the flip-chip bonding stage. The picture also shows the contacts designed for the active DWELL detector array, and the test detectors that are accessible directly using direct outputs on the PAD ring. The ring around the FPA provides three rows of substrate contacts, which are short circuited using metal4.



Figure 4.14: a) A picture of the dewar, which is modified for testing of the multispectral classification MSC-ROIC at cryostat condition, and b) the internal breadboard inside the dewar and the modification diagram.

rock classification. The signal swing is 13.0 V at the input and 12.5 V at the output. The chip was fabricated through MOSIS using TSMC's $0.35\,\mu\text{m}$ HV CMOS technology. The designed readout circuit works with an array of $128 \times 64$ pixels and is able to apply large positive and negative bias voltages to the DWELL infrared photodetector, change the integration polarity, and compare the integrated value against previously integrated samples to classify the rock type.

The test chip was successfully tested as an independent MSC-ROIC, and it is now ready to be hybridized with a DWELL FPA. The proposed design is an ideal candidate for advanced smart-pixel imaging systems, such as real-time multispectral classification imaging, and infrared remote sensing imagers.

## Chapter 5

# A ROIC for Spatiotemporal Bias Tunability

There is an inherent trade-off between the generation of big data by imaging systems and the efficient extraction of useful information within real-time constraints. Traditional imaging systems are burdened by the acquisition, transmission, and storage of excess data bearing redundant information for the given application of interest. Transmitting the extra information requires high bandwidth, and storing or transmitting it consumes extra power. Similarly, post-processing imposes extra latency and burdens the power budget [47, 48, 49].

There is a need to address this problem by intelligently acquiring limited but important sets of data and then processing the abstracted information. This in turn requires an additional capability, where computations are performed at the pixel level, within the readout integrated circuit, at the front end of the imager [50]. Real-time compressed-domain image sampling is the center of attention of many researchers around the world. As Candes and Wakin [22] stated, "Many natural signals are sparse or compressible in the sense that they have concise representations, when expressed in the proper basis." Compressive sensing/sampling (CS) exploits the fact that many signals can be built from a few nonzero coefficients in a suitable basis. CS refers to the type of imaging in which compressed data are acquired directly at the sensor level, in contrast to conventional techniques in which the sensor collects raw data and the information is later compressed for storage or transmission in some sub-processing units. Compressed sampling is beneficial when the sensor is difficult to manufacture [51] or when a lower delay is desired for acquiring/processing the image data [23]. Because CS removes the need for extra post-processing at the sensor level, it reduces the power consumption as well [51].

In pursuit of efficient computational imaging hardware that addresses memory efficiency, low power consumption, and minimal latency requirements, we demonstrate CMOS-based imaging hardware that supports compression at acquisition time, inside the pixel.

## 5.1 Background and previous work

For a typical image sensor, imaging involves reading out the values sampled at different pixels [52]; whereas with compressed-domain hardware, a set of gain matrices is loaded to the pixel array, and the image sensor's output would be a linear combination of the projection of the object's reflectance function to the gain matrices [16, 53]. In the following paragraphs, we make comparisons among a few other works devoted to the problem of online compression, and hardware domain sensing based on matrix projections.
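The projection described above can be sketched numerically. The example below is a toy model under assumptions introduced here: the scene is a small 2-D list, each gain matrix has the same shape, and one measurement is the sum over all pixels of gain times intensity. It does not model the readout electronics, and all function names are hypothetical.

```python
# Toy model of compressed-domain acquisition: each measurement is the inner
# product of the scene with one gain matrix. Names and sizes are illustrative.
import random


def project(scene, gain):
    """Return the sum over pixels of gain[r][c] * scene[r][c]."""
    return sum(g * x
               for gain_row, scene_row in zip(gain, scene)
               for g, x in zip(gain_row, scene_row))


def acquire(scene, gain_matrices):
    """Return one scalar measurement per gain matrix."""
    return [project(scene, gain) for gain in gain_matrices]


def random_gains(rows, cols, count, seed=0):
    """Generate `count` random gain matrices with entries in [0, 1)."""
    rng = random.Random(seed)
    return [[[rng.random() for _ in range(cols)] for _ in range(rows)]
            for _ in range(count)]


if __name__ == "__main__":
    scene = [[0.0, 1.0, 0.0],
             [0.0, 1.0, 0.0],
             [0.0, 1.0, 0.0]]
    measurements = acquire(scene, random_gains(3, 3, count=4))
    print(measurements)  # four numbers instead of nine pixel values
```

Recovering the scene from fewer measurements than pixels additionally requires a sparsity-based reconstruction step, which is outside the scope of this sketch.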

One of the earliest reported hardware implementations of compressive sensing is based on a single-pixel camera [49]. Single-pixel imaging utilizes a digital micro-mirror (DMM) [54] to project the incident light coming from the object onto the digital masks. The photodetector samples the integrated light coming from the sample, which is modulated using the DMM. This method is usually used for far-infrared imaging, where an array of low-cost, small-size photodetectors is not feasible. The DMM degrades the sensitivity of the imager, and the alignment of the different components limits the scaling of this method. An iconic demonstration of a compressive sensing setup with a bolometer as the sensor is shown in Fig. 5.1.

An optical domain coded-aperture-based compressive sensing has been demonstrated in [56]. A random phase mask injects the measurement matrices, and the modulated intensities at different pixels are sampled using a low resolution imager. This technique suffers from the noise added by the optical masks, and the complexity of the alignment setup is a big challenge.

A CMOS imager is demonstrated in [57], which utilizes a flip-flop-based shift-register distributed over the pixel array to hold the random digital patterns.



Figure 5.1: An iconic representation of a compressive sensing setup with a bolometer for the sensor, and a micro-mirror array to implement the projection [55, 49].

To implement the measurement matrices, the shift register selectively disconnects pixels from the readout. The proposed hardware, which offers only multiplication by a binary value, limits the compressive sensing algorithm to binary projection matrices, which are composed of only "0" or "1". Furthermore, there is no control over the bias voltage of the detectors, which means many features offered by modulation at the detector level are not supported. Finally, the unit-cell does not support integration; therefore, the proposed hardware cannot work with detectors with lower quantum efficiency.

Figure 5.2 presents our proposed monolithic CMOS image sensor, which can run as a standalone image sensor and is able to perform spatiotemporal region-of-interest enhancement [31]. The hardware is also capable of generating already compressed images as well as canceling the nonuniformity arising from process variation or other sources, such as voltage drop across the image sensor. The main contribution of this hardware is the introduction of control over the per-pixel gain by means of modulating the photodetector's responsivity, which is represented by a controllable gain symbol in the pixels. The capacitor represents the analog memory that is embedded to store and hold the bias information for individual pixels. The AND gate selectively enables different pixels to load the bias voltage to the active pixel, and this selection occurs at the same time that the pixels are being scanned for the readout; therefore, no delay penalty is associated with the new design. While the integrated voltage is sampled onto the sample-and-hold (S&H) capacitor, the voltage $V_{ref}$ is used as a global reference voltage for all the pre-amplifiers. This prevents the bias voltage from showing up in the readout and makes the readout value meaningful.

During the readout, the bias information loaded to different pixels can differ from pixel to pixel, and also from the bias that was loaded to the same pixel in the previous frame. This is what we call the spatiotemporal independence of the pixel biasing scheme. We discuss the detailed implementation of the intelligent ROIC (iROIC) in the next section.

## 5.2 Design of the pixel

Implementation of a compressed-domain imaging system requires a means to project the object's reflectance function onto the gain matrices, and we have approached this problem by embedding fine control over the operating voltage of each individual pixel's detector. The current hardware is designed with an array of n+/nwell/psub detectors that is laid out in silicon along with the rest of the readout integrated circuit.

Figure 5.3(a) shows a cross section of the n+/nwell/psub detector. Figure 5.3(b) shows the measured photocurrent of the n+/nwell/psub photodetector at seven illumination levels. A green LED is used as the light source in this experiment, and the intensity is modulated by controlling the injection



Figure 5.2: Block diagram of the individual pixel bias tunable readout integrated circuit, and the CTIA-based unit-cell at the extended view. The extra circuitry added to the CTIA-based unit-cell enables setting independent bias voltages for each pixel while the previous integrated voltage is being read out.

current. The LED is placed about 40 cm away from the detector, so the illumination intensity can be considered uniform over the detector. The illumination intensity is measured simultaneously, and the reported optical power is scaled to the area of the detector. As seen in Fig. 5.3(b), because the photo-response is a function of both the bias voltage and the intensity of the light, one could load the projection matrix to the pixel array and acquire the image while the pixels are operating at different gains. Figure 5.3(c) shows the measured photocurrents on a normalized scale, which indicates that a change in the optical power only scales the measured photocurrent. This could lead to many applications that will be discussed in the following sections.
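One way to state the observation in Fig. 5.3(c) compactly is as a separable model. The notation below ($P$ for the optical power, $V_b$ for the bias voltage, and $R$ for a bias-dependent responsivity) is introduced here for illustration only and is an idealization of the measured curves:

$$I_{ph}(V_b, P) \approx P \, R(V_b) \quad\Longrightarrow\quad \frac{I_{ph}(V_b, P)}{\max_{V} I_{ph}(V, P)} \approx \frac{R(V_b)}{\max_{V} R(V)} \,.$$

Under this idealization the normalized curve does not depend on $P$, and choosing the bias $V_{b,ij}$ of pixel $(i,j)$ amounts to choosing a per-pixel gain $g_{ij} \propto R(V_{b,ij})$.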

Because the capacitive trans-impedance amplifier (CTIA) provides the best



Figure 5.3: a) A cross-section of the n+/nwell/psub photodetector. b) The measured photoresponse of the n+/nwell/psub photodetector as a function of the applied bias voltage at different illumination levels. c) The measured photocurrents, which are normalized to one.

performance in terms of precise control over the detector's bias voltage, as well as high injection efficiency, large voltage swing, and good charge storage, we have selected this configuration as the basis for the preamplifier. Figure 5.4(a) depicts the detailed block diagram of the iROIC unit-cell. In the proposed unit-cell, the conventional CTIA configuration is extended with the ability to control each individual pixel's bias voltage.

Here we briefly explain the process that is followed to operate the compressed-domain imaging:

1. The bias control circuit is composed of an analog switch, $SW_{Bias}$, that is enabled when the row-select and column-select signals address the pixel; the analog memory is then loaded with the bias voltage.



Figure 5.4: a) Switch-level implementation of the iROIC unit-cell. The unit-cell is composed of 15 transistors and three capacitors. b) The video switches, and c) the row/column select peripherals.

2. During the integration, the bias is held in the analog memory. Both the $SW_{Bias}$ and $SW_{Ref}$ switches are off for the entire integration time to protect the voltage on the $C_{Bias}$ capacitor from changing.
3. At the end of the integration, the $SW_{Ref}$ switch is enabled to set the same reference voltage for all the pixels and make the sampled values meaningful.
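The three steps can be summarized as a small behavioral model. This is a sketch under simplifying assumptions made here (ideal switches, an ideal integrator, an assumed sign convention for the sampled output, and a hypothetical `responsivity` function that maps bias voltage to gain). The class name, parameter names, and numeric values are placeholders and are not taken from the design.

```python
# Behavioral sketch of the unit-cell sequence: load bias, integrate, sample.
# Idealized: ideal switches and integrator; `responsivity` is a placeholder.


class UnitCellModel:
    def __init__(self, c_int, responsivity):
        self.c_int = c_int                # integration capacitance (F)
        self.responsivity = responsivity  # callable: bias (V) -> A per W
        self.v_bias = 0.0                 # voltage held on C_Bias
        self.v_int = 0.0                  # integrated voltage

    def load_bias(self, row_sel, col_sel, v_bias):
        """Step 1: SW_Bias closes only when both selects address the pixel."""
        if row_sel and col_sel:
            self.v_bias = v_bias

    def integrate(self, optical_power, t_int):
        """Step 2: both switches are open; C_Bias holds, C_int charges."""
        i_ph = self.responsivity(self.v_bias) * optical_power
        self.v_int = i_ph * t_int / self.c_int

    def sample(self, v_ref):
        """Step 3: SW_Ref puts every pixel on the same reference.

        The additive sign convention is an assumption of this sketch.
        """
        return v_ref + self.v_int


if __name__ == "__main__":
    # Placeholder values: 200 fF capacitor, linear responsivity of 0.1*V A/W.
    cell = UnitCellModel(c_int=200e-15, responsivity=lambda v: 0.1 * v)
    cell.load_bias(row_sel=True, col_sel=False, v_bias=3.0)  # not addressed
    cell.load_bias(row_sel=True, col_sel=True, v_bias=2.0)   # addressed
    cell.integrate(optical_power=1e-9, t_int=1e-3)
    print(cell.sample(v_ref=1.0))
```

The point of the model is that the sampled value depends on the stored bias only through the gain, because the output is referred to the global `v_ref` rather than to the per-pixel bias.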

To provide a high voltage swing range, the chip has been fabricated in the TSMC CL035HV technology. A major challenge in the design of this circuit was the trade-off between the number of functionalities and the area of the pixel. To comply with the pitch of standard focal plane arrays (FPAs), we decided to restrict the unit-cell to $30\,\mu\text{m} \times 30\,\mu\text{m}$. The area constraint forced us to keep all of the switches at the minimum size supported by the technology node. All the switches are based on a single NMOS transistor. The rest of the area was equally divided between the capacitors to achieve the highest possible resolution for the output image data. In total, the unit-cell is composed of seven transistors for the dual-stage differential amplifier and eight transistors for the rest of the unit-cell circuitry. The unit-cell also includes four capacitors that serve as the compensation, integration, sample-and-hold, and bias-voltage-holding capacitors. Figure 5.4(b) shows the video switches as well as the active load for the source follower at the output of the unit-cell, and the ROIC peripherals are shown in Fig. 5.4(c).

To obtain a model for the transfer function of the imager, we measured the response of the system to a uniform level of illumination at different bias voltages. The normalized photo-response of the imager is shown in Fig. 5.5. In the error-bar graph, the mean and standard deviation are based on statistical analysis over all pixels in the whole $96 \times 96$ frame, and each measurement was repeated 10 times to reduce random noise. The mean value and the standard deviation shown in this figure are used as the basis for selecting bias voltages when implementing a real application in the system. The curve indicates that the system responds to the bias voltage in a semilinear fashion as long as the detector's bias voltage is constrained to $\sim [+0.4, +3.5]$ V.

The silicon-based photodetector has been laid out in the form of a 10 µm strip on the right and top sides of the unit-cell, which increases the size of the pixel to 40 µm × 40 µm. A microphotograph of the fabricated chip is shown in Fig. 5.7, and the layout of the unit-cell is shown in the expanded view. The dimension of the pixel array is 3840 µm × 3840 µm, and the total area of the chip, including test-cells, PADs, and ESD protection, is 5140 µm × 5140 µm.

Although we have used n+/nwell/psub photodetectors as a means to exploit compressed-domain image acquisition, the circuit would work with any type of detector whose nominal operating voltage and current fit within the specifications of the designed readout integrated circuit. Additionally, we have embedded extra knobs, such as the bias current of the preamplifier, the integration time, and the readout clock speed, that are set from outside the chip. These knobs can be employed to adjust the operating point and optimize for the



Figure 5.5: The normalized responsivity of the system to a uniform illumination level. The transfer function indicates how the system reacts to the modulation of the detector's bias.

detector of interest. Our next-generation iROIC would be based on dot-in-a-well (DWELL) [58] quantum well infrared photodetectors (QWIP) [59]. The DWELL infrared photodetector utilizes the quantum-confined Stark effect (QCSE) [39] to allow the spectral response of the sensor to span the range of 7 $\mu$m to 11 $\mu$m, provided that the detector's bias voltage is properly controlled by the circuit. Figure 5.6(a) shows a sample growth structure of a DWELL infrared detector, and Fig. 5.6(b) shows the spectral response of a DWELL photodetector.

Using DWELL detectors, the next generation of iROIC will enable exploration of on-chip spatiotemporal multispectral imaging. The extra cost of this transition is the growth and fabrication of a DWELL focal plane array (FPA) and its flip-chip bonding to the iROIC. An iconic demonstration of the next version of the hardware integrated with a DWELL FPA is shown in Fig. 5.6(c). In the current version, the cost and challenges of DWELL integration are avoided by embedding a per-pixel n+/nwell/psub detector; in this way, we decoupled the challenges of the circuit design from FPA integration considerations.



Figure 5.6: a) A sample growth structure of DWELL infrared detector, the detector that is used as a base in designing current iROIC to ease the transition to multispectral imaging in the next generation of the hardware, and b) a sample spectral response of DWELL detector as a function of the applied bias voltage. c) Iconic demonstration of a DWELL FPA hybridized over ROIC.


## 5.3 Experimental setup

The spatiotemporal region-of-interest enhancement system offers a variety of features when the chip is mounted on well-designed firmware and hardware. To support high signal integrity, a reconfigurable PCB was designed, which hosts the chip and provides the timing signals required for its operation. The PCB also supplies all the power rails and bias voltages required by the image sensor. The board offers not only robust connections but also clean timing/video signals. Also, because the board is designed from a schematic, in contrast to twisted wires, it is much easier to trace and find problems. A picture of the test hardware is shown in Figs. 5.8(a) and (b).

In terms of software, the system needs to support loading integer values for the bias voltages of different pixels. Our initial approach was to use a



Figure 5.7: A microphotograph of the fabricated ROIC, the row and column select circuitry, and the test devices. The unit-cell is shown in the expanded view. The total area of the fabricated chip is 5140 $\mu$m × 5140 $\mu$m.

"Digilent Spartan 3E FPGA Starter Board" as the main controller of the system, which is a very good solution when the precision of the timing signals is of the highest importance. Unfortunately, the development board does not come with the embedded firmware required to communicate over a network adapter. Additionally, the embedded memory on the board is limited to a few hundred kilobytes, including the BRAM and distributed RAM. This means that, for long acquisition runs, the burst of frames has to



Figure 5.8: a) A photo of the old setup. b) The PCB board designed for the CS project that is connected to an FPGA board for the timing signals. c) The schematic of the designed PCB board for the CS project.

be broken into small chunks, which is very time consuming.

While the initial method of characterization was based on an NI-1410 image grabber and the LabVIEW programs we developed to automate the characterization with the help of a 2400 source-meter, the need for sufficient storage space led us to replace the test setup with an autonomous configuration. The new setup is based on a Raspberry Pi board (RPB) that generates the timing signals and at the same time communicates with a DAC board to convert the digital weights stored on the SD card to analog. The RPB also drives an ADC board, and in this way samples the video signal and generates the image.

The main reason for choosing the RPB as the main controller is its extended support for on-board memory in the form of a micro-SD card. Typical FPGAs do not support high-volume storage, which makes it challenging to store massive



Figure 5.9: The new characterization system developed for the IPBT-ROIC embeds the image grabber inside the chip.

bias information. A DAC converts these digital values to analog and then feeds them to the iROIC. The output video signal is sampled using an ADC chip, which is driven by the RPB. The sampled data are sent to a remote computer for online monitoring and are also stored in the local memory of the controller to be processed later. The RPB acts as a standalone controller for the iROIC and handles all of the image acquisition details. The RPB is controlled from a desktop over LAN, and test vectors are loaded using standard Linux commands such as rsync, ssh, and scp. A block diagram of the experimental setup is shown in Fig. 5.10. The control over the bias information of every pixel's detector and the flexibility offered by the experimental setup have enabled many different applications, which are explained in the following sections.
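As an illustration of the digital bias path described above, the sketch below quantizes a matrix of desired bias voltages into integer DAC codes. The DAC resolution and full-scale voltage are placeholders chosen here (12 bits, 5 V); the text does not specify the converter, and the function names are hypothetical.

```python
# Sketch: convert per-pixel bias voltages to integer DAC codes and back.
# DAC_BITS and V_FULL_SCALE are assumed placeholder values.

DAC_BITS = 12
V_FULL_SCALE = 5.0  # volts


def voltage_to_code(v, bits=DAC_BITS, v_fs=V_FULL_SCALE):
    """Map a voltage in [0, v_fs] to the nearest code in [0, 2**bits - 1]."""
    max_code = (1 << bits) - 1
    code = round(v / v_fs * max_code)
    return min(max(code, 0), max_code)   # clamp out-of-range requests


def code_to_voltage(code, bits=DAC_BITS, v_fs=V_FULL_SCALE):
    """Return the ideal analog voltage produced for a given code."""
    return code / ((1 << bits) - 1) * v_fs


def bias_frame_to_codes(bias_frame):
    """Convert a 2-D list of bias voltages into a 2-D list of DAC codes."""
    return [[voltage_to_code(v) for v in row] for row in bias_frame]


if __name__ == "__main__":
    frame = [[0.4, 1.0], [2.0, 3.5]]
    codes = bias_frame_to_codes(frame)
    print(codes)
    print([[round(code_to_voltage(c), 3) for c in row] for row in codes])
```

Storing the frame as integer codes keeps the per-pixel bias pattern compact on the SD card, at the price of a quantization error of at most half a code step.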



Figure 5.10: A block diagram of the experimental setup, which includes a Raspberry Pi board as the main controller of the system, a DAC to set the bias voltages of the detectors, and an ADC to grab the readout of the imager. All communication between the controller and a remote machine is over SSH.

## 5.4 Conclusion

In this chapter, a wide-dynamic-range, high-voltage, dual-polarity, individual-pixel bias-tunable ROIC is reported. The test chip contains a 96×96 array of unit cells and the control logic for the readout functionality. The chip, which is fabricated in TSMC's CL035HV-DDD CMOS process technology, utilizes CMOS-based photodetectors.

A flexible test and characterization system was developed, which not only helps to fine-tune different readout parameters but also provides a flexible tool that can be used in similar upcoming projects. In addition, a dedicated PCB was designed for testing the chip, and a Raspberry Pi controller is used to provide the speed, storage, and flexibility needed for the system. All of the timing signal generation, bias voltage generation, and image grabbing are integrated into the new hardware. The newly designed image sensor, along with the autonomous testing environment, offers a variety of discrete compressed-domain image processing applications, which are based on the ability to choose an intelligent detector biasing scheme and which we discuss in the next chapter.

# Chapter 6

# Compressed-domain Image Processing Applications

Dramatic advances in the field of computational and medical imaging over the past decades have enabled many critical applications, such as night vision, medical diagnosis, quality control, and remote sensing [60, 61, 15, 62, 63]. The increasing demand for image quality and fidelity requires a higher pixel count and a sophisticated post-processing mechanism to efficiently store, transmit, and analyze these enormous quantities of data [64, 65, 66, 67]. There is an inherent trade-off between the generation of big data by such imaging systems and efficiency in extraction of useful information within real-time constraints, limiting the efficacy of such sensors in real-time decision-making systems [68, 69].

In the previous chapter, we proposed new hardware that offers per-pixel modulation of the gain through a customized CTIA preamplifier. The spatio-temporal modulation of the gain enables a number of applications, which are discussed below.

# 6.1 Functioning as a stand-alone camera

Depending on the modulation scheme applied to the chip, different applications can be delivered. In the simplest scenario, if all of the pixels are biased with the same voltage, the iROIC camera can be used as a stand-alone camera. In this mode of operation,  $V_{bias}$  remains constant, and as a result, the gain is the same for all pixels.

The extra benefit of this hardware over the conventional CTIA is that in standalone mode, because the reference voltage for the readout is different from the detector's bias voltage, a  $V_{ref} - V_{bias}$  offset is applied to the measured values, which means a level shifter is embedded in every pixel. This method is beneficial if there is a constant offset at the output of the imager. Figure 6.1 shows four images that



Figure 6.1: Four images taken using the iROIC camera in normal mode: a) a phantom, b) a cell, c) some rice grains, and d) the UNM logo.

are taken by an iROIC camera in stand-alone mode.

# 6.2 Region of interest (ROI) enhancement

The support for continuous spatio-temporal control over the bias voltage applied to each photodetector enables region-of-interest enhancement by selectively modulating the responsivity of the detectors located in the region of interest. Several applications benefit from this method, as briefly discussed below:

- Enhancing the contrast of an image over a given region where it is originally poor due to the limited dynamic range of the sensor. This is also a solution to the challenge of finding an optimum bias for a high-contrast image, where part of the image is saturated and other parts are at the noise level. A smart selection of bias voltages forces all pixels to operate in the linear region.
- Achieving different resolutions for different regions of a given image by using submasks corresponding to low-pass and high-pass responses. This scheme is useful in surveillance and medical applications, in which the user may be interested in a specific region and wants to ignore the information in the rest of the image.
- Spectral selectivity in different areas of the image is another application of the hardware. However, the requirement is to have support for multispectral tunability at the photodetectors.

Figure 6.2(a) shows the original image of the white matter, which we used at the input of iROIC in the image segmentation experiment. Figure 6.2(b) depicts the white-matter image we have taken with iROIC when a uniform bias is applied to all

the pixels, and Fig. 6.2(c)-(f) present the same scene with a different bias applied to a selected area, which we refer to as a region of interest.
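The region-of-interest biasing described above can be sketched as the construction of a per-pixel bias matrix. The following is a minimal illustration only: the function name `make_bias_map`, the rectangular ROI, and the voltage values are assumptions made for this sketch and are not taken from the iROIC firmware.

```python
def make_bias_map(rows, cols, base_bias, roi, roi_bias):
    """Return a rows x cols bias matrix (list of lists, in volts).

    roi is (top, left, height, width); pixels inside it receive roi_bias,
    all other pixels receive base_bias. All names are hypothetical.
    """
    top, left, height, width = roi
    bias = [[base_bias] * cols for _ in range(rows)]
    for i in range(top, min(top + height, rows)):
        for j in range(left, min(left + width, cols)):
            bias[i][j] = roi_bias
    return bias


# Illustrative values only: a 96x96 array biased at 1.0 V everywhere,
# with a 16x16 region of interest biased at 2.5 V.
bias_map = make_bias_map(96, 96, 1.0, (40, 40, 16, 16), 2.5)
roi_pixels = sum(row.count(2.5) for row in bias_map)
```

A matrix of this kind would then be converted to DAC codes and loaded into the pixels; that step depends on the hardware interface and is not shown.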

# 6.2.1 Nonuniformity correction

There are many different sources of variations in an image, such as process variation for the IC design (including inter-die and intra-die variations), the nonuniformity of the fabrication of detectors, the IR drop in the power distribution of the chip, and the difference in the IR drop for the different signals. Normally, the imager output is passed through post-processing stages to deal with these variations. Having the ability to tune the gain of different pixels individually, the



Figure 6.2: a) Original white-matter image used for imaging. b) Image is taken using iROIC with a uniform biasing for all pixels, where some of the pixels are saturated due to the high intensity. c, d, e, and f) The same scene is imaged using proper biasing for different areas that normally are at the noise floor of the imager.

variations can be directly removed in the chip, without the need for external processing.

# 6.2.2 On-chip compressive sensing

The block diagram of discrete cosine transform (DCT) coding is shown in Fig. 6.3(a), and Fig. 6.3(b) shows the block diagram of the inverse discrete cosine transform (IDCT). As shown in Fig. 6.3(a), to encode an image, the output of an imager is normally passed through a few post-processing blocks. The idea here is to produce already compressed data by embedding the DCT coefficients inside the image sensor and, in this way, implement the so-called compressive sampling algorithm at the chip



Figure 6.3: a) Block diagram of DCT coding, and b) decoding and inverse DCT coding.

level.

# 6.3 Nonuniformity correction

The pixels are designed to maximize the sensitivity to the photo-response. However, the overall performance of the sensor is limited by noise, which comes from many sources and contributes to the output signal. Random noise is a temporal variation in the signal that changes from frame to frame. This type of noise, which is hard to predict, has a statistical distribution and can be canceled statistically by averaging [70, 10].

On the other hand, pattern noise is the spatial variation in the photo response of different pixels while they are exposed to a uniform illumination. This type of noise is fixed over time and cannot be reduced by averaging. Pattern noise stems from the variations in the growth or fabrication of the photodetectors. The difference in the driving and sampling circuitry or the variation in power distribution also results in deviation in responsivity in the form of pattern noise [13].
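The difference between the two noise types can be illustrated with a small simulation. This is a sketch under assumed values: the signal level, the noise standard deviations, and the Gaussian noise model are chosen only for illustration and do not come from measurements of the iROIC.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

N_PIXELS = 200   # hypothetical line of pixels under uniform illumination
N_FRAMES = 400   # number of frames averaged
SIGNAL = 100.0   # true uniform signal (arbitrary units)

# Pattern noise: a fixed per-pixel deviation, drawn once and reused in every frame.
pattern = [random.gauss(0.0, 2.0) for _ in range(N_PIXELS)]


def capture_frame():
    """One frame = signal + fixed pattern + fresh random (temporal) noise."""
    return [SIGNAL + p + random.gauss(0.0, 5.0) for p in pattern]


frames = [capture_frame() for _ in range(N_FRAMES)]
averaged = [statistics.fmean([f[i] for f in frames]) for i in range(N_PIXELS)]

# Deviation from the true signal after averaging.
residual = [a - SIGNAL for a in averaged]
# Part of the residual that is NOT explained by the fixed pattern (temporal noise left).
temporal_left = statistics.pstdev([r - p for r, p in zip(residual, pattern)])
# Total spatial spread that survives averaging (dominated by the fixed pattern).
spatial_left = statistics.pstdev(residual)
```

In this sketch the temporal component shrinks roughly with the square root of the number of frames, while the spatial spread stays close to the spread of the fixed pattern, which is why pattern noise calls for a calibration-based correction.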

Pattern noise is composed of fixed pattern noise (FPN) [71, 72] and photo response nonuniformity (PRNU) components [73, 74]. The FPN is measured in the absence of illumination and is a result of variations in growth, detector dimension, doping concentrations, fabrication defects, characteristics of transistors ( $V_T$ ,  $g_m$ , W, L, etc.) [75, 76], or nonuniformity in the distribution of power [77]. Additionally, at high-speed readout, the difference in the resistance and capacitance seen at the output of different unit cells can also cause nonuniformity. The second component of pattern noise, PRNU, is a function of illumination and varies with the dimension of the photodetector, the doping concentration, and the color of the light incident on the detector [78].

Nonuniformity correction is an important topic under investigation when dealing with processing inconsistencies that lead to unfavorable pattern noise. Independent of the source of the nonuniformity, it can be corrected using the following techniques:

- 1. Single-point calibration technique: In this method, a uniform scene is imaged, and a constant gain and offset are calculated per pixel; these are later applied to the value sensed by the detector to correct the responsivity.
- 2. Two-point calibration technique: In contrast to the single-point method, two-point nonuniformity correction uses two different uniform illumination levels as the calibration points, from which an offset and a gain are calculated for each pixel. The offset and gain are then employed to correct the photo-response read from each pixel. This method has the extra benefit of offering a better correction if the nonuniformity grows with temperature [79, 80].
- 3. Scene-based nonuniformity correction: This is an adaptive type of correction for the nonuniformity that requires post-processing and works in real-time [81]. This technique does not require calibration and basically uses motion-related features to identify the actual image from the fixed-pattern noise.
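As an illustration of the two-point technique in the list above, the per-pixel gain and offset of a linear pixel model Y = g·I + o can be estimated from two uniformly illuminated frames. The sketch below assumes an ideal, noise-free linear response; the function name, the 2×2 sensor, and all numeric values are hypothetical.

```python
def two_point_calibration(y_low, y_high, i_low, i_high):
    """Estimate per-pixel gain and offset from two uniform-illumination frames.

    y_low, y_high: 2D lists of raw readings at illumination levels i_low, i_high.
    Returns (gain, offset) as 2D lists for the linear model Y = g * I + o.
    """
    gain, offset = [], []
    for row_low, row_high in zip(y_low, y_high):
        g_row = [(h - l) / (i_high - i_low) for l, h in zip(row_low, row_high)]
        o_row = [l - g * i_low for l, g in zip(row_low, g_row)]
        gain.append(g_row)
        offset.append(o_row)
    return gain, offset


# Hypothetical 2x2 sensor with made-up nonuniform gain and offset.
true_gain = [[1.0, 0.8], [1.2, 0.9]]
true_offset = [[5.0, 3.0], [4.0, 6.0]]


def expose(level):
    """Simulate the raw frame produced by a uniform scene of the given level."""
    return [[g * level + o for g, o in zip(g_row, o_row)]
            for g_row, o_row in zip(true_gain, true_offset)]


gain, offset = two_point_calibration(expose(10.0), expose(50.0), 10.0, 50.0)
```

With real data the two frames would be noisy, so each calibration frame is typically an average of several captures before the gain and offset are computed.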

Because pattern noise does not change with time, it could be canceled by using proper biasing of the circuit. The correction method we have used here is based on a static correction/single-point technique. The mathematical formulation for the correction algorithm we used is given below [82]. The linear model of the imaging device is given by

$$Y_{ij}^{k} = g_{ij}^{k} I_{ij}^{k} + o_{ij}^{k} \tag{6.1}$$

where  $I_{ij}^k$  is the actual object's reflection function that is incident on the image sensor, and the observed pixel value is given by  $Y_{ij}^k$ . Variable k is the frame index, and the gain and offset of the  $(i, j)^{th}$  detector are denoted by  $g_{ij}^k$  and  $o_{ij}^k$ , respectively. Here, nonuniformity correction is carried out by means of a linear transformation of the observed pixel values  $Y_{ij}^k$ . The goal is to provide an estimate of the true intensity  $I_{ij}^k$  so that all of the detectors appear to be performing uniformly. The correction is given by

$$I_{ij}^{k} = w_{ij}^{k} Y_{ij}^{k} + b_{ij}^{k} \tag{6.2}$$

where  $w_{ij}^k$  and  $b_{ij}^k$  are the gain and offset of the linear correction model of the  $(i, j)^{th}$  detector. The relation to the actual gain and offset is given by solving (6.1) and (6.2) as

$$w_{ij}^k = \frac{1}{g_{ij}^k} \tag{6.3}$$

and

$$b_{ij}^{k} = -\frac{o_{ij}^{k}}{g_{ij}^{k}} \tag{6.4}$$

Once we estimate the parameters  $w_{ij}^k$  and  $b_{ij}^k$  or  $g_{ij}^k$  and  $o_{ij}^k$ , the NUC can be achieved as suggested by equation (6.2). In Fig. 6.4(a), we show an image of a white paper taken with uniform biasing for all the pixels. Although the bias is uniform, the pixels' response across the image varies because of the nonuniform illumination, weakly sensitive pixels, and other sources of fixed-pattern noise. Figure 6.4(b), on the other hand, shows another image under the same illumination condition with a bias matrix that is optimized using the NUC technique discussed above. A gain and an offset are calculated per pixel as per equations (6.3) and (6.4) and embedded in the bias of each pixel, and the bias is applied using the RPB board.

Figures 6.4(c) and (d) depict the histograms of the images shown in Figs. 6.4(a) and (b), respectively. Because the histogram in Fig. 6.4(c) results from a flat white page that is illuminated with a nonuniform source, the captured image contains a wide range of intensity levels across the pixels. Our NUC method results in a narrow histogram, as shown in Fig. 6.4(d). Here, the point is that the
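Equations (6.2) to (6.4) can be checked numerically with a short sketch. Note the assumption: here the correction is applied arithmetically in software purely to illustrate the equations, whereas in the hardware the correction is embedded in the pixel biases. The gains, offsets, and scene level are made-up illustration values.

```python
def correction_coefficients(gain, offset):
    """Per-pixel w = 1/g, Eq. (6.3), and b = -o/g, Eq. (6.4)."""
    w = [[1.0 / g for g in g_row] for g_row in gain]
    b = [[-o / g for g, o in zip(g_row, o_row)]
         for g_row, o_row in zip(gain, offset)]
    return w, b


def apply_nuc(frame, w, b):
    """Eq. (6.2): estimate I = w * Y + b for every pixel."""
    return [[wi * y + bi for y, wi, bi in zip(y_row, w_row, b_row)]
            for y_row, w_row, b_row in zip(frame, w, b)]


# Hypothetical 2x3 detector array with nonuniform gain and offset.
gain = [[1.00, 0.85, 1.10], [0.95, 1.20, 0.90]]
offset = [[2.0, -1.0, 0.5], [1.5, 0.0, -0.5]]
scene = 40.0  # uniform "white paper" intensity, arbitrary units

# Raw readout following the linear model of Eq. (6.1).
raw = [[g * scene + o for g, o in zip(g_row, o_row)]
       for g_row, o_row in zip(gain, offset)]

w, b = correction_coefficients(gain, offset)
corrected = apply_nuc(raw, w, b)

# Spread of pixel values before and after correction (a crude histogram width).
spread_raw = max(map(max, raw)) - min(map(min, raw))
spread_corrected = max(map(max, corrected)) - min(map(min, corrected))
```

The collapse of the spread after correction mirrors, in idealized form, the narrowing of the histogram between Figs. 6.4(c) and 6.4(d).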



Figure 6.4: a) The result of imaging a white paper with uniform biasing, while the illumination is not uniform. Defects and other sources of nonuniformity also contribute to the variation across the image. The stack of three graphs demonstrates (I) camera output image, (II) illumination contour, and (III) a 3D view of the intensities. b) Another white paper is imaged with the same illumination condition using the implemented nonuniformity correction. The graph has the same scale as part (a) and the legend in the middle is for part (II). Figures c) and d) show the histogram for the measured results of parts (a) and (b), respectively.

hardware is able to cancel the net nonuniformity that stems from the process variations of the pixels/ROIC and the nonuniformity in the illumination.

# 6.4 Compressed-domain image acquisition

Another important application of the chip is in the compressed-domain imaging framework. The compression is achieved in hardware by projecting an image onto a set of basis masks implemented in the detectors' biases. We have considered two in-hardware compression modalities: in-pixel discrete-cosine-transform (DCT) based compressed-domain image acquisition and the compressive sensing framework [83, 84].

To implement the compression modalities in hardware, we need to adapt the compressive masks to the device responsivity so that the mask coefficients are exactly achievable as gain factors at the pixels. The content presented in Sections 6.4.2, 6.4.5, and 6.4.6 is outside the scope of this dissertation and is the result of the work of my colleague, Manish Bhattarai. However, it is reported here for the sake of completeness.

# 6.4.1 Discrete cosine transform

In this part, we present the mathematical formulations for compression and reconstruction of the image using the DCT. To realize any sort of transform coding on the computational imaging hardware, one needs to be able to project the acquired image onto the designated mask, where the transform coefficients need to be realized at each of the pixels as gain/multiplication factors. Let R be the responsivity of the image sensor, which is a function of the object's reflectance function I and the detector's bias voltage V; then:

$$R = g(I, V) \tag{6.5}$$

where g is some nonlinear function of I and V. If I is the object's reflectance function in the spatial domain, then its frequency-domain transform is given by

$$y^{uv} = \frac{2}{\sqrt{MN}} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \left[ C(u)C(v)I_{ij} \cos \frac{\pi(2i+1)u}{2N} \times \cos \frac{\pi(2j+1)v}{2N} \right], \tag{6.6}$$

where i, j are integers in the range of [0, N-1], which are used to address different pixels, and C(u) and C(v) are defined in the following equation:

$$C(u), C(v) = \begin{cases} \frac{1}{\sqrt{2}} & \text{if } u, v = 0\\ 1 & \text{otherwise}. \end{cases} \tag{6.7}$$

The inverse of the DCT transform function is defined as:

$$I_{ij} = \frac{2}{\sqrt{MN}} \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} \left[ C(u)C(v)\, y^{uv} \cos \frac{\pi (2i+1)u}{2N} \times \cos \frac{\pi (2j+1)v}{2N} \right]. \tag{6.8}$$

To implement the computationally intensive DCT transform in hardware, we have reordered equation (6.6) and decoupled the bias (mask) matrices from the image sensor responses, which is shown in equation (6.9):

$$y^{uv} = \frac{2}{\sqrt{MN}} C(u) C(v) \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \left[ I_{ij} Mask^{uv}(i,j) \right] , \qquad (6.9)$$

where

$$u, v = 0, 1, \ldots, N-1 .$$

In the above equation,  $Mask^{uv}(i, j)$  is the mask set that is to be loaded to the image sensor as the bias information. If we assume N = M for exact reconstruction, the total number of masks is  $N \times N$ . The mask matrices can be represented as:

$$Mask^{uv}(i,j) = \cos \frac{\pi (2i+1)u}{2N} \times \cos \frac{\pi (2j+1)v}{2N} . \tag{6.10}$$

In the calculation of the mask matrices, because C(u) and C(v) are not functions of i and j, they are treated as constants and are not included in equation (6.10). In this way, because all of the coefficients are limited to the same range of [-1, +1], we can efficiently use the limited dynamic range of the analog memory to store the bias voltage; otherwise, the DC coefficient would need a greater number of bits to deliver the same SNR.
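The decoupling of the masks from the image in equations (6.9) and (6.10) can be sketched in a few lines for the noise-free case with N = M. This is an illustration under assumptions, not the firmware implementation: plain Python lists are used to keep the sketch dependency-free, the 4×4 test image is made up, and the inverse transform here includes the factor C(u)C(v), which the standard orthonormal 2D DCT requires for exact recovery.

```python
import math


def c(k):
    """Normalization factor C(k) from Eq. (6.7)."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


def dct_mask(u, v, n):
    """Mask^{uv}(i, j) from Eq. (6.10); every entry lies in [-1, 1]."""
    return [[math.cos(math.pi * (2 * i + 1) * u / (2 * n)) *
             math.cos(math.pi * (2 * j + 1) * v / (2 * n))
             for j in range(n)] for i in range(n)]


def project(image, mask):
    """Inner product of image and mask, i.e. the double sum in Eq. (6.9)."""
    return sum(p * m for img_row, mask_row in zip(image, mask)
               for p, m in zip(img_row, mask_row))


def dct2(image):
    """Forward transform: one projection per mask, scaled as in Eq. (6.9)."""
    n = len(image)
    return [[2.0 / n * c(u) * c(v) * project(image, dct_mask(u, v, n))
             for v in range(n)] for u in range(n)]


def idct2(coeffs):
    """Inverse transform as a weighted sum of the same masks."""
    n = len(coeffs)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            mask = dct_mask(u, v, n)
            scale = 2.0 / n * c(u) * c(v) * coeffs[u][v]
            for i in range(n):
                for j in range(n):
                    out[i][j] += scale * mask[i][j]
    return out


# Made-up 4x4 test image.
image = [[1.0, 2.0, 3.0, 4.0],
         [5.0, 6.0, 7.0, 8.0],
         [9.0, 10.0, 11.0, 12.0],
         [13.0, 14.0, 15.0, 16.0]]
coeffs = dct2(image)
recon = idct2(coeffs)
```

In the hardware, `project` corresponds to the in-pixel multiplication followed by summation, and the scale factor containing C(u)C(v) is applied outside the sensor, which is what keeps every stored mask value within [-1, +1].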

The discussion above holds as long as the system is noise free. However, the system's transfer function shown in Fig. 5.5 calls for a more intelligent bias selection algorithm. Because of the device's limited dynamic range and the noisy behavior of the system, it is essential to have a bias selection algorithm that prescribes the optimal bias for each pixel, minimizing the effect of noise, together with a linear transformation that makes all coefficients achievable within the given dynamic range. The next section is devoted to a mathematical model of the device response and the bias selection algorithm.

# 6.4.2 Bias selection algorithm

The projection and reconstruction are exact as long as the device behaves deterministically for the applied mask. However, the complexity rises when its behavior is random and there is a finite uncertainty in its response. In this case, the naïve reconstruction method does not lead to exact recovery because it is difficult to find a unique bias that achieves the designated gain factor. We now discuss a technique that enables us to optimally choose the bias for a given mask coefficient.

To describe the bias-selection method, as shown in Fig. 6.5(a), we consider a set of basis masks,  $\{B^k\}_{k=1}^N$ , each of which is to be implemented by a 2D array of biases to be determined later. Each of these masks consists of a 2D array of coefficients, given by  $\{\{b_{ij}^k\}\}_{i,j=1}^N$ . The objective is to map each of these  $b_{ij}^k$  coefficients into achievable responsivity values by applying an appropriate bias drawn from the responsivity function  $\tilde{R}(v)$ . Here,  $\tilde{R}(v)$  is the noisy responsivity of the device as a function of the applied bias. This bias assignment is performed according to the optimization criterion developed below.

For an imaging system of resolution  $N=96\times96$  pixels, the image captured by the system, I, the matrix of DCT coefficients, Y, and the k-th ideal DCT mask,  $B^{(k)}$ , are



Figure 6.5: Acquisition and compression processes, which include mapping k mask matrices to their corresponding bias voltages. The mapping is based on the system's transfer function shown in Fig. 5.5. Then, the bias matrices stored in the Raspberry Pi memory are loaded to the imager and projected onto the object's reflection function. The resultant products are optionally summed in hardware, and the k resulting coefficients are sent to the remote computer for reconstruction.

represented by

$$I = \begin{bmatrix} I_{1,1} & \dots & I_{1,96} \\ \vdots & \ddots & \vdots \\ I_{96,1} & \dots & I_{96,96} \end{bmatrix}, Y = \begin{bmatrix} y^{(1)} & \dots & y^{(N)} \\ \vdots & \ddots & \vdots \\ y^{(N^2 - N)} & \dots & y^{(N^2)} \end{bmatrix},$$

and

$$B^{(k)} = \begin{bmatrix} \tilde{b}_{1,1}^{(k)} & \dots & \tilde{b}_{1,96}^{(k)} \\ \vdots & \ddots & \vdots \\ \tilde{b}_{96,1}^{(k)} & \dots & \tilde{b}_{96,96}^{(k)} \end{bmatrix}.$$

The k-th practical mask based on noisy responsivity is

$$\tilde{R}^{(k)} = \begin{bmatrix} \tilde{r}_{1,1}^{(k)} & \dots & \tilde{r}_{1,96}^{(k)} \\ \vdots & \ddots & \vdots \\ \tilde{r}_{96,1}^{(k)} & \dots & \tilde{r}_{96,96}^{(k)} \end{bmatrix} ,$$

and

$$\tilde{r}(v) = r(v) + \eta(\mu, \sigma_v^2) , \qquad (6.11)$$

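where  $r(v)$  is the ideal responsivity on which the implementable k-th mask is based, and  $\eta(\mu, \sigma_v^2)$  is a noise term with mean  $\mu$  and bias-dependent variance  $\sigma_v^2$ . The expressions for the individual DCT coefficients corresponding to the noisy responsivity mask and for the corresponding error are given by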

$$y_{\tilde{R}}^{(k)} = \sum_{i=1}^{96} \sum_{j=1}^{96} I_{i,j} \tilde{r}_{i,j}^{(k)}(v) , \qquad (6.12)$$

and

$$\begin{aligned} y_{err}^{(k)} &= y_{idl}^{(k)} - y_{\tilde{R}}^{(k)} \\ &= \sum_{i=1}^{96} \sum_{j=1}^{96} I_{i,j} \left( b_{i,j}^{(k)} - \tilde{r}_{i,j}^{(k)}(v) \right) , \end{aligned} \tag{6.13}$$

where the  $k^{th}$  DCT coefficient corresponding to the ideal mask is denoted by:

$$y_{idl}^{(k)} = \sum_{i} \sum_{j} I_{i,j} b_{i,j}^{(k)} . \tag{6.14}$$

The objective here is to optimize:

$$y_{err}^{(k)} = \left(b_{i,j}^{(k)} - \tilde{r}_{i,j}^{(k)}(v)\right)^2 \to 0 \quad , \tag{6.15}$$

for all k. Considering only one of the k coefficients, dropping off the indices, and letting  $v_o$  be the parameter to be estimated, we can obtain  $V_{opt}$  by using MMSE conditions as follows:

$$\begin{aligned} V_{opt} &= \underset{v_o}{\operatorname{argmin}}\; E\left[ (b - \tilde{r}(v))^2 \right] \\ &= \underset{v_o}{\operatorname{argmin}} \left[ (b - r(v))^2 - 2E(\eta) (b - r(v)) + E(\eta^2) \right] , \end{aligned} \tag{6.16}$$

where  $\tilde{r}(v)$  is given in (6.11). Here,  $V_{opt}$  is the optimum bias voltage, that is, the bias at which the gain at the camera pixel is closest to the mask coefficient b.

Now, if the objective function to be minimized is:

$$f(v) = (b - r(v))^{2} - 2\mu (b - r(v)) + \mu^{2} + \sigma_{v}^{2}, \tag{6.17}$$

then to find the optimum  $v_o$ , we set  $\frac{d}{dv}f(v_o) = 0$ , and thus we obtain

$$r(v_o) = b - \mu - \sigma_{v_o} \frac{\frac{d}{dv} \sigma_{v_o}}{\frac{d}{dv} r(v_o)} , \tag{6.18}$$

and, substituting (6.18) into (6.17), the corresponding minimum value of the objective function is

$$f(v_o) = \sigma_{v_o}^2 \left[ \left( \frac{\frac{d}{dv} \sigma_{v_o}}{\frac{d}{dv} r(v_o)} \right)^2 + 1 \right]. \tag{6.19}$$

The above expressions describe the optimal bias selection and the derivation of the corresponding gain coefficients to be implemented at the pixel to realize the mask coefficient b.
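The minimization of the objective function in equation (6.17) can be sketched as a simple grid search over the bias. Everything specific in this sketch is an assumption made for illustration: the saturating responsivity `tanh(v)`, the noise model whose standard deviation grows with the bias magnitude, the zero noise mean, and the bias range are hypothetical and are not the measured transfer function of Fig. 5.5.

```python
import math


def responsivity(v):
    """Hypothetical noise-free responsivity r(v): a saturating curve."""
    return math.tanh(v)


def noise_sigma(v):
    """Hypothetical noise standard deviation that grows with bias magnitude."""
    return 0.02 + 0.05 * abs(v)


def objective(v, b, mu):
    """f(v) from Eq. (6.17): expected squared error between b and the noisy gain."""
    d = b - responsivity(v)
    return d * d - 2.0 * mu * d + mu * mu + noise_sigma(v) ** 2


def select_bias(b, mu=0.0, v_min=-3.0, v_max=3.0, steps=6001):
    """Grid search for the bias that minimizes f(v) for mask coefficient b."""
    grid = [v_min + (v_max - v_min) * k / (steps - 1) for k in range(steps)]
    return min(grid, key=lambda v: objective(v, b, mu))


b = 0.6                    # target mask coefficient
v_naive = math.atanh(b)    # bias that ignores noise, i.e. solves r(v) = b
v_opt = select_bias(b)     # bias that also accounts for bias-dependent noise
```

With these assumed curves, the selected bias is slightly lower than the naïve one: giving up a small amount of accuracy in r(v) is compensated by operating where the noise variance is smaller, which is the trade-off expressed by equation (6.18).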

# 6.4.3 DCT-based image compression

Once the optimal masks and gains are designed with the aid of the bias selection algorithm, the biases are applied to the hardware, which in turn results in achieving the desired coefficients as gain factors at each pixel. Finally, the DCT coefficient corresponding to each mask is achieved by

$$y_{opt}^{k} = \sum_{i=1}^{96} \sum_{j=1}^{96} I_{i,j} \tilde{r}_{i,j}^{k}(v_{opt}) . \tag{6.20}$$

# 6.4.4 DCT-based image reconstruction

The common reconstruction approach is achieved by simply taking the linear combination of the masks to which the image was projected. The coefficients are the projection results as shown below:

$$I = \sum_{k} R^k_{opt} y^k_{opt} . \tag{6.21}$$

Following the discussion above, we performed DCT-based image compression optimally on the hardware. However, some error still exists in the projection coefficients and propagates to the reconstruction; it is mainly due to the limited dynamic range of the pixels and the various uncharacterized random noise sources present in the hardware.

# 6.4.5 Compressive sensing implementation

The second type of in-pixel compressed-domain acquisition that we have explored is compressive sensing (CS). In DCT transform coding, the gain values vary continuously, which requires maximal exploitation of the device's dynamic range, whereas the CS implementation reduces the complexity by using only zeros and ones, which makes the system more resilient to noise. Here, we present some background regarding CS and the implementation methodology on the proposed hardware.

CS is based on the principle of achieving a larger and more efficient compression provided that the desired data is sparse in some basis. Sparsity is the primary condition here, which will lead to efficient reconstruction of data if it is sampled in the proper domain. We consider the input image as a discrete-time column vector  $x \in \mathbb{R}^P$  with elements x[n] where  $n = 1, 2, \ldots, P$  and  $P = 96 \times 96$ . Then x can be represented as a linear combination of elements from an orthonormal basis  $\{\phi_i\}_{i=1}^P$ 



Figure 6.6: The resulting images reconstructed using a) naïve DCT, b) least-mean-square error based DCT, and c) compressive sensing. d) The performance of different methods is compared in terms of the mean square error between the reconstructed image and the original image.

and coefficients  $s_i$ . Here,

$$x = \sum_{i=1}^{P} s_i \phi_i , \qquad (6.22)$$

or

$$x = \phi s .$$

We assume that s is sparse with K nonzero coefficients. Now, by selecting an efficient binary random sensing matrix  $\psi$ , we can represent the reduced data set as  $y = \psi x$ , where  $\psi$  is a binary matrix of size M×P and  $M \ll P$ . In this way, the dimension of the data set is reduced from P to M. However, the size M also needs to be properly determined for stable reconstruction. The standard expression for computing M is given as

$$M \ge cK \log(\frac{P}{K}) \;,$$

where c is a constant. The matrix  $\psi$  is composed of M basis functions in P dimensions onto which the data x is projected, i.e.,  $\psi = [\psi_1 | \psi_2 | \dots | \psi_M]^T$ , where each  $\psi_i$  is of size  $P \times 1$ . The matrix was designed with the restricted isometry property (RIP) given below:

$$(1 - \sigma_K) \|x\|_2^2 \leq \|\psi x\|_2^2 \leq (1 + \sigma_K) \|x\|_2^2 , \tag{6.23}$$

where

$$\sigma_K \in [0,1) .$$

Moreover, each  $\psi_i$  is converted to an equivalent 2D data set and then implemented in hardware as a measurement mask. Because this mask is composed of binary elements, it is easier to achieve projections, as the detector tends to switch on or off depending on the bias applied for acquisition. After the coefficients have been obtained from the projection of the image onto the reduced basis, the challenging problem is to reconstruct the full data set from its dimensionally reduced format. Specifically, we aim to reconstruct the image vector x by using only the M measurements in vector y, the random measurement matrix  $\psi$ , and the orthonormal basis  $\phi$ . Equivalently, we could reconstruct the sparse coefficient vector s. The estimate is given by the  $\ell_1$  minimization criterion as

$$\hat{x} = \operatorname{argmin} \left\| x' \right\|_{1}, \qquad (6.24)$$

such that

$$\psi x^{'}=y .$$

The reconstruction was performed with the aid of the  $\ell_1$ -magic algorithm, using the same random basis for reconstruction that was used for the hardware implementation [85].
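The measurement side of this scheme, that is, choosing M and projecting the image onto binary masks, can be sketched as follows. The sketch makes several assumptions that are not in the text: the constant c is set to 1, the logarithm is taken as natural, the sparsity K is an arbitrary example value, a 16×16 array is used instead of 96×96 to keep the run time short, and the masks are plain uniform random bits with no RIP check. The  $\ell_1$  reconstruction step is not shown.

```python
import math
import random


def min_measurements(p, k, c=1.0):
    """Smallest integer M with M >= c * K * log(P / K) (natural log assumed)."""
    return math.ceil(c * k * math.log(p / k))


def binary_masks(m, side, rng):
    """m random 0/1 masks of size side x side, one per row of the sensing matrix."""
    return [[[rng.randint(0, 1) for _ in range(side)] for _ in range(side)]
            for _ in range(m)]


def measure(image, masks):
    """y[i] = sum over pixels of mask_i * image, i.e. y = psi x."""
    return [sum(m * p for mask_row, img_row in zip(mask, image)
                for m, p in zip(mask_row, img_row))
            for mask in masks]


SIDE = 16                   # small array for illustration (the chip is 96x96)
P = SIDE * SIDE             # number of pixels
K = 8                       # hypothetical sparsity level
M = min_measurements(P, K)  # number of masks / measurements

rng = random.Random(1)      # seeded generator so the masks are repeatable
masks = binary_masks(M, SIDE, rng)
image = [[1.0] * SIDE for _ in range(SIDE)]  # flat test image
y = measure(image, masks)
```

For the flat test image, each measurement simply counts the pixels that are switched on in the corresponding mask, which makes the behavior of `measure` easy to verify by inspection.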

# 6.4.6 Performance comparison between naïve DCT, LMS DCT, and CS reconstruction

For a prescribed intensity modulation factor, mandated by the DCT masks, for example, we analytically calculated the required voltage using the bias-selection algorithm discussed in Section 6.4.2. Note that without such a statistical calculation of the voltage, the implementation of the modulation level would be inexact, which would result in errors in the image reconstruction. Figure 6.6 shows reconstructed images for different compression methods with different numbers of projection coefficients taken into account. The importance of the statistical calculation of the voltages is evidenced by the presence of noise in the images reconstructed using the naïve approach, which uses bias voltages that are calculated without considering the uncertainty in the ROIC's implementation of the masks, as shown in Fig. 6.6(a). In contrast, the reconstruction based on the bias-selection algorithm tends to achieve a better reconstruction, as seen in Fig. 6.6(b). In addition, the CS reconstruction, as shown in Fig. 6.6(c), outperforms the DCT-based approach. For the given results, we can see that the naïve reconstruction fails to retrieve both the details of the image and the contrast levels because of the presence of noise. The MMSE-based results, however, achieve better contrast and reproduce most of the details of the original image. Finally, CS gives exact reconstruction when a sufficient number of coefficients is used.

This exact reconstruction in CS is due to the fact that CS exploits randomness as a tool to extract information with fewer coefficients, and the uncertainty in responsivity has less of an effect on it than on the DCT approach, which relies on the exact implementation of the masks. Also, for the DCT transform, the reconstruction is a linear combination of projection coefficients with the corresponding basis masks, so an error in projection is propagated. CS reconstruction uses  $\ell_1$  minimization-based optimization, which tends to keep the reconstruction noise as low as possible. Hence, the CS-based reconstruction is more tolerant of uncertainty in the electronic mask implementation due to its robust  $\ell_1$  optimization, whereas the DCT approach uses an  $\ell_2$  optimization, which is known for its inferior performance compared to  $\ell_1$  optimization.

# 6.5 Conclusions

The spatio-temporal individual bias tunability of the system is well suited to pixel-based algorithms such as nonuniformity correction and compressive sensing. The successful design of the chip and the flexible test hardware/firmware lay a solid foundation for future applications of infrared FPAs, such as infrared retina, classification cameras, and remote sensing imagers.

A hardware implementation of a real-time compressed-domain image acquisition system is demonstrated. The system performs front-end computational imaging, whereby the inner product between an image and an arbitrarily specified mask is implemented in silicon within the analog readout circuit. The acquisition system is based on an iROIC that is capable of providing the biasing voltage to the individual detectors at each pixel, which enables the implementation of spatial multiplication with any prescribed mask through a bias-controlled gain mechanism. The modulated pixels are then summed electronically to generate the compressed samples, namely, aperture-coded coefficients, of an image. 2D-DCT and CS-based random matrices have been used, and the successful implementation results open the avenue to many other pixel-based remote sensing applications.

# Chapter 7

# Conclusions and future directions

In this report, the design of two intelligent ROICs for in-pixel image processing is explained as a way to tackle the problem of processing big data. The two ROICs are designed based on a smart modification to the CTIA preamplifier. The CTIA preamplifier was chosen because it offers a high swing range at the input and the output and supports both positive and negative bias voltages to the detector. The modified CTIA unit cells enable in-pixel processing of the information at the same time it is being captured, so there is no need for transmission and post-processing of the raw information.

The choice of in-pixel processing, which is done in the analog domain, is an alternative to converting the raw information to digital and later processing it in a general-purpose DSP. While the DSP option offers a higher degree of customizability, it cannot keep up with the continual increase in image resolution. The ROICs presented here process the information at acquisition time and inside the pixel, so they scale much better.

In the first scheme, we have improved the design of the unit cell in a way that it can extract the spectral features by making basic calculations over the measured photocurrent. To implement the in-pixel spectral feature extraction, we utilized the DWELL photodetector, which is spectrally tunable with the applied bias voltage. To achieve a narrow-band nonoverlapping spectral filter, we employed a spectral tuning algorithm that was previously reported by our group [26, 27, 28]. We have also implemented a max-identifier block, which employs a spectral-tuning algorithm to find the class of the object. The intelligent ROIC that is designed for in-pixel classification has passed comprehensive validation and proved to be functional for multispectral classification. To implement a multispectral classification system, we need a functional FPA, which remains an open direction for future work. Once the FPA is successfully fabricated, the single photodetectors on the corners of the FPA must be recharacterized (to deal with the nonuniformity of the photodetectors in different corners of the wafer), and the feature extraction algorithm must be re-tuned for the new device.

In the second scheme, a ROIC composed of  $96 \times 96$  unit cells has been demonstrated. The ROIC has the unique feature of tuning the bias information of different pixels, individually. The spatial and temporal bias tunability of the pixels reveals its potential to be used in many different applications that require variable gain for individual pixels. A custom PCB testing hardware was developed to interface to Raspberry-PI board, which serves as both image grabber, and generates the timing signals. Extended support for onboard storage and embedded drivers for Ethernet-enabled development of a very flexible and versatile firmware.

A few applications have been developed on this hardware, including acquisition-time nonuniformity correction and compressed-domain image grabbing based on transform coding. There are many directions for future optimization of this design. For example, the present design uses a PN junction photodetector, laid out on the right and top sides of the unit cell, to sense the object. Our future plan is to work with the dot-in-a-well (DWELL) quantum-well infrared photodetector (QWIP) to exploit multispectral imaging in the near-infrared regime. Another improvement is to use compressive sensing algorithms that are based on binary random matrices so that the limited dynamic range of the pixels is used more efficiently.
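As a rough illustration of the binary-matrix idea, the sketch below forms compressive measurements of a small pixel block with a random 0/1 matrix. It is an example under stated assumptions, not the proposed design: the block size, the number of measurements, the seed, and the names `binary_matrix` and `measure` are all hypothetical.

```python
import random


def binary_matrix(m, n, seed=0):
    """Build an m-by-n matrix whose entries are drawn uniformly from {0, 1}."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n)] for _ in range(m)]


def measure(phi, pixels):
    """Each measurement is the sum of the pixels selected by a 1 in that row."""
    return [sum(p for bit, p in zip(row, pixels) if bit) for row in phi]


# Hypothetical sparse 8-pixel block, compressed to 4 measurements.
pixels = [3, 0, 0, 7, 0, 0, 0, 1]
phi = binary_matrix(4, len(pixels))
y = measure(phi, pixels)
print(y)
```

Because every matrix entry is 0 or 1, each measurement is a plain sum over a subset of pixels and needs no per-pixel multiplication. Recovering the block from `y` would require a separate sparse-reconstruction step, which is not shown here.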

# References

- [1] R. Aikens, D. Agard, and J. Sedat, "Solid-state imagers for microscopy," Methods Cell Biol, vol. 29, pp. 291–313, 1989.
- [2] I. M. Ross, "The invention of the transistor," *Proceedings of the IEEE*, vol. 86, no. 1, pp. 7–28, 1998.
- [3] W. F. Brinkman, D. E. Haggan, and W. W. Troutman, "A history of the invention of the transistor and where it will lead us," *Solid-State Circuits, IEEE Journal of*, vol. 32, no. 12, pp. 1858–1865, 1997.
- [4] J. Lukas, J. Fridrich, and M. Goljan, "Digital camera identification from sensor pattern noise," *Information Forensics and Security*, *IEEE Transactions on*, vol. 1, no. 2, pp. 205–214, 2006.
- [5] A. Schwarz, Z. Zalevsky, and Y. Sanhedrai, "Digital camera sensing and its image disruption with controlled radio-frequency reception/transmission," in *Microwaves, Communications, Antennas and Electronics Systems (COMCAS), 2011 IEEE International Conference on*, pp. 1–6, IEEE, 2011.
- [6] L. Colace, G. Masini, V. Cencelli, F. DeNotaristefani, and G. Assanto, "A near-infrared digital camera in polycrystalline germanium integrated on silicon," *Quantum Electronics, IEEE Journal of*, vol. 43, no. 4, pp. 311–315, 2007.

- [7] C.-C. Hsieh, C.-Y. Wu, F.-W. Jih, and T.-P. Sun, "Focal-plane-arrays and CMOS readout techniques of infrared imaging systems," *Circuits and Systems for Video Technology, IEEE Transactions on*, vol. 7, no. 4, pp. 594–605, 1997.
- [8] D. A. Scribner, M. R. Kruer, and J. M. Killiany, "Infrared focal plane array technology," *Proceedings of the IEEE*, vol. 79, no. 1, pp. 66–85, 1991.
- [9] A. Rogalski, "Terahertz detectors and focal plane arrays," Elektronika: konstrukcje, technologie, zastosowania, vol. 51, no. 4, pp. 93–108, 2010.
- [10] M. Bigas, E. Cabruja, J. Forest, and J. Salvi, "Review of CMOS image sensors," *Microelectronics journal*, vol. 37, no. 5, pp. 433–451, 2006.
- [11] M. El-Desouki, M. Jamal Deen, Q. Fang, L. Liu, F. Tse, and D. Armstrong, "CMOS image sensors for high speed applications," *Sensors*, vol. 9, no. 1, pp. 430–444, 2009.
- [12] A. J. Theuwissen, "CMOS image sensors: State-of-the-art," Solid-State Electronics, vol. 52, no. 9, pp. 1401–1406, 2008.
- [13] S. Mendis, S. E. Kemeny, and E. R. Fossum, "CMOS active pixel image sensor," *IEEE transactions on Electron Devices*, vol. 41, no. 3, pp. 452–453, 1994.
- [14] J. Short, "How much media: report on American consumers," Institute for Communications Technology Management, Marshall School of Business, 2013.
- [15] M. Lustig, D. L. Donoho, J. M. Santos, and J. M. Pauly, "Compressed sensing MRI," *IEEE Signal Processing Magazine*, vol. 25, no. 2, pp. 72–82, 2008.
- [16] M. Lustig, D. Donoho, and J. M. Pauly, "Sparse MRI: The application of compressed sensing for rapid MR imaging," *Magnetic resonance in medicine*, vol. 58, no. 6, pp. 1182–1195, 2007.

- [17] R. M. Willett, M. F. Duarte, M. A. Davenport, and R. G. Baraniuk, "Sparsity and structure in hyperspectral imaging: Sensing, reconstruction, and target detection," *Signal Processing Magazine*, *IEEE*, vol. 31, no. 1, pp. 116–126, 2014.
- [18] A. Dupret, B. Dupont, M. Vasiliu, B. Dierickx, and A. Defernez, "CMOS image sensor architecture for high-speed sparse image content readout," in *IEEE International Image Sensor Workshop*, pp. 26–28, 2009.
- [19] G. Bansal, "Digital radiography. A comparison with modern conventional imaging," *Postgraduate medical journal*, vol. 82, no. 969, pp. 425–428, 2006.
- [20] F. Chen, A. P. Chandrakasan, and V. M. Stojanovic, "Design and analysis of a hardware-efficient compressed sensing architecture for data compression in wireless sensors," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 3, pp. 744–756, 2012.
- [21] F. Chen, A. P. Chandrakasan, and V. Stojanović, "A signal-agnostic compressed sensing acquisition system for wireless and implantable sensors," in *Custom Integrated Circuits Conference (CICC)*, 2010 IEEE, pp. 1–4, IEEE, 2010.
- [22] E. J. Candès and M. B. Wakin, "An introduction to compressive sampling," Signal Processing Magazine, IEEE, vol. 25, no. 2, pp. 21–30, 2008.
- [23] P. Zarkesh-Ha, W. Jang, P. Nguyen, A. Khoshakhlagh, and J. Xu, "A reconfigurable ROIC for integrated infrared spectral sensing," in 2010 IEEE Photonics Society's 23rd Annual Meeting, 2010.
- [24] S. Krishna, "The infrared retina," Journal of Physics D: Applied Physics, vol. 42, no. 23, p. 234005, 2009.
- [25] S. Kavusi and A. El Gamal, "Folded multiple-capture: An architecture for high dynamic range disturbance-tolerant focal plane array," in *Defense and Security*, pp. 351–360, International Society for Optics and Photonics, 2004.

- [26] W.-Y. Jang, M. M. Hayat, J. S. Tyo, R. S. Attaluri, T. E. Vandervelde, Y. D. Sharma, R. Shenoi, A. Stintz, E. R. Cantwell, S. C. Bender, *et al.*, "Demonstration of bias-controlled algorithmic tuning of quantum dots in a well (DWELL) mid-IR detectors," *IEEE Journal of Quantum Electronics*, vol. 45, no. 6, pp. 674–683, 2009.
- [27] B. Paskaleva, W.-Y. Jang, S. C. Bender, Y. D. Sharma, S. Krishna, and M. M. Hayat, "Multispectral classification with bias-tunable quantum dots-in-a-well focal plane arrays," *Sensors Journal*, *IEEE*, vol. 11, no. 6, pp. 1342–1351, 2011.
- [28] W.-Y. Jang, M. M. Hayat, S. E. Godoy, S. C. Bender, P. Zarkesh-Ha, and S. Krishna, "Data compressive paradigm for multispectral sensing using tunable DWELL mid-infrared detectors," *Optics express*, vol. 19, no. 20, pp. 19454–19472, 2011.
- [29] J. Ghasemi, P. Zarkesh-Ha, G. R. Fiorante, and S. Krishna, "A new CMOS readout circuit approach for multispectral imaging," in 2013 IEEE Photonics Conference, 2013.
- [30] J. Ghasemi, P. Zarkesh-Ha, S. Krishna, S. E. Godoy, and M. M. Hayat, "A novel readout circuit for on-sensor multispectral classification," in *Circuits and Systems (MWSCAS), 2014 IEEE 57th International Midwest Symposium on*, pp. 386–389, IEEE, 2014.
- [31] G. Rogerio Cugler Fiorante, P. Zarkesh-Ha, J. Ghasemi, and S. Krishna, "Spatiotemporal tunable pixels for multi-spectral infrared imagers," in *Circuits and Systems (MWSCAS), 2013 IEEE 56th International Midwest Symposium on*, pp. 317–320, IEEE, 2013.
- [32] S. Krishna, J. S. Tyo, M. M. Hayat, S. Raghavan, and U. Sakoglu, "Detector with tunable spectral response," May 15 2007. US Patent 7,217,951.

- [33] J. Brauers, N. Schulte, and T. Aach, "Multispectral filter-wheel cameras: Geometric distortion model and compensation algorithms," *Image Processing*, *IEEE Transactions on*, vol. 17, no. 12, pp. 2368–2380, 2008.
- [34] P.-J. Lapray, X. Wang, J.-B. Thomas, and P. Gouton, "Multispectral filter arrays: Recent advances and practical implementation," *Sensors*, vol. 14, no. 11, pp. 21626–21659, 2014.
- [35] N. Y. Aziz, G. T. Kincaid, W. J. Parrish, J. T. Woolaway II, and J. L. Heath, "Standardized high-performance 320 by 256 readout integrated circuit for infrared applications," in *Aerospace/Defense Sensing and Controls*, pp. 80–90, International Society for Optics and Photonics, 1998.
- [36] S. Krishna, D. Forman, S. Annamalai, P. Dowd, P. Varangis, T. Tumolillo Jr, A. Gray, J. Zilko, K. Sun, M. Liu, et al., "Demonstration of a 320×256 two-color focal plane array using InAs/InGaAs quantum dots in well detectors," Applied Physics Letters, vol. 86, no. 19, p. 193501, 2005.
- [37] P. Zarkesh-Ha, "An intelligent readout circuit for infrared multispectral remote sensing," in *Circuits and Systems (MWSCAS), 2014 IEEE 57th International Midwest Symposium on*, pp. 153–156, IEEE, 2014.
- [38] B. Paskaleva, M. M. Hayat, Z. Wang, J. S. Tyo, and S. Krishna, "Canonical correlation feature selection for sensors with overlapping bands: Theory and application," *Geoscience and Remote Sensing, IEEE Transactions on*, vol. 46, no. 10, pp. 3346–3358, 2008.
- [39] S. A. Empedocles and M. G. Bawendi, "Quantum-confined Stark effect in single CdSe nanocrystallite quantum dots," *Science*, vol. 278, no. 5346, pp. 2114–2117, 1997.
- [40] W.-Y. Jang, M. M. Hayat, P. Zarkesh-Ha, and S. Krishna, "Continuous time-varying biasing approach for spectrally tunable infrared detectors," *Optics express*, vol. 20, no. 28, pp. 29823–29837, 2012.
- [41] J. M. Eichenholz, N. Barnett, Y. Juang, D. Fish, S. Spano, E. Lindsley, and D. L. Farkas, "Real-time megapixel multispectral bioimaging," in *BiOS*, pp. 75681L–75681L, International Society for Optics and Photonics, 2010.
- [42] N. Aleixos, J. Blasco, F. Navarron, and E. Moltó, "Multispectral inspection of citrus in real-time using machine vision and digital signal processors," *Computers and electronics in agriculture*, vol. 33, no. 2, pp. 121–137, 2002.
- [43] R. Biddappa, "Clock domain crossing," The Cadence India Newsletter, pp. 2–8, 2005.
- [44] S. Schidl, R. Enne, and H. Zimmermann, "Blue enhanced vertically stacked quad junction photodetector with opto window," *Electronics Letters*, vol. 51, no. 10, pp. 777–778, 2015.
- [45] J.-W. Shi, K.-L. Chi, C.-Y. Li, and J.-M. Wun, "Dynamic analysis of high-efficiency InP-based photodiode for 40 Gbit/s optical interconnect across a wide optical window (0.85 to 1.55 μm)," Journal of Lightwave Technology, vol. 33, no. 4, pp. 921–927, 2015.
- [46] K. MacDonald, "Industrial Products Division Introduction to "Video 101"." http://hexagon.physics.wisc.edu/research/technical%20info/pulnixvideoguide.pdf. [Online; accessed 03-March-2016].
- [47] G. Prisco and M. D'Urso, "An effective approach for sparse arrays design with the minimum number of sensors," in Antennas and Propagation (EUCAP), Proceedings of the 5th European Conference on, pp. 1277–1278, IEEE, 2011.

- [48] M. Loog, B. van Ginneken, and R. P. Duin, "Dimensionality reduction of image features using the canonical contextual correlation projection," *Pattern recognition*, vol. 38, no. 12, pp. 2409–2418, 2005.
- [49] M. F. Duarte, M. A. Davenport, D. Takhar, J. N. Laska, T. Sun, K. F. Kelly, R. G. Baraniuk, et al., "Single-pixel imaging via compressive sampling," *IEEE Signal Processing Magazine*, vol. 25, no. 2, p. 83, 2008.
- [50] M. Leinonen, M. Codreanu, and M. Juntti, "Compressed acquisition and progressive reconstruction of multi-dimensional correlated data in wireless sensor networks," in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6449–6453, IEEE, 2014.
- [51] M. J. Rubin and T. Camp, "On-mote compressive sampling to reduce power consumption for wireless sensors," in Sensor, Mesh and Ad Hoc Communications and Networks (SECON), 2013 10th Annual IEEE Communications Society Conference on, pp. 291–299, IEEE, 2013.
- [52] B. Fowler and A. El Gamal, "CMOS image sensor with pixel level A/D conversion," Oct. 24 1995. US Patent 5,461,425.
- [53] R. G. Baraniuk, "Compressive sensing," *IEEE signal processing magazine*, vol. 24, no. 4, 2007.
- [54] J. B. Sampsell, "Digital micromirror device and its application to projection displays," *Journal of Vacuum Science & Technology B*, vol. 12, no. 6, pp. 3242–3246, 1994.
- [55] M. B. Wakin, J. N. Laska, M. F. Duarte, D. Baron, S. Sarvotham, D. Takhar, K. F. Kelly, and R. G. Baraniuk, "An architecture for compressive imaging," in 2006 International Conference on Image Processing, pp. 1273–1276, IEEE, 2006.

- [56] P. Llull, X. Liao, X. Yuan, J. Yang, D. Kittle, L. Carin, G. Sapiro, and D. J. Brady, "Coded aperture compressive temporal imaging," *Optics express*, vol. 21, no. 9, pp. 10526–10545, 2013.
- [57] Y. Oike and A. El Gamal, "A 256×256 CMOS image sensor with ΔΣ-based single-shot compressed sensing," in 2012 IEEE International Solid-State Circuits Conference, pp. 386–388, IEEE, 2012.
- [58] S. Krishna, "Mid infrared quantum dots in a well infrared photodetectors," in 5th IEEE Conference on Nanotechnology, 2005., pp. 27–30, IEEE, 2005.
- [59] H. Schneider and H. C. Liu, Quantum well infrared photodetectors. Springer, 2007.
- [60] B. H. Menze, A. Jakab, S. Bauer, J. Kalpathy-Cramer, K. Farahani, J. Kirby, Y. Burren, N. Porz, J. Slotboom, R. Wiest, et al., "The multimodal brain tumor image segmentation benchmark (BRATS)," *IEEE Transactions on Medical Imaging*, vol. 34, no. 10, pp. 1993–2024, 2015.
- [61] M. P. Edgar, G. M. Gibson, R. W. Bowman, B. Sun, N. Radwell, K. J. Mitchell, S. S. Welsh, and M. J. Padgett, "Simultaneous real-time visible and infrared video with single-pixel detectors," *Scientific reports*, vol. 5, 2015.
- [62] F. Shao, W. Lin, G. Jiang, and Q. Dai, "Models of monocular and binocular visual perception in quality assessment of stereoscopic images," *IEEE Transactions on Computational Imaging*, vol. 2, no. 2, pp. 123–135, 2016.
- [63] S. V. Venkatakrishnan, L. F. Drummy, M. Jackson, M. De Graef, J. Simmons, and C. A. Bouman, "Model-based iterative reconstruction for bright-field electron tomography," *IEEE Transactions on Computational Imaging*, vol. 1, no. 1, pp. 1–15, 2015.

- [64] A. Chowdhury, R. Darveaux, J. Tome, R. Schoonejongen, M. Reifel, A. De Guzman, S. S. Park, Y. W. Kim, and H. W. Kim, "Challenges of megapixel camera module assembly and test," in *Proceedings Electronic Components and Technology*, 2005. ECTC'05., pp. 1390–1401, IEEE, 2005.
- [65] N. Nakano, R. Nishimura, H. Sai, A. Nishizawa, and H. Komatsu, "Digital still camera system for megapixel CCD," *IEEE Transactions on Consumer Electronics*, vol. 44, no. 3, pp. 581–586, 1998.
- [66] C. F. Weiman and J. M. Evans Jr, "Digital image compression employing a resolution gradient," Apr. 7 1992. US Patent 5,103,306.
- [67] P. T. Barrett, "Method for image compression on a personal computer," Feb. 15 1994. US Patent 5,287,420.
- [68] J. G. Daugman, "High confidence visual recognition of persons by a test of statistical independence," *IEEE transactions on pattern analysis and machine intelligence*, vol. 15, no. 11, pp. 1148–1161, 1993.
- [69] A. Gandomi and M. Haider, "Beyond the hype: Big data concepts, methods, and analytics," *International Journal of Information Management*, vol. 35, no. 2, pp. 137–144, 2015.
- [70] H. Tian, Noise analysis in CMOS image sensors. PhD thesis, Citeseer, 2000.
- [71] A. Mehrish, A. Subramanyam, and S. Emmanuel, "Sensor pattern noise estimation using probabilistically estimated RAW values," *IEEE Signal Processing Letters*, vol. 23, no. 5, pp. 693–697, 2016.
- [72] K. Yonemoto and H. Sumi, "A CMOS image sensor with a simple fixed-pattern-noise-reduction technology and a hole accumulation diode," *IEEE Journal of Solid-State Circuits*, vol. 35, no. 12, pp. 2038–2043, 2000.

- [73] A. J. Cooper, "Improved photo response non-uniformity (PRNU) based source camera identification," *Forensic science international*, vol. 226, no. 1, pp. 132–141, 2013.
- [74] M. J. Schulz and L. V. Caldwell, "Nonuniformity correction and correctability of infrared focal plane arrays," in SPIE's 1995 Symposium on OE/Aerospace Sensing and Dual Use Photonics, pp. 200–211, International Society for Optics and Photonics, 1995.
- [75] D. Litwiller, "CCD vs. CMOS," *Photonics Spectra*, vol. 35, no. 1, pp. 154–158, 2001.
- [76] B. E. Stine, D. S. Boning, and J. E. Chung, "Analysis and decomposition of spatial variation in integrated circuit processes and devices," *IEEE Transactions on Semiconductor Manufacturing*, vol. 10, no. 1, pp. 24–41, 1997.
- [77] N. Ricquier and B. Dierickx, "Active pixel CMOS image sensor with on-chip non-uniformity correction," in Proc. IEEE Workshop Charge-Coupled Devices and Advanced Image Sensors, pp. 20–22, 1995.
- [78] A. Piva, "An overview on image forensics," ISRN Signal Processing, vol. 2013, 2013.
- [79] M. Sheng, J. Xie, and Z. Fu, "Calibration-based NUC method in real-time based on IRFPA," *Physics Procedia*, vol. 22, pp. 372–380, 2011.
- [80] D. L. Perry and E. L. Dereniak, "Linear theory of nonuniformity correction in infrared staring sensors," *Optical Engineering*, vol. 32, no. 8, pp. 1854–1859, 1993.
- [81] S. N. Torres, E. M. Vera, R. A. Reeves, and S. K. Sobarzo, "Adaptive scene-based nonuniformity correction method for infrared-focal plane arrays," in *AeroSense* 2003, pp. 130–139, International Society for Optics and Photonics, 2003.

- [82] C. Zuo, Q. Chen, G. Gu, and X. Sui, "Scene-based nonuniformity correction algorithm based on interframe registration," JOSA A, vol. 28, no. 6, pp. 1164–1176, 2011.
- [83] S. Saha, "Image compression-from DCT to wavelets: a review," Crossroads, vol. 6, no. 3, pp. 12–21, 2000.
- [84] Y.-M. Zhou, C. Zhang, and Z.-K. Zhang, "An efficient fractal image coding algorithm using unified feature and DCT," *Chaos, Solitons & Fractals*, vol. 39, no. 4, pp. 1823–1830, 2009.
- [85] E. Candes and J. Romberg, "$\ell_1$-magic: Recovery of sparse signals via convex programming." http://www.acm.caltech.edu/l1magic/downloads/l1magic.pdf, 2005. [Online; accessed 03-March-2016].