# Computers and Electrical Engineering

journal homepage: www.elsevier.com/locate/compeleceng

## Low-power ternary content-addressable memory design based on a voltage self-controlled fin field-effect transistor segment<sup>†</sup>

## Yen-Jen Chang<sup>a</sup>, Kun-Lin Tsai<sup>b,\*</sup>, Yu-Cheng Cheng<sup>a</sup>, Meng-Rong Lu<sup>c</sup>

<sup>a</sup> Department of Computer Science and Engineering, National Chung Hsing University, Taichung, Taiwan, ROC <sup>b</sup> Department of Electrical Engineering, Tunghai University, Taichung, Taiwan, ROC <sup>c</sup> Neousys Technology company, New Taipei City, Taiwan, ROC

#### ARTICLE INFO

Article history: Received 3 May 2019 Revised 23 November 2019 Accepted 25 November 2019 Available online 2 December 2019

Keywords: Ternary content-addressable memory (TCAM) Leakage power Dynamic voltage control Fin field-effect transistor (FinFET) Low power

## ABSTRACT

Ternary content-addressable memory (TCAM) is widely used for fast and parallel data searching. However, with technology downscaling, TCAM consumes considerable leakage power, which affects search performance. To control TCAM's leakage power consumption, this paper proposes a multiple-segment voltage self-controlled TCAM (VoSCT) that varies the back-gate voltages of fin field-effect transistors and the supply voltages of a TCAM entry. In the VoSCT, a TCAM entry is partitioned into several segments, and each segment operates in one of the following three modes: high-speed mode, low-power mode, and ultra-low-power mode. Examples that use a real routing table are employed to verify the feasibility of the proposed VoSCT, and the experimental results indicate that leakage power and total power are reduced by 38% and 21%, respectively, with only a 9% increase in search delay and a 4% area overhead compared with the traditional TCAM design.

© 2019 Elsevier Ltd. All rights reserved.

## 1. Introduction

With the rapid growth of Internet of Things (IoT) applications, copious data are transmitted over various network environments, and the demand for fast network routing increases continually. Ternary content-addressable memory (TCAM) is a high-performance search engine, and it is often used in network routers because of its parallel searching capability. However, the parallel searching capability causes tremendous power consumption and also presents many design challenges [1].

The power consumption of a TCAM comes primarily from match-lines (MLs), search-lines (SLs), and clock and control circuits [2]. To solve the problem of TCAM's power consumption, many low-power methods have been proposed, some reducing TCAM's dynamic power [3–5] and some controlling TCAM's static power [6–8] consumption. However, according to the reports of the International Technology Roadmap for Semiconductors [9], memory occupies the greatest area on a modern chip, and the static power required for memory increases as the technology progresses. Although the shrinking of the bulk CMOS size improves circuit performance and transistor density, it also induces a large leakage current and a short-channel effect. A fin field-effect transistor (FinFET), a tri-gate field-effect transistor, was developed and is frequently used to reduce the short-channel effect and leakage current. Several FinFET-based TCAM designs [7,8] have also been proposed to explore minimum energy-delay products.

<sup>†</sup> This paper is for regular issues of CAEE. Reviews processed and recommended for publication to the Editor-in-Chief by Associate Editor Dr. Shadi A. Aljawarneh.

\* Corresponding author.

E-mail address: kltsai@thu.edu.tw (K.-L. Tsai).

https://doi.org/10.1016/j.compeleceng.2019.106528 0045-7906/© 2019 Elsevier Ltd. All rights reserved.

In contrast with other studies, this paper proposes the multisegment voltage self-controlled TCAM (VoSCT), which reduces the leakage power of TCAM by automatically controlling the back-gate voltage of the independent-gate FinFETs (IG-FinFETs) in a segment. In the VoSCT, an *n*-bit TCAM entry is partitioned into several segments, and each segment operates in one of three modes: high-speed mode, low-power mode, or ultra-low-power mode, according to its mask data. In the high-speed mode, the TCAM segment works in the same manner as traditional TCAM; in the low-power mode, the VoSCT controls the back-gate voltages of TCAM cells to reduce leakage power consumption. In the ultra-low-power mode, the VoSCT not only controls the back-gate voltages but also turns off the CAM cells' supply voltages to further reduce the leakage power. Our simulation results indicate that the proposed VoSCT considerably reduces leakage power with only small performance and area penalties.

The rest of this paper is organized as follows. Section 2 introduces the FinFET structure and the architecture of FinFET-based TCAM. Section 3 describes the proposed voltage self-controlled technique. Section 4 presents the experimental results concerning the VoSCT. Finally, Sections 5 and 6 provide the discussion and conclusion of this paper, respectively.

#### 2. Background

#### 2.1. FinFET structure

Distinguished by gate structure, the two types of FinFET are tied-gate FinFETs (TG-FinFETs) and IG-FinFETs. As shown in Fig. 1(a) and (b), the front gate and back gate of the TG-FinFET are connected, enabling them to be controlled with the same input signal. As a result, a TG-FinFET can provide a large drive current and short latency. For an IG-FinFET, as shown in Fig. 1(c) and (d), the front gate and back gate are separated. The front gate is biased with input signals to form the channel, and the back gate is disabled (VDD for a p-type IG-FinFET and GND for an n-type IG-FinFET). Because only one gate forms the channel, the delay for an IG-FinFET is longer than that for a TG-FinFET, but the threshold voltage V<sub>th</sub> of an IG-FinFET is higher, and an IG-FinFET therefore has lower leakage power than a TG-FinFET. Moreover, when the back gate is biased with V<sub>PB</sub> (for a p-type IG-FinFET and V<sub>PB</sub> > VDD) or V<sub>NB</sub> (for an n-type IG-FinFET and V<sub>NB</sub> < GND), the V<sub>th</sub> of an IG-FinFET is much higher than that without V<sub>PB</sub> or V<sub>NB</sub>. This means that the leakage power can be further reduced.

Table 1 shows an example of IG-FinFET leakage current for various back-gate voltages. In the case of a p-type IG-FinFET, when the front gate voltage is 1 V and the back-gate voltage increases from 1 V to 1.1 V (1.2 V), the leakage reduction achieved is 80.56% (93.89%). For an n-type IG-FinFET, when the front gate voltage is 0 V and the back-gate voltage changes from 0 V to -0.1 V (-0.2 V), the leakage reduction achieved is 80.89% (94.05%).
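The reduction percentages quoted from Table 1 follow directly from the leakage currents. The short Python sketch below recomputes them; the dictionary and function names are illustrative only, and the current values are copied from Table 1.

```python
# Leakage currents in nA, copied from Table 1.
# Keys are back-gate voltages in volts; the first key is the unbiased case.
P_TYPE_LEAKAGE_NA = {1.0: 9.407, 1.1: 1.829, 1.2: 0.575}
N_TYPE_LEAKAGE_NA = {0.0: 12.429, -0.1: 2.375, -0.2: 0.740}


def leakage_reduction_percent(baseline_na, biased_na):
    """Relative leakage reduction with respect to the unbiased back gate."""
    return (baseline_na - biased_na) / baseline_na * 100.0


def reductions(table, baseline_voltage):
    """Map each biased back-gate voltage to its reduction, rounded to 2 digits."""
    baseline = table[baseline_voltage]
    return {
        voltage: round(leakage_reduction_percent(baseline, current), 2)
        for voltage, current in table.items()
        if voltage != baseline_voltage
    }


p_type_reductions = reductions(P_TYPE_LEAKAGE_NA, 1.0)
n_type_reductions = reductions(N_TYPE_LEAKAGE_NA, 0.0)
print(p_type_reductions)
print(n_type_reductions)
```

The computed values agree with the 80.56% (93.89%) and 80.89% (94.05%) figures stated in the text.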

## 2.2. FinFET-based TCAM architecture

The core of a TCAM is a TCAM cell array. As shown in Fig. 2, a typical TG-FinFET-based TCAM cell consists of three major components: (1) an 8T XOR-type CAM cell that not only stores the actual data but also compares the stored data with the search data, (2) the mask cell, which is a 6T SRAM cell used for storing the mask bit to indicate whether this TCAM cell is in the "don't care" state or not, and (3) an evaluation logic implemented with two TG-FinFETs controlled by the mask bit and the CAM cell XOR result. The evaluation logic pulls down the ML when the TCAM cell is mismatched. As evident in Fig. 2, because the conventional TCAM uses the differential SL and differential bit-line schemes for performing the search and read-write operations, each TCAM column contains two SLs (SL, $\overline{SL}$) and two bit-lines (BL, $\overline{BL}$). In the horizontal dimension, in addition to the ML, each TCAM row also contains the data word-line (DWL) and the mask word-line (MWL) to write the data and mask bit individually.

Fig. 1. Schematic structures for (a) N-type TG-FinFET, (b) P-type TG-FinFET, (c) N-type IG-FinFET, and (d) P-type IG-FinFET.

Table 1

| IG-FinFET | Front-gate voltage | Back-gate voltage | Leakage current (nA) | Leakage reduction | Delay (ps) |
|-----------|--------------------|-------------------|----------------------|-------------------|------------|
| P-type    | 1 V                | 1 V               | 9.407                | -                 | 39.38      |
|           |                    | 1.1 V             | 1.829                | 80.56%            | 40.20      |
|           |                    | 1.2 V             | 0.575                | 93.89%            | 41.37      |
| N-type    | 0 V                | 0 V               | 12.429               | -                 | 16.03      |
|           |                    | -0.1 V            | 2.375                | 80.89%            | 17.27      |
|           |                    | -0.2 V            | 0.740                | 94.05%            | 18.53      |

Fig. 2. Structure of TG-FinFET based asymmetric TCAM.

In contrast to the binary CAM cell, which only has "0" and "1" states, a TCAM cell has three states: "0", "1", and "don't care" (or "X"). As shown in Table 2, the TCAM state is determined by the mask data (M) and the TCAM data (D). When M equals 0, representing that the TCAM cell is in the "don't care" state, the evaluation result is a wild match, regardless of the values for TCAM data (D) and search data (S). By contrast, when the mask bit is 1, the evaluation result is dependent on the values of D and S. When D is equal to S, the evaluation result is a normal match, and the ML holds its voltage as VDD. When D is not equal to S, which indicates a mismatch, the ML's signal is pulled down to GND.
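The state table described above can be summarized as a small behavioral model. The following Python sketch is only a logical illustration of Table 2, not a circuit description; the function names `evaluate_cell` and `match_line` are hypothetical.

```python
def evaluate_cell(mask, data, search):
    """Evaluation result of one TCAM cell, following Table 2.

    mask == 0 means the cell is in the "don't care" (X) state.
    """
    if mask == 0:
        return "wild match"
    return "normal match" if data == search else "mismatch"


def match_line(masks, datas, searches):
    """Logical ML level of one entry: any mismatching cell pulls the ML to GND."""
    results = [evaluate_cell(m, d, s) for m, d, s in zip(masks, datas, searches)]
    return "GND" if "mismatch" in results else "VDD"


# A 4-bit entry whose two low-order cells are masked ("don't care").
print(match_line([1, 1, 0, 0], [1, 0, 1, 1], [1, 0, 0, 0]))
print(match_line([1, 1, 0, 0], [1, 0, 1, 1], [0, 0, 0, 0]))
```

In the first call the only differing bits fall on masked cells, so the ML stays at VDD; in the second call a "care" cell differs, so the ML is pulled to GND.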

#### 3. Voltage self-controlled for FinFET-based TCAM

## 3.1. Continuous feature of TCAM mask data

In many applications, especially in the routing table of a router, the "don't care" bits appear contiguously in some segments. Table 3 shows two examples of routing table entries with different prefixes. In Table 3, EX1 shows the IP 140.120.15.0/21, which is divided into two parts. The IP 140.120.15.0 is stored in the CAM cells, and the prefix length of this IP is 21, which is stored in SRAM cells as 255.255.248.0. The number "21" represents 21 "care" bits and 11 "don't care" bits. EX2 shows another example with 24 "care" bits and 8 "don't care" bits, where the mask is 255.255.255.0. As shown in Table 3, the mask data consist of a run of consecutive "1"s followed by a run of consecutive "0"s. Accordingly, the proposed VoSCT partitions the TCAM cells into several segments, and each segment controls its supply voltage and back-gate voltage autonomously. The self-control mechanism is detailed in the following section.

#### Table 2

Traditional TCAM state table.

| Mask data (M) | TCAM data (D) | Search data (S) | TCAM state | Match line | Evaluation result |
|---------------|---------------|-----------------|------------|------------|-------------------|
| 1             | 0             | 0               | care       | VDD        | Normal match      |
| 1             | 1             | 1               | care       | VDD        | Normal match      |
| 1             | 0             | 1               | care       | GND        | Mismatch          |
| 1             | 1             | 0               | care       | GND        | Mismatch          |
| 0             | 0             | 0               | X          | VDD        | Wild match        |
| 0             | 1             | 1               | X          | VDD        | Wild match        |
| 0             | 0             | 1               | X          | VDD        | Wild match        |
| 0             | 1             | 0               | X          | VDD        | Wild match        |

#### Table 3

Two examples of routing table with different prefixes.

| Example | Address             | Binary representation               |
|---------|---------------------|-------------------------------------|
| EX1     | IP: 140.120.15.0/21 | 10001100.01111000.00001111.00000000 |
|         | Mask: 255.255.248.0 | 11111111.11111111.11111000.00000000 |
| EX2     | IP: 140.120.10.0/24 | 10001100.01111000.00001010.00000000 |
|         | Mask: 255.255.255.0 | 11111111.11111111.11111111.00000000 |

Fig. 3. IG-FinFET-based TCAM in the VoSCT design.
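The relationship between prefix length and mask data for EX1 and EX2 in Table 3 can be reproduced with a few lines of Python. This is a minimal sketch; the helper names are hypothetical, and a 32-bit IPv4 entry is assumed.

```python
def prefix_to_mask_bits(prefix_len, width=32):
    """Mask data for a prefix: prefix_len "care" bits followed by "don't care" bits."""
    if not 0 <= prefix_len <= width:
        raise ValueError("prefix length out of range")
    return "1" * prefix_len + "0" * (width - prefix_len)


def dotted_binary(bits):
    """Group a bit string into octets, as printed in Table 3."""
    return ".".join(bits[i:i + 8] for i in range(0, len(bits), 8))


def dotted_decimal(bits):
    """Conventional dotted-decimal form of the same mask."""
    return ".".join(str(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))


for prefix_len in (21, 24):  # EX1 and EX2
    mask = prefix_to_mask_bits(prefix_len)
    print(prefix_len, dotted_decimal(mask), dotted_binary(mask),
          "care:", mask.count("1"), "don't care:", mask.count("0"))
```

Because the mask is always a run of "1"s followed by a run of "0"s, at most one octet (or, more generally, one segment) can contain both values.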

### 3.2. IG-FinFET TCAM

To minimize the leakage power consumption, the aforementioned IG-FinFET is used to design the VoSCT, and one TCAM cell of the VoSCT is shown in Fig. 3. In the VoSCT design, the back gates of p-type IG-FinFETs are controlled by  $V_{PB}$ , which is set to VDD, and the back gates of n-type IG-FinFETs are controlled by  $V_{NB1}$ , which is set to -0.1 V, except for N5, N6, N7, and N8 of the TCAM cell. These four transistors (N5, N6, N7, and N8), which affect the search performance of the TCAM, are biased by  $V_{NB2}$ . The search performance can be enhanced when  $V_{NB2}$  is set to 0 V, and the leakage power can be reduced when  $V_{NB2}$  is set to -0.1 V.

Fig. 4. One segment of the VoSCT design.

#### Table 4

Three power modes of TCAM segments.

| Power mode      | Mask bits    | Meaning    | V <sub>NB2</sub> | VDD for CAM cell |
|-----------------|--------------|------------|------------------|------------------|
| High-Speed      | $1 \cdots 1$ | Care       | 0 V              | ON               |
| Low-Power       | $1 \cdots 0$ | Boundary   | -0.1 V           | ON               |
| Ultra-Low-Power | $0 \cdots 0$ | Don't care | -0.1 V           | OFF              |

To maximize the leakage power reduction and minimize the search performance penalty, one TCAM entry is partitioned into several segments. As seen in Fig. 4, for an entry consisting of *n* TCAM cells, the VoSCT partitions it into *m* segments, and each segment has k (=n/m) TCAM cells. For example, when a 32-bit data entry is partitioned into four segments, each segment has eight TCAM cells. Because *m* segments exist in one TCAM entry, and each segment operates in one of three power modes, the leakage power can be reduced dramatically.
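The equal-length partition with k = n/m can be illustrated as follows. This is only a sketch of the bookkeeping; `partition_entry` is a hypothetical helper, and the mask is written as a bit string with the MSB first.

```python
def partition_entry(mask_bits, m):
    """Split an n-bit mask into m equal-length segments of k = n / m bits each."""
    n = len(mask_bits)
    if m <= 0 or n % m != 0:
        raise ValueError("the entry width must be a multiple of the segment count")
    k = n // m
    return [mask_bits[i:i + k] for i in range(0, n, k)]


# Mask of a /21 prefix (21 "care" bits, 11 "don't care" bits), four segments.
mask = "1" * 21 + "0" * 11
segments = partition_entry(mask, 4)
print(segments)
```

For this mask, the first two segments contain only "1"s, the third contains both values, and the last contains only "0"s, which correspond to the three segment types distinguished in the VoSCT.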

### 3.2.1. Segment power modes

As shown in Table 4, based on the mask data, one of three power modes can be adopted to control the segment's power consumption. The high-speed mode is provided for the segment in which all the mask data are "1"s, and the low-power mode is used for the boundary segment in which some mask data are "1"s and others are "0"s. The third power mode is the ultra-low-power mode, which is utilized for those segments in which the mask data are all "0"s. According to the pattern of mask data, our voltage self-controlled method controls the IG-FinFETs' back-gate voltages and supply voltages for the corresponding segments. Fig. 5 shows a TCAM segment with the voltage self-controlled technique. As evident in Fig. 5,  $V_{NB1}$  is set to -0.1 V, and  $V_{PB}$  is set to VDD to achieve low leakage.  $V_{NB2}$  is controlled by the mask data of the segment's least significant bit (LSB), and the supply voltage of CAM cells is controlled by the mask data of the segment's most significant bit (MSB). The detailed descriptions of the three power modes and their corresponding voltage settings are as follows.

- (1) High-Speed Mode: In one segment, when all mask data store "1"s and all TCAM cells remain in a "care" state, all the TCAM data should be compared with the search data. To enhance the comparison performance,  $V_{NB2}$  is set to 0 V. As seen in Fig. 5, all mask data are "1"s, namely signals  $M_1=M_2=...=M_n=1$ . As a result, the complement of the MSB,  $\overline{M_n}$ , is "0", and the supply voltages of CAM cells are set to VDD to maintain the CAM data. Moreover, the LSB  $M_1$  is also "1", so transistor N9 is turned on and N10 is turned off, and 0 V is provided for those IG-FinFETs with  $V_{NB2}$  as their back-gate voltages. In this situation, transistors N5 to N8 (of each TCAM cell) can quickly evaluate TCAM data and determine whether to pull down the ML.
- (2) Low-Power Mode: The mask data in one segment include both "1"s and "0"s. Thus, some TCAM cells remain in the "care" state and others in an "X" state, and such a segment is called a boundary segment. A TCAM entry features at most one boundary segment due to the continuous feature of mask data. In the boundary segment, the MSB of the mask data is "1", and the LSB is "0". In Fig. 5,  $M_n$  is "1" and  $\overline{M_n}$  is "0", and the supply voltage of CAM cells is still set to VDD to maintain the CAM data. However, the mask data  $M_1$  is "0", and consequently transistor N9 is turned off and N10 is turned on to provide -0.1 V to  $V_{NB2}$  for transistors N5, N6, N7, and N8 in each TCAM cell. In the low-power mode, the leakage power is reduced by using the lower  $V_{NB2}$ , but the segment's function is maintained.





- (3) Ultra-Low-Power Mode: When all mask data store "0"s and all TCAM cells remain in "X" states, the CAM cells' comparisons cannot affect the final result. In this case, the data stored in CAM cells can be ignored and the corresponding segment is set to the ultra-low-power mode. Similar to the boundary segment, mask data  $M_1$  is "0", and consequently transistor N9 is turned off and N10 is turned on to provide -0.1 V for  $V_{NB2}$  of transistors N5, N6, N7, and N8 in each TCAM cell. Unlike in the low-power mode, as seen in Fig. 5, when all mask data are "0"s, namely  $M_1=M_2=...=M_n=0$ ,  $\overline{M_n}$  is "1", and the supply voltages of CAM cells are turned off (i.e., set to GND) to further reduce leakage power consumption. When the supply voltage of CAM cells is turned off, the data (D and $\overline{D}$) stored in the CAM cells are lowered almost to 0, which reduces the voltage difference between the data and the power supply.
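The three modes above depend only on the MSB and LSB mask bits of a segment. The following Python sketch models that decision at the logical level, using the voltage settings of Table 4. It is not a circuit model; the names `SegmentBias` and `segment_bias` are hypothetical, and the mask is assumed to be a run of "1"s followed by a run of "0"s, written MSB first.

```python
from typing import NamedTuple


class SegmentBias(NamedTuple):
    mode: str
    v_nb2: float      # back-gate voltage of N5 to N8, in volts (Table 4)
    cam_vdd_on: bool  # whether the CAM cells' supply voltage is kept on


def segment_bias(mask_bits):
    """Select the power mode of one segment from its MSB and LSB mask bits."""
    if not mask_bits or set(mask_bits) - {"0", "1"}:
        raise ValueError("mask must be a non-empty string of 0s and 1s")
    if "01" in mask_bits:
        raise ValueError("mask bits are assumed to be 1s followed by 0s")
    msb, lsb = mask_bits[0], mask_bits[-1]
    v_nb2 = 0.0 if lsb == "1" else -0.1  # LSB = 1 selects 0 V, LSB = 0 selects -0.1 V
    cam_vdd_on = msb == "1"              # MSB = 0 gates the CAM cells' supply
    if lsb == "1":
        mode = "High-Speed"
    elif msb == "1":
        mode = "Low-Power"
    else:
        mode = "Ultra-Low-Power"
    return SegmentBias(mode, v_nb2, cam_vdd_on)


for segment in ("11111111", "11111000", "00000000"):
    print(segment, segment_bias(segment))
```

Because the mask is contiguous, checking only the two end bits is sufficient: an LSB of "1" implies that every bit is "1", and an MSB of "0" implies that every bit is "0".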

## 4. Simulation results

In our simulation, the Predictive Technology Model (PTM) 32-nm FinFET technology [10] is utilized to implement the traditional TCAM and the VoSCT design. Use of the PTM, which bridges the process and material development and circuit simulation through compact device modeling, is essential for assessing the potential and limits of new technologies and supporting early design prototyping. For the traditional TCAM,  $V_{NB} = 0.0$  V and  $V_{PB} = 1.0$  V, and for the VoSCT,  $V_{NB1} = -0.1$  V and  $V_{PB} = VDD$ . The value of  $V_{NB2}$  depends on the power mode of the corresponding segment, which is listed in Table 4. To achieve voltage self-control, in the VoSCT, a 32-bit TCAM entry is partitioned into *m* segments, where *m* = 2, 4, and 8, denoted as VoSCT\_2, VoSCT\_4, and VoSCT\_8.

#### 4.1. Mask data-writing behavior

To verify the correctness of data-writing behavior, Fig. 6 shows the HSPICE simulation result of a VoSCT\_8 segment. As illustrated in Fig. 6,  $M_4$  to  $M_1$  are the values of four mask cells, and Da1 and Da2 are two bit-lines that control the data writing of  $(M_4, M_3)$  and  $(M_2, M_1)$, respectively. The initial value of  $M_4M_3M_2M_1$  is set to 0000, and Da1 and Da2 are both set to 1. When the clock reaches the first positive edge, the data are successfully written into the mask cells, and the value of  $M_4M_3M_2M_1$  is 1111. The values of Da1 and Da2 are then set to 00, 10, and 11 in turn; the value of mask  $M_4M_3M_2M_1$  then changes to 0000, 1100, and 1111, correspondingly. It follows, as illustrated in Fig. 6, that the writing behavior is correct.
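The expected mask values in this test sequence can be written down as a simple behavioral model. The sketch below only restates the mapping described in the text (Da1 drives $M_4$ and $M_3$, Da2 drives $M_2$ and $M_1$); it does not model the HSPICE waveforms, and `write_masks` is a hypothetical helper.

```python
def write_masks(da1, da2):
    """Mask value M4 M3 M2 M1 after a write: Da1 sets (M4, M3), Da2 sets (M2, M1)."""
    return f"{da1}{da1}{da2}{da2}"


# (Da1, Da2) values applied on successive writes, as described for Fig. 6.
write_sequence = [(1, 1), (0, 0), (1, 0), (1, 1)]
written = [write_masks(da1, da2) for da1, da2 in write_sequence]
print(written)
```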

### 4.2. Power consumption

Table 5 lists the leakage power, total power, energy, search delay, and transistor count of the traditional TCAM and the proposed VoSCT. In Table 5, the traditional TCAM has the shortest search delay but the largest total power consumption and energy consumption. Compared with the traditional design, VoSCT\_8 reduces leakage power by 36.58% and total power by 19.94%. However, it also increases search delay by 7.58% and uses 32 extra transistors. Similarly, VoSCT\_4 and VoSCT\_2 reduce total power by 20.39% and 21.59%, with increases of 8.63% and 10.25%, respectively, in search delay. The VoSCT designs reduce energy consumption by approximately 11% compared with the traditional design.

Fig. 6. Simulation result of mask data writing.

#### Table 5

Power, energy, delay, and transistor count comparison between traditional TCAM and the VoSCT. A negative reduction denotes an increase.

|             | Leakage power (µW) | Reduction | Total power (µW) | Reduction | Search delay (ps) | Reduction | Energy (fJ) | Reduction | Transistor count | Reduction |
|-------------|--------------------|-----------|------------------|-----------|-------------------|-----------|-------------|-----------|------------------|-----------|
| Traditional | 5.186              | -         | 10.662           | -         | 39.01             | -         | 4.160       | -         | 516              | -         |
| VoSCT_8     | 3.289              | 36.58%    | 8.536            | 19.94%    | 41.97             | -7.58%    | 3.672       | 11.73%    | 548              | -6.20%    |
| VoSCT_4     | 3.159              | 39.09%    | 8.488            | 20.39%    | 42.38             | -8.63%    | 3.687       | 11.37%    | 532              | -3.10%    |
| VoSCT_2     | 3.008              | 42.00%    | 8.360            | 21.59%    | 43.01             | -10.25%   | 3.698       | 11.11%    | 524              | -1.55%    |

Fig. 7. Leakage power consumption with various numbers of "don't care" bits.

Fig. 8. Leakage and total power comparison with real IPv4 routing table.
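The reduction columns of Table 5 are relative to the traditional design. As a quick check, the Python sketch below recomputes the leakage power, total power, and energy reductions from the absolute values in Table 5; the dictionary layout and names are illustrative only.

```python
# Absolute values copied from Table 5.
TABLE5 = {
    "Traditional": {"leakage_uW": 5.186, "total_uW": 10.662, "energy_fJ": 4.160},
    "VoSCT_8": {"leakage_uW": 3.289, "total_uW": 8.536, "energy_fJ": 3.672},
    "VoSCT_4": {"leakage_uW": 3.159, "total_uW": 8.488, "energy_fJ": 3.687},
    "VoSCT_2": {"leakage_uW": 3.008, "total_uW": 8.360, "energy_fJ": 3.698},
}


def reduction_percent(baseline, value):
    """Reduction relative to the baseline; a negative result would be an increase."""
    return (baseline - value) / baseline * 100.0


baseline = TABLE5["Traditional"]
table5_reductions = {
    design: {
        metric: round(reduction_percent(baseline[metric], value), 2)
        for metric, value in metrics.items()
    }
    for design, metrics in TABLE5.items()
    if design != "Traditional"
}

for design, metrics in table5_reductions.items():
    print(design, metrics)
```

The recomputed percentages for these three metrics match the values printed in Table 5.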

To further investigate the relationship between leakage power consumption and the number of "don't care" ("X") bits, a simulation was performed, and the result is presented in Fig. 7. In Fig. 7, when the number of "don't care" bits is 0, all TCAM cells of the entry remain in the "care" state, namely, mask bits are "1"s; when the number of "don't care" bits is 24, all TCAM cells remain in the "X" state. As seen in Fig. 7, the more TCAM cells remain in the "X" state, the less leakage power the TCAM entry dissipates. The leakage power curves of the three VoSCT designs have a staircase shape. Taking VoSCT\_4 as an example, one 32-bit TCAM entry is partitioned into four segments (VoSCT\_4\_3, VoSCT\_4\_2, VoSCT\_4\_1, and VoSCT\_4\_0), and each segment has eight TCAM cells. When the number of "don't care" bits equals 1, the mask data of the LSB in VoSCT\_4\_0 is 0, and  $V_{NB2}$  of VoSCT\_4\_0 is consequently set to -0.1 V; specifically, VoSCT\_4\_0 is in the low-power mode, and thus the leakage power can be reduced. The same holds for X = 1 to X = 7. When X = 8, the mask data of the MSB in VoSCT\_4\_0 is 0, and the supply voltages for all CAM cells in VoSCT\_4\_0 are consequently turned off; VoSCT\_4\_0 is in the ultra-low-power mode, and the leakage power can be further reduced. When X = 9, VoSCT\_4\_1 enters the low-power mode, and the leakage power is reduced further.
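The staircase behavior follows from how the segment modes change as the number of "don't care" bits X grows. The sketch below lists the mode of each VoSCT\_4 segment for a given X; it is a logical model only (no power values are computed), `segment_modes` is a hypothetical helper, and the "don't care" bits are assumed to occupy the X least significant positions of the entry.

```python
def segment_modes(num_dont_care, n=32, m=4):
    """Modes of segments 0..m-1, where segment 0 holds the least significant bits.

    HS = high-speed, LP = low-power, ULP = ultra-low-power.
    """
    if not 0 <= num_dont_care <= n or n % m != 0:
        raise ValueError("invalid number of don't care bits or segment count")
    k = n // m
    modes = []
    for i in range(m):
        # Number of "don't care" bits that fall inside segment i.
        x_in_segment = min(max(num_dont_care - i * k, 0), k)
        if x_in_segment == 0:
            modes.append("HS")
        elif x_in_segment < k:
            modes.append("LP")
        else:
            modes.append("ULP")
    return modes


for x in (0, 1, 7, 8, 9, 16):
    print(x, segment_modes(x))
```

Under these assumptions, segment 0 is in the low-power mode for X = 1 to 7, switches to the ultra-low-power mode at X = 8, and segment 1 enters the low-power mode at X = 9, which matches the steps described for VoSCT\_4\_0 and VoSCT\_4\_1.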

## 4.3. Power consumption of real routing tables

Three real IPv4 routing tables (AFRINIC, ARIN, and APNIC) obtained from the BGP Routing Table Analysis Report [11] are used to demonstrate the feasibility of the VoSCT design. Each routing table has 128 entries and 32 bits per entry. Fig. 8 shows the leakage and total power comparison of the three benchmarks. As Fig. 8 indicates, the VoSCT\_4 consumes less leakage power and total power than the traditional TCAM does. On average, the VoSCT\_4 design reduces leakage power by 38% and total power by 21%.

## 5. Discussion

## 5.1. Related studies

In the past decades, many studies have proposed approaches for controlling the power consumption of CAM and TCAM. Zackriya and Kittur [12] designed a precharge-free CAM to overcome the drawbacks of charge sharing in a NAND-type CAM and the short-circuit current in a NOR-type CAM. The CAM proposed in [12] utilized ripple precharge to reduce the dynamic power consumption of MLs. In contrast to the precharge-free CAM, the same authors proposed a precharge-controlled CAM in [13]. A precharge controller was utilized to manage the MLs' voltage swing by predicting the match-mismatch state of MLs. During the precharging phase, once the precharge controller predicted a mismatch on the ML, it suspended the precharging operation to reduce power consumption. Ahn and Kwon [14] demonstrated a local-NOR global-NAND ML architecture to enhance search performance and reduce dynamic power consumption. By segmenting the ML into many short local MLs and evaluating them in parallel, the capacitance of each local ML was much smaller than that of a long ML. Current limiting and clamping schemes were also used to reduce the voltage swing of local MLs.

For low-power TCAM design, the authors of [6,15–17] investigated many effective schemes. Jeloka et al. [15] utilized several push-rule 6T SRAM cells to construct a TCAM. Unlike in the traditional TCAM, the TCAM structure in [15] applied the search data on word-lines and read the search results from bit-lines. The chip area and overall capacitance could thereby be reduced. Chen et al. [16] utilized a dynamic reconfigurable TCAM on OpenFlow compliant packet processing to support long and variable-length flow tables. The hybrid-pipelined TCAM and self-power gating technique controlled by the mask data were used to optimize both NAND and NOR core cells and to reduce their power consumption. The features of continuous TCAM mask patterns [2] and the two-sided self-powered gating technique [6] were also utilized to reduce leakage power dissipation of TCAM cells. Mishra and Sahni [17] investigated several TCAM architectures and combined some of them to generate an optimal prefix set for the given routing table to reduce the TCAM's power consumption. The method in [17] was suitable for static (i.e., known) routing tables, for which the optimal prefix set could be calculated. However, when the routing table changed, the optimal prefix set also required recalculation.

The MLs of a TCAM are often utilized to control the TCAM's power consumption. Tsai et al. [4] utilized various types of CAMs to lower the voltage swing on MLs. Zhang et al. [18] presented an OR-type cascaded ML scheme to partition an ML into several segments and to control the ML's charge behavior for each segment to reduce the dynamic power. However, in this scheme, the search delay and the functionality required careful consideration. Chen et al. [19] partitioned a long-word TCAM into two stages and provided different voltages for these two stages. As in [18], the ML of the second stage would be charged only when the search data matched the data stored in the first stage. Thus, the dynamic power of the second stage could be reduced, and the leakage power could also be controlled using dual voltages. Choi et al. [20] proposed an adaptive ML discharging scheme for a low-power and high-speed TCAM by employing the gated ML pulldown path and ML boosting scheme. According to the number of mismatches and the ML discharging speed, the ML discharging was adaptively controlled.

The ML and SL are both used for power saving. Chang et al. [3] combined various types of CAMs to reduce MLs' and SLs' power consumption. Because SL initialization was not required to disconnect the ML from the ground node, when the pulldown paths were disconnected from the ML, SL toggling could be effectively reduced. Agarwal et al. [5] used XOR CAMs to improve SL performance and reduce SL power consumption. Arsovski et al. [21] adopted the precharge low ML with a current-saving scheme to minimize unnecessary SL toggling.

TCAM partition is another low-power scheme. Chang et al. [22] introduced a two-sided self-gating (TSSG) scheme to reduce the leakage power consumption of TCAM. This scheme divided one entry into several segments, and when the mask data of one segment were the same, the TSSG-based TCAM severed unnecessary charge and discharge paths in SRAM cells. For the unevenly segmented ML scheme, Chang and He [23] proposed four TCAM architectures (All-SG, All-IG, Hybrid-1, and Hybrid-2) and controlled the MLs' charge–discharge frequency to lower the dynamic power consumption of MLs.

The FinFET-based TCAM was proposed in recent years. Tawfik and Kursun [7] proposed an IG-FinFET-based 6T SRAM to enhance data stability and reduce standby power. Chang et al. [8] and Arulvani et al. [24] also used a FinFET TCAM to explore a minimum energy-delay product. In contrast with other studies' findings, the VoSCT not only controls the MLs' power consumption but also gates TCAM cells' supply voltages so that leakage power and dynamic power can be further reduced.

#### 5.2. Experimental results discussion

In this paper, a voltage self-controlled TCAM architecture was developed to reduce TCAM cells' leakage power consumption using VDD gating and IG-FinFET back-gate voltage adjustment. The simulation results demonstrated that the leakage power can be effectively reduced using the VoSCT architecture. The voltage self-controlled technique is suitable for controlling the TCAM cells' power consumption because the TCAM cell structure is used for memorizing the data and consumes considerable power. The outcome builds on the work of Chang et al. [6], confirming that using the voltage self-controlled technique for TCAM provides effective leakage power saving. In addition, the improvements identified in this study affected search performance and transistor count overhead only to a small degree. This study therefore indicates that the benefit gained from the voltage self-controlled technique may address leakage power saving requirements. The main contributions of this paper are as follows:

- 1) Segmenting a TCAM entry and controlling the supply voltage as well as the back-gate voltage of an IG-FinFET by the segment itself;
- 2) Combining supply voltage gating with the FinFET back-gate voltage-adjusting technique on TCAM cells to reduce TCAM leakage power; and
- 3) Providing three operation modes for one segment.

Most notably, this is the first study, to our knowledge, that investigated the combination of TCAM cell supply voltage gating with the FinFET back-gate voltage-adjusting technique applied to TCAM. Only four extra transistors are required in one segment to generate the controlling signals. The results provide compelling evidence for leakage power reduction on an IPv4 forwarding-table TCAM and suggest that the technique may also be effective for IPv6. According to the statistics from other studies, the prefix length of most IPv4 routes falls in the range of 16 to 24. Thus, the proposed VoSCT partitions the masks using the common cases, namely, two segments, four segments, and eight segments. Although the equal-length segment partition has the advantage of easy implementation, it is inflexible. The unbalanced segment partition should therefore be examined in follow-up work to evaluate whether it can save additional leakage power.
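The overhead of four extra transistors per segment can be cross-checked against the transistor counts in Table 5. The following sketch assumes the baseline of 516 transistors listed for the traditional 32-bit entry; the constant and function names are illustrative only.

```python
BASELINE_TRANSISTORS = 516       # traditional 32-bit TCAM entry (Table 5)
EXTRA_TRANSISTORS_PER_SEGMENT = 4


def vosct_transistor_count(num_segments):
    """Transistor count of one VoSCT entry with the given number of segments."""
    return BASELINE_TRANSISTORS + EXTRA_TRANSISTORS_PER_SEGMENT * num_segments


def overhead_percent(num_segments):
    """Transistor overhead relative to the traditional entry."""
    extra = vosct_transistor_count(num_segments) - BASELINE_TRANSISTORS
    return round(extra / BASELINE_TRANSISTORS * 100.0, 2)


for m in (8, 4, 2):
    print(f"VoSCT_{m}", vosct_transistor_count(m), overhead_percent(m))
```

The resulting counts (548, 532, and 524) and overheads (6.20%, 3.10%, and 1.55%) agree with Table 5, so the overhead grows linearly with the number of segments.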

## 6. Conclusion

In this paper, the voltage self-controlled technique was applied to the TCAM design to reduce the leakage power consumption. In the VoSCT, a TCAM entry is partitioned into several segments, and each segment operates in one of three power modes, namely, a high-speed mode, a low-power mode, and an ultra-low-power mode; consequently, the back-gate voltages of IG-FinFETs as well as the supply voltages can be self-controlled according to the status of the MSB and LSB mask cells of each segment. Based on the PTM 32-nm FinFET technology, the simulation results on real routing table data show that leakage power and total power consumption can be reduced by 38% and 21% on average, with a 9% search delay increase and 4% area overhead. The proposed VoSCT is suitable for those routing tables embedded in routers and gateways, especially for mobile communication environments.

## **Declaration of Competing Interest**

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

## CRediT authorship contribution statement

**Yen-Jen Chang:** Conceptualization, Methodology, Supervision, Project administration. **Kun-Lin Tsai:** Visualization, Writing - original draft, Writing - review & editing, Data curation. **Yu-Cheng Cheng:** Software, Investigation, Validation. **Meng-Rong Lu:** Software, Investigation.

### References

- [1] Baeg S. Low-power ternary content-addressable memory design using a segmented match line. IEEE Trans Circuits Syst Regul Pap 2008;55(6):1485–94 July.
- [2] Cheng YC, Chen JH, Wu TC, Chang YJ. Low leakage mask vertical control TCAM for network router. In: Proc. IEEE Asia Pacific conference on circuits and systems (APCCAS), Oct; 2016. p. 469–72.
- [3] Chang YJ, Liao YH. Hybrid-type CAM design for both power and performance efficiency. IEEE Trans Very Large Scale Integr VLSI Syst 2008;16(8):965–74 Aug.
- [4] Tsai KL, Chang YJ, Cheng YC. Automatic charge balancing content addressable memory with self-control mechanism. IEEE Trans Circuits Syst Regul Pap 2014;61(10):2834–41 Oct.
- [5] Agarwal A, Hsu S, Mathew S, Anders M, Kaul H, Sheikh F, Krishnamurthy R. A 128×128b high-speed wide-AND match-line content addressable memory in 32nm CMOS. In: Proc. IEEE European solid-state circuits conference (ESSCIRC), Sep.; 2011. p. 83–6.
- [6] Chang YJ, Tsai KL, Tsai HJ. Low leakage TCAM for IP lookup using two-side self-gating. IEEE Trans Circuits Syst Regul Pap 2013;60(6):1478–86 June.
- [7] Tawfik SA, Kursun V. Low power and stable FinFET SRAM with static independent gate bias for enhanced integration density. In: Proc. IEEE international conference on electronics, circuits and systems (ICECS), Dec.; 2007. p. 443–6.
- [8] Chang MC, He KL, Wang YC. Design of asymmetric TCAM (ternary content-addressable memory) cells using FinFET. In: Proc. IEEE 3rd global conference on consumer electronics (GCCE), Oct.; 2014. p. 358–9.
- [9] Semiconductor Industry Association, International Technology Roadmap for Semiconductors (ITRS), 2015 [Online]. Available: http://www.itrs2.net/.
- [10] Cao Y. What is Predictive Technology Model (PTM)? ACM SIGDA Newslett 2009;39(3) March.
- [11] [Online] BGP Routing Table Analysis Report. Available: http://bgp.potaroo.net.
- [12] Zackriya V M, Kittur HM. Precharge-free, low-power content-addressable memory. IEEE Trans Very Large Scale Integr VLSI Syst 2016;24(8):2614–21 Aug.
- [13] Zackriya V M, Kittur HM. Content addressable memory—early predict and terminate precharge of match-line. IEEE Trans Very Large Scale Integr VLSI Syst 2017;25(1):385–7 Jan.
- [14] Ahn SG, Kwon KW. Local NOR and global NAND match-line architecture for high performance CAM. In: Proc. of IEEE international midwest symposium on circuits and systems (MWSCAS), Aug.; 2017. p. 707–10.
- [15] Jeloka S, Akesh NB, Sylvester D, Blaauw D. A 28 nm configurable memory (TCAM/BCAM/SRAM) using push-rule 6T bit cell enabling logic-in-memory. IEEE J Solid-State Circuits 2016;51(4):1009–21 April.
- [16] Chen TS, Lee DY, Liu TT, Wu AY. Dynamic reconfigurable ternary content addressable memory for OpenFlow-compliant low-power packet processing. IEEE Trans Circ Syst-I 2016;63(10):1661–72 Oct.
- [17] Mishra T, Sahni S. PETCAM-A power Efficient TCAM for forwarding tables. In: Proc. of IEEE symposium on computers and communications (ISCC); 2009. 5-8 July.
- [18] Zhang J, Zheng S, Teng F, Ding Q, Chen X. An OR-type cascaded match line scheme for high-performance and EDP-efficient ternary content addressable memory. In: Proc. of IEEE nordic circuits and systems conference (NORCAS); 2016. p. 1–6. Nov.
- [19] Chen TS, Lee DY, Liu TT, Wu AY. Filter-based dual-voltage architecture for low-power long-word TCAM design. In: Proc. of international conference on intelligent green building and smart grid; 2016. p. 1–5. June.
- [20] Choi W, Lee K, Park J. Low cost ternary content addressable memory using adaptive matchline discharging scheme. In: Proc. of IEEE international symposium on circuits and systems (ISCAS); 2018. p. 1–4. May.
- [21] Arsovski I, Patil A, Houle RM, Fragano MT, Rodriguez R, Kim R, Butler V. 1.4Gsearch/s 2-Mb/mm2 TCAM using two-phase-pre-charge ML sensing and power-grid pre-conditioning to reduce Ldi/dt power-supply noise by 50%. IEEE J Solid-State Circ 2018;53(1):155–63 Jan.
- [22] Chang YJ, Tsai KL, Tsai HJ. Low leakage TCAM for IP lookup using two-side self-gating. IEEE Trans Circuits Syst Regul Pap 2013;60(6):1478–86 June.
- [23] Chang MC, He KL. Design of low-power FinFET-Based TCAMs with unevenly-segmented matchlines for routing table applications. In: Proc. of IEEE international conference on ASIC (ASICON); 2015. p. 1–4. Nov.
- [24] Arulvani M, Ismail MM. Low power FinFET content addressable memory design for 5G communication networks. Comput Electr Eng 2018;72:606–13 Nov.

Yen-Jen Chang received the Ph.D. degree in computer science and information engineering from National Taiwan University in 2003. In 2004, he joined the faculty of the Department of Computer Science and Engineering, National Chung Hsing University, Taiwan, where he is currently a Professor. His research interests include computer and microprocessor architecture, digital integrated circuit design, low-power memory design, and system-on-chip design.

Kun-Lin Tsai received the Ph.D. degree in electrical engineering from National Taiwan University in 2006. He was a postdoctoral researcher at National Taiwan University of Science and Technology in 2007. He is currently an Associate Professor in the Department of Electrical Engineering, Tunghai University. His research interests include low-power system design, information security systems, and VLSI design.

Yu-Cheng Cheng received the M.S. and Ph.D. degrees in computer science and engineering from National Chung Hsing University, Taichung, Taiwan, in 2013 and 2018, respectively. His research interests include computer and microprocessor architecture, digital integrated circuit design, and low-power memory design.

Meng-Rong Lu received the M.S. degree in computer science and engineering from National Chung Hsing University, Taichung, Taiwan, in 2014. She is currently an engineer in the Department of Research and Design at Neousys Technology. Her research interests include FinFET techniques and low-power memory design.