Delay-Locked Loop Using 4 Cell Delay Line with Extended Inverters

Jefferson A. Hora, Vincent Alan Heramiz, and Pleiades Faith Longakit
Microelectronics Lab, EECE Department, MSU-Iligan Institute of Technology, Iligan City, Philippines
Email: jefferson.hora@g.msuiit.edu.ph

Abstract—A delay-locked loop (DLL) circuit that uses a 4-cell delay line with extended inverters is proposed, designed and simulated in a 180 nm CMOS process technology. This design can be applied to microprocessor, memory, and communication IC applications in which timing relationships (delays) are essential. Its voltage-controlled delay line is improved by adding extended inverters so as to achieve a 50% duty cycle in the DLL output, which is usually limited by jitter and noise in the DLL circuit. The design shows a 50-50.3% duty cycle with a 0.6% duty cycle error in its output, and its jitter is 5.63 ps at 1 GHz. The circuit operates within a frequency range of 520 MHz to 1 GHz and achieves a locking time of 200 ns at 1 GHz operation. The DLL's total chip core area is 0.09703 mm².

Index Terms—delay-locked loop, dynamic phase detector, charge pump, voltage-controlled delay line, duty cycle, jitter.

I. INTRODUCTION

As the speed performance of VLSI systems increases rapidly, clock synchronization between the subsystems is getting more challenging [1] and more emphasis is placed on suppressing skew and jitter in the clocks [2]. These high-speed synchronous systems require tightly controlled clock timing allowances for high performance operation. Efficient performance is essential in communication, network and multimedia applications.

Delay-locked loops (DLLs) and phase-locked loops (PLLs) are synchronous circuits routinely employed in microprocessors, memory interfaces and communication ICs to hide clock distribution delays, to cancel on-chip clock amplification and buffering delays, and to improve overall system timing [2][3][4]. Generally, these circuits are used for clock multiplication and signal synchronization [5].

In applications where no clock synthesis is required, DLLs offer an attractive alternative to PLLs due to their better jitter and timing margin performance, inherent stability, and simpler design [4][6]. Additionally, since DLLs do not use a voltage-controlled oscillator (VCO), phase errors induced by supply or substrate noise do not accumulate over many clock cycles. This improved noise immunity is the main reason for the increased adoption of DLLs [4]. The DLL has better stability than the PLL because the DLL uses a first-order loop filter. Furthermore, the jitter of the DLL is smaller than that of the PLL because the DLL has less jitter accumulation. Even though the PLL has been used predominantly, the DLL is now getting more attention in applications such as data communication links and memory interfaces [1].

This commonly used synchronous circuit is used to align the outgoing data with an external clock signal for clock synchronization [7]. A typical DLL involves several design considerations. First, DLLs suffer from the problem of their limited delay range since DLLs adjust only the phase and not the frequency [4]. Second, the output of the DLL also depends greatly on the input to the delay line. Third, the basic DLL cannot generate new frequencies different from that of the delay-line input [8].

The locking time and the jitter performance are always a concern in the design of a DLL. The DLL uses a voltage-controlled delay line (VCDL) rather than a VCO, since the noise in the VCDL does not accumulate over many clock cycles; hence, it is preferred in many cases. It also offers a faster locking time, which allows a system to reduce the wait time required before it can operate [9].

Since many present and future synchronous systems will depend heavily on high-speed operation and efficient performance, a fast locking time with low jitter accumulation over a range of frequencies is needed to allow proper data synchronization at high speeds. There are several techniques for achieving a fast-lock delay-locked loop. This study proposes a new design approach that involves designing a high-frequency, fast-lock delay-locked loop to meet the demands of microprocessor applications, using a 4-cell voltage-controlled delay line with extended inverters to keep the DLL output at a 50% duty cycle without the need for duty cycle correction, designing high-speed architectures to attain a faster locking time, and adding a low-pass filter to the overall design to reduce jitter accumulation.

II. SYSTEM DESIGN AND ARCHITECTURE

A. DLL Overall System

The basic building blocks of the delay-locked loop are the phase detector, charge pump, loop filter and voltage-controlled delay line. Fig. 1 illustrates these four main functional blocks. The phase detector compares the phase of the reference input and the delay-line output.

©2014 Engineering and Technology Publishing
doi: 10.12720/ijeee.2.4.298-302

![Figure 1. DLL overall system.](image1)

The comparison yields a signal proportional to the phase error. The phase detector generates synchronized "up" or "down" signals for the charge pump, which converts this digital output into an analog signal. The charge pump consists of two switched current sources and either sources or sinks current according to the UP and DN signals. This current is converted into a control voltage by the loop filter to feed the voltage-controlled delay line. The delay of each cell in the VCDL depends on the control voltage, and each stage provides a delayed version of the input signal. The loop acts as a feedback system, compensating for any phase difference between out_clk and ref_clk. Therefore, once the loop is in the locked state, the two signals have exactly the same frequency and are aligned in phase. Extended inverters were proposed for each delay cell because the required delay could not be achieved with a single inverter, so two inverters were added. These inverters were also used to maintain the 50% duty cycle as closely as possible without the need for duty cycle correction.
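The loop behaviour described above can be illustrated with a minimal behavioural sketch in Python. It is not the authors' simulation setup: the charge pump current `i_cp`, loop capacitance `c_lf`, VCDL gain `k_vcdl`, and the starting values are hypothetical numbers chosen only so that the loop settles, and the polarity (a higher control voltage gives more delay) is assumed from the description of the delay cell in Section II-D.

```python
def simulate_dll(t_ref=1e-9, n_cells=4, d0=0.2e-9, k_vcdl=0.2e-9,
                 v0=1.0, i_cp=100e-6, c_lf=2e-12, n_cycles=200):
    """Iterate a first-order DLL model once per reference cycle.

    All default values are illustrative assumptions, not values from the paper.
    Returns a list of (control_voltage, timing_error) pairs, one per cycle.
    """
    v_ctrl = v0
    history = []
    for _ in range(n_cycles):
        # VCDL: per-cell delay modelled as linear in the control voltage.
        cell_delay = d0 + k_vcdl * (v_ctrl - v0)
        # Phase detector: difference between total line delay and one period.
        error = n_cells * cell_delay - t_ref
        history.append((v_ctrl, error))
        # Charge pump + loop capacitor: current i_cp flows for |error| seconds.
        # Too much delay (error > 0) lowers the voltage; too little raises it.
        v_ctrl -= i_cp * error / c_lf
    return history


if __name__ == "__main__":
    hist = simulate_dll()
    v_final, err_final = hist[-1]
    print(f"control voltage: {v_final:.4f} V, residual error: {err_final:.3e} s")
```

With these assumed values the timing error shrinks by a fixed fraction each cycle, which reflects the first-order loop behaviour mentioned in the Introduction; real lock time depends on the actual charge pump current, filter, and delay-cell gain.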

B. Dynamic Phase Detector

The phase detector compares the phase at each input and generates an error signal proportional to the phase difference between the two inputs. Fig. 2 shows the dynamic phase detector used in this study. This type of phase detector has been widely used in recent high-speed DLL designs. Its basic structure includes two blocks, which generate the UP signal and the DOWN signal, respectively. The two blocks have exactly the same design, except that the two input signals are switched in position. Each block consists of two cascaded stages with a pre-charge PMOS in each stage. The pre-charge activity of the second stage is often controlled by the output of the first stage. The dynamic PD eliminates flip-flops and has the advantages of a simple structure and a fast transition time. However, the dynamic PD had to be carefully designed in order to minimize the dead zone.
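The input/output behaviour of such a detector can be sketched at the edge-timing level as follows. This is an idealized model, not the transistor-level circuit of Fig. 2: the function name and the `dead_zone` parameter are hypothetical, and the polarity (UP asserted when the reference edge leads the output edge) is an assumption chosen to agree with the simulation discussion, where a rising control voltage indicates that the reference clock leads.

```python
def phase_detector(t_ref_edge, t_out_edge, dead_zone=0.0):
    """Return (up_width, dn_width) in seconds for one pair of rising edges.

    Idealized model: the pulse width equals the time difference between edges.
    Differences no larger than `dead_zone` produce no output, mimicking the
    region where a real detector cannot resolve the phase error.
    """
    diff = t_out_edge - t_ref_edge
    if abs(diff) <= dead_zone:
        return 0.0, 0.0
    if diff > 0:
        # Reference edge arrives first: assert UP (assumed polarity).
        return diff, 0.0
    # Output edge arrives first: assert DN.
    return 0.0, -diff


if __name__ == "__main__":
    print(phase_detector(0.0, 50e-12))                    # reference leads
    print(phase_detector(50e-12, 0.0))                    # output leads
    print(phase_detector(0.0, 5e-12, dead_zone=10e-12))   # inside dead zone
```

The third call shows why the dead zone matters: a small residual phase error produces no correction, so it remains in the output as static offset or jitter.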

![Figure 2. Phase frequency detector using NOR gates](image2)

C. Charge Pump

The schematic of the charge pump is shown in Fig. 3. A single-ended charge pump with the switch at the source is used. The mismatch between the source current and the sink current is reduced by making the two currents equal, so that both experience the same process variations. Studies show that in CMOS circuits, current switching provides a faster switching speed than voltage switching.

![Figure 3. High switch speed charge pump](image3)

Switches M1 and M10 are controlled by the PFD outputs (UP and DN). The current mirrors formed by transistors M5, M6 and M7, as well as the M3-M4 pair, ensure an equal amount of current for the UP and DN branches. Transistors M2, M8 and M9 are included to act as dummy switches to reduce timing mismatch. The current source in this design is calculated using the current equation shown in (1). The transistor sizing is chosen so that current mismatch is avoided.

\[ I_D = \frac{1}{2} \mu C_{\text{ox}} \frac{W}{L} V_{\text{DSAT}}^2 \tag{1} \]
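As a worked illustration of how (1) can be used for sizing, the sketch below evaluates the square-law expression and inverts it for the width. The process parameter `mu_cox`, the overdrive `v_dsat`, and the target current are hypothetical example values, not figures taken from this design or from the TSMC process documentation.

```python
def saturation_current(mu_cox, w, l, v_dsat):
    """Square-law drain current: I_D = 0.5 * mu*Cox * (W/L) * V_DSAT^2."""
    return 0.5 * mu_cox * (w / l) * v_dsat ** 2


def width_for_current(i_d, mu_cox, l, v_dsat):
    """Invert the square law to get the width W needed for a target current."""
    return 2.0 * i_d * l / (mu_cox * v_dsat ** 2)


if __name__ == "__main__":
    # Assumed example values (not from the paper):
    mu_cox = 300e-6   # A/V^2, illustrative NMOS process transconductance
    length = 0.18e-6  # m, minimum length in a 0.18 um process
    v_dsat = 0.2      # V, illustrative saturation (overdrive) voltage
    target = 100e-6   # A, illustrative charge pump current

    w = width_for_current(target, mu_cox, length, v_dsat)
    print(f"W = {w * 1e6:.2f} um")
    print(f"check: I_D = {saturation_current(mu_cox, w, length, v_dsat) * 1e6:.1f} uA")
```

Because the source and sink devices differ in mobility, the same target current generally leads to different widths for the PMOS and NMOS sides, which is one reason sizing affects current mismatch.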

D. Voltage Controlled Delay Line

The proposed delay line circuit, shown in Fig. 4 with its current mirror arrangement and the first delay cell with extended inverters, consists of four delay cells connected in series, which provide the four required clock phases. Each delay cell consists of two current-starved inverters along with four normal digital inverters, which are used to improve the rise time and fall time of each clock phase.

The delay elements are controlled by the control voltage generated by the charge pump block. This control voltage determines the current through the current mirror of the VCDL. Two inverters were added to the configuration to obtain the required delay and attain a 50% duty cycle at the output. Initially, one inverter stage is added and sized to get the 50% duty cycle. After correct sizing of the second inverter stage, the third inverter stage is added to get an inverted output of the reference signal.

The basic operation of a delay cell is described as follows. If the final output phase lags the reference clock, i.e., it has a lower time period compared to the reference clock, then the control voltage increases, thus increasing the current through the delay cells and thereby increasing the delay of each delay element. Finally, the overall time period of the clock phases is increased to match the reference clock period. Once this lock is achieved, the control voltage remains stable and the delay cells keep the delay locked to the clock period.

The exact opposite happens when the output clock phase has a higher time period compared to the reference clock. The control voltage decreases in this case, thus decreasing the delay offered by each cell until the time period of each clock phase locks to the reference clock period.

III. SIMULATION RESULTS

The simulation of the schematic was carried out in TSMC 0.18 µm 1P6M CMOS process technology. Fig. 5 and Fig. 6 show the VCDL output with the control voltage and the VCDL output duty cycle at the locked state, respectively. The delayed output waveforms shown are controlled by the control voltage generated by the charge pump. As can be seen from the graph, the control voltage is increasing, which means that the reference clock is leading the output clock. All the delay stages in the VCDL are identical, with each stage contributing a time delay of 0.25 ns.
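As a quick check of the 0.25 ns figure, the per-cell delay can be estimated under the assumption that, at lock, the total delay of the four identical cells equals one reference period. This assumption is consistent with 4 x 0.25 ns = 1 ns at 1 GHz, but it is not stated explicitly for other frequencies, and the function name below is illustrative.

```python
def cell_delay_at_lock(f_ref_hz, n_cells=4):
    """Per-cell delay in seconds, assuming the whole line spans one period."""
    return 1.0 / (f_ref_hz * n_cells)


if __name__ == "__main__":
    # End points of the reported operating range.
    for f_ref in (520e6, 1e9):
        d = cell_delay_at_lock(f_ref)
        print(f"{f_ref / 1e6:.0f} MHz: {d * 1e9:.3f} ns per cell")
```

Under the same assumption, each cell would need to cover roughly 0.25 ns to 0.48 ns across the 520 MHz to 1 GHz range, which gives a sense of the tuning range the control voltage must provide.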

Fig. 6 shows the locked-state output waveform of the VCDL. The locked state is confirmed by a stable control voltage at 1.5 V. The graph shows a duty cycle of 50 to 50.3% for each of the delayed outputs produced by the voltage-controlled delay line.

whether or not the output clock is aligned with the reference clock. A close-up view, shown in Fig. 7, is taken at the locked state of the DLL output. The graph shows that the control voltage decreases until it reaches a stable state. In the stable state of the control voltage, the reference clock and the output clock are aligned, which means that the loop is locked. The design locks at about 200 ns.

![Figure 7. Overall delay locked loop output](image1)

![Figure 8. Monte Carlo analysis at 30 iterations](image2)

![Figure 9. Jitter analysis](image3)

Compared with other published designs, the proposed design has the lowest duty cycle error, about 0.6%, with a jitter of 5.63 ps. Hence, it is capable of generating the stable time delays that are necessary for clock generation and signal synchronization. Generally, locking time is related to the speed of the architectures used. Since this design incorporates high-speed architectures, the circuit attains a fast locking time of about 200 ns, which allows a system to reduce the wait time required before it can operate.
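The reported figures can be put in relative terms with a short calculation. The sketch below assumes one plausible reading of the numbers: that the 0.6% duty cycle error is the worst-case deviation (50.3%) expressed relative to the 50% target, that jitter is compared against the 1 GHz clock period, and that lock time is counted in reference cycles. The function names are illustrative.

```python
def duty_cycle_error_pct(duty_pct, target_pct=50.0):
    """Duty cycle deviation as a percentage of the target duty cycle."""
    return abs(duty_pct - target_pct) / target_pct * 100.0


def jitter_fraction_pct(jitter_s, f_hz):
    """Jitter as a percentage of one clock period (period = 1 / f_hz)."""
    return jitter_s * f_hz * 100.0


def lock_cycles(lock_time_s, f_hz):
    """Number of reference clock cycles elapsed during the lock time."""
    return lock_time_s * f_hz


if __name__ == "__main__":
    print(f"duty cycle error: {duty_cycle_error_pct(50.3):.2f} %")
    print(f"jitter / period:  {jitter_fraction_pct(5.63e-12, 1e9):.3f} %")
    print(f"lock time:        {lock_cycles(200e-9, 1e9):.0f} cycles")
```

Under these assumptions, a 50.3% duty cycle corresponds to about 0.6% relative error, 5.63 ps is about 0.56% of the 1 ns period, and a 200 ns lock time is about 200 reference cycles at 1 GHz.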

The chip layout, shown in Fig. 10, has a chip area of 0.09703 mm².

![Figure 10. DLL chip core layout](image4)

Corner models (SS, TT and FF) were used to study the effects of process variation. As the Monte Carlo analysis in Fig. 8 shows, no transistor failed during any of the 30 iterations, since the output voltage remained stable when varied. Fig. 9 shows the overall DLL output, which attains a jitter of about 5.63 ps at 1 GHz operation.

The specification summary in Table I shows that the current design improves on previously published works in most properties. The proposed delay-locked loop operates at a high frequency, with an operating frequency range of 520 MHz to 1 GHz.

<table>
<thead>
<tr>
<th>TABLE I. DESIGN SPECIFICATION SUMMARY AND COMPARISON</th>
</tr>
</thead>
<tbody>
<tr>
<td>Design</td>
</tr>
<tr>
<td>Supply Voltage</td>
</tr>
<tr>
<td>Operating Frequency</td>
</tr>
<tr>
<td>Power Consumption</td>
</tr>
<tr>
<td>Jitter</td>
</tr>
<tr>
<td>Lock Time</td>
</tr>
</tbody>
</table>

IV. CONCLUSION

The designers were able to achieve a high-frequency, fast-lock delay-locked loop using a 4-cell delay line with extended inverters, operating over a frequency range of 520 MHz to 1 GHz and attaining a locking time of about 200 ns at 1 GHz operation. The design shows a 50-50.3% duty cycle with a 0.6% duty cycle error in its output, and its jitter is 5.63 ps at 1 GHz, with a total chip core area of 219.775 µm x 441.51 µm (0.09703 mm²). This design is suitable for microprocessor and memory IC applications, particularly those in which timing relationships are essential.

ACKNOWLEDGMENT

The authors wish to thank DOST-ERDT Eye-C Program for the research grant in providing the industry standard IC design tools. Special thanks to Synopsys engineers for the technical support.

REFERENCES


Jefferson A. Hora received his bachelor's degree in Electronics and Communications Engineering from the Mindanao State University-Iligan Institute of Technology (MSU-IIT), Philippines, in 2002 and his M.S. in Electrical Engineering, major in IC Design, from National Taipei University, Taiwan, in 2009. He was an IC Design Engineer at Service & Quality Technology Co., Ltd., Taipei, Taiwan, from 2009 to 2010. He has been a faculty member of MSU-IIT as Asst. Professor since 2010, and is a faculty affiliate and adviser of the Microelectronics Laboratory. His research interests focus on power management ICs, RF-DC converters, analog ICs, and FPGA design and prototyping.

Vincent Alan Heramiz and Pleiades Faith Longakit received their bachelor's degrees in Electronics Engineering from the Mindanao State University-Iligan Institute of Technology (MSU-IIT), Philippines, in 2013.