High resolution distributed time-to-digital converter (TDC) in a White Rabbit network


Nuclear Instruments and Methods in Physics Research A 738 (2014) 13–19


Weibin Pan*, Guanghua Gong, Qiang Du, Hongming Li, Jianmin Li

Key Laboratory of Particle & Radiation Imaging (Tsinghua University), Ministry of Education, Department of Engineering Physics, Tsinghua University, Beijing 100084, China

Article info

Article history: Received 3 September 2013; received in revised form 29 November 2013; accepted 29 November 2013; available online 11 December 2013

Keywords: Distributed; White Rabbit; Time-to-digital converter

Abstract

The Large High Altitude Air Shower Observatory (LHAASO) project consists of a complex detector array with over 6000 detector nodes spread over an area of 1.2 km². The arrival times of shower particles are captured by time-to-digital converters (TDCs) in the detectors' front-end electronics; the arrival direction of the high-energy cosmic ray is then reconstructed from the space-time information of all detector nodes. To guarantee an angular resolution of 0.5°, time synchronization with 500 ps (RMS) accuracy and 100 ps precision must be achieved among all TDC nodes. White Rabbit (WR), a technology that enhances Gigabit Ethernet, has shown the capability of delivering synchronization with sub-nanosecond accuracy and picosecond precision over standard data packet transfer. In this paper we demonstrate a distributed TDC prototype system combining an FPGA-based TDC with the WR technology. With the time synchronization and data transfer services of a compact WR node, separate FPGA-TDC nodes can be combined to provide uniform time measurement information for correlated events. The design details and test performance are described in the paper. © 2013 Elsevier B.V. All rights reserved.

1. Introduction

The Large High Altitude Air Shower Observatory (LHAASO) [1] is proposed to search for the origin of galactic cosmic rays above 30 TeV with a 1.2 km² complex ground particle detector array. The arrival times of shower particles are captured by time-to-digital converters in the detectors' front-end electronics; the arrival direction of the high-energy cosmic ray is then reconstructed from the space-time information of all detector nodes. In order to obtain an angular resolution of < 0.5° in the 1.2 km² complex detector array (KM2A), the timestamps of the front-end electronics for 5635 scintillation electron detectors and 1221 muon detectors should be aligned with 500 ps (RMS) accuracy and picosecond precision, where accuracy represents the deviation of the measurement from the true value, and precision reflects the repeatability of the measurement [2].

The White Rabbit (WR) technology [3] was proposed and developed by CERN and GSI, aiming to provide synchronization with sub-nanosecond accuracy and picosecond precision for large distributed systems. Compared with other mature technologies, White Rabbit is well suited to the LHAASO project [4,5]. The reference clock can be distributed to thousands of WR nodes through cascaded WR switches. Moreover, the Ethernet-based WR combines timing and data link over the same physical media, which is a cost-effective solution for LHAASO.

Benefiting from the WR technology, all TDC nodes can obtain a synchronized clock and time reference. To evaluate the timing performance and verify the feasibility of TDCs in the WR network, a prototype system of a WR-based distributed TDC has been developed.

* Corresponding author. Tel.: +86 13466339331. E-mail address: [email protected] (W. Pan).

0168-9002/$ - see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.nima.2013.11.104

2. White Rabbit

In the White Rabbit project, sub-nanosecond accuracy and picosecond precision are achieved by combining Layer-1 syntonization, the Precision Time Protocol (PTP) [6] and Digital Dual-Mixer Time Difference (DDMTD) [7] phase detection. With Layer-1 syntonization, a master node encodes a 125 MHz clock into the physical layer of the optical downstream link, and the slave node recovers it from the link and uses it as its local clock; thus the master and all slave nodes share a common clock. The recovered clock is sent back to the master over the upstream link for phase measurement. PTP defines an Ethernet packet exchange mechanism from which the link delay and clock offset can be calculated and compensated. However, in a Gigabit Ethernet environment, the standard PTP timestamp has a granularity of 8 ns, which limits the synchronization resolution. In WR, the resolution of the PTP timestamp is improved to picoseconds by measuring the phase difference between the transmitted and recovered clocks with the DDMTD in both nodes. The measured phase differences are then incorporated into the PTP calculation to achieve better resolution. The architecture of the WR link is illustrated in Fig. 1 [8].

Fig. 1. White Rabbit link architecture.

To keep the utmost compatibility, the White Rabbit network is an enhancement of standard Gigabit Ethernet with sub-nanosecond clock and time distribution features. A WR network consists of several layers of WR switches and multiple WR nodes; all switches and nodes are connected by single-mode fibers in a redundant topology. The WR switch is the key component of the WR network and serves as the boundary clock defined in PTP. It recovers the clock from its upstream port and distributes the syntonized clock to all downstream ports, each of which can be connected to a cascaded WR switch or a WR node. The topology of the WR network is illustrated in Fig. 2 [3,9].

Fig. 2. Topology of White Rabbit network.

3. Time-to-digital converter

Compared with analog TDCs, FPGA TDCs have the advantages of high integration, large dynamic range, and complete digitalization. A plain carry-chain-based TDC design in modern FPGA devices can achieve a timing precision of better than 30 ps [10–14]. The structure of a carry-chain-based TDC is sketched in Fig. 3 [14]. The measured signal is injected into a chain of adders, and the sums of the adders are latched by an array of registers. The adders are initialized to all ‘1’s; when a logic ‘0’ to ‘1’ transition occurs on the injected signal, the carry signal starts propagating from the least significant adder to the most significant one. The rising edge of the clock signal latches the sums of the adders, the number of flipped adders is detected by the encoder, and the propagation time of the carry signal is recorded as the fine time result. More detailed descriptions of carry-chain-based TDCs are presented in Refs. [10–14].

4. Implementations

By combining the WR technology and the carry-chain-based TDC, a distributed TDC system structure has been developed. The block diagram of the distributed TDC is illustrated in Fig. 4, where the distributed TDC node is composed of a WR node, a multi-channel TDC module, and a carrier board.

4.1. CUTE-WR

The compact universal timing endpoint based on WR (CUTE-WR) [15], shown in Fig. 5, is a simplified WR node designed as an FPGA Mezzanine Card (FMC) [16] with the minimum components required, so that it can be integrated with any custom circuit. The CUTE-WR recovers timing information from the WR network and provides the synchronized 125 MHz clock, Pulse Per Second (PPS) and International Atomic Time (TAI) through the FMC connector. In addition, the CUTE-WR provides a standard network interface for data transmission. Two general-purpose SMA/LEMO connectors and a USB interface for local monitoring and configuration are also provided on the CUTE-WR.

4.2. 32-channel TDC module

A 32-channel carry-chain-based TDC module has been developed with an Altera FPGA EP2C35F484C6, as shown in Fig. 6. Input signals are connected directly to FPGA pins via two parallel high-speed connectors, using either differential or single-ended signaling; unused TDC input pins can be reprogrammed for other functions. In addition, an on-board 40 MHz oscillator and a 100 Mbps Ethernet port with TCP/IP support are provided so that the module can work independently. In our prototype, only 16 TDC channels are supported on each node; the unused TDC pins are connected to the CUTE-WR mezzanine via the carrier board to form a FIFO-like parallel bus that moves the TDC measurement data from the TDC module to the CUTE-WR mezzanine.

4.3. FMC carrier board

The FMC carrier board, shown in Fig. 7, hosts the CUTE-WR board and the 32-channel TDC module, providing power supply and signal connections.

Fig. 3. Structure of carry chain based TDC.

Fig. 4. Block diagram of the distributed TDC.

4.4. TDC module synchronization

All distributed TDC nodes require a synchronized clock and time reference to produce unified timestamps. The synchronization and measurement scheme inside the TDC module is illustrated in Fig. 8. The phase-compensated 125 MHz clock signal generated by the CUTE-WR drives the TDC circuit; the PPS and the encoded TAI information derived from WR time periodically align the start value of the TDC coarse counters. In the TDC module, the fine time measurement performed in the carry chain is driven by a 375 MHz clock derived from the synchronized 125 MHz clock; the 32-bit coarse time counter is also driven by the 375 MHz clock and is reset by the PPS signal. The final TDC result is obtained by merging the TAI time, the coarse counter and the fine time fraction, and is then transmitted to the CUTE-WR mezzanine.

4.5. DAQ network

The WR PTP Core (WRPC) [17], which serves as an Ethernet MAC capable of providing precise timing, is included in the CUTE-WR. The fabric redirector inside the WRPC forwards Ethernet packets between the WR endpoint, the Mini-NIC and other external fabric interfaces. In our application, Ethernet packets are classified into two categories: PTPv2 and UDP (User Datagram Protocol) packets. TDC data generated by the TDC module are packed by a hardware-based UDP engine and then transmitted to the endpoint. In the opposite direction, PTPv2 packets from the endpoint are forwarded to an LM32 processor [18] through the Mini-NIC block, and UDP packets containing TDC slow control information are forwarded to the UDP engine. The data flow of the distributed TDC node is illustrated in Fig. 9.
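The merging of TAI time, coarse counter and fine time described in Section 4.4 can be illustrated with a minimal Python sketch. The function name, the argument layout and the sign convention for the fine time (taken here as the interval between the hit and the following 375 MHz clock edge, and therefore subtracted) are assumptions made for illustration only; the paper does not specify the actual firmware data format.

```python
from fractions import Fraction

# 375 MHz TDC clock derived from the synchronized 125 MHz WR clock (Section 4.4).
TDC_CLOCK_HZ = 375_000_000
# Exact clock period in picoseconds (8000/3 ps), kept as a fraction to avoid rounding.
PERIOD_PS = Fraction(10**12, TDC_CLOCK_HZ)


def merge_timestamp_ps(tai_seconds, coarse_count, fine_ps):
    """Combine TAI seconds, coarse counter and fine time into one value in ps.

    Hypothetical convention (an assumption, not taken from the paper): the
    coarse counter is reset by the PPS and counts 375 MHz cycles within the
    current second; fine_ps is the time from the hit to the clock edge that
    latched it, so it is subtracted.
    """
    if not 0 <= coarse_count < TDC_CLOCK_HZ:
        raise ValueError("coarse counter must stay within one second")
    if not 0 <= fine_ps < PERIOD_PS:
        raise ValueError("fine time must be shorter than one clock period")
    return tai_seconds * 10**12 + coarse_count * PERIOD_PS - Fraction(fine_ps)


# Two nodes timestamping correlated hits within the same TAI second.
node_a = merge_timestamp_ps(1_000, 375, 100)
node_b = merge_timestamp_ps(1_000, 375, 60)
print(float(node_b - node_a), "ps between the two hits")
```

Because both nodes derive the TAI second, the PPS reset and the 375 MHz clock from the same WR reference, timestamps built this way on different nodes can be subtracted directly.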

5. Measurements

5.1. System setup

To evaluate the performance of the distributed TDC, a test system including two TDC nodes was built, as shown in Fig. 10. Two signals with the same frequency and a set phase difference are generated by a function generator and injected into the two carrier boards. The carrier boards are connected to a White Rabbit switch through 1 km and 2 km optical fibers, respectively. TDC data are collected by the WR switch and eventually transmitted to a PC, where the phase difference is calculated. The input signals of the TDC modules are also measured by an oscilloscope to monitor the time difference. To evaluate the contribution of the WR synchronization performance to the final measurement resolution, the PPS skew between the two CUTE-WRs is continuously measured by the oscilloscope. All measurements are performed at a constant temperature.


Fig. 5. WR node implementation: CUTE-WR. (Block diagram labels: FPGA Xilinx Spartan 6 XC6SLX45T; SFP cage with SFP TX/RX; REF clock generator DAC; 25 MHz VCTCXO; 5:1 PLL, fanout; DMTD clock generator DAC; 20 MHz VCXO; 32M SPI flash; 64K I2C EEPROM; JTAG; debug point; LEDs; two LEMO GPIO connectors; USB with CP2102 UART-USB converter; user-defined interface; FMC connector carrying REFCLK, PPS, TimeCode, GPIO and RX/TX data flow; power supply 3.3 V/800 mA and 1.2 V.)

Fig. 6. A 32-channel TDC module.

Fig. 7. Distributed TDC node implementation: FMC carrier board.


Fig. 8. Synchronization and measurement block in TDC module.

Fig. 9. Data flow of distributed TDC node.

Fig. 10. Test System for distributed TDC.

5.2. Measurement results

The frequency of the generated signals is set to 13 kHz, and the phase difference between the two signals is adjusted from −0.5° to 0.5° in steps of 0.1°. For each phase setting, the two distributed TDC nodes are sequentially powered up with different time intervals, and more than 100,000 measurements are recorded.
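The nominal time differences that correspond to these phase settings follow from the signal period. The short Python sketch below shows the conversion; the function and constant names are illustrative and do not come from the paper.

```python
# Frequency of the generated test signals, as stated in Section 5.2.
SIGNAL_FREQUENCY_HZ = 13_000


def phase_to_delay_ns(phase_deg, frequency_hz=SIGNAL_FREQUENCY_HZ):
    """Convert a phase difference in degrees to a time difference in ns."""
    period_ns = 1e9 / frequency_hz
    return phase_deg / 360.0 * period_ns


# Sweep the phase settings in steps of 0.1 degree.
for step in range(-5, 6):
    phase = step / 10
    print(f"{phase:+.1f} deg -> {phase_to_delay_ns(phase):+.3f} ns")
```

For a 13 kHz signal, a step of 0.1° corresponds to about 21.368 ns and the largest setting of 0.5° to about 106.838 ns, which are the nominal sig_gen values listed in the measurement tables.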

By comparing the calculated phase difference with the value measured by the oscilloscope, the accuracy of the distributed TDC can be evaluated. The detailed measurement results are shown in Table 1, where sig_gen is the nominal phase shift of the function generator; mean_pps_skew and stdev_pps_skew are the mean value and standard deviation of the PPS skew measured by the oscilloscope; mean_sig_osc is the mean value of the phase difference measured by the oscilloscope; mean_sig_tdc and stdev_sig_tdc are the mean value and standard deviation of the phase difference measured by the distributed TDC nodes; and mean_diff is the difference between mean_sig_osc and mean_sig_tdc.

Table 1. Measurement results of distributed TDC test system.

sig_gen/degree  sig_gen/ns  mean_pps_skew/ps  stdev_pps_skew/ps  mean_sig_osc/ns  mean_sig_tdc/ns  mean_diff/ps  stdev_sig_tdc/ps
−0.5            −106.838    22                18.4               −106.802         −106.814         12            28.6
−0.4            −85.470     32                17.8               −85.433          −85.453          20            27.7
−0.3            −64.103     13                20.5               −64.067          −64.068          1             26.6
−0.2            −42.735     27                16.7               −42.698          −42.712          14            29.5
−0.1            −21.368     12                19.2               −21.334          −21.338          4             28.7
0               0.000       15                17.7               0.041            0.017            24            25.4
0.1             21.368      2                 17.8               21.405           21.393           12            28.5
0.2             42.735      9                 17.5               42.767           42.749           18            29.2
0.3             64.103      29                20.3               64.139           64.123           16            26.8
0.4             85.470      10                17.6               85.502           85.494           8             26.9
0.5             106.838     0                 17.8               106.868          106.861          7             28.6

According to mean_pps_skew and stdev_pps_skew, the two distributed TDC nodes are synchronized with 50 ps accuracy and 20 ps precision. Taking the 0.2° phase shift measurement as an example, the histogram measured by the TDC is shown in Fig. 11. It has a span of 180 ps with a standard deviation of 29.2 ps. The accuracy is indicated by mean_diff, which is 18 ps.

Fig. 11. Distributed TDC measurement results. (Measured histogram and approximated Gaussian distribution; standard deviation = 29.2 ps; horizontal axis: phase difference (ps), 42640 to 42840; vertical axis: count, 0 to 14000.)

5.3. Measurement analysis

In order to evaluate how much White Rabbit and the TDC each contribute to the final timing performance of the distributed TDC, the timing resolution of the TDC alone is measured. The two generated signals are injected into two channels of one TDC module, and their time differences are measured as shown in Table 2.

Table 2. Measurement results of TDC alone.

sig_gen/degree  sig_gen/ns  mean_sig_osc/ns  mean_sig_tdc/ns  mean_diff/ps  stdev_sig_tdc/ps
−0.5            −106.838    −106.833         −106.841         8             24.7
−0.4            −85.470     −85.458          −85.466          8             24.8
−0.3            −64.103     −64.115          −64.122          7             26.5
−0.2            −42.735     −42.706          −42.716          10            25.3
−0.1            −21.368     −21.342          −21.343          1             24.5
0               0.000       0.023            0.014            37            22.5
0.1             21.368      21.365           21.347           18            24.2
0.2             42.735      42.728           42.720           8             26.8
0.3             64.103      64.126           64.115           11            24.8
0.4             85.470      85.474           85.659           15            23.7
0.5             106.838     106.846          106.836          10            27.5

In theory, the timing precision of the distributed TDC is determined by the TDC clock jitter and the TDC timing precision. According to Tables 1 and 2, however, the timing precision of the distributed TDC is dominated by the TDC precision. This is because the jitter of the clock recovered by White Rabbit is cleaned to some degree by an internal PLL of the TDC FPGA. If a long cascade of WR switches is used in the WR network, the synchronization precision will deteriorate, and its contribution should then be taken into account.

The accuracy of the TDC alone is better than 50 ps, as indicated by mean_diff in Table 2. Sub-nanosecond synchronization accuracy is guaranteed by the White Rabbit technology in long-term measurements [19]. The distributed TDC is therefore expected to achieve sub-nanosecond accuracy, limited by the clock synchronization accuracy.
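As a rough illustration of this comparison, the residual contribution of the synchronization can be estimated from the standard deviations in Tables 1 and 2. The sketch below assumes that the TDC precision and the clock jitter are independent and add in quadrature; this is a common simplification and not a model stated in the paper, and the function name is illustrative.

```python
import math


def residual_jitter_ps(sigma_distributed_ps, sigma_tdc_alone_ps):
    """Estimate the extra jitter of the distributed setup, in ps.

    Assumes independent error sources that add in quadrature (an assumption
    made for this sketch). Returns 0.0 when the distributed measurement is
    not worse than the TDC-alone measurement.
    """
    diff_sq = sigma_distributed_ps**2 - sigma_tdc_alone_ps**2
    return math.sqrt(diff_sq) if diff_sq > 0 else 0.0


# stdev_sig_tdc at the 0.2 degree setting: 29.2 ps (Table 1), 26.8 ps (Table 2).
extra = residual_jitter_ps(29.2, 26.8)
print(f"residual synchronization contribution: {extra:.1f} ps")
```

Under this assumption the residual is about 11.6 ps at the 0.2° setting, which is below the roughly 18 ps standard deviation of the PPS skew in Table 1 and is consistent with the statement that the recovered clock jitter is partly cleaned by the PLL in the TDC FPGA.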


6. Conclusion and outlook

This paper describes a novel architecture for a distributed time-to-digital converter based on the White Rabbit technology. A universal 32-channel TDC module based on the carry chain structure of an FPGA has been designed for timing measurements with a precision of tens of picoseconds. A compact WR node (CUTE-WR) has been designed to recover the synchronous clock and time from the WR network and provide them to each TDC module. The prototype system of the distributed TDC achieves good timing performance. This work will continue with the evaluation of timing performance at different temperatures. With this work, large-scale distributed synchronized TDC measurement networks can be built for experiments such as cosmic ray or neutrino observatories.

Acknowledgment

This work is supported by the National Science Foundation of China (Nos. 11005065 and 11275111). The authors would like to thank Jinyuan Wu of Fermilab and the White Rabbit team at CERN for their help.

References

[1] Zhen Cao, Chin. Phys. 34 (2010) 249.
[2] Accuracy and Precision, 〈http://en.wikipedia.org/wiki/Accuracy_and_precision〉.
[3] J. Serrano, et al., The White Rabbit Project, in: Proceedings of ICALEPCS TUC004, Kobe, Japan, 2009.
[4] G. Gong, et al., Sub-nanosecond timing system design and development for LHAASO project, in: Proceedings of ICALEPCS, Grenoble, France, 2011.
[5] Q. Du, et al., Nucl. Instrum. Methods Phys. Res. A 732 (2013) 488–492.
[6] IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems, IEEE Std. 1588-2008, 2008.
[7] P. Moreira, et al., Digital dual mixer time difference for sub-nanosecond time synchronization in Ethernet, IEEE International Frequency Control Symposium (FCS), 1–4 June 2010, pp. 449–453.
[8] P. Moreira, J. Serrano, P. Alvarez, M. Lipinski, T. Wlostowski, I. Darwazeh, Distributed DDS in a White Rabbit Network: an IEEE 1588 application, International IEEE Symposium on Precision Clock Synchronization for Measurement Control and Communication (ISPCS), 2012.
[9] P.P.M. Jansweijer, et al., Nucl. Instrum. Methods Phys. Res. A 725 (2013) 187.
[10] J. Wu, Z. Shi, The 10-ps wave union TDC: improving FPGA TDC resolution beyond its cell delay, in: Proceedings of the IEEE Nuclear Science Symposium, October 19–25, 2008, pp. 3440–3446.
[11] J. Wang, et al., IEEE Trans. Nucl. Sci. NS 57 (2) (2010) 446.
[12] E. Bayer, et al., IEEE Trans. Nucl. Sci. NS 58 (4) (2011) 1547.
[13] J. Wu, IEEE Trans. Nucl. Sci. NS 57 (3) (2010) 1543.
[14] W. Pan, G. Gong, J. Li, J. Tsinghua Univ. (Sci. Technol.), (2013) .
[15] W. Pan, et al., Development of a White Rabbit interface for synchronous data acquisition and timing control, RT, Berkeley, USA, 2012.
[16] VMEbus International Trade Association (VITA), American National Standard for FPGA Mezzanine Card (FMC) Standard. 〈http://www.vita.com〉.
[17] White Rabbit PTP Core (WRPC), 〈http://www.ohwr.org/projects/wr-cores/wiki/wrpc_core〉.
[18] LM32 Processor. 〈http://www.ohwr.org/projects/lm32〉.
[19] Maciej Lipinski, et al., Performance results of the first White Rabbit installation for CNGS time transfer, ISPCS2012, San Francisco, USA, 2012.