# The TRIGGER/CLOCK/SYNC Distribution for TJNAF 12 GeV Upgrade Experiments

William GU, et al.

DAQ group and Fast Electronics group

Thomas Jefferson National Accelerator Facility (TJNAF), Newport News, VA 23606, USA

Notice: Authored by Jefferson Science Associates, LLC under U.S. DOE Contract No. DE-AC05-06OR23177. The U.S. Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce this manuscript for U.S. Government purposes.

Abstract

The TRIGGER/CLOCK/SYNC (TCS) distribution system for experiments at the Thomas Jefferson National Accelerator Facility (TJNAF) 12 GeV upgrade [2] is described. The TCS system distributes readout trigger (TRIGGER), system clock (CLOCK), and system synchronization (SYNC) signals for the DAQ system. The TCS system also includes system status monitoring. The TCS distribution system includes Trigger Supervisor (TS)[3] printed circuit board (PCB), Trigger Distribution (TD)[4] PCB, Trigger Interface (TI)[5] PCB, Signal Distribution (SD)[6] PCB, VXS crates [7] and optical fibres. The TS is the main hardware interface between the trigger system[8] and Data Acquisition system (DAQ)[9], and it is the source of the readout trigger, system clock and system synchronization signals. The SD and TD modules are the main fan out hardware. The TI is the main hardware interface between the DAQ and the front end electronics. Bundled optical fibres and the dedicated high speed point to point connections on VXS connectors are used for signal transmission. Field Programmable Gate Arrays (FPGA) are utilised on all boards in the system to provide programmability. The production hardware was intensively tested on the bench. A small-scale version of the TCS distribution system is installed in one experimental hall for DAQ development. The full system will be implemented by the end of the year.

Keywords: Data Acquisition (DAQ), Trigger distribution, Clock distribution, 12 GeV upgrade, Electronics System

\* Corresponding author. Tel: +1-757-269-5358. E-mail address: jgu@jlab.org. Postal address: 12000 Jefferson Avenue, Suite 10, Newport News, VA 23606, USA.

# I. INTRODUCTION

The Trigger/Clock/SYNC (TCS) distribution system is designed for the experiments of the Thomas Jefferson National Accelerator Facility (TJNAF) 12 GeV upgrade. The accelerator consists of a pair of superconducting radiofrequency LINACs linked by recirculation arcs for up to five acceleration passes. It serves four experimental halls with continuous-wave beams at a final energy of up to 12 GeV.

The trigger system uses the detector characteristics to select the interesting beam-target interaction events. The pipelined trigger will be formed every 4 ns, with a trigger acceptance rate of up to 200 kHz. The final trigger signal (at a rate of up to 200 kHz) will initiate the readout of the detector information by the Data Acquisition (DAQ) system.

The DAQ system is built on the VME ReadOut Controller (ROC). The ROC reads out the data from the front end modules via the VME bus. Because of the readout overhead, it is more efficient to group the data in blocks of triggers (events) for readout, especially at high trigger rates. Each ROC holds a fraction of the detector data. The online computers will assemble the data from all ROCs and form events, each of which includes the full detector data. The DAQ can further select events and save the data to permanent storage.

The TCS distribution system is the hardware interface that bridges the trigger system and the DAQ system. The TCS system receives the trigger decision from the trigger system and initiates data readout for the DAQ system by distributing the readout trigger (TRIGGER) signal. In addition to the trigger distribution, it distributes a universal clock (CLOCK) with a frequency of 250 MHz to pipeline the system. It distributes an encoded synchronous signal (SYNC) for the system synchronization. The distribution system also monitors the status of the front end electronics to ensure smooth data readout for the experiments. The TCS signals are sent from TS to TI, while the front-end DAQ status is sent from TI to TS. Figure 1 shows a diagram of the distribution scheme.




Figure 1 Diagram of the trigger and clock distribution system

The main hardware of the TCS distribution system includes a Trigger Supervisor (TS) board, Signal Distribution (SD) boards, Trigger Distribution (TD) boards, Trigger Interface (TI) boards, VXS crates, and optical fibres. The TS board, one SD board, and up to sixteen TD modules are located in the global TCS distribution crate. One TI board and one SD board are located in each front end crate.

The electronics boards are custom designed and produced for the 12 GeV upgrade. Field Programmable Gate Arrays (FPGA) are used for TCS generation, control and decoding. Optical fibres and high speed differential backplane connections are used to transmit signals at high speed and over long distances.

The TCS distribution hardware is discussed in the next section. The system synchronization is discussed in the third section. The TCS distribution system initialization and setup procedure is discussed in the fourth section, and the current status is briefly discussed in the last section.


# II. HARDWARE DESCRIPTION

## A. Trigger Supervisor (TS)

### 1) TS Board Overview

The TS board is the top-level PCB module in the TCS distribution system. It is the hardware decision-making module for the Data Acquisition (DAQ) system. The TS generates the TCS signals, and it throttles the readout trigger when the DAQ is BUSY. The TS board is designed as a VXS crate payload slot #18 board with a physical size of 6Ux160mm. Figure 2 is a picture of the TS printed circuit board (PCB).



Figure 2 Picture of the Trigger Supervisor (TS) printed circuit board

### 2) TS Design

The TS accepts level one triggers from the trigger system, processes the trigger signals, generates event readout triggers and event types, and sends the event readout triggers (or accepted triggers) down to the Trigger Interface (TI) modules through the Signal Distribution (SD) board and the Trigger Distribution (TD) boards to initiate the data acquisition process. The TS receives three sets of trigger inputs simultaneously. The first set is 30 level one trigger signals from the Global Trigger Processor (GTP) board, received through the user defined pins of the VME P2 connector via a VME P2 backplane IO card. These input signals are level shifted from LVPECL to LVDS by Texas Instruments SN65LVDT100 differential receivers before going into the FPGA. The receivers also serve as FPGA input protection and isolation. The second set is 30 source synchronous trigger inputs through the TS front panel. These input signals are received by Maxim MAX8602 discriminator chips, so the inputs are compatible with almost any differential signal level. The discriminator outputs are +2.5V LVPECL signals, which can connect directly to the FPGA. The third set is 15 asynchronous trigger inputs from the generic front panel connector.

The TS supplies a 250 MHz system clock for the trigger system and the data acquisition system. The trigger system and the front end data acquisition modules are pipelined on the 250 MHz clock. The TS can use its front panel clock input as the system clock. This external clock can come from a clock generator or from the CEBAF synchronized clock. The TS can also use its on-board oscillator as the system clock. The system clock source is selected by a hardware switch, which keeps the selection flexible but less prone to problems. The clock is distributed to the FPGA, the front panel outputs and the VXS P0 backplane by an On-Semi MC100LVEP111 differential clock driver. The VXS P0 backplane clock is received by the SD and further fanned out to the TD, then to the TI and the front end electronics. When the CEBAF synchronized clock is used as the system clock source, the readout trigger, which is synchronized with the system clock, could be used as the TDC start time or stop time.

To be compatible with the earlier design and to facilitate DAQ tests, there are sixteen generic differential inputs (which could be trigger, busy, inhibit, etc.), twelve generic differential outputs, and ten generic single-ended outputs. There are also 24 outputs from the FPGA going directly to the six quad-pack LEDs. SMA connectors are mounted on the front panel for the external clock input and output. Using the same front panel space as the SMA connectors, two QSFP optic transceivers can be loaded. Through these two transceivers, the TS can connect to two TI boards directly, bypassing the TCS distribution. The TS can also be configured to drive the VXS P0 as a TI. The TS is very flexible in configuration. It can be used in a large system with up to 128 front end crates, or in a small system with just one crate or several crates.

As a VXS payload board, the TS is compatible with VME64x. It has VME A24D32 registers for board setup and monitoring. It supports A32D32, block, and 2eSST data readout. It can even be configured as a VME master board. Figure 3 is the functional diagram of the TS.



Figure 3 TS functional diagram

### 3) TS FPGA Design

The main functions of the TS FPGA are event readout trigger and event type generation, SYNC signal generation, and readout trigger throttling. Additionally, the TS FPGA has two VME to I<sup>2</sup>C engines and two VME to JTAG engines. Each of the I<sup>2</sup>C engines is connected to one switch slot in the VXS crate, to serve as a bridge between the VME controller and the switch slot electronics. The JTAG engines connect to the FPGA JTAG port and the PROM JTAG port. The ports can be used to load the PROM and to read out the chip identification codes and the user programmable code. The TS PROM user code includes the TS board serial number and TS identification information. The TS FPGA user code includes the firmware revision information.

The FPGA's built-in Digital Clock Manager (DCM) is used to generate the lower speed 125 MHz and 62.5 MHz clocks for trigger word serialization. The generated clocks are also used to keep the system synchronized. Figure 4 is a functional diagram of the TS FPGA.



Figure 4 Functional diagram of the TS FPGA

#### 3.1 Readout Trigger Generation

After receiving the 75 trigger inputs, the TS pre-scales and enables each input independently. The TS generates the readout trigger and event type through a multilevel lookup table, which is implemented with the FPGA's built-in block RAM. The TS can also generate triggers by VME command for system tests. The TS forms a trigger word from the readout trigger and event type every 16 ns. A valid readout trigger has to pass the trigger rule check and the trigger throttling logic. The 16-bit trigger words are serialized by the FPGA's built-in Multi-gigabit Transceivers (MGT) at 62.5 MHz, that is, every 16 ns. The trigger is generated on the 250 MHz clock and has a time resolution of 4 ns. The fine trigger timing information (which quadrant of the 16 ns period) is sent as part of the 16-bit trigger word. Table 1 is the trigger word definition:

Table 1 Trigger word definition

| Bit 15 | Bit 14 | Bit 13 | Bit 12 | Bits 11-10                                         | Bits 9-0                    | Comment                                    |
|--------|--------|--------|--------|----------------------------------------------------|-----------------------------|--------------------------------------------|
| 1      | 0      | 0      | 1      | Quadrant timing                                    | Event type                  | GTP major trigger                          |
| 1      | 0      | 1      | 0      | Quadrant timing                                    | Event type                  | Ext major trigger                          |
| 1      | 0      | 1      | 1      | Four TS partitions' event types (spans bits 11-0)  |                             | TS partitioning (4, 3, 2, 1)               |
| 0      | 1      | 1      | 0      | Quadrant timing                                    | Trigger source / Event type | TImaster legacy trigger / (TS) VME trigger |
| 0      | 1      | 0      | 1      | Trigger command/Control (spans bits 11-0)          |                             | VME command                                |
| 0      | 1      | 0      | 0      | TS timer (TS time bit(13:2)) (spans bits 11-0)     |                             | TI Sync check                              |
| 0      | 1      | 1      | 1      | Trigger content (spans bits 11-0)                  |                             | Additional trigger info                    |
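As an illustration of the word layout in Table 1, the following Python sketch packs and unpacks a 16-bit trigger word for the rows that carry quadrant timing and an event type. The function and dictionary names are hypothetical, and the quadrant numbering (0 for the first 4 ns of the 16 ns period) is an assumption; the actual firmware may order the fields or number the quadrants differently.

```python
# Word type codes taken from bits 15-12 of Table 1.
# The dictionary and function names are illustrative, not from the TS firmware.
WORD_TYPES = {
    0b1001: "GTP major trigger",
    0b1010: "Ext major trigger",
    0b1011: "TS partitioning",
    0b0110: "TImaster legacy / VME trigger",
    0b0101: "VME command",
    0b0100: "TI sync check",
    0b0111: "Additional trigger info",
}

# Rows whose payload is "quadrant timing (bits 11-10) + 10-bit field (bits 9-0)".
QUADRANT_LAYOUT = {0b1001, 0b1010, 0b0110}


def quadrant_of(trigger_time_ns):
    """4 ns quadrant inside the 16 ns word period (numbering is an assumption)."""
    return (trigger_time_ns // 4) % 4


def pack_trigger_word(type_code, quadrant, event_type):
    """Build a 16-bit word: type in bits 15-12, quadrant in 11-10, event type in 9-0."""
    if type_code not in QUADRANT_LAYOUT:
        raise ValueError("type code does not use the quadrant/event-type layout")
    if not 0 <= quadrant < 4:
        raise ValueError("quadrant must fit in 2 bits")
    if not 0 <= event_type < 1024:
        raise ValueError("event type must fit in 10 bits")
    return (type_code << 12) | (quadrant << 10) | event_type


def unpack_trigger_word(word):
    """Split a 16-bit word back into its fields."""
    type_code = (word >> 12) & 0xF
    return {
        "kind": WORD_TYPES.get(type_code, "unknown"),
        "quadrant": (word >> 10) & 0x3,
        "event_type": word & 0x3FF,
    }


word = pack_trigger_word(0b1001, quadrant_of(24), 5)
print(hex(word), unpack_trigger_word(word))
```

The sketch only covers the rows with a quadrant field; the rows whose payload spans all twelve low bits would need their own field definitions.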

As the TS and TI use the same 250 MHz system clock, the elastic buffers inside the MGT are not necessary. The MGT phase alignment is used to bypass the transmitter elastic buffers and keep the serializer/deserializer latency at its minimum. The timing of the received signal is not as critical, so the receiving elastic buffer is used for easy clock domain transition.

#### 3.2 SYNC generation

The TS generates and distributes the SYNC signal. The SYNC is an encoded four-bit command, serialized at 250 Mbps and synchronized with the system clock. Normally, the serial SYNC line stays at logic high ('1'). To transfer a SYNC command, the line goes to logic low for one bit, followed by the 4-bit command code. After the 4-bit SYNC command, the line goes to logic high again. There is a minimum of four '1's before the next cycle begins. The 4-bit command is phase aligned to the 62.5 MHz clock used for the trigger word transfer. This phase relation is used to synchronize the slower clocks on the TI to the 62.5 MHz clock on the TS. It also limits two consecutive SYNC commands to no less than 64 ns apart. To accommodate the AC coupled optical transceivers, the SYNC is Manchester encoded on the TS and Manchester decoded on the TI. Table 2 shows some SYNC command codes.

Table 2 SYNC command codes

| 4-bit SYNC code | SYNC action                                                             |
|-----------------|-------------------------------------------------------------------------|
| 0000 or 1111    | Invalid codes                                                           |
| 1101            | Front end crate reset, trigger link realignment                         |
| 0111            | Trigger stop/trigger link disable, and trigger FIFO write counter reset |
| 0101            | Trigger start/trigger link enable, trigger FIFO read counter reset      |
| 0100            | Reset the TI GTP status register                                        |
| 0011            | AD9510 re-sync, that is: slower clock phase re-sync                     |
| 0010            | System clock resynchronization, AD9510 re-sync, DCM reset, MGT reset    |
| 0001            | TI VME clock DCM reset, then full reset                                 |
| Others          | To be assigned                                                          |
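The SYNC framing described above (idle high, one low start bit, four code bits, at least four idle '1's) and the Manchester coding can be sketched in Python as follows. This is a minimal model under stated assumptions: the code bits are taken to be sent most significant bit first, and the Manchester convention is taken to be the IEEE 802.3 one ('1' as low-then-high). Neither choice is specified in the text, and all function names are hypothetical.

```python
IDLE_BITS = 4  # minimum number of '1' bits after a command, as stated in the text


def sync_frame(code):
    """Serial bits for one SYNC command: start bit 0, 4 code bits, idle 1s."""
    if not 0 <= code <= 0xF or code in (0b0000, 0b1111):
        raise ValueError("invalid SYNC code")
    code_bits = [(code >> i) & 1 for i in (3, 2, 1, 0)]  # MSB first: an assumption
    return [0] + code_bits + [1] * IDLE_BITS


def manchester_encode(bits):
    """Assumed IEEE 802.3 convention: 1 -> (0, 1), 0 -> (1, 0)."""
    out = []
    for b in bits:
        out.extend((0, 1) if b else (1, 0))
    return out


def manchester_decode(halfbits):
    """Inverse of manchester_encode; rejects symbols without a mid-bit transition."""
    if len(halfbits) % 2:
        raise ValueError("odd number of half-bits")
    bits = []
    for first, second in zip(halfbits[0::2], halfbits[1::2]):
        if first == second:
            raise ValueError("not a valid Manchester symbol")
        bits.append(second)
    return bits


def decode_sync(bits):
    """Find the start bit on an idle-high line and return the 4-bit code."""
    start = bits.index(0)
    code_bits = bits[start + 1:start + 5]
    if len(code_bits) < 4:
        raise ValueError("truncated SYNC frame")
    code = 0
    for b in code_bits:
        code = (code << 1) | b
    return code


line = [1, 1] + sync_frame(0b1101)       # two idle bits, then the reset command
encoded = manchester_encode(line)        # what would be sent over the fibre
print(decode_sync(manchester_decode(encoded)))
```

Every Manchester symbol contains one high and one low half-bit, so the encoded stream is DC balanced regardless of the command, which is the property that suits AC coupled links.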


### 4) TS FPGA programming

The TS utilizes a Xilinx XC5VFX70T-2FF1136 FPGA, which needs about 27 Mbit of uncompressed data to configure. A Xilinx XCF32P PROM is used in serial mode to configure the FPGA and to store the configuration file when the power is off. When the FPGA configuration bits are compressed, the PROM can hold two versions of the firmware. This two-version design makes switching the TS operation modes very easy.

The PROM can be loaded remotely by VME command. A user defined address modifier (AM) code 0x19 is used to load the XCF32P PROM through a discrete logic VME to JTAG engine. This loading does not depend on the FPGA and works on a bare board from the assembly house. This process loads one bit of PROM data per VME transfer. To increase the efficiency of the VME transfers, a second VME to JTAG engine is implemented inside the FPGA. With the FPGA internal JTAG engine, 32 bits are loaded into the PROM per VME command. This process is much more efficient than that of the discrete JTAG engine, but it works only when the FPGA is programmed and working.
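The difference between the two loading paths can be estimated with simple arithmetic. The Python sketch below counts the VME transfers needed to move one firmware image of about 27 Mbit (the uncompressed size quoted above) at one bit per transfer and at 32 bits per transfer. It deliberately ignores JTAG protocol overhead, erase and verify steps, which are not described in the text, so the numbers are only a rough lower bound; the function name is hypothetical.

```python
import math


def vme_transfers(image_bits, bits_per_transfer):
    """Number of VME transfers needed to move image_bits of PROM data."""
    return math.ceil(image_bits / bits_per_transfer)


IMAGE_BITS = 27_000_000  # "about 27 Mbit" uncompressed, per the text

discrete = vme_transfers(IMAGE_BITS, 1)    # discrete-logic JTAG engine
internal = vme_transfers(IMAGE_BITS, 32)   # FPGA-internal JTAG engine

print(discrete, internal, discrete / internal)
```

Under these assumptions the FPGA internal engine needs 32 times fewer VME transfers for the same image.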

## B. Signal Distribution (SD) board

The SD is designed as a VXS switch slot #B module (physical slot#12 in the 21-slot VXS crate) with a physical size of 6Ux160mm. Figure 5 is a picture of the SD card.



Figure 5 The Signal Distribution Module

The SD card receives TCS signals from the VXS payload slot#18, and fans out the signals to VXS payload slot#1-16 through the VXS P0 connectors using high speed differential signals. In the global trigger/clock distribution crate, the payload slot#18 hosts the TS board, and payload slot#1-16 host TD boards. In the front end crates, the slot#18 hosts a TI board, and slot#1-16 host front end electronics boards (Flash ADC, TDC, etc.).

The SD also receives the BUSY status signals from the payload slot#1-16 boards, merges them, and sends the merged BUSY to the payload slot#18 board. This BUSY status is used to throttle the readout trigger sent from the TS, to keep the DAQ system from getting out of synchronization.

The SD has the option to clean up the clock jitter using a SiliconLab SL5538 PLL component. The jitter cleaned output clock can also be phase delayed. This gives the option of aligning the front end crate clock phases. After the jitter clean-up, the clock jitter of the SD clock output is about 1 ps, which is close to the measuring limit of our equipment.

## C. Trigger Distribution (TD) board

The TD is a VXS payload module with a physical size of 6Ux160mm. Its main function is to fan out the TCS signals and to collect the front end crate status information using the optical fibres. Figure 6 is a picture of the TD board.



Figure 6 Trigger Distribution (TD) board

The TD receives the TCS signals through the VXS P0 connector from TS via SD fan out. The trigger signal is re-sampled using the Analog Devices' ADN2805, a 1.25 Gbps clock and data recovery IC. The TCS is fanned out to eight optical transceivers (AVAGO HFBR-7924). Each optical transceiver drives a set of fibres, and connects to one TI board in the front end crate.

The TD receives the status from eight TI boards through the optical transceivers. The status words are deserialized by the FPGA's built-in MGT modules. The status words include the front end crate (the crate in which the TI resides) BUSY, readout acknowledge, trigger received, etc. The TD merges the BUSY signals from the eight TI boards and sends the result to the SD; the SD merges the BUSY signals and sends them to the TS. The TD can assert the BUSY if the number of events buffered at the front-end crate, which is the difference between the number of triggers it fans out and the number of readouts the front end crate has performed, is over a preset limit. This is used to limit the number of events buffered on the front end electronics, or the buffer usage on the front end electronics. The special case is the event locking readout mode, in which the preset limit is one.
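The buffered-event bookkeeping described above can be modelled with two counters per front end crate. The Python sketch below is a behavioural illustration only, not the TD firmware: the class and function names are hypothetical, and it assumes that BUSY is asserted as soon as the number of buffered events reaches the preset limit, so that a limit of one reproduces event locking (one trigger, then BUSY until the readout is acknowledged).

```python
class CrateBufferMonitor:
    """Tracks events buffered in one front end crate (illustrative model)."""

    def __init__(self, limit):
        self.limit = limit
        self.triggers_sent = 0     # triggers fanned out to this crate
        self.readouts_acked = 0    # readout acknowledges received from its TI

    def on_trigger(self):
        self.triggers_sent += 1

    def on_readout_ack(self):
        self.readouts_acked += 1

    @property
    def buffered(self):
        return self.triggers_sent - self.readouts_acked

    @property
    def busy(self):
        # Assumption: BUSY once the buffered count reaches the limit.
        return self.buffered >= self.limit


def td_busy(ti_busy_flags, monitors):
    """Merged BUSY sent towards the SD: any TI BUSY or any crate over its limit."""
    return any(ti_busy_flags) or any(m.busy for m in monitors)


# Event locking: limit of one buffered event per crate.
crate = CrateBufferMonitor(limit=1)
crate.on_trigger()
print(crate.busy)        # BUSY until the readout is acknowledged
crate.on_readout_ack()
print(crate.busy)
```

A larger limit lets the front end buffer several events before the trigger is throttled, which trades buffer usage for readout efficiency.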

The TD board utilizes a Xilinx XC5VLX30T-1FF665 FPGA and an XCF32P PROM. When the FPGA bits are not compressed, the PROM can hold two versions of the FPGA firmware. When the FPGA program bits are compressed, the PROM can hold four versions of the firmware. This multiple-version design makes switching the TD operation mode very easy.

The PROM can be loaded remotely by VME command. The mechanism is the same as that implemented on the TS; for details, refer to the TS design description. Although the TD and TS reside in the same crate, they are in different slots and have their own geographic addresses as defined by the VME64x protocol, so there is no confusion between TS and TD in VME operations.

## D. Trigger Interface (TI) board

### 1) TI Overview

The TI is the readout interface board to the front end DAQ electronics. It receives the TCS signals from the TD. It decodes the TCS signals and sends them through the SD to the front end crate electronics. Each front end crate has one TI board, which is usually located in VXS payload slot#18. To optimize the system design, the TI shares the same PCB with the TD, but the components are populated differently. The TI and TD use the same FPGA, the Xilinx XC5VLX30T, but their firmware is different. Figure 7 shows a picture of the TI board.



Figure 7 Trigger Interface card. The TI shares the same PCB design as TD, but the components are populated differently from the TD.

Because of the shared PCB design with the TD, the TI is flexible enough to implement some of the TD functions and to set up a small DAQ system with up to nine crates for detector commissioning without the full TCS distribution system.

### 2) TI Design

The TI is designed as a VXS payload slot#18 board with a physical size of 6Ux160mm. The AVAGO HFBR-7924 four channel optical transceiver is used on the TI/TD boards. The first optical transceiver is used to receive TCS signals from the global TCS distribution crate and to send status to the global distribution crate. The optional second optical transceiver can be used for subsystem TCS signal distribution. Figure 8 shows the diagram of the TI functions.



Figure 8 Trigger Interface card functional diagram

Analog Devices' AD9510 is used for clock distribution and for generating the lower frequency clocks. A cross-switch buffer lets each of the two clock lines on the VXS P0 backplane carry any of the three clocks (250 MHz, 125 MHz, and 31.25 MHz) independently; the two lines are received by different front end electronics. The TI board also distributes the TCS signals to the VME P2 backplane and the front panel connectors, so that the TI can interface with non-VXS modules in a standard VME crate.

### 3) TI FPGA and TI Data

One Xilinx XC5VLX30T FPGA is used on the TI board. The FPGA has three main functional blocks: VME interface, TCS interface, and event assembly and BUSY monitoring. Each part will be briefly described in the following paragraphs.

[Figure: TI FPGA block diagram]

The VME interface functional block is responsible for the slow controls of the TI board and the VXS switch slot boards, and for the TI data readout. As there is no VME bus access to the switch slots in the VXS crate, a VME to I<sup>2</sup>C engine is implemented on the TI for each switch slot. Two VME to JTAG engines are implemented to connect the VME to the FPGA JTAG port and the PROM JTAG port. Through the JTAG ports, the board type and firmware versions can be verified by reading out the chip code and user code. The board serial number is saved in the PROM's dedicated USERCODE register. The TI also initiates the front end crate data readout through the VME bus. The TI data can be read out in simple single A32D32 VME transfers, block transfer mode, or 2eSST mode.

The trigger word is deserialized using the Xilinx FPGA's built-in MGT transceivers. The readout trigger signal, event type, and trigger timing information are extracted. The trigger signal is sent out to the front end electronics in the crate using the VXS P0 connector and the SD board. Meanwhile, the TI can assemble its own event data based on the trigger received. The TI event data includes the trigger number (or event number), trigger time stamp, and trigger type information. The TI data will be used online for detector event assembly and for event synchronization checks. Upon receiving a trigger signal, the TI initiates the ROC readout by either asserting the VME Interrupt Request or setting a polling register. After the ROC finishes the crate readout, the ROC acknowledges to the TI that the readout has finished.

Using its IODELAY, the FPGA can automatically align the SYNC signal phase to the 250 MHz system clock. The TI can measure the fibre latency with a precision of less than 1 ns by loopback from TD board. The SYNC will be delayed to compensate for the fibre latency, so all the TIs will receive the SYNC command at the same time with the exception of the system clock skews. The TI will decode the SYNC command. It sends the RESET (one of the decoded SYNC commands) to the other modules in the crate through SD via VXS P0 backplane. The TI combines the BUSY from SD, which is the merged BUSY signals from front end modules, and its own BUSY. Together with other TI status, TI sends the BUSY to the TD through the optical fibre link.

# III. TCS DISTRIBUTION ISSUES

## A. Clock distribution

The whole system uses the same 250 MHz clock, which comes from the TS in the global distribution crate. This clock is either generated by the TS on-board clock oscillator or taken from its external input. It is fanned out to the VXS P0 connector, then to the SD board. The SD fans out the clock to the TD boards via the VXS P0 backplane. The TD board further fans out the clock to the TI boards via optical fibre cables. The TI sends the clock to the front end crate SD board, and the SD fans it out to the front-end DAQ modules (TDC, ADC) via the VXS P0 backplane. The number of fan-out buffer levels is minimized on every board to limit the clock jitter. The slower clocks derived from the system clock are phase aligned by the Analog Devices AD9510 using a synchronous phase re-alignment command. The clock jitter measured at the front end electronics is about 1 ps.

## B. SYNC distribution

The SYNC is a 4-bit code, which is decoded by TI boards. Various decoded codes are used to synchronize the system. The SYNC is synchronized across the TI boards by applying different delays on the individual TI boards. The delays are determined by the fibre latency measurement.

Out of the twelve fibres in each cable connecting the TD board with the TI board, eight fibres (four pairs) are connected to the optical transceivers (AVAGO HFBR-7924), and four fibres are not used. Out of the four pairs, one pair is used for trigger and status, one pair is used for clock, and one pair is used for SYNC. The fourth pair is used to measure the fibre latency. When measuring the latency, the TI sends a test signal to the TD through one fibre of the pair, and the TD loops back the signal through the other fibre of the pair. The TI measures the delay between the test pulse and the looped back test pulse using the FPGA counter and the carry chain [5, delay measurement]. As the skew between fibres in a cable is small, the measurement on this pair can be used as the delay of the other pairs.

The fibre delay measurement result can be saved in the TI, and used to automatically compensate the fibre delay for the SYNC in 4 ns (the 250 MHz system clock period) steps. Using the Xilinx FPGA IODELAY feature, the SYNC can be automatically phase aligned to the system clock. After the SYNC compensation, all the TI boards receive the SYNC at the same time with the skew of one system clock period, which is 4 ns. The synchronized SYNC signals are used to synchronize the triggers as described in the next section.
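The compensation step can be illustrated with a short Python sketch. It assumes that the one-way latency is half of the measured round trip (that is, both fibres of the loopback pair have equal length and the TD loopback adds no delay), and that each TI delays its SYNC so that its total latency matches that of the longest fibre in the system. The text does not spell out either choice, and the function names and example numbers are hypothetical.

```python
import math

CLOCK_PERIOD_NS = 4.0  # 250 MHz system clock, the compensation step size


def one_way_latency_ns(round_trip_ns):
    """Assumes both fibres of the measurement pair have the same length."""
    return round_trip_ns / 2.0


def sync_delay_steps(round_trip_ns_by_ti):
    """Delay, in 4 ns steps, each TI adds so all TIs match the longest fibre."""
    one_way = {ti: one_way_latency_ns(rt) for ti, rt in round_trip_ns_by_ti.items()}
    target = max(one_way.values())
    steps = {}
    for ti, latency in one_way.items():
        # Round to the nearest whole clock period.
        steps[ti] = math.floor((target - latency) / CLOCK_PERIOD_NS + 0.5)
    return steps


def residual_skew_ns(round_trip_ns_by_ti, steps):
    """Leftover arrival-time difference relative to the longest fibre."""
    one_way = {ti: one_way_latency_ns(rt) for ti, rt in round_trip_ns_by_ti.items()}
    target = max(one_way.values())
    return {ti: one_way[ti] + steps[ti] * CLOCK_PERIOD_NS - target for ti in one_way}


# Hypothetical round-trip measurements in ns for three TI boards.
measured = {"ti_a": 104.0, "ti_b": 180.0, "ti_c": 1000.0}
steps = sync_delay_steps(measured)
print(steps)
print(residual_skew_ns(measured, steps))
```

Because the delay is quantized to the clock period, the residual stays within a fraction of 4 ns in this model; the remaining sub-period alignment is what the IODELAY phase alignment described above takes care of.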

# C. Trigger synchronization

The trigger words, which include readout trigger signals and event information (event type, trigger timing, etc.), are serialized on the TS. The serialized trigger word is fanned out by the SD and TD boards and deserialized by the TI board. The latencies between the TS board and the TI boards depend on the fibre lengths and on the deserializers on the TI boards, so different TI boards have different latencies. The TI boards need to be synchronized so that they all send the readout trigger to the front-end data acquisition electronics at the same time, and so that the readout data from different electronics belong to the same physics event. To synchronize the trigger signals, both the fibre latency and the deserializer latency need to be compensated. The SYNC is used in conjunction with a synchronous FIFO to enforce a fixed latency on the serial trigger link. Figure 9 shows the diagram of the compensated trigger distribution.


Figure 9 Trigger synchronization between TIs

After SYNC delay compensation, all the TI boards receive SYNC at the same time. The TS encodes the SYNC on its 62.5 MHz clock, and the TI boards use one of the decoded SYNC codes to align the phases of the TI slower clocks. By sending a phase alignment command to the AD9510 clock distribution chip, all the slower clocks on the TI boards are phase aligned with the TS 62.5 MHz clock. The 62.5 MHz clocks are used for trigger word serialization and deserialization.

On the TI board, the trigger word is clocked into a FIFO and clocked out of the FIFO using a 62.5 MHz clock derived from (and in phase with) the 250 MHz system clock. At start-up, the FIFO is reset (0 words) and FIFO reading and writing are disabled. No words are written into the FIFO, since the TS is not yet transmitting data words on the trigger link (i.e. the received-data-valid signal is not asserted). Acceptance of triggers by the TS is also disabled, and the serial trigger link carries idle words only. On trigger start, the TS starts transmission (trigger words and/or timing words) on the trigger link. The TI writes the deserialized data (valid, that is, non-idle data words) to the FIFO. After a preset delay (VME register controlled) from the trigger start, the TS issues a 'Trigger Start' command on the SYNC line. When the TI receives the 'Trigger Start' from the SYNC line, it resets the trigger FIFO readout address and enables continuous readout of the FIFO. As the CLKSYNC lines are fibre length adjusted and the 62.5 MHz clocks are phase aligned, the trigger words from the TI board FIFOs are synchronized across the system.
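The fixed-latency behaviour of the FIFO can be checked with a small simulation. This is a sketch under simplifying assumptions: time is counted in 16 ns ticks of the 62.5 MHz clock, the TS sends one word per tick starting at tick 0, and the compensated 'Trigger Start' reaches every TI on the same tick. The function name and the latency values are hypothetical.

```python
from collections import deque


def ti_output_ticks(n_words, link_latency, start_tick):
    """Return {word index: tick at which this TI reads it out of the FIFO}.

    link_latency is the fibre plus deserializer latency of this TI in ticks.
    start_tick is the tick at which the 'Trigger Start' command arrives.
    """
    fifo = deque()
    emitted = {}
    tick = 0
    while len(emitted) < n_words:
        # Write side: the word sent at (tick - link_latency) arrives now.
        arriving = tick - link_latency
        if 0 <= arriving < n_words:
            fifo.append(arriving)
        # Read side: continuous readout once 'Trigger Start' has arrived.
        if tick >= start_tick and fifo:
            emitted[fifo.popleft()] = tick
        tick += 1
    return emitted


short_link = ti_output_ticks(n_words=5, link_latency=3, start_tick=10)
long_link = ti_output_ticks(n_words=5, link_latency=9, start_tick=10)
```

Both links read out each word on the same tick, because the FIFO absorbs the latency difference. In this model that holds only while the start delay is at least as long as the longest link latency, which is the role of the preset delay.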

The trigger word also carries the fine trigger timing information. By decoding that timing, the TI board distributes the trigger with 4 ns precision, even though the trigger word is serialized every 16 ns. If the system clock phase is not adjusted, there will be a maximum of 4 ns skew among the clocks on the TI boards. This phase can be adjusted by the SD if the skew is critical to the system. As the trigger and SYNC are phase aligned with the clock, there will be a maximum phase difference of 4 ns for the trigger signals among TI boards if the clock phase is not adjusted.

# D. DAQ synchronization (trigger throttling) control

Because of the finite memory size and the randomness of the level one trigger, it is possible that the memory gets overwhelmed somewhere in the system, which could cause problems in the DAQ system. The trigger distribution throttling mechanism is used to prevent possible memory overflows and to keep the DAQ synchronized. Figure 10 shows the DAQ synchronization logic implementation. Three methods, which are used to keep the DAQ synchronized, will be described in the following sections in detail.




# Figure 10 DAQ synchronization


# 1) Busy signals

The BUSY signals are the primary feedback for the pipelined DAQ synchronization. The front-end DAQ electronics receives the readout trigger signal, finds the matching data, and stores the data in memory to be read out through VME later. The DAQ is synchronized on the readout trigger, that is, each readout trigger is one physics event. If the front-end electronics memory is full (or close to full), it asserts the BUSY signal to inform the trigger distribution system that the DAQ could go out of sync if more readout triggers come. The BUSY signals from the front-end boards can be accumulated on the SD and sent to the TI board in the front-end crate. The TI sends the BUSY signal to the TD through the fibres. The TD accumulates the BUSY from the TI boards and sends it to the SD. The SD in the global distribution crate accumulates the BUSY and sends it to the TS board. When the TS receives the BUSY, it throttles the trigger to prevent memory overflow in the front-end electronics. After data readout in the front end, the BUSY is released. Once the BUSY is deasserted at the TS, the TS resumes readout trigger generation. The TS board records the busy time for efficiency monitoring. Because of the event trigger latency, the front-end board should assert BUSY before it is really full, leaving some cushion for the triggers in transit.
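The BUSY path can be sketched as a tree of logical ORs, with an early-assert threshold at the front-end module. This is an illustrative model only. It assumes that "accumulate" means a logical OR at each level, and the function names, the nested-dictionary layout, and the cushion value are hypothetical.

```python
def module_busy(buffered_events, capacity, cushion):
    """Assert BUSY early, leaving `cushion` slots for triggers in transit."""
    return buffered_events >= capacity - cushion


def accumulate_busy(flags):
    """Assumed accumulation rule: BUSY if any input is BUSY."""
    return any(flags)


def ts_sees_busy(system):
    """system: {td_name: {crate_name: [module BUSY flags]}}."""
    td_levels = []
    for crates in system.values():
        # Front-end SD merges the module BUSY flags and hands them to the TI.
        crate_levels = [accumulate_busy(mods) for mods in crates.values()]
        # TD merges the BUSY of the TI boards connected to it.
        td_levels.append(accumulate_busy(crate_levels))
    # Global SD merges the TD outputs and drives the TS.
    return accumulate_busy(td_levels)


idle_system = {"TD1": {"crate1": [False, False], "crate2": [False]}}
one_full_module = {
    "TD1": {"crate1": [False, False]},
    "TD2": {"crate2": [False, module_busy(60, capacity=64, cushion=4)]},
}
```

With a 64-event buffer and a cushion of 4, the module in the second example asserts BUSY at 60 buffered events, and that single flag is enough to throttle the TS.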

# 2) Event limit setting

In addition to the BUSY feedback, the system can set a limit on the number of triggers buffered at the front-end electronics. This is especially useful for electronics that do not support pipeline mode but have a known buffer capacity. This is achieved through the trigger acknowledge and readout acknowledge from the TI boards.

After the trigger (event) is read out, the Read Out Controller (ROC) sets an acknowledge signal to the TI to indicate that one event has been read. The TI sends this acknowledge signal, encoded and serialized, back to the TD through the same fibre used for the BUSY transfer. The TD keeps track of the number of triggers sent to the TI and the number of acknowledges from the TI. If the difference is over a preset limit, a certain number of events are buffered on the front-end DAQ electronics, and the TD asserts the BUSY. Through the SD board, the TS board receives the BUSY, and the TS throttles the trigger and disables further trigger fan-out. After front-end readout and acknowledge, the difference decreases, and the BUSY is released on the TD. After the TS senses the deassertion of BUSY from the TD, it generates readout triggers again. The TS records this as dead time in the same way as the BUSY asserted by the front-end electronics. If the preset limit is 1, this is the event locking mode: no second trigger is sent out before the first trigger is read out. If the preset number is zero, the DAQ works in pipeline mode, with trigger throttling by the front-end electronics BUSY only.

If event blocking is used, that is, a preset number of triggers is treated as a block in the DAQ readout, the ROC acknowledges block-based readout rather than individual triggers. In this case, the TD counts the number of blocks sent to the TI and the number of block readout acknowledges from the TI. The TD sets the BUSY if the difference is over the preset number. If the number of triggers per block is set to 1, each block is one trigger. This special case is the lock mode, which is the same as that mentioned in the previous paragraph.
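The event-limit bookkeeping on the TD can be sketched as a pair of counters. The class below is a hypothetical model, not the TD firmware. One interpretation is assumed: since a limit of 1 is described as the lock mode, BUSY is asserted as soon as the number of outstanding units reaches the limit, and a limit of 0 disables the check. A unit is one event, or one block when event blocking is used.

```python
class EventLimiter:
    """Counts triggers (or blocks) sent to a TI and acknowledges received."""

    def __init__(self, limit):
        self.limit = limit  # 0 disables the limit (pipeline mode)
        self.sent = 0
        self.acked = 0

    def on_trigger_sent(self):
        self.sent += 1

    def on_readout_ack(self):
        # Ignore spurious acknowledges that exceed the triggers sent.
        if self.acked < self.sent:
            self.acked += 1

    @property
    def outstanding(self):
        return self.sent - self.acked

    @property
    def busy(self):
        if self.limit == 0:
            return False
        return self.outstanding >= self.limit


# Lock mode: the second trigger must wait for the first acknowledge.
lock = EventLimiter(limit=1)
lock.on_trigger_sent()
busy_after_trigger = lock.busy
lock.on_readout_ack()
busy_after_ack = lock.busy
```

With `limit=1` the limiter is busy after every trigger until the acknowledge arrives, which reproduces the lock mode described above.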

# 3) Sync event (special event)

In addition to the BUSY and event limit setting, the TS can generate a special readout trigger to actively synchronize the system. This special trigger is called SyncEvent. There are three ways for the TS to generate a SyncEvent. First, the TS can periodically mark a readout trigger as the SyncEvent. In this case, the readout trigger keeps its original event type. The period can be set by a VME register, and the SyncEvent is the last event in the readout block. Second, the TS can generate (or insert) a SyncEvent trigger. This trigger may happen anywhere in the data block. This SyncEvent is not correlated with a normal readout trigger, and its event type is zero. Third, the SyncEvent can be generated by the event type lookup tables. In this case, some trigger patterns generate a SyncEvent. This provides a way for hardware to set the SyncEvent.

After sending out the SyncEvent, the TS enters a waiting mode and inhibits further triggers immediately. Upon receiving the SyncEvent, the TI informs the ROC in the crate of the special event and sets the BUSY status. The BUSY propagates to the TS to stop further triggers. After the ROC receives the SyncEvent marker, it reads all the front-end module memory buffers and makes sure that all the data buffers are empty and ready for further triggers. If the ROC detects an out-of-synchronization condition, it may request a SyncReset from the TS through the TI before acknowledging the data readout. Then the ROC sets an acknowledge signal to the TI to indicate that the front-end crate is ready for triggers, and the TI negates the BUSY. After the BUSY signals from all the TIs are released, the TS generates readout triggers again. After the SyncEvent, the DAQ system is re-synchronized. The SyncEvent is a pre-emptive action for DAQ synchronization.
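The SyncEvent handshake can be summarized as a small state model. The sketch below is a simplification under stated assumptions: the class and method names are hypothetical, the fibre and SD/TD fan-out is collapsed into direct method calls, and the optional SyncReset request is left out.

```python
class TriggerInterface:
    """Models only the BUSY status a TI raises for a SyncEvent."""

    def __init__(self):
        self.busy = False

    def on_sync_event(self):
        # The TI informs the ROC and sets BUSY until the crate is drained.
        self.busy = True

    def on_roc_ack(self):
        # The ROC has emptied all front-end buffers in this crate.
        self.busy = False


class TriggerSupervisor:
    def __init__(self, tis):
        self.tis = list(tis)
        self.waiting = False

    def send_sync_event(self):
        # The TS inhibits further triggers immediately.
        self.waiting = True
        for ti in self.tis:
            ti.on_sync_event()

    def can_trigger(self):
        # Triggers resume only after every TI has released BUSY.
        if self.waiting and not any(ti.busy for ti in self.tis):
            self.waiting = False
        return not self.waiting


tis = [TriggerInterface() for _ in range(3)]
ts = TriggerSupervisor(tis)
ts.send_sync_event()
blocked_after_sync = not ts.can_trigger()
tis[0].on_roc_ack()
tis[1].on_roc_ack()
still_blocked = not ts.can_trigger()
tis[2].on_roc_ack()
resumed = ts.can_trigger()
```

The point of the model is that a single slow crate holds off the whole system, so all crates start the next trigger from empty buffers.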

# 4) Sync Reset Request

If some front-end crates are out of synchronization (the ROC has detected an out-of-synchronization condition), the ROC can issue a Sync\_Reset\_Request signal to the TI board. This signal propagates to the TS through the optical fibres, TD boards and SD board. After the TS detects the request, it sets a marker (polling register) to inform the VME controller in the global TCS distribution crate. Meanwhile, the TS will not generate new readout triggers even if the system is not BUSY. The VME controller can then issue a SyncReset command to the DAQ system. After the SyncReset, the system returns to the synchronized state.

# *E. Subdetector partitioning*

There are two ways to partition the detectors. The first way is to partition the TS, so that the TS can do the functions of several smaller TS. This is configured by firmware and software. The second way is to add several subsystem trigger supervisor boards and to add the optional subsystem trigger receivers on the TI boards. This is mainly configured by hardware.

# 1) Partitioning using the TS event type

By encoding a special trigger word on the TS, the DAQ system can be partitioned so that each partition works independently. The 16-bit trigger word is encoded so that bits (15:12) indicate that the trigger word is for partition mode. The lower 12 bits are divided into four groups of three bits each, one group per partition. The three bits in each partition support up to 7 event types, where the code 000 means no trigger in that partition. The TI board can be configured to decode any of the four partitions. The front-end crate is automatically grouped into the partition that its TI board decodes. The TS generates triggers as four mini trigger supervisor boards (sub-TS). Each sub-TS has its own event type lookup table and data stream. The sub-TS works in parallel with the main TS functions.
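The partition word layout can be made concrete with a short encode/decode sketch. Two details are assumptions, since the text does not give them: the marker value placed in bits 15:12 (`PARTITION_MARKER` is a hypothetical placeholder) and the mapping of partition 0 to the lowest three bits.

```python
PARTITION_MARKER = 0xF  # hypothetical value of bits 15:12 for partition mode


def encode_partition_word(event_types):
    """Pack four 3-bit event types (0 = no trigger) into a 16-bit word."""
    if len(event_types) != 4:
        raise ValueError("exactly four partitions are expected")
    word = PARTITION_MARKER << 12
    for partition, event_type in enumerate(event_types):
        if not 0 <= event_type <= 7:
            raise ValueError("event type must fit in three bits")
        word |= event_type << (3 * partition)  # assumed: partition 0 in bits 2:0
    return word


def decode_partition(word, partition):
    """Return the event type for one partition, or None if not partition mode."""
    if (word >> 12) & 0xF != PARTITION_MARKER:
        return None
    return (word >> (3 * partition)) & 0x7


word = encode_partition_word([1, 0, 7, 3])
```

A TI configured for partition 1 would see event type 0 in this word and issue no trigger, while the TIs of partitions 0, 2 and 3 would each see their own event type.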

Each sub-TS can have up to 13 level one trigger inputs. These 13 trigger inputs can generate up to 7 different event types through a lookup table implemented in the FPGA block RAM. The user can choose any five trigger inputs from the 30 GTP inputs, any five from the 30 front panel synchronous inputs (external trigger), and any three from the 15 front panel asynchronous trigger inputs. The seven event types are encoded into three bits. The green sections of Figure 4 show the generation of the sub-TS readout triggers and event types.
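A lookup table of this kind has one entry per input pattern, so 13 inputs need 2^13 = 8192 entries of three bits each. The sketch below fills such a table from a rule list. The rule format (input mask and event type, first match wins) is a hypothetical convention chosen for illustration, not the TS table format.

```python
N_INPUTS = 13


def build_event_type_lut(rules):
    """Build a table indexed by the 13-bit trigger input pattern.

    rules is a list of (input_mask, event_type) pairs. The first rule whose
    mask overlaps the pattern sets the event type. Patterns that match no
    rule keep event type 0, which means no trigger.
    """
    lut = [0] * (1 << N_INPUTS)
    for pattern in range(1, 1 << N_INPUTS):
        for input_mask, event_type in rules:
            if not 1 <= event_type <= 7:
                raise ValueError("event type must be 1..7")
            if pattern & input_mask:
                lut[pattern] = event_type
                break
    return lut


# Hypothetical rules: input 0 gives event type 1, inputs 1 or 2 give type 2.
lut = build_event_type_lut([(0b001, 1), (0b110, 2)])
```

Looking up a trigger pattern is then a single indexed read, which is what makes a block RAM a natural fit for this function.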

There is no sub-TS trigger timing information, nor a sub-TS trigger content word. The sub-TS can also work together with the normal TS, though the normal TS trigger strobe has higher priority than the sub-TS trigger strobe word. The TI can decode both the standard TS trigger strobe word and the sub-TS trigger strobe word. The TI needs to know which sub-TS to enable.

The TS board has event data per readout trigger. The data can be read out by VME A32 D32 access (up to 2eSST). Each sub-TS also has its own event data on the TS board, which can only be read out by VME A24 D32 single-word access.

# 2) Partitioning using the Subsystem trigger supervisor

The shared TI and TD PCB can be configured and firmware programmed as a subsystem supervisor board. It can generate readout triggers like a trigger supervisor, with optical fibre fan-out like a TD. Each subsystem trigger supervisor board can drive up to eight TI boards. These eight TI boards are grouped as a subsystem. The TI uses the optional optical transceiver as the subsystem TCS input. The subsystem partition can co-exist with the global system. This implementation requires more hardware, that is, one subsystem TS and one optical transceiver on each TI board, and each subsystem (or partition) is limited to eight crates. However, this implementation is more flexible, and it does not require changes on the TS.

# F. Subsystem commissioning

By selectively populating the shared TI and TD PCB, the board can be configured as a TImaster, which combines the TI, TD and TS functionalities. The TImaster can generate triggers like a TS board, fan out TCS signals like a TD board, and interface with the front-end crate like a TI board. This is very useful in sub-detector testing and commissioning. Figure 11 is a sample configuration diagram for nine-crate testing/commissioning.




# Figure 11 Subsystem testing/commissioning for up to nine Front End Crates


The first crate has the TImaster board. The TImaster board receives external triggers (either front panel inputs or VME-generated triggers) and generates triggers for the setup. It generates the 250 MHz system clock from either the on-board oscillator or an external clock input. It generates the SYNC commands from the VME controls. It sends out the TCS signals like a TI board through the VXS P0 backplane to the crate, and it sends out TCS signals to another eight TI boards like a TD board. The BUSY signals are merged by the TImaster and used to inhibit the triggers to control the DAQ flow. The ROC readout acknowledges are also collected by the TImaster to control the DAQ process. The other eight crates are standard front-end crates with TI boards in standard configurations. This setup is especially useful when the TS is not available.


# IV. SYSTEM INITIALIZATION

The trigger distribution system needs to be initialized properly to achieve synchronized, low-clock-skew distribution with no trigger loss. Because of the various constraints, the proper order should be followed. The system clock (250 MHz) needs to be set up first, then the SYNC link, followed by the secondary (slower) clocks, then the trigger word link, and finally the status feedback.
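The ordering constraint can be enforced in control software with a simple guard. The sketch below is hypothetical: the step names are labels chosen here for the five stages listed above, and the class is not part of any existing DAQ library.

```python
INIT_ORDER = (
    "system_clock",       # 250 MHz system clock
    "sync_link",          # SYNC link
    "secondary_clocks",   # slower clocks derived from the system clock
    "trigger_word_link",  # serialized trigger word link
    "status_feedback",    # BUSY and acknowledge feedback
)


class InitSequencer:
    """Rejects any initialization step that is requested out of order."""

    def __init__(self):
        self.completed = []

    def complete(self, step):
        if len(self.completed) == len(INIT_ORDER):
            raise RuntimeError("initialization already finished")
        expected = INIT_ORDER[len(self.completed)]
        if step != expected:
            raise RuntimeError(f"{step!r} requested before {expected!r}")
        self.completed.append(step)

    @property
    def ready(self):
        return len(self.completed) == len(INIT_ORDER)
```

A caller would invoke `complete()` after each hardware step succeeds, so that an attempt to set up the trigger link before the SYNC link fails loudly instead of producing a misaligned system.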

To set up the system clock, first the TS clock source is chosen, which can be either the on-board oscillator or the external input source. Two clocks (one for payload slots 1, 3, 5 ... 17, and the other for payload slots 2, 4, 6 ... 16) are sent to the P0 backplane through an LVPECL buffer. The SD receives the clock, which is jitter cleaned by a PLL chip. The TD boards in the global trigger distribution crate receive the clock from VXS P0 (set by a hardware switch) and fan it out to the optical transceivers through an LVPECL buffer. One of the buffer outputs is sent to the AD9510 to generate the clocks used on the TD, in particular for the FPGA. On the TI board, an LVPECL multiplexer is used to select the optical transceiver clock input (set by a hardware switch; the setting can be overwritten by a VME register). The AD9510 is used to generate the clocks for the front-end crate and for its FPGA. The front-end crate can get two independent clocks of 250 MHz, 125 MHz, or 31.25 MHz (set by hardware switches, with no VME register control) through the VXS P0 connector. The TI also outputs a 41.667 MHz clock on the VME P2 connector, which is used for the CAEN TDC. A clock re-sync is required to align the phases of the slower clocks, which is achieved by aligning the TI clocks to the TS slower clocks via the SYNC commands. The SYNC has to be set up before the clocks can be synchronized.
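The slower clocks quoted in this paper are all integer divisions of the 250 MHz system clock, which the following arithmetic sketch makes explicit. The divider values are inferred from the frequencies alone. Whether the AD9510 is actually programmed with these ratios is an assumption.

```python
SYSTEM_CLOCK_MHZ = 250.0

# Divider ratios inferred from the quoted frequencies (assumption).
ASSUMED_DIVIDERS = {
    "125 MHz": 2,
    "62.5 MHz": 4,
    "41.667 MHz": 6,
    "31.25 MHz": 8,
}


def derived_clock(divider):
    """Return (frequency in MHz, period in ns) of the divided clock."""
    freq_mhz = SYSTEM_CLOCK_MHZ / divider
    period_ns = 1000.0 / freq_mhz
    return freq_mhz, period_ns


clocks = {name: derived_clock(div) for name, div in ASSUMED_DIVIDERS.items()}
```

Because each divided clock can start on any of several system clock edges, a divide-by-4 clock has four possible phases relative to the TS. This is why a synchronous re-alignment is needed after the SYNC path is set up.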

SYNC setup: The SYNC is generated by the TS, and the SYNC command is phase aligned with the 62.5 MHz clock on the TS. Because of the format of the SYNC command (one bit of 0 for the start, followed by a four-bit command code, and four bits of 1 for the idle), the delay for the SYNC code to be serialized needs to be set on the TS. This delay is related to the FPGA MGT serializer/deserializer latency. The serialized SYNC code is further Manchester encoded and sent to the SD through the VXS P0 connector. The SD receives the SYNC from VXS P0 and fans it out through a buffer to the TD boards. The TD board receives the Manchester encoded SYNC signal in the FPGA. For proper decoding, the TD phase aligns the SYNC to the 250 MHz clock by adjusting the FPGA input delay (a delay element near the IO block); this is initiated by a VME command. The decoded SYNC command is used on the TD, and is encoded again and fanned out to the TI through the optical transceivers. The TI board receives the Manchester encoded SYNC command and phase aligns it to the 250 MHz system clock. The TI board measures the latency of the TCS signals by sending a pulse to the TD board through the fourth pair of the optical link. Half of this measured round-trip latency is used to compensate the fibre (plus optical transceiver) delays for the trigger and SYNC distribution. The longer the fibre cable, the smaller the delay inside the TI FPGA. With this delay, all the TI boards receive the SYNC at the same time, up to the system clock skew. After the SYNC is set up, a clock re-sync command aligns the TI slower clocks, so that all the TI boards have phase-aligned slower clocks.
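The SYNC framing and line coding can be illustrated with a short sketch. The frame layout (start bit 0, four-bit code, idle bits of 1) follows the description above. The bit order (most significant bit first) and the Manchester polarity (the IEEE 802.3 convention, where 0 is sent as high then low) are assumptions, and the function names are hypothetical.

```python
def sync_frame_bits(code, idle_bits=4):
    """Start bit 0, four command bits (MSB first, assumed), then idle 1s."""
    if not 0 <= code <= 0xF:
        raise ValueError("SYNC code must fit in four bits")
    bits = [0]
    bits += [(code >> shift) & 1 for shift in (3, 2, 1, 0)]
    bits += [1] * idle_bits
    return bits


def manchester_encode(bits):
    """Assumed polarity: 0 -> (1, 0) and 1 -> (0, 1)."""
    halves = []
    for bit in bits:
        halves += [0, 1] if bit else [1, 0]
    return halves


def manchester_decode(halves):
    """Inverse of manchester_encode; rejects pairs without a transition."""
    bits = []
    for first, second in zip(halves[0::2], halves[1::2]):
        if first == second:
            raise ValueError("missing mid-bit transition")
        bits.append(1 if (first, second) == (0, 1) else 0)
    return bits


frame = sync_frame_bits(0xD)  # 0xD is used here as an example command code
line = manchester_encode(frame)
```

Every encoded bit carries a mid-bit transition, so the line is never static even while the link is idle, and the receiver only has to align its sampling phase to the 250 MHz clock.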

After the slower clocks are phase aligned, the trigger link can be set up. At the TS, the proper trigger lookup tables have to be loaded for proper event type generation, and the proper trigger sources are enabled. At the TI, the proper trigger source is set up. The partition is set up if necessary. At the TS, the trigger word is aligned to the 62.5 MHz clock on the TS; at the destination, the trigger word is aligned to the 62.5 MHz clock on the TI. Because the MGTs use the same system clock (250 MHz), the phase alignment can be used for the MGT transmitter to minimize the serializer latency. Unlike for the SYNC command, the TD uses an ADN2805 to resample the serialized trigger word without decoding it. The trigger word is serialized on the TS and deserialized on the TI; the SD and TD boards are pure fan-outs. An MGT reset will align the serializer and deserializers. A system SYNC reset (code 0xD) will reset all the buffers and counters. The SYNC reset will also synchronize the trigger link. The trigger stop command (set through SYNC command 0x7) will force the TS to send idle words on the trigger link, and the TI boards to reset their receiver counters. The trigger start command (set through SYNC command 0x5) will force the TS to send trigger words, and the TI boards to start reading out the trigger word buffer.

To summarize, the trigger distribution start-up procedure is:

- Set up the 250 MHz system clock (TS DCM reset);
- Set up the SYNC path (fibre latency measurement);
- Re-sync all the slower clocks to the TS slower clock (re-sync AD9510, TI DCM reset);
- Set up the trigger word path (trigger tables, MGT reset, trigger distribution idle);
- Synchronize the front-end boards etc. (counters reset, data buffer reset, sync the trigger path);
- System-wide SYNC reset;
- Trigger distribution start;
- Data readout/acknowledge/busy backpressure.

# V. STATUS

# *A. Prototype System Integration*

A prototype of the TCS distribution system was set up and tested. (A small-scale TCS distribution system was also set up and is being tested using the final production boards in Hall D.) The distribution system works. Figure 12 is a picture of the distribution system.



# Figure 12 Setup for the trigger and clock distribution


In this setup, one prototype TS board, five production TD boards, nine production TI boards, one production SD board, one production VXS crate, and two VME crates were used. (In the small-scale production setup, one production TS board, two production TD boards, nine production TI boards, ten production SD boards, and ten production VXS crates are used.) The TS board sends TCS signals to the SD board, and the SD board fans them out to the TD boards in the global distribution crate. Nine fibres connect the five (two) TD boards to the nine TI boards in the front-end crates. This setup can represent the full trigger/clock distribution system.

The TI boards are synchronized, the trigger and clock distribution are synchronized, the trigger throttling works, and the setup is stable. Figure 13 is an example of the trigger outputs from four TI boards. The trigger output skew is less than 4 ns. The four TI fibre lengths are 150 m, 50 m, 4 m and 5 m respectively. We do not expect any problem for the full trigger/clock distribution system.



# Figure 13 Aligned trigger outputs from four TI boards with fibre lengths of 160 m, 50 m, 5 m and 4 m respectively


The TS board takes about 200 ns from receiving a level one trigger, either from the front panel connectors or the VME P2 connector, to generating the readout trigger and event types. The trigger word takes about 230 ns for the TS to serialize and the TI to deserialize. The TI takes about 50 ns to distribute the readout trigger from the deserialized trigger word. The SD and TD boards take about 25 ns to fan out the TCS. The total distribution latency is about 550 ns. The actual trigger distribution system latency will be longer when the fibre delay is added and the trigger matching window is extended.
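The effect of the fibre on the latency can be estimated with a simple budget. The sketch below takes the quoted total of about 550 ns as the base (the four itemized contributions alone add up to 505 ns) and adds a fibre term. The propagation delay of roughly 5 ns per metre is a typical figure for glass fibre and is an assumption here, not a measured value from this system.

```python
ITEMIZED_NS = {
    "TS trigger decision": 200,
    "TS serialize + TI deserialize": 230,
    "TI trigger distribution": 50,
    "SD and TD fan-out": 25,
}
QUOTED_TOTAL_NS = 550   # total distribution latency quoted in the text
FIBRE_NS_PER_M = 5.0    # assumption: typical delay of optical fibre


def latency_with_fibre_ns(fibre_length_m):
    """Quoted electronics latency plus the one-way fibre propagation delay."""
    return QUOTED_TOTAL_NS + fibre_length_m * FIBRE_NS_PER_M


itemized_sum_ns = sum(ITEMIZED_NS.values())
longest_link_ns = latency_with_fibre_ns(150)
```

Under these assumptions a 150 m fibre adds about 750 ns, so the fibre dominates the latency of the longest links.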

With this setup, the event trigger rate reached over 500 kHz, and the system works reliably.

# *B. Hardware status*

The TI, TD, and SD boards have been mass produced and fully tested. The VXS crates and fibre cables will be installed in the experimental halls in 2013. The TS board is fully tested and satisfies the design requirements. Its production will be finished by 2013. The full system will be installed in the experimental halls at the end of 2013, and will be ready for the 12 GeV upgrade experiments.

# VI. REFERENCES

- [1] GLUEX collaboration, the GlueX experiment
- [2] CEBAF references
- [3] Trigger Supervisor, technical note
- [4] Trigger Distribution board, technical note
- [5] Trigger Interface board, technical note
- [6] Signal Distribution board, technical note
- [7] VXS crate, and VXS specification
- [8] Cuvas et al, trigger system description
- [9] Heyes et al, DAQ system description

# VII. GLOSSARY

- ADC: Analog to Digital Converter
- TDC: Time to Digital Converter
- VME: Versa Module Eurocard, ANSI/IEEE 1014-1987
- VXS: VME Switched Serial, VITA 41.0
- TI: Trigger Interface
- TD: Trigger Distribution
- TS: Trigger Supervisor
- SD: Signal Distribution
- GTP: Global Trigger Processor
- CTP: Crate Trigger Processor
- ROC: ReadOut Controller
- DAQ: Data Acquisition
- GLUEX: Gluonic Excitations experiment
- FPGA: Field Programmable Gate Array
- PROM: Programmable Read Only Memory
- LVPECL: Low Voltage Positive Emitter-Coupled Logic
- LVDS: Low Voltage Differential Signaling
- MGT: Multi-Gigabit Transceiver
- MHz: Megahertz, one million hertz
- TCS: Trigger/Clock/Synchronization signals
- ns: Nanosecond, one billionth of a second
- ps: Picosecond, one trillionth of a second
- Mbps: Megabits per second, one million bits per second

## VIII. FIGURE CAPTIONS

- Figure 1 Diagram of the trigger and clock distribution system
- Figure 2 Picture of Trigger Supervisor (TS) printed circuit board
- Figure 3 TS functional diagram
- Figure 4 The Signal Distribution Module
- Figure 5 Trigger Distribution (TD) board
- Figure 6 Trigger Interface card. The TI shares the same PCB design as TD, but the components are populated differently from the TD.
- Figure 7 Trigger Interface card functional diagram
- Figure 8 Trigger synchronization between TIs
- Figure 9 DAQ synchronization
- Figure 10 Subsystem testing/commissioning for up to nine Front End Crates
- Figure 11 Setup for the trigger and clock distribution
- Figure 12 Aligned trigger outputs from four TI boards with fibers lengths of 160 meter, 50 meter, 5 meter and 4 meter respectively