ATLAS-UK Level-1 Calorimeter Trigger Meeting

Tuesday 23 September 2003 at Birmingham

Present: Bruce Barnett, Ian Brawn, James Edwards, Eric Eisenhandler (chair), John Garvey, Norman Gee, Tony Gillman, Steve Hillier, Murrough Landon, Gilles Mahout, Viraj Perera, Weiming Qian, Dave Sankey, Richard Staley, Jürgen Thomas, Peter Watkins, Alan Watson

Agenda


Hardware current and future
CMM status..............................................Ian
CPM 1.0 and 1.5 status..............................Richard
CPM 1.0 testing................................Gilles/Steve
JEM testing at RAL...................................Jürgen
ROD 0.1 status........................................Bruce
ROD 1.0 specification.................................Viraj
Bits, pieces and CANbus...............................Viraj
New TTC decoder and fanout status...................Weiming
LVDS Source Module design...........................Richard
TileCal cables, patch panels and receivers.............Tony

Schedule
Hardware schedule......................................Tony
Test planning........................................Norman

Online and offline software, etc.
Offline simulation.....................................Alan
Calibration discussion at CERN.........................John
ATLAS TDAQ workshop in Lisbon..........................Eric
Online software summary............................Murrough
 
Rehearsal
LECC talk............................................Gilles
 
Any other business
Date of next UK meeting

Hardware current and future

CMM status - Ian Brawn (slides)

Ian reported on the status of the CMMs. There are currently 6 CMMs. CMM 0 (previously labelled CMM 1) does not configure automatically on power up, and because of this there are no plans to make it available for use in subsystem tests. CMMs 1-5 are of a more recent design, and do configure automatically on power up.

CMMs 2-5 have been commissioned. During this process faulty LVDS transceivers were found and replaced on boards 2 and 5, and a bad BGA joint was found on CMM 3. To avoid the bad BGA joint the firmware has been modified (on all CMMs) to reroute the signal concerned onto a spare track. CMMs 2-5 are now available for use in subsystem tests, and CMMs 3 and 4 are already being used in this capacity in Birmingham and the RAL Trigger Lab respectively. Some minor firmware and hardware updates are required, however, and these will be done in the near future. The commissioning of CMM 1 was started, but the board was found to have serious problems with some BGA joints. It has been sent away to have the devices concerned re-assembled.

Regarding firmware, the CP merging firmware is in use in subsystem tests, the Jet-Energy merging firmware is also available but has not yet been loaded on any of the boards, and the Jet-Hit merging firmware is being developed in Stockholm. Ian described his understanding of who is responsible for developing, testing and supporting each variation of the CMM firmware (see slides).

In discussion, Norman and Bruce asked about coordination of CMM firmware and software.

CPM 1.0 and 1.5 status - Richard Staley (slides)

Status: Two fully working CPMs, serial #s 1 and 4. CPMs #2 and #3 had uncorrectable assembly problems due to poor PCB finish and defects with gold/nickel plating. The surface finish was respecified as tin for CPMs from #4 onwards, and this has hopefully solved the problem. CPM #5 was expected back from assembly soon, and will be JTAG tested at RAL next week.

Power Modules: The 1.8 V PM had failed after 1 year. The manufacturer's examination showed a failure very possibly due to overvoltage/current stress of the input circuitry. Previously reported large voltage transients of several volts ringing at 50 MHz, at the input port, were thought to be the cause. However, the transients were later discovered to be pick-up of magnetic fields by the 'scope probe, originating from the internal circuitry, so the cause of stress remains unknown. The replacement PM has run for > 4 months without problem. The next version of the PCB will accommodate the existing design, or optionally a different design of PM.

G-links: Previously reported problems with G-link transmitters on adjacent CPMs were caused by an unstable voltage regulator circuit, feeding (not excessive) noise onto the G-link 5V supply. Now cured on existing design. Next version uses a different circuit, completely isolated from the G-link supply, which will also have extra filtering.

Next version CPM: Addition of bracing bars to aid insertion into crate backplane. Backplane inputs re-routed and clock distribution improved. Uses new TTCdec. Addition of fibre-optic outputs from G-links. Option to switch backplane signal levels from CMOS to SSTL2 standards, giving better noise performance. This new version will be fully compatible with existing hardware, and in the early days will use existing firmware without modification. This will ease the testing of the first modules to be assembled. Two modules seem to be the right number to build.

Timescales: Schematics are with RAL Drawing Office. Assembled modules may appear by Christmas.

CPM 1.0 testing - Gilles Mahout (slides) and Steve Hillier (slides)

Gilles described testing progress on CPM data paths. CPM 1.0 no.4 has been tested. Real-time data have been recovered inside the CP chip downloaded with the scanpath mode, and a TTC scan performed with CPM 1.0 no.2 on its left-hand side. A typical timing window of 2.5 ns with no errors has been observed for the on-board data, and a narrower 1.5 ns for the backplane data. CPM 1.0 no.1 has also been used to test fan in/out data by swapping the two boards. The TTC scan profile is very similar for both of them, showing no problems with the integrity of the signals on the fan in/out tracks.

The next step was to perform another TTC scan by looking at the data inside the spy memory of the CP chip holding the RoI information. For this the CP chip has to be loaded with the algorithm firmware instead of the scanpath. The CP chip clock has been shifted in steps of 104 ps and the result is a very similar pattern to the one observed with the scanpath mode. It is reassuring that the algorithm mode confirms what has been measured with the scanpath mode.
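
As an illustration of how a timing window can be read off such a scan, the following Python sketch takes one error count per delay step and returns the widest error-free window. The function name, the input format and the treatment of the scan as non-cyclic are assumptions made for illustration; only the 104 ps step size comes from the tests described above, and the scan profile is invented.

```python
STEP_PS = 104  # clock shift per scan step, as used in the TTC scans above


def widest_error_free_window(errors_per_step, step_ps=STEP_PS):
    """Return (first_step, n_steps, width_ps) of the longest run of
    consecutive scan steps with zero errors.

    errors_per_step is a hypothetical list holding one error count per
    delay setting, in scan order. The scan is treated as non-cyclic.
    """
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, n_err in enumerate(errors_per_step):
        if n_err == 0:
            if run_len == 0:
                run_start = i  # a new error-free run begins here
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0  # any error ends the current run
    return best_start, best_len, best_len * step_ps


# Invented scan profile: errors at the edges, a clean region in between.
profile = [5, 2, 0, 0, 0, 0, 1, 0, 0, 3]
start, n_steps, width_ps = widest_error_free_window(profile)
```

With 104 ps steps, a window of about 2.5 ns corresponds to roughly 24 consecutive error-free steps.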

Hit information has been checked inside the spy memory of the readout controller for the DAQ. A TTC scan has also been performed by moving the clock of the CP chip. The profile observed is a superposition of the profile seen inside the CP chip, i.e. a pattern with a 6.25 ns period, superimposed on top of the errors generated by the hit merger logic. The hit merger superimposes an error area of 12 ns, corresponding to the time needed by the logic to perform the complete merging of all the 16 thresholds. This happens because the hit multiplicities are not clocked at the output of the hit merger block.

The CPM was then integrated with a CMM and hit information was recovered inside the spy memory of the CMM. A TTC scan has been performed, but both clocks, driving the serialisers and the CP chips, were shifted together, taking care to stay in the error-free area of the CP chips. The error area seen inside the CMM spy memory is similar to the one first observed inside the ROC DAQ, i.e. a window of 12 ns. The test has been repeated by moving the CMM to the opposite end of the crate, and the TTC scan shows an identical profile (the shape is different as the data used were different), shifted by 500 ps compared with the closest position, one slot away from the CMM.

The main tests left are to perform some BER tests on the new CPM, and study the crosstalk between boards and across the backplane. Tests of each individual backplane slot could also have been done, but the CPM is a rather fragile board and removal and insertion need to be done with great care.

Steve described further testing, concentrating on G-links, DAQ and RoIs. Before this summer, the G-link streams coming out of the CPM had only really been tested at a low level. There had been no real checking of data at a bit level, or that events being output corresponded to those expected with a level-1 accept on a particular tick. This type of test could only be done with a reliable G-link sink and a controlled L1A generation scheme, which only recently became available.

The first step in the tests happened in mid-August when the CPM was taken to RAL for a few days and data were passed through a ROD and the S-Link packets recorded on a DSS for comparison with the expected output. These tests were already very successful, with RoIs being captured correctly, and the DAQ data being broadly correct. Some problems were detected and solved at this stage.

More detailed tests were performed back in Birmingham where another L1A generation setup was built, and a G-link sink on a DSS used to collect the G-link data from the CPM. A couple of hardware problems with the G-link operation were discovered and fixed, so now both fully populated CPMs should work with respect to G-link output as well as real-time data. Several minor problems with the firmware were found and fixed, and stable operation was established by starting runs with a reset of the playback memory pointers and by obtaining stable offsets for the readout data FIFOs. There was also much progress in understanding operation of the TTCvi, and writing software to control the new hardware operations.

In summary, CPM G-link data, both RoI and DAQ (with multiple slices) was tested for long periods with a wide variety of different L1A patterns. L1A rates of up to 130 kHz were tried, and also modes where L1As were seen only 5 ticks apart. The G-link data output by the CPM was found to be correct in all cases.
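
To put these L1A rates in terms of ticks, the short Python sketch below converts a mean rate into an average spacing and finds the closest pair of L1As in a pattern. The 25 ns tick length is an assumption (a nominal 40 MHz clock, not stated in these minutes), and the function names and the example pattern are invented for illustration.

```python
TICK_NS = 25.0  # ASSUMED nominal tick length (40 MHz clock)


def mean_l1a_spacing_ticks(rate_hz, tick_ns=TICK_NS):
    """Average number of ticks between L1As at a given mean rate."""
    return 1e9 / (rate_hz * tick_ns)


def min_spacing_ticks(l1a_ticks):
    """Smallest gap, in ticks, between consecutive L1As in a pattern.

    l1a_ticks is a hypothetical sorted list of tick numbers carrying an L1A.
    """
    return min(b - a for a, b in zip(l1a_ticks, l1a_ticks[1:]))


spacing = mean_l1a_spacing_ticks(130e3)            # about 308 ticks on average
closest = min_spacing_ticks([0, 5, 320, 325, 900])  # invented pattern
```

Under this assumption, 130 kHz corresponds to an average of about 308 ticks between L1As, so the 5-tick spacing tests probe a much tighter case than the mean rate alone.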

JEM testing at RAL - Jürgen Thomas (slides)

Jürgen reported on recent tests of the readout of the JEM prototype JEM 0.1 at RAL since the QMUL Joint Meeting. JEM 0.1 is an old-style prototype with 11 InputFPGAs and 88 single-channel LVDS deserialisers. The real-time data path for the energy summation has already been tested successfully at RAL with 16 LVDS input channels from a DSS. The readout tests have been hampered by one of the two G-Links being defective, so the second one, the 'ROI' G-Link, has been used, connecting it to a ROD carrying JEM Data firmware and using a DSS as S-Link sink. This setup represents a slice-test-like JEM slice data readout. A new set of firmware has been provided by Uli, which now also allows readout of channels fed by the playback memories within the InputFPGA. The setup and simulation are configured and controlled by the RunControl.

Standalone debugging of the readout firmware has been performed by Cano and Uli in Mainz using spy memories within the ROC FPGA, and comparing the output to the G-Link-Stream data file from the simulation. At RAL, the test with playback memories was used to debug the channel mapping, which is now nearly correctly described in jemSim. A problem in the phi direction is to be fixed. For a constant data pattern (different values in each channel, but same set for all events), the readout stream matches the simulation for a few thousand events. A binary counter pattern shows a constant offset in the readout values, which is due to incomplete adjustment between the hardware and simulation, but otherwise works as expected for 30,000 events.
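
The kind of check that distinguishes a constant offset from genuine data errors can be sketched as follows. This is a minimal Python illustration, not the comparison code used in the tests: the function name, the list-based inputs and the optional wrap-around modulus are all assumptions, and the example data are invented.

```python
def constant_offset(readout, expected, modulus=None):
    """Return the offset d such that readout[i] == expected[i] + d for
    every event, or None if no single offset fits.

    If modulus is given, values are compared modulo that number, which
    suits a counter that wraps around. Both inputs are hypothetical
    lists of per-event integer values.
    """
    if len(readout) != len(expected) or not readout:
        return None
    diffs = set()
    for r, e in zip(readout, expected):
        d = r - e
        if modulus is not None:
            d %= modulus
        diffs.add(d)
    # Exactly one distinct difference means a constant offset.
    return diffs.pop() if len(diffs) == 1 else None


# Invented example: an 8-bit counter read out two counts ahead of the simulation.
expected = [i % 256 for i in range(1000)]
readout = [(i + 2) % 256 for i in range(1000)]
offset = constant_offset(readout, expected, modulus=256)
```

An offset of zero means the two streams agree exactly; a non-zero constant points to a timing or alignment adjustment rather than corrupted data.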

The communication with the Merger Module has also been tested, with e/gamma firmware loaded into the CMM. The ROD is then connected to the CMM, again using a DSS as the S-Link sink. This test has not been successful yet. Data patterns appear at the correct place in the readout pattern, but do not match the expected value. The JEM 0.1 has now been sent back to Mainz for G-Link repair.

The next testing arrangements need to be discussed with all people involved. A vital point for future tests is the availability of the Jet Algorithm for a JEM 0.x module. Tests at RAL will continue with stand-alone debugging of the energy summation firmware of the CMM, using onboard playback/spy memories. The completion of the simulation and module services is required, including the energy summation crate and system merging code. On the physics test vector issue, initial modifications to TrigT1Calo have been made by Ed at CERN to dump physics data, which needs to be run together with Atlfast-Athena to produce test vectors.

Norman asked for the operation of the missing-energy lookup table to be documented in physics terms.

ROD 0.1 status - Bruce Barnett (slides)

Bruce presented an overview of the current status of these modules and their firmware. The status of existing firmware is looking much better, in general. CP-Data, CP-RoI, JEM-Data and Cmm-CP-Data are all in use in testing, with their data-feeds from CPM, JEM and CMM variously. Work on status bits in the S-link header is required. In the case of JEM-RoI firmware, the functionality is not yet correct, and feedback to James is required.

There are inconsistencies and errors in the specifications of these formats, and work is urgently needed on rationalisation of those formats. Also, no work has been done on the remaining four firmware designs, due to a lack of specification.

Concerning the hardware, RODs 1, 2, 4, and 7 have been tested extensively, but will soon require new G-link cards and TTCdecs. Modules 3, 5, 6, and 8 are in the slice test-bed. They are loaded, one each, with the previously mentioned firmware variants.

The CPM, JEM and CMM integrations have proceeded with various levels of software maturity. CPM integration is furthest along and in good shape. JEM is well along, and CMM still needs software/firmware integration – although the ROD firmware status in this case should be ok.

ROD readout issues approach the critical path. Readout via ROS is required quite soon. New high-performance S-link (HOLA/FILAR) and platform hardware (PCI-X box) have been ordered.

ROD 1.0 specification - Viraj Perera (slides)

The ROD specification was reviewed on the 1st of July.

The updated block diagram now shows optical inputs to the ROD. To save front panel space STRATOS dual optical receivers will be used on the ROD module. Since there is currently no in-circuit programmability of the compact flash, the compact flash card is accessible from the front panel. All FPGAs will be XCV2VP20-5FF896 devices.

Currently, actions and recommendations arising from the PDR are being sorted out. We need to sign off the specification soon so that serious design work (four months) can begin, to have a module available early next year. To that end, the body of the specification document should be released for re-review immediately, with formats and software models to follow soon after.

The number of modules to build is 2–3.

Bits, pieces and CANbus - Viraj Perera (for Adam Davis) (slides)

The CANbus circuit on the CPM has been tested. The interface to the temperature sensors on the FPGAs is via SMBus, and the sensors are connected to the ADC port of the micro-controller. These were read out successfully through to the TCM. The test was done with two CMMs in the crate as well as moving the CPM to slots 3, 4, 9 and 11. Accemic software has been used to monitor and debug the code. The only problem has been the inability to read the 11-bit CANbus ID, which comprises the module number, crate number and module type and is connected to ports 7 and 8, due to incompatibility of signal levels. All signals to the micro-controller must be 5V TTL. There are also other recommendations which should be followed for the next iteration of the CPM.
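
As an illustration of how an 11-bit ID can combine three fields, the Python sketch below packs and unpacks a module number, crate number and module type. The field widths (5, 3 and 3 bits) and their bit order are invented for this example and are not taken from the CPM design; the function names are likewise hypothetical.

```python
# ASSUMED field layout: the widths and bit order below are invented for
# illustration; the minutes only state that the 11-bit ID combines the
# module number, crate number and module type.
MODULE_BITS, CRATE_BITS, TYPE_BITS = 5, 3, 3  # 5 + 3 + 3 = 11 bits


def pack_can_id(module, crate, module_type):
    """Pack the three fields into an 11-bit identifier."""
    for name, value, bits in (("module", module, MODULE_BITS),
                              ("crate", crate, CRATE_BITS),
                              ("module_type", module_type, TYPE_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    return ((module_type << (MODULE_BITS + CRATE_BITS))
            | (crate << MODULE_BITS)
            | module)


def unpack_can_id(can_id):
    """Split an 11-bit identifier into (module, crate, module_type)."""
    if not 0 <= can_id < (1 << 11):
        raise ValueError("not an 11-bit identifier")
    module = can_id & ((1 << MODULE_BITS) - 1)
    crate = (can_id >> MODULE_BITS) & ((1 << CRATE_BITS) - 1)
    module_type = can_id >> (MODULE_BITS + CRATE_BITS)
    return module, crate, module_type


can_id = pack_can_id(module=11, crate=2, module_type=1)
fields = unpack_can_id(can_id)
```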

TTC decoder and fanout - Weiming Qian

The TTC fanout modules (5) are being manufactured. The TTCdec cards (15) are about to be sent out to manufacture; these include the alternate crystal clock requested by Uli.

LVDS Source Module design - Richard Staley (slides)

Work resumed on building an LVDS Source Module that will drive a large number of serial LVDS links with 480 Mb/s data. There are a few issues needing discussion before the specification is finalised and the module designed.

Some possible choices were shown, with their implications on the number of channels provided. Richard's preference is for a 6U module using Virtex2 FPGAs to serialise data onto 88 links. To help make the choices, Richard should talk to Uli.
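
For a sense of scale, the arithmetic behind the 88-link option can be sketched in Python as below. The link rate and link count come from the text above; the 40 MHz clock used to express the rate in bits per tick is an assumption, and the function names are invented for illustration.

```python
LINK_RATE_BPS = 480e6   # serial rate per LVDS link, from the text
N_LINKS = 88            # links in Richard's preferred option
CLOCK_HZ = 40e6         # ASSUMED nominal 40 MHz system clock


def bits_per_tick(link_rate_bps=LINK_RATE_BPS, clock_hz=CLOCK_HZ):
    """Bits sent on one link during one clock period."""
    return link_rate_bps / clock_hz


def aggregate_gbps(link_rate_bps=LINK_RATE_BPS, n_links=N_LINKS):
    """Total serial output of the module in Gb/s."""
    return link_rate_bps * n_links / 1e9


per_tick = bits_per_tick()   # 12 bits per link per tick under the assumption
total = aggregate_gbps()     # about 42.24 Gb/s across all 88 links
```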

TileCal cables, patch panels and receivers - Tony Gillman (slides)

Eric, Weiming, and Tony visited CERN to study LAr receivers using TileCal signals, and to define the muon trigger patch-panel. Using the TileCal calibration pulser in lieu of beam, they confirmed the overall signal chain gain at 10 mV/pC into long cables. The default (power-up) Rx voltage gain is ~0.8 when the output is correctly terminated. Saturation effects were measured – the TileCal electronics saturates at ~300 pC, and the LAr Rx output saturates at ~3.3 V. Post-pulse overshoot from AC-coupling of unipolar TileCal pulses is ~1%, with a decay time-constant of ~10 microsec. With occasional beam, they measured a TileCal electronics calibration of ~11 mV/GeV for electrons (and so ~9 mV/GeV for pions). Conclusion: the LAr receiver design can be used unchanged for TileCal.

LAr and TileCal use different grounding schemes – the muon trigger patch-panel will be used to "equalise" them for receiver modules. An alternative Rx grounding scheme was proposed by Weiming to reduce common-mode noise - V. Radeka's opinion was sought, but he did not like it. The patch-panel is being designed by Yuri Ermoline at CERN. It will consist of 64 unpowered 9U VME modules (not very deep); we plan on a prototype ~November 2003.


Schedule

Hardware schedule - Tony Gillman (slides)

Tony discussed the pre-PRR schedule, noting that the critical dates have changed very little since the last meeting (July). Preparation for Slice Tests has progressed well – full system could be ready by Jan 2004. The plan is to complete Slice Tests in April/May 2004 with the new (pre-production) module designs. Critical items are the new 9U ROD and first PPM. We think we could have a PPM by Christmas, but we don't know how long it will take to test it in Heidelberg. The 9U ROD is clearly going to be late, so we should plan on doing as much testing as possible with the 6U RODs. We need an update on the JEM-1 schedule.

Steve asked whether we need an optical receiver module for the 6U RODs. The answer is that all of the new modules will have a facility for installing electrical G-links.

PRRs in Q3/4 2004 are considered essential to complete the production/installation/commissioning phase by Q4 2006. The PRR for PPr ASIC/MCM will be combined – either in Oct/Nov 2003 or after evaluation of the redesigned ASIC and new MCM batch; we have asked Philippe Farthouat which is preferred. The FDR for LAr/TileCal receivers will be in Nov 2003, with the PRR in Jan 2004 – both during LAr Weeks – and we will participate. The main FDR/PRR activity will start ~May 2004, and module production ~Aug 2004.

Test planning - Norman Gee (slides)

Norman discussed planning of tests for the next year. Several modules (CPM, CMM, JEM, ROD, DSS) plus supporting software and infrastructure, are essentially stable, and we are starting system integration tests. Several parallel threads should be worked on up to Christmas:

New modules, including PPM, CPM-1, JEM-1, ROD-1 (9U) and CTPD/CTP, should be added when they are ready, to make a full system for the beam test in 2004.

The JEP subsystem urgently needs firmware and software (both JEP and CMM) particularly for jets, where there is a lot of work to do.

The CTPD should be used in a limited way, mainly to reproduce system timing. Detailed discussion is needed with the CERN group over software and firmware for this. Many other tests (some listed on slides) are needed when the main CTP is available to us.

We should aim for concentrated test-beam like tests, for a few days or a week each.

In general there is a lot of software work, and we need more testers. Maybe a training programme is needed. We should be careful to maintain systematic records from now on.


Online and offline software, etc.

Offline simulation - Alan Watson

Alan reported briefly on the offline simulation. It is now possible to unpack LAr and Tile towers. There have been problems with the calibration of noise in the LAr, which are still being investigated; Tile is ok.

Calibration discussion at CERN - John Garvey (slides)

John reminded us of the calibration systems available, and then described a discussion he had at CERN with Pascal Perrodo. The use of Local Trigger Processors (LTP) allows either calorimeter or trigger to control calibration runs, in a very flexible way. Our needs are different from those of the calorimeters (among other things we must verify the summing electronics), so it seems that the trigger will need separate calibration runs from the calorimeter. John will update his draft calibration note, and it is clear that discussions with the calorimeter groups must continue.

When a calorimeter cell or trigger tower has a problem, we must have a procedure for deciding what to put into its lookup table. This is not straightforward: it depends on how many cells feed the tower sum, where the tower is in the calorimetry, and whether we are trying to optimise e.m. or hadronic performance.

ATLAS TDAQ workshop in Lisbon - Eric Eisenhandler (slides)

Eric summarised what was known about the programme of the Lisbon ATLAS TDAQ Workshop, 23–29 October. See his slides. It was felt that the agenda should have been worked out further in advance, but this had not happened this time because of intense TDR activity. It looks as if only one or two people will be going, which is not good.

Online software summary - Murrough Landon (slides)

The recent work by software developers has in many cases been more focused on testing than development. Exceptions are work related to L1A generation, control of the TTC, and some related enhancements to the IGUI panels.

A number of topics have been under discussion on the software mailing list and in smaller groups. These include changes to test vector generation and number of readout slices, to make it easier for non-experts to control these.

There has also been a discussion on TTC broadcast commands and the limitations of the DSS, where only two of the available six bits are wired. Some changes to longstanding proposals need to be agreed and implemented.

Some major items such as developing more automated timing calibration and setup procedures and multistep runs need to be done. Also moving to readout via the ROS is becoming more urgent.


Rehearsal

LECC talk - Gilles Mahout (slides of final version)

Gilles gave a rehearsal of his talk on the CPM testing for the LECC Workshop.


Any other business

None.


Date of next UK meeting

Not yet fixed; will be after the November Joint Meeting at RAL.


Eric Eisenhandler, 10 October 2003