The ATLAS Level-1 Calorimeter Trigger



ATLAS Level-1 Calorimeter Trigger
Joint Meeting at RAL
8–10 November 2001

AGENDA

WEDNESDAY 7 NOVEMBER AFTERNOON and THURSDAY 8 NOVEMBER MORNING

Software Meeting

  • Organised by Murrough Landon. Minutes are available here.

THURSDAY, 8 NOVEMBER, AFTERNOON SESSION

ATLAS reviews

  • PPARC LHC GPDs Mid-Term Review - Eric Eisenhandler (pdf)
  • FDR/PRR for TileCal summing amplifiers - Eric Eisenhandler (pdf)
  • ATLAS System Status Overview - Eric Eisenhandler (pdf)

Calorimeter signals and cables

  • LAr cabling problems and solutions - Steve Hillier (pdf)
  • LAr and TileCal receiver situation - Eric Eisenhandler
  • Compilation of calorimeter pulse shapes - Paul Hanke (pdf)
  • Rack layout - Murrough Landon (pdf)

Preprocessor

  • PPr-ASIC: simulations, status, and test plans - Ralf Achenbach (pdf)
  • PPr-MCM: status and test plans - Werner Hinderer (pdf)
  • Preprocessor testing - Karsten Penno (pdf)
  • PPM readout overview and status - Dominique Kaiser (pdf)
  • PPM, AnIn and PPr-ROD status and timescale - Paul Hanke (pdf)

THURSDAY, 8 NOVEMBER, EVENING

Management Committee meeting

  • Chaired by Eric Eisenhandler - summary presented on Friday morning

FRIDAY, 9 NOVEMBER, MORNING SESSION

  • Summary of Management Committee meeting - Eric Eisenhandler

Cluster Processor

  • Cluster Processor FPGA and Generic Test Module - James Edwards (pdf)
  • Cluster Processor Module status and test plans - Gilles Mahout (pdf)

Jet/Energy-sum Processor

  • JEM status, plans and timescale - Uli Schäfer (pdf)
  • JEM jet algorithm - Sam Silverstein (pdf)

FRIDAY, 9 NOVEMBER, AFTERNOON SESSION

Common modules and backplane

  • Common Merger Module, cables & adapter modules status and timescale - Ian Brawn (pdf)
  • CP/JEP backplane and crate status and timescale - Sam Silverstein (pdf)
  • CP/JEP ROD prototype test status - Bruce Barnett (pdf)
  • Timing Control Module status and timescale - Adam Davis (pdf)

DCS etc.

  • Fujitsu development and other options - David Mills (pdf)
  • Report on DCS workshop at NIKHEF - Uli Schäfer (pdf)
  • CTPd patch panel for slice tests - Tony Gillman (pdf)

Physics simulation, etc.

  • Trigger simulation and the ATHENA framework - Ed Moyse (pdf)
  • PC cluster at Birmingham - Alan Watson (pdf)

SATURDAY, 10 NOVEMBER, MORNING SESSION

Online software component status summaries

  • Overview, run control and databases - Murrough Landon (pdf)
  • Software modules - Bruce Barnett (pdf)
  • Test vectors and trigger simulation - Steve Hillier (pdf)
  • HDMC and Heidelberg plans - Oliver Nix (pdf)
  • Readout issues and preparations for T/DAQ workshop - Norman Gee (pdf)

Slice test planning

  • Thoughts on slice test organisation - Norman Gee (pdf)
  • Overall timescale and milestones - Tony Gillman (pdf)
  • Open discussion on possible "show-stoppers" - Tony Gillman (pdf)

Summary

  • Main issues, highlights & lowlights of the meeting - Eric Eisenhandler (pdf)

 

MINUTES

THURSDAY, 8 NOVEMBER 2001, AFTERNOON SESSION

ATLAS REVIEWS
(minutes by Eric Eisenhandler)

PPARC LHC GPDs Mid-Term Review - Eric Eisenhandler

The slide is available here (pdf).

Eric briefly described this review by the UK funding agency. It covers ATLAS and CMS ("General Purpose Detectors"), and looks at progress on deliverables and the remaining cost and effort needed to complete UK items. It is not concerned with deep technical details. In principle, it could result in money and effort being moved between ATLAS and CMS, or between the three UK projects in ATLAS: level-1 calorimeter trigger, level-2 trigger, and semiconductor tracker. In the case of level-1, we have asked to convert some capital into more engineering effort.

FDR/PRR for TileCal summing amplifiers - Eric Eisenhandler

The slides are available here (pdf).

The Production Readiness Review for the TileCal trigger summing amplifiers is now scheduled for 10 December at CERN. The Rio group and Bill Cleland are among the participants. We have also been invited, and will probably participate by video link, with a subset of Paul Hanke, Gilles Mahout, Steve Hillier and Eric. Some of the issues are: performance and wiring of the connector at their end – this is needed to design the receiver interconnection daughter cards; choice of the long cables and who specifies, buys and tests them – the situation looks messy; and various calibration issues.

ATLAS System Status Overview - Eric Eisenhandler

The slides are available here (pdf).

Eric began by mentioning that the ATLAS Comprehensive Review by the LHCC in July had produced no comments at all about the calorimeter trigger.

The ATLAS System Status Overview review of ATLAS T/DAQ took place on 16 October. It was primarily concerned with project management and interfaces to other ATLAS systems, not internal technical details. As the review day was tightly packed, our two speakers (Paul Hanke on the Preprocessor, Tony Gillman on the Cluster and Jet/Energy Processors) submitted drafts of their talks in advance to our chief reviewer, Bill Cleland. We also sent him various other materials, and there were exchanges with him by email before the ASSO day. The outcome of the ASSO is a set of actions, some on us and some on other systems. These are aimed not only at picking up points that need improving in our own project, but also at helping us by requiring other systems to do various things (and in some cases were suggested by us).

Some items that were discussed with Bill did not appear in the final ASSO document. They included an agreement to combine PRRs for Preprocessor ASIC, MCM and PPM and to hold this after the slice test rather than before it. The ASSO document starts with some comments on various items before moving into the actions. The actions ask us to list effort working on online software, and the DIG to name software contacts to help us with the slice tests. Connections between the LAr receivers and the Preprocessor should be documented (we intend to go right back to the calorimeters), and the LAr group should specify who will build the LAr receivers. Connections between the TileCal and the Preprocessor should be documented, and we must name the group who will build the Tile receivers. Both we and the two calorimeter groups must name contact people concerning our use and requirements of the calorimeter calibration systems. More details are given in the slides.

CALORIMETER SIGNALS AND CABLES
(minutes by Steve Hillier)

LAr cabling problems and solutions - Steve Hillier

The slides are available here (pdf).

Steve summarized the recent problems and discussions concerning the cabling from the LAr receivers to the PPMs. Initially concerns had been raised by Gilles and Steve over the post-PDR version of the PPM specifications. One genuine problem was identified with the input to the CPM because of the 'round the clock' ordering of signals arriving at the PPM front panel. This had resulted in some modifications to the signal routing on the PPM, which was now corrected in the new version of the PPM specifications. Another problem occurred with the signals at the edge of the eta range of the CPMs, and again this was corrected in the specifications, although it did not need a PPM modification, just a change to the requirements on the input signal ordering to the PPM.

Seemingly the most serious problem was concerned with the situation at negative eta. If the input signals were assumed to be reflected, mirroring the detector layout, then the consequences for all parts of the CPM and JEM system were considerable, since the architecture assumed translational symmetry across eta. Because of some uncertainty about the signal configuration, a meeting was arranged with Bill Cleland at CERN. This meeting turned out to be very useful, although it transpired that all of the problems we had encountered could have been solved by the very flexible design of the LAr receiver station re-ordering daughter-boards.

The receiver system itself is well documented and the cabling clearly specified; all that remains to be done is a precise enumeration of the re-ordering interconnect boards. The translation from the detector symmetry into our coordinate system had already been catered for, and all that needs to be done is to specify our exact cabling needs clearly. This will be done in a document that the ASSO also identified as necessary. It will specify the cabling from the detectors through to the PPM, including receiver stations and octopus cables. Specifications for the Tile receivers must also be documented.

LAr and TileCal receiver situation - Eric Eisenhandler

Bill Cleland is keen to build the receivers for both the LAr and Tile Calorimeters at Pittsburgh, but there are some funding and management issues to be cleared up before this can be confirmed. Eric felt that this would be a very good development, as we have a good working relationship. However, the situation would start to look worrying if Bill did not get the funding. Paul said that Bill would be visiting Heidelberg on 26/27th November for discussions.

Compilation of calorimeter pulse shapes - Paul Hanke

The slides are available here (pdf).

Paul reported on the recent work of Hasko Stenzel, who had been gathering analogue pulse shapes from the test-beams for the EM Barrel, HEC and Tile calorimeters. He has pictures of over 400 signals. Firstly the various detector test-beam setups and beam types were described. Paul noted that as the EM End-cap signals should be essentially identical to the barrel, all detectors bar the FCAL have been observed in final, or close to final, production versions. The HEC saw-tooth calibration was seen to now have the correct upper limit of 3.0V as required. Some signals were recorded at the front end and also after 70m cables - in the EM Barrel case, the cables were the final Saclay cables, and a prototype receiver station was almost used.

The analogue signals were recorded on a digital oscilloscope with a time resolution of 200ps and a signal length of 2µs, except the EM barrel, where a different oscilloscope was used (400ps for 1µs). The differential signals were converted to unipolar using the oscilloscope. In general the signals looked roughly as expected. Paul continued by showing some details of some typical signals.

On the Tile Calorimeter, it was noted that there appeared to be a reflection effect on the long signal. Also there appeared to be no undershoot after the pulse. The HEC signal, on the other hand, appeared to have a larger undershoot than expected. On the EM barrel signals, some study of linearity could be done with different signal sizes. The peaking time was about 50ns, which is sharper than anticipated. Some traces for calibration signals had also been recorded, including saturated pulses. The saturated calibration signal appeared to have a slower rise time - there was speculation that this may be due to some peculiarity of the calibration system, but it was felt to be worrying as this behaviour would invalidate our saturated-signal logic. Saturated signals in the Tile Calorimeter and HEC appeared far better behaved. Signals taken from the HEC with the 70m cable also showed high attenuation. Finally there was an illustration of signals with high pile-up, which were very messy.

In summary, Paul showed a table of all the data collected, and said that the files and documentation would be made available for everyone's use, and in particular for the video RAM input to the PPM. It was commented that this was a very useful piece of work, and we should feed back questions about the signals to the calorimeter communities if there were genuine problems.

Rack layout - Murrough Landon

The slides are available here (pdf).

Murrough presented the developments on the rack layout since the Mainz meeting, as well as summarizing our current position. Confirmation had been obtained that the wall was definitely there. The layout as presented at Mainz had been given to Chris Parkman, with the addition of an extra rack of signal monitoring for Bill Cleland. Pictures of the layout were shown, where the main part of the trigger processor is situated in a run of 14 racks. A run of 9 racks on the other side of the barrack is still reserved for level-1 use. Chris Parkman will also need to know about our internal cables - currently he has information just on the LAr input cables and nothing on Tile or internal requirements. The shifted layout provides a reduced latency assuming that we can use the central hole for input cables, but currently this is reserved for magnet cables.

The outstanding issues concern the layout of receiver and PPM crates, particularly for the Tile Calorimeter where the number of receiver crates is still undecided. The specification for Tile cables would soon be needed, and detailed latency optimizations of the layout should be done.

Paul Hanke was unhappy with the receiver/PPM positions relative to the wall. The signal cables are likely to be inflexible, and therefore the small space between the racks and the wall would become intolerable. Many options were considered, but all have doubts or drawbacks. If the processors move, then we must also ask for the CTP on the level below to move in order to minimize latency. Spreading the system over more racks impacts on latency, and it was not clear how much flexibility there was to move racks backwards or forwards. It was suggested that some drawings of cable layout with realistic bending radii should be made. Sam suggested the CP/JEP crates could be made less deep, but this did not solve the PPM issue. No final decision was taken; the current design was proposed as our best solution, but we should reserve the right to change it if it became clear that it would not work. It was also suggested that we go down to look at USA15 as soon as possible (September 2002?).

PREPROCESSOR
(minutes by Oliver Nix)

PPr-ASIC: simulations, status, and test plans - Ralf Achenbach

The slides are available here (pdf).

Ralf gave an overview of the PPr-ASIC history. He mentioned that there was an ASIC-Lab internal review of the PPr-ASIC during the summer. Timing and routing issues did not satisfy the reviewers. The layout of the chip had to be redone. The new layout is symmetrical and four new power pads have been added. In addition, the RAM block placement has been optimized and logical errors in the Verilog coding affecting the readout of the BCID decision bits have been detected.

PPr-MCM: status and test plans - Werner Hinderer

The slides are available here (pdf).

Werner reported that 6 MCMs have been delivered. Three of them have been partly assembled in-house. Two were sent to the HASEC company for evaluation. Power tests performed with the 3 MCMs assembled in-house revealed one defective ADC. Werner explained that, should it become necessary to use the LVDS serializer version running at 40-60 MHz, those would also fit on the MCM.

The planned test setup for the MCM (and ASIC) test was introduced. A test board is currently under development and is realized as an adapter board to bridge the ASIC, which will not be ready at the time the MCM tests start.

Preprocessor testing - Karsten Penno

The slides are available here (pdf).

Karsten, who started a few weeks ago as a diploma student at Heidelberg, reported on his plans for helping to set up the slice test environment at Heidelberg. As a first step, he intends to write software to use a general-purpose PC video card to generate test pulses as analogue input to the preprocessor system. This video card will be used to reproduce pulses taken in the test beam and to generate pulses from K. Mahboubi's analytical pulse-shape formulae. A sketch of a planned test setup for the full PPM was shown.

PPM readout overview and status - Dominique Kaiser

The slides are not available here (pdf). This talk was actually given on Friday morning.

Dominique reviewed the principles of the so-called Pipeline Bus used to read out the trigger information from the individual PPM modules. He explained in depth the protocols and the data formats used to establish the ring-like Pipeline Bus. Besides the Pipeline Bus there are also interfaces to VME and CANbus. To establish the readout chain he intends to use existing hardware from the demonstrator system. This should be sufficient to read out 2-4 PPM modules. Presently he is in favour of not using the Pipeline Bus for sending command words to the PPM, but rather of using the VME interface for that.

Finally, bandwidth considerations were addressed. When running at the maximum L1-accept rate of 100 kHz and reading out 5 raw samples, 275 MByte/s of bandwidth is needed on the Pipeline Bus. Restricting the readout to 3 samples, 207 MByte/s still have to be read out, and 106 MByte/s need to be transported via the Pipeline Bus if the readout is limited to one raw sample. Dominique concluded that with a bus clock of 60 MHz we will be able to read out 3 samples without causing additional dead time due to Pipeline Bus bandwidth limitations.
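
To put the quoted figures in context, the required bandwidth is simply the per-event data volume multiplied by the L1-accept rate, so the numbers above can be turned back into implied event sizes. The short Python sketch below only reworks the figures given in the talk; it assumes nothing about the actual PPM readout format.

    # Rework of the bandwidth figures quoted above: at the 100 kHz maximum
    # L1-accept rate, bandwidth = (bytes per event) x (accept rate), so each
    # quoted MByte/s figure implies a per-event readout volume.
    L1_RATE_HZ = 100_000

    quoted_bandwidth = {5: 275, 3: 207, 1: 106}  # raw samples -> MByte/s

    for n_samples, mbyte_per_s in quoted_bandwidth.items():
        event_bytes = mbyte_per_s * 1e6 / L1_RATE_HZ
        print(f"{n_samples} raw sample(s): {mbyte_per_s} MByte/s "
              f"-> about {event_bytes:.0f} bytes per event on the Pipeline Bus")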

PPM, AnIn and PPr-ROD status and timescale - Paul Hanke

The slides are available here (pdf). This talk was actually given on Friday morning.

Paul reported that the design of the PPM board had started recently and that all commercial parts of the PPM are now fixed. Concerning the AnIn daughterboards, he reported that we moved to a CMC connector to improve the crosstalk behaviour. The LVDS tests are way behind our intentions. In Heidelberg we are going to start to buy all components which are technically fixed and where the price gradient is supposed to be small.

FRIDAY, 9 NOVEMBER 2001, MORNING SESSION

Summary of Management Committee Meeting - Eric Eisenhandler

The main points covered in this meeting were:

  • Purchase of G-link chips for the final system: This has now been done by Mainz; they are at CERN. The UK would like to pay its share by February 2002; as much as possible will be converted into items to buy for Mainz rather than "cash", to be sorted out by Uli and Norman.
  • Budget situation: There was a general discussion in which we went around the table to get budget status and problems in the three countries.
  • Keeping informed between joint meetings: Tony has suggested trying fortnightly telephone conferences, with at least one person from each institute, in order to get an overall view of progress and problems that allows for questions and dialogue. They should be short, and doing it by telephone is much easier to set up, cheaper and more reliable than video.
  • Who writes which firmware: There is a shortage of effort to write firmware, so the question is how to allocate the work for different versions needed for common modules. Mainz will work on the JEM, and Stockholm also possibly have some diploma students for that. The UK should write all ROD versions, and also the jet version of the CMM since it is similar to the e.m./tau. After seeing how things are going we could decide on the energy-sum merging - Mainz would be the first option.
  • Online software effort: There has been some improvement at Mainz and Heidelberg, but we are still short of people and a big problem for new people is lack of continuity.
  • Dates and locations of the next meetings: The next two meetings will be in Heidelberg and Stockholm. After the joint meeting, the dates were fixed to 14-16 March in Heidelberg and 4-6 July in Stockholm.

CLUSTER PROCESSOR
(Minutes by Gilles Mahout)

Cluster Processor FPGA and Generic Test Module - James Edwards

The slides are available here (pdf).

James briefly recalled the function of the GTM. Two big FPGAs are placed on the board, one of them carrying the CP chip algorithm under test. The other FPGA delivers formatted data to test the algorithm. There are two configurations for the source of the test vectors: either BlockRAM inside the FPGA itself or memory. So far James has been debugging some minor problems with his own test vectors. He has clock-calibrated all 108 of the 160-MHz input channels and checked that they work correctly by processing 256 vectors. The calibration process consists of scanning the channels with different delays, deducing the correct clock and setting all the channels to this clock. When using different sets of test vectors, written by Ian and stored in the memory, different problems appear and the debugging is not obvious. He envisages using the ScanPath configuration of the CP chip to help in this task. In any case, the initial debugging process with the GTM has been very successful for the CP algorithm and should save time when we get the CPM.
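
As an illustration of the delay-scan idea (not the actual GTM test code), a minimal sketch is given below; the hardware-access function and the number of delay steps are invented placeholders, while the channel and test-vector counts are taken from the talk.

    # Hedged sketch of a delay-scan clock calibration: try every delay setting
    # on every channel, keep the settings for which all channels pass the test
    # vectors, and set all channels to the centre of the widest good window.
    N_CHANNELS = 108       # 160-MHz serial input channels (from the talk)
    N_DELAY_STEPS = 32     # assumed number of selectable delay settings
    N_VECTORS = 256        # test vectors per scan point (from the talk)

    def errors_at_delay(channel: int, delay: int, n_vectors: int = N_VECTORS) -> int:
        """Stand-in: run n_vectors through one channel at one delay setting and
        return the number of mismatches (would talk to the GTM in reality)."""
        raise NotImplementedError

    def common_best_delay() -> int:
        """Return the centre of the widest delay window good for every channel."""
        good = [d for d in range(N_DELAY_STEPS)
                if all(errors_at_delay(ch, d) == 0 for ch in range(N_CHANNELS))]
        windows = []                      # group consecutive good settings
        for d in good:
            if windows and d == windows[-1][-1] + 1:
                windows[-1].append(d)
            else:
                windows.append([d])
        if not windows:
            raise RuntimeError("no delay setting works for all channels")
        widest = max(windows, key=len)
        return widest[len(widest) // 2]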

Cluster Processor Module status and test plans - Gilles Mahout

The slides are available here (pdf).

Gilles gave a quick history of the CPM board over the past few months. The board is now in the RAL drawing office, the netlist has been checked and 25% of the components have been placed and routed. This gives a total of 24,000 pins, with 32 BGA packages (8 fine pitch) and 16 layers. After showing the actual layout, there was some concern about power dissipation, but nothing has been estimated so far. In order to help in the rework process, if there is any, best efforts have been made to keep all ICs on one side of the board. In the end, only four buffers end up on the solder side, at the edge of the board. A local company, 40 minutes from Rutherford, seems confident about doing the rework process on such 9U boards. A list of test points was also given.

The board should be in the lab in February 2002. Preparations for testing are under way. A new CPU board, the same as at RAL, has been installed at Birmingham. Some CPM firmware has been put on CVS; more needs to be added. Dedicated S/W routines are under study with the help of HDMC and new packages introduced by the S/W group.

JET/ENERGY-SUM PROCESSOR
(Minutes by Gilles Mahout)

JEM status, plans and timescale - Uli Schäfer

The slides are available here (pdf).

Uli reminded people of the JEM architecture. It processes 88 LVDS links from the PPr and the board consists of a big FPGA doing the main Jet/Energy logic, a readout controller block, a G-link chip, and a connector to receive an ELMB card. He gave a short summary of the JEM0 "saga": of two PCBs manufactured, only one was assembled, but the FPGA device assembled was either misplaced or wrong. After several attempts to correct it, this board is still not working. It is going to be sent for X-ray inspection anyway.

It turns out that the present specification of the board (a design from an early production run, not to final specs) cannot be used with the Spartan-II device. A new board has been designed in less than a month, but the assembler did not feel confident about doing the job and refused it, claiming that the board is too big. Uli found the help and feedback of the company very poor in trying to understand the difficulty of such a board.

Viraj gave him a list of requests from a UK company intended to help in the rework process. Among them, it is suggested to have all IC components on one side and to take care over the finish of the pads. Contact has been made with another assembly company in Hamburg, which deals with military work.

The work on firmware is slow as they lack human resources. The latency of the jet algorithm is 4 b.c. The input FPGA needs some updates with Ian. Andrea is going to work on the readout controller. The FCAL handling has to be modified due to the revised allocation of FCAL channels.

Finally Uli gave some consideration to the next JEM board. Testing of connectivity on the new board is very important, and a JTAG/boundary-scan tester has been bought. Since the board is too big and the present company cannot handle 9U boards, he proposed different ways to cope with this, either by dealing with smaller replaceable parts or by using daughter cards. These options will increase the cost of the board; another could be to go for non-fine-pitch BGAs. A choice of new, higher-density serial-link chips would make it possible to have no chips on the solder side. The previous setup had chips on both sides to minimise track lengths. Three serial chip candidates are available, but only one seems quick and easy to implement as it is already compatible with the actual setup: the DS92LV1260. Tests will be needed to check that they can sustain running beyond 40 MHz (the 40.08 MHz LHC clock). Uli expects to have two boards ready at the beginning of the slice test, and two more during it. Paul added that the PPr boards are expected in April 2002.

JEM jet algorithm - Sam Silverstein

The slides are available here (pdf).

Sam presented a new implementation of the jet algorithm using VHDL coding. Re-starting from scratch enables some fresh ideas. It also gives better interaction with Mainz people by sharing the same tool (FPGA Advantage). Finally, it's a good example to justify our effort to use FPGAs in the system.

The new design is more compact by a factor of 2. Sam gave examples of savings made by eliminating duplicated operations where possible. Two examples were shown: one concerns the adder-tree process and the other involves comparators in the cluster-finding algorithm. The jet algorithm consists of producing energy sums of 2x2 jet clusters, 3x3 clusters and/or 4x4 clusters. These different granularities of clusters are taken from the 11x7 jet elements forming the region covered by a JEM. To create these clusters, the adder-tree logic proceeds step by step, starting by grouping 2 jet elements together, then 3, and finishing with groups of 4 elements. Addition operations can be saved in the 3x3 clusters by combining sums of 2x2 elements calculated in the first step with single jet elements. In the same way, 4x4 clusters are created by just adding existing 2x2 clusters together. A gain can also be made in searching for a local maximum, by noticing that when a jet element is compared with a neighbour, the same comparison is needed again when dealing with that neighbour; the same comparator can therefore be shared for both operations.
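
A minimal Python sketch of the sum-reuse idea follows, assuming that a 4x4 sum is decomposed into four non-overlapping 2x2 sums; the real JEM firmware is written in VHDL and may group the elements differently.

    # Illustrative sketch of reusing intermediate sums in the jet adder tree:
    # 2x2 sums are computed once and then reused, so a 4x4 sum costs 3 extra
    # additions instead of 15, and a 3x3 sum costs 5 instead of 8.
    def cluster_sums(elements):
        """elements: 2D list of jet-element energies (e.g. the 11x7 JEM region)."""
        ny, nx = len(elements), len(elements[0])

        # First step: every overlapping 2x2 sum, computed directly.
        sum2 = [[elements[y][x] + elements[y][x+1] +
                 elements[y+1][x] + elements[y+1][x+1]
                 for x in range(nx - 1)] for y in range(ny - 1)]

        # 4x4 sums reuse four non-overlapping 2x2 sums (3 additions each).
        sum4 = [[sum2[y][x] + sum2[y][x+2] + sum2[y+2][x] + sum2[y+2][x+2]
                 for x in range(nx - 3)] for y in range(ny - 3)]

        # 3x3 sums reuse one 2x2 sum plus the five remaining single elements.
        sum3 = [[sum2[y][x] +
                 elements[y][x+2] + elements[y+1][x+2] +
                 elements[y+2][x] + elements[y+2][x+1] + elements[y+2][x+2]
                 for x in range(nx - 2)] for y in range(ny - 2)]

        return sum2, sum3, sum4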

These new designs are all written, except the RoI result reporting. The synthesis should decide which device to choose, the larger the better, but Sam estimates it will be less than 300,000 gates. He points out that he will need to interact with other people's work, such as integrating the design in the main JEM FPGA and producing test vectors to test the algorithm. Steve mentioned he has written a test-vector generator package and they should use it.

So far, the FCAL algorithm has not been implemented. There was some concern that we might need more than just one threshold per FPGA, for example one more for the FCAL algorithm. This could be easily done by adding registers in the FPGA. The test of the algorithm is envisaged to be done by two students with the help of HDMC. Oliver warned that the students need to be good and Sam explained that they come from instrumentation.

 

FRIDAY, 9 NOVEMBER 2001, AFTERNOON SESSION

The first part of this session actually took place in the morning.

COMMON MODULES AND BACKPLANE
(Minutes by Sten Hellman)

Common Merger Module, cables & adapter modules status and timescale - Ian Brawn

The slides are available here (pdf).

Ian overviewed the module and test set-ups. The module requires relatively large devices (XCV1000E, 660 pins), since the local merging requires a lot of I/O and the system-wide merging requires large block RAMs. While the current CPM version of the merger module uses device resources at the level of 70% or below, the jet processing will use close to 100% of I/O blocks.

Work on the schematics has been started and is estimated to finish next week (= week starting 12 November); the PCB will have 12 layers. For cables there was a discussion with the CERN team, which favoured a different cable from the SCSI-3 cable proposed by Ian. Since there were difficulties getting information on the CERN-proposed cable, it was decided to go ahead with the SCSI-3 cable. The cables go to a rear transition module, which will be laid out soon.

The firmware for the CPM version is essentially ready. Ian estimated that provided that he got correct information on formats etc. there was a fair chance that the UK team could assist in developing the firmware also for the other dialects.

Discussion:
Paul: Do you have any previous experience routing high-density signals into a fine-pitch BGA? Ian: No
Sam: The SCSI-3 cable is OTS, but what about lengths, are we within the specifications? Viraj: Up to 10 m should be ok. Ian: And this is a temporary solution; for the final system we need to look into cables which are halogen-free.
Sam: We should also look into strain-relief, these are relatively heavy cables. (This comment relates to the new design of braces for the backplane, in which the horizontal strengthening bars are no longer present.)
Tony: What is the impedance of the input tracks? Ian: 60 Ohms.
Tony: So this matches the backplane but not the cable? Ian: Yes, there is a mismatch, but these cables are terminated, and we run at 40 MHz, so this should not be a problem.

CP/JEP backplane and crate status and timescale - Sam Silverstein

The slides are available here (pdf).

The contract for manufacture of the backplanes has been awarded to APW Electronics, in the UK. Following discussions with the company, some modifications to the original design have been made; for example, the number of layers has increased by two to 18, to provide for chassis ground in the outermost layers. This has led to some modifications to accommodate the thicker backplane. In addition, simplifications have been made to the power busbars and bracing hardware. Sam explained the modifications to the bracing system, which might imply difficulties in supporting cables for strain relief without additional hardware. The crates are in hand and the connectors are handled by APW. Production should start "soon", with a production time of 5-6 weeks.

Discussion:
Tony: What type of tests are foreseen? Sam: APW tests connectivity and impedance. In Stockholm we will do tests to document impedance, transmission delays and crosstalk properties of the backplanes.
Norman: Can that be done after first shipments? Sam: Yes, since we are running "critical" the plans are to prioritise assembling and shipping crates with backplanes to labs which are waiting for them. Then tests will proceed in Stockholm.
Norman: We need details of the new bracing scheme. Sam: I will send drawings as soon as I get back to Stockholm, and update documentation.

CP/JEP ROD prototype test status - Bruce Barnett

The slides are available here (pdf).

The CP/JEP ROD prototype has been tested in its CP incarnation in two modes. In standalone mode, CPM input and output to the ROS/RoI-builder were emulated with DSS modules. In addition to the standalone DSS tests, a setup (slide 4 of the talk) was used where the second ROD S-Link CMC location hosted an S-Link Source Card (LSC - ODIN) which fed into a test data sink. The test data sink consists of a SLIDAD, which sits on a VME board (the SLIMOD) that also holds an S-Link Destination Card (LDC): the data from the S-Link LDC is sent into the SLIDAD, which provides control signals such as the link-full flag (LFF). The LFF may be toggled by a clock (LEMO-00 input) to force the S-Link LDC to assert XOFF (which propagates back to the ROD S-Link source card).

The prototype was also tested with external interfaces, where input was emulated by DSS modules while output was sent to the RoI-builder prototype (at CERN) or the ROS front-end (initially at CERN, now at a test setup duplicated at RAL). Several lessons have been learnt about the ROD prototype itself; these are described in the copies of the talk.

Another important lesson was on the procedure itself. Bruce insisted that keeping track of encountered - and solved - problems is essential, in particular where different people are involved in module testing and writing the firmware. In the RAL context two types of forms are used:

  1. Route Card: for recording module-specific problems and fixes during initial inspection, testing and the operational phase.
  2. Problem Report: to record common design problems and solutions.

The outline for the immediate future is:

  • There is convergence in firmware debugging
  • Now, on to the last fix (or is it two ...)
  • Test zero-suppression thoroughly (RoI/slice)
  • Soak tests (the hard stuff): some errors observed at the level of a few per 10,000,000
  • Comprehensive test vectors.
  • DSS modifications: hardware fragment check, wraparound (cycling without s/w intervention).

Discussion:
Pete: Can you stop to catch low-rate errors? Bruce: At the moment the s/w just counts errors; there is no dump of these events. This will be implemented when the system gets "ticking over".
Steve: Is there an obvious set of "hardest" test vectors? Bruce: A random ramp is probably the most useful thing?
Steve: So what you have is what you need basically? Bruce: One would perhaps want "something more random" (this could already be in the existing software, will be checked).
Norman: I would like to emphasise the usefulness of the lists of problems, we should apply this to all our modules.

Timing Control Module status and timescale - Adam Davis

The slides are available here (pdf).

Adam described the test of the first populated board (out of 7 manufactured). Several errors have been found and corrected. Some of these, being caused by incorrect net-lists and affecting the VME decoder and the register decoder, require wire modifications. Of these 9 modifications, 7 can be removed by changing the address decoding so that it is instead done using three enable signals. The test programme is now proceeding with the modified module; what remains is to test the CAN controller and to make sure that the module works in the new crate. The timescale depends on a decision on how many of the boards should be populated, and whether the CAN interface can be made operational.

Discussion:
Sam: This module does not seem to have guide pins for insertion, which is desirable for modules of this depth. There seems to be room on the board for this though. Also the front panel is not designed for the IEEE standard inject/eject tools which will be needed. These require 2 cm at both the top and bottom of the front panel. (During the discussion it also became apparent that the CPM needs to be looked over with this in mind.)
Paul: We should decide whether to go ahead with the 7 modules. Tony: We only need 3-4 for the slice test. The proposal from the meeting was to finish one module with wire-mods and the modified scheme for address decoding, and then take stock at that point.

DCS etc.
(Minutes by David Mills)

Fujitsu development and other options - David Mills

The slides are available here (pdf).

Dave reported that work on the Fujitsu is progressing and that a demo system exists that can read the ADC and send the data over the CANbus. The system works by continuously polling the ADC, since as soon as the code is changed to use interrupts from the ADC it stops working. There is a hardware layer, not yet finished, and an application layer based largely on NIKHEF ELMB code.

He suggested two possible alternative solutions to the Fujitsu, since it is poorly supported:

  1. The Analog Devices ADuC812, with an SPI interface to a separate CAN controller,
  2. The Microchip PIC18Cxxx, a microcontroller with on-chip ADCs and a CAN 2.0B interface.

It was suggested that it may be possible to use the ELMB by adding another CAN interface to it via its SPI bus.

Dave will continue working with the Fujitsu micro, so that some test code can be written to enable diagnostics of the prototype boards that already have a Fujitsu micro on them.

Report on DCS workshop at NIKHEF - Uli Schäfer

The slides are available here (pdf).

Uli reported on the ATLAS DCS workshop, held 10-12 October at NIKHEF. He showed the standard DCS system consisting of the ELMB, an OPC server, and a SCADA system. He explained how non-standard devices can communicate with the DCS.

He summarised some of the presentations given there:

  1. A separate DAC board, SPI-interfaced to the ELMB, is available at 17.5 CHF per channel.
  2. The MDT is interested in a sort of bridge to allow for an increased number of nodes per CANbus. Mainly this is a CAN node power supply issue; CAN signal integrity is hoped to be improved by attaching the CAN host adapter in the middle of the CAN branch on a short stub.
  3. There was a long presentation on radiation hardness: the ELMBs are radiation tolerant, not radiation hard. In the TCC2 test (SPS target area) some non-destructive faults were observed; PVSS crashes and NICAN hang-ups (on non-irradiated hardware) were also observed.
  4. Henk reported on the ELMB firmware status. All on-chip devices are supported. Devices can be excluded by compile options. All further configuration is via option setting in flash. ELMBio V3.4 (ELMB control software) due soon. Supports DACs and improved diagnostics (checkpoint ELMB status to CAN).
  5. TGC (several reports given by Ronen, Shlomit and Nachman): their baseline solution is 1700 ELMBs on main boards that extend the capabilities of the ELMB. On-chip ADCs are used rather than the high-precision board-level ADCs. External control is via JTAG and I2C. The firmware is based on the NIKHEF code, which was extended to support special devices. A special long-message protocol has been devised. They have the complete DCS chain (hardware, firmware, software) up and running. They question the use of an OPC server and report on a PVSS driver used instead. They are considering the use of CAN/CAN bridges with a redundant path to improve reliability. Nachman is considering replacing the ELMB/mainboard solution by a custom board derived from the ELMB.
  6. An RPC talk covered checking for SEUs in FPGAs and re-loading flash memories through the ELMB.
  7. There were several talks on the DAQ-DCS interface (DDC) which is, however, not of particular relevance for the calorimeter trigger.
  8. Uli reported on the various DCS options for the Calorimeter Trigger and on LAr DCS issues. The purity monitors will use an OPC server included in the Labview software package.
  9. The MDT will use non-ELMB CCD readout for optical alignment, as well as 1200 ELMBs.
  10. The RPC will use a large number of ELMBs and will connect external devices via SPI, I2C and JTAG.
  11. A relatively large DCS system has been under test, with 1k analogue channels read out successfully at an ADC conversion rate of 15 Hz. That seems to be the maximum speed that can be obtained without data losses. The 3 OPC servers in use on ATLAS (CANopen, Wiener, CAEN) were presented. They all work, though it seems there might be problems running them concurrently on the same CPU.
  12. Training courses on PVSS are available from CERN Technical Training. There exists an extensive collection of documentation available from itcowww.cern.ch. Support is available from pvss.support@cern.ch . The newsgroup CERN.PVSS should provide additional help in solving PVSS problems.

The PVSS JCOP framework was presented. It is meant to customise PVSS for applications of PVSS in HEP. Surf the itcowww documentation before you start to write PVSS code! The status in short:

  • PVSS works though there are still weaknesses in documentation, stability and database issues. It was felt that the DCS data should be exported to a DAQ database-compatible format to allow for usage along with DAQ data.
  • The OPC servers work, but some people dislike them since they require Windows-based computers in the DCS chain.
  • The currently used NICAN host adapter is unstable, slow, and unsuitable for use on ATLAS. There doesn't yet exist a generally accepted solution, though.
  • There exists a document on test results for CAN adapters; it is to be posted on the IT-CO web site.
  • Most detectors use both ELMB-based (~5000 ELMBs) and non-ELMB hardware.
  • Most of the ELMBs are now not equipped with high-resolution ADCs.

And finally, the calorimeter trigger needs to find people to do the PVSS software. A decision must be taken on the CAN hardware to be used on the trigger processor modules. In the December 6th DCS ASSO we will have to present our DCS interfaces!

CTPd patch panel for slice tests - Tony Gillman

The slides are available here (pdf).

The old CTPd has to be used in the slice tests, and so an interface must be built to allow the current version of the CMMs to connect to it. The interface has to convert the LVDS signals from the CMM to ECL signals for the CTPd. A schematic overview of the interface is shown in the slides. It will be built by Yuri Ermoline.

PHYSICS SIMULATION
(minutes by Ed Moyse)

Trigger simulation and the ATHENA framework - Ed Moyse

The slides are available here (pdf).

The e.m./tau trigger has been largely rewritten, following a "mini code review" with Murrough Landon. The jet trigger is nearing completion. To aid in debugging, Ed has written a "TriggerSpace" visualisation and test tool, which uses the actual key classes to generate a map of "trigger space". It will (time permitting) be extended to test actual trigger algorithms, but it has already helped find subtle rounding errors.

The CTP simulation is now being worked on by Thomas Schoerner-Sadenius. Ed and Thomas had a brief meeting at CERN to ensure tight integration between his code and Ed's.

Ed commented that Athena is still needlessly unfriendly for new users and documentation is sparse. Ed has emailed David Quarrie stating his opinion that the Athena webpage is not helpful, and sent URLs to tutorials etc. to put there if David agrees. So far there have been no changes to the web page, and no sign of further documentation.

PC cluster at Birmingham - Alan Watson

The slides are available here (pdf).

Birmingham decided to apply for a Joint Research Equipment Initiative (JREI) grant. JREI bids give money for equipment in partnership with industry (where partnership often means discounts). The proposal was for a PC farm of 30 dual 1.7GHz P4s (+ 4TB RAID) and was intended for trigger optimisation. The proposal was approved, and now Birmingham must plan in detail how best to use it. Plans include checking trigger menus and making the farm accessible to other Calorimeter Trigger groups as a Grid-like application. It could also be used to contribute to the ATLAS data challenge, and look at physics simulation studies.

The hardware itself may change before purchasing, as the bid is based on May 2001 costs, and work needs to be done to ensure that there is software and expertise ready for the farm, but in any case it will give us the capability to do serious studies ourselves.

 

SATURDAY, 10 NOVEMBER 2001, MORNING SESSION

ONLINE SOFTWARE COMPONENT STATUS SUMMARIES
(Minutes by Bruce Barnett)

Overview, run control and databases - Murrough Landon

The slides are available here (pdf). This talk was actually given on Friday afternoon.

Murrough presented a summary of technical progress in the area of online software. Although real progress is being made, manpower constraints and mutually competing tasks have an impact on the rate of completion of documentation and prototyping work. This said, most areas of responsibility are covered in some way, with notable exceptions including calibration, hardware monitoring, event monitoring, distributed histogramming and DCS/SCADA. Of these, calibration is probably the most urgent to address. Responsibilities for such things as librarian, system management and the website are currently distributed but need to be considered for the system as a whole.

Work in the future will focus on initial (single module) and small system (several module) tests. For the second of these, integration with TDAQ backend software will need to take place. This should pave the way to the slice tests where calibration and [some level of] hardware monitoring and DCS will be necessary.

Manpower levels are about 5 FTEs, with the happy addition of Thomas to the team. It is hoped that a number of students (Heidelberg, Stockholm) will become involved.

Murrough then outlined some of his personal progress in the areas of run control and (configuration and calibration) databases.

Discussion:
It was questioned whether the run control model is complete, and in particular whether some modules (ROD) need to be at a separate level from the other modules, as is the case with the TTCvi.
Ed asked what was the strategy for CMT, mentioned as 'under investigation' in Murrough's talk. Basically the L1calo strategy is, apart from pursuing some trials, to wait for the online s/w group to move that way first, assuming they will resolve the necessary technical issues and provide an evaluation.
Oliver inquired what needs were present in the areas of librarian and system-management responsibility. For the latter, uniform cluster configuration and maintenance of L1 distributed systems is the need. (A web-page resulting from discussions between Murrough and Bruce, combining their experiences, may be found on the L1Calo software site.) The librarian role is required to oversee the release, tagging and distribution of the group of packages which constitute the L1 Calorimeter trigger software.
The question of platform support arose (Bruce and others). The gamut is de facto constrained by the online group (Solaris/Sun, Linux/CES, Linux/PC), of which only the last is currently encountered in the L1 calorimeter trigger. It was felt that an additional platform (and endianness) should be included if possible. The HP would be a candidate if continued use of that platform in Heidelberg were foreseen; Oliver indicated that this was not the long-term plan.
Concerning management of the project, Bruce suggested that the time had perhaps come to make use of MS Project in understanding the scheduling and constraints of the software. It was agreed that this might help us meet the demands.
Pete asked what was preventing the release of design documents. Murrough replied that one or two were probably ready [enough], but others were not. These should be signed off, but Bruce pointed out that discussion and agreement on the technical content of the documents should be a prerequisite. Norman elaborated, suggesting that people should be nominated to discuss the requirements documents: the documents should be defended, perhaps with external participation, before approval.
Eric pointed out that documents should be classified. Some should be reviewed and others are really more "request for comment" type documents. Norman emphasised that those which define the architecture should be reviewed.

Software modules - Bruce Barnett

The slides are available here (pdf).

Bruce presented a talk aimed at reinforcing in people's minds the need for module and submodule constructs in the diagnostic (HDMC) and DAQ software of the L1 calorimeter trigger. He argued that weaknesses in the HDMC parts syntax complicate implementing submodules in a general, coherent way, and expressed the concern that this issue needs to be kept on the agenda.

He stressed that maintaining a coherent view of the DAQ and diagnostic sides of things is essential in building a manageable system. To this end, he sees the "Part Management" services of HDMC as crucial, in the sense that they interact with both the Hardware Access Layer (HAL) and the configuration database (in its definition, if not yet in implementation). In order to achieve an implementation, he felt that the configuration database approaches had to be rationalised, and a module/submodule code generator (analogous to regbuild) needs to be implemented.

As there was no discussion, it is assumed that there was no strong disagreement.

Test vectors and trigger simulation - Steve Hillier

The slides are available here (pdf).

Steve addressed this topic, which has been one of the foci of his recent activity. The scope of the work is to predict data at readout level and provide a common test-vector generation interface. The work has been extended since Mainz, and has been used extensively in the debugging of the CP-Chip FPGA firmware. Steve identified a number of areas which merit additional work.

Steve commented on the VHDL debug loop in which he has participated with James in the context of the CP-FPGA design. This black-box approach to the firmware leaves something to be desired, in particular as it is necessary to get as much as possible right in the VHDL before the hardware appears, in order to ensure faster implementation of the hardware.

Steve spent an insightful few minutes reflecting on the "mythical FTE". It is necessary for each of us to remain aware that the reduction in the available effort which accompanies extensive personal multi-tasking may be significant if attempts to keep things in perspective are not successful.

Discussion:
Sam asked what tools other than ChipScope are available. There are (or should be) unused pins which have been brought out from the FPGAs and these would allow some real-time debugging.
Steve reflected on the separation of function Engineer/Physicist, and questioned whether this was optimally effective. Bruce agreed, reflecting that he had wondered whether we at RAL have the balance correct.
Eric wondered how code inspections of firmware might help in propagating the understanding of the details of the firmware (and in confirming its quality).
Norman questioned how test vectors could best be used. Monitoring logical test points in the FPGA (by creating test-vector outputs corresponding to the situation at various points in the FPGA) might prove (have proven) effective. Murrough raised this point again a little later. It is clear that the balance between checking connectivity and detailed functionality has to be achieved.
John questioned how best to get into the VHDL world. Sam thought it wasn't hard. One needs a logic analyser, but of course a detailed understanding of the functional elements of the FPGA design. The working model elsewhere, however, is somewhat different than at RAL.

HDMC and Heidelberg plans - Oliver Nix

The slides are available here (pdf).

Oliver gave an overview of his directions and observations within the group. He sees HDMC as providing diagnostic abilities that are key to the debugging and initial testing of the hardware, and in addition (with some reorganisation) as continuing to provide low-level services to the DAQ software.

In addition, Oliver mentioned the need to further the firmware development of the PPr-ROD, and indicated that he planned to take responsibility for this.

Oliver has participated in the requirement/documentation effort which the software group has undertaken, and expressed particular interest in encouraging and developing a light but relevant software process.

Oliver indicated the main areas of software and firmware work that face the Heidelberg group in the near future, and outlined the available manpower.

Concerning computing infrastructure, Oliver agreed to participate in some common system-management strategy. He offered to help, too, in organising the computing infrastructure required for participants in the slice tests planned for summer 2002.

NOTE: None of the remaining speakers could show their slides due to a power cut.

Readout issues and preparations for T/DAQ workshop - Norman Gee

The slides are available here (pdf).

Norman considered the readout issues which will confront us at the slice tests. It is clear that a number of triggering issues remain to be resolved, in particular in how the real hardware modules interact with those which emulate the flow of real time data into the system.

He discussed the anticipated range of tests, and their mandates.

Next Norman addressed the readout requirements for the slice tests. With a large number of S-links (9) and a large total data throughput (close to 400 MByte/s) it is clear that major attention must be paid to this question. The DSS is able to handle real-time data transfer at high rate, but the issues revolve around the question of how to check/analyse errant fragments: a process which to some degree, at least, involves computer activity.

Norman plans to raise the question of our readout requirements, and how best to address them, in the presence of the DIG at the NIKHEF T/DAQ workshop.

SLICE TEST PLANNING
(Minutes by Steve Hillier)

Thoughts on slice test organisation - Norman Gee

The slides are available here (pdf).

Norman presented his view of how the testing of modules through to system tests should proceed in the next year. For each module, everything from simple tests through to full functionality in various timing and throughput regimes must be tested to determine whether the system is robust. A systematic approach will be needed to ensure that nothing is missed. In order to do this, clear test plans with detailed environment and conditions should be written and reviewed, along with keeping the user documentation up to date. Progress on these test plans should be documented, possibly on the web. Libraries of tests and test vectors must be built up which could be repeated to exercise old or new modules at any time.

Moving on to integrated tests, and using the recent ROD testing as a guide, it should be anticipated that problems and unexpected faults will occur, so we must be prepared to be flexible. However, once a good degree of module functionality was in place, Norman proposed working in bursts of activity with agreed goals, before going back to fix up any new problems identified. These 'runs' would probably be of about 2-3 weeks, similar to a test-beam schedule. Oliver asked if this meant transferring hardware between Heidelberg and the UK - Norman thought this probably would be necessary.

Norman's final recommendations were to improve the documentation of the modules, with up-to-date specifications and user guides, define error recovery procedures, and develop test plans. It was thought that well-defined targets would be good for the software development, and that we need to be strict in our information logging as would be the case in a test-beam situation.

Overall timescale and milestones - Tony Gillman

The slides are available here (pdf).

Tony presented the current status of most of the modules - a few new components had been delivered since the Mainz meeting (e.g. the TCM), but the critical items were still the CPM and the backplane, which seemed to be proceeding at about the same pace, with both predicted to be available in about February 2002. Paul commented that the PPM ASIC was probably similar in timescale. The schedule was still consistent with the slice test starting at the end of June 2002, but Tony questioned whether this was realistic. The slippage since Mainz had not been as drastic as in the previous four months, and the production phase should be predictable in length, but to Tony the major doubt now was the length of the sub-system testing phases, especially in light of the ROD tests. There was also doubt about the readiness of the software for integrated tests. Nevertheless, Tony felt optimistic that the slice tests would start some time in summer 2002.

Tony then pointed out that we should be clear about our goals, which are to prove system feasibility before the PRRs. He warned that it should not be completely open-ended in order to limit the duration - there is now little room for slippage before full production. Questions were raised about how specific modules fit into the PRR schedule - for example can we really have a PPM PRR without a module-0 JEM? Also the module-0 ROD is still needed to run with the slice-test setup before it can have a PRR. The modules would have to have their PRRs on a one-by-one basis as their full functionality is demonstrated.

Open discussion on possible "show-stoppers" - Tony Gillman

The slides are available here (pdf).

It had been suggested that we should look at areas where we are potentially vulnerable and to see if we could plan for backup solutions if necessary. Tony had been making a first-pass attempt to identify the highest risk factors, and classify them in terms of low/medium/high risk and impact. He had done this in six areas: cost-to-complete, effort, trigger performance, production, installation and commissioning, and operation and maintenance. The individual risks are detailed in the transparencies and a summary is given here.

On cost, the major problem could be the increase in component prices, in particular the PPM ASIC - however in general costs of components decrease, although care must be taken to order older components before production ceases (e.g. G-link). There could be cost factors involving 9U board production and re-work. However Tony felt that overall cost increases were a low risk factor.

Effort is a bigger problem. It is already clear that software, firmware and engineering effort is lower than would be desirable, and it is not obvious that the situation will improve. Overall this appears to be a high risk factor, both in probability and impact, and there are no good fall-back solutions.

Tony identified many potential problems for the trigger performance, including fine-pitch BGA assembly, crosstalk, connector problems with high insertion forces, and system-wide timing issues. Another issue is a low-level bug in the ASIC - Paul said this would not be a problem for hardware if identified at the slice test, but re-coding might be a problem. Paul pointed out that a final latency that is too high could be another show-stopper.

On production, Tony pointed out that there could be problems with low yield both on components (e.g. MCMs) and boards with many BGAs. However, he hoped this would be a low risk factor.

On commissioning, there could be problems of a mismatch in module availability and software readiness. He felt this could be a big problem, but Murrough felt that though the software might be an issue for the slice test, things should be better for the final system.

Finally, on operation and maintenance, the problem areas were: infant mortality of custom components, loss of key personnel and inadequate documentation. Tony thought these were a medium risk overall.

John Garvey felt that though it was interesting to think about these problems, we should not worry too much, as the main thing was to focus on the slice test, and if that proved successful a lot of these worries would disappear. Tony agreed, but pointed out that it is useful to see if there are any back-up solutions if we do run into problems. The problem of an LHC delay after the PRR was raised, resulting in a possible delay in production and consequent problems in obtaining components. The impact of a resource cut was also raised, but it was felt that there was little we could save as the system was so heavily integrated, and we would have to argue this case if pressed.

SUMMARY

Summary of highlights and "lowlights" of the meeting - Eric Eisenhandler

The full summary is available here (pdf), and was also circulated by e-mail.

This page was last updated on 16 December, 2013