ATLAS-UK Level-1 Calorimeter Trigger Meeting
Thursday 27 February 2003 at RAL
Present: Bruce Barnett, Ian Brawn, Eric Eisenhandler (chair), John Garvey,
Norman Gee, Tony Gillman, Stephen Hillier, Murrough Landon, Gilles Mahout,
Viraj Perera, Weiming Qian, David Sankey, Peter Watkins
Agenda
Hardware
CMM hardware and firmware status........................Ian
ROD hardware and firmware status......................Viraj
TTC, TCM, VMM etc. status.............................Viraj
ROD test status.......................................Bruce
CPM hardware.............................Gilles for Richard
CPM tests............................................Gilles
CPM timing calibration techniques.....................Steve
Serialiser and loss of link.............................Ian
Online software
Software summary.............................Bruce/Murrough
Recent meeting highlights
Database and Monitoring workshops........Murrough (DB)(Mon)
ATLAS week.............................................Eric
General discussion
What test hardware is needed, and where; UK tests; long-term schedule
Any other business
Date of next UK meeting
Hardware
CMM hardware and firmware status
- Ian Brawn (slides)
Ian presented the status of the current and next generation of CMMs. The
testing of the current CMM has finished, bar one minor firmware bug in the
VME control CPLD. For the next generation of CMM, five PCBs will be
manufactured. Initially, two will be assembled. Documentation needs to be
updated as some registers have been added. Commissioning of these modules
should begin in April. Firmware for the Jet versions of the CMM is being
developed by a collaboration of RAL, Mainz and Stockholm. Note that licensing
under EuroPractice needs to be sorted out.
ROD hardware and firmware status
- Viraj Perera (slides)
Viraj summarised the status of the ROD firmware (see slide
1) and the modules (first item, slide 2). Specifications for JEP CMM firmware
(jet, energy) are still lacking.
TTC, TCM, VMM etc. status - Viraj Perera (slides)
Viraj presented a table (slide 2) showing the status of the various modules.
The six new RODs still need G-link Rx cards. The TTCrx situation is messy due
to multiple versions and some problems with the old ones.
ROD test status - Bruce Barnett (slides)
Bruce said that a second ROD crate had been set up to allow
independent testing, but there had been problems with the hardware. The second
crate is useful for debugging multi-crate run-control software. He then summarised
the status of the various flavours of ROD firmware that are in various stages
of testing; see his talk for details.
CPM hardware
- Gilles Mahout for Richard Staley (slides)
The first CPM is still being tested. Timing synchronisation can be done using
a "1-phase" method; see Gilles' talk below. A few of the backplane links deliver
poor quality signals; this is hopefully due to the use of loop-back rather
than a real CPM.
The second CPM has connection problems on four of eight CP chips. The board
had been tarnished. Although the other two unused PCBs have been cleaned, it
was decided for safety and speed to make new ones. The general issue of storing
un-assembled PCBs for long periods of time before putting chips on must be
investigated in order to understand whether the problem we have is due to poor
storage, or is inherent in any long-term storage.
Richard presented ideas for an LVDS source module to be used for large-scale
testing (such as final production) of LVDS inputs to CPM and JEM. It would
be a 6U module with 88 or 44 channels. Murrough asked about a matching "sink"
module. In order to make sure the module is useful for production testing,
it should be reviewed when it has been more fully specified.
Finally, he has produced a "fix" for the power-on surges in the crates. It
is a combination of filtering and sequential turn-on of different voltages.
CPM tests - Gilles Mahout (slides)
Tests of the real-time data path between the CP chip and serialiser have been
performed. CP chips have been downloaded with a two "1-phase" method, one phase
used for the on-board signals and the other for the backplane signals.
The on-board clocks of the CP chips have been rerouted in order to have backplane
data and on-board data working correctly within the same time window.
Using the TTC broadcast command, the clock has been scanned between
the CP chip and serialiser over a period of 25 ns. Four periods of error-free
zones have been recorded, and a time window of 1 ns could be found where all
data, backplane and on-board, work without error. Signal integrity of the
backplane data unfortunately prevents having a wider time window.
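As an illustration of how such a scan might be analysed in software, the sketch below finds the widest error-free window in a cyclic scan over one clock period. It is only a sketch under assumptions: the function name, the representation of the scan as a list of error counts per timing step, and the step size are all hypothetical and are not taken from the actual test software.

```python
def widest_clean_window(error_counts, step_ns):
    """Find the widest run of zero-error steps in a cyclic timing scan.

    error_counts[i] is the number of errors seen at timing step i
    (hypothetical representation); step_ns is the assumed step size.
    Returns (start_ns, width_ns, centre_ns), or None if every step
    shows errors. If no step shows errors there is no preferred
    centre, so centre_ns is returned as None.
    """
    n = len(error_counts)
    period_ns = n * step_ns
    bad = [i for i, c in enumerate(error_counts) if c > 0]
    if len(bad) == n:
        return None
    if not bad:
        return (0.0, period_ns, None)

    # Begin just after a bad step so that a clean run which wraps
    # around the end of the period is not split in two.
    origin = bad[0] + 1
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for k in range(n):
        i = (origin + k) % n
        if error_counts[i] == 0:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0

    centre_ns = ((best_start + best_len / 2) * step_ns) % period_ns
    return (best_start * step_ns, best_len * step_ns, centre_ns)
```

Applying the same function separately to the backplane and on-board results, and then looking for the overlap, would be one way to locate a common window of the kind described above.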
An overnight run has been performed without errors. New serialiser firmware has
also been used to correct the corrupted data observed in previous runs,
due to VME access mishandling.
A similar scan has been performed between LVDS data from DSS and the
playback memory of the serialiser. A time window of 20 ns has been measured.
One serialiser shows a lot of errors due to one faulty LVDS receiver. It appears
that one pin of the chip is not connected and will need to be soldered. An
overnight run will need to be performed too.
Tests have been done of new firmware, normally written for a CP chip of a
different speed grade than the one available on the board. First results are
encouraging, but a lot more work is needed before validation. Furthermore,
we also don't know whether the device can handle the algorithm.
In order to have two CPMs fully working for the slice test, the successful
tests performed so far with CPM#1 will lead to the assembly of the second CPM
with exactly the same components.
During the test of the LVDS, it has been noticed that the DSS does not
handle the Brct pins correctly. It should also take into account a strobe
in order to validate the Brct pin value.
Testing of the serialiser has been very successful thanks to access to
the firmware source code. Code for a BER test has been written and ChipScope
has been implemented within it. Access to the source code for all firmware
would therefore be very useful, as this example shows, and also when one of
the designers is on holiday and minor changes need to be made.
The next step will be to test the ROC and HIT outputs. This will require
more DSSs and GIO cards.
There was a discussion about access to firmware source code. What is asked
for is read-only access primarily in order to understand how things are being
done. Any changes would only be done to a local copy and only in exceptional
circumstances. The master copy would not be touched. One
problem to solve is version control in the central archive, and another is
the EuroPractice licence. It was generally agreed that this form of access
is essential and should be implemented.
CPM timing calibration techniques
- Steve Hillier (slides)
In light of recent observations of the behaviour of signals
in the CPM, the current firmware calibration algorithms in
the serialiser and CP chip are inadequate in certain, not
entirely rare, circumstances. The problem occurs when no
error is seen on all four phases tested, and an arbitrary
choice of phase is used. The algorithm could be improved
by using extra information about the actual data values,
but it may be difficult to fit this more sophisticated
algorithm into the FPGAs. Since the algorithms can also
be performed in software, and software has more flexibility
and time to deal with awkward cases, it may be easier just
to implement the whole calibration scheme in software, and
not use the firmware calibration techniques at all.
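To make the awkward case concrete, here is a minimal sketch of a software phase choice that reports the ambiguity instead of choosing arbitrarily. It is not the firmware algorithm and not the proposed replacement: the function name, the input format (one error flag per tested phase) and the "furthest from any failing phase" rule are assumptions made purely for illustration.

```python
def choose_phase(errors):
    """Pick a sampling phase from per-phase error flags.

    errors[i] is True if errors were seen at phase i (hypothetical
    input format). Phases are treated as cyclic. Returns a tuple
    (phase, ambiguous). When no phase shows an error, the error
    flags alone cannot rank the phases, so (None, True) is returned
    and the caller must use extra information, such as the data
    values themselves.
    """
    n = len(errors)
    bad = [i for i, e in enumerate(errors) if e]
    if len(bad) == n:
        raise ValueError("no error-free phase found")
    if not bad:
        return (None, True)

    def margin(p):
        # Cyclic distance from phase p to the nearest failing phase.
        return min(min((p - b) % n, (b - p) % n) for b in bad)

    clean = [p for p in range(n) if not errors[p]]
    return (max(clean, key=margin), False)
```

With one failing phase out of four, the sketch picks the opposite phase; with four clean phases it returns the ambiguous result, which is the situation described above.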
Serialiser and loss of link - Ian Brawn (slides)
Ian explained an issue that had been brought to his attention by Richard
Staley and Steve Hillier: the Link Loss flag from the LVDS receivers to the
Serialiser only becomes active after four consecutive cycles of invalid
data have been seen. Therefore, to flag all slices of readout data possibly
corrupted by a link loss, it is necessary to delay the readout data by
three clock cycles. Ian presented the pros and cons of the required design
modification and, after discussion, it was decided that no action should be
taken until more was known about the behaviour of the LVDS links.
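The three-cycle figure can be checked with a small simulation. The sketch below is a toy model only: it assumes the flag goes active on the fourth consecutive invalid cycle and that a readout delay of d cycles pairs the flag at cycle t with the data slice from cycle t - d. The function names are hypothetical, and what happens when the link recovers is not modelled, since that is not described here.

```python
def link_loss_flag(valid, threshold=4):
    """Return the flag state per cycle: active once `threshold`
    consecutive invalid cycles have been seen (toy model)."""
    flags = []
    run = 0
    for v in valid:
        run = 0 if v else run + 1
        flags.append(run >= threshold)
    return flags


def first_flagged_slice(valid, delay):
    """Index of the first data slice that is read out with the flag
    active, when the readout data are delayed by `delay` cycles."""
    for t, active in enumerate(link_loss_flag(valid)):
        if active and t - delay >= 0:
            return t - delay
    return None


# Link is good for three cycles, then lost from cycle 3 onwards.
valid = [True, True, True] + [False] * 6
```

In this example the flag first goes active at cycle 6. With no delay the slices from cycles 3, 4 and 5 are read out unflagged; with a delay of three cycles the first flagged slice is the one from cycle 3, the first corrupted one.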
Online software
Software summary
- Bruce Barnett, Murrough Landon
Module services (Bruce - slides): Bruce
summarised recent progress and additions to the ModuleServices of the online
software.
Software status (Murrough - slides): Murrough
presented a brief summary of the software status.
The Mainz/Stockholm visit was very useful in pushing things
along and highlighting the short term priorities. Work has
been done to implement some of these, such as support for
setting TTCrx parameters via the TTC, multistep runs, scans
of calibration parameters, sets of options groups by run type,
etc. This has involved work mainly in the database and
module services packages. There is still a lot to do, including
a move to more recent OS and Online software versions.
Dave Kant has started work on customising the Online Event
Dump for our ROD fragments. He has produced a requirements
document. We should discuss this in a wider forum.
Recent meeting highlights
Database and Monitoring Workshops - Murrough Landon
(slides: DB, mon.)
Murrough mentioned two working groups (on monitoring and on error handling
and fault tolerance) and the recent database workshop. The database workshop
mainly concerned the conditions database, not the configuration database. Murrough
gave a talk on level-1. A fuller report will be presented at the imminent Mainz
meeting.
ATLAS week - Eric Eisenhandler
Eric gave a brief summary of some relevant items presented at the ATLAS week.
General discussion
What test hardware is needed, and where; UK tests; long-term schedule
Norman showed some ideas on subslice tests for the JEP and the CP (slides)
and how they evolve into the full slice, including some estimates of what is
needed for JEM testing (slide), as well as test modules (slide) and daughter
cards (slide).
Some event building will be needed, and that requires a readout subsystem
(ROS).
There was a discussion of what is needed to test various pairs of modules
in all reasonable combinations, and also where to do them. Most people felt
that first testing should be where there is expertise on the modules concerned,
but Norman is concerned about how to spread the expertise. Also it is not
clear what to do when the expertise on two modules is in different places; the
example discussed was CPM to CMM. Eric pointed out that in addition to the
slice tests going "vertically" through the system, there is a need for systematic
"horizontal" testing; the obvious example is the backplane, where a CPM and/or
JEM in combination with a CMM must be put into every single slot and all
possible connections checked for correctness and crosstalk. With subsystem
tests at RAL this could proceed in parallel at Birmingham.
CPM to ROD could be done as soon as some software is ready. CPM to CMM
would be better with a new module, and also needs software; it might be in May.
Venue might depend on how earlier work goes at RAL and Birmingham. CMM
to ROD has been done but further work needs software.
It was also pointed out that tests could be done as if in a test beam, with
intensive work over a period of a few (say three) days; the location is then
not so critical and might depend on who had got through preceding work first.
Any other business
None.
Date of next UK meeting
Will be arranged after the Mainz joint meeting.
Eric Eisenhandler,
11 March 2003