Minutes of CALICE-UK Meeting, Birmingham, 05/02/02
==================================================
Present: Paul Dauncey, Ian Duerdoth, Chris Hawkes, Steve Hillier,
 Richard Staley, David Ward, Matthew Warren, Nigel Watson

Minutes: PaulD

General comments and ideas on the specification document;
 Matthew: The UCL group have discussed various ideas for the BEC.
  Their first option is to implement the BEC as a PCI card and connect it
  directly to the PCI bus of a PC. Each BEC would have four uplink fibres
  and a PC can have a maximum of four such cards, so each PC would read out
  the 16 FEC's in one DIC. This scheme would require 6 PC's. The downlink
  would come from a separate PCI card in each PC. The main advantage of this
  scheme is that the readout of PCI is ~85 MBytes/s compared with the assumed
  ~25MBytes/s for VME. However, disk write speeds are ~40MBytes/s and so may
  not allow the full benefit. In addition, allowing all the raw data to be
  buffered on the BEC requires memory which will not physically fit onto a
  PCI card. There is also the extra complexity of having a multi-PC readout
  system, although this is likely anyway when the HCAL is included.
  Their second option is to remove the buffering on the BEC completely and
  consider the FEC as the buffer. The FEC's can be read out sequentially
  rather than in parallel and the data are sent directly to the PCI bus.
  There is a small mismatch of the data speeds, 96 MBytes/s from the FEC's
  compared with 85 MBytes/s on the PCI bus, but this could be buffered for
  the duration of the read. This option is estimated to cost 69k compared
  with 82k for the VME-based system. A third option would be to use a
  CompactPCI crate instead of VME; these crates can contain a PC and disk and
  can be cheaper than VME.
  Finally an option connecting the BEC's directly to a network was
  considered although the high-throughput switch required might be
  expensive.
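  As a rough cross-check of the counts and rates quoted for the PCI options,
  the arithmetic can be sketched as below. The variable and function names
  are illustrative only, and the buffer estimate assumes data arrive and
  drain at constant rates for the whole read, which is a simplification.

```python
# Figures quoted in the minutes; names are illustrative, not from any design.
FIBRES_PER_BEC = 4      # uplink fibres on each PCI-card BEC
BECS_PER_PC = 4         # maximum number of such cards in one PC
PCS = 6                 # PCs required by the first option
FEC_RATE_MB_S = 96.0    # data rate out of the FEC's
PCI_RATE_MB_S = 85.0    # readout rate of the PCI bus

fecs_per_pc = FIBRES_PER_BEC * BECS_PER_PC   # FEC's read out by one PC
total_fecs = fecs_per_pc * PCS               # FEC's in the whole system


def peak_buffer_fraction(rate_in, rate_out):
    """Fraction of a transfer left in the buffer when the input finishes.

    Assumes constant input and output rates running concurrently.
    """
    if rate_in <= rate_out:
        return 0.0
    return 1.0 - rate_out / rate_in


buffer_fraction = peak_buffer_fraction(FEC_RATE_MB_S, PCI_RATE_MB_S)
print(fecs_per_pc, total_fecs, round(buffer_fraction, 3))
```

  Under these assumptions roughly 11% of the data in a read would have to be
  held temporarily to absorb the 96 versus 85 MBytes/s mismatch.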
  
 Nigel: Birmingham have considered the FEC in some detail. They propose the
  FEC should have 112 (16x7) rather than 128 channels as it would be difficult
  to use the extras as spares. It is not worth reducing the 112 to the 108
  (3x36) channels actually needed as this would require careful definition of
  the VFE chip layout (which is not yet known).
  They have investigated multiple ADC packages. Dual ADC components are widely
  available but higher multiples with the specifications we require are
  effectively unavailable. Using dual ADC's would result in 56 components, each
  around 1cm x 1cm, hence taking around 8cm x 8cm board space. These can be
  run with both channels output in parallel at the sample rate, or multiplexed,
  where the two outputs are put out in series at twice the input rate. The
  latter mode takes ~twice the power. The cost is quoted as $5/package, which
  is probably around 5 pounds in the UK, or 2.5 pounds/channel. This is
  cheaper than the 4 pounds/channel previously assumed. 8 bit ADC's are less
  available and are probably not significantly cheaper; the saving in data
  volume is minimal as after data reduction it is dominated by the time, gain
  and channel information.
  The FEC memories need to be 1120 bits wide, which is 70 packages x 16 bits
  or 35 packages x 32 bits. With a 10 bit ADC, the latter would be preferable
  as three 10-bit values could be stored without being split between physical
  memories (but would actually then require 38 memories). The cost of 12 or
  24 MHz components is similar; the faster ones would allow the ADC
  multiplexing and hence reduce the number required by a factor of two, at the
  cost of extra control complexity. The cost is estimated to be around
  10 pounds for the 32 bit packages, or 400 pounds per FEC. This is
  substantially above the 80 pounds previously assumed. The same memory is
  assumed in the BEC and so the cost there is also likely to increase;
  however, these memories need narrower inputs but must be deeper and fill at
  48 MHz so it is not clear how the cost differs.
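  The ADC and memory counts above follow from simple arithmetic, sketched
  here as a check. The names are illustrative; the only inputs are the
  figures quoted in these minutes (112 channels, 10-bit ADC values, dual ADC
  packages at about 5 pounds, 32-bit memories at about 10 pounds).

```python
import math

CHANNELS = 112            # proposed channels per FEC (16 x 7)
ADC_BITS = 10             # ADC word width assumed for the memory layout
ADCS_PER_PACKAGE = 2      # dual ADC packages
ADC_PACKAGE_COST = 5.0    # pounds per dual package (UK estimate)
MEMORY_COST = 10.0        # pounds per 32-bit memory package

adc_packages = CHANNELS // ADCS_PER_PACKAGE
adc_cost_per_channel = ADC_PACKAGE_COST / ADCS_PER_PACKAGE

# Total memory width if every channel is written in parallel.
memory_width_bits = CHANNELS * ADC_BITS
packages_16bit = memory_width_bits // 16
packages_32bit = memory_width_bits // 32

# If no 10-bit value may straddle two memories, only three values fit in
# each 32-bit package and two bits per package are left unused.
values_per_32bit = 32 // ADC_BITS
unsplit_memories = math.ceil(CHANNELS / values_per_32bit)
memory_cost_per_fec = unsplit_memories * MEMORY_COST

print(adc_packages, adc_cost_per_channel)
print(memory_width_bits, packages_16bit, packages_32bit)
print(unsplit_memories, memory_cost_per_fec)
```

  The 38 unsplit memories at 10 pounds each give 380 pounds, consistent with
  the rounded figure of 400 pounds per FEC.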
  The FEC power is estimated at 15W for the non-multiplexed option and is
  dominated by the memories and ADCs. The multiplexed option takes 24W. These
  are with the analog section on; the TESLA 0.5% duty cycle gives averages of
  0.7W and 1.3W respectively. To these have to be added the digital section
  power, which is always on and is estimated to be around 1W. Hence, each FEC
  will take ~2W, so the total for all FEC's is ~200W. N.B. the duty factor at the
  testbeam may be different. Also, the HCAL have requested periods of cosmic
  running with the analog power on continuously.
  The FEC PCB component-mounting costs may be around 100 pounds, so the
  300 pounds estimated for the PCB manufacture and mounting is not thought
  to be too far out.
  The nomenclature for the uplinks and downlinks should be swapped; the data
  flow downstream, i.e. from the FECs to the BECs and so this link should be
  the downlink. This name change will be made in the next version of the
  specifications document (but not in the text below!).
  
 Paul: Two options have been considered. Firstly, the effect on the data rates 
  of combining the data from the FEC's connected to a DIC was investigated.
  This would make the uplink data path conceptually the reverse of the
  downlink path and would save ~18k in the links. This change would reduce
  the FEC uplink bandwidth from 96 to 6 MBytes/s but as all FEC's are read in
  parallel, the overall rate is still dominated by the 25 MBytes/s VME speed.
  One issue which might cause problems is synchronising the data from all the
  FEC's at the DIC before it goes into the uplink. With this change, there
  would be only 6 pairs of fibre-optic links, which might allow all 12 fibres
  to connect to a single BEC. This would save ~10k due to not having to
  manufacture 10 BEC PCB's, even though the single BEC (and hence the spares)
  would be very expensive.
  It would also be possible to save memory on the BEC's by not reading all
  FEC's in parallel. Reading only the FEC's connected to a DIC, for example,
  into BEC memory before reading this out to VME and then repeating for the
  next DIC would reduce the BEC memory needed. The FEC synchronisation
  handshake check at the end of a bunch train would need some thought for
  this scheme. With all fibres into one BEC, a memory reduction to 1/6 of the
  original would be possible, with around a factor of 2 reduction in readout
  rate.
  Secondly, the option of removing the FEC large memory and writing reduced
  data directly into the small memory was investigated, which would save around
  40k in memory costs. However, the number of FPGA's needed to keep up with
  the data rate is around 14; this might make this option more, rather than
  less, expensive.
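  The link and memory figures in Paul's first option can be reproduced with
  the short sketch below. It assumes, as an interpretation of the minutes,
  that the combined uplink from each DIC still runs at the original
  96 MBytes/s and is shared equally by its 16 FEC's; the names are
  illustrative.

```python
FECS_PER_DIC = 16
DICS = 6
LINK_RATE_MB_S = 96.0    # assumed rate of each combined DIC uplink
VME_RATE_MB_S = 25.0     # assumed VME readout speed

# Each FEC gets an equal share of the combined link.
rate_per_fec = LINK_RATE_MB_S / FECS_PER_DIC

# One uplink/downlink fibre pair per DIC.
fibre_pairs = DICS
fibres = 2 * fibre_pairs

# With all DIC links running in parallel the input rate still far exceeds
# what VME can drain, so VME remains the bottleneck.
aggregate_rate = DICS * LINK_RATE_MB_S
vme_limited = aggregate_rate > VME_RATE_MB_S

# Reading one DIC at a time into a single BEC needs memory for one DIC only.
sequential_memory_fraction = 1.0 / DICS

print(rate_per_fec, fibres, aggregate_rate, vme_limited)
print(round(sequential_memory_fraction, 3))
```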

 Ian: There is a group forming within CALICE to look at the option of VFE
  electronics being mounted directly on the diodes, i.e. within the detector
  volume. Any cooling pipes would have to be between the ECAL and HCAL and
  so the heat would be extracted radially outwards. Ian has done some
  detailed calculations of temperature distributions in the detector. The
  flow is dominated by the conduction within the radial carbon fibre alveolus
  supports. The thermal conductivity here is somewhat uncertain, but it seems
  that it should be feasible to get enough heat out this way.

  
Brainstorming; Several items were considered:
 VFE interface: A voltage shift may be required between the VFE analog
  output and the ADC input. This may be useful anyway to restrict the
  ADC digital switching noise being transmitted to the VFEs.
  It is not clear if the most general calibration channel selection,
  requiring one pin per channel, is actually needed, or if the VFE chip
  will be able to do the corresponding selection. A single-ended TTL signal
  for these lines was thought appropriate.
  The digital signals which change during a train, namely the two clocks and
  the calibration timing signal, should all be LVDS.
  It may be desirable to have separate analog and digital power and ground
  pins. The FEC itself will probably need 3.3V rather than 5V, but the
  operating voltage for the VFE chip was not known.
  It seems feasible to have a 32-pin connector purely for the 16 differential
  analog signals per VFE chip and another 32-pin connector to take the
  calibration select signals (16 pins), the switching signals (6 pins) and
  the calibration voltage (2 pins), leaving 8 pins for power and ground.
  It is clear a lot of information is needed from the VFE chip designers.

 BEC: It was thought that combining all the functionality of the BEC onto one
  board would make it very complex (and hence prone to failure) and also make
  the cost of each spare very high. Either three cards (each having two DIC
  fibre pairs) or six cards (each having one pair) was thought best.
  The pros and cons of the computer interface were considered. VME is well
  known and there is a lot of experience with board designs. The PCI crate
  technology is quite new but may be cheaper. It is unlikely the main driver
  for this choice is the data rate, as even if we can read out fast enough to
  be limited by the disk speed of 40 MBytes/s, this would give around
  120 GBytes/hour or 1 TByte/day, which is probably much more data than
  most institutes can analyse.
  It would be very useful to be able to distribute a "train number" with each
  StartTrain so as to allow checks when building the events from the different
  BECs as well as when including HCAL data. This would require the number to
  be sent to the HCAL also. Ideally, this would be 32 bits, although 16 might
  suffice. There may be enough spare J2 lines in VME to allow this number to
  be distributed between the BECs over the backplane. An alternative would be
  to count the StartTrain commands locally on each board and label each data
  packet with the count. However, what action to take if the counts do not
  agree at the end of the run is not clear. 
  The possibility of splitting the BECs into a "driver" card, from which all
  the downlinks are sent, and several "receiver" cards, to which the uplinks
  read out, was discussed. The driver then makes clock and StartTrain signal
  distribution simpler; it in fact then looks very like an FCT card and with
  the addition of more fibres, might be able to control the HCAL also.
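  To illustrate the choice between a 16-bit and a 32-bit train number, the
  sketch below estimates how long each counter runs before wrapping around.
  The train rate is not stated in these minutes; it is inferred here from
  the 0.5% duty cycle and ~1ms train length quoted elsewhere, which is an
  assumption, and the rate at the testbeam may well differ.

```python
DUTY_CYCLE = 0.005        # TESLA duty cycle quoted in the minutes
TRAIN_LENGTH_S = 1.0e-3   # approximate TESLA train length (~1ms)

# Assumption: train rate inferred as duty cycle / train length (about 5 Hz).
train_rate_hz = DUTY_CYCLE / TRAIN_LENGTH_S


def seconds_to_wrap(bits, rate_hz):
    """Time for a counter of the given width to wrap at one count per train."""
    return 2 ** bits / rate_hz


hours_16bit = seconds_to_wrap(16, train_rate_hz) / 3600.0
years_32bit = seconds_to_wrap(32, train_rate_hz) / (3600.0 * 24 * 365.25)

print(round(train_rate_hz, 2), round(hours_16bit, 1), round(years_32bit, 1))
```

  At this assumed rate a 16-bit number would repeat after a few hours, so it
  would only suffice if runs are shorter than that or the run number is also
  used in the check, whereas 32 bits would never repeat in practice.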
  
 Number of bunches per train: The number of TESLA bunches per train is 2820
  (which takes ~1ms). To accommodate this, a maximum of 4096 has been
  allowed. However, reducing this to 2048 (~0.7ms) would cut the FEC and BEC
  memory depth by a factor of two; the cost reduction (if any) of this is not
  clear. If it did give a significant cost saving, would the shorter train be
  acceptable? The point of running for up to 2820 bunches is to see any
  possible pedestal drift (e.g. due to the electronics warming up). With only
  2048 bunches, most of this period could be observed. In addition, the analog
  section can be powered on earlier than required to mimic a longer bunch
  train, if required. This seems a viable option.
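  The train lengths quoted above can be checked with the sketch below. The
  337ns bunch spacing is taken from the TESLA design parameters rather than
  from these minutes, so it is an assumption here; the names are
  illustrative.

```python
BUNCH_SPACING_NS = 337.0   # assumption: TESLA design bunch spacing
TESLA_BUNCHES = 2820       # bunches per TESLA train
CURRENT_MAX = 4096         # maximum currently allowed
REDUCED_MAX = 2048         # proposed reduced maximum


def train_length_ms(bunches):
    """Length of a train of the given number of bunches, in milliseconds."""
    return bunches * BUNCH_SPACING_NS * 1.0e-6


tesla_ms = train_length_ms(TESLA_BUNCHES)
reduced_ms = train_length_ms(REDUCED_MAX)

# Fraction of a full TESLA train visible with the reduced memory depth.
coverage = REDUCED_MAX / TESLA_BUNCHES

# Relative memory depth compared with the current maximum.
depth_ratio = REDUCED_MAX / CURRENT_MAX

print(round(tesla_ms, 2), round(reduced_ms, 2))
print(round(coverage, 2), depth_ratio)
```

  With this spacing, 2048 bunches cover a little over 70% of a full train,
  consistent with "most of this period" being observable.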
 
 Testability: The question of whether the boards should have features purely
  for tests and not data taking was raised. For the FEC, this might include
  the ability to load data into the large and small memories to check readback
  and the data reduction algorithm. Also, it would be useful to make a fake
  VFE daughterboard, with a simple FPGA and DAC so as to inject varying
  signals into the various ADC channels. These were considered to be good
  ideas.

 FEC FPGA firmware updates: Updating firmware versions on 100 FEC's by
  physically removing, updating and replacing EPROM's by hand was thought
  unwieldy. Options are to use a JTAG connector (which still requires all
  100 to be done manually) or to use the downlink to send the new code.
  The latter would work with a flash RAM, filled using a configure command,
  rather than an EPROM.

 DIC uplink data skews: Combining the uplinks from 16 FECs into the DIC to be
  sent back on a single fibre was considered a sensible option. However, the
  issue of skew on the data lines was raised. A skew of 5-10ns may be expected
  from an FPGA, which is not ideal given the 48 MHz clock period of 21ns.
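  To put the skew in proportion, the sketch below expresses the 5-10ns range
  as a fraction of the 48 MHz clock period. The names are illustrative and
  no setup or hold times are included, so the real timing margin would be
  smaller than this suggests.

```python
CLOCK_MHZ = 48.0
SKEWS_NS = (5.0, 10.0)    # range of skew expected from an FPGA

# Clock period in nanoseconds.
period_ns = 1.0e3 / CLOCK_MHZ

# Skew as a fraction of one clock period, for each end of the range.
skew_fractions = [skew / period_ns for skew in SKEWS_NS]

# Time left in the period once the skew is subtracted.
remaining_ns = [period_ns - skew for skew in SKEWS_NS]

print(round(period_ns, 1))
print([round(f, 2) for f in skew_fractions])
print([round(r, 1) for r in remaining_ns])
```

  The skew alone would therefore use up between about a quarter and a half
  of each clock period.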


Cost, Effort and Schedule:
 Cost: Although the costs given so far (modulo the corrections noted above)
  are not considered too inaccurate, a more careful cost estimate will need
  some dedicated effort from an engineer. This will require some time from RAL
  TD personnel, which may be hard to get before approval; Paul will
  investigate this.

 Effort: Similarly, although the rough estimate of a full-time engineer
  for both the FEC and the BEC is reasonable, better estimates of the
  effort needed will require RAL TD input. In addition, it now seems likely
  there will not be a roughly equal division of effort between RAL TD
  and University groups as stated in the PPRP letter; there is a lack
  of University effort available at the level needed, so we may need to
  lean more heavily on RAL TD than we indicated.
 
 Schedule: Working backwards from a completion date of early 2004, the
  following schedule was sketched out:
    Mid 2002 - start prototype design
    End 2002 - finish prototype design, start prototype manufacture
               and testing
    Mid 2003 - finish prototype testing, start production manufacture
               and testing
    End 2003 - finish production testing
  The first six months is to go from the rough outline we are preparing
  to a full prototype design.
  The next six months is needed to write software for the system and
  develop production tests, in parallel with the prototype manufacture.
  Assuming the test software has been fully developed on the prototype, then
  each FEC should be able to be tested in 0.5 day. This is 10 per week or
  10 weeks to test all production FECs. Including manufacture, the last
  six months should therefore be adequate for testing.
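  The production testing estimate can be written out as below. It assumes a
  five-day working week, a single test stand and about 100 production FEC's
  (the number quoted under firmware updates); these assumptions and the
  names are illustrative.

```python
import math

DAYS_PER_FEC = 0.5             # test time per FEC from the minutes
WORKING_DAYS_PER_WEEK = 5      # assumption: five-day week, one test stand
PRODUCTION_FECS = 100          # approximate number of production FEC's
WEEKS_AVAILABLE = 26           # the last six months of the schedule

fecs_per_week = WORKING_DAYS_PER_WEEK / DAYS_PER_FEC
weeks_to_test = math.ceil(PRODUCTION_FECS / fecs_per_week)

# Weeks left over for manufacture, repairs and retests.
weeks_remaining = WEEKS_AVAILABLE - weeks_to_test

print(fecs_per_week, weeks_to_test, weeks_remaining)
```

  Changing DAYS_PER_FEC shows how sensitive the schedule is to the test
  time: at one day per FEC the testing alone would take 20 weeks.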
  
  
February 20 Meeting at Imperial;
 The items we need better defined are:
   o) The VFE interface
   o) The impact of the HCAL, the DAQ and the FCT system.
   o) The maximum bunch train length needed; is 0.7ms sufficient?
   o) The required data rates, data volumes, etc.
   o) Power supplies, distribution and grounding.
 
 The UK should give several talks. Suggestions are: 
   o) VFE-within-detector options
   o) Overview of the readout
   o) VFE interface issues
   o) DAQ/FCT/HCAL issues
   o) MC studies (related to beamtest data rates)
  Ian agreed to do the first and David will possibly do the last. The other
  three are harder to assign. Mark Thomson was suggested for the fourth as he
  could give a description of the MINOS run control as an introduction; if
  not Steve may take this one on. Paul is prepared to do either of the second
  or third, but would prefer not to do both. Any volunteers should contact
  him asap.


AOB; The PPRP still require our proposal for their next meeting, so it has
 to be complete by Mar 11. The first draft should be ready a few days before
 the Mar 6 meeting at UCL. It is unlikely we will be called into the closed
 session at the March PPRP meeting, particularly if we do not do an open
 presentation.