Minutes of the CALICE electronics meeting, UCL, 13/08/02
========================================================
Present: Stewart Boogert, Jon Butterworth, Paul Dauncey, Steve Hillier,
 Dave Mercer, Dave Price, Matt Warren

Minutes: Paul


CDR: The level of detail required for the CDR is such that the proposed
 system can be understood and compared against the requirements. Specifically,
 documentation containing the following will be needed:
   o) Block diagrams of the system level, board level and (some) internal FPGA
      level. At the board level, this should be down to individual FPGA's and
      other major components, or groups of minor related components.
   o) Estimates of pin counts for the FPGA's.
   o) Component choices for the major components; FPGA's, ADC's, DAC's, etc.
   o) A firmer estimate of the cost.

 To consider the system-level requirements obviously requires the CDR to cover
 the whole system, specifically both the readout and trigger boards. Matt
 indicated that he thought the trigger board (both master and a possible
 slave) could be specified to the above level of detail by the time of the CDR.
 As the test board is not part of the system per se, it does not
 necessarily have to be included in the review. However, it would clearly be
 better to include it, if this level of detail can be documented by the time
 of the review.

 Finding a date for the review was not easy, particularly as we need to have
 three external reviewers (if possible). There seemed to be no way to have it
 before the next PPRP meeting on Sep 30 (which is unfortunate). The possible
 dates for people at the meeting were Oct 4 or Oct 7-11, although Jon would
 not be available for Oct 7-9. The exact date will be fixed depending on the
 availability of the reviewers.

 Three reviewers seems a reasonable number (although two may be sufficient
 if necessary). Of the names mentioned previously, Adam Baird (RAL) and Greg
 Iles (IC) will be approached now. John Lane (UCL, not Manchester as stated
 previously) may not be available and so will not be asked (yet). Pedja
 Jovanovic (Birmingham) was suggested. Also, Richard Staley was raised as a
 possibility, even though he is giving significant comments and help. One
 of these two should be asked now. If the above fall through, we could ask
 Cambridge, although they are not involved in the electronics side of CALICE;
 Steve Wotton and Morris Goodrick were mentioned. These will not be approached
 unless needed. Paul will contact Adam (via Rob Halsall) and Greg. Steve will
 contact the Birmingham people.

 [Note added after: Adam Baird will be able to attend; Rob Halsall himself
 is also interested in coming along. Richard has also said he would be
 prepared to do this.]


Definitions: The word "configure" is sometimes used for loading the FPGA
 firmware and sometimes for loading the software-programmable values used by
 the FPGA's. We should try to stick to "load" for the FPGA firmware and 
 "configure" for the software change.

 When counting channels, FPGA's, boards, etc, we should always count from
 zero, i.e. C-style.


Modes of operation: As it is relevant to the following discussion, the various
 ways the system needs to operate and the uncertainties associated are:
   o) Tile HCAL: We should assume the tile HCAL readout will use our readout
      boards. However, the number required is by no means fixed so it is not
      clear they will fit into the remaining four slots in the one crate. In
      addition, the assumed beam monitoring and/or trigger readout will need
      at least one slot, so there are likely to be only three slots left.
      Hence, we may need two crates. A second "slave" trigger board would then
      be needed to distribute the trigger on the second crate backplane. The
      timing of the sample-and-hold for the tile HCAL will almost certainly be
      different from that of the ECAL and so must be set readout board by
      readout board.
   o) Digital HCAL: No one within CALICE is yet signed up to produce the VME
      boards for the digital HCAL so the uncertainties here are even greater
      than for the tile HCAL. We should assume this will use completely
      different boards, which will not pick up the trigger from the backplane.
      We therefore need to provide point-to-point trigger cables from our
      trigger board to each of the digital HCAL boards with an adjustable
      delay. We can assume a maximum of 32 lines is needed. The trigger delay
      can be common for all 32 cables. A second crate is very likely to be
      needed in this case. However, no slave trigger board would be needed as
      cables are used.
   o) In either case with a second VME crate, we can possibly connect it using
      a second PCI-VME interface, which would double the VME bandwidth.
      However, it is not clear if this will function straightforwardly. The
      alternative is to use a VME extender card to daisy-chain the second
      crate, so the total appears as if it was one 40-slot crate. Clearly,
      the VME bandwidth is then shared between the two. Any non-standard
      features and/or customisations (e.g. power supply voltages, wire-wrapping
      of backplane pins, etc) made to the second crate must be compatible with
      both tile and digital HCAL modes.


Trigger board: The specification document (latest dated June 7) gives two
 possible implementations for the trigger board. One is a self-contained VME
 board containing an FPGA for the VME and the trigger logic, while the other 
 is a supplementary board to the readout board. This would have a short cable
 connecting to the readout board and use the existing VME interface there.
 The trigger board itself would then contain only the trigger logic. The
 trigger path would not go through the readout board at all. This would
 make it much simpler and cheaper. The downside is the added complexity in the
 readout board master FPGA. If this increased the cost of the readout boards
 significantly, then it would not be worth it. This approach would also require
 the readout board to be able to generate a VME interrupt, which is not needed
 for a standard readout board. If the supplementary board approach is used,
 then a slave trigger board in the second crate for the tile HCAL should be
 an identical board but without needing the readout board, i.e. with no VME
 interface. The decision on which approach to take is clearly needed by the
 time of the CDR.

 Because of the difference in timing required for ECAL and tile HCAL readout
 boards, the calibration strobe is generated on the readout boards, not the
 trigger board. The latter should treat all triggers, whether in normal or
 calibration mode, identically; in fact, it should not be able to detect which
 mode is being used. Hence, no separate calibration trigger input is needed
 on the trigger board.

 The trigger latency and jitter requirements mean it would be best to keep
 the whole trigger path, through the trigger board, across the backplane and
 to the readout board slave FPGA's, as "unclocked", i.e. analog timing. Adding
 the delay (a multiple of 10 ns) in the slave will need clock alignment, but
 not aligning to a clock anywhere else will minimise latency and jitter.


Readout board: DaveM had previously circulated some ideas for the master-slave
 FPGA interfaces and got back some comments. The following are items which 
 came up from those notes.

 The geographical addressing of VME boards in the crate using pins wire-wrapped
 on the backplane was thought fine; six rather than five pins should be used
 to cover the case of two daisy-chained crates.
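
 The pin count follows from simple arithmetic, sketched below in Python. The
 function name is illustrative only; the 40-slot figure is the approximate
 daisy-chained total quoted earlier for the VME extender case.

```python
def address_pins(n_slots: int) -> int:
    """Smallest number of binary pins giving each slot a distinct code.

    Slots are counted from zero (C-style), so the highest code needed
    is n_slots - 1.
    """
    if n_slots < 1:
        raise ValueError("need at least one slot")
    return max(1, (n_slots - 1).bit_length())


# Five pins distinguish at most 2**5 = 32 slots, which is fewer than
# the ~40 slots of two daisy-chained crates; six pins cover up to 64.
for slots in (32, 40, 64):
    print(slots, "slots ->", address_pins(slots), "pins")
```

 Five pins are enough for a single crate, but the daisy-chained pair needs
 the sixth.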

 The master FPGA to slave FPGA configuration data interface does not have
 any significant timing requirement. It can be done as a simple 16-bit data
 read-write bus to all six slaves, each of which will have identical VHDL.
 A point-to-point write line from the master to each slave allows them to have
 the same internal address space; this is preferable to hardwiring three pins
 on the PCB to give a unique address location for each slave as it allows a
 broadcast write to all slaves simultaneously (using a virtual "slave 6").
 The VME asynchronous read and write will be synchronised to the board
 12.5 MHz clock for this bus, so all master-slave interactions are synchronous.
 One subtlety arises from the DAC; the value in the configuration data which
 gives the DAC setting needs to be set before the next trigger. The slave
 should internally detect the DAC address has been accessed and set the DAC
 at that time. However, the master-slave interface itself should be completely
 generic. The address space required for configuration data within each slave
 has not been determined yet, but could be around 1 kByte. A 12-bit address
 bus should be sufficient. With four point-to-point "sideband" signals per
 slave, this would imply around 55 pins for the master and 35 for each slave,
 including some control. The master itself may also need some internal
 configuration space (e.g. for the serial number).
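
 The pin estimates and the broadcast scheme can be sketched in Python as
 below. This is an illustration, not a design: all names are hypothetical, the
 data and address buses are assumed to be shared (so counted once at the
 master), and the number of extra control pins is left as a parameter since it
 was not fixed at the meeting.

```python
DATA_BITS = 16           # shared read-write data bus
ADDRESS_BITS = 12        # shared address bus
SIDEBAND_PER_SLAVE = 4   # point-to-point signals to each slave
N_SLAVES = 6             # slaves are numbered 0..5
BROADCAST = N_SLAVES     # virtual "slave 6" addresses all slaves at once


def slave_pins(control: int = 0) -> int:
    """Pins on one slave: shared buses plus its own sideband signals."""
    return DATA_BITS + ADDRESS_BITS + SIDEBAND_PER_SLAVE + control


def master_pins(control: int = 0) -> int:
    """Pins on the master: shared buses plus sideband signals to every slave."""
    return DATA_BITS + ADDRESS_BITS + SIDEBAND_PER_SLAVE * N_SLAVES + control


def write_lines(target: int) -> list:
    """Point-to-point write lines the master asserts for a given target."""
    if target == BROADCAST:
        return [True] * N_SLAVES
    if not 0 <= target < N_SLAVES:
        raise ValueError("target must be 0..%d" % BROADCAST)
    return [i == target for i in range(N_SLAVES)]


print("slave pins before control:", slave_pins())
print("master pins before control:", master_pins())
print("write lines for slave 2:", write_lines(2))
print("write lines for broadcast:", write_lines(BROADCAST))
```

 Before control signals this gives 32 pins per slave and 52 for the master, so
 a few control pins on each would account for the figures of around 35 and 55.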

 The slave-master event data transfer is less straightforward. This needs to
 be fast enough to not incur significant extra readtime/deadtime for a 1 kHz
 readout rate. The data could be stored in the slaves until the VME read or
 in the master; for the latter, they could be transferred during or after the
 timing signal sequence generated by the slaves. It would also be good to keep
 the inter-FPGA tracking low so as to reduce pin count and PCB complexity.
 The scheme also needs to include the ability to send back some data from
 the configuration address space, so as to test the interface. The
 configuration array needs to include data for this purpose.
 o) The numbers here are that each slave handles 108 channels and so transfers
    216 bytes per event. The master therefore transfers 1296 bytes per event.
    The 500 kHz ADC rate means the slave timing sequence will take around
    40 us, which is small compared to the total allowed time per event of 1 ms.
    Any extra time due to the slave-master transfer should ideally be kept at
    10 us or less.
 o) General VME access is asynchronous and so the event data transfer from the
    master to VME either requires the VME strobes to be synchronised to the
    readout board clock or be done asynchronously using the VME timing. The
    former adds some overhead (up to one period of whatever clock the transfer
    is synchronised to) to each 32-bit transfer. At a VME speed of 30 MBytes/s,
    each 32-bit word would take about 130 ns, so a clock of 100 MHz or more
    would be needed. An asynchronous transfer will optimise the VME readout
    speed but effectively requires the data to already be in the master FPGA.
    As maximising the VME speed is crucial, we will assume the data are stored
    in the master before readout and asynchronous logic is used. Also, there
    is no good reason for waiting until after the timing sequence so transfer
    should occur during the sequence.
 o) Options for the slave-master event data interface include a 16-bit data
    point-to-point parallel connection for each slave, a 16-bit bus common to
    all six slaves or several 1-bit serial point-to-point connections. The
    16-bit parallel option needs to transfer 216 bytes in 40 us, and so
    needs to run at 2.7 MHz or faster; hence the 12.5 MHz board clock would
    be easily sufficient. The downside of this option is the pin and tracking
    count; the master will receive six such point-to-point connections which,
    including control, will take over 100 pins.
 o) A 16-bit bus would reduce this to less than 20 pins. However, it would
    need to transfer the full 1296 bytes in the 40 us and so would run at
    16.2 MHz, requiring a faster clock. This option also has problems with
    interleaving the slave data; during the sequence, all the data from e.g.
    slave 0 will not be available to transfer first, so each slave will need to
    send its first word in sequence, then each second word, etc. Bus contention
    and timing make this quite complex.
 o) The serial point-to-point option matches the data rate out
    of the ADC's. The ADC's are read serially; their sample period is 2 us,
    which means the 16 bits need to come out on an 8 MHz clock or faster; we
    would use the 12.5 MHz board clock, which gives the 16 bits in 1.3 us.
    Each slave handles six ADC's, so these six serial bit streams could be
    simply forwarded to the master over six point-to-point serial lines. No
    additional transfer clock is needed, although a write enable line(s) to
    control when to write to the master memory would be required. The master
    FPGA pin count for this would be around 50 pins.
 Again, decisions on the option to use need to be made in time for the CDR.
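
 The rates quoted for the three options can be cross-checked with the short
 Python sketch below. The constants are the figures given above; the only
 added assumption (flagged in the code) is that each slave's 108 channels are
 shared evenly between its six ADC's. Variable and function names are
 illustrative.

```python
# Figures quoted in the discussion above.
CHANNELS_PER_SLAVE = 108
BYTES_PER_SLAVE = 216      # bytes per slave per event
N_SLAVES = 6
ADCS_PER_SLAVE = 6
SEQUENCE_US = 40.0         # approximate length of the slave timing sequence
ADC_SAMPLE_US = 2.0        # sample period at the 500 kHz ADC rate
WORD_BITS = 16
BOARD_CLOCK_MHZ = 12.5
VME_MBYTES_PER_S = 30.0

BYTES_PER_WORD = WORD_BITS // 8


def min_clock_mhz(n_bytes: float, window_us: float) -> float:
    """Lowest 16-bit word rate (MHz) that moves n_bytes within window_us."""
    return (n_bytes / BYTES_PER_WORD) / window_us


bytes_per_board = BYTES_PER_SLAVE * N_SLAVES

# Assumption, not stated explicitly: channels are split evenly across ADC's.
channels_per_adc = CHANNELS_PER_SLAVE // ADCS_PER_SLAVE

estimates = {
    "bytes per board per event": bytes_per_board,
    "digitisation time per ADC (us)": channels_per_adc * ADC_SAMPLE_US,
    "parallel point-to-point clock (MHz)": min_clock_mhz(BYTES_PER_SLAVE, SEQUENCE_US),
    "shared bus clock (MHz)": min_clock_mhz(bytes_per_board, SEQUENCE_US),
    "serial bit clock (MHz)": WORD_BITS / ADC_SAMPLE_US,
    "serial word time at board clock (us)": WORD_BITS / BOARD_CLOCK_MHZ,
    "VME time per 32-bit word (ns)": 4 / VME_MBYTES_PER_S * 1000.0,
}

for name, value in estimates.items():
    print(f"{name}: {value:.2f}")
```

 Under the even-split assumption each ADC digitises 18 channels, i.e. 36 us,
 consistent with the "around 40 us" used for the sequence length.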

 The FPGA's should load from SROM's, with all six slaves loading from a single
 SROM and the master from a separate SROM. The SROM's should be reprogrammable
 using an ISP header (JTAG). In principle, the slave SROM could also be updated
 using VME but not the master, as this itself controls the VME access. Given
 that there will only be ~20 readout boards, doing this number of master SROM's
 with JTAG is not a huge task. Note, there is usually a header added on to the
 firmware data when loading which is made automatically using JTAG but which
 would need to be generated in software if using VME. This might make slave VME
 loading quite complex; the same argument as above, i.e. that JTAG'ing 20
 boards is not too onerous, also holds for the slave, so, if necessary, not
 implementing VME loading was considered acceptable.

 A standard VME crate supplies +5V, which will probably not be needed. The
 LVDS I/O will need +3.3V while the FPGA's will run on +1.8V or +2.5V. The
 regulator to get these from +5V requires a substantial amount of board space,
 so it would be sensible to get a crate with 3.3V directly. We will not need
 -5V except for the trigger NIM I/O (which is not yet definite; it could be
 TTL). The +/- 12V will not be needed at all.


Next meeting: TBD