CALICE MAPS Interim Design Review 1, Part 2, RAL, 25/01/07
==========================================================

Present: Andy Clark, Jamie Crooks, Paul Dauncey, Matt Noy, Marcel Stanitzki, Konstantin Stefanov, Renato Turchetta

Minutes: Paul

Overview:

The width required (and hence dead area) for the logic columns will not be known until the layout is complete. The width should be rounded up to a multiple of the pixel size so that it represents an integer number of dead pixels. This is likely to be 200mu = 4 pixels.

The issue of reflecting neighbouring sectors and hence lumping the dead areas for two logic columns together was discussed. The optimal arrangement is not clear; an EM shower will have a transverse size ~9mm and so is roughly of the same order as the live areas ~2mm. It was felt it would be better to have random inefficiencies so they do not depend on where the shower occurred, which would argue for not reflecting the layout for neighbouring sectors, but translating it. However, it might be that the logic columns need e.g. ~3.5 pixels space, so combining them could be done in 7 rather than 8 pixels, reducing the dead area. Unless this latter case happens to occur, translating the designs would be preferred.

Each sector will handle 42 pixels which will be subdivided into seven groups of six neighbouring pixels in terms of the memory readout. These require six bits for the six pixels and three for the group label. The timestamp will have 13 bits (up to 8k bunch crossings), which was considered reasonable, although 14 bits (up to 16k bunch crossings) would have definitely been safe. This means each memory location stores 6+3+13 = 22 bits. There will be a total of 19 memory locations (i.e. possible hits) for each sector of 42 pixels.

The addresses for the seven groups of pixels are 1-7, with 0 reserved for no valid pixel. The addresses will have to be set externally, cycling from 0 over all seven valid values and back to zero at ~50MHz, i.e.
between each bunch crossing. The second sensor design should have this implemented on the sensor.

The mask register shown in the slides is now obsolete; the mask is now implemented in the individual pixels.

There is a global "force hit" input signal which is tracked over the whole sensor, which is simply OR'ed with the normal pixel output. This is downstream of the mask and so is not influenced by it, i.e. masking will have no effect and all pixels will respond to this signal. The signal can be changed at any time, specifically during a bunch train.

The configuration data (mask and trim DAC settings) are loaded into a serial shift register (SR) 168 bits at a time (one per pixel column) and then the whole SR is parallel-loaded into the pixels, before being refilled and the process repeated. However, the parallel load is not destructive so the same SR data can be loaded multiple times. This would in principle be possible during a bunch train. However, the mask and trim bits are not loaded separately so any pattern sent would disrupt the trim DAC settings, reducing the utility of this. It seemed more useful to keep to the usual pattern of loading the configuration data before the bunch train and, if desired, having short bunch trains.

There is a memory overflow flag per 84x84 pixel bank, for a total of four flags in all, which go directly to output pins on the sensor.

The 22 bits saved in the memory are combined with another 9 bits (the "row encoder") which identify the 42-pixel sector at readout, resulting in a 31 bit word. The row encoder bits are hardcoded into each logic column row. Not all 9 bits are strictly needed for the first sensor as there are only 168 sectors in each column, which would need 8 bits to label. However, the second sensor will have 5/2 times the size in both dimensions (i.e. 420x420 pixels total compared with 168x168 pixels) giving 420 sectors per column, which will require 9 bits. Hence 9 bits have been reserved already.
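
As an illustration of the word format described above, the following Python sketch packs one hit into a 31-bit readout word and unpacks it again. Only the field widths come from these minutes; the ordering of the fields within the word (row encoder in the top bits, then timestamp, group address and pixel pattern) and the function names pack_word and unpack_word are assumptions made for the example, not the sensor's actual layout.

```python
# Field widths as quoted in the minutes.
PIXEL_BITS = 6   # hit pattern for the six pixels in a group
GROUP_BITS = 3   # group address: 1-7 valid, 0 reserved for "no valid pixel"
TIME_BITS = 13   # timestamp, up to 8192 bunch crossings
ROW_BITS = 9     # row encoder, added at readout

MEM_BITS = PIXEL_BITS + GROUP_BITS + TIME_BITS  # bits stored per memory location
WORD_BITS = MEM_BITS + ROW_BITS                 # bits per readout word


def pack_word(row, timestamp, group, pixels):
    """Pack one hit into an integer. The field order is an assumption."""
    if not 0 <= row < (1 << ROW_BITS):
        raise ValueError("row encoder value out of range")
    if not 0 <= timestamp < (1 << TIME_BITS):
        raise ValueError("timestamp out of range")
    if not 1 <= group <= 7:
        raise ValueError("group address must be 1-7 for a stored hit")
    if not 0 <= pixels < (1 << PIXEL_BITS):
        raise ValueError("pixel pattern out of range")
    word = row
    word = (word << TIME_BITS) | timestamp
    word = (word << GROUP_BITS) | group
    word = (word << PIXEL_BITS) | pixels
    return word


def unpack_word(word):
    """Inverse of pack_word: return (row, timestamp, group, pixels)."""
    pixels = word & ((1 << PIXEL_BITS) - 1)
    word >>= PIXEL_BITS
    group = word & ((1 << GROUP_BITS) - 1)
    word >>= GROUP_BITS
    timestamp = word & ((1 << TIME_BITS) - 1)
    word >>= TIME_BITS
    row = word & ((1 << ROW_BITS) - 1)
    return row, timestamp, group, pixels
```

With these widths the largest possible word still fits in 31 bits, which is why rounding to 4 bytes externally is sufficient.
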
The data from the four columns cannot be distinguished internally; this information must be provided by the external readout control.

The total memory in each logic column is then 22x19x168 = 70224 bits, which gives a total over the whole sensor, with four columns, of 280896 bits, i.e. ~35kBytes. However, on readout, the extra 9 bits which make the words up to 31 bits (rounded externally to 4 bytes) would then result in a maximum data volume per sensor of ~51kBytes. The equivalent for the second sensor will be (5/2)^2 = 25/4 times larger and so is ~319kBytes.

Logic simulations:

The comments refer to pages in the document used in the review.

Pg 5: All the transistors work at 1.8V except for a few special cases, such as writing to the SRAM, where the signal has to overpower the SRAM transistor settings. These are the "HV logic" which will be at 3.3V.

Pg 7: The variation within a sensor of the monostable capacitors is likely to be much smaller than the sensor-to-sensor variations, so the monostable times are likely to be much more similar than the study here implies. Taking the plot on pg 10 at face value, a bias setting would be needed which ensures the shortest monostables are just long enough to be guaranteed to give a hit for one bunch crossing tick, i.e. 150ns. However, the average length would then be ~10% longer and the maximum ~20% longer. These would give rates of ~10% and ~20% of double hits, respectively, hence giving an average of ~10% double hits. Note, the ratio of double to single hits will allow the monostable length for each pixel to be measured.

Pg 8: The monostable length can be adjusted by the current biases. There are two biases, one for each type of pixel (shaper and sampler).

Pg 9: The monostable inverter is current limited to 5uA. This is done because otherwise there could potentially be a large total current (~100mA for tens of ns) if they all switched at the same time.
This would be most likely to occur at power-up, hence the "power-on-reset" here. In addition, the monostables will have a separate power supply. It was thought that having the return path via the substrate would not be sufficient and so there should also be a separate ground.

Pg 12: The table shows even 2.5V is not ideal in some process corners so 3.3V will now be used for this logic. The numbers show that the SRAM is not reliably loaded when driving ~10 sectors simultaneously, at least at 2.5V, as each SRAM is potentially being switched 22x8 times on each bunch crossing. There is now a driver for every four sectors.

Pg 16: The edge variation seen is ~20ns. This was thought to set a (somewhat conservative) limit on the readout clock speed of 5MHz. Also, the multiplexing results on pg 19 show a 25ns rise/fall time, which implies a limit at around 20MHz. With four columns multiplexed, this would result in the same 5MHz readout speed per column, so this seems a reasonable value to assume.

Pg 20: The race condition seemed risky for the sake of saving one clock line, so a third clock line will be added to remove this issue. The extra input will be implemented as phi_3-bar as it could then be approximated by wiring phi_2 to the input if desired.

Pg 24: The SRAM acts as a stack (i.e. a LIFO). There is no counter of the number of locations; the pointer is the only item which keeps track of the occupancy. Hence, the total number of hits following memory overflow cannot be recorded. The issue of glitches on the phi_2 clock was raised; these could cause corruption of the SRAM if they occurred. This needs to be simulated to check the sensitivity. The 19th hit which fills the memory in a sector will not cause the overflow condition to be set; a following 20th hit will do this. The overflow status is not recorded in the output data per sector. The bit can be read from the output pin but this is common to each 84x84 pixel array and so does not uniquely locate the sector(s) which caused it.
Although it would be clearer to get the overflow bit per sector, it was not thought essential. Once the 19th hit is recorded, the pixel is then dead, irrespective of whether a hit actually occurred later or not. Hence, any pixels with 19 hits should count towards inefficiency from the timestamp following that of the 19th hit.

Pg 29: Since both the rising and falling edges of the HOLD signal are used, it must run at 3MHz rather than 6MHz as stated, i.e. half the bunch crossing clock.

Pg 30: A ~4kHz signal is currently required so as to avoid the possibility of the internal nodes of the latch-hold circuit floating up to the transistor transition point (and hence generating a power surge) when not in use. This was thought to be undesirable and a pulldown should be added to prevent this and so remove the need for the 4kHz clock. The leakage time was found to be long (~0.5ms) compared to the clock times being used but the simulation of this had not been done at high temperatures; checking this at 50C should be done.

Pg 33: The 100kHz minimum is assumed for the SR parallel load clock as it is limited by the time to serial load the SR with 168 bits. The latter is assumed to run at ~100MHz, which would give a maximum parallel load rate of ~600kHz. At 100kHz for the parallel load, the 5x168 = 840 loads will take ~8ms, which is acceptable; there is no major constraint on the time to load configuration data.

Pg 48: Note the comment in the overview (above); the first sensor needs eight bits to label the sector, not seven as stated here.

Pg 50: The address lines must not move much relative to the clocks over the whole length of the logic columns. Hence, a check should be done to see if the address propagation time is as close as possible to the clock propagation time, in terms of the buffering and RC loads on the lines. There was a question on whether the simulation contains enough stray capacitance for these checks.
It may be necessary to do a full simulation of the propagation times.

Top level schematic:

The schematic does not yet include any test structures. Jamie is intending to add at least a monostable and some number of pixels in an array, with access to the analogue signals. These would require the trim DAC and mask configuration settings to be wired to input pins and hence set "by hand" externally.

All the I/O signals are currently single-ended, either 2.5V or 3.3V. There are no LVDS converters in the library but they could be made from basic components with some work. It was decided that given the short time left, it was better to leave the signals as is and do any necessary conversion externally if/when required.
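
As a cross-check of the arithmetic quoted in the overview (memory size and readout data volume) and under Pg 33 (configuration load time), the numbers can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: all input values are those quoted in these minutes, and the variable names are illustrative.

```python
# Memory size and readout volume for the first sensor.
BITS_PER_HIT = 22            # 6 pixel bits + 3 group bits + 13 timestamp bits
HITS_PER_SECTOR = 19         # memory locations per 42-pixel sector
SECTORS_PER_COLUMN = 168
LOGIC_COLUMNS = 4
BYTES_PER_READOUT_WORD = 4   # 31-bit word rounded up externally

bits_per_column = BITS_PER_HIT * HITS_PER_SECTOR * SECTORS_PER_COLUMN
bits_per_sensor = bits_per_column * LOGIC_COLUMNS
memory_bytes = bits_per_sensor // 8

words_per_sensor = HITS_PER_SECTOR * SECTORS_PER_COLUMN * LOGIC_COLUMNS
readout_bytes = words_per_sensor * BYTES_PER_READOUT_WORD

# Second sensor: 5/2 times larger in both dimensions, i.e. 25/4 in area.
second_sensor_readout_bytes = readout_bytes * 25 // 4

# Configuration loading (Pg 33).
SR_BITS = 168                # shift register length, one bit per pixel column
SERIAL_CLOCK_HZ = 100e6      # assumed serial load clock
PARALLEL_CLOCK_HZ = 100e3    # assumed parallel load clock
PARALLEL_LOADS = 5 * 168     # number of parallel loads quoted in the minutes

max_parallel_rate_hz = SERIAL_CLOCK_HZ / SR_BITS
config_time_s = PARALLEL_LOADS / PARALLEL_CLOCK_HZ

print(f"memory per column : {bits_per_column} bits")
print(f"memory per sensor : {bits_per_sensor} bits = {memory_bytes} bytes")
print(f"readout volume    : {readout_bytes} bytes")
print(f"second sensor     : {second_sensor_readout_bytes} bytes")
print(f"max parallel rate : {max_parallel_rate_hz / 1e3:.0f} kHz")
print(f"config load time  : {config_time_s * 1e3:.1f} ms")
```

The results agree with the rounded values in the text: ~35kBytes of memory, ~51kBytes and ~319kBytes of readout data for the first and second sensors, a maximum parallel load rate just under 600kHz and a configuration load time of ~8ms.
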