## **CMS Binary Chip status**

#### **Outline**

- introduction
- front end design & simulated performance
- current layout status
- system aspects
  - architecture, powering, module concepts
- summary

### **CBC** context

CBC targeted at phase II outer tracker region r > ~ 50 cm

assumed instrumented by short strips ~ 2.5 / 5 cm

sLHC CMS tracker will operate at low T ~ -30 -> -40 °C

=> important to meet specs at low temperature for all simulated process corners (but will still want to test and run chips and modules at room temperature)

DC-DC power supply system assumed

run frontend at VDD=1.1 V to provide headroom for LDO in supply rail to improve PSR

CBC under design since March 2009 in 130 nm IBM CMOS

Lawrence Jones (RAL engineer)

planned submission May 2010

### architecture choice

**binary un-sparsified** architecture: simplicity/robustness, lowest power

#### main functional blocks

- fast front end amplifier 20 nsec peaking
- comparator with programmable threshold trim
- 256 deep pipeline (6.4 us)
- 32 deep buffer for triggered events
- output mux and driver (SLVS)
- fast (SLVS) and slow (I2C) control interfaces
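The 6.4 us quoted for the pipeline can be cross-checked with a short calculation. This is a minimal sketch, assuming one pipeline cell is written per bunch crossing and a 25 ns bunch-crossing period; the 25 ns value and all names below are assumptions, not taken from the slide.

```python
# Pipeline latency coverage, assuming one cell is written per bunch crossing (BX).
BX_PERIOD_NS = 25.0     # ASSUMPTION: 40 MHz bunch-crossing clock, not stated on the slide
PIPELINE_DEPTH = 256    # cells, from the functional block list


def pipeline_latency_us(depth: int, bx_period_ns: float = BX_PERIOD_NS) -> float:
    """Longest trigger latency (in microseconds) the pipeline can cover."""
    return depth * bx_period_ns / 1000.0


latency_us = pipeline_latency_us(PIPELINE_DEPTH)
print(f"{PIPELINE_DEPTH} cells x {BX_PERIOD_NS} ns = {latency_us} us")
```

Under that assumption the result agrees with the 6.4 us in the list above.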

#### **some target specs** (see \* for full list)

- both signal polarities
- DC coupled to sensor up to 1 uA leakage
- noise: < 1000 e for $C_{SENSOR} \sim 5$ pF
- power consumption
  - < 0.5 mW/channel for $C_{SENSOR} \sim 5$ pF

### **CBC**



<sup>\*</sup> http://icva.hep.ph.ic.ac.uk/~dmray/CBC\_documentation/CBC\_specifications.pdf



#### preamp

- resistive feedback absorbs I<sub>leak</sub>
- T network for holes
- Rf.Cf implements short diff. time constant (good for no pile-up)

#### postamp

- provides gain and int. time constant, ~ 50 mV / fC
- AC coupled: removes I<sub>leak</sub> DC shift
- individually programmable O/P DC level implements channel threshold tuning
  - 8 bits, 0.8 mV / bit, 200 mV range
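The threshold trim numbers can be referred back to the input using the postamp gain. The sketch below is only an illustration of that arithmetic; the variable names are hypothetical and the electrons-per-fC conversion is a rounded physical constant, not a value from the slide.

```python
# Input-referred size of the per-channel threshold trim.
GAIN_MV_PER_FC = 50.0       # postamp gain, from the slide
TRIM_BITS = 8               # trim resolution, from the slide
STEP_MV = 0.8               # trim step per bit, from the slide
ELECTRONS_PER_FC = 6241.5   # 1 fC / elementary charge, rounded (not from the slide)

trim_range_mv = (2 ** TRIM_BITS) * STEP_MV        # full range of the trim in mV
step_fc = STEP_MV / GAIN_MV_PER_FC                # one trim step referred to the input
step_electrons = step_fc * ELECTRONS_PER_FC       # same step in electrons
trim_range_fc = trim_range_mv / GAIN_MV_PER_FC    # full range referred to the input

print(f"trim range : {trim_range_mv:.1f} mV = {trim_range_fc:.2f} fC")
print(f"trim step  : {step_fc * 1000:.0f} aC = {step_electrons:.0f} e")
```

With these inputs the 8-bit range comes out at about 205 mV, consistent with the ~200 mV quoted, and one step corresponds to roughly 100 electrons at the input.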

#### comparator

- global threshold (indiv. tuning at postamp O/P)
- programmable hysteresis

short summary of simulated front end performance follows; for much more detail see design review talk

## postamp output pulse shape

~ 20 nsec peaking, ~ 50 mV / fC

robust to temperature (-40 -> +40) and process variations





## noise at postamp output



preamp input device power varied with C<sub>added</sub> to maintain pulse shape

note: power in figures above includes preamp and postamp

noise within spec. (doesn't vary much (< ~10%) with process variation)

## overload recovery at postamp output

T = -40 & +40, I<sub>leak</sub> = 0.5 uA, preamp C<sub>in</sub> = 6 pF, all process corners

4 pC injected at t = 50 ns, 2.5 fC injected at t = 2.5 us

recovery spec. comfortably met (< 2.5 usec)



## time-walk at comparator O/P

dependence of comparator fire time on signal size must be less than 1 BX

#### **ATLAS specification**

≤ 16 ns time difference between comparator output edges for input signals of 1.25 fC and 10 fC, for a threshold setting of 1 fC

(spec. defined for 300 µm sensors – but still maintained for 200 µm, scaling all quantities by 2/3)







## overall layout - so far

```
what's really there
     complete front end chain
           preamp
           postamp + output offset adjust
           comparator + adjust register
     256 cell pipeline
     32 cell L1 triggered data buffer
     pipeline control logic
     I2C
     bias generator
items in yellow not there yet
      (and not necessarily in final position)
     multiplexer
     output pads
     powering circuitry (DC-DC & BandGap from CERN)
     fast control interface
     test pulse
           no time for DLL based version
           will provide pads to feed test caps
 DC-DC -> bandgap/LDO separation issues
```

chip outline in layout figure: 7 mm x 3.5 mm

## sLHC strips readout system



binary unsparsified output frame format similar to APV (just hits, not analog values)

keep data frame ~ 7 µs (must be less than average L1 separation)

- => 128 channel CBC provides output data at 20 Mbps
- => 4 CBCs data combined onto one 80 Mbps link to GBT input
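The frame length and link rates above fit together as follows. This is a rough sketch with illustrative names; the 100 kHz average L1 rate used for the comparison is an assumption and does not appear on the slide.

```python
# Frame size and link rate implied by the readout numbers on the slide.
FRAME_US = 7.0            # target data frame duration, from the slide
CHIP_RATE_MBPS = 20.0     # output rate of one 128-channel CBC, from the slide
CBCS_PER_LINK = 4         # CBCs combined onto one e-link, from the slide
L1_RATE_KHZ = 100.0       # ASSUMPTION: average L1 trigger rate, not on the slide

bits_per_frame = FRAME_US * CHIP_RATE_MBPS            # us x Mbit/s = bits
link_rate_mbps = CBCS_PER_LINK * CHIP_RATE_MBPS       # rate into one GBT lane
mean_l1_separation_us = 1000.0 / L1_RATE_KHZ          # average time between triggers
frame_fits = FRAME_US < mean_l1_separation_us

print(f"bits per frame       : {bits_per_frame:.0f}")
print(f"combined link rate   : {link_rate_mbps:.0f} Mbps")
print(f"mean L1 separation   : {mean_l1_separation_us:.0f} us, frame fits: {frame_fits}")
```

A 7 us frame at 20 Mbps carries 140 bits, i.e. the 128 channel bits plus about 12 bits of header used in the data volume comparison later in the talk.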

plan to use GBT e-link (e-port IP block in chip)

- takes care of data transfer synchronization
- but not on this CBC version

## modularity issues

40 x 80 Mbps lanes / GBT

integer number of GBTs preferred / module group (e.g. TOB rod)

=> no. of e-links needs to be close to n x 40 for efficient GBT usage (≤ n x 40, not just over)

### **TOB** rod example

rod length 120 cm, module dimensions ~10 x 8.5 cm<sup>2</sup>

=> 12 modules / rod

for 5 cm strips, ~110 um pitch => 3 e-links / module

=> 36 e-links total => 1 GBT / rod

for 2.5 cm strips (2 hybrids / sensor)

=> 72 e-links => 2 GBT's / rod
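The rod example can be reproduced with a few lines of arithmetic. The sketch below assumes 12 chips per module for 5 cm strips and 24 for 2.5 cm strips (consistent with 3 and 6 e-links per module at 4 CBCs per e-link); the function and variable names are hypothetical.

```python
import math

LANES_PER_GBT = 40     # 80 Mbps lanes per GBT, from the slide
MODULES_PER_ROD = 12   # from the slide
CBCS_PER_ELINK = 4     # 4 x 20 Mbps CBCs per 80 Mbps e-link, from the readout slide


def rod_links(chips_per_module: int) -> tuple[int, int, float]:
    """Return (e-links per rod, GBTs per rod, fraction of GBT lanes in use)."""
    elinks_per_module = math.ceil(chips_per_module / CBCS_PER_ELINK)
    elinks = elinks_per_module * MODULES_PER_ROD
    gbts = math.ceil(elinks / LANES_PER_GBT)
    return elinks, gbts, elinks / (gbts * LANES_PER_GBT)


# ASSUMPTION: chips per module for the two strip lengths (12 and 24).
for label, chips in (("5 cm strips", 12), ("2.5 cm strips", 24)):
    elinks, gbts, used = rod_links(chips)
    print(f"{label}: {elinks} e-links, {gbts} GBT(s) per rod, {used:.0%} of lanes used")
```

In both cases the e-link count sits just below a multiple of 40, which is the "≤, not just over" condition stated above.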



## powering

would like to explore some options

#### switched capacitor DC-DC (CERN) <sup>1</sup>

converts 2.5 V -> 1.25 V

might be needed if tracker I/P voltage (~12 V) cannot be converted to ~1.25 V in one stage

could use 1.25 V to provide CBC digital rail

#### LDO (low drop-out) linear regulator

converts 1.25 V -> 1.1 V

- provides regulated rail to analog front end
- provides some supply noise rejection <sup>2</sup>
- needs bandgap reference input (CERN)

can incorporate without risk - ensure can be over-ridden

- e.g. don't use DC-DC: power digital separately (1.25 V or less), power LDO from separate 1.25 V
- or don't use DC-DC or LDO: power digital and analog independently (or together)



<sup>1</sup> http://indico.cern.ch/getFile.py/access?contribId=3&resId=0&materialId=slides&confId=85278

<sup>2</sup> http://icva.hep.ph.ic.ac.uk/~dmray/CBC\_documentation/LDO\_PWG\_Sep09.pdf

## module concepts

#### hybrid, bonding, PA issues

CBC prototype will be "conventional" layout

128 channels, effective pitch 50 µm, wire-bondable

no special test setup preparations & generally easier to test

**but** 50 µm not well matched to sensor pitch ~ 100 µm

=> pitch adaptation somewhere - where? (would like to avoid separate PAs in future system)

hybrid? - possible but uses space for fanout

would help if CBC input pad pitch better matched to sensor ~ 100 µm pitch

#### other issues

would module design benefit from 256 (2 x 128 channels back-to-back)? (not impossible - but detailed study may yet reveal difficulties)

should we be looking at bump-bonding for chip/hybrid/sensor connection?

=> a number of issues to re-examine for a subsequent iteration





### summary

```
CBC development at advanced stage
     most of chip design complete
     front end meets all specs for simulated performance
           -40 -> +40, all process corners, VDD = 1.1 V
May submission – CERN 130nm MPW => plenty of chips
     expect back ~ September
system issues
     compatible with DC-DC powering schemes
           some options to study (with/without switched cap. DC-DC and/or LDO)
     route to compatibility with GBT system seems clear
           can incorporate e-link in future
will learn a lot from this prototype
     gain valuable experience with 130 nm
     lots of functionality and performance issues to study
           noise, power, powering, radiation (SEE),....
     useful information for future chip and system developments
```

## extra

### test structures

lots of space available

test structures to include will be:

one complete dummy channel with buffered signals along chain (top edge of chip)

e.g. preamp, postamp and comp. O/P's

pads on bias generator outputs

allows to directly measure and/or over-ride

other test structures

e.g. individual components and arrays

. . . .



chip outline in layout figure: 7 mm x 3.5 mm

### 128 -> 256 back-to-back?

0201 capacitors

e.g. power, control but not much else

reduced area, cheaper

fewer pads available on 256 version

less flexible

overall power consumption probably not much different

#### conclusions

256 back-to-back probably not impossible

but detailed study may yet reveal difficulties



bump-bond option will increase height by factor ~ 2, plus many other implications -> significant changes to layout

128 is still the best prototype unit for now

### combining chips output data using CERN e-port IP core



### alternative scheme

put e-port in separate GBT interface chip

incorporates circuitry to combine CBC data streams



an option to consider in the future



## **CBC** power


0.5 mW / channel seems like an achievable target (c.f. 2.7 mW for APV25)

digital is biggest uncertainty, and maybe largest contributor

- hope to improve estimate as design progresses
- can consider running at lower voltage (dig. power ~ V²) => extra contingency
  - e.g. 1.2 V -> 0.85 V: power consumption halved
- will keep power rails separate on chip to keep option open

using numbers above: 128 chan. chip needs ~ 20 mA analogue, ~30 mA digital
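Both the V² argument and the chip currents can be checked numerically. This is a sketch under stated assumptions: the analogue rail is taken as 1.1 V (the front end VDD quoted earlier) and the digital rail as 1.2 V (the starting point of the 1.2 -> 0.85 example); the names are illustrative.

```python
def scaled_dynamic_power(p_ref: float, v_ref: float, v_new: float) -> float:
    """Scale dynamic (switching) power with the square of the supply voltage."""
    return p_ref * (v_new / v_ref) ** 2


# Relative digital power when the rail is lowered from 1.2 V to 0.85 V.
digital_ratio = scaled_dynamic_power(1.0, 1.2, 0.85)

# Chip-level budget from the per-channel target.
CHANNELS = 128
TARGET_MW_PER_CHANNEL = 0.5
chip_budget_mw = CHANNELS * TARGET_MW_PER_CHANNEL

# Power implied by the quoted currents; rail voltages are ASSUMPTIONS (see lead-in).
ANALOGUE_MA, ANALOGUE_V = 20.0, 1.1
DIGITAL_MA, DIGITAL_V = 30.0, 1.2
chip_power_mw = ANALOGUE_MA * ANALOGUE_V + DIGITAL_MA * DIGITAL_V

print(f"digital power at 0.85 V relative to 1.2 V: {digital_ratio:.2f}")
print(f"chip budget: {chip_budget_mw:.0f} mW, from currents: {chip_power_mw:.0f} mW "
      f"({chip_power_mw / CHANNELS:.2f} mW/channel)")
```

The ratio comes out close to one half, matching "power consumption halved", and the quoted currents stay within the 0.5 mW / channel target under the assumed rail voltages.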

## front end PSR without LDO supply



#### time domain picture

measured noise waveform added to VDD rail supplying FE circuit

sampled scope data for Enpirion "quiet" converter provided by Aachen

but x10 to (artificially) make it noisier

~ 80 mV pk-pk

1 fC normal signal completely swamped by noise

## front end PSR with LDO supply



measured **x10** (80 mV pk-pk) noise waveform now added to LDO Vin

LDO loaded by single CBC frontend + 25 mA extra dummy load

1 fC signal at postamp O/P now appears

postamp O/P noise just visible

~ 125e pk-pk

### matching CBC modules to GBT links

chips / rod

80 Mbps lanes / rod

(40 lane) GBTs / rod

#### Duccio's FNAL workshop talk – October '09

| Tag                  | OB_L1   | OB_L2   | OB_L3   | OB_L4   | EC_R1   | EC_R2   | EC_R3   | EC_R4   | EC_R5   | EC_R6   | EC_R7   |
|----------------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| Type                 | rphi    | rphi    | rphi    | rphi    | rphi    | rphi    | rphi    | rphi    | rphi    | rphi    | rphi    |
| Area (mm²)           | 8475.8  | 8475.8  | 8475.8  | 8475.8  | 8475.8  | 8475.8  | 8475.8  | 8475.8  | 8475.8  | 8475.8  | 8475.8  |
| Occup (max/av) [%]   | 2.7/2.6 | 3.2/3.0 | 1.9/1.8 | 0.8/0.8 | 3.9/3.5 | 2.8/2.5 | 2.2/2.0 | 3.3/3.0 | 2.6/2.4 | 2.0/1.8 | 1.7/1.5 |
| Pitch (min/max) [µm] | 110     | 110     | 110     | 110     | 110     | 110     | 110     | 110     | 110     | 110     | 110     |
| Segments x Chips     | 4x6     | 2x6     | 2x6     | 2x6     | 4x6     | 4x6     | 4x6     | 2x6     | 2x6     | 2x6     | 2x6     |
| Strip length [mm]    | 24.9    | 49.8    | 49.8    | 49.8    | 24.9    | 24.9    | 24.9    | 49.8    | 49.8    | 49.8    | 49.8    |
| Chan/Sensor          | 3072    | 1536    | 1536    | 1536    | 3072    | 3072    | 3072    | 1536    | 1536    | 1536    | 1536    |
| N. mod               | 960     | 1248    | 1536    | 2016    | 400     | 480     | 560     | 600     | 680     | 760     | 800     |
| Channels (M)         | 2.95    | 1.92    | 2.36    | 3.1     | 1.23    | 1.47    | 1.72    | 0.92    | 1.04    | 1.17    | 1.23    |
| Power (kW)           | 1.5     | 1       | 1.2     | 1.5     | 0.6     | 0.7     | 0.9     | 0.5     | 0.5     | 0.6     | 0.6     |
| N of GBTs            | 160     | 104     | 128     | 168     | 80      | 80      | 120     | 60      | 60      | 80      | 80      |
| GBT power (kW)       | 0.3     | 0.2     | 0.3     | 0.3     | 0.2     | 0.2     | 0.2     | 0.1     | 0.1     | 0.2     | 0.2     |

| Barrel layer             | OB_L1 | OB_L2 | OB_L3 | OB_L4 |
|--------------------------|-------|-------|-------|-------|
| ~ radius [cm]            | 50    | 65    | 85    | 110   |
| circumference [cm]       | 314   | 408   | 534   | 692   |
| rods per ½ barrel        | 40    | 52    | 64    | 84    |
| 128 channel chips/module | 24    | 12    | 12    | 12    |

module dimensions 10 x 8.5 cm<sup>2</sup> (z x rphi), ½ barrel 120 cm long => 12 modules / ½ barrel rod

GBT lanes not fully utilised, but integer no. of GBT's per rod is presumably optimal
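The barrel entries of the table can be cross-checked against the per-rod lane counting. The sketch below is an illustration only: it assumes two half barrels per layer and 4 CBCs per 80 Mbps lane, and the names are hypothetical.

```python
import math

MODULES_PER_ROD = 12   # from the table notes
CBCS_PER_LANE = 4      # 4 x 20 Mbps CBCs per 80 Mbps lane (readout slide)
LANES_PER_GBT = 40     # from the modularity slide
HALF_BARRELS = 2       # ASSUMPTION: two half barrels per layer

# tag: (rods per half barrel, 128-channel chips per module, "N of GBTs" in table)
BARREL_LAYERS = {
    "OB_L1": (40, 24, 160),
    "OB_L2": (52, 12, 104),
    "OB_L3": (64, 12, 128),
    "OB_L4": (84, 12, 168),
}


def rod_summary(chips_per_module: int) -> dict:
    """Chips, 80 Mbps lanes and 40-lane GBTs needed by one rod."""
    chips = chips_per_module * MODULES_PER_ROD
    lanes = math.ceil(chips / CBCS_PER_LANE)
    gbts = math.ceil(lanes / LANES_PER_GBT)
    return {"chips": chips, "lanes": lanes, "gbts": gbts}


results = {}
for tag, (rods, chips_per_module, gbts_in_table) in BARREL_LAYERS.items():
    rod = rod_summary(chips_per_module)
    total_gbts = rods * HALF_BARRELS * rod["gbts"]
    results[tag] = (rod, total_gbts, total_gbts == gbts_in_table)
    print(tag, rod, "layer GBTs:", total_gbts, "matches table:", total_gbts == gbts_in_table)
```

Under these assumptions the per-rod counting reproduces the barrel "N of GBTs" row, with 36 of 40 (or 72 of 80) lanes in use per rod.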

### unsparsified binary advantages

- data volume known - no trigger-to-trigger variations - occupancy independent
  - => simpler readout system on-detector
    - FE chips -> GBT -> link
    - no extra data buffering and concentrating chips in the system
  - => simpler system off-detector too I think
    - known and unchanging origin and volume of data - must simplify processing
- synchronous system - all FE chips doing same thing at same time - easy to emulate externally
  - no need to timestamp on front end
  - easy to spot upset chips (pipe address wrong)
- likely to be lowest power FE chip architecture
  - analog FE + comparator followed by simple digital pipeline and off-chip mux
  - no ADC, no analog pipeline readout
  - easier to scale designs to even finer feature processes (analog pipelines using gate capacitance probably only just possible in 0.13 um)
  - easy to keep dynamic power variations small or negligible
- simpler FE module design (fewer chips)

### unsparsified binary disadvantages

```
no pulse height information
binary worse for position resolution
common mode immunity – (short strips will help – less pickup)
larger data volume => more off-detector links
```

```
data volumes
unsparsified data volume / 128 channel chip = 128 bits + header (say 12) = 140 bits
sparsified data volume / hit
     8 bit chip address
     7 bit channel address (x no. of channels above threshold)
     8 bit timestamp
occupancy 0.8%, 1 hit, data volume = 23 bits
occupancy 4%, 5 hits, data volume = 51 bits
so factor ~ 3 – 6 (depending on occupancy) inefficient assuming 100% link bandwidth utilised
   factor ~ 1.5 – 3 inefficient assuming 50% link bandwidth utilised
```
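The data volume comparison can be written out explicitly. This is a minimal sketch of the same arithmetic using the bit counts listed above; the function names are hypothetical, and the number of hits is simply the occupancy times 128 channels rounded to the nearest integer.

```python
CHANNELS = 128
HEADER_BITS = 12        # "say 12" in the list above
CHIP_ADDR_BITS = 8
CHAN_ADDR_BITS = 7      # one per channel above threshold
TIMESTAMP_BITS = 8


def unsparsified_bits() -> int:
    """Fixed frame size per 128-channel chip: one bit per channel plus header."""
    return CHANNELS + HEADER_BITS


def sparsified_bits(n_hits: int) -> int:
    """Chip address + timestamp + one channel address per hit."""
    return CHIP_ADDR_BITS + TIMESTAMP_BITS + CHAN_ADDR_BITS * n_hits


for occupancy in (0.008, 0.04):
    n_hits = round(occupancy * CHANNELS)
    sparse = sparsified_bits(n_hits)
    ratio = unsparsified_bits() / sparse
    print(f"occupancy {occupancy:.1%}: {n_hits} hit(s), sparsified {sparse} bits, "
          f"unsparsified/sparsified = {ratio:.1f} (or {ratio / 2:.1f} at 50% link use)")
```

The ratios land at roughly 6 and 3 for the two occupancies, which is the "factor ~ 3 - 6" quoted for fully utilised links.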

(do we need to provide for heavy ion running in CMS at SLHC?)

# layout pictures

layout figures (dimension labels as shown):

- Preamp: 35 um
- Postamp: 230 um, 35 um
- Postamp output offset adjust: 35 um
- Comparator and offset adjust register: 35 um



some other bits







### Pipeline control logic