
Vertex Report and Plans

Luigi Vigani

Mu3e Collaboration Meeting

14/07/2025


Outcome from cosmic+beam time

We made it!

Many chips behave well and show the expected behaviour

Tracks and momenta could be extrapolated

BUT…

How many misbehaved? How many were lost?

Is it possible to work with this?

And many more questions…

Still: huge progress

Thank you to everyone involved!

Part 1: lessons learnt and general considerations

Lessons learnt: planning

[Timeline figure with phases: cage preparation (cables), vertex test installation, service installation, vertex installation, cosmic run, move inside magnet, beam, cosmic run 2]

Issue: the cable placement took long and was difficult
Mitigation: prepare on a mockup

Issue: the service installation was long
Mitigation: more people could join (more about this later)

Issue: the first cosmic run was shorter than ideal and many tasks were postponed; the installation was not fully proven by its end
Mitigation: next year’s plan to have more headroom…?

Lesson learnt: preparation

More on the messy cable placement

On paper, the principles of organisation were “solid”:

  • Solutions tested on a mockup
  • Allocated (too much) time for installation on the cage
  • QC-tested and secured the cables

In practice, it was all too improvised:

  • Only the critical portion of the beamline was reproduced on the mockup
  • Time miscalculated: a bottleneck for other systems
  • Securing system improvised
  • No pre-testing of the insertion on the BoSSL ring
  • Other factors?


Lesson learnt: preparation

Cable placement can be done better

  • Preparation should be done on a more complete mockup
    • Not only the region close to the detector
      • Cable routing on the beamline was OK
      • From beamline to DABs clarified on the spot (plastic spirals and 3D printed hooks)
      • Keeping the cables in place to be tested (kapton tape vs cotton wires?)
    • People can get acquainted with the working space (not the most comfortable)
    • LV and sense cables should have better preparation (LV length, sense soldered)
  • Placeholders for cables
    • What happens to them between installation and connection?
      • Note: detectors are installed much later
      • Many things happen in-between and cables must be protected


The risk was taken consciously last year, mostly due to time constraints. Can we afford it next time…? NO. See Thomas R’s talk (and probably more discussion on this).

Lesson learnt: preparation

  • Surprises can come from the least risky sides
    • DABs on DS were found not connected only after moving inside the magnet…

Decisions were taken consciously: with little time, prioritisation of tasks is key.

The most delicate and riskiest components were put forward.

Bottom issue: DS not tested properly before insertion in the magnet

Lesson learnt: preparation

There is clearly some path here…

“The BoSSLs are difficult to place!” … “At least the rest of the cable is easy…”
“Where do we drive the cables?”
“The LVDS lines are very delicate!” … “At least the DABs are easy to place…”
“The DABs are not connected!”

The knowledge of the system is deep enough: we can assess which components are exposed to more risk than others (and should be prioritised).

The issue is in the relative weight assigned to each of them…

Lesson learnt: preparation

There is clearly some path here…

[Diagram: two balances weighing “task risk” against “resources involved” — the one used so far vs the one to be adopted]

Resources are meant in the broad sense:

  • Measurable: people, time, material,...
  • Not measurable: effort, mental load,...

This shift is what comes with experience!
Lesson learnt: preparation

DAQ+Software-wise

  • [Most of the] [Vertex] software tools were available when needed
  • Preparing FW/SW/scripts/pages was smoother and more effective
    • Inherent advantages
      • Remote contributions
      • More flexible (no need to wait for components, technicians,...)
    • Earlier planning
      • Discussions started as early as November in the DAQ and software meetings
      • Efforts scaled up after the hackathon

Overall: the hackathon boosted the contribution from the whole collaboration a lot

More people got involved and saw the open issues; many stayed on until beamtime

Could it be done earlier? Can the contribution be more evened out in time?

Lesson learnt: operation

Takeaways from previous discussions (2024 meetings, Wengen,...), with more insights learnt in this beamtime:

  • We must move from the “A” way of doing things (speeding things up by skipping essential procedures) to the “B” way (improving quality, consistency and accessibility)
    • “A” and “B” are not binary, but a spectrum
    • Falling back is very easy: “B” is also a state of mind

  • We must avoid unnecessary risks
    • No shortcuts!

  • Time is limited
    • Compromises! Reduce work on tasks that are not strictly related to the final goals or that do not improve the results enough
    • Finding the best compromise requires experience and time (which per se defies the principle of compromise itself…)

Lesson learnt: operation

Example of “composite B” execution: micro-twisted pair cable placement

[Annotated photo: different portions of the placement reached different levels of “B” — 90% B, 50% B (got better), 20% B]

The “compromises”

[Chart: quality vs invested time, rising from “A” to “B” — the “typical” learning curve (accidentally similar to an IV curve). Similar to QC, with inverted axes.]

The compromise: obtain the best result in a short time. Find this spot!

Example with Vertex calibration: LVDS optimisation, moving along the curve:

  • Just change the DACs until you “see” good links
    • We have a custom page where we can select chips, change DACs and check the 8b10b error rate
  • Learn the scripts, understand where settings and variables are, identify the relevant DACs,...
    • Write the first script that does not crash
    • This is where we were at the beginning of the beam time
  • Change script parameters, identify boundaries, produce “some” output
  • Get consistent results, understand them, write an algorithm to find optimal working points… (see the sketch below)
  • Plot the output, let the user decide where the optimal point is
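As an illustration of the “algorithm” step, a minimal sketch of a DAC scan, assuming a hypothetical measure(value) hook that sets the DAC on a chip and returns the observed 8b10b error rate (the real slow-control calls and thresholds will differ):

    # Sketch: scan a DAC, record the 8b10b error rate at each setting, and
    # suggest the centre of the widest error-free window as working point.
    def find_working_point(values, measure, threshold=1e-9):
        rates = [measure(v) for v in values]
        good = [r < threshold for r in rates]
        best_start, best_len, start = 0, 0, None
        for i, ok in enumerate(good + [False]):     # sentinel closes last run
            if ok and start is None:
                start = i
            elif not ok and start is not None:
                if i - start > best_len:
                    best_start, best_len = start, i - start
                start = None
        if best_len == 0:
            return None, rates                      # no error-free setting
        return values[best_start + best_len // 2], rates

    # Toy usage with a simulated link that is only good for settings 12..20:
    fake = lambda v: 1e-12 if 12 <= v <= 20 else 1e-3
    wp, rates = find_working_point(list(range(32)), fake)
    print("suggested working point:", wp)           # -> 16

Plotting rates against values then gives exactly the last step: let the user decide where the optimal point is.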


Lessons learnt: operation

  • Safety, safety, safety!
    • Many things can be done better, but there are far more ways to do things worse
    • Before attempting anything different, think carefully
      • Risk vs gain evaluation
      • Decisions based on actual investigations
        • No more “I saw once in a testbeam…” or “this person told me once…”
  • Consistency, consistency, consistency!
    • Many issues might be of a statistical nature
      • Things that happen sometimes and look random at first
      • If seen once, it might be harmful to change procedures immediately without inspecting the issue first
    • The efficient way to tackle them is to collect statistics consistently
      • Same scripts, same logs (variable history and actions taken by the user)
      • Consider that these issues have to be analysed later
        • No more “I changed a DAC and it worked” with no trace of it anywhere


This is the “B” state of mind.
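A minimal sketch of the “leave a trace” idea: route every DAC change through one helper that appends to a history log. All names here (HISTORY_FILE, the setter hook) are invented for illustration.

    # Sketch: every DAC change goes through one helper that logs who
    # changed what, when and why, so issues can be analysed later.
    import json, time

    HISTORY_FILE = "dac_history.jsonl"              # placeholder path

    def apply_dac(chip, dac, value, reason, user, setter):
        setter(chip, dac, value)                    # actual slow-control call
        entry = {"time": time.time(), "chip": chip, "dac": dac,
                 "value": value, "reason": reason, "user": user}
        with open(HISTORY_FILE, "a") as f:
            f.write(json.dumps(entry) + "\n")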


Lesson learnt: operation

User operation

  • Custom page and “operation flow” in good shape
    • User feedback?
    • Improvements here and there, but overall the concept holds
      • Mostly to streamline operations (fewer button clicks)
  • Scripts constantly improved
    • Moving from msl to the python sequencer
  • Manual OK, but still needs some work
    • Not always followed through: better during the cosmic run than during beam time
      • Too complicated? Page too easy? No need (no turn off and on)? Things clear (hardly)?
    • Could it be more complete?
    • Google form idea abandoned… should it be revived?



Lesson learnt: operation

DQM

  • A lot of progress, the direction is correct
  • Moving towards a more streamlined way (see Mikio’s talk)
    • The “3-tier” system seems good
    • Details to be discussed (pdf vs html,...)
  • From experience, DQM can have 2 purposes:
    • Assess data quality
      • The final goal is an automated quality scheme
    • Help find unexpected issues
      • Worked a few times
      • More difficult and vague


Part 2: inspecting the Vertex performance and future improvements

Overview

[Chip status maps for Layer 1 and Layer 2, from note 113]

~70% of links available. This can’t be afforded next time! The issues are being investigated:

  • Issue fully understood: end-piece flexes not so flexible, a miscommunication with the producer (see Joey’s talk)
  • Hiccups in production: new tools and methods designed for v2
  • SpTAB issues found; mitigation strategies outlined:
    • Bonding parameters reviewed with LTU
    • HDI design changed: critical lines closer to the chip
    • Re-considering glueing and transportation
      • Glue on the interposer flex adhered to the plastic envelope
  • 2 had unmaskable pixels: check the QC data (there was a bug)
  • Still an open issue… (the lost chips, next)

Lost chips issue

Possible explanations and how to confirm them:

  • Lost Clk/SIN connections
    • Micro-twisted pair cables or TA-bonds
    • One ladder tested like this during extraction: negative response
    • Easy to verify with the MTP breakout board by measuring resistances
  • Other relevant pads lost
    • ConnRes of the last chip
    • Power-on-reset
    • …?
  • Weak Clk/SIN connections
    • Are the TA-bond connections binary, or a spectrum?
  • Lost VDD/GND connections
    • We consider these low risk as they are redundant
    • How many can we lose before chips stop working well?

Visual inspection…? Other resistances to measure? Needle probe?

Anyway, better to investigate on a lab bench, not on the cage…


Lost chips issue

  • Hard to verify inside the magnet
  • Possibly on the cage
    • Visual inspections
    • Direct access to some components
  • Better on a lab bench
    • Decommissioning
  • A proper analysis of QC and slow control data might reveal clues (see the sketch below)
    • Hystory_analysis project started (ASR, AE)
    • Looking for inconsistencies in turn-on procedures, configuration, resets
      • Possibly find the exact moment when we lost them
    • Organic analysis of QC data to be started by Heidelberg students
  • Solving this is the highest priority after the cage is out
    • If issues arise from production we have to know ASAP
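A sketch of what such a history scan could look like, assuming the variable history can be loaded into a pandas DataFrame with columns time, variable and value (the real log format will differ):

    # Sketch: find when a monitored variable first went bad, then list
    # everything else that changed around that moment.
    import pandas as pd

    def first_drop(history, variable, threshold):
        """Time of the first sample where `variable` falls below threshold."""
        var = history[history.variable == variable].sort_values("time")
        bad = var[var.value < threshold]
        return None if bad.empty else bad.time.iloc[0]

    def changes_around(history, t0, window_s=600):
        """All recorded entries within +-window_s of time t0."""
        near = history[(history.time - t0).abs() < window_s]
        return near.sort_values("time")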


HV issue

  • 2 power partitions did not have HV during beamtime
    • One was found out only after cage extraction: HV box hardware-disabled, no current
    • One deteriorated over time…

[HV current history plot, annotated 04/04, 01/06 and ~07/06: maxed out at 0 V]

What happened here is not understood. The resistance had not changed at extraction.

HV issue

  • Investigation not possible inside the cage
  • Keithley to check the current limit (a sketch follows below)
    • Neither a solution nor an answer
  • Broken chips? Only verifiable after decommissioning
    • Visual inspections might help (burnt marks)
    • Thermal camera
  • Another crucial point to understand before v2 production
  • The first power partition (the one with the non-working HV box) was not recovered after cage extraction
  • Another power partition has no current after cage extraction
    • To be verified once out of the magnet
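If the Keithley check is scripted, a minimal IV-scan sketch over pyvisa could look like this (instrument address, voltage range and compliance are placeholders; SCPI in the style of a 2400-series SMU):

    # Sketch: ramp the HV line with a Keithley 2400-style SMU and log the
    # current at each step. Adapt the limits before running on real ladders!
    import pyvisa

    rm = pyvisa.ResourceManager()
    smu = rm.open_resource("GPIB0::24::INSTR")     # placeholder address
    smu.write("*RST")
    smu.write(":SOUR:FUNC VOLT")
    smu.write(":SENS:CURR:PROT 10E-6")             # 10 uA compliance (example)
    smu.write(":OUTP ON")
    for v in range(0, -21, -1):                    # 0 .. -20 V, 1 V steps
        smu.write(f":SOUR:VOLT {v}")
        reading = smu.query(":READ?")              # CSV: voltage, current, ...
        print(v, "V ->", float(reading.split(",")[1]), "A")
    smu.write(":OUTP OFF")
    smu.close()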



LV puzzle

The adjust values for the DCDC converters were not very uniform: 2 power partitions needed >10% adjustment (note: nominally 1.9 V at 0%).

The resistance measurements on VDD-GND do not vary this much; maybe it is due to imbalances in the power distribution (disconnected ladders).

It does not seem to be directly connected to any issue. (A toy cross-check follows below.)
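A toy cross-check of that reasoning, assuming (for illustration only) that the adjust percentage scales the nominal 1.9 V linearly and that the extra drop is ohmic; the load current below is an invented number:

    # Toy model: how much extra ohmic drop would a >10% adjustment
    # compensate? V_out = 1.9 V * (1 + adjust/100), drop = I * R.
    V_NOM = 1.9  # V at 0% adjustment

    def extra_resistance(adjust_percent, i_load):
        """Line resistance that would absorb the extra output voltage."""
        return V_NOM * (adjust_percent / 100.0) / i_load

    # Example: 12% adjustment at an assumed 2 A load implies ~0.11 ohm of
    # extra drop -- more than the measured VDD-GND spread suggests, which
    # points to a current imbalance (disconnected ladders) instead.
    print(extra_resistance(12, 2.0))               # -> 0.114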



5 V issue

  • Sometimes we see 5 V on the VDD sense line
    • It seems unphysical
  • 2 cases observed
    • When it happens randomly
      • 5 V jump in the middle of operation
      • Happened on a few power partitions, a few times a day, but only on a few days over the whole period
      • Investigated once with a voltmeter
        • It is not on the MuPix VDD line
        • It is in front of the ADC (not an ADC failure)
      • Source still unknown
      • Can be recovered by power cycling everything (DCDC, HV, FEB crates)
    • When it happens consistently
      • 5 V jump during power-up of a specific power partition
      • Can be recovered by power cycling only that DCDC crate
      • Solved by applying the target HV after power-up…



5 V issue (continued)

  • Mitigation strategy: modify the DCDC board design
    • To be discussed with Frederick
  • Open questions in the second case
    • Why should the power-up procedure matter?
    • Are MuPix modules sensitive to this?
    • There are indeed discrepancies:
      • During QC: LV on, HV on (low V), power up, then increase HV
      • During operation: LV on, HV on (high V), then power up
    • Is the QC way the preferred one? No reason to believe it
    • Is this related to the lost chips?
    • Possible investigation: take 2 good ladders and power cycle both ways (a sketch follows below)
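A sketch of that investigation as a script, with all hardware hooks (set_lv, set_hv, power_up, read_vdd_sense) as invented placeholders and the HV values as examples only:

    # Sketch: cycle two good ladders with both power-up orders and log
    # whether the 5 V anomaly appears on the VDD sense line.
    import itertools, time

    def cycle(ladder, qc_style, set_lv, set_hv, power_up, read_vdd_sense):
        set_lv(ladder, on=True)
        if qc_style:                      # QC order: low HV, power up, raise HV
            set_hv(ladder, volts=-5)
            power_up(ladder)
            set_hv(ladder, volts=-20)     # example target HV
        else:                             # operation order: full HV, then power up
            set_hv(ladder, volts=-20)
            power_up(ladder)
        time.sleep(1)
        return read_vdd_sense(ladder)     # ~5 V here would flag the anomaly

    def run_comparison(ladders, n_cycles, **hooks):
        log = []
        for ladder, qc, i in itertools.product(ladders, (True, False),
                                               range(n_cycles)):
            log.append({"ladder": ladder, "qc_style": qc, "cycle": i,
                        "vdd_sense": cycle(ladder, qc, **hooks)})
        return log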


Good chips

How to tell a good chip? From the online DQM:

  • Uniform col/row distributions
  • Uniform ToA distribution
  • Good-looking ToT distribution
    • Noise peak, signal peak and delay peak (the latter to be calibrated off)

Very high efficiency expected…!
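These criteria lend themselves to the automated quality scheme mentioned earlier; a minimal sketch, with invented thresholds and schematic histogram inputs:

    # Sketch: flag a chip as "good" if its col/row/ToA projections have
    # no large dead fraction and no hot bins. Thresholds are placeholders.
    import numpy as np

    def looks_uniform(hist, dead_frac=0.05, hot_factor=10.0):
        hist = np.asarray(hist, dtype=float)
        if hist.sum() == 0:
            return False                           # silent chip
        dead = np.mean(hist == 0) > dead_frac      # too many empty bins
        hot = np.any(hist > hot_factor * max(np.median(hist), 1.0))
        return not dead and not hot

    def good_chip(col_hist, row_hist, toa_hist):
        return all(looks_uniform(h) for h in (col_hist, row_hist, toa_hist))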

Bad chips

[Examples from the online DQM:]

Very bad sorter efficiency… Partially broken?

Low efficiency? Stuck state machine?

???

Weird chips

[Hit maps with holes]

Why holes? Software or hardware?

There is a crate of beer at stake


Bad & Weird chips

  • They should be treated as not working
  • More analysis of their data output needs to be done
    • Were they always bad?
    • Were they bad at QC?
    • Are there other parameters that are out of the ordinary?
    • Change of DACs?
  • This is one more question to be solved by v2…


Part 3 (last and shortest): future plans and proposals

Immediate future

  • Cosmic run 2 ongoing (until the end of July)
    • Review and improve calibrations and tuning
      • See Thomas Senger’s talk
      • Move the routines more towards “B”
    • Take some cosmics
      • With SciFi?
      • This time we should all believe we saw them
  • Cage out into the new staging area (when?)
  • Vertex removed and decommissioned
  • Kick-start v2 production
    • See Joey’s talk
    • Many improvements already drawn from this campaign

No defined timeline for the cage extraction and the decommissioning. The main goal is to investigate the open points presented so far: first explore the options on the cage, then move to the lab when all options are exhausted.


Further future

  • V2 will continue into next year (see Joey’s talk)
  • Installation of new cables and boards in autumn (Thomas Rudzki’s talk)
    • Together with the outer layers for the first time
    • Vertex needs to be out by then
  • New data-taking campaign next year with V2…?
  • Lesson learnt: we need more time
    • Always underestimated
    • If we want to go >99% “B”, it might require months
  • Vertex proposal: towards the end of 2026 rather than June like this year
  • Discussion to be opened in this meeting
    • Need more input from other subsystems (many presentations following)
    • MEG plans?



Backup


Overview after construction

[Chip status maps for Layer 1 and Layer 2, from note 113]