Jefferson Lab
Cullan Bedwell, Abhijeet Chawhan, Julie Crowe, Diana McSpadden
Spring 2023
AIEC Capstone
Jefferson Lab (JLab)
11/05/21
Newport News, Virginia, Continuous Electron Beam Accelerator
Fig. 1: The GlueX Spectrometer in Hall D at Jefferson Lab, viewed from the downstream side, in October 2017.
The GlueX Detector
FCAL Problem Statement
Model Input:
Model Output:
Traditional FCAL calibration method:
Machine learning method:
Can we predict gain calibrations from experimental conditions and equipment measurements (LED pulses), using previously collected, well-calibrated data?
Why Use Machine Learning?
Timeline
Feb 2023
Data Mining, Data Cleaning, Data Exploration, Data Visualization commenced
March 2023
Data Mining, EDA complete.
Model creation complete.
Jan 2023
Gathering Domain knowledge in progress.
Feb 2023
EDA and Data Visualization in progress.
Feature Engineering + Model creation in progress.
April 27, 2023
Final Model
Conclusions
Iterative approach for solution design.
Gathering domain knowledge is a continuous process.
April 2023
Model Evaluation
Gain Calibration Evaluation
Three Separate Research Questions
RQ 1: Predict Inner Block Radiation Damage
RQ 2: Run-to-Run Gain Calibration
RQ 3: Block-to-Block Gain Calibration
Predict a block’s radiation damage due to beam exposure, based on the block’s radius and the integrated beam exposure from the experiment (which differs between experiments).
Predict the gain of all blocks so that the entire FCAL measures the “same” for all runs in a run period, based on a reference run.
“Blind Drift Calibration” of sensors: calibration of sensors without a reference sensor.
Questions:
We have:
Understand the Complex Data Ecosystem
Data Quality is Key
Gains Are a Function of Time and Radius
Normalized Gain Calibration By Run Index - PrimeX Run Period
Next Steps
Feature engineering
Implement Convolutional Neural Net
Combine work
Related Works
https://arxiv.org/pdf/1707.03682.pdf
Proposes a novel deep learning method named projection-recovery network (PRNet) to blindly calibrate sensor measurements online.
The PRNet first projects the drifted data to a feature space, and uses a powerful deep convolutional neural network to estimate drift-free measurements.
Listed below are the earlier works we consulted for insight into solutions implemented for similar calibration problems.
Literature scraped for Solution Design
Promising Read
Reminder of FCAL Problem Statement
Model Input:
Model Output:
t = time == run by run is our time
r = radius, ring
I_beam = integrated beam current
j = ring number (different number of blocks per ring)
G_i: Gain for block i (primex_gains.csv) - function of time, function of beam intensity.
G_i = g_PMT_i(t) * g_RD(r_i, I_beam)
Q_j(t) = average(G_i in ring j)
…
JLab can provide us with:
sum over blocks of G(t) / n
sum over blocks of G(0) / n
START OF UNUSED SLIDES FOR 02/23 PRESENTATION
Early Challenges in feature engineering
06/06/22
Discussion: Behavior of gains in outer rings during primex (2019): Ratio to t(0) gradius
Discussion: negative chi2/ndf values from primex (2019)
Ratio to t(0) gradius
61352, 61353, 61357, 61358, 61359, 61360, 61361, 61362, 61363, 61364, 61365, 61366, 61367, 61368, 61369, 61371,
61493, 61495, 61496, 61498, 61499, 61500, 61501, 61505,
61580, 61581, 61582, 61583, 61584, 61585, 61602, 61603, 61606, 61607, 61608, 61609, 61610, 61611,
61670, 61671
Related to vmon changes?
LED Flashes
JLAB Capstone Problem:
Forward Calorimeter (FCAL) Calibration
The GlueX FCAL consists of 2800 lead glass blocks, each 4 cm x 4 cm x 45 cm, stacked in a circular array. Each block is optically coupled to an FEU 84-3 PMT, which will be instrumented with flash ADC electronics. GlueX-doc 985, 988, and 989 document the GlueX FCAL as presented in the February 2008 Calorimetry Review.
FCAL Problem Statement
The Forward Calorimeter (FCAL) is a component of the GlueX spectrometer made of 2800 individual lead glass modules, each coupled to its own photomultiplier tube (PMT). The FCAL provides timing and energy measurements for photon showers. The Cherenkov light emitted by the electromagnetic showers produced within the lead glass blocks is detected by PMTs. The resulting PMT pulses are digitized using Flash analog-to-digital converters (fADCs) and the timing resolution is measured using a pulsed LED source.
However, PMT responses vary. They may be sensitive to temperature, humidity, magnetic fields, time, distance from the center, or other conditions; which of these matter is still an open question. When the FCAL was installed, no reference PMT was installed with it.
A gain calibration, or scaling factor, can be applied to each module: the amplitude measured by a module is scaled by a module-dependent gain factor, chosen to minimize the event-to-event variance of the sum over all modules. The gain calibration may be unique per tube and per experimental time period (usually a ~2-hour chunk of time known as a "run"). Using the gain correction factors, the HV of each module is adjusted. The gain correction is essential to obtaining optimal resolution.
We are attempting to find a model that solves for the PMT gain calibrations by run. The assumption is that the conditions that cause the gain to change are stable enough over an approximately 2-hour run.
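As a toy illustration of the gain calibration idea above, the sketch below simulates events that each deposit the same total energy across a few modules, distorts every module by a fixed multiplicative response, and compares the spread of the summed signal before and after applying per-module gains. Everything here is hypothetical (the module count, the response range, the name `summed_signal`), and the gains are taken as the known inverse of the simulated response, which a real calibration would have to fit from data.

```python
import numpy as np

rng = np.random.default_rng(1)
n_events, n_modules = 500, 50

# Each simulated event deposits a total energy of 1.0, split randomly over modules.
fractions = rng.dirichlet(np.ones(n_modules), size=n_events)

# Hypothetical fixed per-module response (stands in for PMT-to-PMT variation).
response = rng.uniform(0.7, 1.3, size=n_modules)
amplitudes = fractions * response  # what each module "measures"


def summed_signal(amplitudes, gains):
    """Scale each module's amplitude by its gain and sum over modules per event."""
    return (amplitudes * gains).sum(axis=1)


uncalibrated = summed_signal(amplitudes, np.ones(n_modules))
calibrated = summed_signal(amplitudes, 1.0 / response)

print("spread without gains:", uncalibrated.std())
print("spread with gains:   ", calibrated.std())
```

With the ideal gains the summed signal is the same for every event, so its event-to-event spread collapses; with all gains set to 1 the spread reflects the module-to-module response differences.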
Expected Outcome
FCAL Data
Location of data files: /work/epsci/roark/FCAL/cpp/
with EPICS/calibrations: cpp_epics_quad0_df.csv
/work/epsci/roark/FCAL/2019_primex
In case interested: CPP is the “charged pion polarizability” experiment. Data was taken in 2022. Dr. Jeske’s (Torri’s) notebook will explain many of the features/columns in the data. I believe there are 733 “runs”/observations in the CPP data.
https://halldweb.jlab.org/wiki/index.php/Charged_Pion_Polarizability_Experiment_in_Hall-D
HOWEVER… we found out today that we need the PrimeX experiment, not the CPP experiment, so that we can compare with the physicist who is using a physics data-intensive method.
Descriptions of the data can be found in: https://halldweb.jlab.org/DocDB/0027/002770/001/FCAL_Manual.pdf
[Equation diagram labels: ADC amplitude, gain, coupling, LED amplitude; shown for a single module, for all modules, and in vector notation]
What is ADC amplitude?
David Lawrence
i is 1:~700 or 2800
j is 1:30
Dimension labels: (700), (700), (30), (700)
Inverse form: elementwise inverses (superscript -1)
Train assuming known amplitudes for L and all gains=1 so that we get an α close to the actual coupling matrix.
Retrain with a small learning rate and allow the gains to vary, but keep them in a reasonable range (e.g. add a loss term like (g/1.5)^6).
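A minimal sketch of how such a range-limiting loss term could look. It assumes the ADC amplitude of module i is the gain g_i times the coupling-weighted sum of the LED amplitudes, which is a guess based on the slide labels (ADC amplitude, gain, coupling, LED amplitude) and not a confirmed formula. The function names, the mean-squared-error fit term, and the `weight` parameter are all hypothetical; only the scale 1.5 and the power 6 come from the note.

```python
import numpy as np


def predict_adc(gains, coupling, led_amplitudes):
    """Assumed forward model: A_i = g_i * sum_j alpha_ij * L_j."""
    return gains * (coupling @ led_amplitudes)


def gain_penalty(gains, scale=1.5, power=6):
    """Mean of (g/scale)^power: below 1 while |g| < scale, steep beyond it."""
    return np.mean((gains / scale) ** power)


def total_loss(gains, coupling, led_amplitudes, adc_measured, weight=1.0):
    """Mean squared fit error plus the weighted gain penalty."""
    residual = predict_adc(gains, coupling, led_amplitudes) - adc_measured
    return np.mean(residual ** 2) + weight * gain_penalty(gains)


# Toy data with the sizes quoted on the slides (~700 modules, 30 LEDs).
rng = np.random.default_rng(0)
n_modules, n_leds = 700, 30
coupling = rng.uniform(0.0, 1.0, size=(n_modules, n_leds))
led_amplitudes = rng.uniform(0.5, 1.5, size=n_leds)
true_gains = rng.uniform(0.8, 1.2, size=n_modules)
adc_measured = predict_adc(true_gains, coupling, led_amplitudes)

loss_at_truth = total_loss(true_gains, coupling, led_amplitudes, adc_measured)
loss_far_off = total_loss(3.0 * true_gains, coupling, led_amplitudes, adc_measured)
print(loss_at_truth, loss_far_off)
```

The even power penalizes gains far from zero in either direction and is nearly flat for gains near 1, so it acts as a soft bound and does not pull the gains toward a particular value.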
Idea: Train encoder on its own to begin.
Freeze network.
Then train inverse of g to get the back half.
01/26/23 IDEA: Load up keras/tensorflow.
first group: try out the autoencoder example “images”/matrices example A targets - take a look at weights for “gains”. Recreate one image for one quadrant. experience with an autoencoder.
second group: train Encoder side.
Is it really just a perceptron?
Is this an optimization problem?
David Lawrence
(unknown coupling connectivity)
L: 1-10 Violet, 11-20 Blue, 21-30 Green (see small and large V in speaker notes)
5 inputs of ~700 (2 for blue and violet, 1 for green)
Adapting the Projection-Recovery Network (PRNet): from ‘A Deep Learning Approach for Blind Drift Calibration of Sensor Networks’
For our FCAL implementation:
(2800 x 5)
or (60 x 60 x 5) for a CNN AutoEncoder
(?)
=
How to recover the gains from the multi input?
[Diagram labels: five blocks, each of shape (2800) or (60 x 60)]
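One way to get from the flat (2800 x 5) layout to the (60 x 60 x 5) layout for a CNN is to scatter each block's values onto its grid position and keep a mask of which cells hold a real block. The sketch below assumes each block has a known integer (row, column) position; the helper names and the toy circular layout are hypothetical and are not the real FCAL geometry.

```python
import numpy as np


def blocks_to_grid(values, rows, cols, grid_size=60):
    """Scatter per-block values (n_blocks, n_channels) onto a square grid.

    Returns the grid and a boolean mask marking cells that hold a block.
    """
    n_channels = values.shape[1]
    grid = np.zeros((grid_size, grid_size, n_channels), dtype=values.dtype)
    mask = np.zeros((grid_size, grid_size), dtype=bool)
    grid[rows, cols, :] = values
    mask[rows, cols] = True
    return grid, mask


def grid_to_blocks(grid, rows, cols):
    """Gather grid values back into the flat (n_blocks, n_channels) layout."""
    return grid[rows, cols, :]


# Toy layout: all cells within a circle on a 60 x 60 grid (not the real FCAL map).
yy, xx = np.mgrid[0:60, 0:60]
inside = np.hypot(yy - 29.5, xx - 29.5) <= 30.0
rows, cols = np.nonzero(inside)

rng = np.random.default_rng(2)
values = rng.normal(size=(rows.size, 5))  # 5 input channels per block

grid, mask = blocks_to_grid(values, rows, cols)
recovered = grid_to_blocks(grid, rows, cols)
print(grid.shape, int(mask.sum()), recovered.shape)
```

The mask lets a loss function ignore the empty corner cells, and `grid_to_blocks` maps a CNN output back to one value set per block.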
2:31
I think Igal cares about the color of the LED to mainly monitor for radiation damage (that was the initial point of this system). He only recently realized he could use the LEDs for calibration
2:32
there will be quadrant to quadrant differences because the glass was sanded by hand, by humans. we do not know if Igal truly needs 100s of evio files for calibration, so a first step would be to see if I could refit the amplitudes using one file (like the first file as we do for the CDC) and then just looking at the color during that time.
2:33
They have also not looked at the gain correction distribution per run period per block, which is great that you have already started to look at that.
2:34
they are also unsure the "physics" benefit (if you want to call it that) for stabilizing the gain, i.e. if we are off by 1, 5, 10% etc how does that affect the resolution or energy
2:34
which leads me to my final point, if we can improve the timing resolution, we automatically win. apparently that's a big deal in the world of calorimeters.
2:35
the cycling of the LEDs is all the same and the voltages change rarely (hasn't happened in the 2 years Malte has been working with the FCAL).
Example of inner ring radiation degradation (2018 run period)
Research Questions
Three Research Questions Have Been Proposed:
Reference Method
t = time == run by run is our time
r = radius, ring
I_beam = integrated beam current
j = ring number (different number of blocks per ring)
G_i: Gain for block i (primex_gains.csv) - function of time, function of beam intensity.
G_i = g_PMT_i(t) * g_RD(r_i, I_beam)
Q_j(t) = average(G_i in ring j)
…
JLab can provide us with:
sum over blocks of G(t) / n
sum over blocks of G(0) / n
TODO: For the Reference Method
David Asked For - hopefully as soon as we can, because we will need the values as inputs:
Both:
For all rings, j, average ring gain for the run over the average gain for reference run - plot over the run period (on x axis) - y-axis is the ratio.
We hope this is 1 for the "outside" rings.
We hope it is not 1 for the inner rings in later runs; there we are finding the g_RD.
This is similar to Cullan's plots
These are the input values seen in the bottom flow that we will need g_RD_j=1, j=(ring)
SAID ANOTHER WAY
step 1) define the rings (this is sort of arbitrary but keep it consistent)
step 2) average the gains per ring per run
step 3) divide step 2 by the average gain per ring from run 61321 (this is like our t_0 or “Reference”, if you will)
step 4) plot (and save the value) as a function of run number (aka "time")
We “hope” that for outer rings this should be one. For inner rings, it should not be one. This is the g_RD, or the contribution to G_i of the g_RD. We will use this as input to our model (the part inside the dotted line on slide 15).
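Steps 2 and 3 can be sketched with pandas as below. This assumes a long-format table with one row per block per run and a ring label already assigned in step 1; the column names `run`, `ring`, and `gain`, the function name `ring_gain_ratio`, and the toy numbers are hypothetical and may not match primex_gains.csv.

```python
import pandas as pd


def ring_gain_ratio(gains, reference_run=61321):
    """Per-ring average gain for each run, divided by the reference run's value."""
    # Step 2: average the gains per ring per run (rows: run, columns: ring).
    ring_avg = gains.groupby(["run", "ring"])["gain"].mean().unstack("ring")
    # Step 3: divide by the per-ring average from the reference run.
    return ring_avg.div(ring_avg.loc[reference_run], axis="columns")


# Toy data: two runs, four blocks, two rings (values are made up).
toy = pd.DataFrame({
    "run":   [61321, 61321, 61321, 61321, 61322, 61322, 61322, 61322],
    "block": [0, 1, 2, 3, 0, 1, 2, 3],
    "ring":  [1, 1, 2, 2, 1, 1, 2, 2],
    "gain":  [1.0, 1.2, 0.9, 1.1, 1.1, 1.32, 0.9, 1.1],
})

ratio = ring_gain_ratio(toy)
print(ratio)
```

For step 4, calling `ratio.plot()` (assuming matplotlib is installed) would draw one line per ring against run number, and `ratio.to_csv(...)` would save the values for use as model input.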
Generative Method
t = time == run by run is our time (a run is a “time”)
r = radius, ring
I_beam = integrated beam current
Uses r >= R as a non radiation damaged reference
G_i(t) = total Gain (primex_gains.csv) is some function of (r, LED, integrated beam, and “t”)
some convolution of g_i and g_RD_i
g_i = g_i(LED,t) - function of the LED pulses and time
g_RD(r, I_beam)
@ r >= R g_RD(r, I_beam) = 1
@ some radius/ring and beyond, the radiation-damage factor contributes just 1 to the total gain, i.e. there is no contribution. These blocks are past the inner ring area.
So, we could have two models:
MODEL ONE:
MODEL TWO:
And you can see how these could work together into one “flow” in the bottom flowchart.
Prep & Backup
Nov/Dec/Jan: Intro to JLab and Data
About the intro to JLAB data
useful definitions
calorimeter (in physics): detector that measures the energy of particles
photo-multiplier tube (PMT): super sensitive detector used to measure tiny amounts of light
Light emitting diode (LED): device that emits light when current flows through it