
Quantum Machine Learning Methods for Semiconductor Wafer Defect Classification

An exploration for PHYS 250 by Stephen Reagin


Introduction and Motivation

Quantum machine learning (QML) is an emerging field seeking to improve upon classical machine learning by leveraging the properties of quantum computers

A key motivation is that a quantum system of n qubits can process information in a Hilbert space of dimension 2^n, which grows exponentially with qubit count

Quantum feature maps encode classical data into this high-dimensional space, with the aim of capturing complex patterns through feature interactions that may be hard to compute classically


Quantum Machine Learning

Images generated via ChatGPT


Support Vector Machines and Kernel Tricks

Support Vector Machines (SVM): the decision boundary is the maximum-margin hyperplane, which is determined by the training points closest to it (the support vectors)

Figure 1: Kumar, Sanjay & Kumar, Nikhil & Dev, Aditya & Naorem, Siraz. (2022). Movie genre classification using binary relevance, label powerset, and machine learning classifiers. Multimedia Tools and Applications. 82. 1-24. 10.1007/s11042-022-13211-5.

The kernel trick evaluates inner products between data points in a higher-dimensional feature space without constructing that space explicitly, which allows an SVM to find nonlinear decision boundaries

Figure 2: Grace Zhang. What is the kernel trick? Why is it important? https://medium.com/@zxr.nju/what-is-the-kernel-trick-why-is-it-important-98a98db0961d
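The kernel trick can be checked numerically on a small case. The sketch below is an illustration only: it assumes a degree-2 homogeneous polynomial kernel on 2-D inputs (a choice made for this example, not one taken from the slides) and shows that the kernel evaluated in input space equals the inner product of the explicitly mapped feature vectors.

```python
import math

def dot(u, v):
    """Plain Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit degree-2 feature map for a 2-D input (illustrative choice)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(x, y):
    """Homogeneous polynomial kernel of degree 2: k(x, y) = (x . y)^2."""
    return dot(x, y) ** 2

x = (1.0, 2.0)
y = (3.0, -1.0)

explicit = dot(phi(x), phi(y))  # inner product computed in the 3-D feature space
implicit = poly_kernel(x, y)    # same quantity computed directly in the 2-D input space
print(explicit, implicit)
```

The two values agree up to floating-point rounding, yet the second never builds the feature vectors. That saving is what makes kernels practical when the feature space is very large.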


SVM Limitations and Quantum Kernels

Computing a classical kernel matrix over N samples requires O(N^2) kernel evaluations, which becomes a bottleneck for large datasets

The QML approach replaces classical kernels with a quantum kernel:

K(x_i, x_j) = |⟨ϕ(x_i)|ϕ(x_j)⟩|^2

where |ϕ(x)⟩ is the quantum state produced by a parametrized feature map circuit

Figure labels: encoding of inputs x into |ϕ(x)⟩; variational circuit

Figure: Kernel-based training of quantum models with scikit-learn

https://pennylane.ai/qml/demos/tutorial_kernel_based_training
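As a rough illustration of the quantum kernel formula, the sketch below simulates it with plain statevectors. It assumes a simple angle-encoding feature map in which feature k is loaded onto qubit k as RY(x_k)|0⟩. That feature map, the helper names, and the synthetic inputs are assumptions made for this example and are not necessarily what was used in the experiments.

```python
import math
import random

def feature_state(x):
    """Statevector of the assumed feature map: qubit k is prepared as RY(x[k])|0>."""
    state = [1.0]
    for angle in x:
        qubit = (math.cos(angle / 2), math.sin(angle / 2))
        # Tensor product with the next qubit; the state ends with 2^n amplitudes.
        state = [a * b for a in state for b in qubit]
    return state

def quantum_kernel(xi, xj):
    """K(xi, xj) = |<phi(xi)|phi(xj)>|^2, the squared overlap of two feature states."""
    overlap = sum(a * b for a, b in zip(feature_state(xi), feature_state(xj)))
    # Amplitudes are real for this feature map, so the square equals |overlap|^2.
    return overlap ** 2

def kernel_matrix(X):
    """Gram matrix over all sample pairs, using symmetry to halve the work."""
    m = len(X)
    K = [[1.0] * m for _ in range(m)]  # diagonal stays 1 because states are normalized
    for i in range(m):
        for j in range(i + 1, m):
            K[i][j] = K[j][i] = quantum_kernel(X[i], X[j])
    return K

random.seed(0)
# 5 synthetic samples with 3 features each, i.e. 3 qubits in this encoding.
X = [[random.uniform(0, math.pi) for _ in range(3)] for _ in range(5)]
K = kernel_matrix(X)
```

A Gram matrix built this way can be passed to a classical SVM that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. Note that every matrix entry requires simulating two circuits, so the cost grows with both the number of sample pairs and the 2^n statevector size.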


Task: Classifying Semiconductor Defects

  • Defects introduced during semiconductor wafer fabrication can reduce chip yield
  • Automated systems use AI/ML to analyze wafers and identify faulty die patterns
  • QML methods show promise on small synthetic datasets, but their performance on realistic industrial datasets remains largely unexplored

The task is to classify semiconductor wafer defects with a classical SVM and a quantum support vector classifier (QSVC), and to benchmark the quantum model against the classical baseline

Figure: representative examples of defect patterns from the MixedWM38 dataset, produced in Python


Dataset: MixedWM38 WaferMap

Classical and quantum machine learning models were trained on the MixedWM38 WaferMap dataset, a publicly available dataset of semiconductor wafer maps

This dataset contains roughly 38,000 wafer maps, each represented as a 52 × 52 grid whose pixel values indicate whether a die is absent, functional, or defective.

Many wafers exhibit spatial patterns of failure which can take several characteristic forms, including:

(1) center defects, (2) donut, (3) edge-localized, (4) edge-ring, (5) localized, (6) nearly full wafer failure, (7) scratches, and (8) random.

Each wafer is labeled using an 8-dimensional Boolean vector indicating which defect patterns are present.
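To make the label format concrete, the sketch below decodes an 8-element Boolean vector into defect names. The position-to-name mapping is an assumption that follows the order in which the defect types are listed above. The actual column order in the dataset files should be checked before relying on it.

```python
# Assumed label order, following the order in which the defect types are listed above.
DEFECT_NAMES = [
    "center", "donut", "edge-localized", "edge-ring",
    "localized", "near-full", "scratch", "random",
]

def decode_label(label):
    """Return the names of the defect patterns flagged in an 8-element Boolean vector."""
    if len(label) != len(DEFECT_NAMES):
        raise ValueError("expected an 8-element label vector")
    return [name for name, present in zip(DEFECT_NAMES, label) if present]

mixed = decode_label([1, 0, 0, 1, 0, 0, 1, 0])  # three patterns on the same wafer
normal = decode_label([0] * 8)                  # no defect pattern present
print(mixed, normal)
```

Because several entries can be set at once, this is a multi-label problem: a wafer may show a mixture of patterns rather than exactly one class.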

Figure: example wafer map, 52 × 52 pixels (green = functional die, yellow = defective die)


Data Processing - Reducing Dimensions

Each wafer image contains 52 × 52 = 2,704 pixels, so directly encoding the raw data into a quantum circuit is not feasible with current quantum hardware or classical simulators

Principal Component Analysis (PCA) was used to reduce the feature dimensionality to n components, where n is swept from 1 to 30 for the classical SVM and from 2 to 11 for the QSVC, allowing the effect of circuit width on model performance to be studied systematically


Figure: GeeksForGeeks, Dimensionality Reduction Techniques

https://www.geeksforgeeks.org/data-science/dimensionality-reduction-techniques/
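The dimensionality reduction step can be sketched as follows. This is a minimal example on synthetic data, assuming wafers are flattened to 2,704-element vectors before PCA. The array sizes, the choice of 8 components, and the `pca_reduce` helper are hypothetical, and PCA is computed here via NumPy's SVD rather than any particular library class.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project rows of X onto the top principal components (PCA via SVD)."""
    X_centered = X - X.mean(axis=0)                  # PCA requires mean-centered features
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]                   # principal directions, one per row
    explained = (S[:n_components] ** 2).sum() / (S ** 2).sum()
    return X_centered @ components.T, explained

rng = np.random.default_rng(0)
# Synthetic stand-in for wafer maps: 200 images of 52 x 52 pixels with values
# 0 (absent), 1 (functional), 2 (defective). Real data would be loaded instead.
wafers = rng.integers(0, 3, size=(200, 52, 52))

X = wafers.reshape(len(wafers), -1).astype(float)    # flatten to shape (200, 2704)
X_reduced, explained = pca_reduce(X, n_components=8) # shape (200, 8)
print(X_reduced.shape, round(float(explained), 3))
```

scikit-learn's `PCA(n_components=...)` with `fit_transform` offers the same projection. In a train/test workflow the components would normally be fitted on the training split only and then applied to the test split, so that no test information leaks into the features.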


Data Processing - Subsampling

Training the QSVC on the full dataset (38,000 images) is computationally intractable on classical simulators, so only a subsample was used to develop classical and quantum models

Stratified sampling ensures the subsample does not bias the ML models toward any particular defect pattern: the dataset is partitioned by combinations of defect labels, and 20% of samples are drawn uniformly at random from each partition

This preserves the relative frequency of all eight defect types in both the subsample and the subsequent train/test split

Figure generated via ChatGPT
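The stratified subsampling described above can be sketched in a few lines. The example assumes each label is an 8-element tuple and groups wafers by the full label combination. The function name, the synthetic labels, and the rule of keeping at least one sample per partition are assumptions made for this illustration.

```python
import random
from collections import defaultdict

def stratified_subsample(labels, fraction=0.2, seed=0):
    """Return sorted indices of a subsample drawn separately from each label combination."""
    rng = random.Random(seed)
    partitions = defaultdict(list)
    for index, label in enumerate(labels):
        partitions[tuple(label)].append(index)  # key is the full 8-entry label combination

    chosen = []
    for indices in partitions.values():
        # Assumption: keep at least one sample so that rare combinations are not lost.
        k = max(1, round(fraction * len(indices)))
        chosen.extend(rng.sample(indices, k))   # uniform draw without replacement
    return sorted(chosen)

# Synthetic labels: 100 wafers with pattern A, 50 with pattern B, 10 with both.
A = (1, 0, 0, 0, 0, 0, 0, 0)
B = (0, 0, 0, 0, 0, 0, 1, 0)
AB = (1, 0, 0, 0, 0, 0, 1, 0)
labels = [A] * 100 + [B] * 50 + [AB] * 10

subset = stratified_subsample(labels)
counts = defaultdict(int)
for i in subset:
    counts[labels[i]] += 1
print(len(subset), dict(counts))
```

Each partition contributes 20% of its members, so the 10:5:1 ratio of the synthetic groups carries over to the subsample. For the later train/test split, scikit-learn's `train_test_split` accepts a `stratify` argument that can be given the same combined label key.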


Models and Metrics

  • Two classifiers were trained: Classical SVC baseline and Quantum SVC
    • These were evaluated using metrics derived from the confusion matrix

Accuracy:

Of all predictions made, what fraction were correctly classified

Recall:

Of all wafers that were truly defective, what fraction did the model correctly identify

Precision:

Of all wafers the model predicted as defective, what fraction were actually defective

F1 score:

Harmonic mean of precision and recall, balancing both metrics into a single score

Figure: Statistics by Jim, What is a Confusion Matrix?

https://statisticsbyjim.com/glossary/confusion-matrix/
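The four metrics follow directly from the confusion matrix counts. The sketch below computes them for a single defect type treated as a binary label (1 = pattern present). The example predictions are made up, and returning 0.0 when a denominator is zero is a convention assumed here.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, false negatives, and true negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for one binary label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # assumed convention for 0/0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up example: 4 wafers truly show the pattern, 6 do not.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
scores = classification_metrics(y_true, y_pred)
print(scores)
```

With 3 true positives, 1 false positive, 1 false negative, and 5 true negatives, accuracy is 0.8 while precision, recall, and F1 are all 0.75. Accuracy can look better than the other three when negatives dominate, which is why it is reported alongside them.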


RESULTS



Results

Classical SVM

  • All scoring metrics improve as the number of PCA dimensions n increases
  • The largest gains occur between 2 and 5 dimensions; performance levels off around 10-15 dimensions
  • Total runtime: roughly 2 minutes

Quantum Kernel

  • Scoring metrics do not necessarily improve as the number of PCA dimensions n increases
  • Some models perform worse at higher PCA dimensions
  • Total runtime: 6 hours


Discussion and Future Directions

Classical SVM vs. QSVC

  • SVM improved monotonically with PCA dimensions, plateauing around n = 15
  • QSVC metrics fluctuated irregularly with n, showing no consistent learning signal
  • Classical SVM outperformed QSVC across most defect types and PCA configurations

Near-Term QML Limitations

  • Classical statevector simulation makes the exponential Hilbert space a computational liability rather than an asset
  • The path to practical quantum advantage for classification tasks remains unclear
  • No conclusions about real quantum hardware performance can be drawn from these results

Future Work

  • Execute models on real quantum hardware
  • Explore alternative feature maps
  • Investigate trainable quantum kernels


References

[1] C. Conti, Quantum Machine Learning: Thinking and Exploration in Neural Network Models for Quantum Science and Quantum Computing (Springer International Publishing, 2023).

[2] M. Kuhn and K. Johnson, Applied Predictive Modeling (Springer, 2013).

[3] G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning: with Applications in R, 2nd ed. (Springer, 2021).

[4] Y. Kim, J.-S. Lee, and J.-H. Lee, IEEE Trans. Semicond. Manuf. 36, 476 (2023).

[5] V. Havlíček, A. D. Córcoles, K. Temme, A. W. Harrow, A. Kandala, J. M. Chow, and J. M. Gambetta, Nature 567, 209 (2019).

[6] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, Nature 549, 195 (2017).

[7] J. Wang, C. Xu, Z. Yang, J. Zhang, and X. Li, IEEE Trans. Semicond. Manuf. 33, 587 (2020).

[8] Junliangwangdhu, WaferMap, GitHub repository, https://github.com/Junliangwangdhu/WaferMap (2026), accessed: 16 March 2026.

[9] Qiskit Community, QSVC — Qiskit machine learning documentation, https://qiskit-community.github.io/qiskit-machine-learning/stubs/qiskit_machine_learning.algorithms.QSVC.html (2024).

[10] Qiskit Community, VQC — Qiskit machine learning documentation, https://qiskit-community.github.io/qiskit-machine-learning/stubs/qiskit_machine_learning.algorithms.VQC.html (2024).