X International Conference “Information Technology and Implementation” (IT&I-2023), Kyiv, Ukraine
Parallel and Distributed Machine Learning for Anomaly Detection Systems
Bohdan Koval, Iulia Khlevna
Information Technology and Implementation, November 20, 2023, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
The purpose of this research
Parallel and distributed computing
Parallel Computing | Distributed Computing |
Concurrency is achieved through simultaneous execution of numerous operations | System components are geographically distributed across distinct locations |
A solitary computing unit is sufficient to execute the tasks | Utilizes a network of multiple distinct computing units |
Concurrent operations are performed by multiple processors within a single system | Concurrent operations are distributed across multiple discrete computing systems |
May encompass shared or distributed memory resources | Solely employs distributed memory resources |
Inter-processor communication typically occurs via a shared memory bus | Communication between computing units relies on message passing protocols |
Enhances the overall performance of a system | Enhances system scalability, fault tolerance, and resource sharing capabilities |
Two primary types of parallelism:
Amdahl's law
$$S_{\text{latency}}(s) = \frac{1}{(1 - p) + \frac{p}{s}}$$
where $S_{\text{latency}}$ is the potential speedup of the entire task, $s$ is the speedup of the part of the task that can be done in parallel, and $p$ is the proportion of the total task time that the parallelizable part occupied before parallelization.
Since $S_{\text{latency}} < 1/(1 - p)$ for any finite $s$, even a small portion of the program that cannot be parallelized restricts the maximum speedup achievable through parallelization.
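The bound can be checked numerically. The following is a minimal sketch that evaluates Amdahl's law in its usual form, S = 1 / ((1 - p) + p / s), for a few values of s. The function names `amdahl_speedup` and `max_speedup`, and the value p = 0.9, are illustrative assumptions and do not come from the slides.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the runtime is sped up by a factor s."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if s <= 0.0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


def max_speedup(p: float) -> float:
    """Upper bound 1 / (1 - p), approached as s grows without limit."""
    return float("inf") if p == 1.0 else 1.0 / (1.0 - p)


if __name__ == "__main__":
    p = 0.9  # assumed: 90% of the runtime can be parallelized
    for s in (2, 4, 8, 64, 1024):
        print(f"s = {s:5d}: overall speedup = {amdahl_speedup(p, s):.3f}")
    print(f"upper bound for p = {p}: {max_speedup(p):.3f}")
```

With p = 0.9 the overall speedup never reaches 10, however large s becomes, because the remaining 10% of the work stays sequential.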
Data processing performance comparison
Processing approach | Execution time |
Regular (sequential) processing | 19.341 seconds |
Pool parallel processing | 9.120 seconds |
Pool parallel processing threads | 19.125 seconds |
Joblib parallel processing | 16.912 seconds |
Joblib parallel processing threads | 19.213 seconds |
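A comparison of this kind could be set up roughly as follows. This is a sketch under stated assumptions, not the authors' benchmark: `cpu_bound_task` is a hypothetical stand-in for the real data processing step, the workload sizes and the worker count are arbitrary, and Joblib is left out so that the sketch needs only the standard library.

```python
import time
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool


def cpu_bound_task(n: int) -> int:
    """Hypothetical stand-in for one unit of CPU-bound data processing."""
    return sum(i * i for i in range(n))


def run_sequential(items):
    """Process every item one after another in the calling thread."""
    return [cpu_bound_task(n) for n in items]


def run_with_threads(items, workers=4):
    """Process items with a pool of threads inside one process."""
    with ThreadPool(workers) as pool:
        return pool.map(cpu_bound_task, items)


def run_with_processes(items, workers=4):
    """Process items with a pool of separate worker processes."""
    with Pool(workers) as pool:
        return pool.map(cpu_bound_task, items)


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call of fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    items = [200_000] * 16  # assumed workload, not taken from the slides
    for label, fn in [
        ("Regular (sequential) processing", run_sequential),
        ("Pool parallel processing", run_with_processes),
        ("Pool parallel processing threads", run_with_threads),
    ]:
        _, elapsed = timed(fn, items)
        print(f"{label}: {elapsed:.3f} seconds")
```

A likely explanation for thread-based timings that stay close to the sequential one on CPU-bound pure-Python work is CPython's global interpreter lock, which lets only one thread execute Python bytecode at a time. Worker processes avoid that limit at the cost of starting processes and transferring data between them.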
Parallel feature engineering comparison
Feature engineering processing approach | Execution time |
Grouping features (sequential) | 1.721 seconds |
Grouping features shifted (sequential) | 1.981 seconds |
Pool grouping features parallel | 2.820 seconds |
Pool grouping features parallel threads | 1.101 seconds |
Pool grouping features shifted parallel | 2.521 seconds |
Pool grouping features shifted parallel threads | 1.226 seconds |
Joblib grouping features parallel threads | 1.009 seconds |
Joblib grouping features shifted parallel threads | 1.182 seconds |
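The slides do not define the two feature families, so the sketch below rests on an assumption: a "grouping feature" is taken to be a per-group aggregate attached to every row, and the "shifted" variant is taken to use only the earlier rows of the same group. The sample `rows`, the function names, and the use of `ThreadPoolExecutor` are all illustrative, not the authors' implementation.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical records in time order: (group key, value).
rows = [("a", 10.0), ("b", 5.0), ("a", 30.0), ("b", 7.0), ("a", 20.0)]


def group_mean(rows):
    """Grouping feature (assumed): each row gets the mean of its whole group."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [running sum, count]
    for key, value in rows:
        totals[key][0] += value
        totals[key][1] += 1
    return [totals[key][0] / totals[key][1] for key, _ in rows]


def group_mean_shifted(rows):
    """Shifted grouping feature (assumed): mean of earlier rows in the same group.

    The first row of each group has no history and gets None.
    """
    totals = defaultdict(lambda: [0.0, 0])
    features = []
    for key, value in rows:
        total, count = totals[key]
        features.append(total / count if count else None)
        totals[key][0] += value
        totals[key][1] += 1
    return features


def build_features_threaded(rows, feature_fns, workers=2):
    """Compute each named feature in its own thread over the shared rows."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        futures = {name: executor.submit(fn, rows) for name, fn in feature_fns.items()}
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    features = build_features_threaded(
        rows, {"mean": group_mean, "mean_shifted": group_mean_shifted}
    )
    for name, values in features.items():
        print(name, values)
```

Threads read the input directly from shared memory, whereas worker processes need the data serialized and copied to them. That overhead is one plausible reason why process pools can be slower than sequential code on workloads that take only a second or two.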
Distributed machine learning
Distributed training becomes a necessity under the following circumstances:
Distributed machine learning data processing involves:
Conclusion