Expectation Maximization Methods for Metabolic Pathway Analysis
Dissertation Proposal
Fil Rondel
Committee: Alex Zelikovsky
Pavel Skums
Murray Patterson
Artem Rogovskyy
July 1, 2022
Part I - Introduction
Metabolism
Metabolic Pathways
Enzymes
More enzymes and substrates → faster metabolism
Metabolic Pathway Activity
Problem formulation
Challenges
Enzyme participation
Contributions
Publications
Part II - Microbial community analysis
Previous work
EMPathways
First and Second EM
EM for enzyme participation (3rd EM)
Third EM
RESULT: more accurate estimation of
Datasets
Sample data
Challenge - different EM converging points
Finding pairs
A
B
B - A
Exceptions in finding pairs
Results
Correlation of pathway activity levels with environmental parameters (old pipeline)
| Salinity | Temp | Oxygen | Chl | PAR | Density | MLR |
1. # of significantly correlated pathways | 8 | 14 | 5 | 8 | 1 | 4 | 10 |
2. 95% randomized CI | 1-10 | 1-11 | 1-8 | 0-7 | 0-6 | 1-8 | 0-8 |
3. The most correlated pathway | ec00364 | ec00310 | ec00281 | ec00281 | ec00740 | ec00623 | ec00623 |
Correlation of pathway activity levels with environmental parameters (new pipeline)
| Salinity | Temp | Oxygen | Chl | PAR | Density | MLR |
1. # of significantly correlated pathways | 31 | 22 | 19 | 18 | 14 | 30 | 22 |
2. 95% randomized CI | 1-8 | 0-8 | 0-6 | 0-6 | 0-6 | 1-8 | 0-7 |
3. The most correlated pathway | ec00071 | ec00195 | ec00622 | ec00460 | ec00360 | ec00071 | ec00626 |
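The "95% randomized CI" rows can be read as a baseline: how many pathways would appear correlated with a parameter if the parameter values were shuffled across samples. The slides do not state how this interval was computed, so the following is only a minimal sketch of one common approach (a permutation baseline using Pearson correlation). The function names, the correlation threshold of 0.6, and the demo data are all hypothetical and are not taken from the study.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def count_correlated(activity, env, threshold):
    """Count pathways whose activity has |r| >= threshold with the parameter."""
    return sum(1 for levels in activity.values()
               if abs(pearson(levels, env)) >= threshold)

def randomized_interval(activity, env, threshold, n_perm=1000, seed=0):
    """Empirical 2.5th and 97.5th percentiles of the count under shuffled env values."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_perm):
        shuffled = env[:]
        rng.shuffle(shuffled)  # break any real pathway/parameter association
        counts.append(count_correlated(activity, shuffled, threshold))
    counts.sort()
    low = counts[int(0.025 * n_perm)]
    high = counts[min(n_perm - 1, int(0.975 * n_perm))]
    return low, high

# Synthetic demo data (assumption, not from the study): 11 samples, 20 pathways.
rng = random.Random(1)
env = [float(i) for i in range(11)]
activity = {f"pathway_{k}": [rng.random() for _ in env] for k in range(20)}
activity["pathway_0"] = [2.0 * e + 1.0 for e in env]  # one pathway that tracks env exactly

observed = count_correlated(activity, env, threshold=0.6)
low, high = randomized_interval(activity, env, threshold=0.6)
print(observed, (low, high))
```

Under this reading, an observed count well above the upper end of the randomized interval (as in most columns of the new-pipeline table) suggests that the correlations are unlikely to be due to chance alone.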
Enzyme-in-Pathway Participation
ec00020 | D1: 12 | D1: 16 | D1: 20 | D2: 00 | D2: 04 | D2: 08 | D2: 12 | D2: 16 | D3: 00 | D3: 04 | D3: 12 | AVE | STD |
EC:1.2.4.1 | 12.82 | 21.68 | 20.64 | 33.71 | 35.76 | 30.38 | 21.78 | 23.71 | 32.4 | 28.07 | 21.98 | 25.72 | 6.6 |
EC:1.2.7.1 | 0.51 | 6.18 | 15.43 | 6.69 | 4.97 | 9.32 | 13.14 | 9.61 | 7.87 | 12.95 | 2.54 | 8.11 | 4.37 |
EC:1.2.7.3 | 13.99 | 21.46 | 20.32 | 26.74 | 28.96 | 24.87 | 21.26 | 22.22 | 27.08 | 24.44 | 26.7 | 23.46 | 4.02 |
EC:1.8.1.4 | 7.61 | 12.92 | 11.24 | 16.94 | 16.65 | 14.39 | 12.93 | 16.92 | 19.16 | 14.03 | 22.16 | 15 | 3.78 |
EC:2.3.1.12 | 12.82 | 21.68 | 20.64 | 33.71 | 35.76 | 30.38 | 21.78 | 23.71 | 32.4 | 28.07 | 21.98 | 25.72 | 6.6 |
EC:4.1.1.32 | 12.82 | 21.68 | 20.64 | 33.71 | 35.76 | 30.38 | 21.78 | 23.71 | 32.4 | 28.07 | 21.98 | 25.72 | 6.6 |
EC:4.1.1.49 | 14.78 | 23.66 | 23.38 | 32.19 | 36.13 | 37.34 | 26.62 | 28.41 | 35.9 | 33.66 | 25.61 | 28.88 | 6.6 |
EC:1.1.1.37 | 18.14 | 19.76 | 26.62 | 17.9 | 18.93 | 30.78 | 20.27 | 20.43 | 22.97 | 22.13 | 44.21 | 23.83 | 7.43 |
EC:1.1.1.41 | 72.88 | 72.85 | 70.78 | 71.2 | 68.42 | 38.66 | 45.68 | 60.11 | 62.77 | 61.29 | 27.09 | 59.25 | 14.74 |
EC:1.1.1.42 | 19.96 | 24.06 | 22.58 | 21.52 | 23.68 | 19.95 | 22.48 | 22.32 | 22.95 | 21.92 | 42.38 | 23.98 | 5.95 |
EC:1.1.5.4 | 0 | 0 | 0 | 29.35 | 0 | 0 | 0 | 20.53 | 0 | 0 | 0 | 24.94 | 4.41 |
EC:1.2.4.2 | 10.1 | 13.02 | 10.76 | 11.91 | 10.91 | 11.72 | 12.75 | 14.08 | 14.74 | 10.13 | 25.75 | 13.26 | 4.21 |
EC:1.3.5.1 | 21.35 | 27.74 | 28.74 | 34.65 | 39.51 | 30.74 | 29.4 | 29.56 | 36.38 | 33.32 | 46.73 | 32.56 | 6.43 |
EC:2.3.1.61 | 10.1 | 13.02 | 10.76 | 11.91 | 10.91 | 11.72 | 12.75 | 14.08 | 14.74 | 10.13 | 25.75 | 13.26 | 4.21 |
EC:2.3.3.1 | 86.31 | 41.26 | 66.16 | 28.14 | 39.2 | 260.4 | 209 | 93.27 | 70.39 | 107.9 | 96.4 | 99.85 | 68.92 |
EC:2.3.3.8 | 19.96 | 24.06 | 22.58 | 21.52 | 23.68 | 19.95 | 22.48 | 22.32 | 22.95 | 21.92 | 42.38 | 23.98 | 5.95 |
EC:4.2.1.2 | 14.54 | 18.81 | 19.68 | 23.77 | 28 | 20.3 | 19.67 | 20.16 | 24.74 | 22.7 | 32.79 | 22.29 | 4.72 |
EC:4.2.1.3 | 33.31 | 29.83 | 34.13 | 23.43 | 28.96 | 41.1 | 44.43 | 37.46 | 35.39 | 38.11 | 69.02 | 37.74 | 11.35 |
EC:6.2.1.4 | 19.96 | 24.06 | 22.58 | 21.52 | 23.68 | 19.95 | 22.48 | 22.32 | 22.95 | 21.92 | 42.38 | 23.98 | 5.95 |
EC:6.4.1.1 | 14.54 | 18.81 | 19.68 | 23.77 | 28 | 20.3 | 19.67 | 20.16 | 24.74 | 22.7 | 32.79 | 22.29 | 4.72 |
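The AVE and STD columns in the table above appear consistent with a mean and a population standard deviation taken over the non-zero time points only (see, for example, row EC:1.1.5.4, where only two samples are non-zero). This reading is inferred from the tabulated numbers rather than stated on the slides. A minimal sketch that reproduces two rows under that assumption, with `summarize` as a hypothetical helper name:

```python
from statistics import mean, pstdev

def summarize(row):
    """Return (AVE, STD) over the non-zero entries of one table row.

    Assumption: zeros mean "enzyme not observed" and are excluded,
    and STD is the population standard deviation.
    """
    observed = [v for v in row if v != 0]
    return round(mean(observed), 2), round(pstdev(observed), 2)

# Two rows copied from the ec00020 table above.
ec_1_2_7_1 = [0.51, 6.18, 15.43, 6.69, 4.97, 9.32, 13.14, 9.61, 7.87, 12.95, 2.54]
ec_1_1_5_4 = [0, 0, 0, 29.35, 0, 0, 0, 20.53, 0, 0, 0]

print(summarize(ec_1_2_7_1))  # (8.11, 4.37), matching the table
print(summarize(ec_1_1_5_4))  # (24.94, 4.41), matching the table
```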
Part III - Rodent community analysis
Initial pipeline
EMPathways pipeline
Data Sets
Enzyme coefficients
ec00062 | WS_1 | WS_2 | WS_3 | Average | StDev | WH_1 | WH_2 | WH_3 | Average | StDev |
EC:1.1.1.211 | 0.08483 | 0.08255 | 0.08154 | 0.08297 | 0.00169 | 0.07992 | 0.07949 | 0.07321 | 0.07754 | 0.00376 |
EC:4.2.1.134 | 0.09528 | 0.09273 | 0.09386 | 0.09396 | 0.00128 | 0.09587 | 0.08763 | 0.08563 | 0.08971 | 0.00543 |
EC:1.1.1.35 | 0.03545 | 0.03325 | 0.03408 | 0.03426 | 0.00111 | 0.03396 | 0.03003 | 0.02923 | 0.03107 | 0.00253 |
EC:1.3.1.93 | 0.09528 | 0.09273 | 0.09386 | 0.09396 | 0.00128 | 0.09587 | 0.08763 | 0.08563 | 0.08971 | 0.00543 |
EC:2.3.1.16 | 0.07277 | 0.06689 | 0.06721 | 0.06896 | 0.00331 | 0.06804 | 0.06149 | 0.06175 | 0.06376 | 0.00371 |
EC:2.3.1.199 | 0.09528 | 0.09273 | 0.09386 | 0.09396 | 0.00128 | 0.09587 | 0.08763 | 0.08563 | 0.08971 | 0.00543 |
EC:3.1.2.22 | 0.11754 | 0.12052 | 0.11921 | 0.11909 | 0.00149 | 0.11801 | 0.12478 | 0.12822 | 0.12367 | 0.00519 |
EC:1.3.1.38 | 0.11754 | 0.12052 | 0.11921 | 0.11909 | 0.00149 | 0.11801 | 0.12478 | 0.12822 | 0.12367 | 0.00519 |
ec00062 | LS_1 | LS_2 | LS_3 | Average | StDev | LH_1 | LH_2 | LH_3 | Average | StDev |
EC:1.1.1.211 | 0.07611 | 0.07564 | 0.07962 | 0.07712 | 0.00217 | 0.07214 | 0.07137 | 0.06937 | 0.07096 | 0.00143 |
EC:4.2.1.134 | 0.08968 | 0.09392 | 0.08879 | 0.09080 | 0.00274 | 0.08621 | 0.08510 | 0.08544 | 0.08558 | 0.00057 |
EC:1.1.1.35 | 0.02749 | 0.02697 | 0.02646 | 0.02697 | 0.00051 | 0.02663 | 0.02661 | 0.02736 | 0.02686 | 0.00043 |
EC:1.3.1.93 | 0.08968 | 0.09392 | 0.08879 | 0.09080 | 0.00274 | 0.08621 | 0.08510 | 0.08544 | 0.08558 | 0.00057 |
EC:2.3.1.16 | 0.06534 | 0.06643 | 0.06407 | 0.06528 | 0.00118 | 0.06752 | 0.06742 | 0.06776 | 0.06757 | 0.00018 |
EC:2.3.1.199 | 0.08968 | 0.09392 | 0.08879 | 0.09080 | 0.00274 | 0.08621 | 0.08510 | 0.08544 | 0.08558 | 0.00057 |
EC:3.1.2.22 | 0.12275 | 0.12035 | 0.12346 | 0.12219 | 0.00163 | 0.12638 | 0.12688 | 0.12736 | 0.12687 | 0.00049 |
EC:1.3.1.38 | 0.12275 | 0.12035 | 0.12346 | 0.12219 | 0.00163 | 0.12638 | 0.12688 | 0.12736 | 0.12687 | 0.00049 |
WS / WH - wild sick / wild healthy
LS / LH - lab sick / lab healthy
Results
Differentially expressed pathway ec00053
Ascorbate and aldarate metabolism
| Healthy | Healthy | Healthy | Average | STDev | Sick | Sick | Sick | Average | STDev | Average | STDev |
Wild ec00053 | 11.322 | 11.946 | 12.896 | 12.055 | 0.793 | 33.636 | 12.688 | 11.848 | 19.391 | 12.344 | 15.723 | 8.168 |
Lab ec00053 | 45.066 | 44.674 | 34.24 | 41.327 | 6.140 | 30.887 | 34.548 | 22.335 | 29.257 | 6.268 | 35.292 | 0.090 |
For ec00053:
Differentially expressed pathway ec00280
Valine, leucine and isoleucine degradation
| Healthy | Healthy | Healthy | Average | STDev | Sick | Sick | Sick | Average | STDev | Average | STDev |
Wild ec00280 | 176.159 | 180.124 | 171.589 | 175.957 | 4.271 | 176.788 | 177.81 | 179.145 | 177.914 | 1.182 | 176.936 | 2.184 |
Lab ec00280 | 136.395 | 135.017 | 135.33 | 135.581 | 0.722 | 163.634 | 163.142 | 167.456 | 164.744 | 2.362 | 150.162 | 1.159 |
For ec00280:
Differentially expressed pathway ec00500
Starch and sucrose metabolism
| Healthy | Healthy | Healthy | Average | STDev | Sick | Sick | Sick | Average | STDev | Average | STDev |
Wild ec00500 | 157.897 | 159.69 | 160.486 | 159.358 | 1.326 | 114.175 | 153.072 | 155.825 | 141.024 | 23.293 | 150.191 | 15.533 |
Lab ec00500 | 145.837 | 129.389 | 138.745 | 137.990 | 8.250 | 89.675 | 133.198 | 87.802 | 103.558 | 25.686 | 120.774 | 12.329 |
For ec00500:
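The healthy versus sick comparisons in the pathway tables above can be illustrated with a simple two-sample statistic. The slides do not say which test was used to call a pathway differentially expressed, so the following is only a sketch using Welch's t statistic as a stand-in, not the method of this work. The values are the ec00280 replicates copied from the table, and `welch_t` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

# ec00280 activity levels, three replicates per group (from the table above).
lab_healthy = [136.395, 135.017, 135.33]
lab_sick = [163.634, 163.142, 167.456]
wild_healthy = [176.159, 180.124, 171.589]
wild_sick = [176.788, 177.81, 179.145]

print("lab  t =", round(welch_t(lab_healthy, lab_sick), 2))
print("wild t =", round(welch_t(wild_healthy, wild_sick), 2))
```

With only three replicates per group, such a statistic is a rough indicator at best. On these numbers it separates the lab groups strongly while showing little difference between the wild groups.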
Part IV - Conclusions
Conclusions
Future work
Acknowledgements
CS Department
School of Medicine
CS Department
Department of Microbiology
College of Veterinary Medicine and Biomedical Sciences