AllenNikka-Capstone-405027870

Caloric intake is an extremely important, albeit hard to collect, piece of health and wellness information critical in the wellbeing of many at risk populations, like the elderly and the very young, as well as playing a large role in the wellbeing of individuals across the age spectrum. In this work, towards to goal of automatically determining caloric intake based on passive bio-signals, I describe a system by which the state of a human’s heart rate and blood oxygenation level can be clustered into one of three states: pre-eating, eating, and post-eating. Therein, I describe a system for selecting appropriate hardware to select data, extracting data from hardware, preprocessing data it to extract relevant frequency domain features, and using a k-means clustering to achieve the desired separation of signal types. The results of this work are promising, showing a strong fit of the described labels to the acquired data, in spite of the limited set of data acquired to date. Finally, future steps of the research to bridge this current work with the end goal of automatic caloric intake detection are discussed.

Introduction:

Caloric intake (i.e. consuming drinks or food) is a critical piece of information for a broad variety of domains, including: health and wellness, dieting, peak athletic performance, lifestyle coaching, and beyond. Another area of importance includes clinical applications of caloric intake, including malnutrition, overnutrition, and the like. Lastly, caloric intake reporting is a critical problem in the care of certain populations and can be a leading cause of death. In newborns, undernutrition is associated with 45% of child deaths worldwide^[1]. It is also implicated in the survival rates of critically ill patients^[2], as well as the health of elderly individuals^[3].

Given the clear importance of the problem, it is then worth examining current solutions and the limitations therein. At the moment, the populations previously mentioned do not have any readily available, non-intrusive, low-effort method to record caloric intake. Current workarounds include asking users (or patients) to record anything they consume, often in detail, using a food journal of some kind (including a phone app, pen & paper, or otherwise). It should be clear that these methods are cumbersome, time-consuming, and easily forgotten (which severely impacts the effectiveness of these journaling methods), often leading to underreporting and poor adherence. Moreover, several of the populations mentioned clearly have limited capability in reliably reporting these figures themselves (i.e. newborns, the elderly).

There have been some developments in this field, namely the WearSens necklace^[4] developed at UCLA. This device achieves many of the goals above, but also introduces several tradeoffs such as a visible, aesthetically unappealing neckband, and the need to wear a necklace. These tradeoffs, while immaterial for some populations, like the elderly, are cause for great concern or complete lack of use with others such as adolescents, or adults.

Going a step beyond what was achieved by WearSens, but inspired by their technology, this capstone looks to acquire the same extremely powerful data on caloric intake but using a novel methodology even less intrusive than the WearSens necklace. Here, I propose using pulse oximetry, respiration rate, and pulse data readily available from smart watches and fitness trackers that are currently ubiquitous in the marketplace (see: FitBit, AppleWatch, etc.) to acquire the necessary data to furnish this caloric intake data. As a further, simplifying assumption, this project aims to classify the wearer’s behaviors into “Eating”, “Digesting”, or “Not eating or digesting” categories, rather than directly report actual calories consumed. Rather, this work would serve as a basis for future work to make that second requisite step to the envisioned device.

Methods:

This work can be broken down into the following key steps, and then subsequently further explored:

Selecting Data Acquisition Hardware
Finding, and modifying, open source programs to populate data in requisite format from hardware
Generating Training Data (Labeled and Unlabeled)
Implementing custom pre-processing functions and pipeline for data
Evaluating unsupervised learning models appropriate for the learning task
Tuning pre-processing parameters and model hyperparameters
Creating evaluation criteria, and subsequently measuring correctness of outputs based on held out set of labeled data

Selecting Data Acquisition hardware:

Several criteria had to be met for the data-acquisition hardware, which would be used to collect blood oxygenation (SpO2) and heart rate (Beats Per Minute, or BPM). Several options were considered, but mainstream wrist worn trackers were discarded due to the difficulty of getting granular data, and the Contec CMS50F was selected instead, pictured below:

This system was selected because of it’s low price, ease of use, and capability of offloading data into CSV files that could subsequently be easily manipulated.

CMS50F Data Parsing using OSCAR:

OSCAR^[5], an Open Source CPAP Analysis Reporter A fork of sleepyhead/sleepyhead-code, was selected for the task of off-loading data from the CMS50F. It was selected primarily because it already was designed to work with the CMS50F, as well as several other types of pulse oximiters, to collect pulse oximetry data for use in CPAP data analysis, primarily in individuals with sleeping disorders. The other reason the software was selected was because it is open source, meaning that I was able to (and did) make several modifications to the base program to better suit the use case in question.

A distinct limitation of the out-of-the-box software was that the pulse oximetry data, while viewable without CPAP data, could not be exported to a csv on its own. Moreover, the data could only be exported as summaries over time intervals, whereas I required as much of the raw data as possible from the sensor. To solve this problem, I quickly learned the QT build system for GUI based C++ applications that OSCAR is built on, and modified the source code^[6] to collect the raw BPM and SpO2 readings from the device, and export them to a CSV file.

Generating Training Data:

Another distinct challenge for the project was generating training data in a consistent matter to derive meaningful end results. Because of constraints on time and resources as a one-person team, I used myself and my own meal consumption as a means for data generation and labeling. The methodology was as follows:

All meals recorded were lunch at around 12:00pm

Given the very small number of data I could produce by myself, I decided to try to make the data fairly consistent to generate better results, hence using only lunches

Several (5-15, variable) minutes before the meal, I would sync a stopwatch (on a smart-phone or otherwise) with the initiation of a recording on the pulse oximeter
Once eating commenced, a ‘lap’ would be marked on the stopwatch

When this period of eating subsequently finished, ‘lap’ would be marked again, clearly delineating the eating period
This would be repeated if there were long breaks between eating periods, or dessert after the main portion of the lunch, for instance

After completion, all data was recorded in a common format

As this process was fairly labor intensive, not all meals recorded with the pulse oximeter were captured in this way. The latter data files were used as unlabeled training data in the unsupervised learning task.

Data Pre-Processing and Feature Engineering:

The first step in processing the data was converting the raw csv file into a pandas dataframe that consisted of timestamps in milliseconds, and corresponding values of SpO2 and BPM. Once these tables were created, they were visualized to ensure the data was correct. An example plot titled SpO2 and BPM vs Time, which has BPM and SpO2 ranges within 65-100 and 95-100 respectively, well within expected ranges for these values.

Next, the data had to be pre-processed once again as the features to be examined were in the frequency domain, rather than the time domain, as the data currently exists. In order to perform a frequency domain analysis of the data, the data would need to be indexed in a manner consistent with a fixed sampling frequency, i.e. each datapoint was sampled 1/n seconds after the previous, where n is some integer. A limitation of the hardware was that there was not a fixed sample rate, i.e. the samples were inconsistent in their time spacing. However, the samples were time stamped, so this problem was solved by indexing each datapoint based on the time it was recorded, and interpolating between adjacent data-points to fill missing data and allow the Fast Fourier Transform of the waveform to be computed.

A second problem introduced was that the extremely sharp transitions created a lot of spectral noise, i.e. noise visible in the frequency domain. In order to reduce this spectral noise, the resultant interpolated data vector was passed through a gaussian filter kernel with , i.e. variance of 1. The results of this preprocessing are visible in the graph “Time based indexing, Imputation, and Smoothing of SpO2 Data” included.

Once this step was completed, the fast fourier transform (FFT) of the waveform was computed and examined for correctness. The transform was computed on chunks of the waveform, which I will refer to as FFT-kernels, rather than a sliding window over the entire wave. This is because the prior approach is much more computationally efficient, which better suits the end goal of this approach, which is being able to run on wearable hardware such as smartwatches and the like. This approach also produced good results, which are pictured in the graphs “Magnitudes Squared vs. Frequency” and “Phase angle vs Frequency”, which demonstrate those measures respectively over a 1024 datapoint FFT computation over the waveform.

Evaluating Appropriate Unsupervised Learning Models:

Given the goal of clustering data points together based on the type of calorie consumption data being acquired, the two most likely model candidates were k-means and gaussian mixture models, due to their ability to produce cluster elements on unlabeled data. Therefore, both these models were examined, but as was previously stated, due to the vastly superior runtime efficiency of k-means, and the slightly better results it produced, it was selected as the clustering algorithm of choice.

Tuning Hyper-Paramaters:

The model had several hyper parameters that needed to be tuned, those being: number of clusters, size of FFT-kernel, and the coefficient of the gaussian smoothing filter. Doing a grid search over the parameters would be exceptionally difficult, as the result required some qualitative interpretation, so parameters were tuned as follows:

was selected to maintain as much data as possible in the original time-series data while simultaneously removing spectral noise
After examining several FFT-kernel sizes, a size of 64 was selected. Powers of two increase the computational efficiency of the FFT significantly, and combined with the fact that this value produced the best qualitative results, it also makes results highly interpretable, as 64 ~ 60, the number of seconds in a minute.

Selecting the number of clusters (K) was done using the elbow-method, by computing the With-in-Sum-of-Squares (WSS). WSS is the total distance of data points from their respective cluster centroids, and the ‘elbow’ of this graph shows us where increases in accuracy start to be marginal, indicating overfitting rather that fitting the underlying data. The results of this analysis can be seen in the graphs “Inertia (WSS) vs # cluster centers” and “Gradient of inertia at each cluster number”, the latter of which shows the derivative of the graph of the prior at each given point. As is clear from the two graphs, 3 clusters seems to be the elbow of this data, supporting the hypothesis that the data fits best into three distinct categories.

Results:

Given the final model and input data, the last step in the pipeline was to create evaluation criteria, and subsequently evaluate, to determine how well the model was able to perform. The semi-supervised moniker also stems from this portion of the procedure, as we have some (unreliably) labeled data to which we can refer, however, measuring against the labeled data quantitatively could lead to undesirable results, as the labels were not created using a particularly rigorous and/or correct procedure. Please refer to the Methods: Generating Training Data for an explanation of the data acquisition and labeling methods employed. Data from periods of eating were labeled ‘eating’, and otherwise labeled ‘break’, for example:

Date	Activity	Start Minute	End Minute
11/19/2019	break	0	10
11/19/2019	eating	10	20
11/19/2019	break	20	35
11/19/2019	eating	35	36
11/19/2019	break	36	47

Creating Evaluation Criteria:

To create labels to measure model predictions against, the labeled data recorded in part 4 of Methods: Generating Training Data was processed as follows:

First break interval is labeled ‘pre-eating’
First eating interval is labeled ‘eating’
Subsequent break intervals labeled ‘post-eating’
Return to (2) for eating intervals, or (3) for break intervals
Repeat for all timer intervals

Evaluating Performance:

Given these evaluation criteria, certain day’s data was held out from the training set and used to measure performance. Plotting the predictions made on data against the labels outlined above produced the following graph:

11/19/2019 Data:

As can be seen, the predictions are clearly correlated with the observed eating behavior envelope, indicating that the state change reflected by the cluster selection is indicated by the real life eating behavior of the subject. Due to the fact that limited labeled data was collected, and that a large part of the labels collected were inconsistent due to erroneous timings, further graphs couldn’t be produced.

Limitations:

Data Collection:

The work enclosed faced several key limiting factors, the chief among them being the lack of a large publicly available data-set to work with for the problem at hand, and the difficulties in creating good labeled data from which performance could be assessed. The primary issue in creating labeled data was that the time recorded by the pulse oximeter’s internal clock did not seem to be in perfect accord with the timer used in data labeling, resulting in large amounts of labeled data that was not usable. For example, the following data from 10/26/2019 demonstrates this problem:

Here, it is clear that the observation and prediction show a strong correlation, albeit that the observation seems to be shifted 15 minutes to the right of the prediction. This is most likely the result of the timing disparity mentioned.

Future Work:

The goal of this work was to be a first step in estimating calorie intake from food. In it, I describe a computationally efficient, albeit offline, method for estimating consumption state based on heart rate and blood oxygenation. There are several key next steps for this work, including:

Running the algorithm online with streaming data
Running the streaming algorithm on compute-limited hardware, like a fitness tracker
Creating a labeled data-set of BPM and SpO2 mapped to number of calories consumed

Using the model described in this work to trigger data labeling prompts for number of calories consumed in an eating session

Creating an online, computationally lean model to calculate the number of calories consumed on-the-fly in an embedded environment

These future directions of work are absolutely critical to tackling the incumbent problems facing widespread human health and wellness laid out in the Introduction, and hopefully this work can have a small but significant part to play in meeting these goals.

[1] World Health Organization, https://www.who.int/news-room/fact-sheets/detail/infant-and-young-child-feeding

[2] Calorie intake and short-term survival of critically ill patients, Hartl et al., https://www.ncbi.nlm.nih.gov/pubmed/29709380

[3] Calorie Restriction in the Elderly People, Kwang-Il et al., https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3677990/

[4] Necklace and smartphone app developed at UCLA can help people track food intake, UCLA Newsroom, Bill Kisiluik, March 12, 2015, http://newsroom.ucla.edu/releases/necklace-and-smartphone-app-developed-at-ucla-can-help-people-track-food-intake

[5] OSCAR Source, https://gitlab.com/pholy/OSCAR-code

[6] OSCAR modifications, https://github.com/allen-n/OSCAR-PulseOx

Table of Contents

Abstract: