Food Shelter Predictor


Team Members

The Context and Problem

ML Research        

        Constraint Satisfaction Problem

        Decision Trees

        Clustering

        Graphical Modeling

        Research Papers

                Combination of Time Series, Decision Tree and Clustering

                Hybrid Decision Tree-Neuro-Fuzzy System

Solution Approach

        Implementation

Results

        Minimum Specs

        Database Statistics

        Machine Learning Statistics

        Query Requests Speed

Conclusion

        Future Work

Bibliography

Team Members

Nassim Amar, Nevan Wichers, Bianca Flaidar

Source Link: https://github.com/namar0x0309/FoodShelterPredict                         

The Context and Problem

        We developed this project using existing databases that contain information about food banks and shelters. The information, volunteered by the food banks, is collected in a database application (https://github.com/zenev/ClientcardFB3). Food shelters/banks gather data about donors and clients (households in need); their interest is in managing their resources and anticipating volume.

The problem is that food shelters have to manage and anticipate many different variables. Ideally, they would be able to easily predict food availability, donations, and the volume or influx of clients. The multivariable nature of this situation is what makes machine learning well suited to predicting these resource flows for the food shelters. Use of AI could ultimately result in a more efficient distribution of food/aid packages.

ML Research

        Constraint Satisfaction Problem (CSP)

CSP refers to finding a solution to a set of constraints that variables must satisfy, usually via some form of backtracking search. CSPs can be represented as constraint graphs in which the nodes are variables and the arcs are constraints. However, our data did not fit naturally into such a rigid, fixed-pattern model, and we found decision trees more capable of dealing with noisy data.

        Decision Trees

Decision trees are useful to our project because we want an algorithm that applies well to our data, which is essentially a set of attributes from which decision trends can be built. Shelters can then use those trends to manage their resources better. This doesn't have to be limited to shelters: donors and households may also want to find shelters suited to their needs. Given the data we have, rather than modeling constraints we found it easier to visualize a model of decisions and chance-event outcomes.
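As a minimal sketch of the idea, a decision tree over such attributes can be represented as nested objects and walked to produce a prediction. The attribute names and values below are hypothetical examples, not our actual schema:

```javascript
// A tiny decision tree: internal nodes test an attribute, leaves hold a prediction.
// Attribute names and values here are illustrative only.
const tree = {
  attribute: 'month',
  branches: {
    November: { prediction: 'high donations' },
    February: {
      attribute: 'donorType',
      branches: {
        corporate: { prediction: 'medium donations' },
        individual: { prediction: 'low donations' }
      }
    }
  }
};

// Follow the branch matching the sample's attribute value until a leaf is reached.
function predict(node, sample) {
  if (node.prediction !== undefined) return node.prediction;
  const next = node.branches[sample[node.attribute]];
  return next ? predict(next, sample) : 'unknown';
}

console.log(predict(tree, { month: 'February', donorType: 'corporate' }));
// -> 'medium donations'
```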

        Clustering

We like clustering because this unsupervised learning approach, which separates data into clusters of similar points, applies well to questions in this project such as when donors donate, which geographic areas are involved, and the demographics of households in need.
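For illustration, here is a minimal one-dimensional K-means sketch clustering donation days of the year into groups; the data points are made up, and a real implementation would use the database and a library rather than this toy loop:

```javascript
// Minimal 1-D K-means sketch: cluster donation days-of-year into k groups.
function kmeans(points, k, iterations = 20) {
  // Initialize centroids with the first k points (a real version would randomize).
  let centroids = points.slice(0, k);
  let labels = [];
  for (let it = 0; it < iterations; it++) {
    // Assignment step: each point joins its nearest centroid.
    labels = points.map(p => {
      let best = 0;
      for (let c = 1; c < k; c++) {
        if (Math.abs(p - centroids[c]) < Math.abs(p - centroids[best])) best = c;
      }
      return best;
    });
    // Update step: move each centroid to the mean of its assigned points.
    centroids = centroids.map((old, c) => {
      const members = points.filter((_, i) => labels[i] === c);
      return members.length
        ? members.reduce((a, b) => a + b, 0) / members.length
        : old;
    });
  }
  return { centroids, labels };
}

// Two obvious groups: donations around day 50 and around day 330 of the year.
const days = [45, 50, 55, 48, 325, 330, 335, 328];
const { labels } = kmeans(days, 2);
```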

Graphical Modeling

We looked at how data can be modeled as a joint probability distribution from which decision trees can be extracted.

        Research Papers

After brainstorming different solutions and deciding to explore decision trees for our project, we looked for more concrete approaches in research texts and narrowed our reading down to these two papers:

        Combination of Time Series, Decision Tree and Clustering

        This is the research we are most interested in and building upon. Our data is tied to the time events take place (particularly the time of donations). The paper is an example of a predictive system that uses a time series: a sequence of values recorded at equal time intervals. Time series models can predict the next value of a series, which we found interesting for predicting times of donation events, where our time intervals can be months. For a real-world application like this, we don't want to predict the exact time an event will occur; we want to predict that an event will occur within a time interval.

This research uses the C5 decision tree algorithm, but we focused on how decision trees work together with a time series and a K-means clustering algorithm rather than on the specific decision tree algorithm. The algorithm we ended up using for building our decision tree was chosen to fit the technology we used for our implementation.

        In the case of clustering, if we have a few disjoint variables, we can cluster them by context: for example, shelters in an area belong to the same cluster as donors in that area. K-means can help achieve more accurate predictions by delineating the different clusters in the data.

        The proposed method, which we more or less used as a guide to our solution, involves these steps:

1. Define a time series interval.
2. Record the values of selected system parameters at each interval.
3. Detect events from an event stream (logs in our database).
4. Find the closest time series record to each event time point.
5. Attach each event to that time series record.
6. Create a new data set from the events and the time series.
7. Optionally, apply a clustering algorithm.
8. Use a decision tree algorithm for prediction.
9. Evaluate the results.
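The event-attachment part of this pipeline can be sketched as follows. The field names and sample values are hypothetical, not taken from our database; steps 7 and 8 (clustering and the decision tree) would then consume the merged training set:

```javascript
// Sketch of attaching logged events to their closest monthly time-series
// record, producing a training set for the decision tree. Field names and
// values are illustrative only.

// A monthly time series of a system parameter (steps 1-2).
const series = [
  { month: 0, totalDonations: 12 },
  { month: 1, totalDonations: 8 },
  { month: 2, totalDonations: 20 }
];

// Events detected from the log stream, stamped with a day of the year (step 3).
const events = [
  { day: 10, type: 'large-donation' },  // falls in month 0
  { day: 70, type: 'large-donation' }   // falls in month 2
];

// Steps 4-6: find the closest series record to each event and merge them.
function attachEvents(series, events) {
  return events.map(ev => {
    const monthIndex = Math.floor(ev.day / 30); // crude day-to-month conversion
    let closest = series[0];
    for (const rec of series) {
      if (Math.abs(rec.month - monthIndex) < Math.abs(closest.month - monthIndex)) {
        closest = rec;
      }
    }
    return { ...closest, event: ev.type };
  });
}

const trainingSet = attachEvents(series, events);
```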

        Hybrid Decision Tree-Neuro-Fuzzy System

        The main reason for looking at this paper was to see an example of a fuzzy system used with a decision tree. We want to be careful not to create persistent links in the decision trees we build, so that our models stay as flexible as possible. The paper proposes a system that uses technical analysis for feature extraction and a decision tree for feature selection; only then is the data set applied to an adaptive neuro-fuzzy system.

For our project, we could link clusters through fuzzy contexts: for example, shelters and donors in an area can be correlated with the needy in that same area.

Neuro-fuzzy hybridization results in a system that combines the human-like reasoning style of fuzzy systems with the learning nature of neural networks. The paper offers an example of a fuzzy inference system that creates fuzzy if-then rules. While it was valuable to consider, we decided that implementing an adaptive network-based fuzzy inference system (ANFIS) on top of a decision tree was too complex for the scope and time available for this project.

Solution Approach

        We focused our project on predicting the availability of donors, information that serves a shelter as described in the introduction. Incoming donors are tracked in order to predict donation information for the upcoming year. We also predict the composition of a family given incomplete data.

Implementation

        We decided to go with a web platform to ease the design, development, and usage of the product. When drafting the use cases, we concluded that all parties would benefit from a browser-based solution that gives donors, shelters, and even households access to useful information. We focused on the shelter's interest in donation information as the prototype that can then be applied to all use cases. The web platform approach also makes sense for propagating the product to many food shelters with little to no setup or installation overhead.

        We considered different solutions ranging from the MAMP stack to Ruby on Rails, but NodeJS offered the best mix of a low barrier to entry, a rich plugin library, and a community to get us started on a quick prototype. We also had the constraint of needing to pull data from a MySQL database, which was a driving point in finding a compatible web platform. The final constraint was that we are a multi-OS team; from macOS to Windows and Linux, NodeJS proved to be an excellent experience on all three operating systems.
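As a rough sketch of the data access layer, a monthly donation query might be built like this. The table and column names are hypothetical (the real schema comes from the ClientcardFB3 application), and the connection snippet is shown only in comments since it needs a live database:

```javascript
// Build a SQL query grouping donation totals by month, so the rows line up
// with the monthly time series fed to the ML step. Table/column names are
// hypothetical placeholders for the ClientcardFB3 schema.
function buildDonationQuery(year) {
  return (
    'SELECT MONTH(donation_date) AS month, SUM(amount) AS total ' +
    'FROM donations ' +
    `WHERE YEAR(donation_date) = ${Number(year)} ` +
    'GROUP BY MONTH(donation_date)'
  );
}

// With the `mysql` package, the query would be issued roughly like this:
//   const mysql = require('mysql');
//   const conn = mysql.createConnection({ host: 'localhost', database: 'foodshelter' });
//   conn.query(buildDonationQuery(2015), (err, rows) => { /* feed rows to the ML step */ });
```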

We used machine learning APIs that we modified for our needs, mainly in building decision trees and tweaking split and entropy variables. Further improvements can be made by pruning the decision trees to remove redundancy (though we want to be careful not to remove crucial data). This process is currently semi-assisted while we work on making the system more autonomous.

        Our decision tree is built from data and the outcomes of that data; we feed queries for both into our ML functions. The decision tree implementation uses the Classification and Regression Tree (CART) algorithm. Classification-type problems are generally those where we attempt to predict the values of a categorical dependent variable; in this scenario, classification is done by month. CART uses historical data to construct decision trees; depending on the information and attributes in the dataset, either a classification tree or a regression tree can be built. The constructed tree can then be used to classify new events and observations.
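The core of any such tree build is choosing, at each node, the attribute whose split most reduces entropy. A minimal sketch of that split selection (with made-up rows and attribute names, not our schema or the exact API we used) looks like this:

```javascript
// Shannon entropy of a list of class labels, in bits.
function entropy(labels) {
  const counts = {};
  for (const l of labels) counts[l] = (counts[l] || 0) + 1;
  return Object.values(counts).reduce((h, n) => {
    const p = n / labels.length;
    return h - p * Math.log2(p);
  }, 0);
}

// Pick the attribute whose split yields the largest information gain
// (entropy before the split minus weighted entropy after it).
function bestSplit(rows, attributes, target) {
  const base = entropy(rows.map(r => r[target]));
  let best = { attribute: null, gain: 0 };
  for (const attr of attributes) {
    // Partition rows by the attribute's values.
    const groups = {};
    for (const r of rows) (groups[r[attr]] = groups[r[attr]] || []).push(r);
    const after = Object.values(groups).reduce(
      (h, g) => h + (g.length / rows.length) * entropy(g.map(r => r[target])),
      0
    );
    const gain = base - after;
    if (gain > best.gain) best = { attribute: attr, gain };
  }
  return best;
}

// Toy data: the month separates donation outcomes perfectly, donor type does not.
const rows = [
  { month: 'Nov', donorType: 'corporate', donated: 'yes' },
  { month: 'Nov', donorType: 'individual', donated: 'yes' },
  { month: 'Jun', donorType: 'corporate', donated: 'no' },
  { month: 'Jun', donorType: 'individual', donated: 'no' }
];
const split = bestSplit(rows, ['month', 'donorType'], 'donated');
// 'month' is chosen, with an information gain of 1 bit.
```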

        

Results

Minimum Specs

Performance

        Machine learning benchmarking isn't a straightforward metric; in fact, it isn't one metric but several. The more the system knows, the less it has to learn, so we observe an asymptotic decrease in the computation required as the decision trees are built from new incoming data. As time goes on, the changes in that data are incremental, which makes the learning equally incremental: eventually it amounts to little more than adjusting the decision trees in specific areas to stabilize the entropy.

Database Statistics

Machine Learning Statistics

Decision Trees Complexity

Redacted representation of the decision tree built from the input data

- The decision tree is composed of months, which are broken down further into other months depending on the donors. This makes sense given that the food shelter will want to access information relative to donors. Shelters often need certain types of food to keep up with recommended dietary needs and offer a wholesome diet to their visitors. Having the information in this layout allows the shelter to strategize accordingly: anticipating when donations are usually made helps it build a solid strategy, or increase efforts to attract other donors to fill the gaps as time goes by.

Decision Trees Entropy

As data is added to the decision tree and the learning processes it, we notice a general decrease in entropy over time. The negative peaks correspond to leaves built during the learning process. Due to the high variance of the data, the initial entropy is ~3.5, which then approaches 2.5, a decrease of roughly 30% that only improves as data streams in. We measured the learning on a sample of 1,000 entries; for the sake of time we kept it at that, and did not notice much further entropy improvement as the sample neared 20,000. This is due to the stabilizing variance of the incoming data; in other words, the same donors repeat during a 12-month period.

Decision Tree Depth vs Entries Processed

The API we chose, and the way we decided to split the learning, limits the tree height to 12. Since the time period is 12 months, we don't risk overflowing or over-consuming resources. This is great: we have converged to an upper threshold, which makes delivering this product with certain guarantees easier than previously thought.

Conclusion

        We’ve delivered a full experience (donor information) for one of our customers (the shelter). This incorporates the specific queries to access the database, the machine learning and decision tree building, and the visual dashboard representation and control through the browser.

Future Work

        The system we implemented to predict donor information useful to a shelter is a prototype that could be extended further.

        Based on our research, using clustering to identify, for example, household demographic groups could reveal interesting information about the people who come to a shelter, and the shelter could use that information to serve its clients better. For instance, demographic clustering could identify a group of elderly people living some distance from a food bank, and that food bank may then be able to deliver food to their homes.

Bibliography / References

Combination of Time Series, Decision Tree and Clustering: A case study in Aerology Event Prediction

by Seyed Behzad Lajevardi, Behrouz Minaei-Bidgoli

A Stock Market Trend Prediction System Using a Hybrid Decision Tree-Neuro-Fuzzy System

by Binoy B. Nair, N. Mohana Dharini, V.P. Mohandas

Constraint Satisfaction Problem

http://aima.cs.berkeley.edu/2nd-ed/newchap05.pdf

Extracting Decision Trees from Diagnostic Bayesian Networks

http://www.phmsociety.org/sites/phmsociety.org/files/phm_submission/2010/phmc_10_080.pdf

Classification and Regression Trees

http://www.statsoft.com/Textbook/Classification-and-Regression-Trees

Classification and Regression Trees (CART) Theory and Applications

http://edoc.hu-berlin.de/master/timofeev-roman-2004-12-20/PDF/timofeev.pdf