Where does our food come from?
Hypothesis: the majority of food in the vending machine is not locally produced
What products are in the IAAC vending machine?
Using various websites to gather info on where it comes from
We identified all of the products in the vending machine on Open Food Facts
Gullón Vitalday
Kinder Bueno
Twix
On Open Food Facts we searched for the production location of “sandwiches”...
OPEN FOOD FACTS : DATASHEET
We could download the whole database of the site through their API
OPEN FOOD FACTS : Open Data Commons License
Open Food Facts...
It is community driven and open source: you can download the whole database through the API
We could find where the products were sold
But only sometimes could we find where a product was “produced” and what ingredients it contains. When looking for the origin of those ingredients:
Hard to find exact location
Lack of product traceability
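The gap between where a product is sold and where it is produced can be checked product by product. A minimal sketch, assuming the JSON record returned by the Open Food Facts product endpoint carries the fields `countries`, `manufacturing_places` and `origins` (field names and the v0 URL path as we understand them from the public API; the helpers `product_api_url` and `summarize_traceability` and the sample record are hypothetical):

```python
def product_api_url(barcode):
    """Build the JSON endpoint URL for one barcode (assumed v0 API path)."""
    return f"https://world.openfoodfacts.org/api/v0/product/{barcode}.json"


def summarize_traceability(product):
    """Report which location fields are filled in for one product record."""
    fields = ("countries", "manufacturing_places", "origins")
    # Treat missing keys, None and empty strings the same way: as unknown
    return {name: (product.get(name) or "").strip() or None for name in fields}


# Hypothetical record, shaped like the "product" object of an API response
sample = {"countries": "Spain", "manufacturing_places": "", "origins": None}
summary = summarize_traceability(sample)
print(product_api_url("8000500037560"))
print(summary)
```

A record where only `countries` is filled in matches what we kept seeing: the point of sale is known, the place of production is not.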
PYTHON SCRIPT PROCESS
We wanted to create a script that would scrape Open Food Facts and write all the ingredients of the products into a spreadsheet (a CSV file that Excel can open).
PYTHON CODE:
from contextlib import closing

from bs4 import BeautifulSoup
from requests import get
from requests.exceptions import RequestException


def simple_get(url):
    """Fetch url and return the raw HTML content, or None on failure."""
    try:
        with closing(get(url, stream=True)) as resp:
            if is_good_response(resp):
                return resp.content
            else:
                return None
    except RequestException as e:
        log_error('Error during requests to {0} : {1}'.format(url, str(e)))
        return None


def is_good_response(resp):
    """
    Returns True if the response seems to be HTML, False otherwise.
    """
    # .get() avoids a KeyError when the Content-Type header is missing
    content_type = resp.headers.get('Content-Type', '').lower()
    return (resp.status_code == 200
            and content_type.find('html') > -1)


def log_error(e):
    """
    This function just prints them, but you can
    make it do anything.
    """
    print(e)


def match_id(target):
    """Build a filter that matches tags whose id contains every target string."""
    def do_match(tag):
        tag_id = tag.get('id', '')
        return all(c in tag_id for c in target)
    return do_match


# This is where I will store the ingredients
filename = "ingredients.csv"
# Urls
urls = ['https://world.openfoodfacts.org/product/8410055885101/limonada-font-vella',
'https://world.openfoodfacts.org/product/8425561010084/agua-de-cuevas',
'https://world.openfoodfacts.org/product/5449000171771/aquarius-limon-zero-sin-azucar',
'https://world.openfoodfacts.org/product/5449000133335/coca-cola-zero',
'https://world.openfoodfacts.org/product/5449000267412/coca-cola',
'https://world.openfoodfacts.org/product/8429359320409/enjoy-san-benedetto',
'https://world.openfoodfacts.org/product/4002160092150/naranja-granini-1l',
'https://world.openfoodfacts.org/product/9002490205973/energy-drink-red-bull',
'https://world.openfoodfacts.org/product/5060335632302/monster-energy',
'https://world.openfoodfacts.org/product/8436531820053/sandwich-salami-queso-delikia',
'https://world.openfoodfacts.org/product/8436531821708/sandwich-pollo-bacon-delikia',
'https://world.openfoodfacts.org/product/8410199019684/ruffles-jamon',
'https://world.openfoodfacts.org/product/8413164008713/snatt-s-natuchips-grefusa',
'https://world.openfoodfacts.org/product/8413164005583/snatt-s-palitos-de-cereales-con-pipas-grefusa',
'https://world.openfoodfacts.org/product/8413164005002/snatt-s-palitos-de-trigo-con-queso-grefusa',
'https://world.openfoodfacts.org/product/8413164014783/mister-corn-mix-5-grefusa',
'https://world.openfoodfacts.org/product/8410179015200/natura-almendra-borges',
'https://world.openfoodfacts.org/product/8410376055702/vitalday-gullon',
'https://world.openfoodfacts.org/product/8435351499289/fresa-yogur-chocolate-natwins',
'https://world.openfoodfacts.org/product/5906747312602/digestive-go-fontaneda',
'https://world.openfoodfacts.org/product/8000500037560/kinder-bueno',
'https://world.openfoodfacts.org/product/5900951028502/twix-xtra']
# Start the CSV file with a header row
with open(filename, "w") as file:
    file.write("url, ingredients")
    file.write('\n')

for _url in urls:
    print(_url)
    raw_html = simple_get(_url)
    if raw_html is None:
        # Skip pages that could not be downloaded
        continue
    html = BeautifulSoup(raw_html, 'html.parser')
    ing = html.find_all(match_id(["ordered_ingredients_list"]))
    print(ing)
    # Use the page title as the product name
    h1 = html.select('h1')[0].text.replace(' ', '_')
    h1 = h1.replace('\n', '')
    with open(filename, 'a') as file:
        for item in ing:
            # Keep only the text before the '->' marker, if there is one
            arrow = item.text.find('->')
            text = item.text[:arrow] if arrow != -1 else item.text
            clean_item = text.replace('\n', '')
            file.write(f"{h1}, {clean_item}")
            file.write("\n")
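One caveat with writing rows by hand: ingredient lists usually contain commas themselves, so a row such as `name, water, sugar` no longer splits into two columns. A possible alternative, sketched here with Python's standard `csv` module (the helper `write_rows` and the sample row are hypothetical, not part of the original script), lets the writer quote such fields:

```python
import csv
import io


def write_rows(handle, rows):
    """Write (product, ingredients) pairs as CSV, quoting fields as needed."""
    writer = csv.writer(handle)
    writer.writerow(["product", "ingredients"])
    writer.writerows(rows)


# Hypothetical row; in the script above it would come from h1 and clean_item
rows = [("Example_cola", "water, sugar, caffeine")]
buffer = io.StringIO()
write_rows(buffer, rows)

# Reading the text back yields exactly two columns per row
parsed = list(csv.reader(io.StringIO(buffer.getvalue())))
print(parsed)
```

To write to disk instead of a buffer, the same function could be called with a file opened as `open(filename, "w", newline="")`.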
Thanks Oscar!
The Python script generated this list of ingredients:
What is local?
Natwins cookies
The “Mediterranean”:
Delikia coffee
Nicaragua
Very few companies have transparent supply chains
Very little detail...
One of the shortest lists on Nestlé’s website
Wheat supply chain
But luckily it’s getting better
Looking for other sites to scrape...
TRACES
“In 2002, the European Union passed a law (the General Food Law) that enforces traceability. All food and feed businesses must have a traceability system, where they record the supplier, customer, what kind of product was delivered and when.”
You can make an account but you must be CONFIRMED by the European Commission
Datasheets are often private. On OEC, a subscription costs 2K per month!
$$$$$$$
CIAT
CIAT advances data, information, and knowledge management by the implementation of the CGIAR Open Access and Open Data policy, efficient management and update of its information and data repositories, and introduction of participatory methods to foster knowledge sharing (inside and outside CIAT).
https://blog.ciat.cgiar.org/origin-of-crops/
Only shows the top 5 exports (by weight)
But it lacks detail on location
We found one of the datasets that CIAT used to build their infographics. From it, we picked the import and export figures that were relevant to the ingredients in the vending machine
This shows how much (% of total weight) of a given product is exported from each region.
potatoes
sugar
Palm oil
sunflower
This shows what regions are exporting what quantity of produce/products to Spain (but does not divide by ingredient)
We decided to buy a sandwich from the vending machine and trace the possible origins of the main ingredients, using OEC’s data concerning Spain’s imported products
The unit of measurement was the value of the product in $, not the amount in tonnes
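Why the unit matters: a share computed from value can rank origins differently from a share computed from weight, because the price per tonne varies by origin. A small sketch with made-up numbers (the figures, the origin labels and the helper `shares` are illustrative assumptions, not OEC data):

```python
def shares(amounts):
    """Convert absolute amounts per origin into percentages of the total."""
    total = sum(amounts.values())
    return {origin: round(100 * amount / total, 1)
            for origin, amount in amounts.items()}


# Made-up import figures for one product: value in $ and weight in tonnes
value_usd = {"Origin A": 600_000, "Origin B": 400_000}
weight_t = {"Origin A": 1_000, "Origin B": 3_000}

value_shares = shares(value_usd)   # share of import value
weight_shares = shares(weight_t)   # share of import weight
print(value_shares)
print(weight_shares)
```

In this invented case Origin A leads by value while Origin B leads by weight, so a value-based chart should be read as "where the money goes", not "where most of the food comes from".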
CASE STUDY: The delikia sandwich
MAIN COMPONENTS:
-wheat
-pig meat
-cheese
-nuts
-eggs
-yeast
-olive oil
Where does Spain import these products from?
WHEAT
PIGS
CHEESE
YEAST
NUTS
OLIVE OIL
EGGS
Techniques utilized
Resources utilized
CONCLUSIONS
It is very difficult to retrieve information about where food comes from and where it goes
There is a lack of transparency regarding the movement of goods
There is no detailed information available to the public about food sources
Web scraping is an option, but not always the best or most efficient one.
How could we improve this system?