1 of 42

Where does our food come from?

2 of 42

Hypothesis: the majority of food in the vending machine is not locally produced

3 of 42

4 of 42

What products are in the IAAC vending machine?

5 of 42

Using various websites to gather info on where the products come from

6 of 42

We identified all of the products in the vending machine

on Open Food Facts

7 of 42

Gullón vitalday

8 of 42

Kinder bueno

9 of 42

Twix

10 of 42

On Open Food Facts we searched for the production location of “sandwiches”...

11 of 42

OPEN FOOD FACTS : DATASHEET

We could download the whole database of the site through their API
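As a rough sketch of what querying the API for a single product might look like: the snippet below assumes the public read endpoint pattern `https://world.openfoodfacts.org/api/v2/product/<barcode>.json` and the field names `product_name`, `ingredients_text`, `manufacturing_places` and `origins`. These are assumptions and should be checked against the current Open Food Facts documentation; the helper names are hypothetical.

```python
import json
from urllib.request import Request, urlopen

# Assumed endpoint pattern; verify against the current Open Food Facts API docs.
API_URL = "https://world.openfoodfacts.org/api/v2/product/{barcode}.json"

# Fields of interest (assumed names; absent fields are treated as empty).
FIELDS = ("product_name", "ingredients_text", "manufacturing_places", "origins")


def summarise_product(payload):
    """Pick the fields of interest out of a decoded API response."""
    product = payload.get("product") or {}
    return {field: (product.get(field) or "").strip() for field in FIELDS}


def fetch_product(barcode, timeout=10):
    """Download one product record and return its summary (needs network access)."""
    request = Request(
        API_URL.format(barcode=barcode),
        headers={"User-Agent": "vending-machine-study/0.1"},
    )
    with urlopen(request, timeout=timeout) as resp:
        return summarise_product(json.load(resp))
```

For instance, `fetch_product("8000500037560")` would request the Kinder Bueno entry, using the barcode that appears in its product URL.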

12 of 42

OPEN FOOD FACTS : Open Data Commons License

13 of 42

Open Food Facts...

It is community driven. It is open source, and you can download the whole database

We could find where the products were sold

But we could only sometimes find where a product was “produced” and what ingredients it has. When looking for the origin of the ingredients:

Hard to find exact location

Lack of product traceability
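To put a number on “only sometimes”, one could count how many product records actually carry a production location. A minimal sketch, assuming each product is a dict using the (assumed) Open Food Facts field names `manufacturing_places` and `origins`; the sample records are invented placeholders, not our data.

```python
def field_coverage(products, fields=("manufacturing_places", "origins")):
    """Return, per field, the fraction of products with a non-empty value."""
    total = len(products)
    coverage = {}
    for field in fields:
        filled = sum(1 for p in products if str(p.get(field) or "").strip())
        coverage[field] = filled / total if total else 0.0
    return coverage


# Placeholder records for illustration only (not real measurements).
sample = [
    {"product_name": "A", "manufacturing_places": "Spain", "origins": ""},
    {"product_name": "B", "manufacturing_places": "", "origins": ""},
    {"product_name": "C", "manufacturing_places": "Italy"},
    {"product_name": "D"},
]

print(field_coverage(sample))
```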

14 of 42

PYTHON SCRIPT PROCESS

We wanted to create a script that would scrape Open Food Facts and write all the ingredients of the products into a spreadsheet (a CSV file that Excel can open).

PYTHON CODE:

import csv
from contextlib import closing

from bs4 import BeautifulSoup
from requests import get
from requests.exceptions import RequestException


def simple_get(url):
    """Fetch url and return the raw HTML, or None if the request fails."""
    try:
        with closing(get(url, stream=True)) as resp:
            if is_good_response(resp):
                return resp.content
            return None
    except RequestException as e:
        log_error('Error during requests to {0} : {1}'.format(url, str(e)))
        return None


def is_good_response(resp):
    """
    Returns True if the response seems to be HTML, False otherwise.
    """
    # .get() avoids a KeyError when the Content-Type header is missing
    content_type = resp.headers.get('Content-Type', '').lower()
    return resp.status_code == 200 and 'html' in content_type


def log_error(e):
    """
    This function just prints them, but you can
    make it do anything.
    """
    print(e)


def match_id(target):
    """Build a filter matching tags whose id contains every string in target."""
    def do_match(tag):
        tag_id = tag.get('id', '')
        return all(t in tag_id for t in target)
    return do_match


# This is where the ingredients will be stored
filename = "ingredients.csv"

# Urls
urls = ['https://world.openfoodfacts.org/product/8410055885101/limonada-font-vella',
        'https://world.openfoodfacts.org/product/8425561010084/agua-de-cuevas',
        'https://world.openfoodfacts.org/product/5449000171771/aquarius-limon-zero-sin-azucar',
        'https://world.openfoodfacts.org/product/5449000133335/coca-cola-zero',
        'https://world.openfoodfacts.org/product/5449000267412/coca-cola',
        'https://world.openfoodfacts.org/product/8429359320409/enjoy-san-benedetto',
        'https://world.openfoodfacts.org/product/4002160092150/naranja-granini-1l',
        'https://world.openfoodfacts.org/product/9002490205973/energy-drink-red-bull',
        'https://world.openfoodfacts.org/product/5060335632302/monster-energy',
        'https://world.openfoodfacts.org/product/8436531820053/sandwich-salami-queso-delikia',
        'https://world.openfoodfacts.org/product/8436531821708/sandwich-pollo-bacon-delikia',
        'https://world.openfoodfacts.org/product/8410199019684/ruffles-jamon',
        'https://world.openfoodfacts.org/product/8413164008713/snatt-s-natuchips-grefusa',
        'https://world.openfoodfacts.org/product/8413164005583/snatt-s-palitos-de-cereales-con-pipas-grefusa',
        'https://world.openfoodfacts.org/product/8413164005002/snatt-s-palitos-de-trigo-con-queso-grefusa',
        'https://world.openfoodfacts.org/product/8413164014783/mister-corn-mix-5-grefusa',
        'https://world.openfoodfacts.org/product/8410179015200/natura-almendra-borges',
        'https://world.openfoodfacts.org/product/8410376055702/vitalday-gullon',
        'https://world.openfoodfacts.org/product/8435351499289/fresa-yogur-chocolate-natwins',
        'https://world.openfoodfacts.org/product/5906747312602/digestive-go-fontaneda',
        'https://world.openfoodfacts.org/product/8000500037560/kinder-bueno',
        'https://world.openfoodfacts.org/product/5900951028502/twix-xtra']

# csv.writer quotes ingredient lists that contain commas
with open(filename, "w", newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerow(["product", "ingredients"])

    for _url in urls:
        print(_url)
        raw_html = simple_get(_url)
        if raw_html is None:
            # Skip pages that could not be downloaded
            continue
        html = BeautifulSoup(raw_html, 'html.parser')
        ing = html.find_all(match_id(["ordered_ingredients_list"]))
        print(ing)

        heading = html.select_one('h1')
        if heading is None:
            # Skip pages without a product title
            continue
        h1 = heading.text.replace(' ', '_').replace('\n', '')

        for item in ing:
            # Keep only the text before '->', if that marker is present
            clean_item = item.text.split('->')[0].replace('\n', '')
            writer.writerow([h1, clean_item])

Thanks Oscar!

15 of 42

The python script generated this list of ingredients:
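A list like this can be summarised by counting how often each ingredient appears across products. The sketch below assumes a CSV with a header row, the product name in the first column and the comma-separated ingredient text after it; the two sample rows are invented for illustration and `count_ingredients` is a hypothetical helper.

```python
import csv
import io
from collections import Counter


def count_ingredients(csv_text):
    """Count ingredient occurrences across all rows of a product/ingredients CSV."""
    counts = Counter()
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 2:
            continue
        # Re-join the remaining columns so quoted and unquoted lists both work
        for ingredient in ",".join(row[1:]).split(","):
            name = ingredient.strip().lower().rstrip(".")
            if name:
                counts[name] += 1
    return counts


# Invented sample rows, only to show the expected shape of the file.
sample_csv = (
    "product,ingredients\n"
    'Biscuit_X,"Wheat flour, sugar, palm oil"\n'
    'Snack_Y,"Potatoes, sunflower oil, Sugar."\n'
)

print(count_ingredients(sample_csv).most_common(3))
```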

16 of 42

What is local?

Natwins cookies

17 of 42

The “Mediterranean”:

18 of 42

Delikia coffee

19 of 42

Nicaragua

20 of 42

Very few companies have transparent supply chains

Very little detail...

One of the shortest lists on Nestlé’s website

21 of 42

22 of 42

Wheat supply chain

23 of 42

But luckily it’s getting better

24 of 42

Looking for other sites to scrape...

TRACES

“In 2002, the European Union passed a law (the General Food Law) that enforces traceability. All food and feed businesses must have a traceability system, where they record the supplier, customer, what kind of product was delivered and when.”

  • Livestock
  • Animal products
  • Vegetables
  • Plants
  • Other foods
  • Wood

You can make an account but you must be CONFIRMED by the European Commission
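The record described in the quote above (supplier, customer, product, date) can be pictured as a small data structure. This is only an illustrative sketch: the class, the field names and the sample companies are hypothetical and do not reflect the actual TRACES schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Delivery:
    """One traceability record: who supplied what to whom, and when."""
    supplier: str
    customer: str
    product: str
    delivered_on: date


def trace_upstream(records, company):
    """Follow deliveries backwards from company and list the chain of suppliers."""
    chain, seen = [], {company}
    current = company
    while True:
        incoming = [r for r in records
                    if r.customer == current and r.supplier not in seen]
        if not incoming:
            return chain
        # Simplification: follow only the most recent delivery at each step.
        latest = max(incoming, key=lambda r: r.delivered_on)
        chain.append(latest.supplier)
        seen.add(latest.supplier)
        current = latest.supplier


# Invented records for illustration only.
records = [
    Delivery("Farm A", "Mill B", "wheat", date(2019, 3, 1)),
    Delivery("Mill B", "Bakery C", "flour", date(2019, 3, 10)),
    Delivery("Bakery C", "Vending Co", "sandwich", date(2019, 3, 12)),
]

print(trace_upstream(records, "Vending Co"))
```

If every business in a chain shared records of this kind, walking them backwards would be enough to reach the farm; in practice those records are not public.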

25 of 42

Datasheets are often private. On OEC a subscription costs 2K! Per month!

$$$$$$$

26 of 42

CIAT

CIAT advances data, information, and knowledge management by the implementation of the CGIAR Open Access and Open Data policy, efficient management and update of its information and data repositories, and introduction of participatory methods to foster knowledge sharing (inside and outside CIAT).

https://blog.ciat.cgiar.org/origin-of-crops/

Only shows the top 5 exports (by weight)

But it lacks detail on location

27 of 42

We found one of the datasets that CIAT used to build their infographics. From it, we picked the import and export data relevant to the ingredients in the vending machine

28 of 42

This shows how much (% of total weight) of a given product is exported from each region.

potatoes

sugar

Palm oil

sunflower
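The percentage behind charts like these is each region's exported weight divided by the total exported weight. A minimal sketch of that arithmetic, using hypothetical tonnages (not the CIAT figures) and a hypothetical helper name:

```python
def export_shares(weights_by_region):
    """Convert exported weights per region into percentages of the total."""
    total = sum(weights_by_region.values())
    if total == 0:
        return {region: 0.0 for region in weights_by_region}
    return {region: round(100 * weight / total, 1)
            for region, weight in weights_by_region.items()}


# Hypothetical tonnages, for illustration only.
potatoes = {"Region A": 600, "Region B": 300, "Region C": 100}

print(export_shares(potatoes))
```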

29 of 42

This shows what regions are exporting what quantity of produce/products to Spain (but does not divide by ingredient)

30 of 42

We decided to buy a sandwich from the vending machine and trace the possible origins of the main ingredients, using OEC’s data concerning Spain’s imported products

The unit of measurement was the value of the product in $, not the amount in tonnes.
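The unit matters because ranking origins by dollar value can give a different order than ranking by tonnes when the price per tonne differs between origins. A minimal sketch with invented numbers (not OEC data); `rank_origins` is a hypothetical helper.

```python
def rank_origins(imports, key):
    """Return origin names sorted from largest to smallest by the given measure."""
    ordered = sorted(imports.items(), key=lambda kv: kv[1][key], reverse=True)
    return [name for name, _ in ordered]


# Invented figures for illustration only (not OEC data).
cheese_imports = {
    "Country A": {"value_usd": 900_000, "tonnes": 150},
    "Country B": {"value_usd": 700_000, "tonnes": 250},
    "Country C": {"value_usd": 100_000, "tonnes": 40},
}

by_value = rank_origins(cheese_imports, "value_usd")
by_weight = rank_origins(cheese_imports, "tonnes")
print(by_value)
print(by_weight)
```

In this made-up case Country A leads by value while Country B leads by weight, so a value-based chart only hints at where most of the physical product comes from.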

31 of 42

CASE STUDY: The delikia sandwich

MAIN COMPONENTS:

-wheat

-pig meat

-cheese

-nuts

-eggs

-yeast

-olive oil

32 of 42

Where does Spain import these products from?

WHEAT

33 of 42

PIGS

34 of 42

CHEESE

35 of 42

YEAST

36 of 42

NUTS

37 of 42

OLIVE OIL

38 of 42

EGGS

39 of 42

Techniques utilized

  • Manual “web scraping”
  • Automated web scraping
  • Scanning products through Open Food Facts
  • Researching through food brand websites

Resources utilized

  • delikia.es
  • www.girofibra.com/
  • es.openfoodfacts.org
  • oec.world
  • trademap.org
  • Oscar

40 of 42

CONCLUSIONS

It is very difficult to retrieve information about where food comes from and where it goes

There is a lack of transparency regarding the movement of goods

There is no detailed information available to the public about food sources

Web scraping is an option, but not always the best or most efficient one.

41 of 42

How could we improve this system?

  • How can we re-imagine the vending machine? Can we think of a better, more transparent alternative?
  • If we could trace all our food, we could create legislation to regulate and promote the circular economy.
  • Why don’t we question the transparency of our food system more often? It’s such a basic need.
  • If we can trace food, how will it change the industry?
  • Can we accurately track the carbon emissions of our food? Often the data is private or hard to find.

42 of 42