Data quality: prevention is better than the cure
Andrew Jones
Principal Engineer | Author | Google Developer Expert
All opinions are my own
Improve data quality at the source, so we can…
Deliver value from data faster and cheaper
64% of organisations think that big data and analytics are the way to deliver competitive advantage…
…yet only 1 in 5 are using it to deliver increased revenue
Source: Nash Squared
Hey #engineering, anyone know where I can find this data?
Sorry all, an upstream schema change broke this morning's run
IF(date < 2022, v1_pricing, v2_pricing)
We can’t even do BI well, what makes them think we can do AI?!
Working with poor-quality data is time-consuming and expensive
Poor data quality costs organisations an average of $12.9 million a year - Gartner
1-10-100 rule of data quality
George Labovitz and Yu Sang Chang, 1992
Failure - $100
Remediation - $10
Prevention - $1
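As a rough illustration of the rule, the sketch below compares what it would cost to deal with the same number of bad records at each stage. The dollar figures are the rule's relative per-record costs rather than measured values, and the record count and the `total_cost` helper are hypothetical names chosen for this example.

```python
# Relative per-record costs from the 1-10-100 rule (illustrative, not measured).
COST_PER_RECORD = {"prevention": 1, "remediation": 10, "failure": 100}


def total_cost(stage: str, bad_records: int) -> int:
    """Cost of dealing with `bad_records` bad records at the given stage."""
    return COST_PER_RECORD[stage] * bad_records


if __name__ == "__main__":
    bad_records = 1_000  # hypothetical volume of bad records
    for stage in ("prevention", "remediation", "failure"):
        print(f"{stage:<12} ${total_cost(stage, bad_records):,}")
```

Under these assumptions, the same 1,000 bad records cost 100 times more to absorb as failures than to prevent at the source.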
Encourage collaboration
Explicit interface
Prevention is better than the cure
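One way to picture an explicit interface is a check that runs at the source, before data is published downstream. The sketch below is a minimal, standard-library-only illustration and not a specific data contract tool: the `ORDER_CONTRACT` fields and the `contract_violations` helper are hypothetical.

```python
from datetime import date

# Hypothetical contract for an "orders" dataset: field name -> expected type.
ORDER_CONTRACT = {
    "order_id": str,
    "order_date": date,
    "amount_pence": int,
    "pricing_version": str,
}


def contract_violations(record: dict, contract: dict) -> list[str]:
    """Return human-readable violations; an empty list means the record conforms."""
    problems = []
    # Every field in the contract must be present with the expected type.
    for field, expected in contract.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    # Fields outside the contract are flagged rather than silently passed on.
    for field in record.keys() - contract.keys():
        problems.append(f"unexpected field: {field}")
    return problems


if __name__ == "__main__":
    record = {"order_id": "A1", "order_date": date(2022, 1, 5), "amount_pence": "1250"}
    for problem in contract_violations(record, ORDER_CONTRACT):
        print(problem)
```

Rejecting or quarantining a record when the list is non-empty keeps the problem with the producer, who can fix it cheaply, instead of surfacing it as a broken run for consumers.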
UP NEXT: Book giveaway!
Thanks!
What percentage of AI projects fail to deliver?
Source: Gartner
85%
That’s a lot! How many are due to poor-quality data?
Source: Gartner
How much of an organisation's data is actually used?
Source: Seagate
32%
That leaves 68% of your data incurring costs, both monetary and in increased risk, without generating any value.
Maybe we should think about quality, more than quantity? 🤔
Source: Seagate
📙 https://data-contracts.com
🔖 https://andrew-jones.com
👥 in/andrewrhysjones
Questions?