dbt Office Hours: Building out an entity resolution pipeline with Python and dbt
Thursday, October 1, 5 pm BST / 12 pm EDT / 9 am PDT
Topic: dbt Office Hours: Building out an entity resolution pipeline with Python and dbt
Speaker: Pedram Navid, Data Engineer, Vouch.us
You’d probably expect a table named companies to have one record per company, right?
But what happens when your organization uses a number of different tools to collect customer information, all with different ways to track a company? Or when two people fill in slightly different names for their company? That companies table is likely going to end up with different records representing the same real-life company.
That’s where entity resolution comes in — the practice of mapping different identifiers to a single entity.
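To make the idea concrete, here is a minimal Python sketch of one simple form of entity resolution: grouping records by a normalized company name. It is only an illustration and not the approach presented in the session. The `LEGAL_SUFFIXES` set, the `normalize` and `resolve` helpers, and the sample records are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}


def normalize(name: str) -> str:
    """Lowercase the name, strip punctuation, and drop common legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def resolve(records):
    """Group (source, source_id, name) records under one normalized entity key."""
    entities = defaultdict(list)
    for source, source_id, name in records:
        entities[normalize(name)].append((source, source_id))
    return dict(entities)


# Made-up records from three hypothetical tools that each track companies.
records = [
    ("crm", "c-1", "Acme, Inc."),
    ("billing", "b-7", "ACME Inc"),
    ("support", "s-3", "Globex LLC"),
]

resolved = resolve(records)
for entity, identifiers in resolved.items():
    print(entity, identifiers)
```

In this sketch the two spellings of Acme collapse into one entity that carries both source identifiers. Exact matching on a normalized name is the simplest possible rule; real pipelines often need fuzzy matching or additional fields such as domain or address.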
In this Office Hours, Pedram from Vouch Insurance will share how they solved this problem using a combination of dbt and Python!
Please Note:
- Office hours are demo-heavy and will assume you have existing knowledge of dbt
- Hang tight! Your email invitation may take a few minutes to arrive
About dbt Office Hours:
dbt Office Hours are a place for the makers in our community to share something they've worked on that they're proud of. Typically, Office Hours are pretty informal: the guest speaker presents a couple of slides to set up the problem they are solving, then shares their screen to highlight some code. We always leave room for questions at the end.