dbt Office Hours: Building out an entity resolution pipeline with Python and dbt
Thursday, October 1, 5 pm BST / 12 pm EDT / 9 am PDT
Speaker: Pedram Navid, Data Engineer, Vouch.us
You’d probably expect a table named companies to have one record per company, right?
But what happens when your organization uses a number of different tools to collect customer information, all with different ways to track a company? Or when two people fill in slightly different names for their company? That companies table is likely going to end up with different records representing the same real-life company.
That’s where entity resolution comes in — the practice of mapping different identifiers to a single entity.
In this Office Hours, Pedram from Vouch Insurance will share how they solved this problem using a combination of dbt and Python.
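To make the idea concrete before the session, here is a minimal Python sketch of entity resolution. It is not the approach Pedram will present. The sample rows, the field names (`id`, `name`, `domain`), the suffix list, and the helpers `normalize_name` and `resolve` are all hypothetical, chosen only for illustration. Records that share a normalized name or an email domain are treated as the same company, and matches are merged transitively.

```python
import re
from collections import defaultdict

# Hypothetical sample rows, as if collected by three different tools.
RECORDS = [
    {"id": "crm-1", "name": "Acme, Inc.", "domain": "acme.com"},
    {"id": "billing-7", "name": "ACME Inc", "domain": None},
    {"id": "support-3", "name": "Acme Insurance Services", "domain": "acme.com"},
    {"id": "crm-2", "name": "Globex LLC", "domain": "globex.io"},
]

# Assumed list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "corporation", "co"}


def normalize_name(name):
    """Lowercase, replace punctuation with spaces, and drop legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def resolve(records):
    """Return {record id: entity id}; records sharing any key are merged."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the alphabetically smaller id as the root, so the
            # result does not depend on input order.
            parent[max(ra, rb)] = min(ra, rb)

    # Group record ids by each matching key (normalized name, domain).
    by_key = defaultdict(list)
    for r in records:
        name_key = normalize_name(r["name"])
        if name_key:  # skip names that normalize to an empty string
            by_key[("name", name_key)].append(r["id"])
        if r["domain"]:
            by_key[("domain", r["domain"].lower())].append(r["id"])

    # Every record in a group belongs to the same entity.
    for ids in by_key.values():
        for other in ids[1:]:
            union(ids[0], other)

    return {r["id"]: find(r["id"]) for r in records}


ENTITY_MAP = resolve(RECORDS)
```

In this sketch, `billing-7` has no domain and `support-3` has a different name, yet both land in the same entity as `crm-1`, one through the shared name and the other through the shared domain. Exact-key matching like this is the simplest possible rule; real pipelines often need fuzzier comparisons.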
- Office hours are demo-heavy and will assume you have existing knowledge of dbt
- Hang tight! Your email invitation may take a few minutes to arrive
About dbt Office Hours:
dbt Office Hours are a place for the makers in our community to share something they've worked on that they're proud of. Typically, office hours are pretty informal: the guest speaker presents a couple of slides to set up the problem they are solving, then shares their screen to highlight some code. We always leave room for questions at the end.
By submitting this form, you'll be added as a guest to the Google Calendar event (a great way to remember that the event is on!)