Careers in Data Science and Analytics
What is analytics, and who wants it?
J. Hathaway,
Program Chair of Data Science @
Brigham Young University-Idaho
Connect: LinkedIn
J. Hathaway
For the first decade of my career, I led applied research in statistics, climate model integrations, and sampling designs for the U.S. Department of Energy. Over the past ten years, I have had the opportunity to build and teach data science programs at Brigham Young University-Idaho. Throughout this period, I also founded and operated businesses in advertising, retail sales, private consulting, and a Data Wrangling as a Service (DWaaS) company. Currently, I am the co-owner of DataThink and a full-time faculty member at BYU-Idaho. With over 20 years of experience in writing code, my work has been focused on delivering data-driven solutions.
J. Hathaway
1995: Graduated High School
1998: Got married
1999: First child (Top, far right)
2005: Grad School to PNNL
2015: PNNL to BYU-I
2017: Data Science program starts and last child born (Top, middle)
Exploring Pathways
First in my family (parents, aunts, uncles, siblings, ancestors) to graduate from college, and the only one to complete graduate school.
You may not know how the end looks but you can focus on riding your wave of potential now.
Outline
“You cannot connect the dots looking forward; you can only connect them looking backward.”
Steve Jobs (& Dieter F. Uchtdorf)
What is data?
According to a data scientist
School Data
Business Data
What is data?
For data analytics, data is anything stored digitally that can be processed with software or programming.
From “raw” data to features.
Finding joy in the mess before the feast
Raw: transactions, text, movements, images
Features: organized summaries, totals, means, counts
ML/AI
Answers: model outputs, visuals, tables
Data Example: Customer purchases
Raw: Daily sales of perfume as shown on your payment service.
Features: Customer-summarized tables with information like;
Answers: What scent should I offer to client A next month to get the most sales?
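The raw-to-features-to-answers flow for the perfume example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the purchase rows, the column layout, and the helper name build_features are all hypothetical, and the "answer" is a naive rule (most frequently bought scent), not a real model.

```python
from collections import Counter, defaultdict

# Hypothetical raw data: one row per purchase (customer, month, scent, amount).
raw_purchases = [
    ("A", "2024-01", "citrus", 40.0),
    ("A", "2024-02", "citrus", 40.0),
    ("A", "2024-02", "vanilla", 55.0),
    ("B", "2024-01", "rose", 60.0),
    ("B", "2024-02", "rose", 60.0),
    ("B", "2024-02", "rose", 60.0),
]


def build_features(purchases):
    """Summarize raw transactions into one feature row per customer."""
    spend = defaultdict(float)
    scents = defaultdict(Counter)
    for customer, _month, scent, amount in purchases:
        spend[customer] += amount
        scents[customer][scent] += 1
    return {
        customer: {
            "total_spend": spend[customer],
            "n_purchases": sum(scents[customer].values()),
            "top_scent": scents[customer].most_common(1)[0][0],
        }
        for customer in spend
    }


features = build_features(raw_purchases)

# A naive "answer": offer each customer the scent they have bought most often.
print(features["A"])
print("Offer client A:", features["A"]["top_scent"])
```

In practice the answer step would likely use a model trained on many customers; the point here is only that the features table, not the raw rows, is what feeds that step.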
Data Example: Pictures
Raw: Shared pictures from chicken customers
Features: The camera typically generates metadata
Answers: Where are my wealthy customers and how healthy are their chickens?
What is data analytics?
Help me understand data science and business analytics.
Data Science
Business Analytics
Data Analytics
Chefs vs Cooks
Data Science vs Business Analytics
A chef is an individual who is trained to understand flavors and cooking techniques, creates recipes from scratch with fresh ingredients, and holds a high level of responsibility within a kitchen.
A cook is an individual who follows established recipes with standard ingredients to prepare food.
Data analytics: my term to cover both data science and business analytics.
What is data science and business analytics?
Hierarchy of tools
We take business requirements from a data-supported need and translate them into CODE and analytics to create profit-supporting solutions.
[Figure: "Which tool?" Axes: difficulty (low to high) and pay (low to high). Labels: Remote, Local.]
The soft skills that support analytics
Curiosity
Want to help people with their problems
Love to build connections
Data Science: Making sure that data and professionals understand each other
A data scientist brings the data into executive decision making. We must learn to
Where do we start?
Depending on your entry point, your journey to a data analytics profession will vary.
A tool-centric view
I started here
Start where you are
“You cannot connect the dots looking forward; you can only connect them looking backward.”
Steve Jobs (& Elder Uchtdorf)
You may not know how the end looks but you can focus on riding your wave of potential now.
HTML Books on my Palm Pilot
Installing and using Linux
Starting small businesses
Logic Classes
Math Classes
Stat Classes
Teaching
Technical Training
Problem solving job
Professor
Consultant
Personal
School
How can I build a data science career?
“You cannot connect the dots looking forward; you can only connect them looking backward.”
Steve Jobs
Are online training courses worth it? Often, Yes.
Can they get me a job? Often, NO.
Do the trainings, but also work for someone, even if it is for free. Find a friend or a company that needs data work and offer your time. Practice your skills on real problems for real people. Build your resume!
Understanding AI from LLM to ML
Making sure we understand the ‘artificial’ in machine-based intelligence
Are these models bushes or bridges?
A bush grows and can be trimmed to remove unwanted growth. Once it has started growing, we have less control over internal details. We can shape it but not redesign it.
Bridges are engineered from specific parts. We can see and understand how to remove or add parts to shape them. If we don’t like elements, we understand how to remove them and can.
Every time a new LLM is developed it is a unique growth. Technicians must get involved to ‘trim’ the model through reinforcement.
Bush: unique creation, explainable in abstraction but not in detail.
Bridge: reproducible creation; its abstraction is defined by its detail.
Statistical, Machine Learning, and AI continuum
Linear Regression
Machine Learning Models
LLMs
Are these models just parrots?
Stochastic Parrots
Stochastic: determined with probability
Text generated by an LM is not grounded in communicative intent, any model of the world, or any model of the reader’s state of mind. It can’t have been, because the training data never included sharing thoughts with a listener, nor does the machine have the ability to do that.
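To make "determined with probability" concrete, here is a minimal sketch of a toy parrot: a bigram sampler that picks each next word in proportion to how often it followed the previous word in a tiny training text. The corpus and the function names (next_word, parrot) are made up for illustration. A real LLM is far larger and uses a neural network rather than a count table, but the final step of sampling from a probability distribution over next tokens is similar in spirit.

```python
import random
from collections import Counter, defaultdict

# Tiny, made-up training text; a real LLM trains on vastly more data.
corpus = "the data tells a story and the data needs a translator".split()

# Count which word follows which (a bigram table).
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1


def next_word(word, rng):
    """Sample the next word in proportion to how often it followed `word`."""
    options = follows[word]
    if not options:
        return None
    words = list(options)
    weights = [options[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def parrot(start, length, seed=0):
    """Generate up to `length` words by repeated probabilistic sampling."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < length:
        word = next_word(out[-1], rng)
        if word is None:  # no observed continuation for this word
            break
        out.append(word)
    return " ".join(out)


print(parrot("the", 6))
```

The output can look fluent because every word pair was seen in training, yet nothing in the table represents intent or meaning. Changing the seed changes the sentence.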
Like the ‘speech’ of a parrot, the output of LLMs
How do LLMs use text as an input?
How do the tokens work in the model?
The puzzle is understanding sentences and generating new ones.
A tokenizer breaks the sentences down into smaller parts called "subwords" (tokens). Inside the transformer, little solvers called "attention heads" each weigh how strongly every token relates to the others.
The model leverages the encoded representation to produce contextually appropriate text as a continuation of the conversation.
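As a rough sketch of how text becomes model input, the code below splits words into subwords with a greedy longest-match rule and maps each piece to an integer id. The vocabulary, the ids, and the function names (tokenize, encode) are hypothetical; production tokenizers learn their vocabularies from data and use more refined algorithms, so treat this as an illustration of the idea rather than how any specific LLM does it.

```python
# Hypothetical subword vocabulary; real tokenizers learn tens of thousands of pieces.
VOCAB = {"<unk>": 0, "un": 1, "happi": 2, "ness": 3, "data": 4, "s": 5, "cience": 6}


def tokenize(word, vocab=VOCAB):
    """Split a word into the longest matching subwords, left to right."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest possible piece first, then shrink.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append("<unk>")  # no known piece starts at this character
            i += 1
    return pieces


def encode(text, vocab=VOCAB):
    """Turn text into the integer ids a model would actually receive."""
    return [vocab[p] for w in text.lower().split() for p in tokenize(w, vocab)]


print(tokenize("unhappiness"))
print(encode("Data science"))
```

The model never sees letters or words, only these ids, which it then turns into vectors that the attention heads compare with one another.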
How do ML and statistical modeling use data?
Data work starts by thinking about the many varied variables measured on our decision unit. Models are specific to the variables used. Data is curated, not simply transformed into tokens.
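For contrast with token-based models, here is a minimal sketch of a statistical model built on curated variables: ordinary least squares with one predictor. The decision unit (a store-week), the variables (ad spend and sales), and the numbers are invented for illustration. Every part of the fitted model can be read directly, which is the "bridge" end of the continuum.

```python
# Hypothetical curated data: one row per decision unit (a store-week),
# with one measured variable (ad spend) and one outcome (sales).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(ad_spend, sales)
print(f"fitted: sales = {intercept:.2f} + {slope:.2f} * ad_spend")
```

The model only knows about the variable it was given. Adding or dropping a variable means fitting a different model, which is what "models are specific to variables used" looks like in code.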
What are the costs of LLMs?
Some projects from my history.
Open discussion time to talk about getting into data science.
Example: Cleaning up dirty building data (real building data)
Stories that might be worth sharing
Visuals as Communication
Data speaks through visuals. You are the translator.
We give data a voice.
Be the translator; don’t hide the nuance
Crafting the comparisons
Providing understanding with visuals
The essential point is to make intelligent and appropriate comparisons. Visual displays, if they are to assist thinking, should show comparisons.
Edward Tufte
What should I do next?
Walk like a data scientist, talk like a data scientist, work like a data scientist
Begin the journey
Exemplify your skills; school is secondary
Use college to build skills; don’t worry about the degree.
Broaden your tools
IDE = Integrated Development Environment
Use the BYU-I Data Science Courses
All materials are public and free for anyone to use for learning.
Explore the quickskilling GitHub Org.
All materials are public and free for anyone to use for learning.
https://github.com/quickskilling
Add programming as a daily ritual
Wake up, pray, write some code, then go to work.
Learning your first programming language can be as hard as learning another spoken language. It can take years to become fluent, but you can start to communicate in months.
Questions and Answers
Prof-J.com