1 of 41

Careers in Data Science and Analytics

What is analytics, and who wants it?

J. Hathaway, Program Chair of Data Science @ Brigham Young University-Idaho

Connect: LinkedIn

2 of 41

J. Hathaway

For the first decade of my career, I led applied research in statistics, climate model integrations, and sampling designs for the U.S. Department of Energy. Over the past ten years, I have had the opportunity to build and teach data science programs at Brigham Young University-Idaho. Throughout this period, I also founded and operated businesses in advertising, retail sales, and private consulting, as well as a Data Wrangling as a Service (DWaaS) company. Currently, I am the co-owner of DataThink and a full-time faculty member at BYU-Idaho. With over 20 years of experience writing code, I have focused my work on delivering data-driven solutions.


3 of 41

J. Hathaway

1995: Graduated High School

1998: Got married

1999: First child (Top, far right)

2005: Grad School to PNNL

2015: PNNL to BYU-I

2017: Data Science program starts and last child born (Top, middle)


4 of 41

Exploring Pathways

First in my family (parents, aunts, uncles, siblings, ancestors) to graduate from college and the only one to complete graduate school.

You may not know how the end looks, but you can focus on riding your wave of potential now.

5 of 41

Outline

“You cannot connect the dots looking forward; you can only connect them looking backward.”

Steve Jobs (& Dieter F. Uchtdorf)

  1. What is data?
  2. What is analytics?
  3. What should I do next?

6 of 41

What is data?

According to a data scientist

7 of 41

School Data

Business Data

What is data?

For data analytics, data is anything stored digitally that can be processed with software or programming.

  • School data is packaged nicely
  • Business data is raw and from the wild

8 of 41

From “raw” data to features.

Finding joy in the mess before the feast

  • Raw: transactions, text, movements, images
  • Features: organized summaries, totals, means, counts
  • Answers: model outputs, visuals, tables

ML/AI

9 of 41

Data Example: Customer purchases

Raw

Daily sales of perfume as shown on your payment service.

Features

Customer summary tables with information like:

  • Date of first purchase
  • Total spent each year
  • Most purchased product
  • etc.

Answers

What scent should I offer to client A next month to get the most sales?
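The step from raw purchases to a customer summary table can be sketched in a few lines of Python. This is a minimal illustration, not the method used in any real project: the transaction rows, the customer names, and the `build_customer_features` helper are all hypothetical, and only the standard library is used.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical raw transactions: (customer, purchase date, product, dollars).
raw_sales = [
    ("client_a", date(2023, 3, 14), "rose", 40.0),
    ("client_a", date(2023, 11, 2), "cedar", 55.0),
    ("client_a", date(2024, 1, 20), "rose", 42.0),
    ("client_b", date(2024, 2, 5), "citrus", 30.0),
]


def build_customer_features(sales):
    """Summarize raw transactions into one feature row per customer."""
    by_customer = defaultdict(list)
    for customer, day, product, amount in sales:
        by_customer[customer].append((day, product, amount))

    features = {}
    for customer, rows in by_customer.items():
        spent_by_year = defaultdict(float)
        for day, _, amount in rows:
            spent_by_year[day.year] += amount
        product_counts = Counter(product for _, product, _ in rows)
        features[customer] = {
            "first_purchase": min(day for day, _, _ in rows),
            "total_spent_by_year": dict(spent_by_year),
            "most_purchased_product": product_counts.most_common(1)[0][0],
        }
    return features


customer_features = build_customer_features(raw_sales)
print(customer_features["client_a"])
```

Each entry in `customer_features` is one row of the "Features" table. A model that suggests next month's scent would start from rows like these, not from the raw transaction log.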

10 of 41

Data Example: Pictures

Raw

Shared pictures from chicken customers

Features

The camera typically generates metadata:

  • Aperture
  • Resolution
  • Focal length
  • Shutter speed
  • ISO speed
  • Camera brand
  • Camera model
  • Time created
  • GPS location

Answers

Where are my wealthy customers and how healthy are their chickens?

11 of 41

What is data analytics?

Help me understand data science and business analytics.

12 of 41

Chefs vs Cooks

Data Science vs Business Analytics

A chef is an individual who is trained to understand flavors and cooking techniques, to create recipes from scratch with fresh ingredients, and to carry a high level of responsibility within a kitchen.

A cook is an individual who follows established recipes with standard ingredients to prepare food.

Data Analytics: My term to cover both data science and business analytics.

13 of 41

What is data science and business analytics?

Hierarchy of tools

We take business requirements from a data-supported need and translate them into CODE and analytics to create profit-supporting solutions.

[Figure: hierarchy of tools, ranked from low to high]

14 of 41

Which tool?

[Figure: tools compared by difficulty and pay (low to high), remote vs. local]

15 of 41

The soft skills that support analytics

Curiosity

Want to help people with their problems

Love to build connections

16 of 41

Data Science: Making sure that data and professionals understand each other

A data scientist brings the data into executive decision making. We must learn to

  • speak data
  • convert business needs to programming
  • use data visualization
  • communicate in business terms

17 of 41

Where do we start?

Depending on your entry point, your journey to a data analytics profession will vary.

A tool-centric view

I started here

18 of 41

Start where you are

“You cannot connect the dots looking forward; you can only connect them looking backward.”

Steve Jobs (& Elder Uchtdorf)

You may not know how the end looks, but you can focus on riding your wave of potential now.

HTML Books on my Palm Pilot

Installing and using Linux

Starting small businesses

Logic Classes

Math Classes

Stat Classes

Teaching

Technical Training

Problem solving job

Professor

Consultant

Personal

School

19 of 41

How can I build a data science career?

“You cannot connect the dots looking forward; you can only connect them looking backward.”

Steve Jobs

Are online training courses worth it? Often, Yes.

Can they get me a job? Often, NO.

Do the trainings, but also work for someone, even if it is for free. Find a friend or a company that needs data work and offer your time. Practice your skills on real problems for real people. Build your resume!

20 of 41

Understanding AI from LLM to ML

Making sure we understand the ‘artificial’ in machine-based intelligence

21 of 41

Are these models bushes or bridges?

A bush grows and can be trimmed to remove unwanted growth. Once it has started growing, we have less control over internal details. We can shape it but not redesign it.

Bridges are engineered from specific parts. We can see and understand how to remove or add parts to shape them. If we don’t like elements, we understand how to remove them and can.

Every time a new LLM is developed, it is a unique growth. Technicians must get involved to ‘trim’ the model through reinforcement.

22 of 41

Unique creation, explainable in abstraction but not in detail.

Reproducible creation, its abstraction is defined by its detail

Statistical, Machine Learning, and AI continuum

Linear Regression

LLMs

Machine Learning Models

23 of 41

Are these models just parrots?

Stochastic Parrots

Stochastic: determined with probability

Text generated by an LM is not grounded in communicative intent, any model of the world, or any model of the reader’s state of mind. It can’t have been, because the training data never included sharing thoughts with a listener, nor does the machine have the ability to do that.

Like the ‘speech’ of a parrot, the output of LLMs

  1. involves repetition without understanding,
  2. has some probabilistic, generative component, and
  3. is very much unlike what humans do or produce.
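The word "stochastic" can be made concrete with a toy text generator. The sketch below is a bigram model, which is far simpler than any LLM and is swapped in here only to show repetition driven by probability. The training sentence and the `parrot` helper are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical training text; a real LLM trains on vastly more data.
training_text = "the cat sat on the mat and the dog sat on the rug"


def build_bigram_table(text):
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    table = defaultdict(list)
    for current_word, next_word in zip(words, words[1:]):
        table[current_word].append(next_word)
    return table


def parrot(table, start, length, seed=0):
    """Generate words by repeatedly sampling a follower of the last word."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(length - 1):
        followers = table.get(output[-1])
        if not followers:  # this word never had a follower in training
            break
        output.append(rng.choice(followers))
    return output


bigram_table = build_bigram_table(training_text)
generated = parrot(bigram_table, start="the", length=8)
print(" ".join(generated))
```

Every pair of neighboring words in the output already appeared in the training text. The generator repeats what it has seen, with chance deciding which continuation comes next.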

24 of 41

How do LLMs use text as an input?

Tokens

(numbers mapped to characters)
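A minimal sketch of the idea, assuming one number per character. Production tokenizers usually map subwords rather than single characters, so the `build_vocab`, `encode`, and `decode` helpers below are illustrative only and do not reproduce any real model's vocabulary.

```python
def build_vocab(text):
    """Assign each distinct character an integer ID (sorted for stability)."""
    return {char: idx for idx, char in enumerate(sorted(set(text)))}


def encode(text, vocab):
    """Turn text into a list of token IDs."""
    return [vocab[char] for char in text]


def decode(token_ids, vocab):
    """Turn token IDs back into text."""
    id_to_char = {idx: char for char, idx in vocab.items()}
    return "".join(id_to_char[i] for i in token_ids)


sentence = "data speaks"
vocab = build_vocab(sentence)
token_ids = encode(sentence, vocab)
print(token_ids)
print(decode(token_ids, vocab))
```

The model only ever sees the list of integers. The mapping back to readable text happens outside the model.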

25 of 41

How do the tokens work in the model?

The puzzle is understanding sentences and generating new ones.

A tokenizer breaks the sentences down into smaller parts called "subwords." Inside the transformer, little solvers called "attention heads" each look across all of the subwords and weigh how much each one matters to the others.

The model leverages the encoded representation to produce contextually appropriate text as a continuation of the conversation.
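What a single attention head computes can be sketched with plain Python lists. This is a simplified illustration of scaled dot-product attention under stated assumptions: the token vectors are made up, and the same vectors serve as queries, keys, and values, whereas a trained model would first pass them through learned projections.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(queries, keys, values):
    """Scaled dot-product attention for one head, on lists of vectors."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this token against every token, scaled by sqrt(dimension).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Blend the value vectors using the attention weights.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-number vectors for three tokens; real models learn these.
token_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention_head(token_vectors, token_vectors, token_vectors)
print(weights[0])
```

Each row of `weights` shows how much one token attends to every token in the sentence. The first token gives the least weight to the second token because their vectors share nothing.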

26 of 41

How do ML and statistical modeling use data?

We start by thinking of the many varied variables measured on our decision unit. Models are specific to the variables used. Data is curated, not simply transformed to tokens.
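A linear regression, the "bridge" end of the continuum, shows what it means for a model to be specific to its variables. The sketch below fits a line by ordinary least squares. The ad spend and units sold numbers are hypothetical and chosen so the fit is exact.

```python
def fit_line(x_values, y_values):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical curated variables: weekly ad spend (x) and units sold (y).
ad_spend = [1.0, 2.0, 3.0, 4.0]
units_sold = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(ad_spend, units_sold)
print(f"units_sold = {intercept:.1f} + {slope:.1f} * ad_spend")
```

The whole model is two numbers that can be read and explained: each extra unit of ad spend goes with two more units sold in this made-up data. That level of inspectable detail is what the bridge analogy describes.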

27 of 41

What are the costs of LLMs?

28 of 41

Some projects from my history.

Open discussion time to talk about getting into data science.

29 of 41

Example: Cleaning up dirty building data (real building data)

  • Hundreds of messy Excel spreadsheets
  • Missing measurements
  • Handling human traffic
  • Dealing with weather
  • Making comparisons

30 of 41

Stories that might be worth sharing

  1. Scraping NBA data for a stats model.
  2. Mapping math to data collection (ISM)
  3. Working on teams with diverse backgrounds (first meeting for UXO)
  4. Cleaning up dirty building data (real building data)
  5. Learning regex, buying data, and munging for application
  6. Being an expert among experts (Portland DOE visitor)
  7. Helping people answer questions (Visual Sample Plan)
  8. Building a data science team to support climate science
  9. Supporting a very messy Excel project (forecasting employees)
  10. Working with real-world health records data
  11. Helping small companies automate

31 of 41

Visuals as Communication

Data speaks through visuals. You are the translator.

32 of 41

Data Science: Making sure that data and professionals understand each other

A data scientist brings the data into executive decision making. We must learn to

  • speak data
  • convert business needs to programming
  • use data visualization
  • communicate in business terms

33 of 41

We give data a voice.

Be the translator; don’t hide the nuance.

34 of 41

Crafting the comparisons

Providing understanding with visuals

The essential point is to make intelligent and appropriate comparisons. Visual displays, if they are to assist thinking, should show comparisons.

Edward Tufte

35 of 41

What should I do next?

Walk like a data scientist, talk like a data scientist, work like a data scientist

36 of 41

Begin the journey

Exemplify your skills; school is secondary.

Use college to build skills; don’t worry about the degree.

37 of 41

Broaden your tools

  • Python (VS Code & Python Extension)
    • Pandas/Polars
    • scikit-learn
    • Streamlit
  • R (RStudio)
    • Tidyverse
    • Tidymodels
    • Shiny
  • JavaScript (Observable)
    • Observable Notebooks
    • Observable Plot
    • Arquero

IDE = Integrated Development Environment

38 of 41

Use the BYU-I Data Science Courses

All materials are public and free for anyone to use for learning.

39 of 41

Explore the quickskilling GitHub org.

All materials are public and free for anyone to use for learning.

https://github.com/quickskilling

  • Polars
  • Shiny
  • Streamlit
  • Docker

40 of 41

Add programming as a daily ritual

Wake up, pray, write some code, then go to work.

Learning your first programming language can be as hard as learning another spoken language. It can take years to get fluent, but you can start to communicate in months.

41 of 41

Questions and Answers

  • Open time for questions
  • If you have a question for me and don’t have my WhatsApp, then use questions@prof-j.com

Prof-J.com