1 of 29

Stata Fundamentals

Parts 1-3

Updated Feb 2025

2 of 29

Introduction to the D-Lab

dlab.berkeley.edu

  • Workshop schedule
  • Consultation
    • Drop-in Hours
    • Request ticket
  • Sign up for our newsletter

3 of 29

In preparation for Part 1, please download and install Stata.

4 of 29

Downloading Stata

To download the installation materials, please visit https://download.stata.com with the following credentials:

Username: [PROVIDED IN YOUR INVITATION]

Password: [PROVIDED IN YOUR INVITATION]

Serial number: [PROVIDED IN YOUR INVITATION]

For assistance with installation, you may view the Stata installation guide at: https://www.stata.com/install-guide

If you have difficulty installing the program, feel free to contact Stata's technical services department at tech-support@stata.com. Be sure to reference the Stata serial number listed in the section below.

6 of 29

Installing Stata using the provided license information

If you currently have Stata installed, you do not need to reinstall Stata.

However, if you would like to use the trial license we are providing for Stata/MP 18, feel free to install it.

Please do not share this license. Note that it is a trial license and will expire next month.

Licensed software: Stata/MP 18 (2 cores)

License type: 60-user Network

License term: Expires 10 March 2025

Serial number: 501809419572

Code: 4r4e nni6 tg2h wbpz 9fmn oqg7 znay j4o2 iwa3 t

Authorization: ytoi

7 of 29

Downloading the Workshop Materials

To download the repository from GitHub, click the green button in the top right-hand corner that says "Code" and then select "Download ZIP". You can then unzip the contents of the downloaded folder somewhere accessible on your local computer (we recommend your Desktop).

If you are a Git user, simply clone this repository by opening a terminal and typing: git clone https://github.com/dlab-berkeley/stata-fundamentals.git

We will take a few minutes at the start of the workshop to make sure everyone has Stata installed and the workshop materials downloaded. Please feel free to email dlab-frontdesk@berkeley.edu or visit our help desk at https://dlab.berkeley.edu/frontdesk if you have any questions.

8 of 29

Troubleshooting download, installation or device issues

If you have a Berkeley CalNet ID, attend the workshop anyway. Until you resolve the problems with your local installation, you can use Stata in the cloud through the UC Berkeley Library Citrix Service: https://guides.lib.berkeley.edu/citrix/stata

9 of 29

Part 1: Introduction

  • Loading datasets into Stata (no previous knowledge expected)
  • Examining a dataset and finding variables of interest
  • Summarizing and tabulating variables
  • Stata-specific tools and resources (do files, logs, help files, etc.)
  • Coding and cleaning data (making new variables from old variables; labeling variables and values, etc.)
  • Using logical operators in Stata
  • Cross-tabulations

10 of 29

Stata Interface

Output Window

Do File

11 of 29

Setting up a working directory

Where do we want Stata to go?

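The Computer is like The City.

The Working Directory is like The Address: just as an address (D-Lab, 350) tells a visitor where to go, the working directory (Stata Fundamentals) tells Stata where to look for files.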
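As a minimal sketch of setting a working directory, the commands below check where Stata is currently pointed and then move it to the workshop folder. The folder path is hypothetical; substitute the location where you unzipped or cloned the materials.

```stata
* Show the current working directory
pwd

* Point Stata at the workshop folder.
* Hypothetical path: replace it with your own location.
cd "C:/Users/yourname/Desktop/stata-fundamentals"

* List the files Stata can now open without a full path
dir
```

Once the working directory is set, files in that folder can be opened by name alone.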

12 of 29

The Dataset: NLSW 88

13 of 29

Describe the Variable List

Let’s break it down
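A short sketch of how the variable list can be produced. It assumes the copy of the NLSW 88 data that ships with Stata as an example dataset; the workshop may provide its own file instead.

```stata
* Load the NLSW 88 example data shipped with Stata
* (assumption: the workshop copy has the same variables)
sysuse nlsw88, clear

* List every variable with its storage type, display format,
* value label name, and variable label
describe

* Restrict the listing to a few variables of interest
describe age race wage
```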

14 of 29

Variable Label vs. Value Label
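The difference can be sketched with the labeling commands themselves. This example assumes the NLSW 88 example data shipped with Stata; the label text and the label name married_yn are hypothetical choices made for illustration.

```stata
sysuse nlsw88, clear

* A variable label describes the variable as a whole
label variable collgrad "Graduated from college"

* A value label maps each numeric code to text:
* define the mapping first, then attach it to a variable
label define married_yn 0 "Not married" 1 "Married"
label values married married_yn

* The table shows the value labels in place of the raw codes
tabulate married

* The nolabel option shows the underlying numeric codes
tabulate married, nolabel
```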

15 of 29

Storage Types in Stata: Numeric

float: Default - about 7 digits of accuracy

double: About 16 digits of accuracy. Demands more memory - used by banks for financial accuracy
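A small sketch of why the storage type matters, using a made-up nine-digit number that exceeds what the default type can hold exactly. The variable names are hypothetical.

```stata
clear
set obs 1

* float is the default type and keeps about 7 digits
generate float id_float = 123456789

* double keeps about 16 digits at twice the storage (8 bytes vs. 4)
generate double id_double = 123456789

* Show both without scientific notation; the float copy is rounded
format id_float id_double %12.0f
list id_float id_double

* 0.1 stored as a float is not equal to the double-precision literal 0.1,
* so this comparison displays 0 (false)
display float(0.1) == 0.1
```

For long identifiers or amounts where every digit counts, declaring the variable as double avoids this rounding.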

16 of 29

Storage Types in Stata: String

A sequence of characters

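As an illustration, digits stored as a string are still characters, not numbers. The sketch below builds a tiny made-up dataset; the variable names and the value are hypothetical.

```stata
clear
set obs 3

* A string variable holds characters, even when they look like digits
generate str5 zip = "94720"

* Arithmetic on a string fails; create a numeric copy first
destring zip, generate(zip_num)

* Compare the storage types of the two variables
describe zip zip_num
```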

18 of 29

Reminder

Stata signifies a missing value in a numeric variable with a period “.”

It is important to remember that Stata treats a numeric missing value as larger than any nonmissing number, as if it were positive infinity.

For example, ordered from Lowest Value -> Highest Value:

age: 34, 35, 36, .

race: 1, 2, 3, .

This is important to remember for using conditional statements, creating new variables, etc.
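The effect on conditional statements can be sketched as follows. The example assumes the NLSW 88 example data shipped with Stata and uses its hours variable, which is assumed to contain some missing values; the new variable name overtime is hypothetical.

```stata
sysuse nlsw88, clear

* Numeric missing (.) counts as larger than any number, so this
* condition also picks up observations whose hours are missing
count if hours > 40

* Exclude missing values explicitly
count if hours > 40 & !missing(hours)

* The same caution applies when creating variables: without the
* if qualifier, missing hours would be coded as overtime
generate overtime = (hours > 40) if !missing(hours)
```

If the two counts differ, the difference is the number of observations with missing hours.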

19 of 29

Part 2: Data Analysis in Stata

  • Correlation
  • T-tests
  • Visualization (histograms, bar graphs, scatter plots)
  • Regression postestimation (getting predicted values, basic graphs)
  • Merging and appending datasets

23 of 29

Part 3: Stata Programming

  • Local and global variables (macros)
  • Looping (foreach, forvalues)
  • Reshaping data between wide and long formats
  • Recalling and using command output
  • Generating nicely formatted journal-style tables
