1 of 17

Bash Scripts and Make

BIOS 611: Introduction to Data Science

Instructor Matt Biggs

2 of 17

Overview of today

  • Logistics:
    • Status of project grades
    • Project 2 concept README.md due today!
    • Homework 4 (Shiny Dashboard) due next class.
  • Bash
  • Remote servers

BIOS 611

3 of 17

Review

  • Shiny


4 of 17

Objectives

  • Students will be familiar with, and have executed, several basic Bash commands
  • Students will have logged into a remote server successfully


5 of 17

Remote servers & VCL demonstration

The VCL is a UNC computer platform available to students. Take a moment to reserve an instance now so that it is ready by the time we reach the exercise.

  1. Reserve an instance of “base_datasci611” using the instructions
  2. Log in


6 of 17

Project 2 “concept README”

Populate the README.md file in your project with a description of what you intend to do.

The purpose is to get you thinking about the project logistics.

Describe:

  • The overall objective (what questions do you hope to answer?)
  • The audience (who are you doing this for?)
  • The data sets you will use
  • Some analysis approaches and figures you will try first
  • How you will incorporate the interactivity of a Shiny dashboard


7 of 17

Unix Bash

Bash is a command-line shell and scripting language, and the default shell on many Unix-like systems such as Linux.

You type Bash commands into the “Terminal”.

Why interact with a computer in this “archaic” way?

  1. You can convert a series of commands into a script!
    1. Scripts are reproducible, transparent, and make it easy to run the same commands on new data.
  2. The command line gives you access to more software tools and more flexibility
  3. You can interact with remote servers
  4. You will look smart
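To illustrate the first point, here is a minimal sketch of a series of commands wrapped up so they can be rerun on any file. The function name `summarize` and the file `data.csv` are made up for this example, and the demo data is generated on the spot so the sketch runs anywhere.

```shell
#!/usr/bin/env bash
# Hypothetical example: summarize and data.csv are illustrative names only.
set -euo pipefail

# Print the line count and the first line of every file passed in.
summarize() {
  local file
  for file in "$@"; do
    # tr strips the padding that some versions of wc add around the count
    echo "${file}: $(wc -l < "$file" | tr -d ' ') lines"
    head -n 1 "$file"
  done
}

# Demo on a throwaway file in a temporary directory.
workdir=$(mktemp -d)
printf 'id,value\n1,10\n2,20\n' > "$workdir/data.csv"
summarize "$workdir/data.csv"
```

Calling `summarize` on a different file repeats exactly the same steps on new data, which is the reproducibility benefit described above.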


9 of 17

Essential Bash Commands

pwd - where am I?

cd - “change directory”

../ - the parent directory (cd ../ moves up one level)

~ - home directory

Tab - autocomplete a command or file name

ls - list files in current directory

Ctrl-C - interrupt (kill) the running process

head - print first few lines of a file

tail - print last few lines of a file

touch - create a new, empty file (or update the timestamp of an existing one)

mkdir - create a new directory

rm - delete files

mv - move/rename files

cp - copy files
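The commands above can be tried together in a short session. This is a sketch only: the directory and file names are invented, everything happens inside a temporary directory, and it borrows `>` (covered on the next slide) and `seq` to put some lines in the file.

```shell
# A scratch directory keeps the demo away from real files.
scratch=$(mktemp -d)
cd "$scratch"
pwd                       # where am I?

mkdir project             # create a new directory
cd project                # move into it
touch notes.txt           # create an empty file
ls                        # lists notes.txt

# Fill the file with the numbers 1 to 10, one per line.
seq 1 10 > notes.txt
head -n 3 notes.txt       # first three lines: 1, 2, 3
tail -n 2 notes.txt       # last two lines: 9, 10

cp notes.txt backup.txt   # copy
mv notes.txt numbers.txt  # rename
rm backup.txt             # delete (there is no undo)
cd ../                    # back up one directory
```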


10 of 17

More Essential Bash Commands

echo - display characters

cat - print contents of file and/or concatenate

vim/nano - text editors

> - redirect output into a file (overwrites the file; >> appends instead)

sudo - run a command with admin privileges

sort - order rows

uniq - collapse adjacent duplicate lines (sort first to keep only unique values)

man <cmd> - manual page for a command

There is a whole universe of other commands to learn! Tools for editing files, parsing spreadsheets, checking CPU usage, running programs in parallel, editing images, sending emails, and many other tasks.

Get comfortable with the basics, then learn more as you progress.

Bash commands facilitate automation.
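As a small illustration of chaining these commands, the sketch below builds a file with `echo` and redirection, then removes duplicate lines. The file names are invented for the example, and the pipe (`|`), which feeds one command's output into the next, is an addition not listed above.

```shell
workdir=$(mktemp -d)
cd "$workdir"

echo "banana" > fruit.txt     # > creates or overwrites the file
echo "apple" >> fruit.txt     # >> appends to it
echo "banana" >> fruit.txt
cat fruit.txt                 # print the three lines

# uniq only collapses adjacent duplicates, so sort first.
sort fruit.txt | uniq > unique_fruit.txt
cat unique_fruit.txt          # apple, then banana
```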


11 of 17

Vim basics

vim file_name - enter vim editor

“i” - to enter “insert” mode

Esc - to exit “insert” mode

:wq - save changes and quit


12 of 17

Remote servers

Most “industrial scale” data science is done “in the cloud”, on computers located somewhere else. But why? What problems does cloud computing solve?

Some advantages:

  • You don’t have to buy the computer, take care of it, keep it up to date, or get new power lines routed to your building
  • You can pay for as much or as little computing power as you need at any given time


13 of 17

Remote servers

Skills needed to use remote computers:

  • SSH - a protocol for connecting to other computers
  • SCP - a protocol for moving data between computers
  • A typical remote workflow:
    • Reserve a computer
    • Choose an operating environment
    • Log in remotely
    • Access your data and code
    • Run analysis
    • Download your results
    • Log out
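The log-in and download steps of this workflow might look like the sketch below. The user name, host name, and `results.csv` are hypothetical placeholders, not real VCL values. Because a real login needs a reserved machine, the sketch defaults to a dry run that only prints the commands; the `run` helper and `DRY_RUN` variable are part of this example, not standard tools.

```shell
# Hypothetical placeholders: replace with your own login and reserved host.
REMOTE_USER="your_username"
REMOTE_HOST="remote.example.edu"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

# Helper for this sketch: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Log in remotely (opens an interactive session on the server).
run ssh "$REMOTE_USER@$REMOTE_HOST"

# After logging out, copy a results file from the server's home directory
# into the current local directory.
run scp "$REMOTE_USER@$REMOTE_HOST:results.csv" .
```

Note that `scp` is run from your own computer, not from inside the remote session.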


14 of 17

Bash demonstration

  1. Log into VCL
  2. Figure out where you are using pwd
  3. Make a directory and navigate into it
  4. Make a file named hw.txt containing the phrase “Hello world!”
  5. Print the contents of the file using cat, head, and tail
  6. Edit the file using vim to say “Hello world! Bash is the best!”
  7. Make a new subdirectory and move the file into it
  8. Make a copy of the file in the parent directory and name the new version “copy_hw.txt”
  9. Delete the sub-directory and the original file inside it
  10. Log out of the VCL and use scp to copy “copy_hw.txt” to your local desktop
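One possible way through steps 2 to 9 is sketched below, run in a temporary directory standing in for the VCL home directory. Several choices here are assumptions: the directory names `demo` and `sub`, placing the copy outside the sub-directory so it survives step 9, and replacing the interactive vim edit with an `echo` that gives the same end result. Step 10 is left out because it needs a real host name.

```shell
cd "$(mktemp -d)"               # stand-in for your VCL home directory
pwd                             # step 2: where am I?

mkdir demo                      # step 3: make a directory and enter it
cd demo

# Single quotes stop an interactive shell from treating ! specially.
echo 'Hello world!' > hw.txt    # step 4

cat hw.txt                      # step 5: three ways to print the file
head hw.txt
tail hw.txt

# Step 6 is done interactively in vim; this line has the same end result.
echo 'Hello world! Bash is the best!' > hw.txt

mkdir sub                       # step 7: new sub-directory, move the file in
mv hw.txt sub/

cp sub/hw.txt copy_hw.txt       # step 8: the copy lives outside sub/

rm -r sub                       # step 9: -r removes sub/ and the hw.txt inside it
```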


15 of 17

QUIZ

16 of 17

Remote servers & VCL exercise

The VCL is a UNC computer platform available to students.


17 of 17

Bash exercise (as always, feel free to help each other)

  • Log into VCL
  • Figure out where you are using pwd
  • Make a directory and navigate into it
  • Make a file named hw.txt containing the phrase “Hello world!”
  • Print the contents of the file using cat, head, and tail
  • Edit the file using vim to say “Hello world! Bash is the best!”
  • Make a new subdirectory and move the file into it
  • Make a copy of the file in the parent directory and name the new version “copy_hw.txt”
  • Delete the sub-directory and the original file inside it
  • Log out of the VCL and use scp to copy “copy_hw.txt” to your local desktop
