From CSV to Google Sheets to R
A guide to getting your data online and into R
First, download your data to your computer
All webpages will have different links and ways to download, so you might have to navigate a bit to find out where the link is.
If you already have your data downloaded to your computer you can skip the next two slides.
Kaggle has a download link
Tucson data you need to navigate to the data tab
Then you can click download and select spreadsheet
Unzip if needed
If your data is already in an excel/csv format, you’re good
If your data is in a zipped folder, navigate to it in your explorer and unzip it.
This will create another folder with the spreadsheet(s) inside it
Upload it to Google sheets
Go here: https://docs.google.com/spreadsheets/u/0/
You should have a blank spreadsheet
You should then have a screen pop up where you can select the upload tab
You might get this box. If you do, just change this button to No and then click Import Data.
Now our data are up on Google Sheets and accessible from anywhere!
Now our data are up on Google Sheets and accessible from anywhere!
To get them into R we first need to get a shareable link. Click Share
We need to change who it’s being shared with, so click Advanced down here
Now click Change up here. If it already says Public and not Private then you should be good.
After clicking change you should see this menu. Select the Public on the web option and click Save.
After clicking Save you’ll see this menu. Copy the share link and paste it into a new R studio script
Pasted Link from Sheets
New R Script
Now you need to convert the import code
We’re going to use the function read_csv() to import directly from google. But first, we need to make a link to the spreadsheets location on google.
The first step is to take the bolded section of the share link you copied from Google Sheets
Then paste that into this link replacing the underlined section
Now put that converted link into quotes within the read_csv() function
Take copy everything after the d/ and up to the next /
And replace the whole underscored section here
Your final link should look like this. Note that I’m assigning the imported data to the objected my_df. You are free to call it something else.
After that’s done delete the other stuff and add the libraries you need to the top of your script.
Running my script results in the data successfully being brought into R! As you can see, we got a ton of stats related to football/soccer players.
Of course, our first real check that everything imported correctly is a quick glimpse() of the data.
Things look good overall, but clearly some cleaning needs (e.g removing symbols from the Wage column) to be done to make the data useable. Time to apply what you learned in class!