1 of 38

Introduction to File Management

Joe Ameen

sameen@ucmerced.edu

UC Merced Library

2 of 38

Why is file management important?

3 of 38

Why is file management important?

  • It saves time: you won’t have to hunt for files, redo work, or restore from backups.
  • We will want these files later.
  • You will be managing data in one form or another for a long time.
  • It helps with:
    • Information sharing
    • Recovering from disaster
    • Publishing
    • Preservation

4 of 38

5 of 38

6 of 38

A Bad Example

7 of 38

Workshop Overview

  • Folder Structure
  • File Naming
  • Versioning
  • Tagging
  • Wrap Up

8 of 38

Benefits of Good File Management Practices

  • Makes locating and identifying specific files easier
  • Decreases risk of losing content
  • Decreases likelihood of duplicate and out-of-sync versions
  • Makes data collection and organization run smoothly
  • Supports research collaborations and group projects

9 of 38

Who is file management for?

You

Future You

Your PI/professor

Future research group members

Current team

Funding agency

10 of 38

What works best for you?

  • The best strategy won’t work if you won’t maintain it
  • Are there constraints? (Budget, cloud connectivity, collaboration)
  • Do you have a preference?

11 of 38

Folder Structure

12 of 38

Hierarchical Folder Structures

Use a hierarchical structure to organize folders & subfolders

  • Create a series of top level folders and subfolders

Broad ------------------------->narrow

  • Group similar items together

Organize by subject, purpose, date, file type, etc.

13 of 38

Hierarchical Folder Structures Best Practices

  • Be consistent
  • Don’t let your folders get too big
  • Don’t let your structure get too deep (recognition vs. recall)
  • Avoid duplicating categories
  • Utilize working, final, and archive folders if appropriate

MIT Libraries. Data management and publishing: Organize your files. Massachusetts Institute of Technology. Retrieved from http://libraries.mit.edu/data-management/store/organize

14 of 38

Folder Organization

File characteristics to pay attention to: function, content, history, format, authorship/ownership, origin

Organize files within subfolders in a way that is intuitive and follows the narrative of the assignment, project, data analysis, etc.

[Research area] / [Project] / [Data or documentation] / [Date]

[Project] / [Sub-project] / [Experiment] / [Instrument] / [Date]

[Project] / [Type of file] / [Data collector name] / [Date]

[Semester] / [Class Subject] / [Class number] / [Assignments]

[Semester] / [Class Subject] / [Class number] / [Readings]
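
The bracketed templates above can be turned into real folders with a few lines of Python. This is only a minimal sketch: the helper name build_folder_path and the example values (2015_Spring, Psych, Psych101) are hypothetical, and a temporary directory stands in for wherever you actually keep your files.

```python
import tempfile
from pathlib import Path


def build_folder_path(root, *levels):
    """Join levels from broad to narrow and create the folder if it is missing."""
    folder = Path(root).joinpath(*levels)
    folder.mkdir(parents=True, exist_ok=True)
    return folder


# Hypothetical values for [Semester] / [Class Subject] / [Class number] / [...]
with tempfile.TemporaryDirectory() as root:
    assignments = build_folder_path(root, "2015_Spring", "Psych", "Psych101", "Assignments")
    readings = build_folder_path(root, "2015_Spring", "Psych", "Psych101", "Readings")
    print(assignments.relative_to(root).as_posix())
    print(readings.relative_to(root).as_posix())
```

Because the levels are passed in order from broad to narrow, the same helper works for any of the templates listed above.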

15 of 38

Sample Files: Smoking Cessation Campaigns in CA

FILE 1

Description: Merced Co. Health Records

Date of records: 2010-2015

Accessed on: June 1, 2017

File Format: Excel

Owner: T. Means

FILE 2

Description: Pamphlet

Item number: P002

File created Date: May 2, 2017

File Format: pdf

Owner: K. Ramirez

FILE 3

Description: Pamphlet

Item number: P001

File created Date: May 2, 2017

File Format: pdf

Owner: K. Ramirez

FILE 4

Description: Print ad

Item number: 001

Creation Date: March 2, 2017

File Format: pdf

Author: T. Means

FILE 5

Description: Census Records Merced Co.

File created Date: June 2, 2017

Census year: 2010

File Format: Excel

Owner: K. Ramirez

FILE 6

Description: Madera Co. Health Records

Accessed on: June 1, 2017

Date of Records: Unsure

File Format: Excel

Owner: T. Means

16 of 38

Hierarchical Folder Structures

Benefits

  • Intuitive, logical, and familiar
  • Assists with tracking the progress of a project
  • Works well for collaboration and group projects
  • Works well for projects that have multiple components

MIT Libraries. Data management and publishing: Organize your files. Massachusetts Institute of Technology. Retrieved from http://libraries.mit.edu/data-management/store/organize

17 of 38

File Naming

18 of 38

File Naming Common Elements

File names should include content-specific or descriptive information, independent of where the files are stored:

  • Description of content
  • Name of research team/department/class
  • Date of creation/collection
  • Name of creator
  • Publication date
  • Project identifier
  • Version number

19 of 38

File Naming Examples

Order by date:

20140915_Phys121_notes.pdf

20141215_Chem115_lab1.docx

20150120_Psych101_syllabus.pdf

20150412_Psych101_notes.docx

Order by subject:

Chem115_lab1_20141215.docx

Phys121_notes_20140915.pdf

Psych101_notes_20150412.docx

Psych101_syllabus_20150120.pdf

Order by type:

Lab1_Chem115_20141215.docx

Notes_Phys121_20140915.pdf

Notes_Psych101_20150412.docx

Syllabus_Psych101_20150120.pdf

Forced order:

01_Chem115_lab1_20141215.docx

02_Phys121_notes_20140915.pdf

03_Psych101_notes_20150412.docx

04_Psych101_syllabus_20150120.pdf

Adapted from Whitmire A (2015) Northwest 5 Consortium data curation workshop slides. ScholarsArchive@OSU http://hdl.handle.net/1957/56233
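
The effect of element order on sorting can be checked with a short Python sketch. It reuses the file names from the examples above; the helper make_name and the record field names are hypothetical.

```python
def make_name(parts, extension):
    """Join name elements with underscores, in the order given."""
    return "_".join(parts) + "." + extension


# The four example files from the slide, split into their elements.
records = [
    {"date": "20150412", "course": "Psych101", "kind": "notes", "ext": "docx"},
    {"date": "20140915", "course": "Phys121", "kind": "notes", "ext": "pdf"},
    {"date": "20150120", "course": "Psych101", "kind": "syllabus", "ext": "pdf"},
    {"date": "20141215", "course": "Chem115", "kind": "lab1", "ext": "docx"},
]

# Whichever element comes first in the name decides the sort order.
by_date = sorted(make_name([r["date"], r["course"], r["kind"]], r["ext"]) for r in records)
by_subject = sorted(make_name([r["course"], r["kind"], r["date"]], r["ext"]) for r in records)

print(by_date[0])      # 20140915_Phys121_notes.pdf
print(by_subject[0])   # Chem115_lab1_20141215.docx
```

A plain alphabetical sort is enough here because the dates are written as yyyymmdd, so alphabetical order and chronological order agree.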

20 of 38

File Naming Best Practices

  • Be consistent
  • Always include the same information (e.g. date, owner, etc.)
  • Retain the order of information
  • Create names that allow you to best sort your files appropriately
  • Be consistent with date format (ISO 8601 basic format: yyyymmdd)
  • Don’t use generic file names that may conflict when moving from one location to another
  • Try to keep file names under 32 characters
    • 32CharactersLooksExactlyLikeThis.csv

Adapted from Whitmire A (2015) Northwest 5 Consortium data curation workshop slides. ScholarsArchive@OSU http://hdl.handle.net/1957/56233
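
A naming convention is easier to keep if it can be checked automatically. The sketch below assumes a date-first convention such as 20141215_Chem115_lab1.docx and treats 32 characters (not counting the extension) as the upper limit; both choices, and the function name check_name, are assumptions made for illustration.

```python
from datetime import datetime
from pathlib import Path

MAX_STEM_LENGTH = 32  # assumed limit, counted without the extension


def is_valid_date(text):
    """True if text is exactly eight digits that form a real yyyymmdd date."""
    if len(text) != 8 or not text.isdigit():
        return False
    try:
        datetime.strptime(text, "%Y%m%d")
    except ValueError:
        return False
    return True


def check_name(filename):
    """Return a list of problems found in a date-first file name."""
    problems = []
    stem = Path(filename).stem
    if not is_valid_date(stem.split("_")[0]):
        problems.append("name does not start with a valid yyyymmdd date")
    if len(stem) > MAX_STEM_LENGTH:
        problems.append("name is longer than 32 characters")
    return problems


print(check_name("20141215_Chem115_lab1.docx"))  # []
print(check_name("notes.docx"))
```

An empty list means the name passed both checks. Other rules from the list above could be added as further checks in the same function.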

21 of 38

File Naming Best Practices Continued

  • Adopt group guidelines: files should outlast their creator
  • No special characters: & , * % # ; ( ) ! @ $ ^ ~ ' { } [ ] ? < > + /
  • Instead of spaces use underscores or CamelCase
  • If using numbers, consider the scale of your project (e.g. 9 vs. 99 vs. 999)
    • Use zeros as placeholders: 009, 099, 999

* When renaming a large number of files (e.g. digital camera files) use batch renaming software

Adapted from Whitmire A (2015) Northwest 5 Consortium data curation workshop slides. ScholarsArchive@OSU http://hdl.handle.net/1957/56233
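
The rules on this slide (no special characters, underscores instead of spaces, zero-padded numbers, batch renaming) can be sketched in Python. The function names and the trip prefix are hypothetical, and the plan is only computed, not applied, so no files are touched. The sketch assumes every file name has an extension.

```python
import re


def clean_name(name):
    """Replace spaces with underscores and drop anything except letters, digits, _ - and ."""
    name = name.strip().replace(" ", "_")
    return re.sub(r"[^A-Za-z0-9_.-]", "", name)


def numbered_name(prefix, number, total, extension):
    """Zero-pad the number so names still sort correctly with `total` files."""
    width = len(str(total))
    return f"{prefix}_{number:0{width}d}.{extension}"


def plan_batch_rename(filenames, prefix):
    """Pair each old name with a new zero-padded name, without renaming anything."""
    total = len(filenames)
    plan = []
    for index, old in enumerate(sorted(filenames), start=1):
        extension = old.rsplit(".", 1)[-1].lower()
        plan.append((old, numbered_name(prefix, index, total, extension)))
    return plan


print(clean_name("Lab notes (draft) #2.docx"))  # Lab_notes_draft_2.docx
print(numbered_name("IMG", 9, 999, "jpg"))      # IMG_009.jpg
print(plan_batch_rename(["DSC_0042.JPG", "DSC_0041.JPG"], "trip"))
```

Reviewing the plan before applying it (for example with Path.rename) is a cheap safeguard against overwriting files.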

22 of 38

From Software Carpentry Foundation (2018) Data & Project Organization, Reproducible Science Curriculum. https://reproducible-science-curriculum.github.io/organization-RR-Jupyter/02-directories/.

23 of 38

Benefits of Good Naming Practices

  • Distinguishes files from each other within their containing folder
    • webnotes.docx vs. 20120105WebTeamMinutes.docx
  • Allows for files to be sorted in a logical sequence
  • Makes files easier to locate and browse, not only for the creator but also for other users
  • Prevents confusion when multiple people are working on shared files
  • Helps avoid accidentally overwriting or deleting files

24 of 38

Versioning

25 of 38

Versioning Best Practices

  • Be consistent
  • Keep raw files raw and never edit original copies
  • Avoid using confusing labels such as revision1, final1, final2, finalfinal
    • Exception: use FINAL to indicate the final version of the document
  • Pick a method of versioning that makes sense to you and any collaborators
  • Move older versions into a separate folder (Archive folder)
  • If appropriate, use a tracking facility or version control software rather than saving each version after minor changes.

http://www.data-archive.ac.uk/create-manage/format/organising-data
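
One way to combine numbered versions with an Archive folder is sketched below. The _v01, _v02 suffix scheme and both function names are assumptions chosen for illustration; the original files are moved, never edited, and the highest-numbered version stays in place.

```python
import shutil
import tempfile
from pathlib import Path


def versioned_name(stem, version, extension):
    """Build a name such as survey_v03.csv (two-digit, zero-padded version)."""
    return f"{stem}_v{version:02d}.{extension}"


def archive_older_versions(folder, stem, extension):
    """Move all but the highest-numbered version into folder/Archive."""
    folder = Path(folder)
    versions = sorted(folder.glob(f"{stem}_v[0-9][0-9].{extension}"))
    archive = folder / "Archive"
    archive.mkdir(exist_ok=True)
    for old in versions[:-1]:
        shutil.move(str(old), str(archive / old.name))
    return versions[-1] if versions else None


# Demonstration in a temporary folder with three hypothetical versions.
with tempfile.TemporaryDirectory() as tmp:
    for number in (1, 2, 3):
        (Path(tmp) / versioned_name("survey", number, "csv")).write_text("data")
    latest = archive_older_versions(tmp, "survey", "csv")
    print(latest.name)  # survey_v03.csv
    print(sorted(p.name for p in (Path(tmp) / "Archive").iterdir()))
```

Zero-padding the version number keeps v10 from sorting before v02. For work that changes often, version control software may be a better fit than numbered copies.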

26 of 38

Versioning Examples

http://www.data-archive.ac.uk/create-manage/format/organising-data

27 of 38

Benefits of Good Versioning Practices

  • Clearly differentiate between the different data files
  • Reduce the chance of accidentally losing the current or final version
  • Track the development of your work
  • Return to earlier versions if needed
  • May be a requirement of the project

http://www.data-archive.ac.uk/create-manage/format/organising-data

28 of 38

Hierarchical Structures: Criticisms

  • Too Restrictive
  • May want to have the same file in multiple places
  • Used for over 40 years and may not match current needs and technologies
  • Do people nowadays pay attention to where/how files are stored?

Seltzer, M. I., & Murphy, N. (2009, May). Hierarchical File Systems Are Dead. In HotOS.

29 of 38

Tagging

Create categories and groupings using tags

  • Assign multiple descriptive keywords to your folders and files
    • Support of tags varies between operating systems and platforms
    • Can use a color-coded system in place of or along with tags
  • Use tags with a hierarchical folder structure or a more informal folder structure

30 of 38

31 of 38

Tagging Best Practices

  • Establish and standardize your tagging system
  • Keep tags simple and short

"wri 10 research paper" is tagged as "wri 10" and "research paper"

  • If dates are relevant, create tags for month, year, semester, etc.
  • Make your tags consistent
    • singular vs. plural
    • capitalized vs. lowercase
  • Keep a master list to review periodically and delete unused tags
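
The consistency and master-list practices above can be sketched in Python. This assumes tags are kept as plain strings in a dictionary keyed by file name, which is a hypothetical stand-in for whatever tagging feature your operating system or platform provides. It handles capitalization and spacing only; singular vs. plural still needs a human decision.

```python
def normalize_tag(tag):
    """Lowercase a tag and collapse extra spaces so variants match."""
    return " ".join(tag.lower().split())


def unused_tags(master_list, tagged_files):
    """Return master-list tags that no file currently uses."""
    used = {normalize_tag(t) for tags in tagged_files.values() for t in tags}
    return sorted({normalize_tag(t) for t in master_list} - used)


# Hypothetical data: one file tagged with two short tags.
tagged_files = {"20150412_Wri10_paper.docx": ["WRI 10", "Research  Paper"]}
master_list = ["wri 10", "research paper", "lab report"]

print(normalize_tag("Research  Paper"))        # research paper
print(unused_tags(master_list, tagged_files))  # ['lab report']
```

Tags reported as unused are candidates for deletion at the next periodic review.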

32 of 38

Tagging Benefits

Benefits

  • Simple and fast
  • A combination of multiple tags typically can describe content characteristics more fully than a folder and file name
  • One of the most flexible organizational tools
  • Easier to remember than file names
  • Works well for search-based retrieval
  • Visually flag items that require attention

33 of 38

Collaboration Tips

  • Identify shared, stable places to store your files
  • Keep the number of locations small
    • The fewer places there are to find a document, the more likely it is that the file can be found
  • Keep your folders simple
    • Keeping the number of top-level folders low is key
    • Each click is a decision and commitment

34 of 38

Additional Tips

  • Have a file that you want to store in 2 or more places?
    • Create a shortcut to your file to store in additional folders
  • Working across platforms or using both online and offline locations?
    • Utilize shortcut icons and links to other locations or resources
  • Talk with colleagues on how they organize their projects
  • Recognize that your project structure will evolve throughout the project’s lifetime
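
The shortcut tip can also be scripted. The sketch below uses a symbolic link as the shortcut, which is an assumption: symbolic links behave like shortcuts on macOS and Linux, while on Windows creating them may require extra privileges. The function name add_shortcut and the example folders are hypothetical.

```python
import tempfile
from pathlib import Path


def add_shortcut(original, extra_folder):
    """Create a symbolic link to `original` inside `extra_folder` and return it."""
    original = Path(original).resolve()
    extra_folder = Path(extra_folder)
    extra_folder.mkdir(parents=True, exist_ok=True)
    link = extra_folder / original.name
    link.symlink_to(original)
    return link


# Demonstration in a temporary folder: one real file, one shortcut to it.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "Data" / "20170601_MercedCo_health.csv"
    data.parent.mkdir()
    data.write_text("example")
    link = add_shortcut(data, Path(tmp) / "Reports")
    print(link.is_symlink())  # True
    print(link.read_text())   # example
```

Only one real copy of the file exists, so there is nothing to keep in sync between the two folders.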

35 of 38

Strategies Recap

  • Hierarchical Folder System
    • Use a system of parent folders and subfolders
  • File Naming Best Practices
    • Use descriptive but brief names
    • Use ISO date format
    • Implement a system that allows you to easily sort files
  • Versioning
    • Implement a versioning system
  • Tagging
    • Use tags systematically to increase search/access methods

36 of 38

Final Words

  • Organize your project in a way that makes sense for collaborators and, most importantly, future you.
  • There’s no single way to do it; establish a system that works for you.
  • Your organization system may evolve as your priorities or workflow change.

  • It’s ok to start small. What is one strategy you can start implementing today?

37 of 38

Resources

Frazer, M. (14 January 2013). An Elevator Pitch for File Naming Conventions. ACRL TechConnect blog. Retrieved from http://acrl.ala.org/techconnect/?p=2607

JISC Digital Media. Choosing a File Name [Managing]. Jisc. Retrieved from http://www.webarchive.org.uk/wayback/archive/20160101151739/http:/www.jiscdigitalmedia.ac.uk/guide/choosing-a-file-name

MIT Libraries. Data management and publishing: Organize your files. Massachusetts Institute of Technology. Retrieved from http://libraries.mit.edu/data-management/store/organize

38 of 38

Resources

macOS Sierra: Use tags to organize files. Retrieved from https://support.apple.com/kb/PH25325?locale=en_US

Adding Properties and Tags to Files (Windows) http://www.informit.com/articles/article.aspx?p=1963993&seqNum=20

UC Merced Library Digital Curation & Scholarship

http://library.ucmerced.edu/digital-curation-and-scholarship