Research and Data Services

Data Support, Training, and Software

Research Data Management

Introduction

Below are selected links and resources to help you manage your research data.

File Naming and Organization

File Naming

  • Above all, be descriptive and consistent
  • You should be able to understand what is in the file without additional context (like the file creation date or the folder name)
  • Other best practices:
    • Avoid spaces and special characters that software may not read correctly; use hyphens or underscores instead
    • Preferred date format: YYYYMMDD
    • Use leading zeros (e.g. 01, 02, 03) for better sorting
    • Indicate version numbers in file name if not using an automated version control system
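
As a minimal sketch of the practices above, the following Python helper assembles a file name from a date, a project label, a description, and a version number. The function name `build_filename` and the `date_project_description_vNN` pattern are illustrative assumptions, not a prescribed standard; any scheme works as long as it is descriptive and applied consistently.

```python
import re
from datetime import date


def build_filename(project, description, when, version, ext):
    """Assemble a name such as 20210315_project-x_soil-ph-raw_v02.csv.

    The pattern is a hypothetical example, not an official convention.
    """
    def clean(part):
        # Lowercase, then replace runs of spaces or special characters with a hyphen.
        return re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-")

    # YYYYMMDD date first so files sort chronologically; zero-padded version for sorting.
    return (
        f"{when:%Y%m%d}_{clean(project)}_{clean(description)}"
        f"_v{version:02d}.{ext.lstrip('.')}"
    )


print(build_filename("Project X", "soil pH (raw)", date(2021, 3, 15), 2, "csv"))
# prints: 20210315_project-x_soil-ph-raw_v02.csv
```

Putting the date first and padding the version with a leading zero means a plain alphabetical sort also orders the files by date and version.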

File Organization

  • As with file names, make sure folder names are also descriptive and consistent
  • Always save raw data in a separate folder
  • Sort folders by date, data type, processing stage, or any other system that makes sense to you and your collaborators
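
One way to apply these points is to create the same folder skeleton at the start of every project. The sketch below assumes a layout sorted by processing stage; the folder names in `SUBFOLDERS` and the function `create_project_skeleton` are hypothetical and should be adapted to whatever system you and your collaborators agree on.

```python
import tempfile
from pathlib import Path

# Hypothetical layout: raw data gets its own folder, and numeric
# prefixes with leading zeros keep the folders in processing order.
SUBFOLDERS = ["01_raw_data", "02_processed_data", "03_analysis", "04_documentation"]


def create_project_skeleton(root):
    """Create the standard subfolders under root and return their paths."""
    root = Path(root)
    created = []
    for name in SUBFOLDERS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(folder)
    return created


# Demonstration in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    for folder in create_project_skeleton(Path(tmp) / "my_project"):
        print(folder.name)
```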

File Formats

Resources:

Spreadsheets and Tidy Data

Following a few basic recommendations when working with research data in spreadsheets can save you time when it comes to analyzing your data. Consider these practices adapted from Data Organization in Spreadsheets, Karl W. Broman & Kara H. Woo.

Basic Spreadsheet Practices

Getting Started

  • Make backups
  • No calculations in the raw data files
  • Make it a rectangle and fill in each cell

Inputting Data

Tidy Data

  • Data Science for the Biomedical Sciences
    • Every column is a variable
    • Every row is an observation
    • Every cell is a single value
  • Do not use font color or highlighting as data; instead, add a column with a "flag" value
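
The tidy data rules can be illustrated with a small, made-up table. In this sketch the column names, the sample values, and the helper `is_rectangular` are all assumptions for illustration. Each row is one observation, each column is one variable, and a `flag` column carries the information that might otherwise be recorded as a highlight color.

```python
# Hypothetical measurements: one row per observation, one column per variable.
rows = [
    {"sample_id": "S01", "date": "20210315", "weight_g": "12.4", "flag": "ok"},
    {"sample_id": "S02", "date": "20210315", "weight_g": "98.7", "flag": "outlier"},
    {"sample_id": "S03", "date": "20210316", "weight_g": "NA", "flag": "not_measured"},
]


def is_rectangular(records):
    """True when every row has the same columns and no cell is left empty."""
    columns = list(records[0])
    return all(
        list(r) == columns and all(value != "" for value in r.values())
        for r in records
    )


print(is_rectangular(rows))  # prints: True

# Because the flag is ordinary data, it can be filtered like any other column.
flagged = [r["sample_id"] for r in rows if r["flag"] != "ok"]
print(flagged)  # prints: ['S02', 'S03']
```

Unlike cell formatting, the flag column survives export to plain text and can be read by any analysis tool.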

Sharing

  • Save the data in plain text files
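
As a rough sketch of what saving as plain text can look like, the example below writes a small table as comma-separated values using Python's standard `csv` module. The records and the helper name `to_csv_text` are made up for illustration.

```python
import csv
import io


def to_csv_text(records):
    """Render a list of dictionaries as comma-separated plain text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical records; "NA" marks a missing value explicitly.
records = [
    {"sample_id": "S01", "weight_g": "12.4"},
    {"sample_id": "S02", "weight_g": "NA"},
]
print(to_csv_text(records))
```

To write a file instead of a string, the same `csv.DictWriter` can be given a file opened with `open(path, "w", newline="", encoding="utf-8")`.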

Additional Resources:

Data Science for the Biomedical Sciences - Spreadsheets
Data Carpentry Spreadsheet Lesson
DataONE Data Entry and Manipulation (creating files, missing values, data validation)
Data Organization in Spreadsheets, Karl W. Broman & Kara H. Woo

Data Documentation and Description

Documentation:

Describing your Project

CESSDA has a useful guide for creating project-level documentation:

  1. For what purpose was the data created?
  2. What does the dataset contain?
  3. How was the data collected?
  4. Who collected the data and when?
  5. How was the data processed?
  6. What possible manipulations were done to the data?
  7. What were the quality assurance procedures?
  8. How can the data be accessed?

Describing your Dataset(s)

A helpful overview of readme files, data dictionaries, and codebooks, with examples (Iowa)

Readme File

A readme is typically a plain-text file that provides information about a data file to facilitate use and re-use of the data. Typical elements of a readme include the following (adapted from Guide to Writing "readme" Style Metadata). Using one of the templates below can help ensure you create a useful readme file.

Readme Content: General Information

  • Dataset title
  • Creator name and contact information
  • Date(s) of data collection
  • Location of data collection
  • Keywords to describe data topic

Readme Content: Data and Files

  • A descriptive name for each file, with a description of the data it contains
  • Date the file was created
  • List of variables for each dataset, including full names and descriptions of each
  • Definitions of any codes or symbols, including those for missing data

Readme Content: Methods

  • Methods for data collection or generation
  • Methods used for data processing
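
The readme elements listed above can be turned into a simple starting template. The following is only a sketch: the function `render_readme`, the field keys, and the section layout are assumptions made for illustration and do not reproduce any official template. Fields that have not been filled in yet are marked `TODO` so gaps are easy to spot.

```python
def render_readme(info):
    """Build plain-text readme content from a dictionary of fields."""
    # Section headings follow the three content groups listed above;
    # the field keys themselves are hypothetical.
    sections = [
        ("GENERAL INFORMATION",
         ["title", "creator", "collection_dates", "location", "keywords"]),
        ("DATA AND FILES",
         ["files", "file_dates", "variables", "codes"]),
        ("METHODS",
         ["collection_methods", "processing_methods"]),
    ]
    lines = []
    for heading, keys in sections:
        lines.append(heading)
        for key in keys:
            label = key.replace("_", " ").capitalize()
            lines.append(f"{label}: {info.get(key, 'TODO')}")
        lines.append("")  # blank line between sections
    return "\n".join(lines)


print(render_readme({"title": "Example dataset", "codes": "NA = missing value"}))
```

The returned text can be saved next to the data as a plain-text file, for example with `Path("README.txt").write_text(...)` from `pathlib`.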

Additional Readme Resources

Data Dictionary

Codebook

Metadata and Standards

Disciplinary Metadata (Digital Curation Centre) - links to information about metadata standards by discipline/field

UVA Research Data Resources

University Policies:

UVA-Contracted Cloud Storage:

Additional UVA Storage Resources:

Backing Up Your Data:

  • Consider the rule of three:
    • Here (lab computer, personal computer)
    • Near (portable hard drive, flash drive)
    • Far (cloud storage, remote backup)
  • CESSDA Guide to Backing Up Data
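
The rule of three can be partly automated. The sketch below copies a file from its working location ("here") to two other folders standing in for "near" and "far" storage, then compares SHA-256 checksums to confirm each copy is intact. The function names and folder names are hypothetical; in practice the destinations would be a mounted external drive and a synced cloud or remote folder.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256(path):
    """Return the SHA-256 checksum of a file as a hex string."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def back_up(source, destinations):
    """Copy source into each destination folder and verify every copy."""
    source = Path(source)
    expected = sha256(source)
    copies = []
    for folder in destinations:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256(target) != expected:
            raise OSError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies


# Demonstration with temporary folders standing in for the three locations.
with tempfile.TemporaryDirectory() as tmp:
    here = Path(tmp) / "here"
    here.mkdir()
    data_file = here / "20210315_example_v01.csv"
    data_file.write_text("sample_id,weight_g\nS01,12.4\n")
    copies = back_up(data_file, [Path(tmp) / "near", Path(tmp) / "far"])
    print(len(copies))  # prints: 2
```

This reads each file fully into memory to compute the checksum, which is fine for small files; very large files would need to be hashed in chunks.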

Archiving and Sharing:

UVA Resources for Computing and Analysis:

Questions?

Need more information on managing your research data? We are here to help:

University Library Research Data Services + Sciences - contact at dmconsult@virginia.edu

Health Sciences Library Research & Data Services - contact at hsl-rdas@virginia.edu

Claude Moore Health Sciences Library
1350 Jefferson Park Avenue P.O. Box 800722
Charlottesville, VA 22908

© 2021 by the Rector and Visitors of the University of Virginia