Data Management

UW Libraries Data Services

Research data management is the care and management of data from all stages of the research lifecycle, beginning with idea conception and continuing through proposal writing, generation of results, the publication process, and curation of research data for public access. Managing research data is a complex undertaking that requires thoughtful planning followed by attention to detail throughout the entire research process. 

 

The University of Wyoming Libraries offer comprehensive data management services to help researchers, students, and faculty effectively manage, share, and preserve their research data. Whether you're developing a data management plan (DMP), organizing datasets, or preparing for long-term storage, our team is here to support you throughout the data lifecycle.


Data Management Plans + Funding

Many federal agencies and major funding organizations have data sharing policies, which means researchers are required to share their data as part of the conditions for receiving the grant. In addition, Data Management Plans (DMPs) are required by several federal funding agencies as part of the research grant proposal process.  A data management plan states how data will be created, managed, stored, accessed and shared during and after a research project. Data management plan requirements vary by funder and discipline of research. You should always check grant and funder guidelines for specific requirements.

 

If you are looking for somewhere to publish your data, please go to the Wyoming Data Repository.


DMPTool


Metadata + ReadMes

ReadMes

A ReadMe provides information about a data file and is intended to help ensure that the data can be correctly interpreted, by yourself at a later date or by others when sharing or publishing data. ReadMes can help:

  • Document changes to files or file names within a folder
  • Explain file naming conventions, practices, etc. "in general" for future reference
  • Accompany files or data being deposited in a repository


Metadata

Metadata is documentation that describes data. Properly describing and documenting data allows users (yourself included) to understand and track important details of the work. Having metadata about the data also facilitates search and retrieval of the data when deposited in a data repository. 


Lab Notebooks

A lab notebook is a complete record of procedures (the actions you take), the reagents you use, the observations you make (these are the data), and the relevant thought processes that would enable another scientist to reproduce your observations. Lab notebooks can be physical objects or they can be digital objects.


Codebooks

A codebook, or data dictionary, defines specific details of your data. Variables, column headers for spreadsheets, participant aliases, and qualitative tags are some examples of facets of a dataset that should be described in a codebook. A codebook differs from a ReadMe in that it focuses on specific details of the data, not information about the data file as a whole.
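As a starting point, a codebook can be drafted from the column headers of a tabular file. The sketch below is one possible approach and not a required format. The function name `draft_codebook`, the field names in each entry, and the sample data are all assumptions made for illustration. Descriptions are left blank for the researcher to fill in.

```python
import csv
import io


def draft_codebook(csv_text: str) -> list[dict]:
    """Build one skeleton codebook entry per column of a CSV table.

    The entry fields (variable, description, example_value, n_missing)
    are hypothetical; adapt them to your discipline's conventions.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    codebook = []
    for column in reader.fieldnames or []:
        # Collect the non-empty values recorded for this column.
        values = [row[column] for row in rows if row[column] not in ("", None)]
        codebook.append({
            "variable": column,
            "description": "",  # to be written by the researcher
            "example_value": values[0] if values else "",
            "n_missing": len(rows) - len(values),
        })
    return codebook


# Made-up sample data, for illustration only.
sample = "participant_id,age,group\nP001,34,control\nP002,,treatment\n"
for entry in draft_codebook(sample):
    print(entry)
```

A generated skeleton like this only lists what is in the file. The definitions, units, and allowed values still need to be written by someone who knows the data.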


Best Practices in Data Management

The DataONE (Data Observation Network for Earth) website provides guidance on best practices for managing data throughout the data life cycle.

Following file naming conventions helps you organize your data files and makes them easier to discover and retrieve. File names should:

 

  • Be extensible: "ex001" not "ex1"
  • Be unique: not 20 files called "data.xlsx"
  • Only use numbers, letters, and underscores
  • Embody their contents (and parameters): AtherRat_ex012_ather_lipito_128.tif
  • Have non-cryptic names: AtherRat_SOP_DataValidation_v01.docx
  • Be consistent and documented!

 

Using a system like this results in a folder that sorts well. Of course, creating a file naming convention that is easily documentable does not preclude the need to actually create that documentation! Something as simple as a ReadMe file in the directory with the files can prevent a great deal of future confusion. 
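Some of the file naming rules above can be checked automatically. The sketch below is a minimal illustration, assuming a convention of letters, numbers, and underscores followed by a single file extension, and treating any single-digit number as "not extensible". The function name `check_file_name` and both rules as coded are assumptions, so adjust them to match the convention you document for your own project.

```python
import re

# Assumed convention: letters, digits, and underscores, then one extension.
ALLOWED = re.compile(r"^[A-Za-z0-9_]+\.[A-Za-z0-9]+$")
# A digit that has no digit before or after it, such as the "1" in "ex1".
SINGLE_DIGIT = re.compile(r"(?<!\d)\d(?!\d)")


def check_file_name(name: str) -> list[str]:
    """Return a list of problems found in a file name (empty if none)."""
    problems = []
    if not ALLOWED.fullmatch(name):
        problems.append("use only letters, numbers, and underscores before the extension")
    stem = name.rsplit(".", 1)[0]
    if SINGLE_DIGIT.search(stem):
        problems.append("zero-pad numbers so the sequence can grow (ex001, not ex1)")
    return problems


for candidate in [
    "AtherRat_ex012_ather_lipito_128.tif",
    "ex1.tif",
    "my data (final).xlsx",
]:
    print(candidate, check_file_name(candidate))
```

A check like this cannot tell whether a name is meaningful or unique within a project, so it complements the written documentation and does not replace it.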

 

For more information on file naming conventions, check out these best practices from Stanford Libraries or these 10 Rules for Best Practice.

A key aspect of research data management is a solid storage strategy for your active data and archived data. "Active data" are the data you are collecting and analyzing for your research project. 

 

At the University of Wyoming, one option for active data storage is OneDrive for Business. University accounts each have 5 TB of online OneDrive space, with additional space available upon request. For researchers, another option is the Data Commons, provided by ARCC. This is a highly collaborative space geared towards project-oriented data storage for use with multiple researchers at UWyo and beyond. To provide enhanced collaboration and control, Principal Investigators may assign access permissions to other UWyo users as well as external users. There is no cost to store up to 500 GB, and researchers can pay for more storage space. 

Many types of data may contain confidential or sensitive information that must be protected. In some cases, there may even be legal or regulatory standards that must be followed (such as HIPAA). There are several information security techniques that may be employed to protect your data:

 

  • Encryption: offers protection by scrambling data, so only the owner of the key or password can read the data.
  • Access Control: allows an administrator to specify who is allowed to access digital material and the type of access that is permitted (for example read only, write).
  • Redaction/anonymization/deidentification: the process of analyzing a digital resource, identifying confidential or sensitive information, and removing or replacing it. You should always carry out any anonymization work on a copy of the original data, not the original data itself.
  • Physical Security: involves controlling access to buildings or rooms where your data is held. This may be as simple as ensuring that the lab where your workstation is located is locked, with key card access only granted to you and your lab personnel.
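To illustrate the advice about working on a copy when de-identifying data, here is a minimal sketch that replaces a direct identifier with a keyed hash and drops other identifying fields. It is an assumption-laden example and not a vetted de-identification procedure: the function name `pseudonymize`, the field names, and the 12-character alias length are all hypothetical. Keyed hashing is pseudonymization only, and the remaining fields may still allow re-identification.

```python
import copy
import hashlib
import hmac


def pseudonymize(records, id_field, drop_fields, secret_key: bytes):
    """Return a de-identified copy of records; the input is left untouched."""
    cleaned = copy.deepcopy(records)  # never modify the original data
    for record in cleaned:
        # Replace the identifier with a keyed hash so the same person
        # always receives the same alias, without exposing the original ID.
        digest = hmac.new(
            secret_key, str(record[id_field]).encode("utf-8"), hashlib.sha256
        ).hexdigest()
        record[id_field] = digest[:12]
        # Remove fields that identify a person directly.
        for field in drop_fields:
            record.pop(field, None)
    return cleaned


# Made-up records and key, for illustration only.
original = [{"participant_id": "P001", "name": "Jane Doe", "score": 7}]
shared = pseudonymize(original, "participant_id", ["name"], b"keep-this-key-private")
print(shared)
print(original)  # unchanged
```

The secret key would need to be stored separately from the shared data. Anyone holding both could link aliases back to the original identifiers.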

 

Researchers interested in data security measures should contact ARCC.

Never rely on a single copy of data. Keep your source (primary, or raw) data separate from your active data, and always make a copy of it before working on the data. CDs and DVDs are not reliable as long-term storage options: their life expectancy is only 2-5 years, and they need to be stored under the appropriate environmental conditions. Hard drives have a life expectancy of 4-6 years. USB (thumb) drives are not a good option because they are easily lost or stolen.

 

At the very least, you should have one separate backup of your data. Best practice is to follow the 3-2-1 Rule. You should have three copies of your data, in two different physical locations, on at least one different type of media.  
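Backup copies are only useful if they still match the original. One common way to confirm this is to compare checksums. The sketch below assumes SHA-256 as the checksum, and the function names `sha256_of` and `copies_match` are hypothetical. Which copies you compare, and where they live, depends on your own storage setup.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original: Path, copies: list[Path]) -> dict[str, bool]:
    """Report, for each backup copy, whether it exists and matches the original."""
    expected = sha256_of(original)
    return {
        str(candidate): candidate.exists() and sha256_of(candidate) == expected
        for candidate in copies
    }


# Example call with hypothetical paths (not run here):
# copies_match(Path("raw/survey_2024.csv"),
#              [Path("/mnt/backup/survey_2024.csv"), Path("D:/offsite/survey_2024.csv")])
```

Running a check like this on a schedule can reveal a missing or corrupted copy before it is needed.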

Data Repositories

A simple and effective way to share your research materials is to publish them in a repository. A repository is a storage facility (often also a preservation and curation facility) where users can upload and download data and make it accessible and discoverable, whether to fulfill grant requirements, to support the free sharing of scholarly knowledge, or both.

 

Materials that are deposited into a repository should be:

  • Publication-ready, or data not likely to be modified
  • Searchable and browsable
  • Easy to retrieve or download
  • Citable
  • Well-documented, to facilitate understanding and re-use

 

A wide variety of institution-based and discipline-specific repositories exist for researchers to choose from. The repository itself should be: 

  • Appropriate for the type of data you generate
  • Appropriate for the audience of the repository (so they will make use of your data!)
  • Open access

 

The Libraries also has information on finding data in repositories for re-use and how to properly cite data in your work.


Institutional Data Repository

The University of Wyoming maintains the Wyoming Data Repository (WDR), an open-access data repository. WDR holds a variety of material, including research data. The goal of the University of Wyoming’s data collections within WDR is to encourage the dissemination of research data by making these materials open access. The collection houses data created by University of Wyoming students, faculty, staff and University of Wyoming affiliates in the course of their research and registers a DOI for every dataset published. You can visit WDR here and find more specific information in our Data Repository Policy.

Data Licensing

Introduction

Data licensing is an important part of sharing your data. To facilitate reuse of research data, end users must know the terms of use for the database and the data content. Even if the funder requires you to open or share your data, a license is a courtesy to potential re-users so they know up front what they can do with it. It also increases the likelihood that other researchers will consider your data for reuse, which benefits you. Simply releasing the data without a license creates ambiguity.

Data ownership at the University of Wyoming

According to the University of Wyoming Research Data Policy (2015), "All Research Data is owned by the University, except as otherwise provided by an agreement with a third party, a law, or University policy, such as copyright policy. Not including those copyrighted items which are covered in a different University regulation 3‐641." Generally, however, these guidelines allow researchers to publish their data. 

Choosing a license

There are many licenses available to choose from. There is no one-size-fits-all solution. It depends on the data type, what you will or will not allow others to do with it, what the funder or institution requires, what your discipline uses, even what your publisher requires. If you are publishing in the Wyoming Data Repository, the standard licenses available are from Creative Commons.

Further Data Management Resources

In general, electronic lab notebooks (ELNs) will all differ in: 1) user interfaces; 2) interoperability and flexibility; and 3) the pricing model.

  • Article from Nature describing how to pick an electronic lab notebook.
  • Comparison grid from Harvard for picking the best-suited ELN for you.
  • Helpful advice and guide from labfolder.
  • Additional comparison tables from University of Cambridge.

 

In no particular order, here are some of the more popular, higher rated ELNs:

If you're looking for a general introduction on best practices for research data management, check out some of these training modules. Digital Scholarship can also provide local, customized training.

 

A variety of free tools and software is available for anyone to use as part of a research project. This is not an exhaustive list, but a starting point to help you explore what's available.

 

FAIR stands for Findable, Accessible, Interoperable and Reusable. The FAIR Data Principles were developed and endorsed by researchers, publishers, funding agencies and industry partners in 2016 and are designed to enhance the value of all digital resources. The FAIR principles do not prescribe any particular technology, standard, or specification, but rather act as a guide to researchers to aid them in evaluating whether their current data curation practices are sufficient to render their data Findable, Accessible, Interoperable, and Reusable.

 

 

Wyoming Data Repository, UW's institutional data repository, uses a platform that strives to make data FAIR. Here is an overview of platform features that adhere to FAIR principles (note: this figure is geared toward a publisher audience, but can still provide helpful information if you need details for, say, a grant application). If you have questions about the FAIR data principles, or how to make your data more FAIR, contact the Data Services Librarian.