4. Storage Policies

A. Research Data Storage Overview

ARCC provides two options for research data storage: Alcova and Pathfinder. As of June 1st, 2024, data will begin migrating from Alcova to ARCC's Data Portal. Research data storage on ARCC's Data Portal, HPC storage, and the old Alcova system is subject to default quotas as listed on our DRAFT pricing sheet under Costs and Charges (Section 5).

B. Data on HPC Resources

i.  Intended Use

ARCC's high-performance storage system (HPS) for MedicineBow is a high-speed, tiered storage system designed to maximize performance while minimizing cost. It is intended for storing data that is actively being used.

ii. Best Practices

The following policies govern the use of this space. In general, the disk space is intended to support research using the cluster, and as a courtesy to other users of the cluster, you should delete any files that are no longer needed or in use. All data on the HPS is considered related to your research and not of a personal nature; as such, all data is considered to be owned by the principal investigator for the allocation through which you have access to the cluster. MedicineBow is for the support of active research using the clusters, so you should remove data files and related material from the cluster promptly when you are no longer actively working on the computations that require them. This ensures that all users can avail themselves of these resources.
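Because project space is shared and owned by the principal investigator, it can help a group to see where its space is going before cleaning up. The sketch below is one illustrative way to total usage by file owner under a shared project directory; the path is a hypothetical placeholder, and the script only reports usage, it does not delete anything.

    #!/usr/bin/env python3
    """Summarize disk usage by file owner in a shared project directory.

    Illustrative sketch only; the directory path is hypothetical. Intended
    to help a group review which files to remove when no longer needed.
    """
    import os
    import pwd
    from collections import defaultdict

    PROJECT_DIR = "/project/your_project"  # hypothetical shared project space

    usage_by_owner = defaultdict(int)

    for dirpath, _dirnames, filenames in os.walk(PROJECT_DIR):
        for name in filenames:
            try:
                st = os.stat(os.path.join(dirpath, name))
            except OSError:
                continue  # skip files that vanish or cannot be read
            try:
                owner = pwd.getpwuid(st.st_uid).pw_name
            except KeyError:
                owner = str(st.st_uid)  # uid with no passwd entry
            usage_by_owner[owner] += st.st_size

    # Print owners from largest to smallest usage.
    for owner, nbytes in sorted(usage_by_owner.items(), key=lambda kv: -kv[1]):
        print(f"{owner:15s} {nbytes / 1024**3:8.1f} GiB")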

iii.  Backup Policy

None of the MedicineBow file systems are backed up. Data is replicated within the file system to minimize data loss in case of a system fault or failure.

C. Default Storage Quotas

i. General Service Storage Quotas

Each individual researcher is assigned a standard storage allocation, or quota, on /home, /project, and /gscratch. Researchers who use more than their allocated space will be blocked from creating new files until they reduce their usage.

Each storage service, its description, and its default quota (for ARCC services provided at no cost to UW Researchers) ** are listed below:

  • Data Storage: MedicineBow HPC storage (formerly known as Beartooth/TetonCreek)
    Description: See Section 2. ARCC HPC Policies -> D. Default HPC Compute Service Quota
    Default Quota: See Section 2. ARCC HPC Policies -> D. Default HPC Compute Service Quota **

  • Data research storage (formerly known as Alcova)
    Description: General research storage. On-site backups and 20-day snapshots included.
    Default Quota: 2 TB / project **

  • Pathfinder S3 Storage
    Description: S3 bucket storage, on-premises. On-site backups only by user request and subject to cost above quota.
    Default Quota: N/A. All Pathfinder storage is charged based on allocation and private/secure key.

** Graduate Projects are subject to a 50% default storage quota.

ii. Graduate Project Storage Quotas

Graduate student project storage quotas are 50% of the default storage quotas listed in the general service storage table above. For example, a graduate project on data research storage (formerly Alcova) would receive a 1 TB default quota rather than the standard 2 TB per project.
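As an illustration of how these defaults combine, the sketch below computes an effective quota from a base default and the 50% graduate adjustment, then compares it against a directory's current usage. The directory path and the usage-summing approach are assumptions for illustration only; the 2 TB default and the 50% adjustment come from the table above, and actual quota enforcement is handled by the storage system, not by a user script.

    #!/usr/bin/env python3
    """Compare a project directory's usage against its default quota.

    Illustrative sketch only: the path is hypothetical, and enforcement
    is done by the storage system, not by this script.
    """
    import os

    DEFAULT_QUOTA_TB = 2.0                 # data research storage default (2 TB / project)
    GRADUATE_PROJECT = True                # graduate projects receive 50% of the default
    PROJECT_DIR = "/project/your_project"  # hypothetical path

    quota_bytes = DEFAULT_QUOTA_TB * (0.5 if GRADUATE_PROJECT else 1.0) * 1024**4

    used_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(PROJECT_DIR):
        for name in filenames:
            try:
                used_bytes += os.stat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # skip files that disappear or cannot be read

    print(f"Used {used_bytes / 1024**4:.2f} TB of {quota_bytes / 1024**4:.2f} TB quota")
    if used_bytes > quota_bytes:
        print("Over quota: new file creation may be blocked until usage is reduced.")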

D. Directory Descriptions

/home: Personal user space for storing small, long-term files such as environment settings, scripts, and source code.

/project: Project-specific space shared among all members of a project for storing short-term data, input, and output files.

/gscratch: User-specific space for storing data that is actively being processed. This storage is subject to the purge policy and should not be used for long-term storage.

/lscratch: Node-specific space for storing short-term computational data relevant to jobs running on that node. Files are deleted nightly.
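A typical job workflow that respects these directory roles stages inputs from /project, does heavy temporary I/O in node-local /lscratch, and copies results back before cleanup. The sketch below illustrates that pattern; the paths and file names are hypothetical, and the "computation" step is only a placeholder.

    #!/usr/bin/env python3
    """Sketch of a job that uses /lscratch for temporary I/O.

    Hypothetical paths and file names, shown only to illustrate the
    intended roles of /project, /lscratch, and /gscratch described above.
    """
    import os
    import shutil

    project_dir = "/project/your_project"     # shared, short-term project data
    lscratch_dir = "/lscratch/your_job_tmp"   # node-local, cleaned up automatically
    gscratch_dir = "/gscratch/your_username"  # actively processed data, purgeable

    os.makedirs(lscratch_dir, exist_ok=True)

    # 1. Stage the input from project space to fast node-local scratch.
    shutil.copy(os.path.join(project_dir, "input.dat"),
                os.path.join(lscratch_dir, "input.dat"))

    # 2. Do the heavy temporary I/O in /lscratch (placeholder computation).
    with open(os.path.join(lscratch_dir, "input.dat"), "rb") as src, \
         open(os.path.join(lscratch_dir, "result.dat"), "wb") as dst:
        dst.write(src.read())

    # 3. Copy results off the node before they are cleaned up.
    shutil.copy(os.path.join(lscratch_dir, "result.dat"),
                os.path.join(gscratch_dir, "result.dat"))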

E. Capacity Increase Options

Researchers working with or generating massive data sets that exceed the default allocations or quotas, or that have significant I/O needs, should consider the following options:

  • Rent space on shared hardware: There is a set price per TB per year. Please contact ARCC for the exact pricing.

  • Purchase additional storage disks to be incorporated into HPC: This option is appropriate for groups that need more space than the free offering, but don’t have the extreme space or performance demands that would require investing in dedicated hardware.

  • Buy your own dedicated storage hardware for Research Computing to host: If you need more than about 15 TB of storage or very high performance, dedicated hardware is more economical and appropriate. The exact choices are always evolving. Please contact ARCC for more information.

F. File Deletion and Purge Policy

ARCC HPC file deletion policy is based on directory as described above in directory descriptions:

  • /home: Home directories will only be deleted after the owner has been removed from the university system.

  • /project: Project directories will be preserved for up to 6 months after project termination.

  • /gscratch: Files that have not been accessed in 45 days can be purged, and data may be deleted as necessary (see the sketch following this list).

  • /lscratch: Files are automatically removed after 14 days.
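The 45-day criterion for /gscratch is based on file access time. A sketch like the following can help a researcher identify files at risk of being purged so they can be moved or removed first; the scan root is a hypothetical per-user path, and the actual purge is performed by ARCC, not by this script.

    #!/usr/bin/env python3
    """Report /gscratch files whose last access is older than 45 days.

    Illustrative sketch only; the scan root is a hypothetical per-user
    path, and the actual purge is performed by ARCC, not by this script.
    """
    import os
    import time

    SCAN_ROOT = "/gscratch/your_username"  # hypothetical path
    PURGE_AGE_DAYS = 45                    # purge eligibility threshold from the policy

    cutoff = time.time() - PURGE_AGE_DAYS * 24 * 3600

    for dirpath, _dirnames, filenames in os.walk(SCAN_ROOT):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                atime = os.stat(path).st_atime  # last access time
            except OSError:
                continue  # file removed or unreadable; skip it
            if atime < cutoff:
                print(f"Eligible for purge: {path}")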

G. Costs Associated with Project Storage

Please see Policy Section 5, Costs and Charges, below.

i.  One-Time Increase Requests Process and Provisions

If users require quota increases on Beartooth file storage, they may apply for a one-time, no-cost capacity increase by providing written justification for the amount of extra storage and its benefit to the UW mission of research, education, and economic development.

  • This capacity increase lasts for the life of the storage system (typically 3 years), must be approved by ARCC Leadership, and is subject to a 10 TB maximum.

ii. Costs Associated with Section 4E, Capacity Increase Options

In all other cases, the user will be charged per the Cost of Resources listed below in Policy Section 5.