
Proposed ARCC HPC Scheduling Policy


ARCC Proposed Policy Changes

UW ARCC plans to make adjustments to our current SLURM HPC scheduling policy.  These proposed changes are intended to:

1. Prioritize Access for Funded Research.
2. Incentivize HPC Investment for Priority Access.
3. Spur innovation by maintaining a level of free services available to the overall University of Wyoming research community. 

Details of the proposed policy changes are provided below.

 

Proposed HPC Job Scheduling Policy

All job submissions require the specification of an account. Based on wall time and/or quality of service, a job will be placed in one of several queues. If no partition, QoS, or wall time is specified, a job will, by default, be placed in the Normal queue with a 3-day wall time.

Job submission syntax will remain largely the same, except that users should supply a Quality of Service (QoS) to place their work into a prioritization queue.  If a user does not supply a QoS or wall time in their job submission, the job will, by default, be placed into the Normal queue.  Queue information is detailed below under Queues/QoS Types.  Users are encouraged to specify shorter wall times: shorter jobs are given higher priority to run and thereby spend less time waiting in the queue when the cluster is under high utilization.
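
As an illustrative sketch only (the account name and QoS name below are placeholders; actual account names are assigned by ARCC and QoS names are detailed under Queues/QoS Types), a batch script might specify the QoS and wall time explicitly:

    #!/bin/bash
    #SBATCH --account=myproject   # placeholder: your SLURM account/project name
    #SBATCH --qos=fast            # assumed QoS name; see Queues/QoS Types below
    #SBATCH --time=08:00:00       # requested wall time of 8 hours
    #SBATCH --nodes=1
    #SBATCH --ntasks=4

    srun ./my_program             # placeholder: your application

Omitting --qos and --time would place the same job in the Normal queue with the default 3-day wall time.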

Job submissions will have a partition set for them if one is not supplied.  Partitions will be assigned in a specific order, from oldest hardware to newest.  This ensures that newer hardware remains available for jobs that require it, while less demanding jobs are sent to older hardware.
Jobs associated with investments will be prioritized.

Our department has found that PIs are more frequently using ARCC systems in the classroom or for instructional purposes.  If resources are required during a specific time window, we strongly encourage PIs to request reservations when using part of the cluster in an instructional setting.  

Reservations must be requested 21 days prior to the start of the reservation period.  This advance notice is necessary to account for the scheduling of personnel needed to configure reservations and for the 14-day maximum job wall time.

This is critically important for classes running interactive sessions.  Reservations must be requested and configured to guarantee timely access.  
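
As a sketch, once ARCC has configured a class reservation (the account and reservation names below are placeholders; ARCC supplies the actual reservation name), student jobs would reference it at submission time:

    #!/bin/bash
    #SBATCH --account=class_project          # placeholder account name
    #SBATCH --reservation=class_reservation  # placeholder; use the name ARCC provides
    #SBATCH --time=02:00:00

    srun ./lab_exercise                      # placeholder: class workload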

  • No single SLURM-defined account or HPC project may occupy more than 33% of the cluster + their investment allocation.  This limit will be set per SLURM account (i.e., per project, not per user).

  • Users can no longer specify all memory on a node using the --mem=0 specification in their submission.  Users must explicitly specify how much memory they need.  This reduces the likelihood that a single job is assigned a disproportionately large share of available HPC memory by the SLURM scheduler.  Alternatively, --exclusive may be used, but it will request all resources on the node.  This will ensure accurate utilization in reporting.  (See the sketch following this list.)

  • Users cannot use a GPU partition without requesting a GPU.  Any job that runs on a GPU node must explicitly request a GPU (and may be billed for that use).  This does not apply to investments that include GPUs.
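
To illustrate the last two points (the account and partition names below are placeholders, and the memory amount is only an example), a job on a GPU node would request its memory and its GPU explicitly:

    #!/bin/bash
    #SBATCH --account=myproject    # placeholder: your SLURM account/project name
    #SBATCH --partition=mb-gpu     # placeholder GPU partition name
    #SBATCH --gres=gpu:1           # explicitly request one GPU (required on GPU nodes)
    #SBATCH --mem=64G              # request only the memory needed; --mem=0 is no longer accepted
    #SBATCH --time=04:00:00

    srun ./gpu_program             # placeholder: GPU-enabled application

Requesting --exclusive instead of a specific memory amount would claim the entire node, including all of its memory and CPUs.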

ARCC has created the following queues using the Quality of Service (QoS) function within SLURM, which allows preemption, priority, and resource quotas to be configured for different purposes.  Each is detailed below:

 
Queue/QoS Name | Priority | Maximum Specified Wall Time | Limitations | Purpose
Debug | n/a | 1 hour | Limited to a small hardware partition; limited in job size | For debugging job submissions, code issues, etc.
Interactive | 1 | 8 hours | Limited to interactive jobs | Hosting interactive jobs; the short wall time ensures fair use.
Fast | 2 | 12 hours | None outside of the general account/project limitations** | For normal jobs that will not take an extended amount of time.  Higher priority means the job will start sooner, and the shorter wall time lets the scheduler prioritize more efficiently.
Normal (Default) | 3 | 3 days | None outside of the general account/project limitations** | The default queue.  The wall time is still liberal but allows the scheduler to manage resources effectively.
Long | 4 | 7 days | Limited to 20% of overall cluster + investment** | For jobs requiring a longer wall time.  This flexibility benefits users with longer-running workloads without overwhelming the cluster.
Extended | 5 | 14 days | Limited to investors; limited to 15% of cluster total; limited with respect to the number of jobs in queue per project** | To encourage investment in ARCC by providing investors more flexibility.

** Interactive jobs may not be run in these queues

a. Debug
This queue consists of a small subset of HPC hardware with rigid limitations.  Priority is not applicable since this queue falls outside of normal Slurm scheduling and standard scheduling policies.  This is a specialty partition.  Submissions will be subject to partial node allocations, and debug jobs may not request the entire node.  
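
For example (the QoS name is an assumption and may differ in the final configuration), a partial-node debug submission might include:

    #SBATCH --qos=debug        # assumed name of the Debug QoS
    #SBATCH --time=00:30:00    # within the 1-hour maximum
    #SBATCH --ntasks=2         # partial-node request; whole nodes may not be requested
    #SBATCH --mem=8G           # small, explicit memory request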

b. Interactive
The Interactive queue is used for all interactive jobs, and such jobs are given the highest priority at submission.  All interactive jobs are subject to a maximum 8-hour wall time.  Interactive jobs include:

1. Interactive desktops (including On-Demand XFCE Desktop Sessions)
2. On-Demand applications (Jupyter, XFCE Desktop Sessions, Matlab, Paraview, and all sessions launched through web-based https://medicinebow.arcc.uwyo.edu)
3. Jobs requested via an salloc command.  
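
A command-line interactive session requested via salloc might look like the following sketch (the account name is a placeholder, and the requested time must stay within the 8-hour cap):

    salloc --account=myproject --time=02:00:00 --ntasks=1 --mem=8G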

c. Fast **
The Fast queue is given the highest priority for typical (non-interactive) job submissions.  Its shorter, 12-hour wall time encourages resources to free up quickly and allows the SLURM job scheduler to manage any job backlogs more effectively.
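
A minimal sketch, assuming the QoS is named "fast" (the actual name may differ):

    #SBATCH --qos=fast         # assumed name of the Fast QoS
    #SBATCH --time=12:00:00    # at or below the 12-hour maximum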

d. Normal (Default) **
This queue is the default for any job submitted without a specified wall time or QoS.  It has standard priority, and jobs in it may wait longer to start than jobs in the Fast queue.

e. Long **
The Long queue allows 7-day jobs for work that requires this longer duration, but is limited in scope to encourage shorter wall times and keep overall HPC resources available.  It is subject to the 20% of overall cluster + investment limit detailed in the table above.

f. Extended (Investor-Only/Discretionary) **
To encourage investment, ARCC investors and their project members will be able to use a 14-day wall time on any node.  This allows them to run longer jobs while still allowing ARCC to implement maintenance windows as required.  The priority level is 5 (outside of investments when discretionary).  Investors may use the longer wall time on ANY node, but can preempt jobs only on nodes in which they have invested.  Resources used outside of an investment are limited to 15% of HPC resources + the investment.

Users may e-mail arcc-help@uwyo.edu to request an extension for long-running jobs.  Approval is not guaranteed; it is discretionary and depends on current HPC usage and justification.

 

Definitions
Partition: In this context, a partition refers to a defined group of nodes on the cluster.  Queues do not directly translate to hardware partitions; the only exception is when investments are taken into account.
Priority: Priority in this context relates only to queued jobs, and does not impact currently running jobs.  When a job has priority over other jobs, it will simply be placed higher in the queue.  Running jobs are unaffected.  
QoS: Quality of Service (QoS) is a function within the SLURM job scheduler that allows for the configuration of preemption, priority, and resource quotas.  Queue attributes are summarized in the Queue/QoS table above (Queues/QoS Types).
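
Once the QoS levels are in place, users should be able to list them and their limits directly from SLURM, for example (the exact QoS names will depend on the final configuration):

    sacctmgr show qos format=Name,Priority,MaxWall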