
A Dance of Two Stars & A Million CPUs
Dylan Perkins
Published March 26, 2026
6 Minute Read
Collaborative efforts leverage HPC resources and bring new findings
By working together, a researcher at the University of Wyoming (UWyo) and the Advanced Research Computing Center (ARCC) completed more than a year and a half’s worth of computational work in less than three weeks.
Binary star systems (two stars gravitationally bound in orbit around each other) are one of the many phenomena in our galaxy that seem to come straight from science fiction. Tatooine, anyone? Yet they are quite common in our very own Milky Way. Even so, much research is still needed to classify eclipsing binary systems. One researcher attempting to do just that is Megan Frank, a PhD candidate in UWyo’s Department of Physics and Astronomy. As Megan describes her project, “this project aims to classify four eclipsing binary systems located in the Large Magellanic Cloud (a dwarf galaxy of the Milky Way). These objects are a bit unique in that they showcase a unique reflection effect in their light curves, meaning that the space between the primary and secondary eclipse is sloped rather than flat.”
One tool Megan uses to perform the analyses needed to classify these stars is high-performance computing (HPC). In particular, she states, “We have been utilizing the Physics Of Eclipsing BinariEs (PHOEBE) light curve modeling software on UWyo’s MedicineBow cluster administered and supported by ARCC to fit these unique light curves in order to classify these systems.” However, one step of her workflow required running multiple waves of computing jobs, approximately 500,000 individual simulations each using a single CPU, before moving on to the next step. On May 6th, 2025, knowing that MedicineBow has approximately 18,000 CPUs and that taking all of them could cause problems for others, Megan reached out to ARCC for advice on how best to schedule that many jobs without affecting other researchers who also rely on MedicineBow.
HPC was "the right tool" for the job
When ARCC End User Support (EUS) saw this request, the initial reaction was pleasant surprise: in many workshops, the team has stressed to researchers on campus the importance of being ‘a good cluster citizen.’ EUS eagerly brought the request to the attention of ARCC Leadership, who suggested involving another ARCC team, the Computational Research Software Support (CRST) team, and meeting with Megan to find out how best to ‘throttle’ these jobs, scheduling them in smaller batches to reduce the impact on other researchers using the system. Additionally, ARCC policy states that no single project can use more than 33% of a cluster’s CPUs at any one time, so completing 500,000 compute jobs in a timely manner without affecting others was going to be a challenge.
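As a rough illustration of the constraint, the figures quoted above (about 18,000 CPUs, a 33% per-project limit, and roughly 500,000 single-CPU simulations) can be combined in a few lines of Python. The function name is ours, and the "waves" estimate assumes every simulation takes about the same time, which is a simplification for illustration and not ARCC's actual scheduling logic.

```python
import math

def project_cpu_cap(total_cpus: int, max_percent: int) -> int:
    """Largest whole number of CPUs one project may hold under the policy."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return total_cpus * max_percent // 100

TOTAL_CPUS = 18_000   # approximate size of MedicineBow, from the article
MAX_PERCENT = 33      # per-project policy limit, from the article
N_JOBS = 500_000      # approximate number of single-CPU simulations

cap = project_cpu_cap(TOTAL_CPUS, MAX_PERCENT)

# If each job uses one CPU and all jobs ran for the same time (an assumption),
# this many back-to-back "waves" would be needed at the policy limit.
waves = math.ceil(N_JOBS / cap)

print(f"Per-project CPU cap: {cap}")
print(f"Waves needed at the cap: {waves}")
```

Under these assumptions the policy limit works out to just under 6,000 CPUs for a single project, so any cap on simultaneous single-CPU jobs below that figure stays inside the policy.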
After meeting with Megan to learn what she needed to run and how, the CRST team developed a method to schedule the individual compute jobs in batches that fit within ARCC policy and to manage the output data. The CRST team confirmed the appropriateness of the chosen methodology with ARCC Leadership, then showed the EUS team how the workflow was structured. The EUS team and Megan met one more time to hand over the workflow and walk through the steps that CRST had come up with. Megan began slowly, going from tens of jobs to thousands, up to a maximum of 5,000 jobs running at a single time. Over the course of twenty days, Megan ended up running 1,275,744 individual jobs that consumed over 14,000 CPU hours. For reference, had she run this workflow on her own computer, one job after another, it would have taken up to 19 months to complete.
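The comparison with a single computer follows directly from the CPU-hour total. The short Python sketch below reproduces that arithmetic from the numbers in the article; it assumes a personal computer that runs one job at a time, around the clock, at the same per-core speed as the cluster, and it uses an average month length. Both are our simplifications.

```python
CPU_HOURS = 14_000          # "over 14,000 CPU hours", from the article
N_JOBS = 1_275_744          # individual jobs run, from the article
WALL_DAYS_ON_CLUSTER = 20   # elapsed days on MedicineBow, from the article

HOURS_PER_DAY = 24
DAYS_PER_MONTH = 365.25 / 12   # average month length (our assumption)

# One job after another on a single core: total time equals total CPU time.
serial_days = CPU_HOURS / HOURS_PER_DAY
serial_months = serial_days / DAYS_PER_MONTH

# Average cost of one simulation, and how much faster the cluster run was.
avg_seconds_per_job = CPU_HOURS * 3600 / N_JOBS
speedup = serial_days / WALL_DAYS_ON_CLUSTER

print(f"Serial run: {serial_days:.0f} days (~{serial_months:.1f} months)")
print(f"Average job length: {avg_seconds_per_job:.1f} s")
print(f"Speedup over {WALL_DAYS_ON_CLUSTER} days on the cluster: {speedup:.0f}x")
```

The serial estimate lands at roughly 583 days, or about 19 months, which matches the figure quoted above. Since the article says "over" 14,000 CPU hours, these values are lower bounds.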
As successful as this was, one small incident caused concern. At the peak of Megan’s job activity, Villanova University (which hosts the data tables for the PHOEBE software) contacted UWyo IT because it suspected a cyber attack. Since ARCC Leadership was aware of the workflow, they stepped in to assure both UWyo IT and Villanova IT that this was indeed a research workflow and not a cyber attack, which allowed Megan’s jobs to complete before Villanova took measures to block them from contacting its systems. Megan now informs Villanova whenever she is about to run many jobs so they can anticipate the traffic.
Megan is now starting the next phase of her work, which requires fewer simulations but longer-running algorithms, and MedicineBow is still helping. “New runs are taking six to eight hours on MedicineBow, which would take about a week on my own computer,” she said. Whenever Megan runs into an issue, she reaches out to ARCC, as the working relationship is now strong and mutually beneficial. ARCC encourages other researchers to reach out for assistance, just as Megan did, should they run into challenging computational problems.
On May 6th, 2025, Megan Frank, a PhD candidate in Physics and Astronomy, reached out to ARCC via the ticketing system for advice on how best to run over 500,000 jobs on the MedicineBow cluster with the least possible impact. Upon noticing the ticket, ARCC End User Support (EUS) asked ARCC leadership how best to handle the request.
ARCC then met with Megan to see her workflow and worked to optimize it to fit within MedicineBow’s job scheduling and data storage limits. The Computational Research Software Support (CRST) team then came up with a workflow to share with Megan that would throttle her job submissions to an amount that fit within the ARCC policy of no more than 33% of the cluster being used by a single project. The CRST team worked with EUS to explain how the workflow does this so that EUS could work with Megan on implementing it. This proved successful, as Megan ramped up job submissions from tens to thousands of jobs without impact on other ARCC users.
Things are still ongoing, but a lot better: new runs are taking six to eight hours without impacting others, and would take about a week on my own computer.
Same code for next project.
- On May 6th, 2025, PhD candidate Megan Frank reached out to ARCC for advice on how best to run over 500,000 jobs on the MedicineBow cluster with the least possible impact.
- ARCC members held an internal discussion on how best to help her.
- ARCC met with Megan to talk about her workflow and how she was planning to submit these jobs.
- At the end of the meeting, it was determined that ARCC would help Megan stream job submissions so that no more than 5,000 would be in a running or pending state at any time.
- To facilitate this, the CRST team would develop the streaming workflow and then explain the solution to the User Support team, who would in turn instruct Megan on how to use it.
- During this meeting, it was discovered that Megan’s workflow wrote the individual outputs of every job to a single directory, which would have caused metadata issues, so ARCC developed a way to create a new directory for each set of 5,000 jobs to reduce the impact.
- User Support then met with Megan and instructed her on how to use this workflow.
- Megan proceeded slowly, going from tens to hundreds to thousands of jobs at a time, until she was able to complete her work.
- This went unnoticed by other users and kept the impact on others to a minimum.
- In the middle of the work, IT security informed us that Villanova University, near Philadelphia, PA, suspected UWyo of a DDoS attack because of the many calls Megan’s jobs were making to its published PHOEBE table.
- ARCC informed Security and Villanova that this was a legitimate science workflow and not a malicious attack.
- Over a period of 20 days, Megan ran 1.2 million jobs on MedicineBow without any others noticing, showing that collaboration between ARCC and scientists works in everyone’s favor.
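The two techniques in these notes, keeping no more than 5,000 jobs running or pending and writing each set of 5,000 jobs to its own directory, can be sketched in Python as follows. This is a minimal, scheduler-agnostic illustration and not the workflow CRST actually built: `stream_jobs`, `output_dir_for`, and the `submit` and `count_active` callables are hypothetical names, and on a real cluster the two callables would wrap whatever commands the scheduler provides for submitting jobs and counting a user's queued jobs.

```python
import time
from pathlib import Path
from typing import Callable, Iterable

MAX_ACTIVE = 5_000     # cap on running + pending jobs, from the notes above
JOBS_PER_DIR = 5_000   # one output directory per set of 5,000 jobs

def output_dir_for(job_index: int, root: Path,
                   jobs_per_dir: int = JOBS_PER_DIR) -> Path:
    """Map a job index to its batch directory (hypothetical naming scheme)."""
    return root / f"batch_{job_index // jobs_per_dir:05d}"

def stream_jobs(job_indices: Iterable[int],
                submit: Callable[[int, Path], None],
                count_active: Callable[[], int],
                root: Path,
                max_active: int = MAX_ACTIVE,
                jobs_per_dir: int = JOBS_PER_DIR,
                poll_seconds: float = 60.0,
                sleep: Callable[[float], None] = time.sleep) -> int:
    """Submit jobs gradually so running + pending never exceeds max_active.

    `submit` and `count_active` are supplied by the caller (assumptions here);
    they stand in for the site's real scheduler commands.
    """
    jobs = list(job_indices)
    pos = 0
    while pos < len(jobs):
        # Ask the scheduler once per round how many slots are free.
        free = max_active - count_active()
        if free <= 0:
            sleep(poll_seconds)   # queue is full; wait for jobs to finish
            continue
        batch = jobs[pos:pos + free]
        for idx in batch:
            out_dir = output_dir_for(idx, root, jobs_per_dir)
            # Spreading outputs over many directories avoids one huge
            # directory, which is hard on the filesystem's metadata.
            out_dir.mkdir(parents=True, exist_ok=True)
            submit(idx, out_dir)
        pos += len(batch)
    return pos
```

Checking the queue once per round, instead of once per job, keeps the load on the scheduler itself low. The sketch assumes the same user is not submitting other jobs at the same time, since those would also count against the cap.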
This project aims to classify four eclipsing binary systems located in the Large Magellanic Cloud (LMC). These objects are a bit unique in that they showcase a unique reflection effect in their light curves, meaning that the space between the primary and secondary eclipse is sloped rather than flat. We have been utilizing the Physics Of Eclipsing BinariEs (PHOEBE) light curve modeling software on the MedicineBow cluster to fit these unique light curves in order to classify these systems. We believe these systems are detached post-Algols that have evolved through Case AB Roche-lobe overflow.

