Advanced Research Computing Center
University of Wyoming
Office of Research & Economic Development
Department 3355
1000 E. University Ave.
Laramie, WY 82071
Phone: 307-766-7748
Email: arcc-info@uwyo.edu
September 14th, 2020 - When he started his senior design project as an undergraduate, Damir Pulatov knew that he needed more computational power than his laptop could provide. That is when he contacted the Advanced Research Computing Center (ARCC) for help. Today, Damir’s research is challenging ARCC’s ability to stay ahead of him. Working under the guidance of Assistant Professor Lars Kotthoff, Damir has turned that senior design project into his Ph.D. research on Algorithm Selection for Artificial Intelligence (AI) Problems, using ARCC resources.
As he explains it, “When you are trying to solve AI problems there is usually a choice of algorithms to choose from.” He goes on to say, “In practice, these algorithms may have complementary performance. What that means, for example, one algorithm, say algorithm ‘A’ might work really, really well on these types of problems, but algorithm ‘B’ may work really, really well on other types of problems... If you had a thousand problems to solve, but thirty algorithms to choose from, the most intuitive thing to do is to pick one algorithm that works well in a general case, but as I said, some algorithms have complementary performance.” In other words, it makes sense to choose an algorithm for each problem based on the kind of problem it is best suited to solve, rather than using one algorithm for everything.
It’s all about efficiency, he clarifies: “It matters because if you choose a good algorithm you might solve the problem in a second, but if you choose a bad algorithm it could take you hours, days, weeks, and even years. And what I am doing is creating algorithm selectors that try to intelligently choose what software, the algorithm to use, and in what case... to solve problems quicker, better, and have better solutions. Basically how to better solve AI problems.” He went on to say that his research could improve computational efficiency not just in how fast problems get solved but also in how much power the system uses, which would reduce cooling costs and free up compute resources that others can use to perform even more research.
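To make the idea of complementary performance concrete, here is a minimal Python sketch. Everything in it is hypothetical: the runtimes, the algorithm names "A" and "B", and the problem names are invented for illustration and are not taken from Damir’s research. It compares running the single algorithm that is best overall against choosing the fastest algorithm for each problem. Note that this per-problem choice looks up the known runtimes, so it only shows the best case a selector could reach; a real algorithm selector has to predict the choice from features of the problem before running anything.

```python
# Hypothetical runtimes in seconds, indexed as runtimes[algorithm][problem].
# These numbers are made up to illustrate complementary performance.
runtimes = {
    "A": {"p1": 1.0, "p2": 2.0, "p3": 3600.0, "p4": 7200.0},
    "B": {"p1": 1800.0, "p2": 2400.0, "p3": 1.5, "p4": 3.0},
}


def total_runtime_single(algorithm):
    """Total time if one algorithm is used for every problem."""
    return sum(runtimes[algorithm].values())


def single_best():
    """The one algorithm with the lowest total runtime over all problems."""
    return min(runtimes, key=total_runtime_single)


def per_problem_choice():
    """For each problem, the algorithm with the lowest known runtime."""
    problems = next(iter(runtimes.values())).keys()
    return {p: min(runtimes, key=lambda a: runtimes[a][p]) for p in problems}


def total_runtime_selected(choice):
    """Total time when each problem is solved by its chosen algorithm."""
    return sum(runtimes[a][p] for p, a in choice.items())


best = single_best()
choice = per_problem_choice()
print("Single best algorithm:", best, total_runtime_single(best), "seconds")
print("Per-problem choice:", choice, total_runtime_selected(choice), "seconds")
```

With these invented numbers, the best single algorithm still spends more than an hour in total, while the per-problem choice finishes in seconds. That gap is what an algorithm selector tries to close.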
Damir uses ARCC’s high performance computing (HPC) system, Teton, to accomplish this work. “This type of research would be impossible without HPC,” he says. He uses the system both to run many simulations that generate data for training AI models and to analyze their performance. And he uses Teton a lot. From May 1st to September 1st, 2020, he totaled 3,167,265 core hours on Teton. That is the equivalent of 361 years if he were limited to a single core! To put it in perspective, a reasonably powerful desktop computer might have about eight cores in its central processing unit (CPU), and if Damir did his research on such a desktop using all eight cores, it would take about 45 years to do the same amount of work he did in four months on Teton.
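The conversion from core hours to years can be checked with a few lines of Python. This is a simple sketch that assumes a 365-day year and perfect scaling across cores (eight cores finish exactly eight times faster than one); the function name is made up for this illustration.

```python
CORE_HOURS = 3_167_265        # total reported in the article
HOURS_PER_YEAR = 24 * 365     # assumption: 365-day year, running nonstop


def years_on_cores(core_hours, cores):
    """Years needed to burn core_hours on a machine with `cores` cores,
    assuming the work scales perfectly across cores."""
    return core_hours / (cores * HOURS_PER_YEAR)


single_core_years = years_on_cores(CORE_HOURS, 1)
desktop_years = years_on_cores(CORE_HOURS, 8)

print(f"Single core: {single_core_years:.1f} years")
print(f"Eight-core desktop: {desktop_years:.1f} years")
```

Rounded down, the results match the 361 and 45 years quoted above.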
Because Damir is analyzing algorithms, system failures caused by his analyses are an accepted outcome. At one point this summer, he was using nearly 50% of Teton’s approximately 500 compute nodes and caused all of the nodes he was using to crash. Instead of reprimanding Damir for crashing half the system, ARCC worked with him to isolate a set of around 30 compute nodes that he could use, limiting the impact his work would have on other users. He details what it’s like working with ARCC: “The staff are so friendly and nice. I had to ask for help with making my jobs run faster, and they met with me to go through my code and how I was submitting jobs to improve my performance.” His advisor also told him that at some other institutions this is much more difficult: “He said they would probably ban me from the system for causing that failure and I am appreciative that didn’t happen with ARCC.”
Damir’s work is funded by a National Science Foundation grant in the Division of Information & Intelligent Systems titled “Robust Performance Models.”
Damir is currently working on publishing this research and, at the time of this writing, is close to submitting it to a journal.