May 15, 2013 — Released 05.14.12 from http://www.ncsa.illinois.edu/News/Stories/speedkeys/
by Elizabeth Murray
Solving their code’s I/O problems leads to faster performance and a new step forward in seismological research for University of Wyoming researchers.
Blue Waters could be the key to understanding the connections between Earth’s deep structure and seismic energy. Often, one of the major bottlenecks in this sort of research is the large amount of data it produces. A research team at the University of Wyoming has developed a scalable code to process this data more efficiently.
Liqiang Wang and Po Chen found that no matter the method for running their codes, the input/output (I/O) was too great; just reading and accessing their data became an issue. That is when the team turned to Blue Waters, a sustained petascale powerhouse.
“For improvements in the performance of our code, it has to come from the hardware, and Blue Waters is ideal for us to improve because of the level of I/O,” says Chen. “It is the most important factor for us, in fact.”
Chen goes on to explain that the calculations they are running on the supercomputer are part of a very high-resolution seismic tomography project. The region of southern California accounts for more than 25% of the total seismic hazard in the U.S. today. Through their runs on the supercomputer, the team hopes to determine a 3D structural model for this large concentration of seismic energy.
As a faculty member in computer science, Wang is no stranger to the data-intensive parallel computing model and how it could benefit their research. The two have turned to other supercomputers in the past to run their codes, but as Wang explains, “Blue Waters gives us a lot of potential to use quite a lot of GPUs and CPUs together and allows us to do much larger scale computation.”
Since gaining access to Blue Waters after the first of the year, the team has already experienced improved I/O performance. NCSA systems engineer Galen Arnold has been working closely with the team on their code from the beginning.
“We continue to make progress with the team. The performance gain is now at a 50 times speed-up when reading a 1-terabyte file, which made their application run 16 times faster overall,” reports Arnold.
This improved performance is due in part to a technique known as file striping. A key feature of the Lustre file system on Blue Waters is its ability to distribute segments of a single file across multiple Object Storage Targets (OSTs). When a file is striped, read and write operations access multiple OSTs concurrently; this simultaneous access increases the available I/O bandwidth and hence the I/O performance.
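A rough sketch of the idea: under round-robin striping, consecutive chunks of a file land on different storage targets, so a large read can pull from several targets at once. (This is an illustration only, not Lustre’s actual implementation; the stripe size and count below are arbitrary.)

```python
def ost_for_offset(offset, stripe_size, stripe_count):
    """Which object storage target (OST) holds the byte at `offset`,
    assuming simple round-robin striping across `stripe_count` OSTs.
    Illustrative only -- not Lustre's real layout logic."""
    stripe_index = offset // stripe_size
    return stripe_index % stripe_count

# With 1 MiB stripes across 4 OSTs, consecutive megabytes of the file
# land on different targets, so reads of a large file can proceed
# from all 4 OSTs in parallel.
MiB = 1 << 20
placements = [ost_for_offset(i * MiB, MiB, 4) for i in range(8)]
print(placements)  # [0, 1, 2, 3, 0, 1, 2, 3]
```

Because adjacent stripes cycle through all the OSTs, a 1-terabyte sequential read like the team’s is spread over every target rather than funneled through one, which is where the added bandwidth comes from.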
Racing for results
By improving the I/O performance, the team went from being I/O bound to being CPU bound. This switch may seem minor, but it is actually a very crucial step in their research and changes how they will interact with Blue Waters moving forward. The team can now focus a majority of their effort on compute time, spending more time working within their code and application.
Arnold compares it to a racecar competing in NASCAR, where time spent in the pit is like time spent focusing on your I/O and time spent on the track is like time spent computing. Pit time is essential to the success or failure of the performance of the racecar—the pit crew keeps the car running—but it is time spent on the track that in the end will truly win the race. The more efficient and productive a pit crew can be, the less time the car has to spend in the pit. Consequently, the more time the driver can spend on a track completing laps, the more likely it is for car and driver to come out ahead of the pack.
NCSA is the pit crew to the researchers’ racecar.
The shift in focus from I/O to computing has been essential to the performance of this team’s code and application. The less time spent waiting on I/O (the machine performing its part), the more time the research team has to run its application and compute. That time has made all the difference in the performance gains. Nonetheless, it isn’t just about getting the car out onto the track; it’s about teaching the driver to get the most out of the time on the track.
Speed is the end game, and this race isn’t over.
NCSA senior research programmer Dahai Guo is working with Wang and Chen to get the most out of their time on the track. Guo specializes in sparse matrix-vector multiplication (SpMV) and knows that getting a greater handle on this computational kernel is the key to optimizing the team’s compute time on Blue Waters.
SpMV is an important and heavily used kernel in scientific computing. It is beneficial, and often necessary, to use specialized algorithms and data structures that take advantage of the sparse structure of the matrix when storing and manipulating the data within. This requires attention to memory access patterns, data structures, and cache-line utilization. Depending on the number and distribution of the non-zero entries, substantial reductions in memory requirements can be realized by storing only those entries.
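As a minimal sketch of what such a format buys, here is SpMV over the widely used compressed sparse row (CSR) layout, written in plain Python for clarity. (The team’s GPU kernels use other, hybrid formats; this example is only meant to show how storing just the nonzeros works.)

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a sparse matrix A in CSR format.
    `values` holds only the nonzero entries, `col_idx` their columns,
    and `row_ptr[i]:row_ptr[i+1]` delimits the nonzeros of row i."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# A = [[4, 0, 1],
#      [0, 2, 0],
#      [0, 3, 5]]  -> only 5 of the 9 entries are stored
values  = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 1, 2]
row_ptr = [0, 2, 3, 5]
print(csr_spmv(values, col_idx, row_ptr, [1.0, 1.0, 1.0]))  # [5.0, 2.0, 8.0]
```

Every multiply-add touches only a stored nonzero, so both the memory footprint and the work scale with the number of nonzeros rather than with the full matrix size.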
Guo says they aren’t there yet, but early SpMV testing is already showing positive results.
“With a hybrid format, the GPU result showed a decent performance improvement for the sorted matrix with increased row length,” he notes.
Overall, with their application already running faster, the team can solve bigger problems. According to Chen, this means resolving geological structures down to a 10- to 100-meter scale in a region more than 300 kilometers by 300 kilometers.
“That was not possible for any seismology project that I have seen, or that I myself have done before,” says Chen. “It is a new step forward.”
These types of results could be used to better predict how strong the ground shaking will be at different points throughout the at-risk area. Chen points out a successful structural model could “potentially save a lot of money and also potentially save a lot of lives.”
Wang also hopes that this access to heterogeneous computing will let him tackle the challenge of designing even more scalable codes and look to the future of their research.
“When we can run our code, we can get more experience with that, and we can start developing and expanding our research topics based on our time with Blue Waters,” Wang elaborates.
It is possible these 3D models will lead to a better understanding of deep Earth. In some cases, regions with greater risks for seismic activity are also rich in minerals.
“We also hope to adapt this technique for other purposes, not just limiting it to seismic hazards, but to hydrocarbon extraction in both a more economical and environmentally friendly way,” says Chen.