Technical challenges of supporting interactive HPC

Albert Reuther, Jeremy Kepner, Andy McCabe, Julie Mullen, Nadya T. Bliss, Hahn Kim

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

9 Scopus citations

Abstract

Users' demand for interactive, on-demand access to a large pool of high performance computing (HPC) resources is increasing. The majority of users at Massachusetts Institute of Technology Lincoln Laboratory (MIT LL) are involved in the interactive development of sensor processing algorithms. This development often requires a large amount of computation due to the complexity of the algorithms being explored and/or the size of the data set being analyzed. These researchers also require rapid turnaround of their jobs because each iteration directly influences code changes made for the following iteration. Historically, batch queue systems have not been a good match for this kind of user. The Lincoln Laboratory Grid (LLGrid) system at MIT LL is the largest dedicated interactive, on-demand HPC system in the world. While the system also accommodates some batch queue jobs, the vast majority of jobs submitted are interactive, on-demand jobs. Choosing between running a system with a batch queue or in an interactive, on-demand manner involves tradeoffs. This paper discusses the tradeoffs between operating a cluster as a batch system, an interactive, on-demand system, or a hybrid system. The LLGrid system has been operational for over three years, and now serves over 200 users from across Lincoln. The system has run over 100,000 interactive jobs. It has become an integral part of many researchers' algorithm development workflows. For instance, in batch queue systems, an individual user commonly can gain access to 25% of the processors in the system after the job has waited in the queue; in our experience with on-demand, interactive operation, individual users often can also gain access to 20-25% of the cluster processors. This paper will share a variety of new data on our experiences with running an interactive, on-demand system that also provides some batch queue access.

Original language: English (US)
Title of host publication: Department of Defense - Proceedings of the HPCMP Users Group Conference 2007; High Performance Computing Modernization Program
Subtitle of host publication: A Bridge to Future Defense, DoD HPCMP UGC
Pages: 403-409
Number of pages: 7
DOIs
State: Published - Dec 1 2007
Externally published: Yes
Event: Department of Defense - HPCMP Users Group Conference 2007; High Performance Computing Modernization Program: A Bridge to Future Defense, DoD HPCMP UGC - Pittsburgh, PA, United States
Duration: Jun 18 2007 to Jun 21 2007

Publication series

Name: Department of Defense - Proceedings of the HPCMP Users Group Conference 2007; High Performance Computing Modernization Program: A Bridge to Future Defense, DoD HPCMP UGC

Other

Other: Department of Defense - HPCMP Users Group Conference 2007; High Performance Computing Modernization Program: A Bridge to Future Defense, DoD HPCMP UGC
Country/Territory: United States
City: Pittsburgh, PA
Period: 6/18/07 to 6/21/07

Keywords

  • Cluster computing
  • Grid computing
  • Interactive high performance computing
  • On-demand
  • Parallel MATLAB

ASJC Scopus subject areas

  • Computer Science (all)
  • Software
