hpctoolkit



HPCToolkit and Related Publications

Selected Overview Paper

[1]
Laksono Adhianto, Sinchan Banerjee, Mike Fagan, Mark Krentel, Gabriel Marin, John Mellor-Crummey, and Nathan R. Tallent. HPCToolkit: Tools for performance analysis of optimized parallel programs. Concurrency and Computation: Practice and Experience, 22(6):685–701, 2010. (PDF) (doi:10.1002/cpe.1553)

HPCToolkit Papers

[0]
Milind Chabbi, Karthik Murthy, Mike Fagan, and John Mellor-Crummey. Critically Missing Pieces on Accelerators: A Performance Tools Perspective. SC '13: Birds of a Feather Session: Critically Missing Pieces in Heterogeneous Accelerator Computing, Pavan Balaji (Organizer). Supercomputing, 2013. (PDF)
[1]
Nathan R. Tallent, John M. Mellor-Crummey, Michael Franco, Reed Landrum, and Laksono Adhianto. Scalable fine-grained call path tracing. In ICS '11: Proc. of the 25th International Conference on Supercomputing, 2011. (PDF)
[2]
Xu Liu and John Mellor-Crummey. Pinpointing data locality problems using data-centric analysis. In CGO '11: Proc. of the 2011 International Symposium on Code Generation and Optimization, 2011. To appear.
[3]
Nathan R. Tallent, Laksono Adhianto, and John M. Mellor-Crummey. Scalable identification of load imbalance in parallel executions using call path profiles. In SC '10: Proc. of the 2010 ACM/IEEE Conference on Supercomputing, 2010. (PDF)
[4]
Laksono Adhianto, John Mellor-Crummey, and Nathan R. Tallent. Effectively presenting call path profiles of application performance. In PSTI 2010: Workshop on Parallel Software Tools and Tool Infrastructures, in conjunction with the 2010 International Conference on Parallel Processing, 2010. (PDF)
[5]
Laksono Adhianto, Sinchan Banerjee, Mike Fagan, Mark Krentel, Gabriel Marin, John Mellor-Crummey, and Nathan R. Tallent. HPCToolkit: Tools for performance analysis of optimized parallel programs. Concurrency and Computation: Practice and Experience, 22(6):685–701, 2010. (PDF) (doi:10.1002/cpe.1553)
[6]
Nathan R. Tallent, John M. Mellor-Crummey, and Allan Porterfield. Analyzing lock contention in multithreaded applications. In PPoPP '10: Proc. of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 269–280, New York, NY, USA, 2010. ACM. (PDF) (doi:10.1145/1693453.1693489)
[7]
Nathan R. Tallent and John M. Mellor-Crummey. Identifying performance bottlenecks in work-stealing computations. Computer, 42(12):44–50, 2009. (doi:10.1109/MC.2009.396)
[8]
Nathan R. Tallent, John M. Mellor-Crummey, Laksono Adhianto, Michael W. Fagan, and Mark Krentel. Diagnosing performance bottlenecks in emerging petascale applications. In SC '09: Proc. of the 2009 ACM/IEEE Conference on Supercomputing, pages 1–11, New York, NY, USA, 2009. ACM. (PDF) (doi:10.1145/1654059.1654111)
[9]
Nathan R. Tallent, John Mellor-Crummey, and Michael W. Fagan. Binary analysis for measurement and attribution of program performance. In PLDI '09: Proc. of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 441–452, New York, NY, USA, 2009. ACM. Distinguished Paper. (PDF) (doi:10.1145/1542476.1542526)
[10]
Robert Fowler, Laksono Adhianto, Bronis de Supinski, Michael Fagan, Todd Gamblin, Mark Krentel, John Mellor-Crummey, Martin Schulz, and Nathan Tallent. Frontiers of performance analysis on leadership-class systems. Journal of Physics: Conference Series, 180:012041 (6pp), 2009.
[11]
Nathan R. Tallent and John Mellor-Crummey. Effective performance measurement and analysis of multithreaded applications. In PPoPP '09: Proc. of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 229–240, New York, NY, USA, 2009. ACM. (PDF) (doi:10.1145/1504176.1504210)
[12]
L. Adhianto, M. Fagan, M. Krentel, G. Marin, J. Mellor-Crummey, and N. R. Tallent. HPCToolkit: Performance measurement and analysis for supercomputers with node-level parallelism. In Workshop on Node Level Parallelism for Large Scale Supercomputers, in conjunction with Supercomputing 2008, November 2008.
[13]
Nathan Tallent, John Mellor-Crummey, Laksono Adhianto, Mike Fagan, and Mark Krentel. HPCToolkit: Performance tools for scientific computing. Journal of Physics: Conference Series, 125:012088 (5pp), 2008.
[14]
John Mellor-Crummey and Nathan R. Tallent. A methodology for accurate, effective and scalable performance analysis of application programs. In Workshop on Tools, Infrastructures and Methodologies for the Evaluation of Research Systems, in conjunction with the 2008 IEEE International Symposium on Performance Analysis of Systems and Software, pages 4–11, February 2008.
[15]
John Mellor-Crummey, Nathan R. Tallent, Mike Fagan, and Jan Odegard. Application performance profiling on the Cray XD1 using HPCToolkit. In Proc. of the Cray User's Group, May 2007.
[16]
Cristian Coarfa, John Mellor-Crummey, Nathan Froyd, and Yuri Dotsenko. Scalability analysis of SPMD codes using expectations. In ICS '07: Proc. of the 21st International Conference on Supercomputing, pages 13–22, New York, NY, USA, 2007. ACM. (doi:10.1145/1274971.1274976)
[17]
Nathan Froyd, Nathan Tallent, John Mellor-Crummey, and Robert Fowler. Call path profiling for unmodified, optimized binaries. In GCC Summit '06: Proc. of the GCC Developers' Summit, 2006, pages 21–36, 2006.
[18]
Nathan Froyd, John Mellor-Crummey, and Rob Fowler. Low-overhead call path profiling of unmodified, optimized code. In Proc. of the 19th International Conference on Supercomputing, pages 81–90, New York, NY, USA, 2005. ACM. (PDF) (doi:10.1145/1088149.1088161)
[19]
John Mellor-Crummey, Robert Fowler, Gabriel Marin, and Nathan Tallent. HPCView: A tool for top-down analysis of node performance. The Journal of Supercomputing, 23(1):81–104, 2002. (PDF) (doi:10.1023/A:1015789220266)
[20]
John Mellor-Crummey, Robert Fowler, and David Whalley. Tools for application-oriented performance tuning. In ICS '01: Proc. of the 15th International Conference on Supercomputing, pages 154–165, New York, NY, USA, 2001. ACM. (PDF) (doi:10.1145/377792.377826)

HPCToolkit Talks and Posters

[0]
John Mellor-Crummey. Introduction to Correctness and Performance Tools for Parallel Programming. The International Conference for High Performance Computing, Networking, Storage and Analysis (SC14). Experiencing HPC for Undergraduates: Introduction to HPC Research. New Orleans, LA. November, 2014.
[1]
John Mellor-Crummey. HPCToolkit: Sampling-based performance tools for leadership computing. Productivity Tools for Leadership Science Workshop, Argonne Leadership Computing Facility Winter Workshop Series, January 2011.
[2]
Nathan R. Tallent. Performance analysis for parallel programs: From multicore to petascale. Supercomputing 2010 George Michael HPC Fellow Presentation, November 2010.
[3]
John Mellor-Crummey. Gaining insight into parallel program performance using sampling. IBM T. J. Watson Research Center, October 2010.
[4]
John Mellor-Crummey. A slice of CScADS: Performance tools for petascale platforms. SciDAC 2010, July 2010.
[5]
Nathan R. Tallent. Identifying scalability bottlenecks in large-scale parallel programs using HPCToolkit. In Jesus Labarta, Barton P. Miller, Bernd Mohr, and Martin Schulz, editors, Program Development for Extreme-Scale Computing, number 10181 in Dagstuhl Seminar Proceedings, Dagstuhl, Germany, 2010. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Germany.
[6]
John Mellor-Crummey. HPCToolkit: Sampling-based performance tools for leadership computing. INCITE Getting Started Workshop, Argonne Leadership Computing Facility, January 2010.
[7]
Nathan R. Tallent. Performance analysis of parallel programs: From multicore to petascale. Supercomputing 2009 Doctoral Research Showcase, November 2009.
[8]
L. Adhianto, S. Banerjee, M. Fagan, M. Krentel, G. Marin, J. Mellor-Crummey, and N. R. Tallent. HPCToolkit: Performance tools for scientific computing. In SC '08: Proc. of the 2008 ACM/IEEE Conference on Supercomputing, New York, NY, USA, 2008. ACM.
[9]
John Mellor-Crummey, Robert Fowler, and Nathan R. Tallent. Practical application performance analysis on Linux systems. Supercomputing 2004 Tutorial, November 2004.
[10]
John Mellor-Crummey. HPCToolkit: Multi-platform tools for profile-based performance analysis. 5th International Workshop on Automatic Performance Analysis (APART), November 2003. (PDF)
[11]
Nathan Froyd, John Mellor-Crummey, and Nathan R. Tallent. A sample-driven call stack profiler. 4th Symposium of the Los Alamos Computer Science Institute (LACSI 2003), October 2003.
[12]
Nathan R. Tallent. HPCToolkit: Top-down analysis of node performance. 2003 MCS Divisional Seminars and Colloquia, Argonne National Laboratory, August 2003.
[13]
John Mellor-Crummey, Robert Fowler, and David Whalley. On providing useful information for analyzing and tuning applications. In SIGMETRICS '01: Proc. of the 2001 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, pages 332–333, New York, NY, USA, 2001. ACM. (PDF) (doi:10.1145/378420.378828)

Performance Modeling and Prediction Papers

[1]
G. Marin and J. Mellor-Crummey. Application insight through performance modeling. In IPCCC 2007: Proc. of the 26th IEEE International Performance, Computing, and Communications Conference, pages 65–74, April 2007. (PDF) (doi:10.1109/PCCC.2007.358880)
[2]
Apan Qasem, Ken Kennedy, and John Mellor-Crummey. Automatic tuning of whole applications using direct search and a performance-based transformation system. J. Supercomput., 36(2):183–196, 2006. (PDF) (doi:10.1007/s11227-006-7957-2)
[3]
Gabriel Marin and John Mellor-Crummey. Scalable cross-architecture predictions of memory hierarchy response for scientific applications. In Proc. of the Sixth Annual Los Alamos Computer Science Institute Symposium, 2005. (PDF)
[4]
Gabriel Marin and John Mellor-Crummey. Cross-architecture performance predictions for scientific applications using parameterized models. In SIGMETRICS '04: Proc. of the Joint International Conference on Measurement and Modeling of Computer Systems, pages 2–13, New York, NY, USA, 2004. ACM. (PDF) (doi:10.1145/1005686.1005691)


[Page last updated: 2011/06/25]


©2000-2017 Rice University • Rice Computer Science