Publications

The papers are provided for personal use only and are subject to the publishers' copyright.

Eric Anger, Sudhakar Yalamanchili, Scott Pakin, and Patrick McCormick. Architecture-Independent Modeling of Intra-Node Data Movement. The LLVM Compiler Infrastructure in HPC Workshop. In conjunction with Supercomputing 2014. November 2014.

Kevin Barker, Darren Kerbyson, and Eric Anger. On the Feasibility of Dynamic Power Steering. Second Workshop on Energy Efficient Supercomputing (E2SC). In conjunction with Supercomputing 2014. November 2014.

Jin Wang and Sudhakar Yalamanchili. Characterization and Analysis of Dynamic Parallelism in Unstructured GPU Applications. 2014 IEEE International Symposium on Workload Characterization (IISWC). October 2014. [paper]

Haicheng Wu, Daniel Zinn, Molham Aref, and Sudhakar Yalamanchili. Multipredicate Join Algorithms for Accelerating Relational Graph Processing on GPUs. The 5th International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures (ADMS). September 2014. [paper]

Hyesoon Kim, Eric Anger, Prasun Gera, Jeremiah J. Wilke, Patrick S. McCormick, and Sudhakar Yalamanchili. Integrated, Application-Level, Performance-Energy Modeling for Heterogeneous Architectures. Workshop on Modeling & Simulation of Systems and Applications (MODSIM). August 2014.

Si Li, Vilas Sridharan, Sudhanva Gurumurthi, Sudhakar Yalamanchili. Software Based Techniques for Reducing the Vulnerability of GPU Applications. Workshop on Dependable GPU Computing (at DATE). March 2014. [paper]

Jin Wang, Norman Rubin, and Sudhakar Yalamanchili. ParallelJS: An Execution Framework for JavaScript on Heterogeneous Systems. Seventh Workshop on General Purpose Processing Using GPUs. March 2014. [paper]

Naila Farooqui, Karsten Schwan, and Sudhakar Yalamanchili. Efficient Instrumentation of GPGPU Applications Using Information Flow Analysis and Symbolic Execution. Seventh Workshop on General Purpose Processing Using GPUs. March 2014. [paper]

Haicheng Wu, Gregory Diamos, Tim Sheard, Molham Aref, Sean Baxter, Michael Garland, and Sudhakar Yalamanchili. Red Fox: An Execution Environment for Relational Query Processing on GPUs. International Symposium on Code Generation and Optimization (CGO). February 2014. [paper]

Si Li, Naila Farooqui, and Sudhakar Yalamanchili. Software Reliability Enhancements for GPU Architectures. Workshop on Programmability Issues for Heterogeneous Multicores (MULTIPROG). January 2013. [paper]

Sudhakar Yalamanchili. Scaling Data Warehousing Applications using GPUs. Second International Workshop on Performance Analysis of Workload Optimized Systems (FastPath-2013), held with ISPASS-2013. April 2013. [slides]

Eric Anger, Gilbert Hendry, Sudhakar Yalamanchili. Modeling Exascale Applications with SST/macro and Eiger (tutorial). IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). April 2013.

Haicheng Wu, Jeffrey Young and Sudhakar Yalamanchili. Satisfying Data-Intensive Queries Using GPU Clusters (poster). GPGPU Technology Conference (GTC-2013). March 2013. [poster]

Jin Wang, Norman Rubin, Haicheng Wu, and Sudhakar Yalamanchili. Accelerating Simulations of Agent Based Models on Heterogeneous Architectures. Workshop on General Purpose Processing on GPUs (GPGPU). March 2013. [paper]

Gregory Diamos, Haicheng Wu, Jin Wang, Ashwin Lele, and Sudhakar Yalamanchili. Efficient Relational Algebra Algorithms and Data Structures for Hierarchical Parallel Processors (short paper). ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). February 2013. [paper]

Andrew Kerr, Eric Anger, Gilbert Hendry, and Sudhakar Yalamanchili. Eiger: A Framework for the Automated Synthesis of Statistical Performance Models. Workshop on Performance Engineering and Applications (WPEA). December 2012. [paper]

Haicheng Wu, Gregory Diamos, Srihari Cadambi, and Sudhakar Yalamanchili. Kernel Weaver: Automatically Fusing Database Primitives for Efficient GPU Computation. Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). December 2012. [paper]

Jeffrey Young, Haicheng Wu, Sudhakar Yalamanchili. Satisfying Data-Intensive Queries Using GPU Clusters. Annual Workshop on High-Performance Computing meets Databases (HPCDB). November 2012. [paper]

Haicheng Wu, Gregory Diamos, Ashwin Lele, Jin Wang, Srihari Cadambi, Sudhakar Yalamanchili, and Srimat Chakradhar. Optimizing Data Warehousing Applications for GPUs using Kernel Fusion/Fission. Workshop on Multicore and GPU Programming Models, Languages and Compilers (PLC). May 2012. [paper]

Andrew Kerr, Gregory Diamos, and Sudhakar Yalamanchili. Dynamic Compilation of Data-Parallel Kernels for Vector Processors. International Symposium on Code Generation and Optimization (CGO). April 2012. [paper]

Naila Farooqui, Andrew Kerr, Greg Eisenhauer, Karsten Schwan, Sudhakar Yalamanchili. Lynx: A Dynamic Instrumentation System for Data-Parallel Applications on GPGPU Architectures. IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). April 2012. [paper]

Haicheng Wu, Gregory Diamos, Jin Wang, Si Li, and Sudhakar Yalamanchili. Characterization and Transformation of Unstructured Control Flow in Bulk Synchronous GPU Applications. International Journal of High Performance Computing Applications, 26(2):170-185. May 2012. [paper]

Jeffrey S. Vetter, Richard Glassbrook, Jack Dongarra, Karsten Schwan, Bruce Loftis, Stephen McNally, Jeremy Meredith, James Rogers, Philip Roth, Kyle Spafford, and Sudhakar Yalamanchili. Keeneland: Bringing heterogeneous GPU computing to the computational science community. IEEE Computing in Science and Engineering, 13(5):90-95. 2011. [paper]

Gregory Diamos, Benjamin Ashbaugh, Subramaniam Maiyuran, Andrew Kerr, Haicheng Wu, Sudhakar Yalamanchili. SIMD Re-Convergence At Thread Frontiers. Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). December 2011. [paper]

Andrew Kerr, Gregory Diamos, Sudhakar Yalamanchili. GPU Application Development, Debugging, and Performance Tuning with GPU Ocelot. GPU Computing GEMS Jade Edition, 1st Edition. September 2011. [paper]

Haicheng Wu, Gregory Diamos, Si Li, and Sudhakar Yalamanchili. Characterization and Transformation of Unstructured Control Flow in GPU Applications. First International Workshop on Characterizing Applications for Heterogeneous Exascale Systems (CACHES). June 2011. [paper]

Naila Farooqui, Andrew Kerr, Gregory Diamos, Sudhakar Yalamanchili, and Karsten Schwan. A Framework for Dynamically Instrumenting GPU Compute Applications within GPU Ocelot. Workshop on General Purpose Processing Using GPUs (GPGPU). March 2011. [paper]

Gregory Diamos, Andrew Kerr, Sudhakar Yalamanchili, and Nathan Clark. Ocelot: A Dynamic Compiler for Bulk-Synchronous Applications in Heterogeneous Systems. International Conference on Parallel Architectures and Compilation Techniques (PACT). September 2010. [paper]

Gregory Diamos and Sudhakar Yalamanchili. Speculative Execution On Multi-GPU Systems. IEEE International Parallel & Distributed Processing Symposium (IPDPS). April 2010. [paper]

Andrew Kerr, Gregory Diamos, and Sudhakar Yalamanchili. Modeling GPU-CPU Workloads and Systems. Workshop on General Purpose Processing Using GPUs (GPGPU). March 2010. [paper]

Sudnya Padalikar and Gregory Diamos. Exploring The Latency and Bandwidth Tolerance of CUDA Applications. NFinTes Tech Report. December 2009. [paper]

Andrew Kerr, Gregory Diamos, and Sudhakar Yalamanchili. A Characterization and Analysis of PTX Kernels. IEEE International Symposium on Workload Characterization (IISWC). October 2009. [paper]

Gregory Diamos, Andrew Kerr, Mukil Kesavan. Translating GPU Binaries to Tiered Many-Core Architectures with Ocelot. CERCS Tech Report. January 2009. [paper]

Gregory Diamos and Sudhakar Yalamanchili. Harmony: A Flexible Runtime for Heterogeneous Many Core Architectures. International ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC). June 2008. [paper]