
Compiler-assisted toolkit for custom instrumentation of CUDA code at the assembly level, similar to PIN, ATOM, DynInst, et al.

See also this tutorial slide deck from MICRO 2015 (PPTX), which includes a good refresher on GPU architecture.

  1.