| 12481500 | Accelerating linear algebra kernels for any processor architecture | Venmugil Elango, Norm Rubin, Mahesh Ravishankar | 2025-11-25 | |
| 12436876 | Memory management system | Sean Lee, James Clarkson | 2025-10-07 | |
| 12423074 | Neural network layer fusion | Bin Fan, Evghenii Gaburov, Yuan Lin | 2025-09-23 | |
| 12423007 | Techniques for tensor memory allocation | Mahesh Ravishankar, Yuan Lin | 2025-09-23 | |
| 11630653 | Execution of computation graphs | Mahesh Ravishankar, Evghenii Gaburov, Alberto Magni, Sean Lee | 2023-04-18 | $1,172,364,000 |
| 11579852 | Device profiling in GPU accelerators by using host-device coordination | Hariharan Sandanagobalane, Sean Lee | 2023-02-14 | $981,891,000 |
| 10853044 | Device profiling in GPU accelerators by using host-device coordination | Hariharan Sandanagobalane, Sean Lee | 2020-12-01 | $734,055,000 |
| 10324693 | Optimizing multiple invocations of graphics processing unit programs in Java | Michael Lai, Sean Lee, Jaydeep Marathe | 2019-06-18 | $317,094,000 |
| 10241761 | System and method for compiler support for compile time customization of code | Jaydeep Marathe | 2019-03-26 | $184,020,000 |
| 10152310 | Fusing a sequence of operations through subdividing | Mahesh Ravishankar, Paulius Micikevicius | 2018-12-11 | $143,375,000 |
| 10152312 | Dynamic compiler parallelism techniques | Thibaut Lutz | 2018-12-11 | $143,375,000 |
| 10067768 | Execution of divergent threads using a convergence barrier | Gregory Diamos, Richard Craig Johnson, Olivier Giroux, Jack Choquette, Michael A. Fetterman +3 more | 2018-09-04 | $382,341,000 |
| 10025643 | System and method for compiler support for kernel launches in device code | Jaydeep Marathe, Sean Lee | 2018-07-17 | $157,822,000 |
| 9952843 | Partial program specialization at runtime | Thibaut Lutz | 2018-04-24 | $155,137,000 |
| 9798569 | System and method for retrieving values of captured local variables for lambda functions in Java | Michael Lai, Sean Lee, Jaydeep Marathe | 2017-10-24 | $203,373,000 |
| 9678775 | Allocating memory for local variables of a multi-threaded program for execution in a single-threaded environment | John A. Stratton | 2017-06-13 | $263,577,000 |
| 9658880 | Efficient garbage collection and exception handling in a hardware accelerated transactional memory system | Jan Gray, Martin Taillefer, Yosseff Levanoni, Ali-Reza Adl-Tabatabai, Dave Detlefs +2 more | 2017-05-23 | $34,380,000 |
| 9639336 | Algorithm for vectorization and memory coalescing during compiling | Manjunath Kudlur, Michael Murphy | 2017-05-02 | $52,442,000 |
| 9612811 | Confluence analysis and loop fast-forwarding for improving SIMD execution efficiency | Amit Sabne, Yuan Lin | 2017-04-04 | $61,571,000 |
| 9563933 | Methods for reducing memory space in sequential operations using directed acyclic graphs | Mahesh Ravishankar | 2017-02-07 | $180,900,000 |
| 9448779 | Execution of retargetted graphics processor accelerated code by a general purpose processor | Bastiaan Aarts, Michael Murphy, Jayant B. Kolhe, John Bryan Pormann, Douglas Saylor | 2016-09-20 | $28,747,000 |
| 9436447 | Technique for live analysis-based rematerialization to reduce register pressures and enhance parallelism | Xiangyun Kong, Jian Wang, Yuan Lin | 2016-09-06 | $15,763,000 |
| 9411715 | System, method, and computer program product for optimizing the management of thread stack memory | Adriana Maria Susnea, Sean Lee | 2016-08-09 | $23,908,000 |
| 9411635 | Parallel nested transactions in transactional memory | Michael M. Magruder, David L. Detlefs, John Duffy, Goetz Graefe | 2016-08-09 | $42,929,000 |
| 9367306 | Method for transforming a multithreaded program for general execution | Jaydeep Marathe | 2016-06-14 | $17,088,000 |