*OpenXLA Community Meeting - March 2023: Abstract*

The March 2023 OpenXLA community meeting showcased a range of developments and ongoing discussions within the project. Key highlights included:

* *PJRT plugin advancements:* Enhanced functionality and broader hardware support, including Apple Silicon integration for JAX.
* *StableHLO Quantizer:* A new project addressing quantization scalability challenges and promoting "write once, run everywhere" quantization.
* *Shardy partitioner:* A new partitioner merging the strengths of GSPMD and PartIR, offering improved user control, debuggability, and advanced features.
* *Batch dimensions for Gather and Scatter:* A proposal to improve expressiveness and enable efficient sharding of batched operations.
* *Composite op:* A new op enabling experimentation with novel ML abstractions while preserving backend compatibility.
* *Active RFCs:* Numerous RFCs covering precision configuration for dot ops, new MHLO features, hybrid quantization, and StableHLO v1.0 compatibility.

The meeting demonstrated a vibrant and growing OpenXLA community, with active contributions and collaborative efforts driving innovation in the ML compiler ecosystem.

*OpenXLA Community Meeting - March 2023*

*Introductions*
* *0:05:* Elliot, the new Technical Lead (TL) for OpenXLA at Google, introduces himself and outlines his focus on technical direction, roadmap organization, and community process improvement.

*Agenda*
* *1:27:* A brief reminder that OpenXLA is an open-source, state-of-the-art ML compiler ecosystem built in collaboration with various partners.

*PJRT Blog Post*
* *1:47:* Aman discusses the recently published blog post about the PJRT plugin, covering its functionality, creation process, and discovery by frameworks.
* *2:21:* Updates to the PJRT API are highlighted, including versioning, API compatibility, and multi-node and DLPack support.
* *2:42:* Apple's adoption of PJRT for Apple Silicon support in JAX is showcased, detailing the generation of StableHLO, lowering to MPS graphs, and integration into a Metal plugin.
* *3:07:* The blog post emphasizes the broad range of hardware targets using PJRT, including Intel Max GPUs, Google Cloud TPUs, NVIDIA GPUs, and Apple Silicon.

*Technical Updates*

*StableHLO Quantizer*
* *4:48:* J from Google's Model Optimization project introduces the StableHLO Quantizer project and seeks community feedback, interest, and collaboration opportunities.
* *5:41:* The project aims to address the scalability problems of quantization solutions that are tied to specific hardware or ML frameworks.
* *7:53:* By operating on StableHLO graphs with a uniform quantization representation, the project promotes "write once, run everywhere" quantization.
* *8:21:* The current implementation is integrated into the Cloud TPU inference converter and TFLite, enabling quantization across mobile and server environments.
* *10:28:* Open-sourcing the project is planned to strengthen the overall quantization ecosystem.

*Shardy Partitioner*
* *15:50:* Tom and Dom from DeepMind present Shardy, a new partitioner combining the best features of GSPMD and PartIR (an MLIR-based partitioner).
* *16:23:* Key features include priorities for controlling sharding propagation, intermediate sharding annotations for fine-grained control, and improved user control and debuggability.
* *17:21:* Developed in MLIR and planned for open-sourcing within OpenXLA, Shardy aims to be dialect-agnostic to ease integration with other compiler infrastructure.
* *21:00:* The presentation details the sharding API, including mesh representation, sharding annotations, axis splitting, and priorities.
* *25:22:* Additional features such as shard-as/shard-like and manual computation are explained.
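To make the mesh and sharding-annotation concepts above concrete: partitioning an array over a logical device mesh just means slicing each mapped array dimension along one mesh axis and replicating the unmapped dimensions. Here is a toy numpy sketch of that idea; the `shard` helper, its signature, and the axis names are invented for illustration and are not the partitioner's real API.

```python
import numpy as np

def shard(x, mesh_shape, dim_to_axis):
    """Split array `x` across a logical device mesh (toy illustration).

    mesh_shape:  size of each mesh axis, e.g. {"data": 2, "model": 3}.
    dim_to_axis: maps array dimensions to mesh axis names; unmapped
                 dimensions are replicated on every device.
    Returns a dict from mesh coordinates to the shard held there.
    """
    shards = {(): x}  # start with one (unsharded) piece, keyed by mesh coords
    for dim, axis in dim_to_axis.items():
        n = mesh_shape[axis]
        shards = {
            coords + (i,): piece
            for coords, whole in shards.items()
            for i, piece in enumerate(np.split(whole, n, axis=dim))
        }
    return shards

# A [4, 6] array on a 2x3 mesh: dim 0 sharded over "data", dim 1 over "model".
x = np.arange(24).reshape(4, 6)
shards = shard(x, {"data": 2, "model": 3}, {0: "data", 1: "model"})
assert len(shards) == 6                       # one shard per mesh coordinate
assert shards[(0, 0)].shape == (2, 2)         # each device holds a [2, 2] tile
```

A real partitioner propagates such annotations through the whole program and inserts the necessary collectives; this sketch only shows the data layout that a mesh annotation implies.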
*Batch Dimensions for Gather and Scatter*
* *32:17:* Tom proposes adding batch dimensions to the Gather and Scatter operations in StableHLO, aiming for simpler batching, easier partitioning, and improved expressiveness.
* *33:26:* A JAX example demonstrates the current limitations of Gather in handling batch dimensions, highlighting the need to represent them explicitly.
* *35:50:* The proposal draws inspiration from DotGeneral in XLA, which preserves batch-dimension information during vectorization.
* *36:29:* The solution introduces batching-dimension attributes on Gather and Scatter, similar to DotGeneral, enabling efficient sharding propagation.

*Composite Op*
* *45:20:* Michael from the StableHLO team introduces the new Composite op, designed to support experimentation with novel ML abstractions.
* *45:48:* A Composite op pairs a high-level operation with a decomposition into simpler ops, ensuring backend compatibility and enabling future inclusion in StableHLO if widely adopted.
* *46:34:* An example demonstrates the structure and usage of the Composite op, including its name, operands, and a reference to its decomposition function.
* *47:39:* The StableHLO LegalizeCompositeToCall pass lets backends choose between handling a composite directly and expanding it into its decomposition.

*Active RFCs*
* *52:09:* Elliot provides an overview of active RFCs, indicating significant community engagement and progress.
* *52:36:* RFCs under discussion include improved precision configuration for dot ops, new MHLO features (a Tan op, CustomCall with dictionary attributes, and variadic collectives), hybrid quantization, and ODML compatibility for StableHLO v1.0.

*DevLab and Closing*
* *54:38:* Reminder about the upcoming OpenXLA DevLab on April 25th, with the agenda to be finalized and shared soon.
* *55:11:* Slides, recording, and notes from the meeting will be shared by the end of the week.
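Returning to the Gather/Scatter batching proposal (32:17): a batch dimension is one shared between the operand and the indices, where each example gathers only from its own slice. The numpy sketch below shows those semantics (it is an illustration of the concept, not StableHLO attribute syntax).

```python
import numpy as np

# Batched gather: for each example b in the batch, pick rows of
# params[b] given by indices[b]. The leading dimension is a batch
# dimension shared by operand and indices -- the information the
# proposal would record explicitly in Gather's dimension numbers,
# the same way DotGeneral records its batching dimensions.
params = np.arange(2 * 4 * 3).reshape(2, 4, 3)   # [batch=2, rows=4, features=3]
indices = np.array([[0, 2], [1, 3]])             # [batch=2, picks=2]

# Explicit loop over the batch dimension:
looped = np.stack([params[b][indices[b]] for b in range(2)])

# The same result without the loop, treating axis 0 as a batch dimension:
vectorized = np.take_along_axis(params, indices[:, :, None], axis=1)

assert looped.shape == (2, 2, 3)
assert np.array_equal(looped, vectorized)
```

Because each example only touches its own slice of `params`, a partitioner that knows about the batch dimension can shard operand and indices along it with no cross-device communication, which is the sharding-propagation benefit the proposal targets.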
I summarized the transcript using Gemini 1.5 Pro (token count: 14,088 / 1,048,576).
@jony7779 (4 months ago)
She says vmap and jvp are already supported function transformations for Pallas kernels. Do I understand correctly, then, that reverse-mode autodiff (i.e. vjp) is not supported right now? I.e., if you write some DL primitive in Pallas, you have to write its grad kernel too?
*Comments*
What does "PJRT" stand for?
Any slides I can find?
Nice work!
Technical Updates starts at kzread.info/dash/bejne/hGaExbSYic-7nrg.html
what's the relation between XLA and Triton?