OpenXLA

XLA GPU Roadmap

XLA CPU Roadmap

PyTorch 💙 XLA

OpenXLA Overview
Comments

  • @wolpumba4099 · 2 months ago

    *OpenXLA Community Meeting - March 2024: Abstract*

    The March 2024 OpenXLA community meeting showcased a range of developments and ongoing discussions within the project. Key highlights:

    * *PJRT plugin advancements:* Enhanced functionality and broader hardware support, including Apple Silicon integration for JAX.
    * *StableHLO Quantizer:* A new project addressing the scalability problems of quantization and promoting "write once, run everywhere" in the ML domain.
    * *Shardy partitioner:* A new partitioner merging the strengths of GSPMD and PartIR, offering improved user control, debuggability, and advanced features.
    * *Batch dimensions for Gather and Scatter:* A proposal to improve expressiveness and enable efficient sharding of batched operations.
    * *Composite op:* Enables experimentation with novel ML abstractions while preserving backend compatibility.
    * *Active RFCs:* Numerous RFCs covering diverse topics, including precision configuration for dot ops, new MHLO features, hybrid quantization, and StableHLO v1.0 compatibility.

    The meeting demonstrated a vibrant and growing OpenXLA community, with active contributions and collaborative efforts driving innovation in the ML compiler ecosystem.

    *Introductions*
    * *0:05:* Elliot, the new Technical Lead (TL) for OpenXLA at Google, introduces himself and outlines his focus on technical direction, roadmap organization, and community process improvement.

    *Agenda*
    * *1:27:* A brief reminder that OpenXLA is an open-source, state-of-the-art ML compiler ecosystem built in collaboration with various partners.

    *PJRT Blog Post*
    * *1:47:* Aman discusses the recently published blog post about the PJRT plugin, covering its functionality, how plugins are created, and how frameworks discover them.
    * *2:21:* Updates to the PJRT API are highlighted, including versioning, API compatibility, and multi-node DLAC support.
    * *2:42:* Apple's adoption of PJRT for Apple Silicon support in JAX is showcased: StableHLO is generated, converted to MPS graphs, and executed through a Metal plugin.
    * *3:07:* The blog post emphasizes the broad range of hardware targets using PJRT, including Intel Max GPUs, Google Cloud GPUs, NVIDIA GPUs, and Apple Silicon.

    *Technical Updates*

    *StableHLO Quantizer*
    * *4:48:* J from Google's Model Optimization project introduces the StableHLO Quantizer and invites community feedback, interest, and collaboration.
    * *5:41:* The project addresses the scalability problems of quantization solutions that are tied to specific hardware or ML frameworks.
    * *7:53:* By operating on StableHLO graphs with a uniform quantization representation, the project promotes "write once, run everywhere" in the ML domain (a small sketch of this representation follows below).
    * *8:21:* The current implementation is integrated into the Cloud TPU inference converter and TFLite, enabling quantization across mobile and server environments.
    * *10:28:* Open-sourcing the project is planned to strengthen the overall quantization ecosystem.
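    For readers unfamiliar with the uniform quantization representation mentioned at 7:53, here is a minimal, hypothetical sketch of the underlying affine int8 mapping in plain Python/NumPy. It illustrates the concept only; it is not the StableHLO Quantizer's actual API.

    ```python
    import numpy as np

    # Affine ("uniform") quantization: real values are mapped to int8 via a
    # scale and zero point. Illustrative only; not the Quantizer's API.

    def quantize(x: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
        """Map float32 values to int8 with an affine (uniform) mapping."""
        q = np.round(x / scale) + zero_point
        return np.clip(q, -128, 127).astype(np.int8)

    def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
        """Recover approximate float32 values from the int8 representation."""
        return (q.astype(np.float32) - zero_point) * scale

    x = np.array([-1.0, 0.0, 0.5, 1.0], dtype=np.float32)
    scale, zero_point = 1.0 / 127.0, 0
    q = quantize(x, scale, zero_point)
    print(q)                                  # e.g. [-127    0   64  127]
    print(dequantize(q, scale, zero_point))   # close to the original floats
    ```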
    *Shardy Partitioner*
    * *15:50:* Tom and Dom from DeepMind present Shardy, a new partitioner combining the best features of GSPMD and PartIR (an MLIR-based partitioner).
    * *16:23:* Key features include priorities for controlling sharding propagation, intermediate sharding annotations for fine-grained control, and improved user control and debuggability.
    * *17:21:* Developed in MLIR and planned for open-sourcing within OpenXLA, Shardy aims to be dialect-agnostic and easy to integrate with other compiler infrastructure.
    * *21:00:* The presentation details the sharding API, including mesh representation, sharding annotations, axis splitting, and priorities (the first sketch after this comment shows what such annotations look like in JAX today).
    * *25:22:* Additional features such as ShardAs/ShardLike and manual computation are explained.

    *Batch Dimensions for Gather and Scatter*
    * *32:17:* Tom proposes adding batch dimensions to Gather and Scatter operations in StableHLO, aiming for simpler batching, easier partitioning, and better expressiveness.
    * *33:26:* A JAX example demonstrates the current limitations of Gather in handling batch dimensions, highlighting the need to represent them explicitly (the second sketch after this comment reproduces the pattern).
    * *35:50:* The proposal draws inspiration from DotGeneral in XLA, which preserves batch-dimension information during vectorization.
    * *36:29:* The solution introduces batching-dimension attributes on Gather and Scatter, similar to DotGeneral, enabling efficient sharding propagation.

    *Composite Op*
    * *45:20:* Michael from the StableHLO team introduces the new Composite op, designed to support experimentation with novel ML abstractions.
    * *45:48:* Composite op allows decomposing complex operations into simpler ones, ensuring backend compatibility and enabling future inclusion in StableHLO if widely adopted.
    * *46:34:* An example demonstrates the structure and usage of Composite op, including its name, operands, and a reference to its decomposition function.
    * *47:39:* The StableHLO LegalizeCompositeToCall pass lets backends choose between handling the composite directly or expanding it into simpler operations.

    *Active RFCs*
    * *52:09:* Elliot gives an overview of active RFCs, indicating significant community engagement and progress.
    * *52:36:* RFCs under discussion include improved precision configuration for dot ops, new MHLO features (Tan op, CustomCall with dictionary, and variadic collectives), hybrid quantization, and ODML compatibility for StableHLO v1.0.

    *DevLab and Closing*
    * *54:38:* Reminder about the upcoming OpenXLA DevLab on April 25th; the agenda will be finalized and shared soon.
    * *55:11:* Slides, recording, and notes from the meeting will be shared by the end of the week.

    I summarized the transcript using Gemini 1.5 Pro. Token count: 14,088 / 1,048,576.
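    To make the mesh-and-annotations discussion at 21:00 concrete, here is a minimal sketch using JAX's existing public sharding API (jax.sharding). Shardy's own syntax is not shown in this summary, so treat this purely as an illustration of the concepts (named mesh axes, per-dimension sharding annotations, propagation):

    ```python
    import numpy as np
    import jax
    import jax.numpy as jnp
    from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

    # "Mesh representation": arrange the available devices in a logical 2D
    # grid with named axes. On a single-device machine this is a 1x1 mesh.
    devices = np.array(jax.devices()).reshape(-1, 1)
    mesh = Mesh(devices, axis_names=("data", "model"))

    # "Sharding annotation": shard dim 0 across the "data" axis, replicate
    # across "model".
    sharding = NamedSharding(mesh, P("data", None))

    x = jnp.arange(16.0).reshape(8, 2)
    x_sharded = jax.device_put(x, sharding)   # place the array accordingly

    # The compiler propagates shardings through operations.
    y = x_sharded * 2
    print(y.sharding)
    ```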
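    And a tiny reconstruction of the batched-gather pattern from 33:26: vmap over a per-example lookup. Today this lowers to a general gather in which the batch dimension is encoded indirectly rather than kept explicit, which is what the proposed batching-dimension attributes would clean up. This is my reconstruction of the kind of example described, not the presenter's exact code:

    ```python
    import jax
    import jax.numpy as jnp

    # Per-example lookup: pick one row of `table` by index.
    def lookup(table, idx):        # table: [V, D], idx: scalar
        return table[idx]          # lowers to a gather

    # Batch with vmap: each batch element has its own table and index.
    batched = jax.vmap(lookup)     # tables: [B, V, D], idxs: [B]

    tables = jnp.arange(24.0).reshape(2, 3, 4)   # B=2, V=3, D=4
    idxs = jnp.array([0, 2])
    out = batched(tables, idxs)                  # shape [B, D] = (2, 4)
    print(out.shape)

    # Inspecting the lowering shows the gather without an explicit
    # batch-dimension attribute:
    print(jax.jit(batched).lower(tables, idxs).as_text())
    ```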

  • @jony7779 · 4 months ago

    She says vmap and jvp are already supported function transformations for Pallas kernels. Do I understand correctly that reverse-mode autodiff (i.e., vjp) is not supported right now? That is, if you write a DL primitive in Pallas, you have to write its grad kernel too?
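    If that reading is right, the usual workaround is jax.custom_vjp: wrap the forward Pallas kernel and register a hand-written backward kernel. A minimal sketch; my_fwd_kernel and my_bwd_kernel are hypothetical stand-ins for pl.pallas_call-based implementations (here they just compute f(x) = x**2 and its VJP):

    ```python
    import jax
    import jax.numpy as jnp

    # Hypothetical stand-ins for Pallas kernels.
    def my_fwd_kernel(x):
        return x * x

    def my_bwd_kernel(x, g):
        return g * 2.0 * x

    @jax.custom_vjp
    def f(x):
        return my_fwd_kernel(x)

    def f_fwd(x):
        # Return the primal output plus residuals for the backward pass.
        return my_fwd_kernel(x), x

    def f_bwd(residual_x, g):
        # Hand-written reverse-mode rule: one cotangent per primal input.
        return (my_bwd_kernel(residual_x, g),)

    f.defvjp(f_fwd, f_bwd)

    x = jnp.array([1.0, 2.0, 3.0])
    print(jax.grad(lambda v: f(v).sum())(x))   # [2. 4. 6.]
    ```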

  • @alexkarl3554 · 4 months ago

    What does "PJRT" stand for?

  • @MbjYjbpivj · 5 months ago

    Any slides I can find?

  • @jueonpark11 · 7 months ago

    Nice work!

  • @PeterHan9606 · 10 months ago

    Technical Updates start at kzread.info/dash/bejne/hGaExbSYic-7nrg.html

  • @brookssong4437 · 11 months ago

    What's the relation between XLA and Triton?