---
title: "Pioneering AI-assisted code migration: How Google achieved 6x faster migration from TensorFlow to JAX"
description: "AI coding agents are rapidly becoming ubiquitous across the software industry, fundamentally changing how developers write, test, and debug daily code"
tags: ["Cloud", "AI", "Agents", "Google Cloud", "Security"]
created: "2026-05-07"
---

# Pioneering AI-assisted code migration: How Google achieved 6x faster migration from TensorFlow to JAX

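> This is a sample document generated from live IT news for layout checking. The body is structured as Korean-language commentary built from the title, summary, and link in the source RSS feed.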

## Source information

- Source: Google Cloud Blog
- Published: Wed, 06 May 2026 16:00:00 +0000
- Original link: [https://cloud.google.com/blog/topics/developers-practitioners/6x-faster-migration-from-tensorflow-to-jax/](https://cloud.google.com/blog/topics/developers-practitioners/6x-faster-migration-from-tensorflow-to-jax/)

## Quick summary

AI coding agents are rapidly becoming ubiquitous across the software industry, fundamentally changing how developers write, test, and debug daily code. While these tools excel at localized, self-contained tasks, applying them to massive, systemic codebase migrations requires an entirely new approach.

Google is already addressing this challenge by incorporating AI into many migration workflows: x86 to ARM (enabling workloads on Google Axion processors); int32 to int64 identifiers (to avoid running out of ids); JUnit3 to JUnit4 (for testing); and Joda-Time to java.time (a modern time library). However, AI model migration represents a whole new level of complexity that requires even more advanced methods for AI-assisted migration.

Translating a production-grade machine learning model from one framework to another, for example, from TensorFlow (TF) to JAX, is not a simple syntax update. It is a long-horizon task that requires untangling thousands of lines of code, managing complex states across multiple files, and preserving precise mathematical equivalence. Generic, single-agent coding assistants typically struggle under this weight: they frequently lose context over long workflows, hallucinate APIs, or fail to produce buildable code across an entire repository.

Google’s AI and Infrastructure team has pioneered a new approach to this industry-wide problem. The result is 6x faster model migration, a milestone Sundar highlighted in the recent Google Cloud Next keynote. In this post, we share how we deployed specialized, multi-agent AI systems to migrate some of Google’s largest-scale production models from TF to JAX.

### Accelerating the transition from TF to JAX

For many teams at Google, and across the industry, the future of scalable machine learning is being built on JAX. Designed around a functional, stateless paradigm, JAX is heavily optimized for modern Tensor Processing Unit (TPU) infrastructure and XLA compilation, making it the bedrock of the modern AI stack.

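
To make the "functional, stateless" idea concrete, here is a minimal sketch in plain Python, deliberately without JAX imports so it runs anywhere. It only illustrates the pattern: parameters live in an explicit, immutable container, the layer is a pure function of its arguments, and an update returns a new container instead of mutating the old one. The names `DenseParams`, `init_dense`, `dense_apply`, and `add_to_bias` are hypothetical and do not come from the original post or from any JAX API.

```python
from typing import NamedTuple, Tuple


class DenseParams(NamedTuple):
    """Immutable parameter container (hypothetical name for illustration)."""
    weights: Tuple[Tuple[float, ...], ...]  # one row per output feature
    bias: Tuple[float, ...]                 # one entry per output feature


def init_dense(in_features: int, out_features: int, fill: float = 0.1) -> DenseParams:
    """Create all parameters up front; nothing is built lazily inside the layer."""
    weights = tuple(
        tuple(fill for _ in range(in_features)) for _ in range(out_features)
    )
    bias = tuple(0.0 for _ in range(out_features))
    return DenseParams(weights, bias)


def dense_apply(params: DenseParams, x: Tuple[float, ...]) -> Tuple[float, ...]:
    """Pure function: the output depends only on the arguments passed in."""
    return tuple(
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(params.weights, params.bias)
    )


def add_to_bias(params: DenseParams, delta: float) -> DenseParams:
    """Return a new parameter container; the original is left untouched."""
    return params._replace(bias=tuple(b + delta for b in params.bias))


PARAMS = init_dense(in_features=3, out_features=2)
OUTPUT = dense_apply(PARAMS, (1.0, 2.0, 3.0))
SHIFTED = add_to_bias(PARAMS, 1.0)
```

In real JAX code the containers would hold arrays rather than tuples of floats, but the shape of the code is the same: state goes in as an argument and comes out as a return value.
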
Evolving to this future presents a monumental challenge. Thousands of production models are built on TensorFlow, a framework characterized by object-oriented, stateful layer initialization and static execution graphs. Manually migrating these models to JAX requires a fundamental rethinking of how layers interact, and how state is explicitly managed. Across large organizations, this type of migration alone represents hundreds (if not thousands) of software engineering (SWE) years, time better spent on researching new architectures and driving product innovation.

Overcoming this challenge with AI started as an ambitious experiment within Google’s AI and Infrastructure team, but has evolved into a repeatable blueprint for addressing complex engineering problems across the company.

### Moving beyond single-agent coding

Our early experiments with agentic code translation showed promise for simple models. However, when faced with the realities of a Google-scale migration (complex, production-grade models spanning multiple files and thousands of lines of code), generic, single-agent setups struggled. They could not balance high-level structural rules with low-level execution details, resulting in a variety of failures, such as overwriting critical files or skipping necessary functionality.

To overcome these common challenges inherent to enterprise migrations, we developed a highly specialized multi-agent architecture that consists of:

- The Planner agent: Using deterministic, compiler-based static analysis, the Planner maps out the codebase's entire dependency tree. It then works alongside other agents to break the migration down into a discrete, step-by-step plan, helping ensure the migration happens logically from the "leaf nodes" (layers without unmigrated dependencies) upward.
- The Orchestrator agent: This agent acts as the project manager. It dynamically groups plan steps into manageable chunks to keep the context window focused, injects the necessary domain knowledge, and handles failure recovery if a step doesn't build.
- The Coder agent: Built as a reasoning and acting agent, the Coder is the workhorse. Integrated directly into our internal IDE tools, it has the ability to read files, write code, run builds, and execute unit tests. Crucially, it operates in a "test-and-fix" loop, self-correcting until it produces a compilable, verifiable component in the target language.

Figure: Multi-agent AI system for complex code migrations. Process diagram describing the multi-agent system used to migrate legacy model code to JAX. Image generated with Gemini Nano Banana 2.

### Scalable validation and dynamic Playbooks

Generative AI models are only as good as the context they are provided. Because source and target architectures rarely map 1-to-1, we engineered a scalable, hierarchical system of Playbooks. These Playbooks range from general repository instructions to highly specific "golden examples" distilled from successful manual migrations. By feeding the Orchestrator a client-specific Playbook (for instance, one tailored to YouTube's unique ranking model infrastructure), the system avoids generic hallucinations and strictly adheres to internal coding standards. This Playbook architecture is framework-agnostic, meaning it can be adapted to guide migrations between any two programming languages or frameworks.

Furthermore, we instituted rigorous quality metrics to ensure the generated code is actually production-ready:

- Quantitative verification: For each unit of code, we verify correctness mathematically. In the case of the TF-to-JAX migration, the system utilizes algorithmic gradient ascent to find the maximum error between the original TF layer and the new JAX layer, mathematically verifying functional equivalence.
- Qualitative evaluation: We also evaluate the migrated code against a set of qualitative standards. In the case of the TF-to-JAX migration, we deploy a blind-audit LLM Judge that scores the migrated code against a framework-agnostic architectural checklist, so that critical, domain-specific logic is completely captured.

### Redefining migration velocity

By deploying this multi-agent system, we dramatically alter the economics of software migration. In our evaluations on real-world, highly complex YouTube models (featuring thousands of lines of code, hundreds of layers, and deep metric dependencies), the multi-agent system achieved a 6.4x to 8x speedup over performing the migration manually.

What traditionally took several SWE-months can now be reduced to only a few weeks of AI-assisted code generation, followed by expert human review. The system effectively handles the boilerplate, identifies target idioms, maps the dependencies, and generates the unit tests, allowing engineers to act as reviewers and architects rather than manual translators.

### Looking ahead into the AI-assisted era

AI is transforming the pace of technological innovation. Without using AI to accelerate our ability to conduct large-scale migrations, it will become increasingly difficult for organizations to adopt the latest breakthroughs and maintain the security, reliability, and performance of their systems.

Our work migrating machine learning implementations from one ML framework to another demonstrates that by combining deterministic static analysis, strict testing loops, and specialized multi-agent architectures, we can safely automate some of the most complex software engineering challenges in the industry. A detailed description of the process is published in our technical paper.

This work is the result of collaboration across Google. We thank key contributors: Stoyan Nikolov, Niyati Parameswaran, Bernhard Konrad, Moritz Gronbach, Niket Kumar, Ann Yan, Varun Singh, Yaning Liang, Antoine Baudoux, Xevi Miró Bruix, Daniele Codecasa, Madhura Dudhgaonkar, Elian Dumitru, Alex Ivanov, Christopher Milne-O’Grady, Ahmed Omran, Ivan Petrychenko, Assaf Raman, Stefan Schnabl, Yurun Shen, Maxim Tabachnyk, Niranjan Tulpule, Amin Vahdat, and Jeff Zhou.
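
Returning to the Planner described in the summary: the "leaf nodes upward" ordering is essentially a topological sort of the dependency graph. The sketch below shows that idea with Python's standard `graphlib` module. It is not Google's implementation; it assumes the dependency map has already been extracted (the post attributes that step to compiler-based static analysis), and the layer names and the `leaf_first_batches` helper are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def leaf_first_batches(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group components into batches that can be migrated in order.

    `deps` maps each component to the components it depends on.
    Every batch only depends on components from earlier batches.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if dependencies are cyclic
    batches: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable, readable plan
        batches.append(ready)
        sorter.done(*ready)  # mark as migrated so dependents become ready
    return batches


# Hypothetical dependency map for a small ranking model.
DEPS: Dict[str, Set[str]] = {
    "ranking_model": {"tower", "metrics"},
    "tower": {"embedding", "dense_block"},
    "metrics": set(),
    "embedding": set(),
    "dense_block": set(),
}

BATCHES = leaf_first_batches(DEPS)
# BATCHES == [["dense_block", "embedding", "metrics"], ["tower"], ["ranking_model"]]
```

Grouping by batch rather than producing one flat order is a design choice for this sketch: it loosely mirrors the idea of handing the next stage a manageable chunk of steps at a time.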
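
The quantitative verification step in the summary, searching for the input that maximizes the error between the original and migrated layer, can be illustrated on a toy scale. The sketch below is an assumption-heavy stand-in: it uses scalar functions instead of real layers, a finite-difference gradient instead of automatic differentiation, and a handful of fixed starting points. All function names are hypothetical, and none of this is taken from the actual system.

```python
import math
from typing import Callable, Iterable

ScalarFn = Callable[[float], float]


def max_error_by_gradient_ascent(
    reference: ScalarFn,
    candidate: ScalarFn,
    starts: Iterable[float],
    steps: int = 200,
    lr: float = 0.5,
    eps: float = 1e-4,
    lo: float = -6.0,
    hi: float = 6.0,
) -> float:
    """Climb the absolute error |reference(x) - candidate(x)| from several starts.

    Returns the largest error seen. The gradient is a central finite
    difference, and x is clamped to [lo, hi] after every step.
    """
    def error(x: float) -> float:
        return abs(reference(x) - candidate(x))

    worst = 0.0
    for x in starts:
        for _ in range(steps):
            worst = max(worst, error(x))
            grad = (error(x + eps) - error(x - eps)) / (2.0 * eps)
            x = min(hi, max(lo, x + lr * grad))  # ascent step, then clamp
        worst = max(worst, error(x))
    return worst


def sigmoid(x: float) -> float:
    """Stand-in for the original layer."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_via_tanh(x: float) -> float:
    """A mathematically equivalent rewrite of sigmoid."""
    return 0.5 * (1.0 + math.tanh(0.5 * x))


def sigmoid_wrong_scale(x: float) -> float:
    """A plausible-looking but incorrect rewrite (missing the 0.5 factor)."""
    return 0.5 * (1.0 + math.tanh(x))


STARTS = (-3.0, -1.0, 0.5, 2.0)
GOOD_ERROR = max_error_by_gradient_ascent(sigmoid, sigmoid_via_tanh, STARTS)
BAD_ERROR = max_error_by_gradient_ascent(sigmoid, sigmoid_wrong_scale, STARTS)
```

A migration check in this style would compare the returned value against a tolerance: the equivalent rewrite stays at floating-point noise, while the incorrect one shows an error that no reasonable tolerance would accept. The tolerance itself is a policy choice that the summary does not specify.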

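This item is filed under the `뉴스/클라우드` (News/Cloud) category. In production, this could be extended so that Hermes collects news candidates and then organizes the title, summary, source, tags, and related internal documents together. For now, the body is deliberately kept fairly long for checking the screen layout.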

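## Why it is worth reading

First, this news is more than a product announcement or a link roundup: it may connect to developer experience, infrastructure operations, security policy, cloud cost, or the way AI tools are used. When running a tech news site, the important part is less "what happened" and more "what does this mean for my operating environment or learning path."

Second, this document is a sample for checking how categories and tags actually appear on screen. The document tree on the left shows the directory structure as-is, and the latest-document cards on the home screen show the title and description. On the search page, the title, summary, and part of the body go into the SQLite FTS5 index, so the density of real search results can also be checked.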
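
As a rough illustration of the search indexing mentioned above, the sketch below builds an in-memory SQLite FTS5 table over title, summary, and body, then runs a full-text query. The table name `docs`, the column names, and the sample rows are assumptions made for this example; the site's real schema is not shown in this document. It also assumes the local SQLite build has FTS5 enabled, which is common but not guaranteed.

```python
import sqlite3
from typing import Iterable, List, Tuple

Doc = Tuple[str, str, str]  # (title, summary, body)


def build_index(docs: Iterable[Doc]) -> sqlite3.Connection:
    """Create an in-memory FTS5 index over title, summary, and body."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, summary, body)")
    conn.executemany(
        "INSERT INTO docs (title, summary, body) VALUES (?, ?, ?)", docs
    )
    return conn


def search(conn: sqlite3.Connection, query: str) -> List[str]:
    """Return matching titles, best match first (FTS5's built-in rank)."""
    rows = conn.execute(
        "SELECT title FROM docs WHERE docs MATCH ? ORDER BY rank", (query,)
    )
    return [row[0] for row in rows]


# Hypothetical sample rows.
SAMPLE_DOCS: List[Doc] = [
    (
        "TensorFlow to JAX migration",
        "Multi-agent system speeds up model migration",
        "The Planner maps the dependency tree before any code is written.",
    ),
    (
        "Layout check notes",
        "Sidebar and tag rendering",
        "Long titles should wrap cleanly on cards.",
    ),
]

CONN = build_index(SAMPLE_DOCS)
```

Calling `search(CONN, "migration")` matches only the first sample row, and `search(CONN, "sidebar")` matches only the second, since the default tokenizer is case-insensitive for ASCII text.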

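## Commentary from an operator's perspective

In a Hermes-based automated publishing system, it is important to generate posts like this periodically without simply translating the source. It is better to keep the original link and add context and application points so that Korean-speaking readers can make a judgment right away. For example, a security news item needs "patch status", "affected configurations", and "commands to check on my server", while for AI tool news it is useful to summarize "actual workflow changes", "cost structure", and "automation potential".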

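## Site layout checkpoints

- Check that long titles wrap naturally in cards and in the body.
- Check that the left sidebar and document header do not become cluttered when there are many tags.
- Check that the original link does not break the body width.
- Check that the table of contents on the right picks up H2 sections correctly.
- Check that the body, document tree, and search box collapse naturally on mobile screens.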

## Example processing flow

```mermaid
flowchart TD
  R[RSS collection] --> S[Topic classification]
  S --> M[Markdown document generation]
  M --> I[SQLite FTS5 indexing]
  I --> W[mdweb2 page publication]
```

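## Articles this could expand into

If this news is judged important, it can be expanded into a separate in-depth document. For example, it could lead to a concept overview, a hands-on guide, an operations checklist, or a comparison of related tools on the `클라우드` (Cloud) topic. In the long run, news-style documents like this accumulate, and some of them can be promoted into chapters of a book or knowledge base.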
