The Astral Toolchain: How Ruff, uv, and ty Replaced My Entire Python Setup

I used to start every Python project the same way: python -m venv .venv, pip install, wait 45 seconds, run black ., run isort ., run flake8, then fight with mypy for 20 minutes. Sound familiar?
Then I found Astral's toolchain — Ruff, uv, and ty — and that entire ritual collapsed into three commands and a single config file.
This post is a practical walkthrough of the three tools, how I set them up together, and why I think every Python engineer (especially in ML and AI) should be on this stack right now.
The Problem with Traditional Python Tooling
Before we get into solutions, let's be honest about the pain:
| What you need | Old solution | The problem |
|---|---|---|
| Package manager | pip + virtualenv | Slow, no lockfile, no Python version management |
| Formatter | black | Separate tool, separate config |
| Import sorter | isort | Conflicts with black constantly |
| Linter | flake8 | Yet another config file |
| Type checker | mypy | Slow. Very slow. |
That's 5 different tools, 3–4 config files, and a pre-commit hook setup that breaks every 6 months.
For AI/ML projects specifically, it's worse — you're managing heavy dependency trees (PyTorch, Hugging Face, CUDA-specific wheels), different Python versions across environments, and you can't afford slow CI.
Meet the Astral Stack
Astral (now part of OpenAI) builds Python tooling in Rust with a simple thesis: developer tools should be fast enough that you forget they're running.
The three tools:
- uv: replaces pip, pip-tools, virtualenv, pyenv, poetry, and pipx
- Ruff: replaces black, isort, and flake8, with 800+ lint rules built in
- ty: replaces mypy and Pyright as a type checker and language server
All three are written in Rust. All three are orders of magnitude faster than their Python counterparts. And crucially — all three share a single pyproject.toml.
Let's go through each one.
uv — One Tool to Rule Them All
Installation
curl -LsSf https://astral.sh/uv/install.sh | sh
No Python required. No Rust required. It's a single binary.
Starting a new project
# For a simple script-based project (default)
uv init my-ai-project
# For a proper package with src layout (recommended for production)
uv init my-ai-project --package
cd my-ai-project
--package creates the src layout:
my-ai-project/
├── pyproject.toml
├── .python-version
└── src/
    └── my_ai_project/
        └── __init__.py
The uv.lock lockfile is created automatically on your first uv sync or uv add.
The uv.lock file is a universal lockfile — one file that works across macOS, Linux, and Windows, unlike pip-tools which generates platform-specific outputs.
Managing dependencies
# Add a dependency
uv add fastapi
# Add a dev dependency
uv add --dev pytest ruff ty
# Add with version constraint
uv add "torch>=2.3.0"
# Remove
uv remove requests
# Sync environment (installs everything in uv.lock)
uv sync
Python version management — this is the killer feature
# Pin to a specific Python version
uv python pin 3.12
# Install it (uv downloads the interpreter automatically)
uv python install 3.12
# Run with a specific version without changing anything
uv run --python 3.11 python script.py
No more pyenv. No more .python-version conflicts. uv manages interpreters itself.
Running scripts with inline dependencies
This is my favourite uv trick for one-off AI scripts:
# /// script
# requires-python = ">=3.12"
# dependencies = [
# "anthropic",
# "rich",
# ]
# ///
import anthropic
from rich import print
client = anthropic.Anthropic()
message = client.messages.create(...)
print(message.content)
uv run my_script.py
uv creates an isolated environment, installs anthropic and rich, and runs the script. The environment lives in uv's cache rather than in your project, so there's no virtualenv to create or activate and no pip install step. Great for LLM prototyping.
Speed comparison
| Operation | pip | uv |
|---|---|---|
| pip install torch (cold cache) | ~45s | ~4s |
| pip install -r requirements.txt (warm cache) | ~12s | ~0.3s |
| pip install -e . | ~8s | ~0.4s |
In CI/CD pipelines where you run pip install on every push, uv will save you real money on compute costs.
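For a rough sense of scale, here's a back-of-the-envelope sketch using the warm-cache row from the table above (~12s vs ~0.3s). The push count is a made-up assumption, so plug in your own numbers.

```python
# Install times from the warm-cache row of the table above (seconds).
PIP_SECONDS = 12.0
UV_SECONDS = 0.3

# Hypothetical workload: adjust to your own team.
PUSHES_PER_DAY = 200


def minutes_saved_per_day(
    pushes: int, slow: float = PIP_SECONDS, fast: float = UV_SECONDS
) -> float:
    """Minutes of CI time saved per day by the faster install step."""
    return pushes * (slow - fast) / 60


if __name__ == "__main__":
    print(f"speedup: {PIP_SECONDS / UV_SECONDS:.0f}x")
    print(f"saved: {minutes_saved_per_day(PUSHES_PER_DAY):.0f} CI minutes/day")
```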
Ruff — The Linter and Formatter That Actually Stays Out of Your Way
Ruff replaces black, isort, flake8, and pyupgrade, and ships 800+ lint rules of its own. It's 10–100x faster than any of them individually.
Installation
uv add --dev ruff
That's it. No separate black, no separate isort.
Configuration in pyproject.toml
[tool.ruff]
line-length = 100
target-version = "py312"
[tool.ruff.lint]
select = [
"E", # pycodestyle errors
"W", # pycodestyle warnings
"F", # pyflakes
"I", # isort
"B", # flake8-bugbear
"C4", # flake8-comprehensions
"UP", # pyupgrade
"N", # pep8-naming
]
ignore = ["E501"] # line too long — handled by formatter
[tool.ruff.format]
quote-style = "double"
indent-style = "space"
Usage
# Lint
ruff check .
# Lint + auto-fix
ruff check . --fix
# Format (replaces black)
ruff format .
# Check + format in one pass
ruff check . --fix && ruff format .
Why this matters for AI codebases
AI engineering code tends to be messy — rapid prototyping, Jupyter notebooks converted to scripts, copy-pasted model code. Ruff's --fix flag handles 80% of style issues automatically. I run it as a pre-commit hook and never think about formatting again:
# .pre-commit-config.yaml equivalent using uv run
uv run ruff check . --fix && uv run ruff format .
Speed comparison on a real ML codebase (~200 files):
| Tool | Time |
|---|---|
| black + isort + flake8 | ~8.2s |
| ruff check + ruff format | ~0.18s |
45x faster. On every save in your editor, that difference is noticeable.
ty — The Type Checker That Doesn't Make You Hate Type Checking
ty is Astral's newest tool — currently in beta (v0.0.51 as of June 2026) — and it's the most exciting one for AI engineering code.
The headline numbers: ty checks the entire Home Assistant codebase (one of the largest Python projects) in ~2.19 seconds. mypy takes 45.66 seconds. That's a 20x speedup.
Installation
# As a project dev dependency
uv add --dev ty
# Or as a global tool (run anywhere)
uv tool install ty@latest
# Or run without installing
uvx ty check
Usage
# Type-check the whole project
ty check
# Check a specific file
ty check src/agents/orchestrator.py
# Watch mode (re-checks on file save)
ty check --watch
Configuration in pyproject.toml
[tool.ty]
src = ["src"]
[tool.ty.rules]
possibly-unbound = "warn"
missing-return = "error"
What makes ty different from mypy
1. It actually catches bugs without annotations
def process_llm_response(response):
data = response.choices[0].message
return data.contnet # typo — 'contnet' instead of 'content'
ty flags this whenever it can infer the type of response, for example when the value comes straight from a call into a library that ships type information (the Anthropic SDK does). If response arrives as a bare, unannotated parameter, as in the snippet above, or comes from an untyped third-party library, ty infers Unknown and can't catch the attribute error; mypy behaves the same way in that case. The difference is that ty is significantly more aggressive about inferring types from stubs and context, so you get more catches without having to annotate everything yourself.
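Here's a small sketch of why the known type matters. Message is a hypothetical stand-in for a typed SDK response object, not a real SDK class. Because the parameter is annotated, a type checker can report the misspelled attribute before the code runs; without one, the typo only surfaces as an AttributeError at runtime.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for a typed SDK response object."""

    content: str


def extract_text(message: Message) -> str:
    return message.content


def extract_text_typo(message: Message) -> str:
    # A type checker can reject this line statically: Message has no 'contnet'.
    return message.contnet


if __name__ == "__main__":
    print(extract_text(Message(content="hello")))
    try:
        extract_text_typo(Message(content="hello"))
    except AttributeError as exc:
        print(f"runtime failure: {exc}")
```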
2. It's a language server too
ty ships with a built-in LSP. Install the ty VS Code / Cursor extension and you get:
- Go to definition
- Auto-complete
- Inlay hints
- Inline type errors as you type
No separate Pylance, no separate Pyright, no configuration conflicts.
3. The gradual guarantee
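ty is designed around the gradual guarantee: removing a type annotation from working code never introduces a new error, and unannotated code is treated as dynamic rather than flagged for being untyped. New errors only appear where an annotation you add exposes a real inconsistency. This makes incremental adoption safe: you can add types to one file at a time without triggering a flood of unrelated failures in CI.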
A note on beta status
ty is production-ready for motivated teams but it's still beta. Some edge cases (particularly with complex Pydantic models and decorator-heavy code) may have false positives. Astral is targeting a stable 1.0 release in 2026. For AI projects, it works excellently on FastAPI routes, Pydantic schemas, and core agent logic.
Editor Setup — Ruff + ty in VS Code and Cursor
Both tools have first-party extensions. Here's how to get them wired up in under 5 minutes.
Install the extensions
VS Code / Cursor:
Search the Extensions panel for:
- astral-sh.ruff: Ruff linter + formatter
- astral-sh.ty: ty type checker + language server
Or install from the command line:
# VS Code
code --install-extension astral-sh.ruff
code --install-extension astral-sh.ty
# Cursor
cursor --install-extension astral-sh.ruff
cursor --install-extension astral-sh.ty
Configure settings.json
Add this to your workspace .vscode/settings.json (or user settings):
{
  // ── Ruff ────────────────────────────────────────────────
  "[python]": {
    "editor.defaultFormatter": "astral-sh.ruff",
    "editor.formatOnSave": true,
    "editor.codeActionsOnSave": {
      "source.fixAll.ruff": "explicit",
      "source.organizeImports.ruff": "explicit"
    }
  },
  // ── ty ──────────────────────────────────────────────────
  "ty.enable": true,
  // ── Disable Pylance type checking to avoid double diagnostics ─
  // (keep Pylance installed for other features, just turn off type checking)
  "python.analysis.typeCheckingMode": "off"
}
What each setting does:
- editor.defaultFormatter: makes Ruff the formatter on save, replacing black
- source.fixAll.ruff: auto-applies all safe lint fixes on save (equivalent to ruff check . --fix)
- source.organizeImports.ruff: auto-sorts imports on save, replacing isort
- python.analysis.typeCheckingMode set to off: stops Pylance's type checker from running alongside ty, eliminating duplicate diagnostics
Cursor users: Cursor ships with its own Python intelligence layer. You may need to disable "Python › Analysis: Type Checking Mode" in Cursor settings explicitly if you see duplicate diagnostics. Do not fully disable Pylance — it provides auto-complete and other features that are independent of type checking.
What you get
With this setup, every time you save a Python file:
- Ruff formats it (replaces black)
- Ruff sorts imports (replaces isort)
- Ruff auto-fixes safe lint violations
- ty shows inline type errors in the gutter — no terminal needed
The feedback loop goes from "run commands manually" to "instant, on every keystroke."
The Complete Setup — One pyproject.toml to Rule Them All
Here's the full config I use for AI engineering projects:
[project]
name = "my-ai-project"
version = "0.1.0"
description = "Production agentic AI system"
requires-python = ">=3.12"
dependencies = [
"fastapi>=0.115.0",
"anthropic>=0.40.0",
"pydantic>=2.10.0",
"redis>=5.2.0",
"sqlalchemy>=2.0.0",
]
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
# ─── Dev dependencies ────────────────────────────────────────
# Standard dependency groups (PEP 735). This is where uv add --dev writes,
# and it's preferred over [project.optional-dependencies] for dev tooling:
# these are never installed in production or by other packages.
# (The older [tool.uv] dev-dependencies key is a legacy spelling.)
[dependency-groups]
dev = [
"ruff>=0.9.0",
"ty>=0.0.50",
"pytest>=8.0.0",
"pytest-asyncio>=0.25.0",
]
# ─── Ruff ────────────────────────────────────────────────────
[tool.ruff]
line-length = 100
target-version = "py312"
[tool.ruff.lint]
select = ["E", "W", "F", "I", "B", "C4", "UP", "N", "ASYNC"]
ignore = ["E501", "B008"]
[tool.ruff.lint.per-file-ignores]
"tests/**" = ["S101"] # allow assert in tests
[tool.ruff.format]
quote-style = "double"
indent-style = "space"
# ─── ty ──────────────────────────────────────────────────────
[tool.ty]
src = ["src"]
[tool.ty.rules]
possibly-unbound = "warn"
Ruff Rules — The Full Reference
The minimal config above gets you started. But Ruff supports over 800 rules across dozens of plugins — everything from import ordering to security scanning to async best practices. Here's the complete annotated ruleset I use for production AI projects, with every rule explained:
[tool.ruff.lint]
# ─── DEFAULTS (already on unless you override) ───────────────
# Ruff enables these even without configuration:
# "F" – Pyflakes: undefined names, unused imports, syntax errors
# "E4" – Pycodestyle: import-related style issues
# "E7" – Pycodestyle: statement-level style issues
# "E9" – Pycodestyle: runtime syntax errors
# ─── EXTENDED RULES ──────────────────────────────────────────
extend-select = [
"E", # Pycodestyle errors – style issues (indentation, whitespace, etc.)
"W", # Pycodestyle warnings – style warnings (trailing whitespace, blank lines)
"C90", # McCabe complexity – flags functions above a complexity threshold
"I", # isort – import ordering (replaces isort entirely)
"N", # PEP8 Naming – naming conventions for classes, functions, variables
"D", # Pydocstyle – docstring formatting and presence
"UP", # Pyupgrade – rewrites syntax to newer Python versions automatically
"YTT", # Flake8-2020 – misuse of sys.version / sys.version_info
# "ANN", # Flake8-annotations – enforces type annotation style (aggressive, opt-in)
"ASYNC", # Flake8-async – async/await correctness (blocking calls in async, etc.)
"S", # Flake8-bandit – security: SQL injection, shell injection, hardcoded secrets
"BLE", # Flake8-blind-except – flags overly broad handlers such as `except Exception:`
"FBT", # Flake8-boolean-trap – catches boolean argument pitfalls in function signatures
"B", # Flake8-bugbear – common bug patterns (mutable defaults, assert misuse, etc.)
"A", # Flake8-builtins – prevents shadowing Python built-in names (list, id, etc.)
"COM", # Flake8-commas – trailing comma consistency
# "CPY", # Flake8-copyright – copyright header enforcement (opt-in per project)
"C4", # Flake8-comprehensions – suggests cleaner list/dict/set comprehension patterns
"DTZ", # Flake8-datetimez – requires timezone-aware datetime objects (prevents bugs)
"T10", # Flake8-debugger – flags leftover pdb / breakpoint() statements
"DJ", # Flake8-django – Django-specific conventions (skip if not using Django)
"EM", # Flake8-errmsg – exception message style (no f-strings directly in raise)
"EXE", # Flake8-executable – checks shebang lines and executable bits
"FA", # Flake8-future-annotations – flags missing `from __future__ import annotations`
"ISC", # Flake8-implicit-str-concat – warns on implicit string concatenation across lines
"ICN", # Flake8-import-conventions – enforces conventional aliases (import numpy as np, etc.)
"LOG", # Flake8-logging – logging module misuse (direct Logger() instantiation, root logger calls, etc.)
"G", # Flake8-logging-format – flags % and .format() in logging calls (use lazy args)
"INP", # Flake8-no-pep420 – requires __init__.py (no implicit namespace packages)
"PIE", # Flake8-pie – miscellaneous Python improvement suggestions
# "T20", # Flake8-print – disallows print() statements (useful in production code)
"PYI", # Flake8-pyi – type stub (.pyi) consistency checks
"PT", # Flake8-pytest-style – pytest best practices (fixture naming, assert style, etc.)
"Q", # Flake8-quotes – enforces consistent quote style
"RSE", # Flake8-raise – flags unnecessary parentheses on raised exceptions
"RET", # Flake8-return – return statement issues (unnecessary else after return, etc.)
"SLF", # Flake8-self – flags access to private members (obj._attr) from outside the class
"SLOT", # Flake8-slots – requires __slots__ on subclasses of str, tuple, and namedtuple
"SIM", # Flake8-simplify – code simplification (ternary instead of if/else, etc.)
"TID", # Flake8-tidy-imports – enforces import style (no relative imports, banned modules)
"TC", # Flake8-type-checking – proper use of TYPE_CHECKING blocks for typing imports
"INT", # Flake8-gettext – proper internationalisation (i18n) usage
"ARG", # Flake8-unused-arguments – flags unused function/method arguments
"PTH", # Flake8-use-pathlib – encourages pathlib over os.path
"TD", # Flake8-todos – enforces TODO comment formatting (author, issue link)
"FIX", # Flake8-fixme – flags TODO / FIXME / XXX / HACK comments left in code
# "ERA", # Eradicate – detects commented-out code (can be noisy, opt-in)
"PD", # Pandas-vet – pandas-specific code practices
"PGH", # Pygrep-hooks – flags blanket `# type: ignore` and `# noqa` comments, among others
"PL", # Pylint – Pylint conventions integrated into Ruff
"TRY", # Tryceratops – try/except usage suggestions
"FLY", # Flynt – f-string conversion suggestions
"NPY", # NumPy-specific rules – NumPy coding standards
"FAST", # FastAPI – FastAPI-specific linting (response models, route signatures)
"AIR", # Airflow – Airflow-specific rules (skip if not using Airflow)
"PERF", # Perflint – performance: unnecessary list() calls, slow loops, etc.
"FURB", # Refurb – modern Python rewrites (replace old patterns with newer idioms)
# "DOC", # Pydoclint – stricter docstring linting (opt-in)
"RUF", # Ruff-specific rules – Ruff's own additional checks
]
# ─── GLOBAL IGNORES ──────────────────────────────────────────
ignore = [
"E501", # line too long – handled by formatter (ruff format)
"D1", # missing docstring (D100–D107) – docstrings are style-checked but never required
"FBT003", # boolean positional value – common in internal function calls
"D203", # blank line before class – conflicts with D211 (use D211 instead)
"D212", # summary after quotes – conflicts with D213 (use D213 instead)
"D400", # period at end of docstr – overly prescriptive
"D401", # imperative mood – overly prescriptive
"D415", # period/question/exclaim – overly prescriptive
"S311", # pseudo-random generators – fine for non-cryptographic use
"PERF401", # list comprehension – readability sometimes beats micro-optimisation
"RET504", # assign before return – sometimes improves readability (named result)
"FA102", # future annotations union – not needed on Python 3.10+
"TRY003", # long exception message – acceptable for descriptive domain errors
"EM101", # string literal in raise – common pattern, not always worth splitting
"TC002", # third-party import used only for typing – too strict for most codebases
"TC003", # stdlib import used only for typing – too strict for most codebases
]
# ─── PER-FILE IGNORES ────────────────────────────────────────
[tool.ruff.lint.per-file-ignores]
# Test files get relaxed rules — asserts, magic values, and private
# member access are all normal in test code.
"test_*.py" = ["S101", "S105", "S106", "S107", "PLR2004", "SLF001", "D", "ANN", "ARG001", "PLC0415", "EM102"]
"*_test.py" = ["S101", "S105", "S106", "S107", "PLR2004", "SLF001", "D", "ANN", "ARG001", "PLC0415", "EM102"]
"conftest.py"= ["S101", "S105", "S106", "S107", "PLR2004", "SLF001", "D", "ANN", "ARG001", "PLC0415", "EM102"]
# Per-file ignore reference for tests:
# S101 – asserts are standard in pytest
# S105/106/107 – hardcoded passwords in test fixtures are fine
# PLR2004 – magic values (e.g. 200, 404) are common in HTTP response tests
# SLF001 – private member access needed to test internal state
# D – docstrings are redundant in verbose pytest function names
# ANN – type annotations add overhead without benefit in tests
# ARG001 – pytest fixtures ensure state exists even when not directly referenced
# PLC0415 – local imports in tests are valid for isolation
# EM102 – f-string exceptions in test helpers are fine for debugging
A few things worth highlighting from this config:
ASYNC rules are particularly valuable for AI engineering code. They catch blocking calls (requests.get, time.sleep, open()) inside async def functions — a common mistake when wrapping synchronous LLM SDK calls.
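Here's a sketch of that mistake and one common way around it. call_llm_sync is a hypothetical stand-in for a blocking SDK call (time.sleep simulates the network wait). The first coroutine sleeps directly on the event loop, which is the kind of call the ASYNC rules report; the second off-loads the blocking work with asyncio.to_thread. Keep in mind a linter only sees the direct call, so a blocking helper hidden behind another function won't be flagged.

```python
import asyncio
import time


def call_llm_sync(prompt: str) -> str:
    """Hypothetical stand-in for a blocking SDK call."""
    time.sleep(0.05)  # simulates network latency; blocks the calling thread
    return f"echo: {prompt}"


async def ask_blocking(prompt: str) -> str:
    # Anti-pattern: a blocking sleep directly inside async def stalls the
    # event loop, so no other task can make progress until it returns.
    time.sleep(0.05)
    return f"echo: {prompt}"


async def ask(prompt: str) -> str:
    # Off-load the blocking call to a worker thread; the event loop stays free.
    return await asyncio.to_thread(call_llm_sync, prompt)


async def main() -> list[str]:
    # The three calls overlap because each one waits in its own thread.
    return await asyncio.gather(ask("a"), ask("b"), ask("c"))


if __name__ == "__main__":
    print(asyncio.run(main()))
```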
FAST rules are purpose-built for FastAPI: they flag redundant response_model arguments, dependencies declared without Annotated, and path parameters that are missing from the route function's signature. Essential if you're building LLM APIs.
S (Bandit) rules catch security issues that matter in production: SQL injection via string concatenation, subprocess shell injection, hardcoded credentials. Non-negotiable for anything that touches customer data.
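To make the SQL injection case concrete, here's a self-contained sketch using the standard-library sqlite3 module. The table and function names are hypothetical. The first query is built by string concatenation, which is the pattern the S rules report; the second passes the value as a parameter.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list[tuple]:
    # String-built SQL: user input becomes part of the query text.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user(conn: sqlite3.Connection, name: str) -> list[tuple]:
    # Parameterized query: the driver handles quoting, input stays data.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def make_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


if __name__ == "__main__":
    db = make_demo_db()
    payload = "x' OR '1'='1"
    print(len(find_user_unsafe(db, payload)), "rows leaked by the unsafe query")
    print(len(find_user(db, payload)), "rows returned by the parameterized query")
```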
The D (docstring) rules with the ignores configured above give you docstring checking without the noise: because D1 is ignored, Ruff never demands a docstring, but the docstrings you do write (typically on public APIs) are held to a consistent format.
Four Ways to Use the Astral Stack
There isn't one right way to integrate these tools — each approach gives you a different level of enforcement. Here's the full picture.
Option 1: GitHub Actions (Pipeline Enforcement)
The strictest option. Every PR runs the checks automatically, and a failing lint or type check blocks the merge. Nobody ships broken code — but the tradeoff is that developers only find out about failures after pushing.
name: CI
on: [push, pull_request]
jobs:
  quality:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install uv
        uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true
      - name: Set up Python
        run: uv python install
      - name: Install dependencies
        run: uv sync --all-extras
      - name: Lint
        run: uv run ruff check .
      - name: Format check
        run: uv run ruff format --check .
      - name: Type check
        run: uv run ty check
      - name: Tests
        run: uv run pytest
The astral-sh/setup-uv action handles caching automatically. On warm runs, uv sync finishes in under a second.
Watch out: If ruff or ty is added to an existing codebase without a clean-up pass first, the first CI run will fail hard. Run ruff check . --fix && ruff format . locally before wiring this up.
Bitbucket Pipelines
If your team is on Bitbucket instead of GitHub, the setup is nearly identical. Bitbucket doesn't have a first-party uv action, so you install it via the bootstrap script:
# bitbucket-pipelines.yml
image: python:3.12-slim
pipelines:
  pull-requests:
    '**':
      - step:
          name: Quality Checks
          caches:
            - uv
          script:
            # The slim image ships without curl, so install it first
            - apt-get update && apt-get install -y --no-install-recommends curl ca-certificates
            # Install uv
            - curl -LsSf https://astral.sh/uv/install.sh | sh
            - export PATH="$HOME/.local/bin:$PATH"
            # Install dependencies
            - uv sync --all-extras
            # Lint
            - uv run ruff check .
            # Format check
            - uv run ruff format --check .
            # Type check
            - uv run ty check
            # Tests
            - uv run pytest
definitions:
  caches:
    uv: ~/.cache/uv
The uv cache definition at the bottom tells Bitbucket to persist uv's global package cache between pipeline runs — same effect as enable-cache: true in the GitHub Actions setup. Warm runs resolve and install in milliseconds.
Bitbucket pipeline steps exit non-zero on any command failure, so a failing ruff or ty check automatically blocks the PR — no extra configuration needed.
Option 2: Dev Dependencies (uv add --dev)
Install ruff and ty as project dev dependencies. Anyone who clones the repo and runs uv sync gets the tools pinned to the exact same version — no "works on my machine" drift.
uv add --dev ruff ty
uv records them in pyproject.toml under the standard dev dependency group:
[dependency-groups]
dev = [
"ruff>=0.9.0",
"ty>=0.0.50",
]
Developers run checks explicitly in their terminal:
uv run ruff check . --fix
uv run ruff format .
uv run ty check
This is the right setup for most teams — the tools are version-locked alongside the project, and CI uses the same uv run commands. The pipeline can still block PRs — same as Option 1 — because your GitHub Actions workflow calls uv run ruff check . which uses the dev-installed version.
The key advantage over Option 1 alone: developers can run the exact same commands that CI runs, locally, before pushing. No guessing what the pipeline checks.
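If you want that "same commands as CI" property in a single entry point, here's a minimal sketch of a check runner. The command list mirrors the CI workflow in this post; run_checks and run_command are hypothetical helper names, and the runner is injectable so the logic can be exercised without the tools installed.

```python
import subprocess
from collections.abc import Callable, Sequence

# The same commands the CI workflow in this post runs.
CHECKS: list[list[str]] = [
    ["uv", "run", "ruff", "check", "."],
    ["uv", "run", "ruff", "format", "--check", "."],
    ["uv", "run", "ty", "check"],
    ["uv", "run", "pytest"],
]


def run_command(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(command, check=False).returncode


def run_checks(
    checks: Sequence[Sequence[str]] = CHECKS,
    runner: Callable[[Sequence[str]], int] = run_command,
) -> int:
    """Run every check in order; return the first non-zero exit code, else 0."""
    for command in checks:
        print("$", " ".join(command))
        code = runner(command)
        if code != 0:
            return code
    return 0
```

Saved as, say, scripts/check.py with raise SystemExit(run_checks()) at the bottom, it stops at the first failing step, just like the pipeline does.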
Option 3: uvx — Global Tools, No Install Required
This is the most lightweight option and the one I recommend for individual developers who want to run checks on-demand without touching a project's dependencies.
# Run without installing anything into the project
uvx ruff check .
uvx ruff format .
uvx ty check
uvx fetches the tool into an isolated environment in uv's cache and runs it: the latest release on first use, then the cached copy on later runs. No uv add, no virtualenv activation, no version pinning needed.
Why this matters: When you're working on a PR and want to verify your code before pushing — without waiting for a CI run to come back — uvx lets you do it instantly. You're not blocked by pipeline queue times or approval gates. You check locally, fix locally, push clean.
# My pre-push routine — takes under 1 second total
uvx ruff check . --fix && uvx ruff format . && uvx ty check
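The one tradeoff: uvx resolves versions independently of your project (the latest release on first use, or on demand with uvx ruff@latest), so there's a small risk of version drift between your local run and the pinned version in CI. For most projects this doesn't matter, but if you need identical results locally and in CI, stick with Option 2 or pin the version explicitly, e.g. uvx ruff@0.9.0 check .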
Option 4: Pre-commit Hooks (Local Gate Before Push)
Pre-commit hooks run checks automatically when you run git commit — before the code ever leaves your machine. It's a local enforcement layer that catches issues even earlier than CI.
Ruff has a first-party pre-commit hook maintained by Astral:
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.9.0
    hooks:
      # Run linter and auto-fix
      - id: ruff
        args: [--fix]
      # Run formatter
      - id: ruff-format
# ty doesn't have an official pre-commit hook yet (beta)
# Use uv run ty check in CI instead
Install and activate:
uv add --dev pre-commit
uv run pre-commit install # wires it to git commit
uv run pre-commit run --all-files # run manually on everything
Now every git commit automatically runs ruff lint + format. If anything fails, the commit is blocked until the code is clean.
Pre-commit vs uvx — which to use?
| | pre-commit hook | uvx (manual) |
|---|---|---|
| When it runs | Automatically on every git commit | Only when you run it explicitly |
| Blocks commits | Yes, can't commit until clean | No, you choose when to check |
| Setup overhead | Requires installation per clone | Zero, just uvx ruff check . |
| Best for | Teams where discipline varies | Individual developers who self-enforce |
Honest take: If your team already has a pre-commit discipline, add ruff to it. If not, the uvx pre-push routine is lighter and easier to adopt. Both are better than relying solely on CI feedback.
Which approach to use?
| Scenario | Recommendation |
|---|---|
| Team project with a shared repo | Option 2 (dev deps) + Option 1 (CI enforcement) |
| Solo project or quick scripts | Option 3 (uvx) |
| Want to catch issues before pushing | Option 3 (uvx) or Option 4 (pre-commit) |
| Team needs automated local enforcement | Option 4 (pre-commit) |
| Strict team standards, no exceptions | Option 1 + Option 4 |
| All four together | ✅ Best setup — local freedom + local gate + CI safety net |
The combination I use day-to-day: dev dependencies for the project (so the whole team uses the same versions), GitHub Actions to enforce on PRs (so nothing slips through), and uvx for quick local checks when I want instant feedback before I even open a PR.
Docker Integration — uv in Production Containers
For AI engineering, your code eventually goes into a Docker container — a FastAPI inference server, an agent runtime, a training job. uv changes how you should write your Dockerfiles.
The key principle: install dependencies before copying your source code, so Docker's layer cache only re-installs packages when pyproject.toml or uv.lock actually changes.
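If the caching logic feels abstract, here's a deliberately simplified model of it. Docker's real cache keys involve more than this (the parent layer, the instruction itself, file metadata), so treat layer_key as a hypothetical illustration: a layer built from a COPY is reused only while the copied files stay byte-for-byte identical.

```python
import hashlib


def layer_key(files: dict[str, str]) -> str:
    """Hash file names and contents, loosely mimicking how a COPY layer is keyed."""
    digest = hashlib.sha256()
    for name in sorted(files):
        digest.update(name.encode())
        digest.update(b"\0")
        digest.update(files[name].encode())
        digest.update(b"\0")
    return digest.hexdigest()


if __name__ == "__main__":
    deps = {"pyproject.toml": "[project]\nname = 'demo'\n", "uv.lock": "version = 1\n"}
    key_before = layer_key(deps)

    # Editing source code does not touch the files in the dependency layer...
    source = {"src/main.py": "print('v2')\n"}
    print("dependency layer reused:", layer_key(deps) == key_before)

    # ...but changing the lockfile invalidates it.
    deps_changed = {**deps, "uv.lock": "version = 1\n# new pin\n"}
    print("dependency layer reused:", layer_key(deps_changed) == key_before)
```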
Standard single-stage Dockerfile
FROM python:3.12-slim
# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
WORKDIR /app
# Copy lockfile and project metadata first. Docker caches this layer
# independently from your source code, so dependencies only re-install
# when pyproject.toml or uv.lock changes.
COPY pyproject.toml uv.lock ./
# Install dependencies only (no dev deps, and not the project itself,
# because its source has not been copied yet)
RUN uv sync --frozen --no-dev --no-install-project
# Now copy source code and install the project on top of the cached layer
COPY src/ ./src/
RUN uv sync --frozen --no-dev
# Run via uv so it uses the managed virtualenv; --no-sync stops uv from
# re-syncing (and pulling in dev deps) at container start
CMD ["uv", "run", "--no-sync", "uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]
Multi-stage Dockerfile (leaner production image)
For production AI services where image size matters:
# ── Stage 1: Build ────────────────────────────────────────────
FROM python:3.12-slim AS builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
WORKDIR /app
COPY pyproject.toml uv.lock ./
# Install dependencies into /app/.venv (cached until the lockfile changes)
RUN uv sync --frozen --no-dev --no-install-project
# Copy source, then install the project itself as a regular package
COPY src/ ./src/
RUN uv sync --frozen --no-dev --no-editable
# ── Stage 2: Runtime ──────────────────────────────────────────
FROM python:3.12-slim AS runtime
WORKDIR /app
# Copy only the virtualenv from the builder: no uv, no build tools
COPY --from=builder /app/.venv /app/.venv
# Copy source
COPY src/ ./src/
# Use the virtualenv's Python directly (no uv needed at runtime)
ENV PATH="/app/.venv/bin:$PATH"
CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]
The multi-stage build leaves uv itself out of the final image — the runtime only contains Python, your virtualenv, and your code. For ML services with large model weights, keeping the base image lean matters.
Key flags explained
| Flag | What it does |
|---|---|
| --frozen | Installs exactly what uv.lock specifies without re-resolving or updating it. Use --locked instead if the build should fail when uv.lock is out of sync with pyproject.toml |
| --no-dev | Skips dev dependencies (ruff, ty, pytest), so nothing debug-related ships to production |
| --no-editable | Installs the package normally instead of as an editable install, which is required for multi-stage copies |
.dockerignore
Always add this so Docker doesn't copy your virtualenv into the build context:
.venv/
__pycache__/
*.pyc
.ruff_cache/
.ty_cache/
.git/
Why This Matters Specifically for AI Engineering
AI/ML Python projects have three tooling problems that make this stack particularly valuable:
1. Heavy dependency trees
Installing torch, transformers, langchain, milvus-client, and kubeflow with pip can take 3–5 minutes in CI. With uv's global dependency cache, re-runs with no changes take milliseconds. If you're running training experiments across multiple branches, this adds up fast.
2. Rapidly evolving codebases
Agentic AI code changes fast — new tools, new agents, new API integrations every sprint. Ruff's --fix mode means you don't slow down to deal with linting noise. ty's incremental type checking means you catch regressions the moment you introduce them.
3. Type safety in LLM integration code
The worst bugs in production AI systems are often shape mismatches: wrong field names on API responses, incorrect Pydantic model paths, async functions called without await. Wherever type information is available, ty catches many of these before runtime, in milliseconds, without requiring you to annotate everything upfront.
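The missing-await case is worth seeing once. In this sketch, fetch_completion is a hypothetical async client call. Without await, the variable holds a coroutine object rather than a string, which is exactly the kind of mismatch a type checker can surface as soon as you try to use it as a str.

```python
import asyncio


async def fetch_completion(prompt: str) -> str:
    """Hypothetical async client call."""
    await asyncio.sleep(0)
    return f"echo: {prompt}"


async def handler_buggy(prompt: str) -> str:
    result = fetch_completion(prompt)  # missing await
    kind = type(result).__name__  # a coroutine object, not a str
    result.close()  # avoid the "never awaited" RuntimeWarning in this demo
    return kind


async def handler(prompt: str) -> str:
    result = await fetch_completion(prompt)
    return result.upper()


if __name__ == "__main__":
    print(asyncio.run(handler_buggy("hi")))
    print(asyncio.run(handler("hi")))
```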
Migration Path — If You're on the Old Stack
If you're currently on pip + black + flake8 + mypy:
# Step 1: Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Step 2: Migrate your project
uv init --no-package # if not a package
uv add -r requirements.txt # import existing deps into pyproject.toml
uv lock # make sure uv.lock is up to date
# Step 3: Replace black + isort + flake8 with ruff
uv add --dev ruff
# Delete: .flake8, .isort.cfg, pyproject.toml [tool.black] section
# Step 4: Add ty (optional, can do incrementally)
uv add --dev ty
uv run ty check # see what it finds
Total migration time for a mid-sized project: under 30 minutes.
Summary
| | Old Stack | Astral Stack |
|---|---|---|
| Package manager | pip + virtualenv + pyenv + poetry | uv |
| Formatter | black | ruff format |
| Import sorter | isort | ruff (built-in) |
| Linter | flake8 | ruff check |
| Type checker | mypy | ty |
| Config files | 3–4 | 1 (pyproject.toml) |
| CI cold start | 45–90s | 3–8s |
| Editor integration | Multiple plugins | ty + ruff extensions |
The Astral toolchain won't make your model smarter or your LLM prompts better. But it removes an enormous amount of friction from the day-to-day work of building production AI systems — and in a field that moves as fast as this one, that friction is expensive.
Try it on your next project. You won't go back.
If you found this useful, I write about production Agentic AI systems, MLOps, and LLM engineering. Follow me on LinkedIn and GitHub.
What are you using for Python tooling right now? Drop a comment — I'm curious how many people are still on the old stack.

