The Astral Toolchain: How Ruff, uv, and ty Replaced My Entire Python Setup

I'm Jose Atlin, a Technical Lead and Senior AI Engineer with 5 years building production AI systems at Litmus7. My work lives at the intersection of Agentic AI, multi-agent orchestration, and MLOps. Not research, not demos: systems that handle real production load, including an SRE automation platform that resolves incidents in under 5 minutes, a hybrid RAG pipeline with >75% retrieval accuracy across 50+ SOPs, LLMs fine-tuned with QLoRA and served via TGI, and a company-wide ML platform adopted by 4 engineering teams. I write about the unglamorous but high-impact side of AI engineering: the architecture decisions, tooling choices, and production war stories that don't make it into research papers.

What you'll find here:

  • Multi-agent system design and MCP integrations
  • RAG pipelines that actually work in production
  • LLM fine-tuning with QLoRA, PEFT, and HuggingFace TGI
  • Python tooling and MLOps
  • Lessons from shipping AI at scale

Currently leading a team of 4 AI engineers. Open to senior AI Engineering and Agentic AI roles.

I used to start every Python project the same way: python -m venv .venv, pip install, wait 45 seconds, run black ., run isort ., run flake8, then fight with mypy for 20 minutes. Sound familiar?

Then I found Astral's toolchain — Ruff, uv, and ty — and that entire ritual collapsed into three commands and a single config file.

This post is a practical walkthrough of the three tools, how I set them up together, and why I think every Python engineer (especially in ML and AI) should be on this stack right now.


The Problem with Traditional Python Tooling

Before we get into solutions, let's be honest about the pain:

| What you need | Old solution | The problem |
| --- | --- | --- |
| Package manager | pip + virtualenv | Slow, no lockfile, no Python version management |
| Formatter | black | Separate tool, separate config |
| Import sorter | isort | Conflicts with black constantly |
| Linter | flake8 | Yet another config file |
| Type checker | mypy | Slow. Very slow. |

That's 5 different tools, 3–4 config files, and a pre-commit hook setup that breaks every 6 months.

For AI/ML projects specifically, it's worse — you're managing heavy dependency trees (PyTorch, Hugging Face, CUDA-specific wheels), different Python versions across environments, and you can't afford slow CI.


Meet the Astral Stack

Astral (now part of OpenAI) builds Python tooling in Rust with a simple thesis: developer tools should be fast enough that you forget they're running.

The three tools:

  • uv: replaces pip, pip-tools, virtualenv, pyenv, poetry, and pipx
  • Ruff: replaces black, isort, and flake8, and ships 800+ built-in lint rules
  • ty: replaces mypy and Pyright as a type checker and language server

All three are written in Rust. All three are orders of magnitude faster than their Python counterparts. And crucially — all three share a single pyproject.toml.

Let's go through each one.


uv — One Tool to Rule Them All

Installation

curl -LsSf https://astral.sh/uv/install.sh | sh

No Python required. No Rust required. It's a single binary.

Starting a new project

# For a simple script-based project (default)
uv init my-ai-project

# For a proper package with src layout (recommended for production)
uv init my-ai-project --package
cd my-ai-project

--package creates the src layout:

my-ai-project/
├── pyproject.toml
├── .python-version
└── src/
    └── my_ai_project/
        └── __init__.py

The uv.lock lockfile is created automatically on your first uv sync or uv add.

The uv.lock file is a universal lockfile — one file that works across macOS, Linux, and Windows, unlike pip-tools which generates platform-specific outputs.

Managing dependencies

# Add a dependency
uv add fastapi

# Add a dev dependency
uv add --dev pytest ruff ty

# Add with version constraint
uv add "torch>=2.3.0"

# Remove
uv remove requests

# Sync environment (installs everything in uv.lock)
uv sync

Python version management — this is the killer feature

# Pin to a specific Python version
uv python pin 3.12

# Install it (uv downloads the interpreter automatically)
uv python install 3.12

# Run with a specific version without changing anything
uv run --python 3.11 python script.py

No more pyenv. No more .python-version conflicts. uv manages interpreters itself.
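
If a script should refuse to run under the wrong interpreter, the pin can also be checked from Python. This is a small sketch under assumptions: `matches_pin` is a hypothetical helper, and it only understands plain numeric pins such as 3.12 or 3.12.4, not the other request formats a .python-version file may contain.

```python
import sys


def matches_pin(pin: str, version_info: tuple[int, ...] = tuple(sys.version_info[:3])) -> bool:
    """Return True if the interpreter version satisfies a numeric pin like '3.12'."""
    wanted = tuple(int(part) for part in pin.strip().split("."))
    # Compare only as many components as the pin specifies.
    return tuple(version_info[: len(wanted)]) == wanted


# Example: compare a pin (as read from .python-version) with a given interpreter version.
print(matches_pin("3.12", (3, 12, 4)))
```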

Running scripts with inline dependencies

This is my favourite uv trick for one-off AI scripts:

# /// script
# requires-python = ">=3.12"
# dependencies = [
#   "anthropic",
#   "rich",
# ]
# ///

import anthropic
from rich import print

client = anthropic.Anthropic()
message = client.messages.create(...)
print(message.content)

Run it with:

uv run my_script.py

uv creates an isolated environment, installs anthropic and rich into it, and runs the script. The environment is cached for later runs and nothing is added to your project: no virtualenv to manage, no pip install. Great for LLM prototyping.

Speed comparison

| Operation | pip | uv |
| --- | --- | --- |
| pip install torch (cold cache) | ~45s | ~4s |
| pip install -r requirements.txt (warm cache) | ~12s | ~0.3s |
| pip install -e . | ~8s | ~0.4s |

In CI/CD pipelines where you run pip install on every push, uv will save you real money on compute costs.


Ruff — The Linter and Formatter That Actually Stays Out of Your Way

Ruff replaces black, isort, flake8, and pyupgrade, and reimplements 800+ lint rules from those tools and their plugins. Ruff's own documentation puts it at 10–100x faster than existing linters and formatters.

Installation

uv add --dev ruff

That's it. No separate black, no separate isort.

Configuration in pyproject.toml

[tool.ruff]
line-length = 100
target-version = "py312"

[tool.ruff.lint]
select = [
    "E",    # pycodestyle errors
    "W",    # pycodestyle warnings
    "F",    # pyflakes
    "I",    # isort
    "B",    # flake8-bugbear
    "C4",   # flake8-comprehensions
    "UP",   # pyupgrade
    "N",    # pep8-naming
]
ignore = ["E501"]  # line too long — handled by formatter

[tool.ruff.format]
quote-style = "double"
indent-style = "space"

Usage

# Lint
ruff check .

# Lint + auto-fix
ruff check . --fix

# Format (replaces black)
ruff format .

# Check + format in one pass
ruff check . --fix && ruff format .

Why this matters for AI codebases

AI engineering code tends to be messy — rapid prototyping, Jupyter notebooks converted to scripts, copy-pasted model code. Ruff's --fix flag handles 80% of style issues automatically. I run it as a pre-commit hook and never think about formatting again:

# .pre-commit-config.yaml equivalent using uv run
uv run ruff check . --fix && uv run ruff format .

Speed comparison on a real ML codebase (~200 files):

| Tool | Time |
| --- | --- |
| black + isort + flake8 | ~8.2s |
| ruff check + ruff format | ~0.18s |

45x faster. On every save in your editor, that difference is noticeable.


ty — The Type Checker That Doesn't Make You Hate Type Checking

ty is Astral's newest tool — currently in beta (v0.0.51 as of June 2026) — and it's the most exciting one for AI engineering code.

The headline numbers: ty checks the entire Home Assistant codebase (one of the largest Python projects) in ~2.19 seconds. mypy takes 45.66 seconds. That's a 20x speedup.

Installation

# As a project dev dependency
uv add --dev ty

# Or as a global tool (run anywhere)
uv tool install ty@latest

# Or run without installing
uvx ty check

Usage

# Type-check the whole project
ty check

# Check a specific file
ty check src/agents/orchestrator.py

# Watch mode (re-checks on file save)
ty check --watch

Configuration in pyproject.toml

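# Limit checking to the src directory
[tool.ty.src]
include = ["src"]

# Rule names change between ty pre-releases; check them against the
# rules reference for the version you have installed.
[tool.ty.rules]
possibly-unresolved-reference = "warn"
invalid-return-type = "error"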

What makes ty different from mypy

1. It actually catches bugs without annotations

def process_llm_response(response):
    data = response.choices[0].message
    return data.contnet  # typo — 'contnet' instead of 'content'

ty can flag this only when it knows the type of response: either the value comes straight from a typed SDK call (the Anthropic and OpenAI SDKs both ship type information) or the parameter is annotated. As written, with a bare unannotated parameter, response is Unknown and ty stays silent, and so does mypy. For untyped third-party libraries the result is the same. The difference is that ty is significantly more aggressive about inferring types from library types and surrounding context, so you get more catches without having to annotate everything yourself.
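
To make that condition concrete, here is a sketch in which the response type is known to the checker. `Response`, `Choice`, and `Message` are hypothetical stand-ins for an SDK's typed classes, not any real library's API. With this in place, a type checker has enough information to report a misspelled attribute before the code runs.

```python
from dataclasses import dataclass


@dataclass
class Message:
    content: str


@dataclass
class Choice:
    message: Message


@dataclass
class Response:
    choices: list[Choice]


def process_llm_response(response: Response) -> str:
    data = response.choices[0].message
    # Writing `data.contnet` here would be reported as an unknown attribute,
    # because `data` is known to be a Message.
    return data.content


print(process_llm_response(Response(choices=[Choice(message=Message(content="hello"))])))
```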

2. It's a language server too

ty ships with a built-in LSP. Install the ty VS Code / Cursor extension and you get:

  • Go to definition
  • Auto-complete
  • Inlay hints
  • Inline type errors as you type

No separate Pylance, no separate Pyright, no configuration conflicts.

3. The gradual guarantee

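ty is designed around the gradual guarantee: removing a type annotation from a program that type-checks never introduces a new error. Put the other way round, unannotated code is never penalised for being unannotated, and errors appear only where the types you have declared disagree with how the code is used. This makes incremental adoption safe: you can add types one file at a time while the untyped rest of the codebase stays quiet.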

A note on beta status

ty is production-ready for motivated teams but it's still beta. Some edge cases (particularly with complex Pydantic models and decorator-heavy code) may have false positives. Astral is targeting a stable 1.0 release in 2026. For AI projects, it works excellently on FastAPI routes, Pydantic schemas, and core agent logic.


Editor Setup — Ruff + ty in VS Code and Cursor

Both tools have first-party extensions. Here's how to get them wired up in under 5 minutes.

Install the extensions

VS Code / Cursor:

Search the Extensions panel for:

  • astral-sh.ruff — Ruff linter + formatter
  • astral-sh.ty — ty type checker + language server

Or install from the command line:

# VS Code
code --install-extension astral-sh.ruff
code --install-extension astral-sh.ty

# Cursor
cursor --install-extension astral-sh.ruff
cursor --install-extension astral-sh.ty

Configure settings.json

Add this to your workspace .vscode/settings.json (or user settings):

{
  // ── Ruff ────────────────────────────────────────────────
  "[python]": {
    "editor.defaultFormatter": "astral-sh.ruff",
    "editor.formatOnSave": true,
    "editor.codeActionsOnSave": {
      "source.fixAll.ruff": "explicit",
      "source.organizeImports.ruff": "explicit"
    }
  },

  // ── ty ──────────────────────────────────────────────────
  "ty.enable": true,

  // ── Disable Pylance type checking to avoid double diagnostics ─
  // (keep Pylance installed for other features, just turn off type checking)
  "python.analysis.typeCheckingMode": "off"
}

What each setting does:

  • editor.defaultFormatter — makes Ruff the formatter on save, replacing black
  • source.fixAll.ruff — auto-applies all safe lint fixes on save (equivalent to ruff check . --fix)
  • source.organizeImports.ruff — auto-sorts imports on save, replacing isort
  • python.analysis.typeCheckingMode: off — stops Pylance's type checker from running alongside ty, eliminating duplicate diagnostics

Cursor users: Cursor ships with its own Python intelligence layer. You may need to disable "Python › Analysis: Type Checking Mode" in Cursor settings explicitly if you see duplicate diagnostics. Do not fully disable Pylance — it provides auto-complete and other features that are independent of type checking.

What you get

With this setup, every time you save a Python file:

  1. Ruff formats it (replaces black)
  2. Ruff sorts imports (replaces isort)
  3. Ruff auto-fixes safe lint violations
  4. ty shows inline type errors in the gutter — no terminal needed

The feedback loop goes from "run commands manually" to "instant, on every keystroke."


The Complete Setup — One pyproject.toml to Rule Them All

Here's the full config I use for AI engineering projects:

[project]
name = "my-ai-project"
version = "0.1.0"
description = "Production agentic AI system"
requires-python = ">=3.12"
dependencies = [
    "fastapi>=0.115.0",
    "anthropic>=0.40.0",
    "pydantic>=2.10.0",
    "redis>=5.2.0",
    "sqlalchemy>=2.0.0",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

# ─── Dev tooling ─────────────────────────────────────────────
# PEP 735 dependency groups, which is where `uv add --dev` writes.
# Preferred over [project.optional-dependencies] for dev tooling:
# these are never installed in production or by other packages.
# (The older [tool.uv] dev-dependencies table still works but is legacy.)
[dependency-groups]
dev = [
    "ruff>=0.9.0",
    "ty>=0.0.50",
    "pytest>=8.0.0",
    "pytest-asyncio>=0.25.0",
]

# ─── Ruff ────────────────────────────────────────────────────
[tool.ruff]
line-length = 100
target-version = "py312"

[tool.ruff.lint]
select = ["E", "W", "F", "I", "B", "C4", "UP", "N", "ASYNC"]
ignore = ["E501", "B008"]

[tool.ruff.lint.per-file-ignores]
"tests/**" = ["S101"]  # allow assert in tests

[tool.ruff.format]
quote-style = "double"
indent-style = "space"

# ─── ty ──────────────────────────────────────────────────────
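# Limit checking to the src directory
[tool.ty.src]
include = ["src"]

# Rule names change between ty pre-releases; check them against the
# rules reference for the version you have installed.
[tool.ty.rules]
possibly-unresolved-reference = "warn"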

Ruff Rules — The Full Reference

The minimal config above gets you started. But Ruff supports over 800 rules across dozens of plugins — everything from import ordering to security scanning to async best practices. Here's the complete annotated ruleset I use for production AI projects, with every rule explained:

[tool.ruff.lint]

# ─── DEFAULTS (already on unless you override) ───────────────
# Ruff enables these even without configuration:
# "F"   – Pyflakes: undefined names, unused imports, syntax errors
# "E4"  – Pycodestyle: import-related style issues
# "E7"  – Pycodestyle: statement-level style issues
# "E9"  – Pycodestyle: runtime syntax errors

# ─── EXTENDED RULES ──────────────────────────────────────────
extend-select = [
    "E",     # Pycodestyle errors          – style issues (indentation, whitespace, etc.)
    "W",     # Pycodestyle warnings        – style warnings (trailing whitespace, blank lines)
    "C90",   # McCabe complexity           – flags functions above a complexity threshold
    "I",     # isort                       – import ordering (replaces isort entirely)
    "N",     # PEP8 Naming                 – naming conventions for classes, functions, variables
    "D",     # Pydocstyle                  – docstring formatting and presence
    "UP",    # Pyupgrade                   – rewrites syntax to newer Python versions automatically
    "YTT",   # Flake8-2020                 – misuse of sys.version / sys.version_info
    # "ANN", # Flake8-annotations          – enforces type annotation style (aggressive, opt-in)
    "ASYNC", # Flake8-async                – async/await correctness (blocking calls in async, etc.)
    "S",     # Flake8-bandit               – security: SQL injection, shell injection, hardcoded secrets
    "BLE",   # Flake8-blind-except         – flags catch-all handlers (bare `except:`, `except Exception:`)
    "FBT",   # Flake8-boolean-trap         – catches boolean argument pitfalls in function signatures
    "B",     # Flake8-bugbear              – common bug patterns (mutable defaults, assert misuse, etc.)
    "A",     # Flake8-builtins             – prevents shadowing Python built-in names (list, id, etc.)
    "COM",   # Flake8-commas               – trailing comma consistency
    # "CPY", # Flake8-copyright            – copyright header enforcement (opt-in per project)
    "C4",    # Flake8-comprehensions       – suggests cleaner list/dict/set comprehension patterns
    "DTZ",   # Flake8-datetimez            – requires timezone-aware datetime objects (prevents bugs)
    "T10",   # Flake8-debugger             – flags leftover pdb / breakpoint() statements
    "DJ",    # Flake8-django               – Django-specific conventions (skip if not using Django)
    "EM",    # Flake8-errmsg               – exception message style (no f-strings directly in raise)
    "EXE",   # Flake8-executable           – checks shebang lines and executable bits
    "FA",    # Flake8-future-annotations   – flags missing `from __future__ import annotations`
    "ISC",   # Flake8-implicit-str-concat  – warns on implicit string concatenation across lines
    "ICN",   # Flake8-import-conventions   – enforces conventional aliases (import numpy as np, etc.)
    "LOG",   # Flake8-logging              – logging module misuse (direct Logger() instantiation, logging.WARN, etc.)
    "G",     # Flake8-logging-format       – flags % and .format() in logging calls (use lazy args)
    "INP",   # Flake8-no-pep420            – requires __init__.py (no implicit namespace packages)
    "PIE",   # Flake8-pie                  – miscellaneous Python improvement suggestions
    # "T20", # Flake8-print                – disallows print() statements (useful in production code)
    "PYI",   # Flake8-pyi                  – type stub (.pyi) consistency checks
    "PT",    # Flake8-pytest-style         – pytest best practices (fixture naming, assert style, etc.)
    "Q",     # Flake8-quotes               – enforces consistent quote style
    "RSE",   # Flake8-raise                – flags unnecessary parentheses on raised exceptions
    "RET",   # Flake8-return               – return statement issues (unnecessary else after return, etc.)
    "SLF",   # Flake8-self                 – flags access to private members (obj._attr) from outside the class
    "SLOT",  # Flake8-slots                – requires __slots__ on subclasses of str, tuple, and namedtuple
    "SIM",   # Flake8-simplify             – code simplification (ternary instead of if/else, etc.)
    "TID",   # Flake8-tidy-imports         – enforces import style (no relative imports, banned modules)
    "TC",    # Flake8-type-checking        – proper use of TYPE_CHECKING blocks for typing imports
    "INT",   # Flake8-gettext              – proper internationalisation (i18n) usage
    "ARG",   # Flake8-unused-arguments     – flags unused function/method arguments
    "PTH",   # Flake8-use-pathlib          – encourages pathlib over os.path
    "TD",    # Flake8-todos                – enforces TODO comment format (author, issue link)
    "FIX",   # Flake8-fixme                – flags TODO, FIXME, XXX, and HACK comments left in code
    # "ERA", # Eradicate                   – detects commented-out code (can be noisy, opt-in)
    "PD",    # Pandas-vet                  – pandas-specific code practices
    "PGH",   # Pygrep-hooks                – flags blanket `# type: ignore` and `# noqa` comments
    "PL",    # Pylint                      – Pylint conventions integrated into Ruff
    "TRY",   # Tryceratops                 – try/except usage suggestions
    "FLY",   # Flynt                       – f-string conversion suggestions
    "NPY",   # NumPy-specific rules        – NumPy coding standards
    "FAST",  # FastAPI                     – FastAPI-specific linting (response models, route signatures)
    "AIR",   # Airflow                     – Airflow-specific rules (skip if not using Airflow)
    "PERF",  # Perflint                    – performance: unnecessary list() calls, slow loops, etc.
    "FURB",  # Refurb                      – modern Python rewrites (replace old patterns with newer idioms)
    # "DOC", # Pydoclint                   – stricter docstring linting (opt-in)
    "RUF",   # Ruff-specific rules         – Ruff's own additional checks
]

# ─── GLOBAL IGNORES ──────────────────────────────────────────
ignore = [
    "E501",    # line too long             – handled by formatter (ruff format)
    "COM812",  # missing trailing comma    – conflicts with ruff format, which manages trailing commas
    "D1",      # missing docstring         – docstring presence is not enforced, only formatting
    "FBT003",  # boolean positional value  – common in internal function calls
    "D203",    # blank line before class   – conflicts with D211 (use D211 instead)
    "D212",    # summary after quotes      – conflicts with D213 (use D213 instead)
    "D400",    # period at end of docstr   – overly prescriptive
    "D401",    # imperative mood           – overly prescriptive
    "D415",    # period/question/exclaim   – overly prescriptive
    "S311",    # pseudo-random generators  – fine for non-cryptographic use
    "PERF401", # list comprehension        – readability sometimes beats micro-optimisation
    "RET504",  # assign before return      – sometimes improves readability (named result)
    "FA102",   # future annotations union  – not needed on Python 3.10+
    "TRY003",  # long exception message    – acceptable for descriptive domain errors
    "EM101",   # string literal in raise   – common pattern, not always worth splitting
    "TC002",   # third-party typing import – moving it under TYPE_CHECKING is too strict for most codebases
    "TC003",   # stdlib typing import      – same, for standard-library imports
]

# ─── PER-FILE IGNORES ────────────────────────────────────────
[tool.ruff.lint.per-file-ignores]
# Test files get relaxed rules — asserts, magic values, and private
# member access are all normal in test code.
"test_*.py"  = ["S101", "S105", "S106", "S107", "PLR2004", "SLF001", "D", "ANN", "ARG001", "PLC0415", "EM102"]
"*_test.py"  = ["S101", "S105", "S106", "S107", "PLR2004", "SLF001", "D", "ANN", "ARG001", "PLC0415", "EM102"]
"conftest.py"= ["S101", "S105", "S106", "S107", "PLR2004", "SLF001", "D", "ANN", "ARG001", "PLC0415", "EM102"]

# Per-file ignore reference for tests:
# S101    – asserts are standard in pytest
# S105/106/107 – hardcoded passwords in test fixtures are fine
# PLR2004 – magic values (e.g. 200, 404) are common in HTTP response tests
# SLF001  – private member access needed to test internal state
# D       – docstrings are redundant in verbose pytest function names
# ANN     – type annotations add overhead without benefit in tests
# ARG001  – pytest fixtures ensure state exists even when not directly referenced
# PLC0415 – local imports in tests are valid for isolation
# EM102   – f-string exceptions in test helpers are fine for debugging

A few things worth highlighting from this config:

ASYNC rules are particularly valuable for AI engineering code. They catch blocking calls (requests.get, time.sleep, open()) inside async def functions — a common mistake when wrapping synchronous LLM SDK calls.
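
A small sketch of why this matters: two coroutines run under asyncio.gather, once with a blocking time.sleep (the kind of call the ASYNC rules flag) and once with asyncio.sleep. The worker names are hypothetical. The event log shows that the blocking version runs the workers one after the other, while the cooperative version lets both start before either finishes.

```python
import asyncio
import time


async def blocking_worker(name: str, log: list[str]) -> None:
    log.append(f"{name} start")
    time.sleep(0.01)  # blocks the whole event loop while it waits
    log.append(f"{name} end")


async def cooperative_worker(name: str, log: list[str]) -> None:
    log.append(f"{name} start")
    await asyncio.sleep(0.01)  # yields to the event loop while it waits
    log.append(f"{name} end")


async def run_pair(worker) -> list[str]:
    """Run two workers concurrently and return the order of their events."""
    log: list[str] = []
    await asyncio.gather(worker("a", log), worker("b", log))
    return log


print(asyncio.run(run_pair(blocking_worker)))
print(asyncio.run(run_pair(cooperative_worker)))
```

The same reasoning applies to a synchronous SDK call inside an async route: every other request waits until it returns.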

FAST rules are purpose-built for FastAPI. They flag redundant response_model arguments, dependencies declared without Annotated, and path parameters that are missing from the route function's signature. Useful if you're building LLM APIs.

S (Bandit) rules catch security issues that matter in production: SQL injection via string concatenation, subprocess shell injection, hardcoded credentials. Non-negotiable for anything that touches customer data.
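
For a concrete picture of the SQL case, the sketch below contrasts a string-built query, the pattern the S rules report, with a parameterised one. It uses an in-memory SQLite database from the standard library. The table, function names, and payload are invented for the example.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list[tuple]:
    # String-built SQL: user input becomes part of the query text.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user(conn: sqlite3.Connection, name: str) -> list[tuple]:
    # Parameterised query: the driver handles quoting, so input stays data.
    return conn.execute("SELECT id, name FROM users WHERE name = ?", (name,)).fetchall()


def make_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


conn = make_demo_db()
payload = "x' OR '1'='1"
print(len(find_user_unsafe(conn, payload)), "rows leaked by the string-built query")
print(len(find_user(conn, payload)), "rows returned by the parameterised query")
```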

With D1 in the ignore list, the D (docstring) rules never require a docstring to exist. What they do enforce is consistent formatting for the docstrings you choose to write, which gives you docstring checking without the noise.
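
The per-file ignores for test files are easier to judge against a real test. Here is a hypothetical example (`TokenBudget` is invented for illustration) annotated with the rules that would fire on it if those ignores were not in place:

```python
class TokenBudget:
    """Hypothetical class under test."""

    def __init__(self, limit: int) -> None:
        self._remaining = limit

    def spend(self, tokens: int) -> bool:
        if tokens > self._remaining:
            return False
        self._remaining -= tokens
        return True


def test_spend_reduces_budget() -> None:
    budget = TokenBudget(limit=500)
    assert budget.spend(200)  # S101: assert is how pytest checks things
    assert budget._remaining == 300  # SLF001: private access, PLR2004: magic value
    assert not budget.spend(400)


test_spend_reduces_budget()
```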


Four Ways to Use the Astral Stack

There isn't one right way to integrate these tools — each approach gives you a different level of enforcement. Here's the full picture.


Option 1: GitHub Actions (Pipeline Enforcement)

The strictest option. Every PR runs the checks automatically, and a failing lint or type check blocks the merge. Nobody ships broken code — but the tradeoff is that developers only find out about failures after pushing.

name: CI

on: [push, pull_request]

jobs:
  quality:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true

      - name: Set up Python
        run: uv python install

      - name: Install dependencies
        run: uv sync --all-extras

      - name: Lint
        run: uv run ruff check .

      - name: Format check
        run: uv run ruff format --check .

      - name: Type check
        run: uv run ty check

      - name: Tests
        run: uv run pytest

The astral-sh/setup-uv action handles caching automatically. On warm runs, uv sync finishes in under a second.

Watch out: If ruff or ty is added to an existing codebase without a clean-up pass first, the first CI run will fail hard. Run ruff check . --fix && ruff format . locally before wiring this up.

Bitbucket Pipelines

If your team is on Bitbucket instead of GitHub, the setup is nearly identical. Bitbucket doesn't have a first-party uv action, so you install it via the bootstrap script:

# bitbucket-pipelines.yml
image: python:3.12-slim

pipelines:
  pull-requests:
    '**':
      - step:
          name: Quality Checks
          caches:
            - uv
          script:
            # Install uv
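            # python:3.12-slim does not include curl, so install it first
            - apt-get update && apt-get install -y --no-install-recommends curl ca-certificates
            - curl -LsSf https://astral.sh/uv/install.sh | sh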
            - export PATH="$HOME/.local/bin:$PATH"

            # Install dependencies
            - uv sync --all-extras

            # Lint
            - uv run ruff check .

            # Format check
            - uv run ruff format --check .

            # Type check
            - uv run ty check

            # Tests
            - uv run pytest

definitions:
  caches:
    uv: ~/.cache/uv

The uv cache definition at the bottom tells Bitbucket to persist uv's global package cache between pipeline runs — same effect as enable-cache: true in the GitHub Actions setup. Warm runs resolve and install in milliseconds.

Bitbucket pipeline steps exit non-zero on any command failure, so a failing ruff or ty check automatically blocks the PR — no extra configuration needed.


Option 2: Dev Dependencies (uv add --dev)

Install ruff and ty as project dev dependencies. Anyone who clones the repo and runs uv sync gets the tools pinned to the exact same version — no "works on my machine" drift.

uv add --dev ruff ty

uv records them in pyproject.toml as a dependency group:

[dependency-groups]
dev = [
    "ruff>=0.9.0",
    "ty>=0.0.50",
]

Developers run checks explicitly in their terminal:

uv run ruff check . --fix
uv run ruff format .
uv run ty check

This is the right setup for most teams — the tools are version-locked alongside the project, and CI uses the same uv run commands. The pipeline can still block PRs — same as Option 1 — because your GitHub Actions workflow calls uv run ruff check . which uses the dev-installed version.

The key advantage over Option 1 alone: developers can run the exact same commands that CI runs, locally, before pushing. No guessing what the pipeline checks.
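
One way to keep local runs and CI in step is a small runner script that executes the same commands in the same order. This is a sketch, not part of any of the tools: `CHECKS`, `run_checks`, and `main` are hypothetical names, and the command list simply mirrors the CI steps shown earlier.

```python
import subprocess

# The same commands the CI workflow runs, in the same order.
CHECKS: list[list[str]] = [
    ["uv", "run", "ruff", "check", "."],
    ["uv", "run", "ruff", "format", "--check", "."],
    ["uv", "run", "ty", "check"],
    ["uv", "run", "pytest"],
]


def run_checks(commands: list[list[str]]) -> list[list[str]]:
    """Run every command and return the ones that exited non-zero."""
    failed = []
    for command in commands:
        print("$", " ".join(command))
        if subprocess.run(command).returncode != 0:
            failed.append(command)
    return failed


def main() -> int:
    """Exit status for the whole run: 0 if every check passed, 1 otherwise."""
    return 1 if run_checks(CHECKS) else 0
```

Saved as a script that ends with `raise SystemExit(main())`, this gives the team one command to run before pushing. Running every check instead of stopping at the first failure means you see all the problems in one pass.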


Option 3: uvx — Global Tools, No Install Required

This is the most lightweight option and the one I recommend for individual developers who want to run checks on-demand without touching a project's dependencies.

# Run without installing anything into the project
uvx ruff check .
uvx ruff format .
uvx ty check

uvx runs the tool in an isolated environment managed by uv, installing the newest release the first time and reusing the cached copy afterwards (run uvx ruff@latest to force a refresh). No uv add, no virtualenv activation, no version pinning needed.

Why this matters: When you're working on a PR and want to verify your code before pushing — without waiting for a CI run to come back — uvx lets you do it instantly. You're not blocked by pipeline queue times or approval gates. You check locally, fix locally, push clean.

# My pre-push routine — takes under 1 second total
uvx ruff check . --fix && uvx ruff format . && uvx ty check

The one tradeoff: uvx ignores the version pinned in your project and uses whichever release it resolved, so there's a small risk of version drift between your local run and the pinned version in CI. For most projects this doesn't matter, but if you need byte-for-byte reproducibility, stick with Option 2.


Option 4: Pre-commit Hooks (Local Gate Before Push)

Pre-commit hooks run checks automatically when you run git commit — before the code ever leaves your machine. It's a local enforcement layer that catches issues even earlier than CI.

Ruff has a first-party pre-commit hook maintained by Astral:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.9.0
    hooks:
      # Run linter and auto-fix
      - id: ruff
        args: [--fix]
      # Run formatter
      - id: ruff-format

  # ty doesn't have an official pre-commit hook yet (beta)
  # Use uv run ty check in CI instead

Install and activate:

uv add --dev pre-commit
uv run pre-commit install       # wires it to git commit
uv run pre-commit run --all-files  # run manually on everything

Now every git commit automatically runs ruff lint + format. If anything fails, the commit is blocked until the code is clean.

Pre-commit vs uvx — which to use?

| | pre-commit hook | uvx (manual) |
| --- | --- | --- |
| When it runs | Automatically on every git commit | Only when you run it explicitly |
| Blocks commits | Yes, can't commit until clean | No, you choose when to check |
| Setup overhead | Requires installation per clone | Zero, just uvx ruff check . |
| Best for | Teams where discipline varies | Individual developers who self-enforce |

Honest take: If your team already has a pre-commit discipline, add ruff to it. If not, the uvx pre-push routine is lighter and easier to adopt. Both are better than relying solely on CI feedback.


Which approach to use?

| Scenario | Recommendation |
| --- | --- |
| Team project with a shared repo | Option 2 (dev deps) + Option 1 (CI enforcement) |
| Solo project or quick scripts | Option 3 (uvx) |
| Want to catch issues before pushing | Option 3 (uvx) or Option 4 (pre-commit) |
| Team needs automated local enforcement | Option 4 (pre-commit) |
| Strict team standards, no exceptions | Option 1 + Option 4 |
| All four together | ✅ Best setup: local freedom + local gate + CI safety net |

The combination I use day-to-day: dev dependencies for the project (so the whole team uses the same versions), GitHub Actions to enforce on PRs (so nothing slips through), and uvx for quick local checks when I want instant feedback before I even open a PR.


Docker Integration — uv in Production Containers

For AI engineering, your code eventually goes into a Docker container — a FastAPI inference server, an agent runtime, a training job. uv changes how you should write your Dockerfiles.

The key principle: separate dependency installation from code copying so Docker's layer cache means you only re-install packages when pyproject.toml or uv.lock actually changes.

Standard single-stage Dockerfile

FROM python:3.12-slim

# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

WORKDIR /app

# Copy lockfile and project metadata first — Docker caches this layer
# independently from your source code. Dependencies only re-install
# when pyproject.toml or uv.lock changes.
COPY pyproject.toml uv.lock ./

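# Install dependencies only (no dev deps in production). --no-install-project
# skips building the project itself, which would fail before src/ is copied.
RUN uv sync --frozen --no-dev --no-install-project

# Now copy source code
COPY src/ ./src/

# Run via uv so it uses the managed virtualenv. --no-sync stops uv from
# re-syncing the environment (and pulling in dev deps) at container start.
CMD ["uv", "run", "--no-sync", "uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]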

Multi-stage Dockerfile (leaner production image)

For production AI services where image size matters:

# ── Stage 1: Build ────────────────────────────────────────────
FROM python:3.12-slim AS builder

COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

WORKDIR /app

COPY pyproject.toml uv.lock ./

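# Install dependencies into /app/.venv. --no-install-project skips the
# project itself, whose source is not present in this stage.
RUN uv sync --frozen --no-dev --no-install-project --no-editable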

# ── Stage 2: Runtime ──────────────────────────────────────────
FROM python:3.12-slim AS runtime

WORKDIR /app

# Copy only the virtualenv from the builder — no uv, no build tools
COPY --from=builder /app/.venv /app/.venv

# Copy source
COPY src/ ./src/

# Use the virtualenv's Python directly (no uv needed at runtime)
ENV PATH="/app/.venv/bin:$PATH"
CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]

The multi-stage build leaves uv itself out of the final image — the runtime only contains Python, your virtualenv, and your code. For ML services with large model weights, keeping the base image lean matters.

Key flags explained

| Flag | What it does |
| --- | --- |
| --frozen | Installs exactly what uv.lock records, without re-resolving or updating it. Use --locked instead if the build should fail when uv.lock is out of sync with pyproject.toml |
| --no-dev | Skips dev dependencies (ruff, ty, pytest), so nothing debug-related ships to production |
| --no-editable | Installs the package normally instead of as an editable install, which is required for multi-stage copies |

.dockerignore

Always add this so Docker doesn't copy your virtualenv into the build context:

.venv/
__pycache__/
*.pyc
.ruff_cache/
.ty_cache/
.git/

Why This Matters Specifically for AI Engineering

AI/ML Python projects have three tooling problems that make this stack particularly valuable:

1. Heavy dependency trees

Installing torch, transformers, langchain, milvus-client, and kubeflow with pip can take 3–5 minutes in CI. With uv's global dependency cache, re-runs with no changes take milliseconds. If you're running training experiments across multiple branches, this adds up fast.

2. Rapidly evolving codebases

Agentic AI code changes fast — new tools, new agents, new API integrations every sprint. Ruff's --fix mode means you don't slow down to deal with linting noise. ty's incremental type checking means you catch regressions the moment you introduce them.

3. Type safety in LLM integration code

The worst bugs in production AI systems are often shape mismatches: wrong field names on API responses, incorrect Pydantic model paths, async functions called synchronously. ty can catch many of these before runtime, in milliseconds, wherever the types involved are known from annotations or typed libraries, without requiring you to annotate everything upfront.


Migration Path — If You're on the Old Stack

If you're currently on pip + black + flake8 + mypy:

# Step 1: Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Step 2: Migrate your project
uv init --no-package                 # if not a package
uv add -r requirements.txt           # import existing deps into pyproject.toml
uv lock                              # confirm the lockfile is up to date

# Step 3: Replace black + isort + flake8 with ruff
uv add --dev ruff
# Delete: .flake8, .isort.cfg, pyproject.toml [tool.black] section

# Step 4: Add ty (optional, can do incrementally)
uv add --dev ty
uv run ty check   # see what it finds

Total migration time for a mid-sized project: under 30 minutes.


Summary

| | Old Stack | Astral Stack |
| --- | --- | --- |
| Package manager | pip + virtualenv + pyenv + poetry | uv |
| Formatter | black | ruff format |
| Import sorter | isort | ruff (built-in) |
| Linter | flake8 | ruff check |
| Type checker | mypy | ty |
| Config files | 4–5 | 1 (pyproject.toml) |
| CI cold start | 45–90s | 3–8s |
| Editor integration | Multiple plugins | ty + ruff extensions |

The Astral toolchain won't make your model smarter or your LLM prompts better. But it removes an enormous amount of friction from the day-to-day work of building production AI systems — and in a field that moves as fast as this one, that friction is expensive.

Try it on your next project. You won't go back.


If you found this useful, I write about production Agentic AI systems, MLOps, and LLM engineering. Follow me on LinkedIn and GitHub.


What are you using for Python tooling right now? Drop a comment — I'm curious how many people are still on the old stack.
