goated-erd-fr — Juozas Žilys

Problem

A production schema I worked with had grown past 190 tables across five logical groups. Every tool I tried for visualizing and versioning it fell short in a different way:

MySQL Workbench — binary .mwb files, unreadable in git diffs
dbdiagram.io — no auto-grouping, dumps everything into one wall of tables
DrawDB — visibly lags past ~100 tables
DbSchema — the only tool with smart grouping, but $59 per seat after a 15-day trial
DBML format — its TableGroup directive is ignored by most renderers

Nothing free handled “200+ tables with smart auto-grouping and a git-friendly file format.”

What I built

A Tauri 2 desktop app with a Rust analyzer driving a React Flow canvas.

Rust backend (schema_analyzer.rs, 1,043 lines) runs the O(n²) relationship analysis and layout. Hub-based clustering finds tables with the most foreign-key connections, treats them as cluster centroids, then assigns peripheral tables to their strongest hub. Orphans are grouped by keyword similarity — Levenshtein distance, underscore/camelCase tokenization, and prefix/suffix overlap.
React + React Flow renders the graph. Drag, zoom, and pan stay smooth at 200+ nodes because the heavy math never crosses into JS.
Tauri 2 over Electron — ~3 MB binary vs. 100 MB+, with native system access for file dialogs and future direct-MySQL connections.
Custom JSON file format stores tables, relationships, positions, colors, and groups. Designed specifically for git: stable ordering, one table per block, human-readable diffs.
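
The hub-based clustering step can be sketched roughly as follows. This is a minimal illustration, not the code in schema_analyzer.rs: the function name assign_to_hubs, the index-based Edge type, the fixed hub_count parameter, and the reading of "strongest hub" as "the hub with the most direct foreign-key edges to this table" are all assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// A foreign-key edge between two tables, identified by index.
/// (Hypothetical representation chosen for this sketch.)
pub type Edge = (usize, usize);

/// Returns, for each table, the index of the hub it belongs to.
/// Hubs map to themselves. Tables with no direct edge to any hub get `None`;
/// those are the orphans that would go on to keyword-based grouping.
pub fn assign_to_hubs(table_count: usize, edges: &[Edge], hub_count: usize) -> Vec<Option<usize>> {
    // Degree = number of foreign-key edges touching each table.
    let mut degree = vec![0usize; table_count];
    for &(a, b) in edges {
        degree[a] += 1;
        degree[b] += 1;
    }

    // Take the `hub_count` most connected tables as hubs.
    // Ties break toward the lower index so the result is deterministic.
    let mut candidates: Vec<usize> = (0..table_count).filter(|&t| degree[t] > 0).collect();
    candidates.sort_by(|&a, &b| degree[b].cmp(&degree[a]).then(a.cmp(&b)));
    let hubs: Vec<usize> = candidates.into_iter().take(hub_count).collect();

    let mut assignment: Vec<Option<usize>> = vec![None; table_count];
    for &h in &hubs {
        assignment[h] = Some(h);
    }

    for t in 0..table_count {
        if assignment[t].is_some() {
            continue; // already a hub
        }
        // Count direct edges from this table to each hub.
        let mut strength: BTreeMap<usize, usize> = BTreeMap::new();
        for &(a, b) in edges {
            let other = if a == t {
                b
            } else if b == t {
                a
            } else {
                continue;
            };
            if hubs.contains(&other) {
                *strength.entry(other).or_insert(0) += 1;
            }
        }
        // Strongest hub wins; on equal counts the lower hub index wins.
        assignment[t] = strength
            .into_iter()
            .max_by(|x, y| x.1.cmp(&y.1).then(y.0.cmp(&x.0)))
            .map(|(hub, _)| hub);
    }
    assignment
}
```

With tables indexed as users = 0, orders = 1, addresses = 2, order_items = 3, payments = 4, audit_log = 5 and two hubs requested, addresses attaches to users, order_items and payments attach to orders, and audit_log is left as an orphan.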

Interesting bits

Layout is a three-pass pipeline. Groups are placed on a balanced grid with placement weights derived from inter-group edge counts. Tables within each group use a radial layout around their hub. A final optimize_layout pass detects and resolves overlaps the first two passes can’t avoid.
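
As an illustration of the second pass only, a radial placement around a hub might look like the sketch below. The function name radial_positions, the tuple-based coordinates, and the even angular spacing on a single ring are assumptions for the example; the real pass may vary radius or spacing by table size.

```rust
use std::f64::consts::TAU;

/// Places `count` tables evenly on a circle of `radius` around the hub at `center`.
/// The first table sits directly to the right of the hub; the rest follow
/// in order of increasing angle. Returns an empty list when `count` is 0.
pub fn radial_positions(center: (f64, f64), radius: f64, count: usize) -> Vec<(f64, f64)> {
    (0..count)
        .map(|i| {
            // Fraction of a full turn (TAU = 2π) for the i-th table.
            let angle = TAU * (i as f64) / (count as f64);
            (
                center.0 + radius * angle.cos(),
                center.1 + radius * angle.sin(),
            )
        })
        .collect()
}
```

A single ring gets crowded as a group grows, which is one reason a separate overlap-resolution pass is still needed afterwards.
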
Keyword tokenization handles real-world naming. user_addresses and userAddresses tokenize to [user, addresses] and match even though their raw strings don’t. Prefix and suffix checks catch conventions like order_* or *_audit.
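
The tokenization and similarity checks described above can be sketched with the standard library alone. The names tokenize, levenshtein, and shares_prefix_or_suffix are hypothetical, and treating "prefix" and "suffix" as the first and last token is an assumption of this sketch rather than a description of the analyzer's actual rules.

```rust
/// Splits an identifier into lowercase tokens on underscores and
/// on lowercase-to-uppercase (camelCase) boundaries.
pub fn tokenize(name: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut current = String::new();
    let mut prev_lower = false;
    for ch in name.chars() {
        if ch == '_' {
            if !current.is_empty() {
                tokens.push(std::mem::take(&mut current));
            }
            prev_lower = false;
            continue;
        }
        // An uppercase letter right after a lowercase letter or digit starts a new token.
        if ch.is_uppercase() && prev_lower && !current.is_empty() {
            tokens.push(std::mem::take(&mut current));
        }
        current.extend(ch.to_lowercase());
        prev_lower = ch.is_lowercase() || ch.is_ascii_digit();
    }
    if !current.is_empty() {
        tokens.push(current);
    }
    tokens
}

/// Classic Levenshtein edit distance, computed one row at a time.
pub fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitute = prev[j] + if ca == cb { 0 } else { 1 };
            let delete = prev[j + 1] + 1;
            let insert = curr[j] + 1;
            let best = substitute.min(delete).min(insert);
            curr.push(best);
        }
        prev = curr;
    }
    prev[b.len()]
}

/// True when two names share their first token (order_*) or last token (*_audit).
pub fn shares_prefix_or_suffix(a: &str, b: &str) -> bool {
    let (ta, tb) = (tokenize(a), tokenize(b));
    match (ta.first(), tb.first(), ta.last(), tb.last()) {
        (Some(fa), Some(fb), Some(la), Some(lb)) => fa == fb || la == lb,
        _ => false,
    }
}
```

Comparing token lists rather than raw strings is what lets user_addresses and userAddresses match exactly, while Levenshtein distance on individual tokens can still catch near-misses such as address versus addresses.
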
Why Rust for the math. An O(n²) pass over 200+ tables risked stalling JS under GC pressure, so the analysis has lived in Rust from the start. Rust returns serialized JSON, so the Tauri IPC boundary is the only crossing the UI has to make.
JSON over binary was the single most useful decision. PR reviews now show real schema diffs instead of “binary file changed.”
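
To show why stable ordering matters for diffs, here is a small std-only sketch of a deterministic writer. The Table fields (group, x, y), the "tables" key, and the function to_stable_json are hypothetical and do not describe the app's actual file format. A real implementation would more likely use a serialization crate; the hand-rolled formatting here only keeps the example self-contained.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-table record; the real format also stores
/// relationships and colors, which are omitted here.
pub struct Table {
    pub group: String,
    pub x: i32,
    pub y: i32,
}

/// Escapes backslashes and double quotes. Assumes names contain no
/// control characters, which would need further escaping in JSON.
fn escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Writes tables sorted by name, one table per line, so that adding or
/// moving a single table changes a single line in a git diff.
/// BTreeMap iterates in key order, which gives the stable ordering.
pub fn to_stable_json(tables: &BTreeMap<String, Table>) -> String {
    let blocks: Vec<String> = tables
        .iter()
        .map(|(name, t)| {
            format!(
                "    \"{}\": {{ \"group\": \"{}\", \"x\": {}, \"y\": {} }}",
                escape(name),
                escape(&t.group),
                t.x,
                t.y
            )
        })
        .collect();
    format!("{{\n  \"tables\": {{\n{}\n  }}\n}}\n", blocks.join(",\n"))
}
```

Because the output depends only on the contents of the map and not on insertion order, saving the same schema twice produces byte-identical files, and a reviewer sees only the lines for tables that actually changed.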