Juozas Žilys

Modernizing an inherited enterprise monolith

780+ merged PRs into a large .NET / React ERP — custom Roslyn analyzers, source generators, DDD refactors, warehouse automation, and an AI vision pipeline.

Role
Senior full-stack engineer — feature ownership, tooling, architecture
Period
2021 – present

.NET 10 · C# 14 · React · TypeScript · MariaDB · Elasticsearch · SignalR · Hangfire · Roslyn

Context

The primary product I work on is a multi-year, multi-team .NET + React ERP and e-commerce platform for the automotive aftermarket — inventory, vehicle/part cataloguing, multi-marketplace selling, orders, invoicing, shipping. The codebase spans ~28 projects, thousands of C# files, a React frontend, Hangfire background jobs, SignalR real-time hubs, Elasticsearch for search, and integrations into a dozen external systems.

Much of the earliest code was written by juniors under time pressure. A lot of what I do is pick representative slices and move them toward something maintainable without blocking delivery. What follows covers the contributions I personally drove, not the system as a whole.

For scale: since joining in 2021 I’ve merged 780+ pull requests into this platform — roughly three of every ten PRs in its entire history — spanning backend, frontend, infrastructure, and the warehouse floor.

Custom Roslyn analyzers

The analyzer suite on this codebase is mine end-to-end — I proposed the rules, wrote the analyzers, and own their enforcement in CI. Discussion with the team happens, but the analyzers ship because I drive them.

The reason the suite exists at all is more interesting than the analyzers themselves. The same class of issues kept arriving in PRs from rotating contributors, and busy reviewers — myself included — were approving them. Pointing the same things out in review comments didn’t stick. Once a pattern is wrong twice in production code, it earns an analyzer. Once it has an analyzer, it can’t merge. The next teammate who tries it learns from the build, not from a PR comment three days later.

That’s the senior judgment behind the work: it’s not about loving compiler tooling, it’s about scaling code quality across the team without scaling me. Examples from the set:

  • SQL injection detection — flags string-interpolated or hardcoded-value SQL, with smart false-positive suppression for legitimate cases (LIMIT, OFFSET, schema names). Precise diagnostic locations so IDE squiggles point at the exact offending expression.
  • IEnumerable storage analysis — catches classes that store an IEnumerable<T> in a field, which almost always means a hidden re-enumeration bug.
  • Mutable property detection on value objects — flags setters that shouldn’t exist on types we’ve decided should be immutable.
  • Interface return-type optimization — warns when a method’s return type could be a more specific interface without breaking callers.
  • Primary constructor capture analysis — primary constructors on classes (C# 12 and later) capture parameters implicitly; the analyzer catches cases where that capture is unintentional and memory-expensive.

Each analyzer ships as part of the build, so violations fail CI rather than drifting in through review — which is the only way rules of this kind actually stick on a multi-team codebase.
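To make the IEnumerable storage rule concrete, here is a minimal TypeScript analogue of the bug it targets. The analyzers themselves are C# Roslyn analyzers and are not shown; every name below (lazyPrices, PriceReport, SafePriceReport) is hypothetical, and the sketch assumes a lazy sequence that redoes its work on each iteration, which is how a stored deferred sequence typically behaves.

```typescript
// Counts how many times the "expensive" source is evaluated.
let sourceEvaluations = 0;

// A lazy, re-iterable sequence: every iteration re-runs the generator body,
// much like enumerating a deferred query a second time.
function lazyPrices(): Iterable<number> {
  return {
    *[Symbol.iterator]() {
      sourceEvaluations++; // stands in for a database query or file read
      yield 10;
      yield 20;
      yield 30;
    },
  };
}

// Problematic shape: the lazy sequence is stored in a field and iterated
// by more than one method, so the source runs once per call.
class PriceReport {
  private readonly prices: Iterable<number>;
  constructor(prices: Iterable<number>) {
    this.prices = prices;
  }
  total(): number {
    let sum = 0;
    for (const p of this.prices) sum += p;
    return sum;
  }
  max(): number {
    let best = -Infinity;
    for (const p of this.prices) best = Math.max(best, p);
    return best;
  }
}

// Safer shape: materialize once at the boundary and store the concrete array.
class SafePriceReport {
  private readonly prices: readonly number[];
  constructor(prices: Iterable<number>) {
    const copy: number[] = [];
    for (const p of prices) copy.push(p);
    this.prices = copy;
  }
  total(): number {
    return this.prices.reduce((a, b) => a + b, 0);
  }
  max(): number {
    return this.prices.reduce((a, b) => Math.max(a, b), -Infinity);
  }
}
```

Calling total() and max() on PriceReport evaluates the source twice; the same two calls on SafePriceReport evaluate it once, at construction. A rule that flags the stored lazy field pushes code toward the second shape.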

Incremental source generator for frozen collections

FrozenSet and FrozenDictionary are the right choice for read-only lookup tables, but writing them by hand is repetitive and invites drift. I wrote an incremental source generator that takes a plain static class with annotated fields and emits the frozen-collection forms, plus filter methods for subsets. An IStaticEntity interface lets the generated code use optimized comparers. Source-generated code has no runtime reflection cost and shows up in IDE “go to definition.” Same story as the analyzers — my design, my code, shipped into the team’s workflow.

DDD refactor of the product domain

Most of the modernization work on the parts-and-vehicles domain is mine, with the team along for the design discussions. The Part aggregate that came out of that refactor carries its own state machine (Draft → Incomplete → Active → Sold), raises domain events on meaningful transitions (PartCreatedEvent, PartPriceChangedEvent, PartLocationChangedEvent, PartStatusChangedEvent), validates itself through a factory method, and uses optimistic concurrency via OriginalLastUpdateDate. Caching lives in a repository decorator layer (CachedPartRepository) so the aggregate doesn’t know it’s cached.
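The shape of such an aggregate can be sketched in TypeScript. This is an illustration, not the production C# code: the transition table is an assumption (only the forward path Draft → Incomplete → Active → Sold is named above), the event payloads are invented, and hasConflict is a hypothetical stand-in for the concurrency check a repository would run on save.

```typescript
type PartStatus = "Draft" | "Incomplete" | "Active" | "Sold";

type DomainEvent =
  | { type: "PartCreatedEvent"; partId: string }
  | { type: "PartPriceChangedEvent"; partId: string; oldPrice: number; newPrice: number }
  | { type: "PartStatusChangedEvent"; partId: string; from: PartStatus; to: PartStatus };

// Assumed transition table: forward-only, one step at a time.
const allowedTransitions: Record<PartStatus, readonly PartStatus[]> = {
  Draft: ["Incomplete"],
  Incomplete: ["Active"],
  Active: ["Sold"],
  Sold: [],
};

class Part {
  readonly id: string;
  // Timestamp loaded with the aggregate; compared against storage on save.
  readonly originalLastUpdateDate: Date;
  private price: number;
  private status: PartStatus = "Draft";
  private readonly events: DomainEvent[] = [];

  private constructor(id: string, price: number, loadedAt: Date) {
    this.id = id;
    this.price = price;
    this.originalLastUpdateDate = loadedAt;
  }

  // Factory method: the only way to create a Part, so validation cannot be skipped.
  static create(id: string, price: number, now: Date): Part {
    if (id.trim() === "") throw new Error("Part id is required");
    if (!(price >= 0)) throw new Error("Price must be non-negative");
    const part = new Part(id, price, now);
    part.events.push({ type: "PartCreatedEvent", partId: id });
    return part;
  }

  get currentStatus(): PartStatus {
    return this.status;
  }

  changePrice(newPrice: number): void {
    if (!(newPrice >= 0)) throw new Error("Price must be non-negative");
    if (newPrice === this.price) return; // a no-op raises no event
    this.events.push({ type: "PartPriceChangedEvent", partId: this.id, oldPrice: this.price, newPrice });
    this.price = newPrice;
  }

  transitionTo(next: PartStatus): void {
    if (allowedTransitions[this.status].indexOf(next) === -1) {
      throw new Error(`Cannot move from ${this.status} to ${next}`);
    }
    this.events.push({ type: "PartStatusChangedEvent", partId: this.id, from: this.status, to: next });
    this.status = next;
  }

  // Hands the raised events to the caller and clears the internal buffer.
  pullEvents(): DomainEvent[] {
    return this.events.splice(0);
  }
}

// Optimistic concurrency: a save is rejected if the stored row changed since load.
function hasConflict(part: Part, storedLastUpdate: Date): boolean {
  return part.originalLastUpdateDate.getTime() !== storedLastUpdate.getTime();
}
```

Keeping the transition table and event raising inside the aggregate means callers cannot put a part into a state the domain does not allow, and a caching decorator can wrap the repository without the aggregate changing.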

The result pattern came along for the ride — a Result and Result<T> type with explicit failure variants (FailedResult, EmptyResult for accumulated errors, DeferredResult for lazy binding) and implicit conversions so callers stay readable.
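For readers unfamiliar with the pattern, a minimal TypeScript sketch of a result type with error accumulation follows. It is not the platform's Result<T>: TypeScript has no implicit conversions, the variants here are reduced to two, and the helper names (ok, fail, combine, parsePrice) are hypothetical.

```typescript
// A result is either a value or a list of errors; the "kind" tag lets the
// compiler force callers to check which one they have.
type Result<T> =
  | { readonly kind: "ok"; readonly value: T }
  | { readonly kind: "failed"; readonly errors: readonly string[] };

function ok<T>(value: T): Result<T> {
  return { kind: "ok", value };
}

function fail<T>(errors: readonly string[]): Result<T> {
  return { kind: "failed", errors };
}

// Accumulates every failure instead of stopping at the first one.
function combine<T>(results: readonly Result<T>[]): Result<T[]> {
  const values: T[] = [];
  const errors: string[] = [];
  for (const r of results) {
    if (r.kind === "ok") values.push(r.value);
    else errors.push(...r.errors);
  }
  return errors.length > 0 ? fail<T[]>(errors) : ok(values);
}

// Example producer: validation returns a Result instead of throwing.
function parsePrice(raw: string): Result<number> {
  const n = Number(raw);
  if (raw.trim() === "" || !Number.isFinite(n) || n < 0) {
    return fail<number>([`Invalid price: "${raw}"`]);
  }
  return ok(n);
}
```

With this shape, combine([parsePrice("10"), parsePrice("abc"), parsePrice("-1")]) is a failed result carrying both error messages, so a form or import job can report every problem in one pass.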

Backend rewrites you notice from the outside

The data-access layer for most of the platform’s services got a top-to-bottom rewrite — moving the hot paths to Dapper, cleaning up a long tail of N+1 queries and ad-hoc SQL, consolidating connection management. Same pass straightened out a lot of inconsistent error handling.

The invoice system was rewritten alongside it to gain a full edit-history audit trail — every change to an invoice is reversible and attributable, which the previous schema couldn’t represent. That kind of “look it up, see who did it, undo it cleanly” capability is the difference between an invoicing tool that finance trusts and one they fight with.

The rewrite kept growing into the workflows around it: an unlock-request system (finalized invoices can be reopened through an approval flow with safeguards, status streamed live over SignalR), partial cancellation of individual invoice items, PDF archival and retrieval, file attachments, and MySQL advisory locks plus consumed-proforma locking to keep concurrent edits honest. Payment-provider integration (OPay) landed in the same area.

Document generation

Most of the PDF artifacts the platform produces — invoices, shipping labels, picking lists, financial reports, QR stickers — go through paths I built or rewrote. iText for the heavy stuff, server-side JS via Jering.Javascript.NodeJS for the cases where the JS PDF ecosystem (jsPDF, autotable) is the right tool. The invoice rewrite mentioned above is part of this work.

Warehouse hardware integrations

The platform isn’t only a web app — it runs the physical warehouse. The picking → packing → postage workflow pages that warehouse staff live in all day are mine, iterated over years of merged PRs against real floor feedback: contextual scanning, status-driven tabs, combined order PDFs, auto-print on pack, even handling the scanner misreading when a Lithuanian keyboard layout is active. End-to-end software paths I built for the kit on the floor:

  • ParcelCube integration — automated parcel measurement + weighing tied directly into the carrier dispatch flow. Time from “package built” to “courier label printed” drops to seconds.
  • Scanner tool — handles our product QR codes plus any other QR the system can interpret. The same scan resolves contextually a dozen different ways: auto-insert product data, find an existing product, open a parcel’s info, look up an order, attach to the active workflow, and so on. One device, context-aware behavior.
  • Custom scale drivers — the vendor’s scale software was laggy and supported one protocol. I wrote our own across multiple scale models and protocols, with automatic detection of “product placed” vs “product removed” so the system reads weight without an explicit user action.
  • Printer pipeline — currently sits on a vendor printing API the manager wanted shipped fast. It has known issues and we plan to replace it with our own. Worth flagging: this is a known interim, not a finished product.
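The "product placed" vs "product removed" detection mentioned for the scale drivers can be sketched as a small state machine over raw weight readings. This is an illustrative TypeScript sketch, not the actual driver: the thresholds (5 g empty limit, 2 g tolerance, 3 consecutive stable readings) and all names are assumptions.

```typescript
type ScaleEvent = { kind: "placed"; grams: number } | { kind: "removed" };

class PlacementDetector {
  private readonly minGrams: number; // below this the pan counts as empty (assumed)
  private readonly toleranceGrams: number; // readings this close count as unchanged (assumed)
  private readonly stableReadings: number; // consecutive similar readings required (assumed)
  private loaded = false;
  private stableCount = 0;
  private lastReading = 0;

  constructor(minGrams = 5, toleranceGrams = 2, stableReadings = 3) {
    this.minGrams = minGrams;
    this.toleranceGrams = toleranceGrams;
    this.stableReadings = stableReadings;
  }

  // Feed one raw reading; returns an event only when the pan state changes.
  push(grams: number): ScaleEvent | null {
    const similar = Math.abs(grams - this.lastReading) <= this.toleranceGrams;
    this.stableCount = similar ? this.stableCount + 1 : 1;
    this.lastReading = grams;
    if (this.stableCount < this.stableReadings) return null; // weight is still settling

    if (!this.loaded && grams >= this.minGrams) {
      this.loaded = true;
      return { kind: "placed", grams };
    }
    if (this.loaded && grams < this.minGrams) {
      this.loaded = false;
      return { kind: "removed" };
    }
    return null; // stable, but no change of state
  }
}
```

Feeding the readings 0, 0, 0, 180, 251, 250, 250, 250, 3, 0, 0, 0 yields one placed event at 250 g followed by one removed event; the transient 180 and 3 readings are ignored because they never stabilize. This simplified version does not re-announce a weight change while the pan stays loaded.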

Working around PrestaShop

The platform integrates with a PrestaShop storefront. The choice predates me by five years and was wrong for the scale — the team should have picked Magento. We’re stuck with it, so most of my PrestaShop work is making it tolerable.

Catalog search: ~60s → ~1s at 200k products

PrestaShop’s own catalog had degraded to roughly a minute per page load at full product count. Custom modules, faceted-search caching, MySQL triggers maintaining hot-path data, and direct intervention into the core search logic where modules weren’t enough — all together brought the same catalog page to around one second. Same hardware, same product count.

Direct-DB integration library

PrestaShop’s official PHP API is slow and buggy — particularly its payment-confirmation logic, which we’ve seen drop legitimately paid orders into “unpaid” states and silently mark unpaid orders as paid. We don’t trust it.

I wrote a parallel library that talks directly to PrestaShop’s database for fast product CRUD, reads order state from the same place, and verifies payment by calling the payment provider’s API directly instead of trusting PrestaShop’s flag. The slower-but-correct paths still go through the official API where it’s reliable; the fast-and-trust-sensitive paths don’t.

Marketplace pricing intelligence

Built the data-collection workflows we use for same-product search across marketplaces, answering “is the part we’re about to list already on sale somewhere, and at what price?” before we commit. Public listings only, used as input to listing decisions. The technically interesting bit is robustness against the small site-by-site differences in markup that break naive scrapers: selectors, retries, and parsing all live behind a per-marketplace adapter.
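A per-marketplace adapter can be sketched roughly as follows in TypeScript. The marketplace names, the markup, and the regular expressions are all invented for illustration; fetching and retries are left out so the sketch stays synchronous, and a real implementation would more likely use a proper HTML parser than regular expressions.

```typescript
interface Listing {
  title: string;
  priceEur: number;
}

// Everything site-specific sits behind this interface.
interface MarketplaceAdapter {
  readonly name: string;
  parseListings(html: string): Listing[];
}

// Shared helper: validates the extracted fields once, for every adapter.
function toListing(title: string, priceText: string): Listing | null {
  const priceEur = Number(priceText);
  if (title.trim() === "" || priceText === "" || !Number.isFinite(priceEur)) return null;
  return { title: title.trim(), priceEur };
}

// Shared helper: runs a pattern over the page and collects the valid listings.
function extract(html: string, pattern: RegExp, map: (m: RegExpExecArray) => Listing | null): Listing[] {
  const out: Listing[] = [];
  const re = new RegExp(pattern.source, "g");
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    const listing = map(m);
    if (listing !== null) out.push(listing);
  }
  return out;
}

// Hypothetical site A: title in <h3>, price with a decimal comma.
const alphaParts: MarketplaceAdapter = {
  name: "alphaparts",
  parseListings: (html) =>
    extract(html, /<h3>([^<]+)<\/h3>\s*<span class="price">([\d,]+)\s*€<\/span>/, (m) =>
      toListing(m[1] ?? "", (m[2] ?? "").replace(",", ".")),
    ),
};

// Hypothetical site B: data attributes, price with a decimal point.
const betaMarket: MarketplaceAdapter = {
  name: "betamarket",
  parseListings: (html) =>
    extract(html, /data-title="([^"]+)"\s+data-price="([\d.]+)"/, (m) => toListing(m[1] ?? "", m[2] ?? "")),
};

const adapters = new Map<string, MarketplaceAdapter>([
  [alphaParts.name, alphaParts],
  [betaMarket.name, betaMarket],
]);

// Callers pick an adapter by name and never see site-specific markup.
function parseFor(marketplace: string, html: string): Listing[] {
  const adapter = adapters.get(marketplace);
  if (adapter === undefined) throw new Error(`No adapter for ${marketplace}`);
  return adapter.parseListings(html);
}
```

When a site changes its markup, only that site's adapter needs to change; the comparison logic that consumes Listing values stays untouched.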

Gemini vision for part recognition

An internal workflow required humans to classify photographed parts by category. I built a Gemini vision pipeline with two recognition passes followed by a query-generation stage: the first pass is open recognition, the second is context-aware against the known part list, and the final stage generates the search queries for listing. The gain wasn’t “replaces the human”; it was “cuts the time spent per part by an order of magnitude, and the human becomes the reviewer, not the cataloguer.”

Background-job infrastructure that survives bad days

The platform leans hard on long-running jobs — marketplace syncs, bulk updates, Elasticsearch indexing. The resilience layer around them is largely my work: task leasing so workers can die without orphaning work, deadlock retry policies and batched DB operations on the hot paths, staggered scheduling so bulk jobs don’t stampede the database, and a task-monitoring view with history charts so “is the sync stuck?” is a glance instead of a database query. Structured logging migrated to Seq, with error-body size limits so one bad payload can’t flood the log store.
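Task leasing is the least obvious item in that list, so here is a minimal in-memory TypeScript sketch of the idea. It is an assumption-laden illustration, not the platform's implementation: a real version would keep leases in the database and claim them with an atomic conditional update, and the LeaseTable name, its methods, and the explicit "now" parameter (used to keep the example deterministic) are all hypothetical.

```typescript
interface Lease {
  workerId: string;
  expiresAt: number; // milliseconds on whatever clock the caller passes in
}

class LeaseTable {
  private readonly leases = new Map<string, Lease>();
  private readonly leaseMs: number;

  constructor(leaseMs: number) {
    this.leaseMs = leaseMs;
  }

  // Claim a task. Succeeds if it is unclaimed, already ours, or the holder's
  // lease has expired (that worker is presumed dead).
  tryAcquire(taskId: string, workerId: string, now: number): boolean {
    const current = this.leases.get(taskId);
    if (current !== undefined && current.workerId !== workerId && current.expiresAt > now) {
      return false;
    }
    this.leases.set(taskId, { workerId, expiresAt: now + this.leaseMs });
    return true;
  }

  // Heartbeat: extends the lease only while the caller still holds it.
  renew(taskId: string, workerId: string, now: number): boolean {
    const current = this.leases.get(taskId);
    if (current === undefined || current.workerId !== workerId || current.expiresAt <= now) {
      return false;
    }
    current.expiresAt = now + this.leaseMs;
    return true;
  }

  // Called on normal completion so the task does not wait for expiry.
  release(taskId: string, workerId: string): void {
    const current = this.leases.get(taskId);
    if (current !== undefined && current.workerId === workerId) this.leases.delete(taskId);
  }
}
```

A worker that crashes simply stops renewing; once its lease expires, another worker's tryAcquire succeeds and the task is picked up again. A failed renew tells a slow worker it has lost the task and should stop writing results.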

SignalR real-time architecture

Six hubs handle different concerns — chat, notifications, shop-sync status, invoice updates, bulk-update progress, Gemini AI feedback. Each hub routes by user, not by broadcast, and long-running operations expose progress callbacks so the UI can show real progress bars instead of spinners.
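The combination of per-user routing and progress callbacks can be illustrated without any SignalR API. The TypeScript sketch below is a plain in-process stand-in under stated assumptions: UserChannels plays the role of a hub that sends to one user, runBulkUpdate is a hypothetical long-running operation, and none of these names come from the platform.

```typescript
type Progress = { done: number; total: number };
type ProgressListener = (progress: Progress) => void;

// Stand-in for a hub: messages are addressed to one user, never broadcast.
class UserChannels {
  private readonly listeners = new Map<string, ProgressListener[]>();

  subscribe(userId: string, listener: ProgressListener): void {
    const list = this.listeners.get(userId) ?? [];
    list.push(listener);
    this.listeners.set(userId, list);
  }

  sendToUser(userId: string, progress: Progress): void {
    for (const listener of this.listeners.get(userId) ?? []) listener(progress);
  }
}

// The operation knows nothing about transport; it only reports progress
// through the callback it was given.
function runBulkUpdate<T>(items: readonly T[], apply: (item: T) => void, onProgress: ProgressListener): void {
  let done = 0;
  for (const item of items) {
    apply(item);
    done++;
    onProgress({ done, total: items.length });
  }
}
```

Wiring the two together is a one-liner: runBulkUpdate(items, apply, (p) => channels.sendToUser(userId, p)). Only the user who started the operation receives the updates, and the UI can render done / total as a real progress bar.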

Auth migration to Keycloak

Helped move the platform’s authorization off ASP.NET’s built-in auth and onto Keycloak — a dedicated identity provider that handles token issuance, refresh, expiration, and the surrounding security primitives properly. Less homegrown plumbing, more delegation to a tool whose maintainers’ job is keeping it secure.

Infrastructure I help run

I co-manage the platform’s running infrastructure: Docker containers and orchestration, DigitalOcean droplets, Ubuntu hosts, nginx as the front-door, Cloudflare for firewalls and edge protection, VPN and SSH tunnel access for the team. The reliability work that comes with that includes regular security and load testing of our own surface — making sure our own ingress, our own services, and our own auth flows hold up before a real adversary or a real traffic spike asks the question.

I also drove the solution-wide migration to .NET 10, taking the Docker images and CI workflows through optimization in the same pass. Current work in flight is an AWS cost-engineering push: rebuilding the image-handling pipeline on Lambda with proper caching, S3 cost safeguards with a cost widget surfaced right in the admin UI, and automated database backups to S3 with in-app freshness monitoring — so “are the backups actually running?” is a dashboard, not a hope.

What stays boring on purpose

The most useful judgment call in this kind of work is what to leave alone. Anything that ships correctly and isn’t actively holding the team back gets left. Modernization budget goes where the next three months of work will benefit from it — not everywhere. Same thinking applies to the analyzer rules: I only add one when the cost of the violation has shown up more than once in real PRs.