Deep Engineering #36: Archit Agarwal on System Design Trade-offs

From monolith-to-services signals to “performance per dollar” and practical resilience under real attacks—clear choices you can defend in production and interviews.

Feb 26, 2026

✍️From the editor’s desk,

Dear reader,
It has been a privilege to serve you as we built Deep Engineering over the past year. This is my final issue, and next month the newsletter will continue under new editorial leadership. I’m glad to introduce Saqib Jan as the new Editor-in-Chief of Deep Engineering.
Deep Engineering remains committed to serving the software community with timely, expert-led insights. Thank you for reading and for being part of this journey with me.

Welcome to the 36th issue of Deep Engineering!

Cloudflare’s Q4 2025 DDoS Threat Report described a record 31.4 Tbps attack and hyper-volumetric HTTP floods exceeding 200 million requests per second—a useful reminder that production systems run in a hostile environment. Today’s issue is grounded in my conversation with Archit Agarwal, Principal Member of Technical Staff at Oracle working on ultra-low-latency authorization services in Go, and the author of The Weekly Golang Journal. As Agarwal says, we must assume the system will fail and the system will be attacked, then work backward into controls that keep the blast radius small.

In this issue, we explore what makes a system design hold up in production once cost, failure, and security constraints become non-negotiable. Agarwal’s guidance is consistent across topics: delay microservices until “splitting signals” are visible, treat performance per dollar as a design requirement, cap autoscaling to prevent runaway spend during attacks, and keep dependency risk contained through abstraction, version discipline, and unified observability. Agarwal also offers battle-tested tips on cracking the system design interview.

You can watch the complete interview and read the transcript here, or read on for distilled insights.

System Design That Holds Up in Production with Archit Agarwal

A system design diagram is only useful if it reflects the constraints the system will live under. Archit Agarwal’s answer to most design questions is the same: make the trade-offs explicit, and choose them deliberately.

When microservices are worth it

Starting with a monolith is now widely accepted practice when requirements are still changing and the domain is still unfamiliar. Agarwal recommends keeping module boundaries clean so that separation is possible later, but he delays distribution until there is clear evidence it will help.

He looks for “splitting signals”:

Deployment friction shows up: Releases get larger, slower, and harder to roll back.
The blast radius grows: A defect in one area increasingly breaks unrelated user paths.
Independent scaling becomes necessary: One workload needs more capacity than others during predictable peaks.
Technology needs diverge: A subsystem truly benefits from different runtimes, storage, or processing tools.

Microservices add team autonomy, but they also add coordination overhead and network latency. According to Agarwal, splitting makes sense when the monolith’s pain is observable, repeatable, and clearly outweighs the operational tax of moving to services.

Performance per dollar as a design requirement

Cost is part of the architecture. Agarwal evaluates choices as “performance per dollar,” so teams weigh latency gains against the full cost of running and operating the system.

A cost-aware loop can:

Make spend visible through dashboards, so teams connect architecture to real numbers.
Put cost into design reviews alongside latency, availability, and security trade-offs.
Build for horizontal scaling, then scale only when demand and reliability targets prove the need.
Manage data by lifecycle. Keep frequently accessed data in hot tiers and move cold data to cheaper storage.
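
One way to bring "performance per dollar" into a design review is to reduce each option to a single comparable number. The sketch below is illustrative only: the `Option` fields, the latency target, and every figure in `main` are hypothetical, and the formula (sustained throughput divided by monthly cost, zeroed when the latency target is missed) is one reasonable reading of the idea rather than Agarwal's own definition.

```go
package main

import "fmt"

// Option describes one candidate architecture in a design review.
// All fields and figures are hypothetical, for illustration only.
type Option struct {
	Name         string
	MonthlyUSD   float64 // full cost: infrastructure plus operations
	SustainedRPS float64 // throughput the option can sustain
	P99Millis    float64 // measured 99th-percentile latency
}

// PerfPerDollar returns sustained requests per second per dollar of monthly
// spend. It returns 0 when the option misses the latency target or has no
// usable cost figure, so a cheap option that is too slow never wins.
func PerfPerDollar(o Option, targetP99Millis float64) float64 {
	if o.MonthlyUSD <= 0 || o.P99Millis > targetP99Millis {
		return 0
	}
	return o.SustainedRPS / o.MonthlyUSD
}

func main() {
	options := []Option{
		{Name: "fewer large instances", MonthlyUSD: 4000, SustainedRPS: 8000, P99Millis: 40},
		{Name: "more small instances", MonthlyUSD: 2500, SustainedRPS: 7500, P99Millis: 45},
		{Name: "single shared instance", MonthlyUSD: 900, SustainedRPS: 3000, P99Millis: 120},
	}
	for _, o := range options {
		fmt.Printf("%s: %.2f RPS per dollar\n", o.Name, PerfPerDollar(o, 50))
	}
}
```

The value of a number like this lies less in its precision than in forcing cost and latency into the same conversation.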

Design for a hostile environment

Agarwal designs around two expectations: systems fail, and systems get attacked. Planning for both keeps failure domains small and recovery predictable.

Recent threat telemetry supports the posture. Cloudflare’s Q4 2025 DDoS report describes a record 31.4 Tbps attack and a subsequent campaign with hyper-volumetric HTTP floods exceeding 200 million requests per second. The same report says DDoS attacks surged 121% in 2025, averaging 5,376 mitigations per hour on Cloudflare’s network.

Agarwal recommends a practical resilience toolkit that ties these risks to concrete controls:

Add layered defenses at the edge before the application spends CPU.
Rate-limit meaningfully by user, token, IP, and geography, and tune limits using real traffic.
Cap autoscaling so an attack cannot trigger runaway capacity and spend.
Separate failure domains with multi-availability-zone deployments, and reserve multi-region for the paths that must stay up.
Keep observability complete, because incident response depends on fast visibility.

Supply chain exposure belongs in the same threat model. Sonatype reported 394,877 new malicious open-source packages in Q4 2025 alone, a 476% jump compared to the prior three quarters combined, with 99.8% of the malware it saw in that quarter originating from npm. ReversingLabs’ 2026 software supply chain report also points to npm as the primary hotspot, reporting 10,819 malicious packages detected there in 2025 (nearly 90% of its detections across OSS platforms). The same report describes Shai-Hulud, a first-of-its-kind self-replicating, registry-native worm that compromised close to 1,000 npm packages across two campaigns.

Design implications follow naturally: pin and review dependency updates, validate new versions before promotion, avoid automatic upgrades on production-critical paths, and enforce least privilege and deny-by-default authorization so that a compromised component has limited reach.
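
Deny-by-default is easiest to see in code: access is granted only when an explicit rule exists, and everything else falls through to "no". The sketch below assumes a hypothetical in-memory `Policy` type and made-up principal and resource names; it illustrates the principle and is not a description of any particular authorization service.

```go
package main

import "fmt"

// Permission pairs an action with a resource. The names used in main are
// hypothetical examples.
type Permission struct {
	Action   string
	Resource string
}

// Policy maps a principal (a service or user) to the permissions it has
// been explicitly granted.
type Policy map[string]map[Permission]bool

// Allowed returns true only when the exact permission was granted.
// Unknown principals, actions, and resources all fall through to false,
// because indexing a missing map entry yields the zero value.
func (p Policy) Allowed(principal string, perm Permission) bool {
	return p[principal][perm]
}

func main() {
	policy := Policy{
		"report-service": {
			{Action: "read", Resource: "invoices"}: true,
		},
	}

	fmt.Println(policy.Allowed("report-service", Permission{"read", "invoices"}))  // granted explicitly
	fmt.Println(policy.Allowed("report-service", Permission{"write", "invoices"})) // never granted
	fmt.Println(policy.Allowed("unknown-dependency", Permission{"read", "invoices"}))
}
```

With this shape, a compromised dependency running as `report-service` can read invoices and nothing else, which is the "limited reach" the paragraph above describes.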

Depend on services without letting dependencies own you

Modern systems rely on external services and SaaS APIs. Agarwal recommends keeping those dependencies from becoming single points of failure.

His approach is pragmatic:

Put an abstraction layer in front of major external services, so the application talks to an interface rather than a vendor-specific client library.
Prefer portable standards where they fit, such as containers and unified telemetry.
Replicate only what is mission-critical across regions or providers, and keep observability in one place.

Ultra-low latency without sacrificing the whole codebase

Agarwal recommends treating latency like a budget: every hop and allocation spends from it. Ultra-low-latency work, in his view, is mostly subtraction.

Replicable strategies:

Put compute closer to users when geography is on the critical path.
Remove network hops from hot paths, and keep the performance-critical code lean.
Reuse connections and choose efficient service-to-service protocols when calls are unavoidable.
Match hardware and runtime to the goal, because software choices cannot fully offset weak infrastructure.

He also draws a boundary that protects maintainability: only a small part of most systems needs extreme optimization. Keep the rest readable and easy to change.

How to communicate system design under pressure

Agarwal’s interview advice doubles as a template for real design discussions: keep the design aligned with requirements, and keep your reasoning easy to follow.

His recommended structure for cracking the system design interview:

Align on functional requirements first.
Capture nonfunctional requirements next, because they drive architecture.
Present a high-level design, then zoom in one component at a time.
State trade-offs as you choose, including what you gain and what you give up.

When constraints change midstream, he recommends restating the change, identifying the affected parts, and updating only what needs updating. That approach shows control, adaptability, and a clear “commit history” of decisions.

🛠️Tool of the Week

Envoy Proxy: a high-performance L7 proxy that standardizes service-to-service and edge traffic management (routing, resilience, security, and telemetry) across distributed systems.

Highlights:

Advanced traffic routing & load balancing: dynamic routing rules, weighted traffic splitting, and multiple load-balancing policies.
Resilience controls built in: timeouts, retries, circuit breaking, and outlier detection to contain failures.
Security + observability hooks: mTLS/TLS termination, external authorization integration, and first-class metrics/logging/tracing support.

Learn more about Envoy Proxy

📎Tech Briefs

Cloudflare outage (BYOIP route withdrawal): Explains how a change in BYOIP prefix management unintentionally triggered BGP route withdrawals for some customers, causing timeouts and connectivity failures, and documents the mitigations and follow-up controls.
Post-quantum encryption across Cloudflare One (SASE): Cloudflare has rolled out standards-based post-quantum hybrid ML-KEM across Zero Trust and WAN on-ramps/off-ramps (including TLS/MASQUE/IPsec paths) without requiring specialized hardware changes.
Fastly private incident and maintenance notifications (SSO & Slack): Fastly has introduced private, security-sensitive incident and maintenance notifications with integrations like SSO and Slack to improve operational transparency for customers without exposing details publicly.
apt.postgresql.org operational updates: The PostgreSQL project’s apt repository announces automatic apt-retrievable changelogs, colocated build logs, and repository changes as Ubuntu 26.04 work begins and Ubuntu 25.04 reaches EOL.
PostgreSQL Anonymizer 3.0: Dalibo’s PostgreSQL Anonymizer 3.0 adds parallel static masking via background workers and JSON import/export for masking rules, alongside security fixes and breaking-change notes.

That is all from me today. Thank you for reading this issue of Deep Engineering.

We will be back next week with more expert-led content.

Signing off one last time,
Divya Anne Selvaraj
Editor-in-Chief, Deep Engineering

(Don’t forget to stay awesome!)
