Deep Engineering #50: Brian Allbee on Building Better Python Software
On testing discipline, technical debt, concurrency, cloud readiness, and the shift from programming to software engineering
Claude Code for Software Engineering
Join this interactive workshop to learn how to turn Claude Code from a session-by-session assistant into a repeatable engineering system, using structured context, reusable skills, scoped rules, hooks, and guardrails that work across real codebases and team workflows.
🗓️ Friday, June 20 · 10:30 AM EDT onwards
Use code DEEPENG50 for 50% off.
✍️ From the editor’s desk,
Welcome to the 50th issue of Deep Engineering!
Anthropic expanded Project Glasswing on June 2, extending Claude Mythos Preview to approximately 150 new organizations for codebase vulnerability scanning, after initial partners found more than 10,000 high-severity security flaws in production code.
AI can now find vulnerabilities in production systems that were presumably tested, reviewed, and shipped by engineering teams. The gap between code that compiles and code that holds up under real-world pressure is no longer theoretical.
That gap also has a cause. Brian Allbee, Staff Software Engineer at Cleerly and author of Hands-On Software Engineering with Python (Packt), argues that programming focuses on the correctness of the code itself, while software engineering expands that focus to sustainability as change occurs. Too many developers optimize for code that works today, without enough attention to whether that code can be changed, tested, maintained, and handed off tomorrow.
Allbee joined Deep Engineering Live to discuss what closing that gap looks like in practice. Today’s expert insights are based on that conversation, and you can read or watch the full Q&A here.
Let’s get started.
Featured: All the dev content that matters, in one personalized feed
daily.dev is a professional network for developers, built around a personalized feed of the best content from across the dev ecosystem. Millions of developers use it to stay current with their stack, discover new tools and frameworks, and connect with a global community that shares what they’re learning.
Expert Insights
Building Better Python Software Is Not About Writing Better Code
by Saqib Jan with Brian Allbee
Most Python developers measure their work by whether the code runs, assuming the job is done once the function returns the right value, the tests pass, and the build is green. Brian Allbee, Staff Software Engineer at Cleerly and author of Hands-On Software Engineering with Python (Packt), thinks that measure is correct but incomplete, noting that the gap between correct code and sustainable software is where most Python developers stop growing without realizing it.
The distinction Allbee draws is precise. Programming, he explained in our live interview, is focused on “the correctness of the code itself,” whereas software engineering “starts expanding out into more of a focus on sustainability as change occurs.” That shift in focus sounds subtle, but it changes almost every decision an engineer makes, from how they structure a module and handle a growing codebase to how they talk about technical debt with the people who control the roadmap.
The discipline that architecture cannot replace
The instinct when a Python codebase starts to grow is to reach for architecture by breaking things into services, introducing abstractions, and redesigning the data model. Allbee’s experience points in a different direction. “I think most of the paths to success in that context, at least the ones that I can think of that I’ve seen, don’t really start with the architecture, but with discipline behind the process,” he shares.
The discipline he describes is specific and unglamorous. It starts with keeping things as simple as possible and wrapping repeated processes into functions or methods. Beyond that, he stresses “that teams should agree on documentation standards and stick to them until something unexpected comes up, and that developers must write code with testability in mind from the beginning, even when there is no immediate requirement for tests.” These practices do not require a new framework or a redesign. They rely on consistency, which is often much harder to achieve than architecture.
Discipline comes before architecture because architecture without discipline produces complexity without clarity. In our interview, Allbee shared a vivid example from his own experience: a system written in Python by an engineer who came from a C# background, in which every function and every class had its own isolated module. The functional layers of the system were seven or eight deep depending on the context, creating a project that was “ridiculously huge,” he recalls, and “way more complicated than it needed to be, and it was hard to manage... hard to maintain.”
The problem was not the language or incompetence, but a mental model built for a different environment being applied wholesale to Python. To explain why this matters, Allbee points to a concept from the book Code That Fits In Your Head: humans can only keep “five to seven bits of information in the front of their memory at a given point in time.” A system with seven layers of depth saturates that capacity before a developer has even started reasoning about what any individual layer does.
Allbee argues that developers must “keep it simple” and “collapse things down to the point where you don’t have to have 19 different classes and 15 different instances of, you know, all these other classes to deal with something that really should be capable of being managed as a single function.”
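As a hypothetical illustration of that collapse, the sketch below shows three layered classes and a single function that does the same job. All of the names are invented for this example and are not taken from the system Allbee describes.

```python
# Hypothetical over-layered design: three classes to compute one total.
class PriceProvider:
    def __init__(self, prices):
        self._prices = prices

    def get(self, sku):
        return self._prices[sku]


class LineItemCalculator:
    def __init__(self, provider):
        self._provider = provider

    def calculate(self, sku, quantity):
        return self._provider.get(sku) * quantity


class OrderTotalService:
    def __init__(self, calculator):
        self._calculator = calculator

    def total(self, order):
        return sum(
            self._calculator.calculate(sku, qty) for sku, qty in order.items()
        )


# The same behavior collapsed into one function.
def order_total(prices, order):
    """Return the total cost of an order, given a mapping of SKU to unit price."""
    return sum(prices[sku] * qty for sku, qty in order.items())
```

A reader of the layered version has to hold three objects and their wiring in mind to follow one multiplication and one sum. The function version exposes the same inputs and output with nothing else to track.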
Technical debt is a product decision, not a technical one
One of the more practically useful things Allbee explains about managing an evolving Python system is that technical debt is not primarily a technical problem. “Technical debt is one of those product-level priorities,” he reasons, adding that “whoever’s making the prioritization decisions is going to be in control of when those get tackled, if they get tackled.”
That framing shifts where an engineer should focus their energy when technical debt is accumulating. The work is not just to identify the debt but to communicate its consequences clearly enough that the people controlling the roadmap can make an informed decision. “Making sure that you can communicate effectively, here’s what the impact of this technical debt is to your product-level people or whoever’s making those decisions, is gonna be a key thing,” he adds. That requires being able to sit down and say clearly that if the team does not deal with a bug, it is going to lead to cascading issues, and that the longer they put it off, the more likely it is to become a really significant problem that will take even longer to get past.
The teams that handle technical debt well, in Allbee’s experience, are the ones that treat it as a first-class concern rather than an emergency. The difference between those two approaches is almost entirely about communication: debt that is communicated early and framed in terms of product risk gets prioritized, while debt that surfaces as a crisis gets managed badly.
Testing is a design decision
The most common framing of testing in Python projects is that it is something you add to code that already exists. Allbee’s position is that testability is a property of the code itself, and that designing for it from the beginning changes the shape of the code in ways that make it easier to understand, change, and hand off to other engineers.
His testing approach for his own projects is a method that “exercises valid and invalid inputs for all of the parameters of every callable in the project.” He shares, “You combine that with judicious monitoring of missing lines and a code coverage report, that has served really well for me in making sure that the targets of those tests are being both thoroughly and realistically exercised.” The more important principle underneath the practice is that tests are most valuable when they reflect how the system is actually used, not just how the code is structured.
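A minimal sketch of that idea, using the standard library's unittest module, might look like the following. The function under test, scale_reading, is a hypothetical example invented here, and the grouping of one valid and one invalid test per parameter is an assumption about how such a suite could be organized, not a description of Allbee's own tooling.

```python
import unittest


def scale_reading(value, factor=1.0):
    """Hypothetical callable under test: scale a numeric reading by a positive factor."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("value must be a number")
    if isinstance(factor, bool) or not isinstance(factor, (int, float)):
        raise TypeError("factor must be a number")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return value * factor


class TestScaleReading(unittest.TestCase):
    # One valid-input and one invalid-input test per parameter.

    def test_value_valid(self):
        for value in (0, 1, -3, 2.5):
            with self.subTest(value=value):
                self.assertEqual(scale_reading(value, 2), value * 2)

    def test_value_invalid(self):
        for value in (None, "3", [1], True):
            with self.subTest(value=value):
                with self.assertRaises(TypeError):
                    scale_reading(value)

    def test_factor_valid(self):
        for factor in (1, 0.5, 10):
            with self.subTest(factor=factor):
                self.assertEqual(scale_reading(4, factor), 4 * factor)

    def test_factor_invalid_type(self):
        for factor in (None, "2", [2], False):
            with self.subTest(factor=factor):
                with self.assertRaises(TypeError):
                    scale_reading(4, factor)

    def test_factor_invalid_value(self):
        for factor in (0, -1, -0.5):
            with self.subTest(factor=factor):
                with self.assertRaises(ValueError):
                    scale_reading(4, factor)
```

Saved in a test module, this runs with python -m unittest. If coverage.py is the coverage tool in use (an assumption; the interview does not name one), coverage run -m unittest followed by coverage report -m lists the line numbers that no test reached.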
In team contexts, Allbee advocates for explicit agreement about how tests are organized and what tools are involved. “I’ve seen what happens when different engineers who aren’t communicating with each other each go their own way,” he points out, noting that “the tests that result, even if they’re rigorous and well thought out, are oftentimes difficult to follow across different test modules.” The investment in agreement upfront produces a test suite that the whole team can confidently read, maintain, and extend.
On AI-generated code in testing contexts, Allbee recommends defining a test suite that only humans are permitted to modify, making it as rigorous and complete as possible, and then allowing AI to generate implementation code that must pass that suite. He explains the boundary by stating that you can tell the AI to “write all the code you want,” but it “must pass this test suite” and it does “not get to modify that test suite.” That boundary, he reasons, provides about as much coverage as can realistically be achieved when AI is involved in production code.
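The interview does not say how that boundary should be enforced. One possible mechanism, sketched below purely as an assumption, is to fingerprint the human-approved test files and refuse to accept generated code if the fingerprint has changed. The function names and the idea of a stored approved fingerprint are hypothetical.

```python
import hashlib
from pathlib import Path


def suite_fingerprint(test_dir):
    """Return one SHA-256 digest covering every .py file under test_dir."""
    digest = hashlib.sha256()
    for path in sorted(Path(test_dir).rglob("*.py")):
        # Include the relative path so renames and moves also change the digest.
        digest.update(path.relative_to(test_dir).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()


def suite_is_unmodified(test_dir, approved_fingerprint):
    """True when the test files still match the fingerprint a human approved."""
    return suite_fingerprint(test_dir) == approved_fingerprint
```

In this sketch a human would record the fingerprint after reviewing the suite, and a CI step would call suite_is_unmodified before running the tests against generated code.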
Concurrency is a design problem first
Python’s performance limitations and its Global Interpreter Lock have been a recurring concern for engineers building high-throughput systems, and CPython’s free-threaded build has stirred interest in what Python might make possible beyond the GIL. Allbee is measured about expectations. He highlights that most Python code is IO-bound rather than CPU-bound, and it is CPU-bound work where the GIL has its most significant impact. He is nonetheless hopeful that the free-threaded model will open doors for more CPU-bound work to be written in Python.
The framing that matters most to many developers is not about the runtime at all. “Concurrency is a design problem before it’s a runtime problem,” he underscores, adding that having better concurrency support in the language really does not eliminate the need to understand how your processes are going to contend against each other, how to deal with data ownership at the scope of the code, or how failures can happen. His practical advice on concurrency reflects this directly by recommending that developers add it sparingly and only when there is an actual benefit that outweighs the overhead of handling errors, data contention, and coordination costs. “Optimize your clarity and correctness first,” he recommends, and “really only reach for concurrency when you understand where the time is actually being spent.”
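To make the "measure first" advice concrete, here is a small sketch that times a sequential version of some work before trying a threaded one. The fetch function is a hypothetical stand-in that sleeps instead of performing real IO, and the helper names are invented for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(item):
    """Hypothetical stand-in for an IO-bound call; sleeps instead of using a network."""
    time.sleep(0.05)
    return item * 2


def timed(func, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def run_sequential(items):
    return [fetch(item) for item in items]


def run_threaded(items, max_workers=4):
    # Executor.map preserves input order, so results line up with run_sequential.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, items))
```

Calling timed(run_sequential, items) and timed(run_threaded, items) on the same input shows whether waiting on fetch dominates the run time. If it does not, the threaded version adds coordination and error-handling cost for no gain, which is the trade-off Allbee describes.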
Cloud readiness is designing for volatility
The question of what makes a Python application cloud-ready is one Allbee addresses in terms of design principles rather than tools or platforms. The containerized application is cloud-ready, he acknowledges, but so are function-as-a-service constructs like AWS Lambda functions, proving that the specific mechanism matters less than the underlying design orientation.
“The key concept that ties almost every cloud-resident system together, containerization, stateless design, any of those, is that they are inherently disposable,” he explains. A container can be killed at any time, a Lambda invocation could be terminated before it reaches a successful completion, and Kubernetes pod restarts are routine events. Designing for that reality means building processes around the expectation that the hardware can disappear at any point in time.
Statelessness in that context is about making failure cheap. With no state to manage, there is no need to write code to reacquire it: a process simply ends and is restarted, so recovering from a failure is as simple as starting a new instance. “Statelessness and containerization matter more because they make failure cheap and recovery routine than for any other purpose or reason,” he says, arguing that this principle should sit near the top of the list of factors shaping design decisions for any system built to run in a cloud environment.
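A minimal sketch of a disposable, stateless handler is shown below. The event shape, the handle function, and the use of a plain dict as a stand-in for external durable storage are all assumptions made for illustration. The check for an already-stored result is an added design choice that makes a retry safe, not something stated in the interview.

```python
def handle(event, store):
    """Process one event using only the event and an external store.

    Nothing is kept in module-level or instance state, so a fresh process
    can rerun the same event after the previous one was killed.
    """
    key = event["id"]
    if key in store:
        # An earlier attempt already finished this event; reuse its result.
        return store[key]
    result = sum(event["values"])
    store[key] = result
    return result
```

Because the handler holds nothing between calls, recovering from a killed container or a terminated invocation amounts to starting a new process and passing it the same event again.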
What senior engineers actually do
What separates an engineer who is ready for senior work from one who is not comes back to the same systems-oriented thinking that distinguishes engineering from programming. The indicator is not technical mastery but curiosity about the system as a whole rather than just the isolated function.
“If they started demonstrating that they’re concerned with more than just is the code doing what it’s supposed to do,” he explains, “if there’s a certain amount of curiosity, why are we doing it this way, do they recognize the trade-offs, those are the things that I think start really indicating somebody is actually ready to go beyond just I’ve written this function, and it’s done, and it’s tested, and it works. Done. I’m finished.”
The senior engineers Allbee has tried to emulate and seen do their best work are not defined by the code they write but by the systems they shape and the teams they are enabling. That involves asking questions that guide less senior engineers to ask those same questions on their own, such as why the team is going down a certain road, what the benefits are, and what trade-offs exist. “There are always trade-offs,” he notes, emphasizing, “Always, always, always trade-offs.”
The advice he offers to Python developers trying to grow in an AI-accelerated world collapses to three core principles, which are to “think in systems,” to “design for change,” and to “optimize for your team.” If you come away thinking differently about why you write the code that you are writing and not just how, then that is the shift that matters. Since the language, tools, and expectations placed on Python engineers will inevitably keep growing, the engineers who hold up under those pressures are the ones who stopped measuring their work by whether the code runs and started asking what it will take for the system to survive.
Go deeper with Brian Allbee’s book
Brian Allbee explores these ideas in more depth in Hands-On Software Engineering with Python, a practical guide to building Python systems that are easier to test, maintain, evolve, and hand off.
In case you missed
🛠️ Tool of the Week
Ruff - fast Python linting and formatting for teams trying to keep quality gates enforceable without slowing every commit.
Highlights
Consolidates linting, import sorting, upgrade checks, and formatting behind one configuration surface.
Runs fast enough for pre-commit and CI workflows, which makes quality checks more likely to stay enabled.
Supports monorepos and hierarchical configuration, helping larger teams avoid one-off project rules.
Already used across major Python projects, making it a practical default rather than a niche experiment.
📎 Tech Briefs
Copilot SDK is now generally available - GitHub made Copilot SDK stable across six languages, letting teams embed agent workflows into internal tools.
Anthropic expands Project Glasswing - Project Glasswing now extends to approximately 150 new organizations, shifting AI vulnerability discovery toward coordinated patching capacity.
Python pip 26.1.2 - Pip 26.1.2 shipped with Trusted Publishing attestations, tightening provenance for standard Python installation workflows across teams.
Using uv in GitLab CI/CD - Astral added GitLab CI guidance for uv images and cache pruning, simplifying reproducible Python pipelines outside GitHub.
Pyright 1.1.410 - Pyright 1.1.410 refreshed the Python wrapper package, keeping CLI and editor type checks aligned automatically.
That’s all for today. Thank you for reading this issue of Deep Engineering.
We’ll be back next week with more expert-led content.
Keep building,
Saqib Jan
Editor-in-Chief, Deep Engineering
If your company is interested in reaching an audience of senior developers, software engineers, and technical decision-makers, you may want to advertise with us.






