ProgrammingPro #55: Protobuf Text Format, GitHub Security Upgrades, Django Trends Shift, and Microsoft's New TypeSpec Language
Bite-sized actionable content, practical tutorials, and resources for programmers
Welcome to this week’s edition of ProgrammingPro!
In today’s Expert Insight, we bring you an excerpt from the recently published book, Protocol Buffers Handbook, which discusses the benefits of using Protobuf's text format for data serialization to enhance readability, debugging, and configuration management without compromising efficiency.
News Highlights: GitHub updates to enhance supply chain security through Artifact Attestations; Django Developer Survey 2024 reveals shifting trends in framework preferences; Stack Overflow and OpenAI collaborate to integrate AI with developer tools; and Microsoft launches TypeSpec, a new language aimed at streamlining API development.
My top 5 picks from today’s learning resources:
Need for Speed - LLMs Beyond OpenAI with C#, .NET 8 SSE + Channels, Llama3, and Fireworks.ai⚡
Securing the Generative AI Frontier - Specialized Tools and Frameworks for AI Firewall🏰
But there’s more, so dive right in.
Stay Awesome!
Divya Anne Selvaraj
Editor-in-Chief
PS: Our monthly survey is still in progress. We invite you to take the opportunity to tell us what you think about ProgrammingPro so far, request a learning resource for future issues, tell us what you think about a recent development in the Programming world, and earn a Packt Credit each month to buy a book of your choice.
🗞️News and Analysis🔎
GitHub announces new updates to improve supply chain security: The updates include a beta of Artifact Attestations for GitHub Actions to verify software origins. Read to learn about GitHub's new security features.
Deno boosts language server performance: Deno 1.43 has significantly upgraded its language server to enhance JavaScript/TypeScript runtime performance. Read to learn about the latest improvements.
Highlights from the Django Developer Survey 2024: Results indicate that while 74% of Python developers still prefer Django, there's a notable trend of exploring alternative frameworks like Flask and FastAPI. Read to learn more about the current trends and preferences within the Django developer community.
Stack Overflow signs deal with OpenAI: This partnership allows OpenAI to train its models on Stack Overflow's public dataset through OverflowAPI. Read to learn more about this collaboration which aims to enhance features like GitHub Copilot.
Postman v11 enables better collaboration on APIs with an improved update feed, comment mode, and more: The new version also introduces Partner Workspaces for multi-partner collaborations, and a Package Library for code reuse. Read to learn how the new features streamline API development.
Fortran popularity rises with numerical and scientific computing: Fortran surged back into the top 10 of the Tiobe index in April 2024 after slipping down the rankings following 2002. Read to learn why the 67-year-old language still holds its own.
Rust adds diagnostic attributes for compiler messages: Rust 1.78 introduces the #[diagnostic] attribute namespace to improve the clarity and utility of diagnostics, even when not supported by all compilers. Read to learn more about the update.
Microsoft unveils TypeSpec language for API development: The language enables developers to define APIs in a high-level, reusable manner across various protocols and formats. Read to learn more about TypeSpec's capabilities.
Oracle preparing Code Assist: AI coding “fine-tuned” for Java, SQL and its own cloud: The AI coding assistant is aimed at enhancing developer productivity with features like code suggestions, documentation generation, and test creation. Read to learn more about Oracle's upcoming AI tool.
🎓Tutorials and Learning Resources💡
Python
How Python Asyncio Works - Recreating it from Scratch: This article delves into the inner workings of Python's asyncio by recreating its core components using generators. Read to learn how to implement and manipulate asynchronous tasks and event loops.
A 100x speedup with unsafe Python: This article discusses utilizing unconventional memory layouts with numpy and SDL, specifically in the context of image resizing. Read to learn advanced techniques in Python for optimizing performance.
For more Python resources go to PythonPro
C# and .NET
🎓Tutorial | Build an authentication handler for a minimal API in ASP.NET Core: This article guides you on implementing basic password authentication in minimal API using a custom handler. Read to learn how to build and integrate a basic authentication system into ASP.NET Core minimal APIs.
Need for Speed - LLMs Beyond OpenAI with C#, .NET 8 SSE + Channels, Llama3, and Fireworks.ai: This article discusses how integrating C#/.NET 8, System.Threading.Channels, and SSE can enhance responsiveness for gen AI applications. Read to learn about the technical capabilities of newer AI platforms like Fireworks.ai with Llama 3, and how to implement these with .NET technologies.
C# Discriminated Union (DU) - What’s Driving the C# Community’s Inquiries?: This article discusses the increasing interest within the C# community for native support of DUs, a feature well-implemented in F#. Read to learn about the functionality and advantages of DUs.
C and C++
Qt and C++ Trivial Relocation (Part 1): When QVector<T> reallocates, it can use memcpy for trivially relocatable types like int, enhancing performance by avoiding move construction and destruction. Read to learn how this optimizes memory operations for certain data types.
💼Case Study | How to rewrite a C++ codebase successfully: This article discusses the author's experience with rewriting a legacy codebase, emphasizing the challenges of maintaining the project. Read to learn about a strategic incremental approach to rewriting legacy codebases in a new language such as Rust to ensure project success.
Making a 3D Modeler, in C, in a Week: This article discusses the creation of "ShapeUp," a 3D modeler developed in C during a week-long programming event. Read to learn about the benefits and pitfalls of using C and raylib for rapid development of a 3D modeling tool.
Java
🎓Tutorial | Replace Calendar with LocalDate in Java programs: This article discusses the advantages of using Java's LocalDate class over the traditional Calendar class for handling dates, illustrating several practical applications. Read to enable more efficient date handling in your Java programs.
🎓Tutorial | How to use the Foreign Function API in Java 22 to Call C Libraries: This article teaches you how to use Java 22's Foreign Function & Memory (FFM) API to call C library functions like fopen, fgets, and fclose, offering a simpler alternative to JNI without manual C coding. Read to gain insights into memory management and the nuances of integrating C libraries within Java applications.
🎓Tutorial | Spring Microservice Application Resilience - The Role of @Transactional in Preventing Connection Leaks: By handling exceptions properly, @Transactional ensures database connections are released back to the pool, avoiding critical failures and service disruptions. Read to learn the importance of proper transaction management.
JavaScript and TypeScript
7 JavaScript language elements every developer needs: This article discusses array, for loop, forEach, map, reduce, substring, and switch. Read to enhance your skills in both writing and understanding JavaScript code.
At some point, JavaScript got good: JavaScript, once fraught with complexities and limitations, has undergone substantial improvements since ECMAScript 2015. Read to learn how JavaScript's evolution has enhanced its usability and efficiency.
💼Case Study | The evolution of Figma’s mobile engine - Compiling away our custom programming language: Recently Figma transitioned its mobile engine from Skew, a custom programming language, to TypeScript. Read for insights into strategic decisions behind programming language transitions in large-scale projects.
Go
Secure Randomness in Go 1.22: The version enhances the security of its randomness generation by integrating cryptographic random number sources into math/rand, bridging the gap with crypto/rand. Read to learn about the importance of using appropriate random number generators for different applications.
Go fixes its 7th code execution bug in the same feature: The handling of CFLAGS and LDFLAGS in Go's cgo feature had led to seven code execution vulnerabilities. Read to learn about the secure build-time flag handling solution.
Rust
What is in a Rust Allocator?: A custom Rust allocator, such as one registered using the #[global_allocator] attribute, involves specifying a static memory arena for allocations. Read to learn how this enhances efficiency.
👩🏻🏫Free Course | Comprehensive Rust 🦀: This free Rust course developed by the Android team at Google is designed for both beginners and experienced programmers. Read to gain the skills to apply Rust in various environments.
PHP
🎓Tutorial | Lets Build a Web Scraper in PHP and Python: This article introduces using cURL to scrape web content, which can handle various content types and protocols. Read to learn how cURL enhances the scraper's versatility for different web formats.
SQL
🎓Tutorial | SQL Server From Zero To Advanced Level - Leveraging nProbe Data: This tutorial covers installation, SQL commands, data manipulation, indexing, and performance tuning. Read for practical code examples.
Ruby
💼Case Study | Ruby typing 2024 - RBS, Steep, RBS Collections, subjective feelings: This article details the author's practical experiences integrating RBS into a project, highlighting the complexities, ergonomic challenges, and more. Read to learn about the current state of Ruby typing tools.
Swift
🎓Tutorial | Designing a Swift library with data-race safety: This article discusses the development of the automerge-repo-swift library, emphasizing compliance with Swift's strict concurrency guidelines. Read to learn how to handle mutable state safely.
Kotlin
UseCase Red Flags and Best Practices in Clean Architecture: This article discusses best practices and common pitfalls in implementing use cases within clean architecture, specifically using Kotlin, focusing on encapsulating business logic for reusable tasks, thread safety, and maintaining domain purity. Read for more.
🌟Best Practices, Advice, and Case Studies🚀
Failure Is Required - Understanding Fail-Safe and Fail-Fast Strategies: This article discusses the benefits of embracing failure in software systems through fail-fast and fail-safe strategies, and more. Read to learn the importance of integrating these strategies to balance error detection and system resilience.
Securing the Generative AI Frontier - Specialized Tools and Frameworks for AI Firewall: This article explores specialized tools and frameworks designed for prompt inspection and protection, known as AI firewalls, to secure Generative AI systems. Read to gain insights into safeguarding AI applications from emerging threats.
MIT Programming Languages Review 2024: The event, held on May 4th, showcased cutting-edge developments in programming languages, focusing on transformative, accessible research. Read to learn about the latest trends in programming languages and their impact.
Programming mantras are proverbs: The author of this article argues that software development mantras should be seen as proverbs, not laws, highlighting that every programming principle has its counterpoint, such as DRY vs. WET. Read to learn to view and apply common software development principles more flexibly.
Take the Survey, Get a Packt credit!
🧠 Expert Insight 📚
Here’s an excerpt from “Chapter 3: Describing Data with Protobuf Text Format” in the book, Protocol Buffers Handbook by Clément Jean.
Why use the text format?
(In Chapter 1)…, we said that the main reason for using Protobuf is that it reduces the payload by serializing to binary. But … we also said that the very binary that saves us a lot of bandwidth can cost us in terms of readability. This is because it would take way more human effort to read the binary than to read the text directly.
To solve this problem, Protobuf can also serialize data to text. It can serialize data to JSON, for example, but for this book, the most interesting text format that it can serialize to is its own text format. There are several advantages to this text format, but let us first describe what the use cases are for having a text representation of your data.
The most obvious use case is for debugging. This is a stressful and not-so-enjoyable part of our job. We do not want to add extra complexity on top of the already complex process. As such, we try to make each payload clear by making it readable with something like the following:
id: "a_unique_id"
label: "Total Amount"
quantity: 1
amount {
  currency_code: "USD"
  units: 9
  nanos: 990000000
}
This is much easier to debug than something like this:
0a 0b 61 5f 75 6e 69 71 75 65 5f 69 64 12 0c 54
6f 74 61 6c 20 41 6d 6f 75 6e 74 18 01 22 0d 0a
03 55 53 44 10 09 18 80 e7 88 d8 03
In the text representation, we can clearly read every field and the corresponding data, whereas in the binary, we would have to figure out what the field metadata and the actual data are.
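To make "field metadata" versus "actual data" concrete, here is a minimal sketch that walks the bytes above by hand. It assumes only the two wire types that appear in this payload (varint and length-delimited), and the helper names read_varint and decode are illustrative, not part of the Protobuf library. Note that the binary carries field numbers only, so the decoder cannot recover names such as id or label without the message definition.

```python
def read_varint(data, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if not byte & 0x80:               # high bit clear: last byte
            return result, pos
        shift += 7


def decode(data):
    """Split a message into (field_number, wire_type, raw_value) triples."""
    fields = []
    pos = 0
    while pos < len(data):
        # The tag is the field metadata: field number plus wire type.
        tag, pos = read_varint(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:      # varint-encoded integer
            value, pos = read_varint(data, pos)
        elif wire_type == 2:    # length-delimited: string or nested message
            length, pos = read_varint(data, pos)
            value = data[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        fields.append((field_number, wire_type, value))
    return fields


payload = bytes.fromhex(
    "0a 0b 61 5f 75 6e 69 71 75 65 5f 69 64 12 0c 54"
    "6f 74 61 6c 20 41 6d 6f 75 6e 74 18 01 22 0d 0a"
    "03 55 53 44 10 09 18 80 e7 88 d8 03"
)

top_level = decode(payload)
# The fourth field is a nested message, so its bytes are decoded again.
nested = decode(top_level[3][2])

for field in top_level + nested:
    print(field)
```

Even with this helper, the reader still has to map field numbers back to names and decide whether a length-delimited value is a string or a nested message, which is the effort the text format saves.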
Another use case in which the Protobuf text format is useful is for configuration. Since humans are the main readers and editors of configuration files, we need to have some textual representation of the data. On top of the other benefits that we mentioned during the first use case, there are a few others:
There are fewer boilerplate characters than in JSON or XML; thus, writing the configuration file is faster.
At the time of serialization, Protobuf will check the types for each and every field. If a developer sets the wrong data to a field, Protobuf will let you know.
We can add comments and headers to give extra information to developers.
Less boilerplate
The Protobuf text format is similar to JSON. However, as we saw in the example above, it feels less cluttered with unnecessary characters. The equivalent JSON to the txtpb, which we saw previously, would look like the following:
{
  "id": "a_unique_id",
  "label": "Total Amount",
  "quantity": 1,
  "amount": {
    "currency_code": "USD",
    "units": 9,
    "nanos": 990000000
  }
}
In total, such a JSON file is 157 bytes, whereas the equivalent txtpb file is 116 bytes. But more importantly, it is easier to write the txtpb file for the following reasons:
There are no outer braces for the first level
There are no commas between the fields
The keys are not enclosed by double quotes
You can omit colons when defining a field that uses a user-defined type (see amount)
In the end, I believe that we can agree on the fact that we use fewer keystrokes to write a txtpb file and that, obviously, it takes less space on disk to store such a file. This makes this format easy to write and read because it is less cluttered.
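The four differences listed above can be sketched in a few lines. The to_txtpb function below is a hypothetical, hand-rolled emitter written for illustration; it is not the Protobuf library's serializer, and it only handles strings, integers, and nested messages. It renders the same data both ways so the sizes can be compared.

```python
import json


def to_txtpb(message, indent=0):
    """Render a nested dict in a text-format-like style (illustrative only)."""
    pad = " " * indent
    lines = []
    for key, value in message.items():
        if isinstance(value, dict):
            # Nested message: no colon, braces delimit the block.
            lines.append(f"{pad}{key} {{")
            lines.append(to_txtpb(value, indent + 2))
            lines.append(f"{pad}}}")
        elif isinstance(value, str):
            # Keys are unquoted; only string values keep their quotes.
            lines.append(f"{pad}{key}: {json.dumps(value)}")
        else:
            lines.append(f"{pad}{key}: {value}")
    # No outer braces and no commas between fields.
    return "\n".join(lines)


payload = {
    "id": "a_unique_id",
    "label": "Total Amount",
    "quantity": 1,
    "amount": {"currency_code": "USD", "units": 9, "nanos": 990000000},
}

as_txtpb = to_txtpb(payload) + "\n"
as_json = json.dumps(payload, indent=2) + "\n"

print(as_txtpb)
print("txtpb bytes:", len(as_txtpb.encode("utf-8")))
print("json bytes: ", len(as_json.encode("utf-8")))
```

The exact byte counts depend on indentation and trailing newlines, so they may differ slightly from the figures quoted above, but the text format comes out smaller for the same data.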
Type safety
…Protobuf is designed to be type-safe. This means that if we passed the wrong value to a txtpb file and tried to serialize/deserialize it, Protobuf would give us messages like the following (here, the string "9" was set on an integer field):
Providing "9" instead of 9 to units field: Expected integer, got: "9"
This is especially useful to catch errors before it is too late. If we had a configuration file with an invalid value, we would be able to catch the misconfiguration at the entry point of our application and not later during runtime. This shortens the feedback loop for developers, and it costs fewer resources to fix the bug than if we had to redeploy.
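The idea of failing at the entry point can be sketched without the Protobuf library. The code below is a simplified stand-in, not Protobuf's own parser: AMOUNT_SCHEMA, parse_scalar, and load_amount are hypothetical names, the schema is written by hand rather than derived from a .proto file, and only flat string and integer fields are supported.

```python
import re

# Hypothetical schema for the amount message: field name -> expected type.
AMOUNT_SCHEMA = {"currency_code": str, "units": int, "nanos": int}

FIELD = re.compile(r"^\s*(\w+):\s*(.+?)\s*$")


def parse_scalar(raw):
    """Interpret a scalar: double-quoted text is a string, digits are an integer."""
    if len(raw) >= 2 and raw[0] == raw[-1] == '"':
        return raw[1:-1]
    if re.fullmatch(r"-?\d+", raw):
        return int(raw)
    raise ValueError(f"unsupported value: {raw}")


def load_amount(text):
    """Parse flat 'name: value' lines and reject values of the wrong type."""
    values = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        match = FIELD.match(line)
        if match is None:
            raise ValueError(f"line {number}: cannot parse {line!r}")
        name, raw = match.groups()
        if name not in AMOUNT_SCHEMA:
            raise ValueError(f"line {number}: unknown field {name!r}")
        value = parse_scalar(raw)
        expected = AMOUNT_SCHEMA[name]
        if not isinstance(value, expected):
            raise ValueError(
                f"line {number}: field {name!r} expected {expected.__name__}, got: {raw}"
            )
        values[name] = value
    return values


# Called once at startup, so a bad value stops the program immediately.
config = load_amount('currency_code: "USD"\nunits: 9\nnanos: 990000000')
print(config)

try:
    load_amount('units: "9"')
except ValueError as error:
    print("rejected:", error)
```

With the real library, the generated message class plays the role of the schema, so there is nothing to maintain by hand.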
Headers and comments
Finally, we can have comments right next to the data. This means that we can explain what the field is doing or even add some “metadata” at the beginning of the file. For example, we can add headers that would be used by a tool or developers to understand them. The text format specification displays the following example:
# proto-file: some/proto/my_file.proto
# proto-message: MyMessage
This shows us that the following data are supposed to be serialized using the message definition called MyMessage, and that the definition of that message is in the file some/proto/my_file.proto.
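A tool could pick up these two headers with very little code. The following is a minimal sketch under stated assumptions: read_headers is a hypothetical helper, not part of any Protobuf tooling, and it assumes the headers appear as comment lines before the first field.

```python
import re

# Matches the two header comments shown above.
HEADER = re.compile(r"^#\s*(proto-file|proto-message):\s*(\S+)\s*$")


def read_headers(text):
    """Collect proto-file / proto-message headers from leading comment lines."""
    headers = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if not stripped.startswith("#"):
            break  # first non-comment line: the data has started
        match = HEADER.match(stripped)
        if match:
            headers[match.group(1)] = match.group(2)
    return headers


sample = (
    "# proto-file: some/proto/my_file.proto\n"
    "# proto-message: MyMessage\n"
    'id: "a_unique_id"\n'
)

print(read_headers(sample))
```

An editor plugin or linter could use the returned path and message name to locate the definition and validate the rest of the file against it.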
On top of these headers, we can add comments to fields. You might have noticed the weird nanos field in our previous txtpb example. If you did not read the definition code, which contains comments, you would have no way to guess what it means. Furthermore, even if the developer had the headers to be able to track down the definition, we can save some time by adding a comment right next to the field. This could look like the following:
amount {
  currency_code: "USD"
  units: 9 # 9 dollars
  nanos: 990000000 # 99 cents
}
This could also be used to warn the developers about some expectations around the value the field should have.
Finally, it is important to mention that these comments are not encoded into binary in any way. They are only here for the reader of the file. This means that once the data are in binary, you will lose all this information, but this also means that your documentation does not impact your payload size.
Protocol Buffers Handbook by Clément Jean was published in April 2024. You can read the entire first chapter and buy the book here! Packt library subscribers can continue reading the entire book for free here.
🛠️ Useful Tools ⚒️
Zed: a lightweight, macOS-compatible code editor designed to enhance developer productivity through a minimalist interface and features such as collaborative tools and integration with GitHub Copilot.
peerdb: a high-performance tool optimized for streaming data from PostgreSQL to data warehouses, queues, and storage engines, focusing on speed, reliability, and comprehensive features for large-scale data transfers.
Devv AI: an AI-powered search engine tailored for developers, utilizing a unique vertical search index focused on development resources like documentation, code, and enhanced web search to deliver precise and contextually relevant results.
That’s all for today.
We have an entire range of newsletters with focused content for tech pros. Subscribe to the ones you find the most useful here. Complete ProgrammingPro archives can be found here. Complete PythonPro archives are here.
If your company is interested in reaching an audience of developers, software engineers, and tech decision makers, you may want to advertise with us.
If you have any comments or feedback, take the survey or leave a comment below.