This blog post is a preview of what you can expect from Sean Feldman’s session at INTEGRATE 2019 USA. You can register for the event ticket from here.
Everything around us is evolving, changing, and adapting. Services grow and evolve into more than was originally intended. Azure messaging services are no exception to this rule. Over time, these services have expanded in various directions and become more specialized. Understanding what each service is built for and what it does best is the key to using it successfully in projects and solutions that require different aspects of messaging. The purpose of this article is to clarify the Azure messaging services and help you decide when to use which service.
Azure Storage Queues
In systems that require intense processing, handling it all in a single process is virtually impossible. Possible processing failures, outages, hardware limitations, or cost constraints make it necessary to divide work into chunks and distribute them to processes that can handle them. This also helps with separation of responsibilities: a single process doesn’t have to assume all the responsibilities and turn into a monolithic system. The approach is to divide the work and delegate it to other processes.
While this approach sounds straightforward, several considerations need to be taken into account, such as temporal coupling, outages, workloads, and more.
Let’s look at an example: a registration process.
- A potential customer is filling out a web form providing their information
- Collected information is sent to the server for processing
- Processing includes storing the customer and sending a welcome email
While the process looks like a no-brainer, it has multiple potential points of failure related to the considerations raised earlier. Here are a few scenarios where things don’t go quite as expected.
The web server sees a sudden spike of potentially interested visitors who would like to sign up. They all rush to fill out the form and submit it. The backend server experiences load problems and starts to time out while the web server is waiting for confirmations to come back. Potential customers wait for a long time until the signup page times out. Alternatively, customers restlessly re-submit their forms, putting even more load on the backend and bringing it to its knees. At that point, waiting customers leave without ever completing their signup, and no new visitors can sign up. This is a classic case of temporal coupling between sub-systems.
The example above is an edge case. A more realistic example is the exceptions and outages that can and will happen from time to time. Let’s assume storing the customer information succeeded, but sending the welcome email failed due to an email server error. Being a defensive developer, you would probably handle that gracefully in code, but the system would still need to re-send that welcome email. To do so, it has to know how to retry.
Queues address all these issues. Queues are intended to decouple processes and allow asynchronous communication.
Additional attributes of most queuing services include:
- Workload leveling
Azure Storage Queues (ASQ) is a queuing service that is intended for precisely this purpose. It’s a simple queue service to help with work coordination between various Azure compute services. Messages sent using ASQ contain a payload that is an instruction for a known destination, with an expectation of a specific outcome.
For example, a web server would enqueue a message with the information of a potential customer to register. The web server would not require the backend to be online. More than that, it would not even care whether the backend is up and running. It would store the intent to register a new customer in a message destined for the backend, trusting that the message will eventually be picked up and processed. The intent is clear, the registration information is captured and reliably stored in a durable queue, and even a tsunami of registrations would not overwhelm the backend, which processes messages at its own pace.
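The registration flow above can be sketched with an in-memory queue. This is a minimal illustration of the decoupling idea, not Azure Storage Queues client code: the names `registration_queue`, `web_server_submit`, `backend_drain`, and `make_flaky_sender` are hypothetical, and a `deque` stands in for the durable queue.

```python
import json
from collections import deque

# In-memory stand-in for a durable queue (assumption: a real system would
# use a queue service so messages survive process restarts).
registration_queue = deque()


def web_server_submit(form):
    """Capture the intent to register and return without waiting for the backend."""
    registration_queue.append(json.dumps(form))
    return "accepted"


def backend_drain(store, send_email, max_attempts=3):
    """Process queued registrations; put a message back when processing fails."""
    attempts = {}
    while registration_queue:
        raw = registration_queue.popleft()
        customer = json.loads(raw)
        try:
            # setdefault keeps the store step idempotent across retries.
            store.setdefault(customer["email"], customer)
            send_email(customer["email"])
        except RuntimeError:
            attempts[raw] = attempts.get(raw, 0) + 1
            if attempts[raw] < max_attempts:
                registration_queue.append(raw)  # retry later


def make_flaky_sender(fail_first_n):
    """Hypothetical email sender that fails a given number of times, then works."""
    sent = []
    state = {"failures_left": fail_first_n}

    def send(address):
        if state["failures_left"] > 0:
            state["failures_left"] -= 1
            raise RuntimeError("email server error")
        sent.append(address)

    return send, sent
```

The web server only appends to the queue, so it keeps accepting signups while the backend is offline. The backend drains the queue whenever it is available, and a failed welcome email results in a retry instead of a lost registration.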
ASQ is an excellent messaging service choice when it comes to work item distribution. Its low cost and large capacity (up to 500 TB of message data) make it unbeatable. However, it doesn’t come without certain compromises. Being a simple queueing service, it lacks enterprise messaging features that would demand intense compute. In addition, message size is limited to 64 KB, and the service offers no message metadata, also known as message headers or properties. Remember, the service is not intended to move large amounts of data between processes but to indicate intent with the minimal amount of information required for the destination to perform its job.
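One way to stay within the size limit is to send only the intent plus a reference to data stored elsewhere. The sketch below assumes the 64 KB limit is checked against the UTF-8 encoded JSON body. `build_work_item` and the message fields are hypothetical names, not part of any Azure SDK.

```python
import json

MAX_MESSAGE_BYTES = 64 * 1024  # the 64 KB message size limit mentioned above


def build_work_item(action, reference):
    """Build a small message that carries intent and a reference, not the data itself."""
    body = json.dumps({"action": action, "ref": reference}).encode("utf-8")
    if len(body) > MAX_MESSAGE_BYTES:
        raise ValueError(
            f"message is {len(body)} bytes; the limit is {MAX_MESSAGE_BYTES}"
        )
    return body
```

A message such as `build_work_item("register-customer", "customers/12345")` stays tiny regardless of how much customer data sits behind the reference.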
Azure Service Bus
While the Storage Queues service provides the necessary means for queueing, there are domains that require real messaging. Usually, that means features that involve compute and processing performed behind the scenes by the service. That’s where the Azure Service Bus (ASB) service shines. The features it provides are genuinely impressive. Some of those are:
- Transactional processing
- Message ordering
- Expiration (TTL)
- Duplicate detection
- And much, much more
The service goes beyond simple queuing and offers the features commonly found in enterprise line-of-business applications and systems: systems that require decoupling along with durability and guarantees. One such guarantee is transactional: the ability to process an incoming message and dispatch work items as new outgoing messages atomically, where either everything succeeds or everything is rolled back. Another is dealing with poison messages so that they do not prevent the system from continuing to function, while keeping them for further investigation. There is also the ability to broadcast messages rather than send them to specific destinations. This feature alone sets ASB apart from ASQ and allows event distribution, decoupling message producers from message consumers.
This service also comes with a few perks, including a message size of 256 KB/1 MB* and metadata/headers support. With two main offerings, the Standard and Premium tiers, you’re free to choose the performance and throughput you need. The Premium tier allows scaling up and down by purchasing dedicated processing power known as Messaging Units, offering reliable and predictable throughput and latency. Support for disaster recovery with Geo-DR, high availability with Availability Zones, and Virtual Network Service Endpoints is built into the Premium tier, making it a truly premium enterprise messaging service.
* Premium only.
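The poison-message handling described above can be sketched as follows. This is an in-memory illustration of the idea, not the Service Bus API: `SketchQueue`, `max_delivery_count`, and `process_all` are hypothetical names, and the threshold of three deliveries is an arbitrary assumption.

```python
from collections import deque


class SketchQueue:
    """In-memory queue that moves repeatedly failing messages to a dead-letter list."""

    def __init__(self, max_delivery_count=3):
        self.max_delivery_count = max_delivery_count
        self.active = deque()
        self.dead_letter = []

    def send(self, body):
        self.active.append({"body": body, "delivery_count": 0})

    def process_all(self, handler):
        """Deliver every message; failures are retried, then dead-lettered."""
        processed = []
        while self.active:
            message = self.active.popleft()
            message["delivery_count"] += 1
            try:
                handler(message["body"])
                processed.append(message["body"])
            except Exception:
                if message["delivery_count"] >= self.max_delivery_count:
                    # Keep the poison message for later investigation.
                    self.dead_letter.append(message)
                else:
                    self.active.append(message)
        return processed
```

A message that keeps failing ends up in `dead_letter` after the configured number of deliveries, while healthy messages behind it continue to be processed.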
Azure Event Hubs
So far, the focus has been on individual messages, where each message is discrete and important. ASB supports messages that represent events, but those are not streams of events. Scenarios such as telemetry require a different approach to handling messages: the ability to capture events that arrive at high volumes. That’s where a third Azure messaging service comes in, Event Hubs (EH). It is a service built specifically to deal with event streams.
With exceptional ingress rates, it can take on a large number of incoming messages, storing them internally and offering no queue semantics. While that might sound like a significant drawback, the service is highly optimized for the one job it has: storing events so that consumers can process them later on their own terms. Consumers manage the reading of events and decide whether to move the reading cursor forward or go back to messages already read. With this level of power comes the responsibility to manage the cursor. Often, the data read is streamed into other services for analysis or persistence. The EH Capture feature helps with that by periodically exporting events to Blob Storage in Avro format.
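The difference from queue semantics can be sketched with an append-only log and consumer-owned cursors. This is a conceptual illustration only, not the Event Hubs client: `EventLog`, `Consumer`, `poll`, and `rewind` are hypothetical names, and partitions are left out for brevity.

```python
class EventLog:
    """Append-only store: reading an event does not remove it."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1  # offset of the stored event

    def read(self, offset, max_count):
        return self._events[offset:offset + max_count]


class Consumer:
    """Each consumer owns its cursor (offset) into the log."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self, max_count=10):
        batch = self.log.read(self.offset, max_count)
        self.offset += len(batch)  # move the cursor forward
        return batch

    def rewind(self, offset):
        self.offset = offset  # go back to events that were already read
```

Because the log never deletes on read, two consumers can process the same events independently, and either one can rewind to replay them.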
The service’s processing power is expressed in Throughput Units (TUs), where each TU is capable of up to 1 MB/s or 1,000 events/s of ingress and up to 2 MB/s or 4,096 events/s of egress. While TUs have to be pre-allocated, the EH service offers an Auto-Inflate feature that scales TUs up automatically to keep message ingestion from being throttled. The dedicated tier of the service also offers an extended event retention period. EH makes a conscious trade-off in favor of throughput over features.
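As a rough sizing aid, the ingress limits quoted above can be turned into a small calculation. This is a sketch that assumes each TU provides exactly 1 MB/s or 1,000 events per second of ingress, whichever limit is reached first. `ingress_throughput_units` is a hypothetical helper, and real capacity planning should also account for egress and traffic bursts.

```python
import math

INGRESS_MB_PER_TU = 1.0       # ingress bandwidth per TU, as quoted above
INGRESS_EVENTS_PER_TU = 1000  # ingress events per second per TU, as quoted above


def ingress_throughput_units(mb_per_second, events_per_second):
    """Estimate the TUs needed to absorb a given ingress load."""
    by_bandwidth = math.ceil(mb_per_second / INGRESS_MB_PER_TU)
    by_count = math.ceil(events_per_second / INGRESS_EVENTS_PER_TU)
    # Whichever limit is hit first determines the number of TUs.
    return max(1, by_bandwidth, by_count)
```

For example, many small events (0.5 MB/s at 2,500 events/s) are bound by the event count and need 3 TUs, while fewer large events (4.2 MB/s at 1,000 events/s) are bound by bandwidth and need 5.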
Azure Event Grid
Azure Event Grid (EG) is one of the latest additions to the Azure messaging services family. This toddler service is making big splashes and at times feels like a threat to its older siblings. After all, the questions it raises are valid.
- What was missing there that a fourth messaging service had to be introduced?
- How is it different from ASQ, ASB, and EH?
- Is EG going to replace some of the other messaging services?
To answer these, let’s see which existing problems the other messaging services didn’t address and EG can.
The world we live in is reactive. Events take place, and we respond to those events. Azure services are no exception to this paradigm. A new or updated blob in the Storage service needs to be processed to import data into Cosmos DB. A provisioned resource needs to be tagged. Subscribers scaled out across multiple regions need to be notified through differing receiving mechanisms (webhooks, queues, streams, etc.). Reacting to events at cloud scale is not what the messaging services covered earlier were built for. That’s what EG was created for.
Event Grid’s whole purpose is to make publishing and handling events across datacenter boundaries cheap and straightforward, helping with building reactive architectures.
As of today, EG already supports several Azure services capable of publishing events.
Those services are:
- Blob Storage
- Storage (v2)
- Event Hubs
- Service Bus (Premium)
- IoT Hub
- Container Registry
- Resource Groups
- Azure Subscriptions
Additionally, EG supports Custom Topics, expanding this paradigm outside of Azure. When it comes to subscribing to EG topics and handling events, currently supported options include the following:
- Custom WebHooks
- Azure Functions
- Logic Apps
- Storage Queues
- Event Hubs
- Azure Automation
The list is short, but it will expand to support all of the Azure services and, eventually, services beyond Azure. With CNCF (Cloud Native Computing Foundation) CloudEvents v0.1 already supported, EG can exchange messages with any system supporting this open standard for events.
With its ability to process millions of messages and its platform- and language-agnostic nature, EG is the go-to messaging service for distributing notifications and building reactive solutions. Despite its HTTP nature, the service is engineered for reliability. Built-in retries and dead-lettering provide strong assurance that events are not lost. It is also reasonably priced, allowing massive-scale event publishing and distribution.
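The publish/subscribe model with retries and dead-lettering can be sketched in a few lines. This is an in-memory illustration of the concept, not the Event Grid API: `SketchTopic`, the event type strings, and the fixed retry count are all assumptions, and a real service would space retries out over time rather than retry immediately.

```python
class SketchTopic:
    """Fan out each event to matching subscribers, retrying failed deliveries."""

    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts
        self.subscribers = {}   # name -> (event type filter, handler)
        self.dead_letters = []  # (subscriber name, event) pairs that were never delivered

    def subscribe(self, name, event_type, handler):
        self.subscribers[name] = (event_type, handler)

    def publish(self, event):
        """Deliver the event and return the names of subscribers that received it."""
        delivered_to = []
        for name, (event_type, handler) in self.subscribers.items():
            if event["type"] != event_type:
                continue  # subscriber is not interested in this event type
            for attempt in range(1, self.max_attempts + 1):
                try:
                    handler(event)
                    delivered_to.append(name)
                    break
                except Exception:
                    if attempt == self.max_attempts:
                        # Out of retries: keep the event instead of losing it.
                        self.dead_letters.append((name, event))
        return delivered_to
```

The publisher knows nothing about the handlers. Each subscriber receives only the event types it asked for, a temporarily failing subscriber is retried, and an event that can never be delivered is set aside rather than dropped.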
Azure Relay and Hybrid Connections
While the Relay service is very different from the services listed above, its history and purpose are bound to the messaging domain. It is more of an enabler service: its whole purpose is to enable secure communication where that communication would usually not be possible.
A good example would be exposing a service behind a firewall. Traditionally, that would require poking a hole in the firewall. With Relay, you no longer need this. Instead, “The Relay service allows to establish a cloud endpoint through which both the consumer and the provider behind the firewall can talk to via an establishing secure channel.”
Relay helps overcome VPN and firewall constraints in a secure way.
If you got to this point, you have probably noticed a consistent theme throughout the article: dedicated services focused on a unique set of responsibilities. In a way, today’s Azure messaging services follow the well-known Single Responsibility Principle (SRP). Slightly rephrasing the principle, a messaging service should have responsibility for a single set of functionalities. You can see this throughout the services in the article. Let’s summarize it.
- Storage Queues for simple queueing.
- Service Bus for enterprise messaging.
- Event Hubs for event stream processing.
- Event Grid for pub/sub and large-scale events.
Understanding which messaging service is best suited to the problem at hand is essential to applying the best tool at your disposal. No single service can fit the bill for every job, nor should it. Evaluate system requirements, understand the needs, and carefully choose the messaging service to apply. At times, you will need several services to build the solution. Know and understand them all to avoid situations where everything looks like a nail just because the only tool in your tool belt is a hammer.