Event Grid Delivery and Retry Policy

Why Event Grid?

Azure Event Grid facilitates building event-driven serverless apps that can effectively solve real-time business problems, with a focus on the core logic rather than the infrastructure. Event Grid is designed for high availability, consistent performance, and dynamic scale.

Event Grid can simplify event-based apps because it serves as a single service that manages the routing of all events from any source to any destination.

Before we proceed further, it is important to understand what an ‘event’ is.

An event is the smallest amount of information that explains something which happened in a system.

Event Grid Topic

In Azure Event Grid, the event source can be considered as an Event Grid Topic. There are two different types of Event Grid Topics. They are

  1. System Event Grid Topics
  2. Custom Event Grid Topics

System Event Grid Topics

System Event Grid Topics are the predefined set of Azure Services that can generate events. When the event source is a System Event Grid Topic, the Event generation and Event schema are handled by default. Some of the System Event Grid Topics are

  • Azure Subscriptions (management operations)
  • Container Registry
  • Event Hubs
  • IoT Hub
  • Media Services
  • Resource Groups (management operations)
  • Service Bus
  • Storage Blob
  • Azure Maps

Custom Event Grid Topic

Custom Event Grid Topics can be used when the source of the event is not one of the System Event Grid Topics. Events can be sent to a custom Event Grid Topic using the client libraries provided. For example, the .NET library used for sending events is Microsoft.Azure.EventGrid.
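
As a rough illustration of what sending an event to a custom topic involves, here is a minimal Python sketch that uses plain HTTP from the standard library instead of the .NET library mentioned above. The topic endpoint, access key, subject and event type are all hypothetical placeholders, and the request is only built, not sent.

```python
import json
import uuid
from datetime import datetime, timezone
from urllib.request import Request


def build_event(subject, event_type, data, data_version="1.0"):
    """Return one event as a dict using Event Grid schema field names."""
    return {
        "id": str(uuid.uuid4()),
        "subject": subject,
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "data": data,
        "dataVersion": data_version,
    }


def build_publish_request(topic_endpoint, access_key, events):
    """Build (but do not send) the HTTP POST that publishes a list of events."""
    body = json.dumps(events).encode("utf-8")
    return Request(
        topic_endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "aeg-sas-key": access_key},
    )


# Hypothetical values, for illustration only.
event = build_event("employees/1001", "Contoso.Employee.Created", {"name": "A. Example"})
request = build_publish_request(
    "https://example-topic.westeurope-1.eventgrid.azure.net/api/events",
    "<topic-access-key>",
    [event],
)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

The events are posted as a JSON array, so several events can share one request.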

 Multiple event subscriptions can be created for a single Event Grid Topic.

Event Subscription

The destination to which events must be sent is configured using Event Subscriptions. Event Subscriptions can be created for both custom and system Event Grid Topics. Event Subscriptions also have filters that select the required events before they are delivered to the endpoints. The destination endpoints that can be configured with Event Subscriptions are

  • Azure Automation
  • Azure Functions
  • Event Hubs
  • Hybrid Connections
  • Logic Apps
  • Microsoft Flow
  • Queue Storage
  • Service Bus
  • Webhooks
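
To make the idea of subscription filters concrete, the following is a simplified sketch of filtering by event type and by subject prefix and suffix. The function and parameter names are made up for this illustration, and the comparison here is case-sensitive, which may differ from how a real subscription is configured.

```python
def matches(event, included_event_types=None, subject_begins_with="", subject_ends_with=""):
    """Return True if the event passes an (assumed) subscription filter."""
    # An empty or missing list of event types means "accept every type".
    if included_event_types and event["eventType"] not in included_event_types:
        return False
    subject = event["subject"]
    # Empty prefix/suffix strings match every subject.
    return subject.startswith(subject_begins_with) and subject.endswith(subject_ends_with)


# Hypothetical event, for illustration only.
blob_event = {
    "eventType": "Microsoft.Storage.BlobCreated",
    "subject": "/blobServices/default/containers/images/blobs/cat.jpg",
}
wanted = matches(
    blob_event,
    included_event_types=["Microsoft.Storage.BlobCreated"],
    subject_begins_with="/blobServices/default/containers/images/",
    subject_ends_with=".jpg",
)
```

Only events for which such a check succeeds would be forwarded to the handler; the rest never reach the endpoint.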

In short, the important concepts in Azure Event Grid are

Events – What happened in the event source

Event handlers – The destination app or service reacting to the event

Topics – The endpoint where publishers push events

Event subscriptions – The endpoint or built-in mechanism to route events, sometimes to more than one handler. Subscriptions are also used by handlers to intelligently filter incoming events

Event sources – Where the event exactly took place

The performance degradation caused by periodic polling can be reduced by adopting an event-driven architecture using Azure Event Grid.

Azure event Grid use case

Architecture using Event Grid

An Event Subscription can be created for the Service Bus Namespace, with the Payroll queue as the destination endpoint. Whenever an active message is available in the Employee queue, an event is generated and sent as a message to the Payroll queue. The Payroll application can then listen to the Payroll queue. Whenever an active message is available in the Payroll queue, the Payroll application can poll the Employee database and add the new employees to the Payroll application.

This reduces the periodic polling of the database and improves the performance of the entire Employee Management System.

Event Grid Delivery

Event Grid ensures reliable delivery of events to the destination endpoints. It makes sure that each event is delivered at least once to the destination endpoint. When an event is not received by the configured destination endpoint, the endpoint sends no acknowledgment, and Event Grid uses this missing acknowledgment to trigger retries. Events are sent immediately to the event subscribers, i.e. the configured destination endpoints. This ensures timely delivery of events from the event sources to the destination endpoints with minimum latency.

Event Grid delivers each event individually to the subscribers. The destination endpoints receive the events as an array containing a single event.
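
Because the payload is an array even when it carries one event, a handler has to iterate over it. Here is a minimal, framework-free sketch of what a webhook handler body might look like. The function name and return shape are assumptions for illustration, not part of any Event Grid library.

```python
import json


def handle_delivery(body):
    """Process one delivery and return (http_status, processed_event_ids)."""
    try:
        events = json.loads(body)
    except (ValueError, TypeError):
        return 400, []  # the body is not valid JSON
    if not isinstance(events, list) or not all(isinstance(e, dict) for e in events):
        return 400, []  # expected a JSON array of event objects
    processed = []
    for event in events:
        # Real business logic would go here; this sketch only records the id.
        processed.append(event.get("id"))
    return 200, processed  # a 2xx status acknowledges the delivery


status, ids = handle_delivery(b'[{"id": "1", "eventType": "Contoso.Employee.Created"}]')
```

Since delivery is at least once, the real processing step should tolerate seeing the same event id more than once.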

Retry Schedule and Duration

Event Grid waits 30 seconds for an acknowledgment from the destination endpoint. If the acknowledgment is not received, Event Grid queues the event for retry. Event Grid applies an exponential retry policy to ensure event delivery.

Event Grid retries delivery on the following schedule on a best effort basis:

  • 10 seconds
  • 30 seconds
  • 1 minute
  • 5 minutes
  • 10 minutes
  • 30 minutes
  • 1 hour
  • Hourly for up to 24 hours
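
The schedule above can be turned into a rough timeline. This sketch assumes that each listed value is the wait since the previous attempt and that the hourly retries continue until the 24-hour window would be exceeded. It ignores the randomization and skipped retries that the real service applies, so treat the numbers as an approximation only.

```python
from itertools import accumulate

# Waits between attempts, in seconds, taken from the list above.
INITIAL_DELAYS = [10, 30, 60, 300, 600, 1800, 3600]
HOURLY = 3600
MAX_WINDOW = 24 * 3600  # "up to 24 hours"


def retry_offsets():
    """Return seconds after the first failed attempt at which each retry would fire."""
    offsets = list(accumulate(INITIAL_DELAYS))
    # Keep retrying hourly while the next retry still falls inside the window.
    while offsets[-1] + HOURLY <= MAX_WINDOW:
        offsets.append(offsets[-1] + HOURLY)
    return offsets


for number, offset in enumerate(retry_offsets(), start=1):
    print(f"retry {number}: about {offset / 60:.1f} minutes after the first attempt")
```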

If the destination endpoint responds within 3 minutes, Event Grid will try to remove the event from the retry queue on a best-effort basis, but duplicate events may still be received.

Azure Event Grid adds a small randomization to all retries and may even skip some retries if the destination endpoint is unhealthy or unavailable for a long time.

For deterministic retry behaviour, configure the event time to live and the maximum number of delivery attempts in the subscription's retry policy.

Dead-Letter Location

A dead-letter destination can be configured for an Event Grid subscription. Whenever the configured maximum number of retries is exhausted, the events are stored in the configured dead-letter destination. Currently, Azure Storage account containers can be configured as the dead-letter destination for Event Grid subscriptions.

When creating an Event Grid subscription, it is possible to configure how long Event Grid should try to deliver the event. By default, Event Grid tries for 24 hours (1440 minutes), or 30 times. You can set either the Event Time to Live or the Maximum retry count for the Event Grid subscription. The value for event time-to-live must be between 1 and 1440 minutes. The value for max retries must be between 1 and 30.

If no dead-letter location is configured and all the retry attempts are exhausted, Event Grid drops the failed events.

If both the Event Time to Live and Maximum retry attempts are configured, Event Grid uses whichever expires first to determine when to stop event delivery.
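
The interplay between the two limits and the dead-letter location can be summarised in a few lines. The function names below are invented for this sketch, and it only models the rules stated in this section: the allowed ranges, "first to expire wins", and dead-letter versus drop.

```python
def validate_retry_policy(ttl_minutes=1440, max_attempts=30):
    """Check the ranges described above and return the validated pair."""
    if not 1 <= ttl_minutes <= 1440:
        raise ValueError("event time-to-live must be between 1 and 1440 minutes")
    if not 1 <= max_attempts <= 30:
        raise ValueError("maximum delivery attempts must be between 1 and 30")
    return ttl_minutes, max_attempts


def next_action(attempts_made, elapsed_minutes, ttl_minutes=1440,
                max_attempts=30, dead_letter_configured=False):
    """Return 'retry', 'dead-letter' or 'drop' for an event that is still undelivered."""
    ttl_minutes, max_attempts = validate_retry_policy(ttl_minutes, max_attempts)
    # Whichever limit is reached first ends delivery.
    expired = attempts_made >= max_attempts or elapsed_minutes >= ttl_minutes
    if not expired:
        return "retry"
    return "dead-letter" if dead_letter_configured else "drop"
```

For example, with a time to live of 60 minutes and 30 attempts, an event that has been tried 5 times over 61 minutes is no longer retried, even though 25 attempts remain.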

Delayed Delivery

Whenever an event delivery to an endpoint fails, Event Grid delays further delivery of events and retries of failed events to that endpoint. For example, if delivery of ten events to an endpoint fails, Event Grid will consider that there is an issue with the endpoint and will delay all subsequent retries and new deliveries for some time, in some cases up to several hours.

The main purpose of delayed delivery is to safeguard unhealthy endpoints as well as the Event Grid architecture. Without a back-off mechanism and delayed delivery to unhealthy endpoints, Event Grid’s retry policy and volume capabilities can easily reduce the performance of a system. So, it is important to configure the retry policy carefully to preserve application performance.

Dead-Letter Events

When Azure Event Grid can’t deliver an event, it will route the undelivered events to a storage account by a process known as dead-lettering. By default, dead-lettering is disabled for Event Grid Subscriptions. To enable it, you must specify an Azure Storage Account to hold undelivered events when creating the Event Subscription. The dead-letter events will be stored in Azure storage account blobs.

Event Grid routes an event to the dead-letter location when it has tried all of its retry attempts. If Event Grid receives a 400 (Bad Request) or 413 (Request Entity Too Large) response code, it immediately sends the event to the dead-letter endpoint. These response codes indicate that delivery of the event is not possible because there is an issue with the event schema or the event size.

There is a five-minute delay between the last attempt to deliver an event and when it is delivered to the dead-letter location. This delay is intended to reduce the number of Blob storage operations. If the dead-letter location is not available for four hours, the event is dropped.

To set the dead-letter location, an Azure Storage account with a container is required. The endpoint of this container must be provided when creating the event subscription.

The dead-lettered events can themselves be managed using Event Grid. An event subscription can be created for the Azure Storage account whose container is configured as the dead-letter destination. Whenever an undelivered event is stored in the storage account container, a new blob created event is generated and sent to the event handler configured for that subscription. The event handler can then process the dead-lettered events in the storage blobs using custom logic.
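
A handler for that blob created event might start out like the sketch below. It assumes the event type is Microsoft.Storage.BlobCreated with the blob address in data.url, and that each dead-letter blob holds a JSON array whose entries carry a deadLetterReason property. The container name and sample values are hypothetical, so check the property names against a real blob before relying on them.

```python
import json
from collections import Counter
from urllib.parse import urlparse


def dead_letter_blob_url(event, container):
    """Return the blob URL if the event reports a new blob in the given container."""
    if event.get("eventType") != "Microsoft.Storage.BlobCreated":
        return None
    url = event.get("data", {}).get("url", "")
    path = urlparse(url).path  # expected shape: "/<container>/<blob path>"
    return url if path.startswith(f"/{container}/") else None


def summarize_dead_letters(blob_text):
    """Count the dead-lettered events in one blob by their recorded reason."""
    events = json.loads(blob_text)
    return Counter(e.get("deadLetterReason", "unknown") for e in events)


# Hypothetical sample data, for illustration only.
created = {
    "eventType": "Microsoft.Storage.BlobCreated",
    "data": {"url": "https://example.blob.core.windows.net/deadletters/topic/sub/event.json"},
}
url = dead_letter_blob_url(created, "deadletters")
summary = summarize_dead_letters('[{"deadLetterReason": "MaxDeliveryAttemptsExceeded"}, {}]')
```

Downloading the blob itself is left out here; the handler would fetch the content at the returned URL and pass it to the summary step or to whatever custom logic is needed.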

Event Delivery Status

Event Grid uses HTTP status codes to determine whether events are successfully delivered to the event handlers.

Success Codes

Event Grid considers the following status codes as success codes. When one of these codes is received, Event Grid treats it as an acknowledgment from the event handler and marks the event as delivered.

  • 200 OK
  • 201 Created
  • 202 Accepted
  • 203 Non-Authoritative Information
  • 204 No Content

Failure Codes

All other codes not in the above set (200-204) are considered failures and will be retried. It’s important to know that due to the highly parallelized architecture of Event Grid, the retry behaviour is non-deterministic.

| Status code | Retry behaviour |
| --- | --- |
| 400 Bad Request | Retry after 5 minutes or more |
| 401 Unauthorized | Retry after 5 minutes or more |
| 403 Forbidden | Retry after 5 minutes or more |
| 404 Not Found | Retry after 5 minutes or more |
| 408 Request Timeout | Retry after 2 minutes or more |
| 413 Request Entity Too Large | Retry after 10 minutes or more |
| 503 Service Unavailable | Retry after 30 minutes or more |
| All others | Retry after 10 minutes or more |
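
The success and failure rules can be captured as a small lookup. The delays below are copied from the table above as written. The sketch also assumes that 400 and 413 go straight to the dead-letter location when one is configured, as described in the Dead-Letter Events section, and otherwise follow the table. The function name is made up for this example.

```python
SUCCESS_CODES = {200, 201, 202, 203, 204}
IMMEDIATE_DEAD_LETTER_CODES = {400, 413}
# Minimum wait before the next attempt, in minutes, per the table above.
MIN_RETRY_DELAY_MINUTES = {400: 5, 401: 5, 403: 5, 404: 5, 408: 2, 413: 10, 503: 30}
DEFAULT_MIN_RETRY_DELAY_MINUTES = 10  # the "All others" row


def classify_response(status_code, dead_letter_configured=False):
    """Return (outcome, minimum_retry_delay_minutes) for a handler response."""
    if status_code in SUCCESS_CODES:
        return "delivered", None
    if dead_letter_configured and status_code in IMMEDIATE_DEAD_LETTER_CODES:
        return "dead-letter", None
    delay = MIN_RETRY_DELAY_MINUTES.get(status_code, DEFAULT_MIN_RETRY_DELAY_MINUTES)
    return "retry", delay
```

Because the actual retry behaviour is non-deterministic, the returned delay is a lower bound rather than a promise.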

Event Grid Services offered in Serverless360

Data Monitoring

Event grid topics can be monitored on the following metrics in Serverless360

Failed Events: Event sent to the Topic but rejected with an error code.

Published Events: Event successfully sent to the Topic and processed with a 2xx response.

Unmatched Events: Event successfully published to the Topic, but not matched to an Event Subscription. The event was dropped.

Publish-Subscribe Latency: Time taken by the Topic to publish the event to the Event Subscription.

Serverless360 Event Grid monitoring

Event grid subscriptions can be monitored on the following metrics in Serverless360

Delivered Events: Event successfully delivered to the Subscription’s endpoint and received a 2xx response.

Matched Events: Event in the Topic was matched by the Event Subscription.

Dropped Events: The event was not delivered, and all retry attempts were exhausted. The event was dropped.

Delivery Failed Events: Event sent to Subscription’s endpoint but received a 4xx or 5xx response.

Dead Lettered Events: Events that couldn’t be delivered to an endpoint and were sent to the Storage Account defined as the dead-letter location.

Destination Processing Duration: Time taken to process the event from Event Subscription to the destination endpoint.

Sending Events to Event grid Topics

Consider a scenario where multiple event grid subscriptions are created for a single event grid topic in a business process. There may be a need to verify whether the events sent to the event grid topic are received by all the event grid subscriptions. Serverless360 provides the capability to send events to event grid topics for this purpose. Events can be sent to the event grid topics associated with a composite application in Serverless360.

The following event properties can be configured for the Event to be sent to the Event Grid Topic.

  • Subject
  • Data Type
  • Event Type

Accessing the failed events of Event Grid Subscriptions

Azure Event Grid is used to build event-driven architectures. Event Grid Topics and Event Grid Subscriptions are involved in event publishing and subscribing to the events.

If, for any reason, the Event Subscription fails to receive the events, a dead-lettering mechanism is available to capture the failed events. Unlike Service Bus dead-lettering, Event Grid stores the failed events in a Storage Blob Container.

Consider a scenario where the destination entity of the Event Grid Subscription is a Service Bus Queue and an Azure Function listens for messages from the Queue. If the Queue is deleted accidentally, delivery of the events fails and they eventually get stored in the Storage Blob Container configured when the Event Subscription was created.

Processing the dead-letter events of Event subscription

Consider a scenario where the endpoint configured for an event subscription was down for an hour and the events sent to it from the event grid topic have been dead-lettered. Once the endpoint is available again, there may be a need to reprocess the dead-lettered events of the event grid subscription.

This can be achieved using Serverless360. The dead-lettered events can be either resubmitted or repaired and resubmitted to the event grid topic.

By default, 20 events will be retrieved from the associated storage account. To fetch subsequent events click to load more.

Resubmit dead-lettered events

Resubmitting the dead-lettered event to an event grid topic preserves all the properties of the event grid event like Id, Subject, Event Type, Event Time, Data Version, Metadata Version, and Event Data.

Repair and Resubmit dead-lettered events

Repairing and resubmitting dead-lettered events can be used when there is a need to modify the event properties. For example, the Event Time of a dead-lettered event can be modified before resubmitting it to the event grid topic.
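
Conceptually, the difference between the two actions looks like the sketch below. This is not how Serverless360 implements them. It is only an illustration that assumes the event is a plain dictionary with Event Grid schema field names, and the helper names are invented.

```python
import copy

# The properties that a plain resubmit is described as preserving.
EVENT_FIELDS = ("id", "subject", "eventType", "eventTime",
                "dataVersion", "metadataVersion", "data")


def resubmit_copy(dead_lettered):
    """Return the event with its original properties and nothing else."""
    return {k: copy.deepcopy(dead_lettered[k]) for k in EVENT_FIELDS if k in dead_lettered}


def repair(dead_lettered, **changes):
    """Return a copy of the event with selected properties replaced."""
    unknown = set(changes) - set(EVENT_FIELDS)
    if unknown:
        raise KeyError(f"not an event property: {sorted(unknown)}")
    event = resubmit_copy(dead_lettered)
    event.update(changes)
    return event


# Hypothetical dead-lettered event, for illustration only.
stored = {
    "id": "42",
    "subject": "employees/1001",
    "eventType": "Contoso.Employee.Created",
    "eventTime": "2019-01-01T00:00:00Z",
    "dataVersion": "1.0",
    "data": {"name": "A. Example"},
    "deadLetterReason": "MaxDeliveryAttemptsExceeded",
}
unchanged = resubmit_copy(stored)
repaired = repair(stored, eventTime="2019-01-02T00:00:00Z")
```

Both helpers leave the stored event untouched, so the original can still be inspected or deleted afterwards.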

Delete dead-lettered events

Once the dead-letter events are processed manually either through resubmission or repair and resubmission, the dead-letter events can be deleted from the configured dead-letter destination.

Pre-requisites for processing the dead-lettered events

The dead-lettered events of an event grid subscription can be processed only if the following requirements are met:

  • The event grid subscription must be configured with a custom event grid topic
  • The storage account container configured as the dead-letter destination of the event grid subscription must have public anonymous access, either to the container or to its blobs

All the reprocess actions on dead-lettered event grid events are captured in the Messaging section of Governance and Audit.

Conclusion

In this blog post, we have taken a detailed look at the Azure Event Grid delivery and retry policy, and at how you can handle or manage dead-lettered events using Serverless360. Stay tuned for more content on the Azure integration space. Happy Learning!

Author: Ranjith Eswaran

Ranjith started his career at Kovai.co and works as a Junior Software Engineer. He lives with the passion - "Don't Give Up".