Event-driven architecture: all you need to know
There are many ways for services to communicate with each other. You could use a REST API, GraphQL, SOAP, or even a shared database. They all have their pros and cons (especially the last one), and they all share the same problem: coupling. No matter the direction of communication, a little bit of knowledge about the other service is always embedded in your application: knowledge that, most of the time, you definitely don’t need. Is there any way to avoid such a horrible thing?
Problem-solved in 1, 2, 3
Let’s imagine that we are building a platform for extracting data from files downloaded from S3. Sounds pretty straightforward: just create three modules, one for downloading the file, a second for parsing it and a third for saving data into the database. Not exactly rocket science. Guess what? The scope of the project changed. I know, that’s extremely surprising, but after you’ve regained your composure from the shock, your manager asks you to add another functionality.
After parsing the file, you have to send the result to another service so that it can be used for marketing purposes. Ok, another easy task: your first idea was to create an API for them to call, but you can’t. The files are delivered irregularly, parsing could take even a few hours, and the Marketing Department needs results as soon as possible, so they would have to call your API every 50 milliseconds or so for an update. You don’t want this kind of traffic. It also seems like a waste of resources, so you decide to call their REST API and call it a day. After all, it’s just a little bit of coupling. What could go wrong?
After some time, it appears that your little service is becoming really popular in your department. Every few weeks, someone asks you to send the parsing results to their service. Obviously, no one wants to use the REST API, and everyone needs reports NOW: every second is important. That’s annoying but relatively easy: just iterate over a list of recipients, and that’s it. Sure it is... But what would happen if one of the services were unavailable for a few hours? Would you remember which reports should be resent when it comes back? Or would the reports be useless after such a delay? Does every service have the same policy? You don’t know, and you don’t want to know. So you decide that it’s time for a little revolution in your department.
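To make the problem concrete, here is a minimal sketch of the naive "iterate over a list of recipients" approach. The recipient names and the `send` callable are hypothetical stand-ins for real REST calls to each team's service; nothing here comes from an actual codebase.

```python
from typing import Callable

# Hypothetical recipient list: every new consumer means another entry here.
RECIPIENTS = ["marketing", "billing", "analytics"]


def send_report_to_all(
    report: dict,
    recipients: list[str],
    send: Callable[[str, dict], None],
) -> list[str]:
    """Push the report to every recipient and return the ones that failed."""
    failed = []
    for recipient in recipients:
        try:
            send(recipient, report)
        except ConnectionError:
            # The publisher now has to decide: retry? store? drop?
            failed.append(recipient)
    return failed


def flaky_send(recipient: str, report: dict) -> None:
    """Stand-in for a REST call; pretends that one service is down."""
    if recipient == "billing":
        raise ConnectionError(f"{recipient} is unavailable")


failed = send_report_to_all({"file": "report.csv"}, RECIPIENTS, flaky_send)
print(failed)  # ['billing']
```

The parsing service ends up owning a list of failures and a retry policy for every consumer, which is exactly the knowledge it should not need.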
What is the event-driven architecture?
Instead of direct communication between services, we will create a centralized Message Bus that will allow us to communicate via events. Before actually going into details, let’s see what an event really is. An event is a piece of information, most often in JSON format, that describes a change in state. What could that be? A new file was uploaded, a file was parsed, a file was corrupted and parsing failed, the validation of a file was completed successfully: every change of state could be an event.
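As an illustration, an event for a parsed file could be modelled as below. This is only a sketch: the field names (`type`, `payload`, `id`, `occurred_at`) and the sample file path are assumptions made for the example, not a standard schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from uuid import uuid4


@dataclass(frozen=True)
class Event:
    type: str      # what happened, named in the past tense
    payload: dict  # details of the state change
    id: str = field(default_factory=lambda: str(uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialize the event so that it can be handed to a broker."""
        return json.dumps(asdict(self))


event = Event(type="FILE_PARSED", payload={"file": "reports/2023-01.csv", "rows": 1280})
print(event.to_json())
```

The event states a fact about the past and says nothing about who should react to it.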
But why do we even need those events? Who cares that the file was downloaded? How will they use this information? You don’t know, you don’t have to, it’s not your concern, and that’s the beauty of it. You communicate without talking to anyone. Think of it as a utopia for introverts. Of course, you still need to determine what should be emitted as an event. You can achieve that with the help of Event Storming.
The Middle Man
You don’t have to talk to anyone, yet you inform the world about the domain change. So how does it work? Most of the time, the answer is a Message Broker. It’s just another application that acts as a middle man between a publisher (the service that generates events) and a consumer (someone/something that acts on those changes).
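The graph of dependencies will change from one where your service calls every recipient directly to one where your service and every recipient talk only to the Message Broker.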
Now your service is only responsible for downloading and parsing files. Obviously, you still have to take care of delivering events to the broker, but that’s an entirely different topic that heavily depends on the message broker that you use. Message Brokers allow you to publish events under a specific subject/topic (every Message Broker has a different name for it, but essentially it’s a form of categorizing events) and other services can subscribe to that subject.
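The publish/subscribe idea can be sketched without any real broker. The `InMemoryBus` class below is a toy written for this article: it is neither durable nor networked, and its method names are assumptions rather than any broker's API.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class InMemoryBus:
    """Toy stand-in for a message broker; not durable, not networked."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, subject: str, handler: Handler) -> None:
        self._subscribers[subject].append(handler)

    def publish(self, subject: str, event: dict) -> None:
        # The publisher never learns who (if anyone) is listening.
        for handler in self._subscribers.get(subject, []):
            handler(event)


bus = InMemoryBus()
received_by_marketing: list[dict] = []
received_by_billing: list[dict] = []

bus.subscribe("FILE_PARSED", received_by_marketing.append)
bus.subscribe("FILE_PARSED", received_by_billing.append)

bus.publish("FILE_PARSED", {"file": "report.csv", "rows": 1280})
bus.publish("FILE_CORRUPTED", {"file": "broken.csv"})  # nobody subscribed
```

Both subscribers receive the `FILE_PARSED` event, while the `FILE_CORRUPTED` event is simply dropped because no one asked for it.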
Let’s take a look at an example of publishing and subscription in NATS, one of the most popular message brokers out there.
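The original snippet is not available here, so the following is a minimal sketch of publishing with the `nats-py` client for Python. It assumes a NATS server running at `nats://localhost:4222`; the `build_payload` and `publish_file_parsed` helpers and the payload fields are made up for this example.

```python
import asyncio
import json


def build_payload(file_name: str, rows: int) -> bytes:
    """Serialize the parsing result; NATS payloads are raw bytes."""
    return json.dumps({"file": file_name, "rows": rows}).encode("utf-8")


async def publish_file_parsed(
    payload: bytes, server: str = "nats://localhost:4222"
) -> None:
    # Imported lazily so this module loads even without nats-py installed.
    import nats

    nc = await nats.connect(server)           # create a connection
    await nc.publish("FILE_PARSED", payload)  # publish under a subject
    await nc.drain()                          # flush pending messages, then close


def main() -> None:
    """Entry point; needs a reachable NATS server (assumed to be local)."""
    asyncio.run(publish_file_parsed(build_payload("report.csv", 1280)))
```

Calling `main()` with a server running publishes one message under the `FILE_PARSED` subject.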
We create a connection, call a publish method, and that’s it! Your message with the result of parsing should be delivered to a message broker and be available to all subscribers interested in the “FILE_PARSED” event.
How would one create such a subscription? It is not a particularly difficult task:
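Again as a sketch with the `nats-py` client, under the same assumption of a local server at `nats://localhost:4222`. The `handle_file_parsed` and `listen` names are hypothetical, and the handler only decodes and prints the event.

```python
import asyncio
import json


def handle_file_parsed(data: bytes) -> dict:
    """Decode the event body; a real consumer would act on it here."""
    return json.loads(data)


async def listen(server: str = "nats://localhost:4222") -> None:
    # Imported lazily so this module loads even without nats-py installed.
    import nats

    nc = await nats.connect(server)

    async def on_message(msg) -> None:
        # Called by the client each time a message arrives on the subject.
        result = handle_file_parsed(msg.data)
        print(f"Received on {msg.subject}: {result}")

    await nc.subscribe("FILE_PARSED", cb=on_message)
    await asyncio.Event().wait()  # keep the process alive until interrupted


def main() -> None:
    """Entry point; needs a reachable NATS server (assumed to be local)."""
    asyncio.run(listen())
```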
To receive an event, all you have to do is call the “subscribe” method and provide the subject’s name that interests you. In this case, your callback will be called each time someone publishes a message under the “FILE_PARSED” subject.
The 3 benefits of event-driven architecture
Let’s go through the list of benefits of adopting an event-driven way of communication.
Loose Coupling
That’s a no-brainer. Because services talk to a Message Broker instead of to each other (regardless of the direction of communication), they no longer need to know about each other. That makes the system much more flexible.
Scalability
Event-driven architecture is scalable by nature. If the event processing throughput is too low, all you have to do is start additional consumer instances to handle spikes in traffic. That works both ways: when the additional resources are no longer needed, you can easily shut them down.
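The effect of adding consumers can be simulated in-process. The sketch below uses an `asyncio.Queue` in place of a broker subject, with several workers competing for events; the function names, consumer names and the 10 ms processing delay are invented for the example. Real brokers provide this natively (NATS, for instance, calls it queue groups).

```python
import asyncio


async def consumer(
    name: str, queue: asyncio.Queue, handled: dict[str, list[int]]
) -> None:
    """Take events off the shared queue one at a time, forever."""
    while True:
        event_id = await queue.get()
        await asyncio.sleep(0.01)  # simulate slow processing
        handled[name].append(event_id)
        queue.task_done()


async def process(event_count: int, consumer_count: int) -> dict[str, list[int]]:
    queue: asyncio.Queue = asyncio.Queue()
    handled: dict[str, list[int]] = {
        f"consumer-{i}": [] for i in range(consumer_count)
    }
    workers = [asyncio.create_task(consumer(name, queue, handled)) for name in handled]

    for event_id in range(event_count):
        queue.put_nowait(event_id)

    await queue.join()      # wait until every event has been processed
    for worker in workers:  # scale back down once the spike is over
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return handled


handled = asyncio.run(process(event_count=12, consumer_count=3))
total = sum(len(events) for events in handled.values())
print(total)  # 12
```

Each event is handled by exactly one consumer, so raising `consumer_count` shortens the total processing time without any change to the publisher.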
Flexibility
Adding new blocks to the process is pretty straightforward: just create a new subscription, and you can start your work. And that’s the true value of software: how fast you can adapt.
And 3 drawbacks
Complexity
For bigger systems, it can be quite a challenge to reason about the entire process. If one of the modules needs to obtain data from multiple services, that might create a complex dependency graph.
Data Consistency
Because of its asynchronous nature, event-driven architecture makes transactional actions somewhat challenging to handle, and maintaining data consistency across multiple services might be problematic.
Single Point of Failure
When one service in a large system goes down, the rest of the system keeps working. When the Message Broker goes down, it’s the end of the world. Event-driven architecture introduces a significant threat to your ecosystem: a single point of failure. When services can’t communicate with each other, nothing gets done. Of course, most message brokers support high-availability deployments, but that requires a more complex setup.
In a nutshell
An event-driven architecture is an excellent choice if you want to achieve low coupling between your services and high scalability and flexibility of your system. Of course, there is much more to consider when choosing it: the durability of events, flow tracking when the change of state has to be emitted as one, the ability to replay messages, and the list goes on.