Microservices vs. Agentic AI (Part 4): Agentic Microservices
This is part 4 of the Microservices vs. Agentic AI series. In the first two parts of this series we laid out the core distinctions between Microservice and Agentic AI architectures. Part 1 traced their separate reasons for being: Microservices emerged to solve the challenges of monolithic scale by decomposing applications along business domain lines. Agentic AI, on the other hand, leverages the recent power of Large Language Models (LLMs) to tackle complex tasks through autonomous reasoning and functional decomposition. This led us to see how their fundamental approaches to autonomy and specialization naturally diverge.
Part 2 then took us into the thick of their runtime behaviors. We contrasted the predictable, deterministic logic common in Microservice communication and state management (using Sagas for eventual consistency) with the intelligent, context-rich flows, operational memory, and inherent non-determinism that characterize Agentic AI systems and their unique pattern needs (and the unique challenges they bring).
Following that, Part 3 led us deep into the operational realities. We considered the demands and challenges of operating agentic systems, compared to the more understood load-based, horizontal scaling of microservices. We discussed the distinct resilience challenges posed by infrastructure failures versus cognitive ones, the critical differences in observability needs (seeing the "why" of AI decisions), and the current state of tooling and the MLOps/LLMOps frontier. We also briefly touched upon the cost structures of both architectures, with a focus on understanding LLM token consumption.
Now that we've established this comprehensive picture of their individual characteristics, from foundational philosophy through runtime execution to operational demands, I'd like to propose something different for this fourth and final article in the series. We're going to explore how these two architectural paradigms can be combined, and what that means for system design. Beyond the technical integration, we'll examine the mindset shifts required from the teams involved. And finally, we'll look towards future considerations, especially the continued operational importance of security and ethics. The goal is to put everything together and offer a complete perspective. Let's hope I can pull that off.
Hybrid Architecture Patterns in Practice
Let me kick this off by saying I am not recommending you build a hybrid system with both microservices and AI agents just so you can “get the benefits of both patterns” or something like that. That's the completely wrong approach to architecture. I'm also not saying that you should never combine them. The purpose of this section is to highlight the areas to pay special attention to if you ever find the need to add agentic capabilities to some parts of a microservices architecture, or vice versa. Basically, to warn you of where things will likely get extra difficult, the points where the complexities of each pattern overlap.
Common Integration Patterns
These are the most obvious ways in which these two patterns would combine:
- **Agent as Intelligent Facade**: The agent provides a natural language (or other intuitive) interface, understanding user intent and translating it into calls to existing backend microservice APIs. Example: A user asks the OmniMart chatbot, "Can I get free shipping on order 12345 to my alternate address?". The agent parses this, calls the `Order Service` API for order details, the `Customer Profile Service` API for addresses, and potentially a `Shipping Policy Service` API (or consults a KB), synthesizes the answer, and responds naturally. Basically, the agent is the user interface, and the microservices are called as tools by that agent instead of (or in addition to) as a backend by a typical frontend.
- **Microservice as Reliable Tool**: Agents offload complex, sensitive, or mission-critical business logic execution to dedicated microservices, treating them as reliable tools. Example: The OmniMart financial planning agent uses a validated `PortfolioRiskCalculationService` microservice instead of trying to implement complex financial math via prompts. The agent focuses on goal setting and strategy; the microservice provides reliable calculation. This looks very similar to the above, but the core distinction is that in Agent as Intelligent Facade you start from the actions (microservices) and then expose them via an agent. In Microservice as Reliable Tool the core of the system is the agent, and microservices just solve things that LLMs are bad at.
- **Event-Driven Trigger**: Microservices emit business events (e.g., using Amazon EventBridge) that trigger agentic processes for deeper analysis or complex follow-up actions. Example: The OmniMart `Fraud Detection Service` (a microservice using programmatic rules) detects a potentially suspicious pattern and emits a `MediumRiskTransactionEvent`. An Investigation Agent subscribed to the topic receives the event, then uses tools to pull data from multiple other services (user history, location data, etc.), performs more complex reasoning, and decides whether to escalate to a human analyst or automatically block the transaction via another tool call. This way we'd be using agents and microservices interchangeably, where components of an Event-Driven Architecture emit events and those are handled by other components, and any component can be a microservice or an agent. One thing to keep in mind is that agents are non-deterministic, and that will complicate distributed transactions.
- **Agent Orchestrating Microservices**: An agent's plan involves coordinating a sequence of actions across multiple microservices. Example: The OmniMart returns agent, after interacting with the customer and approving an exception based on complex logic, calls the `Order Service` API (tool) to mark the item for return and the `Logistics Service` API (tool) to schedule a pickup. This is pretty similar to Microservice as Reliable Tool, but the microservices aren't just potentially useful tools for the agent; the tools are the end goal, and the purpose of the agent is to orchestrate them. It would be like a very smart orchestrator pattern.
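To make the Microservice as Reliable Tool pattern concrete, here's a minimal sketch of how a microservice endpoint typically gets exposed to an agent: a tool description (in the JSON-schema style most agent frameworks use) plus a handler that wraps the service API. All names (`calculate_portfolio_risk`, the response shape) are hypothetical, and the actual HTTP call is stubbed out.

```python
# Sketch: exposing a microservice as an agent tool. Hypothetical names throughout.
# The tool spec is what the LLM sees when deciding which tool to call;
# the handler wraps the actual microservice API.
import json
from typing import Callable

# Tool description in the JSON-schema style common to agent frameworks.
RISK_TOOL_SPEC = {
    "name": "calculate_portfolio_risk",
    "description": (
        "Calculates the risk score of an investment portfolio. "
        "Use this instead of estimating risk yourself."
    ),
    "input_schema": {
        "type": "object",
        "properties": {"portfolio_id": {"type": "string"}},
        "required": ["portfolio_id"],
    },
}

def calculate_portfolio_risk(portfolio_id: str) -> dict:
    """Handler that would call the PortfolioRiskCalculationService API.

    Stubbed here; a real implementation would make an authenticated HTTP
    call to the microservice and return its JSON response.
    """
    return {"portfolio_id": portfolio_id, "risk_score": 0.42}  # placeholder response

TOOL_HANDLERS: dict = {
    RISK_TOOL_SPEC["name"]: calculate_portfolio_risk,
}

def dispatch_tool_call(tool_name: str, arguments: str) -> dict:
    """Executes a tool call emitted by the LLM (tool name + JSON-encoded args)."""
    handler: Callable = TOOL_HANDLERS[tool_name]
    return handler(**json.loads(arguments))

result = dispatch_tool_call("calculate_portfolio_risk", '{"portfolio_id": "p-123"}')
print(result["risk_score"])  # → 0.42 (the stubbed value)
```

The key property: the LLM never does the math. It only chooses the tool and fills in arguments; the deterministic microservice does the calculation.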
Challenges
While all of the above sounds like a great idea to implement, it comes with a few operational challenges:
End-to-End Observability: Tracing a logical operation becomes significantly harder when it crosses between the agent's reasoning steps (often opaque) and multiple microservice calls (potentially asynchronous). Correlating logs and metrics across these two types of components requires meticulous instrumentation, context propagation (passing trace IDs), and potentially specialized tooling beyond the standard distributed tracing offered by AWS X-Ray alone. Debugging failures that span the boundaries of microservices and agents is going to be particularly challenging, even more so than debugging failures that just span multiple microservices.
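The context-propagation part can be sketched simply: mint one correlation ID when the user request enters the agent, tag every reasoning-step log with it, and attach it as a header to every tool call so the microservices log the same ID. The header name below is illustrative; a production system would use the W3C Trace Context `traceparent` header so X-Ray or OpenTelemetry can stitch the spans together.

```python
# Sketch: one correlation ID shared by the agent's reasoning loop and every
# microservice call it makes, so logs on both sides can be joined.
import contextvars
import uuid

# Holds the correlation ID for the current logical operation.
correlation_id = contextvars.ContextVar("correlation_id")

def start_agent_request() -> str:
    """Called once when a user request enters the agent; mints the shared ID."""
    cid = str(uuid.uuid4())
    correlation_id.set(cid)
    return cid

def tool_call_headers() -> dict:
    """Headers every tool (microservice) call should carry.

    Illustrative header name; real tracing would use W3C Trace Context
    ('traceparent') for compatibility with X-Ray / OpenTelemetry.
    """
    return {"X-Correlation-Id": correlation_id.get()}

def log_reasoning_step(step: str) -> str:
    """Agent-side log line tagged with the same ID the microservices will log."""
    return f"[{correlation_id.get()}] agent: {step}"

cid = start_agent_request()
print(log_reasoning_step("selected tool: Order Service"))
print(tool_call_headers())
```

With this in place, grepping any log store for the correlation ID returns both the agent's reasoning trail and the microservice spans for the same user request.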
Security Context Propagation: How is the user's identity or authorization context securely passed from the initial interaction with the agent through its subsequent calls to backend microservices? Mechanisms like secure token exchange or carefully managed IAM permissions for agent tools are needed, adding complexity. The key problem here is that, as I've said many times in this series, agentic AI is still a new thing, and we don't have the generally accepted patterns and shared best practices that we can count on for microservices.
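In the absence of established patterns, one defensible approach is to never hand the agent the user's full credentials: the deterministic harness exchanges them for a minimal token per tool call, scoped to the intersection of what the user can do and what that tool needs. The sketch below is illustrative only (hypothetical scope names, a dict standing in for a signed token); a real system would use an actual token exchange such as OAuth 2.0 Token Exchange (RFC 8693) or scoped IAM credentials.

```python
# Sketch: narrowing the user's security context before handing it to a tool.
# All names are hypothetical; the returned dict stands in for a signed token.
from dataclasses import dataclass

@dataclass(frozen=True)
class UserContext:
    user_id: str
    scopes: frozenset  # what the *user* is allowed to do

# What each tool needs, independent of who is asking.
TOOL_REQUIRED_SCOPES = {
    "order_service.get_order": {"orders:read"},
    "order_service.cancel_order": {"orders:write"},
}

def scoped_token_for_tool(user: UserContext, tool_name: str) -> dict:
    """Issues a minimal 'token' for one tool call.

    The agent never sees the user's full credentials; each tool call gets
    only the intersection of what the user has and what the tool needs.
    """
    required = TOOL_REQUIRED_SCOPES[tool_name]
    if not required <= user.scopes:
        raise PermissionError(f"user lacks scopes {required - user.scopes} for {tool_name}")
    return {"sub": user.user_id, "scopes": sorted(required), "aud": tool_name}

user = UserContext("u-42", frozenset({"orders:read"}))
print(scoped_token_for_tool(user, "order_service.get_order"))
```

Note that the authorization check happens in deterministic code, outside the LLM: even if the model hallucinates a tool call the user isn't entitled to, the harness refuses it.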
Deployment Dependencies & Contracts: Changes to a microservice API used as a tool directly impact the agent. This requires careful versioning, communication between teams, and potentially updating the agent's tool description, prompts, or even retraining its ability to use the tool correctly. The annoying part here is that changes that don't add capabilities, e.g. a reduction in latency, would typically not affect the caller of a microservice, but if the caller is an AI agent then these aspects matter in tool selection, and need to be reflected in the tool's description. We've inadvertently introduced more coupling than what we're used to when dealing with APIs, and broken a fundamental tenet of microservices: callers only depend on the exposed interfaces and contracts, not on implementation details.
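To see why this extra coupling shows up, consider what a tool description actually contains. The sketch below (hypothetical field names) puts a latency note and a contract version directly in the description the LLM reads for tool selection: if the microservice team halves their latency, the tool description, and therefore the agent deployment, should change too.

```python
# Sketch: a tool description that surfaces non-functional traits the LLM
# uses for tool selection. Field names are illustrative; the point is that
# latency and versioning now live in the contract the agent depends on.
ORDER_LOOKUP_TOOL = {
    "name": "get_order_details",
    "contract_version": "2.3.0",  # bump when the microservice API (or behavior) changes
    "description": (
        "Fetches full order details from the Order Service. "
        "Typical latency: ~200ms. Prefer this over broader search tools "
        "when you already have an order ID."
    ),
    "input_schema": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}

def description_mentions_latency(tool: dict) -> bool:
    """A CI-style guardrail: keep non-functional notes present in the description."""
    return "latency" in tool["description"].lower()

print(description_mentions_latency(ORDER_LOOKUP_TOOL))
```

A traditional API consumer would never notice a latency improvement; an LLM choosing between two similar tools might pick differently because of that one sentence.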
Consistency Management: If an agent orchestrates actions that modify state across multiple microservices, ensuring overall transactional integrity becomes very challenging. Implementing a Saga pattern driven by the agent's logic is complex and potentially fragile compared to service-level Sagas (remember that LLMs are non-deterministic!). Careful design is needed to avoid leaving backend systems in inconsistent states if the agent's workflow fails mid-process. To be fair, a perfect distributed system would already be designed to recover from any kind of failure, so we already know how to handle this. But of course, no system is really perfect, and those imperfections might become more evident with LLM hallucinations having a much higher probability of occurrence than network failures.
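One way to keep the Saga deterministic even when the plan comes from an LLM: let the agent propose the steps, but have a non-LLM harness execute them, recording a compensating action for every completed step and running the compensations in reverse if a later step fails. This is a minimal sketch with all service calls stubbed as log appends; step names are hypothetical.

```python
# Sketch: compensation-based consistency for an agent-orchestrated workflow.
# The deterministic harness (not the LLM) records a compensation for each
# completed step and rolls back, newest first, if any later step fails.
from typing import Callable, List, Tuple

def run_agent_saga(steps: List[Tuple[Callable, Callable]]) -> bool:
    """Each step is (action, compensation). Returns True if all steps committed."""
    completed = []
    for action, compensation in steps:
        try:
            action()
            completed.append(compensation)
        except Exception:
            # Undo everything the agent already did, newest first.
            for comp in reversed(completed):
                comp()
            return False
    return True

log = []

def fail_pickup():
    raise RuntimeError("Logistics Service timeout")  # simulated downstream failure

steps = [
    (lambda: log.append("order: marked for return"),
     lambda: log.append("order: return cancelled")),
    (fail_pickup,
     lambda: log.append("logistics: pickup cancelled")),
]

ok = run_agent_saga(steps)
print(ok)   # → False: the second step failed
print(log)  # → shows the first step and then its compensation
```

The compensations themselves are plain deterministic code, so a mid-process failure (or a hallucinated step that errors out) can't strand the backend in a half-committed state, at least not any more than a service-level Saga could.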
People and Process: The Socio-Technical Shift
This section is intended to help those who already work with microservices and would like to add AI capabilities to their software. It's based on my experiences at Caylent building really interesting Generative AI applications, and a few discussions with some friends who tried to build GenAI teams.
Evolving Team Structures and Skillsets
Microservice development often relies on cross-functional teams blending at least backend and infrastructure, and ideally with some decent domain expertise. Building sophisticated agentic systems requires a broader skillset across the team:
- **AI/ML Expertise**: The obvious skills here are model understanding, evaluation, and optimization. But the real AI/ML experts (not just GenAI experts) also bring a lot of relevant experience. Remember that GenAI and LLMs are just a special flavor of ML, and while the uses are different, a lot of the building blocks are similar enough. Look out especially for MLOps expertise, and knowledge of how to prepare and manage data. If you get LLMOps, you've hit the jackpot.
- **Prompt Engineering**: This sounds completely trivial now that we have reasoning models, and fortunately that ridiculous hype of “everyone will just be a prompt engineer” died really quickly. You don't need a prompt engineer. But you do need someone who understands how prompting works, including prompt caching, cost optimization, and other prompt-related concerns. At least one person in the team should have a good idea about this.
- **Software and Cloud Engineering**: Do not try to build software without software engineers, and do not deploy to the cloud without cloud engineering. I know a lot of brilliant people who I'm pretty sure could code an LLM from scratch, but who don't know how AWS Lambda works or how to achieve high availability and scalability. Generative AI applications need generative AI skills, but they also need application skills.
- **Data Science/Engineering**: This is a bit more situational, and will depend on the exact skillset of your AI/ML experts and how you manage data. Creating and using a Knowledge Base should be easy enough with GenAI expertise, even the very fine details like choosing the embeddings size, optimizing the vector database, or fine-tuning the embeddings model. However, if you need to do some data processing, you'll need someone with data science and/or data engineering skills. Any cloud engineer can stand up a data pipeline on AWS, and any software engineer can write the ETL code (it's just the MapReduce pattern). But without the right conceptual and domain knowledge you'll likely be processing useless data and reaching wrong conclusions.
- **Domain Expertise**: This is always super important. But in my experience, it's even more important for Generative AI applications. We're collectively still discovering where GenAI is most useful, and where it sounds useful on paper but isn't reliable or cheap enough yet. I work at a consulting company, so helping companies figure out the business impact of our work is second nature at this point. But I've found that while building blindly is always bad, building GenAI stuff blindly is even worse than usual. Btw, I dumped a lot of thoughts on this in the 2025 Outlook on GenAI whitepaper I co-wrote with our head of AI.
- **UI/UX Design**: In many cases, AI is redefining the way users interact with software. We've often traded screens filled with buttons for a chatbot that provides easier access to the same functionality. Sometimes that's not the best decision. On some occasions you need a combination of both, with either a dashboard on screen and a chat to ask more questions, or a chatbot that can output graphs instead of just text. Sometimes we're just making backend changes but the whole expectation shifts a bit (remember non-determinism!!). And sometimes we want to do away with screens entirely, e.g. with Speech-to-Speech models. There are no clear patterns yet, so you need someone capable of understanding and, especially, designing how users interact.
Smaller teams will have a hard time including all the skills I listed above, and you're going to need to make compromises. When you do, be aware of what you'll need and why, and don't hesitate to rely on consultants or fractional resources, whether for a few hours a day or for just a few weeks.
Organizations that need to field teams like this multiple times will likely gravitate towards centralized expertise, just like we already do with cloud or security: Cloud Center of Excellence, Security Operations Center, platform engineering with best practices. I believe that's a good investment for the long term. However, for the shorter term the practice that I've seen have the biggest impact is to create a culture of sharing knowledge. Many times an engineer with some knowledge can get you 80% of the way there, and you can bring in the experts for a few hours to help push the rest of the way. Moreover, sharing the lessons from different projects helps everyone grow. So right now I'd put my focus there: Build cross-functional teams as good as you can, and create a culture of sharing knowledge across the entire company.
Conclusion
And with that, our deep dive comparison into Microservice and Agentic AI architectures comes to an end. Across these four parts, we've journeyed from their foundational principles and distinct origins (Part 1), through their contrasting runtime dynamics concerning communication, state, and predictability (Part 2), into the practical operational realities of scale, resilience, observability, tooling, deployment, and cost (Part 3), and finally, explored the integration points and challenges in hybrid systems and what all of this means for the humans involved (Part 4, this one).
While both architectural styles leverage decomposition and operate as distributed systems, they are fundamentally different tools designed for different primary purposes. Microservices offer a mature, robust paradigm for structuring large applications around business domains, optimizing for engineering lifecycle agility, operational scalability, and reliability through well-understood patterns and a rich tooling ecosystem. Agentic AI provides a powerful, though operationally less mature, paradigm for automating complex tasks, enabling autonomous reasoning and action, and creating intelligent interactions by orchestrating LLMs, tools, and knowledge.
This series was never intended as a guide for choosing one over the other as if they were direct competitors for every problem. My initial goal was to uncover useful patterns and lessons from microservices that could be applied to Agentic AI, under the assumption that since both patterns are based on decomposition, there would be a lot of similarities (in fact, I said multiple times that agents were just LLM-based microservices). I've failed at that goal: I found a lot more differences than similarities, and few transferable lessons.
But even if I failed at my original goal, I hope exploring both patterns, with their respective strengths, weaknesses, unique characteristics, and operational demands, still yielded some value to you. My hope is that understanding these nuances allows you to avoid misapplying principles from one domain to the other, to set realistic expectations, and, most importantly, to identify opportunities where their strengths can be combined.
As LLMs keep percolating through the software world, I don't think I'm too crazy if I predict that the future of many complex systems will likely be hybrid, blending the deterministic reliability of microservices for core business logic and data management with the adaptive intelligence of agents for sophisticated interaction, automation, and reasoning. So this should all be useful as we move towards that future.