Beyond Personalization: How Aampe's AI Agents Are Changing Customer Relationships

Paul Meinshausen on building agentic infrastructure, reinforcement learning, and creating technology that thinks long-term.

COO @ DataRoot Labs

11 May 2025

Paul Meinshausen is the CEO and Co-Founder at Aampe, a San Francisco-based AI company changing user engagement through agentic AI infrastructure. Founded in 2020, Aampe deploys reinforcement learning agents that autonomously adapt to user preferences, delivering hyper-personalized experiences across various industries, including food delivery, fintech, and entertainment. These agents manage up to 200 billion personalized decisions weekly, enhancing user retention and engagement at an unprecedented scale.

In December 2024, Aampe secured $18M in a Series A funding round led by Theory Ventures, bringing its total funding to $27.3 million.

Aampe applies AI agents that continuously learn and improve according to the personal preferences of each user. How is this agentic infrastructure distinct from other personalization methods?

Agentic AI Infrastructure works by instantiating an agent designed to develop and maintain a long-term relationship with an end-user.

Most earlier personalization methods operate at the level of a single, discrete decision. The decision might maintain a reference to some memory of previous decisions (although these are usually aggregated memories), but these methods could generally be described as stateless.

An example of why this matters: a certain campaign - let's call it Campaign A - is set up, and a personalization method is used to determine which action out of a set of Actions (1 through N) is chosen. Then a second campaign is set up in a different context - call it Campaign B - and the personalization method is again used to determine which action out of a set of Actions (1B through NB) is chosen. With historical personalization methods, the system cannot orchestrate across the two campaigns. The decision for Campaign A is blind to the decision for Campaign B - there is no meta-decision process. This leads to a major governance problem, with no ability for the system to “think” at a higher and more strategic level.

Aampe's agents operate at the user level, across all marketing and product contexts and materials. So, each agent is aware of every possible Action a given User is eligible for (according to strategic guidelines), and the agent then manages decisions across contexts. And agents can coordinate between themselves, allowing them to efficiently explore all possible actions in all the relevant contexts.
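To make the contrast with per-campaign decisions concrete, here is a minimal sketch of a user-level decision. It is an illustration only, not Aampe's implementation: the `UserAgent` class, the `daily_cap` rule, and the score values are all hypothetical. The point it shows is that one decision is made across every campaign the user is eligible for, and that a shared constraint can be enforced because the agent remembers what it already did.

```python
from dataclasses import dataclass, field


@dataclass
class UserAgent:
    """One agent per user; it sees every campaign the user is eligible for."""
    user_id: str
    daily_cap: int = 1                      # hypothetical governance rule
    sent_today: list = field(default_factory=list)

    def decide(self, eligible: dict[str, list[str]], scores: dict[str, float]):
        """Pick at most one action across ALL campaigns, not one per campaign.

        eligible maps campaign name -> candidate actions;
        scores maps action -> the agent's current estimate of its value.
        """
        if len(self.sent_today) >= self.daily_cap:
            return None                     # cap reached: stay silent
        candidates = [(scores.get(a, 0.0), c, a)
                      for c, actions in eligible.items() for a in actions]
        if not candidates:
            return None
        _, campaign, action = max(candidates)
        self.sent_today.append((campaign, action))
        return campaign, action


agent = UserAgent("user-123")
eligible = {"campaign_a": ["a1", "a2"], "campaign_b": ["b1", "b2"]}
scores = {"a1": 0.10, "a2": 0.30, "b1": 0.25, "b2": 0.05}

first = agent.decide(eligible, scores)   # best action across both campaigns
second = agent.decide(eligible, scores)  # blocked by the shared daily cap
```

A stateless, per-campaign method would have returned one action for Campaign A and another for Campaign B. Here the second call returns nothing, because the agent carries state for the user across contexts.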

What drove you and your co-founders to begin Aampe in 2020, and how has the company's mission evolved since its inception?

The archaeology of startups is a tenuous science, but I'll say the seeds of Aampe's core vision go back about as far as I can remember. I've always been deeply intrigued by the way we make decisions and try to navigate the radical complexity of the world we inhabit. Technology, and specifically software and mobile devices, always seemed the best way to both understand how we make decisions and to improve how we make them. Aampe is the accumulated and ongoing effort of my co-founders' and my interest in better technology for a better world.

For the last decade and more, companies have adopted machine learning, expecting it to transform their business decisions. Too often, they just ended up with dashboards and dubious forecasts instead of actionable levers for growth. For years, data science has been stuck looking backward—analysing historical data to find patterns, make predictions, and extract general rules. There was way too much attention on data, and not nearly enough on the science that generates useful data.

Surprisingly few data scientists have a background in experimental methods. Experimentation actively tests different actions in the real world to see what drives outcomes. The real power of data emerges from running experiments - lots of tests, way more than however many A/B tests you think is a reasonable amount - and measuring counterfactuals. You can't know whether it's a good idea to message someone on Wednesday at 6 pm unless you've messaged them at 6 pm, and messaged them a bunch of other times, and seen the relative benefit. There are ways to accelerate that learning and reduce the communication overhead - you can use more traditional ML methods to infer the counterfactuals for one user based on how those counterfactuals look for other, similar users - but you can't get away from the need to generate data on the counterfactuals.
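One simple way to picture "inferring the counterfactuals for one user based on similar users" is a shrinkage estimate. The sketch below is an assumption made for illustration, not a description of Aampe's method: `shrunk_rate` and its `prior_strength` parameter are hypothetical names, and the numbers are made up.

```python
def shrunk_rate(user_successes, user_trials, peer_rate, prior_strength=10.0):
    """Blend a user's own response rate with the rate among similar users.

    prior_strength is a hypothetical tuning knob: the number of "pseudo-trials"
    the peer rate is worth. With no data on the user, the estimate equals
    peer_rate; as the user's own trials accumulate, their data dominates.
    """
    return (user_successes + prior_strength * peer_rate) / (user_trials + prior_strength)


# Never messaged at Wednesday 6 pm: fall back entirely on similar users.
no_data = shrunk_rate(0, 0, peer_rate=0.20)
# Messaged 40 times at that slot with 20 responses: own data dominates.
some_data = shrunk_rate(20, 40, peer_rate=0.20)
```

The estimate for the unmessaged user is only a starting guess. It still has to be tested by actually sending messages, which is the point made above about generating data on the counterfactuals.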

That was our key insight, early on. It’s not about being “data-driven” in general. It’s not about running A/B tests. It’s about turning every interaction with every user into a data factory. That’s actually what we do as humans in our everyday interactions: when I talk to you, I look at your body language, listen to your tone of voice, follow your eye movements, and so on. Just by having a functioning brain, I’m constantly generating data from your reactions that I attach to the “treatment” of what I’m saying to you, and I use that to adjust my behaviour, which then adjusts the next data I create. That’s how we manage human relationships. All we’ve done with our agents is give computers the ability to mimic that capability. That’s the core of what we do.

Businesses don’t win by perfectly understanding the past; they win by making better decisions about the future.

The main way Aampe has evolved is by consistently expanding the surfaces (or channels) in which our agentic learners can operate. We started with a purposeful and narrow focus on outbound communication, especially push notifications. We pretty quickly extended to surfaces like SMS, WhatsApp, email, and then even in-app messaging. As we demonstrated success, we’ve extended into the actual work of managing a product. A web page, an app screen, a product description, search results - again, these are all just surfaces where you can render content to deliver value to users, so they can deliver value to you as a company. And by syncing all of these surfaces together through shared agentic infrastructure, each surface makes all the other surfaces better. What you learn from your emails should inform your product pages. What you learn from your subscription pop-up should change how your push notifications work.

Aampe applies reinforcement learning to enable continuous, parallelized experimentation. Can you discuss the benefits and challenges of using this approach in consumer use cases?

You have to define three things in order to operate reinforcement learning: an agent, a set of actions the agent can choose from, and an environment that rewards some actions more than others. Until just a few years ago, almost everything in RL dealt with environments you could know completely, simple action sets, and agents that could run through the environment thousands or even millions of times and learn which actions to take in which situations. Think of chess - you can see the whole board, each piece can only move in certain ways, and there are established rules for taking turns, capturing pieces, and so on.

But that kind of RL isn't all that useful for businesses. Businesses never know the whole environment - the stuff you don't know about your customers is always greater than the stuff you do know, no matter how much data you've collected. Action sets for customer engagement are always multidimensional - whereas with chess you just need to know the number of spaces you can move and what direction to move in, for customer engagement you have to consider day of week, time of day, frequency of interaction, channel, offering (products, incentives), framing (value proposition, tone of voice), and much more. And you don't get to run through an interaction with a customer thousands of times. Each shot you take is the only shot you get in that moment.
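A quick sketch of why multidimensional action sets are hard: every dimension multiplies the number of distinct actions. The dimension names follow the list above, but the specific values are hypothetical and deliberately few.

```python
from itertools import product

# Hypothetical dimension values; a real deployment would have many more.
dimensions = {
    "day_of_week": ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"],
    "time_of_day": ["morning", "midday", "evening", "late_night"],
    "channel": ["push", "email", "sms"],
    "offering": ["new_arrivals", "discount", "reorder"],
    "tone": ["playful", "direct", "informative"],
}

actions = list(product(*dimensions.values()))
n_actions = len(actions)   # 7 * 4 * 3 * 3 * 3 = 756
```

Even this toy setup yields 756 combinations, and each user only ever experiences a handful of them, so treating every combination as an independent option to test is not practical.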

Bandit algorithms are better in some ways than narrow reinforcement learning and can handle this kind of “online” environment, but bandits typically don't perform well enough with multidimensional action sets, and they don't personalize at all - even contextual bandits optimize for segment performance, not individual performance.

So when we started Aampe, we had to ignore a lot of the RL work done up to that point, because it only worked for far more bounded problems or toy problems, and we had to adapt the bandit approach until we reached a point that we weren’t doing bandits anymore. To move from adaptively optimizing for a segment to adaptively optimizing for an individual user, we had to adapt several established methods for situations that they originally hadn’t been designed for. For example, we needed to work out a network-traversal logic for agents to weight every event in an app’s event stream, so agents could pursue goals (purchases, video watches, game plays, etc.), but still be able to assess their actions even if the user didn’t do the goal event.
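The interview does not describe the network-traversal logic itself, so the following is only one plausible reading, offered as a sketch. It assumes events are weighted by how few observed steps separate them from the goal event. The function `event_weights`, the `decay` parameter, and the session data are all hypothetical.

```python
from collections import defaultdict, deque


def event_weights(sessions, goal, decay=0.5):
    """Weight each event by how few observed steps separate it from the goal."""
    # Reverse adjacency: for each event, which events were seen directly before it.
    preceded_by = defaultdict(set)
    for session in sessions:
        for earlier, later in zip(session, session[1:]):
            preceded_by[later].add(earlier)

    # Breadth-first search backwards from the goal gives shortest step counts.
    distance = {goal: 0}
    queue = deque([goal])
    while queue:
        event = queue.popleft()
        for prev in preceded_by[event]:
            if prev not in distance:
                distance[prev] = distance[event] + 1
                queue.append(prev)

    # Events that never lead to the goal get no weight at all.
    return {event: decay ** steps for event, steps in distance.items()}


sessions = [
    ["app_open", "search", "view_item", "add_to_cart", "purchase"],
    ["app_open", "view_item", "app_close"],
    ["app_open", "search", "app_close"],
]
weights = event_weights(sessions, goal="purchase")
```

With weights like these, a user who reaches `add_to_cart` after a message but never purchases still produces a partial signal, which matches the stated aim of assessing actions even when the goal event does not occur.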

Probably one of the biggest conceptual breakthroughs was when we realized we could adapt Difference in Differences methods. Those methods were developed in Economics, Political Science, and Public Health, and they've historically been used to study macro-trends like policy impacts or environmental regulations. Nothing about the method screamed “personalization”, but it turned out to be an excellent way to assess the impact of an individual touchpoint - a message delivered, an app screen viewed - upon an individual user. There was - and still is - a lot of work to do on defining baseline expectations, comparing pre-treatment to post-treatment evidence, and so on, but all of that work is on our plate only because at some point we realized that an econometric method could be adapted to create individualized statistical distributions for as many actions as we wanted to track, and once we have those individualized distributions, we can use standard bandit methods - Thompson Sampling, for the most part - to build individualized contexts.
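The combination described here can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Aampe's implementation: the Normal belief with known noise, the `ActionBelief` class, the action names, and the activity counts are all hypothetical. It shows a textbook Difference in Differences estimate feeding a per-user, per-action distribution, followed by a standard Thompson Sampling choice.

```python
import random


def did_effect(user_pre, user_post, baseline_pre, baseline_post):
    """Difference in Differences: the user's change minus the expected change."""
    return (user_post - user_pre) - (baseline_post - baseline_pre)


class ActionBelief:
    """Running Normal belief about one action's effect on one user."""

    def __init__(self, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
        self.mean, self.var, self.noise_var = prior_mean, prior_var, noise_var

    def update(self, effect):
        # Standard conjugate Normal update with known observation noise.
        precision = 1.0 / self.var + 1.0 / self.noise_var
        self.mean = (self.mean / self.var + effect / self.noise_var) / precision
        self.var = 1.0 / precision

    def sample(self, rng):
        return rng.gauss(self.mean, self.var ** 0.5)


def thompson_choose(beliefs, rng):
    """Draw once from each action's belief and play the highest draw."""
    return max(beliefs, key=lambda action: beliefs[action].sample(rng))


rng = random.Random(0)
beliefs = {"evening_push": ActionBelief(), "morning_push": ActionBelief()}

# Hypothetical activity counts (events in a window before and after a touchpoint).
effect = did_effect(user_pre=2, user_post=5, baseline_pre=2, baseline_post=3)
beliefs["evening_push"].update(effect)   # effect == 2
choice = thompson_choose(beliefs, rng)
```

Because the choice comes from a random draw rather than the current best estimate, actions with little evidence still get tried occasionally, and they are tried less often as the evidence against them accumulates.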

That's maybe the main difference we see between the agentic AI we're building and the agentic AI we see others starting to offer: most agents, especially those based on LLMs, input context and output actions. Our agents - inspired by RL, if not following the same methods RL has historically emphasized - input actions and output context. Using that context to look up or generate user-facing content is a relatively straightforward task.

With over 100 million AI agents executing and managing up to 200 billion decisions per week, how does Aampe ensure scalability and efficiency in its processes?

With truly great engineers dedicated and devoted to maintaining the infrastructure that our business partners use to drive ambitious growth. Our first partners were in India and Southeast Asia, and that meant we were necessarily exposed to tremendous scale from our earliest days. It's just a lot more normal and expected for consumer applications to have tens of millions of monthly or even weekly and daily active users in Asia than it is in the US and Europe, and that was a problem we had to face from day 0. Being born in the fire of extreme scale builds up a lot of resilience in your infrastructure and development operations.

Can you mention specific cases where Aampe's technology significantly enhanced customer user engagement and retention?

Aampe's two primary vectors of value are user actions and our customer-partners' return on operational effort. I'll share a couple of examples.

One of Aampe's customer-partners is a mobile fitness app with a subscription business model. Their traditional marketing campaigns were oriented around conventional messaging to drive subscriptions: a lot of "calls to action" to subscribe, and some discounts and promos to help incentivize users to subscribe.

When they deployed Aampe, we helped them realize there was no reason to have just a few message templates that exclusively featured traditional marketing language. Instead, they could populate a fairly massive library of potentially relevant topics and all kinds of diverse tones and value propositions, and other modes of communication.

Rather than generate that content from scratch, we pointed out that they already had a thoughtful blog and a lot of content that had been developed primarily for SEO purposes. Unfortunately, all of that rich content was hiding away in screens that most users never came across. So we helped them pull all of that content in and transform it with their Aampe agents so that their agents could actively experiment and use that content to add value to their users. When they did that, they pretty quickly saw a substantial increase in subscriptions happening in response to the content. As it turns out (and it should make abundant sense), people thinking about fitness would much rather receive practical tips for improving their fitness or read thoughtful content about their fitness practice than receive an Nth push notification asking them to “subscribe now!”.

Commerce businesses have more complex ways they drive up their customer lifetime value. They can drive more purchases in a given vertical or category; drive more purchases across categories (cross-sell, leading to higher wallet share); and drive larger purchases ("basket-size"). Another case was a food delivery app that sent most of their notifications around lunch and dinner time because that's when they observed most of their orders. They didn't send any notifications later at night because they feared those notifications would annoy the majority of their customers, and they didn't have a good basis to determine which customers would appreciate nudges late at night.

Aampe let them remove their hard-coded rules for when notifications were sent. Their Aampe agents started experimentally sending small numbers of notifications at later times and then gradually scaled up such that 4-7% of the notifications sent on 3 to 5 days of the week were sent between 9 pm and midnight. And a significant portion of those notifications led to the recipients ordering late at night. Even more interestingly, only about 30% of those customers had ever ordered late at night before, which means you couldn't have identified late-night ordering customers only from historical data. Instead, you had to help those customers discover the use case of a late dinner or late-night snack, and you had to do that without annoying a lot of customers who don't want to order anything late at night. There are only so many lunches and dinners a given person is going to order. So the spend in the use case of lunch and dinner is not elastic past a certain point. But some customers will spend more if they discover a new use case, like a late-night snack. Using an agentic AI infrastructure helped this food delivery business grow its wallet share for those customers by expanding its usefulness for those customers.

Aampe's co-founders (Left to right: Schaun Wheeler, Kate Field, Sami Abboud, Paul Meinshausen)

In the era of data privacy concerns, how does Aampe reconcile user data protection with personalized experiences?

Every business has a relationship with each of its customers. The relationship is reciprocal and bilateral, such that each customer has a relationship with the business. The relationship depends on trust.

The vast majority of the concerns that users have with their data privacy and protection result from situations where the relationship surreptitiously becomes tri- or multi-lateral; where businesses try to use customer data that they procured from some third business, or where they try to sell or benefit from sharing their customer data with some third party or business.

When a fashion retail business tries to learn things about you from what you do in and with TikTok, or when your food delivery app tries to sell your data to a fast food chain, many customers view that as an invasion of their privacy and trust.

Aampe is designed on the premise that businesses do not need to do this. Their direct interactions with their customers, through their app or website, and their notifications and emails, are more than enough to learn what they need to know about their customers and to serve their customers more effectively than ever before. But those businesses need the technology that lets them continually turn their interactions with customers into high-value data, and then use that data effectively to deliver even more high-value interactions.

Aampe does not need to reconcile user data protection with personalized experiences, because we know that technologically there is no need for data protection and personalization to diverge at all - they can and should be mutually reinforcing.

How does Aampe's solution integrate into a business's existing marketing and product toolkit, and what is the typical onboarding process like?

Most of today's conventional marketing and product toolkits do a pretty good job of servicing the mechanics of effective communication and digital surface management. A MarTech customer engagement platform knows how to maintain reliable addresses for a given user's device-app-profile and reliably deliver a message to that device-app-profile at scale. Businesses have invested a lot of time, effort, and energy into implementing those platforms. The platforms just don't know what to put in that message, or when to deliver it.

So, rather than reinvent the delivery wheel, we designed Aampe's agents to be able to operate the delivery mechanisms of those platforms. They can operate them a lot more rapidly, efficiently, and effectively than human teams, especially because they are operating them under the management of those teams.

When a new customer-partner deploys Aampe into their tech stack, they don't have to rip and replace all their tooling. Instead, they can primarily integrate their existing tools with Aampe via APIs and various forms of data pipelines.

The complexity of that process has a lot to do with the complexity of the business's current MarTech stack. If it's just a couple of tools, then they can usually have Aampe running in a couple of hours, or even under an hour. If the business is a large enterprise with layers of legacy tools, many of them siloed from each other, then it can be a more gradual process of implementing Aampe. In those cases, we recommend that our customer-partners connect Aampe leanly into just parts of their stack and then gradually extend their Aampe agents’ operating footprint more broadly. Usually, a great byproduct of this process is that they can clean up a lot of their tools and end up with a far leaner, simpler, and more efficient technology stack.

How do you envision the function of agentic AI evolving in the next five years, particularly within the realm of personalized user experiences?

We're entering an interesting time - full of opportunity, and just as full of risk. Right now, "agentic AI" means too many different things. We see three categories of offerings:

Old stuff repackaged as new. There are a lot of AI offerings that are essentially A/B testing, multi-armed bandits, or machine learning, even simple reinforcement learning. These are all the same offerings that existed before, but they're now brought under a new label, and they're integrated just a little bit better than they used to be. These are all the same tools that offered some business value over the last decade, but didn't offer nearly as much value as they claimed to. If this kind of offering becomes the face of Agentic AI, then businesses are going to quickly grow disillusioned with Agentic AI.
Narrow LLM agents. There's a lot of hype right now around the idea of having large language models make a lot of decisions. This road also leads to disillusionment. LLMs have the wrong, and at best inefficient, architecture to tackle most real-world business problems. LLMs deal well with "kind" problems: problems where there are clear rules, and consistent and clear feedback. Most business problems aren't like that. An LLM can power a chatbot that returns results when a user asks for a blue shirt for a formal occasion, but that same LLM is dead in the water if the user abandons the session. LLMs can't act unless they are fed context in the form of prompts. Only kind problems have scripts.
Agentic learners. This is what Aampe provides. An agentic learner can efficiently handle "wicked" problems - situations where the rules constantly change, or there are no rules, and where feedback is incomplete, partially incorrect, or absent. That's the real world that businesses operate in. The challenge isn't to ingest context - LLMs can do that well enough. The challenge is to build context. That requires constant, adaptive experimentation at scale. It requires an abstraction layer that translates "this one message had an impact" to "this time of day has a high chance of impact" or "this value proposition has a high chance of impact." And it requires not an agentic product, but agentic infrastructure - the connectivity with other systems within the business to handle coordination, alignment, and governance without needing a human to hand-hold.
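The abstraction layer mentioned in the third category can be pictured as credit assignment from messages to the attributes they carry. The sketch below is a deliberately naive illustration, not Aampe's method: it assumes every message is tagged with attribute labels and has a measured impact score, and it simply averages impact per attribute value. The function name, tags, and scores are hypothetical.

```python
from collections import defaultdict

# Hypothetical log: each message is tagged with attributes and an impact score
# (for example, a Difference in Differences estimate for that touchpoint).
observations = [
    ({"time_of_day": "evening", "value_prop": "convenience"}, 1.0),
    ({"time_of_day": "evening", "value_prop": "savings"}, 0.0),
    ({"time_of_day": "morning", "value_prop": "convenience"}, 0.5),
    ({"time_of_day": "morning", "value_prop": "savings"}, -0.5),
]


def attribute_impact(observations):
    """Average message-level impact over every attribute value it carried."""
    totals = defaultdict(lambda: [0.0, 0])
    for attributes, impact in observations:
        for name, value in attributes.items():
            totals[(name, value)][0] += impact
            totals[(name, value)][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


impact = attribute_impact(observations)
```

Learning at the attribute level means a new message that reuses a known time of day or value proposition does not start from zero. It inherits what earlier messages with those attributes revealed.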

If, over the next five years, "Agentic AI" comes to mean "agentic learners" (and it's Aampe’s mission to ensure that happens), then we're going to see an explosion in company productivity and an explosion in individual human creativity. The goal is for agents to take on a lot of the tedious, tactical decisions that current customer engagement systems force human operators to focus on. If we can take that off the human team members' plates, then those humans can focus more on strategy. That holds the potential to create a virtuous cycle, where agents give humans the headspace and opportunity to be more creative, and where that creativity feeds better strategies and better content to agents who can then deliver to end-users.

Important copyright notice © DataRoot Labs and datarootlabs.com, 2025. Unauthorized use and/or duplication of this material without express and written permission from this site’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to DataRoot Labs and datarootlabs.com with appropriate and specific direction to the original content.