Selective State Space Models: Solving the Cost-Quality Tradeoff
As AI is increasingly used in production scenarios, costs are mounting. Are alternative architectures the solution?
There are two big movements in infrastructure software. One is open source, epitomized by Confluent and MongoDB. These are projects that often incubate in large companies, or as hobbies for their talented founders, before blossoming into independent companies. The other is “serverless”, which asks engineers to accept closed-source software in exchange for ease of use, infinite scale and usage-based pricing. Good examples are Snowflake and DynamoDB.
We think of serverless as an interface that greatly simplifies cloud programming. It enables developers to work at higher levels of abstraction and get more done. This is particularly important as more activities “shift left”, increasing developer workloads. Whereas before the job was just writing code, now it might also include building in security or observability. The only way developers can keep up is to operate at a higher level, free from managing virtual machines and other system administration tasks.
That’s why we were so intrigued when a three-line “you guys should talk” email appeared out of the blue one Friday afternoon from our friend, Dan Portillo at TheGP. Dan’s email introduced us to Khawaja Shams and Daniela Miao, co-founders of Momento, who were building an intelligent serverless cache. We cleared our calendar for Saturday and asked to meet immediately. Within the first few minutes of our conversation, it was clear that Khawaja and Daniela were addressing an important problem.
Caching is a huge pain point. It’s complicated to build into an app and often causes other issues. Daniela’s insight is that “it always feels too early or too late to add caching”. Developers defer building it (too early) because of the risk, until they face performance or latency issues (too late), at which point they scramble. As Khawaja put it: what if caching were so easy that anyone could add it within five minutes? Developers could then build caching in from the beginning, with no work or heartache.
Our other takeaway from that first meeting was that this is a special team. Khawaja and Daniela worked together for many years at AWS. Khawaja built AWS’ DynamoDB business from zero to over a billion dollars in revenue. Daniela left to help scale Lightstep, an observability company started by a talented team out of Google, which was later acquired by ServiceNow. They are both deeply technical, with unique insight into how best to solve this problem, and, most importantly, are kind, curious and deeply committed to growing an enduring company.
We see Momento as part of the broader movement to a serverless future. In the old world, users wanted the same computing environment for the cloud as they had on their local machines, to simplify porting their workloads.
In the new world, every app is cloud-native, with developers supported by easy-to-use, infinitely scalable, inexpensive serverless offerings.
Three days after our first meeting, we put forward a term sheet to lead Momento’s seed round. Today, the company is announcing the availability of its product (if you are a developer, give it a try!) along with its $15 million seed, led by Bain Capital Ventures with participation from TheGP and some exceptional operators. We couldn’t be more excited to support Khawaja, Daniela, and team in their mission to build the world’s fastest cache — and add a critical part of the serverless stack.
You can follow Momento on Twitter and LinkedIn and join the Momento Discord.