
Key Contributors: John Stuppy, Min Kim, Laimonas Turauskas, Nathan Marks, Ben Bernard
At Instacart, artificial intelligence (AI) is an accelerant reshaping how we build, scale, and innovate. Whether it’s fostering a company-wide culture of AI-assisted engineering, or bringing a product like Fizz from idea to launch in record time, we’ve embraced AI to work smarter and faster.
This blog explores how AI has unlocked productivity gains across individual projects and at organizational scale, the lessons learned along the way, and how these insights are influencing our engineering culture.
From Grassroots Adoption to Scaled Enablement
Our AI journey started with individual engineers experimenting, using AI copilots as powerful assistants for everything from writing unit tests to navigating legacy services. Informal demos, Slack threads, and “how I use AI” discussions began organically surfacing across teams.
But to make AI adoption consistent, thoughtful, and durable, we needed a structured approach.
Case Study: Project Tomato — Accelerating Workflows with AI
One of our most pivotal internal initiatives has been Project Tomato, a focused effort to explore how AI can meaningfully accelerate engineering workflows in real-world scenarios. We called it Project Tomato because, like fruit in a garden, some AI use cases are ripe for immediate use, while others need more time to mature.
Born within our Commerce organization, Project Tomato explored practical use cases across the engineering lifecycle. From outlining an ERD and generating code to debugging, analyzing logs, and ensuring test coverage, engineers embedded AI tools like Ava, Cursor, and Glean directly into their day-to-day tasks. What we uncovered was a set of high-leverage patterns (ripe tomatoes!) and equally valuable constraints (unripe tomatoes).
✅ Ripe Tomatoes
🚀 AI as a Productivity Companion
One of the early wins came from using agentic AI coding assistants to automate repetitive and often neglected tasks. For instance, when tasked with cleaning up stale feature flags or deprecated logic, the AI agent not only identified unused paths but rewrote the affected class with precision. Using an AI coding agent, we cleaned up 15+ feature flags from a single service in a single PR. Refactoring code, a task often deferred, became faster and safer.
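To make the pattern concrete, here is a minimal sketch of the kind of dead-path removal the agent performed; the flag name and helper functions below are invented for illustration, not our actual code:

```python
# Hypothetical sketch of a stale-flag cleanup; all names are invented.

def feature_flag_enabled(name: str) -> bool:
    """Stand-in for a real feature-flag client; this flag is at 100% rollout."""
    return True

def new_fee_calculator(order: dict) -> float:
    return order["subtotal"] * 0.05

def legacy_fee_calculator(order: dict) -> float:  # dead path once flag hits 100%
    return order["subtotal"] * 0.07

# Before: both branches still guarded by a fully rolled-out flag.
def get_delivery_fee_before(order: dict) -> float:
    if feature_flag_enabled("new_fee_calculator"):
        return new_fee_calculator(order)
    return legacy_fee_calculator(order)

# After: the agent removes the flag check and the unused legacy branch.
def get_delivery_fee_after(order: dict) -> float:
    return new_fee_calculator(order)
```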
🖼️ Design-to-Execution Made Faster
We also saw success in bridging the design-to-dev handoff. Using the image-upload features of AI coding assistants, engineers shared Figma screenshots of UI components. The AI interpreted layouts and generated usable scaffolds, turning static mocks into functional components with impressive fidelity. This accelerated the build process while preserving designer intent.

🐛 Smart Debugging
AI can help debug complex issues by analyzing code and dependencies. For example, we used our AI coding agent to identify why certain error codes weren’t returned from some cross-service API calls. The AI pinpointed that enabling a specific parsing option in one of our libraries would resolve the issue. We also used an AI agent to find “N+1” queries in slow-performing code, and it successfully pinpointed the core issue.
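For readers unfamiliar with the pattern, here is a self-contained sketch of the N+1 shape the agent flagged and the batched query that fixes it; the schema and data are invented for illustration:

```python
# Minimal, self-contained N+1 illustration; table and column names are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE items (id INTEGER PRIMARY KEY, order_id INTEGER, name TEXT);
    INSERT INTO orders VALUES (1), (2), (3);
    INSERT INTO items VALUES (1, 1, 'milk'), (2, 2, 'eggs'), (3, 1, 'bread');
""")

order_ids = [row[0] for row in conn.execute("SELECT id FROM orders")]

# N+1: one query for the orders, then one additional query per order.
for order_id in order_ids:
    conn.execute("SELECT name FROM items WHERE order_id = ?", (order_id,)).fetchall()

# The fix: batch all lookups into a single query with an IN clause.
placeholders = ",".join("?" * len(order_ids))
items_by_order: dict[int, list[str]] = {}
for order_id, name in conn.execute(
    f"SELECT order_id, name FROM items WHERE order_id IN ({placeholders})",
    order_ids,
):
    items_by_order.setdefault(order_id, []).append(name)

print(items_by_order)  # {1: ['milk', 'bread'], 2: ['eggs']}
```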
✍️ Planning and Architecture Support
AI wasn’t limited to coding. In redesigning parts of our invoicing system, engineers worked with our AI coding agent to brainstorm ERDs for migrating tax calculation logic from legacy flows. The back-and-forth interactions generated diagrams, structured schemas, and code snippets — all while allowing engineers to maintain design ownership and judgment.
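To give a flavor of those artifacts, here is a hypothetical structured-schema sketch of the kind such a session might produce; the entities and fields are invented and do not reflect our actual invoicing model:

```python
# Hypothetical invoicing schema sketch; entity and field names are invented.
from dataclasses import dataclass, field

@dataclass
class TaxLine:
    jurisdiction: str
    rate: float           # e.g., 0.0825 for 8.25%
    amount_cents: int

@dataclass
class Invoice:
    invoice_id: str
    subtotal_cents: int
    tax_lines: list[TaxLine] = field(default_factory=list)

    @property
    def total_cents(self) -> int:
        return self.subtotal_cents + sum(t.amount_cents for t in self.tax_lines)

invoice = Invoice("inv_123", subtotal_cents=2000)
invoice.tax_lines.append(TaxLine("CA", 0.0825, 165))
assert invoice.total_cents == 2165
```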
🚨 RCA Handling
AI also proved valuable during incident response. With tools like Ava and Glean, teams were able to rapidly analyze logs and stack traces, draft root cause analysis (RCA) narratives, and even auto-generate “Five Whys” to feed into RCA documentation. The AI coding assistant then helped implement follow-up action items, tightening the loop between identification and resolution.
🏗️ Structuring the Environment for Success
One of the most consistent learnings was just how much the development environment impacted AI performance. AI coding agents were most effective in modular, well-annotated workspaces. Teams that created scoped IDE workspaces focused on a single service, rather than an entire repository, saw improved indexing speed, smarter suggestions, and more reliable outputs.
🤖 Task-Specific Model Thinking
As engineers gained hands-on experience, they began treating different LLMs like specialized teammates. For example, earlier in our AI adoption journey, Claude-3.7 was preferred for planning-oriented tasks like designing ERDs or generating flow diagrams, while Claude-3.5 shined when asked to generate, refine, or review code. Other models were chosen for log analysis or QA tasks. While the specific models have changed as newer, more capable ones become available, the approach remains the same: align each model’s strengths with the right task profile to unlock more accurate, context-aware results, effectively treating LLMs as expert “personas” for different phases of development.
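In practice, this often reduces to a simple routing table. The sketch below is illustrative only; the model identifiers are snapshots of preferences that drift as newer models ship, and the helper is hypothetical:

```python
# Illustrative task-to-model routing; model names are hypothetical snapshots.
TASK_MODEL_MAP = {
    "planning": "claude-3.7",         # ERDs, flow diagrams, architecture notes
    "coding": "claude-3.5",           # generating, refining, reviewing code
    "log_analysis": "another-model",  # placeholder for a log-focused model
}

def pick_model(task_type: str, default: str = "claude-3.5") -> str:
    """Return the preferred model for a task profile, with a safe fallback."""
    return TASK_MODEL_MAP.get(task_type, default)

assert pick_model("planning") == "claude-3.7"
assert pick_model("unknown_task") == "claude-3.5"
```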
⚒️ Prompt Engineering as a Skillset
Another key insight was just how much prompt quality affects AI output: structure, clarity, and context layering all significantly influenced results. Engineers who adopted structured prompting techniques, such as clearly scoping tasks, reusing successful phrasing, or chaining context, consistently achieved better outcomes. This led to an organic sharing of techniques across teams and even to the creation of internal prompt engineering playbooks. As a result, prompt crafting is increasingly seen as a core skill in the AI-assisted developer’s toolkit.
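As a concrete (and entirely hypothetical) illustration, a prompt that scopes the task and layers context might look like this; the service, file, and function names are invented:

```
Context: You are working in the invoicing service (Python). Fee logic lives
in fees/calculator.py; its tests live in tests/test_calculator.py.

Task: Add a unit test covering the zero-subtotal edge case in
calculate_service_fee. Follow the existing pytest fixtures and naming
conventions in the test file.

Constraints: Do not modify production code. Keep the test under 20 lines.
```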
❌ Unripe Tomatoes
⛰️ Context Challenges in a Monolith
Of course, not every task was ready for automation. In legacy systems like a monolithic codebase, AI coding agents struggled to navigate large, interconnected code. Without clear boundaries or modular organization, the AI had to process too much unrelated context, which led to slower suggestions, incomplete answers, or even hallucinated code. In contrast, smaller, well-scoped codebases provided a cleaner working context and produced better outcomes.
📁 Large Files
Even in a well-structured codebase, file size itself could be a bottleneck, leading to incomplete or inconsistent outputs. For example, with basic prompts and minimal guidance, the AI coding assistant successfully converted protobuf option comments to standard // notation in a 500-line file. However, when the same task was attempted on a 5,000-line file (10x larger), the tool repeatedly failed to complete it.
🔄 Code Translation Without Transformation
AI streamlines the translation of existing code into new languages, but it can also carry over flawed logic without addressing it. Our team encountered this during a system rewrite. The experience reinforced the importance of not blindly trusting autogenerated code and of providing clear, explicit instructions before initiating a rewrite.
Project Tomato helped us spot where AI shines within our engineering work. What became clear across all of these use cases is that the real unlock was learning how to use these tools effectively in a complex codebase. AI is like a racecar: with the right skills and a well-paved road, it can achieve incredible speed. But in the jungle of a complex codebase, you first have to learn to drive and clear a path. The lessons we’ve learned continue to shape how we design AI workflows, support teams, and think about productivity at scale. What began as a small experiment is now guiding how we build across the company.
Case Study: Fizz — Building Fast and Smart
While Project Tomato focused on internal engineering enablement and workflow transformation, looking at Fizz as a case study offers a complementary perspective: how AI can accelerate a product from a seed idea to a fully launched customer experience.
A small product group at Instacart had an idea for a new group ordering app for drinks and snacks. Thanks to our AI-first mindset, we went from concept to customer-ready in just a few short months — something that would’ve taken far longer a year ago. What began as a hackathon prototype became a fully functional product through extensive use of AI-assisted workflows.
AI at Every Step: From Prototypes to Polish
Leveraging AI, our team autogenerated UI scaffolding from Figma mocks and rapidly built out complex client-side systems. Tasks that traditionally took several days — such as creating a dynamic search experience for iOS — were completed in hours, with engineers guiding the AI through clear intent and constraints.
Code Reviews and Automation
AI tools like Cursor reduced the burden of code reviews while boosting confidence in debugging and refactoring. Mock data, documentation, and test generation were also streamlined, cutting down repetitive tasks.
Final Refinement
AI even supported creative tasks like generating notification copy. Despite its limitations in precise visual details (e.g., animations and transitions), it allowed us to focus engineering time on the user experience.
Building Fizz showed us just how much AI can accelerate development when used thoughtfully, and where human insight still plays a critical role. We’re continuing to refine our approach as we build products at Instacart.
Best Practices for AI-First Development
Through our experiences with both Fizz and Project Tomato, we’ve surfaced a set of practices that help teams get the most out of AI-assisted development, not just in theory but in production reality.
🧑‍🤝‍🧑 Use AI as a teammate — not a replacement
Think of AI as a capable but inexperienced engineer. Like any teammate, it needs clear instructions and context to perform well. We found that the best results came when engineers framed tasks precisely, reviewed output critically, and iterated in tight feedback loops. Structured prompts and layered context often made the difference between success and confusion.
🌱 Start small to build trust
Adopting AI doesn’t require a full rewrite of your workflow. The most effective teams began with lower-risk tasks like generating test stubs, adding comments, or drafting boilerplate before extending AI into deeper parts of the codebase. These early wins built confidence and helped engineers develop a feel for the tools’ strengths and boundaries.
🧠 Context is everything
The quality of AI output is directly tied to its understanding of the codebase. Modular, well-annotated services with predictable patterns yielded dramatically better results. Conversely, large legacy systems written in untyped languages often created too much ambiguity for AI agents to navigate reliably. Creating clean workspaces, isolating services, and scoping tasks made a meaningful difference.
🎯 Apply AI strategically — not universally
AI doesn’t need to be used everywhere to be effective. We saw the greatest ROI when applying it to high-leverage, well-scoped tasks like prototyping UI components, scaffolding backend services, comparing logs, and writing tests. For complex architectural work or nuanced design decisions, human expertise still leads the way.
📊 Measure impact to focus your efforts
Whenever possible, we tried to quantify the value. For example, in the development of Fizz, we observed up to 20% time savings on frontend workflows like rendering web UI and integrating client logic. These insights help us focus our efforts and prioritize where AI adds the most value, and where human experience is irreplaceable.
Structuring Success: Workshops, Playbooks, and Mindset Shifts
These efforts gave us a set of learnings, but to scale their impact, we had to turn insight into infrastructure. We created structured, company-wide learning programs, like bootcamps and playbooks, to help all functions, not just engineers, adopt and adapt AI tools thoughtfully.
- Bootcamps: Hands-on sessions that guide teams through real use cases, from writing infrastructure scripts to generating QA test cases.
- Playbooks: Living documentation capturing prompt engineering tips, model selection strategies, and feedback techniques.
- Cross-functional involvement: We’re embedding AI across product, design, and data science teams because engineering isn’t the only place where velocity matters.
This systematization ensures that AI adoption isn’t left to chance. It’s built into our culture of continuous learning.
While we’ve explored multiple tools, we believe the tools themselves are just one part of the story. The real unlock comes from a shift in mindset. We’re investing in the idea of the “AI-enabled engineer” who sees AI as a creative partner. We’re fostering a culture where experimentation is encouraged, and engineers are empowered to lead the next wave of productivity.
Final Thoughts
Instacart’s AI journey is still unfolding, but we’re establishing the systems, practices, and mindsets that allow AI to thrive as part of our engineering craft and help us rethink how we build.