How do Y Combinator founders use AI?

Evolution (2020–2025): From Analog to AI Master

Author’s Note: Over the last five years, artificial intelligence has rapidly transformed software development. What began as simple autocomplete suggestions has blossomed into AI “pair programmers” capable of generating entire codebases. We will analyze the evolution of AI in coding from 2020 to 2025, with a special focus on the tools and workflows that today’s Y Combinator (YC) startup founders use to move faster. We’ll explore how AI changed the coding landscape and developer roles, highlight specific 2025 tools (Cursor, WindSurf, OpenAI Codex, DeepSeek R1, Claude “Sonnet” 3.5, etc.), and share real quotes and practices from YC founders.

We also compare these tools’ functionalities (code generation, debugging, codebase indexing, reasoning capabilities, etc.), comment on adoption and impact, and offer practical recommendations for new founders building AI-native engineering teams. Finally, we discuss how the developer’s role is shifting from traditional engineer to high-level product thinker, and the challenges that remain (debugging, reasoning, scaling from zero-to-one to one-to-n).

AI-assisted coding has gone from a novelty to the norm in many startups.

How Y Combinator Founders Use AI: From Autocomplete to AI Pair Programmers (2020–2023)

In 2020, the idea of AI writing significant amounts of code was mostly speculative. Early tools like TabNine (launched in 2019) used deep learning to autocomplete code, but they were limited to single-line suggestions. This began to change with OpenAI’s GPT-3 (mid-2020), which, while a general language model, surprised developers by generating workable code from plain English prompts. Enthusiastic programmers started experimenting with GPT-3 in the OpenAI Playground, crafting prompts to produce functions or even simple apps. These early experiments hinted at a future where coding could be more about describing what you want than manually typing everything.

The real inflection point came in mid-2021 with OpenAI Codex, a descendant of GPT-3 fine-tuned for coding tasks. Codex powered GitHub Copilot, launched as a technical preview in June 2021. Copilot was essentially an “AI pair programmer” integrated into editors like VS Code, autocompleting chunks of code or entire functions based on context. Developers could write a comment describing a desired function, and Copilot would suggest the implementation. This was a game-changer – by 2022, Copilot had millions of users and was generating a significant share of code in languages like Python and JavaScript.

GitHub reported that developers using Copilot were 55% more productive and that about 40% of the code they checked in was AI-generated (and accepted without modification). In other words, roughly two-fifths of new code for those users was written by AI by 2023. Such statistics underscored that AI assistance was not just a gimmick; it was becoming a mainstream productivity booster. Refer to: (95% AI-written code? Unpacking the Y Combinator CEO’s developer jobs bombshell - LeadDev)

Throughout 2022 and 2023, other tech giants and startups introduced their own AI coding aids. Amazon CodeWhisperer and Google’s AI Codey (in Google Cloud) emerged as alternatives. Replit built an AI assistant into its cloud IDE. Open-source communities released models like CodeGen and PolyCoder, and later Meta’s Code Llama (2023) provided powerful code generation capabilities to all. Developers also started using general AI chatbots for coding help – when ChatGPT launched in late 2022 (based on GPT-3.5, then GPT-4 in 2023), programmers flocked to it for debugging advice, code review, and generating snippets on the side.

By mid-2023, asking ChatGPT “Why is my React state management buggy?” or “Write a Python script to parse this data” was normal. These chatbots were not integrated into editors by default, but they offered something new: deeper reasoning about code. A developer could paste an error trace or ask for design ideas, and models like GPT-4 would provide thoughtful analysis, not just boilerplate.

Crucially, the quality of AI-generated code improved over this period. Early on, models often produced incorrect or insecure code unless the prompt was very detailed. But each model iteration brought gains. By 2023, GPT-4 and Anthropic’s Claude (another large language model) could solve complex algorithmic challenges, generate unit tests, and even suggest architectural improvements. This convergence of improved code generation and reasoning ability set the stage for a new kind of development workflow.

As YC Managing Partner Jared Friedman observed,

“with the release of new AI models that are better at coding, developers are increasingly using AI to generate code”. The concept of an “AI pair programmer” shifted from novelty to necessity for competitive developers.

In summary, 2020–2023 saw AI transition from simply finishing your line of code to meaningfully collaborating on your project. By late 2023, many developers were treating AI as a core part of their toolkit (A quarter of startups in YC's current cohort have codebases that are almost entirely AI-generated | TechCrunch) – using autocompletion for routine code, leveraging chatbots for debugging and design advice, and drastically reducing the time needed to go from idea to working code. This evolution laid the groundwork for what came next: an explosion of AI-native development tools and new workflows embraced by the most cutting-edge software teams.

The AI-Native Coding Revolution (2024–2025)

Around 2024, a tipping point was reached. AI assistance in coding moved from early adopter phase to early majority, especially in startup ecosystems. Y Combinator’s Winter 2025 batch provides a striking example: a quarter of the startups in YC’s current cohort have codebases that are 95% AI-generated. In other words, these founders estimate that only 5% (or less) of their code is typed by human hands – the rest is produced by AI. And we’re not talking about non-technical founders who need AI because they can’t code. As Friedman emphasizes, these teams are led by “highly technical” people, fully capable of coding things themselves. Yet in 2024–2025, even great engineers are choosing to let AI do most of the coding work, because it’s faster and it frees them to focus on higher-level problems.

What enabled this dramatic leap? In short: better models, better tools, and a new mindset. On the model front, 2024 brought Claude 3.5 “Sonnet” (Anthropic’s highly capable coding model with a 200K-token context window) and OpenAI’s o1-preview (a model geared towards deeper reasoning for code). These models made AI coding more reliable and context-aware than ever. One YC partner noted that, relative to six months prior (mid-2024), Claude Sonnet 3.5 was “still actually being used,” but newer “reasoning models” (like OpenAI’s o1 and others) were quickly catching up in popularity. By late 2024, many teams had access to models that could not only generate code quickly, but also perform non-trivial reasoning: following logic, understanding intent, and even reading multiple files to fix a bug.

Hacker News:

"Today, almost 50% of code is written by AI"

Equally important was the emergence of AI-native developer tools – integrated development environments (IDEs) and assistants built from the ground up to leverage these models. Two names that came up repeatedly in YC are Cursor and WindSurf. These are next-generation coding tools that go beyond the plug-in approach of Copilot. Cursor (an AI code editor) became “by far the leader” among the Summer ’24 YC batch. Founders love that it deeply integrates chat-based code generation into the editing workflow.

One founder, Yoav from CIX, said:

“I write everything with Cursor. Sometimes I even have two windows of Cursor open in parallel and I prompt them on two different features.”

This anecdote captures the new “AI-native” workflow: the developer is orchestrating multiple AI instances to build features concurrently, something essentially impossible for a single human coder before. It’s like having several junior engineers who never sleep, each working on a different task; the senior engineer can spawn them on demand and guide them with natural language instructions.

Right on Cursor’s heels, WindSurf emerged as a strong alternative, described as a “fast follower.” The key difference? WindSurf indexes your entire codebase and uses that to inform its answers.

As YC partner Jared explains,

“The number one reason people are switching is that Cursor today largely needs to be told what files to look at … WindSurf indexes your whole codebase and is pretty good at figuring out what files to look at on its own.”

For a large project, it’s a big deal – it means the AI can automatically draw on relevant modules or configs when generating code, without the user having to manually copy them into the prompt. This semantic code search ability is something traditional Copilot lacks, and it resonated with teams working on sizable codebases.

Meanwhile, many developers still kept ChatGPT (especially GPT-4) in their toolkit for tasks requiring heavy-duty reasoning or long-form answers. Founders report using ChatGPT’s interface to ask debugging questions or get architectural advice, taking advantage of GPT-4’s prowess in logical reasoning. Interestingly, in late 2024 a few adventurous coders even experimented with self-hosting large models or using open-source alternatives like DeepSeek R1.

DeepSeek R1 is an open-source 2025-era LLM known for strong reasoning and a large context (128k) – YC folks mentioned it as a “viable contender” that some were using, especially those handling sensitive code who preferred not to rely on cloud APIs. And on the horizon, there were reports of teams trying out Google’s Gemini model (which at the time boasted the longest context window) by feeding “their entire codebase into [Gemini’s] context window” to attempt one-shot bug fixes. This didn’t always work reliably, but it shows how developers were pushing the envelope, leveraging massive context lengths to give AI a holistic view of their project.
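Whether “the entire codebase” fits into a context window is mostly arithmetic. The sketch below is a rough budget check, not any vendor's tokenizer: the 4-characters-per-token ratio is only a common rule of thumb, and the two-file `repo` and the helper names are hypothetical.

```python
# Rough check: would a whole codebase fit in a model's context window?
# ASSUMPTION: ~4 characters per token is a rule of thumb; real tokenizers differ.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(files: dict[str, str], context_window: int, reserve: int = 0) -> bool:
    """Return True if all files, plus a reserve for the prompt and reply, fit."""
    total = sum(estimate_tokens(body) for body in files.values())
    return total + reserve <= context_window


# Hypothetical two-file repository, used only for illustration.
repo = {
    "app/models.py": "x" * 400_000,
    "app/views.py": "y" * 200_000,
}

print(fits_in_context(repo, 128_000))                      # 128k-token window
print(fits_in_context(repo, 1_000_000, reserve=10_000))    # 1M-token window
```

Under these assumptions the repository comes to about 150,000 tokens, so it overflows a 128k window but fits comfortably in a 1M-token one, which is why the longest-context models attracted this kind of experiment.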

Perhaps the biggest change from 2020 to 2025 is the mindset of developers. AI is no longer seen as a crutch or novelty – it’s viewed as “the dominant way to code”, and those not on board risk being “left behind”. This quote from YC CEO Garry Tan encapsulates the industry’s sentiment: “This isn’t a fad. This isn’t going away. This is the dominant way to code. And if you are not doing it, you might just be left behind.” Founders have fully “given in to the vibes,” to borrow a phrase from AI guru Andrej Karpathy. In fact, Karpathy coined the term “vibe coding” in early 2025 to describe this new approach: “a new kind of coding … where you fully give in to the vibes, embrace exponentials, and forget that the code even exists.” It sounds whimsical, but YC’s leadership agrees it captures something real about the state of software development. Instead of meticulously managing every line, developers trust the AI to handle low-level implementation, focusing on the idea or “vibe” of what they want to create.

The results of this AI-native approach are staggering in productivity terms. As one YC founder (of TrainLoop) put it, “6 months ago [we had] a 10× speed up; one month ago to 100x speed up – exponential acceleration. … I’m no longer an engineer, I’m a product person.” This may sound hyperbolic, but even if taken loosely it means tasks that used to take days might now take hours. Multiple founders echoed that sentiment. They feel like they’ve effectively scaled themselves:

“AI tools make everyone a 10× engineer”, as the founder of Outlet said, which means “human taste is now more important than ever.”

In other words, when raw coding ability is amplified by AI, the differentiator becomes what you build and why, rather than how fast you can type or how intimately you know a framework’s API.

To illustrate how far we’ve come, consider this: some new programmers entering startups in 2025 never learned to code the “old way.”

YC partners have observed a “generation” of founders who

“learned to code in the last two years, so they actually don’t know a world where Cursor didn’t exist.”

These individuals often have strong analytical backgrounds (math, physics, etc.) and can think in systems, but they skipped spending years mastering syntax and debugging by hand. Instead, they think of problems and use AI to manifest those solutions in code. They are “incredibly productive” – AI writes almost the entire application for them – and they can build impressive products because they understand the problem domain and how to guide the AI. This trend is reminiscent of how high-level programming languages opened software development to people who weren’t electrical engineers; now AI is opening it to people who aren’t traditional software engineers, potentially broadening the pool of who can create software.

However, this revolution also comes with new challenges and responsibilities, which we’ll discuss in later sections. First, let’s delve into the concrete tools and workflows YC founders are using in 2025, and how these AI tools differ in functionality.

The New Arsenal: AI Coding Tools Used by YC Founders in 2025

By 2025, developers have a rich arsenal of AI-powered tools at their disposal. YC startup founders tend to be early adopters, often using a combination of multiple AI tools to cover different needs.

Here we provide an overview of the key tools and models, focusing on those specifically mentioned by YC founders, and what each is best at:

Tool/Model	Description & Strengths	Notable Use by YC Founders
Cursor (AI Editor)	An AI-integrated code editor that uses Anthropic’s Claude 3.5 “Sonnet”. It offers interactive code generation within your IDE through chat-based prompting and inline suggestions. Strength: Fast code generation with a large context window (although it requires manual file guidance).	“I write everything with Cursor… I even run two windows in parallel on different features” – Used to multitask development efficiently.
WindSurf (AI Coding Tool)	An AI coding tool that pre-indexes your entire codebase and automatically brings in relevant context when generating code. Strength: Ideal for large projects and refactoring across multiple files without manual guidance on context.	Many founders switched to WindSurf because it “figures out what files to look at on its own,” simplifying complex code changes.
GitHub Copilot (OpenAI Codex/GPT)	A pioneering autocomplete assistant integrated into popular editors. Strength: Provides seamless inline code suggestions and, with Copilot X, improved chat capabilities powered by GPT-4.	Remains widely used for generating boilerplate code – reports indicate it produces about 40% of code for many users.
ChatGPT (GPT-4)	A general AI chatbot available via web/API that excels in reasoning, debugging, and offering broad technical insights. Strength: Superior for complex problem-solving, though not directly integrated in IDEs.	Often used to solve tricky bugs, generate algorithm insights, or explain error logs – serving as an external “senior engineer” advisor.
Anthropic Claude 3.5 (“Sonnet”) and Claude 2	Large-context models specialized for coding, frequently used behind the scenes in tools like Cursor and Replit Ghostwriter. Strength: Fast, up-to-date, and reliable for generating bulk code.	Powers many of the day-to-day coding tools used by YC founders, ensuring quick and accurate code generation.
DeepSeek R1	An open-source model from 2025, approximately at GPT-4 level, featuring a 128k context window. Strength: Self-hostable and cost-efficient for large context tasks; excellent at reasoning and code analysis.	Employed by privacy-conscious founders who prefer running models locally to analyze their entire codebase.
Devin (Auto-Coder)	An autonomous coding agent that generates, runs, and iteratively fixes code with minimal intervention. Strength: Aims to handle entire coding tasks autonomously.	Considered experimental – mentioned but generally “not used for serious features” due to challenges with large codebases.
Google Gemini	A next-generation model by Google featuring a very large context window. Strength: Has the potential to analyze entire systems at once, offering improved reasoning over large codebases.	Tested by some founders by feeding entire repositories to see if it can one-shot fix bugs; not yet mainstream but highly anticipated.
Others (Codeium, Phind, etc.)	Includes tools such as Codeium (free AI autocomplete), Phind (AI search and code assistant), Replit Ghostwriter, and Amazon CodeWhisperer. Strength: Fill niche requirements, like cost-efficiency or on-prem needs.	Often used as alternatives or supplementary tools when specific constraints arise (e.g., budget or compliance).

Table 1: A comparison of notable AI coding tools/models in 2025. This table provides a snapshot of the major AI coding tools and models used by YC founders today. These tools integrate AI seamlessly into everyday coding workflows, allowing teams to blend fast code generation with deep reasoning. In practice, developers often mix and match these capabilities—using an AI editor powered by Claude for quick code scaffolding, switching to GPT-4 via ChatGPT for tackling complex algorithms, or leveraging WindSurf’s comprehensive codebase indexing for large-scale refactoring. Many YC startups emphasize the importance of keeping human oversight in the loop; while AI tools are powerful, they’re not “set-it-and-forget-it.” Instead, they are integrated as active collaborators that augment, but don’t replace, expert judgment.

Cursor: An AI-integrated code editor (imagine VS Code infused with AI in every corner). Cursor became the go-to daily driver for many YC engineers in 2024. It provides an in-editor chat where you can ask for code implementations, refactoring, or explanations, and it will modify or create files accordingly. Notably, Cursor uses Anthropic’s Claude 3.5 “Sonnet”, a model that is exceptionally fast and capable of processing a large amount of text at once (a “large context window”). This means it can review more lines of code in one go, handle sizeable files, understand broader code structures, and generate coherent, contextually accurate code quickly, drawing on up-to-date knowledge of APIs. Cursor excels at code generation (you can literally ask, “Implement a function to do X,” and it writes it in your codebase) and in-line suggestions similar to Copilot. A current limitation, as noted, is that you typically need to specify which files or sections of code it should consider; it doesn’t automatically read your entire project (yet). Despite that, it’s loved for its responsiveness and deep IDE integration. Founders often keep a conversation thread for each feature they’re working on. As mentioned, some even run multiple instances: one founder prompts two Cursor windows in parallel to develop different features side-by-side – effectively parallelizing software development, which is a novel workflow enabled by AI.

WindSurf: A rising star in late 2024, WindSurf is an AI coding assistant whose hallmark feature is a global codebase index: it scans and indexes your entire repository ahead of time. This enables the tool to automatically identify and pull in relevant sections of code from anywhere in your project when it answers a query. So instead of you having to specify which file or section the AI should focus on, WindSurf already “knows” your codebase and can provide more accurate, context-aware suggestions. The benefit is obvious: if you ask WindSurf’s AI, “Add a new route similar to X,” it already knows where in your codebase X is defined and how, without you explicitly pointing it out.

YC insiders note that this is “the most important” reason some switched to WindSurf. It can automatically determine relevant files (say, models, views, controllers in an MVC framework) and include them in context. WindSurf uses powerful models under the hood as well (likely GPT-4 or Claude; specifics vary). It is particularly good for large-scale refactors or feature additions that span multiple files. If you’re working in a big monorepo, WindSurf might feel more “aware” than other tools. On the flip side, WindSurf may be a bit slower or more resource-intensive due to all that indexing, and if your project is small, the advantage is less pronounced. As of early 2025, both Cursor and WindSurf are evolving rapidly – they’re “fast moving” products and essentially leap-frogging each other with new features every few months. Many teams actually have both at their disposal and use whichever suits a task (e.g. Cursor for quick experiments, WindSurf for repo-wide changes).
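To make the idea of a pre-built codebase index concrete, here is a deliberately tiny sketch. It is not how WindSurf works internally (the article does not say); it assumes a plain inverted index from identifiers to files, and the `project` contents and function names are hypothetical.

```python
import re
from collections import defaultdict

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_index(files: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercased identifier to the set of files that mention it."""
    index: dict[str, set[str]] = defaultdict(set)
    for path, source in files.items():
        for token in IDENTIFIER.findall(source):
            index[token.lower()].add(path)
    return index


def relevant_files(index: dict[str, set[str]], query: str, limit: int = 3) -> list[str]:
    """Rank files by how many distinct query words they contain."""
    scores: dict[str, int] = defaultdict(int)
    for word in set(IDENTIFIER.findall(query.lower())):
        for path in index.get(word, ()):
            scores[path] += 1
    # Sort by score (descending), then by path, so results are deterministic.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return ranked[:limit]


# Hypothetical three-file project, used only for illustration.
project = {
    "routes/orders.py": "def list_orders(request): return Order.objects.all()",
    "models/order.py": "class Order: pass",
    "utils/dates.py": "def parse_date(value): return value",
}
index = build_index(project)
print(relevant_files(index, "add a route like list_orders for Order"))
```

The index is built once, ahead of time, and each request then selects its own context files, so the user never has to name them. A production tool would more plausibly use embeddings or a syntax-aware index; the keyword overlap here only illustrates the workflow.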

OpenAI Codex / GitHub Copilot: The old guard still has a role. Copilot (powered by Codex and now GPT-4) remains widely used, especially for developers working in traditional IDEs or on teams that adopted it earlier. It provides great inline code suggestions as you type and can speed up writing boilerplate tremendously. By 2025, GitHub Copilot has an improved version called Copilot X which introduced a ChatGPT-like assistant in your IDE and can participate in pull request reviews. However, many YC founders feel Copilot’s capabilities are a subset of what Cursor or WindSurf offer, since those newer tools effectively include Copilot-like autocomplete plus more interactive chat. One advantage Copilot has is its tight integration with GitHub and editors like VS Code/Visual Studio, making it a low-friction tool for many developers. It also benefited from continuous improvements; for example, Microsoft announced that Copilot users would gain the ability to choose models like GPT-4 or Claude for generation. Still, in founder surveys, Copilot (based on Codex) was not the top mention by 2025 – likely because those on the cutting edge had gravitated to more specialized tools or model-agnostic platforms.

ChatGPT (OpenAI GPT-4): Even with many purpose-built coding tools, ChatGPT remains invaluable for many developers – particularly for research, debugging, and conceptual help. YC founders reported using ChatGPT (often GPT-4 via the web interface or API) when they needed more powerful reasoning or broader knowledge than their coding-focused tools could provide. For instance, if an error message pops up that they don’t understand, they might paste it into ChatGPT and ask for an explanation or a solution. Or if they need to integrate an unfamiliar API or write an algorithm, ChatGPT can explain the steps or even generate pseudocode. It’s essentially the “Stack Overflow + Rubber Duck + Senior Engineer” amalgam for many.

The downside is that ChatGPT is outside the IDE context, which means copying code back and forth. However, plugins and editors have appeared to bridge this gap (for example, VS Code plugins that let you chat with GPT-4 about your code). Also, GPT-4 with the code interpreter (now called “Advanced Data Analysis”) could execute code in a sandbox, which some used for testing small code snippets. In summary, while ChatGPT isn’t dedicated to your project, its general reasoning capabilities make it a crucial part of the AI toolkit – especially for problems that aren’t localized to one or two files or when the integrated tools get something wrong and you need a second opinion.

Anthropic Claude 3.5 (“Sonnet”) and Claude 2: Anthropic’s Claude models deserve special mention because they power many of the above tools behind the scenes. Claude 3.5 Sonnet (“Sonnet” is the name of Anthropic’s mid-sized model tier) was widely recognized in late 2024 as one of the best models for coding. It’s fast, has a huge context window (up to 200K tokens, far larger than GPT-4’s standard context at that time), and was cost-effective for high-volume code generation. Cursor, for example, uses Claude Sonnet as its default engine, and other tools like Replit’s Ghostwriter switched to Claude as well.

Claude excels at generating boilerplate and bulk code (it’s great at spitting out lots of structured code, e.g., a series of similar API endpoints or data class definitions). It’s also very good at following instructions – if you tell it the style or approach you want, it tends to adhere closely. On the flip side, OpenAI’s models (especially the o1 reasoning model) were noted to have stronger deep reasoning – useful for tricky debugging or complex algorithmic tasks. For this reason, some tools (like the AI coding agent Fine) allow users to choose between Claude and GPT-4 models for different tasks. By 2025, Claude 3.5 had long since superseded the earlier Claude 2 (released in 2023), and broadly, Anthropic’s models remain a cornerstone of the AI coding landscape, often running behind the scenes in many products.

DeepSeek R1: This is a newcomer on the block – an open-source large language model built for reasoning, including reasoning about code. DeepSeek R1 was released in January 2025 as a fully open model (MIT licensed), making waves for reportedly offering GPT-4 level reasoning on some tasks and a context window up to 128k. A YC conversation noted DeepSeek R1 as “a viable contender” that some founders have started using. The appeal of DeepSeek is that it can be self-hosted, avoiding the need to send code to an external API (great for startups dealing with proprietary code or tight budgets). Its strengths are in reasoning and math, and it can handle coding tasks like understanding and refactoring code across multiple files. The trade-off is that running such a model requires serious computing power (often a high-end GPU or cloud instance), and the model might not be as finely tuned to your specific framework as, say, Copilot.

Still, the very presence of open models like DeepSeek has likely kept pressure on commercial providers. Founders who can afford the time to tinker might use DeepSeek for certain tasks (or incorporate it into their CI pipeline for AI code reviews, etc.). It underscores a larger point: by 2025, you’re not limited to closed APIs; if needed, you can bring the AI in-house.

“Devin” AI (Autonomous Code Agent): A notable experiment in this era is the idea of an autonomous coding agent – not just suggesting code, but taking higher-level objectives and building entire programs. One such agent is Devin AI, marketed as the “AI software engineer.” It attempts to plan, write, test, and fix code iteratively given a goal. Devin (and similar open-source projects like Smol Developer or MetaGPT) got some attention. YC founders did mention “Devin” (though transcribed as “Devon”) in their survey, but noted it’s “not really being used for serious features” – primarily because “it doesn’t really understand the codebase” deeply. Essentially, fully autonomous agents haven’t yet matched the reliability of a human-guided approach.

Devin might be used for small, isolated tasks (e.g., “generate a snippet for this small utility”), but for now it’s more of a curiosity in production settings. The consensus was that these agents, while promising, still feel like junior developers who need constant oversight. That said, progress in this area is rapid, so this could change in the future.

Other Tools and Models: The list goes on – Codeium (a popular free AI coding assistant), Tabnine (still around), Phind (which offers an AI search engine and coding assistant), and corporate offerings like IBM’s Watson Code Assistant, etc. YC founders’ focus, however, was on whatever gave them an edge, which in 2025 meant having the best LLM integration possible. Some founders also experimented with Google Gemini once it became available, particularly because of its rumored 1 million token context window. As mentioned, a few tried feeding entire codebases into Gemini to see if it could one-shot solve problems. While results were mixed, it demonstrated the appetite for using the absolute cutting edge model for certain use cases. In practice, many teams used a hybrid approach: an AI-centric editor (like Cursor/WindSurf) for day-to-day coding, and a powerful chatbot (GPT-4/Gemini) for brainstorming or tricky issues, plus various utilities for testing or reviewing code.

It’s worth noting that many of these tools/models are not mutually exclusive. A single founder might use Cursor for writing a new module, jump to ChatGPT to troubleshoot a bug in an older part of the code, use WindSurf when modifying a cross-cutting feature, and run a quick Codeium suggestion in the browser for a one-liner. The toolbox is rich, and part of the skill is knowing which tool fits which task – similar to how a craftsman uses different tools for cutting, sanding, and polishing.

Changing Workflows, Debugging, and More

In an AI-integrated environment, the software development workflow has transformed dramatically compared to five years ago. Let's explore how various aspects of the coding lifecycle—from writing and debugging to overall process management—have evolved:

Code Generation & Implementation: This is the most obvious change. What used to begin with a blank editor now often begins with a conversation. A developer might start by writing a natural language prompt: e.g., “Create a Django model for an e-commerce product with fields for name, price, description, and inventory count”. The AI (within Cursor or Copilot, etc.) then generates the corresponding class code. If it’s correct, the developer might accept it with minor tweaks. If not, the developer can refine the prompt or edit the code and ask the AI to fix the specific error. The result is that much of the rote typing is gone. Founders say they are “writing code by thinking and reviewing” rather than typing line-by-line.
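For a sense of what the Django prompt quoted above might yield, here is a hedged sketch. To stay runnable without a Django project it uses a plain dataclass as a stand-in; in Django the same four fields would live on a `models.Model` subclass. The class name, defaults, and validation rules are assumptions, not output from any particular tool.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Product:
    """Framework-free stand-in for the Django model described in the prompt."""

    name: str
    price: Decimal            # Decimal avoids float rounding errors for money
    description: str = ""
    inventory_count: int = 0

    def __post_init__(self) -> None:
        # The "review" step: checks a human reviewer would likely insist on.
        if self.price < 0:
            raise ValueError("price must be non-negative")
        if self.inventory_count < 0:
            raise ValueError("inventory_count must be non-negative")

    @property
    def in_stock(self) -> bool:
        """True when at least one unit is available."""
        return self.inventory_count > 0


mug = Product(name="Mug", price=Decimal("9.99"), inventory_count=3)
print(mug.in_stock)
```

The field list comes straight from the prompt; the validation is what "thinking and reviewing" tends to add, since a first draft may well omit it.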

One YC founder (Obie from Aztra) put it bluntly:

“I don’t write code much. I just think and review.”

With AI generating code, creative control shifts to deciding what to build and reviewing outputs. One founder noted they now code 3× faster and feel less attached to code—they can regenerate or refactor quickly, which leads to cleaner results due to lower costs of iteration.

Parallel Development

AI allows a developer to manage multiple tasks simultaneously, using different windows or threads. While AI works on one feature, the human can review another. This “concurrent coding” increases speed without overwhelming the developer. YC’s Yoav exemplified this by using two Cursor windows at once—some push for even more.

Codebase Understanding

AI tools like WindSurf and Cody help developers understand unfamiliar codebases quickly. Developers can ask direct questions like “How is authentication implemented?” and get relevant summaries. While not perfect, this dramatically accelerates onboarding and cross-functional work.

Debugging & Reasoning

AI can hypothesize causes for bugs (e.g., null returns) across broad context. But humans must still test and verify. Over-relying on AI may lead to bugs or security flaws. Current reasoning models can’t yet autonomously detect or fix all errors. Teams now often treat debugging as pair programming between human and AI, with unit tests and validation as core practices.
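As an illustration of that human-plus-AI loop, suppose the AI hypothesizes that a crash comes from an unhandled null return. The sketch below shows the kind of fix and checks a developer might keep afterwards; the functions, data, and messages are hypothetical.

```python
from typing import Optional


def find_user_email(users: dict[str, dict], user_id: str) -> Optional[str]:
    """Return the user's email, or None when the user or the field is missing."""
    user = users.get(user_id)
    if user is None:
        return None
    return user.get("email")


def greeting(users: dict[str, dict], user_id: str) -> str:
    """Caller that handles the None case explicitly instead of assuming an email."""
    email = find_user_email(users, user_id)
    return f"Hello, {email}" if email is not None else "Hello, guest"


def check_greeting() -> None:
    """Human-owned checks that pin down the behaviour the AI's fix must keep."""
    users = {"u1": {"email": "a@example.com"}, "u2": {}}
    assert greeting(users, "u1") == "Hello, a@example.com"
    assert greeting(users, "u2") == "Hello, guest"    # email field missing
    assert greeting(users, "nope") == "Hello, guest"  # user missing entirely


check_greeting()
```

The AI proposes the cause and the patch; the assertions are what let the human verify it and catch a regression if a later regeneration drops the None handling.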

Choosing the Right Model

Developers choose between models based on task type: e.g., Claude for fast scaffolding, GPT-4 for complex reasoning. Some workflows even chain models—a planner assigns tasks to specialized generators. In 2025, models are converging in capability, but engineers still need to know when to switch tools.
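A minimal sketch of such a planner-and-router arrangement follows. The model labels are placeholders rather than real API identifiers, and the two-step `plan` function is a hypothetical stand-in for whatever planning a real workflow would do.

```python
from dataclasses import dataclass

# ASSUMPTION: labels are illustrative placeholders, not real model identifiers.
ROUTES = {
    "scaffold": "fast-codegen-model",
    "reasoning": "deep-reasoning-model",
}
DEFAULT_MODEL = "fast-codegen-model"


@dataclass(frozen=True)
class Task:
    kind: str
    prompt: str


def plan(feature: str) -> list[Task]:
    """Toy planner: split a feature into a design step and a scaffolding step."""
    return [
        Task("reasoning", f"Design the approach for: {feature}"),
        Task("scaffold", f"Generate boilerplate for: {feature}"),
    ]


def route(task: Task) -> str:
    """Pick a model by task kind, falling back to the default for unknown kinds."""
    return ROUTES.get(task.kind, DEFAULT_MODEL)


assignments = [(route(task), task.prompt) for task in plan("CSV export")]
for model, prompt in assignments:
    print(f"{model}: {prompt}")
```

Keeping the routing table as data makes it cheap to re-point a task kind at a different model as capabilities converge.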

Maintenance & Refactoring

AI excels at bulk edits and migrations, like renaming APIs or updating libraries. Developers freely restructure and refactor code with AI’s help, since it's easy to regenerate. However, consistency is crucial—teams often provide style guides or use linters post-generation to maintain code quality.
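A post-generation check can be very small. The sketch below uses Python's standard `ast` module to confirm that generated code parses and to flag function names that break a snake_case convention; the convention itself and the `generated` snippet are assumptions chosen for illustration.

```python
import ast
import re

# Leading underscores allowed, then lowercase letters, digits, underscores.
SNAKE_CASE = re.compile(r"^_*[a-z][a-z0-9_]*$")


def naming_violations(source: str) -> list[str]:
    """Return names of functions in `source` that are not snake_case."""
    tree = ast.parse(source)  # raises SyntaxError if the generated code is invalid
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and not SNAKE_CASE.match(node.name)
    ]


# Hypothetical AI output mixing two naming styles.
generated = '''
def fetchOrders():
    return []

def parse_date(value):
    return value
'''

print(naming_violations(generated))
```

In practice teams would more likely run an established linter; the point is only that the consistency check runs automatically after generation rather than relying on someone noticing the drift in review.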

Documentation & Communication

Developers now use AI to create docstrings, comments, commit messages, and PR summaries. This automates the tedious parts of communication, letting humans focus on purpose and clarity. AI-generated documentation helps new team members understand functions without anyone having to write everything by hand.

Changing Roles: From Software Engineer to “Product Engineer”

The Developer’s Role Is Evolving

YC founders highlight a major shift: software developers are becoming product engineers. With AI handling much of the routine coding, developers now focus on strategic decisions, product design, and validation. As one founder said: “I’m no longer an engineer, I’m a product person.”

The value lies not in writing code (which AI can do fast) but in deciding what to build and why. This makes human taste—judgment, vision, and user understanding—more important than raw coding ability.

“Product engineers” blend technical skills with empathy for users. They work closely with AI to build, test, and refine, acting more like creative directors than line-by-line coders.

Smaller Teams, Bigger Outcomes

AI allows lean teams to build powerful products. One skilled developer with AI can now do the work of several. YC’s Garry Tan notes that companies no longer need massive engineering teams—5–10 people may be enough to build what once took dozens.

This empowers smaller startups and solo founders, reducing costs and barriers to entry, while putting pressure on traditional companies to adapt.

Education & Entry-Level Shifts

This change raises questions about software careers and education. Entry-level engineers may need to learn prompting, validation, and architecture—not just writing code. The field is becoming more abstract, shifting from manual programming to designing and steering intelligent tools.

Still, foundational knowledge remains vital. YC’s Diana Hu warns that engineers need training to recognize bad AI output. Garry Tan adds that scaling AI-built systems still requires deep engineering expertise.

AI’s Impact on Coding – Condensed Summary

By 2025, roughly a quarter of the startups in YC’s current cohort reported codebases that are about 95% AI-generated, up from virtually none in 2020. This explosive adoption allows small teams to do more, as one developer can handle multiple features in parallel. Lean startups now build full SaaS products with just 2–3 people, reducing costs and complexity.

AI lowers the barrier to entry for non-traditional developers. Domain experts can describe what they want in natural language and get working code—enabling innovation across diverse fields.

With less time needed to build, teams are free to experiment and iterate, improving end products. Developers also shift focus to design, specs, and testing, while the AI handles the routine. This improves code quality when paired with disciplined validation.

AI tools also aid onboarding and knowledge retention, acting as live documentation for new team members and reducing the “bus factor.”

Key Challenges of AI in Development

AI-generated code can still have bugs or security issues. Human review is essential because models can make subtle logical errors. Debugging AI-written code can also be harder than debugging your own, particularly when the logic is unfamiliar or inconsistent.

Limited context windows mean some tools miss crucial parts of the codebase, leading to poor integration. There is also legal and compliance risk, both from sending proprietary code to cloud-hosted models and from models reproducing licensed code.

Current models still lack deep reasoning and true code understanding. They work well on familiar patterns but struggle with novel problems unless guided by humans. This can result in knowledge debt when no one fully understands AI-generated code.

Building a Strong AI-Native Team

Adopt AI early, using tools like Cursor or WindSurf.
Choose tools based on project needs (e.g., Claude for speed, WindSurf for context).
Train your team in prompting and verification.
Maintain coding standards and ensure code quality.
Use AI for both writing and reviewing code.
Foster a product-first mindset in engineers.
Don’t over-automate—humans must still approve and oversee.
Protect your IP and sensitive data when using cloud models.
Plan ahead for scaling and technical debt.
Track metrics and continuously improve your AI workflows.

In short, use AI as an accelerator—not a replacement. Teams that combine strong engineers with the right AI tools can build faster, smarter, and more sustainably.

Sources:

Y Combinator “Lightcone” podcast – Vibe Coding Is The Future (2025). (Quotes and insights from YC founders and partners on AI coding tools and changing developer roles.)

TechCrunch (Mar 6, 2025) – A quarter of startups in YC’s current cohort have codebases almost entirely AI-generated.

LeadDev (Mar 20, 2025) – 95% AI-written code? Unpacking the YC CEO’s bombshell.

fine.dev blog (Oct 2024) – OpenAI o1 vs Claude 3.5 Sonnet for coding.

YC founders survey (Winter 2024/25) – various quotes: e.g., Outlet founder on product engineer, Obie (Aztra) on not writing code, Copycat founder on scrapping code, Yoav (CIX) on using Cursor, TrainLoop on 100× acceleration.

YC discussion on AI tools: Cursor vs WindSurf, use of ChatGPT for reasoning, mention of DeepSeek R1, Gemini context use, limitations of “Devin” agent.

Microsoft/GitHub stats on Copilot – ~40% of code generated for users, 55% faster dev.

Anthropic Claude 3.5 “Sonnet” announcement and usage in dev tools.

Additional industry commentary on vibe coding and AI in dev (Karpathy’s post, Fortune/Ars Technica coverage, etc.).

The New Arsenal: AI Coding Tools Used by YC Founders in 2025

Tool/Model: Cursor (AI Editor)
Description: An AI-integrated code editor that uses Anthropic’s Claude 3.5 “Sonnet”. It offers interactive code generation within your IDE through chat-based prompting and inline suggestions.
Strength: Fast code generation with a large context window (although it requires manual file guidance).
Notable use by YC founders: “I write everything with Cursor… I even run two windows in parallel on different features” (used to multitask development efficiently).

Tool/Model: WindSurf (AI coding tool)
Description: An AI coding tool that pre-indexes your entire codebase and automatically brings in relevant context when generating code.
Strength: Ideal for large projects and refactoring across multiple files without manual guidance on context.
Notable use by YC founders: Many founders switched to WindSurf because it “figures out what files to look at on its own,” simplifying complex code changes.

Tool/Model: GitHub Copilot (OpenAI Codex/GPT)
Description: A pioneering autocomplete assistant integrated into popular editors.
Strength: Provides seamless inline code suggestions and, with Copilot X, improved chat capabilities powered by GPT-4.
Notable use by YC founders: Remains widely used for generating boilerplate code; reports indicate it produces about 40% of code for many users.

Tool/Model: ChatGPT (GPT-4)
Description: A general AI chatbot available via web/API that excels in reasoning, debugging, and offering broad technical insights.
Strength: Superior for complex problem-solving, though not directly integrated in IDEs.
Notable use by YC founders: Often used to solve tricky bugs, generate algorithm insights, or explain error logs, serving as an external “senior engineer” advisor.

Tool/Model: Anthropic Claude 3.5 (“Sonnet”) and Claude 2
Description: Large-context models specialized for coding, frequently used behind the scenes in tools like Cursor and Replit Ghostwriter.
Strength: Fast, up-to-date, and reliable for generating bulk code.
Notable use by YC founders: Powers many of the day-to-day coding tools used by YC founders, ensuring quick and accurate code generation.

Tool/Model: DeepSeek R1
Description: An open-source model from 2025, approximately at GPT-4 level, featuring a 128k context window.
Strength: Self-hostable and cost-efficient for large context tasks; excellent at reasoning and code analysis.
Notable use by YC founders: Employed by privacy-conscious founders who prefer running models locally to analyze their entire codebase.

Tool/Model: Devin (Auto-Coder)
Description: An autonomous coding agent that generates, runs, and iteratively fixes code with minimal intervention.
Strength: Aims to handle entire coding tasks autonomously.
Notable use by YC founders: Considered experimental: mentioned but generally “not used for serious features” due to challenges with large codebases.

Tool/Model: Google Gemini
Description: A next-generation model by Google featuring a very large context window.
Strength: Has the potential to analyze entire systems at once, offering improved reasoning over large codebases.
Notable use by YC founders: Tested by some founders by feeding entire repositories to see if it can one-shot fix bugs; not yet mainstream but highly anticipated.

Tool/Model: Others (Codeium, Phind, etc.)
Description: Includes tools such as Codeium (free AI autocomplete), Phind (AI search and code assistant), Replit Ghostwriter, and Amazon CodeWhisperer.
Strength: Fill niche requirements, like cost-efficiency or on-prem needs.
Notable use by YC founders: Often used as alternatives or supplementary tools when specific constraints arise (e.g., budget or compliance).

Changing Workflows, Debugging, and More