Why AI can’t write your software

Jan 13, 2026

We attend events across Europe and the US, and one of the most common questions we get is about the impact of Generative AI: when should you build your own software instead of paying for it? It's a great question, and one whose answer is shifting as the tools improve.

This blog post is specifically about Large Language Models (LLMs), a subset of Generative AI. It's safest to assume the advice here doesn't apply to other categories of AI.

Modern generative AI is impressive – it’s now feasible to avoid search engines and instead use search-capable generative AI to answer questions. AI can generate images, (bad) poems, and even code. So what is it good for, and when is it good enough? It’s very appealing to cut out the expensive developers and start talking to an LLM right away to build your next product. And it can work – there are products on the market currently which were built primarily using AI. Why not your next product? The reality is more nuanced. Code generation can genuinely accelerate certain tasks. But depending on what you’re building, the speed you gain upfront can become a liability later—costing your organization time, money, and trust.  

Here are a few cases where we think AI is great:  

Quick Mockups and Proof-of-Concept Work 

Your product manager has a wild idea for how a feature might work. Instead of blocking engineering time for weeks on design conversations, you can spend 30 minutes describing it to Claude or Cursor and create an interactive prototype. You can show stakeholders your vision, gather feedback, and decide if it’s worth building properly. The prototype doesn’t need to be perfect—it needs to spark conversation. 

Internal Tools with a Clear Endpoint 

Every engineering organization builds one-off utilities: data migration scripts, internal dashboards, and admin panels. These tools often have short lifespans and minimal complexity. They don’t interact with customer data in sensitive ways, and if they break or become obsolete, the impact is contained. Code generation is efficient here because the engineering overhead of “getting it right” is lower than the time cost of building it carefully. 

Simple Apps with a Controlled Lifetime 

Want to build something fairly simple, where you can reasonably check that it's doing what it should?

If the tool that you want to build meets these criteria, it might be worth exploring generative options:

Minimal security implications. How bad would it be if someone had access to all the data or code, or changed the application?

Limited future changes. It's easier to build something once than to keep building on it: when nobody understands the generated code, making it do something new can be challenging.

No regulatory requirements. Whether regulation governs what the software does or how you're required to build it, it can make using AI complicated.

Where Code Generation Can Be a Problem

Long-Term Applications Accumulate Hidden Costs 

The most seductive thing about code generation is how fast initial implementation feels. But here's what your engineering team knows: today's clever shortcut becomes tomorrow's technical debt. Code generated by LLMs often prioritizes speed and surface correctness over architectural coherence. It doesn't respect your existing patterns, doesn't account for your scaling constraints, and doesn't anticipate how the system will evolve in two years when the original author has moved on. Research consistently shows that developers already spend a substantial share of every month dealing with technical debt, and AI-generated code compounds the problem significantly. When you ask a domain expert who has spent years understanding your systems (your engineering lead), they're considering maintainability, consistency, and long-term cost. An LLM is pattern-matching against millions of GitHub repositories, many of which embody questionable practices.

Call out: today’s clever shortcut becomes tomorrow’s technical debt. 

Regulatory Compliance Isn’t a Coding Problem 

If your product operates in healthcare, finance, or any regulated industry, you likely have compliance mandates: HIPAA, SOX, PCI-DSS, or regional data protection laws. These aren’t coding standards—they’re legal frameworks that shape how systems behave. An LLM can suggest encryption libraries, but it won’t understand that California AB 3030 requires specific disclosures when AI generates clinical text, or that your payment processor requires quarterly security audits, or that data residency laws in your target market restrict where customer information can be stored.  

Your compliance officer isn't someone you can go back to and apologise to for the extra work; their requirements are a business constraint that determines what's actually buildable.

Security Risks Are Not Obvious Until They Matter 

This is where I need to be direct: AI-generated code is substantially more likely to have security vulnerabilities than code written by experienced engineers. Veracode's 2025 research found that roughly 45% of AI-generated code samples contained known security vulnerabilities, and that models failed to avoid cross-site scripting and log-injection vulnerabilities in 86% and 88% of test cases respectively [1] [2]. If you're building anything that touches customer data, payment information, or confidential business logic, code generation introduces unacceptable risk. Your security team will flag this. Your customers will rightfully object. And if something goes wrong, you'll have a hard time explaining to auditors or regulators why you deployed code that was never reviewed by someone who understands the security implications. This isn't about being paranoid; it's about recognising that security is not a feature that can be tacked on later. It's architectural. It's contextual. And it requires expertise.
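To make the log-injection class of vulnerability concrete, here is a minimal Python sketch (the function names are ours, purely illustrative, and not drawn from the cited research). Logging raw user input means a newline in that input can forge a fake log entry; escaping control characters keeps one request to one log line.

```python
import io
import logging

def log_login_naive(logger, username):
    # Vulnerable: the raw input is written straight into the log, so a
    # "\n" inside it starts a new, attacker-controlled log line.
    logger.info("Login attempt: %s", username)

def log_login_safe(logger, username):
    # Mitigated: escape CR/LF so one request can only ever produce one
    # log line, and the injection attempt is visible as literal "\n".
    sanitized = username.replace("\r", "\\r").replace("\n", "\\n")
    logger.info("Login attempt: %s", sanitized)

def capture(fn, username):
    # Helper: run one of the loggers above and return what it wrote.
    stream = io.StringIO()
    logger = logging.getLogger(f"demo-{fn.__name__}")
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    fn(logger, username)
    logger.removeHandler(handler)
    return stream.getvalue()

# A malicious "username" that tries to forge a second log entry.
attack = "alice\nINFO Login attempt: admin"

print(capture(log_login_naive, attack))  # two lines: the forged entry looks genuine
print(capture(log_login_safe, attack))   # one line: the attack is plainly visible
```

The naive version emits two log lines for a single request, and the forged one is indistinguishable from a real entry; the sanitized version emits exactly one. This is exactly the kind of subtle flaw an LLM will happily reproduce and an experienced reviewer will catch.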

System Design, Architecture, Testing, … 

By bringing in people skilled in a domain you gain access to judgment, not just implementation. A good software engineer will challenge and refine your requirements: they will ask how a feature fits onboarding, what happens on the unhappy path, how it impacts observability, what service level agreements you’ve promised, and what trade-offs you are implicitly making. Instead of just wiring together endpoints, they design boundaries, failure modes, and feedback loops so the system is operable six months from now, not just demoable on Friday. Equally important, experienced engineers co-design the product with you. They help craft sensible defaults, guardrails, and first-run experiences so new users understand the value quickly, and existing users aren’t surprised when something changes. An AI assistant, by contrast, will faithfully translate whatever you type into code—bugs, missing edge cases, and flawed assumptions included. It won’t push back on an unsafe workflow, question an ambiguous requirement, or redesign a flow to reduce support load; it just automates the mistake at scale. 

 

The Experience You Should Expect 

Here’s what a reasonable workflow with humans looks like: You describe a problem to your engineering lead. They ask clarifying questions. Together you establish scope, constraints, and non-negotiable requirements. Then they decide: Is this a good candidate for LLM assistance, or should I design it carefully? If it’s LLM-assisted, they provide that context to the tool and iterate quickly. If it needs careful design, they own that process and communicate progress along the way. The result is code that fits your system, handles edge cases, and doesn’t create future problems. It takes longer than the “fastest possible” path, but less time than it would have taken a year ago. If you’re working with an LLM directly—building a prototype or a one-off tool—you’ll move quickly, but with the understanding that this code has a limited lifespan. You’re not trying to build something that scales to enterprise production; you’re trying to validate an idea or solve an immediate problem.

The Bottom Line

Code generation is genuinely useful. Use it for prototypes, for scaffolding, for boilerplate. But for anything that will live long-term, touch sensitive data, or require ongoing maintenance, invest in experienced engineering talent and give them the space to think. Your engineering lead isn’t being slow when they insist on careful design for the things that matter. They’re protecting you from costs that aren’t obvious until they’re expensive. And if you’re tempted to skip that step by leaning on LLM code generation, remember: the speed you gain upfront is often paid back with interest in maintenance, security, and debt. The organizations winning with this technology aren’t the ones trying to replace engineers. They’re the ones using AI to amplify what good engineers do best—thinking clearly, making informed tradeoffs, and building systems that last. 


So are we worried at Firefinch about the use of LLMs to generate code? Not at all! If anything, there will be an increased need for professional services to oversee and guide the process of building robust, compliant software.  


💬 Would you like help to build your software? Book a friendly, no-obligation chat with our team and explore how we can support your journey to market. Contact us for further information. 


🖱️ Firefinch specialises in compliant medical and life science software development with deep expertise in regulatory requirements, quality systems, and development best practices.