p016-planifest-frontend-stack-evaluation - Planifest Frontend Stack Evaluation
Purpose
This document evaluates frontend frameworks and meta-frameworks for use in Planifest's agentic CI/CD pipeline, where code is generated by AI agents (via LLM API), not written by humans. Traditional developer-experience priorities (learning curve, community vibe, personal preference) are irrelevant. The sole question is: what gets an LLM to write correct, visually acceptable, production-ready frontend code with minimal iteration?
Frontend presents a unique challenge compared to backend: correctness has two dimensions - functional (does it work?) and visual (does it look right?). Addy Osmani's React Summit research identifies a "complexity cliff" where models achieve ~40% success on isolated component tasks but collapse to ~25% on multi-step integrations involving state management, routing, and design coherence. The evaluation criteria below are calibrated to this reality.
Evaluation Criteria
The backend evaluation (p013) used 15 criteria. Frontend shares some of these but introduces visual and interaction-specific concerns. The criteria used here are:
- LLM Training Corpus Coverage - how much idiomatic code exists in the training data
- Compile-Time Error Detection - how many bugs are caught before runtime
- Error Feedback Clarity - how actionable are build/runtime errors for agent self-correction
- Type System - strength, soundness, and boundary enforcement
- Component Model Clarity - how predictable is the component lifecycle/rendering model
- State Management - how likely is the agent to produce correct state logic
- Styling & Design System Integration - how naturally does the framework pair with utility CSS and component libraries
- Testing Framework - how well do LLMs generate tests and how verifiable are the results
- Build Tooling & Bundle Characteristics - build speed, bundle size, HMR, container footprint
- Routing & Data Fetching - complexity of SSR/SSG/CSR routing and data loading patterns
- Accessibility by Default - how much a11y does the framework enforce or encourage
- Third-Party Integration Coverage - availability of UI libraries, SDKs, and community packages
- Ecosystem Maturity & Stability - production track record, breaking change frequency, governance
- Agent Skill & Context Engineering Support - availability of published agent skills, AGENTS.md files, best-practice documents optimised for LLM consumption
- Overall Agent-Suitability - estimated first-pass success rates and typical iteration counts
Scoring Key
| Stars | Meaning |
|---|---|
| ★★★★★ | Best in class - near-zero agent iteration needed |
| ★★★★ | Strong - occasional iteration, mostly correct first time |
| ★★★ | Adequate - regular iteration needed but manageable |
| ★★ | Weak - frequent iteration, many classes of bugs slip through |
| ★ | Poor - unsuitable for agent-generated code |
1. React 19 + Vite + TypeScript (SPA / Client-Side)
Note: Evaluated as a Vite-based SPA without a meta-framework. This is the "plain React" option.
LLM Training Corpus Coverage
Score: ★★★★★
- React is the most represented frontend framework in LLM training data by a wide margin. Stack Overflow, GitHub, blog posts, documentation - all saturated with React + TypeScript examples.
- LLMs generate idiomatic React with hooks, context, and JSX more fluently than any other frontend framework.
- The React + TypeScript + Tailwind CSS + shadcn/ui combination is the de facto "vibe coding" stack and is over-represented in training corpora from 2023 onwards.
Compile-Time Error Detection
Score: ★★★
- TypeScript catches type mismatches, prop errors, and basic null issues.
- Same weaknesses as backend TypeScript:
anyescape hatch, unsound type system, no enforced error handling. - JSX type checking catches mismatched props, missing required props, and invalid HTML attributes - this is genuinely useful for agent-generated UI code.
- Common agent mistakes that slip through: missing `key` props in lists (warning only), stale closures in `useEffect`, race conditions in async state updates.
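The unsoundness point is easy to demonstrate without React. The sketch below (hypothetical data; strict mode assumed) shows the `as` escape hatch letting a shape mismatch through the checker, surfacing only at runtime:

```typescript
// A class of bug the TypeScript checker does not catch: `as` silences
// the checker, so the typo'd key only surfaces when the field is read.
interface User {
  name: string;
}

const raw: unknown = JSON.parse('{"nmae": "Ada"}'); // note the typo'd key
const user = raw as User; // compiles cleanly: `as` bypasses checking

console.log(user.name); // -> undefined, with no compile-time error
```

An agent that reaches for `as` to silence an error therefore removes exactly the protection the type system was providing.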
Error Feedback Clarity
Score: ★★★★
- Vite's HMR overlay provides clear, formatted error messages with source locations.
- TypeScript compiler errors in JSX are verbose but actionable.
- React's development-mode warnings (missing keys, invalid hook calls, prop type mismatches) are well-known to LLMs from training data.
- Stack traces from React error boundaries are readable.
Type System
Score: ★★★
- Same structural typing as backend TypeScript - powerful but unsound.
- React's generic component types (`FC<Props>`, `ComponentProps<>`, `Ref<>`) add meaningful safety for agent-generated UI.
- Discriminated unions for component variants (e.g. `type ButtonProps = PrimaryButton | SecondaryButton`) are well-handled by LLMs.
- Risk: LLMs frequently use `as` assertions to silence complex generic errors rather than fixing them.
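The discriminated-union variant pattern mentioned above can be sketched as follows (prop shapes are illustrative, not from any real design system):

```typescript
// Each variant carries a literal `variant` discriminant; the union makes
// invalid prop combinations unrepresentable.
type PrimaryButton = { variant: "primary"; onClick: () => void };
type SecondaryButton = { variant: "secondary"; href: string };
type ButtonProps = PrimaryButton | SecondaryButton;

function describeButton(props: ButtonProps): string {
  // Narrowing on the discriminant gives exhaustive, type-safe branches;
  // adding a third variant turns any unhandled case into a type error.
  switch (props.variant) {
    case "primary":
      return "renders a <button> with a click handler";
    case "secondary":
      return `renders a link to ${props.href}`;
  }
}

console.log(describeButton({ variant: "secondary", href: "/docs" }));
// -> "renders a link to /docs"
```

This is one of the few patterns where TypeScript actively steers an agent toward correct UI code rather than merely annotating it.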
Component Model Clarity
Score: ★★★★
- Function components with hooks are a straightforward mental model that LLMs handle well.
- The rules of hooks (don't call conditionally, don't call in loops) are well-represented in training data; agents rarely violate them.
- `useEffect` dependency arrays are the #1 source of agent-generated bugs - missing dependencies, unnecessary dependencies, and infinite re-render loops.
- React 19's compiler (React Compiler / React Forget) reduces the need for manual `useMemo`/`useCallback`, which removes a class of optimisation errors agents previously produced.
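The stale-closure failure mode behind many of these bugs can be shown without React at all: a callback closes over the bindings that existed when it was created. In React, every render creates fresh bindings, so an effect that is not re-registered keeps reading old values. A plain-TypeScript sketch (names illustrative):

```typescript
// Each call to render() mimics one React render: the returned callback
// closes over *that* render's value only.
function render(count: number) {
  return () => count;
}

const effectFromFirstRender = render(0);
const effectFromLatestRender = render(5);

console.log(effectFromFirstRender());  // -> 0 (stale: sees the old value)
console.log(effectFromLatestRender()); // -> 5
```

A missing dependency in a `useEffect` array is the React equivalent of holding on to `effectFromFirstRender` after the state has moved on.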
State Management
Score: ★★★
- `useState` and `useReducer` are generated correctly for simple cases.
- Context API is generated idiomatically, but agents frequently create unnecessary re-renders by placing too much state in a single context.
- For complex state, LLMs reach for Redux (verbose, boilerplate-heavy) or Zustand (simpler, but less represented in training data). Neither is generated reliably without explicit prompting.
- Async state (loading/error/data patterns) is a common failure point - agents often forget error states or produce race conditions between rapid state updates.
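One way to make the loading/error/data pattern hard for an agent to get wrong is to model it as a discriminated union, so states like "success without data" are unrepresentable. A minimal sketch (names illustrative):

```typescript
// Four mutually exclusive states; the error branch cannot be forgotten
// because the catch clause is the only way to produce it.
type AsyncState<T> =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "error"; error: string }
  | { status: "success"; data: T };

async function runFetch<T>(fetcher: () => Promise<T>): Promise<AsyncState<T>> {
  try {
    return { status: "success", data: await fetcher() };
  } catch (e) {
    return { status: "error", error: String(e) };
  }
}

runFetch(async () => [1, 2, 3]).then((s) => console.log(s.status)); // -> "success"
```

Putting a shape like this in the specification removes the most common class of agent-generated state bugs in this stack.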
Styling & Design System Integration
Score: ★★★★★
- Tailwind CSS utility classes are generated with near-perfect fluency by all major LLMs.
- shadcn/ui components (built on Radix UI primitives) are extremely well-represented in training data and are generated correctly with minimal prompting.
- The combination of Tailwind + shadcn/ui gives the agent a constrained design vocabulary that produces visually consistent results.
- Agents handle responsive breakpoints (`sm:`, `md:`, `lg:`) correctly. Dark mode (the `dark:` prefix) is usually correct.
- Risk: agents produce a "generic AI aesthetic" - safe but bland. Design differentiation requires explicit constraints in the specification.
Testing Framework
Score: ★★★★
- Vitest is Vite-native, fast, and well-understood by LLMs (API is Jest-compatible).
- React Testing Library (`@testing-library/react`) is the standard, and LLMs generate idiomatic tests using `render`, `screen`, and `userEvent`.
- Playwright for E2E testing is well-supported; Playwright now ships dedicated Test Agents (planner, generator, healer) for LLM-driven test creation.
- Agents generate reasonable happy-path tests but frequently miss edge cases, error states, and accessibility assertions.
- Vitest Browser Mode (backed by Playwright) enables component tests in real browsers - important for catching rendering bugs jsdom misses.
Build Tooling & Bundle Characteristics
Score: ★★★★★
- Build tool: Vite - sub-second HMR, fast cold starts, excellent DX.
- Typical production bundle: 80-200 KB (gzipped, depending on dependencies).
- Build time: 2-10 seconds for a typical SPA.
- Container image (static serving via Nginx/Caddy): 10-30 MB.
- Startup time: Instant (static files served by a web server).
- Vite's Rollup-based production build produces well-optimised chunks with tree-shaking.
Routing & Data Fetching
Score: ★★★
- React Router v7 (or TanStack Router) for client-side routing. LLMs generate basic routing correctly.
- No built-in SSR/SSG - this is a pure SPA. Data fetching is entirely client-side.
- TanStack Query (React Query) for server state is well-known to LLMs and generated idiomatically.
- Risk: agents often produce waterfall data fetching patterns (sequential `useEffect` chains) rather than parallelised queries.
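The waterfall-versus-parallel distinction can be sketched in plain TypeScript (the fetchers are mock stand-ins for real requests, not a real API):

```typescript
const fetchUser = async () => ({ id: 1, name: "Ada" });
const fetchPosts = async () => [{ title: "Hello" }];

// Waterfall: posts wait for the user for no reason - total latency is
// the *sum* of both request times.
async function waterfall() {
  const user = await fetchUser();
  const posts = await fetchPosts();
  return { user, posts };
}

// Parallel: both requests start immediately - total latency is the *max*.
async function parallel() {
  const [user, posts] = await Promise.all([fetchUser(), fetchPosts()]);
  return { user, posts };
}

parallel().then((r) => console.log(r.user.name, r.posts.length)); // -> Ada 1
```

Sequential `useEffect` chains produce the waterfall shape by construction, which is why TanStack Query's parallel query APIs are worth mandating in the specification.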
Accessibility by Default
Score: ★★★
- React itself enforces nothing. Accessibility is opt-in.
- shadcn/ui (built on Radix UI) provides accessible primitives (keyboard navigation, ARIA attributes, focus management) out of the box - this is a significant advantage when agents use it as the component library.
- Without an accessible component library, agents rarely generate correct ARIA attributes, skip links, or keyboard handlers.
- The ESLint plugin `eslint-plugin-jsx-a11y` catches basic issues at lint time.
Third-Party Integration Coverage
Score: ★★★★★
- React has the broadest UI component library ecosystem of any frontend framework.
- Every major design system ships React components: Material UI, Ant Design, Chakra UI, Mantine, shadcn/ui, Radix, Headless UI.
- All major charting libraries (Recharts, Nivo, Victory, Chart.js wrappers) have React bindings.
- Map libraries (react-map-gl, Leaflet wrappers), form libraries (React Hook Form, Formik), animation libraries (Framer Motion, React Spring) - all mature.
Ecosystem Maturity & Stability
Score: ★★★★
- React 19 was released in December 2024 as a stable, backward-compatible upgrade from React 18.
- Meta maintains React with a long-term roadmap. React Compiler reached v1.0 in 2025.
- Vite is maintained by Evan You and a dedicated team, with rapid iteration but good backward compatibility.
- Risk: the React ecosystem's breadth creates decision fatigue - many ways to solve the same problem, and agents can produce inconsistent patterns across a codebase.
Agent Skill & Context Engineering Support
Score: ★★★★★
- Vercel's `react-best-practices` Agent Skill: 58+ rules across 8 categories, compiled into a single AGENTS.md optimised for LLM consumption. Installable into Claude Code, Cursor, Codex, and other coding agents via `npx skills add vercel-labs/agent-skills`.
- Anthropic's frontend-design skill: bundled with Claude, provides React/Tailwind/shadcn/ui generation patterns.
- Playwright Test Agents: dedicated planner/generator/healer agents for E2E test creation.
- The React + TypeScript + Tailwind stack is the best-supported stack in the entire agent skills ecosystem. No other framework comes close.
Overall Agent-Suitability
Score: ★★★★★
- Estimated first-pass functional success rate: 70-80% (component renders, routes work, basic interactions function).
- Estimated first-pass visual acceptability rate: 55-65% (layout correct, spacing acceptable, responsive behaviour present - but design often generic).
- Typical iterations for a standard CRUD SPA: 2-4.
- The enormous training corpus and mature agent skills ecosystem make React + Vite + TypeScript the clear leader for agent-generated frontend code.
Best Use Cases
- CRUD SPAs (admin panels, dashboards, internal tools)
- Integration-heavy applications where third-party React component libraries save agent iteration
- Any application where a confirmed design specification drives a known component structure
Avoid If
- You need server-side rendering or static site generation (use Next.js instead)
- You need edge rendering or streaming HTML (use a meta-framework)
- Content-heavy marketing sites where SEO is critical
Key Risks
- `useEffect` complexity: Dependency arrays, cleanup functions, and async effects are the primary source of agent-generated bugs
- State management sprawl: Without explicit architectural guidance in the specification, agents produce inconsistent state patterns
- Generic design: Agent-generated UI converges on a "Tailwind default" aesthetic without strong design constraints in the spec
- No SSR: Pure SPAs have inherent SEO and initial load limitations
2. Next.js 15+ (React Meta-Framework)
LLM Training Corpus Coverage
Score: ★★★★★
- Next.js is the most-used React meta-framework and extremely well-represented in training data.
- App Router (introduced in Next.js 13, stabilised in 14-15) has substantial but less mature training data than Pages Router. LLMs sometimes conflate the two patterns.
Compile-Time Error Detection
Score: ★★★
- Same TypeScript base as plain React. Next.js adds some build-time checks for invalid configurations.
- Server Component / Client Component boundary violations produce clear errors.
Error Feedback Clarity
Score: ★★★★
- Next.js dev overlay is informative with source-mapped stack traces.
- Server-side errors are rendered in the browser with clear formatting.
- Risk: hydration mismatch errors can be cryptic and are a common source of agent-generated bugs.
Type System
Score: ★★★
- Same as React + TypeScript.
Component Model Clarity
Score: ★★★
- App Router introduces Server Components (the default) and Client Components (marked with the `'use client'` directive).
- This dual model is a significant source of agent confusion. LLMs frequently place hooks in Server Components or forget the `'use client'` directive.
- The `loading.tsx`, `error.tsx`, and `layout.tsx` conventions are well-documented, but agents sometimes misplace them in the file hierarchy.
- Server Actions (`'use server'`) add another boundary that agents must reason about correctly.
State Management
Score: ★★★
- Same client-side state options as React.
- Server state via Server Components and Server Actions is powerful but agents struggle with the mental model - especially around when data is fetched server-side vs. client-side.
Styling & Design System Integration
Score: ★★★★★
- Same Tailwind + shadcn/ui story as plain React.
- Next.js has first-party `next/font` for optimised font loading, which agents use correctly.
Testing Framework
Score: ★★★
- Same Vitest/Playwright options as React.
- Testing Server Components and Server Actions is less mature and less well-represented in training data.
- E2E tests work well; unit/component testing of the App Router model is more complex.
Build Tooling & Bundle Characteristics
Score: ★★★★
- Build tool: Turbopack (default since Next.js 15) or Webpack.
- Typical production bundle: 80-250 KB (first load JS, after code splitting).
- Build time: 10-60 seconds (depends on page count and data fetching).
- Container image: 100-300 MB (Node.js runtime required for SSR).
- Startup time: 500 ms - 2 s (Node.js server).
- Larger and slower than a static SPA, but enables SSR/SSG.
Routing & Data Fetching
Score: ★★★★
- File-based routing is intuitive and well-generated by agents.
- Data fetching in Server Components via `async` component functions is clean.
- Caching behaviour (`fetch` options, `revalidate`, `dynamic`) is complex, and agents frequently guess wrong. Vercel's `react-best-practices` skill specifically addresses this.
- Parallel data fetching via `Promise.all` in layouts is a pattern agents often miss.
Accessibility by Default
Score: ★★★
- Same as React - framework enforces nothing, component library (shadcn/ui) provides the foundation.
Third-Party Integration Coverage
Score: ★★★★★
- All React libraries work. Next.js adds its own optimised components: `next/image`, `next/link`, `next/font`.
- Vercel's ecosystem (Analytics, Speed Insights, Toolbar) integrates natively.
Ecosystem Maturity & Stability
Score: ★★★★
- Next.js is maintained by Vercel with a rapid release cadence.
- Major versions introduce significant API changes (Pages Router -> App Router was a paradigm shift).
- Risk: the App Router is still evolving; some patterns that LLMs learned from early documentation have been superseded.
Agent Skill & Context Engineering Support
Score: ★★★★★
- Vercel's `react-best-practices` skill specifically covers Next.js patterns (Server Components, caching, bundle optimisation).
- Vercel's `web-design-guidelines` skill audits UI code against 100+ rules.
- Next.js evals (Vercel's own benchmark) measure LLM performance on Next.js-specific tasks - the best models achieve ~42% success on these evaluation tasks.
Overall Agent-Suitability
Score: ★★★★
- Estimated first-pass functional success rate: 60-70%.
- Estimated first-pass visual acceptability rate: 55-65%.
- Typical iterations for a standard CRUD application with SSR: 3-5.
- The Server/Client Component boundary is the primary source of additional iteration compared to plain React + Vite.
Best Use Cases
- SEO-critical applications (marketing sites, e-commerce, content platforms)
- Full-stack applications where the frontend and API co-locate
- Applications requiring SSR, SSG, or ISR for performance
Avoid If
- Pure SPAs where SSR adds complexity without benefit
- Teams where the agent orchestration layer cannot handle the Server/Client Component boundary reliably
Key Risks
- Server/Client Component confusion: The #1 source of agent-generated Next.js bugs
- Hydration mismatches: Server-rendered HTML diverging from client hydration
- Caching complexity: `fetch` caching, `revalidate`, and `dynamic` options are frequently misconfigured
- Rapid evolution: App Router patterns change between minor versions, and LLM training data lags
3. SvelteKit (Svelte 5)
LLM Training Corpus Coverage
Score: ★★★
- Svelte has meaningful but substantially less training data than React. Svelte 5's runes system (`$state`, `$derived`, `$effect`) was released in late 2024 and is poorly represented in most LLM training corpora.
- LLMs frequently generate Svelte 4 syntax (reactive `$:` declarations) when targeting Svelte 5, requiring correction.
Compile-Time Error Detection
Score: ★★★★
- Svelte's compiler catches many errors that React defers to runtime: unused CSS, unreachable template code, invalid bindings.
- TypeScript support is solid and integrated into the Svelte compiler.
- Compile-time errors are tied directly to source code, not to an intermediate representation.
Error Feedback Clarity
Score: ★★★★
- Svelte compiler errors are clear, concise, and point directly to the problematic line in the `.svelte` file.
- Error messages are among the best of any frontend framework - one error at a time, with an actionable suggestion included.
- SvelteKit runtime errors are less polished but still readable.
Type System
Score: ★★★
- TypeScript in `<script lang="ts">` blocks. Same structural typing limitations as React.
- Svelte's template type checking is more limited than JSX - some prop errors are only caught at runtime.
Component Model Clarity
Score: ★★★★
- Single-file components (`.svelte`) with `<script>`, `<style>`, and template in one file - close to vanilla HTML/CSS/JS.
- Svelte 5 runes provide explicit reactivity: `$state()` for state, `$derived()` for computed values, `$effect()` for side effects.
- Fewer footguns than React hooks - no dependency arrays, no stale closures, no rules-of-hooks violations.
- Risk: the runes API is new and LLMs have limited training data for it.
State Management
Score: ★★★★
- Svelte stores (`writable`, `readable`, `derived`) are simple and lightweight - no external library needed.
- Svelte 5 runes make reactivity explicit and less error-prone than React's `useState`/`useEffect` model.
- Less state management fragmentation than React (no Redux vs. Zustand vs. Jotai vs. Recoil decision).
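The store contract behind `writable` is small enough to sketch in plain TypeScript. This is an illustrative reimplementation of the subscribe/set shape, not the actual `svelte/store` source:

```typescript
type Subscriber<T> = (value: T) => void;

// Minimal writable store: subscribers are called immediately on
// subscribe (as Svelte stores are) and again on every set().
function writable<T>(initial: T) {
  let value = initial;
  const subs = new Set<Subscriber<T>>();
  return {
    subscribe(fn: Subscriber<T>) {
      subs.add(fn);
      fn(value);
      return () => subs.delete(fn); // returns the unsubscribe function
    },
    set(next: T) {
      value = next;
      subs.forEach((fn) => fn(next));
    },
  };
}

const count = writable(0);
const seen: number[] = [];
const stop = count.subscribe((v) => seen.push(v));
count.set(1);
stop();
count.set(2);                 // not observed after unsubscribe
console.log(seen); // -> [0, 1]
```

The simplicity of this contract is part of why agents produce fewer state bugs in Svelte than in React, despite the smaller training corpus.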
Styling & Design System Integration
Score: ★★★
- Scoped CSS by default - clean and predictable.
- Tailwind CSS works well with Svelte.
- Component library ecosystem is much smaller: Skeleton UI, Melt UI, Bits UI exist but have far less LLM training data than React equivalents.
- No shadcn/ui equivalent with the same depth of LLM training data (shadcn-svelte exists but is less mature).
Testing Framework
Score: ★★★
- Vitest for unit tests.
- `@testing-library/svelte` for component tests.
- Playwright for E2E.
- Less testing training data than React - agents produce less idiomatic Svelte tests.
Build Tooling & Bundle Characteristics
Score: ★★★★★
- Build tool: Vite (SvelteKit is Vite-native).
- Typical production bundle: 30-80 KB (gzipped) - significantly smaller than React due to compile-time approach.
- Build time: 2-8 seconds.
- Container image (static): 10-25 MB. (SSR: 80-150 MB with Node.js).
- Startup time: Instant (static) or 100-500 ms (SSR).
- Svelte's compiler eliminates the virtual DOM runtime, producing the smallest bundles of any major framework.
Routing & Data Fetching
Score: ★★★★
- SvelteKit's file-based routing with `+page.svelte`, `+page.server.ts`, and `+layout.svelte` is clean and well-documented.
- `load` functions for data fetching are explicit and type-safe.
- SSR, SSG, and CSR are configurable per route.
- Less complex than Next.js App Router - fewer boundary concepts for agents to reason about.
Accessibility by Default
Score: ★★★★
- Svelte compiler produces a11y warnings for missing alt attributes, incorrect ARIA roles, and other common issues.
- This is a meaningful advantage - the compiler catches a11y bugs that agents would otherwise miss.
Third-Party Integration Coverage
Score: ★★
- Significantly smaller ecosystem than React. Most third-party SDKs ship React components, not Svelte components.
- Charting: LayerChart, Pancake - far less mature than React charting options.
- Form handling: Superforms - solid but less training data.
- Agent must write more custom code where a React agent would use a pre-built library.
Ecosystem Maturity & Stability
Score: ★★★
- Svelte 5 represents a major paradigm shift (runes replacing reactive declarations). LLM training data covers both old and new patterns, creating confusion.
- SvelteKit reached 1.0 in December 2022 and has been stable since, but the Svelte 5 migration is still ongoing in the ecosystem.
- Smaller community means fewer maintained third-party packages.
- Rich Harris is employed by Vercel, providing institutional backing.
Agent Skill & Context Engineering Support
Score: ★★
- No published Svelte-specific agent skills comparable to Vercel's React best practices.
- No AGENTS.md equivalent for SvelteKit.
- Agents must rely on general documentation rather than LLM-optimised guidance.
Overall Agent-Suitability
Score: ★★★
- Estimated first-pass functional success rate: 50-60%.
- Estimated first-pass visual acceptability rate: 45-55%.
- Typical iterations for a standard CRUD application: 4-6.
- Better compiler errors and simpler reactivity model are offset by smaller training corpus and ecosystem.
Best Use Cases
- Performance-critical SPAs where bundle size matters (dashboards, embedded widgets)
- Applications where the smaller ecosystem is not a limitation (internal tools with standard UI)
- Teams willing to invest in Svelte-specific agent prompts and specifications
Avoid If
- The application requires extensive third-party UI component integration
- Agent iteration cost is the primary concern (React will iterate faster)
- LLM training data currency is important (Svelte 5 runes are poorly represented)
Key Risks
- Svelte 4 vs. 5 confusion: LLMs generate outdated reactive syntax for Svelte 5 targets
- Ecosystem gaps: Missing libraries force custom agent-generated code with higher error rates
- Smaller training corpus: Fewer Stack Overflow answers, fewer GitHub examples, less blog coverage
- Component library deficit: No shadcn/ui-quality component library with deep LLM training data
4. Vue 3 + Nuxt 3
LLM Training Corpus Coverage
Score: ★★★★
- Vue 3 has strong training data coverage - second only to React among frontend frameworks.
- Composition API (`ref`, `reactive`, `computed`, `watch`) is well-represented. The Options API is also well-known.
- Risk: LLMs sometimes generate Vue 2 patterns (Options API, `this.$emit`) when targeting the Vue 3 Composition API.
Compile-Time Error Detection
Score: ★★★
- `vue-tsc` (Volar) provides TypeScript checking for `.vue` single-file components.
- Template type checking has improved significantly with Volar but is less mature than JSX type checking.
- The `<script setup>` syntax is clean and well-generated by LLMs.
Error Feedback Clarity
Score: ★★★★
- Vue's development warnings are detailed and include component hierarchy context.
- Vite HMR overlay provides clear error reporting.
- Nuxt's error handling for SSR/SSG issues is reasonable.
Component Model Clarity
Score: ★★★★
- Single-file components with `<script setup>`, `<template>`, and `<style scoped>` are clear and predictable.
- Composition API provides a function-based reactivity model similar to React hooks but with automatic dependency tracking (no dependency arrays).
- Less footgun-prone than React: `ref` and `reactive` handle dependency tracking automatically.
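Automatic dependency tracking is the property that removes the dependency-array failure mode. A tiny plain-TypeScript sketch of the idea (illustrative only, not Vue's actual `@vue/reactivity` implementation):

```typescript
// The currently running effect, recorded so reads can register it.
let activeEffect: (() => void) | null = null;

function ref<T>(initial: T) {
  let value = initial;
  const deps = new Set<() => void>();
  return {
    get value() {
      if (activeEffect) deps.add(activeEffect); // track the reader
      return value;
    },
    set value(next: T) {
      value = next;
      deps.forEach((fn) => fn()); // re-run everything that read us
    },
  };
}

function watchEffect(fn: () => void) {
  activeEffect = fn;
  fn(); // the first run records which refs were read
  activeEffect = null;
}

const count = ref(0);
const log: number[] = [];
watchEffect(() => log.push(count.value));
count.value = 2;
console.log(log); // -> [0, 2]
```

Because the framework discovers dependencies by observing reads, there is no array for an agent to get wrong.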
State Management
Score: ★★★★
- Pinia is the official state management library - simple, type-safe, and well-generated by LLMs.
- Less fragmentation than React's state management ecosystem.
- Composables (custom hooks equivalent) are generated correctly for common patterns.
Styling & Design System Integration
Score: ★★★
- Scoped styles by default in SFCs.
- Tailwind works well. UnoCSS (an alternative) is popular in the Vue ecosystem.
- Component libraries: Vuetify, PrimeVue, Naive UI, Element Plus - decent but less LLM training data than React equivalents.
- No shadcn/ui equivalent with the same depth (shadcn-vue exists but is less mature).
Testing Framework
Score: ★★★★
- Vitest is Vite-native and works seamlessly with Vue.
- `@vue/test-utils` and `@testing-library/vue` for component testing.
- Playwright/Cypress for E2E.
Build Tooling & Bundle Characteristics
Score: ★★★★
- Build tool: Vite (Vue is Vite's native framework - Evan You created both).
- Typical production bundle: 60-150 KB (gzipped).
- Build time: 3-12 seconds.
- Container image (Nuxt SSR): 100-250 MB. (Static: 10-25 MB).
- Smaller bundles than React due to Vue's lighter runtime.
Routing & Data Fetching
Score: ★★★★
- Nuxt 3 file-based routing is clean and similar to Next.js.
- The `useFetch` and `useAsyncData` composables for data fetching are straightforward.
- Less Server/Client Component confusion than Next.js App Router.
Accessibility by Default
Score: ★★★
- Vue does not enforce accessibility at the framework level.
- Component libraries vary in a11y quality (Vuetify is strong; others vary).
Third-Party Integration Coverage
Score: ★★★
- Smaller than React but larger than Svelte.
- Most major UI needs are covered but with fewer options per category.
- Some SDKs ship Vue components (e.g. Stripe Elements), but many are React-only.
Ecosystem Maturity & Stability
Score: ★★★★
- Vue 3 is stable and mature. Composition API is settled.
- Nuxt 3 is stable since late 2022, with regular releases.
- Strong governance under Evan You and the Vue core team.
Agent Skill & Context Engineering Support
Score: ★★
- No published Vue-specific agent skills comparable to Vercel's React best practices.
- No AGENTS.md equivalent for Vue/Nuxt.
Overall Agent-Suitability
Score: ★★★★
- Estimated first-pass functional success rate: 60-70%.
- Estimated first-pass visual acceptability rate: 50-60%.
- Typical iterations for a standard CRUD application: 3-5.
- Strong training data and simpler reactivity model compared to React, but smaller ecosystem and agent skill support.
Best Use Cases
- SPAs and SSR applications where Vue's simpler reactivity model reduces iteration
- Teams with existing Vue codebases being retrofitted with Planifest
- Laravel-backed applications (Vue + Laravel is a common pairing)
Avoid If
- Maximum agent skill/context engineering support is required (React is better served)
- The application requires extensive third-party component integration
- React Native mobile support is needed in future
Key Risks
- Vue 2 vs. 3 confusion: LLMs occasionally generate Options API when Composition API is required
- Component library depth: Fewer deeply-trained component libraries than React
- No agent skills ecosystem: Agents must rely on general documentation
5. Angular 18+ (with Signals)
LLM Training Corpus Coverage
Score: ★★★★
- Angular has substantial training data, particularly for enterprise patterns.
- Risk: much of the training data covers older Angular versions (RxJS-heavy, module-based). Angular's shift to standalone components and signals (v16-18) is less well-represented.
Compile-Time Error Detection
Score: ★★★★
- Angular's ahead-of-time (AOT) compiler catches template errors, binding mismatches, and type violations at build time.
- Strict mode catches more issues than React's TypeScript setup.
- Angular's template type checking is more rigorous than Vue or React.
Error Feedback Clarity
Score: ★★★
- Angular's error messages have historically been verbose and noisy - cascading template errors are common.
- Improved in recent versions but still less clear than Svelte or Go (from the backend evaluation).
- LLMs can struggle to extract the actionable fix from Angular's multi-line error output.
Component Model Clarity
Score: ★★★
- Angular's component model is comprehensive but complex: decorators, dependency injection, lifecycle hooks, change detection strategies.
- Standalone components (v15+) simplify the model, but LLMs still generate module-based patterns from older training data.
- Signals (v16+) provide a simpler reactivity model, but training data coverage is limited.
State Management
Score: ★★★
- RxJS (Observables) is powerful but complex - agents frequently produce incorrect observable chains, missing unsubscribe calls, and broken pipe sequences.
- Signals reduce complexity but are new and less well-known to LLMs.
- NgRx (Redux-like) is heavyweight and produces verbose, boilerplate-heavy code.
Styling & Design System Integration
Score: ★★★
- Angular Material is mature and well-documented, with reasonable LLM training data.
- Tailwind works but is less idiomatic in Angular than in React.
- PrimeNG, ng-bootstrap, and NG-ZORRO are available.
Testing Framework
Score: ★★★
- Karma/Jasmine (default) is being replaced by Jest/Vitest, but the transition is incomplete.
- Angular's TestBed for component testing is verbose and agents frequently produce incorrect test configurations.
- Playwright for E2E is straightforward.
Build Tooling & Bundle Characteristics
Score: ★★★
- Build tool: Angular CLI (esbuild-based since v17).
- Typical production bundle: 100-300 KB (gzipped) - larger than React, Vue, or Svelte.
- Build time: 10-30 seconds.
- Container image (SSR): 150-350 MB.
- Larger bundles due to the framework's comprehensive runtime.
Routing & Data Fetching
Score: ★★★★
- Angular Router is mature and feature-rich.
- Route guards, resolvers, and lazy loading are well-documented.
- Agents generate basic routing correctly but struggle with complex guard logic.
Overall Agent-Suitability
Score: ★★★
- Estimated first-pass functional success rate: 55-65%.
- Estimated first-pass visual acceptability rate: 45-55%.
- Typical iterations for a standard CRUD application: 4-6.
- Strong compiler but excessive complexity and boilerplate increase iteration cost.
Best Use Cases
- Enterprise applications being retrofitted with Planifest where Angular is already the standard
- Applications requiring Angular-specific features (Angular Material, complex routing, DI)
Avoid If
- Starting greenfield with Planifest (React or Vue are simpler targets for agents)
- Agent iteration cost is the primary concern
Key Risks
- Version confusion: LLMs generate module-based patterns for standalone component targets
- RxJS complexity: Observable chains are a major source of agent-generated bugs
- Boilerplate: Angular's verbosity increases the surface area for agent errors
- Testing verbosity: TestBed configuration errors are common
6. Solid.js + SolidStart
LLM Training Corpus Coverage
Score: ★★
- Very small training corpus. LLMs frequently generate React patterns when targeting Solid (JSX looks similar but reactivity semantics are fundamentally different).
- SolidStart is even less represented.
Compile-Time Error Detection
Score: ★★★
- TypeScript support is solid. JSX type checking works.
- No Solid-specific compile-time analysis beyond TypeScript.
Component Model Clarity
Score: ★★★★
- Fine-grained reactivity with signals - components don't re-render, only specific DOM nodes update.
- Simpler mental model than React for reactivity, but agents must understand that destructuring props breaks reactivity (a common mistake).
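The destructuring pitfall comes down to getters: Solid props behave like live getters, and destructuring reads the value once. A plain-TypeScript analogy (not Solid's actual implementation):

```typescript
// A mutable source and a props-like object exposing it via a live getter.
let name = "Ada";
const props = { get name() { return name; } };

// Destructuring reads the getter once; the copy is frozen here.
const { name: destructured } = props;
name = "Grace";

console.log(props.name);   // -> "Grace" (getter stays live)
console.log(destructured); // -> "Ada"   (stale copy, reactivity lost)
```

An agent that writes the idiomatic React `const { name } = props` therefore produces Solid code that compiles and renders once, then silently stops updating.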
Overall Agent-Suitability
Score: ★★
- Estimated first-pass functional success rate: 35-45%.
- Typical iterations: 5-8.
- The JSX similarity to React causes agents to generate React patterns that silently break in Solid.
Key Risks
- React pattern contamination: Agents generate React hooks syntax in Solid contexts
- Tiny ecosystem: Very few third-party libraries
- Minimal training data: Most LLMs cannot generate idiomatic Solid code reliably
7. Qwik + QwikCity
LLM Training Corpus Coverage
Score: ★
- Extremely small training corpus. LLMs have very limited knowledge of Qwik's resumability model.
Component Model Clarity
Score: ★★
- Qwik's `$` suffix convention (e.g. `component$`, `useSignal$`) is unique and poorly understood by LLMs.
- Resumability (lazy-loading at the component level) is a novel concept that agents cannot reason about from first principles.
Overall Agent-Suitability
Score: ★
- Estimated first-pass functional success rate: 20-30%.
- Typical iterations: 8-12.
- Not viable for agent-generated code with current LLM capabilities.
8. Astro
LLM Training Corpus Coverage
Score: ★★★
- Growing training data, particularly for content-heavy sites.
- Astro's `.astro` component syntax is unique but simple.
Component Model Clarity
Score: ★★★★
- Islands architecture: static HTML by default, interactive "islands" opt-in using React/Vue/Svelte components.
- The model is straightforward - agents generate static templates well.
- Risk: agents sometimes add unnecessary interactivity (React islands) where static HTML would suffice.
Overall Agent-Suitability
Score: ★★★
- Estimated first-pass functional success rate: 55-65% (for content sites).
- Typical iterations: 3-5.
- Strong for content-heavy sites but limited for interactive SPAs.
Best Use Cases
- Documentation sites, blogs, marketing sites
- Content-first applications where interactivity is isolated
Avoid If
- Building interactive SPAs or dashboards
- The application is interaction-heavy rather than content-heavy
9. HTMX + Server-Rendered HTML
LLM Training Corpus Coverage
Score: ★★★
- HTMX is well-represented in recent training data due to its popularity surge.
- The HTML-first model is simple and LLMs generate it competently.
Component Model Clarity
Score: ★★★★★
- No component model - server renders HTML, HTMX handles partial updates via HTML attributes.
- The simplicity is a significant advantage for agents: no virtual DOM, no reactivity system, no state management.
- HTMX attributes (`hx-get`, `hx-post`, `hx-target`, `hx-swap`) are a small, well-defined API.
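The pattern is small enough to sketch end to end. The snippet below is illustrative (the `/tasks/next` endpoint and `renderTaskFragment` helper are hypothetical): the page declares behaviour in `hx-*` attributes, and the server responds with an HTML fragment rather than JSON:

```typescript
// Sketch of the HTMX round trip. The page carries the hx-* attributes;
// the server endpoint returns an HTML fragment that HTMX swaps into the
// target element. Endpoint and helper names are hypothetical.

// What the initial page might contain:
const page = `
  <button hx-get="/tasks/next" hx-target="#task" hx-swap="innerHTML">
    Next task
  </button>
  <div id="task"></div>
`;

// What the /tasks/next endpoint would return -- a fragment, not a page:
function renderTaskFragment(task: { id: number; title: string }): string {
  return `<article class="task" data-id="${task.id}">${task.title}</article>`;
}

const fragment = renderTaskFragment({ id: 7, title: "Review PR" });
// fragment: '<article class="task" data-id="7">Review PR</article>'
```

There is no client-side state to get wrong: the agent only has to produce correct HTML on the server and wire three or four attributes on the client.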
Overall Agent-Suitability
Score: ★★★
- Estimated first-pass functional success rate: 65-75% (for server-rendered pages with partial updates).
- Typical iterations: 2-3.
- High success rate for what it does, but limited to server-rendered patterns.
Best Use Cases
- Server-rendered applications paired with a Go or Python backend
- Applications where client-side JavaScript should be minimal
Avoid If
- Building rich interactive SPAs
- Offline-capable or client-heavy applications
- The confirmed design architecture specifies a React frontend (HTMX is architecturally incompatible)
10. Remix (React Meta-Framework)
LLM Training Corpus Coverage
Score: ★★★
- Moderate training data. Remix was acquired by Shopify and has evolved significantly.
- Remix has since merged into React Router v7, creating some confusion in training data.
Component Model Clarity
Score: ★★★★
- Loader/action pattern for data fetching is explicit and well-defined.
- Less Server/Client Component confusion than Next.js - Remix's model is simpler.
- Progressive enhancement by default.
Overall Agent-Suitability
Score: ★★★
- Estimated first-pass functional success rate: 55-65%.
- Typical iterations: 3-5.
- Simpler mental model than Next.js App Router, but less training data and less agent skill support.
Comparative Analysis
Tier Rankings by Agent Success Rate
| Rank | Framework | First-Pass Functional | First-Pass Visual | Typical Iterations |
|---|---|---|---|---|
| 1 | React 19 + Vite + TS | 70-80% | 55-65% | 2-4 |
| 2 | Next.js 15+ | 60-70% | 55-65% | 3-5 |
| 3 | Vue 3 + Nuxt 3 | 60-70% | 50-60% | 3-5 |
| 4 | HTMX | 65-75% | N/A (server) | 2-3 |
| 5 | Angular 18+ | 55-65% | 45-55% | 4-6 |
| 6 | Astro | 55-65% | 50-60% | 3-5 |
| 7 | SvelteKit (Svelte 5) | 50-60% | 45-55% | 4-6 |
| 8 | Remix | 55-65% | 50-60% | 3-5 |
| 9 | Solid.js | 35-45% | 35-45% | 5-8 |
| 10 | Qwik | 20-30% | 20-30% | 8-12 |
Tier Rankings by Ecosystem Breadth
| Rank | Framework | Third-Party Libraries | UI Component Depth | Agent Skills |
|---|---|---|---|---|
| 1 | React + Vite | ★★★★★ | ★★★★★ | ★★★★★ |
| 2 | Next.js | ★★★★★ | ★★★★★ | ★★★★★ |
| 3 | Angular | ★★★★ | ★★★★ | ★★ |
| 4 | Vue + Nuxt | ★★★★ | ★★★ | ★★ |
| 5 | Remix | ★★★★ | ★★★★ | ★★ |
| 6 | SvelteKit | ★★★ | ★★★ | ★★ |
| 7 | Astro | ★★★ | ★★★ | ★★ |
| 8 | HTMX | ★★ | ★ | ★ |
| 9 | Solid.js | ★★ | ★★ | ★ |
| 10 | Qwik | ★ | ★ | ★ |
Tier Rankings by Bundle / Container Efficiency
| Rank | Framework | Bundle (gzipped) | Container Image | Startup |
|---|---|---|---|---|
| 1 | HTMX | <10 KB (JS) | N/A (backend) | N/A |
| 2 | SvelteKit (static) | 30-80 KB | 10-25 MB | Instant |
| 3 | React + Vite (SPA) | 80-200 KB | 10-30 MB | Instant |
| 4 | Astro | 50-100 KB | 10-25 MB | Instant |
| 5 | Vue + Nuxt (static) | 60-150 KB | 10-25 MB | Instant |
| 6 | Qwik | 30-60 KB | 80-150 MB | 100-500 ms |
| 7 | Solid.js | 40-80 KB | 10-25 MB | Instant |
| 8 | Next.js | 80-250 KB | 100-300 MB | 500 ms-2 s |
| 9 | Angular | 100-300 KB | 150-350 MB | 500 ms-2 s |
| 10 | Remix | 80-200 KB | 100-250 MB | 500 ms-2 s |
Trade-Off Matrix
Agent Success ↔ Bundle Efficiency
┌─────────────────────────────────┐
High │ React+Vite │
Agent │ Next.js Vue+Nuxt │
Success │ Angular │
│ Remix Astro │
│ SvelteKit │
│ │
Low │ Solid.js │
Agent │ Qwik │
Success │ │
└─────────────────────────────────┘
 Large                           Small
 Bundles                         Bundles

Agent Success ↔ Framework Complexity
┌─────────────────────────────────┐
High │ React+Vite HTMX │
Agent │ Vue+Nuxt │
Success │ Next.js │
│ Remix Astro │
│ Angular SvelteKit │
│ │
Low │ Solid.js │
Agent │ Qwik │
Success │ │
└─────────────────────────────────┘
 Simple                          Complex
 Framework                       Framework

The Complexity Cliff - Frontend-Specific
Addy Osmani's research at React Summit quantified the "complexity cliff" for frontend AI code generation:
| Task Complexity | Success Rate (Best Models) | Notes |
|---|---|---|
| Isolated component generation | ~70-80% | Single component, clear props, no routing |
| Page-level composition | ~40-50% | Multiple components, state management, layout |
| Multi-step full-stack tasks | ~25% | Routing, data fetching, state, error handling |
| Framework-specific eval tasks | ~42% | Next.js-specific patterns (SSR, caching, routing) |
The cliff is steepest where:
- State management crosses component boundaries
- Routing and data fetching interact with component lifecycle
- Design taste (spacing, hierarchy, colour) must be applied consistently
- Error handling must cover loading, error, and empty states for every data dependency
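The last point - covering loading, error, and empty states for every data dependency - is one place the type system can push back on the cliff. A sketch (not tied to any particular library) of a discriminated union that makes partial state handling a compile-time error:

```typescript
// A discriminated union forces every data dependency to account for all
// four states. A sketch, not tied to any particular data-fetching library.
type Remote<T> =
  | { status: "loading" }
  | { status: "error"; message: string }
  | { status: "empty" }
  | { status: "ready"; data: T };

function describe<T>(state: Remote<T[]>): string {
  switch (state.status) {
    case "loading": return "Loading…";
    case "error":   return `Failed: ${state.message}`;
    case "empty":   return "Nothing here yet";
    case "ready":   return `${state.data.length} item(s)`;
    // No default: with a declared string return type under strict mode,
    // adding a new status makes tsc flag this switch as non-exhaustive.
  }
}

describe({ status: "ready", data: [1, 2, 3] }); // "3 item(s)"
```

Encoding the states in the type, rather than in ad-hoc `isLoading`/`error` booleans, turns the most common cliff failure into a build error the agent can self-correct from.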
Red Flag Summary
| Framework | TS Types | Compile-Time UI Checks | Silent Failures | A11y Enforcement | Testing Maturity | SDK > 30% | Clear Errors | Stable API | Mature (5yr+) | Agent Skills | Bundle < 300 KB |
|---|---|---|---|---|---|---|---|---|---|---|---|
| React + Vite | ✅ | ⚠️ | ⚠️ | ⚠️ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Next.js | ✅ | ⚠️ | ⚠️ | ⚠️ | ✅ | ✅ | ✅ | ⚠️ | ✅ | ✅ | ✅ |
| Vue + Nuxt | ✅ | ⚠️ | ⚠️ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
| SvelteKit | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ⚠️ | ❌ | ❌ | ✅ |
| Angular | ✅ | ✅ | ⚠️ | ❌ | ⚠️ | ✅ | ⚠️ | ⚠️ | ✅ | ❌ | ⚠️ |
| Solid.js | ✅ | ❌ | ⚠️ | ❌ | ⚠️ | ❌ | ⚠️ | ⚠️ | ❌ | ❌ | ✅ |
| Qwik | ✅ | ⚠️ | ⚠️ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Astro | ✅ | ⚠️ | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| HTMX | ❌ | ❌ | ⚠️ | ❌ | ⚠️ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ |
| Remix | ✅ | ⚠️ | ⚠️ | ⚠️ | ✅ | ✅ | ✅ | ⚠️ | ❌ | ❌ | ✅ |
Legend: ✅ = passes, ⚠️ = conditional/partial, ❌ = fails
Frameworks with red flags:
- Qwik: Immature, tiny training corpus, novel paradigm LLMs cannot handle
- Solid.js: React-pattern contamination causes silent failures
- Angular: Version confusion and RxJS complexity increase iteration cost
- HTMX: Architecturally incompatible with SPA-driven Planifest frontends
Final Recommendations
1. Single Best Framework for Agent-Generated Frontend Applications
React 19 + Vite + TypeScript + Tailwind CSS + shadcn/ui
React wins on the combination that matters most for agent-generated frontend code: the largest LLM training corpus, the broadest component library ecosystem, the deepest agent skill support (Vercel's react-best-practices, Playwright Test Agents, Anthropic's frontend-design skill), and the highest first-pass success rate.
The useEffect footgun and state management complexity are real risks, but they are well-understood and mitigatable through:
- Explicit specification constraints (e.g. "use TanStack Query for all server state, Zustand for client state")
- Vercel's react-best-practices skill loaded into the codegen-agent
- React Compiler (v1.0) eliminating manual memoisation errors
- ESLint rules (`react-hooks/exhaustive-deps`, `jsx-a11y`) catching common issues at lint time
The combination of Tailwind CSS + shadcn/ui is equally critical - it constrains the agent's design vocabulary, producing consistent visual output. Without a specified component library, agents produce inconsistent, "AI slop" UI.
2. Best Framework by Use Case
| Use Case | Recommendation | Runner-Up |
|---|---|---|
| SPA / Dashboard / Internal Tool | React + Vite + TS | Vue 3 + Vite |
| SEO-Critical Application | Next.js 15 | Nuxt 3 |
| Content-Heavy Site (docs, blog) | Astro | Next.js (static) |
| Performance-Critical Widget | SvelteKit | React + Vite |
| Server-Rendered + Minimal JS | HTMX + Go/Python backend | Astro |
| Existing Angular Codebase | Angular 18+ (with Signals) | - |
| Full-Stack React Application | Next.js 15 | Remix |
3. Recommended Planifest Frontend Template Stack
┌──────────────────────────────────────────────────────────┐
│ Frontend Architecture │
├──────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────┐ │
│ │ React 19 + TypeScript (strict mode) │ │
│ │ Build: Vite 6+ │ │
│ │ Styling: Tailwind CSS v4 + shadcn/ui │ │
│ │ State: Zustand (client) + TanStack Query │ │
│ │ (server state) │ │
│ │ Routing: React Router v7 (SPA) or │ │
│ │ Next.js App Router (SSR) │ │
│ │ Forms: React Hook Form + Zod validation │ │
│ │ Animation: Framer Motion │ │
│ └─────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────┐ │
│ │ Testing │ │
│ │ Unit/Component: Vitest + Testing Library │ │
│ │ E2E: Playwright (with Test Agents) │ │
│ │ Visual Regression: Playwright screenshots │ │
│ │ a11y: eslint-plugin-jsx-a11y + axe-core │ │
│ └─────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────┐ │
│ │ Quality Gates │ │
│ │ Lint: ESLint (strict React + a11y rules) │ │
│ │ Format: Prettier │ │
│ │ Types: tsc --noEmit (strict mode) │ │
│ │ Bundle: bundlesize / size-limit │ │
│ │ Agent Skill: vercel react-best-practices │ │
│ └─────────────────────────────────────────────┘ │
│ │
│ Shared contracts: OpenAPI spec (Zod derived from it) │
│ Container: Multi-stage Dockerfile -> Nginx/Caddy │
│ Image size target: < 30 MB │
│ Bundle size target: < 200 KB (gzipped, first load) │
└──────────────────────────────────────────────────────────┘
4. Rationale - Why These Choices
React + Vite as default frontend:
- 70-80% first-pass agent success rate is the highest evaluated for SPAs
- The deepest agent skills ecosystem of any frontend framework
- The broadest third-party component library coverage
- Frontend Zod schemas derived from the OpenAPI spec enforce API contracts regardless of backend language
- LLMs generate React + TypeScript more fluently than any other frontend framework
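To make the OpenAPI-derivation point concrete, the sketch below hand-rolls a check against a minimal, hypothetical schema excerpt. In practice a generator produces Zod schemas from the OpenAPI document rather than code like this; the sketch only shows the contract the frontend enforces:

```typescript
// Illustration only: real projects generate Zod schemas from the OpenAPI
// document with tooling. This hand-rolled check shows the contract -- the
// frontend validates payloads against the same schema the backend publishes.
// The schema below is a hypothetical excerpt, not a real Planifest spec.
const taskSchema = {
  type: "object",
  required: ["id", "title"],
  properties: {
    id: { type: "integer" },
    title: { type: "string" },
    done: { type: "boolean" },
  },
} as const;

function validate(payload: unknown, schema: typeof taskSchema): string[] {
  const errors: string[] = [];
  if (typeof payload !== "object" || payload === null) return ["not an object"];
  const obj = payload as Record<string, unknown>;
  for (const key of schema.required) {
    if (!(key in obj)) errors.push(`missing required field: ${key}`);
  }
  for (const [key, prop] of Object.entries(schema.properties)) {
    if (!(key in obj)) continue;
    const v = obj[key];
    const ok =
      prop.type === "string"  ? typeof v === "string"  :
      prop.type === "boolean" ? typeof v === "boolean" :
      /* integer */             typeof v === "number" && Number.isInteger(v);
    if (!ok) errors.push(`${key}: expected ${prop.type}`);
  }
  return errors;
}

validate({ id: 1, title: "Ship it" }, taskSchema); // []
validate({ id: "1" }, taskSchema);                 // two errors
```

Because the schema is derived rather than hand-written, an agent cannot drift the frontend's validation away from the published API contract.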
Tailwind CSS + shadcn/ui for styling:
- Constrains the agent's design vocabulary, producing consistent visual output
- shadcn/ui provides accessible, well-tested primitives (built on Radix UI)
- Both are over-represented in LLM training data, maximising first-pass correctness
- Utility classes are more deterministic for agents than CSS-in-JS or custom CSS
TanStack Query for server state, Zustand for client state:
- Eliminates the "Redux vs. Context vs. hooks" decision from agent-generated code
- TanStack Query handles loading/error/data states, caching, and refetching - patterns agents frequently get wrong when implementing manually
- Zustand is minimal and produces less boilerplate for agents to get wrong
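A toy sketch of one behaviour at stake - request deduplication - shows what agents must otherwise hand-roll correctly in `useEffect` code. This is not TanStack Query's API, just the underlying idea:

```typescript
// Toy illustration of request deduplication -- one of the behaviours a
// query cache provides for free. NOT TanStack Query's API.
const inflight = new Map<string, Promise<unknown>>();
let fetchCount = 0;

async function fakeFetch(key: string): Promise<string> {
  fetchCount++; // stand-in for a real network request
  return `data for ${key}`;
}

function query<T>(key: string, fn: (k: string) => Promise<T>): Promise<T> {
  const hit = inflight.get(key);
  if (hit) return hit as Promise<T>; // concurrent callers share one request
  const p = fn(key);
  inflight.set(key, p);
  return p;
}

// Two components mounting at once and asking for the same key:
const p1 = query("tasks", fakeFetch);
const p2 = query("tasks", fakeFetch);
// fetchCount === 1 and p1 === p2: one request, shared by both callers.
```

Agent-generated `useEffect` fetching typically fires one request per component and misses cancellation and caching entirely; delegating the lifecycle to the library removes that whole error class.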
Vitest + Playwright for testing:
- Vitest is Vite-native, fast, and Jest-API-compatible (maximising training data coverage)
- Playwright's Test Agents (planner, generator, healer) are purpose-built for LLM-driven test creation
- Playwright Browser Mode in Vitest enables component testing in real browsers
Next.js for SSR use cases:
- When the specification requires SSR/SSG, Next.js is the only React meta-framework with deep agent skill support
- The Server/Client Component boundary adds iteration cost; use only when SSR is genuinely required
5. Trade-Offs
| Choice | You Gain | You Lose |
|---|---|---|
| React + Vite as default | Best agent success rate, deepest ecosystem, most agent skills | No SSR (use Next.js when needed), larger runtime than Svelte |
| Tailwind + shadcn/ui | Consistent visual output, accessible primitives | Design differentiation requires explicit spec constraints |
| Zustand + TanStack Query | Clear state management boundaries, reduced boilerplate | Two libraries to learn; less flexibility than raw Context |
| Vitest + Playwright | Fast feedback loops, real-browser testing, LLM-native test agents | More tooling to configure than a single Jest setup |
| Next.js for SSR | SEO, streaming, ISR, server components | Larger container, more complex mental model, more agent iterations |
6. Strategies for Minimising Agent Error Rate
Based on the research, the following strategies have the largest impact on reducing first-pass failure rates for agent-generated frontend code:
Specification-Level:
- Explicitly name every library in the spec (React, Vite, Tailwind, shadcn/ui, Zustand, TanStack Query, React Hook Form, Zod)
- Specify the component library (shadcn/ui) and constrain the agent to use it for all UI primitives
- Define content density ("minimal landing page" vs. "data-dense dashboard") to prevent spacing defaults
- Specify responsive breakpoints and collapse behaviour
- Require accessibility: semantic landmarks, skip links, ARIA labels, safe contrast ratios
Agent-Level:
- Load Vercel's react-best-practices skill into the codegen-agent context
- Load Playwright Test Agents for automated E2E test generation
- Enforce incremental generation: scaffold structure first, then implement page by page
- Use a generator/evaluator pattern: one agent generates, a second reviews for edge cases, security, and style
Template-Level:
- Provide a stamped template repo with Vite, TypeScript (strict mode), Tailwind, shadcn/ui, Zustand, TanStack Query, Vitest, and Playwright pre-configured
- Include ESLint configuration with `react-hooks/exhaustive-deps`, `jsx-a11y`, and a `no-any` rule
- Include a `size-limit` configuration enforcing bundle budgets
- Pre-install shadcn/ui components that the agent can import rather than generate from scratch
Validation-Level:
- Type checking (`tsc --noEmit`) - catches ~30% of agent-generated bugs
- ESLint (with strict React + a11y rules) - catches ~15% of remaining issues
- Vitest unit/component tests - catches functional regressions
- Playwright E2E tests - catches integration failures
- Playwright screenshot comparison - catches visual regressions (the "does it look right?" dimension)
- Lighthouse CI - catches performance regressions, a11y violations, SEO issues
Answers to Success Criteria
Which framework produces the lowest agent error rate for frontend code? React 19 + Vite + TypeScript. The combination of the deepest training corpus, broadest ecosystem, and most mature agent skills produces the highest first-pass success rate (70-80% functional, 55-65% visual).
Which framework has the best error messages for LLM iteration? SvelteKit. Svelte's compiler errors are the clearest of any frontend framework - one error at a time, actionable, tied to source. However, React + Vite is close behind and the larger training corpus means fewer errors in the first place.
Which framework has the best component library ecosystem? React. No other framework has a comparable depth of UI component libraries, charting libraries, form libraries, and design system implementations.
Which framework produces the best visual output from agents? React + Tailwind CSS + shadcn/ui. The constrained design vocabulary and over-representation in training data produce the most visually acceptable agent-generated UI. Gemini 3 Pro specifically leads Web Dev Arena scores for frontend aesthetics.
Which framework has the best agent skill support? React. Vercel's react-best-practices (58+ rules), web-design-guidelines (100+ rules), Playwright Test Agents, and Anthropic's frontend-design skill create an unmatched context engineering ecosystem.
Which would you choose for a data-dense dashboard? React + Vite + TanStack Query + Recharts + shadcn/ui. Maximum component library coverage for tables, charts, forms, and data visualisation.
Which would you choose for an SEO-critical marketing site? Next.js 15 for SSR/SSG with static export where possible. Astro if JavaScript interactivity is minimal.
Which would you choose for an embedded widget with strict bundle budgets? SvelteKit. Smallest bundle size of any major framework (30-80 KB gzipped).
For a Planifest-managed application built entirely from agent-generated code, which would you choose? React 19 + Vite + TypeScript for the frontend, with Tailwind CSS + shadcn/ui as the design system, TanStack Query + Zustand for state management, and Vitest + Playwright for testing. This stack optimises for the metric that matters most in Planifest's context: correct code on the first pass, with the fewest agent iterations.
Implications for Planifest
The current Planifest architecture specifies React 18+ with TypeScript, Vite, TailwindCSS for the frontend. This evaluation confirms this is the optimal choice for agent-generated code, with the following refinements:
- Upgrade target to React 19 - React Compiler eliminates manual memoisation errors; a common agent footgun removed
- Specify shadcn/ui as the component library - constrains design vocabulary and provides accessible primitives; this is not currently specified and its absence increases visual inconsistency in agent output
- Specify TanStack Query for server state and Zustand for client state - eliminates state management decision fatigue for the codegen-agent
- Specify React Hook Form + Zod for form handling - Zod schemas on the frontend are derived from the OpenAPI spec, so form validation contracts hold regardless of backend language; the backend validates against the same OpenAPI spec using its own language-native library (e.g. Pydantic for Python, Go struct validation, Rust serde)
- Load Vercel's react-best-practices skill into the codegen-agent - 58+ rules optimised for LLM consumption, covering the most impactful performance patterns
- Use Playwright Test Agents for E2E test generation - purpose-built for LLM-driven test creation with plan/generate/heal workflow
- Enforce strict TypeScript (`strict: true`, `noUncheckedIndexedAccess`, `no-any` via ESLint) - mitigates the type system's weaknesses
- Add bundle size budgets via `size-limit` - prevents agent-generated code from silently bloating the bundle
- For SSR use cases only: adopt Next.js 15 as the meta-framework, accepting the 10-15% increase in iteration cost for the Server/Client Component boundary
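Assuming these settings land in the template repo's tsconfig, a minimal fragment matching the strictness bullet would look like the following; note the `no-any` constraint lives in ESLint (e.g. `@typescript-eslint/no-explicit-any`), not in the tsconfig:

```json
{
  "compilerOptions": {
    "strict": true,
    "noUncheckedIndexedAccess": true,
    "noUnusedLocals": true,
    "noEmit": true
  }
}
```

Running `tsc --noEmit` against this configuration is the first quality gate in the pipeline; the ESLint and bundle-budget gates layer on top.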
The OpenAPI spec remains the language-agnostic contract between frontend and backend (as established in p013). On the frontend, Zod schemas are derived from the OpenAPI definition, giving the codegen-agent type-safe validation without assuming the backend is also TypeScript. When the backend is TypeScript, Zod schemas can be shared directly - but this is an optimisation, not a requirement. The frontend stack requires no polyglot consideration - React + TypeScript is the clear winner on every agent-suitability metric regardless of backend language choice.