
From Prompt to Passing Test: A Complete Agentic QA Session

Watch an AI agent go from a single prompt to page objects, factories, and passing Playwright tests

Published
• 7 min read

In the first article, we set up a project scaffold designed for AI. But a good structure only gets you so far if the AI is just a code suggester: useful, but not transformative. You still have to know what to ask, verify what it wrote, adapt it to your project, and repeat for every file.

In the second article, we saw what makes an AI agent different from a chatbot. It reads your code, takes actions, and works inside your project. But here's the catch: an agent is only as good as the instructions it follows.

In the third article, we saw how CLAUDE.md gives the agent its rules and workflow. But rules without depth only get you so far. "Use the Page Object Model" is a rule, but how exactly do you structure a page object? What's the difference between a locator getter and an action method? How do you compose components into page objects?

In the fourth article, we gave the agent deep expertise through skill files. Now it knows how to build page objects, selectors, and fixtures. But there's still a gap: the agent has never seen your application.

Everything in this series has been building toward this moment. You have a scaffold with clear structure. An AI agent that reads your rules. Skills that give it deep expertise. And, from the fifth article, a browser tool that lets it see your application.


Now let's watch it work.


🎬 The Scenario

Your team has just shipped a user registration feature. You need:

  • A page object for the registration form

  • A Faker factory for user data

  • A smoke test for the happy path

  • A regression test for validation errors

Before agentic QA, this would be half a day of boilerplate: inspecting the DOM, writing locators, setting up factories, wiring fixtures, writing tests, debugging failures.

With an agent, your entire input is this:

"The app has a new user registration page at /register. Create a page object, a user factory, a smoke test, and regression tests for validation errors".


🔄 What the Agent Does: Step by Step

Step 1: Read the rules

The agent opens CLAUDE.md first. It notes the mandatory rules, the workflow, and the skills it needs to load. For this task: playwright-cli, selectors, page-objects, test-standards, and data-strategy.

Step 2: Explore the application

Agent → playwright-cli:
"Navigate to /register. List all form fields (role, label, placeholder),
buttons (role, name), headings, and any validation messages visible."

Discovery result:

Heading: "Create your account" (role: heading, level: 1)
Input: label "First name" - required
Input: label "Last name" - required
Input: label "Email address" - required, type: email
Input: label "Password" - required, type: password, placeholder: "At least 8 characters"
Input: label "Confirm password" - required, type: password
Button: "Create account" (role: button)
Link: "Already have an account? Sign in" (role: link)

Step 3: Check existing patterns

The agent reads pages/app/ to see how existing page objects are structured. It reads test-data/factories/app/ to see how existing factories are formatted. It confirms the import paths and naming conventions.

Step 4: Generate the page object

// pages/app/registration.page.ts
import { Page } from "@playwright/test";

// Shape of the data the register() action expects.
export interface RegistrationData {
  firstName: string;
  lastName: string;
  email: string;
  password: string;
}

export class RegistrationPage {
  constructor(private readonly page: Page) {}

  get firstNameInput() {
    return this.page.getByLabel("First name");
  }

  get lastNameInput() {
    return this.page.getByLabel("Last name");
  }

  get emailInput() {
    return this.page.getByLabel("Email address");
  }

  get passwordInput() {
    // exact: true prevents also matching the "Confirm password" label.
    return this.page.getByLabel("Password", { exact: true });
  }

  get confirmPasswordInput() {
    return this.page.getByLabel("Confirm password");
  }

  get createAccountButton() {
    return this.page.getByRole("button", { name: "Create account" });
  }

  get firstNameError() {
    return this.page.getByText("First name is required");
  }

  async register(user: RegistrationData) {
    await this.firstNameInput.fill(user.firstName);
    await this.lastNameInput.fill(user.lastName);
    await this.emailInput.fill(user.email);
    await this.passwordInput.fill(user.password);
    await this.confirmPasswordInput.fill(user.password);
    await this.createAccountButton.click();
  }

  async goto() {
    await this.page.goto("/register");
  }
}

Step 5: Register the page object

The agent adds registrationPage to fixtures/pom/page-object-fixture.ts so tests can access it via dependency injection.

Step 6: Generate the factory

// test-data/factories/app/user.factory.ts
import { faker } from "@faker-js/faker";

export function createUser() {
  const firstName = faker.person.firstName();
  const lastName = faker.person.lastName();
  return {
    firstName,
    lastName,
    email: faker.internet.email({ firstName, lastName }),
    password: faker.internet.password({ length: 12, memorable: false }),
  };
}
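Factories like this pair well with partial overrides when a test needs to pin a specific value (for example, a known-duplicate email). The sketch below is a generic pattern, not part of the scaffold; the names are illustrative:

```typescript
// Hypothetical override helper — not from the scaffold.
type UserData = {
  firstName: string;
  lastName: string;
  email: string;
  password: string;
};

function createUserWith(base: UserData, overrides: Partial<UserData> = {}): UserData {
  // Spread overrides last so callers can pin any field deterministically
  // while the rest stays randomly generated.
  return { ...base, ...overrides };
}

// Usage: pin the email for a duplicate-registration test.
const base: UserData = {
  firstName: "Ada",
  lastName: "Lovelace",
  email: "ada@example.com",
  password: "s3cretPass!12",
};
const dup = createUserWith(base, { email: "taken@example.com" });
```

In practice the `base` object would come from `createUser()`; it is written out here only to keep the sketch self-contained.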

Step 7: Generate the smoke test

// tests/app/functional/registration.spec.ts
import { test, expect } from "fixtures/pom/test-options";
import { createUser } from "test-data/factories/app/user.factory";

test(
  "new user can register successfully",
  { tag: "@smoke" },
  async ({ registrationPage, dashboardPage }) => {
    const user = createUser();

    await test.step("Given the user is on the registration page", async () => {
      await registrationPage.goto();
    });

    await test.step("When they fill in valid details and submit", async () => {
      await registrationPage.register(user);
    });

    await test.step("Then they are redirected to the dashboard", async () => {
      await expect(dashboardPage.welcomeHeading).toBeVisible();
    });
  },
);

Step 8: Generate the regression tests

import validationCases from "test-data/static/app/registrationErrors.json";

for (const { field, value, error } of validationCases) {
  test(
    `shows error when ${field} is invalid`,
    { tag: "@regression" },
    async ({ registrationPage }) => {
      await registrationPage.goto();
      await registrationPage.submitWithInvalid(field, value);
      await expect(registrationPage.errorFor(field)).toHaveText(error);
    },
  );
}

Step 9: Run the tests

npx playwright test tests/app/functional/registration.spec.ts
Running 4 tests using 1 worker
  ✓ new user can register successfully (2.3s)
  ✓ shows error when email is invalid (1.1s)
  ✓ shows error when password is too short (1.0s)
  ✓ shows error when first name is empty (0.9s)

  4 passed (6.2s)

The agent reports back: all tests pass. Task complete.


👀 What You Review

You didn't write any of that. But your job isn't done. You review it.

What you're looking for:

  • Do the locators match what you see in the app? The agent used real discovery, but you know your application.

  • Is the test actually testing the right thing? The happy path asserts the dashboard heading. Is that the right success indicator?

  • Are the edge cases covered? The regression tests came from a static JSON file. Did the agent create sensible validation cases?

  • Does it fit the codebase style? Compare against existing tests. Does this look like it belongs?

This review takes 5-10 minutes. Writing everything from scratch would have taken half a day.


🌱 Growing the Framework With AI

This workflow doesn't just apply to new features. The same pattern works for:

  • Refactoring. "The navigation component was moved to a sidebar. Update the relevant page objects".

  • New API endpoints. "The /users endpoint now returns a role field. Update the schema and any affected tests".

  • Cleanup. "There are three page objects with duplicate navigation methods. Extract them into a shared component".

The agent reads the current state of the codebase, makes targeted changes, runs the affected tests, and confirms nothing broke. You review the diff.

Over time, your role becomes less about writing tests and more about defining what should be tested. The thinking part of QA, not the typing part.


🎯 The Bigger Picture

The scaffold, CLAUDE.md, the skills, the explore-first workflow: none of it is sorcery. It's just a well-designed system that makes it easy for an agent to do the right thing.

The insight at the heart of agentic QA is simple: AI is most useful when it has clear constraints. A blank slate produces inconsistent results. A scaffold with rules, skills, and a workflow produces output you can trust.

You're not replacing the QA engineer. You're giving the QA engineer a tireless, fast, rule-following colleague who never complains about writing boilerplate.


🚀 Get Started

You have complete instructions to get started with AI-assisted development.

You can find the Public README.md file for the scaffold on GitHub: Playwright Scaffold

You can get access to the private GitHub repository here: Get Access


๐Ÿ™๐Ÿป Thank you for reading this series! If you've made it to the end, you now have a complete picture of how agentic QA works, from the scaffold that makes it possible to the moment the tests go green. The tools are here, the patterns are proven, and the only thing left is to start building.

If this series helped you, I'd love to hear about it. See you in the community.

Every coffee you buy ☕ directly contributes to keeping this resource open and growing for everyone.

Buy me a coffee
