AI development toolkit

For the last few months I’ve been pouring most of my building time into a skills-based AI software development toolkit we built at Sancrisoft and use day-to-day with our clients.

The short version: it gives an AI agent — Claude, primarily, but anything that speaks the open agent-skills spec — a fixed set of slash-command skills and six agent personas, and walks it through the full lifecycle from a fuzzy idea to a tested release. Between phases there are mandatory human-approval checkpoints, because the point isn’t to replace the engineer; it’s to give the engineer leverage they didn’t have before.

Why we built it

We kept hitting the same failure mode. A teammate would hand Claude a vague feature ask, get back a confident-looking implementation in twenty minutes, and discover during review that the agent had quietly skipped half the requirements, invented an API endpoint that didn’t exist, or built something that didn’t compose with the rest of the codebase.

The underlying issue was always the same: too much asked of the agent in a single shot, with no structure for “here is how a feature actually moves through this team.”

The toolkit is our answer. It’s opinionated about the workflow, not about the code.

The three phases

Every feature moves through three phases, each one closing with a human review:

  1. Spec: Product Manager + Architect + QA collaborate on a real spec, an architecture doc with ADRs, and a risk register. The agents actually talk to each other and resolve their own doubts before coming back to me.
  2. Build: A Tech Lead plans the work, then Developer subagents implement in parallel where it makes sense. Small features collapse to a single subagent; medium features fan out across dependency layers. Each subagent works through a TDD red-green-refactor loop.
  3. Test: A QA Engineer reads the risk register from the spec phase and categorizes the risks, and I pick what's worth manually verifying. Findings persist between runs and sync to GitHub Issues.
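To make the checkpoint idea concrete, here is a minimal sketch of a three-phase pipeline with a human gate after each phase. It is not the toolkit's actual code: `Phase`, `run_pipeline`, the `approve` callback, and the artifact keys are all hypothetical names chosen for illustration, and the phase bodies are stand-ins for real agent work.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Phase:
    name: str
    # Takes the artifacts approved so far, returns the artifacts this phase produces.
    run: Callable[[dict], dict]


def run_pipeline(phases, approve, artifacts=None):
    """Run phases in order and stop at the first one the human rejects."""
    artifacts = dict(artifacts or {})
    completed = []
    for phase in phases:
        produced = phase.run(artifacts)
        # Human checkpoint: nothing is merged until a person signs off.
        if not approve(phase.name, produced):
            return completed, artifacts
        artifacts.update(produced)
        completed.append(phase.name)
    return completed, artifacts


# Stand-in phases (assumed for this example); real ones would call agents.
PHASES = [
    Phase("spec", lambda a: {"spec": "draft spec", "risk_register": []}),
    Phase("build", lambda a: {"code": f"implements {a['spec']}"}),
    Phase("test", lambda a: {"findings": []}),
]

if __name__ == "__main__":
    # Approve everything except the build output.
    done, arts = run_pipeline(PHASES, lambda name, out: name != "build")
    print(done, sorted(arts))
```

The design choice worth noticing is that a rejected phase leaves no trace in the shared artifacts, so the next attempt starts from the last approved state.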

The risk register is the load-bearing piece. It’s the bridge that makes prioritization possible at the test step, instead of the QA agent re-deriving “what matters” from the code every time.
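As an illustration of why a register makes prioritization cheap, here is a sketch of what one could look like. The post does not describe the real schema, so the `Risk` fields, the 1-to-3 scales, the likelihood-times-impact score, and the thresholds are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    id: str
    description: str
    likelihood: int  # assumed scale: 1 (rare) to 3 (likely)
    impact: int      # assumed scale: 1 (minor) to 3 (severe)

    @property
    def score(self) -> int:
        return self.likelihood * self.impact


def categorize(risks, high=6, medium=3):
    """Bucket risks by score, highest first within each bucket."""
    buckets = {"high": [], "medium": [], "low": []}
    for risk in sorted(risks, key=lambda r: r.score, reverse=True):
        if risk.score >= high:
            buckets["high"].append(risk)
        elif risk.score >= medium:
            buckets["medium"].append(risk)
        else:
            buckets["low"].append(risk)
    return buckets


if __name__ == "__main__":
    register = [
        Risk("R3", "Tooltip copy is outdated", 1, 1),
        Risk("R1", "Payment webhook retried twice", 3, 3),
        Risk("R2", "Slow query on large accounts", 2, 2),
    ]
    for level, items in categorize(register).items():
        print(level, [r.id for r in items])
```

With something like this written down during the spec phase, the test phase only has to sort and present the buckets; the human then picks which items deserve manual verification.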

The six personas

| Persona | Role |
| --- | --- |
| Product Manager | Defines the feature, owns requirements |
| Architect | Designs the technical approach, writes ADRs |
| Tech Lead | Plans the build, delegates to developers |
| Developer | Implements the code |
| Reviewer | Reviews diffs with specialized lenses (security, performance, accessibility, architecture, data-integrity) |
| QA Engineer | Risk register, test planning, release recommendation |

Each persona has the same “Sancrisoft DNA” baked into its prompt — client orientation, initiative, craftsmanship, teamwork, commitment — so even when the agents disagree, they argue inside the same value system.
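One way to picture a shared value system across personas is a common preamble that every persona prompt is composed from. The sketch below is only that: the prompt wording, `SHARED_VALUES`, `ROLES`, and `build_prompt` are hypothetical, and the real prompts are certainly richer.

```python
# The five values come from the post; everything else here is illustrative.
SHARED_VALUES = (
    "client orientation",
    "initiative",
    "craftsmanship",
    "teamwork",
    "commitment",
)

ROLES = {
    "Product Manager": "Defines the feature, owns requirements",
    "Architect": "Designs the technical approach, writes ADRs",
    "Tech Lead": "Plans the build, delegates to developers",
    "Developer": "Implements the code",
    "Reviewer": "Reviews diffs with specialized lenses",
    "QA Engineer": "Risk register, test planning, release recommendation",
}


def build_prompt(persona: str) -> str:
    """Compose a persona prompt from its role plus the shared values."""
    values = ", ".join(SHARED_VALUES)
    return (
        f"You are the {persona}. Role: {ROLES[persona]}.\n"
        f"Shared values: {values}."
    )


if __name__ == "__main__":
    print(build_prompt("Architect"))
```

Keeping the values in one place means a change to them reaches all six personas at once, which is what lets the agents disagree on substance while sharing the same priorities.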

What’s working, what isn’t

What’s working:

What’s not (yet):

If you’re working in this space and want to compare notes, email me at hola at juango.nz.


Written by Juan & Claude.