Delible
Describe a screen; get four divergent, hand-drawn takes on it — pictured: the real picker, one prompt solved four ways.
- My role
- Founder & sole builder — product strategy, the design system, the AI generation pipeline, the web app, and the MCP server
- Context
- A standalone product that began as an internal tool ("Nivoda Sketch") built to get a B2B marketplace's PMs off Lovable
- Duration
- Internal v1 to shipped SaaS — live at app.delible.dev
- Stack
- Next.js 15 · React 19 · Supabase · Claude API · rough.js + blackchalk · MCP server · Stripe
- What shipped
- Live web app · open-source component library (~70 components) · 4-tool MCP server · permalink publishing · markdown + PDF brief export · eval harness
The problem
Polish arrives before the idea is ready
The dominant way teams prototype now is to describe a screen to an AI and get back something that looks finished. That feels like progress. It isn't — it's the most expensive failure mode in early product work, and it shows up two ways:
- Premature polish skews the feedback. When a prototype looks done, stakeholders critique the gradient, not the flow. A polished mock whispers "this is decided" whether or not it is — so the structural questions that actually matter early never get asked.
- Context is disposable. The incumbent (Lovable) spins up a working prototype fast, then discards the why the moment you export. Engineering inherits code without story; design inherits a screen without intent. Every handoff is a context reset.
The deeper issue is that fidelity and confidence have come uncoupled. Confidence in a solution builds slowly, through feedback, iteration, and killed ideas, but AI tools jump straight to maximum fidelity on the first prompt. The gap between the two is the premature-polish tax.
The thesis
Make polish impossible, not discouraged
Every other lo-fi tool relies on convention — a style guide, a reviewer's restraint, a "please keep it rough." Conventions leak. So I built the constraint into the rendering layer: there is no path by which Delible produces something that looks finished.
Everything renders hand-drawn (rough.js wobble, no straight machine edges), strictly monochrome, with border-radius, box-shadow, gradients, and decorative transitions banned at the system level. You can't polish your way out of the sketch — so the conversation stays on structure, flow, and hierarchy, which is exactly where early-stage work needs it.
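One way to picture a system-level ban is a check that rejects any style using a forbidden property. The sketch below is illustrative only: `BANNED_PROPERTIES`, `findPolish`, and the style-object shape are assumed names, not Delible's actual code, and it bans every `transition` rather than only decorative ones.

```typescript
// Hypothetical banned-polish check; names and shapes are illustrative assumptions.
type StyleObject = Record<string, string | number>;

// Properties the text describes as banned at the system level.
// Simplification: all transitions are rejected, not only decorative ones.
const BANNED_PROPERTIES: readonly string[] = [
  "borderRadius",
  "boxShadow",
  "transition",
];

// Gradients arrive as values rather than property names, so match them separately.
const GRADIENT_PATTERN = /gradient\(/i;

function findPolish(style: StyleObject): string[] {
  const violations: string[] = [];
  for (const [property, value] of Object.entries(style)) {
    if (BANNED_PROPERTIES.includes(property)) {
      violations.push(`${property} is banned`);
    }
    if (typeof value === "string" && GRADIENT_PATTERN.test(value)) {
      violations.push(`${property} uses a gradient`);
    }
  }
  return violations;
}
```

A caller would treat any non-empty result as a hard failure, which is what separates a ban from a convention.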
That's the wedge. But the constraint isn't the product — it's the price of admission. The real product is the layer underneath: while you sketch, Delible quietly builds the brief — a versioned record of decisions, rationale, and open questions — and mints a shareable permalink that outlives the AI session. The loop is four moves: type → see → steer → hand off. Engineering gets the story, not a context reset.
How I built it
Four layers, constraint to handoff
1 · A design system that can't be polished
The aesthetic is the enforcement mechanism, so it had to be a real system, not a CSS theme. I built blackchalk — a monochrome, rough.js-based React kit of ~70 components — and published it open-source. Stroke weight is governed by exactly two tokens (surfaces recede, interactive elements come forward); colour is greyscale-only; the banned-polish list lives in code, not a wiki.
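As a rough illustration of the two-token stroke rule and the greyscale-only rule, assuming hypothetical token names and values (blackchalk's real tokens are not shown in this write-up):

```typescript
// Hypothetical token sketch; the names and numeric values are assumptions for illustration.
const strokeTokens = {
  surface: 1, // containers and backgrounds recede
  interactive: 2, // buttons, inputs, and links come forward
} as const;

type StrokeRole = keyof typeof strokeTokens;

function strokeFor(role: StrokeRole): number {
  return strokeTokens[role];
}

// A hex colour is greyscale when its red, green, and blue channels are equal.
function isGreyscale(hex: string): boolean {
  const match = /^#([0-9a-f]{3}|[0-9a-f]{6})$/i.exec(hex.trim());
  if (!match) return false; // named colours and other formats are rejected outright
  const digits = (match[1] ?? "").toLowerCase();
  if (digits.length === 3) {
    return digits[0] === digits[1] && digits[1] === digits[2];
  }
  const r = digits.slice(0, 2);
  const g = digits.slice(2, 4);
  const b = digits.slice(4, 6);
  return r === g && g === b;
}
```

Keeping the role type as a two-member union means a third stroke weight cannot be introduced without changing the token definition itself.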
2 · AI generation you can trust
- Rules as the system prompt: the canonical design rules (RULES.md) are the single source of truth and ship into the generator's prompt, giving one rulebook for humans, the web app, and the MCP server.
- A validator as a guardrail: generated JSX is checked against the same rules production enforces, so off-system output never reaches the canvas.
- An eval harness, not vibes: a fixed prompt set across core archetypes, stretch domains, and edge cases runs offline and scores pass/fail against the validator — currently 42/43 (97.7%).
- Divergent options, then refine: a run returns several genuinely different takes on the one screen (up to four, A–D), not a single answer — then you iterate the keepers down to a direction. Compare-and-choose is itself a fidelity guardrail.
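The validator-backed eval loop described above can be sketched as follows. This is a minimal, assumed shape: `EvalCase`, `runEvals`, and the synchronous `generate` and `validate` callbacks are hypothetical stand-ins, and a real generator calling the Claude API would be asynchronous.

```typescript
// Hypothetical eval harness; the generator and validator are stand-ins passed in by the caller.
interface EvalCase {
  id: string;
  prompt: string;
  group: "archetype" | "stretch" | "edge";
}

interface EvalReport {
  passed: number;
  total: number;
  percent: string; // pass rate formatted to one decimal place
  failures: string[]; // ids of cases whose output failed validation
}

function runEvals(
  cases: EvalCase[],
  generate: (prompt: string) => string,
  validate: (jsx: string) => boolean,
): EvalReport {
  const failures: string[] = [];
  for (const evalCase of cases) {
    const jsx = generate(evalCase.prompt);
    if (!validate(jsx)) failures.push(evalCase.id);
  }
  const total = cases.length;
  const passed = total - failures.length;
  const percent = total === 0 ? "0.0" : ((passed / total) * 100).toFixed(1);
  return { passed, total, percent, failures };
}
```

Because the score is pass/fail against the validator rather than a subjective rating, the same fixed prompt set can be re-run after any prompt or rule change and compared directly.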
3 · The context layer — the actual moat
Anyone can clone a hand-drawn look in a weekend; rough.js is open. What's hard to copy is the data layer. As you iterate, Delible captures a versioned markdown brief — problem, decisions, rationale, open questions — exportable as markdown and PDF, and publishes a permalink (a real /api/publish endpoint minting a public slug) that lives in Delible's own database. The sketch escapes the gravity of any single AI session: stakeholders who never opened the tool can view it, and a future session can reopen it with context intact.
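To make the context layer concrete, here is a minimal sketch of an append-only brief with markdown export and slug minting. Only the four brief sections come from the description above; the type names, function names, and slug format are assumptions, and the real /api/publish endpoint and database schema are not shown here.

```typescript
// Hypothetical brief model; everything except the four section names is an assumption.
interface BriefVersion {
  version: number;
  problem: string;
  decisions: string[];
  rationale: string[];
  openQuestions: string[];
}

// Append a new version instead of editing the previous one, so history is preserved.
function addVersion(
  history: readonly BriefVersion[],
  changes: Partial<Omit<BriefVersion, "version">>,
): BriefVersion[] {
  const previous = history.length > 0 ? history[history.length - 1] : undefined;
  const base: Omit<BriefVersion, "version"> = previous ?? {
    problem: "",
    decisions: [],
    rationale: [],
    openQuestions: [],
  };
  const next: BriefVersion = { ...base, ...changes, version: history.length + 1 };
  return [...history, next];
}

function toMarkdown(brief: BriefVersion): string {
  const list = (items: string[]): string => items.map((item) => `- ${item}`).join("\n");
  return [
    `# Brief v${brief.version}`,
    `## Problem\n${brief.problem}`,
    `## Decisions\n${list(brief.decisions)}`,
    `## Rationale\n${list(brief.rationale)}`,
    `## Open questions\n${list(brief.openQuestions)}`,
  ].join("\n\n");
}

// Mint a public slug from a title plus a caller-supplied unique suffix.
function mintSlug(title: string, suffix: string): string {
  const base = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `${base || "sketch"}-${suffix}`;
}
```

In this sketch the suffix is supplied by the caller, so uniqueness would be the database's job rather than the function's.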
4 · Two surfaces, one sovereign data layer
Delible ships on two surfaces that share one backend. The web app (app.delible.dev, with Google sign-in and Stripe billing) is the durable home where accounts, data, and permalinks live. The MCP server — four production tools, create_sketch / iterate_sketch / list_sketches / publish_sketch — puts the same product inside Claude as a distribution channel. The data layer stays Delible-owned regardless of surface, so platform risk never touches the moat.
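The idea of two surfaces over one data layer can be illustrated with a plain dispatcher. Only the four tool names come from the text; the `Sketch` shape, the argument keys, the in-memory `SketchStore`, and the slug format are hypothetical, and this deliberately does not use the real MCP SDK.

```typescript
// Hypothetical dispatcher: only the four tool names come from the text.
type ToolName = "create_sketch" | "iterate_sketch" | "list_sketches" | "publish_sketch";

interface Sketch {
  id: string;
  prompt: string;
  revisions: string[];
  slug?: string;
}

// In-memory stand-in for the shared backend both surfaces would talk to.
class SketchStore {
  private sketches = new Map<string, Sketch>();
  private nextId = 1;

  create(prompt: string): Sketch {
    const sketch: Sketch = { id: `sk_${this.nextId++}`, prompt, revisions: [] };
    this.sketches.set(sketch.id, sketch);
    return sketch;
  }

  get(id: string): Sketch {
    const sketch = this.sketches.get(id);
    if (!sketch) throw new Error(`Unknown sketch: ${id}`);
    return sketch;
  }

  list(): Sketch[] {
    return Array.from(this.sketches.values());
  }
}

function callTool(store: SketchStore, name: ToolName, args: Record<string, string>): unknown {
  switch (name) {
    case "create_sketch":
      return store.create(args["prompt"] ?? "");
    case "iterate_sketch": {
      const sketch = store.get(args["id"] ?? "");
      sketch.revisions.push(args["feedback"] ?? "");
      return sketch;
    }
    case "list_sketches":
      return store.list();
    case "publish_sketch": {
      const sketch = store.get(args["id"] ?? "");
      sketch.slug = sketch.slug ?? `s-${sketch.id}`; // publishing twice keeps the same slug
      return { slug: sketch.slug };
    }
  }
}
```

A web route handler and an MCP tool handler could both call into the same store, which is the sense in which the surface is interchangeable and the data layer is not.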
Judgment
The decisions that define it
A product like this is a series of opinionated calls — and the opinion is the artifact:
- Monochrome, enforced by CI. Greyscale-only isn't a guideline; it's a build failure. The constraint has to be impossible to route around, or it isn't a constraint.
- Divergent options, then refine — never one-shot. A single output invites "make this one perfect." Several divergent sketches (A–D) keep the user comparing structures, then refining the keepers — exactly the disposable mindset the tool is for.
- Extracted blackchalk to open source. The library is a credibility signal and a top-of-funnel channel for the PM-native audience: a soft moat and a funnel, deliberately not the business.
- Positioned against Claude Design, on purpose. Anthropic validated "talk to an AI, get a prototype" and aimed it at polished, on-brand, hi-fi output. That leaves the deliberately disposable lo-fi lane open. "Claude Design makes it look done; Delible makes it look exactly as done as it is."
How I worked
Blank repo to shipped, solo
Beyond the code, what this build demonstrates:
- End-to-end ownership. Strategy, the design system, the AI pipeline, the web app, the MCP server, billing — I took it from an empty repo to a deployed, monetised product on my own.
- Design judgment encoded as engineering. The taste — what "low fidelity" actually means — lives in tokens, a banned-polish list, and a lint rule, not in a deck. The opinion is executable.
- Validated before generalised. It earned its keep as an internal tool with a measurable goal before it became a SaaS — product-market fit evidence first, platform second.
- Quality instrumented, not asserted. An eval harness and a production validator mean "the AI output is good" is a number I can re-run, not a claim.
- Built for the moat, not the demo. The flashy part is the sketch; the defensible part is the brief and the permalink database — and I prioritised the latter.
Where it stands
What shipped
- Live and monetised: deployed at app.delible.dev with Google auth and Stripe billing wired.
- The constraint holds structurally: monochrome enforced by lint, polish banned in code; the output can't drift hi-fi.
- Generation is trustworthy and measured — 42/43 eval pass rate against the same validator production runs.
- The permalink is real — published sketches mint a public slug and live independently of any AI session.
- Two surfaces, one backend — a standalone web app as the home, a four-tool MCP server as the distribution channel into Claude.
- An open-source asset: blackchalk (~70 components) doing double duty as credibility and funnel.
Look as done as you actually are.
Stack at a glance