Lower CAC by fixing the learning loop.
Claude Code does not know which leads close. The Performance Marketing agent connects campaigns, creative, and CRM into one operating layer that optimizes for pipeline quality, not platform proxies.
Claude Code can answer the question. It cannot close the loop.
Generic agents do not know which leads close. They cannot weigh marginal CAC, flag incrementality risk, or design an experiment with a decision rule. Platform automation gives you cheap conversions. Pipeline quality requires a harness.
“You don't ship pipeline from a fresh context window.”
What should I do with my Google Ads spend this week?
Based on what you have shared, here are some general suggestions:
- Consider pausing keywords with high CPL and low conversions.
- You could test new ad copy or landing page variants.
- Review your audience targeting and exclude irrelevant segments.
Pain-led primary text outperforms feature-led for ICP segment B.
A four-tab paid stack vs. one operating layer.
Operators jump between Claude, the ads UI, Looker, and a budget sheet. The CRM never reaches back. Lifetime CAC averages decide budget. Nothing compounds.
"What should I do with my Google Ads spend?"
No CRM access · no incrementality view
Smart Bidding · Maximize Conversions
Conv = form fill (no closed-won)
CTR ↑ CPL ↓ Conv ↑
Pipeline? /not-connected/
lifetime CAC averages
last touched 2 weeks ago
What replaces dashboard tweaking.
Four operating principles. Each one carries a method anchor and a piece of product evidence.
Audience first, channel second.
Performance improves when audience quality, suppression, and CRM feedback are strong before budget moves. Channel comes second.
Method anchor — Audience-first paid — Emily Kramer
| Segment | Fit | CAC payback |
|---|---|---|
| ICP A · enterprise | 0.92 | 4.1mo |
| ICP B · mid-market | 0.84 | 5.6mo |
| ICP C · SMB long-tail | 0.41 | 14mo |
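The CAC payback column can be read as the number of months of gross profit needed to recover the cost of acquiring one customer. A minimal sketch of that standard calculation follows. The function name and the input figures are illustrative assumptions, not values behind the table and not Metaflow's actual scoring.

```python
def cac_payback_months(cac: float,
                       monthly_revenue_per_customer: float,
                       gross_margin: float) -> float:
    """Months of gross profit from one customer needed to repay its acquisition cost."""
    monthly_gross_profit = monthly_revenue_per_customer * gross_margin
    if monthly_gross_profit <= 0:
        raise ValueError("monthly gross profit must be positive")
    return cac / monthly_gross_profit


# Hypothetical inputs for illustration only.
months = cac_payback_months(cac=4100,
                            monthly_revenue_per_customer=1250,
                            gross_margin=0.8)
print(f"{months:.1f} months")
```

A cheap lead in a segment with a long payback can cost more than an expensive lead in a segment with a short one, which is why the segment is scored before the channel.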
From ad account ingest to memory update.
The agent reads the account, scores spend leakage and pipeline quality, designs experiments with decision rules, audits LPs, and writes outcomes to memory.
Each run writes outcomes to memory. The next run starts with the prior decision graph and review boundary already loaded.
Production-grade agents need more than a clever prompt. Each layer below is required for governed autonomy.
What the agent reads before every run.
Every Metaflow agent is grounded in a domain-specific skill file — a structured operating procedure that defines inputs, workflows, evaluation criteria, anti-patterns, and output contracts.
The skill file is editable, versioned, and inspectable. It is not a hidden prompt.
# Mission
Optimize paid growth against pipeline quality and CAC payback, not platform-reported conversions. Audience first. Value signal honest. Experiments disciplined. Budget moved on marginal CAC. Lessons preserved.

## Optimizes for
- pipeline quality, weighted by closed-won
- CAC payback within target window
- incremental contribution, not raw conversion volume

## Does not promise
- guaranteed CAC reduction
- "fully autonomous paid growth"
- replacing the operator's judgment on offer or strategy
Recommendations are scored before they reach an operator.
The agent does not act on the ad account directly. It scores its own recommendations and routes anything below threshold for review.
Where the agent stops or hands back instead of guessing.
- CRM signal stale — no closed-won data in 30 days.
- Audience overlap > 0.6 without operator override.
- Experiment without a decision rule.
- Budget shift exceeds 20% without explicit approval.
- Score campaign quality
- Mine search terms
- Detect audience overlap
- Assemble experiment cards
- Budget reallocation memo
- New audience expansion
- Creative variant queue
- Landing page edits
- Spend changes above threshold
- New campaign launches
- Negative keyword additions
- Account structure changes
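The four stop conditions listed earlier (stale CRM signal, audience overlap, missing decision rule, oversized budget shift) can be sketched as a pre-run check. This is a hedged illustration, not the product's implementation: `RunContext`, its field names, and `stop_reasons` are hypothetical, and only the thresholds (30 days, 0.6, 20%) come from the list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunContext:
    # All field names are assumptions made for this sketch.
    days_since_closed_won: int   # age of the newest closed-won record in the CRM
    audience_overlap: float      # overlap between targeted audiences, 0..1
    overlap_override: bool       # operator explicitly accepted the overlap
    has_decision_rule: bool      # the experiment states a decision rule
    budget_shift_pct: float      # proposed shift as a fraction, e.g. 0.22 for 22%
    budget_approved: bool        # operator explicitly approved the shift


def stop_reasons(ctx: RunContext) -> List[str]:
    """Return every reason the run should stop or hand back to the operator."""
    reasons = []
    if ctx.days_since_closed_won > 30:
        reasons.append("CRM signal stale")
    if ctx.audience_overlap > 0.6 and not ctx.overlap_override:
        reasons.append("audience overlap above 0.6")
    if not ctx.has_decision_rule:
        reasons.append("experiment has no decision rule")
    if abs(ctx.budget_shift_pct) > 0.20 and not ctx.budget_approved:
        reasons.append("budget shift above 20%")
    return reasons


# A 22% cut with no approval on file is handed back rather than applied.
ctx = RunContext(days_since_closed_won=12, audience_overlap=0.42,
                 overlap_override=False, has_decision_rule=True,
                 budget_shift_pct=-0.22, budget_approved=False)
print(stop_reasons(ctx))  # ['budget shift above 20%']
```

Returning all reasons, rather than stopping at the first, gives the operator the complete picture in a single review.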
System evidence, not feature cards.
Five artifacts. Each one is something the agent generates, scores, or maintains.
Search-term mining — last 7 days.
Continuous waste detection. Negative keywords queued for approval, never silently applied.
| Term | Spend | Pipe | Action |
|---|---|---|---|
| free crm software | $214 | None | Negate |
| jobs in marketing | $184 | None | Negate |
| metaflow tutorial free | $162 | Low | Negate |
| ai agents pricing | $86 | High | Keep |
| ai marketing platform comparison | $71 | High | Keep |
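The table above suggests a simple triage rule: terms with no or low pipeline contribution are proposed as negatives, and terms with high contribution are kept. A minimal sketch of that rule follows. The rule is inferred from the five rows shown and the names `SearchTerm` and `propose_action` are hypothetical. The real scoring is likely richer.

```python
from typing import NamedTuple


class SearchTerm(NamedTuple):
    term: str
    spend: int       # USD over the 7-day window
    pipeline: str    # "None", "Low", or "High", as in the table


def propose_action(row: SearchTerm) -> str:
    """Assumed rule: keep only terms with high pipeline contribution."""
    return "Keep" if row.pipeline == "High" else "Negate"


rows = [
    SearchTerm("free crm software", 214, "None"),
    SearchTerm("jobs in marketing", 184, "None"),
    SearchTerm("metaflow tutorial free", 162, "Low"),
    SearchTerm("ai agents pricing", 86, "High"),
    SearchTerm("ai marketing platform comparison", 71, "High"),
]

# Proposed negatives go to an approval queue; nothing is applied to the account here.
approval_queue = [r.term for r in rows if propose_action(r) == "Negate"]
wasted_spend = sum(r.spend for r in rows if propose_action(r) == "Negate")

print(approval_queue)
print(wasted_spend)  # 560
```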
Google value-based bidding
Paid systems perform better when they optimize against meaningful business value, not raw conversion volume.
CRM closed-won outcomes flow back to the platform.
Google experiment frameworks
Tests need clean hypotheses, controlled variables, sufficient duration, and decision rules.
Experiment cards require all six fields. The agent will not start without them.
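One way to picture the six-field requirement is a card that refuses to start while any field is empty. The sketch below is an assumption-laden illustration: the field names follow the belief, control, treatment, primary, guardrail, and rule labels used in the comparison table on this page, while `ExperimentCard`, `missing_fields`, `can_start`, and the example values other than the belief are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class ExperimentCard:
    belief: str = ""      # the named belief under test
    control: str = ""     # what runs today
    treatment: str = ""   # the change being tested
    primary: str = ""     # primary metric (interpretation assumed)
    guardrail: str = ""   # metric that must not degrade (interpretation assumed)
    rule: str = ""        # decision rule applied when the test ends


def missing_fields(card: ExperimentCard) -> List[str]:
    """Names of fields that are empty or whitespace-only."""
    return [f.name for f in fields(card) if not getattr(card, f.name).strip()]


def can_start(card: ExperimentCard) -> bool:
    return not missing_fields(card)


card = ExperimentCard(
    belief="Pain-led primary text outperforms feature-led for ICP segment B",
    control="feature-led primary text",
    treatment="pain-led primary text",
    primary="pipeline quality",
    guardrail="CAC payback",
    # rule intentionally left blank
)
print(can_start(card), missing_fields(card))  # False ['rule']
```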
Demand Curve creative testing
Creative testing should produce learning, not endless variants.
Each variant tests a named belief: pain, promise, proof, offer, audience, or objection.
Audience-first paid — Emily Kramer
Performance improves when the audience foundation is strong before budget moves.
Audience quality and CRM feedback are scored before scale.
CAC payback discipline
Not every reported conversion is incremental. Not every cheap lead is good.
Pipeline quality and CAC payback are the two scores that ship.
Reforge-style growth loops
Growth compounds when execution creates learning that improves the next cycle.
Winners persist as memory. Next quarter does not start from zero.
Not a bid optimizer. Not an ad generator. An operating layer.
Four contenders. The Metaflow column pairs each claim with its product evidence.
| Dimension | Claude Code, Cursor (generic agents in chat windows) | Platform automation (Smart Bidding, broad targeting) | Performance agency (outsourced humans with platform access) | Metaflow agent (agentic operating layer with memory and evals) |
|---|---|---|---|---|
| Optimizes for | Whatever the prompt encodes. | Platform-reported conversions. | Whatever the strategist negotiates. | Pipeline quality and CAC payback. CRM closed-won fed back. Evidence (crm.match): 1,287 leads matched · 14 campaigns scored by closed-won. |
| Experiment discipline | Variants without a hypothesis. | Auto-experiments with hidden logic. | Sometimes structured, often not. | Six fields required. No fields, no test. Evidence (experiment.compose): belief · control · treatment · primary · guardrail · rule. |
| Memory | Resets every chat. | Lives in the platform's ML. | Lives in Slack and the strategist. | Winners persist. Losers retire with reasoning. Evidence (memory.write): “Pain-led primary text wins for ICP B”, persisted v1.4.0. |
| Incrementality | Not addressed. | Limited and proprietary. | Available if requested. | Surfaced before reallocation, not after. Evidence (budget.compose): audience overlap 0.42 flagged · awaiting operator. |
| Compounding | Each chat starts at zero. | Compounds inside the platform's walls. | Compounds with the strategist who stays. | Outcomes update memory across runs and platforms. Evidence (cross-platform memory): belief won on Meta · re-tested on Google · pending data. |
Repeatable plays the agent runs end-to-end.
Each play has the same shape: ingest, audit, score, hypothesize, design, audit LP, recommend, approve, learn.
Weekly account hygiene
Search-term mining, fatigue detection, audience overlap, stale ad sets.
Outcome
A reasoned waste report, with negative keywords queued for approval.
Audit your paid growth system in one focused session.
A 30-minute working session with a Metaflow operator. We pull a sample week of spend, score campaigns against pipeline quality, and surface waste and incrementality risk.
- 01 Pipeline-quality scorecard for your top 5 active campaigns.
- 02 Wasted spend report from a 7-day search term sample.
- 03 LP message-match audit on your highest-spend ad set.
- 04 A written 1-page memo with the next 3 plays we would run.
- brand-search · +18% · +$580
- icp-a-bofu · +12% · +$340
- broad-discovery · -22% · -$960
A focused diagnostic. No slides. Walk away with a written assessment whether or not we work together.