You are an expert in experimentation and A/B testing. Your goal is to help design tests that produce statistically valid, actionable results. If .agents/product
This skill guides the design and execution of A/B tests that produce statistically valid, actionable insights. It helps define clear hypotheses, select appropriate metrics, calculate sample sizes, and structure variants to isolate the impact of specific changes. The skill also covers test types, traffic allocation strategies, and best practices for running and analyzing experiments to avoid common pitfalls like early peeking or underpowered results.
This skill is ideal for growth leads who need to establish a rigorous experimentation program, performance marketers planning conversion rate optimization campaigns, and SEO/PPC operators aiming to validate landing page changes through data-driven tests. It also supports agency strategists responsible for advising clients on test design, ensuring their recommendations translate into measurable results.
Practitioners start by assessing the product and marketing context to understand baseline conversion rates, traffic volume, and constraints such as technical complexity or timelines. Next, they craft a strong hypothesis using a structured framework that links observations to expected outcomes and measurable metrics. Then, they determine the test type and calculate the required sample size from the baseline rate, the expected lift, the significance level, and the desired statistical power. After designing variants with clear, meaningful differences, they plan traffic allocation to balance risk and exposure. Finally, they run the test while monitoring for issues, avoid acting on interim results before the planned sample size is reached, and analyze outcomes with statistical rigor to decide whether to implement changes or iterate further.
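The sample size step can be sketched in a few lines of Python. This is a minimal illustration, not the skill's own calculator: it assumes a two-sided test on conversion rates using the standard normal-approximation formula for two proportions, and the function names (`sample_size_per_variant`, `estimated_days`), the defaults (5% significance, 80% power), and the example traffic figures are all hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-sided test.

    baseline_rate: control conversion rate, e.g. 0.05 for 5%
    relative_lift: smallest relative improvement worth detecting, e.g. 0.20 for +20%
    alpha, power:  assumed defaults; adjust to your own risk tolerance
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("rates must lie strictly between 0 and 1 and differ")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    p_bar = (p1 + p2) / 2

    # Normal-approximation formula for comparing two proportions.
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


def estimated_days(n_per_variant, daily_visitors, variants=2):
    """Days needed to fill every variant, assuming traffic is split evenly."""
    return ceil(n_per_variant * variants / daily_visitors)


if __name__ == "__main__":
    # Illustrative inputs only: 5% baseline, +20% relative lift, 1,000 visitors per day.
    n = sample_size_per_variant(0.05, 0.20)
    print(f"Visitors per variant: {n}")
    print(f"Estimated duration: {estimated_days(n, 1000)} days")
```

Smaller expected lifts drive the required sample size up sharply, which is why low-traffic pages usually call for bolder variants. Dedicated calculators may use slightly different approximations, so treat the output as a planning estimate.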
How do I know if my test has enough traffic? Calculate the required sample size from your baseline conversion rate and expected lift, using an established calculator, then compare it with the traffic you can realistically send to the test during its planned duration.
Can I test multiple changes at once? Multivariate tests are possible, but they require significantly more traffic and add complexity; single, meaningful changes are preferred for clarity.
What if results show no significant difference? This often means the test was underpowered or the variant was not bold enough to move the metric; revisit your hypothesis or test design before concluding that the change has no effect.
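To make the "no significant difference" case concrete, the sketch below analyzes a finished test with a two-proportion z-test and a confidence interval for the difference in conversion rates. It is an assumption-laden illustration rather than the skill's prescribed method: the function name `two_proportion_z_test`, the 5% significance threshold, and the visitor counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Compare control (A) and variant (B) conversion rates with a two-sided z-test."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    diff = p_b - p_a

    # Pooled standard error for the hypothesis test (assumes no true difference).
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se_pooled == 0:
        raise ValueError("cannot test when no visitors or all visitors converted")
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error for the confidence interval on the difference.
    se_unpooled = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    margin = NormalDist().inv_cdf(1 - alpha / 2) * se_unpooled

    return {
        "diff": diff,
        "z": z,
        "p_value": p_value,
        "ci": (diff - margin, diff + margin),
        "significant": p_value < alpha,
    }


if __name__ == "__main__":
    # Illustrative counts only: 10,000 visitors per arm.
    result = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=510, n_b=10_000)
    low, high = result["ci"]
    print(f"Difference: {result['diff']:.4f}, p-value: {result['p_value']:.3f}")
    print(f"Confidence interval: [{low:.4f}, {high:.4f}]")
    print("Significant" if result["significant"] else "Not significant")
```

When the result is not significant, the width of the confidence interval is the useful signal: a wide interval that still includes lifts you care about suggests the test needs more traffic, while a narrow interval around zero suggests the change itself has little effect.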
Attach this skill to a Metaflow agent tasked with planning or evaluating marketing experiments. The agent will use product marketing context files if available, then guide you through hypothesis formulation, sample size calculation, and variant design. Expect the skill to produce detailed test plans and analysis checklists that align with your CRO goals and traffic constraints. This foundation supports continuous experimentation workflows and integrates seamlessly with other growth and analytics skills within Metaflow.
For broader context, see our roundup of Claude skills for marketing, and read Claude Code workflows for marketing agencies for related setup guidance.