If you can answer "yes" to all 24 of these, your AI project ships clean. If you can't answer "yes" to at least half of the "Before you start" items, you're not ready, regardless of how good the vendor's deck looks. Print this list, take it to any vendor pitch, and ask which of the 24 items they help with directly. The answer tells you whether you're hiring an implementation partner or a tool reseller.
TL;DR.
The 24 items break into three phases: 8 readiness items before any vendor selection, 8 build-quality items during construction, 8 operational items after launch. Failing more than 2 items per phase is a strong signal of project risk. The 3 most-skipped items: #6 (CRM data quality), #16 (written rollback plan), #20 (false-confidence logging). Most failed SMB AI projects skipped at least two of those. Use this list to evaluate vendors, score your own readiness, and run weekly health checks during the first 90 days. The checklist is free, no email gate, no PDF wall.
How to use this checklist.
- Before any vendor selection: score yourself on items 1-8. If you score under 5/8, slow down. The vendor problems you're trying to solve usually aren't actually vendor problems; they're readiness problems.
- During vendor evaluation: ask any candidate vendor to walk through items 9-16. Which do they handle, which do they expect you to do, which do they not address? The gaps tell you whether you're hiring an implementation partner or a tool reseller.
- During build: review items 9-16 at week 1 + week 4. Anything not yet addressed by week 4 needs an explicit conversation about why.
- After launch: use items 17-24 as your weekly health check for the first 90 days. Anything red for 2 consecutive weeks = escalation conversation.
- At day 90: use item 24 (the expand/kill/hold decision) as a forcing function. No more open-ended pilots.
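The scoring rules above are simple enough to keep in a spreadsheet, but a short script makes the thresholds explicit. This is a minimal sketch, not an official scoring tool: the function name `score_phase`, the phase keys, and the choice to count an unanswered item as a "no" are all assumptions made for illustration.

```python
# Item ranges for the three phases of the checklist.
PHASES = {
    "before_you_start": range(1, 9),
    "during_build": range(9, 17),
    "after_launch": range(17, 25),
}


def score_phase(answers: dict[int, bool], phase: str) -> dict:
    """Count "yes" answers for one phase and apply the checklist thresholds.

    Assumption: an item with no recorded answer counts as a "no".
    """
    items = PHASES[phase]
    yes = sum(1 for i in items if answers.get(i, False))
    failed = len(items) - yes
    return {
        "phase": phase,
        "yes": yes,
        "failed": failed,
        # Failing more than 2 items in a phase signals project risk.
        "at_risk": failed > 2,
        # Under 5/8 on the readiness items means slow down.
        "slow_down": phase == "before_you_start" and yes < 5,
    }


# Example: everything answered "yes" except items 6 and 8.
answers = {i: True for i in range(1, 9)}
answers[6] = False
answers[8] = False
result = score_phase(answers, "before_you_start")
print(result)
```

In this example the readiness score is 6/8, so neither flag trips, but item 6 is one of the three most-skipped items and still deserves attention on its own.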
Before you start (8 items).
- I have named ONE bottleneck I am solving, in one sentence.
- I can quantify the bottleneck (hours per week, dollars per month, or a percentage).
- I have written the success metric and the threshold for "this worked."
- I have committed 4 to 8 hours per week of internal time for the first 60 days.
- I know which CRM holds the customer data this project will touch.
- I have confirmed my CRM data is clean enough to use (or budgeted to clean it).
- I have approved a budget range that matches a realistic project size.
- I have one decision-maker on my side empowered to approve in week 1.
Deep notes on the "Before you start" items.
- On #1 (one bottleneck): If you can list 5 bottlenecks, you don't have a project yet; you have a wish list. Pick one. The other 4 wait.
- On #3 (success metric): "We want to use AI more" is not a metric. "We want to cut response time from 4 hours to under 2 minutes" is a metric. The difference is whether the result is measurable.
- On #4 (4-8 hours/week): If the owner can't commit this, the project will fail. AI implementation is not magic outsourcing where you sign a check and walk away. The owner has to be in the room (or on Slack) for the first 60 days.
- On #6 (CRM data quality): Most underestimated cost in the whole sequence. Dirty data = confident garbage out. Budget 1-3 weeks of cleanup before the AI deployment makes sense.
- On #8 (decision-maker): Multi-stakeholder approval cycles kill AI projects faster than any technical problem. One person who can decide in week 1.
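For #6, a quick way to size the cleanup job is to count likely duplicates in a CRM export before committing to a timeline. The sketch below is illustrative only: the record fields (`id`, `email`, `phone`), the sample data, and the rule of matching on normalized email first and on the last 10 phone digits second are assumptions, not how any particular CRM or dedupe tool works.

```python
import re
from collections import defaultdict


def normalize_email(email: str) -> str:
    """Lowercase and trim so 'Ana@Example.com ' matches 'ana@example.com'."""
    return email.strip().lower()


def normalize_phone(phone: str) -> str:
    """Keep digits only; assumption: the last 10 digits identify the number."""
    digits = re.sub(r"\D", "", phone)
    return digits[-10:]


def find_duplicates(records: list[dict]) -> dict[str, list[int]]:
    """Group record ids by normalized email, falling back to phone."""
    groups = defaultdict(list)
    for rec in records:
        email = normalize_email(rec.get("email") or "")
        phone = normalize_phone(rec.get("phone") or "")
        key = email or phone
        if key:
            groups[key].append(rec["id"])
    # Only keys shared by more than one record are duplicates.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical export rows.
records = [
    {"id": 1, "email": "Ana@Example.com ", "phone": ""},
    {"id": 2, "email": "ana@example.com", "phone": ""},
    {"id": 3, "email": "", "phone": "(555) 010-2000"},
    {"id": 4, "email": None, "phone": "555-010-2000"},
    {"id": 5, "email": "lee@example.com", "phone": ""},
]
duplicates = find_duplicates(records)
print(duplicates)
```

The number of duplicate groups relative to total records gives a rough sense of whether the cleanup is a few days of work or closer to the upper end of the 1-3 week range.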
During build (8 items).
- The vendor has written a one-page spec, not a deck.
- The spec includes acceptance criteria at 30, 60, and 90 days.
- The project has a go-live date, not just a "pilot" milestone.
- Integrations are listed by name (CRM, phone, calendar, etc.), not vaguely.
- I know exactly which conversations the AI will handle and which it will escalate.
- My team has been informed and assigned roles for handoffs.
- Shadow-mode testing is planned before live cutover.
- I have a written rollback plan if the launch fails.
Deep notes on the "During build" items.
- On #9 (one-page spec): If the vendor's documentation is a 40-slide deck, you're paying for slides. Implementation specs are tight: scope, integrations, acceptance criteria, go-live date, rollback. One page.
- On #11 (go-live date): The single most-skipped commitment in vendor proposals. Without a go-live date, "pilots" run forever. Pin the date in writing.
- On #15 (shadow-mode testing): Run the AI in parallel with the existing process for 1-2 weeks before cutover. Watch where it agrees with the human, where it disagrees, why. Tune from there.
- On #16 (rollback plan): Most teams skip this, assuming launches will work. The teams that ship clean always have it written, even if they never use it. The act of writing it surfaces edge cases the vendor missed.
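The shadow-mode comparison described under #15 comes down to two numbers: how often the AI agrees with the human, and which kinds of disagreement recur. Here is a minimal sketch, assuming each conversation has been tagged with the action the human took and the action the AI would have taken. The action labels and the `shadow_report` name are hypothetical.

```python
from collections import Counter


def shadow_report(pairs: list[tuple[str, str, str]]):
    """Summarize shadow-mode results.

    pairs: (conversation_id, human_action, ai_action) for each conversation.
    Returns the agreement rate and disagreement patterns, most frequent first.
    """
    total = len(pairs)
    disagreements = [(h, a) for _, h, a in pairs if h != a]
    agreement_rate = (total - len(disagreements)) / total if total else 0.0
    # Each pattern is (what the human did, what the AI wanted to do).
    patterns = Counter(disagreements).most_common()
    return agreement_rate, patterns


# Hypothetical week of shadow-mode data.
pairs = [
    ("c1", "book", "book"),
    ("c2", "escalate", "book"),
    ("c3", "quote", "quote"),
    ("c4", "escalate", "book"),
    ("c5", "book", "escalate"),
]
rate, patterns = shadow_report(pairs)
print(f"agreement: {rate:.0%}")
for (human, ai), count in patterns:
    print(f"human={human} ai={ai}: {count}")
```

Here the most common disagreement is the AI booking where the human escalated, which points at the escalation rules as the first thing to tune.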
After launch (8 items).
- I have a weekly review meeting on the calendar for the first 90 days.
- The metric defined in item 3 is being tracked and visible.
- Escalated conversations are reviewed weekly and used to tune.
- False-confidence cases (AI was wrong but acted certain) are logged separately.
- Customer feedback channels are monitored for any AI-related complaints.
- Internal SOPs are written and accessible to anyone on my team.
- There is a documented owner for ongoing maintenance.
- At day 90, I have a clear decision: expand, kill, or hold.
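When these items are used as the weekly health check, the "red for 2 consecutive weeks" escalation rule can be checked mechanically. This is a sketch under assumptions: statuses are recorded as the strings "green" or "red", one per week, oldest first, and only the most recent weeks are examined. The function name is hypothetical.

```python
def items_needing_escalation(weekly_status: dict[int, list[str]], streak: int = 2) -> list[int]:
    """Return item numbers whose most recent `streak` weeks were all red."""
    flagged = []
    for item, statuses in sorted(weekly_status.items()):
        recent = statuses[-streak:]
        # Require a full streak; one red week on its own is not enough.
        if len(recent) == streak and all(s == "red" for s in recent):
            flagged.append(item)
    return flagged


# Hypothetical status history after three weekly reviews.
weekly_status = {
    17: ["green", "green", "green"],
    19: ["green", "red", "red"],    # two reds in a row: escalate
    20: ["red", "green", "red"],    # red, but not consecutive
    22: ["red"],                    # only one week of data so far
}
flagged = items_needing_escalation(weekly_status)
print(flagged)
```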
Deep notes on the "After launch" items.
- On #17 (weekly review): 30 minutes. Same time every week. Owner + vendor + one team member who works with the AI daily. Three questions: what worked, what didn't, what changes this week.
- On #20 (false-confidence logging): Most-skipped operational item. AI getting things wrong while sounding confident is the highest-impact failure mode. Log every one, review weekly, tune the knowledge base or escalation rules to prevent recurrence.
- On #21 (customer feedback channels): Email, Google Business Profile (GBP) reviews, social DMs, support tickets. Set up an alert for any mention of the AI by name or by description ("the robot," "your bot"). Address proactively.
- On #24 (day-90 decision): The forcing function. Without it, half-working systems linger and erode budget + team trust.
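The alert described under #21 can start as a plain keyword match run over incoming messages, whatever channel they arrive on. The sketch below assumes you can get the message text as strings. The assistant name "Riley" is made up for the example, and the phrase list should be replaced with whatever your customers actually call the system.

```python
import re


def build_matcher(assistant_name: str, extra_phrases=("the robot", "your bot")) -> re.Pattern:
    """Compile a case-insensitive pattern for the AI's name and nicknames."""
    phrases = [assistant_name, *extra_phrases]
    # re.escape guards against punctuation in names; \b avoids partial-word hits.
    pattern = r"\b(?:" + "|".join(re.escape(p) for p in phrases) + r")\b"
    return re.compile(pattern, re.IGNORECASE)


def flag_messages(messages: list[str], matcher: re.Pattern) -> list[str]:
    """Return the messages that mention the AI by name or description."""
    return [m for m in messages if matcher.search(m)]


matcher = build_matcher("Riley")  # hypothetical assistant name
messages = [
    "Your bot booked me on the wrong day.",
    "Thanks, Riley was really helpful!",
    "Great service from the front desk.",
]
flagged_messages = flag_messages(messages, matcher)
print(flagged_messages)
```

A match is only a prompt for a human to read the message; the second example above is praise, not a complaint.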
The 3 most-skipped items (and why they matter).
| Item | Why teams skip it | What it costs to skip |
|---|---|---|
| #6 CRM data quality | Looks like "boring" prep work; nobody wants to do dedupe | AI produces confident garbage on bad data; trust evaporates within 60 days |
| #16 Written rollback plan | Assume launches will work; "we'll figure it out" | Bad launches that lack a rollback plan eat 5-15 days of recovery time and erode team confidence permanently |
| #20 False-confidence logging | Sounds tedious; "we'll just fix things as they come up" | Repeat errors compound; customers notice; trust in AI never recovers |
Tools that help with the checklist.
- For item 3 (success metric): a shared Google Sheet or Notion doc that lives on the project channel
- For item 6 (data cleanup): OpenRefine (free), Dedupely, or your CRM's native dedupe tool
- For item 11 (go-live date): a calendar invite that includes vendor + owner + team lead. Public commitment.
- For item 15 (shadow-mode testing): CRM tagging on AI-touched vs human-touched conversations to enable comparison
- For item 16 (rollback plan): a documented one-page "if this breaks, here's what we do" with on-call contacts
- For item 19 (escalation review): tag escalations in the CRM with a custom field; review in a weekly 30-min meeting
- For item 20 (false-confidence logging): a Slack channel or Notion table where the team drops examples as they happen
- For item 22 (SOPs): Loom + a shared Notion or Google Doc with role-by-role responsibilities
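Whichever tool holds the false-confidence log for item 20, it helps to agree on what each entry records so the weekly review can spot repeats. The structure below is one possible shape, not a standard: every field name, the sample entries, and the "two or more cases on the same topic" threshold are assumptions for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class FalseConfidenceCase:
    logged_on: date
    conversation_id: str
    topic: str            # short label, e.g. "pricing" or "hours"
    what_ai_said: str
    what_was_true: str
    fix: str = ""         # knowledge-base or escalation-rule change, once made


def recurring_topics(cases: list[FalseConfidenceCase], min_count: int = 2):
    """Topics with at least `min_count` logged cases, most frequent first."""
    counts = Counter(case.topic for case in cases)
    return [(topic, n) for topic, n in counts.most_common() if n >= min_count]


# Hypothetical entries from one week.
cases = [
    FalseConfidenceCase(date(2025, 3, 3), "c101", "pricing",
                        "Quoted last year's rate", "Rate changed in January"),
    FalseConfidenceCase(date(2025, 3, 4), "c117", "hours",
                        "Said open Sundays", "Closed Sundays"),
    FalseConfidenceCase(date(2025, 3, 6), "c130", "pricing",
                        "Promised a discount", "No such discount exists"),
]
repeats = recurring_topics(cases)
print(repeats)
```

A topic that shows up more than once is the strongest candidate for a knowledge-base fix or a new escalation rule at the next weekly review.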
Vendor questions, framed against the checklist.
- "Will you give me a one-page spec with go-live date and acceptance criteria? (Items 9-11)"
- "Which of the integrations on my list will you handle, and which do you expect me to handle? (Item 12)"
- "Walk me through exactly which conversations the AI will handle vs escalate. (Item 13)"
- "What does shadow-mode testing look like in your engagement? (Item 15)"
- "What's the rollback plan if launch fails? (Item 16)"
- "Who owns the system after launch? Do I get admin access? (Item 23)"
- "What does the day-90 decision meeting look like? Who attends? (Item 24)"
FAQ.
- How do I use this?
- Print it. Take it to vendor pitches. Ask which items they handle, which you do, which they skip. The gap is the truth about whether they implement or just resell.
- How many items before starting?
- All 8 'Before you start.' Fewer than half = not ready.
- Most-skipped item?
- #16 (written rollback plan). #6 and #20 are close seconds.
- Works for non-AI projects?
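- Mostly. Items 1-8 and 17-24 apply to almost any operational project once you swap "AI" for whatever you're rolling out (item 20 is the exception). Items 9-16 are written for AI builds.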
- Free?
- Yes. No email, no PDF wall.
- Vendor refuses?
- That tells you what you need to know. Walk.