Build an ABM fit score: AI Lead Scoring for B2B Account-Based Marketing

You’ve been there. You start building lead scoring in HubSpot or Marketo, and immediately fall into dropdown purgatory: endless clicking through menus, assigning point values to job titles, email opens, and webinar attendance.

Weeks later, you launch it. Before you know it, you’re stuck QAing, checking how your rules affect MQL flow, and realizing you may have over-scored or under-scored in ways that ripple through your ABM funnel.

The problem isn’t the tool. It’s the approach.

Manual lead scoring treats every signal equally and forces you to guess point values before you see the data. You’re building a lead scoring model based on intuition, instead of patterns. And once it’s live, changing those rules means clicking through the same menus again.

ABM fit scoring is the faster, saner alternative.

Instead of scoring individual leads based on behavior, you’re scoring accounts based on how closely they match the patterns of customers who actually closed.

This article outlines how Steve Armenti, founder of twelfth agency and former ABM lead at Google, uses AI to turn enriched account data and a scoring prompt into a scored CSV (ideally with confidence indicators and top drivers).

This isn’t about replacing HubSpot or other platforms entirely. It’s about using AI to do the pattern recognition work that humans are terrible at, then importing those scores into whatever CRM you’re using.

Here’s the playbook.

The problem with “points” as an ABM strategy (Fit scores vs. behavior scoring)
The AI fit scoring workflow
The “don’t blow your Clay credits” guide to smart enrichment
How to import AI fit scores into HubSpot (or any CRM) without breaking your stack
ABM lead scoring without segmentation is just a spreadsheet flex.
Your ABM fit score build play (next steps)
The real ROI of AI fit scoring is time, not accuracy

The problem with “points” as an ABM strategy (Fit scores vs. behavior scoring)

Traditional lead scoring optimizes for engagement. The more someone interacts with your content, the higher they score.

That works if your goal is to “generate leads.” But it’s terrible for closing deals with the right accounts.

Fit scoring answers a different question: Does this account look like the ones that actually buy from us?

The inputs are different:

Behavior scoring: Email opens, content downloads, webinar attendance, website visits
Fit scoring: Company size, revenue, tech stack, hiring trends, market signals, job titles on the buying committee

Behavior tells you who’s paying attention. Fit tells you who’s worth pursuing.

Armenti’s framework separates the two deliberately. While fit determines whether an account enters your program, behavior determines how aggressively you engage once they’re in.

Most teams conflate the two and end up with MQL lists full of accounts that will never close because they don’t match the profile of buyers who actually convert.

With fit scoring, you’re not guessing which signals matter. You’re letting your CRM data show you which patterns correlate with closed deals, then scoring new accounts against those patterns.

The AI fit scoring workflow

Armenti’s process assumes you’ve already built a lookalike list and enriched it with signals beyond basic firmographics. (If you haven’t, start with CRM pattern analysis to build your lookalike list first.)

Here’s what you need before you start account scoring:

Required enrichment data:

Company size (employees)
Annual revenue
Industry/sub-industry
Headquarters location
Tech stack (legacy and current software)
Hiring trends (number of open roles, growth/decline signal)
Primary, secondary, and tertiary contact job titles
Any custom signals specific to your ICP (funding, locations, customers, partnerships)

Optional but powerful additions:

Website traffic trends (via SimilarWeb or BuiltWith)
News mentions (keyword-based scraping for terms like “expanding,” “funding,” “acquisition”)
Product usage data (if you’re PLG and these accounts are already using your free tier)
G2 or review site presence

You don’t need all of this. But the richer your dataset, the more nuanced your fit scores become.
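Before uploading anything, it can help to see how complete the enriched file actually is. The sketch below is one way to do that in Python; the column names are hypothetical placeholders, so swap in whatever your Clay, Apollo, or ZoomInfo export uses.

```python
import csv
import io

# Hypothetical column names; rename these to match your own export.
REQUIRED_COLUMNS = [
    "company_name", "domain", "employees", "annual_revenue",
    "industry", "hq_location", "tech_stack", "open_roles",
    "primary_contact_title",
]

def fill_rates(rows, columns):
    """Return the share of rows with a non-empty value for each column."""
    total = len(rows)
    rates = {}
    for col in columns:
        filled = sum(1 for row in rows if (row.get(col) or "").strip())
        rates[col] = filled / total if total else 0.0
    return rates

def check_enriched_csv(file_obj, required=REQUIRED_COLUMNS):
    """Report missing columns and per-column fill rates for an enriched list."""
    reader = csv.DictReader(file_obj)
    header = reader.fieldnames or []
    missing = [col for col in required if col not in header]
    present = [col for col in required if col in header]
    rows = list(reader)
    return missing, fill_rates(rows, present)

# Tiny in-memory sample standing in for a real export.
sample = io.StringIO(
    "company_name,domain,employees,tech_stack\n"
    "Acme,acme.example,250,Snowflake\n"
    "Globex,globex.example,,\n"
)
missing, rates = check_enriched_csv(sample)
print("Missing columns:", missing)
print("Fill rates:", rates)
```

Columns with low fill rates are the ones most likely to produce low-confidence scores later, so they are a reasonable place to focus any extra enrichment.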

“We’re trying to find more of the intangibles like the team size and who’s inside the company making these decisions, who needs to be influenced, who’s actually purchasing, who’s putting their name on the contract.”

– Steve Armenti

Armenti uses Clay to aggregate these signals into a single CSV. Apollo, ZoomInfo, and other B2B account-based marketing tools work too.

The format matters less than having everything in one exportable file. Once you have that CSV, the AI scoring workflow takes three steps.

Step 1: Upload your enriched list to ChatGPT (or Claude) and define the scoring job

Use ChatGPT agent mode if you can. It’s better suited for working directly with CSVs and generating new output files.

Upload your enriched CSV and give it a clear job description:

Prompt framework:

This prompt does three things manual scoring can’t:

Pattern recognition across multiple variables simultaneously: AI correlates company size with tech stack, hiring trends, and job titles in ways humans can’t manually track.
Confidence levels (or signal strength): Not all scores are equally reliable. If an account has incomplete data, the model should flag lower confidence so you know what needs enrichment before you prioritize it.
Driver transparency: Instead of black-box scores, you see exactly which signals contributed most to each account’s ranking.
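Armenti’s exact prompt isn’t reproduced here. As a rough illustration only, a prompt that asks for those three things (a score, a confidence level, and top drivers) could be assembled like this. The wording, the output column names, and the function itself are assumptions, not his template.

```python
def build_scoring_prompt(icp_summary, columns, extra_constraints=()):
    """Assemble a fit-scoring prompt. The wording is illustrative only."""
    lines = [
        "You are scoring B2B accounts for ABM fit.",
        f"Ideal customer profile: {icp_summary}",
        "The attached CSV has these columns: " + ", ".join(columns) + ".",
        "For every row, add three columns:",
        "1. fit_score: an integer from 0 to 100.",
        "2. confidence: high, medium, or low; use low when key fields are empty.",
        "3. top_drivers: the two or three signals that most influenced the score.",
        "Show a preview of the first 10-20 rows before producing the full file.",
    ]
    # Optional refinements, added one per line.
    lines.extend(f"Constraint: {c}" for c in extra_constraints)
    return "\n".join(lines)

prompt = build_scoring_prompt(
    icp_summary="Mid-market SaaS companies modernizing their data stack",
    columns=["company_name", "employees", "tech_stack", "open_roles"],
    extra_constraints=["Flag accounts with declining hiring trends as lower fit."],
)
print(prompt)
```

Keeping the prompt in a small function like this makes it easy to tweak one constraint and re-run, which is the whole point of the workflow.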

Step 2: Review the preview and validate logic

AI will process your data and return a preview of the first 10-20 rows.

What to check:

Do the top-ranked accounts actually look like your best customers?
Are the driver signals aligned with your ICP thesis?
Do any low-confidence scores indicate missing data you can fill in?

If something looks off, refine the prompt. Add constraints like:

“Prioritize accounts with tech stack signals over company size.”
“Flag accounts with declining hiring trends as lower fit.”
“Weight director-level contacts higher than VP-level.”

This is where domain expertise matters. AI finds patterns, but you validate whether those patterns are actionable.

Armenti runs test batches first (10 to 20 accounts) before running the full dataset.

“I always like to take a couple of rows and manually go through and just say, ‘Is this the right domain?’ ‘Chat said it’s a healthcare company, but is it really?’ Just do spot-checks because you don’t want to feed inaccurate data into this process.”

– Steve Armenti
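If you want the spot-check to be unbiased rather than “whatever is at the top of the sheet,” one option is to pull a random sample. This is a minimal sketch with made-up account data; the function name and the sample size of 15 are assumptions.

```python
import random

def spot_check_sample(rows, size=15, seed=42):
    """Pick a reproducible random sample of rows for manual review."""
    rng = random.Random(seed)  # fixed seed so reviewers see the same rows
    return rng.sample(rows, min(size, len(rows)))

# Placeholder accounts standing in for a scored list.
accounts = [
    {"company_name": f"Account {i}", "domain": f"account{i}.example"}
    for i in range(1, 201)
]

sample_rows = spot_check_sample(accounts)
for row in sample_rows[:3]:
    print(row["company_name"], row["domain"])
```

For each sampled row, check the domain and the industry label by hand before trusting the rest of the file.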

Step 3: Export the scored CSV and segment into tiers

Once you’re confident in the logic, run the full analysis and export the new CSV.

Now you have an ABM fit score for every account on your list.

The next step: tiering.

Armenti uses a three-tier model based on score thresholds:

Tier 1 (70-100): Highest fit, highest confidence. These accounts get personalized multi-channel campaigns, SDR outreach, and the majority of your budget.
Tier 2 (40-69): Moderate fit. These accounts get LinkedIn ads, basic nurture sequences, and outbound email but not gifting or high-touch plays.
Tier 3 (0-39): Low fit or low confidence. These accounts enter broad awareness campaigns (content syndication, display ads) or get removed from the list entirely.

The tier boundaries depend on your list size and budget. If you have 500 accounts and can only afford high-touch campaigns for 50, your Tier 1 threshold might be 80+.
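The tiering logic above is simple enough to sketch in a few lines. The default cutoffs mirror the 70 and 40 thresholds from the three-tier model; the capacity-based helper is an assumption about how you might derive a stricter Tier 1 cutoff from budget.

```python
def assign_tier(score, tier1_min=70, tier2_min=40):
    """Map a 0-100 fit score to a tier using the given cutoffs."""
    if score >= tier1_min:
        return "Tier 1"
    if score >= tier2_min:
        return "Tier 2"
    return "Tier 3"

def tier1_threshold_for_capacity(scores, capacity):
    """Lowest score that still fits within the number of Tier 1 slots.

    Ties at the cutoff can push Tier 1 slightly over capacity.
    """
    ranked = sorted(scores, reverse=True)
    if capacity <= 0 or not ranked:
        return None
    return ranked[min(capacity, len(ranked)) - 1]

# Made-up scores for illustration.
scores = [95, 88, 82, 75, 71, 64, 52, 38]

print([assign_tier(s) for s in scores])

# With budget for only three high-touch accounts, the cutoff moves up.
cutoff = tier1_threshold_for_capacity(scores, capacity=3)
print("Tier 1 cutoff:", cutoff)
print([assign_tier(s, tier1_min=cutoff) for s in scores])
```

With the default cutoffs, five of these eight accounts land in Tier 1. Capping Tier 1 at three accounts raises the cutoff and pushes the 75 and 71 scores down to Tier 2.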

The “don’t blow your Clay credits” guide to smart enrichment

Before you run out and enrich 2,000 accounts with every available signal, know this:

Clay credits burn fast, and most enrichments aren’t worth the spend for account scoring.

Here’s how to enrich strategically without wasting budget:

Start with free or low-cost signals

Company domain → Clearbit (free tier or cheap lookups)
LinkedIn profile data → Free if you’re scraping manually
Website scraping → Free with web scraping tools or Clay’s built-in scrapers

Prioritize signals that correlate with closed deals

Before you enrich 500 accounts with technographic data at 4 credits per row, analyze 20-30 closed deals in your CRM. Look at which signals appeared most consistently.

If 80% of your customers were using Snowflake or Salesforce at the time of close, tech stack enrichment is worth the credits.

If job title matters more than tech stack, prioritize contact data over technographics.
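To make that analysis concrete, here is a small sketch that counts how often each signal shows up across closed deals. The deal records and field names are invented for illustration; in practice they would come from a CRM export.

```python
from collections import Counter

def signal_prevalence(closed_deals, field):
    """Share of closed-won deals in which each value of `field` appears."""
    counts = Counter()
    for deal in closed_deals:
        # Count each value once per deal, even if it is listed twice.
        for value in set(deal.get(field, [])):
            counts[value] += 1
    total = len(closed_deals)
    if not total:
        return {}
    return {value: n / total for value, n in counts.most_common()}

# Invented closed-won deals for illustration.
closed_deals = [
    {"account": "A", "tech_stack": ["Snowflake", "Salesforce"]},
    {"account": "B", "tech_stack": ["Snowflake"]},
    {"account": "C", "tech_stack": ["Salesforce", "HubSpot"]},
    {"account": "D", "tech_stack": ["Snowflake", "Salesforce"]},
    {"account": "E", "tech_stack": ["Snowflake"]},
]

prevalence = signal_prevalence(closed_deals, "tech_stack")
for tool, share in prevalence.items():
    print(f"{tool}: {share:.0%} of closed deals")
```

A signal that appears in most closed deals is a candidate for paid enrichment. One that shows up in a small minority probably isn’t worth the credits.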

Use waterfalls for expensive data

Clay supports enrichment waterfalls: start with cheaper sources first, then fall back to premium providers only when the earlier step fails.

The point is to avoid paying top-dollar for every row when you only need it for the hard-to-find cases.
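Conceptually, a waterfall is just “try the cheap source, and only pay for the expensive one on a miss.” The sketch below models that idea in plain Python. It is not Clay’s API: the provider names, credit costs, and lookup data are all stand-ins, and it assumes you are charged for each attempt.

```python
def waterfall_enrich(domain, providers):
    """Try providers from cheapest to most expensive; stop at the first hit.

    Assumption: every attempted lookup costs credits, even on a miss.
    """
    credits_spent = 0
    for name, cost, lookup in sorted(providers, key=lambda p: p[1]):
        credits_spent += cost
        result = lookup(domain)
        if result is not None:
            return {"value": result, "source": name, "credits": credits_spent}
    return {"value": None, "source": None, "credits": credits_spent}

# Stand-in lookups; real providers would be configured in your enrichment tool.
cheap_data = {"acme.example": "Snowflake"}
premium_data = {"acme.example": "Snowflake", "globex.example": "Redshift"}
providers = [
    ("premium_provider", 4, premium_data.get),
    ("cheap_provider", 1, cheap_data.get),
]

easy = waterfall_enrich("acme.example", providers)
hard = waterfall_enrich("globex.example", providers)
print(easy)
print(hard)
```

The easy account resolves at the cheap source for 1 credit, while the hard one falls through to the premium source and costs 5 in total. Sending every row straight to the premium provider would cost 4 credits each.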

Batch-test enrichments on 50 accounts before scaling

Pick 50 accounts from your list and enrich them fully. Then run the fit scoring prompt and check whether the expensive signals (tech stack, hiring trends, news mentions) actually impact scores.

If they do, enrich the full list. If they don’t, save your credits.
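One way to judge “actually impact scores” is to score the 50-account batch twice, once with only cheap signals and once with the expensive ones added, then count how many accounts change tier. The sketch below assumes you already have both sets of scores; the numbers are invented, and the 70/40 cutoffs follow the three-tier model described earlier.

```python
def tier(score):
    """Tier cutoffs matching the three-tier model (70+ and 40+)."""
    if score >= 70:
        return "Tier 1"
    if score >= 40:
        return "Tier 2"
    return "Tier 3"

def share_of_tier_changes(baseline_scores, enriched_scores):
    """Fraction of accounts whose tier changes once expensive signals are added."""
    shared = baseline_scores.keys() & enriched_scores.keys()
    if not shared:
        return 0.0
    changed = sum(
        1 for account in shared
        if tier(baseline_scores[account]) != tier(enriched_scores[account])
    )
    return changed / len(shared)

# Invented scores: cheap signals only vs. full enrichment.
baseline = {"acme.example": 65, "globex.example": 45,
            "initech.example": 30, "umbrella.example": 72}
enriched = {"acme.example": 78, "globex.example": 48,
            "initech.example": 28, "umbrella.example": 74}

change_rate = share_of_tier_changes(baseline, enriched)
print(f"{change_rate:.0%} of test accounts changed tier")
```

If almost nothing moves between tiers, the expensive signals are changing the number but not the decision.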

Sample enrichment stack for a 500-account ABM list (budget: $500/month)

Clay is credit-based, so test before you scale. Armenti notes that Clay starts around $149/month for 2,000 credits, and in his experience that may only cover a couple of hundred enriched accounts, depending on what you run.

Start with a small batch, confirm the enrichments actually improve lead scoring, then expand.
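The credit math is worth doing before you commit. As a back-of-the-envelope sketch: 2,000 credits covering roughly 200 accounts implies about 10 credits per account. That per-account figure is an inference from Armenti’s estimate, not a published rate, and your real number depends on which enrichments you run.

```python
import math

def credits_needed(accounts, credits_per_account):
    """Total credits to enrich a list at a given per-account cost."""
    return accounts * credits_per_account

def months_to_enrich(accounts, credits_per_account, monthly_credits):
    """Whole billing months needed at a fixed monthly credit allowance."""
    total = credits_needed(accounts, credits_per_account)
    return math.ceil(total / monthly_credits)

# Assumed figures: 500-account list, ~10 credits per account, 2,000 credits/month.
total_credits = credits_needed(accounts=500, credits_per_account=10)
months = months_to_enrich(accounts=500, credits_per_account=10, monthly_credits=2000)

print(f"Credits needed: {total_credits}")
print(f"Months on a 2,000-credit plan: {months}")
```

Under those assumptions, a 500-account list needs 5,000 credits, or about three months on the entry plan. That is a good argument for dropping enrichments that don’t move scores.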

How to import AI fit scores into HubSpot (or any CRM) without breaking your stack

AI generates the scores. Your CRM operationalizes them.

Here’s the workflow for importing AI fit scores:

1. Create fields for the score and the “why”

Add properties for Fit Score, Tier, and Top Drivers so you don’t end up with a mysterious number nobody trusts. (Drivers are what make the score usable in messaging.)

2. Import the scored file into your CRM

Bring in the scored CSV and map it to company records so each account carries its score, tier, and drivers.

3. Tier immediately (don’t admire the spreadsheet)

Convert scores into action buckets (Tier 1 / Tier 2 / Tier 3). Tiering is what turns “analysis” into “execution.”

4. Segment within tiers by the strongest drivers

Within Tier 1, create sub-segments based on what actually made them high-fit (e.g., “rapid hiring growth,” “specific tech signals,” “recent announcements”). This is where personalization gets real.

5. Attach a play to each tier and segment

Define what happens next for each tier (and each Tier 1 sub-segment): who owns it, what channels fire, and what “next step” you’re driving toward.

6. Validate with spot checks before you scale

Before you let workflows rip across hundreds of accounts, spot-check a handful: domain accuracy, industry classification, and whether the segments make sense. Armenti explicitly recommends this to avoid bad data flowing through the system.

7. Set a refresh cadence

When your ICP shifts or you add new signals, re-run the workflow and re-tier. The point is speed and adaptability, not building a fragile lead scoring model you’ll dread touching again.
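Steps 1, 2, and 6 above can be partly automated before the file ever reaches the CRM. The sketch below validates each scored row and renames columns to match CRM properties. The property labels in COLUMN_MAP are hypothetical, so use whatever you named the fields in step 1.

```python
import csv
import io

# Hypothetical CRM property labels; match them to the fields you created.
COLUMN_MAP = {
    "domain": "Company Domain Name",
    "fit_score": "Fit Score",
    "tier": "Tier",
    "top_drivers": "Top Drivers",
}

def build_import_rows(scored_rows):
    """Keep only valid rows and rename columns for the CRM import."""
    valid, rejected = [], []
    for row in scored_rows:
        domain = (row.get("domain") or "").strip().lower()
        try:
            score = int(row.get("fit_score", ""))
        except (TypeError, ValueError):
            score = None
        # Reject rows with no usable domain or an out-of-range score.
        if "." not in domain or score is None or not 0 <= score <= 100:
            rejected.append(row)
            continue
        cleaned = {**row, "domain": domain, "fit_score": score}
        valid.append({label: cleaned.get(key, "") for key, label in COLUMN_MAP.items()})
    return valid, rejected

# Invented scored rows for illustration.
scored_rows = [
    {"domain": "Acme.example", "fit_score": "82", "tier": "Tier 1",
     "top_drivers": "Snowflake user; rapid hiring"},
    {"domain": "", "fit_score": "55", "tier": "Tier 2", "top_drivers": "mid-market"},
    {"domain": "globex.example", "fit_score": "n/a", "tier": "", "top_drivers": ""},
]
valid, rejected = build_import_rows(scored_rows)

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(COLUMN_MAP.values()))
writer.writeheader()
writer.writerows(valid)
print(buffer.getvalue())
print(f"{len(rejected)} rows need fixing before import")
```

Rejected rows are the ones to spot-check by hand. It is cheaper to fix a bad domain here than after it has been matched to the wrong company record.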

ABM lead scoring without segmentation is just a spreadsheet flex.

Here’s where most teams mess this up: they build fit scores, import them into the CRM, and then… do nothing with them.

Lead scores mean nothing without segmentation and action.

Armenti’s clients who succeed follow this sequence:

1. Score the accounts (AI fit scoring workflow)

2. Tier the accounts (70+ = Tier 1, 40-69 = Tier 2, etc.)

3. Segment by signal (Within Tier 1, create sub-segments based on top drivers: “Tier 1 – Snowflake users,” “Tier 1 – Rapid hiring growth,” “Tier 1 – Recent funding”)

4. Build ABM campaigns by segment (Each sub-segment gets messaging tailored to the signal that drove their high fit score)
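Step 3 of that sequence is easy to sketch. Assuming each scored account carries a tier and a semicolon-separated list of top drivers (both assumptions about how your scored CSV is shaped), grouping by the first driver gives you ready-made sub-segments.

```python
from collections import defaultdict

def segment_by_top_driver(accounts, tier="Tier 1"):
    """Group accounts in one tier by their first-listed driver."""
    segments = defaultdict(list)
    for account in accounts:
        if account["tier"] != tier:
            continue
        drivers = [d.strip() for d in account["top_drivers"].split(";") if d.strip()]
        primary = drivers[0] if drivers else "Unspecified"
        segments[f"{tier} - {primary}"].append(account["domain"])
    return dict(segments)

# Invented scored accounts for illustration.
accounts = [
    {"domain": "acme.example", "tier": "Tier 1", "top_drivers": "Snowflake users; Rapid hiring growth"},
    {"domain": "globex.example", "tier": "Tier 1", "top_drivers": "Recent funding"},
    {"domain": "initech.example", "tier": "Tier 1", "top_drivers": "Snowflake users"},
    {"domain": "umbrella.example", "tier": "Tier 2", "top_drivers": "Rapid hiring growth"},
]

segments = segment_by_top_driver(accounts)
for name, domains in segments.items():
    print(name, domains)
```

Each resulting group maps to one campaign message, so the segment name doubles as a messaging brief.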

“ABM is a full-funnel full experience from marketing to sales to closed-won to post-sale… you may need to adjust some of these assets to ensure that the experience is unique and stays consistent across the whole campaign.”

– Steve Armenti

If your Tier 1 accounts scored high because they’re using legacy BI tools and you sell a modern data stack, your campaign messaging should speak directly to modernization pain points.

If they scored high because they’re hiring aggressively in sales, your messaging should focus on scaling revenue operations without scaling headcount.

Your ABM fit score build play (next steps)

1. Export your enriched lookalike list

If you don’t have one yet, build it using CRM pattern analysis first. Make sure it includes firmographics, technographics, and job title data (at minimum).

2. Run a test scoring batch on 20-50 accounts

Upload to ChatGPT in agent mode. Use the prompt framework from this article. Review the output for logic and driver alignment.

3. Enrich strategically, not exhaustively

Use Clay’s free tier and cheap signals first. Only add expensive enrichments (tech stack, news, G2 data) if your test batch proves they impact fit scores.

4. Import scores to your CRM and tier immediately

Create Tier 1, Tier 2, and Tier 3 lists. Enroll each tier in the appropriate campaign workflows. Don’t let scored accounts sit in limbo.

5. Refresh quarterly

Set a calendar reminder to re-run enrichment and scoring every 90 days. Account fit isn’t static.
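If you store the date each account was last scored, the 90-day check can be a few lines instead of a calendar reminder. This is a sketch with invented dates; the field layout is an assumption.

```python
from datetime import date, timedelta

def accounts_due_for_refresh(last_scored, today, max_age_days=90):
    """Return domains whose last score is at least `max_age_days` old."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(
        domain for domain, scored_on in last_scored.items()
        if scored_on <= cutoff
    )

# Invented scoring dates for illustration.
last_scored = {
    "acme.example": date(2025, 1, 15),
    "globex.example": date(2025, 5, 1),
    "initech.example": date(2025, 2, 20),
}

due = accounts_due_for_refresh(last_scored, today=date(2025, 6, 1))
print("Re-enrich and re-score:", due)
```

Passing `today` in explicitly keeps the check easy to test and lets you preview what will be due next month.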

The real ROI of AI fit scoring is time, not accuracy

Manual lead scoring in HubSpot isn’t just slow. It’s fragile.

Every time your ICP shifts or you add a new signal, you’re back in the menus, clicking through dropdown rules. It takes weeks to rebuild and weeks to QA.

With the right data in place, this workflow can run in hours, and it replaces what used to take “weeks and weeks” of manual setup and QA. When your ICP changes, you tweak the prompt and re-run the analysis. That means far less manual configuration and fewer lengthy QA loops.

The accuracy argument is secondary. Yes, AI finds patterns humans miss. But the real advantage is speed and adaptability.

Your ICP will shift. Your market will change. And your best customers six months from now might look different than your best customers today.

Manual scoring can’t keep up. With AI, you can re-run the scoring in hours.

That’s the unlock.

Ready to stop guessing which accounts deserve your budget? Learn how to use AI and enriched CRM data to build lookalike lists, score them for fit, and launch account-based marketing B2B campaigns that prioritize the accounts most likely to close. → Join Steve Armenti’s live course on AI-powered ABM list building.

Here’s a sneak peek at what you’ll be learning in the course:

HubSpot isn’t your scoring engine. Your data is. (Get ABM fit with AI lead scoring)
