Reading PPP Loan Data to Estimate a Small Business's Payroll and Headcount
Last updated: July 2026
If you are sizing up an off-market business — one that isn't listed anywhere, whose owner you haven't met — you have almost no financial information to work with. No listing blurb, no broker CIM, no revenue figure. That is the whole point of off-market sourcing: you are looking at companies before a sale process exists, which means before anyone has packaged financials for you.
There is one public dataset that comes surprisingly close to a P&L line for millions of American small businesses: the Paycheck Protection Program loan data the SBA released under FOIA. Read correctly, a PPP record gives you a defensible estimate of a company's payroll and headcount as of 2020–21 — for free, from your desk, before you ever pick up the phone.
Read incorrectly, it becomes the kind of overreach that gets sourcing methods dismissed. This guide covers both: the mechanics of turning a PPP record into a payroll estimate, and the honesty rules that keep the estimate an estimate.
What the PPP dataset is
The Paycheck Protection Program ran in 2020 and 2021, lending to small businesses to keep staff on payroll during the pandemic. After FOIA litigation, the SBA published the loan-level data for the entire program — every borrower, name and location included. It is downloadable from the SBA's open-data portal at data.sba.gov/dataset/ppp-foia, and it covers on the order of 11.5 million loans.
For a searcher or ETA buyer, the useful columns in each record are:
- Borrower name, city, state — enough to match a record to a company you're tracking.
- Loan amount (initial and current approval) — the key number, for the reason below.
- Jobs reported — the employee count the borrower stated on the application.
- Date approved and loan status — when the loan was made, and whether it was forgiven, paid in full, or charged off.
- NAICS industry code — so you can filter to your vertical.
- Business age description — the borrower's own statement of how established the business was.
No other public dataset ties a specific small business's name to a payroll-derived dollar figure. SBA 7(a) records tell you a bank underwrote the company (a strong signal in its own right); registries tell you how long it has existed. PPP is the one that gives you a number you can convert to scale.
The formula: a PPP loan is 2.5× monthly payroll
Here is why the loan amount is so informative. PPP loan sizes weren't negotiated. Under the program's rules, a first-draw loan was calculated as 2.5× the borrower's average monthly payroll costs (with compensation above $100,000 annualized per employee excluded from the calculation; see the SBA's program documentation at sba.gov). Second-draw loans in 2021 used the same 2.5× multiple, except for accommodation and food services borrowers (NAICS codes beginning with 72), whose multiple was 3.5×.
That makes the arithmetic mechanical:
- Monthly payroll ≈ loan amount ÷ 2.5
- Annual payroll ≈ monthly payroll × 12
A company that borrowed $500,000 was running roughly $200,000 a month — about $2.4 million a year — in payroll costs at the time of its application. Cross-check that against the jobs-reported field and you have both a dollar figure and a headcount, each stated by the borrower on a federal application, for a company that has never published a financial statement in its life.
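The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not an official SBA calculation: the names `PayrollSnapshot` and `payroll_from_loan` are made up for this example, and the 2.5× multiple is assumed to apply to the loan in question.

```python
from dataclasses import dataclass

PPP_PAYROLL_MULTIPLE = 2.5  # assumed: loan = 2.5x average monthly payroll


@dataclass(frozen=True)
class PayrollSnapshot:
    """Hypothetical container that keeps the filing year attached to the figure."""
    monthly_payroll: float
    annual_payroll: float
    filing_year: int

    def label(self) -> str:
        # Always state the vintage next to the number.
        return (f"roughly ${self.annual_payroll / 1e6:.1f}M/yr payroll "
                f"as of its {self.filing_year} PPP filing")


def payroll_from_loan(loan_amount: float, filing_year: int) -> PayrollSnapshot:
    """Invert the loan formula: monthly = loan / 2.5, annual = monthly * 12."""
    if loan_amount <= 0:
        raise ValueError("loan_amount must be positive")
    monthly = loan_amount / PPP_PAYROLL_MULTIPLE
    return PayrollSnapshot(monthly, monthly * 12, filing_year)


snapshot = payroll_from_loan(500_000, 2020)
print(snapshot.label())
```

Carrying the filing year inside the result, rather than as a separate note, makes it harder to quote the number without its date.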
This is the closest thing to a P&L line that exists publicly for an off-market small business. It is also exactly where the overreach starts, so:
The four honesty rules
1. It is a 2020–21 snapshot, and you must date it. The company you're looking at today is five-plus years past its PPP application. It may have doubled; it may have shrunk. Always carry the filing year with the number — "roughly $2.4M/yr payroll as of its 2020 PPP filing," never "does $2.4M in payroll."
2. Payroll is not revenue. The formula gives you payroll costs. To get from payroll to revenue you need an assumption about the industry's revenue-per-payroll-dollar or revenue-per-employee, and that assumption must be named. The defensible way to do it: take a headcount figure and multiply by the industry's receipts-per-employee from the Census Bureau's Statistics of U.S. Businesses (SUSB); the 2017 Economic Census is the latest with published receipts, so adjust for inflation. That yields a labeled revenue band ("likely low-seven-figures for a shop this size in this NAICS"), not a revenue fact.
3. Nothing here estimates EBITDA. Payroll and headcount say how big a company is, not what it earns. No public record supports a profitability estimate, and any method (or tool) that hands you one is overreaching. Margins are what diligence and the owner conversation are for.
4. The compensation cap skews some firms. Because pay above $100k annualized per employee was excluded from the loan calculation, businesses with highly paid staff — a dental practice with associate dentists, say — will show less payroll in the formula than they actually run. Treat the derived figure as a floor in high-wage verticals.
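Rules 2 and 4 can be made concrete with a short Python sketch. Everything numeric here is hypothetical: the salaries, the $150,000 receipts-per-employee figure (a placeholder, not a real SUSB value), and the ±30% band width are illustrative assumptions, and the function names are invented for this example.

```python
COMP_CAP = 100_000  # annualized pay above this, per employee, was excluded


def capped_annual_payroll(salaries):
    """Annual payroll as the loan formula counted it (each salary capped)."""
    return sum(min(salary, COMP_CAP) for salary in salaries)


def revenue_band(headcount, receipts_per_employee, inflation_factor=1.0, spread=0.30):
    """Return a labeled low/high revenue band in whole dollars.

    receipts_per_employee: industry benchmark you looked up yourself.
    inflation_factor: multiplier bringing the benchmark's year forward.
    spread: half-width of the band; 0.30 is an arbitrary illustrative choice.
    """
    midpoint = headcount * receipts_per_employee * inflation_factor
    return {
        "low": round(midpoint * (1 - spread)),
        "high": round(midpoint * (1 + spread)),
        "inputs": (f"{headcount} employees x ${receipts_per_employee:,}/employee "
                   f"x {inflation_factor} inflation, +/-{spread:.0%}"),
    }


# Hypothetical practice: two associates at $180k, three staff below the cap.
salaries = [180_000, 180_000, 60_000, 45_000, 45_000]
actual = sum(salaries)                      # what the practice really pays
counted = capped_annual_payroll(salaries)   # what the loan formula saw

band = revenue_band(headcount=len(salaries), receipts_per_employee=150_000)
print(actual, counted, band)
```

In this made-up case the formula sees $350,000 of a $510,000 payroll, which is why the derived figure is best read as a floor in high-wage verticals. The band keeps its inputs in the result so the estimate never travels without its assumptions.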
What loan status tells you (the underrated field)
Most people stop at the amount. The loan status field is quietly one of the most useful columns in the dataset:
- Forgiven or paid in full means the business held up its end, and it is evidence the business survived the pandemic intact. Be precise about what was verified, though: the payroll figure was documentation-backed at origination, when lenders collected payroll records to set the loan amount, but forgiveness review depended on size. Borrowers with loans above $150,000 submitted payroll documentation for forgiveness review; most loans of $150,000 or less were forgiven via self-certification on SBA Form 3508S.
- Charged off is a caution flag. Charged-off PPP borrowers are disproportionately businesses that later failed or were never substantial. A charged-off record doesn't prove a company is gone, but it moves it down your list.
When Scouly builds its company spine from this dataset, it applies exactly this logic: PPP-only companies enter the database only when the loan was at least $150,000 and the status shows it was repaid or forgiven — because a deal-sourcing feed must not surface defunct businesses. You can apply the same two filters in a spreadsheet.
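As a rough sketch of those two filters in Python (not Scouly's actual implementation): the function name `screen_record` is invented here, and the status strings are assumed spellings that should be checked against the values in the file you download.

```python
MIN_LOAN = 150_000                                  # threshold named in the text
SURVIVOR_STATUSES = {"paid in full", "forgiven"}    # assumed spellings
CAUTION_STATUSES = {"charged off"}                  # assumed spelling


def screen_record(loan_amount, loan_status):
    """Return 'keep', 'caution', or 'skip' for one PPP record."""
    status = loan_status.strip().lower()
    if status in CAUTION_STATUSES:
        return "caution"   # not proof the company is gone; moves it down the list
    if status in SURVIVOR_STATUSES and loan_amount >= MIN_LOAN:
        return "keep"
    return "skip"          # too small, or status unknown / still outstanding


print(screen_record(500_000, "Paid in Full"))
print(screen_record(90_000, "Paid in Full"))
print(screen_record(250_000, "Charged Off"))
```

Returning a three-way label instead of a boolean keeps charged-off records visible as caution flags rather than silently dropping them.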
Running the method yourself
- Download the PPP FOIA files from data.sba.gov/dataset/ppp-foia. They're large CSVs; filter early.
- Filter to your vertical and metro using the NAICS code prefix and the borrower city/state.
- Derive payroll: loan amount ÷ 2.5 = monthly payroll; × 12 = annual. Note the approval year next to every figure.
- Keep the jobs-reported number alongside the dollars: two separate borrower-stated measures of scale that should roughly agree.
- Screen on loan status — prioritize forgiven/paid-in-full, flag charged-off.
- Cross-reference the rest of the footprint: state-registry formation date for longevity, SBA 7(a)/504 records for bank underwriting, Form 5500 filings for a current headcount check (PPP is frozen in 2020–21; 5500s refresh annually).
- Rank and sequence. Write down what you're looking for, rank the list against it, and start conversations with the best-fit operators — before a listing exists.
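The filtering and derivation steps above can be sketched with Python's standard `csv` module, reading row by row so the large files never have to fit in memory. Treat this as a sketch under assumptions: the column names (`BorrowerName`, `NAICSCode`, `CurrentApprovalAmount`, and so on), the `MM/DD/YYYY` date format, and the status spelling should be verified against the data dictionary published with the download. The sample rows are invented for illustration and are not real borrowers.

```python
import csv
import io
from datetime import datetime

# Made-up rows standing in for a downloaded file; column names are assumed.
SAMPLE_CSV = """BorrowerName,BorrowerCity,BorrowerState,NAICSCode,CurrentApprovalAmount,JobsReported,DateApproved,LoanStatus
EXAMPLE HVAC CO,Austin,TX,238220,500000,42,04/15/2020,Paid in Full
EXAMPLE PLUMBING LLC,Austin,TX,238220,90000,8,05/02/2020,Paid in Full
EXAMPLE DENTAL PC,Austin,TX,621210,300000,20,04/20/2020,Paid in Full
EXAMPLE MECHANICAL INC,Austin,TX,238220,250000,19,02/10/2021,Charged Off
"""


def shortlist(rows, naics_prefix, city, state, min_loan=150_000):
    """Filter to a vertical and metro, derive payroll, and rank the result."""
    out = []
    for row in rows:
        if not row["NAICSCode"].startswith(naics_prefix):
            continue
        if row["BorrowerCity"].strip().upper() != city.upper():
            continue
        if row["BorrowerState"].strip().upper() != state.upper():
            continue
        loan = float(row["CurrentApprovalAmount"])
        if loan < min_loan:
            continue
        monthly = loan / 2.5  # assumes the 2.5x multiple applies
        out.append({
            "name": row["BorrowerName"],
            "filing_year": datetime.strptime(row["DateApproved"], "%m/%d/%Y").year,
            "monthly_payroll": monthly,
            "annual_payroll": monthly * 12,
            "jobs_reported": int(float(row["JobsReported"] or 0)),
            "charged_off": row["LoanStatus"].strip().lower() == "charged off",
        })
    # Charged-off records sink to the bottom; larger payrolls rank first.
    out.sort(key=lambda r: (r["charged_off"], -r["annual_payroll"]))
    return out


candidates = shortlist(csv.DictReader(io.StringIO(SAMPLE_CSV)), "2382", "Austin", "TX")
for c in candidates:
    print(c)
```

To run it on a real download, replace the `io.StringIO(SAMPLE_CSV)` argument with a file opened via `open(path, newline="")`. The cross-referencing and thesis-ranking steps are left out here because they depend on sources outside this dataset.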
How Scouly uses PPP data (and what it refuses to do with it)
This dataset is one of the public records Scouly is built on, so it's worth being precise about the boundaries.
Scouly ingests the PPP FOIA set filtered to its seven verticals — about 433,000 loans out of the full 11.5 million — and attaches each one to the matching company as a payroll snapshot: the derived monthly and annual payroll, the jobs reported, the filing year, and the loan status, with every input named. Two deliberate design choices:
- The payroll snapshot carries zero score points. Scouly scores targets on three underwritten or structural signals — registry longevity, SBA loan history, and market fragmentation. PPP-derived figures are evidence, shown to help you size a deal, and deliberately excluded from scoring precisely because they are estimates from a dated snapshot.
- Estimates are labeled as estimates, and EBITDA is never estimated. A revenue band shown on a Scouly target names its inputs (headcount evidence × Census SUSB receipts-per-employee) and its vintage. There is no profitability number anywhere in the product, because no public record supports one.
You can see every source and formula on the data page. And you can run the whole method by hand — Scouly's job is doing the cross-referencing at scale across hundreds of metros, not owning the data.
FAQ
Is PPP loan data public and legal to use? Yes. The SBA released the loan-level PPP data under the Freedom of Information Act, and it is downloadable by anyone at data.sba.gov. Using public records to research acquisition targets is standard practice.
Can I calculate a business's revenue from its PPP loan? Not directly. The loan encodes payroll (2.5× average monthly payroll cost, by program formula). You can build a revenue estimate by combining a headcount figure with industry receipts-per-employee benchmarks from Census SUSB data — but it must be presented as a labeled estimate with named inputs, never as a fact.
How accurate is the jobs-reported field? It is what the borrower stated on a federal loan application. Payroll was documentation-backed at origination, though most forgiveness for loans of $150,000 or less was self-certified rather than re-reviewed. Treat it as a good-faith 2020–21 figure: reliable enough to sort a five-person shop from a forty-person operation, not precise enough to state a current headcount. For fresher employee counts, check whether the company files a Form 5500.
Does a PPP loan mean the business was struggling? No. PPP was near-universal among small employers during 2020–21 — taking the loan says the business had payroll to protect, not that it was distressed. A forgiven loan is, if anything, mild positive evidence: the payroll was documented and the business survived.