SEO Agency Scorecard: Free Evaluation Template with Semantic SEO Criteria [2026]
An SEO agency scorecard is a weighted evaluation framework that lets you compare multiple agency proposals against objective, predefined criteria, eliminating decisions based on presentation quality or vague promises. This template includes seven weighted dimensions, semantic SEO-specific evaluation criteria, evidence requirements for each criterion, and a scoring guide based on POS1’s methodology for evaluating genuine SEO expertise.
Why You Need a Scorecard (Not Just a Gut Feeling)
When comparing 2–5 SEO agency proposals, the risk isn’t choosing “expensive vs. cheap”; it’s choosing an agency whose methodology doesn’t match what actually drives results in 2026. A scorecard forces structured comparison and prevents the most common evaluation mistake: selecting the agency with the most polished pitch deck rather than the strongest methodology.
| Evaluation Approach | Risk | Outcome |
|---|---|---|
| Gut feeling / best presenter | High – selects for sales skill, not SEO skill | Frequent misalignment with expectations |
| Lowest price | High – races to the bottom on deliverable quality | Thin content, no methodology |
| Biggest brand name | Medium – junior staff handle accounts | Generic, template-driven work |
| Weighted scorecard | Low – forces evidence-based comparison | Objective selection aligned to your goals |
The Complete SEO Agency Scorecard Template
How to use this scorecard:
- Score each agency on each criterion from 1–5
- Multiply score × weight to get the weighted score
- Sum all weighted scores for a total out of 5.0
- Agencies scoring below 3.0 should be eliminated
- Top 2 agencies advance to live presentation round
| # | Criterion | Weight | Agency A | Agency B | Agency C |
|---|---|---|---|---|---|
| 1 | Semantic SEO methodology | 25% | _ | _ | _ |
| 2 | Case studies with verifiable metrics | 20% | _ | _ | _ |
| 3 | Technical SEO depth | 15% | _ | _ | _ |
| 4 | Content strategy quality | 15% | _ | _ | _ |
| 5 | Reporting and communication | 10% | _ | _ | _ |
| 6 | GEO/LLMO capability | 10% | _ | _ | _ |
| 7 | Pricing value (deliverables/dollar) | 5% | _ | _ | _ |
| | Total | 100% | _/5.0 | _/5.0 | _/5.0 |
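The arithmetic above can be sketched in a few lines. This is a minimal illustration, not part of the template itself; the Agency A scores below are hypothetical.

```python
# Weights from the scorecard above; they must sum to 100% (1.0).
WEIGHTS = {
    "Semantic SEO methodology": 0.25,
    "Case studies with verifiable metrics": 0.20,
    "Technical SEO depth": 0.15,
    "Content strategy quality": 0.15,
    "Reporting and communication": 0.10,
    "GEO/LLMO capability": 0.10,
    "Pricing value": 0.05,
}

def weighted_total(scores):
    """Each criterion is scored 1-5; returns the weighted total out of 5.0."""
    assert set(scores) == set(WEIGHTS), "score every criterion exactly once"
    return sum(scores[criterion] * weight for criterion, weight in WEIGHTS.items())

# Hypothetical example: Agency A's scores on each criterion.
agency_a = {
    "Semantic SEO methodology": 4,
    "Case studies with verifiable metrics": 5,
    "Technical SEO depth": 3,
    "Content strategy quality": 4,
    "Reporting and communication": 4,
    "GEO/LLMO capability": 3,
    "Pricing value": 4,
}

print(round(weighted_total(agency_a), 2))  # → 3.95: above the 3.0 cutoff, so Agency A advances
```

Because the weights sum to 1.0, the total stays on the same 1–5 scale as the individual scores, which makes the 3.0 elimination cutoff directly comparable across agencies.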
Criterion 1: Semantic SEO Methodology (Weight: 25%)
This is the most important criterion in 2026: it determines whether an agency understands how Google actually evaluates content quality.
Evidence to request:
- Example topical map built for a client (even anonymized)
- Explanation of their entity optimization process
- Sample content brief showing EAV structure and Question H2s
- How they handle content cannibalization
Scoring guide:
- 5 – Presents a specific topical map, explains entity-attribute-value structure, references the Koray Framework or an equivalent semantic methodology
- 4 – Explains topic clusters and intent alignment with a clear process, has sample deliverables
- 3 – Mentions topical authority but can’t show a concrete example
- 2 – Focuses on keywords, mentions “content clusters” without methodology
- 1 – Talks only about “quality content” and keyword research
Criterion 2: Case Studies with Verifiable Metrics (Weight: 20%)
Case studies reveal actual results, not projected outcomes. Require metrics that show the full picture: traffic, not just rankings.
Evidence to request:
- 2–3 case studies in your industry vertical
- GSC screenshots showing impression and click growth
- Timeline: how long did results take to materialize?
- Client references willing to confirm results
Scoring guide:
- 5 – 2+ case studies with GSC data, traffic AND conversion metrics, and a verifiable client reference
- 4 – Case studies with traffic data, timeline, and client name (even if a reference is not provided)
- 3 – Case studies with ranking improvements only (no traffic or conversion data)
- 2 – One vague case study without metrics
- 1 – No case studies, or “all work is confidential”
→ See what documented semantic SEO case studies look like: +340% traffic e-commerce case | +156% conversion B2B SaaS case
Criterion 3: Technical SEO Depth (Weight: 15%)
Evidence to request:
- List of technical SEO elements covered in their audit
- How they handle Core Web Vitals remediation
- Schema markup types they implement by default
- Crawl budget optimization approach for large sites
Scoring guide:
- 5 – Covers CWV, schema markup (FAQPage, Article, Organization), crawl optimization, JavaScript SEO, and international SEO where applicable
- 4 – Strong technical audit process with prioritized action tiers
- 3 – Standard audit (titles, metas, 404s, speed) without semantic technical elements
- 2 – Mentions technical SEO without specific deliverables
- 1 – No technical SEO component in the proposal
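For reference when checking the top score above: the schema types it names have a well-defined JSON-LD shape per schema.org. The sketch below builds a minimal FAQPage object in Python; the question and answer text are placeholders, and an agency implementing schema “by default” should be producing markup of this shape.

```python
import json

# Minimal FAQPage JSON-LD, one of the schema types a technically strong
# agency should implement by default. Structure follows schema.org/FAQPage;
# the question and answer text here are placeholders.
faq_schema = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "What score indicates a good agency?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "A weighted total above 3.5/5.0 indicates a strong agency.",
            },
        }
    ],
}

# Emit the snippet ready to embed in a <script type="application/ld+json"> tag.
print(json.dumps(faq_schema, indent=2))
```

An agency that can show you its default JSON-LD templates (and explain which types it applies to which page templates) earns the higher scores on this criterion.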
Criterion 4: Content Strategy Quality (Weight: 15%)
Evidence to request:
- Sample content brief or content calendar
- How they determine content priority (by traffic potential, intent, topical gap?)
- Their process for content that doesn’t rank after 3 months
- Word count and format guidelines
Scoring guide:
- 5 – Topic cluster approach with pillar-spoke architecture, intent-aligned formats, and EAV content structure
- 4 – Clear content calendar with topic clusters and intent classification
- 3 – Volume-based content plan without cluster architecture
- 2 – Generic “X posts per month” without strategic rationale
- 1 – AI-generated bulk content with no editorial process
Criterion 5: Reporting and Communication (Weight: 10%)
Evidence to request:
- Sample monthly report (anonymized)
- Communication cadence (Slack, email, calls)
- How they report on KPIs vs. vanity metrics
- Escalation process for underperforming months
Scoring guide:
- 5 – Monthly report includes GSC impressions and clicks, organic traffic trend, conversion attribution, upcoming actions, and honest analysis of what’s not working
- 4 – Clear reporting with traffic data and actions, regular calls
- 3 – Standard ranking report with some traffic data
- 2 – Rankings-only report, infrequent communication
- 1 – No reporting sample provided, or the report shows only rankings
Criterion 6: GEO/LLMO Capability (Weight: 10%)
Evidence to request:
- How they optimize content for Google AI Overviews
- Whether they track AI citation mentions for clients
- Their approach to structured data for AI search
Scoring guide:
- 5 – Documented GEO methodology, tracks AI Overview appearances in GSC, and has AI citation monitoring in place
- 4 – Clear understanding of GEO principles, implements schema for AI visibility
- 3 – Aware of AI search but no specific methodology
- 2 – Mentions AI search without practical implementation
- 1 – Unaware of GEO/LLMO, or dismisses the relevance of AI search
→ Understand what GEO/LLMO means: From Semantic SEO to GEO/LLMO Guide
Frequently Asked Questions
How many agencies should I evaluate with this scorecard?
Evaluate 3–5 agencies. Fewer limits comparison; more creates evaluation overhead with diminishing returns. Pre-screen by reviewing agency websites and case study pages before requesting proposals, eliminating obvious mismatches before investing in the full evaluation process.
What score indicates a good agency?
A weighted total above 3.5/5.0 indicates a strong agency. Scores of 4.0+ represent agencies with genuine semantic SEO capability and documented results. Eliminate any agency scoring below 3.0/5.0 regardless of price: a methodology gap that large produces poor results at any budget.
Should price be a major factor?
Price should be the last factor evaluated, not the first. The difference between a $3,000/month agency and a $7,000/month agency is meaningless if the cheaper one lacks semantic SEO methodology: you’ll pay for 12 months of work that doesn’t compound. Evaluate methodology and results first, then assess whether the pricing reflects the value of what’s being delivered.
How is this scorecard different for semantic SEO agencies vs. traditional SEO?
Traditional SEO scorecards focus on backlinks, keyword rankings, and technical audits. This scorecard weights semantic methodology (topical maps, entity optimization, EAV structure) at 25%, the highest of any single factor, because in 2026 agencies without a semantic SEO methodology are optimizing for signals Google is actively devaluing. Use the RFP template alongside this scorecard for a complete evaluation process.
