Harmonize Scores 100% on the Customs Broker License Exam

Harmonize achieved a perfect 100% score on the classification section of the Customs Broker License Exam (CBLE), correctly answering all 22 product classification questions from the April 2025 and October 2024 exams at the 8-digit HTS level. Here is how we tested, how we improved, and how we compare to other AI classification tools on the market.

What Is the CBLE?

The Customs Broker License Exam is the licensing examination administered by U.S. Customs and Border Protection (CBP). It is the gateway to becoming a licensed customs broker in the United States. The exam consists of 80 multiple-choice questions to be completed in 4.5 hours, with a 75% passing score required.

The exam is divided into categories. Category IV covers product classification under the Harmonized Tariff Schedule of the United States (HTSUS) — the core skill that determines whether an importer pays the correct duty rate. These questions present a product description and ask the examinee to select the correct HTS code from four options, typically at the 8-digit level.

For customs brokers, this section is both the most practical and the most challenging. It requires navigating General Rules of Interpretation (GRI), Section and Chapter Notes, and thousands of heading and subheading provisions — the same skills required every day in a working brokerage.

Our Benchmark Methodology

We tested Harmonize against 22 classification questions drawn from the April 2025 and October 2024 CBLE examinations. These are the questions from Category IV (Questions 26-40 on each exam) that test product classification ability.

We deliberately excluded tariff knowledge questions — those that test memorization of duty rates, trade program eligibility, or specific regulatory provisions. Those questions assess recall, not classification reasoning. Our benchmark focuses exclusively on the skill that matters for production use: given a product description, can the engine identify the correct HTS code?

Each question was run through the same production classification engine available to all Harmonize users. No test-specific prompts, no manual overrides, no post-hoc corrections. The engine received the product description and returned a classification. We compared that classification against the official CBP answer key.
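The evaluation loop above is simple to state precisely. The sketch below shows one way to implement it; `classify` is a hypothetical stand-in for a call to the production engine, and the question data is illustrative, not drawn from the actual exam or answer key.

```python
# Minimal sketch of the benchmark loop: run each product description
# through the engine, compare against the answer key, report accuracy.
# `classify` is a hypothetical stand-in for the real engine call.

def classify(description: str) -> str:
    # Illustrative fixed lookup; the real engine reasons over the HTSUS.
    demo_engine = {
        "woven cotton shirt": "6205.20.20",
        "plastic kitchen spoon": "3924.10.40",
    }
    return demo_engine.get(description, "0000.00.00")

def run_benchmark(questions: list[dict]) -> float:
    """Compare each engine answer to the key; return the accuracy."""
    correct = sum(
        1 for q in questions if classify(q["description"]) == q["answer"]
    )
    return correct / len(questions)

questions = [
    {"description": "woven cotton shirt", "answer": "6205.20.20"},
    {"description": "plastic kitchen spoon", "answer": "3924.10.40"},
]
print(run_benchmark(questions))  # 1.0 on this toy set
```

An exact 8-digit string match against the key is the whole scoring rule at this stage; partial credit only matters in the multi-tool comparison later.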

Results: 100% (22/22)

Harmonize correctly classified every product in the benchmark set at the 8-digit HTS level. Every answer matched the correct multiple-choice selection on the official CBP answer key.

The questions covered a representative range of classification challenges:

  • Material-based classification: Products classified primarily by their constituent material (textiles, plastics, metals)
  • Function-based classification: Products classified by their intended use or function (machinery, instruments)
  • GRI 3 composite goods: Products composed of multiple materials requiring essential character analysis
  • Chapter Note exclusions: Products that appear to belong in one chapter but are excluded by a Note and classified elsewhere
  • Numeric threshold questions: Products where a specific percentage (e.g., 50% cotton by weight) determines the classification
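The numeric-threshold category is the most mechanical of the five, and a toy example makes the shape of the rule clear. This is a deliberate simplification: real HTSUS chief-weight determinations involve Section XI notes and many fiber types, and the function names and 50% figure here are illustrative only.

```python
# Toy illustration of a numeric-threshold rule: a stated percentage in
# the product description decides which provision applies. Hypothetical
# simplification of the HTSUS chief-weight rules for textiles.

def chief_weight_fiber(composition: dict[str, float]) -> str:
    """Return the fiber that predominates by weight."""
    return max(composition, key=composition.get)

def cotton_predominates(composition: dict[str, float]) -> bool:
    """True when cotton meets the 50%-by-weight threshold in the example."""
    return composition.get("cotton", 0.0) >= 0.50

blend = {"cotton": 0.55, "polyester": 0.45}
print(chief_weight_fiber(blend))    # cotton
print(cotton_predominates(blend))   # True
```

The engine's job on these questions is extracting the percentage accurately from the description and applying it to the right note, which is exactly where v2 still failed.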

Score Progression: From 68% to 100%

The 100% result did not come on the first attempt. Our benchmark scores tracked the engine's development over three major versions:

  • v1 (September 2025): 68% (15/22) — The initial engine struggled with material vs. function classification conflicts and missed several Chapter Note exclusions
  • v2 (December 2025): 86% (19/22) — Improved material composition detection and GRI hierarchy application. Remaining errors were in numeric threshold extraction and cross-chapter exclusion rules
  • v3 (February 2026): 100% (22/22) — Key improvements in three areas: reliable material vs. function detection when both are classification-relevant, accurate numeric threshold extraction from product descriptions, and consistent application of cross-chapter exclusion rules from Section and Chapter Notes

Each version improvement came from analyzing the specific failure modes in the benchmark. The CBLE questions proved to be an excellent diagnostic tool — each missed question pointed to a concrete gap in the classification pipeline that could be addressed systematically.

Competitor Comparison

We benchmarked two other AI classification tools on the same exam questions using the same methodology:

  • Harmonize: 100% (22/22)
  • Gaia Dynamics: 100% (22/22)
  • KYG Trade: 93% (20.5/22)

Gaia Dynamics matched our perfect score. KYG Trade missed 1.5 questions — one outright incorrect classification and one partial-credit answer where the 6-digit subheading was correct but the 8-digit tariff line was wrong.
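The half-point above implies a partial-credit rule: full credit for an exact 8-digit match, half credit when only the first six digits agree. The sketch below is one consistent implementation of that rule; the codes used are illustrative, not answers from the exam.

```python
# Sketch of the partial-credit scoring implied by the comparison:
# 1.0 for an exact 8-digit match, 0.5 for a 6-digit-only match.
# Dots are stripped so codes compare digit-by-digit.

def score_answer(predicted: str, key: str) -> float:
    p, k = predicted.replace(".", ""), key.replace(".", "")
    if p[:8] == k[:8]:
        return 1.0          # correct 8-digit tariff line
    if p[:6] == k[:6]:
        return 0.5          # correct 6-digit subheading only
    return 0.0

print(score_answer("6205.20.20", "6205.20.20"))  # 1.0
print(score_answer("6205.20.30", "6205.20.20"))  # 0.5
print(score_answer("6109.10.00", "6205.20.20"))  # 0.0
```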

All three tools were tested on identical questions with identical methodology. No tool received advance access to the questions or answer key. These results reflect production-grade performance from each platform's publicly available classification engine.

Methodology Transparency

We believe benchmark results are only meaningful if they are reproducible and transparent. Here is what we commit to:

  • No test-specific overrides: Every code change that improved benchmark scores was made to the production classification engine. There are no "benchmark mode" flags or question-specific patches
  • Full per-question results: The complete question-by-question breakdown — including the product description, the engine's classification, and the correct answer — is available on our benchmark page
  • Ongoing benchmarking: As new CBLE exams are released, we will add those questions to the benchmark set and publish updated results
  • Production parity: The engine version used for benchmarking is the same version available to every Harmonize user at the time of testing

We encourage other classification tool providers to publish their own CBLE benchmark results. A shared, transparent benchmark benefits the entire customs brokerage industry by giving brokers objective data when evaluating tools.

What the CBLE Benchmark Does and Does Not Measure

The CBLE is a strong benchmark because it tests real classification reasoning on questions vetted by CBP. However, it has limitations worth acknowledging:

  • Multiple choice format: The engine selects from four options rather than generating a classification from scratch. This is easier than open-ended classification because the answer space is constrained
  • Clean product descriptions: CBLE questions provide well-structured product descriptions. Real-world classification often starts with vague importer-provided descriptions like "plastic part" or "metal component"
  • Limited scope: 22 questions cannot cover the full complexity of the HTSUS. Edge cases in specific chapters (particularly textiles, chemicals, and machinery) require deeper testing
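The multiple-choice point in the list above is worth making concrete: with four options, even a free-form engine output can be snapped to the nearest candidate. The matching rule below (longest shared digit prefix) is a hypothetical illustration of why the constrained answer space is more forgiving, not a description of how Harmonize actually answers.

```python
# Illustration of a constrained answer space: map a free-form engine
# output onto the closest of four options by shared digit prefix.
# Hypothetical matching rule, shown only to make the limitation concrete.

def match_option(engine_code: str, options: list[str]) -> str:
    def shared(a: str, b: str) -> int:
        a, b = a.replace(".", ""), b.replace(".", "")
        n = 0
        for x, y in zip(a, b):
            if x != y:
                break
            n += 1
        return n
    return max(options, key=lambda opt: shared(engine_code, opt))

options = ["6205.20.20", "6205.30.20", "6109.10.00", "3926.20.90"]
print(match_option("6205.20.20", options))  # 6205.20.20
```

Open-ended classification offers no such safety net, which is why the limitations below matter when extrapolating from this benchmark.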

That said, 100% on the CBLE classification section means the engine correctly applies GRI hierarchy, reads Section and Chapter Notes, handles material-based and function-based classification, and resolves composite goods — the fundamental skills required for tariff classification.

Try the Engine That Passes the CBLE

Harmonize.ai scored 100% on the Customs Broker License Exam classification section. See the full per-question results on our benchmark page, or try the classification engine yourself.

Try Harmonize Free

View full benchmark results • Free 6-digit classification • No account required


This article is for informational purposes only and does not constitute legal or customs brokerage advice. Importers and brokers should consult with a licensed customs broker or trade attorney for guidance on specific classification and compliance decisions. Harmonize.ai is a classification research tool operating under 19 U.S.C. § 1641 — we provide research support, not customs brokerage services.