Matching Methods Used with Matching Rules

The Exact matching method looks for strings that match a pattern exactly. It can be used for almost any field, including custom fields. If you're using international data, we recommend that you use the Exact matching method with your matching rules.

The Fuzzy matching methods look for strings that match a pattern approximately.



Matching Method | Matching Algorithms | Scoring Method | Threshold | Special Handling
Exact | Exact | | |
Fuzzy: First Name | Exact, Initials, Jaro-Winkler, Name Variant | Maximum | 85 | The Middle Name field, if used in your matching rule, is compared by the Fuzzy: First Name matching method.
Fuzzy: Last Name | Exact, Keyboard Distance, Metaphone 3 | Maximum | 90 |
Fuzzy: Company Name | Acronym, Exact, Syllable Alignment | Maximum | 70 | Removes words such as Inc and Corp before comparing fields. Also, company names are normalized. For example, IBM is normalized to International Business Machines.
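
As a rough illustration of the special handling described for Fuzzy: Company Name, the Python sketch below strips a few suffix words and expands a known acronym before two names are compared. The suffix list, the alias table, and the function name normalize_company are hypothetical; the actual word lists and normalization rules are not given here.

```python
import re

# Hypothetical word list and alias table, for illustration only.
SUFFIXES = {"inc", "corp"}
ALIASES = {"ibm": "international business machines"}


def normalize_company(name: str) -> str:
    """Lowercase, drop punctuation and suffix words, then expand known aliases."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    kept = [w for w in words if w not in SUFFIXES]
    joined = " ".join(kept)
    return ALIASES.get(joined, joined)


# Both inputs reduce to the same normalized string.
print(normalize_company("IBM Corp."))
print(normalize_company("International Business Machines, Inc."))
```

After normalization, the two names can be passed to whichever matching algorithms the rule uses.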
Fuzzy: Phone | Exact | Weighted Average | 80 | Phone numbers are broken into sections and compared by those sections. Each section has its own matching method and match score. The section scores are weighted to come up with one score for the field. This process works best with North American data.
  • International code (Exact, 10% of field's match score)
  • Area code (Exact, 50% of field's match score)
  • Next 3 digits (Exact, 30% of field's match score)
  • Last 4 digits (Exact, 10% of field's match score)

For example, suppose these two phone numbers are being compared: 1-415-555-1234 and 1-415-555-5678.

All sections match exactly except the last 4 digits, so the field has a match score of 90, which is considered a match because it exceeds the threshold of 80.
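
The arithmetic in this example can be reproduced with a short Python sketch. The section weights come from the list above; the helper name phone_match_score and the hyphen-separated input format are assumptions made for illustration.

```python
# Section weights from the Fuzzy: Phone row, as percentages of the field score.
PHONE_WEIGHTS = [
    ("international code", 10),
    ("area code", 50),
    ("next 3 digits", 30),
    ("last 4 digits", 10),
]
THRESHOLD = 80


def phone_match_score(a: str, b: str) -> int:
    """Score two numbers written as international-area-prefix-line."""
    parts_a, parts_b = a.split("-"), b.split("-")
    if len(parts_a) != 4 or len(parts_b) != 4:
        raise ValueError("expected four hyphen-separated sections")
    # Each section is compared exactly, so it adds all of its weight or none.
    return sum(
        weight
        for (_, weight), x, y in zip(PHONE_WEIGHTS, parts_a, parts_b)
        if x == y
    )


score = phone_match_score("1-415-555-1234", "1-415-555-5678")
# The comparison follows the wording "exceeds the threshold".
print(score, score > THRESHOLD)  # 90 True
```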

Fuzzy: City | Edit Distance, Exact | Maximum | 85 |
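
The sketch below shows one way a Maximum scoring method could combine the two algorithms listed for Fuzzy: City. It assumes that Maximum means the highest algorithm score is used, and that edit distance is converted to a 0-100 score by dividing by the longer string's length. Neither detail is specified here, and all function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def exact_score(a: str, b: str) -> float:
    return 100.0 if a == b else 0.0


def edit_distance_score(a: str, b: str) -> float:
    # Assumed normalization: 100 for identical strings, 0 when nothing aligns.
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - edit_distance(a, b) / longest)


def city_match_score(a: str, b: str) -> float:
    # Assumed meaning of "Maximum": keep the best score across algorithms.
    a, b = a.strip().lower(), b.strip().lower()
    return max(exact_score(a, b), edit_distance_score(a, b))


THRESHOLD = 85
score = city_match_score("San Francisco", "San Fransisco")
print(round(score, 1), score > THRESHOLD)  # 92.3 True
```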
Fuzzy: Street | Exact | Weighted Average | 80 | Addresses are broken into sections and compared by those sections. Each section has its own matching method and match score. The section scores are weighted to come up with one score for the field. This process works best with North American data.
  • Street Name (Edit Distance, 50% of field's match score)
  • Street Number (Exact, 20% of field's match score)
  • Street Suffix (Exact, 15% of field's match score)
  • Suite Number (Exact, 15% of field's match score)

For example, suppose these two billing streets are being compared: 123 Market Street, Suite 100 and 123 Market Drive, Suite 300.

Because only the street number and street name match, the field has a match score of 70, which is not considered a match because it's less than the threshold of 80.
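
A minimal sketch of the weighted average for this example follows, assuming the addresses have already been split into sections. Parsing is not shown, the dictionary keys and function names are hypothetical, and difflib's similarity ratio stands in for the Edit Distance algorithm, whose exact scoring is not described here.

```python
from difflib import SequenceMatcher

# Section weights from the Fuzzy: Street row (percent of the field score).
STREET_WEIGHTS = {"name": 50, "number": 20, "suffix": 15, "suite": 15}
THRESHOLD = 80


def section_score(section: str, a: str, b: str) -> float:
    """Return a 0-100 score for one address section."""
    a, b = a.strip().lower(), b.strip().lower()
    if section == "name":
        # Stand-in for Edit Distance: a similarity ratio, not the real algorithm.
        return 100.0 * SequenceMatcher(None, a, b).ratio()
    return 100.0 if a == b else 0.0


def street_match_score(a: dict, b: dict) -> float:
    """Weighted average of section scores for two pre-parsed addresses."""
    total = sum(
        weight * section_score(section, a[section], b[section])
        for section, weight in STREET_WEIGHTS.items()
    )
    return total / sum(STREET_WEIGHTS.values())


first = {"number": "123", "name": "Market", "suffix": "Street", "suite": "100"}
second = {"number": "123", "name": "Market", "suffix": "Drive", "suite": "300"}
score = street_match_score(first, second)
print(score, score > THRESHOLD)  # 70.0 False
```

Because the street name section is scored on a sliding scale rather than as all or nothing, a near-miss in the name lowers the field score in proportion to its 50% weight.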

Fuzzy: ZIP | Exact | Weighted Average | 80 | ZIP codes are broken into sections and compared by those sections. Each section has its own matching method and match score. The section scores are weighted to come up with one score for the field.
  • First 5 digits (Exact, 90% of field's match score)
  • Next 4 digits (Exact, 10% of field's match score)

For example, suppose these two ZIP codes are being compared: 94104-1001 and 94104.

Because only the first 5 digits match, the field has a match score of 90, which is considered a match because it exceeds the threshold of 80.
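
The ZIP comparison can be sketched in Python as shown below. The regular expression, the function names, and the rule that a missing 4-digit section never counts as a match are assumptions chosen to reproduce the score of 90 in the example.

```python
import re

# Accepts 5 digits, optionally followed by a hyphen or en dash and 4 digits.
ZIP_PATTERN = re.compile(r"^(\d{5})(?:[-\u2013](\d{4}))?$")
THRESHOLD = 80


def split_zip(value: str) -> tuple:
    """Split a ZIP or ZIP+4 code into (first 5 digits, next 4 digits or '')."""
    match = ZIP_PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"not a ZIP code: {value!r}")
    return match.group(1), match.group(2) or ""


def zip_match_score(a: str, b: str) -> int:
    first_a, plus4_a = split_zip(a)
    first_b, plus4_b = split_zip(b)
    score = 0
    if first_a == first_b:
        score += 90  # first 5 digits carry 90% of the field score
    # Assumption: a missing 4-digit section is treated as a non-match.
    if plus4_a and plus4_a == plus4_b:
        score += 10  # next 4 digits carry the remaining 10%
    return score


score = zip_match_score("94104-1001", "94104")
print(score, score > THRESHOLD)  # 90 True
```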

Fuzzy: Title | Acronym, Exact, Kullback-Leibler Distance | Maximum | 50 |
