Using Fuzzy Search in AML queries

Updated on 14.11.24

Overview

SEON's AML tools offer a wide range of solutions and configuration options to support KYC and MLRO officers in their fight against financial crime and money laundering. Run quick checks with simple search, or use complex search to home in on the correct person. You can also choose between exact search and fuzzy search, and sift through the data quickly with our relevancy score system.

 

Fuzzy Search Settings

When you access the AML API through an integration, you can customize your fuzzy search settings by including parameters in your API request.

Nest your fuzzy parameters within the config.fuzzy_config parameter. Use the parameters below to tune the kind of results fuzzy search returns.
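As a rough sketch of this nesting, the snippet below assembles a request body in Python and prints it as JSON. It does not send anything. The `user_fullname` field name, the helper name `build_aml_request`, and the placement of the `scoring` object inside `fuzzy_config` are assumptions made for illustration; check the AML API reference for the exact request schema.

```python
import json


def build_aml_request(full_name, *, phonetic=False, edit_distance=True,
                      result_limit=10, score_threshold=0.585):
    """Build a request body with fuzzy settings under config.fuzzy_config.

    Keyword defaults mirror the default settings listed in this article.
    """
    return {
        "user_fullname": full_name,  # hypothetical field name
        "config": {
            "fuzzy_config": {
                "phonetic_search_enabled": phonetic,
                "edit_distance_enabled": edit_distance,
                # Assumed: "scoring.*" parameters live in a nested object.
                "scoring": {
                    "result_limit": result_limit,
                    "score_threshold": score_threshold,
                },
            },
        },
    }


body = build_aml_request("Tetjana Donez", score_threshold=0.7)
print(json.dumps(body, indent=2))
```

Only the settings you want to change need to be passed; the rest fall back to the defaults.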

  • phonetic_search_enabled – Default setting: False. 
    If enabled, this parameter turns on SEON's Phonetic Search module. The tokens entered into the AML API are converted into phonetic representations using the Double Metaphone, Kölner Phonetik, Haase Phonetik, Beider-Morse, and Daitch-Mokotoff algorithms. When enabled, the AML lookup compares only the phonetic representations of the entered name and of the names in the database.
  • edit_distance_enabled – Default setting: True. 
    Edit distance is the number of single-character changes needed to turn one term into another (e.g. mat → bat has an edit distance of 1). When set to True, AML lookups return names similar to the search term entered, e.g. 'Anastasia' matches 'Anastasya'. When you enter a search query, our system compares it to the names in our database. If a name token (a name element) is 7 or more letters long, we allow 1-character edits to find potential matches. If the token is 13 or more letters long, we allow 2-character edits.
    For example, you enter the name "Tetjana Donez", which consists of two tokens. For each token at or above the default length of 7, our system searches for variations within an edit distance of 1 (a single-character change) to find potential matches.
    Here, "Tetjana" (7 letters) is considered a match with "Tetiana" because the two tokens differ by just one character (j → i). "Donez" (5 letters) is below the length threshold, so this rule allows no edits for that token.
  • scoring.result_limit – Default setting: 10.
    Use this parameter to define the maximum number of hits the AML API returns in the result set. The result set is ordered by the source type in which each hit was identified: ['sanction', 'warned_entities/crimelist', 'central_bank/watchlist', 'pep']
  • scoring.score_threshold – Default setting: 0.585.
    Set the relevancy score threshold. The relevancy score is a normalized probability score with possible values between 0 and 1.
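The length-based rule described under edit_distance_enabled can be sketched as follows. This is a minimal illustration, not SEON's implementation: it assumes plain Levenshtein distance, assumes the length of the query token decides how many edits are allowed, and uses hypothetical function names.

```python
def allowed_edits(token: str) -> int:
    """Number of single-character edits allowed for a token of this length."""
    if len(token) >= 13:
        return 2
    if len(token) >= 7:
        return 1
    return 0


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def token_matches(query_token: str, candidate_token: str) -> bool:
    """True if the candidate is within the edits allowed for the query token."""
    distance = levenshtein(query_token.lower(), candidate_token.lower())
    return distance <= allowed_edits(query_token)


print(levenshtein("mat", "bat"))                # 1
print(token_matches("Anastasia", "Anastasya"))  # True
print(token_matches("Tetjana", "Tetiana"))      # True
print(token_matches("Donez", "Donets"))         # False: 5 letters, no edits allowed
```

Under these assumptions, short tokens must match exactly, which keeps common short names from producing large numbers of near-miss hits.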

 

DOB Filter

The Date of Birth (DOB) filter is applied by default during searches to enhance accuracy. Here’s how it works:

  • DOB Discrepancy Filtering: If the DOB in the search query does not match the DOB in the result, the result will be filtered out.
  • Handling Missing DOB Information: If the database lacks DOB information (e.g., when authorities haven’t disclosed it), the result will not be filtered out, as it’s impossible to conclusively rule out a match based solely on name.
  • Customizable Filtering Options: For specific use cases, we can configure the system to filter out all results where the DOB does not match exactly, even if the AML database lacks DOB information. This can be particularly useful in low-risk scenarios, for example when a company operates in a low-risk country and primarily serves domestic users.
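The three rules above can be sketched as a single decision function. This is an illustrative sketch only: the function name, the `strict` flag, and the choice to compare full dates for equality are assumptions, and the behavior when the query itself carries no DOB is not described in this article, so the sketch simply keeps the result in that case.

```python
from datetime import date
from typing import Optional


def passes_dob_filter(query_dob: Optional[date],
                      record_dob: Optional[date],
                      strict: bool = False) -> bool:
    """Return True if a result should be kept, False if it is filtered out."""
    if query_dob is None:
        # Assumption: with no DOB in the query there is nothing to compare.
        return True
    if record_dob is None:
        # Default: keep the result, since a match cannot be ruled out.
        # Strict (custom) mode: drop results that lack DOB information.
        return not strict
    # Both dates known: keep only when they match.
    return query_dob == record_dob


query = date(1980, 5, 17)
print(passes_dob_filter(query, date(1980, 5, 17)))  # True
print(passes_dob_filter(query, date(1975, 1, 1)))   # False
print(passes_dob_filter(query, None))               # True
print(passes_dob_filter(query, None, strict=True))  # False
```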

 

DOB Estimation

The DOB Estimation feature is a back-end enhancement to our database designed to reduce false positives in AML screening by filtering out irrelevant results, even when an exact date of birth (DOB) isn’t available. 

This improvement helps bridge data gaps commonly seen in OSINT (Open Source Intelligence) and AML data sources, where providers often have only partial DOB coverage. By using an estimation approach, we’ve effectively increased DOB coverage, enhancing match precision and reliability.

Key Features

  • Reducing False Positives: Decreases irrelevant matches by estimating DOB ranges, making it easier to focus on genuine matches.
  • Back-End Integration: This feature works entirely behind the scenes, requiring no input or configuration from the user.
  • Improved Compliance Accuracy: Enhances the screening process by effectively filling DOB gaps, which are common in partial data from AML and OSINT sources.

How DOB Estimation Works

  • Back-End Life Event Analysis: The system uses available life event data (e.g., employment, education milestones) to infer an estimated age range in cases where the exact DOB is not available.
  • Increased DOB Coverage: By filling in the DOB gaps through estimation, we’re increasing our database coverage—meaning more complete profiles.
  • Sharper Relevance Filtering: With estimated age ranges, results are now filtered more precisely, helping you focus on what matters and skip over irrelevant matches.
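To make the idea of an estimated age range concrete, here is a purely hypothetical sketch. The article does not describe the actual estimation logic, so the function names, the age bounds, and the year-level comparison are all assumptions chosen for illustration.

```python
from typing import Optional, Tuple

YearRange = Tuple[int, int]


def estimate_birth_year_range(event_year: int,
                              min_age: int,
                              max_age: int) -> YearRange:
    """Infer (earliest, latest) birth year from one life event.

    Example assumption: a person finishing university is between
    min_age and max_age years old in event_year.
    """
    return (event_year - max_age, event_year - min_age)


def within_estimated_range(query_birth_year: int,
                           estimated: Optional[YearRange]) -> bool:
    """Keep a result unless the query birth year falls outside the estimate."""
    if estimated is None:
        return True  # no estimate available, so nothing can be ruled out
    earliest, latest = estimated
    return earliest <= query_birth_year <= latest


# Hypothetical record: graduated in 2005, assumed to be 20 to 30 at the time.
estimated = estimate_birth_year_range(2005, min_age=20, max_age=30)
print(estimated)                                # (1975, 1985)
print(within_estimated_range(1980, estimated))  # True
print(within_estimated_range(1960, estimated))  # False
```

A range check like this lets a record without a published DOB still be excluded when the searched person is clearly too old or too young to be the same individual.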

User Impact

  • Reduction in False Positives: Users will notice a decrease in false positives, as the system can now filter more precisely, even without complete DOB data.
  • Streamlined Screening Experience: Results will be more relevant, reducing time spent on manually dismissing non-relevant matches.

FAQ

Will I need to configure anything to benefit from DOB Estimation?

No configuration is required. DOB Estimation is implemented on the back end and is automatically applied to your searches.

How will DOB Estimation affect my search results?

DOB Estimation enhances the relevance of search results by reducing false positives, especially where DOB data is not available or incomplete, resulting in more accurate matches and a smoother screening process.

The DOB Estimation feature is designed to address common data limitations in AML and OSINT sources, enhancing our database’s DOB coverage and delivering improved compliance outcomes by reducing false positives in user screenings.
 

What writing systems (languages) does fuzzy search support?

The default language within the SEON system is English. Our more advanced search tools, including the fuzzy search engine, are available only in English.
