Using Fuzzy Search in AML queries

Updated on 26.05.23

Overview

SEON's AML tools offer a wide range of solutions and configuration options to support KYC officers in their fight against financial crime and money laundering. Run quick checks with simple search, or use complex search to home in on the correct person. You can also choose between exact search and fuzzy search, and sift through the data quickly with our relevancy score system.

 

Fuzzy Search Settings

When you integrate with the AML API, you can customize your fuzzy search settings by including parameters in your API request.

Include your fuzzy parameters nested within the config.fuzzy_config parameter. You can use the parameters below to tweak what kind of results fuzzy search returns.
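As a rough illustration of that nesting, the sketch below builds a request body in Python. Only the config.fuzzy_config path, the parameter names, and the default values come from this article. The top-level user_fullname field and the choice to represent the dotted scoring.* names as a nested scoring object are assumptions, so check them against the AML API reference before relying on them.

```python
import json


def build_request_body(full_name: str) -> dict:
    """Build an illustrative AML request body with fuzzy search settings."""
    return {
        # Assumed field name for the searched name; not confirmed by this article.
        "user_fullname": full_name,
        "config": {
            "fuzzy_config": {
                "phonetic_search_enabled": False,  # documented default
                "edit_distance_enabled": True,     # documented default
                # Assumption: dotted names such as scoring.result_limit
                # denote keys inside a nested "scoring" object.
                "scoring": {
                    "result_limit": 10,
                    "score_threshold": 0.585,
                    "min_nr_token_match": 67,
                },
            }
        },
    }


body = build_request_body("Tetjana Donez")
print(json.dumps(body, indent=2))
```

The values shown are the documented defaults, so sending this body should behave the same as omitting fuzzy_config, if the assumed structure is right.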

  • phonetic_search_enabled – Default setting: False. 
    If enabled, this parameter turns on SEON's Phonetic Search module. This means that the tokens entered into the AML API are converted into a phonetic representation using the double metaphone, koelnerphonetik, haasephonetik, beider-morse, and daitch-mokotoff algorithms. When enabled, the AML lookup uses only these phonetic representations of the entered name and of the names in the database.
  • edit_distance_enabled – Default setting: True. 
    Edit distance is the number of single-character changes needed to turn one term into another (e.g. mat » bat has an edit distance of 1). When set to True, AML lookups return names similar to the search term entered, e.g. 'Anastasia' matches 'Anastasya'. When you enter a search query, our system compares it to the names in our database. If a name token (a name element) is 7 or more letters long, we allow 1-character edits to find potential matches. If the token is 13 or more letters long, we allow 2-character edits to find potential matches.
    For example, you enter the name "Tetjana Donez", which consists of two tokens. With the default minimum length of 7, our system searches for variations within 1 edit distance (a single-character change) for each token at or above that length.
    Here, the 7-letter token "Tetjana" is one character away from "Tetiana", so "Tetjana Donez" can be considered a potential match with "Tetiana Donets".
  • scoring.result_limit – Default setting: 10.
    Use this parameter to define the maximum number of hits the AML API should return in the result set. The result set is ordered by the source type in which hits are identified: ['sanction', 'warned_entities/crimelist', 'central_bank/watchlist', 'pep']
  • scoring.score_threshold – Default setting: 0.585.
    Set the relevancy score threshold. The relevancy score is a normalized probability score with possible values between 0 and 1.
  • scoring.min_nr_token_match – Default setting: 67.
    Set the percentage of tokens in results that must match your search query. 100% means all tokens should match, while a setting of 67% means the search name consisting of 3 tokens 'Alexander Gahon Gesmundo' would match 'Alexander Gesmundo' (2 of 3 tokens).
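The length-based edit distance rule and the token match percentage described above can be sketched in a few lines of Python. This is an illustration of the documented behavior, not SEON's implementation: the function names are hypothetical, the Levenshtein distance is used as the edit distance, and rounding the matched percentage to the nearest whole number is an assumption made so that 2 of 3 tokens counts as 67%.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def allowed_edits(token: str) -> int:
    """Documented thresholds: 7+ letters allow 1 edit, 13+ letters allow 2."""
    if len(token) >= 13:
        return 2
    if len(token) >= 7:
        return 1
    return 0


def tokens_match(query_token: str, candidate_token: str) -> bool:
    q, c = query_token.lower(), candidate_token.lower()
    return levenshtein(q, c) <= allowed_edits(q)


def name_matches(query: str, candidate: str, min_pct: int = 67) -> bool:
    """True if enough query tokens have a match among the candidate tokens."""
    query_tokens = query.split()
    candidate_tokens = candidate.split()
    if not query_tokens:
        return False
    matched = sum(
        any(tokens_match(q, c) for c in candidate_tokens) for q in query_tokens
    )
    # Assumption: the percentage is rounded to the nearest whole number.
    return round(100 * matched / len(query_tokens)) >= min_pct


print(levenshtein("mat", "bat"))
print(tokens_match("Anastasia", "Anastasya"))
print(name_matches("Alexander Gahon Gesmundo", "Alexander Gesmundo"))
```

Under these assumptions, raising min_pct to 100 makes the 'Alexander Gahon Gesmundo' query stop matching 'Alexander Gesmundo', because 'Gahon' has no counterpart in the candidate name.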
     

What writing systems (languages) does fuzzy search support?

The default language within the SEON system is English. Our more advanced search tools are only available in English, including the fuzzy search engine.

We also support other languages, but only exact search is available for them. Even so, the exact search engine provides robust text processing capabilities to handle various types of text variations and complexities, including ASCII folding for non-ASCII characters, hyphenation and punctuation differences, out-of-order name matching, missing name components, and casing differences. These features allow our search engine to deliver more accurate and relevant search results, even when dealing with challenging text inputs that would otherwise cause errors or miss relevant matches.
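To make a few of these normalization steps concrete, here is a minimal Python sketch of ASCII folding, case folding, punctuation handling, and order-insensitive comparison. It is not SEON's implementation; the function names are hypothetical, and the simple folding used here only handles accented Latin characters (it drops characters with no ASCII decomposition and does not cover missing name components).

```python
import re
import unicodedata


def fold(text: str) -> str:
    """ASCII-fold accented Latin characters and normalize casing."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Dropping non-ASCII bytes removes the combining accents left by NFKD.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return ascii_only.casefold()


def name_key(name: str) -> tuple:
    """Split on hyphens and punctuation, then sort tokens to ignore order."""
    tokens = re.split(r"[^a-z0-9]+", fold(name))
    return tuple(sorted(t for t in tokens if t))


def same_name(a: str, b: str) -> bool:
    return name_key(a) == name_key(b)


print(name_key("José Martínez-López"))
print(same_name("José Martínez-López", "LOPEZ, Jose Martinez"))
```

Sorting the tokens is one simple way to make the comparison independent of name order; a production engine would more likely use analyzers and scoring than a strict equality check.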

Our exact search engine provides support for the following languages:

Afrikaans
Albanian
Amharic
Arabic
Armenian
Assamese
Azerbaijani (Latin)
Basque
Belarusian
Bengali
Bosnian
Bulgarian
Burmese
Catalan
Chinese (Hans)
Chinese (Hant)
Croatian
Czech
Danish
Dutch
English
Estonian
Filipino (Latin)
Finnish
French
Galician
Ganda
Georgian
German
Greek
Gujarati
Hausa
Hebrew
Hindi
Hungarian
Icelandic
Igbo
Indonesian
Italian
Japanese
Kannada
Kazakh
Khmer
Kinyarwanda
Konkani
Korean
Lao
Latvian
Lithuanian
Macedonian
Malay
Malayalam
Maltese
Marathi
Mongolian (Cyrillic)
Nepali
Norwegian Bokmål
Norwegian Nynorsk
Oriya
Oromo
Polish
Portuguese
Punjabi
Romanian
Russian
Serbian (Latin)
Serbian (Cyrillic)
Sinhala
Slovak
Slovenian
Spanish
Swahili
Swedish
Tamil
Telugu
Thai
Turkish
Ukrainian
Urdu
Uzbek
Vietnamese
Welsh
Yoruba (Latin)
Zulu

 
