Network Scores in Advanced Digital Footprinting

Updated on 10.02.25


SEON Blackbox Machine Learning

SEON offers ready-to-use blackbox machine learning models from day one, eliminating the need to wait for data accumulation while fraudulent events might go undetected. Our Advanced Digital Footprint Machine Learning Network Scores leverage pretrained “base models” to instantly calculate email and phone network scores as soon as you start using SEON.

These base models are developed using sanitized, cross-customer data from SEON’s proprietary consortium dataset, designed to maximize predictive accuracy. They serve as a starting point, enabling immediate fraud risk assessment of email addresses and phone numbers.

As you integrate SEON into your workflows by configuring decisioning rules and feeding back verification labels, SEON’s ML evolves to create a bespoke model tailored to your data. This customized model offers enhanced predictive accuracy and is tuned to the patterns in your business. The customer-specific model replaces the base model as soon as it is available, and it usually outperforms the base model.

Timeline milestones: Day 1; SEON decisioning and labeling implemented; 1,000 transactions collected, with 100 declines and 100 approvals; model management processes assess performance; the best-performing model goes into production.

Model status along the timeline: base models used; transaction data and verification labels flow in; customer-specific model trained; continuous evolution of models; customer-specific model usually used.
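
The switch from base model to customer-specific model described above can be sketched as a simple readiness check. This is only an illustration: the function and field names are hypothetical, and treating 1,000 transactions, 100 declines, and 100 approvals as minimum counts is an assumption read off the timeline, not a documented SEON rule.

```python
from dataclasses import dataclass

# Assumed minimums taken from the timeline; not confirmed SEON constants.
MIN_TRANSACTIONS = 1000
MIN_DECLINED = 100
MIN_APPROVED = 100


@dataclass
class LabelCounts:
    """Hypothetical summary of the labeled data collected so far."""
    total: int
    declined: int
    approved: int


def ready_for_custom_model(counts: LabelCounts) -> bool:
    """True once enough labeled transactions exist to train a bespoke model."""
    return (
        counts.total >= MIN_TRANSACTIONS
        and counts.declined >= MIN_DECLINED
        and counts.approved >= MIN_APPROVED
    )


def model_in_use(counts: LabelCounts, custom_model_performs_best: bool) -> str:
    """Pick the model type: the bespoke model only if it exists and wins the assessment."""
    if ready_for_custom_model(counts) and custom_model_performs_best:
        return "customer-specific"
    return "base"


print(model_in_use(LabelCounts(total=50, declined=3, approved=40), False))
print(model_in_use(LabelCounts(total=1200, declined=150, approved=900), True))
```

The second argument stands in for the performance assessment step: even with enough data, the base model stays in use if the customer-specific model does not perform best.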

This document explains the foundation of these base models and the factors that drive the scores they generate.

Did you know: In machine learning, words like recall, precision, and accuracy have mathematical definitions that may differ from, or be more specific than, the everyday meanings of these words. To learn more, see https://developers.google.com/machine-learning/crash-course/classification .

The data used: Sanitized, relevant, representative

We utilized sanitized data from our top customers, which included verified fraudulent transactions labeled accordingly. To ensure relevance, we focused on transaction data from August to November 2024. The dataset was curated to reflect the top-tier customer base using SEON, considering factors such as user and phone geographies, email domains, and phone carriers. The final sample included 1.5 million transactions for email-based models and 1 million transactions for phone-based models.

The Email Network Score Base Model

The factors (referred to as features) that influence the network score for email addresses fall into several categories. These include data provided directly by customers to our API (e.g., email domain), information enriched by SEON’s capabilities (e.g., total registrations), insights derived from SEON’s consortium data (e.g., hits), and calculated metrics designed to capture key fraud patterns (e.g., vowel ratio in the email). The table below lists a sample of the features that contribute to a higher network score.

Category	Important feature examples
Consortium data	The number of SEON customers that have seen the email and flagged it as fraudulent.
Consortium data	The number of customers that currently have, or previously had, the email on their blacklist.
Email characteristics	Whether the email username is likely gibberish.
Email characteristics	Whether the email address is deliverable.
Email characteristics	The number of data breaches in which the email was seen.
Social Media registration pattern	Number of social media registrations with the email in total.
Social Media registration pattern	Number of social media registrations by personal and business types.
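
To make the idea of a calculated metric concrete, here is a minimal sketch of a vowel ratio computed on the username part of an email address. The exact definition SEON uses is not documented here, so the function below (its name, the vowel set, and the choice to ignore digits and punctuation) is an assumption for illustration only.

```python
VOWELS = set("aeiou")


def vowel_ratio(email: str) -> float:
    """Share of vowels among the letters in the part of the email before '@'."""
    local_part = email.split("@", 1)[0].lower()
    letters = [ch for ch in local_part if ch.isalpha()]
    if not letters:
        return 0.0  # no letters at all, e.g. a purely numeric username
    return sum(ch in VOWELS for ch in letters) / len(letters)


for address in ("anna.smith@example.com", "xkqzrtpl@example.com"):
    print(address, round(vowel_ratio(address), 2))
```

A username made up of random consonants yields a ratio near zero, which is the kind of pattern a gibberish-detection feature could pick up.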

The Email Network Score Behavior and Usage Suggestions

Our model training process has achieved a predictive performance deemed ready for deployment. The AUC (Area Under the Curve) value of 0.94 demonstrates the model’s exceptional ability to detect fraud, where AUC = 0.5 represents random guessing (akin to flipping a coin). An AUC of 0.94 indicates that, with the right decision threshold, the model is highly effective at distinguishing between fraudulent and non-fraudulent activities.
The choice of an appropriate threshold for making decisions based on the Network Score depends on your specific use case and risk tolerance. Since the Network Score is a probability metric, any value above 0.5 suggests a likely fraudulent email address. For businesses aiming to minimize false positives, we recommend using a threshold of 0.85 or higher.
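
As a rough sketch of how the two thresholds mentioned above could be applied to a score, consider the following. The function name and labels are hypothetical, and treating a score exactly equal to the threshold as not flagged is an assumption.

```python
def classify(score: float, threshold: float = 0.5) -> str:
    """Label a network score as likely fraud when it is above the threshold."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("network score is expected to be between 0 and 1")
    return "likely_fraud" if score > threshold else "likely_legitimate"


for score in (0.30, 0.62, 0.91):
    print(score, classify(score, threshold=0.5), classify(score, threshold=0.85))
```

A score of 0.62 is flagged at the 0.5 threshold but not at 0.85, which illustrates how a stricter threshold reduces the number of flagged emails.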

The table below summarizes the model’s performance metrics — Precision, Recall, and Accuracy — at thresholds of 0.5 and 0.85. Higher values for these metrics generally indicate better performance. However, keep in mind that Precision and Recall often trade off against one another as the threshold shifts from the balanced 0.5 point. Precision stands for the ratio of transactions classified correctly as fraudulent compared to all transactions the model predicted to be fraudulent. Recall stands for the ratio of transactions correctly predicted to be fraudulent compared to all verified fraudulent transactions. Accuracy is the proportion of all predictions that were correct, whether positive or negative.

Network Score Threshold	Precision	Recall	Accuracy
0.5	0.6846	0.6654	0.9359
0.85	0.9727	0.1295	0.9126
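
The definitions of Precision, Recall, and Accuracy given above can be written out as a short calculation over the four possible outcomes of a prediction. The counts in the example are made up for illustration and are not SEON data.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute precision, recall and accuracy from confusion-matrix counts.

    tp: fraud correctly flagged        fp: legitimate wrongly flagged
    fn: fraud missed                   tn: legitimate correctly passed
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one prediction is required")
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total
    return {"precision": precision, "recall": recall, "accuracy": accuracy}


# Made-up example: 1,000 transactions, 120 of them verified fraud.
print(classification_metrics(tp=80, fp=20, fn=40, tn=860))
```

In this made-up example, 80 of the 100 flagged transactions are fraud (precision 0.8) and 80 of the 120 fraud cases are caught (recall of about 0.67), while accuracy is 0.94 because most transactions are legitimate and correctly passed.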

The email network score is calculated from the latest data available at the time of the request: social signals, email characteristics, and consortium data. If these signals change, the network score changes with them to stay relevant. Social signals in particular can differ from one lookup to the next, mainly because some checks need more time to deliver results. As more data becomes available, the score may change, and it becomes more comprehensive as more information is revealed about the email address. It is therefore important to set timeout thresholds that fit your business while still allowing the response to include as much data as possible.

The Phone Network Score Base Model

The factors that significantly influence the network score for phone numbers include data enriched by SEON’s capabilities (e.g., total registrations), insights derived from SEON’s consortium data (e.g., hits), and calculated metrics designed to detect key fraud patterns (e.g., whether the original carrier matches the provider carrier). The table below ranks these three categories of factors by their importance in increasing the network score.

Category	Important factors
Consortium data	The number of times SEON has seen the phone number and flagged it as fraudulent.
Consortium data	The number of customers that have seen the phone number and flagged it as fraudulent.
Phone number characteristics	Phone number registration country characteristics.
Phone number characteristics	Mobile phone service lookup characteristics.
Phone number characteristics	Phone carrier characteristics.
Social Media registration pattern	Number of social media registrations by personal and technology types.
Social Media registration pattern	Number of social media registrations with the phone in total.

The Phone Network Score Behavior and Usage Suggestions

Our model training process has achieved a predictive performance deemed ready for deployment. The AUC (Area Under the Curve) value of 0.79 indicates the model’s strong ability to detect fraud, whereas an AUC of 0.5 represents random guessing (similar to tossing a coin). An AUC of 0.79 suggests that, with the right decision threshold, the model is effective at distinguishing between fraudulent and non-fraudulent activities.
The optimal threshold for decision-making based on the Network Score depends on your specific use case and risk tolerance. Since the Network Score is a probability metric, any value above 0.5 suggests a likely fraudulent phone number. To minimize false positives, we recommend setting the threshold at 0.90 or higher.
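
One way to build intuition for AUC is its pairwise reading: it equals the probability that a randomly chosen fraudulent case receives a higher score than a randomly chosen legitimate one. The sketch below computes AUC that way on a handful of made-up scores. It is an illustration of the metric, not the procedure SEON uses to evaluate its models.

```python
def pairwise_auc(fraud_scores: list, legit_scores: list) -> float:
    """AUC as the share of (fraud, legit) pairs where fraud scores higher.

    Ties count as half a win, matching the usual definition.
    """
    if not fraud_scores or not legit_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for f in fraud_scores:
        for l in legit_scores:
            if f > l:
                wins += 1.0
            elif f == l:
                wins += 0.5
    return wins / (len(fraud_scores) * len(legit_scores))


# Made-up scores for illustration only.
fraud = [0.9, 0.7, 0.4]
legit = [0.1, 0.3, 0.5, 0.2]
print(round(pairwise_auc(fraud, legit), 4))
```

Here 11 of the 12 possible pairs are ranked correctly. A model that scored at random would get about half of the pairs right, which is where the 0.5 baseline comes from.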

The table below presents the model’s performance metrics—Precision, Recall, and Accuracy—at thresholds of 0.5 and 0.90. Higher values for these metrics generally indicate better performance. However, keep in mind that Precision and Recall often trade off against one another as the threshold shifts from the balanced 0.5 point. Precision stands for the ratio of transactions classified correctly as fraudulent compared to all transactions the model predicted to be fraudulent. Recall stands for the ratio of transactions correctly predicted to be fraudulent compared to all verified fraudulent transactions. Accuracy is the proportion of all predictions that were correct, whether positive or negative.

Network Score Threshold	Precision	Recall	Accuracy
0.50	0.7408	0.2373	0.9228
0.90	0.9480	0.0665	0.9144
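
To read the Precision and Recall figures above in operational terms, the following sketch converts them into expected outcomes per 1,000 verified fraudulent transactions. The function name is hypothetical, and the figure of 1,000 fraud cases is just a convenient scale for the arithmetic.

```python
def outcomes_per_fraud_cases(precision: float, recall: float, fraud_cases: int = 1000) -> dict:
    """Translate precision and recall into caught, missed and false-alarm counts."""
    if not 0.0 < precision <= 1.0 or not 0.0 <= recall <= 1.0:
        raise ValueError("precision must be in (0, 1] and recall in [0, 1]")
    caught = recall * fraud_cases    # fraud cases that are flagged
    flagged = caught / precision     # everything flagged, fraud or not
    return {
        "caught": caught,
        "missed": fraud_cases - caught,
        "false_alarms": flagged - caught,
    }


# Phone base model figures from the table above.
for threshold, precision, recall in ((0.50, 0.7408, 0.2373), (0.90, 0.9480, 0.0665)):
    result = outcomes_per_fraud_cases(precision, recall)
    print(threshold, {name: round(value, 1) for name, value in result.items()})
```

Raising the threshold from 0.50 to 0.90 cuts false alarms from roughly 83 to roughly 4 per 1,000 fraud cases, but the number of fraud cases caught also drops from about 237 to about 67.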

The phone network score is calculated from the latest data available at the time of the request: social signals, phone number characteristics, and consortium data. If these signals change, the network score changes with them to stay relevant. Social signals in particular can differ from one lookup to the next, mainly because some checks need more time to deliver results. As more data becomes available, the score may change, and it becomes more comprehensive as more information is revealed about the phone number. It is therefore important to set timeout thresholds that fit your business while still allowing the response to include as much data as possible.

Learn more

How SEON harnesses the power, speed and accuracy of machine learning and what you can do to get the best results.

Overview: Read about blackbox machine learning in SEON.
Read about whitebox machine learning in SEON.
Get the best results by setting up feedback loops.
Migration guide to get started with Advanced Digital Footprinting (ADF).