PII Scanner

Identify direct identifiers (such as PII and PHI) automatically with our AI-powered PII Scanner.

About PII scanner

Automatically detect and manage personally identifiable information (PII) in your data using Syntho’s AI-powered PII Scanner.

PII column scanner

Reduce manual work: use our PII column scanner to identify the columns in your database that contain direct identifiers (PII/PHI) with the power of AI.

PII text scanner

Reduce manual work: use our PII text scanner to identify direct identifiers (PII/PHI) inside free-text fields with the power of AI.

Introducing the PII column scanner

Use our AI-powered PII scanner to automatically detect columns and instances in your database that contain direct identifiers, such as Personally Identifiable Information (PII) and Protected Health Information (PHI).

Once columns containing PII are recognized, our platform supports de-identification methods such as deletion, substitution with mock data, or synthesis, which strengthens database privacy.

Key benefits of the AI-powered PII column scanner

Efficiently identify PII automatically with AI

Enhanced data privacy

The AI-powered PII Scanner enhances data privacy by automatically identifying and flagging personally identifiable information (PII) within datasets.

Compliance with regulations

By swiftly and accurately identifying PII elements, the AI-powered scanner helps organizations adhere to regulatory requirements such as GDPR, HIPAA, and CCPA.

Time and cost savings

Automating the process of PII detection significantly reduces the time and resources required for manual data inspection, enabling organizations to achieve compliance more efficiently and cost-effectively. 

How to use the PII column scanner

How can I use the PII scanner?

You can configure the PII Scanner in our user interface on the “PII” tab. The scanner offers two scan options:

  1. Shallow scan (metadata only, including column names).
  2. Deep scan (metadata and the data itself).

All columns that are identified as PII are shown in the list of PII entities on the PII tab and are labeled PII in the column header on the Job Settings tab.
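
As a rough illustration of the difference between the two scan options, the Python sketch below flags columns by name only (shallow) or by name plus value patterns (deep). It is not Syntho's implementation, which is AI-powered; the keyword list, regular expressions, threshold, and function names are all assumptions made for this example.

```python
import re

# Hypothetical hints and patterns, chosen for illustration only.
NAME_HINTS = ("name", "ssn", "phone", "email", "birth")
VALUE_PATTERNS = {
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}


def shallow_scan(columns):
    """Flag columns using metadata only (here: the column name)."""
    return {col for col in columns if any(h in col.lower() for h in NAME_HINTS)}


def deep_scan(table, threshold=0.8):
    """Flag columns using the metadata plus the values themselves."""
    flagged = shallow_scan(table.keys())
    for col, values in table.items():
        non_empty = [v for v in values if v]
        if not non_empty:
            continue
        for pattern in VALUE_PATTERNS.values():
            hits = sum(1 for v in non_empty if pattern.fullmatch(v))
            # Flag the column when most values look like one identifier type.
            if hits / len(non_empty) >= threshold:
                flagged.add(col)
                break
    return flagged


table = {
    "first_name": ["John", "Jake"],
    "field_7": ["123-45-6789", "987-65-4321"],  # uninformative column name
    "loan_amount": ["10000", "23340"],
}
shallow = shallow_scan(table.keys())
deep = deep_scan(table)
```

In this toy table the shallow scan finds only `first_name`, while the deep scan also flags `field_7`, because its values look like social security numbers even though its name gives nothing away.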

Apply mockers on PII automatically

In addition to identifying PII automatically, our platform suggests a suitable mocker for each PII entity, saving users time and effort. With this feature, sensitive original PII is replaced by representative mock data, and our consistent mapping functionality preserves referential integrity across multi-table databases.
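
To make the idea of consistent mapping concrete, here is a minimal Python sketch in which the same original value always receives the same mock value, so that rows in different tables still join after mocking. The class name, helper function, mock value format, and sample tables are hypothetical and do not reflect how the Syntho platform is built.

```python
class ConsistentMapper:
    """Maps each original value to one mock value and reuses it afterwards."""

    def __init__(self):
        self._mapping = {}

    def mock(self, original):
        if original not in self._mapping:
            serial = len(self._mapping) + 1
            # Assumed mock format: SSN-shaped values with a running serial.
            self._mapping[original] = f"000-00-{serial:04d}"
        return self._mapping[original]


def mock_column(rows, column, mapper):
    """Return copies of the rows with one column replaced by mock values."""
    return [{**row, column: mapper.mock(row[column])} for row in rows]


customers = [
    {"ssn": "123-45-6789", "employer": "Syntho"},
    {"ssn": "987-65-4321", "employer": "XYZ Company"},
]
loans = [
    {"ssn": "987-65-4321", "amount": 23340},
    {"ssn": "123-45-6789", "amount": 10000},
]

# One shared mapper keeps the key column aligned across both tables.
mapper = ConsistentMapper()
mock_customers = mock_column(customers, "ssn", mapper)
mock_loans = mock_column(loans, "ssn", mapper)
```

Because both tables go through the same mapper, each loan still points at the customer it belonged to, although no original social security number remains in either table.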

Can I also identify PII manually?

Yes. Users can identify PII entities manually as an alternative to the PII scanner, and can apply mockers manually instead of accepting the automatically suggested ones. However, we optimized our platform so that AI does the work for you, reducing manual work and making it possible to process large data volumes quickly.

User documentation

Explore the Syntho user documentation

Introducing the PII text scanner

Run the PII text scanner

Use the PII Text Scanner to analyze your data in depth and detect personally identifiable information (PII) within free-text fields.

Use our Synthetic Mock Data techniques to replace PII with mockers.

How do I use the PII text scanner?

01
Remove the identified PII

The Syntho Engine allows users to remove identified PII. When PII is removed, it is either deleted from the data or replaced by a default value, such as “*” or “PII”. This allows organizations to protect sensitive information.

Example: Original text with identified PII:

“John Smith applied for a loan of $10,000 with our bank on 03/01/2023. He provided his social security number 123-45-6789, date of birth 01/01/1985, and employment information Syntho. Our loan officer conducted a credit check and verified his information. Based on his credit score and employment history, he was approved for the loan and the funds were deposited into his account on 03/03/2023.”

02
Replace identified PII with the entity

The Syntho Engine allows users to replace identified PII with the entity. An entity is a placeholder that replaces specific identifying information to protect an individual’s privacy.

When PII is replaced with entities, sensitive information such as names or dates are replaced with placeholders such as “<name>” or “<date>”. This allows organizations to maintain the structure of their data, while protecting the sensitive information contained within it.
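
The entity replacement described above can be sketched in a few lines of Python. This is only an approximation under stated assumptions: it uses simple regular expressions for well-structured identifiers (SSN, date, amount), whereas the PII text scanner is AI-powered and also detects entities such as names, which fixed patterns cannot reliably capture. The pattern list and function name are hypothetical.

```python
import re

# Hypothetical patterns for structured identifiers; applied in this order.
ENTITY_PATTERNS = [
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("date", re.compile(r"\b\d{2}/\d{2}/\d{4}\b")),
    ("Amount", re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?")),
]


def replace_with_entities(text):
    """Replace every pattern match with its entity placeholder."""
    for label, pattern in ENTITY_PATTERNS:
        text = pattern.sub(f"<{label}>", text)
    return text


sample = (
    "He applied for a loan of $10,000 on 03/01/2023 and provided "
    "his social security number 123-45-6789."
)
result = replace_with_entities(sample)
```

The sentence keeps its structure, while each matched value is swapped for a placeholder that names the kind of information that was there.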

Text with PII replaced by entities:

“<Name> applied for a loan of <Amount> with our bank on <date>. They provided their social security number <SSN>, date of birth <DOB>, and employment information <Employer>. Our loan officer conducted a credit check and verified their information. Based on their credit score and employment history, they were approved for the loan and the funds were deposited into their account on <date>.”

03
Replace identified PII with a mocker

Similarly, the Syntho Engine allows users to replace identified PII with a mocker. A mocker is a fictitious value that is used to replace specific identifying information. Mocker values are not real and have no relation to the original data, but they maintain the same format as the original data to preserve its usability.

When PII is replaced with mockers, sensitive information is replaced with fictitious data. For example, names and dates are replaced with mock names such as “Wim Kees” or mock dates such as “03-05-2023”. This allows organizations to maintain the structure of their data and to have a representative mock value as an alternative, while protecting the sensitive information contained within it.

Text with PII replaced by mockers:

“Jake Brown applied for a loan of $23,340 with our bank on 12/11/2023. They provided their social security number 987-65-4321, date of birth 04/02/1989, and employment information XYZ Company. Our loan officer conducted a credit check and verified their information. Based on their credit score and employment history, they were approved for the loan and the funds were deposited into their account on 13/12/2023.”
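
A mocker that keeps the original format can be sketched as follows. The code is an illustrative assumption, not Syntho's mocker implementation: the two regular expressions, the generator functions, the year range, and the fixed seed are all made up for this example, and only SSN-shaped and date-shaped values are handled.

```python
import random
import re
from datetime import date, timedelta

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
DATE_RE = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")


def mock_ssn(rng):
    """Generate a random value with the same NNN-NN-NNNN layout."""
    return f"{rng.randrange(1000):03d}-{rng.randrange(100):02d}-{rng.randrange(10000):04d}"


def mock_date(rng):
    """Generate a random valid calendar date in 2023, formatted MM/DD/YYYY."""
    day = date(2023, 1, 1) + timedelta(days=rng.randrange(365))
    return day.strftime("%m/%d/%Y")


def replace_with_mockers(text, seed=0):
    """Swap detected values for fictitious ones that keep the same format."""
    rng = random.Random(seed)  # seeded so that a run is reproducible
    text = SSN_RE.sub(lambda match: mock_ssn(rng), text)
    text = DATE_RE.sub(lambda match: mock_date(rng), text)
    return text


sample = "SSN 123-45-6789, approved on 03/03/2023."
mocked = replace_with_mockers(sample)
```

The mock values are unrelated to the originals, but downstream code that expects an SSN-shaped string or a parseable date keeps working.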

Text with PII removed and replaced by the default value “PII” (option 01):

“PII applied for a loan of PII with our bank on PII. They provided their social security number PII, date of birth PII, and employment information PII. Our loan officer conducted a credit check and verified their information. Based on their credit score and employment history, they were approved for the loan and the funds were deposited into their account on PII.”
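
Once a scanner has reported where the PII sits in a text, removal itself is a small step. The sketch below assumes the detection result is a list of character offsets, which is a hypothetical representation chosen for this example; the function name and sample sentence are also illustrative.

```python
def remove_pii(text, spans, default=None):
    """Delete each (start, end) span, or replace it with a default value."""
    replacement = "" if default is None else default
    # Work from the end of the text so earlier offsets stay valid.
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + replacement + text[end:]
    return text


text = "John Smith applied on 03/01/2023."
# Stand-in for scanner output: offsets of the two identified values.
spans = [
    (text.index(value), text.index(value) + len(value))
    for value in ("John Smith", "03/01/2023")
]

masked = remove_pii(text, spans, default="PII")
deleted = remove_pii(text, spans)
```

With a default value the sentence stays readable ("PII applied on PII."), whereas plain deletion leaves gaps, which is worth considering when the text is processed further.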

Other features from Syntho

Explore other features that we provide

Test Data Management

  • De-Identification & Synthetization

    Comprehensive testing with representative data.

  • Rule-Based Synthetic Data

    Simulate real-world scenarios.

  • Subsetting

    Create manageable data subsets.

Data Masking

  • PII Scanner

    Identify PII automatically with our AI-powered PII Scanner.

  • Synthetic Mock Data

    Substitute sensitive PII, PHI, and other identifiers.

  • Consistent Mapping

    Preserve referential integrity in an entire relational data ecosystem.

AI Generated Synthetic Data

  • Quality Assurance Report

    Assess generated synthetic data on accuracy, privacy, and speed.

  • Time Series Synthetic Data

    Synthesize time-series data accurately with Syntho.

  • Upsampling

    Increase the number of data samples in a dataset.

Mimic (sensitive) data with AI to generate synthetic data twins

Frequently Asked Questions

Why do organizations use mockers?

PII, PHI, and other direct identifiers are sensitive. They can be identified manually, or automatically with our PII scanner to save time and reduce manual work. Mockers can then substitute real values with mock values to de-identify the data and enhance privacy.

What are examples of PII, PHI, and identifiers?

  • First name
  • Last name
  • Phone number
  • Social Security number (SSN)
  • Bank number, etc.

What are PII, PHI, and identifiers?

PII stands for Personally Identifiable Information. PHI stands for Protected Health Information and is an extension of PII dedicated to health information. Both PII and PHI are identifiers: information that can be used to distinguish or trace an individual’s identity directly. A direct identifier is a trait that belongs to only one person.

What is Test Data Management?

Test data management (TDM) is the process of creating, maintaining, and controlling the data used for non-production environments (test, development and acceptance environments).

What is De-Identification?

De-identification is a process used to protect sensitive information by removing or modifying personally identifiable information (PII) from a dataset or database.

Build better and faster with synthetic data today

Unlock data access, accelerate development, and enhance data privacy.
