Syntho Engine Deployment and Integration

How to deploy and integrate the Syntho Engine and generate synthetic data


End-to-End Integration

Learn how Syntho simplifies secure and compliant synthetic data generation in just a few steps

01
Deploy in your environment

Syntho typically deploys within the customer's own secure environment, so that (sensitive) data never leaves your trusted infrastructure.

02
Connect to your database

Connect to the source data and target environment with our out-of-the-box connectors. Syntho connects to every leading database and filesystem, with 20+ database connectors and 5+ filesystem connectors.
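For intuition, here is a minimal Python sketch of the kind of connectivity check a connector performs, using SQLAlchemy. The connection string is a placeholder, and this is not Syntho's connector code; the out-of-the-box connectors handle this step for you.

    # Minimal source-connectivity check (illustrative; Syntho's
    # built-in connectors perform this internally).
    from sqlalchemy import create_engine, text

    SOURCE_URL = "postgresql://user:password@source-db:5432/production"  # placeholder

    engine = create_engine(SOURCE_URL)
    with engine.connect() as conn:
        # A trivial round-trip query confirms the source is reachable.
        print("Source reachable:", conn.execute(text("SELECT 1")).scalar() == 1)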

03
Generate your data

Define the type of synthetization you would like to run: realistically mask existing values or synthesize new ones. Syntho automatically detects sensitive data types.
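For intuition on what automatic detection involves, here is a deliberately simplified, rule-based sketch in Python. The Syntho Engine's detection is more sophisticated than this; the patterns below are illustrative assumptions only.

    # Simplified rule-based detection of sensitive types (illustration only).
    import re

    SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
    PHONE_PATTERN = re.compile(r"\+?\d[\d\s-]{7,}\d")

    def detect_sensitive(value: str) -> str | None:
        """Return a label for the detected sensitive type, or None."""
        if SSN_PATTERN.search(value):
            return "ssn"
        if PHONE_PATTERN.search(value):
            return "phone"
        return None

    print(detect_sensitive("123-45-6789"))      # -> "ssn"
    print(detect_sensitive("+31 6 1234 5678"))  # -> "phone"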

04
Share/Use the protected data

Utilize and share the protected data securely, ensuring compliance and privacy are maintained throughout its usage.

High-level deployment architecture

Our platform is flexible and deployable in any environment via Docker Compose or Kubernetes.

Within the Syntho Engine, we offer seamless integration options: a user-friendly interface (the Syntho Web UI) or integration into your pipeline through our REST API. This flexibility ensures easy deployment in your preferred environment, without external connections.
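To make the pipeline option concrete, here is a hypothetical sketch of submitting a synthetization job over a REST API with Python's requests library. The endpoint path, payload fields, and token are illustrative assumptions, not Syntho's documented API; consult the User Manual for the actual interface.

    # Hypothetical REST integration sketch (endpoint and fields assumed).
    import requests

    BASE_URL = "https://syntho.internal.example.com/api"  # placeholder host
    HEADERS = {"Authorization": "Bearer <token>"}          # placeholder credential

    # Submit a job for a source table (illustrative payload).
    job = requests.post(
        f"{BASE_URL}/jobs",
        json={"source_table": "customers", "mode": "synthesize"},
        headers=HEADERS,
        timeout=30,
    )
    job.raise_for_status()
    print("Job accepted:", job.json())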



Seamless deployment in your environment

Syntho typically deploys within the customer's own secure environment, so that (sensitive) data never leaves it. This allows you to synthesize at the source, where the original data is stored: data never leaves your environment, and Syntho never sees, receives, or processes it.

Accordingly, the Syntho Engine can be easily deployed and plugged into your environment of choice.


Hosting and data access requirements

Syntho supports deployments via Docker Compose and Kubernetes. Review our User Manual for deployment instructions, or contact our experts with questions about deployment options.

Hardware requirements
Node 1 (Kubernetes deployment)

  • 32, 64, or 128 GB of RAM (based on expected data size)

  • 12-20 virtual CPUs (sized based on expected data size)

  • 128 GB disk storage

Node 2 (application node)

  • 16 GB of RAM

  • 4 virtual CPUs

  • 30 GB disk storage

Host OS

  • Any Linux OS (x86 / x64)

Software requirements

Kubernetes Deployment 

  • Kubernetes: 1.20 and higher

    (recommended via managed service)

  • Helm: v3 and higher

  • kubectl installed

  • Ingress Controller installed on cluster

Docker Compose Deployment

  • Docker: 1.13.0+

  • Docker Compose: v3 and higher
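Before installing on Kubernetes, it can be useful to confirm the cluster meets the version requirement above. Below is a minimal preflight sketch using the official Kubernetes Python client; it assumes a working kubeconfig and is illustrative only, not part of the Syntho installation itself.

    # Preflight check: confirm the cluster is Kubernetes 1.20 or higher.
    # Assumes `pip install kubernetes` and a valid kubeconfig.
    from kubernetes import client, config

    config.load_kube_config()                  # load local kubeconfig
    info = client.VersionApi().get_code()      # GET /version on the API server
    major = int(info.major)
    minor = int(info.minor.rstrip("+"))        # managed clusters may report e.g. "21+"
    print(f"Cluster {info.git_version}; meets 1.20+: {(major, minor) >= (1, 20)}")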

Frequently Asked Questions

What are examples of PII, PHI, and identifiers?

  • First name
  • Last name
  • Phone number
  • Social Security Number (SSN)
  • Bank account number, etc.

What is Synthetic Mock Data?

Synthetic mock data substitutes sensitive PII, PHI, and other identifiers with representative values that follow your business logic and patterns.
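As a small illustration of "following business logic", the sketch below generates a mock customer whose email is derived from the generated name rather than being random. It uses the open-source Faker library; the field names are assumptions, and the Syntho Engine performs this kind of substitution automatically.

    # Mock record whose fields stay mutually consistent (illustration only).
    from faker import Faker

    fake = Faker()

    def mock_customer() -> dict:
        first, last = fake.first_name(), fake.last_name()
        return {
            "first_name": first,
            "last_name": last,
            # Business logic: the email is derived from the name, not random.
            "email": f"{first}.{last}@example.com".lower(),
            "iban": fake.iban(),  # structurally valid bank account number
        }

    print(mock_customer())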

Why do organizations use Test Data Management?

Production data is privacy-sensitive
Testing and development with representative test data is essential for delivering state-of-the-art software solutions. Using original production data seems obvious, but it is restricted under privacy regulations such as the GDPR, as enforced by authorities like the Dutch Data Protection Authority. This makes getting the test data right a challenge for many organizations.

Production data does not cover all test scenarios
Test data management is essential because production data often lacks the diversity required for comprehensive testing (or does not (yet) exist at all), leaving out edge cases and potential future scenarios. By creating and managing diverse test data sets, it ensures thorough testing coverage and helps identify potential issues before deployment, mitigating risks and bugs in production to enhance software quality.

 

Optimize testing and development
Let your testers and developers focus on testing and development instead of test data creation. Test data management maintains and refreshes test data automatically, saving the time teams typically spend on data preparation and keeping data relevant and accurate. Teams can then focus on analyzing results and improving software quality, which speeds up testing and increases agility and productivity across the development lifecycle.

What is De-Identification?

De-identification is a process used to protect sensitive information by removing or modifying personally identifiable information (PII) from a dataset or database.
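As a minimal sketch of the idea, the Python example below drops direct identifiers from a table and replaces a sensitive key with a salted hash, so records remain linkable without exposing the raw value. Column names and the salt are illustrative assumptions, not a production-grade de-identification scheme.

    # Minimal de-identification sketch with pandas (illustration only).
    import hashlib
    import pandas as pd

    df = pd.DataFrame({
        "first_name": ["Maria", "Jan"],
        "ssn": ["123-45-6789", "987-65-4321"],
        "diagnosis": ["A", "B"],
    })

    SALT = "replace-with-a-secret-salt"  # placeholder

    def pseudonymize(value: str) -> str:
        """Stable pseudonym: salted SHA-256, truncated for readability."""
        return hashlib.sha256((SALT + value).encode()).hexdigest()[:12]

    deidentified = (
        df.assign(patient_key=df["ssn"].map(pseudonymize))  # linkable pseudonym
          .drop(columns=["first_name", "ssn"])              # remove direct PII
    )
    print(deidentified)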