Syntho Engine Deployment and Integration

How to deploy and integrate the Syntho Engine and generate synthetic data


End-to-End Integration

Learn how Syntho simplifies secure and compliant synthetic data generation in just a few steps

01
Deploy in your environment

Syntho typically deploys within the customer's own secure environment, so that (sensitive) data never leaves your trusted infrastructure.

02
Connect to your database

Connect to the source data and target environment with our out-of-the-box connectors. Syntho connects to every leading database and filesystem, with 20+ database connectors and 5+ filesystem connectors.
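For intuition, here is a minimal Python sketch of the kind of connectivity check a connector performs, using SQLAlchemy. The connection string is a placeholder, and this is not Syntho's connector code; the out-of-the-box connectors handle this step for you.

    # Minimal source-connectivity check (illustrative; Syntho's
    # built-in connectors perform this internally).
    from sqlalchemy import create_engine, text

    SOURCE_URL = "postgresql://user:password@source-db:5432/production"  # placeholder

    engine = create_engine(SOURCE_URL)
    with engine.connect() as conn:
        # A trivial round-trip query confirms the source is reachable.
        print("Source reachable:", conn.execute(text("SELECT 1")).scalar() == 1)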

03
Generate your data

Define the type of synthetization you would like to run: realistically mask existing values or synthesize new ones. Syntho automatically detects sensitive data types.
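For intuition on what automatic detection involves, here is a deliberately simplified, rule-based sketch in Python. The Syntho Engine's detection is more sophisticated than this; the patterns below are illustrative assumptions only.

    # Simplified rule-based detection of sensitive types (illustration only).
    import re

    SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
    PHONE_PATTERN = re.compile(r"\+?\d[\d\s-]{7,}\d")

    def detect_sensitive(value: str) -> str | None:
        """Return a label for the detected sensitive type, or None."""
        if SSN_PATTERN.search(value):
            return "ssn"
        if PHONE_PATTERN.search(value):
            return "phone"
        return None

    print(detect_sensitive("123-45-6789"))      # -> "ssn"
    print(detect_sensitive("+31 6 1234 5678"))  # -> "phone"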

04
Share/Use the protected data

Utilize and share the protected data securely, ensuring compliance and privacy are maintained throughout its usage.

High-level deployment architecture

Our platform is flexible and deployable in any environment via Docker Compose or Kubernetes.

Within the Syntho Engine, we offer seamless integration options: a user-friendly interface (the Syntho Web UI) or integration into your pipeline through our REST API. This flexibility ensures easy deployment in your preferred environment, without external connections.
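To make the pipeline option concrete, here is a hypothetical sketch of submitting a synthetization job over a REST API with Python's requests library. The endpoint path, payload fields, and token are illustrative assumptions, not Syntho's documented API; consult the User Manual for the actual interface.

    # Hypothetical REST integration sketch (endpoint and fields assumed).
    import requests

    BASE_URL = "https://syntho.internal.example.com/api"  # placeholder host
    HEADERS = {"Authorization": "Bearer <token>"}          # placeholder credential

    # Submit a job for a source table (illustrative payload).
    job = requests.post(
        f"{BASE_URL}/jobs",
        json={"source_table": "customers", "mode": "synthesize"},
        headers=HEADERS,
        timeout=30,
    )
    job.raise_for_status()
    print("Job accepted:", job.json())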



Seamless deployment in your environment

Syntho typically deploys within the customer's own secure environment, so that (sensitive) data never leaves it. This allows you to synthesize at the source, where the original data is stored: data never leaves your environment, and Syntho never sees, receives, or processes it.

Accordingly, the Syntho Engine can be easily deployed and plugged into your environment of choice.


Hosting and data access requirements

Syntho supports deployments via Docker Compose and Kubernetes. Review our User Manual for deployment instructions, or contact our experts with questions about deployment options.

Hardware requirements
Node 1 (Kubernetes deployment)

  • 32, 64, or 128 GB of RAM (based on expected data size)

  • 12-20 virtual CPUs (sized based on expected data size)

  • 128 GB disk storage

Node 2 (application node)

  • 16 GB of RAM

  • 4 virtual CPUs

  • 30 GB disk storage

Host OS

  • Any Linux OS (x86 / x64)

Software requirements

Kubernetes Deployment 

  • Kubernetes: 1.20 and higher

    (recommended via managed service)

  • Helm: v3 and higher

  • kubectl installed

  • Ingress Controller installed on cluster

Docker Compose Deployment

  • Docker: 1.13.0+

  • Docker Compose: v3 and higher
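Before installing on Kubernetes, it can be useful to confirm the cluster meets the version requirement above. Below is a minimal preflight sketch using the official Kubernetes Python client; it assumes a working kubeconfig and is illustrative only, not part of the Syntho installation itself.

    # Preflight check: confirm the cluster is Kubernetes 1.20 or higher.
    # Assumes `pip install kubernetes` and a valid kubeconfig.
    from kubernetes import client, config

    config.load_kube_config()                  # load local kubeconfig
    info = client.VersionApi().get_code()      # GET /version on the API server
    major = int(info.major)
    minor = int(info.minor.rstrip("+"))        # managed clusters may report e.g. "21+"
    print(f"Cluster {info.git_version}; meets 1.20+: {(major, minor) >= (1, 20)}")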

Frequently Asked Questions

What are examples of PII, PHI, and identifiers?

  • First name
  • Last name
  • Phone number
  • Social Security Number (SSN)
  • Bank account number, etc.

What is Synthetic Mock Data?

Synthetic mock data substitutes sensitive PII, PHI, and other identifiers with representative values that follow your business logic and patterns.
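As a small illustration of "following business logic", the sketch below generates a mock customer whose email is derived from the generated name rather than being random. It uses the open-source Faker library; the field names are assumptions, and the Syntho Engine performs this kind of substitution automatically.

    # Mock record whose fields stay mutually consistent (illustration only).
    from faker import Faker

    fake = Faker()

    def mock_customer() -> dict:
        first, last = fake.first_name(), fake.last_name()
        return {
            "first_name": first,
            "last_name": last,
            # Business logic: the email is derived from the name, not random.
            "email": f"{first}.{last}@example.com".lower(),
            "iban": fake.iban(),  # structurally valid bank account number
        }

    print(mock_customer())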

Why do organizations use Test Data Management?

Production data is privacy-sensitive
Testing and development with representative test data is essential for delivering state-of-the-art software solutions. Using original production data seems obvious, but it is restricted under privacy regulations such as the GDPR, as enforced by authorities like the Dutch Data Protection Authority. This makes getting the test data right a challenge for many organizations.

Production data does not cover all test scenarios
Test data management is essential because production data often lacks the diversity required for comprehensive testing (or does not (yet) exist at all), leaving out edge cases and potential future scenarios. By creating and managing diverse test data sets, it ensures thorough testing coverage and helps identify potential issues before deployment, mitigating risks and bugs in production to enhance software quality.

 

Optimize testing and development
Let your testers and developers focus on testing and development instead of test data creation. Test data management maintains and refreshes test data automatically, saving the time teams typically spend on data preparation and keeping data relevant and accurate. Teams can then focus on analyzing results and improving software quality, which speeds up testing and increases agility and productivity across the development lifecycle.

What is De-Identification?

De-identification is a process used to protect sensitive information by removing or modifying personally identifiable information (PII) from a dataset or database.
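As a minimal sketch of the idea, the Python example below drops direct identifiers from a table and replaces a sensitive key with a salted hash, so records remain linkable without exposing the raw value. Column names and the salt are illustrative assumptions, not a production-grade de-identification scheme.

    # Minimal de-identification sketch with pandas (illustration only).
    import hashlib
    import pandas as pd

    df = pd.DataFrame({
        "first_name": ["Maria", "Jan"],
        "ssn": ["123-45-6789", "987-65-4321"],
        "diagnosis": ["A", "B"],
    })

    SALT = "replace-with-a-secret-salt"  # placeholder

    def pseudonymize(value: str) -> str:
        """Stable pseudonym: salted SHA-256, truncated for readability."""
        return hashlib.sha256((SALT + value).encode()).hexdigest()[:12]

    deidentified = (
        df.assign(patient_key=df["ssn"].map(pseudonymize))  # linkable pseudonym
          .drop(columns=["first_name", "ssn"])              # remove direct PII
    )
    print(deidentified)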