Generative Adversarial Network for Synthetic Data Generation
DOI:
https://doi.org/10.64252/0kbt2y27
Keywords:
Generative Adversarial Network, Synthetic Data Generation, Min-Max Scaling, Recursive Feature Elimination, Support Vector Classifier, TensorFlow, TensorFlow-GAN.
Abstract
This paper proposes a methodology for generating high-quality synthetic data with Generative Adversarial Networks (GANs), motivated by the limited availability of real-world data in many domains. The workflow applies Min-Max Scaling as a preprocessing step so that feature ranges are consistent and GAN training is easier. Recursive Feature Elimination (RFE) is then used for feature selection, improving model effectiveness by retaining only the most significant features. The quality of the generated data is assessed with a Support Vector Classifier, which attained high performance on the synthetic datasets. The framework is built with TensorFlow and TensorFlow-GAN, providing a scalable and flexible environment for deep-learning-based synthetic data generation. The methodology generates synthetic data efficiently, and the generated data reflects the statistical characteristics of the real datasets. It can be applied in areas such as healthcare, finance, and machine learning model development, where real data may be limited or confidential.
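The preprocessing and evaluation stages named in the abstract (Min-Max Scaling, Recursive Feature Elimination, Support Vector Classifier) can be illustrated with a minimal sketch. It uses scikit-learn, which the abstract does not mention, and it omits the GAN itself. The helper name `build_evaluator`, the toy dataset, the linear-kernel estimator inside RFE, and the choice of five retained features are all assumptions made for illustration, not the paper's configuration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC


def build_evaluator(n_features_to_select=5):
    """Chain Min-Max scaling, RFE, and an SVC into one pipeline (hypothetical helper)."""
    return Pipeline([
        # Rescale every feature to [0, 1] so that feature ranges are consistent.
        ("scale", MinMaxScaler(feature_range=(0, 1))),
        # RFE needs an estimator exposing coef_; a linear-kernel SVC is assumed here.
        ("select", RFE(SVC(kernel="linear"), n_features_to_select=n_features_to_select)),
        # Final classifier used to score the data on the retained features.
        ("classify", SVC(kernel="rbf")),
    ])


# Toy table standing in for a real or GAN-generated dataset (assumption).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

evaluator = build_evaluator(n_features_to_select=5)
evaluator.fit(X_train, y_train)

accuracy = evaluator.score(X_test, y_test)
selected = evaluator.named_steps["select"].support_  # boolean mask over the 12 features

print(f"held-out accuracy: {accuracy:.3f}")
print(f"features kept by RFE: {np.flatnonzero(selected).tolist()}")
```

In a setting like the one described, the same pipeline could be fitted on generated samples and scored on held-out real samples, or the reverse, to check whether the synthetic data preserves the class structure of the original.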