-Entering the global market with proprietary technology that combines differential privacy and generative AI
-Growing into one of the world's top 4 synthetic data companies
-Dreaming of the 'Amazon' of the data world
Data is like the brain of AI. Just as humans grow through varied experience, AI acquires sophisticated judgment from abundant data. High-quality data, however, is hard to secure. Raw data is difficult to collect, and preprocessing such as labeling is laborious. Because sensitive and personal information may leak, data sharing is restricted, and imbalanced data distributions degrade analysis performance.
One company has emerged to solve these data problems: CUBIG, an AI company specializing in data security and synthetic data generation. CUBIG is preparing to enter the global market with proprietary technology that combines differential privacy and generative AI.
CUBIG's co-CEO Bae Ho is currently a professor at Ewha Womans University. He received a master's degree in information security from the University of London and a doctorate in AI from Seoul National University, and drew attention as the first in the world to publish a paper establishing privacy and security in the AI field. Co-CEO Jeong Min-chan holds numerous R&D and AI-related patents and has experience developing AI data applications. CTO Ha Heon Seok is an AI expert who has been researching synthetic data for over 10 years.
Founded in 2021, CUBIG attracted seed investment from Naver D2SF and VNTG in 2023 and pre-A investment from Industrial Bank of Korea and Intops Investment in 2024. In 2024, it also won the Minister of Science and ICT Award at the Information Protection Product Innovation Awards and was selected for the 2nd AI Startup Accelerator jointly operated by SK Telecom and Hana Bank. We met with CEO Bae Ho and CEO Jeong Min-chan to hear how CUBIG is establishing itself as the world's only company specializing in secure synthetic data.
■ “Challenge the Global Market… AI is the Next Generation Growth Engine”

CEO Jeong Min-chan, whom we met at Naver D2SF in Gangnam, began by urging startups to 'go out and test overseas markets.'
“In the AI field, you should not let the business intimidate you, hesitate to start a company, or be afraid to go overseas. Even if you are not recognized domestically, you can become a company that is needed overseas.”
CEO Jeong likened AI companies to ‘wheels.’ “No matter how good a ‘wheel’ you make, it is worthless if there is no ‘sports car’ to utilize it. However, if there is a ‘sports car’ overseas, you can create synergy there,” he said, advising companies to find a ‘sports car’ that matches their ‘wheels’ in the global market.
CEO Jeong also emphasized the growth potential of the Korean AI industry. “Korea also needs to have an AI company with its own algorithm. AI will be our country’s next-generation growth engine,” he said. He went on to stress the importance of data in the AI industry: “The difference in AI performance comes from training data. Even with the same engine, training on high-quality data yields much better results.”
■ Only 4 specialized synthetic data companies in the world
Synthetic data is a substitute that overcomes the limitations of original data. Unlike real data, it is easy to obtain, does not require preprocessing, and carries no restrictions on the use of sensitive information. The risk of personal information leakage is low, and the data can be shared freely. In addition, analysis performance can be improved even when the distribution of the original data is imbalanced.
CEO Jeong likened synthetic data to a replica in a museum. “Just as a museum exhibits replicas instead of the real items, synthetic data stands in for sensitive real data. Real data contains sensitive information such as genetic information, financial information, corporate secrets, and personal information, and is subject to strict regulations. The UK uses a prior approval system, the US uses an ex post liability system, and Korea has stronger regulations than Europe,” he said, explaining why synthetic data is needed. In other words, synthetic data is artificial data that replaces real data whose use is restricted by regulation.
There are only four companies in the world, including CUBIG, that possess advanced synthetic data technology that replaces personal information while maintaining the statistical characteristics and distribution of actual data.
■ Combination of differential privacy and generative AI… Strengthening security while maintaining data quality
When creating synthetic data, both the scale and the safety of the data matter. For example, when population statistics are published for a small area, a query such as the number of 'men in their 80s in OO area' risks identifying individuals because so few people fall into that group. Differential privacy is needed to solve this problem.
Differential privacy preserves the statistical characteristics of a dataset while making it impossible to identify individual records. It was named one of MIT Technology Review's 10 Breakthrough Technologies in 2020 and is currently regarded as the safest data security method.
“Differential privacy is a state-of-the-art data security technology currently adopted by global companies and organizations such as Apple and the U.S. Census Bureau,” explained CEO Bae. “The innovation of this technology is that the original data cannot be restored. Unlike existing anonymization or encryption, data to which differential privacy has been applied cannot be reverted to the original information. This achieves two goals at the same time: personal information protection and data utilization.”
CUBIG went one step further and combined differential privacy with generative AI. “If only differential privacy is applied, there is a limitation that data performance drops to 70-80%,” explained CEO Bae. “To solve this problem, we combined differential privacy with generative AI and presented a new paradigm called ‘secure synthetic data.’” Secure synthetic data is designed to enhance security while maintaining data quality.
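The small-area population example can be made concrete with the Laplace mechanism, a textbook way to achieve differential privacy (the formal name used in the research literature for this technique) on counting queries. The sketch below is illustrative only and is not CUBIG's implementation; the count of 3 and the epsilon values are made-up assumptions.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng):
    # A counting query has sensitivity 1: adding or removing one person
    # changes the result by at most 1. Noise scale = sensitivity / epsilon.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    true_count = 3  # hypothetical: men in their 80s in one small area
    for epsilon in (0.1, 1.0, 10.0):
        noisy = private_count(true_count, epsilon, rng)
        print(f"epsilon={epsilon}: published count = {noisy:.2f}")
```

A smaller epsilon adds more noise and gives stronger protection, while a larger epsilon keeps the published figure closer to the true count. This is the privacy and utility trade-off that the article goes on to discuss.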
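The general idea of generating synthetic records from a privacy-protected summary, rather than releasing perturbed records directly, can be sketched in a few lines. The example below uses a differentially private histogram followed by sampling as a deliberately simple stand-in for a generative model. It is an assumption-laden illustration, not CUBIG's method: the age bands, record counts, and epsilon are all hypothetical.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # Difference of two exponential draws gives Laplace(0, scale) noise.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_histogram(records, categories, epsilon, rng):
    # `categories` must be fixed in advance, not read off the data,
    # otherwise the set of bins itself would leak information.
    counts = Counter(records)
    noisy = {}
    for category in categories:
        # Each person falls into exactly one bin, so every bin count has
        # sensitivity 1. Clamping at 0 is post-processing and keeps the
        # privacy guarantee intact.
        noisy[category] = max(0.0, counts[category] + laplace_noise(1.0 / epsilon, rng))
    return noisy


def sample_synthetic(noisy_hist, n, rng):
    # Draw synthetic records using only the noisy histogram as weights.
    categories = list(noisy_hist)
    weights = [noisy_hist[c] for c in categories]
    if sum(weights) == 0:
        weights = [1.0] * len(categories)  # fall back to uniform sampling
    return rng.choices(categories, weights=weights, k=n)


if __name__ == "__main__":
    rng = random.Random(0)
    bands = ["20s", "40s", "60s", "80s"]  # hypothetical age bands
    real = ["20s"] * 500 + ["40s"] * 300 + ["60s"] * 100 + ["80s"] * 5
    hist = dp_histogram(real, bands, epsilon=1.0, rng=rng)
    synthetic = sample_synthetic(hist, 1000, rng)
    print(Counter(synthetic))
```

Because the synthetic records are drawn only from the noisy histogram, they inherit its privacy guarantee. A real generative model would aim to capture far richer structure than a one-column histogram can.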

■ Generate data identical to the original without viewing the original data… Data non-access technology
Just as you need to see the original to create a replica in a museum, creating synthetic data also requires original data. However, CUBIG has developed a ‘data non-access technology’ that creates synthetic data without having to see the original data.
“Previously, original data had to be handed over in order to apply data security measures. However, it is difficult for companies or organizations to provide sensitive data to external parties. We have solved this fundamental problem with data non-access technology,” explained CEO Bae.
CUBIG's data non-access technology, which makes a replica without seeing the real thing, works like a game of 'Twenty Questions.' The customer describes only the basic properties of the data they want, and CUBIG generates and sends candidate datasets. The customer selects the most suitable candidate, and the process repeats to increase the accuracy of the data.
CEO Jeong explained the data non-access technology by saying, “If you explain that the first column is gender and the second column is age, we will create several sets of expected data and send them to you. When the customer selects the appropriate data among these, we will create a new data set based on the selected data and send it to you. We will improve the quality of the data by playing ‘Twenty Questions’ with the customer.”
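The 'Twenty Questions' loop described above can be sketched as a simple select-and-refine search. Everything in this sketch is hypothetical: the two columns (gender, age), the distance measure, the mutation rule, and all function names are assumptions made for illustration, and CUBIG's patented method is not public in this article. The key property shown is that the vendor side only ever receives the index of the chosen candidate, never the private rows.

```python
import random


def make_dataset(params, n, rng):
    """Vendor side: generate (gender, age) rows from guessed parameters."""
    p_female, mean_age = params
    rows = []
    for _ in range(n):
        gender = "F" if rng.random() < p_female else "M"
        age = min(100, max(0, round(rng.gauss(mean_age, 10))))
        rows.append((gender, age))
    return rows


def summary(rows):
    share_female = sum(g == "F" for g, _ in rows) / len(rows)
    mean_age = sum(a for _, a in rows) / len(rows)
    return share_female, mean_age


def customer_pick(private_rows, candidates):
    """Customer side: compare locally, return only the best candidate's index."""
    target = summary(private_rows)

    def distance(rows):
        s = summary(rows)
        return abs(s[0] - target[0]) + abs(s[1] - target[1]) / 100

    return min(range(len(candidates)), key=lambda i: distance(candidates[i]))


def mutate(params, step, rng):
    """Vendor side: propose a nearby variation of the chosen parameters."""
    p_female, mean_age = params
    p_female = min(1.0, max(0.0, p_female + rng.uniform(-step, step)))
    mean_age = min(100.0, max(0.0, mean_age + rng.uniform(-step, step) * 100))
    return p_female, mean_age


def refine(private_rows, rounds=20, n_candidates=8, n_rows=500, seed=0):
    rng = random.Random(seed)
    best = (0.5, 50.0)  # initial guess based on the column description alone
    step = 0.2
    for _ in range(rounds):
        param_sets = [best] + [mutate(best, step, rng) for _ in range(n_candidates - 1)]
        candidates = [make_dataset(p, n_rows, rng) for p in param_sets]
        best = param_sets[customer_pick(private_rows, candidates)]
        step *= 0.8  # narrow the search as the rounds progress
    return best


if __name__ == "__main__":
    # Hypothetical private data that stays on the customer side.
    private = make_dataset((0.8, 70.0), 2000, random.Random(42))
    print(refine(private))
```

In a real deployment the customer's selections would themselves reveal something about the private data, so a production system would likely need to protect that feedback as well, for example by adding noise to the selection step.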
CUBIG has obtained a patent for its data non-access technology and has proven the excellence of its technology through a PoC (Proof of Concept) with Naver. CEO Jeong emphasized the excellence of CUBIG's data non-access technology, saying, "In May 2024, Microsoft announced a similar algorithm, but while Microsoft can only process image data, CUBIG can process various types of data such as text, images, and tables."
■ 'DTS' and 'azoo', a new paradigm in the data industry
CUBIG offers two core solutions. The first is B2B SaaS 'DTS', a tool that allows companies to internally create synthetic data. Launched in July 2024, DTS is used to share data between affiliates and secure data for AI training, and is operated on a subscription model.
The second is the data trading platform 'azoo' launched in June 2024. Due to data regulations, each type of data had to be purchased individually, but with azoo, synthetic data can be used to integrate and trade various types of data in one place.
“Currently, due to regulations, data must be purchased from different places. However, synthetic data is not subject to regulations, so all data can be purchased from one place, just like an online shopping mall,” explained CEO Jeong.
The azoo platform currently offers basic data trading functions, and a data combination service is in preparation. An integrated analysis function is also scheduled for the first half of 2025. With these additions, CUBIG plans to evolve azoo beyond a simple trading platform into a comprehensive data solution platform.

■ Entering the global market and attracting overseas investment
CUBIG is currently preparing to enter the global market. Its strategy is to target the European market first, where data regulations are strict. It is in the process of establishing a UK corporation and also plans to attract overseas investment. Outlining the global expansion plan, CEO Jeong said, “Europe has strict data regulations such as GDPR (General Data Protection Regulation), so the need for our solutions is greater. We are currently preparing to establish a UK corporation and plan to enter the US market afterward.”
“For AI to be safe, training data must be safe. CUBIG will open a new paradigm for the data industry,” he said, expressing his ambition to open a new horizon for the data industry. “We want to become the ‘Amazon of the data world.’ Just as Amazon trades products from all over the world on a single platform, we want to create an ecosystem where all data can be traded safely and freely.”