- Targeting the global market with proprietary technology that combines differential privacy and generative AI.
- Growing into one of the world's top four synthetic data companies.
Dreaming of becoming the 'Amazon' of the data world
Data is like the brain of AI. Just as humans grow through diverse experiences, AI develops sophisticated judgment through abundant data. However, securing high-quality data is not easy. Collecting raw data is difficult, and preprocessing, including labeling, is complex. Data sharing is restricted due to the risk of sensitive information and personal information leaks, and unbalanced data distribution reduces analytical performance.
A company has emerged to address these data challenges. CUBIG is an AI company specializing in data security and synthetic data generation. It is preparing to enter the global market with proprietary technology that combines differential privacy with generative AI.
CUBIG's co-CEO, Bae Ho, is also a professor at Ewha Womans University. He received a master's degree in information security from University College London and a doctorate in AI from Seoul National University, and he drew attention for publishing the world's first paper establishing privacy and security in the AI field. Co-CEO Jeong Min-chan holds numerous R&D and AI-related patents and has experience developing AI data applications. CTO Ha Heon-seok is an AI expert who has researched synthetic data for over a decade.
CUBIG was incorporated in 2021. It secured seed investment from Naver D2SF and VNTG in 2023, followed by pre-A round investment from the Korea Development Bank and Intops Investment in 2024. Also in 2024, CUBIG won the Minister of Science and ICT Award at the Information Security Product Innovation Awards and was selected for the 2nd AI Startup Accelerator, jointly run by SK Telecom and Hana Bank. We met with CEOs Bae Ho and Jeong Min-chan to learn about CUBIG's journey to becoming the world's only company specializing in secure synthetic data.
■ "Challenge the Global Market…AI is the Next Generation Growth Engine"

CEO Jeong Min-chan, whom we met at Naver D2SF in Gangnam, started by telling startups to "experience the overseas market."
"In the AI field, we shouldn't be discouraged by defeatism, hesitate to start a business, or fear expanding overseas. Even if we don't gain recognition domestically, we can become a company that's in demand overseas."
CEO Jeong likened AI companies to "wheels." "No matter how good a wheel you create, it's worthless if you don't have a sports car to utilize it. However, if you have a sports car overseas, you can create synergy there," he said, advising companies to find a "sports car" that fits their "wheels" in the global market.
CEO Jeong also emphasized the growth potential of the Korean AI industry. "Korea needs to have AI companies with proprietary algorithms. AI will become our country's next-generation growth engine," he said. He emphasized the importance of data in the AI industry, saying, "The difference in AI performance stems from training data. Even with the same engine, training with high-quality data can yield significantly better results."
■ One of only four synthetic data companies in the world
Synthetic data is a kind of substitute that overcomes the limitations of original data. Unlike traditional data, it is easy to obtain, requires no preprocessing, and has no restrictions on the use of sensitive information. It reduces the risk of personal information leaks and allows for free data sharing. Furthermore, it can improve analytical performance even when the original data distribution is uneven.
CEO Jeong likened synthetic data to a replica in a museum. "Just as a museum exhibits replicas instead of genuine items, synthetic data replaces sensitive real data. Real data contains sensitive information such as genetic information, financial information, corporate secrets, and personal information, and is therefore subject to strict regulations. The UK uses a prior authorization system, the US uses an ex post facto liability system, and Korea has even stricter regulations than Europe," he said, explaining why synthetic data is needed. In other words, synthetic data is artificial data that stands in for real data whose use is restricted by regulation.
Globally, only four companies, including CUBIG, possess advanced synthetic data technology that replaces personal information while maintaining the statistical characteristics and distribution of real data.
■ Combining Differential Privacy and Generative AI… Strengthening Security While Maintaining Data Quality
When generating synthetic data, the scale and security of the data are crucial. For example, when releasing demographic statistics for a small region, a figure such as the number of "men in their 80s in Region OO" can identify individuals because so few people fall into that group. Differential privacy is designed to address this issue.
Differential privacy preserves the statistical characteristics of a dataset while making individual records unidentifiable. MIT Technology Review named it one of its 10 Breakthrough Technologies of 2020, and it is widely regarded as one of the most secure approaches to data protection. Its defining property is irreversibility: unlike conventional anonymization or encryption, data protected this way cannot be restored to its original state, so privacy protection and data utilization can be achieved together.
CEO Bae explained, "Differential privacy is a cutting-edge data security technology currently adopted by global companies and institutions such as Apple and the U.S. Census Bureau. What makes this technology innovative is that the data cannot be restored to its original state. Unlike existing anonymization or encryption, data subject to differential privacy cannot be reverted. This achieves the dual goals of protecting personal information and utilizing data simultaneously."
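The small-region example can be made concrete with the Laplace mechanism, one standard way to release a count under differential privacy. The sketch below is illustrative only and is not a description of CUBIG's implementation; the function name `private_count`, the true count of 3, and the epsilon values are all assumptions chosen for the example.

```python
import random


def private_count(true_count, epsilon, rng):
    """Release a count with Laplace noise (hypothetical helper).

    Adding or removing one person changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon. A smaller epsilon
    means stronger privacy and a noisier answer.
    """
    scale = 1.0 / epsilon
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered on 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


if __name__ == "__main__":
    rng = random.Random(42)
    true_count = 3  # assumed: three men in their 80s live in the region
    for epsilon in (0.1, 1.0, 10.0):
        released = private_count(true_count, epsilon, rng)
        print(f"epsilon={epsilon}: released count = {released:.2f}")
```

Any single released value no longer reveals whether a particular resident is in the group, while the average over many such statistics stays close to the truth. That trade-off between noise and accuracy is what the utility figures quoted in the article refer to.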
CUBIG went one step further, combining differential privacy with generative AI. CEO Bae explained, "If differential privacy alone is applied, data performance is limited to 70-80%. To address this issue, we combined differential privacy with generative AI, and through this, we presented a new paradigm called 'secure synthetic data.'" Secure synthetic data is a technology that strengthens security while maintaining data quality.

■ Generates data identical to the original without viewing the original data… Data non-access technology
Just as creating a replica of a museum piece requires access to the original, creating synthetic data normally requires the original data. However, CUBIG has developed "data non-access technology" that creates synthetic data without access to the original data.
CEO Bae explained, "Previously, protecting data meant first sharing the original data with an outside party. However, companies and organizations find it difficult to provide sensitive data to external parties. We've solved this fundamental problem with data non-access technology."
CUBIG's data non-access technology, which makes a replica without ever seeing the original, works like a game of "Twenty Questions." Customers simply describe the basic properties of the data they want, and CUBIG generates and sends candidate datasets. Customers select the candidates that fit best, and the process repeats, improving the accuracy of the data with each round.
CEO Jeong explained the data non-access technology, saying, "If you explain that the first column is gender and the second is age, we'll generate and send multiple sets of expected data. Once the customer selects the appropriate data from these, we'll create a new dataset based on the selected data and send it to them. We improve data quality by playing 'Twenty Questions' with the customer."
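The generate-and-select loop described above can be sketched in a few lines. This is a toy illustration under heavy assumptions, not CUBIG's patented method: the single "age" column, the functions `propose`, `customer_pick`, and `refine`, and the rule of matching on mean age are all hypothetical. The point is only that the vendor side never reads the private rows; it receives just the index of the candidate the customer chose.

```python
import random
import statistics


def propose(center, spread, n_candidates, n_rows, rng):
    """Vendor side: build candidate 'age' columns around evenly spaced guesses."""
    step = 2.0 * spread / (n_candidates - 1)
    candidates = []
    for i in range(n_candidates):
        guess = center - spread + i * step
        # Clip generated ages to the range declared in the schema (0 to 100).
        rows = [min(100.0, max(0.0, rng.gauss(guess, 10.0))) for _ in range(n_rows)]
        candidates.append(rows)
    return candidates


def customer_pick(private_ages, candidates):
    """Customer side: compare locally and return only the best candidate's index."""
    target = statistics.mean(private_ages)
    return min(
        range(len(candidates)),
        key=lambda i: abs(statistics.mean(candidates[i]) - target),
    )


def refine(private_ages, rounds=6, seed=0):
    """Run the question-and-answer rounds and return the final synthetic column."""
    rng = random.Random(seed)
    center, spread = 50.0, 50.0  # the vendor starts knowing only "age is 0 to 100"
    best = None
    for _ in range(rounds):
        candidates = propose(center, spread, n_candidates=9, n_rows=200, rng=rng)
        if best is not None:
            candidates.append(best)  # keep the previous winner in the running
        # In a real deployment this call would run on the customer's side.
        best = candidates[customer_pick(private_ages, candidates)]
        center = statistics.mean(best)
        spread /= 2.0  # narrow the search around the selected dataset
    return best


if __name__ == "__main__":
    private_ages = [70.0 + (i % 5) for i in range(100)]  # stays with the customer
    synthetic = refine(private_ages)
    print(f"synthetic mean age: {statistics.mean(synthetic):.1f}")
```

Note that each selection still reveals a little about the private data, so a production system would need to bound that leakage, for example by adding noise to the choices. This sketch does not attempt that.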
CUBIG has patented its data non-access technology and demonstrated it through a Proof of Concept (PoC) with Naver. Stressing its advantage over competitors, CEO Jeong said, "In May 2024, Microsoft announced a similar algorithm. While Microsoft could only process image data, CUBIG can process diverse data types, including text, images, and tables."
■ 'DTS' and 'azoo', a new paradigm for the data industry
CUBIG offers two core solutions. The first is DTS, a B2B SaaS tool that enables companies to generate synthetic data internally. Launched in July 2024, DTS is used for data sharing between affiliates and for securing AI training data, and operates on a subscription model.
The second is "azoo," a data trading platform launched in June 2024. Whereas data regulations have meant that each dataset had to be purchased separately from its own source, azoo uses synthetic data to enable integrated trading of diverse data in one place.
CEO Jeong explained, "Currently, due to regulations, data must be purchased from different sources. However, synthetic data is not subject to regulation, so all data can be purchased in one place, much like an online shopping mall."
The platform currently offers basic data trading capabilities and is preparing a data aggregation service. Integrated analytics capabilities are also planned for the first half of 2025. Through these initiatives, azoo aims to evolve beyond a simple trading platform into a comprehensive data solutions platform.

■ Moving toward global market entry and overseas investment
CUBIG is preparing to expand into the global market, with a particular focus on Europe, which is known for its strict data regulations. The company is in the process of establishing a UK subsidiary and is also looking to attract foreign investment. CEO Jeong stated, "Europe has stringent data regulations, such as the General Data Protection Regulation (GDPR), which makes our solutions even more essential. We are currently preparing to establish a UK subsidiary, and we plan to expand into the US market afterward."
He continued, "For AI to be safe, training data must be safe. CUBIG will open a new paradigm for the data industry." He added, "We aim to become the 'Amazon of the data world.' Just as Amazon trades products from all over the world on a single platform, we want to create an ecosystem where all data can be traded safely and freely."