Upstage's AI plagiarism controversy ends… K-AI ecosystem verification draws attention

The "model plagiarism" controversy that engulfed Upstage, a leading domestic artificial intelligence (AI) startup, at the start of the new year was resolved as a brief episode just two days later.

The controversy, sparked by a public accusation from a competitor's CEO, ended with an official apology from the party who raised it. But it went beyond a simple war of nerves between companies, raising the question of where independent technology begins in the domestic AI ecosystem. The verification process for the government-led "Independent AI Foundation Model" project and the ethics of healthy competition among technology startups are now being put to the test. The industry is unanimous that a scientific, transparent verification system capable of guaranteeing the reliability of "Korean AI" must be established urgently.

The incident began on January 1st, when Seok-Hyeon Ko, CEO of Sionic AI, raised suspicions via social media that Upstage's large language model (LLM) "Solar Open 100B" closely resembled a model from China's Zhipu AI. Ko cited a technical analysis showing that the cosine similarity between the layer normalization (LayerNorm) weights of the two models reached 96.8%. The claim immediately sparked a heated debate in the developer community, with some questioning whether a copied model was being used in a national project funded by taxpayer money.
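For reference, the metric at the center of the dispute is simple to compute. The sketch below assumes the LayerNorm gain vectors of one corresponding layer from each model have already been extracted as arrays; the names, sizes, and random values are placeholders for illustration, not data from the actual checkpoints.

    import numpy as np

    def layer_cosine_similarity(w_a: np.ndarray, w_b: np.ndarray) -> float:
        # Cosine similarity between two layers' weight vectors, flattened.
        a, b = w_a.ravel(), w_b.ravel()
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Placeholder stand-ins for the LayerNorm gain vectors of one layer from
    # each model; a real analysis would load these from the published
    # checkpoints, layer by layer.
    rng = np.random.default_rng(seed=0)
    hidden_size = 4096  # illustrative hidden dimension
    ln_model_a = rng.standard_normal(hidden_size)
    ln_model_b = rng.standard_normal(hidden_size)

    print(f"LayerNorm cosine similarity: "
          f"{layer_cosine_similarity(ln_model_a, ln_model_b):.3f}")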

Upstage, however, immediately pushed back. The company explained that "during large-scale model training, the statistical values of certain layers can converge due to structural characteristics," and pointed out that concluding replication from parameter similarity alone was a technical error. Upstage also chose to tackle the issue head-on, announcing a public verification session with external experts and a live YouTube broadcast to dispel the suspicions. Finally, on January 2nd, a day after raising the issue, CEO Ko posted an official apology, stating, "We accept the criticism that it is difficult to conclude weight sharing based solely on layer value similarity," and the controversy was concluded to have stemmed from a technical misunderstanding.

Where does AI technology independence begin and end?
The incident clearly shows how difficult verification has become in the generative AI market. LLMs consist of hundreds of billions of parameters, so the originality of the underlying technology cannot be judged from numerical similarities in a few sections alone. In the AI field in particular, where the open-source ecosystem is vibrant, "convergent evolution" can occur: different models arrive at similar values depending on the architecture and the composition of the training data. Experts say the controversy not only exposed the weakness of relying on a single, partial metric like cosine similarity, but also confirmed the absence of a comprehensive verification protocol that the public and the market can accept.
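One concrete reason a single similarity score is weak evidence: LayerNorm gain parameters are initialized at 1 and, in many trained models, remain close to 1, so two models that share no weights at all can still score near the top of this metric. The toy simulation below illustrates the effect under that assumption; the 0.05 noise scale is illustrative, not a measurement of the actual models.

    import numpy as np

    rng = np.random.default_rng(seed=7)
    hidden_size = 8192  # illustrative hidden dimension

    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Two "independently trained" LayerNorm gain vectors: both start at 1.0
    # and drift by small, uncorrelated amounts during training.
    gains_model_a = 1.0 + 0.05 * rng.standard_normal(hidden_size)
    gains_model_b = 1.0 + 0.05 * rng.standard_normal(hidden_size)

    # Despite sharing no weights, the vectors are nearly parallel because
    # both sit close to the all-ones vector; this prints a value close to 1.
    print(f"cosine similarity of unrelated LayerNorm gains: "
          f"{cosine(gains_model_a, gains_model_b):.3f}")

Distinguishing this kind of convergence from actual copying is precisely what a broader protocol covering architecture, training records, and data provenance would have to do.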

"We acknowledge the criticism from many that it's difficult to conclude whether model weights are shared solely based on the cosine similarity of layer values. We apologize to Upstage staff for raising suspicions without rigorous verification and causing confusion." (Seonik AI CEO Seok-Hyeon Ko)

"The identified section is a structure where statistically similar values can be generated during the learning process. To resolve any doubts, we will transparently disclose and verify the code and experimental environment to the extent necessary." (Upstage statement)

Although the suspicions have been cleared, the repercussions of the incident are expected to be significant. First, demands for verification of AI projects funded with public money, including the government's Independent AI Foundation Model ("Dokpamo") project, are likely to intensify. Beyond simply measuring the performance of the resulting models, standardized "model cards" are needed to transparently verify the sources of training data, the training pipeline, and the architecture design process.
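By way of illustration only, the fields such a standardized model card might record could look like the sketch below; the schema and field names are hypothetical, not an existing government or industry standard.

    # Hypothetical sketch of fields a standardized model card could require
    # for publicly funded models; the schema below is illustrative only.
    model_card = {
        "model_name": "example-foundation-model-100b",  # placeholder
        "architecture": {
            "type": "decoder-only transformer",
            "parameter_count": "100B",
            "derived_from": None,  # upstream checkpoint, if any
        },
        "training_data": {
            "sources": ["licensed corpora", "public web crawl"],
            "collection_cutoff": "YYYY-MM",
        },
        "training_pipeline": {
            "trained_from_scratch": True,  # vs. continued pre-training
            "code_and_logs": "available for third-party verification",
        },
        "evaluation": ["benchmark scores", "independent reproduction"],
    }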

Venture capital industry insiders said, "As competition with global big tech intensifies, mutual verification and technological advancement matter more than wasteful mudslinging between domestic startups," adding, "This attempt at public verification should not be a one-off event but should become a new practice that raises transparency across the industry." How the Korean AI industry secures the asset called "trust" is likely to be a key variable in its global competitiveness going forward.

Meanwhile, there is also a positive view that the industry demonstrated a capacity for rapid self-correction: the issue was raised promptly, verified quickly within the ecosystem, and formally withdrawn with an apology once the explanation from the party involved was understood.