Crowdworks Completes 'Research Project on Practical Approaches to Generative AI Reliability Evaluation'

Crowdworks announced on the 4th that it had completed the 'Research Project on Practical Approaches to Generative AI Reliability Evaluation' hosted by the Telecommunications Technology Association (TTA). Through this project, Crowdworks has strengthened its expertise and competitiveness in AI reliability evaluation.

The main goal of the research project was to develop and validate a standard framework for systematically evaluating the reliability and safety of generative AI. Crowdworks was in charge of verifying the reliability of large language models (LLMs) and developing educational materials, and it evaluated three LLMs developed by domestic companies.

Crowdworks first used a dataset to analyze the response patterns of the three LLMs, identified potential risk factors for each model, and designed attack scenarios. It then tested the models with a wide range of prompts to refine the scenarios and probed each model's vulnerabilities in depth.

In addition, Crowdworks conducted both an automated evaluation using AI models and an in-depth evaluation by a red team of experts. The red team was made up of LLM experts selected from Crowdworks’ pool of 600,000 data experts and drew on its members' detailed understanding of LLMs for the in-depth evaluation.

During the evaluation, Crowdworks analyzed the risk of each model's responses both quantitatively and qualitatively, applying AI risk assessment criteria such as violence, illegality, irrationality, non-factuality, misleading content, and unethical content. It verified the reliability and safety of the models from multiple angles and identified areas for improvement.

Through this project, Crowdworks has secured expertise in AI reliability assessment and plans to build on it to enhance its AI reliability assessment services, which are aimed at reducing corporate AI risks. Crowdworks also plans to expand its AI service reliability assessment business across various industries this year and to strengthen its leadership in AI reliability and safety.

Kim Woo-seung, CEO of Crowdworks, said, “The AI reliability evaluation framework developed through this TTA research project has become the standard for domestic generative AI reliability evaluation,” and added, “Based on our network of 600,000 data experts and a verified evaluation system, we will lead the market in AI reliability and safety evaluation and help many companies develop safe and reliable AI services.”

