As artificial intelligence (AI) advances, large language models (LLMs), which are trained to understand and generate human language, are driving rapid progress in natural language processing (NLP). Generative AI services such as ChatGPT can likewise understand context and hold more natural conversations thanks to advances in LLMs. LLMs are now used across industries that handle data and play a key role in supporting the rapid growth of AI technology.
Recently, LLMs have been evolving into sophisticated forms specialized for particular domains such as telecommunications, gaming, security, and law. General-purpose LLMs, despite being trained on vast amounts of data, fall short in areas that require in-depth expertise. They also face inherent problems such as information security vulnerabilities and hallucinations. Companies are therefore building more precise, proprietary language models by training them on specialized knowledge and data that reflect the characteristics of each industry. Let's look at examples of companies advancing AI technology through "domain-specific LLMs" optimized for their industries and businesses.

◆ Crowdworks minimizes hallucinations by drawing on its experience developing enterprise AI across industries
AI tech company Crowdworks offers a "data engine" that preprocesses the data required for AI training and an "AI solution" that provides customized large language model (LLM) construction services for each company. Through its data engine, Crowdworks operates a data labeling platform that converts various kinds of data, such as images and videos, into formats that AI can recognize. The platform lets individuals take part in labeling tasks and share in the profits, and the resulting data is sold to clients such as financial institutions and search portals. Crowdworks has 620,000 registered labelers, the largest number in Korea.
On the AI solution side, Crowdworks launched its fine-tuning solution, LLM Platform, in September 2023. The company was selected as an official partner of Naver HyperCLOVA X, and in the first half of this year it launched the business-specific SLM 'WorksOne'. It also operates 'Crowd Academy', which trains the personnel needed to build AI data. Crowd Academy provides educational content for training labelers and was selected for the 'National Tomorrow Learning Card' course in 2021 and the 'Platform Worker Specialized Training' project in 2023.
In particular, Crowdworks has been minimizing the chronic hallucination problem of LLMs by drawing on its experience developing enterprise AI in various industries, and has been raising customer satisfaction by improving the completeness and accuracy of answers through verification of result data and performance.
◆ S2W demonstrates specialized data processing capabilities with a dark web-specific language model
S2W, a data intelligence company specializing in AI and security, is drawing attention for developing "DarkBERT," the world's first AI language model built specifically for the dark web. DarkBERT can analyze the difficult language and illegal content found on the dark web, which is regarded as a hotbed of cybercrime such as drug distribution, ransomware, and hacking. Trained on approximately 300 million pages of text collected from the dark web, the model performs well on analysis tasks such as dark web page topic classification and ransomware leak site detection, making cybercrime investigations more efficient. S2W has also built "DarkCHAT," a dark web-specific chatbot based on DarkBERT, into "XARVIS GLOBAL," its AI-based big data analysis platform supplied to an Indonesian government agency, so that users can immediately look up the cybercrime-related information they need.
CyberTuned, the cybersecurity-specific AI language model released after DarkBERT, is designed to learn effectively from unstructured cybersecurity data, particularly non-linguistic elements such as URLs and SHA hashes, and shows differentiated capabilities in cyber threat intelligence (CTI) tasks. The company is also extending the NLP know-how and specialized data processing capabilities it has accumulated in developing cybersecurity-specific language models to other industries, including manufacturing, distribution, finance, and the public sector.
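To make the idea of URLs and SHA hashes as distinct elements in security text more concrete, the following is a minimal Python sketch that locates such elements in a threat report sentence so they could be treated separately from ordinary words. It is an illustration only and not S2W's method: the regular expressions, the function name `tag_non_linguistic`, and the sample sentence are all assumptions made for this example.

```python
import re

# Hypothetical patterns for illustration; they are not taken from CyberTuned.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")
# Hex digests of 64 characters (SHA-256 length) or 40 characters (SHA-1 length).
HASH_RE = re.compile(r"\b(?:[0-9a-fA-F]{64}|[0-9a-fA-F]{40})\b")


def tag_non_linguistic(text):
    """Return sorted (start, end, kind, value) spans for URLs and SHA-style digests."""
    spans = []
    for match in URL_RE.finditer(text):
        spans.append((match.start(), match.end(), "URL", match.group()))
    for match in HASH_RE.finditer(text):
        kind = "SHA256" if len(match.group()) == 64 else "SHA1"
        spans.append((match.start(), match.end(), kind, match.group()))
    # Sorting by start offset gives the spans in reading order.
    return sorted(spans)


if __name__ == "__main__":
    digest = "ab" * 32  # placeholder 64-character hex string, not a real sample
    report = f"Dropper fetched from http://example.test/a.bin with digest {digest} observed."
    for start, end, kind, value in tag_non_linguistic(report):
        print(kind, start, end, value)
```

A real pipeline would also need to handle overlaps (for example, a hash embedded in a URL) and other indicator types; this sketch leaves those cases out.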
◆ BHSN builds its own legal-specific LLM to improve efficiency from legal advice to contract review
There are also AI platforms that aim to maximize the efficiency of legal work through LLMs specialized for the legal market. Allibee, developed by legal AI solutions company BHSN, is a generative AI-based Software-as-a-Service (SaaS) platform specialized for legal work. It offers functionality optimized for contract-related legal tasks by understanding context, grasping the meaning of words, and then providing appropriate responses.
Allibee is built on BHSN's proprietary legal AI language model, "BHSN Legal-LLM." The model was intensively trained on a large volume of high-quality legal data, including contracts, statutes, precedents, and policies, selected and generated through collaboration between lawyers and AI engineers. On the basis of this highly accurate information, Allibee implements detailed functions tailored to the legal field. It also provides services such as revising contracts to align with internal policies, drawing on data from diverse clients, including corporations, public institutions, and law firms. Allibee is currently used as an all-in-one AI business solution that raises work productivity through these legal domain-specific features.
◆ SKT innovates internal operations with 'Telco LLM', a language model customized for telecommunications companies
SK Telecom's (hereinafter SKT) 'Telco LLM' is a telecommunications-specific LLM that has been trained on domestic telecommunications terminology, such as 5G rate plans, T membership, and public subsidies, as well as on internal AI ethics guidelines. SKT collected and curated a vast amount of Korean telecommunications data and used it to train its own 'AX', OpenAI's 'GPT', and Anthropic's 'Claude', building a multi-engine LLM. The model has gone through detailed fine-tuning exclusively for telecommunications companies and is designed to process data in specialized areas such as telecommunications services, membership benefits, and customer consultation patterns, which enables it to perform more advanced tasks than general-purpose LLMs.
SKT is improving internal operational efficiency by using its multi-engine Telco LLM to select and apply the most suitable LLM for each service, implementing functions suited to a range of telecommunications work situations. Recently, it launched the "AI Consultation Support System," the first LLM-based system among major domestic customer centers, which lets counselors quickly search for and organize the information they need by entering questions in natural language. It has also built an "AI Document Automated Processing System," which automatically processes documents that customers send by text message and email, using a large multimodal model (LMM) that can understand images as well as text. SKT plans to expand the use of Telco LLM to various business situations, beyond distribution network management and network infrastructure operation.
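The multi-engine idea, in which a suitable LLM is chosen for each service, can be pictured as a simple routing table. The Python sketch below is a hypothetical illustration under that assumption and does not describe SKT's actual system: the class `MultiEngineRouter`, the engine names `engine_a` and `engine_b`, the task labels, and the stub engines that merely echo the prompt are all invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Engine:
    """A named text-generation backend; `generate` stands in for a real model call."""
    name: str
    generate: Callable[[str], str]


def make_stub_engine(name: str) -> Engine:
    # Stub that echoes the prompt with the engine name, so routing is observable.
    return Engine(name, lambda prompt: f"[{name}] {prompt}")


class MultiEngineRouter:
    """Maps a task label to one engine, falling back to a default engine."""

    def __init__(self, engines: Dict[str, Engine], routes: Dict[str, str], default: str):
        unknown = {target for target in list(routes.values()) + [default] if target not in engines}
        if unknown:
            raise ValueError(f"routes refer to unknown engines: {sorted(unknown)}")
        self.engines = engines
        self.routes = routes
        self.default = default

    def route(self, task: str) -> Engine:
        # Unlisted tasks use the default engine instead of raising an error.
        return self.engines[self.routes.get(task, self.default)]

    def answer(self, task: str, prompt: str) -> str:
        return self.route(task).generate(prompt)


if __name__ == "__main__":
    engines = {name: make_stub_engine(name) for name in ("engine_a", "engine_b")}
    router = MultiEngineRouter(
        engines,
        routes={"customer_consultation": "engine_a", "document_processing": "engine_b"},
        default="engine_b",
    )
    print(router.answer("customer_consultation", "Summarize this rate plan inquiry."))
```

In practice the choice of engine would likely depend on measured quality, cost, and latency for each task rather than a fixed table; the table simply keeps the sketch short.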
◆ NCSOFT's 'VARCO LLM' presents a new paradigm for creative AI beyond the gaming and entertainment domains
'VARCO LLM' is NCSOFT's first self-developed AI language model. It is leading innovation in the game and entertainment industry by supporting the creation of high-quality content for game development. VARCO is trained on data focused on developing in-game content such as text and scenarios, and it brings high efficiency to every aspect of content development, including planning, operation, and art. In particular, 'VARCO Studio', which is based on VARCO LLM, is an AI platform service specialized for game production. It supports the entire game development process and helps create high-quality content through its main AI functions: 'VARCO Art', a web-based image creation tool specialized for NCSOFT's intellectual property (IP); 'VARCO Text', a text creation and management tool; and 'VARCO Avatar', for creating AI NPCs and chatbots.
Although VARCO LLM is a language model specialized for game content creation, it is also being applied in other industries, such as automotive platforms and education, through business agreements to develop domain-specific models. NCSOFT expects VARCO to offer creativity that sets it apart from existing general-purpose generative AI, and it has continued to develop and release tuned language models with improved performance, such as the next-generation 'VARCO LLM 2.0' and 'Llama-VARCO LLM'. Going forward, the company plans to spin off 'NC Research', the AI research and development organization that developed VARCO, as a subsidiary in order to establish a specialized AI company and advance related technologies.