
AI tech company Crowdworks announced on the 22nd that it has filed a patent application for 'Document Complexity Analysis-Based Document Automation Processing Technology', a core technology behind its AI data preprocessing solution 'Alpy Knowledge Compiler'.
The technology quantitatively analyzes the structural complexity of a document and decides whether automation should be applied when preprocessing unstructured data, a step essential to developing AI agents based on RAG (Retrieval-Augmented Generation). By deciding in advance, according to document type, whether experts need to be involved, it can prevent degraded preprocessing quality and wasted resources while improving work efficiency and optimizing costs.
According to Crowdworks, the technology classifies documents into four levels, Class 1 to Class 4, based on their complexity. It sets a standard under which documents with simple structures are preprocessed automatically, while documents with complex structures are parsed by experts. The company explains that the classification can also be used to predict the likelihood of data preprocessing errors and to manage staffing and schedules.
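The routing idea described above can be illustrated with a minimal Python sketch. It is not Crowdworks' patented method: the article does not disclose which structural signals are measured, how they are weighted, or where the class boundaries lie. The feature names, weights, thresholds, and the choice to send Classes 1 and 2 to automation and Classes 3 and 4 to experts are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DocumentFeatures:
    """Hypothetical structural signals; the real ones are not disclosed."""
    pages: int
    tables: int
    nested_tables: int
    images: int
    charts: int


def complexity_score(f: DocumentFeatures) -> float:
    """Weighted count of structural elements per page (assumed weights)."""
    pages = max(f.pages, 1)  # guard against division by zero
    weighted = (
        1.0 * f.tables
        + 3.0 * f.nested_tables  # nested tables assumed hardest to parse
        + 0.5 * f.images
        + 1.5 * f.charts
    )
    return weighted / pages


def complexity_class(score: float) -> int:
    """Map a score to Class 1..4 using assumed thresholds."""
    if score < 0.5:
        return 1
    if score < 1.5:
        return 2
    if score < 3.0:
        return 3
    return 4


def route(f: DocumentFeatures) -> str:
    """Send simple documents to automation, complex ones to experts."""
    cls = complexity_class(complexity_score(f))
    # Assumption: Classes 1-2 are automated, Classes 3-4 go to experts.
    return "automated" if cls <= 2 else "expert"


if __name__ == "__main__":
    simple = DocumentFeatures(pages=10, tables=1, nested_tables=0, images=2, charts=0)
    dense = DocumentFeatures(pages=2, tables=3, nested_tables=2, images=0, charts=1)
    print(route(simple), route(dense))
```

Because the class is computed before any preprocessing starts, the same value could also feed staffing and schedule estimates, for example by counting how many documents in a batch fall into the expert-routed classes.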
The technology is currently in use in Crowdworks' self-developed solution, 'Alpy Knowledge Compiler'. The solution uses OCR (optical character recognition), parsing, and chunking functions to convert various types of documents into a form that AI can learn from, and supports multiple document formats including Hangul (HWP/HWPX), PDF, Word, and Excel. It recognizes nested structures in tables and visual elements such as charts and images to generate metadata, and the company plans to add advanced processing functions that use LLMs (large language models) and VLMs (vision language models).
With corporate demand for managing unstructured data as an asset rising recently, Crowdworks plans to use its solutions to actively address the preprocessing needs of various industries at home and abroad and to strengthen its competitiveness in AI-based work automation.
Kim Woo-seung, CEO of Crowdworks, said, “This patent application is the first case of raising the precision and efficiency of data preprocessing through technology based on document complexity analysis, and it is an opportunity to prove our differentiation as a company specializing in AI data preprocessing.” He added, “We are currently receiving inquiries from various companies about the Alpy Knowledge Compiler, and we expect its application to expand in the enterprise AI market.”