Introduction
Welcome to AI Data Foundry.
AI Data Foundry is a powerful AI-based tool that extracts text, layout, tables, and key-value pairs from unstructured document formats including HWP, DOCX, and PDF.
The extracted data can be immediately utilized in various workflows such as RAG (Retrieval-Augmented Generation) data preprocessing, RPA (Robotic Process Automation), and AI dataset construction.
Key Features
Support for Various Business Document Formats
Supports a wide range of document formats, including Hangul (HWP), MS Office, PDF, and images, covering most document types held by enterprises.
Thorough Analysis of Hidden Document Structure
- Recognition of detailed document structure information such as headings, paragraphs, headers, footers, page numbers, captions, and lists
- Recognition of visual information such as tables and images
Highly Practical Output Formats
- Markdown output support for Large Language Model (LLM) construction
- XML output support for enterprise database construction
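As an illustration of how Markdown output could feed RAG preprocessing, the sketch below splits a Markdown string into heading-scoped chunks. It does not call any AI Data Foundry API: the function name `split_markdown_by_heading` and the sample text are hypothetical, and it assumes only that the Markdown output uses `#`-style headings.

```python
import re

# Hypothetical helper, not part of AI Data Foundry: splits Markdown text
# into chunks, one per "#"-style heading, for use as RAG passages.
HEADING_PATTERN = re.compile(r"^(#{1,6})\s+(.*)$")


def split_markdown_by_heading(markdown_text):
    """Return a list of {"heading", "text"} dicts, one per heading section."""
    chunks = []
    heading = ""      # text before the first heading gets an empty heading
    body_lines = []

    def flush():
        body = "\n".join(body_lines).strip()
        if heading or body:
            chunks.append({"heading": heading, "text": body})

    for line in markdown_text.splitlines():
        match = HEADING_PATTERN.match(line)
        if match:
            flush()                       # close the previous section
            heading = match.group(2).strip()
            body_lines.clear()
        else:
            body_lines.append(line)
    flush()                               # close the final section
    return chunks


# Sample input standing in for extracted Markdown (assumed shape).
sample_markdown = (
    "# Contract\n"
    "Parties and dates.\n"
    "\n"
    "## Fees\n"
    "| Item | Amount |\n"
    "| --- | --- |\n"
    "| Setup | 100 |\n"
)

chunks = split_markdown_by_heading(sample_markdown)
for chunk in chunks:
    print(chunk["heading"], "->", len(chunk["text"]), "characters")
```

Keeping each table in the same chunk as its heading preserves the context a retriever needs. This simple version does not skip `#` lines inside fenced code blocks, so a production splitter would need to handle that case.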
Use Cases
- Understanding various types of documents and building LLMs based on an organization's own documents
- Improving natural language processing capabilities by understanding documents that include tables and images, and developing conversational AI Q&A systems
- Extracting only the necessary information to build business automation systems
- Structuring unstructured documents and building large-scale digital document archives
Next Steps
See the basic usage flow in Getting Started.
- Quickstart: Run your first workflow in 5 minutes with the UI
- Supported Formats: Check the complete list of analyzable file formats