Introduction
Welcome to the Docu-analyzer API.
The Docu-analyzer API is an AI-based tool that extracts text, layout, tables, and key-value pairs from unstructured documents in formats such as HWP, DOCX, and PDF.
The extracted data can be used directly in workflows such as RAG (Retrieval-Augmented Generation) data preprocessing, RPA (Robotic Process Automation), and AI dataset construction.

Key Features
Support for Various Business Document Formats
Supports a wide range of document formats, including Hangul (HWP), MS Office, PDF, and images, so it is compatible with most document types held by enterprises.
Detailed Analysis of Hidden Document Structure
- Recognizes detailed document structure, including headings, paragraphs, headers, footers, page numbers, captions, and lists
- Recognizes visual elements such as tables and images
Highly Practical Output Formats
- Markdown output for building Large Language Model (LLM) applications
- XML output for building enterprise databases
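As a rough illustration of how Markdown output might feed a RAG preprocessing step, the sketch below splits a Markdown string into one chunk per heading section. It does not call the Docu-analyzer API: the function name `split_markdown_by_headings` and the sample input are hypothetical, and it assumes the Markdown uses ATX-style (`#`) headings. It also does not special-case `#` lines inside fenced code blocks.

```python
import re
from typing import Dict, List


def split_markdown_by_headings(markdown: str) -> List[Dict[str, str]]:
    """Split Markdown text into chunks, one per heading section.

    Hypothetical helper for illustration only; not part of the API.
    """
    chunks: List[Dict[str, str]] = []
    heading = ""  # text before the first heading gets an empty heading
    body: List[str] = []

    def flush() -> None:
        # Store the section collected so far, skipping fully empty ones.
        text = "\n".join(body).strip()
        if heading or text:
            chunks.append({"heading": heading, "text": text})

    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            flush()
            heading = match.group(2).strip()
            body = []
        else:
            body.append(line)
    flush()
    return chunks


if __name__ == "__main__":
    # Made-up sample standing in for Markdown returned by an analysis.
    sample = "# Report\nSummary text.\n\n## Results\n| a | b |\n|---|---|\n| 1 | 2 |\n"
    for chunk in split_markdown_by_headings(sample):
        print(chunk["heading"], "->", len(chunk["text"]), "characters")
```

Splitting on headings keeps each table together with the section that introduces it, which tends to suit retrieval better than fixed-size character windows. Real pipelines may also need a maximum chunk size.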
Use Cases
- Understanding various types of documents and building LLMs based on owned documents
- Improving natural language processing capabilities by understanding documents that include tables and images, and developing conversational AI Q&A systems
- Extracting only the necessary information to build business automation systems
- Structuring unstructured documents and building large-scale digital document archives
Next Steps
- Quickstart: Quickly request your first document analysis and check the results
- Authentication: Learn how to obtain API keys and authenticate
- Supported Formats: Check the complete list of analyzable file formats