Introduction
============

Welcome to **Dataset Generator**, a flexible, extensible pipeline for creating realistic synthetic ticket datasets using AI assistants. Whether you need sample support tickets for testing, machine-learning training, or analytics, Dataset Generator lets you configure every step, from subject generation to multilingual translation and first-response simulation.

Key Concepts
------------

- **Graph-based pipeline**
  Each ticket is generated by a directed graph of AI “nodes” (assistants). The output of one node feeds into the next, enabling complex workflows such as:

  1. Topic generation
  2. Email text creation
  3. Tag assignment
  4. Paraphrasing & translation
  5. First-response drafting

- **Modular Assistants**
  Every step is powered by a configurable Assistant class. You can swap models and prompts, or even decorate runs through injection for cost tracking.

- **Rich Ticket Attributes**
  Generate tickets with full metadata:

  - **Subject** & **Body**
  - **Type** (Incident, Request, etc.)
  - **Queue** (Technical Support, Billing, etc.)
  - **Priority** (low, medium, high)
  - **Language** (EN, DE, FR, ES, PT)
  - **Tags** (4–8 relevant topics)
  - **First-answer** simulation

- **Cost Analysis**
  Automatically measure token usage and per-token costs across runs, aggregate them by Assistant, and output summaries in your chosen currency.

Getting Started
---------------

1. **Configure your project** in `config/config.py`:

   - Number of tickets
   - Translation nodes per ticket
   - Models, prompts, and providers

2. **Build the docs**

   ```bash
   cd docs
   make html
   ```

`Dataset Generator on GitHub `_
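
The graph-based pipeline described under Key Concepts can be illustrated with a minimal sketch. This is not the project's actual API: the `Node` class, the `run_pipeline` function, and the node names are hypothetical, and plain stub functions stand in for real AI assistants. It only shows the idea of running nodes in dependency order and feeding each node the outputs of its parents.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for one assistant in the graph."""
    name: str
    # Maps the outputs of the parent nodes to this node's output.
    run: Callable[[Dict[str, str]], str]
    parents: List[str] = field(default_factory=list)


def run_pipeline(nodes: List[Node]) -> Dict[str, str]:
    """Execute nodes in dependency order, passing each its parents' outputs."""
    by_name = {n.name: n for n in nodes}
    # TopologicalSorter expects a mapping of node -> predecessors.
    graph = {n.name: set(n.parents) for n in nodes}
    outputs: Dict[str, str] = {}
    for name in TopologicalSorter(graph).static_order():
        node = by_name[name]
        outputs[name] = node.run({p: outputs[p] for p in node.parents})
    return outputs


# Stub functions replace real model calls (assumption for illustration only).
nodes = [
    Node("topic", lambda _: "Billing error"),
    Node("email", lambda up: f"Hello, I have a problem: {up['topic']}.", ["topic"]),
    Node("tags", lambda up: ",".join(up["topic"].lower().split()), ["topic"]),
    Node("first_answer", lambda up: f"Re: {up['email']} We are looking into it.", ["email"]),
]

ticket = run_pipeline(nodes)
```

In this sketch `ticket` is a dictionary keyed by node name, so adding a translation or paraphrasing step would only require another `Node` whose `parents` list names the node it reads from.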