XiaoZhi AI Chatbot Documentation Center | XiaoZhi.Dev
📖 XiaoZhi AI Documentation Center
XiaoZhi AI is an open-source intelligent voice robot built on the ESP32-S3. It integrates wake word detection, AI conversation, device control, and multi-protocol communication. This documentation center provides complete technical guides, from hardware assembly to AI integration.
Project Features:
- 🎙️ Wake Word Detection: Supports 26 wake words, including “Hello XiaoZhi”, with <200ms response latency
- 🧠 AI Integration: Supports multiple LLMs including DeepSeek, GPT, Ernie Bot
- 🏠 IoT Control: Integrates the MQTT and MCP protocols for smart home control
- 🔧 Open Source Hardware: Based on ESP32-S3, complete open-source solution
🚀 Quick Start
Beginner’s Guide
- Hardware Setup: ESP32-S3 board assembly, wiring solutions and component lists
- Firmware Download & Flashing: Pre-compiled firmware download, flashing tools and configuration guide
- Network Configuration: Wi-Fi setup, device connection and network troubleshooting
- Feature Usage: Voice interaction, AI conversation and device control features
📚 Documentation Navigation
🛠️ User Guide
Complete tutorials for users
- Hardware Setup Guide - ESP32-S3 hardware assembly and wiring solutions
- Firmware Download & Flashing - Pre-compiled firmware acquisition and installation
- Network Configuration - Wi-Fi setup and network troubleshooting
- Feature Usage Tutorial - Voice interaction and smart control features
- ESP32 Budget Version - Low-cost ESP32 hardware solution
- Frequently Asked Questions - Common issues and solutions during usage
💻 Development Documentation
In-depth technical guides for developers
- ESP-IDF Environment Setup - Development environment configuration and compilation guide
- WebSocket Communication Protocol - Device-server communication protocol
- MCP Protocol Development - Model Context Protocol IoT control
- MQTT+UDP Hybrid Protocol - Control and audio hybrid communication
- Emoji Emotion Display - LLM emotion state expression protocol
🔧 ESP32 Development Guide
Complete ESP32-S3 platform development tutorial
- Technical Specifications - ESP32-S3 hardware architecture and performance parameters
- Programming Development Guide - From GPIO control to complex system development
- Advanced Feature Development - 4G communication, local AI inference, multimodal interaction
- Troubleshooting Guide - Common issue diagnosis and solutions
🤖 AI Feature Capabilities
AI technology integration and capability overview
- 🎯 Voice Processing: Local wake word + cloud recognition hybrid solution
- 🧠 LLM Integration: Support for DeepSeek, GPT, Qwen and other models
- ⚡ Edge Inference: TensorFlow Lite lightweight model integration
- 🎵 Voice Synthesis: Multi-engine TTS and emotional voice output
🛠️ Technical Features
Hardware Platform
- Main Controller: ESP32-S3 dual-core 240MHz, 16MB Flash + 8MB PSRAM
- Audio Processing: INMP441 digital microphone + MAX98357A digital amplifier
- Display Output: SSD1306 OLED display + RGB status lights
- Network Communication: Wi-Fi 2.4GHz + 4G Cat.1 communication (optional)
Software Architecture
- Development Framework: ESP-IDF v5.3.2 + Arduino compatible
- AI Engine: Espressif Wake Word Engine + cloud LLMs
- Communication Protocols: WebSocket + MQTT + MCP Protocol
- Audio Encoding: 16kHz PCM capture + Opus-compressed transmission
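As a rough illustration of the audio format listed above, the following Python sketch works out the raw PCM data rate and the per-frame buffer size. Only the 16kHz sample rate comes from this page. The 16-bit mono sample format, the 60ms frame duration, and the 16 kbps Opus target bitrate are assumptions chosen for illustration, not documented XiaoZhi settings.

```python
# Back-of-the-envelope audio sizing for a 16 kHz PCM stream.
SAMPLE_RATE_HZ = 16_000    # from the documented 16kHz PCM format
BITS_PER_SAMPLE = 16       # assumption: 16-bit samples
CHANNELS = 1               # assumption: mono microphone input
FRAME_MS = 60              # assumption: one of the frame durations Opus supports
OPUS_BITRATE_BPS = 16_000  # assumption: illustrative Opus target bitrate


def pcm_bitrate_bps(sample_rate=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE,
                    channels=CHANNELS):
    """Raw (uncompressed) PCM bitrate in bits per second."""
    return sample_rate * bits * channels


def samples_per_frame(sample_rate=SAMPLE_RATE_HZ, frame_ms=FRAME_MS):
    """Number of samples per channel in one audio frame."""
    return sample_rate * frame_ms // 1000


def frame_bytes(sample_rate=SAMPLE_RATE_HZ, frame_ms=FRAME_MS,
                bits=BITS_PER_SAMPLE, channels=CHANNELS):
    """Size in bytes of one raw PCM frame before encoding."""
    return samples_per_frame(sample_rate, frame_ms) * (bits // 8) * channels


def compression_ratio(opus_bitrate_bps=OPUS_BITRATE_BPS):
    """Ratio of raw PCM bitrate to the assumed Opus bitrate."""
    return pcm_bitrate_bps() / opus_bitrate_bps


if __name__ == "__main__":
    print("raw PCM bitrate (bps):", pcm_bitrate_bps())
    print("samples per frame:", samples_per_frame())
    print("bytes per raw frame:", frame_bytes())
    print("compression ratio:", compression_ratio())
```

Under these assumptions, raw PCM runs at 256 kbps, which is why compressing with Opus before transmission matters on a Wi-Fi or 4G Cat.1 link.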
📊 Performance Metrics
| Feature Module | Performance Metrics | Notes |
|---|---|---|
| Wake Word Detection | <200ms latency, >99% accuracy | Local offline processing |
| Speech Recognition | <1s latency, >95% accuracy | Accuracy figure is for Chinese recognition |
| AI Conversation | <3s response, supports 5+ LLMs | DeepSeek recommended |
| Device Control | <100ms command response | Local + cloud hybrid |
| Power Management | 5mA standby, 150mA active | Smart power optimization |
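The standby and active currents in the table can be combined into a back-of-the-envelope battery estimate. In the sketch below, the 2000 mAh capacity and the 5% active duty cycle are hypothetical values picked for illustration. Only the 5mA and 150mA figures come from the table, and real-world draw will vary with Wi-Fi activity and attached peripherals.

```python
# Rough battery-runtime estimate from the documented current figures.
STANDBY_MA = 5.0   # from the table: standby current
ACTIVE_MA = 150.0  # from the table: active current


def average_current_ma(active_fraction):
    """Time-weighted average current for a given active duty cycle (0..1)."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return ACTIVE_MA * active_fraction + STANDBY_MA * (1.0 - active_fraction)


def runtime_hours(capacity_mah, active_fraction):
    """Idealized runtime: capacity divided by average current.

    Ignores regulator losses, battery aging, and self-discharge.
    """
    if capacity_mah <= 0:
        raise ValueError("capacity_mah must be positive")
    return capacity_mah / average_current_ma(active_fraction)


if __name__ == "__main__":
    capacity = 2000.0  # hypothetical battery capacity in mAh
    duty = 0.05        # hypothetical: device is active 5% of the time
    print("average current (mA):", round(average_current_ma(duty), 2))
    print("estimated runtime (h):", round(runtime_hours(capacity, duty), 1))
```

With these example inputs the average draw is about 12.25 mA, giving an idealized runtime of roughly 163 hours. Treat the result as an upper bound rather than a measured figure.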
🗂️ Documentation Index
| Documentation Category | Document Name | Main Content | Update Time |
|---|---|---|---|
| User Guide | Hardware Setup Guide | ESP32-S3 assembly, wiring diagrams, component lists | 2025-03-19 |
| User Guide | Firmware Download | Pre-compiled firmware, flashing tools, configuration guide | 2025-03-18 |
| User Guide | Network Configuration | Wi-Fi setup, troubleshooting, advanced settings | 2025-03-18 |
| User Guide | Feature Tutorial | Voice interaction, device control, personalization settings | 2025-03-18 |
| User Guide | ESP32 Budget Version | Low-cost build based on an ESP32 development board | 2025-03-18 |
| User Guide | FAQ | Usage issues, troubleshooting, technical support | 2025-03-18 |
| Development Documentation | ESP-IDF Environment Setup | Development environment configuration, compilation toolchain installation | 2025-03-06 |
| Development Documentation | WebSocket Protocol | Communication protocol specifications, message format definitions | 2025-03-06 |
| Development Documentation | MCP Protocol Specification | Model Context Protocol interaction flow | 2025-03-20 |
| Development Documentation | MCP Usage Guide | Specific applications of IoT device control | 2025-03-20 |
| Development Documentation | MQTT+UDP Protocol | Control channel and audio channel hybrid communication | 2025-03-20 |
| Development Documentation | Emoji Emotion Display | LLM emotion state expression protocol | 2025-03-06 |
| ESP32 Development | Technical Specifications | ESP32-S3 hardware architecture, performance parameters | 2025-09-25 |
| ESP32 Development | Programming Guide | GPIO control to complex system development | 2025-09-25 |
| ESP32 Development | Advanced Features | 4G communication, AI inference, multimodal interaction | 2025-09-25 |
| ESP32 Development | Troubleshooting | Issue diagnosis, solutions, debugging techniques | 2025-09-25 |
| AI Features | AI Feature Integration | Voice processing, LLM integration, edge inference | 2025-09-25 |
🔗 Related Resources
Community Resources
- 📖 Online Documentation: https://xiaozhi.dev/docs
- 💬 Technical Blog: https://xiaozhi.dev/blog
- 🚀 Project Updates: Follow GitHub repository updates
Technical Support:
- 📧 Contact Email: [email protected]