🚀 1. HOW IT WORKS
This workflow automatically extracts structured data from invoices sent via Telegram (PDF or image) and saves it to a spreadsheet (Microsoft Excel or Google Sheets).
- A user sends an invoice (PDF or image) to a Telegram bot
- The workflow detects the file type (PDF or image)
- For PDFs:
  - Extracts text directly from the file
  - Falls back to OCR if no text can be extracted (for example, a scanned PDF)
- For images:
  - Runs OCR to extract the text
- The extracted text is cleaned and processed
- AI (Google Gemini) converts the raw text into structured JSON data
- The data is validated and formatted
- Valid data is saved to a spreadsheet (Microsoft Excel or Google Sheets)
- A confirmation message is sent back via Telegram
This eliminates manual data entry and speeds up invoice processing.
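The routing and clean-up steps above can be sketched in Python as follows. This is a minimal illustration, not the workflow's actual node logic: the helper names (`detect_file_type`, `needs_ocr`, `clean_text`) and the 20-character threshold are assumptions. The `photo` and `document.mime_type` fields follow the Telegram Bot API message format.

```python
import re

PDF_MIME = "application/pdf"
IMAGE_MIME_PREFIX = "image/"


def detect_file_type(message: dict) -> str:
    """Return 'pdf', 'image', or 'unsupported' for a Telegram message dict."""
    # Compressed photos arrive as a `photo` array of sizes.
    if message.get("photo"):
        return "image"
    # Files sent "as document" carry a MIME type instead.
    document = message.get("document") or {}
    mime = document.get("mime_type") or ""
    if mime == PDF_MIME:
        return "pdf"
    if mime.startswith(IMAGE_MIME_PREFIX):
        return "image"
    return "unsupported"


def needs_ocr(pdf_text: str, min_chars: int = 20) -> bool:
    """Heuristic (assumed): a scanned PDF yields little or no extractable text."""
    return len(pdf_text.strip()) < min_chars


def clean_text(raw: str) -> str:
    """Collapse runs of spaces/tabs and drop blank lines before the AI step."""
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)
```

In n8n the same branching would typically be done with an IF or Switch node on the incoming message. The sketch only makes the decision rules explicit.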
⚙️ 2. SETUP INSTRUCTIONS
Prerequisites:
- Telegram Bot
  - Create a bot using BotFather
  - Copy the Bot Token
- Google Gemini API Key
  - Get an API key from Google AI Studio
- Excel / Google Sheets
  - Prepare a sheet with columns: invoice_number, date, vendor, total, tax, items
Setup Steps:
- Configure Telegram Trigger Node
- Configure File Download
  - Ensure the file's binary data is passed on to the following nodes
- Configure OCR
  - Use a Tesseract OCR node or an external OCR API
- Configure Google Gemini Node
  - Add the API key
  - Use the provided prompt for structured extraction
- Configure Excel / Google Sheets Node
  - Connect your account
  - Map each extracted field to its sheet column (note that total_amount in the output maps to the total column)
- Test Workflow
  - Send a sample invoice via Telegram
- Activate Workflow
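When testing the Gemini step, it can help to parse the model's reply defensively. The sketch below assumes the reply arrives as a text string that may wrap the JSON in extra prose or Markdown code fences; this is a common model behaviour, not something this workflow guarantees. `parse_model_json` is a hypothetical helper name.

```python
import json


def parse_model_json(reply: str) -> dict:
    """Extract the first-to-last brace span from a model reply and parse it."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model reply")
    # json.loads raises json.JSONDecodeError (a ValueError subclass) on bad JSON.
    data = json.loads(reply[start:end + 1])
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data
```

A reply that fails to parse is a reasonable signal to send the user an error message instead of writing a row.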
🛠 Requirements
- n8n (self-hosted or cloud)
- Telegram Bot Token
- Google Gemini API Key
- Excel 365 or Google Sheets
📂 Output Example
{
  "invoice_number": "INV-001",
  "date": "2025-01-01",
  "vendor": "ABC Company",
  "total_amount": "100000",
  "tax": "10000",
  "items": [
    {
      "name": "Service A",
      "quantity": "1",
      "price": "100000"
    }
  ]
}
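The validation and field-mapping step can be sketched as below, turning the JSON output into one spreadsheet row with the columns listed in the prerequisites. This is an illustrative sketch under stated assumptions: the set of required fields, the ISO date check, defaulting a missing tax to 0, and storing items as a JSON string in a single cell are all choices made for this example, and `to_row` is a hypothetical helper.

```python
import json
from datetime import date

# Sheet columns from the prerequisites section.
COLUMNS = ["invoice_number", "date", "vendor", "total", "tax", "items"]
# Assumed minimum set of fields for a row to count as valid.
REQUIRED = ("invoice_number", "date", "vendor", "total_amount")

SAMPLE = {
    "invoice_number": "INV-001",
    "date": "2025-01-01",
    "vendor": "ABC Company",
    "total_amount": "100000",
    "tax": "10000",
    "items": [{"name": "Service A", "quantity": "1", "price": "100000"}],
}


def to_row(invoice: dict) -> dict:
    """Validate the extracted invoice and map it onto the sheet columns."""
    missing = [key for key in REQUIRED if not invoice.get(key)]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    # Raises ValueError if the date is not in YYYY-MM-DD form.
    date.fromisoformat(invoice["date"])
    return {
        "invoice_number": invoice["invoice_number"],
        "date": invoice["date"],
        "vendor": invoice["vendor"],
        # The JSON key is total_amount; the sheet column is total.
        "total": float(invoice["total_amount"]),
        "tax": float(invoice.get("tax") or 0),
        # Line items are serialised so they fit in one cell.
        "items": json.dumps(invoice.get("items", [])),
    }


row = to_row(SAMPLE)
```

Converting the amounts to numbers lets the spreadsheet sum or filter them. If the amounts should stay as text, the `float` calls can simply be dropped.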
🚀 Use Cases
- Invoice automation
- Accounts payable automation
- Financial data entry automation
- Document digitization
💡 Notes
- Works best with clear and high-quality images
- OCR accuracy depends on image quality
- Using AI (Google Gemini) to structure the text improves extraction accuracy over raw OCR output alone