
Inbox AI

"Chat with all your emails"

Alpha
View on GitHub

Overview

Finding stuff buried in years of email is painful. Native email search doesn't really help when you need a flight confirmation from six months ago, want to tally up subscription costs, or need that hotel reservation your friend forwarded. Inbox AI lets you upload a Gmail export and ask questions in plain English, like chatting with someone who remembers every message you've ever sent or received.

It parses and indexes your entire email archive, then uses a RAG pipeline to find the most relevant messages for any query. Ask "How much did I spend on flights in 2024?" and it'll dig through thousands of emails, pull out booking confirmations, extract dollar amounts, and give you a clear answer. It goes beyond keyword matching: it understands context, recognizes entities like merchants, dates, and amounts, and can categorize your emails into groups such as travel, finance, shopping, and subscriptions.

It also doubles as an exploration of multi-LLM architectures. You can switch between Claude and GPT-4, making it easy to compare response quality, latency, and cost. Conversations are saved so you can revisit past queries and refine them over time.

Tech Stack

Python

Runs the core NLP pipeline: email parsing, text preprocessing, entity extraction, and embedding generation.

React

The chat interface where you submit queries, view conversations, browse categorized emails, and explore extracted data.

Claude / GPT-4

The language understanding layer. Interprets queries, reasons over retrieved email context, and generates answers with citations back to source messages.

NLP

Entity extraction, named entity recognition, and topic classification. Pulls dates, amounts, merchant names, and booking details from unstructured email text.

Architecture

Four-stage pipeline. First, the parsing layer ingests Gmail mbox exports, decodes MIME structures, strips HTML, and normalizes headers into clean document objects. Second, each email is converted into a vector embedding and stored in a local vector index for fast similarity search. Third, when you submit a query, the RAG retrieval layer encodes the question, runs approximate nearest-neighbor search, and assembles a context window of the most relevant email snippets. Finally, the assembled context plus your question is sent to the selected LLM (Claude or GPT-4), which generates an answer grounded in the retrieved evidence.
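The four stages can be sketched in miniature. This is a toy stand-in, not the actual implementation: hashed bag-of-words in place of real embeddings, brute-force cosine similarity in place of ANN search, and a prompt string in place of an actual LLM call. All function and variable names are illustrative.

```python
import math
import zlib
from collections import Counter

def parse(raw_emails):
    # Stage 1 stand-in: the real parser decodes MIME and strips HTML;
    # here each string is assumed to be clean text already.
    return [{"id": i, "text": t} for i, t in enumerate(raw_emails)]

def embed(text, dims=256):
    # Stage 2 stand-in: hashed bag-of-words instead of a learned embedding.
    vec = [0.0] * dims
    for tok, n in Counter(text.lower().split()).items():
        vec[zlib.crc32(tok.encode()) % dims] += n
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def retrieve(query, index, k=2):
    # Stage 3: brute-force cosine similarity (the real system uses ANN search).
    q = embed(query)
    ranked = sorted(index, key=lambda d: -sum(a * b for a, b in zip(q, d["vec"])))
    return ranked[:k]

def build_prompt(query, docs):
    # Stage 4 stand-in: the real system sends this to Claude or GPT-4.
    context = "\n".join(d["text"] for d in docs)
    return f"Answer from these emails only:\n{context}\n\nQuestion: {query}"

emails = [
    "Flight booking confirmed: $412.30 to Lisbon",
    "Your gym membership renews next month",
    "Hotel reservation in Porto, 3 nights",
]
index = [dict(d, vec=embed(d["text"])) for d in parse(emails)]
top = retrieve("how much did my flight to Lisbon cost", index)
print(build_prompt("How much did I spend on flights?", top))
```

Even with toy embeddings, the flight confirmation ranks first because it shares the most query terms, which is the same grounding idea the real pipeline relies on at scale.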

Challenges & Solutions

Parsing diverse email formats

Email is one of the oldest and most inconsistent digital formats still in use. A single Gmail export can contain plain text, nested MIME multipart structures, base64-encoded attachments, HTML-only emails with inline CSS, forwarded chains with mangled headers, and auto-generated messages with wildly varying templates. Getting clean text out of all of these required a lot of edge-case handling: detecting character encodings, stripping boilerplate signatures, and reconstructing forwarded threads into coherent timelines.
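A minimal sketch of the MIME-decoding step using Python's standard `email` package, which handles charsets and transfer encodings. The sample message and helper names are illustrative, and real handling also covers signature stripping and forwarded-chain reconstruction.

```python
from email import message_from_string, policy
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    # Collects the text nodes of an HTML document, discarding tags.
    def __init__(self):
        super().__init__()
        self.chunks = []
    def handle_data(self, data):
        self.chunks.append(data)

def strip_html(html):
    p = _TextExtractor()
    p.feed(html)
    # Collapse runs of whitespace left behind by removed tags.
    return " ".join(" ".join(p.chunks).split())

def extract_text(raw):
    msg = message_from_string(raw, policy=policy.default)
    # Prefer a plain-text part; fall back to stripped HTML.
    body = msg.get_body(preferencelist=("plain", "html"))
    if body is None:
        return ""
    content = body.get_content()  # decodes charset + transfer encoding
    if body.get_content_type() == "text/html":
        content = strip_html(content)
    return content.strip()

raw = (
    "From: hotel@example.com\n"
    "Subject: Reservation\n"
    "Content-Type: text/html; charset=utf-8\n"
    "\n"
    "<html><body><p>Hotel confirmed: <b>3 nights</b></p></body></html>\n"
)
print(extract_text(raw))  # → Hotel confirmed: 3 nights
```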

Accurate entity extraction from unstructured text

Pulling structured data like dollar amounts, dates, flight numbers, and merchant names from free-form email text is harder than it sounds. The same information can appear in dozens of formats. A price might be "$1,234.56", "USD 1234.56", or "1,234 dollars", depending on the sender. Dates range from ISO formats to "next Thursday." Reliable extraction required combining regex pattern matching for known templates with NLP-based entity recognition for everything else, plus a validation layer to catch misparses.
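The regex-plus-validation idea might look roughly like this. The patterns below cover only the three price formats mentioned above, and the validation bounds are placeholder assumptions, not the project's actual rules.

```python
import re

# One pattern per known money format; real templates would add many more.
MONEY_PATTERNS = [
    re.compile(r"\$\s?(\d{1,3}(?:,\d{3})*(?:\.\d{2})?)"),          # $1,234.56
    re.compile(r"USD\s?(\d+(?:\.\d{2})?)"),                        # USD 1234.56
    re.compile(r"(\d{1,3}(?:,\d{3})*)\s+dollars", re.IGNORECASE),  # 1,234 dollars
]

def extract_amounts(text):
    found = []
    for pat in MONEY_PATTERNS:
        for m in pat.finditer(text):
            value = float(m.group(1).replace(",", ""))
            # Validation layer: drop obvious misparses (placeholder bounds).
            if 0 < value < 1_000_000:
                found.append(value)
    return found

print(extract_amounts("Total: $1,234.56 charged; refund of USD 20.00 pending"))
# → [1234.56, 20.0]
```

Anything these templates miss would fall through to the NLP-based entity recognizer described above.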

Balancing cost across LLM providers

Claude and GPT-4 have different pricing, context window sizes, and token-counting behavior, so the same query can cost a lot more on one provider depending on context length. The system uses adaptive context windowing that tailors the number of retrieved email snippets to each provider's token limits and pricing. This keeps answer quality high without surprise cost spikes, while staying transparent about which model is being used and the trade-offs involved.
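A toy sketch of adaptive context windowing. The token budgets and the four-characters-per-token heuristic below are placeholders for illustration, not real provider figures.

```python
# Placeholder per-provider budgets (tokens available for retrieved context).
PROVIDERS = {
    "claude": {"context_budget": 8000},
    "gpt-4":  {"context_budget": 6000},
}

def rough_tokens(text):
    # Crude heuristic: ~4 characters per token.
    return max(1, len(text) // 4)

def fit_snippets(snippets, provider, reserve=500):
    # Add retrieved snippets (already ranked by relevance) until the budget
    # is spent, reserving room for the question and the model's answer.
    budget = PROVIDERS[provider]["context_budget"] - reserve
    chosen, used = [], 0
    for s in snippets:
        cost = rough_tokens(s)
        if used + cost > budget:
            break
        chosen.append(s)
        used += cost
    return chosen

snippets = ["a" * 12000, "b" * 12000, "c" * 12000]  # ~3000 tokens each
print(len(fit_snippets(snippets, "claude")))  # → 2  (7500-token budget)
print(len(fit_snippets(snippets, "gpt-4")))   # → 1  (5500-token budget)
```

The same ranked retrieval results yield a larger context window on the provider with more headroom, which is the mechanism that keeps quality up without cost spikes.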