StructAI
An AI-powered import tool that turns messy CSV / XLSX / JSON files into a proper Postgres schema, with a human in the loop for the calls the model shouldn't make alone.
The story every analyst knows: someone hands you a spreadsheet, the columns are
named Cust_Email_Final_v2, the same customer appears with three slightly
different addresses, and you spend an afternoon writing one-off Python before
you can even start the work you were actually asked to do.
StructAI is the tool I wanted that afternoon. You drop a document on a project, an agent profiles it, proposes a Postgres schema (extracted entities, foreign keys, primary keys, the boring choices made for you), and once you accept the schema it generates and runs the import script inside a single transaction. Every run takes a database snapshot first so you can undo any import that landed badly.
What it does well
- Schema-first review. The agent shows you the DDL before it writes a line of import code. You accept, or you reply in natural language (“split the address into its own table; make customer.email the PK”) and it revises. This is the most expensive decision in an import, and it’s the one the model should not silently make.
- Self-correcting execute loop. When the generated script blows up (encoding, mixed date formats, an embedded comma in a quoted field), the agent reads the stderr tail, diagnoses the root cause, and rewrites. Five attempts max; past that, it asks for help instead of thrashing.
- Cheap undo. Every successful import gets a template-DB snapshot. One click rolls the project back to before the import landed. Snapshots are pruned by a retention sweeper.
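The self-correcting loop described above can be sketched roughly as follows. This is a minimal illustration, not StructAI's actual code: `run_script`, `stderr_tail`, `run_with_repair`, the `revise` callback (standing in for the model call), and the 20-line tail size are all hypothetical. Only the five-attempt cap comes from the description.

```python
import subprocess
import sys
from typing import Callable, Optional, Tuple

MAX_ATTEMPTS = 5        # the cap described above
STDERR_TAIL_LINES = 20  # assumption: how much of stderr the agent gets to see


def run_script(script: str) -> Tuple[int, str]:
    """Run a Python script in a subprocess; return (exit code, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", script], capture_output=True, text=True
    )
    return proc.returncode, proc.stderr


def stderr_tail(stderr: str, lines: int = STDERR_TAIL_LINES) -> str:
    """Keep only the last few lines, where the traceback's cause usually is."""
    return "\n".join(stderr.splitlines()[-lines:])


def run_with_repair(
    script: str,
    revise: Callable[[str, str], str],  # hypothetical: (script, error tail) -> new script
    run: Callable[[str], Tuple[int, str]] = run_script,
    max_attempts: int = MAX_ATTEMPTS,
) -> Optional[str]:
    """Return the first script that exits cleanly, or None after max_attempts."""
    for attempt in range(1, max_attempts + 1):
        code, stderr = run(script)
        if code == 0:
            return script
        if attempt < max_attempts:
            # Hand the error back to the agent and try its rewrite next.
            script = revise(script, stderr_tail(stderr))
    return None  # caller should ask the user for help instead of retrying
```

Returning `None` rather than raising keeps the "ask for help" path an ordinary branch in the caller, which is one reasonable way to make the give-up case explicit.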
What it’s made of
- FastAPI + arq on the backend: one job at a time, server-sent events to the UI so the agent’s thinking shows up in real time.
- React + Vite on the frontend, with an interactive ER diagram and a data browser that does server-side sort and per-column filters.
- Postgres as both the metadata store and the per-project data DB, where template-DB clones make snapshots near-free.
- Anthropic Claude under the hood, called through a small agent-loop wrapper that lets the model ask the user mid-stream when it has to make a judgment call.
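For readers unfamiliar with template-DB clones, here is a rough sketch of the SQL such a snapshot and rollback might issue. Postgres supports `CREATE DATABASE ... TEMPLATE ...`; how StructAI names its databases or sequences these statements is not stated, so `quote_ident`, `snapshot_sql`, `restore_sql`, and the example database names are assumptions for illustration only.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Quote a Postgres identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def snapshot_sql(project_db: str, snapshot_db: str) -> str:
    """Statement that clones the project DB into a new snapshot DB."""
    return (
        f"CREATE DATABASE {quote_ident(snapshot_db)} "
        f"TEMPLATE {quote_ident(project_db)}"
    )


def restore_sql(project_db: str, snapshot_db: str) -> List[str]:
    """Statements that replace the project DB with a clone of the snapshot."""
    return [
        f"DROP DATABASE IF EXISTS {quote_ident(project_db)}",
        f"CREATE DATABASE {quote_ident(project_db)} "
        f"TEMPLATE {quote_ident(snapshot_db)}",
    ]


# Hypothetical names; the real naming scheme is not documented here.
print(snapshot_sql("project_42", "project_42_snap_7"))
```

Two Postgres constraints shape any real implementation: `CREATE DATABASE` and `DROP DATABASE` cannot run inside a transaction block, and the template database must have no other active connections while it is being copied. These statements would therefore run from a separate maintenance connection in autocommit mode.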
What I learned
- The right place for the human is before the agent has done expensive work, not after. Reviewing a schema takes 30 seconds; reverting a bad import takes ten minutes of cleanup.
- Letting the agent ask for clarification is much better than telling it to guess “sensibly”. The model will happily invent a heuristic; what you wanted was for it to say “I don’t know, you tell me.”
- A small bounded retry loop (“try, read the error, fix, try again, give up at 5”) catches almost every real failure. Unbounded retries are how you get $40 of tokens spent on a CSV with a misspelled column.
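The "ask instead of guess" lesson can be made concrete with a small sketch of an agent loop in which asking the user is a first-class step. Everything here is hypothetical: `AskUser`, `Final`, `agent_loop`, and the `next_step` callback (a stand-in for the model call) are illustrative names, not StructAI's wrapper or the Anthropic API.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class AskUser:
    """The agent needs a judgment call from the human."""
    question: str


@dataclass
class Final:
    """The agent is done and returns its result."""
    answer: str


Step = Union[AskUser, Final]


def agent_loop(
    next_step: Callable[[List[str]], Step],  # hypothetical stand-in for the model
    ask: Callable[[str], str],               # how the question reaches the user
    max_turns: int = 10,                     # assumption: bound the conversation too
) -> str:
    """Drive the agent until it finishes, relaying its questions to the user."""
    transcript: List[str] = []
    for _ in range(max_turns):
        step = next_step(transcript)
        if isinstance(step, Final):
            return step.answer
        # Record both sides so the next model call sees the user's answer.
        transcript.append(f"Q: {step.question}")
        transcript.append(f"A: {ask(step.question)}")
    raise RuntimeError("agent did not finish within max_turns")
```

Making the question a return type, rather than something the model is merely told it may do in a prompt, gives the caller one obvious place to pause the job and wait for a reply.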
Try it at structai.adityaviki.com.