Document QA (RAG)

I built this as a focused implementation of the RAG pattern, earlier in the portfolio series before layering in more complexity. Users upload documents (multipart), the server stores them in AWS S3 via presigned URLs, and a BullMQ worker handles chunking, embedding, and indexing asynchronously.

At query time, the question is embedded and a cosine similarity search over pgvector retrieves the most relevant chunks. Claude uses those chunks as context to generate a grounded answer with citations back to the source document.

The monorepo splits into server, worker, web client, and a shared common package for chunking logic and types. This clean separation made it straightforward to scale the worker independently of the API.

GitHub Repository

Technologies Used

TypeScriptNext.jsExpressPostgreSQLpgvectorRedisBullMQClaude APIAWS S3