Prepit
2024
An AI study partner for revising fundamentals, built end to end.
What
Prepit is an AI study partner built on top of my own study notes on system design, DSA, and AI basics. The idea was simple: instead of navigating Markdown docs manually, just ask a question. It answers, cites its sources, and suggests follow-up questions. Users can rate each response. Because the LLM API is paid, the context window is capped at a configurable size; once it fills up, users clear the chat to continue. Access is invite-only.
Problem
I was studying system design, DSA, and AI and logging everything in Markdown files. The notes were useful but hard to navigate quickly. I wanted to build a chat interface on top of them without a framework like LangChain, so I could actually understand how RAG works at its core.
Approach
- Started with a PRD: listed what I wanted to build and decided the knowledge base would be my own study notes. Then wrote a technical design doc covering models, DAOs, folder structure, coding style, and API contracts. Created a detailed todo list, revised it several times, then handed it to a team of four Claude agents covering database, frontend, backend, and integration.
- Auth first. Google OAuth with everything handled server-side: state generation, CSRF protection, token verification. The JWT lives in an HttpOnly cookie, and sessions expire after 1 hour.
- Added an invite-only allowlist on top of auth. Authenticated users who are not on the list can't reach the chat interface. All endpoints are protected.
- Chunking with langchain-text-splitters. Custom logic got complex fast, and MarkdownSplitter did the job. Reformatted the docs to work with it.
- Chose Weaviate for vector storage. Ran chunks through the embedding model myself before inserting them into Weaviate, instead of letting Weaviate call the model internally, because I wanted visibility into embedding costs.
- Tested the chat flow end to end with curl and Swagger before integrating with the frontend. Kept this for last so retrieval bugs didn't block everything else.
- Deployed on AWS EC2 with Docker Compose, the same setup as local. Each environment has its own docker-compose.override.yml, so production config is never edited directly.
- Set up GitHub Actions for auto-deployment to EC2 and auto-generated release notes on every production deploy. No manual SSH, and there is a traceable history of what shipped.
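To make the chunking step concrete, here is a deliberately simplified, standard-library illustration of heading-aware Markdown splitting. It is not the langchain-text-splitters implementation the project used; the names `Chunk`, `split_markdown`, and `max_chars` are hypothetical, and the 500-character default is a placeholder.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    heading: str  # nearest Markdown heading above the text ("" if none)
    text: str     # body text that belongs to that heading


HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_markdown(doc: str, max_chars: int = 500) -> list[Chunk]:
    """Split on headings, then cap each section at max_chars on paragraph breaks."""
    # First pass: group lines under the heading that precedes them.
    sections: list[tuple[str, list[str]]] = [("", [])]
    for line in doc.splitlines():
        match = HEADING.match(line)
        if match:
            sections.append((match.group(2).strip(), []))
        else:
            sections[-1][1].append(line)

    # Second pass: pack paragraphs into chunks no longer than max_chars.
    # A single paragraph longer than max_chars is kept whole.
    chunks: list[Chunk] = []
    for heading, lines in sections:
        paragraphs = [p.strip() for p in "\n".join(lines).split("\n\n") if p.strip()]
        buffer = ""
        for para in paragraphs:
            candidate = f"{buffer}\n\n{para}" if buffer else para
            if len(candidate) > max_chars and buffer:
                chunks.append(Chunk(heading, buffer))
                buffer = para
            else:
                buffer = candidate
        if buffer:
            chunks.append(Chunk(heading, buffer))
    return chunks
```

This sketch ignores fenced code blocks (a `#` comment inside one would be read as a heading), nested lists, and overlap between chunks. Edge cases like these are where hand-rolled splitting tends to grow complicated.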
Architecture
sequenceDiagram
actor User
participant Backend
participant Google
participant DB
User->>Backend: GET /api/auth/initiate
Backend->>Backend: Generate state nonce
Backend->>User: Set oauth_state cookie
Backend->>Google: Redirect with state param
Google->>Backend: Callback with code + state
Backend->>Backend: Verify state, clear cookie
Backend->>Google: Exchange code for user info
Backend->>DB: Check email against allowlist
alt Email allowed
Backend->>User: Set JWT cookie, redirect to chat
else Not on allowlist
Backend->>User: 403 Forbidden
end
sequenceDiagram
actor User
participant Backend
participant OpenAI
participant Weaviate
User->>Backend: POST /api/chat {message, session_id}
Backend->>OpenAI: Generate query embedding
OpenAI->>Backend: Query vector
Backend->>Weaviate: Hybrid search (vector + keyword)
Weaviate->>Backend: Chunks + distance scores
alt Distance within threshold
Backend->>OpenAI: Prompt + context + chat history
OpenAI->>Backend: Answer + follow-up questions
Backend->>User: Response with sources + citations
else Distance exceeds threshold
Backend->>User: Out of scope
end
Interesting Problems
This was my first time owning auth completely. Google OAuth sounds straightforward until you start thinking through what can go wrong: CSRF, handled with the state parameter; token theft via XSS, mitigated with HttpOnly cookies; and PII, kept out of the JWT. Then auth alone wasn't enough: an authenticated user who isn't on the invite list shouldn't reach the chat interface at all. All APIs and /docs are authorized based on scope, because a free-text input with no backend auth would have been easy to misuse. Even with that, data isolation was a separate problem: one user's chat history should never be accessible to another. Each of these felt like a small decision at the time, but together they added up to a layered security model.
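The three browser-facing pieces (state nonce, PII-free token, HttpOnly cookie) can be sketched with the standard library alone. This is an illustration under assumptions, not the project's code: the `SECRET` value, the cookie name `session`, the `SameSite=Lax` setting, and all function names are hypothetical, and a real service would normally use a maintained JWT library rather than hand-rolled signing.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time
from http.cookies import SimpleCookie
from typing import Optional

SECRET = b"replace-me"  # hypothetical signing key; load from config in practice
SESSION_TTL = 3600      # one hour, matching the session expiry described above


def new_state() -> str:
    """Random nonce stored in a cookie and echoed back by the OAuth provider."""
    return secrets.token_urlsafe(32)


def state_matches(cookie_state: str, callback_state: str) -> bool:
    """Constant-time comparison; a mismatch suggests a forged callback."""
    return hmac.compare_digest(cookie_state.encode(), callback_state.encode())


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def sign_jwt(user_id: str, now: Optional[int] = None) -> str:
    """HS256 token carrying only an opaque user id and an expiry (no email or name)."""
    now = int(time.time()) if now is None else now
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64(json.dumps({"sub": user_id, "exp": now + SESSION_TTL}).encode())
    signature = hmac.new(SECRET, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64(signature)}"


def verify_jwt(token: str, now: Optional[int] = None) -> Optional[str]:
    """Return the user id if the signature and expiry check out, else None."""
    now = int(time.time()) if now is None else now
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        return None
    expected = hmac.new(SECRET, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(_b64(expected).encode(), signature.encode()):
        return None
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped base64 padding
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return claims["sub"] if claims["exp"] > now else None


def session_cookie(token: str) -> str:
    """Set-Cookie value with HttpOnly so page scripts cannot read the token."""
    cookie = SimpleCookie()
    cookie["session"] = token
    cookie["session"]["httponly"] = True
    cookie["session"]["secure"] = True
    cookie["session"]["samesite"] = "Lax"
    cookie["session"]["max-age"] = SESSION_TTL
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()
```

The allowlist check sits after `verify_jwt` succeeds: a valid token only proves who the user is, and a separate lookup decides whether that user may reach the chat endpoints.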
Batch ingestion wasn't indexing documents in the HNSW vector index, so semantic search silently fell back to keyword search without raising any error. As a result, the 'out of scope' feature stopped working: the LLM answered everything, regardless of whether the topic was in the knowledge base. I only caught it by running hybrid search experiments directly in the shell. Fixed by switching to one API call per chunk.
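One way to make this kind of silent failure loud is to insert chunks one at a time and confirm each vector can be read back. The sketch below uses a hypothetical in-memory store and a placeholder embedding function; it does not use Weaviate's client API, and the `drop_vectors` flag exists only to simulate the bug.

```python
from typing import Callable, Optional, Sequence


class InMemoryStore:
    """Hypothetical stand-in for a vector database client (not Weaviate's API)."""

    def __init__(self, drop_vectors: bool = False) -> None:
        self.drop_vectors = drop_vectors  # simulates the silent indexing failure
        self.rows: dict[str, tuple[str, Optional[list[float]]]] = {}

    def insert(self, chunk_id: str, text: str, vector: Sequence[float]) -> None:
        self.rows[chunk_id] = (text, None if self.drop_vectors else list(vector))

    def get_vector(self, chunk_id: str) -> Optional[list[float]]:
        row = self.rows.get(chunk_id)
        return row[1] if row else None


def fake_embed(text: str) -> list[float]:
    """Placeholder embedding; real code would call the embedding model here."""
    return [float(len(text)), 1.0]


def ingest(
    store: InMemoryStore,
    chunks: Sequence[tuple[str, str]],
    embed: Callable[[str], list[float]],
) -> list[str]:
    """Insert chunks one by one; return ids whose vector did not round-trip."""
    missing: list[str] = []
    for chunk_id, text in chunks:
        store.insert(chunk_id, text, embed(text))
        if not store.get_vector(chunk_id):
            missing.append(chunk_id)
    return missing
```

A non-empty return value from `ingest` would be a signal to stop and investigate before any query relies on vector search.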
Each request sends the full chat history to the LLM: the more messages, the more tokens, the more cost. The early implementation deleted messages from the DB once the token limit was hit. That felt wrong, because message data is valuable; enough of it could be used to build a custom evaluation dataset. Before launch, I switched to session IDs instead. All messages in a session share a session_id. When the token limit is hit, a new session ID is generated, and no messages are deleted, ever. Right now users clear the chat with a button, which generates a new session ID so the conversation starts fresh. But the data stays, and the door is open for rate limiting or cooldown periods later.
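The session-ID approach can be sketched as an append-only log in which "clearing" only changes which session is current. This is a minimal illustration under assumptions: `ChatLog`, `is_full`, and the whitespace-based `count_tokens` are hypothetical, and a real implementation would count tokens with the model's tokenizer and persist messages in the database.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Message:
    session_id: str
    role: str
    content: str


def count_tokens(text: str) -> int:
    """Crude whitespace count standing in for a real tokenizer (assumption)."""
    return len(text.split())


@dataclass
class ChatLog:
    token_limit: int
    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    messages: list[Message] = field(default_factory=list)  # append-only, never pruned

    def add(self, role: str, content: str) -> Message:
        message = Message(self.session_id, role, content)
        self.messages.append(message)
        return message

    def history(self) -> list[Message]:
        """Only the current session's messages are sent to the LLM."""
        return [m for m in self.messages if m.session_id == self.session_id]

    def is_full(self) -> bool:
        """True once the current session has reached the token limit."""
        return sum(count_tokens(m.content) for m in self.history()) >= self.token_limit

    def clear(self) -> str:
        """Start a fresh session; earlier messages stay stored."""
        self.session_id = uuid.uuid4().hex
        return self.session_id
```

Because `clear` never touches `messages`, older sessions remain available for later analysis, while `history` keeps the prompt limited to the current session.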
Retrieval uses hybrid search, so a single keyword that matches a doc still returns a result. But confidence scoring can't use the hybrid score, because a keyword match doesn't tell you whether the content is semantically relevant. I run a separate near_vector query for confidence: Weaviate returns a distance score, and anything above a threshold is rejected as out of scope.
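The confidence gate itself reduces to a distance comparison. The sketch below assumes cosine distance (1 minus cosine similarity) and computes it locally, purely for illustration; in the project the distance comes back from the near_vector query. The function names and the 0.5 threshold are placeholders, not the project's values.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity: 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 1.0  # treat a zero vector as unrelated
    return 1.0 - dot / norm


def in_scope(
    query_vector: Sequence[float],
    chunk_vectors: Sequence[Sequence[float]],
    max_distance: float = 0.5,  # placeholder threshold; tune against real queries
) -> bool:
    """Vector-only confidence check, independent of the hybrid ranking."""
    if not chunk_vectors:
        return False
    best = min(cosine_distance(query_vector, v) for v in chunk_vectors)
    return best <= max_distance
```

Keeping this check separate from the hybrid ranking means a lone keyword hit can still surface a chunk for display, while the decision to answer at all depends only on semantic closeness.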
Outcome
Shared with 5 people; 2 used it. Feedback was mostly on the UI: the left sidebar looks clickable and highlights on hover, which made users expect the documents to open. They don't. That's the clearest thing to fix next.
What's Next
- Integrate Sentry for log monitoring. Currently, logs are lost on every container restart.
- Fix the left sidebar UX. It highlights on hover, which gives the impression that documents can be opened, but document viewing is intentionally not a feature. The hover state needs to be removed.