Prepit

2024

An AI study partner for revising fundamentals, built end to end.

What

Prepit is an AI study partner built on top of my own study notes on system design, DSA, and AI basics. The idea was simple: instead of navigating through markdown docs manually, just ask a question. It answers, lists sources and citations, and suggests follow-up questions. Users can rate each response. Since the LLM API is paid, the context window has a configurable limit, and users clear the chat to continue once it fills up. Access is invite-only.

Problem

I was studying system design, DSA, and AI and logging everything in markdown files. The notes were useful but hard to navigate quickly. I wanted to build a chat interface on top of them without a framework like LangChain, so I could actually understand how RAG works at the core.

Approach

Started with a PRD: listed what I wanted to build and decided the knowledge base would be my own study notes. Then wrote a technical design doc covering models, DAOs, folder structure, coding style, and API contracts. Created a detailed todo list, went through several revisions, then handed it to a Claude agent team of four agents covering database, frontend, backend, and integration.
Auth first. Google OAuth, with everything server-side: state generation, CSRF protection, token verification. JWT in an HttpOnly cookie; sessions expire after 1 hour.
Added an invite-only allowlist on top of auth. Authenticated users not on the list can't reach the chat interface. All endpoints are protected.
Chunking with langchain-text-splitters. Custom logic got complex fast, and MarkdownSplitter did the job. Reformatted the docs to work with it.
Chose Weaviate for vector storage. Generated embeddings in the backend and passed the vectors to Weaviate instead of letting Weaviate call the embedding model internally, because I wanted visibility into embedding costs.
Tested the chat flow end to end with curl and Swagger before integrating with the frontend. Kept this for last so retrieval bugs didn't block everything else.
Deployed on AWS EC2 with Docker Compose, the same setup as local. A docker-compose.override.yml per environment means production config is never edited directly.
Set up GitHub Actions for auto-deployment to EC2 and auto-generated release notes on every production deploy. No manual SSH, less time spent, and a traceable history of what shipped.
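
The "embed before Weaviate" step can be sketched roughly as below. This is a minimal illustration, not Prepit's actual code: `EmbedFn`, `EmbeddedChunk`, `embed_chunks`, and `fake_embed` are hypothetical names, and the fake embedder uses word count as a placeholder for the token usage a real embedding API would report.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical signature: takes chunk text, returns (vector, tokens_used).
EmbedFn = Callable[[str], Tuple[List[float], int]]


@dataclass
class EmbeddedChunk:
    text: str
    vector: List[float]
    tokens: int


def embed_chunks(chunks: List[str], embed: EmbedFn) -> Tuple[List[EmbeddedChunk], int]:
    """Embed each chunk in the app and return the results plus total tokens used."""
    embedded: List[EmbeddedChunk] = []
    total_tokens = 0
    for text in chunks:
        vector, tokens = embed(text)
        embedded.append(EmbeddedChunk(text, vector, tokens))
        total_tokens += tokens  # usage is visible here, before anything is stored
    return embedded, total_tokens


def fake_embed(text: str) -> Tuple[List[float], int]:
    # Stand-in for a real embedding call; word count is a placeholder for billed tokens.
    words = text.split()
    return [float(len(words)), float(len(text))], len(words)


chunks = ["CAP theorem basics", "Binary search runs in O(log n)"]
embedded, total_tokens = embed_chunks(chunks, fake_embed)
print(f"embedded {len(embedded)} chunks using {total_tokens} tokens")
```

Because the vectors already exist when the chunks reach the vector store, the token total can be logged or priced per ingestion run without relying on the store to report it.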

Architecture

Auth Flow

sequenceDiagram
    actor User
    participant Backend
    participant Google
    participant DB

    User->>Backend: GET /api/auth/initiate
    Backend->>Backend: Generate state nonce
    Backend->>User: Set oauth_state cookie
    Backend->>Google: Redirect with state param
    Google->>Backend: Callback with code + state
    Backend->>Backend: Verify state, clear cookie
    Backend->>Google: Exchange code for user info
    Backend->>DB: Check email against allowlist
    alt Email allowed
        Backend->>User: Set JWT cookie, redirect to chat
    else Not on allowlist
        Backend->>User: 403 Forbidden
    end
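
The state check and allowlist branch in the flow above could look something like this. It is a sketch using only the standard library; the function names, the lowercase email normalization, the 400 status for a state mismatch, and the 302 redirect status are assumptions, since the write-up only specifies the 403 branch.

```python
import hmac
import secrets
from typing import Optional, Set


def new_state() -> str:
    """Generate an unguessable state value to store in the oauth_state cookie."""
    return secrets.token_urlsafe(32)


def state_is_valid(cookie_state: Optional[str], callback_state: Optional[str]) -> bool:
    """Compare the cookie value with the state echoed back by Google, in constant time."""
    if not cookie_state or not callback_state:
        return False
    return hmac.compare_digest(cookie_state.encode(), callback_state.encode())


def callback_status(
    cookie_state: Optional[str],
    callback_state: Optional[str],
    email: str,
    allowlist: Set[str],
) -> int:
    """Return the HTTP status the callback would answer with in this sketch."""
    if not state_is_valid(cookie_state, callback_state):
        return 400  # assumption: the source does not say what a mismatch returns
    if email.strip().lower() not in allowlist:
        return 403  # the "Not on allowlist" branch
    return 302  # assumption: redirect to chat after setting the JWT cookie


allowlist = {"friend@example.com"}
state = new_state()
print(callback_status(state, state, "Friend@Example.com", allowlist))
print(callback_status(state, state, "stranger@example.com", allowlist))
print(callback_status(state, "forged", "friend@example.com", allowlist))
```

The state comparison runs before the allowlist lookup, so a forged callback is rejected without touching the database.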

RAG Pipeline

sequenceDiagram
    actor User
    participant Backend
    participant OpenAI
    participant Weaviate

    User->>Backend: POST /api/chat {message, session_id}
    Backend->>OpenAI: Generate query embedding
    OpenAI->>Backend: Query vector
    Backend->>Weaviate: Hybrid search (vector + keyword)
    Weaviate->>Backend: Chunks + distance scores
    alt Distance within threshold
        Backend->>OpenAI: Prompt + context + chat history
        OpenAI->>Backend: Answer + follow-up questions
        Backend->>User: Response with sources + citations
    else Distance exceeds threshold
        Backend->>User: Out of scope
    end

Interesting Problems

Building auth end to end for the first time

This was my first time owning auth completely. Google OAuth sounds straightforward until you start thinking through what can go wrong: CSRF is handled with the state parameter, token theft through XSS is limited by HttpOnly cookies, and PII is kept out of the JWT. Then auth alone wasn't enough: an authenticated user who isn't on the invite list shouldn't reach the chat interface at all. All APIs and /docs are authorized based on scope, because a free textarea with no backend auth would have been easy to misuse. And even with that, data isolation was a separate problem: one user's chat history should never be accessible to another. Each of these felt like a small decision at the time, but together they added up to a layered security model.
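
The data isolation point can be illustrated with a small sketch. The `Message` shape and `history_for` helper are hypothetical; the idea being shown is only that every history lookup filters on a user ID taken from the verified JWT, never from the request body.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Message:
    user_id: str
    session_id: str
    role: str
    text: str


def history_for(messages: List[Message], user_id: str, session_id: str) -> List[Message]:
    """Return only messages owned by the authenticated user in the given session."""
    return [
        m for m in messages
        if m.user_id == user_id and m.session_id == session_id
    ]


messages = [
    Message("alice", "s1", "user", "What is consistent hashing?"),
    Message("alice", "s1", "assistant", "It maps keys onto a ring..."),
    Message("bob", "s1", "user", "Explain quicksort"),
]

# Even if two users share a session_id value, each sees only their own rows.
print(len(history_for(messages, "alice", "s1")))
print(len(history_for(messages, "bob", "s1")))
```

In a real database this would be a `WHERE user_id = ... AND session_id = ...` clause rather than a list filter; the list stands in for the table here.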

Batch ingestion silently broke semantic search

Batch ingestion wasn't indexing documents via HNSW. Semantic search fell back to keyword search without any error. The 'out of scope' feature stopped working: the LLM was answering everything regardless of whether the topic was in the knowledge base. Only caught it by running hybrid search experiments directly in the shell. Fixed by switching to one API call per chunk.
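
A rough sketch of the one-call-per-chunk approach, written so that failures are collected instead of disappearing. The `insert` callable is a hypothetical stand-in for whatever single-object write the vector store client offers; the check for a missing vector is an added assumption, not something the write-up describes.

```python
from typing import Callable, Dict, List, Tuple

Chunk = Dict[str, object]


def ingest_one_by_one(
    chunks: List[Chunk],
    insert: Callable[[Chunk], None],
) -> Tuple[int, List[Tuple[int, str]]]:
    """Insert chunks one call at a time and report every failure by index."""
    inserted = 0
    failures: List[Tuple[int, str]] = []
    for index, chunk in enumerate(chunks):
        if not chunk.get("vector"):
            # Assumption: a chunk without a vector would never be searchable semantically.
            failures.append((index, "missing vector"))
            continue
        try:
            insert(chunk)
            inserted += 1
        except Exception as exc:  # keep going, but never swallow the error
            failures.append((index, str(exc)))
    return inserted, failures


store: List[Chunk] = []


def fake_insert(chunk: Chunk) -> None:
    # Stand-in for a real client call; rejects one chunk to show error reporting.
    if chunk["text"] == "bad":
        raise ValueError("rejected")
    store.append(chunk)


chunks = [
    {"text": "a", "vector": [0.1]},
    {"text": "b"},
    {"text": "bad", "vector": [0.2]},
]
inserted, failures = ingest_one_by_one(chunks, fake_insert)
print(inserted, failures)
```

Per-chunk calls are slower than a batch, but each failure is tied to a specific chunk, which is the property that was missing when the batch path failed silently.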

Context window management: deleting messages was the wrong call

Each request sends the full chat history to the LLM: the more messages, the more tokens, the more cost. The early implementation deleted messages from the DB once the token limit was hit. That felt wrong, because message data is valuable; enough of it could be used to build a custom evaluation dataset. Before launch, switched to session IDs instead. All messages in a session share a session_id. When the token limit is hit, the conversation continues under a new session ID, and no messages are deleted, ever. Right now users trigger this themselves with a button that clears the chat: it generates a new session ID and the conversation starts fresh. But the data stays, and the door is open for rate limiting or cooldown periods later.
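
The session ID approach can be sketched with an in-memory store. `ChatStore` and its methods are hypothetical names, and token counts are passed in directly rather than computed; the point is that clearing only changes the active session ID and never removes rows.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ChatStore:
    token_limit: int
    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    # Each row is (session_id, text, tokens); rows are only ever appended.
    rows: List[Tuple[str, str, int]] = field(default_factory=list)

    def session_tokens(self) -> int:
        """Tokens used by the active session only."""
        return sum(tokens for sid, _, tokens in self.rows if sid == self.session_id)

    def add(self, text: str, tokens: int) -> bool:
        """Store a message, or refuse it if the active session would exceed the limit."""
        if self.session_tokens() + tokens > self.token_limit:
            return False
        self.rows.append((self.session_id, text, tokens))
        return True

    def clear(self) -> str:
        """Start a fresh session; earlier rows stay in storage."""
        self.session_id = uuid.uuid4().hex
        return self.session_id

    def history(self) -> List[str]:
        """Messages that would be sent to the LLM: active session only."""
        return [text for sid, text, _ in self.rows if sid == self.session_id]


chat = ChatStore(token_limit=10)
chat.add("What is a B-tree?", 6)
accepted = chat.add("And an LSM tree?", 6)  # would exceed the limit
chat.clear()
chat.add("And an LSM tree?", 6)
print(accepted, chat.history(), len(chat.rows))
```

After `clear()`, the LLM sees only the new session's messages, while both sessions remain in `rows` for later use such as building an evaluation set.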

Hybrid search and confidence scoring needed different strategies

Retrieval uses hybrid search, so a single keyword that matches a doc still returns a result. But confidence scoring can't use the hybrid score, because a keyword match doesn't tell you whether the content is semantically relevant. Used near_vector separately for confidence: Weaviate returns a distance score, and anything above a threshold gets rejected as out of scope.
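
The confidence gate can be sketched as a pure function over distances. This assumes cosine distance and a made-up threshold of 0.5; the real threshold and metric are not stated here, and `cosine_distance` and `in_scope` are hypothetical helpers that stand in for the distance a vector search would return.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 1.0  # assumption: treat a zero vector as unrelated
    return 1.0 - dot / norm


def in_scope(
    query: Sequence[float],
    chunk_vectors: Sequence[Sequence[float]],
    max_distance: float,
) -> bool:
    """Accept the question only if the closest chunk is within the threshold."""
    if not chunk_vectors:
        return False
    best = min(cosine_distance(query, v) for v in chunk_vectors)
    return best <= max_distance


THRESHOLD = 0.5  # hypothetical value for illustration

print(in_scope([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], THRESHOLD))
print(in_scope([1.0, 0.0], [[0.0, 1.0]], THRESHOLD))
```

Keeping this gate separate from hybrid retrieval means a keyword hit can still surface a chunk for display, while the decision to answer at all depends only on semantic distance.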

Outcome

Shared with 5 people; 2 used it. Feedback was mostly on the UI: the left sidebar looks clickable and highlights on hover, which made users expect the documents to open. They don't. That's the clearest thing to fix next.

What's Next

Integrate Sentry for log monitoring. Currently logs are lost on every container restart.
Fix the left sidebar UX. It highlights on hover, which gives the impression that documents can be opened, but viewing documents is intentionally not a feature. The hover state needs to be removed.