Development · 12 min read

Building Orbinix: A production AI SaaS from scratch

How I shipped a multi-model AI pipeline, a freemium billing engine, and a self-hosted CI/CD stack, end to end.

Excelle Escalada
Digital Experience Architect

Solving the resume filter problem

Applicant Tracking Systems (ATS) reject more than 75% of resumes before a human recruiter ever reads them. This happens because resumes don't mirror the exact vocabulary, keyword density, and structure that ATS software scores against. A qualified candidate often gets filtered out automatically just by calling their work "managing daily logistics" instead of "directing cross-functional operations."

The common solution is to manually rewrite your resume for every job you apply to. That takes anywhere from 30 to 60 minutes per application. For someone applying to 10 or 15 roles a week, that math is brutal. Most people skip it and submit a generic resume while hoping for the best.

I built Orbinix to close that gap. You upload your complete work history as a master CV, paste a job description, and the platform intelligently reorders and optimizes your resume. I set one strict constraint from the start: the system must never fabricate experience. Every output bullet has to trace back to something the user provided.

What Orbinix does

I built the core user flow around precision and honesty. You upload a master CV, add a job description, and the AI tailors the content. After a quick interactive review, you export an ATS-ready PDF in seconds.

[Image: Orbinix workflow and dashboard]

The server parses and stores uploaded PDF or DOCX master CVs. Users paste a job description into a text field to trigger the analysis. The platform identifies hard skills, soft skills, required qualifications, and industry terminology to mirror the employer's own vocabulary.
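The analysis result needs a predictable shape before anything downstream trusts it. As a rough sketch (the interface and guard below are illustrative, not Orbinix's actual schema), the extracted categories can be validated with a plain type guard:

```typescript
// Hypothetical shape of the job-description analysis result.
// Field names are illustrative, not Orbinix's actual schema.
interface JobAnalysis {
  hardSkills: string[];
  softSkills: string[];
  qualifications: string[];
  terminology: string[];
}

// Type guard that validates a parsed model response before it is trusted.
function isJobAnalysis(value: unknown): value is JobAnalysis {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return ["hardSkills", "softSkills", "qualifications", "terminology"].every(
    (key) =>
      Array.isArray(v[key]) &&
      (v[key] as unknown[]).every((s) => typeof s === "string"),
  );
}
```

A guard like this is the first line of defense against a model returning prose instead of JSON.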

After tailoring, users see a side-by-side editor with the original and rewritten versions. They can accept, reject, or manually edit any bullet to maintain full control. They can also use the gap analysis feature to apply missing skills surgically.

The platform launched with four tiers: Free, Pay-as-you-go, Pro, and Career. I built a custom credit ledger system to handle billing through Stripe subscriptions. This ensures every AI call is tracked and accounted for accurately.

Architecture at a glance

I built the stack as a full-stack TypeScript monorepo using Next.js 15 and the App Router. The architecture is deliberate about where tasks execute. Long-running AI requests that take 20 to 40 seconds cannot live on Vercel's free tier due to hard function timeouts. Hosting on a self-managed VM removes this constraint and keeps infrastructure costs predictable.

  • Frontend: Next.js 15, React 19, Tailwind CSS v4, shadcn/ui, Radix UI primitives
  • API Layer: Next.js Route Handlers running as serverless-compatible Node.js functions
  • AI Tier: Four distinct model integrations across three providers
  • Database: Supabase PostgreSQL with Row-Level Security policies on every table
  • Auth: NextAuth v5 for session management and middleware, with Supabase Auth handling OAuth and JWT claims for storage RLS
  • File Storage: Supabase Storage for uploaded CVs and generated PDFs
  • Billing: Stripe subscriptions, webhook handlers, and a custom credit ledger
  • Infra: GitHub Actions, rsync, PM2, and Nginx on a self-hosted Azure VM
  • Automation: A separate Docker Compose stack running n8n, SearXNG, and Crawl4AI

Multi-model AI tailoring pipeline

Building an AI pipeline that produces reliable, structured output from a non-deterministic system was the hardest engineering challenge. The tailoring endpoint routes requests to one of three model tiers based on quality and cost requirements. Standard requests use Azure-hosted Kimi K2.5, while premium tailoring uses Anthropic Claude Sonnet.
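The routing itself can be sketched as a simple tier-to-model map. The tier names and model identifiers below are illustrative stand-ins based on the providers named in this post, not the exact deployment names:

```typescript
// Illustrative tier-to-model routing. Tier names and model identifiers
// are assumptions, not Orbinix's actual configuration values.
type Tier = "standard" | "premium" | "gap-rewrite";

function routeModel(tier: Tier): string {
  switch (tier) {
    case "premium":
      return "claude-sonnet"; // highest prose quality, highest cost
    case "gap-rewrite":
      return "o3-mini"; // fast, cheap targeted rewrites
    default:
      return "kimi-k2.5"; // Azure-hosted default for standard requests
  }
}
```

Centralizing the mapping in one function keeps cost decisions auditable and makes swapping a provider a one-line change.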

Each model gets the same carefully engineered system prompt. This prompt enforces the core constraints: no fabrication, perfect heading preservation, and ASCII-only punctuation. I explicitly instructed the models to avoid any unicode em dashes or en dashes.

Language models don't always return clean, schema-compliant JSON. I built a multi-stage normalization layer to deduplicate sections and split embedded competency content. It also infers line types using date-range regex patterns and section-name heuristics. This protects the database from malformed records.
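A minimal sketch of what such a normalization pass can look like, under simplified heuristics (the real layer also splits embedded competency content, which is omitted here):

```typescript
// Simplified normalization helpers; the heuristics are illustrative.

// Enforce ASCII-only punctuation: replace unicode em/en dashes.
function asciiPunctuation(text: string): string {
  return text.replace(/[\u2013\u2014]/g, "-");
}

// A line containing a date range is treated as a role header,
// everything else as a plain bullet.
const DATE_RANGE = /\b(19|20)\d{2}\s*(-|to)\s*((19|20)\d{2}|present)\b/i;

function inferLineType(line: string): "role" | "bullet" {
  return DATE_RANGE.test(line) ? "role" : "bullet";
}

// Deduplicate sections by normalized heading, keeping the first occurrence.
function dedupeSections(headings: string[]): string[] {
  const seen = new Set<string>();
  return headings.filter((h) => {
    const key = h.trim().toLowerCase();
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```

Running passes like these before the database write is what keeps a non-deterministic upstream from producing malformed rows downstream.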

Freemium credit ledger: tracking like a bank

Every call to an LLM costs real money, so the usage-tracking layer needs to be defensively correct. I built Orbinix's access control and billing system to act as a financial ledger. Before any tailoring request starts, the system runs a deep usage check.

The system reads the user's account tier and credit balance. It independently counts the successful rows in the llm_logs table for that user in the current month. It then takes the higher of the two values as the user's effective usage.

This dual-verification pattern protects against database update failures. If a profile counter fails to update, the logs count ensures the user can't exceed their limit. Every credit transaction is recorded as an append-only entry for full auditability.
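The floor logic itself is tiny; the sketch below uses hypothetical stand-ins for the two Supabase reads (profile counter and llm_logs count):

```typescript
// Hypothetical inputs: the two independently sourced usage counts.
interface UsageSources {
  profileCount: number; // counter stored on the user's profile row
  logsCount: number; // COUNT of successful llm_logs rows this month
}

// Take the higher of the two counts, so a failed counter update
// can never grant extra credits.
function effectiveUsage({ profileCount, logsCount }: UsageSources): number {
  return Math.max(profileCount, logsCount);
}

function canTailor(sources: UsageSources, monthlyLimit: number): boolean {
  return effectiveUsage(sources) < monthlyLimit;
}
```

The interesting property is that the check fails closed: whichever source is stale, the user is judged against the larger number.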

Targeted gap rewrite: matching the right model

The gap analysis rewrite lets users address missing skills without re-tailoring the whole document. After the initial analysis, the system surfaces skills from the job description that aren't in the CV. Users then select which "gaps" to address.

A brute-force approach would be expensive and slow. Instead, I built a targeted endpoint that asks the AI to rewrite specific existing bullets. This preserves the work the user has already reviewed while adding the missing keywords naturally.

I chose the lightweight Azure OpenAI o3-mini model for this task. Speed matters during active editing, and the smaller model handles a constrained, targeted rewrite well. This choice cuts API costs while keeping responses fast for the user.
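A sketch of how such a targeted request might be assembled; the prompt wording and function name are illustrative, not the production prompt:

```typescript
// Illustrative prompt assembly for a targeted gap rewrite: only the
// selected bullets and missing keywords are sent, never the whole CV.
function buildGapRewritePrompt(bullets: string[], gaps: string[]): string {
  return [
    "Rewrite only the numbered bullets below. Do not invent experience.",
    `Work these missing keywords in naturally: ${gaps.join(", ")}.`,
    ...bullets.map((b, i) => `${i + 1}. ${b}`),
  ].join("\n");
}
```

Sending a handful of bullets instead of the full document is what makes a small, fast model sufficient here.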

Self-hosted CI/CD: infra without the Vercel tax

For Orbinix, I built a real deployment pipeline on a self-hosted Azure VM. I needed full control over timeout limits and execution environments that managed platforms typically restrict. The pipeline uses GitHub Actions to run CI on every pull request and CD on every merge to main.

The CD workflow connects to the Azure VM over SSH and rsyncs the source code. I intentionally isolated secrets so that sensitive keys like Stripe and AI API keys never touch the CI runner. They live only in a production environment file on the server itself.

PM2 starts the application through a custom script that sources these production secrets. This setup means even a compromised GitHub account would not reveal production secrets. Nginx handles TLS termination and forwards real client IPs to the Node process.

Supporting infrastructure: the automation stack

I run a separate Docker Compose stack for workflow automation on the same Azure VM. This includes n8n for orchestration, SearXNG for self-hosted search, and Crawl4AI for structured web scraping. These tools provide the "intel" the AI needs to analyze job descriptions deeply.

Nginx routes traffic to these services by subdomain with protective headers. The internal Docker network allows n8n to query the search engine and scraper using internal container hostnames. This bypasses Cloudflare's bot protection and reduces latency.

This stack supports complex content workflows without requiring paid search APIs. Building and operating it provided hands-on experience with container networking and service discovery. It keeps the entire product ecosystem self-contained and cost-effective.

Key decisions and trade-offs

I made every significant technical decision in Orbinix based on production reliability and cost.

  • Multi-model routing: Different features have different quality and cost requirements. Lightweight models handle gap rewrites for speed, while Claude Sonnet handles premium tailoring for prose.
  • Supabase RLS: Row-Level Security policies automatically enforce data isolation at the database layer. Users can only query their own rows, without manual checks on every query.
  • Self-hosted VM: AI tailoring often takes longer than the 10-second limit on Vercel's hobby tier. Self-hosting removes this constraint and provides predictable infrastructure costs.
  • Credit ledger: An append-only log provides a complete audit trail for disputes, refunds, and fraud detection. This is the same pattern financial systems use because it's correct.
  • Output normalization: Language models produce non-deterministic output. Validating and normalizing the response before writing to the database protects against corrupted user records.

Actionable takeaways: what I learned shipping AI

Building a production-ready AI product taught me that infrastructure and consistency define the user experience.

  • Build the test suite earlier. Unit tests for the CV parsing logic and integration tests for the AI pipeline catch normalization edge cases before they reach users.
  • Evaluate export architecture early. The react-pdf approach works for simple layouts but limits flexibility. A headless Chrome approach offers better layout control and font support.
  • Invest in a structured evaluation framework. Prompt engineering needs a systematic approach. Comparing outputs across a diverse set of CV types builds a more robust product faster than manual iteration.
  • Secret isolation is non-negotiable. Keeping production secrets off the CI runner and only on the server reduces the risk of a major credential compromise.
  • Infrastructure transparency beats ease of use. While Vercel is simple, self-hosting provided the control needed to manage long-running AI processes effectively.

Need help building your next AI-powered SaaS product? Get in touch for expert development and strategic guidance.
