AuraHire

A full-stack AI recruitment platform built as a thesis system around two commitments: explainable scoring and active bias mitigation. Every AI score ships with a component breakdown, the weights used, a plain-language explanation, and verbatim evidence excerpts from the resume; job descriptions are scanned for biased language before publish, and resumes are PII-redacted before any scoring model sees them.

Type: AI Platform · Web App
Role: Solo Full-Stack Build
Built: 2026
Client: Personal · thesis project

Stack: Next.js 16 · React 19 · NestJS 10 · TypeScript (strict) · Drizzle ORM · PostgreSQL / Supabase · OpenAI (gpt-4o-mini) · Tailwind 4 · Turborepo · Redis / BullMQ · Socket.io

The problem

Most AI recruitment tools score candidates inside a black box: a number appears, a candidate is ranked, and no one can say why. That opacity isn't just an annoyance - it's where bias hides. If you can't inspect how a score was produced, you can't tell whether it leaned on a name, an address, a gendered phrase, or genuine skill.

The stakes make that unacceptable. A hiring score helps decide who gets a job, so a number nobody can explain is a decision nobody can be held accountable for. The interesting problem isn't 'can AI rank resumes' - it obviously can - but 'can it do so in a way a candidate, a recruiter, and an auditor would all accept.'

AuraHire is my own thesis build, framed around a single defensible claim: that AI-powered recruitment can be explainable and fair. The constraint I set was that the system is the artifact - no faked AI, no opaque scoring, no demo theatre. If a feature couldn't be defended in front of an examiner, it didn't ship.

What I built

AuraHire is a full, three-sided platform, not a scoring demo. It runs three roles - candidate, recruiter, and admin - on top of multi-tenant company membership, where one recruiter can belong to several companies and an active-company context quietly governs every recruiter view, query, and permission.

Each role gets a complete workflow. Candidates get a resume-first onboarding wizard - upload, AI parse, review the prefilled fields, set preferences, and see a Profile Score - plus a job feed with per-job match chips and one-tap apply. Recruiters get a rich-text job editor with inline bias flags, an application pipeline sortable by best match, interview scheduling, and offer generation. Admins get a command center, an audit log, AI scoring configuration, and a bias-and-fairness monitor.

What makes it production-grade rather than a prototype is the plumbing underneath all three: real authentication and role guards, Row-Level Security, background queues, scheduled jobs, real-time updates, and transactional email - the unglamorous parts that decide whether software survives contact with real use.

The explainable scoring engine

Two scoring engines run server-side on OpenAI with structured outputs - free-text parsing is forbidden, so every call returns a Zod-validated JSON document. The Profile Score weighs four components (Completeness 25, Skill Depth 30, Experience Clarity 30, Education Quality 15). The Match Score weighs four more against a specific job (Skills 40, Experience 35, Education 15, Cultural / Language Fit 10). That is eight transparent dimensions in total, each capped by its configured weight.
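To make the "capped by its configured weight" rule concrete, here is a minimal sketch of how a total could be assembled from per-component values. The names (`totalScore`, `PROFILE_WEIGHTS`, `MATCH_WEIGHTS`, the camel-cased keys) are hypothetical; only the component labels and weight values come from the description above, and the sketch assumes the model's numbers have already been parsed and validated.

```typescript
// Hypothetical names; only the labels and weight values are taken from the text.
type Weights = Record<string, number>;

const PROFILE_WEIGHTS: Weights = {
  completeness: 25,
  skillDepth: 30,
  experienceClarity: 30,
  educationQuality: 15,
};

const MATCH_WEIGHTS: Weights = {
  skills: 40,
  experience: 35,
  education: 15,
  culturalLanguageFit: 10,
};

interface ComponentScore {
  key: string;
  score: number; // points awarded after capping
  max: number; // equals the configured weight
}

// Clamp every raw component value into [0, weight], then sum the capped values.
function totalScore(
  raw: Record<string, number>,
  weights: Weights,
): { total: number; components: ComponentScore[] } {
  const components = Object.entries(weights).map(([key, max]) => {
    const value = raw[key] ?? 0; // a missing component contributes nothing
    return { key, score: Math.min(Math.max(value, 0), max), max };
  });
  const total = components.reduce((sum, c) => sum + c.score, 0);
  return { total, components };
}
```

Because both weight sets sum to 100, the capped total always lands on a 0-100 scale: a raw Skills value of 50, for instance, is clamped to 40 before it is added.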

No score renders as a naked number. Every component carries its score, its max, its weight, a plain-language explanation, and one to three verbatim evidence excerpts pulled from the resume - each tagged positive, negative, or neutral with an estimated point contribution. Scores fall into plain-language bands (Strong at 70+, Partial at 40-69, Limited below 40) so a number is always paired with a human-readable label.
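A small sketch of what that per-component payload and the banding rule might look like as types. The field names and the `bandFor` and `isRenderable` helpers are assumptions for illustration; the thresholds are the defaults quoted above and are passed in as a parameter because they are described elsewhere as admin-tunable.

```typescript
type Band = "Strong" | "Partial" | "Limited";
type Sentiment = "positive" | "negative" | "neutral";

interface Evidence {
  excerpt: string; // verbatim text from the resume
  sentiment: Sentiment;
  estimatedPoints: number; // estimated contribution to the component score
}

interface ExplainedComponent {
  name: string;
  score: number;
  max: number;
  weight: number;
  explanation: string;
  evidence: Evidence[];
}

interface BandThresholds {
  strong: number;
  partial: number;
}

const DEFAULT_THRESHOLDS: BandThresholds = { strong: 70, partial: 40 };

// Map a 0-100 total onto its plain-language band.
function bandFor(total: number, t: BandThresholds = DEFAULT_THRESHOLDS): Band {
  if (total >= t.strong) return "Strong";
  if (total >= t.partial) return "Partial";
  return "Limited";
}

// A component is only shown when it carries an explanation and 1-3 excerpts.
function isRenderable(c: ExplainedComponent): boolean {
  const evidenceOk = c.evidence.length >= 1 && c.evidence.length <= 3;
  const scoreOk = c.score >= 0 && c.score <= c.max;
  return evidenceOk && scoreOk && c.explanation.trim().length > 0;
}
```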

Reproducibility is baked into the data model: every score row stores the prompt version, the model used, latency, the redacted fields, the exact weights used, and the full raw model output. A historical score stays interpretable even after an admin re-tunes the algorithm, because the weights that produced it travel with it.
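One way to picture this is a score row that snapshots its own weights, so a historical score is read against the values it stores rather than whatever the admin has configured today. The `ScoreRow` shape and `explainHistorical` helper below are hypothetical and only mirror the fields listed above.

```typescript
// Hypothetical row shape mirroring the fields named in the text.
interface ScoreRow {
  promptVersion: string;
  model: string;
  latencyMs: number;
  redactedFields: string[];
  weightsUsed: Record<string, number>; // snapshot taken at scoring time
  componentScores: Record<string, number>;
  rawModelOutput: string;
}

// Render a historical score using the weights stored on the row itself.
function explainHistorical(row: ScoreRow): string[] {
  return Object.entries(row.weightsUsed).map(
    ([key, weight]) => `${key}: ${row.componentScores[key] ?? 0}/${weight}`,
  );
}
```

The helper never consults the live configuration, which is the property that keeps old scores interpretable after a re-tune.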

Bias mitigation and fairness

Most tools bolt fairness on at the end - they score first, then check the outputs for disparate impact. AuraHire inverts that: fairness is enforced upstream, before a model ever forms an opinion, because the cleanest way to stop a score from leaning on a name is to make sure the model never sees the name.

So before any scoring model sees a resume, a hybrid PII redaction pass runs: a rule-based stage nulls name, email, phone, and social links, and an LLM-assisted stage scrubs residual identifiers from free-text fields. A redacted-fields list is persisted on every score row so the redaction can be audited, and the scoring prompts treat a redacted token as a present-but-withheld field, so candidates are never penalized for the redaction itself.
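The rule-based stage can be sketched roughly as follows. The `ParsedResume` shape and `redactRuleBased` name are assumptions, and the LLM-assisted scrub of free-text fields is deliberately left out; the point is only that each nulled field is also recorded, which is one plausible way for a prompt to tell a withheld field apart from a missing one.

```typescript
// Hypothetical resume shape; the real parsed document will have more fields.
interface ParsedResume {
  name: string | null;
  email: string | null;
  phone: string | null;
  socialLinks: string[] | null;
  skills: string[];
}

interface RedactionResult {
  redacted: ParsedResume;
  redactedFields: string[]; // persisted on the score row for auditing
}

// Null the direct identifiers and record which ones were actually present.
function redactRuleBased(resume: ParsedResume): RedactionResult {
  const redactedFields: string[] = [];
  const redacted: ParsedResume = { ...resume, skills: [...resume.skills] };

  if (resume.name !== null) {
    redacted.name = null;
    redactedFields.push("name");
  }
  if (resume.email !== null) {
    redacted.email = null;
    redactedFields.push("email");
  }
  if (resume.phone !== null) {
    redacted.phone = null;
    redactedFields.push("phone");
  }
  if (resume.socialLinks !== null) {
    redacted.socialLinks = null;
    redactedFields.push("socialLinks");
  }
  return { redacted, redactedFields };
}
```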

On the job side, descriptions are scanned for biased language across four categories - gendered, age-coded, ableist, and exclusionary - plus admin-defined custom terms. Flags surface inline in the editor as the recruiter types, and publishing a job with unresolved flags requires an explicit written override reason that lands in both the bias-flag record and the append-only audit log.
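As a simplified stand-in for the scan, the sketch below matches a job description against a term list per category plus custom terms. It is lexicon-only by assumption (the production scan may well involve the model), and every term in `LEXICON` is an illustrative placeholder rather than the project's actual list.

```typescript
type BiasCategory = "gendered" | "age-coded" | "ableist" | "exclusionary" | "custom";

interface BiasFlag {
  term: string;
  category: BiasCategory;
  index: number; // character offset, usable for inline highlighting
}

// Illustrative placeholder terms only.
const LEXICON: { category: BiasCategory; terms: string[] }[] = [
  { category: "gendered", terms: ["salesman", "chairman"] },
  { category: "age-coded", terms: ["digital native", "recent graduate"] },
  { category: "ableist", terms: ["able-bodied"] },
  { category: "exclusionary", terms: ["native english speaker"] },
];

function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Case-insensitive whole-word scan; results are ordered by position in the text.
function scanForBias(text: string, customTerms: string[] = []): BiasFlag[] {
  const groups = [...LEXICON, { category: "custom" as BiasCategory, terms: customTerms }];
  const flags: BiasFlag[] = [];
  for (const { category, terms } of groups) {
    for (const term of terms) {
      if (term.trim() === "") continue; // an empty pattern would match everywhere
      const re = new RegExp(`\\b${escapeRegExp(term)}\\b`, "gi");
      let match: RegExpExecArray | null;
      while ((match = re.exec(text)) !== null) {
        flags.push({ term, category, index: match.index });
      }
    }
  }
  return flags.sort((a, b) => a.index - b.index);
}
```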

Redact before scoring, always

Every resume passes through PII redaction before a scoring engine runs. The redacted-fields array is stored on each score row, so an admin can prove redaction happened for any decision.

Override requires a reason

A bias flag moves through flagged, then resolved or overridden. An override is only valid with a written reason - the schema enforces it and the audit log records the actor, the term, and the justification.
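The lifecycle above is small enough to sketch as plain functions. The record shapes and function names here are hypothetical, and an in-memory array stands in for the audit log; in the real system the reason requirement is described as being enforced by the schema.

```typescript
type FlagStatus = "flagged" | "resolved" | "overridden";

interface BiasFlagRecord {
  id: string;
  term: string;
  status: FlagStatus;
  overrideReason: string | null;
}

interface OverrideAudit {
  actorId: string;
  term: string;
  reason: string;
}

// Only an open flag can be overridden, and only with a non-empty written reason.
function overrideFlag(
  flag: BiasFlagRecord,
  actorId: string,
  reason: string,
  auditLog: OverrideAudit[],
): BiasFlagRecord {
  if (flag.status !== "flagged") {
    throw new Error(`flag ${flag.id} is already ${flag.status}`);
  }
  const trimmed = reason.trim();
  if (trimmed.length === 0) {
    throw new Error("an override requires a written reason");
  }
  auditLog.push({ actorId, term: flag.term, reason: trimmed });
  return { ...flag, status: "overridden", overrideReason: trimmed };
}

function resolveFlag(flag: BiasFlagRecord): BiasFlagRecord {
  if (flag.status !== "flagged") {
    throw new Error(`flag ${flag.id} is already ${flag.status}`);
  }
  return { ...flag, status: "resolved" };
}

// A job can be published once no flag is still open.
function canPublish(flags: BiasFlagRecord[]): boolean {
  return flags.every((f) => f.status !== "flagged");
}
```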

Preview Impact before retune

When an admin changes scoring weights or band thresholds, a Preview Impact pass projects how the change would have moved recent applications against current production weights - so re-tuning the algorithm is a deliberate, inspectable act.
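A Preview Impact pass could be approximated like this. The sketch assumes each stored component keeps its score and the max it was scored against, and it rescales proportionally to the new weight; that rescaling rule, along with the names `projectTotal` and `previewImpact`, is an assumption rather than the documented algorithm.

```typescript
interface StoredComponent {
  score: number;
  max: number; // the weight in force when the score was produced
}
type StoredScore = Record<string, StoredComponent>;
type Band = "Strong" | "Partial" | "Limited";

function bandFor(total: number): Band {
  if (total >= 70) return "Strong";
  if (total >= 40) return "Partial";
  return "Limited";
}

const round1 = (n: number): number => Math.round(n * 10) / 10;

// Rescale each component proportionally to the weight it would have under `weights`.
function projectTotal(stored: StoredScore, weights: Record<string, number>): number {
  let total = 0;
  for (const [key, weight] of Object.entries(weights)) {
    const c = stored[key];
    if (c === undefined || c.max <= 0) continue;
    total += (c.score / c.max) * weight;
  }
  return round1(total);
}

interface ImpactRow {
  id: string;
  before: number;
  after: number;
  delta: number;
  bandChanged: boolean;
}

// Compare recent applications under current production weights and a proposal.
function previewImpact(
  apps: { id: string; components: StoredScore }[],
  current: Record<string, number>,
  proposed: Record<string, number>,
): ImpactRow[] {
  return apps.map((app) => {
    const before = projectTotal(app.components, current);
    const after = projectTotal(app.components, proposed);
    return {
      id: app.id,
      before,
      after,
      delta: round1(after - before),
      bandChanged: bandFor(before) !== bandFor(after),
    };
  });
}
```

The `bandChanged` field is the part an admin is most likely to care about: a candidate sliding from Strong to Partial matters more than a few points of drift inside one band.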

No demographic labels stored

Demographic data is deliberately never collected - it would contradict the redaction philosophy. Fairness is surfaced as aggregate distributions (flag counts, top flagged terms, override rate) rather than as disparate-impact statistics computed from sensitive attributes.
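Those aggregates need nothing beyond the flag records themselves. In the sketch below, the `FlagRow` shape and `fairnessSummary` name are hypothetical, and the override rate is defined, as an assumption, as overridden flags divided by all closed flags.

```typescript
interface FlagRow {
  term: string;
  status: "flagged" | "resolved" | "overridden";
}

interface FairnessSummary {
  totalFlags: number;
  topTerms: { term: string; count: number }[];
  overrideRate: number; // overridden / (resolved + overridden); 0 when nothing is closed
}

// Aggregate flag records into distribution-level metrics; no sensitive attributes involved.
function fairnessSummary(flags: FlagRow[], topN = 3): FairnessSummary {
  const counts = new Map<string, number>();
  for (const f of flags) {
    counts.set(f.term, (counts.get(f.term) ?? 0) + 1);
  }
  const topTerms = Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, topN)
    .map(([term, count]) => ({ term, count }));

  const closed = flags.filter((f) => f.status !== "flagged").length;
  const overridden = flags.filter((f) => f.status === "overridden").length;

  return {
    totalFlags: flags.length,
    topTerms,
    overrideRate: closed === 0 ? 0 : overridden / closed,
  };
}
```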

Architecture

A Turborepo monorepo with a deliberately split frontend and backend: a Next.js 16 App Router frontend (Vercel) that is a pure UI layer with no database access and no AI keys, and a NestJS backend (Fastify adapter) that owns all business logic, data, AI, queues, cron, secrets, and email. Shared Zod schemas in a common package validate both the React forms and the backend DTOs, and an OpenAPI spec generates the typed TanStack Query client the frontend consumes.

Data lives in Supabase Postgres across 22 tables - modeled with Drizzle ORM - with Row-Level Security on every user-data table as the last line of defense behind frontend middleware, CORS, a Supabase JWT guard, and role plus active-company guards. Background work runs on four BullMQ queues over Redis (match scoring, match-preview precompute, profile-score recompute, and a re-score batch), with ten scheduled cron jobs for lifecycle tasks like archiving expired jobs, expiring offers, and interview reminders.
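The layered checks in front of the database can be pictured as an ordered guard chain. This is a framework-free sketch with hypothetical names (`RequestContext`, `runGuards`, and the guard functions); it does not use the NestJS guard API, and Row-Level Security would still sit behind it in the database.

```typescript
type Role = "candidate" | "recruiter" | "admin";

interface RequestContext {
  userId: string | null; // null when the JWT is missing or invalid
  role: Role | null;
  companyIds: string[]; // companies the user is a member of
  activeCompanyId: string | null;
}

// A guard returns a denial reason, or null to let the request continue.
type Guard = (ctx: RequestContext) => string | null;

const requireAuth: Guard = (ctx) => (ctx.userId === null ? "unauthenticated" : null);

const requireRole =
  (role: Role): Guard =>
  (ctx) =>
    ctx.role === role ? null : "forbidden: wrong role";

const requireActiveCompany: Guard = (ctx) =>
  ctx.activeCompanyId !== null && ctx.companyIds.indexOf(ctx.activeCompanyId) !== -1
    ? null
    : "forbidden: not a member of the active company";

// Run guards in order and stop at the first denial.
function runGuards(ctx: RequestContext, guards: Guard[]): string | null {
  for (const guard of guards) {
    const denial = guard(ctx);
    if (denial !== null) return denial;
  }
  return null;
}
```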

A Socket.io gateway backed by a Redis adapter pushes real-time events - new notifications, application status changes, interviews, offers, and completed score recomputations - to authenticated user and company rooms. Sixteen React Email templates cover the transactional lifecycle, switching transport from Mailpit in development to Resend in production.

Key decisions

A handful of constraints shaped the whole system and kept it academically defensible.

No prompt without a JSON schema

Every OpenAI call uses structured outputs derived from Zod schemas. The model never returns free text the system has to guess at, which makes parse, score, and bias results validated and reproducible.
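To show what "no free text to guess at" buys, here is a dependency-free sketch of validating one component of a model response. The project uses Zod for this; the hand-written checks and the `parseComponent` name below are stand-ins so the example runs without any packages, and the field names are assumptions.

```typescript
interface ComponentResult {
  score: number;
  explanation: string;
  evidence: string[];
}

type ParseResult<T> = { ok: true; value: T } | { ok: false; error: string };

// Accept only a JSON object with an in-range score, an explanation, and 1-3 excerpts.
function parseComponent(raw: string, max: number): ParseResult<ComponentResult> {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, error: "response is not valid JSON" };
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    return { ok: false, error: "response is not a JSON object" };
  }
  const { score, explanation, evidence } = data as Record<string, unknown>;

  if (typeof score !== "number" || !Number.isFinite(score) || score < 0 || score > max) {
    return { ok: false, error: `score must be a number between 0 and ${max}` };
  }
  if (typeof explanation !== "string" || explanation.trim() === "") {
    return { ok: false, error: "explanation must be a non-empty string" };
  }
  if (
    !Array.isArray(evidence) ||
    evidence.length < 1 ||
    evidence.length > 3 ||
    !evidence.every((e) => typeof e === "string")
  ) {
    return { ok: false, error: "evidence must contain 1-3 string excerpts" };
  }
  return { ok: true, value: { score, explanation, evidence: evidence as string[] } };
}
```

Anything that fails validation is rejected outright instead of being partially interpreted, which is what keeps downstream results reproducible.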

No score without evidence

Every numeric component renders alongside verbatim resume excerpts and a plain-language explanation. A separate recruiter-safe excerpt variant scrubs names and company tokens so reviewers see skills-anchored quotes before interview completion.

No mutation without an audit row

Scores, overrides, status changes, suspensions, and deletes all write to an append-only audit log with actor, entity, IP, user agent, and JSONB details - the same trail a thesis examiner can replay a decision from.
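An in-memory sketch of the append-only idea follows. The `AuditLog` class and its field names are hypothetical and merely echo the columns listed above; in the actual system the append-only property would be enforced at the database level rather than by a class.

```typescript
interface AuditEntry {
  readonly seq: number;
  readonly at: string; // ISO timestamp
  readonly actorId: string;
  readonly action: string;
  readonly entityType: string;
  readonly entityId: string;
  readonly ip: string;
  readonly userAgent: string;
  readonly details: Readonly<Record<string, unknown>>; // stands in for the JSONB column
}

type NewAuditEntry = Omit<AuditEntry, "seq" | "at">;

class AuditLog {
  private readonly entries: AuditEntry[] = [];

  // The only write path: entries are frozen and never updated or removed.
  append(input: NewAuditEntry): AuditEntry {
    const entry: AuditEntry = Object.freeze({
      ...input,
      seq: this.entries.length + 1,
      at: new Date().toISOString(),
    });
    this.entries.push(entry);
    return entry;
  }

  // Return the ordered trail for one entity so a decision can be replayed.
  replay(entityType: string, entityId: string): AuditEntry[] {
    return this.entries.filter(
      (e) => e.entityType === entityType && e.entityId === entityId,
    );
  }
}
```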

No silent AI failure

Resume parsing falls back to manual entry, score computation degrades to a graceful 'temporarily unavailable' surface, and bias-detection failure logs a warning rather than blocking a publish - so the AI layer never traps the user.
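Two of these fallbacks can be sketched as thin wrappers around the AI calls. The function names and return shapes are assumptions; the callbacks stand in for the real scoring and bias-scan calls, which are not shown.

```typescript
type ScoreSurface =
  | { status: "ok"; score: number }
  | { status: "unavailable"; message: string };

// A failed scoring call becomes a renderable "unavailable" state, not an exception.
async function scoreOrUnavailable(compute: () => Promise<number>): Promise<ScoreSurface> {
  try {
    return { status: "ok", score: await compute() };
  } catch {
    return { status: "unavailable", message: "Score temporarily unavailable" };
  }
}

interface BiasScanOutcome {
  flags: string[];
  scanFailed: boolean;
}

// A failed bias scan is logged as a warning and reported, but does not block publishing.
async function scanBeforePublish(
  scan: () => Promise<string[]>,
  warn: (message: string) => void,
): Promise<BiasScanOutcome> {
  try {
    return { flags: await scan(), scanFailed: false };
  } catch (err) {
    warn(`bias scan failed, publish not blocked: ${String(err)}`);
    return { flags: [], scanFailed: true };
  }
}
```

Returning `scanFailed` explicitly, rather than an empty flag list alone, keeps a skipped scan distinguishable from a clean one.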

What it proves

AuraHire is live at aurahire.site and the source is public, so the claims here are inspectable rather than asserted. As a thesis system, its job was to defend one position end to end - that AI recruitment can be explainable and fair without hand-waving - and every layer is built to be replayed: open a score and you'll see its weights, its evidence, and the raw model output that produced it.

It's also proof that a single engineer can ship a genuinely production-shaped AI platform - multi-tenant, real-time, audited, and queue-backed - and keep it honest about its own limits. The hard part was never getting the model to produce a number; it was the discipline of refusing to let it do anything the system couldn't explain afterward.
