Recruiters were losing 4–6 hours per role just screening. That’s nearly half the week spent chasing names instead of actually talking to people. It was a business model that didn’t scale: winning more clients meant hiring more people, because there was no way to grow through efficiency alone.
Here’s how GoGloby broke that cycle by building a semantic ML pipeline—and the reality of what it took to get it right.
Achievements After Partnering With GoGloby
By automating the heaviest lifting of the recruitment funnel, the GoGloby partnership turned a manual, high-friction screening process into a precision operation. The shift slashed costs and timelines, standardized candidate evaluation, and freed the hiring team to focus on top-tier talent, all while delivering a 340% return on investment within 6 months.
| Metric | Result |
|---|---|
| Screening Time Per Role | ↓ 85% |
| Candidate Evaluation Consistency | 60% → 91% agreement |
| False Positive Rate on Shortlists | 35% → 12% |
| Time to Shortlist | ↓ 80% |
| Cost Per Candidate Scored | $3.20 → $0.02 |
| Job Postings Handled Simultaneously | 3x baseline |
| ROI | 340% within 6 months |
The Situation at a Glance
By replacing labor-intensive manual screening with a high-precision ML pipeline, the organization reclaimed nearly half of its recruiting capacity while maintaining a lean infrastructure footprint. This shift let the team take on three times its baseline job-posting volume without adding headcount, delivering a 340% return on investment in just 6 months.
| | |
|---|---|
| Industry | Talent Acquisition |
| Timeline | 6 months |
| Infrastructure cost | $400/month |
| Core problem | Manual candidate screening consuming 40% of recruiter time, with no path to scale |
| Solution | End-to-end semantic ML pipeline with configurable weighted scoring |
| ROI | 340% within 6 months |
The Problem
GoGloby’s traditional recruiting process relied entirely on human effort for candidate evaluation. Every new role required recruiters to spend 4–6 hours searching databases, reviewing profiles, and forming subjective assessments before a single meaningful conversation took place.
The process had four compounding failure modes.
1. Time Drain
Screening consumed 40% of total recruiter time. With multiple active roles running simultaneously, this became a constant bottleneck — the majority of productive hours spent on work that produced no direct signal.
2. Inconsistent Evaluation
Without a standardized scoring method, two recruiters evaluating the same candidate would agree only 60% of the time. Subjective judgment meant different clients received different quality levels depending on who handled their role.
3. No Path to Scale
Every additional job posting required proportional headcount. Growth was linear — and so was cost. Process automation sat below 5%. The business had a hard ceiling that no amount of hiring could move.
4. Missed Matches
Keyword searches couldn’t surface candidates whose profiles were semantically aligned but lexically different. A candidate experienced in “distributed systems” was invisible to a search for “high-availability backend architecture” — despite being exactly the right person for the role.
Engineering the Solution
The core insight was that candidate-job matching is fundamentally a semantic similarity problem. Two phrases can describe identical expertise in entirely different words, and no keyword filter catches that.
The GoGloby Applied AI Engineering team built an end-to-end ML pipeline that converts both candidate profiles and job requirements into dense vector representations, then ranks candidates by their geometric proximity in embedding space.
The pipeline runs on PostgreSQL with pgvector — no exotic infrastructure, no six-figure ML platform. The scoring engine processes hundreds of candidates per role in seconds, with configurable weights for technical skills, experience level, cultural fit, and location alignment depending on client priorities.
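To make that concrete, here is a minimal sketch of the weighted-similarity idea in plain Python. It assumes one embedding per criterion per profile; the criterion names, function names, and example weights are illustrative, not GoGloby’s production code.

```python
import numpy as np

# Illustrative criteria matching the four configurable weights described above.
CRITERIA = ("technical_skills", "experience_level", "cultural_fit", "location")

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def weighted_score(candidate: dict, job: dict, weights: dict) -> float:
    """Weighted sum of per-criterion cosine similarities, scaled by total weight."""
    total = sum(weights[c] for c in CRITERIA)
    return sum(weights[c] * cosine(candidate[c], job[c]) for c in CRITERIA) / total

# Example: a client that prioritizes technical skills over location.
weights = {"technical_skills": 0.5, "experience_level": 0.2,
           "cultural_fit": 0.2, "location": 0.1}
rng = np.random.default_rng(0)
candidate = {c: rng.standard_normal(3072) for c in CRITERIA}  # stand-in vectors
job = {c: rng.standard_normal(3072) for c in CRITERIA}
print(round(weighted_score(candidate, job, weights), 4))
```

Ranking is then just sorting candidates by this score; changing a client’s weights reorders the shortlist without regenerating a single embedding.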
The system was designed deliberately to augment recruiters, not replace their judgment. It surfaces ranked shortlists with score breakdowns so recruiters can immediately focus on relationship building and client context — the work that actually requires human intelligence.
The Processing Pipeline
Raw data from CRM systems, job boards, and document repositories flows through six sequential stages (an orchestration skeleton follows the list):
1. Raw Data Ingestion
PDFs, Word docs, plain text, HTML — multiple formats unified into a single processing queue via Apache Airflow.
2. Text Extraction & Cleaning
Mistral-7B normalizes and cleans unstructured text at 1,000 documents per hour with 94% extraction accuracy.
3. PII Obfuscation
Tokenization, masking, and differential privacy applied before any embedding generation. Compliance is built in, not retrofitted.
4. Embedding Generation
OpenAI text-embedding-3-large produces 3,072-dimensional vectors per candidate profile and job requirement.
5. Vector Storage
PostgreSQL with pgvector and IVFFlat indexing enables sub-linear nearest-neighbor search across 1M+ candidate profiles.
6. Weighted Scoring
Multi-dimensional cosine similarity with client-configurable weights across four criteria. Top candidates returned in 200ms.
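As a rough illustration of how these stages hang together, here is a skeleton DAG using Airflow’s TaskFlow API. The task bodies are placeholders (the real cleaning, masking, and embedding logic is only hinted at in comments), and all names are assumptions rather than the production pipeline.

```python
import re
from datetime import datetime

from airflow.decorators import dag, task

@dag(schedule="@hourly", start_date=datetime(2024, 1, 1), catchup=False)
def candidate_pipeline():
    @task
    def ingest() -> list[str]:
        # Stage 1: unify PDFs, Word docs, plain text, and HTML into one queue.
        return ["Jane Doe, jane@example.com, distributed systems engineer ..."]

    @task
    def clean(docs: list[str]) -> list[str]:
        # Stage 2: in production, Mistral-7B normalizes the raw text here.
        return [" ".join(d.split()) for d in docs]

    @task
    def mask_pii(docs: list[str]) -> list[str]:
        # Stage 3: obfuscate PII before anything reaches the embedding step.
        email = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
        return [email.sub("[EMAIL]", d) for d in docs]

    @task
    def embed_and_store(docs: list[str]) -> int:
        # Stages 4-5: generate embeddings and upsert into pgvector (sketched
        # in the next section). Stage 6 happens at query time, not in batch.
        return len(docs)

    embed_and_store(mask_pii(clean(ingest())))

candidate_pipeline()
```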
Key Engineering Decisions
Three decisions shaped the architecture’s performance and cost profile. Each involved a real trade-off.
1. Text Processing Model
The team evaluated three options for handling unstructured resume parsing at 1,000 documents per hour.
| Model | Parameters | Monthly Cost | Accuracy | Decision |
|---|---|---|---|---|
| Mistral-7B | 7B | $873/mo | 94% | ✓ Selected |
| Llama-3B | 3B | $678/mo | 81% | Rejected |
| DistilBERT | 66M | $333/mo | 73% | Rejected |
The cheaper options introduced unacceptable accuracy losses on domain-specific language — technical job titles, framework names, seniority signals. The additional $195/month over Llama-3B delivered a 13-point accuracy improvement that paid for itself many times over in reduced false positives downstream.
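As a sketch of what that cleaning step can look like, the snippet below runs a self-hosted Mistral-7B instruct checkpoint through Hugging Face transformers. The model revision and prompt wording are assumptions; the team’s actual serving setup and prompts aren’t public.

```python
from transformers import pipeline

# Assumed checkpoint; any Mistral-7B instruct variant fits the same pattern.
cleaner = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.3",
    device_map="auto",
)

PROMPT = (
    "Normalize the following resume into plain-text sections "
    "(skills, experience, education), preserving job titles and "
    "framework names exactly as written.\n\nResume:\n{resume}"
)

def clean_resume(raw_text: str) -> str:
    out = cleaner(
        PROMPT.format(resume=raw_text),
        max_new_tokens=1024,
        return_full_text=False,  # return only the generated cleanup
    )
    return out[0]["generated_text"]
```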
2. Embedding Model
| Model | Dimensions | Cost / 1M tokens | Decision |
|---|---|---|---|
| OpenAI text-embedding-3-large | 3,072 | $0.13 | ✓ Selected |
| Cohere embed-english-v3 | 1,024 | $0.10 | Rejected |
| Cohere embed-multilingual-v3 | 1,024 | $0.15 | Rejected |
| Google textembedding-gecko | 768 | $0.025 | Rejected |
At 3,072 dimensions, OpenAI’s model substantially outperformed the alternatives on professional terminology benchmarks: the nuance between “Staff Engineer” and “Principal Engineer”, or “React” versus “Next.js”, matters deeply in talent matching. The model’s 99.9% API uptime was also non-negotiable for production operations.
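In practice, stage 4 is a thin wrapper around the embeddings endpoint. A minimal sketch with the official openai Python client, with batching, retries, and rate-limit handling omitted:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(texts: list[str]) -> list[list[float]]:
    resp = client.embeddings.create(model="text-embedding-3-large", input=texts)
    return [item.embedding for item in resp.data]

vectors = embed(["Senior backend engineer: distributed systems, Go, Kafka"])
assert len(vectors[0]) == 3072  # the dimensionality used throughout the pipeline
```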
3. Database Architecture
The team evaluated purpose-built vector databases including Pinecone, Weaviate, and Qdrant. The operational overhead wasn’t justified at the scale required. PostgreSQL with pgvector provided a single system for both relational data and vector search — with IVFFlat indexing delivering sub-linear search performance across 1M+ candidate profiles. No additional infrastructure to secure, monitor, or maintain. This single decision saved an estimated $400–600/month in infrastructure cost and significant ongoing engineering complexity.
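Concretely, the whole vector layer reduces to ordinary SQL. The sketch below is illustrative rather than the production schema. One caveat worth flagging: pgvector’s indexes cap plain `vector` columns at 2,000 dimensions, so one workable pattern for 3,072-dimensional embeddings is the half-precision `halfvec` type added in pgvector 0.7.

```python
import psycopg  # psycopg 3; pgvector must be installed in the database

DDL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS candidates (
    id        bigserial PRIMARY KEY,
    profile   text NOT NULL,
    embedding halfvec(3072) NOT NULL  -- half precision keeps 3,072 dims indexable
);
CREATE INDEX IF NOT EXISTS candidates_embedding_idx
    ON candidates USING ivfflat (embedding halfvec_cosine_ops)
    WITH (lists = 1000);  -- IVFFlat: sub-linear search, small recall trade-off
"""

SHORTLIST = """
SELECT id, profile, 1 - (embedding <=> %s::halfvec) AS similarity
FROM candidates
ORDER BY embedding <=> %s::halfvec  -- <=> is cosine distance
LIMIT 20;
"""

def shortlist(conn: psycopg.Connection, job_vec: list[float]) -> list[tuple]:
    vec = "[" + ",".join(f"{x:.6f}" for x in job_vec) + "]"
    with conn.cursor() as cur:
        return cur.execute(SHORTLIST, (vec, vec)).fetchall()
```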
Results
| Metric | Before | After | Change |
|---|---|---|---|
| Screening time per role | 4–6 hours | 45–60 minutes | ↓ 85% |
| Candidate evaluation consistency | 60% agreement | 91% agreement | ↑ 52% |
| Time to shortlist | 2–3 days | 4–6 hours | ↓ 80% |
| False positive rate on shortlists | 35% | 12% | ↓ 66% |
| Cost per candidate scored | ~$3.20 (manual) | $0.02 | ↓ 99.4% |
| Monthly infrastructure spend | — | $400 | ROI: 340% in 6 mo |
| Job postings handled simultaneously | Baseline | 3× baseline | ↑ 200% |
Beyond the numbers: recruiters shifted focus from repetitive screening to high-value client relationship work. Non-obvious candidate matches — invisible to keyword search — surfaced consistently. And the clustering layer produced talent market reports as a new product line, with zero additional ML infrastructure required.
What We’d Tell Engineers Starting This
- Don’t over-engineer your vector store: Purpose-built vector databases add operational overhead that isn’t justified until you’re well past 10 million vectors. PostgreSQL + pgvector handles millions of candidate profiles with IVFFlat indexing — and keeps your stack simple and your security surface small.
- Accuracy beats cost on foundation models: The temptation to use the cheapest text processing model is strong. Don’t. A 13-point accuracy drop on a $195/month saving creates far more expensive downstream problems — bad shortlists, recruiter rework, client trust damage. Measure total cost of ownership, not just API cost.
- PII handling is an architectural decision, not an afterthought: Bake obfuscation into the pipeline before embeddings — not after. Sensitive data never touches the vector store. Compliance is far easier to maintain than to retrofit, and far cheaper than a breach.
- Configurable weights unlock product flexibility: Different clients weight criteria differently. Making weights client-configurable turned the scoring engine from an internal tool into a product surface. Every new client configuration is a data point on what the market values.
- Clustering is a second product hiding in your embeddings: Once you have embeddings for thousands of candidates, K-means clustering surfaces talent pool intelligence at near-zero marginal cost. The ML infrastructure was already there — the team productized it as client-facing market reports (a minimal sketch follows this list).
- Design for augmentation, not automation: The system ranks candidates, but it doesn’t hire them. Recruiters still bring irreplaceable context about client culture, role nuance, and fit signals no embedding captures. Designing for augmentation rather than replacement drove adoption and trust.
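To illustrate that clustering point, here is a minimal sketch: K-means over stored candidate embeddings, with the cluster count and the random stand-in data as assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def talent_pools(embeddings: np.ndarray, k: int = 12):
    """Group candidate embeddings into k talent pools."""
    km = KMeans(n_clusters=k, n_init="auto", random_state=42).fit(embeddings)
    return km.labels_, km.cluster_centers_

# Stand-in data: 5,000 candidates with 3,072-dimensional embeddings.
labels, centers = talent_pools(np.random.rand(5000, 3072).astype(np.float32))
print(np.bincount(labels))  # pool sizes feed straight into market reports
```

Each centroid can itself be compared against job embeddings, which is why the market reports cost almost nothing: the vectors and the similarity machinery already exist.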