GoGloby built a Slack-based AI recruiting agent for internal use — then productized it, onboarded 45 organizations, and turned it into a standalone SaaS revenue stream with 94% monthly user retention.
Summary
Recruiting teams were hitting a coordination ceiling: candidate review, scheduling, and status tracking were consuming the majority of recruiter time, spread across multiple tools with no unified workflow.
To address this, GoGloby built a Slack-native AI recruiting agent that operates directly inside existing team communication channels — handling candidate review, shortlist generation, scheduling coordination, and status updates without requiring context switching.
Powered by GPT-4-turbo, a vector-based candidate search system, and a real-time Slack event architecture, the solution delivers sub-2-second responses, reduces operational overhead, and scales across organizations with minimal onboarding friction.
| Service | Agent Building |
|---|---|
| Deployment | Slack-native |
| Infrastructure cost | $600 / month |
| Active users | 1,200+ across 45 organizations |
| Core problem | Fragmented recruiting workflows and scheduling overhead |
| Solution | Slack-native AI agent integrated into existing workflows |
Key Metrics
| Metric | Result |
|---|---|
| Scheduling coordination effort | ↓ 75% |
| Recruiter productivity | ↑ 100% (2× baseline) |
| Monthly user retention | 94% |
| Daily conversations | 3,500+ |
| Average user satisfaction | 9.1 / 10 |
| Infrastructure cost per user | $0.49 |
| Error rate | 0.3% (vs. 1% target) |
| API response time | 1.8 seconds (vs. 3s target) |
The Situation
Recruiting teams were spending the majority of their time on operational work: reviewing candidates, coordinating interviews, and managing information across disconnected tools.
A team handling multiple open roles could spend 30–40 hours per week on tasks that did not directly contribute to hiring decisions, work that merely prepared information for later use.
As hiring volume increased, the bottleneck became operational rather than strategic.
This is how GoGloby’s Applied AI Engineering team replaced that workflow fragmentation with a Slack-native AI agent.
What the Recruiting Process Was Dealing With
The problem wasn’t a shortage of candidates; it was the cost of processing them across disconnected systems.
1. Candidate Review Overhead
Reviewing profiles, scoring candidates, and writing summaries consumed a significant portion of recruiter time, limiting the number of candidates that could be processed.
2. Scheduling Bottleneck
Each interview required 6–8 back-and-forth messages to coordinate. At scale, this created hours of repetitive work with no impact on hiring quality.
3. Context Lost Between Tools
Candidate information lived across ATS, email, spreadsheets, and Slack. Recruiters constantly switched tools, and decisions were often made with incomplete context.
4. No Tool Where Work Actually Happened
Most recruiting tools required leaving Slack, where teams already operated. This created friction and low adoption.
The Engineering Challenge
Building a Slack-native AI agent that operates inside real recruiting workflows is fundamentally different from building a standalone tool.
- Sub-3-second response time: anything slower breaks conversational flow; this ceiling was a hard constraint in model selection.
- Zero context switching: All functionality must live inside Slack.
- High reliability: System must operate within production workflows.
- Scalability across organizations: Multi-tenant support with consistent performance.
- Low infrastructure cost: Must remain viable at scale.
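A hard latency ceiling like this is typically enforced at the call site rather than hoped for. A minimal Node.js sketch of the idea (function names and the fallback message are ours, illustrative only, not GoGloby's actual code):

```javascript
// Race an async call against a deadline so a slow model response
// fails fast instead of stalling the Slack thread.
function withDeadline(promise, ms, fallback) {
  let timer;
  const deadline = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// Simulated slow model call (300 ms) against a 100 ms deadline.
const slowCall = new Promise((resolve) =>
  setTimeout(() => resolve("ranked shortlist"), 300)
);

withDeadline(slowCall, 100, "Still working on it...").then((reply) =>
  console.log(reply) // "Still working on it..."
);
```

The same wrapper works for any downstream dependency (model API, calendar API, database), which is what makes a global response-time budget enforceable.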
Technology Stack Decisions
A Slack-native, real-time AI agent built around GPT-4-turbo, vector-based search, and a modular backend architecture — optimized for latency, reliability, and seamless workflow integration.
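The vector-based candidate search reduces to nearest-neighbour ranking over embeddings. In this system that runs inside PostgreSQL via pgvector, but the ranking logic itself can be sketched in plain Node.js (the embedding values below are made up for illustration):

```javascript
// Cosine similarity between two embedding vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank candidates by similarity to the role's embedding.
function rankCandidates(roleVec, candidates) {
  return candidates
    .map((c) => ({ ...c, score: cosine(roleVec, c.vec) }))
    .sort((x, y) => y.score - x.score);
}

const role = [0.9, 0.1, 0.3];
const ranked = rankCandidates(role, [
  { name: "A", vec: [0.1, 0.9, 0.2] },
  { name: "B", vec: [0.8, 0.2, 0.3] },
]);
console.log(ranked[0].name); // "B"
```

In pgvector the equivalent ranking is typically expressed as `ORDER BY embedding <=> $query LIMIT k`, where `<=>` is the cosine-distance operator, so the sort happens inside the database rather than in application code.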
Model Selection: Open-Source vs API
| Model | Response Time | Quality Score | Cost | Status |
|---|---|---|---|---|
| Llama 2 (7B) | 8–12 sec | 7.2 / 10 | $1,200 / mo | Rejected |
| Llama 2 (13B) | 15–20 sec | 8.1 / 10 | $2,100 / mo | Rejected |
| Mistral (7B) | 6–10 sec | 7.8 / 10 | $1,000 / mo | Rejected |
| Falcon (7B) | 10–15 sec | 6.9 / 10 | $1,100 / mo | Rejected |
| Gemma (2B) | 4–6 sec | 6.5 / 10 | $600 / mo | Rejected |
| GPT-4-turbo | 1.2–2.1 sec | 9.1 / 10 | $0.010 / 1K tokens | Selected |
Why GPT-4-turbo
- 99.9% API uptime enabled production SLA commitments.
- Consistent sub-2-second responses under load.
- Strong performance on multi-turn recruiting conversations.
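Even at 99.9% uptime, individual API calls occasionally fail, so a production SLA usually layers a short retry with exponential backoff on top of the provider's reliability. A hedged sketch (attempt counts and delays are illustrative assumptions, not GoGloby's actual values):

```javascript
// Retry a flaky async call with exponential backoff between attempts.
async function withRetry(fn, attempts = 3, baseDelayMs = 200) {
  let lastErr;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (i < attempts - 1) {
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastErr;
}

// Example: a call that fails once, then succeeds on the second attempt.
let tries = 0;
withRetry(async () => {
  tries++;
  if (tries < 2) throw new Error("transient");
  return "ok";
}, 3, 10).then((v) => console.log(v, "after", tries, "attempts")); // "ok after 2 attempts"
```

Combined with a per-request deadline, this keeps transient API errors invisible to the user without letting retries blow the response-time budget.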
System Architecture
A modular Slack-native architecture optimized for real-time interaction:
| Layer | Component | Technology | Responsibility |
|---|---|---|---|
| 1 | Entry | Node.js + Slack API | Event handling and routing |
| 2 | Routing | Custom middleware | Intent classification and dispatch |
| 3 | Context | PostgreSQL + Redis | Conversation state and caching |
| 4 | AI Engine | GPT-4-turbo | Response generation |
| 5 | Search | pgvector | Candidate ranking and retrieval |
| 6 | Scheduling | Calendar APIs | Interview coordination |
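Layer 2's intent classification and dispatch can be sketched as a simple pattern-to-handler routing table, with unmatched messages falling through to the LLM. The intents and patterns below are illustrative assumptions, not the production rules:

```javascript
// Map incoming Slack message text to an intent by first matching pattern.
const routes = [
  { intent: "shortlist", pattern: /shortlist|top candidates|rank/i },
  { intent: "schedule", pattern: /schedule|interview|calendar/i },
  { intent: "status", pattern: /status|update/i },
];

function classify(text) {
  const match = routes.find((r) => r.pattern.test(text));
  return match ? match.intent : "fallback"; // fallback routes to the LLM directly
}

console.log(classify("Can you schedule an interview with Dana?")); // "schedule"
```

Routing cheap, deterministic intents before the model call is one way the architecture keeps both latency and per-message API cost down.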
Cost Structure
| Component | Monthly | Cost per User |
|---|---|---|
| GPT-4-turbo API | $280 | $0.23 |
| PostgreSQL + Redis | $160 | $0.13 |
| AWS Compute | $120 | $0.10 |
| Monitoring | $40 | $0.03 |
| TOTAL | $600 | $0.49 |
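The per-user figure follows directly from the component totals. A quick sanity check (the exact user count isn't published; at exactly 1,200 users the total rounds to $0.50 per user, so the table's $0.49 implies slightly more than 1,200 active users):

```javascript
// Monthly infrastructure components from the cost table above.
const components = { api: 280, db: 160, compute: 120, monitoring: 40 };
const total = Object.values(components).reduce((a, b) => a + b, 0);

const users = 1200; // "1,200+" in the case study; exact count assumed
console.log(total);                      // 600
console.log((total / users).toFixed(2)); // "0.50"
```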
Results
A measurable improvement across operational efficiency, system performance, and user adoption.
| Metric | Before | After | Change |
|---|---|---|---|
| Candidate review time | 15 min | 3.5 min | ↓ 77% |
| Scheduling messages | 6–8 | 1–2 | ↓ 75% |
| Recruiter productivity | Baseline | 2× | ↑ 100% |
| Infrastructure cost | — | $600 / mo | — |
| Error rate | — | 0.3% | vs 1% target |
| Response time | — | 1.8s | vs 3s target |
What Clients Say
Clients highlight two consistent outcomes: a significant reduction in coordination overhead and improved hiring efficiency. Teams spend less time on scheduling and candidate management while benefiting from faster shortlisting and better visibility into candidate information.
> “Our time-to-hire dropped by more than half and my recruiters stopped drowning in coordination work.”
>
> — Co-Founder, Growth-Stage SaaS Startup

> “Within two weeks it was doing the heavy lifting — surfacing ranked candidates, handling scheduling, keeping our team in context.”
>
> — Founder & CEO, B2B SaaS Company
Engineering Lessons
Practical lessons emerged from building and operating the system:
- Infrastructure cost is an architecture problem: Efficient design enabled scaling to 1,200 users at $600/month.
- Latency is the product: Response time defines user experience in conversational systems.
- Meet users in their workflow: Adoption increases when the product lives where work already happens.
- Internal tools can become product: The system evolved from internal use to a SaaS offering with 45 organizations.
- Continuous learning compounds: feedback from production usage continuously improves system performance.