AI Engineering · Production

AI-Powered People Intelligence Platform

OSINT-grade people search, synthesised in under 15 seconds

A people-search API that aggregates public-web signals from Google, Bing and DuckDuckGo, fuses them through a proprietary clustering engine, and synthesises a structured report with Google Gemini. API requests are authenticated with HMAC-SHA256.

FastAPI · Playwright · Google Gemini 2.5 · MongoDB · Motor · Pydantic v2 · HMAC-SHA256 · Docker

Problem Statement

Background-screening, talent-acquisition and due-diligence teams battle three compounding problems: fragmented information, ambiguous common-name results, and manual OSINT that takes hours when decisions need minutes.

  • Relevant information is scattered across dozens of platforms with no unified view.
  • Common names return mixed results, causing costly mis-identification.
  • Manual OSINT research takes 3–5 hours per profile.

Headline Outcomes

  • Research time per profile: < 15 s, down from 3–5 hours (~99% faster)
  • Cost per report: < $0.05, down from $75–$150 (>99.9% saved)
  • Name disambiguation accuracy: 90%+, up from ~60% (+50%)

The Solution

A modular, asynchronous microservice with two execution paths: a fast path when an AI Overview gives high confidence, and a deep-search path that pools parallel dork queries across three engines before clustering and ranking.
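
The choice between the two execution paths could be sketched roughly as below. This is an illustration under stated assumptions, not the platform's actual code: the `0.8` threshold, the `Overview` container, and the `fast_path` / `deep_search` stubs are all hypothetical names introduced here.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional

# Hypothetical cut-off; the real confidence threshold is not stated in the source.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Overview:
    """Illustrative container for an AI Overview and its confidence score."""
    summary: str
    confidence: float  # expected range: 0.0 to 1.0


async def fast_path(overview: Overview) -> dict:
    # Stub: build the report directly from the high-confidence overview.
    return {"path": "fast", "summary": overview.summary}


async def deep_search(name: str) -> dict:
    # Stub: stands in for parallel dork queries, clustering and ranking.
    await asyncio.sleep(0)
    return {"path": "deep", "summary": f"aggregated results for {name}"}


async def build_report(name: str, overview: Optional[Overview]) -> dict:
    """Take the fast path only when an overview exists and is confident enough."""
    if overview is not None and overview.confidence >= CONFIDENCE_THRESHOLD:
        return await fast_path(overview)
    return await deep_search(name)
```

Keeping the decision in one small function means the threshold can be tuned without touching either path.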

Parallel async scraping across Google, Bing and DuckDuckGo with Playwright + reCAPTCHA solving.
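
A minimal sketch of the fan-out pattern, assuming one coroutine per engine. The `search_engine` stub stands in for the Playwright-driven scrape and returns placeholder data; its name, signature and result shape are assumptions made for illustration.

```python
import asyncio

ENGINES = ("google", "bing", "duckduckgo")


async def search_engine(engine: str, query: str) -> list[dict]:
    # Stub standing in for a Playwright-driven scrape of one engine.
    await asyncio.sleep(0.01)  # simulate network latency
    return [{"engine": engine, "query": query, "url": f"https://example.com/{engine}"}]


async def search_all(query: str) -> list[dict]:
    """Run one query against every engine concurrently and pool the hits."""
    tasks = [search_engine(engine, query) for engine in ENGINES]
    # return_exceptions=True keeps one failing engine from sinking the batch.
    outcomes = await asyncio.gather(*tasks, return_exceptions=True)
    pooled: list[dict] = []
    for outcome in outcomes:
        if isinstance(outcome, Exception):
            continue  # a real service would log the failure here
        pooled.extend(outcome)
    return pooled
```

Because the three scrapes wait on the network at the same time, total latency is close to that of the slowest engine instead of the sum of all three.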

Custom fuzzy-matching engine clusters results by username, domain and snippet similarity (90%+ precision).
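
The proprietary engine is not published, so the following is only a stand-in showing how username, domain and snippet similarity might be combined. It uses the standard library's `difflib.SequenceMatcher`; the `0.8` threshold, the greedy grouping strategy and the `url` / `snippet` result fields are assumptions.

```python
from difflib import SequenceMatcher
from urllib.parse import urlparse

SNIPPET_THRESHOLD = 0.8  # hypothetical similarity cut-off


def profile_key(url: str) -> tuple[str, str]:
    """Return (domain, last path segment), a rough proxy for site and username."""
    parsed = urlparse(url)
    domain = parsed.netloc.lower().removeprefix("www.")
    segments = [s for s in parsed.path.split("/") if s]
    username = segments[-1].lower() if segments else ""
    return domain, username


def same_person(a: dict, b: dict) -> bool:
    dom_a, user_a = profile_key(a["url"])
    dom_b, user_b = profile_key(b["url"])
    if user_a and user_a == user_b:
        return True  # same handle, on the same or a different site
    if dom_a == dom_b and user_a != user_b:
        return False  # two distinct accounts on one site
    ratio = SequenceMatcher(None, a["snippet"].lower(), b["snippet"].lower()).ratio()
    return ratio >= SNIPPET_THRESHOLD


def cluster(results: list[dict]) -> list[list[dict]]:
    """Greedy grouping: join the first cluster containing a matching member."""
    clusters: list[list[dict]] = []
    for result in results:
        for group in clusters:
            if any(same_person(result, member) for member in group):
                group.append(result)
                break
        else:
            clusters.append([result])
    return clusters
```

With this sketch, a LinkedIn and a GitHub result sharing the handle `janedoe` land in one cluster, while a second LinkedIn account with a different handle starts its own.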

Weighted profile ranking: Name 30% · Location 25% · Title 20% · Company 15% · Sources 10%.
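
The weights above translate directly into a weighted sum. In this sketch the weights come from the text, while the per-field match scores in the range 0 to 1, the `signals` dictionary and the function names are assumptions.

```python
# Weights as listed in the text; they sum to 1.0.
WEIGHTS = {
    "name": 0.30,
    "location": 0.25,
    "title": 0.20,
    "company": 0.15,
    "sources": 0.10,
}


def rank_score(signals: dict[str, float]) -> float:
    """Combine per-field match scores (clamped to [0, 1]) into one weighted score."""
    total = 0.0
    for field, weight in WEIGHTS.items():
        value = min(max(signals.get(field, 0.0), 0.0), 1.0)  # missing field counts as 0
        total += weight * value
    return total


def rank_profiles(profiles: list[dict]) -> list[dict]:
    """Sort candidate profiles from best to worst match."""
    return sorted(profiles, key=lambda p: rank_score(p["signals"]), reverse=True)
```

A candidate that matches on name alone can score at most 0.30, so agreement on location or title is needed before a profile rises to the top.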

Gemini synthesises a structured report + FAQ; a multi-turn chatbot layer answers follow-ups over MongoDB sessions.

System Architecture

How the data flows

01. Client Request: HMAC verified, 202 queued
02. AI Overview Check: Gemini confidence eval
03. Deep Search: Parallel dorks ×3 engines
04. Cluster & Rank: Fuzzy-match scoring
05. AI Report: Gemini structured output
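
The HMAC check in step 01 might look like the sketch below, using only Python's standard `hmac` and `hashlib` modules. The signed message layout (`<timestamp>.<body>`), the 300-second replay window and the function names are assumptions; the source states only that requests are verified with HMAC-SHA256.

```python
import hashlib
import hmac
import time
from typing import Optional

MAX_SKEW_SECONDS = 300  # hypothetical replay window


def sign(secret: bytes, timestamp: str, body: bytes) -> str:
    """Hex HMAC-SHA256 over "<timestamp>.<body>" (an assumed message layout)."""
    message = timestamp.encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, timestamp: str, body: bytes, signature: str,
           now: Optional[float] = None) -> bool:
    """Reject stale or tampered requests before any work is queued."""
    now = time.time() if now is None else now
    try:
        sent_at = float(timestamp)
    except ValueError:
        return False
    if abs(now - sent_at) > MAX_SKEW_SECONDS:
        return False
    expected = sign(secret, timestamp, body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

In a request handler, a check like this would run first; only a verified request would then be queued and answered with `202`.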

Result 01

Cut manual OSINT from hours to seconds for recruiters, compliance and journalists.

Result 02

3× search throughput via parallel async engines vs. sequential lookups.

Result 03

Conversational follow-up via persistent chat sessions: a qualitatively superior UX.
