AI Engineering · Production

AI-Powered People Intelligence Platform

OSINT-grade people search, synthesised in under 15 seconds

A people-search API that aggregates public-web signals from Google, Bing and DuckDuckGo, fuses them through a custom clustering engine, and synthesises a structured report with Google Gemini. Requests are authenticated with HMAC-SHA256 signatures.
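As a rough illustration of HMAC-SHA256 request signing, the sketch below uses only the Python standard library. The secret, the `<timestamp>.<body>` message layout, the five-minute replay window and the `sign`/`verify` helper names are all assumptions for illustration; the platform's actual signing scheme is not described here.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical shared secret; a real deployment would load it from configuration.
SECRET = b"example-shared-secret"
MAX_SKEW_SECONDS = 300  # assumed replay window, not taken from the case study


def sign(body: bytes, timestamp: int, secret: bytes = SECRET) -> str:
    """Return the hex HMAC-SHA256 of '<timestamp>.<body>' (assumed layout)."""
    message = str(timestamp).encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(body: bytes, timestamp: int, signature: str,
           secret: bytes = SECRET, now: Optional[float] = None) -> bool:
    """Reject stale requests, then compare signatures in constant time."""
    current = time.time() if now is None else now
    if abs(current - timestamp) > MAX_SKEW_SECONDS:
        return False
    expected = sign(body, timestamp, secret)
    return hmac.compare_digest(expected, signature)
```

A client would send the timestamp and signature alongside the body (for example in headers), and the server would call `verify` before queuing the job. Binding the timestamp into the signed message is one common way to limit replay of captured requests.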

FastAPI · Playwright · Google Gemini 2.5 · MongoDB · Motor · Pydantic v2 · HMAC-SHA256 · Docker

Problem Statement

Background-screening, talent-acquisition and due-diligence teams battle three compounding problems: fragmented information, ambiguous common-name results, and manual OSINT that takes hours when decisions need minutes.

Relevant information is scattered across dozens of platforms with no unified view.
Common names return mixed results, causing costly mis-identification.
Manual OSINT research takes 3–5 hours per profile.

Headline Outcomes

Research time per profile: < 15 s vs. 3–5 hours (~99% faster)

Cost per report: < $0.05 vs. $75–$150 (> 99.9% saved)

Name disambiguation accuracy: 90%+ vs. ~60% (+50%)

The Solution

An asynchronous microservice with two execution paths. A fast path runs when an AI Overview gives high confidence; a deep-search path pools parallel dork queries across three engines, then clusters and ranks the results.
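The two execution paths can be sketched with plain `asyncio`. This is a minimal illustration, not the production code: `fetch` is a stand-in for the Playwright-driven scrape, and the `0.8` confidence threshold, the `Hit` type and the example dork strings are assumptions.

```python
import asyncio
from dataclasses import dataclass

ENGINES = ("google", "bing", "duckduckgo")
CONFIDENCE_THRESHOLD = 0.8  # assumed cut-off; the case study does not state one


@dataclass
class Hit:
    engine: str
    query: str
    url: str


async def fetch(engine: str, query: str) -> list:
    """Stand-in for scraping one engine; returns a single placeholder hit."""
    await asyncio.sleep(0)  # yield control, as real network I/O would
    return [Hit(engine, query, f"https://example.com/{engine}")]


async def deep_search(queries: list) -> list:
    """Run every (engine, query) pair concurrently and pool the hits."""
    tasks = [fetch(engine, query) for engine in ENGINES for query in queries]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    pooled = []
    for result in results:
        if isinstance(result, BaseException):
            continue  # one failed scrape should not sink the whole search
        pooled.extend(result)
    return pooled


async def search(name: str, overview_confidence: float) -> dict:
    """Take the fast path on high confidence, otherwise fan out."""
    if overview_confidence >= CONFIDENCE_THRESHOLD:
        return {"path": "fast", "hits": []}
    queries = [f'"{name}"', f'"{name}" site:linkedin.com']  # example dorks
    return {"path": "deep", "hits": await deep_search(queries)}
```

Calling `asyncio.run(search("Jane Doe", 0.2))` takes the deep path and issues all engine and query combinations at once, so total latency is bounded by the slowest scrape rather than the sum of all of them.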

Parallel async scraping across Google, Bing and DuckDuckGo with Playwright + reCAPTCHA solving.

Custom fuzzy-matching engine clusters results by username, domain and snippet similarity (90%+ precision).

Weighted profile ranking: Name 30% · Location 25% · Title 20% · Company 15% · Sources 10%.

Gemini synthesises a structured report and FAQ; a multi-turn chatbot layer answers follow-ups over MongoDB sessions.
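The weighted profile ranking listed above can be sketched as a weighted sum of per-field scores. The weights come from the case study; everything else is an assumption for illustration: the use of `difflib.SequenceMatcher` for field similarity, the `source_count` field, and the cap of five sources at which the source score saturates.

```python
from difflib import SequenceMatcher

# Weights as stated in the case study; they sum to 1.0.
WEIGHTS = {"name": 0.30, "location": 0.25, "title": 0.20,
           "company": 0.15, "sources": 0.10}
MAX_SOURCES = 5  # assumed saturation point for the source-count signal


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; empty fields score 0."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def score(query: dict, candidate: dict) -> float:
    """Weighted sum of field similarities plus a capped source-count term."""
    parts = {
        field: similarity(query.get(field, ""), candidate.get(field, ""))
        for field in ("name", "location", "title", "company")
    }
    sources = min(candidate.get("source_count", 0), MAX_SOURCES)
    parts["sources"] = sources / MAX_SOURCES
    return sum(WEIGHTS[field] * value for field, value in parts.items())


def rank(query: dict, candidates: list) -> list:
    """Return candidates ordered from best to worst match."""
    return sorted(candidates, key=lambda c: score(query, c), reverse=True)
```

In this sketch a field missing from the query contributes nothing to the score, so a query that supplies only a name can reach at most 0.40 (name plus sources). A production scorer might instead renormalise the weights over the fields actually provided.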

System Architecture

How the data flows

Client Request: HMAC verified, 202 queued

AI Overview Check: Gemini confidence eval

Deep Search: Parallel dorks ×3 engines

Cluster & Rank: Fuzzy-match scoring

AI Report: Gemini structured output
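For the Cluster & Rank step in the flow above, clustering by username, domain and snippet similarity might look like the sketch below. The linking rules, the `0.75` snippet threshold, the last-path-segment username heuristic and the use of `difflib` with a union-find structure are assumptions chosen for illustration, not the engine's actual logic.

```python
from difflib import SequenceMatcher
from urllib.parse import urlparse

SNIPPET_THRESHOLD = 0.75  # assumed similarity cut-off


def username(url: str) -> str:
    """Last non-empty path segment, lower-cased (e.g. 'janedoe')."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return segments[-1].lower() if segments else ""


def domain(url: str) -> str:
    """Host name without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def same_person(a: dict, b: dict) -> bool:
    """Assumed rules: shared handle links; two handles on one site do not."""
    ua, ub = username(a["url"]), username(b["url"])
    if ua and ua == ub:
        return True
    if domain(a["url"]) == domain(b["url"]) and ua != ub:
        return False  # two distinct accounts on the same platform
    ratio = SequenceMatcher(None, a["snippet"].lower(), b["snippet"].lower()).ratio()
    return ratio >= SNIPPET_THRESHOLD


def cluster(results: list) -> list:
    """Group results into clusters via union-find over pairwise links."""
    parent = list(range(len(results)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(results)):
        for j in range(i + 1, len(results)):
            if same_person(results[i], results[j]):
                parent[find(j)] = find(i)

    groups = {}
    for i, result in enumerate(results):
        groups.setdefault(find(i), []).append(result)
    return list(groups.values())
```

The pairwise comparison is quadratic in the number of results, which is acceptable for the few dozen hits a single search typically pools but would need blocking or indexing at larger scale.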

Result 01

Cut manual OSINT from hours to seconds for recruiters, compliance and journalists.

Result 02

3× search throughput via parallel async engines vs. sequential lookups.

Result 03

Conversational follow-up via persistent chat sessions, so users refine results without re-running a search.
