
Concepts

Platform overview, module architecture, and how modules compose



What is KLOAKD?

KLOAKD is web intelligence infrastructure - a set of 7 composable modules that handle every layer of autonomous web data extraction:

| Module | Role |
|--------|------|
| Evadr | Detect and bypass anti-bot systems (Cloudflare, Akamai, DataDome, Imperva) |
| Webgrph | Map site structure, crawl pages, and build hierarchy trees |
| Skanyr | Discover hidden API endpoints from JS bundles and network activity |
| Nexus | Cognitive strategy engine: analyze a site and decide the optimal extraction approach |
| Parlyr | Conversational NLP: turn natural language prompts into extraction plans |
| Fetchyr | RPA and authentication: handle login flows, MFA, form automation |
| Kolektr | Extract structured data using schemas, CSS selectors, or AI |

Artifact chaining

The key to KLOAKD's efficiency is artifact reuse. When Evadr fetches a page, it stores the result as an artifact. Every subsequent module can reference that artifact instead of re-fetching.

Evadr.fetch(url)                              → artifact_id: "art-abc123"
Webgrph.crawl(url, artifact_id="art-abc123")  # reuses fetch
Kolektr.page(url, artifact_id="art-abc123")   # reuses fetch

Zero redundant HTTP requests. Zero redundant anti-bot bypass attempts.
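To make the reuse pattern concrete, here is a minimal, self-contained Python sketch. It is not the KLOAKD SDK: `ArtifactStore`, `fetch`, `load`, and `fetch_count` are hypothetical names invented for illustration, and the "fetch" is simulated in memory. It only shows why passing an artifact ID avoids a second request.

```python
from itertools import count
from typing import Optional


class ArtifactStore:
    """Hypothetical in-memory stand-in for artifact storage (not the real SDK)."""

    def __init__(self) -> None:
        self._pages = {}          # artifact_id -> stored page content
        self._ids = count(1)      # simple ID generator for the sketch
        self.fetch_count = 0      # how many simulated network fetches happened

    def fetch(self, url: str) -> str:
        """Simulate one network fetch and return a new artifact ID."""
        self.fetch_count += 1
        artifact_id = f"art-{next(self._ids):06d}"
        self._pages[artifact_id] = f"<html>content of {url}</html>"
        return artifact_id

    def load(self, url: str, artifact_id: Optional[str] = None) -> str:
        """Reuse the stored page when an artifact ID is given, else fetch."""
        if artifact_id is None:
            artifact_id = self.fetch(url)
        return self._pages[artifact_id]


store = ArtifactStore()
url = "https://example.com"
art = store.fetch(url)                     # the only simulated request
crawl_input = store.load(url, artifact_id=art)
extract_input = store.load(url, artifact_id=art)
```

In this sketch both later calls read the same stored page, so `store.fetch_count` stays at 1 no matter how many modules consume the artifact.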

Request lifecycle

User request
  → Evadr     (anti-bot bypass, 5-tier escalation)
  → Nexus     (strategy: which modules to use, in what order)
  → Webgrph   (site map, if needed)
  → Skanyr    (API discovery, if needed)
  → Fetchyr   (auth session, if needed)
  → Kolektr   (extraction)

You choose which modules to run, or you can let Nexus make that decision for you.
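The ordering above can be sketched as a small planning function. This is an illustration only: `plan_pipeline` and its flags are hypothetical, and the sketch encodes just the stage order and the "if needed" conditions from the diagram, not how Nexus actually decides.

```python
def plan_pipeline(site_map: bool = False,
                  api_discovery: bool = False,
                  auth: bool = False) -> list:
    """Return module names in lifecycle order, skipping optional stages."""
    stages = ["Evadr", "Nexus"]      # always first in the diagram
    if site_map:
        stages.append("Webgrph")     # site map, if needed
    if api_discovery:
        stages.append("Skanyr")      # API discovery, if needed
    if auth:
        stages.append("Fetchyr")     # auth session, if needed
    stages.append("Kolektr")         # extraction always runs last
    return stages


minimal = plan_pipeline()
with_login = plan_pipeline(auth=True)
```

Here `minimal` contains only Evadr, Nexus, and Kolektr, while `with_login` inserts Fetchyr before the extraction step.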

Error model

All errors inherit from a base KloakdError class:

| Error | HTTP | When |
|-------|------|------|
| AuthenticationError | 401 | Invalid API key |
| ForbiddenError | 403 | Organization ID mismatch (IDOR protection) |
| NotEntitledError | 403 | Plan doesn't include this module |
| RateLimitError | 429 | Quota exceeded (includes retry_after) |
| UpstreamError | 502 | Target site is unreachable |
| ApiError | 4xx/5xx | Other server errors |
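Because every error shares one base class, callers can handle specific cases first and fall back to the base. The Python sketch below defines local stand-in classes with the names from the table; the constructor signatures, the `status` attribute, and the `describe` helper are assumptions made for illustration and may not match the real SDK.

```python
class KloakdError(Exception):
    """Stand-in base class; the constructor shape is assumed."""

    def __init__(self, message: str, status=None):
        super().__init__(message)
        self.status = status


class AuthenticationError(KloakdError): pass   # 401
class ForbiddenError(KloakdError): pass        # 403
class NotEntitledError(KloakdError): pass      # 403
class UpstreamError(KloakdError): pass         # 502
class ApiError(KloakdError): pass              # other 4xx/5xx


class RateLimitError(KloakdError):             # 429
    def __init__(self, message: str, retry_after=None):
        super().__init__(message, status=429)
        self.retry_after = retry_after         # seconds to wait, if provided


def describe(exc: Exception) -> str:
    """Check the most specific class first, then fall back to the base."""
    if isinstance(exc, RateLimitError):
        return f"rate limited; retry after {exc.retry_after}s"
    if isinstance(exc, KloakdError):
        return f"KLOAKD error (HTTP {exc.status}): {exc}"
    return "unexpected error"
```

Note that `ForbiddenError` and `NotEntitledError` share HTTP 403, so matching on the exception class rather than the status code is what distinguishes them.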

Retry policy

All SDKs implement exponential backoff with a 60-second cap:

  • Retryable: 429, 500, 502, 503, 504
  • Backoff: base_delay × 2^attempt, capped at 60s
  • 429 responses: respects Retry-After header and retry_after body field
  • Default: 3 retries
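The policy above can be sketched in a few lines of Python. This is an illustration, not SDK code: the function names, the 1-second `base_delay`, and zero-based `attempt` numbering are assumptions, as is the choice to use a server-provided `retry_after` value as-is rather than applying the cap to it.

```python
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}
MAX_DELAY_SECONDS = 60.0


def backoff_delay(attempt: int, base_delay: float = 1.0, retry_after=None) -> float:
    """Delay before the next try: base_delay * 2^attempt, capped at 60s.

    When the server supplied retry_after (header or body field on a 429),
    this sketch uses that value directly (an assumption).
    """
    if retry_after is not None:
        return float(retry_after)
    return min(base_delay * 2 ** attempt, MAX_DELAY_SECONDS)


def should_retry(status: int, attempt: int, max_retries: int = 3) -> bool:
    """Retry only retryable statuses, and only while retries remain."""
    return status in RETRYABLE_STATUSES and attempt < max_retries


# Delays for the default 3 retries with the assumed 1-second base: 1s, 2s, 4s
delays = [backoff_delay(n) for n in range(3)]
```

With a 1-second base, the cap only takes effect from attempt 6 onward (2^6 = 64 > 60), so it matters mainly when the retry count or base delay is raised above the defaults.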