
Kolektr

Structured data extraction using schemas, selectors, or AI



Kolektr extracts structured data from any web page.

Always pass artifact_id from a previous Evadr fetch to avoid re-fetching pages behind anti-bot protection.

page

result = client.kolektr.page(
    url="https://example.com/products",
    schema={
        "name": "css:.product-name",
        "price": "css:.product-price",
        "rating": "css:.product-rating",
    },
    artifact_id=page.artifact_id,  # `page` is the result of a previous Evadr fetch
    page_num=1,
    page_size=50,
)
print(result.total, result.has_next)
for record in result.records:
    print(record)

page_all (auto-pagination)

all_records = client.kolektr.page_all(
    url="https://example.com/products",
    schema={"name": "css:.product-name", "price": "css:.product-price"},
)
print(f"Total: {len(all_records)} records")
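
For cases where each page needs to be inspected as it arrives, the same result can be collected by hand from the `page_num` and `has_next` fields shown in the `page` example. The sketch below is an assumption about how such a loop might look, not a description of how `page_all` is implemented. `PageResult`, `collect_all`, and `fake_fetch` are hypothetical names used only for illustration; `fake_fetch` stands in for a real `client.kolektr.page(...)` call so the snippet runs without network access.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PageResult:
    """Hypothetical stand-in for the object returned by client.kolektr.page."""
    records: List[dict] = field(default_factory=list)
    total: int = 0
    has_next: bool = False


def collect_all(fetch_page: Callable[[int], PageResult], max_pages: int = 1000) -> List[dict]:
    """Call fetch_page(page_num) starting at page 1 until has_next is False."""
    records: List[dict] = []
    page_num = 1
    while page_num <= max_pages:  # safety cap so a bad has_next cannot loop forever
        result = fetch_page(page_num)
        records.extend(result.records)
        if not result.has_next:
            break
        page_num += 1
    return records


def fake_fetch(page_num: int, page_size: int = 2) -> PageResult:
    """In-memory stand-in for a real request; slices a fixed list into pages."""
    data = [{"name": f"item-{i}"} for i in range(5)]
    start = (page_num - 1) * page_size
    chunk = data[start:start + page_size]
    return PageResult(records=chunk, total=len(data), has_next=start + page_size < len(data))


all_records = collect_all(fake_fetch)
print(f"Total: {len(all_records)} records")
```

With a real client, `fetch_page` could be a small lambda that forwards `page_num` to `client.kolektr.page` together with the `url`, `schema`, and `artifact_id` arguments.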

extract_html

result = client.kolektr.extract_html(
    html="<html>...</html>",
    schema={"title": "css:h1", "body": "css:article"},
)
print(result.records)

Schema syntax

| Syntax | Example | Description |
|--------|---------|-------------|
| css: | css:.price | CSS selector |
| xpath: | xpath://span | XPath expression |
| ai: | ai:product price | AI-powered extraction |
| attr: | attr:img@src | Element attribute |
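
The prefixes can be mixed within one schema, one rule per field. As a rough illustration of the `prefix:expression` shape, the sketch below checks a schema locally before it is sent. `split_rule` and `KNOWN_PREFIXES` are hypothetical helpers written for this example and are not part of the Kolektr client; the rule strings are the examples from the table.

```python
from typing import Dict, Tuple

# Prefixes listed in the schema syntax table.
KNOWN_PREFIXES = ("css", "xpath", "ai", "attr")


def split_rule(rule: str) -> Tuple[str, str]:
    """Split 'prefix:expression' at the first colon and validate the prefix."""
    prefix, sep, expression = rule.partition(":")
    if not sep or prefix not in KNOWN_PREFIXES or not expression:
        raise ValueError(f"unrecognized schema rule: {rule!r}")
    return prefix, expression


# One field per prefix, using the example values from the table.
schema: Dict[str, str] = {
    "price": "css:.price",
    "label": "xpath://span",
    "summary": "ai:product price",
    "image": "attr:img@src",
}

parsed = {field: split_rule(rule) for field, rule in schema.items()}
for field, (prefix, expression) in parsed.items():
    print(field, prefix, expression)
```

Splitting only at the first colon keeps expressions such as `//span` intact, since the XPath itself begins right after the prefix separator.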
