Kolektr
Structured data extraction using schemas, selectors, or AI
Kolektr extracts structured data from any web page.
Always pass artifact_id from a previous Evadr fetch to avoid re-fetching pages behind anti-bot protection.
page
# `page` is assumed to be the result of an earlier Evadr fetch (not shown here).
result = client.kolektr.page(
    url="https://example.com/products",
    schema={
        "name": "css:.product-name",
        "price": "css:.product-price",
        "rating": "css:.product-rating",
    },
    artifact_id=page.artifact_id,  # reuse the fetched page instead of re-fetching
    page_num=1,
    page_size=50,
)
print(result.total, result.has_next)
for record in result.records:
    print(record)
page_all (auto-pagination)
all_records = client.kolektr.page_all(
    url="https://example.com/products",
    schema={"name": "css:.product-name", "price": "css:.product-price"},
)
print(f"Total: {len(all_records)} records")
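If you need control over the loop (for example, to stop early), auto-pagination can be approximated by calling page repeatedly until has_next is false. The sketch below is not Kolektr's implementation of page_all. It relies only on the page_num parameter and the records and has_next fields shown in the page example. Page, collect_all, fake_fetch, and max_pages are hypothetical names introduced here so the example runs without a client.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Page:
    """Hypothetical stand-in for the result object returned by page()."""
    records: List[dict] = field(default_factory=list)
    has_next: bool = False


def collect_all(fetch_page: Callable[[int], Page], max_pages: int = 1000) -> List[dict]:
    """Call fetch_page(1), fetch_page(2), ... until has_next is False.

    max_pages is a safety cap (an assumption of this sketch) so a source
    that always reports has_next=True cannot loop forever.
    """
    records: List[dict] = []
    for page_num in range(1, max_pages + 1):
        result = fetch_page(page_num)
        records.extend(result.records)
        if not result.has_next:
            break
    return records


def fake_fetch(page_num: int, page_size: int = 2) -> Page:
    """Fake data source with five records, used only to exercise the loop."""
    data = [{"name": f"item-{i}"} for i in range(5)]
    start = (page_num - 1) * page_size
    chunk = data[start:start + page_size]
    return Page(records=chunk, has_next=start + page_size < len(data))


all_records = collect_all(fake_fetch)
print(f"Total: {len(all_records)} records")
```

With a real client, fetch_page could be a small wrapper such as `lambda n: client.kolektr.page(url=..., schema=..., page_num=n, page_size=50)`, assuming the result exposes records and has_next as in the page example.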
extract_html
result = client.kolektr.extract_html(
    html="<html>...</html>",
    schema={"title": "css:h1", "body": "css:article"},
)
print(result.records)
Schema syntax
| Syntax | Example | Description |
|--------|---------|-------------|
| css: | css:.price | CSS selector |
| xpath: | xpath://span | XPath expression |
| ai: | ai:product price | AI-powered extraction |
| attr: | attr:img@src | Element attribute |
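To check schema values before sending them, the prefix convention can be validated locally. The sketch below only illustrates the prefix:expression shape from the table and is not how Kolektr parses schemas. Rule, parse_rule, and KNOWN_PREFIXES are hypothetical names. Reading attr:img@src as "selector, then @, then attribute name" is an assumption based on the single example given.

```python
from typing import NamedTuple, Optional

# Prefixes taken from the schema syntax table.
KNOWN_PREFIXES = ("css", "xpath", "ai", "attr")


class Rule(NamedTuple):
    kind: str                        # one of KNOWN_PREFIXES
    expression: str                  # selector, XPath, or free-text prompt
    attribute: Optional[str] = None  # set only for attr: rules


def parse_rule(value: str) -> Rule:
    """Split a schema value such as 'css:.price' into its prefix and expression."""
    # Split on the first colon only, so 'xpath://span' keeps its slashes.
    kind, sep, rest = value.partition(":")
    if not sep or kind not in KNOWN_PREFIXES or not rest:
        raise ValueError(f"unrecognized schema value: {value!r}")
    if kind == "attr":
        # Assumed format: selector before the last '@', attribute name after it.
        selector, at, attribute = rest.rpartition("@")
        if not at or not selector or not attribute:
            raise ValueError(f"attr rule needs selector@attribute: {value!r}")
        return Rule(kind, selector, attribute)
    return Rule(kind, rest)


schema = {"name": "css:.product-name", "image": "attr:img@src"}
parsed = {key: parse_rule(value) for key, value in schema.items()}
print(parsed)
```

Validating locally like this catches a mistyped prefix before a request is made. The service's own validation remains authoritative.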