
This guide walks through a complete workflow: pick a crawler, configure a squid, add tasks, run it, and download results — all via the API.

Prerequisites

You’ll need an API key. Find it in your lobstr.io dashboard under API in the sidebar. Set it as an environment variable to use in the examples below:
export LOBSTR_API_KEY="your_api_key_here"

Step 1: Verify your credentials

Confirm your key is working before proceeding.
Python
import os

import requests

API_KEY = os.environ["LOBSTR_API_KEY"]
headers = {"Authorization": f"Token {API_KEY}"}

response = requests.get("https://api.lobstr.io/v1/me", headers=headers)
response.raise_for_status()  # fails fast if the key is invalid
user = response.json()
print(f"Logged in as: {user['first_name']} {user['last_name']} ({user['email']})")

Step 2: Find a crawler

Crawlers define what site you’re scraping. List available crawlers and pick the one you need.
Python
response = requests.get("https://api.lobstr.io/v1/crawlers", headers=headers)
crawlers = response.json()

for crawler in crawlers:
    print(f"{crawler['id']}  {crawler['name']}")
Note the id of the crawler you want to use. For example, the Google Maps Reviews crawler.

Step 3: Create a squid

A squid is your configured scraping project — it ties together a crawler, your settings, and your tasks.
Python
payload = {
    "name": "My first squid",
    "crawler": "CRAWLER_ID"   # from Step 2
}

response = requests.post(
    "https://api.lobstr.io/v1/squids",
    headers={**headers, "Content-Type": "application/json"},
    json=payload
)
squid = response.json()
squid_id = squid["id"]
print(f"Squid created: {squid_id}")

Step 4: Add tasks

Tasks tell the squid what to scrape — typically URLs or search queries. The accepted keys depend on the crawler (use Get Crawler Parameters to check).
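A minimal sketch of that check, assuming a `GET /v1/crawlers/{id}/params` route (the exact path behind Get Crawler Parameters is an assumption here; confirm it in the API reference):

```python
import requests

def params_url(crawler_id: str, base: str = "https://api.lobstr.io/v1") -> str:
    # Assumed route for "Get Crawler Parameters" -- verify against the API reference.
    return f"{base}/crawlers/{crawler_id}/params"

def get_crawler_params(crawler_id: str, headers: dict) -> dict:
    """Fetch the task keys a crawler accepts, before posting to /v1/tasks."""
    response = requests.get(params_url(crawler_id), headers=headers)
    response.raise_for_status()
    return response.json()
```

Calling `get_crawler_params("CRAWLER_ID", headers)` before adding tasks lets you confirm whether the crawler expects `url`, a search query, or other keys.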
Python
payload = {
    "squid": squid_id,
    "tasks": [
        {"url": "https://maps.google.com/?cid=1234567890"},
        {"url": "https://maps.google.com/?cid=0987654321"}
    ]
}

response = requests.post(
    "https://api.lobstr.io/v1/tasks",
    headers={**headers, "Content-Type": "application/json"},
    json=payload
)
result = response.json()
print(f"Added {len(result['tasks'])} tasks ({result['duplicated_count']} duplicates skipped)")

Step 5: Start a run

A run executes all pending tasks in the squid.
Python
payload = {"squid": squid_id}

response = requests.post(
    "https://api.lobstr.io/v1/runs",
    headers={**headers, "Content-Type": "application/json"},
    json=payload
)
run = response.json()
run_id = run["id"]
print(f"Run started: {run_id}")

Step 6: Poll until complete

Check the run status periodically until it reaches a terminal state.
Python
import time

terminal_statuses = {"done", "aborted", "error"}

while True:
    response = requests.get(f"https://api.lobstr.io/v1/runs/{run_id}", headers=headers)
    run = response.json()
    status = run["status"]

    print(f"Status: {status}, {run['total_results']} results so far")

    if status in terminal_statuses:
        print(f"Run finished: {run['done_reason']}")
        break

    time.sleep(10)
Typical runs complete in seconds to a few minutes depending on task count and concurrency. Avoid polling more frequently than every 5 seconds.
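The same loop can be packaged as a small helper. Here `fetch_status` is any zero-argument callable returning the run's current status string, so the polling logic stays separate from the HTTP call (the helper and its names are illustrative, not part of the API):

```python
import time

TERMINAL_STATUSES = {"done", "aborted", "error"}

def wait_for_run(fetch_status, interval: float = 10.0, timeout: float = 600.0) -> str:
    """Poll fetch_status() until it returns a terminal status.

    Keep interval at 5 seconds or more in production, per the guidance above.
    Raises TimeoutError if no terminal status arrives within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still '{status}' after {timeout:.0f}s")
        time.sleep(interval)
```

With the objects from this step, you would call it as `wait_for_run(lambda: requests.get(f"https://api.lobstr.io/v1/runs/{run_id}", headers=headers).json()["status"])`.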

Step 7: Download results

Once the run is done, fetch your data.
Python
response = requests.get(
    "https://api.lobstr.io/v1/results",
    headers=headers,
    params={"squid": squid_id, "limit": 100, "page": 1}
)
data = response.json()

print(f"Total results: {data['total_results']}")
for row in data["data"]:
    print(row)
For large datasets, iterate through pages using the page parameter. See the Pagination guide for details.
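One way to structure that iteration is the sketch below. It assumes the response shape shown above (`data` holding the rows) and treats a short or empty page as the end; `get_page` is a stand-in you supply, wrapping the `/v1/results` call:

```python
def fetch_all_results(get_page, limit: int = 100):
    """Yield every result row, page by page.

    get_page: callable (page, limit) -> list of rows for that page,
    e.g. a wrapper around GET /v1/results with
    params={"squid": squid_id, "limit": limit, "page": page}.
    """
    page = 1
    while True:
        rows = get_page(page, limit)
        if not rows:
            break
        yield from rows
        if len(rows) < limit:
            break  # a short page means there is nothing after it
        page += 1
```

A `get_page` backed by the API would issue the same `requests.get` call as above and return `response.json()["data"]`.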

Complete example

Python
import os
import time

import requests

API_KEY = os.environ["LOBSTR_API_KEY"]
CRAWLER_ID = "YOUR_CRAWLER_ID"

headers = {"Authorization": f"Token {API_KEY}"}
json_headers = {**headers, "Content-Type": "application/json"}

# Create squid
squid = requests.post(
    "https://api.lobstr.io/v1/squids",
    headers=json_headers,
    json={"name": "Quickstart squid", "crawler": CRAWLER_ID}
).json()
squid_id = squid["id"]

# Add tasks
requests.post(
    "https://api.lobstr.io/v1/tasks",
    headers=json_headers,
    json={"squid": squid_id, "tasks": [{"url": "https://example.com"}]}
)

# Start run
run_id = requests.post(
    "https://api.lobstr.io/v1/runs",
    headers=json_headers,
    json={"squid": squid_id}
).json()["id"]

# Poll until done
while True:
    run = requests.get(f"https://api.lobstr.io/v1/runs/{run_id}", headers=headers).json()
    if run["status"] in {"done", "aborted", "error"}:
        break
    time.sleep(10)

# Fetch results
results = requests.get(
    "https://api.lobstr.io/v1/results",
    headers=headers,
    params={"squid": squid_id, "limit": 100, "page": 1}
).json()

print(f"Done — {results['total_results']} results collected")