Good beta targets
- Public documentation
- Blog posts and news articles
- Help center pages
- Public company and product pages
- Public hiring pages on a best-effort basis
Use POST /v1/scrape to turn a public webpage into clean, model-ready context for AI apps, agents, RAG pipelines, and automation workflows.
POST /v1/scrape
Required headers:
Authorization: Bearer DEMO_API_KEY
Content-Type: application/json
{
  "url": "https://example.com",
  "ai": "auto"
}

| Field | Required | Description |
|---|---|---|
| url | Yes | A full public https:// webpage URL. |
| ai | No | One of auto, off, clean, or extract. Defaults to auto. |

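
The field rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name `build_scrape_payload` is hypothetical, and the check only covers the URL scheme and the allowed `ai` values. It cannot tell whether a page is actually public.

```python
from urllib.parse import urlsplit

# Allowed values for the "ai" field, as listed in the field table.
ALLOWED_AI_MODES = {"auto", "off", "clean", "extract"}


def build_scrape_payload(url, ai="auto"):
    """Return a request body for POST /v1/scrape, or raise ValueError.

    Hypothetical client-side helper; the server remains the authority
    on which URLs are accepted.
    """
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError("url must be a full https:// URL")
    if ai not in ALLOWED_AI_MODES:
        raise ValueError(f"ai must be one of {sorted(ALLOWED_AI_MODES)}")
    return {"url": url, "ai": ai}


print(build_scrape_payload("https://example.com"))
```

Rejecting a bad payload locally avoids spending a request on an error that is documented as not retryable.
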
| Mode | Public behavior |
|---|---|
| auto | Default mode. Returns AI-ready cleaned content when possible. |
| off | Returns extracted webpage text without AI cleanup. |
| clean | Applies stronger cleanup for better downstream LLM readability. |
| extract | Compatibility mode. Currently equivalent to auto for public beta usage. |

BASE_URL="https://sou.esim111.net"
API_KEY="YOUR_API_KEY"
curl -sS -X POST "$BASE_URL/v1/scrape" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com","ai":"auto"}'

import json
import urllib.request

BASE_URL = "https://sou.esim111.net"
API_KEY = "YOUR_API_KEY"

payload = {"url": "https://example.com", "ai": "auto"}
request = urllib.request.Request(
    f"{BASE_URL}/v1/scrape",
    data=json.dumps(payload).encode("utf-8"),
    method="POST",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)

with urllib.request.urlopen(request, timeout=30) as response:
    body = json.loads(response.read().decode("utf-8"))

print("success:", body.get("success"))
print("request_id:", body.get("request_id"))
print("summary:", body.get("content", {}).get("summary"))

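
One caveat with the Python example: `urllib.request.urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so an error reply never reaches the `print` calls. The sketch below wraps the same request in a function that reads the body in both cases. It assumes, without confirmation from this document, that error replies also carry a JSON body. The function name `post_scrape` and its `opener` parameter are hypothetical and exist only to keep the example testable.

```python
import json
import urllib.error
import urllib.request


def post_scrape(base_url, api_key, payload, opener=urllib.request.urlopen, timeout=30):
    """POST payload to /v1/scrape and return the parsed JSON body.

    Assumption: error responses (non-2xx) also contain JSON. If they do
    not, json.loads raises ValueError and the caller must handle it.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/scrape",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    try:
        with opener(request, timeout=timeout) as response:
            raw = response.read()
    except urllib.error.HTTPError as err:
        # HTTPError doubles as a response object, so its body is readable.
        raw = err.read()
    return json.loads(raw.decode("utf-8"))
```

A call such as `post_scrape(BASE_URL, API_KEY, payload)` then returns a dictionary for both successful and failed extractions, provided the assumption about JSON error bodies holds.
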
const BASE_URL = 'https://sou.esim111.net';
const API_KEY = 'YOUR_API_KEY';

async function main() {
  const response = await fetch(`${BASE_URL}/v1/scrape`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      url: 'https://example.com',
      ai: 'auto',
    }),
  });

  const data = await response.json();
  console.log('status:', response.status);
  console.log('success:', data.success);
  console.log('request_id:', data.request_id);
  console.log('summary:', data.content?.summary);
}

main().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});

During beta, integrations should rely first on these stable anchors:
- success
- request_id
- url
- content.markdown when extraction succeeds
- error.code when extraction fails

| Code | Retry? | Meaning |
|---|---|---|
| INVALID_URL | No | The URL is malformed or not a supported public https:// URL. |
| TARGET_UNSUPPORTED | No | The target page type is outside public beta support scope. |
| PROCESSING_CAPACITY_LIMITED | Yes | Processing is busy. Retry shortly with exponential backoff. |

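
As an illustration of relying only on the stable anchors and error codes above, the sketch below reduces a parsed response to those fields. The nesting (`content.markdown`, `error.code`) is inferred from the dotted names, and the sample bodies are hypothetical, not captured API output. The helper name `interpret_scrape_response` is likewise an assumption.

```python
# Codes marked "Yes" in the Retry? column of the error table.
RETRYABLE_CODES = {"PROCESSING_CAPACITY_LIMITED"}


def interpret_scrape_response(body):
    """Reduce a parsed /v1/scrape response to the stable beta anchors."""
    result = {
        "ok": bool(body.get("success")),
        "request_id": body.get("request_id"),
        "url": body.get("url"),
    }
    if result["ok"]:
        # content may be missing or null, so fall back to an empty dict.
        result["markdown"] = (body.get("content") or {}).get("markdown")
    else:
        code = (body.get("error") or {}).get("code")
        result["error_code"] = code
        result["retryable"] = code in RETRYABLE_CODES
    return result


# Hypothetical response bodies, for illustration only.
ok_body = {
    "success": True,
    "request_id": "req_123",
    "url": "https://example.com",
    "content": {"markdown": "# Example Domain"},
}
busy_body = {
    "success": False,
    "request_id": "req_456",
    "url": "https://example.com",
    "error": {"code": "PROCESSING_CAPACITY_LIMITED"},
}

print(interpret_scrape_response(ok_body))
print(interpret_scrape_response(busy_body))
```

Reading every other field through `.get()` keeps the integration tolerant of fields that may change during beta.
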
For retryable errors, wait 1 second, then 2 seconds, then 4 seconds. Stop after 3 retries and include request_id in support requests.
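
The retry schedule above (1, 2, then 4 seconds, at most 3 retries) can be sketched as follows. The `send` callable is a hypothetical stand-in for whatever function posts the payload and returns the parsed JSON body, and `sleep` is injectable so the demonstration does not actually wait. The response shape is assumed from the `error.code` anchor.

```python
import time

BACKOFF_DELAYS = (1, 2, 4)  # seconds to wait before retry 1, 2, and 3
RETRYABLE_CODES = {"PROCESSING_CAPACITY_LIMITED"}


def scrape_with_retry(send, payload, sleep=time.sleep):
    """Call send(payload), retrying retryable errors up to 3 times."""
    body = send(payload)
    for delay in BACKOFF_DELAYS:
        code = (body.get("error") or {}).get("code")
        if body.get("success") or code not in RETRYABLE_CODES:
            break
        sleep(delay)
        body = send(payload)
    return body


# Demonstration with a fake sender that is busy twice, then succeeds.
# These bodies are hypothetical, for illustration only.
fake_responses = iter([
    {"success": False, "request_id": "req_1", "error": {"code": "PROCESSING_CAPACITY_LIMITED"}},
    {"success": False, "request_id": "req_2", "error": {"code": "PROCESSING_CAPACITY_LIMITED"}},
    {"success": True, "request_id": "req_3", "content": {"markdown": "# Example"}},
])
waits = []
final = scrape_with_retry(
    lambda payload: next(fake_responses),
    {"url": "https://example.com", "ai": "auto"},
    sleep=waits.append,  # record delays instead of sleeping
)
print("waited:", waits, "request_id:", final["request_id"])
```

If the last retry still fails, the function returns that final error body, so its `request_id` is available for a support request.
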
Keep API keys on your server side. Never paste keys into screenshots, logs, or public support channels.
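
One common way to follow this advice is to read the key from the server's environment instead of hard-coding it, and to mask it before it can reach a log line. This is a sketch under assumptions: the variable name `SCRAPE_API_KEY` and both helper names are hypothetical, not something the API prescribes.

```python
import os


def load_api_key(env=os.environ, name="SCRAPE_API_KEY"):
    """Read the API key from an environment variable (hypothetical name)."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable on the server")
    return key


def mask_key(key):
    """Return a log-safe form that reveals at most the last 4 characters."""
    return "***" + key[-4:] if len(key) > 8 else "***"


# Demonstration with a made-up key; real keys should never appear in code.
demo_env = {"SCRAPE_API_KEY": "demo_key_abcdef1234"}
print("using key:", mask_key(load_api_key(env=demo_env)))
```
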