Crawling


POST /v1/crawl

Crawl a website

Authorizations

bearerAuth
Type: HTTP (bearer)

Request Body

application/json
JSON
{
  "includePaths": [
    "/blog/*",
    "/articles/*",
    "/docs/*"
  ],
  "excludePaths": [
    "/admin/*",
    "/private/*",
    "/api/*"
  ],
  "maxDepth": 3,
  "limit": 500,
  "allowBackwardLinks": false,
  "allowExternalLinks": false,
  "ignoreSitemap": true,
  "url": "string",
  "origin": "api",
  "scrapeOptions": {
    "formats": [
      "markdown",
      "rawHtml"
    ],
    "headers": {
      "additionalProperties": "string"
    },
    "includeTags": [
      "h1",
      "h2",
      "p",
      "article"
    ],
    "excludeTags": [
      "nav",
      "footer",
      "script",
      "style"
    ],
    "waitFor": 3000,
    "extract": {
      "mode": "string",
      "schema": {
        "type": "object",
        "properties": {
          "title": {
            "type": "string"
          },
          "price": {
            "type": "number"
          },
          "description": {
            "type": "string"
          }
        },
        "required": [
          "title",
          "price"
        ]
      },
      "systemPrompt": "Based on the information on the page, extract all the information from the schema. Try to extract all the fields even those that might not be marked as required.",
      "prompt": "Extract the main article title and author from this page"
    }
  },
  "webhookUrls": [
    "https://your-webhook.com/crawl-status"
  ],
  "webhookMetadata": {
    "crawlId": "crawl_123",
    "userId": "user_456"
  }
}

Responses

Successful response
application/json
JSON
{
  "success": true,
  "id": "crawl_123e4567-e89b-12d3-a456-426614174000",
  "url": "https://firecrawl.dev"
}
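
The request and response above could be exercised from Python roughly as follows. This is a minimal sketch using only the standard library, not an official client. The base URL `https://api.firecrawl.dev` and the helper names `build_crawl_request` and `start_crawl` are assumptions for illustration; this reference does not state them.

```python
import json
import urllib.request

# Assumption: the API host is not given in this reference; replace as needed.
BASE_URL = "https://api.firecrawl.dev"


def build_crawl_request(api_key: str, url: str, **options) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/crawl request.

    `options` are passed through as top-level body fields, for example
    maxDepth, limit, includePaths, or scrapeOptions.
    """
    body = {"url": url, **options}
    return urllib.request.Request(
        f"{BASE_URL}/v1/crawl",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # bearerAuth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def start_crawl(api_key: str, url: str, **options) -> dict:
    """Send the request and return the decoded JSON response."""
    req = build_crawl_request(api_key, url, **options)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `start_crawl(key, "https://example.com", maxDepth=3, limit=500)` should return a dictionary shaped like the successful response above; the `id` field is what the status and cancel endpoints expect as their path parameter.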


GET /v1/crawl/{id}

Get crawl job status

Authorizations

bearerAuth
Type: HTTP (bearer)

Parameters

Path Parameters

id*
Type: string
Required

Responses

Successful response
application/json
JSON
{
  "success": true,
  "status": "string",
  "completed": 0,
  "total": 0,
  "expiresAt": "string",
  "next": "string",
  "data": [
    {
      "markdown": "string",
      "extract": "string",
      "html": "string",
      "rawHtml": "string",
      "links": [
        "string"
      ],
      "screenshot": "string",
      "metadata": {
        "additionalProperties": {}
      }
    }
  ]
}
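
One way to consume this response is sketched below. It assumes that `next`, when present, holds the URL of the next batch of results and is missing or null on the last batch; the schema above only says it is a string, so treat that as an assumption. The base URL and the helper names `get_json`, `progress`, and `iter_crawl_data` are likewise hypothetical.

```python
import json
import urllib.request
from typing import Callable, Iterator

# Assumption: the API host is not given in this reference; replace as needed.
BASE_URL = "https://api.firecrawl.dev"


def get_json(api_key: str, url: str) -> dict:
    """GET a URL with bearer authentication and decode the JSON body."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def progress(status: dict) -> float:
    """Fraction of pages finished, based on `completed` and `total`."""
    total = status.get("total", 0)
    return status.get("completed", 0) / total if total else 0.0


def iter_crawl_data(first_url: str, fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every item in `data`, following `next` while it is set.

    `fetch` maps a URL to a decoded status response, which keeps the
    pagination logic independent of the HTTP layer.
    """
    url = first_url
    while url:
        page = fetch(url)
        yield from page.get("data", [])
        url = page.get("next")  # assumed absent or null on the last batch
```

In practice `fetch` could be `lambda u: get_json(api_key, u)` and `first_url` could be `f"{BASE_URL}/v1/crawl/{crawl_id}"`. Passing the fetcher in as a parameter also makes the loop easy to test with canned responses.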


DELETE /v1/crawl/{id}/cancel

Cancel a crawl job

Authorizations

bearerAuth
Type: HTTP (bearer)

Parameters

Path Parameters

id*
Type: string
Required

Responses

Successful response
application/json
JSON
{
  "success": true,
  "message": "string"
}
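
A cancellation call could look like the sketch below. As with any example here that names a host, the base URL is an assumption, and `build_cancel_request` and `cancel_crawl` are hypothetical helper names rather than part of any official SDK.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: the API host is not given in this reference; replace as needed.
BASE_URL = "https://api.firecrawl.dev"


def build_cancel_request(api_key: str, crawl_id: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE /v1/crawl/{id}/cancel request."""
    # Percent-encode the id so unexpected characters cannot alter the path.
    safe_id = quote(crawl_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/v1/crawl/{safe_id}/cancel",
        headers={"Authorization": f"Bearer {api_key}"},
        method="DELETE",
    )


def cancel_crawl(api_key: str, crawl_id: str) -> dict:
    """Send the cancel request and return the decoded JSON response."""
    with urllib.request.urlopen(build_cancel_request(api_key, crawl_id)) as resp:
        return json.load(resp)
```

The returned dictionary should carry the `success` flag and a human-readable `message`, as in the response shown above.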
