Quickstart
LLMCrawl converts entire websites into markdown that is ready for Large Language Models (LLMs).
Welcome to LLMCrawl
LLMCrawl is an API service that crawls a given URL and transforms the content into clean markdown. It follows every accessible subpage and returns well-structured markdown for each one, so no pre-existing sitemap is needed.
Features
How to Use LLMCrawl
We offer multiple ways to integrate LLMCrawl into your applications:
- 🔗 REST API - Direct HTTP API access for any programming language
- 📦 JavaScript SDK - Official SDK for Node.js, React, Next.js, and browser applications
- 🎮 Web Playground - Interactive testing environment
Using the JavaScript SDK
The easiest way to get started is with our official JavaScript/TypeScript SDK:
bash
npm install @llmcrawl/llmcrawl-js
typescript
import { LLMCrawl } from "@llmcrawl/llmcrawl-js";

const client = new LLMCrawl({
  apiKey: "your-api-key-here",
});

// Scrape a single page
const result = await client.scrape("https://example.com");
console.log(result.data?.markdown);

// Crawl an entire website
const crawl = await client.crawl("https://example.com", {
  limit: 100,
  includePaths: ["/docs/*"],
});
📚 View Complete SDK Documentation →
Quick SDK Examples
E-commerce Data Extraction:
typescript
const product = await client.scrape("https://store.example.com/product/123", {
  extract: {
    schema: {
      type: "object",
      properties: {
        name: { type: "string" },
        price: { type: "number" },
        inStock: { type: "boolean" },
      },
    },
  },
});
Documentation Crawling:
typescript
const docs = await client.crawl("https://docs.example.com", {
  includePaths: ["/docs/*", "/api/*"],
  scrapeOptions: {
    formats: ["markdown"],
  },
});
Obtaining an API Key
To access the API, register on LLMCrawl and obtain an API key. This key will authenticate your requests.
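A common way to keep the key out of source control is to read it from an environment variable. The sketch below assumes a variable named `LLMCRAWL_API_KEY`; that name and the `readApiKey` helper are illustrative choices, not something LLMCrawl prescribes.

```typescript
// Hypothetical variable name; pick whatever fits your deployment.
const API_KEY_ENV_VAR = "LLMCRAWL_API_KEY";

// Reads the API key from an environment-like object and fails fast if it is absent.
function readApiKey(env: Record<string, string | undefined>): string {
  const key = env[API_KEY_ENV_VAR]?.trim();
  if (!key) {
    throw new Error(`Missing ${API_KEY_ENV_VAR}; set it before creating the client.`);
  }
  return key;
}

// Usage in Node.js (not executed here):
//   const client = new LLMCrawl({ apiKey: readApiKey(process.env) });
```

Failing at startup gives a clearer error than an authentication failure on the first request.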
Scraping a Single URL
To scrape a single page, send its URL to the scrape endpoint. The response is a JSON object containing that page's content.
cURL
To scrape a page using cURL:
bash
curl -X POST https://llmcrawl.dev/api/v1/scrape \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'
This returns the scrape status and result:
json
{
  "success": true,
  "data": {
    "markdown": "# Markdown Content",
    "metadata": {
      "title": "Page Title",
      <...>
    }
  }
}
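When calling the REST API directly, it can help to check the response shape before using it. The following is a minimal sketch based only on the fields shown in the sample response (`success`, `data.markdown`). The `getMarkdown` and `isRecord` helpers are hypothetical, and the behaviour of failed requests is an assumption, since this guide does not show an error response.

```typescript
// Narrowing helper: true for any non-null object.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Returns data.markdown from a parsed scrape response, or throws if the
// response does not look like the successful sample shown above.
function getMarkdown(response: unknown): string {
  if (!isRecord(response) || response.success !== true) {
    throw new Error("Scrape request did not succeed");
  }
  const data = response.data;
  if (!isRecord(data) || typeof data.markdown !== "string") {
    throw new Error("Response has no markdown field");
  }
  return data.markdown;
}
```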
AI-Powered Data Extraction
LLMCrawl's most powerful feature is AI-powered structured data extraction. Define a JSON schema, and our AI will extract the data for you:
REST API Example
bash
curl -X POST https://llmcrawl.dev/api/v1/scrape \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://store.example.com/product/123",
    "extract": {
      "schema": {
        "type": "object",
        "properties": {
          "productName": {"type": "string"},
          "price": {"type": "number"},
          "inStock": {"type": "boolean"},
          "rating": {"type": "number"}
        },
        "required": ["productName", "price"]
      }
    }
  }'
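The same request can be sent from TypeScript without the SDK. The sketch below only builds the request options, mirroring the method, headers, and body of the cURL call above. `buildScrapeRequest` and the `ExtractSchema` type are hypothetical names introduced for illustration.

```typescript
// Shape of the JSON schema used in the cURL example above.
interface ExtractSchema {
  type: "object";
  properties: Record<string, { type: string }>;
  required?: string[];
}

interface ScrapeRequestInit {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Builds fetch-compatible options for POST /api/v1/scrape with an extract schema.
function buildScrapeRequest(token: string, url: string, schema: ExtractSchema): ScrapeRequestInit {
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ url, extract: { schema } }),
  };
}

// Usage (not executed here):
//   const res = await fetch("https://llmcrawl.dev/api/v1/scrape", buildScrapeRequest(token, pageUrl, schema));
//   const json = await res.json();
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit test.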
JavaScript SDK Example
typescript
const result = await client.scrape("https://news.example.com/article", {
  extract: {
    schema: {
      type: "object",
      properties: {
        headline: { type: "string" },
        author: { type: "string" },
        publishDate: { type: "string" },
        content: { type: "string" },
        tags: { type: "array", items: { type: "string" } },
      },
    },
  },
});

// Parsed structured data
const article = JSON.parse(result.data.extract);
console.log("Headline:", article.headline);
console.log("Author:", article.author);
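AI extraction may not always fill every field, so it can be worth checking the parsed result before relying on it. The sketch below assumes, as in the example above, that the extracted data arrives as a JSON string. `parseExtract` is a hypothetical helper and not part of the SDK.

```typescript
// Parses an extract payload and verifies that the listed keys are present.
function parseExtract<T extends Record<string, unknown>>(raw: string, requiredKeys: string[]): T {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Extracted data is not a JSON object");
  }
  const record = parsed as Record<string, unknown>;
  // Treat both absent and null values as missing.
  const missing = requiredKeys.filter((key) => record[key] === undefined || record[key] === null);
  if (missing.length > 0) {
    throw new Error(`Extracted data is missing: ${missing.join(", ")}`);
  }
  return record as T;
}

// Usage (not executed here):
//   const article = parseExtract<{ headline: string; author: string }>(result.data.extract, ["headline", "author"]);
```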
Getting Started
- Get your API key - Sign up and generate your API key
- Choose your integration method:
  - JavaScript SDK - For Node.js, React, Next.js applications
  - REST API - For any programming language
  - Playground - For testing and experimentation
Ready to start scraping? 🚀 Get your API key or 📚 explore the full documentation.