Quickstart

LLMCrawl converts entire websites into markdown that is ready for large language models (LLMs).


Welcome to LLMCrawl

LLMCrawl is an API service that crawls a given URL and transforms its content into clean markdown. It follows every accessible subpage and returns well-structured markdown for each one, so you do not need a pre-existing sitemap.

Features

How to Use LLMCrawl

We offer multiple ways to integrate LLMCrawl into your applications:

  • 🔗 REST API - Direct HTTP API access for any programming language
  • 📦 JavaScript SDK - Official SDK for Node.js, React, Next.js, and browser applications
  • 🎮 Web Playground - Interactive testing environment

Using the JavaScript SDK

The easiest way to get started is with our official JavaScript/TypeScript SDK:

bash
npm install @llmcrawl/llmcrawl-js
typescript
import { LLMCrawl } from "@llmcrawl/llmcrawl-js";

const client = new LLMCrawl({
  apiKey: "your-api-key-here",
});

// Scrape a single page
const result = await client.scrape("https://example.com");
console.log(result.data?.markdown);

// Crawl an entire website
const crawl = await client.crawl("https://example.com", {
  limit: 100,
  includePaths: ["/docs/*"],
});

📚 View Complete SDK Documentation →

Quick SDK Examples

E-commerce Data Extraction:

typescript
const product = await client.scrape("https://store.example.com/product/123", {
  extract: {
    schema: {
      type: "object",
      properties: {
        name: { type: "string" },
        price: { type: "number" },
        inStock: { type: "boolean" },
      },
    },
  },
});

Documentation Crawling:

typescript
const docs = await client.crawl("https://docs.example.com", {
  includePaths: ["/docs/*", "/api/*"],
  scrapeOptions: {
    formats: ["markdown"],
  },
});
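
The matching rules behind `includePaths` are not spelled out here. As a rough mental model only, the sketch below treats `*` as "any run of characters" and checks a URL path against a list of patterns. `globToRegExp` and `matchesAnyPath` are hypothetical helpers, not SDK functions, and the service's actual matching may differ.

```typescript
// Assumption: "*" matches any run of characters, including "/".
// The service's real matching rules may differ.
function globToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters except "*", then expand "*" to ".*".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// True when the path matches at least one pattern in the list.
function matchesAnyPath(pathname: string, patterns: string[]): boolean {
  return patterns.some((p) => globToRegExp(p).test(pathname));
}

const include = ["/docs/*", "/api/*"];
console.log(matchesAnyPath("/docs/getting-started", include)); // true
console.log(matchesAnyPath("/blog/launch", include)); // false
```

Under this assumption a bare `/docs` would not match `/docs/*`, because the pattern requires the trailing slash.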

🔗 Explore All SDK Features →

Obtaining an API Key

To access the API, register on LLMCrawl and obtain an API key. This key will authenticate your requests.
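
A minimal sketch of attaching the key to a request, assuming the Bearer scheme used in the cURL examples on this page. `buildAuthHeaders` is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical helper (not part of the LLMCrawl SDK): builds the headers
// for an authenticated JSON request using the Bearer scheme.
function buildAuthHeaders(apiKey: string): Record<string, string> {
  const key = apiKey.trim();
  if (key.length === 0) {
    // Fail early rather than send an unauthenticated request.
    throw new Error("API key is empty");
  }
  return {
    Authorization: `Bearer ${key}`,
    "Content-Type": "application/json",
  };
}
```

In practice the key would usually come from an environment variable or a secrets store rather than being hard-coded in source.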

Scraping a Single URL

To scrape content from a single URL and get the result as a JSON object:

cURL

To scrape a page using cURL:

bash
curl -X POST https://llmcrawl.dev/api/v1/scrape \
-H "Authorization: Bearer {token}" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'

This returns the scrape status and result:

json
{
  "success": true,
  "data": {
    "markdown": "# Markdown Content",
    "metadata": {
      "title": "Page Title",
      <...>
    }
  }
}
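
The sample response above could be typed and unpacked along these lines. This is a sketch: the interface covers only the fields visible in the sample, and treating `success: false` as "no usable data" is an assumption. `ScrapeResponse` and `extractMarkdown` are illustrative names.

```typescript
// Shape inferred from the sample response; fields beyond these are not
// documented here, so metadata is kept open-ended.
interface ScrapeResponse {
  success: boolean;
  data?: {
    markdown?: string;
    metadata?: Record<string, unknown>;
  };
}

// Returns the markdown from a raw JSON response body, or null when the
// request was unsuccessful or no markdown is present.
function extractMarkdown(body: string): string | null {
  const parsed = JSON.parse(body) as ScrapeResponse;
  if (!parsed.success) return null;
  return parsed.data?.markdown ?? null;
}

const sample = JSON.stringify({
  success: true,
  data: { markdown: "# Markdown Content", metadata: { title: "Page Title" } },
});
console.log(extractMarkdown(sample)); // "# Markdown Content"
```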

AI-Powered Data Extraction

LLMCrawl's most powerful feature is AI-powered structured data extraction. Define a JSON schema, and our AI will extract the data for you:

REST API Example

bash
curl -X POST https://llmcrawl.dev/api/v1/scrape \
-H "Authorization: Bearer {token}" \
-H "Content-Type: application/json" \
-d '{
  "url": "https://store.example.com/product/123",
  "extract": {
    "schema": {
      "type": "object",
      "properties": {
        "productName": {"type": "string"},
        "price": {"type": "number"},
        "inStock": {"type": "boolean"},
        "rating": {"type": "number"}
      },
      "required": ["productName", "price"]
    }
  }
}'
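
Before sending a schema like the one above, it can help to confirm that every name in `required` is also declared under `properties`. The check below is a hypothetical client-side helper, not an LLMCrawl feature, and it handles only the flat object schemas used on this page.

```typescript
// Minimal subset of JSON Schema used by the examples on this page.
interface ObjectSchema {
  type: "object";
  properties: Record<string, { type: string }>;
  required?: string[];
}

// Hypothetical pre-flight check: returns every name listed in `required`
// that has no matching entry in `properties`.
function missingRequiredKeys(schema: ObjectSchema): string[] {
  const declared = Object.keys(schema.properties);
  return (schema.required ?? []).filter((name) => declared.indexOf(name) === -1);
}

const productSchema: ObjectSchema = {
  type: "object",
  properties: {
    productName: { type: "string" },
    price: { type: "number" },
    inStock: { type: "boolean" },
    rating: { type: "number" },
  },
  required: ["productName", "price"],
};

// Both required keys are declared, so the list is empty.
console.log(missingRequiredKeys(productSchema).length); // 0
```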

JavaScript SDK Example

typescript
const result = await client.scrape("https://news.example.com/article", {
  extract: {
    schema: {
      type: "object",
      properties: {
        headline: { type: "string" },
        author: { type: "string" },
        publishDate: { type: "string" },
        content: { type: "string" },
        tags: { type: "array", items: { type: "string" } },
      },
    },
  },
});

// Parsed structured data
const article = JSON.parse(result.data.extract);
console.log("Headline:", article.headline);
console.log("Author:", article.author);
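
The snippet above assumes `result.data` is present and that `extract` is a valid JSON string. A more defensive variant might look like the sketch below. `parseExtract` is a hypothetical helper, and the idea that `extract` could arrive missing or already parsed is an assumption rather than documented behavior.

```typescript
// Hypothetical helper: accepts the `extract` value and returns an object,
// or null when it is missing or not valid JSON.
function parseExtract<T>(raw: unknown): T | null {
  if (raw === undefined || raw === null) return null;
  if (typeof raw !== "string") return raw as T; // assume it is already parsed
  try {
    return JSON.parse(raw) as T;
  } catch {
    return null; // malformed JSON
  }
}

interface Article {
  headline?: string;
  author?: string;
}

const parsedArticle = parseExtract<Article>('{"headline":"Hello","author":"Ada"}');
console.log(parsedArticle?.headline ?? "(no headline)"); // "Hello"
```

With the SDK call above, this would be invoked as `parseExtract<Article>(result.data?.extract)`.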

Getting Started

  1. Get your API key - Sign up and generate your API key
  2. Choose your integration method: REST API, JavaScript SDK, or Web Playground

Ready to start scraping? 🚀 Get your API key or 📚 explore the full documentation.