Extract markdown, HTML, or screenshots. AI-ready output with noise removed.
{
  "markdown": "# Article Title\n\nClean content...",
  "title": "Article Title",
  "url": "https://example.com/...",
  "cost": 0.001
}

Request any combination. Each format costs $0.001.
| FORMAT | OUTPUT | USE CASES |
|---|---|---|
| markdown | Clean, readable text with formatting preserved | LLM training, RAG pipelines, content analysis |
| html | Complete page structure with all elements | Custom parsing, structure preservation |
| screenshot | Full-page PNG capture (base64 encoded) | Visual testing, archiving, change detection |
Request multiple formats in one API call. For example, markdown plus screenshot costs $0.002. Enabling the advanced proxy for protected sites adds $0.004 per request.
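The pricing arithmetic can be sketched as a small helper. This is a minimal illustration, assuming the per-format price ($0.001) and the advanced proxy surcharge ($0.004) quoted on this page. `estimate_scrape_cost` is a hypothetical name and not part of the LLMLayer SDK, and charging a repeated format only once is an assumption.

```python
from decimal import Decimal

# Prices quoted on this page; treat them as assumptions that may change.
PRICE_PER_FORMAT = Decimal("0.001")
ADVANCED_PROXY_SURCHARGE = Decimal("0.004")
VALID_FORMATS = {"markdown", "html", "screenshot"}


def estimate_scrape_cost(formats, advanced_proxy=False):
    """Estimate the cost in USD of one scrape request (hypothetical helper)."""
    # Assumption: a format listed twice is only charged once.
    requested = set(formats)
    unknown = requested - VALID_FORMATS
    if not requested or unknown:
        raise ValueError(f"formats must be a non-empty subset of {sorted(VALID_FORMATS)}")
    cost = PRICE_PER_FORMAT * len(requested)
    if advanced_proxy:
        cost += ADVANCED_PROXY_SURCHARGE
    return cost


print(estimate_scrape_cost(["markdown", "screenshot"]))         # 0.002
print(estimate_scrape_cost(["markdown"], advanced_proxy=True))  # 0.005
```

`Decimal` is used instead of floats so that sums of thousandths of a dollar stay exact. The `cost` field in the API response remains the authoritative figure.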
from llmlayer import LLMLayerClient

client = LLMLayerClient(api_key="...")

# Extract clean markdown
response = client.scrape(
    url="https://example.com/article",
    formats=["markdown"],
    main_content_only=True,
)
print(response.markdown)
print(f"Cost: ${response.cost}")

import { LLMLayerClient } from 'llmlayer';
const client = new LLMLayerClient({
  apiKey: process.env.LLMLAYER_API_KEY
});

const response = await client.scrape({
  url: 'https://example.com/pricing',
  formats: ['markdown', 'screenshot'],
  mainContentOnly: true
});
console.log(response.markdown);
// Screenshot is a base64-encoded PNG

curl -X POST https://api.llmlayer.dev/api/v2/scrape \
  -H "Authorization: Bearer $LLMLAYER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/article", "formats": ["markdown"], "main_content_only": true}'

POST /api/v2/scrape

| PARAMETER | TYPE | DEFAULT | DESCRIPTION |
|---|---|---|---|
| url* | string | none | URL to scrape (must include https://) |
| formats* | array | none | ["markdown"], ["html"], ["screenshot"] or combinations |
| main_content_only | boolean | false | Remove nav, headers, footers, sidebars, ads |
| advanced_proxy | boolean | false | Bypass bot detection (+$0.004/request) |
| include_images | boolean | true | Include image links in markdown output |
| include_links | boolean | true | Include hyperlinks in markdown output |
* Required parameter
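
To show how the parameters in the table map onto a request body, here is a minimal sketch that builds (but does not send) the POST request using only the Python standard library. The endpoint URL, bearer header, and parameter names are taken from the curl example and the table. `build_scrape_request` is a hypothetical helper, and its validation rules are assumptions based on the table descriptions.

```python
import json
import urllib.request

SCRAPE_ENDPOINT = "https://api.llmlayer.dev/api/v2/scrape"  # from the curl example
VALID_FORMATS = {"markdown", "html", "screenshot"}


def build_scrape_request(api_key, url, formats, **options):
    """Build a urllib Request for the scrape endpoint (hypothetical helper)."""
    if not url.startswith("https://"):
        raise ValueError("url must include https://")
    if not formats or not set(formats) <= VALID_FORMATS:
        raise ValueError("formats must be a non-empty list of markdown, html, screenshot")
    # Optional parameters use the names from the table, for example
    # main_content_only, advanced_proxy, include_images, include_links.
    payload = {"url": url, "formats": list(formats), **options}
    return urllib.request.Request(
        SCRAPE_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_scrape_request(
    "YOUR_API_KEY",  # placeholder, not a real key
    "https://example.com/article",
    ["markdown"],
    main_content_only=True,
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# Sending is left out of this sketch; urllib.request.urlopen(request) would perform the call.
```

Validating the required fields before sending avoids paying for a round trip that the API would reject anyway.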
Extract clean markdown from documentation, blogs, and articles. Use main_content_only for noise-free datasets.
Feed webpage content directly into vector databases. Markdown format preserves structure while removing HTML noise.
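
One way to prepare scraped markdown for a vector database is to split it at headings so each chunk keeps its section title. This is a minimal sketch, assuming the markdown uses ATX-style `#` headings. `chunk_markdown_by_heading` is a hypothetical helper, and embedding or storing the chunks is left to whichever vector database you use.

```python
import re

# An ATX heading: one to six '#' characters, whitespace, then text.
HEADING_RE = re.compile(r"^#{1,6}\s+\S")


def chunk_markdown_by_heading(markdown):
    """Split markdown into chunks, starting a new chunk at each ATX heading."""
    chunks, current = [], []
    for line in markdown.splitlines():
        # A heading closes the chunk collected so far and opens a new one.
        if HEADING_RE.match(line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [chunk for chunk in chunks if chunk]


sample = "# Article Title\n\nIntro text.\n\n## Pricing\n\nEach format costs $0.001."
for chunk in chunk_markdown_by_heading(sample):
    print(repr(chunk))
```

This simple splitter does not track fenced code blocks, so a `#` comment at the start of a line inside a code sample would also start a new chunk.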
Monitor competitor pages with screenshots. Compare visual changes over time for pricing or content updates.
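
Change detection can start with something as simple as comparing digests of two captures. This is a minimal sketch, assuming each screenshot arrives as a base64-encoded PNG string. `screenshot_digest` and `has_changed` are hypothetical helpers, and the sample payloads are stand-ins rather than real captures.

```python
import base64
import hashlib


def screenshot_digest(screenshot_b64):
    """Return the SHA-256 hex digest of a base64-encoded screenshot."""
    png_bytes = base64.b64decode(screenshot_b64)
    return hashlib.sha256(png_bytes).hexdigest()


def has_changed(previous_b64, current_b64):
    """Report whether two captures differ at the byte level."""
    return screenshot_digest(previous_b64) != screenshot_digest(current_b64)


# Stand-in payloads for illustration; real values would come from the scrape response.
old_capture = base64.b64encode(b"\x89PNG pricing page v1").decode("ascii")
new_capture = base64.b64encode(b"\x89PNG pricing page v2").decode("ascii")

print(has_changed(old_capture, old_capture))  # False
print(has_changed(old_capture, new_capture))  # True
```

A byte-level hash flags any difference, including ones a viewer would not notice, so pages with rotating banners or timestamps may need a pixel-level comparison instead.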
Build news aggregators or research tools. Extract and normalize content from diverse sources.
Get your API key and start scraping in minutes. Free credits to start. No credit card required.