
scrapex

Modern web scraper with LLM-enhanced extraction for Node.js

Features

Easy to Use

Simple API with sensible defaults. Scrape any webpage with a single function call.

LLM Integration

Enhance scraped content with AI-powered summarization, entity extraction, and classification.

Extensible Pipeline

Write custom extractors to pull out domain-specific data.

TypeScript First

Full TypeScript support with comprehensive type definitions.

Quick Example

import { scrape } from 'scrapex';
const result = await scrape('https://example.com/article');
console.log(result.title); // Page title
console.log(result.content); // Main content
console.log(result.textContent); // Plain text
console.log(result.links); // Extracted links
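
Network requests can fail transiently, and this page does not describe how scrape reports errors. The following is a minimal sketch of a retry wrapper, assuming only that the scrape function returns a promise that rejects on failure. The name scrapeWithRetry and its maxAttempts parameter are hypothetical and not part of scrapex.

```typescript
// Hypothetical helper, not part of scrapex: retries any promise-returning
// scrape-like function a fixed number of times before giving up.
type ScrapeFn<T> = (url: string) => Promise<T>;

async function scrapeWithRetry<T>(
  scrapeFn: ScrapeFn<T>,
  url: string,
  maxAttempts = 3,
): Promise<T> {
  if (maxAttempts < 1) {
    throw new RangeError('maxAttempts must be at least 1');
  }
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Return as soon as one attempt succeeds.
      return await scrapeFn(url);
    } catch (err) {
      // Remember the failure and try again (no delay in this sketch).
      lastError = err;
    }
  }
  // Every attempt failed: surface the most recent error.
  throw lastError;
}
```

Under that assumption, the call above would become `await scrapeWithRetry(scrape, 'https://example.com/article')`. A production version would likely add a delay between attempts and retry only on errors known to be transient.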

With LLM Enhancement

import { scrape } from 'scrapex';
import { createOpenAI, createEnhancer } from 'scrapex/llm';
const provider = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });
const enhancer = createEnhancer(provider)
  .summarize()
  .extractEntities();
const result = await scrape('https://example.com', { enhancer });
console.log(result.summary); // AI-generated summary
console.log(result.entities); // Extracted entities
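
The example above reads the API key from process.env.OPENAI_API_KEY, which is undefined when the variable is not set. How createOpenAI reacts to a missing key is not stated here, so it may be worth failing early with a clear message. A minimal sketch follows; requireEnv is a hypothetical helper, not part of scrapex.

```typescript
// Hypothetical helper, not part of scrapex: returns the named variable from
// an environment-like record, or throws if it is missing or blank.
function requireEnv(
  env: Record<string, string | undefined>,
  name: string,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

With this in place, the provider line could read `createOpenAI({ apiKey: requireEnv(process.env, 'OPENAI_API_KEY') })`. Passing the environment in as a parameter keeps the helper easy to test without touching the real process environment.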