Getting Started
Install markdown-for-agents and learn common usage patterns.
This guide walks you through installing markdown-for-agents and using it in common scenarios.
Want to try it first? Open the playground to convert HTML to Markdown right in your browser.
Installation
npm install markdown-for-agents
The library has a single runtime dependency (htmlparser2) and works in any JavaScript environment that supports ES2022.
Python
markdown-for-agents is also available as a pure Python package with zero dependencies. See the Python package docs for installation and usage.
Your First Conversion
import { convert } from 'markdown-for-agents';
const { markdown } = convert('<h1>Hello</h1><p>World</p>');
console.log(markdown);
// # Hello
//
// World
The convert function takes an HTML string and returns an object with:
- markdown - the converted Markdown string
- tokenEstimate - an object with rough token, character, and word counts
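To give a feel for what a "rough" estimate can mean, here is a minimal sketch of a heuristic counter. The roughEstimate helper is hypothetical and is not the library's implementation; it assumes the common rule of thumb of about four characters per token, which may or may not match the built-in heuristic.

```javascript
// Hypothetical helper, not part of markdown-for-agents: a rough estimate
// based on the "about four characters per token" rule of thumb.
function roughEstimate(text) {
  const characters = text.length;
  // Split on runs of whitespace; filter(Boolean) drops empty strings so
  // leading or trailing whitespace does not inflate the word count.
  const words = text.split(/\s+/).filter(Boolean).length;
  const tokens = Math.ceil(characters / 4);
  return { tokens, characters, words };
}

const estimate = roughEstimate('# Hello\n\nWorld');
console.log(estimate);
// { tokens: 4, characters: 14, words: 3 }
```

A heuristic like this is cheap and dependency-free, but real tokenizers can differ noticeably on code, URLs, and non-English text.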
Converting a Web Page
To convert a fetched web page and strip away navigation, ads, and boilerplate:
import { convert } from 'markdown-for-agents';
const response = await fetch('https://example.com/article');
const html = await response.text();
const { markdown, tokenEstimate } = convert(html, {
  extract: true,
  baseUrl: 'https://example.com'
});
console.log(markdown);
console.log(`~${tokenEstimate.tokens} tokens`);
The extract: true option strips non-content elements (nav, footer, ads, etc.), and baseUrl resolves relative links and images to absolute URLs.
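As an illustration of what resolving against a base URL means, the sketch below uses the standard URL constructor available in browsers and Node.js. Whether markdown-for-agents resolves links in exactly this way is an assumption; the resolve helper and the example paths are hypothetical.

```javascript
// Illustration only: the WHATWG URL constructor resolves a relative
// reference against a base. Whether markdown-for-agents follows the
// same rules internally is an assumption.
const baseUrl = 'https://example.com/blog/post';

// Hypothetical helper for this example.
const resolve = (href) => new URL(href, baseUrl).href;

console.log(resolve('/images/cover.png'));          // https://example.com/images/cover.png
console.log(resolve('related'));                    // https://example.com/blog/related
console.log(resolve('https://other.example/page')); // https://other.example/page
```

Under these rules, a path-relative link such as related is resolved against the directory of the base path, so passing the full page URL rather than only the origin can change the result.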
Common Patterns
Converting an HTML Fragment
For HTML fragments without a full page structure, no extraction is needed:
const { markdown } = convert(`
  <h2>Features</h2>
  <ul>
    <li>Fast</li>
    <li>Lightweight</li>
    <li>Universal</li>
  </ul>
`);
// ## Features
//
// - Fast
// - Lightweight
// - Universal
Customizing Markdown Output
Control the output style with options:
const { markdown } = convert(html, {
  headingStyle: 'setext',  // Title\n=====
  bulletChar: '*',         // * list items
  fenceChar: '~',          // ~~~ code blocks
  strongDelimiter: '__',   // __bold__
  emDelimiter: '_'         // _italic_
});
Frontmatter
By default, metadata from the HTML <head> is extracted and prepended as YAML frontmatter:
const { markdown } = convert('<html><head><title>My Page</title></head><body><p>Hello</p></body></html>');
// ---
// title: My Page
// ---
//
// Hello
Disable it with frontmatter: false, or merge custom fields with frontmatter: { author: 'Jane' }. See the Frontmatter guide for details.
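To picture how merged fields might end up in the output, here is a small sketch. The renderFrontmatter helper is hypothetical and is not the library's code; it assumes that custom fields win over extracted ones on key conflicts, and it does no YAML quoting or escaping.

```javascript
// Hypothetical sketch, not the library's implementation: merge extracted
// metadata with custom fields and render simple YAML frontmatter.
function renderFrontmatter(extracted, custom = {}) {
  // Assumption: custom fields override extracted ones on key conflicts.
  const fields = { ...extracted, ...custom };
  // Simplification: values are written as-is, with no YAML escaping.
  const lines = Object.entries(fields).map(([key, value]) => `${key}: ${value}`);
  return ['---', ...lines, '---'].join('\n');
}

console.log(renderFrontmatter({ title: 'My Page' }, { author: 'Jane' }));
// ---
// title: My Page
// author: Jane
// ---
```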
Using Custom Rules
Override how specific elements are converted:
import { convert, createRule } from 'markdown-for-agents';
const { markdown } = convert(html, {
  rules: [
    // Convert <details> to a blockquote
    createRule('details', ({ convertChildren, node }) => {
      const content = convertChildren(node).trim();
      return `\n\n> ${content}\n\n`;
    })
  ]
});
See the Custom Rules guide for the full rule API.
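The rule above prefixes only the first line with >. If the converted children contain blank lines (several paragraphs), the later paragraphs would fall outside the blockquote. A minimal sketch of a helper that quotes every line follows; toBlockquote is a hypothetical name, not part of the library.

```javascript
// Hypothetical helper: prefix every line with "> " so multi-paragraph
// content stays inside the blockquote.
function toBlockquote(content) {
  const quoted = content
    .trim()
    .split('\n')
    // Blank lines get a bare ">" so the quote is not interrupted.
    .map((line) => (line === '' ? '>' : `> ${line}`))
    .join('\n');
  return `\n\n${quoted}\n\n`;
}

console.log(toBlockquote('Summary\n\nHidden details'));
```

Inside the rule callback, the return value could then be written as toBlockquote(convertChildren(node)).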
Serving Markdown via Middleware
If you're building a web server, you can automatically respond with Markdown when AI agents request it. Each middleware is a separate package - install only what you need:
// Express
import express from 'express';
import { markdown } from '@markdown-for-agents/express';
const app = express();
app.use(markdown({ extract: true }));
app.get('/article', (req, res) => {
  res.send('<h1>Title</h1><p>Content...</p>');
});
// Fastify
import Fastify from 'fastify';
import { markdown } from '@markdown-for-agents/fastify';
const fastify = Fastify();
fastify.register(markdown({ extract: true }));
// Hono
import { Hono } from 'hono';
import { markdown } from '@markdown-for-agents/hono';
const app = new Hono();
app.use(markdown({ extract: true }));
// Next.js (route handler)
import { withMarkdown } from '@markdown-for-agents/nextjs';
function handler() {
  return new Response('<h1>Title</h1><p>Content...</p>', {
    headers: { 'content-type': 'text/html' }
  });
}
export const GET = withMarkdown(handler, { extract: true });
When a client sends Accept: text/markdown, the response is automatically converted. Normal requests pass through untouched. See the Middleware guide for all framework integrations, or the Next.js example for a complete working app with the proxy pattern.
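The content negotiation described above can be pictured with a small sketch. The wantsMarkdown helper is hypothetical and is not the adapters' actual logic; it simply checks whether text/markdown appears among the media types in the Accept header and ignores quality weights.

```javascript
// Hypothetical helper, not the adapters' actual logic: decide whether a
// request's Accept header asks for Markdown.
function wantsMarkdown(acceptHeader = '') {
  return acceptHeader
    .split(',')
    // Keep only the media type, dropping parameters such as q=0.9.
    .map((part) => part.split(';')[0].trim().toLowerCase())
    .includes('text/markdown');
}

console.log(wantsMarkdown('text/markdown'));                   // true
console.log(wantsMarkdown('text/html, text/markdown;q=0.9'));  // true
console.log(wantsMarkdown('text/html,application/xhtml+xml')); // false
```

A stricter version would also honor q=0, which explicitly marks a type as not acceptable.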
Token Estimation
Every conversion returns a token estimate for LLM cost planning. The built-in heuristic works out of the box, or plug in an exact tokenizer:
const { tokenEstimate } = convert(html);
console.log(tokenEstimate);
// { tokens: 12, characters: 46, words: 8 }
Deduplication
Real-world pages often repeat content (nav links, CTAs, footers). Enable deduplication to remove repeated blocks:
const { markdown } = convert(html, { deduplicate: true });
Server Timing
Measure conversion performance with the serverTiming option. Middleware adapters surface this as a Server-Timing header:
const { markdown, convertDuration } = convert(html, { serverTiming: true });
console.log(`Took ${convertDuration}ms`);
Content-Signal Header
Signal publisher consent for AI usage via an HTTP header. The header is set only when explicitly configured:
app.use(markdown({
  contentSignal: { aiTrain: true, search: true, aiInput: true }
}));
// Sets: content-signal: ai-train=yes, search=yes, ai-input=yes
See Advanced Options for full details on all of the above.
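To make the Content-Signal mapping concrete, here is a minimal sketch of how a contentSignal object could be turned into the header value shown in the example. The contentSignalValue helper is hypothetical, and mapping false to "no" is an assumption, since this guide only shows the "yes" case.

```javascript
// Hypothetical sketch of the mapping shown in the Content-Signal example;
// the real adapters may differ.
function contentSignalValue(signals) {
  return Object.entries(signals)
    .map(([key, allowed]) => {
      // camelCase to kebab-case: aiTrain -> ai-train
      const name = key.replace(/[A-Z]/g, (ch) => `-${ch.toLowerCase()}`);
      // Assumption: false maps to "no"; only "yes" appears in the guide.
      return `${name}=${allowed ? 'yes' : 'no'}`;
    })
    .join(', ');
}

console.log(contentSignalValue({ aiTrain: true, search: true, aiInput: true }));
// ai-train=yes, search=yes, ai-input=yes
```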
What's Next
- Playground - try the converter interactively with any URL or HTML
- Content Extraction - fine-tune what gets stripped from web pages
- Frontmatter - control the YAML metadata prepended to output
- Custom Rules - extend the converter with your own element handlers
- Middleware - integrate with Express, Fastify, Hono, Next.js, or any Web Standard server
- Supported Elements - full reference of HTML-to-Markdown mappings
- Advanced Options - custom token counters, deduplication, server timing, and content-signal headers
- API Reference - complete API documentation