markdown-for-agents

Frontmatter

Automatically extract metadata from HTML and prepend it as YAML frontmatter.

By default, metadata is extracted from the HTML <head> element and prepended as YAML frontmatter. This aligns with Cloudflare's Markdown for Agents convention, giving AI agents structured context about the page before the content begins.

Basic Usage

TypeScript:

import { convert } from 'markdown-for-agents';

const html = `<html>
  <head>
    <title>My Page</title>
    <meta name="description" content="A great page about things">
    <meta property="og:image" content="https://example.com/hero.png">
  </head>
  <body><p>Content here</p></body>
</html>`;

const { markdown } = convert(html);

Python:

from markdown_for_agents import convert

html = """<html>
  <head>
    <title>My Page</title>
    <meta name="description" content="A great page about things">
    <meta property="og:image" content="https://example.com/hero.png">
  </head>
  <body><p>Content here</p></body>
</html>"""

result = convert(html)

Output:

---
title: My Page
description: A great page about things
image: https://example.com/hero.png
---

Content here
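
On the consuming side, an agent or script may want to separate the frontmatter from the body before using either. The following is a minimal sketch of that step, not part of the library: `split_frontmatter` is a hypothetical helper, and it assumes flat `key: value` lines like the output above. Nested or quoted YAML would need a real YAML parser.

```python
def split_frontmatter(markdown: str) -> tuple[dict[str, str], str]:
    """Split a document into (frontmatter fields, body). Hypothetical helper."""
    lines = markdown.splitlines()
    # No opening delimiter on the first line means there is no frontmatter.
    if not lines or lines[0].strip() != "---":
        return {}, markdown
    fields: dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the body.
            body = "\n".join(lines[index + 1:]).lstrip("\n")
            return fields, body
        # Split on the first colon only, so URLs in values stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    # Opening delimiter without a closing one: treat everything as body.
    return {}, markdown


document = """---
title: My Page
description: A great page about things
image: https://example.com/hero.png
---

Content here
"""

fields, body = split_frontmatter(document)
print(fields["title"])
print(body)
```

Splitting on the first colon only is what keeps `https://example.com/hero.png` in one piece.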

Extracted Fields

| Field | Source |
| --- | --- |
| title | `<title>` element |
| description | `<meta name="description">` |
| image | `<meta property="og:image">` |
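
To make the mapping above concrete, here is a rough sketch of how these three fields could be read from the `<head>` with Python's standard `html.parser`. It illustrates the mapping only and is not the library's implementation; `HeadMetadataParser` and `extract_head_metadata` are hypothetical names.

```python
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Collects title, description, and image from the <head> of a page."""

    def __init__(self) -> None:
        super().__init__()
        self.metadata: dict[str, str] = {}
        self._in_title = False
        self._past_head = False

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self._past_head = True
        if self._past_head:
            return
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            content = attributes.get("content")
            if not content:
                return
            # Keep the first occurrence of each field.
            if attributes.get("name") == "description":
                self.metadata.setdefault("description", content)
            elif attributes.get("property") == "og:image":
                self.metadata.setdefault("image", content)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "head":
            self._past_head = True

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.metadata.setdefault("title", data.strip())


def extract_head_metadata(html: str) -> dict[str, str]:
    parser = HeadMetadataParser()
    parser.feed(html)
    parser.close()
    # Emit fields in a fixed order, skipping any that were not found.
    order = ("title", "description", "image")
    return {key: parser.metadata[key] for key in order if key in parser.metadata}


html = """<html>
  <head>
    <title>My Page</title>
    <meta name="description" content="A great page about things">
    <meta property="og:image" content="https://example.com/hero.png">
  </head>
  <body><p>Content here</p></body>
</html>"""

metadata = extract_head_metadata(html)
print(metadata)
```

Fields that are missing from the page are simply left out of the result in this sketch; how the library itself treats missing fields is not specified here.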

Disabling Frontmatter

TypeScript:

convert(html, { frontmatter: false });

Python:

convert(html, frontmatter=False)

Custom Fields

Pass an object (TypeScript) or dict (Python) as the frontmatter option to merge custom fields into the extracted metadata. Where a key appears in both, the custom value overrides the extracted one:

TypeScript:

convert(html, { frontmatter: { author: 'Jane', title: 'Custom Title' } });

Python:

convert(html, frontmatter={"author": "Jane", "title": "Custom Title"})

Output:

---
title: Custom Title
description: A great page about things
image: https://example.com/hero.png
author: Jane
---
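
The override behavior shown above matches an ordinary dictionary merge in which later keys win. As a sketch under that assumption (the library's internals may differ), the hypothetical helpers `merge_frontmatter` and `render_frontmatter` below reproduce the same output:

```python
def merge_frontmatter(extracted: dict[str, str], custom: dict[str, str]) -> dict[str, str]:
    # Later keys win in a dict merge, so custom values override extracted ones.
    # Overridden keys keep their original position; new keys are appended.
    return {**extracted, **custom}


def render_frontmatter(fields: dict[str, str]) -> str:
    # Naive rendering: values are written unquoted, which is only safe for
    # simple strings. A real implementation would use a YAML serializer.
    lines = ["---"]
    lines += [f"{key}: {value}" for key, value in fields.items()]
    lines.append("---")
    return "\n".join(lines)


extracted = {
    "title": "My Page",
    "description": "A great page about things",
    "image": "https://example.com/hero.png",
}
custom = {"author": "Jane", "title": "Custom Title"}

merged = merge_frontmatter(extracted, custom)
print(render_frontmatter(merged))
```

Because an overridden key keeps its original position, `title` stays first and `author` is appended at the end, as in the output above.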
