MarkItDown

Package: markitdown · 12 nodes · Convert documents, URLs, and text to Markdown

Convert files (PDF, DOCX, PPTX, images), web pages, YouTube transcripts, and raw text/HTML into clean Markdown using Microsoft's markitdown library. Supports batch conversion, Azure Document Intelligence, and streaming binary data.

Node Reference

| Node | Type | Inputs | Outputs |
|---|---|---|---|
| Create Converter | statement | - | Converter (md.Converter) |
| Doc Intel Converter | statement | - | Converter (md.Converter) |
| Convert File | statement | Converter (md.Converter), Path (str) | Text (str), Title (str) |
| Convert URL | statement | Converter (md.Converter), URL (str) | Text (str), Title (str) |
| Convert Text | statement | Converter (md.Converter), Content (str) | Markdown (str) |
| Convert Stream | statement | Converter (md.Converter), Data (str) | Text (str), Title (str) |
| Convert YouTube | statement | Converter (md.Converter), YouTube URL (str) | Transcript (str), Title (str) |
| Batch Convert | statement | Converter (md.Converter), Directory Path (str) | Results (any), Combined Text (str) |
| Save Markdown | statement | Markdown (str), File Path (str) | File Path (str) |
| Get Text | expression | Result (md.Result) | Text (str), Title (str) |
| Get Markdown | expression | Result (md.Result) | Markdown (str) |
| Merge Results | expression | Text A (str), Text B (str) | Merged (str) |

Typical Pipeline

Create Converter → Convert File / Convert URL / Convert YouTube → use the Text output directly or pass it to an LLM as context.
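
For orientation, here is a rough Python sketch of what this pipeline might look like when calling the markitdown library directly. It assumes the library's `MarkItDown` class, its `convert()` method, and the `text_content` / `title` attributes on the result; the helper names `create_converter`, `convert_to_markdown`, and `build_llm_context`, and the `max_chars` budget, are hypothetical and not part of this package.

```python
def create_converter():
    """Build a default converter (roughly the Create Converter step)."""
    # Imported lazily so the helpers below work with any object exposing .convert().
    from markitdown import MarkItDown  # pip install markitdown
    return MarkItDown()


def convert_to_markdown(converter, source):
    """Convert one file path or URL and return (text, title)."""
    result = converter.convert(str(source))
    # text_content holds the Markdown; title may be None for some inputs.
    return result.text_content, getattr(result, "title", None)


def build_llm_context(text, title=None, max_chars=8000):
    """Prefix the title (if any) and truncate to a character budget.

    max_chars is an illustrative limit, not something the package defines.
    """
    header = f"# {title}\n\n" if title else ""
    return (header + text)[:max_chars]
```

A call such as `convert_to_markdown(create_converter(), "report.pdf")` (with a hypothetical file name) would then return the Markdown text and title, which `build_llm_context` can trim before it goes into a prompt.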

For batch processing: Create Converter → Batch Convert → Save Markdown or feed Combined Text to downstream nodes.
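
The batch flow could be approximated in plain Python as below. This is only a sketch under assumptions: the suffix filter, the `---` separator, the per-file `##` headings, and the helper names `batch_convert` and `save_markdown` are illustrative choices, and the actual Batch Convert node may select files and combine text differently. The converter is any object with a `convert(path)` method whose result has `text_content`, as markitdown's `MarkItDown` does.

```python
from pathlib import Path


def batch_convert(converter, directory, suffixes=(".pdf", ".docx", ".pptx")):
    """Convert every matching file in a directory.

    Returns (results, combined) where results is a list of
    (file name, markdown text) pairs and combined joins them into one string.
    """
    results = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        try:
            result = converter.convert(str(path))
        except Exception as exc:  # keep going if one file fails
            print(f"skipped {path.name}: {exc}")
            continue
        results.append((path.name, result.text_content))
    combined = "\n\n---\n\n".join(f"## {name}\n\n{text}" for name, text in results)
    return results, combined


def save_markdown(markdown, file_path):
    """Write Markdown to disk as UTF-8 and return the path as a string."""
    path = Path(file_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(markdown, encoding="utf-8")
    return str(path)
```

Skipping files that fail to convert keeps one unreadable document from aborting the whole run; a stricter pipeline might prefer to collect and report the failures instead.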

For Azure Document Intelligence: Doc Intel Converter (with endpoint) → Convert File → Get Text.
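
As a hedged illustration of the endpoint setup, the markitdown library accepts a `docintel_endpoint` keyword on its `MarkItDown` constructor. The sketch below wraps that in two hypothetical helpers; the `DOCINTEL_ENDPOINT` environment variable name and the `https://` check are assumptions made for this example, not behavior of the Doc Intel Converter node.

```python
import os


def docintel_kwargs(endpoint=None):
    """Return constructor keyword arguments for a Document Intelligence converter.

    Falls back to the DOCINTEL_ENDPOINT environment variable
    (a hypothetical name chosen for this sketch).
    """
    endpoint = endpoint or os.environ.get("DOCINTEL_ENDPOINT")
    if not endpoint:
        raise ValueError("an Azure Document Intelligence endpoint is required")
    if not endpoint.startswith("https://"):
        raise ValueError("endpoint should be an https:// URL")
    return {"docintel_endpoint": endpoint}


def create_docintel_converter(endpoint=None):
    """Build a converter that routes conversions through Document Intelligence."""
    # Imported lazily; the Azure integration may need extra dependencies installed.
    from markitdown import MarkItDown
    return MarkItDown(**docintel_kwargs(endpoint))
```

The returned converter is then used like the default one, for example `create_docintel_converter(endpoint).convert("scan.pdf").text_content` with a hypothetical file name.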