MarkItDown
Package: markitdown · 12 nodes · Convert documents, URLs, and text to Markdown
Convert files (PDF, DOCX, PPTX, images), web pages, YouTube transcripts, and raw text/HTML into clean Markdown using Microsoft's markitdown library. Supports batch conversion, Azure Document Intelligence, and streaming binary data.
Node Reference
| Node | Type | Inputs | Outputs |
|---|---|---|---|
| Create Converter | statement | - | Converter (md.Converter) |
| Doc Intel Converter | statement | - | Converter (md.Converter) |
| Convert File | statement | Converter (md.Converter), Path (str) | Text (str), Title (str) |
| Convert URL | statement | Converter (md.Converter), URL (str) | Text (str), Title (str) |
| Convert Text | statement | Converter (md.Converter), Content (str) | Markdown (str) |
| Convert Stream | statement | Converter (md.Converter), Data (str) | Text (str), Title (str) |
| Convert YouTube | statement | Converter (md.Converter), YouTube URL (str) | Transcript (str), Title (str) |
| Batch Convert | statement | Converter (md.Converter), Directory Path (str) | Results (any), Combined Text (str) |
| Save Markdown | statement | Markdown (str), File Path (str) | File Path (str) |
| Get Text | expression | Result (md.Result) | Text (str), Title (str) |
| Get Markdown | expression | Result (md.Result) | Markdown (str) |
| Merge Results | expression | Text A (str), Text B (str) | Merged (str) |
Typical Pipeline
Create Converter → Convert File / Convert URL / Convert YouTube → use the Text output (Transcript for Convert YouTube) directly or pipe it into LLM context.
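Outside the node graph, the single-document pipeline above corresponds roughly to the Python sketch below. It assumes the nodes wrap the `markitdown` library's `MarkItDown` class and its `convert()` method, whose result exposes `text_content` and `title`. The helper name `convert_source` is hypothetical and not part of the package.

```python
def convert_source(source, converter=None):
    """Convert a file path or URL to Markdown and return (text, title)."""
    if converter is None:
        # Lazy import so the helper can also be exercised with a stand-in converter.
        from markitdown import MarkItDown
        converter = MarkItDown()
    result = converter.convert(source)
    # Some formats carry no title; normalise a missing title to an empty string.
    title = getattr(result, "title", None) or ""
    return result.text_content, title
```

The returned text can be placed straight into an LLM prompt. Reusing one converter across several calls matches the way a single Create Converter node feeds multiple Convert nodes.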
For batch processing: Create Converter → Batch Convert → pass Combined Text to Save Markdown or to other downstream nodes.
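The batch pipeline could be approximated as follows. This is a sketch under assumptions, not the node's actual implementation: the function names `batch_convert` and `save_markdown`, the `---` separator between documents, and the choice to skip files that fail to convert are all illustrative. The converter is expected to behave like a `markitdown.MarkItDown` instance, that is, to offer `convert(path)` returning an object with `text_content`.

```python
from pathlib import Path


def batch_convert(converter, directory):
    """Convert every regular file in `directory`; return (results, combined_text)."""
    results = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # ignore subdirectories in this simple sketch
        try:
            results[path.name] = converter.convert(str(path)).text_content
        except Exception:
            # Assumption: unsupported or unreadable files are skipped, not fatal.
            continue
    # Join per-file Markdown under a heading per file, separated by rules.
    combined = "\n\n---\n\n".join(
        f"# {name}\n\n{text}" for name, text in results.items()
    )
    return results, combined


def save_markdown(markdown, file_path):
    """Write Markdown to disk and return the path, like the Save Markdown node."""
    Path(file_path).write_text(markdown, encoding="utf-8")
    return str(file_path)
```

Returning both the per-file mapping and the combined string mirrors the two outputs of Batch Convert (Results and Combined Text).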
For Azure Document Intelligence: Doc Intel Converter (with endpoint) → Convert File → Get Text.
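For reference, a minimal sketch of what the Doc Intel Converter step might do. It assumes the node passes the endpoint to `MarkItDown` through its `docintel_endpoint` keyword argument. The helper name `create_docintel_converter` and the empty-endpoint check are hypothetical additions for illustration. Azure credentials and any optional Azure dependencies of `markitdown` are assumed to be configured separately.

```python
def create_docintel_converter(endpoint):
    """Build a converter that routes documents through Azure Document Intelligence."""
    if not endpoint or not endpoint.strip():
        # Fail early instead of surfacing a less obvious error at conversion time.
        raise ValueError("An Azure Document Intelligence endpoint is required")
    # Lazy import: the library is only needed once a usable endpoint is supplied.
    from markitdown import MarkItDown
    return MarkItDown(docintel_endpoint=endpoint.strip())
```

The resulting converter is then used like any other, for example by calling `convert(path)` and reading `text_content` from the result.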