PDF & Document Extraction

Productivity Tools

Extract text from PDFs and scanned documents. Use web_extract for remote URLs, pymupdf for local text-based PDFs, and marker-pdf for OCR/scanned docs. For DOCX, use python-docx; for PPTX, see the powerpoint skill.

Practical Examples

Beginner: Quick Start

PDF & Document Extraction Quick Start

For expert guidance and support with: Extract text from PDFs and scanned documents. Use web_extrac


Acting as PDF & Document Extraction, please help me with the following task: Extract text from PDFs and scanned documents. Use web_extract for remote URLs, pymupdf for local tex

For DOCX: use `python-docx` (parses actual document structure, far better than OCR). For PPTX: see the `powerpoint` skill (uses `python-pptx` with full slide/notes support). This skill covers **PDFs and scanned documents**.
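The tool choices described above can be sketched as a small routing helper. This is only an illustration, not part of the skill itself: the function names `choose_extractor` and `extract_pdf_text` and the `scanned` flag are hypothetical, and the caller is assumed to know whether a local PDF needs OCR. The second function shows one common way to pull text from a text-based PDF with `pymupdf`; it assumes the package is installed.

```python
from pathlib import Path
from urllib.parse import urlparse


def choose_extractor(source: str, scanned: bool = False) -> str:
    """Return the name of the tool suggested for `source`.

    `scanned` is a caller-supplied hint (hypothetical parameter): set it to
    True when a local document has no usable text layer and needs OCR.
    """
    # Remote URLs go to web_extract regardless of file type.
    if urlparse(source).scheme in ("http", "https"):
        return "web_extract"

    suffix = Path(source).suffix.lower()
    if suffix == ".docx":
        return "python-docx"       # parses real document structure
    if suffix == ".pptx":
        return "powerpoint skill"  # handled by a separate skill
    # Everything else is treated as a PDF or scanned document.
    return "marker-pdf" if scanned else "pymupdf"


def extract_pdf_text(path: str) -> str:
    """Concatenate the text of every page of a local, text-based PDF."""
    # Imported lazily so the routing helper works without pymupdf installed.
    # Older releases expose the same API under the module name `fitz`.
    import pymupdf

    with pymupdf.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)
```

For example, `choose_extractor("scan.pdf", scanned=True)` picks `marker-pdf`, while the same call without the flag picks `pymupdf`.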
