Which WebConverter features are now WebMCP tools?

All of them. Image conversion, images→PDF, OCR (Tesseract), merge/reorder/delete PDF pages, extract PDF text, document conversion with Pandoc, audio conversion and extraction, video transcoding and trimming with ffmpeg-wasm, background removal with U²-Net + WASM matting, and audio transcription with Whisper.

Do these tools upload my files?

No. Every tool runs entirely in your browser, using WebAssembly and Web Workers, exactly like the human-facing UI. The only network access is the one-time lazy download of an engine or model (Pandoc, ffmpeg, Whisper, Tesseract, U²-Net), which the browser then caches.

How does an agent discover the tools?

It calls list_supported_formats first; the response is a structured map of every tool's inputs, outputs and engine. The tool descriptors are also discoverable via document.modelContext (the W3C WebMCP spec) and the always-present window.webmcpToolRegistry.
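As a rough sketch of how an agent might use that map: the response shape below (tool name mapped to inputs, outputs, engine and an optional note) is assumed for illustration and is not WebConverter's actual format. The engine label "wasm-importer" and the helper findTools are likewise hypothetical.

```javascript
// Hypothetical excerpt of a list_supported_formats response.
// Field names and values are assumptions; the real response may differ.
const formatMap = {
  convert_image: {
    inputs: ["png", "jpeg", "webp", "heic", "avif"],
    outputs: ["png", "jpeg", "webp"],
    engine: "wasm-importer",
    notes: { heic: "Safari/iOS only" },
  },
  extract_pdf_text: {
    inputs: ["pdf"],
    outputs: ["txt", "md"],
    engine: "pdf.js",
  },
};

// Return the names of every tool that accepts `input` and can produce `output`.
function findTools(map, input, output) {
  return Object.entries(map)
    .filter(([, spec]) => spec.inputs.includes(input) && spec.outputs.includes(output))
    .map(([name]) => name);
}
```

An agent could call findTools(formatMap, "pdf", "md") to pick a tool before sending any bytes, and read the notes field to learn about caveats such as the HEIC restriction.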

Why is HEIC marked 'Safari/iOS only'?

Because that is the truth. WebConverter's WASM importers do not include a HEIF/HEIC decoder, so HEIC decoding falls back to the browser's own decoder, and only Safari/iOS provides one. Listing HEIC as universally supported would be misleading. AVIF, by contrast, is decoded by Chrome and Firefox too.

Is it free?

Yes. No backend, no API key, no rate limit. As free as opening the page.

Every WebConverter Feature Is Now a WebMCP Tool — PDF, OCR, Whisper, Video, Background Removal

Editor's note (2026-05). Chrome 150 deprecated navigator.modelContext in favour of document.modelContext (per WebMCP spec PR #184). Mentions of the API in this post use the forward-compatible feature-detection pattern:
const modelContext = document.modelContext || navigator.modelContext;
if (modelContext) { /* WebMCP is available: register or call tools here */ }
WebConverter's own bootstrap uses this exact fallback, so the same site code keeps working on browsers that still ship the older identifier.
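The same fallback can be wrapped in a small helper so it is testable outside a browser. This is a minimal sketch, not WebConverter's bootstrap code; resolveModelContext and the stand-in objects are hypothetical names introduced here.

```javascript
// Resolve the WebMCP entry point from a window-like object, preferring the
// newer document.modelContext and falling back to navigator.modelContext.
// Returns null when neither is present (WebMCP is not available).
function resolveModelContext(win) {
  return win?.document?.modelContext || win?.navigator?.modelContext || null;
}

// In a page this would be: const modelContext = resolveModelContext(window);
// Plain objects stand in for `window` so the sketch also runs outside a browser.
const newerBrowser = { document: { modelContext: { id: "document" } }, navigator: {} };
const olderBrowser = { document: {}, navigator: { modelContext: { id: "navigator" } } };
const unsupported = { document: {}, navigator: {} };
```

Passing the window in as a parameter keeps the detection logic free of globals, which makes the three cases (new identifier, old identifier, no support) easy to check.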

When we first shipped WebMCP support, only image conversion was wired up. That was useful, but it was a sliver of what WebConverter can do. As of today every feature on the site is exposed as a WebMCP tool: an AI agent can go through document.modelContext to convert images, build and edit PDFs, OCR scans, extract PDF text, convert documents with Pandoc, convert and trim video, convert and extract audio, transcribe speech with Whisper, and remove image backgrounds. All of it runs locally in the browser, without uploading anything.

The Full Tool Catalogue

The thirteen file tools, plus the list_supported_formats discovery tool, are registered globally (on every page of WebConverter, not just /webmcp.html), so an agent can use them from wherever the user is. Each file tool returns a base64 file plus a data: URL (and, where appropriate, the structured output as plain text or JSON):

convert_image — image to PNG, JPEG, WebP, BMP, TGA, HDR, EXR or KTX2. Accepts every format the WASM importer reads (BMP, DDS, GIF, HDR, ICO, JPEG, KTX, PGM, PIC, PNG, PPM, PSD, TGA, WebP) plus, via a browser-decode fallback, HEIC on Safari/iOS and AVIF.
images_to_pdf — one or more images into a single PDF, one image per page.
images_to_searchable_pdf — images plus Tesseract OCR so the resulting PDF has a selectable, searchable text layer on top of the pixels.
merge_pdfs — whole-document merge of several PDFs.
reorder_pdf_pages — a PDF and a new page order, out comes a re-ordered PDF.
delete_pdf_pages — drop pages by index.
extract_pdf_text — text or simple Markdown out of any PDF, via pdf.js.
convert_document — DOCX, ODT, RTF, HTML, Markdown, LaTeX, RST, MediaWiki or EPUB → Markdown, HTML, plain text, LaTeX, RST, AsciiDoc, DOCX or ODT, with Pandoc compiled to WebAssembly. Lazy ~56 MB download on first call.
convert_audio — any audio (and the audio track of a video) to MP3, OGG, WAV or FLAC; decoding uses the browser's own codecs.
convert_video — MP4 (H.264 + AAC), WebM (VP9 + Opus) or animated GIF with ffmpeg-wasm; lazy per-variant download.
trim_video — cut a clip from startTime to endTime seconds.
remove_image_background — transparent PNG/WebP via a tiny U²-Net-P ONNX model plus a deterministic WASM matting pass for clean edges.
transcribe_audio — Whisper (quantised, ~40 MB) running in WebAssembly, with timestamped segments.
list_supported_formats — call this first; the agent gets a structured map of every tool's inputs, outputs and engine.
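The "base64 file plus a data: URL" return shape can be sketched as follows. The result field names (mimeType, base64, dataUrl) and the helpers bytesToBase64 and toToolResult are assumptions for illustration; the actual result object may be laid out differently.

```javascript
// Encode raw bytes as base64. Bytes are processed in chunks so that
// String.fromCharCode is never called with an oversized argument list.
function bytesToBase64(bytes) {
  let binary = "";
  const chunkSize = 0x8000;
  for (let i = 0; i < bytes.length; i += chunkSize) {
    binary += String.fromCharCode(...bytes.subarray(i, i + chunkSize));
  }
  return btoa(binary);
}

// Build a hypothetical tool result: the file as base64 plus a data: URL.
function toToolResult(bytes, mimeType) {
  const base64 = bytesToBase64(bytes);
  return { mimeType, base64, dataUrl: `data:${mimeType};base64,${base64}` };
}

// Two bytes spelling "Hi", standing in for a converted file.
const result = toToolResult(new Uint8Array([72, 105]), "text/plain");
```

Returning both forms is convenient for agents: the base64 string can be passed straight into the next tool call, while the data: URL can be shown or offered as a download without further encoding.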

Why This Matters for Agents

An AI assistant that wants to do something with a file today usually has three bad options: upload it to a third-party API, run a server-side tool that touches your data, or refuse. WebMCP changes that because the tool is the page's own JavaScript. The agent gets the capability; your file never leaves the tab. There is no API key, no rate limit, no cost, and near-zero carbon because the upload-process-download round trip never happens.

That model finally scales beyond images. An agent can now:

Take a folder of phone photos and produce a single, OCR'd PDF — locally.
Transcribe a meeting recording into searchable text, in the browser, without sending audio to anyone.
Crop a 90-second clip out of a 4K video for a presentation, locally, with ffmpeg-wasm.
Strip backgrounds out of a batch of product photos for a shop listing — no subscription, no upload.
Pull the text out of a contract PDF, run a docx → markdown conversion on the related Word file, and merge several PDFs into one — all from the same tool surface.

Lazy by Design

The webmcp.js bootstrap is tiny. The expensive parts — Pandoc's ~56 MB WASM, the ffmpeg cores, the U²-Net ONNX model, the Whisper model, Tesseract's language data — are only fetched the first time the matching tool is called, then the browser caches them. Agents that never call convert_video never pay the ffmpeg download. Asset budget intact.
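One common way to get this behaviour is a memoised loader, sketched below. This is not taken from webmcp.js; lazyOnce, getEngine and the fake engine are hypothetical, and a real loader would fetch and instantiate the WASM module or model instead.

```javascript
// Wrap a loader so the expensive download starts only on first use, and
// concurrent or later callers share the same promise.
function lazyOnce(loader) {
  let pending = null;
  return function get() {
    if (pending === null) {
      pending = new Promise((resolve) => resolve(loader())).catch((err) => {
        pending = null; // allow a retry after a failed download
        throw err;
      });
    }
    return pending;
  };
}

// Stand-in for fetching and instantiating a large WASM engine.
let downloads = 0;
const getEngine = lazyOnce(async () => {
  downloads += 1;
  return { name: "fake-engine" };
});
```

A tool handler would await getEngine() at the start of each call. Nothing is downloaded until the first call, and repeated or overlapping calls reuse the same in-flight promise rather than starting a second download.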

Privacy, Safety, Honesty

Every tool is annotated readOnlyHint: it takes bytes and returns bytes, never writes files and never reads other tabs. The only network requests are the lazy downloads of the engines and models themselves (Pandoc, ffmpeg, Whisper, Tesseract, U²-Net), each from a stable URL and cached afterwards.
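An agent or test harness could verify the annotation before trusting a tool list. The sketch below assumes each descriptor carries an annotations object with a boolean readOnlyHint; that layout, the sample descriptors and the helper findNonReadOnly are illustrative assumptions.

```javascript
// Hypothetical descriptors; only the fields needed for the check are shown.
const tools = [
  { name: "merge_pdfs", annotations: { readOnlyHint: true } },
  { name: "trim_video", annotations: { readOnlyHint: true } },
];

// Return the names of tools that are NOT explicitly marked read-only.
// A missing annotation counts as "not read-only" to stay on the safe side.
function findNonReadOnly(descriptors) {
  return descriptors
    .filter((tool) => tool.annotations?.readOnlyHint !== true)
    .map((tool) => tool.name);
}
```

An empty result from findNonReadOnly(tools) means every listed tool declares itself read-only. The hint is a declaration by the page, not an enforcement mechanism.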

And we are honest about formats: HEIC is in list_supported_formats as "Safari/iOS only" because that is the truth — Chrome and Firefox do not decode HEIC natively, and shipping a multi-megabyte HEIC decoder would violate the project's asset-size budget. The fallback decodes whatever the browser itself can decode and no more.

Try It

The WebMCP page lists every registered tool and includes a working live demo. If you are building an in-browser agent, or just want to see what a complete, honest, private, zero-cost WebMCP file-tools surface looks like, this is it. And it is just a web page.
