Although pandoc itself will not create or modify any files other than those you explicitly ask it to create (with the exception of temporary files used in producing PDFs), a filter or custom writer could in principle do anything on your file system. Please audit filters and custom writers very carefully before using them.
Several input formats (including HTML, Org, and RST) support include
directives that allow the contents of a file to be included in the
output. An untrusted attacker could use these to view the contents of
files on the file system. (Using the --sandbox option can protect
against this threat.)
Several output formats (including RTF, FB2, HTML with
--self-contained, EPUB, Docx, and ODT) will embed encoded or raw images
into the output file. An untrusted attacker could exploit this to view
the contents of non-image files on the file system. (Using the
--sandbox option can protect against this threat, but will also prevent
including images in these formats.)
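As an illustration of the two points above, here is a minimal Python sketch of an application that shells out to pandoc with --sandbox when converting untrusted input. The helper names sandboxed_pandoc_argv and convert_untrusted are hypothetical, the default formats are arbitrary, and the sketch assumes a pandoc executable on PATH that is recent enough to support --sandbox.

```python
import subprocess


def sandboxed_pandoc_argv(source_format, target_format):
    """Build an argv list that runs pandoc with --sandbox.

    No input file is named, so pandoc reads the document from stdin.
    (Hypothetical helper, not part of pandoc.)
    """
    return [
        "pandoc",
        "--sandbox",
        "--from", source_format,
        "--to", target_format,
    ]


def convert_untrusted(text, source_format="rst", target_format="html"):
    """Convert untrusted text to the target format.

    Assumes a pandoc executable on PATH. Raises CalledProcessError
    if pandoc exits with a non-zero status.
    """
    result = subprocess.run(
        sandboxed_pandoc_argv(source_format, target_format),
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the arguments as a list, rather than as a shell string, also keeps untrusted values out of shell interpolation.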
If your application uses pandoc as a Haskell library (rather than
shelling out to the executable), it is possible to use it in a mode that
fully isolates pandoc from your file system, by running the pandoc
operations in the PandocPure monad. See the document Using the pandoc
API for more details. (This corresponds to the use of the --sandbox
option on the command line.)
Pandoc’s parsers can exhibit pathological performance on some corner
cases. It is wise to put any pandoc operations under a timeout, to
avoid DoS attacks that exploit these issues. If you are using the
pandoc executable, you can add the command line options
+RTS -M512M -RTS (for example) to limit the heap size to 512MB. Note
that the commonmark parser (including commonmark_x and gfm) is much
less vulnerable to pathological performance than the markdown parser,
so it is a better choice when processing untrusted input.
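The advice above can be combined in a short Python sketch: a wall-clock timeout around the pandoc process, the heap limit from the example options, and the commonmark reader. The function names, the 10-second default, and the choice to return None on failure are assumptions made for illustration only. The sketch assumes a pandoc executable on PATH.

```python
import subprocess


def limited_pandoc_argv(source_format="commonmark", target_format="html",
                        heap_limit="512M"):
    """Build an argv list with a GHC runtime heap limit appended.

    (Hypothetical helper; the RTS options mirror the example above.)
    """
    return [
        "pandoc",
        "--from", source_format,
        "--to", target_format,
        "+RTS", f"-M{heap_limit}", "-RTS",
    ]


def convert_with_limits(text, timeout_seconds=10.0):
    """Run pandoc under a timeout; return the output, or None on failure.

    subprocess.run kills the child process when the timeout expires
    and then raises TimeoutExpired, which is caught here.
    """
    try:
        result = subprocess.run(
            limited_pandoc_argv(),
            input=text,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0:
        # Includes the case where pandoc aborts after exceeding the heap limit.
        return None
    return result.stdout
```

A real application would likely want to distinguish a timeout from a conversion error instead of collapsing both into None.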
The HTML generated by pandoc is not guaranteed to be safe. If raw_html
is enabled for the Markdown input, users can inject arbitrary HTML.
Even if raw_html is disabled, users can include dangerous content in
URLs and attributes. To be safe, you should run all HTML generated from
untrusted user input through an HTML sanitizer.
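To make the idea of sanitizing concrete, here is a deliberately small allowlist sanitizer sketched with Python's standard library. It is an illustration of the approach, not a vetted security tool: the tag and scheme allowlists are assumptions, every attribute except a checked href on links is dropped, and relative URLs are rejected along with dangerous ones. For production use, a maintained and audited sanitizer library is the safer choice.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Assumed allowlists for this sketch; adjust to your application's needs.
ALLOWED_TAGS = {"p", "em", "strong", "code", "pre", "ul", "ol", "li",
                "a", "blockquote"}
ALLOWED_SCHEMES = {"http", "https", "mailto"}
DROP_CONTENT = {"script", "style"}  # drop these tags and their contents


def is_safe_url(value):
    """Accept only absolute URLs whose scheme is on the allowlist."""
    try:
        return urlsplit(value.strip()).scheme.lower() in ALLOWED_SCHEMES
    except ValueError:
        return False


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        rendered = ""
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value and is_safe_url(value):
                    rendered = f' href="{escape(value, quote=True)}"'
        # All other attributes (onclick, style, ...) are discarded.
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    """Return html_text reduced to allowlisted tags and safe links."""
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

The sketch keeps the text of disallowed elements (other than script and style) but removes their tags, so unexpected markup degrades to plain text.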