Creating Custom Pandoc Writers in Lua
Introduction
If you need to render a format not already handled by pandoc, or you want to change how pandoc renders a format, you can create a custom writer using the Lua language. Pandoc has a built-in Lua interpreter, so you needn’t install any additional software to do this.
A custom writer is a Lua file that defines how to render the
document. Two styles of custom writers are supported: classic
custom writers must define rendering functions for each AST
element, whereas new-style writers, available since pandoc 2.17.2,
define just a single function, Writer, which is passed the
document and writer options and then does all the rendering.
Classic style
A writer using the classic style defines rendering functions for each element of the pandoc AST.
For example,
function Para(s)
  return "<paragraph>" .. s .. "</paragraph>"
end
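To make the data flow concrete, here is a minimal sketch of how such functions compose. It assumes that inline functions return strings and that block functions receive the already-rendered inline content as a string. Pandoc makes these calls itself; the manual composition at the end is only a simulation for the Markdown input "Hello *world*".

```lua
-- Inline elements: each returns a string.
function Str(s) return s end
function Space() return ' ' end
function Emph(s) return '<em>' .. s .. '</em>' end

-- Block element: receives the already-rendered inline content.
function Para(s)
  return '<paragraph>' .. s .. '</paragraph>'
end

-- Manual simulation of the calls pandoc would make; shown only to
-- illustrate how rendered strings are passed from inlines to blocks.
local inlines = Str('Hello') .. Space() .. Emph(Str('world'))
print(Para(inlines))
-- prints: <paragraph>Hello <em>world</em></paragraph>
```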
The best way to go about creating a classic custom writer is to modify the example that comes with pandoc. To get the example, you can do
pandoc --print-default-data-file sample.lua > sample.lua
A custom HTML writer
sample.lua is a full-featured HTML writer, with explanatory
comments. To use it, just use the path to the custom writer as
the writer name:
pandoc -t sample.lua myfile.md
sample.lua defines all the functions needed by any custom writer,
so you can design your own custom writer by modifying the
functions in sample.lua according to your needs.
-- This is a sample custom writer for pandoc. It produces output
-- that is very similar to that of pandoc's HTML writer.
-- There is one new feature: code blocks marked with class 'dot'
-- are piped through graphviz and images are included in the HTML
-- output using 'data:' URLs. The image format can be controlled
-- via the `image_format` metadata field.
--
-- Invoke with: pandoc -t sample.lua
--
-- Note: you need not have lua installed on your system to use this
-- custom writer. However, if you do have lua installed, you can
-- use it to test changes to the script. 'lua sample.lua' will
-- produce informative error messages if your code contains
-- syntax errors.
local pipe = pandoc.pipe
local stringify = (require 'pandoc.utils').stringify
-- The global variable PANDOC_DOCUMENT contains the full AST of
-- the document which is going to be written. It can be used to
-- configure the writer.
local meta = PANDOC_DOCUMENT.meta
-- Choose the image format based on the value of the
-- `image_format` meta value.
local image_format = meta.image_format
  and stringify(meta.image_format)
  or 'png'
local image_mime_type = ({
    jpeg = 'image/jpeg',
    jpg = 'image/jpeg',
    gif = 'image/gif',
    png = 'image/png',
    svg = 'image/svg+xml',
  })[image_format]
  or error('unsupported image format `' .. image_format .. '`')
-- Character escaping
local function escape(s, in_attribute)
  return s:gsub('[<>&"\']',
    function(x)
      if x == '<' then
        return '&lt;'
      elseif x == '>' then
        return '&gt;'
      elseif x == '&' then
        return '&amp;'
      elseif in_attribute and x == '"' then
        return '&quot;'
      elseif in_attribute and x == "'" then
        return '&#39;'
      else
        return x
      end
    end)
end
-- Helper function to convert an attributes table into
-- a string that can be put into HTML tags.
local function attributes(attr)
  local attr_table = {}
  for x,y in pairs(attr) do
    if y and y ~= '' then
      table.insert(attr_table, ' ' .. x .. '="' .. escape(y,true) .. '"')
    end
  end
  return table.concat(attr_table)
end
-- Table to store footnotes, so they can be included at the end.
local notes = {}
-- Blocksep is used to separate block elements.
function Blocksep()
  return '\n\n'
end
-- This function is called once for the whole document. Parameters:
-- body is a string, metadata is a table, variables is a table.
-- This gives you a fragment. You could use the metadata table to
-- fill variables in a custom lua template. Or, pass `--template=...`
-- to pandoc, and pandoc will do the template processing as usual.
function Doc(body, metadata, variables)
  local buffer = {}
  local function add(s)
    table.insert(buffer, s)
  end
  add(body)
  if #notes > 0 then
    add('<ol class="footnotes">')
    for _,note in pairs(notes) do
      add(note)
    end
    add('</ol>')
  end
  return table.concat(buffer,'\n') .. '\n'
end
-- The functions that follow render corresponding pandoc elements.
-- s is always a string, attr is always a table of attributes, and
-- items is always an array of strings (the items in a list).
-- Comments indicate the types of other variables.
function Str(s)
  return escape(s)
end
function Space()
  return ' '
end
function SoftBreak()
  return '\n'
end
function LineBreak()
  return '<br/>'
end
function Emph(s)
  return '<em>' .. s .. '</em>'
end
function Strong(s)
  return '<strong>' .. s .. '</strong>'
end
function Subscript(s)
  return '<sub>' .. s .. '</sub>'
end
function Superscript(s)
  return '<sup>' .. s .. '</sup>'
end
function SmallCaps(s)
  return '<span style="font-variant: small-caps;">' .. s .. '</span>'
end
function Strikeout(s)
  return '<del>' .. s .. '</del>'
end
function Link(s, tgt, tit, attr)
  return '<a href="' .. escape(tgt,true) .. '" title="' ..
         escape(tit,true) .. '"' .. attributes(attr) .. '>' .. s .. '</a>'
end
function Image(s, src, tit, attr)
  return '<img src="' .. escape(src,true) .. '" title="' ..
         escape(tit,true) .. '"/>'
end
function Code(s, attr)
  return '<code' .. attributes(attr) .. '>' .. escape(s) .. '</code>'
end
function InlineMath(s)
  return '\\(' .. escape(s) .. '\\)'
end
function DisplayMath(s)
  return '\\[' .. escape(s) .. '\\]'
end
function SingleQuoted(s)
  return '‘' .. s .. '’'
end
function DoubleQuoted(s)
  return '“' .. s .. '”'
end
function Note(s)
  local num = #notes + 1
  -- insert the back reference right before the final closing tag.
  s = string.gsub(s,
          '(.*)</', '%1 <a href="#fnref' .. num .. '">↩</a></')
  -- add a list item with the note to the note table.
  table.insert(notes, '<li id="fn' .. num .. '">' .. s .. '</li>')
  -- return the footnote reference, linked to the note.
  return '<a id="fnref' .. num .. '" href="#fn' .. num ..
            '"><sup>' .. num .. '</sup></a>'
end
function Span(s, attr)
  return '<span' .. attributes(attr) .. '>' .. s .. '</span>'
end
function RawInline(format, str)
  if format == 'html' then
    return str
  else
    return ''
  end
end
function Cite(s, cs)
  local ids = {}
  for _,cit in ipairs(cs) do
    table.insert(ids, cit.citationId)
  end
  return '<span class="cite" data-citation-ids="' .. table.concat(ids, ',') ..
    '">' .. s .. '</span>'
end
function Plain(s)
  return s
end
function Para(s)
  return '<p>' .. s .. '</p>'
end
-- lev is an integer, the header level.
function Header(lev, s, attr)
  return '<h' .. lev .. attributes(attr) .. '>' .. s .. '</h' .. lev .. '>'
end
function BlockQuote(s)
  return '<blockquote>\n' .. s .. '\n</blockquote>'
end
function HorizontalRule()
  return "<hr/>"
end
function LineBlock(ls)
  return '<div style="white-space: pre-line;">' .. table.concat(ls, '\n') ..
         '</div>'
end
function CodeBlock(s, attr)
  -- If code block has class 'dot', pipe the contents through dot
  -- and base64, and include the base64-encoded png as a data: URL.
  if attr.class and string.match(' ' .. attr.class .. ' ',' dot ') then
    local img = pipe('base64', {}, pipe('dot', {'-T' .. image_format}, s))
    return '<img src="data:' .. image_mime_type .. ';base64,' .. img .. '"/>'
  -- otherwise treat as code (one could pipe through a highlighter)
  else
    return '<pre><code' .. attributes(attr) .. '>' .. escape(s) ..
           '</code></pre>'
  end
end
function BulletList(items)
  local buffer = {}
  for _, item in pairs(items) do
    table.insert(buffer, '<li>' .. item .. '</li>')
  end
  return '<ul>\n' .. table.concat(buffer, '\n') .. '\n</ul>'
end
function OrderedList(items)
  local buffer = {}
  for _, item in pairs(items) do
    table.insert(buffer, '<li>' .. item .. '</li>')
  end
  return '<ol>\n' .. table.concat(buffer, '\n') .. '\n</ol>'
end
function DefinitionList(items)
  local buffer = {}
  for _,item in pairs(items) do
    local k, v = next(item)
    table.insert(buffer, '<dt>' .. k .. '</dt>\n<dd>' ..
                   table.concat(v, '</dd>\n<dd>') .. '</dd>')
  end
  return '<dl>\n' .. table.concat(buffer, '\n') .. '\n</dl>'
end
-- Convert pandoc alignment to something HTML can use.
-- align is AlignLeft, AlignRight, AlignCenter, or AlignDefault.
local function html_align(align)
  if align == 'AlignLeft' then
    return 'left'
  elseif align == 'AlignRight' then
    return 'right'
  elseif align == 'AlignCenter' then
    return 'center'
  else
    return 'left'
  end
end
function CaptionedImage(src, tit, caption, attr)
  if #caption == 0 then
    return '<p><img src="' .. escape(src,true) .. '" id="' .. attr.id ..
      '"/></p>'
  else
    local ecaption = escape(caption)
    return '<figure>\n<img src="' .. escape(src,true) ..
        '" id="' .. attr.id .. '" alt="' .. ecaption .. '"/>' ..
        '<figcaption>' .. ecaption .. '</figcaption>\n</figure>'
  end
end
-- Caption is a string, aligns is an array of strings,
-- widths is an array of floats, headers is an array of
-- strings, rows is an array of arrays of strings.
function Table(caption, aligns, widths, headers, rows)
  local buffer = {}
  local function add(s)
    table.insert(buffer, s)
  end
  add('<table>')
  if caption ~= '' then
    add('<caption>' .. escape(caption) .. '</caption>')
  end
  if widths and widths[1] ~= 0 then
    for _, w in pairs(widths) do
      add('<col width="' .. string.format('%.0f%%', w * 100) .. '" />')
    end
  end
  local header_row = {}
  local empty_header = true
  for i, h in pairs(headers) do
    local align = html_align(aligns[i])
    table.insert(header_row,'<th align="' .. align .. '">' .. h .. '</th>')
    empty_header = empty_header and h == ''
  end
  if not empty_header then
    add('<tr class="header">')
    for _,h in pairs(header_row) do
      add(h)
    end
    add('</tr>')
  end
  local class = 'even'
  for _, row in pairs(rows) do
    class = (class == 'even' and 'odd') or 'even'
    add('<tr class="' .. class .. '">')
    for i,c in pairs(row) do
      add('<td align="' .. html_align(aligns[i]) .. '">' .. c .. '</td>')
    end
    add('</tr>')
  end
  add('</table>')
  return table.concat(buffer,'\n')
end
function RawBlock(format, str)
  if format == 'html' then
    return str
  else
    return ''
  end
end
function Div(s, attr)
  return '<div' .. attributes(attr) .. '>\n' .. s .. '</div>'
end
-- The following code will produce runtime warnings when you haven't defined
-- all of the functions you need for the custom writer, so it's useful
-- to include when you're working on a writer.
local meta = {}
meta.__index =
  function(_, key)
    io.stderr:write(string.format("WARNING: Undefined function '%s'\n",key))
    return function() return '' end
  end
setmetatable(_G, meta)
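Because each rendering function is an ordinary Lua function, it can be called by hand to check its output. The sketch below copies DefinitionList from the listing so that it runs on its own. The shape of the hand-built items argument is an assumption inferred from the function body (each item is a table with a single term as key and an array of rendered definitions as value); the listing does not state it explicitly.

```lua
-- Copied from the sample writer so that this snippet is self-contained.
function DefinitionList(items)
  local buffer = {}
  for _, item in pairs(items) do
    -- next(item) yields the single key/value pair of the item table
    local k, v = next(item)
    table.insert(buffer, '<dt>' .. k .. '</dt>\n<dd>' ..
                   table.concat(v, '</dd>\n<dd>') .. '</dd>')
  end
  return '<dl>\n' .. table.concat(buffer, '\n') .. '\n</dl>'
end

-- Hand-built input (assumed shape, see the note above): every term
-- maps to an array of already-rendered definition strings.
local items = {
  { ['Lua'] = { 'A scripting language.', 'Embedded in pandoc.' } },
  { ['AST'] = { 'Abstract syntax tree.' } },
}
print(DefinitionList(items))
```

Each definition of a term ends up in its own dd element, which is why the value is joined with a closing and opening dd tag.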
Template variables
New template variables can be added, or existing ones modified,
by returning a second value from function Doc.
For example, the following will add the current date in
variable date, unless date is already defined as either a
metadata value or a variable:
function Doc (body, meta, vars)
  vars.date = vars.date or meta.date or os.date '%B %e, %Y'
  return body, vars
end
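The precedence described above (an existing variable first, then the metadata value, then the current date) can be checked by calling Doc by hand. This is only a sketch: the plain strings stand in for whatever values pandoc would actually pass, the metadata field is assumed to be named date, and the %e format specifier is assumed to be supported by the platform's C library.

```lua
function Doc(body, meta, vars)
  -- keep an existing variable, else fall back to metadata, else today
  vars.date = vars.date or meta.date or os.date '%B %e, %Y'
  return body, vars
end

-- 1. A variable is already set: it is left untouched.
local _, v1 = Doc('<p>x</p>', { date = 'May 1, 2020' }, { date = 'June 2, 2021' })
print(v1.date)  -- prints: June 2, 2021

-- 2. Only the metadata value exists: it is copied into the variables.
local _, v2 = Doc('<p>x</p>', { date = 'May 1, 2020' }, {})
print(v2.date)  -- prints: May 1, 2020

-- 3. Neither exists: the current date is used.
local _, v3 = Doc('<p>x</p>', {}, {})
print(v3.date)
```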
New style
Custom writers using the new style must contain a global
function named Writer.
Pandoc calls this function with the document and writer options as
arguments, and expects the function to return a string.
function Writer (doc, opts)
  -- ...
end
Example: modified Markdown writer
Writers have access to all modules described in the Lua filters
documentation. This includes pandoc.write, which can be used
to render a document in a format already supported by pandoc. The
document can be modified before this conversion, as demonstrated
in the following short example. It renders a document as GitHub
Flavored Markdown, but always uses fenced code blocks, never
indented code.
function Writer (doc, opts)
  local filter = {
    CodeBlock = function (cb)
      -- only modify if code block has no attributes
      if cb.attr == pandoc.Attr() then
        local delimited = '```\n' .. cb.text .. '\n```'
        return pandoc.RawBlock('markdown', delimited)
      end
    end
  }
  return pandoc.write(doc:walk(filter), 'gfm', opts)
end
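One limitation of the example: a fixed three-backtick fence would end early if the code itself contains a line of three backticks. As a possible refinement, the fence could be made longer than any backtick run in the text. The helpers below, fence_for and fenced, are hypothetical names introduced for illustration and are not part of pandoc; they use only plain Lua string functions.

```lua
-- Hypothetical helper, not part of pandoc: returns a backtick fence
-- that is longer than any run of backticks in `text` (minimum 3).
function fence_for(text)
  local longest = 0
  for run in text:gmatch('`+') do
    if #run > longest then longest = #run end
  end
  return string.rep('`', math.max(3, longest + 1))
end

-- Hypothetical helper: wraps `text` in a fenced code block whose
-- fence cannot be closed by the content itself.
function fenced(text)
  local fence = fence_for(text)
  return fence .. '\n' .. text .. '\n' .. fence
end

-- Ordinary code gets the usual three-backtick fence.
print(fenced('x = 1'))

-- Code that already contains a three-backtick run gets four.
local triple = string.rep('`', 3)
print(fenced(triple .. '\nnested\n' .. triple))
```

Inside the CodeBlock filter function, the result of fenced(cb.text) could then take the place of the hand-built delimited string.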