MultiMarkdown Developer's Guide

(Revised 2026-03-13)

Introduction

Why MultiMarkdown v7?

The “initial public commit” for MultiMarkdown v6 was made on January 18, 2017, though I had started working on it in early 2016. v6 was an almost complete rewrite of v5, which still used a PEG (parsing expression grammar). Because of that, there was a great deal of “learning while building”, and the code was sloppier than it should have been in some places.

The API was also not as clean as it could have been: parsing text into HTML from within another project required including several different header files.

Differences from MultiMarkdown v6

Testing

Updated Test Suite

The standard test suite files from prior versions of MMD have been updated, both to account for changes in the v7 parser and to cover new test situations that have come up. Most of the changes are relatively minor.

make
cd build
make
ctest
ctest -V	# To see details

Additionally, the “test harness” I use to perform integration testing has been updated to a bash script rather than using a very old Perl script. This allows the test suite to run on Windows without a Perl installation.

Unit Testing

In most of my projects, I try to make heavy use of unit testing and a test-driven development approach.

MultiMarkdown is a bit different because unit testing would be very labor intensive, and it’s really the integration testing at the end that I care the most about. (I’m not saying that unit testing would be wrong, just that in this case I’m not sure it’s the approach I want to take.)

That said, a few components do have unit tests that can be run.

make test
cd build-test
make
./run_tests

Programmatically Generated Test Files

In addition to the hand-written test files that have been built up over time in the test suite, there are some computer-generated tests as well. These allow more exhaustive testing of situations that might not come up in regular use.

libFuzzer

libFuzzer is used for fuzz testing. Over the course of development, I was able to find a fair number of bugs this way that would have been challenging to find otherwise. The fuzzer doesn’t work on macOS, so I use Vagrant and do the fuzz testing in Ubuntu. Feel free to participate by running the fuzz tester yourself, and send me any examples that trigger an error!

cd fuzz
make
cd build
make
./fuzz_mmd-7
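
For readers who have not used libFuzzer before, the sketch below shows the general shape of a fuzz target. LLVMFuzzerTestOneInput is the standard libFuzzer entry point; demo_parse is a hypothetical stand-in for the parser under test and is not part of MMD. The actual harness in the fuzz directory may be organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* HYPOTHETICAL stand-in for the parser under test. A real harness would
 * call into the MMD library here. It just counts '#' bytes. */
size_t demo_parse(const char *text, size_t len) {
    size_t hashes = 0;
    for (size_t i = 0; i < len; i++) {
        if (text[i] == '#') {
            hashes++;
        }
    }
    return hashes;
}

/* Standard libFuzzer entry point, called once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    assert(data != NULL || size == 0);

    /* Copy into a heap buffer of exactly `size` bytes so that a sanitizer
     * can flag any read past the end of the input. */
    char *copy = malloc(size > 0 ? size : 1);
    if (copy == NULL) {
        return 0;
    }
    if (size > 0) {
        memcpy(copy, data, size);
    }

    (void)demo_parse(copy, size);

    free(copy);
    return 0; /* libFuzzer expects 0 from the target */
}
```

When a file like this is compiled with clang and -fsanitize=fuzzer,address, libFuzzer supplies main() itself and reports any input that crashes the target or trips the sanitizer.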

Performance Benchmarking

bench.c builds a small test program that generates a collection of test files that stress test an MMD parser with a few different scenarios (from https://gist.github.com/mity/24822b24d35ef1f998f970965f8c8e53).

It then parses those files several times with MMD v6 and v7, CommonMark, and MD4C. bench.c will in all likelihood need to be modified to match how these programs are installed on your machine.

make
cd build
make
cd ../dev
make run

API Changes

API Calls

libMultiMarkdown7.h defines the API for interacting with the MultiMarkdown 7 library. I have tried to clean this file up in order to make it clearer to read and to include everything required to incorporate MMD in most projects.

You’ll notice that most of the primary API calls have 4 variants:

  1. One requires a FILE pointer to a file that has been opened. This can also be stdin.

  2. One requires a path to a file, and MMD handles opening the file for reading.

  3. One requires a null-terminated C string (which means the string has to be scanned to determine how long it is).

  4. One requires a C string (optionally null-terminated) along with the length of that string in bytes. Because the length is provided up front, this version does not need an additional pass over the string, which makes it preferable to variant 3 if you already know the length.
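
One way to picture how four such variants can relate to each other is as thin wrappers around the length-based version. The sketch below is only an illustration: every function name is hypothetical (the real names and signatures are in libMultiMarkdown7.h), and the “parser” is a stand-in that counts newline bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* All names here are HYPOTHETICAL and for illustration only. */

/* Variant 4: text plus explicit length in bytes. No scan for a terminator. */
size_t demo_parse_buffer(const char *text, size_t len) {
    size_t lines = 0;
    for (size_t i = 0; i < len; i++) {
        if (text[i] == '\n') {
            lines++;
        }
    }
    return lines;
}

/* Variant 3: null-terminated string. strlen() is the extra pass. */
size_t demo_parse_string(const char *text) {
    assert(text != NULL);
    return demo_parse_buffer(text, strlen(text));
}

/* Variant 1: an already opened FILE pointer (this could be stdin).
 * Reads the whole stream into a growing buffer, then delegates. */
size_t demo_parse_file(FILE *in) {
    size_t cap = 4096, len = 0, n;
    char *buf = malloc(cap);
    if (buf == NULL) {
        return 0;
    }
    while ((n = fread(buf + len, 1, cap - len, in)) > 0) {
        len += n;
        if (len == cap) {
            char *grown = realloc(buf, cap * 2);
            if (grown == NULL) {
                break; /* out of memory: parse what was read so far */
            }
            buf = grown;
            cap *= 2;
        }
    }
    size_t result = demo_parse_buffer(buf, len);
    free(buf);
    return result;
}

/* Variant 2: a path. The wrapper opens and closes the file itself. */
size_t demo_parse_path(const char *path) {
    FILE *in = fopen(path, "rb");
    if (in == NULL) {
        return 0;
    }
    size_t result = demo_parse_file(in);
    fclose(in);
    return result;
}
```

In this arrangement the string variant costs one extra strlen() pass, which is the overhead that passing the length up front avoids.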

Regardless of how the source text is delivered, MMD expects UTF-8 encoding (with or without a BOM, which is not needed with UTF-8 encoding).
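
For reference, a UTF-8 BOM is the three-byte sequence EF BB BF at the very start of the text. If your own code needs to know whether a buffer carries one before handing it on, a check could look like the sketch below. The helper name is hypothetical and this is not taken from the MMD source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* HYPOTHETICAL helper: returns 3 if the buffer starts with a UTF-8 BOM
 * (bytes EF BB BF), otherwise 0. */
size_t demo_utf8_bom_length(const char *text, size_t len) {
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };

    assert(text != NULL || len == 0);

    if (len >= 3 && memcmp(text, bom, 3) == 0) {
        return 3;
    }
    return 0;
}
```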

There are several different call classes available:

API Enumerations

libMultiMarkdown7.h also includes the various enums that are used.

All of these values are combined into a single 32-bit unsigned integer. There are a couple of macros to extract specific values from that combined value if needed:
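
To show the general technique of packing several enum values into one 32-bit integer and pulling them back out with macros, here is a minimal sketch. The bit layout, the enum members, and the macro names are all assumptions made for this example; the real definitions are in libMultiMarkdown7.h.

```c
#include <assert.h>
#include <stdint.h>

/* HYPOTHETICAL layout, for illustration only:
 *   bits  0-7   output format
 *   bits  8-15  language
 *   bits 16-31  boolean option flags */
enum demo_format   { DEMO_FORMAT_HTML = 1, DEMO_FORMAT_LATEX = 2 };
enum demo_language { DEMO_LANG_EN = 0, DEMO_LANG_DE = 1 };
enum demo_flag     { DEMO_FLAG_SMART = 1u << 16, DEMO_FLAG_COMPLETE = 1u << 17 };

/* Extraction macros: mask (and shift) the field out of the combined value. */
#define DEMO_GET_FORMAT(opts)   ((uint32_t)(opts) & 0xFFu)
#define DEMO_GET_LANGUAGE(opts) (((uint32_t)(opts) >> 8) & 0xFFu)
#define DEMO_HAS_FLAG(opts, f)  ((((uint32_t)(opts)) & (uint32_t)(f)) != 0)

/* Combine the three pieces into a single 32-bit value. */
uint32_t demo_pack_options(enum demo_format fmt, enum demo_language lang,
                           uint32_t flags) {
    assert((uint32_t)fmt <= 0xFFu && (uint32_t)lang <= 0xFFu);
    return ((uint32_t)fmt & 0xFFu)
         | (((uint32_t)lang & 0xFFu) << 8)
         | (flags & 0xFFFF0000u);
}
```

With this layout, demo_pack_options(DEMO_FORMAT_LATEX, DEMO_LANG_DE, DEMO_FLAG_SMART) yields 0x00010102, and each macro recovers its own field without disturbing the others.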

Abstract Syntax Tree

The AST consists of mmd_node structs, which specify the type of node, the starting offset in the source text (in bytes), the length (in bytes), and pointers to the next node and the first child node. The tail pointer points to the last child node and is primarily used when building the AST so that a new child can be appended without walking the linked list.
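
The following sketch illustrates the structure just described, including how a tail pointer makes appending a child a constant-time operation. The struct name, field names, and field types are assumptions for this example; consult libMultiMarkdown7.h for the real mmd_node definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* HYPOTHETICAL mirror of the node layout described above. */
typedef struct demo_node {
    uint8_t type;             /* node type, 1-255 */
    size_t start;             /* byte offset into the source text */
    size_t len;               /* length in bytes */
    struct demo_node *next;   /* next sibling */
    struct demo_node *child;  /* first child */
    struct demo_node *tail;   /* last child, used while building */
} demo_node;

demo_node *demo_node_new(uint8_t type, size_t start, size_t len) {
    demo_node *n = calloc(1, sizeof *n);
    if (n != NULL) {
        n->type = type;
        n->start = start;
        n->len = len;
    }
    return n;
}

/* Append in O(1): no need to walk the sibling list to find its end. */
void demo_node_append_child(demo_node *parent, demo_node *child) {
    assert(parent != NULL && child != NULL);
    if (parent->tail != NULL) {
        parent->tail->next = child;
    } else {
        parent->child = child;
    }
    parent->tail = child;
}

/* Count a node, its later siblings, and all of their descendants. */
size_t demo_node_count(const demo_node *n) {
    size_t total = 0;
    for (; n != NULL; n = n->next) {
        total += 1 + demo_node_count(n->child);
    }
    return total;
}

/* Free a node, its later siblings, and all of their descendants. */
void demo_node_free(demo_node *n) {
    while (n != NULL) {
        demo_node *next = n->next;
        demo_node_free(n->child);
        free(n);
        n = next;
    }
}
```

Because nodes store only offsets and lengths, the source text has to stay alive for as long as the tree is in use.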

mmd_line_node is the same but adds two more fields specifying where the actual “content” of the line starts and ends, which allows you to more easily ignore the markup, such as the leading and trailing ## in a header.
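
As an illustration of why the two extra fields are convenient, the sketch below copies only the content of a header line. The struct is a simplified, hypothetical stand-in (the link pointers are omitted), and it assumes the content fields are absolute byte offsets with an exclusive end; the real mmd_line_node may define them differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* HYPOTHETICAL, simplified line node (link pointers omitted). */
typedef struct demo_line_node {
    unsigned char type;
    size_t start;          /* byte offset of the line in the source */
    size_t len;            /* length of the line in bytes */
    size_t content_start;  /* ASSUMED: offset of the first content byte */
    size_t content_end;    /* ASSUMED: offset one past the last content byte */
} demo_line_node;

/* Copy the line's content (without markup) into `out`, always
 * null-terminating. Returns the number of content bytes copied. */
size_t demo_line_content(const char *source, const demo_line_node *line,
                         char *out, size_t cap) {
    assert(line->content_start <= line->content_end);
    if (cap == 0) {
        return 0;
    }
    size_t n = line->content_end - line->content_start;
    if (n > cap - 1) {
        n = cap - 1; /* truncate to fit the output buffer */
    }
    memcpy(out, source + line->content_start, n);
    out[n] = '\0';
    return n;
}
```

For the source line “## Title ##”, a node with content_start 3 and content_end 8 yields just “Title”, with no need to re-scan for the surrounding # characters.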

node_type is a value from 1–255 that specifies what a specific mmd_node represents. Values from 1–63 represent LINES in the source text. Values from 64–127 represent block level structures. Values from 128–255 represent span level tokens.

NOTE: If you customize MMD and add additional node types to the enumeration list, be sure to assign them to the proper value range and follow any directions in the comments of libMultiMarkdown7.h.

There are several utility macros to help easily determine what grouping a specific mmd_node belongs to based on its type:
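
The real macro names are defined in libMultiMarkdown7.h. As a rough sketch of how range-based classification can work with the value ranges given above, using hypothetical names:

```c
#include <assert.h>

/* HYPOTHETICAL macros based on the documented ranges:
 *   1-63 lines, 64-127 blocks, 128-255 spans. */
#define DEMO_IS_LINE(t)  ((t) >= 1 && (t) <= 63)
#define DEMO_IS_BLOCK(t) ((t) >= 64 && (t) <= 127)
#define DEMO_IS_SPAN(t)  ((t) >= 128 && (t) <= 255)

/* Map a node type to a human-readable group name. */
const char *demo_type_group(unsigned int type) {
    assert(type <= 255u);
    if (DEMO_IS_LINE(type)) {
        return "line";
    }
    if (DEMO_IS_BLOCK(type)) {
        return "block";
    }
    if (DEMO_IS_SPAN(type)) {
        return "span";
    }
    return "invalid"; /* 0 is outside all three ranges */
}
```

Keeping each group in a contiguous range is what makes checks like these a pair of comparisons rather than a lookup table.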

Command-Line Changes

MMD v7 handles arguments from the command-line in a slightly different way from v6, though the most common use cases are unchanged.

multimarkdown [--help] {ast|batch|hash|meta|parse} [options] [Input file names]

The first argument should be an action. If no action is specified, MMD defaults to the parse action.
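
The “default to parse” rule can be sketched as follows. This is not the actual MMD argument handling: the enum, the function name, and the way the decision is reported are all hypothetical, and the sketch ignores --help and every other option.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* HYPOTHETICAL names; order matches the table in demo_pick_action. */
typedef enum {
    DEMO_AST, DEMO_BATCH, DEMO_HASH, DEMO_META, DEMO_PARSE
} demo_action;

/* Inspect the first argument after the program name. If it names an
 * action, return that action and set *consumed to 1. Otherwise fall
 * back to the parse action and set *consumed to 0, so the caller
 * treats the argument as an option or input file instead. */
demo_action demo_pick_action(const char *arg, int *consumed) {
    static const char *const names[] = {
        "ast", "batch", "hash", "meta", "parse"
    };

    assert(consumed != NULL);
    *consumed = 0;

    if (arg != NULL) {
        for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
            if (strcmp(arg, names[i]) == 0) {
                *consumed = 1;
                return (demo_action)i;
            }
        }
    }
    return DEMO_PARSE;
}
```

One consequence of a scheme like this is that an input file literally named after an action would be read as the action; how MMD itself resolves that case is not covered here.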

You can then specify different options to adjust the default behavior:

Cross-Platform Compatibility

macOS

Primary development for MMD is done on macOS, so everything works.

*nix

Additional compilation and testing is frequently done on Ubuntu Linux via a VirtualBox VM, so everything should work properly on *nix machines.

Windows

I was finally able to get a minimal development environment working on a Windows VM using UTM and Windows 11. It is not tested as regularly as macOS or *nix environments, but it works and passes the test suite.

However, performance is not what I would have expected. It’s possible that running Windows inside macOS imposes too much of a performance hit. (Yet running an Ubuntu VM is just fine…) I worry it’s something more integral to the code, and that more time needs to be spent profiling on Windows to improve things. As I do not have Windows hardware, and have no intention of purchasing any, contributions here are appreciated!

Others

MultiMarkdown is written in C, and is intended to be compilable on any (reasonable) operating system.

The only external library it uses is libcurl, but only if it is found.

CMake is normally used as the build system, but MMD could of course be compiled by manually specifying the source files, so CMake is not a hard requirement either.

If you need to compile MMD on a different system and find that I have done something that prevents you from doing that, let me know and I will consider whether anything can be changed.

Continuous Integration

I use GitHub Actions to run the test suite with every push to the public repository. This includes:

At the very least, this warns me if I push a commit that breaks compilation or expected output on the test suite on any of the three systems.

Contributing

Bug Reports and Suggestions

I welcome examples of source text that cause MMD to misbehave. You can contribute them at the GitHub issues page.

Same thing with suggestions for new features. Be warned, however, that I rarely add new features to the MMD syntax unless they are truly valuable to a wide range of users.

Code Contributions

Pull requests can be managed through GitHub. However, if your request is more than a straightforward bug fix, it is unlikely that I will accept it directly. It is more likely that I would rewrite the suggested code to ensure it matches the style and structure of the existing code and my expectations. So you may be better off discussing the idea with me first before worrying too much about a pull request.