Markdown to HTML Converter

Convert Markdown to HTML with live preview. Support for GitHub Flavored Markdown, tables, task lists, and more.

Quick Reference

## Heading: Headers

**bold**: Bold text

*italic*: Italic text

[text](url): Link

![alt](url): Image

`code`: Inline code

```lang: Code block

> quote: Blockquote

- [ ] task: Task list

About Markdown

Markdown is a lightweight markup language that you can use to add formatting elements to plain text documents. Created by John Gruber in 2004, Markdown is now one of the world's most popular markup languages.

Using Markdown is different from using a WYSIWYG editor. In an application like Microsoft Word, you click buttons to format words and phrases. In Markdown, you add syntax to the text to indicate formatting.

GitHub Flavored Markdown

✓ Tables: Create tables with pipes and hyphens
✓ Task Lists: Checkboxes in lists
✓ Strikethrough: Cross out text with ~~tildes~~
✓ Autolinks: Automatic URL linking
✓ Syntax Highlighting: Language-specific code blocks

Common Use Cases

Documentation

• README files
• API documentation
• User guides
• Technical specs

Content Creation

• Blog posts
• Static sites
• Email templates
• Notes & wikis

Development

• GitHub/GitLab
• Issue tracking
• Pull requests
• Code comments

How Markdown Conversion Works

Markdown Syntax and Structure

Markdown uses plain text formatting syntax that's readable as-is but can be converted to HTML. Headings use # symbols (# H1, ## H2, ### H3), emphasis uses asterisks or underscores (*italic* or _italic_, **bold** or __bold__), and links use brackets and parentheses [text](url). This simplicity makes Markdown easy to write without a specialized editor while maintaining good readability in source form.

Lists in Markdown use intuitive characters: asterisks, hyphens, or plus signs for unordered lists, and numbers with periods for ordered lists. Code can be inline using backticks `code` or in blocks using triple backticks with optional language specification for syntax highlighting. Blockquotes use > prefix, and horizontal rules are created with three or more hyphens, asterisks, or underscores.

The original Markdown specification by John Gruber (2004) was deliberately minimal and somewhat ambiguous, leading to different parsers implementing features differently. This spawned various "flavors" of Markdown: GitHub Flavored Markdown (GFM), CommonMark (standardized specification), MultiMarkdown, Markdown Extra, and others. Each adds features like tables, footnotes, definition lists, or strikethrough text.

GitHub Flavored Markdown became a de facto standard for developer documentation. It adds task lists (- [ ] todo item), tables using pipes and hyphens, automatic URL linking, strikethrough with ~~text~~, and fenced code blocks with syntax highlighting. These extensions make GFM more powerful while maintaining backward compatibility with original Markdown for basic features.

Inline HTML is permitted in Markdown—you can embed HTML tags directly when Markdown's syntax is insufficient. For example, you can use <details> and <summary> for collapsible sections, <img> with specific attributes for images, or <div> with classes for styling. This escape hatch makes Markdown extensible without complicating the core syntax, though it reduces portability across platforms that sanitize HTML.

Parsing and Tokenization Process

Markdown parsers typically use a multi-stage process: lexing (tokenization), parsing (building an abstract syntax tree), and rendering (generating HTML). The lexer scans the input text character by character, identifying markdown tokens like heading markers, list indicators, emphasis markers, and link syntax. This stage handles line breaks and determines block-level versus inline-level elements.

Block-level parsing processes structural elements: paragraphs, headings, lists, code blocks, blockquotes, and horizontal rules. The parser identifies these by line patterns—headings start lines with #, list items start with *, -, +, or numbers, code blocks are indented or fenced. Block elements can contain inline elements but generally don't nest (except lists and blockquotes which can nest recursively).
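
The line patterns described above can be sketched as a small line classifier. This is a simplified illustration, not how any particular parser is implemented: the function name `classify_line` and the token names are hypothetical, and indented or fenced code blocks and nesting are deliberately left out.

```python
import re

# Hypothetical patterns for a handful of block types; real parsers handle many more cases.
RULE = re.compile(r"^(?:-{3,}|\*{3,}|_{3,})\s*$")
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")
LIST_ITEM = re.compile(r"^(?:[*+-]|\d+\.)\s+(.*)$")
QUOTE = re.compile(r"^>\s?(.*)$")


def classify_line(line):
    """Return (block_type, content) for one line of Markdown."""
    if not line.strip():
        return ("blank", "")
    # Check rules first so "---" is never mistaken for a list marker.
    if RULE.match(line):
        return ("rule", "")
    m = HEADING.match(line)
    if m:
        # The number of # characters gives the heading level.
        return (f"h{len(m.group(1))}", m.group(2))
    m = LIST_ITEM.match(line)
    if m:
        return ("list_item", m.group(1))
    m = QUOTE.match(line)
    if m:
        return ("blockquote", m.group(1))
    return ("paragraph", line.strip())


for sample in ["## Title", "- item", "3. third", "---", "> quote", "*italic* text"]:
    print(classify_line(sample))
```

Note that `*italic* text` is classified as a paragraph, because a list marker must be followed by whitespace. Deciding what the asterisks mean is then left to inline parsing.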

Inline-level parsing handles emphasis, links, images, code spans, and line breaks within blocks. This is more complex because inline elements can nest: **bold with *italic* inside** is valid. The parser must track emphasis delimiters and match them correctly, handle escaped characters (\*not emphasis\*), and resolve ambiguities like whether * starts emphasis or a list item based on context.
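
As a rough sketch of the inline stage, the snippet below handles backslash escapes and asterisk-based emphasis with regular expressions. It is an approximation under stated assumptions: the function name `render_inline` is hypothetical, underscores, links, and code spans are ignored, and spec-compliant parsers use a more careful delimiter-matching algorithm than regex substitution.

```python
import html
import re


def render_inline(text):
    """Convert *em* and **strong** to HTML, honoring backslash escapes."""
    stashed = []

    def stash(match):
        # Remember the escaped character and leave a NUL-delimited placeholder.
        stashed.append(match.group(1))
        return f"\x00{len(stashed) - 1}\x00"

    # 1. Hide escaped punctuation so it cannot act as an emphasis delimiter.
    text = re.sub(r"\\([*_`\\])", stash, text)
    # 2. Escape HTML-significant characters in the remaining text.
    text = html.escape(text, quote=False)
    # 3. Match ** before * so a strong marker is not consumed as two em markers.
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    # 4. Put the escaped characters back as literal text.
    return re.sub(r"\x00(\d+)\x00", lambda m: stashed[int(m.group(1))], text)


print(render_inline("**bold with *italic* inside**"))
print(render_inline(r"\*not emphasis\*"))
```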

The abstract syntax tree (AST) represents the document structure in memory. Each node is a Markdown element (heading, paragraph, list, etc.) with children (for container elements) and attributes (heading level, list type, link URL). The AST separates document structure from rendering, allowing the same parsed structure to be rendered as HTML, PDF, or other formats. Tools like remark work directly with ASTs for complex transformations.

Rendering traverses the AST and generates output HTML. Each node type has a rendering function: headings become <h1>-<h6>, emphasis becomes <em> or <strong>, lists become <ul> or <ol> with <li> children, code blocks become <pre><code>. The renderer handles escaping special HTML characters (< becomes &lt;), generating IDs for headings (for anchor links), and adding classes or attributes as needed.
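
To make the AST and rendering stages concrete, here is a minimal sketch of a node type and a recursive renderer. The `Node` class, its field names, and the node type strings are assumptions made for illustration. They do not mirror remark's or any other library's actual AST format.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Node:
    type: str                                     # e.g. "heading", "paragraph", "text"
    children: list = field(default_factory=list)  # child nodes for container elements
    attrs: dict = field(default_factory=dict)     # e.g. {"level": 2} or {"ordered": True}
    value: str = ""                               # literal text for leaf nodes


SIMPLE_TAGS = {"paragraph": "p", "emphasis": "em", "strong": "strong", "list_item": "li"}


def render(node):
    """Walk the tree depth-first and return an HTML string."""
    if node.type == "text":
        return escape(node.value, quote=False)
    if node.type == "code_block":
        return f"<pre><code>{escape(node.value, quote=False)}</code></pre>"
    inner = "".join(render(child) for child in node.children)
    if node.type == "document":
        return inner
    if node.type == "heading":
        level = node.attrs["level"]
        return f"<h{level}>{inner}</h{level}>"
    if node.type == "list":
        tag = "ol" if node.attrs.get("ordered") else "ul"
        return f"<{tag}>{inner}</{tag}>"
    tag = SIMPLE_TAGS[node.type]
    return f"<{tag}>{inner}</{tag}>"


doc = Node("document", [
    Node("heading", [Node("text", value="A < B")], {"level": 2}),
    Node("paragraph", [Node("text", value="Hi "),
                       Node("strong", [Node("text", value="there")])]),
])
print(render(doc))  # <h2>A &lt; B</h2><p>Hi <strong>there</strong></p>
```

Because the renderer only reads the tree, a second function could walk the same `doc` to produce a different output format.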

GitHub Flavored Markdown Extensions

Tables in GFM use pipes to separate columns and hyphens for the header separator: | Column 1 | Column 2 | creates table headers. Alignment is specified with colons in the separator row: |:---|, |:---:|, |---:| for left, center, and right alignment. Tables make documentation more structured than using plain lists or paragraphs, though they can be tedious to format manually (many editors have table formatting helpers).
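
The colon-based alignment rule can be sketched as follows. The helper name `parse_alignment` is hypothetical and it only inspects the separator row, so it assumes the caller has already identified that row.

```python
import re


def parse_alignment(separator_row):
    """Map a separator row such as '|:---|:---:|---:|' to column alignments."""
    cells = [c.strip() for c in separator_row.strip().strip("|").split("|")]
    alignments = []
    for cell in cells:
        # Each cell must be hyphens with an optional colon on either side.
        if not re.fullmatch(r":?-+:?", cell):
            raise ValueError(f"not a separator cell: {cell!r}")
        left, right = cell.startswith(":"), cell.endswith(":")
        if left and right:
            alignments.append("center")
        elif right:
            alignments.append("right")
        elif left:
            alignments.append("left")
        else:
            alignments.append(None)  # no explicit alignment
    return alignments


print(parse_alignment("|:---|:---:|---:|---|"))  # ['left', 'center', 'right', None]
```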

Task lists provide checkboxes in lists using - [ ] for unchecked items and - [x] for checked items. GitHub renders these as interactive checkboxes in issues and pull requests, allowing users to track progress on tasks directly in markdown. This makes Markdown suitable for project management and documentation of multi-step procedures, not just static content.
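
A minimal sketch of recognizing a task list item and emitting a checkbox is shown below. The function name `render_task_item` and the exact attribute layout of the output are assumptions, and real renderers may emit slightly different markup.

```python
import re
from html import escape

# A list marker, a space, "[ ]" or "[x]", a space, then the item text.
TASK = re.compile(r"^[-*+] \[( |x|X)\] (.*)$")


def render_task_item(line):
    """Return an <li> with a checkbox, or None if the line is not a task item."""
    m = TASK.match(line)
    if m is None:
        return None
    checked = " checked" if m.group(1).lower() == "x" else ""
    label = escape(m.group(2), quote=False)
    # "disabled" makes the checkbox read-only in static output.
    return f'<li><input type="checkbox" disabled{checked}> {label}</li>'


print(render_task_item("- [ ] write docs"))
print(render_task_item("- [x] ship"))
```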

Fenced code blocks use triple backticks (```) with optional language identifiers for syntax highlighting: ```javascript makes the code block highlight JavaScript syntax. This is cleaner than indented code blocks (which require 4 spaces) and enables language-specific formatting. Syntax highlighting libraries like Prism or Highlight.js use the language identifier to apply appropriate color schemes.

Automatic linking converts URLs and email addresses to clickable links without explicit markdown syntax. www.example.com and https://example.com both become links automatically. This makes writing more natural—you don't need to format every URL as [text](url). However, it can cause issues when you want to show a URL as plain text; you must use backticks or escape it.

Strikethrough uses double tildes: ~~deleted text~~ renders as <del>deleted text</del>. This is useful for showing changes in documents, marking completed items in lists, or indicating deprecated content. Along with other GFM features, strikethrough makes Markdown more suitable for collaborative editing and change tracking, expanding its use beyond simple document formatting.

HTML Generation and Sanitization

Converting Markdown to HTML is the most common use case. The parsed AST is rendered as HTML elements: paragraphs become <p>, headings become <h1>-<h6>, emphasis becomes <em> and <strong>, lists become <ul>/<ol> with <li> items. The HTML is typically semantic HTML5, using appropriate tags for structure rather than styling, leaving appearance to CSS.

Sanitization is critical when displaying user-generated Markdown to prevent XSS (cross-site scripting) attacks. Markdown allows inline HTML, so a malicious user could include <script>alert('XSS')</script> in their markdown. Sanitizers remove or escape dangerous HTML elements and attributes (script tags, event handlers like onclick, etc.) while preserving safe formatting. Libraries like DOMPurify handle this securely.

Link sanitization prevents javascript: URLs and other schemes that could execute code. Only safe protocols (http, https, mailto, relative paths) are allowed. Image sources are similarly restricted. Some platforms go further, proxying images through their servers to prevent tracking or mixed content warnings on HTTPS sites. Sanitization must balance security with functionality—overly aggressive filtering breaks legitimate use cases.
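
The protocol allowlist can be sketched with the standard library. This is an illustration of the idea only, under the assumption that an empty scheme means a relative URL. The names `SAFE_SCHEMES` and `is_safe_url` are hypothetical, and a check like this complements an HTML sanitizer rather than replacing one.

```python
from urllib.parse import urlsplit

# Schemes named in the text; anything else is rejected.
SAFE_SCHEMES = {"http", "https", "mailto"}


def is_safe_url(url):
    """Allow relative URLs and an explicit allowlist of schemes."""
    # Strip surrounding whitespace and embedded tab/newline characters,
    # which can otherwise be used to disguise a scheme.
    cleaned = "".join(ch for ch in url.strip() if ch not in "\t\r\n")
    scheme = urlsplit(cleaned).scheme.lower()
    return scheme == "" or scheme in SAFE_SCHEMES


for candidate in ["https://example.com", "/docs/page#intro", "javascript:alert(1)"]:
    print(candidate, is_safe_url(candidate))
```

An allowlist is used instead of a blocklist so that unfamiliar schemes are rejected by default.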

Class and ID generation makes HTML more useful for styling and scripting. Many renderers add classes to code blocks (language-javascript), headings, tables, and other elements. IDs are generated for headings from their text content (slugified: "My Heading" → id="my-heading"), enabling anchor links for navigation. Custom renderers can add data attributes or wrap elements in specific structures for frameworks.
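
Heading ID generation might look like the sketch below. The exact slug rules vary by renderer, so the punctuation handling and the numeric suffix for duplicate headings are assumptions, as is the function name `slugify`.

```python
import re


def slugify(text, seen):
    """Turn heading text into an ID; `seen` tracks slugs already used."""
    slug = text.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)       # drop punctuation
    slug = re.sub(r"\s+", "-", slug).strip("-")  # whitespace runs become hyphens
    count = seen.get(slug, 0)
    seen[slug] = count + 1
    # Repeated headings get a numeric suffix so IDs stay unique.
    return slug if count == 0 else f"{slug}-{count}"


seen = {}
print(slugify("My Heading", seen))   # my-heading
print(slugify("My Heading", seen))   # my-heading-1
print(slugify("What's new?", seen))  # whats-new
```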

Pretty-printing HTML output makes it human-readable with proper indentation and line breaks. While browsers don't care about formatting, developers inspecting the HTML appreciate readable structure. However, whitespace can matter in HTML—extra spaces in inline elements can appear in rendered output. Some renderers have options to compact output (minify) for production or prettify for development.

Common Conversion Pitfalls and Edge Cases

Nested emphasis can be ambiguous: **bold *italic* bold** should work, but ***bold and italic*** can be nested either as bold wrapping italic or as italic wrapping bold, and a legacy parser may mis-pair the delimiters instead of producing the intended all-bold-and-italic text. Different parsers handle this differently. The CommonMark spec clarifies many such cases, but legacy parsers may still have quirks. Testing with your specific parser is important.

List formatting is notoriously tricky. Indentation matters—sub-lists must be indented (typically 2 or 4 spaces). Mixing ordered and unordered lists requires proper indentation. Lazy continuation (starting a list item, then having multiple paragraphs without re-indenting) works differently across parsers. Blank lines between items can change whether items are wrapped in <p> tags or not, affecting CSS styling.

Special characters in URLs need escaping. A URL containing parentheses, such as [link](https://example.com/path_(with)_parens), can break the link parser: some parsers treat the first closing parenthesis as the end of the Markdown link, and even CommonMark-compliant parsers only accept unescaped parentheses when they are balanced. Solutions include URL-encoding the parentheses (%28, %29), using angle brackets [link](<url>), or relying on automatic linking. Similar issues occur with spaces (use %20), quotes, and other special characters.
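
The percent-encoding workaround can be sketched as a small helper that builds a Markdown link. The names `encode_link_target` and `markdown_link` are hypothetical, and only the characters mentioned above are handled.

```python
def encode_link_target(url):
    """Percent-encode characters that commonly confuse Markdown link parsers."""
    return url.replace("(", "%28").replace(")", "%29").replace(" ", "%20")


def markdown_link(text, url):
    """Build [text](url) with a parser-safe target."""
    return f"[{text}]({encode_link_target(url)})"


print(markdown_link("link", "https://example.com/path_(with)_parens"))
# [link](https://example.com/path_%28with%29_parens)
```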

HTML entities and character encoding cause subtle issues. Markdown supports HTML entities (&copy; for ©), but the parser must decide when to convert them. Raw HTML passes through, but inside code blocks &lt; and &gt; must not be decoded to < and >, and literal < and > must be escaped in the output. Unicode characters work in UTF-8 encoded documents but might display incorrectly if the encoding is wrong. Always use UTF-8 for Markdown files.
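
The difference between decoding entities in prose and escaping everything in code can be sketched with Python's standard `html` module. The helper names are hypothetical, and the split into two functions is an assumption about how a renderer might organize this.

```python
from html import escape, unescape


def render_code_span(code):
    """Code content is shown literally, so every special character is escaped."""
    return f"<code>{escape(code, quote=False)}</code>"


def decode_entities(text):
    """In ordinary text, entity references are resolved to characters."""
    return unescape(text)


print(decode_entities("&copy; 2004"))  # © 2004
print(render_code_span("a < b"))       # <code>a &lt; b</code>
print(render_code_span("&lt;"))        # <code>&amp;lt;</code>
```

The last call shows why the two paths must stay separate: an entity typed inside code is escaped again, so the browser displays the characters `&lt;` rather than `<`.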

Table formatting is fragile—pipes that aren't properly aligned or missing header separators can break table parsing. Long cell content makes tables hard to read in source. Some parsers support pipe escaping (\|) for pipes in cell content, others don't. Markdown tables are convenient for simple data but become unwieldy for complex tables with cell spanning, nested structures, or lots of content. In such cases, use HTML tables directly.

Use in Static Site Generators and Documentation

Static site generators (Jekyll, Hugo, Gatsby, Next.js) use Markdown for content authoring. Authors write posts in Markdown files with front matter (metadata like title, date, tags in YAML format at the file start). The generator parses the Markdown, applies templates, and produces HTML pages. This separates content from presentation—change the template without touching content, or change content without worrying about HTML structure.
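
Separating front matter from the Markdown body can be sketched as below. This is a simplification: it assumes the metadata is a flat list of `key: value` lines between two `---` markers and does not parse real YAML. The function name `split_front_matter` is hypothetical.

```python
def split_front_matter(source):
    """Return (metadata dict, Markdown body) for a '---'-delimited file."""
    lines = source.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, source  # no front matter present
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            meta = {}
            for raw in lines[1:i]:
                if ":" in raw:
                    # Split on the first colon only, so values may contain colons.
                    key, value = raw.split(":", 1)
                    meta[key.strip()] = value.strip()
            body = "\n".join(lines[i + 1:]).lstrip("\n")
            return meta, body
    return {}, source  # opening marker without a closing one


post = "---\ntitle: Hello\ndate: 2024-01-01\n---\n\n# Hello\n"
meta, body = split_front_matter(post)
print(meta)  # {'title': 'Hello', 'date': '2024-01-01'}
print(body)  # # Hello
```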

Documentation systems (GitBook, Docusaurus, VuePress, MkDocs) are built around Markdown. Technical documentation benefits from Markdown's simplicity and version control friendliness—docs live alongside code in Git, pull requests can change docs, and diffs are readable. Features like search, navigation, and versioning are added by the documentation tool while authors focus on writing clear content in Markdown.

Plugins and extensions enhance Markdown for specific needs. Math notation using LaTeX (wrapped in $...$ or $$...$$) renders equations using MathJax or KaTeX. Mermaid diagrams embed flowcharts and diagrams using text syntax. Footnotes, emoji shortcodes (:smile:), table of contents generation, and anchor links are common extensions. The plugin ecosystem makes Markdown adaptable to technical writing, academic papers, and complex documentation.

Conversion to other formats is possible through Pandoc, a universal document converter. Pandoc can convert Markdown to PDF (via LaTeX), Word documents, EPUB ebooks, slide presentations, and more. This makes Markdown a versatile source format—write once, publish to multiple formats. However, format-specific features may not translate perfectly, and complex layouts often require format-specific tweaking after conversion.

FAQ

What's the difference between Markdown and HTML?

Markdown is a lightweight markup language designed for easy writing and reading, using simple syntax like # for headings and ** for bold. HTML is a full markup language with verbose tags like <h1> and <strong>. Markdown converts to HTML but is much easier to write and read in source form. Use Markdown for content authoring, HTML when you need precise control over structure and styling.

Which Markdown flavor should I use?

GitHub Flavored Markdown (GFM) is the most widely supported and feature-rich, with tables, task lists, and strikethrough. If you're documenting software or using GitHub/GitLab, use GFM. For maximum compatibility across parsers, stick to CommonMark (the standardized core). For specific platforms (Discord, Reddit), check which flavor they support—many have custom extensions.

Can I use custom CSS with generated HTML?

Yes, the HTML generated from Markdown includes semantic tags that you can style with CSS. Most renderers add classes to elements (like .language-javascript for code blocks) that you can target. You can also wrap the rendered HTML in a div with a class and style everything within it. Many documentation tools provide themes or allow custom CSS for this purpose.

Is user-submitted Markdown safe to display?

Only if you sanitize it! Markdown allows inline HTML, which could include malicious scripts (for example, <script>alert('XSS')</script>). Always use a sanitization library (like DOMPurify) to remove dangerous HTML before displaying user content. Many platforms disable inline HTML entirely for user content, only allowing pure Markdown syntax. Never trust user input without proper sanitization.