push sheeet
Dark Steveneq
2025-10-09 14:15:47 +02:00
commit 646b892680
49168 changed files with 5897842 additions and 0 deletions


@@ -0,0 +1,91 @@
# nixos-render-docs
A [CommonMark](https://commonmark.org/) and [`man-pages`](https://www.man7.org/linux/man-pages/man7/man-pages.7.html) renderer for the NixOS and Nixpkgs manuals.
## Summary
`nixos-render-docs` implements [RFC 72](https://github.com/NixOS/rfcs/pull/72) and has enabled a lossless port of Nixpkgs and NixOS documentation, which was originally written in the [DocBook](https://docbook.org/whatis) format, to [CommonMark](https://commonmark.org/) with [custom extensions](../../../../doc/README.md#syntax).
Maintaining our own documentation rendering framework may appear extreme but has practical advantages:
- We never have to work around existing tools made under different assumptions
- We don't have to deal with unexpected breakage
- We can grow the framework with our evolving requirements without relying on external support or approval or the need to maintain a small diff to upstream
- The amount of code involved is minimal because it's single-purpose
Several alternatives to `nixos-render-docs` were discussed in the past.
A detailed analysis can be found in a [table comparing documentation rendering frameworks](https://ethercalc.net/dc4vcnnl8zv0).
## Redirects system
Moving contents around can cause links to break.
Since we have our own markdown parser, we can hook into the rendering process to extract all of the metadata around each content identifier.
The [mechanism for checking correctness of redirects](./src/nixos_render_docs/redirects.py) takes the collection of identifiers and a mapping of the identified content to its historical locations in the output.
It validates them against a set of rules, and creates a client-side redirect mapping for each output file, as well as a `_redirects` file for server-side redirects in [Netlify syntax](https://docs.netlify.com/routing/redirects/#syntax-for-the-redirects-file).
This allows us to catch:
- Identifiers that were removed or renamed
- Content that was moved from one location to another
- Various consistency errors in the redirects mapping
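
As a rough illustration of the server-side part, the sketch below derives `_redirects` lines from a mapping of identifiers to output locations. The helper name `server_redirects`, the `sec-foo` identifier, and the choice of a 301 status are assumptions made for this example, not the actual code in `redirects.py`. URL fragments are never sent to the server, so only path-level moves can be expressed this way; anchor-level moves are left to the client-side mapping.

```python
def server_redirects(mapping: dict[str, list[str]]) -> str:
    """Build Netlify-style redirect lines for paths that no longer hold content.

    Hypothetical helper: the first location of each identifier is assumed to be
    the current one, all remaining locations are historical.
    """
    # Paths that the current build still produces must not be redirected away.
    current_paths = {locations[0].split("#", 1)[0] for locations in mapping.values()}
    lines: dict[str, str] = {}
    for locations in mapping.values():
        current_path = locations[0].split("#", 1)[0]
        for old in locations[1:]:
            old_path = old.split("#", 1)[0]
            if old_path not in current_paths:
                # Keep the first target seen for each historical path.
                lines.setdefault(old_path, f"/{old_path} /{current_path} 301")
    return "\n".join(lines[path] for path in sorted(lines))


example = {
    "sec-foo": ["index.html#sec-foo", "foo.html#foo", "bar.html#foo"],
}
print(server_redirects(example))
```

Each emitted line follows the `from to status` shape of the Netlify `_redirects` format.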
### Design considerations
The creation, movement, and removal of every identifier is captured in the Git history.
However, analysing hundreds of thousands of commits during the build process is impractical.
The chosen design is a trade-off between speed, repository size, and contributor friction:
- Stricter checks always require more attention from contributors.
- Checks should be reasonably fast and ideally happen locally, e.g. as part of the build, as anything else will substantially lengthen the feedback cycle.
- Computing redirects against previous revisions of the repository would be more space-efficient, but impractically slow.
- It would also require keeping an impure or otherwise continuously updated reference to those other revisions.
- The static mapping acts like a semi-automatically updated cache that we drag along with version history.
- Other setups, such as a dedicated service to cache a history of moved content, are more complicated and would still be impure.
- Checking in large amounts of data that is touched often bears a risk of more merge conflicts or related build failures.
The solution picked here is to have a static mapping of the historical locations checked into the Git tree, such that it can be read during the build process.
This also ensures that an improper redirect mapping will cause `nixos-render-docs` to fail the build and thus enforce that redirects stay up-to-date with every commit.
### Redirects Mapping Structure
Here's an overview of this mapping:
```json
{
  "<identifier>": [
    "index.html#<identifier>",
    "foo.html#foo",
    "bar.html#foo"
  ]
}
```
- The keys of this mapping _must_ be an exhaustive list of all identifiers in the source files.
- The first element of the value of this mapping _must_ be the current output location (path and anchor) of the content signified by the identifier in the mapping key.
- While the order of the remaining elements is unconstrained, please only prepend to this list when the content under the identifier moves in order to keep the diffs readable.
In case this identifier is renamed, the mapping would change into:
```json
{
  "<identifier-new>": [
    "index.html#<identifier-new>",
    "foo.html#<identifier>",
    "bar.html#foo",
    "index.html#<identifier>"
  ]
}
```
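
The two hard rules above (exhaustive keys, current location first) can be sketched as a small check. This is a minimal illustration under stated assumptions: the function `check_redirects` and the shape of its `identifiers` argument (identifier mapped to current output location) are hypothetical and do not mirror the real rule set in `redirects.py`.

```python
def check_redirects(identifiers: dict[str, str], mapping: dict[str, list[str]]) -> list[str]:
    """Return human-readable problems; an empty list means the mapping is consistent.

    `identifiers` maps every identifier found in the sources to its current
    output location, e.g. {"sec-foo": "index.html#sec-foo"} (assumed shape).
    """
    problems = []
    # Rule 1: the mapping keys must be exactly the identifiers in the sources.
    for ident in sorted(identifiers.keys() - mapping.keys()):
        problems.append(f"missing from mapping: {ident}")
    for ident in sorted(mapping.keys() - identifiers.keys()):
        problems.append(f"no longer in sources: {ident}")
    # Rule 2: the first location must be the current output location.
    for ident in sorted(identifiers.keys() & mapping.keys()):
        locations = mapping[ident]
        if not locations or locations[0] != identifiers[ident]:
            problems.append(f"first location is not current: {ident}")
    return problems


problems = check_redirects(
    {"a": "index.html#a", "b": "index.html#b"},
    {"a": ["index.html#a", "old.html#a"], "c": ["index.html#c"]},
)
for problem in problems:
    print(problem)
```

A build step could fail whenever the returned list is non-empty, which is the kind of enforcement described in the design considerations.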
## Rendering multiple pages
The `include` directive accepts an `into-file` argument that specifies the file into which the imported Markdown is rendered.
We can use this argument to set up multi-page rendering of the manuals.
For example,
~~~
```{=include=} appendix html:into-file=//release-notes.html
release-notes/release-notes.md
```
~~~
will render the release notes into a `release-notes.html` file, instead of making it a section within the default `index.html`.
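
To make the shape of the directive header concrete, here is a minimal sketch that splits such an info string into the block kind and its `key=value` arguments. The function `parse_include_info` is hypothetical and only illustrates the syntax shown in the example; the actual parsing in `nixos-render-docs` may differ.

```python
def parse_include_info(info: str) -> tuple[str, dict[str, str]]:
    """Split an include header into its kind and a dict of arguments.

    Hypothetical helper: assumes whitespace-separated words, with the marker
    first, the kind second, and any number of key=value arguments after that.
    """
    marker, kind, *raw_args = info.split()
    if marker != "{=include=}":
        raise ValueError(f"not an include block: {info!r}")
    args: dict[str, str] = {}
    for raw in raw_args:
        key, sep, value = raw.partition("=")
        if not sep:
            raise ValueError(f"argument without value: {raw!r}")
        args[key] = value
    return kind, args


kind, args = parse_include_info("{=include=} appendix html:into-file=//release-notes.html")
print(kind, args)
```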


@@ -0,0 +1,89 @@
{
  lib,
  python3,
  runCommand,
}:

let
  python = python3.override {
    self = python;
    packageOverrides = final: prev: {
      markdown-it-py = prev.markdown-it-py.overridePythonAttrs (_: {
        doCheck = false;
      });
      mdit-py-plugins = prev.mdit-py-plugins.overridePythonAttrs (_: {
        doCheck = false;
      });
    };
  };
in

python.pkgs.buildPythonApplication rec {
  pname = "nixos-render-docs";
  version = "0.0";
  format = "pyproject";

  src = lib.cleanSourceWith {
    filter =
      name: type:
      lib.cleanSourceFilter name type
      && !(
        type == "directory"
        && builtins.elem (baseNameOf name) [
          ".pytest_cache"
          ".mypy_cache"
          "__pycache__"
        ]
      );
    src = ./src;
  };

  nativeCheckInputs = [
    python.pkgs.pytestCheckHook
  ];

  build-system = [
    python.pkgs.setuptools
  ];

  propagatedBuildInputs = with python.pkgs; [
    markdown-it-py
    mdit-py-plugins
  ];

  pytestFlags = [
    "-vvrP"
  ];

  enabledTestPaths = [
    "tests/"
  ];

  # NOTE this is a CI test rather than a build-time test because we want to keep the
  # build closures small. mypy has an unreasonably large build closure for docs builds.
  passthru.tests.typing =
    runCommand "${pname}-mypy"
      {
        nativeBuildInputs = [
          (python3.withPackages (
            ps: with ps; [
              mypy
              pytest
              markdown-it-py
              mdit-py-plugins
            ]
          ))
        ];
      }
      ''
        mypy --strict ${src}
        touch $out
      '';

  meta = with lib; {
    description = "Renderer for NixOS manual and option docs";
    mainProgram = "nixos-render-docs";
    license = licenses.mit;
    maintainers = [ ];
  };
}


@@ -0,0 +1,55 @@
import argparse
import sys
import textwrap
import traceback
from io import StringIO
from pprint import pprint

from . import manual
from . import options
from . import parallel

def pretty_print_exc(e: BaseException, *, _desc_text: str = "error") -> None:
    print(f"\x1b[1;31m{_desc_text}:\x1b[0m", file=sys.stderr)
    # destructure Exception and RuntimeError specifically so we can show nice
    # messages for errors that weren't given their own exception type with
    # a good pretty-printer.
    if type(e) is Exception or type(e) is RuntimeError:
        args = e.args
        if len(args) and isinstance(args[0], str):
            print("\t", args[0], file=sys.stderr, sep="")
            args = args[1:]
        buf = StringIO()
        for arg in args:
            pprint(arg, stream=buf)
        if extra_info := buf.getvalue():
            print("\x1b[1;34mextra info:\x1b[0m", file=sys.stderr)
            print(textwrap.indent(extra_info, "\t"), file=sys.stderr, end="")
    else:
        # keep the message on stderr, next to its header
        print(e, file=sys.stderr)
    if e.__cause__ is not None:
        print("", file=sys.stderr)
        pretty_print_exc(e.__cause__, _desc_text="caused by")

def main() -> None:
    parser = argparse.ArgumentParser(description='render nixos manual bits')
    parser.add_argument('-j', '--jobs', type=int, default=None)

    commands = parser.add_subparsers(dest='command', required=True)

    options.build_cli(commands.add_parser('options'))
    manual.build_cli(commands.add_parser('manual'))

    args = parser.parse_args()
    try:
        parallel.pool_processes = args.jobs
        if args.command == 'options':
            options.run_cli(args)
        elif args.command == 'manual':
            manual.run_cli(args)
        else:
            raise RuntimeError('command not hooked up', args)
    except Exception as e:
        traceback.print_exc()
        pretty_print_exc(e)
        sys.exit(1)


@@ -0,0 +1,217 @@
from collections.abc import Mapping, Sequence
from dataclasses import dataclass
from typing import cast
from urllib.parse import quote

from .md import Renderer

from markdown_it.token import Token

_asciidoc_escapes = {
    # escape all dots, just in case one is pasted at SOL
    ord('.'): "{zwsp}.",
    # may be replaced by typographic variants
    ord("'"): "{apos}",
    ord('"'): "{quot}",
    # passthrough character
    ord('+'): "{plus}",
    # table marker
    ord('|'): "{vbar}",
    # xml entity reference
    ord('&'): "{amp}",
    # crossrefs. < needs extra escaping because links break in odd ways if they start with it
    ord('<'): "{zwsp}+<+{zwsp}",
    ord('>'): "{gt}",
    # anchors, links, block attributes
    ord('['): "{startsb}",
    ord(']'): "{endsb}",
    # superscript, subscript
    ord('^'): "{caret}",
    ord('~'): "{tilde}",
    # bold
    ord('*'): "{asterisk}",
    # backslash
    ord('\\'): "{backslash}",
    # inline code
    ord('`'): "{backtick}",
}
def asciidoc_escape(s: str) -> str:
    s = s.translate(_asciidoc_escapes)
    # :: is deflist item, ;; has a replacement but no idea why
    return s.replace("::", "{two-colons}").replace(";;", "{two-semicolons}")

@dataclass(kw_only=True)
class List:
    head: str

@dataclass()
class Par:
    sep: str
    block_delim: str
    continuing: bool = False

class AsciiDocRenderer(Renderer):
    __output__ = "asciidoc"

    _parstack: list[Par]
    _list_stack: list[List]
    _attrspans: list[str]

    def __init__(self, manpage_urls: Mapping[str, str]):
        super().__init__(manpage_urls)
        self._parstack = [ Par("\n\n", "====") ]
        self._list_stack = []
        self._attrspans = []

    def _enter_block(self, is_list: bool) -> None:
        self._parstack.append(Par("\n+\n" if is_list else "\n\n", self._parstack[-1].block_delim + "="))
    def _leave_block(self) -> None:
        self._parstack.pop()
    def _break(self, force: bool = False) -> str:
        result = self._parstack[-1].sep if force or self._parstack[-1].continuing else ""
        self._parstack[-1].continuing = True
        return result

    def _admonition_open(self, kind: str) -> str:
        pbreak = self._break()
        self._enter_block(False)
        return f"{pbreak}[{kind}]\n{self._parstack[-2].block_delim}\n"
    def _admonition_close(self) -> str:
        self._leave_block()
        return f"\n{self._parstack[-1].block_delim}\n"

    def _list_open(self, token: Token, head: str) -> str:
        attrs = []
        if (idx := token.attrs.get('start')) is not None:
            attrs.append(f"start={idx}")
        if token.meta['compact']:
            attrs.append('options="compact"')
        if self._list_stack:
            head *= len(self._list_stack[0].head) + 1
        self._list_stack.append(List(head=head))
        return f"{self._break()}[{','.join(attrs)}]"
    def _list_close(self) -> str:
        self._list_stack.pop()
        return ""

    def text(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._parstack[-1].continuing = True
        return asciidoc_escape(token.content)
    def paragraph_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._break()
    def paragraph_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return ""
    def hardbreak(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return " +\n"
    def softbreak(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return " "
    def code_inline(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._parstack[-1].continuing = True
        return f"``{asciidoc_escape(token.content)}``"
    def code_block(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self.fence(token, tokens, i)
    def link_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._parstack[-1].continuing = True
        return f"link:{quote(cast(str, token.attrs['href']), safe='/:')}["
    def link_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return "]"
    def list_item_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._enter_block(True)
        # allow the next token to be a block or an inline.
        return f'\n{self._list_stack[-1].head} {{empty}}'
    def list_item_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._leave_block()
        return "\n"
    def bullet_list_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._list_open(token, '*')
    def bullet_list_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._list_close()
    def em_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return "__"
    def em_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return "__"
    def strong_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return "**"
    def strong_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return "**"
    def fence(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        attrs = f"[source,{token.info}]\n" if token.info else ""
        code = token.content
        if code.endswith('\n'):
            code = code[:-1]
        return f"{self._break(True)}{attrs}----\n{code}\n----"
    def blockquote_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        pbreak = self._break(True)
        self._enter_block(False)
        return f"{pbreak}[quote]\n{self._parstack[-2].block_delim}\n"
    def blockquote_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._leave_block()
        return f"\n{self._parstack[-1].block_delim}"
    def note_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_open("NOTE")
    def note_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_close()
    def caution_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_open("CAUTION")
    def caution_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_close()
    def important_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_open("IMPORTANT")
    def important_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_close()
    def tip_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_open("TIP")
    def tip_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_close()
    def warning_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_open("WARNING")
    def warning_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_close()
    def dl_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return f"{self._break()}[]"
    def dl_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return ""
    def dt_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._break()
    def dt_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._enter_block(True)
        return ":: {empty}"
    def dd_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return ""
    def dd_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._leave_block()
        return "\n"
    def myst_role(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        # NixOS-specific roles are documented at <nixpkgs>/doc/README.md (with reverse reference)
        self._parstack[-1].continuing = True
        content = asciidoc_escape(token.content)
        if token.meta['name'] == 'manpage' and (url := self._manpage_urls.get(token.content)):
            return f"link:{quote(url, safe='/:')}[{content}]"
        return f"[.{token.meta['name']}]``{asciidoc_escape(token.content)}``"
    def inline_anchor(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._parstack[-1].continuing = True
        return f"[[{token.attrs['id']}]]"
    def attr_span_begin(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._parstack[-1].continuing = True
        (id_part, class_part) = ("", "")
        if id := token.attrs.get('id'):
            id_part = f"[[{id}]]"
        if s := token.attrs.get('class'):
            if s == 'keycap':
                class_part = "kbd:["
                self._attrspans.append("]")
            else:
                return super().attr_span_begin(token, tokens, i)
        else:
            self._attrspans.append("")
        return id_part + class_part
    def attr_span_end(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._attrspans.pop()
    def heading_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return token.markup.replace("#", "=") + " "
    def heading_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return "\n"
    def ordered_list_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._list_open(token, '.')
    def ordered_list_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._list_close()


@@ -0,0 +1,191 @@
from collections.abc import Mapping, Sequence
from dataclasses import dataclass
from typing import cast, Optional

from .md import md_escape, md_make_code, Renderer

from markdown_it.token import Token

@dataclass(kw_only=True)
class List:
    next_idx: Optional[int] = None
    compact: bool
    first_item_seen: bool = False

@dataclass
class Par:
    indent: str
    continuing: bool = False

class CommonMarkRenderer(Renderer):
    __output__ = "commonmark"

    _parstack: list[Par]
    _link_stack: list[str]
    _list_stack: list[List]

    def __init__(self, manpage_urls: Mapping[str, str]):
        super().__init__(manpage_urls)
        self._parstack = [ Par("") ]
        self._link_stack = []
        self._list_stack = []

    def _enter_block(self, extra_indent: str) -> None:
        self._parstack.append(Par(self._parstack[-1].indent + extra_indent))
    def _leave_block(self) -> None:
        self._parstack.pop()
        self._parstack[-1].continuing = True
    def _break(self) -> str:
        self._parstack[-1].continuing = True
        return f"\n{self._parstack[-1].indent}"
    def _maybe_parbreak(self) -> str:
        result = f"\n{self._parstack[-1].indent}" * 2 if self._parstack[-1].continuing else ""
        self._parstack[-1].continuing = True
        return result

    def _admonition_open(self, kind: str) -> str:
        pbreak = self._maybe_parbreak()
        self._enter_block("")
        return f"{pbreak}**{kind}:** "
    def _admonition_close(self) -> str:
        self._leave_block()
        return ""

    def _indent_raw(self, s: str) -> str:
        if '\n' not in s:
            return s
        return f"\n{self._parstack[-1].indent}".join(s.splitlines())

    def text(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._parstack[-1].continuing = True
        return self._indent_raw(md_escape(token.content))
    def paragraph_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._maybe_parbreak()
    def paragraph_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return ""
    def hardbreak(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        # a CommonMark hard line break needs at least two trailing spaces
        return f"  {self._break()}"
    def softbreak(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._break()
    def code_inline(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._parstack[-1].continuing = True
        return md_make_code(token.content)
    def code_block(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self.fence(token, tokens, i)
    def link_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._parstack[-1].continuing = True
        self._link_stack.append(cast(str, token.attrs['href']))
        return "["
    def link_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return f"]({md_escape(self._link_stack.pop())})"
    def list_item_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        lst = self._list_stack[-1]
        lbreak = "" if not lst.first_item_seen else self._break() * (1 if lst.compact else 2)
        lst.first_item_seen = True
        head = " -"
        if lst.next_idx is not None:
            head = f" {lst.next_idx}."
            lst.next_idx += 1
        self._enter_block(" " * (len(head) + 1))
        return f'{lbreak}{head} '
    def list_item_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._leave_block()
        return ""
    def bullet_list_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._list_stack.append(List(compact=bool(token.meta['compact'])))
        return self._maybe_parbreak()
    def bullet_list_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._list_stack.pop()
        return ""
    def em_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return "*"
    def em_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return "*"
    def strong_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return "**"
    def strong_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return "**"
    def fence(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        code = token.content
        if code.endswith('\n'):
            code = code[:-1]
        pbreak = self._maybe_parbreak()
        return pbreak + self._indent_raw(md_make_code(code, info=token.info, multiline=True))
    def blockquote_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        pbreak = self._maybe_parbreak()
        self._enter_block("> ")
        return pbreak + "> "
    def blockquote_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._leave_block()
        return ""
    def note_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_open("Note")
    def note_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_close()
    def caution_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_open("Caution")
    def caution_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_close()
    def important_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_open("Important")
    def important_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_close()
    def tip_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_open("Tip")
    def tip_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_close()
    def warning_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_open("Warning")
    def warning_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return self._admonition_close()
    def dl_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._list_stack.append(List(compact=False))
        return ""
    def dl_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._list_stack.pop()
        return ""
    def dt_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        pbreak = self._maybe_parbreak()
        # indent continuation lines so they line up with the text after " - "
        self._enter_block("   ")
        # add an opening zero-width non-joiner to separate *our* emphasis from possible
        # emphasis in the provided term
        return f'{pbreak} - *{chr(0x200C)}'
    def dt_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return f"{chr(0x200C)}*"
    def dd_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._parstack[-1].continuing = True
        return ""
    def dd_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._leave_block()
        return ""
    def myst_role(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        # NixOS-specific roles are documented at <nixpkgs>/doc/README.md (with reverse reference)
        self._parstack[-1].continuing = True
        content = md_make_code(token.content)
        if token.meta['name'] == 'manpage' and (url := self._manpage_urls.get(token.content)):
            return f"[{content}]({url})"
        return content # no roles in regular commonmark
    def attr_span_begin(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        # there's no way we can emit attrspans correctly in all cases. we could use inline
        # html for ids, but that would not round-trip. same holds for classes. since this
        # renderer is only used for approximate options export and all of these things are
        # not allowed in options we can ignore them for now.
        return ""
    def attr_span_end(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return ""
    def heading_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return token.markup + " "
    def heading_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        return "\n"
    def ordered_list_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._list_stack.append(
            List(next_idx = cast(int, token.attrs.get('start', 1)),
                 compact = bool(token.meta['compact'])))
        return self._maybe_parbreak()
    def ordered_list_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        self._list_stack.pop()
        return ""
    def image(self, token: Token, tokens: Sequence[Token], i: int) -> str:
        if title := cast(str, token.attrs.get('title', '')):
            title = ' "' + title.replace('"', '\\"') + '"'
        return f'![{token.content}]({token.attrs["src"]}{title})'


@@ -0,0 +1,353 @@
from collections.abc import Mapping, Sequence
from typing import cast, Optional, NamedTuple
from html import escape
from markdown_it.token import Token
from .manual_structure import XrefTarget
from .md import Renderer
class UnresolvedXrefError(Exception):
pass
class Heading(NamedTuple):
container_tag: str
level: int
html_tag: str
# special handling for part content: whether partinfo div was already closed from
# elsewhere or still needs closing.
partintro_closed: bool
# tocs are generated when the heading opens, but have to be emitted into the file
# after the heading titlepage (and maybe partinfo) has been closed.
toc_fragment: str
_bullet_list_styles = [ 'disc', 'circle', 'square' ]
_ordered_list_styles = [ '1', 'a', 'i', 'A', 'I' ]
class HTMLRenderer(Renderer):
_xref_targets: Mapping[str, XrefTarget]
_headings: list[Heading]
_attrspans: list[str]
_hlevel_offset: int = 0
_bullet_list_nesting: int = 0
_ordered_list_nesting: int = 0
def __init__(self, manpage_urls: Mapping[str, str], xref_targets: Mapping[str, XrefTarget]):
super().__init__(manpage_urls)
self._headings = []
self._attrspans = []
self._xref_targets = xref_targets
def render(self, tokens: Sequence[Token]) -> str:
result = super().render(tokens)
result += self._close_headings(None)
return result
def _pull_image(self, path: str) -> str:
raise NotImplementedError()
def text(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return escape(token.content)
def paragraph_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "<p>"
def paragraph_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</p>"
def hardbreak(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "<br />"
def softbreak(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "\n"
def code_inline(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return f'<code class="literal">{escape(token.content)}</code>'
def code_block(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return self.fence(token, tokens, i)
def link_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
href = escape(cast(str, token.attrs['href']), True)
tag, title, target, text = "link", "", 'target="_top"', ""
if href.startswith('#'):
if not (xref := self._xref_targets.get(href[1:])):
raise UnresolvedXrefError(f"bad local reference, id {href} not known")
if tokens[i + 1].type == 'link_close':
tag, text = "xref", xref.title_html
if xref.title:
# titles are not attribute-safe on their own, so we need to replace quotes.
title = 'title="{}"'.format(xref.title.replace('"', '&quot;'))
target, href = "", xref.href()
return f'<a class="{tag}" href="{href}" {title} {target}>{text}'
def link_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</a>"
def list_item_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return '<li class="listitem">'
def list_item_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</li>"
def bullet_list_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
extra = 'compact' if token.meta.get('compact', False) else ''
style = _bullet_list_styles[self._bullet_list_nesting % len(_bullet_list_styles)]
self._bullet_list_nesting += 1
return f'<div class="itemizedlist"><ul class="itemizedlist {extra}" style="list-style-type: {style};">'
def bullet_list_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
self._bullet_list_nesting -= 1
return "</ul></div>"
def em_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return '<span class="emphasis"><em>'
def em_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</em></span>"
def strong_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return '<span class="strong"><strong>'
def strong_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</strong></span>"
def fence(self, token: Token, tokens: Sequence[Token], i: int) -> str:
info = f" {escape(token.info, True)}" if token.info != "" else ""
return f'<pre><code class="programlisting{info}">{escape(token.content)}</code></pre>'
def blockquote_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return '<div class="blockquote"><blockquote class="blockquote">'
def blockquote_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</blockquote></div>"
def note_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return '<div class="note"><h3 class="title">Note</h3>'
def note_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</div>"
def caution_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return '<div class="caution"><h3 class="title">Caution</h3>'
def caution_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</div>"
def important_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return '<div class="important"><h3 class="title">Important</h3>'
def important_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</div>"
def tip_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return '<div class="tip"><h3 class="title">Tip</h3>'
def tip_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</div>"
def warning_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return '<div class="warning"><h3 class="title">Warning</h3>'
def warning_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</div>"
def dl_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return '<div class="variablelist"><dl class="variablelist">'
def dl_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</dl></div>"
def dt_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return '<dt><span class="term">'
def dt_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</span></dt>"
def dd_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "<dd>"
def dd_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</dd>"
def myst_role(self, token: Token, tokens: Sequence[Token], i: int) -> str:
# NixOS-specific roles are documented at <nixpkgs>/doc/README.md (with reverse reference)
if token.meta['name'] == 'command':
return f'<span class="command"><strong>{escape(token.content)}</strong></span>'
if token.meta['name'] == 'file':
return f'<code class="filename">{escape(token.content)}</code>'
if token.meta['name'] == 'var':
return f'<code class="varname">{escape(token.content)}</code>'
if token.meta['name'] == 'env':
return f'<code class="envar">{escape(token.content)}</code>'
if token.meta['name'] == 'option':
return f'<code class="option">{escape(token.content)}</code>'
if token.meta['name'] == 'manpage':
[page, section] = [ s.strip() for s in token.content.rsplit('(', 1) ]
section = section[:-1]
man = f"{page}({section})"
title = f'<span class="refentrytitle">{escape(page)}</span>'
vol = f"({escape(section)})"
ref = f'<span class="citerefentry">{title}{vol}</span>'
if man in self._manpage_urls:
return f'<a class="link" href="{escape(self._manpage_urls[man], True)}" target="_top">{ref}</a>'
else:
return ref
return super().myst_role(token, tokens, i)
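# Illustration only: a minimal sketch of how the manpage role content is split in the
# handler above. The helper name split_manpage_ref is hypothetical and not part of the
# module; the three steps (split at the last "(", trim whitespace, drop the trailing ")")
# are taken from the handler itself.

```python
def split_manpage_ref(content: str) -> tuple[str, str]:
    # Split at the *last* "(" so page names that contain parentheses stay intact.
    page, section = [s.strip() for s in content.rsplit('(', 1)]
    # The section still carries its closing ")"; drop it.
    return page, section[:-1]


print(split_manpage_ref("systemd.unit(5)"))
print(split_manpage_ref("nix.conf (5)"))
```

# Content without any "(" yields a single-element list, so the unpacking fails with a
# ValueError in this sketch.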
def attr_span_begin(self, token: Token, tokens: Sequence[Token], i: int) -> str:
# we currently support *only* inline anchors and the special .keycap class to produce
# keycap-styled spans.
(id_part, class_part) = ("", "")
if s := token.attrs.get('id'):
id_part = f'<span id="{escape(cast(str, s), True)}"></span>'
if s := token.attrs.get('class'):
if s == 'keycap':
class_part = '<span class="keycap"><strong>'
self._attrspans.append("</strong></span>")
else:
return super().attr_span_begin(token, tokens, i)
else:
self._attrspans.append("")
return id_part + class_part
def attr_span_end(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return self._attrspans.pop()
def heading_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
hlevel = int(token.tag[1:])
htag, hstyle = self._make_hN(hlevel)
if hstyle:
hstyle = f'style="{escape(hstyle, True)}"'
if anchor := cast(str, token.attrs.get('id', '')):
anchor = f'id="{escape(anchor, True)}"'
result = self._close_headings(hlevel)
tag = self._heading_tag(token, tokens, i)
toc_fragment = self._build_toc(tokens, i)
self._headings.append(Heading(tag, hlevel, htag, tag != 'part', toc_fragment))
return (
f'{result}'
f'<div class="{tag}">'
f' <div class="titlepage">'
f' <div>'
f' <div>'
f' <{htag} {anchor} class="title" {hstyle}>'
)
def heading_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
heading = self._headings[-1]
result = (
f' </{heading.html_tag}>'
f' </div>'
f' </div>'
f'</div>'
)
if heading.container_tag == 'part':
result += '<div class="partintro">'
else:
result += heading.toc_fragment
return result
def ordered_list_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
extra = 'compact' if token.meta.get('compact', False) else ''
start = f'start="{token.attrs["start"]}"' if 'start' in token.attrs else ""
style = _ordered_list_styles[self._ordered_list_nesting % len(_ordered_list_styles)]
self._ordered_list_nesting += 1
return f'<div class="orderedlist"><ol class="orderedlist {extra}" {start} type="{style}">'
def ordered_list_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
self._ordered_list_nesting -= 1
return "</ol></div>"
def example_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
if id := cast(str, token.attrs.get('id', '')):
id = f'id="{escape(id, True)}"' if id else ''
return f'<div class="example"><span {id} ></span><details>'
def example_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return '</div></details></div><br class="example-break" />'
def example_title_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return '<summary><span class="title"><strong>'
def example_title_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return '</strong></span></summary><div class="example-contents">'
def image(self, token: Token, tokens: Sequence[Token], i: int) -> str:
src = self._pull_image(cast(str, token.attrs['src']))
alt = f'alt="{escape(token.content, True)}"' if token.content else ""
if title := cast(str, token.attrs.get('title', '')):
title = f'title="{escape(title, True)}"'
return (
'<div class="mediaobject">'
f'<img src="{escape(src, True)}" {alt} {title} />'
'</div>'
)
def figure_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
if anchor := cast(str, token.attrs.get('id', '')):
anchor = f'<span id="{escape(anchor, True)}"></span>'
return f'<div class="figure">{anchor}'
def figure_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return (
' </div>'
'</div><br class="figure-break" />'
)
def figure_title_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return (
'<p class="title">'
' <strong>'
)
def figure_title_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return (
' </strong>'
'</p>'
'<div class="figure-contents">'
)
def table_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return (
'<div class="informaltable">'
'<table class="informaltable" border="1">'
)
def table_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return (
'</table>'
'</div>'
)
def thead_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
cols = []
for j in range(i + 1, len(tokens)):
if tokens[j].type == 'thead_close':
break
elif tokens[j].type == 'th_open':
cols.append(cast(str, tokens[j].attrs.get('style', 'left')).removeprefix('text-align:'))
return "".join([
"<colgroup>",
"".join([ f'<col align="{col}" />' for col in cols ]),
"</colgroup>",
"<thead>",
])
def thead_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</thead>"
def tr_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "<tr>"
def tr_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</tr>"
def th_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return f'<th align="{cast(str, token.attrs.get("style", "left")).removeprefix("text-align:")}">'
def th_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</th>"
def tbody_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "<tbody>"
def tbody_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</tbody>"
def td_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return f'<td align="{cast(str, token.attrs.get("style", "left")).removeprefix("text-align:")}">'
def td_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</td>"
def footnote_ref(self, token: Token, tokens: Sequence[Token], i: int) -> str:
href = self._xref_targets[token.meta['target']].href()
id = escape(cast(str, token.attrs["id"]), True)
return (
f'<a href="{href}" class="footnote" id="{id}">'
f'<sup class="footnote">[{token.meta["id"] + 1}]</sup>'
'</a>'
)
def footnote_block_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return (
'<div class="footnotes">'
'<br />'
'<hr style="width:100; text-align:left;margin-left: 0" />'
)
def footnote_block_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</div>"
def footnote_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
# meta id,label
id = escape(self._xref_targets[token.meta["label"]].id, True)
return f'<div id="{id}" class="footnote">'
def footnote_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "</div>"
def footnote_anchor(self, token: Token, tokens: Sequence[Token], i: int) -> str:
href = self._xref_targets[token.meta['target']].href()
return (
f'<a href="{href}" class="para">'
f'<sup class="para">[{token.meta["id"] + 1}]</sup>'
'</a>'
)
def _make_hN(self, level: int) -> tuple[str, str]:
return f"h{min(6, max(1, level + self._hlevel_offset))}", ""
def _maybe_close_partintro(self) -> str:
if self._headings:
heading = self._headings[-1]
if heading.container_tag == 'part' and not heading.partintro_closed:
self._headings[-1] = heading._replace(partintro_closed=True)
return heading.toc_fragment + "</div>"
return ""
def _close_headings(self, level: Optional[int]) -> str:
result = []
while len(self._headings) and (level is None or self._headings[-1].level >= level):
result.append(self._maybe_close_partintro())
result.append("</div>")
self._headings.pop()
return "\n".join(result)
def _heading_tag(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return "section"
def _build_toc(self, tokens: Sequence[Token], i: int) -> str:
return ""

@@ -0,0 +1,288 @@
from collections.abc import Mapping, Sequence
from dataclasses import dataclass
from typing import cast, Iterable, Optional
import re
from markdown_it.token import Token
from .md import Renderer
# roff(7) says:
#
# > roff documents may contain only graphable 7-bit ASCII characters, the space character,
# > and, in certain circumstances, the tab character. The backslash character \ indicates
# > the start of an escape sequence […]
#
# mandoc_char(7) says about the `'~^ characters:
#
# > In prose, this automatic substitution is often desirable; but when these characters have
# > to be displayed as plain ASCII characters, for example in source code samples, they require
# > escaping to render as follows:
#
# since we don't want these to be touched anywhere (because markdown will do all substitutions
# we want to have) we'll escape those as well. we also escape " (macro metacharacter), - (might
# turn into a typographic hyphen), and . (roff request marker at SOL, changes spacing semantics
# at EOL). groff additionally does not allow unicode escapes for codepoints below U+0080, so
# those need "proper" roff escapes/replacements instead.
_roff_unicode = re.compile(r'''[^\n !#$%&()*+,\-./0-9:;<=>?@A-Z[\\\]_a-z{|}]''', re.ASCII)
_roff_escapes = {
ord('"'): "\\(dq",
ord("'"): "\\(aq",
ord('-'): "\\-",
ord('.'): "\\&.",
ord('\\'): "\\e",
ord('^'): "\\(ha",
ord('`'): "\\(ga",
ord('~'): "\\(ti",
}
def man_escape(s: str) -> str:
s = s.translate(_roff_escapes)
return _roff_unicode.sub(lambda m: f"\\[u{ord(m[0]):04X}]", s)
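# Illustration only: a self-contained sketch of the two-stage escaping described above.
# The table and the regex are copied from this module so the snippet runs on its own;
# the sample strings are made up for the demonstration.

```python
import re

# Anything outside this allow-list becomes a \[uXXXX] escape.
_roff_unicode = re.compile(r'''[^\n !#$%&()*+,\-./0-9:;<=>?@A-Z[\\\]_a-z{|}]''', re.ASCII)
# ASCII characters that roff treats specially get named escapes first.
_roff_escapes = {
    ord('"'): "\\(dq",
    ord("'"): "\\(aq",
    ord('-'): "\\-",
    ord('.'): "\\&.",
    ord('\\'): "\\e",
    ord('^'): "\\(ha",
    ord('`'): "\\(ga",
    ord('~'): "\\(ti",
}


def man_escape(s: str) -> str:
    # Stage 1: fixed replacements for roff metacharacters.
    s = s.translate(_roff_escapes)
    # Stage 2: everything still outside the allow-list becomes a unicode escape.
    return _roff_unicode.sub(lambda m: f"\\[u{ord(m[0]):04X}]", s)


for sample in ["--help", ".TH", "it's", "café", "a\\b"]:
    print(repr(sample), "->", repr(man_escape(sample)))
```

# The order matters: the named escapes only produce characters from the allow-list, so
# the second stage leaves them untouched.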
# remove leading and trailing spaces from links and condense multiple consecutive spaces
# into a single space for presentation parity with html. this is currently easiest with
# regex postprocessing and some marker characters. since we don't want to drop spaces
# from code blocks we will have to specially protect *inline* code (luckily not block code)
# so normalization can turn the spaces inside it into regular spaces again.
_normalize_space_re = re.compile(r'''\u0000 < *| *>\u0000 |(?<= ) +''')
def _normalize_space(s: str) -> str:
return _normalize_space_re.sub("", s).replace("\0p", " ")
def _protect_spaces(s: str) -> str:
return s.replace(" ", "\0p")
@dataclass(kw_only=True)
class List:
width: int
next_idx: Optional[int] = None
compact: bool
first_item_seen: bool = False
# this renderer assumes that it produces a set of lines as output, and that those lines will
# be pasted as-is into a larger output. no prefixing or suffixing is allowed for correctness.
#
# NOTE that we output exclusively physical markup. this is because we have to use the older
# man(7) format instead of the newer mdoc(7) format due to limitations in groff: while
# using mdoc in groff works fine it is not a native format and thus very slow to render on
# manpages as large as configuration.nix.5. mandoc(1) renders both really quickly, but with
# groff being our predominant manpage viewer we have to optimize for groff instead.
#
# while we do use only physical markup (adjusting indentation with .RS and .RE, adding
# vertical spacing with .sp, \f[BIRP] escapes for bold/italic/roman/$previous font, \h for
# horizontal motion in a line) we do attempt to copy the style of mdoc(7) semantic requests
# as appropriate for each markup element.
class ManpageRenderer(Renderer):
# whether to emit mdoc .Ql equivalents for inline code or just the contents. this is
# mainly used by the options manpage converter to not emit extra quotes in defaults
# and examples where it's already clear from context that the following text is code.
inline_code_is_quoted: bool = True
link_footnotes: Optional[list[str]] = None
_href_targets: dict[str, str]
_link_stack: list[str]
_do_parbreak_stack: list[bool]
_list_stack: list[List]
_font_stack: list[str]
def __init__(self, manpage_urls: Mapping[str, str], href_targets: dict[str, str]):
super().__init__(manpage_urls)
self._href_targets = href_targets
self._link_stack = []
self._do_parbreak_stack = []
self._list_stack = []
self._font_stack = []
def _join_block(self, ls: Iterable[str]) -> str:
return "\n".join([ l for l in ls if len(l) ])
def _join_inline(self, ls: Iterable[str]) -> str:
return _normalize_space(super()._join_inline(ls))
def _enter_block(self) -> None:
self._do_parbreak_stack.append(False)
def _leave_block(self) -> None:
self._do_parbreak_stack.pop()
self._do_parbreak_stack[-1] = True
def _maybe_parbreak(self, suffix: str = "") -> str:
result = f".sp{suffix}" if self._do_parbreak_stack[-1] else ""
self._do_parbreak_stack[-1] = True
return result
def _admonition_open(self, kind: str) -> str:
self._enter_block()
return (
'.sp\n'
'.RS 4\n'
f'\\fB{kind}\\fP\n'
'.br'
)
def _admonition_close(self) -> str:
self._leave_block()
return ".RE"
def render(self, tokens: Sequence[Token]) -> str:
self._do_parbreak_stack = [ False ]
self._font_stack = [ "\\fR" ]
return super().render(tokens)
def text(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return man_escape(token.content)
def paragraph_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return self._maybe_parbreak()
def paragraph_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return ""
def hardbreak(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return ".br"
def softbreak(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return " "
def code_inline(self, token: Token, tokens: Sequence[Token], i: int) -> str:
s = _protect_spaces(man_escape(token.content))
return f"\\fR\\(oq{s}\\(cq\\fP" if self.inline_code_is_quoted else s
def code_block(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return self.fence(token, tokens, i)
def link_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
href = cast(str, token.attrs['href'])
self._link_stack.append(href)
text = ""
if tokens[i + 1].type == 'link_close' and href in self._href_targets:
# TODO error or warning if the target can't be resolved
text = self._href_targets[href]
self._font_stack.append("\\fB")
return f"\\fB{text}\0 <"
def link_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
href = self._link_stack.pop()
text = ""
if self.link_footnotes is not None:
try:
idx = self.link_footnotes.index(href) + 1
except ValueError:
self.link_footnotes.append(href)
idx = len(self.link_footnotes)
text = "\\fR" + man_escape(f"[{idx}]")
self._font_stack.pop()
return f">\0 {text}{self._font_stack[-1]}"
def list_item_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
self._enter_block()
lst = self._list_stack[-1]
maybe_space = '' if lst.compact or not lst.first_item_seen else '.sp\n'
lst.first_item_seen = True
head = ""
if lst.next_idx is not None:
head = f"{lst.next_idx}."
lst.next_idx += 1
return (
f'{maybe_space}'
f'.RS {lst.width}\n'
f"\\h'-{len(head) + 1}'\\fB{man_escape(head)}\\fP\\h'1'\\c"
)
def list_item_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
self._leave_block()
return ".RE"
def bullet_list_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
self._list_stack.append(List(width=4, compact=bool(token.meta['compact'])))
return self._maybe_parbreak()
def bullet_list_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
self._list_stack.pop()
return ""
def em_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
self._font_stack.append("\\fI")
return "\\fI"
def em_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
self._font_stack.pop()
return self._font_stack[-1]
def strong_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
self._font_stack.append("\\fB")
return "\\fB"
def strong_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
self._font_stack.pop()
return self._font_stack[-1]
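# Illustration only: a sketch of the font stack used by the em/strong/link handlers.
# The FontStack class is hypothetical; it only mirrors the push-on-open and
# pop-then-re-emit-on-close pattern of the methods above.

```python
class FontStack:
    def __init__(self) -> None:
        # render() seeds the stack with the roman font.
        self._stack = ["\\fR"]

    def open(self, font: str) -> str:
        self._stack.append(font)
        return font

    def close(self) -> str:
        # Drop the font being closed and re-emit whatever is now on top.
        self._stack.pop()
        return self._stack[-1]


fonts = FontStack()
out = "".join([
    "plain ", fonts.open("\\fI"), "it ", fonts.open("\\fB"), "bold",
    fonts.close(), " it", fonts.close(), " plain",
])
print(out)
```

# Closing the inner bold span switches back to italic explicitly, so nesting of any
# depth restores the enclosing font.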
def fence(self, token: Token, tokens: Sequence[Token], i: int) -> str:
s = man_escape(token.content).rstrip('\n')
return (
'.sp\n'
'.RS 4\n'
'.nf\n'
f'{s}\n'
'.fi\n'
'.RE'
)
def blockquote_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
maybe_par = self._maybe_parbreak("\n")
self._enter_block()
return (
f"{maybe_par}"
".RS 4\n"
f"\\h'-3'\\fI\\(lq\\(rq\\fP\\h'1'\\c"
)
def blockquote_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
self._leave_block()
return ".RE"
def note_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return self._admonition_open("Note")
def note_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return self._admonition_close()
def caution_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return self._admonition_open("Caution")
def caution_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return self._admonition_close()
def important_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return self._admonition_open("Important")
def important_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return self._admonition_close()
def tip_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return self._admonition_open("Tip")
def tip_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return self._admonition_close()
def warning_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return self._admonition_open("Warning")
def warning_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return self._admonition_close()
def dl_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return ".RS 4"
def dl_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return ".RE"
def dt_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return ".PP"
def dt_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return ""
def dd_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
self._enter_block()
return ".RS 4"
def dd_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
self._leave_block()
return ".RE"
def myst_role(self, token: Token, tokens: Sequence[Token], i: int) -> str:
# NixOS-specific roles are documented at <nixpkgs>/doc/README.md (with reverse reference)
if token.meta['name'] in [ 'command', 'env', 'option' ]:
return f'\\fB{man_escape(token.content)}\\fP'
elif token.meta['name'] in [ 'file', 'var' ]:
return f'\\fI{man_escape(token.content)}\\fP'
elif token.meta['name'] == 'manpage':
[page, section] = [ s.strip() for s in token.content.rsplit('(', 1) ]
section = section[:-1]
return f'\\fB{man_escape(page)}\\fP\\fR({man_escape(section)})\\fP'
else:
raise NotImplementedError("md node not supported yet", token)
def attr_span_begin(self, token: Token, tokens: Sequence[Token], i: int) -> str:
# mdoc knows no anchors so we can drop those, but classes must be rejected.
if 'class' in token.attrs:
return super().attr_span_begin(token, tokens, i)
return ""
def attr_span_end(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return ""
def heading_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported in manpages", token)
def heading_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported in manpages", token)
def ordered_list_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
# max item head width for a number, a dot, and one leading space and one trailing space
width = 3 + len(str(cast(int, token.meta['end'])))
self._list_stack.append(
List(width = width,
next_idx = cast(int, token.attrs.get('start', 1)),
compact = bool(token.meta['compact'])))
return self._maybe_parbreak()
def ordered_list_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
self._list_stack.pop()
return ""

@@ -0,0 +1,791 @@
import argparse
import hashlib
import html
import json
import re
import xml.sax.saxutils as xml
from abc import abstractmethod
from collections.abc import Mapping, Sequence
from pathlib import Path
from typing import Any, Callable, cast, ClassVar, Generic, get_args, NamedTuple
from markdown_it.token import Token
from . import md, options
from .html import HTMLRenderer, UnresolvedXrefError
from .manual_structure import check_structure, FragmentType, is_include, make_xml_id, TocEntry, TocEntryType, XrefTarget
from .md import Converter, Renderer
from .redirects import Redirects
from .src_error import SrcError
class BaseConverter(Converter[md.TR], Generic[md.TR]):
# per-converter configuration for ns:arg=value arguments to include blocks, following
# the include type. html converters need something like this to support chunking, or
# another external method like the chunktocs docbook uses (but block options seem like
# a much nicer way of doing this).
INCLUDE_ARGS_NS: ClassVar[str]
INCLUDE_FRAGMENT_ALLOWED_ARGS: ClassVar[set[str]] = set()
INCLUDE_OPTIONS_ALLOWED_ARGS: ClassVar[set[str]] = set()
_base_paths: list[Path]
_current_type: list[TocEntryType]
def convert(self, infile: Path, outfile: Path) -> None:
self._base_paths = [ infile ]
self._current_type = ['book']
try:
tokens = self._parse(infile.read_text())
self._postprocess(infile, outfile, tokens)
converted = self._renderer.render(tokens)
outfile.write_text(converted)
except Exception as e:
raise RuntimeError(f"failed to render manual {infile}") from e
def _postprocess(self, infile: Path, outfile: Path, tokens: Sequence[Token]) -> None:
pass
def _handle_headings(self, tokens: list[Token], *, src: str, on_heading: Callable[[Token,str],None]) -> None:
# Headings in a globally numbered order
# h1 to h6
curr_heading_pos: list[int] = []
for token in tokens:
if token.type == "heading_open":
if token.tag not in ["h1", "h2", "h3", "h4", "h5", "h6"]:
raise SrcError(
src=src,
description=f"Got invalid heading tag {token.tag!r}. Only h1 to h6 headings are allowed.",
token=token,
)
idx = int(token.tag[1:]) - 1
if idx >= len(curr_heading_pos):
# extend the list if necessary
curr_heading_pos.extend([0 for _i in range(idx+1 - len(curr_heading_pos))])
curr_heading_pos = curr_heading_pos[:idx+1]
curr_heading_pos[-1] += 1
ident = ".".join(f"{a}" for a in curr_heading_pos)
on_heading(token,ident)
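# Illustration only: a sketch of the numbering scheme in _handle_headings, reduced to
# a list of heading tags. The function name number_headings is hypothetical; the
# counter handling follows the loop above.

```python
def number_headings(tags: list[str]) -> list[str]:
    pos: list[int] = []
    idents: list[str] = []
    for tag in tags:
        idx = int(tag[1:]) - 1
        if idx >= len(pos):
            # Going deeper: pad with zeros for every level not seen yet.
            pos.extend([0] * (idx + 1 - len(pos)))
        # Going shallower: forget the counters of deeper levels.
        pos = pos[:idx + 1]
        pos[-1] += 1
        idents.append(".".join(str(n) for n in pos))
    return idents


print(number_headings(["h1", "h2", "h2", "h3", "h2", "h1"]))
print(number_headings(["h2"]))
```

# A document that starts below h1 gets a zero for each skipped level, as the second
# call shows.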
def _parse(self, src: str, *, auto_id_prefix: None | str = None) -> list[Token]:
tokens = super()._parse(src)
if auto_id_prefix:
def set_token_ident(token: Token, ident: str) -> None:
if "id" not in token.attrs:
token.attrs["id"] = f"{auto_id_prefix}-{ident}"
self._handle_headings(tokens, src=src, on_heading=set_token_ident)
check_structure(src, self._current_type[-1], tokens)
for token in tokens:
if not is_include(token):
continue
directive = token.info[12:].split()
if not directive:
continue
args = { k: v for k, _sep, v in map(lambda s: s.partition('='), directive[1:]) }
typ = directive[0]
if typ == 'options':
token.type = 'included_options'
self._process_include_args(src, token, args, self.INCLUDE_OPTIONS_ALLOWED_ARGS)
self._parse_options(src, token, args)
else:
fragment_type = typ.removesuffix('s')
if fragment_type not in get_args(FragmentType):
raise SrcError(
src=src,
description=f"unsupported structural include type {typ!r}",
token=token,
)
self._current_type.append(cast(FragmentType, fragment_type))
token.type = 'included_' + typ
self._process_include_args(src, token, args, self.INCLUDE_FRAGMENT_ALLOWED_ARGS)
self._parse_included_blocks(src, token, args)
self._current_type.pop()
return tokens
def _process_include_args(self, src: str, token: Token, args: dict[str, str], allowed: set[str]) -> None:
ns = self.INCLUDE_ARGS_NS + ":"
args = { k[len(ns):]: v for k, v in args.items() if k.startswith(ns) }
if unknown := set(args.keys()) - allowed:
raise SrcError(
src=src,
description=f"unrecognized include argument(s): {unknown}",
token=token,
)
token.meta['include-args'] = args
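# Illustration only: a sketch of how an include directive is split into a type and
# namespaced arguments. split_include_args is a hypothetical helper, the directive
# strings are made-up examples, and ValueError stands in for the SrcError raised above.

```python
def split_include_args(directive: str, ns: str, allowed: set[str]) -> tuple[str, dict[str, str]]:
    parts = directive.split()
    typ = parts[0]
    # key=value pairs; partition keeps any further "=" inside the value.
    args = {k: v for k, _sep, v in (s.partition('=') for s in parts[1:])}
    prefix = ns + ":"
    # Only arguments in this converter's namespace are considered here.
    scoped = {k[len(prefix):]: v for k, v in args.items() if k.startswith(prefix)}
    if unknown := set(scoped) - allowed:
        raise ValueError(f"unrecognized include argument(s): {unknown}")
    return typ, scoped


print(split_include_args(
    "sections html:into-file=//foo.html auto-id-prefix=bar", "html", {"into-file"}))
```

# Arguments outside the namespace, such as auto-id-prefix here, are not rejected by this
# filter; they are simply left out of the namespaced result.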
def _parse_included_blocks(self, src: str, token: Token, block_args: dict[str, str]) -> None:
assert token.map
included = token.meta['included'] = []
for (lnum, line) in enumerate(token.content.splitlines(), token.map[0] + 1):
line = line.strip()
path = self._base_paths[-1].parent / line
if path in self._base_paths:
raise SrcError(
src=src,
description="circular include found",
token=token,
)
try:
self._base_paths.append(path)
with open(path, 'r') as f:
prefix = None
if "auto-id-prefix" in block_args:
# include the source line number of this include entry to prevent duplicate ids within include blocks
prefix = f"{block_args.get('auto-id-prefix')}-{lnum}"
tokens = self._parse(f.read(), auto_id_prefix=prefix)
included.append((tokens, path))
self._base_paths.pop()
except Exception as e:
raise SrcError(
src=src,
description=f"processing included file {path}",
token=lnum,
) from e
def _parse_options(self, src: str, token: Token, block_args: dict[str, str]) -> None:
assert token.map
items = {}
for (lnum, line) in enumerate(token.content.splitlines(), token.map[0] + 1):
if len(args := line.split(":", 1)) != 2:
raise SrcError(
src=src,
description="options directive with no argument",
tokens={
"Directive": lnum,
"Block": token,
},
)
(k, v) = (args[0].strip(), args[1].strip())
if k in items:
raise SrcError(
src=src,
description=f"duplicate options directive {k!r}",
tokens={
"Directive": lnum,
"Block": token,
},
)
items[k] = v
try:
id_prefix = items.pop('id-prefix')
varlist_id = items.pop('list-id')
source = items.pop('source')
except KeyError as e:
raise SrcError(
src=src,
description=f"options directive {e} missing",
tokens={
"Block": token,
},
) from e
if items.keys():
raise SrcError(
src=src,
description=f"unsupported options directives: {set(items.keys())}",
token=token,
)
try:
with open(self._base_paths[-1].parent / source, 'r') as f:
token.meta['id-prefix'] = id_prefix
token.meta['list-id'] = varlist_id
token.meta['source'] = json.load(f)
except Exception as e:
raise SrcError(
src=src,
description="processing options block",
token=token,
) from e
class RendererMixin(Renderer):
_toplevel_tag: str
_revision: str
def __init__(self, toplevel_tag: str, revision: str, *args: Any, **kwargs: Any):
super().__init__(*args, **kwargs)
self._toplevel_tag = toplevel_tag
self._revision = revision
self.rules |= {
'included_sections': lambda *args: self._included_thing("section", *args),
'included_chapters': lambda *args: self._included_thing("chapter", *args),
'included_preface': lambda *args: self._included_thing("preface", *args),
'included_parts': lambda *args: self._included_thing("part", *args),
'included_appendix': lambda *args: self._included_thing("appendix", *args),
'included_options': self.included_options,
}
def render(self, tokens: Sequence[Token]) -> str:
# books get special handling because they have *two* title tags. doing this with
# generic code is more complicated than it's worth. the checks above have verified
# that both titles actually exist.
if self._toplevel_tag == 'book':
return self._render_book(tokens)
return super().render(tokens)
@abstractmethod
def _render_book(self, tokens: Sequence[Token]) -> str:
raise NotImplementedError()
@abstractmethod
def _included_thing(self, tag: str, token: Token, tokens: Sequence[Token], i: int) -> str:
raise NotImplementedError()
@abstractmethod
def included_options(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise NotImplementedError()
class HTMLParameters(NamedTuple):
generator: str
stylesheets: Sequence[str]
scripts: Sequence[str]
# number of levels in the rendered table of contents. tables are prepended to
# the content they apply to (entire document / document chunk / top-level section
# of a chapter), setting a depth of 0 omits the respective table.
toc_depth: int
chunk_toc_depth: int
section_toc_depth: int
media_dir: Path
class ManualHTMLRenderer(RendererMixin, HTMLRenderer):
_base_path: Path
_in_dir: Path
_html_params: HTMLParameters
_redirects: Redirects | None
def __init__(self, toplevel_tag: str, revision: str, html_params: HTMLParameters,
manpage_urls: Mapping[str, str], xref_targets: dict[str, XrefTarget],
redirects: Redirects | None, in_dir: Path, base_path: Path):
super().__init__(toplevel_tag, revision, manpage_urls, xref_targets)
self._in_dir = in_dir
self._base_path = base_path.absolute()
self._html_params = html_params
self._redirects = redirects
def _pull_image(self, src: str) -> str:
src_path = Path(src)
content = (self._in_dir / src_path).read_bytes()
# images may be used more than once, but we want to store them only once and
# in an easily accessible (ie, not input-file-path-dependent) location without
# having to maintain a mapping structure. hashing the file and using the hash
# as the file name of the final image provides both.
content_hash = hashlib.sha3_256(content).hexdigest()
target_name = f"{content_hash}{src_path.suffix}"
target_path = self._base_path / self._html_params.media_dir / target_name
target_path.write_bytes(content)
return f"./{self._html_params.media_dir}/{target_name}"
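# Illustration only: a sketch of the content-addressed naming used by _pull_image.
# content_addressed_name is a hypothetical helper and the file names are made up; only
# the hashing scheme (sha3_256 of the bytes plus the original suffix) follows the
# method above.

```python
import hashlib
import tempfile
from pathlib import Path


def content_addressed_name(src: Path) -> str:
    # The name depends only on the file's bytes and its suffix, not on where it lives.
    digest = hashlib.sha3_256(src.read_bytes()).hexdigest()
    return f"{digest}{src.suffix}"


with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp) / "logo.png"
    b = Path(tmp) / "copy-of-logo.png"
    a.write_bytes(b"not really a png")
    b.write_bytes(b"not really a png")
    name_a = content_addressed_name(a)
    name_b = content_addressed_name(b)

print(name_a == name_b)
```

# Two source files with identical bytes map to the same target name, so writing the
# image a second time just overwrites it with the same content.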
def _push(self, tag: str, hlevel_offset: int) -> Any:
result = (self._toplevel_tag, self._headings, self._attrspans, self._hlevel_offset, self._in_dir)
self._hlevel_offset += hlevel_offset
self._toplevel_tag, self._headings, self._attrspans = tag, [], []
return result
def _pop(self, state: Any) -> None:
(self._toplevel_tag, self._headings, self._attrspans, self._hlevel_offset, self._in_dir) = state
def _render_book(self, tokens: Sequence[Token]) -> str:
assert tokens[4].children
title_id = cast(str, tokens[0].attrs.get('id', ""))
title = self._xref_targets[title_id].title
# subtitles don't have IDs, so we can't use xrefs to get them
subtitle = self.renderInline(tokens[4].children)
toc = TocEntry.of(tokens[0])
return "\n".join([
self._file_header(toc),
' <div class="book">',
' <div class="titlepage">',
' <div>',
f' <div><h1 class="title"><a id="{html.escape(title_id, True)}"></a>{title}</h1></div>',
f' <div><h2 class="subtitle">{subtitle}</h2></div>',
' </div>',
" <hr />",
' </div>',
self._build_toc(tokens, 0),
super(HTMLRenderer, self).render(tokens[6:]),
' </div>',
self._file_footer(toc),
])
def _file_header(self, toc: TocEntry) -> str:
prev_link, up_link, next_link = "", "", ""
prev_a, next_a, parent_title = "", "", "&nbsp;"
nav_html = ""
home = toc.root
if toc.prev:
prev_link = f'<link rel="prev" href="{toc.prev.target.href()}" title="{toc.prev.target.title}" />'
prev_a = f'<a accesskey="p" href="{toc.prev.target.href()}">Prev</a>'
if toc.parent:
up_link = (
f'<link rel="up" href="{toc.parent.target.href()}" '
f'title="{toc.parent.target.title}" />'
)
if (part := toc.parent) and part.kind != 'book':
assert part.target.title
parent_title = part.target.title
if toc.next:
next_link = f'<link rel="next" href="{toc.next.target.href()}" title="{toc.next.target.title}" />'
next_a = f'<a accesskey="n" href="{toc.next.target.href()}">Next</a>'
if toc.prev or toc.parent or toc.next:
nav_html = "\n".join([
' <div class="navheader">',
' <table width="100%" summary="Navigation header">',
' <tr>',
f' <th colspan="3" align="center">{toc.target.title}</th>',
' </tr>',
' <tr>',
f' <td width="20%" align="left">{prev_a}&nbsp;</td>',
f' <th width="60%" align="center">{parent_title}</th>',
f' <td width="20%" align="right">&nbsp;{next_a}</td>',
' </tr>',
' </table>',
' <hr />',
' </div>',
])
# copy the list so that per-file redirect scripts do not accumulate on the shared parameters
scripts = list(self._html_params.scripts)
if self._redirects:
redirects_name = f"{toc.target.path.split('.html')[0]}-redirects.js"
with open(self._base_path / redirects_name, 'w') as file:
file.write(self._redirects.get_redirect_script(toc.target.path))
scripts.append(f'./{redirects_name}')
return "\n".join([
'<?xml version="1.0" encoding="utf-8" standalone="no"?>',
'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"',
' "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">',
'<html xmlns="http://www.w3.org/1999/xhtml">',
' <head>',
' <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />',
f' <title>{toc.target.title}</title>',
"".join((f'<link rel="stylesheet" type="text/css" href="{html.escape(style, True)}" />'
for style in self._html_params.stylesheets)),
"".join((f'<script src="{html.escape(script, True)}" type="text/javascript"></script>'
for script in scripts)),
f' <meta name="generator" content="{html.escape(self._html_params.generator, True)}" />',
f' <link rel="home" href="{home.target.href()}" title="{home.target.title}" />' if home.target.href() else "",
f' {up_link}{prev_link}{next_link}',
' </head>',
' <body>',
nav_html,
])
def _file_footer(self, toc: TocEntry) -> str:
# prev, next = self._get_prev_and_next()
prev_a, up_a, home_a, next_a = "", "&nbsp;", "&nbsp;", ""
prev_text, up_text, next_text = "", "", ""
nav_html = ""
home = toc.root
if toc.prev:
prev_a = f'<a accesskey="p" href="{toc.prev.target.href()}">Prev</a>'
assert toc.prev.target.title
prev_text = toc.prev.target.title
if toc.parent:
home_a = f'<a accesskey="h" href="{home.target.href()}">Home</a>'
if toc.parent != home:
up_a = f'<a accesskey="u" href="{toc.parent.target.href()}">Up</a>'
if toc.next:
next_a = f'<a accesskey="n" href="{toc.next.target.href()}">Next</a>'
assert toc.next.target.title
next_text = toc.next.target.title
if toc.prev or toc.parent or toc.next:
nav_html = "\n".join([
' <div class="navfooter">',
' <hr />',
' <table width="100%" summary="Navigation footer">',
' <tr>',
f' <td width="40%" align="left">{prev_a}&nbsp;</td>',
f' <td width="20%" align="center">{up_a}</td>',
f' <td width="40%" align="right">&nbsp;{next_a}</td>',
' </tr>',
' <tr>',
f' <td width="40%" align="left" valign="top">{prev_text}&nbsp;</td>',
f' <td width="20%" align="center">{home_a}</td>',
f' <td width="40%" align="right" valign="top">&nbsp;{next_text}</td>',
' </tr>',
' </table>',
' </div>',
])
return "\n".join([
nav_html,
' </body>',
'</html>',
])
def _heading_tag(self, token: Token, tokens: Sequence[Token], i: int) -> str:
if token.tag == 'h1':
return self._toplevel_tag
return super()._heading_tag(token, tokens, i)
def _build_toc(self, tokens: Sequence[Token], i: int) -> str:
toc = TocEntry.of(tokens[i])
if toc.kind == 'section' and self._html_params.section_toc_depth < 1:
return ""
def walk_and_emit(toc: TocEntry, depth: int) -> list[str]:
if depth <= 0:
return []
result = []
for child in toc.children:
result.append(
f'<dt>'
f' <span class="{html.escape(child.kind, True)}">'
f' <a href="{child.target.href()}">{child.target.toc_html}</a>'
f' </span>'
f'</dt>'
)
# we want to look straight through parts because docbook-xsl did too, but it
# also makes for more useful top-level tocs.
next_level = walk_and_emit(child, depth - (0 if child.kind == 'part' else 1))
if next_level:
result.append(f'<dd><dl>{"".join(next_level)}</dl></dd>')
return result
def build_list(kind: str, id: str, lst: Sequence[TocEntry]) -> str:
if not lst:
return ""
entries = [
f'<dt>{i}. <a href="{e.target.href()}">{e.target.toc_html}</a></dt>'
for i, e in enumerate(lst, start=1)
]
return (
f'<div class="{id}">'
f'<p><strong>List of {kind}</strong></p>'
f'<dl>{"".join(entries)}</dl>'
'</div>'
)
# we don't want to generate the "Table of Contents" header for sections,
# docbook didn't and it's only distracting clutter unless it's the main table.
# we also want to generate tocs only for a top-level section (ie, one that is
# not itself contained in another section)
print_title = toc.kind != 'section'
if toc.kind == 'section':
if toc.parent and toc.parent.kind == 'section':
toc_depth = 0
else:
toc_depth = self._html_params.section_toc_depth
elif toc.starts_new_chunk and toc.kind != 'book':
toc_depth = self._html_params.chunk_toc_depth
else:
toc_depth = self._html_params.toc_depth
if not (items := walk_and_emit(toc, toc_depth)):
return ""
figures = build_list("Figures", "list-of-figures", toc.figures)
examples = build_list("Examples", "list-of-examples", toc.examples)
return "".join([
f'<div class="toc">',
' <p><strong>Table of Contents</strong></p>' if print_title else "",
f' <dl class="toc">'
f' {"".join(items)}'
f' </dl>'
f'</div>'
f'{figures}'
f'{examples}'
])
def _make_hN(self, level: int) -> tuple[str, str]:
# for some reason chapters didn't increase the hN nesting count in docbook xslts.
# originally this was duplicated here for consistency with docbook rendering, but
# it could be reevaluated and changed now that docbook is gone.
if self._toplevel_tag == 'chapter':
level -= 1
# this style setting is also for docbook compatibility only and could well go away.
style = ""
if level + self._hlevel_offset < 3 \
and (self._toplevel_tag == 'section' or (self._toplevel_tag == 'chapter' and level > 0)):
style = "clear: both"
tag, hstyle = super()._make_hN(max(1, level))
return tag, style
def _included_thing(self, tag: str, token: Token, tokens: Sequence[Token], i: int) -> str:
outer, inner = [], []
# since books have no non-include content the toplevel book wrapper will not count
# towards nesting depth. other types will have at least a title+id heading which
# *does* count towards the nesting depth. chapters give a -1 to included sections
# mirroring the special handling in _make_hN. sigh.
hoffset = (
0 if not self._headings
else self._headings[-1].level - 1 if self._toplevel_tag == 'chapter'
else self._headings[-1].level
)
outer.append(self._maybe_close_partintro())
into = token.meta['include-args'].get('into-file')
fragments = token.meta['included']
state = self._push(tag, hoffset)
if into:
toc = TocEntry.of(fragments[0][0][0])
inner.append(self._file_header(toc))
# we do not set _hlevel_offset=0 because docbook didn't either.
else:
inner = outer
in_dir = self._in_dir
for included, path in fragments:
try:
self._in_dir = (in_dir / path).parent
inner.append(self.render(included))
except Exception as e:
raise RuntimeError(f"rendering {path}") from e
if into:
inner.append(self._file_footer(toc))
(self._base_path / into).write_text("".join(inner))
self._pop(state)
return "".join(outer)
def included_options(self, token: Token, tokens: Sequence[Token], i: int) -> str:
conv = options.HTMLConverter(self._manpage_urls, self._revision,
token.meta['list-id'], token.meta['id-prefix'],
self._xref_targets)
conv.add_options(token.meta['source'])
return conv.finalize()
def _to_base26(n: int) -> str:
# bijective base 26: 0 -> A, 25 -> Z, 26 -> AA, 27 -> AB
return (_to_base26(n // 26 - 1) if n >= 26 else "") + chr(ord("A") + n % 26)
class HTMLConverter(BaseConverter[ManualHTMLRenderer]):
INCLUDE_ARGS_NS = "html"
INCLUDE_FRAGMENT_ALLOWED_ARGS = { 'into-file' }
_revision: str
_html_params: HTMLParameters
_manpage_urls: Mapping[str, str]
_redirects: Redirects | None
_xref_targets: dict[str, XrefTarget]
_redirection_targets: set[str]
_appendix_count: int = 0
def _next_appendix_id(self) -> str:
self._appendix_count += 1
return _to_base26(self._appendix_count - 1)
def __init__(self, revision: str, html_params: HTMLParameters, manpage_urls: Mapping[str, str], redirects: Redirects | None = None):
super().__init__()
self._revision, self._html_params, self._manpage_urls, self._redirects = revision, html_params, manpage_urls, redirects
self._xref_targets = {}
self._redirection_targets = set()
# renderer not set on purpose since it has a dependency on the output path!
def convert(self, infile: Path, outfile: Path) -> None:
self._renderer = ManualHTMLRenderer(
'book', self._revision, self._html_params, self._manpage_urls, self._xref_targets,
self._redirects, infile.parent, outfile.parent)
super().convert(infile, outfile)
def _parse(self, src: str, *, auto_id_prefix: None | str = None) -> list[Token]:
tokens = super()._parse(src,auto_id_prefix=auto_id_prefix)
for token in tokens:
if not token.type.startswith('included_') \
or not (into := token.meta['include-args'].get('into-file')):
continue
assert token.map
if len(token.meta['included']) == 0:
raise SrcError(
src=src,
description=f"redirection target {into!r} is empty!",
token=token,
)
# we use blender-style //path to denote paths relative to the origin file
# (usually index.html). this makes everything a lot easier and clearer.
if not into.startswith("//") or '/' in into[2:]:
raise SrcError(
src=src,
description=f"html:into-file must be a relative-to-origin //filename: {into}",
token=token,
)
into = token.meta['include-args']['into-file'] = into[2:]
if into in self._redirection_targets:
raise SrcError(
src=src,
description=f"redirection target {into} is already in use",
token=token,
)
self._redirection_targets.add(into)
return tokens
def _number_block(self, block: str, prefix: str, tokens: Sequence[Token], start: int = 1) -> int:
title_open, title_close = f'{block}_title_open', f'{block}_title_close'
for (i, token) in enumerate(tokens):
if token.type == title_open:
title = tokens[i + 1]
assert title.type == 'inline' and title.children
# the prefix is split into two tokens because the xref title_html will want
# only the first of the two, but both must be rendered into the example itself.
title.children = (
[
Token('text', '', 0, content=f'{prefix} {start}'),
Token('text', '', 0, content='. ')
] + title.children
)
start += 1
elif token.type.startswith('included_') and token.type != 'included_options':
for sub, _path in token.meta['included']:
start = self._number_block(block, prefix, sub, start)
return start
# xref | (id, type, heading inlines, file, starts new file)
def _collect_ids(self, tokens: Sequence[Token], target_file: str, typ: str, file_changed: bool
) -> list[XrefTarget | tuple[str, str, Token, str, bool]]:
result: list[XrefTarget | tuple[str, str, Token, str, bool]] = []
# collect all IDs and their xref substitutions. headings are deferred until everything
# has been parsed so we can resolve links in headings. if that's even used anywhere.
for (i, bt) in enumerate(tokens):
if bt.type == 'heading_open' and (id := cast(str, bt.attrs.get('id', ''))):
result.append((id, typ if bt.tag == 'h1' else 'section', tokens[i + 1], target_file,
i == 0 and file_changed))
elif bt.type == 'included_options':
id_prefix = bt.meta['id-prefix']
for opt in bt.meta['source'].keys():
id = make_xml_id(f"{id_prefix}{opt}")
name = html.escape(opt)
result.append(XrefTarget(id, f'<code class="option">{name}</code>', name, None, target_file))
elif bt.type.startswith('included_'):
sub_file = bt.meta['include-args'].get('into-file', target_file)
subtyp = bt.type.removeprefix('included_').removesuffix('s')
for si, (sub, _path) in enumerate(bt.meta['included']):
result += self._collect_ids(sub, sub_file, subtyp, si == 0 and sub_file != target_file)
elif bt.type == 'example_open' and (id := cast(str, bt.attrs.get('id', ''))):
result.append((id, 'example', tokens[i + 2], target_file, False))
elif bt.type == 'figure_open' and (id := cast(str, bt.attrs.get('id', ''))):
result.append((id, 'figure', tokens[i + 2], target_file, False))
elif bt.type == 'footnote_open' and (id := cast(str, bt.attrs.get('id', ''))):
result.append(XrefTarget(id, "???", None, None, target_file))
elif bt.type == 'footnote_ref' and (id := cast(str, bt.attrs.get('id', ''))):
result.append(XrefTarget(id, "???", None, None, target_file))
elif bt.type == 'inline':
assert bt.children is not None
result += self._collect_ids(bt.children, target_file, typ, False)
elif id := cast(str, bt.attrs.get('id', '')):
# anchors and examples have no titles we could use, but we'll have to put
# *something* here to communicate that there's no title.
result.append(XrefTarget(id, "???", None, None, target_file))
return result
def _render_xref(self, id: str, typ: str, inlines: Token, path: str, drop_fragment: bool) -> XrefTarget:
assert inlines.children
title_html = self._renderer.renderInline(inlines.children)
if typ == 'appendix':
# NOTE the docbook compat is strong here
n = self._next_appendix_id()
prefix = f"Appendix\u00A0{n}.\u00A0"
# HACK for docbook compat: prefix the title inlines with appendix id if
# necessary. the alternative is to mess with titlepage rendering in headings,
# which seems just a lot worse than this
prefix_tokens = [Token(type='text', tag='', nesting=0, content=prefix)]
inlines.children = prefix_tokens + list(inlines.children)
title = prefix + title_html
toc_html = f"{n}. {title_html}"
title_html = f"Appendix&nbsp;{n}"
elif typ in ['example', 'figure']:
# skip the prepended `{Example,Figure} N. ` from numbering
toc_html, title = self._renderer.renderInline(inlines.children[2:]), title_html
# xref title wants only the prepended text, sans the trailing period and space
title_html = self._renderer.renderInline(inlines.children[0:1])
else:
toc_html, title = title_html, title_html
title_html = (
f"<em>{title_html}</em>"
if typ == 'chapter'
else title_html if typ in [ 'book', 'part' ]
else f'the section called “{title_html}”'
)
return XrefTarget(id, title_html, toc_html, re.sub('<.*?>', '', title), path, drop_fragment)
def _postprocess(self, infile: Path, outfile: Path, tokens: Sequence[Token]) -> None:
self._number_block('example', "Example", tokens)
self._number_block('figure', "Figure", tokens)
xref_queue = self._collect_ids(tokens, outfile.name, 'book', True)
failed = False
while xref_queue:
# start every pass with a fresh list so already resolved items are not retried
deferred = []
for item in xref_queue:
try:
target = item if isinstance(item, XrefTarget) else self._render_xref(*item)
except UnresolvedXrefError:
if failed:
raise
deferred.append(item)
continue
if target.id in self._xref_targets:
raise RuntimeError(f"found duplicate id #{target.id}")
self._xref_targets[target.id] = target
if len(deferred) == len(xref_queue):
failed = True # do another round and report the first error
xref_queue = deferred
paths_seen = set()
for t in self._xref_targets.values():
paths_seen.add(t.path)
if len(paths_seen) == 1:
for (k, t) in self._xref_targets.items():
self._xref_targets[k] = XrefTarget(
t.id,
t.title_html,
t.toc_html,
t.title,
t.path,
t.drop_fragment,
drop_target=True
)
TocEntry.collect_and_link(self._xref_targets, tokens)
if self._redirects:
self._redirects.validate(self._xref_targets)
server_redirects = self._redirects.get_server_redirects()
with open(outfile.parent / '_redirects', 'w') as server_redirects_file:
formatted_server_redirects = []
for from_path, to_path in server_redirects.items():
formatted_server_redirects.append(f"{from_path} {to_path} 301")
server_redirects_file.write("\n".join(formatted_server_redirects))
def _build_cli_html(p: argparse.ArgumentParser) -> None:
p.add_argument('--manpage-urls', required=True)
p.add_argument('--revision', required=True)
p.add_argument('--generator', default='nixos-render-docs')
p.add_argument('--stylesheet', default=[], action='append')
p.add_argument('--script', default=[], action='append')
p.add_argument('--toc-depth', default=1, type=int)
p.add_argument('--chunk-toc-depth', default=1, type=int)
p.add_argument('--section-toc-depth', default=0, type=int)
p.add_argument('--media-dir', default="media", type=Path)
p.add_argument('--redirects', type=Path)
p.add_argument('infile', type=Path)
p.add_argument('outfile', type=Path)
def _run_cli_html(args: argparse.Namespace) -> None:
with open(args.manpage_urls) as manpage_urls, open(Path(__file__).parent / "redirects.js") as redirects_script:
redirects = None
if args.redirects:
with open(args.redirects) as raw_redirects:
redirects = Redirects(json.load(raw_redirects), redirects_script.read())
md = HTMLConverter(
args.revision,
HTMLParameters(args.generator, args.stylesheet, args.script, args.toc_depth,
args.chunk_toc_depth, args.section_toc_depth, args.media_dir),
json.load(manpage_urls), redirects)
md.convert(args.infile, args.outfile)
def build_cli(p: argparse.ArgumentParser) -> None:
formats = p.add_subparsers(dest='format', required=True)
_build_cli_html(formats.add_parser('html'))
def run_cli(args: argparse.Namespace) -> None:
if args.format == 'html':
_run_cli_html(args)
else:
raise RuntimeError('format not hooked up', args)
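
The deferred cross-reference resolution in `_postprocess` can be illustrated in isolation. The following is a minimal sketch, not the renderer's actual code: `UnresolvedError`, `resolve_all` and `resolve_with_dependency` are hypothetical names introduced only for this example, and items are modelled as plain `(name, dependency)` pairs rather than tokens. It assumes the intended behaviour is to retry failed items in later passes and to report the first error once a full pass makes no progress.

```python
class UnresolvedError(Exception):
    """Hypothetical stand-in for the renderer's UnresolvedXrefError."""


def resolve_with_dependency(item, resolved):
    # Demo resolver: an item can be resolved once its dependency is known.
    name, dep = item
    if dep is not None and dep not in resolved:
        raise UnresolvedError(name)
    return name, f"{name} (after {dep})"


def resolve_all(items, resolve):
    resolved = {}
    queue = list(items)
    failed = False
    while queue:
        deferred = []  # fresh list for every pass
        for item in queue:
            try:
                key, value = resolve(item, resolved)
            except UnresolvedError:
                if failed:
                    raise  # the previous pass made no progress: report the error
                deferred.append(item)
                continue
            if key in resolved:
                raise RuntimeError(f"found duplicate id #{key}")
            resolved[key] = value
        if len(deferred) == len(queue):
            failed = True  # do one more pass and let the first error propagate
        queue = deferred
    return resolved


order = list(resolve_all([("a", "b"), ("b", "c"), ("c", None)], resolve_with_dependency))
```

Items are retried until every dependency is available, so entries may be resolved in a different order than they were queued.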

@@ -0,0 +1,239 @@
from __future__ import annotations
import dataclasses as dc
import html
import itertools
from typing import cast, get_args, Iterable, Literal, Sequence
from markdown_it.token import Token
from .utils import Freezeable
from .src_error import SrcError
# FragmentType is used to restrict structural include blocks.
FragmentType = Literal['preface', 'part', 'chapter', 'section', 'appendix']
# in the TOC all fragments are allowed, plus the all-encompassing book.
TocEntryType = Literal['book', 'preface', 'part', 'chapter', 'section', 'appendix', 'example', 'figure']
def is_include(token: Token) -> bool:
return token.type == "fence" and token.info.startswith("{=include=} ")
# toplevel file must contain only the title headings and includes, anything else
# would cause strange rendering.
def _check_book_structure(src: str, tokens: Sequence[Token]) -> None:
for token in tokens[6:]:
if not is_include(token):
raise SrcError(
src=src,
description=f"unexpected content; expected structural include",
token=token,
)
# much like books, parts may not contain headings other than their title heading.
# this is a limitation of the current renderers and TOC generators that do not handle
# this case well even though it is supported in docbook (and probably supportable
# anywhere else).
def _check_part_structure(src: str,tokens: Sequence[Token]) -> None:
_check_fragment_structure(src, tokens)
for token in tokens[3:]:
if token.type == 'heading_open':
raise SrcError(
src=src,
description="unexpected heading",
token=token,
)
# two include blocks must either be adjacent or separated by a heading, otherwise
# we cannot generate a correct TOC (since there'd be nothing to link to between
# the two includes).
def _check_fragment_structure(src: str, tokens: Sequence[Token]) -> None:
for i, token in enumerate(tokens):
if is_include(token) \
and i + 1 < len(tokens) \
and not (is_include(tokens[i + 1]) or tokens[i + 1].type == 'heading_open'):
assert token.map
raise SrcError(
src=src,
description="unexpected content; expected heading or structural include",
token=token,
)
def check_structure(src: str, kind: TocEntryType, tokens: Sequence[Token]) -> None:
wanted = { 'h1': 'title' }
wanted |= { 'h2': 'subtitle' } if kind == 'book' else {}
for (i, (tag, role)) in enumerate(wanted.items()):
if len(tokens) < 3 * (i + 1):
raise RuntimeError(f"missing {role} ({tag}) heading")
token = tokens[3 * i]
if token.type != 'heading_open' or token.tag != tag:
raise SrcError(
src=src,
description=f"expected {role} ({tag}) heading",
token=token,
)
for t in tokens[3 * len(wanted):]:
if t.type != 'heading_open' or not (role := wanted.get(t.tag, '')):
continue
raise SrcError(
src=src,
description=f"only one {role} heading ({t.markup} [text...]) allowed per "
f"{kind}, but found a second. "
"Please remove all such headings except the first or demote the subsequent headings.",
token=t,
)
last_heading_level = 0
for token in tokens:
if token.type != 'heading_open':
continue
# book subtitle headings do not need an id, only book title headings do.
# every other heading needs one too. we need this to build a TOC and to
# provide stable links if the manual changes shape.
if 'id' not in token.attrs and (kind != 'book' or token.tag != 'h2'):
raise SrcError(
src=src,
description=f"heading does not have an id",
token=token,
)
level = int(token.tag[1:]) # because tag = h1..h6
if level > last_heading_level + 1:
raise SrcError(
src=src,
description=f"heading skips one or more heading levels, "
"which is currently not allowed",
token=token,
)
last_heading_level = level
if kind == 'book':
_check_book_structure(src, tokens)
elif kind == 'part':
_check_part_structure(src, tokens)
else:
_check_fragment_structure(src, tokens)
@dc.dataclass(frozen=True)
class XrefTarget:
id: str
"""link label for `[](#local-references)`"""
title_html: str
"""toc label"""
toc_html: str | None
"""text for `<title>` tags and `title="..."` attributes"""
title: str | None
"""path to file that contains the anchor"""
path: str
"""whether to drop the `#anchor` from links when expanding xrefs"""
drop_fragment: bool = False
"""whether to drop the `path.html` from links when expanding xrefs.
mostly useful for docbook compatibility"""
drop_target: bool = False
def href(self) -> str:
path = "" if self.drop_target else html.escape(self.path, True)
return path if self.drop_fragment else f"{path}#{html.escape(self.id, True)}"
@dc.dataclass
class TocEntry(Freezeable):
kind: TocEntryType
target: XrefTarget
parent: TocEntry | None = None
prev: TocEntry | None = None
next: TocEntry | None = None
children: list[TocEntry] = dc.field(default_factory=list)
starts_new_chunk: bool = False
examples: list[TocEntry] = dc.field(default_factory=list)
figures: list[TocEntry] = dc.field(default_factory=list)
@property
def root(self) -> TocEntry:
return self.parent.root if self.parent else self
@classmethod
def of(cls, token: Token) -> TocEntry:
entry = token.meta.get('TocEntry')
if not isinstance(entry, TocEntry):
raise RuntimeError('requested toc entry, none found', token)
return entry
@classmethod
def collect_and_link(cls, xrefs: dict[str, XrefTarget], tokens: Sequence[Token]) -> TocEntry:
entries, examples, figures = cls._collect_entries(xrefs, tokens, 'book')
def flatten_with_parent(this: TocEntry, parent: TocEntry | None) -> Iterable[TocEntry]:
this.parent = parent
return itertools.chain([this], *[ flatten_with_parent(c, this) for c in this.children ])
flat = list(flatten_with_parent(entries, None))
prev = flat[0]
prev.starts_new_chunk = True
paths_seen = set([prev.target.path])
for c in flat[1:]:
if prev.target.path != c.target.path and c.target.path not in paths_seen:
c.starts_new_chunk = True
c.prev, prev.next = prev, c
prev = c
paths_seen.add(c.target.path)
flat[0].examples = examples
flat[0].figures = figures
for c in flat:
c.freeze()
return entries
@classmethod
def _collect_entries(cls, xrefs: dict[str, XrefTarget], tokens: Sequence[Token],
kind: TocEntryType) -> tuple[TocEntry, list[TocEntry], list[TocEntry]]:
# we assume that check_structure has been run recursively over the entire input.
# list contains (tag, entry) pairs that will collapse to a single entry for
# the full sequence.
entries: list[tuple[str, TocEntry]] = []
examples: list[TocEntry] = []
figures: list[TocEntry] = []
for token in tokens:
if token.type.startswith('included_') and (included := token.meta.get('included')):
fragment_type_str = token.type[9:].removesuffix('s')
assert fragment_type_str in get_args(TocEntryType)
fragment_type = cast(TocEntryType, fragment_type_str)
for fragment, _path in included:
subentries, subexamples, subfigures = cls._collect_entries(xrefs, fragment, fragment_type)
entries[-1][1].children.append(subentries)
examples += subexamples
figures += subfigures
elif token.type == 'heading_open' and (id := cast(str, token.attrs.get('id', ''))):
while len(entries) > 1 and entries[-1][0] >= token.tag:
entries[-2][1].children.append(entries.pop()[1])
entries.append((token.tag,
TocEntry(kind if token.tag == 'h1' else 'section', xrefs[id])))
token.meta['TocEntry'] = entries[-1][1]
elif token.type == 'example_open' and (id := cast(str, token.attrs.get('id', ''))):
examples.append(TocEntry('example', xrefs[id]))
elif token.type == 'figure_open' and (id := cast(str, token.attrs.get('id', ''))):
figures.append(TocEntry('figure', xrefs[id]))
while len(entries) > 1:
entries[-2][1].children.append(entries.pop()[1])
return (entries[0][1], examples, figures)
_xml_id_translate_table = {
ord('*'): ord('_'),
ord('<'): ord('_'),
ord(' '): ord('_'),
ord('>'): ord('_'),
ord('['): ord('_'),
ord(']'): ord('_'),
ord(':'): ord('_'),
ord('"'): ord('_'),
}
# this function is needed to generate option id attributes in the same format as
# the docbook toolchain did to not break existing links. we don't actually use
# xml any more, that's just the legacy we're dealing with and part of our structure
# now.
def make_xml_id(s: str) -> str:
return s.translate(_xml_id_translate_table)
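
The stack handling in `TocEntry._collect_entries` may be easier to follow on plain data. This is a simplified sketch under stated assumptions: `Node` and `nest_headings` are hypothetical names for this example, headings are given as `(tag, title)` pairs instead of tokens, includes, examples and figures are left out, and the first heading is assumed to be the `h1` title, which `check_structure` is expected to have verified.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    title: str
    children: list["Node"] = field(default_factory=list)


def nest_headings(headings):
    # Stack of (tag, node) pairs; tags compare as strings, "h1" < "h2" < ... < "h6".
    stack = []
    for tag, title in headings:
        # Close every open heading that is at the same level or deeper.
        while len(stack) > 1 and stack[-1][0] >= tag:
            stack[-2][1].children.append(stack.pop()[1])
        stack.append((tag, Node(title)))
    # Collapse whatever is still open into the root entry.
    while len(stack) > 1:
        stack[-2][1].children.append(stack.pop()[1])
    return stack[0][1]


root = nest_headings([("h1", "Book"), ("h2", "A"), ("h3", "A.1"), ("h2", "B")])
```

Here `root` is the "Book" node with children "A" and "B", and "A.1" ends up nested under "A".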

@@ -0,0 +1,636 @@
from abc import ABC
from collections.abc import Mapping, MutableMapping, Sequence
from typing import Any, Callable, cast, Generic, get_args, Iterable, Literal, NoReturn, Optional, TypeVar
import dataclasses
import re
from .types import RenderFn
from .src_error import SrcError
import markdown_it
from markdown_it.token import Token
from markdown_it.utils import OptionsDict
from mdit_py_plugins.container import container_plugin # type: ignore[attr-defined]
from mdit_py_plugins.deflist import deflist_plugin # type: ignore[attr-defined]
from mdit_py_plugins.footnote import footnote_plugin # type: ignore[attr-defined]
from mdit_py_plugins.myst_role import myst_role_plugin # type: ignore[attr-defined]
_md_escape_table = {
ord('*'): '\\*',
ord('<'): '\\<',
ord('['): '\\[',
ord('`'): '\\`',
ord('.'): '\\.',
ord('#'): '\\#',
ord('&'): '\\&',
ord('\\'): '\\\\',
}
def md_escape(s: str) -> str:
return s.translate(_md_escape_table)
def md_make_code(code: str, info: str = "", multiline: Optional[bool] = None) -> str:
# for multi-line code blocks we only have to count ` runs at the beginning
# of a line, but this is much easier.
multiline = multiline or info != "" or '\n' in code
longest, current = (0, 0)
for c in code:
current = current + 1 if c == '`' else 0
longest = max(current, longest)
# inline literals need a space to separate ticks from content, code blocks
# need newlines. inline literals need one extra tick, code blocks need three.
ticks, sep = ('`' * (longest + (3 if multiline else 1)), '\n' if multiline else ' ')
return f"{ticks}{info}{sep}{code}{sep}{ticks}"
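
The tick-counting rule used by `md_make_code` can be tried on its own. The sketch below is illustrative only: `longest_backtick_run` and `inline_literal` are hypothetical helpers that cover just the inline case, where the delimiter is one backtick longer than the longest run inside the content and a space separates delimiter and content.

```python
def longest_backtick_run(code: str) -> int:
    # Length of the longest run of consecutive backticks in the input.
    longest = current = 0
    for c in code:
        current = current + 1 if c == '`' else 0
        longest = max(longest, current)
    return longest


def inline_literal(code: str) -> str:
    # One more tick than the longest run, so the content cannot close the span early.
    ticks = '`' * (longest_backtick_run(code) + 1)
    return f"{ticks} {code} {ticks}"


plain = inline_literal("plain")
nested = inline_literal("a`b")
```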
AttrBlockKind = Literal['admonition', 'example', 'figure']
AdmonitionKind = Literal["note", "caution", "tip", "important", "warning"]
class Renderer:
_admonitions: dict[AdmonitionKind, tuple[RenderFn, RenderFn]]
_admonition_stack: list[AdmonitionKind]
def __init__(self, manpage_urls: Mapping[str, str]):
self._manpage_urls = manpage_urls
self.rules = {
'text': self.text,
'paragraph_open': self.paragraph_open,
'paragraph_close': self.paragraph_close,
'hardbreak': self.hardbreak,
'softbreak': self.softbreak,
'code_inline': self.code_inline,
'code_block': self.code_block,
'link_open': self.link_open,
'link_close': self.link_close,
'list_item_open': self.list_item_open,
'list_item_close': self.list_item_close,
'bullet_list_open': self.bullet_list_open,
'bullet_list_close': self.bullet_list_close,
'em_open': self.em_open,
'em_close': self.em_close,
'strong_open': self.strong_open,
'strong_close': self.strong_close,
'fence': self.fence,
'blockquote_open': self.blockquote_open,
'blockquote_close': self.blockquote_close,
'dl_open': self.dl_open,
'dl_close': self.dl_close,
'dt_open': self.dt_open,
'dt_close': self.dt_close,
'dd_open': self.dd_open,
'dd_close': self.dd_close,
'myst_role': self.myst_role,
"admonition_open": self.admonition_open,
"admonition_close": self.admonition_close,
"attr_span_begin": self.attr_span_begin,
"attr_span_end": self.attr_span_end,
"heading_open": self.heading_open,
"heading_close": self.heading_close,
"ordered_list_open": self.ordered_list_open,
"ordered_list_close": self.ordered_list_close,
"example_open": self.example_open,
"example_close": self.example_close,
"example_title_open": self.example_title_open,
"example_title_close": self.example_title_close,
"image": self.image,
"figure_open": self.figure_open,
"figure_close": self.figure_close,
"figure_title_open": self.figure_title_open,
"figure_title_close": self.figure_title_close,
"table_open": self.table_open,
"table_close": self.table_close,
"thead_open": self.thead_open,
"thead_close": self.thead_close,
"tr_open": self.tr_open,
"tr_close": self.tr_close,
"th_open": self.th_open,
"th_close": self.th_close,
"tbody_open": self.tbody_open,
"tbody_close": self.tbody_close,
"td_open": self.td_open,
"td_close": self.td_close,
"footnote_ref": self.footnote_ref,
"footnote_block_open": self.footnote_block_open,
"footnote_block_close": self.footnote_block_close,
"footnote_open": self.footnote_open,
"footnote_close": self.footnote_close,
"footnote_anchor": self.footnote_anchor,
}
self._admonitions = {
"note": (self.note_open, self.note_close),
"caution": (self.caution_open,self.caution_close),
"tip": (self.tip_open, self.tip_close),
"important": (self.important_open, self.important_close),
"warning": (self.warning_open, self.warning_close),
}
self._admonition_stack = []
def _join_block(self, ls: Iterable[str]) -> str:
return "".join(ls)
def _join_inline(self, ls: Iterable[str]) -> str:
return "".join(ls)
def admonition_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
tag = token.meta['kind']
self._admonition_stack.append(tag)
return self._admonitions[tag][0](token, tokens, i)
def admonition_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
return self._admonitions[self._admonition_stack.pop()][1](token, tokens, i)
def render(self, tokens: Sequence[Token]) -> str:
def do_one(i: int, token: Token) -> str:
if token.type == "inline":
assert token.children is not None
return self.renderInline(token.children)
elif token.type in self.rules:
return self.rules[token.type](tokens[i], tokens, i)
else:
raise NotImplementedError("md token not supported yet", token)
return self._join_block(map(lambda arg: do_one(*arg), enumerate(tokens)))
def renderInline(self, tokens: Sequence[Token]) -> str:
def do_one(i: int, token: Token) -> str:
if token.type in self.rules:
return self.rules[token.type](tokens[i], tokens, i)
else:
raise NotImplementedError("md token not supported yet", token)
return self._join_inline(map(lambda arg: do_one(*arg), enumerate(tokens)))
def text(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def paragraph_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def paragraph_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def hardbreak(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def softbreak(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def code_inline(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def code_block(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def link_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def link_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def list_item_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def list_item_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def bullet_list_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def bullet_list_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def em_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def em_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def strong_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def strong_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def fence(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def blockquote_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def blockquote_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def note_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def note_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def caution_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def caution_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def important_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def important_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def tip_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def tip_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def warning_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def warning_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def dl_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def dl_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def dt_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def dt_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def dd_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def dd_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def myst_role(self, token: Token, tokens: Sequence[Token], i: int) -> str:
# NixOS-specific roles are documented at <nixpkgs>/doc/README.md (with reverse reference)
raise RuntimeError("md token not supported", token)
def attr_span_begin(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def attr_span_end(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def heading_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def heading_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def ordered_list_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def ordered_list_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def example_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def example_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def example_title_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def example_title_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def image(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def figure_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def figure_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def figure_title_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def figure_title_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def table_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def table_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def thead_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def thead_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def tr_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def tr_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def th_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def th_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def tbody_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def tbody_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def td_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def td_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def footnote_ref(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def footnote_block_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def footnote_block_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def footnote_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def footnote_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def footnote_anchor(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported", token)
def _is_escaped(src: str, pos: int) -> bool:
found = 0
while pos >= 0 and src[pos] == '\\':
found += 1
pos -= 1
return found % 2 == 1
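# Illustration only: a minimal, stand-alone sketch of the backslash-parity
# check above. The name `is_escaped` and the sample string are assumptions
# made for this example; callers in this file pass the index just before the
# character being examined.

```python
def is_escaped(src: str, pos: int) -> bool:
    """Return True if an odd number of backslashes ends at index `pos`."""
    found = 0
    # walk backwards over the run of backslashes ending at `pos`
    while pos >= 0 and src[pos] == '\\':
        found += 1
        pos -= 1
    # an odd count means the last backslash escapes the following character
    return found % 2 == 1


text = r"a \[b \\[c"
first = text.index('[')    # preceded by one backslash
second = text.rindex('[')  # preceded by two backslashes

print(is_escaped(text, first - 1))   # True: the bracket is escaped
print(is_escaped(text, second - 1))  # False: the backslash itself is escaped
```

# A bracket at the very start of the input is checked with pos == -1, which
# skips the loop and reports "not escaped".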
# the contents won't be split apart in the regex because spacing rules get messy here
_ATTR_SPAN_PATTERN = re.compile(r"\{([^}]*)\}")
# this one is for blocks with attrs. we want to use it with fullmatch() to deconstruct an info.
_ATTR_BLOCK_PATTERN = re.compile(r"\s*\{([^}]*)\}\s*")
def _parse_attrs(s: str) -> Optional[tuple[Optional[str], list[str]]]:
(id, classes) = (None, [])
for part in s.split():
if part.startswith('#'):
if id is not None:
return None # just bail on multiple ids instead of trying to recover
id = part[1:]
elif part.startswith('.'):
classes.append(part[1:])
else:
return None # no support for key=value attrs like in pandoc
return (id, classes)
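# Illustration only: a stand-alone sketch of the attribute grammar accepted
# above (at most one `#id`, any number of `.class` entries, nothing else).
# The name `parse_attrs` and the sample inputs are assumptions for this
# example.

```python
from typing import Optional


def parse_attrs(s: str) -> Optional[tuple[Optional[str], list[str]]]:
    ident: Optional[str] = None
    classes: list[str] = []
    for part in s.split():
        if part.startswith('#'):
            if ident is not None:
                return None  # a second id is rejected outright
            ident = part[1:]
        elif part.startswith('.'):
            classes.append(part[1:])
        else:
            return None  # key=value attributes are not supported
    return (ident, classes)


print(parse_attrs("#intro .note"))  # ('intro', ['note'])
print(parse_attrs("#a #b"))         # None
print(parse_attrs("key=value"))     # None
print(parse_attrs(""))              # (None, [])
```

# Returning None for the whole string, rather than skipping the offending
# part, lets callers treat the text as ordinary content instead of as attrs.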
def _parse_blockattrs(info: str) -> Optional[tuple[AttrBlockKind, Optional[str], list[str]]]:
if (m := _ATTR_BLOCK_PATTERN.fullmatch(info)) is None:
return None
if (parsed_attrs := _parse_attrs(m[1])) is None:
return None
id, classes = parsed_attrs
# check that we actually support this kind of block, and that it adheres to
# whatever restrictions we want to enforce for that kind of block.
if len(classes) == 1 and classes[0] in get_args(AdmonitionKind):
# don't want to support ids for admonitions just yet
if id is not None:
return None
return ('admonition', id, classes)
if classes == ['example']:
return ('example', id, classes)
elif classes == ['figure']:
return ('figure', id, classes)
return None
def _attr_span_plugin(md: markdown_it.MarkdownIt) -> None:
def attr_span(state: markdown_it.rules_inline.StateInline, silent: bool) -> bool:
if state.src[state.pos] != '[':
return False
if _is_escaped(state.src, state.pos - 1):
return False
# treat the inline span like a link label for simplicity.
label_begin = state.pos + 1
label_end = markdown_it.helpers.parseLinkLabel(state, state.pos)
input_end = state.posMax
if label_end < 0:
return False
# match id and classes in any combination
match = _ATTR_SPAN_PATTERN.match(state.src[label_end + 1 : ])
if not match:
return False
if not silent:
if (parsed_attrs := _parse_attrs(match[1])) is None:
return False
id, classes = parsed_attrs
token = state.push("attr_span_begin", "span", 1)
if id:
token.attrs['id'] = id
if classes:
token.attrs['class'] = " ".join(classes)
state.pos = label_begin
state.posMax = label_end
state.md.inline.tokenize(state)
state.push("attr_span_end", "span", -1)
state.pos = label_end + match.end() + 1
state.posMax = input_end
return True
md.inline.ruler.before("link", "attr_span", attr_span)
def _inline_comment_plugin(md: markdown_it.MarkdownIt) -> None:
def inline_comment(state: markdown_it.rules_inline.StateInline, silent: bool) -> bool:
if state.src[state.pos : state.pos + 4] != '<!--':
return False
if _is_escaped(state.src, state.pos - 1):
return False
for i in range(state.pos + 4, state.posMax - 2):
if state.src[i : i + 3] == '-->': # -->
state.pos = i + 3
return True
return False
md.inline.ruler.after("autolink", "inline_comment", inline_comment)
def _block_comment_plugin(md: markdown_it.MarkdownIt) -> None:
def block_comment(state: markdown_it.rules_block.StateBlock, startLine: int, endLine: int,
silent: bool) -> bool:
pos = state.bMarks[startLine] + state.tShift[startLine]
posMax = state.eMarks[startLine]
if state.src[pos : pos + 4] != '<!--':
return False
nextLine = startLine
while nextLine < endLine:
pos = state.bMarks[nextLine] + state.tShift[nextLine]
posMax = state.eMarks[nextLine]
if state.src[posMax - 3 : posMax] == '-->':
state.line = nextLine + 1
return True
nextLine += 1
return False
md.block.ruler.after("code", "block_comment", block_comment)
_HEADER_ID_RE = re.compile(r"\s*\{\s*\#([\w.-]+)\s*\}\s*$")
def _heading_ids(md: markdown_it.MarkdownIt) -> None:
def heading_ids(state: markdown_it.rules_core.StateCore) -> None:
tokens = state.tokens
# this is purposely simple and doesn't support classes or other kinds of attributes.
for (i, token) in enumerate(tokens):
if token.type == 'heading_open':
children = tokens[i + 1].children
assert children is not None
if len(children) == 0 or children[-1].type != 'text':
continue
if m := _HEADER_ID_RE.search(children[-1].content):
tokens[i].attrs['id'] = m[1]
children[-1].content = children[-1].content[:-len(m[0])].rstrip()
md.core.ruler.before("replacements", "heading_ids", heading_ids)
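# Illustration only: how the trailing `{#id}` pattern splits a heading's text.
# The regex is copied from _HEADER_ID_RE above; `split_heading` and the sample
# headings are assumptions for this sketch and operate on plain strings rather
# than on markdown-it tokens.

```python
import re
from typing import Optional

HEADER_ID_RE = re.compile(r"\s*\{\s*\#([\w.-]+)\s*\}\s*$")


def split_heading(text: str) -> tuple[str, Optional[str]]:
    """Return (heading text without the id suffix, id or None)."""
    m = HEADER_ID_RE.search(text)
    if m is None:
        return (text, None)
    # drop the matched suffix, then any whitespace left in front of it
    return (text[:-len(m[0])].rstrip(), m[1])


print(split_heading("Installation {#sec-install}"))  # ('Installation', 'sec-install')
print(split_heading("Plain heading"))                # ('Plain heading', None)
print(split_heading("Title {.class}"))               # ('Title {.class}', None)
```

# Only an id is recognised here; class attributes are left in the text, which
# matches the "purposely simple" note in heading_ids.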
def _footnote_ids(md: markdown_it.MarkdownIt) -> None:
"""generate ids for footnotes, their refs, and their backlinks. the ids we
generate here are derived from the footnote label, making numeric footnote
labels invalid.
"""
def generate_ids(src: str, tokens: Sequence[Token]) -> None:
for token in tokens:
if token.type == 'footnote_open':
if token.meta["label"][:1].isdigit():
assert token.map
raise SrcError(
src=src,
description="invalid footnote label",
token=token,
)
token.attrs['id'] = token.meta["label"]
elif token.type == 'footnote_anchor':
token.meta['target'] = f'{token.meta["label"]}.__back.{token.meta["subId"]}'
elif token.type == 'footnote_ref':
token.attrs['id'] = f'{token.meta["label"]}.__back.{token.meta["subId"]}'
token.meta['target'] = token.meta["label"]
elif token.type == 'inline':
assert token.children is not None
generate_ids(src, token.children)
def footnote_ids(state: markdown_it.rules_core.StateCore) -> None:
generate_ids(state.src, state.tokens)
md.core.ruler.after("footnote_tail", "footnote_ids", footnote_ids)
def _compact_list_attr(md: markdown_it.MarkdownIt) -> None:
@dataclasses.dataclass
class Entry:
head: Token
end: int
compact: bool = True
def compact_list_attr(state: markdown_it.rules_core.StateCore) -> None:
# markdown-it signifies wide lists by setting the wrapper paragraphs
# of each item to hidden. this is not useful for our stylesheets, which
# signify this with a special css class on list elements instead.
stack = []
for token in state.tokens:
if token.type in [ 'bullet_list_open', 'ordered_list_open' ]:
stack.append(Entry(token, cast(int, token.attrs.get('start', 1))))
elif token.type in [ 'bullet_list_close', 'ordered_list_close' ]:
lst = stack.pop()
lst.head.meta['compact'] = lst.compact
if token.type == 'ordered_list_close':
lst.head.meta['end'] = lst.end - 1
elif len(stack) > 0 and token.type == 'paragraph_open' and not token.hidden:
stack[-1].compact = False
elif token.type == 'list_item_open':
stack[-1].end += 1
md.core.ruler.push("compact_list_attr", compact_list_attr)
def _block_attr(md: markdown_it.MarkdownIt) -> None:
def assert_never(value: NoReturn) -> NoReturn:
assert False
def block_attr(state: markdown_it.rules_core.StateCore) -> None:
stack = []
for token in state.tokens:
if token.type == 'container_blockattr_open':
if (parsed_attrs := _parse_blockattrs(token.info)) is None:
# if we get here we've missed a possible case in the plugin validate function
raise RuntimeError("this should be unreachable")
kind, id, classes = parsed_attrs
if kind == 'admonition':
token.type = 'admonition_open'
token.meta['kind'] = classes[0]
stack.append('admonition_close')
elif kind == 'example':
token.type = 'example_open'
if id is not None:
token.attrs['id'] = id
stack.append('example_close')
elif kind == 'figure':
token.type = 'figure_open'
if id is not None:
token.attrs['id'] = id
stack.append('figure_close')
else:
assert_never(kind)
elif token.type == 'container_blockattr_close':
token.type = stack.pop()
md.core.ruler.push("block_attr", block_attr)
def _block_titles(block: str) -> Callable[[markdown_it.MarkdownIt], None]:
"""
find title headings of blocks and stick them into meta for renderers, then
remove them from the token stream. also checks whether any block contains a
non-title heading since those would make toc generation extremely complicated.
"""
open, close = f'{block}_open', f'{block}_close'
title_open, title_close = f'{block}_title_open', f'{block}_title_close'
def block_titles(state: markdown_it.rules_core.StateCore) -> None:
in_example = [None]
for i, token in enumerate(state.tokens):
if token.type == open:
if state.tokens[i + 1].type == 'heading_open':
assert state.tokens[i + 3].type == 'heading_close'
state.tokens[i + 1].type = title_open
state.tokens[i + 3].type = title_close
else:
raise SrcError(
src=state.src,
description=f"found {block} without title",
token=token,
)
in_example.append(token)
elif token.type == close:
in_example.pop()
elif token.type == 'heading_open' and in_example[-1]:
assert token.map
started_at = in_example[-1]
block_display = ":::{." + block + "}"
raise SrcError(
description=f"unexpected non-title heading in `{block_display}`; are you missing a `:::`?\n"
f"Note: blocks like `{block_display}` are only allowed to contain a single heading in order to simplify TOC generation.",
src=state.src,
tokens={
f"`{block_display}` block": started_at,
"Unexpected heading": token,
},
)
def do_add(md: markdown_it.MarkdownIt) -> None:
md.core.ruler.push(f"{block}_titles", block_titles)
return do_add
TR = TypeVar('TR', bound='Renderer')
class Converter(ABC, Generic[TR]):
# we explicitly disable markdown-it rendering support and use our own entirely.
# rendering is well separated from parsing and our renderers carry much more state than
# markdown-it easily acknowledges as 'good' (unless we used the untyped env args to
# shuttle that state around, which is very fragile)
class ForbiddenRenderer(markdown_it.renderer.RendererProtocol):
__output__ = "none"
def __init__(self, parser: Optional[markdown_it.MarkdownIt]):
pass
def render(self, tokens: Sequence[Token], options: OptionsDict,
env: MutableMapping[str, Any]) -> str:
raise NotImplementedError("do not use Converter._md.renderer. 'tis a silly place")
_renderer: TR
def __init__(self) -> None:
self._md = markdown_it.MarkdownIt(
"commonmark",
{
'maxNesting': 100, # default is 20
'html': False, # not useful since we target many formats
'typographer': True, # required for smartquotes
},
renderer_cls=self.ForbiddenRenderer
)
self._md.enable('table')
self._md.use(
container_plugin,
name="blockattr",
validate=lambda name, *args: _parse_blockattrs(name),
)
self._md.use(deflist_plugin)
self._md.use(footnote_plugin)
self._md.use(myst_role_plugin)
self._md.use(_attr_span_plugin)
self._md.use(_inline_comment_plugin)
self._md.use(_block_comment_plugin)
self._md.use(_heading_ids)
self._md.use(_footnote_ids)
self._md.use(_compact_list_attr)
self._md.use(_block_attr)
self._md.use(_block_titles("example"))
self._md.use(_block_titles("figure"))
self._md.enable(["smartquotes", "replacements"])
def _parse(self, src: str) -> list[Token]:
return self._md.parse(src, {})
def _render(self, src: str) -> str:
tokens = self._parse(src)
return self._renderer.render(tokens)

@@ -0,0 +1,599 @@
from __future__ import annotations
import argparse
import html
import json
import xml.sax.saxutils as xml
from abc import abstractmethod
from collections.abc import Mapping, Sequence
from markdown_it.token import Token
from pathlib import Path
from typing import Any, Generic, Optional
from urllib.parse import quote
from . import md
from . import parallel
from .asciidoc import AsciiDocRenderer, asciidoc_escape
from .commonmark import CommonMarkRenderer
from .html import HTMLRenderer
from .manpage import ManpageRenderer, man_escape
from .manual_structure import make_xml_id, XrefTarget
from .md import Converter, md_escape, md_make_code
from .types import OptionLoc, Option, RenderedOption, AnchorStyle
def option_is(option: Option, key: str, typ: str) -> Optional[dict[str, str]]:
if key not in option:
return None
if type(option[key]) != dict:
return None
if option[key].get('_type') != typ: # type: ignore[union-attr]
return None
return option[key] # type: ignore[return-value]
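# Illustration only: a stand-alone sketch of the `_type` tag check, run on
# plain dicts. The sample option and its values are assumptions for this
# example; real option records come from the options JSON passed on the
# command line.

```python
from typing import Any, Optional


def option_is(option: dict[str, Any], key: str, typ: str) -> Optional[dict[str, str]]:
    value = option.get(key)
    if not isinstance(value, dict):
        return None  # missing key or an untagged value
    if value.get('_type') != typ:
        return None  # tagged, but with a different kind
    return value


sample = {
    "default": {"_type": "literalExpression", "text": "false"},
    "example": "not a tagged value",
}

print(option_is(sample, "default", "literalExpression"))
print(option_is(sample, "default", "literalMD"))  # None
print(option_is(sample, "example", "literalMD"))  # None
```

# The walrus-style callers (`if lit := option_is(...)`) rely on the None
# result being falsy and a matching dict being truthy.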
class BaseConverter(Converter[md.TR], Generic[md.TR]):
__option_block_separator__: str
_options: dict[str, RenderedOption]
def __init__(self, revision: str):
super().__init__()
self._options = {}
self._revision = revision
def _sorted_options(self) -> list[tuple[str, RenderedOption]]:
keys = list(self._options.keys())
keys.sort(key=lambda opt: [ (0 if p.startswith("enable") else 1 if p.startswith("package") else 2, p)
for p in self._options[opt].loc ])
return [ (k, self._options[k]) for k in keys ]
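# Illustration only: the effect of the sort key above, shown on plain lists.
# Within each level of the option path, components starting with "enable"
# sort first, then those starting with "package", then everything else
# alphabetically. The option names are made up for this sketch.

```python
def sort_key(loc: list[str]) -> list[tuple[int, str]]:
    return [
        (0 if p.startswith("enable") else 1 if p.startswith("package") else 2, p)
        for p in loc
    ]


names = [
    "services.foo.settings",
    "services.foo.alpha",
    "services.foo.package",
    "services.foo.enable",
    "services.bar.enable",
]

# here the option path is simply the dotted name split into its components
ordered = sorted(names, key=lambda name: sort_key(name.split(".")))
for name in ordered:
    print(name)
# services.bar.enable
# services.foo.enable
# services.foo.package
# services.foo.alpha
# services.foo.settings
```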
def _format_decl_def_loc(self, loc: OptionLoc) -> tuple[Optional[str], str]:
# locations can be either plain strings (specific to nixpkgs), or attrsets
# { name = "foo/bar.nix"; url = "https://github.com/....."; }
if isinstance(loc, str):
# Hyperlink the filename either to the NixOS GitHub
# repository (if it's a module and we have a revision number),
# or to the local filesystem.
if not loc.startswith('/'):
if self._revision == 'local':
href = f"https://github.com/NixOS/nixpkgs/blob/master/{loc}"
else:
href = f"https://github.com/NixOS/nixpkgs/blob/{self._revision}/{loc}"
else:
href = f"file://{loc}"
# Print the filename and make it user-friendly by replacing the
# /nix/store/<hash> prefix with the default location of the NixOS
# sources.
if not loc.startswith('/'):
name = f"<nixpkgs/{loc}>"
elif 'nixops' in loc and '/nix/' in loc:
name = f"<nixops/{loc[loc.find('/nix/') + 5:]}>"
else:
name = loc
return (href, name)
else:
return (loc['url'] if 'url' in loc else None, loc['name'])
@abstractmethod
def _decl_def_header(self, header: str) -> list[str]: raise NotImplementedError()
@abstractmethod
def _decl_def_entry(self, href: Optional[str], name: str) -> list[str]: raise NotImplementedError()
@abstractmethod
def _decl_def_footer(self) -> list[str]: raise NotImplementedError()
def _render_decl_def(self, header: str, locs: list[OptionLoc]) -> list[str]:
result = []
result += self._decl_def_header(header)
for loc in locs:
href, name = self._format_decl_def_loc(loc)
result += self._decl_def_entry(href, name)
result += self._decl_def_footer()
return result
def _render_code(self, option: Option, key: str) -> list[str]:
if lit := option_is(option, key, 'literalMD'):
return [ self._render(f"*{key.capitalize()}:*\n{lit['text']}") ]
elif lit := option_is(option, key, 'literalExpression'):
code = md_make_code(lit['text'])
return [ self._render(f"*{key.capitalize()}:*\n{code}") ]
elif key in option:
raise Exception(f"{key} has unrecognized type", option[key])
else:
return []
def _render_description(self, desc: str | dict[str, str]) -> list[str]:
if isinstance(desc, str):
return [ self._render(desc) ] if desc else []
elif isinstance(desc, dict) and desc.get('_type') == 'mdDoc':
return [ self._render(desc['text']) ] if desc['text'] else []
else:
raise Exception("description has unrecognized type", desc)
@abstractmethod
def _related_packages_header(self) -> list[str]: raise NotImplementedError()
def _convert_one(self, option: dict[str, Any]) -> list[str]:
blocks: list[list[str]] = []
if desc := option.get('description'):
blocks.append(self._render_description(desc))
if typ := option.get('type'):
ro = " *(read only)*" if option.get('readOnly', False) else ""
blocks.append([ self._render(f"*Type:*\n{md_escape(typ)}{ro}") ])
if option.get('default'):
blocks.append(self._render_code(option, 'default'))
if option.get('example'):
blocks.append(self._render_code(option, 'example'))
if related := option.get('relatedPackages'):
blocks.append(self._related_packages_header())
blocks[-1].append(self._render(related))
if decl := option.get('declarations'):
blocks.append(self._render_decl_def("Declared by", decl))
if defs := option.get('definitions'):
blocks.append(self._render_decl_def("Defined by", defs))
for part in [ p for p in blocks[0:-1] if p ]:
part.append(self.__option_block_separator__)
return [ l for part in blocks for l in part ]
# this could return a TState parameter, but that does not allow dependent types and
# will cause headaches when using BaseConverter as a type bound anywhere. Any is the
# next best thing we can use, and since this is internal it will be mostly safe.
@abstractmethod
def _parallel_render_prepare(self) -> Any: raise NotImplementedError()
# this should return python 3.11's Self instead to ensure that a prepare+finish
# round-trip ends up with an object of the same type. for now we'll use BaseConverter
# since it's good enough so far.
@classmethod
@abstractmethod
def _parallel_render_init_worker(cls, a: Any) -> BaseConverter[md.TR]: raise NotImplementedError()
def _render_option(self, name: str, option: dict[str, Any]) -> RenderedOption:
try:
return RenderedOption(option['loc'], self._convert_one(option))
except Exception as e:
raise Exception(f"Failed to render option {name}") from e
@classmethod
def _parallel_render_step(cls, s: BaseConverter[md.TR], a: Any) -> RenderedOption:
return s._render_option(*a)
def add_options(self, options: dict[str, Any]) -> None:
mapped = parallel.map(self._parallel_render_step, options.items(), 100,
self._parallel_render_init_worker, self._parallel_render_prepare())
for (name, option) in zip(options.keys(), mapped):
self._options[name] = option
@abstractmethod
def finalize(self) -> str: raise NotImplementedError()
class OptionDocsRestrictions:
def heading_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported in options doc", token)
def heading_close(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported in options doc", token)
def attr_span_begin(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported in options doc", token)
def example_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
raise RuntimeError("md token not supported in options doc", token)
class OptionsManpageRenderer(OptionDocsRestrictions, ManpageRenderer):
pass
class ManpageConverter(BaseConverter[OptionsManpageRenderer]):
__option_block_separator__ = ".sp"
_options_by_id: dict[str, str]
_links_in_last_description: Optional[list[str]] = None
def __init__(self, revision: str,
header: list[str] | None,
footer: list[str] | None,
*,
# only for parallel rendering
_options_by_id: Optional[dict[str, str]] = None):
super().__init__(revision)
self._options_by_id = _options_by_id or {}
self._renderer = OptionsManpageRenderer({}, self._options_by_id)
self._header = header
self._footer = footer
def _parallel_render_prepare(self) -> Any:
return (
self._revision,
self._header,
self._footer,
{ '_options_by_id': self._options_by_id },
)
@classmethod
def _parallel_render_init_worker(cls, a: Any) -> ManpageConverter:
return cls(a[0], a[1], a[2], **a[3])
def _render_option(self, name: str, option: dict[str, Any]) -> RenderedOption:
links = self._renderer.link_footnotes = []
result = super()._render_option(name, option)
self._renderer.link_footnotes = None
return result._replace(links=links)
def add_options(self, options: dict[str, Any]) -> None:
for (k, v) in options.items():
self._options_by_id[f'#{make_xml_id(f"opt-{k}")}'] = k
return super().add_options(options)
def _render_code(self, option: dict[str, Any], key: str) -> list[str]:
try:
self._renderer.inline_code_is_quoted = False
return super()._render_code(option, key)
finally:
self._renderer.inline_code_is_quoted = True
def _related_packages_header(self) -> list[str]:
return [
'\\fIRelated packages:\\fP',
'.sp',
]
def _decl_def_header(self, header: str) -> list[str]:
return [
f'\\fI{man_escape(header)}:\\fP',
]
def _decl_def_entry(self, href: Optional[str], name: str) -> list[str]:
return [
'.RS 4',
f'\\fB{man_escape(name)}\\fP',
'.RE'
]
def _decl_def_footer(self) -> list[str]:
return []
def finalize(self) -> str:
result = []
if self._header is not None:
result += self._header
else:
result += [
r'''.TH "CONFIGURATION\&.NIX" "5" "01/01/1980" "NixOS" "NixOS Reference Pages"''',
r'''.\" disable hyphenation''',
r'''.nh''',
r'''.\" disable justification (adjust text to left margin only)''',
r'''.ad l''',
r'''.\" enable line breaks after slashes''',
r'''.cflags 4 /''',
r'''.SH "NAME"''',
self._render('{file}`configuration.nix` - NixOS system configuration specification'),
r'''.SH "DESCRIPTION"''',
r'''.PP''',
self._render('The file {file}`/etc/nixos/configuration.nix` contains the '
'declarative specification of your NixOS system configuration. '
'The command {command}`nixos-rebuild` takes this file and '
'realises the system configuration specified therein.'),
r'''.SH "OPTIONS"''',
r'''.PP''',
self._render('You can use the following options in {file}`configuration.nix`.'),
]
for (name, opt) in self._sorted_options():
result += [
".PP",
f"\\fB{man_escape(name)}\\fR",
".RS 4",
]
result += opt.lines
if links := opt.links:
result.append(self.__option_block_separator__)
md_links = ""
for i in range(0, len(links)):
md_links += "\n" if i > 0 else ""
if links[i].startswith('#opt-'):
md_links += f"{i+1}. see the {{option}}`{self._options_by_id[links[i]]}` option"
else:
md_links += f"{i+1}. " + md_escape(links[i])
result.append(self._render(md_links))
result.append(".RE")
if self._footer is not None:
result += self._footer
else:
result += [
r'''.SH "AUTHORS"''',
r'''.PP''',
r'''Eelco Dolstra and the Nixpkgs/NixOS contributors''',
]
return "\n".join(result)
class OptionsCommonMarkRenderer(OptionDocsRestrictions, CommonMarkRenderer):
pass
class CommonMarkConverter(BaseConverter[OptionsCommonMarkRenderer]):
__option_block_separator__ = ""
_anchor_style: AnchorStyle
_anchor_prefix: str
def __init__(self, manpage_urls: Mapping[str, str], revision: str, anchor_style: AnchorStyle = AnchorStyle.NONE, anchor_prefix: str = ""):
super().__init__(revision)
self._renderer = OptionsCommonMarkRenderer(manpage_urls)
self._anchor_style = anchor_style
self._anchor_prefix = anchor_prefix
def _parallel_render_prepare(self) -> Any:
return (self._renderer._manpage_urls, self._revision)
@classmethod
def _parallel_render_init_worker(cls, a: Any) -> CommonMarkConverter:
return cls(*a)
def _related_packages_header(self) -> list[str]:
return [ "*Related packages:*" ]
def _decl_def_header(self, header: str) -> list[str]:
return [ f"*{header}:*" ]
def _decl_def_entry(self, href: Optional[str], name: str) -> list[str]:
if href is not None:
return [ f" - [{md_escape(name)}]({href})" ]
return [ f" - {md_escape(name)}" ]
def _decl_def_footer(self) -> list[str]:
return []
def _make_anchor_suffix(self, loc: list[str]) -> str:
if self._anchor_style == AnchorStyle.NONE:
return ""
elif self._anchor_style == AnchorStyle.LEGACY:
sanitized = ".".join(map(make_xml_id, loc))
return f" {{#{self._anchor_prefix}{sanitized}}}"
else:
raise RuntimeError("unhandled anchor style", self._anchor_style)
def finalize(self) -> str:
result = []
for (name, opt) in self._sorted_options():
anchor_suffix = self._make_anchor_suffix(opt.loc)
result.append(f"## {md_escape(name)}{anchor_suffix}\n")
result += opt.lines
result.append("\n\n")
return "\n".join(result)
class OptionsAsciiDocRenderer(OptionDocsRestrictions, AsciiDocRenderer):
pass
class AsciiDocConverter(BaseConverter[OptionsAsciiDocRenderer]):
__option_block_separator__ = ""
def __init__(self, manpage_urls: Mapping[str, str], revision: str):
super().__init__(revision)
self._renderer = OptionsAsciiDocRenderer(manpage_urls)
def _parallel_render_prepare(self) -> Any:
return (self._renderer._manpage_urls, self._revision)
@classmethod
def _parallel_render_init_worker(cls, a: Any) -> AsciiDocConverter:
return cls(*a)
def _related_packages_header(self) -> list[str]:
return [ "__Related packages:__" ]
def _decl_def_header(self, header: str) -> list[str]:
return [ f"__{header}:__\n" ]
def _decl_def_entry(self, href: Optional[str], name: str) -> list[str]:
if href is not None:
return [ f"* link:{quote(href, safe='/:')}[{asciidoc_escape(name)}]" ]
return [ f"* {asciidoc_escape(name)}" ]
def _decl_def_footer(self) -> list[str]:
return []
def finalize(self) -> str:
result = []
for (name, opt) in self._sorted_options():
result.append(f"== {asciidoc_escape(name)}\n")
result += opt.lines
result.append("\n\n")
return "\n".join(result)
class OptionsHTMLRenderer(OptionDocsRestrictions, HTMLRenderer):
# TODO docbook compat. must be removed together with the matching docbook handlers.
def ordered_list_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
token.meta['compact'] = False
return super().ordered_list_open(token, tokens, i)
def bullet_list_open(self, token: Token, tokens: Sequence[Token], i: int) -> str:
token.meta['compact'] = False
return super().bullet_list_open(token, tokens, i)
def fence(self, token: Token, tokens: Sequence[Token], i: int) -> str:
info = f" {html.escape(token.info, True)}" if token.info != "" else ""
return f'<pre><code class="programlisting{info}">{html.escape(token.content)}</code></pre>'
class HTMLConverter(BaseConverter[OptionsHTMLRenderer]):
__option_block_separator__ = ""
def __init__(self, manpage_urls: Mapping[str, str], revision: str,
varlist_id: str, id_prefix: str, xref_targets: Mapping[str, XrefTarget]):
super().__init__(revision)
self._xref_targets = xref_targets
self._varlist_id = varlist_id
self._id_prefix = id_prefix
self._renderer = OptionsHTMLRenderer(manpage_urls, self._xref_targets)
def _parallel_render_prepare(self) -> Any:
return (self._renderer._manpage_urls, self._revision,
self._varlist_id, self._id_prefix, self._xref_targets)
@classmethod
def _parallel_render_init_worker(cls, a: Any) -> HTMLConverter:
return cls(*a)
def _related_packages_header(self) -> list[str]:
return [
'<p><span class="emphasis"><em>Related packages:</em></span></p>',
]
def _decl_def_header(self, header: str) -> list[str]:
return [
f'<p><span class="emphasis"><em>{header}:</em></span></p>',
'<table border="0" summary="Simple list" class="simplelist">'
]
def _decl_def_entry(self, href: Optional[str], name: str) -> list[str]:
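# render no href attribute at all when there is no link target, instead of
# interpolating the literal string "None" into the tag
href = f' href="{html.escape(href, True)}"' if href is not None else ''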
return [
"<tr><td>",
f'<code class="filename"><a class="filename" {href} target="_top">',
f'{html.escape(name)}',
'</a></code>',
"</td></tr>"
]
def _decl_def_footer(self) -> list[str]:
return [ "</table>" ]
def finalize(self) -> str:
result = []
result += [
'<div class="variablelist">',
f'<a id="{html.escape(self._varlist_id, True)}"></a>',
' <dl class="variablelist">',
]
for (name, opt) in self._sorted_options():
id = make_xml_id(self._id_prefix + name)
target = self._xref_targets[id]
result += [
'<dt>',
' <span class="term">',
# docbook compat, these could be one tag
f' <a id="{html.escape(id, True)}"></a><a class="term" href="{target.href()}">'
# no spaces here (and string merging) for docbook output compat
f'<code class="option">{html.escape(name)}</code>',
' </a>',
' </span>',
'</dt>',
'<dd>',
]
result += opt.lines
result += [
"</dd>",
]
result += [
" </dl>",
"</div>"
]
return "\n".join(result)
def _build_cli_manpage(p: argparse.ArgumentParser) -> None:
    p.add_argument('--revision', required=True)
    p.add_argument("--header", type=Path)
    p.add_argument("--footer", type=Path)
    p.add_argument("infile")
    p.add_argument("outfile")

def parse_anchor_style(value: str|AnchorStyle) -> AnchorStyle:
    if isinstance(value, AnchorStyle):
        # Used by `argparse.add_argument`'s `default`
        return value
    try:
        return AnchorStyle(value.lower())
    except ValueError:
        raise argparse.ArgumentTypeError(f"Invalid value {value}\nExpected one of {', '.join(style.value for style in AnchorStyle)}")

def _build_cli_commonmark(p: argparse.ArgumentParser) -> None:
    p.add_argument('--manpage-urls', required=True)
    p.add_argument('--revision', required=True)
    p.add_argument(
        '--anchor-style',
        required=False,
        default=AnchorStyle.NONE.value,
        choices = [style.value for style in AnchorStyle],
        help = "(default: %(default)s) Anchor style to use for links to options. \nOnly none is standard CommonMark."
    )
    p.add_argument('--anchor-prefix',
        required=False,
        default="",
        help="(default: no prefix) String to prepend to anchor ids. Not used when anchor style is none."
    )
    p.add_argument("infile")
    p.add_argument("outfile")

def _build_cli_asciidoc(p: argparse.ArgumentParser) -> None:
    p.add_argument('--manpage-urls', required=True)
    p.add_argument('--revision', required=True)
    p.add_argument("infile")
    p.add_argument("outfile")

def _run_cli_manpage(args: argparse.Namespace) -> None:
    header = None
    footer = None

    if args.header is not None:
        with args.header.open() as f:
            header = f.read().splitlines()

    if args.footer is not None:
        with args.footer.open() as f:
            footer = f.read().splitlines()

    md = ManpageConverter(
        revision = args.revision,
        header = header,
        footer = footer,
    )

    with open(args.infile, 'r') as f:
        md.add_options(json.load(f))
    with open(args.outfile, 'w') as f:
        f.write(md.finalize())

def _run_cli_commonmark(args: argparse.Namespace) -> None:
    with open(args.manpage_urls, 'r') as manpage_urls:
        md = CommonMarkConverter(json.load(manpage_urls),
            revision = args.revision,
            anchor_style = parse_anchor_style(args.anchor_style),
            anchor_prefix = args.anchor_prefix)

        with open(args.infile, 'r') as f:
            md.add_options(json.load(f))
        with open(args.outfile, 'w') as f:
            f.write(md.finalize())

def _run_cli_asciidoc(args: argparse.Namespace) -> None:
    with open(args.manpage_urls, 'r') as manpage_urls:
        md = AsciiDocConverter(json.load(manpage_urls), revision = args.revision)

        with open(args.infile, 'r') as f:
            md.add_options(json.load(f))
        with open(args.outfile, 'w') as f:
            f.write(md.finalize())

def build_cli(p: argparse.ArgumentParser) -> None:
    formats = p.add_subparsers(dest='format', required=True)
    _build_cli_manpage(formats.add_parser('manpage'))
    _build_cli_commonmark(formats.add_parser('commonmark'))
    _build_cli_asciidoc(formats.add_parser('asciidoc'))

def run_cli(args: argparse.Namespace) -> None:
    if args.format == 'manpage':
        _run_cli_manpage(args)
    elif args.format == 'commonmark':
        _run_cli_commonmark(args)
    elif args.format == 'asciidoc':
        _run_cli_asciidoc(args)
    else:
        raise RuntimeError('format not hooked up', args)


@@ -0,0 +1,58 @@
# this module only has to exist because cpython has a global interpreter lock
# and markdown-it is pure python code. ideally we'd just use thread pools, but
# the GIL prohibits this.

import multiprocessing

from typing import Any, Callable, Iterable, Optional, TypeVar

R = TypeVar('R')
S = TypeVar('S')
T = TypeVar('T')
A = TypeVar('A')

pool_processes: Optional[int] = None

# this thing is impossible to type because there's so much global state involved.
# wrapping in a class to get access to Generic[] parameters is not sufficient
# because mypy is too weak, and unnecessarily obscures how much global state is
# needed in each worker to make this whole brouhaha work.
_map_worker_fn: Any = None
_map_worker_state_fn: Any = None
_map_worker_state_arg: Any = None

def _map_worker_init(*args: Any) -> None:
    global _map_worker_fn, _map_worker_state_fn, _map_worker_state_arg
    (_map_worker_fn, _map_worker_state_fn, _map_worker_state_arg) = args

# NOTE: the state argument is never passed by any caller, we only use it as a localized
# cache for the created state in lieu of another global. it is effectively a global though.
def _map_worker_step(arg: Any, state: Any = []) -> Any:
    global _map_worker_fn, _map_worker_state_fn, _map_worker_state_arg
    # if a Pool initializer throws it'll just be retried, leading to endless loops.
    # doing the proper initialization only on first use avoids this.
    if not state:
        state.append(_map_worker_state_fn(_map_worker_state_arg))
    return _map_worker_fn(state[0], arg)

def map(fn: Callable[[S, T], R], d: Iterable[T], chunk_size: int,
        state_fn: Callable[[A], S], state_arg: A) -> list[R]:
    """
    `[ fn(state, i) for i in d ]` where `state = state_fn(state_arg)`, but using multiprocessing
    if `pool_processes` is not `None`. when multiprocessing is used the state function will
    be run once in every worker process and `multiprocessing.Pool.imap` will be used.

    **NOTE:** neither `state_fn` nor `fn` are allowed to mutate global state! doing so will cause
    discrepancies if `pool_processes` is not None, since each worker will have its own copy.

    **NOTE**: all data types that potentially cross a process boundary (so, all of them) must be
    pickle-able. this excludes lambdas, bound functions, local functions, and a number of other
    types depending on their exact internal structure. *theoretically* the pool constructor
    can transfer non-pickleable data to worker processes, but this only works when using the
    `fork` spawn method (and is thus not available on darwin or windows).
    """
    if pool_processes is None:
        state = state_fn(state_arg)
        return [ fn(state, i) for i in d ]
    with multiprocessing.Pool(pool_processes, _map_worker_init, (fn, state_fn, state_arg)) as p:
        return list(p.imap(_map_worker_step, d, chunk_size))
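
The module above builds its per-worker state lazily, on first use, rather than in the pool initializer. A minimal standalone sketch of that pattern follows. The names `parallel_map`, `make_prefix` and `render` are hypothetical and not part of `nixos-render-docs`, and the cache is a plain module-level list here instead of a default argument.

```python
import multiprocessing
from typing import Any, Optional

# Per-process globals: every worker process ends up with its own copy.
_fn: Any = None
_state_fn: Any = None
_state_arg: Any = None
_state_cache: list[Any] = []

def _init_worker(fn: Any, state_fn: Any, state_arg: Any) -> None:
    # Only store the (pickle-able) ingredients; do no real work here.
    global _fn, _state_fn, _state_arg
    _fn, _state_fn, _state_arg = fn, state_fn, state_arg

def _step(arg: Any) -> Any:
    # Build the state on first use inside the worker, then reuse it.
    if not _state_cache:
        _state_cache.append(_state_fn(_state_arg))
    return _fn(_state_cache[0], arg)

def parallel_map(fn: Any, items: Any, state_fn: Any, state_arg: Any,
                 processes: Optional[int] = None, chunk_size: int = 1) -> list[Any]:
    if processes is None:
        # Sequential fallback: build the state once in this process.
        state = state_fn(state_arg)
        return [fn(state, i) for i in items]
    with multiprocessing.Pool(processes, _init_worker, (fn, state_fn, state_arg)) as p:
        return list(p.imap(_step, items, chunk_size))

# Hypothetical example functions. They live at module level so that they
# can be pickled and sent to worker processes.
def make_prefix(prefix: str) -> str:
    return prefix.upper()

def render(state: str, item: int) -> str:
    return f"{state}-{item}"

sequential = parallel_map(render, [1, 2, 3], make_prefix, "opt")
```

Passing `processes=2` would route the same call through `multiprocessing.Pool`. On platforms that use the `spawn` start method that call should sit under an `if __name__ == "__main__":` guard.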


@@ -0,0 +1,3 @@
const anchor = document.location.hash.substring(1);
const redirects = REDIRECTS_PLACEHOLDER;
if (redirects[anchor]) document.location.href = redirects[anchor];


@@ -0,0 +1,186 @@
import json
from dataclasses import dataclass, field
from pathlib import Path
from .manual_structure import XrefTarget
class RedirectsError(Exception):
    def __init__(
        self,
        # every argument is optional, so the annotations have to admit None
        conflicting_anchors: set[str] | None = None,
        divergent_redirects: set[str] | None = None,
        identifiers_missing_current_outpath: set[str] | None = None,
        identifiers_without_redirects: set[str] | None = None,
        orphan_identifiers: set[str] | None = None
    ):
        self.conflicting_anchors = conflicting_anchors or set()
        self.divergent_redirects = divergent_redirects or set()
        self.identifiers_missing_current_outpath = identifiers_missing_current_outpath or set()
        self.identifiers_without_redirects = identifiers_without_redirects or set()
        self.orphan_identifiers = orphan_identifiers or set()

    def __str__(self):
        error_messages = []
        if self.conflicting_anchors:
            error_messages.append(f"""
Identifiers must not be identical to any historical location's anchor of the same output path.
The following identifiers violate this rule:
- {"\n - ".join(self.conflicting_anchors)}
This can break links or redirects. If you added new content, choose a different identifier.""")
        if self.divergent_redirects:
            error_messages.append(f"""
All historical content locations must correspond to exactly one identifier.
The following locations violate this rule:
- {"\n - ".join(self.divergent_redirects)}
It leads to inconsistent behavior depending on which redirect is applied.
Please update doc/redirects.json or nixos/doc/manual/redirects.json!""")
        if self.identifiers_missing_current_outpath:
            error_messages.append(f"""
The first element of an identifier's redirects list must denote its current location.
The following identifiers violate this rule:
- {"\n - ".join(self.identifiers_missing_current_outpath)}
If you moved content, add its new location as the first element of the redirects mapping.
Please update doc/redirects.json or nixos/doc/manual/redirects.json!""")
        if self.identifiers_without_redirects:
            error_messages.append(f"""
Identifiers present in the source must have a mapping in the redirects file.
- {"\n - ".join(self.identifiers_without_redirects)}""")
        if self.orphan_identifiers:
            error_messages.append(f"""
Keys of the redirects mapping must correspond to some identifier in the source.
- {"\n - ".join(self.orphan_identifiers)}""")
        if self.identifiers_without_redirects or self.orphan_identifiers or self.identifiers_missing_current_outpath:
            error_messages.append(f"""
This can happen when an identifier was added, renamed, or removed.
Added new content?
$ redirects add-content <identifier> <path>
often:
$ redirects add-content <identifier> index.html
Moved existing content to a different output path?
$ redirects move-content <identifier> <path>
Renamed existing identifiers?
$ redirects rename-identifier <old-identifier> <new-identifier>
Removed content? Redirect to alternatives or relevant release notes.
$ redirects remove-and-redirect <identifier> <target-identifier>
NOTE: Run the right nix-shell to make this command available.
Nixpkgs:
$ nix-shell doc
NixOS:
$ nix-shell nixos/doc/manual
""")
        error_messages.append("NOTE: If your build passes locally and you see this message in CI, you probably need a rebase.")
        return "\n".join(error_messages)

@dataclass
class Redirects:
    _raw_redirects: dict[str, list[str]]
    _redirects_script: str

    _xref_targets: dict[str, XrefTarget] = field(default_factory=dict)

    def validate(self, initial_xref_targets: dict[str, XrefTarget]):
        """
        Validate redirection mappings against element locations in the output

        - Ensure semantic correctness of the set of redirects with the following rules:
            - Identifiers present in the source must have a mapping in the redirects file
            - Keys of the redirects mapping must correspond to some identifier in the source
            - All historical content locations must correspond to exactly one identifier
            - Identifiers must not be identical to any historical location's anchor of the same output path
            - The first element of an identifier's redirects list must denote its current location.
        """
        xref_targets = {}
        ignored_identifier_patterns = ("opt-", "auto-generated-", "function-library-", "service-opt-", "systemd-service-opt")
        for id, target in initial_xref_targets.items():
            # filter out automatically generated identifiers from module options and library documentation
            if id.startswith(ignored_identifier_patterns):
                continue
            xref_targets[id] = target

        identifiers_without_redirects = xref_targets.keys() - self._raw_redirects.keys()
        orphan_identifiers = self._raw_redirects.keys() - xref_targets.keys()

        client_side_redirects = {}
        server_side_redirects = {}
        conflicting_anchors = set()
        divergent_redirects = set()
        identifiers_missing_current_outpath = set()

        for identifier, locations in self._raw_redirects.items():
            if identifier not in xref_targets:
                continue

            if not locations or locations[0] != f"{xref_targets[identifier].path}#{identifier}":
                identifiers_missing_current_outpath.add(identifier)

            for location in locations[1:]:
                if '#' in location:
                    path, anchor = location.split('#')

                    if anchor in identifiers_without_redirects:
                        identifiers_without_redirects.remove(anchor)

                    if location not in client_side_redirects:
                        client_side_redirects[location] = f"{xref_targets[identifier].path}#{identifier}"
                        # use a separate loop variable here: rebinding `identifier` would
                        # clobber the outer loop variable for the remaining locations
                        for other_identifier, xref_target in xref_targets.items():
                            if xref_target.path == path and anchor == other_identifier:
                                conflicting_anchors.add(anchor)
                    else:
                        divergent_redirects.add(location)
                else:
                    if location not in server_side_redirects:
                        server_side_redirects[location] = xref_targets[identifier].path
                    else:
                        divergent_redirects.add(location)

        if any([
            conflicting_anchors,
            divergent_redirects,
            identifiers_missing_current_outpath,
            identifiers_without_redirects,
            orphan_identifiers
        ]):
            raise RedirectsError(
                conflicting_anchors=conflicting_anchors,
                divergent_redirects=divergent_redirects,
                identifiers_missing_current_outpath=identifiers_missing_current_outpath,
                identifiers_without_redirects=identifiers_without_redirects,
                orphan_identifiers=orphan_identifiers
            )

        self._xref_targets = xref_targets

    def get_client_redirects(self, target: str):
        paths_to_target = {src for src, dest in self.get_server_redirects().items() if dest == target}
        client_redirects = {}
        for locations in self._raw_redirects.values():
            for location in locations[1:]:
                if '#' not in location:
                    continue
                path, anchor = location.split('#')
                if path not in [target, *paths_to_target]:
                    continue
                client_redirects[anchor] = locations[0]
        return client_redirects

    def get_server_redirects(self):
        server_redirects = {}
        for identifier, locations in self._raw_redirects.items():
            for location in locations[1:]:
                if '#' not in location and location not in server_redirects:
                    server_redirects[location] = self._xref_targets[identifier].path
        return server_redirects

    def get_redirect_script(self, target: str) -> str:
        client_redirects = self.get_client_redirects(target)
        return self._redirects_script.replace('REDIRECTS_PLACEHOLDER', json.dumps(client_redirects))
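
The redirects mapping stores, per identifier, the current location first and historical locations after it. Locations with a `#` become client-side redirects and bare paths become server-side redirects. The following is a minimal sketch of that split under simplifying assumptions: `split_redirects` is a hypothetical helper, the sample data is made up, and the current output path is taken from the first list entry rather than from the xref targets that the real class consults.

```python
def split_redirects(raw: dict[str, list[str]]) -> tuple[dict[str, str], dict[str, str]]:
    """Split historical locations into client-side and server-side redirects."""
    client: dict[str, str] = {}
    server: dict[str, str] = {}
    for identifier, locations in raw.items():
        if not locations:
            continue
        current = locations[0]                    # e.g. "index.html#sec-foo"
        current_path = current.split('#', 1)[0]   # e.g. "index.html"
        for location in locations[1:]:
            if '#' in location:
                # an old anchor: has to be resolved in the browser
                client.setdefault(location, current)
            else:
                # an old page: can be redirected by the web server
                server.setdefault(location, current_path)
    return client, server

# Made-up sample data in the shape of a redirects.json file.
sample = {
    "sec-foo": ["index.html#sec-foo", "index.html#sec-old-foo", "old.html"],
}
client, server = split_redirects(sample)
```

The first occurrence of a historical location wins in this sketch, whereas `validate` above reports a repeated location as a divergent redirect.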


@@ -0,0 +1,154 @@
from typing import Tuple
from markdown_it.token import Token
LineSpan = int | Tuple[int, int] | Token
class SrcError(Exception):
    """An error associated with a source file and location."""

    def __init__(
        self,
        *,
        description: str,
        src: str,
        tokens: dict[str, LineSpan] | None = None,
        token: LineSpan | None = None,
    ):
        """Create a new `SrcError`.

        Arguments:

        - `description`: A description of the error.

        - `src`: The source text the `token`s are from.

        - `tokens`: A dictionary from descriptions to `Tokens` (or lines) associated with
          the error.

          The tokens are used for their source location.

          A location like ` at lines 6-9` will be added to the description.

          If the description is empty, the location will be described as `At
          lines 6-9`.

        - `token`: Shorthand for `tokens={"": token}`.
        """
        self.src = src

        # copy so that the caller's dictionary is never mutated
        tokens = dict(tokens) if tokens else {}
        # compare against None: line number 0 is a valid location but is falsy
        if token is not None:
            tokens[""] = token
        self.tokens = tokens

        self.description = description

        self.message = _src_error_str(src=src, tokens=tokens, description=description)

        super().__init__(self.message)

    def __str__(self) -> str:
        return self.message

def _get_line_span(location: LineSpan) -> Tuple[int, int] | None:
    if isinstance(location, Token):
        if location.map:
            return (location.map[0], location.map[1])
        else:
            return None
    elif isinstance(location, int):
        return (location, location + 1)
    else:
        return location

def _src_error_str(*, src: str, tokens: dict[str, LineSpan], description: str) -> str:
    """Python exceptions are a bit goofy and need a `message` string attribute
    right away, so we basically need a way to generate the string before we
    actually finish `__init__`.
    """

    result = [description]

    src_lines = src.splitlines()

    for description, token in tokens.items():
        result.append("\n\n\x1b[33m")
        if description:
            result.append(description)
            result.append(" at ")
        else:
            result.append("At ")

        maybe_span = _get_line_span(token)
        if not maybe_span:
            result.append("unknown location\x1b[0m")
            continue

        start, end = maybe_span
        # Note: `end` is exclusive, so single-line spans are represented as
        # `(n, n+1)`.
        if start == end - 1:
            result.append("line ")
            result.append(str(start + 1))
        else:
            result.append("lines ")
            result.append(str(start + 1))
            result.append("-")
            result.append(str(end))

        result.append(":\x1b[0m\n")
        result.append(src_excerpt(src_lines=src_lines, start=start, end=end))

    return "".join(result)

def src_excerpt(
    *, src_lines: list[str], start: int, end: int, context: int = 3, max_lines: int = 20
) -> str:
    output = []

    def clamp_line(line_num: int) -> int:
        return max(0, min(len(src_lines), line_num))

    def add_line(line_num: int, *, is_context: bool) -> None:
        # Lines start with the line number, dimmed.
        prefix = "\x1b[2m\x1b[37m" + format(line_num + 1, " 4d") + "\x1b[0m"
        # Context lines are prefixed with a dotted line, non-context lines are
        # prefixed with a bold yellow line.
        if is_context:
            # Note: No reset here because context lines are dimmed.
            prefix += " \x1b[2m\x1b[37m┆ "
        else:
            prefix += " \x1b[1m\x1b[33m┃\x1b[0m "
        output.append(prefix + src_lines[line_num] + "\x1b[0m")

    def add_lines(start: int, end: int, is_context: bool) -> None:
        for i in range(clamp_line(start), clamp_line(end)):
            add_line(i, is_context=is_context)

    if end - start > max_lines:
        # If we have more than `max_lines` in the range, show a `...` in the middle.
        half_max_lines = max_lines // 2
        add_lines(start - context, start, is_context=True)
        add_lines(start, start + half_max_lines, is_context=False)
        output.append(" \x1b[2m\x1b[37m...\x1b[0m")
        add_lines(end - half_max_lines, end, is_context=False)
        add_lines(end, end + context, is_context=True)
    else:
        add_lines(start - context, start, is_context=True)
        add_lines(start, end, is_context=False)
        add_lines(end, end + context, is_context=True)

    return "\n".join(output)


@@ -0,0 +1,19 @@
from collections.abc import Sequence
from enum import Enum
from typing import Callable, Optional, NamedTuple

from markdown_it.token import Token

OptionLoc = str | dict[str, str]
Option = dict[str, str | dict[str, str] | list[OptionLoc]]

class RenderedOption(NamedTuple):
    loc: list[str]
    lines: list[str]
    links: Optional[list[str]] = None

RenderFn = Callable[[Token, Sequence[Token], int], str]

class AnchorStyle(Enum):
    NONE = "none"
    LEGACY = "legacy"


@@ -0,0 +1,21 @@
from typing import Any

_frozen_classes: dict[type, type] = {}

# make a derived class freezable (ie, disallow modifications).
# we do this by changing the class of an instance at runtime when freeze()
# is called, providing a derived class that is exactly the same except
# for a __setattr__ that raises an error when called. this beats having
# a field for frozenness and an unconditional __setattr__ that checks this
# field because it does not insert anything into the class dict.
class Freezeable:
    def freeze(self) -> None:
        cls = type(self)
        if not (frozen := _frozen_classes.get(cls)):
            def __setattr__(instance: Any, n: str, v: Any) -> None:
                raise TypeError(f'{cls.__name__} is frozen')
            frozen = type(cls.__name__, (cls,), {
                '__setattr__': __setattr__,
            })
            _frozen_classes[cls] = frozen
        self.__class__ = frozen
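
The class-swapping approach described in the comment above can be tried in isolation. The sketch below repeats the `Freezeable` definition so that it runs on its own. The `Settings` class and its `revision` attribute are hypothetical and only serve as an illustration.

```python
from typing import Any

_frozen_classes: dict[type, type] = {}

class Freezeable:
    def freeze(self) -> None:
        cls = type(self)
        if not (frozen := _frozen_classes.get(cls)):
            def __setattr__(instance: Any, n: str, v: Any) -> None:
                raise TypeError(f'{cls.__name__} is frozen')
            # derived class with the same name, differing only in __setattr__
            frozen = type(cls.__name__, (cls,), {'__setattr__': __setattr__})
            _frozen_classes[cls] = frozen
        self.__class__ = frozen

# Hypothetical subclass used only for this illustration.
class Settings(Freezeable):
    def __init__(self, revision: str) -> None:
        self.revision = revision

settings = Settings("abc")
settings.revision = "def"      # still mutable before freeze()
settings.freeze()

error = None
try:
    settings.revision = "ghi"  # rejected by the frozen subclass
except TypeError as e:
    error = str(e)
```

The frozen instance is still an instance of `Settings`, so `isinstance` checks keep working, and attributes set before freezing stay readable.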


@@ -0,0 +1,18 @@
[project]
name = "nixos-render-docs"
version = "0.0"
description = "Renderer for NixOS manual and option docs"
classifiers = [
"Programming Language :: Python :: 3",
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
]
[project.scripts]
nixos-render-docs = "nixos_render_docs:main"
[build-system]
requires = ["setuptools"]
[tool.setuptools.package-data]
nixos_render_docs = ["redirects.js"]


@@ -0,0 +1,62 @@
sample1 = """\
:::: {.warning}
foo
::: {.note}
nested
:::
::::
[
multiline
](link)
{manpage}`man(1)` reference
[some [nested]{#a} anchors]{#b}
*emph* **strong** *nesting emph **and strong** and `code`*
- wide bullet
- list
1. wide ordered
2. list
- narrow bullet
- list
1. narrow ordered
2. list
> quotes
>> with *nesting*
>>
>> nested code block
>
> - and lists
> - ```
> containing code
> ```
>
> and more quote
100. list starting at 100
1. goes on
deflist
: > with a quote
> and stuff
code block
```
fenced block
```
text
more stuff in same deflist
: foo
"""


@@ -0,0 +1,17 @@
{
"services.frobnicator.types.<name>.enable": {
"declarations": [
"nixos/modules/services/frobnicator.nix"
],
"description": "Whether to enable the frobnication of this (`<name>`) type.",
"loc": [
"services",
"frobnicator",
"types",
"<name>",
"enable"
],
"readOnly": false,
"type": "boolean"
}
}


@@ -0,0 +1,13 @@
## services\.frobnicator\.types\.\<name>\.enable
Whether to enable the frobnication of this (` <name> `) type\.
*Type:*
boolean
*Declared by:*
- [\<nixpkgs/nixos/modules/services/frobnicator\.nix>](https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/services/frobnicator.nix)


@@ -0,0 +1,13 @@
## services\.frobnicator\.types\.\<name>\.enable {#opt-services.frobnicator.types._name_.enable}
Whether to enable the frobnication of this (` <name> `) type\.
*Type:*
boolean
*Declared by:*
- [\<nixpkgs/nixos/modules/services/frobnicator\.nix>](https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/services/frobnicator.nix)


@@ -0,0 +1,145 @@
import nixos_render_docs as nrd
from sample_md import sample1
class Converter(nrd.md.Converter[nrd.asciidoc.AsciiDocRenderer]):
def __init__(self, manpage_urls: dict[str, str]):
super().__init__()
self._renderer = nrd.asciidoc.AsciiDocRenderer(manpage_urls)
def test_lists() -> None:
c = Converter({})
# attaching to the nth ancestor list requires n newlines before the +
assert c._render("""\
- a
b
- c
- d
- e
1
f
""") == """\
[]
* {empty}a
+
b
* {empty}c
+
[options="compact"]
** {empty}d
+
[]
** {empty}e
+
1
+
f
"""
def test_full() -> None:
c = Converter({ 'man(1)': 'http://example.org' })
assert c._render(sample1) == """\
[WARNING]
====
foo
[NOTE]
=====
nested
=====
====
link:link[ multiline ]
link:http://example.org[man(1)] reference
[[b]]some [[a]]nested anchors
__emph__ **strong** __nesting emph **and strong** and ``code``__
[]
* {empty}wide bullet
* {empty}list
[]
. {empty}wide ordered
. {empty}list
[options="compact"]
* {empty}narrow bullet
* {empty}list
[options="compact"]
. {empty}narrow ordered
. {empty}list
[quote]
====
quotes
[quote]
=====
with __nesting__
----
nested code block
----
=====
[options="compact"]
* {empty}and lists
* {empty}
+
----
containing code
----
and more quote
====
[start=100,options="compact"]
. {empty}list starting at 100
. {empty}goes on
[]
deflist:: {empty}
+
[quote]
=====
with a quote and stuff
=====
+
----
code block
----
+
----
fenced block
----
+
text
more stuff in same deflist:: {empty}foo
"""


@@ -0,0 +1,92 @@
from pathlib import Path
from markdown_it.token import Token
from nixos_render_docs.manual import HTMLConverter, HTMLParameters
from nixos_render_docs.md import Converter
auto_id_prefix="TEST_PREFIX"
def set_prefix(token: Token, ident: str) -> None:
token.attrs["id"] = f"{auto_id_prefix}-{ident}"
def test_auto_id_prefix_simple() -> None:
md = HTMLConverter("1.0.0", HTMLParameters("", [], [], 2, 2, 2, Path("")), {})
src = f"""
# title
## subtitle
"""
tokens = Converter()._parse(src)
md._handle_headings(tokens, src=src, on_heading=set_prefix)
assert [
{**token.attrs, "tag": token.tag}
for token in tokens
if token.type == "heading_open"
] == [
{"id": "TEST_PREFIX-1", "tag": "h1"},
{"id": "TEST_PREFIX-1.1", "tag": "h2"}
]
def test_auto_id_prefix_repeated() -> None:
md = HTMLConverter("1.0.0", HTMLParameters("", [], [], 2, 2, 2, Path("")), {})
src = f"""
# title
## subtitle
# title2
## subtitle2
"""
tokens = Converter()._parse(src)
md._handle_headings(tokens, src=src, on_heading=set_prefix)
assert [
{**token.attrs, "tag": token.tag}
for token in tokens
if token.type == "heading_open"
] == [
{"id": "TEST_PREFIX-1", "tag": "h1"},
{"id": "TEST_PREFIX-1.1", "tag": "h2"},
{"id": "TEST_PREFIX-2", "tag": "h1"},
{"id": "TEST_PREFIX-2.1", "tag": "h2"},
]
def test_auto_id_prefix_maximum_nested() -> None:
md = HTMLConverter("1.0.0", HTMLParameters("", [], [], 2, 2, 2, Path("")), {})
src = f"""
# h1
## h2
### h3
#### h4
##### h5
###### h6
## h2.2
"""
tokens = Converter()._parse(src)
md._handle_headings(tokens, src=src, on_heading=set_prefix)
assert [
{**token.attrs, "tag": token.tag}
for token in tokens
if token.type == "heading_open"
] == [
{"id": "TEST_PREFIX-1", "tag": "h1"},
{"id": "TEST_PREFIX-1.1", "tag": "h2"},
{"id": "TEST_PREFIX-1.1.1", "tag": "h3"},
{"id": "TEST_PREFIX-1.1.1.1", "tag": "h4"},
{"id": "TEST_PREFIX-1.1.1.1.1", "tag": "h5"},
{"id": "TEST_PREFIX-1.1.1.1.1.1", "tag": "h6"},
{"id": "TEST_PREFIX-1.2", "tag": "h2"},
]


@@ -0,0 +1,99 @@
import nixos_render_docs as nrd
from sample_md import sample1
from typing import Mapping
class Converter(nrd.md.Converter[nrd.commonmark.CommonMarkRenderer]):
def __init__(self, manpage_urls: Mapping[str, str]):
super().__init__()
self._renderer = nrd.commonmark.CommonMarkRenderer(manpage_urls)
# NOTE: in these tests we represent trailing spaces by ` ` and replace them with real space later,
# since a number of editors will strip trailing whitespace on save and that would break the tests.
def test_indented_fence() -> None:
c = Converter({})
s = """\
> - ```foo
> thing
>      
> rest
> ```\
""".replace(' ', ' ')
assert c._render(s) == s
def test_full() -> None:
c = Converter({ 'man(1)': 'http://example.org' })
assert c._render(sample1) == """\
**Warning:** foo
**Note:** nested
[
multiline
](link)
[` man(1) `](http://example.org) reference
some nested anchors
*emph* **strong** *nesting emph **and strong** and ` code `*
- wide bullet
- list
1. wide ordered
2. list
- narrow bullet
- list
1. narrow ordered
2. list
> quotes
> 
> > with *nesting*
> > 
> > ```
> > nested code block
> > ```
> 
> - and lists
> - ```
> containing code
> ```
> 
> and more quote
100. list starting at 100
101. goes on
- *deflist*
   
> with a quote
> and stuff
   
```
code block
```
   
```
fenced block
```
   
text
- *more stuff in same deflist*
   
foo""".replace(' ', ' ')
def test_images() -> None:
c = Converter({})
assert c._render("![*alt text*](foo \"title \\\"quoted\\\" text\")") == (
"![*alt text*](foo \"title \\\"quoted\\\" text\")"
)


@@ -0,0 +1,104 @@
import nixos_render_docs as nrd
from markdown_it.token import Token
class Converter(nrd.md.Converter[nrd.html.HTMLRenderer]):
# actual renderer doesn't matter, we're just parsing.
def __init__(self, manpage_urls: dict[str, str]) -> None:
super().__init__()
self._renderer = nrd.html.HTMLRenderer(manpage_urls, {})
def test_heading_id_absent() -> None:
c = Converter({})
assert c._parse("# foo") == [
Token(type='heading_open', tag='h1', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='#', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='foo', markup='', info='', meta={}, block=False, hidden=False)
],
content='foo', markup='', info='', meta={}, block=True, hidden=False),
Token(type='heading_close', tag='h1', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='#', info='', meta={}, block=True, hidden=False)
]
def test_heading_id_present() -> None:
c = Converter({})
assert c._parse("# foo {#foo}\n## bar { #bar}\n### bal { #bal} ") == [
Token(type='heading_open', tag='h1', nesting=1, attrs={'id': 'foo'}, map=[0, 1], level=0,
children=None, content='', markup='#', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='foo {#foo}', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='foo', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='heading_close', tag='h1', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='#', info='', meta={}, block=True, hidden=False),
Token(type='heading_open', tag='h2', nesting=1, attrs={'id': 'bar'}, map=[1, 2], level=0,
children=None, content='', markup='##', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[1, 2], level=1,
content='bar { #bar}', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='bar', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='heading_close', tag='h2', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='##', info='', meta={}, block=True, hidden=False),
Token(type='heading_open', tag='h3', nesting=1, attrs={'id': 'bal'}, map=[2, 3], level=0,
children=None, content='', markup='###', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[2, 3], level=1,
content='bal { #bal}', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='bal', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='heading_close', tag='h3', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='###', info='', meta={}, block=True, hidden=False)
]
def test_heading_id_incomplete() -> None:
c = Converter({})
assert c._parse("# foo {#}") == [
Token(type='heading_open', tag='h1', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='#', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='foo {#}', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='foo {#}', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='heading_close', tag='h1', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='#', info='', meta={}, block=True, hidden=False)
]
def test_heading_id_double() -> None:
c = Converter({})
assert c._parse("# foo {#a} {#b}") == [
Token(type='heading_open', tag='h1', nesting=1, attrs={'id': 'b'}, map=[0, 1], level=0,
children=None, content='', markup='#', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='foo {#a} {#b}', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='foo {#a}', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='heading_close', tag='h1', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='#', info='', meta={}, block=True, hidden=False)
]
def test_heading_id_suffixed() -> None:
c = Converter({})
assert c._parse("# foo {#a} s") == [
Token(type='heading_open', tag='h1', nesting=1, attrs={}, map=[0, 1], level=0,
children=None, content='', markup='#', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='foo {#a} s', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='foo {#a} s', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='heading_close', tag='h1', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='#', info='', meta={}, block=True, hidden=False)
]


@@ -0,0 +1,264 @@
import nixos_render_docs as nrd
import pytest
import textwrap
from sample_md import sample1
class Renderer(nrd.html.HTMLRenderer):
def _pull_image(self, src: str) -> str:
return src
class Converter(nrd.md.Converter[nrd.html.HTMLRenderer]):
def __init__(self, manpage_urls: dict[str, str], xrefs: dict[str, nrd.manual_structure.XrefTarget]):
super().__init__()
self._renderer = Renderer(manpage_urls, xrefs)
def unpretty(s: str) -> str:
return "".join(map(str.strip, s.splitlines())).replace('', ' ').replace('', '\n')
def test_lists_styles() -> None:
# nested lists rotate through a number of list styles
c = Converter({}, {})
assert c._render("- - - - foo") == unpretty("""
<div class="itemizedlist"><ul class="itemizedlist compact" style="list-style-type: disc;">
<li class="listitem">
<div class="itemizedlist"><ul class="itemizedlist compact" style="list-style-type: circle;">
<li class="listitem">
<div class="itemizedlist"><ul class="itemizedlist compact" style="list-style-type: square;">
<li class="listitem">
<div class="itemizedlist"><ul class="itemizedlist compact" style="list-style-type: disc;">
<li class="listitem"><p>foo</p></li>
</ul></div>
</li>
</ul></div>
</li>
</ul></div>
</li>
</ul></div>
""")
assert c._render("1. 1. 1. 1. 1. 1. foo") == unpretty("""
<div class="orderedlist"><ol class="orderedlist compact" type="1">
<li class="listitem">
<div class="orderedlist"><ol class="orderedlist compact" type="a">
<li class="listitem">
<div class="orderedlist"><ol class="orderedlist compact" type="i">
<li class="listitem">
<div class="orderedlist"><ol class="orderedlist compact" type="A">
<li class="listitem">
<div class="orderedlist"><ol class="orderedlist compact" type="I">
<li class="listitem">
<div class="orderedlist"><ol class="orderedlist compact" type="1">
<li class="listitem"><p>foo</p></li>
</ol></div>
</li>
</ol></div>
</li>
</ol></div>
</li>
</ol></div>
</li>
</ol></div>
</li>
</ol></div>
""")

def test_xrefs() -> None:
    # cross-references resolve against the xref table; unknown ids raise an error
    c = Converter({}, {
        'foo': nrd.manual_structure.XrefTarget('foo', '<hr/>', 'toc1', 'title1', 'index.html'),
        'bar': nrd.manual_structure.XrefTarget('bar', '<br/>', 'toc2', 'title2', 'index.html', True),
    })
    assert c._render("[](#foo)") == '<p><a class="xref" href="index.html#foo" title="title1" ><hr/></a></p>'
    assert c._render("[](#bar)") == '<p><a class="xref" href="index.html" title="title2" ><br/></a></p>'
    with pytest.raises(nrd.html.UnresolvedXrefError) as exc:
        c._render("[](#baz)")
    assert exc.value.args[0] == 'bad local reference, id #baz not known'

def test_images() -> None:
    c = Converter({}, {})
    assert c._render("![*alt text*](foo \"title text\")") == unpretty("""
<p>
<div class="mediaobject">
<img src="foo" alt="*alt text*" title="title text" />
</div>
</p>
""")

def test_tables() -> None:
    c = Converter({}, {})
    assert c._render(textwrap.dedent("""
| d | l | m | r |
|---|:--|:-:|--:|
| a | b | c | d |
""")) == unpretty("""
<div class="informaltable">
<table class="informaltable" border="1">
<colgroup>
<col align="left" />
<col align="left" />
<col align="center" />
<col align="right" />
</colgroup>
<thead>
<tr>
<th align="left">d</th>
<th align="left">l</th>
<th align="center">m</th>
<th align="right">r</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">a</td>
<td align="left">b</td>
<td align="center">c</td>
<td align="right">d</td>
</tr>
</tbody>
</table>
</div>
""")

def test_footnotes() -> None:
    c = Converter({}, {
        "bar": nrd.manual_structure.XrefTarget("bar", "", None, None, ""),
        "bar.__back.0": nrd.manual_structure.XrefTarget("bar.__back.0", "", None, None, ""),
        "bar.__back.1": nrd.manual_structure.XrefTarget("bar.__back.1", "", None, None, ""),
    })
    assert c._render(textwrap.dedent("""
foo [^bar] baz [^bar]
[^bar]: note
""")) == unpretty("""
<p>
foo <a href="#bar" class="footnote" id="bar.__back.0"><sup class="footnote">[1]</sup></a>␣
baz <a href="#bar" class="footnote" id="bar.__back.1"><sup class="footnote">[1]</sup></a>
</p>
<div class="footnotes">
<br />
<hr style="width:100; text-align:left;margin-left: 0" />
<div id="bar" class="footnote">
<p>
note<a href="#bar.__back.0" class="para"><sup class="para">[1]</sup></a>
<a href="#bar.__back.1" class="para"><sup class="para">[1]</sup></a>
</p>
</div>
</div>
""")

def test_full() -> None:
    c = Converter({ 'man(1)': 'http://example.org' }, {})
    assert c._render(sample1) == unpretty("""
<div class="warning">
<h3 class="title">Warning</h3>
<p>foo</p>
<div class="note">
<h3 class="title">Note</h3>
<p>nested</p>
</div>
</div>
<p>
<a class="link" href="link" target="_top">↵
multiline↵
</a>
</p>
<p>
<a class="link" href="http://example.org" target="_top">
<span class="citerefentry"><span class="refentrytitle">man</span>(1)</span>
</a> reference
</p>
<p><span id="b"></span>some <span id="a"></span>nested anchors</p>
<p>
<span class="emphasis"><em>emph</em></span>␣
<span class="strong"><strong>strong</strong></span>␣
<span class="emphasis"><em>nesting emph <span class="strong"><strong>and strong</strong></span>␣
and <code class="literal">code</code></em></span>
</p>
<div class="itemizedlist">
<ul class="itemizedlist " style="list-style-type: disc;">
<li class="listitem"><p>wide bullet</p></li>
<li class="listitem"><p>list</p></li>
</ul>
</div>
<div class="orderedlist">
<ol class="orderedlist " type="1">
<li class="listitem"><p>wide ordered</p></li>
<li class="listitem"><p>list</p></li>
</ol>
</div>
<div class="itemizedlist">
<ul class="itemizedlist compact" style="list-style-type: disc;">
<li class="listitem"><p>narrow bullet</p></li>
<li class="listitem"><p>list</p></li>
</ul>
</div>
<div class="orderedlist">
<ol class="orderedlist compact" type="1">
<li class="listitem"><p>narrow ordered</p></li>
<li class="listitem"><p>list</p></li>
</ol>
</div>
<div class="blockquote">
<blockquote class="blockquote">
<p>quotes</p>
<div class="blockquote">
<blockquote class="blockquote">
<p>with <span class="emphasis"><em>nesting</em></span></p>
<pre>
<code class="programlisting">
nested code block↵
</code>
</pre>
</blockquote>
</div>
<div class="itemizedlist">
<ul class="itemizedlist compact" style="list-style-type: disc;">
<li class="listitem"><p>and lists</p></li>
<li class="listitem">
<pre>
<code class="programlisting">
containing code↵
</code>
</pre>
</li>
</ul>
</div>
<p>and more quote</p>
</blockquote>
</div>
<div class="orderedlist">
<ol class="orderedlist compact" start="100" type="1">
<li class="listitem"><p>list starting at 100</p></li>
<li class="listitem"><p>goes on</p></li>
</ol>
</div>
<div class="variablelist">
<dl class="variablelist">
<dt><span class="term">deflist</span></dt>
<dd>
<div class="blockquote">
<blockquote class="blockquote">
<p>
with a quote↵
and stuff
</p>
</blockquote>
</div>
<pre>
<code class="programlisting">
code block↵
</code>
</pre>
<pre>
<code class="programlisting">
fenced block↵
</code>
</pre>
<p>text</p>
</dd>
<dt><span class="term">more stuff in same deflist</span></dt>
<dd>
<p>foo</p>
</dd>
</dl>
</div>""")
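
The `unpretty` helper used by the HTML tests above lets expected output be written with readable indentation and then collapsed into the exact string the renderer should produce. A minimal standalone sketch of the same idea, assuming `␣` stands for a significant space and `↵` for a significant newline (as the expected strings above use them); the sample markup is made up for illustration and is not taken from the test suite:

```python
def unpretty(s: str) -> str:
    """Collapse a pretty-printed expectation into the string to compare against."""
    # Strip each line and join without separators, so layout whitespace disappears.
    joined = "".join(line.strip() for line in s.splitlines())
    # Turn the visible markers back into the whitespace they stand for.
    return joined.replace("␣", " ").replace("↵", "\n")


# Illustrative input only: indentation is dropped, marked whitespace survives.
expected = unpretty("""
    <p>
        <em>a</em>␣
        <strong>b</strong>↵
    </p>
""")
print(repr(expected))
```

Writing the markers explicitly keeps the few significant spaces and newlines visible in the test source, while everything else can be indented freely.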

@@ -0,0 +1,188 @@
import nixos_render_docs as nrd
import pytest
from markdown_it.token import Token
class Converter(nrd.md.Converter[nrd.html.HTMLRenderer]):
    # actual renderer doesn't matter, we're just parsing.
    def __init__(self, manpage_urls: dict[str, str]) -> None:
        super().__init__()
        self._renderer = nrd.html.HTMLRenderer(manpage_urls, {})

@pytest.mark.parametrize("ordered", [True, False])
def test_list_wide(ordered: bool) -> None:
    t, tag, m, e1, e2, i1, i2 = (
        ("ordered", "ol", ".", "1.", "2.", "1", "2") if ordered else ("bullet", "ul", "-", "-", "-", "", "")
    )
    c = Converter({})
    meta = { 'end': int(e2[:-1]) } if ordered else {}
    meta['compact'] = False
    assert c._parse(f"{e1} a\n\n{e2} b") == [
Token(type=f'{t}_list_open', tag=tag, nesting=1, attrs={}, map=[0, 3], level=0,
children=None, content='', markup=m, info='', meta=meta, block=True, hidden=False),
Token(type='list_item_open', tag='li', nesting=1, attrs={}, map=[0, 2], level=1, children=None,
content='', markup=m, info=i1, meta={}, block=True, hidden=False),
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=2, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=3,
content='a', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='a', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=2, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='list_item_close', tag='li', nesting=-1, attrs={}, map=None, level=1, children=None,
content='', markup=m, info='', meta={}, block=True, hidden=False),
Token(type='list_item_open', tag='li', nesting=1, attrs={}, map=[2, 3], level=1, children=None,
content='', markup=m, info=i2, meta={}, block=True, hidden=False),
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[2, 3], level=2, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[2, 3], level=3,
content='b', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='b', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=2, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='list_item_close', tag='li', nesting=-1, attrs={}, map=None, level=1, children=None,
content='', markup=m, info='', meta={}, block=True, hidden=False),
Token(type=f'{t}_list_close', tag=tag, nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup=m, info='', meta={}, block=True, hidden=False)
]

@pytest.mark.parametrize("ordered", [True, False])
def test_list_narrow(ordered: bool) -> None:
    t, tag, m, e1, e2, i1, i2 = (
        ("ordered", "ol", ".", "1.", "2.", "1", "2") if ordered else ("bullet", "ul", "-", "-", "-", "", "")
    )
    c = Converter({})
    meta = { 'end': int(e2[:-1]) } if ordered else {}
    meta['compact'] = True
    assert c._parse(f"{e1} a\n{e2} b") == [
Token(type=f'{t}_list_open', tag=tag, nesting=1, attrs={}, map=[0, 2], level=0,
children=None, content='', markup=m, info='', meta=meta, block=True, hidden=False),
Token(type='list_item_open', tag='li', nesting=1, attrs={}, map=[0, 1], level=1, children=None,
content='', markup=m, info=i1, meta={}, block=True, hidden=False),
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=2, children=None,
content='', markup='', info='', meta={}, block=True, hidden=True),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=3,
content='a', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='a', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=2, children=None,
content='', markup='', info='', meta={}, block=True, hidden=True),
Token(type='list_item_close', tag='li', nesting=-1, attrs={}, map=None, level=1, children=None,
content='', markup=m, info='', meta={}, block=True, hidden=False),
Token(type='list_item_open', tag='li', nesting=1, attrs={}, map=[1, 2], level=1, children=None,
content='', markup=m, info=i2, meta={}, block=True, hidden=False),
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[1, 2], level=2, children=None,
content='', markup='', info='', meta={}, block=True, hidden=True),
Token(type='inline', tag='', nesting=0, attrs={}, map=[1, 2], level=3,
content='b', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='b', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=2, children=None,
content='', markup='', info='', meta={}, block=True, hidden=True),
Token(type='list_item_close', tag='li', nesting=-1, attrs={}, map=None, level=1, children=None,
content='', markup=m, info='', meta={}, block=True, hidden=False),
Token(type=f'{t}_list_close', tag=tag, nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup=m, info='', meta={}, block=True, hidden=False)
]
    assert c._parse(f"{e1} - a\n{e2} b") == [
Token(type=f'{t}_list_open', tag=tag, nesting=1, attrs={}, map=[0, 2], level=0,
children=None, content='', markup=m, info='', meta=meta, block=True, hidden=False),
Token(type='list_item_open', tag='li', nesting=1, attrs={}, map=[0, 1], level=1, children=None,
content='', markup=m, info=i1, meta={}, block=True, hidden=False),
Token(type='bullet_list_open', tag='ul', nesting=1, attrs={}, map=[0, 1], level=2,
children=None, content='', markup='-', info='', meta={'compact': True}, block=True, hidden=False),
Token(type='list_item_open', tag='li', nesting=1, attrs={}, map=[0, 1], level=3, children=None,
content='', markup='-', info='', meta={}, block=True, hidden=False),
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=4, children=None,
content='', markup='', info='', meta={}, block=True, hidden=True),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=5,
content='a', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='a', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=4, children=None,
content='', markup='', info='', meta={}, block=True, hidden=True),
Token(type='list_item_close', tag='li', nesting=-1, attrs={}, map=None, level=3, children=None,
content='', markup='-', info='', meta={}, block=True, hidden=False),
Token(type='bullet_list_close', tag='ul', nesting=-1, attrs={}, map=None, level=2, children=None,
content='', markup='-', info='', meta={}, block=True, hidden=False),
Token(type='list_item_close', tag='li', nesting=-1, attrs={}, map=None, level=1, children=None,
content='', markup=m, info='', meta={}, block=True, hidden=False),
Token(type='list_item_open', tag='li', nesting=1, attrs={}, map=[1, 2], level=1, children=None,
content='', markup=m, info=i2, meta={}, block=True, hidden=False),
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[1, 2], level=2, children=None,
content='', markup='', info='', meta={}, block=True, hidden=True),
Token(type='inline', tag='', nesting=0, attrs={}, map=[1, 2], level=3,
content='b', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='b', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=2, children=None,
content='', markup='', info='', meta={}, block=True, hidden=True),
Token(type='list_item_close', tag='li', nesting=-1, attrs={}, map=None, level=1, children=None,
content='', markup=m, info='', meta={}, block=True, hidden=False),
Token(type=f'{t}_list_close', tag=tag, nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup=m, info='', meta={}, block=True, hidden=False)
]
    assert c._parse(f"{e1} - a\n{e2} - b") == [
Token(type=f'{t}_list_open', tag=tag, nesting=1, attrs={}, map=[0, 2], level=0,
children=None, content='', markup=m, info='', meta=meta, block=True, hidden=False),
Token(type='list_item_open', tag='li', nesting=1, attrs={}, map=[0, 1], level=1, children=None,
content='', markup=m, info=i1, meta={}, block=True, hidden=False),
Token(type='bullet_list_open', tag='ul', nesting=1, attrs={}, map=[0, 1], level=2,
children=None, content='', markup='-', info='', meta={'compact': True}, block=True, hidden=False),
Token(type='list_item_open', tag='li', nesting=1, attrs={}, map=[0, 1], level=3, children=None,
content='', markup='-', info='', meta={}, block=True, hidden=False),
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=4, children=None,
content='', markup='', info='', meta={}, block=True, hidden=True),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=5,
content='a', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='a', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=4, children=None,
content='', markup='', info='', meta={}, block=True, hidden=True),
Token(type='list_item_close', tag='li', nesting=-1, attrs={}, map=None, level=3, children=None,
content='', markup='-', info='', meta={}, block=True, hidden=False),
Token(type='bullet_list_close', tag='ul', nesting=-1, attrs={}, map=None, level=2, children=None,
content='', markup='-', info='', meta={}, block=True, hidden=False),
Token(type='list_item_close', tag='li', nesting=-1, attrs={}, map=None, level=1, children=None,
content='', markup=m, info='', meta={}, block=True, hidden=False),
Token(type='list_item_open', tag='li', nesting=1, attrs={}, map=[1, 2], level=1, children=None,
content='', markup=m, info=i2, meta={}, block=True, hidden=False),
Token(type='bullet_list_open', tag='ul', nesting=1, attrs={}, map=[1, 2], level=2,
children=None, content='', markup='-', info='', meta={'compact': True}, block=True, hidden=False),
Token(type='list_item_open', tag='li', nesting=1, attrs={}, map=[1, 2], level=3, children=None,
content='', markup='-', info='', meta={}, block=True, hidden=False),
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[1, 2], level=4, children=None,
content='', markup='', info='', meta={}, block=True, hidden=True),
Token(type='inline', tag='', nesting=0, attrs={}, map=[1, 2], level=5,
content='b', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='b', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=4, children=None,
content='', markup='', info='', meta={}, block=True, hidden=True),
Token(type='list_item_close', tag='li', nesting=-1, attrs={}, map=None, level=3, children=None,
content='', markup='-', info='', meta={}, block=True, hidden=False),
Token(type='bullet_list_close', tag='ul', nesting=-1, attrs={}, map=None, level=2, children=None,
content='', markup='-', info='', meta={}, block=True, hidden=False),
Token(type='list_item_close', tag='li', nesting=-1, attrs={}, map=None, level=1, children=None,
content='', markup=m, info='', meta={}, block=True, hidden=False),
Token(type=f'{t}_list_close', tag=tag, nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup=m, info='', meta={}, block=True, hidden=False)
]

@@ -0,0 +1,169 @@
import nixos_render_docs as nrd
from sample_md import sample1
from typing import Mapping
class Converter(nrd.md.Converter[nrd.manpage.ManpageRenderer]):
    def __init__(self, manpage_urls: Mapping[str, str], options_by_id: dict[str, str] = {}):
        super().__init__()
        self._renderer = nrd.manpage.ManpageRenderer(manpage_urls, options_by_id)

def test_inline_code() -> None:
    c = Converter({})
    assert c._render("1 `x a x` 2") == "1 \\fR\\(oqx a x\\(cq\\fP 2"

def test_fonts() -> None:
    c = Converter({})
    assert c._render("*a **b** c*") == "\\fIa \\fBb\\fI c\\fR"
    assert c._render("*a [1 `2`](3) c*") == "\\fIa \\fB1 \\fR\\(oq2\\(cq\\fP\\fI c\\fR"

def test_expand_link_targets() -> None:
    c = Converter({}, { '#foo1': "bar", "#foo2": "bar" })
    assert (c._render("[a](#foo1) [](#foo2) [b](#bar1) [](#bar2)") ==
            "\\fBa\\fR \\fBbar\\fR \\fBb\\fR \\fB\\fR")

def test_collect_links() -> None:
    c = Converter({}, { '#foo': "bar" })
    c._renderer.link_footnotes = []
    assert c._render("[a](link1) [b](link2)") == "\\fBa\\fR[1]\\fR \\fBb\\fR[2]\\fR"
    assert c._renderer.link_footnotes == ['link1', 'link2']

def test_dedup_links() -> None:
    c = Converter({}, { '#foo': "bar" })
    c._renderer.link_footnotes = []
    assert c._render("[a](link) [b](link)") == "\\fBa\\fR[1]\\fR \\fBb\\fR[1]\\fR"
    assert c._renderer.link_footnotes == ['link']

def test_full() -> None:
    c = Converter({ 'man(1)': 'http://example.org' })
    assert c._render(sample1) == """\
.sp
.RS 4
\\fBWarning\\fP
.br
foo
.sp
.RS 4
\\fBNote\\fP
.br
nested
.RE
.RE
.sp
\\fBmultiline\\fR
.sp
\\fBman\\fP\\fR(1)\\fP reference
.sp
some nested anchors
.sp
\\fIemph\\fR \\fBstrong\\fR \\fInesting emph \\fBand strong\\fI and \\fR\\(oqcode\\(cq\\fP\\fR
.sp
.RS 4
\\h'-2'\\fB\\[u2022]\\fP\\h'1'\\c
wide bullet
.RE
.sp
.RS 4
\\h'-2'\\fB\\[u2022]\\fP\\h'1'\\c
list
.RE
.sp
.RS 4
\\h'-3'\\fB1\\&.\\fP\\h'1'\\c
wide ordered
.RE
.sp
.RS 4
\\h'-3'\\fB2\\&.\\fP\\h'1'\\c
list
.RE
.sp
.RS 4
\\h'-2'\\fB\\[u2022]\\fP\\h'1'\\c
narrow bullet
.RE
.RS 4
\\h'-2'\\fB\\[u2022]\\fP\\h'1'\\c
list
.RE
.sp
.RS 4
\\h'-3'\\fB1\\&.\\fP\\h'1'\\c
narrow ordered
.RE
.RS 4
\\h'-3'\\fB2\\&.\\fP\\h'1'\\c
list
.RE
.sp
.RS 4
\\h'-3'\\fI\\(lq\\(rq\\fP\\h'1'\\c
quotes
.sp
.RS 4
\\h'-3'\\fI\\(lq\\(rq\\fP\\h'1'\\c
with \\fInesting\\fR
.sp
.RS 4
.nf
nested code block
.fi
.RE
.RE
.sp
.RS 4
\\h'-2'\\fB\\[u2022]\\fP\\h'1'\\c
and lists
.RE
.RS 4
\\h'-2'\\fB\\[u2022]\\fP\\h'1'\\c
.sp
.RS 4
.nf
containing code
.fi
.RE
.RE
.sp
and more quote
.RE
.sp
.RS 6
\\h'-5'\\fB100\\&.\\fP\\h'1'\\c
list starting at 100
.RE
.RS 6
\\h'-5'\\fB101\\&.\\fP\\h'1'\\c
goes on
.RE
.RS 4
.PP
deflist
.RS 4
.RS 4
\\h'-3'\\fI\\(lq\\(rq\\fP\\h'1'\\c
with a quote and stuff
.RE
.sp
.RS 4
.nf
code block
.fi
.RE
.sp
.RS 4
.nf
fenced block
.fi
.RE
.sp
text
.RE
.PP
more stuff in same deflist
.RS 4
foo
.RE
.RE"""
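
`test_collect_links` and `test_dedup_links` above show the behaviour expected of the manpage renderer's link footnotes: each distinct target gets the next number in order of first appearance, and a repeated target reuses its number. A minimal sketch of that bookkeeping, assuming a hypothetical `LinkFootnotes` class that is not part of `nixos_render_docs` and may differ from how `ManpageRenderer` does it internally:

```python
class LinkFootnotes:
    """Hypothetical helper: assigns 1-based footnote numbers to link targets."""

    def __init__(self) -> None:
        # Targets in order of first appearance; the index + 1 is the footnote number.
        self.targets: list[str] = []

    def number_for(self, href: str) -> int:
        """Return the footnote number for href, registering it on first use."""
        try:
            # A target seen before keeps the number it was first given.
            return self.targets.index(href) + 1
        except ValueError:
            self.targets.append(href)
            return len(self.targets)


notes = LinkFootnotes()
numbers = [notes.number_for(h) for h in ["link1", "link2", "link1"]]
print(numbers, notes.targets)
```

A list keeps the footnotes in document order, which is the order a footnote section would list them in; the linear lookup is acceptable for the handful of links a manpage typically contains.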

@@ -0,0 +1,41 @@
import json
from pathlib import Path
import pytest
from markdown_it.token import Token
import nixos_render_docs
from nixos_render_docs.options import AnchorStyle

def test_option_headings() -> None:
    c = nixos_render_docs.options.HTMLConverter({}, 'local', 'vars', 'opt-', {})
    with pytest.raises(RuntimeError) as exc:
        c._render("# foo")
    assert exc.value.args[0] == 'md token not supported in options doc'
    assert exc.value.args[1] == Token(
        type='heading_open', tag='h1', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
        content='', markup='#', info='', meta={}, block=True, hidden=False
    )

def test_options_commonmark() -> None:
    c = nixos_render_docs.options.CommonMarkConverter({}, 'local')
    with Path('tests/sample_options_simple.json').open() as f:
        opts = json.load(f)
    assert opts is not None
    with Path('tests/sample_options_simple_default.md').open() as f:
        expected = f.read()
    c.add_options(opts)
    s = c.finalize()
    assert s == expected

def test_options_commonmark_legacy_anchors() -> None:
    c = nixos_render_docs.options.CommonMarkConverter({}, 'local', anchor_style = AnchorStyle.LEGACY, anchor_prefix = 'opt-')
    with Path('tests/sample_options_simple.json').open() as f:
        opts = json.load(f)
    assert opts is not None
    with Path('tests/sample_options_simple_legacy.md').open() as f:
        expected = f.read()
    c.add_options(opts)
    s = c.finalize()
    assert s == expected

@@ -0,0 +1,549 @@
import textwrap
import pytest
from markdown_it.token import Token
import nixos_render_docs as nrd
from nixos_render_docs.src_error import SrcError
class Converter(nrd.md.Converter[nrd.html.HTMLRenderer]):
    # actual renderer doesn't matter, we're just parsing.
    def __init__(self, manpage_urls: dict[str, str]) -> None:
        super().__init__()
        self._renderer = nrd.html.HTMLRenderer(manpage_urls, {})

def test_attr_span_parsing() -> None:
    c = Converter({})
    assert c._parse("[]{#test}") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1, content='[]{#test}',
markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='attr_span_begin', tag='span', nesting=1, attrs={'id': 'test'}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False),
Token(type='attr_span_end', tag='span', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=True, hidden=False)
]
    assert c._parse("[]{.test}") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1, content='[]{.test}',
markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='attr_span_begin', tag='span', nesting=1, attrs={'class': 'test'}, map=None,
level=0, children=None, content='', markup='', info='', meta={}, block=False,
hidden=False),
Token(type='attr_span_end', tag='span', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=True, hidden=False)
]
    assert c._parse("[]{.test1 .test2 #foo .test3 .test4}") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='[]{.test1 .test2 #foo .test3 .test4}',
markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='attr_span_begin', tag='span', nesting=1,
attrs={'class': 'test1 test2 test3 test4', 'id': 'foo'}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False),
Token(type='attr_span_end', tag='span', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=True, hidden=False)
]
    assert c._parse("[]{#a #a}") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='[]{#a #a}', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='[]{#a #a}', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]
    assert c._parse("[]{foo}") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='[]{foo}', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='[]{foo}', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]

def test_attr_span_formatted() -> None:
    c = Converter({})
    assert c._parse("a[b c `d` ***e***]{#test}f") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0,
children=None, content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='a[b c `d` ***e***]{#test}f', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0,
children=None, content='a', markup='', info='', meta={}, block=False, hidden=False),
Token(type='attr_span_begin', tag='span', nesting=1, attrs={'id': 'test'}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=1, children=None,
content='b c ', markup='', info='', meta={}, block=False, hidden=False),
Token(type='code_inline', tag='code', nesting=0, attrs={}, map=None, level=1,
children=None, content='d', markup='`', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=1, children=None,
content=' ', markup='', info='', meta={}, block=False, hidden=False),
Token(type='em_open', tag='em', nesting=1, attrs={}, map=None, level=1, children=None,
content='', markup='*', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=2, children=None,
content='', markup='', info='', meta={}, block=False, hidden=False),
Token(type='strong_open', tag='strong', nesting=1, attrs={}, map=None, level=2,
children=None, content='', markup='**', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=3, children=None,
content='e', markup='', info='', meta={}, block=False, hidden=False),
Token(type='strong_close', tag='strong', nesting=-1, attrs={}, map=None, level=2,
children=None, content='', markup='**', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=2, children=None,
content='', markup='', info='', meta={}, block=False, hidden=False),
Token(type='em_close', tag='em', nesting=-1, attrs={}, map=None, level=1, children=None,
content='', markup='*', info='', meta={}, block=False, hidden=False),
Token(type='attr_span_end', tag='span', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='f', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]

def test_attr_span_in_heading() -> None:
    c = Converter({})
    # inline anchors in headers are allowed, but header attributes should be preferred
    assert c._parse("# foo []{#bar} baz") == [
Token(type='heading_open', tag='h1', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='#', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='foo []{#bar} baz', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='foo ', markup='', info='', meta={}, block=False, hidden=False),
Token(type='attr_span_begin', tag='span', nesting=1, attrs={'id': 'bar'}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False),
Token(type='attr_span_end', tag='span', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content=' baz', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='heading_close', tag='h1', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='#', info='', meta={}, block=True, hidden=False)
]

def test_attr_span_on_links() -> None:
    c = Converter({})
    assert c._parse("[ [a](#bar) ]{#foo}") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1, content='[ [a](#bar) ]{#foo}',
markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='attr_span_begin', tag='span', nesting=1, attrs={'id': 'foo'}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=1, children=None,
content=' ', markup='', info='', meta={}, block=False, hidden=False),
Token(type='link_open', tag='a', nesting=1, attrs={'href': '#bar'}, map=None, level=1,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=2, children=None,
content='a', markup='', info='', meta={}, block=False, hidden=False),
Token(type='link_close', tag='a', nesting=-1, attrs={}, map=None, level=1, children=None,
content='', markup='', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=1, children=None,
content=' ', markup='', info='', meta={}, block=False, hidden=False),
Token(type='attr_span_end', tag='span', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]
def test_attr_span_nested() -> None:
# inline anchors may contain more anchors (even though this is a bit pointless)
c = Converter({})
assert c._parse("[ [a]{#bar} ]{#foo}") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='[ [a]{#bar} ]{#foo}', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='attr_span_begin', tag='span', nesting=1, attrs={'id': 'foo'}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=1, children=None,
content=' ', markup='', info='', meta={}, block=False, hidden=False),
Token(type='attr_span_begin', tag='span', nesting=1, attrs={'id': 'bar'}, map=None, level=1,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=2, children=None,
content='a', markup='', info='', meta={}, block=False, hidden=False),
Token(type='attr_span_end', tag='span', nesting=-1, attrs={}, map=None, level=1,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=1, children=None,
content=' ', markup='', info='', meta={}, block=False, hidden=False),
Token(type='attr_span_end', tag='span', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]
def test_attr_span_escaping() -> None:
c = Converter({})
assert c._parse("\\[a]{#bar}") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='\\[a]{#bar}', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='[a]{#bar}', markup='\\[', info='escape', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]
assert c._parse("\\\\[a]{#bar}") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='\\\\[a]{#bar}', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='\\', markup='\\\\', info='escape', meta={}, block=False, hidden=False),
Token(type='attr_span_begin', tag='span', nesting=1, attrs={'id': 'bar'}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=1, children=None,
content='a', markup='', info='', meta={}, block=False, hidden=False),
Token(type='attr_span_end', tag='span', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]
assert c._parse("\\\\\\[a]{#bar}") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='\\[a]{#bar}', markup='\\\\', info='escape', meta={}, block=False, hidden=False)
],
content='\\\\\\[a]{#bar}', markup='', info='', meta={}, block=True, hidden=False),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]
def test_inline_comment_basic() -> None:
c = Converter({})
assert c._parse("a <!-- foo --><!----> b") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='a <!-- foo --><!----> b', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='a b', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]
assert c._parse("a<!-- b -->") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='a<!-- b -->', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='a', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]

def test_inline_comment_does_not_nest_in_code() -> None:
    c = Converter({})
    assert c._parse("`a<!-- b -->c`") == [
        Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
              content='', markup='', info='', meta={}, block=True, hidden=False),
        Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
              content='`a<!-- b -->c`', markup='', info='', meta={}, block=True, hidden=False,
              children=[
                  Token(type='code_inline', tag='code', nesting=0, attrs={}, map=None, level=0, children=None,
                        content='a<!-- b -->c', markup='`', info='', meta={}, block=False, hidden=False)
              ]),
        Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
              content='', markup='', info='', meta={}, block=True, hidden=False)
    ]

def test_inline_comment_does_not_nest_elsewhere() -> None:
    c = Converter({})
    assert c._parse("*a<!-- b -->c*") == [
        Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
              content='', markup='', info='', meta={}, block=True, hidden=False),
        Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
              content='*a<!-- b -->c*', markup='', info='', meta={}, block=True, hidden=False,
              children=[
                  Token(type='em_open', tag='em', nesting=1, attrs={}, map=None, level=0, children=None,
                        content='', markup='*', info='', meta={}, block=False, hidden=False),
                  Token(type='text', tag='', nesting=0, attrs={}, map=None, level=1, children=None,
                        content='ac', markup='', info='', meta={}, block=False, hidden=False),
                  Token(type='em_close', tag='em', nesting=-1, attrs={}, map=None, level=0, children=None,
                        content='', markup='*', info='', meta={}, block=False, hidden=False)
              ]),
        Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
              content='', markup='', info='', meta={}, block=True, hidden=False)
    ]
def test_inline_comment_can_be_escaped() -> None:
c = Converter({})
assert c._parse("a\\<!-- b -->c") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='a\\<!-- b -->c', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='a<!-- b -->c', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]
assert c._parse("a\\\\<!-- b -->c") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='a\\c', markup='', info='', meta={}, block=False, hidden=False)
],
content='a\\\\<!-- b -->c', markup='', info='', meta={}, block=True, hidden=False),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]
assert c._parse("a\\\\\\<!-- b -->c") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='a\\<!-- b -->c', markup='', info='', meta={}, block=False, hidden=False)
],
content='a\\\\\\<!-- b -->c', markup='', info='', meta={}, block=True, hidden=False),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]

def test_block_comment() -> None:
    c = Converter({})
    assert c._parse("<!-- a -->") == []
    assert c._parse("<!-- a\n-->") == []
    assert c._parse("<!--\na\n-->") == []
    assert c._parse("<!--\n\na\n\n-->") == []
    assert c._parse("<!--\n\n```\n\n\n```\n\n-->") == []
def test_heading_attributes() -> None:
c = Converter({})
assert c._parse("# foo *bar* {#hid}") == [
Token(type='heading_open', tag='h1', nesting=1, attrs={'id': 'hid'}, map=[0, 1], level=0,
children=None, content='', markup='#', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='foo *bar* {#hid}', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='foo ', markup='', info='', meta={}, block=False, hidden=False),
Token(type='em_open', tag='em', nesting=1, attrs={}, map=None, level=0, children=None,
content='', markup='*', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=1, children=None,
content='bar', markup='', info='', meta={}, block=False, hidden=False),
Token(type='em_close', tag='em', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='*', info='', meta={}, block=False, hidden=False),
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='heading_close', tag='h1', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='#', info='', meta={}, block=True, hidden=False)
]
assert c._parse("# foo--bar {#id-with--double-dashes}") == [
Token(type='heading_open', tag='h1', nesting=1, attrs={'id': 'id-with--double-dashes'}, map=[0, 1],
level=0, children=None, content='', markup='#', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='foo--bar {#id-with--double-dashes}', markup='', info='', meta={}, block=True,
hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='foobar', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='heading_close', tag='h1', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='#', info='', meta={}, block=True, hidden=False)
]
def test_admonitions() -> None:
c = Converter({})
assert c._parse("::: {.note}") == [
Token(type='admonition_open', tag='div', nesting=1, attrs={}, map=[0, 1], level=0,
children=None, content='', markup=':::', info=' {.note}', meta={'kind': 'note'}, block=True,
hidden=False),
Token(type='admonition_close', tag='div', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup=':::', info='', meta={}, block=True, hidden=False)
]
assert c._parse("::: {.caution}") == [
Token(type='admonition_open', tag='div', nesting=1, attrs={}, map=[0, 1], level=0,
children=None, content='', markup=':::', info=' {.caution}', meta={'kind': 'caution'},
block=True, hidden=False),
Token(type='admonition_close', tag='div', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup=':::', info='', meta={}, block=True, hidden=False)
]
assert c._parse("::: {.tip}") == [
Token(type='admonition_open', tag='div', nesting=1, attrs={}, map=[0, 1], level=0,
children=None, content='', markup=':::', info=' {.tip}', meta={'kind': 'tip'}, block=True,
hidden=False),
Token(type='admonition_close', tag='div', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup=':::', info='', meta={}, block=True, hidden=False)
]
assert c._parse("::: {.important}") == [
Token(type='admonition_open', tag='div', nesting=1, attrs={}, map=[0, 1], level=0,
children=None, content='', markup=':::', info=' {.important}', meta={'kind': 'important'},
block=True, hidden=False),
Token(type='admonition_close', tag='div', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup=':::', info='', meta={}, block=True, hidden=False)
]
assert c._parse("::: {.warning}") == [
Token(type='admonition_open', tag='div', nesting=1, attrs={}, map=[0, 1], level=0,
children=None, content='', markup=':::', info=' {.warning}', meta={'kind': 'warning'},
block=True, hidden=False),
Token(type='admonition_close', tag='div', nesting=-1, attrs={}, map=None, level=0,
children=None, content='', markup=':::', info='', meta={}, block=True, hidden=False)
]
def test_example() -> None:
c = Converter({})
assert c._parse("::: {.example}\n# foo") == [
Token(type='example_open', tag='div', nesting=1, attrs={}, map=[0, 2], level=0, children=None,
content='', markup=':::', info=' {.example}', meta={}, block=True, hidden=False),
Token(type='example_title_open', tag='h1', nesting=1, attrs={}, map=[1, 2], level=1, children=None,
content='', markup='#', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[1, 2], level=2,
content='foo', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='foo', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='example_title_close', tag='h1', nesting=-1, attrs={}, map=None, level=1, children=None,
content='', markup='#', info='', meta={}, block=True, hidden=False),
Token(type='example_close', tag='div', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]
assert c._parse("::: {#eid .example}\n# foo") == [
Token(type='example_open', tag='div', nesting=1, attrs={'id': 'eid'}, map=[0, 2], level=0,
children=None, content='', markup=':::', info=' {#eid .example}', meta={}, block=True,
hidden=False),
Token(type='example_title_open', tag='h1', nesting=1, attrs={}, map=[1, 2], level=1, children=None,
content='', markup='#', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[1, 2], level=2,
content='foo', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='foo', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='example_title_close', tag='h1', nesting=-1, attrs={}, map=None, level=1, children=None,
content='', markup='#', info='', meta={}, block=True, hidden=False),
Token(type='example_close', tag='div', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]
assert c._parse("::: {.example .note}") == [
Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False),
Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1,
content='::: {.example .note}', markup='', info='', meta={}, block=True, hidden=False,
children=[
Token(type='text', tag='', nesting=0, attrs={}, map=None, level=0, children=None,
content='::: {.example .note}', markup='', info='', meta={}, block=False, hidden=False)
]),
Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None,
content='', markup='', info='', meta={}, block=True, hidden=False)
]
assert c._parse("::: {.example}\n### foo: `code`\nbar\n:::\nbaz") == [
Token(type='example_open', tag='div', nesting=1, map=[0, 3], markup=':::', info=' {.example}',
block=True),
Token(type='example_title_open', tag='h3', nesting=1, map=[1, 2], level=1, markup='###', block=True),
Token(type='inline', tag='', nesting=0, map=[1, 2], level=2, content='foo: `code`', block=True,
children=[
Token(type='text', tag='', nesting=0, content='foo: '),
Token(type='code_inline', tag='code', nesting=0, content='code', markup='`')
]),
Token(type='example_title_close', tag='h3', nesting=-1, level=1, markup='###', block=True),
Token(type='paragraph_open', tag='p', nesting=1, map=[2, 3], level=1, block=True),
Token(type='inline', tag='', nesting=0, map=[2, 3], level=2, content='bar', block=True,
children=[
Token(type='text', tag='', nesting=0, content='bar')
]),
Token(type='paragraph_close', tag='p', nesting=-1, level=1, block=True),
Token(type='example_close', tag='div', nesting=-1, markup=':::', block=True),
Token(type='paragraph_open', tag='p', nesting=1, map=[4, 5], block=True),
Token(type='inline', tag='', nesting=0, map=[4, 5], level=1, content='baz', block=True,
children=[
Token(type='text', tag='', nesting=0, content='baz')
]),
Token(type='paragraph_close', tag='p', nesting=-1, block=True)
]
with pytest.raises(SrcError) as exc:
c._parse("::: {.example}\n### foo\n### bar\n:::")
assert str(exc.value) == textwrap.dedent(
"""
unexpected non-title heading in `:::{.example}`; are you missing a `:::`?
Note: blocks like `:::{.example}` are only allowed to contain a single heading in order to simplify TOC generation.
\x1b[33m`:::{.example}` block at lines 1-3:\x1b[0m
\x1b[2m\x1b[37m 1\x1b[0m \x1b[1m\x1b[33m┃\x1b[0m ::: {.example}\x1b[0m
\x1b[2m\x1b[37m 2\x1b[0m \x1b[1m\x1b[33m┃\x1b[0m ### foo\x1b[0m
\x1b[2m\x1b[37m 3\x1b[0m \x1b[1m\x1b[33m┃\x1b[0m ### bar\x1b[0m
\x1b[2m\x1b[37m 4\x1b[0m \x1b[2m\x1b[37m┆ :::\x1b[0m
\x1b[33mUnexpected heading at line 3:\x1b[0m
\x1b[2m\x1b[37m 1\x1b[0m \x1b[2m\x1b[37m┆ ::: {.example}\x1b[0m
\x1b[2m\x1b[37m 2\x1b[0m \x1b[2m\x1b[37m┆ ### foo\x1b[0m
\x1b[2m\x1b[37m 3\x1b[0m \x1b[1m\x1b[33m┃\x1b[0m ### bar\x1b[0m
\x1b[2m\x1b[37m 4\x1b[0m \x1b[2m\x1b[37m┆ :::\x1b[0m
"""
).strip()

def test_footnotes() -> None:
    c = Converter({})
    assert c._parse("text [^foo]\n\n[^foo]: bar") == [
        Token(type='paragraph_open', tag='p', nesting=1, map=[0, 1], block=True),
        Token(type='inline', tag='', nesting=0, map=[0, 1], level=1, content='text [^foo]', block=True,
              children=[
                  Token(type='text', tag='', nesting=0, content='text '),
                  Token(type='footnote_ref', tag='', nesting=0, attrs={'id': 'foo.__back.0'},
                        meta={'id': 0, 'subId': 0, 'label': 'foo', 'target': 'foo'})
              ]),
        Token(type='paragraph_close', tag='p', nesting=-1, block=True),
        Token(type='footnote_block_open', tag='', nesting=1),
        Token(type='footnote_open', tag='', nesting=1, attrs={'id': 'foo'}, meta={'id': 0, 'label': 'foo'}),
        Token(type='paragraph_open', tag='p', nesting=1, map=[2, 3], level=1, block=True, hidden=False),
        Token(type='inline', tag='', nesting=0, map=[2, 3], level=2, content='bar', block=True,
              children=[
                  Token(type='text', tag='', nesting=0, content='bar')
              ]),
        Token(type='footnote_anchor', tag='', nesting=0,
              meta={'id': 0, 'label': 'foo', 'subId': 0, 'target': 'foo.__back.0'}),
        Token(type='paragraph_close', tag='p', nesting=-1, level=1, block=True),
        Token(type='footnote_close', tag='', nesting=-1),
        Token(type='footnote_block_close', tag='', nesting=-1),
    ]

View File

@@ -0,0 +1,259 @@
import json
import unittest
from pathlib import Path

from nixos_render_docs.manual import HTMLConverter, HTMLParameters
from nixos_render_docs.redirects import Redirects, RedirectsError


class TestRedirects(unittest.TestCase):
    def setup_test(self, sources, raw_redirects):
        with open(Path(__file__).parent / 'index.md', 'w') as infile:
            indexHTML = ["# Redirects test suite {#redirects-test-suite}\n## Setup steps"]
            for path in sources.keys():
                outpath = f"{path.split('.md')[0]}.html"
                indexHTML.append(f"```{{=include=}} appendix html:into-file=//{outpath}\n{path}\n```")
            infile.write("\n".join(indexHTML))

        for filename, content in sources.items():
            with open(Path(__file__).parent / filename, 'w') as infile:
                infile.write(content)

        redirects = Redirects({"redirects-test-suite": ["index.html#redirects-test-suite"]} | raw_redirects, '')
        return HTMLConverter("1.0.0", HTMLParameters("", [], [], 2, 2, 2, Path("")), {}, redirects)

    def run_test(self, md: HTMLConverter):
        md.convert(Path(__file__).parent / 'index.md', Path(__file__).parent / 'index.html')

    def assert_redirect_error(self, expected_errors: dict, md: HTMLConverter):
        with self.assertRaises(RuntimeError) as context:
            self.run_test(md)

        exception = context.exception.__cause__
        self.assertIsInstance(exception, RedirectsError)

        for attr, expected_values in expected_errors.items():
            self.assertTrue(hasattr(exception, attr))
            actual_values = getattr(exception, attr)
            self.assertEqual(set(actual_values), set(expected_values))

    def test_identifier_added(self):
        """Test adding a new identifier to the source."""
        before = self.setup_test(
            sources={"foo.md": "# Foo {#foo}"},
            raw_redirects={"foo": ["foo.html#foo"]},
        )
        self.run_test(before)

        intermediate = self.setup_test(
            sources={"foo.md": "# Foo {#foo}\n## Bar {#bar}"},
            raw_redirects={"foo": ["foo.html#foo"]},
        )
        self.assert_redirect_error({"identifiers_without_redirects": ["bar"]}, intermediate)

        after = self.setup_test(
            sources={"foo.md": "# Foo {#foo}\n## Bar {#bar}"},
            raw_redirects={"foo": ["foo.html#foo"], "bar": ["foo.html#bar"]},
        )
        self.run_test(after)

    def test_identifier_removed(self):
        """Test removing an identifier from the source."""
        before = self.setup_test(
            sources={"foo.md": "# Foo {#foo}\n## Bar {#bar}"},
            raw_redirects={"foo": ["foo.html#foo"], "bar": ["foo.html#bar"]},
        )
        self.run_test(before)

        intermediate = self.setup_test(
            sources={"foo.md": "# Foo {#foo}"},
            raw_redirects={"foo": ["foo.html#foo"], "bar": ["foo.html#bar"]},
        )
        self.assert_redirect_error({"orphan_identifiers": ["bar"]}, intermediate)

        after = self.setup_test(
            sources={"foo.md": "# Foo {#foo}"},
            raw_redirects={"foo": ["foo.html#foo"]},
        )
        self.run_test(after)

    def test_identifier_renamed(self):
        """Test renaming an identifier in the source."""
        before = self.setup_test(
            sources={"foo.md": "# Foo {#foo}\n## Bar {#bar}"},
            raw_redirects={"foo": ["foo.html#foo"], "bar": ["foo.html#bar"]},
        )
        self.run_test(before)

        intermediate = self.setup_test(
            sources={"foo.md": "# Foo Prime {#foo-prime}\n## Bar {#bar}"},
            raw_redirects={"foo": ["foo.html#foo"], "bar": ["foo.html#bar"]},
        )
        self.assert_redirect_error(
            {
                "identifiers_without_redirects": ["foo-prime"],
                "orphan_identifiers": ["foo"]
            },
            intermediate
        )

        after = self.setup_test(
            sources={"foo.md": "# Foo Prime {#foo-prime}\n## Bar {#bar}"},
            raw_redirects={"foo-prime": ["foo.html#foo-prime", "foo.html#foo"], "bar": ["foo.html#bar"]},
        )
        self.run_test(after)

    def test_leaf_identifier_moved_to_different_file(self):
        """Test moving a leaf identifier to a different output path."""
        before = self.setup_test(
            sources={"foo.md": "# Foo {#foo}\n## Bar {#bar}"},
            raw_redirects={"foo": ["foo.html#foo"], "bar": ["foo.html#bar"]},
        )
        self.run_test(before)

        intermediate = self.setup_test(
            sources={
                "foo.md": "# Foo {#foo}",
                "bar.md": "# Bar {#bar}"
            },
            raw_redirects={"foo": ["foo.html#foo"], "bar": ["foo.html#foo"]},
        )
        self.assert_redirect_error({"identifiers_missing_current_outpath": ["bar"]}, intermediate)

        after = self.setup_test(
            sources={
                "foo.md": "# Foo {#foo}",
                "bar.md": "# Bar {#bar}"
            },
            raw_redirects={"foo": ["foo.html#foo"], "bar": ["bar.html#bar", "foo.html#bar"]},
        )
        self.run_test(after)

    def test_non_leaf_identifier_moved_to_different_file(self):
        """Test moving a non-leaf identifier to a different output path."""
        before = self.setup_test(
            sources={"foo.md": "# Foo {#foo}\n## Bar {#bar}\n### Baz {#baz}"},
            raw_redirects={"foo": ["foo.html#foo"], "bar": ["foo.html#bar"], "baz": ["foo.html#baz"]},
        )
        self.run_test(before)

        intermediate = self.setup_test(
            sources={
                "foo.md": "# Foo {#foo}",
                "bar.md": "# Bar {#bar}\n## Baz {#baz}"
            },
            raw_redirects={"foo": ["foo.html#foo"], "bar": ["foo.html#bar"], "baz": ["foo.html#baz"]},
        )
        self.assert_redirect_error({"identifiers_missing_current_outpath": ["bar", "baz"]}, intermediate)

        after = self.setup_test(
            sources={
                "foo.md": "# Foo {#foo}",
                "bar.md": "# Bar {#bar}\n## Baz {#baz}"
            },
            raw_redirects={
                "foo": ["foo.html#foo"],
                "bar": ["bar.html#bar", "foo.html#bar"],
                "baz": ["bar.html#baz", "foo.html#baz"]
            },
        )
        self.run_test(after)

    def test_conflicting_anchors(self):
        """Test for conflicting anchors."""
        md = self.setup_test(
            sources={"foo.md": "# Foo {#foo}\n## Bar {#bar}"},
            raw_redirects={
                "foo": ["foo.html#foo", "foo.html#bar"],
                "bar": ["foo.html#bar"],
            }
        )
        self.assert_redirect_error({"conflicting_anchors": ["bar"]}, md)

    def test_divergent_redirect(self):
        """Test for divergent redirects."""
        md = self.setup_test(
            sources={
                "foo.md": "# Foo {#foo}",
                "bar.md": "# Bar {#bar}"
            },
            raw_redirects={
                "foo": ["foo.html#foo", "old-foo.html"],
                "bar": ["bar.html#bar", "old-foo.html"]
            }
        )
        self.assert_redirect_error({"divergent_redirects": ["old-foo.html"]}, md)

    def test_no_client_redirects(self):
        """Test fetching client-side redirects while ignoring server-side ones."""
        md = self.setup_test(
            sources={"foo.md": "# Foo {#foo}\n## Bar {#bar}"},
            raw_redirects={"foo": ["foo.html#foo"], "bar": ["foo.html#bar", "bar.html"]}
        )
        self.run_test(md)
        self.assertEqual(md._redirects.get_client_redirects("foo.html"), {})

    def test_basic_redirect_matching(self):
        """Test client-side redirects getter with a simple redirect mapping."""
        md = self.setup_test(
            sources={"foo.md": "# Foo {#foo}\n## Bar {#bar}"},
            raw_redirects={
                'foo': ['foo.html#foo', 'foo.html#some-section', 'foo.html#another-section'],
                'bar': ['foo.html#bar'],
            },
        )
        self.run_test(md)

        client_redirects = md._redirects.get_client_redirects("foo.html")
        expected_redirects = {'some-section': 'foo.html#foo', 'another-section': 'foo.html#foo'}
        self.assertEqual(client_redirects, expected_redirects)

    def test_advanced_redirect_matching(self):
        """Test client-side redirects getter with a complex redirect mapping."""
        md = self.setup_test(
            sources={"foo.md": "# Foo {#foo}", "bar.md": "# Bar {#bar}"},
            raw_redirects={
                'foo': ['foo.html#foo', 'foo.html#some-section', 'bar.html#foo'],
                'bar': ['bar.html#bar', 'bar.html#another-section'],
            },
        )
        self.run_test(md)
        self.assertEqual(md._redirects.get_client_redirects("index.html"), {})

        client_redirects = md._redirects.get_client_redirects("foo.html")
        expected_redirects = {'some-section': 'foo.html#foo'}
        self.assertEqual(client_redirects, expected_redirects)

        client_redirects = md._redirects.get_client_redirects("bar.html")
        expected_redirects = {'foo': 'foo.html#foo', 'another-section': 'bar.html#bar'}
        self.assertEqual(client_redirects, expected_redirects)

    def test_server_redirects(self):
        """Test server-side redirects getter."""
        md = self.setup_test(
            sources={"foo.md": "# Foo {#foo}", "bar.md": "# Bar {#bar}"},
            raw_redirects={
                'foo': ['foo.html#foo', 'foo-prime.html'],
                'bar': ['bar.html#bar', 'bar-prime.html'],
            },
        )
        self.run_test(md)

        server_redirects = md._redirects.get_server_redirects()
        expected_redirects = {'foo-prime.html': 'foo.html', 'bar-prime.html': 'bar.html'}
        self.assertEqual(server_redirects, expected_redirects)

    def test_client_redirects_to_ghost_paths(self):
        """Test implicit inference of client-side redirects to ghost paths."""
        md = self.setup_test(
            sources={"foo.md": "# Foo {#foo}", "bar.md": "# Bar {#bar}"},
            raw_redirects={
                'foo': ['foo.html#foo', 'foo-prime.html'],
                'bar': ['bar.html#bar', 'foo-prime.html#old'],
            },
        )
        self.run_test(md)

        client_redirects = md._redirects.get_client_redirects("foo.html")
        expected_redirects = {'old': 'bar.html#bar'}
        self.assertEqual(client_redirects, expected_redirects)