Merge branch 'mauro' into docs-mw

Another build series from Mauro:

The goal of this series is to drop one of the most ancient and ugliest
hacks from the documentation build system. Before migrating to Sphinx,
the media subsystem already had a very comprehensive uAPI book, together
with a build-time system to detect and point out any documentation gaps.

When migrating to Sphinx, we ported the logic to a Perl script
(parse-headers.pl) and Markus came up with a Sphinx extension
(kernel_include.py). We also added some files to control how parse-headers
produces results, and a Makefile.

With the initial Sphinx versions (1.4.1 if I recall correctly), when
a new symbol was added to videodev2.h, a new warning was
produced at documentation build time if the patchset didn't have
the corresponding documentation patch.

While kernel-include is generic, the only user at the moment is the media
subsystem.

This series gets rid of the Perl script, replacing it with a Python
command line script and a class. The parse header class can optionally be
used by kernel-include to produce enriched code that contains
cross-references.
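As a rough illustration of what "enriched code with cross-references" means, the sketch below mirrors the naming rule of the removed parse-headers.pl for `#define` lines: ioctl names are lowercased, other defines are lowercased and have underscores turned into dashes. The helper name `line_to_ref` is hypothetical and is not the API of the new parse header class.

```python
import re

# Hypothetical helper for illustration only; it is NOT part of the series.
# The regexes follow the ones in the removed parse-headers.pl.
RE_IOCTL = re.compile(r"^\s*#\s*define\s+([_A-Za-z]\w+)\s+_IO")
RE_DEFINE = re.compile(r"^\s*#\s*define\s+([_A-Za-z]\w+)\s+")


def line_to_ref(line):
    """Return (symbol, reST cross-reference) for a #define line, else None."""
    match = RE_IOCTL.match(line)
    if match:
        name = match.group(1)
        # ioctls: the reference target is the lowercased name
        return name, f"\\ :ref:`{name} <{name.lower()}>`\\ "

    match = RE_DEFINE.match(line)
    if match:
        name = match.group(1)
        # other defines: lowercase and replace "_" with "-"
        target = name.lower().replace("_", "-")
        return name, f"\\ :ref:`{name} <{target}>`\\ "

    return None
```

Every occurrence of the symbol in the header text is then replaced by the generated reference, so the included header links back to the uAPI book.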

As with the other conversions, it starts with a bug-compatible version of
parse-headers, but the subsequent patches add more functionality and
fix bugs.

It should be noted that modern versions of Sphinx disabled the
cross-reference warnings. So, in the next series, I'll re-add them in a
controlled way (e.g. just for the references from kernel-include that have
a special argument).
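A minimal sketch of what "in a controlled way" can look like, assuming warnings are limited to files that opted in and each broken target is reported once per file. The function name `should_warn` and its arguments are hypothetical; the real hook receives Sphinx nodes rather than plain strings.

```python
# Hypothetical sketch: names and signature are assumptions for illustration.
reported = set()


def should_warn(source, target, tracked_files):
    """Return True only the first time a broken target is seen in a tracked file."""
    # Files that did not opt in never produce a warning
    if source not in tracked_files:
        return False

    # Do not repeat a warning that was already emitted for this file
    key = (source, target)
    if key in reported:
        return False

    reported.add(key)
    return True
```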

The script now also supports generating a "toc" output, which will be used
in the next series.
Jonathan Corbet 2025-08-29 15:58:48 -06:00
commit c67a9f492c
24 changed files with 1700 additions and 621 deletions


@ -1,2 +1,2 @@
[MASTER]
init-hook='import sys; sys.path += ["scripts/lib/kdoc", "scripts/lib/abi"]'
init-hook='import sys; sys.path += ["scripts/lib/kdoc", "scripts/lib/abi", "tools/docs/lib"]'


@ -87,7 +87,7 @@ loop_cmd = $(echo-cmd) $(cmd_$(1)) || exit;
PYTHONPYCACHEPREFIX ?= $(abspath $(BUILDDIR)/__pycache__)
quiet_cmd_sphinx = SPHINX $@ --> file://$(abspath $(BUILDDIR)/$3/$4)
cmd_sphinx = $(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/userspace-api/media $2 && \
cmd_sphinx = \
PYTHONPYCACHEPREFIX="$(PYTHONPYCACHEPREFIX)" \
BUILDDIR=$(abspath $(BUILDDIR)) SPHINX_CONF=$(abspath $(src)/$5/$(SPHINX_CONF)) \
$(PYTHON3) $(srctree)/scripts/jobserver-exec \
@ -171,7 +171,6 @@ refcheckdocs:
cleandocs:
$(Q)rm -rf $(BUILDDIR)
$(Q)$(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/userspace-api/media clean
dochelp:
@echo ' Linux kernel internal documentation in different formats from ReST:'


@ -1,31 +1,82 @@
#!/usr/bin/env python3
# -*- coding: utf-8; mode: python -*-
# SPDX-License-Identifier: GPL-2.0
# pylint: disable=R0903, C0330, R0914, R0912, E0401
# pylint: disable=R0903, R0912, R0914, R0915, C0209,W0707
"""
kernel-include
~~~~~~~~~~~~~~
Implementation of the ``kernel-include`` reST-directive.
Implementation of the ``kernel-include`` reST-directive.
:copyright: Copyright (C) 2016 Markus Heiser
:license: GPL Version 2, June 1991 see linux/COPYING for details.
:copyright: Copyright (C) 2016 Markus Heiser
:license: GPL Version 2, June 1991 see linux/COPYING for details.
The ``kernel-include`` reST-directive is a replacement for the ``include``
directive. The ``kernel-include`` directive expand environment variables in
the path name and allows to include files from arbitrary locations.
The ``kernel-include`` reST-directive is a replacement for the ``include``
directive. The ``kernel-include`` directive expands environment variables in
the path name and allows including files from arbitrary locations.
.. hint::
.. hint::
Including files from arbitrary locations (e.g. from ``/etc``) is a
security risk for builders. This is why the ``include`` directive from
docutils *prohibit* pathnames pointing to locations *above* the filesystem
tree where the reST document with the include directive is placed.
Including files from arbitrary locations (e.g. from ``/etc``) is a
security risk for builders. This is why the ``include`` directive from
docutils *prohibit* pathnames pointing to locations *above* the filesystem
tree where the reST document with the include directive is placed.
Substrings of the form $name or ${name} are replaced by the value of
environment variable name. Malformed variable names and references to
non-existing variables are left unchanged.
Substrings of the form $name or ${name} are replaced by the value of
environment variable name. Malformed variable names and references to
non-existing variables are left unchanged.
**Supported Sphinx Include Options**:
:param literal:
If present, the included file is inserted as a literal block.
:param code:
Specify the language for syntax highlighting (e.g., 'c', 'python').
:param encoding:
Specify the encoding of the included file (default: 'utf-8').
:param tab-width:
Specify the number of spaces that a tab represents.
:param start-line:
Line index at which to start including the file (0-based, as in a Python slice).
:param end-line:
Line index at which to stop including the file (exclusive).
:param start-after:
Include lines after the first line matching this text.
:param end-before:
Include lines before the first line matching this text.
:param number-lines:
Number the included lines (integer specifies start number).
Only effective with 'literal' or 'code' options.
:param class:
Specify HTML class attribute for the included content.
**Kernel-specific Extensions**:
:param generate-cross-refs:
If present, instead of directly including the file, it calls
ParseDataStructs() to convert C data structures into cross-references
that link to comprehensive documentation in other ReST files.
:param exception-file:
(Used with generate-cross-refs)
Path to a file containing rules for handling special cases:
- Ignore specific C data structures
- Use alternative reference names
- Specify different reference types
:param warn-broken:
(Used with generate-cross-refs)
Enables warnings when auto-generated cross-references don't point to
existing documentation targets.
"""
# ==============================================================================
@ -33,53 +84,247 @@
# ==============================================================================
import os.path
import re
import sys
from docutils import io, nodes, statemachine
from docutils.statemachine import ViewList
from docutils.utils.error_reporting import SafeString, ErrorString
from docutils.parsers.rst import directives
from docutils.parsers.rst import Directive, directives
from docutils.parsers.rst.directives.body import CodeBlock, NumberLines
from docutils.parsers.rst.directives.misc import Include
__version__ = '1.0'
from sphinx.util import logging
srctree = os.path.abspath(os.environ["srctree"])
sys.path.insert(0, os.path.join(srctree, "tools/docs/lib"))
from parse_data_structs import ParseDataStructs
__version__ = "1.0"
logger = logging.getLogger(__name__)
RE_DOMAIN_REF = re.compile(r'\\ :(ref|c:type|c:func):`([^<`]+)(?:<([^>]+)>)?`\\')
RE_SIMPLE_REF = re.compile(r'`([^`]+)`')
# ==============================================================================
def setup(app):
# ==============================================================================
class KernelInclude(Directive):
"""
KernelInclude (``kernel-include``) directive
app.add_directive("kernel-include", KernelInclude)
return dict(
version = __version__,
parallel_read_safe = True,
parallel_write_safe = True
)
Most of the stuff here came from Include directive defined at:
docutils/parsers/rst/directives/misc.py
# ==============================================================================
class KernelInclude(Include):
# ==============================================================================
Yet, overriding the class doesn't have any benefit: the original class
only has run() and an argument list. Not all arguments are implemented
when checked against the latest Sphinx version, as more arguments
were added over time.
"""KernelInclude (``kernel-include``) directive"""
So, keep its own list of supported arguments
"""
required_arguments = 1
optional_arguments = 0
final_argument_whitespace = True
option_spec = {
'literal': directives.flag,
'code': directives.unchanged,
'encoding': directives.encoding,
'tab-width': int,
'start-line': int,
'end-line': int,
'start-after': directives.unchanged_required,
'end-before': directives.unchanged_required,
# ignored except for 'literal' or 'code':
'number-lines': directives.unchanged, # integer or None
'class': directives.class_option,
# Arguments that aren't from Sphinx Include directive
'generate-cross-refs': directives.flag,
'warn-broken': directives.flag,
'toc': directives.flag,
'exception-file': directives.unchanged,
}
def read_rawtext(self, path, encoding):
"""Read and process file content with error handling"""
try:
self.state.document.settings.record_dependencies.add(path)
include_file = io.FileInput(source_path=path,
encoding=encoding,
error_handler=self.state.document.settings.input_encoding_error_handler)
except UnicodeEncodeError:
raise self.severe('Problems with directive path:\n'
'Cannot encode input file path "%s" '
'(wrong locale?).' % SafeString(path))
except IOError as error:
raise self.severe('Problems with directive path:\n%s.' % ErrorString(error))
try:
return include_file.read()
except UnicodeError as error:
raise self.severe('Problem with directive:\n%s' % ErrorString(error))
def apply_range(self, rawtext):
"""
Handles start-line, end-line, start-after and end-before parameters
"""
# Get to-be-included content
startline = self.options.get('start-line', None)
endline = self.options.get('end-line', None)
try:
if startline or (endline is not None):
lines = rawtext.splitlines()
rawtext = '\n'.join(lines[startline:endline])
except UnicodeError as error:
raise self.severe(f'Problem with "{self.name}" directive:\n'
+ io.error_string(error))
# start-after/end-before: no restrictions on newlines in match-text,
# and no restrictions on matching inside lines vs. line boundaries
after_text = self.options.get("start-after", None)
if after_text:
# skip content in rawtext before *and incl.* a matching text
after_index = rawtext.find(after_text)
if after_index < 0:
raise self.severe('Problem with "start-after" option of "%s" '
"directive:\nText not found." % self.name)
rawtext = rawtext[after_index + len(after_text) :]
before_text = self.options.get("end-before", None)
if before_text:
# skip content in rawtext after *and incl.* a matching text
before_index = rawtext.find(before_text)
if before_index < 0:
raise self.severe('Problem with "end-before" option of "%s" '
"directive:\nText not found." % self.name)
rawtext = rawtext[:before_index]
return rawtext
def xref_text(self, env, path, tab_width):
"""
Read and add contents from a C file parsed to have cross references.
There are two types of supported output here:
- A C source code with cross-references;
- a TOC table containing cross references.
"""
parser = ParseDataStructs()
parser.parse_file(path)
if 'exception-file' in self.options:
source_dir = os.path.dirname(os.path.abspath(
self.state_machine.input_lines.source(
self.lineno - self.state_machine.input_offset - 1)))
exceptions_file = os.path.join(source_dir, self.options['exception-file'])
parser.process_exceptions(exceptions_file)
# Store references on a symbol dict to be used at check time
if 'warn-broken' in self.options:
env._xref_files.add(path)
if "toc" not in self.options:
rawtext = ".. parsed-literal::\n\n" + parser.gen_output()
self.apply_range(rawtext)
include_lines = statemachine.string2lines(rawtext, tab_width,
convert_whitespace=True)
# Sphinx always blames the ".. <directive>" line, so placing
# line numbers here won't make any difference
self.state_machine.insert_input(include_lines, path)
return []
# TOC output is a ReST file, not a literal. So, we can add line
# numbers
rawtext = parser.gen_toc()
include_lines = statemachine.string2lines(rawtext, tab_width,
convert_whitespace=True)
# Append line numbers data
startline = self.options.get('start-line', None)
result = ViewList()
if startline and startline > 0:
offset = startline - 1
else:
offset = 0
for ln, line in enumerate(include_lines, start=offset):
result.append(line, path, ln)
self.state_machine.insert_input(result, path)
return []
def literal(self, path, tab_width, rawtext):
"""Output a literal block"""
# Convert tabs to spaces, if `tab_width` is positive.
if tab_width >= 0:
text = rawtext.expandtabs(tab_width)
else:
text = rawtext
literal_block = nodes.literal_block(rawtext, source=path,
classes=self.options.get("class", []))
literal_block.line = 1
self.add_name(literal_block)
if "number-lines" in self.options:
try:
startline = int(self.options["number-lines"] or 1)
except ValueError:
raise self.error(":number-lines: with non-integer start value")
# include_lines is not defined in this method; count the raw lines instead
endline = startline + len(rawtext.splitlines())
if text.endswith("\n"):
text = text[:-1]
tokens = NumberLines([([], text)], startline, endline)
for classes, value in tokens:
if classes:
literal_block += nodes.inline(value, value,
classes=classes)
else:
literal_block += nodes.Text(value, value)
else:
literal_block += nodes.Text(text, text)
return [literal_block]
def code(self, path, tab_width, rawtext):
"""Output a code block"""
include_lines = statemachine.string2lines(rawtext, tab_width,
convert_whitespace=True)
self.options["source"] = path
codeblock = CodeBlock(self.name,
[self.options.pop("code")], # arguments
self.options,
include_lines,
self.lineno,
self.content_offset,
self.block_text,
self.state,
self.state_machine)
return codeblock.run()
def run(self):
"""Include a file as part of the content of this reST file."""
env = self.state.document.settings.env
path = os.path.realpath(
os.path.expandvars(self.arguments[0]))
path = os.path.realpath(os.path.expandvars(self.arguments[0]))
# to get a bit security back, prohibit /etc:
if path.startswith(os.sep + "etc"):
raise self.severe(
'Problems with "%s" directive, prohibited path: %s'
% (self.name, path))
raise self.severe('Problems with "%s" directive, prohibited path: %s' %
(self.name, path))
self.arguments[0] = path
env.note_dependency(os.path.abspath(path))
#return super(KernelInclude, self).run() # won't work, see HINTs in _run()
return self._run()
def _run(self):
"""Include a file as part of the content of this reST file."""
# HINT: I had to copy&paste the whole Include.run method. I'am not happy
# with this, but due to security reasons, the Include.run method does
# not allow absolute or relative pathnames pointing to locations *above*
@ -87,107 +332,93 @@ class KernelInclude(Include):
if not self.state.document.settings.file_insertion_enabled:
raise self.warning('"%s" directive disabled.' % self.name)
source = self.state_machine.input_lines.source(
self.lineno - self.state_machine.input_offset - 1)
source = self.state_machine.input_lines.source(self.lineno -
self.state_machine.input_offset - 1)
source_dir = os.path.dirname(os.path.abspath(source))
path = directives.path(self.arguments[0])
if path.startswith('<') and path.endswith('>'):
if path.startswith("<") and path.endswith(">"):
path = os.path.join(self.standard_include_path, path[1:-1])
path = os.path.normpath(os.path.join(source_dir, path))
# HINT: this is the only line I had to change / commented out:
#path = utils.relative_path(None, path)
# path = utils.relative_path(None, path)
encoding = self.options.get(
'encoding', self.state.document.settings.input_encoding)
e_handler=self.state.document.settings.input_encoding_error_handler
tab_width = self.options.get(
'tab-width', self.state.document.settings.tab_width)
try:
self.state.document.settings.record_dependencies.add(path)
include_file = io.FileInput(source_path=path,
encoding=encoding,
error_handler=e_handler)
except UnicodeEncodeError as error:
raise self.severe('Problems with "%s" directive path:\n'
'Cannot encode input file path "%s" '
'(wrong locale?).' %
(self.name, SafeString(path)))
except IOError as error:
raise self.severe('Problems with "%s" directive path:\n%s.' %
(self.name, ErrorString(error)))
startline = self.options.get('start-line', None)
endline = self.options.get('end-line', None)
try:
if startline or (endline is not None):
lines = include_file.readlines()
rawtext = ''.join(lines[startline:endline])
else:
rawtext = include_file.read()
except UnicodeError as error:
raise self.severe('Problem with "%s" directive:\n%s' %
(self.name, ErrorString(error)))
# start-after/end-before: no restrictions on newlines in match-text,
# and no restrictions on matching inside lines vs. line boundaries
after_text = self.options.get('start-after', None)
if after_text:
# skip content in rawtext before *and incl.* a matching text
after_index = rawtext.find(after_text)
if after_index < 0:
raise self.severe('Problem with "start-after" option of "%s" '
'directive:\nText not found.' % self.name)
rawtext = rawtext[after_index + len(after_text):]
before_text = self.options.get('end-before', None)
if before_text:
# skip content in rawtext after *and incl.* a matching text
before_index = rawtext.find(before_text)
if before_index < 0:
raise self.severe('Problem with "end-before" option of "%s" '
'directive:\nText not found.' % self.name)
rawtext = rawtext[:before_index]
encoding = self.options.get("encoding",
self.state.document.settings.input_encoding)
tab_width = self.options.get("tab-width",
self.state.document.settings.tab_width)
include_lines = statemachine.string2lines(rawtext, tab_width,
convert_whitespace=True)
if 'literal' in self.options:
# Convert tabs to spaces, if `tab_width` is positive.
if tab_width >= 0:
text = rawtext.expandtabs(tab_width)
else:
text = rawtext
literal_block = nodes.literal_block(rawtext, source=path,
classes=self.options.get('class', []))
literal_block.line = 1
self.add_name(literal_block)
if 'number-lines' in self.options:
try:
startline = int(self.options['number-lines'] or 1)
except ValueError:
raise self.error(':number-lines: with non-integer '
'start value')
endline = startline + len(include_lines)
if text.endswith('\n'):
text = text[:-1]
tokens = NumberLines([([], text)], startline, endline)
for classes, value in tokens:
if classes:
literal_block += nodes.inline(value, value,
classes=classes)
else:
literal_block += nodes.Text(value, value)
else:
literal_block += nodes.Text(text, text)
return [literal_block]
if 'code' in self.options:
self.options['source'] = path
codeblock = CodeBlock(self.name,
[self.options.pop('code')], # arguments
self.options,
include_lines, # content
self.lineno,
self.content_offset,
self.block_text,
self.state,
self.state_machine)
return codeblock.run()
self.state_machine.insert_input(include_lines, path)
return []
# Handle optional arguments related to cross-reference generation
if "generate-cross-refs" in self.options:
return self.xref_text(env, path, tab_width)
rawtext = self.read_rawtext(path, encoding)
rawtext = self.apply_range(rawtext)
if "code" in self.options:
return self.code(path, tab_width, rawtext)
return self.literal(path, tab_width, rawtext)
# ==============================================================================
reported = set()
def check_missing_refs(app, env, node, contnode):
"""Check broken refs for the files it creates xrefs"""
if not node.source:
return None
try:
xref_files = env._xref_files
except AttributeError:
logger.critical("FATAL: _xref_files not initialized!")
raise
# Only show missing references for kernel-include reference-parsed files
if node.source not in xref_files:
return None
target = node.get('reftarget', '')
domain = node.get('refdomain', 'std')
reftype = node.get('reftype', '')
msg = f"can't link to: {domain}:{reftype}:: {target}"
# Don't duplicate warnings
data = (node.source, msg)
if data in reported:
return None
reported.add(data)
logger.warning(msg, location=node, type='ref', subtype='missing')
return None
def merge_xref_info(app, env, docnames, other):
"""
As each process modifies env._xref_files, we need to merge them back.
"""
if not hasattr(other, "_xref_files"):
return
env._xref_files.update(getattr(other, "_xref_files", set()))
def init_xref_docs(app, env, docnames):
"""Initialize a list of files that we're generating cross references for"""
app.env._xref_files = set()
# ==============================================================================
def setup(app):
"""Set up the Sphinx extension"""
app.connect("env-before-read-docs", init_xref_docs)
app.connect("env-merge-info", merge_xref_info)
app.add_directive("kernel-include", KernelInclude)
app.connect("missing-reference", check_missing_refs)
return {
"version": __version__,
"parallel_read_safe": True,
"parallel_write_safe": True,
}
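The range options handled by apply_range() in the diff above can be sketched as a standalone function. This is a simplified illustration under the assumption that the options arrive as plain keyword arguments; the real method reads them from `self.options` and reports failures through `self.severe()`.

```python
def apply_range(text, start_line=None, end_line=None,
                start_after=None, end_before=None):
    """Trim text the way the start-line/end-line/start-after/end-before
    options do. Keyword names are assumptions made for this sketch."""
    # start-line/end-line behave like a Python slice over the lines
    if start_line or end_line is not None:
        text = "\n".join(text.splitlines()[start_line:end_line])

    # start-after: drop everything up to and including the marker
    if start_after:
        index = text.find(start_after)
        if index < 0:
            raise ValueError('"start-after" text not found')
        text = text[index + len(start_after):]

    # end-before: drop the marker and everything after it
    if end_before:
        index = text.find(end_before)
        if index < 0:
            raise ValueError('"end-before" text not found')
        text = text[:index]

    return text
```

Note that the line options are applied first, so the markers are searched only inside the already sliced text.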


@ -1,404 +0,0 @@
#!/usr/bin/env perl
# SPDX-License-Identifier: GPL-2.0
# Copyright (c) 2016 by Mauro Carvalho Chehab <mchehab@kernel.org>.
use strict;
use Text::Tabs;
use Getopt::Long;
use Pod::Usage;
my $debug;
my $help;
my $man;
GetOptions(
"debug" => \$debug,
'usage|?' => \$help,
'help' => \$man
) or pod2usage(2);
pod2usage(1) if $help;
pod2usage(-exitstatus => 0, -verbose => 2) if $man;
pod2usage(2) if (scalar @ARGV < 2 || scalar @ARGV > 3);
my ($file_in, $file_out, $file_exceptions) = @ARGV;
my $data;
my %ioctls;
my %defines;
my %typedefs;
my %enums;
my %enum_symbols;
my %structs;
require Data::Dumper if ($debug);
#
# read the file and get identifiers
#
my $is_enum = 0;
my $is_comment = 0;
open IN, $file_in or die "Can't open $file_in";
while (<IN>) {
$data .= $_;
my $ln = $_;
if (!$is_comment) {
$ln =~ s,/\*.*(\*/),,g;
$is_comment = 1 if ($ln =~ s,/\*.*,,);
} else {
if ($ln =~ s,^(.*\*/),,) {
$is_comment = 0;
} else {
next;
}
}
if ($is_enum && $ln =~ m/^\s*([_\w][\w\d_]+)\s*[\,=]?/) {
my $s = $1;
my $n = $1;
$n =~ tr/A-Z/a-z/;
$n =~ tr/_/-/;
$enum_symbols{$s} = "\\ :ref:`$s <$n>`\\ ";
$is_enum = 0 if ($is_enum && m/\}/);
next;
}
$is_enum = 0 if ($is_enum && m/\}/);
if ($ln =~ m/^\s*#\s*define\s+([_\w][\w\d_]+)\s+_IO/) {
my $s = $1;
my $n = $1;
$n =~ tr/A-Z/a-z/;
$ioctls{$s} = "\\ :ref:`$s <$n>`\\ ";
next;
}
if ($ln =~ m/^\s*#\s*define\s+([_\w][\w\d_]+)\s+/) {
my $s = $1;
my $n = $1;
$n =~ tr/A-Z/a-z/;
$n =~ tr/_/-/;
$defines{$s} = "\\ :ref:`$s <$n>`\\ ";
next;
}
if ($ln =~ m/^\s*typedef\s+([_\w][\w\d_]+)\s+(.*)\s+([_\w][\w\d_]+);/) {
my $s = $2;
my $n = $3;
$typedefs{$n} = "\\ :c:type:`$n <$s>`\\ ";
next;
}
if ($ln =~ m/^\s*enum\s+([_\w][\w\d_]+)\s+\{/
|| $ln =~ m/^\s*enum\s+([_\w][\w\d_]+)$/
|| $ln =~ m/^\s*typedef\s*enum\s+([_\w][\w\d_]+)\s+\{/
|| $ln =~ m/^\s*typedef\s*enum\s+([_\w][\w\d_]+)$/) {
my $s = $1;
$enums{$s} = "enum :c:type:`$s`\\ ";
$is_enum = $1;
next;
}
if ($ln =~ m/^\s*struct\s+([_\w][\w\d_]+)\s+\{/
|| $ln =~ m/^\s*struct\s+([[_\w][\w\d_]+)$/
|| $ln =~ m/^\s*typedef\s*struct\s+([_\w][\w\d_]+)\s+\{/
|| $ln =~ m/^\s*typedef\s*struct\s+([[_\w][\w\d_]+)$/
) {
my $s = $1;
$structs{$s} = "struct $s\\ ";
next;
}
}
close IN;
#
# Handle multi-line typedefs
#
my @matches = ($data =~ m/typedef\s+struct\s+\S+?\s*\{[^\}]+\}\s*(\S+)\s*\;/g,
$data =~ m/typedef\s+enum\s+\S+?\s*\{[^\}]+\}\s*(\S+)\s*\;/g,);
foreach my $m (@matches) {
my $s = $m;
$typedefs{$s} = "\\ :c:type:`$s`\\ ";
next;
}
#
# Handle exceptions, if any
#
my %def_reftype = (
"ioctl" => ":ref",
"define" => ":ref",
"symbol" => ":ref",
"typedef" => ":c:type",
"enum" => ":c:type",
"struct" => ":c:type",
);
if ($file_exceptions) {
open IN, $file_exceptions or die "Can't read $file_exceptions";
while (<IN>) {
next if (m/^\s*$/ || m/^\s*#/);
# Parsers to ignore a symbol
if (m/^ignore\s+ioctl\s+(\S+)/) {
delete $ioctls{$1} if (exists($ioctls{$1}));
next;
}
if (m/^ignore\s+define\s+(\S+)/) {
delete $defines{$1} if (exists($defines{$1}));
next;
}
if (m/^ignore\s+typedef\s+(\S+)/) {
delete $typedefs{$1} if (exists($typedefs{$1}));
next;
}
if (m/^ignore\s+enum\s+(\S+)/) {
delete $enums{$1} if (exists($enums{$1}));
next;
}
if (m/^ignore\s+struct\s+(\S+)/) {
delete $structs{$1} if (exists($structs{$1}));
next;
}
if (m/^ignore\s+symbol\s+(\S+)/) {
delete $enum_symbols{$1} if (exists($enum_symbols{$1}));
next;
}
# Parsers to replace a symbol
my ($type, $old, $new, $reftype);
if (m/^replace\s+(\S+)\s+(\S+)\s+(\S+)/) {
$type = $1;
$old = $2;
$new = $3;
} else {
die "Can't parse $file_exceptions: $_";
}
if ($new =~ m/^\:c\:(data|func|macro|type)\:\`(.+)\`/) {
$reftype = ":c:$1";
$new = $2;
} elsif ($new =~ m/\:ref\:\`(.+)\`/) {
$reftype = ":ref";
$new = $1;
} else {
$reftype = $def_reftype{$type};
}
$new = "$reftype:`$old <$new>`";
if ($type eq "ioctl") {
$ioctls{$old} = $new if (exists($ioctls{$old}));
next;
}
if ($type eq "define") {
$defines{$old} = $new if (exists($defines{$old}));
next;
}
if ($type eq "symbol") {
$enum_symbols{$old} = $new if (exists($enum_symbols{$old}));
next;
}
if ($type eq "typedef") {
$typedefs{$old} = $new if (exists($typedefs{$old}));
next;
}
if ($type eq "enum") {
$enums{$old} = $new if (exists($enums{$old}));
next;
}
if ($type eq "struct") {
$structs{$old} = $new if (exists($structs{$old}));
next;
}
die "Can't parse $file_exceptions: $_";
}
}
if ($debug) {
print Data::Dumper->Dump([\%ioctls], [qw(*ioctls)]) if (%ioctls);
print Data::Dumper->Dump([\%typedefs], [qw(*typedefs)]) if (%typedefs);
print Data::Dumper->Dump([\%enums], [qw(*enums)]) if (%enums);
print Data::Dumper->Dump([\%structs], [qw(*structs)]) if (%structs);
print Data::Dumper->Dump([\%defines], [qw(*defines)]) if (%defines);
print Data::Dumper->Dump([\%enum_symbols], [qw(*enum_symbols)]) if (%enum_symbols);
}
#
# Align block
#
$data = expand($data);
$data = " " . $data;
$data =~ s/\n/\n /g;
$data =~ s/\n\s+$/\n/g;
$data =~ s/\n\s+\n/\n\n/g;
#
# Add escape codes for special characters
#
$data =~ s,([\_\`\*\<\>\&\\\\:\/\|\%\$\#\{\}\~\^]),\\$1,g;
$data =~ s,DEPRECATED,**DEPRECATED**,g;
#
# Add references
#
my $start_delim = "[ \n\t\(\=\*\@]";
my $end_delim = "(\\s|,|\\\\=|\\\\:|\\;|\\\)|\\}|\\{)";
foreach my $r (keys %ioctls) {
my $s = $ioctls{$r};
$r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g;
print "$r -> $s\n" if ($debug);
$data =~ s/($start_delim)($r)$end_delim/$1$s$3/g;
}
foreach my $r (keys %defines) {
my $s = $defines{$r};
$r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g;
print "$r -> $s\n" if ($debug);
$data =~ s/($start_delim)($r)$end_delim/$1$s$3/g;
}
foreach my $r (keys %enum_symbols) {
my $s = $enum_symbols{$r};
$r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g;
print "$r -> $s\n" if ($debug);
$data =~ s/($start_delim)($r)$end_delim/$1$s$3/g;
}
foreach my $r (keys %enums) {
my $s = $enums{$r};
$r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g;
print "$r -> $s\n" if ($debug);
$data =~ s/enum\s+($r)$end_delim/$s$2/g;
}
foreach my $r (keys %structs) {
my $s = $structs{$r};
$r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g;
print "$r -> $s\n" if ($debug);
$data =~ s/struct\s+($r)$end_delim/$s$2/g;
}
foreach my $r (keys %typedefs) {
my $s = $typedefs{$r};
$r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g;
print "$r -> $s\n" if ($debug);
$data =~ s/($start_delim)($r)$end_delim/$1$s$3/g;
}
$data =~ s/\\ ([\n\s])/\1/g;
#
# Generate output file
#
my $title = $file_in;
$title =~ s,.*/,,;
open OUT, "> $file_out" or die "Can't open $file_out";
print OUT ".. -*- coding: utf-8; mode: rst -*-\n\n";
print OUT "$title\n";
print OUT "=" x length($title);
print OUT "\n\n.. parsed-literal::\n\n";
print OUT $data;
close OUT;
__END__
=head1 NAME
parse_headers.pl - parse a C file, in order to identify functions, structs,
enums and defines and create cross-references to a Sphinx book.
=head1 SYNOPSIS
B<parse_headers.pl> [<options>] <C_FILE> <OUT_FILE> [<EXCEPTIONS_FILE>]
Where <options> can be: --debug, --help or --usage.
=head1 OPTIONS
=over 8
=item B<--debug>
Put the script in verbose mode, useful for debugging.
=item B<--usage>
Prints a brief help message and exits.
=item B<--help>
Prints a more detailed help message and exits.
=back
=head1 DESCRIPTION
Convert a C header or source file (C_FILE), into a ReStructured Text
included via ..parsed-literal block with cross-references for the
documentation files that describe the API. It accepts an optional
EXCEPTIONS_FILE with describes what elements will be either ignored or
be pointed to a non-default reference.
The output is written at the (OUT_FILE).
It is capable of identifying defines, functions, structs, typedefs,
enums and enum symbols and create cross-references for all of them.
It is also capable of distinguish #define used for specifying a Linux
ioctl.
The EXCEPTIONS_FILE contain two rules to allow ignoring a symbol or
to replace the default references by a custom one.
Please read Documentation/doc-guide/parse-headers.rst at the Kernel's
tree for more details.
=head1 BUGS
Report bugs to Mauro Carvalho Chehab <mchehab@kernel.org>
=head1 COPYRIGHT
Copyright (c) 2016 by Mauro Carvalho Chehab <mchehab@kernel.org>.
License GPLv2: GNU GPL version 2 <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
=cut
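For readers who do not follow Perl, the exceptions handling of the removed script can be approximated in Python as below. It is a simplified sketch: the function name `parse_exceptions` and its return values are assumptions, and it only collects the rules instead of applying them to the symbol tables.

```python
import re

# Default reference type per symbol kind, as in the removed Perl script
DEF_REFTYPE = {
    "ioctl": ":ref", "define": ":ref", "symbol": ":ref",
    "typedef": ":c:type", "enum": ":c:type", "struct": ":c:type",
}


def parse_exceptions(lines):
    """Return (ignored, replaced) built from 'ignore' and 'replace' rules."""
    ignored, replaced = set(), {}

    for line in lines:
        # Skip blank lines and comments
        if not line.strip() or line.lstrip().startswith("#"):
            continue

        match = re.match(r"^ignore\s+(\S+)\s+(\S+)", line)
        if match and match.group(1) in DEF_REFTYPE:
            ignored.add((match.group(1), match.group(2)))
            continue

        match = re.match(r"^replace\s+(\S+)\s+(\S+)\s+(\S+)", line)
        if not match or match.group(1) not in DEF_REFTYPE:
            raise ValueError(f"Can't parse exceptions line: {line}")

        kind, old, new = match.groups()

        # An explicit :c:...: or :ref: role overrides the default type
        ref = re.match(r"^:c:(data|func|macro|type):`(.+)`", new)
        if ref:
            reftype, new = f":c:{ref.group(1)}", ref.group(2)
        else:
            ref = re.search(r":ref:`(.+)`", new)
            if ref:
                reftype, new = ":ref", ref.group(1)
            else:
                reftype = DEF_REFTYPE[kind]

        replaced[(kind, old)] = f"{reftype}:`{old} <{new}>`"

    return ignored, replaced
```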


@ -1,64 +0,0 @@
# SPDX-License-Identifier: GPL-2.0
# Rules to convert a .h file to inline RST documentation
SRC_DIR=$(srctree)/Documentation/userspace-api/media
PARSER = $(srctree)/Documentation/sphinx/parse-headers.pl
UAPI = $(srctree)/include/uapi/linux
KAPI = $(srctree)/include/linux
FILES = ca.h.rst dmx.h.rst frontend.h.rst net.h.rst \
videodev2.h.rst media.h.rst cec.h.rst lirc.h.rst
TARGETS := $(addprefix $(BUILDDIR)/, $(FILES))
gen_rst = \
echo ${PARSER} $< $@ $(SRC_DIR)/$(notdir $@).exceptions; \
${PARSER} $< $@ $(SRC_DIR)/$(notdir $@).exceptions
quiet_gen_rst = echo ' PARSE $(patsubst $(srctree)/%,%,$<)'; \
${PARSER} $< $@ $(SRC_DIR)/$(notdir $@).exceptions
silent_gen_rst = ${gen_rst}
$(BUILDDIR)/ca.h.rst: ${UAPI}/dvb/ca.h ${PARSER} $(SRC_DIR)/ca.h.rst.exceptions
@$($(quiet)gen_rst)
$(BUILDDIR)/dmx.h.rst: ${UAPI}/dvb/dmx.h ${PARSER} $(SRC_DIR)/dmx.h.rst.exceptions
@$($(quiet)gen_rst)
$(BUILDDIR)/frontend.h.rst: ${UAPI}/dvb/frontend.h ${PARSER} $(SRC_DIR)/frontend.h.rst.exceptions
@$($(quiet)gen_rst)
$(BUILDDIR)/net.h.rst: ${UAPI}/dvb/net.h ${PARSER} $(SRC_DIR)/net.h.rst.exceptions
@$($(quiet)gen_rst)
$(BUILDDIR)/videodev2.h.rst: ${UAPI}/videodev2.h ${PARSER} $(SRC_DIR)/videodev2.h.rst.exceptions
@$($(quiet)gen_rst)
$(BUILDDIR)/media.h.rst: ${UAPI}/media.h ${PARSER} $(SRC_DIR)/media.h.rst.exceptions
@$($(quiet)gen_rst)
$(BUILDDIR)/cec.h.rst: ${UAPI}/cec.h ${PARSER} $(SRC_DIR)/cec.h.rst.exceptions
@$($(quiet)gen_rst)
$(BUILDDIR)/lirc.h.rst: ${UAPI}/lirc.h ${PARSER} $(SRC_DIR)/lirc.h.rst.exceptions
@$($(quiet)gen_rst)
# Media build rules
.PHONY: all html texinfo epub xml latex
all: $(IMGDOT) $(BUILDDIR) ${TARGETS}
html: all
texinfo: all
epub: all
xml: all
latex: $(IMGPDF) all
linkcheck:
clean:
-rm -f $(DOTTGT) $(IMGTGT) ${TARGETS} 2>/dev/null
$(BUILDDIR):
$(Q)mkdir -p $@


@ -6,5 +6,6 @@
CEC Header File
***************
.. kernel-include:: $BUILDDIR/cec.h.rst
.. kernel-include:: include/uapi/linux/cec.h
:generate-cross-refs:
:exception-file: cec.h.rst.exceptions


@ -7,10 +7,19 @@ Digital TV uAPI header files
Digital TV uAPI headers
***********************
.. kernel-include:: $BUILDDIR/frontend.h.rst
.. kernel-include:: include/uapi/linux/dvb/frontend.h
:generate-cross-refs:
:exception-file: frontend.h.rst.exceptions
.. kernel-include:: $BUILDDIR/dmx.h.rst
.. kernel-include:: include/uapi/linux/dvb/dmx.h
:generate-cross-refs:
:exception-file: dmx.h.rst.exceptions
.. kernel-include:: $BUILDDIR/ca.h.rst
.. kernel-include:: include/uapi/linux/dvb/ca.h
:generate-cross-refs:
:exception-file: ca.h.rst.exceptions
.. kernel-include:: include/uapi/linux/dvb/net.h
:generate-cross-refs:
:exception-file: net.h.rst.exceptions
.. kernel-include:: $BUILDDIR/net.h.rst


@ -6,5 +6,6 @@
Media Controller Header File
****************************
.. kernel-include:: $BUILDDIR/media.h.rst
.. kernel-include:: include/uapi/linux/media.h
:generate-cross-refs:
:exception-file: media.h.rst.exceptions


@ -6,5 +6,7 @@
LIRC Header File
****************
.. kernel-include:: $BUILDDIR/lirc.h.rst
.. kernel-include:: include/uapi/linux/lirc.h
:generate-cross-refs:
:exception-file: lirc.h.rst.exceptions

@ -6,4 +6,6 @@
Video For Linux Two Header File
*******************************
.. kernel-include:: $BUILDDIR/videodev2.h.rst
.. kernel-include:: include/uapi/linux/videodev2.h
:generate-cross-refs:
:exception-file: videodev2.h.rst.exceptions

@ -7308,6 +7308,7 @@ F: scripts/get_abi.py
F: scripts/kernel-doc*
F: scripts/lib/abi/*
F: scripts/lib/kdoc/*
F: tools/docs/*
F: tools/net/ynl/pyynl/lib/doc_generator.py
F: scripts/sphinx-pre-install
X: Documentation/ABI/

scripts/sphinx-build-wrapper Executable file

@ -0,0 +1,719 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: GPL-2.0
# Copyright (C) 2025 Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
#
# pylint: disable=R0902, R0912, R0913, R0914, R0915, R0917, C0103
#
# Converted from docs Makefile and parallel-wrapper.sh, both under
# GPLv2, copyrighted since 2008 by the following authors:
#
# Akira Yokosawa <akiyks@gmail.com>
# Arnd Bergmann <arnd@arndb.de>
# Breno Leitao <leitao@debian.org>
# Carlos Bilbao <carlos.bilbao@amd.com>
# Dave Young <dyoung@redhat.com>
# Donald Hunter <donald.hunter@gmail.com>
# Geert Uytterhoeven <geert+renesas@glider.be>
# Jani Nikula <jani.nikula@intel.com>
# Jan Stancek <jstancek@redhat.com>
# Jonathan Corbet <corbet@lwn.net>
# Joshua Clayton <stillcompiling@gmail.com>
# Kees Cook <keescook@chromium.org>
# Linus Torvalds <torvalds@linux-foundation.org>
# Magnus Damm <damm+renesas@opensource.se>
# Masahiro Yamada <masahiroy@kernel.org>
# Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
# Maxim Cournoyer <maxim.cournoyer@gmail.com>
# Peter Foley <pefoley2@pefoley.com>
# Randy Dunlap <rdunlap@infradead.org>
# Rob Herring <robh@kernel.org>
# Shuah Khan <shuahkh@osg.samsung.com>
# Thorsten Blum <thorsten.blum@toblux.com>
# Tomas Winkler <tomas.winkler@intel.com>
"""
Sphinx build wrapper that handles Kernel-specific business rules:
- it gets the Kernel build environment vars;
- it determines what's the best parallelism;
- it handles SPHINXDIRS
This tool ensures that MIN_PYTHON_VERSION is satisfied. If the running
version is below that, it looks for a newer Python binary. If one is found,
it re-runs itself using the newer version.
"""
import argparse
import locale
import os
import re
import shlex
import shutil
import subprocess
import sys
from concurrent import futures
from glob import glob
LIB_DIR = "lib"
SRC_DIR = os.path.dirname(os.path.realpath(__file__))
sys.path.insert(0, os.path.join(SRC_DIR, LIB_DIR))
from jobserver import JobserverExec # pylint: disable=C0413
def parse_version(version):
"""Convert a major.minor.patch version into a tuple"""
return tuple(int(x) for x in version.split("."))
def ver_str(version):
"""Returns a version tuple as major.minor.patch"""
return ".".join([str(x) for x in version])
# Minimal supported Python version needed by Sphinx and its extensions
MIN_PYTHON_VERSION = parse_version("3.7")
# Default value for --venv parameter
VENV_DEFAULT = "sphinx_latest"
# List of make targets and their corresponding builder and output directory
TARGETS = {
"cleandocs": {
"builder": "clean",
},
"htmldocs": {
"builder": "html",
},
"epubdocs": {
"builder": "epub",
"out_dir": "epub",
},
"texinfodocs": {
"builder": "texinfo",
"out_dir": "texinfo",
},
"infodocs": {
"builder": "texinfo",
"out_dir": "texinfo",
},
"latexdocs": {
"builder": "latex",
"out_dir": "latex",
},
"pdfdocs": {
"builder": "latex",
"out_dir": "latex",
},
"xmldocs": {
"builder": "xml",
"out_dir": "xml",
},
"linkcheckdocs": {
"builder": "linkcheck"
},
}
# Paper sizes. An empty value will pick the default
PAPER = ["", "a4", "letter"]
class SphinxBuilder:
"""
Handles a sphinx-build target, adding needed arguments to build
with the Kernel.
"""
def is_rust_enabled(self):
"""Check if rust is enabled at .config"""
config_path = os.path.join(self.srctree, ".config")
if os.path.isfile(config_path):
with open(config_path, "r", encoding="utf-8") as f:
return "CONFIG_RUST=y" in f.read()
return False
def get_path(self, path, abs_path=False):
"""
Ancillary routine to handle paths the right way, as shell does.
It first expands "~" and "~user". Then, if path is not absolute,
join self.srctree. Finally, if requested, convert to abspath.
"""
path = os.path.expanduser(path)
if not path.startswith("/"):
path = os.path.join(self.srctree, path)
if abs_path:
return os.path.abspath(path)
return path
def __init__(self, venv=None, verbose=False, n_jobs=None, interactive=None):
"""Initialize internal variables"""
self.venv = venv
self.verbose = None
# Normal variables passed from Kernel's makefile
self.kernelversion = os.environ.get("KERNELVERSION", "unknown")
self.kernelrelease = os.environ.get("KERNELRELEASE", "unknown")
self.pdflatex = os.environ.get("PDFLATEX", "xelatex")
if not interactive:
self.latexopts = os.environ.get("LATEXOPTS", "-interaction=batchmode -no-shell-escape")
else:
self.latexopts = os.environ.get("LATEXOPTS", "")
if not verbose:
verbose = bool(os.environ.get("KBUILD_VERBOSE", "") != "")
# Handle SPHINXOPTS environment
sphinxopts = shlex.split(os.environ.get("SPHINXOPTS", ""))
# As the number of jobs and quiet are handled separately, we need to pick
# them the same way as sphinx-build would, so let's use argparse to do
# the right argument expansion
parser = argparse.ArgumentParser()
parser.add_argument('-j', '--jobs', type=int)
# -q is a flag: it takes no value
parser.add_argument('-q', '--quiet', action='store_true')
# Other sphinx-build arguments go as-is, so place them
# at self.sphinxopts
sphinx_args, self.sphinxopts = parser.parse_known_args(sphinxopts)
if sphinx_args.quiet:
self.verbose = False
if sphinx_args.jobs:
self.n_jobs = sphinx_args.jobs
# Command line arguments were passed, override SPHINXOPTS
if verbose is not None:
self.verbose = verbose
self.n_jobs = n_jobs
# Source tree directory. This needs to be at os.environ, as
# Sphinx extensions and media uAPI makefile need it
self.srctree = os.environ.get("srctree")
if not self.srctree:
self.srctree = "."
os.environ["srctree"] = self.srctree
# Now that we can expand srctree, get other directories as well
self.sphinxbuild = os.environ.get("SPHINXBUILD", "sphinx-build")
self.kerneldoc = self.get_path(os.environ.get("KERNELDOC",
"scripts/kernel-doc.py"))
self.obj = os.environ.get("obj", "Documentation")
self.builddir = self.get_path(os.path.join(self.obj, "output"),
abs_path=True)
# Media uAPI needs it
os.environ["BUILDDIR"] = self.builddir
# Detect if rust is enabled
self.config_rust = self.is_rust_enabled()
# Get directory locations for LaTeX build toolchain
self.pdflatex_cmd = shutil.which(self.pdflatex)
self.latexmk_cmd = shutil.which("latexmk")
self.env = os.environ.copy()
# If venv parameter is specified, run Sphinx from venv
if venv:
bin_dir = os.path.join(venv, "bin")
if os.path.isfile(os.path.join(bin_dir, "activate")):
# "activate" virtual env
self.env["PATH"] = bin_dir + ":" + self.env["PATH"]
self.env["VIRTUAL_ENV"] = venv
if "PYTHONHOME" in self.env:
del self.env["PYTHONHOME"]
print(f"Setting venv to {venv}")
else:
sys.exit(f"Venv {venv} not found.")
def run_sphinx(self, sphinx_build, build_args, *args, **pwargs):
"""
Executes sphinx-build using current python3 command and setting
-j parameter if possible to run the build in parallel.
"""
with JobserverExec() as jobserver:
if jobserver.claim:
n_jobs = str(jobserver.claim)
else:
n_jobs = "auto" # Supported since Sphinx 1.7
cmd = []
if self.venv:
cmd.append("python")
else:
cmd.append(sys.executable)
cmd.append(sphinx_build)
# if present, SPHINXOPTS or command line --jobs overrides default
if self.n_jobs:
n_jobs = str(self.n_jobs)
if n_jobs:
cmd += [f"-j{n_jobs}"]
if not self.verbose:
cmd.append("-q")
cmd += self.sphinxopts
cmd += build_args
if self.verbose:
print(" ".join(cmd))
rc = subprocess.call(cmd, *args, **pwargs)
def handle_html(self, css, output_dir):
"""
Extra steps for HTML and epub output.
For such targets, we need to ensure that CSS will be properly
copied to the output _static directory
"""
if not css:
return
css = os.path.expanduser(css)
if not css.startswith("/"):
css = os.path.join(self.srctree, css)
static_dir = os.path.join(output_dir, "_static")
os.makedirs(static_dir, exist_ok=True)
try:
shutil.copy2(css, static_dir)
except (OSError, IOError) as e:
print(f"Warning: Failed to copy CSS: {e}", file=sys.stderr)
def build_pdf_file(self, latex_cmd, from_dir, path):
"""Builds a single pdf file using latex_cmd"""
try:
subprocess.run(latex_cmd + [path],
cwd=from_dir, check=True)
return True
except subprocess.CalledProcessError:
# LaTeX PDF error code is almost useless: it returns
# error codes even when build succeeds but has warnings.
# So, we'll ignore the results
return False
def pdf_parallel_build(self, tex_suffix, latex_cmd, tex_files, n_jobs):
"""Build PDF files in parallel if possible"""
builds = {}
build_failed = False
max_len = 0
has_tex = False
# Process files in parallel
with futures.ThreadPoolExecutor(max_workers=n_jobs) as executor:
jobs = {}
for from_dir, pdf_dir, entry in tex_files:
name = entry.name
if not name.endswith(tex_suffix):
continue
name = name[:-len(tex_suffix)]
max_len = max(max_len, len(name))
has_tex = True
future = executor.submit(self.build_pdf_file, latex_cmd,
from_dir, entry.path)
jobs[future] = (from_dir, name, entry.path)
for future in futures.as_completed(jobs):
from_dir, name, path = jobs[future]
pdf_name = name + ".pdf"
pdf_from = os.path.join(from_dir, pdf_name)
try:
success = future.result()
if success and os.path.exists(pdf_from):
pdf_to = os.path.join(pdf_dir, pdf_name)
os.rename(pdf_from, pdf_to)
builds[name] = os.path.relpath(pdf_to, self.builddir)
else:
builds[name] = "FAILED"
build_failed = True
except Exception as e:
builds[name] = f"FAILED ({str(e)})"
build_failed = True
# Handle case where no .tex files were found
if not has_tex:
name = "Sphinx LaTeX builder"
max_len = max(max_len, len(name))
builds[name] = "FAILED (no .tex file was generated)"
build_failed = True
return builds, build_failed, max_len
def handle_pdf(self, output_dirs):
"""
Extra steps for PDF output.
As PDF is handled via a LaTeX output, after building the .tex file,
a new build is needed to create the PDF output from the latex
directory.
"""
builds = {}
max_len = 0
tex_suffix = ".tex"
# Get all tex files that will be used for PDF build
tex_files = []
for from_dir in output_dirs:
pdf_dir = os.path.join(from_dir, "../pdf")
os.makedirs(pdf_dir, exist_ok=True)
if self.latexmk_cmd:
latex_cmd = [self.latexmk_cmd, f"-{self.pdflatex}"]
else:
latex_cmd = [self.pdflatex]
latex_cmd.extend(shlex.split(self.latexopts))
# Get a list of tex files to process
with os.scandir(from_dir) as it:
for entry in it:
if entry.name.endswith(tex_suffix):
tex_files.append((from_dir, pdf_dir, entry))
# When using make, this won't be used, as the number of jobs comes
# from POSIX jobserver. So, this covers the case where the build comes
# from the command line. In that case, serialize by default, unless
# the user explicitly sets the number of jobs.
n_jobs = 1
# n_jobs is either an integer or "auto". Only use it if it is a number
if self.n_jobs:
try:
n_jobs = int(self.n_jobs)
except ValueError:
pass
# When using make, jobserver.claim is the number of jobs that were
# used with "-j" and that aren't used by other make targets
with JobserverExec() as jobserver:
n_jobs = 1
# Handle the case when a parameter is passed via command line,
# using it as default, if jobserver doesn't claim anything
if self.n_jobs:
try:
n_jobs = int(self.n_jobs)
except ValueError:
pass
if jobserver.claim:
n_jobs = jobserver.claim
# Build files in parallel
builds, build_failed, max_len = self.pdf_parallel_build(tex_suffix,
latex_cmd,
tex_files,
n_jobs)
msg = "Summary"
msg += "\n" + "=" * len(msg)
print()
print(msg)
for pdf_name, pdf_file in builds.items():
print(f"{pdf_name:<{max_len}}: {pdf_file}")
print()
# return an error if a PDF file is missing
if build_failed:
sys.exit("PDF build failed: not all PDF files were created.")
else:
print("All PDF files were built.")
def handle_info(self, output_dirs):
"""
Extra steps for Info output.
For texinfo generation, an additional make is needed from the
texinfo directory.
"""
for output_dir in output_dirs:
try:
subprocess.run(["make", "info"], cwd=output_dir, check=True)
except subprocess.CalledProcessError as e:
sys.exit(f"Error generating info docs: {e}")
def cleandocs(self, builder):
"""Remove the whole documentation output directory"""
shutil.rmtree(self.builddir, ignore_errors=True)
def build(self, target, sphinxdirs=None, conf="conf.py",
theme=None, css=None, paper=None):
"""
Build documentation using Sphinx. This is the core function of this
module. It prepares all arguments required by sphinx-build.
"""
builder = TARGETS[target]["builder"]
out_dir = TARGETS[target].get("out_dir", "")
# Cleandocs doesn't require sphinx-build
if target == "cleandocs":
self.cleandocs(builder)
return
# Other targets require sphinx-build
sphinxbuild = shutil.which(self.sphinxbuild, path=self.env["PATH"])
if not sphinxbuild:
sys.exit(f"Error: {self.sphinxbuild} not found in PATH.\n")
if builder == "latex":
if not self.pdflatex_cmd and not self.latexmk_cmd:
sys.exit("Error: pdflatex or latexmk required for PDF generation")
docs_dir = os.path.abspath(os.path.join(self.srctree, "Documentation"))
# Prepare base arguments for Sphinx build
kerneldoc = self.kerneldoc
if kerneldoc.startswith(self.srctree):
kerneldoc = os.path.relpath(kerneldoc, self.srctree)
# Prepare common Sphinx options
args = [
"-b", builder,
"-c", docs_dir,
]
if builder == "latex":
if not paper:
paper = PAPER[1]
args.extend(["-D", f"latex_elements.papersize={paper}paper"])
if self.config_rust:
args.extend(["-t", "rustdoc"])
if conf:
self.env["SPHINX_CONF"] = self.get_path(conf, abs_path=True)
if not sphinxdirs:
sphinxdirs = os.environ.get("SPHINXDIRS", ".")
# The sphinx-build tool has a bug: internally, it tries to set
# locale with locale.setlocale(locale.LC_ALL, ''). This causes a
# crash if language is not set. Detect and fix it.
try:
locale.setlocale(locale.LC_ALL, '')
except Exception:
self.env["LC_ALL"] = "C"
self.env["LANG"] = "C"
# sphinxdirs can be a list or a whitespace-separated string
sphinxdirs_list = []
for sphinxdir in sphinxdirs:
if isinstance(sphinxdir, list):
sphinxdirs_list += sphinxdir
else:
for name in sphinxdir.split(" "):
sphinxdirs_list.append(name)
# Build each directory
output_dirs = []
for sphinxdir in sphinxdirs_list:
src_dir = os.path.join(docs_dir, sphinxdir)
doctree_dir = os.path.join(self.builddir, ".doctrees")
output_dir = os.path.join(self.builddir, sphinxdir, out_dir)
# Make directory names canonical
src_dir = os.path.normpath(src_dir)
doctree_dir = os.path.normpath(doctree_dir)
output_dir = os.path.normpath(output_dir)
os.makedirs(doctree_dir, exist_ok=True)
os.makedirs(output_dir, exist_ok=True)
output_dirs.append(output_dir)
build_args = args + [
"-d", doctree_dir,
"-D", f"kerneldoc_bin={kerneldoc}",
"-D", f"version={self.kernelversion}",
"-D", f"release={self.kernelrelease}",
"-D", f"kerneldoc_srctree={self.srctree}",
src_dir,
output_dir,
]
# Execute sphinx-build
try:
self.run_sphinx(sphinxbuild, build_args, env=self.env)
except Exception as e:
sys.exit(f"Build failed: {e}")
# Ensure that html/epub will have needed static files
if target in ["htmldocs", "epubdocs"]:
self.handle_html(css, output_dir)
# PDF and Info require a second build step
if target == "pdfdocs":
self.handle_pdf(output_dirs)
elif target == "infodocs":
self.handle_info(output_dirs)
@staticmethod
def get_python_version(cmd):
"""
Get python version from a Python binary. As we need to detect whether
newer python binaries are available, we can't rely on sys.version_info here.
"""
result = subprocess.run([cmd, "--version"], check=True,
stdout=subprocess.PIPE, stderr=subprocess.PIPE,
universal_newlines=True)
version = result.stdout.strip()
match = re.search(r"(\d+\.\d+\.\d+)", version)
if match:
return parse_version(match.group(1))
print(f"Can't parse version {version}")
return (0, 0, 0)
@staticmethod
def find_python():
"""
Detect whether any python 3.xy version newer than the current one
is available.
Note: this routine is limited to up to 2 digits for python3. We
may need to update it one day, hopefully in the distant future.
"""
patterns = [
"python3.[0-9]",
"python3.[0-9][0-9]",
]
# Look for a python binary that satisfies MIN_PYTHON_VERSION
for path in os.getenv("PATH", "").split(":"):
for pattern in patterns:
for cmd in glob(os.path.join(path, pattern)):
if os.path.isfile(cmd) and os.access(cmd, os.X_OK):
version = SphinxBuilder.get_python_version(cmd)
if version >= MIN_PYTHON_VERSION:
return cmd
return None
@staticmethod
def check_python():
"""
Check if the current python binary satisfies our minimal requirement
for Sphinx build. If not, re-run with a newer version if found.
"""
cur_ver = sys.version_info[:3]
if cur_ver >= MIN_PYTHON_VERSION:
return
python_ver = ver_str(cur_ver)
new_python_cmd = SphinxBuilder.find_python()
if not new_python_cmd:
sys.exit(f"Python version {python_ver} is not supported anymore.")
# Restart script using the newer version
script_path = os.path.abspath(sys.argv[0])
args = [new_python_cmd, script_path] + sys.argv[1:]
print(f"Python {python_ver} not supported. Changing to {new_python_cmd}")
try:
os.execv(new_python_cmd, args)
except OSError as e:
sys.exit(f"Failed to restart with {new_python_cmd}: {e}")
def jobs_type(value):
"""
Handle valid values for -j. Accepts Sphinx "-jauto", plus a number
equal to or greater than one.
"""
if value is None:
return None
if value.lower() == 'auto':
return value.lower()
try:
if int(value) >= 1:
return value
raise argparse.ArgumentTypeError(f"Minimum jobs is 1, got {value}")
except ValueError:
raise argparse.ArgumentTypeError(f"Must be 'auto' or positive integer, got {value}")
def main():
"""
Main function. The only mandatory argument is the target. The other
arguments, when not specified, fall back to the values set at os.environ
or, failing that, to their defaults.
"""
parser = argparse.ArgumentParser(description="Kernel documentation builder")
parser.add_argument("target", choices=list(TARGETS.keys()),
help="Documentation target to build")
parser.add_argument("--sphinxdirs", nargs="+",
help="Specific directories to build")
parser.add_argument("--conf", default="conf.py",
help="Sphinx configuration file")
parser.add_argument("--theme", help="Sphinx theme to use")
parser.add_argument("--css", help="Custom CSS file for HTML/EPUB")
parser.add_argument("--paper", choices=PAPER, default=PAPER[0],
help="Paper size for LaTeX/PDF output")
parser.add_argument("-v", "--verbose", action='store_true',
help="place build in verbose mode")
parser.add_argument('-j', '--jobs', type=jobs_type,
help="Sets number of jobs to use with sphinx-build")
parser.add_argument('-i', '--interactive', action='store_true',
help="Change latex default to run in interactive mode")
parser.add_argument("-V", "--venv", nargs='?', const=f'{VENV_DEFAULT}',
default=None,
help=f'If used, run Sphinx from a venv dir (default dir: {VENV_DEFAULT})')
args = parser.parse_args()
SphinxBuilder.check_python()
builder = SphinxBuilder(venv=args.venv, verbose=args.verbose,
n_jobs=args.jobs, interactive=args.interactive)
builder.build(args.target, sphinxdirs=args.sphinxdirs, conf=args.conf,
theme=args.theme, css=args.css, paper=args.paper)
if __name__ == "__main__":
main()
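
The wrapper's build() comment says that sphinxdirs "can be a list or a whitespace-separated string". When the value comes from the SPHINXDIRS environment variable it is a plain string, and iterating over a Python string yields single characters. A minimal sketch of a normalization step that accepts both forms is shown below; the helper name normalize_sphinxdirs is hypothetical and not part of the script.

```python
def normalize_sphinxdirs(sphinxdirs):
    """Hypothetical helper: flatten SPHINXDIRS-like input into a list of names."""
    # Nothing given: build the whole Documentation tree
    if sphinxdirs is None:
        return ["."]

    # A plain string (e.g. from os.environ) is wrapped so that the loop
    # below iterates over entries rather than over single characters
    if isinstance(sphinxdirs, str):
        sphinxdirs = [sphinxdirs]

    result = []
    for item in sphinxdirs:
        if isinstance(item, (list, tuple)):
            # Nested list, e.g. produced by argparse with nargs="+"
            for sub in item:
                result.extend(sub.split())
        else:
            # Whitespace-separated string: "core-api driver-api"
            result.extend(item.split())
    return result
```

With this shape, both `--sphinxdirs a b` and `SPHINXDIRS="a b"` would end up as the same list before the per-directory build loop runs.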

@ -0,0 +1,70 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: GPL-2.0
# Copyright (c) 2025 by Mauro Carvalho Chehab <mchehab@kernel.org>.
"""
Ancillary argparse HelpFormatter class that works in a similar way to
argparse.RawDescriptionHelpFormatter, e.g. description maintains line
breaks, but it also implements transformations to the help text. The
actual transformations are given by enrich_text(), if the output is a tty.
Currently, the following transformations are done:
- Positional arguments are shown in upper case;
- if output is TTY, ``var`` and positional arguments are shown prepended
by an ANSI SGR code. This is usually translated to bold. On some
terminals, like konsole, this is translated into a colored bold text.
"""
import argparse
import re
import sys
class EnrichFormatter(argparse.HelpFormatter):
"""
Better format the output, making it easier to identify the positional args
and how they're used in the __doc__ description.
"""
def __init__(self, *args, **kwargs):
"""Initialize class and check if output is a TTY"""
super().__init__(*args, **kwargs)
self._tty = sys.stdout.isatty()
def enrich_text(self, text):
"""Handle ReST markups (currently, only ``foo``)"""
if self._tty and text:
# Replace ``text`` with ANSI SGR (bold)
return re.sub(r'\`\`(.+?)\`\`',
lambda m: f'\033[1m{m.group(1)}\033[0m', text)
return text
def _fill_text(self, text, width, indent):
"""Enrich descriptions with markups on it"""
enriched = self.enrich_text(text)
return "\n".join(indent + line for line in enriched.splitlines())
def _format_usage(self, usage, actions, groups, prefix):
"""Enrich positional arguments at usage: line"""
prog = self._prog
parts = []
for action in actions:
if action.option_strings:
opt = action.option_strings[0]
if action.nargs != 0:
opt += f" {action.dest.upper()}"
parts.append(f"[{opt}]")
else:
# Positional argument
parts.append(self.enrich_text(f"``{action.dest.upper()}``"))
usage_text = f"{prefix or 'usage: '} {prog} {' '.join(parts)}\n"
return usage_text
def _format_action_invocation(self, action):
"""Enrich argument names"""
if not action.option_strings:
return self.enrich_text(f"``{action.dest.upper()}``")
return ", ".join(action.option_strings)
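
The core of enrich_text() is a single regular-expression substitution. The standalone sketch below reproduces it outside the formatter class so it can be tried in isolation; the function name enrich and its tty parameter are illustrative assumptions, standing in for the method and for the sys.stdout.isatty() check.

```python
import re

# ANSI SGR sequences: 1 = bold, 0 = reset
BOLD = "\033[1m"
RESET = "\033[0m"


def enrich(text, tty=True):
    """Illustrative stand-in for EnrichFormatter.enrich_text()."""
    if tty and text:
        # Non-greedy match, so each ``name`` pair is replaced on its own
        return re.sub(r"``(.+?)``",
                      lambda m: f"{BOLD}{m.group(1)}{RESET}", text)
    # Not a terminal (or empty text): leave the markup untouched
    return text
```

The non-greedy `.+?` matters here: with a greedy match, a line containing two marked-up names would be turned into one long bold span.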

@ -0,0 +1,452 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: GPL-2.0
# Copyright (c) 2016-2025 by Mauro Carvalho Chehab <mchehab@kernel.org>.
# pylint: disable=R0912,R0915
"""
Parse a source file or header, creating ReStructured Text cross references.
It accepts an optional file to change the default symbol reference or to
suppress symbols from the output.
It is capable of identifying defines, functions, structs, typedefs,
enums and enum symbols and creating cross-references for all of them.
It is also capable of distinguishing #defines used for specifying a Linux
ioctl.
The optional rules file contains a set of rules like:
ignore ioctl VIDIOC_ENUM_FMT
replace ioctl VIDIOC_DQBUF vidioc_qbuf
replace define V4L2_EVENT_MD_FL_HAVE_FRAME_SEQ :c:type:`v4l2_event_motion_det`
"""
import os
import re
import sys
class ParseDataStructs:
"""
Creates an enriched version of a Kernel header file with cross-links
to each C data structure type.
It is meant to allow having a more comprehensive documentation, where
uAPI headers will create cross-reference links to the code.
It is capable of identifying defines, functions, structs, typedefs,
enums and enum symbols and creating cross-references for all of them.
It is also capable of distinguishing #defines used for specifying a Linux
ioctl.
By default, it creates rules for all symbols and defines, but it also
allows parsing an exception file. Such a file contains a set of rules
using the syntax below:
1. Ignore rules:
ignore <type> <symbol>
Removes the symbol from reference generation.
2. Replace rules:
replace <type> <old_symbol> <new_reference>
Replaces the default reference of old_symbol with a new one. The
new_reference can be:
- A simple symbol name;
- A full Sphinx reference.
In both cases, <type> can be:
- ioctl: for defines whose value starts with _IO*, i.e. ioctl definitions
- define: for other defines
- symbol: for symbols defined within enums;
- typedef: for typedefs;
- enum: for the name of a non-anonymous enum;
- struct: for structs.
Examples:
ignore define __LINUX_MEDIA_H
ignore ioctl VIDIOC_ENUM_FMT
replace ioctl VIDIOC_DQBUF vidioc_qbuf
replace define V4L2_EVENT_MD_FL_HAVE_FRAME_SEQ :c:type:`v4l2_event_motion_det`
"""
# Parser regexes with multiple ways to capture enums and structs
RE_ENUMS = [
re.compile(r"^\s*enum\s+([\w_]+)\s*\{"),
re.compile(r"^\s*enum\s+([\w_]+)\s*$"),
re.compile(r"^\s*typedef\s*enum\s+([\w_]+)\s*\{"),
re.compile(r"^\s*typedef\s*enum\s+([\w_]+)\s*$"),
]
RE_STRUCTS = [
re.compile(r"^\s*struct\s+([_\w][\w\d_]+)\s*\{"),
re.compile(r"^\s*struct\s+([_\w][\w\d_]+)$"),
re.compile(r"^\s*typedef\s*struct\s+([_\w][\w\d_]+)\s*\{"),
re.compile(r"^\s*typedef\s*struct\s+([_\w][\w\d_]+)$"),
]
# FIXME: the original code was written a long time before the Sphinx C
# domain gained multiple namespaces. To avoid too much churn at the
# existing hyperlinks, the code kept using "c:type" instead of the
# right types. To change that, we need to change the types not only
# here, but also at the uAPI media documentation.
DEF_SYMBOL_TYPES = {
"ioctl": {
"prefix": "\\ ",
"suffix": "\\ ",
"ref_type": ":ref",
"description": "IOCTL Commands",
},
"define": {
"prefix": "\\ ",
"suffix": "\\ ",
"ref_type": ":ref",
"description": "Macros and Definitions",
},
# We're calling each definition inside an enum as "symbol"
"symbol": {
"prefix": "\\ ",
"suffix": "\\ ",
"ref_type": ":ref",
"description": "Enumeration values",
},
"typedef": {
"prefix": "\\ ",
"suffix": "\\ ",
"ref_type": ":c:type",
"description": "Type Definitions",
},
# This is the description of the enum itself
"enum": {
"prefix": "\\ ",
"suffix": "\\ ",
"ref_type": ":c:type",
"description": "Enumerations",
},
"struct": {
"prefix": "\\ ",
"suffix": "\\ ",
"ref_type": ":c:type",
"description": "Structures",
},
}
def __init__(self, debug: bool = False):
"""Initialize internal vars"""
self.debug = debug
self.data = ""
self.symbols = {}
for symbol_type in self.DEF_SYMBOL_TYPES:
self.symbols[symbol_type] = {}
def store_type(self, symbol_type: str, symbol: str,
ref_name: str = None, replace_underscores: bool = True):
"""
Stores a new symbol at self.symbols under symbol_type.
By default, underscores are replaced by "-"
"""
defs = self.DEF_SYMBOL_TYPES[symbol_type]
prefix = defs.get("prefix", "")
suffix = defs.get("suffix", "")
ref_type = defs.get("ref_type")
# Determine ref_link based on symbol type
if ref_type:
if symbol_type == "enum":
ref_link = f"{ref_type}:`{symbol}`"
else:
if not ref_name:
ref_name = symbol.lower()
# c-type references don't support hash
if ref_type == ":ref" and replace_underscores:
ref_name = ref_name.replace("_", "-")
ref_link = f"{ref_type}:`{symbol} <{ref_name}>`"
else:
ref_link = symbol
self.symbols[symbol_type][symbol] = f"{prefix}{ref_link}{suffix}"
def store_line(self, line):
"""Stores a line at self.data, properly indented"""
line = " " + line.expandtabs()
self.data += line.rstrip(" ")
def parse_file(self, file_in: str):
"""Reads a C source file and get identifiers"""
self.data = ""
is_enum = False
is_comment = False
multiline = ""
with open(file_in, "r",
encoding="utf-8", errors="backslashreplace") as f:
for line_no, line in enumerate(f):
self.store_line(line)
line = line.strip("\n")
# Handle continuation lines: drop the trailing backslash and
# join the remainder with the next line
if line.endswith("\\"):
multiline += line[:-1]
continue
if multiline:
line = multiline + line
multiline = ""
# Handle comments. They can be multilined
if not is_comment:
if re.search(r"/\*.*", line):
is_comment = True
else:
# Strip C99-style comments
line = re.sub(r"(//.*)", "", line)
if is_comment:
if re.search(r".*\*/", line):
is_comment = False
else:
multiline = line
continue
# At this point, line variable may be a multilined statement,
# if lines end with \ or if they have multi-line comments
# With that, it can safely remove the entire comments,
# and there's no need to use re.DOTALL for the logic below
line = re.sub(r"(/\*.*\*/)", "", line)
if not line.strip():
continue
# It can be useful for debug purposes to print the file after
# having comments stripped and multi-lines grouped.
if self.debug > 1:
print(f"line {line_no + 1}: {line}")
# Now the fun begins: parse each type and store it.
# We opted for a two-pass parsing logic here because:
# 1. it makes it easier to debug issues with not-parsed symbols;
# 2. we want symbol replacement at the entire content, not
# just when the symbol is detected.
if is_enum:
match = re.match(r"^\s*([_\w][\w\d_]+)\s*[\,=]?", line)
if match:
self.store_type("symbol", match.group(1))
if "}" in line:
is_enum = False
continue
match = re.match(r"^\s*#\s*define\s+([\w_]+)\s+_IO", line)
if match:
self.store_type("ioctl", match.group(1),
replace_underscores=False)
continue
match = re.match(r"^\s*#\s*define\s+([\w_]+)(\s+|$)", line)
if match:
self.store_type("define", match.group(1))
continue
match = re.match(r"^\s*typedef\s+([_\w][\w\d_]+)\s+(.*)\s+([_\w][\w\d_]+);",
line)
if match:
name = match.group(2).strip()
symbol = match.group(3)
self.store_type("typedef", symbol, ref_name=name)
continue
for re_enum in self.RE_ENUMS:
match = re_enum.match(line)
if match:
self.store_type("enum", match.group(1))
is_enum = True
break
for re_struct in self.RE_STRUCTS:
match = re_struct.match(line)
if match:
self.store_type("struct", match.group(1))
break
def process_exceptions(self, fname: str):
"""
Process exceptions file with rules to ignore or replace references.
"""
if not fname:
return
name = os.path.basename(fname)
with open(fname, "r", encoding="utf-8", errors="backslashreplace") as f:
for ln, line in enumerate(f):
ln += 1
line = line.strip()
if not line or line.startswith("#"):
continue
# Handle ignore rules
match = re.match(r"^ignore\s+(\w+)\s+(\S+)", line)
if match:
c_type = match.group(1)
symbol = match.group(2)
if c_type not in self.DEF_SYMBOL_TYPES:
sys.exit(f"{name}:{ln}: {c_type} is invalid")
d = self.symbols[c_type]
if symbol in d:
del d[symbol]
continue
# Handle replace rules
match = re.match(r"^replace\s+(\S+)\s+(\S+)\s+(\S+)", line)
if not match:
sys.exit(f"{name}:{ln}: invalid line: {line}")
c_type, old, new = match.groups()
if c_type not in self.DEF_SYMBOL_TYPES:
sys.exit(f"{name}:{ln}: {c_type} is invalid")
reftype = None
# Parse reference type when the type is specified
match = re.match(r"^\:c\:(data|func|macro|type)\:\`(.+)\`", new)
if match:
reftype = f":c:{match.group(1)}"
new = match.group(2)
else:
match = re.search(r"(\:ref)\:\`(.+)\`", new)
if match:
reftype = match.group(1)
new = match.group(2)
# If the replacement rule doesn't have a type, get default
if not reftype:
reftype = self.DEF_SYMBOL_TYPES[c_type].get("ref_type")
if not reftype:
reftype = self.DEF_SYMBOL_TYPES[c_type].get("real_type")
new_ref = f"{reftype}:`{old} <{new}>`"
# Change self.symbols to use the replacement rule
if old in self.symbols[c_type]:
self.symbols[c_type][old] = new_ref
else:
print(f"{name}:{ln}: Warning: can't find {old} {c_type}")
def debug_print(self):
"""
Print debug information containing the replacement rules per symbol.
To make it easier to check, group them per type.
"""
if not self.debug:
return
for c_type, refs in self.symbols.items():
if not refs: # Skip empty dictionaries
continue
print(f"{c_type}:")
for symbol, ref in sorted(refs.items()):
print(f" {symbol} -> {ref}")
print()
def gen_output(self):
"""Return the formatted output text with cross-references."""
# Avoid extra blank lines
text = re.sub(r"\s+$", "", self.data) + "\n"
text = re.sub(r"\n\s+\n", "\n\n", text)
# Escape Sphinx special characters
text = re.sub(r"([\_\`\*\<\>\&\\\\:\/\|\%\$\#\{\}\~\^])", r"\\\1", text)
# Source uAPI files may have special notes. Use bold font for them
text = re.sub(r"DEPRECATED", "**DEPRECATED**", text)
# Delimiters to catch the entire symbol after escaped
start_delim = r"([ \n\t\(=\*\@])"
end_delim = r"(\s|,|\\=|\\:|\;|\)|\}|\{)"
# Process all reference types
for ref_dict in self.symbols.values():
for symbol, replacement in ref_dict.items():
symbol = re.escape(re.sub(r"([\_\`\*\<\>\&\\\\:\/])", r"\\\1", symbol))
text = re.sub(fr'{start_delim}{symbol}{end_delim}',
fr'\1{replacement}\2', text)
# Remove "\ " where not needed: before spaces and at the end of lines
text = re.sub(r"\\ ([\n ])", r"\1", text)
text = re.sub(r" \\ ", " ", text)
return text
    def gen_toc(self):
        """
        Create a TOC table pointing to each symbol from the header
        """
        text = []

        # Add header
        text.append(".. contents:: Table of Contents")
        text.append(" :depth: 2")
        text.append(" :local:")
        text.append("")

        # Sort symbol types per description
        symbol_descriptions = []
        for k, v in self.DEF_SYMBOL_TYPES.items():
            symbol_descriptions.append((v['description'], k))

        symbol_descriptions.sort()

        # Process each category
        for description, c_type in symbol_descriptions:
            refs = self.symbols[c_type]
            if not refs:  # Skip empty categories
                continue

            text.append(f"{description}")
            text.append("-" * len(description))
            text.append("")

            # Sort symbols alphabetically
            for symbol, ref in sorted(refs.items()):
                text.append(f"* :{ref}:")

            text.append("")  # Add empty line between categories

        return "\n".join(text)

    def write_output(self, file_in: str, file_out: str, toc: bool):
        """
        Write either the TOC or the enriched literal block to file_out,
        preceded by an RST title taken from the basename of file_in.
        """
        title = os.path.basename(file_in)

        if toc:
            text = self.gen_toc()
        else:
            text = self.gen_output()

        with open(file_out, "w", encoding="utf-8", errors="backslashreplace") as f:
            f.write(".. -*- coding: utf-8; mode: rst -*-\n\n")
            f.write(f"{title}\n")
            f.write("=" * len(title) + "\n\n")

            if not toc:
                f.write(".. parsed-literal::\n\n")

            f.write(text)
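
The escape-then-substitute approach used by gen_output() can be tried in isolation. The following is a minimal standalone sketch, not the code from this series: the `enrich()` helper, the `refs` dictionary and the plain reference string are assumptions made for illustration, and a single escape set is used for both the text and the symbol names. It shows why the symbol itself has to be escaped before it is searched for, and how the start/end delimiters keep a symbol from matching inside a longer identifier.

```python
import re

# Characters that get a leading backslash (same set as the text escape above).
ESCAPE = re.compile(r"([_`*<>&\\:/|%$#{}~^])")

# Delimiters that must surround a symbol for it to be replaced.
START = r"([ \n\t(=*@])"
END = r"(\s|,|\\=|\\:|;|\)|\}|\{)"


def enrich(source, refs):
    """Escape `source`, then replace each delimited symbol by its reference.

    `refs` is a hypothetical {symbol: reference} mapping.
    """
    text = ESCAPE.sub(r"\\\1", source)

    for symbol, ref in refs.items():
        # The text is already escaped, so search for the escaped symbol.
        pattern = re.escape(ESCAPE.sub(r"\\\1", symbol))

        # A function replacement keeps backslashes in `ref` literal.
        text = re.sub(f"{START}{pattern}{END}",
                      lambda m, ref=ref: m.group(1) + ref + m.group(2),
                      text)

    return text


if __name__ == "__main__":
    refs = {"v4l2_buffer": ":c:type:`v4l2_buffer`"}
    print(enrich("struct v4l2_buffer buf;\n", refs), end="")
```

Note that a symbol at the very start of the text is not replaced in this sketch, since the start delimiter has to match a preceding character.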

tools/docs/parse-headers.py Executable file

@ -0,0 +1,60 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: GPL-2.0
# Copyright (c) 2016, 2025 by Mauro Carvalho Chehab <mchehab@kernel.org>.
# pylint: disable=C0103
"""
Convert a C header or source file ``FILE_IN`` into reStructuredText
included via a ``.. parsed-literal::`` block, with cross-references to the
documentation files that describe the API. It accepts an optional
``FILE_RULES`` file that describes which elements will either be ignored
or be pointed to a non-default reference type/name.

The output is written to ``FILE_OUT``.

It is capable of identifying defines, functions, structs, typedefs,
enums and enum symbols, and of creating cross-references for all of them.
It is also capable of distinguishing the #defines used to specify a Linux
ioctl.

The optional ``FILE_RULES`` contains a set of rules like:

    ignore ioctl VIDIOC_ENUM_FMT
    replace ioctl VIDIOC_DQBUF vidioc_qbuf
    replace define V4L2_EVENT_MD_FL_HAVE_FRAME_SEQ :c:type:`v4l2_event_motion_det`
"""

import argparse

from lib.parse_data_structs import ParseDataStructs
from lib.enrich_formatter import EnrichFormatter


def main():
    """Main function"""
    parser = argparse.ArgumentParser(description=__doc__,
                                     formatter_class=EnrichFormatter)

    parser.add_argument("-d", "--debug", action="count", default=0,
                        help="Increase debug level. Can be used multiple times")
    parser.add_argument("-t", "--toc", action="store_true",
                        help="output a TOC table to the RST file instead of a literal block")

    parser.add_argument("file_in", help="Input C file")
    parser.add_argument("file_out", help="Output RST file")
    parser.add_argument("file_rules", nargs="?",
                        help="Exceptions file (optional)")

    args = parser.parse_args()

    parser = ParseDataStructs(debug=args.debug)
    parser.parse_file(args.file_in)

    if args.file_rules:
        parser.process_exceptions(args.file_rules)

    parser.debug_print()
    parser.write_output(args.file_in, args.file_out, args.toc)


if __name__ == "__main__":
    main()
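
To make the ``FILE_RULES`` format from the docstring above concrete, here is a minimal sketch of how such lines could be split into rules. It is not the process_exceptions() implementation from this series: `parse_rules()` is a hypothetical helper, and skipping blank lines and `#` comments is an assumption of this sketch rather than something the docstring states.

```python
def parse_rules(lines):
    """Split FILE_RULES-style lines into (ignored, replaced) lists.

    Hypothetical helper, for illustration only:
      ignore  <type> <symbol>              -> (type, symbol)
      replace <type> <symbol> <reference>  -> (type, symbol, reference)
    """
    ignored = []
    replaced = []

    for ln, line in enumerate(lines, 1):
        line = line.strip()

        # Assumption: blank lines and '#' comments are skipped.
        if not line or line.startswith("#"):
            continue

        # At most four fields; the reference keeps any embedded spaces.
        fields = line.split(maxsplit=3)

        if fields[0] == "ignore" and len(fields) == 3:
            ignored.append((fields[1], fields[2]))
        elif fields[0] == "replace" and len(fields) == 4:
            replaced.append((fields[1], fields[2], fields[3]))
        else:
            raise ValueError(f"line {ln}: unrecognized rule: {line}")

    return ignored, replaced


if __name__ == "__main__":
    rules = [
        "ignore ioctl VIDIOC_ENUM_FMT",
        "replace ioctl VIDIOC_DQBUF vidioc_qbuf",
        "replace define V4L2_EVENT_MD_FL_HAVE_FRAME_SEQ "
        ":c:type:`v4l2_event_motion_det`",
    ]
    for kind in parse_rules(rules):
        print(kind)
```

The real script takes the rules file as its optional third positional argument, after the input C file and the output RST file.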