XDR enum decoders generated by xdrgen do not verify that incoming
values are valid members of the enum. Incoming out-of-range values
from malicious or buggy peers propagate through the system
unchecked.

Add validation logic to generated enum decoders using a switch
statement that explicitly lists valid enumerator values. The
compiler optimizes this to a simple range check when enum values
are dense (contiguous), while correctly rejecting invalid values
for sparse enums with gaps in their value ranges.
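
The switch-based check described above can be sketched as a small
generator function. This is only an illustration, not xdrgen's actual
template code: the function name emit_enum_check, the emitted helper
name "<enum>_is_valid", and the enumerator names in the demo are all
assumptions made for this sketch.

```python
def emit_enum_check(enum_name, enumerators):
    """Return C source text for a validity check on one enum type.

    Every enumerator gets its own case label, so a value that falls in a
    gap between enumerators reaches the default label and is rejected.
    """
    lines = [
        # "<enum>_is_valid" is a hypothetical helper name for this sketch.
        f"static bool {enum_name}_is_valid(u32 val)",
        "{",
        "\tswitch (val) {",
    ]
    # One case label per enumerator; all of them share a single return.
    lines += [f"\tcase {name}:" for name in enumerators]
    lines += [
        "\t\treturn true;",
        "\tdefault:",
        "\t\treturn false;",
        "\t}",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical enumerator names, used only for illustration.
    print(emit_enum_check("stable_how", ["UNSTABLE", "DATA_SYNC", "FILE_SYNC"]))
```

Listing the enumerators by name keeps the emitted check independent of
their numeric values, so the same output shape works for dense and
sparse enums alike.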

The --no-enum-validation option on the source subcommand disables
this validation when it is not needed.

The minimum and maximum fields in _XdrEnum, which were previously
unused placeholders for a range-based validation approach, have
been removed since the switch-based validation handles both dense
and sparse enums correctly.
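
To illustrate why a minimum/maximum pair is not enough on its own, the
following sketch compares a range check with a membership check. The
helper names and the example value sets are hypothetical; the
membership check stands in for a switch with one case per enumerator.

```python
def in_range(value, members):
    """Range-based check: accepts anything between the min and the max."""
    return min(members) <= value <= max(members)


def is_member(value, members):
    """Membership check: accepts only the listed values."""
    return value in members


dense = {0, 1, 2}    # contiguous values
sparse = {1, 2, 4}   # hypothetical enum with a gap at 3

if __name__ == "__main__":
    # For the dense set both checks agree on every candidate value.
    print(all(in_range(v, dense) == is_member(v, dense) for v in range(-2, 6)))
    # For the sparse set the range check wrongly accepts the gap value.
    print(in_range(3, sparse), is_member(3, sparse))
```

For the sparse set, the range check accepts 3 even though no enumerator
has that value, while the membership check rejects it.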

Because the new mechanism substantially changes the generated code,
the output for existing .x files has been regenerated. Unrelated
whitespace and semicolon changes in the generated code are due to recent
commit 1c873a2fd1 ("xdrgen: Don't generate unnecessary semicolon")
and commit 38c4df91242b ("xdrgen: Address some checkpatch whitespace
complaints").

Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Commit 277df18d7df9 ("xdrgen: Improve parse error reporting") added
clean, compiler-style error messages for syntax errors detected during
parsing. However, semantic errors discovered during AST transformation
still produce verbose Python stack traces.

When an XDR specification references an undefined type, the transformer
raises a VisitError wrapping a KeyError. Before this change:

  Traceback (most recent call last):
    File ".../lark/visitors.py", line 124, in _call_userfunc
      return f(children)
  ...
  KeyError: 'fsh4_mode'
  ...
  lark.exceptions.VisitError: Error trying to process rule "basic":
  'fsh4_mode'

After this change:

  file.x:156:2: semantic error
  Undefined type 'fsh4_mode'
  fsh4_mode mode;
  ^

The new handle_transform_error() function extracts position information
from the Lark tree node metadata and formats the error consistently with
parse error messages.
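
A minimal sketch of this kind of formatting follows. It is not the
actual handle_transform_error() implementation: the Meta class is a
stand-in for the position metadata carried by a tree node, and the
function names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Meta:
    """Stand-in for the position metadata a parse-tree node carries."""
    line: int    # 1-based line number
    column: int  # 1-based column number


def describe_undefined_type(exc):
    """Turn the KeyError raised for an unknown type name into a message."""
    return f"Undefined type {exc.args[0]!r}"


def format_semantic_error(filename, source, meta, message):
    """Return a compiler-style diagnostic with a caret under the column."""
    source_line = source.splitlines()[meta.line - 1]
    # Copy tabs into the padding so the caret stays aligned with the
    # source text regardless of how wide a tab is displayed.
    prefix = source_line[: meta.column - 1]
    padding = "".join(ch if ch == "\t" else " " for ch in prefix)
    return "\n".join([
        f"{filename}:{meta.line}:{meta.column}: semantic error",
        message,
        source_line,
        padding + "^",
    ])


if __name__ == "__main__":
    # Hypothetical specification text, used only for illustration.
    spec = "struct example {\n\tfsh4_mode mode;\n};\n"
    error = KeyError("fsh4_mode")
    print(format_semantic_error("file.x", spec, Meta(2, 2),
                                describe_undefined_type(error)))
```

Keeping the message construction separate from the layout lets parse
errors and semantic errors share the same file:line:column format.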

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

The current verbose Lark exception output makes it difficult to
quickly identify and fix syntax errors in XDR specifications. Users
must wade through hundreds of lines of cascading errors to find the
root cause.

Replace this with concise, compiler-style error messages showing
file, line, column, the unexpected token, and the source line with
a caret pointing to the error location.

Before:

  Unexpected token Token('__ANON_1', '+1') at line 14, column 35.
  Expected one of:
      * SEMICOLON
  Previous tokens: [Token('__ANON_0', 'LM_MAXSTRLEN')]
  [hundreds more cascading errors...]

After:

  file.x:14:35: parse error
  Unexpected number '+1'
  const LM_MAXNAMELEN = LM_MAXSTRLEN+1;
                                    ^

The error handler now raises XdrParseError on the first error,
preventing cascading messages that obscure the root cause.
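
The difference between reporting every error and stopping at the first
one can be sketched with a toy checker. Only the XdrParseError name
comes from this change; the class body, the scan() checker, and the
handler names are hypothetical and do not reflect how xdrgen hooks into
its parser.

```python
class XdrParseError(Exception):
    """Stand-in exception; only the name comes from the text above."""


def scan(lines, on_error):
    """Toy checker: call on_error for each line lacking a ';' terminator."""
    for lineno, text in enumerate(lines, start=1):
        if not text.rstrip().endswith(";"):
            on_error(lineno, text)


def fail_fast(lineno, text):
    """Abort at the first problem so later, derived errors never appear."""
    raise XdrParseError(f"line {lineno}: expected ';' after {text.strip()!r}")


def first_error(lines):
    """Return the message for the first error, or None if there is none."""
    try:
        scan(lines, fail_fast)
    except XdrParseError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    spec = ["const A = 1;", "const B = 2", "const C = 3"]
    collected = []
    # A handler that returns normally lets the scan continue and pile up reports.
    scan(spec, lambda lineno, text: collected.append(lineno))
    print(collected)
    # A handler that raises stops the scan at the first report.
    print(first_error(spec))
```

A handler that raises leaves the user with one report to act on instead
of a list in which later entries may only be consequences of the first.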

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Add a Python-based tool for translating XDR specifications into XDR
encoder and decoder functions written in C. The generator attempts
to match the usual C coding style of the Linux kernel's SunRPC
consumers.

This approach is similar to the netlink code generator in
tools/net/ynl.

The maintainability benefits of machine-generated XDR code include:

- Stronger type checking
- Fewer bugs introduced by human error
- XDR code that is easier to audit and analyze
- Rapid prototyping of new RPC-based protocols
- Hardened layering between protocol logic and marshaling
- Easier addition of observability on demand
- Unit tests that might be built for both the tool and
  (automatically) for the generated code

In addition, converting the XDR layer to use memory-safe languages
such as Rust will be easier if much of the code can be converted
automatically.

Tested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>