XDR specification files can contain lines prefixed with '%' that
pass through unchanged to generated output. Traditional rpcgen
removes the '%' and emits the remainder verbatim, allowing direct
insertion of C includes, pragma directives, or other language-
specific content into the generated code.
Until now, xdrgen silently discarded these lines during parsing.
This prevented specifications from including necessary headers or
preprocessor directives that might be required for the generated
code to compile correctly.
The grammar now captures pass-through lines instead of ignoring
them. A new AST node type represents pass-through content, and
the AST transformer strips the leading '%' character. Definition
and source generators emit pass-through content in document order,
preserving the original placement within the specification.
This brings xdrgen closer to feature parity with traditional
rpcgen while maintaining the existing document-order processing
model.
Existing generated xdrgen source code has been regenerated.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
XDR enum decoders generated by xdrgen do not verify that incoming
values are valid members of the enum. Incoming out-of-range values
from malicious or buggy peers propagate through the system
unchecked.
Add validation logic to generated enum decoders using a switch
statement that explicitly lists valid enumerator values. The
compiler optimizes this to a simple range check when enum values
are dense (contiguous), while correctly rejecting invalid values
for sparse enums with gaps in their value ranges.
The --no-enum-validation option on the source subcommand disables
this validation when not needed.
The minimum and maximum fields in _XdrEnum, which were previously
unused placeholders for a range-based validation approach, have
been removed since the switch-based validation handles both dense
and sparse enums correctly.
Because the new mechanism results in substantive changes to
generated code, existing .x files are regenerated. Unrelated white
space and semicolon changes in the generated code are due to recent
commit 1c873a2fd1 ("xdrgen: Don't generate unnecessary semicolon")
and commit 38c4df91242b ("xdrgen: Address some checkpatch whitespace
complaints").
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
struct svc_service has a .vs_xdrsize field that is filled in by
servers for each of their RPC programs. This field is supposed to
contain the size of the largest procedure argument in the RPC
program. This value is also sometimes used to size network
transport buffers.
Currently, server implementations must manually calculate and
hard-code this value, which is error-prone and requires updates
when procedure arguments change.
Update xdrgen to determine which procedure argument structure is
largest, and emit a macro with a well-known name that contains
the size of that structure. Server code then uses this macro when
initializing the .vs_xdrsize field.
For NLM version 4, xdrgen now emits:
#define NLM4_MAX_ARGS_SZ (NLM4_nlm4_lockargs_sz)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Commit 277df18d7df9 ("xdrgen: Improve parse error reporting") added
clean, compiler-style error messages for syntax errors detected during
parsing. However, semantic errors discovered during AST transformation
still produce verbose Python stack traces.
When an XDR specification references an undefined type, the transformer
raises a VisitError wrapping a KeyError. Before this change:
Traceback (most recent call last):
File ".../lark/visitors.py", line 124, in _call_userfunc
return f(children)
...
KeyError: 'fsh4_mode'
...
lark.exceptions.VisitError: Error trying to process rule "basic":
'fsh4_mode'
After this change:
file.x:156:2: semantic error
Undefined type 'fsh4_mode'
fsh4_mode mode;
^
The new handle_transform_error() function extracts position information
from the Lark tree node metadata and formats the error consistently with
parse error messages.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
The current verbose Lark exception output makes it difficult to
quickly identify and fix syntax errors in XDR specifications. Users
must wade through hundreds of lines of cascading errors to find the
root cause.
Replace this with concise, compiler-style error messages showing
file, line, column, the unexpected token, and the source line with
a caret pointing to the error location.
Before:
Unexpected token Token('__ANON_1', '+1') at line 14, column 35.
Expected one of:
* SEMICOLON
Previous tokens: [Token('__ANON_0', 'LM_MAXSTRLEN')]
[hundreds more cascading errors...]
After:
file.x:14:35: parse error
Unexpected number '+1'
const LM_MAXNAMELEN = LM_MAXSTRLEN+1;
^
The error handler now raises XdrParseError on the first error,
preventing cascading messages that obscure the root cause.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Add a Python-based tool for translating XDR specifications into XDR
encoder and decoder functions written in the Linux kernel's C coding
style. The generator attempts to match the usual C coding style of
the Linux kernel's SunRPC consumers.
This approach is similar to the netlink code generator in
tools/net/ynl .
The maintainability benefits of machine-generated XDR code include:
- Stronger type checking
- Reduces the number of bugs introduced by human error
- Makes the XDR code easier to audit and analyze
- Enables rapid prototyping of new RPC-based protocols
- Hardens the layering between protocol logic and marshaling
- Makes it easier to add observability on demand
- Unit tests might be built for both the tool and (automatically)
for the generated code
In addition, converting the XDR layer to use memory-safe languages
such as Rust will be easier if much of the code can be converted
automatically.
Tested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>