The device name allocated via kzalloc() in init_one_mc() is assigned to
dev->init_name but never freed on the normal removal path. device_register()
copies init_name and then sets dev->init_name to NULL, so the name pointer
becomes unreachable from the device. Thus leaking memory.
Use a stack-local char array instead of using kzalloc() for name.
Fixes: d5fe2fec6c ("EDAC: Add a driver for the AMD Versal NET DDR controller")
Signed-off-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260401111856.2342975-1-ptsm@linux.microsoft.com
up of the relevant places to have them more developer-friendly (read: sort
them alphanumerically and clean up comments) such that adding new banks is
easy
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmndH/cACgkQEsHwGGHe
VUpqFBAAvjQCWdL5GQ0sV4EyYVToj4OKU3DmUCJLMKEh3n3yrQpPsbU9+KfxyndP
B68lRfRqqV/uUxQGebh7Rnp8o2jWphU2hf1Lr0Ssl6y5ouKWs5Up4foLlG4hAhzC
2MmHVz+jj8Z3FWKLxMEymxqq6wLF+0H3Issd/l23DkK6hMQCkjKc6WrSNC6JBDCA
sSF5kR/E4Q/lcW12ncq4pUYwkKox2lcdsNtI/nC7W7W+CoqwpOq8MfomCDIII+A0
Ib7baeRxagOk0WHlfy15fGaDoKlHW6ImT3cVYBK/tomp8dpG2zRMXHHQExan2rBR
rHzvk3aHEgOr02DZJ/dxOT+libQIkBwno+DheEhJHcirB/gS5Z51ERhkyzqLReGv
+XSO1Eq9j5bqiVn8RdPeJIVLtfqnOrpcks+cCmyH0AlLIx1WV5mSRUtmVl1kWyq1
GBos0yOnH4PgMxqv8fNkfNjm1ATnHyrVjYl5YNKSzJHhu/8BYcQJ4X8R0f2m0pXS
WI6uXf35C6rJcKj25qo1Nnhmj5YDWJgelJjes9ZtmRMPDNNooD4VLk1W6ox7VuOY
QaNMNwrroLRdfOlaz7oYIUAuoaZbZnTqbz8Lfmb4UScLd9LfI5ZPqs7pB5VORApF
5IYM/Wli+kQl2Qbz0CD6ZtfdidqR09H7oJBE/r6bEFePot2EpUY=
=aivF
-----END PGP SIGNATURE-----
Merge tag 'ras_core_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Borislav Petkov:
- Add new AMD MCA bank names and types to the MCA code, preceded by a
clean up of the relevant places to have them more developer-friendly
(read: sort them alphanumerically and clean up comments) such that
adding new banks is easy
* tag 'ras_core_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce, EDAC/mce_amd: Add new SMCA bank types
x86/mce, EDAC/mce_amd: Update CS bank type naming
x86/mce, EDAC/mce_amd: Reorder SMCA bank type enums
- i10nm: Add GNR error information decoder support as an alternative to the
firmware decoder
— versalnet: Restructure the init/teardown logic for correct and more readable
error handling. Also, fix two memory leaks and a resource leak
— Convert several internal structs to use bounded flex arrays, enabling the
kernel's runtime checker to catch out-of-bounds memory accesses
- Mark various sysfs attribute tables read-only, preventing accidental
modification at runtime
- The usual fixes and cleanups across the subsystem
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmncwr0ACgkQEsHwGGHe
VUp/7Q//SITcEEp/IZrZoUKg0CRwJY+uifYo0SQHCoLlq0JzjuQwsS8szwJfF3fy
tm2JHq1e9SaPYAD78Ybda1MruyUFg/GesexgRgjtkrxb1A6tW48nsPdxuTCv5KCe
A5wZM8ByDt4cOk68LzRIWtisuiaVzGzuExfUmTAQwc8G0sR7g7rMfClY9qWMvOtS
l8ZzM1LbShvSPjnzkQIoKVAQH4OOr+uCNwT/8myUY7WkAo0qkTjrOWVGjYG4Dt2u
HTWqOaJZase8V6kMdo6HIpE5jTy+Ic+9tfXCibqUygTW6s7OWTnCVVRdB/aLZmFK
zvdUOH4J+8OAF19xM8FNVvFPZySrq2EPBNl09+kBwZOOY2afT9MnpmQJlXkdAPJY
fnSDJX/oKt4q7te2+OEQoIg2dhFkqVr7nXK4JsjqYGEdpgacays2ITY8n3x8joDE
LgBMpO8TTaHEQBNFUfGG5+y9cu2AIBdaxBNIq1yrSRUNTzZ0Af32cy+QnGimUi9g
40LyXm418Ixc9UZrWPnZ6mIiiLVhWtv9aJfulidyFa52xDOVbJojlydhLUojxX7n
dXAD7jLoHcKqZqRWD5NUtBVqFoOiEhtuHZjOQ4+ocebrMPAt7NSC13a4NCLZylyy
zP02Tmj2sTgaq/5KBKIoVSP4sKmk4bX4rxYvTgeIbpu7txLcWcU=
=EZr3
-----END PGP SIGNATURE-----
Merge tag 'edac_updates_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC updates from Borislav Petkov:
- amd64_edac: Add support for AMD Zen 3 (family 19h, models 40h–4fh)
- i10nm: Add GNR error information decoder support as an alternative to
the firmware decoder
- versalnet: Restructure the init/teardown logic for correct and more
readable error handling. Also, fix two memory leaks and a resource
leak
- Convert several internal structs to use bounded flex arrays, enabling
the kernel's runtime checker to catch out-of-bounds memory accesses
- Mark various sysfs attribute tables read-only, preventing accidental
modification at runtime
- The usual fixes and cleanups across the subsystem
* tag 'edac_updates_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/mc: Use kzalloc_flex()
EDAC/ie31200: Make rpl_s_cfg static
EDAC/i10nm: Fix spelling mistake "readd" -> "read"
EDAC/versalnet: Fix device_node leak in mc_probe()
EDAC/versalnet: Fix memory leak in remove and probe error paths
EDAC/amd64: Add support for family 19h, models 40h-4fh
EDAC/i10nm: Add driver decoder for Granite Rapids server
EDAC/sb: Use kzalloc_flex()
EDAC/i7core: Use kzalloc_flex()
EDAC/mpc85xx: Constify device sysfs attributes
EDAC/device: Allow addition of const sysfs attributes
EDAC/pci_sysfs: Constify instance sysfs attributes
EDAC/device: Constify info sysfs attributes
EDAC/device: Drop unnecessary and dangerous casts of attributes
EDAC/device: Drop unused macro to_edacdev_attr()
EDAC/altera: Drop unused field eccmgr_sysfs_attr
EDAC/versalnet: Refactor memory controller initialization and cleanup
When the mci->pvt_info allocation in edac_mc_alloc() fails, the error path
will call put_device() which will end up calling the device's release
function.
However, the init ordering is wrong such that device_initialize() happens
*after* the failed allocation and thus the device itself and the release
function pointer are not initialized yet when they're called:
MCE: In-kernel MCE decoding enabled.
------------[ cut here ]------------
kobject: '(null)': is not initialized, yet kobject_put() is being called.
WARNING: lib/kobject.c:734 at kobject_put, CPU#22: systemd-udevd
CPU: 22 UID: 0 PID: 538 Comm: systemd-udevd Not tainted 7.0.0-rc1+ #2 PREEMPT(full)
RIP: 0010:kobject_put
Call Trace:
<TASK>
edac_mc_alloc+0xbe/0xe0 [edac_core]
amd64_edac_init+0x7a4/0xff0 [amd64_edac]
? __pfx_amd64_edac_init+0x10/0x10 [amd64_edac]
do_one_initcall
...
Reorder the calling sequence so that the device is initialized and thus the
release function pointer is properly set before it can be used.
This was found by Claude while reviewing another EDAC patch.
Fixes: 0bbb265f70 ("EDAC/mc: Get rid of silly one-shot struct allocation in edac_mc_alloc()")
Reported-by: Claude Code:claude-opus-4.5
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: stable@kernel.org
Link: https://patch.msgid.link/20260331121623.4871-1-bp@kernel.org
Convert struct mem_ctl_info to use flex array and use the new flex array
helpers to enable runtime bounds checking, including annotating the array
length member with __counted_by() for extra runtime analysis when requested.
Move memcpy() after the counter assignment so that it is initialized before
the first reference to the flex array, as the new attribute requires.
[ bp: Heavily massage commit message. ]
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://patch.msgid.link/20260327024828.7377-1-rosenp@gmail.com
The rpl_s_cfg variable is only used within this file, so mark it static, as
Sparse reports:
drivers/edac/ie31200_edac.c:709:19: warning: symbol 'rpl_s_cfg' was not
declared. Should it be static?
[ bp: Massage commit message. ]
Signed-off-by: Marilene Andrade Garcia <marilene.agarcia@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/f353c3aff47abd88bafc46a1c0c5eb9917e11b28.1774619301.git.marilene.agarcia@gmail.com
of_parse_phandle() returns a device_node reference that must be released with
of_node_put(). The original code never freed r5_core_node on any exit path,
causing a memory leak.
Fix this by using the automatic cleanup attribute __free(device_node) which
ensures of_node_put() is called when the variable goes out of scope.
Fixes: d5fe2fec6c ("EDAC: Add a driver for the AMD Versal NET DDR controller")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
Cc: <stable@kernel.org>
Link: https://patch.msgid.link/20260323-versalnet-v1-1-4ab3012635ef@gmail.com
The mcdi object allocated using kzalloc() in the setup_mcdi() is not freed in
the remove path or in probe's error handling path leading to a memory leak.
Fix it by freeing the allocated memory.
Fixes: d5fe2fec6c ("EDAC: Add a driver for the AMD Versal NET DDR controller")
Signed-off-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260322131139.1684716-1-ptsm@linux.microsoft.com
Current i10nm_edac only supports the firmware decoder (ACPI DSM methods)
for Granite Rapids servers. Add the driver decoder, which directly extracts
topology information from the IMC machine check bank IA32_MCi_MISC MSRs, to
improve decoding performance for Granite Rapids.
[Tony: Updated commit comment]
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Shawn Fan <shawn.fan@intel.com>
Link: https://patch.msgid.link/20260318023118.2704139-1-qiuxu.zhuo@intel.com
Simplifies allocations by using a flexible array member in this struct.
Add __counted_by to get extra runtime analysis. Move counting variable
assignment immediately after allocation as required by __counted_by.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://patch.msgid.link/20260313215637.6371-1-rosenp@gmail.com
Simplifies allocations by using a flexible array member in this struct.
Add __counted_by to get extra runtime analysis. Move counting variable
assignment to right after allocation as required by __counted_by.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://patch.msgid.link/20260313215900.6724-1-rosenp@gmail.com
These casts assume that the struct attribute is at the beginning of struct
edac_dev_sysfs_block_attribute. This is can silently break if the field is
moved, either manually or through struct randomization.
Use proper member syntax to get the field address and drop the casts.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260223-sysfs-const-edac-v1-3-3ff0b87249e7@weissschuh.net
Recognize new SMCA bank types and include their short names for sysfs
and long names for decoding.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260307163316.345923-4-yazen.ghannam@amd.com
Recent documentation updated the "CS" bank type name from "Coherent
Slave" to "Coherent Station".
Apply this change in the kernel also.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260307163316.345923-3-yazen.ghannam@amd.com
Originally, the SMCA bank type enums were ordered based on processor
documentation. However, the ordering became inconsistent after new bank
types were added over time.
Sort the bank type enums alphanumerically in most places. Sort the
"enum to HWID/McaType" mapping by HWID/McaType. Drop redundant code
comments.
No functional changes.
[ bp: Sort them alphanumerically. ]
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260307163316.345923-2-yazen.ghannam@amd.com
Simplify the initialization and cleanup flow for Versal Net DDRMC
controllers in the EDAC driver by carving out the single controller init
into a separate function which allows for a much better and more
readable error handling and unwinding.
[ bp:
- do the kzalloc allocations first
- "publish" the structures only after they've been initialized
properly so that you don't need to unwind unnecessarily when
it fails later
- remove_versalnet() is now trivial
]
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20251104093932.3838876-1-shubhrajyoti.datta@amd.com
This converts some of the visually simpler cases that have been split
over multiple lines. I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.
Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script. I probably had made it a bit _too_ trivial.
So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.
The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the exact same thing as the 'alloc_obj()' version, only much
smaller because there are a lot fewer users of the *alloc_flex()
interface.
As with alloc_obj() version, this was done entirely with mindless brute
force, using the same script, except using 'flex' in the pattern rather
than 'objs*'.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
- Add support for Intel Amston Lake and Panther Lake-H SoCs to igen6_edac
- The usual amount of fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmmJ2DQACgkQEsHwGGHe
VUo8ow/9HFeAYSqEqYxBzv46qumdCE/Ru6ijz0v6wFW0ayKUVnwJeAfCDOSAnsO3
G7CTHt9a6kD1aZG8YbZXmjc+dghffqzUeMt7pOoIB0RliY+LUDC6FVGOt8U1ulBB
ymp4iPuBWNapTs083sVLJsXtphCt2U48PE1Hz63c9QfM1LHyOfb6edCfdGemaP4B
gV1GalYuUlDm2Bmw5PA35Kc2MXff7FA7FF6WvSIV1ljVYGpFKHL9VFRLESWSEYi6
27A27l4VSqZfQXPRJkH1naIKkflsralrgvZ7cSCsb4Lq8V4oGzatfrnTK/0J5A9v
PEZk2HO7ZUaYStmzcTfUt8mSjlAN34Q77BQ9Z6796ju2PhJlN5TQJxLz+lvzbiec
kWKtded961qrLB/mXDFfiyLMyhqdPR13ZK3vb0xJcDvq8LQZ0tVXsbmH6UclNwOs
Nk+Hb8F7PghU/nDgJkYlL0mjCyCucl+4UpS/ObCXhAVUFDHik4SRZ8TPw1mz5ywt
HHidlCDbKN42zji7p8ISpC+OTJWW7IQX24V+QarJ/Qi519K/o71XeBwnhXNSjuku
sRB1tSWg9ZEXec2BPnqEvW76ODRkblQxO6/KdYW2D8X9aV5XDQdq/RX4yGbsV+7s
faALpB26wCVJ7oPoYPl/AX3UKBvaGz0IRR/zPoI0LkXqjnRrgUk=
=s1Be
-----END PGP SIGNATURE-----
Merge tag 'edac_updates_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC updates from Borislav Petkov:
- Remove two drivers for obsolete hardware: i82443bxgx_edac and
r82600_edac
- Add support for Intel Amston Lake and Panther Lake-H SoCs to
igen6_edac
- The usual amount of fixes and cleanups
* tag 'edac_updates_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/r82600: Remove this obsolete driver
EDAC/i82443bxgx: Remove driver that has been marked broken since 2007
EDAC/amd64: Avoid a -Wformat-security warning
RAS/AMD/ATL: Remove an unneeded semicolon
EDAC/igen6: Add more Intel Panther Lake-H SoCs support
EDAC/igen6: Make masks of {MCHBAR, TOM, TOUUD, ECC_ERROR_LOG} configurable
EDAC/igen6: Add two Intel Amston Lake SoCs support
EDAC/i5400: Fix snprintf() limit calculation in calculate_dimm_size()
EDAC/i5000: Fix snprintf() size calculation in calculate_dimm_size()
* ras/edac-drivers:
EDAC/r82600: Remove this obsolete driver
EDAC/i82443bxgx: Remove driver that has been marked broken since 2007
EDAC/igen6: Add more Intel Panther Lake-H SoCs support
EDAC/igen6: Make masks of {MCHBAR, TOM, TOUUD, ECC_ERROR_LOG} configurable
EDAC/igen6: Add two Intel Amston Lake SoCs support
EDAC/i5400: Fix snprintf() limit calculation in calculate_dimm_size()
EDAC/i5000: Fix snprintf() size calculation in calculate_dimm_size()
* ras/edac-misc:
EDAC/amd64: Avoid a -Wformat-security warning
* ras/edac-amd-atl:
RAS/AMD/ATL: Remove an unneeded semicolon
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Passing IRQF_ONESHOT ensures that the interrupt source is masked until
the secondary (threaded) handler is done. If only a primary handler is
used then the flag makes no sense because the interrupt can not fire
(again) while its handler is running.
The flag also prevents force-threading of the primary handler and the
irq-core will warn about this.
Remove IRQF_ONESHOT from irqflags.
Fixes: a29d64a45e ("EDAC, altera: Add IRQ Flags to disable IRQ while handling")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260128095540.863589-11-bigeasy@linutronix.de
This driver supports the Radisys 82600 embedded chipset, which was used with
Pentium III-era CPUs. It is highly unlikely that it is still used. Besides
its own documentation, the only information I was able to find about the
R82600, after looking through many pages of Google results, was that it was
used in a Nokia 2G GSM base station.
The original author said:
"This was after Bluesmoke (watch was out-of-tree), and was pay of the first
set of in tree edac drivers (a fresh design as far as I remember).
This particular driver did apparently get used by Akamai quite heavily and
widely but all ancient history now. The Radisys r82600 edac driver
r82600_edac.c was closely related hardware from the same era, and can probably
go too (although being embedded hardware, it's possible its production
lifespan was considerably longer).
Tim."
Mail is
https://lore.kernel.org/r/01BCCA37-F6A2-458B-BFE7-99C7724CDDEA@buttersideup.com
but lkml drops html mail so quoting the relevant info here too.
[ bp: Massage and extend commit message. ]
Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260130052633.13119-1-enelsonmoore@gmail.com
The history of this driver is pretty amusing. It was marked broken in
2007 in
28f96eeafc ("drivers/edac-new-i82443bxgz-mc-driver: mark as broken").
It was then fixed in 2008 in
53a2fe5804 ("edac: make i82443bxgx_edac coexist with intel_agp"),
but the dependency on BROKEN was never removed. Given that this was never
fixed in the last ~18 years, it is obvious there is no demand for this driver.
Remove it and move the former maintainer to the CREDITS file.
Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260129082937.48740-1-enelsonmoore@gmail.com
Using a variable as a format string causes a (default-disabled) warning:
drivers/edac/amd64_edac.c: In function 'per_family_init':
drivers/edac/amd64_edac.c:3914:17: error: format not a string literal and no format arguments [-Werror=format-security]
3914 | scnprintf(pvt->ctl_name, sizeof(pvt->ctl_name), tmp_name);
| ^~~~~~~~~
The code here is safe, but in order to enable the warning by default in the
future, change this instance to pass the name indirectly.
Fixes: e9abd990ae ("EDAC/amd64: Generate ctl_name string at runtime")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Avadhut Naik <avadhut.naik@amd.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Link: https://patch.msgid.link/20251204100231.1034557-1-arnd@kernel.org
The masks used to retrieve base addresses from {MCHBAR, TOM, TOUUD,
ECC_ERROR_LOG} registers can be CPU model-specific. Currently,
igen6_edac hard-codes these masks with the most significant bit at
38, while some CPUs have extended the most significant bit to bit 41
or bit 45.
Systems with more than 512GB (2^39) memory need this extension to get
correct masks. But all CPUs currently supported by igen6_edac support
max memory less than 512GB (e.g., max memory size for Raptor Lake
systems is 192GB, for Alder Lake systems is 128GB, ...), which means
the previous hard-coded most significant bit 38 still works properly.
So backporting this patch to stable kernels is not necessary.
To make these masks reflect the CPUs' real support and easily support
future Intel client CPUs supported by igen6_edac that have more than
512GB memory, add four new fields to structure res_config to make these
masks CPU model-specific and configure them properly.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Jianfeng Gao <jianfeng.gao@intel.com>
Link: https://patch.msgid.link/20251219042956.3232568-3-qiuxu.zhuo@intel.com
Intel Amston Lake SoCs with IBECC (In-Band ECC) capability share the same
IBECC registers as Alder Lake-N SoCs. Add two new compute die IDs for
Amston Lake SoC products to enable EDAC support.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Jianfeng Gao <jianfeng.gao@intel.com>
Link: https://patch.msgid.link/20251124065457.3630949-2-qiuxu.zhuo@intel.com
The snprintf() can't really overflow because we're writing a max of 42
bytes to a PAGE_SIZE buffer. But my static checker complains because
the limit calculation doesn't take the first 11 space characters that
we wrote into the buffer into consideration. Fix this for the sake of
correctness even though it doesn't affect runtime.
Also delete an earlier "space -= n;" which was not used.
Fixes: 68d086f89b ("i5400_edac: improve debug messages to better represent the filled memory")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://patch.msgid.link/ccd06b91748e7ed8e33eeb2ff1e7b98700879304.1765290801.git.dan.carpenter@linaro.org
The snprintf() can't really overflow because we're writing a max of 42
bytes to a PAGE_SIZE buffer. But the limit calculation doesn't take
the first 11 bytes that we wrote into consideration so the limit is
not correct. Just fix it for correctness even though it doesn't
affect runtime.
Fixes: 64e1fdaf55 ("i5000_edac: Fix the logic that retrieves memory information")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://patch.msgid.link/07cd652c51e77aad5a8350e1a7cd9407e5bbe373.1765290801.git.dan.carpenter@linaro.org
future incarnations of this memory controllers architecture
- amd64_edac: Remove the legacy csrow sysfs interface which has been
deprecated and unused (we assume) for at least a decade
- Add the capability to fallback to BIOS-provided address translation
functionality (ACPI PRM) which can be used on systems unsupported by
the current AMD address translation library
- The usual fixes, fixlets, cleanups and improvements all over the place
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmktdyMACgkQEsHwGGHe
VUpXTxAAhdQxn1v1tYKya6YHxBS3T3Y3+4fec+LeKgoY1YnoFHMse3TAU+G67opR
1xnEKHKrkX4v1FAwe7eD2G6qyz2ytqcApv4XGxmQ1WgldFWuPl/lI3ngPNMCHMog
dqeQFRQ7MXsk0no0cjMA6NjafFpYOGGGhIzdU3wvgZawH4hG9wHLS6Urvn2SfWj6
Pf/449qS7XoPU5G22qWPqqixRHpc9BPkJfKMIYeaWbxldePlwbh9cOMLqwsZo1QV
v5cv/3CAIVFzRvNVIx05kDhRrwqTjIZL+u9IYHg2g9DA45GQuktYQwd1KksbVpUn
CijhpKMoSnQHN+ZLW84XzvEH2rvroSTZl28d5suY1GHXG3ePc9HpmTVbVElFXWKZ
dq0X2RIbMEbSxneePFHJ4ESUfNN2HbPSfh/sXN4epxcMQI0VWVhXYs5+Ek4UV1+E
hvhCS/kuAypODzEi0cULoMcXdyKr2V1zpaAHNlZshepp/kUzY46b3cBhxKiL3Fsd
x+IhZgow9a+iMJfMpCJhMABKEkoZRgS3gs5nWMJ6t0EvulvknG+aovGB/Q0VaIIa
H69Fn+R2ewnEuZf1JGZDMit1y+wjGgeamk+uWTym+tCyNH1eHaSq48POribajcYF
UtcobK4kG7hPodsbwwD4MhqtSLhuyIcXTHbI3x4+r+LLAgdAPKM=
=NidS
-----END PGP SIGNATURE-----
Merge tag 'edac_updates_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC updates from Borislav Petkov:
- imh_edac: Add a new EDAC driver for Intel Diamond Rapids and future
incarnations of this memory controllers architecture
- amd64_edac: Remove the legacy csrow sysfs interface which has been
deprecated and unused (we assume) for at least a decade
- Add the capability to fallback to BIOS-provided address translation
functionality (ACPI PRM) which can be used on systems unsupported by
the current AMD address translation library
- The usual fixes, fixlets, cleanups and improvements all over the
place
* tag 'edac_updates_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
RAS/AMD/ATL: Replace bitwise_xor_bits() with hweight16()
EDAC/igen6: Fix error handling in igen6_edac driver
EDAC/imh: Setup 'imh_test' debugfs testing node
EDAC/{skx_comm,imh}: Detect 2-level memory configuration
EDAC/skx_common: Extend the maximum number of DRAM chip row bits
EDAC/{skx_common,imh}: Add EDAC driver for Intel Diamond Rapids servers
EDAC/skx_common: Prepare for skx_set_hi_lo()
EDAC/skx_common: Prepare for skx_get_edac_list()
EDAC/{skx_common,skx,i10nm}: Make skx_register_mci() independent of pci_dev
EDAC/ghes: Replace deprecated strcpy() in ghes_edac_report_mem_error()
EDAC/ie31200: Fix error handling in ie31200_register_mci
RAS/CEC: Replace use of system_wq with system_percpu_wq
EDAC: Remove the legacy EDAC sysfs interface
EDAC/amd64: Remove NUM_CONTROLLERS macro
EDAC/amd64: Generate ctl_name string at runtime
RAS/AMD/ATL: Require PRM support for future systems
ACPI: PRM: Add acpi_prm_handler_available()
RAS/AMD/ATL: Return error codes from helper functions
Drop the driver-specific field_get() macro, in favor of the globally
available variant from <linux/bitfield.h>.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Prepare for the advent of a globally available common field_get() macro
by undefining the symbol before defining a local variant. This prevents
redefinition warnings from the C preprocessor when introducing the common
macro later.
Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
The igen6_edac driver calls device_initialize() for all memory
controllers in igen6_register_mci(), but misses corresponding
put_device() calls in error paths and during normal shutdown in
igen6_unregister_mcis().
Adding the missing put_device() calls improves code readability and
ensures proper reference counting for the device structure.
Found by code review.
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://patch.msgid.link/20251105090244.23327-1-make24@iscas.ac.cn
Setup the following debugfs testing node to enable fake memory error
address decoding tests for the imh_edac driver.
/sys/kernel/debug/edac/imh_test/addr
Tested-by: Yi Lai <yi1.lai@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://patch.msgid.link/20251119134132.2389472-8-qiuxu.zhuo@intel.com
Detect 2-level memory configurations and notify the 'skx_common' library
to enable ADXL 2-level memory error decoding.
Tested-by: Yi Lai <yi1.lai@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://patch.msgid.link/20251119134132.2389472-7-qiuxu.zhuo@intel.com