linux/kernel
Roman Gushchin 5318f3163c BACKPORT: cgroup: cgroup v2 freezer
Cgroup v1 implements the freezer controller, which provides an ability
to stop the workload in a cgroup and temporarily free up some
resources (cpu, io, network bandwidth and, potentially, memory)
for some other tasks. Cgroup v2 lacks this functionality.

This patch implements freezer for cgroup v2.

Cgroup v2 freezer tries to put tasks into a state similar to jobctl
stop. This means that tasks can be killed, ptraced (using
PTRACE_SEIZE*), and interrupted. It is possible to attach to
a frozen task, get some information (e.g. read registers) and detach.
It's also possible to migrate a frozen tasks to another cgroup.

This differs cgroup v2 freezer from cgroup v1 freezer, which mostly
tried to imitate the system-wide freezer. However uninterruptible
sleep is fine when all tasks are going to be frozen (hibernation case),
it's not the acceptable state for some subset of the system.

Cgroup v2 freezer is not supporting freezing kthreads.
If a non-root cgroup contains kthread, the cgroup still can be frozen,
but the kthread will remain running, the cgroup will be shown
as non-frozen, and the notification will not be delivered.

* PTRACE_ATTACH is not working because non-fatal signal delivery
is blocked in frozen state.

There are some interface differences between cgroup v1 and cgroup v2
freezer too, which are required to conform the cgroup v2 interface
design principles:
1) There is no separate controller, which has to be turned on:
the functionality is always available and is represented by
cgroup.freeze and cgroup.events cgroup control files.
2) The desired state is defined by the cgroup.freeze control file.
Any hierarchical configuration is allowed.
3) The interface is asynchronous. The actual state is available
using cgroup.events control file ("frozen" field). There are no
dedicated transitional states.
4) It's allowed to make any changes with the cgroup hierarchy
(create new cgroups, remove old cgroups, move tasks between cgroups)
no matter if some cgroups are frozen.

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
No-objection-from-me-by: Oleg Nesterov <oleg@redhat.com>
Cc: kernel-team@fb.com
Change-Id: I3404119678cbcd7410aa56e9334055cee79d02fa
(cherry picked from commit 76f969e894)
cgroup-defs.h: use the struct cgroup_freezer_state and the
freezer field from definitions in I6221a975c04f06249a4f8d693852776ae08a8d8e
sched.h: use the frozen field defined in
I6221a975c04f06249a4f8d693852776ae08a8d8e
Bug: 154548692
Signed-off-by: Marco Ballesio <balejs@google.com>
2020-08-26 15:35:17 -07:00
..
bpf This is the 4.19.137 stable release 2020-08-05 12:07:48 +02:00
cgroup BACKPORT: cgroup: cgroup v2 freezer 2020-08-26 15:35:17 -07:00
configs
debug This is the 4.19.132 stable release 2020-07-09 11:20:59 +02:00
dma ANDROID: GKI: common: dma-mapping: make dma_common_contiguous_remap more robust 2020-05-01 21:57:17 -07:00
events This is the 4.19.135 stable release 2020-07-29 13:22:30 +02:00
gcov This is the 4.19.119 stable release 2020-04-29 17:26:17 +02:00
irq This is the 4.19.141 stable release 2020-08-21 13:01:46 +02:00
livepatch livepatch: Nullify obj->mod in klp_module_coming()'s error path 2019-10-07 18:57:10 +02:00
locking locktorture: Print ratio of acquisitions, not failures 2020-04-23 10:30:23 +02:00
power This is the 4.19.121 stable release 2020-05-06 09:03:15 +02:00
printk This is the 4.19.134 stable release 2020-07-22 13:03:12 +02:00
rcu rcu: Avoid data-race in rcu_gp_fqs_check_wake() 2020-02-11 04:33:55 -08:00
sched This is the 4.19.140 stable release 2020-08-19 08:43:22 +02:00
time This is the 4.19.138 stable release 2020-08-07 10:09:24 +02:00
trace This is the 4.19.141 stable release 2020-08-21 13:01:46 +02:00
.gitignore BACKPORT: Provide in-kernel headers to make extending kernel easier 2019-06-12 12:33:20 +00:00
acct.c acct_on(): don't mess with freeze protection 2019-05-31 06:46:05 -07:00
async.c
audit_fsnotify.c
audit_tree.c audit: Embed key into chunk 2019-12-13 08:51:11 +01:00
audit_watch.c audit_get_nd(): don't unlock parent too early 2019-12-13 08:51:02 +01:00
audit.c audit: fix a net reference leak in audit_list_rules_send() 2020-06-22 09:05:13 +02:00
audit.h audit: fix a net reference leak in audit_list_rules_send() 2020-06-22 09:05:13 +02:00
auditfilter.c audit: fix a net reference leak in audit_list_rules_send() 2020-06-22 09:05:13 +02:00
auditsc.c audit: print empty EXECVE args 2019-12-01 09:17:17 +01:00
backtracetest.c
bounds.c kbuild: fix kernel/bounds.c 'W=1' warning 2018-11-13 11:08:47 -08:00
capability.c LSM: generalize flag passing to security_capable 2020-01-23 08:21:29 +01:00
cfi.c ANDROID: cfi: fix export symbol types 2020-04-29 19:16:15 +02:00
compat.c make 'user_access_begin()' do 'access_ok()' 2020-06-22 09:04:58 +02:00
configs.c
context_tracking.c
cpu_pm.c kernel/cpu_pm: Fix uninitted local in cpu_pm 2020-06-22 09:05:28 +02:00
cpu.c This is the 4.19.129 stable release 2020-06-22 10:50:54 +02:00
crash_core.c kernel/crash_core.c: print timestamp using time64_t 2018-08-22 10:52:47 -07:00
crash_dump.c
cred.c memcg: account security cred as well to kmemcg 2020-01-09 10:19:00 +01:00
delayacct.c UPSTREAM: delayacct: track delays from thrashing cache pages 2019-03-21 16:25:26 -07:00
dma.c
elfcore.c kernel/elfcore.c: include proper prototypes 2019-10-11 18:21:23 +02:00
exec_domain.c
exit.c This is the 4.19.129 stable release 2020-06-22 10:50:54 +02:00
extable.c
fail_function.c
fork.c BACKPORT: cgroup: cgroup v2 freezer 2020-08-26 15:35:17 -07:00
freezer.c PM / reboot: Eliminate race between reboot and suspend 2018-08-06 12:35:20 +02:00
futex.c futex: Unbreak futex hashing 2020-03-25 08:06:14 +01:00
gen_kheaders.sh UPSTREAM: kheaders: include only headers into kheaders_data.tar.xz 2020-04-16 18:00:25 +00:00
groups.c
hung_task.c kernel: hung_task.c: disable on suspend 2019-04-20 09:16:02 +02:00
iomem.c
irq_work.c irq_work: Do not raise an IPI when queueing work on the local CPU 2019-05-31 06:46:19 -07:00
jump_label.c jump_label: move 'asm goto' support test to Kconfig 2019-06-04 08:02:34 +02:00
kallsyms.c Merge 4.19.133 into android-4.19-stable 2020-07-17 07:54:52 +02:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt kconfig: include kernel/Kconfig.preempt from init/Kconfig 2018-08-02 08:06:54 +09:00
kcov.c UPSTREAM: kcov: remote coverage support 2020-01-15 14:51:23 +00:00
kexec_core.c kexec: Allocate decrypted control pages for kdump if SME is enabled 2019-11-24 08:20:29 +01:00
kexec_file.c
kexec_internal.h
kexec.c kexec: add call to LSM hook in original kexec_load syscall 2018-07-16 12:31:57 -07:00
kheaders.c BACKPORT: kheaders: Move from proc to sysfs 2019-06-12 12:33:54 +00:00
kmod.c kmod: make request_module() return an error when autoloading is disabled 2020-04-17 10:48:52 +02:00
kprobes.c kprobes: Fix NULL pointer dereference at kprobe_ftrace_handler 2020-08-21 11:05:33 +02:00
ksysfs.c
kthread.c This is the 4.19.142 stable release 2020-08-26 11:07:03 +02:00
latencytop.c
Makefile This is the 4.19.87 stable release 2019-12-01 09:53:43 +01:00
memremap.c mm/memory_hotplug: shrink zones when offlining memory 2020-01-29 16:43:27 +01:00
module_signing.c modsign: log module name in the event of an error 2018-07-02 11:36:17 +02:00
module-internal.h modsign: log module name in the event of an error 2018-07-02 11:36:17 +02:00
module.c This is the 4.19.141 stable release 2020-08-21 13:01:46 +02:00
notifier.c x86/mm: split vmalloc_sync_all() 2020-03-25 08:06:13 +01:00
nsproxy.c
padata.c padata: purge get_cpu and reorder_via_wq from padata_do_serial 2020-05-27 17:37:36 +02:00
panic.c UPSTREAM: GKI: panic/reboot: allow specifying reboot_mode for panic only 2020-04-17 05:00:40 +00:00
params.c ANDROID: GKI: export symbols from abi_gki_aarch64_qcom_whitelist 2020-04-13 21:36:41 +00:00
pid_namespace.c signal/pid_namespace: Fix reboot_pid_ns to use send_sig not force_sig 2019-07-26 09:14:01 +02:00
pid.c UPSTREAM: pid: add pidfd_open() 2019-08-12 13:36:37 -04:00
profile.c
ptrace.c ptrace: reintroduce usage of subjective credentials in ptrace_has_cap() 2020-01-23 08:21:29 +01:00
range.c
reboot.c UPSTREAM: GKI: panic/reboot: allow specifying reboot_mode for panic only 2020-04-17 05:00:40 +00:00
relay.c kernel/relay.c: fix memleak on destroy relay channel 2020-08-26 10:30:59 +02:00
resource.c resource: fix locking in find_next_iomem_res() 2019-09-16 08:22:20 +02:00
rseq.c rseq: uapi: Declare rseq_cs field as union, update includes 2018-07-10 22:18:52 +02:00
scs.c FROMLIST: scs: add support for stack usage debugging 2019-11-27 12:37:25 -08:00
seccomp.c LSM: generalize flag passing to security_capable 2020-01-23 08:21:29 +01:00
signal.c BACKPORT: cgroup: cgroup v2 freezer 2020-08-26 15:35:17 -07:00
smp.c cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM 2019-02-12 19:47:25 +01:00
smpboot.c smpboot: Remove cpumask from the API 2018-07-03 09:20:44 +02:00
smpboot.h
softirq.c nohz: Fix missing tick reprogram when interrupting an inline softirq 2018-08-03 15:52:10 +02:00
stacktrace.c
stop_machine.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-08-13 11:25:07 -07:00
sys_ni.c UPSTREAM: signal: support CLONE_PIDFD with pidfd_send_signal 2019-08-12 13:36:37 -04:00
sys.c UPSTREAM: arm64: Tighten the PR_{SET, GET}_TAGGED_ADDR_CTRL prctl() unused arguments 2019-10-07 15:27:39 -04:00
sysctl_binary.c
sysctl.c Revert "BACKPORT: mm: reclaim small amounts of memory when an external fragmentation event occurs" 2020-04-14 19:35:27 +08:00
task_work.c
taskstats.c taskstats: fix data-race 2020-01-09 10:18:59 +01:00
test_kprobes.c
torture.c
tracepoint.c tracepoint: Fix tracepoint array element size mismatch 2018-10-17 15:35:29 -04:00
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c coredump: fix crash when umh is disabled 2020-05-14 07:57:21 +02:00
up.c
user_namespace.c userns: also map extents in the reverse map to kernel IDs 2018-11-13 11:09:00 -08:00
user-return-notifier.c
user.c ANDROID: proc: Add /proc/uid directory 2019-03-06 15:59:21 +00:00
utsname_sysctl.c sys: don't hold uts_sem while accessing userspace memory 2018-08-11 02:05:53 -05:00
utsname.c
watchdog_hld.c watchdog: Mark watchdog touch functions as notrace 2018-08-30 12:56:40 +02:00
watchdog.c watchdog/softlockup: Enforce that timestamp is valid on boot 2020-02-24 08:34:49 +01:00
workqueue_internal.h UPSTREAM: psi: fix aggregation idle shut-off 2019-03-21 16:25:27 -07:00
workqueue.c ANDROID: Fix wq fp check for CFI builds 2020-04-04 16:30:40 +00:00