With cgroups v1, it is possible to listen for events about memory pressure. According to the docs, one needs to
- Create a new `eventfd`
- Open `memory.pressure_level` for reading
- Open `cgroup.event_control` for writing
- Write `{eventfd} {pressure_level_fd} {level}` (where `level` is `low`, `medium`, or `critical`) to `event_control`
- Wait until reading from the eventfd returns 8 bytes
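
For reference, the steps above can be sketched in Rust roughly as follows. This is a minimal sketch and not the code from my example: it assumes Linux with a cgroup v1 memory controller, declares `eventfd` via FFI because the standard library has no wrapper for it, and the names `control_line`, `register`, and `wait_for_event` are made up for illustration. The `memory.pressure_level` descriptor is kept open while listening, just to stay on the safe side.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Write};
use std::os::raw::{c_int, c_uint};
use std::os::unix::io::{AsRawFd, FromRawFd, RawFd};
use std::path::Path;

// std has no eventfd wrapper, so the libc symbol is declared directly.
// (On the 2024 edition this block must be written `unsafe extern "C"`.)
extern "C" {
    fn eventfd(initval: c_uint, flags: c_int) -> c_int;
}

/// Builds the "{eventfd} {pressure_level_fd} {level}" registration line.
fn control_line(event_fd: RawFd, pressure_fd: RawFd, level: &str) -> String {
    format!("{} {} {}", event_fd, pressure_fd, level)
}

/// Registers for `level` notifications on the given cgroup v1 memory directory.
/// Returns the eventfd and the still-open memory.pressure_level file.
fn register(cgroup: &Path, level: &str) -> io::Result<(File, File)> {
    let raw = unsafe { eventfd(0, 0) };
    if raw < 0 {
        return Err(io::Error::last_os_error());
    }
    // SAFETY: `raw` is a freshly created descriptor that nothing else owns.
    let event = unsafe { File::from_raw_fd(raw) };
    let pressure = File::open(cgroup.join("memory.pressure_level"))?;
    let mut control = OpenOptions::new()
        .write(true)
        .open(cgroup.join("cgroup.event_control"))?;
    let line = control_line(event.as_raw_fd(), pressure.as_raw_fd(), level);
    control.write_all(line.as_bytes())?;
    Ok((event, pressure))
}

/// Blocks until the eventfd yields its 8-byte counter and returns it.
fn wait_for_event(event: &mut File) -> io::Result<u64> {
    let mut buf = [0u8; 8];
    event.read_exact(&mut buf)?;
    Ok(u64::from_ne_bytes(buf))
}

fn main() -> io::Result<()> {
    let args: Vec<String> = std::env::args().collect();
    if args.len() != 3 {
        eprintln!("usage: <binary> <cgroup-v1-memory-dir> <low|medium|critical>");
        return Ok(());
    }
    let (mut event, _pressure) = register(Path::new(&args[1]), &args[2])?;
    loop {
        let count = wait_for_event(&mut event)?;
        println!("{}: {} notification(s)", args[2], count);
    }
}
```

The cgroup directory and level are passed as arguments, so nothing about the hierarchy layout is hard-coded; like my example, it needs root to write to `cgroup.event_control`.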
When doing so with a program that’s about to run out of memory, you’ll receive a long train of `low` events, then a few `medium` and `critical` ones, before the OOM killer finally runs.
If you want to convince yourself of this, I’ve prepared a little Rust example; you can run it with `cargo build --release --examples && sudo target/release/examples/cv1`.
For cgroups v2 (docs), similar events can be received by
- Setting up an inotify watch on `memory.events.local`
- Reading and parsing the file fully, comparing numbers after each event received
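
A rough Rust sketch of that loop, for illustration only (again not the code from my example): it assumes Linux, declares the two inotify calls via FFI since the standard library has no wrapper, and `parse_events`, `increases`, and `watch` are hypothetical helper names. The file is treated as flat `key value` lines, as the cgroup v2 docs describe.

```rust
use std::collections::BTreeMap;
use std::ffi::CString;
use std::fs::{self, File};
use std::io::{self, Read};
use std::os::raw::{c_char, c_int};
use std::os::unix::ffi::OsStrExt;
use std::os::unix::io::FromRawFd;
use std::path::Path;

// Declared by hand because std has no inotify wrapper.
// (On the 2024 edition this block must be written `unsafe extern "C"`.)
extern "C" {
    fn inotify_init1(flags: c_int) -> c_int;
    fn inotify_add_watch(fd: c_int, pathname: *const c_char, mask: u32) -> c_int;
}
const IN_MODIFY: u32 = 0x0000_0002;

type Counters = BTreeMap<String, u64>;

/// Parses the flat "key value" lines of memory.events.local.
fn parse_events(text: &str) -> Counters {
    text.lines()
        .filter_map(|line| {
            let mut parts = line.split_whitespace();
            let key = parts.next()?;
            let value = parts.next()?.parse().ok()?;
            Some((key.to_string(), value))
        })
        .collect()
}

/// Returns every counter that grew between two snapshots, with its increase.
fn increases(old: &Counters, new: &Counters) -> Vec<(String, u64)> {
    new.iter()
        .filter_map(|(key, &now)| {
            let before = old.get(key).copied().unwrap_or(0);
            if now > before { Some((key.clone(), now - before)) } else { None }
        })
        .collect()
}

/// Creates an inotify descriptor watching `path` for modifications.
fn watch(path: &Path) -> io::Result<File> {
    let c_path = CString::new(path.as_os_str().as_bytes())?;
    let fd = unsafe { inotify_init1(0) };
    if fd < 0 {
        return Err(io::Error::last_os_error());
    }
    // SAFETY: `fd` is a freshly created descriptor that nothing else owns.
    let inotify = unsafe { File::from_raw_fd(fd) };
    if unsafe { inotify_add_watch(fd, c_path.as_ptr(), IN_MODIFY) } < 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(inotify)
}

fn main() -> io::Result<()> {
    let args: Vec<String> = std::env::args().collect();
    if args.len() != 2 {
        eprintln!("usage: <binary> <cgroup-v2-dir>");
        return Ok(());
    }
    let file = Path::new(&args[1]).join("memory.events.local");
    let mut inotify = watch(&file)?;
    let mut last = parse_events(&fs::read_to_string(&file)?);
    let mut buf = [0u8; 4096];
    loop {
        // Blocks until at least one event is queued; the payload is ignored.
        let _len = inotify.read(&mut buf)?;
        let now = parse_events(&fs::read_to_string(&file)?);
        for (key, delta) in increases(&last, &now) {
            println!("{} +{}", key, delta);
        }
        last = now;
    }
}
```

Re-reading the whole file after each wakeup keeps the comparison simple: the inotify event only says that something changed, not which counter.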
This works (even without root, unlike v1) when a memory limit is set in the cgroup, and you’ll usually receive at least a few inotify events with increases in either `high` or `max`. (Again, if you want to convince yourself of this, run `systemd-run --same-dir --pty --user -p MemoryMax=1G cargo run --example cv2` on the above gist.)
However, when there’s no memory limit set, or the limit is higher than the available memory, the process is killed without any events being received. Looking at `memory.pressure` shows a strong increase, so the kernel definitely knows that something is up before it invokes the OOM killer. Is there a way to get the kernel to tell us, ideally with behavior like cgroups v1 that gives plenty of warning ahead of time?
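
To be clear about what I mean by watching `memory.pressure`: the increase can be seen by sampling the file once a second, as in the sketch below. It assumes the PSI line format from the kernel docs (`some`/`full` lines with `avg10`, `avg60`, `avg300`, and `total` in microseconds); `PsiLine` and `parse_psi_line` are names made up for illustration. This only polls, which is not the event-style notification I’m asking about.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::thread;
use std::time::Duration;

/// One line of a PSI file: "some" or "full" plus its averages and total.
#[derive(Debug, PartialEq)]
struct PsiLine {
    kind: String,
    avg10: f64,
    avg60: f64,
    avg300: f64,
    total_us: u64,
}

/// Parses e.g. "some avg10=0.00 avg60=0.00 avg300=0.00 total=0".
/// Returns None if any of the four expected fields is missing or malformed.
fn parse_psi_line(line: &str) -> Option<PsiLine> {
    let mut fields = line.split_whitespace();
    let kind = fields.next()?.to_string();
    let mut avg10: Option<f64> = None;
    let mut avg60: Option<f64> = None;
    let mut avg300: Option<f64> = None;
    let mut total_us: Option<u64> = None;
    for field in fields {
        let (key, value) = field.split_once('=')?;
        match key {
            "avg10" => avg10 = value.parse().ok(),
            "avg60" => avg60 = value.parse().ok(),
            "avg300" => avg300 = value.parse().ok(),
            "total" => total_us = value.parse().ok(),
            _ => {} // ignore fields this sketch does not know about
        }
    }
    Some(PsiLine {
        kind,
        avg10: avg10?,
        avg60: avg60?,
        avg300: avg300?,
        total_us: total_us?,
    })
}

fn main() -> io::Result<()> {
    let args: Vec<String> = std::env::args().collect();
    if args.len() != 2 {
        eprintln!("usage: <binary> <cgroup-v2-dir>");
        return Ok(());
    }
    let file = Path::new(&args[1]).join("memory.pressure");
    loop {
        let text = fs::read_to_string(&file)?;
        for line in text.lines().filter_map(parse_psi_line) {
            println!("{:?}", line);
        }
        thread::sleep(Duration::from_secs(1));
    }
}
```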
Note: I’m aware of some related questions (1, 2), but:
- They’re old and questions/answers only consider cgroups v1
- I’d like to be notified before the OOM killer becomes active, so the hack of “spawning a high oom_score_adj canary process” is out.