Will Linux start killing my processes without asking me if memory gets short?
I was running a shell script with commands to run several memory-intensive programs (2-5 GB) back-to-back. When I went back to check on the progress of my script, I was surprised to discover that some of my processes were Killed, as my terminal reported to me. Several programs had already successfully completed before the programs that were later Killed started, but all the programs afterwards failed with a segmentation fault (which may or may not have been due to a bug in my code; keep reading).
I looked at the usage history of the particular cluster I was using and saw that someone started running several memory-intensive processes at the same time and in doing so exhausted the real memory (and possibly even the swap space) available to the cluster. As best as I can figure, these memory-intensive processes started running about the same time I started having problems with my programs.
Is it possible that Linux killed my programs once it started running out of memory? And is it possible that the segmentation faults I got later on were due to the lack of memory available to run my programs (instead of a bug in my code)?
linux memory kill segmentation-fault
asked Jun 9 '14 at 22:42 by NeutronStar
edited Jun 9 '14 at 23:51 by Gilles
When you allocate memory, do you have a statement to check whether the memory was successfully allocated? That should provide a clue whether there is a bug in your code or whether it was due to a lack of memory in the system. – unxnut, Jun 9 '14 at 22:47

Check out this very thorough explanation under the title Taming the OOM killer over at LWN. – 0xC0000022L, Jun 10 '14 at 0:07
2 Answers
It can.
There are two different out-of-memory conditions you can encounter in Linux. Which one you encounter depends on the value of the sysctl vm.overcommit_memory (/proc/sys/vm/overcommit_memory).
Introduction:
The kernel can perform what is called 'memory overcommit'. This is when the kernel allocates programs more memory than is really present in the system. This is done in the hopes that the programs won't actually use all the memory they allocated, as this is a quite common occurrence.
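A minimal way to check which mode is in effect on a given system (either command should work on a typical Linux install):

# print the current overcommit policy (0, 1, or 2)
sysctl vm.overcommit_memory

# or read it straight from /proc
cat /proc/sys/vm/overcommit_memory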
overcommit_memory = 2
When overcommit_memory is set to 2, the kernel does not perform any overcommit at all. Instead, when a program is allocated memory, it is guaranteed to actually have that memory. If the system does not have enough free memory to satisfy an allocation request, the kernel simply returns a failure for the request. It is up to the program to handle the situation gracefully; if it does not check that the allocation succeeded when it actually failed, the application will often encounter a segfault.
In the case of the segfault, you should find a line such as this in the output of dmesg:
[1962.987529] myapp[3303]: segfault at 0 ip 00400559 sp 5bc7b1b0 error 6 in myapp[400000+1000]
The at 0 means that the application tried to access an uninitialized pointer, which can be the result of a failed memory-allocation call (but that is not the only possible cause).
overcommit_memory = 0 and 1
When overcommit_memory is set to 0 or 1, overcommit is enabled, and programs are allowed to allocate more memory than is really available.
However, when a program wants to use the memory it was allocated, but the kernel finds that it doesn't actually have enough memory to satisfy it, it needs to get some memory back.
It first tries to perform various memory cleanup tasks, such as flushing caches, but if this is not enough it will then terminate a process. This termination is performed by the OOM-Killer. The OOM-Killer looks at the system to see what programs are using what memory, how long they've been running, who's running them, and a number of other factors to determine which one gets killed.
After the process has been killed, the memory it was using is freed up, and the program which just caused the out-of-memory condition now has the memory it needs.
However, even in this mode, programs can still be denied allocation requests.
When overcommit_memory is 0, the kernel tries to take a best guess at when it should start denying allocation requests. When it is set to 1, I'm not sure exactly how it decides when to deny a request, but it can deny very large requests.
You can see if the OOM-Killer is involved by looking at the output of dmesg and finding messages such as:
[11686.043641] Out of memory: Kill process 2603 (flasherav) score 761 or sacrifice child
[11686.043647] Killed process 2603 (flasherav) total-vm:1498536kB, anon-rss:721784kB, file-rss:4228kB
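To hunt for either kind of message after the fact, something along these lines should work (a rough sketch; the exact wording of kernel messages varies between kernel versions):

# look for OOM-Killer activity in the kernel log
dmesg | grep -iE 'out of memory|killed process'

# look for segfault reports like the one shown earlier
dmesg | grep -i segfault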
answered Jun 9 '14 at 22:56 by Patrick (last edited by G-Man)
So, it seems that both situations happened to me. – NeutronStar, Jun 9 '14 at 22:57

@Joshua I just updated the answer. I forgot to mention you can still get allocation failures when overcommit_memory is set to 0 or 2. – Patrick, Jun 9 '14 at 23:14

I think editing a link to Taming the OOM killer into the post might be worthwhile. – 0xC0000022L, Jun 10 '14 at 0:05

@0xC0000022L Thanks, that's a good article (though a little out of date). I didn't want to put anything about controlling the OOM killer since that's not part of the question (and it isn't a short subject), and we have a ton of other questions here about just that. – Patrick, Jun 10 '14 at 0:59

@mikeserv I don't say that the behavior of the OOM killer has nothing to do with controlling it. The question was whether Linux would kill his programs. How to prevent Linux from doing so first requires establishing that it is indeed Linux doing it. And if overcommit_memory=2, the OOM killer isn't even enabled, so controlling it is irrelevant. However, once we establish that it is the OOM killer, that becomes another subject, which is covered by many other questions & answers here. – Patrick, Jun 10 '14 at 21:17
The truth is that regardless of which way you look at it - whether your process choked up due to the system's memory manager or due to something else - it is still a bug. What happened to all of that data you were just processing in memory? It should have been saved.
While overcommit_memory= is the most general way of configuring Linux OOM management, it is also adjustable per process, like:
echo [-+][n] >/proc/$pid/oom_adj
Using -17 in the above will exclude a process from out-of-memory management. Probably not a great idea generally, but if you're bug-hunting doing so could be worthwhile - especially if you wish to know whether it was OOM or your code. Positively incrementing the number will make the process more likely to be killed in an OOM event, which could enable you to better shore up your code's resilience in low-memory situations and to ensure you exit gracefully when necessary.
You can check the OOM handler's current settings per process like:
cat /proc/$pid/oom_score
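As a concrete sketch, with a hypothetical PID of 1234 (run as root):

# exempt PID 1234 from the OOM killer, per the -17 trick above
echo -17 > /proc/1234/oom_adj

# then confirm its current badness score
cat /proc/1234/oom_score

On newer kernels the preferred knob is /proc/$pid/oom_score_adj, which ranges from -1000 (never kill) to 1000 (kill first); oom_adj is kept for backwards compatibility.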
Else you could go suicidal:
sysctl vm.panic_on_oom=1
sysctl kernel.panic=X
That will set the computer to reboot in the event of an out-of-memory condition. You set the X above to the number of seconds you wish the computer to wait after a kernel panic before rebooting. Go wild.
And if, for some reason, you decide you like it, make it persistent:
echo "vm.panic_on_oom=1" >> /etc/sysctl.conf
echo "kernel.panic=X" >> /etc/sysctl.conf
answered Jun 9 '14 at 23:02 by mikeserv (edited Jun 10 '14 at 22:59)
It's a shared cluster I'm using; I'm sure the other users wouldn't appreciate it restarting without their consent. – NeutronStar, Jun 9 '14 at 23:06

@Joshua - I doubt very seriously that anyone would like it - it even defies Asimov's laws of robotics. On the other hand, as I mention, you can configure the OOM per process the other way as well. Which is to say you can personally triage based on your own defined rulesets per process. That kind of thing sounds like it might be especially useful in a shared cluster scenario. – mikeserv, Jun 9 '14 at 23:10