Oh, I was expecting this to say something about the other end of the equation: how do you get a Go program to detach from the console and run in the background, à la C’s daemon(). Is there a built-in equivalent for Go?
There isn’t a way yet, at least not that I know of.
If you need to, you can use daemon.
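Since there is no built-in daemon() in Go, one common workaround is for the program to re-exec itself with detached standard streams in a new session. This is only a sketch for Unix-like systems; `startDetached` and the `child` argument are names invented for this example, not a standard API:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// startDetached re-executes a command with its standard streams
// disconnected and in its own session (Setsid), so it has no
// controlling terminal. It returns the child's pid.
func startDetached(path string, args ...string) (int, error) {
	cmd := exec.Command(path, args...)
	// nil std streams mean the child reads from and writes to /dev/null
	cmd.Stdin = nil
	cmd.Stdout = nil
	cmd.Stderr = nil
	// start the child in a new session, detaching it from the terminal
	cmd.SysProcAttr = &syscall.SysProcAttr{Setsid: true}
	if err := cmd.Start(); err != nil {
		return 0, err
	}
	return cmd.Process.Pid, nil
}

func main() {
	if len(os.Args) > 1 && os.Args[1] == "child" {
		// ... the actual long-running work would go here ...
		return
	}
	pid, err := startDetached(os.Args[0], "child")
	if err != nil {
		fmt.Fprintln(os.Stderr, "daemonize:", err)
		os.Exit(1)
	}
	fmt.Println("detached child pid:", pid)
}
```

The Setsid flag is what actually detaches the child from the controlling terminal; simply backgrounding with & from a shell does not do that.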
Something that I’ve never been sure about is why we want our code to be able to be shut down properly. It definitely feels cleaner, and I do it everywhere, but I’m not sure that it isn’t voodoo. Really, the code should still behave properly if someone pulls the plug on the server, and if your server gets forcibly shut down, you’re definitely not going to be permitted to call all of your cleanup code. Is the concern that if your program claims some resource and doesn’t release it explicitly, then you kill -9 it, your operating system fails to clean up that resource for some reason, and after doing this a huge number of times you have a resource leak?
[Comment removed by author]
Nothing will be lost, except the things that are. :) SysV shared memory segments, for example, are not tied to one process and can survive even when no processes are left, hanging around until somebody notices and runs ipcrm by hand.
I would also consider crypto-related stuff (such as clearing out keys, and so forth) as being important.
A server should always leave its data in a consistent state, but that may not be the latest state. Imagine a database where you are in the middle of a large transaction. You run commit, the disks are churning, you pull the plug. At this point, a good database will survive and restart, but it will be missing whatever updates that last transaction was trying to commit. Sending a signal for a graceful shutdown means the database will only exit after properly recording everything.
Also, you don’t always control the systems your code interacts with. Shutting down gracefully gives you an opportunity to make sure that is handled properly.
However, a server should work in both the clean-shutdown case and the pulled-plug case. It’s definitely nicer to shut down cleanly, but your server should know what it can and cannot assume after it has run commit. If your commit request returns before the commit has actually been applied, you should build your server with that knowledge in mind. I can see how graceful shutdown makes your life easier if your operations aren’t idempotent and it becomes ambiguous which requests have finished, but in that case there are deeper issues with your design. As has been stated elsewhere in the thread, the OS will clean up nearly all resources after the process is terminated, so most resource management is done to ensure that everything is OK while the process is still running.
I’ve heard the argument that the right thing to do is to kill -9 your processes, because that’s closer to what will actually happen if the plug gets pulled on the server, and it tests that condition. That seems unpleasant, because worst-case failure scenarios usually mean performance degradation, but it also means you’ll be prepared for the worst.
I don’t mind being told I’m wrong, but it would be nice if someone explained to me how I was incorrect.
To apply the kill -9 logic to the extreme, the datacenter might be destroyed by earthquake/flood/meteor and I’d have to restore from last week’s offsite backup. Therefore, every time I reboot I should restore from last week’s backup because that way I know it’ll work in case of a disaster.
I think you’re missing the idea that just because you can’t always do better, doesn’t mean you should never do better.
Specifically, just about everything I work on can deal with the occasion of the database being missing for 30 seconds when it starts, and will retry connecting, but if the database dies mid-transaction, the client app will also die. Rewinding all the business logic to start again is too hard. So when I upgrade my database, I shut it down nicely and start the new one and then I don’t have that problem.
“just because you can’t always do better, doesn’t mean you should never do better”
You’re right, I was missing this point. Thanks for stating it so clearly!