Go的优雅终止姿势

引经据典

最近优化了一版程序:用到了golang的优雅退出机制。

程序使用etcd的election sdk做高可用选主,需要在节点意外下线的时候,主动去etcd卸任(删除10s租约), 否则已经下线的节点还会被etcd认为是leader。

所以在这里,优雅退出是技术刚需。

另外根据[云原生十二要素方法论] 第9条: 快速启动和优雅终止可最大化健壮性 , 也推荐各位遵守实践。

Fast startup and shutdown are advocated for a more robust and resilient system.


粗浅的认知方案: 捕获程序的终止信号, 主动去卸任。

标准信号

Linux支持如下标准信号,第二列指示该信号遵守的标准。

   Signal      Standard   Action   Comment
   ────────────────────────────────────────────────────────────────────────
   SIGABRT      P1990      Core    Abort signal from abort(3)
   SIGALRM      P1990      Term    Timer signal from alarm(2)
   SIGBUS       P2001      Core    Bus error (bad memory access)
   SIGCHLD      P1990      Ign     Child stopped or terminated
   SIGCLD         -        Ign     A synonym for SIGCHLD
   SIGCONT      P1990      Cont    Continue if stopped
   SIGEMT         -        Term    Emulator trap
   SIGFPE       P1990      Core    Floating-point exception
   SIGHUP       P1990      Term    Hangup detected on controlling terminal
                                   or death of controlling process
   SIGILL       P1990      Core    Illegal Instruction
   SIGINFO        -                A synonym for SIGPWR
   SIGINT       P1990      Term    Interrupt from keyboard


   SIGIO          -        Term    I/O now possible (4.2BSD)
   SIGIOT         -        Core    IOT trap. A synonym for SIGABRT
   SIGKILL      P1990      Term    Kill signal
   SIGLOST        -        Term    File lock lost (unused)
   SIGPIPE      P1990      Term    Broken pipe: write to pipe with no
                                   readers; see pipe(7)
   SIGPOLL      P2001      Term    Pollable event (Sys V);
                                   synonym for SIGIO
   SIGPROF      P2001      Term    Profiling timer expired
   SIGPWR         -        Term    Power failure (System V)
   SIGQUIT      P1990      Core    Quit from keyboard
   SIGSEGV      P1990      Core    Invalid memory reference
   SIGSTKFLT      -        Term    Stack fault on coprocessor (unused)
   SIGSTOP      P1990      Stop    Stop process
   SIGTSTP      P1990      Stop    Stop typed at terminal
   SIGSYS       P2001      Core    Bad system call (SVr4);
                                   see also seccomp(2)
   SIGTERM      P1990      Term    Termination signal
   SIGTRAP      P2001      Core    Trace/breakpoint trap
   SIGTTIN      P1990      Stop    Terminal input for background process
   SIGTTOU      P1990      Stop    Terminal output for background process
   SIGUNUSED      -        Core    Synonymous with SIGSYS
   SIGURG       P2001      Ign     Urgent condition on socket (4.2BSD)
   SIGUSR1      P1990      Term    User-defined signal 1
   SIGUSR2      P1990      Term    User-defined signal 2
   SIGVTALRM    P2001      Term    Virtual alarm clock (4.2BSD)
   SIGXCPU      P2001      Core    CPU time limit exceeded (4.2BSD);
                                   see setrlimit(2)
   SIGXFSZ      P2001      Core    File size limit exceeded (4.2BSD);
                                   see setrlimit(2)
   SIGWINCH       -        Ign     Window resize signal (4.3BSD, Sun)

其中SIGKILL,SIGSTOP信号不能被捕获,阻塞,忽略。

我们常见的三种终止程序的操作:

  1. CTRL+C 实际是发送SIGINT信号,

  2. kill pid的作用是向指定进程发送SIGTERM信号(这是kill默认发送的信息), 若应用程序没有捕获并响应该信号的逻辑,则该信号默认动作是kill掉进程,这是终止进程的推荐做法。

  3. kill -9 pid 则是向指定进程发送SIGKILL信号,SIGKILL信号既不能被应用程序捕获,也不能被阻塞或忽略,

故要达成我们的目的,这里捕获 SIGINTSIGTREM信号就可满足需求。


golang提供signal包来监听并反馈收到的信号。

可针对长时间运行的程序,新开协程,持续监听信号,并插入优雅关闭的代码。

c := make(chan os.Signal)
signal.Notify(c, syscall.SIGTERM, syscall.SIGINT)
go func() {
                select  {
                        case sig:= <-c: {
                                log.Infof("Got %s signal. Aborting...\n", sig)
                                eCli.Close()    // 利用 etcd election sdk主动卸任
                                os.Exit(1)    
                        }
                }
        }()

是不是依旧适配容器?

我们得看DOCKER官方docker stop,docker kill命令的定义。

docker stop: The main process inside the container will receiver SIGTREM, and after a grace period,SIGKILL .(default grace period =10s)

docker kill:The main process inside the container is sent SIGKILL signal (default), or the signal that is specified with the --signal option

我们常用的docker stop命令: 向容器内进程发送SIGTREM信号,10s后发送SIGKILL信号,这10s时间给了程序做优雅关闭的时机,所以上面代码的逻辑是能适配容器的。