Plain Old Blogumentation: Minna no Go Gengo: A Summary / Review in English (chapter 3)

Here's a continuation of my summary of the Japanese Go programming book: Minna no Go Gengo. This is chapter 3.

For anyone who has missed my other summaries, here are chapter 1 and chapter 2.

How to Make Practical Applications

Author: Fujiwara Shunichiro (aka @fujiwara)

3.1 Opening

First of all what does the author mean by "practical applications"?
A practical application...

makes it easy to look up what kind of operations it performs
has good performance
can support different inputs and outputs
is easy for humans to use
is easy to maintain

The two GitHub repos below are referenced frequently throughout the chapter as real-world practical applications written in Go. Both were created by the author.

Stretcher: a pull based deployment tool
fluent-agent-hydra: an agent that sends logs to Fluentd

3.2 Version Control

Many Go programs can be shipped as a single binary, so in comparison to interpreted languages, the deployment process is generally much simpler. However, since we're dealing with binaries, it's a good idea to make it easy to obtain the version number from the binary itself so users can check whether they have the latest version. Using the flag package to detect whether the program was invoked with -v or --version is common. Instead of hardcoding the version into the source code, though, the author recommends storing the version number in git tags and passing it to the code at build time with the -ldflags build argument, whose -X option sets the value of a package-level string variable (here, version in package main). A build script (or a Makefile target) could, for example, do something like this:


#!/bin/sh

GIT_VER=`git describe --tags`
go build -ldflags "-X main.version=${GIT_VER}"

The Makefile for fluent-agent-hydra seems to make use of this very technique.
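On the Go side, the receiving end of that -X flag might look like the minimal sketch below. The program name "myapp", the "dev" fallback value and the versionLine helper are my own placeholders, not something taken from the book or from fluent-agent-hydra.

```go
package main

import (
	"flag"
	"fmt"
)

// version is meant to be overwritten at build time with
//   go build -ldflags "-X main.version=<tag>"
// "dev" is a hypothetical fallback for builds made without that flag.
var version = "dev"

// versionLine formats the text printed for -v / --version.
func versionLine(name string) string {
	return fmt.Sprintf("%s version %s", name, version)
}

func main() {
	// Register both spellings against the same bool. The flag package
	// accepts one or two leading dashes for either name.
	showVersion := flag.Bool("v", false, "print version and exit")
	flag.BoolVar(showVersion, "version", false, "print version and exit")
	flag.Parse()

	if *showVersion {
		fmt.Println(versionLine("myapp")) // "myapp" is a placeholder name
		return
	}
	// ... the rest of the program would go here ...
}
```

The variable has to be a string declared at package level for -X to be able to set it.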

3.3 Efficient Use of I/O

This section demonstrates why and how bufio should be used when dealing with I/O operations.

The first point the author makes is how useful bufio.Reader.Peek can be when you want to validate data coming in from STDIN, but don't want to consume it from the buffer just yet. An example of this kind of scenario is the application stretcher, which expects to receive a valid JSON string via STDIN. Although it's possible to read the entirety of the input into memory and then check whether it is valid JSON or not, it's more efficient to pass the input as an io.Reader directly to encoding/json.Decoder. This is where bufio.Reader.Peek comes in. A call to Peek() can be used to check whether the first character looks like the beginning of a JSON array ("["), and if not we can simply return an error without bothering to read the rest of STDIN.

Another important point the author brings up is the difference between buffering in Go as opposed to in interpreted languages such as Ruby, Perl and Python. Interpreted languages generally buffer text output automatically at run time when the program's output goes to a pipe rather than a terminal, thereby reducing the number of costly system calls. Go, on the other hand, doesn't automatically buffer anything: every write to os.Stdout results in a system call.

For example, try inspecting the following program using strace -e trace=write ./filename | cat:


package main

import (
  "fmt"
  "os"
  "strings"
)

func main() {
  for i := 0; i < 100; i++ {
    fmt.Fprintln(os.Stdout, strings.Repeat("x", 100))
  }
}

Checking the output of strace on the above program reveals a total of 100 write system calls, one per line of output, indicating that no buffering has taken place. The bufio package can help us improve this example.


package main

import (
  "bufio"
  "fmt"
  "os"
  "strings"
)

func main() {
  b := bufio.NewWriter(os.Stdout)
  for i := 0; i < 100; i++ {
    fmt.Fprintln(b, strings.Repeat("x", 100))
  }
  b.Flush()
}

By wrapping os.Stdout with a *bufio.Writer we can delay the system calls until the buffer fills up or Flush() is called. The default buffer size is 4096 bytes, but it can be increased as necessary with bufio.NewWriterSize. Inspecting our new and improved program in strace will show that the number of write system calls drops from 100 to a handful (10,100 bytes through a 4096-byte buffer works out to three writes), whether we pass our program to a pipe or not.

3.4 Handling random numbers

This section mostly just explains the difference between math/rand (a pseudo-random number generator) and crypto/rand (a cryptographically secure random number generator) and shows how they can be used. I think this topic is already pretty well covered in English.
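For quick reference, a small sketch of the two packages side by side. The function names newToken and rollDice, the seed and the sizes are my own choices for illustration and are not from the book.

```go
package main

import (
	crand "crypto/rand"
	"encoding/hex"
	"fmt"
	mrand "math/rand"
)

// newToken returns n bytes from crypto/rand, hex encoded. This is the
// package to reach for when the value must be unpredictable
// (session IDs, tokens, keys).
func newToken(n int) (string, error) {
	b := make([]byte, n)
	if _, err := crand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// rollDice uses a math/rand generator with an explicit seed: the same
// seed always yields the same sequence, which suits simulations and
// tests but not anything security related.
func rollDice(seed int64, n int) []int {
	r := mrand.New(mrand.NewSource(seed))
	rolls := make([]int, n)
	for i := range rolls {
		rolls[i] = r.Intn(6) + 1 // Intn(6) is in [0, 6), so shift to 1..6
	}
	return rolls
}

func main() {
	token, err := newToken(16)
	if err != nil {
		panic(err)
	}
	fmt.Println("token:", token)
	fmt.Println("rolls:", rollDice(42, 5))
}
```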

3.5 Human readable numbers

The package recommended for converting file sizes and time stamps to a human readable format is go-humanize. Again, the documentation for this package is in English, so I probably don't need to summarize it. In short: if you need to convert numbers to a more readable format, use this package rather than wasting time trying to do it yourself.

3.6 Executing external commands through Go

Generally speaking, executing other programs from Go incurs the overhead of starting a new process and passing data to it, so performance-wise it's often preferable to implement things in pure Go. There are of course cases where it is better to delegate the work to an existing program. This section mostly shows how to use the os/exec package to call external programs. One thing I didn't know is that if you call sh through os/exec you can use redirects and other shell operators (>, ||, &&, etc) as normal.


exec.Command("sh", "-c", "some_command || handle_error").Output()
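A slightly fuller sketch of the same idea, assuming a Unix-like system with sh on the PATH. The runShell helper and the command lines passed to it are made up for illustration.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// runShell runs a command line through "sh -c" so that shell operators
// such as ||, && and > are interpreted, and returns the trimmed stdout.
func runShell(cmdline string) (string, error) {
	out, err := exec.Command("sh", "-c", cmdline).Output()
	return strings.TrimSpace(string(out)), err
}

func main() {
	// "false" fails, so the shell falls through to the echo.
	out, err := runShell("false || echo recovered")
	fmt.Println(out, err)

	// A non-zero exit status comes back as a non-nil error.
	_, err = runShell("exit 3")
	fmt.Println(err)
}
```

Since the whole string is handed to the shell, it is worth being careful never to build it from untrusted input.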

3.7 Timing out

While a lot of existing packages like net/http handle timeouts for you, sometimes you need to implement one yourself. This section demonstrates how to do that using the time package and channels.


// A 10 second timer
timer := time.NewTimer(10 * time.Second)
defer timer.Stop() // release the timer if the work finishes first
// a buffered channel to receive the result, so the goroutine can
// still send (and exit) after a timeout
done := make(chan error, 1)

go func() {
  // call the function you want to run asynchronously in a goroutine
  done <- doSomething() // a function that returns an error
}()

// use select to wait for a response from multiple channels
select {
case <-timer.C:
  return fmt.Errorf("timeout reached")
case err := <-done:
  if err != nil {
    return err
  }
}
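The fragment above is meant to live inside a function that returns an error. As a runnable variant, here is a sketch that wraps the same pattern in a helper. The names withTimeout, errTimeout, fast, slow and limit, as well as the durations, are placeholders of mine rather than anything from the book.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// limit is a deliberately short timeout so the demo finishes quickly.
const limit = 50 * time.Millisecond

var errTimeout = errors.New("timeout reached")

// withTimeout runs f in a goroutine and returns its error, or errTimeout
// if f has not finished within d.
func withTimeout(d time.Duration, f func() error) error {
	timer := time.NewTimer(d)
	defer timer.Stop()

	// Buffered so the goroutine can still send and exit after a timeout.
	done := make(chan error, 1)
	go func() { done <- f() }()

	select {
	case <-timer.C:
		return errTimeout
	case err := <-done:
		return err
	}
}

// fast returns immediately; slow outlives the timeout.
func fast() error { return nil }
func slow() error { time.Sleep(300 * time.Millisecond); return nil }

func main() {
	fmt.Println(withTimeout(limit, fast)) // <nil>
	fmt.Println(withTimeout(limit, slow)) // timeout reached
}
```

Note that the timeout only stops the caller from waiting. The slow function itself keeps running until it returns.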

3.8 Working with signals

Go's default handling of signals is documented in os/signal. This section demonstrates how you might change the default behaviour in your own programs. You might want to do this if, for example, you have a server application that should finish processing in-flight requests before shutting down, or you want to make sure your program finishes writing everything currently in the buffer and properly closes open files before exiting. The example in the book is relatively close to the example from the os/signal docs for Notify(), so a read through that will give you the gist.

3.9 Stopping goroutines

It's easy to start a goroutine, but stopping one goroutine from inside another can be a bit tricky. There are two main ways to do this: using channels, or using the context package that ships with Go 1.7 and later.

The example code on page 80 shows how to use channels to stop a goroutine. The code demonstrates a program that implements concurrent workers which can process data from a queue.


package main

import (
  "fmt"
  "sync"
)

var wg sync.WaitGroup

func main() {
  queue := make(chan string)
  for i := 0; i < 2; i++ { //make two workers (goroutines)
    wg.Add(1)
    go fetchURL(queue)
  }

  queue <- "http://www.example.com"
  queue <- "http://www.example.net"
  queue <- "http://www.example.net/foo"
  queue <- "http://www.example.net/bar"

  close(queue) // tell the goroutines to terminate
  wg.Wait()    // wait for all goroutines to terminate
}

func fetchURL(queue chan string) {
  for {
    url, more := <-queue // more will be false when this closes
    if more {
      // process the url
      fmt.Println("fetching", url)
      // ...
    } else {
      fmt.Println("worker exit")
      wg.Done()
      return
    }
  }
}

When you call close() on a channel from the sending side, the second value returned to the receiving side (the variable more) evaluates to false once all previously sent values have been received. The goroutine is then able to terminate itself (by calling return) when there is no more data to receive.

The context package can be used to achieve much the same thing. The advantage that context brings to the table is a function called context.WithTimeout(), which lets you handle timing out and cancellation in one fell swoop. Go Concurrency Patterns: Context from the official Go blog covers its usage extensively.
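Here is a minimal sketch of a worker that stops when its context is done. The worker and runFor functions, the tick interval and the demoTimeout value are all invented for illustration and are not from the book or the blog post mentioned above.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// demoTimeout is a short, arbitrary timeout so the demo ends quickly.
const demoTimeout = 55 * time.Millisecond

// worker counts ticks until ctx is cancelled or its deadline passes,
// then reports why it stopped via ctx.Err().
func worker(ctx context.Context, interval time.Duration) (int, error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	n := 0
	for {
		select {
		case <-ctx.Done():
			return n, ctx.Err()
		case <-ticker.C:
			n++ // stand-in for one unit of real work
		}
	}
}

// runFor runs worker under a context that expires after timeout.
func runFor(timeout time.Duration) (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel() // always release the context's resources
	return worker(ctx, 10*time.Millisecond)
}

func main() {
	n, err := runFor(demoTimeout)
	fmt.Println("ticks:", n, "stopped because:", err)
}
```

Calling cancel() early from another goroutine would stop the worker the same way, with ctx.Err() reporting a cancellation instead of a deadline.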

That wraps up the topics introduced in this chapter. A lot of the content was new to me, so I hope to make use of some of these techniques in the future.

2016/11/02
