Concurrency in Go

these notes are based largely on chapter 8 of the book The Go Programming Language by Donovan and Kernighan

Introduction

a concurrent program is a program with two or more autonomous activities going on at the same time

concurrency is used in many places in modern programming, e.g. web servers concurrently serve hundreds of users, and cell phones concurrently draw on the screen, interact with the network, and perhaps play music

concurrency is one of the strengths of Go, and there are two main approaches:

  • Communicating Sequential Processes (CSP) using goroutines and channels
  • shared-memory multi-threading

we will only touch upon goroutines and channels in this note

concurrent programming is notoriously tricky, even in a language like Go that was designed for it

  • the order in which concurrent processes run is often hard for humans to reason about
  • sharing memory is a tricky issue since you must be careful to ensure that two (or more) processes never try to do different things to the same memory at the same time

Goroutines

every concurrent activity in Go runs in its own goroutine

  • you can think of goroutines as similar to threads, but with some important conceptual differences that we will see below
  • we sometimes say that regular functions are called sequentially

it’s possible to have two or more (sometimes tens of thousands!) goroutines active at the same time

when a Go program starts running, the main function is called inside a goroutine, and this goroutine is called the main goroutine

the go statement is used to create new goroutines, e.g.:

    f()    // calls function f in the regular sequential way

    go f() // creates a new goroutine and calls f inside of it;
           // does not wait for f to finish!
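
as a minimal sketch of the difference, the following program calls the same function both ways; greet and callBoth are made-up names for this illustration, and sync.WaitGroup (which these notes do not cover) is used only so that the caller can wait for the goroutine before reading its result

```go
package main

import (
	"fmt"
	"sync"
)

// greet is a hypothetical stand-in for f: it stores a message in *out.
func greet(out *string) {
	*out = "hello from greet"
}

// callBoth calls greet once sequentially and once in a new goroutine.
func callBoth() (string, string) {
	var seq, conc string

	greet(&seq) // regular call: we get here again only when greet is done

	var wg sync.WaitGroup // assumption: used only to wait for the goroutine
	wg.Add(1)
	go func() { // the go statement returns right away
		defer wg.Done()
		greet(&conc)
	}()
	wg.Wait() // without this wait, conc might still be empty below

	return seq, conc
}

func main() {
	seq, conc := callBoth()
	fmt.Println(seq)  // prints hello from greet
	fmt.Println(conc) // prints hello from greet
}
```

if the wg.Wait() line were removed, callBoth could return before the goroutine had run at all, which is what "does not wait" means in practice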

in the following program, there are two goroutines:

  • the main one where a long-running calculation is done
  • a second goroutine that runs a simple ASCII animation of a spinner

// spinner.go

// From Chapter 8 of the book "The Go Programming Language", by Donovan and Kernighan.

package main

import (
        "fmt"
        "time"
)

func main() {
        go spinner(100 * time.Millisecond)
        const n = 45
        fibN := fib(n) // slow
        fmt.Printf("\rFibonacci(%d) = %d\n", n, fibN)
}

func spinner(delay time.Duration) {
        for {
                for _, r := range `-\|/` {
                        fmt.Printf("\r%c", r)
                        time.Sleep(delay)
                }
        }
}

func fib(x int) int {
        if x < 2 {
                return x
        }
        return fib(x-1) + fib(x-2)
}

note that spinner loops forever, yet the program still ends: when main returns, the program exits and all other goroutines are stopped abruptly

A Simple Time Server

here is a Go program that implements a simple time server on port 8000:

// clock1.go

package main

import (
        "io"
        "log"
        "net"
        "time"
)

func main() {
        listener, err := net.Listen("tcp", "localhost:8000")
        if err != nil {
                log.Fatal(err)
        }
        for {
                // Accept() blocks (i.e. waits) until a connection request is received
                conn, err := listener.Accept()
                if err != nil {
                        log.Print(err) // e.g., connection aborted
                        continue
                }
                // handleConn(conn) // handle one connection at a time
                go handleConn(conn) // handle each connection in its own goroutine
        }
}

func handleConn(c net.Conn) {
        // defer causes a statement to be executed after a function is done;
        // c.Close() is called when the function ends, no matter where/how it ends
        defer c.Close()
        for {
                _, err := io.WriteString(c, time.Now().Format("15:04:05\n"))
                if err != nil {
                        return // e.g., client disconnected
                }
                time.Sleep(1 * time.Second)
        }
}

to test this program, run it at the command-line (note the &), and then connect to it using the standard nc (netcat) program:

$ go run clock1.go &      # note the &

$ nc localhost 8000       # ctrl-C to end

this should print a list of times on the screen

ending nc doesn’t end clock1

  • type ps to see a list of currently running processes
  • one of the processes should be named clock1
  • you can end that with the command killall clock1, which ends all processes with that name

the line in main that calls handleConn is important

if it looks like this, then the server can handle only one connection at a time:

handleConn(conn)

this is a regular function call, so the program does not continue until handleConn returns, i.e. until the client disconnects and the connection is closed

  • so with this line, any other client has to wait until the current one is finished

running it as a goroutine allows multiple connections:

go handleConn(conn)

every time this line of code is executed, a new goroutine is created that runs its own call of handleConn(conn)

  • this allows multiple clients to connect at the same time
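
the following is a rough, self-contained sketch of two clients being served at once; startClock and twoClients are hypothetical helpers written for this illustration (they are not from the book), the server listens on a free local port instead of port 8000, and the delay is shortened to 100 milliseconds

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"log"
	"net"
	"time"
)

// startClock is a hypothetical helper: it listens on a free local port,
// serves every connection in its own goroutine, and returns the address.
func startClock() (string, error) {
	listener, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	go func() {
		for {
			conn, err := listener.Accept()
			if err != nil {
				return // listener failed; stop accepting
			}
			go handleConn(conn) // one goroutine per client
		}
	}()
	return listener.Addr().String(), nil
}

// handleConn writes the current time to c every 100ms until a write fails.
func handleConn(c net.Conn) {
	defer c.Close()
	for {
		_, err := io.WriteString(c, time.Now().Format("15:04:05\n"))
		if err != nil {
			return // e.g., client disconnected
		}
		time.Sleep(100 * time.Millisecond)
	}
}

// twoClients opens two connections before reading from either one, then
// reads one line from the second client and one line from the first.
func twoClients(addr string) (string, string, error) {
	c1, err := net.Dial("tcp", addr)
	if err != nil {
		return "", "", err
	}
	defer c1.Close()
	c2, err := net.Dial("tcp", addr)
	if err != nil {
		return "", "", err
	}
	defer c2.Close()

	second, err := bufio.NewReader(c2).ReadString('\n')
	if err != nil {
		return "", "", err
	}
	first, err := bufio.NewReader(c1).ReadString('\n')
	if err != nil {
		return "", "", err
	}
	return first, second, nil
}

func main() {
	addr, err := startClock()
	if err != nil {
		log.Fatal(err)
	}
	first, second, err := twoClients(addr)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(first, second) // two time stamps, one per line
}
```

if go handleConn(conn) were replaced by a plain handleConn(conn), the second client would never be served while the first connection stays open, so the read from c2 would wait forever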

Channels

channels are connections between goroutines that allow them to send/receive information between each other

  • functions normally send information to each other through their arguments and return values
    • or through global variables, but using global variables is usually frowned upon
  • but that won’t work for a goroutine, since we want to send/receive information to/from it while it’s running
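
as a minimal sketch of the idea, the following program starts a goroutine that adds up some numbers and hands the total back over a channel rather than through a return value; sumTo and sumInBackground are made-up names for this illustration

```go
package main

import "fmt"

// sumTo is a hypothetical helper: it adds 1 + 2 + ... + n and sends the
// total on result instead of returning it.
func sumTo(n int, result chan int) {
	total := 0
	for i := 1; i <= n; i++ {
		total += i
	}
	result <- total // send: waits until another goroutine receives
}

// sumInBackground runs sumTo in its own goroutine and waits for its answer.
func sumInBackground(n int) int {
	result := make(chan int) // a channel that carries int values
	go sumTo(n, result)
	return <-result // receive: waits until sumTo sends
}

func main() {
	fmt.Println(sumInBackground(100)) // prints 5050
}
```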

here’s a simple application that launches a goroutine that generates Fibonacci numbers, and then communicates these numbers to the main goroutine through a channel:

package main

import (
        "fmt"
)

// fibgen returns a channel on which the Fibonacci numbers are sent, one at a time
func fibgen() chan int {
        ch := make(chan int)

        go func() {
                a, b := 1, 1
                for { // infinite loop
                        ch <- a
                        a, b = b, a+b
                }
        }()        // note the ()

        return ch
}

func main() {
        nextFib := fibgen()
        for i := 0; i < 10; i++ {
                fmt.Println(<-nextFib)
        }
}

note a few things:

  • when the program runs, main() is run in the main goroutine, and that calls the fibgen function
  • since fibgen is a function, the following for-loop in main doesn’t start running until fibgen returns
  • in fibgen, first an unbuffered int channel is created
    • this unbuffered channel will be used to communicate with the main goroutine
    • the type of a channel (in this case int) restricts the type of values that can be sent on it, i.e. you can only send/receive int values on ch
  • after ch is created, a new goroutine is launched
    • note that it is launched as an anonymous lambda function, i.e. a function with no name
    • note also that the function in the goroutine refers to ch, which is a value created outside of the anonymous lambda function itself
  • the goroutine launched in fibgen is ready to run as soon as it is launched, and meanwhile the main goroutine continues running fibgen
    • this is a very important point about goroutines: they are ready to run as soon as you launch them (the Go scheduler decides exactly when), and the function that launches them keeps going, i.e. it doesn’t wait until the goroutine is finished
  • the first thing this new goroutine does is initialize a and b to 1, and then enter a loop where the first thing done in the loop is to send the value a on ch using the statement ch <- a
    • the goroutine that executes ch <- a is the sending goroutine, and since ch is an unbuffered channel the goroutine blocks (i.e. waits) until another goroutine executes a receive operation on the same channel
    • if instead the receiving goroutine executes its receive operation first, then it blocks (i.e. waits) until another goroutine sends a value on ch
      • in this example, <-nextFib in the main goroutine is the receive operation
      • note that for a receive the <- goes before the channel name, while for a send it goes after it
    • re-read the previous two points: they’re very important!
    • so, the just-launched goroutine is just sitting there on the line ch <- a, waiting for another goroutine to receive its value
    • if no goroutine receives its value, then it could wait forever
    • since it’s waiting, the fact that the goroutine is executing an infinite loop is not a problem: it sends values one at a time, never running more than one value ahead of the receiver
    • in general, for unbuffered Go channels, the send/receive operations block (i.e. wait) until another goroutine performs the matching operation on the same channel
      • in this way, channels synchronize the operation of goroutines
  • one other important channel operation is close, i.e. you close a channel when you are done sending on it
    • once a channel is closed, no more values can be sent on it (trying to send on a closed channel causes a panic)
    • any values already on the channel when it was closed can still be received; after they have all been received, every further receive returns immediately with the zero value of the channel’s type
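
the close operation described above can be sketched as follows; countTo and collect are made-up names for this illustration, and the example also uses two standard Go features not mentioned in these notes: a for-range loop over a channel, and the two-value receive v, ok := <-ch

```go
package main

import "fmt"

// countTo is a hypothetical generator: it sends 1..n on a channel and
// then closes the channel to signal that no more values are coming.
func countTo(n int) chan int {
	ch := make(chan int)
	go func() {
		for i := 1; i <= n; i++ {
			ch <- i
		}
		close(ch) // the sender closes the channel, not the receiver
	}()
	return ch
}

// collect receives values from ch until it is closed and empty.
func collect(ch chan int) []int {
	var got []int
	for v := range ch { // the loop ends once ch is closed and drained
		got = append(got, v)
	}
	return got
}

func main() {
	ch := countTo(3)
	fmt.Println(collect(ch)) // prints [1 2 3]

	v, ok := <-ch      // ch is closed and empty now
	fmt.Println(v, ok) // prints 0 false
}
```

unlike fibgen, this generator is finite, so closing the channel gives the receiver a way to find out that it has seen every value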

buffered channels are like unbuffered channels, except they contain a queue (with a max size given at their creation time) that can hold values

  • when a value is sent to a buffered channel, it is added to the back of the queue; if the queue is full, then the goroutine blocks until an item has been removed from the queue
  • when a value is received from a buffered channel, it is removed from the front of the queue; if the queue is empty, then the goroutine is blocked until an item is added to the channel
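
a small sketch of this queue behaviour, assuming a buffer of size 3; bufferDemo is a made-up name for this illustration, and len and cap are the built-in functions that report how many values a channel currently holds and how many it can hold

```go
package main

import "fmt"

// bufferDemo sends three values with no receiver waiting, which works
// only because the channel's queue has room for all three.
func bufferDemo() (int, int, []int) {
	ch := make(chan int, 3) // buffered channel with a queue of size 3

	ch <- 10
	ch <- 20
	ch <- 30 // the queue is now full; a fourth send would block

	inQueue, capacity := len(ch), cap(ch)

	var received []int
	for i := 0; i < 3; i++ {
		received = append(received, <-ch) // values come out front-first
	}
	return inQueue, capacity, received
}

func main() {
	inQueue, capacity, received := bufferDemo()
	fmt.Println(inQueue, capacity, received) // prints 3 3 [10 20 30]
}
```

with an unbuffered channel, the very first send in bufferDemo would block forever, since the same goroutine does both the sending and the receiving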

we don’t have time to go into the details of buffered channels

one other Go language feature often used with channels is select: it lets a goroutine wait on several channel operations at once and carry on with one that is ready
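
a minimal sketch of select, assuming just two channels of strings; firstReady is a made-up name for this illustration

```go
package main

import "fmt"

// firstReady waits on two channels at once and reports which of them
// delivered a value.
func firstReady(a, b chan string) string {
	select {
	case msg := <-a:
		return "a: " + msg
	case msg := <-b:
		return "b: " + msg
	}
}

func main() {
	a := make(chan string)
	b := make(chan string)

	go func() { b <- "ready" }() // in this run only b ever gets a value

	fmt.Println(firstReady(a, b)) // prints b: ready
}
```

a plain receive on a would wait forever here; select lets the goroutine take whichever channel has something to offer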

finally, we note that there are many standard patterns of concurrency, such as pipelines and parallel looping, that can be usefully built on top of goroutines and channels
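
as one possible sketch of a pipeline, the following program connects two stages with channels; generate and square are made-up names for this illustration, and each stage closes its output channel when it has nothing more to send

```go
package main

import "fmt"

// generate is the first stage: it sends the given numbers on a channel
// and then closes it.
func generate(nums ...int) chan int {
	out := make(chan int)
	go func() {
		for _, n := range nums {
			out <- n
		}
		close(out)
	}()
	return out
}

// square is the second stage: it receives numbers from in and sends
// their squares on a new channel, closing it when in is exhausted.
func square(in chan int) chan int {
	out := make(chan int)
	go func() {
		for n := range in {
			out <- n * n
		}
		close(out)
	}()
	return out
}

func main() {
	for v := range square(generate(1, 2, 3)) {
		fmt.Println(v) // prints 1, 4 and 9 on separate lines
	}
}
```

each stage runs in its own goroutine, so the stages work concurrently, and the unbuffered channels keep them in step with each other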