Slices and Arrays in Go

(The following is based largely on http://blog.golang.org/slices.)

Like most programming languages, Go has a data structure called an array that stores a sequence of objects of the same type. Less commonly, Go has a second — and more useful — data structure called a slice that lets you refer to a contiguous sub-sequence of elements from an underlying array.

Arrays

Arrays are a fundamental data structure in Go. For instance:

// declares arr to be an array of 5 ints, all initialized to 0
var arr [5]int

for i, val := range arr {                  // print arr
    fmt.Printf("arr[%d] = %d\n", i, val)
}

This prints:

arr[0] = 0
arr[1] = 0
arr[2] = 0
arr[3] = 0
arr[4] = 0

When you create arrays, Go always initializes its element to the zero-value for the array type.

You can also create an array using an array literal, e.g.:

var arr = [5]int{1, 2, 3, 4, 5}  // array literal

Ranged for-loops (i.e. loops that use the keyword range as above) are often the nicest way to traverse arrays in Go. You can also use a more traditional C-style loop:

for i := 0; i < len(arr); i++ {
    fmt.Printf("arr[%d] = %d\n", i, arr[i])
}

The len function returns the length of an array, while the usual square- bracket notion is used to access individual elements at a given position. The first index position of a Go array is always 0.

You can’t change the size of an array in Go. Once an array is created, it’s length remains fixed until it is disposed of by the garbage collector.

You can pass an array to a function. As with all parameters, Go passes arrays by value (i.e. it makes a copy of the array):

func add1(arr [5]int) {
    for i := range arr {
        arr[i]++
    }
}

Now call it with this code:

var arr [5]int               // arr is initialized to all 0s
add1(arr)                    // add 1 to every element of arr
for i, val := range arr {    // print the results
    fmt.Printf("arr[%d] = %d\n", i, val)
}

This prints:

arr[0] = 0
arr[1] = 0
arr[2] = 0
arr[3] = 0
arr[4] = 0

Nothing has changed because add1 modified the copy of arr, not the original.

Copying an array is generally inefficient, and one way around this is to re- write add1 to accept a pointer to the array, e.g.:

func add1(arr *[5]int) {
    for i := range *arr {
        (*arr)[i]++
    }
}

Then you call it like this:

var arr [5]int
add1(&arr)                   // note the &
for i, val := range arr {
    fmt.Printf("arr[%d] = %d\n", i, val)
}

Now the original arr is changed:

arr[0] = 1
arr[1] = 1
arr[2] = 1
arr[3] = 1
arr[4] = 1

The code for this is rather messy. It’s easy to forget a * or &.

Arrays have a significant limitation. Consider this slight change:

var arr [6]int               // 6 elements instead of 5
add1(&arr)
for i, val := range arr {
    fmt.Printf("arr[%d] = %d\n", i, val)
}

This doesn’t even compile! You get a type mis-match error because add1 expects an array of length 5, but we are passing it an array of length 6. In Go, the length of an array is part of the array’s type, so arrays of different lengths have different types.

We get the same error in the version without pointers:

func add1(arr [5]int) {
    for i := range arr {
        arr[i]++
    }
}

func main() {
    var arr [6]int               // 6 elements instead of 5
    add1(arr)
    for i, val := range arr {
        fmt.Printf("arr[%d] = %d\n", i, val)
    }
}

Again, this is a type mis-match error: you can’t pass a 6-element array to a function expecting a 5-element array because an array’s length is part of its type.

This is a big problem with arrays. It essentially makes it impossible for us to write one function that works on arrays of different sizes.

Go addresses these problems with arrays by introducing a new data structure called a slice.

Slicing an Array

A slice is a data structure that describes a contiguous sequence of elements in an array. A slice is not an array, but for most purposes it acts like one.

One way to make a slice is from an array:

var arr [5]int           // an array with 5 elements

var sl []int = arr[1:4]  // a slice referring to arr[1], arr[2], arr[3]

Here, sl is a slice, and it’s type is []int, read “slice of ints”. We can also define sl like this:

var sl = arr[1:4]  // type of sl inferred from arr[1:4]

Or this:

sl := arr[1:4]  // can only use := if this line is in a function

You can use a slice pretty much the same way that you use an array:

var arr [5]int
for i := range arr {   // arr is {0, 1, 2, 3, 4}
    arr[i] = i
}

sl := arr[1:4]         // sl is a slice of length 3

for i := 0; i < len(sl); i++ {            // print sl
    fmt.Printf("sl[%d] = %d\n", i, sl[i])
}

It’s important to understand that a slice always refers to an underlying array, and if that array changes, then so does the slice:

arr[1] = -100   // change second element of arr

for i := 0; i < len(sl); i++ {              // print sl
    fmt.Printf("sl[%d] = %d\n", i, sl[i])
}

This prints:

sl[0] = -100
sl[1] = 2
sl[2] = 3

You can pass slices to functions like this:

// sl can be any int slice of any length
func display(sl []int) {
    for i, val := range sl {
        fmt.Printf("sl[%d] = %d\n", i, val)
    }
}

func main() {
    var arr [5]int

    sl1 := arr[1:4]  // sl1 refers to arr[1], arr[2], arr[3]
    display(sl1)

    sl2 := arr[2:5]  // sl2 refers to arr[2], arr[3], arr[4]
    display(sl2)
}

Unlike arrays, the length of a slice is not part of its type, so the display function can be passed any int slice.

Slice Notation in General

Suppose that arr is of length n, and consider this statement:

s := arr[a:b]

Since the indices of the underlying array arr start at 0 and end at n - 1, a few basic restrictions on a and b are clear:

  • Both a and b must be non-negative. This is because the 0 is the first index for the underlying array (all Go arrays start at 0).

    You will get either a compiler error or a run-time error if you use negative indices. For example, this causes a compiler error:

    var arr [5]int
    
    s := arr[-2:4]  // compiler error: negative index not allowed in a slice
    

    Or sometimes you get a run-time error like this:

    var arr [5]int
    
    i := -2
    
    s := arr[i:4]      // runtime error when this line is executed
    
    fmt.Println(s[0])  // s must be used somewhere to avoid an
                       // "s is unused error"
    

    It’s unfortunate that Go can’t recognize this indexing error at compile time. But, in general, the compiler can’t always know the value of a variable before the program runs. For example, if i was read from the user, then the value of i is not known until the user types it.

  • Both a and b must be less than, or equal to, n. Again, this is a consequence of the fact that the underlying array has n elements.

    An important detail is that a slice may refer to location n, even through arr[n] does not exist. For example:

    var arr [5]int
    
    s := arr[0:5]   // okay: s is the entire array
    
    s = arr[0:4]    // okay: s is the first 4 elements of the array
    
    s = arr[5:5]    // okay: s is the empty slice, []
    

    Note that we use := only for the first assignment to s. That’s because := assigns and creates the variable, and it would be an error to create s more than once.

  • The first index, a, must be less than, or equal to, b. Otherwise, you get an inverted slice error, e.g.:

    var arr [5]int
    
    s := arr[3:1]  // compiler error: slice indices are inverted
    

To summarize, if arr is an array of non-negative length n, then the slice arr[a:b] is a valid slice if 0 <= a <= b <= n.

Finally, there are a couple of slice indexing shorthands that are useful to know. Assuming arr is of length n, then:

  • arr[a:] is the same as arr[a:n]
  • arr[:b] is the same as arr[0:b]
  • arr[:] is the same as arr[a:n]

Slices of Slices

We’ve seen that you can create a slice from an array. You can also create a slice from another slice. For example:

var arr [5]int

for i := range arr { arr[i] = i }

s1 := arr[1:4]  // a slice of an array
s2 := s1[0:2]   // a slice of a slice

fmt.Printf("s1 is %v\n", s1)  // %v prints the value of a Go object
fmt.Printf("s2 is %v\n", s2)  // in a standard notation

This prints:

s1 is [1 2 3]
s2 is [1 2]

Both slices refer to the same underlying array, so if that changes, the slices change too:

// ... code same as above ...

arr[1] = 77

fmt.Printf("\ns1 is %v\n", s1)
fmt.Printf("s2 is %v\n", s2)

The entire program prints:

s1 is [1 2 3]
s2 is [1 2]

s1 is [77 2 3]
s2 is [77 2]

We can also change the underlying array through a slice, e.g.:

// ... code same as above two examples ...

s2[0] = -55

fmt.Printf("\narr is %v\n", arr)
fmt.Printf("s1 is %v\n", s1)
fmt.Printf("s2 is %v\n", s2)

The entire program now prints:

s1 is [1 2 3]
s2 is [1 2]

s1 is [77 2 3]
s2 is [77 2]

arr is [0 -55 2 3 4]
s1 is [-55 2 3]
s2 is [-55 2]

How Slices are Implemented

It’s instructive to see how slices are implemented in Go, as that can help explain their behaviour. Conceptually, a slice could be implemented like this:

type sliceHeader struct {
    Length        int
    Capacity      int
    ZerothElement *int     // pointer to an int
}

Note: We can’t mimic slices perfectly because every variable in a Go structure must have an explicit type. Here, we’ve given ZerothElement the type *int, so sliceHeader can only refer to arrays of ints. In general, the ZerothElement must be of type T*, where T is the type of the elements in the underlying array. Go currently has no way of expressing T: that would require something like templates as in C++ or Java.

Importantly, a sliceHeader uses a small, fixed amount of memory. It consists of two integers and a pointer. The elements of the slice are stored in the underlying array. Thus, the cost of creating or copying a slice is extremely low.

The capacity of a slice is the number of elements in the underlying array from the start of the slice to the end of the array (not the end of the slice). The function cap returns a slice’s capacity:

var arr [5]int

s := arr[1:4]  // a slice of an array

fmt.Printf("length of s is %d\n", len(s))
fmt.Printf("capacity of s is %d\n", cap(s))

This prints:

length of s is 3
capacity of s is 4

So why does a slice store the capacity? The capacity of a slice is used for extending its size. Basically, you can extend the size of a slice up to its capacity. If you extend it past its capacity, you get an error.

Consider this function for appending an integer onto the end of a slice:

func extend(slice []int, x int) []int {
    n := len(slice)
    slice = slice[0 : n+1]  // can only slice up to a slice's capacity
    slice[n] = x
    return slice
}

Then run this code:

func main() {
    var arr [5]int

    s := arr[0:0]  // initially empty slice: length 0, capacity 5

    s = extend(s, 1)
    s = extend(s, 2)
    s = extend(s, 3)
    s = extend(s, 4)
    s = extend(s, 5)

    fmt.Printf("s is %v\n", s)

    s = extend(s, 6)  // runtime error: slice bounds out of range
}

The sixth time extend is called gives a runtime error because there is no more space in the slice’s underlying array. A slice can never be longer than the array it refers to.

Passing a Slice to a Function

When you pass an array to a function, the array is passed by value, i.e. a copy of the array is made. This is both inefficient for large arrays, and also a limitation because the function cannot modify the original array (it can only modify the copy).

Slices behave differently. When you pass a slice to a function, the slice data structure is copied, but the underlying array is not copied. The copied slice points to the same underlying array as the original slice, and so both slices access the same array. Thus passing a slice to a function is not only efficient, but it lets you modify the underlying array.

For example:

func add1(s []int) {     // s is slice
    for i := range s {
        s[i]++
    }
}

func main() {
    var arr [5]int      // arr is an array
    s := arr[0:5]

    add1(s)

    for i, val := range s {
        fmt.Printf("s[%d] = %d\n", i, val)
    }
}

This prints:

s[0] = 1
s[1] = 1
s[2] = 1
s[3] = 1
s[4] = 1

Making Slices with make

We’ve seen that one way to create a slice is directly from an array. Another option is to use Go’s make function like this:

s := make([]int, 10, 15)  // makes an int slice of length 10, capacity 15

This statement creates both an array of ints of length 15, and a slice of length 10 (and capacity 15) that refers to that array.

If you want to create a slice whose length and capacity are equal, you don’t need to pass the third parameter to make:

s := make([]int, 10)  // same as make([]int, 10, 10)

If you need to increase the length of a slice past its capacity, your only choice is to create a new slice. This is usually done using the append function discussed below, but here we will do it “by hand” using make:

s := make([]int, 10, 15)

fmt.Printf("length of s is %d\n", len(s))
fmt.Printf("capacity of s is %d\n", cap(s))

// make a new slice with the same length of s but twice the capacity
s2 := make([]int, len(s), 2 * cap(s))

for i := range s {  // copy values of s into s2
    s2[i] = s[i]
}

s = s2
fmt.Printf("\nlength of s is %d\n", len(s))
fmt.Printf("capacity of s is %d\n", cap(s))

The copy Function

In the example code above we copied the values from one slice to another using a loop. Copying slices is common enough in Go that a built-in copying function is provided. Conveniently, it’s called copy:

s1 := []int{1, 2, 3}                     // slice literal
s2 := make([]int, len(s1), 2 * cap(s1))

copy(s2, s1)  // copy values from s1 into s2

fmt.Printf("%v\n", s2)  // prints: [1 2 3]

Notice that s1 is initialized using a slice literal, which is yet another way to create a slice. If we wrote [3]int{1, 2, 3}, then it would instead be an array literal.

Not only is copy easier to type than a loop, but it correctly handles a few special cases. First, the number of values copied by copy equals the minimum of the lengths of its two arguments. For example:

s1 := []int{1, 2, 3}   // slice literal
s2 := make([]int, 2)   // slice of length 2, capacity 2

copy(s2, s1)  // copy values from s1 into s2

fmt.Printf("s2 = %v\n", s2) // prints: [1 2]

The other thing copy does is to correctly handle over-lapping slices within the same underlying array. This lets you use copy to move items around inside a slice. For example:

s := []int{0, 1, 2, 3, 4}   // s is a slice of ints

copy(s[3:5], s[2:4])        // move [2 3] one position up in s

fmt.Printf("s = %v\n", s)   // prints: [0 1 2 2 3]

The append Function

What happens if you want a slice to grow longer than it’s capacity? As mentioned above, once a slice is created it’s capacity can’t change because it’s limited by the length of the underlying array (which can’t change). So you need to create a new, longer slice, and this is what the built-in append function does (when necessary).

For example:

s := []int{0, 1, 2, 3, 4}

fmt.Printf("len(s) is %d\n", len(s))
fmt.Printf("cap(s) is %d\n", cap(s))

s = append(s, 5)  // append 5 onto the end of s

fmt.Printf("\ns = %v\n", s)
fmt.Printf("len(s) is %d\n", len(s))
fmt.Printf("cap(s) is %d\n", cap(s))

Running this code prints this:

len(s) is 5
cap(s) is 5

s = [0 1 2 3 4 5]
len(s) is 6
cap(s) is 12

Appending one new element to s increases its length by 1, but its capacity has increased from 5 to 12. That’s because when append is forced to increase the slice’s capacity, it allocates more memory than is strictly needed so that future appends will be faster. Note that the documentation for append does not say exactly how much the capacity of the array will be increased, and so it is possible that different versions of Go might print different capacities than this example.

The implementation of append is something like this:

func Append(s []int, elements ...int) []int {
    n := len(s)
    total := len(s) + len(elements)
    if total > cap(s) {
        // reallocate
        newSize := 2 * total + 1
        newSlice := make([]int, total, newSize)
        copy(newSlice, s)
        s = newSlice
    }
    s = s[:total]
    copy(s[n:], elements)
    return s
}

(This only works with int slices, while the built-in append works with any type of slice. But the essential code is the same.)

As you can see, a new slice is created only when there is no more capacity in s to hold all everything in elements. Thus appending is often extremely efficient, and a new slice is only allocated when necessary.

One new feature used here is the argument type ...int. This lets you pass in a single int, or an entire slice of ints. For example:

s := []int{0, 1, 2, 3, 4}
t := []int{5, 6, 7}

s = append(s, t...)  // the ... is necessary

fmt.Printf("\ns = %v\n", s)
fmt.Printf("len(s) is %d\n", len(s))
fmt.Printf("cap(s) is %d\n", cap(s))

Exercises

  1. Write a function called shrink(s []int) that returns a new slice with the same elements as s, except the capacity of the newly returned slice is the same as its length.