So Many Libraries, So Little Time


Navigating Go’s I/O libraries.

Go has several libraries for file I/O including os, io, io/ioutil, and bufio.

As convenient as options are, having an abundance can lead you to question whether or not you’re using one that’s most appropriate for your particular use case. What if instead of filtering through documentation until you found what you were looking for, you could do so by simply identifying your use case in an intuitive way?

I’ve taken the liberty of organizing some useful I/O functions by their use case for the 4 libraries mentioned above. I’ve also included examples and explanations to better illustrate their utility from my point-of-view.

The idea here is to begin experimenting with what I believe to be an easily navigable format for digesting information quickly.

Feedback is welcome and encouraged.

Enjoy!

os pkg


os.Open

Opens an existing file read-only (os.O_RDONLY).

Returns an error if the file does not exist or you do not have read permission for it.

You can also check for the specific err with os.IsPermission or os.IsNotExist.

You can also check if the file exists before-hand using os.Stat.

os.Stat will return other useful file information as well.

// check if file exists with os.Stat
info, err := os.Stat("scrap.txt")
if err != nil {
    log.Fatal(err)
}

// useful info returned from os.Stat
fmt.Println("File name:", info.Name())
fmt.Println("Size in bytes:", info.Size())
fmt.Println("Permissions:", info.Mode())
fmt.Println("Last modified:", info.ModTime())
fmt.Println("Is Directory: ", info.IsDir())
fmt.Printf("System interface type: %T\n", info.Sys())
fmt.Printf("System info: %+v\n\n", info.Sys())

// narrowing down errors
f, err := os.Open("scrap.txt")
if err != nil {
	if os.IsNotExist(err) {
		fmt.Println("file does not exist")
	}
	if os.IsPermission(err) {
		fmt.Println("file does not have read permissions")
	}
	log.Fatal(err)
}
defer f.Close()
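
The os documentation notes that os.IsNotExist and os.IsPermission predate errors.Is and suggests errors.Is for new code. Here is a minimal sketch of the same check in that style. The helper name describeOpenError and the path passed to it are my own placeholders, not part of any library.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// describeOpenError is a hypothetical helper that tries to open path and
// classifies the result with errors.Is rather than os.IsNotExist/os.IsPermission.
func describeOpenError(path string) string {
	f, err := os.Open(path)
	if err == nil {
		f.Close()
		return "opened"
	}
	switch {
	case errors.Is(err, os.ErrNotExist):
		return "file does not exist"
	case errors.Is(err, os.ErrPermission):
		return "file does not have read permissions"
	default:
		return "other error: " + err.Error()
	}
}

func main() {
	// Assumption: this path does not exist on the machine running the sketch.
	fmt.Println(describeOpenError("no-such-dir/scrap.txt"))
}
```

A switch keeps the two cases mutually exclusive and leaves room for a default branch, which the two separate if statements do not.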

os.OpenFile

Opens a file and requires you to explicitly set its access flag, control behavior, and permission bits.

If the file doesn’t exist, make sure to include the os.O_CREATE control behavior so it can be created before opening.

f, err := os.OpenFile("scrap.txt", os.O_RDWR | os.O_CREATE, 0666)
if err != nil {
	log.Fatalf("failed to create file : %v\n", err)
}
defer f.Close()

// flags
// os.O_RDONLY // Read only
// os.O_WRONLY // Write only
// os.O_RDWR // Read and write
// control behaviors
// os.O_APPEND // Append to end of file
// os.O_CREATE // Create the file if it doesn't exist
// os.O_TRUNC // Truncate file when opening
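
To show how the flags and control behaviors combine, here is a small sketch that appends lines to a file with os.O_APPEND. The helpers appendLine and appendDemo are names I made up for illustration, and the file lives in a temporary directory so nothing on your disk is touched.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
)

// appendLine opens path write-only, creating it if needed, and appends line.
func appendLine(path, line string) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0644)
	if err != nil {
		return err
	}
	if _, err := f.WriteString(line + "\n"); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// appendDemo appends two lines to a file in a temporary directory and
// returns the resulting file contents.
func appendDemo() (string, error) {
	dir, err := os.MkdirTemp("", "scrap")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "scrap.txt")
	for _, line := range []string{"first", "second"} {
		if err := appendLine(path, line); err != nil {
			return "", err
		}
	}
	data, err := os.ReadFile(path)
	return string(data), err
}

func main() {
	contents, err := appendDemo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%q\n", contents) // "first\nsecond\n"
}
```

Swapping os.O_APPEND for os.O_TRUNC in the same call would instead leave only the last line written.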

os.Create

os.Create is exactly the same as OpenFile(name, O_RDWR|O_CREATE|O_TRUNC, 0666) under the hood.

It creates the file if it doesn’t exist, truncates it if it does, and returns an *os.File opened with the os.O_RDWR flag for I/O operations.

f, err := os.Create("scrap.txt")
if err != nil{
    log.Fatalf("failed to create file : %v\n", err)
}
defer f.Close()
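
Because of the O_TRUNC behavior, calling os.Create on a file that already has content empties it. A quick sketch to convince yourself, using a temporary directory and a helper name (sizeAfterRecreate) that I picked for this example:

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
)

// sizeAfterRecreate writes text to a file, calls os.Create on the same path,
// and returns the size of the file afterwards.
func sizeAfterRecreate(text string) (int64, error) {
	dir, err := os.MkdirTemp("", "scrap")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "scrap.txt")
	if err := os.WriteFile(path, []byte(text), 0644); err != nil {
		return 0, err
	}

	// os.Create truncates the existing file instead of failing.
	f, err := os.Create(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

func main() {
	size, err := sizeAfterRecreate("give me a home")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("size after os.Create:", size) // size after os.Create: 0
}
```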

os.File.Read, os.File.Write, and os.File.Seek

If your plan is to limit the number of packages you’re importing and stick with os, it’s a good idea to be aware of Seek.

Consider the example below and the error it returns.

The read operation will fail. Can you tell why?

f, err := os.Create("scrap.txt")
if err != nil{
	log.Fatalf("failed to create file : %v\n", err)
}
defer f.Close()

//write
bytesWritten, err := f.Write([]byte("give me a home"))
if err != nil{
	log.Fatal(err)
}
fmt.Printf("wrote %d bytes\n", bytesWritten)

data := make([]byte,bytesWritten)

//read
bytesRead, err := f.Read(data)
if err != nil{
	log.Fatal(err)
}
fmt.Printf("read %d bytes\n", bytesRead)
fmt.Printf("data : %s\n", data)

//Output:
//wrote 14 bytes
//2019/08/07 16:18:49 EOF
//exit status 1

To understand why we logged an error on the read operation, more specifically an EOF, you need to understand byte position.

After the Write method successfully completed, the byte position changed to 14.

We can verify this by printing the position after our write operation using Seek.

  currentPosition, err := f.Seek(0, 1)
  if err != nil {
      log.Fatal(err)
  }
  fmt.Printf("current byte position : %d\n", currentPosition)

  //wrote 14 bytes
  //current byte position : 14

Seek takes two params.

offset : number of bytes to move (can be negative or positive)

whence : position to move from as follows :

    // 0 = Beginning of file
    // 1 = Current position
    // 2 = End of file

With that said, we can interpret our currentPosition example above as “move 0 bytes from the current position”.

This told us that our current position after the write operation was at 14.

Therefore the Read method returned an EOF since there were no more bytes to read from the current position.

In the same way we used Seek to get the current position, we can use it to reset our current position.

    if _, err := f.Seek(-int64(bytesWritten), 1); err != nil {
        log.Fatal(err)
    }
    fmt.Printf("successfully moved back : %d bytes\n", bytesWritten)

When we add the above logic between the Write and Read calls from our earlier example, we get the expected output.

    //output :
    //wrote 14 bytes
    //successfully moved back : 14 bytes
    //read 14 bytes
    //data : give me a home
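
The io package also defines named constants for the whence values: io.SeekStart (0), io.SeekCurrent (1), and io.SeekEnd (2). Here is a sketch of the same write, rewind, read sequence using them. The helper writeThenRead is a name I chose, and it works on a temporary file rather than scrap.txt.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"os"
)

// writeThenRead writes text to a temporary file, rewinds to the start with
// Seek, and reads the same bytes back.
func writeThenRead(text string) (string, error) {
	f, err := os.CreateTemp("", "scrap")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	if _, err := f.WriteString(text); err != nil {
		return "", err
	}

	// io.SeekStart is the named form of whence = 0.
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		return "", err
	}

	data := make([]byte, len(text))
	if _, err := io.ReadFull(f, data); err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	got, err := writeThenRead("give me a home")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("data :", got) // data : give me a home
}
```

Seeking to an absolute position from the start avoids having to remember how many bytes were written.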

ioutil pkg


ioutil.WriteFile

Creates/opens a file, writes a slice of bytes to it, and closes it all in one function.

  err := ioutil.WriteFile("scrap.txt", []byte("some text here\n"), 0666)
  if err != nil {
     log.Fatal(err)
  }

Is there a read equivalent?

Yes, two actually.

ioutil.ReadFile and ioutil.ReadAll

Both functions read all the bytes of a file into memory until EOF is reached.

The difference between the two is that ioutil.ReadAll takes an io.Reader, such as an already opened file, while ioutil.ReadFile takes the path/name string of the file and handles opening it for you.

EOF is not returned as an error but used by the function as a means of distinguishing whether or not it should continue reading.

//contains the text "give me a home"
f, err := os.Open("scrap.txt")
if err != nil {
	log.Fatalf("failed to open file : %v\n", err)
}
defer f.Close()

//opens the file for you
bytesRead, err := ioutil.ReadFile("scrap.txt")
if err != nil {
	log.Fatalf("failed to read file : %v\n", err)
}
fmt.Printf("bytes read : %d\n", bytesRead)

//takes an already opened file
bytesRead, err = ioutil.ReadAll(f)
if err != nil {
	log.Fatalf("failed to read file : %v\n", err)
}
fmt.Printf("bytes read : %d\n", bytesRead)

//output:
//bytes read : [103 105 118 101 32 109 101 32 97 32 104 111 109 101]
//bytes read : [103 105 118 101 32 109 101 32 97 32 104 111 109 101]

This is convenient for quickly loading small files into memory. If you’re working with large files, the next couple of functions are safer to use as they let you set limits on the number of bytes you read into the buffer.
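
One thing worth knowing if you are on a newer toolchain: as of Go 1.16 the io/ioutil package is deprecated, and these three functions are available as os.WriteFile, os.ReadFile, and io.ReadAll. A minimal sketch of the same round trip with the newer names follows. The helper roundTrip and the temporary directory are my own scaffolding.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"os"
	"path/filepath"
)

// roundTrip writes text to a temporary file, then reads it back twice:
// once by path with os.ReadFile and once from an open file with io.ReadAll.
func roundTrip(text string) (fromPath, fromReader string, err error) {
	dir, err := os.MkdirTemp("", "scrap")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "scrap.txt")
	if err := os.WriteFile(path, []byte(text), 0644); err != nil {
		return "", "", err
	}

	// Opens, reads, and closes the file for you.
	a, err := os.ReadFile(path)
	if err != nil {
		return "", "", err
	}

	// Takes any io.Reader, here an already opened file.
	f, err := os.Open(path)
	if err != nil {
		return "", "", err
	}
	defer f.Close()
	b, err := io.ReadAll(f)
	if err != nil {
		return "", "", err
	}
	return string(a), string(b), nil
}

func main() {
	a, b, err := roundTrip("give me a home")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(a == b, a) // true give me a home
}
```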

io pkg


io.ReadAtLeast

Lets you specify a minimum number of bytes to read, and reads at most as many bytes as the buffer can hold.

Here are some examples of when this function might return a non-nil error.

io.ErrUnexpectedEOF : EOF was reached after reading some bytes, but fewer than the minimum threshold.

io.EOF : only if no bytes are read at all.

io.ErrShortBuffer : minimum byte threshold > length of the buffer.

If an error occurs after the minimum byte threshold has been met, that error is dropped.

Notice in the example below that the output did not print all of the text from the file.

That’s because it stopped reading once the buffer was full, in this case after 11 bytes.

// this file contains the text "give me a home"
f, err := os.Open("scrap.txt")
if err != nil {
    log.Fatal(err)
}
defer f.Close()

data := make([]byte, 11)
minBytes := 8
numBytesRead, err := io.ReadAtLeast(f, data, minBytes)
if err != nil {
    log.Fatal(err)
}
log.Printf("Number of bytes read: %d\n", numBytesRead)
log.Printf("Data read: %s\n", data)

//output:
//2019/08/07 21:15:05 Number of bytes read: 11
//2019/08/07 21:15:05 Data read: give me a h
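
To see each of the error cases listed earlier without touching the disk, here is a sketch that runs io.ReadAtLeast against in-memory strings. The classify helper and its return strings are my own invention for the demo. strings.NewReader simply stands in for the file.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// classify runs io.ReadAtLeast over text with a buffer of bufLen bytes and a
// minimum of minBytes, then reports which outcome occurred.
func classify(text string, bufLen, minBytes int) string {
	buf := make([]byte, bufLen)
	n, err := io.ReadAtLeast(strings.NewReader(text), buf, minBytes)
	switch {
	case err == nil:
		return fmt.Sprintf("ok: %q", buf[:n])
	case errors.Is(err, io.ErrShortBuffer):
		return "short buffer"
	case errors.Is(err, io.ErrUnexpectedEOF):
		return "unexpected EOF"
	case errors.Is(err, io.EOF):
		return "EOF"
	default:
		return err.Error()
	}
}

func main() {
	fmt.Println(classify("give me a home", 11, 8)) // ok: "give me a h"
	fmt.Println(classify("give me a home", 4, 8))  // short buffer
	fmt.Println(classify("give", 11, 8))           // unexpected EOF
	fmt.Println(classify("", 11, 8))               // EOF
}
```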

io.ReadFull

Strictly enforces that the buffer is completely filled up by erroring out if it’s not.

It will return an EOF error value if there is no data to read.

If it has read some data but reached EOF before filling the buffer, the error value will be io.ErrUnexpectedEOF, as shown in the example below.

This is due to our file containing 14 bytes and the buffer requiring 20.

// this file contains the text "give me a home"
  file, err := os.Open("scrap.txt")
  if err != nil {
     log.Fatal(err)
  }
  defer file.Close()

  byteSlice := make([]byte, 20)
  numBytesRead, err := io.ReadFull(file, byteSlice)
  if err != nil {
     log.Fatal(err)
  }
  log.Printf("Number of bytes read: %d\n", numBytesRead)
  log.Printf("Data read: %s\n", byteSlice)

//output: 
//2019/08/07 21:20:24 unexpected EOF
//exit status 1
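
A short read does not have to be fatal. io.ReadFull still returns the number of bytes it managed to read, so one option is to treat io.ErrUnexpectedEOF as "this was the last, partial chunk". A sketch of that idea, with a helper name (readUpTo) I made up and an in-memory reader in place of the file:

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"log"
	"strings"
)

// readUpTo tries to fill a buffer of size bytes from text. A short or empty
// read is returned as usable data instead of an error.
func readUpTo(text string, size int) (string, error) {
	buf := make([]byte, size)
	n, err := io.ReadFull(strings.NewReader(text), buf)
	if errors.Is(err, io.ErrUnexpectedEOF) || errors.Is(err, io.EOF) {
		// Fewer than size bytes were available; keep what we got.
		return string(buf[:n]), nil
	}
	if err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := readUpTo("give me a home", 20)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("read %d bytes: %s\n", len(got), got) // read 14 bytes: give me a home
}
```

Whether a partial read is acceptable depends on your use case. For fixed-size records you probably do want the error.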

bufio pkg


bufio.Scanner

Steps through a file line-by-line, word-by-word, or by a custom delimiter.

You start by wrapping a new scanner around an open file (or any other io.Reader).

f, err := os.Open("scrap.txt")
if err != nil {
	log.Fatal(err)
}
defer f.Close()

s := bufio.NewScanner(f)

Scanning line-by-line is the default behavior.

//s.Scan() will evaluate to false on EOF or other err
for s.Scan() {
	fmt.Println(s.Text())//you can output as string
	fmt.Println(s.Bytes())//or bytes
}

//output:
//this is line 198
//[116 104 105 115 32 105 115 32 108 105 110 101 32 49 57 56]
//this is line 199
//[116 104 105 115 32 105 115 32 108 105 110 101 32 49 57 57]
//this is line 200
//[116 104 105 115 32 105 115 32 108 105 110 101 32 50 48 48]
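
Since Scan returns false both on EOF and on a real error, it is worth checking s.Err() once the loop ends. It returns nil if the scanner simply ran out of input. The sketch below forces an error by giving the scanner a deliberately tiny buffer via s.Buffer, so a long line trips bufio.ErrTooLong. The helper scanLines and the sample text are assumptions for the demo.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// scanLines collects lines from text using a scanner whose maximum token
// size is maxLine bytes, and returns whatever error stopped the scan.
func scanLines(text string, maxLine int) ([]string, error) {
	s := bufio.NewScanner(strings.NewReader(text))
	// Buffer must be called before the first Scan.
	s.Buffer(make([]byte, 0, maxLine), maxLine)

	var lines []string
	for s.Scan() {
		lines = append(lines, s.Text())
	}
	// Err is nil when the loop ended because of EOF.
	return lines, s.Err()
}

func main() {
	text := "short\nthis line is far too long for the buffer\n"

	lines, err := scanLines(text, 16)
	fmt.Println(lines, errors.Is(err, bufio.ErrTooLong))

	lines, err = scanLines(text, 1024)
	fmt.Println(len(lines), err)
}
```

With the default buffer a line has to be longer than bufio.MaxScanTokenSize (64 KiB) before this happens, which is easy to hit with minified or machine-generated files.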

We can scan word-by-word.

s.Split(bufio.ScanWords)

for s.Scan() {
	fmt.Println(s.Text())
}

//output:
//this
//is
//line
//199

We can also use our own custom split func to split on a different delimiter, as long as its function signature matches the bufio.SplitFunc type :

type SplitFunc func(data []byte, atEOF bool) (advance int, token []byte, err error)

Let’s create a custom split func that will use whole numbers as a delimiter.

func myCustomSplitFunc(data []byte, atEOF bool) (advance int, token []byte, err error){
    // Skip leading numbers.
    start := 0
    for width := 0; start < len(data); start += width {
        var r rune
        r, width = utf8.DecodeRune(data[start:])
        if !unicode.IsNumber(r) {
            break
        }
    }
    // Scan up until the next number, which marks the end of the token.
    for width, i := 0, start; i < len(data); i += width {
        var r rune
        r, width = utf8.DecodeRune(data[i:])
        if unicode.IsNumber(r) {
            return i + width, data[start:i], nil
        }
    }
    // If we're at EOF and we have a final, non-empty token then return it.
    if atEOF && len(data) > start {
        return len(data), data[start:], nil
    }
    // Request more data.
    return start, nil, nil
}

We can now pass our custom split func into s.Split since its function signature satisfies type SplitFunc.

//end of text file contains:
//this is line 198
//this is line 199
//this is line 200

f, err := os.Open("scrap.txt")
if err != nil {
	log.Fatal(err)
}
defer f.Close()

s := bufio.NewScanner(f)

//custom delimiter
s.Split(myCustomSplitFunc)

for s.Scan() {
	fmt.Println(s.Text())
}

//output:
//this is line 
//this is line 
//this is line
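
If the split func above feels like a lot, here is a smaller sketch of the same contract using a single-byte delimiter. splitOnComma and tokens are hypothetical names of mine. The three return paths (token found, final token at EOF, ask for more data) are the same ones the numeric version uses.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

// splitOnComma is a hypothetical bufio.SplitFunc that yields comma-separated
// tokens without the comma itself.
func splitOnComma(data []byte, atEOF bool) (advance int, token []byte, err error) {
	// Delimiter found: consume it, return everything before it.
	if i := bytes.IndexByte(data, ','); i >= 0 {
		return i + 1, data[:i], nil
	}
	// No delimiter left, but there is trailing data at EOF: return it.
	if atEOF && len(data) > 0 {
		return len(data), data, nil
	}
	// Request more data.
	return 0, nil, nil
}

// tokens scans text with splitOnComma and collects the results.
func tokens(text string) []string {
	s := bufio.NewScanner(strings.NewReader(text))
	s.Split(splitOnComma)

	var out []string
	for s.Scan() {
		out = append(out, s.Text())
	}
	return out
}

func main() {
	fmt.Println(tokens("give,me,a,home")) // [give me a home]
}
```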

bufio.Reader

Is similar to the scanner in that it wraps around an open file the same way and allows you to read up to a specified delimiter.

//scrap.txt contains the text "give me a home"
f, err := os.Open("scrap.txt")
if err != nil {
    log.Fatal(err)
}
defer f.Close()

r := bufio.NewReader(f)

One way it differs is that you have more control over what you can do (or undo) with the byte position cursor.

bufio.Reader.Peek

Allows you to see what bytes are next without advancing the cursor.

for {
    // Peek returns its own slice, so there is no need to allocate one first
    byteSlice, err := r.Peek(4)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("Peeked at %d bytes: %s\n", len(byteSlice), byteSlice)
}
//output:
//Peeked at 4 bytes: give
//Peeked at 4 bytes: give
//Peeked at 4 bytes: give

The output in the infinite loop above will never change because the starting position for the read operation never changes. This can be useful in a situation where the cursor should only advance based on whether or not a certain value is next.
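
Here is a sketch of that "advance only if a certain value is next" idea: peek at the upcoming bytes, and call Discard to move the cursor past them only when they match. The helper stripPrefix is a name I made up, and strings.NewReader stands in for the file.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"log"
	"strings"
)

// stripPrefix consumes prefix from the front of text only when the upcoming
// bytes match it, then returns everything that is left to read.
func stripPrefix(text, prefix string) (string, error) {
	r := bufio.NewReader(strings.NewReader(text))

	// Peek does not advance the cursor. It returns io.EOF along with a
	// shorter slice when fewer than len(prefix) bytes are available.
	next, err := r.Peek(len(prefix))
	if err != nil && !errors.Is(err, io.EOF) {
		return "", err
	}

	if string(next) == prefix {
		// Only now move the cursor past the bytes we peeked at.
		if _, err := r.Discard(len(prefix)); err != nil {
			return "", err
		}
	}

	rest, err := io.ReadAll(r)
	if err != nil {
		return "", err
	}
	return string(rest), nil
}

func main() {
	for _, text := range []string{"give me a home", "take me a home"} {
		rest, err := stripPrefix(text, "give ")
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(rest)
	}
	// me a home
	// take me a home
}
```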

bufio.Reader.ReadByte and bufio.Reader.UnreadByte

Useful for when you only need to read byte-by-byte or revert the last byte read.

oneByte, err := r.ReadByte()
if err != nil {
	log.Fatal(err)
}
fmt.Printf("one byte : %v\n", oneByte)

if err := r.UnreadByte();err != nil {
	log.Fatal(err)
}

sameByte, err := r.ReadByte()
if err != nil {
	log.Fatal(err)
}
fmt.Printf("same byte : %v\n", sameByte)

//output:
//one byte : 103
//same byte : 103

In the above example the cursor advanced one byte, then UnreadByte moved the cursor back one position so the next read returned that same byte again. Only the most recently read byte can be unread.

If you do in fact need to read byte-by-byte, you should avoid writing byte-by-byte if you can. See bufio.Writer for details.

bufio.Reader.ReadBytes

Similar to scanner.Bytes, except that the returned data includes the delimiter.

I added “!” to the end of the scrap.txt file to properly demonstrate the next example.

// Read up to and including the specified delimiter
dataBytes, err := r.ReadBytes('!')
if err != nil {
    log.Fatal(err)
}
fmt.Printf("data: %s\n", dataBytes)

//output:
//data: give me a home!
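
One detail to watch for: if the reader hits EOF before finding the delimiter, ReadBytes returns the data it read so far together with io.EOF. So in a loop you usually want to keep the data first and check the error second. A sketch, with a helper name (chunks) of my choosing and an in-memory reader:

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"log"
	"strings"
)

// chunks splits text into delimiter-terminated pieces with ReadBytes.
// Each piece keeps its delimiter; a trailing piece without one is kept too.
func chunks(text string, delim byte) ([]string, error) {
	r := bufio.NewReader(strings.NewReader(text))

	var out []string
	for {
		chunk, err := r.ReadBytes(delim)
		// Keep whatever was read, even when err is io.EOF.
		if len(chunk) > 0 {
			out = append(out, string(chunk))
		}
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return out, err
		}
	}
}

func main() {
	got, err := chunks("give me a home!and a bed", '!')
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%q\n", got) // ["give me a home!" "and a bed"]
}
```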

bufio.Writer

Provides a buffer you can write to multiple times before writing to disk.

Also useful if you’re reading byte-by-byte, because writing immediately after every read would be slower and more taxing on the disk.

The writer is wrapped around an open file similar to bufio.Reader and the scanner.

The buffer has a default capacity of 4096 bytes, but if you need more you’ll want to use bufio.NewWriterSize.

f, err := os.OpenFile("scrap.txt", os.O_WRONLY, 0666)
if err != nil {
	log.Fatal(err)
}
defer f.Close()

//default capacity of 4096
w := bufio.NewWriter(f)

//to explicitly set buffer size
//w = bufio.NewWriterSize(f, 8000)

We can write multiple times to the buffer before we write to disk.

We can demonstrate this by checking the amount of data in the buffer as well as the buffer’s remaining capacity in between writes.

bytesWritten, err := w.Write([]byte{65, 66, 67})
if err != nil {
	log.Fatal(err)
}
log.Printf("wrote %d bytes\n", bytesWritten)

inBuffer := w.Buffered()
log.Printf("in buffer : %d\n", inBuffer)

capacityLeft := w.Available()
log.Printf("Available bytes in buffer: %d\n", capacityLeft)

bytesWritten, err = w.WriteString("add me to buffer")
if err != nil {
	log.Fatal(err)
}
log.Printf("wrote %d bytes \n", bytesWritten)


inBuffer = w.Buffered()
log.Printf("in buffer: %d\n", inBuffer)

capacityLeft = w.Available()
log.Printf("Available bytes in buffer: %d\n", capacityLeft)

//2019/08/08 16:25:55 wrote 3 bytes
//2019/08/08 16:25:55 in buffer : 3
//2019/08/08 16:25:55 Available bytes in buffer: 4093
//2019/08/08 16:25:55 wrote 16 bytes 
//2019/08/08 16:25:55 in buffer: 19
//2019/08/08 16:25:55 Available bytes in buffer: 4077

bufio.Writer.Flush and bufio.Writer.Reset

Use Flush to write the contents of the buffer to disk.

To clear the buffer without writing to disk, use Reset, which discards any unflushed data and points the writer at the io.Writer you pass in.

//write to disk
if err := w.Flush(); err != nil {
    log.Fatal(err)
}

//or discard the buffered data and keep writing to the same file
w.Reset(f)
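
To make the difference visible, here is a sketch that writes to a bytes.Buffer instead of a file, so we can inspect exactly what reached the destination. The function name flushAndReset and the strings "kept" and "dropped" are placeholders for the demo.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"log"
)

// flushAndReset writes "kept" and flushes it, then writes "dropped" and
// resets the writer. It returns the destination contents after each step.
func flushAndReset() (afterFlush, afterReset string, err error) {
	var dst bytes.Buffer // stands in for the file
	w := bufio.NewWriter(&dst)

	if _, err := w.WriteString("kept"); err != nil {
		return "", "", err
	}
	if err := w.Flush(); err != nil {
		return "", "", err
	}
	afterFlush = dst.String()

	if _, err := w.WriteString("dropped"); err != nil {
		return "", "", err
	}
	// Reset discards the buffered "dropped" and keeps writing to dst.
	w.Reset(&dst)
	if err := w.Flush(); err != nil { // nothing left to flush
		return "", "", err
	}
	return afterFlush, dst.String(), nil
}

func main() {
	afterFlush, afterReset, err := flushAndReset()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(afterFlush, afterReset) // kept kept
}
```

A deferred Close on the file does not flush the bufio.Writer for you, so remember to call Flush before the file is closed.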

I hope this has helped clear up any ambiguity around when you might use the different I/O functions across the 4 libraries we discussed today.

Much love,

-Faris