1 Parsing CSV files
Richard Townsend edited this page 2014-08-09 07:12:32 -07:00

Golearn's base package includes a utility function, ParseCSVToInstances, for loading CSV files.

Code excerpt: reading a CSV file.


package main

import (
	"fmt"
	"github.com/sjwhitworth/golearn/base"
)

func main() {

	// Read the CSV file
	XORData, err := base.ParseCSVToInstances("xor.csv", false)
	// Error check
	if err != nil {
		panic(fmt.Sprintf("Couldn't load CSV file (error %s)", err))
	}
	// Print human-readable version
	fmt.Println(XORData)

}

Things to be aware of

Fixed number of lines: Golearn counts the lines in the file before parsing so that it can allocate memory up front.

The first line can optionally be used for Attribute names. If the second argument to ParseCSVToInstances is true, Golearn reads the first line and uses it to name the Attributes.

The first line read determines the Attribute count. If the number of fields increases at any point in the CSV file, the extra fields are ignored. If the number of fields decreases, this may cause a runtime error.

Missing values are not supported.

Attribute types are inferred from the first data line read. Golearn reads the first data line from the file and generates a FloatAttribute for each field whose value resembles a number, and a CategoricalAttribute otherwise.
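The inference rule can be sketched as follows. This is an illustration only: inferKinds is a hypothetical function, and it uses strconv.ParseFloat as a stand-in for "resembles a number", which may not match the exact test Golearn applies.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// inferKinds is a hypothetical illustration (not Golearn's code). It labels
// each field of the first data line "float" if the value parses as a number
// and "categorical" otherwise.
func inferKinds(firstDataLine []string) []string {
	kinds := make([]string, len(firstDataLine))
	for i, field := range firstDataLine {
		// Assumption: strconv.ParseFloat approximates "resembles a number".
		if _, err := strconv.ParseFloat(strings.TrimSpace(field), 64); err == nil {
			kinds[i] = "float"
		} else {
			kinds[i] = "categorical"
		}
	}
	return kinds
}

func main() {
	// A made-up first data line with two numeric fields and one label.
	fmt.Println(inferKinds([]string{"5.1", "3.5", "Iris-setosa"}))
	// Prints: [float float categorical]
}
```

Under a rule like this, only the first data line matters: a column of numeric class labels such as 0 and 1 would be treated as a float column, so it is worth checking that the first data line is representative of the rest of the file.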