Golearn's base package includes a utility function (ParseCSVToInstances) which is used for loading CSV files.
Code excerpt: reading a CSV file.
package main

import (
	"fmt"

	"github.com/sjwhitworth/golearn/base"
)

func main() {
	// Read the CSV file (false: the first line is data, not Attribute names)
	XORData, err := base.ParseCSVToInstances("xor.csv", false)
	// Error check
	if err != nil {
		panic(fmt.Sprintf("Couldn't load CSV file (error %s)", err))
	}
	// Print human-readable version
	fmt.Println(XORData)
}
Things to be aware of
Fixed number of lines: Golearn counts the lines in the file before parsing so that it can allocate memory up front.
The first line can optionally be used for Attribute names. If the second argument of ParseCSVToInstances is true, Golearn reads the first line and uses it to name the Attributes.
The first line read determines the Attribute count. If the number of fields increases at any point in the CSV file, the extra fields are ignored. If the number of fields decreases, this may cause a runtime error.
Missing values are not supported.
Attribute types are inferred from the first data line which is read. Golearn reads the first data line from the file and generates FloatAttributes for each field if the value resembles a number, and otherwise generates a CategoricalAttribute.