CSV Parser for Arduino
CSV stands for "comma-separated values". A CSV file is a plain text file in which commas separate the values on each line.
Typically the first line of a CSV file is a "header" containing the names of the columns, so that any reader knows what each column means.
Example CSV file with a header and 2 columns:
Date,Temperature
2020/06/12,20
2020/06/13,22
2020/06/14,21
Using CSV format is one way of organising data, which makes it easy for programs to read.
It's a class to which you can supply:
The class parses that string; in other words, it extracts the values, stores them and provides you with:
It adheres to the RFC 4180 specification.
It was written with care to not be greedy in terms of occupied memory and parsing time.
In Arduino IDE select Tools->Manage libraries, type "csv" in the top editbox, find "CSV Parser" and press install.
Then just add the following line at the top of your sketch:
Output:
hello - 70000 - 140 - 10 - 3.33 - FF0000
world - 80000 - 150 - 20 - 7.77 - FF
noice - 90000 - 160 - 20 - 9.99 - FFFFFF
Notice how each character within the "sLdcfx-" string specifies a different type for each column. It is very important to set this format right. We could set every column to be a string, like "sssssss", but this would use more memory than is really needed. If we wanted to store a large array of small numerical values (e.g. under 128), then using the "c" specifier would be appropriate. See the "How to specify value types" section for the full list of available specifiers and their descriptions.
Is it necessary to supply the whole string at once?
No, it may be supplied in incomplete parts as shown in this example.
We may as well supply the csv file character by character like:
If the CSV file doesn't contain a header line, this must be specified as the 3rd argument of the constructor (see this example).
If the CSV file is separated by a character other than a comma, that character must be specified as the 4th argument of the constructor (see this example).
Programmer must:
The CSV file may:
**Important - if the file does not end with "\n" (new line) then cp.parseLeftover() method must be called after supplying the whole file (regardless if it was supplied all at once or in parts). Example:**
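To illustrate why the final call matters, here is a minimal sketch in plain C++ of a parser that accepts input in arbitrary chunks. `LineFeeder`, `feed` and `flush` are hypothetical names invented for this illustration; this is not the library's implementation, and it ignores quoted fields for brevity. A row is only completed when a newline arrives, so a last row without "\n" stays buffered until it is flushed explicitly, which is the role `cp.parseLeftover()` plays in the library.

```cpp
#include <string>
#include <vector>

// Hypothetical illustration only; this is NOT the CSV_Parser implementation.
// It shows why a final row without '\n' stays buffered until flushed.
class LineFeeder {
public:
    // Accept any chunk of input: a whole file, a fragment, or one character.
    void feed(const std::string &chunk) {
        for (char ch : chunk) {
            if (ch == '\n') {
                rows_.push_back(pending_);  // a newline completes the row
                pending_.clear();
            } else if (ch != '\r') {        // tolerate "\r\n" line endings
                pending_ += ch;
            }
        }
    }

    // Counterpart of calling parseLeftover(): emit the unterminated last row.
    void flush() {
        if (!pending_.empty()) {
            rows_.push_back(pending_);
            pending_.clear();
        }
    }

    const std::vector<std::string> &rows() const { return rows_; }

private:
    std::string pending_;            // characters of the row still in progress
    std::vector<std::string> rows_;  // completed rows
};
```

Feeding "2020/06/13," followed by "22" with no trailing newline leaves that row pending; only after `flush()` does it appear among the completed rows.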
What if the string itself stored in CSV contains comma (or other custom delimiter)?
As described in the RFC 4180 specification we can enclose the string using double quotes. Example csv:
my_strings,my_ints
"single, string, including, commas",10
"another string, with single comma",20
What if we wanted to store double quotes themselves?
As described in the RFC 4180 specification we can put two double quotes next to each other. The parser will treat them as one. Example:
my_strings,my_ints
"this string will have 1 "" double quote inside it",10
"another string with "" double quote char",10
Parser will read such file as:
1st string = this string will have 1 " double quote inside it
2nd string = another string with " double quote char
Note that the quote character can be customized, as shown in this section, e.g. to use single quotes (') instead.
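The quoting rules above can be sketched in a few lines of plain C++. `splitCsvLine` is a hypothetical helper written for this illustration, not part of the CSV_Parser API, and it handles a single line only. Inside a quoted field the delimiter is ordinary text, and two quote characters in a row produce one literal quote.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of the CSV_Parser API.
// Splits one CSV line into fields following the RFC 4180 quoting rules.
std::vector<std::string> splitCsvLine(const std::string &line,
                                      char delimiter = ',',
                                      char quote = '"') {
    std::vector<std::string> fields;
    std::string current;
    bool inQuotes = false;

    for (std::size_t i = 0; i < line.size(); ++i) {
        char ch = line[i];
        if (inQuotes) {
            if (ch == quote) {
                if (i + 1 < line.size() && line[i + 1] == quote) {
                    current += quote;   // doubled quote -> one literal quote
                    ++i;                // skip the second quote of the pair
                } else {
                    inQuotes = false;   // closing quote
                }
            } else {
                current += ch;          // delimiters are plain text in here
            }
        } else if (ch == quote) {
            inQuotes = true;            // opening quote
        } else if (ch == delimiter) {
            fields.push_back(current);  // field boundary
            current.clear();
        } else {
            current += ch;
        }
    }
    fields.push_back(current);          // last field has no trailing delimiter
    return fields;
}
```

Passing a different `delimiter` or `quote` argument mirrors the idea of customizing those characters, for example `splitCsvLine(line, ';', '\'')` for semicolon-separated values with single quotes.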
Example above is specifying "s" (string) for the 1st column, and "f" (float) for the 2nd column.
Possible specifiers are:
s - string (C-like string, not a "String" Arduino object, just a char pointer, terminated by 0)
f - float
L - int32_t (32-bit signed value, can't be used for values over 2147483647)
d - int16_t (16-bit signed value, can't be used for values over 32767)
c - char (8-bit signed value, can't be used for values over 127)
x - hex (stored as int32_t)
- (dash character) means that value is unused/not-parsed (this way memory won't be allocated for values from that column)
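As a rough picture of what each numeric specifier implies, the sketch below converts text to the storage type listed above using standard C++ library calls. The function names are hypothetical and the library's internal conversion may differ; the point is only to show the target type and number base associated with each specifier.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical helpers (not the library's internals): one plausible
// text-to-value conversion for each numeric specifier.

// "L": decimal text stored as a 32-bit signed integer
int32_t toLong(const char *text) {
    return static_cast<int32_t>(std::strtol(text, nullptr, 10));
}

// "d": decimal text stored as a 16-bit signed integer
int16_t toInt(const char *text) {
    return static_cast<int16_t>(std::strtol(text, nullptr, 10));
}

// "c": decimal text stored as an 8-bit signed integer
int8_t toChar(const char *text) {
    return static_cast<int8_t>(std::strtol(text, nullptr, 10));
}

// "f": text stored as a float
float toFloat(const char *text) {
    return std::strtof(text, nullptr);
}

// "x": hexadecimal text (base 16) stored as a 32-bit signed integer
int32_t toHex(const char *text) {
    return static_cast<int32_t>(std::strtol(text, nullptr, 16));
}
```

With these helpers, the text "FF0000" read as hex yields the number 16711680, while "10" read with the "c" conversion occupies a single byte.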
Unsigned values are specified by preceding the integer-based specifiers ("L", "d", "c", "x") with "u".
Example:
See unsigned_values example for more info.
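The practical difference between signed and unsigned specifiers is the range of values a column can hold. The sketch below checks whether a value fits a given fixed-width type. `fitsIn` is a hypothetical helper, and the mapping of "uc", "ud" and "uL" to `uint8_t`, `uint16_t` and `uint32_t` is an assumption inferred from the signed types listed above.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: does `value` fit into the fixed-width integer type T?
// Intended for the 8-, 16- and 32-bit types discussed above.
template <typename T>
bool fitsIn(long long value) {
    return value >= static_cast<long long>(std::numeric_limits<T>::min()) &&
           value <= static_cast<long long>(std::numeric_limits<T>::max());
}
```

For example, `fitsIn<int8_t>(200)` is false while `fitsIn<uint8_t>(200)` is true, so a column of values up to 255 would need the unsigned variant of "c" rather than a wider type.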
Let's suppose that we parse the following:
To cast/retrieve the values we can use:
"x" (hex input values) should be cast as "int32_t*" (or "uint32_t*"), because that's how they're stored. Casting them to "int*" could result in the wrong address being computed when using ints[index].
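The reason is that indexing through a pointer multiplies the index by the size of the pointed-to type. On boards where "int" is 16 bits wide, an "int*" steps 2 bytes per element while the stored values are 4 bytes apart. The sketch below makes that arithmetic explicit, using `int16_t` to stand in for a 16-bit "int". `elementOffset` and `readAs` are hypothetical helpers written for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Byte offset that ptr[index] would use if ptr pointed to elements of type T.
template <typename T>
std::size_t elementOffset(std::size_t index) {
    return index * sizeof(T);
}

// Read element `index` from `base`, treating the memory as an array of T.
// memcpy is used so the read is well defined for any T.
template <typename T>
T readAs(const void *base, std::size_t index) {
    T value;
    std::memcpy(&value,
                static_cast<const unsigned char *>(base) + elementOffset<T>(index),
                sizeof(T));
    return value;
}
```

For values stored as `int32_t`, element 2 lives at byte offset 8, but a 16-bit element type would look at byte offset 4, which is the middle of the array rather than the intended value.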
To parse CSV files without a header, we can pass the optional 3rd argument to the constructor. Example:
And then we can use the following to get the extracted values:
The delimiter is the 4th parameter of the constructor. It's a comma (,) by default. We can customize it like this:
The quote character is the 5th parameter of the constructor. It's a double quote (") by default. We can customize it like this:
Use the CSV_Parser.print function and check the serial monitor. Example:
It will display the parsed header fields, their types and all the parsed values, like this:
CSV_Parser content:
Header:
my_strings | my_longs | my_ints | my_chars | my_floats | my_hex | -
Types:
char* | int32_t | int16_t | char | float | hex (long) | -
Values:
hello | 70000 | 140 | 10 | 3.33 | FF0000 | -
world | 80000 | 150 | 20 | 7.77 | FF | -
noice | 90000 | 160 | 30 | 9.99 | FFFFFF | -
Important - the cp.print() method uses the "Serial" object, so it assumes that "Serial.begin(baud_rate);" was called beforehand.
I wanted to parse COVID-19 CSV data and couldn't find any CSV parser for Arduino. So instead of rushing out a quick and dirty solution, I decided to write something that could be reused in the future (possibly by other people too).
https://michalmonday.github.io/CSV-Parser-for-Arduino/index.html