Last Modified: March 07, 2015, at 03:00 PM
By: robtillaart
Platforms: All
Latest version on Github
One of the main applications for the Arduino board is reading and logging sensor data. For instance, one might monitor the temperature and air pressure every minute of the day. As that produces a lot of records, we often want the average and standard deviation to get a picture of how the temperature varied over that day.
Background reading - tutorial statistic formulas
The Statistic library calculates the average and standard deviation (stdev) of a set of floating-point data. It also keeps track of the minimum and maximum values entered. The interface consists of a constructor and the following member functions (version 0.3.3 on Github):
Statistic();              // constructor
void clear();             // reset all counters
void add(double);         // add a new value
unsigned long count();    // # values added
double sum();             // total
double minimum();         // minimum
double maximum();         // maximum
double average();         // average
double pop_stdev();       // population std deviation
double unbiased_stdev();  // unbiased std deviation
Internally the library does not record the individual values, only the count, the sum, the sum of squared differences from the running average, the minimum and the maximum. These five are enough to calculate the average and stdev. The nice part is that memory use stays the same whether one adds 10, 100 or 1000 values.
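To make the bookkeeping concrete, here is a minimal sketch in plain desktop C++. It is not the library source: the names RunningStats, lo, hi, popStdev, unbiasedStdev and demoStats are made up for illustration. It keeps only five values and uses the same update step as Statistic::add() in the 0.3.3 listing further down.

```cpp
#include <cmath>

// Hypothetical stand-in for the library class; illustration only.
struct RunningStats {
    unsigned long n = 0;   // number of values seen
    double sum = 0.0;      // running total
    double ssqdif = 0.0;   // running sum of squared differences from the mean
    double lo = 0.0;       // smallest value seen
    double hi = 0.0;       // largest value seen

    void add(double x) {
        if (n == 0) { lo = x; hi = x; }
        else {
            if (x < lo) lo = x;
            if (x > hi) hi = x;
        }
        sum += x;
        n++;
        if (n > 1) {
            // Same update step as Statistic::add() in version 0.3.3.
            double d = sum / n - x;
            ssqdif += n * d * d / (n - 1);
        }
    }

    // All three return NAN when there is not enough data.
    double average() const { return n ? sum / n : NAN; }
    double popStdev() const { return n ? std::sqrt(ssqdif / n) : NAN; }
    double unbiasedStdev() const { return n > 1 ? std::sqrt(ssqdif / (n - 1)) : NAN; }
};

// Feed a small fixed data set and return the filled accumulator.
RunningStats demoStats() {
    const double data[] = {2, 4, 4, 4, 5, 5, 7, 9};
    RunningStats s;
    for (double x : data) s.add(x);
    return s;
}
```

For the eight values in demoStats() the average is 5 and the population standard deviation is 2. The unbiased estimate is slightly larger (about 2.14) because it divides by n-1 instead of n.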
A small sketch shows how it can be used. A random generator is used to mimic a sensor.
#include "Statistic.h"  // without trailing s

Statistic myStats;

void setup(void)
{
  Serial.begin(9600);
  Serial.print("Demo Statistic lib ");
  Serial.println(STATISTIC_LIB_VERSION);
  myStats.clear(); //explicitly start clean
}

void loop(void)
{
  long rn = random(0, 100);
  myStats.add(rn/100.0);
  Serial.print(" Count: ");
  Serial.print(myStats.count());
  Serial.print(" Average: ");
  Serial.print(myStats.average(), 4);
  Serial.print(" Std deviation: ");
  Serial.print(myStats.pop_stdev(), 4);
  Serial.println();
  if (myStats.count() == 300)
  {
    myStats.clear();
    delay(1000);
  }
}
In setup(), myStats is cleared so we can start adding new data.
In loop(), a random number is first generated and converted to a floating-point value, which is added to myStats. Then the count, the average and the standard deviation so far are printed to the serial port. One could also display them on an LCD or send them over Ethernet, etc. When 300 items have been added, myStats is cleared to start over again.
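The reset-after-a-fixed-count pattern used in loop() can also be packaged on its own. The following is a minimal sketch in plain C++, not part of the Statistic library; BatchAverager and its members are hypothetical names used only for illustration.

```cpp
// Hypothetical helper: averages fixed-size batches of readings.
struct BatchAverager {
    unsigned long batchSize;   // readings per batch
    unsigned long n = 0;       // readings collected in the current batch
    double sum = 0.0;          // running total of the current batch
    double lastAverage = 0.0;  // mean of the most recently completed batch

    explicit BatchAverager(unsigned long size) : batchSize(size) {}

    // Returns true when a batch has just been completed;
    // lastAverage then holds the mean of that batch.
    bool add(double x) {
        sum += x;
        n++;
        if (n < batchSize) return false;
        lastAverage = sum / n;
        n = 0;       // start over, like myStats.clear()
        sum = 0.0;
        return true;
    }
};
```

Calling add() from loop() and printing lastAverage only when it returns true would give one line of output per batch instead of one per sample.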
In the first version I collected all the samples in an array, but that used quite a lot of memory, and the user had to know the number of samples beforehand to allocate enough room. As I found this not quite acceptable, I stripped the data array from the class to make it more elementary.
To use the library, make a folder named Statistic in your SKETCHBOOKPATH\libraries and put the .h and .cpp files there.
Enjoy tinkering,
rob.tillaart@removethisgmail.com
#ifndef Statistic_h
#define Statistic_h
//
//    FILE: Statistic.h
//  AUTHOR: Rob dot Tillaart at gmail dot com
//          modified at 0.3 by Gil Ross at physics dot org
// VERSION: 0.3.3
// PURPOSE: Recursive Statistical library for Arduino
// HISTORY: See Statistic.cpp
//
// Released to the public domain
//

// the standard deviation increases the lib (<100 bytes)
// it can be in/excluded by un/commenting next line
#define STAT_USE_STDEV

#include <math.h>

#define STATISTIC_LIB_VERSION "0.3.3"

class Statistic
{
public:
    Statistic();
    void clear();
    void add(double);

    // returns the number of values added
    unsigned long count() { return _cnt; };  // zero if empty
    double sum()          { return _sum; };  // zero if empty
    double minimum()      { return _min; };  // zero if empty
    double maximum()      { return _max; };  // zero if empty
    double average();

#ifdef STAT_USE_STDEV
    double variance();
    double pop_stdev();       // population stdev
    double unbiased_stdev();
#endif

protected:
    unsigned long _cnt;
    double _store;            // store to minimise computation
    double _sum;
    double _min;
    double _max;
#ifdef STAT_USE_STDEV
    double _ssqdif;           // sum of squares difference
#endif
};

#endif
// END OF FILE
//
//    FILE: Statistic.cpp
//  AUTHOR: Rob dot Tillaart at gmail dot com
//          modified at 0.3 by Gil Ross at physics dot org
// VERSION: 0.3.3
// PURPOSE: Recursive statistical library for Arduino
//
// NOTE: 2011-01-07 Gil Ross
// Rob Tillaart's Statistic library uses one-pass of the data (allowing
// each value to be discarded), but expands the Sum of Squares Differences to
// difference the Sum of Squares and the Average Squared. This is susceptible
// to bit length precision errors with the float type (only 5 or 6 digits
// absolute precision) so for long runs and high ratios of
// the average value to standard deviation the estimate of the
// standard error (deviation) becomes the difference of two large
// numbers and will tend to zero.
//
// For small numbers of iterations and small Average/SE the original code is
// likely to work fine.
// It should also be recognised that for very large samples, questions
// of stability of the sample assume greater importance than the
// correctness of the asymptotic estimators.
//
// This recursive algorithm, which takes slightly more computation per
// iteration is numerically stable.
// It updates the number, mean, max, min and SumOfSquaresDiff each step to
// deliver max min average, population standard error (standard deviation) and
// unbiased SE.
// -------------
//
// HISTORY:
// 0.1 - 2010-10-29 initial version
// 0.2 - 2010-10-29 stripped to minimal functionality
// 0.2.01 - 2010-10-30
//          added minimum, maximum, unbiased stdev,
//          changed counter to long -> int overflows @32K samples
// 0.3 - 2011-01-07
//          branched from 0.2.01 version of Rob Tillaart's code
// 0.3.1 - minor edits
// 0.3.2 - 2012-11-10
//          minor edits
//          changed count -> unsigned long allows for 2^32 samples
//          added variance()
// 0.3.3 - 2015-03-07
//          float -> double to support ARM (compiles)
//          moved count() sum() min() max() to .h; for optimizing compiler
//
// Released to the public domain
//

#include "Statistic.h"

Statistic::Statistic()
{
    clear();
}

// resets all counters
void Statistic::clear()
{
    _cnt = 0;
    _sum = 0.0;
    _min = 0.0;
    _max = 0.0;
#ifdef STAT_USE_STDEV
    _ssqdif = 0.0;  // not _ssq but sum of square differences
                    // which is SUM(from i = 1 to N) of
                    // (f(i)-_ave_N)**2
#endif
}

// adds a new value to the data-set
void Statistic::add(double value)
{
    if (_cnt == 0)
    {
        _min = value;
        _max = value;
    } else {
        if (value < _min) _min = value;
        else if (value > _max) _max = value;
    }
    _sum += value;
    _cnt++;

#ifdef STAT_USE_STDEV
    if (_cnt > 1)
    {
        _store = (_sum / _cnt - value);
        _ssqdif = _ssqdif + _cnt * _store * _store / (_cnt-1);
    }
#endif
}

// returns the average of the data-set added so far
double Statistic::average()
{
    if (_cnt == 0) return NAN; // original code returned 0
    return _sum / _cnt;
}

// Population standard deviation = s = sqrt [ SUM ( Xi - µ )^2 / N ]
// https://www.suite101.com/content/how-is-standard-deviation-used-a99084
#ifdef STAT_USE_STDEV

double Statistic::variance()
{
    if (_cnt == 0) return NAN; // otherwise DIV0 error
    return _ssqdif / _cnt;
}

double Statistic::pop_stdev()
{
    if (_cnt == 0) return NAN; // otherwise DIV0 error
    return sqrt( _ssqdif / _cnt);
}

double Statistic::unbiased_stdev()
{
    if (_cnt < 2) return NAN; // otherwise DIV0 error
    return sqrt( _ssqdif / (_cnt - 1));
}

#endif
// END OF FILE