
Corrupt Array Variables And Memory


This page's comments/discussion on Arduino Forum: Arduino Forum - Wiki discuss: Corrupt Array Variables And Memory



Introduction

This is a little writeup, with an example (carried out in the Ubuntu Linux 10.04 environment), of the issues discussed in these forum topics:

In essence, these topics illustrate that the Arduino's AVR microcontroller has a modified Harvard architecture: "with physically separate storage and signal pathways for instructions and data". This means that memory allocation for variables on this architecture differs slightly from the usual process on ordinary PCs (which most users may be used to). See also Data Storage & Retrieval - Measuring Stuff?



Initial example

Let's illustrate that with a simple example. This example was developed on an Arduino Duemilanove with an Atmel ATmega328 microcontroller, so let us first include its memory specifications:

Flash Memory: 	32 KB (ATmega328) of which 2 KB used by bootloader
SRAM: 		2 KB (ATmega328)
EEPROM:		1 KB (ATmega328)

Now, let us consider the following Arduino .pde program as our starting point:

 
// minitest.pde

unsigned long mydata_count;

static const long SERSPEED=115200;
static const int SIZE=100;
static const char RSTSTR[] = "RESET!!";

void setup() 
{
  Serial.begin(SERSPEED); 
  mydata_count = 0; 
}

void loop() 
{ 
  Serial.println(mydata_count, DEC);
  mydata_count++;   
  if (mydata_count == SIZE) {
    doReset();
  }
}

void doReset()
{
  Serial.println(RSTSTR); 
  mydata_count = 0; 
}
 

The example simply increments a counter, prints its value through the serial port, and, once the counter reaches a certain value (here 100), resets it - and the process starts all over again. This first example builds and runs just fine on an Arduino, and the output of this program is:

 
0
1
2
3
4
...
98
99
RESET!!
0
1
2
3
...

In the example, all the unchanging values have been declared as static const variables. Note that the counter is of type unsigned long; this is to illustrate what happens to variables larger than a byte, as an unsigned long is 32 bits (4 bytes) in size.

After building (by clicking 'Verify' in the Arduino IDE), the Arduino IDE informs us about the binary produced:

Binary sketch size: 2342 bytes (of a 30720 byte maximum)

While this piece of info would suggest that everything should be fine, as we will see later, the binary sketch size does not necessarily tell us whether there is enough memory for our variables! For variables, we need information about the utilization of static RAM (SRAM); however, this is not included in the default report of the Arduino IDE (see the posts above, though, for a patch).

At this point, let us note that:



Obtaining memory info from a binary

After we have built the example, we have several tools in Linux at our disposal to further check the output binary - in terms of the memory specifications listed above:

 
$ avr-size /tmp/build6964438790091573694.tmp/minitest.cpp.elf 
   text	   data	    bss	    dec	    hex	filename
   2324	     18	    164	   2506	    9ca	/tmp/build6964438790091573694.tmp/minitest.cpp.elf

$ avr-objdump -h /tmp/build6964438790091573694.tmp/minitest.cpp.elf 

/tmp/build6964438790091573694.tmp/minitest.cpp.elf:     file format elf32-avr

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .data         00000012  00800100  00000914  000009a8  2**0
                  CONTENTS, ALLOC, LOAD, DATA
  1 .text         00000914  00000000  00000000  00000094  2**1
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .bss          000000a4  00800112  00800112  000009ba  2**0
                  ALLOC
  3 .stab         000046bc  00000000  00000000  000009bc  2**2
                  CONTENTS, READONLY, DEBUGGING
  4 .stabstr      0000310e  00000000  00000000  00005078  2**0
                  CONTENTS, READONLY, DEBUGGING

$ nm /tmp/build6964438790091573694.tmp/minitest.cpp.elf | sort 
00000000 a __tmp_reg__
00000000 a __tmp_reg__
00000000 a __tmp_reg__
00000000 a __tmp_reg__
00000000 a __tmp_reg__
00000000 a __tmp_reg__
00000000 a __tmp_reg__
00000000 T __vectors
00000000 W __heap_end
00000000 W __vector_default
00000001 a __zero_reg__
00000001 a __zero_reg__
00000001 a __zero_reg__
00000001 a __zero_reg__
00000001 a __zero_reg__
00000001 a __zero_reg__
00000001 a __zero_reg__
00000034 a __CCP__
00000034 a __CCP__
00000034 a __CCP__
00000034 a __CCP__
00000034 a __CCP__
00000034 a __CCP__
00000034 a __CCP__
0000003d a __SP_L__
0000003d a __SP_L__
0000003d a __SP_L__
0000003d a __SP_L__
0000003d a __SP_L__
0000003d a __SP_L__
0000003d a __SP_L__
0000003e a __SP_H__
0000003e a __SP_H__
0000003e a __SP_H__
0000003e a __SP_H__
0000003e a __SP_H__
0000003e a __SP_H__
0000003e a __SP_H__
0000003f a __SREG__
0000003f a __SREG__
0000003f a __SREG__
0000003f a __SREG__
0000003f a __SREG__
0000003f a __SREG__
0000003f a __SREG__
00000068 T __ctors_start
00000068 T __trampolines_end
00000068 T __trampolines_start
0000006a T __ctors_end
0000006a T __dtors_end
0000006a T __dtors_start
0000006a W __init
00000076 T __do_copy_data
00000082 t .do_copy_data_loop
00000086 t .do_copy_data_start
0000008c T __do_clear_bss
00000094 t .do_clear_bss_loop
00000096 t .do_clear_bss_start
0000009c T __do_global_ctors
000000a4 t .do_global_ctors_loop
000000ac t .do_global_ctors_start
000000ba T __bad_interrupt
000000ba W __vector_1
000000ba W __vector_10
000000ba W __vector_11
000000ba W __vector_12
000000ba W __vector_13
000000ba W __vector_14
000000ba W __vector_15
000000ba W __vector_17
000000ba W __vector_19
000000ba W __vector_2
000000ba W __vector_20
000000ba W __vector_21
000000ba W __vector_22
000000ba W __vector_23
000000ba W __vector_24
000000ba W __vector_25
000000ba W __vector_3
000000ba W __vector_4
000000ba W __vector_5
000000ba W __vector_6
000000ba W __vector_7
000000ba W __vector_8
000000ba W __vector_9
000000be T _Z7doResetv
000000dc T loop
0000012e T setup
00000150 T __vector_18
000001ca T _ZN14HardwareSerial5beginEl
000003bc T _ZN14HardwareSerial5writeEh
000003e2 t _GLOBAL__I_rx_buffer
00000456 T _ZN5Print5writeEPKc
00000486 T _ZN5Print5writeEPKhj
000004c4 T _ZN5Print5printEPKc
000004d4 T _ZN5Print11printNumberEmh
00000608 T _ZN5Print5printEmi
00000626 T _ZN5Print5printEli
00000692 T _ZN5Print7printlnEv
000006c0 T _ZN5Print7printlnEmi
000006d6 T _ZN5Print7printlnEPKc
000006ec T main
000006fa T __vector_16
0000078a T init
000007fe T __mulsi3
0000083c T _div
0000083c T __divmodhi4
00000850 t __divmodhi4_neg2
00000856 t __divmodhi4_exit
00000858 t __divmodhi4_neg1
00000862 T __udivmodsi4
0000086e t __udivmodsi4_loop
00000888 t __udivmodsi4_ep
000008a6 T __divmodsi4
000008ba t __divmodsi4_neg2
000008c8 t __divmodsi4_exit
000008ca t __divmodsi4_neg1
000008dc T __udivmodhi4
000008e4 t __udivmodhi4_loop
000008f2 t __udivmodhi4_ep
000008ff W __stack
00000904 T __tablejump2__
00000908 T __tablejump__
00000910 T _exit
00000910 W exit
00000912 t __stop_program
00000914 A __data_load_start
00000914 T _etext
00000926 A __data_load_end
00800100 D __data_start
00800100 d _ZL6RSTSTR
00800108 V _ZTV14HardwareSerial
00800112 B __bss_start
00800112 B mydata_count
00800112 D __data_end
00800112 D _edata
00800116 B rx_buffer
0080019a B Serial
008001ad B timer0_overflow_count
008001b1 B timer0_millis
008001b5 b timer0_fract
008001b6 B __bss_end
008001b6 N _end
008001b6 N __heap_start
00810000 N __eeprom_end
         U __cxa_pure_virtual

In essence, avr-size gives the same information in decimal, as the first column of avr-objdump -h output in hexadecimal - the sizes of .text, .data and .bss sections; nm gives the memory layout, so we can confirm that these sizes are correct.
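As a quick cross-check of the three listings above, here is a minimal sketch in plain C++ (meant for a PC, not for the Arduino). The constant and function names are made up for this illustration; the numbers are copied from the avr-objdump and nm output above.

```cpp
#include <cstdint>

// Section sizes as printed in hexadecimal by `avr-objdump -h` above.
constexpr std::uint32_t kTextSizeHex = 0x914;
constexpr std::uint32_t kDataSizeHex = 0x012;
constexpr std::uint32_t kBssSizeHex  = 0x0a4;

// Linker symbols as printed by `nm` above
// (avr-gcc places SRAM addresses at an offset of 0x800000).
constexpr std::uint32_t kDataStart = 0x00800100;
constexpr std::uint32_t kDataEnd   = 0x00800112;
constexpr std::uint32_t kBssStart  = 0x00800112;
constexpr std::uint32_t kBssEnd    = 0x008001b6;

// Size of a section delimited by its start and end symbols.
constexpr std::uint32_t sectionSize(std::uint32_t start, std::uint32_t end) {
  return end - start;
}

// The hexadecimal sizes agree with the decimal columns of avr-size.
static_assert(kTextSizeHex == 2324, ".text");
static_assert(kDataSizeHex == 18,   ".data");
static_assert(kBssSizeHex  == 164,  ".bss");
```

Calling sectionSize(kDataStart, kDataEnd) and sectionSize(kBssStart, kBssEnd) gives 18 and 164 bytes, the same values that avr-size prints in its data and bss columns.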

However, for the purpose of this discussion, we can just use avr-size to obtain the sizes of .text, .data and .bss sections. For more on these sections, see avr-libc: Memory Sections.

One more note - SERSPEED in the code above, needs to be defined as long, since that is what Serial.begin expects; however, even if you define it as int (in which case the serial port will start wrong - and there will be no serial output), there will be no change in the output of the avr-size, avr-objdump and nm commands!
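To see why an int cannot hold this baud rate, recall that int is 16 bits wide on the AVR. The following PC-side sketch uses int16_t as a stand-in for the AVR's int; the helper name asAvrInt is hypothetical, and the wrap-around shown is what GCC does (the conversion is implementation-defined before C++20).

```cpp
#include <cstdint>

// On the AVR, `int` has 16 bits; int16_t plays that role on a PC.
// Converting an out-of-range value wraps modulo 2^16 with GCC.
constexpr std::int16_t asAvrInt(long value) {
  return static_cast<std::int16_t>(value);
}
```

With this helper, asAvrInt(115200L) yields -15872 rather than 115200, while a rate such as 9600 survives unchanged. Since the constant has the same size in the binary either way, it is plausible that the section sizes do not move.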



Array example and comparison

Now, let us complicate our example a bit, and introduce an array. So, the example becomes:

 
// minitest.pde

unsigned long mydata_count;

static const long SERSPEED=115200;
static const int SIZE=100;
static const char RSTSTR[] = "RESET!!";

unsigned long mydata[SIZE];

void setup() 
{
  Serial.begin(SERSPEED); 
  mydata_count = 0; 
}

void loop() 
{ 
  mydata[mydata_count] = mydata_count; 
  Serial.println(mydata[mydata_count], DEC);
  mydata_count++;   
  if (mydata_count == SIZE) {
    doReset();
  }
}

void doReset()
{
  Serial.println(RSTSTR); 
  mydata_count = 0; 
}
 

This example still builds and runs well, and produces the same output as the initial example; the Arduino IDE now reports:

Binary sketch size: 2364 bytes (of a 30720 byte maximum)

while avr-size reports:

 
$ avr-size /tmp/build6964438790091573694.tmp/minitest.cpp.elf 
   text	   data	    bss	    dec	    hex	filename
   2346	     18	    564	   2928	    b70	/tmp/build6964438790091573694.tmp/minitest.cpp.elf

Now, let's keep in mind these quotes from Arduino Forum - SRAM size compile time check/report.:

'''text''' = code (flash) memory
'''data''' = initialised data (in RAM - includes strings etc)
'''bss''' = non-initialised data (actually normally set to 0x00) also in RAM.

So total RAM usage = (data + bss), but this does '''not''' include dynamic 
memory allocated from the heap at run time...
...
We also must store all initialised data in flash so:

FLASH = text+data
RAM = data + bss 

So, keeping in mind that SRAM=data+bss, let us make a small comparison between the initial and the array example:

Example       IDE    .text  .data  .bss   SRAM
Initial ex.   2342   2324   18     164    182
Array ex.     2364   2346   18     564    582

We sized our array mydata to be 100 elements of unsigned long big; and since unsigned long is 4 bytes in size, the mydata array should take up 400 bytes - and as we can see, that is exactly the difference of .bss sizes between the array example and initial example binaries.

  • In other words: since we declared the mydata array, but we didn't initialize it - memory for it was allocated in the .bss section (the section for non-initialized data).
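The two formulas quoted above (FLASH = text + data, RAM = data + bss) can be applied to the avr-size figures directly. This is a minimal PC-side sketch; the struct and function names are invented for the illustration, and the numbers are the ones reported above.

```cpp
// Section sizes in bytes, as reported by avr-size.
struct SectionSizes {
  unsigned text;
  unsigned data;
  unsigned bss;
};

// Initialised data is stored in flash and copied to SRAM at startup,
// so .data counts towards both totals.
constexpr unsigned flashBytes(SectionSizes s) { return s.text + s.data; }
constexpr unsigned sramBytes(SectionSizes s)  { return s.data + s.bss; }

constexpr SectionSizes kInitialExample{2324, 18, 164};
constexpr SectionSizes kArrayExample{2346, 18, 564};
```

Here flashBytes() reproduces the "Binary sketch size" printed by the IDE (2342 and 2364 bytes), sramBytes() gives 182 and 582 bytes, and the .bss difference between the two examples is 400 bytes, that is, 100 elements of 4 bytes each.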

At this point, let's try to play around a bit with the declaration of the array, and see what kind of effect it will have on the memory sections:

Example                                        IDE    .text  .data  .bss   SRAM
unsigned long mydata[SIZE];                    2364   2346   18     564    582
static unsigned long mydata[SIZE];             2364   2346   18     564    582
static unsigned long mydata[SIZE] = { 0L };    2364   2346   18     564    582
static unsigned long mydata[SIZE] = { 2L };    2764   2346   418    164    582

We can see that only when we initialize the array to a value other than 0 does the allocation of the mydata array move into the .data section; however, this causes no change in total SRAM allocation.
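For reference, the two kinds of declaration from the table look as follows side by side. The array names are changed here (hypothetically) so that both fit in one file; the section placement in the comments is the one observed in the table above, not something the C++ language itself guarantees.

```cpp
static const int SIZE = 100;

// No initializer: the array is zero-filled at startup.
// In the table above, this form ends up in .bss.
static unsigned long mydata_zeroed[SIZE];

// Only the first element is given; the remaining 99 are implicitly 0.
// Because one element is nonzero, the whole array moved to .data
// in the table above.
static unsigned long mydata_initialised[SIZE] = { 2L };
```

Note that { 2L } sets only element 0 to 2, yet all 400 bytes move to .data. Since initialised data is also stored in flash, this would explain why the IDE figure in the last table row rises by 400 bytes (from 2364 to 2764) while .text stays the same.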



Max size of array?

At this point, we come to the key question of this post - given the same code example, what is the maximum value of SIZE we can use for the mydata array - for an ATmega328?

As noted above, for an ATmega328, we have 2 KB - or 2048 bytes - of SRAM at disposal; and for 100 unsigned long elements, we use in total 582 bytes of SRAM.

  • So there remain 2048-582=1466 bytes at disposal;
  • 1466 bytes correspond to 1466/4=366.5, or 366 unsigned long elements.
  • Since 100 unsigned long elements were already included in those 582 bytes, that means that the total size of the array will be 100+366=466 unsigned long elements.
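The three steps above amount to one integer division. A minimal sketch, with hypothetical names, assuming 2048 bytes of SRAM and 4-byte elements as in this example:

```cpp
constexpr int kSramBytes    = 2048;  // ATmega328
constexpr int kElementBytes = 4;     // unsigned long on the AVR

// Naive upper bound: spend every byte not used by .data + .bss
// on additional array elements (integer division rounds down).
constexpr int naiveMaxElements(int usedSram, int currentElements) {
  return currentElements + (kSramBytes - usedSram) / kElementBytes;
}
```

naiveMaxElements(582, 100) evaluates to 466. The word "naive" is deliberate: the calculation leaves nothing for anything else that needs SRAM at runtime.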

The thing is, in spite of this fine calculation, using 466 elements as the size of the array will NOT work on an Arduino with an ATmega328: either the serial output will be garbled, or there will be no output at all (in other words, the Arduino will 'crash')!

  • Furthermore, the same problem will appear, regardless of whether we declare mydata as 'static' - or whether we initialize it!

That means that an estimate of SRAM utilization based only on .data and .bss is, in essence, unreliable.

These symptoms are essentially the same as described in Arduino - Memory: "If you run out of SRAM, your program may fail in unexpected ways; it will appear to upload successfully, but not run, or run strangely."

  • The same page also recommends some strategies for solving SRAM problems, such as using the PROGMEM keyword; however, none of them are applicable in this example, since here we do want to write to (change) the array at runtime.

Some other pointers can be found on: Issue 40 - arduino - SRAM Memory size check:

Please have the IDE check the linker output, and report an error if the
SRAM usage exceeds the size determined by the CPU selected in the
preferences menu. 
..
Consider that the RAM usage changes as the program runs, so even if you fit under the 
limit when the program starts, you can run out of RAM in the middle of execution
...
This would essentially require an arduino simulator to be run for some unspecified 
time. 

and Arduino Forum - Serial data being corrupted:

Why doesn't avr-gcc warn when the RAM is filling up?

The most the compiler could do is warn how much static memory (heap) is being used.

A significant part of RAM utilization is determined at runtime, and it is frequently 
dependent on the actual input data, so it can change between program executions.  
The compiler has no way of knowing what data you will feed the program, and only the 
vaguest guess at how functions will be called.  Automatic variables and saved 
registers consume RAM on the stack when functions are called.  Functions don't get 
called at compile time, they get called at runtime. 

We might also remember the statement above, that SRAM=data+bss "...does not include dynamic memory allocated from the heap at run time" - however, none of that should be applicable here, since we don't use malloc to allocate memory (and hence, all allocations are known a-priori).

So, the only thing we are left with is to take the above value of 466 as an upper bound, SIZEmax, and then keep decreasing it until we find a value that makes the code work properly. So, starting with definitions like these:

 
static const int SIZE=466;
...
static unsigned long mydata[SIZE];

Here is a log of outcomes of changing SIZE and rebuilding:

SIZE  IDE    .text  .data  .bss   SRAM   status        simavr              emulino
466   2364   2346   18     2028   2046   no out        avr_sadly_crashed   segfault
450   2364   2346   18     1964   1982   corrupt out   runs ok             segfault
440   2364   2346   18     1924   1942   runs ok       runs ok             runs ok
445   2364   2346   18     1944   1962   runs ok       runs ok             runs ok
449   2364   2346   18     1960   1978   corrupt out   runs ok             crashes
448   2364   2346   18     1956   1974   no out        runs ok             segfault
447   2364   2346   18     1952   1970   corrupt out   runs ok             freeze
446   2364   2346   18     1948   1966   runs ok       runs ok             segfault
445   2364   2346   18     1944   1962   runs ok       runs ok             runs ok

In the table above, "status" refers to the result obtained by building the .pde with the given SIZE, uploading it to the Arduino, and observing the output via

screen -L /dev/ttyUSB0 115200

The "-L" argument allows capture to a logfile, which can reveal whether the output is corrupt (when it is corrupt, usually the number sequence will change at random, and the code may never enter the doReset function).
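Checking such a logfile by eye gets tedious, so the test can be automated on the PC. Below is a minimal sketch of a checker; the function name logLooksSane is hypothetical, and it assumes the log has already been split into lines with the line endings stripped. It accepts a log that starts anywhere in the sequence.

```cpp
#include <string>
#include <vector>

// Returns true if `lines` follows the pattern the sketch should print:
// consecutive integers below `size`, then "RESET!!", then 0 again.
bool logLooksSane(const std::vector<std::string>& lines, unsigned long size) {
  bool synced = false;         // true once the first line has been read
  bool expectReset = false;    // true right after size-1 was printed
  unsigned long expected = 0;  // next counter value, valid when synced

  for (const std::string& line : lines) {
    if (line == "RESET!!") {
      if (synced && !expectReset) return false;  // reset came too early
      synced = true;
      expectReset = false;
      expected = 0;
      continue;
    }
    if (expectReset) return false;               // reset message missing

    // Anything that is not a short run of digits counts as corruption.
    const bool digitsOnly =
        !line.empty() && line.size() <= 9 &&
        line.find_first_not_of("0123456789") == std::string::npos;
    if (!digitsOnly) return false;

    const unsigned long value = std::stoul(line);
    if (value >= size) return false;
    if (synced && value != expected) return false;  // sequence jumped

    synced = true;
    expected = value + 1;
    expectReset = (expected == size);
  }
  return true;
}
```

A log such as 98, 99, RESET!!, 0, 1 passes for size 100, whereas a sequence that jumps (0, 1, 7) or that wraps to 0 without printing RESET!! is rejected.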

So we can see that the highest number of elements that is "guaranteed" to work (on the real Arduino and in both simulators) is, in fact, 445, which is 21 elements fewer than the initial maximum of 466 - or 84 bytes less than the "theoretical" SRAM maximum (derived from .data and .bss sizes only) for the ATmega328. Obviously, we would like some measure that would inform us of this margin - which is why we may consider simulators and debugging.
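One way to read the table is to look at how many bytes remain above .data and .bss for each SIZE, since the stack (which grows down from the top of SRAM) has to fit into that remainder. The sketch below is only that arithmetic; the names are hypothetical, and the baseline of 18 + 164 bytes is the static footprint of the example without the array.

```cpp
constexpr int kSramBytes      = 2048;      // ATmega328
constexpr int kStaticBaseline = 18 + 164;  // .data + .bss without the array
constexpr int kElementBytes   = 4;         // unsigned long on the AVR

// Bytes left between the end of .bss and the top of SRAM for a given SIZE.
constexpr int bytesLeftAboveBss(int size) {
  return kSramBytes - (kStaticBaseline + kElementBytes * size);
}
```

This gives 2 bytes for SIZE=466, 66 for 450, 82 for 446 and 86 for 445. Under the assumption that the stack is what collides with the array, the sketch would need somewhere around 80 bytes of stack at its deepest call; this is an interpretation of the table, not a measured value.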



Simulators and debugging

Also in the table above, there are "simavr" and "emulino" columns; these refer to the output obtained by using two emulators for Linux that can be used for the Arduino (but need to be compiled from source):

  • emulino: here being called by:
    ./emulino /tmp/build7967507421090092066.tmp/minitest.cpp.hex
  • simavr: here being called by:
    ./run_avr -f 800 -m atmega328 /tmp/build7967507421090092066.tmp/minitest.cpp.hex

We can notice that simavr is somewhat optimistic with respect to how the Arduino behaves (that is, it will simulate an OK process for values of SIZE for which the real Arduino will have problems); whereas emulino is somewhat pessimistic (as it segfaults for SIZE=446, for which the real Arduino seems to work OK).

Note also that there is another simulator, simulavr, which could possibly work - however, it doesn't seem to support ATmega328 (seemingly, not even in its latest git version)

The good thing about simavr or simulavr is that they can be used to debug AVR programs, by establishing a (local socket) connection to gdb; however, that needs to be a special installation of gdb for AVR, known as avr-gdb.

  • Ubuntu users should note, that there are Debian packages for gdb-avr and simulavr in the main repositories; however, both of those are outdated, and should probably be built from source (see Bug #407367 in gdb-avr (Ubuntu): 'gdb-avr is outdated')
  • Unfortunately, since simavr is a bit too optimistic, it is likely that even a gdb approach may be unfruitful - as simavr simply does not display any problems, for sizes where the real Arduino has difficulties.

Also, note that emulino can be compiled with a #define TRACE in cpu.c, which will generate a tremendous amount of debug information, though it is not easily readable; the output can easily reach 70 MB of text for logging just 100 working steps; and if there is a 'crash' afterwards, it can keep looping for more than 200 MB without actually crashing. Also, in this case the simulation will run extremely slowly: with stderr piped to a file, it will take up to 3-5 seconds for a step to be printed out on stdout.

Another option for debugging this problem could be to use AVR Studio. It is Windows only, however it can be made to run under Wine (see AVR Studio on Linux). In principle, one can open the *.elf file in AVR studio, and run a debug session in assembler.

  • For a debug session with C/C++ source, it seems that either: Arduino programs need to be compiled in AVR Studio itself; or one needs to generate so-called coff files.
  • An assembler debug under Wine goes extremely slow, and one can spend hours in vain waiting for the binary to 'crash'.
  • There is no default way to see USART output in AVR studio; to have that, one has to additionally use HAPSIM
    • Hapsim can be also made to run in Wine, but it has difficulties finding a running instance of AVR Studio - and even when it finds one, it fails to print anything on its serial console.
  • However, the benefits of AVR Studio would be its built-in options to visualise the entire memory, as well as the state of all registers, the program counter etc.



Memory info from running Arduino

As a closing note, there is yet another way to debug these memory allocation issues in Arduino, listed in the forum entry Arduino Forum - How much ram is being used?. There, a memory profiling function is implemented in the Arduino sketch, which calculates sizes of heap, etc at runtime - and prints them out through serial.

As we have seen in this specific problem, on a real Arduino we would get either no output or corrupt output when "SRAM is maxed", and in that case we would not be able to rely on the profiler's serial printout either; and once sizes fit and the process starts working, the printout should not point to any discrepancies (and thus does not necessarily bring us closer to identifying an "easy" condition for a maximum size of the array).

  • However, when "SRAM is maxed", we can just as well run the memory profiler code in a simulator, and maybe hope to identify some more concrete conditions there.

Obviously, if we add the memory profiling function, the available memory will change, and hence we have to go through the manual process of starting with a small array size, and then identifying a maximum array size, again. So let's do that; below is the example code we used so far - with the memory profiler function added (note that it uses freeMemory via MemoryFree.h, which is also given in the thread above), and starting array size of 100 elements (400 bytes):

 
// minitest.pde

static unsigned long mydata_count;

static const long SERSPEED=115200;
static const int SIZE=100;
static const char RSTSTR[] = "RESET!!";

static unsigned long mydata[SIZE];

#include <MemoryFree.h>

extern unsigned int __data_start;
extern unsigned int __data_end;
extern unsigned int __bss_start;
extern unsigned int __bss_end;
extern unsigned int __heap_start;
//extern void *__malloc_heap_start; --> apparently already declared as char*
//extern void *__malloc_margin; --> apparently already declared as a size_t
extern void *__brkval;
// RAMEND and SP seem to be available without declaration here

int16_t ramSize=0;   // total amount of ram available for partitioning
int16_t dataSize=0;  // partition size for .data section
int16_t bssSize=0;   // partition size for .bss section
int16_t heapSize=0;  // partition size for current snapshot of the heap section
int16_t stackSize=0; // partition size for current snapshot of the stack section
int16_t freeMem1=0;  // available ram calculation #1
int16_t freeMem2=0;  // available ram calculation #2

// This function places the current value of the heap and stack pointers in the
// variables. You can call it from any place in your code and save the data for
// outputting or displaying later. This allows you to check at different parts of
// your program flow.
// The stack pointer starts at the top of RAM and grows downwards. The heap pointer
// starts just above the static variables etc. and grows upwards. SP should always
// be larger than HP or you'll be in big trouble! The smaller the gap, the more
// careful you need to be. Julian Gall 6-Feb-2009.
uint8_t *heapptr, *stackptr;
uint16_t diff=0;
void check_mem() {
  stackptr = (uint8_t *)malloc(4);          // use stackptr temporarily
  heapptr = stackptr;                     // save value of heap pointer
  free(stackptr);      // free up the memory again (free() itself leaves stackptr unchanged)
  stackptr =  (uint8_t *)(SP);           // save value of stack pointer
}



void setup() 
{
  Serial.begin(SERSPEED); 
  mydata_count = 0; 
}

void loop() 
{ 
  mydata[mydata_count] = mydata_count; 
  memrep();
  Serial.println(mydata[mydata_count], DEC);
  mydata_count++;   
  if (mydata_count == SIZE) {
    doReset();
  }
}

void doReset()
{
  Serial.println(RSTSTR); 
  mydata_count = 0; 
}



void memrep()                     // run over and over again
{
  Serial.print("\n\n--------------------------------------------");
  Serial.print("\n\nLOOP BEGIN: get_free_memory() reports [");
  Serial.print( freeMemory() );
  Serial.print("] (bytes) which must be > 0 for no heap/stack collision");


  Serial.print("\n\nSP should always be larger than HP or you'll be in big trouble!");

  check_mem();

  Serial.print("\nheapptr=[0x"); Serial.print( (int) heapptr, HEX); Serial.print("] (growing upward, "); Serial.print( (int) heapptr, DEC); Serial.print(" decimal)");

  Serial.print("\nstackptr=[0x"); Serial.print( (int) stackptr, HEX); Serial.print("] (growing downward, "); Serial.print( (int) stackptr, DEC); Serial.print(" decimal)");

  Serial.print("\ndifference should be positive: diff=stackptr-heapptr, diff=[0x");
  diff=stackptr-heapptr;
  Serial.print( (int) diff, HEX); Serial.print("] (which is ["); Serial.print( (int) diff, DEC); Serial.print("] (bytes decimal)");


  Serial.print("\n\nLOOP END: get_free_memory() reports [");
  Serial.print( freeMemory() );
  Serial.print("] (bytes) which must be > 0 for no heap/stack collision");


  // ---------------- Print memory profile -----------------
  Serial.print("\n\n__data_start=[0x"); Serial.print( (int) &__data_start, HEX ); Serial.print("] which is ["); Serial.print( (int) &__data_start, DEC); Serial.print("] bytes decimal");

  Serial.print("\n__data_end=[0x"); Serial.print((int) &__data_end, HEX ); Serial.print("] which is ["); Serial.print( (int) &__data_end, DEC); Serial.print("] bytes decimal");

  Serial.print("\n__bss_start=[0x"); Serial.print((int) & __bss_start, HEX ); Serial.print("] which is ["); Serial.print( (int) &__bss_start, DEC); Serial.print("] bytes decimal");

  Serial.print("\n__bss_end=[0x"); Serial.print( (int) &__bss_end, HEX ); Serial.print("] which is ["); Serial.print( (int) &__bss_end, DEC); Serial.print("] bytes decimal");

  Serial.print("\n__heap_start=[0x"); Serial.print( (int) &__heap_start, HEX ); Serial.print("] which is ["); Serial.print( (int) &__heap_start, DEC); Serial.print("] bytes decimal");

  Serial.print("\n__malloc_heap_start=[0x"); Serial.print( (int) __malloc_heap_start, HEX ); Serial.print("] which is ["); Serial.print( (int) __malloc_heap_start, DEC); Serial.print("] bytes decimal");

  Serial.print("\n__malloc_margin=[0x"); Serial.print( (int) &__malloc_margin, HEX ); Serial.print("] which is ["); Serial.print( (int) &__malloc_margin, DEC); Serial.print("] bytes decimal");

  Serial.print("\n__brkval=[0x"); Serial.print( (int) __brkval, HEX ); Serial.print("] which is ["); Serial.print( (int) __brkval, DEC); Serial.print("] bytes decimal");

  Serial.print("\nSP=[0x"); Serial.print( (int) SP, HEX ); Serial.print("] which is ["); Serial.print( (int) SP, DEC); Serial.print("] bytes decimal");

  Serial.print("\nRAMEND=[0x"); Serial.print( (int) RAMEND, HEX ); Serial.print("] which is ["); Serial.print( (int) RAMEND, DEC); Serial.print("] bytes decimal");

  // summaries:
  ramSize   = (int) RAMEND       - (int) &__data_start;
  dataSize  = (int) &__data_end  - (int) &__data_start;
  bssSize   = (int) &__bss_end   - (int) &__bss_start;
  heapSize  = (int) __brkval     - (int) &__heap_start;
  stackSize = (int) RAMEND       - (int) SP;
  freeMem1  = (int) SP           - (int) __brkval;
  freeMem2  = ramSize - stackSize - heapSize - bssSize - dataSize;
  Serial.print("\n--- section size summaries ---");
  Serial.print("\nram   size=["); Serial.print( ramSize, DEC ); Serial.print("] bytes decimal");
  Serial.print("\n.data size=["); Serial.print( dataSize, DEC ); Serial.print("] bytes decimal");
  Serial.print("\n.bss  size=["); Serial.print( bssSize, DEC ); Serial.print("] bytes decimal");
  Serial.print("\nheap  size=["); Serial.print( heapSize, DEC ); Serial.print("] bytes decimal");
  Serial.print("\nstack size=["); Serial.print( stackSize, DEC ); Serial.print("] bytes decimal");
  Serial.print("\nfree size1=["); Serial.print( freeMem1, DEC ); Serial.print("] bytes decimal");
  Serial.print("\nfree size2=["); Serial.print( freeMem2, DEC ); Serial.print("] bytes decimal");
  Serial.println();
}


Here, binary sketch size is 5190 bytes, while avr-size reports:

 
$ avr-size /tmp/build7967507421090092066.tmp/minitest.cpp.elf 
   text	   data	    bss	    dec	    hex	filename
   4416	    774	    588	   5778	   1692	/tmp/build7967507421090092066.tmp/minitest.cpp.elf

which means SRAM=774+588=1362 bytes; and the output is like:

 
...
--------------------------------------------

LOOP BEGIN: get_free_memory() reports [650] (bytes) which must be > 0 for no heap/stack collision

SP should always be larger than HP or you'll be in big trouble!
heapptr=[0x654] (growing upward, 1620 decimal)
stackptr=[0x8E5] (growing downward, 2277 decimal)
difference should be positive: diff=stackptr-heapptr, diff=[0x291] (which is [657] (bytes decimal)

LOOP END: get_free_memory() reports [650] (bytes) which must be > 0 for no heap/stack collision

__data_start=[0x100] which is [256] bytes decimal
__data_end=[0x406] which is [1030] bytes decimal
__bss_start=[0x406] which is [1030] bytes decimal
__bss_end=[0x652] which is [1618] bytes decimal
__heap_start=[0x652] which is [1618] bytes decimal
__malloc_heap_start=[0x652] which is [1618] bytes decimal
__malloc_margin=[0x3EE] which is [1006] bytes decimal
__brkval=[0x658] which is [1624] bytes decimal
SP=[0x8E7] which is [2279] bytes decimal
RAMEND=[0x8FF] which is [2303] bytes decimal
--- section size summaries ---
ram   size=[2047] bytes decimal
.data size=[774] bytes decimal
.bss  size=[588] bytes decimal
heap  size=[6] bytes decimal
stack size=[24] bytes decimal
free size1=[655] bytes decimal
free size2=[655] bytes decimal
99
RESET!!


--------------------------------------------

LOOP BEGIN: get_free_memory() reports [650] (bytes) which must be > 0 for no heap/stack collision

SP should always be larger than HP or you'll be in big trouble!
heapptr=[0x654] (growing upward, 1620 decimal)
stackptr=[0x8E5] (growing downward, 2277 decimal)
difference should be positive: diff=stackptr-heapptr, diff=[0x291] (which is [657] (bytes decimal)

LOOP END: get_free_memory() reports [650] (bytes) which must be > 0 for no heap/stack collision

__data_start=[0x100] which is [256] bytes decimal
__data_end=[0x406] which is [1030] bytes decimal
__bss_start=[0x406] which is [1030] bytes decimal
__bss_end=[0x652] which is [1618] bytes decimal
__heap_start=[0x652] which is [1618] bytes decimal
__malloc_heap_start=[0x652] which is [1618] bytes decimal
__malloc_margin=[0x3EE] which is [1006] bytes decimal
__brkval=[0x658] which is [1624] bytes decimal
SP=[0x8E7] which is [2279] bytes decimal
RAMEND=[0x8FF] which is [2303] bytes decimal
--- section size summaries ---
ram   size=[2047] bytes decimal
.data size=[774] bytes decimal
.bss  size=[588] bytes decimal
heap  size=[6] bytes decimal
stack size=[24] bytes decimal
free size1=[655] bytes decimal
free size2=[655] bytes decimal
0
...

Notice in the log above, that none of the information printed changes between steps!
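The "section size summaries" in the log can be re-derived on a PC from the addresses printed just above them. The following minimal sketch repeats the arithmetic of the summaries part of memrep(); the struct and function names are made up for this illustration, and the numbers are copied from the report for SIZE=100.

```cpp
// Addresses from one profiler report (decimal).
struct MemorySnapshot {
  int dataStart, dataEnd, bssStart, bssEnd, heapStart, brkval, sp, ramEnd;
};

// Sizes in bytes derived from a snapshot.
struct MemorySummary {
  int ram, data, bss, heap, stack, free1, free2;
};

// Same subtractions as in the "summaries" part of memrep().
MemorySummary summarize(const MemorySnapshot& m) {
  MemorySummary s;
  s.ram   = m.ramEnd - m.dataStart;
  s.data  = m.dataEnd - m.dataStart;
  s.bss   = m.bssEnd - m.bssStart;
  s.heap  = m.brkval - m.heapStart;
  s.stack = m.ramEnd - m.sp;
  s.free1 = m.sp - m.brkval;
  s.free2 = s.ram - s.stack - s.heap - s.bss - s.data;
  return s;
}

// Values from the report logged above for SIZE=100.
const MemorySnapshot kSize100 = {256, 1030, 1030, 1618, 1618, 1624, 2279, 2303};
```

summarize(kSize100) reproduces the logged values: 2047, 774, 588, 6, 24, 655 and 655. Note that the "ram size" of 2047 is one less than the 2048 bytes of SRAM, because RAMEND is the address of the last byte rather than one past it.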

In this case, we got SRAM usage of 1362 bytes, which should mean that 2048-1362=686 bytes remain; however, as freeMemory() (labelled get_free_memory() in the printout) reports 650 bytes, we will take that as a starting point:

  • 650 bytes means 650/4=162.5, or 162 unsigned long elements
  • Added to the already existing 100 elements, we have SIZEmax = 262 elements.

SIZE  IDE    .text  .data  .bss   SRAM   status    simavr    emulino
262   5190   4416   774    1236   2010   no out    freezes   loops to LOOP BEGIN:
252   5190   4416   774    1196   1970   no out    freezes   loops to LOOP BEGIN:
242   5190   4416   774    1156   1930   runs ok   runs ok   runs ok

Again, we get OK performance if we allocate about 80 bytes less than the first estimate of SIZEmax! For 242 elements, we get a report:

 
...
LOOP BEGIN: get_free_memory() reports [82] (bytes) which must be > 0 for no heap/stack collision

SP should always be larger than HP or you'll be in big trouble!
heapptr=[0x88C] (growing upward, 2188 decimal)
stackptr=[0x8E5] (growing downward, 2277 decimal)
difference should be positive: diff=stackptr-heapptr, diff=[0x59] (which is [89] (bytes decimal)

LOOP END: get_free_memory() reports [82] (bytes) which must be > 0 for no heap/stack collision

__data_start=[0x100] which is [256] bytes decimal
__data_end=[0x406] which is [1030] bytes decimal
__bss_start=[0x406] which is [1030] bytes decimal
__bss_end=[0x88A] which is [2186] bytes decimal
__heap_start=[0x88A] which is [2186] bytes decimal
__malloc_heap_start=[0x88A] which is [2186] bytes decimal
__malloc_margin=[0x3EE] which is [1006] bytes decimal
__brkval=[0x890] which is [2192] bytes decimal
SP=[0x8E7] which is [2279] bytes decimal
RAMEND=[0x8FF] which is [2303] bytes decimal
--- section size summaries ---
ram   size=[2047] bytes decimal
.data size=[774] bytes decimal
.bss  size=[1156] bytes decimal
heap  size=[6] bytes decimal
stack size=[24] bytes decimal
free size1=[87] bytes decimal
free size2=[87] bytes decimal
...

If we again start increasing SIZE, we will find that corrupt output starts occurring at a SIZE of 244 elements, in which case the output of a real Arduino can look like:

 
...
LOOP BEGIN: get_free_memory() reports [74] (bytes) which must be > 0 for no heap/stack collision

SP should always be larger than HP or you'll be in big trouble!
heapptr=[0xFFFF82B7] (growing upward, -32073 decimal)
stack{90}{STX}r=[0x8E5] (growing downward, 2277 decimal)
difference should be positive: diff=stackptr-heapptr, diff=[0xFFFF862EQ (which is [-31186] (bytes decimal)

LOOP END: ge{EOT}74] (bytes) which must be > 0 for no heap/stack collision

__data_st{EOT}100{BD}{82}which is [256] bytes decimal
__data_end=[0x406{BD}{82}which is [1030] bytes decimal
__bss_start=[0x406{BD}{82}which is [1030] bytes decimal
__bss_end=[0x892{BD}{82}which is [2194] bytes decimal
__heap_start=[0x892{BD}{82}which is [2194] bytes decimal
__malloc_heap_start=[0x892{BD}{82}which is [2194] bytes decimal
__malloc_margin=[0x3EE{BD}{82}which is [1006] bytes decimal
__brkval=[0x898{BD}{82}which is [2200] bytes decimal
SP=[0x8E7{BD}{82}which is [2279] bytes decimal
RAMEND=[0x8FF{BD}{82}which is [2303] bytes decimal
--- section size summaries ---
ram   size=[2047] bytes decimal
.data size=[774] bytes decimal
.bss  size=[1164] bytes decimal
heap  size=[6] bytes decimal
stack size=[24] bytes decimal
free size1=[79] bytes decimal
free size2=[79] bytes decimal
...

And here, in between the corrupt characters (marked with {}), we gain another parameter explaining why we get a failure now - the difference between stack and heap pointer is negative: diff=[0xFFFF862EQ (which is [-31186] (bytes decimal).

  • In this case, simavr is again more optimistic (it works without a crash); while emulino freezes after several iterations (however, its diff is not negative - although, it decreases with every run).
  • For SIZE of 243, simavr, emulino and real Arduino output show the same memory profile function output.
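The eight-digit hexadecimal values and the negative decimals in the corrupt report are consistent with a 16-bit value being cast to int and then widened to long before printing, which is how Arduino's Print::print(int, int) is understood to work. The PC-side sketch below models that with fixed-width types; the function names are hypothetical, and the narrowing conversions behave as shown with GCC (they are implementation-defined before C++20).

```cpp
#include <cstdint>

// Value shown by print((int) raw, DEC): the 16-bit pattern read as signed.
constexpr std::int32_t printedDecimal(std::uint16_t raw) {
  return static_cast<std::int16_t>(raw);
}

// Bits shown by print((int) raw, HEX): sign-extended to 32 bits first.
constexpr std::uint32_t printedHexBits(std::uint16_t raw) {
  return static_cast<std::uint32_t>(
      static_cast<std::int32_t>(static_cast<std::int16_t>(raw)));
}

// diff = stackptr - heapptr, stored in a uint16_t as in the sketch.
constexpr std::uint16_t pointerDiff(std::uint16_t stackptr,
                                    std::uint16_t heapptr) {
  return static_cast<std::uint16_t>(stackptr - heapptr);
}
```

With stackptr = 0x8E5 and a raw heapptr of 0x82B7, this gives FFFF82B7 and -32073 for heapptr, and 0x862E and -31186 for diff, matching the report. Since 0x82B7 lies far beyond RAMEND (0x8FF), heapptr cannot be a valid SRAM address at this point.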



Conclusion

So, although a 100% working recipe for finding safe maximum array sizes cannot really be stated, as a brief conclusion we can say this: in order to find the maximum size for a single array in an Arduino sketch, which will fill the maximum available amount of SRAM without corrupting the running process, we can first start with a smaller array of a safe, known size - and then:

First approximation (compile time, without memory profile function):

  • get usage SRAM as .data+.bss
  • given maxSRAM for a chip, get remain as maxSRAM-SRAM
  • subtract about 80 bytes from remain as a safety margin: saferemain = remain - 80
  • use saferemain to estimate initial SIZEmax of array
  • increase or decrease SIZE until its max value for proper performance is found.

Second approximation (runtime, with memory profile function):

  • get remain as whatever freeMemory() returns
  • subtract about 80 bytes from remain as a safety margin: saferemain = remain - 80
  • use saferemain to estimate initial SIZEmax of array
  • increase or decrease SIZE until its max value, that still keeps the difference between stack and heap pointer positive, is found.
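Both approximations can be written down as small helper functions. This is a minimal sketch with hypothetical names; the 80-byte margin is the empirical value found in this writeup for these particular sketches, not a general constant.

```cpp
constexpr int kSafetyMarginBytes = 80;  // empirical, from the tables above

// First approximation: start from .data + .bss as reported by avr-size.
constexpr int estimateFromSections(int maxSram, int data, int bss,
                                   int currentElements, int elementBytes) {
  return currentElements +
         (maxSram - (data + bss) - kSafetyMarginBytes) / elementBytes;
}

// Second approximation: start from what freeMemory() reports at runtime.
constexpr int estimateFromFreeMemory(int freeBytes, int currentElements,
                                     int elementBytes) {
  return currentElements + (freeBytes - kSafetyMarginBytes) / elementBytes;
}
```

For the array example, estimateFromSections(2048, 18, 564, 100, 4) gives 446, and for the profiler example, estimateFromFreeMemory(650, 100, 4) gives 242. Both are only starting values for the final step of the recipe; a sketch with deeper function calls or more local variables may need a larger margin.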





Smilen Dimitrov