Besides the current allocated/free bytes, keep track of
the maximum allocated bytes to help determine the heap
size requirements. Also provide a function to reset
that statistic.
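A usage sketch follows; the struct and function names are assumptions
drawn from this and the next entry, not quotes from the actual header:

    /* Hypothetical names, inferred from the commit descriptions */
    struct sys_heap_runtime_stats stats;

    sys_heap_runtime_stats_get(&heap, &stats);
    printk("high-water mark: %zu bytes\n", stats.max_allocated_bytes);

    /* start a fresh measurement interval */
    sys_heap_runtime_stats_reset_max_allocated_bytes(&heap);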
Signed-off-by: Damian Krolik <damian.krolik@nordicsemi.no>
Add functions to get the sys_heap runtime statistics,
including total free bytes and total allocated bytes.
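A sketch of what the additions could look like; the exact names are
assumptions based on the description above:

    struct sys_heap_runtime_stats {
        size_t free_bytes;       /* total free bytes in the heap */
        size_t allocated_bytes;  /* total allocated bytes in the heap */
    };

    int sys_heap_runtime_stats_get(struct sys_heap *heap,
                                   struct sys_heap_runtime_stats *stats);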
Signed-off-by: Chen Peng1 <peng1.chen@intel.com>
The "small" heap is is way sufficient for most 32-bit systems.
Let's provide the option to have only one type of heap allowing for
smaller and faster heap code due to not having a bunch of runtime
conditionals based on the heap size.
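As a sketch of the idea (the Kconfig symbol name
CONFIG_SYS_HEAP_SMALL_ONLY is an assumption), the big-heap predicate
folds to a constant and every branch on it disappears at compile time:

    static inline bool big_heap(struct z_heap *h)
    {
        if (IS_ENABLED(CONFIG_SYS_HEAP_SMALL_ONLY)) {
            return false;   /* constant-folds all big_heap() tests away */
        }
        /* small-heap headers hold 15-bit sizes, hence the 0x7fff bound */
        return sizeof(void *) > 4U || h->end_chunk > 0x7fffU;
    }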
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
The code contains a variable "chunksz_t chunksz" that has the same name
as the function "chunksz_t chunksz()" in heap.h.
Give the variable a unique name to avoid misreading in the future.
Found as a coding guideline violation (MISRA R5.9) by a static
code scanning tool.
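A simplified illustration of the clash and the fix (the replacement
name shown is hypothetical):

    chunksz_t chunksz(size_t bytes);   /* the accessor declared in heap.h */

    chunksz_t chunksz;    /* before: clashes with the chunksz() function */
    chunksz_t chunk_sz;   /* after: unique name, no ambiguity */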
Signed-off-by: Maksim Masalski <maksim.masalski@intel.com>
The size_t usage, especially in struct z_heap_bucket, made the heap
header almost 2x bigger than it needs to be on 64-bit systems.
This prompted me to clean up our type usage to make the code more
efficient and easier to understand. From now on:
- chunkid_t is for absolute chunk position measured in chunk units
- chunksz_t is for chunk sizes measured in chunk units
- size_t is for buffer sizes measured in bytes
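A minimal sketch of the resulting convention; the underlying integer
types shown are assumptions:

    typedef size_t chunkid_t;   /* absolute chunk position, in chunk units */
    typedef size_t chunksz_t;   /* chunk size, in chunk units */

    /* e.g. a conversion helper keeps the units explicit at the boundary */
    chunksz_t bytes_to_chunksz(struct z_heap *h, size_t bytes);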
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
The end marker chunk was represented by the len field of struct z_heap.
It is now renamed to end_chunk to make it more obvious what it is.
And while at it...
Given that it is used in size_too_big() to cap the allocation size
already, we no longer need to test the bucket index against the
biggest index possible derived from end_chunk in alloc_chunk(). The
corresponding bucket_idx() call is relatively expensive on some
architectures so avoiding it (turning it into a CHECK() instead) is
a good thing.
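Sketched, the hot path change looks like this (the code shape is an
assumption, with CHECK() presumed to be a debug-only assertion):

    int bi = bucket_idx(h, sz);
    struct z_heap_bucket *b = &h->buckets[bi];

    /* previously a runtime bounds test against the biggest possible
     * index; size_too_big() at the entry points now guarantees it */
    CHECK(bi <= bucket_idx(h, h->end_chunk));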
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Turn sys_heap_dump() into sys_heap_print_info() to better reflect
what it actually does, and improve the information being printed.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
This reverts commit b6b6d39bb6.
With both commit 4690b8d5ec ("libc/minimal: fix malloc() allocated
memory alignment") and commit c822e0abbd ("libc/minimal: fix
realloc() allocated memory alignment") in place, there is no longer
a need for enforcing the big heap mode on every allocation.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
This option allows forcing big heap mode. Useful for getting 8-byte
aligned blocks on 32-bit machines.
Signed-off-by: Martin Åberg <martin.aberg@gaisler.com>
Let's do the overflow checking upfront, only once per entry point, and
dispense with overflow checks later to keep the code simple.
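A sketch of an entry point under this scheme (do_alloc() stands in for
the rest of the path and is hypothetical):

    void *sys_heap_alloc(struct sys_heap *heap, size_t bytes)
    {
        struct z_heap *h = heap->heap;

        /* one cheap test; everything downstream may assume the size
         * arithmetic cannot overflow */
        if (bytes == 0U || size_too_big(h, bytes)) {
            return NULL;
        }
        return do_alloc(h, bytes);   /* hypothetical rest of the path */
    }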
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Both operands of an operator in the arithmetic conversions
performed shall have the same essential type category.
The changes convert integer constants to unsigned integer constants.
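A hypothetical instance of the change shape (the operand shown is just
a plausible example):

    /* before: signed constant "1" mixed into unsigned arithmetic */
    h->avail_buckets &= ~(1 << bi);

    /* after: both operands in the same essential type category */
    h->avail_buckets &= ~(1U << bi);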
Signed-off-by: Aastha Grover <aastha.grover@intel.com>
After commit 8a6b02b5bf ("lib/os/heap: some code simplification in
sys_heap_aligned_alloc()") it is no longer required to have a "big"
heap for aligned allocations to work on 32-bit targets. While the
natural alignment for returned memory has an offset of 4 within a chunk
unit due to the smaller header size, returning to a chunkid from a
memory pointer with an offset of 8 will fall back onto the proper chunk
number once the 4 is subtracted and then divided by 8.
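The arithmetic, worked through (CHUNK_UNIT == 8; the (offset - 4) / 8
conversion is assumed from the description above):

    /* chunk c starts at byte offset c*8 in the heap buffer, and the
     * small-heap header occupies the first 4 bytes of the chunk:
     *
     *   natural pointer:   offset = c*8 + 4  ->  (c*8 + 4 - 4) / 8 = c
     *   8-aligned pointer: offset = c*8 + 8  ->  (c*8 + 8 - 4) / 8
     *                                          = (c*8 + 4) / 8 = c (truncated)
     *
     * so both map back to the same chunk number c.
     */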
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Instead of limiting the excess split-off to sufficiently large chunks
in split_alloc(), let's allow normal allocations to create "solo free
headers" just like with aligned allocations. There is no point leaving
them in the allocated chunk if the user didn't ask for it. Doing so
makes them eligible for merging at the next opportunity and potentially
reusable sooner.
Also make the validation code aware of them.
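A sketch of the simplified split; the function shape is an assumption,
and free_list_add() is presumed to quietly skip solo free headers that
are too small to link into a list:

    static chunkid_t split_alloc(struct z_heap *h, int bidx, chunksz_t sz)
    {
        chunkid_t c = h->buckets[bidx].next;

        free_list_remove(h, bidx, c);
        if (chunk_size(h, c) > sz) {
            /* split off the excess unconditionally, even a bare header */
            split_chunks(h, c, c + sz);
            free_list_add(h, c + sz);
        }
        set_chunk_used(h, c, true);
        return c;
    }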
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Add support for a C11-style aligned_alloc() in the heap
implementation. This is properly optimized, in the sense that unused
prefix/suffix data around the chosen allocation is returned to the
heap and made available for general allocation.
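Usage sketch (alignment first, then size, mirroring the argument order
of C11 aligned_alloc):

    void *buf = sys_heap_aligned_alloc(&heap, 64, 1024);

    if (buf != NULL) {
        /* buf is 64-byte aligned; the trimmed prefix/suffix space went
         * back to the heap instead of being wasted as padding */
        sys_heap_free(&heap, buf);
    }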
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This struct is taking up most of the heap's constant footprint overhead.
We can easily get rid of the list_size member as it is mostly used to
determine if the list is empty, and that can be determined through
other means.
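One such means, sketched here as an assumption: the bitmask of
non-empty buckets already maintained by the allocator answers the
question directly.

    /* hypothetical helper; avail_buckets has one bit set per
     * non-empty bucket */
    static inline bool free_list_empty(struct z_heap *h, int bidx)
    {
        return (h->avail_buckets & BIT(bidx)) == 0U;
    }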
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Make the LEFT_SIZE field first and SIZE_AND_USED field last (for an
allocated chunk) so they sit right next to the allocated memory. The
current chunk's SIZE_AND_USED field points to the next (right) chunk,
and from there the LEFT_SIZE field should point back to the current
chunk. Many trivial memory overflows should trip that test.
One way to make this test more robust could involve xor'ing the values
within respective accessor pairs. But at least the fact that the size
value is shifted by one bit already prevents fooling the test with a
same-byte corruption.
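The resulting consistency test, sketched as a hypothetical helper
around the accessors named elsewhere in this log:

    static inline bool chunk_header_ok(struct z_heap *h, chunkid_t c)
    {
        /* the next chunk's LEFT_SIZE must point back at this chunk;
         * an overflow into that header breaks the round trip */
        return left_chunk(h, right_chunk(h, c)) == c;
    }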
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
We already have chunk #0 containing our struct z_heap and marked as
used. We can add a partial chunk at the very end that is also marked
as used. By doing so there is no longer a need for checking heap
boundaries at run time when merging/splitting chunks meaning fewer
conditionals in the code's hot path.
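Sketched in init-time terms (accessor names as used elsewhere in this
log; the exact statements are an assumption):

    /* chunk #0 carries struct z_heap and is permanently used */
    set_chunk_used(h, 0, true);

    /* partial end marker chunk, also permanently used, so merge/split
     * never has to test for the heap boundary */
    set_chunk_size(h, h->end_chunk, 0);
    set_chunk_used(h, h->end_chunk, true);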
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
It is possible to remove a few fields from struct z_heap, removing
some runtime indirections by doing so:
- The buf pointer is actually the same as the struct z_heap pointer
itself. So let's simply create chunk_buf() that performs a type
conversion. That type is also chunk_unit_t now rather than u64_t so
it can be defined based on CHUNK_UNIT.
- Replace the struct z_heap_bucket pointer by a zero-sized array at the
end of struct z_heap.
- Make chunk #0 into an actual chunk with its own header. This allows
for removing the chunk0 field and streamlining the code. This way
h->chunk0 becomes right_chunk(h, 0). This sets the stage for further
simplifications to come.
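A sketch of the resulting layout; the field list is abridged and the
exact types are assumptions:

    typedef struct { char bytes[CHUNK_UNIT]; } chunk_unit_t;

    struct z_heap {
        uint32_t len;            /* end marker chunk; renamed end_chunk later */
        uint32_t avail_buckets;
        struct z_heap_bucket buckets[0];   /* replaces the bucket pointer */
    };

    static inline chunk_unit_t *chunk_buf(struct z_heap *h)
    {
        /* the heap buffer and struct z_heap begin at the same address */
        return (chunk_unit_t *)h;
    }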
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
By storing the used flag in the LSB, it is no longer necessary to have
a size_mask variable to locate that flag. This produces smaller and
faster code.
Replace the validation check in chunk_set() to base it on the storage
type.
Also clarify the semantics of set_chunk_size(), which allows for
clearing the used flag bit unconditionally and simplifies the code
further.
The idea of moving the used flag bit into the LEFT_SIZE field was
raised. It turns out that this isn't as beneficial as it may seem
because the used bit is set only once i.e. when the memory is handed off
to a user and the size field becomes frozen at that point. Modifications
on the leftward chunk may still occur and extra instructions to preserve
that bit would be necessary if it were moved there.
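Accessor sketch with the used flag in bit 0 (chunk_field() and
chunk_set() are assumed to be the raw header read/write primitives):

    static inline bool chunk_used(struct z_heap *h, chunkid_t c)
    {
        return (chunk_field(h, c, SIZE_AND_USED) & 1U) != 0U;
    }

    static inline chunksz_t chunk_size(struct z_heap *h, chunkid_t c)
    {
        return chunk_field(h, c, SIZE_AND_USED) >> 1;
    }

    static inline void set_chunk_size(struct z_heap *h, chunkid_t c, chunksz_t sz)
    {
        /* clears the used bit unconditionally, per the text above */
        chunk_set(h, c, SIZE_AND_USED, sz << 1);
    }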
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Let's provide accessors for getting and setting every field to make the
chunk header layout abstracted away from the main code. Those are:
SIZE_AND_USED: chunk_used(), chunk_size(), set_chunk_used() and
set_chunk_size().
LEFT_SIZE: left_chunk() and set_left_chunk_size().
FREE_PREV: prev_free_chunk() and set_prev_free_chunk().
FREE_NEXT: next_free_chunk() and set_next_free_chunk().
To be consistent, the former chunk_set_used() is now set_chunk_used().
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
First, some renames to make accessors more explicit:
size() --> chunk_size()
used() --> chunk_used()
free_prev() --> prev_free_chunk()
free_next() --> next_free_chunk()
Then, the return type of chunk_size() is changed from chunkid_t to
size_t, and chunk_used() from chunkid_t to bool.
The left_size() accessor is used only once and can be easily substituted
by left_chunk(), so it is removed.
And in free_list_add() the variable b is renamed to bi so as to be
consistent with its usage in sys_heap_alloc().
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, its space efficiency has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
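A minimal usage sketch against the public API described above (the
caller function is hypothetical):

    static struct sys_heap heap;
    static char heap_mem[2048];   /* arbitrarily aligned memory is fine */

    void heap_example(void)
    {
        sys_heap_init(&heap, heap_mem, sizeof(heap_mem));

        void *p = sys_heap_alloc(&heap, 100);
        if (p != NULL) {
            sys_heap_free(&heap, p);
        }
    }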
It has excellent space efficiency. Chunks can be split arbitrarily in
8-byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kB or on 64-bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
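The bounded search, sketched (the option name
CONFIG_SYS_HEAP_ALLOC_LOOPS and the loop shape are assumptions):

    int bi = bucket_idx(h, sz);

    /* sample a few entries of the smallest bucket that might fit */
    for (int i = 0; i < CONFIG_SYS_HEAP_ALLOC_LOOPS; i++) {
        chunkid_t c = h->buckets[bi].next;

        if (chunk_size(h, c) >= sz) {
            break;   /* big enough, take it */
        }
        /* otherwise rotate the list and sample another entry */
    }
    /* fall back to bucket bi+1, where every chunk is guaranteed to fit */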
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>