In z_sys_mem_pool_block_alloc() the size of the first level block
allocation is rounded up to the next 4-byte boundary. This means one
or more of the trailing blocks could overlap the free block bitmap.
Let's consider this code from kernel.h:
#define K_MEM_POOL_DEFINE(name, minsz, maxsz, nmax, align) \
char __aligned(align) _mpool_buf_##name[_ALIGN4(maxsz * nmax) \
+ _MPOOL_BITS_SIZE(maxsz, minsz, nmax)]; \
The static pool allocation rounds up the product of maxsz and nmax,
not the size of individual blocks. If we have, say, maxsz = 10 and
nmax = 20,
the result of _ALIGN4(10 * 20) is 200. That's the offset at which the
free block bitmap will be located.
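For reference, _ALIGN4() rounds its argument up to the next multiple
of 4. A minimal sketch of the arithmetic involved (the definition
below is illustrative, not necessarily the exact kernel one):

/* Illustrative only: round n up to the next multiple of 4. */
#define _ALIGN4(n) (((n) + 3) & ~3)

/* With maxsz = 10 and nmax = 20:
 *   _ALIGN4(10 * 20) = _ALIGN4(200) = 200  -> offset of the bitmap
 *   _ALIGN4(10) * 20 = 12 * 20      = 240  -> space the blocks use
 */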
However, because z_sys_mem_pool_block_alloc() does this:
lsizes[0] = _ALIGN4(p->max_sz);
Individual level 0 blocks will have a size of 12, not 10. That means
the 17th block will extend up to offset 204, the 18th up to 216, the
19th to 228, and the 20th to 240. So 4 of the 20 blocks overflow the
static pool area, and 3 of them even lie entirely outside of it.
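The overflow can be verified with a small standalone sketch
(hypothetical helper, not kernel code), using the sizes assumed above:

#include <stdio.h>

int main(void)
{
	unsigned int block_size = 12; /* _ALIGN4(10) */
	unsigned int pool_end = 200;  /* _ALIGN4(10 * 20) */

	for (unsigned int i = 0; i < 20; i++) {
		unsigned int start = i * block_size;
		unsigned int end = start + block_size;

		if (end > pool_end)
			printf("block %u: %u..%u (%s)\n", i + 1, start, end,
			       start >= pool_end ? "entirely outside"
						 : "overflows");
	}
	return 0;
}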
In this example, we have only 20 blocks that can't be split, so there
is no extra free block bitmap allocation beyond the bitmap embedded in
the sys_mem_pool_lvl structure. This means the memory corruption will
hit whatever data happens to be located alongside the
_mpool_buf_##name array. But even with, say, 40 blocks, or larger
blocks, the extra bitmap size would be small compared to the extent of
the overflow, and it would of course get corrupted too.
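Rough arithmetic for the 40-block case (assuming the embedded bitmap
covers at most 32 blocks, so 40 blocks need an external bitmap):

/* 40 blocks * _ALIGN4(10) = 480 bytes actually touched by blocks
 * _ALIGN4(10 * 40)        = 400 bytes reserved for them
 * overflow                =  80 bytes
 * extra bitmap: 2 u32_t words = 8 bytes, placed at offset 400,
 * i.e. right inside the overflowed region.
 */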
And the data corruption will happen even without allocating any
memory, since z_sys_mem_pool_base_init() stores free_list pointer
nodes into those blocks; conversely, those nodes may themselves get
corrupted whenever that adjacent data is later modified.
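Schematically, the init path does something like this (a heavily
simplified sketch, not the literal mempool.c code):

/* A dlist node is written at the start of every level 0 block, so
 * the pointer stores land wherever block_ptr() says the blocks are,
 * whether anything was ever allocated or not.
 */
for (i = 0; i < p->n_max; i++) {
	sys_dlist_append(&p->levels[0].free_list,
			 block_ptr(p, p->max_sz, i));
}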
Fixing this issue is simple: the rounding in the static pool
allocation is "misparenthesized". Let's turn
_ALIGN4(maxsz * nmax)
into
_ALIGN4(maxsz) * nmax
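so that the buffer declaration becomes:

char __aligned(align) _mpool_buf_##name[(_ALIGN4(maxsz) * nmax) \
+ _MPOOL_BITS_SIZE(maxsz, minsz, nmax)]; \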
But that's not sufficient.
In z_sys_mem_pool_base_init() we have:
size_t buflen = p->n_max * p->max_sz, sz = p->max_sz;
u32_t *bits = (u32_t *)((u8_t *)p->buf + buflen);
Considering the same parameters as above, here we're locating the
extra free block bitmap at offset `buflen`, which is 20 * 10 = 200,
again within the reach of the last 4 memory blocks. If the number of
blocks exceeds the capacity of the embedded bitmap, the extra bitmap
will overlap memory blocks.
Also, the block_ptr() call used here to initialize the free block
linked list uses the unrounded p->max_sz, meaning that it initially
does not locate dlist nodes within the same block boundaries that
z_sys_mem_pool_block_alloc() expects. This opens the possibility for
adjacent allocated blocks to overwrite dlist nodes, leading to random
crashes later on.
So a complete fix must round up p->max_sz here too.
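In other words, something along these lines (a sketch; the actual
patch may differ in detail):

size_t sz = _ALIGN4(p->max_sz);
size_t buflen = p->n_max * sz;
u32_t *bits = (u32_t *)((u8_t *)p->buf + buflen);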
Given that runtime usage of max_sz should always be rounded up, it is
preferable to round it up once at compile time instead, avoiding
further mistakes of that sort. The existing _ALIGN4() usages on
p->max_sz at run time then become redundant.
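One way to achieve that is for the pool definition macro to store the
already rounded value in the pool descriptor (illustrative fragment;
the surrounding initializer is elided):

/* Sketch: round once at definition time so every run-time use of
 * p->max_sz is already 4-byte aligned.
 */
.max_sz = _ALIGN4(maxsz),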
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>