When small blocks are recombined to create a single block at a shallower
level, it is sufficient to remove those blocks from the free list. There
is no need to mark those small blocks as allocated in the bitmap.
This, in turn, removes the need to mark small blocks back as unallocated
when splitting up a big block, as they'll already be so marked.
Only the first small block needs to be marked allocated and the
remaining blocks only need to be added to the free list.
This makes the code smaller and more efficient, especially since those
removed bit manipulations were located within loops.
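As a sketch of the resulting split path (helper names are hypothetical,
not the actual Zephyr functions), only one bitmap update remains:

static void *block_split(struct sys_mem_pool_base *p, int lvl, int bn)
{
	set_alloc_bit(p, lvl + 1, 4 * bn);	/* only the first sub-block */

	for (int i = 1; i < 4; i++) {
		/* siblings keep their clear alloc bit; just list them */
		free_list_add(p, lvl + 1, 4 * bn + i);
	}

	return block_ptr(p, lvl + 1, 4 * bn);
}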
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
This turns the free-bit flag into an alloc-bit flag, effectively
reversing its semantics. This is to make further changes more natural
and easier to understand.
No need to clear the alloc bits at init time as they're located in .bss
and all clear already.
The code remains functionally equivalent after this change.
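For illustration, with hypothetical helper names, the polarity flip
looks like this:

static inline bool block_is_free(u32_t *bits, int bn)
{
	return !get_bit(bits, bn);	/* was: return get_bit(bits, bn); */
}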
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
The realloc function was a bit too intimate with the mempool accounting.
Abstract that knowledge away and move it where it belongs.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
The mempool allocator implementation recursively breaks a memory block
into 4 sub-blocks until it minimally fits the requested memory size.
The size of each sub-block is rounded up to the next word boundary to
preserve word alignment on the returned memory, and this is a problem.
Let's consider max_sz = 2072 and n_max = 1. That's our level 0.
At level 1, we get one level-0 block split in 4 sub-blocks whose size
is WB_UP(2072 / 4) = 520. However 4 * 520 = 2080 so we must discard the
4th sub-block since it doesn't fit inside our 2072-byte parent block.
We're down to 3 * 520 = 1560 bytes of usable memory.
Our memory usage efficiency is now 1560 / 2072 = 75%.
At level 2, we get 3 level-1 blocks, and each of them may be split
in 4 sub-blocks whose size is WB_UP(520 / 4) = 132. But 4 * 132 = 528
so the 4th sub-block has to be discarded again.
We're down to 9 * 132 = 1188 bytes of usable memory.
Our memory usage efficiency is now 1188 / 2072 = 57%.
At level 3, we get 9 level-2 blocks, each split into sub-blocks of
WB_UP(132 / 4) = 36 bytes. Again 4 * 36 = 144, so the 4th sub-block is
discarded.
We're down to 27 * 36 = 972 bytes of usable memory.
Our memory usage efficiency is now 972 / 2072 = 47%.
What should be done instead is to round sub-block sizes _down_, not
_up_. This way, sub-blocks still align to word boundaries, and they
always fit within their parent block since the total size can no
longer exceed the parent's size.
Using the same max_sz = 2072 would yield a memory usage efficiency of
99% at level 3, so let's demo a worst-case value of 2044 instead.
Level 1: 4 sub-blocks of WB_DN(2044 / 4) = 508 bytes.
We're down to 4 * 508 = 2032 bytes of usable memory.
Our memory usage efficiency is now 2032 / 2044 = 99%.
Level 2: 4 * 4 sub-blocks of WB_DN(508 / 4) = 124 bytes.
We're down to 16 * 124 = 1984 bytes of usable memory.
Our memory usage efficiency is now 1984 / 2044 = 97%.
Level 3: 16 * 4 sub-blocks of WB_DN(124 / 4) = 28 bytes.
We're down to 64 * 28 = 1792 bytes of usable memory.
Our memory usage efficiency is now 1792 / 2044 = 88%.
Conclusion: if max_sz is a power of 2 then we get 100% efficiency at
all levels in both cases. But if not, then the rounding-up method has
a far worse degradation curve than the rounding-down method, wasting
more than 50% of memory in some cases.
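The degradation curves are easy to reproduce with a small standalone
program, with the word-boundary macros reduced to the 4-byte word of
the examples above:

#include <stdio.h>

#define WB_UP(x) (((x) + 3) & ~3u)	/* round up to 4-byte boundary */
#define WB_DN(x) ((x) & ~3u)		/* round down to 4-byte boundary */

int main(void)
{
	unsigned int max_sz = 2072;
	unsigned int up = max_sz, dn = max_sz;
	unsigned int n_up = 1, n_dn = 1;

	for (int lvl = 1; lvl <= 3; lvl++) {
		unsigned int parent_up = up, parent_dn = dn;

		up = WB_UP(parent_up / 4);
		dn = WB_DN(parent_dn / 4);

		/* rounding up may push the 4th sub-block past its parent */
		n_up *= (4 * up > parent_up) ? 3 : 4;
		/* rounding down guarantees 4 * dn <= parent_dn */
		n_dn *= 4;

		printf("level %d: up %u*%u=%u (%.1f%%), down %u*%u=%u (%.1f%%)\n",
		       lvl, n_up, up, n_up * up, 100.0 * n_up * up / max_sz,
		       n_dn, dn, n_dn * dn, 100.0 * n_dn * dn / max_sz);
	}
	return 0;
}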
So let's round sub-block sizes down rather than up, and remove
block_fits(), whose purpose was to identify sub-blocks that didn't
fit within their parent block and which is now useless.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
The accounting data stored at the beginning of a memory block used by
malloc must push the returned memory address to a word boundary. This
is already the case on 32-bit systems, but not on 64-bit systems where
e.g. struct k_mem_block_id still has a size of 4.
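A minimal sketch of the constraint (not the actual patch; names are
illustrative):

#include <stdint.h>

#define WB_UP(x) (((x) + sizeof(void *) - 1) & ~(sizeof(void *) - 1))

struct blk_hdr { uint32_t data; };	/* 4 bytes, like k_mem_block_id */

static void *user_ptr(void *block_start)
{
	/* pad the header to 4 bytes on 32-bit, 8 on 64-bit, so the
	 * address handed back to the caller stays word-aligned */
	return (uint8_t *)block_start + WB_UP(sizeof(struct blk_hdr));
}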
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
The "bits" field in struct sys_mem_pool_lvl is unioned with a pointer.
That leaves more space for inline free bits on 64-bit targets.
Let's declare it as an array and adjust its size based on the pointer
size. On 32-bit targets the generated code remains identical.
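A sketch of the resulting declaration (field names assumed for
illustration):

union {
	u32_t *bits_p;			/* out-of-line bitmap */
	u32_t bits[sizeof(u32_t *)/4];	/* 1 word on 32-bit, 2 on 64-bit */
};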
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Minimum alignment and rounding must be done on a word boundary. Let's
replace _ALIGN4() with WB_UP() which is equivalent on 32-bit targets,
and 64-bit aware.
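A sketch of the relationship, assuming Zephyr's ROUND_UP() helper (the
real WB_UP() definition lives in the headers):

#define _ALIGN4(x) ROUND_UP(x, 4)		/* fixed 4-byte rounding */
#define WB_UP(x)   ROUND_UP(x, sizeof(void *))	/* 4 on 32-bit, 8 on 64-bit */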
Also enforce a minimal alignment on the memory pool. This makes a
difference mostly on 64-bit targets, where the widely used 4-byte
alignment is not sufficient.
The _ALIGN4() macro has no users left so it is removed.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
move misc/mempool.h to sys/mempool.h and
create a shim for backward-compatibility.
No functional changes to the headers.
A warning in the shim can be controlled with CONFIG_COMPAT_INCLUDES.
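The shim plausibly looks like this (the exact warning text is an
assumption):

#ifndef CONFIG_COMPAT_INCLUDES
#warning "This header file has moved, include <sys/mempool.h> instead."
#endif

#include <sys/mempool.h>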
Related to #16539
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
move misc/mempool_base.h to sys/mempool_base.h and
create a shim for backward-compatibility.
No functional changes to the headers.
A warning in the shim can be controlled with CONFIG_COMPAT_INCLUDES.
Related to #16539
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
move misc/__assert.h to sys/__assert.h and
create a shim for backward-compatibility.
No functional changes to the headers.
A warning in the shim can be controlled with CONFIG_COMPAT_INCLUDES.
Related to #16539
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
The free block bitmap uses either extra memory specified by a pointer
in struct sys_mem_pool_lvl or the space occupied by that pointer
directly if the bitmap length is small enough to fit it.
But the test is wrong: the inline bitmap should be used if the number
of required bits is smaller than or _equal_ to the pointer size. Not doing so
would wrongly bounce the free block bitmap to extra memory when the
number of blocks is exactly 32, which is in disagreement with
Z_MPOOL_LBIT_WORDS() that correctly returns 0 in that case.
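A sketch of the corrected test (field names are assumptions): a u32_t
pointer provides 8 * sizeof(u32_t *) inline bits, so exactly 32 blocks
still fit inline on 32-bit targets:

if (nblocks <= 8 * sizeof(l->bits_p)) {
	bits = l->bits;		/* inline bitmap, no extra memory */
} else {
	bits = l->bits_p;	/* bounce to the extra bitmap area */
}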
In theory this bug would cause an overflow of the free block bitmap
whenever one level has exactly 32 blocks. But right now there is a
separate bug, fixed separately, that over-sizes the extra block
bitmap, mitigating this one.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
The block_fits() predicate was borked. It would check that a block
fits within the bounds of the whole heap. But that's not enough:
because of alignment changes between levels the sub-blocks may be
adjusted forward. It needs to fit inside the PARENT block that it was
split from.
What could happen at runtime is that the last sub-blocks of a
misaligned parent block would overlap memory from subsequent blocks,
or even run off the end of the heap. That's bad.
Change the API of block_fits() a little so it can extract the parent
region and do this properly.
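An illustrative sketch of the tightened check (not the actual patch;
the signature is simplified): a level-lvl block must end within the
parent it was split from, whose index is bn / 4:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool block_fits(uint8_t *buf, const size_t *lsizes, int lvl, int bn)
{
	uint8_t *parent = buf + (size_t)(bn / 4) * lsizes[lvl - 1];
	uint8_t *start  = buf + (size_t)bn * lsizes[lvl];

	return start + lsizes[lvl] <= parent + lsizes[lvl - 1];
}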
Fixes #15279. Passes the test introduced in #16728 to demonstrate what
seems like the same issue.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
In z_sys_mem_pool_block_alloc() the size of the first level block
allocation is rounded up to the next 4-byte boundary. This means one
or more of the trailing blocks could overlap the free block bitmap.
Let's consider this code from kernel.h:
#define K_MEM_POOL_DEFINE(name, minsz, maxsz, nmax, align) \
char __aligned(align) _mpool_buf_##name[_ALIGN4(maxsz * nmax) \
+ _MPOOL_BITS_SIZE(maxsz, minsz, nmax)]; \
The static pool allocation rounds up the product of maxsz and nmax, not
the size of individual blocks. If we have, say, maxsz = 10 and nmax = 20,
the result of _ALIGN4(10 * 20) is 200. That's the offset at which the
free block bitmap will be located.
However, because z_sys_mem_pool_block_alloc() does this:
lsizes[0] = _ALIGN4(p->max_sz);
Individual level 0 blocks will have a size of 12, not 10. That means
the 17th block will extend up to offset 204, 18th block up to 216, 19th
block to 228, and 20th block to 240. So 4 out of the 20 blocks are
overflowing the static pool area and 3 of them are even located
completely outside of it.
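This arithmetic can be reproduced with a small standalone program (the
macro reduced to its 32-bit form):

#include <stdio.h>

#define _ALIGN4(x) (((x) + 3) & ~3u)

int main(void)
{
	unsigned int maxsz = 10, nmax = 20;

	/* offset of the free block bitmap, per K_MEM_POOL_DEFINE above */
	unsigned int bitmap_off = _ALIGN4(maxsz * nmax);	/* 200 */
	/* level 0 block size, per z_sys_mem_pool_block_alloc() */
	unsigned int blk_sz = _ALIGN4(maxsz);			/* 12 */

	for (unsigned int i = 1; i <= nmax; i++) {
		unsigned int end = i * blk_sz;

		if (end > bitmap_off) {
			printf("block %u ends at offset %u, past %u\n",
			       i, end, bitmap_off);
		}
	}
	return 0;
}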
In this example, we have only 20 blocks that can't be split so there is
no extra free block bitmap allocation beyond the bitmap embedded in the
sys_mem_pool_lvl structure. This means that memory corruption will
happen in whatever data is located alongside the _mpool_buf_##name
array. But even with, say, 40 blocks, or larger blocks, the extra bitmap
size would be small compared to the extent of the overflow, and it would
get corrupted too, of course.
And the data corruption will happen even without allocating any memory
since z_sys_mem_pool_base_init() stores free_list pointer nodes into
those blocks, which in turn may get corrupted if that other data is
later modified instead.
Fixing this issue is simple: rounding on the static pool allocation is
"misparenthesized". Let's turn
_ALIGN4(maxsz * nmax)
into
_ALIGN4(maxsz) * nmax
But that's not sufficient.
In z_sys_mem_pool_base_init() we have:
size_t buflen = p->n_max * p->max_sz, sz = p->max_sz;
u32_t *bits = (u32_t *)((u8_t *)p->buf + buflen);
Considering the same parameters as above, here we're locating the extra
free block bitmap at offset `buflen`, which is 20 * 10 = 200, again
within the reach of the last 4 memory blocks. If the number of blocks
gets past
the size of the embedded bitmap, it will overlap memory blocks.
Also, the block_ptr() call used here to initialize the free block linked
list uses unrounded p->max_sz, meaning that it is initially not locating
dlist nodes within the same block boundaries as what is expected from
z_sys_mem_pool_block_alloc(). This opens the possibility for allocated
adjacent blocks to overwrite dlist nodes, leading to random crashes in
the future.
So a complete fix must round up p->max_sz here too.
Given that runtime usage of max_sz should always be rounded up, it is
then preferable to round it up once at compile time instead and avoid
further mistakes of that sort. The existing _ALIGN4() usages on
p->max_sz at run time then become redundant.
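A sketch of the resulting definition (internals paraphrased; only the
buffer sizing is shown, with the descriptor storing the pre-rounded
size):

#define K_MEM_POOL_DEFINE(name, minsz, maxsz, nmax, align) \
	char __aligned(align) _mpool_buf_##name[_ALIGN4(maxsz) * nmax \
		+ _MPOOL_BITS_SIZE(maxsz, minsz, nmax)];
/* ... remainder elided; .max_sz is initialized to _ALIGN4(maxsz) so
 * the runtime _ALIGN4() calls can go away */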
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Permission management is no longer necessary; the former
parameter for the mutex is now simply ignored.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Do not perform the early level usage check. It can lead to a situation
where a block is seen as available on a level when it was actually
taken from another context.
Fixes: #14504
Signed-off-by: Pawel Dunaj <pawel.dunaj@nordicsemi.no>
Update reserved function names starting with one underscore, replacing
them as follows:
'_k_' with 'z_'
'_K_' with 'Z_'
'_handler_' with 'z_handl_'
'_Cstart' with 'z_cstart'
'_Swap' with 'z_swap'
This renaming is done on both global and static function names
in kernel/include and include/. Other static function names in kernel/
are renamed by removing the leading underscore. Other function names
not starting with any prefix listed above are renamed starting with
a 'z_' or 'Z_' prefix.
Function names starting with two or three leading underscores are not
automatically renamed since these names will collide with the variants
with two or three leading underscores.
Various generator scripts have also been updated as well as perf,
linker and usb files. These are
drivers/serial/uart_handlers.c
include/linker/kobject-text.ld
kernel/include/syscall_handler.h
scripts/gen_kobject_list.py
scripts/gen_syscall_header.py
Signed-off-by: Patrik Flykt <patrik.flykt@intel.com>
MISRA rules (see #11425) forbid recursive algorithms. In the case of
rb_walk(), it's not actually used anywhere but a test right now, so we
can simply disable the API when CONFIG_MISRA_SANE is defined. Mempool
had a (IMHO, fairly clever) tail recursive loop in bfree_recombine()
which can be trivially transformed into an only slightly uglier
iterative version.
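A generic sketch of the transformation (helper names hypothetical, not
the actual Zephyr diff):

static void bfree_recombine(struct sys_mem_pool_base *p, int level, int bn)
{
	while (level > 0) {
		if (!all_four_siblings_free(p, level, bn)) {
			return;		/* nothing left to merge */
		}
		merge_into_parent(p, level, bn);

		/* was: return bfree_recombine(p, level - 1, bn / 4); */
		level--;
		bn /= 4;
	}
}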
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
MISRA rules (see #9892) forbid alloca() and family, even though those
features can be valuable performance and memory size optimizations
useful to Zephyr.
Introduce a MISRA_SANE kconfig which, when true, enables a gcc error
condition whenever a variable length array is used.
When enabled, the mempool code will use a theoretical-maximum array
size on the stack instead of one tailored to the current pool
configuration.
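The pattern looks roughly like this (array name and bound are
assumptions):

#ifdef CONFIG_MISRA_SANE
	size_t lsizes[Z_MPOOL_MAX_LEVELS];	/* theoretical maximum */
#else
	size_t lsizes[p->n_levels];		/* VLA sized for this pool */
#endif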
The rbtree code will do similarly, but because the theoretical maximum
is quite a bit larger (236 bytes on 32-bit platforms) the array is
placed into struct rbtree instead so it can live in static data (and
also so I don't have to go and retune all the test stack sizes!).
Current code only uses at most two of these (one in the scheduler when
SCHED_SCALABLE is selected, and one for dynamic kernel objects when
USERSPACE and DYNAMIC_OBJECTS are set).
This tunable is false by default, but is selected in a single test (a
subcase of tests/kernel/common) for coverage. Note that the I2C and
SPI subsystems contain uncorrected VLAs, so a few platforms need to be
blacklisted with a filter.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
lib/ was starting to get messy and inconsistent, with files either
dumped in the root or scattered across sub-directories without a clear
plan.
Move all library components into one single folder and call it 'os'.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>