* USING SPIFFS TODO * SPIFFS DESIGN Spiffs is inspired by YAFFS. However, YAFFS is designed for NAND flashes, and for bigger targets with much more ram. Nevertheless, many wise thoughts have been borrowed from YAFFS when writing spiffs. Kudos! The main complication writing spiffs was that it cannot be assumed the target has a heap. Spiffs must go along only with the work ram buffer given to it. This forces extra implementation on many areas of spiffs. ** SPI flash devices using NOR technology Below is a small description of how SPI flashes work internally. This is to give an understanding of the design choices made in spiffs. SPI flash devices are physically divided in blocks. On some SPI flash devices, blocks are further divided into sectors. Datasheets sometimes name blocks as sectors and vice versa. Common memory capacaties for SPI flashes are 512kB up to 8MB of data, where blocks may be 64kB. Sectors can be e.g. 4kB, if supported. Many SPI flashes have uniform block sizes, whereas others have non-uniform - the latter meaning that e.g. the first 16 blocks are 4kB big, and the rest are 64kB. The entire memory is linear and can be read and written in random access. Erasing can only be done block- or sectorwise; or by mass erase. SPI flashes can normally be erased from 100.000 up to 1.000.000 cycles before they fail. A clean SPI flash from factory have all bits in entire memory set to one. A mass erase will reset the device to this state. Block or sector erasing will put the all bits in the area given by the sector or block to ones. Writing to a NOR flash pulls ones to zeroes. Writing 0xFF to an address is simply a no-op. Writing 0b10101010 to a flash address holding 0b00001111 will yield 0b00001010. This way of "write by nand" is used considerably in spiffs. Common characteristics of NOR flashes are quick reads, but slow writes. And finally, unlike NAND flashes, NOR flashes seem to not need any error correction. They always write correctly I gather. ** Spiffs logical structure Some terminology before proceeding. Physical blocks/sectors means sizes stated in the datasheet. Logical blocks and pages is something the integrator choose. ** Blocks and pages Spiffs is allocated to a part or all of the memory of the SPI flash device. This area is divided into logical blocks, which in turn are divided into logical pages. The boundary of a logical block must coincide with one or more physical blocks. The sizes for logical blocks and logical pages always remain the same, they are uniform. Example: non-uniform flash mapped to spiffs with 128kB logical blocks PHYSICAL FLASH BLOCKS SPIFFS LOGICAL BLOCKS: 128kB +-----------------------+ - - - +-----------------------+ | Block 1 : 16kB | | Block 1 : 128kB | +-----------------------+ | | | Block 2 : 16kB | | | +-----------------------+ | | | Block 3 : 16kB | | | +-----------------------+ | | | Block 4 : 16kB | | | +-----------------------+ | | | Block 5 : 64kB | | | +-----------------------+ - - - +-----------------------+ | Block 6 : 64kB | | Block 2 : 128kB | +-----------------------+ | | | Block 7 : 64kB | | | +-----------------------+ - - - +-----------------------+ | Block 8 : 64kB | | Block 3 : 128kB | +-----------------------+ | | | Block 9 : 64kB | | | +-----------------------+ - - - +-----------------------+ | ... | | ... | A logical block is divided further into a number of logical pages. A page defines the smallest data holding element known to spiffs. Hence, if a file is created being one byte big, it will occupy one page for index and one page for data - it will occupy 2 x size of a logical page on flash. So it seems it is good to select a small page size. Each page has a metadata header being normally 5 to 9 bytes. This said, a very small page size will make metadata occupy a lot of the memory on the flash. A page size of 64 bytes will waste 8-14% on metadata, while 256 bytes 2-4%. So it seems it is good to select a big page size. Also, spiffs uses a ram buffer being two times the page size. This ram buffer is used for loading and manipulating pages, but it is also used for algorithms to find free file ids, scanning the file system, etc. Having too small a page size means less work buffer for spiffs, ending up in more reads operations and eventually gives a slower file system. Choosing the page size for the system involves many factors: - How big is the logical block size - What is the normal size of most files - How much ram can be spent - How much data (vs metadata) must be crammed into the file system - How fast must spiffs be - Other things impossible to find out So, choosing the Optimal Page Size (tm) seems tricky, to say the least. Don't fret - there is no optimal page size. This varies from how the target will use spiffs. Use the golden rule: ~~~ Logical Page Size = Logical Block Size / 256 ~~~ This is a good starting point. The final page size can then be derived through heuristical experimenting for us non-analytical minds. ** Objects, indices and look-ups A file, or an object as called in spiffs, is identified by an object id. Another YAFFS rip-off. This object id is a part of the page header. So, all pages know to which object/file they belong - not counting the free pages. An object is made up of two types of pages: object index pages and data pages. Data pages contain the data written by user. Index pages contain metadata about the object, more specifically what data pages are part of the object. The page header also includes something called a span index. Let's say a file is written covering three data pages. The first data page will then have span index 0, the second span index 1, and the last data page will have span index 2. Simple as that. Finally, each page header contain flags, telling if the page is used, deleted, finalized, holds index or data, and more. Object indices also have span indices, where an object index with span index 0 is referred to as the object index header. This page does not only contain references to data pages, but also extra info such as object name, object size in bytes, flags for file or directory, etc. If one were to create a file covering three data pages, named e.g. "spandex-joke.txt", given object id 12, it could look like this: PAGE 0 PAGE 1 page header: [obj_id:12 span_ix:0 flags:USED|DATA] PAGE 2 page header: [obj_id:12 span_ix:1 flags:USED|DATA] PAGE 3 page header: [obj_id:545 span_ix:13 flags:USED|DATA] PAGE 4 page header: [obj_id:12 span_ix:2 flags:USED|DATA] PAGE 5 page header: [obj_id:12 span_ix:0 flags:USED|INDEX] obj ix header: [name:spandex-joke.txt size:600 bytes flags:FILE] obj ix: [1 2 4] Looking in detail at page 5, the object index header page, the object index array refers to each data page in order, as mentioned before. The index of the object index array correlates with the data page span index. entry ix: 0 1 2 obj ix: [1 2 4] | | | PAGE 1, DATA, SPAN_IX 0 --------/ | | PAGE 2, DATA, SPAN_IX 1 --------/ | PAGE 4, DATA, SPAN_IX 2 --------/ Things to be unveiled in page 0 - well.. Spiffs is designed for systems low on ram. We cannot keep a dynamic list on the whereabouts of each object index header so we can find a file fast. There might not even be a heap! But, we do not want to scan all page headers on the flash to find the object index header. The first page(s) of each block contains the so called object look-up. These are not normal pages, they do not have a header. Instead, they are arrays pointing out what object-id the rest of all pages in the block belongs to. By this look-up, only the first page(s) in each block must to scanned to find the actual page which contains the object index header of the desired object. The object lookup is redundant metadata. The assumption is that it presents less overhead reading a full page of data to memory from each block and search that, instead of reading a small amount of data from each page (i.e. the page header) in all blocks. Each read operation from SPI flash normally contains extra data as the read command itself and the flash address. Also, depending on the underlying implementation, other criterions may need to be passed for each read transaction, like mutexes and such. The veiled example unveiled would look like this, with some extra pages: PAGE 0 [ 12 12 545 12 12 34 34 4 0 0 0 0 ...] PAGE 1 page header: [obj_id:12 span_ix:0 flags:USED|DATA] ... PAGE 2 page header: [obj_id:12 span_ix:1 flags:USED|DATA] ... PAGE 3 page header: [obj_id:545 span_ix:13 flags:USED|DATA] ... PAGE 4 page header: [obj_id:12 span_ix:2 flags:USED|DATA] ... PAGE 5 page header: [obj_id:12 span_ix:0 flags:USED|INDEX] ... PAGE 6 page header: [obj_id:34 span_ix:0 flags:USED|DATA] ... PAGE 7 page header: [obj_id:34 span_ix:1 flags:USED|DATA] ... PAGE 8 page header: [obj_id:4 span_ix:1 flags:USED|INDEX] ... PAGE 9 page header: [obj_id:23 span_ix:0 flags:DELETED|INDEX] ... PAGE 10 page header: [obj_id:23 span_ix:0 flags:DELETED|DATA] ... PAGE 11 page header: [obj_id:23 span_ix:1 flags:DELETED|DATA] ... PAGE 12 page header: [obj_id:23 span_ix:2 flags:DELETED|DATA] ... ... Ok, so why are page 9 to 12 marked as 0 when they belong to object id 23? These pages are deleted, so this is marked both in page header flags and in the look up. This is an example where spiffs uses NOR flashes "nand-way" of writing. As a matter of fact, there are two object id's which are special: obj id 0 (all bits zeroes) - indicates a deleted page in object look up obj id 0xff.. (all bits ones) - indicates a free page in object look up Actually, the object id's have another quirk: if the most significant bit is set, this indicates an object index page. If the most significant bit is zero, this indicates a data page. So to be fully correct, page 0 in above example would look like this: PAGE 0 [ 12 12 545 12 *12 34 34 *4 0 0 0 0 ...] where the asterisk means the msb of the object id is set. This is another way to speed up the searches when looking for object indices. By looking on the object id's msb in the object lookup, it is also possible to find out whether the page is an object index page or a data page.