drivers/mmcsd/mmcsd_sdio.c: Fix an error that was causing SDIO multiple block transfers from achieving full performance. Also adds a feature controlled by CONFIG_SDIO_BLK_CALLBACK that bypasses the normal mechanism for obtaining sector data for transfer.

This commit is contained in:
Bob Feretich 2018-07-20 09:55:33 -06:00 committed by Gregory Nutt
parent f9f56aa254
commit 17b6c7ba39
4 changed files with 457 additions and 17 deletions

View File

@ -129,4 +129,14 @@ config SDIO_BLOCKSETUP
number of blocks. Others just work on the byte stream. This option
enables the block setup method in the SDIO vtable.
config SDIO_BLK_CALLBACK
bool "SDIO block callback"
depends on !DRVR_WRITEBUFFER && !MMCSD_MULTIBLOCK_DISABLE
default n
---help---
Some hardware is RAM constrained and therefore unable to utilize the more
efficient Multiple Block Write (CMD25) mechanism. Selecting this option
will permit the driver to perform Multiple Block Writes from as few as one
512-byte data buffer. See nuttx/drivers/mmcsd/README.txt for details.
endif

140
drivers/mmcsd/README.txt Normal file
View File

@ -0,0 +1,140 @@
General SD-card Operation (Bob Feretich)
=========================
SD-cards are not passive memory devices. They contain a microcontroller,
input/output buffers, and flash memory. Data being written to an SD-card
is received by the microcontroller, CRC checked, and then placed in an input
buffer. The size of these input buffers vary, but most modern (SDHC (4GB+) and
newer) cards have at least 16 KB of input buffers and are able to store the
contents of the entire input buffer into the flash memory in one "program"
operation. This "program" operation accounts for most of the delay of the
write operation.
When a Single Block Write (CMD24) is performed, a sector (512 bytes) of data
is transfered to the SD-card, the data is placed in an input buffer, an erased
area of flash memory is obtained/identified, and the data in the input buffer
is programmed into the flash memory.
When a Multiple Block Write (CMD25) is performed, one or more sectors
(512 bytes each) of data are transfered to the SD-card, the data is CRC
checked and placed in the input buffer, when the buffer is full, the contents
are programmed into the flash memory. During the programming operation,
the SD-card will hold-off the transfer of additional blocks from the mmcsd
driver by becoming "busy". Then when it can accept more data, the card drops
its busy indication and the transfer of data blocks continue. When the mmcsd
driver has sent the number of blocks that it was requested to send, the mmcsd
driver issues a Stop Transmission CMD12 to the SD-card. This command causes
the SD-card to program any data remaining in the input buffer into the
flash memory.
During a Multiple Block Write, obtaining erased flash memory areas may become
a bottleneck. (Erasing flash takes much longer than programming it.)
To eliminate this delay, the mmcsd driver issues a SET_WR_BLK_ERASE_COUNT
ACMD23 to notify the SD-card of the needed number of erased blocks in advance.
The mmcsd driver will automatically use Multiple Block Writes when mmcsd_write
is called with nsectors greater than one, unless CONFIG_MMCSD_MULTIBLOCK_DISABLE
is defined.
The SDIO block callback option (Bob Feretich)
==============================
The best SD-card write performance is achieved when Multiple Block Writes are
used to take advantage of the large input buffers in the SD-cards. By calling
mmcsd_write with a 16 KB buffer (32 sectors of data) I was able to consistently
achieve greater than 1.7 MB / second on a Class 10 SDHC SD-card. (stm32f7
microcontroller with the sdio clock at 16 MHz; time is averaged over writing
100 MB of data) Performance using Single Block Writes was less than
100 KB / second. Note that new cards had much higher initial performance, but
the performance degraded to (and seems to have stabilized at) this level after
writing several gigabytes to the card.
But even with the resources of a stm32f7 dealing with 16 KB DMA friendly
buffers is very difficult. (The stm32 DTCM memory region is 64 KB. That is
only four 16K buffers.) The designers of the SD-card architecture knew this
and made the SD-card interface flexible enough so that a microcontroller
with only a single 512-byte buffer can take advantage of Multiple Block Write
speeds.
The feature in the SD-card interface that makes this possible is the ability
for the interface to be paused between blocks of a Multiple Block Write.
(After a block of data is transferred to the SD-card, the buffer can be
refilled and its contents transferred again as needed. Or better yet, a double
buffering architecture can be used, so while one buffer is being transferred
to the SD-card, another buffer could be being filled.)
Unfortunately, the POSIX interfaces of the operating system and the FAT
File System do not facilitate buffers being refilled or swapped in the middle
of a write operation. But there is a way to do this and make both the
operating system and the FAT File System happy.
The POSIX write function is defined as...
ssize_t (*write)(FAR struct file *filep, FAR const char *buffer, size_t buflen);
This configuration option redefines the interpretation of <buffer> from a
pointer directly to the data buffer to a pointer to a structure that contains
the callback linkage to the caller's function that provides the address of the
next block to be transferred. The <buffer> itself is opaque to Nuttx. The most
Nuttx is permitted to do is check that <buffer> specifies a memory address
that is valid for data transfer.
But, why reinterpret write() when ioctl() is provided for these special
circumstances? We choose to provide a reinterpretation of write() because
file systems like fatfs do not deal with ioctl() forms of write, and this
reinterpretation permits Multiple Block Writes to utilize the file system
and therefore can generate media that can be read on other computers.
Nuttx passes a write() upon an opened SD-card file to the fatfs file system...
ssize_t fat_write(FAR struct file *filep, FAR const char *buffer, size_t buflen);
<buflen> is important for directory management and FAT cluster assignment, and
since it contains the correct length of data to be written, these functions
will perform as designed.
If <buflen> is an even multiple of 512, then the Nuttx implementation of fatfs
provides the option of data transfer directly from the user's buffer. When
this option is utilized, fatfs only checks that the address contained in
<buffer> specifies a valid DMA transfer region. So if the callback linkage
structure pointed to by <buffer> is located in DMA capable RAM (not
necessarily DTCM), then fatfs is happy and will pass the write to the
mmcsd driver...
ssize_t mmcsd_write(FAR struct inode *inode, FAR const unsigned char *buffer,
size_t startsector, unsigned int nsectors);
When CONFIG_SDIO_BLK_CALLBACK is defined, then the mmcsd driver's
mmcsd_writemultiple function will interpret <buffer> as a pointer to a
struct sdsector_callback_s...
typedef FAR uint8_t *(*sdsector_callback_t)( size_t );
struct sdsector_callback_s {
sdsector_callback_t callback;
};
Then when the mmcsd driver is ready to transmit a sector of data to the ,
SD-card it will call the provided callback function and the function is
expected to return the address of the next 512 bytes of data to be written.
If NULL is returned, then the mmcsd driver will issue a Stop Transmission
CMD12 to end the write. If the write is ended prematurely, then the blocks
already written will contain the new data, but blocks designated for erasure
may only be partially erased.
The below measurements were made on a stm32f722 (216 MHz, 16 MHz sdio clock)
using a Patriot SDHC 4 GB SD-card. (Several GBs of data were written to the
card before it was reformatted and used for these tests.
Sectors-per-Write Buffer_Size_Used Average_Data_Rate Longest_Delay Via_fatfs
1 512 Bytes 50-75 KB/s* 490 ms yes
1 512 Bytes 50-75 KB/s* 490 ms no
32 512 Bytes 1750 KB/s 330 ms yes
32 512 Bytes 1850 KB/s 300 ms no
32 16K Bytes 1875 KB/s 310 ms yes
32 16K Bytes 1925 KB/s 300 ms no
* The range shown depicts run to run variations that exceeded any noticeable
performance difference between these two rows. The single sector write tests
wrote only 10MB of data per run. Other tests wrote 100 MB of data per run.
Other measurements showed +/- 10% run to run variations. Note that new SD-card
performance was much faster than these numbers, but the performance drops off
quickly as the card is written.
Usage notes:
* The architecture of the SDIO adapter in the MCU must permit configuring
DMA after the CMD25 is issued.
* When using Multiple Block Writes through the file system, a Multiple Block
Write must not span a file system cluster boundary. (Cluster's in a file
may not be contiguous.)

View File

@ -1,8 +1,9 @@
/****************************************************************************
* drivers/mmcsd/mmcsd_sdio.c
*
* Copyright (C) 2009-2013, 2016-2017 Gregory Nutt. All rights reserved.
* Copyright (C) 2009-2013, 2016-2018 Gregory Nutt. All rights reserved.
* Author: Gregory Nutt <gnutt@nuttx.org>
* Bob Feretich <bob.fereich@rafresearch.com>
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
@ -96,7 +97,7 @@
#define MMCSD_SCR_DATADELAY (100) /* Wait up to 100MS to get SCR */
#define MMCSD_BLOCK_RDATADELAY (100) /* Wait up to 100MS to get one data block */
#define MMCSD_BLOCK_WDATADELAY (200) /* Wait up to 200MS to write one data block */
#define MMCSD_BLOCK_WDATADELAY (230) /* Wait up to 230MS to write one data block */
#define IS_EMPTY(priv) (priv->type == MMCSD_CARDTYPE_UNKNOWN)
@ -127,7 +128,7 @@ struct mmcsd_state_s
uint8_t mode:2; /* (See MMCSDMODE_* definitions) */
uint8_t type:4; /* Card type (See MMCSD_CARDTYPE_* definitions) */
uint8_t buswidth:4; /* Bus widthes supported (SD only) */
uint8_t buswidth:4; /* Bus widths supported (SD only) */
sdio_capset_t caps; /* SDIO driver capabilities/limitations */
uint16_t selblocklen; /* The currently selected block length */
uint16_t rca; /* Relative Card Address (RCS) register */
@ -143,6 +144,7 @@ struct mmcsd_state_s
#else
uint32_t capacity; /* Total capacity of volume (Limited to 4Gb) */
#endif
/* Read-ahead and write buffering support */
#if defined(CONFIG_DRVR_WRITEBUFFER) || defined(CONFIG_DRVR_READAHEAD)
@ -1003,6 +1005,10 @@ static int mmcsd_getR1(FAR struct mmcsd_state_s *priv, FAR uint32_t *r1)
*/
priv->locked = ((localR1 & MMCSD_R1_CARDISLOCKED) != 0);
/* We must tell someone which error bits were set. */
fwarn("WARNING: mmcsd_getR1 returned errors: R1=%08x\n", localR1);
ret = -EIO;
}
else
@ -1786,6 +1792,248 @@ static ssize_t mmcsd_writesingle(FAR struct mmcsd_state_s *priv,
*
* Description:
* Write multiple, contiguous blocks of data to the physical device.
* This function expects that the data to be written will be provided
* by a user callback routine dynamically. Each time the callback
* function is called, it is expected to return with the address of the
* next sector of data to be written.
*
* This permits the system using it to work with smaller (512 byte) buffers
* and still perform efficient multi-sector mmcsd writes.
*
* Note the the SD-card spec permits delay and even card deselection
* between blocks of a CMD25 write, but the mmcsd driver will hold the
* mmcsd semaphore for this device until all promised buffer addresses
* are delivered, or the multi-sector write is canceled by the callback
* function returning a NULL pointer.
*
* If multiple tasks are accessing this device, then it is strongly
* recommended that the user callback return immediately.
*
****************************************************************************/
#ifdef PERMIT_SECTOR_BUFFER_CALLBACK
#if defined(CONFIG_FS_WRITABLE) && !defined(CONFIG_MMCSD_MULTIBLOCK_DISABLE)
static ssize_t mmcsd_writemultiple(FAR struct mmcsd_state_s *priv,
FAR const uint8_t *cbparm, off_t startblock,
size_t nblocks)
{
off_t offset;
size_t nbytes; /* Number of bytes to send on the next burst. */
size_t curblock = 0; /* Count of blocks sent. */
int ret;
int evret = OK;
FAR uint8_t *buffer;
FAR struct sdsector_callback_s *cbstruct
= (FAR struct sdsector_callback_s *) cbparm;
finfo("startblock=%d nblocks=%d\n", startblock, nblocks);
DEBUGASSERT(priv != NULL && cbparm != NULL && cbstruct != NULL &&
nblocks > 1);
/* Check if the card is locked or write protected (either via software or
* via the mechanical write protect on the card)
*/
if (mmcsd_wrprotected(priv))
{
ferr("ERROR: Card is locked or write protected\n");
return -EPERM;
}
/* Request the first buffer. */
buffer = (*(cbstruct->callback))(curblock);
if (buffer == NULL)
{
return 0; /* Write ended with no blocks written */
}
#if defined(CONFIG_SDIO_DMA) && defined(CONFIG_SDIO_PREFLIGHT)
/* If we think we are going to perform a DMA transfer, make sure that we
* will be able to before we commit the card to the operation.
*/
if ((priv->caps & SDIO_CAPS_DMASUPPORTED) != 0)
{
ret = SDIO_DMAPREFLIGHT(priv->dev, buffer, priv->blocksize);
if (ret != OK)
{
return ret;
}
}
#endif
/* Verify that the card is ready for the transfer. The card may still be
* busy from the preceding write transfer. It would be simpler to check
* for write busy at the end of each write, rather than at the beginning of
* each read AND write, but putting the busy-wait at the beginning of the
* transfer allows for more overlap and, hopefully, better performance
*/
ret = mmcsd_transferready(priv);
if (ret != OK)
{
ferr("ERROR: Card not ready: %d\n", ret);
return ret;
}
/* If this is a byte addressed SD card, then convert both the total transfer
* size to bytes and the sector start sector number to a byte offset
*/
nbytes = 1 << priv->blockshift;
if (IS_BLOCK(priv->type))
{
offset = startblock;
}
else
{
offset = startblock << priv->blockshift;
}
finfo("nbytes=%d byte offset=%d\n", nbytes, offset);
/* Select the block size for the card */
ret = mmcsd_setblocklen(priv, priv->blocksize);
if (ret != OK)
{
ferr("ERROR: mmcsd_setblocklen failed: %d\n", ret);
return ret;
}
/* If this is an SD card, then send ACMD23 (SET_WR_BLK_ERASE_COUNT) just
* before sending CMD25 (WRITE_MULTIPLE_BLOCK). This sets the number of
* write blocks to be pre-erased and might make the following multiple block
* write command faster.
*/
if (IS_SD(priv->type))
{
/* Send CMD55, APP_CMD, a verify that good R1 status is returned */
mmcsd_sendcmdpoll(priv, SD_CMD55, (uint32_t)priv->rca << 16);
ret = mmcsd_recvR1(priv, SD_CMD55);
if (ret != OK)
{
ferr("ERROR: mmcsd_recvR1 for CMD55 (ACMD23) failed: %d\n", ret);
return ret;
}
/* Send CMD23, SET_WR_BLK_ERASE_COUNT, and verify that good R1 status is returned */
mmcsd_sendcmdpoll(priv, SD_ACMD23, nblocks);
ret = mmcsd_recvR1(priv, SD_ACMD23);
if (ret != OK)
{
ferr("ERROR: mmcsd_recvR1 for ACMD23 failed: %d\n", ret);
return ret;
}
}
/* Send CMD25, WRITE_MULTIPLE_BLOCK, and verify that good R1 status
* is returned
*/
mmcsd_sendcmdpoll(priv, MMCSD_CMD25, offset);
ret = mmcsd_recvR1(priv, MMCSD_CMD25);
if (ret != OK)
{
ferr("ERROR: mmcsd_recvR1 for CMD25 failed: %d\n", ret);
return ret;
}
/* Loop sending a sector at a time. */
do
{
/* Configure SDIO controller hardware for the write transfer */
SDIO_BLOCKSETUP(priv->dev, priv->blocksize, 1);
SDIO_WAITENABLE(priv->dev,
SDIOWAIT_TRANSFERDONE | SDIOWAIT_TIMEOUT | SDIOWAIT_ERROR);
#ifdef CONFIG_SDIO_DMA
if ((priv->caps & SDIO_CAPS_DMASUPPORTED) != 0)
{
ret = SDIO_DMASENDSETUP(priv->dev, buffer, nbytes);
if (ret != OK)
{
finfo("SDIO_DMASENDSETUP: error %d\n", ret);
return ret;
}
}
else
#endif
{
SDIO_SENDSETUP(priv->dev, buffer, nbytes);
}
/* Flag that a write transfer is pending that we will have to check for
* write complete at the beginning of the next transfer.
*/
priv->wrbusy = true;
/* Wait for the transfer to complete */
evret = mmcsd_eventwait(priv, SDIOWAIT_TIMEOUT | SDIOWAIT_ERROR,
nblocks * MMCSD_BLOCK_WDATADELAY);
if (evret != OK)
{
ferr("ERROR: CMD25 transfer failed: %d\n", evret);
/* If we return from here, we probably leave the sd-card in
* Receive-data State. Instead, we will remember that
* an error occurred and try to execute the STOP_TRANSMISSION
* to put the sd-card back into Transfer State.
*/
}
/* Request the next buffer. */
if (++curblock < nblocks)
{
buffer = (*(cbstruct->callback))(curblock);
if (buffer == NULL)
{
break; /* Write ended with some blocks written */
}
}
}
while (curblock < nblocks && evret == OK);
/* Send STOP_TRANSMISSION */
ret = mmcsd_stoptransmission(priv);
if (evret != OK)
{
return evret;
}
if (ret != OK)
{
ferr("ERROR: mmcsd_stoptransmission failed: %d\n", ret);
return ret;
}
/* On success, return the number of blocks written */
return curblock;
}
#endif
#else /* PERMIT_SECTOR_BUFFER_CALLBACK is not defined */
/****************************************************************************
* Name: mmcsd_writemultiple
*
* Description:
* Write multiple, contiguous blocks of data to the physical device.
* This function expects that the data to be written is contained in
* one large buffer that is pointed to by buffer.
*
****************************************************************************/
@ -1797,6 +2045,7 @@ static ssize_t mmcsd_writemultiple(FAR struct mmcsd_state_s *priv,
off_t offset;
size_t nbytes;
int ret;
int evret = OK;
finfo("startblock=%d nblocks=%d\n", startblock, nblocks);
DEBUGASSERT(priv != NULL && buffer != NULL && nblocks > 1);
@ -1866,10 +2115,10 @@ static ssize_t mmcsd_writemultiple(FAR struct mmcsd_state_s *priv,
return ret;
}
/* If this is an SD card, then send ACMD23 (SET_WR_BLK_COUNT) just before
* sending CMD25 (WRITE_MULTIPLE_BLOCK). This sets the number of write
* blocks to be pre-erased and might make the following multiple block write
* command faster.
/* If this is an SD card, then send ACMD23 (SET_WR_BLK_ERASE_COUNT) just
* before sending CMD25 (WRITE_MULTIPLE_BLOCK). This sets the number of
* write blocks to be pre-erased and might make the following multiple block
* write command faster.
*/
if (IS_SD(priv->type))
@ -1884,9 +2133,11 @@ static ssize_t mmcsd_writemultiple(FAR struct mmcsd_state_s *priv,
return ret;
}
/* Send CMD23, SET_WR_BLK_COUNT, and verify that good R1 status is returned */
/* Send CMD23, SET_WR_BLK_ERASE_COUNT, and verify that good R1 status
* is returned.
*/
mmcsd_sendcmdpoll(priv, SD_ACMD23, 0);
mmcsd_sendcmdpoll(priv, SD_ACMD23, nblocks);
ret = mmcsd_recvR1(priv, SD_ACMD23);
if (ret != OK)
{
@ -1961,17 +2212,27 @@ static ssize_t mmcsd_writemultiple(FAR struct mmcsd_state_s *priv,
/* Wait for the transfer to complete */
ret = mmcsd_eventwait(priv, SDIOWAIT_TIMEOUT | SDIOWAIT_ERROR,
nblocks * MMCSD_BLOCK_WDATADELAY);
if (ret != OK)
evret = mmcsd_eventwait(priv, SDIOWAIT_TIMEOUT | SDIOWAIT_ERROR,
nblocks * MMCSD_BLOCK_WDATADELAY);
if (evret != OK)
{
ferr("ERROR: CMD25 transfer failed: %d\n", ret);
return ret;
ferr("ERROR: CMD25 transfer failed: %d\n", evret);
/* If we return from here, we probably leave the sd-card in
* Receive-data State. Instead, we will remember that
* an error occurred and try to execute the STOP_TRANSMISSION
* to put the sd-card back into Transfer State.
*/
}
/* Send STOP_TRANSMISSION */
ret = mmcsd_stoptransmission(priv);
if (evret != OK)
{
return evret;
}
if (ret != OK)
{
ferr("ERROR: mmcsd_stoptransmission failed: %d\n", ret);
@ -1983,6 +2244,7 @@ static ssize_t mmcsd_writemultiple(FAR struct mmcsd_state_s *priv,
return nblocks;
}
#endif
#endif
/****************************************************************************
* Name: mmcsd_flush
@ -3397,3 +3659,4 @@ errout_with_alloc:
}
#endif /* defined (CONFIG_MMCSD) && defined (CONFIG_MMCSD_SDIO) */

View File

@ -1,7 +1,7 @@
/********************************************************************************************
* drivers/mmcsd/mmcsd_sdio.h
*
* Copyright (C) 2009, 2011 Gregory Nutt. All rights reserved.
* Copyright (C) 2009, 2011, 2018 Gregory Nutt. All rights reserved.
* Author: Gregory Nutt <gnutt@nuttx.org>
*
* Redistribution and use in source and binary forms, with or without
@ -312,11 +312,39 @@ struct mmcsd_scr_s
uint8_t sdversion; /* 59:56 SD memory card physical layer version */
uint8_t erasestate; /* 55:55 Data state after erase (1 or 0) */
uint8_t security; /* 54:52 SD security support */
uint8_t buswidth; /* 51:48 DAT bus widthes supported */
uint8_t buswidth; /* 51:48 DAT bus widths supported */
/* 47:32 SD reserved space */
uint32_t mfgdata; /* 31:0 Reserved for manufacturing data */
};
#if defined(CONFIG_SDIO_BLK_CALLBACK) && !defined(CONFIG_DRVR_WRITEBUFFER) && \
!defined(CONFIG_MMCSD_MULTIBLOCK_DISABLE)
# define PERMIT_SECTOR_BUFFER_CALLBACK
#endif
#ifdef PERMIT_SECTOR_BUFFER_CALLBACK
/* When CONFIG_SDIO_BLK_CALLBACK is defined and a call is made to the
* mmcsd_write function specifying greater than one sector to be written, the
* mmcsd_write function will interpret its buffer parameter as a pointer to
* the below structure.
*
* This structure contains the address of a user callback function that will
* return a pointer to the memory containing the data for next sector to be
* written.
*
* The address of this structure and all addresses returned by the callback
* function must be aligned and positioned according to the DMA transfers
* rules of the architecture being used.
*/
typedef FAR uint8_t *(*sdsector_callback_t)(size_t);
struct sdsector_callback_s
{
sdsector_callback_t callback;
};
#endif
/********************************************************************************************
* Public Data
********************************************************************************************/
@ -334,7 +362,6 @@ extern "C"
* Public Functions
********************************************************************************************/
#undef EXTERN
#if defined(__cplusplus)
}