[riot-notifications] [RIOT-OS/RIOT] STM32_common/dma: Optimize the latency in the hot path (#14096)

Koen Zandberg notifications at github.com
Mon May 18 11:48:33 CEST 2020

### Contribution description

This PR optimizes the STM32 DMA driver by reducing the number of operations in the hot path of a DMA stream configuration.

1. Move the power-on and enable calls to the DMA init. The DMA is never powered off, so the worst this can do is enable the DMA before it is strictly required.
2. The FCR configuration changes only during memory-to-memory transfers. Clearing the DMDIS flag during the acquire should be sufficient for now.
3. Likewise, clearing the flags during the acquire rather than on every transfer is sufficient, as they are also cleared in the ISR.
4. Remove a number of asserts from the transfer, start and wait calls. With a proper call sequence these conditions have already been checked during the acquire, which is why the DMA asserts remain in the acquire function. There is no need to assert them again in the hot path.
5. Split `dma_configure` into a setup function, called once per series of DMA transfers, and a prepare function, called for every DMA transfer. The setup configures the stream channel, direction, peripheral address and transfer width. This is usually required only once per set of transfers (for example `spi_transfer_regs`, which consists of one transfer for the device register and one for the data). `dma_prepare` is called before every transfer to change the memory address and the length of the transfer.
6. Cache the DMA stream register base address in RAM. Calculating the register base address during init instead of on every call reduced the overhead of a single transfer from 7.4 μs to 6.4 μs. This comes at the cost of 4 bytes per logical DMA stream configured.
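
To illustrate items 4 to 6, here is a minimal host-side sketch of the setup/prepare split with a cached stream pointer. It is not the code from this PR: the `sketch_*` functions, `mock_stream_t`, `mock_hw` and `SKETCH_NUMOF` are hypothetical names, the registers are a plain struct rather than memory-mapped hardware, and the real `dma_setup`/`dma_prepare` signatures may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of one DMA stream's register block (assumption: a reduced set of
 * registers; on the MCU these are memory-mapped and 32 bits wide). */
typedef struct {
    uint32_t  CR;    /* control: channel, direction, transfer width */
    uint32_t  NDTR;  /* number of data items in the transfer */
    uintptr_t PAR;   /* peripheral address */
    uintptr_t M0AR;  /* memory address */
} mock_stream_t;

#define SKETCH_NUMOF 2  /* hypothetical number of logical DMA streams */

static mock_stream_t mock_hw[SKETCH_NUMOF];       /* stands in for the hardware */
static mock_stream_t *stream_cache[SKETCH_NUMOF]; /* one cached pointer per stream */

/* Boot time: resolve every stream's register base once and keep it in RAM. */
void sketch_dma_init(void)
{
    for (unsigned i = 0; i < SKETCH_NUMOF; i++) {
        stream_cache[i] = &mock_hw[i];
    }
}

/* Acquire: the only place where the stream index is asserted. */
void sketch_dma_acquire(unsigned dma)
{
    assert(dma < SKETCH_NUMOF);
}

/* Once per series of transfers: settings that stay constant. */
void sketch_dma_setup(unsigned dma, uint32_t cr_config, uintptr_t periph_addr)
{
    mock_stream_t *stream = stream_cache[dma];  /* no address calculation here */
    stream->CR = cr_config;
    stream->PAR = periph_addr;
}

/* Before every transfer: only the memory address and the length change. */
void sketch_dma_prepare(unsigned dma, const void *mem, size_t len)
{
    mock_stream_t *stream = stream_cache[dma];
    stream->M0AR = (uintptr_t)mem;
    stream->NDTR = (uint32_t)len;
}
```

A `spi_transfer_regs`-style sequence under these assumptions would call `sketch_dma_acquire` and `sketch_dma_setup` once, then `sketch_dma_prepare` twice: once for the register byte and once for the data buffer. The channel, direction and peripheral address written by the setup are left untouched by the second prepare.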

Let me know if you would prefer this to be split into multiple PRs.

### Testing procedure

This should be tested on the different STM32 families (f0, f1, f2, f4, l4 and f7) to ensure that I didn't accidentally break the DMA peripheral.

Testing is probably easiest with SPI, using #14087 and #14093.

Boards that have DMA configured (and thus might break) are:

- b-l072z-lrwan1 (SPI and UART)
- b-l475e-iot01a (SPI and UART)
- iotlab-(a8-)m3 (UART)
- nucleo-f207zg (SPI, UART and ethernet)
- nucleo-f767zi (SPI, UART and ethernet)
- nucleo-l152re (SPI and UART)
- nucleo-l476rg (SPI and UART)

### Issues/PRs references

Depends on #14087 for some measurements on this.

-- Commit Summary --

  * tests/periph_spi: Add thread runtime stats
  * stm32/dma: Move one-time config to init function
  * stm32/dma: Move FCR configuration to acquire function
  * stm32/dma: Move clear flags to acquire
  * stm32/dma: Remove superfluous asserts from DMA hot path
  * stm32/dma: add setup and prepare functions
  * stm32/dma: Cache DMA stream base address

-- File Changes --

    M cpu/stm32_common/include/periph_cpu_common.h (45)
    M cpu/stm32_common/periph/dma.c (150)
    M tests/periph_spi/main.c (129)
