Zhe Weng
6222ad5764
Revert "net: downgrade iob priority of input/udp/icmp to avoid blocking devif"
...
This reverts commit d87620abc9
.
2023-01-12 01:56:18 +08:00
Zhe Weng
d87620abc9
net: downgrade iob priority of input/udp/icmp to avoid blocking devif
...
When trying to use iperf2, we found it comsumes all the IOB when sending UDP packets, then devif_poll has no IOB to send the packet out, so speed drops to 0 and never recovers.
Signed-off-by: Zhe Weng <wengzhe@xiaomi.com>
2023-01-05 22:25:19 +08:00
chao an
34d2cde8a8
net/l2/l3/l4: add support of iob offload
...
1. Add new config CONFIG_NET_LL_GUARDSIZE to isolation of l2 stack,
which will benefit l3(IP) layer for multi-MAC(l2) implementation,
especially in some NICs such as celluler net driver.
new configuration options: CONFIG_NET_LL_GUARDSIZE
CONFIG_NET_LL_GUARDSIZE will reserved l2 buffer header size of
network buffer to isolate the L2/L3 (MAC/IP) data on network layer,
which will be beneficial to L3 network layer protocol transparent
transmission and forwarding
------------------------------------------------------------
Layout of frist iob entry:
iob_data (aligned by CONFIG_IOB_ALIGNMENT)
|
| io_offset(CONFIG_NET_LL_GUARDSIZE)
| |
-------------------------------------------------
iob | Reserved | io_len |
-------------------------------------------------
-------------------------------------------------------------
Layout of different NICs implementation:
iob_data (aligned by CONFIG_IOB_ALIGNMENT)
|
| io_offset(CONFIG_NET_LL_GUARDSIZE)
| |
-------------------------------------------------
Ethernet | Reserved | ETH_HDRLEN | io_len |
---------------------------------|---------------
8021Q | Reserved | ETH_8021Q_HDRLEN | io_len |
---------------------------------|---------------
ipforward | Reserved | io_len |
-------------------------------------------------
--------------------------------------------------------------------
2. Support iob offload to l2 driver to avoid unnecessary memory copy
Support send/receive iob vectors directly between the NICs and l3/l4
stack to avoid unnecessary memory copies, especially on hardware that
supports Scatter/gather, which can greatly improve performance.
new interface to support iob offload:
------------------------------------------
| IOB version | original |
|----------------------------------------|
| devif_iob_poll() | devif_poll() |
| ... | ... |
------------------------------------------
--------------------------------------------------------------------
1> NIC hardware support Scatter/gather transfer
TX:
tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
/ \
/ \
devif_poll_[l3|l4]_connections() devif_iob_send() (nocopy:udp/icmp/...)
/ \ (copy:tcp)
/ \
devif_iob_poll("NIC"_txpoll) callback() // "NIC"_txpoll
|
dev->d_iob: |
--------------- ---------------
io_data iob1 | | | iob3 | | |
\ --------------- ---------------
--------------- | --------------- |
iob0 | | | | iob2 | | | |
--------------- | --------------- |
\ | / /
\ | / /
----------------------------------------------
NICs io vector | | | | | | | | | |
----------------------------------------------
RX:
[tcp|udp|icmp|...]ipv[4|6]_data_handler()(iob_concat/append to readahead)
|
|
[tcp|udp|icmp|...]_ipv[4|6]_in()/...
|
|
pkt/ipv[4/6]_input()/...
|
|
NICs io vector receive(iov_base to each iobs)
--------------------------------------------------------------------
2> CONFIG_IOB_BUFSIZE is greater than MTU:
TX:
"(CONFIG_IOB_BUFSIZE) > (MAX_NETDEV_PKTSIZE + CONFIG_NET_GUARDSIZE + CONFIG_NET_LL_GUARDSIZE)"
tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
/ \
/ \
devif_poll_[l3|l4]_connections() devif_iob_send() (nocopy:udp/icmp/...)
/ \ (copy:tcp)
/ \
devif_iob_poll("NIC"_txpoll) callback() // "NIC"_txpoll
|
"NIC"_send()
(dev->d_iob->io_data[CONFIG_NET_LL_GUARDSIZE - NET_LL_HDRLEN(dev)])
RX:
[tcp|udp|icmp|...]ipv[4|6]_data_handler()(iob_concat/append to readahead)
|
|
[tcp|udp|icmp|...]_ipv[4|6]_in()/...
|
|
pkt/ipv[4/6]_input()/...
|
|
NICs io vector receive(iov_base to io_data)
--------------------------------------------------------------------
3> Compatible with all old flat buffer NICs
TX:
tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
/ \
/ \
devif_poll_[l3|l4]_connections() devif_iob_send() (nocopy:udp/icmp/...)
/ \ (copy:tcp)
/ \
devif_iob_poll(devif_poll_callback()) devif_poll_callback() /* new interface, gather iobs to flat buffer */
/ \
/ \
devif_poll("NIC"_txpoll) "NIC"_send()(dev->d_buf)
RX:
[tcp|udp|icmp|...]ipv[4|6]_data_handler()(iob_concat/append to readahead)
|
|
[tcp|udp|icmp|...]_ipv[4|6]_in()/...
|
|
netdev_input() /* new interface, Scatter/gather flat/iob buffer */
|
|
pkt/ipv[4|6]_input()/...
|
|
NICs io vector receive(Orignal flat buffer)
3. Iperf passthrough on NuttX simulator:
-------------------------------------------------
| Protocol | Server | Client | |
|-----------------------------------------------|
| TCP | 813 | 834 | Mbits/sec |
| TCP(Offload) | 1720 | 1100 | Mbits/sec |
| UDP | 22 | 757 | Mbits/sec |
| UDP(Offload) | 25 | 1250 | Mbits/sec |
-------------------------------------------------
Signed-off-by: chao an <anchao@xiaomi.com>
2022-12-03 11:47:04 +08:00
Zhe Weng
025430a964
net/icmpv6: Fix icmpv6_reply function
...
Fix icmpv6_reply logic broken by commit 48311cc61f
and 391b501639
.
- 48311cc61f
"Fix unaligned memory access when creating ICMP Port Unreachable messages"
- It removed `htonl` function outside `data`, then the byte order may be wrong, so add `htons` back.
- 391b501639
"net: extract l3 header build code into new functions"
- It mis-removed the `memmove`, and the icmpv6 has no payload copied after this commit.
Signed-off-by: Zhe Weng <wengzhe@xiaomi.com>
2022-12-02 15:26:45 +08:00
liyi
391b501639
net: extract l3 header build code into new functions
...
Signed-off-by: liyi <liyi25@xiaomi.com>
2022-11-29 18:36:15 +08:00
Zhe Weng
869c93638d
net/icmpv6: Fix `ipv6->len` in icmpv6_reply
...
The `ipv6->len` is the length excluding the IPv6 header, so need to be `dev->d_len - IPv6_HDRLEN`.
Signed-off-by: Zhe Weng <wengzhe@xiaomi.com>
2022-11-24 23:19:10 +08:00
Zhe Weng
a26ea96f3b
net/icmpv6: Fix `datalen` in icmpv6_reply
...
The `datalen` indicates the whole len of original packet, which will become the payload inside icmpv6 packet.
Using `datalen = (ipv4->len[0] << 8) + ipv4->len[1]` in icmp_reply is correct, because it includes IPv4 header, but when coming to IPv6, the `len` does not include the header, so we need to add it back.
Signed-off-by: Zhe Weng <wengzhe@xiaomi.com>
2022-11-24 23:19:10 +08:00
Zhe Weng
233974b327
net/icmpv6: Fix icmpv6 checksum calculation in icmpv6_reply
...
The second param of icmpv6_chksum is `iplen`, which indicates `The size of the IPv6 header`.
Signed-off-by: Zhe Weng <wengzhe@xiaomi.com>
2022-11-24 23:19:10 +08:00
chao an
a8d3286258
net: move device buffer define to common header
...
Signed-off-by: chao an <anchao@xiaomi.com>
2022-10-28 00:32:16 -04:00
Petro Karashchenko
08043fb5bc
net: unify FAR keyword usage for all net buffer memory mapped buffers
...
Signed-off-by: Petro Karashchenko <petro.karashchenko@gmail.com>
2022-01-20 01:42:56 +08:00
Norman Rasmussen
48311cc61f
Fix unaligned memory access when creating ICMP Port Unreachable messages
...
commit 3b69d09c80
corrected the
unreachable handling for net/udp/icmp but introduced an unaligned store.
This splits the uint32_t data field into a two element uint16_t data
field to avoid the unaligned store.
2021-12-28 03:51:53 -06:00
chao.an
3b69d09c80
net/udp/icmp: correct the unreadchable handling
...
Reference RFC1122:
https://datatracker.ietf.org/doc/html/rfc1122
----------------------------------------------
4.1.3 SPECIFIC ISSUES
4.1.3.1 Ports
If a datagram arrives addressed to a UDP port for which
there is no pending LISTEN call, UDP SHOULD send an ICMP
Port Unreachable message.
Signed-off-by: chao.an <anchao@xiaomi.com>
2021-11-26 08:47:54 -06:00