2024-06-08  event/dsw: support explicit release only mode  [HEAD, main]  (Mattias Rönnblom)
Add the RTE_EVENT_DEV_CAP_IMPLICIT_RELEASE_DISABLE capability to the DSW event device.

This feature may be used by an EAL thread to pull more work from the work scheduler, without giving up the option to forward events originating from a previous dequeue batch. This in turn allows an EAL thread to be productive while waiting for a hardware accelerator to complete some operation.

Prior to this change, DSW did not make any distinction between RTE_EVENT_OP_FORWARD and RTE_EVENT_OP_NEW type events, other than that new events would be backpressured earlier.

After this change, DSW tracks the number of released events (i.e., events of type RTE_EVENT_OP_FORWARD and RTE_EVENT_OP_RELEASE) that have been enqueued. For efficiency reasons, DSW does not track the identity of individual events. This in turn implies that, at a certain stage in the flow migration process, DSW must wait for all pending releases (on the migration source port only) to be received from the application, to assure that no event pertaining to any of the to-be-migrated flows is still being processed.

With this change, DSW starts making a distinction between forward and new type events for credit allocation purposes. Only RTE_EVENT_OP_NEW events need credits. All events marked as RTE_EVENT_OP_FORWARD must have a corresponding dequeued event from a previous dequeue batch.

Flow migration for flows on RTE_SCHED_TYPE_PARALLEL queues remains unaffected by this change.

A side effect of the tweaked DSW migration logic is that the migration latency is reduced, regardless of whether implicit release is enabled. Another side effect is that migrated flows are no longer processed during any part of the migration procedure. An upside of this is that it reduces the load on the overloaded port; a downside is that it introduces slightly more jitter for the migrated flows.

This patch also contains various minor refactorings, improved formatting, spelling fixes, and the removal of unnecessary memory barriers.

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
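As a rough illustration of the explicit-release mode this capability enables, a worker loop might look like the following sketch (process_event() and NEXT_STAGE_QUEUE_ID are hypothetical, and the port is assumed to have been set up with RTE_EVENT_PORT_CFG_DISABLE_IMPL_REL):

    #include <rte_eventdev.h>

    /* Worker using explicit release: finished events are released
     * explicitly, while forwarded events consume no new credits. */
    static void
    worker_loop(uint8_t dev_id, uint8_t port_id)
    {
        struct rte_event evs[16];
        uint16_t i, n;

        n = rte_event_dequeue_burst(dev_id, port_id, evs, 16, 0);
        for (i = 0; i < n; i++) {
            if (process_event(&evs[i])) {
                /* Keep the event in the pipeline. */
                evs[i].op = RTE_EVENT_OP_FORWARD;
                evs[i].queue_id = NEXT_STAGE_QUEUE_ID;
            } else {
                /* Done with this event; release it explicitly. */
                evs[i].op = RTE_EVENT_OP_RELEASE;
            }
        }
        rte_event_enqueue_burst(dev_id, port_id, evs, n);
    }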
2024-06-08  dma/cnxk: remove completion pool  (Pavan Nikhilesh)
Use DMA ops to store metadata and remove the use of the completion pool. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Vamsi Attunuru <vattunuru@marvell.com>
2024-06-08  eventdev/dma: reorganize event DMA ops  (Pavan Nikhilesh)
Re-organize the event DMA ops structure to allow holding source and destination pointers without the need for additional memory; the mempool allocating memory for rte_event_dma_adapter_ops can size the structure to accommodate all the needed source and destination pointers. Add multiple words for holding user metadata, adapter implementation-specific metadata and event metadata. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Amit Prakash Shukla <amitprakashs@marvell.com>
2024-06-07  eventdev/crypto: fix opaque field handling  (Ganapati Kundapura)
For session-less crypto operations, the event info is contained in the crypto op metadata for each event and is restored into the event from the crypto op metadata response info. For session-based crypto operations, the crypto op contains per-session event info in its metadata. If a PMD passes any implementation-specific data in "struct rte_event::impl_opaque" on each event, it does not get restored. This patch stores "struct rte_event::impl_opaque" in an mbuf dynamic field before enqueueing to the cryptodev and restores it from the mbuf dynamic field after dequeueing the crypto op from the cryptodev, for session-based crypto operations. Fixes: 7901eac3409a ("eventdev: add crypto adapter implementation") Cc: stable@dpdk.org Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com> Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
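The mbuf dynamic field mechanism used for this follows a general pattern; a minimal sketch, with an illustrative field name and helper functions that are not the adapter's actual ones:

    #include <rte_eventdev.h>
    #include <rte_mbuf.h>
    #include <rte_mbuf_dyn.h>

    static int impl_opaque_offset = -1;

    /* Register a 1-byte dynamic field once, e.g. at adapter init. */
    static int
    opaque_field_init(void)
    {
        static const struct rte_mbuf_dynfield desc = {
            .name = "example_ev_crypto_impl_opaque",
            .size = sizeof(uint8_t),
            .align = 1,
        };
        impl_opaque_offset = rte_mbuf_dynfield_register(&desc);
        return impl_opaque_offset < 0 ? -1 : 0;
    }

    /* Stash the event's impl_opaque in the mbuf before cryptodev enqueue. */
    static void
    save_opaque(struct rte_mbuf *m, const struct rte_event *ev)
    {
        *RTE_MBUF_DYNFIELD(m, impl_opaque_offset, uint8_t *) = ev->impl_opaque;
    }

    /* Restore it into the event after cryptodev dequeue. */
    static void
    restore_opaque(const struct rte_mbuf *m, struct rte_event *ev)
    {
        ev->impl_opaque = *RTE_MBUF_DYNFIELD(m, impl_opaque_offset, uint8_t *);
    }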
2024-05-30  event/sw: fix warning from useless snprintf  (Stephen Hemminger)
With GCC 14, this warning is generated:

    drivers/event/sw/sw_evdev.c:263:3: warning: 'snprintf' will always
    be truncated; specified size is 12, but format string expands to
    at least 13
      snprintf(buf, sizeof(buf), "sw%d_iq_%d_rob", dev_id, i);
      ^

Yet the whole printf to the buf is unnecessary. The type string argument has never been implemented, and should just be NULL. Removing the unnecessary snprintf then means IQ_ROB_NAMESIZE can be removed.

Fixes: 5ffb2f142d95 ("event/sw: support event queues")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
2024-06-07  baseband/acc: remove superfluous log newline  (Hernan Vargas)
Minor cosmetic log change. No functional impact. Signed-off-by: Hernan Vargas <hernan.vargas@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2024-06-07  baseband/acc: improve error description  (Hernan Vargas)
Remove dead error-handling code and update the description of one error print. Signed-off-by: Hernan Vargas <hernan.vargas@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2024-06-07  baseband/acc: remove ACC100 HARQ pruning  (Hernan Vargas)
HARQ pruning is not an ACC100 feature. Remove what is, in effect, dead code. Signed-off-by: Hernan Vargas <hernan.vargas@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2024-06-07  baseband/acc: remove ACC100 unused code  (Hernan Vargas)
Remove dead code and unused function in ACC100 driver. Signed-off-by: Hernan Vargas <hernan.vargas@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2024-06-07  baseband/acc: fix memory barrier  (Hernan Vargas)
Move the memory barrier so that the dequeue thread is in sync with the enqueue thread. Fixes: 32e8b7ea35dd ("baseband/acc100: refactor to segregate common code") Cc: stable@dpdk.org Signed-off-by: Hernan Vargas <hernan.vargas@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2024-06-12  net/virtio: fix MAC table update  (Satha Rao)
Don't send NULL MAC addresses in MAC table update. Fixes: 1b306359e58c ("virtio: suport multiple MAC addresses") Cc: stable@dpdk.org Signed-off-by: Satha Rao <skoteshwar@marvell.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2024-06-12  vhost: manage FD with epoll  (David Marchand)
Switch to epoll so that the concern over the poll() fd array is removed. Add a simple list of used entries and track the next free entry. epoll is thread-safe, so we no longer need a synchronization mechanism and can remove the notification pipe. Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: David Marchand <david.marchand@redhat.com>
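The resulting pattern is roughly the following sketch (handle_fd() is a hypothetical callback; the real FD manager wraps this in its own structures):

    #include <sys/epoll.h>

    /* epoll_ctl() may be called from any thread while another thread
     * sits in epoll_wait(), so no wakeup pipe is needed. */
    static int
    fdset_add(int epfd, int fd)
    {
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };

        return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
    }

    static void
    fdset_event_dispatch(int epfd)
    {
        struct epoll_event events[32];
        int i, n;

        for (;;) {
            n = epoll_wait(epfd, events, 32, -1);
            for (i = 0; i < n; i++)
                handle_fd(events[i].data.fd);
        }
    }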
2024-06-12  vhost: improve fdset initialization  (Maxime Coquelin)
This patch heavily reworks fdset initialization:
- fdsets are now dynamically allocated by the FD manager
- the event dispatcher is now created by the FD manager
- struct fdset is now opaque to VDUSE and Vhost

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2024-06-12  vhost: hide synchronization within FD manager  (Maxime Coquelin)
This patch forces synchronization for all FD additions or deletions in the FD set. With that, it is no longer necessary for the user to know about the FD set pipe, so its initialization is hidden in the FD manager. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: David Marchand <david.marchand@redhat.com>
2024-06-12  vhost: make use of FD manager init function  (Maxime Coquelin)
Instead of statically initializing the fdset, this patch converts VDUSE and Vhost-user to use the fdset_init() function, which now also initializes the mutexes. This is preliminary rework to hide the FD manager's pipe from its users. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: David Marchand <david.marchand@redhat.com>
2024-06-12  vhost: rename polling mutex  (Maxime Coquelin)
This trivial patch fixes a typo in the FD manager's polling mutex name. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com>
2024-06-12  net/virtio-user: fix control queue allocation  (Maxime Coquelin)
It is possible to have the control queue without the device advertising VIRTIO_NET_F_MQ. Rely on the VIRTIO_NET_F_CTRL_VQ feature being advertised instead. Fixes: 6fdf32d1e318 ("net/virtio-user: remove max queues limitation") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: David Marchand <david.marchand@redhat.com>
2024-06-12  net/virtio-user: fix shadow control queue notification init  (Maxime Coquelin)
The Virtio-user control queue kick and call FDs were not released at device stop time. This patch fixes this using the queue iterator helper for both initialization and uninitialization. Fixes: 90966e8e5b67 ("net/virtio-user: send shadow virtqueue info to the backend") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: David Marchand <david.marchand@redhat.com>
2024-06-12  net/virtio-user: fix control queue destruction  (Maxime Coquelin)
This patch uses the freshly renamed iterator to destroy queues at stop time. Doing this, we fix the missing control queue destruction. Fixes: 90966e8e5b67 ("net/virtio-user: send shadow virtqueue info to the backend") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: David Marchand <david.marchand@redhat.com>
2024-06-12  net/virtio-user: rename queue iterator  (Maxime Coquelin)
This is a preliminary rework to prepare for iterating over queues for non-setup operations. Also, remove the error log that does not provide much information given the callbacks already provide one. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: David Marchand <david.marchand@redhat.com>
2024-06-12  net/virtio-user: add order platform flag to feature list  (Nithin Dabilpuram)
VIRTIO_F_ORDER_PLATFORM is a needed feature when working with real HW platforms that expose virtio-net devices via the VDPA framework. This feature ensures that the real ordering requirements between descriptor updates and notification data updates are respected. Hence, enable it if the device supports the feature. Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2024-06-12  vhost: remove unnecessary features fetch  (Yuan Zhiyuan)
vhost_user_get_protocol_features() does not need to know about Virtio features, but only about Vhost-user protocol features. Signed-off-by: Yuan Zhiyuan <yuanzhiyuan0928@outlook.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2024-06-12  vhost: add flag for async connection in client mode  (Daniil Ushkov)
This patch introduces a new flag RTE_VHOST_USER_ASYNC_CONNECT, which in combination with the flag RTE_VHOST_USER_CLIENT makes rte_vhost_driver_start connect asynchronously to the vhost server. Signed-off-by: Daniil Ushkov <daniil.ushkov@yandex.ru> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
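Usage might look like the following sketch (the socket path is hypothetical):

    #include <rte_vhost.h>

    /* Register a vhost-user client that connects in the background
     * instead of blocking in rte_vhost_driver_start(). */
    static int
    register_async_client(void)
    {
        const char *path = "/tmp/vhost-user.sock";
        uint64_t flags = RTE_VHOST_USER_CLIENT | RTE_VHOST_USER_ASYNC_CONNECT;

        if (rte_vhost_driver_register(path, flags) < 0)
            return -1;
        return rte_vhost_driver_start(path);
    }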
2024-06-07  vhost: cleanup resubmit info before inflight setup  (Haoqian He)
This patch fixes a potential VM hang when the VM reboots after vhost live recovery, caused by missing cleanup of the virtqueue resubmit info.

Specifically, if inflight IO that should be resubmitted during the latest vhost reconnection has not yet been submitted when the VM reboots, GET_VRING_BASE will not wait for the inflight IO, and the stale resubmit info is left in place. When the VM restarts, SET_VRING_KICK will resubmit that inflight IO (if the resubmit info is not null, set_vring_kick() returns without updating it). This is an error: stale inflight IO should not be resubmitted after the VM restarts.

The solution is to clean up the virtqueue resubmit info in set_inflight_fd(), which runs before set_vring_kick().

Fixes: ad0a4ae491fe ("vhost: checkout resubmit inflight information")
Cc: stable@dpdk.org

Signed-off-by: Haoqian He <haoqian.he@smartx.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2024-06-07  vhost: fix build with GCC 13  (Luca Vizzarro)
This patch resolves a build error with GCC 13 and arm/aarch32 as targets:

    In function ‘mbuf_to_desc’,
        inlined from ‘vhost_enqueue_async_packed’ at ../lib/vhost/virtio_net.c:1828:6,
        inlined from ‘virtio_dev_rx_async_packed’ at ../lib/vhost/virtio_net.c:1842:6,
        inlined from ‘virtio_dev_rx_async_submit_packed’ at ../lib/vhost/virtio_net.c:1900:7:
    ../lib/vhost/virtio_net.c:1159:18: error: ‘buf_vec[0].buf_addr’ may be used uninitialized [-Werror=maybe-uninitialized]
     1159 |    buf_addr = buf_vec[vec_idx].buf_addr;
          |    ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
    <snip>
    ../lib/vhost/virtio_net.c:1160:18: error: ‘buf_vec[0].buf_iova’ may be used uninitialized [-Werror=maybe-uninitialized]
     1160 |    buf_iova = buf_vec[vec_idx].buf_iova;
          |    ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
    <snip>
    ../lib/vhost/virtio_net.c:1161:35: error: ‘buf_vec[0].buf_len’ may be used uninitialized [-Werror=maybe-uninitialized]
     1161 |    buf_len = buf_vec[vec_idx].buf_len;
          |    ~~~~~~~~~~~~~~~~^~~~~~~~

GCC complains about a possible runtime path where the while loop which fills buf_vec (in vhost_enqueue_async_packed()) is never run. As a consequence, it correctly concludes that buf_vec may be accessed while uninitialized.

This scenario is actually very unlikely, as the only way it can occur is if size has overflowed to 0, meaning the total packet length would be close to UINT64_MAX (or actually UINT32_MAX). At first glance, the code suggests this may never happen, as the type of size has been changed to 64-bit. For a 32-bit architecture such as arm (e.g. armv7-a) and aarch32, it can still happen because the operand types (pkt->pkt_len and sizeof) are 32 bits wide, so 32-bit arithmetic is performed first (where the overflow can happen) and the result is widened to 64 bits afterwards.

The proposed fix simply guarantees to the compiler that the scope which fills buf_vec is entered at least once, while not disrupting the actual logic. This is based on the assumption that size is always greater than 0 (as suggested by the sizeof) and that the packet length never gets as big as UINT32_MAX, which would cause an overflow.

Fixes: 873e8dad6f49 ("vhost: support packed ring in async datapath")
Cc: stable@dpdk.org

Signed-off-by: Luca Vizzarro <luca.vizzarro@arm.com>
Reviewed-by: Paul Szczepanek <paul.szczepanek@arm.com>
Reviewed-by: Nick Connolly <nick.connolly@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
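The 32-bit widening pitfall can be reproduced in isolation; a minimal standalone example (the variable names only mirror the description above):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t pkt_len = UINT32_MAX - 4; /* pathological packet length */
        /* Both operands are 32-bit: the addition wraps *before* the
         * result is widened to 64 bits for the assignment. */
        uint64_t size = pkt_len + (uint32_t)sizeof(uint64_t); /* == 3 */
        /* Widening an operand first performs the math in 64 bits. */
        uint64_t size_ok = (uint64_t)pkt_len + sizeof(uint64_t);

        printf("wrapped: %llu, correct: %llu\n",
               (unsigned long long)size, (unsigned long long)size_ok);
        return 0;
    }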
2024-06-07  vhost: optimize mbuf allocation in Tx packed path  (Andrey Ignatov)
Currently virtio_dev_tx_packed() always allocates the requested @count of packets, no matter how many packets are really available on the virtio Tx ring. Later it has to free all the packets it didn't use, and if, for example, there were zero available packets on the ring, then all @count mbufs would be allocated just to be freed afterwards. This wastes CPU cycles, since rte_pktmbuf_alloc_bulk() / rte_pktmbuf_free_bulk() do quite a lot of work.

Optimize it by using the same idea as virtio_dev_tx_split() uses on the Tx split path: estimate the number of available entries on the ring and allocate only that number of mbufs. On the split path it's pretty easy to estimate. On the packed path it's more work, since it requires checking flags for up to @count descriptors. Still, it's much less expensive than the alloc/free pair.

The new get_nb_avail_entries_packed() function doesn't change how virtio_dev_tx_packed() works with regard to memory barriers, since the barrier between checking flags and other descriptor fields is still in place later in virtio_dev_tx_batch_packed() and virtio_dev_tx_single_packed().

The difference for a guest transmitting ~17Gbps with MTU 1500 on a `perf record` / `perf report` (on lower pps the savings will be bigger):

* Before the change:

    Samples: 18K of event 'cycles:P', Event count (approx.): 19206831288
      Children      Self  Pid:Command
    -  100.00%   100.00%  798808:dpdk-worker1
       <... skip ...>
       - 99.09% pkt_burst_io_forward
          - 90.26% common_fwd_stream_receive
             - 90.04% rte_eth_rx_burst
                - 75.53% eth_vhost_rx
                   - 74.29% rte_vhost_dequeue_burst
                      - 71.48% virtio_dev_tx_packed_compliant
                         + 17.11% rte_pktmbuf_alloc_bulk
                         + 11.80% rte_pktmbuf_free_bulk
                         + 2.11% vhost_user_inject_irq
                           0.75% rte_pktmbuf_reset
                           0.53% __rte_pktmbuf_free_seg_via_array
                        0.88% vhost_queue_stats_update
             + 13.66% mlx5_rx_burst_vec
          + 8.69% common_fwd_stream_transmit

* After:

    Samples: 18K of event 'cycles:P', Event count (approx.): 19225310840
      Children      Self  Pid:Command
    -  100.00%   100.00%  859754:dpdk-worker1
       <... skip ...>
       - 98.61% pkt_burst_io_forward
          - 86.29% common_fwd_stream_receive
             - 85.84% rte_eth_rx_burst
                - 61.94% eth_vhost_rx
                   - 60.05% rte_vhost_dequeue_burst
                      - 55.98% virtio_dev_tx_packed_compliant
                         + 3.43% rte_pktmbuf_alloc_bulk
                         + 2.50% vhost_user_inject_irq
                           1.17% vhost_queue_stats_update
                           0.76% rte_rwlock_read_unlock
                           0.54% rte_rwlock_read_trylock
             + 22.21% mlx5_rx_burst_vec
          + 12.00% common_fwd_stream_transmit

It can be seen that virtio_dev_tx_packed_compliant() goes from 71.48% to 55.98%, with rte_pktmbuf_alloc_bulk() going from 17.11% to 3.43% and rte_pktmbuf_free_bulk() going away completely.

Signed-off-by: Andrey Ignatov <rdna@apple.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
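The estimation idea, per the description above, can be sketched as follows (a simplified model of a packed ring; wrap-counter flipping at the ring boundary is omitted for brevity):

    #include <stdbool.h>
    #include <stdint.h>

    #define VRING_DESC_F_AVAIL (1 << 7)
    #define VRING_DESC_F_USED  (1 << 15)

    struct packed_desc {
        uint64_t addr;
        uint32_t len;
        uint16_t id;
        uint16_t flags;
    };

    /* Count driver-owned descriptors, capped at 'count', so that only
     * that many mbufs need to be allocated up front. */
    static uint16_t
    get_nb_avail_entries(const struct packed_desc *descs, uint16_t ring_size,
                         uint16_t avail_idx, bool wrap_counter, uint16_t count)
    {
        uint16_t nb = 0;

        while (nb < count) {
            uint16_t flags = descs[(avail_idx + nb) % ring_size].flags;
            bool avail = !!(flags & VRING_DESC_F_AVAIL);
            bool used = !!(flags & VRING_DESC_F_USED);

            /* Available to the device: AVAIL matches the wrap counter
             * and USED does not (per the packed virtqueue spec). */
            if (avail != wrap_counter || used == wrap_counter)
                break;
            nb++;
        }
        return nb;
    }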
2024-06-14  eal/x86: fix prefetch with C++ for MSVC  (Tyler Retzlaff)
_mm_prefetch does not take a volatile-qualified pointer, so cast it away. Additionally, the pointer type should be char *, not void *, so adjust the cast to match. _mm_cldemote does not take a volatile-qualified pointer either, so cast it away too. Fixes: 28a5e0b9c7f0 ("eal/x86: implement prefetch for MSVC") Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
2024-06-14  eal: implement more prefetch functions for MSVC  (Tyler Retzlaff)
MSVC does not have an equivalent of __builtin_prefetch that allows a read or read-write parameter. Introduce conditional compile expansion of the rte_prefetch[0-2] inline functions when building with MSVC. Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
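Conceptually, the conditional expansion looks like the following sketch (not the exact DPDK source):

    #include <stdint.h>

    #ifdef RTE_TOOLCHAIN_MSVC
    #include <emmintrin.h>

    /* MSVC: no __builtin_prefetch; use _mm_prefetch, which takes a
     * non-volatile char pointer, so the qualifiers are cast away. */
    static inline void
    prefetch0(const volatile void *p)
    {
        _mm_prefetch((const char *)(uintptr_t)p, _MM_HINT_T0);
    }
    #else
    /* GCC/Clang: __builtin_prefetch(addr, rw, locality). */
    static inline void
    prefetch0(const volatile void *p)
    {
        __builtin_prefetch((const void *)p, 0, 3);
    }
    #endif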
2024-06-14  mempool: fix some build issue with MSVC  (Stephen Hemminger)
Applying __rte_unused to a variable has no effect with the MSVC compiler. The temporary variable, used only if debug is enabled, can simply be eliminated. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
2024-06-14  mempool: replace GCC pragma with cast  (Stephen Hemminger)
Building mempool with MSVC generates a warning because of this pragma (same with clang when debug is enabled). The issue the pragma was working around can be better solved by using an additional cast. Fixes: af75078fece3 ("first public release") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
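The replacement pattern is a round trip through uintptr_t, which neither GCC/Clang's -Wcast-qual nor MSVC complains about; a hedged sketch (not the exact mempool code):

    #include <stdint.h>

    /* Instead of wrapping a plain (void *) cast in
     * #pragma GCC diagnostic ignored "-Wcast-qual", strip the
     * qualifiers via an integer round trip. */
    static inline void *
    strip_qualifiers(const volatile void *obj)
    {
        return (void *)(uintptr_t)obj;
    }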
2024-06-14  mempool: add memory chunks in dump  (Morten Brørup)
Add information about the memory chunks holding the objects in the mempool when dumping the status of the mempool to a file. Signed-off-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Paul Szczepanek <paul.szczepanek@arm.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com> Acked-by: Huisong Li <lihuisong@huawei.com>
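Walking the chunk list amounts to something like this sketch, assuming the public mem_list/nb_mem_chunks fields of struct rte_mempool (a simplified view of what the extended dump output is derived from):

    #include <inttypes.h>
    #include <stdio.h>
    #include <rte_mempool.h>

    /* Print each memory chunk backing the mempool's objects. */
    static void
    dump_mem_chunks(FILE *f, struct rte_mempool *mp)
    {
        struct rte_mempool_memhdr *memhdr;

        fprintf(f, "  nb_mem_chunks=%u\n", mp->nb_mem_chunks);
        STAILQ_FOREACH(memhdr, &mp->mem_list, next)
            fprintf(f, "  chunk: addr=%p iova=0x%" PRIx64 " len=%zu\n",
                    memhdr->addr, (uint64_t)memhdr->iova, memhdr->len);
    }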
2024-06-14  hash: check name when creating a hash  (Conor Fogarty)
Add a NULL pointer check to params->name, which is later copied into the hash data structure. Without this check, the code segfaults on the strlcpy() of a NULL pointer. Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation") Cc: stable@dpdk.org Signed-off-by: Conor Fogarty <conor.fogarty@intel.com> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
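The added guard boils down to something like this sketch (simplified; the real code also sets rte_errno and logs an error):

    #include <errno.h>
    #include <stddef.h>
    #include <rte_hash.h>

    /* Reject a NULL params or params->name up front, before the name
     * is copied with strlcpy() into the hash data structure. */
    static int
    validate_hash_name(const struct rte_hash_parameters *params)
    {
        if (params == NULL || params->name == NULL)
            return -EINVAL;
        return 0;
    }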
2024-06-14  hash: fix return code description in Doxygen  (Chenming Chang)
The rte_hash lookup can return ZERO which is not a positive value. Fixes: af75078fece3 ("first public release") Cc: stable@dpdk.org Signed-off-by: Chenming Chang <ccm@ccm.ink> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
2024-06-12  ethdev: support duplicating only item mask  (Dariusz Sosnowski)
Extend rte_flow_conv() to support working only on item's mask. This allows drivers to get only the mask's size when working on pattern templates and duplicate items having only the mask in a generic way. Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>
2024-06-12  net/nfp: support xstats for flower firmware  (Chaoyong He)
Add support for extended stats for the flower firmware, including the stats for each queue. Signed-off-by: Chaoyong He <chaoyong.he@corigine.com> Reviewed-by: Long Wu <long.wu@corigine.com> Reviewed-by: Peng Zhang <peng.zhang@corigine.com>
2024-06-12  net/nfp: fix xstats for multi PF firmware  (Chaoyong He)
When using multi PF firmware, the other ports always get the xstats of the first port. Fix it by adding the offset for other ports. Fixes: 8ad2cc8fec37 ("net/nfp: add flag for multiple PFs support") Cc: stable@dpdk.org Signed-off-by: Chaoyong He <chaoyong.he@corigine.com> Reviewed-by: Long Wu <long.wu@corigine.com> Reviewed-by: Peng Zhang <peng.zhang@corigine.com>
2024-06-12  app/testpmd: fix lcore ID restriction  (Sivaprasad Tummala)
With modern CPUs, it is possible to have a higher CPU count and thus a higher RTE_MAX_LCORES. In the testpmd application, the current forwarding cores option "--nb-cores" is hard-limited to 255. This patch fixes that constraint and also adjusts the lcore data structure to 32-bit to align with the rte lcore APIs. Fixes: af75078fece3 ("first public release") Cc: stable@dpdk.org Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com> Acked-by: Ferruh Yigit <ferruh.yigit@amd.com>
2024-06-11  net: clear outer UDP checksum in Intel prepare helper  (David Marchand)
If requesting an inner (L3/L4 checksum or L4 segmentation) offload, when the hardware does not support recomputing outer UDP checksum, automatically disable it in the common helper. Signed-off-by: David Marchand <david.marchand@redhat.com> Tested-by: Ali Alnubani <alialnu@nvidia.com>
2024-06-11  net/iavf: remove outer UDP checksum offload for X710 VF  (David Marchand)
According to the X710 datasheet, X710 devices do not support outer checksum offload:

    8.4.4.2 Transmit L3 and L4 Integrity Offload
    Tunneling UDP headers and GRE header are not offloaded while the
    X710/XXV710/XL710 leaves their checksum field as is. If a checksum
    is required, software should provide it as well as the inner
    checksum value(s) that are required for the outer checksum.

Fix Tx offload capabilities depending on the VF type.

Bugzilla ID: 1406
Fixes: f7c8c36fdeb7 ("net/iavf: enable inner and outer Tx checksum offload")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Ali Alnubani <alialnu@nvidia.com>
2024-06-11  net/i40e: fix outer UDP checksum offload for X710  (David Marchand)
According to the X710 datasheet (and confirmed in the field..), X710 devices do not support outer checksum offload:

    8.4.4.2 Transmit L3 and L4 Integrity Offload
    Tunneling UDP headers and GRE header are not offloaded while the
    X710/XXV710/XL710 leaves their checksum field as is. If a checksum
    is required, software should provide it as well as the inner
    checksum value(s) that are required for the outer checksum.

Fix Tx offload capabilities according to the hardware. X722 may support such offload by setting I40E_TXD_CTX_QW0_L4T_CS_MASK.

Bugzilla ID: 1406
Fixes: 8cc79a1636cd ("net/i40e: fix forward outer IPv6 VXLAN")
Cc: stable@dpdk.org

Reported-by: Jun Wang <junwang01@cestc.cn>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Ali Alnubani <alialnu@nvidia.com>
2024-06-11  net: fix outer UDP checksum in Intel prepare helper  (David Marchand)
Setting a pseudo-header checksum in the outer UDP checksum is an Intel (and some other vendors') requirement. Applications (like OVS) requesting outer UDP checksum without doing this extra setup have broken outer UDP checksums. Move this specific setup from testpmd to the "common" helper rte_net_intel_cksum_flags_prepare(). net/hns3 can then be adjusted. Bugzilla ID: 1406 Fixes: d8e5e69f3a9b ("app/testpmd: add GTP parsing and Tx checksum offload") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Tested-by: Ali Alnubani <alialnu@nvidia.com>
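The setup moved into the helper amounts to seeding the outer UDP checksum field with the pseudo-header checksum; a sketch assuming an outer IPv4 header (the function name is hypothetical):

    #include <rte_ip.h>
    #include <rte_mbuf.h>
    #include <rte_udp.h>

    /* Seed the outer UDP checksum with the IPv4 pseudo-header checksum,
     * as this hardware requires before completing the offload. */
    static void
    seed_outer_udp_cksum(struct rte_mbuf *m, uint64_t ol_flags)
    {
        struct rte_ipv4_hdr *oip = rte_pktmbuf_mtod_offset(m,
                struct rte_ipv4_hdr *, m->outer_l2_len);
        struct rte_udp_hdr *oudp = rte_pktmbuf_mtod_offset(m,
                struct rte_udp_hdr *, m->outer_l2_len + m->outer_l3_len);

        oudp->dgram_cksum = rte_ipv4_phdr_cksum(oip, ol_flags);
    }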
2024-06-11  app/testpmd: fix outer IP checksum offload  (David Marchand)
Resetting the outer IP checksum to 0 is not something mandated by the mbuf API and is done by rte_eth_tx_prepare(), or per driver if needed. Fixes: 4fb7e803eb1a ("ethdev: add Tx preparation") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@amd.com> Tested-by: Ali Alnubani <alialnu@nvidia.com>
2024-06-11  net/ice: enhance debug when HW fails to transmit  (David Marchand)
At the moment, if the driver sets an incorrect Tx descriptor, the HW will raise an MDD event, reported as:

    ice_interrupt_handler(): OICR: MDD event

Add some debug info for this case.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Ali Alnubani <alialnu@nvidia.com>
2024-06-11  net/ice: fix check for outer UDP checksum offload  (David Marchand)
ICE_TX_CTX_EIPT_NONE == 0, and there is a good chance that !(anything & 0) is true :-). While simply removing this no-op check is doable, let's instead check that the descriptor does contain an outer IP type. Fixes: 2ed011776334 ("net/ice: fix outer UDP Tx checksum offload") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Tested-by: Ali Alnubani <alialnu@nvidia.com>
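The pitfall is easy to demonstrate in isolation (the mask values below are hypothetical and only mirror the shape of the descriptor field):

    #include <assert.h>

    #define EIPT_NONE 0x0 /* "no outer IP type", like ICE_TX_CTX_EIPT_NONE */
    #define EIPT_MASK 0x3 /* hypothetical 2-bit outer IP type field */

    int main(void)
    {
        unsigned int desc = 0; /* descriptor with no outer IP type set */

        /* Buggy check: (desc & 0) is always 0, so the negation is
         * always true, whatever the descriptor contains. */
        assert(!(desc & EIPT_NONE));

        /* Fixed check: compare the whole type field against "none". */
        assert((desc & EIPT_MASK) == EIPT_NONE);
        return 0;
    }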
2024-06-11  app/testpmd: support matching any VXLAN field  (Gavin Li)
VXLAN extensions (VXLAN-GPE and VXLAN-GBP) are unified in a single VXLAN flow item. It is the user's responsibility to explicitly match VXLAN-GPE with its UDP port. Below are examples matching standard VXLAN, VXLAN-GBP and VXLAN-GPE.

To match standard VXLAN:

    ... / udp dst is 4789 / vxlan ... / ...

To match VXLAN-GBP with group policy ID 4321:

    ... / udp dst is 4789 / vxlan flag_g is 1 group_policy_id is 4321 ... / ...

To match VXLAN-GPE with next protocol IPv6:

    ... / udp dst is 4790 / vxlan flag_p is 1 protocol is 2 ... / ...

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ori Kam <orika@nvidia.com>
2024-06-11  net: extend VXLAN header to support more extensions  (Gavin Li)
VXLAN and VXLAN-GPE were supported with similar header structures. In order to add VXLAN-GBP, which is another extension to VXLAN, both extensions are merged into the original VXLAN header structure for easier usage. More VXLAN extensions may be added in the future to the same single structure.

VXLAN and VXLAN-GBP use the same UDP port (4789), while VXLAN-GPE uses a different port (4790). The three protocols have the same header length and overall a similar header structure, as below.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |R|R|R|R|I|R|R|R|                   Reserved                    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                VXLAN Network Identifier (VNI) |   Reserved    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                         Figure 1: VXLAN Header

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |R|R|Ver|I|P|B|O|          Reserved             | Next Protocol |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                VXLAN Network Identifier (VNI) |   Reserved    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                       Figure 2: VXLAN-GPE Header

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |G|R|R|R|I|R|R|R|R|D|R|R|A|R|R|R|        Group Policy ID        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                VXLAN Network Identifier (VNI) |   Reserved    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 3: VXLAN-GBP Extension

Both GPE and GBP extend VXLAN by using some reserved bits, which means the packets can be processed with the same pattern and most of the code can be reused. The old field names are kept with the use of anonymous unions. The Group Policy ID (GBP) and the Next Protocol (GPE) fields overlap, so they are in a union as well. Another improvement is defining and documenting each bit.

Instead of adding flow items, a single VXLAN flow item is more flexible as it uses the same header anyway. GBP can be matched with the G bit. GPE can be matched with the UDP port number. The VXLAN-GPE flow item and its specific header are marked as deprecated. A removal of the deprecated structures and macros may be proposed later.

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
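A sketch of what merging the extensions into one header with anonymous unions can look like (simplified, host-byte-order view; the real rte_vxlan.h uses rte_be16_t/rte_be32_t and RTE-specific packing macros):

    #include <stdint.h>

    struct vxlan_hdr_sketch {
        union {
            uint32_t vx_flags;            /* legacy name for word 0 */
            struct {
                uint8_t flags;            /* G/Ver/I/P/B/O flag bits */
                union {
                    uint8_t rsvd0[3];     /* plain VXLAN */
                    struct __attribute__((packed)) {
                        uint8_t rsvd;     /* holds the GBP D/A bits */
                        uint16_t policy_id; /* VXLAN-GBP, bytes 2-3 */
                    };
                    struct __attribute__((packed)) {
                        uint16_t rsvd1;
                        uint8_t proto;    /* VXLAN-GPE next protocol */
                    };
                };
            };
        };
        union {
            uint32_t vx_vni;              /* legacy name for word 1 */
            struct {
                uint8_t vni[3];
                uint8_t last_rsvd;        /* last reserved byte */
            };
        };
    };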
2024-06-11  net/mlx5: support VXLAN last reserved modification  (Rongwei Liu)
Implement the VXLAN last reserved byte modification. Following the RFC, the field is only 1 byte, so the field_length must be set to 8 instead of the real dst_field->size. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
2024-06-11  ethdev: add VXLAN last reserved field  (Rongwei Liu)
Add "uint8_t last_rsvd" as union with origin rsvd1. Add RTE_FLOW_FIELD_VXLAN_LAST_RSVD into rte flow packet field. The new union is used by testpmd matching item VXLAN "last_rsvd" and modify target RTE_FLOW_FIELD_VXLAN_LAST_RSVD. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>
2024-06-11  app/testpmd: fix flow field string sequence  (Rongwei Liu)
The field strings should be in the same order as the rte_flow_field_id enumeration definitions. Fixes: bfc007802da7 ("ethdev: allow modifying IPv6 FL and TC fields") Fixes: d66aa38f431d ("ethdev: allow modifying IPsec fields") Fixes: b160da13b398 ("ethdev: allow modifying IPv4 next protocol field") Cc: stable@dpdk.org Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>
2024-06-05  net/axgbe: modify debug messages  (Venkat Kumar Ande)
Modify debug messages to get better information from the debug logs. Signed-off-by: Venkat Kumar Ande <venkatkumar.ande@amd.com> Acked-by: Selwin Sebastian <selwin.sebastian@amd.com>