Multi Channel DMA Intel® FPGA IP for PCI Express Design Example User Guide

ID 683517
Date 4/29/2022
Public


2.4.1.2. Hardware Test Results

Note: The PIO test was run with MCDMA H-Tile.
The Custom Driver was used to generate the following output:
Figure 15. PIO Test -o option
Figure 16. H2D Avalon-ST Streaming -t option. Note: This hardware test was run with the Intel® Stratix® 10 GX H-tile PCIe Gen3 x16 configuration. Hardware test result with P-Tile Gen4 x16 may be added in a future release.
Note: In the example above, the perfq_app command transfers a total of 250 MB in the H2D direction (-t) across four channels, with a payload of 8192 bytes in each descriptor. Without the -v (data validation) option, the command displays the bandwidth.
The -p option specifies the payload size. The maximum payload size varies depending on the design examples as follows:
  • Loopback: 1 MB
  • Avalon-MM:
    • With validation enabled: ((total available memory) / #channels)
    • With validation disabled: 1 MB
  • Avalon-ST Packet Generate/Check: 1 MB
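The payload limits above can be written as a small lookup. This is only an illustrative sketch: the function name, the design-example keys, and the reading of "1 MB" as 2**20 bytes are assumptions made here, not part of perfq_app or the driver.

```python
MIB = 1024 * 1024  # assumption: "1 MB" in the list above is read as 2**20 bytes


def max_payload_bytes(design, channels=1, total_memory_bytes=0, validation=False):
    """Return the maximum -p payload in bytes for a design example.

    The `design` keys ("loopback", "avmm", "avst_pkt_gen_check") are
    hypothetical labels chosen for this sketch.
    """
    if design in ("loopback", "avst_pkt_gen_check"):
        return 1 * MIB
    if design == "avmm":
        if validation:
            # With validation enabled: (total available memory) / #channels
            return total_memory_bytes // channels
        return 1 * MIB
    raise ValueError(f"unknown design example: {design}")
```

For instance, with a hypothetical 64 KB of available memory shared by four channels and validation enabled, the Avalon-MM limit works out to 16 KB per channel.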
The -s option specifies the transfer size.
  • For loopback, Avalon-ST and Avalon-MM design examples (except for the Avalon-ST Packet Generate/Check design example), there is no limit on the transfer size.
  • For the Avalon-ST Packet Generate/Check design example, the number of descriptors (transfer size / packet size) should be a multiple of 64 (64 is the default file size).
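The descriptor-count rule for the Packet Generate/Check design example can be checked with a few lines of arithmetic before launching a test. The sketch below assumes the rule means that the number of descriptors must divide evenly by 64, and that sizes are given in bytes; the function name is hypothetical and is not part of perfq_app.

```python
def descriptor_count_ok(transfer_size_bytes, packet_size_bytes, file_size=64):
    """Check that transfer size / packet size is a whole number of descriptors
    and that this number divides evenly by the file size (64 by default)."""
    descriptors, remainder = divmod(transfer_size_bytes, packet_size_bytes)
    return remainder == 0 and descriptors % file_size == 0


# Illustration only, assuming 1 MB = 2**20 bytes:
# 250 MB / 8192 bytes = 32000 descriptors, and 32000 = 500 * 64.
print(descriptor_count_ok(250 * 1024 * 1024, 8192))
```

A transfer size that yields, say, 100 descriptors would fail this check, because 100 does not divide evenly by 64.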

The -a option specifies the number of threads. You can provide any number that is a factor of the total number of queues, so that the traffic is distributed equally among the available cores in the system.

For example, a system with 64 channels of bidirectional traffic has a maximum of 128 possible queues. Hence, the -a option can accept these values: 1, 2, 4, 8, 16, 32, 64, 128. If you use -a 128, the 128 queues are distributed among 128 cores. However, if the number of cores in the system is limited, you can use smaller values for -a. If you use -a 4, the 128 queues are distributed among 4 cores (with each core handling 32 queues). A higher number of queues per core leads to a decrease in performance.
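The acceptable -a values in this example are simply the factors of the queue count. A minimal sketch of that calculation follows; the helper names are hypothetical and only illustrate the arithmetic, not any perfq_app internals.

```python
def valid_thread_counts(total_queues):
    """List every thread count that divides the queues evenly."""
    return [n for n in range(1, total_queues + 1) if total_queues % n == 0]


def queues_per_thread(total_queues, threads):
    """Number of queues each thread handles for a valid thread count."""
    if total_queues % threads != 0:
        raise ValueError("thread count must be a factor of the queue count")
    return total_queues // threads


total_queues = 64 * 2  # 64 channels, bidirectional traffic
print(valid_thread_counts(total_queues))   # [1, 2, 4, 8, 16, 32, 64, 128]
print(queues_per_thread(total_queues, 4))  # 32
```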

Figure 17. H2D Avalon-ST Streaming with Data Validation Enabled -t with -v option. Note: This hardware test was run with the Intel® Stratix® 10 GX H-tile PCIe Gen3 x16 configuration. Hardware test result with P-Tile Gen4 x16 may be added in a future release.
Figure 18. D2H Avalon-ST Streaming -r option. Note: This hardware test was run with the Intel® Stratix® 10 GX H-tile PCIe Gen3 x16 configuration. Hardware test result with P-Tile Gen4 x16 may be added in a future release.
Figure 19. D2H Avalon-ST Streaming with Data Validation Enabled -r with -v option. Note: This hardware test was run with the Intel® Stratix® 10 GX H-tile PCIe Gen3 x16 configuration. Hardware test result with P-Tile Gen4 x16 may be added in a future release.
Figure 20. Bidirectional Avalon-ST Streaming -z option. Note: This hardware test was run with the Intel® Stratix® 10 GX H-tile PCIe Gen3 x16 configuration. Hardware test result with P-Tile Gen4 x16 may be added in a future release.
Figure 21. Bidirectional Avalon-ST Streaming with Data Validation Enabled -z with -v option. Note: This hardware test was run with the Intel® Stratix® 10 GX H-tile PCIe Gen3 x16 configuration. Hardware test result with P-Tile Gen4 x16 may be added in a future release.