PCI Express Block DMA/SGDMA IP Solution

The PCI Express DMA core offers a fully integrated, flexible and highly optimized solution for high-bandwidth, low-latency direct memory access between host memory and target FPGAs. An optional Scatter-Gather DMA mode is supported for efficient use of host memory.

The core supports PCIe Gen2 and Gen3 capable endpoints for both Xilinx and Altera devices. It implements a single-clock-domain design with a 64-bit, 128-bit or 256-bit data path, depending on the Endpoint generation (Gen2/3) and the user interface width. The core provides a standard AXI4-Stream compliant user interface and supports multiple channels. An efficient scheduler gives all channels equal priority by servicing them in round-robin order.
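The round-robin servicing described above can be modeled in a few lines of software. This is only a behavioral sketch, not the core's scheduler logic: the function name `round_robin`, the channel ids and the packet labels are made up for illustration, and the model drains a fixed workload rather than accepting new packets while it runs.

```python
from collections import deque


def round_robin(channels):
    """Yield (channel_id, packet) pairs, visiting channels in cyclic order.

    `channels` maps a channel id to a list of pending packets (hypothetical
    software model). Each channel sends at most one packet per turn, and a
    channel that has run dry drops out so it cannot stall the others.
    """
    queues = {cid: deque(pkts) for cid, pkts in channels.items()}
    order = deque(sorted(queues))
    while order:
        cid = order.popleft()
        if queues[cid]:
            yield cid, queues[cid].popleft()
        if queues[cid]:
            order.append(cid)  # still has work: rejoin at the back of the line


pending = {0: ["a0", "a1", "a2"], 1: ["b0"], 2: ["c0", "c1"]}
schedule = list(round_robin(pending))
print(schedule)
```

No channel is serviced twice before every other channel with pending data has had a turn, which is the "even priority" property the text refers to.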

A simple block DMA variation of the core is available in which data is transferred to or from a contiguous buffer in host memory. A variation of the block DMA core, Direct DMA, is available in which C2S transfers are managed entirely through the user interface of the DMA core. In this mode, the user supplies the C2S transfer address, size and interrupt information directly through the TUSER field of the AXI4-Stream interface. The interrupt can be enabled for each packet, or after multiple packets for interrupt coalescing, as required by the user application. For S2C transfers, the host memory address and size information is provided to the user application through the TUSER field.
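The choice between a per-packet interrupt and a coalesced interrupt can be sketched as a per-packet flag. The function name `interrupt_flags`, the `coalesce` parameter and the rule that the final packet always interrupts are assumptions made for this illustration; the actual TUSER bit layout is defined in the core's documentation and is not reproduced here.

```python
def interrupt_flags(num_packets, coalesce):
    """Return one bool per packet: True where the packet requests an interrupt.

    coalesce=1 interrupts on every packet; coalesce=N interrupts on every
    Nth packet. As an assumption of this sketch, the last packet always
    interrupts so the host is notified about a trailing partial group.
    """
    if coalesce < 1:
        raise ValueError("coalesce must be >= 1")
    flags = [(i + 1) % coalesce == 0 for i in range(num_packets)]
    if flags:
        flags[-1] = True
    return flags


print(interrupt_flags(5, 1))
print(interrupt_flags(5, 2))
```

A larger `coalesce` value reduces the number of interrupts the host has to service, at the cost of a longer delay before the host learns that earlier packets have arrived.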

A high-performance scatter-gather variation of the core is also available to support multiple scattered system buffers with packet markers (start of packet, end of packet and other packet fields). This mode is well suited to packet-based user logic such as Ethernet cores.
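As an illustration of how one packet can span several scattered buffers while keeping its start and end markers, consider the sketch below. The `Descriptor` fields and the `build_chain` helper are hypothetical names chosen for this example; they do not reflect the core's actual descriptor format.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    address: int  # host address of this buffer
    length: int   # bytes of the packet stored in this buffer
    sop: bool     # True on the first buffer of the packet
    eop: bool     # True on the last buffer of the packet


def build_chain(packet_len, buffers):
    """Spread one packet over (address, size) buffers, marking SOP and EOP."""
    chain, remaining = [], packet_len
    for address, size in buffers:
        if remaining == 0:
            break
        used = min(size, remaining)
        chain.append(Descriptor(address, used, sop=not chain, eop=False))
        remaining -= used
    if remaining:
        raise ValueError("buffers are too small for the packet")
    if chain:
        chain[-1].eop = True
    return chain


# A 9000-byte packet placed in three non-adjacent 4 KB buffers.
chain = build_chain(9000, [(0x10000000, 4096), (0x20004000, 4096), (0x30008000, 4096)])
for d in chain:
    print(hex(d.address), d.length, d.sop, d.eop)
```

The buffers do not need to be adjacent in host memory, which is what lets a driver reuse whatever pages the operating system hands out.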

The DMA core also provides optional performance instrumentation logic that helps users monitor the maximum and minimum throughput of performance-intensive applications.

Key Features

● Supports PCIe Gen1, Gen2 and Gen3
● Supports 64-bit, 128-bit and 256-bit Xilinx and Altera Endpoints
● AXI4-Streaming compliant user interface
● Supports multiple independent DMA channels
● Each channel supports block or scatter-gather mode of operation depending on the core variation
● High performance core with minimum latency and very low inter-packet gaps
● Credit-based transfers make efficient use of the available bandwidth
● Dynamic 32-bit and 64-bit system address support
● Supports up to 4096 descriptors in Scatter-Gather mode. The DMA driver generates the descriptors in a contiguous buffer in host memory so that the DMA core can fetch and update them efficiently
● Supports page size of up to 2 Megabytes
● Supports dynamic Maximum Read Request Size (MRRS), Read Completion Boundary (RCB) and Maximum Payload Size (MPS). These parameters can be updated by writing to the Endpoint configuration registers from the PCIe driver
● Supports reordering of received completions from PCIe root complex in S2C direction
● Implements packet splitter logic to split transfers at 4 KB boundaries in the C2S direction, as required by the PCIe specification
● Supports MSI and Legacy Interrupts
● Per channel DMA Interrupt Enable/Disable support
● Provides a simple register file read/write interface for accessing user application registers
● Implements optional internal Completion timeout counters and DMA performance counters
● Implements soft reset logic to reset the DMA engines to known good state
● Includes efficient Linux (32/64 bit) PCIe and DMA drivers with example applications for DMA transfers.
● Includes self-checking simulation test-benches with support for Endpoint bypass for faster simulations
● Includes a PCIe DMA reference design with a data generator/checker on the FPGA side and a C/C++ based user application on the host side
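The 4 KB splitting rule and the Maximum Payload Size limit listed above can be sketched as follows. This is a software model under stated assumptions, not the core's splitter logic: the function name `split_write` and the default `mps=256` are chosen for the example only.

```python
def split_write(address, length, mps=256):
    """Split a write into (address, size) chunks.

    No chunk is larger than `mps` bytes, and no chunk crosses a 4 KB
    address boundary, as PCIe requires for memory requests.
    """
    if mps <= 0:
        raise ValueError("mps must be positive")
    chunks = []
    while length > 0:
        to_boundary = 4096 - (address % 4096)  # bytes left before the next 4 KB line
        size = min(length, mps, to_boundary)
        chunks.append((address, size))
        address += size
        length -= size
    return chunks


# 512 bytes starting 128 bytes below a 4 KB boundary.
for addr, size in split_write(0xF80, 512):
    print(hex(addr), size)
```

The first chunk is cut short at the boundary, and the remaining bytes are sent in chunks of at most `mps` bytes.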

Licensing and Maintenance

● No yearly maintenance fees for upgrades and bug fixes
● The basic core license covers a compiled binary (synthesized netlist) for a single vendor, either Xilinx or Altera
● An additional vendor license is available at 50% of the cost of the base license. This allows cost-effective multi-vendor designs with identical user and control interfaces
● Other licensing options include:
   - Vendor and device family agnostic source code (Verilog) license

Part Number | Product Description | Binary, Single Site / Single Design | Binary, Single Site / Multi Designs
HTK-SG-PCIeDMA-64-8CH-x4G2 | x4 Gen2 PCIe, 64-bit data path PCIe backend, 8 Channel SG DMA Controller (GiGE Solution) | $4,495 | $6,495
HTK-SG-PCIeDMA-128-8CH-x8G2 | x8 Gen2 PCIe, 128-bit data path PCIe backend, 8 Channel SG DMA Controller (10G Solution) | $6,495 | $8,495
HTK-SG-PCIeDMA-128-8CH-x4G3 | x4 Gen3 PCIe, 128-bit data path PCIe backend, 8 Channel SG DMA Controller (10G Solution) | $6,495 | $8,495
HTK-SG-PCIeDMA-256-8CH-x8G3 | x8 Gen3 PCIe, 256-bit data path PCIe backend, 8 Channel SG DMA Controller (40G Solution) | $8,495 | $10,495
16 Channel DMA | Additional 8 channels to the base 8 channel DMA controllers (total 16 channels) | + $4,000 | + $6,000
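For orientation, the raw link bandwidth of each lane configuration in the table can be estimated from the per-lane signaling rate and line encoding of each PCIe generation (2.5 GT/s and 5 GT/s with 8b/10b for Gen1 and Gen2, 8 GT/s with 128b/130b for Gen3). The sketch below is an estimate only: the helper name `link_gbps` is made up for this example, and the figures are per direction before packet header and flow-control overhead, so achievable DMA throughput is lower.

```python
# (transfer rate in GT/s per lane, encoding efficiency) for each PCIe generation
GEN = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
}


def link_gbps(gen, lanes):
    """Raw data rate in Gb/s, per direction, after line encoding."""
    rate, efficiency = GEN[gen]
    return rate * efficiency * lanes


for name, gen, lanes in [("x4 Gen2", 2, 4), ("x8 Gen2", 2, 8),
                         ("x4 Gen3", 3, 4), ("x8 Gen3", 3, 8)]:
    print(f"{name}: {link_gbps(gen, lanes):.1f} Gb/s")
```

An x8 Gen2 link and an x4 Gen3 link come out at nearly the same raw rate, which is consistent with both being listed with a 128-bit data path.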

Resource Utilization

The utilization summary of the PCIe DMA solution is given in the following tables. The utilization numbers vary with the variation of the core. The solution has been fully verified on different hardware platforms for both Altera and Xilinx FPGAs and on different host platforms.


● Compiled synthesizable binaries or encrypted RTL for the DMA core
● Self-checking behavioral models and test benches for simulation
● Constraint files and synthesis scripts for design compilation
● Linux drivers with source code
● Reference design with integrated PCIe endpoint, data generator/checker and software user application
● Design guide(s) and user manuals
● USA-based technical support by the developers