10G TCP/IP Offload Engine (TOE) IP Core
►Overview:
Integrating a 10 Gbps TOE, a 10 Gigabit Ethernet MAC (10-GEMAC) and PCIe
makes this highly flexible and customizable IP core suitable for layer 3
and layer 4-7 network infrastructure and network security systems.
Applications include high-performance servers, NICs, SAN/NAS and data
center equipment. The IP core provides key building blocks for very high
performance 10 Gigabit Ethernet implemented in ASICs, ASSPs and FPGAs.
The IP core can process TCP/IP sessions as client or server in mixed
session mode, together with other protocols for network equipment and
in-line network security appliances, simultaneously and at the 10 Gbit
rate. This relieves the host CPU of costly TCP/IP software tasks such as
session setup and teardown, data copying and session maintenance, and
delivers a 4x to 8x TCP/IP network performance improvement compared
with a TCP/IP software stack.
A wide range of TOE processing hardware cores is offered for 1-GE to
10-GE applications using PCI Express or embedded system interfaces. TOE
products support full TCP offload as well as conventional NIC mode
operation (in TCP Bypass Mode). Optional advanced software support lets
applications take advantage of TOE acceleration without modification.
►TOE design versions
1) Generic TOE for network infrastructure design applications:
a) 4 sessions with payload FIFO of 8/16 KBytes.
b) 16 sessions with payload FIFO of 16/32 KBytes.
c) 64 sessions with scalable FIFO of 64K/256K bytes.
d) 65+ sessions, depending on on-chip memory or DDRx interface
(available upon request).
e) Optional very high performance DMA blocks are also available for
integration with a high-performance PCIe Gen 2 interface.
2) TOE with enhanced security features:
a) All of the options available in the Generic TOE, plus:
i. Protocol filter block that can selectively direct traffic for any
known application-level protocol to any selected MAC port; e.g. all
IM/chat, SMTP (email), Web (HTTP) or VoIP traffic can be filtered and
directed to selected ports.
ii. IP and port number filter block.
iii. Specific IP- and port-filtered traffic is routed to optionally
selected MAC interface(s), the PCIe interface or the memory interface
directly at line rate, without CPU involvement.
iv. MAC filter block; traffic is routed to any of the selected
interfaces.
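
The filter blocks listed above can be pictured as a rule table that is evaluated for each packet. The Python sketch below is a software illustration only, not the core's hardware logic: the names `FilterRule` and `select_output`, the interface labels (`MAC1`, `MAC2`, `PCIE`, `HOST`) and the first-match policy are all hypothetical and are not taken from the core's documentation.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class FilterRule:
    """One hypothetical filter entry: match criteria plus an output interface."""
    output: str                      # illustrative label, e.g. "MAC1" or "PCIE"
    dst_port: Optional[int] = None   # TCP destination port; None matches any port
    src_net: Optional[str] = None    # source prefix in CIDR form; None matches any

    def matches(self, src_ip: str, dst_port: int) -> bool:
        if self.dst_port is not None and self.dst_port != dst_port:
            return False
        if self.src_net is not None and ip_address(src_ip) not in ip_network(self.src_net):
            return False
        return True


def select_output(rules, src_ip, dst_port, default="HOST"):
    """Return the output of the first matching rule (first-match is an assumption)."""
    for rule in rules:
        if rule.matches(src_ip, dst_port):
            return rule.output
    return default


# Example table: protocol filtering by well-known port, then an IP range filter.
RULES = [
    FilterRule(output="MAC1", dst_port=25),           # SMTP (email)
    FilterRule(output="MAC2", dst_port=80),           # Web (HTTP)
    FilterRule(output="PCIE", src_net="10.0.0.0/8"),  # a specific source range
]
```

With this table, `select_output(RULES, "192.168.1.5", 25)` selects `MAC1`, and traffic that matches no rule falls through to the default host path.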

Simplified Block Diagram
►Specifications brief:
Original 1G TOE functionality proven in multiple IDS/IPS appliances
Complete header and flag processing of TCP/IP sessions in hardware
(accelerates by 10x to 20x)
TCP Offload Engine: 20 Gb/s wire-speed performance, scalable to 40 Gb/s
TCP + IP checksum in hardware
TCP segmentation/reassembly in hardware (opt)
Multiple slot storage for fragmented packets (opt)
Out of sequence packet detection/storage/Reassembly (opt)
TCP port address tracking/automatic DMA
MAC Address search logic/filter (opt)
IP address search logic/filter (opt)
Accelerates security processing and TCP storage networking
RDMA: data placement in the application's buffer (reduces CPU
utilization by 90%)
Future-proof: flexible implementation of TCP offload accommodates
future specification changes.
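
For readers unfamiliar with the checksum that the core computes in hardware, here is a minimal software sketch of the standard Internet checksum (the 16-bit one's-complement sum described in RFC 1071) that IP and TCP use. It is a reference illustration in Python, not the core's implementation; the function name `internet_checksum` is chosen here for the example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement Internet checksum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                           # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back into the low 16 bits
    return ~total & 0xFFFF


# IPv4 header with its checksum field (bytes 10-11) zeroed.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = internet_checksum(header)  # 0xB861

# A receiver verifies by summing the header with the checksum in place: the result is 0.
filled = header[:10] + checksum.to_bytes(2, "big") + header[12:]
verified = internet_checksum(filled) == 0  # True
```

A software stack runs this loop over every header and, for TCP, over the payload as well, which is the per-packet work the hardware checksum unit removes from the CPU.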
►APIs
Network applications use the Socket API, which the OS typically
implements with a software stack. In its basic mode the TOE core
implements a standard Hardware API that lets the next higher level of
applications take full advantage of the TOE's benefits.
Optionally, to achieve higher performance, an equivalent Socket API
(TOE Socket API) can be implemented to enable plug-and-play
acceleration through a simple intercept of standard Socket calls.
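
The idea of intercepting standard Socket calls can be sketched in a few lines of Python. Everything here is hypothetical: `InterceptedSocket` and `LoopbackOffload` are illustrative names, and the stand-in backend simply echoes data back in memory. The actual TOE Socket API and its driver interface are not described in this document.

```python
import socket


class LoopbackOffload:
    """Stand-in for a hypothetical TOE driver; it only echoes data in memory."""

    def __init__(self):
        self._buf = bytearray()
        self.peer = None

    def connect(self, address):
        self.peer = address          # a real driver would open a hardware session here

    def send(self, data: bytes) -> int:
        self._buf += data
        return len(data)

    def recv(self, bufsize: int) -> bytes:
        out = bytes(self._buf[:bufsize])
        del self._buf[:bufsize]
        return out


class InterceptedSocket:
    """Keeps the familiar connect/send/recv calls and routes them to an
    offload backend when one is supplied, else to the OS software stack."""

    def __init__(self, offload=None):
        self.offloaded = offload is not None
        self._impl = offload if self.offloaded else socket.socket(
            socket.AF_INET, socket.SOCK_STREAM)

    def connect(self, address):
        return self._impl.connect(address)

    def send(self, data: bytes) -> int:
        return self._impl.send(data)

    def recv(self, bufsize: int) -> bytes:
        return self._impl.recv(bufsize)


# Application code is written against the usual calls and does not change.
s = InterceptedSocket(offload=LoopbackOffload())
s.connect(("192.0.2.1", 80))         # documentation-range address, never dialed
sent = s.send(b"hello")
echoed = s.recv(5)
```

The design point is that the application sees the same call names either way, so the choice between offload and the software stack is made below the API boundary.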
Hardware API: enables dedicated processing in the FPGA for
application-specific acceleration
Ideal for very high performance, specialized, differentiated ASICs or
FPGAs for network security or network infrastructure applications
Fully verified using a comprehensive verification methodology for ASIC
ports; the core has been tested in network systems.
Smallest logic footprint: fewer than 20,000 Xilinx slices or Altera
ALMs, or 250,000 ASIC gates, plus on-chip memory
Fully integrated high-performance 10 Gbit Ethernet MAC.
Scalable MAC Rx FIFOs and Tx FIFOs make it ideal for optimizing system
performance.
Hardware implementation of the TCP/IP stack's control plane and data
plane.
Hardware implementation of ARP protocol processing.
Extended ARP table creation and deletion management (optional)
Adheres to RFCs 793, 1500, 1700, 813, 791 and 2001
Sliding window mechanism implemented in hardware, allowing total flow
control
Slow start transfer control in hardware
Non-TCP bypass mode lets all non-TCP/IP traffic go directly to the host
interface via user_fifo for the TCP/IP software to handle
Can be deployed behind a gateway which will respond to Gateway-IP
request as opposed to ARP request (optional)
On-chip DDR or SSRAM memory controller that can address from 4 KBytes
to 4 MBytes of on-chip memory or 256 MB of off-chip memory (optional)
Simple user-side interface for easy hardware integration, or a slightly
more complex one for more powerful and controlled streaming data
transfers.
Many trade-offs available between functions performed in hardware or
software
Configurable packet buffers and session table buffers in on-chip or
off-chip memories (attached DDR I/II interface), depending on system,
performance and ASIC/FPGA size requirements
Interfaces directly to XGMII and 10 Gbit serial interfaces
Architecture can be scaled up to 40 Gb/s
Customizable to handle jumbo frames
Integrated PCIe x4 bus interface; x8 and x16 (opt)
Integrated AMBA 2.0 interface or Xilinx PLB bus for local processor
control.
User-programmable, prioritizable interrupts
Performs connection/session management
Monitors, stores, maintains and processes up to 1024 live TCP
sessions. Customizable to implement more, depending on on-chip memory
availability and other FPGA limitations.
Extendable to 4K TCP sessions, depending on internal memory.
Wire-speed 20 Gbps performance in full duplex
Multiple TOEs can process up to 4K connections per second
TCP + IP checksum generation and checking performed in hardware in
fewer than 6 clocks (30 ns at 200 MHz) vs 1-2 us for a typical software
TCP stack
Connection set up and tear down/termination without CPU involvement.
User programmable Session table parameters
Dedicated set of hardware timers for each TCP/IP session, or
customizable to share one set of common timers for all stale sessions.
Multiple slot storage for fragmented packets. More slots are allocated
when more on-chip memory is available. Self-checking logic for
available memory. (optional)
Out-of-sequence packet detection/storage and reassembly/segmentation
(optional)
Direct data placement in the application's buffer at full wire speed
without the CPU (reduces the CPU's buffer copy time and utilization by
95%)
Supports VLAN bypass mode (optional)
Easily customizable for filtering various IP and TCP traffic protocols,
directed towards any port or IP (ideal for security appliances)
Implements full TCP/IP offload or bypass mode (optional)
Basic mini API available for easy integration with Linux/Windows.
Other OSs/CPUs also available
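
The checksum timing figure quoted in the list above (6 clocks at 200 MHz against 1-2 us in software) can be checked with simple arithmetic. The short Python sketch below only restates those quoted numbers; the helper name `cycles_to_ns` is chosen here for illustration, and the resulting ratios are an implied comparison rather than a measured result.

```python
def cycles_to_ns(cycles: int, clock_mhz: float) -> float:
    """Convert a clock-cycle count to nanoseconds at the given clock frequency."""
    return cycles * 1000.0 / clock_mhz


hw_ns = cycles_to_ns(6, 200.0)        # 6 clocks at 200 MHz -> 30.0 ns

# Quoted software range of 1-2 us, expressed in nanoseconds.
sw_low_ns, sw_high_ns = 1000.0, 2000.0

ratio_low = sw_low_ns / hw_ns         # about 33x
ratio_high = sw_high_ns / hw_ns       # about 67x
```

On the quoted figures alone, the hardware checksum path is therefore roughly 33 to 67 times faster per operation than the software range it is compared with.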
►Deliverables
Verilog source code or netlist.
Test bench, .vcd files, configuration code/API for easy Linux port
Verilog models for various components, e.g. TCP/IP client and server
models, transaction model (optional)
10-GEMAC
External memory interface/model (optional).
TCP Model (optional)
Verification suite (optional)
Test packet-traffic suite (optional)
Price:

Supported Development Boards:
Xilinx Virtex-6
Altera Stratix IV GT
Xilinx Virtex-5
Xilinx Virtex-6 HXT