This document describes the FPGA (Field-Programmable Gate Array) support in the Phynexus framework.
FPGAs are reconfigurable hardware devices that can be programmed to implement custom digital circuits, making them ideal for accelerating specific computational tasks. Phynexus includes support for FPGA acceleration through multiple frameworks, enabling high-performance, energy-efficient execution of AI workloads.
The FPGA support in Phynexus is designed to be flexible and extensible, supporting various FPGA vendors and programming models while providing a unified API for developers.
The FPGA backend in Phynexus provides:
Phynexus supports various FPGA devices:
# Python
from neurenix.hardware.fpga import FPGAManager
# Create an FPGA manager
fpga_manager = FPGAManager(
framework="opencl", # Options: "opencl", "vitis", "openvino"
precision="float32", # Options: "float32", "float16", "int8"
optimize_for="throughput" # Options: "throughput", "latency", "power"
)
# Initialize the FPGA environment
fpga_manager.initialize()
# Get information about available FPGAs
fpga_info = fpga_manager.get_fpga_info()
print(f"FPGA information: {fpga_info}")
# Compile a model for FPGA execution
compiled_model = fpga_manager.compile_model(model, example_inputs)
# Execute the model on FPGA
outputs = fpga_manager.execute_model(compiled_model, inputs)
# Clean up
fpga_manager.finalize()
# Python
from neurenix.hardware.fpga import OpenCLManager
# Create an OpenCL FPGA manager
opencl_manager = OpenCLManager(
precision="float32",
optimize_for="throughput"
)
# Get available platforms
platforms = opencl_manager.get_platforms()
print(f"Available OpenCL platforms: {platforms}")
# Get devices for a specific platform
devices = opencl_manager.get_devices(platform_id=0)
print(f"Available OpenCL devices: {devices}")
# Create and execute a kernel
kernel = opencl_manager.create_kernel(
kernel_name="vector_add",
kernel_source="""
__kernel void vector_add(__global const float* a, __global const float* b, __global float* c) {
int gid = get_global_id(0);
c[gid] = a[gid] + b[gid];
}
"""
)
# Execute the kernel
opencl_manager.execute_kernel(
kernel=kernel,
global_size=(1024,),
local_size=(64,),
args=[input_a, input_b, output]
)
# Python
from neurenix.hardware.fpga import VitisManager
# Create a Vitis FPGA manager
vitis_manager = VitisManager(
target_device="u250",
precision="float32",
optimize_for="throughput"
)
# Load an XCLBIN file
vitis_manager.load_xclbin("/path/to/accelerator.xclbin")
# Get available kernels
kernels = vitis_manager.get_kernels()
print(f"Available kernels: {kernels}")
# Execute a kernel
vitis_manager.execute_kernel(
kernel_name="matrix_multiply",
args=[input_a, input_b, output]
)
# Python
from neurenix.hardware.fpga import OpenVINOManager
# Create an OpenVINO FPGA manager
openvino_manager = OpenVINOManager(
precision="float32",
optimize_for="throughput",
cache_dir="/tmp/openvino_cache"
)
# Load a model
model = openvino_manager.load_model(
model_path="/path/to/model.xml",
weights_path="/path/to/model.bin"
)
# Get input and output information
input_info = openvino_manager.get_input_info(model)
output_info = openvino_manager.get_output_info(model)
print(f"Input info: {input_info}")
print(f"Output info: {output_info}")
# Run inference
outputs = openvino_manager.infer(model, {"input": input_tensor})
# Python
from neurenix.hardware.fpga import FPGAManager
# Use the FPGA manager as a context manager
with FPGAManager(framework="opencl") as fpga:
# The FPGA environment is automatically initialized
# Compile and execute a model
compiled_model = fpga.compile_model(model, example_inputs)
outputs = fpga.execute_model(compiled_model, inputs)
# The FPGA environment is automatically finalized when exiting the context
The FPGA backend implementation in Phynexus follows a layered architecture:
The OpenCL implementation uses the OpenCL API to interact with FPGA devices from various vendors. It provides:
The Xilinx Vitis implementation uses the Vitis AI and Vitis HLS tools to:
The Intel OpenVINO implementation uses the OpenVINO toolkit to:
For optimal performance on FPGAs:
# Python
from neurenix.hardware.fpga import FPGAManager
# Create an FPGA manager with optimization settings
fpga_manager = FPGAManager(
framework="opencl",
precision="float16", # Use lower precision for better performance
optimize_for="throughput",
memory_allocation="static" # Use static memory allocation for predictable performance
)
# Compile the model with optimization
compiled_model = fpga_manager.compile_model(model, example_inputs)
When designing custom kernels for FPGAs:
Efficient memory management is crucial for FPGA performance:
# Python
from neurenix.hardware.fpga import OpenCLManager
# Create an OpenCL FPGA manager
opencl_manager = OpenCLManager()
# Create a kernel with efficient memory access patterns
kernel = opencl_manager.create_kernel(
kernel_name="efficient_kernel",
kernel_source="""
__kernel void efficient_kernel(__global const float* input, __global float* output) {
// Use local memory for frequently accessed data
__local float local_data[256];
// Load data into local memory
int lid = get_local_id(0);
int gid = get_global_id(0);
local_data[lid] = input[gid];
barrier(CLK_LOCAL_MEM_FENCE);
// Process data from local memory
output[gid] = local_data[lid] * 2.0f;
}
"""
)
| Feature | Neurenix | TensorFlow |
|---|---|---|
| FPGA Support | Native support for multiple FPGA frameworks | Limited support through third-party extensions |
| Xilinx Integration | Native Vitis support | Requires TensorFlow-Vitis AI bridge |
| Intel Integration | Native OpenVINO support | Requires OpenVINO-TensorFlow bridge |
| OpenCL Support | Comprehensive OpenCL support | Limited OpenCL support |
| Custom Kernel API | Unified API for custom kernels | Complex integration for custom kernels |
| Multi-vendor Support | Support for multiple FPGA vendors | Primarily focused on specific vendors |
Neurenix provides more comprehensive and integrated FPGA support compared to TensorFlow, which relies on third-party extensions for FPGA acceleration. The unified API in Neurenix makes it easier to work with different FPGA vendors and frameworks.
| Feature | Neurenix | PyTorch |
|---|---|---|
| FPGA Support | Native support for multiple FPGA frameworks | No official FPGA support |
| Xilinx Integration | Native Vitis support | Requires custom integration |
| Intel Integration | Native OpenVINO support | Requires custom integration |
| OpenCL Support | Comprehensive OpenCL support | No native OpenCL support |
| Custom Kernel API | Unified API for custom kernels | No standard API for FPGA kernels |
| Model Compilation | Built-in model compilation for FPGAs | Requires external tools |
PyTorch lacks official FPGA support, requiring custom integrations or third-party tools for FPGA acceleration. Neurenix's native FPGA support provides a more seamless experience with built-in model compilation and optimization.
| Feature | Neurenix | Scikit-Learn |
|---|---|---|
| FPGA Support | Comprehensive FPGA support | No FPGA support |
| Hardware Acceleration | Multiple acceleration options | CPU only |
| Custom Kernels | Support for custom FPGA kernels | No hardware kernel support |
| Model Compilation | Built-in model compilation for FPGAs | No hardware compilation |
| Multi-vendor Support | Support for multiple FPGA vendors | No hardware vendor support |
| Precision Control | Multiple precision options | Limited precision control |
Scikit-Learn does not provide any FPGA or hardware acceleration support, focusing solely on CPU execution. Neurenix's FPGA support enables significant performance improvements for suitable workloads.
The current FPGA implementation has the following limitations:
Future development of the FPGA backend will include: