This document describes the DirectML support in the Phynexus framework.
DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. It enables AI workloads to run on a wide variety of DirectX 12-capable GPUs on Windows systems. Phynexus includes DirectML support to provide hardware acceleration on Windows devices, particularly those without dedicated CUDA or ROCm support.
DirectML uses the DirectX 12 API for cross-vendor GPU acceleration, so the same model can run on GPUs from NVIDIA, AMD, and Intel on Windows platforms. This makes it a practical choice for deploying AI applications to Windows systems with diverse hardware configurations.
The DirectML backend in Phynexus provides:
DirectML support in Phynexus requires:
Compatible hardware includes:

- NVIDIA GPUs (Kepler architecture or newer)
- AMD GPUs (GCN architecture or newer)
- Intel GPUs (Gen9 architecture or newer)
- Any other DirectX 12 compatible GPU
# Python
from neurenix.hardware.directml import is_directml_available
if is_directml_available():
    print("DirectML is available")
else:
    print("DirectML is not available")
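The check above assumes the `neurenix.hardware.directml` module can be imported. On a machine where the package is not installed, the import itself fails before the check runs. A minimal sketch of a guarded probe follows; the helper name `directml_available` and the device labels `"directml"` and `"cpu"` are illustrative assumptions, not part of the documented API.

```python
import importlib


def directml_available() -> bool:
    """Return True only if the DirectML module imports and reports availability."""
    try:
        # Module path taken from the example above; it may be absent on this machine.
        module = importlib.import_module("neurenix.hardware.directml")
    except ImportError:
        return False
    try:
        return bool(module.is_directml_available())
    except Exception:
        # Treat any probing failure (missing driver, unsupported OS) as "not available".
        return False


# Hypothetical device labels, used here only to show the decision.
device = "directml" if directml_available() else "cpu"
print(f"Selected device: {device}")
```

Wrapping the probe this way keeps the rest of a script importable on non-Windows development machines.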
# Python
from neurenix.hardware.directml import DirectMLBackend
# Create the backend
try:
    directml = DirectMLBackend()
    # Initialize the backend
    if directml.initialize():
        print("DirectML backend initialized successfully")
    else:
        print("Failed to initialize DirectML backend")
except RuntimeError as e:
    print(f"DirectML error: {e}")
# Python
from neurenix.hardware.directml import DirectMLBackend
# Create and initialize the backend
directml = DirectMLBackend()
directml.initialize()
# Get the number of available devices
device_count = directml.get_device_count()
print(f"Available DirectML devices: {device_count}")
# Get information about a specific device
device_info = directml.get_device_info(0) # First device
print(f"Device info: {device_info}")
# Python
import neurenix as nx
from neurenix.hardware.directml import DirectMLBackend
# Create tensors
a = nx.Tensor([[1, 2], [3, 4]])
b = nx.Tensor([[5, 6], [7, 8]])
# Create and initialize the backend
directml = DirectMLBackend()
directml.initialize()
# Perform matrix multiplication using DirectML
c = directml.matmul(a, b)
print(f"Result: {c}")
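When validating an accelerated backend, it can help to compare its output against a plain reference computation. The sketch below is a pure-Python matrix multiply over nested lists, independent of Neurenix; `matmul_reference` is a hypothetical helper name. It reproduces the product expected for the two matrices used above.

```python
def matmul_reference(a, b):
    """Multiply two matrices given as nested lists (rows of a times columns of b)."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


expected = matmul_reference([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(expected)  # [[19, 22], [43, 50]]
```

How the values of a Neurenix tensor are extracted for comparison is not covered here, since the source does not show that part of the API.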
# Python
import neurenix as nx
from neurenix.hardware.directml import DirectMLBackend
# Create input and weight tensors
inputs = nx.random.randn(1, 3, 32, 32)  # Batch, Channels, Height, Width
weight = nx.random.randn(16, 3, 3, 3)  # Out channels, In channels, Kernel H, Kernel W
# Create and initialize the backend
directml = DirectMLBackend()
directml.initialize()
# Perform 2D convolution using DirectML
output = directml.conv2d(
    input=inputs,
    weight=weight,
    bias=None,
    stride=(1, 1),
    padding=(1, 1),
)
print(f"Output shape: {output.shape}")
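The output shape printed above can be predicted ahead of time. The sketch below assumes the backend follows the usual convolution arithmetic for a batch/channels/height/width layout with no dilation, which the source does not state explicitly; `conv2d_output_shape` is a hypothetical helper, not a Neurenix function.

```python
def conv2d_output_shape(input_shape, weight_shape, stride=(1, 1), padding=(0, 0)):
    """Compute the output shape of a 2D convolution (no dilation assumed)."""
    n, c_in, h, w = input_shape
    c_out, c_in_w, k_h, k_w = weight_shape
    if c_in != c_in_w:
        raise ValueError("input channels do not match weight channels")
    # Standard formula: floor((size + 2 * pad - kernel) / stride) + 1
    out_h = (h + 2 * padding[0] - k_h) // stride[0] + 1
    out_w = (w + 2 * padding[1] - k_w) // stride[1] + 1
    return (n, c_out, out_h, out_w)


shape = conv2d_output_shape((1, 3, 32, 32), (16, 3, 3, 3), stride=(1, 1), padding=(1, 1))
print(shape)  # (1, 16, 32, 32)
```

With a 3x3 kernel, stride 1, and padding 1, the spatial size is preserved, so only the channel count changes from 3 to 16.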
The DirectML backend implementation in Phynexus follows a layered architecture:
The implementation uses the DirectML API to:
The DirectML backend includes automatic fallback to CPU implementations when:

- DirectML is not available on the system
- The operation is not supported by DirectML
- An error occurs during DirectML execution
This ensures that code written to use DirectML will still work on systems without DirectML support, albeit with reduced performance.
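The fallback behavior described above can be pictured as a try-then-fall-back wrapper. The following is a generic sketch of that pattern, not the framework's internal implementation; `run_with_fallback`, `failing_directml_op`, and `cpu_op` are hypothetical names, and the choice of exception types is an assumption.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_fallback(accelerated: Callable[[], T], cpu: Callable[[], T]) -> T:
    """Try the accelerated path first; use the CPU path if it raises."""
    try:
        return accelerated()
    except (RuntimeError, NotImplementedError) as exc:
        print(f"Accelerated path failed ({exc}); falling back to CPU")
        return cpu()


def failing_directml_op():
    # Stand-in for an operation the accelerator does not support.
    raise NotImplementedError("operation not supported by DirectML")


def cpu_op():
    # Stand-in for the equivalent CPU implementation.
    return [[19, 22], [43, 50]]


result = run_with_fallback(failing_directml_op, cpu_op)
print(result)
```

Catching a narrow set of exception types keeps genuine programming errors, such as a `TypeError` from a bad argument, visible instead of silently rerouting them to the CPU.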
When multiple DirectML-compatible devices are available:
# Python
from neurenix.hardware.directml import DirectMLBackend
# Create the backend
directml = DirectMLBackend()
directml.initialize()
# Get the number of available devices
device_count = directml.get_device_count()
# Get information about all devices
for i in range(device_count):
    device_info = directml.get_device_info(i)
    print(f"Device {i}: {device_info}")
# Select the best device based on your requirements
# (In a real implementation, you would choose based on device capabilities)
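The selection step left open above could look like the sketch below. The source does not document what `get_device_info` returns, so the dictionary fields used here (`name`, `dedicated_memory_mb`, `is_integrated`) are purely hypothetical, as is the helper `pick_best_device`. The policy shown prefers discrete GPUs and breaks ties by memory size.

```python
def pick_best_device(devices):
    """Return the index of the preferred device: discrete first, then most memory."""
    if not devices:
        raise ValueError("no DirectML devices reported")

    def score(item):
        _, info = item
        # Tuples compare element by element: discrete (True) beats integrated (False).
        return (not info.get("is_integrated", False), info.get("dedicated_memory_mb", 0))

    best_index, _ = max(enumerate(devices), key=score)
    return best_index


# Hypothetical device descriptions standing in for get_device_info() results.
sample_devices = [
    {"name": "Integrated GPU", "dedicated_memory_mb": 128, "is_integrated": True},
    {"name": "Discrete GPU", "dedicated_memory_mb": 8192, "is_integrated": False},
]
print(pick_best_device(sample_devices))  # 1
```

Other policies, such as preferring the integrated GPU to save power, only require changing the `score` function.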
For optimal performance with DirectML:
For best performance with DirectML:
| Feature | Neurenix | TensorFlow |
|---|---|---|
| DirectML Support | Native integration | Via DirectML plugin |
| Windows Integration | Seamless | Requires additional setup |
| Cross-vendor Support | NVIDIA, AMD, Intel GPUs | Limited cross-vendor support |
| Fallback Mechanism | Automatic CPU fallback | Manual fallback required |
| API Simplicity | Unified API | Separate API for DirectML |
| Integration with Framework | Fully integrated | Plugin-based integration |
Neurenix integrates DirectML directly, whereas TensorFlow relies on the separate `tensorflow-directml-plugin` package. The unified API in Neurenix makes it easier to use DirectML acceleration while maintaining compatibility with other backends.
| Feature | Neurenix | PyTorch |
|---|---|---|
| DirectML Support | Native integration | Via DirectML plugin |
| Windows Integration | Seamless | Requires additional setup |
| Cross-vendor Support | NVIDIA, AMD, Intel GPUs | Limited cross-vendor support |
| Fallback Mechanism | Automatic CPU fallback | Manual fallback required |
| API Simplicity | Unified API | Separate API for DirectML |
| Integration with Framework | Fully integrated | Plugin-based integration |
PyTorch's DirectML support is provided through the separate `torch-directml` package, which introduces additional complexity. Neurenix's native DirectML support provides a more integrated experience with automatic fallback to CPU when needed.
| Feature | Neurenix | Scikit-Learn |
|---|---|---|
| DirectML Support | Comprehensive support | No DirectML support |
| Hardware Acceleration | Multiple acceleration options | CPU only |
| Windows GPU Support | Native support | No GPU support on Windows |
| Cross-vendor Support | NVIDIA, AMD, Intel GPUs | No GPU support |
| API Consistency | Consistent API across devices | CPU-only API |
| Performance on Windows | Optimized for Windows GPUs | Limited to CPU performance |
Scikit-Learn does not provide DirectML support and focuses on CPU execution. On Windows systems with compatible GPUs, Neurenix's DirectML support can therefore deliver significant performance improvements for workloads that benefit from GPU acceleration.
The current DirectML implementation has the following limitations:
Future development of the DirectML backend will include: