This document describes the Intel MKL-DNN (Math Kernel Library for Deep Neural Networks) support in the Phynexus framework.
Intel MKL-DNN is the predecessor to oneDNN, providing optimized primitives for deep learning applications on Intel CPUs. It includes highly optimized implementations of basic operations like convolution, matrix multiplication, and activation functions specifically tuned for Intel architectures. Phynexus maintains support for MKL-DNN to ensure compatibility with systems that have not yet migrated to oneDNN.
The MKL-DNN backend in Phynexus leverages Intel's optimized primitives to significantly improve performance for both training and inference workloads on Intel CPUs. This integration allows models to run efficiently on a wide range of Intel platforms, from laptops to high-performance computing systems.
The MKL-DNN backend in Phynexus provides:
MKL-DNN support in Phynexus requires:
Compatible hardware includes:
- Intel Xeon processors
- Intel Core processors
- Intel Atom processors
# Python
from neurenix.hardware.mkldnn import is_mkldnn_available
if is_mkldnn_available():
    print("MKL-DNN is available")
else:
    print("MKL-DNN is not available")
# Python
from neurenix.hardware.mkldnn import MKLDNNBackend
# Create the backend
try:
    mkldnn = MKLDNNBackend()
    # Initialize the backend
    if mkldnn.initialize():
        print("MKL-DNN backend initialized successfully")
    else:
        print("Failed to initialize MKL-DNN backend")
except RuntimeError as e:
    print(f"MKL-DNN error: {e}")
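The availability check and the initialization steps above can be combined into one helper that falls back to a default CPU path when MKL-DNN cannot be used. This is a minimal sketch: `select_backend` and the `"cpu"` label are hypothetical names chosen for illustration. Only `is_mkldnn_available`, `MKLDNNBackend`, and `initialize` come from the examples above.

```python
def select_backend():
    """Return ("mkldnn", backend) when usable, otherwise ("cpu", None).

    Hypothetical helper: the function name and the "cpu" label are
    illustrative and not part of the documented API.
    """
    try:
        from neurenix.hardware.mkldnn import MKLDNNBackend, is_mkldnn_available
    except ImportError:
        # The package or its MKL-DNN module is not installed.
        return "cpu", None

    if not is_mkldnn_available():
        return "cpu", None

    try:
        backend = MKLDNNBackend()
        if backend.initialize():
            return "mkldnn", backend
    except RuntimeError:
        # Creation or initialization failed; fall through to the CPU path.
        pass
    return "cpu", None


name, backend = select_backend()
print(f"Using backend: {name}")
```

Keeping the import inside the function means the caller still gets a usable answer on machines where the module is missing.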
# Python
from neurenix.hardware.mkldnn import MKLDNNBackend
# Create and initialize the backend
mkldnn = MKLDNNBackend()
mkldnn.initialize()
# Get the number of available devices
device_count = mkldnn.get_device_count()
print(f"Available MKL-DNN devices: {device_count}")
# Get information about a specific device
device_info = mkldnn.get_device_info(0) # First device
print(f"Device info: {device_info}")
# Python
import neurenix as nx
from neurenix.hardware.mkldnn import MKLDNNBackend
# Create tensors
a = nx.Tensor([[1, 2], [3, 4]])
b = nx.Tensor([[5, 6], [7, 8]])
# Create and initialize the backend
mkldnn = MKLDNNBackend()
mkldnn.initialize()
# Perform matrix multiplication using MKL-DNN
c = mkldnn.matmul(a, b)
print(f"Result: {c}")
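To sanity-check the backend's result, the same product can be computed independently. The sketch below uses plain nested lists; `matmul_reference` is a hypothetical helper written for this comparison, not a Neurenix function.

```python
def matmul_reference(a, b):
    """Multiply two matrices given as row-major nested lists."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions must match")
    cols = len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


expected = matmul_reference([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(expected)  # [[19, 22], [43, 50]]
```

The values held by `c` in the MKL-DNN example should agree with `expected` up to floating-point rounding.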
# Python
import neurenix as nx
from neurenix.hardware.mkldnn import MKLDNNBackend
# Create input and weight tensors
inputs = nx.random.randn(1, 3, 32, 32)  # Batch, Channels, Height, Width
weight = nx.random.randn(16, 3, 3, 3)  # Out channels, In channels, Kernel H, Kernel W
# Create and initialize the backend
mkldnn = MKLDNNBackend()
mkldnn.initialize()
# Perform 2D convolution using MKL-DNN
output = mkldnn.conv2d(
    input=inputs,
    weight=weight,
    bias=None,
    stride=(1, 1),
    padding=(1, 1),
)
print(f"Output shape: {output.shape}")
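The output shape of the convolution above can be worked out ahead of time with the standard formula `out = (in + 2 * padding - kernel) // stride + 1`. The sketch below assumes no dilation and no grouping. `conv2d_output_shape` is a hypothetical helper, and whether the backend follows exactly this convention is an assumption.

```python
def conv2d_output_shape(input_shape, weight_shape, stride=(1, 1), padding=(0, 0)):
    """Compute the NCHW output shape of a 2D convolution (no dilation, no groups)."""
    n, c_in, h, w = input_shape
    c_out, c_in_w, k_h, k_w = weight_shape
    if c_in != c_in_w:
        raise ValueError("input channels and weight channels differ")
    h_out = (h + 2 * padding[0] - k_h) // stride[0] + 1
    w_out = (w + 2 * padding[1] - k_w) // stride[1] + 1
    return (n, c_out, h_out, w_out)


shape = conv2d_output_shape((1, 3, 32, 32), (16, 3, 3, 3), stride=(1, 1), padding=(1, 1))
print(shape)  # (1, 16, 32, 32)
```

With a 3x3 kernel, stride 1, and padding 1, the spatial size is preserved, so only the channel count changes from 3 to 16.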
# Python
import neurenix as nx
from neurenix.hardware.mkldnn import MKLDNNBackend
# Create input and hidden state tensors
batch_size = 1
seq_length = 10
input_size = 20
hidden_size = 30
inputs = nx.random.randn(seq_length, batch_size, input_size)
h0 = nx.random.randn(batch_size, hidden_size)
c0 = nx.random.randn(batch_size, hidden_size)
hidden = (h0, c0)
weight_ih = nx.random.randn(4 * hidden_size, input_size)
weight_hh = nx.random.randn(4 * hidden_size, hidden_size)
bias_ih = nx.random.randn(4 * hidden_size)
bias_hh = nx.random.randn(4 * hidden_size)
# Create and initialize the backend
mkldnn = MKLDNNBackend()
mkldnn.initialize()
# Perform LSTM operation using MKL-DNN
output, (h_n, c_n) = mkldnn.lstm(
    input=inputs,
    hidden=hidden,
    weight_ih=weight_ih,
    weight_hh=weight_hh,
    bias_ih=bias_ih,
    bias_hh=bias_hh,
)
print(f"Output shape: {output.shape}")
print(f"Final hidden state shape: {h_n.shape}")
print(f"Final cell state shape: {c_n.shape}")
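The `4 * hidden_size` leading dimension of the weights and biases reflects the four LSTM gates stacked into one matrix. The sketch below shows one time step for a single sample in plain Python. It assumes the gates are stacked in the order input, forget, cell, output, which is a common convention but is not confirmed by this document for the MKL-DNN backend. `lstm_cell_step` and its helpers are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a row-major nested-list matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lstm_cell_step(x, h, c, weight_ih, weight_hh, bias_ih, bias_hh):
    """One LSTM step. Assumed gate order: input, forget, cell, output."""
    n = len(h)  # hidden_size
    gates = [
        a + b + p + q
        for a, b, p, q in zip(matvec(weight_ih, x), matvec(weight_hh, h), bias_ih, bias_hh)
    ]
    i_gate = [sigmoid(v) for v in gates[0:n]]
    f_gate = [sigmoid(v) for v in gates[n:2 * n]]
    g_gate = [math.tanh(v) for v in gates[2 * n:3 * n]]
    o_gate = [sigmoid(v) for v in gates[3 * n:4 * n]]
    c_next = [f * c_t + i * g for f, c_t, i, g in zip(f_gate, c, i_gate, g_gate)]
    h_next = [o * math.tanh(c_t) for o, c_t in zip(o_gate, c_next)]
    return h_next, c_next


# Tiny example: input_size = 3, hidden_size = 2, all weights and biases zero.
hidden_size, input_size = 2, 3
w_ih = [[0.0] * input_size for _ in range(4 * hidden_size)]
w_hh = [[0.0] * hidden_size for _ in range(4 * hidden_size)]
b = [0.0] * (4 * hidden_size)
h_next, c_next = lstm_cell_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -1.0], w_ih, w_hh, b, b)
print(c_next)  # [0.5, -0.5]
```

With zero weights every sigmoid gate evaluates to 0.5 and the cell candidate to 0, so the cell state is simply halved. A full sequence applies this step once per time step, carrying `h_next` and `c_next` forward.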
The MKL-DNN backend implementation in Phynexus follows a layered architecture:
The implementation uses MKL-DNN's engine and stream abstractions:
These abstractions allow for efficient execution and synchronization of operations.
The implementation includes sophisticated memory management:
MKL-DNN operations are represented as primitives:
For optimal performance with MKL-DNN:
# Python
import neurenix as nx
from neurenix.hardware.mkldnn import MKLDNNBackend
# Create and initialize the backend
mkldnn = MKLDNNBackend()
mkldnn.initialize()
# When creating tensors, let MKL-DNN choose the optimal memory format
# (In a real implementation, this would be handled internally)
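To illustrate what a memory format means here, the sketch below compares element offsets in the plain NCHW layout with a channel-blocked layout of the kind MKL-DNN uses (for example `nChw8c`, where channels are grouped in blocks of 8). Both offset functions are hypothetical helpers for illustration. Which layout the backend actually selects is not specified in this document.

```python
def offset_nchw(n, c, h, w, shape):
    """Linear offset of element (n, c, h, w) in a plain NCHW tensor."""
    _, channels, height, width = shape
    return ((n * channels + c) * height + h) * width + w


def offset_nchw8c(n, c, h, w, shape, block=8):
    """Linear offset in a channel-blocked layout: [N][C/block][H][W][block]."""
    _, channels, height, width = shape
    c_blocks = (channels + block - 1) // block  # channels padded up to a full block
    outer = ((n * c_blocks + c // block) * height + h) * width + w
    return outer * block + c % block


shape = (1, 16, 32, 32)
print(offset_nchw(0, 9, 0, 0, shape))    # 9216
print(offset_nchw8c(0, 9, 0, 0, shape))  # 8193
# In the blocked layout, 8 consecutive channels of one pixel are adjacent in memory.
print([offset_nchw8c(0, c, 0, 0, shape) for c in range(8)])  # [0, 1, 2, 3, 4, 5, 6, 7]
```

Placing a block of channels contiguously lets a vectorized kernel load them with a single wide instruction, which is the usual motivation for blocked formats.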
Reuse primitives for repeated operations:
# Python
# In a real implementation, the MKLDNNBackend class would cache primitives internally
# for repeated operations with the same parameters
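The caching idea described above can be sketched as a dictionary keyed by the operation name and its parameters. `PrimitiveCache` and `build_conv` are hypothetical names. This is an illustration of the technique under those assumptions, not the internal mechanism of `MKLDNNBackend`.

```python
class PrimitiveCache:
    """Hypothetical cache that maps operation parameters to a built primitive."""

    def __init__(self):
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get_or_create(self, op_name, params, build):
        # Parameter values must be hashable (ints, tuples, strings).
        key = (op_name, tuple(sorted(params.items())))
        if key in self._cache:
            self.hits += 1
        else:
            self.misses += 1
            self._cache[key] = build(**params)
        return self._cache[key]


def build_conv(in_channels, out_channels, kernel, stride, padding):
    # Stand-in for an expensive primitive-creation step.
    return ("conv2d", in_channels, out_channels, kernel, stride, padding)


cache = PrimitiveCache()
params = dict(in_channels=3, out_channels=16, kernel=(3, 3), stride=(1, 1), padding=(1, 1))
first = cache.get_or_create("conv2d", params, build_conv)
second = cache.get_or_create("conv2d", params, build_conv)
print(first is second, cache.hits, cache.misses)  # True 1 1
```

The second request returns the object built by the first, so the expensive construction runs once per distinct parameter set.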
Choose appropriate data types for your workload:
# Python
import neurenix as nx
from neurenix.hardware.mkldnn import MKLDNNBackend
# Create and initialize the backend
mkldnn = MKLDNNBackend()
mkldnn.initialize()
# For inference, consider using lower precision
# (In a real implementation, this would be configurable)
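As one illustration of lower precision, the sketch below applies symmetric per-tensor int8 quantization in plain Python. `quantize_int8` and `dequantize_int8` are hypothetical helpers. Whether and how the Phynexus MKL-DNN backend exposes reduced-precision execution is not stated in this document.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to the range [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Map int8 codes back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
print(q)  # [50, -127, 0, 100]
print(max(abs(a - b) for a, b in zip(weights, restored)) <= scale / 2 + 1e-12)  # True
```

Each restored value differs from the original by at most half a quantization step, which is the accuracy cost traded for smaller tensors and faster integer arithmetic.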
| Feature | Neurenix | TensorFlow |
|---|---|---|
| MKL-DNN Integration | Native integration | Via MKL-DNN plugin |
| Intel CPU Optimization | Comprehensive | Limited |
| Memory Format Optimization | Automatic | Manual configuration required |
| Primitive Caching | Automatic | Limited |
| API Simplicity | Unified API | Separate API for MKL-DNN |
| Transition to oneDNN | Smooth transition path | Requires code changes |
Neurenix provides more seamless integration with MKL-DNN compared to TensorFlow, which requires a separate plugin. The unified API in Neurenix makes it easier to use MKL-DNN acceleration while maintaining compatibility with other backends, and provides a smooth transition path to oneDNN.
| Feature | Neurenix | PyTorch |
|---|---|---|
| MKL-DNN Integration | Native integration | Via MKLDNN backend |
| Intel CPU Optimization | Comprehensive | Good |
| Memory Format Optimization | Automatic | Manual configuration required |
| Primitive Caching | Automatic | Limited |
| API Simplicity | Unified API | Separate API for MKLDNN |
| Transition to oneDNN | Smooth transition path | Requires code changes |
PyTorch has some integration with MKL-DNN through its MKLDNN backend, but it's not as deeply integrated as Neurenix's native MKL-DNN support. Neurenix provides a more seamless experience with automatic memory format optimization and primitive caching, as well as a smoother transition path to oneDNN.
| Feature | Neurenix | Scikit-Learn |
|---|---|---|
| MKL-DNN Support | Comprehensive support | No MKL-DNN support |
| Deep Learning Primitives | Optimized primitives | Generic implementations |
| Intel CPU Optimization | Comprehensive | Limited |
| Memory Format Optimization | Automatic | N/A |
| Performance on Intel Hardware | Optimized | Not optimized for deep learning |
| Transition to oneDNN | Smooth transition path | N/A |
Scikit-Learn does not provide any MKL-DNN integration, focusing on traditional machine learning algorithms rather than deep learning. Neurenix's MKL-DNN support enables significant performance improvements for deep learning workloads on Intel hardware.
The current MKL-DNN implementation has the following limitations:
Future development of the MKL-DNN backend will focus on: