Hugging Face 1 | API - Neurenix Documentation

← Previous Page

Overview¶

The Hugging Face Integration module in Neurenix provides seamless interoperability with the Hugging Face ecosystem, enabling users to leverage pre-trained models, datasets, and tools from the Hugging Face Hub within the Neurenix framework. This integration allows researchers and developers to combine the performance benefits of Neurenix's multi-language architecture with the vast resources available in the Hugging Face ecosystem.

Built on Neurenix's high-performance Rust and C++ backends with a user-friendly Python interface, this module ensures efficient execution of Hugging Face models while maintaining compatibility with the broader Hugging Face ecosystem. The integration supports a wide range of model architectures, including transformers for natural language processing, vision transformers for computer vision, and multimodal models.

The module provides utilities for model conversion, fine-tuning, inference, and deployment, making it easy to incorporate state-of-the-art pre-trained models into Neurenix-based applications. It also includes tools for dataset access and processing, enabling seamless integration with the Hugging Face datasets library.

Key Concepts¶

Model Conversion and Compatibility¶

The module provides tools for converting models between Hugging Face's format and Neurenix's native format:

Model Import: Convert Hugging Face models to Neurenix format while preserving architecture and weights
Model Export: Convert Neurenix models to Hugging Face format for sharing and compatibility
Weight Mapping: Automatic mapping of weights between different parameter naming conventions
Architecture Adaptation: Adaptation of model architectures to ensure compatibility
Checkpoint Conversion: Tools for converting checkpoints between formats

This bidirectional conversion ensures seamless integration between the two ecosystems.

Pre-trained Model Access¶

The module provides direct access to pre-trained models from the Hugging Face Hub:

Model Discovery: Browse and search for models on the Hugging Face Hub
Automatic Download: Download and cache models for offline use
Version Management: Track and manage model versions
Model Cards: Access model metadata and documentation
Authentication: Secure access to private models and organizations

These features enable users to leverage the vast collection of pre-trained models available on the Hugging Face Hub.

Fine-tuning and Adaptation¶

The module includes tools for fine-tuning Hugging Face models using Neurenix's training infrastructure:

Transfer Learning: Fine-tune pre-trained models on domain-specific data
Parameter-Efficient Fine-tuning: Support for techniques like LoRA, Prefix Tuning, and P-Tuning
Adapter Integration: Use and manage adapters for efficient model adaptation
Quantization-Aware Fine-tuning: Fine-tune models with quantization in mind
Distillation: Distill knowledge from larger models to smaller ones

These capabilities enable efficient adaptation of pre-trained models to specific tasks and domains.

Inference and Deployment¶

The module provides optimized inference for Hugging Face models using Neurenix's hardware acceleration:

Optimized Inference: Hardware-accelerated inference across various devices
Batched Processing: Efficient processing of batched inputs
Streaming Inference: Support for streaming inputs and outputs
Quantization: Reduced precision inference for improved performance
Deployment Utilities: Tools for deploying models to production environments

These features ensure efficient execution of Hugging Face models in production environments.

Dataset Integration¶

The module includes tools for working with Hugging Face datasets:

Dataset Access: Direct access to datasets from the Hugging Face Hub
Data Processing: Tools for preprocessing and transforming datasets
Data Caching: Efficient caching of datasets for repeated use
Custom Datasets: Tools for creating and sharing custom datasets
Data Streaming: Support for streaming large datasets

These capabilities enable seamless integration with the Hugging Face datasets ecosystem.

API Reference¶

Model Conversion¶

import neurenix
from neurenix.huggingface import from_pretrained, to_pretrained

# Import a Hugging Face model to Neurenix
model = neurenix.huggingface.from_pretrained(
    model_name_or_path="bert-base-uncased",
    task="text-classification",
    num_labels=2,
    device="cuda" if neurenix.cuda.is_available() else "cpu"
)

# Export a Neurenix model to Hugging Face format
neurenix.huggingface.to_pretrained(
    model=model,
    save_directory="./my_exported_model",
    push_to_hub=True,
    repository_id="username/model-name",
    token="hf_token"  # Optional, can also use environment variable
)

Model Access and Management¶

from neurenix.huggingface import list_models, get_model_info

# List available models for a specific task
models = neurenix.huggingface.list_models(
    task="text-classification",
    filter_by_language="en",
    sort_by="downloads",
    direction="descending",
    limit=10
)

# Get information about a specific model
model_info = neurenix.huggingface.get_model_info("bert-base-uncased")
print(f"Model: {model_info.name}")
print(f"Downloads: {model_info.downloads}")
print(f"Task: {model_info.task}")
print(f"Architecture: {model_info.architecture}")

# Download and cache a model for offline use
neurenix.huggingface.cache_model(
    model_name_or_path="bert-base-uncased",
    force_download=False,
    resume_download=True,
    proxies=None,
    local_files_only=False
)

Fine-tuning¶

from neurenix.huggingface import Trainer, TrainingArguments

# Define training arguments
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="accuracy"
)

# Create a trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
    optimizers=(optimizer, scheduler)
)

# Fine-tune the model
trainer.train()

# Evaluate the model
results = trainer.evaluate()

# Save the fine-tuned model
trainer.save_model("./fine_tuned_model")

Parameter-Efficient Fine-tuning¶

from neurenix.huggingface import LoraConfig, PeftModel

# Configure LoRA for parameter-efficient fine-tuning
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["query", "key", "value"],
    lora_dropout=0.05,
    bias="none",
    task_type="SEQ_CLS"
)

# Create a PEFT model
model = neurenix.huggingface.from_pretrained("bert-base-uncased")
peft_model = PeftModel.from_pretrained(
    model=model,
    peft_config=lora_config
)

# Fine-tune the PEFT model
trainer = Trainer(
    model=peft_model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset
)
trainer.train()

Inference¶

from neurenix.huggingface import Pipeline

# Create an inference pipeline
pipeline = Pipeline(
    task="text-classification",
    model=model,
    tokenizer=tokenizer,
    device=0 if neurenix.cuda.is_available() else -1,
    batch_size=32
)

# Run inference
results = pipeline(
    texts=["I love this product!", "This product is terrible."],
    top_k=1
)

# Create a zero-shot classification pipeline
zero_shot_pipeline = Pipeline(
    task="zero-shot-classification",
    model="facebook/bart-large-mnli",
    device=0 if neurenix.cuda.is_available() else -1
)

# Run zero-shot classification
zero_shot_results = zero_shot_pipeline(
    texts=["I love this product!"],
    candidate_labels=["positive", "negative", "neutral"],
    hypothesis_template="This text expresses a {} sentiment."
)

Optimized Inference¶

from neurenix.huggingface import OptimizedModel

# Create an optimized model for inference
optimized_model = OptimizedModel.from_pretrained(
    model_name_or_path="bert-base-uncased",
    optimization_level=3,  # 0-3, higher means more aggressive optimization
    quantization="int8",  # None, "int8", "fp16", "fp8"
    device="cuda" if neurenix.cuda.is_available() else "cpu",
    batch_size=32,
    sequence_length=128,
    enable_caching=True
)

# Run inference with the optimized model
outputs = optimized_model(
    input_ids=input_ids,
    attention_mask=attention_mask
)

Dataset Integration¶

from neurenix.huggingface import load_dataset, DatasetDict

# Load a dataset from the Hugging Face Hub
dataset = neurenix.huggingface.load_dataset(
    path="glue",
    name="mrpc",
    split="train",
    cache_dir="./cache",
    use_auth_token=None
)

# Create a DatasetDict
dataset_dict = DatasetDict({
    "train": dataset.select(range(3000)),
    "validation": dataset.select(range(3000, 3500)),
    "test": dataset.select(range(3500, 4000))
})

# Apply preprocessing to the dataset
def preprocess_function(examples):
    return tokenizer(
        examples["sentence1"],
        examples["sentence2"],
        truncation=True,
        padding="max_length",
        max_length=128
    )

processed_dataset = dataset.map(
    preprocess_function,
    batched=True,
    num_proc=4,
    remove_columns=["sentence1", "sentence2", "idx"]
)

# Create a Neurenix DataLoader from the dataset
dataloader = neurenix.data.DataLoader(
    processed_dataset,
    batch_size=32,
    shuffle=True,
    num_workers=4
)

Framework Comparison¶

Neurenix Hugging Face Integration vs. TensorFlow Hugging Face Integration¶

Feature	Neurenix Hugging Face Integration	TensorFlow Hugging Face Integration
Performance	Multi-language implementation with Rust/C++ backends	Python implementation with TensorFlow backend
Hardware Support	Comprehensive support for various hardware	Primarily focused on TPUs and GPUs
Model Conversion	Bidirectional conversion with weight mapping	One-way conversion with limited mapping
Optimization	Advanced optimization techniques for inference	Basic optimization through TensorFlow tools
Edge Device Support	Native support for edge devices	Limited through TensorFlow Lite
Parameter-Efficient Fine-tuning	Comprehensive support for various techniques	Limited support through separate libraries
Integration Depth	Deep integration with Neurenix ecosystem	Separate library with limited integration
Deployment Options	Multiple deployment options with hardware optimization	Limited to TensorFlow Serving and TF Lite

Neurenix's Hugging Face Integration provides better performance through its multi-language implementation and offers more comprehensive hardware support, especially for edge devices. It also provides more advanced optimization techniques and deeper integration with the Neurenix ecosystem.

Neurenix Hugging Face Integration vs. PyTorch Hugging Face Integration¶

Feature	Neurenix Hugging Face Integration	PyTorch Hugging Face Integration
Performance	Multi-language implementation with Rust/C++ backends	Python implementation with PyTorch backend
Hardware Support	Comprehensive support for various hardware	Primarily focused on CUDA devices
Model Conversion	Bidirectional conversion with automatic adaptation	Native format with minimal conversion needed
Optimization	Integrated optimization pipeline	Separate optimization tools
Edge Device Support	Native support for edge devices	Limited through separate tools
Parameter-Efficient Fine-tuning	Integrated support for various techniques	Available through separate libraries
Integration Depth	Deep integration with Neurenix ecosystem	Native integration with PyTorch
Deployment Options	Multiple deployment options with hardware optimization	Limited deployment options

While PyTorch has native integration with Hugging Face, Neurenix's integration offers better performance through its multi-language implementation and provides more comprehensive hardware support, especially for edge devices. It also offers an integrated optimization pipeline and better support for deployment.

Neurenix Hugging Face Integration vs. Scikit-Learn Hugging Face Integration¶

Feature	Neurenix Hugging Face Integration	Scikit-Learn Hugging Face Integration
Model Types	Comprehensive support for various model architectures	Limited to specific model types
Hardware Acceleration	Native support for various hardware accelerators	Limited hardware acceleration
Deep Learning Support	Full support for deep learning models	Limited deep learning capabilities
Scalability	Scales to large models and datasets	Limited scalability for large models
Integration Depth	Deep integration with Neurenix ecosystem	Limited integration through adapters
Deployment Options	Multiple deployment options with hardware optimization	Limited deployment options
Parameter-Efficient Fine-tuning	Comprehensive support for various techniques	Limited support for fine-tuning
Edge Device Support	Native support for edge devices	Limited edge support

Scikit-Learn's integration with Hugging Face is limited compared to Neurenix's, especially for deep learning models and hardware acceleration. Neurenix provides more comprehensive support for various model architectures, better scalability, and more deployment options.

Best Practices¶

Efficient Model Conversion¶

Use the Right Conversion Strategy: Choose the appropriate conversion strategy based on your needs.

# For simple models, use direct conversion
model = neurenix.huggingface.from_pretrained("bert-base-uncased")

# For complex models, use the advanced conversion with custom mapping
model = neurenix.huggingface.from_pretrained(
    "t5-base",
    weight_mapping={
        "encoder.block.0.layer.0.SelfAttention.q": "encoder.layers.0.self_attn.q_proj",
        "encoder.block.0.layer.0.SelfAttention.k": "encoder.layers.0.self_attn.k_proj",
        "encoder.block.0.layer.0.SelfAttention.v": "encoder.layers.0.self_attn.v_proj",
        # ... other mappings
    }
)

Verify Conversion Accuracy: Always verify that the converted model produces the same outputs as the original.

# Original Hugging Face model
import transformers
hf_model = transformers.AutoModel.from_pretrained("bert-base-uncased")
hf_outputs = hf_model(input_ids, attention_mask)

# Converted Neurenix model
nx_model = neurenix.huggingface.from_pretrained("bert-base-uncased")
nx_outputs = nx_model(input_ids, attention_mask)

# Verify outputs are similar
import neurenix.nn.functional as F
similarity = F.cosine_similarity(
    neurenix.tensor(hf_outputs.last_hidden_state.detach().numpy()),
    nx_outputs.last_hidden_state
)
print(f"Output similarity: {similarity.mean().item()}")

Cache Converted Models: Cache converted models to avoid repeated conversion.

# Convert and cache a model
neurenix.huggingface.from_pretrained(
    "bert-base-uncased",
    cache_dir="./model_cache",
    force_convert=False  # Use cached version if available
)

Optimizing Fine-tuning¶

Choose the Right Fine-tuning Approach: Select the appropriate fine-tuning approach based on your resources and requirements.

# For full fine-tuning with sufficient resources
model = neurenix.huggingface.from_pretrained("bert-base-uncased")
trainer = Trainer(model=model, ...)

# For parameter-efficient fine-tuning with limited resources
from neurenix.huggingface import LoraConfig, PeftModel
lora_config = LoraConfig(r=8, lora_alpha=16, ...)
model = neurenix.huggingface.from_pretrained("bert-base-uncased")
peft_model = PeftModel.from_pretrained(model, lora_config)
trainer = Trainer(model=peft_model, ...)

Use Mixed Precision Training: Enable mixed precision training for faster fine-tuning.

training_args = TrainingArguments(
    fp16=True,  # Enable mixed precision training
    fp16_opt_level="O1",  # Optimization level
    # ... other arguments
)

Implement Gradient Accumulation: Use gradient accumulation for larger effective batch sizes.

training_args = TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # Effective batch size of 16
    # ... other arguments
)

Monitor Training Metrics: Track relevant metrics during fine-tuning.

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = neurenix.argmax(logits, axis=-1)
    return {
        "accuracy": (predictions == labels).mean().item(),
        "f1": f1_score(labels, predictions, average="weighted")
    }

trainer = Trainer(
    model=model,
    args=training_args,
    compute_metrics=compute_metrics,
    # ... other arguments
)

Optimizing Inference¶

Batch Inputs for Efficiency: Process inputs in batches for better throughput.

# Inefficient: Process inputs one by one
results = []
for text in texts:
    result = pipeline(text)
    results.append(result)

# Efficient: Process inputs in batches
results = pipeline(texts, batch_size=32)

Use Quantization for Faster Inference: Apply quantization to reduce model size and improve inference speed.

# Create a quantized model for inference
quantized_model = neurenix.huggingface.OptimizedModel.from_pretrained(
    "bert-base-uncased",
    quantization="int8",
    device="cuda" if neurenix.cuda.is_available() else "cpu"
)

Enable Caching for Repeated Inference: Cache intermediate results for repeated inference with similar inputs.

# Enable caching for the model
model = neurenix.huggingface.from_pretrained(
    "bert-base-uncased",
    enable_caching=True,
    cache_size=100  # Cache size in MB
)

Use Hardware-Specific Optimizations: Enable optimizations for specific hardware.

# Enable tensor core optimizations for NVIDIA GPUs
if neurenix.cuda.is_available() and neurenix.cuda.is_tensor_cores_available():
    model = neurenix.huggingface.from_pretrained(
        "bert-base-uncased",
        enable_tensor_cores=True
    )

Tutorials¶

Converting and Fine-tuning a Hugging Face Model¶

import neurenix
from neurenix.huggingface import from_pretrained, Trainer, TrainingArguments
from neurenix.data import DataLoader
import numpy as np

# Load a pre-trained model from Hugging Face
model = neurenix.huggingface.from_pretrained(
    model_name_or_path="bert-base-uncased",
    task="text-classification",
    num_labels=2,
    device="cuda" if neurenix.cuda.is_available() else "cpu"
)

# Load a tokenizer
tokenizer = neurenix.huggingface.AutoTokenizer.from_pretrained("bert-base-uncased")

# Load and preprocess a dataset
dataset = neurenix.huggingface.load_dataset("glue", "mrpc")

def preprocess_function(examples):
    return tokenizer(
        examples["sentence1"],
        examples["sentence2"],
        truncation=True,
        padding="max_length",
        max_length=128
    )

processed_dataset = dataset.map(
    preprocess_function,
    batched=True,
    num_proc=4,
    remove_columns=["sentence1", "sentence2", "idx"]
)

# Split the dataset
train_dataset = processed_dataset["train"]
eval_dataset = processed_dataset["validation"]

# Define training arguments
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="accuracy",
    fp16=True  # Enable mixed precision training
)

# Define metrics computation function
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    accuracy = (predictions == labels).mean()
    return {"accuracy": accuracy}

# Create a trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics
)

# Fine-tune the model
trainer.train()

# Evaluate the model
results = trainer.evaluate()
print(f"Evaluation results: {results}")

# Save the fine-tuned model
trainer.save_model("./fine_tuned_model")

# Export the model to Hugging Face format
neurenix.huggingface.to_pretrained(
    model=model,
    save_directory="./exported_model",
    push_to_hub=True,
    repository_id="username/bert-finetuned-mrpc",
    token="hf_token"  # Optional, can also use environment variable
)

Parameter-Efficient Fine-tuning with LoRA¶

import neurenix
from neurenix.huggingface import from_pretrained, LoraConfig, PeftModel, Trainer, TrainingArguments

# Load a pre-trained model from Hugging Face
base_model = neurenix.huggingface.from_pretrained(
    model_name_or_path="gpt2",
    device="cuda" if neurenix.cuda.is_available() else "cpu"
)

# Configure LoRA for parameter-efficient fine-tuning
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["c_attn", "c_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

# Create a PEFT model
peft_model = PeftModel.from_pretrained(
    model=base_model,
    peft_config=lora_config
)

# Load a tokenizer
tokenizer = neurenix.huggingface.AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

# Load and preprocess a dataset
dataset = neurenix.huggingface.load_dataset("imdb")

def preprocess_function(examples):
    return tokenizer(
        examples["text"],
        truncation=True,
        padding="max_length",
        max_length=512
    )

processed_dataset = dataset.map(
    preprocess_function,
    batched=True,
    num_proc=4,
    remove_columns=["text"]
)

# Split the dataset
train_dataset = processed_dataset["train"].select(range(10000))  # Subsample for faster training
eval_dataset = processed_dataset["test"].select(range(1000))

# Define training arguments
training_args = TrainingArguments(
    output_dir="./results_lora",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    per_device_eval_batch_size=16,
    warmup_steps=100,
    weight_decay=0.01,
    logging_dir="./logs_lora",
    logging_steps=10,
    evaluation_strategy="steps",
    eval_steps=500,
    save_strategy="steps",
    save_steps=500,
    load_best_model_at_end=True,
    fp16=True
)

# Create a trainer
trainer = Trainer(
    model=peft_model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset
)

# Fine-tune the model
trainer.train()

# Evaluate the model
results = trainer.evaluate()
print(f"Evaluation results: {results}")

# Save the fine-tuned LoRA model
peft_model.save_pretrained("./lora_finetuned_model")

# Merge LoRA weights with the base model
merged_model = peft_model.merge_and_unload()

# Generate text with the fine-tuned model
input_text = "This movie was"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(peft_model.device)

with neurenix.no_grad():
    output = merged_model.generate(
        input_ids=input_ids,
        max_length=100,
        num_return_sequences=1,
        temperature=0.7,
        top_p=0.9
    )

generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(f"Generated text: {generated_text}")

Creating an Optimized Inference Pipeline¶

import neurenix
from neurenix.huggingface import OptimizedModel, Pipeline
import time

# Load and optimize a model for inference
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
optimized_model = neurenix.huggingface.OptimizedModel.from_pretrained(
    model_name_or_path=model_name,
    optimization_level=3,
    quantization="int8",
    device="cuda" if neurenix.cuda.is_available() else "cpu",
    batch_size=32,
    sequence_length=128,
    enable_caching=True
)

# Load a tokenizer
tokenizer = neurenix.huggingface.AutoTokenizer.from_pretrained(model_name)

# Create an optimized pipeline
pipeline = Pipeline(
    task="text-classification",
    model=optimized_model,
    tokenizer=tokenizer,
    device=0 if neurenix.cuda.is_available() else -1,
    batch_size=32
)

# Prepare test data
test_texts = [
    "I love this product! It's amazing and works perfectly.",
    "This is the worst purchase I've ever made. Completely disappointed.",
    "The product is okay, but not worth the price.",
    "Exceeded my expectations in every way possible.",
    "It's decent, but there are better alternatives available."
] * 100  # Repeat to create a larger test set

# Warm up the pipeline
_ = pipeline(test_texts[:5])

# Benchmark inference
start_time = time.time()
results = pipeline(test_texts)
end_time = time.time()

# Print results
print(f"Processed {len(test_texts)} texts in {end_time - start_time:.2f} seconds")
print(f"Average time per text: {(end_time - start_time) / len(test_texts) * 1000:.2f} ms")
print(f"Throughput: {len(test_texts) / (end_time - start_time):.2f} texts/second")

# Print some example results
for i, (text, result) in enumerate(zip(test_texts[:5], results[:5])):
    label = result["label"]
    score = result["score"]
    print(f"Text: {text}")
    print(f"Prediction: {label} (confidence: {score:.4f})")
    print()

# Export the optimized model
optimized_model.save("./optimized_model")

# Create a model for edge deployment
edge_model = neurenix.huggingface.OptimizedModel.from_pretrained(
    model_name_or_path=model_name,
    optimization_level=3,
    quantization="int8",
    target_device="edge",
    export_format="onnx"
)

# Save the edge model
edge_model.save("./edge_model")

Conclusion¶

The Hugging Face Integration module in Neurenix provides a comprehensive set of tools for leveraging the vast ecosystem of pre-trained models, datasets, and tools available on the Hugging Face Hub. By combining Neurenix's high-performance multi-language architecture with Hugging Face's resources, users can build powerful machine learning applications with state-of-the-art models while benefiting from Neurenix's hardware acceleration and optimization capabilities.

Compared to other frameworks like TensorFlow, PyTorch, and Scikit-Learn, Neurenix's Hugging Face Integration offers advantages in terms of performance, hardware support, and optimization capabilities. Its deep integration with the Neurenix ecosystem ensures seamless interoperability while maintaining compatibility with the broader Hugging Face ecosystem.

By following the best practices and tutorials outlined in this documentation, users can efficiently convert, fine-tune, and deploy Hugging Face models using Neurenix, taking advantage of the strengths of both ecosystems to build high-performance machine learning applications.

Page

Table Of Contents

Hugging Face Integration Module¶

Overview¶

Key Concepts¶

Model Conversion and Compatibility¶

Pre-trained Model Access¶

Fine-tuning and Adaptation¶

Inference and Deployment¶

Dataset Integration¶

API Reference¶

Model Conversion¶

Model Access and Management¶

Fine-tuning¶

Parameter-Efficient Fine-tuning¶

Inference¶

Optimized Inference¶

Dataset Integration¶

Framework Comparison¶

Neurenix Hugging Face Integration vs. TensorFlow Hugging Face Integration¶

Neurenix Hugging Face Integration vs. PyTorch Hugging Face Integration¶

Neurenix Hugging Face Integration vs. Scikit-Learn Hugging Face Integration¶

Best Practices¶

Efficient Model Conversion¶

Optimizing Fine-tuning¶

Optimizing Inference¶

Tutorials¶

Converting and Fine-tuning a Hugging Face Model¶

Parameter-Efficient Fine-tuning with LoRA¶

Creating an Optimized Inference Pipeline¶

Conclusion¶