Overview
The NI Compute Subnet is a decentralized GPU compute marketplace operating on the Bittensor network as subnet 27. It enables miners to contribute GPU resources and earn rewards based on their computational performance, while validators measure miner capabilities and allocate resources to clients through a trustless, permissionless system.
This document provides a high-level overview of the system architecture, core components, and their interactions. For detailed installation instructions, see Installation and Setup. For specific component documentation, refer to the Validator System, Miner System, and Resource Allocation API sections.
System Architecture
The NI Compute Subnet consists of three primary components (the Validator System, the Miner System, and the Resource Allocation API) that interact through the Bittensor network and custom communication protocols:
graph TB
subgraph "Bittensor Network"
BT["Subtensor Blockchain"]
META["Metagraph State"]
end
subgraph "Validator System"
VAL["Validator Process<br/>neurons/validator.py"]
POG["Proof-of-GPU Validation"]
SCORE["Performance Scoring"]
WEIGHTS["Weight Setting"]
end
subgraph "Miner System"
MIN["Miner Process<br/>neurons/miner.py"]
DOCKER["Docker Container Management"]
SSH["SSH Resource Access"]
HASHCAT["Hashcat PoW Challenges"]
end
subgraph "Resource Allocation API"
API["RegisterAPI<br/>FastAPI Service"]
ALLOC["Resource Discovery"]
HEALTH["Health Monitoring"]
end
subgraph "Data Layer"
DB["ComputeDb<br/>SQLite Database"]
WANDB["WandB Metrics"]
CONFIG["GPU Performance Config"]
end
VAL -->|"Validates Performance"| MIN
VAL -->|"Sets Network Weights"| BT
VAL -->|"Queries Metagraph"| META
MIN -->|"Registers Hotkey"| BT
MIN -->|"Manages Containers"| DOCKER
MIN -->|"Provides SSH Access"| SSH
API -->|"Allocates Resources"| MIN
API -->|"Health Checks"| HEALTH
VAL -->|"Stores Results"| DB
VAL -->|"Logs Metrics"| WANDB
MIN -->|"Updates Status"| WANDB
CONFIG -->|"Configures Scoring"| VAL
Sources: README.md:1-535, compute/__init__.py:1-93
Core Components
Validator System
Validators measure miner performance and maintain network integrity. They run continuous validation cycles that include hardware specification queries, proof-of-GPU benchmarks, and cryptographic challenge verification.
The validator system scores miners on GPU performance metrics: each GPU model is assigned a base score, and scaling factors are applied for multi-GPU configurations. Validators with sufficient stake (validator_permit_stake = 1.0e4 TAO) can set the network weights that determine miner rewards.
Key Classes and Constants:
- neurons/validator.py - Main validator process
- validator_permit_stake - Minimum stake requirement for validators
- specs_timeout = 60 - Timeout for hardware specification requests
- pog_retry_limit = 30 - Maximum retries for proof-of-GPU validation
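The scoring idea described above (a base score per GPU model, scaled by the number of GPUs) and the stake gate can be sketched as follows. This is a minimal illustration, not the validator's actual code: the GPU names, base scores, and scaling factors are made-up placeholders, the helper names are hypothetical, and treating the threshold as inclusive is an assumption. Only validator_permit_stake = 1.0e4 comes from the text.

```python
# Illustrative sketch only: the GPU names, base scores, and scaling factors
# below are placeholders, not the subnet's real configuration.
validator_permit_stake = 1.0e4  # value quoted in the text, in TAO

HYPOTHETICAL_BASE_SCORES = {"GPU-A": 1.0, "GPU-B": 0.5}
HYPOTHETICAL_COUNT_SCALING = {1: 1.0, 2: 1.9, 4: 3.6, 8: 7.0}


def can_set_weights(stake_tao: float) -> bool:
    """Return True when a validator's stake meets the permit threshold.

    Assumes the threshold is inclusive.
    """
    return stake_tao >= validator_permit_stake


def gpu_score(model: str, count: int) -> float:
    """Base score for the model times a scaling factor for the GPU count.

    Unknown models or unlisted counts score 0.0 in this sketch.
    """
    base = HYPOTHETICAL_BASE_SCORES.get(model, 0.0)
    scale = HYPOTHETICAL_COUNT_SCALING.get(count, 0.0)
    return base * scale
```

A lookup table keeps the sketch close to the "GPU Performance Config" node in the architecture diagram, where scoring parameters live in configuration rather than in code.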
Miner System
Miners contribute GPU computational resources to the network and respond to validator requests. They manage Docker containers for resource isolation, handle SSH-based resource allocation, and participate in proof-of-work challenges using Hashcat.
The miner system uses a priority-based request handling system where resource allocation requests (miner_priority_allocate = 3) take precedence over challenge responses (miner_priority_challenge = 2) and specification queries (miner_priority_specs = 1).
Key Classes and Constants:
- neurons/miner.py - Main miner process
- miner_hashcat_location = "hashcat" - Hashcat binary location
- miner_hashcat_workload_profile = "3" - High performance profile
- pow_timeout = 30 - Proof-of-work challenge timeout
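The priority ordering described above can be illustrated with a small queue that always serves the highest-priority request first. This is a sketch under assumptions: the RequestQueue class is hypothetical, and how the real miner wires these constants into its request handling is not shown on this page. Only the three priority values come from the text.

```python
import heapq
import itertools

# Priority values quoted from the text above.
miner_priority_specs = 1
miner_priority_challenge = 2
miner_priority_allocate = 3


class RequestQueue:
    """Hypothetical queue that serves higher-priority requests first."""

    def __init__(self) -> None:
        self._heap = []
        # Monotonic counter breaks ties so equal priorities stay FIFO.
        self._counter = itertools.count()

    def push(self, priority: int, request: str) -> None:
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-priority, next(self._counter), request))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]


queue = RequestQueue()
queue.push(miner_priority_specs, "specs")
queue.push(miner_priority_allocate, "allocate")
queue.push(miner_priority_challenge, "challenge")
order = [queue.pop() for _ in range(3)]
```

Here order lists the allocation request first, then the challenge, then the specification query, regardless of arrival order.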
Resource Allocation API
The Resource Allocation API is a FastAPI-based web service that provides external access to the compute network. It handles resource discovery, allocation requests, and health monitoring of active allocations.
The API implements RSA encryption for secure communication and maintains state through both local database storage and distributed WandB synchronization.
Sources: compute/__init__.py:21-77, README.md:87-108
Validation and Challenge System
The subnet implements multiple validation mechanisms to ensure miner integrity and performance:
sequenceDiagram
participant V as "Validator"
participant M as "Miner"
participant D as "Docker Container"
participant BT as "Bittensor Network"
Note over V,M: "Hardware Specification Phase"
V->>M: "Send Specs Request"
M->>V: "Return GPU/CPU Specs"
Note over V,M: "Proof-of-GPU Phase"
V->>M: "Request PoG Allocation"
M->>D: "Create Container"
D->>V: "Provide SSH Access"
V->>D: "Run GPU Benchmarks"
D->>V: "Return Benchmark Results"
Note over V,M: "Challenge Phase"
V->>M: "Send PoW Challenge"
M->>M: "Execute Hashcat"
M->>V: "Return Merkle Proof"
Note over V,BT: "Weight Setting Phase"
V->>V: "Calculate Scores"
V->>BT: "Set Network Weights"
Proof-of-Work Configuration:
- pow_min_difficulty = 7 - Minimum challenge difficulty
- pow_max_difficulty = 12 - Maximum challenge difficulty
- pow_default_mode = "610" - BLAKE2b-512 hash mode
- pow_default_chars - Challenge character set including alphanumeric and special characters
Sources: compute/__init__.py:37-49, README.md:437-450
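To make the proof-of-work configuration concrete, the sketch below builds and checks a challenge using Python's hashlib. It rests on several assumptions that this page does not confirm: that difficulty corresponds to the secret's length, that the digest covers the secret followed by a salt, and that ASSUMED_CHARS is a reasonable stand-in for pow_default_chars, whose exact contents are not listed here. The function names are hypothetical, and a real miner would search for the secret with Hashcat rather than receive it.

```python
import hashlib
import secrets
import string

pow_min_difficulty = 7   # from the configuration list above
pow_max_difficulty = 12  # from the configuration list above
# Assumed stand-in for pow_default_chars; the real set is not listed here.
ASSUMED_CHARS = string.ascii_letters + string.digits + "!@#$%^&*"


def clamp_difficulty(requested: int) -> int:
    """Keep the difficulty inside the configured [min, max] range."""
    return max(pow_min_difficulty, min(pow_max_difficulty, requested))


def make_challenge(requested_difficulty: int):
    """Return (secret, salt, hex_digest) for a hypothetical challenge.

    Assumes difficulty means secret length and digest = BLAKE2b-512(secret + salt).
    """
    length = clamp_difficulty(requested_difficulty)
    secret = "".join(secrets.choice(ASSUMED_CHARS) for _ in range(length))
    salt = secrets.token_hex(8)
    # hashlib.blake2b defaults to a 64-byte (512-bit) digest.
    digest = hashlib.blake2b((secret + salt).encode()).hexdigest()
    return secret, salt, digest


def verify_answer(answer: str, salt: str, digest: str) -> bool:
    """Recompute the digest for a candidate answer and compare in constant time."""
    candidate = hashlib.blake2b((answer + salt).encode()).hexdigest()
    return secrets.compare_digest(candidate, digest)
```

Clamping means a request for difficulty 3 yields a 7-character secret and a request for 99 yields 12 characters, so challenges stay within the configured bounds.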
Data Management and Monitoring
The system maintains state through multiple data persistence layers:
| Component | Storage Type | Purpose |
|---|---|---|
| ComputeDb | SQLite | Local miner stats, scores, allocations |
| WandB | Distributed | Cross-validator metrics, distributed state |
| Configuration | YAML/ENV | GPU performance benchmarks, API keys |
The monitoring system uses WandB for distributed state management and metrics collection, with separate runs for validators and miners to track performance and resource utilization.
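As a rough picture of the local persistence layer, the following sketch stores per-miner scores in SQLite using Python's built-in sqlite3 module. The table name, columns, and upsert_score helper are hypothetical; the real ComputeDb schema is not described on this page.

```python
import sqlite3

# Hypothetical schema; the real ComputeDb tables are not described here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE miner_scores ("
    "hotkey TEXT PRIMARY KEY, gpu_model TEXT, score REAL)"
)


def upsert_score(hotkey: str, gpu_model: str, score: float) -> None:
    """Insert a miner's score, replacing any earlier row for the same hotkey."""
    conn.execute(
        "INSERT OR REPLACE INTO miner_scores (hotkey, gpu_model, score) "
        "VALUES (?, ?, ?)",
        (hotkey, gpu_model, score),
    )
    conn.commit()


def get_score(hotkey: str):
    """Return the stored score for a hotkey, or None if it is unknown."""
    row = conn.execute(
        "SELECT score FROM miner_scores WHERE hotkey = ?", (hotkey,)
    ).fetchone()
    return row[0] if row else None
```

Keying the table on the hotkey keeps one current row per miner, which suits a store that is rewritten after every validation cycle.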
Version Management:
- __version__ = "1.9.0" - Current subnet version
- __minimal_miner_version__ = "1.8.5" - Minimum required miner version
- __minimal_validator_version__ = "1.8.8" - Minimum required validator version
Sources: compute/__init__.py:20-26, README.md:297-303
Network Communication
The subnet uses custom Bittensor synapse protocols for component communication:
graph LR
subgraph "Protocol Layer"
SPECS["Specs Protocol<br/>Hardware Queries"]
ALLOC["Allocate Protocol<br/>Resource Requests"]
CHALLENGE["Challenge Protocol<br/>PoW Verification"]
end
subgraph "Transport Layer"
AXON["Custom Axon<br/>Miner Endpoints"]
SUBTENSOR["Custom Subtensor<br/>Network Interface"]
end
SPECS --> AXON
ALLOC --> AXON
CHALLENGE --> AXON
AXON --> SUBTENSOR
The communication system includes blacklist management for suspected exploiters and trusted validator lists to maintain network security and integrity.
Sources: compute/__init__.py:59-92, compute/utils/parser.py:8-170
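The blacklist and trusted-validator checks mentioned above might look roughly like the following. Everything here is an assumption made for illustration: the function name, the placeholder hotkeys, the (bool, reason) return shape, and the rule that a blacklist entry takes precedence over a trusted-list entry.

```python
from typing import Set, Tuple


def should_blacklist(
    hotkey: str,
    blacklisted_hotkeys: Set[str],
    trusted_validators: Set[str],
) -> Tuple[bool, str]:
    """Decide whether to reject a request from the given hotkey.

    Returns (reject, reason). Blacklist entries win over trusted entries
    in this sketch; the subnet's actual precedence is not stated here.
    """
    if hotkey in blacklisted_hotkeys:
        return True, "blacklisted hotkey"
    if hotkey in trusted_validators:
        return False, "trusted validator"
    return False, "not blacklisted"


# Example with placeholder hotkeys.
reject, reason = should_blacklist(
    "hk-exploiter", {"hk-exploiter"}, {"hk-trusted"}
)
```

Returning a reason alongside the decision makes rejected requests easier to trace in logs.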
For detailed information about specific components, see the Validator System, Miner System, Resource Allocation API, and Communication Protocols sections.