Proof of GPU
Overview
Proof of GPU (PoG) is a critical verification mechanism in the NI Compute system (Subnet 27) that enables validators to verify the GPU hardware capabilities of miners in the network. This verification is essential to ensure miners possess the computational resources they claim, maintaining the integrity of the decentralized GPU marketplace.
This document details how validators implement Proof of GPU verification, the benchmarking process, and how these results affect miner scoring within the subnet.
For information about how scores are calculated based on PoG results, see Scoring System.
Core Concepts
Proof of GPU employs a series of technical tests to verify:
- Actual existence of GPU hardware
- Number of GPUs available on the miner
- Type/model of GPUs (e.g., RTX 3090, A100, etc.)
- Performance capabilities of the GPUs
The system uses a combination of direct hardware information queries, benchmarking performance tests, and cryptographic verification methods to ensure miners cannot falsify their hardware capabilities.
Sources: neurons/validator.py:638-762 , neurons/Validator/pog.py
Verification Process
sequenceDiagram
    participant V as "Validator"
    participant M as "Miner"
    participant SSH as "SSH Connection"
    participant GPU as "Miner's GPU"
    V->>M: Request allocation (for testing)
    M-->>V: Return SSH connection details
    V->>SSH: Establish connection
    Note over V,SSH: Integrity Verification Phase
    V->>SSH: Hash script
    SSH-->>V: Return script hash
    V->>V: Verify script integrity
    Note over V,SSH: Hardware Detection Phase
    V->>SSH: Request GPU information
    SSH->>GPU: Query with nvidia-smi
    GPU-->>SSH: Return GPU count and types
    SSH-->>V: Return GPU information
    Note over V,SSH: Benchmarking Phase
    V->>SSH: Execute benchmark operations
    SSH->>GPU: Run matrix operations (FP16/FP32)
    GPU-->>SSH: Return computation timings
    SSH-->>V: Return benchmark results
    V->>V: Calculate TFLOPS
    V->>V: Identify GPU model based on performance
    Note over V,SSH: Merkle Proof Verification
    V->>SSH: Send random seeds
    SSH->>GPU: Run matrix calculations with seeds
    GPU-->>SSH: Calculate Merkle tree of results
    SSH-->>V: Return root hashes
    V->>SSH: Challenge with random indices
    SSH->>GPU: Generate proofs for indices
    GPU-->>SSH: Return proofs
    SSH-->>V: Return proofs
    V->>V: Verify Merkle proofs
    V->>M: Deallocate resources
    V->>V: Record GPU specifications and score
The above diagram illustrates the complete PoG verification process between a validator and miner.
Sources: neurons/validator.py:774-923 , neurons/Validator/pog.py
Technical Implementation
1. Allocation and Connection
The validator first allocates the miner’s resources temporarily for testing purposes:
- The validator generates an RSA key pair for secure communication
- It requests allocation from the miner with minimal resource requirements
- If allocation succeeds, the validator receives SSH connection details
- An SSH connection is established to the miner’s container
This process ensures validators can perform tests in a controlled environment.
Sources: neurons/validator.py:924-987
2. Script Integrity Verification
To prevent miners from tampering with the benchmarking script:
- The validator computes a hash of the local benchmarking script
- The script is sent to the miner and a hash is computed remotely
- The validator compares the local and remote hashes
- If they don’t match, the verification fails immediately
flowchart TD
    A["Compute local hash (compute_script_hash)"] --> B["Send script to miner (send_script_and_request_hash)"]
    B --> C["Receive remote hash"]
    C --> D{"Hashes match?"}
    D -->|"Yes"| E["Continue verification"]
    D -->|"No"| F["Fail verification"]
Sources: neurons/validator.py:820-827 , neurons/Validator/pog.py
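The hash comparison above can be sketched in a few lines. This is a minimal illustration, assuming SHA-256 as the digest; the source does not state which algorithm `compute_script_hash` uses, and the helper names `sha256_of_file` and `hashes_match` are hypothetical. In the real flow the second digest would come back from the miner over SSH; here it is simulated locally.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: str) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hashes_match(local_hash: str, remote_hash: str) -> bool:
    """Compare two hex digests, ignoring surrounding whitespace and letter case."""
    return local_hash.strip().lower() == remote_hash.strip().lower()


# Demo: the "remote" digests are computed locally to stand in for the miner's reply.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "benchmark.py"
    script.write_bytes(b"print('benchmark')\n")
    tampered = Path(tmp) / "tampered.py"
    tampered.write_bytes(b"print('fake')\n")

    local_hash = sha256_of_file(str(script))
    # A shell command on the remote side typically appends a newline to its output.
    same = hashes_match(local_hash, sha256_of_file(str(script)) + "\n")
    different = hashes_match(local_hash, sha256_of_file(str(tampered)))
```

Normalizing whitespace before comparing matters in practice because command output read over SSH usually carries a trailing newline.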
3. GPU Detection and Benchmarking
The validator performs direct hardware detection and benchmarking tests:
- Query GPU information using NVIDIA tools on the miner
- Execute matrix multiplication benchmarks in both FP16 and FP32 precision
- Measure execution time and calculate TFLOPS (Tera Floating-Point Operations Per Second)
- Identify GPU model based on performance metrics and reported hardware information
The benchmarking uses specially designed tests that:
- Must run on GPUs (cannot be efficiently faked with CPUs)
- Produce consistent results for specific GPU models
- Scale with the available GPU memory
Sources: neurons/validator.py:828-858 , neurons/Validator/pog.py
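To make the TFLOPS step concrete, the sketch below converts a matrix-multiplication timing into a throughput figure. It assumes the common convention of counting 2n³ floating-point operations for an n×n by n×n product; the matrix sizes and the exact operation count used by the actual benchmark are not given in this document, and `matmul_tflops` is a hypothetical helper name.

```python
def matmul_tflops(n: int, elapsed_seconds: float) -> float:
    """Estimate TFLOPS for one n x n by n x n matrix multiplication.

    Assumes 2 * n**3 floating-point operations (one multiply and one add
    per inner-product term), which is a convention, not a measured count.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    flops = 2.0 * n ** 3
    return flops / elapsed_seconds / 1e12


# Example: an 8192 x 8192 multiplication that took 30 ms.
example_tflops = matmul_tflops(8192, 0.03)  # roughly 36.65 TFLOPS
```

The same function applies to FP16 and FP32 runs; only the measured time differs between precisions.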
4. Merkle Proof Verification
To cryptographically verify that the benchmarking was actually performed:
- The validator sends random seeds to the miner
- The miner computes large matrices using these seeds
- The miner builds a Merkle tree from the computation results
- The miner returns the Merkle root hashes
- The validator requests proofs for random elements in the matrices
- The miner provides Merkle proofs for these elements
- The validator verifies the proofs against the root hashes
flowchart TD
    A["Send random seeds to miner (send_seeds)"] --> B["Miner computes matrices (execute_script_on_miner)"]
    B --> C["Miner builds Merkle trees"]
    C --> D["Receive root hashes (parse_merkle_output)"]
    D --> E["Send challenge indices (send_challenge_indices)"]
    E --> F["Receive Merkle proofs (receive_responses)"]
    F --> G["Verify proofs (verify_responses)"]
    G --> H{"Verification successful?"}
    H -->|"Yes"| I["Record GPU specifications"]
    H -->|"No"| J["Fail verification"]
This cryptographic verification ensures the miner cannot precompute results or falsify benchmarks.
Sources: neurons/validator.py:859-908 , neurons/Validator/pog.py
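The commit-then-challenge pattern described above can be illustrated with a generic Merkle tree. This is a sketch under stated assumptions, not the subnet's implementation: it hashes with SHA-256, duplicates the last node on odd-sized levels, and treats each leaf as an opaque byte string standing in for a serialized matrix row. All function names here are hypothetical.

```python
import hashlib
from typing import List, Tuple


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_merkle_tree(leaves: List[bytes]) -> List[List[bytes]]:
    """Return every tree level, from the hashed leaves up to the root."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [_h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Pad odd-sized levels by duplicating the last node, and store
            # the padded level so proofs can always find a sibling.
            level = level + [level[-1]]
            levels[-1] = level
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels: List[List[bytes]], index: int) -> List[Tuple[bytes, bool]]:
    """Collect sibling hashes from leaf to root; the flag is True for a left sibling."""
    if not 0 <= index < len(levels[0]):
        raise IndexError("leaf index out of range")
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_proof(leaf: bytes, proof: List[Tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the path from a leaf to the root and compare with the committed root."""
    node = _h(leaf)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root


# Demo: the prover commits to five rows, the verifier challenges row 3.
rows = [f"row-{i}".encode() for i in range(5)]
tree = build_merkle_tree(rows)
root = tree[-1][0]
proof = merkle_proof(tree, 3)
honest = verify_proof(rows[3], proof, root)
forged = verify_proof(b"forged-row", proof, root)
```

Because the root is sent before the challenge indices are known, the prover cannot tailor individual rows to the challenge after the fact.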
System Architecture
The PoG system is implemented across several components in the codebase:
graph TD
    subgraph "Validator System"
        V["validator.py"] --> POG["Validator/pog.py"]
        V --> CALC["Validator/calculate_pow_score.py"]
        POG --> SH["send_script_and_request_hash()"]
        POG --> BM["parse_benchmark_output()"]
        POG --> MK["verify_merkle_proof_row()"]
        CALC --> CS["calc_score_pog()"]
    end
    subgraph "Miner Node"
        MS["Miner Script"] --> GPU["GPU Hardware"]
        MS --> CUDA["CUDA Operations"]
        MS --> MT["Merkle Tree Generation"]
    end
    subgraph "Database"
        DB["ComputeDb"] --> PST["POG Stats Table"]
        V --> DB
    end
    V <--> SSH["SSH Connection"]
    SSH <--> MS
Sources: neurons/validator.py:638-762 , neurons/Validator/pog.py , neurons/Validator/database/pog.py
GPU Scoring and Identification
The system identifies GPU models based on their performance characteristics:
- Benchmark results produce FP16 and FP32 TFLOPS measurements
- VRAM capacity is detected and reported
- These metrics are compared against known values for different GPU models
- A tolerance system allows for some variation in benchmark results
- The identified GPU type and count are stored in the database
The identification process uses a configuration file that defines performance expectations for different GPU models. The identify_gpu function matches the measured performance against these known profiles.
Sources: neurons/Validator/pog.py , neurons/Validator/calculate_pow_score.py
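A tolerance-based match of this kind might look like the sketch below. It is not the actual `identify_gpu` function: the profile values are placeholders loosely based on the illustrative table in this document, the single relative tolerance of 15% is an assumption, and `match_gpu_profile` is a hypothetical name.

```python
from typing import Dict, Optional, Tuple

# Placeholder profiles: (fp16_tflops, fp32_tflops, vram_gb).
# These are illustrative values, not the subnet's real configuration.
GPU_PROFILES: Dict[str, Tuple[float, float, float]] = {
    "RTX 3090": (37.5, 20.0, 24.0),
    "T4": (10.0, 5.0, 16.0),
    "K80": (5.0, 2.5, 12.0),
}


def within(measured: float, expected: float, tolerance: float) -> bool:
    """True if measured lies within a relative tolerance of expected."""
    return abs(measured - expected) <= tolerance * expected


def match_gpu_profile(
    fp16: float, fp32: float, vram_gb: float, tolerance: float = 0.15
) -> Optional[str]:
    """Return the closest profile whose three metrics all fall within tolerance."""
    best_name, best_error = None, float("inf")
    for name, (exp16, exp32, exp_vram) in GPU_PROFILES.items():
        if not (
            within(fp16, exp16, tolerance)
            and within(fp32, exp32, tolerance)
            and within(vram_gb, exp_vram, tolerance)
        ):
            continue
        # Rank candidates by summed relative error on the two TFLOPS metrics.
        error = abs(fp16 - exp16) / exp16 + abs(fp32 - exp32) / exp32
        if error < best_error:
            best_name, best_error = name, error
    return best_name
```

Returning `None` when nothing matches lets the caller treat an unrecognized performance signature as a failed identification rather than guessing the nearest model.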
Database Integration
Proof of GPU results are stored in a database for:
- Persistent tracking of miner capabilities
- Input into the scoring system
- Historical analysis of network hardware
Key database functions:
- get_pog_specs: Retrieves stored GPU specifications for a specific miner
- update_pog_stats: Updates the database with new proof results
- retrieve_stats: Gets statistics for all miners
Sources: neurons/validator.py:343-349 , neurons/Validator/database/pog.py
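The store-and-retrieve pattern can be sketched with Python's built-in `sqlite3` module. The table layout, column names, and the helpers `save_pog_result` and `load_pog_specs` are assumptions made for illustration; they are not the schema or signatures of `update_pog_stats` and `get_pog_specs` in the codebase.

```python
import sqlite3
from typing import Optional, Tuple


def init_db(conn: sqlite3.Connection) -> None:
    """Create a hypothetical PoG stats table keyed by miner identifier."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS pog_stats (
            miner_id   TEXT PRIMARY KEY,
            gpu_name   TEXT NOT NULL,
            num_gpus   INTEGER NOT NULL,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.commit()


def save_pog_result(conn: sqlite3.Connection, miner_id: str, gpu_name: str, num_gpus: int) -> None:
    """Insert the latest result, replacing any earlier row for the same miner."""
    conn.execute(
        "INSERT OR REPLACE INTO pog_stats (miner_id, gpu_name, num_gpus) VALUES (?, ?, ?)",
        (miner_id, gpu_name, num_gpus),
    )
    conn.commit()


def load_pog_specs(conn: sqlite3.Connection, miner_id: str) -> Optional[Tuple[str, int]]:
    """Return (gpu_name, num_gpus) for a miner, or None if no result is stored."""
    row = conn.execute(
        "SELECT gpu_name, num_gpus FROM pog_stats WHERE miner_id = ?", (miner_id,)
    ).fetchone()
    return row


# Demo with an in-memory database.
conn = sqlite3.connect(":memory:")
init_db(conn)
save_pog_result(conn, "miner-a", "RTX 3090", 2)
save_pog_result(conn, "miner-a", "A100", 8)  # a newer result overwrites the old one
specs = load_pog_specs(conn, "miner-a")
missing = load_pog_specs(conn, "miner-b")
```

Keeping one row per miner means the scoring step always reads the most recent verified hardware; a design that needs historical analysis would append rows instead of replacing them.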
Scheduling and Resource Management
The PoG system includes intelligent scheduling to avoid overwhelming the network:
- Tests are performed periodically (approximately every 360 blocks)
- Random delays are added to prevent network congestion
- Concurrent testing is limited to a configurable number of miners
- Allocated miners are excluded from testing to avoid disrupting active services
- Failed tests can be retried a configurable number of times
flowchart TD
    A["Validator Start"] --> B["Block check (current_block % block_next_pog == 0)"]
    B -->|"Yes"| C["Schedule next POG (block_next_pog = current_block + 360)"]
    C --> D["Create async task (proof_of_gpu)"]
    D --> E["Random delay (0-1200 seconds)"]
    E --> F["Initialize worker pool"]
    F --> G["Queue miners for testing"]
    G --> H{"Queue empty?"}
    H -->|"No"| I["Test miner GPU (test_miner_gpu)"]
    I --> J{"Test successful?"}
    J -->|"Yes"| K["Record results"]
    J -->|"No"| L{"Retry limit reached?"}
    L -->|"No"| M["Add to retry queue"]
    L -->|"Yes"| N["Record failure"]
    M --> H
    K --> H
    N --> H
    H -->|"Yes"| O["Complete POG cycle"]
    O --> P["Sync scores"]
Sources: neurons/validator.py:638-762 , neurons/validator.py:1169-1176
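The worker pool and retry queue in the flowchart can be approximated with `asyncio`. This is a simplified sketch, not the validator's `proof_of_gpu` task: `run_pog_cycle` is a hypothetical name, the per-miner test is passed in as a callable, and the block check and random start delay are left out.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable


async def run_pog_cycle(
    miners: Iterable[str],
    test_miner: Callable[[str], Awaitable[bool]],
    max_workers: int = 3,
    retry_limit: int = 2,
) -> Dict[str, bool]:
    """Test each miner with bounded concurrency, retrying failures up to retry_limit times."""
    queue: asyncio.Queue = asyncio.Queue()
    for miner in miners:
        queue.put_nowait((miner, 0))  # (miner, retries used so far)
    results: Dict[str, bool] = {}

    async def worker() -> None:
        while True:
            miner, retries = await queue.get()
            try:
                ok = await test_miner(miner)
            except Exception:
                ok = False  # treat errors such as connection failures as a failed test
            if ok:
                results[miner] = True
            elif retries < retry_limit:
                queue.put_nowait((miner, retries + 1))  # re-queue before task_done
            else:
                results[miner] = False
            queue.task_done()

    workers = [asyncio.create_task(worker()) for _ in range(max_workers)]
    await queue.join()  # returns once every queued item, including retries, is processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


async def _demo():
    calls: Dict[str, int] = {}

    async def fake_test(miner: str) -> bool:
        calls[miner] = calls.get(miner, 0) + 1
        await asyncio.sleep(0)
        if miner == "good":
            return True
        if miner == "flaky":
            return calls[miner] >= 2  # fails once, then passes
        return False  # "broken" never passes

    outcome = await run_pog_cycle(["good", "flaky", "broken"], fake_test, max_workers=2)
    return outcome, calls


demo_results, demo_calls = asyncio.run(_demo())
```

Limiting the number of worker tasks caps how many miners are tested at once, and re-queuing a failed miner before marking the item done keeps `queue.join()` from returning while retries are still pending.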
Integration with Scoring System
The PoG results directly influence miner scoring:
- Successful verification stores GPU type and count in the database
- The scoring system retrieves this data when calculating miner scores
- Miners with more powerful/numerous GPUs receive higher scores
- These scores influence the weights set on the blockchain
- Weights determine reward distribution in the subnet
If a miner fails PoG verification or has no GPUs detected, it receives a score of 0 for GPU capabilities.
Sources: neurons/validator.py:343-366 , neurons/Validator/calculate_pow_score.py
Security Considerations
The PoG system includes several security measures:
- Script integrity verification prevents tampering with the benchmarking code
- Random seeds prevent precomputation of results
- Merkle proofs cryptographically verify computational results
- Performance-based verification makes it difficult to simulate GPUs with CPUs
- SSH connections are secured with proper authentication
These measures collectively ensure that miners cannot easily falsify their hardware capabilities.
Sources: neurons/validator.py:774-923 , neurons/Validator/pog.py
Table of GPU Identification Parameters
The following table illustrates examples of how different GPU models might be identified (actual values may vary):
| GPU Model | Typical FP16 TFLOPS | Typical FP32 TFLOPS | VRAM (GB) | Score Multiplier |
|---|---|---|---|---|
| RTX 3090 | 35-40 | 18-22 | 24 | High |
| RTX 3080 | 28-33 | 14-18 | 10 | Medium-High |
| A100 | 75-85 | 18-22 | 40/80 | Very High |
| V100 | 55-65 | 14-18 | 16/32 | High |
| T4 | 8-12 | 4-6 | 16 | Medium |
| K80 | 4-6 | 2-3 | 12 | Low |
The system uses tolerance pairs to account for variations in benchmark results across different environments and configurations.
Sources: neurons/Validator/calculate_pow_score.py , neurons/validator.py:856-858
Conclusion
Proof of GPU is a critical component of the NI Compute subnet that provides cryptographic assurance of miners’ hardware capabilities. By combining hardware detection, performance benchmarking, and cryptographic verification, the system maintains the integrity of the marketplace and ensures that rewards are distributed fairly based on actual GPU resources contributed to the network.