Spaces: [node] estimation
README.md CHANGED

```diff
@@ -89,6 +89,8 @@ python app.py
 | Qwen2-VL-7B | 1024/256 | 1 | Inference | FP16 | 1 |
 | VILA-1.5-13B | 2048/512 | 2 | Inference | BF16 | 1 |
 | Qwen2-Audio-7B | 1024/256 | 1 | Inference | FP16 | 1 |
+| PhysicsNeMo-FNO-Large | 512/128 | 8 | Training | FP32 | 1 |
+| PhysicsNeMo-GraphCast-Medium | 1024/256 | 4 | Training | FP16 | 1 |
 
 ## CUDA Recommendations
 
```
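The node counts in these example rows follow from simple memory arithmetic. As a rough sanity check only (this is a sketch, not the app's `estimate_h100_nodes()` implementation, which this diff does not show; the flat KV-cache allowance and the 20% overhead factor are assumptions):

```python
import math

# Back-of-envelope check for the inference rows above. Not the app's actual
# estimation logic; the KV-cache allowance and overhead factor are illustrative.

H100_MEM_GB = 80       # memory per H100 GPU
GPUS_PER_NODE = 8      # typical HGX H100 node

def rough_inference_nodes(params_billion: float, bytes_per_param: int,
                          kv_cache_gb: float = 10.0, overhead: float = 1.2) -> int:
    weights_gb = params_billion * bytes_per_param      # e.g. 7B * 2 bytes = 14 GB
    total_gb = (weights_gb + kv_cache_gb) * overhead
    node_mem_gb = H100_MEM_GB * GPUS_PER_NODE          # 640 GB per node
    return max(1, math.ceil(total_gb / node_mem_gb))

print(rough_inference_nodes(7, 2))    # Qwen2-VL-7B, FP16  -> 1 node (matches the table)
print(rough_inference_nodes(13, 2))   # VILA-1.5-13B, BF16 -> 1 node
```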
```diff
@@ -133,6 +135,12 @@ The application provides tailored CUDA version recommendations:
 - **Qwen-Audio**: Base, Chat variants
 - **Qwen2-Audio**: 7B
 
+#### Physics-ML Models (NVIDIA PhysicsNeMo)
+- **Fourier Neural Operators (FNO)**: Small (1M), Medium (10M), Large (50M)
+- **Physics-Informed Neural Networks (PINN)**: Small (0.5M), Medium (5M), Large (20M)
+- **GraphCast**: Small (50M), Medium (200M), Large (1B) - for weather/climate modeling
+- **Spherical FNO (SFNO)**: Small (25M), Medium (100M), Large (500M) - for global simulations
+
 ### Precision Impact
 - **FP32**: Full precision (4 bytes per parameter)
 - **FP16/BF16**: Half precision (2 bytes per parameter)
```
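To make the bytes-per-parameter figures concrete for the newly listed PhysicsNeMo sizes, here is a small sketch. The 4x training multiplier (weights + gradients + Adam moment buffers, ignoring activations) is a common rule of thumb assumed here, not something this README states:

```python
# Weight memory from the "Precision Impact" rule (4 bytes FP32, 2 bytes FP16/BF16),
# applied to the PhysicsNeMo sizes above. The 4x training multiplier is an assumed
# rule of thumb (weights + gradients + two Adam moment buffers), not from the README.

BYTES = {"FP32": 4, "FP16": 2, "BF16": 2}

def weight_gb(params_millions: float, precision: str) -> float:
    return params_millions * 1e6 * BYTES[precision] / 1e9

def rough_training_gb(params_millions: float, precision: str) -> float:
    # weights + gradients + Adam m/v states, ignoring activation memory
    return 4 * weight_gb(params_millions, precision)

print(weight_gb(50, "FP32"))            # FNO-Large weights: 0.2 GB
print(rough_training_gb(1000, "FP16"))  # GraphCast-Large (1B) training states: ~8 GB
```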
```diff
@@ -145,6 +153,17 @@ The application provides tailored CUDA version recommendations:
 - **Memory Overhead**: Additional memory for vision/audio encoders and cross-modal attention
 - **Token Estimation**: Consider multimodal inputs when calculating token counts
 
+### PhysicsNeMo Considerations
+- **Grid-Based Data**: Physics models work with spatial/temporal grids rather than text tokens
+- **Batch Training**: Physics-ML models typically require larger batch sizes for stable training
+- **Memory Patterns**: Different from LLMs - less KV cache, more gradient memory for PDE constraints
+- **Precision Requirements**: Many physics simulations require FP32 for numerical stability
+- **Use Cases**:
+  - **FNO**: Solving PDEs on regular grids (fluid dynamics, heat transfer)
+  - **PINN**: Physics-informed training with PDE constraints
+  - **GraphCast**: Weather prediction and climate modeling
+  - **SFNO**: Global atmospheric and oceanic simulations
+
 ## Limitations
 
 - Estimates are approximate and may vary based on:
```
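The "Grid-Based Data" and "Memory Patterns" points can be illustrated with a rough activation-memory estimate for a grid-based model, where memory scales with batch size, grid points, and channels rather than with token count. The grid shape, channel width, and layer count below are arbitrary example values, not defaults from the application:

```python
# Illustration of the "Grid-Based Data" point: for an FNO-style model the dominant
# activation cost scales with batch * grid points * channels * layers, not tokens.
# All sizes below are arbitrary example values.

def grid_activation_gb(batch: int, grid: tuple, channels: int,
                       layers: int, bytes_per_value: int = 4) -> float:
    points = 1
    for dim in grid:
        points *= dim
    return batch * points * channels * layers * bytes_per_value / 1e9

# Batch of 8 samples on a 256x256x64 grid, 32 channels, 8 blocks, FP32:
print(round(grid_activation_gb(8, (256, 256, 64), 32, 8), 1))  # ~34.4 GB
```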
app.py CHANGED

```diff
@@ -272,10 +272,10 @@ def estimate_nodes_interface(
 
     # Validate inputs
     if input_tokens <= 0 or output_tokens <= 0:
-        return "Please enter valid token counts (> 0)", "", None, ""
+        return "Please enter valid token counts (> 0)", "", None, "## ⚠️ <span style='color: #E74C3C;'>**Invalid Input: Token counts must be > 0**</span>"
 
     if batch_size <= 0:
-        return "Please enter a valid batch size (> 0)", "", None, ""
+        return "Please enter a valid batch size (> 0)", "", None, "## ⚠️ <span style='color: #E74C3C;'>**Invalid Input: Batch size must be > 0**</span>"
 
     # Calculate node requirements
     nodes_needed, explanation, breakdown = estimate_h100_nodes(
@@ -288,7 +288,7 @@ def estimate_nodes_interface(
     # Create performance chart
     fig = create_performance_chart(breakdown)
 
-    return explanation, cuda_rec, fig, f"
+    return explanation, cuda_rec, fig, f"## 🖥️ <span style='color: #4A90E2;'>**Estimated H100 Nodes Required: {nodes_needed}**</span>"
 
 # Create Gradio interface
 def create_interface():
@@ -345,7 +345,7 @@ def create_interface():
             with gr.Column(scale=2):
                 gr.Markdown("## Results")
 
-                node_count = gr.Markdown("
+                node_count = gr.Markdown("## 🖥️ <span style='color: #4A90E2;'>**Ready to estimate...**</span>")
 
         with gr.Tab("📊 Detailed Analysis"):
             detailed_output = gr.Markdown()
```
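`estimate_nodes_interface()` returns a 4-tuple (explanation, CUDA recommendation, figure, node-count markdown), and `node_count` is a `gr.Markdown` component, but the wiring between them is not part of this diff. The following is only a minimal, self-contained sketch of the usual Gradio pattern: the stub estimator, the input widgets, and the `cuda_output`/`chart` names are invented for illustration; only `node_count` and `detailed_output` appear in the diff.

```python
import gradio as gr

def estimate_nodes_interface(input_tokens, output_tokens, batch_size):
    # Stub with the same return shape as the function in app.py; the real
    # estimate_h100_nodes(), create_performance_chart(), and CUDA logic are
    # not shown in this diff.
    if input_tokens <= 0 or output_tokens <= 0:
        return "Please enter valid token counts (> 0)", "", None, "## ⚠️ **Invalid Input: Token counts must be > 0**"
    if batch_size <= 0:
        return "Please enter a valid batch size (> 0)", "", None, "## ⚠️ **Invalid Input: Batch size must be > 0**"
    nodes_needed = 1  # placeholder for estimate_h100_nodes(...)
    fig = None        # placeholder for create_performance_chart(breakdown)
    return ("Example breakdown...", "Example CUDA recommendation", fig,
            f"## 🖥️ **Estimated H100 Nodes Required: {nodes_needed}**")

with gr.Blocks() as demo:
    input_tokens = gr.Number(value=1024, label="Input tokens")   # assumed widgets
    output_tokens = gr.Number(value=256, label="Output tokens")
    batch_size = gr.Number(value=1, label="Batch size")
    node_count = gr.Markdown("## 🖥️ **Ready to estimate...**")
    detailed_output = gr.Markdown()
    cuda_output = gr.Markdown()   # assumed name
    chart = gr.Plot()             # assumed name
    gr.Button("Estimate").click(
        fn=estimate_nodes_interface,
        inputs=[input_tokens, output_tokens, batch_size],
        # Output order matches the 4-tuple returned above.
        outputs=[detailed_output, cuda_output, chart, node_count],
    )

# demo.launch()
```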