nvidia-generative-ai-notes

Megatron Core

Megatron Core is an open-source, PyTorch-based library of GPU-optimized techniques and cutting-edge system-level optimizations. It abstracts them into composable, modular APIs, giving developers and model researchers full flexibility to train custom transformers at scale on NVIDIA accelerated computing infrastructure.

Megatron Core is the low-level, high-performance training library extracted from Megatron-LM.

It provides:

- tensor, pipeline, sequence, and context parallelism
- a distributed optimizer and distributed checkpointing
- fused kernels, FlashAttention support, and activation recomputation
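The core bookkeeping behind those parallelism modes can be pictured with a toy rank-to-group mapping. This is not the Megatron Core API (there, `parallel_state.initialize_model_parallel` does the real work); it is a simplified sketch of how a world of GPUs is typically factored into tensor-, pipeline-, and data-parallel groups:

```python
# Toy sketch of 3D-parallel rank grouping (NOT the Megatron Core API).
# The GPU "world" is factored as world = TP x PP x DP, and each rank gets
# one tensor-parallel, one pipeline-parallel, and one data-parallel group.

def rank_coords(rank, tp, pp, world):
    """Map a flat rank to (tp_rank, pp_rank, dp_rank) coordinates.

    Layout assumption (illustrative): TP varies fastest, then DP, then PP,
    roughly matching Megatron's default ordering.
    """
    dp = world // (tp * pp)
    assert tp * pp * dp == world, "world size must factor into tp * pp * dp"
    tp_rank = rank % tp
    dp_rank = (rank // tp) % dp
    pp_rank = rank // (tp * dp)
    return tp_rank, pp_rank, dp_rank

# Example: 8 GPUs with TP=2, PP=2 leaves DP=2.
coords = [rank_coords(r, tp=2, pp=2, world=8) for r in range(8)]
```

Every rank lands on a unique (tp, pp, dp) coordinate, which is what lets collectives run only within the relevant group.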

Megatron Bridge

Megatron Bridge is a PyTorch-native library within the NeMo Framework that provides pretraining, supervised fine-tuning (SFT), and LoRA for popular LLMs and VLMs. It also acts as a compatibility layer between Megatron Core/NeMo and other ecosystems such as HuggingFace.

The part of NeMo 2.0 that lets you:

- import HuggingFace checkpoints into Megatron Core format
- pretrain or fine-tune (SFT, LoRA) at scale on Megatron Core
- export trained checkpoints back to HuggingFace

If you’re training LLMs (not just fine-tuning small models), Megatron Bridge is the backend that powers large-scale distributed training.
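A compatibility layer like this is, at its core, checkpoint translation: renaming (and fusing or splitting) weights between the HuggingFace and Megatron naming schemes. The mapping below is an illustrative sketch; the key patterns are assumptions modeled on Llama-style checkpoints, not Megatron Bridge's actual conversion tables:

```python
import re

# Illustrative HF -> Megatron key renaming (assumed patterns, not the real
# Megatron Bridge tables). A real bridge also reshapes tensors, e.g. merging
# the separate q/k/v projections into one fused QKV matrix.
RULES = [
    (r"^model\.embed_tokens\.weight$", "embedding.word_embeddings.weight"),
    (r"^model\.layers\.(\d+)\.self_attn\.o_proj\.weight$",
     r"decoder.layers.\1.self_attention.linear_proj.weight"),
    (r"^model\.layers\.(\d+)\.mlp\.down_proj\.weight$",
     r"decoder.layers.\1.mlp.linear_fc2.weight"),
    (r"^model\.norm\.weight$", "decoder.final_layernorm.weight"),
]

def to_megatron_key(hf_key: str) -> str:
    """Translate one HF-style state-dict key to a Megatron-style key."""
    for pattern, repl in RULES:
        if re.match(pattern, hf_key):
            return re.sub(pattern, repl, hf_key)
    raise KeyError(f"no rule for {hf_key}")
```

Running the same tables in reverse is what makes export back to HuggingFace possible.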

Practical Example

Scenario 1: Training Llama 70B on 64 GPUs

You use:

- Megatron Bridge to load the Llama checkpoint, configure the run, and launch training

Megatron Core handles:

- tensor parallelism (splitting each layer's weights across GPUs)
- pipeline parallelism (splitting the stack of layers into stages)
- data parallelism and the distributed optimizer
- mixed precision, fused kernels, and activation recomputation
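Some back-of-the-envelope numbers show why this layout is needed (illustrative arithmetic, not measured values): with Adam in mixed precision, each parameter costs roughly 2 bytes (bf16 weights) + 2 (bf16 grads) + 12 (fp32 master weight plus two optimizer moments) = 16 bytes, so the model state of a 70B model alone is over a terabyte before activations:

```python
# Rough model-state memory for a 70B model trained with Adam in mixed
# precision (illustrative arithmetic; real footprints vary with settings).
params = 70e9
bytes_per_param = 2 + 2 + 12   # bf16 weights + bf16 grads + fp32 master/moments
total_gb = params * bytes_per_param / 1e9   # ~1120 GB of model state
gpus = 64
per_gpu_gb = total_gb / gpus                # ~17.5 GB/GPU if fully sharded

# One plausible 64-GPU factorization (an assumption, not a recommendation):
# TP=8 within a node, PP=4 across nodes, leaving DP = 64 / (8 * 4) = 2.
tp, pp = 8, 4
dp = gpus // (tp * pp)
print(f"{total_gb:.0f} GB total, {per_gpu_gb:.1f} GB/GPU, DP={dp}")
```

No single GPU holds that state, which is why the parallelism and the distributed optimizer are not optional at this scale.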

Scenario 2: You want to export that model to HuggingFace

You use:

- Megatron Bridge's export path, which converts the Megatron Core checkpoint back into a HuggingFace-format model directory
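Whatever exporter you use, the end product is a HuggingFace-format directory: weight shards plus a `config.json` that the `transformers` loader understands. A minimal sketch of the metadata half, filled with Llama-2-70B-style values (the values are illustrative, not produced by any real converter here):

```python
import json

# Sketch of the HF-side metadata an exporter must emit (Llama-2-70B-like
# values; illustrative only).
config = {
    "architectures": ["LlamaForCausalLM"],
    "model_type": "llama",
    "hidden_size": 8192,
    "intermediate_size": 28672,
    "num_hidden_layers": 80,
    "num_attention_heads": 64,
    "num_key_value_heads": 8,
    "vocab_size": 32000,
    "torch_dtype": "bfloat16",
}
config_json = json.dumps(config, indent=2)
```

Once the converted weights and this metadata are on disk, the model loads like any other HuggingFace checkpoint.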