NeMo Skills

NeMo-Skills is a collection of pipelines to improve the "skills" of large language models. You can use it to generate synthetic data, train and evaluate models, analyze outputs, and more! Here are some of the things we support.

  • Flexible inference: Seamlessly switch between API providers, local servers, and large-scale Slurm jobs for LLM inference.
  • Multiple formats: Use any of the NeMo, vLLM, SGLang, and TensorRT-LLM servers and easily convert checkpoints from one format to another.
  • Model evaluation: Evaluate your models on many popular benchmarks:
    • Math problem solving: math, aime24, aime25, omni-math (and many more)
    • Formal proofs in Lean: minif2f, proofnet
    • Coding skills: human-eval, mbpp
    • Chat/instruction following: ifeval, arena-hard, mt-bench
    • General knowledge: mmlu, mmlu-pro, gpqa
  • Model training: Train models at speed-of-light using NeMo-Aligner.

You can find the full documentation here. To get started, follow this tutorial, browse the available pipelines, or run `ns --help` to see all available commands and their options.
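As a quick taste, the sketch below shows how a benchmark evaluation might be launched from Python. This is a minimal, illustrative sketch, not the definitive interface: it assumes the Python mirror of the `ns` CLI in `nemo_skills.pipeline.cli`, and the model path, cluster name, and output directory are placeholders. Check `ns eval --help` for the authoritative options.

```python
# Minimal sketch of launching an evaluation from Python.
# ASSUMPTION: nemo_skills.pipeline.cli mirrors the `ns` CLI commands;
# exact function and argument names may differ in your version.
from nemo_skills.pipeline.cli import eval as ns_eval, wrap_arguments

ns_eval(
    ctx=wrap_arguments(""),                    # extra args forwarded to the underlying eval script
    cluster="local",                           # or the name of a Slurm cluster config
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder checkpoint
    server_type="vllm",                        # nemo / vllm / sglang / trtllm
    benchmarks="gsm8k,math",                   # comma-separated benchmark list
    output_dir="/workspace/eval-results",      # placeholder path
)
```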

OpenMathInstruct-2

Using our pipelines, we created the OpenMathInstruct-2 dataset, which consists of 14M question-solution pairs (600K unique questions), making it nearly eight times larger than the previous largest open-source math reasoning dataset.
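The dataset is published on Hugging Face as `nvidia/OpenMathInstruct-2`, so it can be inspected with the standard `datasets` library. A minimal sketch (the field names in the comment are assumptions based on the dataset card, so verify against the actual schema):

```python
from datasets import load_dataset

# Stream the training split so the full 14M-pair dataset
# is not downloaded up front.
ds = load_dataset("nvidia/OpenMathInstruct-2", split="train", streaming=True)

# Peek at one record; fields such as "problem" and "generated_solution"
# are assumptions based on the dataset card.
example = next(iter(ds))
print(sorted(example.keys()))
```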

The models trained on this dataset achieve strong results on common mathematical benchmarks.

| Model | GSM8K | MATH | AMC 2023 | AIME 2024 | Omni-MATH |
|---|---|---|---|---|---|
| Llama3.1-8B-Instruct | 84.5 | 51.9 | 9/40 | 2/30 | 12.7 |
| OpenMath2-Llama3.1-8B (nemo \| HF) | 91.7 | 67.8 | 16/40 | 3/30 | 22.0 |
| + majority@256 | 94.1 | 76.1 | 23/40 | 3/30 | 24.6 |
| Llama3.1-70B-Instruct | 95.1 | 68.0 | 19/40 | 6/30 | 19.0 |
| OpenMath2-Llama3.1-70B (nemo \| HF) | 94.9 | 71.9 | 20/40 | 4/30 | 23.1 |
| + majority@256 | 96.0 | 79.6 | 24/40 | 6/30 | 27.6 |
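The majority@256 rows report majority voting: sample 256 solutions per question, extract the final answer from each, and pick the most common one. A minimal self-contained sketch of that aggregation step (the answer-extraction logic is benchmark-specific and omitted here):

```python
from collections import Counter

def majority_at_k(answers):
    """Pick the most common final answer among k sampled solutions.

    `answers` is a list of final-answer strings extracted from k
    independently sampled solutions to the same question; None marks
    a solution from which no answer could be extracted.
    """
    counts = Counter(a for a in answers if a is not None)
    if not counts:
        return None
    answer, _ = counts.most_common(1)[0]
    return answer

# Example with k=5 sampled answers: "72" wins the vote.
print(majority_at_k(["72", "68", "72", None, "72"]))  # -> "72"
```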

We provide all instructions to fully reproduce our results.

See our paper for ablation studies and more details!

Nemo Inspector

We also provide Nemo Inspector, a convenient tool for visualizing inference results and analyzing data.

Papers

If you find our work useful, please consider citing us!

@article{toshniwal2024openmathinstruct2,
  title   = {{OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data}},
  author  = {Shubham Toshniwal and Wei Du and Ivan Moshkov and Branislav Kisacanin and Alexan Ayrapetyan and Igor Gitman},
  year    = {2024},
  journal = {arXiv preprint arXiv:2410.01560}
}
@inproceedings{toshniwal2024openmathinstruct1,
  title   = {{OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset}},
  author  = {Shubham Toshniwal and Ivan Moshkov and Sean Narenthiran and Daria Gitman and Fei Jia and Igor Gitman},
  year    = {2024},
  booktitle = {Advances in Neural Information Processing Systems},
}

Disclaimer: This project is strictly for research purposes and is not an official product from NVIDIA.