{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## LLM Compressor Workbench -- Getting Started\n", "\n", "This notebook demonstrates how common [LLM Compressor](https://github.com/vllm-project/llm-compressor) flows can be run on Alauda AI.\n", "\n", "We will show how a user can compress a large language model using a calibration dataset.\n", "\n", "The notebook detects whether a GPU is available. If one is not, it runs an abbreviated demonstration, so users without GPU access can still get a feel for `llm-compressor`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Calibrated Compression with a Dataset\n", "\n", "Some more advanced compression algorithms require a small set of calibration samples, meant to be a representative random subset of the data the model will see at inference.\n", "\n", "We will show how the previous section can be augmented with a calibration dataset and GPTQ, one of the first published LLM compression algorithms.\n", "\n", "