PQuantML

Welcome to the official documentation for PQuantML, a hardware-aware model compression framework supporting:

PQuantML enables efficient deployment of compact neural networks on resource-constrained hardware such as FPGAs and embedded accelerators.

Key Features

Joint Quantization + Pruning: Combine bit-width reduction with structured pruning.
Flexible Precision Control: Per-layer and mixed-precision configuration.
Hardware-Aware Objective: Include resource constraints (DSP, LUT, BRAM) in training.
Simple API: Configure compression through a single YAML or Python object.
PyTorch Integration: Works with custom training/validation loops.
Export Support: Model conversion towards hardware toolchains.