ptq
Here are 14 public repositories matching this topic...
This is the official PyTorch implementation of "LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models"
Updated Jul 3, 2024 - Python
Model Compression Toolkit (MCT) is an open-source project for neural network model optimization targeting efficient, constrained hardware. It provides researchers, developers, and engineers with advanced quantization and compression tools for deploying state-of-the-art neural networks.
Updated Jul 3, 2024 - Python
Brevitas: neural network quantization in PyTorch
Updated Jul 4, 2024 - Python
[ICML 2024] Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
Updated Jun 15, 2024 - Python
Quantization of models: Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT)
Updated May 21, 2024 - Jupyter Notebook
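The PTQ side of the distinction above can be illustrated with a minimal, framework-free sketch: quantizing already-trained weights to int8 with a symmetric per-tensor scale. The function names here are illustrative, not taken from any listed repository; real toolkits add calibration data, per-channel scales, and operator fusion.

```python
def quantize_int8(weights):
    """Map float weights to int8 using a symmetric per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [x * scale for x in q]

weights = [0.5, -1.2, 0.03, 0.9]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)

# Round-to-nearest keeps the error within half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, approx))
assert max_err <= scale / 2 + 1e-12
```

This is "post-training" in the sense that no gradient updates happen: the weights are frozen and only re-encoded, which is why PTQ is cheap but can lose more accuracy than QAT.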
A more readable and flexible yolov5, with additional backbones (gcn, resnet, shufflenet, mobilenet, efficientnet, hrnet, swin-transformer, etc.), extra modules (cbam, dcn, and so on), and tensorrt support
Updated May 8, 2024 - Python
EfficientNetV2 (efficientnetv2-b2) with int8 and fp32 quantization (QAT and PTQ) on the CK+ dataset: fine-tuning, augmentation, handling an imbalanced dataset, etc.
Updated May 4, 2024 - Jupyter Notebook
Post-training quantization (PTQ) method for improving LLMs. Unofficial implementation of https://arxiv.org/abs/2309.02784
Updated Feb 21, 2024 - Python
Inference with structured sparsity and quantization
Updated Aug 30, 2023 - Python
Build an AI model to classify beverages for blind individuals
Updated Aug 16, 2023 - Python
Quantization examples for PTQ & QAT
Updated Jul 24, 2023 - Python
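For the QAT half of examples like the one above, the key mechanism is "fake quantization": the forward pass rounds weights to the int8 grid so training sees quantization noise, while the backward pass treats the rounding as identity (the straight-through estimator). A minimal sketch, with illustrative names not drawn from any listed repository:

```python
def fake_quantize(w, scale):
    """Quantize-dequantize one weight: float -> int8 grid -> float.

    In QAT this runs in every forward pass; the backward pass (not
    shown) passes gradients straight through the rounding step.
    """
    q = max(-128, min(127, round(w / scale)))
    return q * scale

# The loss during QAT is computed on values the int8 model can
# actually represent, so the network learns to tolerate the rounding.
scale = 1.0 / 127
w = 0.30
w_q = fake_quantize(w, scale)
assert abs(w_q - w) <= scale / 2  # within half a quantization step
```

The design point is that training adapts the weights around the quantization grid, which is why QAT usually recovers more accuracy than PTQ at the cost of a full training loop.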
Generating a TensorRT model from ONNX
Updated Jun 22, 2023 - C++