1 Repositories
Latest Python Libraries
Convert BART models to ONNX with quantization. 3X reduction in size, and upto 3X boost in inference speed
fast-Bart Reduction of BART model size by 3X, and boost in inference speed up to 3X BART implementation of the fastT5 library (https://github.com/Ki6a
19 Dec 09, 2022