ZipFly is a zip archive generator based on zipfile.py

Overview

ZipFly is a zip archive generator based on zipfile.py. It was created by Buzon.io to generate very large ZIP archives and stream them to clients immediately, or to write large ZIP archives without inflating memory use.

Requirements

Python 3.6+

Install

pip3 install zipfly

Basic usage, compress on-the-fly during writes

Using this library saves you from having to write the ZIP to disk first. The zipfile deflater buffers some data, but memory use stays tightly bounded. By default, data is written to the destination in 32 KB (32768-byte) chunks.

ZipFly default attributes:

  • paths: [ ]
  • mode: (write) w
  • chunksize: (bytes) 32768
  • compression: Stored
  • allowZip64: True
  • compresslevel: None
  • storesize: (bytes) 0
  • encode: utf-8

paths is a list of dictionaries with these keys:

  • fs: path to a file on the filesystem
  • n: (optional) name the file will have within the archive (by default, the same as fs)

    import zipfly

    paths = [
        {
            'fs': '/path/to/large/file'
        },
    ]

    zfly = zipfly.ZipFly(paths = paths)

    generator = zfly.generator()
    print(generator)
    # prints a generator object; iterate it to produce the archive bytes

    with open("large.zip", "wb") as f:
        for i in generator:
            f.write(i)
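
To illustrate the general idea behind generating a ZIP on the fly, here is a minimal sketch built only on the standard-library zipfile module. It is not ZipFly's actual implementation: `ChunkSink` and `stream_zip` are hypothetical names, and the sketch assumes Python 3.6+ and the same `fs` / `n` keys described above. The archive is written to a write-only, non-seekable sink, and whatever has accumulated in the sink is yielded after each chunk is read.

```python
import zipfile


class ChunkSink:
    """Write-only, non-seekable sink that collects bytes until drained (hypothetical helper)."""

    def __init__(self):
        self._chunks = []
        self._pos = 0

    def write(self, data):
        self._chunks.append(bytes(data))
        self._pos += len(data)
        return len(data)

    def tell(self):
        return self._pos

    def flush(self):
        pass

    def drain(self):
        data = b"".join(self._chunks)
        self._chunks.clear()
        return data


def stream_zip(paths, chunksize=32768):
    """Yield the bytes of an uncompressed ZIP archive piece by piece."""
    sink = ChunkSink()
    # The sink has no seek(), so zipfile falls back to streaming mode
    # and writes sizes and CRCs after each file's data.
    with zipfile.ZipFile(sink, mode="w", compression=zipfile.ZIP_STORED,
                         allowZip64=True) as zf:
        for entry in paths:
            arcname = entry.get("n", entry["fs"])
            info = zipfile.ZipInfo.from_file(entry["fs"], arcname)
            with open(entry["fs"], "rb") as src, zf.open(info, mode="w") as dest:
                for block in iter(lambda: src.read(chunksize), b""):
                    dest.write(block)
                    data = sink.drain()
                    if data:
                        yield data
    # Closing the archive writes the central directory; emit what is left.
    data = sink.drain()
    if data:
        yield data
```

Only about one chunk of file data is held in memory at a time, which is why this pattern suits very large inputs.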

Examples

Streaming multiple files in a zip with Django or Flask: send large files to clients with the most popular frameworks.

Create paths: an easy way to create the paths array from a parent folder.
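
One possible way to build such a list is to walk the parent folder with `os.walk`. This is a sketch only: `build_paths` and its `prefix` parameter are hypothetical names, not part of ZipFly, and the archive names are simply the paths relative to the parent folder.

```python
import os


def build_paths(parent, prefix=""):
    """Return a list of {'fs': ..., 'n': ...} dicts for every file under parent."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(parent):
        dirnames.sort()  # visit subfolders in a stable order
        for filename in sorted(filenames):
            fs = os.path.join(dirpath, filename)
            rel = os.path.relpath(fs, parent)
            # Archive names use forward slashes regardless of the platform.
            name = os.path.join(prefix, rel).replace(os.sep, "/")
            paths.append({"fs": fs, "n": name})
    return paths
```

The result can be passed as the `paths` argument shown in the basic usage above. Note that only files are listed, so empty folders are skipped.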

Predict the size of the zip file before creating it: use the BufferPredictionSize to compute the correct size of the resulting archive before creating it.
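
The size of an uncompressed archive can be predicted because every record in the ZIP format has a fixed-size part. The sketch below shows the arithmetic only and is not ZipFly's own code: `predict_stored_zip_size` is a hypothetical name, and it assumes stored (uncompressed) entries, no ZIP64 records, no extra fields, and a 16-byte data descriptor after each file, as recent CPython versions write when streaming.

```python
def predict_stored_zip_size(entries, comment=b""):
    """Predict the byte size of a streamed, uncompressed ZIP archive.

    entries is an iterable of (archive_name, file_size_in_bytes) pairs.
    """
    LOCAL_HEADER = 30      # fixed part of each local file header
    DATA_DESCRIPTOR = 16   # signature + CRC + compressed size + size
    CENTRAL_HEADER = 46    # fixed part of each central directory entry
    END_RECORD = 22        # end of central directory record

    total = END_RECORD + len(comment)
    for name, size in entries:
        name_len = len(name.encode("utf-8"))
        total += LOCAL_HEADER + name_len + size + DATA_DESCRIPTOR
        total += CENTRAL_HEADER + name_len
    return total
```

A value like this can be sent as the Content-Length of a streamed download, provided the assumptions above hold for the archive being generated.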

Streaming a large file: an efficient way to read a single very large binary file in Python.
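
A common pattern for this is a generator that reads the file in fixed-size chunks, so only one chunk is in memory at a time. The sketch below uses a hypothetical `read_in_chunks` helper and borrows the 32768-byte default from the chunksize attribute listed above.

```python
def read_in_chunks(path, chunksize=32768):
    """Yield the contents of a binary file in chunks of at most chunksize bytes."""
    with open(path, "rb") as f:
        while True:
            block = f.read(chunksize)
            if not block:
                break
            yield block
```

Each yielded block can be written to a socket, a response stream, or another file without loading the whole file first.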

Set a comment: your own comment in the zip file.

Maintainer

Santiago Debus (@santiagodebus.com)

License

This library was created by Buzon.io and is released under the MIT License. Copyright 2021 Cardallot, Inc.

Comments
  • Cannot include empty folders

    Python's zipfile module supports writing empty folders to a zip archive.

    When trying with zipfly I get:

    File "/.../zipfly.py", line 211, in generator
        with open( path[self.filesystem], 'rb' ) as e:
    IsADirectoryError: [Errno 21] Is a directory: '/path/to/dir'
    

    Is there an undocumented way for writing empty folders?

    opened by rnixx 2
  • Is compression possible?

    Hi,

    I see that only uncompressed ZIP files can be created with zipfly. What is the reason for that? I don't see any compression level specific code in ZipFly.generator, except file size estimation (which doesn't work for ZIP64 archives).

    So, is compression just an unimplemented feature? Or are there any aspects which don't allow using compression for streaming made this way?

    I understand that compression is often useless in streaming scenarios but it's not true in our case. We're looking for a replacement for zipstream-new which seems to be dead. ZipFly looks nice for us...

    opened by dezhin 2
  • ASGI breaks functionability in HttpStreamingResponse

    Running Django 4.1 in async ASGI mode (channels & websockets needed)

    ZipFly stops working as intended: it tries to build the entire zip in memory before sending it to the client, which fills up memory and causes a crash.

    opened by T-101 1
  • Release master branch to pypi

    It looks like the last version on PyPI was published 2 years ago and there have been many changes in the code since.

    It would be helpful if you could release the latest changes.

    PS since the project is already using GH action for testing, it may be helpful to set up GH actions for release as well.

    https://packaging.python.org/en/latest/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows/

    https://github.com/marketplace/actions/pypi-publish

    opened by 1oglop1 1
  • Stream into stream

    Hi, thank you for the inspirational code.

    I'm wondering if it's possible to adapt your code and make it compatible with S3 storage.

    I use django-storages and boto3 client returns a streaming body (already open fp) and all metadata.

    I need a zipfile (or any other archive) to be created from the set of files during the GET request (not great but it has to be).

    So to save the memory I have 2 options.

    1. Use zipfly as is and download files into a temporary location (and remove it after the operation)
    2. Or, a better solution that doesn't require intermediate storage: pipe the content of each streaming body into zipfly and return that as a streaming response.

    streaming body (many of them) -> zipfly(zip file) -> streaming response

    Do you think that option two is possible? If so, could you please point out what pieces of code I should focus on to adapt zipfly?

    Thank you

    opened by 1oglop1 1
  • Add functionality to provide file streams and zip them on the fly

    It is a common use case to store large files remotely and not next to the code. See e.g. for django https://github.com/jschneier/django-storages. If many (large) files must be shipped synchronously as a zip file, it saves memory and storage to pass them through the web worker as a stream without saving anything to disk. To achieve that, file buffers could be supported as input.

    I would appreciate if we could add this functionality. This pull request contains a tested initial attempt. Feel free to improve!

    This pull request is based on https://github.com/BuzonIO/zipfly/pull/75 and https://github.com/BuzonIO/zipfly/pull/76

    opened by beckedorf 0
  • Using bytes instead of actual files

    Hi, thanks for the great work. I have some bytes generators and I want to make a zip on the fly out of them. Is there an option to do this? Sorry for my poor python knowledge. Thanks :)

    opened by TaToTanWeb 2
Owner
Buzon
Buzon.io is a Cardallot Inc. service