Python function to construct an ODS spreadsheet on the fly - without having to store the entire file in memory or disk

Overview

stream-write-ods CircleCI Test Coverage

Python function to construct an ODS (OpenDocument Spreadsheet) on the fly - without having to store the entire file in memory or disk.

Can be used to convert CSV, SQLite, or JSON to ODS format.

Installation

pip install stream-write-ods

Usage

In general, pass a nested iterable to stream_write_ods and it will return an interable of bytes of the ODS file, as follows.

from stream_write_ods import stream_write_ods

def get_sheets():
    def get_rows_of_sheet_1():
        yield 'Value A', 'Value B'
        yield 'Value C', 'Value D'

    yield 'Sheet 1 name', ('col_1_name', 'col_2_name'), get_rows_of_sheet_1()

    def get_rows_of_sheet_2():
        yield 'col_1_value',

    yield 'Sheet 2 name', ('col_1_name',), get_rows_of_sheet_2()

ods_chunks = stream_write_ods(get_sheets())

Usage: Convert CSV to ODS

The following recipe converts a CSV to ODS.

import codecs
import csv
from stream_write_ods import stream_write_ods

# Any iterable that yields the bytes of a CSV file
# Hard coded for the purposes of this example
csv_bytes_iter = (
    b'col_1,col_2\n',
    b'1,"value"\n',
)

def get_sheets(sheet_name, csv_reader):
    yield sheet_name, next(csv_reader), csv_reader

csv_str_iter = codecs.iterdecode(csv_bytes_iter, 'utf-8')
csv_reader = csv.reader(csv_str_iter, csv.QUOTE_NONNUMERIC)
ods_chunks = stream_write_ods(get_sheets('Sheet 1', csv_reader))

Usage: Convert JSON to ODS

Using ijson to stream-parse a JSON file, it's possible to convert JSON data to ODS on the fly:

import ijson
import itertools
from stream_write_ods import stream_write_ods

# Any iterable that yields the bytes of a JSON file
# Hard coded for the purposes of this example
json_bytes_iter = (b'''{
  "data": [
      {"id": 1, "name": "Foo"},
      {"id": 2, "name": "Bar"}
  ]
}''',)

# ijson requires a file-like object
def to_file_like_obj(bytes_iter):
    chunk = b''
    offset = 0
    it = iter(bytes_iter)

    def up_to_iter(num):
        nonlocal chunk, offset

        while num:
            if offset == len(chunk):
                try:
                    chunk = next(it)
                except StopIteration:
                    break
                else:
                    offset = 0
            to_yield = min(num, len(chunk) - offset)
            offset = offset + to_yield
            num -= to_yield
            yield chunk[offset - to_yield:offset]

    class FileLikeObj:
        def read(self, n):
            return b''.join(up_to_iter(n))

    return FileLikeObj()

def get_sheets(json_file):
    columns = None

    def rows():
        nonlocal columns
        for item in ijson.items(json_file, 'data.item'):
            if columns is None:
                columns = list(item.keys())
            yield tuple(item[column] for column in columns)

    # Ensure columns populated
    rows_it = rows()
    first_row = next(rows_it)

    yield 'Sheet 1', columns, itertools.chain((first_row,), rows_it)

json_file = to_file_like_obj(json_bytes_iter)
ods_chunks = stream_write_ods(get_sheets(json_file))

Usage: Convert SQLite to ODS

SQLite isn't particularly streaming-friendly since typically you need random access to the file. But it's still possible to use stream-write-ods to convert SQLite to ODS.

import contextlib
import sqlite3
import tempfile
from stream_write_ods import stream_write_ods

@contextlib.contextmanager
def get_db():
    # Hard coded in memory database for the purposes of this example
    with sqlite3.connect(':memory:') as con:
        cur = con.cursor()
        cur.execute("CREATE TABLE my_table_a (my_col text);")
        cur.execute("CREATE TABLE my_table_b (my_col text);")
        cur.execute("INSERT INTO my_table_a VALUES ('Value A')")
        cur.execute("INSERT INTO my_table_b VALUES ('Value B')")
        yield con

def quote_identifier(value):
    return '"' + value.replace('"', '""') + '"'

def get_sheets(db):
    cur_table = db.cursor()
    cur_table.execute('''
        SELECT name FROM sqlite_master
        WHERE type = "table" AND name NOT LIKE 'sqlite\\_%' ESCAPE '\\'
    ''')
    cur_data = db.cursor()
    for table, in cur_table:
        cur_data.execute(f'SELECT * FROM {quote_identifier(table)} ORDER BY rowid')
        yield table, tuple(col[0] for col in cur_data.description), cur_data

with get_db() as db:
    ods_chunks = stream_write_ods(get_sheets(db))

Types

There are 8 possible data types in an Open Document Spreadsheet: boolean, currency, date, float, percentage, string, time, and void. 4 of these can be output by stream-write-ods, chosen automatically according to the following table.

Python type ODS type
boolean boolean
date date - without time component
datetime date - with time component
int float
float float
str string
NoneType string - as #NA

Limitations

ODS spreadsheets are essentially ZIP archives containing several member files. While in general ZIP archives can be up to 16EiB (exbibyte) in size using ZIP64, LibreOffice does not support ZIP64, and so ODS files are de-facto limited to 4GiB (gibibyte). This limit applies to the size of the entire compressed archive, the compressed size of each member file, and the uncompressed size of each member file.

Owner
Department for International Trade
Department for International Trade
A python based all-in-one tool for Google Drive

gdrive-tools A python based all-in-one tool for Google Drive Uses For Gdrive-Tools ✓ generate SA ✓ Add the SA and Add them to TD automatically ✓ Gener

XcodersHub 32 Feb 09, 2022
A telegram bot help you to get stylish fonts and text

Stylish Font Bot 🐿 This is a telegram bot help you to get stylish fonts and text. Config Vars 🤖 API_HASH: Get this value from my.telegram.org. API_K

MSTL updates 1 Nov 08, 2021
Um bot simples para seguir as pessoas

Um bot simples para seguir pessoas no instagram, criado apeanas para testes. Utilizando o framework "Selenium", criei um bot para entrar em uma conta

Mobben 1 Nov 05, 2021
Tools untuk cek nomor rekening, terhadap penipuan yang sudah terjadi!

No Rekening Checker Selalu waspada terhadap penipuan! Sebelum anda transfer sejumlah uang alangkah baiknya untuk cek terlebih dahulu, apakah norek itu

Hanif Ahmad Syauqi 8 Dec 25, 2022
Trading strategy for the Freqtrade crypto bot

NostalgiaForInfinity Trading strategy for the Freqtrade crypto bot Change strategy Add strategies to the user_data/strategies folder and also in the d

iterativ 1.5k Jan 01, 2023
Gorrabot is a bot made to automate checks and processes in the development process.

Gorrabot is a Gitlab bot made to automate checks and processes in the Faraday development. Features Check that the CHANGELOG is modified By default, m

Faraday 7 Dec 14, 2022
Discord-Lite - A light weight discord client written in Python, for developers, by developers.

Discord-Lite - A light weight discord client written in Python, for developers, by developers.

Sachit 142 Jan 07, 2023
A solution designed to extract, transform and load Chicago crime data from an RDS instance to other services in AWS.

This project is intended to implement a solution designed to extract, transform and load Chicago crime data from an RDS instance to other services in AWS.

Yesaswi Avula 1 Feb 04, 2022
A Discord/Xenforo bot!

telathbot A Discord/Xenforo bot! Pre-requisites pyenv (via installer) poetry Docker (with Go version of docker compose enabled) Local development Crea

Telath 4 Mar 09, 2022
AWS Auto Inventory allows you to quickly and easily generate inventory reports of your AWS resources.

Photo by Denny Müller on Unsplash AWS Automated Inventory ( aws-auto-inventory ) Automates creation of detailed inventories from AWS resources. Table

AWS Samples 123 Dec 26, 2022
AWS SQS event redrive Lambda

This repository contains the Lambda function to redrive sqs events from source to destination queue while controlling maxRetry per event.

1 Oct 19, 2021
Plataforma para atendimento a outras empresas que necessitam de atendimento técnico.

Plataforma para atendimento a outras empresas que necessitam de atendimento técnico. É possível que os usuarios de empresas parceiras registrem solici

Kelvin Alisson Cantarino 2 Jun 29, 2022
A simple versatile telgeram bot written in Python using pyTelegramBotAPI library.

A simple versatile telgeram bot written in Python using pyTelegramBotAPI library.

Benyamin Zojaji 15 Jun 17, 2022
Grade Notifyer Bot

A bot that automatically crawl the submission platform of montefiore to notify the student when a project has been graded.

Julien Gustin 2 Jun 02, 2022
Twitch Points Miner for multiple accounts with Discord logging

Twitch Points Miner for multiple accounts with Discord logging Creator of the Twitch Miner -- PLEASE NOTE THIS IS PROBABLY BANNABLE -- Made on python

8 Apr 27, 2022
Simple Python script that lets you upload image/video to imgur

Pymgur 🐍 Simple Python script that lets you upload image/video to imgur! Usage 🔨 Git Clone this repository install the requirements (pip install -r

3 Feb 20, 2022
A script that takes what you're listening too on Spotify and sets it as your Nertivia custom status.

nertivia-spotify-listening-status A script that takes what you're listening too on Spotify and sets it as your Nertivia custom status. setup Install r

Ben Tettmar 2 Feb 03, 2022
The text based version of my App Blocker that I planning on converting to GUI soon.

App-Blocker The text based version of my App Blocker that I planning on converting to GUI soon. Currently I am just uploading the appblocker.py file,

Harsh Raj 0 Sep 13, 2022
Python client for the Datadog API

datadog-api-client-python This repository contains a Python API client for the Datadog API. The code is generated using openapi-generator and apigento

Datadog, Inc. 58 Dec 16, 2022
A Twitter bot written in Python using Tweepy and hosted on a server.

A Twitter bot written in Python using Tweepy. It can like and/or retweet tweets that contain single or multiple keywords and hashtags.

anniedotexe 11 Dec 15, 2022