Open-Source Python CLI package for copying DynamoDB tables and items in parallel batch processing + query natural & Global Secondary Indexes (GSIs)

Overview

dynamo-cmdline

Python Command-Line Interface Package to copy Dynamodb data in parallel batch processing + query natural & Global Secondary Indexes (GSIs).

Author: Simon Ryu

This packaging distribution is published on PyPI, found here.

What is DynamoDB?

Dynamo is a NoSQL database. A table in Dynamo is defined by its Partition Key, which uniquely identifies a list of records. The lists are ordered by the Sort Key, an optional key that along with Partition Key, form a primary key referred to as a composite primary key. This gives an additional flexibility when querying data. Each partition (Partition Key) can be thought of as a filing cabinet drawer, containing a bunch of related records which may or may not be sorted (Sort Key) depending on your need. Accounting for this optionality, the DynamodbTable class in module dynamodb_table.py can represent tables with either simple (Partition Key) or composite (Partition Key + Sort Key) primary key.

Dynamo vs Relational database

Dynamo differs from traditional, relational databases in that tables cannot be queried by random fields. Because it is structured to guarantee fast and scalable queries, tables also cannot be joined, grouped or unioned. A specific item can be found by specifiying a partition key and sort key, or a range of values within a partition, filtered by the sort key. Although filtering by other fields (attributes) is possible, it is highly discouraged as AWS charges the user based on how much data is read, and non-key filters occur after the reads happen. Querying, and especially copying large amount of data across different AWS environments must be done with extreme care (double the query operations!), hence this CLI package was distributed.

GSI

Since querying is limited to the table's primary key, how can we address many different access patterns? The answer is global secondary indexes, or GSIs. A GSI allows you to essentially re-declare your table with a new key schema. When an item is written into the table, the index will update automatically, so managing dual-writing is not a concern. Most importantly, the GSI can be queried directly just like the natural table, just as fast.

Libraries Used

  • AWS SDK for Python - Boto3
  • multiprocessing
Generate folder trees directly from the terminal.

Dir Tree Artist 🎨 🌲 Intro Easily view folder structure, with parameters to sieve out what you want. Choose to exclude files from being viewed (.git,

Glenda T 0 May 17, 2022
A command line interface to interact with the Hypixel api allowing the user to get stats, leaderboards, etc

HyConsole is a way to get data on players and leaderboards from the Hypixel Minecraft server from the command line. Keep in mind I have no a

1 Feb 14, 2022
A simple script that outputs the current date on the user interface/terminal.

Py-Date A simple script that outputs the current date on the user interface/terminal. How to Run Open your terminal and cd into the folder containi

Arinzechukwu Okoye 1 Jan 13, 2022
Python library and command line tool for interacting with Bugzilla

python-bugzilla This package provides two bits: bugzilla python module for talking to a Bugzilla instance over XMLRPC or REST /usr/bin/bugzilla comman

Python Bugzilla Project 112 Nov 05, 2022
Splitgraph command line client and python library

Splitgraph Overview Splitgraph is a tool for building, versioning and querying reproducible datasets. It's inspired by Docker and Git, so it feels fam

Splitgraph 313 Dec 24, 2022
Tmux Based Dropdown Dashboard For Python

sextans It's a private configuration and an ongoing experiment while I use Archlinux. A simple drop down dashboard based on tmux. It includes followin

秋葉 4 Dec 22, 2021
Tarstats - A simple Python commandline application that collects statistics about tarfiles

A simple Python commandline application that collects statistics about tarfiles.

Kristian Koehntopp 13 Feb 20, 2022
Convert markdown to HTML using the GitHub API and some additional tweaks with Python.

Convert markdown to HTML using the GitHub API and some additional tweaks with Python. Comes with full formula support and image compression.

phseiff 70 Dec 23, 2022
Tools crack instagram + fb ayok dicoba keburu premium 😁

FITUR INSTALLASI [1] pkg update && pkg upgrade [2] pkg install git [3] pkg install python [4] pkg install python2 [5] pkg install nano [6]

Jeeck 1 Dec 11, 2021
inklayers is a command line program that exports layers from an SVG file.

inklayers is a command line program that exports layers from an SVG file. It can be used to create slide shows by editing a single SVG file.

11 Mar 29, 2022
This is a tool for managing file notes through the command line

This is a tool for managing file notes through the command line

2 Jun 22, 2022
lfb (light file browser) is a terminal file browser

lfb (light file browser) is a terminal file browser. The whole program is a mess as of now. In the feature I will remove the need for external dependencies, tidy up the code, make an actual readme, a

2 Apr 09, 2022
Python3 command-line tool for the inference of Boolean rules and pathway analysis on omics data

BONITA-Python3 BONITA was originally written in Python 2 and tested with Python 2-compatible packages. This version of the packages ports BONITA to Py

1 Dec 22, 2021
A simple and easy-to-use CLI parse tool.

A simple and easy-to-use CLI parse tool.

AbsentM 1 Mar 04, 2022
A simple python script to execute a command when a YubiKey is disconnected

YubiKeyExecute A python script to execute a command when a YubiKey / YubiKeys are disconnected. ‏‏‎ ‎ How to use: 1. Download the latest release and d

6 Mar 12, 2022
CLI tool to computes CO2 emissions of HPC computations following green-algorithms.org methodology

gqueue gqueue is a CLI (command line interface) tool that computes carbon footprint of HPC computations on clusters running slurm. It follows the meth

4 Dec 10, 2021
TUIFIManager - A cross-platform terminal-based file manager

TUIFI Manager A cross-platform terminal-based file manager (and component), mean

142 Dec 26, 2022
Sink is a CLI tool that allows users to synchronize their local folders to their Google Drives. It is similar to the Git CLI and allows fast and reliable syncs with the drive.

Sink is a CLI synchronisation tool that enables a user to synchronise local system files and folders with their Google Drives. It follows a git C

Yash Thakre 16 May 29, 2022
Command-line tool to use LNURL with your LND instance

Sprint planner Sprint planner is a Python script for planning your Jira tasks based on your calendar availability. Installation Use the package manage

Djuri Baars 6 Jan 14, 2022
A simple python implementation of a reverse shell

llehs A python implementation of a reverse shell. Note for contributors The project is open for contributions and is hacktoberfest registered! llehs u

Archisman Ghosh 2 Jul 05, 2022