Bigdata Simulation Library Of Dream By Sandman Books


Solution Architecture

[Figure: solution architecture diagram]

Description


In the realm of the Dreaming, its ruler, DREAM (the SANDMAN), has a particular hobby: books. His castle holds a Library that keeps, among other things, stories that were conceived by their authors but never written in our reality. Lucien, the person responsible for organizing it, needs some help. Many people dream of published books, sales markets, and stories that they would never imagine conceiving in their waking reality, and this voluminous data needs to be processed.

So as not to get lost in the information, Lucien receives all of these dreams in a non-relational database, MONGO. He needs the data organized in a relational way, that is, with each author in the proper place. To that end, he pulled our dream and saw this architecture: data arrives in MONGO, undergoes a transformation process in the STAGING area, and is loaded into MYSQL. On load, the data is split into two final tables: one in its raw state, for complete queries, and another with metrics that report the number of dreamers, the number of their books, and the total number of files. In this way the data ends up better organized, having gone through deduplication and consolidation processes.
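
The deduplication step mentioned above could look roughly like the following sketch. It assumes that two staged rows are duplicates when they share the same title, author, and identifier; the project does not document its actual key, and the names `DEDUP_KEY` and `deduplicate` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]

# Assumption: these three columns identify a dream uniquely.
# The real deduplication key used by the project is not documented.
DEDUP_KEY = ("title", "author", "identifier")


def deduplicate(rows: Iterable[Row]) -> List[Row]:
    """Keep the first occurrence of each key, preserving input order."""
    seen = set()
    unique: List[Row] = []
    for row in rows:
        key: Tuple[object, ...] = tuple(row.get(col) for col in DEDUP_KEY)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


staged = [
    {"title": "STORE VISIT", "author": "PETER RODRIGUEZ", "identifier": "1-55027-208-X"},
    {"title": "STORE VISIT", "author": "PETER RODRIGUEZ", "identifier": "1-55027-208-X"},
    {"title": "STORE VISIT", "author": "KELLY TORRES", "identifier": "1-55027-208-X"},
]
deduped = deduplicate(staged)
print(len(staged), "staged rows ->", len(deduped), "unique rows")
```

Keeping the first occurrence is only one possible policy; a consolidation step could instead merge fields from all duplicates before loading.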

Glossary of Data


Field         |Type   |Description                                 |
--------------+-------+--------------------------------------------+
_id           |long   |underscore ID                               |
kind          |string |type of book or text file                   |
title         |string |title of the book                           |
subtitle      |string |subtitle of the book                        |
author        |array  |one or more authors who dreamed of the story|
publisher     |string |publisher (if any) dreamed of by the author |
publishedDate |string |year of publication                         |
edition       |string |edition the book belongs to                 |
sample        |string |sample text from the book                   |
type          |string |ISBN code type (e.g. ISBN_10)               |
identifier    |string |ISBN identification number                  |
pageCount     |integer|number of pages                             |
capCount      |integer|number of chapters                          |
wordCount     |integer|number of words                             |
categories    |string |literary genre                              |
original_price|double |original price                              |
current_prefix|string |country currency prefix (e.g. LAK)          |
current_sufix |string |country currency name (e.g. Lao kip)        |
barcode       |string |barcode                                     |
dreaming_date |string |day the dream occurred                      |
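
As an illustration of how the glossary could be used, the sketch below transcribes it into a field-to-type mapping and reports fields that are missing or mistyped. The mapping of "long" and "integer" to `int` and of "double" to `float` is an assumption, and `GLOSSARY_TYPES` and `type_errors` are hypothetical names that are not part of the project.

```python
from typing import Any, Dict, List

# Field name -> expected Python type, transcribed from the glossary.
# Assumption: "long"/"integer" map to int, "double" to float, "array" to list.
GLOSSARY_TYPES: Dict[str, type] = {
    "_id": int, "kind": str, "title": str, "subtitle": str, "author": list,
    "publisher": str, "publishedDate": str, "edition": str, "sample": str,
    "type": str, "identifier": str, "pageCount": int, "capCount": int,
    "wordCount": int, "categories": str, "original_price": float,
    "current_prefix": str, "current_sufix": str, "barcode": str,
    "dreaming_date": str,
}


def type_errors(record: Dict[str, Any]) -> List[str]:
    """Return the names of fields that are missing or have the wrong type."""
    errors = []
    for field, expected in GLOSSARY_TYPES.items():
        value = record.get(field)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(field)
    return errors


record = {
    "_id": 670, "kind": "csv#volume", "title": "GOOD DETERMINE OF",
    "subtitle": "DON'T HAVE", "author": ["KAREN ODOM"],
    "publisher": "DARK WAR INDEPENDENT", "publishedDate": "1982",
    "edition": "9º EDITION", "sample": "...", "type": "ISBN_10",
    "identifier": "978-1-4340-7508-6", "pageCount": 479, "capCount": 11,
    "wordCount": 64333, "categories": "EPISTOLARY NOVEL",
    "original_price": 97.0, "current_prefix": "BHD",
    "current_sufix": "Bahraini dinar", "barcode": "3894426059691",
    "dreaming_date": "20211209",
}
print(type_errors(record))                          # well-typed record
print(type_errors({**record, "pageCount": "479"}))  # pageCount given as text
```

A strict check like this would also flag an integer price, so a real pipeline would probably cast values before validating them.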


Start the Project


To run the project, install the dependencies located in the "dependencies" folder, then, from the root of the project, run the shell script "run_script.sh".

Sample of Payload in MONGO


mongo

{
        "_id" : ObjectId("61b1fe6944dd42158674af31"),
        "kind" : "books#volume",
        "volumeInfo" : {
                "title" : "STORE VISIT",
                "Subtitle" : "GLASS IT HAIR MEMBER KEY ALMOST QUALITY. MARKET ALREADY AIR STILL ARTICLE. DECADE DECADE MEASURE PRESENT HUMAN MORNING. BIG BLOOD ECONOMIC FRONT SUCCESS AGO THEM. EVERY SON TROUBLE SIMPLE.",
                "author" : [
                        "PETER RODRIGUEZ",
                        "KELLY TORRES"
                ]
        },
        "publisher" : "FALL AWAY ABOUT INDEPENDENT",
        "publishedDate" : "1994",
        "edition" : "7º EDITION",
        "sample" : "...onto sport room audience. page dinner hundred. week statement should watch she even ball.\nour able tv break defense seek baby. employee last around music produce reach tv..",
        "industryIdentifiers" : [
                {
                        "type" : "ISBN_10",
                        "identifier" : "1-55027-208-X"
                },
                {
                        "type" : "ISBN_10",
                        "identifier" : "0-405-30324-6"
                }
        ],
        "pageCount" : 796,
        "wordCount" : 83331,
        "capCount" : 14,
        "categories" : [
                "NOVEL"
        ],
        "saleInfo" : {
                "original_price" : 78,
                "current_prefix" : "LAK",
                "current_sufix" : "Lao kip",
                "barcode" : "6747254889534"
        }
}
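
A nested document like the one above has to be flattened before it fits a relational table. The following is a minimal sketch of one way to do that, not the project's actual transformation: it assumes one output row per author and ISBN pair and that several categories are joined into one string, and `flatten_dream` is a hypothetical name. The `_id` and `dreaming_date` columns are left out because the source does not say how they are produced.

```python
from itertools import product
from typing import Any, Dict, List


def flatten_dream(doc: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten one nested document into flat rows (one per author/ISBN pair)."""
    volume = doc.get("volumeInfo", {})
    sale = doc.get("saleInfo", {})
    price = sale.get("original_price")
    base = {
        "kind": doc.get("kind"),
        "title": volume.get("title"),
        # The sample payload spells this key "Subtitle"; accept both spellings.
        "subtitle": volume.get("subtitle", volume.get("Subtitle")),
        "publisher": doc.get("publisher"),
        "publishedDate": doc.get("publishedDate"),
        "edition": doc.get("edition"),
        "sample": doc.get("sample"),
        "pageCount": doc.get("pageCount"),
        "wordCount": doc.get("wordCount"),
        "capCount": doc.get("capCount"),
        # Assumption: multiple categories are joined into a single string.
        "categories": ", ".join(doc.get("categories", [])),
        "original_price": float(price) if price is not None else None,
        "current_prefix": sale.get("current_prefix"),
        "current_sufix": sale.get("current_sufix"),
        "barcode": sale.get("barcode"),
    }
    # Assumption: arrays are exploded, so every author is paired with every ISBN.
    authors = volume.get("author") or [None]
    identifiers = doc.get("industryIdentifiers") or [{}]
    return [
        {**base, "author": author,
         "type": ident.get("type"), "identifier": ident.get("identifier")}
        for author, ident in product(authors, identifiers)
    ]


doc = {
    "kind": "books#volume",
    "volumeInfo": {"title": "STORE VISIT", "Subtitle": "GLASS IT HAIR",
                   "author": ["PETER RODRIGUEZ", "KELLY TORRES"]},
    "publisher": "FALL AWAY ABOUT INDEPENDENT",
    "industryIdentifiers": [
        {"type": "ISBN_10", "identifier": "1-55027-208-X"},
        {"type": "ISBN_10", "identifier": "0-405-30324-6"},
    ],
    "categories": ["NOVEL"],
    "saleInfo": {"original_price": 78, "current_prefix": "LAK"},
}
rows = flatten_dream(doc)
for row in rows:
    print(row["author"], row["identifier"])
```

Pairing every author with every ISBN multiplies rows quickly, so a real pipeline might prefer separate author and identifier tables.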

Sample of Payload in MYSQL


library

_id  |kind        |title                                                           |subtitle                                                                                                                                                                                                                                                       |author                                       |publisher                               |publishedDate|edition   |sample                                                                                                                                                                                                     |type   |identifier       |pageCount|wordCount|capCount|categories               |original_price|current_prefix|current_sufix              |barcode      |dreaming_date|
-----+------------+----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------+----------------------------------------+-------------+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+-----------------+---------+---------+--------+-------------------------+--------------+--------------+---------------------------+-------------+-------------+
  670|csv#volume  |GOOD DETERMINE OF                                               |DON'T HAVE                                                                                                                                                                                                                                                     |KAREN ODOM                                   |DARK WAR INDEPENDENT                    |1982         |9º EDITION|...involve star apply later including truth. next while nor worry staff economic.¶condition region write college. return half offer. popular could direction above fish..                                  |ISBN_10|978-1-4340-7508-6|      479|    64333|      11|EPISTOLARY NOVEL         |          97.0|BHD           |Bahraini dinar             |3894426059691|     20211209|

resume

metric          |value|
----------------+-----+
Total of Dreamns|71395|
Total of Books  |59154|
Total of Data   |78000|
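
The three metrics could be derived from the flat rows along the following lines. This is a sketch under stated assumptions: dreamers are counted as distinct authors, books as distinct titles, and data as the total number of rows. The source does not define the metrics this precisely, and `summarize` is a hypothetical name.

```python
from typing import Dict, Iterable


def summarize(rows: Iterable[Dict[str, object]]) -> Dict[str, int]:
    """Count distinct authors, distinct titles, and total rows."""
    authors = set()
    titles = set()
    total = 0
    for row in rows:
        total += 1
        # Skip missing values so they are not counted as a dreamer or a book.
        if row.get("author") is not None:
            authors.add(row["author"])
        if row.get("title") is not None:
            titles.add(row["title"])
    # Assumption: these keys mirror the three rows of the metrics table.
    return {"dreamers": len(authors), "books": len(titles), "data": total}


library_rows = [
    {"author": "PETER RODRIGUEZ", "title": "STORE VISIT"},
    {"author": "KELLY TORRES", "title": "STORE VISIT"},
    {"author": "KAREN ODOM", "title": "GOOD DETERMINE OF"},
    {"author": "KAREN ODOM", "title": "GOOD DETERMINE OF"},
]
metrics = summarize(library_rows)
for name, value in metrics.items():
    print(f"{name:<10}|{value}")
```
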

Owner

Maycon Cypriano
DATA ENGINEER | DATA SCIENCE | DATA PYTHON | DATA DRIVEN |