Pyspark sam - Analyze Big Sequence Alignments with PySpark in AWS EMR

Overview

pyspark_sam

This repo hosts my code for the article "Analyze Big Sequence Alignments with PySpark in AWS EMR".

Prerequisite

  1. Spark

  2. AWS CLI

  3. AWS Account

Run

Follow the instruction in the article. Once you have uploaded the files into your S3 bucket, run

aws emr create-cluster --name "Spark_step_pip" \
    --release-label emr-6.5.0 \
    --applications Name=Spark \
    --log-uri s3://[your_S3_bucket]/logs/ \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --bootstrap-actions Path=s3://[your_S3_bucket]/emr_bootstrap.sh \
    --use-default-roles --auto-terminate \
    --steps "Type=Spark,Name=SparkProgram,ActionOnFailure=CONTINUE,Args=[--deploy-mode,cluster,--master,yarn,--py-files,s3://[your_S3_bucket]/helper_function.py,s3://[your_S3_bucket]/spark_3mer.py,s3://[your_S3_bucket]/test.sam,[your_S3_bucket],sankey.json]" 

When the job finishes, download the sankey.json. And run this command to visualize:

python sankey.py sankey.json

Authors

  • Sixing Huang - Concept and Coding

License

This project is licensed under the MIT License - see the LICENSE file for details

Owner
Sixing Huang
A triple Neo4j certified data scientist. I am currently working at BGI in Shenzhen.
Sixing Huang
Aria/qBittorrent Telegram mirror/leech bot

This is a Telegram Bot written in Python for mirroring files on the Internet to your Google Drive or Telegram. Based on python-aria-mirror-bot Feature

28 Dec 25, 2022
A Discord bot that generates inspirational quotes & motivating messages whenever a user is sad

Encourage bot is a discord bot that allows users to randomly get Inspirational quotes messages and gives motivational encouragements whenever someone says that he's sad/depressed.

1 Nov 25, 2021
A battle-tested Django 2.1 project template with configurations for AWS, Heroku, App Engine, and Docker.

For information on how to use this project template, check out the wiki. {{ project_name }} Table of Contents Requirements Local Setup Local Developme

Lionheart Software 64 Jun 15, 2022
Projeto do segundo módulo da Resilia

@ Projeto Resilia : Módulo 2 Vamos jogar Forca ! O jogo da forca é um jogo em que o jogador tem que acertar qual é a palavra proposta, tendo como dica

Mateus Sartorio 2 Feb 24, 2022
A simple and modular Discord bot with various functionalities.

All-In-Bot for Discord A simple and modular Discord bot with various functionalities. How to use the bot? Simple! Just invite the bot to your server u

Th3J0nny 3 Jan 29, 2022
An unofficial wrapper for Engineer Man's Piston API

Pistonpy Pistonpy is an API wrapper for the Piston code execution engine by Engineer Man. Key Features Simple modern and efficient Pythonic API using

AalbatrossGuy 4 Jan 03, 2022
Spotify playlist anonymizer.

Spotify heavily personalizes auto-generated playlists like Song Radio based on the music you've listened to in the past. But sometimes you want to listen to Song Radio precisely to hear some fresh so

Jakob de Maeyer 9 Nov 27, 2022
DongTai API SDK For Python

DongTai-SDK-Python Quick start You need a config file config.json { "DongTai":{ "token":"your token", "url":"http://127.0.0.1:90"

huoxian 50 Nov 24, 2022
Instagram Bot posting earthquakes with magnitude greater than or equal to 3.5.

Instagram Bot posting earthquakes with magnitude greater than or equal to 3.5

Alican Yüksel 4 Aug 22, 2022
Plays air warning sound when detects a certain phrase or a word in a specified Telegram chat.

Tryvoha Bot Disclaimer: this is more a convenient naming, rather than a real bot. It is designed to play air warning sound when detects a certain phra

Dmytro Novikov 2 Mar 02, 2022
Hazard-Nuker - Hazard Nuker With Python

🌟 Since hazard is free, donations are really appriciate and keeps the developme

†† 9 Oct 26, 2022
Unarchive Bot for Telegram

Telegram UnArchiver Bot UnArchiveBot: 🇬🇧 Bot that allows you to extract supported archive formats in telegram. 🇹🇷 Desteklenen arşiv biçimleri tele

Hüzünlü Artemis [HuzunluArtemis] 25 May 07, 2022
BroBot's files, code and tester.

README - BroBOT Made by Rohan Chaturvedi [email protected] DISCLAIMER: Th

1 Jan 09, 2022
:lock: Python 2.7/3.X client for HashiCorp Vault

hvac HashiCorp Vault API client for Python 3.x Tested against the latest release, HEAD ref, and 3 previous minor versions (counting back from the late

hvac 1k Dec 29, 2022
JAKYM, Just Another Konsole YouTube-Music. A command line based Youtube music player written in Python with spotify and youtube playlist support

Just Another Konsole YouTube-Music Overview I wanted to create this application so that I could use the command line to play music easily. I often pla

Mayank Jha 73 Jan 01, 2023
Esse script procura qualquer, dados que você queira na wikipedia! Em breve traremos um com dados em toda a internet.

Buscador de dados simples Dependências necessárias Para você poder começar a utilizar esta ferramenta, você vai precisar da dependência "wikipedia", p

Erick Campoy 4 Feb 24, 2022
Skyscanner Python SDK

Skyscanner Python SDK Important As of May 1st, 2020, the project is deprecated and no longer maintained. The latest update in v1.1.5 includes changing

Skyscanner 118 Sep 23, 2022
Discord Bot Personnal Server - Ha-Neul

Haneul Bot, it's a discord for help me on my personnal discord, she do a lot of boring and repetitive stain. You can use on your own server if you want, you just need to find a host for the programm

Maxvyr 1 Feb 03, 2022
Discord Mafia Game Bot using nextcord

Mafia-Bot Discord Mafia Game Bot using nextcord Features Mafia Game Game Replays Installation Run the following command to install required modules: p

Nian 6 Nov 19, 2022
A modular Telegram Python bot running on python3 with a sqlalchemy, redislab, mongo database, telethon, and pyrogram.

Zeldris Robot A modular Telegram Python bot running on python3 with a sqlalchemy, redislab, mongo database, telethon, and pyrogram. How to set up/depl

IDNCoderX 42 Dec 21, 2022