Uses MIT/MEDSL, New York Times, and US Census datasources to analyze per-county COVID-19 deaths.

Last update: Dec 22, 2021

Covid County

Executive summary

Setup

Install Miniconda, then run the following on the command line:

conda create -n covid-county
conda activate covid-county
conda install pandas ipython matplotlib tabulate

(Let me know if you want pure-Python no-Conda instructions via venv.)

2020 US presidential election

I've already downloaded countypres_2000-2020.csv from https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ but you can download it again to ensure I haven't committed bad data.

The 2020 data appears to be missing counts for the District of Columbia (FIPS 11001), so its party split is taken from the 2016 election instead.
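A minimal sketch of that fallback, in plain Python so it runs without the conda environment. It is not taken from main.py: the row layout (year, county FIPS, party, votes), the use of a two-party share as the "party split", and every vote count below are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical rows shaped like (year, county_fips, party, votes).
# The vote counts are invented; they are NOT real election results.
ROWS = [
    (2016, "11001", "DEMOCRAT", 900),
    (2016, "11001", "REPUBLICAN", 100),
    (2020, "01001", "DEMOCRAT", 300),
    (2020, "01001", "REPUBLICAN", 700),
]


def dem_share_by_county(rows, year):
    """Return {fips: Democratic share of the two-party vote} for one year."""
    votes = defaultdict(lambda: defaultdict(int))
    for row_year, fips, party, count in rows:
        if row_year == year and party in ("DEMOCRAT", "REPUBLICAN"):
            votes[fips][party] += count
    shares = {}
    for fips, by_party in votes.items():
        total = by_party["DEMOCRAT"] + by_party["REPUBLICAN"]
        if total > 0:
            shares[fips] = by_party["DEMOCRAT"] / total
    return shares


def shares_with_fallback(rows, year=2020, fallback_year=2016):
    """Start from the fallback year, then overwrite with the target year.

    Counties that have no rows for the target year keep their fallback value.
    """
    shares = dem_share_by_county(rows, fallback_year)
    shares.update(dem_share_by_county(rows, year))
    return shares


SHARES = shares_with_fallback(ROWS)
```

Here FIPS 11001 has no 2020 rows, so its share comes from the 2016 rows, while 01001 uses its 2020 rows.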

Census

From https://www.census.gov/programs-surveys/popest/technical-documentation/research/evaluation-estimates/2020-evaluation-estimates/2010s-counties-total.html I downloaded co-est2020.csv from the "Annual Resident Population Estimates for States and Counties: April 1, 2010 to July 1, 2019; April 1, 2020; and July 1, 2020 (CO-EST2020)" link. It's committed in this repo but you can download it yourself too.

Covid

Install Git and run this in this directory: git clone --depth 1 https://github.com/nytimes/covid-19-data.git (it might take a while)

Note that the Covid data combines the five boroughs of New York City into a single "county". To account for this, the 2020 presidential votes from all five boroughs are merged into a single county as well; since we can't split the Covid deaths into individual boroughs, this is the best we can do. The fix follows the recommendation in upstream issue 105.
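The borough merge could look roughly like the sketch below. The five FIPS codes are the standard ones for Bronx, Kings, New York, Queens, and Richmond counties; the `NYC_KEY` label, the dictionary layout, and the vote counts are hypothetical and not taken from main.py.

```python
# FIPS codes of the five NYC boroughs (Bronx, Kings, New York, Queens, Richmond).
NYC_BOROUGH_FIPS = {"36005", "36047", "36061", "36081", "36085"}

# Hypothetical key for the merged pseudo-county; the real script may use another.
NYC_KEY = "NYC"


def merge_nyc(votes_by_fips):
    """Sum the per-party votes of the five boroughs into one NYC entry.

    votes_by_fips maps a county FIPS string to a {party: votes} dict.
    All other counties are copied through unchanged.
    """
    merged = {}
    for fips, by_party in votes_by_fips.items():
        key = NYC_KEY if fips in NYC_BOROUGH_FIPS else fips
        bucket = merged.setdefault(key, {})
        for party, count in by_party.items():
            bucket[party] = bucket.get(party, 0) + count
    return merged


# Invented vote counts, for illustration only.
EXAMPLE = {
    "36005": {"DEMOCRAT": 10, "REPUBLICAN": 2},
    "36047": {"DEMOCRAT": 20, "REPUBLICAN": 5},
    "36001": {"DEMOCRAT": 7, "REPUBLICAN": 6},  # Albany County, left untouched
}
MERGED = merge_nyc(EXAMPLE)
```

After the merge, the vote table has one NYC row that can be matched against the single NYC row in the Covid data.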

Run

python main.py

(Takes ~45 seconds on my 2015-vintage laptop.)

More results

party bin	total Covid-19 deaths
Rep 80+%	38284
Rep 60–79%	211416
Rep 50–59%	123587
Dem 50–59%	196084
Dem 60–79%	210070
Dem 80+%	18331
unknown	5243
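For reference, here is one way a county could be assigned to the bins in the table above. This is a sketch under assumptions, not the logic in main.py: it assumes bins are based on the winning party's share of the two-party vote, that an exact 50/50 tie goes to "Dem", and that counties with no matching vote data land in "unknown".

```python
def party_bin(dem_share):
    """Map a Democratic two-party vote share (0..1, or None) to a bin label."""
    if dem_share is None:
        # Assumption: no matching election data for this county.
        return "unknown"
    if dem_share >= 0.5:  # assumption: exact ties count as Dem
        winner, share = "Dem", dem_share
    else:
        winner, share = "Rep", 1 - dem_share
    if share >= 0.8:
        return f"{winner} 80+%"
    if share >= 0.6:
        return f"{winner} 60–79%"
    return f"{winner} 50–59%"


LABELS = [party_bin(s) for s in (0.85, 0.55, 0.3, 0.1, None)]
```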

Simply by party:

Dem: 424485
Rep: 373287
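As a sanity check, the per-party totals are just the sums of the three bins for each party in the table above. The snippet below copies the published numbers and re-derives the totals; the function and variable names are only illustrative.

```python
# Deaths per party bin, copied from the table above.
DEATHS_BY_BIN = {
    "Rep 80+%": 38284,
    "Rep 60–79%": 211416,
    "Rep 50–59%": 123587,
    "Dem 50–59%": 196084,
    "Dem 60–79%": 210070,
    "Dem 80+%": 18331,
    "unknown": 5243,
}


def total_by_party(deaths_by_bin):
    """Sum deaths over all bins whose label starts with 'Dem' or 'Rep'."""
    totals = {}
    for label, deaths in deaths_by_bin.items():
        party = label.split()[0]
        if party in ("Dem", "Rep"):
            totals[party] = totals.get(party, 0) + deaths
    return totals


TOTALS = total_by_party(DEATHS_BY_BIN)
GRAND_TOTAL = sum(DEATHS_BY_BIN.values())  # includes the "unknown" bin
```

The "unknown" bin belongs to neither party, so the two party totals add up to 5243 fewer deaths than the grand total.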

Owner

Ahmed Fasih
