This repository has datasets containing information of Uber pickups in NYC from April 2014 to September 2014 and January to June 2015. data Analysis , virtualization and some insights are gathered here

Overview

uber-pickups-analysis

Data Source: https://www.kaggle.com/fivethirtyeight/uber-pickups-in-new-york-city

Information about data set

The dataset contains, roughly, TWO groups of files: ● Uber trip data from 2014 (April - September), separated by month, with detailed location information. ● Uber trip data from 2015 (January - June), with less fine-grained location information.

Uber trip data from 2014 There are six files of raw data on Uber pickups in New York City from April to September 2014. The files are separated by month and each has the following columns: ● Date/Time : The date and time of the Uber pickup ● Lat : The latitude of the Uber pickup ● Lon : The longitude of the Uber pickup ● Base : The TLC base company code affiliated with the Uber pickup. These files are named:

● uber-raw-data-apr14.csv ● uber-raw-data-aug14.csv ● uber-raw-data-jul14.csv ● uber-raw-data-jun14.csv ● uber-raw-data-may14.csv ● uber-raw-data-sep14.csv

Uber trip data from 2015

Also included is the file uber-raw-data-janjune-15.csv This file has the following columns: ● Dispatching_base_num : The TLC base company code of the base that dispatched the Uber. ● Pickup_date : The date and time of the Uber pickup ● Affiliated_base_num : The TLC base company code affiliated with the Uber pickup. ● locationID : The pickup location ID affiliated with the Uber pickup These files are named:

  • uber-raw-data-janjune-15.csv

motive of Project

To analyze the data of the customer rides and visualize the data to find insights that can help improve business. Data analysis and visualization is an important part of data science. They are used to gather insights from the data and with visualization you can get quick information from the data.

How to Run the Project

In order to run the project just download the data from above mentioned source then run any file.

Prerequisites

You need to have installed following softwares and libraries in your machine before running this project.

Python 3 Anaconda: It will install ipython notebook and most of the libraries which are needed like sklearn, pandas, seaborn, matplotlib, numpy, scipy.

Installing

Python 3: https://www.python.org/downloads/ Anaconda: https://www.anaconda.com/download/

Authors

DEVA DEEKSHITH and kilari jaswanth(https://github.com/Kilarijaswanth)- combined work

Owner
B DEVA DEEKSHITH
B DEVA DEEKSHITH
Model Validation Toolkit is a collection of tools to assist with validating machine learning models prior to deploying them to production and monitoring them after deployment to production.

Model Validation Toolkit is a collection of tools to assist with validating machine learning models prior to deploying them to production and monitoring them after deployment to production.

FINRA 25 Dec 28, 2022
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learn

Vowpal Wabbit 8.1k Dec 30, 2022
Cool Python features for machine learning that I used to be too afraid to use. Will be updated as I have more time / learn more.

python-is-cool A gentle guide to the Python features that I didn't know existed or was too afraid to use. This will be updated as I learn more and bec

Chip Huyen 3.3k Jan 05, 2023
Machine learning template for projects based on sklearn library.

Machine learning template for projects based on sklearn library.

Janez Lapajne 17 Oct 28, 2022
Accelerating model creation and evaluation.

EmeraldML A machine learning library for streamlining the process of (1) cleaning and splitting data, (2) training, optimizing, and testing various mo

Yusuf 0 Dec 06, 2021
Stats, linear algebra and einops for xarray

xarray-einstats Stats, linear algebra and einops for xarray ⚠️ Caution: This project is still in a very early development stage Installation To instal

ArviZ 30 Dec 28, 2022
monolish: MONOlithic Liner equation Solvers for Highly-parallel architecture

monolish is a linear equation solver library that monolithically fuses variable data type, matrix structures, matrix data format, vendor specific data transfer APIs, and vendor specific numerical alg

RICOS Co. Ltd. 179 Dec 21, 2022
MooGBT is a library for Multi-objective optimization in Gradient Boosted Trees.

MooGBT is a library for Multi-objective optimization in Gradient Boosted Trees. MooGBT optimizes for multiple objectives by defining constraints on sub-objective(s) along with a primary objective. Th

Swiggy 66 Dec 06, 2022
Iterative stochastic gradient descent (SGD) linear regressor with regularization

SGD-Linear-Regressor Iterative stochastic gradient descent (SGD) linear regressor with regularization Dataset: Kaggle “Graduate Admission 2” https://w

Zechen Ma 1 Oct 29, 2021
Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application

Intel(R) Extension for Scikit-learn* Installation | Documentation | Examples | Support | FAQ With Intel(R) Extension for Scikit-learn you can accelera

Intel Corporation 858 Dec 25, 2022
Implemented four supervised learning Machine Learning algorithms

Implemented four supervised learning Machine Learning algorithms from an algorithmic family called Classification and Regression Trees (CARTs), details see README_Report.

Teng (Elijah) Xue 0 Jan 31, 2022
Distributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray

A unified Data Analytics and AI platform for distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray What is Analytics Zoo? Analytics Zo

2.5k Dec 28, 2022
Scikit learn library models to account for data and concept drift.

liquid_scikit_learn Scikit learn library models to account for data and concept drift. This python library focuses on solving data drift and concept d

7 Nov 18, 2021
Evidently helps analyze machine learning models during validation or production monitoring

Evidently helps analyze machine learning models during validation or production monitoring. The tool generates interactive visual reports and JSON profiles from pandas DataFrame or csv files. Current

Evidently AI 3.1k Jan 07, 2023
机器学习检测webshell

ai-webshell-detect 机器学习检测webshell,利用textcnn+简单二分类网络,基于keras,花了七天 检测原理: 从文件熵 文件长度 文件语句提取出特征,然后文件熵与长度送入二分类网络,文件语句送入textcnn 项目原理,介绍,怎么做出来的

Huoji's 56 Dec 14, 2022
Examples and code for the Practical Machine Learning workshop series

Practical Machine Learning Workshop Series Practical Machine Learning for Quantitative Finance Post conference workshop at the WBS Spring Conference D

CompatibL 21 Jun 25, 2022
A Microsoft Azure Web App project named Covid 19 Predictor using Machine learning Model

A Microsoft Azure Web App project named Covid 19 Predictor using Machine learning Model (Random Forest Classifier Model ) that helps the user to identify whether someone is showing positive Covid sym

Priyansh Sharma 2 Oct 06, 2022
Reproducibility and Replicability of Web Measurement Studies

Reproducibility and Replicability of Web Measurement Studies This repository holds additional material to the paper "Reproducibility and Replicability

6 Dec 31, 2022
A Tools that help Data Scientists and ML engineers train and deploy ML models.

Domino Research This repo contains projects under active development by the Domino R&D team. We build tools that help Data Scientists and ML engineers

Domino Data Lab 73 Oct 17, 2022
Official code for HH-VAEM

HH-VAEM This repository contains the official Pytorch implementation of the Hierarchical Hamiltonian VAE for Mixed-type Data (HH-VAEM) model and the s

Ignacio Peis 8 Nov 30, 2022