Credit Card Fraud Detection

Overview

Credit Card Fraud Detection

image


For this project, I used the datasets from the kaggle competition called IEEE-CIS Fraud Detection. The competition aims to improve fraud prevention system by building fraud detection models based on Vesta Corporation's real-world e-commerce transactional data, which contains information from device type to product features. My personal goal for this project is to not only explore the data and build models, but to also build an API server with retrainable model. To achieve this goal, I used fastapi, lightgbm and ray tune.

Also, I decided to develop this project to be the same as how data-related projects are developed in real-world scenarios, wherein the end goal of development is a project that is feasible for production. Therefore, I have put efforts on creating:

  1. Exploratory Data Analysis (EDA) in the notebooks/ folder;
  2. An API Server inside the api/ folder;
  3. Files for deployment such as Dockerfile and docker-compose.yml;
  4. Documentations in the docs/ folder; and
  5. Some necessary scripts in scripts/ folder.

Services

I have two services: app and script.

App

image The app service is a machine learning API that is open on port 8000. I used fastapi for the API server, so you can check it on http://localhost:8000/docs after you run the app service.

Scripts

image The script sends the request for the predictions on new sets of data, such as the kaggle testing data. After the script get all the responses, files will be written on /tmp/submission.csv (on host and container), but this part can take a lot of time. It is suggested to use docker logs -f lightgbm-project-demo_script_1 to check the progress of the process.

How to run this demo

1. Install the requirements

  • docker
  • docker-compose
  • make

2. Download the datasets

Option 1.

Setup kaggle API and use

make init-data

Option 2.

  1. Create a data folder: i.e.
mkdir data
  1. Download the data from kaggle IEEE-CIS Fraud Detection.

  2. Put the ieee-fraud-detection.zip inside the data/ folder.

  3. Unzip ieee-fraud-detection.zip.

3. Build the image

make build

4. Start the services

Start both the two services

docker-compose up

or only start the app service using

docker-compose up app

image

Here are some documentations

How to set up the working environment for this project

API example

Note

If you want to change the hyperparameters search space, you can go to config.py. Or you even want to use other framework the build the model, I think this demo is detail enough as a reference for your project.

About the hyperparameter search algorithm, I am using the random search for this demo but if you want to try other searching algorithm, you can change train.py.

I have another project use Tensorflow as the back bone model. Take a look about the project lightgbm-project-demo

Owner
RayWu
RayWu
Mines all the moneys and stuff and things.

NFT Miner NFT Miner - Version 1.1.0 - Quick Fix Since the whole NFT thing started booming on Twitter it's been hard not to see one of those ugly ass m

8w8 1 Dec 13, 2021
Information about a signed UEFI Shell that can be used when Secure Boot is enabled.

SignedUEFIShell During our research of the BootHole vulnerability last year, we tried to find as many signed bootloaders as we could. We searched all

Mickey 61 Jan 03, 2023
Dyson Sphere Program Blueprint Toolkit

dspbptk This is dspbptk, the Dyson Sphere Program Blueprint toolkit. Dyson Sphere Program is an amazing factory-building game by the incredibly talent

Johannes Bauer 22 Nov 15, 2022
Cloud-native SIEM for intelligent security analytics for your entire enterprise.

Microsoft Sentinel Welcome to the Microsoft Sentinel repository! This repository contains out of the box detections, exploration queries, hunting quer

Microsoft Azure 2.9k Jan 02, 2023
Free Vocabulary Trainer - not only for German, but any language

Bilderraten DOWNLOAD THE EXE FILE HERE! What can you do with it? Vocabulary Trainer for any language Use your own vocabulary list No coding required!

Hans Alemão 4 Jan 02, 2023
Algo próximo do ARP

ArpPY Algo parecido com o ARP-Scan. Dependencias O script necessita no mínimo ter o Python versão 3.x instalado e ter o sockets instalado. Executando

Feh's 3 Jan 18, 2022
Cairo-bloom - A naive bloom filter implementation in Cairo

🥀 cairo-bloom A naive bloom filter implementation in Cairo. A Bloom filter is a

Sam Barnes 37 Oct 01, 2022
Organize seu linux - organize your linux

OrganizeLinux Organize seu linux - organize your linux Organize seu linux Uma forma rápida de separar arquivos dispersos em pastas. formatos a serem c

Marcus Vinícius Ribeiro Andrade 1 Nov 30, 2021
A Red Team tool for exfiltrating sensitive data from Jira tickets.

Jir-thief This Module will connect to Jira's API using an access token, export to a word .doc, and download the Jira issues that the target has access

Antonio Piazza 82 Dec 12, 2022
Projeto job insights - Projeto avaliativo da Trybe do Bloco 32: Introdução à Python

Termos e acordos Ao iniciar este projeto, você concorda com as diretrizes do Código de Ética e Conduta e do Manual da Pessoa Estudante da Trybe. Boas

Lucas Muffato 1 Dec 09, 2021
Pattern Matching for Python 3.7+ in a simple, yet powerful, extensible manner.

Awesome Pattern Matching (apm) for Python pip install awesome-pattern-matching Simple Powerful Extensible Composable Functional Python 3.7+, PyPy3.7+

Julian Fleischer 97 Nov 03, 2022
A tool for RaceRoom Racing Experience which shows you launch data

R3E Launch Tool A tool for RaceRoom Racing Experience which shows you launch data. Usage Run the tool, change the Stop Speed to whatever you want, and

Yuval Rosen 2 Feb 01, 2022
Alerts for Western Australian Covid-19 exposure locations via email and Slack

WA Covid Mailer Sends alerts from Healthy WA's Covid19 Exposure Locations via email and slack. Setup Edit the configuration items in wacovidmailer.py

13 Mar 29, 2022
A subleq VM/interpreter created by me for no reason

What is Dumbleq? Dumbleq is a dumb Subleq VM/interpreter implementation created by me for absolutely no reason at all. What is Subleq? If you haven't

Phu Minh 2 Nov 13, 2022
NeoInterface - Neo4j made easy for Python programmers!

Neointerface - Neo4j made easy for Python programmers! A Python interface to use the Neo4j graph database, and simplify its use. class NeoInterface: C

15 Dec 15, 2022
YunoHost is an operating system aiming to simplify as much as possible the administration of a server.

YunoHost is an operating system aiming to simplify as much as possible the administration of a server. This repository corresponds to the core code, written mostly in Python and Bash.

YunoHost 1.5k Jan 09, 2023
Mengzhan (John) code for Closed Loop Control system of Sharp Wave Ripples in Hippocampus CA3 region

ClosedLoopControl_Yu Mengzhan (John) code for Closed Loop Control system of Sharp Wave Ripples in Hippocampus CA3 region Creating Python Virtual Envir

Mengzhan (John) Liufu 1 Jan 22, 2022
This is an implementation of PEP 557, Data Classes.

This is an implementation of PEP 557, Data Classes. It is a backport for Python 3.6. Because dataclasses will be included in Python 3.7, any discussio

Eric V. Smith 561 Dec 06, 2022
addons to the turtle package that help you drew stuff more quickly

TurtlePlus addons to the turtle package that help you drew stuff more quickly --------------

1 Nov 18, 2021
Sudo type me a payload

payloadSecretary Sudo type me a payload Have you ever found yourself having to perform a test, and a client has provided you with a VM inside a VDI in

7 Jul 21, 2022