iAWE is a wonderful dataset for those of us who work on Non-Intrusive Load Monitoring (NILM) algorithms.


Description

iAWE is a wonderful dataset for those of us who work on Non-Intrusive Load Monitoring (NILM) algorithms. You can find its main page and description via this link. If you are familiar with the NILM-TK API, you probably know that you can work with the iAWE hdf5 data file in NILM-TK. However, I ran into problems that convinced me not to use NILM-TK and the iAWE hdf5 data file. Instead, I decided to use the iAWE appliance consumption CSV files and preprocess them myself. So if you also have problems with the NILM-TK API and the iAWE hdf5 data file, this code may help you prepare the consumption data of 11 meters for your NILM algorithm.

Installation

  • First, download the iAWE dataset using this link (also available on iAWE page!).
  • Download the electricity.tar.gz file.

  • Download the repo and all its folders.
  • Unzip electricity.tar.gz and copy all 12 CSV files (plus the labels file) into the electricity folder of the downloaded repo.
  • Now everything is ready for you to start the data preprocessing with the main.py file. But before running the code, let me show you what kind of problems we had with the original iAWE hdf5 file.

What problems did we solve?

Well, to be honest, the NILM-TK documentation is not very clear! If you try to use the hdf5 data file of one of the datasets that work with NILM-TK, you will soon agree. Sometimes you find similar questions on Stack Overflow, but when you try the answers they simply don't work, due to some updates in NILM-TK (undocumented, maybe!?). So having full control over the data was my main incentive to redo the data preprocessing myself. You will see 12 CSV files in your downloaded files. They belong to:

  • main meter (1)
  • main meter (2)
  • fridge
  • air conditioner (1)
  • air conditioner (2)
  • washing machine
  • laptop
  • iron
  • kitchen outlets
  • television
  • water filter
  • water motor

The publisher of the iAWE dataset has recommended ignoring the water motor CSV file as it is not accurate (so did we!). Each CSV file consists of timestamp, W, VAR, VA, f, V, PF and A columns. The timestamp column can be read and converted to a real date and time with Python libraries. To reduce the size of the final data files, the publisher of the dataset recorded timestamps only for actual samples, which means there are no rows for the periods when the appliances are not consuming power. On the other hand, the measurements of the different appliances do not start at the same time, so the length, start and end of most CSV files are different. When you plot the data in NILM-TK it looks fine, because NILM-TK reads the timestamps and ignores the NA time steps. However, when you want to feed this data into your algorithm, this becomes a problem that needs data preprocessing. To better illustrate the problem with the raw data in the iAWE dataset, I've plotted W (active power) of the air conditioner, which is CSV file number 4.
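Reading one of these files and converting its timestamp column could look like the minimal sketch below. It is not the code from main.py: the function name `load_meter_csv` is hypothetical, the sample rows are made up, and it assumes the timestamps are Unix epoch seconds and that the files carry a header row with the column names listed above.

```python
import io

import pandas as pd


def load_meter_csv(source):
    """Read one meter CSV and index it by a real date and time."""
    df = pd.read_csv(source)
    # Assumption: the timestamp column holds Unix epoch seconds.
    df["timestamp"] = pd.to_datetime(df["timestamp"], unit="s")
    return df.set_index("timestamp").sort_index()


# Made-up rows standing in for one of the CSV files in the electricity folder.
sample = io.StringIO(
    "timestamp,W,VAR,VA,f,V,PF,A\n"
    "1370000000,120.5,10.1,121.0,50.0,230.1,0.99,0.52\n"
    "1370000001,121.0,10.0,121.4,50.0,230.0,0.99,0.53\n"
    "1370000060,0.0,0.0,0.0,50.0,229.8,0.0,0.0\n"
)
meter = load_meter_csv(sample)
print(meter["W"])
```

With a real file you would pass its path instead of the in-memory sample. Note the jump between the second and third rows: nothing was recorded in between, which is exactly the kind of gap discussed here.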


[Figure: W (active power) of the air conditioner (CSV file number 4), plotted against timestamps]

As you can see, when you plot it in Python, each run of NA timestamps is drawn as a straight line between the last available sample and the next available one. This is neither human readable (to some extent!) nor NILM algorithm readable. In fact, your NILM algorithm will be fed only the series of these values, because it has nothing to do with timestamps! This is what a NILM algorithm sees as the AC power consumption:


[Figure: AC power consumption as the NILM algorithm sees it, without timestamps]
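To see where such gaps sit in a series, one option is to look at the difference between consecutive timestamps. The sketch below is only an illustration: `find_gaps` is a hypothetical helper, the data is made up, and the expected step of one second between samples is an assumption.

```python
import pandas as pd


def find_gaps(index, expected_step="1s"):
    """Return (last_sample, next_sample) pairs that are further apart than expected_step."""
    step = pd.Timedelta(expected_step)
    deltas = index.to_series().diff()  # time since the previous sample
    gap_ends = deltas[deltas > step]
    return [(end - delta, end) for end, delta in gap_ends.items()]


# Made-up series: samples at seconds 0, 1, 2, then nothing until second 60.
idx = pd.to_datetime([0, 1, 2, 60, 61], unit="s")
power = pd.Series([100.0, 101.0, 99.0, 100.0, 102.0], index=idx)

gaps = find_gaps(power.index)
print(gaps)              # where the plot would draw a straight line
print(power.to_numpy())  # what the algorithm receives: values only, gap invisible
```

The second print shows why the raw values are misleading: once the index is dropped, the samples before and after the gap look like neighbours.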

Now, to make it both human readable and NILM algorithm readable, I did the following (I've commented the code so you can see what is happening in every part of it):

  • Loaded all CSV files into a dictionary of dataframes, in CSV file order
  • Found the lowest and highest timestamps in order to know the length of the measurement period (the files have different lengths!)
  • Created a big dataframe of zeros indexed from the lowest timestamp to the highest one
  • Used the update method on dataframes to transfer the values of each dataframe into a big dataframe of zeros (now all of them have the same length)
  • Put all the dataframes into a dictionary of dataframes
  • Trimmed all the dataframes to the efficient period of sampling (because now we know which part of the sampling is useless)
  • Removed NaN values
  • Dropped unwanted columns
  • Filled NA values with the last available value in the dataframes
  • Saved all the dataframes as CSV files in the prepared data folder
  • Done!
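The steps above can be sketched roughly as follows. This is a simplified illustration, not the actual main.py: the names `align_meters`, `freq` and `keep` are hypothetical, the one-second grid is an assumption, and "efficient period" is interpreted here as the window in which every meter is recording. It also assumes the sample timestamps fall exactly on the grid.

```python
import pandas as pd


def align_meters(frames, freq="1s", keep=None):
    """Put every meter on one shared, gap-free time index (illustrative)."""
    # Lowest and highest timestamps over all meters.
    start = min(df.index.min() for df in frames.values())
    end = max(df.index.max() for df in frames.values())
    full_index = pd.date_range(start, end, freq=freq)

    # Assumed meaning of "efficient period": the window in which every
    # meter has already started and none has stopped yet.
    common_start = max(df.index.min() for df in frames.values())
    common_end = min(df.index.max() for df in frames.values())

    aligned = {}
    for name, df in frames.items():
        df = df[~df.index.duplicated(keep="first")]  # update() needs unique labels
        canvas = pd.DataFrame(0.0, index=full_index, columns=df.columns)
        canvas.update(df)  # copy non-NaN values where index labels match
        canvas = canvas.loc[common_start:common_end]
        if keep is not None:
            canvas = canvas[list(keep)]  # drop unwanted columns
        # A zero-filled canvas has no NaN left; ffill only mirrors the listed step.
        aligned[name] = canvas.ffill()
    return aligned


def toy(seconds, watts):
    """Build a made-up meter frame with W and V columns."""
    idx = pd.to_datetime(seconds, unit="s")
    return pd.DataFrame({"W": watts, "V": [230.0] * len(seconds)}, index=idx)


frames = {
    "fridge": toy([0, 1, 2, 6, 7], [90.0, 91.0, 92.0, 95.0, 96.0]),
    "iron": toy([2, 3, 7, 8, 9], [1000.0, 1001.0, 990.0, 0.0, 0.0]),
}
aligned = align_meters(frames, keep=["W"])
for name, df in aligned.items():
    print(name, len(df))
    # df.to_csv(f"{name}.csv") would save each prepared frame
```

Filling the gaps with zeros matches the idea that nothing was recorded while an appliance was not consuming power. If a gap instead means a sensor outage, a different fill strategy may be more appropriate.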


[Figure: AC WO]

Conclusion

Basically, what we have after running this code is 11 CSV files of W, VAR, VA, f, V, PF and A for 11 different meters. The prepared CSV files are all of the same length, without NaN or NA values, and are ready to be fed to any NILM algorithm. Although I've made these changes to the iAWE dataset myself, I'm sure the publishers of this dataset have a much better way to get such an output via NILM-TK. However, due to the lack of documentation or changes in their code, I preferred to do this data preprocessing myself. Hope you enjoy it!
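If you want to confirm these two properties on your own output, a quick check could look like the sketch below. The helper `check_prepared` is hypothetical and the frames are made up; with real data you would first load the prepared CSV files into a dictionary of dataframes.

```python
import pandas as pd


def check_prepared(frames):
    """Return True if all frames have the same length and no missing values."""
    lengths = {len(df) for df in frames.values()}
    same_length = len(lengths) == 1
    no_missing = all(not df.isna().any().any() for df in frames.values())
    return same_length and no_missing


# Made-up frames standing in for the prepared CSV files.
good = {
    "fridge": pd.DataFrame({"W": [90.0, 0.0, 95.0]}),
    "iron": pd.DataFrame({"W": [0.0, 1000.0, 990.0]}),
}
bad = {
    "fridge": pd.DataFrame({"W": [90.0, float("nan"), 95.0]}),
    "iron": pd.DataFrame({"W": [0.0, 1000.0]}),
}
print(check_prepared(good))
print(check_prepared(bad))
```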

Owner
Mozaffar Etezadifar
NILM and RL researcher @ Polytechnique Montreal