A way to analyse how malware and/or goodware samples vary from each other using Shannon Entropy, Hausdorff Distance and Jaro-Winkler Distance

Overview

A way to analyse how malware and/or goodware samples vary from each other using Shannon Entropy, Hausdorff Distance and Jaro-Winkler Distance

Python version Project version Codacy Grade


UsageDownload

Introduction

ByteCog is a python script that aims to help security researchers and others a like to classify malicious software compared to other samples, depending on what the unknown file(s) is/are being tested against. This script can be extended to use a machine learning model to classify malware if you wanted to do so. ByteCog uses multiple methods of analyzing and classifying samples given to it, such as using Shannon Entropy to give a visual aspect for the researchers to look at while analyzing the code and finding possible readable code/text in a sample. ByteCog also uses Hausdorff Distance to calculate a 'raw similarity' value based on the difference in the entropy graphs of both samples, and finally ByteCog uses Jaro-Winkler Distance to calculate the 'true similarity' since the Hausdorff Distance will in most cases return a very high value if the sample is mostly the same entropy wise, so the Jaro-Winkler Distance is used to 'adjust' the simliarity value for this case of a sample.

Requirements

  • A python installation above 3.5+, which you can download from the official python website here.

Installation

Clone this repository to your local machine by following these instructions layed out here

Then proceed to download the dependencies file by running the following line in your console window

pip install -r requirements.txt

Usage

======================================================
|      ____          __         ______               |
|     / __ ) __  __ / /_ ___   / ____/____   ____    |
|    / __  |/ / / // __// _ \ / /    / __ \ / __ \   |
|   / /_/ // /_/ // /_ /  __// /___ / /_/ // /_/ /   |
|  /_____/ \__, / \__/ \___/ \____/ \____/ \__, /    |
|         /____/                          /____/     |
|                                                    |
|                    Version: 0.4                    |
|               Author: IlluminatiFish               |
======================================================

usage: bytecog.py [-h] -k KNOWN -u UNKNOWN -i IDENTIFIER -v VISUAL

Determine whether an unknown provided sample is similar to a known sample

optional arguments:
  -h, --help            show this help message and exit
  -k KNOWN, --known KNOWN
                        The file path to the known sample
  -u UNKNOWN, --unknown UNKNOWN
                        The file path to the unknown sample
  -i IDENTIFIER, --identifier IDENTIFIER
                        The antivirus identifier of the known file
  -v VISUAL, --visual VISUAL
                        If you want to show a visual representation of the file entropy

Features & Use Cases

  • Calculates sample similarity
  • Generates chunked entropy graph
  • Able to possibly detect malicious and benign software samples

Screenshots

Chunked Entropy Graph
chunk_entropy_graph

Output of ByteCog
bytecog_output

ByteCog Log File
bytecog log file

License

ByteCog - A way to analyse how malware and/or goodware samples vary from each other using Shannon Entropy, Hausdorff Distance and Jaro-Winkler Distance Copyright (c) 2021 IlluminatiFish

This program is free software; you can redistribute it and/or modify the code base under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but without ANY warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/

Acknowledgements

  • Using a modified version of @venkat-abhi's Shannon Entropy calculator to work with my project script, you can find the original one here.

  • Using the fastest method to get maximum key from a dictionary using this snippet here.

References

Entropy Wiki
Jaro-Winkler Distance Wiki
Hausdorff Distance Wiki
Shannon Calculator
Referenced Article #1
Referenced Paper #1
Referenced Paper #2
Referenced Paper #3

Owner
I am a developer that has a passion for programming, mathematics and cyber security. Currently Developer @South-Hollow
Generate malicious files using recently published homoglyphic-attack (CVE-2021-42694)

CVE-2021-42694 Generate malicious files using recently published homoglyph-attack vulnerability, which was discovered at least in C, C++, C#, Go, Pyth

js-on 17 Dec 11, 2022
XSS scanner in python

DeadXSS XSS scanner in python How to Download: Step 1: git clone https://github.com/Deadeye0x/DeadXSS.git Step 2: cd DeadXSS Step 3: python3 DeadXSS.p

2 Jul 17, 2022
Passphrase-wordlist - Shameless clone of passphrase wordlist

This repository is NOT official -- the original repository is located on GitLab

Jeff McJunkin 2 Feb 05, 2022
zip-brute Zip File Password Cracking with Using Password List

Zip brute is a python script that cracks zip that are password protected using a wordlist dictionary.

AnonyminHack5 13 Nov 03, 2022
JS Deobfuscation is a Python script that deobfuscate JS code and it's time saver for you.

JS Deobfuscation is a Python script that deobfuscate JS code and it's time saver for you. Although it may not work with high degrees of obfuscation, it's a pretty nice tool to help you even if it's j

Quatrecentquatre 3 May 01, 2022
On-demand scanning for container registries

Lacework registry scanner Install & configure Lacework CLI Integrate a Container Registry Go to Lacework Resources Containers Container Image In

Will Robinson 1 Dec 14, 2021
Salesforce Recon and Exploitation Toolkit

Salesforce Recon and Exploitation Toolkit Salesforce Recon and Exploitation Toolkit Usage python3 main.py URL References Announcement Blog - https:/

81 Dec 23, 2022
WebLogic T3/IIOP RCE ExternalizableHelper.class of coherence.jar

CVE-2020-14756 WebLogic T3/IIOP RCE ExternalizableHelper.class of coherence.jar README project base on https://github.com/Y4er/CVE-2020-2555 and weblo

Y4er 77 Dec 06, 2022
Log4j rce test environment and poc

log4jpwn log4j rce test environment See: https://www.lunasec.io/docs/blog/log4j-zero-day/ Experiments to trigger in various software products mentione

Leon Jacobs 307 Dec 24, 2022
vulnerable APIs

vulnerable-apis vulnerable APIs inspired by https://github.com/mattvaldes/vulnerable-api Setup Docker If, Out of the box docker pull kmmanoj/vulnerabl

9 Jun 01, 2022
MVT is a forensic tool to look for signs of infection in smartphone devices

Mobile Verification Toolkit Mobile Verification Toolkit (MVT) is a collection of utilities to simplify and automate the process of gathering forensic

8.3k Jan 08, 2023
A small utility to deal with malware embedded hashes.

Uchihash is a small utility that can save malware analysts the time of dealing with embedded hash values used for various things such as: Dyn

Abdallah Elshinbary 48 Dec 19, 2022
A python module for retrieving and parsing WHOIS data

pythonwhois A WHOIS retrieval and parsing library for Python. Dependencies None! All you need is the Python standard library. Instructions The manual

Sven Slootweg 384 Dec 23, 2022
The Decompressoin tool for Vxworks MINIFS

MINIFS-Decompression The Decompression tool for Vxworks MINIFS filesystem. USAGE python minifs_decompression.py [target_firmware] The example of Mercu

8 Jan 03, 2023
Spring Cloud Gateway < 3.0.7 & < 3.1.1 Code Injection (RCE)

Spring Cloud Gateway 3.0.7 & 3.1.1 Code Injection (RCE) CVE: CVE-2022-22947 CVSS: 10.0 (Vmware - https://tanzu.vmware.com/security/cve-2022-22947)

Carlos Vieira 35 Dec 28, 2022
Malware-analysis-writeups - Some of my Malware Analysis writeups

About This repo contains some malware analysis writeups i've created over time m

Itay Migdal 14 Jun 22, 2022
Attack SQL Server through gopher protocol

Attack SQL Server through gopher protocol

hack2fun 17 Nov 30, 2022
A Python Scanner for log4j

log4j-Scanner scanner for log4j cat web-urls.txt | python3 log4j.py ID.burpcollaborator.net web-urls.txt http://127.0.0.1:8080 https://www.google.c

Ihebski 5 Jun 26, 2022
Security audit Python project dependencies against security advisory databases.

Security audit Python project dependencies against security advisory databases.

52 Dec 17, 2022
Password List Maker

Red-Key Red-Key Password List Maker Version 1.1.2 Created By FireKing255 -=Features=- Create Random Password List Create Password List Create Password

FireKing255 7 Dec 26, 2021