Various capabilities for static malware analysis.

Overview

Malchive

The malchive serves as a compendium for a variety of capabilities mainly pertaining to malware analysis, such as scripts supporting day to day binary analysis and decoder modules for various components of malicious code.

The goals behind the 'malchive' are to:

  • Allow teams to centralize efforts made in this realm and enforce communication and continuity
  • Have a shared corpus of tools for people to build on
  • Enforce clean coding practices
  • Allow others to interface with project members to develop their own capabilities
  • Promote a positive feedback loop between Threat Intel and Reverse Engineering staff
  • Make static file analysis more accessible
  • Serve as a vehicle to communicate the unique opportunity space identified via deep dive analysis

Documentation

At its core, malchive is a bunch of standalone scripts organized in a manner that the authors hope promotes the project's goals.

To view the documentation associated with this project, checkout the wiki page!

Scripts within the malchive are split up into the following core categories:

  • Utilities - These scripts may be run standalone to assist with static binary analysis or as modules supporting a broader program. Utilities always have a standalone component.
  • Helpers - These modules primarily serve to assist components in one or more of the other categories. They generally do not have a stand-alone component and instead serve the intents of those that do.
  • Binary Decoders - The purpose of scripts in this category is to retrieve, decrypt, and return embedded data (typically inside malware).
  • Active Discovery - Standalone scripts designed to emulate a small portion of a malware family's protocol for the purposes of discovering active controllers.

Installation

The malchive is a packaged distribution that is easily installed and will automatically create console stand-alone scripts.

Steps

You will need to install some dependencies for some of the required Python modules to function correctly.

  • First do a source install of YARA and make sure you compile using --dotnet
  • Next source install the YARA Python package.
  • Ensure you have sqlite3-dev installed
    • Debian: libsqlite3-dev
    • Red Hat: sqlite-devel / pip install pysqlite3

You can then clone the malchive repo and install...

  • pip install . when in the parent directory.
  • To remove, just pip uninstall malchive

Scripts

Console scripts stemming from utilities are appended with the prefix malutil, decoders are appended with maldec, and active discovery scripts are appended with maldisc. This allows for easily identifiable malchive scripts via tab autocompletion.

; running superstrings from cmd line
malutil-superstrings 1.exe -ss
0x9535 (stack) lstrlenA
0x9592 (stack) GetFileSize
0x95dd (stack) WriteFile
0x963e (stack) CreateFileA
0x96b0 (stack) SetFilePointer
0x9707 (stack) GetSystemDirectoryA

; running a decoder from cmd line
maldec-pivy test.exe_
{
    "MD5": "2973ee05b13a575c06d23891ab83e067",
    "Config": {
        "PersistActiveSetupName": "StubPath",
        "DefaultBrowserKey": "SOFTWARE\\Classes\\http\\shell\\open\\command",
        "PersistActiveSetupKeyPart": "Software\\Microsoft\\Active Setup\\Installed Components\\",
        "ServerId": "TEST - WIN_XP",
        "Callbacks": [
            {
                "callback": "192.168.1.104",
                "protocol": "Direct",
                "port": 3333
            },
            {
                "callback": "192.168.1.111",
                "protocol": "Direct",
                "port": 4444
            }
        ],
        "ProxyCfgPresent": false,
        "Password": "test$321$",
        "Mutex": ")#V0qA.I4",
        "CopyAsADS": true,
        "Melt": true,
        "InjectPersist": true,
        "Inject": true
    }
}

; cmd line use with other common utilities
echo -ne 'eJw9kLFuwzAMRIEC7ZylrVGgRSFZiUbBZmwqsMUP0VfcnuQn+rMde7KLTBIPj0ce34tHyMUJjrnw
p3apz1kicjoJrDRlQihwOXmpL4RmSR5qhEU9MqvgWo8XqGMLJd+sKNQPK0dIGjK+e5WANIT6NeOs
k2mI5NmYAmcrkbn4oLPK5gZX+hVlRoKloMV20uQknv2EPunHKQtcig1cpHY4Jodie5pRViV+rp1t
629J6Dyu4hwLR97LINqY5rYILm1hhlvinoyJZavOKTrwBHTwpZ9yPSzidUiPt8PUTkZ0FBfayWLp
a71e8U8YDrbtu0aWDj+/eBOu+jRkYabX+3hPu9LZ5fb41T+7fmRf' | base64 -d | zlib-flate -uncompress | malutil-xor - [KEY]

Interfacing

Utilities, decoders, and discovery scripts in this collection are designed to support single ad-hoc analysis as well as inclusion into other frameworks. After installation, the malchive should be part of your Python path. At this point accessing any of the scripts is straight forward.

Here are a few examples:

; accessing decoder modules
import sys
from malchive.decoders import testdecoder

p = testdecoder.GetConfig(open(sys.argv[1], 'rb').read())
print('password', p.rc4_key)
for c in p.callbacks:
    print('c2 address', c)

; accessing utilities
from malchive.utilities import xor
ret = xor.GenericXor(buff=b'testing', key=[0x51], count=0xff)
print(ret.run_crypt())

; accessing helpers
from malchive.helpers import winfunc
key = winfunc.CryptDeriveKey(b'testdatatestdata')

To understand more about a given module, see the associated wiki entry.

Contributing

Contributing to the malchive is easy, just ensure the following requirements are met:

  • When writing utilities, decoders, or discovery scripts, consider using the available templates or review existing code if you're not sure how to get started.
  • Make sure modification or contributions pass pre-commit tests.
  • Ensure the contribution is placed in one of the component folders.
  • Updated the setup file if needed with an entry.
  • Python3 is a must.

Legal

©2021 The MITRE Corporation. ALL RIGHTS RESERVED.

Approved for Public Release; Distribution Unlimited. Public Release Case Number 21-0153

Owner
MITRE Cybersecurity
MITRE Cybersecurity
使用pytorch+transformers复现了SimCSE论文中的有监督训练和无监督训练方法

SimCSE复现 项目描述 SimCSE是一种简单但是很巧妙的NLP对比学习方法,创新性地引入Dropout的方式,对样本添加噪声,从而达到对正样本增强的目的。 该框架的训练目的为:对于batch中的每个样本,拉近其与正样本之间的距离,拉远其与负样本之间的距离,使得模型能够在大规模无监督语料(也可以

58 Dec 20, 2022
TalkNet: Audio-visual active speaker detection Model

Is someone talking? TalkNet: Audio-visual active speaker detection Model This repository contains the code for our ACM MM 2021 paper, TalkNet, an acti

142 Dec 14, 2022
sangha, pronounced "suhng-guh", is a social networking, booking platform where students and teachers can share their practice.

Flask React Project This is the backend for the Flask React project. Getting started Clone this repository (only this branch) git clone https://github

Courtney Newcomer 17 Sep 29, 2021
Ceaser-Cipher - The Caesar Cipher technique is one of the earliest and simplest method of encryption technique

Ceaser-Cipher The Caesar Cipher technique is one of the earliest and simplest me

Lateefah Ajadi 2 May 12, 2022
Sentiment-Analysis and EDA on the IMDB Movie Review Dataset

Sentiment-Analysis and EDA on the IMDB Movie Review Dataset The main part of the work focuses on the exploration and study of different approaches whi

Nikolas Petrou 1 Jan 12, 2022
Code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation".

This repository contains the code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation".

Chenhe Dong 28 Nov 10, 2022
ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

AliceMind AliceMind: ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab This repository provides pre-trained encode

Alibaba 1.4k Jan 04, 2023
The first online catalogue for Arabic NLP datasets.

Masader The first online catalogue for Arabic NLP datasets. This catalogue contains 200 datasets with more than 25 metadata annotations for each datas

ARBML 94 Dec 26, 2022
AI-powered literature discovery and review engine for medical/scientific papers

AI-powered literature discovery and review engine for medical/scientific papers paperai is an AI-powered literature discovery and review engine for me

NeuML 819 Dec 30, 2022
Biterm Topic Model (BTM): modeling topics in short texts

Biterm Topic Model Bitermplus implements Biterm topic model for short texts introduced by Xiaohui Yan, Jiafeng Guo, Yanyan Lan, and Xueqi Cheng. Actua

Maksim Terpilowski 49 Dec 30, 2022
Beta Distribution Guided Aspect-aware Graph for Aspect Category Sentiment Analysis with Affective Knowledge. Proceedings of EMNLP 2021

AAGCN-ACSA EMNLP 2021 Introduction This repository was used in our paper: Beta Distribution Guided Aspect-aware Graph for Aspect Category Sentiment An

Akuchi 36 Dec 18, 2022
The source code of HeCo

HeCo This repo is for source code of KDD 2021 paper "Self-supervised Heterogeneous Graph Neural Network with Co-contrastive Learning". Paper Link: htt

Nian Liu 106 Dec 27, 2022
All the code I wrote for Overwatch-related projects that I still own the rights to.

overwatch_shit.zip This is (eventually) going to contain all the software I wrote during my five-year imprisonment stay playing Overwatch. I'll be add

zkxjzmswkwl 2 Dec 31, 2021
KoBERT - Korean BERT pre-trained cased (KoBERT)

KoBERT KoBERT Korean BERT pre-trained cased (KoBERT) Why'?' Training Environment Requirements How to install How to use Using with PyTorch Using with

SK T-Brain 1k Jan 02, 2023
PyTorch implementation of Tacotron speech synthesis model.

tacotron_pytorch PyTorch implementation of Tacotron speech synthesis model. Inspired from keithito/tacotron. Currently not as much good speech quality

Ryuichi Yamamoto 279 Dec 09, 2022
中文問句產生器;使用台達電閱讀理解資料集(DRCD)

Transformer QG on DRCD The inputs of the model refers to we integrate C and A into a new C' in the following form. C' = [c1, c2, ..., [HL], a1, ..., a

Philip 1 Oct 22, 2021
Toy example of an applied ML pipeline for me to experiment with MLOps tools.

Toy Machine Learning Pipeline Table of Contents About Getting Started ML task description and evaluation procedure Dataset description Repository stru

Shreya Shankar 190 Dec 21, 2022
Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition, Part-of-Speech Tagging and so on.

anaGo anaGo is a Python library for sequence labeling(NER, PoS Tagging,...), implemented in Keras. anaGo can solve sequence labeling tasks such as nam

Hiroki Nakayama 1.5k Dec 05, 2022
Bu Chatbot, Konya Bilim Merkezi Yen için tasarlanmış olan bir projedir.

chatbot Bu Chatbot, Konya Bilim Merkezi Yeni Ufuklar Sergisi için 2021 Yılında tasarlanmış olan bir projedir. Chatbot Python ortamında yazılmıştır. Sö

Emre Özkul 1 Feb 23, 2022
SpikeX - SpaCy Pipes for Knowledge Extraction

SpikeX is a collection of pipes ready to be plugged in a spaCy pipeline. It aims to help in building knowledge extraction tools with almost-zero effort.

Erre Quadro Srl 384 Dec 12, 2022