Internationalized Domain Names for Python (IDNA 2008 and UTS #46)

Overview

Internationalized Domain Names in Applications (IDNA)

Support for the Internationalised Domain Names in Applications (IDNA) protocol as specified in RFC 5891. This is the latest version of the protocol and is sometimes referred to as “IDNA 2008”.

This library also provides support for Unicode Technical Standard 46, Unicode IDNA Compatibility Processing.

This acts as a suitable replacement for the “encodings.idna” module that comes with the Python standard library, but which only supports the older superseded IDNA specification (RFC 3490).

Basic functions are simply executed:

>>> import idna
>>> idna.encode('ドメイン.テスト')
b'xn--eckwd4c7c.xn--zckzah'
>>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
ドメイン.テスト

Installation

To install this library, you can use pip:

$ pip install idna

Alternatively, you can install the package using the bundled setup script:

$ python setup.py install

Usage

For typical usage, the encode and decode functions will take a domain name argument and perform a conversion to A-labels or U-labels respectively.

>>> import idna
>>> idna.encode('ドメイン.テスト')
b'xn--eckwd4c7c.xn--zckzah'
>>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
ドメイン.テスト

You may use the codec encoding and decoding methods using the idna.codec module:

>>> import idna.codec
>>> print('домен.испытание'.encode('idna'))
b'xn--d1acufc.xn--80akhbyknj4f'
>>> print(b'xn--d1acufc.xn--80akhbyknj4f'.decode('idna'))
домен.испытание

Conversions can be applied at a per-label basis using the ulabel or alabel functions if necessary:

>>> idna.alabel('测试')
b'xn--0zwm56d'

Compatibility Mapping (UTS #46)

As described in RFC 5895, the IDNA specification does not normalize input from different potential ways a user may input a domain name. This functionality, known as a “mapping”, is considered by the specification to be a local user-interface issue distinct from IDNA conversion functionality.

This library provides one such mapping, that was developed by the Unicode Consortium. Known as Unicode IDNA Compatibility Processing, it provides for both a regular mapping for typical applications, as well as a transitional mapping to help migrate from older IDNA 2003 applications.

For example, “Königsgäßchen” is not a permissible label as LATIN CAPITAL LETTER K is not allowed (nor are capital letters in general). UTS 46 will convert this into lower case prior to applying the IDNA conversion.

>>> import idna
>>> idna.encode('Königsgäßchen')
...
idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of 'Königsgäßchen' not allowed
>>> idna.encode('Königsgäßchen', uts46=True)
b'xn--knigsgchen-b4a3dun'
>>> print(idna.decode('xn--knigsgchen-b4a3dun'))
königsgäßchen

Transitional processing provides conversions to help transition from the older 2003 standard to the current standard. For example, in the original IDNA specification, the LATIN SMALL LETTER SHARP S (ß) was converted into two LATIN SMALL LETTER S (ss), whereas in the current IDNA specification this conversion is not performed.

>>> idna.encode('Königsgäßchen', uts46=True, transitional=True)
'xn--knigsgsschen-lcb0w'

Implementors should use transitional processing with caution, only in rare cases where conversion from legacy labels to current labels must be performed (i.e. IDNA implementations that pre-date 2008). For typical applications that just need to convert labels, transitional processing is unlikely to be beneficial and could produce unexpected incompatible results.

encodings.idna Compatibility

Function calls from the Python built-in encodings.idna module are mapped to their IDNA 2008 equivalents using the idna.compat module. Simply substitute the import clause in your code to refer to the new module name.

Exceptions

All errors raised during the conversion following the specification should raise an exception derived from the idna.IDNAError base class.

More specific exceptions that may be generated as idna.IDNABidiError when the error reflects an illegal combination of left-to-right and right-to-left characters in a label; idna.InvalidCodepoint when a specific codepoint is an illegal character in an IDN label (i.e. INVALID); and idna.InvalidCodepointContext when the codepoint is illegal based on its positional context (i.e. it is CONTEXTO or CONTEXTJ but the contextual requirements are not satisfied.)

Building and Diagnostics

The IDNA and UTS 46 functionality relies upon pre-calculated lookup tables for performance. These tables are derived from computing against eligibility criteria in the respective standards. These tables are computed using the command-line script tools/idna-data.

This tool will fetch relevant codepoint data from the Unicode repository and perform the required calculations to identify eligibility. There are three main modes:

  • idna-data make-libdata. Generates idnadata.py and uts46data.py, the pre-calculated lookup tables using for IDNA and UTS 46 conversions. Implementors who wish to track this library against a different Unicode version may use this tool to manually generate a different version of the idnadata.py and uts46data.py files.
  • idna-data make-table. Generate a table of the IDNA disposition (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC 5892 and the pre-computed tables published by IANA.
  • idna-data U+0061. Prints debugging output on the various properties associated with an individual Unicode codepoint (in this case, U+0061), that are used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging or analysis.

The tool accepts a number of arguments, described using idna-data -h. Most notably, the --version argument allows the specification of the version of Unicode to use in computing the table data. For example, idna-data --version 9.0.0 make-libdata will generate library data against Unicode 9.0.0.

Additional Notes

  • Packages. The latest tagged release version is published in the Python Package Index.
  • Version support. This library supports Python 3.5 and higher. As this library serves as a low-level toolkit for a variety of applications, many of which strive for broad compatibility with older Python versions, there is no rush to remove older intepreter support. Removing support for older versions should be well justified in that the maintenance burden has become too high.
  • Python 2. Python 2 is supported by version 2.x of this library. While active development of the version 2.x series has ended, notable issues being corrected may be backported to 2.x. Use "idna<3" in your requirements file if you need this library for a Python 2 application.
  • Testing. The library has a test suite based on each rule of the IDNA specification, as well as tests that are provided as part of the Unicode Technical Standard 46, Unicode IDNA Compatibility Processing.
  • Emoji. It is an occasional request to support emoji domains in this library. Encoding of symbols like emoji is expressly prohibited by the technical standard IDNA 2008 and emoji domains are broadly phased out across the domain industry due to associated security risks. For now, applications that wish need to support these non-compliant labels may wish to consider trying the encode/decode operation in this library first, and then falling back to using encodings.idna. See the Github project for more discussion.
Owner
Kim Davies
Kim Davies
MSDorkDump is a Google Dork File Finder that queries a specified domain name and variety of file extensions

MSDorkDump is a Google Dork File Finder that queries a specified domain name and variety of file extensions (pdf, doc, docx, etc), and downloads them.

Joe Helle 150 Jan 03, 2023
Gitlab RCE - Remote Code Execution

Gitlab RCE - Remote Code Execution RCE for old gitlab version = 11.4.7 & 12.4.0-12.8.1 LFI for old gitlab versions 10.4 - 12.8.1 This is an exploit f

153 Nov 09, 2022
Scout Suite - an open source multi-cloud security-auditing tool,

Description Scout Suite is an open source multi-cloud security-auditing tool, which enables security posture assessment of cloud environments. Using t

NCC Group Plc 5k Jan 05, 2023
Log4Shell RCE Exploit - fully independent exploit does not require any 3rd party binaries.

Log4Shell RCE Exploit fully independent exploit does not require any 3rd party binaries. The exploit spraying the payload to all possible logged HTTP

258 Jan 02, 2023
Monty Hall Problem simulation written in Python.

Monty Hall Problem Simulation monty_hall_sim is a brute-force method of determining the optimal strategy for the Monty Hall Problem. Usage Set boolean

Xavier D 1 Aug 29, 2022
A proxy for asyncio.AbstractEventLoop for testing purposes

aioloop-proxy A proxy for asyncio.AbstractEventLoop for testing purposes. When tests writing for asyncio based code, there are controversial requireme

aio-libs 12 Dec 12, 2022
HatSploit native powerful payload generation and shellcode injection tool that provides support for common platforms and architectures.

HatVenom HatSploit native powerful payload generation and shellcode injection tool that provides support for common platforms and architectures. Featu

EntySec 100 Dec 23, 2022
ONT Analysis Toolkit (OAT)

A toolkit for monitoring ONT MinION sequencing, followed by data analysis, for viral genomes amplified with tiled amplicon sequencing.

6 Jun 14, 2022
It's a simple tool for test vulnerability Apache Path Traversal

SimplesApachePathTraversal Simples Apache Path Traversal It's a simple tool for test vulnerability Apache Path Traversal https://blog.mrcl0wn.com/2021

Mr. Cl0wn - H4ck1ng C0d3r 56 Dec 27, 2022
AnonStress-Stored-XSS-Exploit - An exploit and demonstration on how to exploit a Stored XSS vulnerability in anonstress

AnonStress Stored XSS Exploit An exploit and demonstration on how to exploit a S

صلى الله على محمد وآله 3 Jun 22, 2022
Passphrase-wordlist - Shameless clone of passphrase wordlist

This repository is NOT official -- the original repository is located on GitLab

Jeff McJunkin 2 Feb 05, 2022
Set the draft security HTTP header Permissions-Policy (previously Feature-Policy) on your Django app.

django-permissions-policy Set the draft security HTTP header Permissions-Policy (previously Feature-Policy) on your Django app. Requirements Python 3.

Adam Johnson 76 Nov 30, 2022
Herramienta para descargar eventos de Sucuri WAF hacia disco.

Descarga los eventos de Sucuri Script para descargar los eventos del Sucuri Web Application Firewall (WAF) en el disco como archivos CSV. Requerimient

CSIRT-RD 2 Nov 29, 2021
CloudFlare reconnaissance, tries to uncover the IP behind CF.

CloudFlare reconnaissance, tries to uncover the IP behind CF.

Neospace 8 Dec 03, 2021
An IDA pro python script to decrypt Qbot malware string

Qbot-Strings-Decrypter An IDA pro python script to decrypt Qbot malware strings.

stuckinvim 6 Sep 01, 2022
xp_CAPTCHA(白嫖版) burp 验证码 识别 burp插件

xp_CAPTCHA(白嫖版) 说明 xp_CAPTCHA (白嫖版) 验证码识别 burp插件 安装 需要python3 小于3.7的版本 安装 muggle_ocr 模块(大概400M左右) python3 -m pip install -i http://mirrors.aliyun.com/

算命縖子 588 Jan 09, 2023
Analyse a forensic target (such as a directory) to find and report files found and not found from CIRCL hashlookup public service

Analyse a forensic target (such as a directory) to find and report files found and not found from CIRCL hashlookup public service. This tool can help a digital forensic investigator to know the conte

hashlookup 96 Dec 20, 2022
This is a simple Port Flooder written in Python 3.

This is a simple Port Flooder written in Python 3. Use this tool to quickly stress test your network devices and measure your router's or server's load.

Júlio Carneiro 4 Feb 20, 2022
Vulnerability Scanner & Auto Exploiter You can use this tool to check the security by finding the vulnerability in your website or you can use this tool to Get Shells

About create a target list or select one target, scans then exploits, done! Vulnnr is a Vulnerability Scanner & Auto Exploiter You can use this tool t

Nano 108 Dec 04, 2021
This respository contains the source code of the printjack and phonejack attacks.

Printjack-Phonejack This repository contains the source code of the printjack and phonejack attacks. The Printjack directory contains the script to ca

pietrobiondi 2 Feb 12, 2022