Python port of Google's libphonenumber

Overview

phonenumbers Python Library

Coverage Status

This is a Python port of Google's libphonenumber library It supports Python 2.5-2.7 and Python 3.x (in the same codebase, with no 2to3 conversion needed).

Original Java code is Copyright (C) 2009-2015 The Libphonenumber Authors.

Release HISTORY, derived from upstream release notes.

Installation

Install using pip with:

pip install phonenumbers

Example Usage

The main object that the library deals with is a PhoneNumber object. You can create this from a string representing a phone number using the parse function, but you also need to specify the country that the phone number is being dialled from (unless the number is in E.164 format, which is globally unique).

>>> import phonenumbers
>>> x = phonenumbers.parse("+442083661177", None)
>>> print(x)
Country Code: 44 National Number: 2083661177 Leading Zero: False
>>> type(x)
<class 'phonenumbers.phonenumber.PhoneNumber'>
>>> y = phonenumbers.parse("020 8366 1177", "GB")
>>> print(y)
Country Code: 44 National Number: 2083661177 Leading Zero: False
>>> x == y
True
>>> z = phonenumbers.parse("00 1 650 253 2222", "GB")  # as dialled from GB, not a GB number
>>> print(z)
Country Code: 1 National Number: 6502532222 Leading Zero(s): False

The PhoneNumber object that parse produces typically still needs to be validated, to check whether it's a possible number (e.g. it has the right number of digits) or a valid number (e.g. it's in an assigned exchange).

>>> z = phonenumbers.parse("+120012301", None)
>>> print(z)
Country Code: 1 National Number: 20012301 Leading Zero: False
>>> phonenumbers.is_possible_number(z)  # too few digits for USA
False
>>> phonenumbers.is_valid_number(z)
False
>>> z = phonenumbers.parse("+12001230101", None)
>>> print(z)
Country Code: 1 National Number: 2001230101 Leading Zero: False
>>> phonenumbers.is_possible_number(z)
True
>>> phonenumbers.is_valid_number(z)  # NPA 200 not used
False

The parse function will also fail completely (with a NumberParseException) on inputs that cannot be uniquely parsed, or that can't possibly be phone numbers.

>>> z = phonenumbers.parse("02081234567", None)  # no region, no + => unparseable
Traceback (most recent call last):
  File "phonenumbers/phonenumberutil.py", line 2350, in parse
    "Missing or invalid default region.")
phonenumbers.phonenumberutil.NumberParseException: (0) Missing or invalid default region.
>>> z = phonenumbers.parse("gibberish", None)
Traceback (most recent call last):
  File "phonenumbers/phonenumberutil.py", line 2344, in parse
    "The string supplied did not seem to be a phone number.")
phonenumbers.phonenumberutil.NumberParseException: (1) The string supplied did not seem to be a phone number.

Once you've got a phone number, a common task is to format it in a standardized format. There are a few formats available (under PhoneNumberFormat), and the format_number function does the formatting.

>>> phonenumbers.format_number(x, phonenumbers.PhoneNumberFormat.NATIONAL)
'020 8366 1177'
>>> phonenumbers.format_number(x, phonenumbers.PhoneNumberFormat.INTERNATIONAL)
'+44 20 8366 1177'
>>> phonenumbers.format_number(x, phonenumbers.PhoneNumberFormat.E164)
'+442083661177'

If your application has a UI that allows the user to type in a phone number, it's nice to get the formatting applied as the user types. The AsYouTypeFormatter object allows this.

>>> formatter = phonenumbers.AsYouTypeFormatter("US")
>>> formatter.input_digit("6")
'6'
>>> formatter.input_digit("5")
'65'
>>> formatter.input_digit("0")
'650'
>>> formatter.input_digit("2")
'650 2'
>>> formatter.input_digit("5")
'650 25'
>>> formatter.input_digit("3")
'650 253'
>>> formatter.input_digit("2")
'650-2532'
>>> formatter.input_digit("2")
'(650) 253-22'
>>> formatter.input_digit("2")
'(650) 253-222'
>>> formatter.input_digit("2")
'(650) 253-2222'

Sometimes, you've got a larger block of text that may or may not have some phone numbers inside it. For this, the PhoneNumberMatcher object provides the relevant functionality; you can iterate over it to retrieve a sequence of PhoneNumberMatch objects. Each of these match objects holds a PhoneNumber object together with information about where the match occurred in the original string.

>>> text = "Call me at 510-748-8230 if it's before 9:30, or on 703-4800500 after 10am."
>>> for match in phonenumbers.PhoneNumberMatcher(text, "US"):
...     print(match)
...
PhoneNumberMatch [11,23) 510-748-8230
PhoneNumberMatch [51,62) 703-4800500
>>> for match in phonenumbers.PhoneNumberMatcher(text, "US"):
...     print(phonenumbers.format_number(match.number, phonenumbers.PhoneNumberFormat.E164))
...
+15107488230
+17034800500

You might want to get some information about the location that corresponds to a phone number. The geocoder.area_description_for_number does this, when possible.

>>> from phonenumbers import geocoder
>>> ch_number = phonenumbers.parse("0431234567", "CH")
>>> geocoder.description_for_number(ch_number, "de")
'Zürich'
>>> geocoder.description_for_number(ch_number, "en")
'Zurich'
>>> geocoder.description_for_number(ch_number, "fr")
'Zurich'
>>> geocoder.description_for_number(ch_number, "it")
'Zurigo'

For mobile numbers in some countries, you can also find out information about which carrier originally owned a phone number.

>>> from phonenumbers import carrier
>>> ro_number = phonenumbers.parse("+40721234567", "RO")
>>> carrier.name_for_number(ro_number, "en")
'Vodafone'

You might also be able to retrieve a list of time zone names that the number potentially belongs to.

>>> from phonenumbers import timezone
>>> gb_number = phonenumbers.parse("+447986123456", "GB")
>>> timezone.time_zones_for_number(gb_number)
('Atlantic/Reykjavik', 'Europe/London')

For more information about the other functionality available from the library, look in the unit tests or in the original libphonenumber project.

Memory Usage

The library includes a lot of metadata, potentially giving a significant memory overhead. There are two mechanisms for dealing with this.

  • The normal metadata for the core functionality of the library is loaded on-demand, on a region-by-region basis (i.e. the metadata for a region is only loaded on the first time it is needed).
  • Metadata for extended functionality is held in separate packages, which therefore need to be explicitly loaded separately. This affects:
    • The geocoding metadata, which is held in phonenumbers.geocoder and used by the geocoding functions (geocoder.description_for_number, geocoder.description_for_valid_number or geocoder.country_name_for_number).
    • The carrier metadata, which is held in phonenumbers.carrier and used by the mapping functions (carrier.name_for_number or carrier.name_for_valid_number).
    • The timezone metadata, which is held in phonenumbers.timezone and used by the timezone functions (time_zones_for_number or time_zones_for_geographical_number).

The phonenumberslite version of the library does not include the geocoder, carrier and timezone packages, which can be useful if you have problems installing the main phonenumbers library due to space/memory limitations.

If you need to ensure that the metadata memory use is accounted for at start of day (i.e. that a subsequent on-demand load of metadata will not cause a pause or memory exhaustion):

  • Force-load the normal metadata by calling phonenumbers.PhoneMetadata.load_all().
  • Force-load the extended metadata by importing the appropriate packages (phonenumbers.geocoder, phonenumbers.carrier, phonenumbers.timezone).

The phonenumberslite version of the package does not include the geocoding, carrier and timezone metadata, which can be useful if you have problems installing the main phonenumbers package due to space/memory limitations.

Project Layout

  • The python/ directory holds the Python code.
  • The resources/ directory is a copy of the resources/ directory from libphonenumber. This is not needed to run the Python code, but is needed when upstream changes to the master metadata need to be incorporated.
  • The tools/ directory holds the tools that are used to process upstream changes to the master metadata.
Owner
David Drysdale
David Drysdale
Split large XML files into smaller ones for easy upload

Split large XML files into smaller ones for easy upload. Works for WordPress Posts Import and other XML files.

Joseph Adediji 1 Jan 30, 2022
Make writing easier!

Handwriter Make writing easier! How to Download and install a handwriting font, or create a font from your handwriting. Use a word processor like Micr

64 Dec 25, 2022
A python tool to convert Bangla Bijoy text to Unicode text.

Unicode Converter A python tool to convert Bangla Bijoy text to Unicode text. Installation Unicode Converter can be installed via PyPi. Make sure pip

Shahad Mahmud 10 Sep 29, 2022
The project is investigating methods to extract human-marked data from document forms such as surveys and tests.

The project is investigating methods to extract human-marked data from document forms such as surveys and tests. They can read questions, multiple-choice exam papers, and grade.

Harry 5 Mar 27, 2022
WorldCloud Orçamento de Estado 2022

World Cloud Orçamento de Estado 2022 What it does This script creates a worldcloud, masked on a image, from a txt file How to run it? Install all libr

Jorge Gomes 2 Oct 12, 2021
utoken is a multilingual tokenizer that divides text into words, punctuation and special tokens such as numbers, URLs, XML tags, email-addresses and hashtags.

utoken utoken is a multilingual tokenizer that divides text into words, punctuation and special tokens such as numbers, URLs, XML tags, email-addresse

Ulf Hermjakob 11 Jan 05, 2023
Extract knowledge from raw text

Extract knowledge from raw text This repository is a nearly copy-paste of "From Text to Knowledge: The Information Extraction Pipeline" with some cosm

Raphael Sourty 10 Dec 03, 2022
Word-Generator - Generates meaningful words from dictionary with given no. of letters and words.

Meaningful Word Generator Generates meaningful words from dictionary with given no. of letters and words. This might be useful for generating short li

Mohammed Rabil 1 Jan 01, 2022
Find a Doc is a free online resource aimed at helping connect the foreign community in Japan with health services in their native language.

Find a Doc - Localization Find a Doc is a free online resource aimed at helping connect the foreign community in Japan with health services in their n

Our Japan Life 18 Dec 19, 2022
Returns unicode slugs

Python Slugify A Python slugify application that handles unicode. Overview Best attempt to create slugs from unicode strings while keeping it DRY. Not

Val Neekman 1.3k Jan 04, 2023
A username generator made from French Canadian most common names.

This script is used to generate a username list using the most common first and last names in Quebec in different formats. It can generate some passwords using specific patterns such as Tremblay2020.

5 Nov 26, 2022
Username reconnaisance tool that checks the availability of a specified username on over 200 websites.

Username reconnaisance tool that checks the availability of a specified username on over 200 websites. Installation & Usage Clone from Github: $ git c

Richard Mwewa 20 Oct 30, 2022
Text to ASCII and ASCII to text

Text2ASCII Description This python script (converter.py) contains two functions: encode() is used to return a list of Integer, one item per character

4 Jan 22, 2022
Fuzz a language by mixing up only few words.

afasi Fuzz a language by mixing up only few words. Status Beta. Note: The default branch is default. Use Examples Version General Help Translate Help

Stefan Hagen 2 Dec 14, 2022
Etranslate is a free and unlimited python library for transiting your texts

Etranslate is a free and unlimited python library for transiting your texts

Abolfazl Khalili 16 Sep 13, 2022
Simple python program to auto credit your code, text, book, whatever!

Credit Simple python program to auto credit your code, text, book, whatever! Setup First change credit_text to whatever text you would like to credit

Hashm 1 Jan 29, 2022
The bot creates hashtags for user's texts in Russian and English.

telegram_bot_hashtags The bot creates hashtags for user's texts in Russian and English. It is a simple bot for creating hashtags. NOTE file config.py

Yana Davydovich 2 Feb 12, 2022
REST API for sentence tokenization and embedding using Multilingual Universal Sentence Encoder.

MUSE stands for Multilingual Universal Sentence Encoder - multilingual extension (supports 16 languages) of Universal Sentence Encoder (USE).

Dani El-Ayyass 47 Sep 05, 2022
This project aims to test check if your RegExp are being matched by grep.

Bash RegExp This project aims to test check if your RegExp are being matched by grep. It's a local server that starts on the port 8080. It runs the se

Quatrecentquatre 1 Feb 28, 2022
A Python3 script that simulates the user typing a text on their keyboard.

A Python3 script that simulates the user typing a text on their keyboard. (control the speed, randomness, rate of typos and more!)

Jose Gracia Berenguer 3 Feb 22, 2022