Python wrapper for Wikipedia

Overview

Wikipedia API

Wikipedia-API is easy to use Python wrapper for Wikipedias' API. It supports extracting texts, sections, links, categories, translations, etc from Wikipedia. Documentation provides code snippets for the most common use cases.

build status Documentation Status Test Coverage Version Py Versions GitHub stars

Installation

This package requires at least Python 3.4 to install because it's using IntEnum.

pip3 install wikipedia-api

Usage

Goal of Wikipedia-API is to provide simple and easy to use API for retrieving informations from Wikipedia. Bellow are examples of common use cases.

Importing

import wikipediaapi

How To Get Single Page

Getting single page is straightforward. You have to initialize Wikipedia object and ask for page by its name. It's parameter language has be one of supported languages.

import wikipediaapi
    wiki_wiki = wikipediaapi.Wikipedia('en')

    page_py = wiki_wiki.page('Python_(programming_language)')

How To Check If Wiki Page Exists

For checking, whether page exists, you can use function exists.

page_py = wiki_wiki.page('Python_(programming_language)')
print("Page - Exists: %s" % page_py.exists())
# Page - Exists: True

page_missing = wiki_wiki.page('NonExistingPageWithStrangeName')
print("Page - Exists: %s" %     page_missing.exists())
# Page - Exists: False

How To Get Page Summary

Class WikipediaPage has property summary, which returns description of Wiki page.

import wikipediaapi
    wiki_wiki = wikipediaapi.Wikipedia('en')

    print("Page - Title: %s" % page_py.title)
    # Page - Title: Python (programming language)

    print("Page - Summary: %s" % page_py.summary[0:60])
    # Page - Summary: Python is a widely used high-level programming language for

How To Get Page URL

WikipediaPage has two properties with URL of the page. It is fullurl and canonicalurl.

print(page_py.fullurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

print(page_py.canonicalurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

How To Get Full Text

To get full text of Wikipedia page you should use property text which constructs text of the page as concatanation of summary and sections with their titles and texts.

wiki_wiki = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.WIKI
)

p_wiki = wiki_wiki.page("Test 1")
print(p_wiki.text)
# Summary
# Section 1
# Text of section 1
# Section 1.1
# Text of section 1.1
# ...


wiki_html = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.HTML
)
p_html = wiki_html.page("Test 1")
print(p_html.text)
# <p>Summary</p>
# <h2>Section 1</h2>
# <p>Text of section 1</p>
# <h3>Section 1.1</h3>
# <p>Text of section 1.1</p>
# ...

How To Get Page Sections

To get all top level sections of page, you have to use property sections. It returns list of WikipediaPageSection, so you have to use recursion to get all subsections.

def print_sections(sections, level=0):
        for s in sections:
                print("%s: %s - %s" % ("*" * (level + 1), s.title, s.text[0:40]))
                print_sections(s.sections, level + 1)


print_sections(page_py.sections)
# *: History - Python was conceived in the late 1980s,
# *: Features and philosophy - Python is a multi-paradigm programming l
# *: Syntax and semantics - Python is meant to be an easily readable
# **: Indentation - Python uses whitespace indentation, rath
# **: Statements and control flow - Python's statements include (among other
# **: Expressions - Some Python expressions are similar to l

How To Get Page In Other Languages

If you want to get other translations of given page, you should use property langlinks. It is map, where key is language code and value is WikipediaPage.

def print_langlinks(page):
        langlinks = page.langlinks
        for k in sorted(langlinks.keys()):
            v = langlinks[k]
            print("%s: %s - %s: %s" % (k, v.language, v.title, v.fullurl))

print_langlinks(page_py)
# af: af - Python (programmeertaal): https://af.wikipedia.org/wiki/Python_(programmeertaal)
# als: als - Python (Programmiersprache): https://als.wikipedia.org/wiki/Python_(Programmiersprache)
# an: an - Python: https://an.wikipedia.org/wiki/Python
# ar: ar - بايثون: https://ar.wikipedia.org/wiki/%D8%A8%D8%A7%D9%8A%D8%AB%D9%88%D9%86
# as: as - পাইথন: https://as.wikipedia.org/wiki/%E0%A6%AA%E0%A6%BE%E0%A6%87%E0%A6%A5%E0%A6%A8

page_py_cs = page_py.langlinks['cs']
print("Page - Summary: %s" % page_py_cs.summary[0:60])
# Page - Summary: Python (anglická výslovnost [ˈpaiθtən]) je vysokoúrovňový sk

How To Get Links To Other Pages

If you want to get all links to other wiki pages from given page, you need to use property links. It's map, where key is page title and value is WikipediaPage.

def print_links(page):
        links = page.links
        for title in sorted(links.keys()):
            print("%s: %s" % (title, links[title]))

print_links(page_py)
# 3ds Max: 3ds Max (id: ??, ns: 0)
# ?:: ?: (id: ??, ns: 0)
# ABC (programming language): ABC (programming language) (id: ??, ns: 0)
# ALGOL 68: ALGOL 68 (id: ??, ns: 0)
# Abaqus: Abaqus (id: ??, ns: 0)
# ...

How To Get Page Categories

If you want to get all categories under which page belongs, you should use property categories. It's map, where key is category title and value is WikipediaPage.

def print_categories(page):
        categories = page.categories
        for title in sorted(categories.keys()):
            print("%s: %s" % (title, categories[title]))


print("Categories")
print_categories(page_py)
# Category:All articles containing potentially dated statements: ...
# Category:All articles with unsourced statements: ...
# Category:Articles containing potentially dated statements from August 2016: ...
# Category:Articles containing potentially dated statements from March 2017: ...
# Category:Articles containing potentially dated statements from September 2017: ...

How To Get All Pages From Category

To get all pages from given category, you should use property categorymembers. It returns all members of given category. You have to implement recursion and deduplication by yourself.

def print_categorymembers(categorymembers, level=0, max_level=1):
        for c in categorymembers.values():
            print("%s: %s (ns: %d)" % ("*" * (level + 1), c.title, c.ns))
            if c.ns == wikipediaapi.Namespace.CATEGORY and level < max_level:
                print_categorymembers(c.categorymembers, level=level + 1, max_level=max_level)


cat = wiki_wiki.page("Category:Physics")
print("Category members: Category:Physics")
print_categorymembers(cat.categorymembers)

# Category members: Category:Physics
# * Statistical mechanics (ns: 0)
# * Category:Physical quantities (ns: 14)
# ** Refractive index (ns: 0)
# ** Vapor quality (ns: 0)
# ** Electric susceptibility (ns: 0)
# ** Specific weight (ns: 0)
# ** Category:Viscosity (ns: 14)
# *** Brookfield Engineering (ns: 0)

How To See Underlying API Call

If you have problems with retrieving data you can get URL of undrerlying API call. This will help you determine if the problem is in the library or somewhere else.

import wikipediaapi
import sys
wikipediaapi.log.setLevel(level=wikipediaapi.logging.DEBUG)

# Set handler if you use Python in interactive mode
out_hdlr = wikipediaapi.logging.StreamHandler(sys.stderr)
out_hdlr.setFormatter(wikipediaapi.logging.Formatter('%(asctime)s %(message)s'))
out_hdlr.setLevel(wikipediaapi.logging.DEBUG)
wikipediaapi.log.addHandler(out_hdlr)

wiki = wikipediaapi.Wikipedia(language='en')

page_ostrava = wiki.page('Ostrava')
print(page_ostrava.summary)
# logger prints out: Request URL: http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Ostrava&explaintext=1&exsectionformat=wiki

External Links

Other Badges

Code Climate Issue Count Coveralls Version Py Versions implementations Downloads Tags github-release Github commits (since latest release) GitHub forks GitHub stars GitHub watchers GitHub commit activity the past week, 4 weeks, year Last commit GitHub code size in bytes GitHub repo size in bytes PyPi License PyPi Wheel PyPi Format PyPi PyVersions PyPi Implementations PyPi Status PyPi Downloads - Day PyPi Downloads - Week PyPi Downloads - Month Libraries.io - SourceRank Libraries.io - Dependent Repos

Other Pages

.. toctree::
        :maxdepth: 2

        API
        CHANGES
        DEVELOPMENT
        wikipediaapi/api

Owner
Martin Majlis
Martin Majlis
Non official, but friendly QvaPay library for the Python language.

Python SDK for the QvaPay API Non official, but friendly QvaPay library for the Python language. Setup You can install this package by using the pip t

Carlos Lugones 17 Nov 25, 2022
A python package for AxisVM

PyAxisVM The package is under development. Follow us on social media, where we'll announce the first release! Overview The PyAxisVM project offers a h

AxisVM - InterCAD 8 Nov 19, 2022
Discord Bot for SurPath Hub's server

Dayong Dayong is dedicated to helping Discord servers build and manage their communities. Multipurpose —lots of features, lots of automation. Self-hos

SurPath Hub 6 Dec 18, 2021
Autodrive is designed to make it as easy as possible to interact with the Google Drive and Sheets APIs via Python

Autodrive Autodrive is designed to make it as easy as possible to interact with the Google Drive and Sheets APIs via Python. It is especially designed

Chris Larabee 1 Oct 02, 2021
Telegram Reporter

[Telegram Reporter v.3 ] 🇮🇷 AliCybeRR 🇮🇷 [ AliCybeRR.Reporter feature ] Login Your Telegram account 👽 support Termux ❕ No Limits ⚡ Secure 🔐 Free

AliCybeRR 1 Jun 08, 2022
Bomber-X - A SMS Bomber made with Python

Bomber-X A SMS Bomber made with Python Linux/Termux apt update apt upgrade apt i

S M Shahriar Zarir 2 Mar 10, 2022
Dante, my discord bot. Open source project in development and not optimized for other filesystems, install and setup script in development

DanteMode (In private development for ~6 months) Dante, my discord bot. Open source project in development and not optimized for other filesystems, in

2 Nov 05, 2021
42-event-notifier - 42 Event notifier using 42API and Github Actions

42 Event Notifier 42서울 Agenda에 새로운 이벤트가 등록되면 알려드립니다! 현재는 Github Issue로 등록되므로 상단

6 May 16, 2022
EduuRobot Telegram bot source code.

EduuRobot A multipurpose Telegram Bot made with Pyrogram and asynchronous programming. Requirements Python 3.6+ An Unix-like operating system (Running

Amano Team 119 Dec 23, 2022
Lamblayer: a minimal deployment tool for AWS Lambda layers

lamblayer lamblayer is a minimal deployment tool for AWS Lambda layers. lamblayer does, Create a Layers of built pip-installable python packages. Crea

Yusuke Takahashi 2 Aug 19, 2022
Python Dialogflow CX Scripting API (SCRAPI)

Python Dialogflow CX Scripting API (SCRAPI) A high level scripting API for bot builders, developers, and maintainers. Table of Contents Introduction W

Google Cloud Platform 39 Dec 09, 2022
Generates a coverage badge using coverage.py and the shields.io service.

Welcome to README Coverage Badger 👋 Generates a coverage badge using coverage.py and the shields.io service. Your README file is then updated with th

Victor Miti 10 Dec 06, 2022
Beyonic API Python official client library simplified examples using Flask, Django and Fast API.

Beyonic API Python official client library simplified examples using Flask, Django and Fast API.

A python package to easy the integration with Direct Online Pay (Mpesa, TigoPesa, AirtelMoney, Card Payments)

python-dpo A python package to easy the integration with Direct Online Pay (DPO) which easily allow you easily integrate with payment options once wit

NEUROTECH 15 Oct 08, 2022
ML-Test-Client

ML-Test-Client Introduction What is this? This Test Client App is to be used to crowd-test machine learning models with the goal of finding the best c

11 Jul 15, 2022
a discord bot for searching your movies, and bot return movie url for you :)

IMDb Discord Bot how to run this bot. the first step you must create prefixes.json file the second step you must create a virtualenv if you use window

Mehdi Radfar 6 Dec 20, 2022
A telegram bot that can send you high-quality audio 🎧🎧🎧

Music downloader bot Still under development Please Report issues to improve this repo.I will try to fix bugs in next update Music downloader bot is a

Anish Gowda 36 Dec 06, 2022
A discord.py code generator program. Compatible with both linux and windows.

Astro-Cord A discord.py code generator program. Compatible with both linux and windows. About This is a program made to make discord.py bot developmen

Astro Inc. 2 Dec 23, 2021
칼만 필터는 어렵지 않아(저자 김성필) 파이썬 코드(Unofficial)

KalmanFilter_Python 칼만 필터는 어렵지 않아(저자 김성필) 책을 공부하면서, Matlab 코드를 Python으로 변환한 것입니다. Contents Part01. Recursive Filter Chapter01. Average Filter Chapter0

Donghun Park 20 Oct 28, 2022
Cool Discord bot for you

BountyBot Баунти – современный бот созданный с целью сделать ваш сервер лучше! В кратце В нем присутствует множество основных и интересных функций, та

Leestarb Original 1 Nov 22, 2021