A common, beautiful interface to tabular data, no matter the format

Overview

rows

Join the chat at https://gitter.im/turicas/rows. License: LGPLv3.

No matter which format your tabular data is in, rows will import it, automatically detect types and give you high-level Python objects so you can start working with the data instead of trying to parse it. It is also locale- and Unicode-aware. :)

Want to learn more? Read the documentation (or build and browse the docs locally by running make docs-serve after installing requirements-development.txt).

Installation

The easiest way to get your hands dirty is to install rows using pip.

PyPI

pip install rows

For other ways to install, refer to the Installation section of the documentation.

Contribution start guide

The preferred way to start contributing to the project is to create a virtualenv (you can do this with virtualenv, virtualenvwrapper, pyenv or whatever tool you'd like).

Create the virtualenv:

mkvirtualenv rows

Install all plugins' dependencies:

pip install --editable .[all]

Install development dependencies:

pip install -r requirements-development.txt
Comments
  • OverflowError

    After installing the dependencies required for the socios-brasil package, when I try to extract the data as instructed, I get the error below:

    Traceback (most recent call last):
     File "extract_dump.py", line 27, in <module> 
        import rows
     File "C:\Users\milcent\AppData\Local\Continuum\Anaconda3\lib\site-packages\rows\__init__.py", line 22, in <module>
        import rows.plugins as plugins
     File "C:\Users\milcent\AppData\Local\Continuum\Anaconda3\lib\site-packages\rows\plugins\__init__.py", line 20, in <module>
        from . import plugin_csv as csv # NOQA
     File "C:\Users\milcent\AppData\Local\Continuum\Anaconda3\lib\site-packages\rows\plugins\plugin_csv.py", line 34, in <module>
        unicodecsv.field_size_limit(sys.maxsize) 
    OverflowError: Python int too large to convert to C long
    

    Running on Windows 7, Anaconda 64-bit, Python 3.6. Thanks, Marcel Milcent

    opened by milcent 13
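The traceback above is raised when sys.maxsize is passed to field_size_limit on a platform where a C long is 32 bits wide, as on 64-bit Windows. A minimal sketch of a common workaround follows. It assumes the standard csv module rather than unicodecsv, and the helper name set_max_field_size_limit is hypothetical (it is not taken from the rows source). The idea is to shrink the limit until the call succeeds:

```python
import csv
import sys


def set_max_field_size_limit():
    """Set the largest CSV field size limit the platform accepts.

    Hypothetical helper: halves the candidate limit until
    csv.field_size_limit stops raising OverflowError.
    """
    limit = sys.maxsize
    while True:
        try:
            csv.field_size_limit(limit)
            return limit
        except OverflowError:
            # The value does not fit in a C long on this platform; try a smaller one.
            limit //= 2


if __name__ == "__main__":
    print(set_max_field_size_limit())
```

On platforms where sys.maxsize already fits in a C long, the first call succeeds and the loop exits immediately.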
  • PDF Plugin

    Create an algorithm to automatically extract tables from PDFs (available in text format). Could use pdftables, but the code is not up-to-date, does not work with Python3 etc.

    enhancement plugin 
    opened by turicas 7
  • Convert PDF to TXT

    Good morning, I am trying to convert a scanned PDF file to text (the PDF contains tables). I managed to install the rows library and the rows[pdf] and rows[cli] dependencies. When I try to run the command rows pdf-to-text teste.pdf result.txt in the command prompt, I get the following error: [image]

    Any idea what the problem might be?

    opened by Danielydsm 6
  • Autodetect delimiter in CSV files

    Currently the import_from_csv method has a 'delimiter' parameter that defaults to ',', but sometimes we don't know what the delimiter is and need it to be autodetected. This is especially useful for CSV files generated in MS Excel with ';' as the delimiter.

    A quick and dirty way to make this work is to count the number of times ',', ';' and tab are used in the file and assume the most used one is the delimiter.

    enhancement help wanted plugin 
    opened by jeanferri 6
  • OverflowError: Python int too large to convert to C long

    Good morning!

    I am learning Python, so this may be a very simple error to fix, but even so I have no idea what can be done:

    When I try to import rows, the message in the title appears.

    duplicate 
    opened by tbmpereira 5
  • Text plugin is not working on `rows convert`

    The file cha-de-bebe.txt is not being read correctly on the command line (try rows print cha-de-bebe.txt or rows convert cha-de-bebe.txt cha-de-bebe.csv) -- but it was generated correctly using rows print http://some-url/ > cha-de-bebe.txt.

    @jsbueno could you please help checking it? I think this bug started after your PR #270 .

    bug 
    opened by turicas 5
  • locale.Error: unsupported locale setting

    ======================================================================
    ERROR: test_DecimalField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 203, in test_DecimalField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_FloatField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 171, in test_FloatField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_IntegerField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 144, in test_IntegerField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_PercentField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 250, in test_PercentField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_locale_context (tests.tests_localization.LocalizationTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_localization.py", line 41, in test_locale_context
        with locale_context(name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    opened by ignatenkobrain 5
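Every failure above ends in locale.setlocale raising locale.Error, which is what happens when the requested locale is not installed on the build machine. A minimal sketch of how a test suite could probe for a locale before relying on it is shown below. The helper name locale_is_available is hypothetical and is not part of rows:

```python
import locale


def locale_is_available(name, category=locale.LC_ALL):
    """Return True if `name` can be set for `category` on this system.

    The current setting is saved first and restored afterwards, so the
    probe leaves the process locale unchanged.
    """
    previous = locale.setlocale(category)  # query only, changes nothing
    try:
        locale.setlocale(category, name)
    except locale.Error:
        return False
    else:
        return True
    finally:
        locale.setlocale(category, previous)


if __name__ == "__main__":
    for candidate in ("C", "pt_BR.UTF-8"):
        print(candidate, locale_is_available(candidate))
```

A check like this could be combined with unittest.skipUnless so that locale-dependent tests are skipped rather than reported as errors when the locale is missing.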
  • Porting rows to Python3

    This is a work in progress.

    I could make all tests pass on Python3, but 3 are broken on Python2 because of something I can't find yet on the type identification system.

    This PR is just to share it with you. Maybe your familiarity with the code can help fixing the tests.

    []'s!

    opened by henriquebastos 5
  • UserWarning: Call to deprecated function or class get_active_sheet

    Hi, when I build the package for Debian, the debhelper tools run pybuild, which shows these warnings [1]. I am using the latest source: git20151115.837b41.

    Is something wrong here, or has anyone else had the same problem? Thanks.

    [1] pybuild --test --test-nose -i python{version} -p 2.7 --dir .
    I: pybuild base:184: cd /pkgs/pkg-rows/rows-0.1.1+git20151115.837b41/.pybuild/pythonX.Y_2.7/build; python2.7 -m nose tests
    ...................................................................................................../usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property).
      def get_active_sheet(self):
    /usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property).
      def get_active_sheet(self):
    ./usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property).
      def get_active_sheet(self):
    ./usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property).
      def get_active_sheet(self):

    ..........................

    Ran 129 tests in 1.936s

    OK

    opened by kretcheu 5
  • Add sphinx documentation

    Hello dear reviewer,

    I basically did three things:

    • Add the sphinx to the requirements-development.txt
    • Create basic documentation, based on the README, with a few improvements I've made.
    • Move some basic project information (intro and architecture) to the __init__.py of the rows module

    I think the Sphinx docs can also be used as a website, and could perhaps be hosted on GitHub Pages.

    []'s I hope this will be useful! :)

    opened by raphapassini 5
  • Could not find import_from_pdf function

    I need to import data from pdf and found this example: https://gist.github.com/turicas/6b9ca83dcd531a6cd4fd87ced2a28c70

    But I was unable to run it, since the import_from_pdf is not available to me.

    I have already run the command: pip install rows[all]

    Is pdf format no longer supported?

    opened by marcellalves 4
  • New release on pypi

    I started using the "rows" lib today, and I've lost several hours of work because of a bug with empty cells in ODS input. Here is my story.

    I was learning the "rows" lib with an ODS file, and I came across a strange behavior. Of course, I thought it was because I wasn't using the lib properly, so I tried all possible options, searched on the Internet, etc. After several hours, I eventually tried the same code with an equivalent XLSX file and found out that the behavior was different! So I realized that I had found a bug on my first day of using the rows lib!

    I decided that I should report the bug. I took the time to write a script to illustrate my bug report. I was using rows 0.4.1 from PyPI but, before creating the bug report on GitHub, I thought I should check whether the bug was still present in the "develop" branch... and my script shows that the bug is fixed in the "develop" branch!

    Release 0.4.1 is dated Feb 14, 2019... almost 4 years old! There have been 210 commits since 0.4.1; among these 210 commits, I counted about 45 fixes. While going through the commit messages, I found the commit that fixes my bug: issue #320, fixed on March 27, 2019 in this commit: https://github.com/turicas/rows/commit/c569f9415f2c76b2f6e9afbe1d748946e759711f

    So, in December 2022, some users are wasting hours because of a bug that was found and fixed 3.5 years ago :-( No comment!

    So, please, push a new release on PyPI!

    opened by alexis-via 2
  • Replace unicodecsv by standard csv module

    unicodecsv has not been maintained for a while now [1]. It was preferred over the standard csv module because of its Unicode support. Now that the Python 3 csv module [2] supports Unicode, let's use it.

    For more context, we hit issues while rebuilding unicodecsv during the Fedora Python 3.11 mass rebuild [3][4].

    [1] https://github.com/jdunck/python-unicodecsv
    [2] https://docs.python.org/3/library/csv.html
    [3] https://copr.fedorainfracloud.org/coprs/g/python/python3.11/package/python-unicodecsv/
    [4] https://bugzilla.redhat.com/show_bug.cgi?id=2021938

    opened by jcapiitao 1
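As a small illustration of the point made in this issue, the Python 3 csv module reads and writes str values directly, so non-ASCII text needs no extra wrapper. The sketch below is not taken from rows; the function name roundtrip is hypothetical and the data is made up for the example:

```python
import csv
import io


def roundtrip(table):
    """Write `table` (a list of lists of str) as CSV, then read it back."""
    buffer = io.StringIO(newline="")  # newline="" lets csv control line endings
    csv.writer(buffer).writerows(table)
    buffer.seek(0)
    return list(csv.reader(buffer))


if __name__ == "__main__":
    data = [["nome", "cidade"], ["Álvaro", "São Paulo"]]
    print(roundtrip(data) == data)
```

When working with files instead of in-memory buffers, the encoding is chosen when the file is opened, for example open(path, encoding="utf-8", newline="").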
  • NameError: name 'obj' is not defined

    This error occurred when I tried to use the closest_same_column method in rows.plugins.pdf: [image]

    Apparently the code here is missing the part where we fetch the object that has the value passed as a parameter so that we can work with it (and apparently the same happens with the other method, closest_same_line).

    opened by dehatanes 0
  • Python 3.10: cannot import name 'Iterator' from 'collections'

    File "/data/data/com.termux/files/usr/lib/python3.10/site-packages/rows/plugins/utils.py", line 20, in <module> 
    from collections import Iterator, OrderedDict            
    ImportError: cannot import name 'Iterator' from 'collections'
    

    Maybe this will fix it:

    try:
        from collections.abc import Iterator
    except ImportError:
        from collections import Iterator
    
    opened by fagci 0
  • [pgimport] Option to do not store values as NULL

    NULL values can be confusing when analyzing data and there will be some cases where we prefer to add empty values as empty strings instead of NULL. The function pgimport (and the CLI equivalent) should have an option to deal with this scenario.

    enhancement cli plugin utils 
    opened by turicas 0
Releases(v0.4.1)
Owner
Álvaro Justen
Free/libre software hacker, hypnotist, remote worker, teacher, coffee lover/roaster