Primary QPDF source code and documentation

QPDF

QPDF is a command-line tool and C++ library that performs content-preserving transformations on PDF files. It supports linearization, encryption, and numerous other features. It can also be used for splitting and merging files, creating PDF files (but you have to supply all the content yourself), and inspecting files for study or analysis. QPDF does not render PDFs or perform text extraction, and it does not contain higher-level interfaces for working with page contents. It is a low-level tool for working with the structure of PDF files and can be a valuable tool for anyone who wants to do programmatic or command-line-based manipulation of PDF files.

The QPDF Manual is hosted online at https://qpdf.readthedocs.io.

Additional information about it can be found at https://qpdf.sourceforge.io. The source code repository is hosted at GitHub: https://github.com/qpdf/qpdf.

Verifying Distributions

The public key used to sign qpdf source distributions has fingerprint C2C9 6B10 011F E009 E6D1 DF82 8A75 D109 9801 2C7E and can be found at https://q.ql.org/pubkey.asc or downloaded from a public key server.

Copyright, License

QPDF is copyright (c) 2005-2021 Jay Berkenbilt

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

You may also see the license in the file LICENSE.txt in the source distribution.

Versions of qpdf prior to version 7 were released under the terms of version 2.0 of the Artistic License. At your option, you may continue to consider qpdf to be licensed under those terms. Please see the manual for additional information. The Artistic License appears in the file Artistic-2.0 in the source distribution.

Prerequisites

QPDF requires a C++ compiler that supports C++14.

QPDF depends on the external libraries zlib and jpeg. The libjpeg-turbo library is also known to work since it is compatible with the regular jpeg library, and QPDF doesn't use any interfaces that aren't present in the straight jpeg8 API. These are part of every Linux distribution and are readily available. Download information appears in the documentation. For Windows, you can download pre-built binary versions of these libraries for some compilers; see README-windows.md for additional details.

Depending on which crypto providers are enabled, GnuTLS and OpenSSL may also be required. This is discussed in more detail in Crypto providers below.

Licensing terms of embedded software

QPDF makes use of zlib and jpeg libraries for its functionality. These packages can be downloaded separately from their own download locations. If the optional GnuTLS or OpenSSL crypto providers are enabled, then GnuTLS and/or OpenSSL are also required.

Please see the NOTICE file for information on licenses of embedded software.

Crypto providers

As of version 9.1.0, qpdf can use different crypto implementations. These can be selected at compile time or at runtime. The native crypto implementations that were used in all versions prior to 9.1.0 are still present and enabled by default.

Initially, the following providers are available:

  • native: a native implementation where all the source is embedded in qpdf and no external dependencies are required
  • openssl: an implementation that can use the OpenSSL (or BoringSSL) libraries to provide crypto; causes libqpdf to link with the OpenSSL library
  • gnutls: an implementation that uses the GnuTLS library to provide crypto; causes libqpdf to link with the GnuTLS library

The default behavior is for ./configure to discover which other crypto providers can be supported based on available external libraries, to build all available crypto providers, and to use an external provider as the default over the native one. This behavior can be changed with the following flags to ./configure:

  • --enable-crypto-x -- (where x is a supported crypto provider): enable the x crypto provider, requiring any external dependencies it needs
  • --disable-crypto-x -- disable the x provider, and do not link against its dependencies even if they are available
  • --with-default-crypto=x -- make x the default provider even if a higher priority one is available
  • --disable-implicit-crypto -- only build crypto providers that are explicitly requested with an --enable-crypto-x option

For example, if you want to guarantee that the GnuTLS crypto provider is used, you could run ./configure with --enable-crypto-gnutls --disable-implicit-crypto.

Please see the section on crypto providers in the manual for more details.

Note about weak cryptographic algorithms

The PDF file format used to rely on RC4 for encryption. Using 256-bit keys always uses AES instead, and with 128-bit keys, you can elect to use AES. qpdf does its best to warn when someone is writing a file with weak cryptographic algorithms, but qpdf must always retain support for being able to read and even write files with weak encryption to be able to fully support older PDF files and older PDF readers.
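The rule in the paragraph above can be written down as a small decision function. This is only a sketch of the stated rule, not code from qpdf: the `sketch` namespace, the `Cipher` enum, and the functions `cipher_for` and `is_weak` are hypothetical names, and only the 128-bit and 256-bit key lengths mentioned above are modeled.

```cpp
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical enum for illustration; qpdf's own types are not used here.
enum class Cipher { RC4, AES };

// Pick the cipher implied by the note above for a given key length.
// Only 128 and 256 are modeled because those are the lengths the text names.
inline Cipher cipher_for(int key_bits, bool prefer_aes)
{
    switch (key_bits) {
    case 256:
        // 256-bit keys always use AES.
        return Cipher::AES;
    case 128:
        // With 128-bit keys, AES is something you elect to use.
        return prefer_aes ? Cipher::AES : Cipher::RC4;
    default:
        throw std::invalid_argument(
            "key length not modeled in this sketch: " + std::to_string(key_bits));
    }
}

// RC4 is the weak algorithm the note refers to.
inline bool is_weak(Cipher c)
{
    return c == Cipher::RC4;
}

} // namespace sketch
```

Under these assumptions, `sketch::is_weak(sketch::cipher_for(128, false))` is true, which corresponds to the case where the note says qpdf tries to warn.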

Building from source distribution on UNIX/Linux

For UNIX and UNIX-like systems, you can usually get by with just

./configure
make
make install

Packagers may set DESTDIR, in which case make install will install inside of DESTDIR, as is customary with many packages. Please also see the "Notes for Packagers" section of the manual.

For more detailed general information, see the "INSTALL" file in this directory. If you are already accustomed to building and installing software that uses autoconf, there's nothing new for you in the INSTALL file. Note that qpdf uses autoconf but not automake. We have our own system of Makefiles that allows cross-directory dependencies, doesn't use recursive make, and works better on non-UNIX platforms.

Building without wchar_t

Executive summary: manually define -DQPDF_NO_WCHAR_T in your build if you are building on a system without wchar_t. For details, read the rest of this section.

While wchar_t is part of the C++ standard library and should be present on virtually every system, there are some stripped down systems, such as those targeting certain embedded environments, that lack wchar_t. Internally, qpdf uses UTF-8 encoding for everything, so there is nothing important in qpdf's API that uses wchar_t. However, there is a helper method for converting between wchar_t* and char* that uses wchar_t.

If you are building in an environment that does not support wchar_t, you can define the preprocessor symbol QPDF_NO_WCHAR_T in your build. This will work whether you are building qpdf and need to avoid compiling the code that uses wchar_t or whether you are building client code that uses qpdf.

For example, to build qpdf on a system without wchar_t, be sure that -DQPDF_NO_WCHAR_T is part of your CXXFLAGS. Similar techniques will work in other places.

Note that, when you build code with libqpdf, it is not necessary to have the definition of QPDF_NO_WCHAR_T in your build match what was defined when the library was built as long as you are not calling QUtil::call_main_from_wmain in your code. In other words, if your qpdf library was built on a system without wchar_t and you are using that system to build at some later time after wchar_t was available, as long as you don't call the function that uses it, you can just build normally.

Note that qpdf will never define QPDF_NO_WCHAR_T using autoconf or any other automated method, even though it would be easy to do so. That is because there is a hard rule in qpdf that values determined by autoconf are not available in the public API. There is never a guarantee or even expectation that those values will match between the system on which qpdf was built and the system on which a user is building code with libqpdf, and qpdf's include directory should look the same across all systems.
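As a rough illustration of the kind of guard this section describes, the following sketch shows a header-style fragment in which a wchar_t-dependent helper disappears when QPDF_NO_WCHAR_T is defined. It is not qpdf's source: the `sketch` namespace and the functions `describe_build` and `narrow_ascii` are hypothetical, and only the macro name comes from the text above.

```cpp
#include <string>

// Hypothetical stand-in for a library header; not qpdf's actual code.
// Compile with -DQPDF_NO_WCHAR_T to drop the wchar_t-dependent helper.
namespace sketch {

// Always available: nothing here depends on wchar_t.
inline std::string describe_build()
{
#ifdef QPDF_NO_WCHAR_T
    return "built without wchar_t helpers";
#else
    return "built with wchar_t helpers";
#endif
}

#ifndef QPDF_NO_WCHAR_T
// Only declared when wchar_t is available. Copies ASCII-range wide
// characters as-is and replaces everything else with '?'.
inline std::string narrow_ascii(const wchar_t* ws)
{
    std::string out;
    for (; *ws != L'\0'; ++ws) {
        bool ascii = static_cast<unsigned long>(*ws) < 128UL;
        out += ascii ? static_cast<char>(*ws) : '?';
    }
    return out;
}
#endif

} // namespace sketch
```

Client code that never calls the guarded helper compiles the same way whether or not the macro is defined, which is the property the section relies on.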

Building on Windows

QPDF is known to build and pass its test suite with mingw (latest version tested: gcc 7.2.0), mingw64 (latest version tested: 7.2.0) and Microsoft Visual C++ 2015, both 32-bit and 64-bit versions. MSYS2 is required to build as well in order to get make and other related tools. See README-windows.md for details on how to build under Windows.

Building Documentation

The QPDF manual is written in reStructuredText format and is built with Sphinx. The sources to the user manual can be found in the manual directory. For more detailed information, consult the Building and Installing QPDF section of the manual or consult the build-doc script used in CI.

Additional Notes on Build

QPDF's build system can optionally use its own built-in rules rather than using libtool and obeying the compiler specified with configure. This can be enabled by passing --with-buildrules=buildrules where buildrules corresponds to one of the .mk files (other than rules.mk) in the make directory. This should never be necessary on a UNIX system, but may be necessary on a Windows system. See README-windows.md for details.

The software library is just libqpdf, and all the header files are in the qpdf subdirectories of include and libqpdf. If you link statically with -lqpdf, then you will also need to link with -lz and -ljpeg. The shared qpdf library is linked with -lz and -ljpeg, none of qpdf's public header files directly include files from libz, and only Pl_DCT.hh includes files from libjpeg, so for most cases, qpdf's development files are self contained. If you need to use Pl_DCT in your application code, you will need to have the header files for some libjpeg distribution in your include path.

To learn about using the library, please read comments in the header files in include/qpdf, especially QPDF.hh, QPDFObjectHandle.hh, and QPDFWriter.hh. These are the best sources of documentation on the API. You can also study the code of qpdf/qpdf.cc, which exercises most of the public interface. There are additional example programs in the examples directory. Reading all the source files in the qpdf directory (including the qpdf command-line tool and some test drivers) along with the code in the examples directory will give you a complete picture of every aspect of the public interface.

Additional Notes on Test Suite

By default, slow tests and tests that require dependencies beyond those needed to build qpdf are disabled. Slow tests include image comparison tests and large file tests. Image comparison tests can be enabled by passing --enable-test-compare-images to ./configure. This was on by default in qpdf versions prior to 3.0, but is now off by default. Large file tests can be enabled by passing --with-large-file-test-path=path to ./configure or by setting the QPDF_LARGE_FILE_TEST_PATH environment variable. On Windows, this should be a Windows path. Run ./configure --help for additional options. The test suite provides nearly full coverage even without these tests. Unless you are making deep changes to the library that would impact the contents of the generated PDF files or testing this on a new platform for the first time, there is no real reason to run these tests. If you're just running the test suite to make sure that qpdf works for your build, the default tests are adequate. The configure rules for these tests do nothing other than setting variables in autoconf.mk, so you can feel free to turn these on and off directly in autoconf.mk rather than rerunning configure.

If you are packaging qpdf for a distribution and preparing a build that is run by an autobuilder, you may want to add --enable-show-failed-test-output to the configure options. This way, if the test suite fails, test failure detail will be included in the build output. Otherwise, you will need access to the qtest.log file from the build to view test failures. The Debian packages for qpdf enable this option.

Random Number Generation

By default, qpdf uses the crypto provider for generating random numbers. The rest of this applies only if you are using the native crypto provider.

If the native crypto provider is in use, then, when qpdf detects either the Windows cryptography API or the existence of /dev/urandom, /dev/arandom, or /dev/random, it uses them to generate cryptographically secure random numbers. If none of these conditions are true, the build will fail with an error. This behavior can be modified in several ways:

  • If you configure with --disable-os-secure-random or define SKIP_OS_SECURE_RANDOM, qpdf will not attempt to use Windows cryptography or the random device. You must either supply your own random data provider or allow use of insecure random numbers.
  • If you configure qpdf with the --enable-insecure-random option or define USE_INSECURE_RANDOM, qpdf will try insecure random numbers if OS-provided secure random numbers are disabled. This is not a fallback. In order for insecure random numbers to be used, you must also disable OS secure random numbers since, otherwise, failure to find OS secure random numbers is a compile error. The insecure random number source is stdlib's random() or rand() calls. These random numbers are not cryptographically secure, but the qpdf library is fully functional using them. Using non-secure random numbers means that it's easier in some cases to guess encryption keys. If you're not generating encrypted files, there's no advantage to using secure random numbers.
  • In all cases, you may supply your own random data provider. To do this, derive a class from qpdf/RandomDataProvider (since version 5.1.0) and call QUtil::setRandomDataProvider before you create any QPDF objects. If you supply your own random data provider, it will always be used even if support for one of the other random data providers is compiled in. If you wish to avoid any possibility of your build of qpdf using anything but a user-supplied random data provider, you can define SKIP_OS_SECURE_RANDOM and not USE_INSECURE_RANDOM. In this case, qpdf will throw a runtime error if any attempt is made to generate random numbers and no random data provider has been supplied.
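
The last option can be sketched without linking against qpdf. Everything below is a stand-in: the `sketch` namespace, `DeviceRandomProvider`, `set_random_data_provider`, and `generate_random` are hypothetical names, and the interface (including the method name `provideRandomData`) is an assumption about what qpdf/RandomDataProvider.hh declares, so check the real header before relying on it. In real code you would derive from qpdf's class and register the object with QUtil::setRandomDataProvider.

```cpp
#include <cstddef>
#include <random>
#include <stdexcept>

namespace sketch {

// Stand-in for the interface in qpdf/RandomDataProvider.hh.
// The method name is an assumption; verify it against the real header.
class RandomDataProvider
{
  public:
    virtual ~RandomDataProvider() = default;
    virtual void provideRandomData(unsigned char* data, std::size_t len) = 0;
};

// Example provider backed by std::random_device. Whether that source is
// cryptographically secure depends on the platform, so treat it as a
// placeholder for whatever source you actually trust.
class DeviceRandomProvider : public RandomDataProvider
{
  public:
    void provideRandomData(unsigned char* data, std::size_t len) override
    {
        std::random_device rd;
        for (std::size_t i = 0; i < len; ++i) {
            data[i] = static_cast<unsigned char>(rd() & 0xffu);
        }
    }
};

// Hypothetical registry playing the role of QUtil::setRandomDataProvider.
inline RandomDataProvider*& current_provider()
{
    static RandomDataProvider* provider = nullptr;
    return provider;
}

inline void set_random_data_provider(RandomDataProvider* p)
{
    current_provider() = p;
}

// Models a build with SKIP_OS_SECURE_RANDOM defined and USE_INSECURE_RANDOM
// not defined: with no user-supplied provider, this is a runtime error.
inline void generate_random(unsigned char* data, std::size_t len)
{
    RandomDataProvider* p = current_provider();
    if (p == nullptr) {
        throw std::runtime_error("no random data provider has been supplied");
    }
    p->provideRandomData(data, len);
}

} // namespace sketch
```

The registry stores a raw pointer, so the provider object must outlive every call that generates random data. This matches the instruction above to register the provider before creating any QPDF objects.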

If you are building qpdf on a platform that qpdf doesn't know how to generate secure random numbers on, a patch would be welcome.

Comments
  • Helper api

    This pull request is here to serve as a place to discuss and review proposed API enhancements for higher level APIs as discussed in #178. This branch will be rebased and reworked multiple times. I'll push to it periodically when I have something in a reviewable state.

    opened by jberkenbilt 70
  • What am I doing wrong here?

    Hi,

    I'm programming against the JSON interface and trying to encrypt a file with these settings, but that always fails. Do you know what I'm doing wrong?

    {
      "inputFile": "TestFiles\test.pdf",
      "outputFile": "C:\\3eb6fc52-800f-4c32-a5fb-9ad1709d5260\\output_encryption_256_bit.pdf",
      "linearize": "",
      "encrypt": {
        "256bit": {
          "accessibility": "y",
          "annotate": "y",
          "assemble": "y",
          "extract": "y",
          "form": "y",
          "modifyOther": "y",
          "modify": "all",
          "print": "full",
          "cleartextMetadata": "y"
        },
        "userPassword": "user",
        "ownerPassword": "owner"
      }
    }
    
    opened by Sicos1977 63
  • Feature request: page "stamping"

    We would like to stamp PDF pages with pages from another file. By stamping I mean "pasting" pages from one pdf on top of another pdf. See the example on the image, that represents stamping B.pdf on top of A.pdf to obtain AB.pdf.

    [Image: stamping]

    I'm envisioning something along the lines of:

    qpdf A.pdf -stamp B.pdf -output AB.pdf
    

    We are currently using pdftk for that task, but we'd love to get rid of it. I don't know whether this is a relatively easy operation or a hard one though.

    next 
    opened by kilburn 43
  • Can the underlay option be used from the C++ api

    Hi,

    Just one question: can the underlay option be called somehow from the C++ API? I want to do it from code instead of calling qpdf.exe (on Windows).

    https://github.com/qpdf/qpdf/blob/master/include/qpdf/qpdf-c.h

    enhancement next 
    opened by Sicos1977 30
  • slowness merging files in 8.1.0 on cifs share

    I am running qpdf on CentOS 7 to merge files that are located on a cifs share (mounted in linux with mount.cifs).

    Starting in version 8.1.0, I am seeing slowness merging files on the cifs share, while versions 8.0.0 and earlier are fine. Merging the file on the native linux partition runs fine. It's only slow when merging on the cifs share.

    I'm just merging a simple pdf file like the attached with:

    time qpdf --empty --pages input.pdf 1-z -- output.pdf

    The input file is the attached input.pdf.

    Could something have changed in 8.1.0 to cause this? I printed my results below. Please let me know if I can give more information to help narrow this down.

    Thanks for your help.

    qpdf 8.0.0 with input.pdf on linux partition:
    real    0m0.052s
    user    0m0.020s
    sys     0m0.033s
    
    qpdf 8.0.0 with input.pdf on cifs share:
    real    0m0.057s
    user    0m0.017s
    sys     0m0.035s
    
    qpdf 8.1.0 with input.pdf on linux partition:
    real    0m0.082s
    user    0m0.037s
    sys     0m0.049s
    
    qpdf 8.1.0 with input.pdf on cifs share:
    real    0m1.094s
    user    0m0.024s
    sys     0m0.393s
    
    bug next 
    opened by rbro 25
  • Tune QPDFObjectHandle::parseInternal

    Summary of changes:

    To facilitate refactoring, the static parseInternal method together with the related setObjectDescriptionFromInput method were moved to a new QPDFParser class.

    To reduce code duplication and improve readability new private warn methods were added to the QPDFParser class.

    To improve readability and improve performance, the olist, offset, contents_string and contents_offset stacks were combined into a single stack. To improve performance, both the new stack as well as the existing state stack were implemented as std::vectors rather than as SparseOHArrays.

    Treatment of null objects was tuned by recognizing that QPDF_Arrays and QPDF_Dictionarys do not store direct null objects. They discard the null object and "store" them as a missing object instead. Creation of new distinct null objects for each direct null encountered, or setting descriptions/offsets for them, was eliminated. With the change, the impact of parsing a null object is the addition of one object handle to olist, a vector of object handles. No new null objects are created.

    Once parsed, arrays continue to be stored as SparseOHArrays. This maintains the overwhelming majority of the benefits created by SparseOHArray.

    opened by m-holger 24
  • Qpdf throwing the range conversion error

    I am trying to encrypt a PDF file, but qpdf throws a range error: "integer out of range converting 4294967295 from a 8-byte signed type to a 4-byte signed type". Please clarify why it appears and how to fix it.

    @jberkenbilt

    TIA

    opened by santoshturamari 24
  • A hangs close to ten minutes in qpdf

    Hi, I found something that may be wrong in the newest qpdf. The PoC file (attached as poc.pdf) causes the program to hang for about ten minutes. Is this a bug or a feature?

    I found that it may be caused by unparseObject in libqpdf/QPDFWriter.cc. Here is part of the backtrace:

    #0  QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x25, flags=0x0) at libqpdf/QPDFWriter.cc:1182
    #1  0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #2  0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #3  0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x24, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #4  0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #5  0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #6  0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x23, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #7  0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #8  0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #9  0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x22, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #10 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #11 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #12 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x21, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #13 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #14 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #15 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x20, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #16 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #17 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #18 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x1f, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #19 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #20 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #21 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x1e, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #22 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #23 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #24 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x1d, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #25 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #26 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #27 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x1c, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #28 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #29 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #30 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x1b, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #31 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #32 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #33 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x1a, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #34 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #35 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #36 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x19, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #37 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #38 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #39 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x18, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #40 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #41 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #42 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x17, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #43 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #44 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #45 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x16, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #46 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #47 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #48 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x15, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #49 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #50 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #51 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x14, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #52 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #53 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #54 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x13, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #55 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #56 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #57 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x12, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #58 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #59 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #60 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x11, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #61 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #62 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #63 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x10, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #64 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #65 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #66 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0xf, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #67 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #68 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #69 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0xe, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #70 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #71 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #72 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0xd, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #73 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #74 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #75 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0xc, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #76 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #77 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #78 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0xb, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #79 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #80 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #81 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0xa, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #82 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #83 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #84 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x9, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #85 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #86 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #87 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x8, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #88 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #89 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #90 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x7, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #91 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #92 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #93 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x6, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #94 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #95 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #96 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x5, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #97 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #98 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #99 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x4, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #100 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #101 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #102 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x3, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #103 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #104 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #105 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x2, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #106 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #107 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #108 0x00007ffff7a4dcf0 in QPDFWriter::unparseChild (this=0x7fffffffe070, child=..., level=0x1, flags=0x0) at libqpdf/QPDFWriter.cc:1195
    #109 0x00007ffff7a62cf0 in QPDFWriter::unparseObject (this=0x7fffffffe070, object=..., level=<optimized out>, flags=0x0, stream_length=0x0,
        compress=0x0) at libqpdf/QPDFWriter.cc:1544
    #110 0x00007ffff7a4e4cb in QPDFWriter::unparseObject (this=0x7ffff7fe8030, object=..., level=0x64edd0, flags=0xe799) at libqpdf/QPDFWriter.cc:1310
    #111 0x00007ffff7a7b8b8 in QPDFWriter::writeObject (this=0x7fffffffe070, object=..., object_stream_index=<optimized out>)
        at libqpdf/QPDFWriter.cc:1967
    #112 0x00007ffff7a90c17 in QPDFWriter::writeStandard (this=0x7fffffffe070) at libqpdf/QPDFWriter.cc:3397
    #113 0x00007ffff7a8580f in QPDFWriter::write (this=0x7fffffffe070) at libqpdf/QPDFWriter.cc:2526
    #114 0x0000000000412eb2 in write_outfile (pdf=..., o=...) at qpdf/qpdf.cc:2618
    #115 main (argc=<optimized out>, argv=<optimized out>) at qpdf/qpdf.cc:2700
    #116 0x00007ffff694ad20 in __libc_start_main () from /lib64/libc.so.6
    #117 0x0000000000405ba9 in _start ()
    

    Looking forward to your reply, thanks :)

    opened by Krace 24
  • first page object not in lc_first_page_private

    I'm trying to extract images from a PDF (unfortunately can't share the file) using the pdfimages 4.00 32-bit Windows binary from the xpdf package. The PDF itself can be viewed without any problems using any of my preferred viewers.

    The output of pdfimages -j in.pdf "prefix" gives me the following errors (once per page):

    Syntax Error: Page tree reference is wrong type (dictionary)
    Syntax Error: Invalid page count in page tree

    Clearly there's something wrong with the way the PDF has been authored. I tried using qpdf from the qpdf-7.0.0-bin-msvc32.zip package as follows to try and fix the file:

    qpdf in.pdf out.pdf

    This does create out.pdf, however pdfimages still throws the same errors with the new file as it did with the original.

    I then tried linearizing the file as follows, hoping that would fix it:

    qpdf --linearize in.pdf out.pdf

    This creates only a 0 (zero) byte out.pdf and throws the following error:

    INTERNAL ERROR: QPDF::calculateLinearizationData: first page object not in lc_first_page_private

    So what does this error mean anyway, and is there no way qpdf can fix this file? Is there anything else I can try, or perhaps something qpdf can do better in such a situation?

    linearization 
    opened by SumatraPeter 24
  • QPDF using wmain on Windows and paths with e.g. German umlauts.

    I tried to open a PDF file whose name contains a German umlaut and got the following error when running qpdf from cmd.exe:

    C:\Users\[...]>qpdf --check [...]_Bärbel_[...].pdf
    open [...]_Bärbel_[...].pdf: No such file or directory
    

    [...] is used to strip the file name to the most important details. I compiled qpdf using its support for wmain and all of your tests passed.

    Looking at the code, you are converting to UTF-8 and forwarding internally. Does that mean you are forwarding UTF-8 encoded paths to open as well? If so, the error is expected: UTF-8 encoded file names can't be found this way on Windows. One needs to pass the path either in the code page currently in use by the system or as Unicode (UTF-16), which means calling _wopen instead of _open.

    https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/open-wopen?view=vs-2017

    Mapping UTF-8 into the file system works on most non-Windows systems because their file systems simply store file names as sequences of bytes, which are most likely UTF-8 encoded. Windows/NTFS doesn't do so; it stores file names as UTF-16 code units.

    https://stackoverflow.com/questions/2050973/what-encoding-are-filenames-in-ntfs-stored-as

    The reason to switch to use wmain was that some of your tests are passing Unicode chars to qpdf, which failed without wmain, but succeeded with. But how you handle those arguments internally is totally up to you, while forwarding UTF-8 to open involves the file system and such.
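The conversion step discussed above can be sketched in portable C++. This is only an illustration, not qpdf's code: `utf8_to_utf16` is a hypothetical helper that performs minimal validation (it does not reject overlong encodings). On Windows, where `wchar_t` is 16 bits wide, the resulting code units are what a wide-character call such as `_wopen` or `_wfopen` expects; the Win32 function `MultiByteToWideChar` is the usual way to do the same conversion there.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode UTF-8 and re-encode it as UTF-16.
// Throws on truncated input and on bad lead or continuation bytes.
std::u16string utf8_to_utf16(std::string const& in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0; // number of continuation bytes that follow
        if (lead < 0x80) {
            cp = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            cp = lead & 0x1F;
            extra = 1;
        } else if ((lead & 0xF0) == 0xE0) {
            cp = lead & 0x0F;
            extra = 2;
        } else if ((lead & 0xF8) == 0xF0) {
            cp = lead & 0x07;
            extra = 3;
        } else {
            throw std::invalid_argument("invalid UTF-8 lead byte");
        }
        if (i + extra >= in.size()) {
            throw std::invalid_argument("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;
        if (cp > 0x10FFFF) {
            throw std::invalid_argument("code point out of range");
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            assert(cp <= 0xFFFFF); // fits in 20 bits after the range check
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

With this helper, the two UTF-8 bytes that encode "ä" in "Bärbel" become the single UTF-16 code unit 0x00E4, which is the form NTFS stores.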

    bug next 
    opened by ams-tschoening 23
  • Syntax for merging entire PDF files

    If I needed to merge the entire a1.pdf and a2.pdf into b.pdf, is the best way to do it with:

    qpdf --empty --pages a1.pdf 1-z a2.pdf 1-z -- b.pdf

    Is there a way it can be called without specifying 1-z for each file? It would be ideal if I could call it with something like:

    qpdf --empty --pages a1.pdf a2.pdf -- b.pdf

    which would then allow me to call it with wildcards like:

    qpdf --empty --pages a*.pdf -- b.pdf

    Thanks for your help.
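As a workaround sketch for the explicit form shown in this question, a caller could generate the argument list from a set of file names (for example, after expanding a wildcard itself). `merge_args` is a hypothetical helper, not part of qpdf; it only builds the arguments of the first command above and does not run anything.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: build the arguments for
//   qpdf --empty --pages <file> 1-z [<file> 1-z ...] -- <output>
std::vector<std::string> merge_args(
    std::vector<std::string> const& inputs, std::string const& output)
{
    assert(!inputs.empty()); // merging needs at least one input file
    std::vector<std::string> args{"qpdf", "--empty", "--pages"};
    for (auto const& file : inputs) {
        args.push_back(file);
        args.push_back("1-z"); // explicit "all pages" range for this file
    }
    args.push_back("--");
    args.push_back(output);
    return args;
}
```

Calling `merge_args({"a1.pdf", "a2.pdf"}, "b.pdf")` yields the same arguments as the first command in the question.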

    opened by rbro 22
  • Add private namespace QUtil_12

    This is an alternative solution to adding temporary functions to QUtil in #864. Instead, this proposes to create a complete private copy of QUtil that can be incrementally updated for qpdf 12 and used in its entirety to replace QUtil when it comes to the ABI update.

    On balance this feels like a cleaner solution, both making it easier to add additional tweaks to QUtil that may be required to allow other parts of qpdf to take advantage of new features of C++17, and making it easier to finally switch to qpdf 12.

    It is also about 1% faster than the equivalent commits in #864.

    opened by m-holger 0
  • Background to PR863

    (very early draft of a proposal for a qpdf 12 enhancement)

    Add array style methods to QPDFObjectHandle that work for all object types

    It is common for dictionary entries that allow for a list of objects to also allow a single object of the same type, and possibly a null object. For example, the /Filter entry in a stream dictionary is specified as an optional name, or an array of 0, 1, or more names.

    Users of the qpdf library should not need to be concerned with this detail. Instead, they should be able to process the result of getKey("/Filter") as if it were an array of names.

    ...

    As a demo of the concept I have implemented simple size, at, begin and end methods and used them with qpdf itself. This is to test whether the methods are practical and sufficiently efficient for general use (but not necessarily for performance critical parts of qpdf itself).
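The idea can be illustrated with a toy object model. This is a sketch under stated assumptions, not the code from the PR and not the QPDFObjectHandle API: `Object`, `item_count`, and `item_at` are hypothetical names, and the model only covers a null, a single name, or an array of names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Toy model of a value such as /Filter: null, one name, or an array of names.
struct Null {};
using Name = std::string;
using NameArray = std::vector<Name>;
using Object = std::variant<Null, Name, NameArray>;

// Array-style size: null counts as empty, a single name as one element.
std::size_t item_count(Object const& obj)
{
    if (std::holds_alternative<Null>(obj)) {
        return 0;
    }
    if (std::holds_alternative<Name>(obj)) {
        return 1;
    }
    return std::get<NameArray>(obj).size();
}

// Array-style element access that works for a single name as well.
Name const& item_at(Object const& obj, std::size_t index)
{
    assert(index < item_count(obj)); // caller must stay within bounds
    if (auto const* name = std::get_if<Name>(&obj)) {
        return *name;
    }
    return std::get<NameArray>(obj).at(index);
}
```

A caller can then loop from 0 to `item_count(obj)` without checking which of the three forms the entry actually took.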

    opened by m-holger 0
  • Refactor QPDFTokenizer

    Reduce copying of strings by:

    • use of std::string_view
    • retrieving results on the fly as needed instead of copying them into Token objects
    • not constructing both QPDFTokenizer::val and raw_val when they are identical
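The first point can be illustrated with a simplified splitter. This is only a sketch of the general std::string_view technique, not QPDFTokenizer's logic: `split_tokens` is a hypothetical function that treats only space, tab, CR, and LF as delimiters and knows nothing about PDF syntax.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: split input on whitespace. Each token is a view into
// `input`, so no character data is copied; the underlying buffer must outlive
// the returned views.
std::vector<std::string_view> split_tokens(std::string_view input)
{
    constexpr std::string_view whitespace = " \t\r\n";
    std::vector<std::string_view> tokens;
    std::size_t pos = 0;
    while ((pos = input.find_first_not_of(whitespace, pos)) != std::string_view::npos) {
        std::size_t end = input.find_first_of(whitespace, pos);
        if (end == std::string_view::npos) {
            end = input.size();
        }
        assert(end > pos); // a token always has at least one character
        tokens.push_back(input.substr(pos, end - pos));
        pos = end;
    }
    return tokens;
}
```

Because every token points into the original buffer, the only allocation is the vector of views itself.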
    opened by m-holger 1
  • Refactor QPDF_Array

    The main changes are:

    • Merge SparseOHArray into QPDF_Array given that they are closely coupled and that there is no longer a reason to keep them separate.
    • Change index type to int in order to avoid unnecessary conversions.
    • Change underlying data structure to std::map as it turned out to be on balance more efficient here.
    • Make most methods virtual in order to avoid costly dynamic casting.
    • Eliminate repeated checks. Most methods now return false or an uninitialized object handle to indicate failure rather than throwing an exception.
    • Reduce copying by using move semantics and by shifting elements in place during insert/erase.
    • Move some methods from QPDFObjectHandle to QPDFValue for efficiency.
    • Add additional ownership checks and object warnings.

    I have renamed most methods to be more in line with std::vector. I have also dropped the "this->". Given that there are only two data members and the methods are all short I do not feel that "this->" improves readability in this case.
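To make the design points more concrete, here is a minimal sketch of a sparse array backed by std::map with int indices and failure reported through return values. It is not the code from this PR: `SparseArray` is a hypothetical class, it stores strings instead of object handles, and for simplicity it rebuilds the map on insert and erase rather than shifting elements in place.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical sparse array: only non-null slots are stored in the map.
class SparseArray
{
  public:
    explicit SparseArray(int size) : size_(size) { assert(size >= 0); }

    int size() const { return size_; }

    // Returns std::nullopt for a null slot or a bad index instead of throwing.
    std::optional<std::string> at(int index) const
    {
        auto it = items_.find(index);
        if (it == items_.end()) {
            return std::nullopt;
        }
        return it->second;
    }

    bool set(int index, std::string value)
    {
        if (index < 0 || index >= size_) {
            return false;
        }
        items_[index] = std::move(value);
        return true;
    }

    // Insert before `index`; later elements move up by one.
    bool insert(int index, std::string value)
    {
        if (index < 0 || index > size_) {
            return false;
        }
        std::map<int, std::string> shifted;
        for (auto& [key, item] : items_) {
            shifted.emplace(key < index ? key : key + 1, std::move(item));
        }
        shifted.emplace(index, std::move(value));
        items_ = std::move(shifted);
        ++size_;
        return true;
    }

    // Remove the slot at `index`; later elements move down by one.
    bool erase(int index)
    {
        if (index < 0 || index >= size_) {
            return false;
        }
        std::map<int, std::string> shifted;
        for (auto& [key, item] : items_) {
            if (key != index) {
                shifted.emplace(key < index ? key : key - 1, std::move(item));
            }
        }
        items_ = std::move(shifted);
        --size_;
        return true;
    }

  private:
    int size_;
    std::map<int, std::string> items_;
};
```

An array declared with 1000 slots but only two non-null entries stores just those two map nodes, which is the case a sparse representation is meant to serve.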

    opened by m-holger 10
  • Joining multiple PDF with attachments produces a PDF without any attachments

    If I join many PDF files, with or without attachments, the result is a PDF containing the pages of the original files in sequence, but all attachments are lost. If I use the Poppler tool pdfunite for the same operation, the result is one PDF with all the original attachments. Maybe this qpdf behaviour is by design. Thank you. Vincenzo

    pages 
    opened by vf1962 3
Releases(v11.2.0)