Layout Parser is a deep learning based tool for document image layout analysis tasks.

Overview


Layout Parser is a deep learning based tool for document image layout analysis tasks.

Installation

Use pip or conda to install the library:

pip install layoutparser

# Install Detectron2 for using DL Layout Detection Model
# Please make sure the PyTorch version is compatible with
# the installed Detectron2 version. 
pip install 'git+https://github.com/facebookresearch/detectron2.git#egg=detectron2' 

# Install the OCR components when necessary
# (the quotes keep shells such as zsh from expanding the brackets)
pip install "layoutparser[ocr]"

By default, this installs the CPU version of Detectron2, which should run on most computers. If you have a GPU, consider installing the GPU version of Detectron2 by following the official instructions.
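Because Detectron2 and the OCR components are optional, it can help to check which packages are importable before loading a model. The following is a minimal sketch that uses only the standard library. The helper name `is_importable` is hypothetical, and the module names in the loop (in particular `pytesseract`) are assumptions about what a given setup needs.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# The module names below are assumptions; adjust them to the backends you installed.
for name in ("layoutparser", "detectron2", "torch", "pytesseract"):
    status = "found" if is_importable(name) else "missing"
    print(f"{name}: {status}")
```

A "missing" entry only means that the package is not visible to the current Python interpreter; it does not say anything about version compatibility.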

Quick Start

We provide a series of examples to help you start using the layoutparser library:

  1. Table OCR and Results Parsing: layoutparser can be used to conveniently OCR documents and convert the output into structured data.

  2. Deep Layout Parsing Example: With the help of deep learning, layoutparser supports the analysis of very complex documents and the processing of the hierarchical structure in the layouts.

DL Assisted Layout Prediction Example

Example Usage

The images shown in the figure above are: a screenshot of this paper, an image from the PRIMA Layout Analysis Dataset, a screenshot of the WSJ website, and an image from the HJDataset.

With only 4 lines of code in layoutparser, you can unlock information from complex documents that existing tools could not provide. You can either choose a deep learning model from the ModelZoo or load a model that you trained on your own. Then use the following code to predict the layout and visualize it:

>>> import layoutparser as lp
>>> model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config')
>>> layout = model.detect(image) # You need to load the image somewhere else, e.g., image = cv2.imread(...)
>>> lp.draw_box(image, layout,) # With extra configurations
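The four lines above assume that `image` already exists. A slightly fuller sketch that also loads the image might look like the following. The function name `detect_and_draw` is hypothetical; the calls themselves (`cv2.imread`, `lp.Detectron2LayoutModel`, `model.detect`, `lp.draw_box` with `box_width`) are the ones used elsewhere on this page, and the sketch assumes that both `opencv-python` and Detectron2 are installed.

```python
def detect_and_draw(image_path, config_path="lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config"):
    """Load an image, detect its layout, and return (layout, visualization)."""
    # Imported lazily so that defining this function does not require
    # the heavy dependencies to be installed.
    import cv2
    import layoutparser as lp

    image = cv2.imread(image_path)
    if image is None:
        # cv2.imread returns None instead of raising when the file cannot be read.
        raise FileNotFoundError(f"Could not read image: {image_path}")
    image = image[..., ::-1]  # cv2 loads BGR; convert to RGB

    model = lp.Detectron2LayoutModel(config_path)
    layout = model.detect(image)
    return layout, lp.draw_box(image, layout, box_width=3)
```

Calling `detect_and_draw("page.png")` (with a path of your own) downloads the model on first use, so the first call can take noticeably longer than later ones.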

Citing layoutparser

If you find layoutparser helpful to your work, please consider citing our tool and paper using the following BibTeX entry.

@article{shen2021layoutparser,
  title={LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis},
  author={Shen, Zejiang and Zhang, Ruochen and Dell, Melissa and Lee, Benjamin Charles Germain and Carlson, Jacob and Li, Weining},
  journal={arXiv preprint arXiv:2103.15348},
  year={2021}
}
Comments
  • Apply detect() on readable PDF files

    Hi there, from the docs I infer that detect() operates, for example, on PIL.Image objects. Is there a way to operate directly on already readable PDF files (which would obviate the need to apply OCR as well)? Greetings

    enhancement 
    opened by simonschoe 12
  • AttributeError: module layoutparser has no attribute Detectron2LayoutModel

    Hi,

    Thank you for this awesome program! I successfully installed layout-parser and Detectron2 on my Windows 10 laptop. When I run the following code:

    import layoutparser as lp
    import cv2
    import numpy as np
    from pdf2image import convert_from_bytes

    # Raw string so that backslashes in the Windows path are not treated as escapes
    images = convert_from_bytes(open(r'C:\temp\ConsigneeList\Doc 4 Distribution List.pdf', 'rb').read())

    model = lp.Detectron2LayoutModel(
        config_path='lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config',  # In model catalog
        label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},  # In model label_map
        extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],  # Optional
    )

    # Loop through each page
    for image in images:
        ocr_agent = lp.ocr.TesseractAgent()
        image = np.array(image)
        layout = model.detect(image)
        text_blocks = lp.Layout([b for b in layout if b.type == 'Text'])

        # Loop through each text box on the page
        for block in text_blocks:
            segment_image = (block
                             .pad(left=5, right=5, top=5, bottom=5)
                             .crop_image(image))
            text = ocr_agent.detect(segment_image)
            block.set(text=text, inplace=True)

        for i, txt in enumerate(text_blocks.get_texts()):
            my_file = open("OUTPUT FILE PATH/FILENAME.TXT", "a+")
            my_file.write(txt)
    

    I get the following errors:


    AttributeError                            Traceback (most recent call last)
    in
    ----> 1 model = lp.Detectron2LayoutModel(
          2     config_path ='lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config', # In model catalog
          3     label_map = {0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In modellabel_map
          4     extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional
          5 )

    C:\ProgramData\Anaconda3\lib\site-packages\layoutparser\file_utils.py in getattr(self, name)
        224             value = getattr(module, name)
        225         else:
    --> 226             raise AttributeError(f"module {self.name} has no attribute {name}")
        227
        228         setattr(self, name, value)

    AttributeError: module layoutparser has no attribute Detectron2LayoutModel

    Any ideas on what is wrong? Thank you!!

    Sincerely,

    tom

    bug 
    opened by theiman112860 10
  • 'GCVAgent' object has no attribute '_client'

    Hi, I was running the tutorial "OCR tables and parse the output" and tried to obtain the result:

    res = ocr_agent.detect(image, return_response=True)

    The response was

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/layoutparser/ocr/gcv_agent.py", line 168, in detect
        res = self._detect(img_content)
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/layoutparser/ocr/gcv_agent.py", line 134, in _detect
        response = self._client.document_text_detection(
    AttributeError: 'GCVAgent' object has no attribute '_client'

    I googled and some sites said that the Client() class was removed in the Client Library v0.25.1 and replaced with ImageAnnotatorClient().

    Is this the problem? Thank you.

    bug 
    opened by junxi-liu 8
  • Error installing dependencies

    Hi Team, Thank you for all the great work. It looks amazing. I tried running pip install layoutparser, but it threw the error below. Can you please let me know how to fix this?

    ERROR: Command errored out with exit status 1: command: 'C:\Program Files\Anaconda\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"'; file='"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-wheel-awmfv0cr' cwd: C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632
    Complete output (22 lines): running bdist_wheel running build running build_py creating build creating build\lib.win-amd64-3.8 creating build\lib.win-amd64-3.8\pycocotools copying pycocotools\coco.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\cocoeval.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\mask.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools_init_.py -> build\lib.win-amd64-3.8\pycocotools running build_ext cythoning pycocotools/_mask.pyx to pycocotools_mask.c C:\Users\pss.ch\AppData\Roaming\Python\Python38\site-packages\Cython\Compiler\Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\pycocotools_mask.pyx tree = Parsing.p_module(s, pxd, full_module_name) building 'pycocotools._mask' extension creating build\temp.win-amd64-3.8 creating build\temp.win-amd64-3.8\Release creating build\temp.win-amd64-3.8\Release\common creating build\temp.win-amd64-3.8\Release\pycocotools C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Users\pss.ch\AppData\Roaming\Python\Python38\site-packages\numpy\core\include -I./common "-IC:\Program Files\Anaconda\include" "-IC:\Program Files\Anaconda\include" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\ATLMFC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /Tc./common/maskApi.c /Fobuild\temp.win-amd64-3.8\Release./common/maskApi.obj -Wno-cpp -Wno-unused-function -std=c99 cl : Command line error D8021 : invalid numeric argument '/Wno-cpp' error: 
command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe' failed with exit status 2

    ERROR: Failed building wheel for pycocotools ERROR: Command errored out with exit status 1: command: 'C:\Program Files\Anaconda\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"'; file='"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-record-w4euj5sb\install-record.txt' --single-version-externally-managed --user --prefix= --compile --install-headers 'C:\Users\pss.ch\AppData\Roaming\Python\Python38\Include\pycocotools' cwd: C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632
    Complete output (20 lines): running install running build running build_py creating build creating build\lib.win-amd64-3.8 creating build\lib.win-amd64-3.8\pycocotools copying pycocotools\coco.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\cocoeval.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\mask.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools_init_.py -> build\lib.win-amd64-3.8\pycocotools running build_ext skipping 'pycocotools_mask.c' Cython extension (up-to-date) building 'pycocotools._mask' extension creating build\temp.win-amd64-3.8 creating build\temp.win-amd64-3.8\Release creating build\temp.win-amd64-3.8\Release\common creating build\temp.win-amd64-3.8\Release\pycocotools C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\AppData\Roaming\Python\Python38\site-packages\numpy\core\include -I./common "-IC:\Program Files\Anaconda\include" "-IC:\Program Files\Anaconda\include" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\ATLMFC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /Tc./common/maskApi.c /Fobuild\temp.win-amd64-3.8\Release./common/maskApi.obj -Wno-cpp -Wno-unused-function -std=c99 cl : Command line error D8021 : invalid numeric argument '/Wno-cpp' error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe' failed with exit status 2 ---------------------------------------- ERROR: Command errored out with exit status 1: 'C:\Program Files\Anaconda\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = 
'"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"'; file='"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-record-w4euj5sb\install-record.txt' --single-version-externally-managed --user --prefix= --compile --install-headers 'C:\AppData\Roaming\Python\Python38\Include\pycocotools' Check the logs for full command output.

    opened by sriprad 8
  • enforce_cpu not working

    When enforce_cpu is set to True, the model still uses CUDA instead of the CPU. I think it is due to this line: https://github.com/Layout-Parser/layout-parser/blob/e035fc8f952addc620670e5b47864fe213db0e10/src/layoutparser/models/layoutmodel.py#L120

    Possible fix could be cfg.MODEL.DEVICE = "cuda" if torch.cuda.is_available() and (not enforce_cpu) else "cpu"
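The device-selection rule proposed above can be checked in isolation. The sketch below restates the same boolean expression as a plain function; the name `select_device` is hypothetical, and `cuda_available` stands in for the result of `torch.cuda.is_available()` so that the logic can be exercised without PyTorch installed.

```python
def select_device(cuda_available: bool, enforce_cpu: bool) -> str:
    """Pick "cuda" only when a GPU is available and the CPU is not enforced."""
    return "cuda" if cuda_available and not enforce_cpu else "cpu"


# Walk through all four combinations of the two flags.
for cuda_available in (False, True):
    for enforce_cpu in (False, True):
        device = select_device(cuda_available, enforce_cpu)
        print(f"cuda_available={cuda_available}, enforce_cpu={enforce_cpu} -> {device}")
```

Only the combination "CUDA available and enforce_cpu False" selects the GPU; every other combination falls back to the CPU, which is the behaviour the issue asks for.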

    bug 
    opened by lkluo 5
  • Adding support for mathematical formula recognition

    Have you considered adding support for mathematical formula recognition? Identifying the position of mathematical formulas in documents has always been a problem.

    modeling 
    opened by SleepyCelery 5
  • draw_box draw only one box from layout

    Describe the bug I just installed everything according to the installation guide and launched your Jupyter notebook, the Deep Layout Parsing Example. After the first draw_box call, it shows only one box, but print(layout) shows all boxes. The same happens with the second draw_box call from your guide. I'm not sure what I'm doing wrong.

    To Reproduce Steps to reproduce the behavior:

    1. installation guide + detectron2 install also from your guide
    2. Run jupyter notebook

    Environment

    1. MacOS
    2. VS Code
    3. Here some stuff from pip:
    torch==1.11.0
    torchvision==0.12.0
    Pillow==9.1.0
    opencv-python==4.5.5.64
    layoutparser==0.3.3
    

    Error traceback No errors; the behaviour just differs from what is shown in this guide and in other guides.

    Screenshots attached

    bug 
    opened by Moo1234567 4
  • Gives wrong results when the code is run for some images in a loop

    The code works when it is run on a single image. But when I run the same code in a loop over a few images from the PubLayNet dataset, cached results seem to be applied (i.e., the bounding boxes overlap, and the boxes from the previous images are also drawn on the current image).

    opened by surajsubramanian 4
  • ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.7/dist-packages/PIL/_util.py)

    While using this code, I get this error from Pillow. I tried reinstalling Pillow but am still struggling with this issue. Any help to make this code run?

    import layoutparser as lp
    model = lp.Detectron2LayoutModel(
                config_path ='lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config', # In model catalog
                label_map   ={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In model`label_map`
                extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional
            )
    model.detect(image)
    

    Getting this error:

    ImportError                               Traceback (most recent call last)
    <ipython-input-6-59f0fb07b7e3> in <module>
          1 import layoutparser as lp
    ----> 2 model = lp.Detectron2LayoutModel(
          3             config_path ='lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config', # In model catalog
          4             label_map   ={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In model`label_map`
          5             extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional
    
    31 frames
    /usr/local/lib/python3.7/dist-packages/PIL/ImageFont.py in <module>
         35 from . import Image
         36 from ._deprecate import deprecate
    ---> 37 from ._util import is_directory, is_path
         38 
         39 
    
    ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.7/dist-packages/PIL/_util.py)
    
    
    opened by arhamshah 3
  • TypeError: inner() got an unexpected keyword argument 'image_context'

    Hello! Recently encountered an issue when trying to use Google's OCR when running ocr_agent.detect

    Running this:

    image = cv2.imread("/Users/liz/Documents/Projects/LayoutParser/test2.png")
    ocr_agent = lp.GCVAgent.with_credential("/Users/liz/Documents/Projects/Keys/GoogleCloud/vision-341523-e3cbd0df8d19.json",languages = ['en'])
    res = ocr_agent.detect(image, return_response=True)
    

    Gives me the following error:

    TypeError                                 Traceback (most recent call last)
    <ipython-input-9-76614ef6a3e8> in <module>
          1 image = cv2.imread("/Users/liz/Documents/Projects/LayoutParser/test2.png")
          2 ocr_agent = lp.GCVAgent.with_credential("/Users/liz/Documents/Projects/Keys/GoogleCloud/vision-341523-e3cbd0df8d19.json",languages = ['en'])
    ----> 3 res = ocr_agent.detect(image, return_response=True)
          4 
          5 #layout = ocr_agent.gather_full_text_annotation(res, agg_level=lp.GCVFeatureType.WORD)
    
    /opt/homebrew/Caskroom/miniforge/base/envs/data310/lib/python3.9/site-packages/layoutparser/ocr.py in detect(self, image, return_response, return_only_text, agg_output_level)
        222                 img_content = image_file.read()
        223 
    --> 224         res = self._detect(img_content)
        225 
        226         if return_response:
    
    /opt/homebrew/Caskroom/miniforge/base/envs/data310/lib/python3.9/site-packages/layoutparser/ocr.py in _detect(self, img_content)
        188     def _detect(self, img_content):
        189         img_content = self._vision.types.Image(content=img_content)
    --> 190         response = self._client.document_text_detection(
        191             image=img_content, image_context=self._context
        192         )
    
    TypeError: inner() got an unexpected keyword argument 'image_context'
    

    I'm not sure what causes it. It might be user error, but I haven't been able to find anything else about it, and I've tried everything I can think of (all the packages are up to date or, in Google Cloud Vision's case, downgraded to stay on the old API). Thanks!

    bug 
    opened by liz-goodwin 3
  • bad result detected

    I got a bad result using layout-parser. Here is the image I used: 1

    Here is the code I ran in Python:

    image = cv2.imread("1.png")
    # Convert the image from BGR (cv2 default loading style)
    # to RGB
    image = image[..., ::-1]
    origin_image = image.copy()
    
    model = lp.Detectron2LayoutModel('lp://PubLayNet/mask_rcnn_R_50_FPN_3x/config', 
                                 extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                 label_map={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"})
    # Load the deep layout model from the layoutparser API 
    # For all the supported model, please check the Model 
    # Zoo Page: https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html
    
    layout = model.detect(image)
    # print("layout : ", layout)
    # Detect the layout of the input image
    text_blocks = lp.Layout([b for b in layout if b.type=='Text'])
    drawRectangleInImage(origin_image, text_blocks, (36,255,12))
    
    titles_blocks = lp.Layout([b for b in layout if b.type=='Title'])
    drawRectangleInImage(origin_image, titles_blocks, (76, 155, 175))
    
    figure_blocks = lp.Layout([b for b in layout if b.type=='Figure'])
    drawRectangleInImage(origin_image, figure_blocks, (122, 96, 216))
    
    lists_blocks = lp.Layout([b for b in layout if b.type=='List'])
    drawRectangleInImage(origin_image, lists_blocks, (176, 155, 175))
    
    tables_blocks = lp.Layout([b for b in layout if b.type=='Table'])
    drawRectangleInImage(origin_image, tables_blocks, (76, 255, 75))
    
    cv2.imshow('image', origin_image)
    cv2.waitKey()
    

    Here is the result:

    Screenshot 2022-01-18 11 45 06

    By the way, some warnings were generated:

    /usr/local/lib/python3.9/site-packages/detectron2/structures/image_list.py:99: UserWarning: floordiv is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
      max_size = (max_size + (stride - 1)) // stride * stride
    /usr/local/lib/python3.9/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2157.)
      return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]

    bug 
    opened by DamonsJ 3
  • Any idea about Detectron gets overlapping and sometimes misses some blocks

    The problem: I am currently using layout-parser to detect the blocks of scanned book pages, and I am trying to take each block separately from the page and do some processing on it.

    To Reproduce

    import layoutparser as lp
    import cv2
    
    image = cv2.imread("/content/image_0.jpg")
    # Convert the image from BGR (cv2 default loading style) to RGB
    image = image[..., ::-1]
    
    model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config',
                                    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                    label_map={1:"TextRegion", 2:"ImageRegion", 3:"TableRegion", 4:"MathsRegion", 5:"SeparatorRegion", 6:"OtherRegion"})
    
    
    # Detect the layout of the input image
    layout = model.detect(image)
    
    # Show the detected layout of the input image
    lp.draw_box(image, layout, box_width=3)
    

    Environment

    1. Platform [Linux] (on colab)
    2. Installation commands
    !sudo apt-get update
    !sudo apt-get install libleptonica-dev tesseract-ocr libtesseract-dev python3-pil tesseract-ocr-eng tesseract-ocr-script-latn
    !pip install layoutparser	
    !pip install layoutparser torchvision && pip install "git+https://github.com/facebookresearch/[email protected]#egg=detectron2"	
    !pip install "layoutparser[ocr]"	
    !pip install "layoutparser[layoutmodels]" # Install DL layout model toolkit 
    

    Screenshots

    1- Overlapping: 3 (image_3)

    2- Missing: 7 (image_7)

    I know this may not be the right place to raise this issue, but I think you may have an idea about the problem.
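One possible post-processing step for overlapping detections, offered here only as an illustration and not as something this thread or the library prescribes, is greedy non-maximum suppression: sort the boxes by score and drop any box whose intersection-over-union (IoU) with an already kept box exceeds a threshold. The sketch below works on plain `(x1, y1, x2, y2)` tuples; the function names and the `0.5` threshold are assumptions, and converting layoutparser blocks to such tuples is left out.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def drop_overlaps(boxes: List[Tuple[Box, float]], threshold: float = 0.5) -> List[Tuple[Box, float]]:
    """Greedy NMS: keep the highest-scoring box of each heavily overlapping group."""
    kept: List[Tuple[Box, float]] = []
    for box, score in sorted(boxes, key=lambda item: item[1], reverse=True):
        if all(iou(box, other) < threshold for other, _ in kept):
            kept.append((box, score))
    return kept


detections = [((0, 0, 10, 10), 0.9), ((1, 1, 10, 10), 0.8), ((20, 20, 30, 30), 0.7)]
print(drop_overlaps(detections))
```

This step only removes duplicates; it cannot recover blocks that the model missed, for which a lower score threshold or a different model would be the things to try.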

    bug 
    opened by rrrokhtar 0
  • [Bug] has_torch_function_variadic error

    Describe the bug When attempting to initialise a model (I've tried with AutoLayoutModel and Detectron2LayoutModel), torch.jit throws a RuntimeError as below...

    RuntimeError: 
    undefined value has_torch_function_variadic:
      File "/opt/conda/lib/python3.8/site-packages/torch/utils/smdebug.py", line 2962
             >>> loss.backward()
        """
        if has_torch_function_variadic(input, target, weight, pos_weight):
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            return handle_torch_function(
                binary_cross_entropy_with_logits,
    'binary_cross_entropy_with_logits' is being compiled since it was called from 'sigmoid_focal_loss'
      File "/opt/conda/lib/python3.8/site-packages/fvcore/nn/focal_loss.py", line 34
        """
        p = torch.sigmoid(inputs)
        ce_loss = F.binary_cross_entropy_with_logits(inputs, targets, reduction="none")
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        p_t = p * targets + (1 - p) * (1 - targets)
        loss = ce_loss * ((1 - p_t) ** gamma)
    

    To Reproduce Steps to reproduce the behavior:

    1. Install layout-parser, OpenCV, Detectron2 as below
    %pip install opensearch-py opencv-python --quiet
    %pip install -U layoutparser[ocr] --quiet
    !python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/torch1.10/index.html
    
    2. Import layoutparser and attempt to init model with lp.models.Detectron2LayoutModel(...)
    3. Error appears

    Environment Linux with layoutparser latest

    bug 
    opened by lucafrost 0
  • cannot import name 'is_directory' from 'PIL._util'(lp.Detectron2LayoutModel)

    Describe the bug When I tried the sample codes:

    !pip install layoutparser
    !pip install 'git+https://github.com/facebookresearch/[email protected]#egg=detectron2'
    
    import layoutparser as lp
    import cv2
    import PIL
    
    image = cv2.imread("image.png")
    model = lp.Detectron2LayoutModel('lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config')
    layout = model.detect(image)
    

    Colab link(Python 3.8.16): https://colab.research.google.com/drive/1lb8_Pcw8_NNdeKPL80HOYca8gaCB0f-E?usp=sharing

    I got an error on this line:

    lp.Detectron2LayoutModel('lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config')

    The error message is:

    ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.8/dist-packages/PIL/_util.py)

    I hope that I can get your help. Thanks!

    bug 
    opened by sudoghut 0
  • [Fix] reduce memory consumption and close pdf stream after usage

    Flushes the pages and pdf afterwards to reduce the memory/ram consumption.

    Opens the pdf stream as a context manager so that the file is closed afterwards.

    opened by jakobnrmnn 0
  • Minor installation instruction error

    On Mac, the command

    pip3 install -U layoutparser[ocr]
    

    doesn't work (it returns "zsh: no matches found: layoutparser[ocr]"); you need to run

    pip3 install -U "layoutparser[ocr]"
    
    bug 
    opened by bholtdwyer 0
Releases(v0.3.4)
  • v0.3.4(Apr 6, 2022)

    Bug fixes

    • fix one critical bug for visualization mentioned in #131 by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/132

    Full Changelog: https://github.com/Layout-Parser/layout-parser/compare/v0.3.3...v0.3.4

  • v0.3.3(Apr 3, 2022)

    Functional Updates

    • Robust pdf loading for empty pages by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/115
    • fix to issue #94 -- avoiding TesseractAgent.detect() inferring any sequence of digit as float by @k-for-code in https://github.com/Layout-Parser/layout-parser/pull/95
    • Better layout comparison by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/128
    • Better visualization functions by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/129

    Example Updates

    • Minor update to Deep Learning Parser example notebook by @Jim-Salmons in https://github.com/Layout-Parser/layout-parser/pull/56
    • Set inplace to True in sorting function by @yusanshi in https://github.com/Layout-Parser/layout-parser/pull/104
    • Add notebook for customizing LayoutParser Models with Label Studio Annotation by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/124

    New Contributors

    • @Jim-Salmons made their first contribution in https://github.com/Layout-Parser/layout-parser/pull/56
    • @yusanshi made their first contribution in https://github.com/Layout-Parser/layout-parser/pull/104
    • @k-for-code made their first contribution in https://github.com/Layout-Parser/layout-parser/pull/95

    Full Changelog: https://github.com/Layout-Parser/layout-parser/compare/v0.3.2...v0.3.3

  • v0.3.2(Sep 23, 2021)

    Important fixes for multibackend layout model support:

    • Resolves the issues mentioned in #78 with other fixes to improve the multibackend layout model support #79
    • Better tests for different backends #79 for preventing future related issues
  • v0.3.1(Sep 15, 2021)

    • Fixes for automatically setting label_map in Detectron2LayoutModel #75
    • Remove unnecessary class annotations (that might break things for Python 3.6 users) #75
  • v0.3.0(Sep 13, 2021)

    We are excited to release LayoutParser v0.3.0, with a lot of exciting updates and functional improvements.

    New Features

    • The biggest change in this version is that LayoutParser now supports multiple deep learning backends: Detectron2, effdet, and paddledetection. This allows for more flexible usage of the layoutparser library and makes it easier to implement customized layout models in the future. #54 #67
    • Additionally, the newly added AutoModel and improved model configuration parsing make it easier to load and use the layout detection models. #69
      • e.g, model = lp.AutoLayoutModel("lp://efficientdet/PubLayNet").
    • To support this multi-backend framework, we implement the dynamic importing mechanism as well as better ways for installing layoutparser and the needed dependencies (see instructions). #65 #68
    • layoutparser now supports directly loading PDF files as layout objects: #71
      import layoutparser as lp
      pdf_layout, pdf_images = lp.load_pdf("path/to/pdf", load_images=True)
      lp.draw_box(pdf_images[0], pdf_layout[0])
      
    • To support more flexible processing of the layout objects, a set of new toolkits are available: #72
      import layoutparser as lp
      page_layout = lp.load_pdf("tests/fixtures/io/example.pdf")[0]
      pdf_lines = lp.simple_line_detection(page_layout)
      

    New Models

    • Add MFD model that can detect (display) equation regions within scientific documents #59
  • v0.2.0(Apr 12, 2021)

    Layout Parser v0.2.0 Release Notes

    New Features

    1. Support for loading and exporting the layout data in JSON and CSV, see #6
    2. Add support for union and intersect operations, see #20 and the detailed explanation
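
    As a rough illustration of what union and intersect mean for axis-aligned boxes, here is a minimal standalone sketch. The Box class and its methods are hypothetical and written only for this example; they are not layoutparser's implementation, and the library's own objects may treat edge cases (such as non-overlapping boxes) differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned box (not a layoutparser class)."""

    x_1: float
    y_1: float
    x_2: float
    y_2: float

    def union(self, other: "Box") -> "Box":
        # Smallest box that covers both inputs.
        return Box(
            min(self.x_1, other.x_1),
            min(self.y_1, other.y_1),
            max(self.x_2, other.x_2),
            max(self.y_2, other.y_2),
        )

    def intersect(self, other: "Box") -> Optional["Box"]:
        # Overlapping region; None is this sketch's choice for "no overlap".
        x_1, y_1 = max(self.x_1, other.x_1), max(self.y_1, other.y_1)
        x_2, y_2 = min(self.x_2, other.x_2), min(self.y_2, other.y_2)
        if x_1 >= x_2 or y_1 >= y_2:
            return None
        return Box(x_1, y_1, x_2, y_2)


a = Box(0, 0, 10, 10)
b = Box(5, 5, 20, 20)
merged = a.union(b)       # box covering both a and b
overlap = a.intersect(b)  # region shared by a and b
```

    Returning None for disjoint boxes is an assumption made here for simplicity; see the detailed explanation linked above for how the library itself defines these operations.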

    Improvements

    1. Functional improvements:
      1. When loading Layout Parser official models, Detectron2LayoutModel can automatically detect the label_map. For example:

        model = lp.Detectron2LayoutModel("lp://HJDataset/faster_rcnn_R_50_FPN_3x/config")
        model.label_map
        # {1: 'Page Frame', ... }
        
      2. Detectron2LayoutModel now supports the enforce_cpu flag that enforces using cpu even when CUDA devices are available.

      3. For visualization.draw_box, it now supports a show_element_type flag that shows the bbox category name on the top left corner of the layout objects.

    2. Improve installation command and documentation, especially for installing Detectron2 on Windows platforms #25

    New Models

    1. Add the TableBank detection models that can identify table regions

    Fixes

    1. Fix the incorrect layout issue mentioned in #9 - Thanks to @remidbs.
    2. Fix some of the dependency issues mentioned in #11 and #13 by using iopath instead of fvcore. See #18. Thanks to @edisongustavo.
  • v0.1.3(Dec 21, 2020)

    Improvements:

    • Supports lazy loading for the Detectron2 module. Now the dependency for Detectron2 will be requested only when you explicitly create a Detectron2LayoutModel object. This might be helpful for using the plain layoutparser library without installing the Detectron2 module.
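
    The general idea of lazy loading can be sketched as follows. This is a generic illustration, not the mechanism layoutparser actually uses: the LazyModule class is hypothetical, and the standard-library json module stands in for a heavy optional dependency such as Detectron2.

```python
import importlib


class LazyModule:
    """Hypothetical proxy that imports a module on first attribute access."""

    def __init__(self, name: str):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Called only for attributes not found on the proxy itself,
        # i.e. members of the wrapped module.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


# "json" is a stand-in for an optional, expensive dependency.
heavy = LazyModule("json")
loaded_before = heavy._module is not None  # nothing imported yet
text = heavy.dumps({"a": 1})               # first use triggers the import
loaded_after = heavy._module is not None   # module is now cached
```

    With this pattern, a missing dependency only raises an ImportError at the point where it is first needed, so the rest of the package stays usable without it.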

    New models:

    • Incorporated a pre-trained model based on the NewspaperNavigator dataset: lp://NewspaperNavigator/faster_rcnn_R_50_FPN_3x/config

    Fixes:

    • Corrected a bug in visualization that might overwrite the original image
  • v0.1.2(Oct 30, 2020)

    In this version, we released a new model for publaynet and made several improvements:

    1. We released the mask_rcnn_X_101_32x8d_FPN_3x model trained on the publaynet dataset. Note: it's been trained on the full training set (while others are only trained on the validation set), and you could expect a 15% performance improvement based on this new model.
    2. We improved the support for PIL images for both layout modeling and visualization
    3. We improved the Default Language Settings for the Tesseract OCR model
  • v0.1.1(Jul 16, 2020)

    Fixes

    • Fixed a bug that could cause errors in loading Prima Models

    Updates

    • Updated the Prima Mask R-CNN model with higher accuracy, and listed detailed evaluation reports.
  • v0.1.0(Jun 24, 2020)

    layoutparser now supports the following functionalities:

    • Coordinate system:

      • Supports the 3 basic coordinate systems and their geometric relationships
      • Supports the TextBlock and Layout system for convenient coordinate and text processing
    • OCR System:

      • Supports OCR based on Google Cloud Vision and Tesseract API.
    • Layout Modeling:

      • Supports using pre-trained deep learning models for layout object detection using Detectron2
    • Visualization:

      • Supports highly-customizable presentation of the box coordinates and text in the detected layout