Layout Parser is a deep learning based tool for document image layout analysis tasks.

Last update: Dec 30, 2022

Installation

Use pip or conda to install the library:

pip install layoutparser

# Install Detectron2 to use the deep learning layout detection models.
# Make sure the installed PyTorch version is compatible with
# the installed Detectron2 version.
pip install 'git+https://github.com/facebookresearch/detectron2.git#egg=detectron2'

# Install the OCR components when necessary
# (the quotes keep shells such as zsh from expanding the brackets)
pip install "layoutparser[ocr]"

By default, this installs the CPU version of Detectron2, which should run on most computers. If you have a GPU, consider installing the GPU version of Detectron2 by following the official instructions.

Quick Start

We provide a series of examples to help you start using the layoutparser library:

Table OCR and Results Parsing: layoutparser can be used to conveniently OCR documents and convert the output into structured data.
Deep Layout Parsing Example: With the help of deep learning, layoutparser supports the analysis of very complex documents and the processing of the hierarchical structure in their layouts.

DL Assisted Layout Prediction Example

The images shown in the figure above are: a screenshot of this paper, an image from the PRIMA Layout Analysis Dataset, a screenshot of the WSJ website, and an image from the HJDataset.

With only 4 lines of code in layoutparser, you can unlock information from complex documents that existing tools could not provide. You can either choose a deep learning model from the ModelZoo or load a model that you trained on your own. Then use the following code to predict the layout and visualize it:

>>> import layoutparser as lp
>>> model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config')
>>> layout = model.detect(image) # You need to load the image somewhere else, e.g., image = cv2.imread(...)
>>> lp.draw_box(image, layout,) # With extra configurations
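
The four lines above assume that an image is already loaded. A slightly fuller sketch of the same flow is given here; the helper names (load_rgb_image, detect_layout, blocks_of_type) are hypothetical and not part of layoutparser, the 0.8 score threshold is borrowed from the snippets further down this page, and the imports are deferred so that the file loads even when OpenCV or Detectron2 is missing.

```python
def load_rgb_image(path):
    """Read an image with OpenCV and convert it from BGR to RGB."""
    import cv2  # deferred import: only needed when an image is actually read

    image = cv2.imread(path)
    if image is None:
        raise FileNotFoundError(path)
    return image[..., ::-1]


def detect_layout(image, score_threshold=0.8):
    """Run the PrimaLayout Mask R-CNN model on an RGB image (needs Detectron2)."""
    import layoutparser as lp  # deferred import: requires layoutparser + Detectron2

    model = lp.Detectron2LayoutModel(
        "lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config",
        extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", score_threshold],
    )
    return model.detect(image)


def blocks_of_type(layout, block_type):
    """Keep only the detected blocks whose .type equals block_type."""
    return [block for block in layout if block.type == block_type]
```

A typical call would be layout = detect_layout(load_rgb_image("page.png")), followed by lp.draw_box(image, layout, box_width=3) to visualize the result, where "page.png" is a placeholder file name.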

Citing `layoutparser`

If you find layoutparser helpful to your work, please consider citing our tool and paper using the following BibTeX entry.

@article{shen2021layoutparser,
  title={LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis},
  author={Shen, Zejiang and Zhang, Ruochen and Dell, Melissa and Lee, Benjamin Charles Germain and Carlson, Jacob and Li, Weining},
  journal={arXiv preprint arXiv:2103.15348},
  year={2021}
}

Comments

Apply detect() on readable PDF files

Hi there, from the docs I infer that detect() operates on, for example, PIL.Image objects. Is there a way to operate directly on already readable PDF files (which would obviate the need to apply OCR as well)? Greetings
enhancement

opened by simonschoe 12
AttributeError: module layoutparser has no attribute Detectron2LayoutModel
Hi,

Thank you for this awesome program! I successfully installed layout-parser and Detectron2 on my Windows 10 laptop. When I run the following code:

import cv2
import numpy as np  # needed for np.array below; missing from the original snippet
import layoutparser as lp
from pdf2image import convert_from_bytes

# Raw string so that "\t" in the Windows path is not read as a tab character
images = convert_from_bytes(open(r'C:\temp\ConsigneeList\Doc 4 Distribution List.pdf', 'rb').read())

model = lp.Detectron2LayoutModel(
    config_path='lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config',  # In model catalog
    label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},  # In model label_map
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8]  # Optional
)
ocr_agent = lp.ocr.TesseractAgent()

# Loop through each page
for image in images:
    image = np.array(image)
    layout = model.detect(image)
    text_blocks = lp.Layout([b for b in layout if b.type == 'Text'])

    # Loop through each text box on the page
    for block in text_blocks:
        segment_image = (block
                         .pad(left=5, right=5, top=5, bottom=5)
                         .crop_image(image))
        text = ocr_agent.detect(segment_image)
        block.set(text=text, inplace=True)

    # Append the recognized text of this page to the output file
    with open("OUTPUT FILE PATH/FILENAME.TXT", "a+") as my_file:
        for txt in text_blocks.get_texts():
            my_file.write(txt)

I get the following errors:

AttributeError                            Traceback (most recent call last)
----> 1 model = lp.Detectron2LayoutModel(
      2     config_path ='lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config', # In model catalog
      3     label_map = {0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In model label_map
      4     extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional
      5 )

C:\ProgramData\Anaconda3\lib\site-packages\layoutparser\file_utils.py in __getattr__(self, name)
    224             value = getattr(module, name)
    225         else:
--> 226             raise AttributeError(f"module {self.__name__} has no attribute {name}")
    227
    228         setattr(self, name, value)

AttributeError: module layoutparser has no attribute Detectron2LayoutModel

Any ideas on what is wrong? Thank you!!

Sincerely,

tom
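
The traceback above is raised from layoutparser's own file_utils.py rather than from Detectron2. One plausible cause, which this thread does not confirm, is that Detectron2 cannot be imported in the active Python environment. A minimal check using only the standard library is sketched here; the helper name missing_modules is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # An empty list means that all three packages are at least discoverable.
    print(missing_modules(["layoutparser", "torch", "detectron2"]))
```

If detectron2 shows up in the printed list, installing it into the same environment that runs the script would be the first thing to try.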

bug
opened by theiman112860 10
'GCVAgent' object has no attribute '_client'

Hi, when I was running the tutorial "OCR tables and parse the output" and trying to obtain the result:

res = ocr_agent.detect(image, return_response=True)

The response was

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/layoutparser/ocr/gcv_agent.py", line 168, in detect
    res = self._detect(img_content)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/layoutparser/ocr/gcv_agent.py", line 134, in _detect
    response = self._client.document_text_detection(
AttributeError: 'GCVAgent' object has no attribute '_client'

I searched online, and some sites said that the Client() class was removed in Client Library v0.25.1 and replaced with ImageAnnotatorClient().

Is this the cause of the problem? Thank you.
bug

opened by junxi-liu 8
Error installing dependencies

Hi Team, thank you for all the great work. It looks amazing. I tried running pip install layoutparser, but it threw the error below. Can you please let me know how to fix this?

ERROR: Command errored out with exit status 1: command: 'C:\Program Files\Anaconda\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"'; file='"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-wheel-awmfv0cr' cwd: C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632
Complete output (22 lines): running bdist_wheel running build running build_py creating build creating build\lib.win-amd64-3.8 creating build\lib.win-amd64-3.8\pycocotools copying pycocotools\coco.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\cocoeval.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\mask.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools_init_.py -> build\lib.win-amd64-3.8\pycocotools running build_ext cythoning pycocotools/_mask.pyx to pycocotools_mask.c C:\Users\pss.ch\AppData\Roaming\Python\Python38\site-packages\Cython\Compiler\Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\pycocotools_mask.pyx tree = Parsing.p_module(s, pxd, full_module_name) building 'pycocotools._mask' extension creating build\temp.win-amd64-3.8 creating build\temp.win-amd64-3.8\Release creating build\temp.win-amd64-3.8\Release\common creating build\temp.win-amd64-3.8\Release\pycocotools C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Users\pss.ch\AppData\Roaming\Python\Python38\site-packages\numpy\core\include -I./common "-IC:\Program Files\Anaconda\include" "-IC:\Program Files\Anaconda\include" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\ATLMFC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /Tc./common/maskApi.c /Fobuild\temp.win-amd64-3.8\Release./common/maskApi.obj -Wno-cpp -Wno-unused-function -std=c99 cl : Command line error D8021 : invalid numeric argument '/Wno-cpp' error: 
command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe' failed with exit status 2

ERROR: Failed building wheel for pycocotools ERROR: Command errored out with exit status 1: command: 'C:\Program Files\Anaconda\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"'; file='"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-record-w4euj5sb\install-record.txt' --single-version-externally-managed --user --prefix= --compile --install-headers 'C:\Users\pss.ch\AppData\Roaming\Python\Python38\Include\pycocotools' cwd: C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632
Complete output (20 lines): running install running build running build_py creating build creating build\lib.win-amd64-3.8 creating build\lib.win-amd64-3.8\pycocotools copying pycocotools\coco.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\cocoeval.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\mask.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools_init_.py -> build\lib.win-amd64-3.8\pycocotools running build_ext skipping 'pycocotools_mask.c' Cython extension (up-to-date) building 'pycocotools._mask' extension creating build\temp.win-amd64-3.8 creating build\temp.win-amd64-3.8\Release creating build\temp.win-amd64-3.8\Release\common creating build\temp.win-amd64-3.8\Release\pycocotools C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\AppData\Roaming\Python\Python38\site-packages\numpy\core\include -I./common "-IC:\Program Files\Anaconda\include" "-IC:\Program Files\Anaconda\include" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\ATLMFC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /Tc./common/maskApi.c /Fobuild\temp.win-amd64-3.8\Release./common/maskApi.obj -Wno-cpp -Wno-unused-function -std=c99 cl : Command line error D8021 : invalid numeric argument '/Wno-cpp' error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe' failed with exit status 2 ---------------------------------------- ERROR: Command errored out with exit status 1: 'C:\Program Files\Anaconda\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = 
'"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"'; file='"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-record-w4euj5sb\install-record.txt' --single-version-externally-managed --user --prefix= --compile --install-headers 'C:\AppData\Roaming\Python\Python38\Include\pycocotools' Check the logs for full command output.

opened by sriprad 8
enforce_cpu not working

When enforce_cpu is set to True, the model still uses CUDA instead of the CPU. I think it is due to this line: https://github.com/Layout-Parser/layout-parser/blob/e035fc8f952addc620670e5b47864fe213db0e10/src/layoutparser/models/layoutmodel.py#L120

Possible fix could be cfg.MODEL.DEVICE = "cuda" if torch.cuda.is_available() and (not enforce_cpu) else "cpu"
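
The condition in the proposed fix can be isolated as a small pure function, which makes all four combinations easy to check. This is only a sketch of the logic suggested in this issue, not the library's code; select_device is a hypothetical name, and in real use cuda_available would come from torch.cuda.is_available().

```python
def select_device(cuda_available: bool, enforce_cpu: bool) -> str:
    """Use CUDA only when it is available and the CPU has not been enforced."""
    return "cuda" if cuda_available and not enforce_cpu else "cpu"


if __name__ == "__main__":
    for cuda_available in (True, False):
        for enforce_cpu in (True, False):
            print(cuda_available, enforce_cpu, select_device(cuda_available, enforce_cpu))
```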
bug

opened by lkluo 5
Adding support for mathematical formula recognition

Have you considered adding support for mathematical formula recognition? Identifying the position of mathematical formulas in documents has always been a problem.
modeling

opened by SleepyCelery 5
draw_box draw only one box from layout
Describe the bug I just installed everything according to the installation guide and launched your Jupyter notebook, the Deep Layout Parsing Example. After the first draw_box call, only one box is shown, but print(layout) lists all the boxes. The same happens with the second draw_box call from your guide. I'm not sure what I'm doing wrong.

To Reproduce Steps to reproduce the behavior:

installation guide + detectron2 install also from your guide

Run jupyter notebook

Environment

1. MacOS
2. VS Code
3. Relevant packages from pip: torch==1.11.0 torchvision==0.12.0 Pillow==9.1.0 opencv-python==4.5.5.64 layoutparser==0.3.3

Error traceback No errors; the behaviour just differs from what this guide and other guides show.

Screenshots attached

bug
opened by Moo1234567 4
Gives wrong results when the code is run for some images in a loop

The code works when it is run for a single image. But when I run the same code in a loop over a few images from the PubLayNet dataset, cached results seem to be applied (i.e., the bounding boxes overlap, and the boxes from the previous images are also drawn on the current image).

opened by surajsubramanian 4

ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.7/dist-packages/PIL/_util.py)

While using this code, I get the following Pillow error. I tried reinstalling Pillow but am still struggling with this issue. Any help to make this code run?

import layoutparser as lp
model = lp.Detectron2LayoutModel(
            config_path ='lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config', # In model catalog
            label_map   ={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In model`label_map`
            extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional
        )
model.detect(image)

Getting this error:

ImportError                               Traceback (most recent call last)
<ipython-input-6-59f0fb07b7e3> in <module>
      1 import layoutparser as lp
----> 2 model = lp.Detectron2LayoutModel(
      3             config_path ='lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config', # In model catalog
      4             label_map   ={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In model`label_map`
      5             extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional

31 frames
/usr/local/lib/python3.7/dist-packages/PIL/ImageFont.py in <module>
     35 from . import Image
     36 from ._deprecate import deprecate
---> 37 from ._util import is_directory, is_path
     38 
     39 

ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.7/dist-packages/PIL/_util.py)

opened by arhamshah 3

TypeError: inner() got an unexpected keyword argument 'image_context'

Hello! I recently encountered an issue when trying to use Google's OCR while running ocr_agent.detect.

Running this:

image = cv2.imread("/Users/liz/Documents/Projects/LayoutParser/test2.png")
ocr_agent = lp.GCVAgent.with_credential("/Users/liz/Documents/Projects/Keys/GoogleCloud/vision-341523-e3cbd0df8d19.json",languages = ['en'])
res = ocr_agent.detect(image, return_response=True)

Gives me the following error:

TypeError                                 Traceback (most recent call last)
<ipython-input-9-76614ef6a3e8> in <module>
      1 image = cv2.imread("/Users/liz/Documents/Projects/LayoutParser/test2.png")
      2 ocr_agent = lp.GCVAgent.with_credential("/Users/liz/Documents/Projects/Keys/GoogleCloud/vision-341523-e3cbd0df8d19.json",languages = ['en'])
----> 3 res = ocr_agent.detect(image, return_response=True)
      4 
      5 #layout = ocr_agent.gather_full_text_annotation(res, agg_level=lp.GCVFeatureType.WORD)

/opt/homebrew/Caskroom/miniforge/base/envs/data310/lib/python3.9/site-packages/layoutparser/ocr.py in detect(self, image, return_response, return_only_text, agg_output_level)
    222                 img_content = image_file.read()
    223 
--> 224         res = self._detect(img_content)
    225 
    226         if return_response:

/opt/homebrew/Caskroom/miniforge/base/envs/data310/lib/python3.9/site-packages/layoutparser/ocr.py in _detect(self, img_content)
    188     def _detect(self, img_content):
    189         img_content = self._vision.types.Image(content=img_content)
--> 190         response = self._client.document_text_detection(
    191             image=img_content, image_context=self._context
    192         )

TypeError: inner() got an unexpected keyword argument 'image_context'

I'm not sure what causes this. It might be user error, but I haven't been able to find anything else about it, and I've tried everything I can think of (all the packages are up to date or, in Google Cloud Vision's case, downgraded to stay on the old API). Thanks!

bug

opened by liz-goodwin 3

bad result detected

I got a bad result using layout-parser. Here is the image I used:

Here is the code I ran in Python:

import cv2
import layoutparser as lp

image = cv2.imread("1.png")
# Convert the image from BGR (cv2 default loading style)
# to RGB
image = image[..., ::-1]
origin_image = image.copy()

model = lp.Detectron2LayoutModel('lp://PubLayNet/mask_rcnn_R_50_FPN_3x/config', 
                             extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                             label_map={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"})
# Load the deep layout model from the layoutparser API 
# For all the supported model, please check the Model 
# Zoo Page: https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html

layout = model.detect(image)
# print("layout : ", layout)
# Detect the layout of the input image
text_blocks = lp.Layout([b for b in layout if b.type=='Text'])
drawRectangleInImage(origin_image, text_blocks, (36,255,12))

titles_blocks = lp.Layout([b for b in layout if b.type=='Title'])
drawRectangleInImage(origin_image, titles_blocks, (76, 155, 175))

figure_blocks = lp.Layout([b for b in layout if b.type=='Figure'])
drawRectangleInImage(origin_image, figure_blocks, (122, 96, 216))

lists_blocks = lp.Layout([b for b in layout if b.type=='List'])
drawRectangleInImage(origin_image, lists_blocks, (176, 155, 175))

tables_blocks = lp.Layout([b for b in layout if b.type=='Table'])
drawRectangleInImage(origin_image, tables_blocks, (76, 255, 75))

cv2.imshow('image', origin_image)
cv2.waitKey()

Here is the result:

By the way, some warnings were generated:

/usr/local/lib/python3.9/site-packages/detectron2/structures/image_list.py:99: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  max_size = (max_size + (stride - 1)) // stride * stride
/usr/local/lib/python3.9/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]

bug

opened by DamonsJ 3

Any idea about Detectron gets overlapping and sometimes misses some blocks

The problem I am currently using layout-parser to detect the blocks on scanned book pages, and I am trying to extract each block from the page separately and do some processing on it.

Checklist

To Reproduce

import layoutparser as lp
import cv2

image = cv2.imread("/content/image_0.jpg")
# Convert the image from BGR (cv2 default loading style) to RGB
image = image[..., ::-1]

model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config',
                                 extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                 label_map={1:"TextRegion", 2:"ImageRegion", 3:"TableRegion", 4:"MathsRegion", 5:"SeparatorRegion", 6:"OtherRegion"})


# Detect the layout of the input image
layout = model.detect(image)

# Show the detected layout of the input image
lp.draw_box(image, layout, box_width=3)

Environment

Platform [Linux] (on colab)
Installation commands

!sudo apt-get update
!sudo apt-get install libleptonica-dev tesseract-ocr libtesseract-dev python3-pil tesseract-ocr-eng tesseract-ocr-script-latn
!pip install layoutparser	
!pip install layoutparser torchvision && pip install "git+https://github.com/facebookresearch/[email protected]#egg=detectron2"	
!pip install "layoutparser[ocr]"	
!pip install "layoutparser[layoutmodels]" # Install DL layout model toolkit

Screenshots

1- Overlapping

2- Missing

I know this may not be the right place to raise this issue, but I think you may have an idea about the problem.

bug

opened by rrrokhtar 0

[Bug] has_torch_function_variadic error

Describe the bug When attempting to initialise a model (I've tried with AutoLayoutModel and Detectron2LayoutModel), torch.jit throws a RuntimeError as below...

RuntimeError: 
undefined value has_torch_function_variadic:
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/smdebug.py", line 2962
         >>> loss.backward()
    """
    if has_torch_function_variadic(input, target, weight, pos_weight):
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        return handle_torch_function(
            binary_cross_entropy_with_logits,
'binary_cross_entropy_with_logits' is being compiled since it was called from 'sigmoid_focal_loss'
  File "/opt/conda/lib/python3.8/site-packages/fvcore/nn/focal_loss.py", line 34
    """
    p = torch.sigmoid(inputs)
    ce_loss = F.binary_cross_entropy_with_logits(inputs, targets, reduction="none")
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    p_t = p * targets + (1 - p) * (1 - targets)
    loss = ce_loss * ((1 - p_t) ** gamma)

To Reproduce Steps to reproduce the behavior:

Install layout-parser, OpenCV, Detectron2 as below

%pip install opensearch-py opencv-python --quiet
%pip install -U layoutparser[ocr] --quiet
!python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/torch1.10/index.html

Import layoutparser and attempt to init model with lp.models.Detectron2LayoutModel(...)
Error appears

Environment Linux with layoutparser latest

bug

opened by lucafrost 0

cannot import name 'is_directory' from 'PIL._util'(lp.Detectron2LayoutModel)
Describe the bug When I tried the sample code:

!pip install layoutparser
!pip install 'git+https://github.com/facebookresearch/[email protected]#egg=detectron2'
import layoutparser as lp
import cv2
import PIL
image = cv2.imread("image.png")
model = lp.Detectron2LayoutModel('lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config')
layout = model.detect(image)

Colab link(Python 3.8.16): https://colab.research.google.com/drive/1lb8_Pcw8_NNdeKPL80HOYca8gaCB0f-E?usp=sharing

I got an error on this line:

lp.Detectron2LayoutModel('lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config')

The error message is:

ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.8/dist-packages/PIL/_util.py)

I hope that I can get your help. Thanks!
bug
opened by sudoghut 0
[Fix] reduce memory consumption and close pdf stream after usage

Flushes the pages and the PDF after use to reduce memory (RAM) consumption.

Opens the PDF stream as a context manager so that the file is closed afterwards.

opened by jakobnrmnn 0
Minor installation instruction error
On Mac, the command

pip3 install -U layoutparser[ocr]

doesn't work (returns "zsh: no matches found: layoutparser[ocr]"), you need to do

pip3 install -U "layoutparser[ocr]"
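
When such a command is generated from a script, the quoting can be left to Python's standard shlex module, which wraps any argument containing shell-special characters such as square brackets. The helper name pip_install_command is hypothetical; the sketch only builds the command string and does not run pip.

```python
import shlex


def pip_install_command(requirement, upgrade=True):
    """Build a shell-safe 'pip3 install' command line for one requirement."""
    args = ["pip3", "install"]
    if upgrade:
        args.append("-U")
    args.append(requirement)
    # shlex.join quotes every argument that a POSIX shell could misinterpret.
    return shlex.join(args)


if __name__ == "__main__":
    print(pip_install_command("layoutparser[ocr]"))
```

The single quotes that shlex produces have the same effect in zsh as the double quotes suggested above.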
bug
opened by bholtdwyer 0

Releases(v0.3.4)

v0.3.4(Apr 6, 2022)
Bug fixes

fix one critical bug for visualization mentioned in #131 by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/132

Full Changelog: https://github.com/Layout-Parser/layout-parser/compare/v0.3.3...v0.3.4
v0.3.3(Apr 3, 2022)
Functional Updates

Robust pdf loading for empty pages by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/115

fix to issue #94 -- avoiding TesseractAgent.detect() inferring any sequence of digit as float by @k-for-code in https://github.com/Layout-Parser/layout-parser/pull/95

Better layout comparison by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/128

Better visualization functions by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/129

Example Updates

Minor update to Deep Learning Parser example notebook by @Jim-Salmons in https://github.com/Layout-Parser/layout-parser/pull/56

Set inplace to True in sorting function by @yusanshi in https://github.com/Layout-Parser/layout-parser/pull/104

Add notebook for customizing LayoutParser Models with Label Studio Annotation by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/124

New Contributors

@Jim-Salmons made their first contribution in https://github.com/Layout-Parser/layout-parser/pull/56

@yusanshi made their first contribution in https://github.com/Layout-Parser/layout-parser/pull/104

@k-for-code made their first contribution in https://github.com/Layout-Parser/layout-parser/pull/95

Full Changelog: https://github.com/Layout-Parser/layout-parser/compare/v0.3.2...v0.3.3
v0.3.2(Sep 23, 2021)
Important fixes for multibackend layout model support:

Resolves the issues mentioned in #78 with other fixes to improve the multibackend layout model support #79

Better tests for different backends #79 for preventing future related issues

v0.3.1(Sep 15, 2021)
Fixes for automatically setting label_map in Detectron2LayoutModel #75

Remove unnecessary class annotations (that might break things for Python 3.6 users) #75

v0.3.0(Sep 13, 2021)
We are excited to release LayoutParser v0.3.0, with a lot of exciting updates and functional improvements.

New Features

The biggest change in this version is that LayoutParser now supports multiple deep learning backends: Detectron2, effdet, and paddledetection. This allows for more flexible usage of the layoutparser library, and makes it easier for implementing customized layout models in the future. #54 #67

Additionally, the newly added AutoModel and improved model configuration parsing make it easier to load and use the layout detection models. #69

e.g., model = lp.AutoLayoutModel("lp://efficientdet/PubLayNet").

To support this multi-backend framework, we implement the dynamic importing mechanism as well as better ways for installing layoutparser and the needed dependencies (see instructions). #65 #68

And now layoutparser supports directly loading PDF files as layout objects: #71
import layoutparser as lp
pdf_layout, pdf_images = lp.load_pdf("path/to/pdf", load_images=True)
lp.draw_box(pdf_images[0], pdf_layout[0])

To support more flexible processing of the layout objects, a set of new toolkits is available: #72
import layoutparser as lp
page_layout = lp.load_pdf("tests/fixtures/io/example.pdf")[0]
pdf_lines = lp.simple_line_detection(page_layout)

New Models

Add MFD model that can detect (display) equation regions within scientific documents #59

v0.2.0(Apr 12, 2021)
Layout Parser v0.2.0 Release Notes

New Features

Support for loading and exporting the layout data in JSON and CSV, see #6

Add support for union and intersect operations, see #20 and the detailed explanation
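
To illustrate what union and intersect mean for axis-aligned boxes, here is a small standalone sketch on plain (x1, y1, x2, y2) tuples. It does not use or reproduce layoutparser's implementation; the function names and the choice to return None for non-overlapping boxes are assumptions made for the example, and union is taken to be the smallest box that encloses both inputs.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the overlapping region of two boxes, or None if they do not overlap."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    if x1 >= x2 or y1 >= y2:
        return None
    return (x1, y1, x2, y2)


def union(a: Box, b: Box) -> Box:
    """Return the smallest box that encloses both inputs."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))
```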

Improvements

Functional improvements:

When loading Layout Parser official models, Detectron2LayoutModel can automatically detect the label_map. For example,

model = lp.Detectron2LayoutModel("lp://HJDataset/faster_rcnn_R_50_FPN_3x/config")
model.label_map # {1: 'Page Frame', ... }

Detectron2LayoutModel now supports the enforce_cpu flag that enforces using cpu even when CUDA devices are available.

For visualization.draw_box, it now supports a show_element_type flag that shows the bbox category name on the top left corner of the layout objects.

Improve installation command and documentation, especially for installing Detectron2 on Windows platforms #25

New Models

Add the table bank detection models that can identify table regions

Fixes

Fix the incorrect layout issue mentioned in #9 - Thanks to @remidbs.

Fix some of the dependency issues mentioned in #11 and #13 by using iopath instead of fvcore. See #18. Thanks to @edisongustavo.

v0.1.3(Dec 21, 2020)
Improvements:

Supports lazy loading for the Detectron2 module. Now the dependency for Detectron2 will be requested only when you explicitly create a Detectron2LayoutModel object. This might be helpful for using the plain layoutparser library without installing the Detectron2 module.
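
The general idea of lazy loading can be sketched with the standard library alone: the heavy module is imported only when one of its attributes is first requested. The LazyBackend class below is a hypothetical illustration and is not how layoutparser implements it; the json module stands in for a heavy dependency such as Detectron2.

```python
import importlib


class LazyBackend:
    """Proxy that imports the wrapped module on first attribute access."""

    def __init__(self, module_name):
        self._module_name = module_name
        self._module = None

    @property
    def loaded(self):
        """True once the underlying module has actually been imported."""
        return self._module is not None

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, i.e. for names
        # that live in the wrapped module rather than on the proxy itself.
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return getattr(self._module, name)


if __name__ == "__main__":
    backend = LazyBackend("json")
    print(backend.loaded)            # nothing imported yet
    print(backend.dumps({"a": 1}))   # first use triggers the import
    print(backend.loaded)
```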

New models:

Incorporated a pre-trained model based on the NewspaperNavigator dataset: lp://NewspaperNavigator/faster_rcnn_R_50_FPN_3x/config

Fixes:

Corrected a bug in visualization that might overwrite the original image

v0.1.2(Oct 30, 2020)
In this version, we released a new model for publaynet and made several improvements:

We released the mask_rcnn_X_101_32x8d_FPN_3x model trained on the publaynet dataset. Note: it's been trained on the full training set (while others are only trained on the validation set), and you could expect a 15% performance improvement based on this new model.

We improved the support for PIL images for both layout modeling and visualization

We improved the Default Language Settings for the Tesseract OCR model

v0.1.1(Jul 16, 2020)
Fixes

Fixed a bug that could cause errors in loading Prima Models

Updates

Updated the Prima Mask R-CNN model with higher accuracy, and listed detailed evaluation reports.

v0.1.0(Jun 24, 2020)
layoutparser now supports the following functionalities:

Coordinate system:

Supports the 3 basic coordinate systems and their geometric relationships

Supports the TextBlock and Layout system for convenient coordinate and text processing
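To make "geometric relationships" concrete, here is a plain-Python sketch of two such relationships, containment and intersection, for axis-aligned rectangles. The Rect class and its method names are illustrative assumptions and are not layoutparser's own classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    """Axis-aligned box with top-left (x1, y1) and bottom-right (x2, y2) corners."""
    x1: float
    y1: float
    x2: float
    y2: float

    def is_in(self, other: "Rect") -> bool:
        """True when this box lies entirely inside `other`."""
        return (other.x1 <= self.x1 and other.y1 <= self.y1
                and self.x2 <= other.x2 and self.y2 <= other.y2)

    def intersect(self, other: "Rect") -> Optional["Rect"]:
        """Return the overlapping region, or None when the boxes do not overlap."""
        x1, y1 = max(self.x1, other.x1), max(self.y1, other.y1)
        x2, y2 = min(self.x2, other.x2), min(self.y2, other.y2)
        if x1 >= x2 or y1 >= y2:
            return None
        return Rect(x1, y1, x2, y2)


page = Rect(0, 0, 600, 800)
title = Rect(50, 40, 550, 90)
figure = Rect(500, 60, 700, 300)

print(title.is_in(page))        # True
print(figure.is_in(page))       # False
print(title.intersect(figure))  # Rect(x1=500, y1=60, x2=550, y2=90)
```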

OCR System:

Supports OCR based on Google Cloud Vision and Tesseract API.

Layout Modeling:

Supports using pre-trained deep learning models for layout object detection with Detectron2

Visualization:

Supports highly-customizable presentation of the box coordinates and text in the detected layout


Owner

layout-parser

GitHub Repository https://layout-parser.github.io/



Citing `layoutparser`