Awesome multilingual OCR toolkits based on PaddlePaddle (a practical ultra-lightweight OCR system that provides data annotation and synthesis tools and supports training and deployment on server, mobile, embedded, and IoT devices)

Overview

English | 简体中文

Introduction

PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and put them into practice.

Notice

PaddleOCR supports both the dynamic graph and static graph programming paradigms:

  • Dynamic graph: dygraph branch (default), supported by paddle 2.0.0 (installation)
  • Static graph: develop branch

Recent updates

  • 2021.2.8: Released PaddleOCR v2.0 (branch release/2.0) and set it as the default branch. See the release notes here: https://github.com/PaddlePaddle/PaddleOCR/releases/tag/v2.0.0
  • 2021.1.21: Updated more than 25 multilingual recognition models (see the models list), including English, Chinese, German, French, Japanese, Spanish, Portuguese, Russian, and Arabic. Models for more languages will continue to be added; see the Develop Plan.
  • 2020.12.15: Added a data synthesis tool, Style-Text, which makes it easy to synthesize a large number of images similar to the target scene image.
  • 2020.11.25: Added a new data annotation tool, PPOCRLabel, which improves labeling efficiency. The labeling results can be used directly to train the PP-OCR system.
  • 2020.9.22: Updated the PP-OCR technical article, https://arxiv.org/abs/2009.09941
  • more

Features

  • PPOCR series of high-quality pre-trained models, comparable to commercial products
    • Ultra-lightweight ppocr_mobile series models: detection (3.0M) + direction classifier (1.4M) + recognition (5.0M) = 9.4M
    • General ppocr_server series models: detection (47.1M) + direction classifier (1.4M) + recognition (94.9M) = 143.4M
    • Supports Chinese, English, and digit recognition, vertical text recognition, and long text recognition
    • Supports multi-language recognition: Korean, Japanese, German, French
  • Rich toolkits for OCR
    • Semi-automatic data annotation tool, PPOCRLabel: supports fast and efficient data annotation
    • Data synthesis tool, Style-Text: makes it easy to synthesize a large number of images similar to the target scene image
  • Supports user-defined training and provides rich inference deployment solutions
  • Supports pip installation and is easy to use
  • Supports Linux, Windows, macOS, and other systems
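
The totals quoted for the two model series are plain sums of their component sizes. A quick check, using only the figures from the list above (the dictionary and function names are illustrative, not part of PaddleOCR):

```python
# Component sizes (in "M", as quoted in the feature list above).
MODEL_SIZES_M = {
    "ppocr_mobile": {"detection": 3.0, "direction_classifier": 1.4, "recognition": 5.0},
    "ppocr_server": {"detection": 47.1, "direction_classifier": 1.4, "recognition": 94.9},
}


def total_size_m(components):
    """Sum the component sizes, rounded to one decimal to hide float noise."""
    return round(sum(components.values()), 1)


for series, components in MODEL_SIZES_M.items():
    print(f"{series}: {total_size_m(components)}M")
```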

Visualization

The pictures above show results from the general ppocr_server model. For more examples, please see More visualizations.

Community

  • Scan the QR code below with WeChat to join the official technical exchange group. We look forward to your participation.

Quick Experience

You can also quickly try the ultra-lightweight OCR: Online Experience

Mobile demo (based on EasyEdge and Paddle-Lite, supports iOS and Android): sign in to the website to obtain the QR code for installing the app.

Alternatively, scan the QR code below to install the app (Android only).

PP-OCR 2.0 series model list (updated on Dec 15)

Note: Compared with the 1.1 models, which were trained with the static graph programming paradigm, the 2.0 models are trained with the dynamic graph and achieve close performance.

Model introduction | Model name | Recommended scene | Detection model | Direction classifier | Recognition model
Chinese and English ultra-lightweight OCR model (9.4M) | ch_ppocr_mobile_v2.0_xx | Mobile & server | inference model / pre-trained model | inference model / pre-trained model | inference model / pre-trained model
Chinese and English general OCR model (143.4M) | ch_ppocr_server_v2.0_xx | Server | inference model / pre-trained model | inference model / pre-trained model | inference model / pre-trained model

For more model downloads (including multiple languages), please refer to PP-OCR v2.0 series model downloads.

To request a new language, please refer to Guideline for new language requests.

Tutorials

PP-OCR Pipeline

PP-OCR is a practical ultra-lightweight OCR system. It consists of three main parts: DB text detection [2], detection frame correction, and CRNN text recognition [7]. The system adopts 19 effective strategies from 8 aspects, including backbone network selection and adjustment, prediction head design, data augmentation, learning rate transformation strategy, regularization parameter selection, pre-training model use, and automatic model tailoring and quantization, to optimize and slim down the models of each module. The final results are an ultra-lightweight Chinese and English OCR model with an overall size of 3.5M and a 2.8M English and digit OCR model. For more details, please refer to the PP-OCR technical article (https://arxiv.org/abs/2009.09941). In addition, the implementation of the FPGM Pruner [8] and PACT quantization [9] is based on PaddleSlim.
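
The three-stage flow described above (detect text boxes, correct each detected frame, then recognize the text) can be sketched as a small driver function. This is only an illustration of the data flow under stated assumptions: `run_pipeline`, `OcrResult`, the `(x1, y1, x2, y2)` box format, and the toy stage functions are all hypothetical stand-ins, not PaddleOCR APIs or the actual DB/CRNN models.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # hypothetical (x1, y1, x2, y2) box format


@dataclass
class OcrResult:
    box: Box
    text: str


def run_pipeline(image, detect: Callable, correct: Callable,
                 recognize: Callable) -> List[OcrResult]:
    """Chain the three stages: detection -> frame correction -> recognition."""
    results = []
    for box in detect(image):
        crop = correct(image, box)  # e.g. crop and fix the orientation
        results.append(OcrResult(box, recognize(crop)))
    return results


# Toy stand-ins: the "image" is a list of strings, one per text row.
def toy_detect(image):
    return [(0, y, len(row), y + 1) for y, row in enumerate(image) if row]


def toy_correct(image, box):
    x1, y1, x2, _ = box
    return image[y1][x1:x2]


def toy_recognize(crop):
    return crop.upper()


toy_image = ["pp-ocr", "", "demo"]
results = run_pipeline(toy_image, toy_detect, toy_correct, toy_recognize)
for r in results:
    print(r.box, r.text)
```

Passing the stages in as callables keeps the sketch independent of any particular model; each stage could be swapped for a real detector, classifier, or recognizer without changing the driver.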

Visualization more

  • Chinese OCR model
  • English OCR model
  • Multilingual OCR model

Guideline for new language requests

To request support for a new language, submit a PR containing the following two files:

  1. In the folder ppocr/utils/dict, submit a dictionary file named {language}_dict.txt that contains a list of all characters. See the other files in that folder for the expected format.

  2. In the folder ppocr/utils/corpus, submit a corpus file named {language}_corpus.txt that contains a list of words in your language. At least 50,000 words per language are probably needed, and the more, the better.
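
If the corpus file already exists, a first draft of the character dictionary can be derived from it. The sketch below is one possible approach, not an official tool: the helper name `build_char_dict` is hypothetical, and the one-character-per-line output format is an assumption that should be checked against the existing files in ppocr/utils/dict.

```python
import tempfile
from pathlib import Path


def build_char_dict(corpus_path, dict_path):
    """Collect every distinct non-whitespace character in the corpus and
    write them to dict_path, one character per line (assumed format)."""
    chars = set()
    with open(corpus_path, encoding="utf-8") as f:
        for line in f:
            chars.update(ch for ch in line if not ch.isspace())
    ordered = sorted(chars)
    Path(dict_path).write_text("\n".join(ordered) + "\n", encoding="utf-8")
    return ordered


# Demo on a throwaway corpus; real files would be named
# {language}_corpus.txt and {language}_dict.txt as described above.
with tempfile.TemporaryDirectory() as tmp:
    corpus = Path(tmp) / "demo_corpus.txt"
    corpus.write_text("ab\nba c\n", encoding="utf-8")
    print(build_char_dict(corpus, Path(tmp) / "demo_dict.txt"))
```

A dictionary generated this way only covers characters that actually occur in the corpus, so it is worth reviewing by hand for rare characters and punctuation.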

If your language has unique elements, please let us know in advance by any means, such as useful links or Wikipedia pages.

For more details, please refer to the Multilingual OCR Development Plan.

License

This project is released under the Apache 2.0 license.

Contribution

We welcome all contributions to PaddleOCR and greatly appreciate your feedback.

  • Many thanks to Khanh Tran and Karl Horky for contributing and revising the English documentation.
  • Many thanks to zhangxin for contributing the new visualize function, adding .gitignore, and removing the need to set PYTHONPATH manually.
  • Many thanks to lyl120117 for contributing the code for printing the network structure.
  • Thanks to xiangyubo for contributing the handwritten Chinese OCR datasets.
  • Thanks to authorfu for contributing the Android demo and to xiadeye for contributing the iOS demo.
  • Thanks to BeyondYourself for contributing many great suggestions and simplifying part of the code style.
  • Thanks to tangmq for contributing Dockerized deployment services to PaddleOCR and supporting the rapid release of callable RESTful API services.
  • Thanks to lijinhan for contributing a new way, i.e., Java Spring Boot, to send requests to the Hubserving deployment.
  • Thanks to Mejans for contributing the Occitan corpus and character set.
  • Thanks to LKKlein for contributing a new deployment package written in the Go programming language.
  • Thanks to Evezerest, ninetailskim, edencfc, BeyondYourself and 1084667371 for contributing a new data annotation tool, i.e., PPOCRLabel.
Comments
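  • Text recognition is incorrect after converting to an inference model!

    After training, prediction with infer_rec.py gives correct results, but after exporting with export_model.py the results are wrong.

    These are the extra arguments I passed when exporting: -c output/rec_chinese_lite_v2.0/config.yml -o Global.pretrained_model=output/rec_chinese_lite_v2.0/best_accuracy Global.save_inference_dir=./save_inference_dir

    These are the arguments I used for prediction with predict_rec.py: --image_dir="2.bmp" --rec_model_dir="save_inference_dir" --rec_char_dict_path="train_data/labels.txt"

    I have also attached my trained files and an example image.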

    output.zip

    status/close 
    opened by xinyujituan 47
  • Effect of cpu_math_library_num_threads_ and use_mkldnn_ on computation speed in C++ on Windows

    Following the tutorial, I compiled ocr_system.exe on Windows (with the MKL math library). Testing on the same image gives the following:

    1. With the same cpu_math_library_num_threads_=10, enabling use_mkldnn takes 1.85s and disabling it takes 1.6s.
    2. With use_mkldnn disabled, cpu_math_library_num_threads_=0 takes 1.4s and cpu_math_library_num_threads_=12 takes 1.9s.

    CPU: Intel 8700 (6 cores, 12 threads)

    Both results are the opposite of what I expected, which is puzzling. Is single-threaded execution the fastest?

    opened by qq61786631 46
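
    Single-run timings like the ones in the issue above can be noisy, since the first call often includes one-time setup. A minimal sketch of a fairer comparison, assuming each configuration can be wrapped in a Python callable: the `benchmark` helper and the `fake_inference` workload are hypothetical stand-ins, not part of PaddleOCR or its C++ demo.

```python
import statistics
import time


def benchmark(fn, repeats=5, warmup=1):
    """Return the median wall-clock time (seconds) of fn() over `repeats`
    runs, after `warmup` untimed runs to absorb one-time setup cost."""
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def fake_inference():
    # Stand-in workload; replace with a real OCR call per configuration.
    sum(i * i for i in range(50_000))


# One entry per configuration being compared (labels are illustrative).
configurations = {"threads=1": fake_inference, "threads=12": fake_inference}
for label, fn in configurations.items():
    print(f"{label}: {benchmark(fn):.4f}s")
```

    Reporting the median over several warmed-up runs makes it easier to tell a real slowdown from run-to-run variation.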
  • PaddleOCR 2.5: use_tensorrt=True does not work

    Please provide the following information to quickly locate the problem

    • System Environment: Ubuntu 18.04, CUDA 10.2, cuDNN 8, Python 3.7, TensorRT 7.2.3.4
    • paddle2onnx 0.5, paddlehub 1.8.3, paddleocr 2.5.0.3, paddlepaddle-gpu 2.3.0, paddleslim 1.1.1, paddlex 1.3.7
    • Version: Paddle: 2.3.0 PaddleOCR: 2.5.0.3 Related components: tensorrt
    • Command Code: --use_tensorrt==true
    • 完整报错/Complete Error Message: [2022/06/21 13:07:18] ppocr DEBUG: Namespace(alpha=1.0, benchmark=False, beta=1.0, cls_batch_num=6, cls_image_shape='3, 48, 192', cls_model_dir='/home/ocr_model/cls_infer', cls_thresh=0.9, cpu_threads=10, crop_res_save_dir='./output', det=True, det_algorithm='DB', det_db_box_thresh=0.6, det_db_score_mode='fast', det_db_thresh=0.3, det_db_unclip_ratio=1.5, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_fce_box_type='poly', det_limit_side_len=960, det_limit_type='max', det_model_dir='/home/ocr_model/v3_det_infer/ch_PP-OCRv3_det_infer', det_pse_box_thresh=0.85, det_pse_box_type='quad', det_pse_min_area=16, det_pse_scale=1, det_pse_thresh=0, det_sast_nms_thresh=0.2, det_sast_polygon=False, det_sast_score_thresh=0.5, draw_img_save_dir='./inference_results', drop_score=0.5, e2e_algorithm='PGNet', e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_limit_side_len=768, e2e_limit_type='max', e2e_model_dir=None, e2e_pgnet_mode='fast', e2e_pgnet_score_thresh=0.5, e2e_pgnet_valid_set='totaltext', enable_mkldnn=False, fourier_degree=5, gpu_mem=500, help='==SUPPRESS==', image_dir=None, ir_optim=True, label_list=['0', '180'], lang='ch', layout=True, layout_label_map=None, layout_path_model='lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config', max_batch_size=10, max_text_length=25, min_subgraph_size=15, mode='structure', ocr=True, ocr_version='PP-OCRv3', output='./output', precision='fp32', process_id=0, rec=True, rec_algorithm='SVTR_LCNet', rec_batch_num=6, rec_char_dict_path='/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/ppocr/utils/ppocr_keys_v1.txt', rec_image_shape='3, 48, 320', rec_model_dir='/home/ocr_model/v3_rec_infer/ch_PP-OCRv3_rec_infer', save_crop_res=False, save_log_path='./log_output/', scales=[8, 16, 32], show_log=True, structure_version='PP-STRUCTURE', table=True, table_char_dict_path=None, table_max_len=488, table_model_dir=None, total_process_num=1, type='ocr', 
use_angle_cls=False, use_dilation=False, use_gpu=True, use_mp=False, use_onnx=False, use_pdserving=False, use_space_char=True, use_tensorrt=True, vis_font_path='./doc/fonts/simfang.ttf', warmup=False) W0621 13:07:21.006245 894 analysis_predictor.cc:1086] The one-time configuration of analysis predictor failed, which may be due to native predictor called first and its configurations taken effect. I0621 13:07:21.030115 894 analysis_predictor.cc:854] TensorRT subgraph engine is enabled --- Running analysis [ir_graph_build_pass] --- Running analysis [ir_graph_clean_pass] --- Running analysis [ir_analysis_pass] --- Running IR pass [adaptive_pool2d_convert_global_pass] I0621 13:07:21.065781 894 fuse_pass_base.cc:57] --- detected 10 subgraphs --- Running IR pass [shuffle_channel_detect_pass] --- Running IR pass [quant_conv2d_dequant_fuse_pass] --- Running IR pass [delete_quant_dequant_op_pass] --- Running IR pass [delete_quant_dequant_filter_op_pass] --- Running IR pass [delete_weight_dequant_linear_op_pass] --- Running IR pass [delete_quant_dequant_linear_op_pass] --- Running IR pass [add_support_int8_pass] I0621 13:07:21.130373 894 fuse_pass_base.cc:57] --- detected 185 subgraphs --- Running IR pass [simplify_with_basic_ops_pass] --- Running IR pass [embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [preln_embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [multihead_matmul_fuse_pass_v2] --- Running IR pass [multihead_matmul_fuse_pass_v3] --- Running IR pass [skip_layernorm_fuse_pass] --- Running IR pass [preln_skip_layernorm_fuse_pass] --- Running IR pass [conv_bn_fuse_pass] I0621 13:07:21.150663 894 fuse_pass_base.cc:57] --- detected 24 subgraphs --- Running IR pass [unsqueeze2_eltwise_fuse_pass] --- Running IR pass [trt_squeeze2_matmul_fuse_pass] --- Running IR pass [trt_reshape2_matmul_fuse_pass] --- Running IR pass [trt_flatten2_matmul_fuse_pass] --- Running IR pass [trt_map_matmul_v2_to_mul_pass] --- Running IR pass 
[trt_map_matmul_v2_to_matmul_pass] --- Running IR pass [trt_map_matmul_to_mul_pass] I0621 13:07:21.155673 894 fuse_pass_base.cc:57] --- detected 1 subgraphs --- Running IR pass [fc_fuse_pass] I0621 13:07:21.157024 894 fuse_pass_base.cc:57] --- detected 1 subgraphs --- Running IR pass [conv_elementwise_add_fuse_pass] I0621 13:07:21.167703 894 fuse_pass_base.cc:57] --- detected 42 subgraphs --- Running IR pass [tensorrt_subgraph_pass] I0621 13:07:21.179683 894 tensorrt_subgraph_pass.cc:141] --- detect a sub-graph with 133 nodes I0621 13:07:21.196204 894 tensorrt_subgraph_pass.cc:403] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time. I0621 13:07:21.737630 894 engine.cc:203] Run Paddle-TRT Dynamic Shape mode. I0621 13:07:42.235085 894 engine.cc:424] Inspector needs TensorRT version 8.2 and after. --- Running IR pass [conv_bn_fuse_pass] --- Running IR pass [conv_elementwise_add_act_fuse_pass] --- Running IR pass [conv_elementwise_add2_act_fuse_pass] --- Running IR pass [transpose_flatten_concat_fuse_pass] --- Running analysis [ir_params_sync_among_devices_pass] I0621 13:07:42.255204 894 ir_params_sync_among_devices_pass.cc:100] Sync params from CPU to GPU --- Running analysis [adjust_cudnn_workspace_size_pass] --- Running analysis [inference_op_replace_pass] --- Running analysis [memory_optimize_pass] I0621 13:07:42.259418 894 memory_optimize_pass.cc:216] Cluster name : shape_1.tmp_0_slice_0 size: 4 I0621 13:07:42.259449 894 memory_optimize_pass.cc:216] Cluster name : shape_0.tmp_0 size: 16 I0621 13:07:42.259457 894 memory_optimize_pass.cc:216] Cluster name : reshape2_0.tmp_1 size: 0 I0621 13:07:42.259477 894 memory_optimize_pass.cc:216] Cluster name : linear_1.tmp_1 size: 8 --- Running analysis [ir_graph_to_program_pass] I0621 13:07:42.308853 894 analysis_predictor.cc:1007] ======= optimize end ======= I0621 13:07:42.312048 894 naive_executor.cc:102] --- skip [feed], feed -> x I0621 13:07:42.312656 894 
naive_executor.cc:102] --- skip [save_infer_model/scale_0.tmp_1], fetch -> fetch I0621 13:07:42.422236 894 analysis_predictor.cc:854] TensorRT subgraph engine is enabled --- Running analysis [ir_graph_build_pass] --- Running analysis [ir_graph_clean_pass] --- Running analysis [ir_analysis_pass] --- Running IR pass [adaptive_pool2d_convert_global_pass] I0621 13:07:42.465929 894 fuse_pass_base.cc:57] --- detected 2 subgraphs --- Running IR pass [shuffle_channel_detect_pass] --- Running IR pass [quant_conv2d_dequant_fuse_pass] --- Running IR pass [delete_quant_dequant_op_pass] --- Running IR pass [delete_quant_dequant_filter_op_pass] --- Running IR pass [delete_weight_dequant_linear_op_pass] --- Running IR pass [delete_quant_dequant_linear_op_pass] --- Running IR pass [add_support_int8_pass] I0621 13:07:42.529808 894 fuse_pass_base.cc:57] --- detected 184 subgraphs --- Running IR pass [simplify_with_basic_ops_pass] --- Running IR pass [embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [preln_embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [multihead_matmul_fuse_pass_v2] --- Running IR pass [multihead_matmul_fuse_pass_v3] --- Running IR pass [skip_layernorm_fuse_pass] I0621 13:07:42.538357 894 fuse_pass_base.cc:57] --- detected 1 subgraphs --- Running IR pass [preln_skip_layernorm_fuse_pass] --- Running IR pass [conv_bn_fuse_pass] I0621 13:07:42.548557 894 fuse_pass_base.cc:57] --- detected 19 subgraphs --- Running IR pass [unsqueeze2_eltwise_fuse_pass] --- Running IR pass [trt_squeeze2_matmul_fuse_pass] --- Running IR pass [trt_reshape2_matmul_fuse_pass] --- Running IR pass [trt_flatten2_matmul_fuse_pass] --- Running IR pass [trt_map_matmul_v2_to_mul_pass] I0621 13:07:42.553280 894 fuse_pass_base.cc:57] --- detected 9 subgraphs --- Running IR pass [trt_map_matmul_v2_to_matmul_pass] I0621 13:07:42.554414 894 fuse_pass_base.cc:57] --- detected 4 subgraphs --- Running IR pass [trt_map_matmul_to_mul_pass] --- Running IR pass [fc_fuse_pass] I0621 
13:07:42.557754 894 fuse_pass_base.cc:57] --- detected 9 subgraphs
    --- Running IR pass [conv_elementwise_add_fuse_pass]
    I0621 13:07:42.563249 894 fuse_pass_base.cc:57] --- detected 23 subgraphs
    --- Running IR pass [tensorrt_subgraph_pass]
    I0621 13:07:42.572366 894 tensorrt_subgraph_pass.cc:141] --- detect a sub-graph with 57 nodes
    I0621 13:07:42.577783 894 tensorrt_subgraph_pass.cc:403] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
    I0621 13:07:42.580443 894 op_converter.h:253] trt input [pool2d_5.tmp_0_clone_0] dynamic shape info not set, please check and retry.
    Traceback (most recent call last):
      File "3.py", line 18, in <module>
        ocr_version='PP-OCRv3', use_angle_cls=False, use_tensorrt=True, lang='ch')
      File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/paddleocr.py", line 437, in __init__
        super().__init__(params)
      File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/predict_system.py", line 47, in __init__
        self.text_recognizer = predict_rec.TextRecognizer(args)
      File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/predict_rec.py", line 74, in __init__
        utility.create_predictor(args, 'rec', logger)
      File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/utility.py", line 313, in create_predictor
        predictor = inference.create_predictor(config)
    ValueError: (InvalidArgument) some trt inputs dynamic shape info not set, check the INFO log above for more details.
      [Hint: Expected all_dynamic_shape_set == true, but received all_dynamic_shape_set:0 != true:1.] (at /paddle/paddle/fluid/inference/tensorrt/convert/op_converter.h:287)

    Code:

    import os
    import sys
    import time

    import cv2

    sys.path.insert(0, '/usr/local/python3.7/lib/python3.7/site-packages/paddleocr')

    from paddleocr import PaddleOCR

    PKG_PATTERN = r'PKG.*:'

    # Local directories holding the exported inference models
    root_path = os.path.join('/home', 'ocr_model')
    cls_model_dir = os.path.join(root_path, 'cls_infer')
    det_model_dir = os.path.join(root_path, 'v3_det_infer/ch_PP-OCRv3_det_infer')
    rec_model_dir = os.path.join(root_path, 'v3_rec_infer/ch_PP-OCRv3_rec_infer')
    addleOCR = PaddleOCR(cls_model_dir=cls_model_dir,
                         det_model_dir=det_model_dir,
                         rec_model_dir=rec_model_dir,
                         ocr_version='PP-OCRv3', use_angle_cls=False, use_tensorrt=True, lang='ch')

    frame = cv2.imread('/home/095.png')
    print(frame.shape)
    # Run OCR repeatedly on the same frame and print the time per call
    while True:
        s1 = time.time()
        result = addleOCR.ocr(frame, cls=False)
        print('exec time:' + str(time.time() - s1))
        print(result)

    status/close 
    opened by shihaitao118 41
  • Calling the paddleocr package from Python as documented fails with FatalError: `Process abort signal` is detected by the operating system. Any help is appreciated

    https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.0/doc/doc_en/whl_en.md

    CentOS, Python 3.6, paddleocr 2.0.2, paddlepaddle 2.0.0rc1

    from paddleocr import PaddleOCR,draw_ocr
    
    # Paddleocr supports Chinese, English, French, German, Korean and Japanese.
    # You can set the parameter `lang` as `ch`, `en`, `french`, `german`, `korean`, `japan`
    # to switch the language model in order.
    ocr = PaddleOCR(use_angle_cls=True,use_gpu=False, lang='ch') # need to run only once to download and load model into memory
    img_path = './tmp.jpg'
    result = ocr.ocr(img_path, cls=True)
    for line in result:
        print(line)
    

    Error message:

    --------------------------------------
    C++ Traceback (most recent call last):
    --------------------------------------
    0   paddle::AnalysisPredictor::Run(std::vector<paddle::PaddleTensor, std::allocator<paddle::PaddleTensor> > const&, std::vector<paddle::PaddleTensor, std::allocator<paddle::PaddleTensor> >*, int)
    1   paddle::framework::NaiveExecutor::Run()
    2   paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
    3   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
    4   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
    5   std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CPUPlace, false, 0ul, paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, float>, paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, double> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
    6   paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const
    7   cblas_sgemm
    8   sgemm
    9   mkl_blas_sgemm
    10  mkl_serv_get_num_stripes
    11  omp_get_num_procs
    12  paddle::framework::SignalHandle(char const*, int)
    13  paddle::platform::GetCurrentTraceBackString()
    
    ----------------------
    Error Message Summary:
    ----------------------
    FatalError: `Process abort signal` is detected by the operating system.
      [TimeInfo: *** Aborted at 1613980216 (unix time) try "date -d @1613980216" if you are using GNU date ***]
      [SignalInfo: *** SIGABRT (@0x3f1000139a6) received by PID 80294 (TID 0x7f5e47933740) from PID 80294 ***]
    

    The same error still occurs with paddlepaddle 2.0.0.

    opened by suparek 39
  • paddlepaddle-gpu 2.0.0rc1 reports FatalError: `Segmentation fault` is detected by the operating system.

    I am using the latest PaddleOCR from git. Running python tools/infer/predict_system.py fails with the following error:


    C++ Traceback (most recent call last):

    0   paddle::framework::SignalHandle(char const*, int)
    1   paddle::platform::GetCurrentTraceBackString()

    Error Message Summary:

    FatalError: `Segmentation fault` is detected by the operating system.
      [TimeInfo: *** Aborted at 1609724467 (unix time) try "date -d @1609724467" if you are using GNU date ***]
      [SignalInfo: *** SIGSEGV (@0x0) received by PID 127353 (TID 0x7f4aa7f1d700) from PID 0 ***]

    Segmentation fault (core dumped)

    Output of paddle.utils.run_check():

    import paddle
    paddle.utils.run_check()
    Running verify PaddlePaddle program ...
    W0104 09:50:08.441300 127586 device_context.cc:320] Please NOTE: device: 0, GPU Compute Capability: 6.1, Driver API Version: 10.2, Runtime API Version: 10.0
    W0104 09:50:08.444324 127586 device_context.cc:330] device: 0, cuDNN Version: 8.0.
    PaddlePaddle works well on 1 GPU.
    W0104 09:50:10.058878 127586 parallel_executor.cc:491] Cannot enable P2P access from 0 to 2
    W0104 09:50:10.058951 127586 parallel_executor.cc:491] Cannot enable P2P access from 0 to 3
    W0104 09:50:10.799384 127586 parallel_executor.cc:491] Cannot enable P2P access from 1 to 2
    W0104 09:50:10.799430 127586 parallel_executor.cc:491] Cannot enable P2P access from 1 to 3
    W0104 09:50:10.799440 127586 parallel_executor.cc:491] Cannot enable P2P access from 2 to 0
    W0104 09:50:10.799450 127586 parallel_executor.cc:491] Cannot enable P2P access from 2 to 1
    W0104 09:50:11.883519 127586 parallel_executor.cc:491] Cannot enable P2P access from 3 to 0
    W0104 09:50:11.883584 127586 parallel_executor.cc:491] Cannot enable P2P access from 3 to 1
    W0104 09:50:15.108191 127586 fuse_all_reduce_op_pass.cc:75] Find all_reduce operators: 2. To make the speed faster, some all_reduce ops are fused during training, after fusion, the number of all_reduce ops is 2.
    PaddlePaddle works well on 4 GPUs.
    PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.

    Environment: Python 3.8.5; the same error also occurs with Python 3.7.

    Package Version


    alabaster 0.7.12 anaconda-client 1.7.2 anaconda-navigator 1.9.12 anaconda-project 0.8.3 appdirs 1.4.4 asn1crypto 1.4.0 astor 0.8.1 astroid 2.4.2 astropy 4.0.1.post1 atomicwrites 1.4.0 attrs 20.1.0 Babel 2.8.0 backcall 0.2.0 backports.functools-lru-cache 1.6.1 backports.shutil-get-terminal-size 1.0.0 backports.tempfile 1.0 backports.weakref 1.0.post1 bce-python-sdk 0.8.53 beautifulsoup4 4.9.1 bitarray 1.5.3 bkcharts 0.2 bokeh 2.2.1 boto 2.49.0 Bottleneck 1.3.2 brotlipy 0.7.0 certifi 2020.6.20 cffi 1.14.2 cfgv 3.2.0 chardet 3.0.4 cliapp 1.0.9 click 7.1.2 cloudpickle 1.6.0 clyent 1.2.2 colorama 0.4.3 conda 4.8.4 conda-build 3.20.2 conda-package-handling 1.7.0 conda-verify 3.4.2 contextlib2 0.6.0.post1 cryptography 3.1 cycler 0.10.0 Cython 0.29.21 cytoolz 0.10.1 dask 2.25.0 datashape 0.5.4 decorator 4.4.2 distlib 0.3.1 distributed 2.25.0 docutils 0.16 entrypoints 0.3 et-xmlfile 1.0.1 fastcache 1.1.0 filelock 3.0.12 flake8 3.8.4 Flask 1.1.2 Flask-Babel 2.0.0 Flask-Cors 3.0.9 fsspec 0.8.0 future 0.18.2 gast 0.3.3 gevent 20.6.2 glob2 0.7 gmpy2 2.0.8 greenlet 0.4.16 h5py 2.10.0 HeapDict 1.0.1 html5lib 1.1 hypothesis 5.29.0 identify 1.5.10 idna 2.10 imageio 2.9.0 imagesize 1.2.0 imgaug 0.4.0 importlib-metadata 1.7.0 ipykernel 5.3.4 ipython 7.18.1 ipython-genutils 0.2.0 isort 5.4.2 itsdangerous 1.1.0 jdcal 1.4.1 jedi 0.17.2 Jinja2 2.11.2 joblib 0.16.0 jsonschema 3.2.0 jupyter-client 6.1.6 jupyter-console 6.2.0 jupyter-core 4.6.3 kiwisolver 1.2.0 lazy-object-proxy 1.4.3 libarchive-c 2.9 llvmlite 0.34.0 lmdb 1.0.0 locket 0.2.0 lxml 4.5.2 MarkupSafe 1.1.1 matplotlib 3.3.1 mccabe 0.6.1 mistune 0.8.4 mkl-fft 1.1.0 mkl-random 1.1.1 mkl-service 2.3.0 mock 4.0.2 more-itertools 8.5.0 mpmath 1.1.0 msgpack 1.0.0 multipledispatch 0.6.0 navigator-updater 0.2.1 nbformat 5.0.7 networkx 2.5 nltk 3.5 nodeenv 1.5.0 nose 1.3.7 numba 0.51.2 numexpr 2.7.1 numpy 1.19.1 numpydoc 1.1.0 odo 0.5.1 olefile 0.46 opencv-python 4.2.0.32 openpyxl 3.0.5 packaging 20.4 paddlepaddle-gpu 2.0.0rc1.post100 
pandas 1.1.1 pandocfilters 1.4.2 parso 0.7.0 partd 1.1.0 path 15.0.0 pathlib2 2.3.5 patsy 0.5.1 pep8 1.7.1 pexpect 4.8.0 pickleshare 0.7.5 Pillow 7.2.0 pip 20.2.2 pkginfo 1.5.0.1 pluggy 0.13.1 ply 3.11 pre-commit 2.9.3 prompt-toolkit 3.0.7 protobuf 3.14.0 psutil 5.7.2 ptyprocess 0.6.0 py 1.9.0 pyclipper 1.2.1 pycodestyle 2.6.0 pycosat 0.6.3 pycparser 2.20 pycrypto 2.6.1 pycryptodome 3.9.9 pycurl 7.43.0.5 pyflakes 2.2.0 Pygments 2.6.1 pylint 2.6.0 pyodbc 4.0.0-unsupported pyOpenSSL 19.1.0 pyparsing 2.4.7 pyrsistent 0.16.0 PySocks 1.7.1 pytest 5.0.0 pytest-arraydiff 0.2 pytest-astropy 0.8.0 pytest-astropy-header 0.1.2 pytest-doctestplus 0.8.0 pytest-openfiles 0.5.0 pytest-remotedata 0.3.2 python-dateutil 2.8.1 python-Levenshtein 0.12.0 pytz 2020.1 PyWavelets 1.1.1 PyYAML 5.3.1 pyzmq 18.1.1 QtAwesome 0.7.2 qtconsole 4.7.6 QtPy 1.9.0 regex 2020.7.14 requests 2.24.0 rope 0.17.0 ruamel-yaml 0.15.87 scikit-image 0.16.2 scikit-learn 0.23.2 scipy 1.5.2 seaborn 0.10.1 Send2Trash 1.5.0 setuptools 49.6.0.post20200814 Shapely 1.7.1 simplegeneric 0.8.1 singledispatch 3.4.0.3 sip 4.19.13 six 1.15.0 snowballstemmer 2.0.0 sortedcollections 1.2.1 sortedcontainers 2.2.2 soupsieve 2.0.1 Sphinx 3.2.1 sphinxcontrib-applehelp 1.0.2 sphinxcontrib-devhelp 1.0.2 sphinxcontrib-htmlhelp 1.0.3 sphinxcontrib-jsmath 1.0.1 sphinxcontrib-qthelp 1.0.3 sphinxcontrib-serializinghtml 1.1.4 sphinxcontrib-websupport 1.2.4 SQLAlchemy 1.3.19 statsmodels 0.11.1 sympy 1.5.1 tables 3.6.1 tblib 1.7.0 terminado 0.8.3 testpath 0.4.4 threadpoolctl 2.1.0 toml 0.10.1 toolz 0.10.0 tornado 6.0.4 tqdm 4.48.2 traitlets 4.3.3 typing-extensions 3.7.4.3 unicodecsv 0.14.1 urllib3 1.25.10 virtualenv 20.2.2 visualdl 2.1.0 wcwidth 0.2.5 webencodings 0.5.1 Werkzeug 1.0.1 wheel 0.35.1 wrapt 1.11.2 xlrd 1.2.0 XlsxWriter 1.3.3 xlwt 1.3.0 xmltodict 0.12.0 zict 2.0.0 zipp 3.1.0 zope.event 4.4 zope.interface 5.1.0

    With the previous version (1.8.5 installed), the test runs without problems.

    documentation 
    opened by xiulianzw 34
  • The direction classifier is accurate when validated in Python, but after conversion to an inference model and deployment in C++, the output is wrong

    Please provide the following information to quickly locate the problem

    • System Environment:
    • Version: Paddle: PaddleOCR: Related components:
    • Command Code:
    • Complete Error Message:

    I added two more angles, 90° and 270°, and trained the model. Validation in Python is correct and outputs the 90° and 270° results, but in the official C++ deployment sample program, the accuracy of the exported inference model drops. Why is the direction classifier less accurate after export and deployment in C++ than the trained model is in Python?

    status/close 
    opened by Camphora7 32
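
    For context on the question above: a direction classifier's output index is typically mapped to an angle through a label list, and the debug output in another issue on this page shows label_list=['0', '180'] in the Python package's arguments. Whether the C++ sample uses a comparable two-entry mapping is an assumption here, not something the issue confirms. The sketch below only illustrates why a four-class model needs a matching four-entry list; `decode_direction` and the label order are hypothetical.

```python
def decode_direction(scores, label_list):
    """Map the highest-scoring class index to its angle label.

    Raises ValueError when the label list does not match the number of
    classes, which is the mismatch this sketch is meant to illustrate.
    """
    if len(scores) != len(label_list):
        raise ValueError(
            f"model outputs {len(scores)} classes but label_list has "
            f"{len(label_list)} entries"
        )
    best = max(range(len(scores)), key=scores.__getitem__)
    return label_list[best], scores[best]


# Hypothetical four-class output; the label order is an assumption.
four_class_scores = [0.1, 0.7, 0.1, 0.1]
print(decode_direction(four_class_scores, ["0", "90", "180", "270"]))

try:
    decode_direction(four_class_scores, ["0", "180"])
except ValueError as err:
    print("mismatch:", err)
```

    If the deployment code silently assumes two classes instead of raising an error, a four-class model could appear less accurate there even though the exported weights are unchanged.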
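  • Training model and inference model give inconsistent results

    With PaddleOCR release 2.0, I trained a license plate detection model based on det_mv3_db.yml. Testing the trained model directly with infer_det.py works very well. I then used export_model.py to convert the best_accuracy model into an inference model (using the training configuration, config.yml) and ran prediction with predict_det.py. The results are worse than before: the detection boxes are not tight.

    Testing the official ch_ppocr_mobile_v2.0_det_train model before and after conversion also gives inconsistent results.

    How can I make the results of predict_det.py consistent with those of infer_det.py?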

    opened by simplew2011 30
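  • Out-of-memory problem!

    When training the DB text detection network, I frequently run out of memory, as shown in the attached screenshot. The configuration file det_r50_vd_db.yml is as follows: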

    Global:
      algorithm: DB
      use_gpu: true
      epoch_num: 1200
      log_smooth_window: 20
      print_batch_step: 30
      save_model_dir: ./output/det_db/
      save_epoch_step: 200
      eval_batch_step: 10000
      train_batch_size_per_card: 2
      test_batch_size_per_card: 1
      image_shape: [3, 640, 640]
      reader_yml: ./configs/det/det_db_chinese_reader.yml
      pretrain_weights: ./pretrain_models/ResNet50_vd_ssld_pretrained/
      save_res_path: ./output/det_db/predicts_db.txt
      checkpoints: 
      save_inference_dir:
    

    The configuration file det_db_chinese_reader.yml is as follows:

    TrainReader:
      reader_function: ppocr.data.det.dataset_traversal,TrainReader
      process_function: ppocr.data.det.db_process,DBProcessTrain
      num_workers: 4
      img_set_dir: ""
      label_file_path: /home/aistudio/data/data39969/mtwi_2018_split/train.txt
    
    EvalReader:
      reader_function: ppocr.data.det.dataset_traversal,EvalTestReader
      process_function: ppocr.data.det.db_process,DBProcessTest
      img_set_dir: ""
      label_file_path: /home/aistudio/data/data39969/mtwi_2018_split/test.txt
      test_image_shape: [736, 1280]
      
    TestReader:
      reader_function: ppocr.data.det.dataset_traversal,EvalTestReader
      process_function: ppocr.data.det.db_process,DBProcessTest
      infer_img:
      img_set_dir: ""
      label_file_path: /home/aistudio/data/data39969/icpr_mtwi_task2/test.txt
      test_image_shape: [736, 1280]
      do_eval: True
    

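    The training dataset comes from https://tianchi.aliyun.com/competition/entrance/231685/information. I split the data manually into training and validation sets at a ratio of 9:1 (9043:1005). I have tried batch sizes from 2 to 16 and always run out of memory. With num_workers=1 training works, but the iterations are far too slow. Is there a good solution?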

    opened by NextGuido 30
  • Why am I getting 0.00 accuracy while training SVTRNet?

    I was trying to train an SVTRNet model for Bangla. Here is the config file I am using: https://pastecode.io/s/4czzqoix

    /backup2/synthtiger/bangla/PaddleOCR/ppocr/utils/bn_char_synth.txt contains characters like: } ~ । ঁ ং ঃ অ আ ই etc.

    Inside the /backup2/synthtiger/bangla/PaddleOCR/train_data/ folder I have folders like 0, 1, 2, 3, etc., each containing 10k images. ['/backup2/synthtiger/bangla/PaddleOCR/train_data/gt.txt'] gt.txt contains the annotations of all the images found inside the train_data folder.

    The same applies to the validation dataset. When I try to train, I get acc: 0.00, like this:

    (mobassir) [email protected]:/backup2/synthtiger/bangla/PaddleOCR$ python3 tools/train.py -c configs/rec/rec_svtrnet.yml
    /home/apsisdev/.local/lib/python3.8/site-packages/scipy/fft/__init__.py:97: DeprecationWarning: The module numpy.dual is deprecated.  Instead of using dual, use the functions directly from numpy or scipy.
      from numpy.dual import register_func
    /home/apsisdev/.local/lib/python3.8/site-packages/scipy/sparse/sputils.py:17: DeprecationWarning: `np.typeDict` is a deprecated alias for `np.sctypeDict`.
      supported_dtypes = [np.typeDict[x] for x in supported_dtypes]
    /home/apsisdev/.local/lib/python3.8/site-packages/scipy/special/orthogonal.py:81: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      from numpy import (exp, inf, pi, sqrt, floor, sin, cos, around, int,
    [2022/06/19 11:16:38] ppocr INFO: Architecture : 
    [2022/06/19 11:16:38] ppocr INFO:     Backbone : 
    [2022/06/19 11:16:38] ppocr INFO:         depth : [3, 6, 3]
    [2022/06/19 11:16:38] ppocr INFO:         embed_dim : [64, 128, 256]
    [2022/06/19 11:16:38] ppocr INFO:         img_size : [32, 100]
    [2022/06/19 11:16:38] ppocr INFO:         last_stage : True
    [2022/06/19 11:16:38] ppocr INFO:         local_mixer : [[7, 11], [7, 11], [7, 11]]
    [2022/06/19 11:16:38] ppocr INFO:         mixer : ['Local', 'Local', 'Local', 'Local', 'Local', 'Local', 'Global', 'Global', 'Global', 'Global', 'Global', 'Global']
    [2022/06/19 11:16:38] ppocr INFO:         name : SVTRNet
    [2022/06/19 11:16:38] ppocr INFO:         num_heads : [2, 4, 8]
    [2022/06/19 11:16:38] ppocr INFO:         out_channels : 192
    [2022/06/19 11:16:38] ppocr INFO:         out_char_num : 25
    [2022/06/19 11:16:38] ppocr INFO:         patch_merging : Conv
    [2022/06/19 11:16:38] ppocr INFO:         prenorm : False
    [2022/06/19 11:16:38] ppocr INFO:     Head : 
    [2022/06/19 11:16:38] ppocr INFO:         name : CTCHead
    [2022/06/19 11:16:38] ppocr INFO:     Neck : 
    [2022/06/19 11:16:38] ppocr INFO:         encoder_type : reshape
    [2022/06/19 11:16:38] ppocr INFO:         name : SequenceEncoder
    [2022/06/19 11:16:38] ppocr INFO:     Transform : 
    [2022/06/19 11:16:38] ppocr INFO:         name : STN_ON
    [2022/06/19 11:16:38] ppocr INFO:         num_control_points : 20
    [2022/06/19 11:16:38] ppocr INFO:         stn_activation : none
    [2022/06/19 11:16:38] ppocr INFO:         tps_inputsize : [32, 64]
    [2022/06/19 11:16:38] ppocr INFO:         tps_margins : [0.05, 0.05]
    [2022/06/19 11:16:38] ppocr INFO:         tps_outputsize : [32, 100]
    [2022/06/19 11:16:38] ppocr INFO:     algorithm : SVTR
    [2022/06/19 11:16:38] ppocr INFO:     model_type : rec
    [2022/06/19 11:16:38] ppocr INFO: Eval : 
    [2022/06/19 11:16:38] ppocr INFO:     dataset : 
    [2022/06/19 11:16:38] ppocr INFO:         data_dir : /backup2/synthtiger/bangla/PaddleOCR/horizontal_valid/
    [2022/06/19 11:16:38] ppocr INFO:         label_file_list : ['/backup2/synthtiger/bangla/PaddleOCR/horizontal_valid/gt.txt']
    [2022/06/19 11:16:38] ppocr INFO:         name : SimpleDataSet
    [2022/06/19 11:16:38] ppocr INFO:         transforms : 
    [2022/06/19 11:16:38] ppocr INFO:             DecodeImage : 
    [2022/06/19 11:16:38] ppocr INFO:                 channel_first : False
    [2022/06/19 11:16:38] ppocr INFO:                 img_mode : BGR
    [2022/06/19 11:16:38] ppocr INFO:             CTCLabelEncode : None
    [2022/06/19 11:16:38] ppocr INFO:             RecResizeImg : 
    [2022/06/19 11:16:38] ppocr INFO:                 character_dict_path : None
    [2022/06/19 11:16:38] ppocr INFO:                 image_shape : [3, 64, 256]
    [2022/06/19 11:16:38] ppocr INFO:                 padding : False
    [2022/06/19 11:16:38] ppocr INFO:             KeepKeys : 
    [2022/06/19 11:16:38] ppocr INFO:                 keep_keys : ['image', 'label', 'length']
    [2022/06/19 11:16:38] ppocr INFO:     loader : 
    [2022/06/19 11:16:38] ppocr INFO:         batch_size_per_card : 512
    [2022/06/19 11:16:38] ppocr INFO:         drop_last : False
    [2022/06/19 11:16:38] ppocr INFO:         num_workers : 0
    [2022/06/19 11:16:38] ppocr INFO:         shuffle : False
    [2022/06/19 11:16:38] ppocr INFO: Global : 
    [2022/06/19 11:16:38] ppocr INFO:     cal_metric_during_train : True
    [2022/06/19 11:16:38] ppocr INFO:     character_dict_path : /backup2/synthtiger/bangla/PaddleOCR/ppocr/utils/bn_char_synth.txt
    [2022/06/19 11:16:38] ppocr INFO:     character_type : ch
    [2022/06/19 11:16:38] ppocr INFO:     checkpoints : None
    [2022/06/19 11:16:38] ppocr INFO:     distributed : False
    [2022/06/19 11:16:38] ppocr INFO:     epoch_num : 100
    [2022/06/19 11:16:38] ppocr INFO:     eval_batch_step : [0, 5000]
    [2022/06/19 11:16:38] ppocr INFO:     infer_img : doc/imgs_words_en/41.jpg
    [2022/06/19 11:16:38] ppocr INFO:     infer_mode : False
    [2022/06/19 11:16:38] ppocr INFO:     log_smooth_window : 20
    [2022/06/19 11:16:38] ppocr INFO:     max_text_length : 25
    [2022/06/19 11:16:38] ppocr INFO:     pretrained_model : None
    [2022/06/19 11:16:38] ppocr INFO:     print_batch_step : 200
    [2022/06/19 11:16:38] ppocr INFO:     save_epoch_step : 1
    [2022/06/19 11:16:38] ppocr INFO:     save_inference_dir : None
    [2022/06/19 11:16:38] ppocr INFO:     save_model_dir : /backup2/synthtiger/bangla/PaddleOCR/output/rec/svtr/
    [2022/06/19 11:16:38] ppocr INFO:     save_res_path : /backup2/synthtiger/bangla/PaddleOCR/output/rec/predicts_svtr_tiny.txt
    [2022/06/19 11:16:38] ppocr INFO:     use_gpu : True
    [2022/06/19 11:16:38] ppocr INFO:     use_space_char : True
    [2022/06/19 11:16:38] ppocr INFO:     use_visualdl : False
    [2022/06/19 11:16:38] ppocr INFO: Loss : 
    [2022/06/19 11:16:38] ppocr INFO:     name : CTCLoss
    [2022/06/19 11:16:38] ppocr INFO: Metric : 
    [2022/06/19 11:16:38] ppocr INFO:     main_indicator : acc
    [2022/06/19 11:16:38] ppocr INFO:     name : RecMetric
    [2022/06/19 11:16:38] ppocr INFO: Optimizer : 
    [2022/06/19 11:16:38] ppocr INFO:     beta1 : 0.9
    [2022/06/19 11:16:38] ppocr INFO:     beta2 : 0.99
    [2022/06/19 11:16:38] ppocr INFO:     epsilon : 8e-08
    [2022/06/19 11:16:38] ppocr INFO:     lr : 
    [2022/06/19 11:16:38] ppocr INFO:         learning_rate : 0.0005
    [2022/06/19 11:16:38] ppocr INFO:         name : Cosine
    [2022/06/19 11:16:38] ppocr INFO:         warmup_epoch : 2
    [2022/06/19 11:16:38] ppocr INFO:     name : AdamW
    [2022/06/19 11:16:38] ppocr INFO:     no_weight_decay_name : norm pos_embed
    [2022/06/19 11:16:38] ppocr INFO:     one_dim_param_no_weight_decay : True
    [2022/06/19 11:16:38] ppocr INFO:     weight_decay : 0.05
    [2022/06/19 11:16:38] ppocr INFO: PostProcess : 
    [2022/06/19 11:16:38] ppocr INFO:     name : CTCLabelDecode
    [2022/06/19 11:16:38] ppocr INFO: Train : 
    [2022/06/19 11:16:38] ppocr INFO:     dataset : 
    [2022/06/19 11:16:38] ppocr INFO:         data_dir : /backup2/synthtiger/bangla/PaddleOCR/train_data/
    [2022/06/19 11:16:38] ppocr INFO:         label_file_list : ['/backup2/synthtiger/bangla/PaddleOCR/train_data/gt.txt']
    [2022/06/19 11:16:38] ppocr INFO:         name : SimpleDataSet
    [2022/06/19 11:16:38] ppocr INFO:         transforms : 
    [2022/06/19 11:16:38] ppocr INFO:             DecodeImage : 
    [2022/06/19 11:16:38] ppocr INFO:                 channel_first : False
    [2022/06/19 11:16:38] ppocr INFO:                 img_mode : BGR
    [2022/06/19 11:16:38] ppocr INFO:             CTCLabelEncode : None
    [2022/06/19 11:16:38] ppocr INFO:             RecResizeImg : 
    [2022/06/19 11:16:38] ppocr INFO:                 character_dict_path : None
    [2022/06/19 11:16:38] ppocr INFO:                 image_shape : [3, 64, 256]
    [2022/06/19 11:16:38] ppocr INFO:                 padding : False
    [2022/06/19 11:16:38] ppocr INFO:             KeepKeys : 
    [2022/06/19 11:16:38] ppocr INFO:                 keep_keys : ['image', 'label', 'length']
    [2022/06/19 11:16:38] ppocr INFO:     loader : 
    [2022/06/19 11:16:38] ppocr INFO:         batch_size_per_card : 1024
    [2022/06/19 11:16:38] ppocr INFO:         drop_last : True
    [2022/06/19 11:16:38] ppocr INFO:         num_workers : 0
    [2022/06/19 11:16:38] ppocr INFO:         shuffle : True
    [2022/06/19 11:16:38] ppocr INFO: profiler_options : None
    [2022/06/19 11:16:38] ppocr INFO: train with paddle 2.3.0 and device Place(gpu:0)
    [2022/06/19 11:16:38] ppocr INFO: Initialize indexs of datasets:['/backup2/synthtiger/bangla/PaddleOCR/train_data/gt.txt']
    [2022/06/19 11:17:15] ppocr INFO: Initialize indexs of datasets:['/backup2/synthtiger/bangla/PaddleOCR/horizontal_valid/gt.txt']
    W0619 11:17:20.803553 1660197 gpu_context.cc:278] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.6, Runtime API Version: 11.2
    W0619 11:17:20.823755 1660197 gpu_context.cc:306] device: 0, cuDNN Version: 8.1.
    [2022/06/19 11:17:32] ppocr INFO: train from scratch
    [2022/06/19 11:17:32] ppocr INFO: train dataloader has 9253 iters
    [2022/06/19 11:17:32] ppocr INFO: valid dataloader has 1851 iters
    [2022/06/19 11:17:32] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 5000 iterations
    [2022/06/19 14:28:20] ppocr INFO: epoch: [1/100], global_step: 200, lr: 0.000005, acc: 0.000000, norm_edit_dis: 0.000000, loss: 57.202286, avg_reader_cost: 53.43127 s, avg_batch_cost: 57.23867 s, avg_samples: 1024.0, ips: 17.89000 samples/s, eta: 612 days, 20:44:52
    

    Do you need more information? What am I missing? Please help, thanks.

    good first issue recognition status/close 
    opened by mobassir94 29
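The throughput figures in the log above can be checked with a little arithmetic. The sketch below assumes the logged ETA is simply the remaining iterations multiplied by the average batch cost (an assumption about how the figure is derived, not something the log states), and it also computes how much of each batch is spent waiting on the data reader. The function names are hypothetical. It says nothing about the accuracy value itself; it only puts the 612-day estimate and the `num_workers : 0` setting in context.

```python
def training_eta_seconds(iters_per_epoch, epochs, done_steps, avg_batch_cost_s):
    """Remaining iterations times average seconds per batch (assumed ETA model)."""
    remaining = iters_per_epoch * epochs - done_steps
    return remaining * avg_batch_cost_s


def split_days(seconds):
    """Break a duration in seconds into (days, hours, minutes, seconds)."""
    days, rem = divmod(int(seconds), 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return days, hours, minutes, secs


def reader_share(avg_reader_cost_s, avg_batch_cost_s):
    """Fraction of each batch spent in the data reader."""
    return avg_reader_cost_s / avg_batch_cost_s


# Values copied from the log: 9253 iters per epoch, 100 epochs, step 200,
# avg_batch_cost 57.23867 s, avg_reader_cost 53.43127 s.
eta = training_eta_seconds(9253, 100, 200, 57.23867)
share = reader_share(53.43127, 57.23867)

print("ETA (days, h, m, s):", split_days(eta))
print("share of batch time spent reading data: {:.1%}".format(share))
```

Under these assumptions the estimate lands at roughly 612 days, in line with the logged ETA, and about 93% of each batch is reader time, which suggests data loading rather than the model is the bottleneck in this run.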
  • Sending an image's base64 to hub serving returns 'Please check data format!', 'results': '', 'status': '-1'}

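    I am using the quick deployment mode of hub serving, through the http://<IP address>:8868/predict/ocr_system endpoint. I tried two ways of producing the image base64:

    1. Reading a local image: image = open(image_path, 'rb').read(); imgBase64 = base64.b64encode(image).decode('utf-8')
    2. Reading the image from a URL: content = requests.get(img_url).content; imgBase64 = base64.b64encode(content).decode('utf-8')

    Both run into the "Please check data format" problem. Most images work, but a small number return "Please check data format" after 10 s. How should images be processed before being sent to hub serving? I also tried converting to an Image first, calling convert('RGB'), and then encoding to base64, but that does not work either.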
    opened by pkuyilong 29
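One thing that may be worth ruling out here (an assumption, not a confirmed cause) is that the few failing inputs are not decodable images at all, for example when a URL returns an HTML error page instead of image bytes. The sketch below checks the leading magic bytes before base64-encoding. The helper names and the short signature table are hypothetical and cover only a few common formats.

```python
import base64

# Leading "magic" bytes of a few common image formats (illustrative, incomplete).
_SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF8": "gif",
    b"BM": "bmp",
}


def sniff_image_format(raw):
    """Return a format name if the bytes start with a known signature, else None."""
    for prefix, name in _SIGNATURES.items():
        if raw.startswith(prefix):
            return name
    return None


def to_base64_or_raise(raw):
    """Base64-encode image bytes, refusing payloads that do not look like an image."""
    if sniff_image_format(raw) is None:
        raise ValueError("payload does not start with a known image signature")
    return base64.b64encode(raw).decode("utf-8")


# Usage with in-memory data: a PNG header passes, an HTML page is rejected.
png_like = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
print(sniff_image_format(png_like))
try:
    to_base64_or_raise(b"<html>404 Not Found</html>")
except ValueError as exc:
    print("rejected:", exc)
```

Logging the rejected payloads on the client side would show whether the failing requests share a format or a source before looking at the server.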
  • paddleocr with paddle serving on tensorrt

    Environment configuration:

    • paddleocr-release2.5
    • docker_image: registry.baidubce.com/paddlepaddle/paddle:2.1.3-gpu-cuda10.2-cudnn7
    • tensorrt: 7.2.1.6
    • paddle-gpu: 2.1.1 (chosen to match tensorrt 7.2)
    • paddle-serving-app: 0.7.0
    • paddle-serving-client: 0.7.0
    • paddle-serving-server-gpu: 0.7.0.post102

    Problem: running the Paddle Serving Python pipeline with python web_service.py reports the following error: The input [conv2d_252.tmp_0] shape of trt subgraph is [-1,96,-1,-1], please enable trt dynamic_shape mode by SetTRTDynamicShapeInfo

    Following https://gitee.com/paddlepaddle/Serving/blob/v0.8.2/doc/TensorRT_Dynamic_Shape_CN.md, I had already added a set_dynamic_shape_info function to the DetOp and RecOp classes in web_service.py, but it has no effect and the error persists.

    opened by sybest1259 28
  • Service deployment parsing fails with IndexError: string index out of range

    (venv) PS D:\orc2\paddleOCR> python tools/test_hubserving.py --server_url=http://127.0.0.1:8868/predict/structure_table --image_dir=D:\1.jpeg
    D:\orc2\paddleOCR\venv\lib\site-packages\skimage\util\dtype.py:27: DeprecationWarning: np.bool8 is a deprecated alias for np.bool_. (Deprecated NumPy 1.24)
    IndexError: string index out of range

    opened by zyzz1974 0
  • PPOCRLabel does not run properly

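    I downloaded the latest PaddleOCR 2.6 and wanted to use PPOCRLabel to train on my own data, but it never runs properly. I followed the official tutorial exactly, and the installation basically completes without problems, but the tool will not run.

    At first it reported an error about np.int; I found the code and changed np.int to np.int32, which resolved that. After reaching the UI and choosing re-recognition, the following error is reported: AttributeError: 'tuple' object has no attribute 'insert'

    Sometimes, when I click directly on the image without first clicking "rectangle annotation", this error is reported:

    Traceback (most recent call last):
      File "D:\Python310\lib\site-packages\PPOCRLabel\PPOCRLabel.py", line 1425, in scrollRequest
        bar.setValue(bar.value() + bar.singleStep() * units)
    TypeError: setValue(self, int): argument 1 has unexpected type 'float'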

    • System Environment: Windows 10
    • Version: Paddle: 2.4.1, PaddleOCR: 2.6.0; Related components: PPOCRLabel
    • Command Code:
    • Complete Error Message:
    opened by metoogo 2
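The last traceback above shows `setValue` rejecting a float argument. A plausible reading (an assumption, since the report does not confirm it) is that the Qt binding under Python 3.10 no longer converts floats to int implicitly, so an explicit `int(...)` around the computed value would avoid the TypeError. The sketch below reproduces the pattern with a stand-in class; `StrictScrollBar` and `scroll_by` are hypothetical names and are not part of PPOCRLabel or Qt.

```python
class StrictScrollBar:
    """Stand-in for a scroll bar whose setValue accepts only int (hypothetical)."""

    def __init__(self, value=0, single_step=20):
        self._value = value
        self._single_step = single_step

    def value(self):
        return self._value

    def singleStep(self):
        return self._single_step

    def setValue(self, new_value):
        # Mimic the strict behaviour seen in the traceback: floats are rejected.
        if not isinstance(new_value, int):
            raise TypeError("setValue(self, int): argument 1 has unexpected type "
                            + repr(type(new_value).__name__))
        self._value = new_value


def scroll_by(bar, units):
    """Same arithmetic as the failing line, with an explicit int() conversion."""
    bar.setValue(int(bar.value() + bar.singleStep() * units))


bar = StrictScrollBar(value=100, single_step=20)
scroll_by(bar, -1.5)  # units is often fractional for wheel events
print(bar.value())
```

Note that `int()` truncates toward zero; `round()` could be used instead if nearest-integer scrolling is preferred.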
  • Slow runtime large images on CPU

    • System environment: Ubuntu 20.04
    • Version: latest
    • Command code: -
    • Complete error message: -

    Hi @andyjpaddle, I am trying to use PaddleOCR to extract raw text from high resolution images (4k) of healthcare documents. The extraction quality is very satisfying, but the runtime it takes to get there is often over 16 seconds, which is out of scope for my intended use of the OCR engine.

    Being images of healthcare documents, there is lots and lots of text, thus downscaling the images did not provide great results thus far, massively increasing the word error rate (WER).

    I assume the issue might be the internal tokenizer of PaddleOCR which generates lots of visual tokens for large images, thus requiring much more time to complete.

    Is there any idea that pops to your mind to mitigate the issue? Ideally, the raw text extraction should take around 5 seconds to enable the completion of further downstream tasks in a reasonable time.

    For context: Python 3.7.15 on a single CPU

    opened by DiTo97 4
Releases(v2.6.0)
  • v2.6.0(Aug 24, 2022)

    Release Note

    • Release PP-Structurev2, with functions and performance fully upgraded, adapted to Chinese scenes, and new support for Layout Recovery and a one-line command to convert PDF to Word;
    • Layout Analysis optimization: model storage reduced by 95%, while speed increased by 11 times, and the average CPU time-cost is only 41ms;
    • Table Recognition optimization: 3 optimization strategies are designed, and the model accuracy is improved by 6% under comparable time consumption;
    • Key Information Extraction optimization: a visual-independent model structure is designed, the accuracy of semantic entity recognition is increased by 2.8%, and the accuracy of relation extraction is increased by 9.1%.
  • v2.5.0(May 9, 2022)

    Release Note

    • Release PP-OCRv3: With comparable speed, the effect of Chinese scene is further improved by 5% compared with PP-OCRv2, the effect of English scene is improved by 11%, and the average recognition accuracy of 80 language multilingual models is improved by more than 5%.
    • Release PPOCRLabelv2: Add the annotation function for table recognition task, key information extraction task and irregular text image.
    • Release the interactive e-book "Dive into OCR", which covers the cutting-edge theory and code practice of OCR full-stack technology.
  • v2.1.1(May 26, 2021)

    Release Note

    1. Newly release model pruning and model quantization tools based on PaddleSlim. Path
    2. Newly release mobile deployment tools based on Paddle-Lite. Path
    3. Newly release Android demo of ppocr system. path
    4. Newly release service deployment based on Paddle Serving. path
  • v2.1.0(Apr 19, 2021)

  • v2.0.0(Feb 8, 2021)

    Release Note

    I. Support the dynamic graph programming paradigm, adapted to Paddle 2.0, including:

    1. Detection algorithms: DB, EAST, SAST
    2. Recognition algorithms: Rosetta, CRNN, RARE, SRN, STAR-Net
    3. PPOCR Chinese models: (1) Detection models: mobile, server (2) Text direction classification models: mobile (3) Recognition models: mobile, server
    4. Multilingual models: (1) English: mobile (2) Japanese, Korean, French, German, etc., 25 languages in total: mobile

    II. The deployment-related work has been adapted as well, including inference (Python, C++), whl, and serving

    III. Release the annotation and synthesis tools:

    1. Release a new data synthesis tool, Style-Text, which makes it easy to synthesize a large number of images similar to the target scene image.
    2. Release a new data annotation tool, PPOCRLabel, which helps improve labeling efficiency. The labeling results can be used directly in training the PP-OCR system.
  • v1.1.0(Sep 27, 2020)

The papers published in top-tier AI conferences in recent years.

AI-conference-papers The papers published in top-tier AI conferences in recent years. Paper table AAAI ICLR CVPR ICML ICCV ECCV NIPS 2019 ✔️ ✔️ ✔️ ✔️

Jinbae Park 6 Dec 09, 2022
Virtualdragdrop - Virtual Drag and Drop Using OpenCV and Arduino

Virtualdragdrop - Virtual Drag and Drop Using OpenCV and Arduino

Rizky Dermawan 4 Mar 10, 2022
A simple project to recognize the captcha used by the game bombcrypto

CaptchaSolver - READ THIS 😓 To start the code: pip install -r requirements.txt python captcha_solver.py If you want to see the result of the

Kawanderson 50 Mar 21, 2022
An implementation of the SegLink algorithm from the paper Detecting Oriented Text in Natural Images by Linking Segments

Tips: A more recent scene text detection algorithm: PixelLink, has been implemented here: https://github.com/ZJULearning/pixel_link Contents: Introduc

dengdan 484 Dec 07, 2022
Repository collecting all the submodules for the new PyTorch-based OCR System.

OCRopus3 is being replaced by OCRopus4, which is a rewrite using PyTorch 1.7; release should be soonish. Please check github.com/tmbdev/ocropus for up

NVIDIA Research Projects 138 Dec 09, 2022
A python screen recorder for low-end computers, provides high quality video output.

RecorderX - v1.0 A screen recorder made in Python with the help of OpenCv, it has ability to record your screen in high quality. No matter what your P

Priyanshu Jindal 4 Nov 10, 2021
Augmenting Anchors by the Detector Itself

Augmenting Anchors by the Detector Itself Introduction It is difficult to determine the scale and aspect ratio of anchors for anchor-based object dete

4 Nov 06, 2022
Repository for playing the computer vision apps: People analytics on Raspberry Pi.

play-with-torch Repository for playing the computer vision apps: People analytics on Raspberry Pi. Tools Tested Hardware RasberryPi 4 Model B here, RA

eMHa 1 Sep 23, 2021
Code for the paper "Controllable Video Captioning with an Exemplar Sentence"

SMCG Code for the paper "Controllable Video Captioning with an Exemplar Sentence" Introduction We investigate a novel and challenging task, namely con

10 Dec 04, 2022
A simple document layout analysis using Python-OpenCV

Run the application: python main.py *Note: For first time running the application, create a folder named "output". The application is a simple documen

Roinand Aguila 109 Dec 12, 2022
Rest API Written In Python To Classify NSFW Images.

✨ NSFW Classifier API ✨ Rest API Written In Python To Classify NSFW Images. Fastest Solution If you don't want to selfhost it, there's already an inst

Akshay Rajput 23 Dec 30, 2022
Fast image augmentation library and easy to use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about library: https://www.mdpi.com/2078-2489/11/2/125

Albumentations Albumentations is a Python library for image augmentation. Image augmentation is used in deep learning and computer vision tasks to inc

11.4k Jan 02, 2023
A selectional auto-encoder approach for document image binarization

The code of this repository was used for the following publication. If you find this code useful please cite our paper: @article{Gallego2019, title =

Javier Gallego 89 Nov 18, 2022
The virtual calculator will be above the live streaming from your camera

The virtual calculator is overlaid on the live stream from my USB camera. The program first detects my hand and, in each frame, calculates the distance between two fingers; if the distance is lower than the

gasbaoui mohammed al amine 5 Jul 01, 2022
This Python script converts a PDF to an image, then uses Tesseract as the OCR engine to convert the image to text

Script_Convertir_PDF_IMG_TXT This Python script converts a PDF into an image, then uses Tesseract as the OCR engine to convert the image to text. p

alebogado 1 Jan 27, 2022
huoyijie 1.2k Dec 29, 2022
Code for the paper "DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks" (ICCV '19)

DewarpNet This repository contains the codes for DewarpNet training. Recent Updates [May, 2020] Added evaluation images and an important note about Ma

354 Jan 01, 2023
FastOCR is a desktop application for OCR API.

FastOCR FastOCR is a desktop application for OCR API. Installation Arch Linux fastocr-git @ AUR Build from AUR or install with your favorite AUR helpe

Bruce Zhang 58 Jan 07, 2023
Detect text blocks and OCR poorly scanned PDFs in bulk. Python module available via pip.

doc2text doc2text extracts higher quality text by fixing common scan errors Developing text corpora can be a massive pain in the butt. Much of the tex

Joe Sutherland 1.3k Jan 04, 2023
Detect and fix skew in images containing text

Alyn Skew detection and correction in images containing text Image with skew Image after deskew Install and use via pip! Recommended way(using virtual

Kakul 230 Dec 21, 2022