Awesome multilingual OCR toolkits based on PaddlePaddle: a practical, ultra-lightweight OCR system that provides data annotation and synthesis tools and supports training and deployment on server, mobile, embedded, and IoT devices.

Overview

English | 简体中文

Introduction

PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and put them into practice.

Notice

PaddleOCR supports both the dynamic graph and static graph programming paradigms:

  • Dynamic graph: dygraph branch (default), supported by paddle 2.0.0 (installation)
  • Static graph: develop branch

Recent updates

  • 2021.2.8 Released PaddleOCR v2.0 (branch release/2.0) and set it as the default branch. See the release notes here: https://github.com/PaddlePaddle/PaddleOCR/releases/tag/v2.0.0
  • 2021.1.21 Updated the models list with more than 25 multilingual recognition models, including English, Chinese, German, French, Japanese, Spanish, Portuguese, Russian, and Arabic. Models for more languages will continue to be added; see the Develop Plan.
  • 2020.12.15 Added a data synthesis tool, i.e., Style-Text, which makes it easy to synthesize a large number of images similar to the target scene image.
  • 2020.11.25 Added a new data annotation tool, i.e., PPOCRLabel, which improves labeling efficiency. The labeling results can be used directly to train the PP-OCR system.
  • 2020.9.22 Updated the PP-OCR technical article: https://arxiv.org/abs/2009.09941
  • more

Features

  • PP-OCR series of high-quality pre-trained models, with results comparable to commercial products
    • Ultra-lightweight ppocr_mobile series models: detection (3.0M) + direction classifier (1.4M) + recognition (5.0M) = 9.4M
    • General ppocr_server series models: detection (47.1M) + direction classifier (1.4M) + recognition (94.9M) = 143.4M
    • Supports Chinese, English, and digit recognition, vertical text recognition, and long text recognition
    • Supports multi-language recognition: Korean, Japanese, German, French
  • Rich toolkits for OCR
    • Semi-automatic data annotation tool, i.e., PPOCRLabel: supports fast and efficient data annotation
    • Data synthesis tool, i.e., Style-Text: makes it easy to synthesize a large number of images similar to the target scene image
  • Supports user-defined training and provides a rich set of inference deployment solutions
  • Supports pip installation and is easy to use
  • Supports Linux, Windows, macOS, and other systems
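The total sizes quoted in the feature list above can be checked with a few lines of arithmetic. This is only a sketch: the component sizes are copied from the list, and the dictionary layout and the `total_size_mb` helper are illustrative names, not part of PaddleOCR.

```python
# Component sizes in MB, copied from the feature list (not measured here).
MODEL_SIZES_MB = {
    "ppocr_mobile": {"detection": 3.0, "direction_classifier": 1.4, "recognition": 5.0},
    "ppocr_server": {"detection": 47.1, "direction_classifier": 1.4, "recognition": 94.9},
}


def total_size_mb(components):
    """Sum the component sizes, rounded to one decimal to hide float noise."""
    return round(sum(components.values()), 1)


for name, parts in MODEL_SIZES_MB.items():
    print(f"{name}: {total_size_mb(parts)}M")
```

The two totals come out as 9.4M and 143.4M, matching the figures in the list.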

Visualization

The pictures above show results from the general ppocr_server model. For more examples, please see More visualizations.

Community

  • Scan the QR code below with WeChat to join the official technical exchange group. We look forward to your participation.

Quick Experience

You can also quickly try the ultra-lightweight OCR: Online Experience

Mobile DEMO experience (based on EasyEdge and Paddle-Lite, supports iOS and Android systems): Sign in to the website to obtain the QR code for installing the App

You can also scan the QR code below to install the App (Android only).

PP-OCR 2.0 series model list (updated on Dec 15)

Note: Compared with the 1.1 models, which were trained with the static graph programming paradigm, the 2.0 models are trained with the dynamic graph and achieve comparable performance.

Model introduction | Model name | Recommended scene | Detection model | Direction classifier | Recognition model
Chinese and English ultra-lightweight OCR model (9.4M) | ch_ppocr_mobile_v2.0_xx | Mobile & server | inference model / pre-trained model | inference model / pre-trained model | inference model / pre-trained model
Chinese and English general OCR model (143.4M) | ch_ppocr_server_v2.0_xx | Server | inference model / pre-trained model | inference model / pre-trained model | inference model / pre-trained model

For more model downloads (including multiple languages), please refer to PP-OCR v2.0 series model downloads.

To request a new language, please refer to Guideline for new language requests.

Tutorials

PP-OCR Pipeline

PP-OCR is a practical ultra-lightweight OCR system. It consists of three main parts: DB text detection [2], detection frame correction, and CRNN text recognition [7]. To optimize and slim down the model of each module, the system adopts 19 effective strategies from 8 aspects, including backbone network selection and adjustment, prediction head design, data augmentation, learning rate transformation strategy, regularization parameter selection, pre-trained model use, and automatic model tailoring and quantization. The final results are an ultra-lightweight Chinese and English OCR model with an overall size of 3.5M and a 2.8M English and digit OCR model. For more details, please refer to the PP-OCR technical article (https://arxiv.org/abs/2009.09941). The implementation of the FPGM Pruner [8] and PACT quantization [9] is based on PaddleSlim.
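To make the three-stage structure concrete, here is a minimal sketch of how detection, frame correction, and recognition could be chained. It assumes the correction step is the direction classifier listed under Features; every name below (`run_pipeline`, `OcrResult`, the `fake_*` stages, the box format) is hypothetical and stands in for real models, so this is not PaddleOCR's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # hypothetical (x, y, width, height) text box


@dataclass
class OcrResult:
    box: Box
    angle: int  # rotation chosen by the (assumed) direction classifier
    text: str


def run_pipeline(image, detect: Callable, classify: Callable, recognize: Callable) -> List[OcrResult]:
    """Chain the three stages: detect boxes, correct each box, then read its text."""
    results = []
    for box in detect(image):
        angle = classify(image, box)          # stage 2: frame correction
        text = recognize(image, box, angle)   # stage 3: text recognition
        results.append(OcrResult(box, angle, text))
    return results


# Stand-in stages so the sketch runs without any model files.
def fake_detect(image):
    return [(0, 0, 40, 10), (0, 20, 60, 10)]


def fake_classify(image, box):
    return 180 if box[1] else 0


def fake_recognize(image, box, angle):
    return f"text@{box[1]}"


results = run_pipeline(None, fake_detect, fake_classify, fake_recognize)
for r in results:
    print(r)
```

Passing the stages in as callables keeps each module replaceable, which mirrors how the mobile and server model sets swap detection and recognition models independently.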

Visualization more

  • Chinese OCR model
  • English OCR model
  • Multilingual OCR model

Guideline for new language requests

To request support for a new language, submit a PR with the following two files:

  1. In the folder ppocr/utils/dict, submit a dictionary file named {language}_dict.txt that contains a list of all characters. See the other files in that folder for the format.

  2. In the folder ppocr/utils/corpus, submit a corpus file named {language}_corpus.txt that contains a list of words in your language. At least 50,000 words per language are needed; the more, the better.
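As an illustration of the two files described above, the following sketch derives a character dictionary from a word list and writes both files. It assumes one entry per line in each file, which should be checked against the existing files in ppocr/utils/dict; the helper names and the tiny demo corpus are hypothetical.

```python
import tempfile
from pathlib import Path

MIN_WORDS = 50000  # lower bound on corpus size mentioned in the guideline


def build_dict(words):
    """Collect every distinct character used in the corpus, sorted for stable diffs."""
    return sorted({ch for word in words for ch in word})


def write_language_files(language, words, dict_dir, corpus_dir):
    """Write {language}_dict.txt and {language}_corpus.txt, one entry per line (assumed format)."""
    dict_path = Path(dict_dir) / f"{language}_dict.txt"
    corpus_path = Path(corpus_dir) / f"{language}_corpus.txt"
    dict_path.write_text("\n".join(build_dict(words)) + "\n", encoding="utf-8")
    corpus_path.write_text("\n".join(words) + "\n", encoding="utf-8")
    return dict_path, corpus_path


# Tiny demo corpus written to a temporary directory instead of the real repo folders.
words = ["ab", "ba", "cab"]
with tempfile.TemporaryDirectory() as tmp:
    dict_path, corpus_path = write_language_files("demo", words, tmp, tmp)
    dict_chars = dict_path.read_text(encoding="utf-8").split()
    corpus_name = corpus_path.name

print(dict_chars)
print(len(words) >= MIN_WORDS)  # a real corpus should make this True
```

In a real submission the two directories would be ppocr/utils/dict and ppocr/utils/corpus, and the word list would come from your own corpus.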

If your language has unique elements, please let me know in advance in any way you like, for example with useful links or Wikipedia pages.

For more details, please refer to the Multilingual OCR Development Plan.

License

This project is released under the Apache 2.0 license.

Contribution

We welcome all contributions to PaddleOCR and greatly appreciate your feedback.

  • Many thanks to Khanh Tran and Karl Horky for contributing and revising the English documentation.
  • Many thanks to zhangxin for contributing the new visualize function, adding .gitignore, and removing the need to set PYTHONPATH manually.
  • Many thanks to lyl120117 for contributing the code for printing the network structure.
  • Thanks to xiangyubo for contributing the handwritten Chinese OCR datasets.
  • Thanks to authorfu for contributing the Android demo and to xiadeye for contributing the iOS demo.
  • Thanks to BeyondYourself for contributing many great suggestions and simplifying part of the code style.
  • Thanks to tangmq for contributing Dockerized deployment services to PaddleOCR and supporting the rapid release of callable RESTful API services.
  • Thanks to lijinhan for contributing a new way, i.e., Java Spring Boot, to send requests to the Hubserving deployment.
  • Thanks to Mejans for contributing the Occitan corpus and character set.
  • Thanks to LKKlein for contributing a new deployment package written in Go.
  • Thanks to Evezerest, ninetailskim, edencfc, BeyondYourself and 1084667371 for contributing a new data annotation tool, i.e., PPOCRLabel.
Comments
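  • Text recognition results are incorrect after converting to an inference model!

    After training, prediction with infer_rec.py gives correct results, but the model exported with export_model.py gives wrong results. The extra arguments I passed to export_model.py: -c output/rec_chinese_lite_v2.0/config.yml -o Global.pretrained_model=output/rec_chinese_lite_v2.0/best_accuracy Global.save_inference_dir=./save_inference_dir The arguments I used for predict_rec.py: --image_dir="2.bmp" --rec_model_dir="save_inference_dir" --rec_char_dict_path="train_data/labels.txt" I have also uploaded my training files and a sample image.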

    output.zip

    status/close 
    opened by xinyujituan 47
  • Effect of cpu_math_library_num_threads_ and use_mkldnn_ on computation speed in C++ on Windows

    Following the tutorial, I compiled ocr_system.exe on Windows (with the MKL math library). Testing with the same image gives the following results:

    1. With cpu_math_library_num_threads_=10 in both cases, enabling use_mkldnn takes 1.85s and disabling it takes 1.6s.
    2. With use_mkldnn disabled, cpu_math_library_num_threads_=0 takes 1.4s and cpu_math_library_num_threads_=12 takes 1.9s.

    CPU: Intel 8700 (6 cores, 12 threads)

    Both results are the opposite of what I expected, which is puzzling. Is single-threaded execution the fastest?
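Single-run timings like the ones above can vary from run to run, so comparisons are usually easier to trust when each setting is measured several times. The following is a minimal sketch, in Python for illustration only (the issue itself concerns the C++ ocr_system.exe), of timing a call with warm-up runs and reporting the median. `benchmark` and `fake_inference` are hypothetical names, and `fake_inference` is just a stand-in workload, not PaddleOCR code.

```python
import statistics
import time


def benchmark(fn, warmup=2, repeats=10):
    """Run fn a few times untimed, then return (median, best, worst) wall time in seconds."""
    for _ in range(warmup):
        fn()  # warm-up runs absorb one-time costs such as lazy initialization
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings), min(timings), max(timings)


def fake_inference():
    """Stand-in workload; replace with the call being measured."""
    sum(i * i for i in range(10000))


median_s, best_s, worst_s = benchmark(fake_inference)
print(f"median={median_s:.6f}s best={best_s:.6f}s worst={worst_s:.6f}s")
```

Reporting the median together with the best and worst run shows how much of a difference between two settings is within normal run-to-run variation.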

    opened by qq61786631 46
  • PaddleOCR 2.5: use_tensorrt=True does not work

    Please provide the following information to quickly locate the problem

    • System Environment: Ubuntu 18.04, CUDA 10.2, cuDNN 8, Python 3.7, TensorRT 7.2.3.4
    • paddle2onnx 0.5, paddlehub 1.8.3, paddleocr 2.5.0.3, paddlepaddle-gpu 2.3.0, paddleslim 1.1.1, paddlex 1.3.7
    • Version: Paddle: 2.3.0, PaddleOCR: 2.5.0.3. Related components: tensorrt
    • Command Code: --use_tensorrt==true
    • 完整报错/Complete Error Message: [2022/06/21 13:07:18] ppocr DEBUG: Namespace(alpha=1.0, benchmark=False, beta=1.0, cls_batch_num=6, cls_image_shape='3, 48, 192', cls_model_dir='/home/ocr_model/cls_infer', cls_thresh=0.9, cpu_threads=10, crop_res_save_dir='./output', det=True, det_algorithm='DB', det_db_box_thresh=0.6, det_db_score_mode='fast', det_db_thresh=0.3, det_db_unclip_ratio=1.5, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_fce_box_type='poly', det_limit_side_len=960, det_limit_type='max', det_model_dir='/home/ocr_model/v3_det_infer/ch_PP-OCRv3_det_infer', det_pse_box_thresh=0.85, det_pse_box_type='quad', det_pse_min_area=16, det_pse_scale=1, det_pse_thresh=0, det_sast_nms_thresh=0.2, det_sast_polygon=False, det_sast_score_thresh=0.5, draw_img_save_dir='./inference_results', drop_score=0.5, e2e_algorithm='PGNet', e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_limit_side_len=768, e2e_limit_type='max', e2e_model_dir=None, e2e_pgnet_mode='fast', e2e_pgnet_score_thresh=0.5, e2e_pgnet_valid_set='totaltext', enable_mkldnn=False, fourier_degree=5, gpu_mem=500, help='==SUPPRESS==', image_dir=None, ir_optim=True, label_list=['0', '180'], lang='ch', layout=True, layout_label_map=None, layout_path_model='lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config', max_batch_size=10, max_text_length=25, min_subgraph_size=15, mode='structure', ocr=True, ocr_version='PP-OCRv3', output='./output', precision='fp32', process_id=0, rec=True, rec_algorithm='SVTR_LCNet', rec_batch_num=6, rec_char_dict_path='/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/ppocr/utils/ppocr_keys_v1.txt', rec_image_shape='3, 48, 320', rec_model_dir='/home/ocr_model/v3_rec_infer/ch_PP-OCRv3_rec_infer', save_crop_res=False, save_log_path='./log_output/', scales=[8, 16, 32], show_log=True, structure_version='PP-STRUCTURE', table=True, table_char_dict_path=None, table_max_len=488, table_model_dir=None, total_process_num=1, type='ocr', 
use_angle_cls=False, use_dilation=False, use_gpu=True, use_mp=False, use_onnx=False, use_pdserving=False, use_space_char=True, use_tensorrt=True, vis_font_path='./doc/fonts/simfang.ttf', warmup=False) W0621 13:07:21.006245 894 analysis_predictor.cc:1086] The one-time configuration of analysis predictor failed, which may be due to native predictor called first and its configurations taken effect. I0621 13:07:21.030115 894 analysis_predictor.cc:854] TensorRT subgraph engine is enabled --- Running analysis [ir_graph_build_pass] --- Running analysis [ir_graph_clean_pass] --- Running analysis [ir_analysis_pass] --- Running IR pass [adaptive_pool2d_convert_global_pass] I0621 13:07:21.065781 894 fuse_pass_base.cc:57] --- detected 10 subgraphs --- Running IR pass [shuffle_channel_detect_pass] --- Running IR pass [quant_conv2d_dequant_fuse_pass] --- Running IR pass [delete_quant_dequant_op_pass] --- Running IR pass [delete_quant_dequant_filter_op_pass] --- Running IR pass [delete_weight_dequant_linear_op_pass] --- Running IR pass [delete_quant_dequant_linear_op_pass] --- Running IR pass [add_support_int8_pass] I0621 13:07:21.130373 894 fuse_pass_base.cc:57] --- detected 185 subgraphs --- Running IR pass [simplify_with_basic_ops_pass] --- Running IR pass [embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [preln_embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [multihead_matmul_fuse_pass_v2] --- Running IR pass [multihead_matmul_fuse_pass_v3] --- Running IR pass [skip_layernorm_fuse_pass] --- Running IR pass [preln_skip_layernorm_fuse_pass] --- Running IR pass [conv_bn_fuse_pass] I0621 13:07:21.150663 894 fuse_pass_base.cc:57] --- detected 24 subgraphs --- Running IR pass [unsqueeze2_eltwise_fuse_pass] --- Running IR pass [trt_squeeze2_matmul_fuse_pass] --- Running IR pass [trt_reshape2_matmul_fuse_pass] --- Running IR pass [trt_flatten2_matmul_fuse_pass] --- Running IR pass [trt_map_matmul_v2_to_mul_pass] --- Running IR pass 
[trt_map_matmul_v2_to_matmul_pass] --- Running IR pass [trt_map_matmul_to_mul_pass] I0621 13:07:21.155673 894 fuse_pass_base.cc:57] --- detected 1 subgraphs --- Running IR pass [fc_fuse_pass] I0621 13:07:21.157024 894 fuse_pass_base.cc:57] --- detected 1 subgraphs --- Running IR pass [conv_elementwise_add_fuse_pass] I0621 13:07:21.167703 894 fuse_pass_base.cc:57] --- detected 42 subgraphs --- Running IR pass [tensorrt_subgraph_pass] I0621 13:07:21.179683 894 tensorrt_subgraph_pass.cc:141] --- detect a sub-graph with 133 nodes I0621 13:07:21.196204 894 tensorrt_subgraph_pass.cc:403] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time. I0621 13:07:21.737630 894 engine.cc:203] Run Paddle-TRT Dynamic Shape mode. I0621 13:07:42.235085 894 engine.cc:424] Inspector needs TensorRT version 8.2 and after. --- Running IR pass [conv_bn_fuse_pass] --- Running IR pass [conv_elementwise_add_act_fuse_pass] --- Running IR pass [conv_elementwise_add2_act_fuse_pass] --- Running IR pass [transpose_flatten_concat_fuse_pass] --- Running analysis [ir_params_sync_among_devices_pass] I0621 13:07:42.255204 894 ir_params_sync_among_devices_pass.cc:100] Sync params from CPU to GPU --- Running analysis [adjust_cudnn_workspace_size_pass] --- Running analysis [inference_op_replace_pass] --- Running analysis [memory_optimize_pass] I0621 13:07:42.259418 894 memory_optimize_pass.cc:216] Cluster name : shape_1.tmp_0_slice_0 size: 4 I0621 13:07:42.259449 894 memory_optimize_pass.cc:216] Cluster name : shape_0.tmp_0 size: 16 I0621 13:07:42.259457 894 memory_optimize_pass.cc:216] Cluster name : reshape2_0.tmp_1 size: 0 I0621 13:07:42.259477 894 memory_optimize_pass.cc:216] Cluster name : linear_1.tmp_1 size: 8 --- Running analysis [ir_graph_to_program_pass] I0621 13:07:42.308853 894 analysis_predictor.cc:1007] ======= optimize end ======= I0621 13:07:42.312048 894 naive_executor.cc:102] --- skip [feed], feed -> x I0621 13:07:42.312656 894 
naive_executor.cc:102] --- skip [save_infer_model/scale_0.tmp_1], fetch -> fetch I0621 13:07:42.422236 894 analysis_predictor.cc:854] TensorRT subgraph engine is enabled --- Running analysis [ir_graph_build_pass] --- Running analysis [ir_graph_clean_pass] --- Running analysis [ir_analysis_pass] --- Running IR pass [adaptive_pool2d_convert_global_pass] I0621 13:07:42.465929 894 fuse_pass_base.cc:57] --- detected 2 subgraphs --- Running IR pass [shuffle_channel_detect_pass] --- Running IR pass [quant_conv2d_dequant_fuse_pass] --- Running IR pass [delete_quant_dequant_op_pass] --- Running IR pass [delete_quant_dequant_filter_op_pass] --- Running IR pass [delete_weight_dequant_linear_op_pass] --- Running IR pass [delete_quant_dequant_linear_op_pass] --- Running IR pass [add_support_int8_pass] I0621 13:07:42.529808 894 fuse_pass_base.cc:57] --- detected 184 subgraphs --- Running IR pass [simplify_with_basic_ops_pass] --- Running IR pass [embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [preln_embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [multihead_matmul_fuse_pass_v2] --- Running IR pass [multihead_matmul_fuse_pass_v3] --- Running IR pass [skip_layernorm_fuse_pass] I0621 13:07:42.538357 894 fuse_pass_base.cc:57] --- detected 1 subgraphs --- Running IR pass [preln_skip_layernorm_fuse_pass] --- Running IR pass [conv_bn_fuse_pass] I0621 13:07:42.548557 894 fuse_pass_base.cc:57] --- detected 19 subgraphs --- Running IR pass [unsqueeze2_eltwise_fuse_pass] --- Running IR pass [trt_squeeze2_matmul_fuse_pass] --- Running IR pass [trt_reshape2_matmul_fuse_pass] --- Running IR pass [trt_flatten2_matmul_fuse_pass] --- Running IR pass [trt_map_matmul_v2_to_mul_pass] I0621 13:07:42.553280 894 fuse_pass_base.cc:57] --- detected 9 subgraphs --- Running IR pass [trt_map_matmul_v2_to_matmul_pass] I0621 13:07:42.554414 894 fuse_pass_base.cc:57] --- detected 4 subgraphs --- Running IR pass [trt_map_matmul_to_mul_pass] --- Running IR pass [fc_fuse_pass] I0621 
13:07:42.557754 894 fuse_pass_base.cc:57] --- detected 9 subgraphs --- Running IR pass [conv_elementwise_add_fuse_pass] I0621 13:07:42.563249 894 fuse_pass_base.cc:57] --- detected 23 subgraphs --- Running IR pass [tensorrt_subgraph_pass] I0621 13:07:42.572366 894 tensorrt_subgraph_pass.cc:141] --- detect a sub-graph with 57 nodes I0621 13:07:42.577783 894 tensorrt_subgraph_pass.cc:403] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time. I0621 13:07:42.580443 894 op_converter.h:253] trt input [pool2d_5.tmp_0_clone_0] dynamic shape info not set, please check and retry. Traceback (most recent call last): File "3.py", line 18, in ocr_version='PP-OCRv3', use_angle_cls=False, use_tensorrt=True, lang='ch') File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/paddleocr.py", line 437, in init super().init(params) File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/predict_system.py", line 47, in init self.text_recognizer = predict_rec.TextRecognizer(args) File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/predict_rec.py", line 74, in init utility.create_predictor(args, 'rec', logger) File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/utility.py", line 313, in create_predictor predictor = inference.create_predictor(config) ValueError: (InvalidArgument) some trt inputs dynamic shape info not set, check the INFO log above for more details. [Hint: Expected all_dynamic_shape_set == true, but received all_dynamic_shape_set:0 != true:1.] (at /paddle/paddle/fluid/inference/tensorrt/convert/op_converter.h:287)

    Code:

    import os
    import sys
    import time

    import cv2

    sys.path.insert(0, '/usr/local/python3.7/lib/python3.7/site-packages/paddleocr')

    from paddleocr import PaddleOCR

    PKG_PATTERN = r'PKG.*:'

    root_path = os.path.join('/home', 'ocr_model')
    cls_model_dir = os.path.join(root_path, 'cls_infer')
    det_model_dir = os.path.join(root_path, 'v3_det_infer/ch_PP-OCRv3_det_infer')
    rec_model_dir = os.path.join(root_path, 'v3_rec_infer/ch_PP-OCRv3_rec_infer')
    addleOCR = PaddleOCR(cls_model_dir=cls_model_dir, det_model_dir=det_model_dir, rec_model_dir=rec_model_dir,
                         ocr_version='PP-OCRv3', use_angle_cls=False, use_tensorrt=True, lang='ch')

    frame = cv2.imread('/home/095.png')
    print(frame.shape)
    while True:
        s1 = time.time()
        result = addleOCR.ocr(frame, cls=False)
        print('exec time:' + str(time.time() - s1))
        print(result)

    status/close 
    opened by shihaitao118 41
  • Following the docs, calling the paddleocr package from Python fails with FatalError: `Process abort signal` is detected by the operating system. Any help is appreciated

    https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.0/doc/doc_en/whl_en.md

    CentOS, Python 3.6, paddleocr 2.0.2, paddlepaddle 2.0.0rc1

    from paddleocr import PaddleOCR,draw_ocr
    
    # Paddleocr supports Chinese, English, French, German, Korean and Japanese.
    # You can set the parameter `lang` as `ch`, `en`, `french`, `german`, `korean`, `japan`
    # to switch the language model in order.
    ocr = PaddleOCR(use_angle_cls=True,use_gpu=False, lang='ch') # need to run only once to download and load model into memory
    img_path = './tmp.jpg'
    result = ocr.ocr(img_path, cls=True)
    for line in result:
        print(line)
    

    Error message:

    --------------------------------------
    C++ Traceback (most recent call last):
    --------------------------------------
    0   paddle::AnalysisPredictor::Run(std::vector<paddle::PaddleTensor, std::allocator<paddle::PaddleTensor> > const&, std::vector<paddle::PaddleTensor, std::allocator<paddle::PaddleTensor> >*, int)
    1   paddle::framework::NaiveExecutor::Run()
    2   paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
    3   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
    4   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
    5   std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CPUPlace, false, 0ul, paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, float>, paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, double> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
    6   paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const
    7   cblas_sgemm
    8   sgemm
    9   mkl_blas_sgemm
    10  mkl_serv_get_num_stripes
    11  omp_get_num_procs
    12  paddle::framework::SignalHandle(char const*, int)
    13  paddle::platform::GetCurrentTraceBackString()
    
    ----------------------
    Error Message Summary:
    ----------------------
    FatalError: `Process abort signal` is detected by the operating system.
      [TimeInfo: *** Aborted at 1613980216 (unix time) try "date -d @1613980216" if you are using GNU date ***]
      [SignalInfo: *** SIGABRT (@0x3f1000139a6) received by PID 80294 (TID 0x7f5e47933740) from PID 80294 ***]
    

    The same error still occurs with paddlepaddle 2.0.0.
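The TimeInfo line in the error summary suggests running `date -d @1613980216` to decode the abort time. Where GNU date is not available, the same conversion can be sketched with the Python standard library; `abort_time_utc` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def abort_time_utc(unix_seconds):
    """Convert the unix time from the TimeInfo line into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


print(abort_time_utc(1613980216).isoformat())  # 2021-02-22T07:50:16+00:00
```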

    opened by suparek 39
  • paddlepaddle-gpu 2.0.0rc1 reports FatalError: `Segmentation fault` is detected by the operating system.

    Using the latest PaddleOCR from git, running python tools/infer/predict_system.py fails with the following error:


    C++ Traceback (most recent call last):

    0 paddle::framework::SignalHandle(char const*, int) 1 paddle::platform::GetCurrentTraceBackString()


    Error Message Summary:

    FatalError: Segmentation fault is detected by the operating system. [TimeInfo: *** Aborted at 1609724467 (unix time) try "date -d @1609724467" if you are using GNU date ***] [SignalInfo: *** SIGSEGV (@0x0) received by PID 127353 (TID 0x7f4aa7f1d700) from PID 0 ***]

    Segmentation fault (core dumped)

    Output of paddle.utils.run_check():

    import paddle
    paddle.utils.run_check()
    Running verify PaddlePaddle program ...
    W0104 09:50:08.441300 127586 device_context.cc:320] Please NOTE: device: 0, GPU Compute Capability: 6.1, Driver API Version: 10.2, Runtime API Version: 10.0
    W0104 09:50:08.444324 127586 device_context.cc:330] device: 0, cuDNN Version: 8.0.
    PaddlePaddle works well on 1 GPU.
    W0104 09:50:10.058878 127586 parallel_executor.cc:491] Cannot enable P2P access from 0 to 2
    W0104 09:50:10.058951 127586 parallel_executor.cc:491] Cannot enable P2P access from 0 to 3
    W0104 09:50:10.799384 127586 parallel_executor.cc:491] Cannot enable P2P access from 1 to 2
    W0104 09:50:10.799430 127586 parallel_executor.cc:491] Cannot enable P2P access from 1 to 3
    W0104 09:50:10.799440 127586 parallel_executor.cc:491] Cannot enable P2P access from 2 to 0
    W0104 09:50:10.799450 127586 parallel_executor.cc:491] Cannot enable P2P access from 2 to 1
    W0104 09:50:11.883519 127586 parallel_executor.cc:491] Cannot enable P2P access from 3 to 0
    W0104 09:50:11.883584 127586 parallel_executor.cc:491] Cannot enable P2P access from 3 to 1
    W0104 09:50:15.108191 127586 fuse_all_reduce_op_pass.cc:75] Find all_reduce operators: 2. To make the speed faster, some all_reduce ops are fused during training, after fusion, the number of all_reduce ops is 2.
    PaddlePaddle works well on 4 GPUs.
    PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.

    Environment: Python 3.8.5; the same error also occurs with Python 3.7.

    Package Version


    alabaster 0.7.12 anaconda-client 1.7.2 anaconda-navigator 1.9.12 anaconda-project 0.8.3 appdirs 1.4.4 asn1crypto 1.4.0 astor 0.8.1 astroid 2.4.2 astropy 4.0.1.post1 atomicwrites 1.4.0 attrs 20.1.0 Babel 2.8.0 backcall 0.2.0 backports.functools-lru-cache 1.6.1 backports.shutil-get-terminal-size 1.0.0 backports.tempfile 1.0 backports.weakref 1.0.post1 bce-python-sdk 0.8.53 beautifulsoup4 4.9.1 bitarray 1.5.3 bkcharts 0.2 bokeh 2.2.1 boto 2.49.0 Bottleneck 1.3.2 brotlipy 0.7.0 certifi 2020.6.20 cffi 1.14.2 cfgv 3.2.0 chardet 3.0.4 cliapp 1.0.9 click 7.1.2 cloudpickle 1.6.0 clyent 1.2.2 colorama 0.4.3 conda 4.8.4 conda-build 3.20.2 conda-package-handling 1.7.0 conda-verify 3.4.2 contextlib2 0.6.0.post1 cryptography 3.1 cycler 0.10.0 Cython 0.29.21 cytoolz 0.10.1 dask 2.25.0 datashape 0.5.4 decorator 4.4.2 distlib 0.3.1 distributed 2.25.0 docutils 0.16 entrypoints 0.3 et-xmlfile 1.0.1 fastcache 1.1.0 filelock 3.0.12 flake8 3.8.4 Flask 1.1.2 Flask-Babel 2.0.0 Flask-Cors 3.0.9 fsspec 0.8.0 future 0.18.2 gast 0.3.3 gevent 20.6.2 glob2 0.7 gmpy2 2.0.8 greenlet 0.4.16 h5py 2.10.0 HeapDict 1.0.1 html5lib 1.1 hypothesis 5.29.0 identify 1.5.10 idna 2.10 imageio 2.9.0 imagesize 1.2.0 imgaug 0.4.0 importlib-metadata 1.7.0 ipykernel 5.3.4 ipython 7.18.1 ipython-genutils 0.2.0 isort 5.4.2 itsdangerous 1.1.0 jdcal 1.4.1 jedi 0.17.2 Jinja2 2.11.2 joblib 0.16.0 jsonschema 3.2.0 jupyter-client 6.1.6 jupyter-console 6.2.0 jupyter-core 4.6.3 kiwisolver 1.2.0 lazy-object-proxy 1.4.3 libarchive-c 2.9 llvmlite 0.34.0 lmdb 1.0.0 locket 0.2.0 lxml 4.5.2 MarkupSafe 1.1.1 matplotlib 3.3.1 mccabe 0.6.1 mistune 0.8.4 mkl-fft 1.1.0 mkl-random 1.1.1 mkl-service 2.3.0 mock 4.0.2 more-itertools 8.5.0 mpmath 1.1.0 msgpack 1.0.0 multipledispatch 0.6.0 navigator-updater 0.2.1 nbformat 5.0.7 networkx 2.5 nltk 3.5 nodeenv 1.5.0 nose 1.3.7 numba 0.51.2 numexpr 2.7.1 numpy 1.19.1 numpydoc 1.1.0 odo 0.5.1 olefile 0.46 opencv-python 4.2.0.32 openpyxl 3.0.5 packaging 20.4 paddlepaddle-gpu 2.0.0rc1.post100 
pandas 1.1.1 pandocfilters 1.4.2 parso 0.7.0 partd 1.1.0 path 15.0.0 pathlib2 2.3.5 patsy 0.5.1 pep8 1.7.1 pexpect 4.8.0 pickleshare 0.7.5 Pillow 7.2.0 pip 20.2.2 pkginfo 1.5.0.1 pluggy 0.13.1 ply 3.11 pre-commit 2.9.3 prompt-toolkit 3.0.7 protobuf 3.14.0 psutil 5.7.2 ptyprocess 0.6.0 py 1.9.0 pyclipper 1.2.1 pycodestyle 2.6.0 pycosat 0.6.3 pycparser 2.20 pycrypto 2.6.1 pycryptodome 3.9.9 pycurl 7.43.0.5 pyflakes 2.2.0 Pygments 2.6.1 pylint 2.6.0 pyodbc 4.0.0-unsupported pyOpenSSL 19.1.0 pyparsing 2.4.7 pyrsistent 0.16.0 PySocks 1.7.1 pytest 5.0.0 pytest-arraydiff 0.2 pytest-astropy 0.8.0 pytest-astropy-header 0.1.2 pytest-doctestplus 0.8.0 pytest-openfiles 0.5.0 pytest-remotedata 0.3.2 python-dateutil 2.8.1 python-Levenshtein 0.12.0 pytz 2020.1 PyWavelets 1.1.1 PyYAML 5.3.1 pyzmq 18.1.1 QtAwesome 0.7.2 qtconsole 4.7.6 QtPy 1.9.0 regex 2020.7.14 requests 2.24.0 rope 0.17.0 ruamel-yaml 0.15.87 scikit-image 0.16.2 scikit-learn 0.23.2 scipy 1.5.2 seaborn 0.10.1 Send2Trash 1.5.0 setuptools 49.6.0.post20200814 Shapely 1.7.1 simplegeneric 0.8.1 singledispatch 3.4.0.3 sip 4.19.13 six 1.15.0 snowballstemmer 2.0.0 sortedcollections 1.2.1 sortedcontainers 2.2.2 soupsieve 2.0.1 Sphinx 3.2.1 sphinxcontrib-applehelp 1.0.2 sphinxcontrib-devhelp 1.0.2 sphinxcontrib-htmlhelp 1.0.3 sphinxcontrib-jsmath 1.0.1 sphinxcontrib-qthelp 1.0.3 sphinxcontrib-serializinghtml 1.1.4 sphinxcontrib-websupport 1.2.4 SQLAlchemy 1.3.19 statsmodels 0.11.1 sympy 1.5.1 tables 3.6.1 tblib 1.7.0 terminado 0.8.3 testpath 0.4.4 threadpoolctl 2.1.0 toml 0.10.1 toolz 0.10.0 tornado 6.0.4 tqdm 4.48.2 traitlets 4.3.3 typing-extensions 3.7.4.3 unicodecsv 0.14.1 urllib3 1.25.10 virtualenv 20.2.2 visualdl 2.1.0 wcwidth 0.2.5 webencodings 0.5.1 Werkzeug 1.0.1 wheel 0.35.1 wrapt 1.11.2 xlrd 1.2.0 XlsxWriter 1.3.3 xlwt 1.3.0 xmltodict 0.12.0 zict 2.0.0 zipp 3.1.0 zope.event 4.4 zope.interface 5.1.0

    With the previous version (1.8.5 installed), the test runs without problems.

    documentation 
    opened by xiulianzw 34
  • The direction classifier is accurate when validated in Python, but its output is wrong after converting to an inference model and deploying in C++

    Please provide the following information to quickly locate the problem

    • System Environment:
    • Version: Paddle: PaddleOCR: Related components:
    • Command Code:
    • Complete Error Message:

    I added two angles, 90° and 270°, and trained the model. Validation in Python is correct and the model outputs the 90° and 270° results, but in the official C++ deployment sample the accuracy of the exported inference model drops. Why is the direction classifier less accurate after export in C++ than the trained model in Python? Attached images: 1666144706607, 1666144229641

    status/close 
    opened by Camphora7 32
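  • The trained model and the inference model give inconsistent results

    With PaddleOCR release 2.0, I trained a license plate detection model based on det_mv3_db.yml. Testing the trained model directly with infer_det.py works very well. I then converted the best_accuracy model to an inference model with export_model.py (using the training configuration config.yml) and ran prediction with predict_det.py. The results are worse than before: the detection boxes are not tight.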

    F6726907-C310-4b5d-8CEF-EFCC44B193BC

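    Testing with the official ch_ppocr_mobile_v2.0_det_train, the results before and after conversion are also inconsistent.

    How can I make sure predict_det.py gives the same results as infer_det.py?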

    opened by simplew2011 30
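  • Memory overflow problem!

    When training the DB text detection network, I often run into memory overflow, as shown below: (image: aaa) The contents of the config file det_r50_vd_db.yml are: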

    Global:
      algorithm: DB
      use_gpu: true
      epoch_num: 1200
      log_smooth_window: 20
      print_batch_step: 30
      save_model_dir: ./output/det_db/
      save_epoch_step: 200
      eval_batch_step: 10000
      train_batch_size_per_card: 2
      test_batch_size_per_card: 1
      image_shape: [3, 640, 640]
      reader_yml: ./configs/det/det_db_chinese_reader.yml
      pretrain_weights: ./pretrain_models/ResNet50_vd_ssld_pretrained/
      save_res_path: ./output/det_db/predicts_db.txt
      checkpoints: 
      save_inference_dir:
    

    The contents of the config file det_db_chinese_reader.yml are:

    TrainReader:
      reader_function: ppocr.data.det.dataset_traversal,TrainReader
      process_function: ppocr.data.det.db_process,DBProcessTrain
      num_workers: 4
      img_set_dir: ""
      label_file_path: /home/aistudio/data/data39969/mtwi_2018_split/train.txt
    
    EvalReader:
      reader_function: ppocr.data.det.dataset_traversal,EvalTestReader
      process_function: ppocr.data.det.db_process,DBProcessTest
      img_set_dir: ""
      label_file_path: /home/aistudio/data/data39969/mtwi_2018_split/test.txt
      test_image_shape: [736, 1280]
      
    TestReader:
      reader_function: ppocr.data.det.dataset_traversal,EvalTestReader
      process_function: ppocr.data.det.db_process,DBProcessTest
      infer_img:
      img_set_dir: ""
      label_file_path: /home/aistudio/data/data39969/icpr_mtwi_task2/test.txt
      test_image_shape: [736, 1280]
      do_eval: True
    

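    The training dataset comes from https://tianchi.aliyun.com/competition/entrance/231685/information. I split the data manually, with a 9:1 ratio of training to validation (9043:1005). I have tried batch sizes from 2 to 16 and always run into memory overflow. With num_workers=1 training works, but the iterations are too slow. Is there a good solution?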
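For reference, a 9:1 split of a label file like the one described can be sketched as follows. The 10048 placeholder lines simply reproduce the 9043:1005 counts quoted above; `split_labels`, the seed, and the placeholder line format are assumptions for illustration, not the poster's actual script.

```python
import random


def split_labels(lines, train_ratio=0.9, seed=0):
    """Shuffle label lines reproducibly and cut them into train and validation parts."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Placeholder label lines; real lines would come from the dataset's annotation file.
lines = [f"img_{i}.jpg\t[]" for i in range(10048)]
train, val = split_labels(lines)
print(len(train), len(val))  # 9043 1005
```

Using a fixed seed keeps the split stable across runs, so train.txt and test.txt do not change each time the script is rerun.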

    opened by NextGuido 30
  • Why am I getting 0.00 accuracy when training SVTRNet?

    I was trying to train an SVTRNet model for Bangla. Here is the config file that I am using: https://pastecode.io/s/4czzqoix

    /backup2/synthtiger/bangla/PaddleOCR/ppocr/utils/bn_char_synth.txt contains characters such as: } ~ । ঁ ং ঃ অ আ ই etc.

    /backup2/synthtiger/bangla/PaddleOCR/train_data/: inside the train_data folder I have folders named 0, 1, 2, 3, etc., each containing 10k images. ['/backup2/synthtiger/bangla/PaddleOCR/train_data/gt.txt']: gt.txt contains the annotations of all the images inside the train_data folder.
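One thing that may be worth ruling out with a setup like this (a suggestion, not a diagnosis of this issue) is whether every character in the annotations appears in the character file. The sketch below assumes each annotation line has the form `image_path<TAB>text` and that the character file lists one character per line; both formats and the helper names are assumptions to verify against the actual files.

```python
from collections import Counter


def load_charset(dict_lines):
    """Build the set of known characters from dictionary lines (one character per line, assumed)."""
    return {line.rstrip("\n") for line in dict_lines if line.rstrip("\n")}


def uncovered_characters(label_lines, charset, sep="\t"):
    """Count label characters that are missing from the dictionary."""
    missing = Counter()
    for line in label_lines:
        line = line.rstrip("\n")
        if sep not in line:
            continue  # skip lines that do not match the assumed path<TAB>text layout
        _, text = line.split(sep, 1)
        # Note: spaces are reported too, since the dictionary may handle them separately.
        missing.update(ch for ch in text if ch not in charset)
    return dict(missing)


# Small in-memory example instead of the real bn_char_synth.txt and gt.txt.
charset = load_charset(["অ\n", "আ\n", "ই\n"])
labels = ["0/1.jpg\tঅআ\n", "0/2.jpg\tআক\n"]
print(uncovered_characters(labels, charset))
```

An empty result means every label character is covered; any reported characters would need to be added to the character file or removed from the labels.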

    The same applies to the validation dataset. When I try to train, I get acc: 0.00, like this:

    (mobassir) [email protected]:/backup2/synthtiger/bangla/PaddleOCR$ python3 tools/train.py -c configs/rec/rec_svtrnet.yml
    /home/apsisdev/.local/lib/python3.8/site-packages/scipy/fft/__init__.py:97: DeprecationWarning: The module numpy.dual is deprecated.  Instead of using dual, use the functions directly from numpy or scipy.
      from numpy.dual import register_func
    /home/apsisdev/.local/lib/python3.8/site-packages/scipy/sparse/sputils.py:17: DeprecationWarning: `np.typeDict` is a deprecated alias for `np.sctypeDict`.
      supported_dtypes = [np.typeDict[x] for x in supported_dtypes]
    /home/apsisdev/.local/lib/python3.8/site-packages/scipy/special/orthogonal.py:81: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      from numpy import (exp, inf, pi, sqrt, floor, sin, cos, around, int,
    [2022/06/19 11:16:38] ppocr INFO: Architecture : 
    [2022/06/19 11:16:38] ppocr INFO:     Backbone : 
    [2022/06/19 11:16:38] ppocr INFO:         depth : [3, 6, 3]
    [2022/06/19 11:16:38] ppocr INFO:         embed_dim : [64, 128, 256]
    [2022/06/19 11:16:38] ppocr INFO:         img_size : [32, 100]
    [2022/06/19 11:16:38] ppocr INFO:         last_stage : True
    [2022/06/19 11:16:38] ppocr INFO:         local_mixer : [[7, 11], [7, 11], [7, 11]]
    [2022/06/19 11:16:38] ppocr INFO:         mixer : ['Local', 'Local', 'Local', 'Local', 'Local', 'Local', 'Global', 'Global', 'Global', 'Global', 'Global', 'Global']
    [2022/06/19 11:16:38] ppocr INFO:         name : SVTRNet
    [2022/06/19 11:16:38] ppocr INFO:         num_heads : [2, 4, 8]
    [2022/06/19 11:16:38] ppocr INFO:         out_channels : 192
    [2022/06/19 11:16:38] ppocr INFO:         out_char_num : 25
    [2022/06/19 11:16:38] ppocr INFO:         patch_merging : Conv
    [2022/06/19 11:16:38] ppocr INFO:         prenorm : False
    [2022/06/19 11:16:38] ppocr INFO:     Head : 
    [2022/06/19 11:16:38] ppocr INFO:         name : CTCHead
    [2022/06/19 11:16:38] ppocr INFO:     Neck : 
    [2022/06/19 11:16:38] ppocr INFO:         encoder_type : reshape
    [2022/06/19 11:16:38] ppocr INFO:         name : SequenceEncoder
    [2022/06/19 11:16:38] ppocr INFO:     Transform : 
    [2022/06/19 11:16:38] ppocr INFO:         name : STN_ON
    [2022/06/19 11:16:38] ppocr INFO:         num_control_points : 20
    [2022/06/19 11:16:38] ppocr INFO:         stn_activation : none
    [2022/06/19 11:16:38] ppocr INFO:         tps_inputsize : [32, 64]
    [2022/06/19 11:16:38] ppocr INFO:         tps_margins : [0.05, 0.05]
    [2022/06/19 11:16:38] ppocr INFO:         tps_outputsize : [32, 100]
    [2022/06/19 11:16:38] ppocr INFO:     algorithm : SVTR
    [2022/06/19 11:16:38] ppocr INFO:     model_type : rec
    [2022/06/19 11:16:38] ppocr INFO: Eval : 
    [2022/06/19 11:16:38] ppocr INFO:     dataset : 
    [2022/06/19 11:16:38] ppocr INFO:         data_dir : /backup2/synthtiger/bangla/PaddleOCR/horizontal_valid/
    [2022/06/19 11:16:38] ppocr INFO:         label_file_list : ['/backup2/synthtiger/bangla/PaddleOCR/horizontal_valid/gt.txt']
    [2022/06/19 11:16:38] ppocr INFO:         name : SimpleDataSet
    [2022/06/19 11:16:38] ppocr INFO:         transforms : 
    [2022/06/19 11:16:38] ppocr INFO:             DecodeImage : 
    [2022/06/19 11:16:38] ppocr INFO:                 channel_first : False
    [2022/06/19 11:16:38] ppocr INFO:                 img_mode : BGR
    [2022/06/19 11:16:38] ppocr INFO:             CTCLabelEncode : None
    [2022/06/19 11:16:38] ppocr INFO:             RecResizeImg : 
    [2022/06/19 11:16:38] ppocr INFO:                 character_dict_path : None
    [2022/06/19 11:16:38] ppocr INFO:                 image_shape : [3, 64, 256]
    [2022/06/19 11:16:38] ppocr INFO:                 padding : False
    [2022/06/19 11:16:38] ppocr INFO:             KeepKeys : 
    [2022/06/19 11:16:38] ppocr INFO:                 keep_keys : ['image', 'label', 'length']
    [2022/06/19 11:16:38] ppocr INFO:     loader : 
    [2022/06/19 11:16:38] ppocr INFO:         batch_size_per_card : 512
    [2022/06/19 11:16:38] ppocr INFO:         drop_last : False
    [2022/06/19 11:16:38] ppocr INFO:         num_workers : 0
    [2022/06/19 11:16:38] ppocr INFO:         shuffle : False
    [2022/06/19 11:16:38] ppocr INFO: Global : 
    [2022/06/19 11:16:38] ppocr INFO:     cal_metric_during_train : True
    [2022/06/19 11:16:38] ppocr INFO:     character_dict_path : /backup2/synthtiger/bangla/PaddleOCR/ppocr/utils/bn_char_synth.txt
    [2022/06/19 11:16:38] ppocr INFO:     character_type : ch
    [2022/06/19 11:16:38] ppocr INFO:     checkpoints : None
    [2022/06/19 11:16:38] ppocr INFO:     distributed : False
    [2022/06/19 11:16:38] ppocr INFO:     epoch_num : 100
    [2022/06/19 11:16:38] ppocr INFO:     eval_batch_step : [0, 5000]
    [2022/06/19 11:16:38] ppocr INFO:     infer_img : doc/imgs_words_en/41.jpg
    [2022/06/19 11:16:38] ppocr INFO:     infer_mode : False
    [2022/06/19 11:16:38] ppocr INFO:     log_smooth_window : 20
    [2022/06/19 11:16:38] ppocr INFO:     max_text_length : 25
    [2022/06/19 11:16:38] ppocr INFO:     pretrained_model : None
    [2022/06/19 11:16:38] ppocr INFO:     print_batch_step : 200
    [2022/06/19 11:16:38] ppocr INFO:     save_epoch_step : 1
    [2022/06/19 11:16:38] ppocr INFO:     save_inference_dir : None
    [2022/06/19 11:16:38] ppocr INFO:     save_model_dir : /backup2/synthtiger/bangla/PaddleOCR/output/rec/svtr/
    [2022/06/19 11:16:38] ppocr INFO:     save_res_path : /backup2/synthtiger/bangla/PaddleOCR/output/rec/predicts_svtr_tiny.txt
    [2022/06/19 11:16:38] ppocr INFO:     use_gpu : True
    [2022/06/19 11:16:38] ppocr INFO:     use_space_char : True
    [2022/06/19 11:16:38] ppocr INFO:     use_visualdl : False
    [2022/06/19 11:16:38] ppocr INFO: Loss : 
    [2022/06/19 11:16:38] ppocr INFO:     name : CTCLoss
    [2022/06/19 11:16:38] ppocr INFO: Metric : 
    [2022/06/19 11:16:38] ppocr INFO:     main_indicator : acc
    [2022/06/19 11:16:38] ppocr INFO:     name : RecMetric
    [2022/06/19 11:16:38] ppocr INFO: Optimizer : 
    [2022/06/19 11:16:38] ppocr INFO:     beta1 : 0.9
    [2022/06/19 11:16:38] ppocr INFO:     beta2 : 0.99
    [2022/06/19 11:16:38] ppocr INFO:     epsilon : 8e-08
    [2022/06/19 11:16:38] ppocr INFO:     lr : 
    [2022/06/19 11:16:38] ppocr INFO:         learning_rate : 0.0005
    [2022/06/19 11:16:38] ppocr INFO:         name : Cosine
    [2022/06/19 11:16:38] ppocr INFO:         warmup_epoch : 2
    [2022/06/19 11:16:38] ppocr INFO:     name : AdamW
    [2022/06/19 11:16:38] ppocr INFO:     no_weight_decay_name : norm pos_embed
    [2022/06/19 11:16:38] ppocr INFO:     one_dim_param_no_weight_decay : True
    [2022/06/19 11:16:38] ppocr INFO:     weight_decay : 0.05
    [2022/06/19 11:16:38] ppocr INFO: PostProcess : 
    [2022/06/19 11:16:38] ppocr INFO:     name : CTCLabelDecode
    [2022/06/19 11:16:38] ppocr INFO: Train : 
    [2022/06/19 11:16:38] ppocr INFO:     dataset : 
    [2022/06/19 11:16:38] ppocr INFO:         data_dir : /backup2/synthtiger/bangla/PaddleOCR/train_data/
    [2022/06/19 11:16:38] ppocr INFO:         label_file_list : ['/backup2/synthtiger/bangla/PaddleOCR/train_data/gt.txt']
    [2022/06/19 11:16:38] ppocr INFO:         name : SimpleDataSet
    [2022/06/19 11:16:38] ppocr INFO:         transforms : 
    [2022/06/19 11:16:38] ppocr INFO:             DecodeImage : 
    [2022/06/19 11:16:38] ppocr INFO:                 channel_first : False
    [2022/06/19 11:16:38] ppocr INFO:                 img_mode : BGR
    [2022/06/19 11:16:38] ppocr INFO:             CTCLabelEncode : None
    [2022/06/19 11:16:38] ppocr INFO:             RecResizeImg : 
    [2022/06/19 11:16:38] ppocr INFO:                 character_dict_path : None
    [2022/06/19 11:16:38] ppocr INFO:                 image_shape : [3, 64, 256]
    [2022/06/19 11:16:38] ppocr INFO:                 padding : False
    [2022/06/19 11:16:38] ppocr INFO:             KeepKeys : 
    [2022/06/19 11:16:38] ppocr INFO:                 keep_keys : ['image', 'label', 'length']
    [2022/06/19 11:16:38] ppocr INFO:     loader : 
    [2022/06/19 11:16:38] ppocr INFO:         batch_size_per_card : 1024
    [2022/06/19 11:16:38] ppocr INFO:         drop_last : True
    [2022/06/19 11:16:38] ppocr INFO:         num_workers : 0
    [2022/06/19 11:16:38] ppocr INFO:         shuffle : True
    [2022/06/19 11:16:38] ppocr INFO: profiler_options : None
    [2022/06/19 11:16:38] ppocr INFO: train with paddle 2.3.0 and device Place(gpu:0)
    [2022/06/19 11:16:38] ppocr INFO: Initialize indexs of datasets:['/backup2/synthtiger/bangla/PaddleOCR/train_data/gt.txt']
    [2022/06/19 11:17:15] ppocr INFO: Initialize indexs of datasets:['/backup2/synthtiger/bangla/PaddleOCR/horizontal_valid/gt.txt']
    W0619 11:17:20.803553 1660197 gpu_context.cc:278] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.6, Runtime API Version: 11.2
    W0619 11:17:20.823755 1660197 gpu_context.cc:306] device: 0, cuDNN Version: 8.1.
    [2022/06/19 11:17:32] ppocr INFO: train from scratch
    [2022/06/19 11:17:32] ppocr INFO: train dataloader has 9253 iters
    [2022/06/19 11:17:32] ppocr INFO: valid dataloader has 1851 iters
    [2022/06/19 11:17:32] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 5000 iterations
    [2022/06/19 14:28:20] ppocr INFO: epoch: [1/100], global_step: 200, lr: 0.000005, acc: 0.000000, norm_edit_dis: 0.000000, loss: 57.202286, avg_reader_cost: 53.43127 s, avg_batch_cost: 57.23867 s, avg_samples: 1024.0, ips: 17.89000 samples/s, eta: 612 days, 20:44:52
    

    Do you need more information? What am I missing? Please help, thanks.
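
As a reading aid for the single training line in the log above, here is a small sketch that recomputes the reported ETA and checks where the batch time goes. All numbers are copied from the log; the linear-warmup formula is an assumption about how the Cosine schedule with warmup_epoch: 2 behaves, not something the log states.

```python
# Values copied from the training log above.
iters_per_epoch = 9253
epoch_num = 100
global_step = 200
avg_batch_cost = 57.23867   # seconds per batch
avg_reader_cost = 53.43127  # seconds per batch spent waiting for data
base_lr = 0.0005
warmup_epoch = 2

# Remaining work multiplied by the average batch time gives the ETA.
remaining_steps = iters_per_epoch * epoch_num - global_step
eta_days = remaining_steps * avg_batch_cost / 86400

# Share of each batch that is spent in the data reader.
reader_share = avg_reader_cost / avg_batch_cost

# Assumption: linear warmup from 0 to base_lr over warmup_epoch epochs.
warmup_lr = base_lr * global_step / (warmup_epoch * iters_per_epoch)

print(f"eta: {eta_days:.1f} days")
print(f"reader share of batch time: {reader_share:.1%}")
print(f"lr at step {global_step}: {warmup_lr:.6f}")
```

Under these assumptions the arithmetic agrees with the logged ETA of roughly 612 days and the logged lr of 0.000005. It also shows that most of each batch is spent reading data (the config uses num_workers: 0) and that step 200 is still very early in warmup. This does not by itself explain the 0.00 accuracy.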

    good first issue recognition status/close 
    opened by mobassir94 29
  • hub serving returns 'Please check data format!', 'results': '', 'status': '-1'} when given a base64-encoded image

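    I am using the quick deployment mode of hub serving, with the endpoint http://<IP address>:8868/predict/ocr_system. I tried two ways of supplying the image as base64:

    1. Reading a local image:

        image = open(image_path, 'rb').read()
        imgBase64 = base64.b64encode(image).decode('utf-8')

    2. Reading the image from a URL:

        content = requests.get(img_url).content
        imgBase64 = base64.b64encode(content).decode('utf-8')

    Both approaches run into the "Please check data format" problem. Most images work, but a small number return the "Please check data format" result after 10 s. How should images be processed before they are sent to hub serving? I also tried converting to an Image first, then calling convert('RGB'), then encoding to base64, and that does not work either.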
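
One way to narrow this kind of problem down is to check the raw bytes before encoding them, since a URL fetch can return something other than an image (for example an error page). The sketch below is an assumption-laden illustration: the helper names `sniff_image_format` and `build_payload` are hypothetical, the signature table is not exhaustive, and the `{"images": [...]}` body shape is assumed rather than confirmed by this thread.

```python
import base64
import json

# Magic-byte prefixes for a few common image formats (illustrative, not exhaustive).
IMAGE_SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF8": "gif",
    b"BM": "bmp",
}


def sniff_image_format(raw):
    """Return a format name if the bytes start with a known image signature, else None."""
    for signature, name in IMAGE_SIGNATURES.items():
        if raw.startswith(signature):
            return name
    return None


def build_payload(raw):
    """Base64-encode raw image bytes into a JSON body with an "images" list (assumed shape)."""
    if sniff_image_format(raw) is None:
        raise ValueError("bytes do not look like a supported image")
    encoded = base64.b64encode(raw).decode("utf-8")
    return json.dumps({"images": [encoded]})


# Usage with a fake JPEG header: enough for the signature check, not a decodable image.
body = build_payload(b"\xff\xd8\xff\xe0" + b"\x00" * 16)
print(json.loads(body)["images"][0][:8])
```

A signature check only rules out obviously wrong input. A file that passes it can still be truncated or in a variant the server-side decoder rejects.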
    opened by pkuyilong 29
  • paddleocr with paddle serving on tensorrt

    The environment is configured as follows:

    • paddleocr-release2.5
    • docker_image: registry.baidubce.com/paddlepaddle/paddle:2.1.3-gpu-cuda10.2-cudnn7
    • tensorrt: 7.2.1.6
    • paddle-gpu: 2.1.1 (chosen to match tensorrt 7.2)
    • paddle-serving-app: 0.7.0
    • paddle-serving-client: 0.7.0
    • paddle-serving-server-gpu: 0.7.0.post102

    Problem: running the Paddle Serving Python pipeline with python web_service.py fails with the following error: The input [conv2d_252.tmp_0] shape of trt subgraph is [-1,96,-1,-1], please enable trt dynamic_shape mode by SetTRTDynamicShapeInfo

    Following https://gitee.com/paddlepaddle/Serving/blob/v0.8.2/doc/TensorRT_Dynamic_Shape_CN.md, I had already added a set_dynamic_shape_info function to the DetOp and RecOp classes in web_service.py, but it has no effect and the same error is still raised.

    opened by sybest1259 28
  • Serving parse failure: IndexError: string index out of range

    (venv) PS D:\orc2\paddleOCR> python tools/test_hubserving.py --server_url=http://127.0.0.1:8868/predict/structure_table --image_dir=D:\1.jpeg
    D:\orc2\paddleOCR\venv\lib\site-packages\skimage\util\dtype.py:27: DeprecationWarning: np.bool8 is a deprecated alias for np.bool_. (Deprecated NumPy 1.24)
    IndexError: string index out of range

    opened by zyzz1974 0
  • PPOCRLabel does not run properly

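    I downloaded the latest PaddleOCR 2.6 and wanted to use PPOCRLabel to train on my own data, but it never runs properly. I followed the official tutorial exactly during installation, and essentially everything installed without problems, but it just will not run. At first it raised an error about np.int; I found the code and changed np.int to np.int32, which fixed that. Once the UI came up, choosing re-recognition raised the following error:

    AttributeError: 'tuple' object has no attribute 'insert'

    Sometimes, when I click directly on the image without having clicked "Rectangle annotation" (矩形标注) first, this error is raised:

    Traceback (most recent call last):
      File "D:\Python310\lib\site-packages\PPOCRLabel\PPOCRLabel.py", line 1425, in scrollRequest
        bar.setValue(bar.value() + bar.singleStep() * units)
    TypeError: setValue(self, int): argument 1 has unexpected type 'float'

    • System Environment: Windows 10
    • Version: Paddle 2.4.1, PaddleOCR 2.6.0. Related components: PPOCRLabel
    • Command Code:
    • Complete Error Message: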
    opened by metoogo 2
  • Slow runtime on large images on CPU

    • System environment: Ubuntu 20.04
    • Version: latest
    • Command code: -
    • Complete error message: -

    Hi @andyjpaddle, I am trying to use PaddleOCR to extract raw text from high-resolution (4K) images of healthcare documents. The extraction quality is very satisfying, but the runtime is often over 16 seconds, which is too slow for my intended use of the OCR engine.

    Because these are images of healthcare documents, they contain a great deal of text, so downscaling the images has not given good results so far and massively increases the word error rate (WER).

    I assume the issue might be the internal tokenizer of PaddleOCR, which generates lots of visual tokens for large images and therefore needs much more time to complete.

    Does any idea come to mind to mitigate the issue? Ideally, the raw text extraction should take around 5 seconds so that further downstream tasks can complete in a reasonable time.

    For context: Python 3.7.15 on a single CPU
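
To make the downscaling trade-off mentioned above concrete, here is a rough sketch of how capping the longer image side changes the number of pixels a detector has to process. The function name `limit_longer_side`, the 960-pixel cap, and the rounding to multiples of 32 are illustrative assumptions and are not taken from this thread.

```python
def limit_longer_side(width, height, limit=960, multiple=32):
    """Scale (width, height) so the longer side is at most `limit`,
    then round each side to the nearest multiple of `multiple`."""
    ratio = min(1.0, limit / max(width, height))
    new_w = max(multiple, round(width * ratio / multiple) * multiple)
    new_h = max(multiple, round(height * ratio / multiple) * multiple)
    return new_w, new_h


# A 4K frame, as described in the issue.
src_w, src_h = 3840, 2160
dst_w, dst_h = limit_longer_side(src_w, src_h)
pixel_reduction = (src_w * src_h) / (dst_w * dst_h)
print(dst_w, dst_h, round(pixel_reduction, 1))
```

Under these assumptions a 4K frame shrinks by about a factor of 16 in pixel count. That is consistent with both effects reported in the issue: full-resolution input is slow, and aggressive downscaling costs accuracy on dense, small text.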

    opened by DiTo97 4
Releases(v2.6.0)
  • v2.6.0(Aug 24, 2022)

    Release Note

    • Release PP-Structurev2, with functions and performance fully upgraded, adapted to Chinese scenes, and new support for Layout Recovery and a one-line command to convert PDF to Word;
    • Layout Analysis optimization: model storage reduced by 95%, while speed increased by 11 times, and the average CPU time-cost is only 41ms;
    • Table Recognition optimization: 3 optimization strategies are designed, and the model accuracy is improved by 6% under comparable time consumption;
    • Key Information Extraction optimization: a visual-independent model structure is designed, the accuracy of semantic entity recognition is increased by 2.8%, and the accuracy of relation extraction is increased by 9.1%.
  • v2.5.0(May 9, 2022)

    Release Note

    • Release PP-OCRv3: With comparable speed, the effect of Chinese scene is further improved by 5% compared with PP-OCRv2, the effect of English scene is improved by 11%, and the average recognition accuracy of 80 language multilingual models is improved by more than 5%.
    • Release PPOCRLabelv2: Add the annotation function for table recognition task, key information extraction task and irregular text image.
    • Release the interactive e-book "Dive into OCR", covering the cutting-edge theory and code practice of full-stack OCR technology.
  • v2.1.1(May 26, 2021)

    Release Note

    1. Newly released model pruning and model quantization tools based on PaddleSlim.
    2. Newly released mobile deployment tools based on Paddle-Lite.
    3. Newly released Android demo of the ppocr system.
    4. Newly released service deployment based on Paddle Serving.
  • v2.1.0(Apr 19, 2021)

  • v2.0.0(Feb 8, 2021)

    Release Note

    I. Support the dynamic graph programming paradigm, adapted to Paddle 2.0, including:

    1. Detection algorithms: DB, EAST, SAST
    2. Recognition algorithms: Rosetta, CRNN, RARE, SRN, STAR-Net
    3. PPOCR Chinese models: (1) Detection models: mobile, server (2) Text direction classification models: mobile (3) Recognition models: mobile, server
    4. Multilingual models: (1) English: mobile (2) Japanese, Korean, French, German, etc., 25 languages in total: mobile

    II. The related deployment work has been adapted, including Inference (Python, C++), whl, and serving

    III. Release the annotation and synthesis tools:

    1. Release a new data synthesis tool, Style-Text, which makes it easy to synthesize a large number of images similar to the target scene image.
    2. Release a new data annotation tool, PPOCRLabel, which helps improve labeling efficiency. The labeling results can be used directly to train the PP-OCR system.
  • v1.1.0(Sep 27, 2020)

Installations for running keras-theano on GPU Upgrade pip and install opencv2 cd ~ pip install --upgrade pip pip install opencv-python Upgrade keras

Berat Kurar Barakat 14 Sep 30, 2022
A synthetic data generator for text recognition

TextRecognitionDataGenerator A synthetic data generator for text recognition What is it for? Generating text image samples to train an OCR software. N

Edouard Belval 2.5k Jan 04, 2023
Aloception is a set of packages for computer vision: aloscene, alodataset, alonet.

Aloception is a set of packages for computer vision: aloscene, alodataset, alonet.

Visual Behavior 86 Dec 28, 2022
Python tool that takes the OCR.space JSON output as input and draws a text overlay on top of the image.

OCR.space OCR Result Checker = Draw OCR overlay on top of image Python tool that takes the OCR.space JSON output as input, and draws an overlay on to

a9t9 4 Oct 18, 2022
Image processing using OpenCV

Image processing using OpenCV Write a program that opens the webcam, and the user selects one of the following on the video: ✅ If the user presses the

M.Najafi 4 Feb 18, 2022
Convert PDF/Image to TXT using EasyOcr - the best OCR engine available!

PDFImage2TXT - DOWNLOAD INSTALLER HERE What can you do with it? Convert scanned PDFs to TXT. Convert scanned Documents to TXT. No coding required!! In

Hans Alemão 2 Feb 22, 2022
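Tianchi 2021 "Global AI Technology Innovation Competition" [Track 1]: Medical Imaging Report Anomaly Detection - Third-Place Solution

Tianchi 2021 "Global AI Technology Innovation Competition" [Track 1]: Medical Imaging Report Anomaly Detection Competition link Personal blog notes Directory structure ├── final------------------------------------final-round solution slides (PPT) ├── preliminary_contest--------------------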

19 Aug 17, 2022
Image Recognition Model Generator

Takes a user-inputted query and generates a machine learning image recognition model that determines if an inputted image is or isn't their query

Christopher Oka 1 Jan 13, 2022
Read Japanese manga inside browser with selectable text.

mokuro Read Japanese manga with selectable text inside a browser. See demo: https://kha-white.github.io/manga-demo mokuro_demo.mp4 Demo contains excer

Maciej Budyś 170 Dec 27, 2022
Document Layout Analysis Projects

Layout_Analysis Introduction This is an implementation of RLSA and X-Y Cut with OpenCV Dependencies OpenCV 3.0+ How to use Compile with g++ : g++ -std

22 Dec 08, 2022
FastOCR is a desktop application for OCR API.

FastOCR FastOCR is a desktop application for OCR API. Installation Arch Linux fastocr-git @ AUR Build from AUR or install with your favorite AUR helpe

Bruce Zhang 58 Jan 07, 2023
A bot that extracts text from images using the Tesseract OCR.

Text from image (OCR) @ocr_text_bot A simple bot to extract text from images. Usage What do I need? An AWS key configured locally, see here. NodeJS. I

Weverton Marques 4 Aug 06, 2021
Hand Detection and Finger Detection on Live Feed

Hand-Detection-On-Live-Feed Hand Detection and Finger Detection on Live Feed Getting Started Install the dependencies $ git clone https://github.com/c

Chauhan Mahaveer 2 Jan 02, 2022
A dataset handling library for computer vision datasets in LOST format

A dataset handling library for computer vision datasets in LOST format

8 Dec 15, 2022
A program that takes in the hand gesture displayed by the user and translates ASL.

Interactive-ASL-Recognition Using the framework mediapipe made by google, OpenCV library and through self teaching, I was able to create a program tha

Riddhi Bajaj 3 Nov 22, 2021
Detecting Text in Natural Image with Connectionist Text Proposal Network (ECCV'16)

Detecting Text in Natural Image with Connectionist Text Proposal Network The codes are used for implementing CTPN for scene text detection, described

Tian Zhi 1.3k Dec 22, 2022
Virtual mouse that can copy files, close tabs, and offers many other features!

AI Virtual Mouse Controller Developed an AI-based system to control the mouse cursor using Python and OpenCV with the real-time camera. Fingertip loca

Diwas Pandey 23 Oct 05, 2021
Text modding tools for FF7R (Final Fantasy VII Remake)

FF7R_text_mod_tools Subtitle modding tools for FF7R (Final Fantasy VII Remake) There are 3 tools I made. make_dualsub_mod.exe: Merges (or swaps) subti

10 Dec 19, 2022
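A sample program that defines a circle with three clicks and applies a polar coordinate transform

click-warpPolar A sample program that defines a circle with three clicks and applies a polar coordinate transform. Requirements OpenCV 3.4.2 or Later Usage Run it as follows. After launch, click three points with the mouse to define the circle. python click-warpPol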

KazuhitoTakahashi 17 Dec 30, 2022
caffe re-implementation of R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection

R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection Abstract This is a caffe re-implementation of R2CNN: Rotational Region CNN fo

candler 80 Dec 28, 2021