A framework for detecting, highlighting and correcting grammatical errors in natural language text.

Overview

Gramformer

Human- and machine-generated text often suffers from grammatical and/or typographical errors: spelling, punctuation, grammar or word-choice errors. Gramformer is a library that exposes 3 separate interfaces to a family of algorithms to detect, highlight and correct grammar errors. To make sure the recommended corrections and highlights are of high quality, it comes with a quality estimator. You can use Gramformer in one or more of the areas mentioned under the "Usecases" section below, or for any other use case you see fit. Gramformer stands on the shoulders of giants: it combines some of the top-notch research in grammar correction. Note: It works at the sentence level and has been trained on sentences of length 128, so it is not (yet) suitable for long prose or paragraphs (stay tuned for upcoming releases).

Usecases for Gramformer

Area 1: Post-processing machine generated text

Machine language generation is becoming mainstream, and so will post-processing of machine-generated text.

  • Conditioned text generation output (Text2Text generation).
    • NMT: Machine-translated output.
    • ASR or STT: Speech-to-text output.
    • HTR: Handwritten text recognition output.
    • Paraphrase generation output.
  • Controlled text generation output (text generation with PPLM) [TBD].
  • Free-form text generation output (text generation) [TBD].

Area 2: Human-In-The-Loop (HITL) text

  • Most supervised NLU (chatbot and conversational) systems need humans/experts to enter or edit text that must be grammatically correct; otherwise the quality of the HITL data can degrade the model over time.

Area 3: Assisted writing for humans

  • Integrating into the custom text editors of your apps. (A poor man's Grammarly, if you will.)

Area 4: Custom platform integration

As of today, grammatical safety nets for authoring social content (posts or comments) or text on messaging platforms are minimal (word-level correction) or non-existent. The onus is on the author to install tools like Grammarly to proofread.

  • Messaging platforms and social platforms can highlight / correct grammatical errors automatically without altering the meaning or intent.

Installation

pip install git+https://github.com/PrithivirajDamodaran/Gramformer.git@v0.1

Quick Start

Corrector - [Available now]

from gramformer import Gramformer
import torch

def set_seed(seed):
  torch.manual_seed(seed)
  if torch.cuda.is_available():
    torch.cuda.manual_seed_all(seed)

set_seed(1212)


gf = Gramformer(models = 2, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all 

influent_sentences = [
    "Matt like fish",
    "the collection of letters was original used by the ancient Romans",
    "We enjoys horror movies",
    "Anna and Mike is going skiing",
    "I walk to the store and I bought milk",
    "We all eat the fish and then made dessert",
    "I will eat fish for dinner and drank milk",
    "what be the reason for everyone leave the company",
]   

for influent_sentence in influent_sentences:
    corrected_sentence = gf.correct(influent_sentence)
    print("[Input] ", influent_sentence)
    print("[Correction] ",corrected_sentence[0])
    print("-" *100)
[Input]  Matt like fish
[Correction]  Matt likes fish
----------------------------------------------------------------------------------------------------
[Input]  the collection of letters was original used by the ancient Romans
[Correction]  The collection of letters was originally used by the ancient Romans.
----------------------------------------------------------------------------------------------------
[Input]  We enjoys horror movies
[Correction]  We enjoy horror movies
----------------------------------------------------------------------------------------------------
[Input]  Anna and Mike is going skiing
[Correction]  Anna and Mike are going skiing
----------------------------------------------------------------------------------------------------
[Input]  I walk to the store and I bought milk
[Correction]  I walked to the store and bought milk.
----------------------------------------------------------------------------------------------------
[Input]  We all eat the fish and then made dessert
[Correction]  We all ate the fish and then made dessert
----------------------------------------------------------------------------------------------------
[Input]  I will eat fish for dinner and drank milk
[Correction]  I'll eat fish for dinner and drink milk.
----------------------------------------------------------------------------------------------------
[Input]  what be the reason for everyone leave the company
[Correction]  what can be the reason for everyone to leave the company.
----------------------------------------------------------------------------------------------------

Challenge with generative models

While Gramformer aims to post-process the outputs of generative models, Gramformer itself is a generative model. So the question arises: who will post-process the Gramformer outputs? (I know, very meta :-)). In general, all generative models tend to generate spurious text sometimes, which we cannot control. So, to make sure the Gramformer grammar corrections (and highlights) are as accurate as possible, a quality estimator (QE) will be added. It can estimate an error-correction quality score and use that as a filter on the Top-N candidates to return only the best ones based on the score.
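
The filtering idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Gramformer's implementation: `filter_by_quality`, `score_fn`, `threshold` and `top_k` are hypothetical names, and `toy_score` is a stand-in for a real quality estimator.

```python
from typing import Callable, List, Tuple


def filter_by_quality(
    candidates: List[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,
    top_k: int = 1,
) -> List[Tuple[str, float]]:
    """Score each candidate, drop low scorers, return the best top_k."""
    scored = [(candidate, score_fn(candidate)) for candidate in candidates]
    # Keep only candidates whose score reaches the threshold.
    kept = [pair for pair in scored if pair[1] >= threshold]
    # Highest score first; the sort is stable, so ties keep input order.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


def toy_score(sentence: str) -> float:
    """Hypothetical scorer for demonstration only: rewards a final period."""
    return 1.0 if sentence.endswith(".") else 0.25


candidates = ["Matt likes fish.", "Matt like fish", "Matt liked fish."]
best = filter_by_quality(candidates, toy_score, threshold=0.5, top_k=2)
print(best)
```

In practice `score_fn` would wrap whatever model estimates correction quality; the threshold and `top_k` values here are arbitrary.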

Corrector with quality estimator - [Coming soon !]

from gramformer import Gramformer
gf = Gramformer(models = 2, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all 
corrected_sentence = gf.correct(<your input sentence>, filter_by_quality=True, max_candidates=3)

Highlighter - [Coming soon !]

from gramformer import Gramformer
gf = Gramformer(models = 1, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all 
highlighted_sentence = gf.highlight(<your input sentence>)
[Input]  Matt like fish
[Highlight]  Matt <e> like </e> fish
----------------------------------------------------------------------------------------------------
[Input]  the collection of letters was original used by the ancient Romans
[Highlight]  the collection of letters was <e> original used </e> by the ancient Romans
----------------------------------------------------------------------------------------------------
[Input]  We enjoys horror movies
[Highlight]  We <e> enjoys horror </e> movies
----------------------------------------------------------------------------------------------------
[Input]  Anna and Mike is going skiing
[Highlight]  Anna and Mike <e> is going </e> skiing
----------------------------------------------------------------------------------------------------
[Input]  I walk to the store and I bought milk
[Highlight]  I <e> walk to </e> the store and I bought milk
----------------------------------------------------------------------------------------------------
[Input]  We all eat the fish and then made dessert
[Highlight]  We all <e> eat the </e> fish and then made dessert
----------------------------------------------------------------------------------------------------
[Input]  I will eat fish for dinner and drank milk
[Highlight]  I will eat fish for dinner and <e> drank milk </e> 
----------------------------------------------------------------------------------------------------
[Input]  what be the reason for everyone leave the company
[Highlight]  <e> what be </e> the reason <e> for everyone </e> <e> leave the </e> company
----------------------------------------------------------------------------------------------------
[Input]  One of the most important issue is the lack of parking spaces at the local mall.
[Highlight]  One of the most important <e> issue is </e> the lack of parking spaces at the local mall.
----------------------------------------------------------------------------------------------------
[Input]  The survey we performed recently showed that most of customers are satisfied.
[Highlight]  The survey we performed recently showed that most <e> of customers </e> are satisfied.
----------------------------------------------------------------------------------------------------
[Input]  I’ve loved classical music ever since I was child.
[Highlight]  I’ve loved classical music ever since I <e> was child </e>.
----------------------------------------------------------------------------------------------------
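
Since the highlighter marks errors with <e> and </e> tags, the flagged spans can be pulled out of the returned string with a regular expression. A minimal sketch, assuming the output format shown above; `extract_error_spans` is a hypothetical helper and not part of the Gramformer API.

```python
import re
from typing import List

# Matches one "<e> ... </e>" span and captures the text between the tags,
# without the surrounding whitespace.
ERROR_SPAN = re.compile(r"<e>\s*(.*?)\s*</e>")


def extract_error_spans(highlighted: str) -> List[str]:
    """Return the text of every span marked as an error, in order."""
    return ERROR_SPAN.findall(highlighted)


single = extract_error_spans("Matt <e> like </e> fish")
multiple = extract_error_spans(
    "<e> what be </e> the reason <e> for everyone </e> <e> leave the </e> company"
)
print(single)    # ['like']
print(multiple)  # ['what be', 'for everyone', 'leave the']
```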

Detector - [Coming soon !]

from gramformer import Gramformer
gf = Gramformer(models = 0, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all 
grammar_fluency_score = gf.detect(<your input sentence>)

Models

Model | Type | Return | Status
prithivida/grammar_error_detector | Classifier | Label | TBD (prithivida/parrot_fluency_on_BERT can be repurposed here, but I would recommend you wait :-))
prithivida/grammar_error_highlighter | Seq2Seq | Grammar errors enclosed in <e> and </e> | Beta
prithivida/grammar_error_correcter | Seq2Seq | The corrected sentence | Beta

Dataset

  • The first idea is to generate the dataset using the techniques mentioned in the first paper highlighted in the references section. You can use the technique on any one of the publicly available Wikipedia edits datasets. Write some rules to filter only the grammatical edits, do some cleanup, and that's it, Bob's your uncle :-).
  • The second, and possibly very complicated and $$$, way is to get some 200M synthetic sentences. This is based on the last paper under the references section. Not recommended, but by all means knock yourself out if you are interested :-)
  • The third source is to repurpose the GEC task data.
  • I combined sources 1 and 3 to get my training data (still working on source 2, will keep you posted).
  • I ended up with ~1M records, which after some heuristics-based filtering amounted to ~1/2M records.
  • It took ~12 hours to train each of the above models.
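
The actual filtering rules are not spelled out here, so the following is only one plausible shape for such a rule, not the heuristics used for Gramformer. It keeps a (source, target) sentence pair when the two differ but still share most of their tokens, on the assumption that grammatical edits change only a few words. The function name `looks_like_grammar_edit` and the `min_ratio` value are hypothetical.

```python
from difflib import SequenceMatcher


def looks_like_grammar_edit(source: str, target: str, min_ratio: float = 0.6) -> bool:
    """Keep pairs that differ, but only by a small token-level edit."""
    src_tokens = source.split()
    tgt_tokens = target.split()
    # Empty or unchanged sentences carry no correction signal.
    if not src_tokens or not tgt_tokens or src_tokens == tgt_tokens:
        return False
    # Token-level similarity in [0, 1]; low values suggest a rewrite
    # rather than a local grammatical fix.
    ratio = SequenceMatcher(None, src_tokens, tgt_tokens).ratio()
    return ratio >= min_ratio


pairs = [
    ("We enjoys horror movies", "We enjoy horror movies"),
    ("Matt like fish", "Matt like fish"),
    ("Matt like fish", "The weather is nice today"),
]
kept = [pair for pair in pairs if looks_like_grammar_edit(*pair)]
print(kept)
```

A real pipeline would likely combine several such rules; the threshold would need tuning on the data at hand.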

Benchmark

TBD (I will benchmark Gramformer models against the following publicly available models shortly: salesken/grammar_correction and flexudy/t5-small-wav2vec2-grammar-fixer.)

References

Citation

TBD

Comments
  • [Spacy error] Can't find model 'en'

    Hello I have successfully installed the Gramformer on my windows PC. but when I run, it gives the following error.

    Traceback (most recent call last):
      File "main.py", line 27, in <module>
        grammar_correction = Gramformer(models = 1, use_gpu=True)
      File "~~\.conda\envs\nlp-transformer\lib\site-packages\gramformer\gramformer.py", line 8, in __init__
        self.annotator = errant.load('en')
      File "~~\.conda\envs\nlp-transformer\lib\site-packages\errant\__init__.py", line 16, in load
        nlp = nlp or spacy.load(lang, disable=["ner"])
      File "~~\.conda\envs\nlp-transformer\lib\site-packages\spacy\__init__.py", line 30, in load
        return util.load_model(name, **overrides)
      File "~~\.conda\envs\nlp-transformer\lib\site-packages\spacy\util.py", line 175, in load_model
        raise IOError(Errors.E050.format(name=name))
    OSError: [E050] Can't find model 'en'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.
    
    opened by muzamil47 3
  • Commercial use issue

    Hey @PrithivirajDamodaran

    The readme states that Gramformer versions above 1.0 are allowed for commercial use - however, this is not currently the case as the grammar_error_correcter_v1 model has been trained using the non-commercial WI&Locness data, even though the documentation states otherwise:

    The grammar_error_correcter_v1 model is actually identical to the previous grammar_error_correcter model which is trained using the non-commercial WI&Locness data – they have identical weights, which you can verify with this script

    As the models are the same, this means that both models have been trained using the non-commercial WI&Locness data, and the grammar_error_correcter_v1 model along with Gramformer v1.1 and v1.2 should not be allowed for commercial use.

    Could you please update the readme to clarify this, or upload a new model that has not been trained using WI&Locness?

    Thanks

    question 
    opened by SimonHFL 2
  • Use corrector for highligher

    Hi @PrithivirajDamodaran

    This is a great framework. Is it possible (for now) to use model corrector (model=2) for the highlighter(model=1)? After getting some correction, match it to the input and give prefix and suffix () for the mismatch?

    Thanks

    question 
    opened by ilhamsyahids 2
  • Error loading the tokenizer in transformers==4.4.2

    I'm getting error when initializing the class object, specifically at tokenizer loading:

    In [6]: correction_tokenizer = AutoTokenizer.from_pretrained(correction_model_tag)
    ---------------------------------------------------------------------------
    Exception                                 Traceback (most recent call last)
    <ipython-input-6-d34dd9c5fe99> in <module>
    ----> 1 correction_tokenizer = AutoTokenizer.from_pretrained(correction_model_tag)
    
    ~/anaconda3/envs/npe/lib/python3.6/site-packages/transformers/models/auto/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
        414             tokenizer_class_py, tokenizer_class_fast = TOKENIZER_MAPPING[type(config)]
        415             if tokenizer_class_fast and (use_fast or tokenizer_class_py is None):
    --> 416                 return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
        417             else:
        418                 if tokenizer_class_py is not None:
    
    ~/anaconda3/envs/npe/lib/python3.6/site-packages/transformers/tokenization_utils_base.py in from_pretrained(cls, pretrained_model_name_or_path, *init_inputs, **kwargs)
       1703
       1704         return cls._from_pretrained(
    -> 1705             resolved_vocab_files, pretrained_model_name_or_path, init_configuration, *init_inputs, **kwargs
       1706         )
       1707
    
    ~/anaconda3/envs/npe/lib/python3.6/site-packages/transformers/tokenization_utils_base.py in _from_pretrained(cls, resolved_vocab_files, pretrained_model_name_or_path, init_configuration, *init_inputs, **kwargs)
       1774         # Instantiate tokenizer.
       1775         try:
    -> 1776             tokenizer = cls(*init_inputs, **init_kwargs)
       1777         except OSError:
       1778             raise OSError(
    
    ~/anaconda3/envs/npe/lib/python3.6/site-packages/transformers/models/t5/tokenization_t5_fast.py in __init__(self, vocab_file, tokenizer_file, eos_token, unk_token, pad_token, extra_ids, additional_special_tokens, **kwargs)
        134             extra_ids=extra_ids,
        135             additional_special_tokens=additional_special_tokens,
    --> 136             **kwargs,
        137         )
        138
    
    ~/anaconda3/envs/npe/lib/python3.6/site-packages/transformers/tokenization_utils_fast.py in __init__(self, *args, **kwargs)
         85         if fast_tokenizer_file is not None and not from_slow:
         86             # We have a serialization from tokenizers which let us directly build the backend
    ---> 87             fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
         88         elif slow_tokenizer is not None:
         89             # We need to convert a slow tokenizer to build the backend
    
    Exception: data did not match any variant of untagged enum PyPreTokenizerTypeWrapper at line 1 column 329667
    

    transformers==4.4.2.

    The installation package didn't specify the transformers version that this library is using. What should be the correct version? Or is it version independent and it's something else?

    opened by zhangyilun 2
  • Figma Gramformer Plugin

    Figma is used in creating a lot of digital interfaces today, a Gramformer Figma plugin would go a long way. I'll be willing to design the interface for the plugin but I don't know how to make the plugin itself. I hope someone takes this up. This is a link to get started https://www.figma.com/plugin-docs/setup/

    enhancement 
    opened by ayoolafelix 2
  • README.md get_edits and get_highlight example small fixes

    Hi there, when I copy and pasted the examples in the README locally I noticed they were bugging out for the edits and highlights (were only pulling the first char of the sentence for errant). Providing the full sentence seemed to get the desired output.

    opened by parisac 1
  • Training dataset

    Hi Prithiviraj,

    Is there any chance you'd be able to release the training dataset you used to train the Gramformer huggingface model? I see that there are some details on the slices of data that you brought together in the Readme, but it would be useful to be able to use the same data that you used.

    The main reason I'm asking is I'd like to create a model that can take correct text and add grammatical errors to it. So I was thinking I could take the dataset you used to train Gramformer and use the inverse to train a model that does the inverse. I can go through the data prep process as you did, but it would definitely be easier if I were able to reuse yours, and it might be useful for reproducibility for others as well.

    invalid question 
    opened by d4buss 1
  • OSError: Can't load config for 'prithivida/grammar_error_correcter'

    Hi, I have been using your code for the last few days. Suddenly, it started to crash.

    Have a look at the code and error given below:

    Code (Link: https://huggingface.co/prithivida/grammar_error_correcter_v1):

    from gramformer import Gramformer
    import torch
    
    def set_seed(seed):
      torch.manual_seed(seed)
      if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
    
    set_seed(1212)
    
    
    gf = Gramformer(models = 2, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all 
    
    influent_sentences = [
        "Matt like fish",
        "the collection of letters was original used by the ancient Romans",
        "We enjoys horror movies",
        "Anna and Mike is going skiing",
        "I walk to the store and I bought milk",
        "We all eat the fish and then made dessert",
        "I will eat fish for dinner and drank milk",
        "what be the reason for everyone leave the company",
    ]   
    
    for influent_sentence in influent_sentences:
        corrected_sentence = gf.correct(influent_sentence)
        print("[Input] ", influent_sentence)
        print("[Correction] ",corrected_sentence[0])
        print("-" *100)
    

    Error

    404 Client Error: Not Found for url: https://huggingface.co/prithivida/grammar_error_correcter/resolve/main/config.json
    ---------------------------------------------------------------------------
    HTTPError                                 Traceback (most recent call last)
    /usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
        491                 use_auth_token=use_auth_token,
    --> 492                 user_agent=user_agent,
        493             )
    
    7 frames
    /usr/local/lib/python3.7/dist-packages/transformers/file_utils.py in cached_path(url_or_filename, cache_dir, force_download, proxies, resume_download, user_agent, extract_compressed_file, force_extract, use_auth_token, local_files_only)
       1278             use_auth_token=use_auth_token,
    -> 1279             local_files_only=local_files_only,
       1280         )
    
    /usr/local/lib/python3.7/dist-packages/transformers/file_utils.py in get_from_cache(url, cache_dir, force_download, proxies, etag_timeout, resume_download, user_agent, use_auth_token, local_files_only)
       1441             r = requests.head(url, headers=headers, allow_redirects=False, proxies=proxies, timeout=etag_timeout)
    -> 1442             r.raise_for_status()
       1443             etag = r.headers.get("X-Linked-Etag") or r.headers.get("ETag")
    
    /usr/local/lib/python3.7/dist-packages/requests/models.py in raise_for_status(self)
        942         if http_error_msg:
    --> 943             raise HTTPError(http_error_msg, response=self)
        944 
    
    HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/prithivida/grammar_error_correcter/resolve/main/config.json
    
    During handling of the above exception, another exception occurred:
    
    OSError                                   Traceback (most recent call last)
    <ipython-input-10-0f43e537fe87> in <module>
         10 
         11 
    ---> 12 gf = Gramformer(models = 2, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all
         13 
         14 influent_sentences = [
    
    /usr/local/lib/python3.7/dist-packages/gramformer/gramformer.py in __init__(self, models, use_gpu)
         14 
         15     if models == 2:
    ---> 16         self.correction_tokenizer = AutoTokenizer.from_pretrained(correction_model_tag)
         17         self.correction_model     = AutoModelForSeq2SeqLM.from_pretrained(correction_model_tag)
         18         self.correction_model     = self.correction_model.to(device)
    
    /usr/local/lib/python3.7/dist-packages/transformers/models/auto/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
        400         kwargs["_from_auto"] = True
        401         if not isinstance(config, PretrainedConfig):
    --> 402             config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
        403 
        404         use_fast = kwargs.pop("use_fast", True)
    
    /usr/local/lib/python3.7/dist-packages/transformers/models/auto/configuration_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
        428         """
        429         kwargs["_from_auto"] = True
    --> 430         config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
        431         if "model_type" in config_dict:
        432             config_class = CONFIG_MAPPING[config_dict["model_type"]]
    
    /usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
        502                 f"- or '{pretrained_model_name_or_path}' is the correct path to a directory containing a {CONFIG_NAME} file\n\n"
        503             )
    --> 504             raise EnvironmentError(msg)
        505 
        506         except json.JSONDecodeError:
    
    OSError: Can't load config for 'prithivida/grammar_error_correcter'. Make sure that:
    
    - 'prithivida/grammar_error_correcter' is a correct model identifier listed on 'https://huggingface.co/models'
    
    - or 'prithivida/grammar_error_correcter' is the correct path to a directory containing a config.json file
    ![Screenshot from 2021-07-01 18-36-07](https://user-images.githubusercontent.com/4704211/124133526-5a9da900-da9b-11eb-9733-61df46ab01e1.png)
    
    

    Possible Solution:

    Rename this link from: https://huggingface.co/prithivida/grammar_error_correcter/ to: https://huggingface.co/prithivida/grammar_error_correcter_v1/

    Please help me fix this. thank you

    opened by Nomiluks 1
  • Inference Issue !!!

    OSError Traceback (most recent call last)

    /usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
        241             if resolved_config_file is None:
    --> 242                 raise EnvironmentError
        243             config_dict = cls._dict_from_json_file(resolved_config_file)

    OSError:

    During handling of the above exception, another exception occurred:

    OSError Traceback (most recent call last)

    3 frames

    in ()
    ----> 1 correction_tokenizer = AutoTokenizer.from_pretrained("prithivida/grammar_error_correcter")
          2 correction_model = AutoModelForSeq2SeqLM.from_pretrained("prithivida/grammar_error_correcter")
          3 print("[Gramformer] Grammar error correction model loaded..")
          4
          5

    /usr/local/lib/python3.7/dist-packages/transformers/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
        204         config = kwargs.pop("config", None)
        205         if not isinstance(config, PretrainedConfig):
    --> 206             config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
        207
        208         if "bert-base-japanese" in str(pretrained_model_name_or_path):

    /usr/local/lib/python3.7/dist-packages/transformers/configuration_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
        201
        202         """
    --> 203         config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
        204
        205         if "model_type" in config_dict:

    /usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
        249                 f"- or '{pretrained_model_name_or_path}' is the correct path to a directory containing a {CONFIG_NAME} file\n\n"
        250             )
    --> 251             raise EnvironmentError(msg)
        252
        253         except json.JSONDecodeError:

    OSError: Can't load config for 'prithivida/grammar_error_correcter'. Make sure that:

    • 'prithivida/grammar_error_correcter' is a correct model identifier listed on 'https://huggingface.co/models'

    • or 'prithivida/grammar_error_correcter' is the correct path to a directory containing a config.json file

    Solutions for this issue????

    invalid 
    opened by sabhi27 1
  • How to train Gramformer on non-English languages.

    Hey @PrithivirajDamodaran , Great work on building Gramformer, ive played with it and the results are amazing.

    I work on pushing nlp forward in under represented languages, and hence i humbly request you to please tell me how do i train gramformer on non-English sentences ?

    I checked out your HuggingFace page 'https://huggingface.co/prithivida/grammar_error_correcter' but coudn't find any resources on how to train gramformer from scratch. If you could help me in training Gramformer on non-English langauages it would really mean a lot to me. Do let me know.

    Thanks

    question 
    opened by StephennFernandes 1
  • pip install is erroring out,

    I am unable to do pip install of the package, here is the error:

    Collecting git+https://github.com/PrithivirajDamodaran/Gramformer.git@v0.1
    Cloning https://github.com/PrithivirajDamodaran/Gramformer.git (to revision v0.1) to c:\users\sumit\appdata\local\temp\pip-req-build-sw54k_0h
    ERROR: Error [WinError 2] The system cannot find the file specified while executing command git clone -q https://github.com/PrithivirajDamodaran/Gramformer.git 'C:\Users\Sumit\AppData\Local\Temp\pip-req-build-sw54k_0h'
    ERROR: Cannot find command 'git' - do you have 'git' installed and in your PATH?

    I also tried directly downloading the repo and tried executing the package. Model is not present in location(correction_model_tag = "prithivida/grammar_error_correcter"). Any way to download the pretrain model.

    opened by ranjan-sumit 1
  • OSError: [E050] Can't find model 'en'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

    OSError Traceback (most recent call last)
    ~\AppData\Local\Temp\ipykernel_9376\2706950954.py in <module>
         25
         26
    ---> 27 gf = Gramformer(models = 1, use_gpu=False) # 1=corrector, 2=detector
         28
         29 influent_sentences = [

    ~\anaconda3_9\envs\python37\lib\site-packages\gramformer\gramformer.py in __init__(self, models, use_gpu)
          7         import errant
          8         #self.annotator = errant.load('en_core_web_sm')
    ----> 9         self.annotator = errant.load('en') # en is deprecated from spacy 3.0 onwards
         10
         11         if use_gpu:

    ~\anaconda3_9\envs\python37\lib\site-packages\errant\__init__.py in load(lang, nlp)
         17
         18     # Load spacy
    ---> 19     nlp = nlp or spacy.load(lang, disable=["ner"])
         20
         21     # Load language edit merger

    ~\anaconda3_9\envs\python37\lib\site-packages\spacy\__init__.py in load(name, **overrides)
         28     if depr_path not in (True, False, None):
         29         warnings.warn(Warnings.W001.format(path=depr_path), DeprecationWarning)
    ---> 30     return util.load_model(name, **overrides)
         31
         32

    ~\anaconda3_9\envs\python37\lib\site-packages\spacy\util.py in load_model(name, **overrides)
        173     elif hasattr(name, "exists"):  # Path or Path-like to model data
        174         return load_model_from_path(name, **overrides)
    --> 175     raise IOError(Errors.E050.format(name=name))
        176
        177

    OSError: [E050] Can't find model 'en'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

    opened by vky2998 2
  • Word limit

    The model is having trouble with long sentences. Specially if the words in the sentences are in upper case. It outputs only limited sentence as an output and the rest neglected sentence is shown as error.

    opened by Talib6509 0
  • Gramformer Highlight function not working

    Hello... I'm trying to get the edits between two sentences, but the highlight function is not working. Has anybody faced the same issue? Many thanks in advance

    opened by NourAlMerey 0
  • Suggestions to improve the grammar results for short sentences

    Hello..!

    I have used Gramformer model and I think this could be quite useful for checking and correcting some grammar points, especially for correcting singular/plural, verb forms and tenses, and spelling. However, some other grammar points (like correcting sentence structure, comparative/superlative forms, pronoun cases, etc.) seem to be still tricky.

    Note: I need to use the model on short sentences.

    The biggest challenge I faced in my case is: (Please suggest how to avoid it or improve it or changing some parameters...) 1 - Since it corrects grammar by generating text, most of the time it completely changes the sentence and rephrase it. How can we avoid this.

    whose bags you can bring? --> Which bags you can bring? (Just a sample, and sometime it generates totally changed verbose sentence)

    2 - Every time I give the same sentence as input, it generates different outputs:

    I go can there: three outputs in three different run ("I go, there"., "can I go there?", "I go back there.")

    Thanks!

    opened by muzamil47 0
Releases(v1.4)
  • v1.4(Aug 10, 2021)

    ⚡️ Features added/changed

    ✅ Correct API uses a ranker to sort good quality corrections. ✅ Highlight API returns sents w/errors marked up as readable tags. ✅ Edit API returns error types, positions, and respective corrections. ✅ The latest model checkpoint has been refreshed w/more data.

    License update to MIT.

Owner
Prithivida
Applied NLP, XAI for NLP and Data Engineering

Python Code Quality Authority 982 Jan 03, 2023
Flake8 extension to provide force-check option

flake8-force Flake8 extension to provide force-check option. When this option is enabled, flake8 performs all checks even if the target file cannot be

Kenichi Maehashi 9 Oct 29, 2022
Flake8 wrapper to make it nice, legacy-friendly, configurable.

THE PROJECT IS ARCHIVED Forks: https://github.com/orsinium/forks It's a Flake8 wrapper to make it cool. Lint md, rst, ipynb, and more. Shareable and r

Life4 232 Dec 16, 2022
❄️ A flake8 plugin to help you write better list/set/dict comprehensions.

flake8-comprehensions A flake8 plugin that helps you write better list/set/dict comprehensions. Requirements Python 3.6 to 3.9 supported. Installation

Adam Johnson 398 Dec 23, 2022
Flake8 plugin that checks import order against various Python Style Guides

flake8-import-order A flake8 and Pylama plugin that checks the ordering of your imports. It does not check anything else about the imports. Merely tha

Python Code Quality Authority 270 Nov 24, 2022
mypy plugin to type check Kubernetes resources

kubernetes-typed mypy plugin to dynamically define types for Kubernetes objects. Features Type checking for Custom Resources Type checking forkubernet

Artem Yarmoliuk 16 Oct 10, 2022
:sparkles: Surface lint errors during code review

✨ Linty Fresh ✨ Keep your codebase sparkly clean with the power of LINT! Linty Fresh parses lint errors and report them back to GitHub as comments on

Lyft 183 Dec 18, 2022
Enforce the same configuration across multiple projects

Nitpick Flake8 plugin to enforce the same tool configuration (flake8, isort, mypy, Pylint...) across multiple Python projects. Useful if you maintain

Augusto W. Andreoli 315 Dec 25, 2022
A plugin for Flake8 that checks pandas code

pandas-vet pandas-vet is a plugin for flake8 that provides opinionated linting for pandas code. It began as a project during the PyCascades 2019 sprin

Jacob Deppen 146 Dec 28, 2022
A plugin for Flake8 that provides specializations for type hinting stub files

flake8-pyi A plugin for Flake8 that provides specializations for type hinting stub files, especially interesting for linting typeshed. Functionality A

Łukasz Langa 58 Jan 04, 2023
A plugin for Flake8 finding likely bugs and design problems in your program. Contains warnings that don't belong in pyflakes and pycodestyle.

flake8-bugbear A plugin for Flake8 finding likely bugs and design problems in your program. Contains warnings that don't belong in pyflakes and pycode

Python Code Quality Authority 869 Dec 30, 2022