✨Fast Coreference Resolution in spaCy with Neural Networks

Overview

NeuralCoref 4.0: Coreference Resolution in spaCy with Neural Networks.

NeuralCoref is a pipeline extension for spaCy 2.1+ which annotates and resolves coreference clusters using a neural network. NeuralCoref is production-ready, integrated in spaCy's NLP pipeline and extensible to new training datasets.

For a brief introduction to coreference resolution and NeuralCoref, please refer to our blog post. NeuralCoref is written in Python/Cython and comes with a pre-trained statistical model for English only.

NeuralCoref is accompanied by a visualization client NeuralCoref-Viz, a web interface powered by a REST server that can be tried online. NeuralCoref is released under the MIT license.

Version 4.0 out now! Available on pip and compatible with spaCy 2.1+.

  • Operating system: macOS / OS X · Linux · Windows (Cygwin, MinGW, Visual Studio)
  • Python version: Python 3.6+ (only 64 bit)
  • Package managers: [pip]

Install NeuralCoref

Install NeuralCoref with pip

This is the easiest way to install NeuralCoref.

pip install neuralcoref

spacy.strings.StringStore size changed error

If you get an error mentioning "spacy.strings.StringStore size changed, may indicate binary incompatibility" when loading NeuralCoref with import neuralcoref, you will have to install NeuralCoref from the distribution's sources instead of the wheels, so that NeuralCoref is built against the most recent version of spaCy for your system.

In this case, simply re-install neuralcoref as follows:

pip uninstall neuralcoref
pip install neuralcoref --no-binary neuralcoref

Installing SpaCy's model

To be able to use NeuralCoref you will also need to have an English model for SpaCy.

You can use whichever English model works for your application, but note that the performance of NeuralCoref depends strongly on the performance of the spaCy model, and in particular on its tagger, parser and NER components. A larger spaCy English model will thus also improve the quality of the coreference resolution (see the Internals and Model section below for some details).

Here is an example of how you can install spaCy and a (small) English model for spaCy. More information can be found on spaCy's website:

pip install -U spacy
python -m spacy download en

Install NeuralCoref from source

You can also install NeuralCoref from source. You will need to install the dependencies first, which include Cython and spaCy.

Here is the process:

python -m venv .env
source .env/bin/activate
git clone https://github.com/huggingface/neuralcoref.git
cd neuralcoref
pip install -r requirements.txt
pip install -e .

Internals and Model

NeuralCoref is made of two sub-modules:

  • a rule-based mentions-detection module which uses SpaCy's tagger, parser and NER annotations to identify a set of potential coreference mentions, and
  • a feed-forward neural network which computes a coreference score for each pair of potential mentions.
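
The two-stage design above can be pictured with a toy sketch. Everything in it is an assumption made for illustration: mentions are plain strings rather than spaCy spans, toy_score is a hypothetical stand-in for the neural network, and the threshold argument is a simplification, so this is not how NeuralCoref is actually implemented.

```python
from typing import Callable, List, Optional


def toy_score(antecedent: str, mention: str) -> float:
    """Hypothetical stand-in for the neural scorer: 1.0 if the last words match."""
    same_head = antecedent.lower().split()[-1] == mention.lower().split()[-1]
    return 1.0 if same_head else 0.0


def link_mentions(
    mentions: List[str],
    score: Callable[[str, str], float] = toy_score,
    threshold: float = 0.5,
    max_dist: int = 50,
) -> List[Optional[int]]:
    """For each mention, return the index of its best-scoring antecedent, or None."""
    links: List[Optional[int]] = []
    for i, mention in enumerate(mentions):
        best_index, best_score = None, threshold
        # Only the `max_dist` mentions preceding the current one are candidates.
        for j in range(max(0, i - max_dist), i):
            pair_score = score(mentions[j], mention)
            if pair_score > best_score:
                best_index, best_score = j, pair_score
        links.append(best_index)
    return links


# Stage 1 (mention detection) is assumed to have produced this list already.
mentions = ["the movie star", "a dog", "the star", "the dog"]
links = link_mentions(mentions)
```

Here links pairs "the star" with "the movie star" and "the dog" with "a dog". Shrinking max_dist in this toy version removes the distant candidates, which mirrors the speed versus accuracy trade-off described for the max_dist parameter.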

The first time you import NeuralCoref in Python, it will download the weights of the neural network model to a cache folder.

The cache folder is set by default to ~/.neuralcoref_cache (see file_utils.py), but this behavior can be overridden by setting the environment variable NEURALCOREF_CACHE to point to another location.

The cache folder can be safely deleted at any time; the module will download the model again the next time it is loaded.
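
As a minimal sketch of the override, the snippet below sets NEURALCOREF_CACHE from Python. The helper resolve_cache_dir is hypothetical and only mirrors the rule described above; it is not part of NeuralCoref. Since the download happens on import, the variable presumably needs to be set before import neuralcoref runs.

```python
import os


def resolve_cache_dir() -> str:
    """Illustrative helper (not part of NeuralCoref): env variable first, else the default."""
    default_dir = os.path.join(os.path.expanduser("~"), ".neuralcoref_cache")
    return os.getenv("NEURALCOREF_CACHE", default_dir)


# Point the cache to a folder of your choice (assumed here: ./neuralcoref_cache).
os.environ["NEURALCOREF_CACHE"] = os.path.join(os.getcwd(), "neuralcoref_cache")
cache_dir = resolve_cache_dir()

# `import neuralcoref` would come after this point.
```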

You can get more information on the location, downloading and caching process of the internal model by activating Python's logging module before loading NeuralCoref as follows:

import logging
logging.basicConfig(level=logging.INFO)
import neuralcoref
>>> INFO:neuralcoref:Getting model from https://s3.amazonaws.com/models.huggingface.co/neuralcoref/neuralcoref.tar.gz or cache
>>> INFO:neuralcoref.file_utils:https://s3.amazonaws.com/models.huggingface.co/neuralcoref/neuralcoref.tar.gz not found in cache, downloading to /var/folders/yx/cw8n_njx3js5jksyw_qlp8p00000gn/T/tmp_8y5_52m
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40155833/40155833 [00:06<00:00, 6679263.76B/s]
>>> INFO:neuralcoref.file_utils:copying /var/folders/yx/cw8n_njx3js5jksyw_qlp8p00000gn/T/tmp_8y5_52m to cache at /Users/thomaswolf/.neuralcoref_cache/f46bc05a4bfba2ae0d11ffd41c4777683fa78ed357dc04a23c67137abf675e14.7d6f9a6fecf5cf09e74b65f85c7d6896b21decadb2554d486474f63b95ec4633
>>> INFO:neuralcoref.file_utils:creating metadata file for /Users/thomaswolf/.neuralcoref_cache/f46bc05a4bfba2ae0d11ffd41c4777683fa78ed357dc04a23c67137abf675e14.7d6f9a6fecf5cf09e74b65f85c7d6896b21decadb2554d486474f63b95ec4633
>>> INFO:neuralcoref.file_utils:removing temp file /var/folders/yx/cw8n_njx3js5jksyw_qlp8p00000gn/T/tmp_8y5_52m
>>> INFO:neuralcoref:extracting archive file /Users/thomaswolf/.neuralcoref_cache/f46bc05a4bfba2ae0d11ffd41c4777683fa78ed357dc04a23c67137abf675e14.7d6f9a6fecf5cf09e74b65f85c7d6896b21decadb2554d486474f63b95ec4633 to dir /Users/thomaswolf/.neuralcoref_cache/neuralcoref

Loading NeuralCoref

Adding NeuralCoref to the pipe of an English SpaCy Language

Here is the recommended way to instantiate NeuralCoref and add it to spaCy's pipeline of annotations:

# Load your usual SpaCy model (one of SpaCy English models)
import spacy
nlp = spacy.load('en')

# Add neural coref to SpaCy's pipe
import neuralcoref
neuralcoref.add_to_pipe(nlp)

# You're done. You can now use NeuralCoref the same way you usually manipulate a spaCy document and its annotations.
doc = nlp(u'My sister has a dog. She loves him.')

doc._.has_coref
doc._.coref_clusters

Loading NeuralCoref and adding it manually to the pipe of an English SpaCy Language

An equivalent way of adding NeuralCoref to a SpaCy model pipe is to instantiate the NeuralCoref class first and then add it manually to the pipe of the SpaCy Language model.

# Load your usual SpaCy model (one of SpaCy English models)
import spacy
nlp = spacy.load('en')

# load NeuralCoref and add it to the pipe of SpaCy's model
import neuralcoref
coref = neuralcoref.NeuralCoref(nlp.vocab)
nlp.add_pipe(coref, name='neuralcoref')

# You're done. You can now use NeuralCoref the same way you usually manipulate a spaCy document and its annotations.
doc = nlp(u'My sister has a dog. She loves him.')

doc._.has_coref
doc._.coref_clusters

Using NeuralCoref

NeuralCoref will resolve the coreferences and annotate them as extension attributes in the spaCy Doc, Span and Token objects under the ._. dictionary.

Here is the list of the annotations:

Attribute | Type | Description
doc._.has_coref | boolean | Whether any coreference has been resolved in the Doc
doc._.coref_clusters | list of Cluster | All the clusters of coreferring mentions in the doc
doc._.coref_resolved | unicode | Unicode representation of the doc where each coreferring mention is replaced by the main mention in the associated cluster
doc._.coref_scores | Dict of Dict | Scores of the coreference resolution between mentions
span._.is_coref | boolean | Whether the span has at least one coreferring mention
span._.coref_cluster | Cluster | Cluster of mentions that corefer with the span
span._.coref_scores | Dict | Scores of the coreference resolution of a span with other mentions (if applicable)
token._.in_coref | boolean | Whether the token is inside at least one coreferring mention
token._.coref_clusters | list of Cluster | All the clusters of coreferring mentions that contain the token
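
To make the doc._.coref_resolved description concrete, here is a simplified sketch of the replacement step on plain Python data. The names resolve_text, Token and Mention are hypothetical, tokens are modeled as (text, trailing whitespace) pairs, and the first mention of each cluster is assumed to be the main one; the real attribute works on spaCy objects and may differ in details.

```python
from typing import List, Tuple

Token = Tuple[str, str]    # (text, trailing whitespace)
Mention = Tuple[int, int]  # (start, end) token indices, end exclusive


def resolve_text(tokens: List[Token], clusters: List[List[Mention]]) -> str:
    """Replace every non-main mention by the text of its cluster's main mention."""
    pieces = [text + space for text, space in tokens]
    for cluster in clusters:
        main_start, main_end = cluster[0]  # assumption: first mention is the main one
        main_text = "".join(t + s for t, s in tokens[main_start:main_end]).strip()
        for start, end in cluster[1:]:
            # Write the main mention at the first token and blank out the rest,
            # keeping the whitespace that followed the replaced mention.
            pieces[start] = main_text + tokens[end - 1][1]
            for k in range(start + 1, end):
                pieces[k] = ""
    return "".join(pieces)


tokens = [("My", " "), ("sister", " "), ("has", " "), ("a", " "), ("dog", ""), (".", " "),
          ("She", " "), ("loves", " "), ("him", ""), (".", "")]
clusters = [[(0, 2), (6, 7)], [(3, 5), (8, 9)]]
resolved = resolve_text(tokens, clusters)
# resolved == "My sister has a dog. My sister loves a dog."
```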

A Cluster is a cluster of coreferring mentions which has 3 attributes and a few methods to simplify the navigation inside a cluster:

Attribute or method | Type / Return type | Description
i | int | Index of the cluster in the Doc
main | Span | Span of the most representative mention in the cluster
mentions | list of Span | List of all the mentions in the cluster
__getitem__ | returns Span | Access a mention in the cluster
__iter__ | yields Span | Iterate over mentions in the cluster
__len__ | returns int | Number of mentions in the cluster
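
The interface listed above behaves like a small sequence. The class below is a pure-Python stand-in written for illustration only (ToyCluster is a hypothetical name, and mentions are strings instead of spaCy Span objects); it shows what indexing, iterating and len() mean for a cluster.

```python
from typing import Iterator, List


class ToyCluster:
    """Pure-Python stand-in for a coreference cluster (illustration only)."""

    def __init__(self, i: int, main: str, mentions: List[str]) -> None:
        self.i = i                # index of the cluster in the document
        self.main = main          # most representative mention
        self.mentions = mentions  # all mentions in the cluster

    def __getitem__(self, index: int) -> str:
        return self.mentions[index]

    def __iter__(self) -> Iterator[str]:
        return iter(self.mentions)

    def __len__(self) -> int:
        return len(self.mentions)


cluster = ToyCluster(i=1, main="a dog", mentions=["a dog", "him"])
last_mention = cluster[-1]    # same as cluster.mentions[-1]
all_mentions = list(cluster)  # iteration yields each mention in turn
size = len(cluster)           # number of mentions
```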

Navigating the coreference cluster chains

You can also easily navigate the coreference cluster chains and display clusters and mentions.

Here are some examples; try them out for yourself.

import spacy
import neuralcoref
nlp = spacy.load('en')
neuralcoref.add_to_pipe(nlp)

doc = nlp(u'My sister has a dog. She loves him')

doc._.coref_clusters
doc._.coref_clusters[1].mentions
doc._.coref_clusters[1].mentions[-1]
doc._.coref_clusters[1].mentions[-1]._.coref_cluster.main

token = doc[-1]
token._.in_coref
token._.coref_clusters

span = doc[-1:]
span._.is_coref
span._.coref_cluster.main
span._.coref_cluster.main._.coref_cluster

Important: NeuralCoref mentions are spaCy Span objects, which means you can access all the usual Span attributes such as span.start (index of the first token of the span in the document), span.end (index of the first token after the span in the document), etc.

Ex: doc._.coref_clusters[1].mentions[-1].start will give you the index of the first token of the last mention of the second coreference cluster in the document.

Parameters

You can pass several additional parameters to neuralcoref.add_to_pipe or NeuralCoref() to control the behavior of NeuralCoref.

Here is the full list of these parameters and their descriptions:

Parameter | Type | Description
greedyness | float | A number between 0 and 1 determining how greedy the model is about making coreference decisions (more greedy means more coreference links). The default value is 0.5.
max_dist | int | How many mentions back to look when considering possible antecedents of the current mention. Decreasing the value will cause the system to run faster but less accurately. The default value is 50.
max_dist_match | int | The system will consider linking the current mention to a preceding one further than max_dist away if they share a noun or proper noun. In this case, it looks max_dist_match away instead. The default value is 500.
blacklist | boolean | Whether the system should resolve coreferences for pronouns in the following list: ["i", "me", "my", "you", "your"]. The default value is True (coreference resolved).
store_scores | boolean | Whether the system should store the scores for the coreferences in annotations. The default value is True.
conv_dict | dict(str, list(str)) | A conversion dictionary that you can use to replace the embeddings of rare words (keys) by an average of the embeddings of a list of common words (values). Ex: conv_dict={"Angela": ["woman", "girl"]} will help resolve coreferences for Angela by using the embeddings for the more common woman and girl instead of the embedding of Angela. This currently only works for single words (not for groups of words).
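
The averaging idea behind conv_dict can be sketched with toy vectors. The three-dimensional embeddings and the averaged_embedding helper below are made up for illustration; NeuralCoref's own embeddings are larger and are handled internally.

```python
from typing import Dict, List

# Toy 3-dimensional embeddings (invented values, for illustration only).
embeddings: Dict[str, List[float]] = {
    "woman": [1.0, 0.0, 2.0],
    "girl": [3.0, 2.0, 0.0],
}


def averaged_embedding(words: List[str], table: Dict[str, List[float]]) -> List[float]:
    """Average the embeddings of `words`, dimension by dimension."""
    vectors = [table[word] for word in words]
    return [sum(values) / len(vectors) for values in zip(*vectors)]


conv_dict = {"Angela": ["woman", "girl"]}

# Give each rare word (key) the average embedding of its common words (values).
for rare_word, common_words in conv_dict.items():
    embeddings[rare_word] = averaged_embedding(common_words, embeddings)

# embeddings["Angela"] is now [2.0, 1.0, 1.0]
```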

How to change a parameter

import spacy
import neuralcoref

# Let's load a SpaCy model
nlp = spacy.load('en')

# First way we can control a parameter
neuralcoref.add_to_pipe(nlp, greedyness=0.75)

# Another way we can control a parameter
nlp.remove_pipe("neuralcoref")  # This removes the current neuralcoref instance from spaCy's pipe
coref = neuralcoref.NeuralCoref(nlp.vocab, greedyness=0.75)
nlp.add_pipe(coref, name='neuralcoref')

Using the conversion dictionary parameter to help resolve rare words

Here is an example of how we can use the conv_dict parameter to help resolve coreferences of a rare word such as a name:

import spacy
import neuralcoref

nlp = spacy.load('en')

# Let's try before using the conversion dictionary:
neuralcoref.add_to_pipe(nlp)
doc = nlp(u'Deepika has a dog. She loves him. The movie star has always been fond of animals')
doc._.coref_clusters
doc._.coref_resolved
# >>> [Deepika: [Deepika, She, him, The movie star]]
# >>> 'Deepika has a dog. Deepika loves Deepika. Deepika has always been fond of animals'
# >>> Not very good...

# Here are three ways we can add the conversion dictionary
nlp.remove_pipe("neuralcoref")
neuralcoref.add_to_pipe(nlp, conv_dict={'Deepika': ['woman', 'actress']})
# or
nlp.remove_pipe("neuralcoref")
coref = neuralcoref.NeuralCoref(nlp.vocab, conv_dict={'Deepika': ['woman', 'actress']})
nlp.add_pipe(coref, name='neuralcoref')
# or after NeuralCoref is already in SpaCy's pipe, by modifying NeuralCoref in the pipeline
nlp.get_pipe('neuralcoref').set_conv_dict({'Deepika': ['woman', 'actress']})

# Let's try again with the conversion dictionary:
doc = nlp(u'Deepika has a dog. She loves him. The movie star has always been fond of animals')
doc._.coref_clusters
doc._.coref_resolved
# >>> [Deepika: [Deepika, She, The movie star], a dog: [a dog, him]]
# >>> 'Deepika has a dog. Deepika loves a dog. Deepika has always been fond of animals'
# >>> A lot better!

Using NeuralCoref as a server

A simple example of a server script for integrating NeuralCoref in a REST API is provided in examples/server.py.

To use it you need to install falcon first:

pip install falcon

You can then start the server as follows:

cd examples
python ./server.py

And query the server like this:

curl --data-urlencode "text=My sister has a dog. She loves him." -G localhost:8000
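
The same query can be sent from Python with the standard library alone. This is a sketch that assumes the server from examples/server.py is listening on localhost:8000 as in the curl command; build_query_url and query_server are hypothetical helper names, and the response is returned as raw text because its exact format is not described here.

```python
import urllib.parse
import urllib.request


def build_query_url(text: str, host: str = "localhost", port: int = 8000) -> str:
    """Build the GET URL carrying the URL-encoded `text` parameter."""
    query = urllib.parse.urlencode({"text": text}, quote_via=urllib.parse.quote)
    return f"http://{host}:{port}/?{query}"


def query_server(text: str) -> str:
    """Send the request and return the raw response body (needs a running server)."""
    with urllib.request.urlopen(build_query_url(text)) as response:
        return response.read().decode("utf-8")


url = build_query_url("My sister has a dog. She loves him.")
# With the server running: print(query_server("My sister has a dog. She loves him."))
```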

There are many other ways you can manage and deploy NeuralCoref. Some examples can be found in spaCy Universe.

Re-train the model / Extend to another language

If you want to retrain the model or train it on another language, see our training instructions as well as our blog post.

Comments
  • binary incompatibility

    I'm using the current spaCy from the master branch, and getting this error:

    RuntimeWarning: spacy.tokens.span.Span size changed, may indicate binary incompatibility. Expected 72 from C header, got 80 from PyObject

    I'm assuming this happens because span.pxd has changed after the 2.1 release: https://github.com/explosion/spaCy/commits/master/spacy/tokens/span.pxd

    I tried reinstalling with

    pip install neuralcoref --no-binary neuralcoref

    But the warning remains and the program crashes when I run nlp(doc):

    Process finished with exit code -1073741819 (0xC0000005)

    Any idea on how to fix this? I'm compiling spaCy from sources too, so I was hoping not to have to do the same for neuralcoref ...

    upgrade install 
    opened by svlandeg 29
  • ExtraData: unpack(b) received extra data.

    I get the following error while loading a custom model with:

    ...
    neuralcoref.add_to_pipe(nlp)
    
    model in init = True
     ExtraData: unpack(b) received extra data. 
    ---------------------------------------------------------------------------
    ExtraData                                 Traceback (most recent call last)
    <ipython-input-6-3f11485ad4f8> in <module>
    ----> 1 neuralcoref.add_to_pipe(nlp)
    
    /workspace/neuralcoref_02/neuralcoref_with_training_mods/neuralcoref/__init__.py in add_to_pipe(nlp, **kwargs)
         40 
         41 def add_to_pipe(nlp, **kwargs):
    ---> 42     coref = NeuralCoref(nlp.vocab, **kwargs)
         43     nlp.add_pipe(coref, name="neuralcoref")
         44     return nlp
    
    neuralcoref.pyx in neuralcoref.neuralcoref.NeuralCoref.__init__()
    
    neuralcoref.pyx in neuralcoref.neuralcoref.NeuralCoref.from_disk()
    
    /opt/conda/lib/python3.6/site-packages/thinc/neural/_classes/model.py in from_bytes(self, bytes_data)
        353 
        354     def from_bytes(self, bytes_data):
    --> 355         data = srsly.msgpack_loads(bytes_data)
        356         weights = data[b"weights"]
        357         queue = [self]
    
    /opt/conda/lib/python3.6/site-packages/srsly/_msgpack_api.py in msgpack_loads(data, use_list)
         27     # msgpack-python docs suggest disabling gc before unpacking large messages
         28     gc.disable()
    ---> 29     msg = msgpack.loads(data, raw=False, use_list=use_list)
         30     gc.enable()
         31     return msg
    
    /opt/conda/lib/python3.6/site-packages/srsly/msgpack/__init__.py in unpackb(packed, **kwargs)
         58         object_hook = kwargs.get('object_hook')
         59         kwargs['object_hook'] = functools.partial(_decode_numpy, chain=object_hook)
    ---> 60     return _unpackb(packed, **kwargs)
         61 
         62 
    
    _unpacker.pyx in srsly.msgpack._unpacker.unpackb()
    
    ExtraData: unpack(b) received extra data.
    

    That's what my model folder looks like.

    image

    the model was generated as explained below (see mail from chieter).

    wontfix usage 
    opened by SimonF89 23
  • NeuralCoref-3.0 can't load the new spacy model

    I couldn't load the spaCy model en-coref-sm. I installed both neuralcoref-3.0 and en-coref-sm by downloading them and running setup.py; I also tried pip install for both. Once the installation completed, loading the spaCy model throws the exception below.

    Traceback (most recent call last):
      File "/home/extraction/CoreferenceResolver.py", line 5, in <module>
        from neuralcoref import Coref
      File "/usr/local/lib/python2.7/dist-packages/neuralcoref-3.0-py2.7-linux-x86_64.egg/neuralcoref/__init__.py", line 3, in <module>
        from .neuralcoref import NeuralCoref
      File "neuralcoref.pyx", line 101, in init neuralcoref.neuralcoref
    TypeError: must be char, not unicode

    Please provide me the clear steps to begin with the new neuralcoref

    ubuntu 
    opened by Praveenabiginfo 21
  • spacy.strings.StringStore has the wrong size, try recompiling

    Spacy works perfectly fine for me with the usual spacy-provided models, but trying to load en_coref_md or en_coref_lg fails with the following message:

    $ pip install https://github.com/huggingface/neuralcoref-models/releases/download/en_coref_md-3.0.0/en_coref_md-3.0.0.tar.gz
    $ python
    Python 3.7.0 (default, Jun 28 2018, 07:39:16)
    [Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
    >>> import spacy
    >>> spacy.load('en_coref_md')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/yyy/miniconda2/envs/xxx/lib/python3.7/site-packages/spacy/__init__.py", line 17, in load
        return util.load_model(name, **overrides)
      File "/yyy/miniconda2/envs/xxx/lib/python3.7/site-packages/spacy/util.py", line 114, in load_model
        return load_model_from_package(name, **overrides)
      File "/yyy/miniconda2/envs/xxx/lib/python3.7/site-packages/spacy/util.py", line 134, in load_model_from_package
        cls = importlib.import_module(name)
      File "/yyy/miniconda2/envs/xxx/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 728, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "/yyy/miniconda2/envs/xxx/lib/python3.7/site-packages/en_coref_md/__init__.py", line 6, in <module>
        from en_coref_md.neuralcoref import NeuralCoref
      File "/yyy/miniconda2/envs/xxx/lib/python3.7/site-packages/en_coref_md/neuralcoref/__init__.py", line 1, in <module>
        from .neuralcoref import NeuralCoref
      File "strings.pxd", line 23, in init en_coref_md.neuralcoref.neuralcoref
    ValueError: spacy.strings.StringStore has the wrong size, try recompiling. Expected 88, got 112
    
    >>> spacy.__version__
    '2.0.11'
    

    My environment:

    python 3.7
    spacy 2.0.11
    mac
    

    Not sure if this makes any difference, but spacy was installed via conda while coref was installed via pip. This is part of the output of conda list

    spacy                     2.0.11           py37h6440ff4_2
    en-coref-md               3.0.0                     <pip>
    
    upgrade install 
    opened by fersarr 19
  • Training Neuralcoref for Dutch does not work

    Dear guys,

    Firstly, thank you guys so much for this interesting work. I'm training the neuralcoref model for Dutch language using SoNar corpus, at first, I used this script to convert the MMAX format to CONLL format. After that, I trained a w2v model to prepare the static_word_embedding files. I have a few questions that I could not answer myself and I could not also find anywhere else.

    • I don't know what the tuned_word_embedding files are; whenever I ran conllparser.py, it just complained about those files being missing. Looking deeper into the original tuned_word_embedding, I could see that it is similar to static_word_embeddings; however, there are words that appear in both the static and tuned word embeddings, and there are words that only appear in tuned_word_embeddings. For this reason, I just used exactly the same word embeddings file for both static and tuned. It seemed to work (at least it did not throw any complaint, but I'm not sure whether it works or not).
    • I have no idea how you constructed the MISSING and UNK tokens in those static/tuned word embeddings.
    • When I run the training code, it runs quite well at first but then displays this error (I think it's from Perl): image

    I came across many topics as well as posting questions on many threads, however I still got no help or guidance. Thank you so much for any help that any of you can provide.

    With best regards, Eric

    wontfix training 
    opened by EricLe-dev 18
  • Using OntoNotes 5.0 to generate coNLL files

    Description

    I am currently stuck at the "Get the data" section for training the neural coreference model. As a newbie, I have little understanding of how to convert the skeleton files to CoNLL files. Here are the commands specified in the guide:

    skeleton2conll.sh -D [path_to_ontonotes_train_folder] [path_to_skeleton_train_folder]
    skeleton2conll.sh -D [path_to_ontonotes_test_folder] [path_to_skeleton_test_folder]
    skeleton2conll.sh -D [path_to_ontonotes_dev_folder] [path_to_skeleton_dev_folder]

    Result

    Here is my command. image

    Here is the output in case the image won't load:

    $ ".\conll-2012-scripts\conll-2012\v3\scripts\skeleton2conll.sh" -D ".\ontonotes-release-5.0\data\files\data\" ".\conll-2012-train\conll-2012\"
    please make sure that you are pointing to the directory 'conll-2012'

    Data

    • OntoNotes 5.0 from LDC (through email)
    • Training data (v4)
    • Development data (v4)
    • Test data (official, v9)
    • CoNLL 2012 scripts (v3)

    The last four are from this link.

    Steps to reproduce

    1. Download the data
    2. Extract the data
    3. Run the command skeleton2conll.sh -D [path/to/conll-2012-train-v0/data/files/data] [path/to/conll-2012]

    Build/Platform

    Windows 10, Git Bash (mingw64), python 3.6, cpu (no CUDA)

    Alternatively, if someone knows how to use CoNLL-formatted OntoNotes 5.0, I can also open an issue about it.

    wontfix training 
    opened by vrian 15
  • Attribute Error

    Code:

    import spacy
    import en_coref_md

    nlp = en_coref_md.load()
    doc = nlp(u'My sister has a dog. She loves him.')

    doc._.has_coref
    doc._.coref_clusters

    Error:

    AttributeError Traceback (most recent call last) in ()
          2 import en_coref_md
          3
    ----> 4 nlp = en_coref_md.load()
          5 doc = nlp(u'My sister has a dog. She loves him.')
          6

    ~\Anaconda3\Scripts\en_coref_md_init_.py in load(**overrides)
         13 overrides['disable'] = disable + ['neuralcoref']
         14 nlp = load_model_from_init_py(__file__, **overrides)
    ---> 15 coref = neuralcoref.NeuralCoref(nlp.vocab)
         16 coref.from_disk(nlp.path / 'neuralcoref')
         17 nlp.add_pipe(coref, name='neuralcoref')

    AttributeError: module 'neuralcoref' has no attribute 'NeuralCoref'

    windows 
    opened by humehta 15
  • Python stopped Working

    Hi,

    I am a Windows 10 user, working on spaCy 2.1.4 with the English web-lg model (v2.1.0). After adding neuralcoref to the pipeline, I get a "Python stopped working" error as soon as I parse.

    Wanted to know what is causing this error.

    upgrade install 
    opened by RandomForestGump 14
  • #include "ios" error in mac Mojave

    Hi, I encounter the following error when I try to install the models

    Processing ./en_coref_sm-3.0.0.tar.gz
    Requirement already satisfied: spacy>=>=2.0.0a18 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from en-coref-sm==3.0.0) (2.0.12)
    Requirement already satisfied: numpy>=1.7 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (1.15.2)
    Collecting regex==2017.4.5 (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0)
    Requirement already satisfied: requests<3.0.0,>=2.13.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (2.18.4)
    Requirement already satisfied: preshed<2.0.0,>=1.0.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (1.0.0)
    Requirement already satisfied: murmurhash<0.29,>=0.28 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.28.0)
    Requirement already satisfied: plac<1.0.0,>=0.9.6 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.9.6)
    Requirement already satisfied: ujson>=1.35 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (1.35)
    Requirement already satisfied: cymem<1.32,>=1.30 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (1.31.2)
    Requirement already satisfied: dill<0.3,>=0.2 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.2.7.1)
    Requirement already satisfied: thinc<6.11.0,>=6.10.3 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (6.10.3)
    Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (3.0.4)
    Requirement already satisfied: idna<2.7,>=2.5 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (2.6)
    Requirement already satisfied: urllib3<1.23,>=1.21.1 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (1.22)
    Requirement already satisfied: certifi>=2017.4.17 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (2018.8.24)
    Requirement already satisfied: six<2.0.0,>=1.10.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (1.11.0)
    Requirement already satisfied: cytoolz<0.10,>=0.9.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.9.0.1)
    Requirement already satisfied: tqdm<5.0.0,>=4.10.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (4.22.0)
    Requirement already satisfied: msgpack-numpy<1.0.0,>=0.4.1 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.4.1)
    Requirement already satisfied: wrapt<1.11.0,>=1.10.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (1.10.11)
    Requirement already satisfied: msgpack<1.0.0,>=0.5.6 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.5.6)
    Requirement already satisfied: toolz>=0.8.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from cytoolz<0.10,>=0.9.0->thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.9.0)
    Requirement already satisfied: msgpack-python>=0.3.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from msgpack-numpy<1.0.0,>=0.4.1->thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.5.4)
    Building wheels for collected packages: en-coref-sm
      Running setup.py bdist_wheel for en-coref-sm ... error
      Complete output from command /Users/kyoungrok/anaconda/bin/python -u -c "import setuptools, tokenize;__file__='/private/var/folders/8n/v5p4940n2xbcn7svgbrqb8zh0000gn/T/pip-req-build-cor5g7o5/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /private/var/folders/8n/v5p4940n2xbcn7svgbrqb8zh0000gn/T/pip-wheel-ww8axwrm --python-tag cp36:
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.macosx-10.7-x86_64-3.6
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm
      copying en_coref_sm/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/neuralcoref
      copying en_coref_sm/neuralcoref/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/neuralcoref
      copying en_coref_sm/__init__.pxd -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0
      copying en_coref_sm/en_coref_sm-3.0.0/tokenizer -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0
      copying en_coref_sm/en_coref_sm-3.0.0/meta.json -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/ner
      copying en_coref_sm/en_coref_sm-3.0.0/ner/lower_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/ner
      copying en_coref_sm/en_coref_sm-3.0.0/ner/moves -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/ner
      copying en_coref_sm/en_coref_sm-3.0.0/ner/cfg -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/ner
      copying en_coref_sm/en_coref_sm-3.0.0/ner/tok2vec_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/ner
      copying en_coref_sm/en_coref_sm-3.0.0/ner/upper_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/ner
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/parser
      copying en_coref_sm/en_coref_sm-3.0.0/parser/lower_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/parser
      copying en_coref_sm/en_coref_sm-3.0.0/parser/moves -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/parser
      copying en_coref_sm/en_coref_sm-3.0.0/parser/cfg -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/parser
      copying en_coref_sm/en_coref_sm-3.0.0/parser/tok2vec_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/parser
      copying en_coref_sm/en_coref_sm-3.0.0/parser/upper_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/parser
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/vocab
      copying en_coref_sm/en_coref_sm-3.0.0/vocab/vectors -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/vocab
      copying en_coref_sm/en_coref_sm-3.0.0/vocab/lexemes.bin -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/vocab
      copying en_coref_sm/en_coref_sm-3.0.0/vocab/strings.json -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/vocab
      copying en_coref_sm/en_coref_sm-3.0.0/vocab/key2row -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/vocab
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref
      copying en_coref_sm/en_coref_sm-3.0.0/neuralcoref/cfg -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref
      copying en_coref_sm/en_coref_sm-3.0.0/neuralcoref/single_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref
      copying en_coref_sm/en_coref_sm-3.0.0/neuralcoref/pairs_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref/static_vectors
      copying en_coref_sm/en_coref_sm-3.0.0/neuralcoref/static_vectors/vectors -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref/static_vectors
      copying en_coref_sm/en_coref_sm-3.0.0/neuralcoref/static_vectors/key2row -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref/static_vectors
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref/tuned_vectors
      copying en_coref_sm/en_coref_sm-3.0.0/neuralcoref/tuned_vectors/vectors -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref/tuned_vectors
      copying en_coref_sm/en_coref_sm-3.0.0/neuralcoref/tuned_vectors/key2row -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref/tuned_vectors
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/tagger
      copying en_coref_sm/en_coref_sm-3.0.0/tagger/tag_map -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/tagger
      copying en_coref_sm/en_coref_sm-3.0.0/tagger/cfg -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/tagger
      copying en_coref_sm/en_coref_sm-3.0.0/tagger/model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/tagger
      copying en_coref_sm/meta.json -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm
      copying en_coref_sm/neuralcoref/neuralcoref.pyx -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/neuralcoref
      copying en_coref_sm/neuralcoref/__init__.pxd -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/neuralcoref
      copying en_coref_sm/neuralcoref/neuralcoref.pxd -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/neuralcoref
      running build_ext
      building 'en_coref_sm.neuralcoref.neuralcoref' extension
      creating build/temp.macosx-10.7-x86_64-3.6
      creating build/temp.macosx-10.7-x86_64-3.6/en_coref_sm
      creating build/temp.macosx-10.7-x86_64-3.6/en_coref_sm/neuralcoref
      gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/kyoungrok/anaconda/include -arch x86_64 -I/Users/kyoungrok/anaconda/include -arch x86_64 -I/Users/kyoungrok/anaconda/include/python3.6m -I/private/var/folders/8n/v5p4940n2xbcn7svgbrqb8zh0000gn/T/pip-req-build-cor5g7o5/include -I/Users/kyoungrok/anaconda/include/python3.6m -c en_coref_sm/neuralcoref/neuralcoref.cpp -o build/temp.macosx-10.7-x86_64-3.6/en_coref_sm/neuralcoref/neuralcoref.o
      warning: include path for stdlibc++ headers not found; pass '-std=libc++' on the command line to use the libc++ standard library instead [-Wstdlibcxx-not-found]
      In file included from en_coref_sm/neuralcoref/neuralcoref.cpp:580:
      In file included from /private/var/folders/8n/v5p4940n2xbcn7svgbrqb8zh0000gn/T/pip-req-build-cor5g7o5/include/numpy/arrayobject.h:15:
      In file included from /private/var/folders/8n/v5p4940n2xbcn7svgbrqb8zh0000gn/T/pip-req-build-cor5g7o5/include/numpy/ndarrayobject.h:17:
      In file included from /private/var/folders/8n/v5p4940n2xbcn7svgbrqb8zh0000gn/T/pip-req-build-cor5g7o5/include/numpy/ndarraytypes.h:1728:
      /private/var/folders/8n/v5p4940n2xbcn7svgbrqb8zh0000gn/T/pip-req-build-cor5g7o5/include/numpy/npy_deprecated_api.h:11:2: warning: "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-W#warnings]
      #warning "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION"
       ^
      en_coref_sm/neuralcoref/neuralcoref.cpp:583:10: fatal error: 'ios' file not found
      #include "ios"
               ^~~~~
      2 warnings and 1 error generated.
      error: command 'gcc' failed with exit status 1
    
      ----------------------------------------
      Failed building wheel for en-coref-sm
    
    opened by kyoungrok0517 14
  • Unable to import modules

    Hi,

    I get the following error when I try to run either of the simple examples in your README file:

    Traceback (most recent call last):
      File "/Users/maximild/src/MaxQA/src/test.py", line 1, in <module>
        import en_coref_md
      File "/Users/maximild/anaconda3/lib/python3.6/site-packages/en_coref_md/__init__.py", line 6, in <module>
        from en_coref_md.neuralcoref import NeuralCoref
      File "/Users/maximild/anaconda3/lib/python3.6/site-packages/en_coref_md/neuralcoref/__init__.py", line 1, in <module>
        from .neuralcoref import NeuralCoref
      File "strings.pxd", line 23, in init en_coref_md.neuralcoref.neuralcoref
    ValueError: spacy.strings.StringStore has the wrong size, try recompiling. Expected 88, got 112

    I appear to have successfully downloaded the en_coref_md model, but I am unable to import it. I'm using spaCy 2.0.11 and Python 3.6 if that helps.

    Any suggestions on what might be wrong?

    Thanks!

    opened by BBCMax 14
  • Extension 'has_coref' already exists on Doc.

    My code:

    import spacy
    import en_coref_sm

    nlp = en_coref_sm.load()
    doc = nlp(u'The lungs are located in the chest.They are conical in shape.')

    print(doc._.has_coref)
    print(doc._.coref_clusters)

    I ran into the following error when I input my own sentence:

    ValueError                                Traceback (most recent call last)
    in ()
          2 import en_coref_sm
          3
    ----> 4 nlp = en_coref_sm.load()
          5 doc = nlp(u'The lungs are located in the chest.They are conical in shape.')
          6

    ~\Anaconda3\lib\site-packages\en_coref_sm\__init__.py in load(**overrides)
         13     overrides['disable'] = disable + ['neuralcoref']
         14     nlp = load_model_from_init_py(__file__, **overrides)
    ---> 15     coref = NeuralCoref(nlp.vocab)
         16     coref.from_disk(nlp.path / 'neuralcoref')
         17     nlp.add_pipe(coref, name='neuralcoref')

    neuralcoref.pyx in en_coref_sm.neuralcoref.neuralcoref.NeuralCoref.__init__()

    doc.pyx in spacy.tokens.doc.Doc.set_extension()

    ValueError: [E090] Extension 'has_coref' already exists on Doc. To overwrite the existing extension, set force=True on Doc.set_extension.

    opened by humehta 14
  • Regarding finetuning neuralcoref

    I have my own spaCy model for custom NER, and I want to add coreference resolution for my detected entities. Would the existing pretrained model work, or would I have to create a new dataset for it?

    opened by Tanmay98 0
  • CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package. When extract() or extractall() is called on a tarfile object without sanitizing input, a maliciously crafted .tar file can perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks whether all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

    If you have further questions, you may contact us through this project's lead researcher, Kasimir Schulz.
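
The check described in this issue (verify every member's destination before extracting, raise otherwise) can be sketched as follows. This is a minimal illustration and not the patch from the pull request: the helper names `is_within_directory` and `safe_extractall` are hypothetical, and the sketch only validates member paths, not symlink or hard-link targets. Recent Python versions also ship extraction filters in `tarfile` that may be preferable where available.

```python
import io
import os
import tarfile


def is_within_directory(directory, target):
    """Return True if target resolves to a location inside directory."""
    abs_directory = os.path.realpath(directory)
    abs_target = os.path.realpath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory


def safe_extractall(tar, path="."):
    """Extract every member of tar into path, refusing path traversal.

    All members are checked before anything is written, so a single
    offending entry aborts the whole extraction.
    """
    for member in tar.getmembers():
        member_path = os.path.join(path, member.name)
        if not is_within_directory(path, member_path):
            raise ValueError(
                f"Blocked path traversal in tar member: {member.name!r}"
            )
    tar.extractall(path)
```

A call such as `tar.extractall(dest)` would then become `safe_extractall(tar, dest)`.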

    opened by TrellixVulnTeam 0
  • Results completely differ from web-demo

    When using neuralcoref master with spaCy==2.1.0, I can use neuralcoref just fine. However, the results differ drastically from the version deployed at huggingface.co/neuralcoref.

    "She is close to the habour" yields: [image: grafik]

    Whereas the same text executed via examples/server.py yields an empty reply

    ❯ curl --data-urlencode "text=she is close to the habour" -G localhost:8000
    {}%
    

    I can confirm that my curl call succeeds with other prompts.
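
For comparing the local server against the web demo from a script, the curl call above can be reproduced with the Python standard library. This is a minimal sketch that assumes the server from examples/server.py listens on localhost:8000 and reads a `text` query parameter, as in the command above, and that it replies with UTF-8 JSON. The function names are hypothetical.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_coref_url(text, base="http://localhost:8000"):
    """Build the GET URL that `curl --data-urlencode "text=..." -G` requests."""
    # quote_via=quote encodes spaces as %20, matching curl's URL encoding.
    query = urlencode({"text": text}, quote_via=quote)
    return f"{base}/?{query}"


def query_coref_server(text, base="http://localhost:8000"):
    """Send the request and decode the JSON reply (assumed to be UTF-8)."""
    with urlopen(build_coref_url(text, base)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `query_coref_server("she is close to the habour")` against a running server should return the same reply that curl prints.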

    {
      "mentions": [
        {"start": 0, "end": 3, "text": "she", "resolved": "she"},
        {"start": 40, "end": 43, "text": "she", "resolved": "she"}
      ],
      "clusters": [["she", "she"]],
      "resolved": "she is close to the habour. where might she be heading?"
    }
    

    It seems like NOMINAL is missing somehow.

    opened by chris-aeviator 0
  • (base) C:\Users\sk136\neuralcoref>python -m neuralcoref.train.learn --train ./data/train/ --eval ./data/dev/ facing problem while executing.. this command

    . . .
    🌋 Construct test file
    Writing in C:\Users\sk136\neuralcoref\neuralcoref\train\test_mentions.txt
    🌋 Computing score
    Error during the scoring
    Command '['perl', 'C:\Users\sk136\neuralcoref\neuralcoref\train\scorer_wrapper.pl', 'muc', './data/dev//key.txt', 'C:\Users\sk136\neuralcoref\neuralcoref\train\test_mentions.txt']' returned non-zero exit status 2.
    Can't locate CorScorer.pm in @INC (you may need to install the CorScorer module) (@INC contains: scorer/lib /usr/lib/perl5/site_perl /usr/share/perl5/site_perl /usr/lib/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib/perl5/core_perl /usr/share/perl5/core_perl) at C:\Users\sk136\neuralcoref\neuralcoref\train\scorer_wrapper.pl line 16.
    BEGIN failed--compilation aborted at C:\Users\sk136\neuralcoref\neuralcoref\train\scorer_wrapper.pl line 16.

    Traceback (most recent call last):
      File "C:\Users\sk136\anaconda3\lib\runpy.py", line 197, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "C:\Users\sk136\anaconda3\lib\runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "C:\Users\sk136\neuralcoref\neuralcoref\train\learn.py", line 565, in <module>
        run_model(args)
      File "C:\Users\sk136\neuralcoref\neuralcoref\train\learn.py", line 175, in run_model
        eval_evaluator.test_model()
      File "C:\Users\sk136\neuralcoref\neuralcoref\train\evaluator.py", line 180, in test_model
        self.get_score(file_path=ALL_MENTIONS_PATH)
      File "C:\Users\sk136\neuralcoref\neuralcoref\train\evaluator.py", line 283, in get_score
        scorer_out = subprocess.check_output(
      File "C:\Users\sk136\anaconda3\lib\subprocess.py", line 424, in check_output
        return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
      File "C:\Users\sk136\anaconda3\lib\subprocess.py", line 528, in run
        raise CalledProcessError(retcode, process.args,
    subprocess.CalledProcessError: Command '['perl', 'C:\Users\sk136\neuralcoref\neuralcoref\train\scorer_wrapper.pl', 'muc', './data/dev//key.txt', 'C:\Users\sk136\neuralcoref\neuralcoref\train\test_mentions.txt']' returned non-zero exit status 2.

    perl related issue
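
The log shows that @INC contains the relative entry `scorer/lib`, so Perl can only find CorScorer.pm when the working directory happens to contain that folder. One possible workaround, sketched below, is to pass an absolute include directory to Perl with its `-I` switch. This is not a confirmed fix: it assumes that CorScorer.pm lives in a `scorer/lib` folder next to scorer_wrapper.pl, and the function name `build_scorer_command` is hypothetical.

```python
import os


def build_scorer_command(wrapper_path, metric, key_file, response_file):
    """Build a perl command line that adds an absolute include directory.

    Assumption: CorScorer.pm sits in <wrapper directory>/scorer/lib.
    """
    wrapper_path = os.path.abspath(wrapper_path)
    lib_dir = os.path.join(os.path.dirname(wrapper_path), "scorer", "lib")
    # perl's -I switch prepends lib_dir to @INC regardless of the current directory.
    return ["perl", "-I" + lib_dir, wrapper_path, metric, key_file, response_file]
```

The resulting list can be passed to `subprocess.check_output` in place of the command shown in the traceback.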

    opened by sandeep16064 0
  • GPU support - cuda 11.1 - TypeError: Unsupported type <class 'numpy.ndarray'>

    Example:

    import spacy
    
    spacy.require_gpu()
    >> True
    
    nlp = spacy.load("en_core_web_sm")
    doc = nlp("this is my example text")
    print(doc)
    >> this is my example text
    
    import neuralcoref
    neuralcoref.add_to_pipe(nlp)
    >> <spacy.lang.en.English object at 0x7f53d9da6d60>
    
    doc =  nlp("this is my example text")
    >> Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/brj/.local/share/virtualenvs/spacy-ozIRu_0L/lib/python3.8/site-packages/spacy/language.py", line 445, in __call__
        doc = proc(doc, **component_cfg.get(name, {}))
      File "neuralcoref.pyx", line 593, in neuralcoref.neuralcoref.NeuralCoref.__call__
      File "neuralcoref.pyx", line 720, in neuralcoref.neuralcoref.NeuralCoref.predict
      File "neuralcoref.pyx", line 908, in neuralcoref.neuralcoref.NeuralCoref.get_mention_embeddings
      File "neuralcoref.pyx", line 899, in neuralcoref.neuralcoref.NeuralCoref.get_average_embedding
      File "cupy/_core/core.pyx", line 1591, in cupy._core.core.ndarray.__array_ufunc__
      File "cupy/_core/_kernel.pyx", line 1218, in cupy._core._kernel.ufunc.__call__
      File "cupy/_core/_kernel.pyx", line 138, in cupy._core._kernel._preprocess_args
      File "cupy/_core/_kernel.pyx", line 124, in cupy._core._kernel._preprocess_arg
    TypeError: Unsupported type <class 'numpy.ndarray'>
    
    # printing versions
    import cupy
    spacy.__version__
    >> 2.3.7
    neuralcoref.__version__
    >> 4.1.0
    cupy.__version__
    >> 10.4.0
    

    Everything works fine if I run this without spacy.require_gpu().

    opened by bryanjohns 0
Releases(v4.0.0)
Owner
Hugging Face
Solving NLP, one commit at a time!