A high-level Python library for Quantum Natural Language Processing


lambeq


About

lambeq is a toolkit for quantum natural language processing (QNLP).

Documentation: https://cqcl.github.io/lambeq/

Getting started

Prerequisites

  • Python 3.7+

Installation

Direct pip install

The base lambeq can be installed with the command:

pip install lambeq

This does not include optional dependencies such as depccg and PyTorch, which have to be installed separately. In particular, depccg is required for lambeq.ccg2discocat.DepCCGParser.

To install lambeq with depccg, run instead:

pip install cython numpy
pip install lambeq[depccg]
depccg_en download

See below for further options.
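
Because depccg and PyTorch are optional, a script may want to check whether they are importable before relying on them. The following is a minimal sketch using only the standard library; the helper name `has_module` is hypothetical and not part of lambeq.

```python
from importlib.util import find_spec


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    try:
        return find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for malformed or half-initialised modules.
        return False


# Report which optional dependencies named in the text are present.
for optional in ("depccg", "torch"):
    status = "available" if has_module(optional) else "missing"
    print(f"{optional}: {status}")
```

A check like this only confirms that the package can be found; it does not confirm that a pretrained depccg model has been downloaded.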

Automatic installation (recommended)

This runs an interactive installer to help pick the installation destination and configuration.

  1. Run:
    bash <(curl 'https://cqcl.github.io/lambeq/install.sh')

Git installation

This requires Git to be installed.

  1. Download this repository:

    git clone https://github.com/CQCL/lambeq
  2. Enter the repository:

    cd lambeq
  3. Make sure pip is up-to-date:

    pip install --upgrade pip wheel
  4. (Optional) If installing the optional dependency depccg, the following packages must be installed before installing depccg:

    pip install cython numpy

    Further information can be found on the depccg homepage.

  5. Install lambeq from the local repository using pip:

    pip install --use-feature=in-tree-build .

    To include depccg, run instead:

    pip install --use-feature=in-tree-build .[depccg]

    To include all optional dependencies, run instead:

    pip install --use-feature=in-tree-build .[all]
  6. If using a pretrained depccg parser, download a pretrained model:

    depccg_en download

Usage

The docs/examples directory contains notebooks demonstrating usage of the various tools in lambeq.

Example - parsing a sentence into a diagram (see docs/examples/ccg2discocat.ipynb):

from lambeq.ccg2discocat import DepCCGParser

depccg_parser = DepCCGParser()
diagram = depccg_parser.sentence2diagram('This is a test sentence')
diagram.draw()

Note: all pre-trained depccg models apart from the basic one are broken, and depccg has not yet been updated to fix this. Therefore, it is recommended to just use the basic parser, as shown here.

Testing

Run all tests with the command:

pytest

Note: if you have installed in a virtual environment, remember to install pytest in the same environment using pip.

Building Documentation

To build the documentation, first install the required dependencies:

pip install -r docs/requirements.txt

Then run the commands:

cd docs
make clean
make html

The docs will be under docs/_build.

To rebuild the rst files themselves, run:

sphinx-apidoc --force -o docs lambeq

License

Distributed under the Apache 2.0 license. See LICENSE for more details.

Citation

If you wish to attribute our work, please cite the accompanying paper:

@article{kartsaklis2021lambeq,
   title={lambeq: {A}n {E}fficient {H}igh-{L}evel {P}ython {L}ibrary for {Q}uantum {NLP}},
   author={Dimitri Kartsaklis and Ian Fan and Richie Yeung and Anna Pearson and Robin Lorenz and Alexis Toumi and Giovanni de Felice and Konstantinos Meichanetzidis and Stephen Clark and Bob Coecke},
   year={2021},
   journal={arXiv preprint arXiv:2110.04236},
}
Comments
  • No module named BobcatParser


    I'm trying to run the following code from your tutorials website, but I am unable to install BobcatParser. I am working in a Colab environment. Is there a dependency that I may be missing?

    from lambeq import BobcatParser
    
    parser = BobcatParser(root_cats=('NP', 'N'), verbose='text')
    
    raw_train_diagrams = parser.sentences2diagrams(train_data, suppress_exceptions=True)
    raw_val_diagrams = parser.sentences2diagrams(val_data, suppress_exceptions=True) 
    
    opened by alt-shreya 14
  • error during installation


    When I try installing using sh <(curl 'https://cqcl.github.io/lambeq/install.sh'), I get the following error:

    ERROR: Cannot install lambeq[depccg]==0.1.0, lambeq[depccg]==0.1.1 and lambeq[depccg]==0.1.2 because these package versions have conflicting dependencies.
    
    The conflict is caused by:
        lambeq[depccg] 0.1.2 depends on depccg==1.1.0; extra == "depccg"
        lambeq[depccg] 0.1.1 depends on depccg==1.1.0; extra == "depccg"
        lambeq[depccg] 0.1.0 depends on depccg==1.1.0; extra == "depccg"
    
    To fix this you could try to:
    1. loosen the range of package versions you've specified
    2. remove package versions to allow pip attempt to solve the dependency conflict
    
    ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies
    
    

    I am on a 2020 MacBook Air (with Apple M1 chip), using conda with python=3.8.11. Could any of that be causing the problem?

    opened by mithunpaul08 13
  • Problem with trainer.fit(), operands of different shape


    Hi, I am trying to run the quantum trainer algorithm. When running the following line:

    trainer.fit(train_dataset, val_dataset, evaluation_step=1, logging_step=100)

    I get the following error:

    ValueError                          Traceback (most recent call last)
    Input In [17], in <cell line: 1>()
    ----> 1 trainer.fit(train_dataset, val_dataset, evaluation_step=1, logging_step=100)
    
    File c:\python38\lib\site-packages\lambeq\training\trainer.py:365, in Trainer.fit(self, train_dataset, val_dataset, evaluation_step, logging_step)
        363 step += 1
        364 x, y_label = batch
    --> 365 y_hat, loss = self.training_step(batch)
        366 if (self.evaluate_on_train and
        367         self.evaluate_functions is not None):
        368     for metr, func in self.evaluate_functions.items():
    
    File c:\python38\lib\site-packages\lambeq\training\quantum_trainer.py:149, in QuantumTrainer.training_step(self, batch)
        133 def training_step(
        134         self,
        135         batch: tuple[list[Any], np.ndarray]) -> tuple[np.ndarray, float]:
        136     """Perform a training step.
        137 
        138     Parameters
       (...)
        147 
        148     """
    --> 149     y_hat, loss = self.optimizer.backward(batch)
        150     self.train_costs.append(loss)
        151     self.optimizer.step()
    
    File c:\python38\lib\site-packages\lambeq\training\spsa_optimizer.py:126, in SPSAOptimizer.backward(self, batch)
        124 self.model.weights = xplus
        125 y0 = self.model(diagrams)
    --> 126 loss0 = self.loss_fn(y0, targets)
        127 xminus = self.project(x - self.ck * delta)
        128 self.model.weights = xminus
    
    Input In [13], in <lambda>(y_hat, y)
    ----> 1 loss = lambda y_hat, y: -np.sum(y * np.log(y_hat)) / len(y)  # binary cross-entropy loss
          3 acc = lambda y_hat, y: np.sum(np.round(y_hat) == y) / len(y) / 2  # half due to double-counting
          4 eval_metrics = {"acc": acc}
    
    ValueError: operands could not be broadcast together with shapes (30,2) (30,)
    

    I have just fixed the .py file in the lib following #12. The algorithm raised an error even before that. I can't recall exactly, but I don't think it was the same error.

    What can I do to solve this? Thank you for your time.

    opened by Stephenito 10
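
The shapes in the traceback, (30,2) for the model output and (30,) for the targets, suggest that the labels were passed as flat class indices while the loss expects one row per sample with one column per class. The following is a minimal NumPy sketch of that mismatch and one way around it; the helper `to_one_hot` and the random stand-in data are assumptions for illustration, not part of lambeq.

```python
import numpy as np


def to_one_hot(labels, n_classes=2):
    """Turn integer labels of shape (n,) into one-hot rows of shape (n, n_classes)."""
    return np.eye(n_classes)[np.asarray(labels, dtype=int)]


def cross_entropy(y_hat, y):
    """Same formula as the lambda in the traceback; both inputs need shape (n, n_classes)."""
    return -np.sum(y * np.log(y_hat)) / len(y)


rng = np.random.default_rng(0)
y_hat = rng.dirichlet([1.0, 1.0], size=30)  # stand-in model output, shape (30, 2)
labels = rng.integers(0, 2, size=30)        # flat labels, shape (30,)

try:
    cross_entropy(y_hat, labels)            # (30,) against (30, 2) cannot broadcast
except ValueError as err:
    print("flat labels fail:", err)

y = to_one_hot(labels)                      # shape (30, 2)
print("loss with one-hot labels:", cross_entropy(y_hat, y))
```

Whether one-hot targets are the right fix depends on how the dataset was built; the sketch only shows why the two shapes cannot be multiplied element-wise.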
  • Error when running Parser


    Below is the code I ran for testing the parser:

    from lambeq import BobcatParser

    parser = BobcatParser()
    diagram = parser.sentence2diagram('This is a test sentence')
    diagram.draw()

    and this is the error I received when running it:

    2022-05-22 20:19:08.041411: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
    2022-05-22 20:19:08.042271: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
    Traceback (most recent call last):
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1342, in do_open
        h.request(req.get_method(), req.selector, req.data, headers,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1255, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1301, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1250, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1010, in _send_output
        self.send(msg)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 950, in send
        self.connect()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1424, in connect
        self.sock = self._context.wrap_socket(self.sock,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 500, in wrap_socket
        return self.sslsocket_class._create(
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1040, in _create
        self.do_handshake()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1309, in do_handshake
        self._sslobj.do_handshake()
    ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1123)

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "C:\Users\elmm\Desktop\CQM\QNLP Depression 2.py", line 3, in <module>
        parser = BobcatParser()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\bobcat_parser.py", line 258, in __init__
        download_model(model_name_or_path, model_dir, verbose)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\bobcat_parser.py", line 130, in download_model
        model_file, headers = urlretrieve(url, reporthook=progress_bar.update)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 239, in urlretrieve
        with contextlib.closing(urlopen(url, data)) as fp:
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 214, in urlopen
        return opener.open(url, data, timeout)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 517, in open
        response = self._open(req, data)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 534, in _open
        result = self._call_chain(self.handle_open, protocol, protocol +
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 494, in _call_chain
        result = func(*args)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1385, in https_open
        return self.do_open(http.client.HTTPSConnection, req,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1345, in do_open
        raise URLError(err)
    urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1123)>

    How can I resolve this?

    opened by ACE07-Sev 9
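
One way to narrow down a "certificate has expired" failure is to repeat the download outside lambeq with an explicitly built, still-verifying TLS context, optionally pointing it at a newer CA bundle. The following is a hedged sketch using only the standard library; `make_context` and `fetch` are hypothetical helper names, and the sketch does not change how lambeq itself downloads models.

```python
import ssl
from urllib.request import urlopen


def make_context(cafile=None):
    """Build a verifying TLS context; cafile=None uses the system's default CA store."""
    return ssl.create_default_context(cafile=cafile)


def fetch(url, cafile=None, timeout=10):
    """Download url with certificate verification still enabled."""
    with urlopen(url, context=make_context(cafile), timeout=timeout) as response:
        return response.read()


context = make_context()
# Verification stays on; only the set of trusted certificates can be swapped.
print("verifies certificates:", context.verify_mode == ssl.CERT_REQUIRED)
```

If `fetch` fails with the same message even with an up-to-date CA bundle, the expired certificate is likely on the server side, which a client cannot repair.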
  • Question : What does the Quantum_trainer output?


    Hi, I wish to use the Quantum_trainer for depression detection with a chatbot (the sentences would be the input to the QNLP module), and then classify whether the person has depression or not.

    May I ask what the input and the output are in the sample trainer for the quantum module? Does it perform binary classification, or is that something I need to add as an additional layer?

    opened by ACE07-Sev 8
  • An Error while running Classical Pipeline Example given in docs/examples


    Hi @dimkart I hope you are doing well

    I am trying to run the code given here on my Google Colab account - https://github.com/CQCL/lambeq/blob/main/docs/examples/classical_pipeline.ipynb

    I am installing lambeq directly on Colab and it is picking up the latest version of DisCoPy

    But I am continuously getting an error like this. I have pasted the full stack trace here -

    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    <ipython-input-11-84634b74856a> in <module>()
         39 dev_cost_fn, dev_costs, dev_accs = make_cost_fn(dev_pred_fn, dev_labels)
         40 
    ---> 41 result = train(train_cost_fn, x0, niter=20, callback=dev_cost_fn, optimizer_fn=torch.optim.AdamW, lr=0.1)
    
    10 frames
    <ipython-input-11-84634b74856a> in train(func, x0, niter, callback, optimizer_fn, lr)
          3     optimizer = optimizer_fn(x, lr=lr)
          4     for _ in range(niter):
    ----> 5         loss = func(x)
          6 
          7         optimizer.zero_grad()
    
    <ipython-input-11-84634b74856a> in cost_fn(params, **kwargs)
         16 def make_cost_fn(pred_fn, labels):
         17     def cost_fn(params, **kwargs):
    ---> 18         predictions = pred_fn(params)
         19 
         20         logits = predictions[:, 1] - predictions[:, 0]
    
    <ipython-input-10-dbb8534e3157> in predict(params)
          1 def make_pred_fn(circuits):
          2     def predict(params):
    ----> 3         return torch.stack([c.lambdify(*parameters)(*params).eval(contractor=tn.contractors.auto).array for c in circuits])
          4     return predict
          5 
    
    <ipython-input-10-dbb8534e3157> in <listcomp>(.0)
          1 def make_pred_fn(circuits):
          2     def predict(params):
    ----> 3         return torch.stack([c.lambdify(*parameters)(*params).eval(contractor=tn.contractors.auto).array for c in circuits])
          4     return predict
          5 
    
    /usr/local/lib/python3.7/dist-packages/discopy/tensor.py in eval(self, contractor)
        448         if contractor is None:
        449             return Functor(ob=lambda x: x, ar=lambda f: f.array)(self)
    --> 450         array = contractor(*self.to_tn()).tensor
        451         return Tensor(self.dom, self.cod, array)
        452 
    
    /usr/local/lib/python3.7/dist-packages/tensornetwork/contractors/opt_einsum_paths/path_contractors.py in auto(nodes, output_edge_order, memory_limit, ignore_edge_order)
        262         output_edge_order=output_edge_order,
        263         nbranch=1,
    --> 264         ignore_edge_order=ignore_edge_order)
        265   return greedy(nodes, output_edge_order, memory_limit, ignore_edge_order)
        266 
    
    /usr/local/lib/python3.7/dist-packages/tensornetwork/contractors/opt_einsum_paths/path_contractors.py in branch(nodes, output_edge_order, memory_limit, nbranch, ignore_edge_order)
        160   alg = functools.partial(
        161       opt_einsum.paths.branch, memory_limit=memory_limit, nbranch=nbranch)
    --> 162   return base(nodes, alg, output_edge_order, ignore_edge_order)
        163 
        164 
    
    /usr/local/lib/python3.7/dist-packages/tensornetwork/contractors/opt_einsum_paths/path_contractors.py in base(nodes, algorithm, output_edge_order, ignore_edge_order)
         86   path, nodes = utils.get_path(nodes_set, algorithm)
         87   for a, b in path:
    ---> 88     new_node = contract_between(nodes[a], nodes[b], allow_outer_product=True)
         89     nodes.append(new_node)
         90     nodes = utils.multi_remove(nodes, [a, b])
    
    /usr/local/lib/python3.7/dist-packages/tensornetwork/network_components.py in contract_between(node1, node2, name, allow_outer_product, output_edge_order, axis_names)
       2083     axes1 = [axes1[i] for i in ind_sort]
       2084     axes2 = [axes2[i] for i in ind_sort]
    -> 2085     new_tensor = backend.tensordot(node1.tensor, node2.tensor, [axes1, axes2])
       2086     new_node = Node(tensor=new_tensor, name=name, backend=backend)
       2087     # node1 and node2 get new edges in _remove_edges
    
    /usr/local/lib/python3.7/dist-packages/tensornetwork/backends/pytorch/pytorch_backend.py in tensordot(self, a, b, axes)
         44   def tensordot(self, a: Tensor, b: Tensor,
         45                 axes: Union[int, Sequence[Sequence[int]]]) -> Tensor:
    ---> 46     return torchlib.tensordot(a, b, dims=axes)
         47 
         48   def reshape(self, tensor: Tensor, shape: Tensor) -> Tensor:
    
    /usr/local/lib/python3.7/dist-packages/torch/functional.py in tensordot(a, b, dims, out)
       1032 
       1033     if out is None:
    -> 1034         return _VF.tensordot(a, b, dims_a, dims_b)  # type: ignore[attr-defined]
       1035     else:
       1036         return _VF.tensordot(a, b, dims_a, dims_b, out=out)  # type: ignore[attr-defined]
    
    RuntimeError: expected scalar type Float but found Double
    

    I was able to successfully carry out experiments using the Quantum Pipeline code on Google Colab and did not face any issues, but for this one I am getting an error. I have tried to fix the issue by converting variables or some function outputs to float(), but I was unable to rectify it.

    Can you please help me fix this issue?

    Thank you so much!

    opened by srinjoyganguly 8
  • Add Japanese support to DepCCGParser


    Updated DepCCGParser to support Japanese. The sample code is as follows.

    1. Prepare depccg.

    pip install cython numpy depccg
    depccg_en download
    depccg_ja download
    

    2. Install Japanese fonts on Ubuntu.

    apt install -y fonts-migmix
    rm ~/.cache/matplotlib/fontlist-v330.json
    

    3. Set the matplotlib Japanese font in the jupyter notebook python code.

    import matplotlib
    from matplotlib.font_manager import FontProperties
    
    font_path = "/usr/share/fonts/truetype/migmix/migmix-1p-regular.ttf"
    font_prop = FontProperties(fname=font_path)
    matplotlib.rcParams["font.family"] = font_prop.get_name()
    

    4. Use sentence2diagram in the jupyter notebook python code.

    from lambeq import DepCCGParser
    from discopy import grammar
    
    parser = DepCCGParser(lang='ja')
    diagram = parser.sentence2diagram('これはテストの文です。')
    grammar.draw(diagram, figsize=(14,3), fontsize=12)
    

    5. Use ansatz in the jupyter notebook python code.

    from lambeq import AtomicType, IQPAnsatz
    
    # Define atomic types
    N = AtomicType.NOUN
    S = AtomicType.SENTENCE
    
    # Convert string diagram to quantum circuit
    ansatz = IQPAnsatz({N: 1, S: 1}, n_layers=2)
    discopy_circuit = ansatz(diagram)
    discopy_circuit.draw(figsize=(15,10))
    

    6. Use pytket in the jupyter notebook python code.

    from pytket.circuit.display import render_circuit_jupyter
    
    tket_circuit = discopy_circuit.to_tk()
    render_circuit_jupyter(tket_circuit)
    
    opened by KentaroAOKI 6
  • <unk> token feature in the forward() function


    Considering the necessity of the <unk> token for never-before-seen entities, how can I implement the <unk> token in the forward function so that the model can calculate probabilities for instances that contain unknown symbols? Based on my understanding and guidance from one of the moderators, I think it is supposed to go in the forward function.

    Could you kindly assist me in implementing this?

    opened by ACE07-Sev 6
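
The general idea behind an <unk> token can be sketched independently of lambeq: every symbol that was not seen during training is redirected to one shared entry, so a lookup never fails. The names `build_vocab` and `lookup` below are hypothetical, and how such a mapping would be wired into a lambeq model's forward function is not shown here.

```python
UNK = "<unk>"


def build_vocab(train_symbols):
    """Map each symbol seen in training to an index; index 0 is reserved for <unk>."""
    vocab = {UNK: 0}
    for symbol in train_symbols:
        vocab.setdefault(symbol, len(vocab))
    return vocab


def lookup(vocab, symbol):
    """Return the index of a symbol, falling back to the shared <unk> index."""
    return vocab.get(symbol, vocab[UNK])


vocab = build_vocab(["alice", "runs", "program"])
weights = [0.0, 0.3, 0.7, 0.1]  # one illustrative weight per vocabulary entry

for word in ("runs", "bob"):
    index = lookup(vocab, word)
    print(word, "->", index, "weight", weights[index])
```

The weight stored at the <unk> index would normally be trained as well, for example by replacing rare training symbols with <unk>.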
  • inference


    I'm trying to run the Quantum pipeline using the JAX backend. To show the results more clearly (how each sentence was classified into the two categories), is there example code available for running inference, as in classic NLP deep learning approaches (e.g. transformer-based approaches or similar)?

    stale 
    opened by nlpirate 6
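
Assuming the model returns one row of two probabilities per sentence (as the [[0.14473685 0.85526315]] output quoted in another thread suggests), turning those rows into readable labels is a plain argmax. The following is a sketch with made-up numbers; `predict_labels` and the class names are hypothetical.

```python
def predict_labels(probabilities, class_names=("class 0", "class 1")):
    """Pick, for each row of class probabilities, the name of the most likely class."""
    labels = []
    for row in probabilities:
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(class_names[best])
    return labels


sentences = ["first example sentence", "second example sentence"]
outputs = [[0.14, 0.86], [0.91, 0.09]]  # stand-in for model outputs

for sentence, label in zip(sentences, predict_labels(outputs)):
    print(f"{sentence!r} -> {label}")
```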
  • WebParser error


    Hi, I'm trying to run the Web Parser but, very similar to the issue with the Bobcat Parser, it gives this error:

    Traceback (most recent call last):
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1342, in do_open
        h.request(req.get_method(), req.selector, req.data, headers,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1255, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1301, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1250, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1010, in _send_output
        self.send(msg)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 950, in send
        self.connect()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1424, in connect
        self.sock = self._context.wrap_socket(self.sock,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 500, in wrap_socket
        return self.sslsocket_class._create(
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1040, in _create
        self.do_handshake()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1309, in do_handshake
        self._sslobj.do_handshake()
    ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1123)

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "C:\Users\elmm\Desktop\CQM\QNLP_test.py", line 6, in <module>
        new_diagram = parser.sentence2diagram('he was overtaken by the depression.')
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\ccg_parser.py", line 227, in sentence2diagram
        return self.sentences2diagrams(
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\ccg_parser.py", line 157, in sentences2diagrams
        trees = self.sentences2trees(sentences,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\web_parser.py", line 159, in sentences2trees
        raise e
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\web_parser.py", line 148, in sentences2trees
        with urlopen(url) as f:
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 214, in urlopen
        return opener.open(url, data, timeout)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 517, in open
        response = self._open(req, data)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 534, in _open
        result = self._call_chain(self.handle_open, protocol, protocol +
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 494, in _call_chain
        result = func(*args)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1385, in https_open
        return self.do_open(http.client.HTTPSConnection, req,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1345, in do_open
        raise URLError(err)
    urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1123)>

    Process finished with exit code 1

    I really need to get this working because Bobcat fails in parsing the sentences into diagrams. Can someone please help me with this?

    opened by ACE07-Sev 4
  • Numpy int32 error


    Hi, I ran the Quantum_trainer and have been getting the error below, and I can't seem to find where the issue is.

    Traceback (most recent call last):
      File "C:\Users\elmm\Desktop\CQM\QNLP Depression.py", line 103, in <module>
        trainer.fit(train_dataset, val_dataset, logging_step=12)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\trainer.py", line 365, in fit
        y_hat, loss = self.training_step(batch)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\quantum_trainer.py", line 149, in training_step
        y_hat, loss = self.optimizer.backward(batch)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\spsa_optimizer.py", line 125, in backward
        y0 = self.model(diagrams)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\model.py", line 59, in __call__
        return self.forward(*args, **kwds)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\tket_model.py", line 131, in forward
        return self.get_diagram_output(x)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\tket_model.py", line 103, in get_diagram_output
        seed=self._randint()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\tket_model.py", line 71, in _randint
        return np.random.randint(low, high)
      File "mtrand.pyx", line 746, in numpy.random.mtrand.RandomState.randint
      File "_bounded_integers.pyx", line 1334, in numpy.random._bounded_integers._rand_int32
    ValueError: low is out of bounds for int32

    opened by ACE07-Sev 4
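
The last frames show np.random.randint(low, high) being routed to a 32-bit integer sampler, which rejects bounds outside the int32 range; this can happen on platforms where NumPy's default integer type is 32 bits wide. Requesting a 64-bit dtype explicitly avoids that. The sketch below assumes 64-bit bounds for illustration (the actual defaults used inside lambeq's _randint are not visible in the traceback), and `randint64` is a hypothetical helper name.

```python
import numpy as np


def randint64(low=-(1 << 63), high=(1 << 63) - 1):
    """Draw a random integer in [low, high) using an explicit 64-bit dtype."""
    return int(np.random.randint(low, high, dtype=np.int64))


seed = randint64()
print("random 64-bit seed:", seed)
print("default NumPy integer type here:", np.dtype(int))
```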
  • Error with Bobcat Parser


    Hi,

    I am getting an error that is relatively difficult to understand. I am trying to parse the sentence "Senate Advances Bill To Approve Keystone Pipeline Despite Obamas Veto Threat", and can't figure out why it is giving me the error below. Does BobcatParser throw an error if it doesn't recognize a word? I am unclear what I would need to do to fix the sentence.

    from lambeq import BobcatParser

    parser = BobcatParser()
    diagram = parser.sentence2diagram('Senate Advances Bill To Approve Keystone Pipeline Despite Obamas Veto Threat')
    diagram.draw()

    Traceback (most recent call last):
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/lambeq/text2diagram/bobcat_parser.py", line 382, in sentences2trees
        trees.append(self._build_ccgtree(result[0]))
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/lambeq/bobcat/parser.py", line 258, in __getitem__
        return self.root[index]
    IndexError: list index out of range

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/spyder_kernels/py3compat.py", line 356, in compat_exec
        exec(code, globals, locals)
      File "/Users/dabeaulieu/Documents/Initiatives/quantum/machine learning/notebooks/qnlp/ankush/Quantum_NLP/testcode.py", line 12, in <module>
        diagram = parser.sentence2diagram('Senate Advances Bill To Approve Keystone Pipeline Despite Obamas Veto Threat')
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/lambeq/text2diagram/ccg_parser.py", line 231, in sentence2diagram
        return self.sentences2diagrams(
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/lambeq/text2diagram/ccg_parser.py", line 161, in sentences2diagrams
        trees = self.sentences2trees(sentences,
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/lambeq/text2diagram/bobcat_parser.py", line 387, in sentences2trees
        raise BobcatParseError(' '.join(sent.words))
    BobcatParseError: Bobcat failed to parse 'Senate Advances Bill To Approve Keystone Pipeline Despite Obamas Veto Threat'.

    opened by dancbeaulieu 1
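
When many sentences are parsed at once with suppress_exceptions=True (as in the first thread above), a sentence that fails to parse yields None instead of raising, so the failed entries have to be removed together with their labels. The sketch below shows that bookkeeping on stand-in data; `drop_failed` is a hypothetical helper and the strings stand in for real diagrams.

```python
def drop_failed(diagrams, labels):
    """Remove entries whose diagram is None, keeping diagrams and labels aligned."""
    kept_diagrams, kept_labels, failed_indices = [], [], []
    for index, (diagram, label) in enumerate(zip(diagrams, labels)):
        if diagram is None:
            failed_indices.append(index)
        else:
            kept_diagrams.append(diagram)
            kept_labels.append(label)
    return kept_diagrams, kept_labels, failed_indices


# Stand-in for the output of parser.sentences2diagrams(..., suppress_exceptions=True)
raw_diagrams = ["diagram A", None, "diagram C"]
raw_labels = [[1, 0], [0, 1], [1, 0]]

diagrams, labels, failed = drop_failed(raw_diagrams, raw_labels)
print("kept", len(diagrams), "diagrams; failed sentence indices:", failed)
```

Keeping the failed indices makes it easy to inspect which sentences the parser could not handle, for instance headline-style text without punctuation or apostrophes.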
  • Error from_tk


    Discussed in https://github.com/CQCL/lambeq/discussions/49

    Originally posted by JVM1982 October 10, 2022

    Hello.

    Why does the following code not work?

    sentence = 'person runs program .'
    diagram = remove_cups( parser.sentence2diagram( sentence ) )
    circuit = ansatz( diagram )
    print( model( [ circuit ] ) ) # OK #
    print( model( [ from_tk( circuit.to_tk() ) ] ) ) # ERROR #

    [[0.14473685 0.85526315]]
    Unexpected exception formatting exception. Falling back to standard exception

    Traceback (most recent call last):
      File "/home/javier.valera/.local/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3378, in run_code exec(code_obj, self.user_global_ns, self.user_ns)
      File "/tmp/ipykernel_16444/2916265434.py", line 9, in print( model( [ from_tk( circuit.to_tk() ) ] ) )
      File "/home/javier.valera/.local/lib/python3.8/site-packages/lambeq/training/model.py", line 59, in call return self.forward(*args, **kwds)
      File "/home/javier.valera/.local/lib/python3.8/site-packages/lambeq/training/tket_model.py", line 131, in forward return self.get_diagram_output(x)
      File "/home/javier.valera/.local/lib/python3.8/site-packages/lambeq/training/tket_model.py", line 101, in get_diagram_output *[diag_f(*self.weights) for diag_f in lambdified_diagrams],
      File "/home/javier.valera/.local/lib/python3.8/site-packages/lambeq/training/tket_model.py", line 101, in *[diag_f(*self.weights) for diag_f in lambdified_diagrams],
      File "/home/javier.valera/.local/lib/python3.8/site-packages/discopy/monoidal.py", line 509, in return lambda xs: self.id(self.dom).then((
      File "/home/javier.valera/.local/lib/python3.8/site-packages/discopy/monoidal.py", line 510, in self.id(left) @ box.lambdify(*symbols, **kwargs)(*xs)
      File "/home/javier.valera/.local/lib/python3.8/site-packages/discopy/quantum/gates.py", line 321, in return lambda *xs: type(self)(c_fn(*xs), distance=self.distance)
      File "/home/javier.valera/.local/lib/python3.8/site-packages/discopy/quantum/gates.py", line 448, in return lambda *xs: type(self)(data(*xs))
      File "", line 2, in _lambdifygenerated return [email protected]@n.l_0
    NameError: name 'runs__n' is not defined

    Thanks.

    opened by Thommy257 0
  • cfg: other languages compatibility


    I am interested in some verticalization of lambeq (and consequently discocat) to languages other than English, particularly Italian. As far as I read from the documentation from a linguistic point of view, at the base of the framework, there are cfg grammars. I know this theoretical formalism very well. How is it possible to visualize and extract the structures and formalisms of these grammars from the library so that they can be extended/modified?

    opened by nlpirate 0
Releases (0.2.7)
  • 0.2.7 (Oct 11, 2022)

    Added:

    • Added support for Japanese to DepCCGParser. (credit: KentaroAOKI https://github.com/CQCL/lambeq/pull/24)
    • Overhauled the CircuitAnsatz interface, and added three new ansätze.
    • Added helper methods to CCGTree to get the children of a tree. Added a new .tree2diagram method to TreeReader, extracted from TreeReader.sentence2diagram.
    • Added a new TreeReaderMode named HEIGHT.
    • Added new methods to Checkpoint for creating, saving and loading checkpoints for training.
    • Documentation: added a section for how to select the right model and trainer for training.
    • Documentation: added links to glossary terms throughout the documentation.
    • Documentation: added UML class diagrams for the sub-packages in lambeq.

    Changed:

    • Dependencies: bumped the minimum versions of discopy and torch.
    • IQPAnsatz now post-selects in the Hadamard basis.
    • PytorchModel now initialises using xavier_uniform.
    • CCGTree.to_json can now be applied to None, returning None.
    • Several slow imports have been deferred, making lambeq much faster to import for the first time.
    • In CCGRule.infer_rule, direction checks have been made explicit.
    • UnarySwap is now specified to be a unaryBoxConstructor.
    • BobcatParser has been refactored for easier use with external evaluation tools.
    • Documentation: headings have been organised in the tutorials into subsections.
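As background for the xavier_uniform item above: Xavier (Glorot) uniform initialisation draws each weight from U(-a, a), where a = gain * sqrt(6 / (fan_in + fan_out)). The following is a minimal standard-library sketch of that formula only. The function names and the example shape are hypothetical, and this is not the code used by lambeq or PyTorch.

```python
import math
import random


def xavier_uniform_bound(fan_in: int, fan_out: int, gain: float = 1.0) -> float:
    """Return the bound a such that weights are drawn from U(-a, a)."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


def xavier_uniform(fan_in: int, fan_out: int, gain: float = 1.0, seed=None):
    """Build a fan_out x fan_in weight matrix (list of rows) with Xavier uniform values."""
    rng = random.Random(seed)
    a = xavier_uniform_bound(fan_in, fan_out, gain)
    return [[rng.uniform(-a, a) for _ in range(fan_in)] for _ in range(fan_out)]


# Hypothetical layer with 4 inputs and 2 outputs: the bound is sqrt(6 / 6) = 1.
bound = xavier_uniform_bound(4, 2)
weights = xavier_uniform(4, 2, seed=0)
```

The bound shrinks as the layer gets wider, which keeps the variance of activations roughly constant across layers.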

    Fixed:

    • Fixed how CCGRule.infer_rule assigns a punc + X instance: if the result is X\X the assigned rule is CONJUNCTION, otherwise the rule is REMOVE_PUNCTUATION_LEFT (similarly for punctuation on the right).
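The decision described in the fix above can be sketched with plain strings. This is an illustration under simplifying assumptions, not lambeq's implementation: categories are written as atomic strings, `X\X` is built by simple concatenation, and the function name `infer_left_punctuation_rule` is hypothetical.

```python
def infer_left_punctuation_rule(x: str, result: str) -> str:
    """Pick a rule name for `punc + X -> result` (left punctuation case only).

    Assumption: `x` is an atomic category string, so X\\X can be written
    without parentheses.
    """
    if result == f"{x}\\{x}":
        # The punctuation acts like a conjunction, producing a modifier of X.
        return "CONJUNCTION"
    # Otherwise the punctuation is simply dropped from the left.
    return "REMOVE_PUNCTUATION_LEFT"


rule_conj = infer_left_punctuation_rule("NP", "NP\\NP")
rule_drop = infer_left_punctuation_rule("NP", "NP")
```

Here `rule_conj` is "CONJUNCTION" and `rule_drop` is "REMOVE_PUNCTUATION_LEFT". The mirror-image case for punctuation on the right would follow the same pattern.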

    Removed:

    • Removed unnecessary override of .from_diagrams in NumpyModel.
    • Removed unnecessary kwargs parameters from several constructors.
    • Removed unused special_cases parameter and _ob method from CircuitAnsatz.
  • 0.2.6(Aug 11, 2022)

    Added:

    • A strict pregroups mode to the CLI. With this mode enabled, all swaps are removed from the output string diagrams by changing the ordering of the atomic types, converting them into a valid pregroup form.

    Fixed:

    • Adjusted the behaviour of output normalisation in quantum models. Now, NumpyModel always returns probabilities instead of amplitudes.
    • Removed the prediction from the output of the SPSAOptimizer, which now returns just the loss.
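To illustrate the difference between amplitudes and probabilities mentioned in the NumpyModel fix above, here is a minimal sketch of the Born rule with normalisation. It is an assumption that lambeq normalises in exactly this way, and the function name `amplitudes_to_probabilities` is hypothetical.

```python
def amplitudes_to_probabilities(amplitudes):
    """Convert complex amplitudes to a normalised probability distribution."""
    # Born rule: each probability is proportional to the squared magnitude.
    weights = [abs(a) ** 2 for a in amplitudes]
    total = sum(weights)
    if total == 0:
        raise ValueError("all amplitudes are zero; cannot normalise")
    return [w / total for w in weights]


# Two unnormalised amplitudes with squared magnitudes 9 and 16.
probabilities = amplitudes_to_probabilities([3 + 0j, 4j])
```

The result sums to 1 regardless of the overall scale or phase of the input amplitudes.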
  • 0.2.5(Jul 26, 2022)

    Added:

    • Added a "swapping" unary rule box to handle unary rules that change the direction of composition, improving the coverage of the BobcatParser.
    • Added a --version flag to the CLI.
    • Added a make_checkpoint method to all training models.

    Changed:

    • Changed the WebParser so that the online service to use is specified by name rather than by URL.
    • Changed the BobcatParser to only allow one tree per category in a cell, doubling parsing speed without affecting the structure of the parse trees (in most cases).
    • Made the linting of the codebase stricter, enforced by the GitHub action. The flake8 configuration can be viewed in the setup.cfg file.
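The BobcatParser change above (one tree per category in a chart cell) can be pictured with a small sketch. The release note does not say which tree is kept, so the rule used here (keep the highest-scoring one) is an assumption, and the names `Cell` and `add_to_cell` are hypothetical rather than part of lambeq.

```python
from typing import Dict, Tuple

# A chart cell maps a category string to its single retained (score, tree) pair.
Cell = Dict[str, Tuple[float, str]]


def add_to_cell(cell: Cell, category: str, score: float, tree: str) -> bool:
    """Insert a tree, keeping at most one (the best-scoring) per category.

    Returns True if the cell was updated.
    """
    current = cell.get(category)
    if current is None or score > current[0]:
        cell[category] = (score, tree)
        return True
    return False


cell: Cell = {}
add_to_cell(cell, "NP", 0.4, "tree_a")
add_to_cell(cell, "NP", 0.9, "tree_b")   # replaces tree_a
add_to_cell(cell, "NP", 0.1, "tree_c")   # discarded
add_to_cell(cell, "S", 0.5, "tree_d")
```

Because each cell holds at most one entry per category, fewer combinations need to be considered when neighbouring cells are merged, which is consistent with the speed-up the note reports.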

    Fixed:

    • Fixed the parameter names in CCGRule, where dom and cod had inadvertently been swapped.
  • 0.2.4(Jul 4, 2022)

    Added:

    • Support for using jax as backend of tensornetwork when setting use_jit=True in the NumpyModel. The interface is not affected by this change, but performance of the model is significantly improved.

    Fixed:

    • Fix a bug that caused the BobcatParser and the WebParser to trigger an SSL certificate error on Windows.
    • Fix false positives in assigning conjunction rule using the CCGBankParser. The rule , + X[conj] -> X[conj] is a case of removing left punctuation, but was being assigned conjunction erroneously.
  • 0.2.3(Jun 8, 2022)

    Added:

    • CCGRule: Add symbol method that returns the ASCII symbol of a given CCG rule.
    • CCGTree: Extend deriv method with CCG output. It is now capable of returning standard CCG diagrams.
    • Command-line interface: CCG mode. When enabled, the output will be a string representation of the CCG diagram corresponding to the CCGTree object produced by the parser, instead of DisCoPy diagram or circuit.
    • Documentation: Add a troubleshooting page.

    Changed:

    • Change the behaviour of spiders_reader such that the spiders decompose logarithmically. This change also affects other rewrite rules that use spiders, such as coordination and relative pronouns.
    • Rename AtomicType.PREPOSITION to AtomicType.PREPOSITIONAL_PHRASE.
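One plausible reading of "decompose logarithmically" in the spiders_reader item above is that a many-legged spider is replaced by a balanced tree of two-input merges, so the number of layers grows like log2 of the number of wires. The sketch below illustrates only that counting argument. It is an assumption about the intended meaning, not lambeq's code, and `pairwise_merge_layers` is a hypothetical name.

```python
def pairwise_merge_layers(wires):
    """Merge wires two at a time, layer by layer, until one remains.

    Returns the list of layers; each layer is the list of wires left after
    that round of merging. A merged pair is represented as a tuple.
    """
    layers = []
    current = list(wires)
    while len(current) > 1:
        merged = []
        # Pair up neighbours; an odd wire at the end is carried over unchanged.
        for i in range(0, len(current) - 1, 2):
            merged.append((current[i], current[i + 1]))
        if len(current) % 2:
            merged.append(current[-1])
        layers.append(merged)
        current = merged
    return layers


layers_8 = pairwise_merge_layers(range(8))        # 3 layers
layers_5 = pairwise_merge_layers(range(5))        # 3 layers
layers_4 = pairwise_merge_layers(["a", "b", "c", "d"])
```

Eight wires need three layers and five wires also need three, matching ceil(log2(n)), whereas merging one wire at a time would need n - 1 sequential steps.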

    Fixed:

    • Fix a bug that raised a dtype error when using the TketModel on Windows.
    • Fix a bug that caused the normalisation of scalar outputs of circuits without open wires using a QuantumModel.
  • 0.2.2(Apr 24, 2022)

    Added:

    • Add support for Python 3.10.
    • Unify class hierarchies for parsers and readers: CCGParser is now a subclass of Reader and placed in the common package text2diagram. The old packages reader and ccg2discocat are no longer available. Compatibility problems with previous versions should be minimal, since from Release 0.2.0 and onwards all lambeq classes can be imported from the global namespace.
    • Add CurryRewriteRule, which uses map-state duality in order to remove adjoint types from the boxes of a diagram. When used in conjunction with discopy.rigid.Diagram.normal_form(), this removes cups from the diagram, eliminating post-selection.
    • The Bobcat parser now updates automatically when new versions are made available online.
    • Allow customising available root categories for the parser when using the command-line interface.

    Fixed:

    • Update grammar file of Bobcat parser to avoid problems with conflicting unary rules.
  • 0.2.1(Apr 7, 2022)

    Added:

    • A new Checkpoint class that implements pickling and file operations from Trainer and Model.
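The general idea of a checkpoint object that owns pickling and file handling, so that trainers and models do not have to, can be sketched as follows. `SimpleCheckpoint` and its `save` and `load` methods are hypothetical names chosen for illustration and are not lambeq's Checkpoint API.

```python
import pickle
import tempfile
from pathlib import Path


class SimpleCheckpoint:
    """Hypothetical stand-in: a dictionary of training state that can be pickled."""

    def __init__(self, **entries):
        self.entries = dict(entries)

    def save(self, path):
        # Serialise the whole dictionary into a single file.
        with open(path, "wb") as handle:
            pickle.dump(self.entries, handle)

    @classmethod
    def load(cls, path):
        # Rebuild a checkpoint from a file written by save().
        with open(path, "rb") as handle:
            return cls(**pickle.load(handle))


# Round trip through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pkl"
    SimpleCheckpoint(epoch=3, weights=[0.1, 0.2]).save(path)
    restored = SimpleCheckpoint.load(path)
```

Keeping serialisation in one class means the file format can change without touching the training code. Pickle files should only be loaded from trusted sources.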

    Changed:

    • Improvements to the training module, allowing multiple diagrams to be accepted as input to the SPSAOptimizer.
    • Updated documentation, including sub-package structures and class diagrams.
  • 0.2.0(Mar 21, 2022)

    Added:

    • A new state-of-the-art CCG parser, fully integrated with lambeq, which replaces depccg as the default parser of the toolkit. The new Bobcat parser has better performance, simplifies installation, and provides compatibility with Windows (which was not supported due to a depccg conflict). depccg is still supported as an alternative external dependency.
    • A training package, providing a selection of trainers, models, and optimizers that greatly simplify supervised training for most of lambeq's use cases, classical and quantum. The new package adds several new features to lambeq, such as the ability to save to and restore models from checkpoints.
    • Furthermore, the training package uses DisCoPy's tensor network capability to contract tensor diagrams efficiently. In particular, DisCoPy 0.4.1's new unitary and density matrix simulators result in substantially faster training speeds compared to the previous version.
    • A command-line interface, which provides most of lambeq's functionality from the command line. For example, lambeq can now be used as a standard command-line pregroup parser.
    • A web parser class that can send parsing queries to an online API, so that local installation of a parser is not strictly necessary anymore. The web parser is particularly helpful for testing purposes, interactive usage or when a local parser is unavailable, but should not be used for serious experiments.
    • A new lambeq.pregroups package that provides methods for easy creation of pregroup diagrams, removal of cups, and printing of diagrams in text form (i.e. in a terminal).
    • A new TreeReader class that exploits the biclosed structure of CCG grammatical derivations.
    • Three new rewrite rules for relative pronouns and coordination.
    • Tokenisation features have been added in all parsers and readers.
    • Additional generator methods and minor improvements for the CCGBankParser class.

    Changed:

    • Improved and more detailed package structure.
    • Most classes and functions can now be imported from lambeq directly, instead of having to import from the sub-packages.
    • The circuit and tensor modules have been combined into a lambeq.ansatz package. (However, as mentioned above, the classes and functions they define can now be imported directly from lambeq, and this should remain possible in future releases.)
    • Improved documentation and additional tutorials.
  • 0.1.2(Oct 12, 2021)

    Minor changes:

    • Add URLs to setup file

    Fixes:

    • Fix logo link in README
    • Fix missing version when building docs in GitHub action
    • Fix typo to description keyword in setup file
  • 0.1.1(Oct 12, 2021)

    Minor changes:

    • Update install script to use PyPI package
    • Add badges and link to documentation to README file
    • Add lambeq logo and link to GitHub to documentation
    • Allow documentation to automatically get package version
    • Add keywords and classifiers to setup file

    Fixes:

    • Add lambeq.circuit submodule to top-level lambeq module
    • Fix references to license file
  • 0.1.0(Oct 12, 2021)

    Initial release of lambeq. This contains a lot of core material:

    • converting sentences to string diagrams
    • CCG parsing, including reading from CCGBank
    • support for depccg parsing
    • rewriting diagrams
    • ansatze for circuits and tensors, including MPS ansatze
    • support for JAX and PyTorch integration
    • example notebooks and documentation
Owner
Cambridge Quantum
Quantum Software and Technologies
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained models

Hugging Face 77.1k Dec 31, 2022
Ecommerce product title recognition package

revizor This package solves task of splitting product title string into components, like type, brand, model and article (or SKU or product code or you

Bureaucratic Labs 16 Mar 03, 2022
Natural Language Processing

NLP Natural Language Processing apps Multilingual_NLP.py start #This script is demonstration of Mul

Ritesh Sharma 1 Oct 31, 2021
Understand Text Summarization and create your own summarizer in python

Automatic summarization is the process of shortening a text document with software, in order to create a summary with the major points of the original document. Technologies that can make a coherent

Sreekanth M 1 Oct 18, 2022
An automated program that helps customers of Pizza Palour place their pizza orders

PIzza_Order_Assistant Introduction An automated program that helps customers of Pizza Palour place their pizza orders. The program uses voice commands

Tindi Sommers 1 Dec 26, 2021
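Basic NLP tasks

NLP algorithms Description This algorithm repository covers five mainstream NLP tasks: text classification, sequence labeling, relation extraction, text matching, and text similarity matching, involving 22 related model algorithms. Framework structure File structure all_models ├── Base_line │   ├── __init__.py │   ├── base_data_process.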

zuxinqi 23 Sep 22, 2022
SDL: Synthetic Document Layout dataset

SDL is the project that synthesizes document images. It facilitates multiple-level labeling on document images and can generate in multiple languages.

Sơn Nguyễn 0 Oct 07, 2021
A python package to fine-tune transformer-based models for named entity recognition (NER).

nerblackbox A python package to fine-tune transformer-based language models for named entity recognition (NER). Resources Source Code: https://github.

Felix Stollenwerk 13 Jul 30, 2022
Help you discover excellent English projects and get rid of disturbing by other spoken language

GitHub English Top Charts 「Help you discover excellent English projects and get

GrowingGit 544 Jan 09, 2023
MMDA - multimodal document analysis

MMDA - multimodal document analysis

AI2 75 Jan 04, 2023
SAINT PyTorch implementation

SAINT-pytorch A Simple pyTorch implementation of "Towards an Appropriate Query, Key, and Value Computation for Knowledge Tracing" based on https://arx

Arshad Shaikh 63 Dec 25, 2022
Tokenizer - Python module for syntactic and grammatical analysis, tokenization

Tokenizer The Tokenizer is a lexical analyzer; like Flex and Yacc, for example, it tokenizes code, that is, transforms code

Manolo 1 Aug 15, 2022
Utility for Google Text-To-Speech batch audio files generator. Ideal for prompt files creation with Google voices for application in offline IVRs

Google Text-To-Speech Batch Prompt File Maker Are you in the need of IVR prompts, but you have no voice actors? Let Google talk your prompts like a pr

Ponchotitlán 1 Aug 19, 2021
Fast, general, and tested differentiable structured prediction in PyTorch

Torch-Struct: Structured Prediction Library A library of tested, GPU implementations of core structured prediction algorithms for deep learning applic

HNLP 1.1k Dec 16, 2022
This is a project built for FALLABOUT2021 event under SRMMIC, This project deals with NLP poetry generation.

FALLABOUT-SRMMIC 21 POETRY-GENERATION HINGLISH DESCRIPTION We have developed a NLP(natural language processing) model which automatically generates a

7 Sep 28, 2021
Big Bird: Transformers for Longer Sequences

BigBird, is a sparse-attention based transformer which extends Transformer based models, such as BERT to much longer sequences. Moreover, BigBird comes along with a theoretical understanding of the c

Google Research 457 Dec 23, 2022
Code for paper "Which Training Methods for GANs do actually Converge? (ICML 2018)"

GAN stability This repository contains the experiments in the supplementary material for the paper Which Training Methods for GANs do actually Converg

Lars Mescheder 884 Nov 11, 2022
Explore different way to mix speech model(wav2vec2, hubert) and nlp model(BART,T5,GPT) together

SpeechMix Explore different way to mix speech model(wav2vec2, hubert) and nlp model(BART,T5,GPT) together. Introduction For the same input: from datas

Eric Lam 31 Nov 07, 2022
This repository serves as a place to document a toy attempt on how to create a generative text model in Catalan, based on GPT-2

GPT-2 Catalan playground and scripts to train a GPT-2 model either from scratch or from another pretrained model.

Laura 1 Jan 28, 2022
ChessCoach is a neural network-based chess engine capable of natural-language commentary.

ChessCoach is a neural network-based chess engine capable of natural-language commentary.

Chris Butner 380 Dec 03, 2022