A high-level Python library for Quantum Natural Language Processing


lambeq

About

lambeq is a toolkit for quantum natural language processing (QNLP).

Documentation: https://cqcl.github.io/lambeq/

Getting started

Prerequisites

  • Python 3.7+

Installation

Direct pip install

The base lambeq can be installed with the command:

pip install lambeq

This does not include optional dependencies such as depccg and PyTorch, which have to be installed separately. In particular, depccg is required for lambeq.ccg2discocat.DepCCGParser.

To install lambeq with depccg, run instead:

pip install cython numpy
pip install lambeq[depccg]
depccg_en download

See below for further options.
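
Since depccg and PyTorch are optional, it can help to check which of them are importable before using the features that need them. The following is a minimal sketch using only the standard library. The module names `depccg` and `torch` are assumed to be the import names of those packages, and `optional_dependency_status` is a hypothetical helper, not part of lambeq:

```python
from importlib.util import find_spec


def optional_dependency_status(modules=("depccg", "torch")):
    """Map each top-level module name to True if it can be imported."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Report which optional packages are present in the current environment.
    for name, available in optional_dependency_status().items():
        print(f"{name}: {'installed' if available else 'missing'}")
```

Run it inside the same environment (for example the same virtual environment) in which lambeq was installed.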

Automatic installation (recommended)

This runs an interactive installer to help pick the installation destination and configuration.

  1. Run:
    bash <(curl 'https://cqcl.github.io/lambeq/install.sh')

Git installation

This requires Git to be installed.

  1. Download this repository:

    git clone https://github.com/CQCL/lambeq
  2. Enter the repository:

    cd lambeq
  3. Make sure pip is up-to-date:

    pip install --upgrade pip wheel
  4. (Optional) If installing the optional dependency depccg, the following packages must be installed before installing depccg:

    pip install cython numpy

    Further information can be found on the depccg homepage.

  5. Install lambeq from the local repository using pip:

    pip install --use-feature=in-tree-build .

    To include depccg, run instead:

    pip install --use-feature=in-tree-build .[depccg]

    To include all optional dependencies, run instead:

    pip install --use-feature=in-tree-build .[all]
  6. If using a pretrained depccg parser, download a pretrained model:

    depccg_en download

Usage

The docs/examples directory contains notebooks demonstrating usage of the various tools in lambeq.

Example - parsing a sentence into a diagram (see docs/examples/ccg2discocat.ipynb):

from lambeq.ccg2discocat import DepCCGParser

depccg_parser = DepCCGParser()
diagram = depccg_parser.sentence2diagram('This is a test sentence')
diagram.draw()

Note: all pre-trained depccg models apart from the basic one are broken, and depccg has not yet been updated to fix this. Therefore, it is recommended to just use the basic parser, as shown here.

Testing

Run all tests with the command:

pytest

Note: if you have installed in a virtual environment, remember to install pytest in the same environment using pip.

Building Documentation

To build the documentation, first install the required dependencies:

pip install -r docs/requirements.txt

then run the commands:

cd docs
make clean
make html

The docs will be under docs/_build.

To rebuild the rst files themselves, run:

sphinx-apidoc --force -o docs lambeq

License

Distributed under the Apache 2.0 license. See LICENSE for more details.

Citation

If you wish to attribute our work, please cite the accompanying paper:

@article{kartsaklis2021lambeq,
   title={lambeq: {A}n {E}fficient {H}igh-{L}evel {P}ython {L}ibrary for {Q}uantum {NLP}},
   author={Dimitri Kartsaklis and Ian Fan and Richie Yeung and Anna Pearson and Robin Lorenz and Alexis Toumi and Giovanni de Felice and Konstantinos Meichanetzidis and Stephen Clark and Bob Coecke},
   year={2021},
   journal={arXiv preprint arXiv:2110.04236},
}
Comments
  • No module named BobcatParser

    I'm trying to run the following code from your tutorials website, but I am unable to install BobcatParser. I am working in a Colab environment. Is there a dependency that I may be missing?

    from lambeq import BobcatParser
    
    parser = BobcatParser(root_cats=('NP', 'N'), verbose='text')
    
    raw_train_diagrams = parser.sentences2diagrams(train_data, suppress_exceptions=True)
    raw_val_diagrams = parser.sentences2diagrams(val_data, suppress_exceptions=True) 
    
    opened by alt-shreya 14
  • error during installation

    When I try installing using sh <(curl 'https://cqcl.github.io/lambeq/install.sh'), I get the following error:

    ERROR: Cannot install lambeq[depccg]==0.1.0, lambeq[depccg]==0.1.1 and lambeq[depccg]==0.1.2 because these package versions have conflicting dependencies.
    
    The conflict is caused by:
        lambeq[depccg] 0.1.2 depends on depccg==1.1.0; extra == "depccg"
        lambeq[depccg] 0.1.1 depends on depccg==1.1.0; extra == "depccg"
        lambeq[depccg] 0.1.0 depends on depccg==1.1.0; extra == "depccg"
    
    To fix this you could try to:
    1. loosen the range of package versions you've specified
    2. remove package versions to allow pip attempt to solve the dependency conflict
    
    ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies
    
    

    I am on a 2020 MacBook Air (with Apple M1 chip), and using conda with python=3.8.11. Could any of that be causing the problem?

    opened by mithunpaul08 13
  • Problem with trainer.fit(), operands of different shape

    Hi, I am trying to run the quantum trainer algorithm. When running the following line:

    trainer.fit(train_dataset, val_dataset, evaluation_step=1, logging_step=100)

    I get the following error:

    ValueError                          Traceback (most recent call last)
    Input In [17], in <cell line: 1>()
    ----> 1 trainer.fit(train_dataset, val_dataset, evaluation_step=1, logging_step=100)
    
    File c:\python38\lib\site-packages\lambeq\training\trainer.py:365, in Trainer.fit(self, train_dataset, val_dataset, evaluation_step, logging_step)
        363 step += 1
        364 x, y_label = batch
    --> 365 y_hat, loss = self.training_step(batch)
        366 if (self.evaluate_on_train and
        367         self.evaluate_functions is not None):
        368     for metr, func in self.evaluate_functions.items():
    
    File c:\python38\lib\site-packages\lambeq\training\quantum_trainer.py:149, in QuantumTrainer.training_step(self, batch)
        133 def training_step(
        134         self,
        135         batch: tuple[list[Any], np.ndarray]) -> tuple[np.ndarray, float]:
        136     """Perform a training step.
        137 
        138     Parameters
       (...)
        147 
        148     """
    --> 149     y_hat, loss = self.optimizer.backward(batch)
        150     self.train_costs.append(loss)
        151     self.optimizer.step()
    
    File c:\python38\lib\site-packages\lambeq\training\spsa_optimizer.py:126, in SPSAOptimizer.backward(self, batch)
        124 self.model.weights = xplus
        125 y0 = self.model(diagrams)
    --> 126 loss0 = self.loss_fn(y0, targets)
        127 xminus = self.project(x - self.ck * delta)
        128 self.model.weights = xminus
    
    Input In [13], in <lambda>(y_hat, y)
    ----> 1 loss = lambda y_hat, y: -np.sum(y * np.log(y_hat)) / len(y)  # binary cross-entropy loss
          3 acc = lambda y_hat, y: np.sum(np.round(y_hat) == y) / len(y) / 2  # half due to double-counting
          4 eval_metrics = {"acc": acc}
    
    ValueError: operands could not be broadcast together with shapes (30,2) (30,)
    

    I have just fixed the .py file in the lib following #12. The algorithm raised an error even before that. I can't recall exactly, but I don't think it was the same error.

    What can I do to solve this? Thank you for your time.

    opened by Stephenito 10
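
The shapes in the error above, (30,2) and (30,), suggest that the model output has two columns per sentence while the labels are a flat vector. The following is a minimal sketch of that mismatch, assuming integer class labels 0/1 and using random placeholder data (not the poster's dataset). One-hot encoding the labels is one possible way to make the shapes agree:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder model output: 30 sentences, 2 class probabilities each.
y_hat = rng.dirichlet([1.0, 1.0], size=30)   # shape (30, 2)
# Flat integer labels (an assumption about how the labels were stored).
labels = rng.integers(0, 2, size=30)         # shape (30,)

# Same loss expression as in the traceback above.
loss = lambda y_hat, y: -np.sum(y * np.log(y_hat)) / len(y)

try:
    loss(y_hat, labels)                      # (30,) and (30, 2) cannot broadcast
except ValueError as err:
    print("mismatch:", err)

# One-hot encode so that labels and predictions share the shape (30, 2).
one_hot = np.eye(2)[labels]
print(loss(y_hat, one_hot))
```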
  • Error when running Parser

    Below is the code I ran for testing the parser:

    from lambeq import BobcatParser

    parser = BobcatParser()
    diagram = parser.sentence2diagram('This is a test sentence')
    diagram.draw()

    And this is the error I received when running it:

    2022-05-22 20:19:08.041411: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
    2022-05-22 20:19:08.042271: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
    Traceback (most recent call last):
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1342, in do_open
        h.request(req.get_method(), req.selector, req.data, headers,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1255, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1301, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1250, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1010, in _send_output
        self.send(msg)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 950, in send
        self.connect()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1424, in connect
        self.sock = self._context.wrap_socket(self.sock,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 500, in wrap_socket
        return self.sslsocket_class._create(
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1040, in _create
        self.do_handshake()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1309, in do_handshake
        self._sslobj.do_handshake()
    ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1123)

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "C:\Users\elmm\Desktop\CQM\QNLP Depression 2.py", line 3, in <module>
        parser = BobcatParser()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\bobcat_parser.py", line 258, in __init__
        download_model(model_name_or_path, model_dir, verbose)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\bobcat_parser.py", line 130, in download_model
        model_file, headers = urlretrieve(url, reporthook=progress_bar.update)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 239, in urlretrieve
        with contextlib.closing(urlopen(url, data)) as fp:
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 214, in urlopen
        return opener.open(url, data, timeout)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 517, in open
        response = self._open(req, data)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 534, in _open
        result = self._call_chain(self.handle_open, protocol, protocol +
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 494, in _call_chain
        result = func(*args)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1385, in https_open
        return self.do_open(http.client.HTTPSConnection, req,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1345, in do_open
        raise URLError(err)
    urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1123)>

    How can I resolve this?

    opened by ACE07-Sev 9
  • Question : What does the Quantum_trainer output?

    Hi, I wish to use the Quantum_trainer for depression detection with a chatbot (the sentences would be the input for the QNLP module), and then classify whether the person has depression or not.

    May I ask what the input and the output of the sample trainer for the quantum module are? Does it do binary classification, or is that something I need to add as an additional layer?

    opened by ACE07-Sev 8
  • An Error while running Classical Pipeline Example given in docs/examples

    Hi @dimkart I hope you are doing well

    I am trying to run the code given here on my Google Colab account - https://github.com/CQCL/lambeq/blob/main/docs/examples/classical_pipeline.ipynb

    I am installing lambeq directly on Colab and it is picking up the latest version of DisCoPy

    But I am continuously getting an error like this. I have pasted the full stack trace here -

    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    <ipython-input-11-84634b74856a> in <module>()
         39 dev_cost_fn, dev_costs, dev_accs = make_cost_fn(dev_pred_fn, dev_labels)
         40 
    ---> 41 result = train(train_cost_fn, x0, niter=20, callback=dev_cost_fn, optimizer_fn=torch.optim.AdamW, lr=0.1)
    
    10 frames
    <ipython-input-11-84634b74856a> in train(func, x0, niter, callback, optimizer_fn, lr)
          3     optimizer = optimizer_fn(x, lr=lr)
          4     for _ in range(niter):
    ----> 5         loss = func(x)
          6 
          7         optimizer.zero_grad()
    
    <ipython-input-11-84634b74856a> in cost_fn(params, **kwargs)
         16 def make_cost_fn(pred_fn, labels):
         17     def cost_fn(params, **kwargs):
    ---> 18         predictions = pred_fn(params)
         19 
         20         logits = predictions[:, 1] - predictions[:, 0]
    
    <ipython-input-10-dbb8534e3157> in predict(params)
          1 def make_pred_fn(circuits):
          2     def predict(params):
    ----> 3         return torch.stack([c.lambdify(*parameters)(*params).eval(contractor=tn.contractors.auto).array for c in circuits])
          4     return predict
          5 
    
    <ipython-input-10-dbb8534e3157> in <listcomp>(.0)
          1 def make_pred_fn(circuits):
          2     def predict(params):
    ----> 3         return torch.stack([c.lambdify(*parameters)(*params).eval(contractor=tn.contractors.auto).array for c in circuits])
          4     return predict
          5 
    
    /usr/local/lib/python3.7/dist-packages/discopy/tensor.py in eval(self, contractor)
        448         if contractor is None:
        449             return Functor(ob=lambda x: x, ar=lambda f: f.array)(self)
    --> 450         array = contractor(*self.to_tn()).tensor
        451         return Tensor(self.dom, self.cod, array)
        452 
    
    /usr/local/lib/python3.7/dist-packages/tensornetwork/contractors/opt_einsum_paths/path_contractors.py in auto(nodes, output_edge_order, memory_limit, ignore_edge_order)
        262         output_edge_order=output_edge_order,
        263         nbranch=1,
    --> 264         ignore_edge_order=ignore_edge_order)
        265   return greedy(nodes, output_edge_order, memory_limit, ignore_edge_order)
        266 
    
    /usr/local/lib/python3.7/dist-packages/tensornetwork/contractors/opt_einsum_paths/path_contractors.py in branch(nodes, output_edge_order, memory_limit, nbranch, ignore_edge_order)
        160   alg = functools.partial(
        161       opt_einsum.paths.branch, memory_limit=memory_limit, nbranch=nbranch)
    --> 162   return base(nodes, alg, output_edge_order, ignore_edge_order)
        163 
        164 
    
    /usr/local/lib/python3.7/dist-packages/tensornetwork/contractors/opt_einsum_paths/path_contractors.py in base(nodes, algorithm, output_edge_order, ignore_edge_order)
         86   path, nodes = utils.get_path(nodes_set, algorithm)
         87   for a, b in path:
    ---> 88     new_node = contract_between(nodes[a], nodes[b], allow_outer_product=True)
         89     nodes.append(new_node)
         90     nodes = utils.multi_remove(nodes, [a, b])
    
    /usr/local/lib/python3.7/dist-packages/tensornetwork/network_components.py in contract_between(node1, node2, name, allow_outer_product, output_edge_order, axis_names)
       2083     axes1 = [axes1[i] for i in ind_sort]
       2084     axes2 = [axes2[i] for i in ind_sort]
    -> 2085     new_tensor = backend.tensordot(node1.tensor, node2.tensor, [axes1, axes2])
       2086     new_node = Node(tensor=new_tensor, name=name, backend=backend)
       2087     # node1 and node2 get new edges in _remove_edges
    
    /usr/local/lib/python3.7/dist-packages/tensornetwork/backends/pytorch/pytorch_backend.py in tensordot(self, a, b, axes)
         44   def tensordot(self, a: Tensor, b: Tensor,
         45                 axes: Union[int, Sequence[Sequence[int]]]) -> Tensor:
    ---> 46     return torchlib.tensordot(a, b, dims=axes)
         47 
         48   def reshape(self, tensor: Tensor, shape: Tensor) -> Tensor:
    
    /usr/local/lib/python3.7/dist-packages/torch/functional.py in tensordot(a, b, dims, out)
       1032 
       1033     if out is None:
    -> 1034         return _VF.tensordot(a, b, dims_a, dims_b)  # type: ignore[attr-defined]
       1035     else:
       1036         return _VF.tensordot(a, b, dims_a, dims_b, out=out)  # type: ignore[attr-defined]
    
    RuntimeError: expected scalar type Float but found Double
    

    I was able to successfully carry out experiments using the Quantum Pipeline code on Google Colab and did not face any issues, but with this one I am getting an error. I have tried to fix the issue by converting variables or some function outputs with float(), but I was unable to rectify it.

    Can you please help me fix this issue?

    Thank you so much!

    opened by srinjoyganguly 8
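
The final error message above says that a Float tensor met a Double one. In PyTorch, Float is 32-bit and Double is 64-bit, and NumPy creates 64-bit floats by default, so data built with NumPy stays 64-bit unless it is cast. The following is a minimal NumPy-only sketch of that default and of an explicit cast. Whether this is the cause in the notebook above is an assumption, and `x0` here is placeholder data rather than the notebook's variable:

```python
import numpy as np

# NumPy's default floating-point dtype is 64-bit ("Double" in PyTorch terms).
x0 = np.random.default_rng(0).random(4)
print(x0.dtype)               # float64

# An explicit cast gives 32-bit values ("Float" in PyTorch terms).
x0_single = x0.astype(np.float32)
print(x0_single.dtype)        # float32
```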
  • Add Japanese support to DepCCGParser

    Updated DepCCGParser to support Japanese. The sample code is as follows.

    1. Prepare depccg.

    pip install cython numpy depccg
    depccg_en download
    depccg_ja download
    

    2. Install Japanese fonts on Ubuntu.

    apt install -y fonts-migmix
    rm ~/.cache/matplotlib/fontlist-v330.json
    

    3. Set the matplotlib Japanese font in the jupyter notebook python code.

    import matplotlib
    from matplotlib.font_manager import FontProperties
    
    font_path = "/usr/share/fonts/truetype/migmix/migmix-1p-regular.ttf"
    font_prop = FontProperties(fname=font_path)
    matplotlib.rcParams["font.family"] = font_prop.get_name()
    

    4. Use sentence2diagram in the jupyter notebook python code.

    from lambeq import DepCCGParser
    from discopy import grammar
    
    parser = DepCCGParser(lang='ja')
    diagram = parser.sentence2diagram('これはテストの文です。')
    grammar.draw(diagram, figsize=(14,3), fontsize=12)
    

    5. Use ansatz in the jupyter notebook python code.

    from lambeq import AtomicType, IQPAnsatz
    
    # Define atomic types
    N = AtomicType.NOUN
    S = AtomicType.SENTENCE
    
    # Convert string diagram to quantum circuit
    ansatz = IQPAnsatz({N: 1, S: 1}, n_layers=2)
    discopy_circuit = ansatz(diagram)
    discopy_circuit.draw(figsize=(15,10))
    

    6. Use pytket in the jupyter notebook python code.

    from pytket.circuit.display import render_circuit_jupyter
    
    tket_circuit = discopy_circuit.to_tk()
    render_circuit_jupyter(tket_circuit)
    
    opened by KentaroAOKI 6
  • <unk> token feature in the forward() function

    Considering the necessity of the <unk> token for never-before-seen entities, how can I implement the <unk> token in the forward function so that the model can calculate probabilities for instances that contain unknown symbols? Based on my understanding and guidance from one of the moderators, I think it is supposed to go in the forward function.

    Could you kindly assist me in implementing this?

    opened by ACE07-Sev 6
  • inference

    I'm trying to run the Quantum pipeline using the JAX backend. To better show the results (how each sentence was classified into the two categories), is there an example of code available for performing inference, as in classical NLP deep learning approaches (e.g. transformer-based approaches or similar)?

    stale 
    opened by nlpirate 6
  • WebParser error

    Hi, I'm trying to run the Web Parser but, very similar to the issue with the Bobcat Parser, it gives this error:

    Traceback (most recent call last):
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1342, in do_open
        h.request(req.get_method(), req.selector, req.data, headers,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1255, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1301, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1250, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1010, in _send_output
        self.send(msg)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 950, in send
        self.connect()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1424, in connect
        self.sock = self._context.wrap_socket(self.sock,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 500, in wrap_socket
        return self.sslsocket_class._create(
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1040, in _create
        self.do_handshake()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1309, in do_handshake
        self._sslobj.do_handshake()
    ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1123)

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "C:\Users\elmm\Desktop\CQM\QNLP_test.py", line 6, in <module>
        new_diagram = parser.sentence2diagram('he was overtaken by the depression.')
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\ccg_parser.py", line 227, in sentence2diagram
        return self.sentences2diagrams(
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\ccg_parser.py", line 157, in sentences2diagrams
        trees = self.sentences2trees(sentences,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\web_parser.py", line 159, in sentences2trees
        raise e
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\web_parser.py", line 148, in sentences2trees
        with urlopen(url) as f:
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 214, in urlopen
        return opener.open(url, data, timeout)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 517, in open
        response = self._open(req, data)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 534, in _open
        result = self._call_chain(self.handle_open, protocol, protocol +
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 494, in _call_chain
        result = func(*args)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1385, in https_open
        return self.do_open(http.client.HTTPSConnection, req,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1345, in do_open
        raise URLError(err)
    urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1123)>

    Process finished with exit code 1

    I really need to get this working because Bobcat fails in parsing the sentences into diagrams. Can someone please help me with this?

    opened by ACE07-Sev 4
  • Numpy int32 error

    Hi, I ran the Quantum_trainer and have been getting the error below, and I can't seem to find where the issue is.

    Traceback (most recent call last):
      File "C:\Users\elmm\Desktop\CQM\QNLP Depression.py", line 103, in <module>
        trainer.fit(train_dataset, val_dataset, logging_step=12)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\trainer.py", line 365, in fit
        y_hat, loss = self.training_step(batch)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\quantum_trainer.py", line 149, in training_step
        y_hat, loss = self.optimizer.backward(batch)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\spsa_optimizer.py", line 125, in backward
        y0 = self.model(diagrams)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\model.py", line 59, in __call__
        return self.forward(*args, **kwds)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\tket_model.py", line 131, in forward
        return self.get_diagram_output(x)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\tket_model.py", line 103, in get_diagram_output
        seed=self._randint()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\tket_model.py", line 71, in _randint
        return np.random.randint(low, high)
      File "mtrand.pyx", line 746, in numpy.random.mtrand.RandomState.randint
      File "_bounded_integers.pyx", line 1334, in numpy.random._bounded_integers._rand_int32
    ValueError: low is out of bounds for int32

    opened by ACE07-Sev 4
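
The last frames of the traceback above show np.random.randint rejecting a bound that does not fit in int32. On Windows, NumPy versions before 2.0 use a 32-bit default integer for this legacy API, which is a likely reason the error appears there. The following is a hedged sketch of drawing a seed with an explicit 64-bit dtype. The bounds are illustrative assumptions, not the values used inside lambeq:

```python
import numpy as np

# Illustrative bounds only; 2**32 does not fit in a signed 32-bit integer.
low, high = 0, 2**32

rng = np.random.default_rng()
# Requesting int64 explicitly avoids relying on the platform's default integer width.
seed = int(rng.integers(low, high, dtype=np.int64))
print(seed)
```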
  • Error with Bobcat Parser

    Hi,

    I am getting an error that is relatively difficult to understand. I am trying to look at the sentence "Senate Advances Bill To Approve Keystone Pipeline Despite Obamas Veto Threat", and can't figure out why it is giving me the error below. Does BobcatParser throw an error if it doesn't recognize a word? It is unclear to me what I would need to do to fix the sentence.

    from lambeq import BobcatParser

    parser = BobcatParser()
    diagram = parser.sentence2diagram('Senate Advances Bill To Approve Keystone Pipeline Despite Obamas Veto Threat')
    diagram.draw()

    Traceback (most recent call last):
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/lambeq/text2diagram/bobcat_parser.py", line 382, in sentences2trees
        trees.append(self._build_ccgtree(result[0]))
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/lambeq/bobcat/parser.py", line 258, in __getitem__
        return self.root[index]
    IndexError: list index out of range

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/spyder_kernels/py3compat.py", line 356, in compat_exec
        exec(code, globals, locals)
      File "/Users/dabeaulieu/Documents/Initiatives/quantum/machine learning/notebooks/qnlp/ankush/Quantum_NLP/testcode.py", line 12, in <module>
        diagram = parser.sentence2diagram('Senate Advances Bill To Approve Keystone Pipeline Despite Obamas Veto Threat')
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/lambeq/text2diagram/ccg_parser.py", line 231, in sentence2diagram
        return self.sentences2diagrams(
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/lambeq/text2diagram/ccg_parser.py", line 161, in sentences2diagrams
        trees = self.sentences2trees(sentences,
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/lambeq/text2diagram/bobcat_parser.py", line 387, in sentences2trees
        raise BobcatParseError(' '.join(sent.words))
    BobcatParseError: Bobcat failed to parse 'Senate Advances Bill To Approve Keystone Pipeline Despite Obamas Veto Threat'.

    opened by dancbeaulieu 1
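
The BobcatParseError in the traceback above indicates that the parser could not produce a parse for this sentence. When processing many sentences, one option is to catch the failure per sentence and carry on with the rest. The following is a minimal sketch with a stand-in parser function, so that it runs without lambeq. `parse_all` and `fake_parse` are hypothetical names, and the rule that `fake_parse` uses to reject a sentence is arbitrary and says nothing about why Bobcat fails. With a real parser, its `sentence2diagram` method would take the place of `fake_parse`:

```python
def parse_all(parse_fn, sentences):
    """Apply parse_fn to each sentence, collecting failures instead of stopping."""
    parsed, failed = {}, {}
    for sentence in sentences:
        try:
            parsed[sentence] = parse_fn(sentence)
        except Exception as err:  # e.g. a parser-specific parse error
            failed[sentence] = err
    return parsed, failed


def fake_parse(sentence):
    """Stand-in for a real parser; arbitrarily rejects sentences without a full stop."""
    if not sentence.endswith('.'):
        raise ValueError(f'failed to parse {sentence!r}')
    return sentence.split()


parsed, failed = parse_all(
    fake_parse, ['This is a test sentence.', 'Headline Without Stop'])
print(sorted(parsed), sorted(failed))
```

The sentences2diagrams call quoted in the first comment passes suppress_exceptions=True for a similar purpose.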
  • Error from_tk

    Discussed in https://github.com/CQCL/lambeq/discussions/49

    Originally posted by JVM1982 October 10, 2022

    Hello.

    Why does the following code not work?

    sentence = 'person runs program .'
    diagram = remove_cups( parser.sentence2diagram( sentence ) )
    circuit = ansatz( diagram )
    print( model( [ circuit ] ) ) # OK #
    print( model( [ from_tk( circuit.to_tk() ) ] ) ) # ERROR #

    [[0.14473685 0.85526315]]
    Unexpected exception formatting exception. Falling back to standard exception

    Traceback (most recent call last):
      File "/home/javier.valera/.local/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3378, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "/tmp/ipykernel_16444/2916265434.py", line 9, in
        print( model( [ from_tk( circuit.to_tk() ) ] ) )
      File "/home/javier.valera/.local/lib/python3.8/site-packages/lambeq/training/model.py", line 59, in __call__
        return self.forward(*args, **kwds)
      File "/home/javier.valera/.local/lib/python3.8/site-packages/lambeq/training/tket_model.py", line 131, in forward
        return self.get_diagram_output(x)
      File "/home/javier.valera/.local/lib/python3.8/site-packages/lambeq/training/tket_model.py", line 101, in get_diagram_output
        *[diag_f(*self.weights) for diag_f in lambdified_diagrams],
      File "/home/javier.valera/.local/lib/python3.8/site-packages/lambeq/training/tket_model.py", line 101, in
        *[diag_f(*self.weights) for diag_f in lambdified_diagrams],
      File "/home/javier.valera/.local/lib/python3.8/site-packages/discopy/monoidal.py", line 509, in
        return lambda xs: self.id(self.dom).then((
      File "/home/javier.valera/.local/lib/python3.8/site-packages/discopy/monoidal.py", line 510, in
        self.id(left) @ box.lambdify(*symbols, **kwargs)(*xs)
      File "/home/javier.valera/.local/lib/python3.8/site-packages/discopy/quantum/gates.py", line 321, in
        return lambda *xs: type(self)(c_fn(*xs), distance=self.distance)
      File "/home/javier.valera/.local/lib/python3.8/site-packages/discopy/quantum/gates.py", line 448, in
        return lambda *xs: type(self)(data(*xs))
      File "", line 2, in _lambdifygenerated
        return [email protected]@n.l_0
    NameError: name 'runs__n' is not defined

    Thanks.

    opened by Thommy257 0
  • cfg: other languages compatibility

    I am interested in adapting lambeq (and consequently DisCoCat) to languages other than English, particularly Italian. As far as I understand from the documentation, from a linguistic point of view the framework is based on cfg grammars, a theoretical formalism I know very well. How can I visualize and extract the structures and formalisms of these grammars from the library so that they can be extended or modified?

    opened by nlpirate 0
Releases (0.2.7)
  • 0.2.7 (Oct 11, 2022)

    Added:

    • Added support for Japanese to DepCCGParser. (credit: KentaroAOKI https://github.com/CQCL/lambeq/pull/24)
    • Overhauled the CircuitAnsatz interface, and added three new ansätze.
    • Added helper methods to CCGTree to get the children of a tree. Added a new .tree2diagram method to TreeReader, extracted from TreeReader.sentence2diagram.
    • Added a new TreeReaderMode named HEIGHT.
    • Added new methods to Checkpoint for creating, saving and loading checkpoints for training.
    • Documentation: added a section for how to select the right model and trainer for training.
    • Documentation: added links to glossary terms throughout the documentation.
    • Documentation: added UML class diagrams for the sub-packages in lambeq.

    Changed:

    • Dependencies: bumped the minimum versions of discopy and torch.
    • IQPAnsatz now post-selects in the Hadamard basis.
    • PytorchModel now initialises using xavier_uniform.
    • CCGTree.to_json can now be applied to None, returning None.
    • Several slow imports have been deferred, making lambeq much faster to import for the first time.
    • In CCGRule.infer_rule, direction checks have been made explicit.
    • UnarySwap is now specified to be a unaryBoxConstructor.
    • BobcatParser has been refactored for easier use with external evaluation tools.
    • Documentation: headings have been organised in the tutorials into subsections.

    Fixed:

    • Fixed how CCGRule.infer_rule assigns a punc + X instance: if the result is X\X the assigned rule is CONJUNCTION, otherwise the rule is REMOVE_PUNCTUATION_LEFT (similarly for punctuation on the right).
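
The punctuation case described above can be sketched in plain Python. This is an illustration only, not lambeq's implementation: categories are modelled as plain strings for atomic types, and the helper name `infer_punctuation_rule` and the returned rule names as strings are assumptions made for this example.

```python
def infer_punctuation_rule(left: str, right: str, result: str) -> str:
    """Pick a rule name for a `punc + X` step (hypothetical helper).

    Only atomic categories such as "NP" are handled, so X\\X is simply
    the category, a backslash, and the category again.
    """
    if left != "punc":
        raise ValueError("this sketch only covers punctuation on the left")
    # A result of the form X\X means the punctuation acts as a conjunction.
    if result == f"{right}\\{right}":
        return "CONJUNCTION"
    # Otherwise the punctuation is simply removed from the left.
    return "REMOVE_PUNCTUATION_LEFT"


conj = infer_punctuation_rule("punc", "NP", "NP\\NP")
drop = infer_punctuation_rule("punc", "NP", "NP")
```

The mirror case for punctuation on the right would follow the same pattern with the arguments swapped.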

    Removed:

    • Removed unnecessary override of .from_diagrams in NumpyModel.
    • Removed unnecessary kwargs parameters from several constructors.
    • Removed unused special_cases parameter and _ob method from CircuitAnsatz.
  • 0.2.6(Aug 11, 2022)

    Added:

    • A strict pregroups mode to the CLI. With this mode enabled, all swaps are removed from the output string diagrams by changing the ordering of the atomic types, converting them into a valid pregroup form.

    Fixed:

    • Adjusted the behaviour of output normalisation in quantum models. Now, NumpyModel always returns probabilities instead of amplitudes.
    • Removed the prediction from the output of the SPSAOptimizer, which now returns just the loss.
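
The difference between amplitudes and probabilities can be illustrated with a generic Born-rule conversion. This is a minimal sketch under the assumption that outputs are normalised to sum to 1; it is not taken from lambeq, whose exact normalisation may differ, and the function name is hypothetical.

```python
from typing import List, Sequence


def amplitudes_to_probabilities(amplitudes: Sequence[complex]) -> List[float]:
    """Square the magnitude of each amplitude, then normalise to sum to 1."""
    weights = [abs(a) ** 2 for a in amplitudes]
    total = sum(weights)
    if total == 0:
        raise ValueError("all amplitudes are zero")
    return [w / total for w in weights]


# Two complex amplitudes; the phase of the first one does not matter.
probs = amplitudes_to_probabilities([0.6j, 0.8])
```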
  • 0.2.5(Jul 26, 2022)

    Added:

    • Added a "swapping" unary rule box to handle unary rules that change the direction of composition, improving the coverage of the BobcatParser.
    • Added a --version flag to the CLI.
    • Added a make_checkpoint method to all training models.

    Changed:

    • Changed the WebParser so that the online service to use is specified by name rather than by URL.
    • Changed the BobcatParser to only allow one tree per category in a cell, doubling parsing speed without affecting the structure of the parse trees (in most cases).
    • Made the linting of the codebase stricter, enforced by the GitHub action. The flake8 configuration can be viewed in the setup.cfg file.
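
The idea of keeping one tree per category in a chart cell can be sketched as follows. This is a hedged illustration of the general pruning technique, not the BobcatParser's code: the tuple layout `(category, score, tree)` and the function name `prune_cell` are assumptions for the example.

```python
from typing import Dict, Iterable, Tuple


def prune_cell(
    candidates: Iterable[Tuple[str, float, str]],
) -> Dict[str, Tuple[float, str]]:
    """Keep only the highest-scoring tree for each category in a cell."""
    best: Dict[str, Tuple[float, str]] = {}
    for category, score, tree in candidates:
        # Replace the stored tree only if this candidate scores higher.
        if category not in best or score > best[category][0]:
            best[category] = (score, tree)
    return best


cell = prune_cell([("NP", -1.2, "t1"), ("NP", -0.4, "t2"), ("S", -2.0, "t3")])
```

With fewer entries per cell, later cells have fewer combinations to try, which is where the speed-up in such a scheme comes from.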

    Fixed:

    • Fixed the parameter names in CCGRule, where dom and cod had inadvertently been swapped.
  • 0.2.4(Jul 4, 2022)

    Added:

    • Support for using jax as backend of tensornetwork when setting use_jit=True in the NumpyModel. The interface is not affected by this change, but performance of the model is significantly improved.

    Fixed:

    • Fix a bug that caused the BobcatParser and the WebParser to trigger an SSL certificate error on Windows.
    • Fix false positives in assigning conjunction rule using the CCGBankParser. The rule , + X[conj] -> X[conj] is a case of removing left punctuation, but was being assigned conjunction erroneously.
  • 0.2.3(Jun 8, 2022)

    Added:

    • CCGRule: Add symbol method that returns the ASCII symbol of a given CCG rule.
    • CCGTree: Extend deriv method with CCG output. It is now capable of returning standard CCG diagrams.
    • Command-line interface: CCG mode. When enabled, the output will be a string representation of the CCG diagram corresponding to the CCGTree object produced by the parser, instead of DisCoPy diagram or circuit.
    • Documentation: Add a troubleshooting page.

    Changed:

    • Change the behaviour of spiders_reader such that the spiders decompose logarithmically. This change also affects other rewrite rules that use spiders, such as coordination and relative pronouns.
    • Rename AtomicType.PREPOSITION to AtomicType.PREPOSITIONAL_PHRASE.
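
To give a feel for what a logarithmic decomposition of spiders buys, the sketch below compares the number of layers needed to merge n wires pairwise in a balanced tree with the number needed in a linear chain. It assumes each small spider merges two wires into one; the function names are hypothetical and this is not lambeq's code.

```python
def balanced_depth(n_legs: int) -> int:
    """Layers needed when wires are merged pairwise in a balanced tree."""
    if n_legs < 1:
        raise ValueError("a spider needs at least one leg")
    depth = 0
    while n_legs > 1:
        # Each layer halves the number of remaining wires (rounding up).
        n_legs = (n_legs + 1) // 2
        depth += 1
    return depth


def chain_depth(n_legs: int) -> int:
    """Layers needed when wires are merged one at a time in a chain."""
    if n_legs < 1:
        raise ValueError("a spider needs at least one leg")
    return n_legs - 1


depths = [(n, balanced_depth(n), chain_depth(n)) for n in (2, 5, 8)]
```

For 8 wires the balanced tree needs 3 layers where the chain needs 7, and the gap widens as the number of wires grows.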

    Fixed:

    • Fix a bug that raised a dtype error when using the TketModel on Windows.
    • Fix a bug that caused scalar outputs of circuits without open wires to be normalised when using a QuantumModel.
  • 0.2.2(Apr 24, 2022)

    Added:

    • Add support for Python 3.10.
    • Unify class hierarchies for parsers and readers: CCGParser is now a subclass of Reader and placed in the common package text2diagram. The old packages reader and ccg2discocat are no longer available. Compatibility problems with previous versions should be minimal, since from Release 0.2.0 onwards all lambeq classes can be imported from the global namespace.
    • Add CurryRewriteRule, which uses map-state duality in order to remove adjoint types from the boxes of a diagram. When used in conjunction with discopy.rigid.Diagram.normal_form(), this removes cups from the diagram, eliminating post-selection.
    • The Bobcat parser now updates automatically when new versions are made available online.
    • Allow customising available root categories for the parser when using the command-line interface.

    Fixed:

    • Update grammar file of Bobcat parser to avoid problems with conflicting unary rules.
  • 0.2.1(Apr 7, 2022)

    Added:

    • A new Checkpoint class that implements pickling and file operations from Trainer and Model.

    Changed:

    • Improvements to the training module, allowing multiple diagrams to be accepted as input to the SPSAOptimizer.
    • Updated documentation, including sub-package structures and class diagrams.
  • 0.2.0(Mar 21, 2022)

    Added:

    • A new state-of-the-art CCG parser, fully integrated with lambeq, which replaces depccg as the default parser of the toolkit. The new Bobcat parser has better performance, simplifies installation, and provides compatibility with Windows (which was not supported due to a depccg conflict). depccg is still supported as an alternative external dependency.
    • A training package, providing a selection of trainers, models, and optimizers that greatly simplify supervised training for most of lambeq's use cases, classical and quantum. The new package adds several new features to lambeq, such as the ability to save to and restore models from checkpoints.
    • Furthermore, the training package uses DisCoPy's tensor network capability to contract tensor diagrams efficiently. In particular, DisCoPy 0.4.1's new unitary and density matrix simulators result in substantially faster training speeds compared to the previous version.
    • A command-line interface, which provides most of lambeq's functionality from the command line. For example, lambeq can now be used as a standard command-line pregroup parser.
    • A web parser class that can send parsing queries to an online API, so that local installation of a parser is not strictly necessary anymore. The web parser is particularly helpful for testing purposes, interactive usage or when a local parser is unavailable, but should not be used for serious experiments.
    • A new lambeq.pregroups package that provides methods for easy creation of pregroup diagrams, removal of cups, and printing of diagrams in text form (i.e. in a terminal).
    • A new TreeReader class that exploits the biclosed structure of CCG grammatical derivations.
    • Three new rewrite rules for relative pronouns and coordination.
    • Tokenisation features have been added in all parsers and readers.
    • Additional generator methods and minor improvements for the CCGBankParser class.

    Changed:

    • Improved and more detailed package structure.
    • Most classes and functions can now be imported from lambeq directly, instead of having to import from the sub-packages.
    • The circuit and tensor modules have been combined into a lambeq.ansatz package. (However, as mentioned above, the classes and functions they define can now be imported directly from lambeq and should remain importable that way in future releases.)
    • Improved documentation and additional tutorials.
  • 0.1.2(Oct 12, 2021)

    Minor changes:

    • Add URLs to setup file

    Fixes:

    • Fix logo link in README
    • Fix missing version when building docs in GitHub action
    • Fix typo to description keyword in setup file
  • 0.1.1(Oct 12, 2021)

    Minor changes:

    • Update install script to use PyPI package
    • Add badges and link to documentation to README file
    • Add lambeq logo and link to GitHub to documentation
    • Allow documentation to automatically get package version
    • Add keywords and classifiers to setup file

    Fixes:

    • Add lambeq.circuit submodule to top-level lambeq module
    • Fix references to license file
  • 0.1.0(Oct 12, 2021)

    Initial release of lambeq. This contains a lot of core material:

    • converting sentences to string diagrams
    • CCG parsing, including reading from CCGBank
    • support for depccg parsing
    • rewriting diagrams
    • ansatze for circuits and tensors, including MPS ansatze
    • support for JAX and PyTorch integration
    • example notebooks and documentation
Owner
Cambridge Quantum
Quantum Software and Technologies