Facilitating the design, comparison and sharing of deep text matching models.

Last update: Jan 02, 2023

Overview

MatchZoo

MatchZoo is a general-purpose text matching toolkit that aims to make it easy to quickly implement, compare, and share the latest deep text matching models.

🔥 News: MatchZoo-py (PyTorch version of MatchZoo) is ready now.

The goal of MatchZoo is to provide a high-quality codebase for deep text matching research, such as document retrieval, question answering, conversational response ranking, and paraphrase identification. With a unified data processing pipeline, simplified model configuration, and automatic hyper-parameter tuning, MatchZoo is flexible and easy to use.

Tasks	Text 1	Text 2	Objective
Paraphrase Identification	string 1	string 2	classification
Textual Entailment	text	hypothesis	classification
Question Answer	question	answer	classification/ranking
Conversation	dialog	response	classification/ranking
Information Retrieval	query	document	ranking

Get Started in 60 Seconds

To train a Deep Structured Semantic Model (DSSM), import matchzoo and prepare the input data.

import matchzoo as mz

train_pack = mz.datasets.wiki_qa.load_data('train', task='ranking')
valid_pack = mz.datasets.wiki_qa.load_data('dev', task='ranking')

Preprocess your input data in three lines of code, and keep track of the parameters to be passed into the model.

preprocessor = mz.preprocessors.DSSMPreprocessor()
train_processed = preprocessor.fit_transform(train_pack)
valid_processed = preprocessor.transform(valid_pack)
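The fit-on-train, transform-on-validation pattern above can be illustrated without MatchZoo. The sketch below is a toy stand-in, not MatchZoo's DSSMPreprocessor: it assumes the letter-trigram word hashing described in the DSSM paper, and the class name ToyTrigramPreprocessor and its attributes are hypothetical.

```python
from collections import Counter


def letter_trigrams(word):
    """Split a word into letter trigrams, with '#' marking the word boundaries."""
    padded = f"#{word}#"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


class ToyTrigramPreprocessor:
    """Hypothetical stand-in for a fitted preprocessor (not MatchZoo's class)."""

    def fit(self, texts):
        # Build the trigram vocabulary from the training texts only.
        trigrams = sorted({t for text in texts
                           for word in text.lower().split()
                           for t in letter_trigrams(word)})
        self.index = {t: i for i, t in enumerate(trigrams)}
        return self

    def transform(self, texts):
        # Map each text to a bag-of-trigrams count vector.
        # Trigrams that were not seen during fit are dropped.
        vectors = []
        for text in texts:
            counts = Counter(t for word in text.lower().split()
                             for t in letter_trigrams(word))
            vec = [0] * len(self.index)
            for trigram, count in counts.items():
                if trigram in self.index:
                    vec[self.index[trigram]] = count
            vectors.append(vec)
        return vectors

    def fit_transform(self, texts):
        return self.fit(texts).transform(texts)


train_texts = ["good match", "bad match"]
valid_texts = ["good catch"]

toy = ToyTrigramPreprocessor()
train_vectors = toy.fit_transform(train_texts)  # learns the vocabulary
valid_vectors = toy.transform(valid_texts)      # reuses it unchanged
input_dim = len(toy.index)                      # the kind of value a model would need
```

The vocabulary size learned during fit plays the role of the shape information that the quick-start later reads from preprocessor.context.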

Make use of MatchZoo customized loss functions and evaluation metrics:

ranking_task = mz.tasks.Ranking(loss=mz.losses.RankCrossEntropyLoss(num_neg=4))
ranking_task.metrics = [
    mz.metrics.NormalizedDiscountedCumulativeGain(k=3),
    mz.metrics.MeanAveragePrecision()
]
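For readers who want to see what the two metrics above measure, here is a minimal pure-Python sketch. It is not MatchZoo's implementation: the gain 2^rel - 1 with a log2 discount is one common convention assumed here, and all function names are illustrative.

```python
import math


def rank_labels(labels, scores):
    """Return the labels reordered from highest to lowest predicted score."""
    ordered = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return [label for _, label in ordered]


def dcg_at_k(ranked_labels, k):
    """Discounted cumulative gain over the top-k labels of a ranked list."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(ranked_labels[:k]))


def ndcg_at_k(labels, scores, k):
    """DCG of the predicted ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    if ideal == 0:
        return 0.0
    return dcg_at_k(rank_labels(labels, scores), k) / ideal


def average_precision(labels, scores, threshold=0):
    """Mean of precision@rank taken at each relevant document."""
    hits, total = 0, 0.0
    for rank, label in enumerate(rank_labels(labels, scores), start=1):
        if label > threshold:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(per_query):
    """per_query: list of (labels, scores) pairs, one pair per query."""
    return sum(average_precision(l, s) for l, s in per_query) / len(per_query)


queries = [([1, 0, 0], [0.9, 0.5, 0.1]),  # relevant document ranked first
           ([0, 1], [0.9, 0.1])]          # relevant document ranked second
map_score = mean_average_precision(queries)
ndcg_first = ndcg_at_k(*queries[0], k=3)
ndcg_second = ndcg_at_k(*queries[1], k=3)
```

Both metrics are computed per query and then averaged, which is why the validation data has to keep track of which documents belong to which query.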

Initialize the model and fine-tune the hyper-parameters.

model = mz.models.DSSM()
model.params['input_shapes'] = preprocessor.context['input_shapes']
model.params['task'] = ranking_task
model.guess_and_fill_missing_params()
model.build()
model.compile()

Generate pair-wise training data on the fly, and evaluate model performance on validation data using customized callbacks.

train_generator = mz.PairDataGenerator(train_processed, num_dup=1, num_neg=4, batch_size=64, shuffle=True)
valid_x, valid_y = valid_processed.unpack()
evaluate = mz.callbacks.EvaluateAllMetrics(model, x=valid_x, y=valid_y, batch_size=len(valid_x))
history = model.fit_generator(train_generator, epochs=20, callbacks=[evaluate], workers=5, use_multiprocessing=False)
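The idea behind pair-wise generation with num_dup and num_neg can be sketched as follows. This is a simplified illustration that assumes binary relevance labels and sampling with replacement; MatchZoo's PairDataGenerator may differ in its details, and make_pairwise_groups is a hypothetical helper.

```python
import random
from collections import defaultdict


def make_pairwise_groups(rows, num_dup=1, num_neg=4, seed=0):
    """Group each positive document with num_neg negatives of the same query.

    rows: iterable of (query_id, doc_id, label) with label > 0 meaning relevant.
    Returns a list of (query_id, [positive, negative_1, ..., negative_num_neg]).
    """
    rng = random.Random(seed)
    by_query = defaultdict(lambda: {"pos": [], "neg": []})
    for qid, doc_id, label in rows:
        by_query[qid]["pos" if label > 0 else "neg"].append(doc_id)

    groups = []
    for qid, docs in by_query.items():
        if not docs["neg"]:
            continue  # nothing to contrast the positives with
        for pos in docs["pos"]:
            for _ in range(num_dup):  # each positive is reused num_dup times
                negatives = rng.choices(docs["neg"], k=num_neg)  # with replacement
                groups.append((qid, [pos] + negatives))
    return groups


rows = [("q1", "d1", 1), ("q1", "d2", 0), ("q1", "d3", 0),
        ("q2", "d4", 1)]  # q2 has no negative, so it yields no group
groups = make_pairwise_groups(rows, num_dup=2, num_neg=4)
```

Each group holds one positive followed by num_neg negatives, which is the layout a ranking loss with num_neg=4 would expect within a batch under these assumptions.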

References

Tutorials

English Documentation

Chinese Documentation

If you're interested in cutting-edge research progress, please take a look at awesome neural models for semantic match.

Install

MatchZoo depends on Keras and TensorFlow. There are two ways to install MatchZoo:

Install MatchZoo from PyPI:

pip install matchzoo

Install MatchZoo from the GitHub source:

git clone https://github.com/NTMC-Community/MatchZoo.git
cd MatchZoo
python setup.py install

Models

DRMM: this model is an implementation of A Deep Relevance Matching Model for Ad-hoc Retrieval.
MatchPyramid: this model is an implementation of Text Matching as Image Recognition.
ARC-I: this model is an implementation of Convolutional Neural Network Architectures for Matching Natural Language Sentences.
DSSM: this model is an implementation of Learning Deep Structured Semantic Models for Web Search using Clickthrough Data.
CDSSM: this model is an implementation of Learning Semantic Representations Using Convolutional Neural Networks for Web Search.
ARC-II: this model is an implementation of Convolutional Neural Network Architectures for Matching Natural Language Sentences.
MV-LSTM: this model is an implementation of A Deep Architecture for Semantic Matching with Multiple Positional Sentence Representations.
aNMM: this model is an implementation of aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model.
DUET: this model is an implementation of Learning to Match Using Local and Distributed Representations of Text for Web Search.
K-NRM: this model is an implementation of End-to-End Neural Ad-hoc Ranking with Kernel Pooling.
CONV-KNRM: this model is an implementation of Convolutional Neural Networks for Soft-Matching N-Grams in Ad-hoc Search.
Models under development: Match-SRNN, DeepRank, BiMPM, and others.

Citation

If you use MatchZoo in your research, please use the following BibTeX entry.

@inproceedings{Guo:2019:MLP:3331184.3331403,
 author = {Guo, Jiafeng and Fan, Yixing and Ji, Xiang and Cheng, Xueqi},
 title = {MatchZoo: A Learning, Practicing, and Developing System for Neural Text Matching},
 booktitle = {Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval},
 series = {SIGIR'19},
 year = {2019},
 isbn = {978-1-4503-6172-9},
 location = {Paris, France},
 pages = {1297--1300},
 numpages = {4},
 url = {http://doi.acm.org/10.1145/3331184.3331403},
 doi = {10.1145/3331184.3331403},
 acmid = {3331403},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {matchzoo, neural network, text matching},
}

Development Team

Name	Role	Affiliation
Fan Yixing	Core Dev	Asst. Prof., ICT
Wang Bo	Core Dev	M.S., TU Delft
Wang Zeyi	Core Dev	B.S., UC Davis
Pang Liang	Core Dev	Asst. Prof., ICT
Yang Liu	Core Dev	Ph.D., UMASS
Wang Qinghua	Documentation	B.S., Shandong Univ.
Wang Zizhen	Dev	M.S., UCAS
Su Lixin	Dev	Ph.D., UCAS
Yang Zhou	Dev	M.S., CQUT
Tian Junfeng	Dev	M.S., ECNU

Contribution

Please make sure to read the Contributing Guide before creating a pull request. If you have a MatchZoo-related paper/project/component/tool, send a pull request to this awesome list!

Thank you to all the people who already contributed to MatchZoo!

Jianpeng Hou, Lijuan Chen, Yukun Zheng, Niuguo Cheng, Dai Zhuyun, Aneesh Joshi, Zeno Gantner, Kai Huang, stanpcf, ChangQF, Mike Kellogg

Project Organizers

Jiafeng Guo
- Institute of Computing Technology, Chinese Academy of Sciences
- Homepage
Yanyan Lan
- Institute of Computing Technology, Chinese Academy of Sciences
- Homepage
Xueqi Cheng
- Institute of Computing Technology, Chinese Academy of Sciences
- Homepage

License

Apache-2.0

Comments

Run aNMM

I am new to MatchZoo. I wonder how to run aNMM. The docs don't describe usage for aNMM. I think I have to run a script to calculate the bin_sizes for aNMM, but I cannot find where this script is.

Furthermore, my training data needs to have a format like this one: https://github.com/NTMC-Community/MatchZoo/blob/master/matchzoo/datasets/toy/train.csv

right?

And where are the batches created? Since you have positive and negative documents for each query, the batch should contain both positive and negative samples, right?

How can I load my own data?

Thanks.
question

opened by ctrado18 26
Suggestions for MatchZoo 2.0
Anybody wanting to make suggestions for MZ 2.0, please add it in this issue.

Here are my suggestions:

[x] Add docstrings for all functions and classes

[ ] Make MZ OS independent

[ ] Make MZ usable by providing custom data

[ ] Allow External Benchmarking

[ ] Siamese Recurrent Networks (Proposed Model)

[ ] docker, conda, virtualenv support (wishlist)

More details at https://github.com/faneshion/MatchZoo/issues/106
2.0 discussion
opened by aneesh-joshi 24
Reproduction of Benchmark Results

When running through the procedure described in the README for the WikiQA benchmark results, the reproduced values for the two NDCG@k metrics and MAP are roughly half of the values shown in the table. Could you provide insight as to why this may be occurring?
bug question

opened by ghost 21
External benchmarking on Match Zoo
Hi, I am trying to establish benchmark results on all the document similarity models at MatchZoo. While there are some established benchmarks, it would be good if we had a MatchZoo-code independent system for evaluating results.

E.g.:

input_data -> MZ -> result_data
result_data -> independent_evaluation_code -> metric scores (for example NDCG@k, MAP, etc.)

The current scenario is that the evaluation code is strongly ingrained in the MZ code, which can cause problems with different commits over time. As seen in https://github.com/faneshion/MatchZoo/issues/99

1. Is there already a way to do this? I assume TREC is for that. Could someone direct me on how to use it?
2. Could someone direct me on how to go about writing such evaluation code? (Once developed, I will push it back into MZ, and it could serve as a Continuous Integration test.)

What do you think, @faneshion @yangliuy @bwanglzu @millanbatra @mandroid6?

Thanks!
opened by aneesh-joshi 20

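Is TensorFlow 2.0 fully supported at the moment?

As the title says, my runtime environment uses TF 2.0 with Keras 2.3.0, but the code fails to run with the following error: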

~/anaconda3/lib/python3.7/site-packages/keras/engine/training.py in _prepare_total_loss(self, masks)
    690 
    691                     output_loss = loss_fn(
--> 692                         y_true, y_pred, sample_weight=sample_weight)
    693 
    694                 if len(self.outputs) > 1:

TypeError: __call__() got an unexpected keyword argument 'sample_weight'
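The mechanism behind this kind of TypeError can be reproduced without Keras. The sketch below uses two hypothetical loss classes (neither is MatchZoo or Keras code) and a helper that mirrors the call shape visible in the traceback, loss_fn(y_true, y_pred, sample_weight=...). It only illustrates why a callable that predates the sample_weight keyword fails when the caller starts passing it.

```python
class LegacyRankLoss:
    """Hypothetical loss object whose __call__ does not know sample_weight."""

    def __call__(self, y_true, y_pred):
        return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


class WeightAwareRankLoss:
    """Hypothetical variant that accepts the extra keyword argument."""

    def __call__(self, y_true, y_pred, sample_weight=None):
        weights = sample_weight if sample_weight is not None else [1.0] * len(y_true)
        total = sum(w * (t - p) ** 2 for w, t, p in zip(weights, y_true, y_pred))
        return total / len(y_true)


def call_like_training_loop(loss_fn, y_true, y_pred):
    # Same call shape as in the traceback: the keyword is always passed.
    return loss_fn(y_true, y_pred, sample_weight=None)


y_true, y_pred = [1.0, 0.0], [0.5, 0.5]

try:
    call_like_training_loop(LegacyRankLoss(), y_true, y_pred)
    legacy_error = None
except TypeError as exc:
    legacy_error = str(exc)  # mentions the unexpected 'sample_weight' keyword

fixed_value = call_like_training_loop(WeightAwareRankLoss(), y_true, y_pred)
```

Under this reading, the loss object has to accept sample_weight for the newer training loop to call it.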

question

opened by hezhefly 18

DSSM returning NaN for loss when used with tensorflow-gpu backend.
I have been running DSSM on quite a large dataset and was looking at tensorflow-gpu to speed up the training. However the returning loss and mae are always NaN for both the train and evaluation phase. I have tried a very basic tensorflow model from their tutorials and it works fine.

I'm not really sure where to start debugging this; any help would be greatly appreciated.

The model works fine with the cpu version of tensorflow. Example:

model.fit(x,y, epochs=2)

Epoch 1/2
10000/10000 [==============================] - 1s 139us/step - loss: nan - mean_absolute_error: nan
Epoch 2/2
10000/10000 [==============================] - 1s 138us/step - loss: nan - mean_absolute_error: nan
opened by MichaelStarkey 18
Using a model as a search engine

I see that the models usually need a text1 and a text2 to perform training and prediction. With search engines, I usually just need text2 (the document) to perform the indexing step (training).

How can I train the model like a search engine? That is, I don't have the text1 information (query/question) and I want to index my documents.

Does using the same text for text1 and text2 work for training?
question

opened by denisb411 18
add preparation data for TREC data set

I've added all the modules for processing TREC datasets. Mainly, the modifications make it possible to produce a TREC-like run with the corresponding IDs for queries and documents, so evaluation with trec_eval is now possible, in addition to performing n-fold cross-validation with MatchZoo. Soon, I'll add programs for constructing the TREC input files needed by the added functions.
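As an illustration of what a TREC-like run looks like, the sketch below formats scored (query, document) pairs into the standard six-column run layout that trec_eval reads: query ID, the literal Q0, document ID, rank, score, and a run tag. The function name to_trec_run and the default tag are hypothetical and not taken from the pull request.

```python
def to_trec_run(scored_rows, run_tag="matchzoo_run"):
    """Format (query_id, doc_id, score) rows as TREC run lines.

    Each output line is: query_id Q0 doc_id rank score run_tag
    """
    by_query = {}
    for qid, doc_id, score in scored_rows:
        by_query.setdefault(qid, []).append((doc_id, score))

    lines = []
    for qid in sorted(by_query):
        # Rank documents of one query from highest to lowest score.
        ranked = sorted(by_query[qid], key=lambda pair: pair[1], reverse=True)
        for rank, (doc_id, score) in enumerate(ranked, start=1):
            lines.append(f"{qid} Q0 {doc_id} {rank} {score:.6f} {run_tag}")
    return lines


run_lines = to_trec_run([("q1", "d2", 0.2),
                         ("q1", "d1", 0.9),
                         ("q2", "d3", 0.5)])
```

Writing these lines to a file, together with a matching relevance-judgment file, is what makes evaluation independent of the training code.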

opened by thiziri 18
support keras 2.3 and tensorflow 2.0
update requirements.txt: keras=2.3.0 and tensorflow >= 2.0.0

upgrade pip in .travis.yml (tf 2.0 requires pip >= 19)

make ranking losses inherit keras.losses.Loss to support sample_weight keyword param

replace some keras.backend.tf with tf (K.tf does not exist anymore in 2.3.0 as keras is going to be synced with tf.keras and drop multi-backend)

add clear_session before prepare in model tests to prevent OOM during CI test

fix #789
opened by matthew-z 17
A question about the manner of input data to model.fit_generator()

I find that the input data is fed to the model through an outer iteration loop, as shown in the first screenshot.

I am not sure why it is done this way, so I changed it as shown in the second screenshot, because I want to use TensorBoard through a callback function.

I just use model.fit_generator() to handle the data and train. However, it raises an exception caused by validation_data. I traced it into the Keras internals and found that it occurs when the model starts to run evaluate_generator() at the end of an epoch: the evaluation data generator, which Keras reads through a queue, is empty, and this leads to the exception. Strangely, the exception does not occur in the first epoch but only after many epochs. After tracing the code, I think it may be a bug in Keras. Is that true? Also, is this the reason you use an outer iteration loop to train the model? Thank you for any advice.
question

opened by Star-in-Sky 17
Predict a new query

I already searched here. I am currently using v1. Is there any sample code (I only found a broken link)? I have my trained DRMM model and want to predict a document ranking for a new query.

What is the current state of this in v2?

I managed to train the model on my own custom text data with my own fast word embedding. Normally I would just predict for a new query, but the output is the text IDs. So, for DRMM, are new words that have no embedding in the dict ignored?

Thank you very much!
question

opened by datistiquo 15
TypeSpec error while DRMM model build
Hello, I am getting the following error while trying to build the DRMM model for my ranking task:

TypeError: Could not build a TypeSpec for KerasTensor(type_spec=TensorSpec(shape=(1, None, 10, 1), dtype=tf.float32, name=None), name='tf.operators.add/AddV2:0', description="created by layer 'tf.operators.add'") of unsupported type <class 'keras.engine.keras_tensor.KerasTensor'>.

Please note that I am using the following environment configuration on my local machine:

Python 3.8.11
MatchZoo 2.1.0
tensorflow 2.8.0

Describe your attempts

I checked the documentation and found no answer

I checked to make sure that this is not a duplicate issue

Additionally, I tried several solutions suggested elsewhere.

Context

OS [macOS 12.4]:

Hardware [Metal M1]

Thank you for your help and time.

Regards, Govind Shukla
bug
opened by govind17 0
Bug/enhancement

Describe the bug

MatchZoo breaks when run in Google Colab because of deprecated dependencies in Keras.

To Reproduce

Attempt to import MatchZoo in Google Colab:

!pip3 install matchzoo

import tensorflow
from tensorflow import keras
import matchzoo as mz
import nltk
import pandas as pd

Describe your attempts

Attempted to run MatchZoo in Google Colab. Fixed dependency issues.

Context

Nine files needed edits:

attention_layer.py
#from keras.engine import Layer
from keras.layers import Layer  # Changed from previous line to fix tensorflow toolchain

data_generator.py
import tensorflow  # Added to fix toolchain issues
#import keras
from tensorflow import keras  # Changed from previous line

decaying_dropout_layer.py
#from keras.engine import Layer
from keras.layers import Layer  # Changed from previous line to fix tensorflow toolchain

dynamic_pooling_layer.py
#from keras.engine import Layer
from keras.layers import Layer  # Changed from previous line to fix tensorflow toolchain

matching_layer.py
#from keras.engine import Layer
from keras.layers import Layer  # Changed from previous line to fix tensorflow toolchain

matching_tensor_layer.py
#from keras.engine import Layer
from keras.layers import Layer  # Changed from previous line to fix tensorflow toolchain

multi_perspective_layer.py
#from keras.engine import Layer
from keras.layers import Layer  # Changed from previous line to fix tensorflow toolchain

semantic_composite_layer.py
#from keras.engine import Layer
from keras.layers import Layer  # Changed from previous line to fix tensorflow toolchain

spatial_gru.py
#from keras.engine import Layer
from keras.layers import Layer  # Changed from previous line to fix tensorflow toolchain

Additional Information

I cloned the repo and will push this update as a contribution to the code base.
bug

opened by jerrycearley 2
Is there a Deep component (representation-focused model) in MatchZoo?

Hello, I'm working on the SLGen paper (Structure Learning for Headline Generation, AAAI-20). In this paper, they use Deep and Wide components for the structure representation of text documents.

Now, the question is: does MatchZoo provide a way to obtain the Deep component (an interaction-focused model and a representation-focused model) for a text document?

If anybody is working on this, please help me out!

Thank you, Darshan Tank
question

opened by Darshan2104 0
DSSM model.predict() scores rank does not match with the rank by dot layer cosine similarity
Describe the Question

I have a trained DSSM model and wanted to compare the item ranking based on the DSSM model.predict() scores against the ranking based on the cosine similarity scores taken after the model's dot layer. I would expect the two rankings to be the same, since model.predict() is just the final score after a linear activation. However, the results are completely the opposite, and I'm trying to understand how that can be, given that the linear coefficient from the final dense layer is positive.
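The expectation stated in the question can be checked in isolation. The sketch below uses made-up scores and weights (nothing here comes from a real DSSM model) to show that a linear map with a positive weight preserves a ranking, while a negative weight reverses it.

```python
def rank_order(scores):
    """Indices of the items sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def linear(scores, weight, bias):
    """Apply the same affine map to every score."""
    return [weight * s + bias for s in scores]


# Hypothetical cosine similarities for four candidate items.
cosine_scores = [0.82, 0.10, 0.55, -0.30]

cosine_order = rank_order(cosine_scores)
same_order = rank_order(linear(cosine_scores, weight=0.7, bias=0.2))
flipped_order = rank_order(linear(cosine_scores, weight=-0.7, bias=0.2))
```

If the two rankings disagree even though the weight really is positive, the difference would have to come from somewhere other than the linear activation itself.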

Describe your attempts

[x] I walked through the tutorials

[x] I checked the documentation

[x] I checked to make sure that this is not a duplicate question

question
opened by jchen0529 0
setup.py missing tensorflow
Describe the bug

The project needs TensorFlow, but setup.py does not list the package. Although requirements.txt contains it, running pip install -e . does not install it, and a "no module" error occurs. Is there any reason for not including TensorFlow in setup.py?
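One possible way to keep the two files in sync, shown here only as a sketch and not as the maintainers' approach, is to parse the requirements text and pass the result to the install_requires argument of setuptools.setup. The helper parse_requirements and the example contents are hypothetical.

```python
def parse_requirements(text):
    """Return requirement specifiers from requirements.txt-style text."""
    requirements = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if line and not line.startswith("-"):  # skip options such as -r or -e
            requirements.append(line)
    return requirements


# Hypothetical contents, not the project's real requirements file.
example = """
# core dependencies
keras == 2.3.0
tensorflow >= 2.0.0
"""

install_requires = parse_requirements(example)
```

With a list like this passed to setup(), pip install -e . would resolve the dependencies at install time instead of failing later at import.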

To Reproduce

pip3 install -e .
python3 -m pytest -v tests/unit_test/processor_units/test_processor_units.py
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /mnt/zejun/smp/data/python_star_2000repo/MatchZoo/venv_test_7/bin/python3.7
cachedir: .pytest_cache
rootdir: /mnt/zejun/smp/data/python_star_2000repo/MatchZoo
plugins: cov-3.0.0, mock-3.6.1
collecting ... collected 0 items / 1 error

==================================== ERRORS ====================================
___ ERROR collecting tests/unit_test/processor_units/test_processor_units.py ___
ImportError while importing test module '/mnt/zejun/smp/data/python_star_2000repo/MatchZoo/tests/unit_test/processor_units/test_processor_units.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.7/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/unit_test/processor_units/test_processor_units.py:4: in <module>
    from matchzoo.preprocessors import units
matchzoo/__init__.py:20: in <module>
    from . import preprocessors
matchzoo/preprocessors/__init__.py:1: in <module>
    from . import units
matchzoo/preprocessors/units/__init__.py:13: in <module>
    from .tokenize import Tokenize
matchzoo/preprocessors/units/tokenize.py:2: in <module>
    from matchzoo.utils.bert_utils import is_chinese_char,
matchzoo/utils/__init__.py:4: in <module>
    from .make_keras_optimizer_picklable import make_keras_optimizer_picklable
matchzoo/utils/make_keras_optimizer_picklable.py:1: in <module>
    import keras
venv_test_7/lib/python3.7/site-packages/keras/__init__.py:21: in <module>
    from tensorflow.python import tf2
E   ModuleNotFoundError: No module named 'tensorflow'
=========================== short test summary info ============================
ERROR tests/unit_test/processor_units/test_processor_units.py
!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
=============================== 1 error in 0.91s ===============================

Describe your attempts

[x] I checked the documentation and found no answer

[x] I checked to make sure that this is not a duplicate issue

Context

Ubuntu

bug
opened by idiomaticrefactoring 1
GPU utilization is low (1%)
Describe the bug

Run the example in Get Started in 60 Seconds.

Context

OS: Ubuntu 18.04

Hardware: Tesla 80k, CUDA 10.1, cuDNN 7.0

MatchZoo 2.2.0, TensorFlow 2.2.0, Keras 2.3.0

bug
opened by lonelydancer 0

Releases(v2.2)

v2.2(Oct 9, 2019)

v2.1(Apr 4, 2019)
add automation modules

mz.auto.tuner that automatically searches for model hyper-parameters

mz.auto.preparer that unifies model preprocessing and training processes

add QuoraQP dataset

rewrite mz.DataGenerator to be callback-based

fix models behaviors under classification tasks

reorganize project structure, the most significant one being moving processor_units to preprocessors.units

rename redundant names (e.g. NaiveModel -> Naive, TokenizeUnit -> Tokenize)

update the tutorials

various other updates


Owner

Neural Text Matching Community
