3 Repositories
Latest Python Libraries
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
18 Nov 28, 2022
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble
datasketch: Big Data Looks Small. datasketch gives you probabilistic data structures that can process and search very large amounts of data super fast, ...
1.9k Jan 07, 2023
Implementing a simplified copy of the Shazam application from scratch using MinHashing and LSH.
Building Shazam from scratch: in this repository we tried to implement a simplified copy of the Shazam application, able to tell you the name of a song ...
0 Nov 17, 2022
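
The projects listed above all revolve around MinHash-style similarity estimation. As a rough illustration of the idea, here is a minimal standard-library sketch. It is not the code of any listed repository and does not use the datasketch API; the function names (`shingles`, `minhash_signature`, `estimate_jaccard`), the use of salted BLAKE2b as the hash family, and the default of 128 hash functions are all assumptions made for this example.

```python
import hashlib


def shingles(text, k=3):
    """Split text into a set of overlapping k-word shingles (hypothetical helper)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def _hash(item, seed):
    """64-bit hash of a string; the seed is used as a BLAKE2b salt."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_perm=128):
    """For each of num_perm seeded hash functions, keep the minimum hash over the set.

    `items` must be a non-empty collection of strings.
    """
    return [min(_hash(item, seed) for item in items) for seed in range(num_perm)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots approximates the Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


if __name__ == "__main__":
    doc_a = "you have won a prize click the link to claim your prize now"
    doc_b = "you have won a prize click the link below to claim your reward"
    sig_a = minhash_signature(shingles(doc_a))
    sig_b = minhash_signature(shingles(doc_b))
    print("estimated Jaccard similarity:", estimate_jaccard(sig_a, sig_b))
```

Comparing fixed-length signatures instead of full shingle sets is what makes near-duplicate detection cheap. An LSH index would then group signatures into bands so that only likely matches are compared, a step this sketch leaves out.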