This repository contains implementations of Machine Learning algorithms written from scratch in Python. The mathematics required for ML and several projects are also included.
Key :-
1️⃣ Python Basics 🔴 Not Done Yet
a. Python basics :- variables, lists, sets, tuples, loops, functions, lambda functions, dictionaries, input methods; the rest are completed
b. Python OOP
c. File and Error Handling
d. Iteration Protocol and Generators
2️⃣ Data Acquisition
a. Data Acquisition using Beautiful Soup
b. Data Acquisition using Web APIs
3️⃣ Python Libraries :-
a. Numpy
b. Matplotlib
c. Seaborn
d. Pandas
🔴Plotly
4️⃣ Feature Selection and Extraction
a. Feature Selection - Chi-squared test, Random Forest Classifier
b. Feature Extraction - Principal Component Analysis
💯
Basics of Machine Learning
1️⃣ Basic
✅Types of ML
✅Challenges in ML
✅Overfitting and Underfitting
🔴Testing and Validation
🔴Cross Validation
🔴Grid Search
🔴Random Search
🔴Confusion Matrix
🔴Precision, Recall, F1 Score
🔴ROC-AUC Curve
2️⃣ Predictive Modelling
🔴Introduction to Predictive Modelling
🔴Model in Analytics
🔴Business Problem and Prediction Model
🔴Phases of Predictive Modelling
🔴Data Exploration for Modelling
🔴Data and Patterns
🔴Identifying Missing Data
🔴Outlier Detection
🔴Z-Score
🔴IQR
🔴Percentile
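As an illustration of the outlier-detection topics above (Z-Score, IQR, Percentile), here is a minimal sketch using only the standard library. It is not the repository's code: the function names `zscore_outliers` and `iqr_outliers`, the default threshold of 3 and the 1.5 × IQR fence are assumptions chosen because they are common conventions.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds `threshold`."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs((v - mean) / std) > threshold]


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(n=4) yields the three quartile cut points (25th, 50th, 75th percentile)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


data = [10, 12, 11, 13, 12, 11, 10, 12, 95]
print(zscore_outliers(data, threshold=2.0))
print(iqr_outliers(data))
```

On small samples the z-score of a single extreme point is bounded, so a threshold of 3 can miss it; the IQR rule is less sensitive to that effect.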
🔥
Machine-Learning
1️⃣ K-Nearest Neighbour
- Theory
- Implementation
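A from-scratch K-Nearest Neighbour classifier can be sketched in a few lines. The version below is an illustrative assumption, not the repository's implementation: `knn_predict` is a hypothetical name, distance is Euclidean, and the prediction is a plain majority vote among the `k` closest training points.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    # Pair every training point's Euclidean distance to the query with its label
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])  # nearest first
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, (2, 2), k=3))
```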
2️⃣ Linear Regression
- What is Linear Regression
- What is gradient descent
- Implementation of gradient descent
- Importance of Learning Rate
- Types of Gradient Descent
- Making predictions on data set
- Contour and Surface Plots
- Visualizing Loss function and Gradient Descent
🔴 Polynomial Regression
🔴Regularization
🔴Ridge Regression
🔴Lasso Regression
🔴Elastic Net and Early Stopping
- Multivariate Linear Regression on the Boston Housing dataset
- Optimization of Multivariate Linear Regression
- Using Scikit-learn for Linear Regression
- Closed Form Solution
- LOWESS - Locally Weighted Regression
- Maximum Likelihood Estimation
- Project - Air Pollution Regression
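To make the gradient descent and closed-form topics above concrete, here is a minimal single-feature sketch. It assumes a mean squared error loss and batch gradient descent; the names `fit_gradient_descent` and `fit_closed_form`, the learning rate and the epoch count are illustrative assumptions rather than the repository's code.

```python
def fit_gradient_descent(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w*x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of (1/n) * sum(error^2) with respect to w and b
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def fit_closed_form(xs, ys):
    """Fit y = w*x + b with the least-squares closed-form solution."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    w = covariance / variance
    b = mean_y - w * mean_x
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
print(fit_gradient_descent(xs, ys))
print(fit_closed_form(xs, ys))
```

Both functions should agree closely on this data. A learning rate that is too large makes the gradient descent version diverge, which is the point of the "Importance of Learning Rate" topic.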
3️⃣ Logistic Regression
- Hypothesis function
- Log Loss
- Proof of Log loss by MLE
- Gradient Descent Update rule for Logistic Regression
- Gradient Descent Implementation of Logistic Regression
🔴Multiclass Classification
- Scikit-learn implementation of Logistic Regression on a chemical classification dataset
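The hypothesis function, log loss and gradient descent update rule listed above fit together as in the following sketch. It assumes a single feature, labels in {0, 1} and batch updates; `sigmoid`, `log_loss` and `fit_logistic` are hypothetical names used for illustration only.

```python
import math


def sigmoid(z):
    """Hypothesis function: map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(ys, probs, eps=1e-12):
    """Average negative log-likelihood of labels `ys` under predicted `probs`."""
    total = 0.0
    for y, p in zip(ys, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(ys)


def fit_logistic(xs, ys, learning_rate=0.1, epochs=1000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        probs = [sigmoid(w * x + b) for x in xs]
        # The gradient of the log loss is the average of (prediction - label) * input
        grad_w = sum((p - y) * x for p, y, x in zip(probs, ys, xs)) / n
        grad_b = sum(p - y for p, y in zip(probs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
print(log_loss(ys, [sigmoid(w * x + b) for x in xs]))
```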
4️⃣ Natural Language Processing
- Bag of Words Pipeline
- Tokenization and Stopword Removal
- Regex based Tokenization
- Stemming & Lemmatization
- Constructing Vocab
- Vectorization with Stopwords Removal
- Bag of Words Model - Unigram, Bigram, Trigram, n-gram
- TF-IDF Normalization
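The pipeline steps above (regex tokenization, stopword removal, n-grams, TF-IDF) can be sketched with the standard library alone. Everything here is an illustrative assumption: the tiny `STOPWORDS` set, the function names, and the specific TF-IDF variant (term frequency normalised by document length, times `log(N / document_frequency)`). Libraries such as Scikit-learn use a smoothed variant, so their numbers will differ.

```python
import math
import re
from collections import Counter

STOPWORDS = {"the", "is", "a", "on", "and"}  # tiny illustrative list, not exhaustive


def tokenize(text):
    """Lowercase, split on non-alphanumeric characters, drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def ngrams(tokens, n):
    """Return the contiguous n-grams of `tokens` as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: in how many documents each term appears
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


print(ngrams(tokenize("The cat sat on the mat"), 2))
print(tf_idf(["the cat sat", "the dog sat"]))
```

With this variant, a term that appears in every document gets weight zero, which is the intended down-weighting of uninformative words.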
5️⃣ Naive Bayes
- Bayes Theorem Formula
- Bayes Theorem - Spam or not
- Bayes Theorem - Disease or not
- Mushroom Classification
- Text Classification
- Laplace Smoothing
- Multivariate Bernoulli Naive Bayes
- Multivariate Event Model Naive Bayes
- Multivariate Bernoulli Naive Bayes vs Multivariate Event Model Naive Bayes
- Gaussian Naive Bayes
🔴 Project on Naive Bayes
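The "Disease or not" and Laplace Smoothing topics above come down to two short formulas, sketched below. The function names and the example numbers (1% prevalence, 99% sensitivity, 5% false positive rate) are made up for illustration and do not come from the repository.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Bayes' theorem: P(disease | positive test)."""
    # Total probability of a positive test, over diseased and healthy cases
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence


def laplace_smoothed_likelihood(word_count, total_words, vocab_size, alpha=1.0):
    """P(word | class) with additive smoothing, so unseen words never get probability 0."""
    return (word_count + alpha) / (total_words + alpha * vocab_size)


# Hypothetical test: 1% prevalence, 99% sensitivity, 5% false positive rate
print(posterior(prior=0.01, sensitivity=0.99, false_positive_rate=0.05))

# A word never seen in a class with 10 words total and a 5-word vocabulary
print(laplace_smoothed_likelihood(word_count=0, total_words=10, vocab_size=5))
```

Even with a sensitive test, the posterior stays well below one half here because the disease is rare, which is the usual lesson of this example.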
6️⃣ Decision Tree
- Entropy
- Information Gain
- Process Kaggle Titanic Dataset
- Implementation of Information Gain
- Implementation of Decision Tree
- Making Predictions
- Decision Trees using Scikit-learn
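Entropy and information gain, the split criteria listed above, can be sketched as follows. This assumes a binary split into non-empty `left` and `right` label lists and base-2 logarithms; the function names are hypothetical and the snippet is not taken from the repository.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, left, right):
    """Reduction in entropy from splitting `labels` into `left` and `right`."""
    n = len(labels)
    weighted_child_entropy = (
        (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    )
    return entropy(labels) - weighted_child_entropy


labels = [1, 1, 0, 0]
print(entropy(labels))
print(information_gain(labels, [1, 1], [0, 0]))  # a split that separates the classes
print(information_gain(labels, [1, 0], [1, 0]))  # a split that tells us nothing
```

A decision tree built on this criterion would evaluate each candidate split and keep the one with the highest information gain.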
7️⃣ Support Vector Machine
- SVM Implementation in Python
🔴Different Types of Kernel
🔴Project on SVC
🔴Project on SVR
8️⃣ Principal Component Analysis
🔴 PCA in Python
🔴 PCA Project
🔴 Fail Case of PCA (Swiss Roll)
9️⃣ K-Means
🔴 Implementation in Python
- Implementation using Libraries
- K-Means ++
- DBSCAN
🔴 Project
🔟 Ensemble Methods and Random Forests
🔴Ensemble and Voting Classifiers
🔴Bagging and Pasting
🔴Random Forest
🔴Extra Tree
🔴 AdaBoost
🔴 Gradient Boosting
🔴 Gradient Boosting with Sklearn
🔴 Stacking Ensemble Learning
1️⃣1️⃣ Unsupervised Learning
🔴 Hierarchical Clustering
🔴 DBSCAN
🔴 BIRCH
🔴 Mean-Shift
🔴 Affinity Propagation
🔴 Anomaly Detection
🔴Spectral Clustering
🔴 Gaussian Mixture
🔴 Bayesian Gaussian Mixture Models
💯
Mathematics required for Machine Learning
1️⃣ Statistics:
a. Measures of central tendency – mean, median, mode
b. Measures of dispersion – mean deviation, standard deviation, quartile deviation, skewness and kurtosis
c. Correlation coefficient, regression, least squares principles of curve fitting
2️⃣ Probability:
a. Introduction, finite sample spaces, conditional probability and independence, Bayes’ theorem, one dimensional random variable, mean, variance.
3️⃣ Linear Algebra :- scalars, vectors, matrices, tensors, transpose, broadcasting, matrix multiplication, Hadamard product, norms, determinants, solving linear equations
📚
Handwritten notes with implementations and mathematical derivations of each algorithm from scratch
✅ KNN
✅ Linear Regression
✅ Logistic Regression
✅ Feature Selection and Extraction
✅ Naive Bayes
🙌
Projects :-
🔅 Movie Recommendation System
🔅 Diabetes Classification
🔅 Handwriting Recognition
🔅 LinkedIn Web Scraping
🔅 Air Pollution Regression
Owner
Vanshika Mishra
I am a Data Science enthusiast. Research and open source pique my interest.