
feature/Adaboost implementation #207

Open
JasonShin opened this issue Jan 19, 2019 · 6 comments
Assignees
Labels
feature New feature that does not exist in Kalimdor.js yet

Comments


JasonShin commented Jan 19, 2019

  • I'm submitting a ...
    [x] feature request

  • Summary

An AdaBoost classifier is a meta-estimator that begins by fitting a classifier on the original dataset and then fits additional copies of the classifier on the same dataset but where the weights of incorrectly classified instances are adjusted such that subsequent classifiers focus more on difficult cases.

As a first step toward adding boosting models to the library, AdaBoost is an ideal model to start with, since implementing it will also help me understand how boosting works.
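For reference, the boosting loop described above can be sketched with single-feature threshold stumps in numpy. This is a rough sketch following the structure of the ML-From-Scratch example, not the final library API; `fit_adaboost` and its return format are illustrative names only:

```python
import numpy as np

def fit_adaboost(X, y, n_clf=5):
    """Train AdaBoost with one-feature threshold stumps. Labels y must be in {-1, 1}."""
    n_samples, n_features = X.shape
    w = np.full(n_samples, 1 / n_samples)  # start with uniform sample weights
    clfs = []
    for _ in range(n_clf):
        best = None
        min_error = float("inf")
        # Greedily pick the stump (feature, threshold, polarity) with lowest weighted error
        for feature_i in range(n_features):
            for threshold in np.unique(X[:, feature_i]):
                p = 1
                preds = np.ones(n_samples)
                preds[X[:, feature_i] < threshold] = -1
                error = np.sum(w[y != preds])
                if error > 0.5:
                    # Flipping the polarity flips the error
                    error, p = 1 - error, -1
                if error < min_error:
                    min_error = error
                    best = {"p": p, "threshold": threshold, "feature_index": feature_i}
        # alpha: the classifier's vote weight, larger for lower weighted error
        best["alpha"] = 0.5 * np.log((1.0 - min_error) / (min_error + 1e-10))
        preds = np.ones(n_samples)
        negative_idx = best["p"] * X[:, best["feature_index"]] < best["p"] * best["threshold"]
        preds[negative_idx] = -1
        # Re-weight: boost the weight of misclassified samples, then renormalize
        w *= np.exp(-best["alpha"] * y * preds)
        w /= np.sum(w)
        clfs.append(best)
    return clfs

# Toy 1-D dataset: negatives below 2, positives at or above
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
clfs = fit_adaboost(X, y, n_clf=3)
print(clfs[0])  # a stump splitting feature 0 at threshold 2.0
```

Each round reweights the samples so that the next stump is fitted against the mistakes of the previous ones, which is the "focus more on difficult cases" behaviour from the summary.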

  • Illustration

  • References
@JasonShin JasonShin added this to the Sprint 1 milestone Jan 19, 2019
@JasonShin JasonShin self-assigned this Jan 19, 2019
@JasonShin JasonShin added the feature New feature that does not exist in Kalimdor.js yet label Jan 19, 2019
JasonShin (Member Author) commented

I found that the ML-From-Scratch example and the StatQuest explanation align exactly in terms of implementation. I will base the feature primarily on the ML-From-Scratch example.

JasonShin (Member Author) commented

Started the implementation here: https://github.com/machinelearnjs/machinelearnjs/tree/feature/adaboost


JasonShin commented Jan 26, 2019

@BenjaminMcDonald suggested the following refactoring:

tf.sum(tf.where(tf.equal(y, pred), tf.zerosLike(w), w))

is equivalent to

// Sum of the weights of misclassified samples, e.g.
// w = [0.213, 0.21342], y = [1, 2], prediction = [2, 2]
// -> only index 0 is misclassified -> error = 0.213
let error = Array.from(w.dataSync())
  .filter((_, index) => y[index] !== prediction[index])
  .reduce((total, x) => total + x, 0); // initial value guards the no-misclassification case
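For cross-checking against the Python implementation, the same misclassified-weight sum is a one-line boolean mask in numpy (a sketch using the example values from the comment above, with `w`, `y`, and `prediction` as plain arrays):

```python
import numpy as np

w = np.array([0.213, 0.21342])   # sample weights (values from the comment above)
y = np.array([1, 2])             # true labels
prediction = np.array([2, 2])    # stump predictions

# Sum the weights of misclassified samples via a boolean mask
error = np.sum(w[y != prediction])
print(error)  # 0.213 -- only the first sample is misclassified
```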

JasonShin (Member Author) commented

The fit function seems to be working fine now.

There's still a problem with predict.


JasonShin commented Jan 31, 2019

Prediction is still always returning 1s.

To debug the issue:

  1. Grab the classifier attributes: alpha, polarity, threshold, and feature_index
  2. Get the Python implementation
  3. Try running Python's predict using the classifier attributes/save points

Observations:

  1. The classifiers' attributes are all the same.

Alternative solutions:

  1. Find a different example
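A quick way to see why identical classifier attributes matter: if every stump shares the same polarity, threshold, and feature, the weighted vote is just one stump scaled by the summed alphas, so the ensemble can never disagree with that single stump (minimal numpy sketch; the stump values are rounded from the saved attributes below):

```python
import numpy as np

# Six identical stumps, as observed in the saved classifier attributes
clfs = [{"p": -1, "threshold": 0.6455, "feature_index": 0, "alpha": 11.51}] * 6

X = np.array([[5.4], [0.1], [7.7], [0.2]])
votes = np.zeros(len(X))
for clf in clfs:
    preds = np.ones(len(X))
    preds[clf["p"] * X[:, clf["feature_index"]] < clf["p"] * clf["threshold"]] = -1
    votes += clf["alpha"] * preds

# Every round repeats the same stump's vote, so the ensemble adds no diversity:
# the sign of the sum is exactly the single stump's prediction
print(np.sign(votes))
```

So identical attributes collapse the ensemble to one weak learner, which explains why the output is constant whenever that one stump predicts the same class for all the test samples.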

JasonShin (Member Author) commented

Python experiment

import numpy as np

clfs = [
    { "p": -1, "threshold": 0.6455696225166321, "feature_index": 0, "alpha": 11.512925148010254 },
    { "p": -1, "threshold": 0.6455696225166321, "feature_index": 1, "alpha": 11.512925148010254 },
    { "p": -1, "threshold": 0.6455696225166321, "feature_index": 2, "alpha": 11.512925148010254 },
    { "p": -1, "threshold": 0.6455696225166321, "feature_index": 3, "alpha": 11.512925148010254 },
    { "p": -1, "threshold": 0.6455696225166321, "feature_index": 4, "alpha": 11.512925148010254 },
    { "p": -1, "threshold": 0.6455696225166321, "feature_index": 5, "alpha": 11.512925148010254 },
]

test_x = np.array([ [ 5.4, 3, 4.5, 1.5 ],
  [ 5.6, 2.7, 4.2, 1.3 ],
  [ 7.7, 3, 6.1, 2.3 ],
  [ 5, 3.6, 1.4, 0.2 ],
  [ 6.3, 3.4, 5.6, 2.4 ],
  [ 5.2, 3.5, 1.5, 0.2 ],
  [ 5.5, 2.6, 4.4, 1.2 ],
  [ 5.2, 2.7, 3.9, 1.4 ],
  [ 4.9, 2.5, 4.5, 1.7 ],
  [ 6.6, 3, 4.4, 1.4 ],
  [ 6.2, 2.9, 4.3, 1.3 ],
  [ 5.5, 4.2, 1.4, 0.2 ],
  [ 6.9, 3.1, 5.1, 2.3 ],
  [ 6.5, 3.2, 5.1, 2 ],
  [ 5.5, 3.5, 1.3, 0.2 ],
  [ 6.3, 2.8, 5.1, 1.5 ],
  [ 4.7, 3.2, 1.3, 0.2 ],
  [ 7.2, 3, 5.8, 1.6 ],
  [ 6.1, 2.8, 4, 1.3 ],
  [ 7.4, 2.8, 6.1, 1.9 ],
  [ 6.7, 3.1, 4.4, 1.4 ],
  [ 5.1, 2.5, 3, 1.1 ],
  [ 4.4, 2.9, 1.4, 0.2 ],
  [ 4.9, 3.1, 1.5, 0.1 ],
  [ 5.9, 3, 5.1, 1.8 ],
  [ 6.7, 3.1, 4.7, 1.5 ],
  [ 5.1, 3.5, 1.4, 0.3 ],
  [ 6.1, 2.9, 4.7, 1.4 ],
  [ 5.6, 3, 4.5, 1.5 ],
  [ 7, 3.2, 4.7, 1.4 ],
  [ 6, 3, 4.8, 1.8 ],
  [ 6.4, 2.9, 4.3, 1.3 ],
  [ 5.6, 2.9, 3.6, 1.3 ],
  [ 6.5, 3, 5.2, 2 ],
  [ 6.1, 2.8, 4.7, 1.2 ],
  [ 6.3, 3.3, 6, 2.5 ],
  [ 4.9, 3.1, 1.5, 0.1 ],
  [ 5.7, 3.8, 1.7, 0.3 ] ])

def predict(X):
    n_samples = np.shape(X)[0]
    y_pred = np.zeros((n_samples, 1))
    # For each classifier => label the samples
    for clf in clfs:
        # Set all predictions to '1' initially
        predictions = np.ones(np.shape(y_pred))
        # The indexes where the sample values are below threshold
        negative_idx = (clf['p'] * X[:, clf['feature_index']] < clf['p'] * clf['threshold'])
        # Label those as '-1'
        predictions[negative_idx] = -1
        # Add predictions weighted by the classifiers alpha
        # (alpha indicative of classifier's proficiency)
        y_pred += clf['alpha'] * predictions

    # Return sign of prediction sum
    y_pred = np.sign(y_pred).flatten()

    return y_pred

# predictions = predict(test_x)

# print(predictions)

print(test_x[:, 4])
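One thing this experiment surfaces (presumably why the last line prints `test_x[:, 4]`): `test_x` has only 4 features, so the `feature_index` values of 4 and 5 in the saved classifiers are out of range, and indexing column 4 raises an `IndexError`. A defensive check along these lines (a hypothetical helper, not from the branch) would catch corrupted save points early:

```python
import numpy as np

def validate_clfs(clfs, X):
    """Raise early if any saved stump indexes a feature that X does not have."""
    n_features = X.shape[1]
    bad = [c["feature_index"] for c in clfs if not 0 <= c["feature_index"] < n_features]
    if bad:
        raise ValueError(f"feature_index out of range for {n_features} features: {bad}")

X = np.zeros((3, 4))  # 4 features, like the iris test set above
clfs = [{"feature_index": 3}, {"feature_index": 5}]
try:
    validate_clfs(clfs, X)
except ValueError as e:
    print(e)  # feature_index out of range for 4 features: [5]
```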

@JasonShin JasonShin removed this from the Sprint 1 milestone May 10, 2019