sklearn_src


sklearn, software that uses scikit-learn, which is a Python-based library for machine learning computations.

  1. blob_classify_kernelized_svm, a scikit-learn code which uses a kernelized support vector machine to classify an artificial dataset of groups of "blobs".
  2. blob_classify_logistic_multi, a scikit-learn code which uses multiple applications of logistic regression to classify an artificial dataset of three groups of "blobs".
  3. blob_cluster_kmeans, a scikit-learn code which uses the k-means algorithm to cluster blob data.
  4. cancer_classify_decision, a scikit-learn code which uses a decision tree algorithm to classify the breast cancer dataset, comparing the training and testing accuracy as the depth of the tree is varied.
  5. cancer_classify_forest, a scikit-learn code which uses the random forest algorithm to classify the breast cancer dataset.
  6. cancer_classify_gradboost, a scikit-learn code which uses the gradient boosting algorithm to classify the breast cancer dataset.
  7. cancer_classify_knn, a scikit-learn code which uses the k-nearest neighbor algorithm to classify the breast cancer dataset, comparing the training and testing accuracy as the number of neighbors is increased.
  8. cancer_classify_logistic, a scikit-learn code which uses logistic regression to classify the breast cancer dataset, investigating the influence of the C parameter.
  9. cancer_classify_mlp, a scikit-learn code which uses a multilayer perceptron to classify the breast cancer dataset.
  10. cancer_classify_svm_rbf, a scikit-learn code which uses the support vector algorithm with RBF kernel on the cancer dataset, showing that the data should be rescaled to avoid overfitting.
  11. cancer_scale_minmax, a scikit-learn code which uses min-max scaling to preprocess the cancer dataset.
  12. cancer_visualize_histogram, a scikit-learn code which displays all 30 features of the cancer dataset as histograms of feature frequency for malignant versus benign cases.
  13. cancer_visualize_pca, a scikit-learn code which uses principal component analysis (PCA) of the cancer dataset to visualize the difference between malignant and benign cases.
  14. circle_classify_gradboost, a scikit-learn code which uses the gradient boosting algorithm to classify the artificial circle dataset, and then determines the prediction uncertainties.
  15. digits_visualize_pca, a scikit-learn code which uses principal component analysis (PCA) of the digits dataset to visualize the grouping of data.
  16. digits_visualize_tsne, a scikit-learn code which uses t-distributed stochastic neighbor embedding (t-SNE) of the digits dataset to visualize the grouping of data.
  17. faces_classify_knn, a scikit-learn code which uses the k-nearest neighbor algorithm to match new faces with images in the faces dataset.
  18. faces_classify_nmf, a scikit-learn code which uses the non-negative matrix factorization (NMF) algorithm to match new faces with images in the faces dataset.
  19. faces_classify_pca, a scikit-learn code which uses principal component analysis (PCA) to match new faces with images in the faces dataset.
  20. forge_classify_knn, a scikit-learn code which uses the k-nearest neighbor algorithm to choose one of two classes for each of 26 items in the forge dataset, involving two features.
  21. forge_classify_svm, a scikit-learn code which uses the support vector machine (SVM) classifier to choose one of two classes for each of 26 items in the forge dataset, involving two features.
  22. handcrafted_classify_svm_rbf, a scikit-learn code which uses the support vector algorithm with RBF kernel on the handcrafted dataset.
  23. housing_data_fetch, a scikit-learn code which fetches a housing dataset from GitHub and stores it locally.
  24. iris_classify_gradboost, a scikit-learn code which uses the gradient boosting algorithm to classify the iris dataset, and then determines the prediction uncertainties.
  25. iris_classify_knn, a scikit-learn code which uses the k-nearest neighbor algorithm to classify the species of iris specimens, based on 150 sets of four measurements: sepal and petal width and length.
  26. logistic_regression, a scikit-learn code which uses logistic regression to classify data.
  27. moon_classify_forest, a scikit-learn code which uses the random forest algorithm to classify samples of the artificial moon dataset.
  28. moon_classify_mlp, a scikit-learn code which uses a multilayer perceptron method to classify samples of the artificial moon dataset.
  29. ram_regression_decision, a scikit-learn code which uses a decision tree algorithm to perform regression on the RAM price dataset.
  30. ram_regression_linear, a scikit-learn code which uses linear regression to perform regression on the RAM price dataset.
  31. signal_classify_nmf, a scikit-learn code which uses non-negative matrix factorization (NMF) to match new signals to items in the signal dataset.
  32. study_classify_logistic, a scikit-learn code which uses the logistic regression algorithm to classify the outcome of students based on study time.
  33. tester, a BASH script which runs the tests.
  34. wave_regression_knn, a scikit-learn code which uses the k-nearest neighbor algorithm to form a regression predictor for the wave dataset.
  35. wave_regression_ols, a scikit-learn code which uses the ordinary least squares algorithm to form a regression predictor for the wave dataset.
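
As a rough illustration of the kind of experiment listed above, the following minimal sketch follows the description of cancer_classify_decision: it fits decision trees of increasing depth to the breast cancer dataset and reports training and testing accuracy for each. It is not the code from this directory. The function name depth_sweep, the particular depths, the stratified split, and the random seed are all assumptions chosen for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def depth_sweep(depths=(1, 2, 3, 4, 5, None), seed=0):
    """Return (depth, train_accuracy, test_accuracy) for each tree depth.

    depth_sweep is a hypothetical helper for this sketch; the depths and
    the seed are illustrative choices, not values taken from the catalog.
    """
    X, y = load_breast_cancer(return_X_y=True)
    # Hold out a stratified test set (default 25%) for the comparison.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, stratify=y, random_state=seed
    )
    results = []
    for depth in depths:
        # max_depth=None lets the tree grow until every leaf is pure.
        tree = DecisionTreeClassifier(max_depth=depth, random_state=seed)
        tree.fit(X_train, y_train)
        results.append(
            (depth, tree.score(X_train, y_train), tree.score(X_test, y_test))
        )
    return results


if __name__ == "__main__":
    for depth, train_acc, test_acc in depth_sweep():
        label = "unlimited" if depth is None else depth
        print(f"max_depth={label}: train={train_acc:.3f}, test={test_acc:.3f}")
```

An unlimited-depth tree typically fits the training data perfectly, so the quantity of interest is how the testing accuracy behaves as the depth limit is relaxed.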


Last revised on 28 March 2024.