Supervised Learning in Machine Learning – New Questions

A) Data, labels, and annotation reality

  1. What minimum causal label-signal fraction is required for generalization when k% of labels are heuristics or social proxies, and how can we estimate it from observable statistics alone?
  2. Can we design protocols that prove an annotator guideline set is self-consistent (no hidden contradictions) before a single label is collected?
  3. When label noise is correlated with protected attributes, which reweighting scheme reduces disparity without inflating risk on minority subpopulations?
  4. What is the optimal joint budget split between new examples and relabeling existing examples for a fixed error target under class imbalance and long-tail regimes?
  5. Can we infer the latent “taxonomy drift” (evolving class definitions) from versioned labels and automatically propose the minimal taxonomy edit that restores consistency?

B) Objectives, losses, and alignment with use

  1. For cost-sensitive problems, can a loss be constructed that is provably equivalent to a stakeholder utility function under unknown prevalence shift?
  2. When ground truth is interval- or set-valued (e.g., medical grading ranges), which surrogate loss yields calibrated decisions and stable learning dynamics?
  3. How can we regularize to prefer counterfactually stable predictions (unchanged under plausible feature edits) without access to a structural causal model?

C) Generalization, complexity, and sample efficiency

  1. What is the tightest practically estimable bound on test risk using only train-time traces (gradients, curvature, sharpness), without a held-out set?
  2. Do compression-based flatness metrics remain predictive of generalization under heavy data augmentation and label smoothing—or do they saturate?
  3. Can we decompose generalization error into “representation debt” vs “classifier head debt” to guide where to spend data vs compute?

D) Distribution shift, OOD, and temporal change

  1. Which single diagnostic (among density ratio, feature skew, conditional drift) most reliably predicts decision boundary failure under covariate + label shift?
  2. Can we pre-train a representation that is provably stable to specified environmental factors (season, sensor, jurisdiction) with only weak metadata?
  3. What is the minimal online adaptation signal (few labeled points per week?) that prevents catastrophic performance decay over a year-long domain shift?

E) Evaluation, uncertainty, and monitoring

  1. How should we construct a test set to guarantee lower-variance estimates of extreme-tail error (e.g., the riskiest 0.1%) with limited labeling budget?
  2. Which uncertainty proxy (ensembles, deep evidential, conformal) is most actionable for rejection/deferral decisions under class imbalance?
  3. Can we create a single, scale-free metric that penalizes both overconfidence and calibration failure across strata, replacing dozens of disjoint plots?

F) Robustness, security, and failure analysis

  1. What defense class simultaneously mitigates: (a) small-norm adversaries, (b) natural corruptions, and (c) spurious-feature reliance—without losing accuracy on clean data?
  2. Is there a principled way to measure feature fragility (sensitivity to semantically irrelevant changes) separate from classical robustness metrics?
  3. How do we attribute post-deployment failures to training data, objective choice, or data pipeline bugs with a formal blame-allocation method?

G) Interpretability, causal claims, and recourse

  1. Which explanation method remains stable across re-trains and seeds—enough to be contractually dependable for regulated audits?
  2. Can supervised models yield counterfactual recourse recommendations that are both feasible for users and do not induce feedback loops (gaming)?
  3. How can we separate genuine causal features from persistent dataset artifacts when both produce equally strong saliency signals?

H) Fairness, subgroup reliability, and equity

  1. Under multiple overlapping sensitive attributes, what is the minimal subgroup coverage needed to bound worst-group error within ε, given finite data?
  2. Which training-time intervention (reweighting, constraints, adversarial debiasing) preserves subgroup calibration after deployment drift?
  3. Can we define fairness tests that are robust to silently shifting label policies across regions or annotator pools?

I) Efficiency, scaling, and hardware

  1. What is the compute–data Pareto frontier for supervised learning in the low-label, high-augmentation regime, and where does synthetic data change the slope?
  2. Which distillation strategy best preserves calibration and tail accuracy when compressing large teachers into edge models?
  3. Can we co-design data pipelines with hardware scheduling (prefetch, mixed precision, cache locality) to reduce wall-time without altering convergence?

J) Human-in-the-loop, active learning, and programmatic supervision

  1. What is the optimal acquisition policy when annotator time is non-uniform (experts vs crowd) and label latency alters class prevalence?
  2. When using weak/heuristic labels at scale, which denoising approach prevents systematic bias amplification while still improving sample efficiency?
  3. Can we auto-detect annotation guidelines that induce “label shortcuts” and propose revised prompts that reduce shortcut frequency?

K) Privacy, governance, and lifecycle

  1. What privacy budget (ε, δ) meaningfully protects individuals and preserves minority-class recall in small populations?
  2. How do we cryptographically verify that a production model was trained only on policy-compliant datasets (provenance attestation) without exposing IP?
  3. Can we formalize deprecation policies for models and datasets (sunset triggers) tied to measurable drift or harm thresholds?

L) Cross-paradigm leverage (from SL to foundation models)

  1. What is the most data-efficient way to fine-tune foundation models for supervised tasks while preserving domain calibration and avoiding spurious features?
  2. Can we design task-aware adapters that guarantee no degradation on critical sub-slices compared to full fine-tuning?
  3. How do we quantify when supervised fine-tuning unlearns useful prior invariances from pretraining—and prevent it?

M) Productization and guarantees

  1. What contract-grade SL metrics (with tolerance bands) meaningfully reflect downstream utility better than raw accuracy/AUC?
  2. Can we publish verifiable model cards that tie evaluation slices to automatic monitors and alert thresholds post-deployment?

 

Executive Verdicts — NTZE (“next-to-zero evidence”) audit

Value for Science — Grade (concise)

Overall: 4.2 / 5 (84/100)

Why the grade is high: The set is high-novelty (NTZE-heavy) yet strongly connected to active literatures; many prompts are decision-relevant and can be tested with modest resources (benchmarks, simulators, A/B tests), making them potent catalysts for near-term progress.

Subscores

  • Novelty (evidence scarcity, frontier): 4.7/5

  • Connectivity/association (to existing theory/practice): 4.2/5

  • Actionability (testability in ≤12 months): 3.8/5

  • Rigor-readiness (clarity of metrics, falsifiers, PICO-like framing): 3.4/5

  • Potential impact (if resolved): 4.5/5

Highest scientific ROI clusters (to prioritize):

  • D12 (single diagnostic for boundary failure under mixed shift)

  • E17 (scale-free miscalibration/overconfidence metric across strata)

  • K34 (verifiable training-data provenance/attestation)

  • H24–H25 (worst-group guarantees under overlapping attributes; drift-proof subgroup calibration)

  • C9–C10 (train-time trace bounds; flatness under heavy augmentation/label smoothing)

Fast wins (≤3 months, seed studies):

  • Benchmark deferral/uncertainty under class imbalance with contract-grade utility.

  • Prototype provenance attestation (property proofs + hardware attestation) on a public SL dataset.

  • Public worst-group evaluation suite with overlapping attributes & sample-complexity curves.

NTZE purpose and counts:

  • Purpose of NTZE here: grade novelty/evidence depth, not plausibility. An item can be well-connected to prior work yet still be NTZE if the exact claim lacks controlled tests, benchmarks, or formal results.
  • Overall count (40 prompts):
    • NTZE (E0): 26/40 – frontier, little/no direct empirical or theoretical resolution for the specific claim.
    • Partially evidenced (E1–E2): 12/40 – adjacent evidence/surveys/benchmarks exist but don’t fully settle the claim.
    • Established (E3): 2/40 – mature lines where core mechanisms/guarantees are already known (selective classification/deferral foundations; conformal prediction guarantees).
  • Highest-connectivity but still NTZE: causal label-signal fraction (A1), self-consistent annotation guidelines (A2), single best diagnostic for decision-boundary failure under mixed shifts (D12), scale-free metric penalizing overconfidence+miscalibration across strata (E17), formal blame-allocation for failures (F20), cryptographically complete provenance attestation for training sets (K34 – emerging but incomplete).
  • Prompts with substantive prior art (thus not NTZE, though still open in details):
    • Selective prediction/deferral (E16): rigorous methods & DNN implementations.
    • Conformal prediction for actionable uncertainty (E16/E17, LLMs & human-in-the-loop): distribution-free guarantees; early RCTs for decision aid.
    • Flatness/sharpness vs generalization (C10/C9): extensive theory/algorithms (SAM/ESAM/Fisher-SAM), but “tightest practically estimable bound from train-time traces alone” remains NTZE.
    • Worst-group performance & overlapping groups (H24/H25): strong Group-DRO & intersectional fairness literature; the exact finite-data guarantees under overlapping attributes remain partially open.
    • DP under class imbalance & minority recall (K33): active results show difficult trade-offs; far from closed.
    • Knowledge distillation preserving calibration/tail accuracy (I28): many studies but no consensus winner across regimes.

Claim Table — NTZE grading per item (concise)

#  | Claim nucleus (abridged) | Verdict | Confidence | Evidence level | Notes (≤30 words)
1  | Minimum causal label-signal fraction for generalization, from observables | NTZE | Mod | E0 | No practical estimator with guarantees.
2  | Prove guideline self-consistency pre-labeling | NTZE | Mod | E0 | No general proof systems; formalizations ad hoc.
3  | Reweighting under attribute-correlated label noise reduces disparity without minority harm | Partial | Mod | E1 | Group-robust/fairness work adjacent; not settled.
4  | Optimal budget split: new data vs relabeling under long tail | NTZE | Mod | E0 | No closed-form guidance beyond heuristics.
5  | Infer taxonomy drift and auto-propose minimal edits | NTZE | Mod | E0 | Dataset ontology drift tools immature.
6  | Loss provably equivalent to stakeholder utility under unknown prevalence shift | NTZE | Low | E0 | Utility-consistent losses under shift unproven.
7  | Surrogate loss for interval/set-valued labels with calibration + stability | Partial | Low | E1 | Pieces exist; unified recipe missing.
8  | Regularize for counterfactual stability without an SCM | NTZE | Mod | E0 | No SCM-free guarantees beyond heuristics.
9  | Tight test-risk bound from train-time traces only | NTZE | Mod | E0 | Bounds exist but not tight/practical.
10 | Flatness metrics predict generalization under heavy augmentation/label smoothing | Partial | High | E1 | Evidence mixed; open under strong augmentation.
11 | Decompose generalization error into representation vs classifier-head debt to guide spend | NTZE | Low | E0 | No accepted decomposition/estimators.
12 | Single diagnostic best predicts boundary failure under covariate + label shift | NTZE | High | E0 | Surveys show no single winner.
13 | Pretrain representations provably stable to environment factors with weak metadata | NTZE | Mod | E0 | Lacks proofs with weak tags.
14 | Minimal online adaptation signal to prevent year-long decay | NTZE | Mod | E0 | No universal sample-complexity curves.
15 | Build test set for lower-variance extreme-tail error with limited budget | NTZE | Mod | E0 | Active area; no standard.
16 | Which uncertainty proxy is actionable for deferral under imbalance | Partial→Established (foundations) | High | E2–E3 | Selective prediction solid; imbalance specifics open.
17 | Single, scale-free metric penalizing overconfidence + miscalibration across strata | NTZE | High | E0 | Many proxies; no accepted single metric.
18 | One defense class mitigates adversaries + corruptions + spurious reliance without clean-accuracy loss | NTZE | Mod | E0 | Trilemma persists.
19 | Measure feature fragility distinct from robustness | NTZE | Low | E0 | Definitions/benchmarks lacking.
20 | Formal blame allocation: training data vs objective vs pipeline | NTZE | Mod | E0 | Causal debugging remains unsolved.
21 | Explanation method stable across retrains/seeds for audits | NTZE | Mod | E0 | Stability guarantees scarce.
22 | Counterfactual recourse that is feasible and avoids feedback loops | Partial | Low | E1 | Recourse literature exists; loop-safety open.
23 | Separate causal features from artifacts when saliency is identical | NTZE | Mod | E0 | Needs causal supervision/experiments.
24 | Minimal subgroup coverage to bound worst-group error with overlapping attributes | Partial | High | E1 | Group-DRO & overlapping-group theory partial.
25 | Training-time debiasing that preserves subgroup calibration after drift | NTZE | Mod | E0 | Post-drift calibration under fairness not settled.
26 | Fairness tests robust to shifting label policies | NTZE | Mod | E0 | Hard open problem.
27 | Compute–data Pareto for low-label, high-augmentation regime; where synthetic data changes the slope | Partial | Low | E1 | Scaling laws exist; this regime open.
28 | Distillation that preserves calibration & tail accuracy on edge models | Partial | Mod | E1 | Many methods; no universal winner.
29 | Co-design data pipelines with hardware scheduling to reduce wall-time without changing convergence | NTZE | Low | E0 | Lacks formal invariance proofs.
30 | Optimal acquisition when annotators differ & latency shifts prevalence | NTZE | Mod | E0 | Some active-learning pieces; no full solution.
31 | Weak-label denoising that improves efficiency without bias amplification | NTZE | Mod | E0 | Context-dependent; no guarantee.
32 | Auto-detect guideline-induced label shortcuts & auto-fix prompts | NTZE | Low | E0 | Shortcut detectors exist; auto-repair open.
33 | Privacy budgets that protect individuals and preserve minority recall | Partial | Mod | E1 | Trade-offs documented; recipe unknown.
34 | Cryptographically verify training on policy-compliant data (provenance attestation) | Partial | Mod | E1 | Distributional property attestation exists; full provenance hard.
35 | Formal sunset triggers for models/datasets tied to drift/harm | NTZE | Low | E0 | Governance patterns exist, not formalized triggers.
36 | Data-efficient fine-tuning of foundation models with preserved domain calibration & no spurious features | Partial | Low | E1 | Many recipes; guarantees lacking.
37 | Task-aware adapters guaranteeing no degradation on critical sub-slices | NTZE | Mod | E0 | No worst-slice guarantees.
38 | Quantify when fine-tuning unlearns pretraining invariances & prevent it | NTZE | Mod | E0 | No accepted diagnostics/guards.
39 | Contract-grade SL metrics tied to downstream utility beyond AUC/accuracy | NTZE | Mod | E0 | Domain-specific; no general contractable set.
40 | Verifiable model cards linked to monitors & alerts | Partial | Mod | E1 | Property-attested/zk-aided cards proposed; early.

Legend: E0=NTZE, E1=adjacent/partial literature, E2=controlled evidence, E3=established.


Evidence Dossiers — strongest anchors (illustrative)

  • Selective prediction / deferral: Formal risk-coverage trade-offs; DNN implementations; contract-friendly framing of “actionable uncertainty.”
  • Conformal prediction: Distribution-free uncertainty guarantees; growing applied evidence, including human-study benefit for decisions (a minimal sketch follows this list).
  • Flatness–generalization link & SAM family: Significant empirical/theoretical backing; open under heavy augmentation/smoothing.
  • Group-DRO & overlapping groups: Worst-group generalization and intersectional fairness theory; still few finite-sample coverage recipes for overlapping subgroups.
  • DP × class imbalance: Active 2024–2025 work shows trade-offs; no general guarantees on minority recall preservation.
  • Provenance/property attestation & verifiable cards: Cryptographic/TEE-based property attestations; comprehensive provenance proofs remain open.
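
To make the conformal anchor concrete, here is a minimal split-conformal sketch in Python/NumPy. All function and variable names are illustrative rather than taken from any cited work; it assumes an already-trained classifier whose predicted class probabilities are available on a held-out calibration split, and it returns prediction sets that cover the true label with probability roughly 1 - alpha under exchangeability.

    import numpy as np

    def conformal_prediction_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
        # cal_probs:  (n_cal, n_classes) predicted probabilities on a calibration split
        # cal_labels: (n_cal,) integer true labels for that split
        # test_probs: (n_test, n_classes) predicted probabilities on new points
        # Returns a boolean (n_test, n_classes) matrix; True means the class is kept.
        n = len(cal_labels)
        # Nonconformity score: one minus the probability assigned to the true class.
        scores = 1.0 - cal_probs[np.arange(n), cal_labels]
        # Finite-sample-corrected (1 - alpha) quantile of the calibration scores.
        level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
        qhat = np.quantile(scores, level, method="higher")
        # Keep every class whose nonconformity score stays below the threshold.
        return (1.0 - test_probs) <= qhat

Swapping in ensemble disagreement or evidential uncertainty as the score is exactly the kind of head-to-head comparison the deferral questions above call for.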

Bias & Heuristics Analysis

  • Connectivity inflation: Many prompts are highly connected to live literatures; NTZE guards against mistaking association for tested evidence.
  • Metric myopia: Over-reliance on AUC/accuracy hides tail & subgroup harms—your prompts emphasize deferral, tails, and worst-group.
  • Publication bias: Surveys can suggest maturity; NTZE reminds that benchmarked, general solutions remain rare.

Missing/Uncertain Signals

  • Unified, scale-free miscalibration metric across strata (E17): no consensus.
  • One-diagnostic OOD “oracle” (D12): surveys agree the winner is contextual.
  • Formal contract metrics (M39) & automated governance triggers (K35): conceptual only.

Frontier/Decisive Study Designs (fast wins)

  • Selective-prediction under imbalance: head-to-head comparison of ensembles vs conformal vs evidential nets with deferral-utility ground truth (see the risk-coverage sketch after this list).
  • Overlapping-subgroup budgeting: sample-complexity curves for worst-group error with intersectional partitions using Group-DRO baselines.
  • Provenance attestation pilot: combine distributional property proofs + hardware attestation to verify dataset policy compliance on a public benchmark.
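
For the selective-prediction design, a risk-coverage curve is the standard readout: sort points by confidence, defer the least confident, and track error on what is kept. The sketch below is a minimal, illustrative Python/NumPy version (names are assumptions, not from any cited paper); the confidence score could come from ensembles, conformal set size, or evidential uncertainty.

    import numpy as np

    def risk_coverage_curve(confidences, correct, n_points=20):
        # confidences: (n,) confidence scores (e.g., max softmax, negative set size)
        # correct:     (n,) 1 if the prediction was right, 0 otherwise
        order = np.argsort(-np.asarray(confidences))   # most confident first
        errors = 1 - np.asarray(correct)[order]
        n = len(errors)
        coverages, risks = [], []
        for k in np.linspace(1, n, n_points, dtype=int):
            coverages.append(k / n)                    # fraction of points kept
            risks.append(errors[:k].mean())            # selective risk at that coverage
        return np.array(coverages), np.array(risks)

Comparing these curves per subgroup, or weighting them by a deferral utility, is one way to get the imbalance-aware comparison the fast win asks for.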

Methods (Search Log)

  • Source families: arXiv/preprints; PMLR/ICML/ICLR proceedings; ACM/TOIS; PMC journals; surveys; standards/cryptography.
  • Queries & timestamps (Europe/Berlin, ISO-8601):
    • 2025-10-18T13:31 — “selective classification reject option Geifman El-Yaniv 2017 pdf” → methods & ICML’19 extension.
    • 2025-10-18T13:33 — “sharpness flatness generalization SAM Foret pdf” → SAM/ESAM/Fisher-SAM.
    • 2025-10-18T13:36 — “OOD detection survey 2024 2025 decision boundary failure diagnostic” → 2024/2025 surveys.
    • 2025-10-18T13:39 — “Group DRO worst-group Sagawa 2019 2020 ICLR pdf” → DRO corpus.
    • 2025-10-18T13:42 — “overlapping groups fairness theory NeurIPS 2020 pdf” → theory paper & surveys.
    • 2025-10-18T13:45 — “conformal prediction introduction Angelopoulos 2023; human decision RCT 2024” → MAL survey; RCT.
    • 2025-10-18T13:48 — “differential privacy class imbalance minority recall 2024 2025” → arXiv/ICML 2025 results.
    • 2025-10-18T13:51 — “verifiable ML property cards attestation cryptographic proof training data” → Laminator + distributional attestation.

Overall Abstract (plain-language)

Your supervised-learning “not asked yet” list is high-novelty and decision-relevant. Most items are NTZE because the exact claims lack controlled proofs, standardized benchmarks, or end-to-end demonstrations—despite strong connections to active literatures (selective prediction, Group-DRO, conformal, DP, KD). That’s precisely why NTZE is valuable: it marks where rigor is still missing and helps prioritize fast, decisive experiments (especially around deferral, worst-group error, provenance, and DP×imbalance).


 

Supervised Learning in Machine Learning: Core Concepts, Algorithms, and Applications

Supervised learning is a foundational approach in machine learning where models are trained on labeled data to predict outcomes for new, unseen data. It is widely used for both classification (categorical outcomes) and regression (continuous outcomes) tasks, with applications spanning healthcare, finance, engineering, and more.

Key Types and Algorithms in Supervised Learning

Supervised learning encompasses two main problem types: classification (predicting discrete labels) and regression (predicting continuous values) (Jiang et al., 2020; Shetty et al., 2022; Uddin et al., 2019; Anon., 2024). Common algorithms include the following (a brief comparison sketch follows the list):

  • Decision Trees and Random Forests: Popular for their interpretability and strong performance, especially Random Forests in disease prediction (Shetty et al., 2022; Syed & Lokhande, 2024; Uddin et al., 2019; Ouadah et al., 2022).
  • Support Vector Machines (SVM): Frequently used for classification, especially in biomedical and pattern recognition tasks (Syed & Lokhande, 2024; Uddin et al., 2019; Bzdok et al., 2018; Anon., 2024).
  • Naive Bayes: Known for simplicity and effectiveness in text and spam detection (Syed & Lokhande, 2024; Anon., 2024).
  • K-Nearest Neighbors (KNN): Effective for large datasets and pattern recognition (Bzdok et al., 2018; Ouadah et al., 2022).
  • Neural Networks: Powerful for complex, high-dimensional data, though not always covered in basic surveys (Anon., 2024).
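
As a concrete illustration of the algorithm families just listed, the following Python sketch compares several of them with 5-fold cross-validated accuracy on a small built-in benchmark. It assumes scikit-learn is installed; the dataset and hyperparameters are arbitrary demonstration choices, not values taken from the cited surveys.

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)   # small labeled benchmark

    models = {
        "decision_tree": DecisionTreeClassifier(random_state=0),
        "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
        "svm_rbf": make_pipeline(StandardScaler(), SVC()),
        "naive_bayes": GaussianNB(),
        "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    }

    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validated accuracy
        print(f"{name:>15}: {scores.mean():.3f} +/- {scores.std():.3f}")

Which model wins depends on the data properties discussed below; the point of the sketch is only to show how such comparisons are typically run.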

Model Building, Validation, and Performance

The supervised learning process involves the following stages (a minimal end-to-end sketch follows the list):

  • Data Preparation: Collecting and preprocessing labeled data (Jiang et al., 2020; Anon., 2024).
  • Model Training: Learning patterns from input-output pairs (Jiang et al., 2020; Shetty et al., 2022; Anon., 2024).
  • Validation and Testing: Using cross-validation and confusion matrices to assess accuracy, speed, and risk of overfitting (Jiang et al., 2020; Singh et al., 2016; Kolosova & Berestizhevsky, 2020; Anon., 2024).
  • Algorithm Selection: Based on data characteristics like size, heterogeneity, and linearity (Shetty et al., 2022; Ouadah et al., 2022).
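
The same stages can be shown end to end in a short sketch (again assuming scikit-learn; the model and split choices are illustrative): prepare labeled data, train on the training split, then validate on held-out data with a confusion matrix and per-class metrics.

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import classification_report, confusion_matrix
    from sklearn.model_selection import train_test_split

    # Data preparation: labeled examples split into training and held-out test sets.
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=0
    )

    # Model training: learn input-output patterns from the training split.
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X_train, y_train)

    # Validation and testing: confusion matrix and per-class metrics on held-out data.
    y_pred = model.predict(X_test)
    print(confusion_matrix(y_test, y_pred))
    print(classification_report(y_test, y_pred))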

Applications and Comparative Performance

Supervised learning is applied in diverse fields such as medical diagnosis, image and speech recognition, fraud detection, and predictive maintenance (Jiang et al., 2020; Shetty et al., 2022; Syed & Lokhande, 2024; Uddin et al., 2019; Pruneski et al., 2022; Ouadah et al., 2022). Random Forests and SVMs often show superior accuracy in disease prediction, while KNN excels with large datasets (Uddin et al., 2019; Ouadah et al., 2022). The choice of algorithm depends on the specific problem and data properties.

Timeline of Supervised Learning Research

  • 2015: 1 paper (Libbrecht & Noble, 2015)
  • 2016: 1 paper (Singh et al., 2016)
  • 2017: 2 papers (Choudhary & Gianey, 2017; Zhou, 2018)
  • 2018: 1 paper (Bzdok et al., 2018)
  • 2019: 3 papers (Sen et al., 2019; Uddin et al., 2019; Van Engelen & Hoos, 2019)
  • 2020: 2 papers (Jiang et al., 2020; Kolosova & Berestizhevsky, 2020)
  • 2021: 1 paper (Shirobokov et al., 2021)
  • 2022: 5 papers (Gupta et al., 2022; Shetty et al., 2022; Tiwari, 2022; Pruneski et al., 2022; Ouadah et al., 2022)
  • 2023: 2 papers (Ali & Khan, 2023; Ren et al., 2023)
  • 2024: 2 papers (Syed & Lokhande, 2024; Anon., 2024)

Figure 1: Research on supervised learning spans decades, with recent growth in applications and algorithm development. Larger markers indicate more influential papers.
Subtopic | Key papers (citations)
Algorithm Comparisons | Singh et al., 2016; Shetty et al., 2022; Syed & Lokhande, 2024; Choudhary & Gianey, 2017; Sen et al., 2019; Uddin et al., 2019; Anon., 2024; Ouadah et al., 2022
Healthcare Applications | Jiang et al., 2020; Uddin et al., 2019; Pruneski et al., 2022
Model Validation & Evaluation | Jiang et al., 2020; Singh et al., 2016; Kolosova & Berestizhevsky, 2020; Anon., 2024
Weak/Semi-supervised Learning | Zhou, 2018; Ren et al., 2023; Van Engelen & Hoos, 2019

Figure 2: Key papers for major supervised learning subtopics.

Summary

Supervised learning is central to modern machine learning, offering robust solutions for classification and regression across many domains. Algorithm choice and model validation are critical for success, with Random Forests, SVMs, and KNN among the most widely used methods. The field continues to evolve, with ongoing research into algorithm performance, applications, and extensions like weak and semi-supervised learning.

 

References

Jiang, T., Gradus, J., & Rosellini, A. (2020). Supervised Machine Learning: A Brief Primer. Behavior Therapy, 51(5), 675-687. https://doi.org/10.1016/j.beth.2020.05.002

Singh, A., Thakur, N., & Sharma, A. (2016). A review of supervised machine learning algorithms. 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), 1310-1315.

Gupta, V., Mishra, V., Singhal, P., & Kumar, A. (2022). An Overview of Supervised Machine Learning Algorithm. 2022 11th International Conference on System Modeling & Advancement in Research Trends (SMART), 87-92. https://doi.org/10.1109/smart55829.2022.10047618

Kolosova, T., & Berestizhevsky, S. (2020). Supervised Machine Learning. Methodology of Educational Measurement and Assessment. https://doi.org/10.1007/978-1-4419-1428-6_5931

Shetty, S., Shetty, S., Singh, C., & Rao, A. (2022). Supervised Machine Learning: Algorithms and Applications. Fundamentals and Methods of Machine and Deep Learning. https://doi.org/10.1002/9781119821908.ch1

Syed, I., & Lokhande, V. (2024). An Overview of the Supervised Machine Learning. International Research Journal of Modernization in Engineering Technology and Science. https://doi.org/10.56726/irjmets51366

Tiwari, S. (2022). Supervised Machine Learning: A Brief Introduction. Proceedings of the International Conference on Virtual Learning – VIRTUAL LEARNING – VIRTUAL REALITY (17th edition). https://doi.org/10.58503/icvl-v17y202218

Choudhary, R., & Gianey, H. (2017). Comprehensive Review On Supervised Machine Learning Algorithms. 2017 International Conference on Machine Learning and Data Science (MLDS), 37-43. https://doi.org/10.1109/mlds.2017.11

Sen, P., Hajra, M., & Ghosh, M. (2019). Supervised Classification Algorithms in Machine Learning: A Survey and Review. Advances in Intelligent Systems and Computing. https://doi.org/10.1007/978-981-13-7403-6_11

Ali, A., & Khan, W. (2023). A Supervised Machine Learning Algorithms: Applications, Challenges, and Recommendations. Proceedings of the Pakistan Academy of Sciences: A. Physical and Computational Sciences. https://doi.org/10.53560/ppasa(60-4)831

Zhou, Z. (2018). A brief introduction to weakly supervised learning. National Science Review, 5, 44-53. https://doi.org/10.1093/nsr/nwx106

Uddin, S., Khan, A., Hossain, M., & Moni, M. (2019). Comparing different supervised machine learning algorithms for disease prediction. BMC Medical Informatics and Decision Making, 19. https://doi.org/10.1186/s12911-019-1004-8

Ren, Z., Wang, S., & Zhang, Y. (2023). Weakly supervised machine learning. CAAI Transactions on Intelligence Technology, 8, 549-580. https://doi.org/10.1049/cit2.12216

Pruneski, J., Pareek, A., Kunze, K., Martin, R., Karlsson, J., Oeding, J., Kiapour, A., Nwachukwu, B., & Williams, R. (2022). Supervised machine learning and associated algorithms: applications in orthopedic surgery. Knee Surgery, Sports Traumatology, Arthroscopy, 31, 1196-1202. https://doi.org/10.1007/s00167-022-07181-2

Van Engelen, J., & Hoos, H. (2019). A survey on semi-supervised learning. Machine Learning, 109, 373-440. https://doi.org/10.1007/s10994-019-05855-6

Bzdok, D., Krzywinski, M., & Altman, N. (2018). Machine learning: Supervised methods, SVM and kNN. Nature Methods, 1-6.

Libbrecht, M., & Noble, W. (2015). Machine learning applications in genetics and genomics. Nature Reviews Genetics, 16, 321-332. https://doi.org/10.1038/nrg3920

Anon. (2024). Supervised Machine Learning: A Brief Survey of Approaches. Al-Iraqia Journal of Scientific Engineering Research. https://doi.org/10.58564/ijser.2.4.2023.121

Ouadah, A., Zemmouchi-Ghomari, L., & Salhi, N. (2022). Selecting an appropriate supervised machine learning algorithm for predictive maintenance. The International Journal of Advanced Manufacturing Technology, 119, 4277-4301. https://doi.org/10.1007/s00170-021-08551-9

Shirobokov, M., Trofimov, S., & Ovchinnikov, M. (2021). Survey of machine learning techniques in spacecraft control design. Acta Astronautica, 186, 87-97. https://doi.org/10.1016/j.actaastro.2021.05.018