# Learning Rate Analysis
The learning rate controls the step size during gradient descent. Four values were tested with the Adam optimizer and ReLU activations.

## Results
| Learning Rate | Outcome |
|---|---|
| 0.1 | Model failed to learn (~51% accuracy) |
| 0.01 | 100% test accuracy (small-dataset variance artifact) |
| 0.001 | Most stable and generalizable result |
| 0.0001 | Very slow convergence |
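The qualitative pattern in the table can be reproduced on a toy problem. The sketch below (an illustration, not the actual experiment) runs plain gradient descent on a steep 1D quadratic; each of the four learning rates from the table lands in the same regime the table reports:

```python
# Toy illustration: gradient descent on the steep quadratic f(x) = 50 * x**2,
# whose gradient is f'(x) = 100 * x. Each update multiplies x by (1 - 100 * lr),
# so the learning rate alone decides divergence vs. convergence speed.

def descend(lr, steps=200, x=1.0):
    """Run `steps` gradient-descent updates and return the final |x|."""
    for _ in range(steps):
        x -= lr * 100 * x  # gradient step on f(x) = 50 * x**2
    return abs(x)

for lr in (0.1, 0.01, 0.001, 0.0001):
    print(f"lr={lr:<7} final |x| = {descend(lr):.3e}")
```

Here 0.1 blows up (each step multiplies the error by -9), 0.001 shrinks the error by 10% per step, and 0.0001 makes progress so slowly that 200 steps barely move it. On this particular toy, 0.01 happens to hit the minimum exactly, which echoes how one lucky configuration can look deceptively perfect.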
## Analysis
### 0.1 — Too Large
A learning rate of 0.1 caused training to diverge. The steps are so large that the optimizer overshoots the loss minimum and bounces around without settling. The resulting ~51% accuracy is essentially random guessing on a binary classification problem.

### 0.01 — Suspicious 100%
A perfect 100% test accuracy is a red flag on a dataset of only 1,025 samples. With a few hundred test samples, random variation in the train/test split can produce a perfect score even when the model is slightly overfit, so this is not a reliable result to report as the best configuration.

### 0.001 — Sweet Spot
A learning rate of 0.001 produced the most consistent and trustworthy results across multiple runs. The training and validation loss curves descend smoothly and converge together, indicating genuine generalization rather than memorization.

### 0.0001 — Too Small
A very small learning rate means very small gradient steps. The model still learns, but extremely slowly. In practice this means it may not converge within the allocated number of epochs, leaving accuracy lower than it could be with more training time.

## Takeaway
A learning rate of 0.001 is the standard default for Adam, and this experiment confirms why: the surrounding values (0.1 diverges, 0.0001 converges too slowly) show how sensitive training is to this hyperparameter.
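For reference, 0.001 is the default learning rate in the original Adam formulation (and in common framework implementations such as PyTorch's `torch.optim.Adam`). A minimal single-parameter version of the update rule, using those standard defaults, looks like this:

```python
import math

# Minimal single-parameter Adam with the standard defaults
# (lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8).
def adam_minimize(grad, x, lr=0.001, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=6000):
    m = v = 0.0  # first and second moment estimates
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g      # exponential average of gradients
        v = beta2 * v + (1 - beta2) * g * g  # exponential average of squared gradients
        m_hat = m / (1 - beta1 ** t)         # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Minimizing f(x) = (x - 3)^2 from x = 0; the iterate approaches 3.
x_final = adam_minimize(lambda x: 2 * (x - 3), x=0.0)
print(x_final)
```

Because Adam normalizes each step by the running gradient magnitude, the effective step size is roughly the learning rate itself, which is part of why 0.001 is a robust default across problems.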