Simulated Annealing - From Spin Glasses to Optimization
Simulated annealing is one of the most successful and widely used optimization algorithms inspired by physics. Proposed by Kirkpatrick, Gelatt, and Vecchi in 1983, it mimics the physical process of annealing: slowly cooling a material so that it settles into its lowest-energy crystalline state. The algorithm has found applications in everything from VLSI design to protein folding.
For background on the physics that inspired this algorithm, see Spin Glasses.
The Physical Analogy
In physical annealing:
- Heat: Material is heated to high temperature, atoms move freely
- Cool: Temperature is slowly decreased
- Crystallize: Atoms settle into low-energy crystalline structure
In simulated annealing:
- High temperature: System explores configuration space freely
- Cooling schedule: Temperature gradually decreases
- Ground state: System settles into low-energy (optimal) configuration
The key insight is that slow cooling allows the system to escape local minima and find the global minimum, just like in physical annealing.
The Algorithm
Simulated annealing works as follows:
1. Initialize: Start with random configuration, set T = T₀ (high)
2. While T > T_min:
a. For each iteration:
- Propose random move (e.g., flip a spin, swap cities)
- Calculate energy change ΔE
- Accept move with probability P = min(1, exp(-ΔE/T))
b. Decrease temperature: T = α·T (where 0 < α < 1)
3. Return final configuration
The Metropolis acceptance criterion is crucial:
$
P(\text{accept}) = \begin{cases}
1 & \text{if } \Delta E < 0 \text{ (always accept improvements)} \\
\exp(-\Delta E / T) & \text{if } \Delta E \ge 0 \text{ (probabilistically accept worse moves)}
\end{cases}
$
At high temperature, the system accepts many "bad" moves, exploring widely. As temperature decreases, it becomes more selective, eventually only accepting improvements.
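The loop above can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation: the function name, the toy double-well landscape, and all parameter defaults are my own choices.

```python
import math
import random

def simulated_anneal(energy, x0, t0=10.0, t_min=1e-3, alpha=0.95, iters_per_temp=100):
    """Minimize `energy` using the annealing loop described above."""
    random.seed(0)                     # fixed seed for reproducibility
    x, e = x0, energy(x0)
    best_x, best_e = x, e              # track the best state ever visited
    t = t0
    while t > t_min:
        for _ in range(iters_per_temp):
            # Propose a random move: a small Gaussian perturbation.
            x_new = x + random.gauss(0.0, 1.0)
            e_new = energy(x_new)
            delta = e_new - e
            # Metropolis criterion: always accept improvements,
            # accept uphill moves with probability exp(-delta / t).
            if delta < 0 or random.random() < math.exp(-delta / t):
                x, e = x_new, e_new
                if e < best_e:
                    best_x, best_e = x, e
        t *= alpha                     # exponential cooling step
    return best_x, best_e

# Toy double-well landscape: local minimum near x ≈ 1.6,
# global minimum near x ≈ -1.8 (energy ≈ -12.5).
f = lambda x: x**4 - 6 * x**2 + 2 * x
x_best, e_best = simulated_anneal(f, x0=5.0)
```

Tracking the best-so-far state is a common practical addition: the final state of the chain is usually near-optimal, but the best state visited along the way can be slightly better.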
Cooling Schedules
The choice of cooling schedule is critical for performance:
Linear Cooling
$
T(t) = T_0 - \alpha t
$
Simple, but often cools too quickly, so the search can get trapped in local minima.
Exponential Cooling
$
T(t) = T_0 \alpha^t
$
The most common choice; it provides a smooth transition. Typical values of α lie between 0.8 and 0.99.
Logarithmic Cooling
$
T(t) = \frac{T_0}{\log(1 + t)}
$
Theoretically optimal for convergence, but very slow in practice.
Adaptive Cooling
Adjust cooling rate based on acceptance rate:
- High acceptance rate → cool faster
- Low acceptance rate → cool slower
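For comparison, the three fixed schedules can be written directly from their formulas. The initial temperature and parameter defaults below are arbitrary illustration values:

```python
import math

T0 = 10.0  # illustrative initial temperature

def linear(t, rate=0.05):
    """T(t) = T0 - rate*t, clipped at zero; simple but often cools too fast."""
    return max(T0 - rate * t, 0.0)

def exponential(t, alpha=0.95):
    """T(t) = T0 * alpha^t; the most common schedule."""
    return T0 * alpha ** t

def logarithmic(t):
    """T(t) = T0 / log(1 + t), for t >= 1; provably convergent but very slow."""
    return T0 / math.log(1 + t)

# After 200 steps the exponential schedule is essentially frozen,
# while the logarithmic schedule is still warm.
temps = (linear(200), exponential(200), logarithmic(200))
```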
Applications
Traveling Salesman Problem
Find the shortest route visiting all cities exactly once. Simulated annealing can find near-optimal solutions for thousands of cities.
The algorithm starts with a random tour and gradually improves it as the temperature decreases: at high temperature the tour changes dramatically, while at low temperature it converges to a near-optimal solution. See Markov Chains, Traveling Salesman and Spin Glasses for more details on the connection between the TSP and spin glasses.
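A standard move for the TSP is the 2-opt segment reversal, which applies the same Metropolis loop to a tour. The sketch below is illustrative: the function names and parameters are my own, and the tour length is recomputed from scratch for clarity rather than updated incrementally.

```python
import math
import random

def tour_length(cities, tour):
    """Length of the closed tour visiting `cities` in the order given by `tour`."""
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def anneal_tsp(cities, t0=1.0, alpha=0.995, steps=20000, seed=0):
    """Anneal a tour using 2-opt segment reversals as the move set."""
    rng = random.Random(seed)
    tour = list(range(len(cities)))
    length = tour_length(cities, tour)
    t = t0
    for _ in range(steps):
        # Propose a 2-opt move: reverse a randomly chosen segment.
        i, j = sorted(rng.sample(range(len(tour)), 2))
        new_tour = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        new_length = tour_length(cities, new_tour)  # recomputed for clarity
        delta = new_length - length
        if delta < 0 or rng.random() < math.exp(-delta / t):
            tour, length = new_tour, new_length
        t *= alpha
    return tour, length
```

2-opt moves are popular because reversing a segment removes edge crossings, and each uncrossing strictly shortens the tour.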
VLSI Design
Place and route components on integrated circuits to minimize wire length and maximize performance.
Protein Folding
Find the native conformation of proteins by minimizing free energy. The energy landscape is similar to spin glasses—rugged with many local minima.
Machine Learning
- Training neural networks (alternative to gradient descent)
- Feature selection
- Hyperparameter optimization
Scheduling
- Job shop scheduling
- Resource allocation
- Task assignment
Performance Characteristics
Advantages:
- Can escape local minima (unlike greedy algorithms)
- Simple to implement
- Works well for many problem types
- Can handle discrete and continuous variables
Disadvantages:
- No guarantee of finding the global optimum
- Requires careful tuning of the cooling schedule
- Can be slow for large problems
- Performance depends on problem structure
Comparison with Other Methods
vs. Gradient Descent
- Gradient descent: Fast but gets stuck in local minima
- Simulated annealing: Slower but can escape local minima
vs. Genetic Algorithms
- Genetic algorithms: Population-based, explores multiple regions simultaneously
- Simulated annealing: Single trajectory, focuses on promising regions
vs. Quantum Annealing
- Quantum annealing: Uses quantum tunneling, potentially faster for certain problems
- Simulated annealing: Classical, more widely applicable
Practical Tips
- Initial temperature: Should be high enough that ~80% of moves are accepted
- Cooling rate: Balance between speed and quality (α between 0.8 and 0.99 is typical)
- Stopping criterion: Stop when the acceptance rate is very low or the temperature is near zero
- Move generation: Design moves that preserve problem constraints
- Multiple runs: Run multiple times with different random seeds, take the best result
Modern Extensions
Parallel Simulated Annealing
Run multiple independent chains in parallel, periodically exchange information.
Simulated Tempering
Allow temperature to increase occasionally, helping escape deep local minima.
Quantum-Classical Hybrid
Combine quantum annealing with classical simulated annealing for hybrid optimization.
Conclusion
Simulated annealing demonstrates how physical principles can inspire powerful algorithms. By mimicking the slow cooling of materials, it provides a robust method for finding near-optimal solutions to complex optimization problems.
The algorithm's success across diverse applications—from circuit design to protein folding—highlights the universality of energy landscape concepts from spin glass physics.
Related Articles:
- Spin Glasses - The physics behind the algorithm, where the energy landscape concepts originate
- Markov Chains, Traveling Salesman and Spin Glasses - The mathematical framework of Markov chain Monte Carlo methods that power simulated annealing
- Quantum Spin Glass - Quantum extensions of annealing that use quantum tunneling instead of thermal fluctuations
- Quantum Computing - Hybrid Approaches - Quantum annealing and alternative computing approaches

