Introduction to Genetic Algorithms
Genetic algorithms (GAs) are a type of evolutionary algorithm used to generate high-quality solutions to optimization and search problems by relying on bio-inspired operators such as mutation, crossover and selection. GAs were introduced as a computational analog of adaptive systems, inspired by Darwin's theory of natural selection. Starting from a random population of individuals that represent candidate solutions (encoded as chromosomes, the genotype of the problem), GAs apply the principle of survival of the fittest to evolve the population over generations: fitter chromosomes are preferentially selected to breed the next generation, steering the search toward solutions better adapted to their environment.
The main purpose of a genetic algorithm is to find the best solution to a given problem within a large search space of possible solutions. GAs are well suited to optimization problems that are too complex for analytical or enumerative methods, or whose definitions change too often for conventional solution methods to be effective. Candidate application domains include scheduling, engineering optimization, machine learning, economics and more.
This paper provides an in-depth overview of genetic algorithms, including: the basic genetic algorithm process, parameter representation, fitness evaluation, the genetic operators of selection, crossover and mutation, common variations of GAs, advantages and limitations of using GAs, and examples of genetic algorithm applications from the literature.
Genetic Algorithm Process
The standard genetic algorithm follows these basic steps:
Initialization: An initial population of candidate solutions is usually generated randomly. Alternatively, the population can be seeded with a small set of known good (elite) solutions as a starting point.
Fitness Evaluation: The objective or fitness function is used to evaluate each individual in the population and assign a fitness score according to how close it comes to meeting the criteria specified by the objective function.
Selection: Individuals from the current population are selected to breed a new generation. Selection is based on fitness – fitter individuals are typically more likely to be selected.
Crossover: Parts of the selected individuals’ chromosomes are swapped or recombined to form new offspring with the combined traits of the parents. Crossover is the main genetic operator used to combine the genetic material of multiple individuals.
Mutation: With some small mutation probability, each locus (position) in an offspring's chromosome may be randomly altered during reproduction. This introduces variation and helps maintain diversity in the population.
Replacement: The new generation of candidate solutions replaces the current population. Typically, the entire new population replaces the old one, but in elitist schemes the fittest individuals of the old population are carried over to the new one.
Iteration: Steps 2-6 are repeated for a predetermined number of generations or until a termination criterion is met, such as finding a solution of sufficient fitness.
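The steps above can be sketched as a minimal GA loop. The example below maximizes the number of 1-bits in a bit string (the classic OneMax toy problem); the genome length, population size, and operator rates are illustrative assumptions, not prescribed values.

```python
import random

random.seed(0)  # reproducibility for this sketch

GENOME_LEN = 20       # bits per chromosome (assumed toy problem size)
POP_SIZE = 30
GENERATIONS = 50
CROSSOVER_RATE = 0.9
MUTATION_RATE = 0.02  # per-bit mutation probability

def fitness(chrom):
    # OneMax: fitness is simply the count of 1-bits
    return sum(chrom)

def tournament(pop, k=3):
    # Select the fittest of k randomly drawn individuals
    return max(random.sample(pop, k), key=fitness)

def crossover(p1, p2):
    # Single-point crossover with probability CROSSOVER_RATE
    if random.random() < CROSSOVER_RATE:
        point = random.randint(1, GENOME_LEN - 1)
        return p1[:point] + p2[point:]
    return p1[:]

def mutate(chrom):
    # Flip each bit independently with probability MUTATION_RATE
    return [bit ^ 1 if random.random() < MUTATION_RATE else bit
            for bit in chrom]

# Initialization: random population of bit strings
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

# Iteration: selection, crossover and mutation build each new generation
for gen in range(GENERATIONS):
    population = [mutate(crossover(tournament(population),
                                   tournament(population)))
                  for _ in range(POP_SIZE)]

best = max(population, key=fitness)
print(fitness(best))
```

On this toy problem the loop reliably drives fitness well above that of a random bit string; real applications would replace `fitness` and the chromosome encoding with problem-specific choices.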
Parameter Representation
Genetic algorithms rely on parameter representation to map potential solutions to chromosomes or genotypes. This involves specifying:
The encoding of each parameter or variable of the problem into genes or loci within the chromosome.
Chromosome length and structure, which can be fixed or variable. Typical representations are binary bit strings or real-valued vectors.
Mapping function to convert the encoded chromosomes back into an actual solution in the problem space.
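As a concrete illustration of encoding and the mapping function, the sketch below encodes a single real-valued parameter as a fixed-length binary gene and decodes it back into the problem space. The bit width and parameter bounds are assumptions for the example.

```python
# Sketch: encoding a real parameter x in [LO, HI] as an N_BITS-long
# binary gene, plus the mapping back to the problem space.
# N_BITS, LO and HI are illustrative assumptions.
N_BITS = 10
LO, HI = -5.0, 5.0

def decode(bits):
    # Map a bit list (most significant bit first) to a real in [LO, HI]
    as_int = int("".join(map(str, bits)), 2)
    return LO + (HI - LO) * as_int / (2 ** N_BITS - 1)

def encode(x):
    # Quantize x into [0, 2**N_BITS - 1] and emit the bit list
    as_int = round((x - LO) / (HI - LO) * (2 ** N_BITS - 1))
    return [int(b) for b in format(as_int, f"0{N_BITS}b")]

bits = encode(2.5)
print(bits, decode(bits))
```

The resolution of this encoding is (HI - LO) / (2^N_BITS - 1), so the round trip recovers the original value only up to that quantization step; finer precision requires more bits or a real-valued representation.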
Correct representation design is crucial, as it determines the "shape" of the search space and how solutions are modified by genetic operators during the search. Representations should be chosen so that the genetic operators tend to produce valid, beneficial offspring.
Fitness Evaluation
The fitness function (also called the environment) defines the problem space and assigns a fitness score to each individual in the population based on how close it comes to meeting the criteria of a valid solution. It is an objective measure of quality and performance for a given solution to the problem.
Individuals that score high on the fitness evaluation are preferable as they represent better solutions for the problem. The fitness function drives the evolutionary process by survival and reproduction of the fittest. Choosing a meaningful, scalable and appropriately challenging fitness function is key for getting the genetic search to converge effectively.
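Since selection favors high fitness, objectives stated as minimization problems are commonly transformed into non-negative scores to maximize. The sketch below shows one such transform on an assumed toy cost function (not from the source).

```python
# Sketch: converting a minimization objective into a fitness to maximize.
# The quadratic cost function here is an illustrative assumption.
def cost(x):
    return (x - 3.0) ** 2  # minimum cost at x = 3

def fitness(x):
    # Common transform: lower cost yields higher (bounded) fitness
    return 1.0 / (1.0 + cost(x))

print(fitness(3.0), fitness(0.0))
```

Other transforms (rank-based scaling, sigma scaling) serve the same purpose while controlling selection pressure.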
Genetic Operators
Once individuals are randomly initialized and evaluated for fitness, genetic algorithms employ three main genetic operators as follows:
Selection: Individuals are probabilistically selected from the current population based on their fitness score. Fitter individuals have greater chances of being selected to breed the new generation, which biases the search toward better solutions. Common selection methods include roulette-wheel and tournament selection.
Crossover: Portions of the genomes of two selected parent chromosomes are swapped to create new offspring with blended traits. In single-point crossover, a crossover point is randomly chosen and all subsequent genes are exchanged. Offspring inherit useful traits from both parents, allowing the search to explore new potential solutions.
Mutation: Occasional random bit flips or gene alterations are applied to some loci in offspring to maintain diversity and avoid local optima. Mutation introduces random variation and helps prevent the population from prematurely converging on suboptimal solutions. Typical mutation rates are low (1-10%).
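Roulette-wheel (fitness-proportionate) selection, mentioned above, can be sketched directly: each individual receives a slice of the wheel proportional to its fitness. The population and fitness values below are illustrative assumptions.

```python
import random

random.seed(1)  # reproducibility for this sketch

# Assumed tiny population with precomputed fitness scores
population = ["A", "B", "C", "D"]
scores = [1.0, 2.0, 3.0, 4.0]

def roulette(pop, fits):
    # Spin the wheel: each individual's slice is proportional to fitness
    pick = random.uniform(0, sum(fits))
    running = 0.0
    for ind, f in zip(pop, fits):
        running += f
        if running >= pick:
            return ind
    return pop[-1]  # guard against floating-point rounding

# Fitter individuals ("D") should be drawn most often
draws = [roulette(population, scores) for _ in range(10000)]
print({ind: draws.count(ind) for ind in population})
```

Over many draws, selection frequencies approach each individual's share of total fitness (here, "D" is drawn roughly four times as often as "A"), which is exactly the bias toward better solutions described above.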
These operators mimic evolution in nature and enable GAs to discover increasingly better candidate solutions over generations through evolutionary analogy rather than explicit enumeration. Their stochastic nature balances exploration and exploitation of the search space.
Variations of Genetic Algorithms
Due to its biologically inspired foundations, the core template of genetic algorithms can be flexibly adapted. Some key variations include:
Steady-state GA: Replaces only a few individuals at a time rather than the entire population, enabling faster incremental evolution.
Messy GA: Allows variable-length genes and non-homologous crossover instead of a fixed, uniform encoding.
Multi-objective GA: Evolves a population of non-dominated solutions along multiple objectives simultaneously.
Distributed GA: Divides the population over multiple clusters/nodes to perform evaluations and genetic operations in parallel. Speeds up evolution.
Genetic programming: Chromosomes represent computer programs rather than fixed-length strings. Automatically creates computer programs to solve problems.
Estimation of distribution algorithms: Replace the sampling of individuals via crossover and mutation with building and sampling probabilistic models of promising solutions, reducing uninformed random search.
Niched GA: Introduces mechanisms to promote and preserve diversity through speciation and niche formation. Helps maintain distinctive niches of solutions.
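The steady-state variation above differs from the generational loop mainly in its replacement step. The sketch below, on an assumed toy bit-string objective, replaces only the weakest individual when a new offspring improves on it.

```python
import random

random.seed(2)  # reproducibility for this sketch

# Assumed toy objective: fitness is the number of 1-bits
def fitness(chrom):
    return sum(chrom)

# Small illustrative population of 8-bit chromosomes
population = [[random.randint(0, 1) for _ in range(8)] for _ in range(6)]
initial_best = max(fitness(c) for c in population)

for _ in range(100):
    # Breed one offspring via single-point crossover of two random parents
    p1, p2 = random.sample(population, 2)
    point = random.randint(1, 7)
    child = p1[:point] + p2[point:]
    # Steady-state replacement: the child displaces only the current
    # worst individual, and only if it is strictly fitter
    worst = min(range(len(population)), key=lambda i: fitness(population[i]))
    if fitness(child) > fitness(population[worst]):
        population[worst] = child

print(sorted(fitness(c) for c in population))
```

Because only the worst member is ever displaced, the best fitness in the population never decreases, which is one reason steady-state schemes can evolve quickly on problems like this.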
Genetic algorithms are a highly flexible and adaptable framework that can be customized for different real-world applications based on their encoding, operators and supporting mechanisms. Careful parameter tuning can further enhance their performance.
Advantages and Limitations
The GA is a population-based metaheuristic derived from evolutionary principles. Unlike gradient-based optimization techniques, it requires no gradient or derivative information, so it can handle non-linear, discontinuous and complex multimodal problems.
GAs are capable of finding reasonably good and often optimal solutions to difficult problems that are beyond the reach of exact or enumerative solution methods. They can effectively search huge solution spaces that grow exponentially with problem size.
The GA's evolutionary combination of selection, crossover and mutation helps it converge toward global optima rather than becoming trapped in local optima.
Limitations include the inability to guarantee an optimal solution, or convergence at all; premature convergence to suboptimal solutions is possible. GAs require problem-specific design and tuning for best results, and fitness evaluation can become prohibitively costly for some problems.
Example Applications
Some example applications of genetic algorithms that have been extensively studied and reported in literature include:
Travelling salesman problem: Finding a route through a set of cities that minimizes the total distance traveled. GAs offer significant computational runtime savings over brute-force techniques.
Job shop scheduling: Scheduling jobs on machines to minimize tardiness, makespan or other criteria. Outperforms classical priority dispatching rules.
Protein structure prediction: Estimates 3D protein folding conformations most consistent with amino acid sequence. Useful for drug discovery.
Engineering design: GAs applied to tasks like digital filter design, robot controller design and spacecraft antenna design have demonstrated advantages over conventional methods.
Economic modeling: Modeling financial markets, predicting currency exchange rates and evolving trading strategies through genetic programming.
Machine learning: Feature selection, induction of decision trees and unsupervised clustering are emerging application domains, where hybrid GA-based techniques have shown promising results.
Genetic algorithms have proven effective for diverse optimization and search problems due to their robustness, flexibility and scalability. With increasing computational power, their application scope continues to expand across science, engineering and business domains. GAs also remain an area of active algorithmic research.
