Genetic Algorithms Homework
This assignment is out of 50 points, and is worth 5 percent of your grade.
Write a genetic algorithm that follows the standard cycle as described in the lecture notes in
language of your choice that works on linux.gl (C, C++, java, python, perl, lisp, etc).
The only requirements are that your genetic algorithm maintains a
population of 200 at all times and your fitness function takes at most
O(N^2) to compute, where N is the length of a string. The cycle is:
- Initialize random population - this generation is #0
- Evaluate population
- Select survivors (with a selection algorithm of your choice)
- Repopulate with crossover and mutation (with methods of your choice)
- Check to see if the word is correct; if so exit; otherwise go to step 2
Turn in a document that states how your genetic algorithm works, by explaining the main components.
Be sure to at least answer the following questions in your writeup:
- Explain how your fitness function works and the motivation behind it
- How do you select survivors? Approximately how many out of the 200 population will usually be kept?
- How do you do crossover and mutation? At what rate does your mutation change the solution?
Does your mutation rate change over time?
- Explain any additional parameters that can be tweaked in your program
- GA's tend to reach plateaus of fitness if the search goes on for too long.
Does your approach try to tackle this problem in any specific way?
- Did you stray at all from the standard way GA's work in order to make
your GA more effective?
- Does your fitness function meet the O(N^2) restraint (provide informal analysis)
- How do we compile (if needed) and run your code on linux.gl? Please also include this particular information in your header comment.
The goal of your genetic algorithm is to guess a word of length up to 50 characters. The only
characters will be A-Z capital letters and a-z lower case letters (case sensitive).
Some examples: DonMiner, AbCdEaBcDe, IAmBatman, A, AB...
It is suggested you print out statistical information about each generation as it is generated.
Your program must take exactly one command line argument: the string to match. For example:
$ ./myprogram DONMINER. Once the string is found, print out the amount of generations
it took to get there.
Your algorithm must not cheat by looking at the goal string. The only functions that should
know the secret string are the termination check and the fitness function. However, you may use the length of the string wherever you wish (i.e., you may initialize your population so that all of them have the same length as the goal string).
Here is some sample output from my implementation.
For extra credit, your genetic algorithm will be pitted in a competition to find solutions
the fastest (fewest generations). The fastest genetic algorithm will receive 10 points extra credit, with several
runner ups earning less than 10. If you do not follow instructions (use a population of
a different size than 200, fitness function takes more than O(N^2) time, etc.) you will
be disqualified from possible extra credit.
Also, if you "cheat" (e.g., lie about number of generations, use the string incorrectly) you will
be disqualified.
For an additional 4 points of extra credit, implement your genetic algorithm in LISP.