***************************************************************************** Documentation for GenLab Snakes Version 1.0 September 27, 2000 Copyright (c) 2000 Fascination Software All Rights Reserved A product of Fascination Software Co. ***************************************************************************** Legal Stuff: This software (GenLab Snakes Version 1.0) is shareware. You are allowed to copy it, distribute it, and use it. You may not, however, change or alter the files in any way. Despite the unlikelihood of your computer blowing up when you run this program: Fascinations Software, and the author, are not in any way liable for any damages incurred by the use of this software. The user assumes full responsibility for the use of this software. _____________________________________________________________________________ Purpose: The idea for this piece of software began when a friend gave me a Rubik's Snake (TM) which she had procured for me at a trade show. Upon examination of the toy, I realized that it would be easy to encode the path of the snake into a binary DNA code and write a genetic algorithm to use a population of virtual snakes to solve an arbitrary fitness function. Thus the concept for GenLab Snakes was born! After several months of design and evolution, this piece of software is ready for release. GenLab Snakes is a small, fast, and elegant program designed to provide a pleasing visual interface to a sophisticated underlying genetic algorithm. It is a useful tool for investigating the complexities of chaos, pondering the nature of the universe, or just having fun! ============================================================================= Requirements: GenLab Snakes requires a minimum PC compatible configuration of: - Intel Pentium(TM) or better CPU with MMX(TM) support - 8 Megabytes of memory - A super VGA video card with support for VESA VBE 2.1 or better - DOS 6.11, Windows 95, or Windows 98 GenLab Snakes will run without a problem in a Windows 9x DOS command box, or in straight DOS mode. ============================================================================= _____________________________Theory of operation:____________________________ GenLab Snakes uses the principle of genetic evolution to find a solution for a particular fitness function from a huge vector space of possible contenders. It begins this journey by creating a population of 1024 virtual Snakes with randomly generated DNA. The following sequence of steps is then executed continuously: 1. First, the a 3D model is built for each virtual Snake, and a fitness function is evaluated with the data from the 3D model. The value of each Snake's fitness is stored with its DNA. 2. The Snakes are then sorted in order of descending fitness value. The best, or fittest, Snakes are then at the top of the population. 3. After sorting, the Snakes produce children. Child Snakes are generated from pairs of parents. The parents are chosen randomly, but only the top 50% of the Snakes are guaranteed to become parents and survive to the next generation. After the child Snakes are generated, the lower 50% of the population is discarded and the children take their place. 4. The child Snakes are evaluated for their fitness, and the cycle begins again. ______________________Why does this solve a function?________________________ Because of the way that child Snakes are produced, the top 50% of the population are guaranteed to produce children during every generation. The only situation in which a Snake falls out of the top 50% of the population is when a new child Snake has a higher fitness value. As new child Snakes are born and outperform their parents (with respect to the fitness function), they move to the top of the population and begin producing offspring. These offspring may then again outperform their parents, and in this way the population of Snakes converges on an optimal solution to the fitness function. !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Important Safety Tip: All of the numeric displays in this software are in hexadecimal. Deal with it. Hexadecimal is the most reasonable method of displaying binary numbers, so learn to convert to decimal in your head, or better yet just learn to think in hex. ***************************************************************************** Main Screen: When GenLab Snakes starts, it displays the Main Screen and waits for user input in Pause mode. The Main Screen contains several windows which display important information: - Mini Snake Windows = The majority of the Main Screen is occupied by a set of nine small windows, each of which displays one Snake. Initially these windows display the 9 fittest members of the population. One window is selected at all times; the selected window has a red border. The arrow keys may be used to select a different window. - Rank = A small window below the 9 Snake windows shows the rank of the currently selected Snake. A '0' indicates the fittest member of the population, while a '3ff' indicates the least fit Snake in the population. - DNA = The "DNA" window displays the 24-gene DNA code of the currently selected Snake. The DNA is displayed in octal, with only the digits 0-5 used. - Fitness = The "Fitness" window shows the fitness value of the currently selected Snake. - FFunc = The "FFunc" window displays the active Fitness Function. There are 5 different Fitness Functions which may be selected. - Stats = The "Stats" window displays several useful statistical measures: 1. Generation: The current generation number of the population 2. Min Fitness: The fitness value of the least fit Snake in the population 3. Max Fitness: The fitness value of the most fit Snake in the population 4. Avg Fitness: The arithmetic mean average of the fitness values of all Snakes in the population 5. Median Fit: The median fitness value of all Snakes in the population 6. Saturation %: The percentage of Snakes in the population with a fitness value equal to Max Fitness 7. Avg % Rank: The arithmetic mean average of the percent ranks of all Snakes in the population 8. Histogram: The bottom portion of the "Stats" window is occupied with a fully auto-scaling histogram of the fitness values of all Snakes in the population. The left-most bar of the histogram shows the relative count of Snakes with the highest fitness values, while the right-most bar of the histogram shows the relative count of Snakes with the lowest fitness value. This histogram is an excellent indicator of the "maturity" of a population, and can give great insight into the qualitative status of the Snakes. A red colored bar indicates the position of the currently selected Snake on the histogram. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Main Screen Commands: The following key commands are active within the Main Screen: = Rotate the currently selected Snake = View the currently selected Snake in Large View Mode <+,-> = Zoom In or Zoom Out on the currently selected Snake = Select a different window = Quit GenLab Snakes = Pause or Resume the genetic algorithm loop = Select a different page of Snakes <1,2,3,4,5> = Select a Fitness Function = Go to the Snake Gallery = Save the state of the Snake population in SNAKEPOP.DAT = Read the state of the Snake population from SNAKEPOP.DAT = Randomly generate a new population = Change the DNA code of the currently selected Snake - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Change DNA Code Mode: After pressing 'c' on the Main Screen, a cursor will appear in "DNA:" window. The numeric keys 0-5 and the arrow keys may be used to modify the existing DNA code. When done, press to confirm the new DNA, or press to revert back to the original DNA. ***************************************************************************** Large View Mode: This viewing mode is useful when you want to get "up close and personal" with a particular Snake. Pressing the key in the Main Screen will cause GenLab Snakes to enter the Large View Mode, and the Snake which was selected on the Main Screen will be displayed on the full screen. This viewing mode assists the user in copying a virtual Snake onto a real Rubik's Snake(TM) by showing greater detail through increased size, and by showing the Snake's internal structure with transparency. The Large View Mode offers two different transparency display methods and one non-transparent (standard) display method. GenLab Snakes may be toggled between these display modes by pressing the key. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Large View Mode Commands: The following key commands are active within the Large View Mode: = Rotate the displayed Snake <+,-> = Zoom In or Zoom Out on the displayed Snake = Toggle the display transparency mode = Go back to the Main Screen ***************************************************************************** Gallery Mode: The Gallery Mode is intended to showcase a number of interesting Snakes that I have created while writing this program. The left side of the screen contains a descriptive list of Snakes in the Gallery. One of the Snake names in this list will be highlighted with a selection box, which may be moved with the up-arrow and down-arrow keys. The right portion of the screen displays the currently selected Snake. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Gallery Mode Commands: The following key commands are active within the Gallery Mode: = Rotate the displayed Snake <+,-> = Zoom In or Zoom Out on the displayed Snake = Select the previous Snake in the Gallery list = Select the next Snake in the Gallery list = Toggle the display transparency mode = Go back to the Main Screen = Copy the selected Snake into the selected window in the Main Screen, and go back to the Main Screen ????????????????????????????????????????????????????????????????????????????? Technical: For those interested in the more detailed inner workings of the underlying genetic algorithm in GenLab Snakes, I present the following information: __________________________________DNA Coding_________________________________ Each octal gene in the DNA sequence explicitly codes a direction in 3-dimensional space. The encoding is like this: 0=+x 1=+y 2=+z 3=-x 4=-y 5=-z A quick examination of a real Rubik's Snake(TM) shows that each junction of two pieces can only be locked in place in four different orientations. So for each gene there are only 4 (out of 6) legal values. This is why most randomly created DNA strings will have a fitness value of 0; they are impossible (illegal) to build. The exact set of legal values for each gene depends upon the value of the previous gene (or the hard-coded initial conditions for gene 0). The 3 dimensional path is created by reading the DNA string from the least significant (right-most) gene to the most significant (left-most) gene. _____________________________Offspring Calculation___________________________ The child Snakes are generated in the following sequence: 1. 512 pairs of Snakes are created; each Snake in the top 50% of the population is paired with a random Snake also in the top 50%. 2. For each pair of Snakes, two children are created. The parents take turns being Mother and Father. 3. For each child Snake, a Combination Method (A or B) is randomly chosen: - Combination Method A: A random 24-bit binary string is generated, and each bit in the random string determines if the child Snake inherits the Mother's gene (0) or the Father's gene (1) - Combination Method B: A string of {x = random, 1<=x<=21} genes is copied from a random location in the Father's DNA and pasted into a different random location in the Mother's DNA 4. The bottom 50% (512 Snakes) of the old population is discarded, and the new child Snakes (1024) are added to the remaining top 50% to bring the total population size to 1536 Snakes. 5. The Snakes are sorted according to fitness value, and the bottom 512 are thrown out to leave 1024 Snakes in the population again. 6. Each Snake DNA is compared to every other Snake DNA, and all exact clones are replaced with random DNA. This is done because inbreeding really screws the algorithm up. _________________________Fitness Function Definitions________________________ Fitness Function #1: The value of a Snake's fitness is equal to the distance between the first piece of the Snake, and the last piece of the Snake. This fitness function will quickly converge on the optimal solution, which is a perfectly straight Snake. This is because there are many 'isomer Snakes' which have unique DNA but result in the same (optimal) fitness value, and because the optimal solution has a simple (repetitive) DNA structure. - Optimum Fitness Value = 0109 hex Fitness Function #2: The value of a Snake's fitness is inversely proportional to the volume of the smallest 3-dimensional rectangle which completely encloses the path of the Snake. Note that the 3-dimensional path of the Snake is not the same as the 3-dimensional model of the Snake displayed on the screen. This function reaches its optimal fitness value when the Snake is on a flat plane. However the population often becomes filled with non- flat Snakes which have high fitness values (local maxima) and the algorithm does not converge on the optimum solution. One method which often works to generate the optimum Snake is to let the population become saturated with solutions from Fitness Function #1 (straight lines), and then switch to Fitness Function #2. This often provides the necessary initial genetic code to find the optimum solution. - Optimum Fitness Value = 276 hex Fitness Function #3: The value of a Snake's fitness is inversely proportional to the sum of distances between the center of the Snake path and each point along the Snake path. I originally intended this Fitness Function to evolve a ball-shaped object, which I already knew how to build with a real Snake. Unfortunately, however, this Fitness Function evolves to a ball very slowly. I also discovered that there is a different Snake shape which has a higher fitness value than a ball for this Fitness Function. It is a very compact and cool-looking shape, which may be found in the Gallery. It is extremely difficult to evolve this shape naturally with GenLab Snakes. Good luck. - Optimum Fitness Value = 1759 hex Fitness Function #4: The value of a Snake's fitness is equal to the scalar sum of the +x and +z components of the cross products of each consecutive pair of vectors along the path of the Snake referenced from the center of the Snake. Essentially this causes Snakes which spiral outwards in a certain direction to have high fitness values. This Fitness Function achieves its optimum fitness value quickly. - Optimum Fitness Value = 090 or 08f hex Fitness Function #5: The value of a Snake's fitness is equal to the sum of the sines of the angles between each consecutive pair of vectors along the path of the Snake referenced from the center of the Snake. This gives high fitness values to Snake paths which turn the most. To my amazement, this Fitness Function converges rather quickly on its optimum shape, which is the ball. - Optimum Fitness Value = 1c8 hex ________________________________Diversity Code_______________________________ As I stated above, inbreeding causes this algorithm to fail to reach the optimum solution, because it allows a single (local maxima) solution to fill the entire population. Similarly, excessive similarity within the population also hampers the ability of the algorithm to effectively search the entire input vector space. Diversity is a good thing; a Snake which does not have a very high fitness value compared with other Snakes in the population may still contain a piece of code necessary for the final global maxima, or at least for incrementally improving children. Early versions of GenLab Snakes were plagued with overcrowding problems when multiple variations (isomers) of a local optima would fill the population and impede further improvement. To correct for this, GenLab Snakes has a function to promote diversity. Before step 1 of the Offspring Calculation algorithm described above, a single pass through the top 50% of the population is made to get rid of excessive conformists. The sorted Snakes are scanned sequentially, and all Snakes after the 15th Snake in a run of 16 or more Snakes with identical fitness values are swapped with random Snakes from the lower 50% of the population. This prevents the population from having very many more than 16 Snakes with the same fitness value, and promotes diversity. This allows the algorithm to converge on the optimum solution much faster. ----------------------------------------------------------------------------- Tips and Tricks: - It is useful to 'guide' a population of Snakes to the solution of a Fitness Function by first allowing the population to fill with solutions from a different Fitness Function. - Cheating: It is possible to cause the algorithm to evolve the population very quickly and create isomers of a solution by 'seeding' the population with a known solution. This can be done by transferring a solution from the Gallery, or by directly entering the DNA code for a known solution. ----------------------------------------------------------------------------- Have fun playing God! Fascination Software www.FascinationSoftware.com