Summary of Kaufmann's NK landscapes and Genetic Algorithms W.Langdon@cs.ucl.ac.uk (300 lines) 31-Oct-95 Modifications: WBL 31-oct-95 Added http://www.santafe.edu/~unamay/ls-meeting.html WBL 17-Aug-95 Added Thompson MSc. reference WBL 31-Jul-95 Added ECAL-95 reference WBL 30-Jun-95 Added PPSN3 reference WBL 15-Mar-95 Update my www address WBL 16-Jan-95 Added notes on Lee Alternberg's paper WBL 22-Nov-94 Added notes from Alden Wright WBL 18-Nov-94 Added notes from Ian East WBL 16-Nov-94 Added notes on Terry Jones work On 4 Oct 1994 I asked the GA-List (v8n38) if anyone was aware of work that had been done on using the ideas of Kauffman's NK fitness landscapes with GAs. Apparently they where discussed in the context of GAs at the Simulation of Adaptive Behaviour conference in Brighton (SAB-94). In the tradition of the list, I'm now posting a summary. (Also available via ftp; cs.ucl.ac.uk genetic/ga-nk-summary.txt) Responses varied from "a waste of time", (cf. Dover's NK = Not Korrect below), to enthusiasm. The written responses where on the whole positive and since they comprise what follows, this summary is biased in favour of Kauffman's work. My thanks to those who supplied information (references etc) and in particular to Ron Goldthwaite for his recommendations for summaries/ introductions to real (ie biological) gene evolution which he supplied to me a year ago. I have included them at the end of this file. People Interested: ------------------ me W.Langdon@cs.ucl.ac.uk Andy Swann Chris Watkins watkins@mgna.demon.co.uk Alden Wright wright@cs.umt.edu Inman Harvey inmanh@cogs.sussex.ac.uk Lee Altenberg altenber@acpub.duke.edu Ron Goldthwaite rogoldthwaite@ucdavis.edu Terry Jones at SFI Tim J.Benham Ian East ian@buckingham.ac.uk References ---------- "The origins of Order:Self-Organisation and Selection in Evolution" by Stuart A Kauffman. Oxford University Press, 1993. GA-List v8n39, "NK landscapes and GAs (Re: v8n38)", by Inman Harvey A very good summary of what we are talking about. Attached at end. See also Ron Goldthwaite section on Biological aspects. ftp://ftp.santafe.edu/pub/terry/model-of-landscapes.ps.gz 17 pages Summary of landscape models, covers NK landscapes and many others. Lots of references, the GA specific ones include: Manderick,B., De Weger,M. and Spiessens,P. ICGA-91 Maithias and Whitley, PPSN2 Maithias and Whitley, FOGA2 Shuster and Stadler, SFI 93-11-069 Stadler and Schnabl, 1992, Physics Letters Vol 161 pp337-344 Vose, FOGA2 Weinberger, 1990, Biological Cybernetics, vol 63, pp325-336 Weinberger, 1991, Informantion Dynamics, pp185-193 Wim Hordijk and Manderick, ECAL-95, The Usefulnes of Recombination Terry's landscape code (GA-List v8n43) in ftp://ftp.santafe.edu/pub/terry/nk/nk_landscape.* "Species Adaptation Genetic Algorithms: A Basis for a Continuing SAGA", Inman Harvey, in 'Toward a Practice of Autonomous Systems, Proc of First ECAL', F.J. Varela and P. Bourgine (eds), MIT Press 1992. "The Genetic Algorithm and the Structure of the Fitness Landscape" by Manderick, de Weger, and Spiessens, IGCA 4. @INPROCEEDINGS{Altenberg:1994b, AUTHOR = "Lee Altenberg", YEAR = "1994", PAGES = "182-187", TITLE = "Evolving better representations through selective genome growth", BOOKTITLE = "Proceedings of the {IEEE} World Congress on Computational Intelligence", EDITOR = "J. David Schaffer and H. P. Schwefel and Hiroaki Kitano", PUBLISHER = "IEEE", ADDRESS = "Piscataway N.J.", Notes = {Suggests modification to NK landscape; real genomes are produced by growth of small ones, growth is by adding a gene at a time and each change is immediately beneficial. Analyses growth to 32 bit genome and shows resulting fitness landscape is easier to optimise than a straight NK landscape.} @inproceedings{ieee94:kinnear, author = {Kenneth E. {Kinnear, Jr.} }, title = {Fitness Landscapes and Difficulty in Genetic Programming}, year = 1994, booktitle = {Proceedings of the 1994 IEEE World Conference on Computational Intelligence}, publisher = {IEEE Press}, volume = {1}, pages = {142-147}, size = {6 pages}, note = {}, WBLnotes = {Defines difficulty as number of fitness cases/1000. Considers a few parity and sort problems. Fitness landscape investigated by using GP operators (without selection) on gen=0 to give a number of random walks. Look at autocorrelation of fitness along these walks. Essentially none (<0.5) very much worse than published GA. Also little correlation between this and difficulty measure. WBL: Does apparent landscape change as GP evolves? } } Richard Kenneth Thompson (GA-List v9n41) Computer Science University of Montana Missoula, MT 59812 M.S. Thesis, July 1995 Director: Alden H. Wright wright@cs.umt.edu NK Fitness landscapes over fixed length Boolean strings, and Boolean networks as presented by Stuart Kauffman, are studied in terms of their computational complexity. It is shown that if the epistatic influences for fitness contributions of NK landscapes are linearly ``localized,'' the optimal fitness can be found in polynomial time. If, however, the epistatic influences are from arbitrarily chosen positions of the string, the problem becomes NP-complete. These results are extended to show that the problem of finding pre-images in Boolean networks is NP-complete if the positions that affect the future of any given position of the string are arbitrarily chosen. The host is: ftp.cs.umt.edu, and the file is /pub/papers/rthompson_thesis.ps.gz . Reviews of Kauffman's "The Origins of Order" -------------------------------------------- @article( x, author = {GA Dover}, title = {On the Edge}, year = 1993, journal = {Nature}, volume = {365}, pages = {704-706}, size = {2}, goldthwaite_notes{ rogoldthwaite@ucdavis.edu 28 Nov 93 GP mail list Apart from matters of style, the review's substance is that: 1. developmental biology isn't so despairing yet that we need to abandon experimental analyses for Kauffman's top-down modeling 2. Attractors of Boolean networks don't correspond naturally to genomic regulatory networks' behaviors, despite Kauffman's appeal to analogies with autocatalytic metabolic soups which can self-reproduce. 3. The final 200-page segment on cell differentiation fails to demonstrate that within-cell genetic regulation needs to be supplemented by larger between-cell influences from "overall dynamic attractors". 4. Boolean networks and attractors based on fixed adaptive landscapes are not consistent with what's found in nature: extensive genetic and functional redundancy, and much genomic and ecological 'turbulance'. The convenient geometry of Sewell Wright's fixed adaptive landscape, discussed in the GP list Spring 93, is arguably more like a heaving ocean surface (see, e.g. J.Gillespie 9_), and it's not clear how robust the attractors of boolean genetic networks could be. } ) Other reviews of Kauffman (which I've not yet read) include: 1. The Origins of Order: Self-Organization and Selection in Evolution.... New Scientist v138, n1876 (June 5, 1993):46. 2. The Origins of Order: Self-Organization and Selection in Evolution.... Science v260, n5113 (June 4, 1993):1531 (3 pages). 3. Order for free. (theory of spontaneous order in complex... New Scientist v137, n1860 (Feb 13, 1993):S10 (2 pages). WWW pages --------- SFI http://www.santafe.edu/ Adaptive Search on Biological and Computational Landscapes Meeting July 21-23, 1995 Santa Fe Insitute http://www.santafe.edu/~unamay/ls-meeting.html mine http://www.cs.ucl.ac.uk/staff/W.Langdon/ Inman's http://www.cogs.susx.ac.uk/users/inmanh/index.html Developmental Geneticists texts (Ron Goldthwaite)plus my edtis ============================================================== One good introduction (for non-specialists) is Gilbert Gottlieb's _Individual Development and Evolution_, 1992. This is an historical/conceptual presentation, with no modeling, but an excellent exposition of the basic biological problems. In the GA/GP/NN literature, I'm not yet sure what I think of Huge de Garis's developmental models. A reference for "Gene Evolution" may not be what's appropriate (I was emphasizing developmental biology), but if it is, then between John Maynard-Smith's Evolutionary Genetics and M. Kimura's Neutral Theory of Molecular Evolution you'd have a usable orientation, and be able to comprehend and appreciate some of John Gillespie's rather sophisticated book. Origins of Order is targeted more at problems of development, where a gene doesn't act alone nor have single simple effects. Population genetics seems inadequate here, still. I'm fond of Gunther Stent's analogy, that genes have the role of a play's script in a theatre: the script directs the action, but the complete reality of events in the theatre also depends on the specific audience, the available lighting/electronics, the outside weather, and other meta-script 'ontogenetic' elements. Stent is a famous neurobiologist; this analogy appeared in his 1981 article "Strengths and Weaknesses of the Genetic Approach to the Development of the Nervous System", Annual Review of Neuroscience, p.163-194. This article's details could be brought up to date with G.L. Miklo's trenchant 1993 review, "Molecules and Cognition: ...Evolutionary Overview of Brain Structure and Function in Some Vertebrates and Invertebrates", J. Neurobiology Vol 24 No.6, p842-890. I mentioned Gottlieb's book; he's an old behavioral biologist, but this 1992 effort is a nice historical/conceptual presentation of what's been left out of even biology by the focus on 'selfish genes' from population genetics and Richard Dawkins' extraordinarily skillful popularization of it. For a broader synthetic view (which along with everything else makes clear why biology is not a branch of engineering or even of physics), there is no better source than the essays by Ernst Mayr, 1988, Towards a New Philosophy of Biology. Mayr's also an older, senior scientist - no upstart radical he. But compare his non-engineering perspective with Miklo's insistence that progress in developmental neurobiology won't proceed until we understand enough to 'grow our own' wet nervous systems from genetic modules. I've not satisfactorally translated any of this biological perspective into the GP modeling paradigm. GP's success sans traditional 'loci' or 'alleles' is very provocative and inexplicable; Lee Altman's formal analysis of some GP dynamics is suggestive but I still find it murky as population genetics, much less as GP theory. Richard Dawkins, "The Extended Phenotype" Whole-animal biologists, especially embryologists, see genes acting in complex collaboration, with each gene having multiple effects. It's not clear that Dawkins metaphor in which the specific oarsman can be 'selected' by the average effect s/he has on the teams of various racing hulls s/he's participated in, is a good analogy to gene fitnesses -- and in any case population genetics theory has minimal analytical machinery to quantify this analogy Review the last chapter of Extended Phenotype... Dawkins comes up for air among the rest of the multi-gene unitary animals. Then, consider (for example) Ernst Mayr's essays on the "Units of Selection" issue in evolutionary biology: are the genes selected directly, or whole organisms, or maybe even entire populations? Mayr is a senior scientist in evolutionary biology, and his essays are as readable and as well informed as Dawkins'. A convenient compilation is Mayr's 1988 Towards a New Philosophy of Biology (note Essays 6 and 7 especially). Stephen J. Gould's essays in Natural History magazine, collected into books every few years, are a less concise source for a whole-animal perspective. Another good book is older but comparable in theme, T.Frazetta's 1975 Complex Adaptations in Evolving Populations. A final recommendation is a usefully catholic collection of technical terms defined by signed essays, E. Keller & E. Lloyd's (Eds) 1992 Keywords in Evolutionary Biology. Relevent contributors include Dawkins ('Progress'), Kitcher ('Gene: current usages'), Lewontin ('Genotype and Phenotype'), Lloyd ('Units of Selection') and S. Gould ('Heterochrony'). ============================================================ From: inmanh@cogs.susx.ac.uk (Inman Harvey) Date: Thu, 6 Oct 94 14:43 BST Subject: NK landscapes and GAs (Re: v8n38) In GA-List v8n38 Bill Langdon asked if anyone has been using the ideas of Kauffman's NK fitness landscapes with GAs, mentioning discussion at SAB-94. I have used NK landscapes for some time as a convenient tool for testing theoretical ideas for some time, and I think I was the only person at SAB-94 who does (though I didn't mention them in my talk there!). They are abstract fitness landscapes, based on binary genotypes of length N, where the final fitness is based on the average fitness contribution from each of the N alleles. These are each in turn dependent on epistatic interactions with K other alleles (for some fixed K, 0 <= K <= N-1). For any one gene position, there are 2^(K+1) possible combinations of alleles at that position and its epistatic neighbours, and at the start of a run lookup tables are generated (N of them, one for each gene-position) of size 2^(K+1), and filled with random numbers in the range [0.0,1.0]. This provides a tunable fitness landscape: if K is set to zero, no epistasis, fitness contribution from each allele is simply additive: if K = N-1, any single mutation has epistatic interactions with all gene positions, giving a maximally rugged fitness landscape. Varying K between these values gives varying degrees of smoothness. Inman Harvey Evolutionary Robotics Group COGS, Univ. of Sussex ============================================================ (eddited) It is clear to me that genetic representation must use some sort of program approach in order to achieve invariance against scale and complexity of phenotype. I believe the right question to pose, is "What are the salient features of a language in which to express the construction of a phenotype?". For me, the question is can Kauffman's ideas be made to provide a workable way of designing effective representations for artificial GAs targeted at problems that are hard enough to be worthy of such an approach. Ian East University of Buckingham ian@buck.ac.uk ============================================================ (eddited) Genetic Algorithms Digest Monday, November 21, 1994 (GA-List v8n45) I have C++ code (not well commented) to compute NK fitness functions based on the following idea: When you need a new random value, you generate the value, and store it in a hash table so that if you need the same value again, it gets looked up from the hash table. If one does not need to compute a very large number of values from a given fitness landscape, this method works well for large values of N and K. I can make the code available. Alden Wright, University of Montana ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Non NK fitness landscapes 30-6-95 The following uses an NP-Complete problem rather than NK to generate the fitness landscape, however there seems to be some similarities. @inproceedings (x, author = {Hiroaki Inayoshi and Bernard Manderick}, title = {The Weighted Graph Bi-Partitioning Problem: A Look at GA Performance}, booktitle = {Parallel Problem Solving from Nature -- {PPSN III}}, year = 1994, editor = {Yuval Davidor and Hans-Paul Schwefel and Reinhard Manner}, series = {}, pages = {617--625}, address = {}, month = {}, organisation ={}, publisher = {Springer-Verlag}, note = {}, %ignored by BibTex but available for automatic searching % email = {}, keywords = {genetic algorithms}, ISBN = {}, url = {}, size = {}, abstract = {}, notes = {Includes fitness landscape analysis which shows: ``high quality local optima tend to be close to each other'' and ``the landscape is quite smooth. The typical change in cost arround a local optima is about 7\% and 2\% for a random point. Refers to Manderick: The genetic algorithm and the structure of the fitness landscape, ICGA-91. } )