Changeset 1241 for cpp/frams/genetics


Timestamp: 05/18/23 03:43:42 (20 months ago)
Author: Maciej Komosinski
Message:

No longer sort modifiers or cancel out antagonistic modifiers in f1 and f4. Simplifying modifier sequences is now much less intrusive, which allows for 2^N distinct values of properties instead of only the 2*N that resulted from the earlier forced ordering (N is the number of same-letter upper- and lower-case characters in a modifier sequence).

Location: cpp/frams/genetics
Files: 4 edited

  • cpp/frams/genetics/f4/f4_general.cpp

    r1240 r1241  
@@ -1363,5 +1363,5 @@
                         // in the future this could be generalized to all neuron properties, for example N:|:power:0.6:range:1.4, or can even use '=' or ',' instead of ':' if no ambiguity
                         char prop_dir, prop_symbol, prop_end[2]; // prop_end is only to ensure that neuron parameter definition is completed
-                        if (sscanf(genot + pos_inout, ":%c%c%1[:]", &prop_dir, &prop_symbol, &prop_end) != 3)
+                        if (sscanf(genot + pos_inout, ":%c%c%1[:]", &prop_dir, &prop_symbol, prop_end) != 3)
                                 // error: incorrect format
                                 return pos_inout + 1 + 1;
     
@@ -1402,16 +1402,8 @@
 #ifdef F4_SIMPLIFY_MODIFIERS
                         char *ptr = (char*)(genot + pos_inout);
-
-#ifdef __BORLANDC__ // "[bcc32c Error] cannot compile this non-trivial TLS destruction yet" (C++B 10.4u2)
-                        static
-#else
-                        thread_local
-#endif
-                                vector<int> modifs_counts(strlen(all_modifiers_no_comma)); ///<an array with a known constant size storing counters of each modifier symbol from all_modifiers_no_comma, created once to avoid reallocation every time when modifier genes are simplified during parsing. Initialization of required size; it will never be resized.
-                        std::fill(modifs_counts.begin(), modifs_counts.end(), 0); //zeroing only needed if we encountered a char from all_modifiers_no_comma and enter the 'while' loop below
-
-                        while (char *m = GenoOperators::strchrn0(all_modifiers_no_comma, *ptr)) //only processes a section of chars known in all_modifiers_no_comma, other characters will exit the loop
-                        {
-                                modifs_counts[m - all_modifiers_no_comma]++;
+                        string original = "";
+                        while (GenoOperators::strchrn0(all_modifiers_no_comma, *ptr)) //only processes a section of chars known in all_modifiers_no_comma, other characters will exit the loop
+                        {
+                                original += *ptr;
                                 GenoOperators::skipWS(++ptr); //advance and ignore whitespace
                         }
     
@@ -1419,5 +1411,5 @@
                         if (advanced > 0) //found modifiers
                         {
-                                string simplified = GenoOperators::simplifiedModifiers(all_modifiers_no_comma, modifs_counts);
+                                string simplified = GenoOperators::simplifiedModifiers(original);
                                 // add a node for each char in "simplified"
                                 for (size_t i = 0; i < simplified.length(); i++)
  • cpp/frams/genetics/f4/f4_oper.cpp

    r1238 r1241  
@@ -18,6 +18,6 @@
 // TODO add support for properties of (any class of) neurons - not just sigmoid/force/intertia (':' syntax) for N
 // TODO add mapping genotype character ranges for neural [connections]
-// TODO for some genotypes, #defining/undefining F4_SIMPLIFY_MODIFIERS produces significantly different phenotypes (e.g. length of some Joint changes from 1.25 to 1.499, coordinates of Parts change, friction of some part changes from 1.28 to 0.32). Comparing f4_Node trees, the simplification works as intended, there are no huge changes apart from removing contradicting modifiers like 'R' and 'r' or 'L' and 'l', and dispersing the modifiers (changed order). There is no reason for such a significant influence of this. A hypothesis is that something may be wrong with calculating the influence of individual modifiers, e.g. some strong nonlinearity is introduced where it should not be, or some compensation between modifiers that should not influence each other (like L and R), or some modifier f4_Nodes are skipped/ignored when applying? Investigate. Example genotype that displays this issue: /*4*/,i<qlM,C<X>N:*#1>>,r<MRF<Xcm>N:Gpart>#5#1#2MLL#1>#1>>>>#5ML#2L#1>>>Lf,r<#1>rM<CqmLlCfqiFLqXFfl><F,<<XI>iN:|[-1:4.346]><XF><<XrRQ>N:G#3>>QiXFMR>fXM#2MfcR>R#3>>X
 // TODO The f0 genotypes for /*4*/<<RX>X>X> and RX(X,X) are identical, but if you replace R with Q or C, there are small differences - check why and perhaps unify?
+// TODO F4_SIMPLIFY_MODIFIERS in f4_general.cpp: currently it works while parsing (which is a bit "cheating": we get a phenotype that is a processed version of the genotype, thus some changes in modifiers in the genotype have no effect on its phenotype). Another (likely better) option, instead of simplifying while parsing, would be during mutations (like it is done in f1): when mutations add/modify/remove a modifier node, they could "clean" the tree by simplifying modifiers on the same subpath just as GenoOperators::simplifiedModifiers() does. This way, simplifying would be only performed when we actually modify a part of a genotype, not each time we interpret it, and there would be no hidden mechanism: all visible genes would have an expected effect on the phenotype.
 
 
  • cpp/frams/genetics/genooperators.cpp

    r1233 r1241  
@@ -473,11 +473,11 @@
 
 //#include <cassert>
-string GenoOperators::simplifiedModifiers(const char *str_of_char_pairs, vector<int> &char_counts)
+string GenoOperators::simplifiedModifiersFixedOrder(const char *str_of_char_pairs, vector<int> &char_counts)
 {
 //      assert(strlen(str_of_char_pairs) == char_counts.size());
 //      assert(char_counts.size() % 2 == 0);
-        const int MAX_NUMBER_SAME_TYPE = 8; // max. number of modifiers of each type = 8 (mainly for Rr)
+        const int MAX_NUMBER_SAME_TYPE = 8; // max. number of modifiers of each type (case-sensitive) - mainly for rR, even though for rR, 4 would be sufficient if we assume lower or upper can be chosen as required for minimal length, e.g. rrrrr==RRR, RRRRRR==rr
         string simplified;
-        //#define CLUMP_IDENTICAL_MODIFIERS //not good because properties are calculated incrementally, non-linearly, and their values are updated after each modifier character, so these values may for example saturate after a large number of identical modifier symbols. The order of modifiers is in general relevant and extreme values of properties increase this relevance, so better keep the modifiers dispersed.
+        //#define CLUMP_IDENTICAL_MODIFIERS //not good because with the exception of rR properties are calculated incrementally, non-linearly, and their values are updated after each modifier character, so these values may for example saturate after a large number of identical modifier symbols. The order of modifiers is (with the exception of rR) relevant and extreme values of properties increase this relevance, so better keep the modifiers dispersed.
 #ifdef CLUMP_IDENTICAL_MODIFIERS
         for (size_t i = 0; i < strlen(str_of_char_pairs); i++)
     
@@ -507,2 +507,21 @@
         return simplified;
 }
+
+string GenoOperators::simplifiedModifiers(const string & original)
+{
+        const int MAX_NUMBER_SAME_TYPE = 6; // max. number of modifiers of each type (case-insensitive). rR could be treated separately in simplification because their influence follows different (i.e., simple additive) logic - so the simplifiedModifiersFixedOrder() logic with cancelling out is appropriate for rR. However in this function, making no exception to rR does not cause any harm to these modifiers either - the only consequence is that we will not remove antagonistic letters and will not simplify sequences of rR longer than 4, while they could be simplified (e.g. rrrrr==RRR, RRRRRR==rr).
+        int counter[256] = {}; //initialize with zeros; 256 is unnecessarily too big and redundant, but enables very fast access (indexed directly by the ascii code)
+        string simplified = "";
+        for (int i = original.size() - 1; i >= 0; i--) //iterate from end to begin - easier to remove "oldest" = first modifiers
+        {
+                unsigned char c = original[i];
+                if (!std::isalpha(c))
+                        continue;
+                unsigned char lower = std::tolower(c);
+                counter[lower]++;
+                if (counter[lower] <= MAX_NUMBER_SAME_TYPE) //get rid of modifiers that are too numerous, but get rid of the first ones in the string (="oldest", the last ones looking from the end), because their influence on the parameter value is the smallest
+                        simplified += c;
+        }
+        std::reverse(simplified.begin(), simplified.end()); //"simplified" was built in reverse order, so need to restore the order that corresponds to "original"
+        return simplified;
+}
  • cpp/frams/genetics/genooperators.h

    r1233 r1241  
@@ -215,9 +215,10 @@
         static void skipWS(char *&s); ///<advances pointer \a s skipping whitespaces.
         static bool areAlike(char*, char*); ///<compares two text strings skipping whitespaces. Returns 1 when equal, 0 when different.
-        static char* strchrn0(const char *str, char ch); ///<like strchr, but does not find zero char in \a str.
+        static char* strchrn0(const char *str, char ch); ///<like strchr, but does not find ascii=0 char in \a str.
 
         static int getRandomChar(const char *choices, const char *excluded); ///<returns index of a random character from \a choices excluding \a excluded, or -1 when everything is excluded or \a choices is empty.
-        static string simplifiedModifiers(const char *str_of_char_pairs, vector<int> &char_counts); ///<returns a sequence of chars from \a str_of_char_pairs based on how many times each char occurred in \a char_counts. Assume that an even-index char and the following odd-index char have the opposite influence, so they cancel out.
+        static string simplifiedModifiersFixedOrder(const char *str_of_char_pairs, vector<int> &char_counts); ///<returns a sequence of chars from \a str_of_char_pairs based on how many times each char occurred in \a char_counts. Assume that an even-index char and the following odd-index char have the opposite influence, so they cancel out. We don't use this function, because a fixed order imposed by this function means that the number of different parameter values produced by a sequence of modifiers is lowered (N same-letter upper- and lower-case chars yield only 2*N different values). Due to how modifiers work, the effect of aaA, aAa, Aaa etc. is different (N same-letter upper- and lower-case chars yield 2^N different values), so simplifying modifiers should not impose any order and should not interfere with their original order - see \a simplifiedModifiers().
         //@}
+        static string simplifiedModifiers(const string &original); ///<from the \a original sequence removes modifiers that are too numerous (exceeding a defined threshold number), starting the removal from the leftmost (="oldest" when interpreting the sequence from left to right) ones.
 };
 