Changeset 1235
- Timestamp:
- 05/06/23 20:04:18 (20 months ago)
- File:
-
- 1 edited
Legend:
- Unmodified
- Added
- Removed
-
cpp/frams/genetics/f4/f4_general.cpp
r1234 r1235 4 4 5 5 // Copyright (C) 1999,2000 Adam Rotaru-Varga (adam_rotaru@yahoo.com), GNU LGPL 6 // 2018, Grzegorz Latosinski, added support for new API for neuron types and their properties6 // 2018, Grzegorz Latosinski, added support for new API for neuron types and development checkpoints 7 7 8 8 #include "f4_general.h" … … 1212 1212 1213 1213 sprint(out); 1214 len = out.length(); 1214 1215 1215 1216 // very last '>' can be omitted 1216 len = out.length(); 1217 if (len > 1) 1218 if (out[len - 1] == '>') { (out.directWrite())[len - 1] = 0; out.endWrite(); }; //Macko 2023-04 "can be omitted", but it is removed as a rule even in generated genotypes :) 1217 // MacKo 2023-05: after tightening parsing and removing a silent repair for missing '>' after '#', this is no longer always the case. 1218 // For genotypes using '#', removing trailing >'s makes them invalid: /*4*/<X><N:N>X#1>> or /*4*/<X><N:N>X#1#2>>> or /*4*/<X><N:N>X#1#2#3>>>> etc. 1219 // Such invalid genotypes with missing >'s would then require silently adding >'s, but now stricter parsing and clear information about invalid syntax is preferred. 1220 // See also comments in f4_processRecur() case '#'. 1221 //if (len > 1) 1222 // if (out[len - 1] == '>') { (out.directWrite())[len - 1] = 0; out.endWrite(); }; //Macko 2023-04 "can be omitted" => was always removed in generated genotypes. 1223 1219 1224 // copy back to string 1220 1225 // if new is longer, reallocate buf … … 1312 1317 if (end == NULL) 1313 1318 return pos_inout + 1; //error 1314 f4_Node *node = new f4_Node("#", par, pos_inout); 1319 f4_Node *node = new f4_Node("#", par, pos_inout); //TODO here or elsewhere: gene mapping seems to map '#' but not the following number 1315 1320 node->reps = reps.getInt(); 1316 1321 // skip number … … 1326 1331 { 1327 1332 return genot_len + 1; //MacKo 2023-04: report an error, better to be more strict instead of a silent repair (genotype stays invalid but is interpreted and reported as valid) with non-obvious consequences? 1328 //earlier apporach - silently treating this problem (we don't ever see where the error is because it gets corrected in some way here, while parsing the genotype, and error location in the genotype is never reported): 1329 //node = new f4_Node(">", par, genot_len - 1); // check if needed and if this is really the best repair operation; seemed to happen too many times in succession for some genotypes even though they were only a result of f4 operators, not manually created... and the operators should not generate invalid genotypes, right? Or maybe crossover does? Seems like too many #N's for closing >'s; removing #N or adding > helped. Operators somehow don't do it properly sometimes? But F4_ADD_REP adds '>'... (TODO) 1333 //earlier approach - silently treating this problem (we don't ever see where the error is because it gets corrected in some way here, while parsing the genotype, and error location in the genotype is never reported): 1334 //node = new f4_Node(">", par, genot_len - 1); // Maybe TODO: check if this was needed and if this was really the best repair operation; could happen many times in succession for some genotypes even though they were only a result of f4 operators, not manually created... and the operators should not generate invalid genotypes, right? Or maybe crossover does? Seemed like too many #n's for closing >'s; removing #n or adding > helped. Examples (remove trailing >'s to make invalid): /*4*/<X><N:N>X#1>> or /*4*/<X><N:N>X#1#2>>> or /*4*/<X><N:N>X#1#2#3>>>> etc. 1335 // So operators somehow don't do it properly sometimes? But F4_ADD_REP adds '>'... Maybe the rule to always remove final trailing '>' was responsible? (now commented out). Since the proper syntax for # is #n ...repcode... > ...endcode..., perhaps endcode also needs '>' as the final delimiter. If we have many #'s in the genotype and the final >'s are missing, in the earlier approach we would keep adding them here as needed to ensure the syntax is valid. If we don't add '>' here silently, they must be explicitly added or else the genotype is invalid. BUT this earlier approach here only handled the situation where the genotype ended prematurely; what about cases where '>' may be needed as delimiters for # in the middle of the genotype? Or does # always concern all genes until the end, unless explicitly delimited earlier? Perhaps, if the '>' endcode delimiters are not present in the middle of the genotype, we don't know where they should be so the earlier approach would always add them only at the end of the genotype? 1330 1336 } 1331 1337 return 0; // OK
Note: See TracChangeset
for help on using the changeset viewer.