Individuals encode a function for mate selection that maps information about another individual to a scalar. Evaluation is done in a predator-prey environment. Outcomes:
- Extinction time is increased.
- Agents evolve to favor mates with "survival traits."

The approach is framed as an extension of "Evolutionary Reinforcement Learning" from [[Ackey 1991 - Interactions between learning and evolution]].

Agents get an action network and an "evaluation" network, plus a preference network. The preference network is a linear mapping from the agent's genome and the potential mate's genome to a scalar; the weights of this linear mapping are evolved.

Trials compared four inputs to the preference network:
- (Other) just the potential mate's genome $G_c$
- (Abs diff) the elementwise absolute difference between the agent's and the potential mate's genomes: $|G_a - G_c| \in \mathbb{R}^n$
- (Squared diff) the elementwise squared difference: $(G_a - G_c)^2 \in \mathbb{R}^n$
- (Euclidean diff) the Euclidean distance: $\|G_a - G_c\| \in \mathbb{R}$ (note! Only a scalar for input to the preference network)

Results: Other worked best, with Abs diff and Squared diff following closely. A limited post-hoc analysis found that preference networks (1) selected for mates whose evaluation networks assigned rationally high values to high-energy states, and (2) selected against mates that tried to mate when grass was distant.
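
The four input encodings and the linear preference mapping can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the function names, the genome dimension `n = 8`, and the random weights (which in the actual setup would be evolved, not sampled) are all assumptions for the sake of the example.

```python
import numpy as np

def inputs(G_a, G_c, mode):
    """Build the preference-network input for one of the four trial encodings."""
    if mode == "other":     # (Other) the potential mate's genome alone
        return G_c
    if mode == "abs_diff":  # (Abs diff) elementwise |G_a - G_c|, in R^n
        return np.abs(G_a - G_c)
    if mode == "sq_diff":   # (Squared diff) elementwise (G_a - G_c)^2, in R^n
        return (G_a - G_c) ** 2
    if mode == "euclid":    # (Euclidean diff) a single scalar distance
        return np.array([np.linalg.norm(G_a - G_c)])
    raise ValueError(f"unknown mode: {mode}")

def preference(w, x):
    """Linear mapping to a scalar preference score; w is the evolved weight vector."""
    return float(w @ x)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 8                                        # illustrative genome length
    G_a, G_c = rng.normal(size=n), rng.normal(size=n)
    for mode in ["other", "abs_diff", "sq_diff", "euclid"]:
        x = inputs(G_a, G_c, mode)
        w = rng.normal(size=x.shape)             # stand-in for evolved weights
        print(mode, x.shape, preference(w, x))
```

Note that under the Euclidean encoding the network sees only one number, so it can express "prefer similar (or dissimilar) genomes" but cannot weight individual genes, which may help explain why the vector-valued encodings fared better.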