_IN PROGRESS - Brian's Working Notes_

* Continual Learning
    * Surveys
        * Khetarpal 25 - continual reinforcement learning
            * Towards continual reinforcement learning 2022
        * Lesort 30 - continual robot learning
            * Continual learning for robotics 2020
        * Parisi 42 - continual learning with neural networks
            * Continual lifelong learning with neural networks: A review 2019
        * Taylor & Stone 60 - transfer learning in RL
            * Transfer learning for reinforcement learning domains: A survey 2009
    * Lifelong learning
    * Never-ending learning
    * Transfer learning
    * Meta-learning
    * Online learning and non-stationarity
    * Continuing Tasks (Sutton & Barto)
    * Ring
        * Dissertation: Continual learning in reinforcement environments, 1994
        * Child: A first step towards continual learning, 1997
        * Toward a formal framework for continual learning, 2005

- [[Signaling theory]] in biology
    - signaling contributes to finding an [[Evolutionarily stable strategy]]
    - Fisher demonstrated that diploidy has an effect (?) on honest signaling, demonstrates the runaway effect in sexual selection
    - Other cool honest signals to explore (from wikipedia):
        - Risk seeking in young men
        - Hunting of large/dangerous game
        - Costly religious rituals

* Signaling in sexual selection
    * “Sensory exploitation hypothesis”
        * pre-existing preferences in female receivers can drive the evolution of signal innovation in male senders
    * “hidden preference hypothesis”
        * successful calls match “hidden preferences” in the female observer
    * No current work has explored mating displays in an evolutionary reinforcement or MARL framework (that I’ve found so far)
* Iain Couzin - Max Planck Animal Behavior. Collective Behavior.
    * Coordinated hunting disrupts information transfer among the prey
        * Handegard,… Couzin. The dynamics of coordinated group hunting and collective information transfer among schooling prey.
[[Evolutionary Reinforcement Learning]]
- [[Bai 2023 - Evolutionary Reinforcement Learning: A Survey]]

evolve objective function to learn how to have social fairness
- mediate social dilemma with social contracts

Dynamic env self agents
- eugene potenicy MA-PPO surprising effectiveness of PPO

Anger comes from violating fairness in public goods games.
- Public goods games allow any agent to contribute to the public good, and all agents benefit from the combined public goods
- The rational strategy is free-riding: to take without giving.
- In animal behavior, free-riding is generally met with anger, and, at least according to Ekman (via discussion with Cassidy), this violation of fairness/justice is the key source of the emotion of anger.
- MARL has looked at the emergence of free-riding and cooperative strategies previously
- Why are the _expression_ and _development_ of an _internal state_ known as anger so common?
    - Under what circumstances does it emerge?
    - Does getting angry have a benefit?
- hypothesis: anger is a threat/warning that continued free-riding will be punished in the future.
    - without actual future punishment, it will not emerge (i.e., it is not an actual threat if nothing happens)
    - so anger as a communication token will only emerge if there's also the ability to somehow punish the offender
    - can this relate to human experiences of being frustrated at not seeing others punished?
        - I believe there's work that shows humans direct more anger at those who allow free-riding than at those who actually do the free-riding
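
A minimal sketch of the public goods payoff structure noted above, assuming a one-round linear game where every agent gets the same endowment and the pot is multiplied and split equally. All names here (`public_goods_payoffs`, `gain_from_free_riding`, `endowment`, `multiplier`, `fine`) are illustrative assumptions, not taken from any of the cited work. The `fine` parameter is a hypothetical stand-in for the punishment that the anger hypothesis says must back up the threat: it is charged to any agent who contributes less than the group average.

```python
from typing import List


def public_goods_payoffs(
    contributions: List[float],
    endowment: float = 10.0,
    multiplier: float = 2.0,
    fine: float = 0.0,
) -> List[float]:
    """Payoffs for one round of a linear public goods game.

    Each agent keeps (endowment - contribution) and receives an equal share
    of the multiplied pot. `fine` is a hypothetical punishment charged to any
    agent whose contribution is below the group average.
    """
    n = len(contributions)
    mean_contribution = sum(contributions) / n
    pot_share = multiplier * sum(contributions) / n
    payoffs = []
    for c in contributions:
        payoff = endowment - c + pot_share
        if c < mean_contribution:
            payoff -= fine  # assumed punishment for below-average contributors
        payoffs.append(payoff)
    return payoffs


def gain_from_free_riding(
    n_agents: int = 4,
    endowment: float = 10.0,
    multiplier: float = 2.0,
    fine: float = 0.0,
) -> float:
    """Payoff change for one agent who stops contributing while all others contribute fully."""
    all_cooperate = [endowment] * n_agents
    one_defects = [0.0] + [endowment] * (n_agents - 1)
    cooperate_payoff = public_goods_payoffs(all_cooperate, endowment, multiplier, fine)[0]
    defect_payoff = public_goods_payoffs(one_defects, endowment, multiplier, fine)[0]
    return defect_payoff - cooperate_payoff


print(gain_from_free_riding())          # 5.0: free-riding pays when nothing happens
print(gain_from_free_riding(fine=8.0))  # -3.0: a large enough fine removes the incentive
```

With these assumed numbers, free-riding is profitable only while the punishment is smaller than the gain (5.0 here), which is one way to frame the claim that an anger signal with no punishment behind it is not a real threat.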