General Bibliography - Brian's Working Notes

# Emotion / Communication / ToM

- Emotion prediction as computation over a generative theory of mind, Houlihan, et al., 2023.
    - [https://doi.org/10.1098/rsta.2022.0047](https://doi.org/10.1098/rsta.2022.0047)
    - inverse planning of emotion presentation
- Multi-Agent Cooperation and the Emergence of (Natural) Language, Lazaridou, 2017.
    - Agents communicate to pick an image.
    - “Multi-agent coordination communication games”
    - Agents develop a language that is human-interpretable because the environment is grounded.
    - Agents develop by bootstrapping on top of each other (are you sure?)
    - Variation on Lewis’ signaling game (Lewis 1969); “cheap talk”
    - Sukhbaatar 2016.
    - Foerster 2016.

## Books

- Principles of Animal Communication, Bradbury & Vehrencamp, 2011.
    - [ ] Buy used copy
    - Focus on principles, and relations to econ & other sciences.
    - [Student resources (chapter outlines, summaries and references (Good!))](https://learninglink.oup.com/access/bradbury-animalcomm-2e-student-resources#tag_chapter-01)
- Animal Signals, Maynard Smith, David Harper, 2004.
    - [ ] Buy copy (had this before, where is it?)
- The Evolution of Animal Communication: Reliability and Deception in Signaling Systems, Searcy & Nowicki, 2005.
    - Game theory bent with focus on deception.
    - [ ] Buy used copy

# RL / Learning actions for contexts

- Reward-Respecting Subtasks for Model-Based Reinforcement Learning, Sutton et al., 2023.
    - https://arxiv.org/pdf/2202.03466.pdf
- Continual Lifelong Learning with Neural Networks: A Review, Parisi, 2019.
- Dopamine reward prediction-error signalling: a two-component response, Wolfram Schultz, 2016.
    - How biological learning is fast and accurate
- Real-Time Reinforcement Learning, Ramstedt & Pal, 2019.
    - Proposes Real-Time Actor-Critic (RTAC) to handle changing states and actions during learning.
    - ![[Pasted image 20240304153554.png|200]]
    - "Real-time" means the environment state can change while action selection is taking place.
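A minimal sketch of the "state changes during action selection" idea in the Ramstedt & Pal entry above, assuming a one-step-delay reading of "real-time": the action chosen now is only applied on the next environment step, and the agent observes the pair (state, pending action). `LineEnv` and `RealTimeWrapper` are hypothetical names made up for illustration, not from the paper, and this is only the interaction loop, not the RTAC algorithm.

```python
class LineEnv:
    """Hypothetical toy environment: an integer position on a line."""

    def __init__(self, start=3, horizon=10):
        self.start = start
        self.horizon = horizon

    def reset(self):
        self.pos = self.start
        self.t = 0
        return self.pos

    def step(self, action):
        # The action is an integer nudge; reward is highest at position 0.
        self.pos += action
        self.t += 1
        reward = -abs(self.pos)
        done = self.t >= self.horizon
        return self.pos, reward, done


class RealTimeWrapper:
    """Delay every action by one step (assumed reading of "real-time").

    While the agent is choosing its next action, the environment advances
    using the previously chosen (pending) action. The observation is the
    pair (state, pending_action), so the agent can see what is already
    in flight.
    """

    def __init__(self, env, default_action=0):
        self.env = env
        self.default_action = default_action

    def reset(self):
        state = self.env.reset()
        self.pending = self.default_action
        return (state, self.pending)

    def step(self, action):
        # Apply the action chosen one step ago, not the one just chosen.
        state, reward, done = self.env.step(self.pending)
        self.pending = action
        return (state, self.pending), reward, done


if __name__ == "__main__":
    env = RealTimeWrapper(LineEnv(start=3))
    obs = env.reset()
    done = False
    while not done:
        state, pending = obs
        # Naive policy: move toward 0, accounting for the action in flight.
        predicted = state + pending
        action = -1 if predicted > 0 else (1 if predicted < 0 else 0)
        obs, reward, done = env.step(action)
        print(obs, reward)
```

The policy in the usage loop plans from `state + pending` rather than `state`, which is the practical consequence of the delay: acting on the raw state would overshoot.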
## RL on changing morphologies

- One Policy to Control Them All: Shared modular policies for agent-agnostic control, Huang, 2020.
    - https://huangw118.github.io/modular-rl/
    - graph neural net, but doesn't seem like that's of much use
- The Role of Morphology in Graph-Based Incompatible Control
    - hand-constructed the graph?
- AnyMorph: Learning Transferable Polices By Inferring Agent Morphology, Trabucco, 2022.
    - used a seq-to-seq transformer to learn an "embedding" language for joints, then applied that to the unseen morphologies, so they would get an input?/policy?/graph? appropriate to their role
    - again, simple morphs with limited variation, I bet random sine waves would solve this about as well
    - [ ] To read!
    - https://arxiv.org/pdf/2206.12279.pdf
    - https://umd.zoom.us/rec/share/Wli1HkEJOojJ1s0MasVdxDwMmqcWo3H5uay87_rG4GYOGWIch_417MwTET3LWcQ.W_Q3_UNncVDbqFAS
- MetaMorph: Learning Universal Controllers with Transformers, Gupta, 2022.
    - also varies the dynamics and environment, as well as the morphology
- DMAP: a Distributed Morphological Policy for Learning to Locomote with a Changing Body, Chiappa, 2022.
    - online changes to morphology!
    - [ ] To read!

## MARL

## Books

- [[Multi-Agent Reinforcement Learning, Foundations and Modern Approaches, Albrecht, Christianos, Schäfer, 29 Feb 2024 (preprint).pdf]] and [great repo with simple examples using pytorch and gym](https://github.com/marl-book/codebase)
    - See marl-book.com for updates
    - [ ] Try out the code examples with level-based foraging
    - Does not cover communication at all!

# Embodiment

- VOYAGER: An Open-Ended Embodied Agent with Large Language Models, Wang et al., 2023.
    - https://voyager.minedojo.org
    - Minecraft player using LLM

## Essential factors for communication to emerge

* Emerge, not evolve. Communication, even complex communication, often arises in an individual's lifetime.
    * [?] What are some examples of learned communication signals in non-human animals?
- Shared emitting and receiving sensory modality.
    - The sender has to have the capability to send a signal that the receiver is capable of perceiving.
- The context a sender is in must be understandable by the receiver (and possibly vice-versa) for the signal to have meaning.
    - Not sure about this one. But without an understood context, the receiver won't be able to use ToM to understand the signal, limiting communication to stimulus-response. For example, a listener hears a call, hides, and avoids a predator, which reinforces the action (hiding) in that context (the signal) without any ToM.
    - [?] Hypothesis: complex communication that relies on ToM has a different evolutionary origin from stimulus-response communication that doesn't
- _Both_ the sender and receiver need to be motivated to communicate.
    - If the receiver "hears" the signal but has no use for it and doesn't change its behavior, the signal isn't useful.
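
A minimal sketch of the stimulus-response case from the list above (no ToM): a two-state Lewis signaling game in which sender and receiver learn by simple urn-style reinforcement (Roth-Erev). That learning rule is swapped in here for illustration and is not the method of any paper listed in these notes. Both agents are reinforced only when the receiver's act matches the sender's state, which stands in for "both need to be motivated". All function names and parameters are hypothetical.

```python
import random


def pick(weights, rng):
    """Sample an index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights)[0]


def train(n=2, rounds=20000, seed=0):
    """Play a Lewis signaling game with n states, n signals and n acts."""
    rng = random.Random(seed)
    sender = [[1.0] * n for _ in range(n)]    # sender[state][signal]
    receiver = [[1.0] * n for _ in range(n)]  # receiver[signal][act]
    for _ in range(rounds):
        state = rng.randrange(n)              # context only the sender sees
        signal = pick(sender[state], rng)     # shared modality: one channel
        act = pick(receiver[signal], rng)
        if act == state:                      # shared payoff on success
            sender[state][signal] += 1.0      # reinforce signal in this state
            receiver[signal][act] += 1.0      # reinforce act given this signal
    return sender, receiver


def success_rate(sender, receiver):
    """Exact probability that the act matches the state (states equiprobable)."""
    n = len(sender)
    total = 0.0
    for state in range(n):
        sender_sum = sum(sender[state])
        for signal in range(n):
            p_signal = sender[state][signal] / sender_sum
            p_act = receiver[signal][state] / sum(receiver[signal])
            total += p_signal * p_act / n
    return total


if __name__ == "__main__":
    sender, receiver = train()
    print(f"success rate after training: {success_rate(sender, receiver):.3f}")
```

Before any reinforcement the success rate is chance (0.5 for two states). Removing the reward on either side, for instance by never updating `receiver`, is a quick way to probe the "both motivated" factor.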