NeuroAI for AI Safety

Patrick Mineault

Niccolò Zanichelli*

Joanne Zichen Peng*

Anton Arkhipov

Eli Bingham

Julian Jara-Ettinger

Emily Mackevicius

Adam Marblestone

Marcelo Mattar

Andrew Payne

Sophia Sanborn

Karen Schroeder

Zenna Tavares

Andreas Tolias


Abstract

As AI systems become increasingly powerful, the need for safe AI has become more pressing. Humans are an attractive model for AI safety: as the only known agents capable of general intelligence, they perform robustly even under conditions that deviate significantly from prior experiences, explore the world safely, understand pragmatics, and can cooperate to meet their intrinsic goals. Intelligence, when coupled with cooperation and safety mechanisms, can drive sustained progress and well-being. These properties are a function of the architecture of the brain and the learning algorithms it implements. Neuroscience may thus hold important keys to technical AI safety that are currently underexplored and underutilized. In this roadmap, we highlight and critically evaluate several paths toward AI safety inspired by neuroscience: emulating the brain’s representations, information processing, and architecture; building robust sensory and motor systems from imitating brain data and bodies; fine-tuning AI systems on brain data; advancing interpretability using neuroscience methods; and scaling up cognitively-inspired architectures. We make several concrete recommendations for how neuroscience can positively impact AI safety.

Read as PDF on arXiv

Appendix

Log-sigmoid scaling laws

We were surprised to find log-sigmoid scaling curves for digital twins in both empirical data and simulations; to the best of our knowledge, this has not been reported in the literature. We thus searched for a regime in which a log-sigmoid scaling law follows naturally from asymptotics.

Consider a linear regression model, similar to the LNP model considered in the main text (Section 10.1) but more analytically tractable: \[y = X w + \epsilon\] where:

  • \(X \in \mathbb{R}^{N \times M}\) with entries \(X_{ij} \sim \mathcal{N}(0, 1)\).

  • \(w \sim \mathcal{N}\left(0, \frac{1}{M} I_M\right)\).

  • \(\epsilon \sim \mathcal{N}\left(0, \sigma_N^2 I_N\right)\).

  • The validation data \(X_{\text{val}} \in \mathbb{R}^{N_{\text{val}} \times M}\) with entries \(X_{\text{val}, ij} \sim \mathcal{N}(0, 1)\).

Note that with this definition, \(\text{var}(y) = 1 + \sigma_N^2\). We use a Maximum A Posteriori (MAP) estimate of \(w\) with a prior matching its sampling distribution: \[w_{\text{MAP}} = \left( X^\top X + M \sigma_N^2 I_M \right)^{-1} X^\top y\]
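As a concrete sketch (variable names are ours), the generative model and the MAP estimator above fit in a few lines of NumPy; note that with a prior matching the sampling distribution, the MAP estimate is ridge regression with penalty \(M \sigma_N^2\):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, sigma_n = 2000, 50, 1.0  # observations, regressors, noise std

# Generative model: X has i.i.d. standard normal entries,
# w ~ N(0, I/M), eps ~ N(0, sigma_n^2 I)
X = rng.standard_normal((N, M))
w = rng.standard_normal(M) / np.sqrt(M)
y = X @ w + sigma_n * rng.standard_normal(N)

# MAP estimate with the matching prior: ridge regression
# with penalty M * sigma_n^2
w_map = np.linalg.solve(X.T @ X + M * sigma_n**2 * np.eye(M), X.T @ y)
```

With many more observations than regressors, as here, `w_map` recovers `w` up to a small shrinkage error.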

The Fraction of Explained Variance Estimator (FEVE) on the validation set is defined as: \[\text{FEVE} = 1 - \frac{\mathbb{E}\left[ \left( y_{\text{val}} - X_{\text{val}} w_{\text{MAP}} \right)^2 \right] - \sigma_N^2}{\mathbb{E}\left[ y_{\text{val}}^2 \right] - \sigma_N^2}\] where \(y_{\text{val}} = X_{\text{val}} w + \epsilon_{\text{val}}\).

Equivalently, the FEVE is the \(R^2\) on the validation set if there’s no noise in the validation set.
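A direct implementation of this estimator (a hypothetical helper, not from any released codebase) makes the noise correction explicit:

```python
import numpy as np

def feve(y_val, y_pred, sigma_n2):
    """Fraction of explained variance, with the known observation-noise
    variance sigma_n2 subtracted from both numerator and denominator."""
    resid = np.mean((y_val - y_pred) ** 2) - sigma_n2
    total = np.mean(y_val**2) - sigma_n2
    return 1.0 - resid / total
```

With noiseless validation data (`sigma_n2 = 0`) this reduces to the ordinary \(R^2\) about zero, matching the remark above.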

The denominator of FEVE is: \[\mathbb{E}\left[ y_{\text{val}}^2 \right] - \sigma_N^2 = 1\]

Similarly, the numerator is:

\[\mathbb{E}\left[ \left( X_{\text{val}} \left( w - w_{\text{MAP}} \right) \right)^2 \right]\]

Approximating \(X_{\text{val}}^\top X_{\text{val}} \approx N_{\text{val}} I_M\) (since entries are i.i.d. standard normal), the design matrix falls out, and we find that the numerator is:

\[\approx \mathbb{E}\left[ \left\| w - w_{\text{MAP}} \right\|^2 \right]\]

The posterior covariance matrix is: \[\Sigma_{\text{post}} = \sigma_N^2 \left( X^\top X + M \sigma_N^2 I_M \right)^{-1}\] Again using the expected value of the covariance matrix, \(X^\top X \approx N I_M\), we find that: \[\Sigma_{\text{post}} \approx \sigma_N^2 \left( N I_M + M \sigma_N^2 I_M \right)^{-1} = \frac{\sigma_N^2}{N + M \sigma_N^2} I_M\] Therefore: \[\mathbb{E}\left[ \left\| w - w_{\text{MAP}} \right\|^2 \right] = \operatorname{Tr}\left( \Sigma_{\text{post}} \right) = \frac{M \sigma_N^2}{N + M \sigma_N^2}\]

Finally, the FEVE is: \[\text{FEVE} \approx \frac{ N }{ N + M \sigma_N^2 }\]

We note that this can be expressed as a log-sigmoid, where \(\sigma(x) = 1/(1 + e^{-x})\) denotes the logistic function:

\[\text{FEVE} \approx \sigma(\log(N) - \log(M) - \log(\sigma^2_N))\]
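This closed form can be checked against a small Monte Carlo simulation (our own sketch, not code from the paper): fit the MAP estimator on synthetic data, measure the FEVE on fresh validation data, and compare against \(\sigma(\log N - \log M - \log \sigma_N^2)\):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simulated_feve(N, M, sigma_n, n_val=20000, seed=0):
    """Empirical FEVE of the MAP (ridge) estimator on held-out data."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((N, M))
    w = rng.standard_normal(M) / np.sqrt(M)
    y = X @ w + sigma_n * rng.standard_normal(N)
    w_map = np.linalg.solve(X.T @ X + M * sigma_n**2 * np.eye(M), X.T @ y)
    X_val = rng.standard_normal((n_val, M))
    y_val = X_val @ w + sigma_n * rng.standard_normal(n_val)
    resid = np.mean((y_val - X_val @ w_map) ** 2) - sigma_n**2
    total = np.mean(y_val**2) - sigma_n**2
    return 1.0 - resid / total

N, M, sigma_n = 1000, 100, 1.0
predicted = sigmoid(np.log(N) - np.log(M) - np.log(sigma_n**2))
empirical = np.mean([simulated_feve(N, M, sigma_n, seed=s) for s in range(3)])
```

For \(N = 1000\), \(M = 100\), and unit noise, the prediction is \(N/(N + M\sigma_N^2) \approx 0.91\), and the simulated FEVE lands close to it (the \(X^\top X \approx N I_M\) approximation contributes a small bias).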

Thus, under the assumptions we use, the scaling law for linear regression is a log-sigmoid. We note that our expression is valid when there are more observations than regressors, \(N > M\), with potentially high noise: the so-called classic regime. By contrast, Canatar et al. (2023) [113] tackle a different regime where there are far more regressors than observations, \(M > N\), and no noise: the so-called modern or interpolation regime.

We believe that the scenario we consider is a better reflection of practices in fitting digital twins, as opposed to the analysis in [113], which better reflects practices in comparing brains with task-driven neural networks. Neural data is noisy, and in the limit of high observation noise, fits do not always improve in the interpolation regime. It is common for digital twins to be fit in the classic regime; for example, Lurz et al. (2021) [87] use fewer than a hundred parameters in their readout but thousands of observations. Future work should address how these two frameworks can be merged to cover the full range of scenarios in which digital twins and task-driven neural networks are fit.

[1] Mnih, V., K. Kavukcuoglu, D. Silver, et al., “Playing atari with deep reinforcement learning”, arXiv [cs.LG], 2013.
[2] Silver, D., A. Huang, C.J. Maddison, et al., “Mastering the game of go with deep neural networks and tree search”, Nature 529(7587), 2016, pp. 484–489.
[3] Silver, D., T. Hubert, J. Schrittwieser, et al., “A general reinforcement learning algorithm that masters chess, shogi, and go through self-play”, Science 362(6419), 2018, pp. 1140–1144.
[4] Krizhevsky, A., I. Sutskever, and G.E. Hinton, “ImageNet classification with deep convolutional neural networks”, Advances in neural information processing systems, Curran Associates, Inc. (2012).
[5] Dosovitskiy, A., L. Beyer, A. Kolesnikov, et al., “An image is worth 16x16 words: Transformers for image recognition at scale”, 2020.
[6] Grigorescu, S., B. Trasnea, T. Cocias, and G. Macesanu, “A survey of deep learning techniques for autonomous driving”, arXiv [cs.LG], 2019.
[7] Rajpurkar, P., E. Chen, O. Banerjee, and E.J. Topol, “AI in health and medicine”, Nat. Med. 28(1), 2022, pp. 31–38.
[8] Jumper, J., R. Evans, A. Pritzel, et al., “Highly accurate protein structure prediction with AlphaFold”, Nature 596(7873), 2021, pp. 583–589.
[9] Watson, J.L., D. Juergens, N.R. Bennett, et al., “Broadly applicable and accurate protein design by integrating structure prediction networks and diffusion generative models”, 2022.
[10] Brown, T.B., B. Mann, N. Ryder, et al., “Language models are few-shot learners”, 2020.
[11] Bubeck, S., V. Chandrasekaran, R. Eldan, et al., “Sparks of artificial general intelligence: Early experiments with GPT-4”, 2023.
[12] Trinh, T.H., Y. Wu, Q.V. Le, H. He, and T. Luong, “Solving olympiad geometry without human demonstrations”, Nature 625(7995), 2024, pp. 476–482.
[13] Lam, R., A. Sanchez-Gonzalez, M. Willson, et al., “Learning skillful medium-range global weather forecasting”, Science 382(6677), 2023, pp. 1416–1421.
[14] Kochkov, D., J. Yuval, I. Langmore, et al., “Neural general circulation models for weather and climate”, Nature 632(8027), 2024, pp. 1060–1066.
[15] Pyzer-Knapp, E.O., J.W. Pitera, P.W.J. Staar, et al., “Accelerating materials discovery using artificial intelligence, high performance computing and robotics”, Npj Comput. Mater. 8(1), 2022, pp. 1–9.
[16] Rolnick, D., P.L. Donti, L.H. Kaack, et al., “Tackling climate change with machine learning”, arXiv [cs.CY], 2019.
[17] Luccioni, A.S., S. Viguier, and A.-L. Ligozat, “Estimating the carbon footprint of BLOOM, a 176B parameter language model”, Journal of Machine Learning Research, 2023, pp. 253:1–253:15.
[18] Zhang, B.H., B. Lemoine, and M. Mitchell, “Mitigating unwanted biases with adversarial learning”, arXiv [cs.LG], 2018.
[19] Bender, E.M., T. Gebru, A. McMillan-Major, and M. Mitchell, “On the dangers of stochastic parrots: Can language models be too big?”, Proceedings of the 2021 ACM conference on fairness, accountability, and transparency, (2021).
[20] Peterson, D., and S. Hoffman, Geopolitical implications of AI and digital surveillance adoption, Brookings Institute, 2022.
[21] Urbina, F., F. Lentzos, C. Invernizzi, and S. Ekins, “Dual use of artificial intelligence-powered drug discovery”, Nat. Mach. Intell. 4(3), 2022, pp. 189–191.
[22] Bostrom, N., Superintelligence: Paths, dangers, strategies, Oxford University Press, 2014.
[23] Amodei, D., C. Olah, J. Steinhardt, P. Christiano, J. Schulman, and D. Mané, Concrete problems in AI safety, arXiv, 2016.
[24] Bengio, Y., G. Hinton, A. Yao, et al., “Managing extreme AI risks amid rapid progress”, Science 384(6698), 2024, pp. 842–845.
[25] Bengio, Y., et al., Interim report, International Scientific Report on the Safety of Advanced AI, 2024.
[26] Hendrycks, D., M. Mazeika, and T. Woodside, “An overview of catastrophic AI risks”, 2023.
[27] Eloundou, T., S. Manning, P. Mishkin, and D. Rock, “GPTs are GPTs: An early look at the labor market impact potential of large language models”, 2023.
[28] Ramesh, A., P. Dhariwal, A. Nichol, C. Chu, and M. Chen, “Hierarchical text-conditional image generation with CLIP latents”, arXiv [cs.CV], 2022.
[29] Christian, B., The alignment problem: Machine learning and human values, WW Norton, New York, NY, 2021.
[30] Rapp, J.T., B.J. Bremer, and P.A. Romero, “Self-driving laboratories to autonomously navigate the protein fitness landscape”, Nat Chem Eng 1(1), 2024, pp. 97–107.
[31] Park, J.S., J.C. O’Brien, C.J. Cai, M.R. Morris, P. Liang, and M.S. Bernstein, “Generative agents: Interactive simulacra of human behavior”, arXiv [cs.HC], 2023.
[32] Lála, J., O. O’Donoghue, A. Shtedritski, S. Cox, S.G. Rodriques, and A.D. White, “PaperQA: Retrieval-augmented generative agent for scientific research”, arXiv [cs.CL], 2023.
[33] Lu, C., C. Lu, R.T. Lange, J. Foerster, J. Clune, and D. Ha, “The AI scientist: Towards fully automated open-ended scientific discovery”, arXiv [cs.AI], 2024.
[34] Jimenez, C.E., J. Yang, A. Wettig, et al., “SWE-bench: Can language models resolve real-world GitHub issues?”, arXiv [cs.CL], 2023.
[35] Yang, J., C.E. Jimenez, A. Wettig, et al., “SWE-agent: Agent-computer interfaces enable automated software engineering”, arXiv [cs.SE], 2024.
[36] Russell, S., Human compatible: Artificial intelligence and the problem of control, Viking, 2019.
[37] Richards, B., B. Agüera y Arcas, G. Lajoie, and D. Sridhar, “The illusion of AI’s existential risk”, Noema, 2023.
[38] Cisek, P., “Evolution of behavioural control from chordates to primates”, Phil. Trans. R. Soc. B 377(1844), 2022.
[39] Bennett, M., A brief history of intelligence, Mariner Books, 2023.
[40] Ortega, P.A., V. Maini, and the DeepMind Safety Team, “Building safe artificial intelligence: Specification, robustness, and assurance”, DeepMind Safety Research Blog, 2018.
[41] Geirhos, R., J.-H. Jacobsen, C. Michaelis, et al., “Shortcut learning in deep neural networks”, Nat Mach Intell 2(11), 2020, pp. 665–673.
[42] Lindsay, G.W., and D. Bau, “Testing methods of neural systems understanding”, Cogn. Syst. Res. 82, 2023, pp. 101156.
[43] Goodfellow, I.J., J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples”, arXiv [stat.ML], 2014.
[44] Elsayed, G.F., S. Shankar, B. Cheung, et al., “Adversarial examples that fool both computer vision and time-limited humans”, Neural Inf Process Syst 31, 2018, pp. 3914–3924.
[45] Ilyas, A., S. Santurkar, D. Tsipras, L. Engstrom, B. Tran, and A. Madry, “Adversarial examples are not bugs, they are features”, 2019.
[46] Guo, C., M.J. Lee, G. Leclerc, et al., “Adversarially trained neural representations may already be as robust as corresponding biological neural representations”, 2022.
[47] Bartoldson, B.R., J. Diffenderfer, K. Parasyris, and B. Kailkhura, “Adversarial robustness limits via scaling-law and human-alignment studies”, arXiv [cs.LG], 2024.
[48] Fort, S., and B. Lakshminarayanan, “Ensemble everything everywhere: Multi-scale aggregation for adversarial robustness”, arXiv [cs.CV], 2024.
[49] Li, Z., W. Brendel, E. Walker, et al., “Learning from brains how to regularize machines”, Advances in neural information processing systems, Curran Associates, Inc. (2019).
[50] Dapello, J., T. Marques, M. Schrimpf, F. Geiger, D. Cox, and J.J. DiCarlo, “Simulating a primary visual cortex at the front of CNNs improves robustness to image perturbations”, Advances in Neural Information Processing Systems 33, 2020, pp. 13073–13087.
[51] Safarani, S., A. Nix, K. Willeke, et al., “Towards robust vision by multi-task learning on monkey visual cortex”, arXiv [cs.CV], 2021.
[52] Li, Z., J. Ortega Caro, E. Rusak, et al., “Robust deep learning object recognition models rely on low frequency information in natural images”, PLOS Computational Biology 19(3), 2023, pp. e1010932.
[53] Marr, D., Vision: A computational approach, Freeman & Co., San Francisco, 1982.
[54] Bostrom, N., and A. Sandberg, Whole brain emulation: A roadmap, Future of Humanity Institute, 2008.
[55] Byrnes, S., “Intro to brain-like-AGI safety”, 2022.
[56] Cvitkovic, M., “Neurotechnology is critical for AI alignment”, 2023.
[57] Thiergart, L., and S.L. Norman, Distillation of neurotech and alignment workshop january 2023, LessWrong, 2023.
[58] Olah, C., and S. Carter, “Research debt”, Distill, 2017.
[59] Kahneman, D., Thinking, fast and slow, Farrar, Straus & Giroux, New York, NY, 2013.
[60] Harari, Y.N., Sapiens: A brief history of humankind, Harper, New York, NY, 2015.
[61] Descartes, R., Discourse on method, 1637.
[62] Zhuang, C., S. Yan, A. Nayebi, et al., “Unsupervised neural network models of the ventral visual stream”, Proc. Natl. Acad. Sci. U. S. A. 118(3), 2021.
[63] Nagarajan, V., A. Andreassen, and B. Neyshabur, “Understanding the failure modes of out-of-distribution generalization”, 2021.
[64] Lake, B.M., T.D. Ullman, J.B. Tenenbaum, and S.J. Gershman, “Building machines that learn and think like people”, Behav. Brain Sci. 40, 2017, pp. e253.
[65] Geirhos, R., P. Rubisch, C. Michaelis, M. Bethge, F.A. Wichmann, and W. Brendel, “ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness”, arXiv.org, 2018.
[66] Wang, E.Y., P.G. Fahey, K. Ponder, et al., “Towards a foundation model of the mouse visual cortex”, bioRxiv; Accepted in Nature, 2024.
[67] Bashivan, P., K. Kar, and J.J. DiCarlo, “Neural population control via deep image synthesis”, Science 364(6439), 2019.
[68] Walker, E.Y., F.H. Sinz, E. Cobos, et al., “Inception loops discover what excites neurons most using deep predictive models”, Nat. Neurosci. 22(12), 2019, pp. 2060–2065.
[69] Gu, J., X. Jia, P. de Jorge, et al., “A survey on transferability of adversarial examples across deep neural networks”, arXiv [cs.CV], 2023.
[70] Brendel, W., J. Rauber, and M. Bethge, “Decision-based adversarial attacks: Reliable attacks against black-box machine learning models”, arXiv [stat.ML], 2017.
[71] Athalye, A., L. Engstrom, A. Ilyas, and K. Kwok, “Synthesizing robust adversarial examples”, arXiv [cs.CV], 2017.
[72] Howe, N., M. Zajac, I. McKenzie, et al., “Exploring scaling trends in LLM robustness”, arXiv [cs.LG], 2024.
[73] Wang, T.T., A. Gleave, T. Tseng, et al., “Adversarial policies beat superhuman go AIs”, arXiv [cs.LG], 2022.
[74] Gleave, A., “AI safety in a world of vulnerable machine learning systems”, 2023.
[75] Zador, A.M., “A critique of pure learning and what artificial neural networks can learn from animal brains”, Nat. Commun. 10(1), 2019, pp. 1–7.
[76] Hasson, U., S.A. Nastase, and A. Goldstein, “Direct fit to nature: An evolutionary perspective on biological and artificial neural networks”, Neuron 105(3), 2020, pp. 416–434.
[77] Lake, B.M., R. Salakhutdinov, and J.B. Tenenbaum, “Human-level concept learning through probabilistic program induction”, Science 350(6266), 2015, pp. 1332–1338.
[78] Filos, A., P. Tigas, R. McAllister, N. Rhinehart, S. Levine, and Y. Gal, “Can autonomous vehicles identify, recover from, and adapt to distribution shifts?”, arXiv [cs.LG], 2020.
[79] Sucholutsky, I., L. Muttenthaler, A. Weller, et al., “Getting aligned on representational alignment”, 2023.
[80] Dalrymple, D.“davidad”., J. Skalse, Y. Bengio, et al., “Towards guaranteed safe AI: A framework for ensuring robust and reliable AI systems”, 2024.
[81] Linsley, D., P. Zhou, A.K. Ashok, et al., “The 3D-PC: A benchmark for visual perspective taking in humans and machines”, arXiv [cs.CV], 2024.
[82] Sinz, F., A.S. Ecker, P. Fahey, et al., “Stimulus domain transfer in recurrent models for large scale cortical population prediction on video”, Advances in neural information processing systems 31, 2018.
[83] Franke, K., K.F. Willeke, K. Ponder, et al., “State-dependent pupil dilation rapidly shifts visual feature selectivity”, Nature 610(7930), 2022, pp. 128–134.
[84] Fu, J., S. Shrinivasan, K. Ponder, et al., “Pattern completion and disruption characterize contextual modulation in mouse visual cortex”, bioRxiv and under revision in Nature, 2023.
[85] Ding, Z., D.T. Tran, K. Ponder, et al., “Bipartite invariance in mouse primary visual cortex”, bioRxiv and under revision in Nature, 2023.
[86] Wu, M.C.K., S.V. David, and J.L. Gallant, “Complete functional characterization of sensory neurons by system identification”, Annu. Rev. Neurosci. 29, 2006, pp. 477–505.
[87] Lurz, K.-K., M. Bashiri, K. Willeke, et al., “Generalization in data-driven models of primary visual cortex”, (2020).
[88] Cobos, E., T. Muhammad, P.G. Fahey, et al., “It takes neurons to understand neurons: Digital twins of visual cortex synthesize neural metamers”, 2022.
[89] Du, F., M.A. Núñez-Ochoa, M. Pachitariu, and C. Stringer, “Towards a simplified model of primary visual cortex”, 2024.
[90] Ding, Z., D.T. Tran, K. Ponder, et al., “Bipartite invariance in mouse primary visual cortex”, bioRxiv, 2023.
[91] Cadena, S.A., G.H. Denfield, E.Y. Walker, et al., “Deep convolutional models improve predictions of macaque V1 responses to natural images”, PLoS Comput. Biol. 15(4), 2019, pp. e1006897.
[92] Miao, H.-Y., and F. Tong, “Convolutional neural network models of neuronal responses in macaque V1 reveal limited non-linear processing”, bioRxiv, 2023, pp. 2023.08.26.554952.
[93] Willeke, K.F., K. Restivo, K. Franke, et al., “Deep learning-driven characterization of single cell tuning in primate visual area V4 unveils topological organization”, bioRxiv, 2023, pp. 2023.05.12.540591.
[94] Cowley, B.R., P.L. Stan, J.W. Pillow, and M.A. Smith, “Compact deep neural network models of visual cortex”, 2023.
[95] Cadena, S.A., K.F. Willeke, K. Restivo, et al., “Diverse task-driven modeling of macaque V4 reveals functional specialization towards semantic tasks”, PLoS Comput. Biol. 20(5), 2024, pp. e1012056.
[96] Pierzchlewicz, P.A., K.F. Willeke, A.F. Nix, et al., “Energy guided diffusion for generating neurally exciting images”, 2023.
[97] Wang, T., T.S. Lee, H. Yao, et al., “Large-scale calcium imaging reveals a systematic V4 map for encoding natural scenes”, Nat. Commun. 15(1), 2024, pp. 6401.
[98] Rust, N.C., V. Mante, E.P. Simoncelli, and J.A. Movshon, “How MT cells analyze the motion of visual patterns”, Nat. Neurosci. 9(11), 2006, pp. 1421–1431.
[99] Nishimoto, S., and J.L. Gallant, “A three-dimensional spatiotemporal receptive field model explains responses of area MT neurons to naturalistic movies”, J. Neurosci. 31(41), 2011, pp. 14551–14564.
[100] Willmore, B.D.B., R.J. Prenger, and J.L. Gallant, “Neural representation of natural images in visual area V2”, Journal of Neuroscience 30(6), 2010, pp. 2102.
[101] Mineault, P.J., F.A. Khawaja, D.A. Butts, and C.C. Pack, “Hierarchical processing of complex motion along the primate dorsal visual pathway”, Proc. Natl. Acad. Sci. U. S. A. 109, 2012, pp. 972–980.
[102] Pasupathy, A., and C.E. Connor, “Population coding of shape in area V4”, Nat. Neurosci. 5(12), 2002, pp. 1332–1338.
[103] Rançon, U., T. Masquelier, and B.R. Cottereau, “A general model unifying the adaptive, transient and sustained properties of ON and OFF auditory neural responses”, PLoS Comput. Biol. 20(8), 2024, pp. e1012288.
[104] Marin Vargas, A., A. Bisi, A.S. Chiappa, C. Versteeg, L.E. Miller, and A. Mathis, “Task-driven neural network models predict neural dynamics of proprioception”, Cell 187(7), 2024, pp. 1745–1761.e19.
[105] Fu, J., S. Shrinivasan, K. Ponder, et al., “Pattern completion and disruption characterize contextual modulation in mouse visual cortex”, 2023.
[106] Hubel, D.H., and T.N. Wiesel, “Receptive fields of single neurones in the cat’s striate cortex”, J. Physiol. 148, 1959, pp. 574–591.
[107] Olshausen, B.A., and D.J. Field, “What is the other 85 percent of V1 doing?”, In 23 problems in systems neuroscience. Oxford University Press, New York, 2006, 182–212.
[108] Li, B.M., I.M. Cornacchia, N.L. Rochefort, and A. Onken, “V1T: Large-scale mouse V1 response prediction using a vision transformer”, arXiv [cs.CV], 2023.
[109] Kaplan, J., S. McCandlish, T. Henighan, et al., “Scaling laws for neural language models”, 2020.
[110] Talebi, V., and C.L. Baker Jr, “Natural versus synthetic stimuli for estimating receptive field models: A comparison of predictive robustness”, J. Neurosci. 32(5), 2012, pp. 1560–1576.
[111] Cowley, B., and J.W. Pillow, “High-contrast “gaudy” images improve the training of deep neural network models of visual cortex”, Advances in neural information processing systems, Curran Associates, Inc. (2020), 21591–21603.
[112] Ponce, C.R., W. Xiao, P.F. Schade, T.S. Hartmann, G. Kreiman, and M.S. Livingstone, “Evolving images for visual neurons using a deep generative network reveals coding principles and neuronal preferences”, Cell 177(4), 2019, pp. 999–1009.e10.
[113] Canatar, A., J. Feather, A. Wakhloo, and S. Chung, “A spectral theory of neural prediction and alignment”, Neural information processing systems, (2023).
[114] Bashiri, M., E. Walker, K.-K. Lurz, et al., “A flow-based latent state generative model of neural population responses to natural images”, Advances in Neural Information Processing Systems 34, 2021, pp. 15801–15815.
[115] Williams, A.H., E. Kunz, S. Kornblith, and S.W. Linderman, “Generalized shape metrics on neural representations”, arXiv:2110.14739 [cs, stat], 2021.
[116] Duong, L.R., J. Zhou, J. Nassar, J. Berman, J. Olieslagers, and A.H. Williams, “Representational dissimilarity metric spaces for stochastic neural networks”, 2023.
[117] Kriegeskorte, N., and X.-X. Wei, “Neural tuning and representational geometry”, Nat. Rev. Neurosci. 22(11), 2021, pp. 703–718.
[118] Yasar, T.B., P. Gombkoto, A.L. Vyssotski, et al., “Months-long tracking of neuronal ensembles spanning multiple brain areas with ultra-flexible tentacle electrodes”, Nat. Commun. 15(1), 2024, pp. 4822.
[119] Tootell, R.B.H., D. Tsao, and W. Vanduffel, “Neuroimaging weighs in: Humans meet macaques in ‘primate’ visual cortex”, J. Neurosci. 23(10), 2003, pp. 3981–3989.
[120] Liu, X., P. Chen, X. Ding, et al., “A narrative review of cortical visual prosthesis systems: The latest progress and significance of nanotechnology for the future”, Ann. Transl. Med. 10(12), 2022, pp. 716.
[121] St-Yves, G., E.J. Allen, Y. Wu, K. Kay, and T. Naselaris, “Brain-optimized neural networks learn non-hierarchical models of representation in human visual cortex”, 2022.
[122] Card, N.S., M. Wairagkar, C. Iacobacci, et al., “An accurate and rapidly calibrating speech neuroprosthesis”, 2023.
[123] Moses, D.A., S.L. Metzger, J.R. Liu, et al., “Neuroprosthesis for decoding speech in a paralyzed person with anarthria”, N. Engl. J. Med. 385(3), 2021, pp. 217–227.
[124] Bouchard, K.E., N. Mesgarani, K. Johnson, and E.F. Chang, “Functional organization of human sensorimotor cortex for speech articulation”, Nature advance online publication, 2013.
[125] Mathis, M.W., A. Perez Rotondo, E.F. Chang, A.S. Tolias, and A. Mathis, “Decoding the brain: From neural representations to mechanistic models”, Cell 187(21), 2024, pp. 5814–5832.
[126] Défossez, A., C. Caucheteux, J. Rapin, O. Kabeli, and J.-R. King, “Decoding speech from non-invasive brain recordings”, 2022.
[127] Vaidya, A.R., S. Jain, and A.G. Huth, “Self-supervised models of audio effectively explain human cortical responses to speech”, 2022.
[128] Schrimpf, M., I. Blank, G. Tuckute, et al., “Artificial neural networks accurately predict language processing in the brain”, bioRxiv, 2020, pp. 2020.06.26.174482.
[129] Zhou, Z., and C. Firestone, “Humans can decipher adversarial images”, Nat. Commun. 10(1), 2018, pp. 1334.
[130] Veerabadran, V., J. Goldman, S. Shankar, et al., “Subtle adversarial image manipulations influence both human and machine perception”, Nat. Commun. 14(1), 2023, pp. 4933.
[131] Gaziv, G., M.J. Lee, and J. DiCarlo, “Strong and precise modulation of human percepts via robustified ANNs”, Neural Inf Process Syst, 2023.
[132] Dapello, J., K. Kar, M. Schrimpf, et al., “Aligning model and macaque inferior temporal cortex representations improves model-to-human behavioral alignment and adversarial robustness”, 2022.
[133] Li, Z., J. Ortega Caro, E. Rusak, et al., “Robust deep learning object recognition models rely on low frequency information in natural images”, PLOS Computational Biology 19(3), 2023, pp. e1010932.
[134] Pirlot, C., R.C. Gerum, C. Efird, J. Zylberberg, and A. Fyshe, “Improving the accuracy and robustness of CNNs using a deep CCA neural data regularizer”, 2022.
[135] Croce, F., M. Andriushchenko, V. Sehwag, et al., “RobustBench: A standardized adversarial robustness benchmark”, arXiv preprint arXiv:2010.09670, 2020.
[136] Attias, E., C. Pehlevan, and D. Obeid, “A brain-inspired regularizer for adversarial robustness”, arXiv [cs.LG], 2024.
[137] Stevenson, I.H., and K.P. Kording, “How advances in neural recording affect data analysis”, Nat. Neurosci. 14(2), 2011, pp. 139–142.
[138] Urai, A.E., B. Doiron, A.M. Leifer, and A.K. Churchland, “Large-scale neural recordings call for new insights to link brain and behavior”, Nat Neurosci 25(1), 2022, pp. 11–19.
[139] Manley, J., S. Lu, K. Barber, et al., “Simultaneous, cortex-wide dynamics of up to 1 million neurons reveal unbounded scaling of dimensionality with neuron number”, Neuron 112(10), 2024, pp. 1694–1709.e5.
[140] Richards, B.A., T.P. Lillicrap, P. Beaudoin, et al., “A deep learning framework for neuroscience”, Nat. Neurosci. 22(11), 2019, pp. 1761–1770.
[141] Sandbrink, J., H. Hobbs, J. Swett, A. Dafoe, and A. Sandberg, “Differential technology development: A responsible innovation principle for navigating technology risks”, SSRN Electron. J., 2022.
[142] Hanson, R., The age of em: Work, love, and life when robots rule the earth, Oxford University Press, London, England, 2016.
[143] Brunton, S.L., and J.N. Kutz, Data-driven science and engineering: Machine learning, dynamical systems, and control, Cambridge University Press, Cambridge, England, 2019.
[144] Zador, A., B. Richards, B. Ölveczky, et al., “Toward next-generation artificial intelligence: Catalyzing the NeuroAI revolution”, 2022.
[145] Merel, J., D. Aldarondo, J. Marshall, Y. Tassa, G. Wayne, and B. Ölveczky, “Deep neuroethology of a virtual rodent”, arXiv:1911.09451 [q-bio], 2019.
[146] Aldarondo, D., J. Merel, J.D. Marshall, et al., “A virtual rodent predicts the structure of neural activity across behaviours”, Nature 632(8025), 2024, pp. 594–602.
[147] Dorkenwald, S., A. Matsliah, A.R. Sterling, et al., “Neuronal wiring diagram of an adult brain”, bioRxiv, 2023, pp. 2023.06.27.546656.
[148] Azevedo, A., E. Lesser, J.S. Phelps, et al., “Connectomic reconstruction of a female drosophila ventral nerve cord”, Nature 631(8020), 2024, pp. 360–368.
[149] Shiu, P.K., G.R. Sterne, N. Spiller, et al., “A drosophila computational brain model reveals sensorimotor processing”, Nature 634(8032), 2024, pp. 210–219.
[150] Vaxenburg, R., I. Siwanowicz, J. Merel, et al., “Whole-body simulation of realistic fruit fly locomotion with deep reinforcement learning”, 2024.
[151] Lappalainen, J.K., F.D. Tschopp, S. Prakhya, et al., “Connectome-constrained deep mechanistic networks predict neural responses across the fly visual system at single-neuron resolution”, 2023.
[152] Cowley, B.R., A.J. Calhoun, N. Rangarajan, et al., “Mapping model units to visual neurons reveals population code for social behaviour”, Nature 629(8014), 2024, pp. 1100–1108.
[153] Cook, S.J., T.A. Jarrell, C.A. Brittin, et al., “Whole-animal connectomes of both caenorhabditis elegans sexes”, Nature 571(7763), 2019, pp. 63–71.
[154] Haspel, G., B. Baker, I. Beets, et al., “The time is ripe to reverse engineer an entire nervous system: Simulating behavior from neural interactions”, arXiv [q-bio.NC], 2023.
[155] Simeon, Q., L.R. Venâncio, K.I. Zhao, et al., “Scaling properties for artificial neural network models of the C. elegans nervous system”, 2023.
[156] Moravec, H., Mind children: The future of robot and human intelligence, Harvard University Press, 1988.
[157] OpenAI, J. Achiam, S. Adler, et al., “GPT-4 technical report”, arXiv [cs.CL], 2023.
[158] Dubey, A., A. Jauhri, A. Pandey, et al., “The llama 3 herd of models”, arXiv [cs.AI], 2024.
[159] Bommasani, R., D.A. Hudson, E. Adeli, et al., “On the opportunities and risks of foundation models”, 2022.
[160] Paszke, A., S. Gross, F. Massa, et al., “PyTorch: An imperative style, high-performance deep learning library”, 2019.
[161] Wang, R., and Z.S. Chen, “Large-scale foundation models and generative AI for BigData neuroscience”, 2023.
[162] Pandarinath, C., D.J. O’Shea, J. Collins, et al., “Inferring single-trial neural population dynamics using sequential auto-encoders”, Nat. Methods 15(10), 2018, pp. 805–815.
[163] Ye, J., and C. Pandarinath, “Representation learning for neural population activity with neural data transformers”, 2021.
[164] Hurwitz, C., N. Kudryashova, A. Onken, and M.H. Hennig, “Building population models for large-scale neural recordings: Opportunities and pitfalls”, Curr. Opin. Neurobiol. 70, 2021, pp. 64–73.
[165] Azabou, M., V. Arora, V. Ganesh, et al., “A unified, scalable framework for neural population decoding”, Neural information processing systems, (2023).
[166] Zhang, Y., Y. Wang, D. Jimenez-Beneto, et al., “Towards a ‘universal translator’ for neural dynamics at single-cell, single-spike resolution”, 2024.
[167] Schulz, A., J. Vetter, R. Gao, et al., “Modeling conditional distributions of neural and behavioral data with masked variational autoencoders”, bioRxiv, 2024, pp. 2024.04.19.590082.
[168] Caro, J.O., A.H.O. Fonseca, C. Averill, et al., “BrainLM: A foundation model for brain activity recordings”, 2023.
[169] Chau, G., C. Wang, S. Talukder, et al., “Population transformer: Learning population-level representations of neural activity”, arXiv [cs.LG], 2024.
[170] Abbaspourazad, S., O. Elachqar, A.C. Miller, S. Emrani, U. Nallasamy, and I. Shapiro, “Large-scale training of foundation models for wearable biosignals”, 2024.
[171] Jiang, W., L. Zhao, and B.-L. Lu, “Large brain model for learning generic representations with tremendous EEG data in BCI”, (2023).
[172] Csaky, R., M.W.J. van Es, O.P. Jones, and M. Woolrich, “Foundational GPT model for MEG”, arXiv [cs.LG], 2024.
[173] CTRL-labs at Reality Labs, D. Sussillo, P. Kaifosh, and T. Reardon, “A generic noninvasive neuromotor interface for human-computer interaction”, 2024.
[174] Kupers, E.R., T. Knapen, E.P. Merriam, and K.N. Kay, “Principles of intensive human neuroimaging”, Trends Neurosci., 2024.
[175] Yang, C., M. Westover, and J. Sun, “BIOT: Biosignal transformer for cross-data learning in the wild”, Neural information processing systems, (2023).
[176] Thomas, A.W., C. Ré, and R.A. Poldrack, “Self-supervised learning of brain dynamics from broad neuroimaging data”, 2023.
[177] Thapa, R., B. He, M.R. Kjaer, et al., “SleepFM: Foundation model for sleep analysis”, (2024).
[178] Ho, A., T. Besiroglu, E. Erdil, et al., “Algorithmic progress in language models”, arXiv [cs.CL], 2024.
[179] Kato, S., H.S. Kaplan, T. Schrödel, et al., “Global brain dynamics embed the motor command sequence of Caenorhabditis elegans”, Cell 163(3), 2015, pp. 656–669.
[180] Ahrens, M.B., M.B. Orger, D.N. Robson, J.M. Li, and P.J. Keller, “Whole-brain functional imaging at cellular resolution using light-sheet microscopy”, Nat. Methods 10(5), 2013, pp. 413–420.
[181] Gao, P., E. Trautmann, B. Yu, et al., “A theory of multineuronal dimensionality, dynamics and measurement”, 2017.
[182] Jazayeri, M., and S. Ostojic, “Interpreting neural computations by examining intrinsic and embedding dimensionality of neural activity”, arXiv [q-bio.NC], 2021.
[183] Stringer, C., M. Pachitariu, N. Steinmetz, M. Carandini, and K.D. Harris, “High-dimensional geometry of population responses in visual cortex”, bioRxiv, 2018, pp. 374090.
[184] Williams, A., “How to cross-validate PCA, clustering, and matrix decomposition models”, 2018.
[185] Yemini, E., A. Lin, A. Nejatbakhsh, et al., “NeuroPAL: A multicolor atlas for whole-brain neuronal identification in C. elegans”, Cell 184(1), 2021, pp. 272–288.e11.
[186] Suzuki, R., C. Wen, D. Sprague, S. Onami, and K.D. Kimura, “Whole-brain spontaneous GCaMP activity with NeuroPAL cell ID information of semi-restricted worms”, 2024.
[187] Chen, X., Y. Mu, Y. Hu, et al., “Brain-wide organization of neuronal activity and convergent sensorimotor transformations in larval zebrafish”, Neuron 100(4), 2018, pp. 876–890.e5.
[188] Stringer, C., M. Michaelos, D. Tsyboulski, S.E. Lindo, and M. Pachitariu, “High-precision coding in visual cortex”, Cell 184(10), 2021, pp. 2767–2778.e15.
[189] Friston, K., “The free-energy principle: A unified brain theory?”, Nat. Rev. Neurosci. 11(2), 2010, pp. 127–138.
[190] Kluger, D.S., M.G. Allen, and J. Gross, “Brain-body states embody complex temporal dynamics”, Trends Cogn. Sci. 28(8), 2024, pp. 695–698.
[191] Zhao, M., N. Wang, X. Jiang, et al., “MetaWorm: An integrative data-driven model simulating C. elegans brain, body and environment interactions”, bioRxiv, 2024, pp. 2024.02.22.581686.
[192] Lobato-Rios, V., S.T. Ramalingasetty, P.G. Özdil, J. Arreguit, A.J. Ijspeert, and P. Ramdya, “NeuroMechFly, a neuromechanical model of adult Drosophila melanogaster”, Nat. Methods 19(5), 2022, pp. 620–627.
[193] Guo, Y., W.J. Veneman, H.P. Spaink, and F.J. Verbeek, “Three-dimensional reconstruction and measurements of zebrafish larvae from high-throughput axial-view in vivo imaging”, Biomed. Opt. Express 8(5), 2017, pp. 2611–2634.
[194] Ramalingasetty, S.T., S.M. Danner, J. Arreguit, et al., “A whole-body musculoskeletal model of the mouse”, IEEE Access 9(99), 2021, pp. 163861–163881.
[195] Gilmer, J.I., S. Coltman, G.C. Velasco, et al., “A novel biomechanical model of the mouse forelimb predicts muscle activity in optimal control simulations of reaching movements”, bioRxiv, 2024, pp. 2024.09.05.611289.
[196] DeWolf, T., S. Schneider, P. Soubiran, A. Roggenbach, and M. Mathis, “Neuro-musculoskeletal modeling reveals muscle-level neural dynamics of adaptive learning in sensorimotor cortex”, bioRxiv, 2024, pp. 2024.09.11.612513.
[197] Almani, M.N., J. Lazzari, A. Chacon, and S. Saxena, “μSim: A goal-driven framework for elucidating the neural control of movement through musculoskeletal modeling”, bioRxiv, 2024, pp. 2024–02.
[198] Caggiano, V., H. Wang, G. Durandau, M. Sartori, and V. Kumar, “MyoSuite – a contact-rich simulation suite for musculoskeletal motor control”, arXiv [cs.RO], 2022.
[199] Todorov, E., T. Erez, and Y. Tassa, “MuJoCo: A physics engine for model-based control”, 2012 IEEE/RSJ international conference on intelligent robots and systems, IEEE (2012), 5026–5033.
[200] Codol, O., J.A. Michaels, M. Kashefi, J.A. Pruszynski, and P.L. Gribble, “MotorNet: A Python toolbox for controlling differentiable biomechanical effectors with artificial neural networks”, eLife, 2024.
[201] Barbagli, F., A. Frisoli, K. Salisbury, and M. Bergamasco, “Simulating human fingers: A soft finger proxy model and algorithm”, 12th international symposium on haptic interfaces for virtual environment and teleoperator systems, 2004. HAPTICS ’04. proceedings, IEEE (2004), 9–17.
[202] Bhattasali, N.X., A. Zador, and T.A. Engel, “Neural circuit architectural priors for embodied control”, Neural Inf Process Syst abs/2201.05242, 2022, pp. 12744–12759.
[203] Mathis, A., P. Mamidanna, K.M. Cury, et al., “DeepLabCut: Markerless pose estimation of user-defined body parts with deep learning”, Nat. Neurosci. 21(9), 2018, pp. 1281–1289.
[204] Karashchuk, P., K.L. Rupp, E.S. Dickinson, et al., “Anipose: A toolkit for robust markerless 3D pose estimation”, Cell Rep. 36(13), 2021, pp. 109730.
[205] Pereira, T.D., N. Tabris, A. Matsliah, et al., “SLEAP: A deep learning system for multi-animal pose tracking”, Nat. Methods 19(4), 2022, pp. 486–495.
[206] Dunn, T.W., J.D. Marshall, K.S. Severson, et al., “Geometric deep learning enables 3D kinematic profiling across species and environments”, Nat. Methods 18(5), 2021, pp. 564–573.
[207] Mildenhall, B., P.P. Srinivasan, M. Tancik, J.T. Barron, R. Ramamoorthi, and R. Ng, “NeRF: Representing scenes as neural radiance fields for view synthesis”, arXiv [cs.CV], 2020.
[208] Kerbl, B., G. Kopanas, T. Leimkühler, and G. Drettakis, “3D Gaussian splatting for real-time radiance field rendering”, arXiv [cs.GR], 2023.
[209] Maheswaranathan, N., L.T. McIntosh, H. Tanaka, et al., “Interpreting the retinal neural code for natural scenes: From computations to neurons”, Neuron 111(17), 2023, pp. 2742–2755.e4.
[210] Seung, H.S., “Predicting visual function by interpreting a neuronal wiring diagram”, Nature 634(8032), 2024, pp. 113–123.
[211] Bermejo, R., D. Houben, and H.P. Zeigler, “Optoelectronic monitoring of individual whisker movements in rats”, J. Neurosci. Methods 83(2), 1998, pp. 89–96.
[212] Knutsen, P.M., D. Derdikman, and E. Ahissar, “Tracking whisker and head movements in unrestrained behaving rodents”, J. Neurophysiol. 93(4), 2005, pp. 2294–2301.
[213] Wilson, A.D., and M. Baietto, “Applications and advances in electronic-nose technologies”, Sensors (Basel) 9(7), 2009, pp. 5099–5148.
[214] Mamiya, A., P. Gurung, and J.C. Tuthill, “Neural coding of leg proprioception in Drosophila”, Neuron 100(3), 2018, pp. 636–650.e6.
[215] Mamiya, A., A. Sustar, I. Siwanowicz, et al., “Biomechanical origins of proprioceptor feature selectivity and topographic maps in the Drosophila leg”, Neuron 111(20), 2023, pp. 3230–3243.e14.
[216] Martinelli, F., B. Simsek, W. Gerstner, and J. Brea, “Expand-and-cluster: Parameter recovery of neural networks”, 2024.
[217] Liu, H.X., and S. Feng, “Curse of rarity for autonomous vehicles”, Nat. Commun. 15(1), 2024, pp. 4808.
[218] Beiran, M., and A. Litwin-Kumar, “Prediction of neural activity in connectome-constrained recurrent networks”, 2024.
[219] Pearl, J., Causality: Models, reasoning, and inference, Cambridge University Press, Cambridge, England, 2009.
[220] Bailey, D.H., A.J. Jung, A.M. Beltz, et al., “Causal inference on human behaviour”, Nature Human Behaviour 8(8), 2024, pp. 1448–1459.
[221] Zhao, W., J.P. Queralta, and T. Westerlund, “Sim-to-real transfer in deep reinforcement learning for robotics: A survey”, arXiv [cs.LG], 2020.
[222] Howard, J., and S. Ruder, “Universal language model fine-tuning for text classification”, arXiv [cs.CL], 2018.
[223] O’Byrne, J., and K. Jerbi, “How critical is brain criticality?”, Trends Neurosci. 45(11), 2022, pp. 820–837.
[224] Rust, N., “Is the brain uncontrollable, like the weather?”, 2023.
[225] Bischoff, S., A. Darcher, M. Deistler, et al., “A practical guide to statistical distances for evaluating generative models in science”, arXiv [cs.LG], 2024.
[226] Ostrow, M., A. Eisen, L. Kozachkov, and I. Fiete, “Beyond geometry: Comparing the temporal structure of computation in neural circuits with dynamical similarity analysis”, Advances in Neural Information Processing Systems 36, 2024.
[227] Heusel, M., H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter, “GANs trained by a two time-scale update rule converge to a local nash equilibrium”, Advances in Neural Information Processing Systems 30, 2017.
[228] Zheng, L., W.-L. Chiang, Y. Sheng, et al., “Judging LLM-as-a-judge with MT-bench and chatbot arena”, arXiv [cs.CL], 2023.
[229] Sarma, G.P., A. Safron, and N.J. Hay, “Integrative biological simulation, neuropsychology, and AI safety”, 2019.
[230] Einevoll, G.T., A. Destexhe, M. Diesmann, et al., “The scientific case for brain simulations”, arXiv [q-bio.NC], 2019.
[231] Haufler, D., S. Ito, C. Koch, and A. Arkhipov, “Simulations of cortical networks using spatially extended conductance-based neuronal models”, J. Physiol. 601(15), 2023, pp. 3123–3139.
[232] Wellcome Trust, “Scaling up connectomics: The road to a whole mouse brain connectome”, 2023.
[233] Pospisil, D.A., M.J. Aragon, S. Dorkenwald, et al., “The fly connectome reveals a path to the effectome”, Nature 634(8032), 2024, pp. 201–209.
[234] Lu, W., L. Zeng, X. Du, et al., “Digital twin brain: A simulation and assimilation platform for whole human brain”, arXiv [cs.NE], 2023.
[235] Yamaura, H., J. Igarashi, and T. Yamazaki, “Simulation of a human-scale cerebellar network model on the K computer”, Front. Neuroinform. 14, 2020, pp. 16.
[236] Marblestone*, A.H., B.M. Zamft*, Y.G. Maguire, et al., “Physical principles for scalable neural recording”, Front. Comput. Neurosci. 7, 2013.
[237] Januszewski, M., and V. Jain, “Next-generation AI for connectomics”, Nat. Methods 21(8), 2024, pp. 1398–1399.
[238] Haehn, D., J. Hoffer, B. Matejek, et al., “Scalable interactive visualization for connectomics”, Informatics (MDPI) 4(3), 2017, pp. 29.
[239] White, J.G., E. Southgate, J.N. Thomson, and S. Brenner, “The structure of the nervous system of the nematode Caenorhabditis elegans”, Philos. Trans. R. Soc. Lond. B Biol. Sci. 314(1165), 1986, pp. 1–340.
[240] The MICrONS Consortium, J.A. Bae, M. Baptiste, et al., “Functional connectomics spanning multiple areas of mouse visual cortex”, 2023.
[241] Winding, M., B.D. Pedigo, C.L. Barnes, et al., “The connectome of an insect brain”, Science 379(6636), 2023, pp. eadd9330.
[242] Shapson-Coe, A., M. Januszewski, D.R. Berger, et al., “A petavoxel fragment of human cerebral cortex reconstructed at nanoscale resolution”, Science 384(6696), 2024, pp. eadk4858.
[243] Zheng, Z., C.S. Own, A.A. Wanner, et al., “Fast imaging of millimeter-scale areas with beam deflection transmission electron microscopy”, Nat. Commun. 15(1), 2024, pp. 6860.
[244] Januszewski, M., J. Kornfeld, P.H. Li, et al., “High-precision automated reconstruction of neurons with flood-filling networks”, Nat. Methods 15(8), 2018, pp. 605–610.
[245] Sheridan, A., T.M. Nguyen, D. Deb, et al., “Local shape descriptors for neuron segmentation”, Nat. Methods 20(2), 2023, pp. 295–303.
[246] Kebschull, J.M., and A.M. Zador, “Cellular barcoding: Lineage tracing, screening and beyond”, Nat. Methods 15(11), 2018, pp. 871–879.
[247] Chen, X., Y.-C. Sun, H. Zhan, et al., “High-throughput mapping of long-range neuronal projection using in situ sequencing”, Cell 179(3), 2019, pp. 772–786.e19.
[248] Livet, J., T.A. Weissman, H. Kang, et al., “Transgenic strategies for combinatorial expression of fluorescent proteins in the nervous system”, Nature 450(7166), 2007, pp. 56–62.
[249] Viswanathan, S., M.E. Williams, E.B. Bloss, et al., “High-performance probes for light and electron microscopy”, Nat. Methods 12(6), 2015, pp. 568–576.
[250] Mishchenko, Y., “On optical detection of densely labeled synapses in neuropil and mapping connectivity with combinatorially multiplexed fluorescent synaptic markers”, PLoS One 5(1), 2010, pp. e8853.
[251] Marblestone, A.H., and E.S. Boyden, “Designing tools for assumption-proof brain mapping”, Neuron 83(6), 2014, pp. 1239–1241.
[252] Qihua, C., C. Xuejin, W. Chenxuan, L. Yixiong, X. Zhiwei, and W. Feng, “Learning multimodal volumetric features for large-scale neuron tracing”, arXiv [cs.CV], 2024.
[253] Chen, F., P.W. Tillberg, and E.S. Boyden, “Optical imaging. Expansion microscopy”, Science 347(6221), 2015, pp. 543–548.
[254] Tavakoli, M.R., J. Lyudchik, M. Januszewski, et al., “Light-microscopy based dense connectomic reconstruction of mammalian brain tissue”, bioRxiv, 2024, pp. 2024.03.01.582884.
[255] Alon, S., D.R. Goodwin, A. Sinha, et al., “Expansion sequencing: Spatially precise in situ transcriptomics in intact biological systems”, Science 371(6528), 2021, pp. eaax2656.
[256] Kang, J., M.E. Schroeder, Y. Lee, et al., “Multiplexed expansion revealing for imaging multiprotein nanostructures in healthy and diseased brain”, Nat. Commun. 15(1), 2024, pp. 9722.
[257] Marblestone, A.H., “On whole-mammalian-brain connectomics”, 2019.
[258] Rodriques, S., “Optical microscopy provides a path to a $10M mouse brain connectome (if it eliminates proofreading)”, 2023. https://www.sam-rodriques.com/post/optical-microscopy-provides-a-path-to-a-10m-mouse-brain-connectome-if-it-eliminates-proofreading
[259] Linghu, C., S.L. Johnson, P.A. Valdes, et al., “Spatial multiplexing of fluorescent reporters for imaging signaling network dynamics”, Cell 183(6), 2020, pp. 1682–1698.e24.
[260] Zador, A.M., J. Dubnau, H.K. Oyibo, H. Zhan, G. Cao, and I.D. Peikon, “Sequencing the connectome”, PLoS Biol. 10(10), 2012, pp. e1001411.
[261] Shin, D., M.E. Urbanek, H.H. Larson, et al., “High-complexity barcoded rabies virus for scalable circuit mapping using single-cell and single-nucleus sequencing”, bioRxiv, 2024, pp. 2024–10.
[262] Arkhipov, A., N.W. Gouwens, Y.N. Billeh, et al., “Visual physiology of the layer 4 cortical circuit in silico”, PLoS Comput. Biol. 14(11), 2018, pp. e1006535.
[263] Holler, S., G. Köstinger, K.A.C. Martin, G.F.P. Schuhknecht, and K.J. Stratford, “Structure and function of a neocortical synapse”, Nature 591(7848), 2021, pp. 111–116.
[264] Campagnola, L., S.C. Seeman, T. Chartrand, et al., “Local connectivity and synaptic dynamics in mouse and human neocortex”, Science 375(6585), 2022, pp. eabj5861.
[265] Sanfilippo, P., A.J. Kim, A. Bhukel, et al., “Mapping of multiple neurotransmitter receptor subtypes and distinct protein complexes to the connectome”, Neuron 112(6), 2024, pp. 942–958.e13.
[266] Lee, H.S., A. Ghetti, A. Pinto-Duarte, et al., “Astrocytes contribute to gamma oscillations and recognition memory”, Proc. Natl. Acad. Sci. U. S. A. 111(32), 2014, pp. E3343–52.
[267] Shen, F.Y., M.M. Harrington, L.A. Walker, H.P.J. Cheng, E.S. Boyden, and D. Cai, “Light microscopy based approach for mapping connectivity with molecular specificity”, Nat. Commun. 11(1), 2020, pp. 4632.
[268] Cadwell, C.R., A. Palasantza, X. Jiang, et al., “Electrophysiological, transcriptomic and morphologic profiling of single neurons using patch-seq”, Nature biotechnology 34(2), 2016, pp. 199–203.
[269] Fuzik, J., A. Zeisel, Z. Máté, et al., “Integration of electrophysiological recordings with single-cell RNA-seq data identifies neuronal subtypes”, Nature biotechnology 34(2), 2016, pp. 175–183.
[270] Scala, F., D. Kobak, M. Bernabucci, et al., “Phenotypic variation of transcriptomic cell types in mouse motor cortex”, Nature 598(7879), 2021, pp. 144–150.
[271] Lee, B.R., A. Budzillo, K. Hadley, et al., “Scaled, high fidelity electrophysiological, morphological, and transcriptomic cell characterization”, eLife 10, 2021.
[272] Gouwens, N.W., S.A. Sorensen, F. Baftizadeh, et al., “Integrated morphoelectric and transcriptomic classification of cortical GABAergic cells”, Cell 183(4), 2020, pp. 935–953.e19.
[273] Berg, J., S.A. Sorensen, J.T. Ting, et al., “Human neocortical expansion involves glutamatergic neuron diversification”, Nature 598(7879), 2021, pp. 151–158.
[274] Atanas, A.A., J. Kim, Z. Wang, et al., “Brain-wide representations of behavior spanning multiple timescales and states in C. elegans”, Cell 186(19), 2023, pp. 4134–4151.e31.
[275] Vries, S.E.J. de, J.A. Lecoq, M.A. Buice, et al., “A large-scale standardized physiological survey reveals functional organization of the mouse visual cortex”, Nat. Neurosci. 23(1), 2020, pp. 138–151.
[276] Siegle, J.H., X. Jia, S. Durand, et al., “Survey of spiking in the mouse visual system reveals functional hierarchy”, Nature 592(7852), 2021, pp. 86–92.
[277] Leonard, M.K., L. Gwilliams, K.K. Sellers, et al., “Large-scale single-neuron speech sound encoding across the depth of human cortex”, Nature 626(7999), 2024, pp. 593–602.
[278] Khanna, A.R., W. Muñoz, Y.J. Kim, et al., “Single-neuronal elements of speech production in humans”, Nature 626(7999), 2024, pp. 603–610.
[279] Creamer, M.S., A.M. Leifer, and J.W. Pillow, “Bridging the gap between the connectome and whole-brain activity in C. elegans”, bioRxiv, 2024.
[280] Shen, F.X., M.L. Baum, N. Martinez-Martin, et al., “Returning individual research results from digital phenotyping in psychiatry”, Am. J. Bioeth. 24(2), 2024, pp. 69–90.
[281] Bugeon, S., J. Duffield, M. Dipoppa, et al., “A transcriptomic axis predicts state modulation of cortical interneurons”, Nature 607(7918), 2022, pp. 330–338.
[282] Condylis, C., A. Ghanbari, N. Manjrekar, et al., “Dense functional and molecular readout of a circuit hub in sensory cortex”, Science 375(6576), 2022, pp. eabl5981.
[283] M’Saad, O., and J. Bewersdorf, “Light microscopy of proteins in their ultrastructural context”, Nat. Commun. 11(1), 2020, pp. 3850.
[284] Zinchenko, V., J. Hugger, V. Uhlmann, D. Arendt, and A. Kreshuk, “MorphoFeatures for unsupervised exploration of cell types, tissues, and organs in volume electron microscopy”, eLife 12, 2023.
[285] Gamlin, C.R., C.M. Schneider-Mizell, M. Mallory, et al., “Integrating EM and patch-seq data: Synaptic connectivity and target specificity of predicted sst transcriptomic types”, bioRxiv, 2023.
[286] Randi, F., A.K. Sharma, S. Dvali, and A.M. Leifer, “Neural signal propagation atlas of Caenorhabditis elegans”, Nature, 2023, pp. 1–9.
[287] Gouwens, N.W., J. Berg, D. Feng, et al., “Systematic generation of biophysically detailed models for diverse cortical neuron types”, Nat. Commun. 9(1), 2018, pp. 710.
[288] Billeh, Y.N., B. Cai, S.L. Gratiy, et al., “Systematic integration of structural and functional data into multi-scale models of mouse primary visual cortex”, Neuron 106(3), 2020, pp. 388–403.e18.
[289] Markram, H., E. Muller, S. Ramaswamy, et al., “Reconstruction and simulation of neocortical microcircuitry”, Cell 163(2), 2015, pp. 456–492.
[290] Zhang, Y., G. He, L. Ma, et al., “A GPU-based computational framework that bridges neuron simulation and artificial intelligence”, Nat. Commun. 14(1), 2023, pp. 5798.
[291] Dura-Bernal, S., S.A. Neymotin, B.A. Suter, et al., “Multiscale model of primary motor cortex circuits predicts in vivo cell-type-specific, behavioral state-dependent dynamics”, Cell Rep. 42(6), 2023, pp. 112574.
[292] Hagen, E., D. Dahmen, M.L. Stavrinou, et al., “Hybrid scheme for modeling local field potentials from point-neuron networks”, Cereb. Cortex 26(12), 2016, pp. 4461–4496.
[293] Rimehaug, A.E., A.J. Stasik, E. Hagen, et al., “Uncovering circuit mechanisms of current sinks and sources with biophysical simulations of primary visual cortex”, eLife 12, 2023.
[294] Stevens, R., and G. Orr, Bioimaging capabilities to enable mapping of the neural connections in a complex brain, US Department of Energy (USDOE), Washington DC (United States), 2020.
[295] Simeon, Q., A. Kashyap, K.P. Kording, and E.S. Boyden, “Homogenized C. elegans neural activity and connectivity data”, 2024. https://arxiv.org/abs/2411.12091
[296] Wong, L., G. Grand, A.K. Lew, et al., “From word models to world models: Translating from natural language to the probabilistic language of thought”, arXiv preprint arXiv:2306.12672, 2023.
[297] Sumers, T.R., S. Yao, K. Narasimhan, and T.L. Griffiths, “Cognitive architectures for language agents”, arXiv preprint arXiv:2309.02427, 2023.
[298] Binz, M., I. Dasgupta, A.K. Jagadish, M. Botvinick, J.X. Wang, and E. Schulz, “Meta-learned models of cognition”, Behavioral and Brain Sciences 47, 2024, pp. e147.
[299] Lieder, F., and T.L. Griffiths, “Resource-rational analysis: Understanding human cognition as the optimal use of limited computational resources”, Behavioral and brain sciences 43, 2020, pp. e1.
[300] Gershman, S.J., E.J. Horvitz, and J.B. Tenenbaum, “Computational rationality: A converging paradigm for intelligence in brains, minds, and machines”, Science 349(6245), 2015, pp. 273–278.
[301] Icard, T., “Resource rationality”, Book manuscript, 2023.
[302] Newell, A., Unified theories of cognition, Harvard University Press, 1994.
[303] Spelke, E.S., What babies know: Core knowledge and composition, volume 1, Oxford University Press, 2022.
[304] Chollet, F., M. Knoop, B. Landers, et al., “ARC prize 2024”, 2024.
[305] Ellis, K., C. Wong, M. Nye, et al., “DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning”, arXiv, 2020.
[306] Chollet, F., “On the measure of intelligence”, 2019.
[307] Moskvichev, A., V.V. Odouard, and M. Mitchell, “The ConceptARC benchmark: Evaluating understanding and generalization in the ARC domain”, arXiv preprint arXiv:2305.07141, 2023.
[308] Wüst, A., T. Tobiasch, L. Helff, D.S. Dhami, C.A. Rothkopf, and K. Kersting, “Bongard in wonderland: Visual puzzles that still make AI go mad?”, arXiv preprint arXiv:2410.19546, 2024.
[309] Bakhtin, A., L. van der Maaten, J. Johnson, L. Gustafson, and R. Girshick, “PHYRE: A new benchmark for physical reasoning”, Advances in Neural Information Processing Systems 32, 2019.
[310] Bear, D.M., E. Wang, D. Mrowca, et al., “Physion: Evaluating physical prediction from vision in humans and machines”, arXiv preprint arXiv:2106.08261, 2021.
[311] Kirfel, L., T. Icard, and T. Gerstenberg, “Inference from explanation.”, Journal of Experimental Psychology: General 151(7), 2022, pp. 1481.
[312] Sun, J.J., A. Ulmer, D. Chakraborty, et al., “The MABe22 benchmarks for representation learning of multi-agent behavior”, arXiv preprint arXiv:2207.10553, 2022.
[313] Smith, M.L., J.D. Davidson, B. Wild, D.M. Dormagen, T. Landgraf, and I.D. Couzin, “Behavioral variation across the days and lives of honey bees”, Iscience 25(9), 2022.
[314] Matzner, S., T. Warfel, and R. Hull, “ThermalTracker-3D: A thermal stereo vision system for quantifying bird and bat activity at offshore wind energy sites”, Ecological Informatics 57, 2020, pp. 101069.
[315] Van Horn, G., S. Branson, R. Farrell, et al., “Building a bird recognition app and large scale dataset with citizen scientists: The fine print in fine-grained dataset collection”, Proceedings of the IEEE conference on computer vision and pattern recognition, (2015), 595–604.
[316] Koger, B., A. Deshpande, J.T. Kerby, J.M. Graving, B.R. Costelloe, and I.D. Couzin, “Quantifying the movement, behaviour and environmental context of group-living animals using drones and computer vision”, Journal of Animal Ecology, 2023.
[317] Naik, H., A.H.H. Chan, J. Yang, et al., “3D-POP–an automated annotation approach to facilitate markerless 2D-3D tracking of freely moving birds with marker-based motion capture”, arXiv preprint arXiv:2303.13174, 2023.
[318] Allen, K., F. Brändle, M. Botvinick, et al., “Using games to understand the mind”, Nature Human Behaviour, 2024, pp. 1–9.
[319] Mnih, V., N. Heess, A. Graves, and K. Kavukcuoglu, “Recurrent models of visual attention”, 2014.
[320] Badia, A.P., B. Piot, S. Kapturowski, et al., “Agent57: Outperforming the Atari human benchmark”, International conference on machine learning, PMLR (2020), 507–517.
[321] Xue, C., V. Pinto, C. Gamage, E. Nikonova, P. Zhang, and J. Renz, “Phy-Q as a measure for physical reasoning intelligence”, Nature Machine Intelligence 5(1), 2023, pp. 83–93.
[322] Das, R., J.B. Tenenbaum, A. Solar-Lezama, and Z. Tavares, “Combining functional and automata synthesis to discover causal reactive programs”, Proceedings of the ACM on Programming Languages 7(POPL), 2023, pp. 1628–1658.
[323] Chowdhary, S., I. Iacopini, and F. Battiston, “Quantifying human performance in chess”, Scientific Reports 13(1), 2023, pp. 2113.
[324] Meta Fundamental AI Research Diplomacy Team (FAIR)†, A. Bakhtin, N. Brown, et al., “Human-level play in the game of diplomacy by combining language models with strategic reasoning”, Science 378(6624), 2022, pp. 1067–1074.
[325] Russek, E., D. Acosta-Kane, B. van Opheusden, M.G. Mattar, and T. Griffiths, “Time spent thinking in online chess reflects the value of computation”, PsyArXiv, 2022.
[326] Gao, P., and S. Ganguli, “On simplicity and complexity in the brave new world of large-scale neuroscience”, Curr. Opin. Neurobiol. 32, 2015, pp. 148–155.
[327] Krakauer, J.W., A.A. Ghazanfar, A. Gomez-Marin, M.A. MacIver, and D. Poeppel, “Neuroscience needs behavior: Correcting a reductionist bias”, Neuron 93(3), 2017, pp. 480–490.
[328] Mobbs, D., P.C. Trimmer, D.T. Blumstein, and P. Dayan, “Foraging for foundations in decision neuroscience: Insights from ethology”, Nat. Rev. Neurosci. 19(7), 2018, pp. 419–427.
[329] Hall-McMaster, S., and F. Luyckx, “Revisiting foraging approaches in neuroscience”, Cogn. Affect. Behav. Neurosci. 19(2), 2019, pp. 225–230.
[330] Pinto, L., K. Rajan, B. DePasquale, S.Y. Thiberge, D.W. Tank, and C.D. Brody, “Task-dependent changes in the large-scale dynamics and necessity of cortical regions”, Neuron 104(4), 2019, pp. 810–824.e9.
[331] Miller, C.T., D. Gire, K. Hoke, et al., “Natural behavior is the language of the brain”, Curr. Biol. 32(10), 2022, pp. R482–R493.
[332] Dennis, E.J., A. El Hady, A. Michaiel, et al., “Systems neuroscience of natural behaviors in rodents”, J. Neurosci. 41(5), 2021, pp. 911–919.
[333] Niv, Y., “The primacy of behavioral research for understanding the brain”, Behav. Neurosci. 135(5), 2021, pp. 601–609.
[334] Driscoll, L.N., K. Shenoy, and D. Sussillo, “Flexible multitask computation in recurrent networks utilizes shared dynamical motifs”, Nat. Neurosci. 27(7), 2024, pp. 1349–1363.
[335] Peterson, J.C., D.D. Bourgin, M. Agrawal, D. Reichman, and T.L. Griffiths, “Using large-scale experiments and machine learning to discover theories of human decision-making”, Science 372(6547), 2021, pp. 1209–1214.
[336] Binz, M., E. Akata, M. Bethge, et al., “Centaur: A foundation model of human cognition”, arXiv preprint arXiv:2410.20268, 2024.
[337] Strauch, C., C.-A. Wang, W. Einhäuser, S. Van der Stigchel, and M. Naber, “Pupillometry as an integrated readout of distinct attentional networks”, Trends in Neurosciences 45(8), 2022, pp. 635–647.
[338] Sirois, S., and J. Brisson, “Pupillometry”, Wiley Interdisciplinary Reviews: Cognitive Science 5(6), 2014, pp. 679–692.
[339] Fan, Z., Y. Zhu, Y. He, Q. Sun, H. Liu, and J. He, “Deep learning on monocular object pose detection and tracking: A comprehensive overview”, ACM Computing Surveys 55(4), 2022, pp. 1–40.
[340] Kendall, A., M. Grimes, and R. Cipolla, “PoseNet: A convolutional network for real-time 6-DOF camera relocalization”, Proceedings of the IEEE international conference on computer vision, (2015), 2938–2946.
[341] Freer, C.E., D.M. Roy, and J.B. Tenenbaum, “Towards common-sense reasoning via conditional simulation: Legacies of turing in artificial intelligence.”, Turing’s Legacy 42, 2014, pp. 195–252.
[342] Goodman, N., V. Mansinghka, D.M. Roy, K. Bonawitz, and J.B. Tenenbaum, “Church: A language for generative models”, arXiv preprint arXiv:1206.3255, 2012.
[343] Carpenter, B., A. Gelman, M.D. Hoffman, et al., “Stan: A probabilistic programming language”, Journal of statistical software 76, 2017.
[344] Goodman, N.D., and A. Stuhlmüller, The Design and Implementation of Probabilistic Programming Languages, 2014.
[345] Abril-Pla, O., V. Andreani, C. Carroll, et al., “PyMC: A modern, and comprehensive probabilistic programming framework in Python”, PeerJ Computer Science 9, 2023, pp. e1516.
[346] Bingham, E., J.P. Chen, M. Jankowiak, et al., “Pyro: Deep universal probabilistic programming”, Journal of machine learning research 20(28), 2019, pp. 1–6.
[347] Phan, D., N. Pradhan, and M. Jankowiak, “Composable effects for flexible and accelerated probabilistic programming in NumPyro”, arXiv preprint arXiv:1912.11554, 2019.
[348] Ge, H., K. Xu, and Z. Ghahramani, “Turing: A language for flexible probabilistic inference”, International conference on artificial intelligence and statistics, PMLR (2018), 1682–1690.
[349] Cusumano-Towner, M.F., F.A. Saad, A.K. Lew, and V.K. Mansinghka, “Gen: A general-purpose probabilistic programming system with programmable inference”, Proceedings of the 40th ACM SIGPLAN conference on programming language design and implementation, (2019), 221–236.
[350] Meent, J.-W. van de, B. Paige, H. Yang, and F. Wood, “An introduction to probabilistic programming”, arXiv preprint arXiv:1809.10756, 2018.
[351] Chaudhuri, S., K. Ellis, O. Polozov, et al., “Neurosymbolic programming”, Foundations and Trends in Programming Languages 7(3), 2021, pp. 158–243.
[352] Botvinick, M., and M. Toussaint, “Planning as inference”, Trends in cognitive sciences 16(10), 2012, pp. 485–488.
[353] Milli, S., F. Lieder, and T.L. Griffiths, “A rational reinterpretation of dual-process theories”, Cognition 217, 2021, pp. 104881.
[354] Pearl, J., “The algorithmization of counterfactuals”, Annals of Mathematics and Artificial Intelligence 61, 2011, pp. 29–39.
[355] Solway, A., and M.M. Botvinick, “Goal-directed decision making as probabilistic inference: A computational framework and potential neural correlates.”, Psychological review 119(1), 2012, pp. 120.
[356] Koller, D., and N. Friedman, “Probabilistic graphical models: Principles and techniques”, 2009.
[357] Tavares, Z., J. Koppel, X. Zhang, R. Das, and A. Solar-Lezama, “A language for counterfactual generative models”, International conference on machine learning, PMLR (2021), 10173–10182.
[358] Dawid, A.P., “Influence diagrams for causal modelling and inference”, International Statistical Review 70(2), 2002, pp. 161–189.
[359] MacDermott, M., T. Everitt, and F. Belardinelli, “Characterising decision theories with mechanised causal graphs”, 2023. https://arxiv.org/abs/2307.10987
[360] Hammond, L., J. Fox, T. Everitt, R. Carey, A. Abate, and M. Wooldridge, “Reasoning about causality in games (abstract reprint)”, Proceedings of the AAAI Conference on Artificial Intelligence 38(20), 2024, pp. 22697–22697.
[361] Levine, S., “Reinforcement learning and control as probabilistic inference: Tutorial and review”, arXiv [cs.LG], 2018.
[362] Everitt, T., R. Kumar, V. Krakovna, and S. Legg, “Modeling AGI safety frameworks with causal influence diagrams”, 2019. https://arxiv.org/abs/1906.08663
[363] Kenton, Z., R. Kumar, S. Farquhar, J. Richens, M. MacDermott, and T. Everitt, “Discovering agents”, Artificial Intelligence 322, 2023, pp. 103963.
[364] Richens, J., and T. Everitt, “Robust agents learn causal world models”, 2024. https://arxiv.org/abs/2402.10877
[365] Everitt, T., M. Hutter, R. Kumar, and V. Krakovna, “Reward tampering problems and solutions in reinforcement learning: A causal influence diagram perspective”, Synthese 198(Suppl 27), 2021, pp. 6435–6467.
[366] Tavares, Z., X. Zhang, E. Minaysan, J. Burroni, R. Ranganath, and A.S. Lezama, “The random conditional distribution for higher-order probabilistic inference”, arXiv preprint arXiv:1903.10556, 2019.
[367] Stuhlmüller, A., and N.D. Goodman, “Reasoning about reasoning by nested conditioning: Modeling theory of mind with probabilistic programs”, Cognitive Systems Research 28, 2014, pp. 80–99.
[368] Evans, O., A. Stuhlmüller, J. Salvatier, and D. Filan, Modeling Agents with Probabilistic Programs, 2017.
[369] Goodman, N.D., J.B. Tenenbaum, and T.P. Contributors, Probabilistic Models of Cognition, 2016.
[370] Belle, V., and L. De Raedt, “Semiring programming: A framework for search, inference and learning”, CoRR, abs/1609.06954, 2016.
[371] Hinton, G.E., P. Dayan, B.J. Frey, and R.M. Neal, “The ‘wake-sleep’ algorithm for unsupervised neural networks”, Science 268(5214), 1995, pp. 1158–1161.
[372] Stuhlmüller, A., J. Taylor, and N. Goodman, “Learning stochastic inverses”, Advances in neural information processing systems 26, 2013.
[373] Piantadosi, S.T., and R.A. Jacobs, “Four problems solved by the probabilistic language of thought”, Current Directions in Psychological Science 25(1), 2016, pp. 54–59.
[374] Neal, R.M., “New Monte Carlo methods based on Hamiltonian dynamics”, MaxEnt workshop, (July 2011).
[375] Delaloye, S., and P.E. Holtzheimer, “Deep brain stimulation in the treatment of depression”, Dialogues Clin. Neurosci. 16(1), 2014, pp. 83–91.
[376] Holtzheimer, P.E., and H.S. Mayberg, “Deep brain stimulation for psychiatric disorders”, Annu. Rev. Neurosci. 34(1), 2011, pp. 289–307.
[377] Oxley, T.J., N.L. Opie, S.E. John, et al., “Minimally invasive endovascular stent-electrode array for high-fidelity, chronic recordings of cortical neural activity”, Nat. Biotechnol. 34(3), 2016, pp. 320–327.
[378] Musk, E., and Neuralink, “An integrated brain-machine interface platform with thousands of channels”, bioRxiv, 2019, pp. 703801.
[379] Sahasrabuddhe, K., A.A. Khan, A.P. Singh, et al., “The Argo: A 65,536 channel recording system for high density neural recording in vivo”, bioRxiv, 2020, pp. 2020.07.17.209403.
[380] Paulk, A.C., Y. Kfir, A.R. Khanna, et al., “Large-scale neural recordings with single neuron resolution using Neuropixels probes in human cortex”, Nat. Neurosci. 25(2), 2022, pp. 252–263.
[381] Nishimoto, S., A.T. Vu, T. Naselaris, Y. Benjamini, B. Yu, and J.L. Gallant, “Reconstructing visual experiences from brain activity evoked by natural movies”, Curr. Biol. 21(19), 2011, pp. 1641–1646.
[382] Scotti, P., A. Banerjee, J. Goode, et al., “Reconstructing the mind’s eye: fMRI-to-image with contrastive learning and diffusion priors”, Neural information processing systems, (2023).
[383] Christiano, P.F., J. Leike, T. Brown, M. Martic, S. Legg, and D. Amodei, “Deep reinforcement learning from human preferences”, Advances in neural information processing systems 30, 2017.
[384] Fong, R.C., W.J. Scheirer, and D.D. Cox, “Using human brain activity to guide machine learning”, Sci. Rep. 8(1), 2018, pp. 5397.
[385] Nisbett, R.E., and T.D. Wilson, “Telling more than we can know: Verbal reports on mental processes”, Psychol. Rev. 84(3), 1977, pp. 231.
[386] Lightman, H., V. Kosaraju, Y. Burda, et al., “Let’s verify step by step”, arXiv [cs.LG], 2023.
[387] Luo, L., Y. Liu, R. Liu, et al., “Improve mathematical reasoning in language models by automated process supervision”, arXiv [cs.CL], 2024.
[388] Hendrycks, D., C. Burns, S. Basart, et al., “Aligning AI with shared human values”, arXiv [cs.CY], 2020.
[389] Sharma, M., M. Tong, T. Korbak, et al., “Towards understanding sycophancy in language models”, arXiv [cs.CL], 2023.
[390] Shen, T., R. Jin, Y. Huang, et al., “Large language model alignment: A survey”, arXiv [cs.CL], 2023.
[391] Phuong, M., M. Aitchison, E. Catt, et al., “Evaluating frontier models for dangerous capabilities”, arXiv [cs.LG], 2024.
[392] Müller, V.C., “Ethics of artificial intelligence and robotics”, 2020.
[393] OpenAI, “Introducing OpenAI o1”, 2024.
[394] Abdin, M., J. Aneja, H. Awadalla, et al., “Phi-3 technical report: A highly capable language model locally on your phone”, arXiv [cs.CL], 2024.
[395] Branwen, G., “WBE and DRL: A middle way of imitation learning from the human brain”, 2018.
[396] Sucholutsky, I., and T.L. Griffiths, “Alignment with human representations supports robust few-shot learning”, 2023.
[397] Muttenthaler, L., L. Linhardt, J. Dippel, et al., “Improving neural network representations using human similarity judgments”, Neural information processing systems, (2023).
[398] Muttenthaler, L., K. Greff, F. Born, et al., “Aligning machine and human visual representations across abstraction levels”, arXiv [cs.CV], 2024.
[399] Deng, J., W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database”, 2009 IEEE conference on computer vision and pattern recognition, IEEE (2009), 248–255.
[400] Fellbaum, C., “WordNet and wordnets”, In K. Brown, ed., Encyclopedia of language and linguistics. Elsevier, 2005.
[401] Mehrer, J., C.J. Spoerer, E.C. Jones, N. Kriegeskorte, and T.C. Kietzmann, “An ecologically motivated image dataset for deep learning yields better models of human vision”, Proc. Natl. Acad. Sci. U. S. A. 118(8), 2021, pp. e2011417118.
[402] Huh, M., B. Cheung, T. Wang, and P. Isola, “The platonic representation hypothesis”, arXiv [cs.LG], 2024.
[403] Gauthaman, R.M., B. Ménard, and M.F. Bonner, “Universal scale-free representations in human visual cortex”, arXiv [q-bio.NC], 2024.
[404] Roads, B.D., and B.C. Love, “Enriching ImageNet with human similarity judgments and psychological embeddings”, 2021 IEEE/CVF conference on computer vision and pattern recognition (CVPR), IEEE (2021).
[405] Hebart, M.N., O. Contier, L. Teichmann, et al., “THINGS-data, a multimodal collection of large-scale datasets for investigating object representations in human brain and behavior”, Elife 12, 2023.
[406] Borghesani, V., J. Armoza, M.N. Hebart, P. Bellec, and S.M. Brambati, “The three terms task - an open benchmark to compare human and artificial semantic representations”, Sci Data 10(1), 2023, pp. 117.
[407] Peterson, J.C., R.M. Battleday, T.L. Griffiths, and O. Russakovsky, “Human uncertainty makes classification more robust”, arXiv [cs.CV], 2019.
[408] Xu, B., A. Bikakis, D. Onah, A. Vlachidis, and L. Dickens, “Measuring error alignment for decision-making systems”, arXiv [cs.AI], 2024.
[409] Klerke, S., Y. Goldberg, and A. Søgaard, “Improving sentence compression by learning to predict gaze”, arXiv [cs.CL], 2016.
[410] Ishibashi, T., Y. Sugano, and Y. Matsushita, “Gaze-guided image classification for reflecting perceptual class ambiguity”, Adjunct proceedings of the 31st annual ACM symposium on user interface software and technology, ACM (2018).
[411] Linsley, D., D. Shiebler, S. Eberhardt, and T. Serre, “Learning what and where to attend”, arXiv [cs.CV], 2018.
[412] Linsley, D., P. Feng, T. Boissin, et al., “Adversarial alignment: Breaking the trade-off between the strength of an attack and its relevance to human perception”, 2023.
[413] Lindsey, J.W., and S. Lippl, “Implicit regularization of multi-task learning and finetuning in overparameterized neural networks”, 2023.
[414] Itti, L., C. Koch, and E. Niebur, “A model of saliency-based visual attention for rapid scene analysis”, IEEE Trans. Pattern Anal. Mach. Intell. 20(11), 1998, pp. 1254–1259.
[415] Mathias, S., D. Kanojia, A. Mishra, and P. Bhattacharyya, “A survey on using gaze behaviour for natural language processing”, arXiv [cs.CL], 2021.
[416] Kiegeland, S., D.R. Reich, R. Cotterell, L.A. Jäger, and E. Wilcox, “The pupil becomes the master: Eye-tracking feedback for tuning LLMs”, ICML 2024 workshop on LLMs and cognition, (2024).
[417] Hinton, G., O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network”, arXiv [stat.ML], 2015.
[418] Kriegeskorte, N., M. Mur, and P. Bandettini, “Representational similarity analysis - connecting the branches of systems neuroscience”, Front. Syst. Neurosci. 2, 2008, pp. 4.
[419] Kornblith, S., M. Norouzi, H. Lee, and G. Hinton, “Similarity of neural network representations revisited”, International conference on machine learning, PMLR (2019), 3519–3529.
[420] Xie, Y., A. Goyal, W. Zheng, et al., “Monte carlo tree search boosts reasoning via iterative preference learning”, arXiv [cs.AI], 2024.
[421] Fyshe, A., P.P. Talukdar, B. Murphy, and T.M. Mitchell, “Interpretable semantic vectors from a joint model of brain- and text-based meaning”, Proc. Conf. Assoc. Comput. Linguist. Meet. 2014, 2014, pp. 489–499.
[422] Huth, A., S. Nishimoto, A. Vu, and J. Gallant, “A continuous semantic space describes the representation of thousands of object and action categories across the human brain”, Neuron 76(6), 2012, pp. 1210–1224.
[423] Tang, J., A. LeBel, S. Jain, and A.G. Huth, “Semantic reconstruction of continuous language from non-invasive brain recordings”, 2022.
[424] Antonello, R., A. Vaidya, and A.G. Huth, “Scaling laws for language encoding models in fMRI”, arXiv [cs.CL], 2023.
[425] Kohler, J., M.C. Ottenhoff, S. Goulis, et al., “Synthesizing speech from intracranial depth electrodes using an encoder-decoder framework”, Neurons, Behavior, Data analysis, and Theory, 2022.
[426] Moussa, O., D. Klakow, and M. Toneva, “Improving semantic understanding in speech language models via brain-tuning”, arXiv [cs.CL], 2024.
[427] Meek, A., A. Karpov, S.H. Cho, R. Koopmanschap, L. Farnik, and B.-I. Cirstea, “Inducing human-like biases in moral reasoning language models”, UniReps: 2nd edition of the workshop on unifying representations in neural models, (2024).
[428] Saxe, R., and N. Kanwisher, “People thinking about thinking people. The role of the temporo-parietal junction in ‘theory of mind’”, Neuroimage 19(4), 2003, pp. 1835–1842.
[429] Sato, M., K. Tomeoka, I. Horiguchi, K. Arulkumaran, R. Kanai, and S. Sasai, “Scaling law in neural data: Non-invasive speech decoding with 175 hours of EEG data”, arXiv [q-bio.NC], 2024.
[430] Allen, E.J., G. St-Yves, Y. Wu, et al., “A massive 7T fMRI dataset to bridge cognitive neuroscience and artificial intelligence”, Nat. Neurosci. 25(1), 2022, pp. 116–126.
[431] Freteault, M., M. Le Clei, L. Tetrel, P. Bellec, and N. Farrugia, “Alignment of auditory artificial networks with massive individual fMRI brain data leads to generalizable improvements in brain encoding and downstream tasks”, bioRxiv, 2024.
[432] Ariely, D., and G.S. Berns, “Neuromarketing: The hope and hype of neuroimaging in business”, Nat. Rev. Neurosci. 11(4), 2010, pp. 284–292.
[433] Coupé, C., Y.M. Oh, D. Dediu, and F. Pellegrino, “Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche”, Sci. Adv. 5(9), 2019, pp. eaaw2594.
[434] Zheng, J., and M. Meister, “The unbearable slowness of being”, arXiv [q-bio.NC], 2024.
[435] Willett, F.R., D.T. Avansino, L.R. Hochberg, J.M. Henderson, and K.V. Shenoy, “High-performance brain-to-text communication via handwriting”, Nature 593(7858), 2021, pp. 249–254.
[436] Willett, F.R., E.M. Kunz, C. Fan, et al., “A high-performance speech neuroprosthesis”, Nature 620(7976), 2023, pp. 1031–1036.
[437] Neuralink, “PRIME study progress update — user experience”, 2024.
[438] Scotti, P.S., M. Tripathy, C.K.T. Villanueva, et al., “MindEye2: Shared-subject models enable fMRI-to-image with 1 hour of data”, arXiv [cs.CV], 2024.
[439] Antonello, R., N. Sarma, J. Tang, J. Song, and A. Huth, “How many bytes can you take out of brain-to-text decoding?”, arXiv [cs.CL], 2024.
[440] Parthasarathy, N., J. Soetedjo, S. Panchavati, et al., “High performance P300 spellers using GPT2 word prediction with cross-subject training”, arXiv [cs.CL], 2024.
[441] Shi, N., Y. Miao, C. Huang, et al., “Estimating and approaching the maximum information rate of noninvasive visual brain-computer interface”, Neuroimage 289(120548), 2024, pp. 120548.
[442] Li, X., J. Chen, N. Shi, et al., “A hybrid steady-state visual evoked response-based brain-computer interface with MEG and EEG”, Expert Syst. Appl. 223(119736), 2023, pp. 119736.
[443] Kernel, “Kernel flux and flow”, 2021.
[444] Tripathy, K., Z.E. Markow, A.K. Fishell, et al., “Decoding visual information from high-density diffuse optical tomography neuroimaging data”, Neuroimage 226(117516), 2021, pp. 117516.
[445] Shin, J., J. Kwon, and C.-H. Im, “A ternary hybrid EEG-NIRS brain-computer interface for the classification of brain activation patterns during mental arithmetic, motor imagery, and idle state”, Front. Neuroinform. 12, 2018, pp. 5.
[446] Norman, S.L., D. Maresca, V.N. Christopoulos, et al., “Single-trial decoding of movement intentions using functional ultrasound neuroimaging”, Neuron 109(9), 2021, pp. 1554–1566.e4.
[447] Jung, T., N. Zeng, J.D. Fabbri, et al., “Stable, chronic in-vivo recordings from a fully wireless subdural-contained 65,536-electrode brain-computer interface device”, bioRxiv, 2024, pp. 2024.05.17.594333.
[448] Allen, K.R., F. Brändle, M.M. Botvinick, et al., “Using games to understand the mind”, 2023.
[449] Spiers, H.J., A. Coutrot, and M. Hornberger, “Explaining world-wide variation in navigation ability from millions of people: Citizen science project Sea Hero Quest”, Top. Cogn. Sci. 15(1), 2023, pp. 120–138.
[450] Devlin, S., R. Georgescu, I. Momennejad, et al., “Navigation Turing test (NTT): Learning to evaluate human-like navigation”, 2021.
[451] Cross, L., J. Cockburn, Y. Yue, and J.P. O’Doherty, “Using deep reinforcement learning to reveal how the brain encodes abstract state-space representations in high-dimensional environments”, Neuron 109(4), 2021, pp. 724–738.e7.
[452] Kemtur, A., F. Paugam, B. Pinsard, et al., “Behavioral imitation with artificial neural networks leads to personalized models of brain dynamics during videogame play”, bioRxiv, 2023, pp. 2023.10.28.564546.
[453] Zhang, T., “Modelling navigation representations during naturalistic driving”, 2021.
[454] Hafner, D., J. Pasukonis, J. Ba, and T. Lillicrap, “Mastering diverse domains through world models”, 2023.
[455] Bard, N., J.N. Foerster, S. Chandar, et al., “The Hanabi challenge: A new frontier for AI research”, arXiv [cs.LG], 2019.
[456] Brown, N., A. Bakhtin, A. Lerer, and Q. Gong, “Combining deep reinforcement learning and search for imperfect-information games”, arXiv [cs.GT], 2020.
[457] Pan, A., J.S. Chan, A. Zou, et al., “Do the rewards justify the means? Measuring trade-offs between rewards and ethical behavior in the MACHIAVELLI benchmark”, arXiv [cs.LG], 2023.
[458] Marblestone, A.H., G. Wayne, and K.P. Kording, “Toward an integration of deep learning and neuroscience”, Front. Comput. Neurosci. 10, 2016, pp. 94.
[459] Arora, S., and P. Doshi, “A survey of inverse reinforcement learning: Challenges, methods and progress”, arXiv [cs.LG], 2018.
[460] Sarma, G., and N.J. Hay, “Mammalian value systems”, Informatica (Vilnius), 2016. https://arxiv.org/abs/1607.08289
[461] Sarma, G.P., N.J. Hay, and A. Safron, “AI safety and reproducibility: Establishing robust foundations for the neuropsychology of human values”, In Developments in language theory. Springer International Publishing, Cham, 2018, 507–512.
[462] Collins, K.M., I. Sucholutsky, U. Bhatt, et al., “Building machines that learn and think with people”, arXiv [cs.HC], 2024.
[463] Yamins, D.L.K., and J.J. DiCarlo, “Using goal-driven deep learning models to understand sensory cortex”, Nat. Neurosci. 19(3), 2016, pp. 356–365.
[464] Doerig, A., R. Sommers, K. Seeliger, et al., “The neuroconnectionist research programme”, 2022.
[465] Yamins, D.L.K., H. Hong, C.F. Cadieu, E.A. Solomon, D. Seibert, and J.J. DiCarlo, “Performance-optimized hierarchical models predict neural responses in higher visual cortex”, Proceedings of the national academy of sciences 111(23), 2014, pp. 8619–8624.
[466] Khaligh-Razavi, S.-M., and N. Kriegeskorte, “Deep supervised, but not unsupervised, models may explain IT cortical representation”, PLoS Comput. Biol. 10(11), 2014, pp. e1003915.
[467] Sussillo, D., M.M. Churchland, M.T. Kaufman, and K.V. Shenoy, “A neural network that finds a naturalistic solution for the production of muscle activity”, Nat. Neurosci. 18(7), 2015, pp. 1025–1033.
[468] Mineault, P., S. Bakhtiari, B. Richards, and C. Pack, “Your head is there to move you around: Goal-driven models of the primate dorsal pathway”, Neural information processing systems (NeurIPS), (2021).
[469] Nayebi, A., R. Rajalingham, M. Jazayeri, and G.R. Yang, “Neural foundations of mental simulation: Future prediction of latent representations on dynamic scenes”, Neural information processing systems, (2023).
[470] Conwell, C., J.S. Prince, K.N. Kay, G.A. Alvarez, and T. Konkle, “What can 1.8 billion regressions tell us about the pressures shaping high-level visual representation in brains and machines?”, 2022.
[471] Conwell, C., J.S. Prince, K.N. Kay, G.A. Alvarez, and T. Konkle, “A large-scale examination of inductive biases shaping high-level visual representation in brains and machines”, Nat. Commun. 15(1), 2024, pp. 9383.
[472] Rideaux, R., and A.E. Welchman, “But still it moves: Static image statistics underlie how we see motion”, Journal of Neuroscience 40(12), 2020, pp. 2538–2552.
[473] Bakhtiari, S., P. Mineault, T. Lillicrap, C. Pack, and B. Richards, “The functional specialization of visual cortex emerges from training parallel pathways with self-supervised predictive learning”, Neural information processing systems (NeurIPS), (2021).
[474] Vafaii, H., J. Yates, and D. Butts, “Hierarchical VAEs provide a normative account of motion processing in the primate brain”, Neural information processing systems, (2023).
[475] Chang, L., B. Egger, T. Vetter, and D.Y. Tsao, “Explaining face representation in the primate brain using different computational models”, Curr. Biol. 31(13), 2021, pp. 2785–2795.e4.
[476] Bonner, M.F., and R.A. Epstein, “Object representations in the human brain reflect the co-occurrence statistics of vision and language”, Nat. Commun. 12(1), 2021, pp. 4081.
[477] Lindsay, G.W., and K.D. Miller, “How biological attention mechanisms improve task performance in a large-scale visual system model”, Elife 7, 2018.
[478] Lotter, W., G. Kreiman, and D. Cox, “A neural network trained for prediction mimics diverse features of biological neurons and perception”, Nat Mach Intell 2(4), 2020, pp. 210–219.
[479] Hannagan, T., A. Agrawal, L. Cohen, and S. Dehaene, “Emergence of a compositional neural code for written words: Recycling of a convolutional neural network for reading”, Proceedings of the National Academy of Sciences 118(46), 2021.
[480] Kell, A.J.E., D.L.K. Yamins, E.N. Shook, S.V. Norman-Haignere, and J.H. McDermott, “A task-optimized neural network replicates human auditory behavior, predicts brain responses, and reveals a cortical processing hierarchy”, Neuron 98(3), 2018, pp. 630–644.
[481] Francl, A., and J.H. McDermott, “Deep neural network models of sound localization reveal how perception is adapted to real-world environments”, Nat. Hum. Behav. 6(1), 2022, pp. 111–133.
[482] Caucheteux, C., and J.-R. King, “Brains and algorithms partially converge in natural language processing”, Commun Biol 5(1), 2022, pp. 1–10.
[483] Fedorenko, E., I.A. Blank, M. Siegelman, and Z. Mineroff, “Lack of selectivity for syntax relative to word meanings throughout the language network”, Cognition 203(104348), 2020, pp. 104348.
[484] Caucheteux, C., A. Gramfort, and J.-R. King, “Evidence of a predictive coding hierarchy in the human brain listening to speech”, Nat Hum Behav, 2023, pp. 1–12.
[485] Schrimpf, M., I.A. Blank, G. Tuckute, et al., “The neural architecture of language: Integrative modeling converges on predictive processing”, Proceedings of the National Academy of Sciences 118(45), 2021, pp. e2105646118.
[486] Rajan, K., C.D. Harvey, and D.W. Tank, “Recurrent network models of sequence generation and memory”, Neuron 90(1), 2016, pp. 128–142.
[487] Pandarinath, C., P. Nuyujukian, C.H. Blabe, et al., “High performance communication by people with paralysis using an intracortical brain-computer interface”, Elife 6, 2017, pp. e18554.
[488] Russo, A.A., R. Khajeh, S.R. Bittner, et al., “Neural trajectories in the supplementary motor area and motor cortex exhibit distinct geometries, compatible with different classes of computation”, Neuron 107(4), 2020, pp. 745–758.e6.
[489] Dasgupta, S., C.F. Stevens, and S. Navlakha, “A neural algorithm for a fundamental computing problem”, Science 358(6364), 2017, pp. 793–796.
[490] Wang, P.Y., Y. Sun, R. Axel, L.F. Abbott, and G.R. Yang, “Evolving the olfactory system with machine learning”, bioRxiv, 2021, pp. 2021.04.15.439917.
[491] Taleb, F., M.S. Vasco, N. Rajabi, M. Björkman, and D. Kragic, “Can foundation models smell like humans?”, (2024).
[492] Banino, A., C. Barry, B. Uria, et al., “Vector-based navigation using grid-like representations in artificial agents”, Nature 557(7705), 2018, pp. 429–433.
[493] Cueva, C.J., and X.-X. Wei, “Emergence of grid-like representations by training recurrent neural networks to perform spatial localization”, arXiv [q-bio.NC], 2018.
[494] Whittington, J.C.R., T.H. Muller, S. Mark, et al., “The Tolman-Eichenbaum machine: Unifying space and relational memory through generalization in the hippocampal formation”, Cell 183(5), 2020, pp. 1249–1263.e23.
[495] Nayebi, A., A. Attinger, M.G. Campbell, et al., “Explaining heterogeneity in medial entorhinal cortex with task-driven neural networks”, 2021.
[496] Sorscher, B., G.C. Mel, S.A. Ocko, L.M. Giocomo, and S. Ganguli, “A unified theory for the computational and mechanistic origins of grid cells”, Neuron 111(1), 2023, pp. 121–137.e13.
[497] Stachenfeld, K.L., M.M. Botvinick, and S.J. Gershman, “The hippocampus as a predictive map”, Nat. Neurosci. 20(11), 2017, pp. 1643–1653.
[498] Fang, C., D. Aronov, L. Abbott, and E.L. Mackevicius, “Neural learning rules for generating flexible predictions and computing the successor representation”, elife 12, 2023, pp. e80680.
[499] George, T.M., K. Stachenfeld, C. Barry, C. Clopath, and T. Fukai, “A generative model of the hippocampal formation trained with theta driven local learning rules”, Neural information processing systems, (2023).
[500] Bono, J., S. Zannone, V. Pedrosa, and C. Clopath, “Learning predictive cognitive maps with spiking neurons during behavior and replays”, Elife 12, 2023, pp. e80671.
[501] Whittington, J.C.R., J. Warren, and T.E.J. Behrens, “Relating transformers to models and neural representations of the hippocampal formation”, arXiv [cs.NE], 2021.
[502] Jensen, K.T., G. Hennequin, and M.G. Mattar, “A recurrent network model of planning explains hippocampal replay and human behavior”, Nat. Neurosci. 27(7), 2024, pp. 1340–1348.
[503] Mante, V., D. Sussillo, K.V. Shenoy, and W.T. Newsome, “Context-dependent computation by recurrent dynamics in prefrontal cortex”, Nature 503(7474), 2013, pp. 78–84.
[504] Wang, J.X., Z. Kurth-Nelson, D. Kumaran, et al., “Prefrontal cortex as a meta-reinforcement learning system”, Nat. Neurosci. 21(6), 2018, pp. 860–868.
[505] Yang, G.R., M.R. Joglekar, H.F. Song, W.T. Newsome, and X.-J. Wang, “Task representations in neural networks trained to perform many cognitive tasks”, Nat. Neurosci. 22(2), 2019, pp. 297–306.
[506] Summerfield, C., F. Luyckx, and H. Sheahan, “Structure learning and the posterior parietal cortex”, Prog. Neurobiol. 184, 2020, pp. 101717.
[507] Fedorenko, E., S.T. Piantadosi, and E.A.F. Gibson, “Language is primarily a tool for communication rather than thought”, Nature 630(8017), 2024, pp. 575–586.
[508] Berrios, W., and A. Deza, “Joint rotational invariance and adversarial training of a dual-stream transformer yields state of the art brain-score for area V4”, arXiv, 2022.
[509] Schrimpf, M., J. Kubilius, H. Hong, et al., “Brain-score: Which artificial neural network for object recognition is most brain-like?”, BioRxiv, 2018, pp. 407007.
[510] Linsley, D., I.F.R. Rodriguez, T. Fel, et al., “Performance-optimized deep neural networks are evolving into worse models of inferotemporal visual cortex”, Neural information processing systems, (2023).
[511] Gokce, A., and M. Schrimpf, “Scaling laws for task-optimized models of the primate visual ventral stream”, arXiv [cs.LG], 2024.
[512] Chen, Z., and M.F. Bonner, “Universal dimensions of visual representation”, 2024.
[513] Antonello, R., and A. Huth, “Predictive coding or just feature discovery? An alternative account of why language models fit brain data”, Neurobiol. Lang. (Camb.) 5(1), 2024, pp. 64–79.
[514] Sexton, N.J., and B.C. Love, “Reassessing hierarchical correspondences between brain and deep networks through direct interface”, Science Advances 8(28), 2022, pp. eabm2219.
[515] Soni, A., S. Srivastava, M. Khosla, and K.P. Kording, “Conclusions about neural network to brain alignment are profoundly impacted by the similarity measure”, bioRxiv, 2024, pp. 2024.08.07.607035.
[516] Thobani, I., J. Sagastuy-Brena, A. Nayebi, R. Cao, and D.L.K. Yamins, “Inter-animal transforms as a guide to model-brain comparison”, ICLR 2024 workshop on representational alignment, (2024).
[517] Ahlert, J., T. Klein, F.A. Wichmann, and R. Geirhos, “How aligned are different alignment metrics?”, (2024).
[518] Klabunde, M., T. Wald, T. Schumacher, K. Maier-Hein, M. Strohmaier, and F. Lemmerich, “ReSi: A comprehensive benchmark for representational similarity measures”, arXiv [cs.LG], 2024.
[519] Elmoznino, E., and M.F. Bonner, “High-performing neural network models of visual cortex benefit from high latent dimensionality”, 2022.
[520] Schaeffer, R., M. Khona, S. Chandra, M. Ostrow, B. Miranda, and S. Koyejo, “Position: Maximizing neural regression scores may not identify good models of the brain”, UniReps: 2nd edition of the workshop on unifying representations in neural models, (2024).
[521] Zednik, C., and F. Jäkel, “Bayesian reverse-engineering considered as a research strategy for cognitive science”, Synthese 193(12), 2016, pp. 3951–3985.
[522] Golan, T., P.C. Raju, and N. Kriegeskorte, “Controversial stimuli: Pitting neural networks against each other as models of human cognition”, Proc. Natl. Acad. Sci. U. S. A. 117(47), 2020, pp. 29330–29337.
[523] Tuckute, G., D. Finzi, E. Margalit, et al., “How to optimize neuroscience data utilization and experiment design for advancing primate visual and linguistic brain models?”, arXiv [q-bio.NC], 2024.
[524] Olshausen, B.A., and D.J. Field, “Emergence of simple-cell receptive field properties by learning a sparse code for natural images”, Nature 381, 1996, pp. 607–609.
[525] Rao, R.P.N., and D.H. Ballard, “Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects”, Nat. Neurosci. 2(1), 1999, pp. 79–87.
[526] Agrawal, K.K., A.K. Mondal, A. Ghosh, and B.A. Richards, “\(\alpha\)-ReQ: Assessing representation quality in self-supervised learning by measuring eigenspectrum decay”, (2022).
[527] Northcutt, C.G., L. Jiang, and I.L. Chuang, “Confident learning: Estimating uncertainty in dataset labels”, arXiv [stat.ML], 2019.
[528] Caballero, E., K. Gupta, I. Rish, and D. Krueger, “Broken neural scaling laws”, 2022.
[529] Kandasamy, K., W. Neiswanger, J. Schneider, B. Poczos, and E.P. Xing, “Neural architecture search with Bayesian optimisation and optimal transport”, Advances in Neural Information Processing Systems 31, 2018.
[530] Schultz, W., P. Dayan, and P.R. Montague, “A neural substrate of prediction and reward”, Science 275(5306), 1997, pp. 1593–1599.
[531] Hassabis, D., D. Kumaran, C. Summerfield, and M. Botvinick, “Neuroscience-inspired artificial intelligence”, Neuron 95(2), 2017, pp. 245–258.
[532] Lindsay, G., Models of the mind: How physics, engineering and mathematics have shaped our understanding of the brain, Bloomsbury Sigma, 1st ed., 2021.
[533] Schultz, W., “Reward functions of the basal ganglia”, J. Neural Transm. 123(7), 2016, pp. 679–693.
[534] Pessiglione, M., B. Seymour, G. Flandin, R.J. Dolan, and C.D. Frith, “Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans”, Nature 442(7106), 2006, pp. 1042–1045.
[535] Huys, Q.J.M., T.V. Maia, and M.J. Frank, “Computational psychiatry as a bridge from neuroscience to clinical applications”, Nat. Neurosci. 19(3), 2016, pp. 404–413.
[536] Lee, S.W., S. Shimojo, and J.P. O’Doherty, “Neural computations underlying arbitration between model-based and model-free learning”, Neuron 81(3), 2014, pp. 687–699.
[537] Lowet, A.S., Q. Zheng, S. Matias, J. Drugowitsch, and N. Uchida, “Distributional reinforcement learning in the brain”, Trends Neurosci. 43(12), 2020, pp. 980–997.
[538] Muller, T.H., J.L. Butler, S. Veselic, et al., “Distributional reinforcement learning in prefrontal cortex”, Nat. Neurosci. 27(3), 2024, pp. 403–408.
[539] Lee, R.S., Y. Sagiv, B. Engelhard, I.B. Witten, and N.D. Daw, “A feature-specific prediction error model explains dopaminergic heterogeneity”, Nat. Neurosci. 27(8), 2024, pp. 1574–1586.
[540] Barto, A., R.L. Lewis, and S. Singh, “Where do rewards come from?”, Proceedings of the 31st Annual Conference of the Cognitive Science Society, 2009.
[541] Keramati, M., and B. Gutkin, “Homeostatic reinforcement learning for integrating reward collection and physiological stability”, Elife 3, 2014.
[542] Juechems, K., and C. Summerfield, “Where does value come from?”, Trends Cogn. Sci. 23(10), 2019, pp. 836–850.
[543] Millidge, B., M. Walton, and R. Bogacz, “Reward bases: Instantaneous reward revaluation with temporal difference learning”, bioRxiv, 2022, pp. 2022.04.14.488361.
[544] Weber, L., D.M. Yee, D.M. Small, and F.H. Petzschner, “Rethinking reinforcement learning: The interoceptive origin of reward”, PsyArXiv, 2024.
[545] Li, F., J.W. Lindsey, E.C. Marin, et al., “The connectome of the adult Drosophila mushroom body provides insights into function”, Elife 9, 2020.
[546] Liu, D., M. Rahman, A. Johnson, et al., “A hypothalamic circuit underlying the dynamic control of social homeostasis”, bioRxiv, 2023.
[547] Allsop, S.A., R. Wichmann, F. Mills, et al., “Corticoamygdala transfer of socially derived information gates observational learning”, Cell 173(6), 2018, pp. 1329–1342.
[548] Kohl, J., B.M. Babayan, N.D. Rubinstein, et al., “Functional circuit architecture underlying parental behaviour”, Nature 556(7701), 2018, pp. 326–331.
[549] Dabney, W., Z. Kurth-Nelson, N. Uchida, et al., “A distributional code for value in dopamine-based reinforcement learning”, Nature 577(7792), 2020, pp. 671–675.
[550] Byrnes, S., “Dopamine-supervised learning in mammals & fruit flies”, 2021. https://www.lesswrong.com/posts/GnmLReqNrP4CThn6/dopamine-supervised-learning-in-mammals-and-fruit-flies
[551] Sutton, R.S., and A.G. Barto, Reinforcement learning: An introduction, MIT press, 2018.
[552] Morris, M.J., J.E. Beilharz, J. Maniam, A.C. Reichelt, and R.F. Westbrook, “Why is obesity such a problem in the 21st century? The intersection of palatable food, cues and reward pathways, stress, and cognition”, Neurosci. Biobehav. Rev. 58, 2015, pp. 36–45.
[553] Ulrich, M., A. Rüger, V. Durner, G. Grön, and H. Graf, “Reward is not reward: Differential impacts of primary and secondary rewards on expectation, outcome, and prediction error in the human brain’s reward processing regions”, Neuroimage 283(120440), 2023, pp. 120440.
[554] Markman, M., E. Saruco, S. Al-Bas, et al., “Differences in discounting behavior and brain responses for food and money reward”, eNeuro 11(4), 2024.
[555] Gadagkar, V., P.A. Puzerey, R. Chen, E. Baird-Daniel, A.R. Farhang, and J.H. Goldberg, “Dopamine neurons encode performance error in singing birds”, Science 354(6317), 2016, pp. 1278–1282.
[556] Duffy, A., K.W. Latimer, J.H. Goldberg, A.L. Fairhall, and V. Gadagkar, “Dopamine neurons evaluate natural fluctuations in performance quality”, Cell reports 38(13), 2022.
[557] Roeser, A., V. Gadagkar, A. Das, P.A. Puzerey, B. Kardon, and J.H. Goldberg, “Dopaminergic error signals retune to social feedback during courtship”, Nature 623(7986), 2023, pp. 375–380.
[558] Mandelblat-Cerf, Y., L. Las, N. Denisenko, and M.S. Fee, “A role for descending auditory cortical projections in songbird vocal learning”, Elife 3, 2014.
[559] Hisey, E., M.G. Kearney, and R. Mooney, “A common neural circuit mechanism for internally guided and externally reinforced forms of motor learning”, Nature neuroscience 21(4), 2018, pp. 589–597.
[560] Kearney, M.G., T.L. Warren, E. Hisey, J. Qi, and R. Mooney, “Discrete evaluative and premotor circuits enable vocal learning in songbirds”, Neuron 104(3), 2019, pp. 559–575.
[561] Roberts, T.F., S.M. Gobes, M. Murugan, B.P. Ölveczky, and R. Mooney, “Motor circuits are required to encode a sensory model for imitative learning”, Nature neuroscience 15(10), 2012, pp. 1454–1459.
[562] Mackevicius, E.L., and M.S. Fee, “Building a state space for song learning”, Curr. Opin. Neurobiol. 49, 2018, pp. 59–68.
[563] Kanoski, S.E., M.R. Hayes, and K.P. Skibicka, “GLP-1 and weight loss: Unraveling the diverse neural circuitry”, Am. J. Physiol. Regul. Integr. Comp. Physiol. 310(10), 2016, pp. R885–95.
[564] Ng, A.Y., and S. Russell, “Algorithms for inverse reinforcement learning”, ICML 67, 2000, pp. 663–670.
[565] Rothkopf, C.A., and D.H. Ballard, “Modular inverse reinforcement learning for visuomotor behavior”, Biol. Cybern. 107(4), 2013, pp. 477–490.
[566] Yamaguchi, S., H. Naoki, M. Ikeda, et al., “Identification of animal behavioral strategies by inverse reinforcement learning”, PLoS Comput. Biol. 14(5), 2018, pp. e1006122.
[567] Jara-Ettinger, J., “Theory of mind as inverse reinforcement learning”, Curr. Opin. Behav. Sci. 29, 2019, pp. 105–110.
[568] Tan, J., X. Shen, X. Zhang, Z. Song, and Y. Wang, “Estimating reward function from medial prefrontal cortex cortical activity using inverse reinforcement learning”, Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2022, 2022, pp. 3346–3349.
[569] Kalweit, G., M. Kalweit, M. Alyahyay, et al., “NeuRL: Closed-form inverse reinforcement learning for neural decoding”, arXiv [q-bio.NC], 2022.
[570] Ashwood, Z.C., A. Jha, and J.W. Pillow, “Dynamic inverse reinforcement learning for characterizing animal behavior”, Neural Inf Process Syst, 2022.
[571] Gong, J., S. Cao, S. Korivand, and N. Jalili, “Reconstructing human gaze behavior from EEG using inverse reinforcement learning”, Smart Health 32, 2024, pp. 100480.
[572] Lee, S.H., M.-H. Oh, and W.-Y. Ahn, “Inverse reinforcement learning captures value representations in the reward circuit in a real-time driving task: A preliminary study”, 2024.
[573] Urbaniak, R., M. Xie, and E. Mackevicius, “Linking cognitive strategy, neural mechanism, and movement statistics in group foraging behaviors”, Sci. Rep. 14(1), 2024, pp. 21770.
[574] Davidson, G., G. Todd, J. Togelius, T.M. Gureckis, and B.M. Lake, “Goals as reward-producing programs”, 2024. https://arxiv.org/abs/2405.13242
[575] Cisek, P., “Resynthesizing behavior through phylogenetic refinement”, Atten. Percept. Psychophys. 81(7), 2019, pp. 2265–2287.
[576] Karin, O., and U. Alon, “The dopamine circuit as a reward-taxis navigation system”, PLoS Comput. Biol. 18(7), 2022, pp. e1010340.
[577] Yao, Z., C.T. van Velthoven, M. Kunst, et al., “A high-resolution transcriptomic and spatial atlas of cell types in the whole mouse brain”, Nature 624(7991), 2023, pp. 317–332.
[578] Barack, D.L., and J.W. Krakauer, “Two views on the cognitive brain”, Nat. Rev. Neurosci., 2021, pp. 1–13.
[579] Räuker, T., A. Ho, S. Casper, and D. Hadfield-Menell, “Toward transparent AI: A survey on interpreting the inner structures of deep neural networks”, 2023.
[580] Bereska, L., and E. Gavves, “Mechanistic interpretability for AI safety – a review”, 2024.
[581] Olah, C., N. Cammarata, L. Schubert, G. Goh, M. Petrov, and S. Carter, “Zoom in: An introduction to circuits”, Distill 5(3), 2020, pp. e00024.001.
[582] Elhage, N., N. Nanda, C. Olsson, et al., “A mathematical framework for transformer circuits”, Transformer Circuits Thread, 2021.
[583] Zou, A., L. Phan, S. Chen, et al., “Representation engineering: A top-down approach to AI transparency”, arXiv [cs.LG], 2023.
[584] Nanda, N., “A longlist of theories of impact for interpretability”, 2022.
[585] Goh, G., N. Cammarata, C. Voss, et al., “Multimodal neurons in artificial neural networks”, Distill, 2021.
[586] Burns, C., H. Ye, D. Klein, and J. Steinhardt, “Discovering latent knowledge in language models without supervision”, 2024.
[587] Turner, A.M., L. Thiergart, G. Leech, et al., “Activation addition: Steering language models without optimization”, 2024.
[588] Bau, D., J.-Y. Zhu, H. Strobelt, et al., “GAN dissection: Visualizing and understanding generative adversarial networks”, 2018.
[589] Meng, K., D. Bau, A. Andonian, and Y. Belinkov, “Locating and editing factual associations in GPT”, 2023.
[590] Thompson, J.A.F., “Forms of explanation and understanding for neuroscience and artificial intelligence”, J. Neurophysiol. 126(6), 2021, pp. 1860–1874.
[591] Jonas, E., and K.P. Kording, “Could a neuroscientist understand a microprocessor?”, PLoS Comput. Biol. 13(1), 2017, pp. e1005268.
[592] Lillicrap, T.P., and K.P. Kording, “What does it mean to understand a neural network?”, arXiv [cs.LG], 2019.
[593] Olah, C., “Interpretability vs neuroscience”, 2021.
[594] Kar, K., S. Kornblith, and E. Fedorenko, “Interpretability of artificial neural network models in artificial intelligence vs. neuroscience”, Nat Mach Intell 4(12), 2022, pp. 1065–1067.
[595] Xu, C.S., M. Januszewski, Z. Lu, et al., “A connectome of the adult Drosophila central brain”, bioRxiv, 2020, pp. 2020.01.21.911859.
[596] Scheffer, L.K., C.S. Xu, M. Januszewski, et al., “A connectome and analysis of the adult Drosophila central brain”, Elife 9, 2020.
[597] Nguyen, J.P., F.B. Shipley, A.N. Linder, et al., “Whole-brain calcium imaging with cellular resolution in freely behaving Caenorhabditis elegans”, Proc. Natl. Acad. Sci. U. S. A. 113(8), 2016, pp. E1074–81.
[598] Aimon, S., T. Katsuki, T. Jia, et al., “Fast near-whole brain imaging in adult Drosophila during responses to stimuli and behavior”, bioRxiv, 2018, pp. 033803.
[599] Lad, V., W. Gurnee, and M. Tegmark, “The remarkable robustness of LLMs: Stages of inference?”, arXiv [cs.LG], 2024.
[600] Shin, H., M.B. Ogando, L. Abdeladim, et al., “Recurrent pattern completion drives the neocortical representation of sensory inference”, 2023.
[601] Adesnik, H., and L. Abdeladim, “Probing neural codes with two-photon holographic optogenetics”, Nat. Neurosci. 24(10), 2021, pp. 1356–1366.
[602] Shadlen, M.N., and W.T. Newsome, “The variable discharge of cortical neurons: Implications for connectivity, computation, and information coding”, J. Neurosci. 18(10), 1998, pp. 3870–3896.
[603] Watanakeesuntorn, W., K. Takahashi, K. Ichikawa, et al., “Massively parallel causal inference of whole brain dynamics at single neuron resolution”, 2020.
[604] Vyas, S., M.D. Golub, D. Sussillo, and K.V. Shenoy, “Computation through neural population dynamics”, Annu. Rev. Neurosci. 43(1), 2020, pp. 249–275.
[605] Maheswaranathan, N., A. Williams, M. Golub, S. Ganguli, and D. Sussillo, “Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics”, Adv. Neural Inf. Process. Syst. 32, 2019.
[606] Vaswani, A., N. Shazeer, N. Parmar, et al., “Attention is all you need”, 2017.
[607] Dutta, S., T. Gautam, S. Chakrabarti, and T. Chakraborty, “Redesigning the transformer architecture with insights from multi-particle dynamical systems”, arXiv [cs.LG], 2021.
[608] Beck, M., K. Pöppel, M. Spanring, et al., “xLSTM: Extended long short-term memory”, arXiv [cs.LG], 2024.
[609] Dao, T., and A. Gu, “Transformers are SSMs: Generalized models and efficient algorithms through structured state space duality”, arXiv [cs.LG], 2024.
[610] Elhage, N., T. Hume, C. Olsson, et al., “Toy models of superposition”, Transformer Circuits Thread, 2022.
[611] Barak, O., M. Rigotti, and S. Fusi, “The sparseness of mixed selectivity neurons controls the generalization-discrimination trade-off”, J. Neurosci. 33(9), 2013, pp. 3844–3856.
[612] Meunier, D., R. Lambiotte, and E.T. Bullmore, “Modular and hierarchically modular organization of brain networks”, Front. Neurosci. 4, 2010, pp. 200.
[613] Bonhoeffer, T., and A. Grinvald, “Iso-orientation domains in cat visual cortex are arranged in pinwheel-like patterns”, Nature 353, 1991, pp. 429–431.
[614] Tanaka, K., “Columns for complex visual object features in the inferotemporal cortex: Clustering of cells with similar but slightly different stimulus selectivities”, Cereb. Cortex 13(1), 2003, pp. 90–99.
[615] Paik, S.-B., and D.L. Ringach, “Retinal origin of orientation maps in visual cortex”, Nat. Neurosci. 14(7), 2011, pp. 919–925.
[616] Ringach, D.L., P.J. Mineault, E. Tring, N.D. Olivas, P. Garcia-Junco-Clemente, and J.T. Trachtenberg, “Spatial clustering of tuning in mouse primary visual cortex”, Nat. Commun. 7(1), 2016, pp. 12270.
[617] Voss, C., G. Goh, N. Cammarata, M. Petrov, L. Schubert, and C. Olah, “Branch specialization”, Distill 6(4), 2021.
[618] Achterberg, J., D. Akarca, D.J. Strouse, J. Duncan, and D.E. Astle, “Spatially-embedded recurrent neural networks reveal widespread links between structural and functional neuroscience findings”, 2022.
[619] Kumar, S., T.R. Sumers, T. Yamakoshi, et al., “Shared functional specialization in transformer-based language models and the human brain”, Nat. Commun. 15(1), 2024, pp. 5523.
[620] Dobzhansky, T., “Nothing in biology makes sense except in the light of evolution”, Am. Biol. Teach. 35(3), 1973, pp. 125–129.
[621] Kaznatcheev, A., and K.P. Kording, “Nothing makes sense in deep learning, except in the light of evolution”, 2022.
[622] Mordvintsev, A., E. Randazzo, E. Niklasson, M. Levin, and S. Greydanus, “Thread: Differentiable self-organizing systems”, Distill, 2020.
[623] Stanley, K.O., J. Clune, J. Lehman, and R. Miikkulainen, “Designing neural networks through neuroevolution”, Nat. Mach. Intell. 1(1), 2019, pp. 24–35.
[624] Sherrington, C.S., The integrative action of the nervous system, Yale University Press, New Haven, CT, US, 1906.
[625] Hartline, H.K., “The receptive fields of optic nerve fibers”, Am. J. Physiol. 130(4), 1940, pp. 690–699.
[626] De Boer, E., and P. Kuyper, “Triggered correlation”, IEEE Transactions on Biomedical Engineering(3), 1968, pp. 169–179.
[627] Marmarelis, P.Z., and V.Z. Marmarelis, Analysis of physiological systems: The white-noise approach, Plenum Press New York, 1978.
[628] Ringach, D.L., “Mapping receptive fields in primary visual cortex”, J. Physiol. 558(Pt 3), 2004, pp. 717–728.
[629] Schwartz, O., J.W. Pillow, N.C. Rust, and E.P. Simoncelli, “Spike-triggered neural characterization”, J. Vis. 6(4), 2006.
[630] Berkes, P., and L. Wiskott, “Analysis and interpretation of quadratic models of receptive fields”, Nat. Protoc. 2(2), 2007, pp. 400–407.
[631] Paninski, L., “Maximum likelihood estimation of cascade point-process neural encoding models”, Network: Computation in Neural Systems 15(4), 2004, pp. 243–262.
[632] McFarland, J.M., Y. Cui, and D.A. Butts, “Inferring nonlinear neuronal computation based on physiologically plausible inputs”, PLoS Comput. Biol. 9(7), 2013, pp. e1003143.
[633] Wu, A., I.-S. Park, and J.W. Pillow, “Convolutional spike-triggered covariance analysis for neural subunit models”, Adv. Neural Inf. Process. Syst., 2015, pp. 793–801.
[634] McIntosh, L.T., N. Maheswaranathan, A. Nayebi, S. Ganguli, and S.A. Baccus, “Deep learning models of the retinal response to natural scenes”, Adv. Neural Inf. Process. Syst. 29, 2016, pp. 1369–1377.
[635] Cadena, S.A., F.H. Sinz, T. Muhammad, et al., “How well do deep neural networks trained on object recognition characterize the mouse visual system?”, 2019.
[636] Gosselin, F., and P.G. Schyns, “Bubbles: A technique to reveal the use of information in recognition tasks”, Vision Res. 41(17), 2001, pp. 2261–2271.
[637] Hubel, D.H., and T.N. Wiesel, “Receptive fields and functional architecture of monkey striate cortex”, J. Physiol. 195(1), 1968, pp. 215–243.
[638] Butts, D.A., and M.S. Goldman, “Tuning curves, neuronal variability, and sensory coding”, PLoS Biol. 4(4), 2006, pp. e92.
[639] DiMattina, C., and K. Zhang, “Adaptive stimulus optimization for sensory systems neuroscience”, Front. Neural Circuits 7, 2013, pp. 101.
[640] Feather, J., G. Leclerc, A. Mądry, and J.H. McDermott, “Model metamers reveal divergent invariances between biological and artificial neural networks”, Nat. Neurosci., 2023, pp. 1–18.
[641] Gold, J.I., and M.N. Shadlen, “The neural basis of decision making”, Annu. Rev. Neurosci. 30, 2007, pp. 535–574.
[642] Cohen, M.R., and W.T. Newsome, “Estimates of the contribution of single neurons to perception depend on timescale and noise correlation”, J. Neurosci. 29(20), 2009, pp. 6635–6648.
[643] Bernstein, J.G., P.A. Garrity, and E.S. Boyden, “Optogenetics and thermogenetics: Technologies for controlling the activity of targeted cells within intact neural circuits”, Curr. Opin. Neurobiol. 22(1), 2012, pp. 61–71.
[644] Emiliani, V., E. Entcheva, R. Hedrich, et al., “Optogenetics for light control of biological systems”, Nat Rev Methods Primers 2(1), 2022, pp. 1–25.
[645] Churchland, M.M., J.P. Cunningham, M.T. Kaufman, et al., “Neural population dynamics during reaching”, Nature 487(7405), 2012, pp. 51–56.
[646] Saxena, S., and J.P. Cunningham, “Towards the neural population doctrine”, Curr. Opin. Neurobiol. 55, 2019, pp. 103–111.
[647] Ebitz, R.B., and B.Y. Hayden, “The population doctrine in cognitive neuroscience”, Neuron 109(19), 2021, pp. 3055–3068.
[648] Gallego, J.A., M.G. Perich, L.E. Miller, and S.A. Solla, “Neural manifolds for the control of movement”, Neuron 94(5), 2017, pp. 978–984.
[649] Chung, S., and L.F. Abbott, “Neural population geometry: An approach for understanding biological and artificial neural networks”, Curr. Opin. Neurobiol. 70, 2021, pp. 137–144.
[650] Norman, K.A., S.M. Polyn, G.J. Detre, and J.V. Haxby, “Beyond mind-reading: Multi-voxel pattern analysis of fMRI data”, Trends Cogn. Sci. 10(9), 2006, pp. 424–430.
[651] Kriegeskorte, N., and J. Diedrichsen, “Inferring brain-computational mechanisms with models of activity measurements”, Philos. Trans. R. Soc. Lond. B Biol. Sci. 371(1705), 2016.
[652] Bell, A.J., and T. Sejnowski, “A non-linear information maximisation algorithm that performs blind separation”, Adv. Neural Inf. Process. Syst., 1994, pp. 467–474.
[653] Lee, D.D., and H.S. Seung, “Learning the parts of objects by non-negative matrix factorization”, Nature 401(6755), 1999, pp. 788–791.
[654] Williams, A.H., T.H. Kim, F. Wang, et al., “Unsupervised discovery of demixed, low-dimensional neural dynamics across multiple timescales through tensor component analysis”, Neuron 98(6), 2018, pp. 1099–1115.e8.
[655] Pellegrino, A., H. Stein, and N.A. Cayco-Gajic, “Dimensionality reduction beyond neural subspaces with slice tensor component analysis”, Nat. Neurosci. 27(6), 2024, pp. 1199–1210.
[656] Haxby, J.V., “Multivariate pattern analysis of fMRI: The early beginnings”, Neuroimage 62(2), 2012, pp. 852–855.
[657] Naselaris, T., K.N. Kay, S. Nishimoto, and J.L. Gallant, “Encoding and decoding in fMRI”, Neuroimage 56(2), 2011, pp. 400–410.
[658] Wang, H.-T., J. Smallwood, J. Mourao-Miranda, et al., “Finding the needle in a high-dimensional haystack: Canonical correlation analysis for neuroscientists”, Neuroimage 216, 2020, pp. 116745.
[659] Sussillo, D., “Neural circuits as computational dynamical systems”, Curr. Opin. Neurobiol. 25, 2014, pp. 156–163.
[660] Strogatz, S., M. Friedman, A.J. Mallinckrodt, and S. McKay, “Nonlinear dynamics and chaos: With applications to physics, biology, chemistry, and engineering”, Comput. Phys. Commun. 8(5), 1994, pp. 532–532.
[661] Khosla, M., A.H. Williams, J. McDermott, and N. Kanwisher, “Privileged representational axes in biological and artificial neural networks”, 2024.
[662] Prince, J.S., C. Conwell, G.A. Alvarez, and T. Konkle, “A case for sparse positive alignment of neural systems”, 2024.
[663] Sporns, O., G. Tononi, and R. Kötter, “The human connectome: A structural description of the human brain”, PLoS Comput. Biol. 1(4), 2005, pp. e42.
[664] Seung, S., Connectome: How the brain’s wiring makes us who we are, Houghton Mifflin Harcourt, 2012.
[665] Bassett, D.S., and O. Sporns, “Network neuroscience”, Nat. Neurosci. 20(3), 2017, pp. 353–364.
[666] Barabási, D.L., G. Bianconi, E. Bullmore, et al., “Neuroscience needs network science”, 2023.
[667] Hulse, B.K., H. Haberkern, R. Franconville, et al., “A connectome of the Drosophila central complex reveals network motifs suitable for flexible navigation and context-dependent action selection”, Elife 10, 2021, pp. e66039.
[668] Shinomiya, K., A. Nern, I.A. Meinertzhagen, S.M. Plaza, and M.B. Reiser, “Neuronal circuits integrating visual motion information in Drosophila melanogaster”, Curr. Biol. 32(16), 2022, pp. 3529–3544.e2.
[669] Cammarata, N., G. Goh, S. Carter, C. Voss, L. Schubert, and C. Olah, “Curve circuits”, Distill, 2021.
[670] Zeiler, M.D., and R. Fergus, “Visualizing and understanding convolutional networks”, arXiv [cs.CV], 2013.
[671] Cammarata, N., S. Carter, G. Goh, C. Olah, M. Petrov, and L. Schubert, “Thread: circuits”, Distill 5(3), 2020, pp. e24.
[672] Wang, K., A. Variengien, A. Conmy, B. Shlegeris, and J. Steinhardt, “Interpretability in the wild: A circuit for indirect object identification in GPT-2 small”, arXiv [cs.LG], 2022.
[673] Heimersheim, S., and N. Nanda, “How to use and interpret activation patching”, arXiv [cs.LG], 2024.
[674] Templeton, A., T. Conerly, J. Marcus, et al., “Scaling monosemanticity: Extracting interpretable features from claude 3 sonnet”, Transformer Circuits Thread, 2024.
[675] Mueller, A., J. Brinkmann, M. Li, et al., “The quest for the right mediator: A history, survey, and theoretical grounding of causal interpretability”, arXiv [cs.LG], 2024.
[676] Cunningham, H., A. Ewart, L. Riggs, R. Huben, and L. Sharkey, “Sparse autoencoders find highly interpretable features in language models”, arXiv [cs.LG], 2023.
[677] Gao, L., T.D. la Tour, H. Tillman, et al., “Scaling and evaluating sparse autoencoders”, arXiv [cs.LG], 2024.
[678] Alain, G., and Y. Bengio, “Understanding intermediate layers using linear classifier probes”, arXiv [stat.ML], 2016.
[679] Rumelhart, D.E., J.L. Mcclelland, and P.R. Group, Parallel distributed processing, volume 1: Explorations in the microstructure of cognition: foundations, Bradford Books, Cambridge, Mass, 1987.
[680] Hinton, G.E., S. Osindero, and Y. Teh, “A fast learning algorithm for deep belief nets”, Neural Comput. 18(7), 2006, pp. 1527–1554.
[681] Lee, H., R. Grosse, R. Ranganath, and A.Y. Ng, “Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations”, Proceedings of the 26th international conference on machine learning, (2009), 609–616.
[682] Barredo Arrieta, A., N. Díaz-Rodríguez, J. Del Ser, et al., “Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI”, Inf. Fusion 58, 2020, pp. 82–115.
[683] Simonyan, K., A. Vedaldi, and A. Zisserman, “Deep inside convolutional networks: Visualising image classification models and saliency maps”, arXiv [cs.CV], 2013.
[684] Olah, C., A. Satyanarayan, I. Johnson, et al., “The building blocks of interpretability”, Distill 3(3), 2018.
[685] Zimmermann, R.S., J. Borowski, R. Geirhos, M. Bethge, T.S.A. Wallis, and W. Brendel, “How well do feature visualizations support causal understanding of CNN activations?”, Proceedings of the 35th international conference on neural information processing systems, Curran Associates Inc. (2024), 11730–11744.
[686] Freeman, J., C.M. Ziemba, D.J. Heeger, E.P. Simoncelli, and J.A. Movshon, “A functional and perceptual signature of the second visual area in primates”, Nat. Neurosci. 16(7), 2013, pp. 974–981.
[687] Olsson, C., N. Elhage, N. Nanda, et al., “In-context learning and induction heads”, arXiv [cs.LG], 2022.
[688] Geva, M., J. Bastings, K. Filippova, and A. Globerson, “Dissecting recall of factual associations in auto-regressive language models”, arXiv [cs.CL], 2023.
[689] Nanda, N., L. Chan, T. Lieberum, J. Smith, and J. Steinhardt, “Progress measures for grokking via mechanistic interpretability”, arXiv [cs.LG], 2023.
[690] Quirke, P., C. Neo, and F. Barez, “Increasing trust in language models through the reuse of verified circuits”, arXiv [cs.LG], 2024.
[691] Syed, A., C. Rager, and A. Conmy, “Attribution patching outperforms automated circuit discovery”, arXiv [cs.LG], 2023.
[692] Conmy, A., A. Mavor-Parker, A. Lynch, S. Heimersheim, and A. Garriga-Alonso, “Towards automated circuit discovery for mechanistic interpretability”, Adv. Neural Inf. Process. Syst. 36, 2023, pp. 16318–16352.
[693] Weiss, G., Y. Goldberg, and E. Yahav, “Thinking like transformers”, arXiv [cs.LG], 2021.
[694] Zhou, H., A. Bradley, E. Littwin, et al., “What algorithms can transformers learn? A study in length generalization”, 2023.
[695] Lindner, D., J. Kramar, S. Farquhar, M. Rahtz, T. McGrath, and V. Mikulik, “Tracr: Compiled transformers as a laboratory for interpretability”, arXiv, 2023.
[696] Kohonen, T., P. Lehtiö, J. Rovamo, J. Hyvärinen, K. Bry, and L. Vainio, “A principle of neural associative memory”, Neuroscience 2(6), 1977, pp. 1065–1076.
[697] Hopfield, J.J., “Neural networks and physical systems with emergent collective computational abilities”, Proc. Natl. Acad. Sci. U. S. A. 79(8), 1982, pp. 2554–2558.
[698] Hinton, G.E., “Learning distributed representations of concepts”, In R.G.M. Morris, ed., Parallel distributed processing: Implications for psychology and neurobiology, Clarendon Press/Oxford University Press, New York, NY, US, 1989, pp. 46–61.
[699] Thorpe, S., “Local vs. distributed coding”, Intellectica 8(2), 1989, pp. 3–40.
[700] Rigotti, M., O. Barak, M.R. Warden, et al., “The importance of mixed selectivity in complex cognitive tasks”, Nature 497(7451), 2013, pp. 585–590.
[701] Bricken, T., A. Templeton, J. Batson, et al., “Towards monosemanticity: Decomposing language models with dictionary learning”, Transformer Circuits Thread, 2023.
[702] McDougall, C.S., A. Conmy, C. Rushing, T. McGrath, and N. Nanda, “Copy suppression: Comprehensively understanding an attention head”, arXiv [cs.LG], 2023.
[703] Rajamanoharan, S., T. Lieberum, N. Sonnerat, et al., “Jumping ahead: Improving reconstruction fidelity with JumpReLU sparse autoencoders”, arXiv [cs.LG], 2024.
[704] Rajamanoharan, S., A. Conmy, L. Smith, et al., “Improving dictionary learning with gated sparse autoencoders”, arXiv [cs.LG], 2024.
[705] Lieberum, T., S. Rajamanoharan, A. Conmy, et al., “Gemma scope: Open sparse autoencoders everywhere all at once on gemma 2”, arXiv [cs.LG], 2024.
[706] Marks, S., C. Rager, E.J. Michaud, Y. Belinkov, D. Bau, and A. Mueller, “Sparse feature circuits: Discovering and editing interpretable causal graphs in language models”, arXiv [cs.LG], 2024.
[707] Marks, S., and M. Tegmark, “The geometry of truth: Emergent linear structure in large language model representations of true/false datasets”, arXiv [cs.AI], 2023.
[708] Sharma, A.S., D. Atkinson, and D. Bau, “Locating and editing factual associations in mamba”, arXiv [cs.CL], 2024.
[709] Wang, J., X. Ge, W. Shu, et al., “Towards universality: Studying mechanistic similarity across language model architectures”, arXiv [cs.CL], 2024.
[710] Liu, Z., E. Gan, and M. Tegmark, “Seeing is believing: Brain-inspired modular training for mechanistic interpretability”, Entropy (Basel) 26(1), 2023.
[711] Jarvis, D., R. Klein, B. Rosman, and A.M. Saxe, “On the specialization of neural modules”, arXiv [cs.LG], 2024.
[712] Clune, J., J.-B. Mouret, and H. Lipson, “The evolutionary origins of modularity”, Proc. Biol. Sci. 280(1755), 2013, pp. 20122863.
[713] Margalit, E., H. Lee, D. Finzi, J.J. DiCarlo, K. Grill-Spector, and D.L.K. Yamins, “A unifying framework for functional organization in early and higher ventral visual cortex”, Neuron 112(14), 2024, pp. 2435–2451.e7.
[714] Kouh, M., and T. Poggio, “A canonical neural circuit for cortical nonlinear operations”, Neural Comput. 20(6), 2008, pp. 1427–1451.
[715] Liu, Z., Y. Wang, S. Vaidya, et al., “KAN: Kolmogorov-Arnold networks”, 2024.
[716] Steinmetz, N.A., C. Aydin, A. Lebedeva, et al., “Neuropixels 2.0: A miniaturized high-density probe for stable, long-term brain recordings”, Science 372(6539), 2021, pp. eabf4588.
[717] Meng, Y., K. Hynynen, and N. Lipsman, “Applications of focused ultrasound in the brain: From thermoablation to drug delivery”, Nat. Rev. Neurol. 17, 2020, pp. 7–22.
[718] Sevilla, J., L. Heim, A. Ho, T. Besiroglu, M. Hobbhahn, and P. Villalobos, “Compute trends across three eras of machine learning”, arXiv [cs.LG], 2022.
[719] Markiewicz, C.J., K.J. Gorgolewski, F. Feingold, et al., “The OpenNeuro resource for sharing of neuroscience data”, eLife 10, 2021, pp. e71774.
[720] Marblestone, A., A. Gamick, T. Kalil, C. Martin, M. Cvitkovic, and S.G. Rodriques, “Unblock research bottlenecks with non-profit start-ups”, Nature 601(7892), 2022, pp. 188–190.
[721] Greydanus, S., and D. Kobak, “Scaling down deep learning with MNIST-1D”, arXiv [cs.LG], 2020.
[722] Wang, G., Y. Sun, S. Cheng, and S. Song, “Evolving connectivity for recurrent spiking neural networks”, Neural information processing systems, (2023).
[723] Abril-Pla, O., V. Andreani, C. Carroll, et al., “PyMC: A modern, and comprehensive probabilistic programming framework in Python”, PeerJ Comput. Sci. 9, 2023, pp. e1516.
[724] Stringer, C., M. Pachitariu, N. Steinmetz, C.B. Reddy, M. Carandini, and K.D. Harris, “Spontaneous behaviors drive multidimensional, brainwide activity”, Science 364(6437), 2019, pp. 255.
[725] Halchenko, Y., J.T.W. II, H. Christian, et al., “Dandi/dandi-cli: 0.64.0”, 2024. https://zenodo.org/records/14171788
[726] Rübel, O., A. Tritt, R. Ly, et al., “The Neurodata Without Borders ecosystem for neurophysiological data science”, eLife 11, 2022, pp. e78362.
[727] Gorgolewski, K.J., T. Auer, V.D. Calhoun, et al., “The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments”, Scientific Data 3(1), 2016, pp. 160044.
[728] Miller, K.L., F. Alfaro-Almagro, N.K. Bangerter, et al., “Multimodal population brain imaging in the UK Biobank prospective epidemiological study”, Nature Neuroscience 19(11), 2016, pp. 1523–1536.
[729] Kalcher, K., W. Huf, R.N. Boubela, et al., “Fully exploratory network independent component analysis of the 1000 functional connectomes database”, Frontiers in Human Neuroscience 6, 2012.
[730] Van Essen, D.C., K. Ugurbil, E. Auerbach, et al., “The Human Connectome Project: A data acquisition perspective”, NeuroImage 62(4), 2012, pp. 2222–2231.
[731] Casey, B.J., T. Cannonier, M.I. Conley, et al., “The Adolescent Brain Cognitive Development (ABCD) study: Imaging acquisition across 21 sites”, Developmental Cognitive Neuroscience 32, 2018, pp. 43–54.
[732] Alexander, L.M., J. Escalera, L. Ai, et al., “An open resource for transdiagnostic research in pediatric mental health and learning disorders”, Scientific Data 4(1), 2017, pp. 170181.
[733] Boyle, J., B. Pinsard, V. Borghesani, F. Paugam, E. DuPre, and P. Bellec, “The Courtois NeuroMod project: Quality assessment of the initial data release (2020)”, 2023 Conference on Cognitive Computational Neuroscience, Cognitive Computational Neuroscience (2023).
[734] Seeliger, K., R.P. Sommers, U. Güçlü, S.E. Bosch, and M.A.J. Van Gerven, “A large single-participant fMRI dataset for probing brain responses to naturalistic stimuli in space and time”, 2019. http://biorxiv.org/lookup/doi/10.1101/687681
[735] Siegel, J.S., S. Subramanian, D. Perry, et al., “Psilocybin desynchronizes the human brain”, Nature 632(8023), 2024, pp. 131–138.