[59]
Francesco Marra, Diego Gragnaniello, Luisa Verdoliva, and Giovanni Poggi. Do GANs leave
artificial fingerprints? In 2019 IEEE Conference on Multimedia Information Processing and
Retrieval (MIPR), pages 506–511, 2019. doi: 10.1109/MIPR.2019.00103.
[60]
Masnoon Nafees, Shimei Pan, Zhiyuan Chen, and James R Foulds. Impostor GAN: Toward
modeling social media user impersonation with generative adversarial networks. In Deceptive
AI, pages 157–165. Springer, 2020.
[61]
John D. Norton. Ignorance and indifference. Philosophy of Science, 75(1):45–68, 2008. doi:
10.1086/587822.
[62]
Chris Olah. Mechanistic interpretability, variables, and the importance of interpretable bases.
Transformer Circuits Thread (June 27).
http://www.transformer-circuits.pub/2022/mech-interp-essay/index.html, 2022.
[63]
Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova DasSarma, Tom
Henighan, Ben Mann, Amanda Askell, Yuntao Bai, Anna Chen, et al. In-context learning and
induction heads. arXiv preprint arXiv:2209.11895, 2022.
[64]
Stephen M. Omohundro. The basic AI drives. In Pei Wang, Ben Goertzel, and Stan Franklin,
editors, Artificial General Intelligence 2008, Proceedings of the First AGI Conference, AGI
2008, March 1-3, 2008, University of Memphis, Memphis, TN, USA, volume 171 of Frontiers
in Artificial Intelligence and Applications, pages 483–492. IOS Press, 2008. URL
http://www.booksonline.iospress.nl/Content/View.aspx?piid=8341.
[65] OpenAI. GPT-4 technical report, 2023.
[66]
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L Wainwright, Pamela Mishkin,
Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. Training language models
to follow instructions with human feedback. arXiv preprint arXiv:2203.02155, 2022.
[67]
Lorenzo Pacchiardi, Alex J. Chan, Sören Mindermann, Ilan Moscovitz, Alexa Y. Pan, Yarin
Gal, Owain Evans, and Jan Brauner. How to catch an AI liar: Lie detection in black-box LLMs
by asking unrelated questions, 2023.
[68]
Alison R. Panisson, Stefan Sarkadi, Peter McBurney, Simon Parsons, and Rafael H. Bordini.
Lies, bullshit, and deception in agent-oriented programming languages. In Robin Cohen,
Murat Sensoy, and Timothy J. Norman, editors, Proceedings of the 20th International Trust
Workshop co-located with AAMAS/IJCAI/ECAI/ICML 2018, Stockholm, Sweden, July 14, 2018,
volume 2154 of CEUR Workshop Proceedings, pages 50–61. CEUR-WS.org, 2018. URL
http://ceur-ws.org/Vol-2154/paper5.pdf.
[69]
Peter S. Park, Simon Goldstein, Aidan O’Gara, Michael Chen, and Dan Hendrycks. AI
deception: A survey of examples, risks, and potential solutions, 2023.
[70] Judea Pearl. Causality. Cambridge University Press, 2009.
[71]
Julien Perolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent
de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas Anthony, Stephen McAleer,
Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina
Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc
Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent
Sifre, Nathalie Beauguerlange, Remi Munos, David Silver, Satinder Singh, Demis Hassabis,
and Karl Tuyls. Mastering the game of Stratego with model-free multiagent reinforcement
learning. Science, 378(6623):990–996, 2022. doi: 10.1126/science.add4679.
URL https://www.science.org/doi/abs/10.1126/science.add4679.
[72]
Denis Peskov, Benny Cheng, Ahmed Elgohary, Joe Barrow, Cristian Danescu-Niculescu-Mizil,
and Jordan Boyd-Graber. It takes two to lie: One to lie, and one to listen. In Proceedings
of the 58th Annual Meeting of the Association for Computational Linguistics, pages 3811–3854,
Online, July 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.acl-main.353.
URL https://aclanthology.org/2020.acl-main.353.