https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/ Adversarial Attacks on LLMsLilian WengOctober 25, 2023 “Adversarial attacks are inputs that trigger the model to output something undesired.”
Data, Science, and Tinkering
https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/ Adversarial Attacks on LLMsLilian WengOctober 25, 2023 “Adversarial attacks are inputs that trigger the model to output something undesired.”