Comment by lionkor

15 hours ago

Hi, I'm curious how preventing jailbreaks protects the user?

> Prompt guardrails to prevent jailbreak attempts and ensure safe user interactions [...]

3 comments

lionkor

Untrusted inputs to systems with agency or access to privileged data. Here’s a data exfiltration example in Google AI Studio:

https://x.com/wunderwuzzi23/status/1821210923157098919

sparacha 14 hours ago

That's a fair point - technically it protects the application from malicious attempts to subvert the desired LLM experience. The more specific language (and I think we could do better here) would be that Arch ensures users remain within the bounds of an intended LLM experience. That at least was the intention behind "ensure safe user interactions"...

adilhafeez 14 hours ago

Jailbreak ensures a smooth developer experience by controlling what traffic from user make its way to the model. With jailbreak (and other guardrails soon to be added) developers can short-circuit response and with observability developers can get insights on how users are interacting with their APIs.