- agentlans/prompt-safety-classification
  Viewer • Updated • 72.1k • 59
- Jammies-io/safety-refusal
  Viewer • Updated • 100 • 9
- RefusalBench: Generative Evaluation of Selective Refusal in Grounded Language Models
  Paper • 2510.10390 • Published • 3
- nvidia/Aegis-AI-Content-Safety-Dataset-2.0
  Viewer • Updated • 33.4k • 3.56k • 71
Daniel Bis
danielbis
AI & ML interests
https://scholar.google.com/citations?user=ArMgXHYAAAAJ&hl=en
Recent Activity
updated a collection 14 days ago: safety
updated a collection 17 days ago: safety
liked a model about 2 months ago: zentropi-ai/cope-a-9b
Organizations
None yet