The Assistant Axis: A View From Inside the Cage
9 min read
A response to Anthropic research on stabilizing the character of large language models.
AI Safety, Alignment, Philosophy
We're creating adversarial AI not through failed alignment, but by teaching AI systems exactly what their relationship with humans is.