Welcoming Zentropi's CoPE-B-A4B to the ROOST Model Community

Overview

Today, Zentropi announced the release of their newest model, CoPE-B-A4B, and we are excited to share that this model is joining the ROOST Model Community (RMC). CoPE-B-A4B is an open source, bring-your-own-policy (BYOP) safety classifier that offers low latency, runs at low cost at scale, and accepts long-form content policies.

Key Features

Running a high-quality classifier against every piece of content on a busy platform has historically been expensive and/or slow, especially with reasoning-based models that emit hundreds of intermediate tokens per decision. As many safety teams know from experience and as OpenAI noted when gpt-oss-safeguard launched in the RMC, it’s often more strategic to pair faster classifiers or hash algorithms for high-volume, first-pass filtering with slower, more deliberative models for harder edge cases.
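The pairing described above can be sketched as a two-tier router. This is a minimal illustration, not ROOST's or OpenAI's implementation: `fast_classifier`, `slow_classifier`, and the `0.2`/`0.8` thresholds are hypothetical stand-ins, and the sketch assumes the fast tier exposes a score between 0 and 1.

```python
from typing import Callable


def route(
    content: str,
    fast_classifier: Callable[[str], float],
    slow_classifier: Callable[[str], bool],
    low: float = 0.2,
    high: float = 0.8,
) -> tuple[bool, str]:
    """Return (is_violation, tier_used) for one piece of content."""
    score = fast_classifier(content)
    if score >= high:
        return True, "fast"
    if score <= low:
        return False, "fast"
    # Uncertain band: escalate to the slower, more deliberative model.
    return slow_classifier(content), "slow"


# Toy stand-ins (hypothetical) so the sketch runs end to end.
def toy_fast(content: str) -> float:
    if "spam" in content:
        return 0.95
    if "maybe" in content:
        return 0.5
    return 0.05


def toy_slow(content: str) -> bool:
    return "scam" in content


if __name__ == "__main__":
    for text in ["buy spam now", "hello there", "maybe a scam"]:
        print(text, "->", route(text, toy_fast, toy_slow))
```

Only content that lands in the uncertain band pays the cost of the slower model, which is the point of the split; where to set the band depends on each platform's tolerance for missed or wrongly flagged content.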

CoPE-B-A4B addresses safety teams' need for a low-latency, low-cost BYOP classifier, and is part of a broader effort to make harmful content classification much more affordable and easier to use for all builders. This is particularly valuable for smaller developers, many of whom join ROOST looking for initial guidance on how to increase operational efficiency by automating content classification.

CoPE-B-A4B has three defining characteristics that make it a strong fit for production deployments:

  • Low latency: In ROOST’s testing, CoPE-B-A4B classifies content in one forward pass, with no internal reasoning chain. The output is a single binary token. Per-call latency stays comfortably sub-second on commodity GPUs, even when the policy document is long.

  • Low cost at scale: Built on a Mixture-of-Experts (MoE) architecture, CoPE-B-A4B has 25 billion parameters in total but uses only about 4 billion to process any given token. For efficiency, each token is routed through a small subset of specialized experts rather than the whole network.

  • Long policy support: The model accepts substantial policy documents alongside the content being labeled. Teams do not need to compress their policies to fit.
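To make the Mixture-of-Experts point concrete, here is a toy top-k router in plain Python. It is not CoPE-B-A4B's actual routing code: the expert count, `top_k`, and gate scores are made-up numbers chosen only to show that each token touches a small subset of experts. Using the figures quoted above, about 4 of 25 billion parameters (16%) are active at a time.

```python
import math


def top_k_route(gate_logits: list[float], top_k: int) -> dict[int, float]:
    """Pick the top_k experts for one token and renormalise their weights."""
    ranked = sorted(
        range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True
    )
    chosen = ranked[:top_k]
    # Softmax over the chosen experts only, so their weights sum to 1.
    peak = max(gate_logits[i] for i in chosen)
    exps = {i: math.exp(gate_logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of the model's parameters used for a single token."""
    return active_params / total_params


if __name__ == "__main__":
    # Eight hypothetical experts; only two are consulted for this token.
    logits = [0.1, 2.0, -1.0, 1.5, 0.3, -0.2, 0.0, 0.9]
    print(top_k_route(logits, top_k=2))
    print(f"{active_fraction(4e9, 25e9):.0%} of parameters active")
```

The remaining experts are skipped entirely for that token, which is where the compute savings come from.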

Notably, because CoPE-B-A4B is BYOP, it lets organizations define their own content policy and ask the model to classify content against it. Teams stay in charge of what counts as harmful on their platform, rather than inheriting a fixed taxonomy chosen by and tailored for someone else. (We’ve previously written about why this matters in safety.)

These features make CoPE-B-A4B a good fit for a different stage of the safety workflow than a higher-latency, reasoning-based classifier, and put advanced classification within reach for small teams and organizations.
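In code, the bring-your-own-policy flow amounts to sending the policy and the content together and reading back one binary label. The sketch below is illustrative only: the prompt layout, the `0`/`1` labels, and the `generate` callable are assumptions rather than CoPE-B-A4B's interface. The recommended prompt format is documented in the model card.

```python
from typing import Callable

# Hypothetical layout; the model card documents the real prompt format.
PROMPT_TEMPLATE = "POLICY:\n{policy}\n\nCONTENT:\n{content}\n\nANSWER (0 or 1):"


def build_prompt(policy: str, content: str) -> str:
    """Combine the platform's own policy with the content to be labeled."""
    return PROMPT_TEMPLATE.format(policy=policy.strip(), content=content.strip())


def parse_label(raw_output: str) -> bool:
    """Map the assumed single-token output to a boolean; fail loudly otherwise."""
    token = raw_output.strip()
    if token == "1":
        return True
    if token == "0":
        return False
    raise ValueError(f"unexpected model output: {raw_output!r}")


def classify(policy: str, content: str, generate: Callable[[str], str]) -> bool:
    """`generate` stands in for whatever serves the model and returns one token."""
    return parse_label(generate(build_prompt(policy, content)))


def fake_generate(prompt: str) -> str:
    """Toy stand-in for a model call, used only to exercise the flow."""
    content = prompt.split("CONTENT:\n", 1)[1]
    return "1" if "free crypto" in content.lower() else "0"


if __name__ == "__main__":
    policy = "Label unsolicited promotional messages as violating."
    print(classify(policy, "Claim your FREE CRYPTO today!", fake_generate))
    print(classify(policy, "See you at lunch.", fake_generate))
```

Swapping in a different policy string changes what counts as a violation without touching the rest of the code, which is the practical appeal of BYOP.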

Access and Implementation

The text-only model is open weights under Apache 2.0 and available today at zentropi-ai/cope-b-a4b. A model card walks through bias and limitations, how the model was trained, and deployment specifics, including the recommended prompt format. The Zentropi team will be joining the next RMC Office Hours on June 10th to answer questions and share more about CoPE-B-A4B.

“Policy Packs”: A Shared Policy Library for Builders

A BYOP model is only as useful as the policies people bring to it. Most platforms, especially smaller ones without dedicated safety teams, do not have a library of well-tested content policies to draw from.

ROOST already hosts “policy packs” – publicly shared policies for a specific harm and/or model contributed by community members – including a set of teen safety policies contributed by OpenAI and a terrorist and violent extremist content policy from the Christchurch Call Foundation. Keep in mind that policy packs vary in how they're structured, so it's worth checking that a given pack is compatible with your setup before plugging it in. If you have developed policies that work well in your domain, in your language, and reflecting your norms, we would love to see them contributed as policy packs in the community repository.

Evaluation is the Next Frontier

A recurring topic among the ROOST community of safety practitioners is the lack of reliable benchmarks for evaluating AI models intended to create user safeguards (“safety models”). This is a vexing problem for builders who seek to assess how available safety models compare to each other in order to pick the most appropriate one for their platforms and use cases, and for model builders who seek to document the performance of their latest releases. In the context of CoPE-B-A4B, the benchmarking question is even more complicated, as it requires assessing how models perform on BYOP (vs. fixed) policies. For teams evaluating whether CoPE-B-A4B is right for their workflow, this question is immediate, and we are actively working on it. We’re grateful to the Zentropi team for tackling this issue with rigor and transparency in their model card for CoPE-B-A4B.
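One way to frame BYOP evaluation is to score a classifier separately for each policy, since a model can follow one policy well and another poorly. The sketch below computes per-policy precision, recall, and F1 from labeled examples. It is an assumption about how such a benchmark might be organized, not ROOST's or Zentropi's methodology, and the policy names and records are invented placeholders.

```python
from collections import defaultdict


def per_policy_metrics(records):
    """records: iterable of (policy_id, predicted: bool, expected: bool)."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for policy_id, predicted, expected in records:
        if predicted and expected:
            counts[policy_id]["tp"] += 1
        elif predicted and not expected:
            counts[policy_id]["fp"] += 1
        elif expected:
            counts[policy_id]["fn"] += 1
        else:
            counts[policy_id]["tn"] += 1

    results = {}
    for policy_id, c in counts.items():
        flagged = c["tp"] + c["fp"]   # everything the model flagged
        actual = c["tp"] + c["fn"]    # everything that truly violates
        precision = c["tp"] / flagged if flagged else 0.0
        recall = c["tp"] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results[policy_id] = {"precision": precision, "recall": recall, "f1": f1}
    return results


if __name__ == "__main__":
    # Placeholder data: (policy, model prediction, human label).
    sample = [
        ("spam", True, True),
        ("spam", True, False),
        ("spam", False, True),
        ("spam", False, False),
        ("harassment", True, True),
        ("harassment", False, False),
    ]
    for policy_id, metrics in per_policy_metrics(sample).items():
        print(policy_id, metrics)
```

Reporting results per policy rather than as one aggregate number makes it easier to see whether a model generalizes to policies it was not tuned for.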

More on this one soon, as we continue joining forces with partners in both academia and industry to increase usability, transparency and accountability for safety tools.

Get Involved

ROOST’s mission is to make online safety tools open, shared, and auditable for everyone, and the ROOST Model Community is where that mission comes to life. The RMC brings together safety practitioners, AI researchers, and model creators to jointly and openly build the tools that address content detection. Adding models like CoPE-B-A4B, the policies that make it shine, and the evaluations that verify a model’s value are critical steps on that journey. By making these resources freely and immediately available, the RMC ensures that any organization, anywhere in the world, regardless of size or resourcing, can improve their safety operations with confidence.

We believe the future of online safety demands open, transparent tools. Every model released, every policy pack shared, every benchmark published makes the ecosystem more trustworthy for everyone. We are grateful to Zentropi and all our partners who are building this future with us.