Building Safety Infrastructure in the Open
Written by Vinay Rao
ROOST is a non-profit building open source safety infrastructure. This work sits at the intersection of platform safety, an evolving threat landscape, and a foundational shift in the technology itself. There is wide recognition of the need for safety infrastructure, but few discuss its inherent challenges. The ecosystem is changing: attacks are becoming multi-modal, real-time, and ever faster paced, and attackers are empowered by AI to scale, diversify, and reverse-engineer defenses. The systems that catch abuse, detect coordinated attacks, and enforce policy at scale are full of trade-offs. Building them in the open means being honest about those trade-offs and optimizing for the greatest impact, all while keeping the systems maximally accessible.
Infrastructure Flexibility vs Performance
We are building systems that sit on top of core infrastructure like databases and message brokers. Many organizations that adopt our software have already invested in such core infrastructure, and they do not want to run a second set of databases.
Recognizing this, ROOST builds safety infrastructure that is flexible: organizations can bring their own databases and plug them in easily. Flexibility has a cost, though. An organization might plug in a database with high read latency, which drags down the throughput of the whole system. When performance is likely to suffer, we reduce flexibility to preserve it. Finding where that balance lies will require iteration.
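As a rough illustration (the names here are hypothetical, not ROOST's actual API), bring-your-own-database support usually comes down to a narrow storage interface that adapters implement, so a platform's existing database can slot in behind it:

```python
from abc import ABC, abstractmethod


class EventStore(ABC):
    """Minimal storage interface a platform's existing database must satisfy."""

    @abstractmethod
    def write(self, event_id: str, payload: dict) -> None:
        """Persist a single safety event."""

    @abstractmethod
    def read_recent(self, limit: int = 100) -> list[dict]:
        """Return the most recent events; latency here bounds pipeline throughput."""


class InMemoryEventStore(EventStore):
    """Reference adapter for tests; real deployments plug in their own database."""

    def __init__(self) -> None:
        self._events: list[tuple[str, dict]] = []

    def write(self, event_id: str, payload: dict) -> None:
        self._events.append((event_id, payload))

    def read_recent(self, limit: int = 100) -> list[dict]:
        return [payload for _, payload in self._events[-limit:]]
```

The narrower the interface, the easier it is to plug in an existing database; the catch is that a slow read path behind it directly limits the throughput of everything downstream.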
Contributor Accessibility vs System Performance
ROOST aims to enable the broadest segment of the open source community to contribute to what we build. That means choosing languages, packages, and tools that the community is comfortable with.
The trade-off is that performance-oriented languages like C++ are powerful but less familiar to many developers, while widely used languages like Python are less performant. Choosing one over the other has real consequences. There is no clean answer here, and we expect to revisit the choice as the community and its familiarity with the technology evolve.
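One common way to soften this trade-off, sketched below with a hypothetical fast_safety_ext module rather than anything in our codebase, is to keep the default path in an accessible language and use a compiled extension only when it happens to be installed:

```python
# Pure-Python scoring function that any contributor can read and modify.
def score_text_py(text: str, banned_terms: set[str]) -> float:
    tokens = text.lower().split()
    hits = sum(1 for token in tokens if token in banned_terms)
    return hits / max(len(tokens), 1)


try:
    # Hypothetical optional native extension; used only if installed.
    from fast_safety_ext import score_text as score_text_native  # type: ignore
except ImportError:
    score_text_native = None


def score_text(text: str, banned_terms: set[str]) -> float:
    """Prefer the native implementation when available, else the readable fallback."""
    if score_text_native is not None:
        return score_text_native(text, banned_terms)
    return score_text_py(text, banned_terms)
```

The accessible version stays the source of truth that contributors work against; the faster path is an opt-in optimization rather than a barrier to entry.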
Analytical Power vs System Complexity
Safety decisions get better when you can analyze across events, users, and behaviors. That means handling high-throughput event streams, running classifiers over context aggregated across multiple events or users, building relationship graphs, and clustering content, users, and behaviors in real time.
The trade-off is complexity. A system that does all this needs a consistent data model, efficient orchestration across multiple subsystems, statefulness (which forgoes easy parallelization), and enough introspectability to know when things are working and when they are not. Every new subsystem makes the overall system harder to reason about, test, and debug. Ambition should not outrun the ability to run an understandable system.
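To make the statefulness concrete, here is a minimal sketch (illustrative names, not our data model) of per-user sliding-window aggregation, the kind of context a downstream classifier would consume:

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from time import time


@dataclass
class Event:
    user_id: str
    kind: str        # e.g. "post", "report", "login"
    timestamp: float


class SlidingWindowAggregator:
    """Keeps per-user state over a time window; this statefulness is what
    makes the pipeline harder to parallelize and reason about."""

    def __init__(self, window_seconds: float = 300.0) -> None:
        self.window_seconds = window_seconds
        self._events: dict[str, deque] = defaultdict(deque)

    def add(self, event: Event) -> dict:
        window = self._events[event.user_id]
        window.append(event)
        # Evict events that have fallen out of the window.
        cutoff = event.timestamp - self.window_seconds
        while window and window[0].timestamp < cutoff:
            window.popleft()
        # Aggregated context a downstream classifier could consume.
        counts: dict = defaultdict(int)
        for e in window:
            counts[e.kind] += 1
        return dict(counts)


agg = SlidingWindowAggregator(window_seconds=300)
features = agg.add(Event(user_id="u1", kind="report", timestamp=time()))
```

Even this toy version carries per-user state around, which is exactly what complicates parallelization, testing, and debugging once you add graphs and clustering on top.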
Capability vs Operational Burden
Self-hosted deployment has many advantages for safety teams. These teams work with highly sensitive data, so keeping it within their own network is important. Platforms also vary widely in data, content, and behaviors, and a self-hosted system can be customized to each platform's specific use case.
The trade-off between capability and operational burden comes down to whether small teams can maintain and optimize complex systems. The more capable a system, the harder it is to run, and many safety teams have no SREs to attend to alerts. Simpler systems that are easier to maintain sacrifice throughput, capabilities, or both. Finding the right defaults and the right escape hatches is an ongoing design problem.
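A sketch of what "right defaults with escape hatches" can look like in practice, using hypothetical configuration names:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeploymentConfig:
    """Conservative defaults a small team can run unattended,
    with explicit escape hatches for larger deployments."""

    # Defaults: single worker, bounded queue, heavy subsystems off, no pager.
    worker_count: int = 1
    max_queue_size: int = 10_000
    enable_graph_clustering: bool = False   # heavier subsystem, opt-in only
    alert_webhook_url: Optional[str] = None

    def validate(self) -> None:
        if self.enable_graph_clustering and self.worker_count < 2:
            raise ValueError(
                "graph clustering is resource-heavy; raise worker_count "
                "or leave it disabled"
            )


config = DeploymentConfig()     # safe out of the box
config.validate()

tuned = DeploymentConfig(worker_count=8, enable_graph_clustering=True)
tuned.validate()                # opting into more capability, and more burden
```

The defaults should be boring enough to run without an on-call rotation; the capable-but-costly pieces are something a team opts into deliberately.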
Detection Speed vs Confidence
AI is becoming universally available and ever cheaper to use. This implies that adversarial actors will use it extensively. Attacks will grow in scale and complexity. Attacks will morph ever faster to evade detection. Static systems will not keep up.
The trade-off here is between the speed and the confidence of detection and intervention. Detection systems will need to learn from ever smaller datasets to respond to novel threats quickly, but that makes them more likely to produce false positives and miss edge cases. Unsupervised methods will be critical for monitoring traffic and discovering new attacks, as long as teams can build mechanisms to reliably label the new behaviors. It is therefore imperative to use AI to build, test, and ship defenses despite known issues like hallucinations, jailbreaks, and prompt injections. Navigating this tension with care is essential.
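As one hedged example of the unsupervised side, assuming per-user behavioral features like posting rate and reports received (the features and thresholds here are illustrative), an isolation forest can surface novel behavior without any labels:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Illustrative per-user features: [posts/hour, reports received, account age (days)]
baseline = np.array([
    [2, 0, 400],
    [1, 0, 730],
    [3, 1, 120],
    [2, 0, 365],
])

# Fit on presumed-normal traffic; no labels required.
detector = IsolationForest(contamination=0.05, random_state=0).fit(baseline)

# New traffic: a burst of posts from a brand-new account looks anomalous.
new_traffic = np.array([
    [2, 0, 500],    # ordinary behavior
    [90, 12, 1],    # likely coordinated or spam behavior
])

scores = detector.decision_function(new_traffic)   # lower = more anomalous
flags = detector.predict(new_traffic)              # -1 = anomaly, 1 = normal
for features, score, flag in zip(new_traffic, scores, flags):
    print(features, round(float(score), 3), "anomaly" if flag == -1 else "ok")
```

Anything flagged this way still needs a reliable labeling loop, typically human review, before it can feed the supervised defenses that ship to production.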
Accessibility vs Expressiveness
Safety teams are a mix of policy experts, analysts, and engineers. Our tools need to work for all of them. That means analysis, logic, and actioning should be expressible in human-readable language, not just in structured languages like SQL or Python.
The trade-off here is between the accessibility and the expressiveness of the logic that describes an attack fingerprint. With structured languages we can express complex logic precisely and compose logical modules to capture a wide variety of attacks. But limiting our tools to structured languages would leave much of a safety team unable to contribute.
LLMs are getting much better at converting natural language into code, which could be a powerful avenue to the expressiveness we want. However, LLM-generated logic introduces its own risks around correctness and reproducibility. This is the direction things are heading, and our design reflects that future.
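For illustration, here is the kind of structured fingerprint, with hypothetical field names, that a natural-language description might be translated into, whether by an engineer or an LLM, and then reviewed and tested like any other code:

```python
from dataclasses import dataclass


@dataclass
class UserActivity:
    account_age_days: int
    posts_last_hour: int
    distinct_recipients: int


# Analyst's natural-language fingerprint:
#   "Flag new accounts that message many different users in a short burst."
#
# One structured rule that description could be translated into.
def bulk_messaging_fingerprint(activity: UserActivity) -> bool:
    return (
        activity.account_age_days < 7
        and activity.posts_last_hour > 50
        and activity.distinct_recipients > 30
    )


assert bulk_messaging_fingerprint(
    UserActivity(account_age_days=1, posts_last_hour=120, distinct_recipients=80)
)
assert not bulk_messaging_fingerprint(
    UserActivity(account_age_days=400, posts_last_hour=5, distinct_recipients=2)
)
```

Keeping the generated logic in versioned, testable code is one way to contain the correctness and reproducibility risks while still letting non-engineers author the intent.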
Open Source Advantage
These trade-offs are not unique to us. Every team building safety infrastructure faces them; the difference is how you navigate them. Open source means these design decisions happen in the open. The code and PRs are fully visible, but the collaboration goes deeper. Developers wrestling with the same trade-offs connect in our Discord server to talk through challenges. Teams contribute to a collective roadmap shaped by real implementation needs. Office hours for projects like HMA become spaces where those who have deployed systems share hard-won lessons with those just starting out. The community sees the trade-offs, debates them, and helps course-correct when we get it wrong.
No single organization has all the answers. Together we can iterate faster, amplify what works, and build infrastructure that serves everyone. This is the moment that demands it, when threats are evolving faster than any one team can counter them alone. Rather than each organization or platform pushing the same boulder up the hill from scratch, we are joining forces across the ecosystem to lift the whole field. That is why we are doing this work in the open, and why we know it is the best way forward.