Published on April 12, 2025 6:36 AM GMT
2025-04-12
DISCLAIMER
- This document is written quickly and contains opinions I may change quickly as I get new info.
- This document contains politically sensitive info that I might take down in future.
In short, this document describes how to set up distributed whistleblowing processes that reduce personal risk for everyone involved. Typically, whistleblowing (such as with the WikiLeaks or Snowden leaks) incurs significant personal risk. Reducing that risk may make whistleblowing highly likely whenever an org doesn't have the complete trust of all its members. Such orgs are then forced to pay a secrecy tax (in Assange's words) relative to orgs that do have complete trust and/or higher levels of transparency.
Potential problems
- Low-attention regime
- Whistleblower sends documents to Bob via SecureDrop server or via hard disk dead drop
- PROBLEM: good infra, protocols, incentives to coordinate hard disk dead drops don't exist, especially if trying to use multiple hops to reach destination
- IMO hard disk dead drops are better than using Tor + Tails + PGP, as of 2025
- PROBLEM: need public guidelines on redaction, so anyone can do it (i.e. become Bob)
- PROBLEM: for leaked docs/videos that are not especially high-stakes (high-stakes meaning, for example, govt classified docs), it might be okay to just publish the redacted docs/videos on a website at this stage. The website must be hard-but-not-impossible to censor, for instance 4chan for docs or Rumble for video. Unsure which existing websites are best suited for this use case.
- Mirror a searchable version of docs to thousands of servers immediately
- PROBLEM: need open source web crawler to crawl entire internet including any leaked docs/videos
- OR: PROBLEM: need a standard protocol to only crawl websites and torrents that claim to have leaked docs on them (maybe they include a special flag in their readme/robots.txt, and some cryptographic proof-of-work to prove they're not spam)
- PROBLEM: need open source plaintext extraction and embedding generation so that along with the raw html crawls (WARC), the plaintext and embeddings are also circulated in the same torrent. Need standardised format (WARC-parquet?) that keeps some metadata just like WARC keeps metadata.
- Popular media website will do document verification. I'm assuming they won't face any significant challenge with this.
- PROBLEM: need open source crawling and mirroring of crawls of all social media
- I think actually doing distributed social media is too hard. The complexity of the app ensures the software developers who write it are politically co-optable. What's easier is distributed crawling and mirroring of a centralised site, so people in future can still view the consensus reached by users of the social media. If it ever gets taken down, someone can get a new server running (it does not have to have the content of the old one).
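The "special flag plus proof-of-work" idea above could look something like the following. This is a minimal hashcash-style sketch, not a proposed standard: the function names, the `site:nonce` string layout, the choice of SHA-256, and the difficulty value are all assumptions made for illustration. Where the resulting nonce would be published (readme, robots.txt, torrent metadata) is left open, as in the text.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(site: str, difficulty: int) -> int:
    """Search for a nonce whose hash has at least `difficulty` leading zero bits.

    Hypothetical scheme: the site operator runs this once and publishes the nonce
    next to its "leaked docs here" flag. Expected cost is about 2**difficulty hashes.
    """
    for nonce in count():
        digest = hashlib.sha256(f"{site}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(site: str, nonce: int, difficulty: int) -> bool:
    """Cheap check a crawler can run (one hash) before deciding to crawl the site."""
    digest = hashlib.sha256(f"{site}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


if __name__ == "__main__":
    nonce = solve("example.org", difficulty=16)
    print(nonce, verify("example.org", nonce, 16))
```

The asymmetry is the point: producing the nonce is costly, checking it is one hash, so mass-producing spam flags gets expensive while crawlers stay cheap. A real protocol would also need to decide how difficulty is set and whether proofs expire, which this sketch does not address.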
Summary of potential solutions
- coordination for hard disk dead drops, including multi-hop hard disk dead drops
- redaction guidelines
- open source web crawling
- flags and proof-of-work to only crawl some websites
- crawl and mirror leaked docs. crawl and mirror social media discussions.
- standardise format to share extracted plaintext and embeddings
- to publish torrent link, maybe raw docs, and social media discussions
- guidelines must be country-wise and include legal considerations. always use a social media of a country different from the country where leak happened.
- torrents proposed above are illegal in all countries. instead we can legally circulate country A's secrets via torrent within country B and legally circulate country B's secrets via torrent within country A. only getting the info past the border is illegal, for that again need securedrop or hard disk dead drop.
- have to think more about who publishes the legal torrents
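To make the "standardise format to share extracted plaintext and embeddings" item more concrete, here is a rough sketch of one record per crawled page. Everything in it is an assumption for illustration: the `CrawlRecord` field names are hypothetical, JSON Lines is used only as a stdlib stand-in for the WARC-parquet idea floated earlier, and the embedding is passed in by the caller because no embedding model is specified in this post.

```python
import json
from dataclasses import asdict, dataclass
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def extract_plaintext(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


@dataclass
class CrawlRecord:
    # Hypothetical schema: metadata kept alongside derived data, as WARC does for raw crawls.
    url: str
    fetched_at: str          # ISO 8601 timestamp of the crawl
    plaintext: str
    embedding: list[float]   # produced by whatever model `embedding_model` names
    embedding_model: str


def to_jsonl(record: CrawlRecord) -> str:
    """Serialise one record as a single JSON line."""
    return json.dumps(asdict(record), ensure_ascii=False)


def from_jsonl(line: str) -> CrawlRecord:
    return CrawlRecord(**json.loads(line))
```

Recording the embedding model name next to the vector matters because embeddings from different models are not comparable, so mirrors would need it to build a shared search index.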
IMPORTANT: Need feedback from people who have actually worked with whistleblowers, to validate all hypotheses listed above.
IMPORTANT: Need to decide whether whistleblowing on important orgs (such as companies/labs researching intelligence, and national/intl intelligence agencies) is actually what I want to work on.
- Crux: Does this lead to longterm human flourishing or not?
- Narrowed crux: Would orgs developing intelligence (superintelligent AI, human genetic engg, BCIs etc) necessarily be trustworthy if there existed lots of public information about both their capabilities and their values?
- Leaked info about values means public can decide if the company represents their true values. i.e. control via democracy instead of via market.
- Leaked info about capabilities means public can coordinate under a different leader to shut down existing org and set up a different org.
- This requires complete global surveillance (public information) and inter-govt coordination if the open sourced capabilities include offense-beats-defence weaponry deployable by small groups (such as bioweapons, nanotech weapons, etc).
- Seems difficult to find a solution that can leak info about values of an org without also leaking info about capabilities of the org. Above proposed solutions will obviously leak both.
- More narrowed crux: Lots of the LessWrong crowd defers to Yudkowsky's mental priors against open source bioweapons and complete surveillance (public info). Will have to write separate post on this.