Two real fraud catches in production
On the same day Module 4 (recording fingerprint via ACRCloud) was deployed, DistroShield caught two operationally significant cases against an indie distributor's incoming catalog. Both are anonymized below; the technical details (scores, match counts, latencies, costs) are real and verifiable on the production API.
Case 1: Cross-distributor identity fraud
Module 4 / Recording Fingerprint via ACRCloud
Situation
A client of an indie distributor submitted a 20.5 MB WAV through the standard upload pipeline with metadata claiming a self-authored track. To DistroShield's per-track AI detector (Module 1), the audio looked human-made (ai_score = 0.0069, confidence = 100%), and indeed the audio is human-made, not AI-generated. To the duplicate-check module (Module 2, a string match against Spotify / Deezer / YouTube), the title and artist raised no exact-match red flag. Modules 1, 2, and 3 alone would not have escalated.
What it actually was: the same audio recording was already registered on at least three other distributors, under three completely different artist identities. The submitting client was attempting to be the fourth identity that would receive royalties for the same recording.
What DistroShield caught
The newly-deployed Module 4 hashed the first ~5 seconds of audio and queried ACRCloud's catalog (~70M commercial recordings). Three matches at perfect score:
| Score | Match | ISRC issuer code | Released |
|---|---|---|---|
| 100 | Recording A - Artist X | issuer code Y | 2023 |
| 100 | Recording B - Artist Z | different issuer | 2022 |
| 100 | Recording C - Artist W | different issuer | 2022 |
Three different ISRC issuer codes confirm three different distributors. Three different artist names on three different Spotify / Deezer accounts.
DistroShield's response:
- `recommendation: review`
- `review_reason: cross_distributor_recording_fraud`
- `recording_fingerprint.matched: true`
- `recording_fingerprint.highest_score: 100`
- `distinct_artists_at_perfect_score: 3`
The track was held in the distributor's review queue instead of being delivered to DSPs. The reviewer confirmed via the verdict button that the audio itself is human-made (Module 1 was correct: not AI-generated), and the cross-distributor fraud signal led to the submitting client being identified as the impostor (the uploader, not a victim). The track was blocked from distribution.
Important nuance: the verdict button (human / ai / uncertain / hybrid_confirmed) refers to whether the audio is AI-generated, not to whether the track should ship. A track can be verdict: human (audio is real) AND recommendation: review (held due to recording fraud) simultaneously. They are separate decisions about separate questions.
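The separation between the two decisions can be sketched as follows. This is a minimal illustration, not DistroShield's actual API: the class names, the `decide` helper, the `"pass"` value, and the two-artist threshold are assumptions made for the example. Only the field values in the response above come from the case.

```python
from dataclasses import dataclass
from typing import Optional

PERFECT_SCORE = 100  # score reported for all three matches in this case


@dataclass(frozen=True)
class Match:
    score: int
    artist: str
    isrc_issuer: str


@dataclass
class TrackDecision:
    recommendation: str            # "pass" or "review": should the track ship?
    review_reason: Optional[str]   # why the track was held, if it was
    verdict: Optional[str] = None  # reviewer's answer: human / ai / uncertain / hybrid_confirmed


def decide(matches: list[Match], min_distinct_artists: int = 2) -> TrackDecision:
    """Hold the track when several different artists own a perfect-score match.

    The threshold of 2 distinct artists is an assumption for this sketch.
    """
    artists = {m.artist for m in matches if m.score >= PERFECT_SCORE}
    if len(artists) >= min_distinct_artists:
        return TrackDecision("review", "cross_distributor_recording_fraud")
    return TrackDecision("pass", None)


# Anonymized matches from the table above (issuer labels are placeholders).
matches = [
    Match(100, "Artist X", "issuer Y"),
    Match(100, "Artist Z", "issuer 2"),
    Match(100, "Artist W", "issuer 3"),
]
decision = decide(matches)

# The reviewer's verdict answers a different question and does not
# overwrite the recommendation.
decision.verdict = "human"
```

Keeping `verdict` and `recommendation` as separate fields means a reviewer can confirm that the audio is human-made without releasing a track that is held for recording fraud.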
What would have happened without DistroShield
The 4th-identity attempt would have shipped to Spotify / Apple / Deezer / Amazon / YouTube Music. Every play royalty for that recording would split across 4 rights claimants, at least one of whom is the legitimate owner. Eventually one of the legitimate owners files a DMCA strike against the indie distributor. Possible downstream consequences:
- Direct payout hold on the offending track.
- Spotify / YouTube content audit on the distributor's broader catalog (because the pattern indicates a content-mill upstream relationship the DSP wants to investigate).
- Potential per-incident penalties under Spotify's 2024-2025 anti-fraud policy ($10/track, with multipliers for repeat offenders).
- Reputational damage with rights administration networks (BMI, ASCAP).
- In a worst-case scenario where the platform sees a pattern: distributor delisting (existential).
The conservative downside estimate for a single distributor caught delivering one cross-distributor identity fraud incident is $10K to $50K in payout holds, plus uncapped reputational tail.
Cost to catch it
A single ACRCloud query at the post-trial Pay-as-you-go rate (with bucket-fee and 3rd-party-IDs surcharges this project carries): ~$0.0065.
One pre-upload API call. Cost: less than a cent. Caught a recording attempting to be registered under a 4th distributor identity. Avoided downside: tens of thousands in payout holds plus reputational damage that doesn't wash out.
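To put the two figures side by side, the arithmetic below counts how many fingerprint queries cost as much as one avoided incident. It uses only the per-query cost and the downside range quoted above. The function name is hypothetical, and the comparison ignores the other modules' costs.

```python
COST_PER_QUERY_USD = 0.0065   # per-query figure quoted above
DOWNSIDE_LOW_USD = 10_000     # low end of the estimated payout-hold range
DOWNSIDE_HIGH_USD = 50_000    # high end of the estimated payout-hold range


def queries_per_avoided_incident(downside_usd: float,
                                 cost_per_query: float = COST_PER_QUERY_USD) -> int:
    """Number of whole queries whose total cost stays within one incident's downside."""
    return int(downside_usd // cost_per_query)


low = queries_per_avoided_incident(DOWNSIDE_LOW_USD)
high = queries_per_avoided_incident(DOWNSIDE_HIGH_USD)
print(f"{low:,} to {high:,} queries cost as much as one incident")
```

Under these figures, roughly 1.5 million queries cost as much as the low-end estimate for a single incident.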
Why no competitor catches this
- Spotify / Apple / Deezer detect AFTER upload, not before. They would have caught it eventually, and that catch is what triggers the punishment for the distributor. DistroShield is the only layer that runs before delivery.
- Audible Magic and Pex are enterprise-priced and target rights-holders / DSPs, not distributor ingest pipelines.
- Manual reviewers cannot listen to every track of every release at scale, and even if they could, they wouldn't catch this: the audio is the same, and the metadata is fully fabricated.
Case 2: Album-coherence false-negative recovery
Module 1 + post-processing
Situation
Same day, same distributor. A client submitted a 5-track EP of bachata-style songs, all generated by the same AI music tool, with the same vocal style, same rhythm, same arrangement language across all 5 tracks. The detector model (v7c), which runs at ~93% accuracy in production but has a known ~7% FN rate, caught 4 of the 5 tracks as AI:
| Track | AI Score | Classification | Recommendation |
|---|---|---|---|
| 1 | 0.0224 | human | PASS (false negative) |
| 2 | 0.7427 | ai | flagged |
| 3 | 0.7426 | ai | flagged |
| 4 | 0.7418 | ai | flagged |
| 5 | 0.7415 | ai | flagged |
Track 1 is almost certainly a false negative. Same generator, same artist identity, same EP delivery: the probability that one track of an AI EP is genuinely human is essentially zero.
What the system did
Per-track inference alone would have delivered Track 1 to DSPs while flagging the other 4, fragmenting the release in a way that would itself be a flag to the DSP audit team ("why did 4 of 5 tracks of this EP get pulled?").
The album-coherence post-processing layer cross-checks per-track classifications within a release. The rule: if ≥60% of tracks in a single-artist release are classified ai, force the remaining tracks to pending_review regardless of their per-track score.
In this EP, ratio = 4/5 = 80% > 60%. Track 1 was forced into the review queue with an album_coherence_anomaly flag. The reviewer listened, confirmed it was AI (the vocal section is what gives it away), and emitted a final verdict via the API. All 5 tracks of the EP were held.
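The coherence rule can be sketched as a small post-processing pass over a release. This is an illustration under assumptions: the `Track` class, the `status` values other than `pending_review`, and the function name are hypothetical, and the caller is assumed to have already established that the release belongs to a single artist. The 60% threshold, the `album_coherence_anomaly` flag, and the scores come from the case above.

```python
from dataclasses import dataclass
from typing import Optional

AI_PERCENT_THRESHOLD = 60  # share of "ai" tracks (in percent) that triggers the rule


@dataclass
class Track:
    number: int
    ai_score: float
    classification: str          # "ai" or "human", from the per-track detector
    status: str = "unset"        # filled in by the coherence pass
    flag: Optional[str] = None


def apply_album_coherence(tracks: list[Track]) -> list[Track]:
    """Force non-AI tracks into review when most of the release is classified ai."""
    if not tracks:
        return tracks
    ai_count = sum(t.classification == "ai" for t in tracks)
    # Integer comparison avoids floating-point edge cases at exactly 60%.
    mostly_ai = ai_count * 100 >= AI_PERCENT_THRESHOLD * len(tracks)
    for t in tracks:
        if t.classification == "ai":
            t.status = "flagged"
        elif mostly_ai:
            t.status = "pending_review"
            t.flag = "album_coherence_anomaly"
        else:
            t.status = "pass"
    return tracks


# Per-track results from the EP in the table above.
ep = [
    Track(1, 0.0224, "human"),
    Track(2, 0.7427, "ai"),
    Track(3, 0.7426, "ai"),
    Track(4, 0.7418, "ai"),
    Track(5, 0.7415, "ai"),
]
apply_album_coherence(ep)
for t in ep:
    print(t.number, t.status, t.flag)
```

The pass ignores the per-track score of the overridden track on purpose: the release-level ratio, not the individual score, is what sends Track 1 to review.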
What would have happened without the coherence layer
Track 1 distributes to DSPs as a single. The other 4 stay flagged. From the DSP's perspective, the distributor delivered a fragmented release where 80% was pulled, a strong audit signal. Possible outcomes:
- DSP investigates the entire release and the distributor's recent catalog.
- The 1 distributed track stays live but eventually triggers retroactive AI policy enforcement when content scanning catches up.
- Cumulative trust-score damage with DSPs.
The product narrative
The detector is good, not perfect: every ML detector has a per-track FN rate, and at scale false negatives are a mathematical certainty. The system-level architecture compensates: per-track ML + cross-track coherence = layered defense. When the model's per-track decision is wrong, the redundancy catches it.
This case also produces gold-quality training data for the next model iteration: an explicit FN with context ("model missed this track, but the other 4 of 5 were AI-confirmed"). Training data sourced from real production verdicts is the substrate the detector improves on.
Even with a 93%-accurate detector, some per-track false negatives are inevitable at scale. DistroShield's release-level coherence check converts isolated false negatives back into review queue items. The detector keeps getting smarter; the system stays robust while it learns.
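The "inevitable at scale" claim can be made concrete with a short calculation. The sketch below assumes each AI track is missed independently with the ~7% FN rate quoted above. That independence is a simplifying assumption, since tracks from the same generator are likely to be correlated, and the function name is hypothetical.

```python
FN_RATE = 0.07  # per-track false-negative rate quoted for the v7c detector


def p_at_least_one_miss(n_ai_tracks: int, fn_rate: float = FN_RATE) -> float:
    """Probability that at least one of n AI tracks slips past the detector.

    Assumes misses are independent across tracks (a simplification).
    """
    if n_ai_tracks < 0:
        raise ValueError("n_ai_tracks must be non-negative")
    return 1.0 - (1.0 - fn_rate) ** n_ai_tracks


for n in (1, 5, 50, 500):
    print(f"{n:>4} AI tracks -> P(at least one miss) = {p_at_least_one_miss(n):.4f}")
```

For a 5-track AI EP like the one in this case, the probability of at least one miss is about 30% under this assumption, which is why a release-level check adds value on top of per-track inference.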
Why no one else catches this
| Layer | Position in pipeline | Catches this? |
|---|---|---|
| Spotify / Apple / Deezer audit | Post-upload | Yes, but after punishing the distributor |
| Audible Magic / Pex | Rights-holder side | Enterprise-priced, not distributor ingest |
| Manual review | Distributor side | Doesn't scale, can't detect cross-distributor identity reuse |
| DistroShield | Pre-upload, distributor-side | ~$0.40/track, four modules |
Postscript: what we learned in the next 24 hours
On the day after Module 4 deployed, a different track from the same distributor was flagged with a single match at score 100, which looked like another fraud signal. On inspection, the match window was only 2.88 seconds out of an audio sample of ~5 seconds. The fingerprint service was matching a generic drum-machine intro shared by two unrelated indie tracks. A re-query 90 minutes later returned a different match at score 100 (different track, same short window), confirming that the algorithm was reporting hash collisions on short generic intros, not real fraud.
Same-day fix: increase the audio sample sent to the fingerprint service from 1 MB (~5-10s) to 5 MB (~30-60s), and add a match_window_ms ≥ 6000 gate for single-match escalation. Multi-artist coincidence (the cross-distributor case above) is an independently strong signal and is NOT gated by window length, so the real fraud catch remains intact while generic-intro false positives get suppressed cleanly.
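A minimal sketch of the gating logic described above. The `FingerprintMatch` class, the `should_escalate` name, and the two-artist threshold for "multi-artist" are assumptions for illustration. The 6000 ms gate and the 2.88 s false-positive window come from the text. The window lengths used for the multi-artist example are made up, since the original case does not state them.

```python
from dataclasses import dataclass

PERFECT_SCORE = 100
MIN_SINGLE_MATCH_WINDOW_MS = 6000  # gate quoted above for single-match escalation


@dataclass(frozen=True)
class FingerprintMatch:
    score: int
    artist: str
    match_window_ms: int


def should_escalate(matches: list[FingerprintMatch]) -> bool:
    """Escalate on multi-artist perfect matches, or on one long-enough perfect match."""
    perfect = [m for m in matches if m.score >= PERFECT_SCORE]
    distinct_artists = {m.artist for m in perfect}
    if len(distinct_artists) >= 2:
        # Multi-artist coincidence is a strong signal on its own: no window gate.
        return True
    # Single-artist case: require a match window of at least 6 seconds.
    return any(m.match_window_ms >= MIN_SINGLE_MATCH_WINDOW_MS for m in perfect)


# The production false positive: one perfect match over a 2.88 s generic intro.
generic_intro = [FingerprintMatch(100, "Unrelated Artist", 2880)]

# The original fraud catch: three artists at a perfect score.
# Window lengths here are hypothetical placeholders.
cross_distributor = [
    FingerprintMatch(100, "Artist X", 3000),
    FingerprintMatch(100, "Artist Z", 3000),
    FingerprintMatch(100, "Artist W", 3000),
]

print(should_escalate(generic_intro))      # suppressed by the window gate
print(should_escalate(cross_distributor))  # still escalated
```

Ordering the multi-artist check before the window gate is what keeps the original catch intact even if its match windows are short.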
A first version that catches fraud will also produce false positives. The product matures by exposing both the catches and the misses honestly, then engineering the system to discriminate. We shipped the fix the same day, against a real production false positive, and validated it against the original real catch โ both still behave correctly.
A product that has only success stories at 24h is suspicious. A product that catches a real cross-distributor fraud, surfaces a false positive, and ships the discrimination fix in the same day is mature.
What we DON'T claim from these
- We don't claim 100% fraud detection. The album-coherence case literally exists because the per-track model misses things.
- We don't claim instant ROI quantification. The downside numbers above are conservative estimates; the actual cost to a distributor depends on DSP, region, catalog size, and timing.
- We don't claim DSPs can't catch these. They can, eventually. The value is catching them before the DSP has to.
Honesty in the pitch is itself a wedge against sales-led pricing and unverifiable claims.