Keep in mind that Apple's claimed false positive rate (a one-in-a-trillion chance of an account being flagged innocently) and the collision rate determined by Dwyer in the article are both derived without any adversarial assumptions. Given that the NeuralHash collider and similar tools already exist, the real-world false positive rate should be expected to be far, far higher.
Imagine that you play a dice game against an online casino. The casino throws a virtual six-sided die, secretly generated using Microsoft Excel's random number generator. Your job is to predict the result. If you manage to predict the result 100 times in a row, you win and the casino pays you $1,000,000,000,000 (one trillion dollars). If you ever fail to predict the result of a throw, the game is over, you lose, and you pay the casino $1 (one dollar).
In an ordinary, non-adversarial context, the probability that you win the game is much less than one in one trillion, so this game is very safe for the casino. But this number is very misleading: it's based on naive assumptions that are completely meaningless in an adversarial context. If your adversary has a decent knowledge of mathematics at the high school level, the serial correlation in Excel's generator comes into play, and the relevant probability is no longer one in one trillion. The relevant number is 1/216 instead! When faced with a class of adversarial math majors, a casino that offers this game will promptly go bankrupt. With Apple's CSAM detection, you get to be that casino.
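(For concreteness, a quick back-of-the-envelope in Python; only the naive side is computed here, since the adversarial 1/216 figure depends on the specifics of the generator's flaw.)

```python
# Naive, non-adversarial odds of calling a fair six-sided die 100 times in a row:
p_naive = (1 / 6) ** 100            # ~1.5e-78
one_in_a_trillion = 1e-12
print(p_naive < one_in_a_trillion)  # True: the "safe"-looking number the casino quotes
```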
Why would anyone bother with such an attack? The end result is that some peon at Apple has to look at the images and mark them as not CSAM. You've cost someone a bit of privacy, but that's it.
It's entirely possible to alter an image such that its raw form looks different from its scaled form [0]. A government, or just a well-resourced group, can take a genuine CSAM image and modify it such that, when scaled for use in the perceptual algorithm(s), it changes into some politically sensitive image. Upon review it'll look like CSAM, so off it goes to the reporting agencies.
Because the perceptual hash algorithms are presented as black boxes, the image they actually perceive isn't audited or reviewed. There's zero recognition of this weakness by Apple or NCMEC (and their equivalents). For the system to even begin to be trustworthy, all content would need to be reviewed both raw and as-scaled-and-fed-into-the-algorithm.
This attack does seem easily defeated, even naively, by downscaling via three different methods (e.g. bicubic, nearest neighbor, Lanczos) and rejecting any downscale that differs markedly from the other two, since the attack is tailored to a specific downscaling algorithm -- it seems to be effective only against systems that make no effort at all to safeguard against it.
Granted, Apple makes no mention of any safeguard, but it would be trivial in principle to protect against, and is not an unavoidable failing.
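A minimal sketch of that safeguard, assuming Pillow and a hypothetical 360x360 review size and tolerance (neither of which Apple has published):

```python
from PIL import Image
import numpy as np

def scaling_attack_suspected(path, size=(360, 360), tol=25.0):
    """Downscale with three different filters and flag the image if any pair
    of results diverges strongly: an attack tuned to one filter tends to
    produce wildly different downscales under the others."""
    img = Image.open(path).convert("L")
    variants = [np.asarray(img.resize(size, f), dtype=float)
                for f in (Image.NEAREST, Image.BILINEAR, Image.LANCZOS)]
    diffs = [np.abs(a - b).mean()
             for i, a in enumerate(variants) for b in variants[i + 1:]]
    return max(diffs) > tol  # the tolerance is an arbitrary illustrative threshold
```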
The objective of being mindful of the thumbnail is to fool the human reviewer responsible for alerting the police to your target's need for a good swatting - the algorithm has already flagged the image by the time it is presented as a thumbnail during review.
You'd basically start off with an image known (or very likely) to be cataloged in a CP hash database.
Note its NeuralHash.
Find a non-CP image that would, after being scaled down or otherwise sanitized, fool an unaccountable and likely uninterested Apple employee into muttering "close enough" while selecting whichever option box it is that causes life ruination.
Feed that image into an adversarial optimizer until it spits out the desired NeuralHash (a rough sketch of this step appears below the list).
Distribute that image to everyone who has ever disagreed with you on the internet, prayed to the wrong god, competed with you in business, voted the wrong way, etc.
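(That adversarial step is ordinary gradient descent against the hash model. Below is a heavily hedged toy sketch in PyTorch, using a random-weights stand-in network rather than the real NeuralHash; the public collider works along broadly similar lines, but against the extracted model.)

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Stand-in "hash" network with random weights, NOT NeuralHash:
# 96 logits whose signs act as hash bits.
toy_hash_net = nn.Sequential(nn.Conv2d(3, 8, 5, stride=4), nn.ReLU(),
                             nn.Conv2d(8, 16, 5, stride=4), nn.ReLU(),
                             nn.Flatten(), nn.LazyLinear(96))

def hash_bits(x):
    return (toy_hash_net(x) > 0).float()

target_bits = hash_bits(torch.rand(1, 3, 360, 360))      # hash of some "target" image
decoy = torch.rand(1, 3, 360, 360, requires_grad=True)   # innocuous image we perturb
opt = torch.optim.Adam([decoy], lr=0.01)

for _ in range(500):
    opt.zero_grad()
    logits = toy_hash_net(decoy.clamp(0, 1))
    # Hinge-style loss pushing each logit toward the sign the target hash wants.
    loss = torch.relu(0.5 - (2 * target_bits - 1) * logits).mean()
    loss.backward()
    opt.step()

matched = int((hash_bits(decoy.clamp(0, 1)) == target_bits).sum())
print(f"{matched}/96 hash bits match")  # typically most or all bits after enough steps
```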
my aim was to point out that the above-referenced "image scaling attack" is easily protected against, because it is fragile to alternate scaling methods -- it breaks if you don't use the scaling algorithm the attacker planned for, and there exist secure scaling algorithms that are immune [0]. Since defeating the image scaling attack is trivial, it means that, if it is addressed, the thumbnail will always resemble the full image.
With that out of the way: obviously, this just forecloses one particular attack, specifically the one where you want the thumbnail to appear dramatically different from the full image in order to fool the user that it's an innocent image and the reviewer that it's an illegal image. It's still, nevertheless, possible to have a confusing thumbnail -- perhaps an adult porn image engineered to have a CSAM hash collision will be enough to convince a beleaguered or overeager reviewer to pull the trigger. The "image scaling attack" is neither sufficient nor necessary.
(However, that confusing image would almost certainly not also fool Apple's unspecified secondary server-side hashing algorithm, as referenced on page 13 of Apple's Security Threat Model Review, so would never be shown to a human reviewer: "as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash" [1])
> However, that confusing image would almost certainly not also fool Apple's unspecified secondary server-side hashing algorithm, as referenced on page 13 of Apple's Security Threat Model Review...
Uh, on what timescale? If you mean "tomorrow" then sure, if you mean "for years" - then no. They're relying on the second perceptual hashing algorithm to remain a secret, which is insanely foolish. Just based on what I know about these CP hashlists and the laziness of programmers, I feel pretty confident that it is either an algorithm trained on the thumbnails themselves (which would be laughably bad) or it was a prior attempt that got replaced by what is now deployed on the users' hardware. Why would I think that? Because it would have been the only other thing on hand for the necessary step of generating the hash black list. So they're stuck with at least one of those forever - and will have a very limited range of potential responses to the massive infosec spotlight picking them apart... unless they want to recatalog every bit of CP all over again.
Yeah, I don’t have that answer, of course. But nothing prevents them from changing that secondary algorithm yearly, or at whatever rate the CSAM database owners would tolerate full rehashing, or chaining together multiple hashes. They can literally tune it to whatever arbitrary false positive rate they want. Although, not knowing any better, I would guess that they would just use Microsoft’s PhotoDNA hash unchanged, and just keep it under wraps, since I think that’s what they already use for iCloud email attachment scanning. PhotoDNA just does a scaled down, black and white edge/intensity gradient comparison, and not a neural net feature detection. I would think using a completely different technology would make the pair of algorithms extremely robust taken together, but that’s not my field.
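To make the "scaled-down, grayscale intensity-gradient" idea concrete, here's a toy dHash-style sketch in Python (emphatically not PhotoDNA, whose actual algorithm has never been published):

```python
from PIL import Image

def toy_gradient_hash(path, hash_size=8):
    """Grayscale, shrink, then compare each pixel to its right-hand neighbour.
    Each comparison yields one bit, so an 8x8 grid gives a 64-bit hash."""
    img = Image.open(path).convert("L").resize((hash_size + 1, hash_size), Image.LANCZOS)
    px = list(img.getdata())
    bits = 0
    for row in range(hash_size):
        for col in range(hash_size):
            left = px[row * (hash_size + 1) + col]
            right = px[row * (hash_size + 1) + col + 1]
            bits = (bits << 1) | (1 if left > right else 0)
    return bits

def hamming(a, b):
    """Number of differing bits; small distances mean 'perceptually similar'."""
    return bin(a ^ b).count("1")
```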
While there may not be an immovable obstacle standing between them and a complete recataloging, there are a lot of factors that would strongly disincentivise it. Chief among them is the fact that the project is already a radioactive cost center - and unless they plan on switching industries and giving Blue Coat a run for its money, it always will be.
> ...chaining together multiple hashes.
That would be the lazy programmer way to do it, and it would very likely result in a situation where correlations start popping up - that is why DBAs weren't advised to do some wacky md5/sha1 mashup to avoid requiring every user to rekey in the wake of a digest bump-up.
> ...I would guess that they would just use Microsoft’s PhotoDNA hash unchanged...
That is a reasonable guess, because that is what all the NGOs have been using - IWF being one of the more notorious. That would be bad news, though, for anyone expecting the thumbnail perceptual hashing step to provide meaningful protection.
> I would think using a completely different technology would make the pair of algorithms extremely robust...
Nope - which is why you don't see hybrid cryptographic algorithms. Also, if they are using PhotoDNA for their verification step then they actually implemented the thing totally backwards... because the high-pass filter approach makes it resistant to the hash length extension attacks that are imperceptible to humans. That counts for nothing by the time the first algorithm has been fooled by an extension attack (and this neural thing is definitely vulnerable to it), because the attacker would already be selecting for a thumbnail image that would fool a human in the second step - and PhotoDNA would be looking for the exact same thing that a human would: points of contrast.
BTW, PhotoDNA is a black box with no outside scrutiny to speak of - you can count on one hand the number of papers where it is even mentioned (and only ever in passing).
> The objective of being mindful of the thumbnail is to fool the human reviewer responsible for alerting the police to your target's need for a good swatting - the algorithm has already flagged the image by the time it is presented as a thumbnail during review.
Yeah, mentioned something like that here [0]:
>> And then one can compromise and infect millions of such backdoored devices and start feeding (much cheaper than the government enforcement implementation) spoofed data into these systems at scale on these backdoored devices that act like "swatting as a service" and completely nullify any meaning they could get from doing this.
So have you never heard of catfishing? Because if you have, then you know it wouldn't be hard to do just what you described - and you're pretending otherwise for some reason.
The attack relies on the fact that when downscaling by a large factor, the tested downscalers (except Pillow in non-nearest-neighbor modes, and all of them in area-averaging mode) ignore most of the pixels of the original image and compute the result based on the select few which are the same in all modes, making the result look nearly the same regardless of the mode.
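A tiny numpy illustration of the point (naive strided nearest-neighbour subsampling, not any particular library's resampler):

```python
import numpy as np

rng = np.random.default_rng(0)
big = rng.integers(0, 256, (1024, 1024), dtype=np.uint8)

small = big[::32, ::32]        # naive 32x downscale: keeps 1,024 of 1,048,576 pixels

tampered = big.copy()
off_grid = np.ones_like(big, dtype=bool)
off_grid[::32, ::32] = False
tampered[off_grid] = 0         # wreck every pixel NOT on the sampling grid

assert np.array_equal(tampered[::32, ::32], small)  # downscaled result is unchanged
```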
Thanks for that reference to Pillow. I presume it's from "Understanding and Preventing Image-Scaling Attacks in Machine Learning" [0] which mentions secure scaling algorithms immune to the attack. I wish I could mention this in the grand parent, but the editing window closed.
Just a correction for you: there's not a list of approved images. The CSAM database is a list of illegal (unapproved, if you will) images.
Other than that, yes it's possible to add noise to an image so a perceptual algorithm misidentifies it. I described the false positive case but it can also be used for false negatives. Someone can apply noise to a legit CSAM image (in the NCMEC database) so Apple's system fails to identify it.
The false positive case is scary because if it happens to you your life is ruined. The false negative case just means people have CSAM and don't get found by the system. I'm much more concerned about the false positive case.
Keep in mind that there are multiple straight paths from the false negative case to the false positive case. I'll give you one example: pedos can and will use the collider to produce large batches of CSAM that collide with perfectly legitimate images (e.g. common iPhone wallpapers). They literally have nothing to lose by doing this.
Eventually, these photos will make their way into the NCMEC database, and produce a large number of false positives. This will also make the other attacks discussed here easier to execute (e.g. by lowering the human review threshold, since everybody will start with a few strikes).
> The end result is that some peon at Apple has to look at the images and mark them as not CSAM.
As others said, if the non-CSAM looks sexual at all, it will probably get flagged for post-Apple review.
Beyond that, it doesn't seem to be in Apple's interest to be conservative in flagging. An employee reviewer's best interest is to minimize false negatives, not false positives.
As many mentioned, even an investigation can have horrible effects on some (innocent) person's life. I would not be shocked to learn of some crafty individual working at a "meme factory" creating intentional collisions with widely distributed images just for "fun" - and politically motivated attacks seem plausible (e.g. making liberal political memes flag as CSAM).
Then there are targeted motives for an attack. Have a journalist you want to attack, or want a reason for a warrant? Find them on a dating app and send them nudes with CSAM collisions. Or any number of other targeted attacks against them.
> Beyond that, it doesn't seem to be in Apple's interest to be conservative in flagging. An employee reviewer's best interest is to minimize false negatives, not false positives.
I would have thought the opposite. If there is a false positive that leads to an arrest and ruins someone's life, but the public sees that it is a false positive, then Apple will take an enormous hit in the marketplace. Nobody will want to take on the demonstrated real risk of being falsely accused of possessing CSAM.
If they have a false negative, it is unclear to me what negative effects Apple would suffer. As far as I know, nobody would know about it outside of Apple.
Another commenter used the right phrase, "terrible asymmetry". There won't be any such thing in the public eye as a "false positive", only kiddie porn traders who got away with it.
> Nobody will want to take on the demonstrated real risk of being falsely accused of possessing CSAM.
I don't think this is much of a risk. Defamation is difficult to prove, never applies to the law enforcement agencies who are going to make these arrests and "ruin people's lives", and is never going to be pinned on Apple anyway (Apple is only referring things for investigation). I don't think you'll see much public indignation about this -- even arguing against it in the real public eye (outside of wonky venues like HN) sounds tantamount to "supporting kiddie porn" in the naïve view of most members of the public.
> If they have a false negative, it is unclear to me what negative effects Apple would suffer. As far as I know, nobody would know about it outside of Apple.
This could potentially arise in any case with an abuser or producer or enabler or pimp or Maxwell/Epstein customer who has an iPhone, which is a lot of cases. As soon as Apple devices are supposed to "detect" kiddie porn, people will ask why people like this weren't caught earlier; and since Apple has money, they won't just ask this in the court of public opinion, they will sue for damages for abuses that "should have" been prevented by Apple's inspection of their pictures. Even if that's unlikely, it's much easier for a peon whose job it is to look at kiddie porn to just forward it on; and such a case really could damage Apple's optics.
At least I'm given to understand that they already scan all the photos uploaded to iCloud anyway (in the same way many other similar providers do). Whether it happens on the device or the server doesn't seem to make any difference to this attack.
(That's not to say that (a) the scanning of stuff on a server was a good idea in the first place or (b) encouraging politicians to use your own device to spy on you is a good idea or (c) this isn't the thin end of a very painful wedge, just that we've not opened a new vulnerability)
Because people behind the keyboards make mistakes all the time. Just in the last month I experienced:
* A call center agent at a haulage firm, instead of entering the delivery date we talked about on the phone, clicked for the delivery to be returned to the factory.
* Google automatically blocked an ad account from delivering ads because we allegedly profiteered from Covid (untrue of course, but we surely talked about the challenges caused by the pandemic somewhere on the site, so the "AI" apparently got triggered by some keywords), and humans repeatedly confirmed the AI decision.
* Facebook blocked an ad account that was unused in 2020, wanted ID, got the correct ID (identical name etc.), and the human denied confirmation.
Google and Facebook are of course known to be beyond kafkaesque, so this is no surprise. But imagine the costs the innocents pay once they accidentally get entered into the FBI CP suspect database.
Why is this question being downvoted? I too would like to know what this attack achieves.
From what I see, the end result of false flagging is either someone has CSAM in iCloud and you push them over the threshold that results in reporting and prosecution, or there is no CSAM, so the reviewer sees all of the hash collision images, including those that are natural.
Is the problem that an attacker can force natural hash collision images to be viewed by a reviewer, violating that person's privacy? Do we know whether this process is different from how Google, Facebook, Snapchat, Dropbox, Microsoft, and others have implemented these necessarily fuzzy matches for their CSAM scans of cloud-hosted content?
Or am I missing something that the downvoters saw?
You are one underpaid random guy in India, looking at CSAM all day, clicking the wrong button away from a raid of your home and the end of your life as you know it.
You're assuming the police are nice and quickly announce when they haven't found anything.
The more likely outcome is that it takes several months until the case is dropped silently.
Could you explain this? The process, according to what's publicly known, is that the images will go to NCMEC for further review, then NCMEC will report it to the authorities, if it's actually CSAM. The low paid (a big assumption here) Apple reviewer is only the final step for Apple, not prosecution.
This. These charges are damning once they are made. Plus there are the countless legal dollars you will have to front and the hours spent proving innocence, and that's assuming the justice system actually works. Try explaining this to your employer while you start missing deadlines due to court dates. The police could also easily leverage this to warrant-hop, as they have been found doing in the past. The bike rider who had nothing to do with a crime and got accused because he was the only one inside a broad geofence warrant is all the precedent you need that this will be abused.
The idea I've heard is that images could be generated that are sexual in nature but that have been altered to match a CSAM hash, making a tricky situation.
That's an interesting point! From my understanding, Apple's hash is not the final qualifier. NCMEC also reviews the images before reporting to the authorities. But I can imagine a scenario that might come down to a judgment call.
> "The end result is that some peon at Apple has to look at the images and mark them as not CSAM. You've cost someone a bit of privacy, but that's it."
This can be abused to spam Apple's manual review process, grinding it down to a halt. You've cost Apple time and money by making them review each such fake report.
> You've cost Apple time and money by making them review each such fake report.
Ok, but… how do I profit? If I wanted to waste Apple employee time, I could surely find a way to do it, but why would I? The functioning of society relies on the fact that people generally have better things to do than waste each other's time.
Or you could identify the factors that drive the hash computation and then start generating random images that compute to the same hash, creating tens of thousands of images of digital noise that all look alike to the computer.
It can’t be. There’s a different private hash function that also has to match that particular csam image’s hash value before a human sees it. An adversarial attack can’t produce that one since the expected value isn’t known.
This second "secret" hash function, because it is applied to raw offensive content that Apple can't have, has to be shared at least with people maintaining the CSAM database.
You can't rely on it never leaking, and when it does leak, it will be almost undetectable and have huge consequences.
As soon as the first on-device CSAM flag has been raised, it becomes a legal and political problem. Even without a second matching hash, it already put Apple in an untenable position. They already are in a mud fight with the pigs.
They can't say: we got 100M hits this month on our first CSAM filter, but we only reported 10 cases, because to avoid false positives our second filter threw everything to /dev/null, and we didn't even manually review them because your privacy matters to us. It has become a political problem where, for good measure, they will have to report cases just to make the numbers look "good".
Attackers of the system can also plant false negatives, i.e. real CSAM that has been modified enough to pass the first hash but fail this second hash. So that, in an audit, independent security researchers who review Apple's system will be able to say that Apple's automated system sided with the bad guys by rejecting true CSAM and not reporting it.
Also remember that Apple can do something other than what they say they do for PR reasons: maybe some secret law will force them to alert the authorities as soon as the first flag has been raised, and force them not to tell anyone about it. And because it's in the name of fighting the "bad guys", that's something most people expect them to do.
From the user perspective, there is nothing we can audit. It's all security by obscurity disguised with pseudo-crypto PR, just a big "trust us" blank signed paper that will soon be used to dragnet-surveil anyone for any content.
What if I can generate an attack that will mark your own picture of your own toddler nude in a bathtub as CSAM? Do you still feel confident in "some peon at Apple" to mark it as not CSAM?
Okay, let's play peon. Here are three perfectly legal and work-safe thumbnails of a famous singer: https://imgur.com/a/j40fMex. The singer is underage in precisely one of the three photos. Can you decide which one?
If your account has a large number of safety vouchers that trigger a CSAM match, then Apple will gather enough fragments to reassemble a secret key X (unique to your device) which they can use to decrypt the "visual derivatives" (very low resolution thumbnails) stored in all your matched safety vouchers.
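(For readers unfamiliar with that step: Apple describes it as a threshold secret-sharing scheme. Below is a generic Shamir-style toy in Python, which is not Apple's actual construction, with made-up numbers of 30-of-40 matched vouchers.)

```python
import random

PRIME = 2**127 - 1  # field for the toy example

def make_shares(secret, threshold, count):
    """Split `secret` so that any `threshold` of the `count` shares reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(threshold - 1)]
    return [(x, sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME)
            for x in range(1, count + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x=0 recovers the secret."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

device_key = 123456789                                    # stands in for the key X
shares = make_shares(device_key, threshold=30, count=40)  # one share per matching voucher
assert reconstruct(random.sample(shares, 30)) == device_key  # 30 vouchers: key recovered
# With 29 or fewer shares, reconstruction yields garbage (with overwhelming probability).
```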
An Apple employee looks at the thumbnails derived from your photos. The only judgment call this employee gets to make is whether it can be ruled out (based on the way the thumbnail looks) that your uploaded photo is CSAM-related. As long as the thumbnail contains a person, or something that looks like the depiction of a person (especially in a vaguely violent or vaguely sexual context, e.g. with nude skin or skin with injuries) they will not be able to rule out this possibility based on the thumbnail alone. And they will not have access to anything else.
Given the ability to produce hash collisions, an adversary can easily generate photos that fail this visual inspection as well. This can be accomplished straightforwardly by using perfectly legal violent or sexual material to produce the collision (e.g. most people would not suspect foul play if they got a photo of genitals from their Tinder date). But much more sophisticated attacks [2] are also possible: since the computation of the visual derivative happens on the client, an adversary will be able to reverse engineer the precise algorithm.
While 30 matching hashes are probably not sufficient to convict somebody, they're more than sufficient to make somebody a suspect. Reasonable suspicion is enough to get a warrant, which means search and seizure, computer equipment hauled away and subjected to forensic analysis, etc. If a victim works with children, they'll be fired for sure. And if they do charge somebody, it will be in Apple's very best interest not to assist the victim in any way: that would require admitting to faults in a high profile algorithm whose mere existence was responsible for significant negative publicity. In an absurdly unlucky case, the jury may even interpret "1 in 1 trillion chance of false positive" as "way beyond reasonable doubt".
Chances are the FBI won't have the time to go after every report. But an attack may have consequences even if it never gets to the "warrant/charge/conviction" stage. E.g. if a victim ever gets a job where they need to obtain a security clearance, the Background Investigation Process will reveal their "digital footprint", almost certainly including the fact that the FBI got a CyberTipline Report about them. That will prevent them from being granted interim determination, and will probably lead to them being denied a security clearance.
(See also my FAQ from the last thread [1], and an explanation of the algorithm [3])
Fair enough. I suppose it's true that you could create a colliding sexually explicit image where age is indeterminate, and the reviewer may not realize it isn't a match.
> Given the ability to produce hash collisions, an adversary can easily generate photos that fail this visual inspection as well.
Apple could easily fix this by also showing a low-res version of the CSAM image that was collided with, but I'll grant that they may not be able to do that legally (and reviewers probably don't want to look at actual CSAM).
The problem is that it is a scaled low-res version. There are well publicized attacks[1] showing you can completely change the contents of the image post scaling. There's also the added problem that if the scaled down image is small, even without the attack, it's impossible to make a reasonable human judgement call (as OP points out).
The problem isn't CSAM scanning in principle. The problem is that the shift to the client & the various privacy-preserving steps Apple is attempting actually make the actions taken in response to a match different in a concerning way. One big problem isn't the cases where the authorities should investigate*, but that a malicious actor can act surreptitiously and leave behind almost no footprint of the attack. Given that SWATting is a real thing, imagine how it plays out when the accusation is child pornography. From the authorities' perspective SWATting is low incidence & not that big a deal. Very different perspective on the victim side, though.
* One could argue about the civil liberties aspect & the fact that having CSAM images is not the same as actually abusing children. However, among the general population that line of reasoning just gets you dismissed as supporting child abuse & is only starting to become acknowledged in the psychiatry community.
You're adding quite a lot of technobabble gloss to an "attack vector" that boils down to "people can send you images that are visually indistinguishable from known CSAM".
Guess what, they can already do this but worse by just sending you actual illegal images of 17.9 year olds.
While it would be bad to be subjected to such an attack, and there is a small chance it would lead to some kind of interaction with law enforcement, the outcomes you present are just scaremongering and not reasonable.
I suggest you reread the comment, because "people can send you images that are visually indistinguishable from known CSAM" is not what is being said at all. Where did you even get that from?
The point is precisely that people can become victims of various new attacks, without ever touching photos that are actual "known CSAM". For Christ's sake, half the comments here are about how adversaries can create and spread political memes that trigger automated CSAM filters on people's phones just to "pwn the libz".
> Guess what, they can already do this but worse by just sending you actual illegal images of 17.9 year olds.
No, this misses the point completely. You cannot easily trigger any automated systems merely by taking photos of 17.9 year olds and sending them to people. E.g. your own photos are not in the NCMEC databases, and you'd have to reveal your own illegal activities to get them in there. You (or malicious political organizations) especially cannot attack and expose "wrongthinking" groups of people by sending them photos of 17.9 year olds.
> No, this misses the point completely. You cannot easily trigger any automated systems merely by taking photos of 17.9 year olds and sending them to people.
An attacker can embed a matching image inside of a PowerPoint zip file, and email it to any corporate employee using O365.
Or, an angry parent can call the police and let them know that a 16 year old possesses nude pictures of their 15 year old girlfriend.
The over-the-top response to this controversy is really disappointing.
Sure, your proposed attack - which requires the victim to have a 15 year old girlfriend, to break an (admittedly silly) law by having nude photos on their phone, for you to call the cops, and for them to take such a call seriously - is clearly comparable to a vector that can be used to target innocents, groups of individuals, etc. who did not break the law in any way, that does not require the attacker to handle prohibited material at all, and that requires Apple to keep a ton of information completely obscure to even provide a weak semblance of security (it was shown to be completely broken, except possibly for one unknown hash, within two weeks). Clearly comparable. Sure. Clearly.
For one last time, the NeuralHash collisions make this tool perfectly unusable for catching pedos: all of the next generation of CSAM content will collide with hashes of popular, innocent images. Two weeks after it was deployed, Apple's CSAM scanning is now _only_ an attack vector and a privacy risk. It's completely useless for its nominal function. This would be a massive, hilarious own goal from Apple even if the public reaction was over the top (although it isn't). They just reduced the privacy and security of nearly all their customers, further exposed themselves to the whims of governments, and for no gain whatsoever.
Can you explain how these theoretical political memes hash-match to an image in the NCMEC database, and then also pass the visual check?
> "No, this misses the point completely. You cannot easily trigger any automated systems merely by taking photos of 17.9 year olds and sending them to people."
Did I say "taking"? I am talking about sending (theoretical) actual images from the NCMEC database. This is functionally identical to the "attack" you describe.
Yes, I can. This is just one possible strategy: there are many others, where different things are done, and where things are done in a different order.
You use the collider [1] and one of the many scaling attacks ([2] [3] [4], just the ones linked in this thread) to create an image that matches the hash of a reasonably fresh CSAM image currently circulating on the Internet, and resizes to some legal sexual or violent image. Note that knowing such a hash and having such an image are both perfectly legal. Moreover, since the resizing (the creation of the visual derivative) is done on the client, you can tailor your scaling attack to the specific resampling algorithm.
Eventually, someone will make a CyberTipline report about the actual CSAM image whose hash you used, and the image (being a genuine CSAM image) will make its way into the NCMEC hash database. You will even be able to tell precisely when this happens, since you have the client-side half of the PSI database, and you can execute the NeuralHash algorithm.
You can start circulating the meme before or after this step. Repeat until you have circulated enough photos to make sure that many people in the targeted group have exceeded the threshold.
Note that the memes will trigger automated CSAM matches, and pass the Apple employee's visual inspection: due to the safety voucher system, Apple will not inspect the full-size images at all, and they will have no way of telling that the NeuralHash is a false positive.
Okay, perhaps the three thumbnails was unclear. I didn't mean to illustrate any specific attack with it, just to convey the feeling of why it's difficult to tell apart legal and potentially illegal content based on thumbnails (i.e. why a reviewer would have to click "possible CSAM" even if the thumbnail looks like "vanilla" sexual or violent content that probably depicts adults). I'd splice in a sentence to clarify this, but I can't edit that particular comment anymore.
Ok yeah, I do agree this scaling attack potentially makes this feasible, if it essentially allows you to present a completely different image to the reviewer as to the user. Has anyone done this yet? i.e. an image that NeuralHashes to a target hash, and also scale-attacks to a target image, but looks completely different.
(Perhaps I misunderstood your original post, but this seems to be a completely different scenario to the one you originally described with reference to the three thumbnails)
This attack doesn’t work. If the resized image doesn’t match the CSAM image your NeuralHash mimicked, then when Apple runs its private perceptual hash, the hash value won’t match the expected value and it will be ignored without any human looking at it.
We have no reason to believe that Apple's second, secret perceptual hash provides any meaningful protection against such attacks. At best, we can hope that it'll allow early detection of attacks in a few cases, but chances are that's the best it can do. We might not ever learn: Apple now has a very strong incentive not to admit to any evidence of abuse or to any faults in their algorithm.
(Sorry, this is going to be long. I know you understand most/all of this stuff; it's mostly there to provide a bit of context for the other users reading our exchange.)
The term "hash function" is a bit of a misnomer here. When people hear "hash", they tend to think about cryptographic hash functions, such as SHA256 or BLAKE3. When two messages have the same hash value, we say that they collide. Fortunately, cryptographic hash functions have several good properties associated with them: for example, there is no known way to generate a message that yields a given predetermined hash value, no known way to find two different messages with the same hash value, and no known way to make a small change to a message without changing the corresponding hash value. These properties make cryptographic hash functions secure, trustworthy and collision-resistant even in the face of powerful adversaries. Generally, when you decide to use two unrelated cryptographic hash algorithms instead of one, executing a preimage attack against both hashes becomes much more difficult for the adversary.
However, as you know, the hash functions that Apple uses for identifying CSAM images are not "cryptographic hash functions" at all. They are "perceptual hash functions". The purpose of a perceptual hash is the exact opposite of a cryptographic hash: two images that humans see/hear/perceive (hence the term perceptual) to be the same or similar should have the same perceptual hash. There is no known perceptual hash function that remains secure and trustworthy in any sense in the face of (even unsophisticated) adversaries. In particular, preimage attacks against perceptual hashes are very easy, compared to the same attacks against cryptographic hashes.
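A small illustration of the difference (SHA-256 as the cryptographic hash; the "perceptual" one is a toy block-average hash, standing in for the general idea rather than for NeuralHash or PhotoDNA):

```python
import hashlib
import numpy as np

# Cryptographic: one changed byte gives a completely unrelated digest (avalanche).
print(hashlib.sha256(b"the quick brown fox").hexdigest())
print(hashlib.sha256(b"the quick brown fox ").hexdigest())

# Toy perceptual hash: 8x8 grid of block means, thresholded at the global mean.
def toy_phash(img):
    cells = img.reshape(8, 8, 8, 8).mean(axis=(1, 3))
    return (cells > cells.mean()).astype(int)

rng = np.random.default_rng(1)
img = rng.integers(0, 256, (64, 64)).astype(float)
noisy = img + rng.integers(-3, 4, (64, 64))   # slight perturbation

print(int((toy_phash(img) != toy_phash(noisy)).sum()), "of 64 bits changed")  # typically 0 or 1
```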
Using two unrelated cryptographic hashes meaningfully increases resistance to collision and preimage attacks. Using ROT13 twice does not increase security in any meaningful sense. Using two perceptual hashes, while not as bad, is still much closer to the "using ROT13 twice for added security" than to the "using multiple cryptographic hashes" end.
Finding a SHA1 collision took 22 years, and there are still no effective preimage attacks against it. Creating the NeuralHash collider took a single week. More importantly, even if you were to use two unrelated perceptual hash functions, executing preimage attacks against both hashes need not become much more difficult for the adversary: easy * easy is still easy. Layering cryptography upon cryptography is meaningful, but only as long as one of the layers is actually difficult to attack. This is not the case for perceptual hashes. In fact, in many similar contexts, these adversarial attacks tend to transfer: if they work against one technique or model, they often work against other models as well [3]. In the attack discussed above, the adversary has nearly full control over the "visual derivative", so even a very unsophisticated adversary can subject the target thumbnail itself to the collider before performing the resizing attack, and hope that it transfers against the second hash. If the second hash is a variant of NeuralHash (somewhat likely; it could even be NeuralHash performed on the thumbnail itself, we don't know anything about it!), or if it's an ML model trained on the same or similar datasets (quite likely), or if it's one of the known algorithms (say PhotoDNA), then some amount of transfer is likely to happen. And given an adversary that is going to distribute a large number of photos anyway, a 10% success rate is more than enough. Given the diminished state space (fixed-size thumbnails, almost certainly smaller than 64x64 for legal reasons), a 10% success rate is completely plausible even with these naive approaches. An adversary that has some (even very little) information about the second hash algorithm can do much more sophisticated stuff, and perform much better.
But what if we boldly rule out all transfer results? Doesn't Apple keep their algorithm secret?! Can we think of the weights (coefficients) of the second perceptual hash as some kind of secret key in the cryptographic sense? Alas, no. Apple would have to make sure that all the outputs of the secret perceptual hash are kept secret as well. Due to the way perceptual hashing algorithms work, they provide a natural training gradient: having access to sufficiently many input-output examples is probably enough to train a high-fidelity "clone" that allows one to generate adversarial examples and perform successful preimage attacks, even if the weights of the clone are completely different from the secret weights of the original network. This can be done with standard black box techniques [4]. It's much harder (but nowhere near crypto-hard, and still perfectly plausible) to pull this off when they have access to only one bit of output (match or no match). A single compromised Apple employee can gather enough data to do this given the ability to observe some inputs and outputs, even if said employee has no access to the innards or the magic numbers. The hash algorithm is kept secret because if it wasn't, an attack would be completely trivial: but an adversary does not need to learn this secret to mount an effective attack.
These are just two scenarios. There are many others. "Nobody has ever demonstrated such an attack working end-to-end" is not a good defense: it's been two weeks since the system was rolled out, and once an attack is executed, we probably won't learn about it for years to come. But the attacker can be rewarded way before "due process" kicks in: e.g. if a victim ever gets a job where they need to obtain a security clearance, the Background Investigation Process will reveal their "digital footprint", almost certainly including the fact that the NCMEC got a report about them, even if the FBI never followed up on it. That will prevent them from being granted interim determination, and will probably lead to them being denied a security clearance. If you pull off this attack on your political opponents, you can prevent them from getting government jobs, possibly without them ever learning why. And again, this is one single proposed attack. There were at least 6 different attacks proposed by regular HN users in the recent threads!
As a more general observation, cryptography tends to be resistant to attacks only if one can say things such as "the adversary cannot be successful unless they know some piece of information k, and we have very good mathematical reasons (e.g. computational hardness) to believe that they can't learn k". The technology is flawed: even the state of the art in perceptual hashes does not satisfy this criterion. Currently, they are at best technicool gadgets, but layering technicool upon technicool cannot make their system more secure. And Apple's system is a high-profile target if there ever was one.
Barring a major breakthrough in perceptual hashing (one that Apple decided to keep secret and leave out of both whitepapers), the claim that the secret second hash will prevent collision attacks is not justified. The chances of such a secret breakthrough are very slim: it'd be like learning that SpaceX has already built a base on the Moon and has been doing regular supply runs with secret spaceships. Vaguely plausible in theory (SpaceX has people who do rocketry, Apple has people who do cybersecurity), but vanishingly unlikely in practice.
And that's before we mention that the mere existence of the collider made the entire exercise completely pointless: the real pedos can now use the collider to effectively anonymize their CSAM drops, making sure that all of their content collides with innocent photos, and ensuring that none of the images will be picked up by NeuralHash anyway. For all practical purposes, Apple's CSAM detection is now _only_ an attack vector, and nothing else.
The first half of your post is predicated on it being likely that the noise added to generate hash A using NeuralHash is also likely to produce a specific hash B with some unknown perceptual hashing function (which they specifically call out [1] as independent of the NeuralHash function, precisely because they don’t want to make this easy, so speculating it might be NeuralHash run again is incorrect). Hash A is generated via thousands of iterations of an optimization function, guessing and checking, to produce a 96-bit number. What shows that the same noise would produce an identical match when run through a completely different hashing function that is designed very differently, specifically to avoid these attacks? Just one bit of difference will prevent a match. Nothing you’ve linked to shows any likelihood of that being anywhere close to 10 percent.
For the second part, yes, if an Apple engineer (who had access to this code) leaked the internal hash function they use, or a bunch of example image-to-hash-value pairs, that would allow these adversarial attacks.
Until you can show an example or paper where the same adversarial image generates a specific hash value for two unrelated perceptual hash functions, with one being hidden, it is not right to predict a high likelihood of that first scenario being possible.
Here’s a thought exercise: how long would it have taken researchers to generate a hash collision with that dog image if NeuralHash weren’t public and you received no immediate feedback that you were “right” or getting closer along the way?
> Until you can show an example or paper where the same adversarial image generates a specific hash value for two unrelated perceptual hash functions, with one being hidden, it is not right to predict a high likelihood of that first scenario being possible.
"There is no paper attacking ROT13 done twice, therefore it must be secure". Usually, it's on the one proposing the protocol to make a case for its security. Doubly so when it's supposed to last a long time, a lot of people are interested in attacking it, and successful attacks can put people in harm's way.
You know what, if you think that this is difficult, feel free to pick an existing perceptual hash function H, cough up some money, and we'll announce a modest prize (say $4000) on HN for the first person to have a working collision attack for NeuralHash+H. H will run on a scaled-down thumbnail, and we'll keep the precise identity of the algorithm secret. If the challenge gets any traction, but nobody succeeds within 40 days, I'll pay you $4000 for your effort. If you're right, this should be easy money. (cf SHA1, which lasted 22 years)
Heck, if Apple claims that this is difficult (afaict they don't, it would be unwise), they might even join in with their own preimage challenge for $$$. It'd be a no-brainer, a simple and cheap way of generating good publicity.
They claim their H is resistant to adversarial attacks, so they are claiming this to be difficult.
If I took an exact public perceptual hash function implementation and used that as H in your contest, it might be possible for a researcher attacking all public perceptual hash functions to stumble on the right one within 40 days.
I agree with you that we are trusting Apple to implement this competently. This isn’t something that can be proved to work mathematically where nothing about the implementation has to be kept secret.
So, worst case, everything you say could come true, but to imply that it is likely is wrong.
This leaves open the question of how the image gets on the device of the victim. You would have to craft a very specific image that the victim is likely to save, and the existence of such a specially crafted file would completely exonerate them.
1. Pick a photo the target is known to have (a popular meme, a common wallpaper, etc.).
2. Generate an objectionable image with the same hash as the target's photo. (This is obviously illegal.)
3. Submit the objectionable image to the government database.
Now the target's photo will be flagged until manually reviewed.
This doesn't sound impossible as a targeted attack, and if done on a handful of images that millions of people might have saved (popular memes?) it might even grind the manual reviews to a halt. But maybe I'm not understanding something in this (very bad idea) system.
This requires the attacker to handle CSAM, which defeats the benefit. The risk in all these cases is that any time you actually handle CSAM, the attack is void, since you're now actually guilty of the crime yourself (and very few will cross that line).
The point, though, is that the scanning is something the victim's Apple phone is doing and the attacker's device is not. So the goal is to send hash-collided images over non-Apple channels (email) where there is a reasonably good chance the image makes its way into someone's global device photo store and into automatic iCloud uploads.
Sending an MMS would work, for example, or sending a picture over Signal which someone then saves outside of Signal (a meme).
In all these cases, the original sender doesn't have an Apple device: so they're not getting scanned by the same algorithm, but more importantly their device is not spying on them. Importantly too: they've done nothing illegal.
But: the victim is getting flagged by their own device. And the victim has to have their device seized and analysed to determine (1) that it's not CSAM, and (2) that they were sent the images that flagged and aren't trying to divert attention by deliberately getting themselves falsely pinged up front. But then (3) the sender has committed no crime. There's no reason, or even a risk, to investigate them, because by the time the victim has dealt with law enforcement, it's been established that no one had anything illegal.
It's the digital equivalent of a sock of cat litter testing positive for methamphetamine, except it turned up in your drive-through McDonald's order.
The goal is not to get convictions, the goal is harrassment.
Perhaps that's true in the narrowest sense, but aren't the odds of generating a colliding file so low as to all but rule out coincidence and therefore strongly indicate premeditated cyber-attack (which is illegal)?
If I were law enforcement, at the very least I'd want to keep tabs on these sources of false positives. Probably easy enough to convince a judge that someone capable of the "tech wizardry" to collide a hash can un-collide one too, and therefore more thorough/invasive search warrants of the source are justified.
Your argument is "the technology is flawed, therefore let's also arrest anyone we suspect of generating false positives".
Like security researchers. Or the people currently inspecting the algorithm. And also frankly what are you going to do about overseas adversaries? The most likely people looking at how to exploit this would explicitly be state-sponsored Russian hackers - this is right up the alley of their desire to be able to cause low level chaos without committing to a serious attack.
And at the end of the day you've still succeeded: the point is that by the time you've established it was spurious, the target has already been through the legal wringer. The legal wringer is the point.
None of those thumbnails (or visual derivatives) will match the hash value of the known CSAM you are trying to simulate, because it won’t be possible to know the target hash value, since that hash function is private.
Seeing how "well" app review works, I would not be surprised if the "peon" sometimes clicks the wrong button while reviewing, bringing down a world of hurt on some innocent apple user, all triggered by the embedded snitchware running on their local device.
> The end result is that some peon at Apple has to look at the images and mark them as not CSAM
Btw, this reminded me of a podcast about FB's group to do just this. Because it negatively impacted the mental health of those FB employees, they farmed it out to the contractors in other countries. There were interviews with women in the Philippines, and it was having the same impact there.
Some dudette/dude is going to look at my personal pictures every now and then? What if they are of my naked children, and what if that person is a CSAM-interested person? And she/he takes a picture of the screen? Ugh, it feels so bad!
I don’t want there to be a chance some person is going to look at my pics!
Why would anyone bother calling the cops and telling them that someone they don't like is an imminent threat? The end result is that some officer just has to stop by and see that they aren't actually building bombs. You've cost someone a bit of time, but that's it.
This also serves as a pretext to deeply search someone's device. So you must expect your device to get randomly searched by law enforcement. Completely ridiculous.
> In order to test things, I decided to search the publicly available ImageNet dataset for collisions between semantically different images. I generated NeuralHashes for all 1.43 million images and searched for organic collisions. By taking advantage of the birthday paradox, and a collision search algorithm that let me search in n(log n) time instead of the naive n^2, I was able to compare the NeuralHashes of over 2 trillion image pairs in just a few hours.
> This is a false-positive rate of 2 in 2 trillion image pairs (1,431,168^2). Assuming the NCMEC database has more than 20,000 images, this represents a slightly higher rate than Apple had previously reported. But, assuming there are less than a million images in the dataset, it's probably in the right ballpark.
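(Roughly, the n log n search described there can be done by sorting the hashes so that duplicates end up adjacent. A hedged sketch of the idea, not the author's actual code:)

```python
def find_collisions(hashes):
    """hashes: dict mapping image_id -> hash value.
    Sort by hash; any images sharing a hash become neighbours in the sorted list."""
    pairs = sorted((h, img) for img, h in hashes.items())
    collisions = []
    for (h1, a), (h2, b) in zip(pairs, pairs[1:]):
        if h1 == h2:
            collisions.append((a, b))
    return collisions  # note: groups of 3+ identical hashes only report adjacent pairs
```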
It's great to see the ingenuity and attention this whole debacle is receiving from the community. Maybe it will lead to advances in perceptual hashing (and also advances in consumer awareness of tech related privacy issues).
Reporting the collision rate per image pair feels misleading. What you really want to know is the number of false positives per image in the relevant set, not image pair, as that's the figure that indicates how frequently you'll hit a false positive.
In fact, I'd argue that the collision rate per image pair is overestimating the collision rate. It's the flip side of the birthday paradox. We don't care that any two images have the same hash, we care about any image having the same hash as one in the set that we're testing against.
Based on the pigeonhole principle alone, it will always be the case that collisions exist. The size of the digest is very likely smaller than the size of any given image.
Those should only differ by a factor of 2 [the collision rate for an image is the number of collisions it has divided by the number of other images, so the average collision rate is the total collisions divided by n(n-1) vs. n(n-1)/2 pairs] which isn't particularly relevant at this scale
You care about the total number of collisions, not collisions for a specific image, so they differ by a square -- hence the 1 in a trillion vs million difference.
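Back-of-the-envelope, with the article's numbers and two purely hypothetical sizes (per-user photo volume and database size, neither of which is public):

```python
n_imagenet = 1_431_168
pairs = n_imagenet * (n_imagenet - 1) / 2   # ~1.02e12 unordered pairs
p_pair = 2 / pairs                          # ~2e-12 collision probability per pair

photos_per_user = 10_000   # hypothetical uploads per user per year
db_size = 100_000          # hypothetical number of hashes in the CSAM database
expected_fp = photos_per_user * db_size * p_pair
print(f"~{expected_fp:.4f} expected false matches per user per year")  # ~0.002 here
```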
>This is a false-positive rate of 2 in 2 trillion image pairs (1,431,168^2). Assuming the NCMEC database has more than 20,000 images, this represents a slightly higher rate than Apple had previously reported. But, assuming there are less than a million images in the dataset, it's probably in the right ballpark.
Apple reported a pretty similar collision rate so maybe they did.
Their reported collision rate was against their CSAM hash database[1].
> In Apple’s tests against 100 million non-CSAM images, it encountered 3 false positives when compared against NCMEC’s database. In a separate test of 500,000 adult pornography images matched against NCMEC’s database, it found no false positives.
I don't follow the point you are making here. The goal of the algorithm is to match images so I would expect similar collision rates regardless of what image was matched against what set of hashes. The exact rate of collision will obviously vary slightly.
The only number I've heard from Apple is, "the likelihood that the system would incorrectly identify any given account is less than one in one trillion per year."[1] Which I read as enough false hits to flag an account that year (some interview said that threshold was around 30). That depends on the average number of new photos uploaded to iCloud, the size of the NCMEC database, the threshold for flagging the account, and the error rate of the match. Without knowing most of those numbers it's hard to gauge how close it is.
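For a sense of where a number like that could come from, here's a hedged sketch using Apple's published per-image test rate and two assumed values (photo volume and threshold); this is a naive, non-adversarial model, not Apple's actual analysis:

```python
import math

p_match = 3 / 100_000_000  # per-image false-match rate from Apple's 100M-image test
photos = 20_000            # assumed photos uploaded per account per year
threshold = 30             # assumed review threshold

lam = p_match * photos     # expected false matches per account per year (Poisson model)
# P(at least `threshold` matches) is dominated by its first term:
log10_p = (-lam + threshold * math.log(lam) - math.lgamma(threshold + 1)) / math.log(10)
print(f"log10 P(account flagged) ~ {log10_p:.0f}")  # around -129 with these inputs
```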
Nah, the consensus online seems to be that Apple hires naive, inept script kiddies and that any rando on GitHub can prove without question that Apple’s solution is flawed.
There's been a lot of focus on the likelihood of collisions and whether someone could upload, e.g., an image with a matching hash to your device to "set you up", etc. But what's still extremely concerning is that there is still no guarantee that the hash list used can't be co-opted for another purpose (e.g. politically sensitive content).
On top of that, what happens if a court/government orders them to hand over all the current data about people with matches, regardless of the 30-match threshold?
They can't say whether or not it is a match, so they have to go after the individuals. Is that enough evidence for a warrant?
Someone in the court system thinks it's true but can't prosecute? Oh, it got leaked.
--
Not every country has the same protections around innocent until proven guilty. And even then, we've seen cases in the US where someone has been held in jail indefinitely until they provide a password.
More broadly speaking, every part of this scheme that is currently an arbitrary Apple decision (and not a technological limitation), can easily become an arbitrary government decision.
And yes, it's true that the governments could always mandate such scanning before. The difference is that it'll be much harder politically for Apple to push back against tweaks to the scheme (such as lowering the bar for manual review / notification of authorities) if they already have it rolled out successfully and publicly argued that it's acceptable in principle, as opposed to pushing back against any kind of scanning at all.
Once you establish that something is okay in principle, the specifics can be haggled over. I mean, just imagine this conversation in a Congressional hearing:
"So, you only report if there are 30+ CSAM images found by the scan. Does this mean that pedophiles with 20 CSAM images on their phones are not reported?"
"Well... yes."
"And how did you decide that 30 is the appropriate number? Why not 20, or 10? Do you maybe think that going after CSAM is not that important, after all?"
There's a very old joke along these lines that seems particularly appropriate here:
"Churchill: Madam, would you sleep with me for five million pounds?
Socialite: My goodness, Mr. Churchill… Well, I suppose… we would have to discuss terms, of course…
Churchill: Would you sleep with me for five pounds?
Socialite: Mr. Churchill, what kind of woman do you think I am?!
Churchill: Madam, we’ve already established that. Now we are haggling about the price."
Apple has put itself in the position where, from now on, they'll be haggling about the price - and they don't really have much leverage there.
5 million pounds can ensure a comfortable life everywhere in the world today and was likely worth much more in the past.
Assuming said socialite was not in a committed relationship, why would they not take that money for what must be 30 minutes of effort, which may actually be pleasant?
5 pounds on the other hand is not only a small amount of money, but it’s also insulting to ask somebody that’s not a prostitute to sleep with one for such a pittance.
Fictional Churchill was acting like an asshole and the fictional socialite was acting rationally. She should simply have replied instead, “X million pounds is the best I can offer, but I should certainly hope you are good in bed, Mr. Churchill”.
The post you responded to already addressed that exact point.
>And yes, it's true that the governments could always mandate such scanning before. The difference is that it'll be much harder politically for Apple to push back against tweaks to the scheme (such as lowering the bar for manual review / notification of authorities) if they already have it rolled out successfully and publicly argued that it's acceptable in principle, as opposed to pushing back against any kind of scanning at all.
>Once you establish that something is okay in principle, the specifics can be haggled over. I mean, just imagine this conversation in a Congressional hearing:
My understanding is that these "safety vouchers" are uploaded regardless of a match. Only when there are about 30 matches can those safety vouchers be decrypted to determine that there was a match.
So Apple claims your threat model is not technically possible.
Besides, Govt. can just order Apple to hand over the photos themselves from iCloud Photos because those are not end-to-end encrypted.
Which is an individual and legal process, i.e., it requires a search warrant. There are certainly problems with this process, but at a minimum, an account needs to be already and individually identified through some process (suspicion) and a legal process vetted by a judge happens (probable cause) to allow the access.
That's not at all the same as proactively casting a net and starting an investigation based on the results.
Apple has designed the system so that 30 matches are required; they could include more key material in each safety voucher to reduce the number required, or make it only require one match by providing the whole key in each voucher, or forego the system entirely in favor of one without such restrictions (which they can do, given some time, with an iOS update). It isn't "not technically possible" it's just "how they designed it", which is what the poster is saying Congress would ask about.
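For anyone curious what "include more key material in each safety voucher" means mechanically, below is a generic Shamir-style threshold secret-sharing sketch in Python. It illustrates the general technique of "k shares reconstruct a key, k-1 reveal nothing"; it is not Apple's actual construction, and the prime, threshold, and share count are arbitrary assumptions for illustration (requires Python 3.8+ for modular inverses via pow):

import random

P = 2**127 - 1   # a prime modulus for the finite field (arbitrary choice)

def make_shares(secret, threshold, count):
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [random.randrange(P) for _ in range(threshold - 1)]
    def evaluate(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, evaluate(x)) for x in range(1, count + 1)]

def recover(shares):
    # Lagrange interpolation at x = 0 recovers the constant term (the secret).
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * -xj % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = make_shares(secret=123456789, threshold=30, count=1000)
assert recover(shares[:30]) == 123456789   # 30 shares: the key comes out
assert recover(shares[:29]) != 123456789   # 29 shares: the key stays hidden (with overwhelming probability)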
There would be a lot less misinformation floating around about what this technology is and isn't if people read the documents published about how it works (of which there are now several) before airing "what if" scenarios that are already covered.
As I understand it, Apple's servers know nothing until the 30+ match threshold is reached. This is actually one way that their system might be an improvement.
NB: I'm not in favour of this system - I'm only commenting on this one specific scenario.
> there is still no guarantee that the hash list used can't be coopted for another purpose (e.g., politically sensitive content).
That isn't a bug, it is a feature and will be the main use of this functionality.
The "preventing child pornography" reasoning was specifically chosen so that Apple could openly coordinate with governments to violate your privacy while avoiding criticism.
The OP mentions that two countries have to agree to add a file to the list, but your concern is definitely valid:
> Perhaps the most concerning part of the whole scheme is the database itself. Since the original images are (understandably) not available for inspection, it's not obvious how we can trust that a rogue actor (like a foreign government) couldn't add non-CSAM hashes to the list to root out human rights advocates or political rivals. Apple has tried to mitigate this by requiring two countries to agree to add a file to the list, but the process for this seems opaque and ripe for abuse.
What if those two countries can be Poland and Hungary? These two countries have been passing lots of laws to ostracize and criminalize pro-LGBT content and are friendly to each other.
Fortunately, Hungary and Poland, even combined, are quite small a fish for Apple to just tell them to go pound sand in some form or other. They can even ban iPhone sales, people will just buy them in Czechia.
It's not like China, or India, who not only have huge markets, but could easily hold a chunk of Apple's supply chain hostage.
It's very easy to uphold human rights if it doesn't actually cost you anything.
Apple did mention this in their security threat model document [0]:
> Apple will also refuse all requests to instruct human reviewers to file reports for anything other than CSAM materials for accounts that exceed the match threshold.
That's too little, too late. By the time any human reviewer can evaluate whether the images should have been in the database the encryption on the backups has already been circumvented and other parties have already been given access to your private files.
That is not how this works. Please read up on the functioning of the system before chiming in with such certainty on its behavior.
The only thing they will have gained access to are the “derivatives” (presumably lower res versions) of the matched photos, which if this is done to frame you is strictly the fake CSAM.
I'm aware of how the system works. Those are still your private files which were supposed to be encrypted and which were revealed (either partially or fully, makes no difference) to another party. That fact that this review process is even possible means that the encryption on the backups can't be trusted.
There wasn’t any guarantee that Apple didn’t have and use such technology before they made this feature public, or wasn’t already running it in the current version of iOS.
If you trusted Apple not to stealthily run such technology before, the question is how much less (if any) you trust them now.
Yes, I was about to say the same thing. Hash collisions are an extra concern about what Apple is doing, but even if the hashes were as collision-free as cryptographic hashes, that would not make the invasion of privacy OK. The technical discussion is something that Apple can easily parry, and it is the wrong framing of this debate.
As someone not in the tech field, it is incredibly concerning that half the people here on Hacker News, people who help build this kind of technology, do not seem to be concerned with what Apple is doing.
It doesn’t matter if there are collisions if the two images don’t actually look the same. Do people honestly believe a single CSAM flag from an “innocent” image is going to result in someone going to prison in America?
PhotoDNA has existed for over a decade doing the same thing with no instances that I have heard of.
If some corrupt government wants to get you they don’t need this. They can just unilaterally say you’ve done something bad without evidence and imprison you. It happens all the time. It’s even happened in America. Just look up DNA exonerations - people have had DNA on the scene that literally proves their innocence and they’re still locked up.
People should care about their governments being corrupt, not necessarily about this.
That's not the route by which this will be exploited and used at scale.
A corrupt government has to already know they want you in order "to just get you". Instead they will embed a collision in an anti-government meme. That collision will flag you, and now they know you harbor doubts and will come get you.
This is why it's a privacy concern. It's not the tech (like you said, PhotoDNA's been around forever), it's the scanning of the phone.
IIRC many totalitarian governments, historical and current, made/make their surveillance blatantly obvious, because the chilling effect deterring people is much more valuable than the added intelligence value from keeping people careless.
After all, they want to suppress dissent, and they can't catch and arrest everyone - it's much more effective if people don't even dare to voice their dissent.
This is also why you see people in situations like the Arab Spring use known-insecure/monitored means of communication, because they realize that the value and power that comes from communicating and finding other likeminded people is worth painting a target on yourself (because you can't succeed without it, and if you succeed, there will be too many to prosecute).
Alternatively, a corrupt government might want folks to distrust their mass market phone such that they can have an individual come along and offer them a 'completely secure and private' alternative[1].
> That collision will flag you, and now they know you harbor doubts and will come get you.
Only if that government has worked out some deal with Apple whereby such an anti-government meme would end in the government being notified accordingly. Don't forget that you need a sufficiently high number of collisions, for one, and that those collisions are audited by Apple before being sent to law enforcement.
That's not actually answering the question in the GP about why this is different.
Photos people send me to my Android are automatically sent through 3rd parties, either through MMS, Facebook messenger, Google Photos, or One Drive. Photos arriving on my device are almost guaranteed to be uploaded to both OneDrive and Google Photos based on how defaults of Android phones are setup.
So someone could already send hash collisions my way (purposely or inadvertently) and authoritarian governments already have access in their respective clouds (at least China does).
And yet, there are not stories of people being falsely accused of child porn due to PhotoDNA hash collisions.
Why does "on device for apple devices only" change the calculus.
Why waste a resource like that and throw them in prison when you have compromising material? That would be stupid. Epstein was probably a high society pimp and there probably was enough evidence for convictions. That wouldn't have happened if it didn't get public.
Man, wouldn't you love to be the developer who gets assigned the feature to commit these horrible secret privacy violations with deeply evil ethical problems?
You don't even have to implement the feature. All you need is really good proof that you were asked to, and now your job at Apple is secure, along with a huge raise, for years if not for life. Something commensurate with what you and they both know they would lose in the PR damage and possible EU fines.
> wouldn't you love to be the developer who gets assigned the feature
Also, they'd probably not use a Cupertino developer. I'm sure a dev in a nation with a lot less rights is easier for this sort of work. Find a nation where the protections for employees are worse and good jobs harder to find.
If you don't trust Apple then they could've already done what you're concerned about before they announced this. I don't really get it. Either you trusted Apple before this and you continue to, or you didn't before, and continue not to. If it's the later, then you shouldn't be using Apple services.
> If you don't trust Apple then they could've already done what you're concerned about before they announced this.
Not without the risk of it being discovered (either through a leak or because someone analyzes the software on the phone), and then having a much bigger scandal on hand.
We are talking about how everyone who gave Apple money now has a potential probable cause vector that they didn't before. Everyone running the software is a suspect by default. Ask black Americans how they feel about setting the bar low for probable cause.
"Following the 2004 Madrid train bombings, fingerprints on a bag containing detonating devices were found by Spanish authorities. The Spanish National Police shared the fingerprints with the FBI through Interpol. Twenty possible matches for one of the fingerprints were found in the FBI database and one of the possible matches was Brandon Mayfield. His prints were in the FBI database as they were taken as part of standard procedure when he joined the military."
"The FBI described the fingerprint match as '100% verified'."
Reading about incidents such as this has made me think critically about all cloud services in the United States, and the conclusion is simply not to use them.
Sure, the probability is lower than getting struck by lightning. I certainly don't play in the rain and I won't be using cloud services where I'm exposed to this kind of nonsense with the FBI.
> Sure, the probability is lower than getting struck by lightning.
I don't think anyone can actually know that, because I don't think statistics are kept on how often these sort of dragnet programs result in civil liberty violations and secret grand juries. That should be the more immediate concern, because after that point you are depending on the goodwill of prosecutors... which is a super bad idea.
Just reading those words is rage-inducing, but I'm grateful to have learnt this example of government lying. I feel like it should become an expression that societies teach to their children to warn them about abuses of power. Other mottoes synonymous with government deception and corruption come to mind, but at the risk of being too controversial I will share only their initials and dates: "SAARH" (2013), "MA" (2003), "IANAC" (1973), "NHDAEMZE" (1961), "AMF" (1940).
Because I delight in delivering bad news, I'll point out that the real takeaway shouldn't be that federal LEOs regularly lie (though, they do) - it is that they are permitted to lie convincingly through handwavy technical means. All these tools are designed to give them permission to totally destroy your life. I'm aware of only two geewiz CSI methods that are actually founded in science: cryptography (this neural crap doesn't qualify) and DNA. Unlike fingerprints and bitemark analysis, those two tools were invented outside of law enforcement - instead of being purpose built for prosecution. Anybody doubting that should look into the history of the polygraph and its continued use in the face of evidence demonstrating how useless it is in the pursuit of truth... which begs the question: if they aren't interested in the truth, what are they doing?
DNA is only mostly founded in science. There's the interpretive element in saying: Oh, this DNA matches the suspect, even though it's mixed with the victim's blood.
Nice big clean sample? It probably actually matches. Small sample? Mixed with other people's DNA? Especially in a place they tend to visit (or live in?)?
At that point the problem isn't the science behind DNA, it is the fact that your freedom depends on a jury understanding statistics. But at least there exists the opportunity to challenge the evidence on objective grounds, that unfortunately requires an expensive professional expert witness...
At least part of the concern is that a hash collision is basically "cause" for Apple to then dump and begin manually (like, with humans) reviewing the contents of your device, all of which will be happening behind the closed doors of a private corporation, outside of any of the usual oversight or innocent-presumption mechanisms that come from it happening through the courts.
That, combined with a (pretty reasonable) expectation that the pool of sus hashes will greatly expand as law enforcement and others begin to understand what a convenient side-step this is for due process.
That is literally the status quo with every cloud service. Apple, unlike the others, has said that they will evaluate you on the basis of what’s included in the associated data of your safety voucher, and you can inspect those contents because they’re shipped in the client. Facebook, for all I know, might be calculating a child predator likelihood score on my account based on how often I look up my middle school ex-girlfriend on Instagram.
In addition, “pretty reasonable” is an opinion not fact. Where is the evidence that PhotoDNA hashes have been compromised in this way in the fifteen years they’ve been used?
I don't think we can just appeal to the status quo here and assume it's acceptable. There's a couple reasons.
First, how many people really understood this previously? Did society at large actually knowingly accept the current state of things, or did it just happen without most people realizing it? Even here on HN, where we'd expect to find people far more knowledgeable about it than the general public, I'm not sure how well known it was what was actually happening, though I'd assume most would be aware it was possible.
Secondly, there's a significant difference between your own device or Apple's server doing this. On the technical side of things, right now, it might not matter that much since it currently is limited to things you upload to iCloud. But more philosophically, it's your own device being turned against you to check you for criminal behavior. That's very different from somebody else checking up on you after you willingly interact with them.
If the problem is a lack of understanding of the status quo, then it isn't fair to criticize Apple alone. People ought to demand answers about the state of server-side scanning from Facebook and Microsoft and everyone else that employs PhotoDNA as well. The most popular article submitted to HN with "PhotoDNA" in the title garnered hardly any interest at all, even though someone there implied that a hash collision might be possible five years in advance.
> But more philosophically, it's your own device being turned against you to check you for criminal behavior. That's very different from somebody else checking up on you after you willingly interact with them.
This literally only works once you willingly send photos to iCloud.
You can't buy a car in the EU that doesn't have a SIM card. All tractors have a computer that locks out the machine if it doesn't like something and phones home. Almost every TV on sale is 'smart' and spies on what you are saying.
Coffee machines, lights and toasters are now internet connected, and all of them send data to a server that will be scanning for the 'wrong' material. In 10 years there will be nowhere to hide.
Right. I mentioned that. It's still your own device doing it.
It's like announcing to your family member you're going to tell your neighbor you committed a crime and your family member turns you in first. Yeah, you could expect your neighbor to do the same, but are you really not going to feel any differently about the fact it was your family that turned you in?
There wasn’t such a hurdle before or in the counterfactual where they built infrastructure to scan iCloud while also keeping iCloud Backups for every device.
I mean, that has always been the case. I'm not sure why there is so much paranoia over this hypothetical situation when this hypothetical has actually existed since the first iPhone shipped in 2007.
I don't understand techie people on HN being okay with Apple breaching the spirit of the 4th Amendment and becoming the FBI agent in your phone. Scanning the stuff in the cloud is one thing, but this is crossing a line. I am shedding all my Apple hardware over it. If you want to trust them, fine, but one day it will bite you on the ass.
For ideological consistency, are you dumping every service provider that scans the contents of your account and reports offending data to law enforcement?
What they would be reviewing would be scaled version of the specific photos that triggered the hash alert. It’s not a broad fishing expedition. There is no mechanism to start browsing the photos on your phone.
You know, it's rather not okay to treat every smartphone user out there as a potential criminal just because they happen to have photos on their devices. At least in an alleged democracy where there's this presumption of innocence thing.
Even Pegasus wasn't this rotten; at least it wasn't indiscriminately installed onto everyone's phone.
But what can I say, there wasn't much uproar about piracy tax on blank CD-R(W) media back in the day, so why not have that now. And eventually we go peak USSR where everyone and their dog is suspect and whoever is arrested is the enemy of the people. Yay, it's somehow reassuring to know I won't live long enough to see it.
And it's fine to treat everyone as a potential criminal when we entrust our data on the same company's servers? No matter if on-device scanning makes surveillance easier than ever, the surveillance itself was still a significant possibility up to this point with server-side scanning. Imagine how many petabytes of user data already exist in the cloud.
No, it's that in spite of there already being an invasive scanning process in place for this long at every major tech company that handles user data, nobody seemed to care until now.
Also, there's a difference. You upload stuff to someone's server; if they don't vet it, they become accomplices. Apple intrudes on what you do on your phone. This, among other things, tells you that you paid mad bucks for this phone and you don't even own it, you rent it from your phonelord Apple. Also, in their eyes you're a suspect and likely a filthy pedo.
One of the things that is happening now is that the entire PhotoDNA system is finally coming under the level of oversight that it should have had right from the start.
I can tell you from working in this area that it's possible for someone to have their lives ruined by a misplaced investigation, have that investigation abandoned because they turn out to be obviously innocent, and for that to not be well-known, because people simply would not understand the context.
Before this Apple scandal, if you'd written to your representative or a journalist or an activist group and said "I was framed for child abuse because of a computer program that misidentified innocent pictures", they would attach a very low priority to dealing with you or publicising this. And almost all people who have experienced this kind of nightmare really don't want to re-live it in public for some tiny possibility of real justice being served for them, or for others. They just want it all to go away.
We certainly have Apple's PR blunder to thank for that, but if PhotoDNA always held that potential for abuse due to its very nature, why did we remain silent for 13 years?
Maybe it's because Google and Microsoft and others' policy of security through obscurity actually succeeded in preventing the details of PhotoDNA from coming to light, and it took Apple exposing their hashing model to reverse engineering by including it on the device for people to finally wake up.
Before this whole Apple client-side scanning debacle... seems pretty likely. A lot of privacy-focused people also avoid Google and Microsoft cloud services like the plague and trusted Apple up to this point to protect their privacy. The fact that Apple was (and is) scanning iCloud Photos libraries for CSAM unbeknownst to most of us is just another violation of that trust and shows just how far the "what happens on your iphone, stays on your iphone" privacy marketing extends (read: not past your iphone, and sometimes not even on your iphone).
I think the actual issue is that Apple wasn't scanning enough user data, so the government or the FBI or some other external force was holding them accountable for it out of public view, and Apple was pressured into increasing the amount of scanning they conducted.
"U.S. law requires tech companies to flag cases of child sexual abuse to the authorities. Apple has historically flagged fewer cases than other companies. Last year, for instance, Apple reported 265 cases to the National Center for Missing & Exploited Children, while Facebook reported 20.3 million, according to the center’s statistics. That enormous gap is due in part to Apple’s decision not to scan for such material, citing the privacy of its users."
[1]
You are commenting a lot for this many places in the thread. Are you arguing for this system or for Apple? It reads like pro-Apple and doesn't add anything except "I think it is good, therefore it is good".
If you have a point which you feel rebuts a common argument, it seems reasonable to leave that comment in places you see that argument. The alternative is "minority positions should be drowned out", no?
> It doesn’t matter if there are collisions if the two images don’t actually look the same.
Is that really true? My understanding is that the manual reviewers at Apple only see some kind of low-resolution proxy, not the full-resolution image. I'd also be shocked if the human reviewers were shown the original, actually CP image, to compare to.
Given that, it's not necessary to produce an actual visual match, it's just necessary to produce an image that when scaled-down looks like CSAM (e.g. take an actual photo of a kid at the beach and photoshop in some skin-coloured swimwear with creases in the right places).
> Do people honestly believe a single CSAM flag from an “innocent” image is going to result in someone going to prison in America?
The attack I'd worry about here is similar to swatting. Someone who doesn't like me sends a bunch of images like the ones I described above (not just one), they end up synced to iCloud (because Apple wants _everything_ synced to iCloud) and Apple reports me to the authorities, who end up knocking at my door and arresting me.
Even though I'm innocent, I'll probably have most of my computers confiscated for a while and spend a few days locked up.
> PhotoDNA has existed for over a decade doing the same thing with no instances that I have heard of.
PhotoDNA's algorithms and hashes aren't public, so it's not clear how an attacker would exploit PhotoDNA in the way that people are afraid will be done for Apple.
PhotoDNA also isn't, as far as I know, part of a product that aims to create unprotected backups of the phones of nearly a billion users. Apple really wants you to upload your whole phone to iCloud. The only comparable alternative is Google's Android backup but Google does the right thing and end-to-end encrypts that.
Manually, perhaps because the attacker crafts the messages to make that desirable.
But a bigger concern is automatic saving. WhatsApp, for example, can automatically save all received photos to your Camera Roll, which is of course automatically backed up to iCloud for many people. So an attacker could potentially just send you a bundle of 40 images or however many, and your phone automatically sends them to Apple.
You ask for examples a few times. Here’s one. Computers confiscated from a liberty activist and his radio station. No CP-related charges pressed in the 5 years since they confiscated his computers.
I’m guessing most times this happens, the accused try to keep it on the DL.
> It doesn’t matter if there are collisions if the two images don’t actually look the same. Do people honestly believe a single CSAM flag from an “innocent” image is going to result in someone going to prison in America?
I understand that the system as stated today has multiple safeguards against such things happening but...
Given the sum total of BS I've seen happening in this country, yes, I believe that kind of thing is quite possible.
Lots of things start out like this, and then later they start looking for or blocking other things: pirated movies, files leaked from three-letter agencies, a picture of the president with a ** in his mouth, etc. Even if this is 100% bullet-proof, it is still enough that Apple should be seen as privacy-invading and leaking everything to the government (and others later on), since it will be abused if implemented everywhere. This isn't irrelevant because of other things happening that are also bad. This is added on top of those broken systems, like the ones in the US you mention.
> By the time the FBI comes knocking for your devices, they have a lot of evidence, not a list of hash collisions.
Not necessarily. The FBI knocks on your door when they have convinced a warrant-signing judge that there is probable cause, or that additional information can be collected to build a case in which the defendant will 'plead' or make a deal for sentence reduction. 90% of defendants make a deal and never go to trial, because by that time the stakes are so high that a guilty verdict carries the harshest possible sentence.
The FBI only needs to convince you that they can win a case, or convince the judge that there might be fruit behind a locked door; they don't need direct evidence. It's really up to the judge.
No, the FBI's job is to investigate whether or not a crime has been committed. They'll do this by getting a warrant for your devices, and then take them and scan them.
I doubt it; all they need is this stuff and they can get a warrant to rummage through your belongings and take all your computers, USB drives, etc., and also put your name on a watch list and your permanent record. Much like newspapers retracting mistakes when the story doesn't pan out, it goes on the back page. Plenty enough to wreck your life.
If you don't actually have CSAM then the person at Apple who visually audits the collision set will not send your info to law enforcement, and nothing will be confiscated. Apple doesn't just take any instance of a collision and send it to the FBI.
If we reduce it down to "someone being accused of a crime but later being found innocent", then there are several, and when it comes to sex crimes, the accusation alone is enough to ruin someone. People don't care about minor details such as the fact that the person was later found innocent. To them, it's the same as getting off on a technicality.
But that's deployed in a very different way which makes the concerns being discussed much less likely to happen.
Specifically, the person doing the scanning already has access to the photos and can double-check the results without having to seize the device, a rather public process.
Only if you’re putting your stuff in the cloud. There’s a big difference between my files on my computer and my files on someone else’s computer. Or at least there should be, IMO.
> Do people honestly believe a single CSAM flag from an “innocent” image is going to result in someone going to prison in America?
Being on a government-maintained secret list of CSAM-flagged people would probably have a lot of consequences in your life going forward without you knowing about it. It may be just one innocent image which accidentally matched the hash, yet without any procedure to get you off that list, or even just to learn whether you're on that list ...
>If some corrupt government wants to get you they don’t need this.
It isn't a corrupt government which is the threat here. It is a well-intended, well-functioning, relatively uncorrupt government which spends a lot of effort to protect its society from terrorism, child porn, sex trafficking, drugs, etc. In the case of a corrupt government the situation is in a way easier: you can always learn and correct the necessary things through the corrupt government officials by paying a corresponding "fee".
> Do people honestly believe a single CSAM flag from an “innocent” image is going to result in someone going to prison in America?
I believe a single image flag could be used as pretext to arrest someone that the government has already deemed an enemy of the state. Ex. Someone like Julian Assange
An honest question I have is whether PhotoDNA results were ever validated. Is there a chance that collisions have been occurring with no one looking to see if actual CSAM was in use?
If the two images looked the same, then the expected behaviour is a collision, so if collisions matter at all, it would only be for pictures that look different.
They don’t matter because if two images don’t look the same, but collide - then human processes will absolve you. This isn’t some AI that sends you straight to prison lol
- You receive some naughty (legal!) images of a naked young adult while flirting online and save them to your camera roll.
- These images have been made to collide [1] with "well known" CSAM images obtained from the dark underbelly of the internet, on the assumption that their hashes will be contained in the encrypted database.
- Apple's manual review kicks in because you have enough such images to trigger the threshold.
- The human reviewer sees a bunch of thumbnails of naked people whose age is indeterminate but looks to be on the young side.
- Your case is forwarded to the FBI, who now have cause to turn your life upside down.
This scenario seems entirely plausible to me, given the published information about the system and the ability to generate collisions that look like an arbitrary input image, which is clearly possible as demonstrated in the linked thread. The fact that most of us are unlikely to be targets of this kind of attack is little comfort to those that may be.
The problem is that Apple cannot actually see the image at its original resolution because of the supposed liability of harboring CSAM, but being able to retrieve the original image would mean being able to know the complete contents of the rest of its data. To me, it sounds like Apple is trying to make a compromise between having as little knowledge of data on the server as possible and remaining in compliance with the law, but that compromise is impractical to execute.
The law states that if you find an image that's believed to be CSAM, you must report it to the authorities. If Apple's model detects CSAM on the device, sending the whole image to the moderation system for false positives carries the risk of breaking the law, because the images are likely going to be known CSAM, since that's what the database is intended to detect, so they'd be accused of storing it knowingly. Perhaps that's why the thumbnail system is needed.
So why wouldn't Apple store the files unencrypted and scan them when they arrive? That would mean Apple would remove themselves from liability by preventing themselves from gaining knowledge of which images are CSAM or not until they're scanned for, but could still send the original copy of the image with a far lower chance of false positives when something is found. That knowledge or the lack of it about the nature of the image is the crucial factor, and once they believe an image is CSAM they cannot ignore it or stop believing it's CSAM later.
That question may hold the answer to why Apple attempted to innovate in how it scans for child abuse material, perhaps to a fault.
Your scenario makes no sense. You can just skip all of the steps and skip to "the FBI, who now have cause to turn your life upside down." If the evidence doesn't matter, they could've just reported you to the FBI for having CP, regardless of whether you have it or not and your point remains the same.
Not to mention your scenario requires someone you trust trying to "get you." If that's true, then none of the other steps are necessary since you're already compromised.
If your iCloud Photo library contains enough photos to trigger a manual review + FBI report, how does the scenario make no sense?
And as far as your point about "someone you trust trying to 'get you'"... have you ever dated? Ever exchanged naughty photos with somebody (I expect this is even more popular these days among 20-somethings since covid prevented a lot of in-person hookups)? This doesn't seem crazy for a variant of catfishing. I could easily see 4chan posters doing this for fun.
My point is - if you hold that view, then collisions shouldn't matter at all, since if they look the same the correct action is for the person to get thrown in jail.
There are a lot of really valid criticisms of Apple's plan here, but Apple has gone out of their way to prevent that exact case. Apple is using secret splitting to make sure they cannot decode the CSAM ticket until the threshold is reached. Devices also produce some synthetic matches to prevent Apple (or anyone else) from inferring a pre-threshold count based on the number of vouchers.
> There are a lot of really valid criticisms of Apple's plan here, but Apple has gone out of their way to prevent that exact case.
No, they haven't. The files are still uploaded to Apple's servers, where Apple holds their encryption keys. Apple or the FBI could scan the photos for CSAM whenever they want to.
The contortions Apple is going through to scan locally only really make sense if they are gearing up to do end-to-end encryption of iCloud Photos. I imagine they have other prerequisites before they are ready to announce that as a feature, but if they are not planning to encrypt more, why wouldn't they just scan in the cloud?
And what if the FBI demands that Apple leadership genuflect toward an oil painting of J Edgar Hoover? The FBI can demand all sorts of things. In this case, Apple isn't tracking collisions as small as 1 photo.
Let's say I get you to click on a link. That link contains a thumbnail gallery of CSAM. With the right CSS, you might not even notice it, but it's in your browser's cache and on your filesystem. Lots of pictures - more than enough for your phone to snitch on you.
All because you clicked on a link.
Phishing attacks can now put you in prison, label you as a pedophile and sex offender, and destroy your life.
I don't know. But if the answer is "we won't snoop your browser cache", then people will quickly learn to store offending material under browser caches. Then Apple will have to start scanning browser caches. There is no policy fix for this; pedophiles aren't stupid.
Absolutely no one involved -- no one who conceived of this effort, no one who implemented it, and no one who defends it -- was under the impression that you couldn't get around this by not storing your images on iCloud. The idea is that it will catch pedophiles who aren't careful or tech-savvy enough, and such people exist, trust me. I suspect pedophiles aren't especially smarter than average, and most normal people aren't even going to be aware something like this exists.
I'm not talking about Apple, I'm saying in general technology has already been deployed to make what you're describing possible. So where's the evidence of abuse?
> Apple now has over 1.5 billion users so we are talking about a large pool of users at stake which increases the likelihood of even a low probability event manifesting
This is an extremely good point. If the whole system, end to end, after all safeguards (e.g. human reviewers which can also make mistakes) has a one-in-a-billion chance to ruin a user's life, then statistically, we can expect 1-2 users to have their lives ruined.
What's even worse, when those individuals are facing whatever they're facing, they'll have to argue against the one-in-a-billion odds. If there are jurisdictions where defense lawyers don't get access to the images their client is accused of, and prosecutors and judges don't look at the images but only at the surrounding evidence (which says this person is guilty, with 99.9999999% certainty), Apple may have built a system that's statistically expected to send an innocent person to prison.
> This is an extremely good point. If the whole system, end to end, after all safeguards (e.g. human reviewers which can also make mistakes) has a one-in-a-billion chance to ruin a user's life, then statistically, we can expect 1-2 users to have their lives ruined.
No one will have their lives ruined. If there's a false collision, someone at Apple has to look at the images as a final safeguard. If it's not actually CSAM then it won't ever make it to law enforcement.
What probability do you ascribe to that reviewer clicking the wrong button, be it out of habit/zoning out (because the system usually shows them true positives), cheating (always clicking "yes" because it's usually correct and allows them to get paid without having to look at horrible images all day), mistake, wrong instructions (e.g. thinking that all images of children, or all porn including adult porn, should be flagged), confusion, or malice?
1 in 10? 1 in 100? 1 in 1000? Pick a number, you now have an estimate of how much room for error there is in the whole system.
If you consider 0.15 lives ruined on average per year acceptable, and the reviewers have a 1-in-1000 error rate, then the rest of the system has to make less than 1-in-10-million mistakes per year. And I'm pretty sure 1-in-1000 for the reviewers is very, very optimistic, even if you do quality checks, control for fatigue, etc.
On the other hand, hopefully there are some safeguards after the reviewers (e.g. human prosecutors who don't blindly rubber stamp). But my point is: The room for error in anything done at scale that has severe consequences is extremely small.
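A quick check of that error budget, using only the numbers assumed in the comments above (1.5 billion accounts, a 1-in-1000 reviewer error rate, and 0.15 "acceptable" wrongful reports per year); all three inputs are assumptions from this thread, not published figures:

accounts = 1.5e9        # assumed number of accounts
reviewer_error = 1e-3   # assumed rate at which a reviewer wrongly confirms
acceptable = 0.15       # assumed tolerable wrongful reports per year

max_pre_review_rate = acceptable / (accounts * reviewer_error)
print(max_pre_review_rate)   # 1e-07: everything before the reviewer must err less than 1 in 10 million per account per year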
And presumably, at some point even if it makes it to a trial, the accused would be able to point to the flagged images to show that they aren't actually child porn.
And a country in which this isn't possible, is likely one that is going to ruin people's lives just fine without Apple.
I think you misunderstand the reporting process. If the threshold is passed, Apple reports the images to NCMEC for review. NCMEC reports it to authorities. So, this would require the failure of three organizations.
They didn’t say one in a billion, they said 2 in 2 trillion.
A billion users with a thousand photos each would only be one trillion photos.
So a chance someone inadvertently has to look at a photo and … not prison, but, “nah, not this photo”.
Probability seems rather higher that the global suspicion-less and warrantless search will generate more positive PR results and systemic downstream negative privacy impacts than that chance of a subsequently easily avoided adverse result.
Odds are decent this all turns out seeming to normals like a “no downsides” implementation. If anyone wants to screech, better screech before it all proves perfectly splendid.
I’m much more worried about adversarial attacks rather than accidental false positives.
For example, a year ago I was trying to track down how some unscrupulous photos ended up in my photo library. I finally realized that Telegram had, without my knowing permission, been adding all photos I had received in a crypto discussion group to my Apple photo library.
This kind of thing is why you really can’t have conversations on phones in confidence. You really have no idea what the chat client on your phone (much less the other person’s) is doing.
Here's one concern -- NeuralHash is finding false positives with images that look very similar. What if an underage girl (or an over-age one) takes a selfie and the pose or background features are similar enough to trigger a false collision? Then Apple goes to a manual step where they are able to look at a low-res version of somebody's private photos, as I understand it.
In my opinion that last step is not okay. I suppose the 30-image threshold is a mitigating factor, but really, imo, Apple is making their problem into my problem. I want to purchase from a company that offers me the peace of mind of not even having to think about such concerns; price isn't an obstacle. If Apple can't cater to my needs I hope another company will.
What we know from this well-written and helpful article: the false positive rate Apple told us their algorithm had seems to be accurate. If a machine learning model is extracted from the OS it ships with, it will be much easier to generate adversarial attacks.
For example, a neural net's cost function is just a multivariate function with the weights as its input. To figure out how to move those weights (positively or negatively), the gradient of the function is calculated and the weights are nudged in the opposite direction (the gradient is the direction of the largest growth of a function, and we are trying to minimize the cost). Now assume we are given a cost function and the weights are held constant, so the input is the image. We can take the gradient of the cost function with respect to the image pixels and see how we should nudge those to push the output wherever we want. Apple will absolutely need to protect against adversarial attacks for this to be viable. I'm hopeful.
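As a purely illustrative sketch of that idea: hold a model fixed and optimize the image pixels so the model's output moves toward a chosen target embedding. The tiny network below is a stand-in (it is not NeuralHash), and the loss, optimizer, and step count are arbitrary assumptions:

import torch
import torch.nn as nn

# Stand-in "perceptual hash" network, kept fixed during the attack.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 16),
)
model.eval()

image = torch.rand(1, 3, 64, 64, requires_grad=True)   # attacker's source image
target = model(torch.rand(1, 3, 64, 64)).detach()      # embedding we want to collide with

opt = torch.optim.Adam([image], lr=1e-2)                # optimize the pixels, not the weights
for _ in range(200):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(image), target)
    loss.backward()                                     # gradient with respect to the pixels
    opt.step()
    image.data.clamp_(0, 1)                             # keep pixel values in a valid range

print(loss.item())   # should be far smaller than at the start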
> This is a false-positive rate of 2 in 2 trillion image pairs (1,431,168^2). Assuming the NCMEC database has more than 20,000 images, this represents a slightly higher rate than Apple had previously reported. But, assuming there are less than a million images in the dataset, it's probably in the right ballpark.
The number of images in that database could well be far in excess of a million. According to NCMEC [1], in 2020 there were 65.4 million files reported to them, and "[s]ince the program inception in 2002, CVIP [child victim identification project] has reviewed more than 330 million images and videos."
Of course many of those were duplicates, but it would be entirely unsurprising if there were more than a million original files.
I have a suggestion for Apple. Maybe they could use this technique in a way that benefits all their customers by finally solving the problem of duplicate image imports in Photos.app. I have a couple of hundred duplicate images in my Library because of problems during local import from my iPhone into Photos.
> By taking advantage of the birthday paradox, and a collision search algorithm that let me search in n(log n) time instead of the naive n^2, I was able to compare the NeuralHashes of over 2 trillion image pairs in just a few hours.
I think you could just do "sort | uniq -c | sort -nr" on the neuralhash values to find the most frequently occurring ones pretty fast?
> In order to test things, I decided to search the publicly available ImageNet dataset for collisions between semantically different images. I generated NeuralHashes for all 1.43 million images and searched for organic collisions. By taking advantage of the birthday paradox, and a collision search algorithm that let me search in n(log n) time instead of the naive n^2, I was able to compare the NeuralHashes of over 2 trillion image pairs in just a few hours.
I don't know what the author means by "taking advantage of the birthday paradox". If they're referring to the "birthday attack" [0], I don't think it makes sense. The birthday attack is a strategy that helps you find a collision without hashing every single image, but he states that he already generated NeuralHashes for all 1.43 million images.
Furthermore, isn't there a simple linear time algorithm to detect collisions given that you already have all the hashes? Iterate over your NeuralHashes and put them into a hash table where the NeuralHash is the key, and the number of occurrences is the value. Whenever you insert something into a bucket and there's already something there, you have a neural hash collision.
1) The point of the birthday paradox is that even if any two given elements of a set are unlikely to collide, in a large set it is much more likely than intuition suggests that some pair of elements collides.
2) I'm assuming he did something like putting all the hashes in a list and sorting them, which is at least better than looking at each pair. As you say, it's not optimal, and it also doesn't seem particularly worth including in the otherwise very interesting post.
By "taking advantage of the Birthday Paradox" - he means that even though the chance of two random images colliding is ~ 1/1 trillion, if you have a set of size ~ sqrt(1 trillion) you have a good chance of having a collision amongst all pairs.
But did the birthday paradox inform his procedure in any way? I guess he wouldn't have even attempted to do this at all if he didn't think it was likely to find collisions.
> Apple's NeuralHash perceptual hash function performs its job better than I expected and the false-positive rate on pairs of ImageNet images is plausibly similar to what Apple found between their 100M test images and the unknown number of NCMEC CSAM hashes.
The more interesting point about hash collisions is probably less about accidental clashes and more about intentional clashes. If the hashes in the CSAM database were known publicly, and people began generating intentional clashes with innocuous images, and those images were on many phones, it could basically DOS whatever manual process Apple creates.
Basically it could become an arms race where, say, free speech advocates convince people to just keep and share some images and overwhelm the process. Then Apple adapts to new hashes, blocklists known false positives, and the cycle repeats.
Yeah I'm unclear on why this even requires a clever algorithm. I'd think that given the 1.4 million precomputed hashes, a simple/naive Python function (<10 lines) that builds a dict mapping hash -> (list of images with that hash) could surface all the collisions in a few seconds. (It's a cool article though! I'm glad someone tested this.)
Edit: I'm procrastinating so I tried it. It's 8 lines including IO/parsing and runs in 2.8 seconds on my laptop:
import collections

# Map each NeuralHash to the list of files that produced it.
hash_to_filenames = collections.defaultdict(list)
with open('hashes.txt') as f:
    for line in f.readlines():
        filename, hash = line.strip().split()
        hash_to_filenames[hash].append(filename)

# Any hash shared by more than one file is a collision (or an exact duplicate image).
dupes = {h: fs for h, fs in hash_to_filenames.items() if len(fs) > 1}
print(f'Done. Found {len(dupes)} dupes.')
(hashes.txt is from the zip in the github repo, and it finds 8865 dupes which looks almost right from the article text (8272 + 595 = 8867).)
The birthday paradox and the algorithm are not related; the birthday paradox is simply the phenomenon that even though there are several orders of magnitude more hashes than images in ImageNet, it is still likely that collisions exist.
The algorithm sounds like a simple tree search algorithm. Let's consider the naive case: traverse all images, and keep a list of hashes you have already visited. For every new image, you have to traverse all n hashes you have previously computed. Naively doing this check with a for loop takes O(n) time. You have to do this traversal for every image, therefore the total time complexity is O(n^2).
Fortunately, there is a faster way to check whether you have found a hash before. Imagine sorting all the previous hashes and storing them in an ordered list. A smarter algorithm would check the middle of the list, and check whether this element is higher or lower than the target hash. When your own hash is higher than the middle hash, you know that if your hash is contained within the list, it is contained in the top half. In a single iteration you have halved the search space. By repeating this over and over you can figure out if your item is contained within this list in just log_2(n) steps. This is called binary search. Some of the details are more intricate (e.g. Red-Black trees [1], where you can skip the whole sorting step) but this is the gist of it.
This all sounds way more complicated than it is in practice. In practice you would simply `#include <set>` and all the tree bookkeeping is done behind the scenes. The algorithm contained within the library is clever, but the program written by the author is probably <10 lines of code.
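Here's a small sketch of the sorted-list-plus-binary-search idea described above, using Python's bisect module (the C++ std::set approach is the same idea, with a balanced tree that also makes insertion logarithmic):

import bisect

seen = []   # sorted list of hashes encountered so far

def check_and_record(h):
    # Binary search: O(log n) to find where h would sit in the sorted list.
    i = bisect.bisect_left(seen, h)
    if i < len(seen) and seen[i] == h:
        return True          # already present: collision
    seen.insert(i, h)        # note: list insertion is O(n); a tree/set keeps this O(log n) too
    return False

hashes = ['a1b2', 'c3d4', 'a1b2', 'e5f6']
print([h for h in hashes if check_and_record(h)])   # ['a1b2']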
He means that even though the chance of two random images colliding is ~ 1/ 2 trillion, once you get up to a set of order sqrt(2 trillion) you have a good chance of having a collision amongst all pairs.
That explains the "birthday paradox" part, what I'm unclear on is the need for a "collision search algorithm" that isn't just "build a hashmap" which should take roughly O(N) time. (I suppose it could just be that, but I'm surprised it's even mentioned in that case. In my uncle(?) comment I wrote an 8 line Python implementation that runs in 3 seconds on my laptop.)
> Apple has tried to mitigate this by requiring two countries to agree to add a file to the list, but the process for this seems opaque and ripe for abuse.
They mean Russia and Belarus would need to agree to add a file to the list? Yeah, this is a very hard barrier to overcome! /s
> This is a false-positive rate of 2 in 2 trillion image pairs (1,431,168^2). Assuming the NCMEC database has more than 20,000 images, this represents a slightly higher rate than Apple had previously reported. But, assuming there are less than a million images in the dataset, it's probably in the right ballpark.
If the author were comparing 2 trillion pairs of pictures of people, or of children specifically, I think this false-positive rate would be different and arguably much higher. The reason is obvious: humans are similar in dimensions to each other and are much more likely to match, in the same way the hatchet and nematode matched.
I don't presume such a set of photos is easy to come by, but I wish the author had included details on the sample set.
I'd love to see this work extended; if you find additional collisions in the wild please submit a PR to the repo (please do not submit artificially generated adversarial images): https://github.com/roboflow-ai/neuralhash-collisions
For what it's worth, Apple claimed to find a _lower_ incidence of false positives when it used pornographic images in its test[1] (which makes sense; images containing humans are probably more aligned with what the model was trained on than nematodes)
> In Apple’s tests against 100 million non-CSAM images, it encountered 3 false positives when compared against NCMEC’s database. In a separate test of 500,000 adult pornography images matched against NCMEC’s database, it found no false positives.
Well, to know for sure whether that's lower we'd need to know the size of the NCMEC database. The rates are the same if the NCMEC database contains around 10,000 images.
Though knowing that roughly 1 in 30 million images generates a false positive is the most important figure, I suppose. Assuming 100 million iPhones with 1000 pictures each, that would generate some 3000 phones with one or more false positives [1] and a roughly 5% chance that some phone has at least 2 false positives.
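As a sanity check of those figures, taking the 1-in-30-million per-image false-positive rate, 1000 photos per phone, and 100 million phones from the comment above (all assumptions, not published numbers):

import math

p = 1 / 30e6     # assumed per-image false-positive rate
photos = 1000    # assumed photos per phone
phones = 100e6   # assumed number of phones

p_ge1 = 1 - (1 - p) ** photos
p_ge2 = p_ge1 - photos * p * (1 - p) ** (photos - 1)

print(phones * p_ge1)                  # ~3300 phones expected to have at least one false positive
print(1 - math.exp(-phones * p_ge2))   # ~5% chance that some phone has at least two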
> For what it's worth, Apple claimed to find a _lower_ incidence of false positives when it used pornographic images in its test[1] (which makes sense; images containing humans are probably more aligned with what the model was trained on than nematodes)
This is an important note. Is it the case that this algorithm is trained for humans or not? The 1-in-a-trillion false-positive rate might imply it was trained with a broader set.
500K is not a large enough dataset to determine that. The collision rate could plausibly be 1 in 500K (or even a bit higher) and have no collisions in the sample.
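Indeed, if the true rate were exactly 1 in 500K, a 500K-image sample would come up empty about a third of the time (a one-line check of the claim above):

rate = 1 / 500_000
samples = 500_000
print((1 - rate) ** samples)   # ~0.37: zero hits in the sample is entirely plausible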
If it's so easy to modify NeuralHashes, won't "CSAM networks" just rotate the hashes of their "collections"?
If you can make an innocent picture collide with a CSAM picture, presumably you can also edit a CSAM picture to have a random hash not in the database?
Through chatting and reading threads I get the impression that many folks think that pedophiles are all smart, extremely tech savvy, operating in networks that share information and collaborate to fool law enforcement. There may be some of that, but there are also plenty of Joe Schmoes who download or trade CSAM material without being especially clever about it. I remember when I was a teenager downloading porn I found more than a few probably-illegal photos just by accident*. If you want to find such images you can, it doesn't take an organized "network".
* edit - this was 25 years ago, admittedly, and much of what I downloaded came from randos on AOL. Things might be different now.
Err, most of these are not naturally occurring pairs since in each case the images differ by a human manipulation (resolution reduction, drawing an arrow, changing the watermark, changing the aspect ratio)---which I'm guessing is viewed as a feature not a bug by the designers of this system. The axe and the nematode comes closest, and even then, they are low-res and visually similar. What would be far more concerning is a hash collision between a picture of a piano and a picture of an elephant, but nothing like that is happening here.
Hey, OP here - while that is also true, I think the average observer would think this number _should_ be zero. The fact that any exist at all is surprising (but the fact that only two exist is also surprising; I expected to find either zero or lots when I started the script).
I guess I'd care if any of the collisions weren't of things that I can vaguely see as looking similar. It's not like a nail or a ski looks like child porn. I'm actually impressed. Who cares that they collide. The manufactured collisions are obviously a problem though if I understand correctly that it means CP could be hidden by making it collide with well-known (and likely whitelisted) images.
I’m honestly fine with Apple doing this if it’s for the stated reasons. But experience has shown me that this is essentially copyright infringement enforcement dressed up in some emotional blackmail. Of course the point where this became inevitable was when Apple began creating and distributing media.
So, for clarity, Apple did not publish the model/code for others to be able to use/test etc. Someone found a clever way to convert the stored model into an open format and therefore others with access to those files are able to experiment.
The world of forensics is something I'm not familiar with; a couple of questions:
- When a hash matches (correctly or incorrectly), how is said image reported? Is the matching image passed on to a human to verify, or something else?
- What is the survey size / who is subject to this CSAM net?
If they pass CSAM verified by hash on to human verification inside Apple, they break the law. Not even the FBI is allowed to do that. Only NCMEC is an allowed recipient under US federal law.
The FBI was given clearance to take over a website serving CSAM in order to catch more users of the site. As such, the FBI has technically distributed CSAM in the past.
Seems to be a misunderstanding between what the law appears to say and what the actual practice is. Law enforcement's interest is not served by trying to prosecute moderators or companies acting in good faith because they have CSAM in their possession.
There is a difference between moderators manually identifying illegal content in a stream of mostly-legal material and a process where content which has already been matched against the database and classified as almost-certainly-illegal is subjected to further internal review.
AFAIK moderators at other organizations are also only reviewing content that has already been flagged somehow. I don't think it makes a difference. It comes down to good faith. If the company follows the recommendations of NCMEC on handling the material (and NCMEC absolutely does provide those recommendations), I doubt they're in any danger at all.
Obviously you could not make the same argument yourself unless you were also a reputable megacorp. There are upsides to being king. In this case, NCMEC wants the actual perps in jail so they're not going to take shots at Apple or its employees on technicalities.
The chance of a match being CSAM is not almost certain, though. Further, Apple only gets a low-resolution version of the image. In any case, presumably such issues have been addressed, as neither the FBI nor NCMEC have raised a stink about it.
> The chance of a match being CSAM is not almost certain, though.
Not according to Apple. They're publicly claiming a one-in-a-trillion false positive rate from the automated matching. Either that's blatant false advertising or they're putting known (as in: more likely than not) CSAM in front of their human reviewers. Can't have it both ways.
> Further, Apple only gets a low-resolution version of the image.
Which makes zero difference with regard to the content being illegal. Do you think they would overlook you possessing an equally low-resolution version of the same photo?
> In any case, presumably such issues have been addressed, as neither the FBI nor NCMEC have raised a stink about it.
Selective enforcement; what else is new? It's still a huge risk for Apple to take when the ethically superior (and cheaper and simpler) solution would be to encrypt the files on the customer's device, without scanning them first, and store the backups with proper E2E encryption such that Apple has no access to or knowledge of any of the content.
The demonstrated ability to produce collisions at will should be an instant show-stopper for this feature.
If a bad actor can send targets innocent images which hash to CSAM, you essentially have an upgraded secret swatting mechanism that people will use to abuse others.
Wouldn’t the most likely attack be an angry (ex) spouse with physical access to a target’s phone placing images on it (and uploading to iCloud)? I haven’t seen this seemingly likely scenario discussed much.
- Those images would either have to be actual CSAM (illegal to distribute in the first place and generally hard to get)
- or fake collision images are placed instead, and all that would do is push the multiple fake CSAM images to Apple's human reviewers, who would then likely mark them as false positives/spam. If the Apple reviewer makes a mistake and does report it as CSAM, then the only risk (in the US) is law enforcement having probable cause for search and seizure to look for other CSAM.
It didn’t seem to take long for the weights for Apple’s network to be discovered. And I suppose they must send the banned hashes to the client for checking too. So I expect that list will be discovered and published soon too (unless they have some way to keep them secret?) I think one important question is: how reversible is Apple’s perceptual hash?
For example, my understanding of Microsoft’s PhotoDNA is that their perceptual hash has been reverse-engineered and that one could go backwards from a hash to a blurry image. But also it is very hard to get the list of PhotoDNA hashes for the NCMEC database. In other words, are Apple unintentionally releasing enough information to reconstruct a bunch of blurry CSAM?
Right, obviously the hashing isn't going to be injective, and therefore there are lots of silly images that hash to a given value. The question is whether it is possible to efficiently find plausible images with a given hash.
Think more like a “deep dream” than these adversarial attacks.
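For a sense of what a "deep dream"-style preimage search could look like in practice, here is a minimal sketch. It assumes you have the extracted network loaded as a differentiable PyTorch module (`model`) that returns pre-threshold outputs, one per hash bit, and a hypothetical 0/1 `target_bits` tensor; those names and shapes are assumptions, not Apple's API.

    # Hedged sketch of a gradient-based preimage search against a differentiable
    # perceptual hash. `model` stands in for the extracted NeuralHash-style network
    # (returning pre-threshold outputs, one per hash bit); `target_bits` is a
    # hypothetical 0/1 tensor for the hash you want to hit. Both are assumptions.
    import torch

    def find_preimage(model, target_bits, steps=2000, lr=0.05):
        x = torch.rand(1, 3, 360, 360, requires_grad=True)   # start from noise
        signs = target_bits.float() * 2 - 1                   # map {0,1} -> {-1,+1}
        opt = torch.optim.Adam([x], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            logits = model(x.clamp(0, 1))                     # pre-threshold hash values
            # Hinge loss: push every output past its target sign with a small margin.
            loss = torch.relu(0.1 - logits.flatten() * signs).sum()
            loss.backward()
            opt.step()
        return x.detach().clamp(0, 1)

Whether the result looks like a plausible photo rather than structured noise is exactly the open question; starting from a natural image and adding a perceptual regularizer, rather than starting from noise, is where the "deep dream" part would come in.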
You guys are helping Apple not to fuck up their implementation by finding bugs. It would be great if these collision posts happened after they had fully rolled out the update.
> it's not obvious how we can trust that a rogue actor (like a foreign government) couldn't add non-CSAM hashes to the list to root out human rights advocates or political rivals. Apple has tried to mitigate this by requiring two countries to agree to add a file to the list, but the process for this seems opaque and ripe for abuse.
If the CCP says "put these hashes in your database or we will halt all iPhone sales in China", what do you think Apple is going to do? Is anyone so naive that they believe the CCP wouldn't deliver such an ultimatum? Apple's position seems to completely ignore recent Chinese history.
Related, the Indian Government (Telecom Department) bullied Apple into building an iOS feature for reporting phone calls and SMS by threatening to stop iPhone sales in India.
I'm so done. I'm sorry to dump a pointless rant like this on HN but... what the hell is going on these days? Nobody seriously seems to care about legitimate privacy concerns anymore. If I were in a position of power, like being CEO, CTO, or even just an engineer on the team at Apple that implemented this, I'd do EVERYTHING to make sure that my power is in check and that I'm not pushing a fundamentally harmful technology.
I just feel so lost and powerless these days, I don't know how much longer I can go on when every piece of technology I own is working against me - tools designed to serve a ruling class instead of the consumer. I don't like it one bit.
What is going on is that reality is slapping some techno-utopians in the face, and they are shocked, shocked, that governments are more powerful than businesses.
That's not at all what the lefty geeks learned by reading Chomsky or what the righty geeks learned by reading Heinlein.
All along these people thought algorithms and protocols (e.g. bitcoin and TCP/IP) would somehow be a powerful force that would cause governments to fall on their knees and let people evade government control. After all, it's distributed! You can't stop it!
Well, that was all very foolish, because they mistook government uninterest in something for the equivalent of government being powerless to control it, and when governments did start taking an interest in something, it turns out that protocols and algorithms are no defense against the realities of political power. It is to the field of politics, and not the field of technology, that one must turn in order to increase collective freedoms. Individual freedom can be increased by obtaining money or making lots of friends, but collective freedom cannot be increased this way, it can only be increased by organizing and influencing government.
Bingo. I wish I could upvote this comment more. All the geeks get distracted by words like “cloud” or “virtual” and forget that all this stuff we depend on has a physical presence at some point in the real world. That physical presence necessitates humans interacting with other humans. Humans interacting with humans falls squarely in the “things governments poke their noses into” bucket. It’s like the early days of Napster when people were all hot for “peer to peer”, as if that tech was some magic that was going to make record labels and governments throw up their hands over copyrights.
Maybe we could make this framework future-proof by using blockchains? Somehow? Maybe it can use blockchains, or it can be stored on a blockchain, or maybe both at the same time. Surely that will help society in some nonspecific, ambiguous manner.
Remember the people who, decades after the invention of the Internet, kept on insisting that it was useless and only for porn addicts?
Remember the people who, after the invention of the phone, insisted that it was a nice trick but probably only useful for a few businessmen with dictation needs?
Yeah, they all had to change their tone at some point, under the shame of having been wrong for so long.
> All along these people thought algorithms and protocols (e.g. bitcoin and TCP/IP) would somehow be a powerful force that would cause governments to fall on their knees and let people evade government control. After all, it's distributed! You can't stop it!
But that's the underlying problem here. Apple isn't a standardized protocol or a distributed system. It's a monolithic chokepoint.
You can't do this with a PC. Dell and HP don't retain the ability to push software to hardware they don't own after they've already sold it and against the will of the person who does own it.
People pointed out that this would happen. Now it's happening. What a surprise.
Dell ships laptops with tons of Dell software, as well as tons of third-party software. Do you really think that, if they wanted to, they couldn't just update one of those pieces of software to enable remote installs?
Hell, Dell has shipped more than one bug that allowed attackers administrator-level access or worse, I wouldn't put it past them to come up with some kind of asinine feature that not only lets them push new software/drivers/whatever to the machine, but lets attackers do so as well.
> All along these people thought algorithms and protocols (e.g. bitcoin and TCP/IP) would somehow be a powerful force that would cause governments to fall on their knees and let people evade government control. After all, it's distributed! You can't stop it!
The internet and its design and associated protocols were designed to work around external forces - a nuclear attack or natural disaster. It was never designed to be government-proof. People who thought that would be the case were being idealistic and naive.
If you want real change in the world, as you said, you have to affect the political world, which is an option available to any citizen or corporation who can spend millions on lobbyists.
There are many of us that DO care! Unfortunately, even though we are many, we are still a small minority among the general population, or probably even among software developers.
Convenience and fashion tend to trump security and principles for most people. (Oftentimes, I'm one of those people as well, though I try not to be. It's exhausting to be an activist 100% of the time. But let's keep at it!)
I'm as surprised as you are that a giant like Apple doesn't just tell them "go ahead, ban iPhones, see how popular they'll become" to someone as powerless as the government of India. It would be a huge free publicity campaign for them in the rest of the world while the public in India would either put pressure on their government or buy iPhones via import websites.
For additional fun, strike a deal with the #2 non government owned carrier in whichever country you do this to. Offer the iPhone at a special rate for a few months. They would kill the government telco while selling record numbers of phones with free publicity. And at the same time scare any other government into not trying this kind of stunt with Apple ever again.
I wonder how Apple's shareholders would react if the company threw away a market that was worth $1.8bn in revenue in 2020.
Then there's China; 17% of Apple's global revenue, $43.7bn. I don't think shareholders would much appreciate that.
> the public in India would either put pressure on their government or buy iPhones via import websites
The iPhone had 2.97% market share in India in April 2021, down from a high of 3.54% in June 2020. I don't think the people who wanted to buy iPhones but couldn't would be able to put any significant amount of political pressure on politicians.
Rich people would just import them from somewhere else like they always have before, and everyone else would switch to some available Android phone that had the modifications that the government wanted.
How much fun would that be for customers if India then decided to confiscate every iPhone it encounters within India (maybe excepting tourists, but maybe not)?
> I don't know how much longer I can go on when every piece of technology I own is working against me - tools designed to serve a ruling class instead of the consumer.
This made me think of the Butlerian Jihad in Dune:
"Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them."
I'd say the typical remedy that societies have adopted for these sorts of things is legislation, though regulatory capture[1] is an issue that blocks the way.
Buy Sony Xperia 10 ii, then buy an official SailfishOS package from shop.jolla.com and flash it over Android 11 - enjoy a polished mobile OS without snitches, which you fully control, a real pocket computer instead of a future pocket policeman.
Maybe I did. But what difference does it make? There's plenty of other instances where Apple has reportedly been bullied into action or inaction (being dissuaded from implementing E2EE for iCloud is one example). I've really just reached a breaking point and I'm sorry if logic does not apply.
I'm amazed at how much misinformation is spread. The feature we are talking about here is for reporting spam numbers, which is done by the user, not automatically. This is widely available on Android already.
Correct me if I'm wrong but this feature needs an app install from the app store.
They added a feature which is off by default and allows a user to select a supported installed app to use as a spam reporting app.
IMHO this is great, I wish more countries would enable this feature. Something like 95% of my phone calls are spam, to the point where I just don't answer the phone anymore unless they're in my contacts list. Users being able to actually report them as spam might actually result in this BS finally stopping.
I think few people are making the appropriate parallel. What we’re looking at is not necessarily government overreach, but fascism.
When the hell did it become Apple’s job to do this? Apple is not a branch of law enforcement. The government needs warrants for stuff like this. We are merging corporate and government interests here. Repeat after me, Apple is not supposed to be a branch of law enforcement.
It also says a lot about us, that we are beholden to a product. We have to ditch these products.
> When the hell did it become Apple’s job to do this?
Apple provided a pathway, however unintentionally, to greater power. And those in power used their existing authority to gather even more for themselves, as they always do.
Like drops flow into streams into rivers into oceans, power aggregates at the top until regime change spills it back to the ground.
I get about 10 every single fucking day, super annoying, and they are spoofing numbers too: I get calls that appear to come from hotels and restaurants in my address book, yet the calls aren't actually coming from there; I hear a series of clicks and then someone asks me about my auto insurance... The moment I hear clicks now, I just hang up, if I answer at all. I am ready to simply give up phones entirely. Fucking complete failure by the telecoms; their entire industry is a consumer failure.
Presumably Apple would be afraid that, say, the EU becomes suspicious, issues a court order to obtain the hashes, notices they cannot audit the CCP hashes, pointedly asks "what is this", becomes absolutely livid that their citizens are spied on by a country that is not them, fines Apple out the wazoo, then extradites whoever is responsible and puts them in prison. I mean, China's not the only player in this. Putting extra hashes to surveil Chinese citizens, yeah, they might do that, but it'd be suicide to put them anywhere else.
The db is encrypted and uploaded to user devices. If each country gets a different db, the payload will be different in each country, which does not make sense if it's all supposed to be CSAM. So Apple would likely just say "these were mandated by the US government for US citizens," punting the ball in their court, unless they are forbidden to say so, in which case they'll say nothing, but we all know what it means. That's when you know you should change phones and stop using all cloud services, because obviously all cloud services scan for the same thing.
On the flip side, though, at least Apple will have given us a canary. And that's why I don't think Apple will be asked to add these hashes: if the governments don't want their citizens to know what's being scanned server side, pushing the equivalent data to clients would tip their hand. They might just write Apple off as a loss and rely on Google, Facebook, etc.
I feel that it's the kind of scheme that requires too much cooperation from too many people and organizations with conflicting incentives. It's possible some countries would not want the hashes from certain other governments in the db at all. And then what? I may be wrong, but I also believe we can know how many hashes are in the db, which means that if it contains extra hashes from dozens of governments, it would become suspiciously large relative to how many CSAM images we know exist. Furthermore, in this scenario the db cannot be auditable, so the scheme falls apart as soon as some rogue judge decides to order an audit.
I honestly don't think Apple wants to deal with any of that crap and that they would rather silently can the system and do exactly what everybody else does than place themselves in the line of fire when their own unique trademark system is being abused.
Would they rather deal with the CCP shutting off iPhone sales in China? History has shown that the CCP is willing to do that if it comes down to it. (I remind you that at one time, Google was a primary search engine in China.)
It’s blinded in the cryptographic sense. It’s a specific term. I would go into detail, but <reasons>.
Suffice to say, unless you provide proof, I am reasonably confident there’s no way to verify the hash db doesn’t contain extra hashes other than the CSAM hashes provided by the US government. But I’ve been wrong many times before.
Well, first of all, it's not provided by the US government. It's a non-profit, and Apple has already said they're going to look for another db from another nation and only include hashes that are in the intersection of the two, to prevent exactly this kind of attack.
If what you mean by blinded is that you don't know what the source image is for the hash, that's true. Otherwise Apple would just be putting a database of child porn on everyone's phones. You gotta find some kind of balance here.
What do you mean you can't verify it doesn't contain extra hashes? Meaning that Apple will say "here are the hashes on your phone," but secretly have extra hashes they're not telling you about? Not only is that the kind of thing security researchers would quickly find, it also assumes a very sinister posture from Apple, where they only tell you half the story. If that were the case, then why offer the hashes at all? It's an extremely cynical take.
The reality is that the complaints about this system started with this specific implementation, and then, as details get revealed, they've shifted to future hypothetical situations. I'm personally concerned about future regulations, but those regulations could/would exist independently of this specific system. Further, Dropbox, Facebook, Microsoft, Google, etc. all have user data unencrypted on their servers and are just as vulnerable to said legislation. If the argument is that this is searching your device, well, the current implementation only searches what would be uploaded to a server anyway. If you suggest that could change to anything on your device due to legislation, wouldn't that happen anyway? And then what is Google going to do... not follow the same laws? Both companies would have to implement new architectures and systems for complying.
I'm generally concerned about the future of privacy, but I think people (including myself initially) have gone too far in losing their minds.
At no point is anyone besides Apple able to view any NeuralHash hashes from the CSAM database. You can verify the database is the same on all iPhones, but you are not able to look at any of the hashes.
Right, but perhaps I'm not understanding what the complaint is here.
Is the issue that you want to take a given photo that you believe the CCP or whoever is scanning for, compute a NeuralHash, and then see if that's in the db? Or are you wanting to see if your db is different from other phones' dbs? Because I think the latter is the one that most people are concerned about.
Having just read the CSAM summary pointed to by a child comment here, I now have a better understanding of what you meant by blinded. But I don't think that changes any of my points.
There are many functions to which cryptographic blinding is applied, but they each rely upon multiple parties to compute the function in question. In that way, the input and output are blinded to a single party.
Yeah, but if they tell us they're doing that, then it's pretty obvious what they're up to. And if they don't tell us they're doing that, but do it anyway - then they have to perpetually pay every developer involved in that upgrade enough money to keep their mouth shut indefinitely - knowing that the developers know that APPLE knows how much they'd lose in fines if they got caught. Which is an unreasonably large liability, IMO.
It would be good, I think, if people read Apple's threat assessment before calling it "pretty trivial":
> • Database update transparency: it must not be possible to surreptitiously change the encrypted CSAM database that’s used by the process.
> • Database and software universality: it must not be possible to target specific accounts with a different encrypted CSAM database, or with different software performing the blinded matching.
I mean, you can argue that Apple's safeguards are insufficient etc., but at least acknowledge that Apple has thought about this, outlined some solutions, and considers it a manageable threat.
ETA:
> Since no remote updates of the database are possible, and since Apple distributes the same signed operating system image to all users worldwide, it is not possible – inadvertently or through coercion – for Apple to provide targeted users with a different CSAM database. This meets our database update transparency and database universality requirements.
> Apple will publish a Knowledge Base article containing a root hash of the encrypted CSAM hash database included with each version of every Apple operating system that supports the feature. Additionally, users will be able to inspect the root hash of the encrypted database present on their device, and compare it to the expected root hash in the Knowledge Base article. That the calculation of the root hash shown to the user in Settings is accurate is subject to code inspection by security researchers like all other iOS device-side security claims.
> This approach enables third-party technical audits: an auditor can confirm that for any given root hash of the encrypted CSAM database in the Knowledge Base article or on a device, the database was generated only from an intersection of hashes from participating child safety organizations, with no additions, removals, or changes. Facilitating the audit does not require the child safety organization to provide any sensitive information like raw hashes or the source images used to generate the hashes – they must provide only a non-sensitive attestation of the full database that they sent to Apple.
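For intuition on what that user-facing root-hash check could amount to, here is a minimal sketch. It assumes (purely an assumption; Apple has not published the exact construction) a plain binary SHA-256 Merkle tree over the encrypted database entries:

    # Minimal sketch of a root-hash check. A simple binary SHA-256 Merkle tree is
    # assumed here; Apple has not published the actual construction.
    import hashlib

    def merkle_root(entries):
        level = [hashlib.sha256(e).digest() for e in entries]
        while len(level) > 1:
            if len(level) % 2:                     # duplicate the last node on odd levels
                level.append(level[-1])
            level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                     for i in range(0, len(level), 2)]
        return level[0]

    # A user or auditor would compare this against the published value, e.g.:
    # assert merkle_root(db_entries).hex() == published_root_hex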
That's a lovely technical solution! It will last right up until the CCP rubber-hoses[0] Apple with "add these hashes or stop selling iPhones in China".
Apple will cave. They've demonstrated the technology works. The CCP is paying attention. Look for this feature to be mandated across all phones soon. Pandora's box has been opened.
> extradites whoever is responsible and puts them in prison
If we lived in a world where the people who make these kind of decisions for companies were actually accountable in this way, life might be better in a lot of different ways. But sadly we do not.
> citizens are spied on by a country that is not them
I thought countries often have under-the-table agreements with one another to explicitly spy on each other's citizens, since it's illegal for a country to spy on its own citizens. It's illegal for the other country too, but it's a lot easier to turn a blind eye to it.
I thought it was already common knowledge that China puts in different hardware backdoors for computers destined to different countries. I remember a while back a news story where China accidentally shipped a box of phones backdoored for China into the US.
I think that you overestimate the EU reaction. Every few years we learn that our European leaders and some citizens have again been spied on by foreign powers, such as the US, and absolutely nothing ever happens.
A friend in the military told me years ago France was the number one hacker of the US gov. It goes both ways.
This may have shifted over time as China, Russia, NK, Iran increase their attacks, but it doesn’t diminish the fact that the EU is also hacking the US without repercussions.
The US is an ally and it is somewhat harder to punish a nation state than a company. Why would Apple take the risk? China can't exactly reveal that they are banning an American company for not spying on American citizens, and it's not clear what convincing pretext they could provide instead, so I don't think they would actually go through with a ban and Apple would probably just call their bluff.
If the CCP says “put this arbitrary software into your next iPhone software update or we will halt all iPhone sales in China,” what do you think Apple is going to do? Isn’t the answer to both questions the same?
If your wifi is enabled and the phone is attached to power, Apple updates it without any request. I've been trying not to update my iPhone, but recently I left the wifi on while charging and it updated.
It's a fair question, but I think the answer is no: the questions are not the same.
As much as Apple wants access to the Chinese market, it would (presumably) draw a line at some point where it would (presumably) have to choose between that market and the US market, if only because the latter is both its legal domicile and the source of most of its talent.
Version A: CCP wants to exploit the hash database, there are lots of ways to do that, bullying Apple is one, any other way gives Apple a "we are looking into it" excuse. "We must comply with local laws, but we will not change our software bla bla."
Version B: CCP wants to exploit iOS, only way to do it is to bully Apple, this forces Apple's hand and very possibly Apple moves production (not just sales) out of China because they no longer trust they will be offered "plausible deniability."
I'm sure there are lots of reasons for that absurd cash reserve, but my best guess is it's to cover the eventuality of B. above; Apple talking about that publicly would be tricky.
> very possibly Apple moves production (not just sales) out of China
As a matter of fact, I am not sure that would be possible: It might well be that no other country has the capacity (machines and labour) to churn out that many iPhones. Would be interesting to hear if anyone has insight on that. (Tim Cook presumably knows...)
This has been Apple's line for quite a while, but over the last ten years I can't believe Mr Cook has not come up with a Plan B, given that the volatility in US-China relations is much more likely to affect iPhones than most other Walmart goods.
I admit it's total speculation, but I think the massive cash reserves are for that: to weather a disruption in production facilities and move production to a more US-friendly location.
It apparently wasn't hard to "discover" the fact that this CSAM database can and will change over time. In fact, Apple explained this in detail as well as how they are attempting to avoid the problem of governments abusing the system. Are you suggesting that a different software update might be even easier to discover?
The CCP already runs iCloud themselves in country so this is a bit irrelevant. (Though I think this kind of capitulation to authoritarian countries is wrong, personally: https://zalberico.com/essay/2020/06/13/zoom-in-china.html)
This policy really needs to be compared to the status quo (unencrypted on-cloud image scanning). When you compare client-side hash checks in transit, which allow for on-cloud encryption and only occur when using the cloud, it's hard to see an argument against it that makes sense given the current setup.
The abuse risk is no higher than the status quo and enabling encryption is a net win.
These scenarios sound rather like "the wrong side of the airlock" stories[1]. Why would China go through an elaborate scheme with fake child-porn hashes, when it can already arrest these people on made-up charges, and simply tell Apple to provide the private key for their phones, so that they can read and insert whatever real/fake evidence they want?
Because they don't know who to arrest yet. The idea isn't to fabricate a charge, it's to locate people sharing politically sensitive images that the government hasn't already identified.
> Because they don't know who to arrest yet. The idea isn't to fabricate a charge, it's to locate people sharing politically sensitive images that the government hasn't already identified.
And maybe even identify avenues for sharing that they haven't already identified and monitored/controlled (e.g. some encrypted chat app they haven't blocked yet).
China does not really need Apple to do much. They already make installation of some apps mandatory by law. Also, some communication must be done with WeChat and so on. They have pretty good grip already.
> They already make installation of some apps mandatory by law. Also, some communication must be done with WeChat and so on.
Can you give some examples of this on the iPhone?
Also, it seems like on the spying front [1] (at least publicly, with Apple), they've preferred more "backdoor" ways to get access over more overt ones, so this scanning feature might encourage asks that they wouldn't have otherwise made.
[1] this contrasts with the censorship front, where they've been very overt
Agreed. If China can force Apple to do almost anything by threatening to ban iPhone sales, why bother with fake CSAM hashes? That just adds an extra step. It's not like the Chinese government needs to take pains to trick anyone about their attitude toward "subversive" material.
In both cases people are stretching to come up with hypothetical scenarios about how these systems could be abused by a government ("they could force Apple to insert non-CSAM hashes into their database" or "they could force Google to insert a backdoor into your app") while completely ignoring the elephant in the room: if a government wanted to do these things, they already have the power to do so.
If your concern is that a government might force Apple or Google to do X or pull product sales in their country, whether Apple performs on-device CSAM scanning vs scanning it on their servers, or whether Google signs your app vs you signing it doesn't materially change anything about that concern.
The outrage around this particular situation is even more confusing to me because you can opt out entirely by disabling iCloud Photos, and if you were already using iCloud Photos then the scanning was already happening on Apple's servers anyway, so the only actual change is that the scan now occurs before instead of after the upload.
Exactly. Apple can already ship literally any conceivable software to iPhones. Do people really think their plan was to sneak functionality into this update and then update the CSAM database later, and they would have gotten away with it if it weren't for the brilliant privacy advocates pointing out that this CSAM database could be changed over time? That's pretty ludicrous. If the Chinese government wanted to (and thought it had sufficient leverage over Apple), they could literally just tell Apple to issue a software update that streams all desired private data to Chinese government servers.
Not quite. Those are still ostensibly servers located in China but not directly controlled by the government (edit: apparently the hosting company is owned by Guizhou provincial government). But yes, this is precisely my point. Any slippery slope argument about Apple software on iPhones is equivalent to any conceivable slippery slope argument about Apple software on iPhones. If you're making one of these arguments, you're actually just arguing against Apple having the ability to issue software updates to iPhones (and by all means, make that argument!).
China's laws are such that there's no need for them to obtain a warrant for data housed on servers of Chinese companies. Not only do they not need a warrant but companies are required to facilitate their access. While the servers aren't controlled by the Chinese government, government law enforcement and intelligence agencies have essentially free access to that data.
> ostensibly servers located in China but not directly controlled by the government
"ostensibly" is the key word there. If the datacenter is physically located in China, then there's a CCP official on the board of the company that controls it.
So your argument boils down to since Apple can already install software without us knowing, we shouldn't worry about a new client-side system that makes it substantially easier for nation states to abuse? I don't find that argument the least bit compelling.
I’m not saying that we shouldn’t be concerned with Apple actually launching things that are bad. I’m saying we shouldn’t make arguments of the form “this isn’t bad yet, but they could change this later to make it bad.” Because obviously they can change anything later to be bad. If the system as currently described is a violation of privacy, or can be abused by governments, etc. then just make that argument.
Because Apple has already built that functionality, and it exists? What alternative dragnet currently exists to identify iOS users who possess certain images? This would be code reuse.
China or any government adding collisions would be to use Apple's system as a dragnet to find users possessing the offending images.
The way it would work is the government in question would submit legitimate CSAM, but modified to produce a collision with a government target image. Looking at the raw image (or a derivative), a reviewer at Apple or ICMEC would see a CSAM image. The algorithm would see the anti-government image. So Apple scans Chinese (or whoever's) citizens' libraries, finds "CSAM" and reports them to ICMEC, which then reports them to the government in question.
Every repressive government, and some notionally liberal governments, will eventually do this. It is likely already happening with existing PhotoDNA systems. The difference is that PhotoDNA is used by explicit sharing services, whereas Apple's new system will search any photo in a user's library regardless of whether it is explicitly "shared".
> So Apple scans Chinese (or whoever's) citizens' libraries, finds "CSAM" and reports them to ICMEC, which then reports them to the government in question.
If Apple finds that a particular hash is notorious for false positives, they can reject it / ask for a better one. And they’re not scanning your library; it’s a filter on upload to iCloud. The FUD surrounding this is getting ridiculous.
Look, I said it in another post, it is not Apple’s job to act as an arm of law enforcement. The same way it is not either of our jobs to be vigilante sheriffs and police the streets.
We’re talking about a company that makes phones and computers, and sells music and tv shows via the internet. Does that matter at all?
How about this. All car manufacturers must now wirelessly transmit when the driver of the car is speeding immediately. How about that?
Let’s just go all out and embed law enforcement into all private companies.
This is fascism, the merging of corporations and the government.
Have we established that a US NGO is accepting "CSAM" hashes from China or that they are cooperating with them at all? That seems unlikely and Apple hasn't yet announced plans with how they're going to scan phones in China, I mean wouldn't China just demand outright to have full scanning capabilities of anything on the phone since you don't have any protection at all from that in China?
> Have we established that a US NGO is accepting "CSAM" hashes from China or that they are cooperating with them at all?
I believe Apple's intention is to accept hashes from all governments, not just one US organization. One of their ineffectual concessions to the criticism was to require two governments provide the same hash before they'd start using it.
China can definitely find a state government requiring some cash injection to help push the hash of a certain uninteresting square where nothing happened into the db
Sure, but Apple receives far less backlash if the system is applied to all phones and under the guise of "save the children". This would allow Apple to accommodate any nation state's image scanning requirements, which guarantees their continued operation in said markets.
The main announcement was Apple was getting hashes from NCMEC but they also listed ICMEC and have said "and other groups". Much like the source database for the image hashes the list of sources is opaque and covered by vague statements.
Maybe, but it probably stretches farther back than that, maybe even to before sliced bread or cool beans. Ten years before The Hitchhiker's Guide there was a computer, HAL, who wouldn't open the airlock for a particular astronaut.
They wouldn't. They would force Apple to add hashes for things that the CCP doesn't like, such as Winnie the Pooh memes, and turn Apple's reporting system into yet another tool to locate dissidents. How would Apple know any different? Here are some hashes; they are for CSAM, trust us. They built a framework where they will call the cops on you for matching a hash value. Once governments start adding values to the database, Apple has no reasonable way of knowing what images those actually relate to. Apple themselves said they designed it so you couldn't derive the original image from the hash. They are setting themselves up to be an accessory to killing political dissidents.
I would expect Apple to say the same thing if the CCP proposed a system of scanning devices last month. I fail to see how this system changes the calculus for how Apple will deal with authoritarian governments.
If Apple could stand up to them before this system, why can't they stand up to them with this system?
The difference is the ease with which they can demur. Before, it would be a whole heck of a lot of new, additional work. They also have the problem of actually introducing it without being noticed, or having to come up with some cover for the new behavior.
Now? Well now it's real simple. It will even conveniently not expose the actual images it's checking for. Apple now has significantly less ability to rationally reject the request based on the effort and difficulty of the matter.
Even Apple's own reasons to reject the request have decreased. It would have legitimately cost them more to fulfill this request before, even if China did want to play hardball. Now, they have greater incentive to go along with it.
As far as I'm aware, this system is not new. It is only moving from the cloud to the local device. If the cloud was already compromised, which it seems like it would be in your logic since all the same reasoning applies, I don't understand the complaints about it moving locally.
In my mind there are two possible ways to view this.
We could trust Apple last month and we can trust them today.
We couldn't trust Apple last month and we can't trust them today.
I don't understand the mindset that we could trust Apple last month and we can't trust them today.
>It is only moving from the cloud to the local device.
But isn't that exactly why this is such a big deal? It sets a precedent that it's ok that devices are scanning your local device for digital contraband. Sure, right now it's only for photos that are going to be uploaded to iCloud anyway. But how long before it scans everything, and there's no way to opt out?
I don't see this so much as a question of Apple's trustworthiness; I see it as a major milestone in the losing battle for digital privacy. I don't think it will be long before these systems go from "protecting children from abuse" to total government surveillance, and it's particularly egregious that it's being done by Apple, given their previous "commitment to privacy".
>But how long before it scans everything, and there's no way to opt out?
Do we think this is detectable? If yes, then why worry about it if we will know when this switch is made? If not, why did we trust Apple that this wasn't happening already?
That is the primary thing I don't understand, this fear rests on an assumption that Apple is a combination of both honest and corrupted. If they are honest, we have no reason to distrust what they are saying about how this system functions or will function in the future. If they were corrupted, why tell us about this at all?
It feels like you're viewing this as a purely hypothetical question and ignoring reality. No company is 100% good or bad, and it doesn't make any sense to force all possible interpretations into good/bad.
>If not, why did we trust Apple that this wasn't happening already?
I do not trust Apple. I don't really trust any major tech company, because they put profit first, and everything else comes second. I believe that a company as large as apple is already colluding with government(s) to surveil people, because if a money-making organization is offered a government contract that involves handing over already collected data, and it was kept secret by everyone involved, why would they refuse? I know that's very cynical, but I can't see it any other way.
But that's beside the point, which is that what apple is doing is paving the way for normalization of mass government surveillance on devices that we're supposed to own.
>If they were corrupted, why tell us about this at all?
So that we can all get used to it, and not make a big fuss when google announces android will do the same thing. It's much easier to do things without needing to keep them a secret. This is in no way only about apple, they're just breaking the ice so to speak.
>I do not trust Apple. I don't really trust any major tech company, because they put profit first, and everything else comes second.
Then you should have never been using a closed system like Apple in which they had control over every aspect of it. That is my fundamental point. I'm not saying you should trust Apple. I am saying this shouldn't have changed your opinion on Apple.
>So that we can all get used to it, and not make a big fuss when google announces android will do the same thing. It's much easier to do things without needing to keep them a secret. This is in no way only about apple, they're just breaking the ice so to speak.
I just need more evidence before I believe a global conspiracy that requires the coordination between both adversarial governments and direct business competitors.
You're viewing trust in Apple as a binary choice. It is not. Trust is a spectrum, like most things. You need to get away from that digital thinking. It's the whole reason we have to challenge government and be suspicious of it. It's the same with companies.
I view trust more as a collection of binary choices than as one single spectrum. Do I trust Apple to do X? There are only two possible answers to that (or I guess three, if we include "I don't know"). If the answer isn't binary, then X is too big.
In this instance the specific question is "Do I trust Apple to be honest about when they scan our files?". I don't know why this news would change the answer to that question.
Are we going to need to reverse engineer every single Apple update to make sure the feature hasn't creeped into non-iCloud uses? Is the inevitable Samsung version of this system going to be as privacy-preserving? How are we sure the hash list isn't tainted? All of these problems are solved by one principle: Don't put the backdoor code in the OS to begin with.
>It is only moving from the cloud to the local device.
That's the point. Yesterday someone posted a fantastic TSA metaphor where they are doing the same scans and patdowns but with agents permanently stationed in the privacy of your home where they pinkie promise it will only be before a trip to the airport and only checking the bags you will be flying with.
You know food poisoning is dangerous and you'll be safer with a food taster to make sure nothing you eat is spoiled. I'll just help myself to your domicile and eat your food to make sure it's all safe. I already made a copy of your keys to let myself in. It's for your own good.
> The difference is the ease with which they can demur.
If Apple can be cowed by China into adding fake CSAM hashes by threat of banning iPhone sales, they could be cowed to surveil Chinese citizens in the search for subversive material. It's no skin off China's back if it's harder for Apple -- they'll either make the demand or they won't. This changes basically nothing.
It's kinda true, but ignores how humans really work. Apple will be pushed around to a degree, but there will be limits. The harder the ask now the less China can ask later. And the more Apple can protest about the difficulty and impossibility and other consequences they will face, the more likely China is to back off.
Both sides want to have their cake and eat it too, and will compromise to make it basically work. But if China makes demands so excessive they get Apple to cut ties, China loses. Apple has the money, demand, customer loyalty, and clout to make things real uncomfortable. Apple would have to pay a hefty price, but if any company can do it... it's them.
So I don't think it's fair to say that no matter what China will just demand whatever whims strike it each day and everybody will play ball or gtfo. That just isn't how shit works.
Apple does have some degree of leverage over the CCP too. I realize it's not possible today... but in 3-5 years, Apple may be in a position to move some/all of their manufacturing elsewhere.
The direct job losses are one obvious problem for the CCP but a company like Apple saying "We're moving production to Taiwan/Vietnam/US because of security risks in China" would be catastrophic for the (tech) manufacturing industry as a whole in China. No sane Western based CEO will want to be seen taking that security gamble.
Do I think Apple would do that and forgo the massive Chinese smartphone market? That's another story.
Tin-foil hat time: Who's to say that they could stand up to them before this system? The system itself could have been proposed by the CCP in the first place. I'll take my hat off now.
>> it's not obvious how we can trust that a rogue actor (like a foreign government) couldn't add non-CSAM hashes to the list to root out human rights advocates or political rivals. Apple has tried to mitigate this by requiring two countries to agree to add a file to the list, but the process for this seems opaque and ripe for abuse.
> If the CCP says "put these hashes in your database or we will halt all iPhone sales in China", what do you think Apple is going to do? Is anyone so naive that they believe the CCP wouldn't deliver such an ultimatum? Apple's position seems to completely ignore recent Chinese history.
Apple policy << Local laws and regulations. It's very hard to believe their policy is anything less than a deliberate PR smokescreen meant to disarm critics, because it has so many holes.
Edit: just thought of another way Apple's policy could be easily circumvented and therefore cannot be regarded as a serious proposal: get two countries to collaborate to add politically sensitive hashes to the list (e.g. China and North Korea, or China and Cambodia). That doesn't even require Apple to be coerced.
Apple’s trying to mitigate this by holding themselves to a more internationally focused standard of matching against at least two separate sources of CSAM hashes, if my understanding of their announcements/follow-ups is right.
Separately, the US has even greater pressure available on Apple in the case they want to unilaterally add database images, considering they have an actual chance and means to jail (and run through the legal wringer) whoever tells them ‘no’. And that’s just the overt pressure available; I think this is a more likely potential trust violation here, even if both could come to pass.
I think one of the other stories on this talked about "watermarking" in order to create a hash collision. So it need not be a non-CSAM image, a TLA could just alter an image to make it collide with a file they want to track, other countries would agree that file's hash should be in the hash list and bingo: Apple presumably provide the TLA with a list of devices holding that file.
Except that there's a threshold involved. A single matching file doesn't trigger an investigation; it takes multiple (10+, maybe more?) matches to do that.
Either way, it's high enough that adding a single file to the set wouldn't be a useful way of finding people who have that file. One attack I can imagine would be to add a whole set of closely related photos (e.g. images posted online by a known dissident) to identify the person who took them, and even that would be a stretch.
> If the CCP says "put these hashes in your database or we will halt all iPhone sales in China", what do you think Apple is going to do?
Or maybe China already said "put in this CSAM check or you can't make or sell phones in China".
Since Apple's position is contrary to their previous privacy policy and doesn't seem to make a lot of sense, it's quite possible extortion already happened (and not necessarily by China).
It wouldn't specifically be from domestic intelligence, it would be from a powerful member of Congress with a relationship to Apple (specifically the board/management), acting as a go-between that would try to politically lay out the situation for them.
Hey Apple, we can either turn up the anti-trust heat by a lot, or we can turn it down, which is it going to be? Except it would be couched in intellectually dishonest language meant to preserve a veneer that the US Government isn't a violent, quasi-psychotic bully ready to bash your face in if you don't do what they're asking.
The interactions with intelligence about the new program would begin after they acquiesce to going along.
It's too easy. There's an extraordinary amount of wilful naivety in the US about the nature of the government and its frequent power abuses (what it's willing to do), despite the rather comically massive demonstration of said abuses spanning the entire post WW2 era. Every time it happens the wilfully naive crowd feigns surprise.
What about giving a censored version of the appropriate image? Like put a big black rectangle covering whatever awful thing is the subject, and just (e.g.) show some feet and hands and a background.
Then you could provide a proof that an image which is the same as the "censored" one, except for the masked part, has the specified perceptual hash. I don't know if this is technically feasible (but I'd be happy for someone knowledgeable to opine). I also admit that there are secondary concerns, like the possibility of recognising the background image, and this being used by someone to identify the location, or tipping off a criminal.
Probably it would only be appropriate to do this in the case of someone being accused, and maybe then in a way where they couldn't relay the information, since apparently they don't want to make the hash database public.
Also, for the record, I'm spitballing here about infosec. This isn't me volunteering to draw black boxes or be called by anyone's defense.
Why does the CCP even need to talk to Apple? They have a database of CSAM, and can modify an image in this set to collide with a special image they're looking for on people's phones. They then share their new modified cache of CSAM with other countries ("hey, China is helping us! that's great!") and it gets added to Apple's database for the next iOS release, because it looks to humans like CSAM. Only the CCP knows that it collides with something special they're looking for.
Now that we know that collisions are not just easy to manufacture but also happen by themselves with no human input (as evidenced by the ImageNet collisions), we know this system can't work. Apple has two remaining safeguards: a second set of hashes (they say), and human reviewers.
The human reviewers are likely trained to prefer false-positives when unsure, and so while a thorough human review would clearly indicate "not CSAM" for the images the malicious collisions match, it doesn't feel like much of a safeguard to me. (Remember the leaked memo from the US's organization -- they called our objections "screeching voices". I'm sure the actual reviewers think similarly.)
I assume the people reviewing CSAM for the CCP will be in China, so they can be in on the whole scheme. (In the US, we have slightly better checks and balances. Eventually the image will be in front of a court and a jury of your peers, and it's not illegal to have photos that embarrass the CCP, so you'll personally be fine modulo the stress of a criminal investigation. But that dissident in China, probably not going to get a fair trial -- despite the image not being CSAM, it's still illegal. And Apple handed it directly to the government for you.)
I don't know, I just find this thing exceedingly worrying. When faced with a government-level adversary, this doesn't sound like a very good system. I think if we're okay with this, we might as well mandate CCTV in everyone's home, so we can catch real child abusers in the act.
Apple's CSAM detection has nothing to do with this. The vector for an authoritarian government getting blacklists into tech is that government telling the tech vendor "ban this content. We don't care how."
Look, if the government tells Apple to do something, then Apple can push back, but then has to do it or pull out of the country. That's the way it was, and is.
Now, what has actually changed? The two compelling push-back against a country's demands for more surveillance etc. are:
a) it is not technically feasible (eg, wiretapping E2EE chats), and
b) it is making the device insecure vis-a-vis hackers and criminals (eg, putting in a backdoor)
The interesting question is: Have these defences against government demands been weakened by this new technology? Maybe they have, that would be my gut feeling. But it is not enough to assert it, one must show it.
At this point, with all the easily producible collisions, the Gov't could just modify some CSAM images to match the hash of various leaked documents/etc they want to track. Then they don't even have to go thru special channels. Just submit the modified image for inclusion normally! (Not quite that simple, as they would still need to find out about the matches, but maybe that's where various NSA intercepts could help...)
Not quite: a CSAM hash match triggers a second match within Apple to avoid false positives, and then a human review. It wouldn't be trivial for them to extract matches out of that, and they'd only be able to track files whose contents they already know.
I would think they could more easily just make your phone carrier install a malware update on your phone, rather than jumping through all of these hoops to get them access they already have.
Plenty of data is leaking out of people's phones already, as can be seen from, e.g., the Parler hack.
I tried to address the issue of finding out about the match at the end of my comment. I agree it's not exactly practical without other serious work to intercept the alerts, have 'spies' in the Apple review process, etc. Much easier ways would exist at that point, but it's somewhat amusing (in a horrifying way) that some bad actor could in theory use modified CSAM as a way to detect the likely presence of non-CSAM content using generated collisions.
All of the major tech companies already scan images uploaded to their services, so isn't this already theoretically possible now? How is the situation changed by Apple using on-device scanning instead of cloud scanning (considering these images were going to be uploaded to iCloud anyway)?
Of course Apple would do it. They’d look like fools for saying they want to stop CP, then refusing to listen to the government of well over a billion people when it says “Please ban this newly produced material”.
At best, they would look biased. At worst, they would be sending a signal that they don’t care about Chinese children.
You wouldn’t even need government strong-arming; mobs of netizens would happily tear Apple down.
It seems like it wouldn't even take that. If you can generate a colliding pair of images, you could probably create a pair where one image might get attention in child-porn circles, and thus be shared around enough to end up in the CSAM database, while the other was innocuous.
The only way this matters [today], though, is if Apple turns it on for all pictures, not just iCloud ones. Presumably "Chinese iCloud" already scans uploaded photos cloud-side.
Unless the goal is simply to reduce the effort/expense of scanning by making it a client-side process.
From the post yesterday discussing collisions, it doesn't seem outside the realm of possibility to take an image of CSAM and modify it until its hash matches some other targeted image.
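For context, the collider tools mentioned in that post work roughly like this: treat the (differentiable) hash model as an optimization target and nudge the source image until the signs of its hash embedding match the target's. The sketch below is purely illustrative: it uses a tiny stand-in convolutional model, a 96-bit hash, toy 64x64 images, and arbitrary loss weights rather than the real NeuralHash network or any published collider code.

```python
# Illustrative sketch of a targeted perceptual-hash collision attack.
# The small conv net is a stand-in for a real perceptual-hash model, NOT NeuralHash.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in "perceptual hash" model: image -> 96-dim embedding whose signs are the hash bits.
hash_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=5, stride=4), nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=5, stride=4), nn.ReLU(),
    nn.Flatten(),
    nn.LazyLinear(96),
)
hash_model.eval()

def soft_hash(img: torch.Tensor) -> torch.Tensor:
    """Real-valued embedding; the sign of each component is one hash bit."""
    return hash_model(img)

target = torch.rand(1, 3, 64, 64)   # image whose hash we want to collide with
source = torch.rand(1, 3, 64, 64)   # image we are allowed to perturb
with torch.no_grad():
    target_bits = torch.sign(soft_hash(target))

perturbation = torch.zeros_like(source, requires_grad=True)
opt = torch.optim.Adam([perturbation], lr=0.01)

for _ in range(500):
    adversarial = (source + perturbation).clamp(0, 1)
    out = soft_hash(adversarial)
    # Push each embedding component to the same side of zero as the target's bit
    # (with a small margin), while keeping the perturbation small so the image
    # still looks like the source to a human.
    hash_loss = torch.relu(0.1 - target_bits * out).mean()
    loss = hash_loss + 0.05 * perturbation.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

with torch.no_grad():
    adv_bits = torch.sign(soft_hash((source + perturbation).clamp(0, 1)))
print("matching hash bits:", int((adv_bits == target_bits).sum().item()), "/ 96")
```

The same shape of attack applies to any hash function you can run gradients through, which is why extracting the model matters so much to the threat analysis.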
Or why not "Hey Vietnam, Pakistan, Russia, etc., put these hashes into your database, please and thanks." I mean, the CCP has allies that are also authoritarian. Why would they have to threaten Apple directly? This is also how you get past the Apple human verification: just pay those Apple workers to click confirm.
They'd do it directly because it's expedient and useful. If you're operating such a sprawling authoritarian regime, it's important to occasionally make a show of your power and control, lest anyone forget. The CCP isn't afraid of Apple, Apple is afraid of the CCP. Lately the CCP has been on a rather showy demonstration of its total control. If you're them it's useful to remind Apple from time to time that they're basically a guest in China and can be removed at any time. You don't want them to forget, you want to be confrontational with Apple at times, you want to see their acknowledged subservience; you're not looking to avoid that power confrontation, the confrontation is part of the point.
And the threat generally isn't made, it's understood. The CCP doesn't have to threaten in most cases; Apple will understand ahead of time. What gets made initially is a dictate (do this), not a threat. If something unusual happens, such as with Didi's listing on the NYSE against Beijing's wishes (whereas ByteDance did the opposite and kowtowed, pulling their plans to IPO), then, given that Didi obviously understood the confrontation risk ahead of time and tested you anyway, you punish them. If that still isn't enough, you take them apart (or, in the case of Apple, remove them from the country).
I'm just saying that there is another avenue. To be clear, this isn't a "vs" situation. It means that they have multiple avenues.
To also clarify, the avenue of extortion isn't open to every country. But the avenue I presented is open as long as that country has an ally. I'm not aware of any country not having an ally, so I presume this avenue is pretty much open to any country.
Could you provide specific evidence that China has done and would do this? I’ve a hard time recalling any specific cases. Maybe nation-states do this kind of thing, but I’m only aware of the countless times the United States has done this. What’s the recent history?
Assuming you are American — where do you think your iCloud keys are stored? You do know Apple cooperates with US LE and intelligence? This is a nothing hamburger.
I concede that there are overlapping issues there. But if you're saying there aren't places China goes with this sort of info that's different from the US, I don't think any debate would change your mind.
So far I have no specific reason to think China goes places that are as deeply consequential and chilling as the US does. What’s Assange up to these days?
Apple has to store the data of Chinese citizens in China? And Apple has to adhere to Chinese laws in China? How insane.
It's crazy how deluded the "CCP crowd" are. Apparently, the "CCP crowd" thinks companies are allowed to do business in another country and not abide by their laws.
Are you going to go insane since the EU requires tech companies to store EU citizens data within the EU?
I'm really starting to feel out of touch with Hacker News. Apple seems to be consistently on the front page somehow. Surely other people are sick of hearing about Apple too?
tl;dr: this is expected. The article addresses everything I'm about to say, but I think the lede is buried:
>Apple's NeuralHash perceptual hash function performs its job better than I expected
When you are just looking for any two collisions in 100M x 100M comparisons, of course you'll find a small number of positives; Apple said as much. The number of expected collisions scales linearly with the number of 'bad' images on the database side of the comparison, and that number is nowhere near 100M. Assuming it's 100k, we'd expect ~1000x fewer collisions than in the ImageNet experiment, or ~0.002 collisions, which is effectively 0. It's the artificial images that will potentially sink all this, not a very low rate of naturally occurring hash collisions.
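A quick back-of-the-envelope version of that scaling argument, using only the figures assumed in the comment above (the "2 observed collisions" is implied by the ~0.002 figure, i.e. 0.002 x 1000; none of these are published numbers):

```python
# Back-of-the-envelope scaling of expected natural collisions.
# All figures are the assumptions from the comment above, not published numbers.
imagenet_side = 100_000_000      # size of one side of the all-pairs comparison (as stated above)
csam_db_size = 100_000           # assumed size of the real CSAM hash database
observed_collisions = 2          # implied by the ~0.002 figure above (0.002 * 1000)

# Expected collisions scale linearly with the size of the "database" side,
# holding the per-pair collision probability and the photo-library side fixed.
scaling = csam_db_size / imagenet_side        # = 1/1000
expected = observed_collisions * scaling      # ~0.002

print(f"expected natural collisions: ~{expected:.3f}")
```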
I am not exactly sure if this is how it works, but it appears to me that the hashes of all your photos get uploaded to a server, and I was wondering whether it is possible to reverse the hashes to deduce what's in each hashed photo.
Your answer without an explanation doesn't amount to anything more than a scant opinion. "No and no" isn't worth commenting on Hacker News, in my opinion. As I stated above, I am not very familiar with how this bleeding-edge technology works, but I am not learning anything new from your response. If you wanted to expand, I'd appreciate it.
Following the topic because it's interesting, but I'm bothered by the chain of reasoning necessary to reach these conclusions. How sure are we that the means of dumping the model from iOS and converting it to ONNX results in a faithful representation of what NeuralHash actually does?
FWIW, Apple says that’s not even the NN hash function they’ll be using. This one has been buried in iOS since 2019, so it may have been a prototype. That said, dumping the NN weights seems straightforward, so this appears to be a faithful copy of that older hash function. (There’s been a suggestion that it’s sensitive to floating-point implementations, so small hash differences may occur between different CPU architectures.)
I agree in theory, but the burden of proof shouldn't be on outsiders that have no other choice than to extrapolate. Proposing something like this running on an end-user's personally owned device should have a really high bar.
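For anyone who wants to poke at the faithfulness question directly, here is a minimal sketch of computing a hash from a dumped ONNX model with onnxruntime. The file names, the 360x360 input size, the [-1, 1] normalization, and the 96x128 "seed matrix" binarization step are all assumptions drawn from community write-ups of the extracted model, not anything Apple has documented.

```python
# Minimal sketch: run an extracted perceptual-hash ONNX model and binarize its output.
# "model.onnx", "neuralhash_seed.npy", the 360x360 input, and the seed-matrix step
# are assumptions from community reverse-engineering write-ups, not from Apple.
import numpy as np
import onnxruntime as ort
from PIL import Image

def hash_bits(image_path: str,
              model_path: str = "model.onnx",
              seed_path: str = "neuralhash_seed.npy") -> str:
    session = ort.InferenceSession(model_path)
    input_name = session.get_inputs()[0].name

    # Preprocess: resize, scale to [-1, 1], NCHW layout (assumed preprocessing).
    img = Image.open(image_path).convert("RGB").resize((360, 360))
    arr = np.asarray(img, dtype=np.float32) / 255.0 * 2.0 - 1.0
    arr = arr.transpose(2, 0, 1)[np.newaxis, ...]

    # Assumed pipeline: the model emits a 128-dim embedding, a fixed 96x128 seed
    # matrix projects it down, and the sign of each component gives one hash bit.
    embedding = session.run(None, {input_name: arr})[0].reshape(-1)
    seed = np.load(seed_path)                     # assumed shape: (96, 128)
    bits = (seed @ embedding >= 0).astype(int)
    return "".join(map(str, bits))

if __name__ == "__main__":
    print(hash_bits("example.jpg"))
```

Running the same image through this on different machines would also be a quick way to test the floating-point-sensitivity suggestion above: if the bit strings differ across CPU architectures, that sensitivity is real.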