Massive mortgage and loan data leak gets worse, original documents also exposed (techcrunch.com)
131 points by PretzelFisch on Jan 25, 2019 | 33 comments


This is getting so tiresome - consumers need direct recourse against those institutions that are mishandling/losing their data

Edit: if this were the case, the perception would change from viewing large troves of personal private data as an asset, to being viewed as a liability that needs to be mitigated.


This will continue to happen as long as data is viewed as an asset to hoard.

It needs to be viewed like debt - a liability to manage, i.e. only hold it when it provides you value greater than the risk and cost involved.

Right now, society treats "data" as a zero-interest loan that is never due, with no creditworthiness check and no penalty for default. The incentives are perverse.

It absolutely doesn't align with the costs to society.

At a bare minimum you need a penalty for default and a non-zero interest rate (with regular payments).


Equifax has already shown that there are no consequences. If data security were taken seriously, a breach of that severity should have seriously endangered the company, but in reality nothing happened. I am sure other companies have been watching closely and have concluded that the cost of proper security practices is too high compared to the cost of a breach.


This will never happen.

I'm working through my security certifications, and the first lesson on hardening is that you don't secure something if it's worth less than the countermeasure.

This means your data is only worth about $10 of security to them, while the time you'd spend fixing identity theft can run into the hundreds.

I bet a law mandating a $1,000 fine, paid directly to the data subject, for every failure to secure their data would change US attitudes toward data for decades.


I wonder how far we are from this being a populist position worth taking.

If you proposed a law that getting hacked results in an automatic duty to hand-deliver $1000 in cash to every single person whose details were leaked... I think voters would cheer pretty loudly, and at some point might drown out the donors?

(Delivered like a subpoena, I'm picturing, along with a booklet explaining your rights... And as a jobs program for delivery-men too.)


Please say more.

--

My current position is that my data is me. Anyone using my data owes me money. To guard my privacy, I get to choose what is publicly known about me.

Being a very simple bear, I envision modern privacy as an extension of property rights, a fundamental human right.

I want to hear more about your (?) ideas of debt and liability. It feels more flexible, but I'm having a hard time envisioning the mechanics.

My notion does not cover the decay rate (shelf life) of data. E.g. when I die, what happens to my data? Do I really care if anyone knows my autopsy report, my DNA, my lifetime earnings, my favorite sports ball teams, etc.?

Nor does it cover (pseudo-)anonymous data. How would my data be included in longitudinal datasets like public health, surveys, traffic, etc?

--

I've been chewing on this stuff for a long time. I have direct experience with (in order) electronic medical records, voter privacy, marketing, metrics & analytics. This topic always makes my head hurt.


How about this (assume inflation adjustment for all numbers):

1. Failure to report a data loss is a fine of $10,000/user (need a compliance stick)

2. An exponential schedule of loss categories:

Level 3 items: name, address, email, etc. = $2/each

Level 2 items: username, purchase history, etc = $10/each

Level 1 items: financial data, communications, passwords (unsecured/poorly secured), pictures, etc. = $100/each

Then the categories would be multiplicative. Lose 1M customer names and addresses and it would be $2 x $2 = $4 each, or $4M total; lose 1M names, addresses, and personal communications and it would be $2 x $2 x $100 = $400 each, or $400M total.
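To make the arithmetic concrete, a rough sketch of that schedule (the categories and dollar values are just my hypothetical ones above, nothing official):

    # Rough sketch of the proposed multiplicative schedule; the item
    # values are the hypothetical ones above, not from any real law.
    ITEM_VALUES = {
        "name": 2, "address": 2, "email": 2,            # level 3
        "username": 10, "purchase_history": 10,         # level 2
        "financial_data": 100, "communications": 100,   # level 1
        "password": 100, "picture": 100,
    }

    def breach_penalty(leaked_items, num_users):
        # Multiply the per-item values, then scale by affected users.
        per_user = 1
        for item in leaked_items:
            per_user *= ITEM_VALUES[item]
        return per_user * num_users

    # 1M names + addresses: $2 x $2 = $4 each -> $4M
    print(breach_penalty(["name", "address"], 1_000_000))
    # add communications: $2 x $2 x $100 = $400 each -> $400M
    print(breach_penalty(["name", "address", "communications"], 1_000_000))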


So, below are some loosely organized thoughts in response to what you said:

I can't say I have the answers, as I haven't thought long enough or hard enough about this. But one option is to continue the analogy via a thought exercise.

What would it mean for a company to have a credit check and credit history? Something the owner of the data could view and evaluate before choosing to lend it?

What is data valued at? Is there a bidding process for the loan? Can a consumer check the interest rate / payment they'd get from company X or Y for use of their data for a period of time?

Data right now is a non-physical asset that has value, doesn't adhere to the laws of scarcity, and has no equivalent of copyright protection. The behavior that produces is inevitable: if you tell someone, hey, here is this thing that has value, you can hoard it for later, and it costs you virtually nothing to obtain, then you'll get entities doing nothing but recklessly fighting to hoard it.

----

> I envision modern privacy as an extension of property rights, a fundamental human right.

Ultimately, I think this is the only way the system balances out - how you get there from here is a whole different question.

Jumping around again, there has to be a regular cost to holding the liability (the equivalent of a monthly payment). Maybe that's a true transfer of money to the provider via an interest payment, maybe it's more of a risk/maintenance cost. But without a regular cost, the incentive is simply the "startup mindset": hoard as much as possible, never make a payment, go for it all, or go bankrupt when your company fails. You get the same behavior. There must be a marginal cost associated with data (both for continuing to hold it longer and for obtaining more).

HIPAA, while not perfect, is an example of there being some cost associated with holding data, and consequences for improperly handling it. And the healthcare industry is still very profitable, so it's not too far out of the imagination.

But of course there are gotchas. What happens if I tell you my name, can I "force" you to forget it? Do these rules only apply to companies and not individuals? How is security handled? A suspected hacker asks to be forgotten, including the logs and IP traces?

How do you treat derivative data? I use your info to build an ML model, then forget your data but get to keep the model for free? What if it's not an anonymous model, but one specific to you? With money, if I make profits off what you lent me, you are owed none of it; I keep it all.

But it's always easy to find problems with a new idea before someone has even presented a proposal. Anyone could list a billion more, but that doesn't mean it doesn't work. Someone could also have listed a million things wrong with electricity pre-invention (how are you gonna store it? won't it start fires? look how much capital investment it'd take to build a grid...), and that doesn't mean it doesn't work.

----

The above is all just rambling on the topic.

So a few focused thoughts:

1. "Right to be forgotten" is important. It changes/fixes the ownership of the asset.

2. Penalties for misuse or lack of protection are next. Penalties must exist to eliminate poor handling of data.

3. If there is ownership by people, and it provides value to companies, give people a way to earn off of it - assign it monetary value.

Right now we're in a "why buy the cow when you can get the milk for free" situation. Companies will resist change because they're getting it all right now - so it seems like either they get incentivized and/or someone needs to enforce it.

But again, just because I haven't thought of it doesn't mean it isn't there. Maybe it's simply "right to be forgotten" plus someone making a marketplace for lending/selling your data. The same way people are paid for filling out surveys, they'd be offered different amounts for the use of their data.

-----

Imagine a non-profit like the Mozilla Foundation gives you a plugin that monitors all of your web habits and captures your personality/data in a box that you own and control. Maybe you also give it some of your accounts to scrape or something.

An Amazon competitor feels Amazon has a moat they can't surpass because Amazon has 7+ years of your buying and browsing history. Now you can go to this competitor and say: hey, my data box has all of my Amazon browsing and purchase history, and I'll sell/lend it to you for a rate of $x/month. The competitor knows they can monetize that and profit more than $x by getting you to make more purchases: showing you the specific better deals they have over Amazon in your areas of interest, ignoring the things you've already bought, or auto-filling your regularly scheduled monthly grocery order so that with one click you're now receiving it from the Amazon competitor.

The competitor is better off because they can now compete with Amazon, you're better off because you control your data and earn from lending it, and the web is better off because Amazon now has to compete by selling a better product instead of resting on its laurels of simply having a "data" moat.
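Mechanically, the "lending" part could be as simple as a record like this (purely hypothetical sketch, every name here is made up; nothing corresponds to a real product or API):

    # Purely hypothetical sketch of a "data box" lease record.
    from dataclasses import dataclass
    from datetime import date

    @dataclass
    class DataLease:
        lender: str          # the person who owns the data
        borrower: str        # e.g. the Amazon competitor
        categories: list     # what they get access to
        monthly_rate: float  # the $x/month paid to the lender
        expires: date        # after this the borrower must delete it

    lease = DataLease(
        lender="me",
        borrower="amazon-competitor.example",
        categories=["browsing_history", "purchase_history"],
        monthly_rate=5.00,
        expires=date(2020, 1, 1),
    )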

The above is just a spur-of-the-moment idea, but maybe there is a smart person who would want to run with this, or, much more likely, come up with a very different and much better way to fix data ownership. Whoever figures the data issue out will change the web and the world. The alternative (staying forever on the path we're on) leads us, in my opinion, to a bad state of affairs.


Well put.


Agreed. I feel like I barely flinch at these headlines anymore. My guess is most people are also getting desensitized, which means the PR costs of a hack are decreasing.

We need some mechanism to increase costs (or fundamentally change the way data ownership works).


I think part of the problem is that the impact for individuals is also decreasing. Right now, if you want to know my email, my address, my SSN, and the names of my family members, you can find them in some torrent file somewhere on the internet. There just isn't a whole lot about me that isn't already available through the bazillion previous data breaches. Only people who were not part of those are impacted by the new ones.

So my credit is already frozen, I already monitor my reports, I already know that most emails I get containing personal information are phishing... one more breach literally doesn't do anything.

We can only think about the future generations at this point, who aren't yet impacted. People are really bad at that though (see: global warming).


I work in this industry; anything that doesn't have a regulated retention period for audits is under high pressure to be burned ASAP.


You mean like GDPR?


There need to be fines for this kind of stuff.

Each and every person affected needs to be notified, and the companies sued into oblivion. It's just negligence.


Europe has it, with the GDPR. It'll never happen in the U.S.; corporate America has too much lobbying power.


I can see in my (US) company how the GDPR already influences data collection practices. Three years ago we wanted to just suck up all available data and figure out later what to do with it. Now there is much more thought put into it, and we have to set up processes for deleting data and other things. It's a little more work, but definitely a good thing.


My favorite part is salting someone's database.

Sign up as a US citizen, switch your country in the preferences to Spain like 2 minutes later... Boom


That's why it's often easier to go full GDPR if you have worldwide customers.


> "Campbell confirmed that the company will inform all affected customers"

Even though they might be holding my data, I'm most likely not their customer. So, will I ever get informed if they leaked my data?

The ownership of data is getting to be a tricky problem. My employer is asking me to submit to regular and continuous background checks through a third-party service. My employer claims that this third party owes my employer remedies under their contract in case of a breach, but when I asked what the remedies were for ME, I was met with silence.


I don't understand how people just leave servers with open access. Don't you have to manually set them up to be unprotected and accessible that way?

Edit: missed a word


No, the opposite is often true. To get you working quickly, software often ships open-to-all by default, and the user is left to lock it down as they see fit.


The new leak came from an S3 bucket, which would have been locked down by default. Someone had to make it open to the public.


What most likely happened is the developer setting it up did have it locked down. But then some upper-management decree came down saying this department needs to be able to access the data however they please. The developer probably told them, OK, here is how we need to set that up. The manager said, no, too hard. Just let them call it directly, don't encrypt it, don't check for authorization, make it "easy for them to get to"...

And here we are today.

If I were a developer in that position, after pushing back and getting nowhere, I'd do what doctors do: tell whoever is ordering you to do X that you need it in writing, coming from them personally. Other than that, you don't have many options as a developer. Your other option is to be fired, and they'll bring in the next guy.

So ultimately there needs to be accountability, not just on developers but on the business as a whole. A developer should have the ability to raise a flag to someone without repercussions.


Didn't S3 use to default to public? I know older versions of Elasticsearch didn't need authentication when first set up.


Yes, and now you even get a warning when opening up a bucket, but I bet someone googled the answer and was like "fuck it, too much of a headache to set up IAM for my app."
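For what it's worth, the basic lockdown is only a few lines; a rough boto3 sketch (the bucket name is made up, not from the article):

    # Rough sketch: turn on S3 Block Public Access for one bucket.
    import boto3

    s3 = boto3.client("s3")
    s3.put_public_access_block(
        Bucket="my-loan-docs-bucket",
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )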


So sloppy. This is why your developers need to know the scope of your business. Best practices certainly don't leave an open server with production data. But if you know you have tons of PII in there, you should treat it like proverbial radioactive material and lock it down with whatever means are at your disposal, regardless of how difficult that is.


Securing the data costs money. Leaking the data doesn't. Until it does, businesses will take the path that least affects their bottom line.


Also evaluate which data really needs to be saved in the first place.


So how do we truly know if we're affected? OpticsML claims they are "working to notify all affected parties", but they have lost all trust and credibility, so I'm not holding my breath.


Who is behind OpticsML? I can't find anything about them.


From the archive.org cache (https://web.archive.org/web/20180824215739/https://www.optic...) of their homepage, it looks like they were a company that did OCR, indexing, and AI-assisted data extraction.

Seems like a pretty good use of ML, really - it shouldn't be an intractable problem to identify something like a scanned W-2, run OCR on it, and extract the income fields.
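A naive version is only a few lines; rough sketch, assuming pytesseract/Pillow and a scan clean enough for a simple regex (real W-2s usually need layout-aware extraction):

    # Naive sketch: OCR a scanned W-2 and pull out the box 1 wages.
    import re
    from PIL import Image
    import pytesseract

    def extract_wages(path):
        text = pytesseract.image_to_string(Image.open(path))
        # e.g. "Wages, tips, other compensation 54,321.00"
        match = re.search(r"[Ww]ages[^\d]*([\d,]+\.\d{2})", text)
        return float(match.group(1).replace(",", "")) if match else None

    print(extract_wages("scanned_w2.png"))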


Was Elasticsearch on AWS, too? I wouldn't be surprised, since they don't support X-Pack, leaving only more inconvenient forms of authentication available to users.


This does not disturb me (modulo the leak of SSNs). If documents for federal mortgages were public, we would have seen 90% less mortgage fraud.



