I am working on a website where GitHub users can see how they rank compared to other users.
The site analyses GitHub profiles and commit history to make a more extensive summary than GitHub does. I made it mostly to learn more about ranking algorithms and automatic text generation.
For some background: this is the "Social Security gun rule". Under this regulation, the Social Security Administration will submit information on recipients of disability insurance to the National Instant Criminal Background Check System if they meet certain “mental impairment” criteria[0].
As I understand it, most of the people with disability insurance for mental impairment will be military veterans.
> The part that confused me is when they claim to have obtained MethBot source code, but never mention how.
>
On page 19 of The Methbot Operation report, they state that ‘White Ops detection technology was able to use a JavaScript language feature called “reflection” to gather extensive, detailed information about its inner workings.’
I have personally never heard of JavaScript reflection before, but it appears to be a debugging mechanism that lets one object dump information or data about another object.
Maybe the White Ops software loaded some JavaScript that was able to dump much of its environment and send it back to White Ops?
I don't know more than anyone else about this particular situation, but I can imagine how JS reflection works. Something like:
  let test = function() { return "hello";}
  test.toString()

returns

  'function() { return "hello";}'
It's not too difficult to imagine that pairing that with some JS parsing would allow you to slowly crawl your way around an app and gather the app structure. A crazy and fascinating idea.
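To make that a bit more concrete, here is a rough sketch of the idea (all names here are made up for illustration; this is not the actual White Ops code): start from some object the environment exposes, enumerate its properties with Object.getOwnPropertyNames, record the type of each one, and keep the source text of any functions via toString().

  // Hypothetical sketch: walk an object graph via reflection and
  // record what we find. "rootObj" is whatever object the ad code
  // happens to expose, e.g. a global it created.
  function dumpStructure(obj, path = "root", seen = new Set(), out = []) {
    if (obj === null) return out;
    if (typeof obj !== "object" && typeof obj !== "function") return out;
    if (seen.has(obj)) return out; // avoid cycles
    seen.add(obj);
    for (const key of Object.getOwnPropertyNames(obj)) {
      let value;
      try { value = obj[key]; } catch (e) { continue; } // some getters throw
      const kind = typeof value;
      out.push({
        path: path + "." + key,
        kind,
        // toString() on a function usually yields its source text
        source: kind === "function" ? value.toString() : undefined,
      });
      dumpStructure(value, path + "." + key, seen, out);
    }
    return out;
  }

  // Something like JSON.stringify(dumpStructure(rootObj)) could then
  // be sent back to a collection server for offline analysis.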
To be fair, they did not just publish a negative blog post about WeWork, as the title may suggest, but somehow obtained proprietary data WeWork probably sees as a trade secret.
As I understand it, it is currently unknown where the author got the data. If WeWork published this in a publicly available API, it is of course reasonable to chart it, but if the author somehow misused the access he got as a tenant there, this is not (in my view) just an issue of the author publishing a negative blog post, as the title indicates.
It's my understanding that the API they used was undocumented, so while it's WeWork's fault for not doing a better job of securing an API that seems sensitive to them, it's also disrespectful to do something the other party pretty clearly doesn't want you doing.
IANAL, but you can be in trouble even if the data is openly available, if the party managing the data deems it restricted in any way. I believe this is the reason security researchers get in trouble: they access data that is considered restricted and get prosecuted for "stealing" it.
I think the logic to that is something akin to: it's illegal for you to take my car even if I leave it unlocked with the keys in it.
The company has to communicate that the data is restricted; they can't just deem it restricted after the fact. Usually when companies or individuals (e.g. 3Taps and Aaron Swartz) have gotten in trouble under the CFAA, it's been because they were served a C&D or IP-blocked and then persisted in accessing the data, which the courts have upheld as "knowingly and intentionally accessing a computer without authorization".
In this case, WeWork is within their rights to terminate ThinkNum's membership for the ToS violation, but there's no legal case unless ThinkNum persists in scraping WeWork's data after the termination, or there's evidence that ThinkNum knew the API was restricted at the time they accessed it. Hence the founder's repeated insistence that he did nothing wrong, and his coyness in discussing the source of the data.
Found one source that indicates that the payment may be substantial: "Financial Times reports that one digital media company (which asked not to be named) was told that it would cost 30 percent of its advertising revenue to be whitelisted by Eyeo and AdBlock Plus." - http://arstechnica.com/business/2015/02/over-300-businesses-...
This is going to be a hurdle for me, as I need a search feed that can be parsed server-side. Their alternative, YPA (Yahoo Partner Ads), is client-side only.
Will most people migrate to Microsoft Bing? The results are the same anyway, but the API format differs.
IA torrent files use the archive itself as a web seed if they have to. But if there's a spike in interest - like right now, apparently - peers would reduce the load. So torrents still work if there are no other seeds, and they reduce load on the servers when that's possible.
Edit: this is all just to be polite, since the archive is not worried about using a ton of bandwidth.
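For reference, the web seed is just an extra field in the torrent: the metainfo file carries a url-list key (BEP 19) pointing at an HTTP mirror, and I believe magnet links can carry the same thing in a ws parameter. A made-up example (the hash and URL are placeholders, not a real item):

  magnet:?xt=urn:btih:INFOHASH&dn=some-item&ws=https://archive.org/download/some-item/

Clients that understand web seeds fetch from the archive's servers when there are no peers, and from the swarm when there are.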
Is the code that runs the frontend of the IA open source? I'd be interested in contributing to that, so that when requests for certain objects are creating excessive load, a response status code indicates this and the alternate URI returned is a magnet link for the object.
EDIT: It appears an HTTP 303 status code accomplishes this
Right! I'm not interested in breaking the Internet Archive, and I'd expect it to move to IPFS [1] eventually (content addressable web) [2]. If/when/how that happens, I'd expect traditional http tooling to still work (curl, wget, etc), which is why I went looking for a status code that indicates an alternate path for the resource/content.
That's why my above comment kept getting edited as I did some more research. 503 is an ugly failure. 429 tells the client to back off, but it doesn't provide a fallback to still get the content. 303 does.
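For what it's worth, a minimal sketch of what that could look like on the server side (purely illustrative, not the IA's actual stack; the path and magnet link are made up):

  // When an item is under heavy load, answer with 303 See Other
  // pointing at a magnet link, so clients that understand it can
  // fall back to the swarm instead of hammering the origin.
  const http = require("http");

  const overloaded = new Set(["/download/some-item"]); // hypothetical
  const magnetFor = (path) =>
    "magnet:?xt=urn:btih:INFOHASH-FOR-" + path;        // placeholder

  http.createServer((req, res) => {
    if (overloaded.has(req.url)) {
      res.writeHead(303, { Location: magnetFor(req.url) });
      res.end();
    } else {
      res.writeHead(200, { "Content-Type": "text/plain" });
      res.end("item bytes would be served from here\n");
    }
  }).listen(8080);

Only clients that actually know what to do with a magnet URI would benefit, of course; everything else still needs the plain HTTP path.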
I thought this train of thought was in line with Brewster's blog post [2]. Apologies for the confusion!
I'm not sure why you're downvoted, the AOL search log case was a huge mistake on the part of AOL and I'm quite surprised that Yahoo! would take a risk like this. The real risk is never in just this data but in combining it with other public datasets. I haven't looked at what is in this particular dump in detail but if there is data that had to be anonymized (as they claim) then you can bet that there will be people already busy trying to reverse that.
Besides China, there are other sources North Korea could have used to acquire this technology. North Korea, Libya, Iran, and China all got help developing nuclear technology from a network set up by Abdul Qadeer Khan, a Pakistani nuclear physicist often regarded as the founder of the Pakistani nuclear enrichment program.
He in turn acquired the technology in Europe while working for Urenco, a nuclear fuel company.
The whole A.Q. Khan affair is quite a fascinating story, actually.
I wouldn't be surprised if the nuclear physicist in The Dictator was based on Khan. He sounds like he's proud of his work and I admire that, moral judgements on nuclear weapons aside.
Maybe web scraping then? It would not be so hard to make a focused scraper that scrapes the friends of anyone using the app.
Edit: I did a quick test and it looks like I can see the friend lists of many users who are not in my immediate network, as long as I am logged in to Facebook. It should then be easy to use something like Perl's WWW::Mechanize to make a scraper that logs in and scrapes the profiles you want, as long as one does not need so many that Facebook detects and bans you.
No, they probably do not have the users' Facebook passwords, but they do not need them for scraping, because they can just use their own account for that.
I have looked around on Facebook and it looks like one can see other users' friend lists, even if you are not in their immediate network.
Even if Facebook has a limitation, like only being able to see the friend lists of friends of friends, the company behind this app could probably make some fake Facebook users and befriend someone at each university to get decent coverage.
It depends on the privacy settings. Though the several iterations of privacy scaremongering and Facebook changing defaults resulted in people locking up their accounts like crazy, friend lists seem to still be visible semi-publicly for quite a lot of people. With more news like that, this will probably change too, though.
It's totally a POV issue IMO, that's why I phrased it that way :).
For me, half of Facebook's utility was the ability to check people out without having to commit to a relationship with them first. A publishing platform, a little bit like the personal pages of old, but much more streamlined and accessible to the mainstream. But it turned out there are enough bad actors around (stalkers, marketers) that people voted against this, and so Facebook is now a very locked-down place. I think most of those fears people have are overblown, but well, that's only my opinion and it seems that most people disagree.
Though those concerns may be real, the privacy issue being discussed is still a trick employed by Facebook. By redirecting people's fear towards the amount of information that the public can see, they were able to keep them from talking about the original issue -- what Facebook tracks, saves, and uses for advertising.
If you have a GitHub profile you can look yourself up here: https://www.findsosial.com/search