This is pretty cool. I'd like to see a breakdown of the influence some of the more popular sources had on other works, or to know who the most prolific borrowers of references and themes are (influencees? Chrome tells me that's not a word). Either way, it's interesting to think about cultural references within a fairly closed environment like TV shows and movies as a network, with all the reasoning tools a network model affords.
As an aside, TV Tropes doesn't use HTTPS as far as I know (and according to this too [0]), so none of the show/movie links resolve unless you go in and remove the prefix and let the browser figure it out. Looking at your source, it seems you could just change the "https://" to "http://" or even "//" (although this would break when(/if) you start using HTTPS down the road).
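For what it's worth, the prefix change could be sketched roughly like this. It's only an illustration: I'm assuming the links live in a string of HTML, and `downgrade_tvtropes_links` is a made-up name, not anything from the actual page source.

```python
import re


def downgrade_tvtropes_links(html):
    """Rewrite https:// TV Tropes links to http://, leaving other hosts alone."""
    # Only the scheme in front of tvtropes.org is touched, so links to
    # sites that do support HTTPS keep their original prefix.
    return re.sub(r"https://(tvtropes\.org/)", r"http://\1", html)
```

Restricting the pattern to the one host avoids downgrading any unrelated links on the page.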
A giant graph could be in the cards for future work, schedule permitting.
As for https, I'm currently hosting this with GitHub, but using my own domain, so my browser kept complaining (correctly) that the page was using an SSL certificate from another domain. Eventually, I'll move to hosting this on my own somewhere, probably when I get the urge to run some server-side code. Then I'll most likely make https the default.
The list tracks well with the relative number of instances of cosplay or t-shirts worn at SDCC. (With gaps, since "Marvel" isn't going to show up that way on the site.)
Yeah, I was worried when I was doing this that, say, Batman would be underrepresented because the references point to specific movies or comic books, but that worry seems to have been misplaced. Maybe later work will group entries into categories, but I'd only do it if it could be done programmatically.
This is awesome work! Is there anywhere we can get the generated graph data, whether as an adjacency list or a matrix? I want to do some analysis on this graph too :-)
I have it as a SQLite database on my machine. The full database is too big to upload to GitHub, but I'll see if the table with the edges can be made into a small enough JSON file to be uploaded.
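A rough sketch of what that export might look like, in case it helps. The table name `edges` and the columns `src` and `dst` are guesses for illustration only; the real schema isn't described here.

```python
import json
import sqlite3


def export_edges(conn, table="edges", src_col="src", dst_col="dst"):
    """Return the edge table as a compact JSON array of [source, target] pairs."""
    # The table and column names are hypothetical. Identifiers cannot be
    # bound as SQL parameters, so only pass trusted names here.
    query = f'SELECT "{src_col}", "{dst_col}" FROM "{table}" ORDER BY 1, 2'
    rows = conn.execute(query).fetchall()
    # Dropping the default whitespace keeps the file as small as possible.
    return json.dumps([list(row) for row in rows], separators=(",", ":"))


def demo():
    """Build a tiny in-memory database and export it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
    conn.executemany(
        "INSERT INTO edges VALUES (?, ?)",
        [("Firefly", "StarWars"), ("Community", "DoctorWho")],
    )
    return export_edges(conn)
```

Storing bare pairs rather than objects with repeated keys should help keep the JSON under whatever size limit applies.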
I have tried some graph analysis using my own graph analytics framework, Gunrock (http://gunrock.github.io/gunrock/). The PageRank results are pretty similar (the top three are DoctorWho, StarWars, and EVA). My colleague and I are interested in doing more analysis on this site. We are trying to build a crawler and maybe use a bipartite graph to build some kind of recommendation system. All inspired by your project here. Stay tuned :-)
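For anyone following along who wants to try the ranking without a GPU framework, here is a minimal PageRank sketch using plain power iteration. This is not Gunrock's implementation; the edge list format (referencing work, referenced work), the damping factor of 0.85, and the fixed iteration count are all assumptions for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Approximate PageRank for a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from a uniform distribution over all pages.
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for src, targets in out_links.items():
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


def demo():
    """Rank a toy graph where three works all reference 'Hub'."""
    toy_edges = [("A", "Hub"), ("B", "Hub"), ("C", "Hub"), ("C", "A")]
    return pagerank(toy_edges)
```

On the toy graph the work everyone references ends up on top, which is the same effect that puts heavily referenced franchises at the head of the real list.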
Thanks, this is exactly the kind of thing I hope will happen when I post on HN. I have the crawler I used at https://github.com/jsnider3/QuisCustodiet/blob/master/crawle..., but it needs some post-processing at the end to filter out links that redirect and pages that aren't works of pop culture.
[0]: http://tvtropes.org/pmwiki/posts.php?discussion=13461707110A...