Does anyone here know of an article comparing GPU architectures for non-deep-learning workloads that goes into similar depth as this one, covering things like memory latencies, cycles, caches, etc.? Kudos to the author of the above article, I really enjoyed reading it!