Are those outputs actually from the 671B model? The 671B model needs 8xH200 GPUs at minimum, which costs $25/hr to rent. If you didn't pay that much, you were not running R1 but one of the Qwen- or Llama-based distillations. We paid that much to rent a machine to run the full 671B model!
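For a rough sanity check on the 8xH200 figure, here is a back-of-the-envelope sketch. It assumes 141 GB of memory per H200 and counts only the weights (1 byte per parameter at FP8, 2 bytes at BF16). KV cache, activations, and framework overhead are ignored, so real requirements are higher. The function name is made up for illustration.

```python
# Rough weight-memory estimate; ignores KV cache, activations, and overhead.

def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # (params_billions * 1e9 params) * bytes_per_param / 1e9 bytes per GB
    return params_billions * bytes_per_param

PARAMS_B = 671   # full model size in billions of parameters
H200_GB = 141    # assumed memory per H200
NUM_GPUS = 8

total_vram_gb = H200_GB * NUM_GPUS

for label, bytes_per_param in [("FP8", 1.0), ("BF16", 2.0)]:
    need = weight_memory_gb(PARAMS_B, bytes_per_param)
    verdict = "fits" if need < total_vram_gb else "does not fit"
    print(f"{label}: weights need {need:.0f} GB, "
          f"{NUM_GPUS} GPUs have {total_vram_gb} GB -> {verdict}")
```

Under these assumptions the FP8 weights alone take about 671 GB, which fits in the 1128 GB of an 8xH200 node with headroom for the cache, while the BF16 weights would not fit on one such node.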
Sure, but he explicitly said 'GPU Servers', so he likely didn't use the CPU for inference, which makes the question about what GPU setup they used a valid one.