
DeepSeek-V3.2 is a sparse attention architecture, while Zebra-Llama is a hybrid attention/SSM architecture. The outcome might be similar in some ways (close to linear complexity), but I think they are otherwise quite different.
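To make the contrast concrete, here is a minimal sketch of the two per-token primitives. This is illustrative only: the index set `idx`, the state shapes, and the parameter names are assumptions for the sketch, not the actual DeepSeek-V3.2 or Zebra-Llama implementations (DeepSeek's key selection is learned, and Mamba-style SSM layers use input-dependent parameters).

```python
import numpy as np

def sparse_attention_step(q, K, V, idx):
    # One query attends to only k selected keys out of N (O(k) per token).
    # `idx` stands in for whatever selection mechanism picks the keys;
    # the real selection in sparse-attention models is learned.
    Ks, Vs = K[idx], V[idx]                  # gather k of N keys/values
    scores = Ks @ q / np.sqrt(q.shape[0])    # (k,) scaled dot products
    w = np.exp(scores - scores.max())        # numerically stable softmax
    w /= w.sum()
    return w @ Vs                            # weighted value readout, (d,)

def ssm_step(h, x, A, B, C):
    # One linear state-space recurrence step (O(d_state) per token),
    # the building block of the SSM layers in attention/SSM hybrids.
    # Simplified: real selective SSMs make A, B, C depend on the input.
    h = A * h + B * x                        # update hidden state
    y = C @ h                                # read out from the state
    return h, y
```

Both primitives cost a constant amount of work per new token regardless of sequence length, which is why both families end up near linear total complexity even though the mechanisms (selecting a subset of past keys vs. compressing the past into a fixed-size state) are quite different.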



