

mananaysiempre 78 days ago | on: Binary Formats Gallery

UTF-8 is a regular language (as a subset of all octet strings), so that doesn’t feel like much of a benchmark? Something like TIFF or PE/COFF would seem to be a more reasonable standard. (PDF is probably too much to ask, seeing as understanding its structure requires a full Deflate decoder, among other things.)
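To illustrate the "regular language" point: well-formed UTF-8 can be recognized by a single regular expression over bytes, with no counting or nesting needed. The sketch below is only an illustration, not anything from the gallery itself; the names `UTF8_RE` and `is_valid_utf8` are made up here, and the byte ranges follow the well-formed sequence table in RFC 3629 (no overlong forms, no surrogates, nothing above U+10FFFF).

```python
import re

# One alternative per class of well-formed UTF-8 sequence (RFC 3629).
UTF8_RE = re.compile(
    rb"""(?:
        [\x00-\x7F]                         # 1 byte: ASCII
      | [\xC2-\xDF][\x80-\xBF]              # 2 bytes (C0/C1 would be overlong)
      | \xE0[\xA0-\xBF][\x80-\xBF]          # 3 bytes, excluding overlong forms
      | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}   # 3 bytes, general case
      | \xED[\x80-\x9F][\x80-\xBF]          # 3 bytes, excluding surrogates
      | \xF0[\x90-\xBF][\x80-\xBF]{2}       # 4 bytes, excluding overlong forms
      | [\xF1-\xF3][\x80-\xBF]{3}           # 4 bytes, general case
      | \xF4[\x80-\x8F][\x80-\xBF]{2}       # 4 bytes, up to U+10FFFF
    )*""",
    re.VERBOSE,
)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the whole byte string is well-formed UTF-8."""
    return UTF8_RE.fullmatch(data) is not None
```

For example, `is_valid_utf8("héllo".encode("utf-8"))` is true, while the overlong encoding `b"\xc0\x80"` is rejected. A format like TIFF, where offsets point elsewhere in the file, can't be checked this way.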

pastage 77 days ago

You can handle Deflate with Kaitai, or write custom handlers in Python.

https://doc.kaitai.io/user_guide.html#process https://github.com/kaitai-io/kaitai_compress/blob/master/pyt...
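A rough sketch of what such a Python handler might look like, using only the standard `zlib` module. The class name `RawDeflate` is hypothetical, and the `decode(data)` method shape is an assumption about what a custom processing routine is expected to provide; check the user guide linked above for the exact interface.

```python
import zlib


class RawDeflate:
    """Hypothetical custom handler: inflates a raw Deflate stream."""

    def decode(self, data: bytes) -> bytes:
        # A negative wbits value tells zlib to expect a raw Deflate
        # stream, i.e. one without the zlib header and Adler-32 trailer.
        return zlib.decompress(data, -15)
```

If the stream carries a zlib wrapper instead, plain `zlib.decompress(data)` would be the call to use.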
