It seems silly to do that just because of asymmetric processors. The small cores could simply not support the instructions, and the OS could handle the resulting fault by moving the task to a big core. Perhaps this could be signaled via a separate CPU feature flag, so that libraries stop using AVX-512 instructions sporadically (e.g. in memcpy) and only use them in long-running loops. Or maybe give the OS a way to decide whether the CPU flags should be exposed to a specific process. Or applications could install a SIGILL handler and deal with it in userspace.
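To make the fault-and-migrate idea concrete, here is a toy model in Python. It is only a sketch of the policy, not kernel code: `Core`, `Task`, `IllegalInstruction` and `run_with_migration` are hypothetical names made up for illustration, and the "fault" is an ordinary exception standing in for the hardware trap.

```python
from dataclasses import dataclass


@dataclass
class Core:
    core_id: int
    features: frozenset  # ISA extensions this core implements


@dataclass
class Task:
    name: str
    allowed: set          # core ids the scheduler may place this task on
    core_id: int = None   # core the task is currently running on
    migrations: int = 0   # how many times a fault forced a move


class IllegalInstruction(Exception):
    """Stands in for the hardware fault raised on an unsupported instruction."""


def execute(task, cores, feature):
    """Pretend to run one instruction that needs `feature` on the current core."""
    if feature not in cores[task.core_id].features:
        raise IllegalInstruction(feature)


def run_with_migration(task, cores, feature):
    """Toy fault handler: on a fault, restrict the task to capable cores and retry."""
    try:
        execute(task, cores, feature)
    except IllegalInstruction:
        capable = {c.core_id for c in cores.values() if feature in c.features}
        candidates = task.allowed & capable
        if not candidates:
            # No permitted core implements the feature; in this model the
            # fault is simply passed on (the SIGILL case).
            raise
        task.allowed = candidates        # never schedule on a small core again
        task.core_id = min(candidates)   # "migrate" to a capable core
        task.migrations += 1
        execute(task, cores, feature)    # retry the faulting instruction
    return task.core_id
```

With one big core (id 0, implementing a feature called "avx512f" here) and one small core (id 1), a task started on core 1 runs AVX2 code in place, and the first AVX-512 instruction moves it to core 0 and keeps it there. The cost in this model is a single fault per task, which is why it only hurts if libraries sprinkle such instructions into short-lived code.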
This. And performance wouldn't be worse than using AVX twice...
But the worst-case scenario, with the OS either moving the process to a big core on an illegal instruction or scheduling it onto the right core based on a required-capabilities system, would also be quite acceptable most of the time.
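Until something like a required-capabilities system exists, the closest userspace approximation I can think of is a process restricting its own affinity up front. A minimal sketch, assuming Linux and assuming the caller already knows which CPU ids are the capable ones (the function name `pin_to_capable_cores` and the `capable_ids` parameter are hypothetical; nothing here detects core types):

```python
import os


def pin_to_capable_cores(capable_ids):
    """Restrict the calling process to the CPUs in `capable_ids`.

    Returns the resulting CPU set, or None on platforms where Python does
    not expose the affinity calls (they are Linux-specific).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)       # CPUs we may currently run on
    target = allowed & set(capable_ids)     # keep only the capable ones
    if not target:
        raise ValueError("no capable CPU in the current affinity mask")
    os.sched_setaffinity(0, target)         # pid 0 means the calling process
    return os.sched_getaffinity(0)
```

This is cruder than what is described above, since the application has to opt in and the list of capable CPUs has to come from somewhere else, but it shows that the "only run me where the instructions exist" part is already expressible.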
Plus, it'd be great for supporting more specialized cores designed for different purposes and running a single core ISA with extensions for their specific needs. IIRC, there are some ARM chips that have three different kinds of core.