The Xtensa implementation of arch_irq_offload() required that the user
select the correct interrupt manually, and would race with itself if
invoked from separate CPUs (it was saved here by the main
irq_offload() function which has a semaphore to serialize access).
Use the new gen_zsr.py script to automatically detect the highest
available software interrupt, and keep a per-CPU set of
callback/parameter pointers.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>