Data Alignment...
Hi All,
Another thing that really speeds up external processing is making sure that data vectors are aligned on 16-byte or 32-byte boundaries. The vDSP specs tell you as much. Since Lisp will not guarantee this, and probably cannot produce better than 4- or 8-byte alignment in the static region, I resorted to checking, in the C glue code, the base address of arrays being passed in for FFT processing.
If an incoming array is not already aligned to a 16-byte boundary, I use a cached block of temporary storage, allocated on a 16-byte boundary in the C glue code. I copy the incoming data to that temporary region, call the FFT routines on it, and then copy the results of the in-place computations back into the array that was originally passed. Even with all this extra copying, the speedup is around 2x over the use of non-aligned data.
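As a rough illustration, the check-and-copy step could be sketched in C along the following lines. This is not the actual glue code: the names is_aligned, run_aligned, inplace_fn, and scale_by_two are hypothetical, scale_by_two merely stands in for a real in-place FFT call, and the sketch assumes a C11 library that provides aligned_alloc.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FFT_ALIGN 16u  /* required alignment in bytes (assumed value) */

/* Hypothetical signature for an in-place routine; a real FFT call would differ. */
typedef void (*inplace_fn)(float *data, size_t n);

/* Cached scratch buffer, reused across calls and grown only when needed. */
static float *scratch = NULL;
static size_t scratch_len = 0;  /* capacity in floats */

/* Nonzero if p sits on an align-byte boundary. */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}

/* Stand-in for the in-place computation: doubles every element. */
static void scale_by_two(float *data, size_t n)
{
    for (size_t i = 0; i < n; i++)
        data[i] *= 2.0f;
}

/* Run fn on data, going through the aligned scratch buffer when data
   itself is misaligned. Returns 0 on success, -1 if allocation fails. */
static int run_aligned(float *data, size_t n, inplace_fn fn)
{
    if (n == 0)
        return 0;
    if (is_aligned(data, FFT_ALIGN)) {
        fn(data, n);                      /* already aligned: no copying */
        return 0;
    }
    if (n > scratch_len) {
        /* aligned_alloc wants a size that is a multiple of the alignment */
        size_t bytes = n * sizeof(float);
        bytes = (bytes + FFT_ALIGN - 1) & ~(size_t)(FFT_ALIGN - 1);
        float *p = aligned_alloc(FFT_ALIGN, bytes);
        if (p == NULL)
            return -1;
        free(scratch);
        scratch = p;
        scratch_len = n;
    }
    memcpy(scratch, data, n * sizeof(float));  /* copy in */
    fn(scratch, n);                            /* compute on aligned data */
    memcpy(data, scratch, n * sizeof(float));  /* copy results back */
    return 0;
}
```

A caller would pass the array's base address and length, e.g. run_aligned(ptr, n, scale_by_two), with the stand-in replaced by whatever in-place routine is actually being wrapped. Note that the single cached buffer makes this sketch non-reentrant.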
So, on my wish list for future improvements to LW, I would like to see an option to allocate a :static array on a user-specified byte-alignment boundary, e.g., 8, 16, or 32 bytes. Who knows? Down the road it may even be desirable to align on cache-line boundaries, so possibly 128 or 256 bytes.
Dr. David McClain
Chief Technical Officer
Refined Audiometrics Laboratory
4391 N. Camino Ferreo
Tucson, AZ 85750
email: dbm@refined-audiometrics.com
phone: 1.520.390.3995