If you are using sync ops (__sync_fetch_and_add, __sync_val_compare_and_swap, …), be careful about messing with alignment. If you end up doing sync ops on a value that crosses a cache line boundary (say, an 8-byte int whose address is 60 mod 64, so it occupies bytes 60 through 67 and straddles two 64-byte lines), the performance of those ops will get crushed, to the tune of a few orders of magnitude.
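
As a rough sketch of how you might guard against this, here is a small check for whether an object straddles a cache line, plus a counter wrapper that asserts on it before doing the sync op. The 64-byte line size is an assumption taken from the example above, and `crosses_cache_line`, `counter_t`, and `counter_add` are made-up names for illustration, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, matching the "60 mod 64" example. */
#define CACHE_LINE_SIZE 64

/* Returns 1 if the object occupying [addr, addr + size) touches two
 * different cache lines, 0 otherwise. Requires size >= 1. */
static int crosses_cache_line(uintptr_t addr, size_t size)
{
    return (addr / CACHE_LINE_SIZE) != ((addr + size - 1) / CACHE_LINE_SIZE);
}

/* Hypothetical counter type. Forcing 8-byte alignment on an 8-byte value
 * means it can never straddle a 64-byte line, even if the enclosing code
 * uses packed layouts elsewhere. */
typedef struct {
    int64_t value __attribute__((aligned(8)));
} counter_t;

/* Atomically adds n and returns the previous value. The assert is a
 * debug-build check that catches a counter reached through a misaligned
 * pointer (for example, one cast out of a byte buffer). */
static int64_t counter_add(counter_t *c, int64_t n)
{
    assert(!crosses_cache_line((uintptr_t)&c->value, sizeof c->value));
    return __sync_fetch_and_add(&c->value, n);
}
```

With these definitions, `crosses_cache_line(60, 8)` reports a split while `crosses_cache_line(56, 8)` does not. The check is cheap enough to leave in debug builds wherever a pointer to an atomically updated value comes from a packed struct or a raw buffer.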