Then you need to ask yourself: performance or memory efficiency? Is it worth the extra cycles and instructions to pack 8 bools into one byte and AND it with a bitmask to pick out the relevant one?
And you may ask yourself: where is my beautiful house? Where is my beautiful wife?
Letting the days go by, let the water hold me down
Talking Heads - Once in a Lifetime
Sounds like a compiler problem to me. :p
A lot of times using less memory is actually better for performance because the main bottleneck is memory bandwidth or latency.
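To put rough numbers on the footprint argument, here is a minimal Rust sketch comparing one byte per flag against one bit per flag. The flag count of 512 and the 64-byte cache line are assumptions picked for illustration (64 bytes is common but not universal), and `cache_lines` assumes the buffer starts on a line boundary.

```rust
use std::mem::size_of;

/// Number of flags in this illustration (an arbitrary choice).
const FLAGS: usize = 512;

/// One byte per flag: Rust guarantees that `bool` occupies 1 byte.
type ByteFlags = [bool; FLAGS];

/// One bit per flag, packed into 64-bit words.
type PackedFlags = [u64; FLAGS / 64];

/// Assumed cache line size in bytes; check your target before relying on it.
const CACHE_LINE: usize = 64;

/// Cache lines spanned by `bytes` bytes, assuming line-aligned storage.
fn cache_lines(bytes: usize) -> usize {
    (bytes + CACHE_LINE - 1) / CACHE_LINE
}

/// Sizes in bytes of the two layouts: (byte-per-flag, bit-per-flag).
fn footprint() -> (usize, usize) {
    (size_of::<ByteFlags>(), size_of::<PackedFlags>())
}
```

Under those assumptions the byte-per-flag layout spans 8 cache lines and the packed one spans 1, which is where the bandwidth and latency savings would come from.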
It’s not just less memory though - it might also introduce spurious data dependencies, e.g. to store a bit you now need to also read the old value of the byte that it’s in.
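The dependency mentioned above can be seen by writing both stores out by hand. This is only a sketch with hypothetical helper names (`set_byte_flag`, `set_packed_flag`); what the compiler and CPU actually do with it depends on the target.

```rust
/// Byte-per-flag store: a plain write that never needs the old value.
fn set_byte_flag(flags: &mut [bool], i: usize, value: bool) {
    flags[i] = value;
}

/// Bit-per-flag store: a read-modify-write of the containing byte.
fn set_packed_flag(bytes: &mut [u8], i: usize, value: bool) {
    let mask = 1u8 << (i % 8);
    // The new byte cannot be computed until this load has completed.
    let old = bytes[i / 8];
    bytes[i / 8] = if value { old | mask } else { old & !mask };
}
```

The packed version has to load `old` before it can store, so the store depends on an earlier read of the same byte, while the byte-per-flag version is a standalone store.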
Those need to be in the smallest cache or a register anyway. If they are in registers, a modern, instruction-reordering CPU will deal with that fine.
Many architectures read the cache line on a write miss anyway.
The only cases I can see where byte-sized bools seem better are either when you use so few that they all fit in one cache line anyway (in which case performance will be great either way), or when you are repeatedly accessing a bit vector from multiple threads, in which case you should make sure that’s actually what you want to be doing.
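For the multi-threaded case, a plain read-modify-write on a shared byte could lose an update made by another thread to a neighboring bit. One way to avoid that, sketched here with hypothetical helpers and assuming `Relaxed` ordering is enough for independent flags, is an atomic OR on the containing byte:

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Set bit `i` in a shared bit vector without clobbering concurrent
/// updates to the other bits in the same byte.
fn set_shared_flag(bytes: &[AtomicU8], i: usize) {
    bytes[i / 8].fetch_or(1u8 << (i % 8), Ordering::Relaxed);
}

/// Read bit `i` from a shared bit vector.
fn get_shared_flag(bytes: &[AtomicU8], i: usize) -> bool {
    (bytes[i / 8].load(Ordering::Relaxed) & (1u8 << (i % 8))) != 0
}
```

This keeps the bits correct, but threads writing to the same cache line will still contend for it, which is part of why it is worth checking that a shared bit vector is really what you want.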
Yep, and ANDing with a bit mask is incredibly fast, so it’s not a big issue for performance.
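For reference, the read side looks roughly like this in Rust. The function names are made up for the example, and the exact instruction sequence is up to the compiler and target.

```rust
/// Read flag `i` from a packed bit vector: a shift, an AND and a compare.
fn get_packed_flag(bytes: &[u8], i: usize) -> bool {
    (bytes[i / 8] & (1u8 << (i % 8))) != 0
}

/// Count the set flags, handling 8 flags per byte at a time.
fn count_set(bytes: &[u8]) -> u32 {
    bytes.iter().map(|b| b.count_ones()).sum()
}
```

`count_set` also hints at where packing can help: whole-vector operations touch 8 flags per byte instead of 1.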