Performance optimizations for the GBC code 

Description
Now that the CGB code has been merged (!16), we need to focus on optimization so that CGB support does not come with too much of a performance impact.
Analyze the performance impact of commit f61e3f9e, which is suspected to be costly because of the `let cycles_n = cycles / self.mmu().speed.multiplier();` computation.
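One option worth measuring for the cycle normalization is replacing the division with a shift. The sketch below is not the Boytacean code: `Speed`, `shift()`, `normalize_div()` and `normalize_shift()` are hypothetical names, and it assumes the multiplier is only ever 1 (normal speed) or 2 (CGB double speed).

```rust
/// Hypothetical stand-in for the emulator's speed mode.
/// Assumption: only two modes exist, with multipliers 1 and 2.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Speed {
    Normal,
    Double,
}

impl Speed {
    /// Clock multiplier relative to the base (DMG) clock.
    pub fn multiplier(self) -> u16 {
        match self {
            Speed::Normal => 1,
            Speed::Double => 2,
        }
    }

    /// log2 of the multiplier, so a division becomes a right shift.
    pub fn shift(self) -> u32 {
        match self {
            Speed::Normal => 0,
            Speed::Double => 1,
        }
    }
}

/// Division-based normalization, mirroring the shape of the line in question.
pub fn normalize_div(cycles: u16, speed: Speed) -> u16 {
    cycles / speed.multiplier()
}

/// Shift-based normalization; equivalent only while multipliers are powers of two.
pub fn normalize_shift(cycles: u16, speed: Speed) -> u16 {
    cycles >> speed.shift()
}
```

Whether this helps in practice depends on how the compiler already lowers the division and on the cost of the `self.mmu()` access itself, so it should be benchmarked rather than assumed.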
The main areas that should be the focus of optimizations are:
- The PPU conditional code for CGB and DMG with color compatibility
- The variable clock speed code of the main GB cycle https://gitlab.stage.hive.pt/joamag/boytacean/-/blob/master/src/gb.rs#L396
- The usage of the `get_flipped()` method, which seems to be one of the culprits
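For the `get_flipped()` item, one direction to evaluate is resolving the flip flags once per tile row instead of once per pixel. The sketch below uses a hypothetical `Tile` type with an 8x8 pixel buffer; the real signature and layout in Boytacean may differ, and `row_flipped()` is an illustrative helper, not an existing method.

```rust
/// Tile width and height in pixels.
pub const TILE_SIZE: usize = 8;

/// Hypothetical tile with one color index per pixel, stored row by row.
pub struct Tile {
    pub buffer: [u8; TILE_SIZE * TILE_SIZE],
}

impl Tile {
    /// Per-pixel lookup: evaluates both flip branches on every call.
    pub fn get_flipped(&self, x: usize, y: usize, xflip: bool, yflip: bool) -> u8 {
        let x = if xflip { TILE_SIZE - 1 - x } else { x };
        let y = if yflip { TILE_SIZE - 1 - y } else { y };
        self.buffer[y * TILE_SIZE + x]
    }

    /// Per-row lookup: resolves the flips once and returns the whole row.
    pub fn row_flipped(&self, y: usize, xflip: bool, yflip: bool) -> [u8; TILE_SIZE] {
        let y = if yflip { TILE_SIZE - 1 - y } else { y };
        let start = y * TILE_SIZE;
        let mut row = [0u8; TILE_SIZE];
        row.copy_from_slice(&self.buffer[start..start + TILE_SIZE]);
        if xflip {
            row.reverse();
        }
        row
    }
}
```

A scanline renderer could call `row_flipped()` once per tile and then copy the eight pixels, which takes the flip branches out of the inner loop. The gain, if any, should be confirmed against the reference build.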
Version 0.7.4 of Boytacean should be used as a reference for benchmarking.