One commonly asked question is “how do I represent fractional numbers on my fixed-point MCU, DSP or FPGA?” One of the best solutions is the Q number system.
The Q number system is a fixed-point system where the available bits are divided amongst the integer bits (those to the left of the binary point), the fractional bits (those to the right of the binary point) and a sign bit. You may ask “I know how integers are represented in binary, but what about fractions?” The answer is that, just like integer bits, fractional bits are multiplied by powers of two, except the powers are negative. For example:
- $0.011_2 = 0 \cdot 2^{-1} + 1 \cdot 2^{-2} + 1 \cdot 2^{-3} = 0.375$
Q numbers can take on multiple forms with different numbers of fractional and integer bits. They are commonly written mQn or Qm.n, where m is the number of integer bits and n is the number of fractional bits. Note that m+n+1 = total number of bits available; the extra 1 is the sign bit.
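To make the notation concrete, here is a minimal sketch in C of converting between a real value and a 16-bit Q number with n fractional bits. The function names `q_from_double` and `q_to_double` are made up for this illustration, and the floating-point conversion is only meant for building constants or checking results on a PC, not for use on the fixed-point target itself.

```c
#include <stdint.h>

/* Hypothetical helper: convert a real value to a 16-bit Q number with
   n fractional bits (0 <= n <= 15), rounding to nearest and saturating
   at the limits of a signed 16-bit word. */
static int16_t q_from_double(double x, int n)
{
    double scaled = x * (double)(1 << n);   /* multiply by 2^n */
    if (scaled >= 32767.0)  return INT16_MAX;
    if (scaled <= -32768.0) return INT16_MIN;
    return (int16_t)(scaled + (scaled >= 0.0 ? 0.5 : -0.5));
}

/* Hypothetical helper: the real value is the stored integer divided by 2^n. */
static double q_to_double(int16_t q, int n)
{
    return (double)q / (double)(1 << n);
}
```

With n = 15 (Q15), 0.375 is stored as 0.375 * 32768 = 12288. With n = 3 the same value is stored as 3, which is the bit pattern 011 from the example above.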
Arithmetic
Addition/Subtraction:
Q numbers of the same form can be added together with no issue. The only thing to consider here is overflow.
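One common way to handle the overflow case is to saturate instead of letting the result wrap around. The sketch below assumes 16-bit Q numbers of the same form and a 32-bit intermediate; `q_add_sat` is a name chosen for this example, and many DSPs offer a saturating add instruction that would replace it.

```c
#include <stdint.h>

/* Hypothetical helper: add two 16-bit Q numbers of the same form.
   The sum is formed in 32 bits so it cannot wrap, then clamped to the
   16-bit range. */
static int16_t q_add_sat(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
}
```

In Q15, for example, 0.5 + 0.5 (16384 + 16384) would wrap to a negative number in plain 16-bit arithmetic; with saturation it clamps to 32767, the largest representable value.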
If you have different forms they need to be converted before the arithmetic. This can be done by shifting. For example:
- 2Q13 << 1 is now 1Q14 (lose an integer bit and gain a fractional bit; the value must fit in the remaining integer bit or it overflows) and
- 3Q12 >> 1 is now 4Q11 (lose a fractional bit and gain an integer bit)
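As a rough sketch of the conversion step, the C code below adds a 2Q13 value to a 3Q12 value by first shifting the 2Q13 operand right by one so that both have 12 fractional bits. The function name `add_2q13_to_3q12` is made up for this example. It also assumes the compiler performs an arithmetic (sign-preserving) right shift on negative values, which the C standard leaves implementation-defined but which is the usual behaviour on MCU and DSP toolchains.

```c
#include <stdint.h>

/* Hypothetical helper: returns a + b in 3Q12 format, where a is 2Q13
   and b is 3Q12. */
static int16_t add_2q13_to_3q12(int16_t a_2q13, int16_t b_3q12)
{
    /* 2Q13 -> 3Q12: drop one fractional bit (assumes arithmetic shift). */
    int16_t a_3q12 = (int16_t)(a_2q13 >> 1);

    /* Same form now, so add in 32 bits and saturate on overflow. */
    int32_t sum = (int32_t)a_3q12 + (int32_t)b_3q12;
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
}
```

For example, 1.5 in 2Q13 is 12288 and 2.25 in 3Q12 is 9216. After the shift the first operand becomes 6144, and the sum 15360 is 3.75 in 3Q12 (15360 / 4096). Converting towards the form with more integer bits costs precision but cannot overflow during the conversion itself.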
Multiplication:
- The rule when multiplying two Q numbers together is:
- m1Qn1 * m2Qn2 = (m1+m2)Q(n1+n2)
Once the multiplication is complete, then a shift is needed to get it into the Q format the system needs.
The big issues with multiplication are overflow and precision loss. When m > 0, scaling the result back to your original format is difficult. For example:
- 2Q13 * 2Q13 = 4Q26
In order to scale this back to the original 16 bits you either have to sacrifice integer bits (you have to be very careful that the top integer bits don’t contain information, i.e. limit the overflow) or lose precision by discarding fractional bits. The solution to this is to try to use systems where m=0.
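The 2Q13 example can be sketched in C as follows. The product is formed in a 32-bit intermediate (the 4Q26 result), shifted right by 13 to return to 13 fractional bits, and then saturated because the top integer bits may still carry information. `mul_2q13` is a name chosen for this illustration, and the right shift is assumed to be arithmetic for negative products.

```c
#include <stdint.h>

/* Hypothetical helper: multiply two 2Q13 values and return a 2Q13 result. */
static int16_t mul_2q13(int16_t a, int16_t b)
{
    int32_t prod = (int32_t)a * (int32_t)b;  /* 4Q26 in a 32-bit word */
    int32_t scaled = prod >> 13;             /* back to 13 fractional bits,
                                                discarding the low 13 bits */
    /* The result only fits 2Q13 if the extra integer bits are empty. */
    if (scaled > INT16_MAX) return INT16_MAX;
    if (scaled < INT16_MIN) return INT16_MIN;
    return (int16_t)scaled;
}
```

For instance, 1.5 * 1.5 (12288 * 12288) gives 18432, which is 2.25 in 2Q13. On the other hand 3.0 * 3.0 = 9 does not fit in two integer bits, so this sketch clamps it to the maximum value rather than letting it wrap.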
Digital Control
Choosing the Q number format for digital control is important. The general rule of thumb is that you want as much precision as possible and you want to avoid overflows in multiplication. The best solution is therefore to make all your bits fractional (i.e. m=0). This gives as much precision as your system allows and keeps products in range, since two values of magnitude less than 1 multiply to a result of magnitude less than 1. The one exception is -1 x -1 = +1, which lies just outside the range and has to be saturated or avoided. In a 16-bit system this is 0Q15 (referred to as Q15).
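A Q15 multiply might look like the sketch below. It assumes a 32-bit intermediate, adds half an LSB before shifting so the result is rounded rather than truncated, and saturates the single corner case where both inputs are exactly -1 (whose product, +1, is not representable in Q15). The name `q15_mul` is illustrative, and the right shift is assumed to be arithmetic for negative values.

```c
#include <stdint.h>

/* Hypothetical helper: multiply two Q15 values and return a Q15 result. */
static int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t prod = (int32_t)a * (int32_t)b;   /* Q30 product */
    int32_t rounded = (prod + (1 << 14)) >> 15; /* round, back to Q15 */

    /* Only -1 * -1 can exceed the range; clamp it to just below +1. */
    if (rounded > INT16_MAX) return INT16_MAX;
    return (int16_t)rounded;
}
```

Here 0.5 * 0.5 (16384 * 16384) returns 8192, which is 0.25, and -1 * -1 (-32768 * -32768) returns 32767 instead of wrapping to -32768.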
Once you have your system, you need to make sure that all inputs and outputs fit this system and fall within the range -1 <= x < 1. This is as simple as setting your inputs and outputs so that +1 = full scale positive and -1 = full scale negative.
The key for this to work in a digital control system is to remember the gains on the inputs and outputs. This means remembering what +1 and -1 stand for. For example, a voltage input may be -230V to +230V and an output may be -400V to +400V. The input gain is therefore 1/230 and the output gain is 400. Once you have these gains you need to include them in your design of the control system, whether it be through calculation or simulation. Failing to include them leads to incorrect margins and possibly instability.
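The gains can be illustrated with a small C sketch that maps a physical value to Q15 and back using a full-scale value. The helpers `to_q15` and `from_q15` are hypothetical and use floating point purely to show the scaling. In a real system the input scaling is normally done by the sensing circuit and ADC, and the output scaling by the modulator and power stage, so the processor itself only ever sees the Q15 numbers.

```c
#include <stdint.h>

/* Hypothetical helper: map a physical value to Q15, where +/-full_scale
   corresponds to +/-1. This is the "input gain" of 1/full_scale. */
static int16_t to_q15(double value, double full_scale)
{
    double scaled = (value / full_scale) * 32768.0;
    if (scaled >= 32767.0)  return INT16_MAX;
    if (scaled <= -32768.0) return INT16_MIN;
    return (int16_t)(scaled + (scaled >= 0.0 ? 0.5 : -0.5));
}

/* Hypothetical helper: map a Q15 value back to physical units.
   This is the "output gain" of full_scale. */
static double from_q15(int16_t q, double full_scale)
{
    return ((double)q / 32768.0) * full_scale;
}
```

With the figures above, an input of 115V on a 230V full scale becomes 0.5 in Q15 (16384), and a Q15 output of 0.5 on a 400V full scale corresponds to 200V. The combined factor of 400/230 between input and output is part of the loop gain, whether or not it appears anywhere in the code.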
One potential pitfall of the m=0 approach is how to deal with numbers greater than one. In digital control these come up quite often, typically as coefficients in biquad filters. The trick is to scale the coefficients by ½, perform the multiplies and then scale the result back by 2 (shift left by 1). This loses one bit of precision in this particular calculation; however, it is better than losing one fractional bit in all calculations.
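The halved-coefficient trick could be sketched as below for a single multiply. The coefficient is stored as half its real value so that it fits in Q15. In this particular arrangement the shift back to Q15 (right by 15) and the scale by 2 (left by 1) are folded into one right shift by 14; that is one possible way of doing it, not the only one. The name `q15_mul_half_coeff` is made up for this example, the right shift is assumed to be arithmetic, and the result is saturated because the final value can still exceed the Q15 range.

```c
#include <stdint.h>

/* Hypothetical helper: compute x * c in Q15, where the real coefficient c
   has magnitude below 2 and is passed in as c/2 in Q15 format. */
static int16_t q15_mul_half_coeff(int16_t x, int16_t coeff_half)
{
    int32_t prod = (int32_t)x * (int32_t)coeff_half; /* Q30, equals x*c/2 */
    int32_t y = prod >> 14;   /* >>15 back to Q15, then <<1 to undo the 1/2 */

    /* x*c can exceed +/-1 even though x and c/2 are both in range. */
    if (y > INT16_MAX) return INT16_MAX;
    if (y < INT16_MIN) return INT16_MIN;
    return (int16_t)y;
}
```

For a coefficient of 1.5, the stored value is 0.75 (24576 in Q15). An input of 0.5 (16384) then gives 24576, which is 0.75 as expected. In a biquad the same idea is usually applied to the whole sum of products: accumulate in 32 bits and apply the single shift once at the end.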
Conclusion
Q number systems give the designer a reliable fixed-point way to represent fractional numbers. This allows the use of less expensive fixed-point processors instead of the more complex and generally more expensive floating-point alternatives.