Numerical errors#
We like to think that if we run the same model/programme twice with the same inputs we would get the same outputs. This isn’t always strictly true.
We also like to think of the computer as deterministic. Even if not in those words, we have at least tacitly assumed that the computer always does the calculations correctly. Again, this isn’t strictly true.
We will very briefly consider why, which means understanding something about how computers work and handle numbers.
(some of the examples are taken from here https://docs.python.org/2/tutorial/floatingpoint.html)
Binary#
Computers work in binary, so numbers are stored as a series of bits representing powers of 2. When a computer is described as having a certain number of bits, this is what is meant. Most computers are now 64-bit, so the base representation of a number can have up to 64 bits. The largest unsigned integer is therefore \(2^{64}-1\) (as counting starts at 0), or 18,446,744,073,709,551,615. If we allow the integer to be signed, the largest value is \(2^{63}-1\), or 9,223,372,036,854,775,807 (but it can also be negative).
Each bit in the integer represents a power of 2. The first bit (bit 0) can be 0 or 1, where 1 means the value \(2^0 = 1\). The next bit (bit 1) can be 0 or 1, where 1 means the value \(2^1 = 2\).
So a value of 1 in binary is 0 1, while a value of 2 is 1 0, and 3 would be 1 1. This then carries on as more bits are added.
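As a quick check of the idea above, the following sketch prints a few small integers in binary and rebuilds each one from its bits. The variable names are illustrative only, and the 4-digit width is an arbitrary choice for display.

```python
# Each bit position k contributes 2**k when the bit is set.
for value in (1, 2, 3, 10):
    bits = format(value, "04b")  # 4-digit binary string, most significant bit first
    # Rebuild the value by summing the powers of 2 whose bit is 1.
    rebuilt = sum(2**k for k, bit in enumerate(reversed(bits)) if bit == "1")
    print(value, bits, rebuilt)

# Largest values a fixed 64-bit integer type could hold. Python's own int
# is unbounded, so these are computed here purely for illustration.
max_unsigned_64 = 2**64 - 1
max_signed_64 = 2**63 - 1
print(max_unsigned_64)
print(max_signed_64)
```

The two limits printed at the end match the values quoted earlier for unsigned and signed 64-bit integers.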
So the 32-bit signed integer -2080215094 can be broken down into the following bits:
| 31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 |
This means that if a value can be represented as an integer, and it fits within the available number of bits, then it can be represented exactly in a computer program.
The same is not true of decimals.
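The 32-bit pattern shown for -2080215094 can be reproduced with a short sketch. It assumes the usual two's complement convention for signed integers, which the text does not name explicitly but which is consistent with the bits listed.

```python
value = -2080215094
width = 32

# Masking with 2**width - 1 gives the unsigned integer that shares the same
# 32-bit two's complement pattern as the negative value.
pattern = format(value & (2**width - 1), "032b")
print(pattern)  # bit 31 on the left, bit 0 on the right

# Decode it again: bit 31 carries weight -2**31, the other bits +2**k.
decoded = -int(pattern[0]) * 2**31 + int(pattern[1:], 2)
print(decoded)
```

Decoding recovers the original integer exactly, which is the sense in which integers within range are stored without error.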
Decimals#
A quick reminder: not all numbers can be represented exactly as a decimal. For example, 1/3 repeats infinitely as 0.33333333…, while Pi cannot even be represented as a fraction and its decimal expansion is infinite in length without repeating.
We can have similar issues in other bases. Although we can represent 1/10 perfectly in base 10 as 0.1, it is not possible to represent it exactly in base 2 (binary). Instead, in binary it is an infinitely repeating fraction.
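The repeating binary expansion of 1/10 can be seen by doing the base-2 long division by hand. The sketch below uses exact `Fraction` arithmetic so that no float rounding creeps in; the helper name `binary_fraction_digits` is made up for this example.

```python
from fractions import Fraction

def binary_fraction_digits(x, n_digits):
    """Return the first n_digits binary digits after the point, for 0 <= x < 1."""
    digits = []
    for _ in range(n_digits):
        x *= 2              # shift one binary place to the left
        bit = int(x >= 1)   # the integer part is the next digit
        digits.append(str(bit))
        x -= bit            # keep only the fractional part
    return "".join(digits)

# 1/10 never terminates in base 2: the block "1100" repeats forever.
print(binary_fraction_digits(Fraction(1, 10), 24))
# 1/8 is a power of 2, so it terminates after three digits.
print(binary_fraction_digits(Fraction(1, 8), 24))
```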
So before we have even included a computer in the mix, we can recognise that not all numbers can be represented with exact precision in all bases.
Decimals Floating Point#
The first thing to recognise is that nearly all computer languages offer a floating point representation rather than a decimal representation.
Floats represent a number in two parts. One part holds the digits (called the significand) to a given number of significant figures, stored as an integer. The other part is the exponent, the power of the base by which the significand is scaled. In decimal scientific notation the base is ten; in the binary floats that most hardware uses, it is two.
This allows flexibility in calculations, but it is very common for a number not to be exactly representable in this format. Many floats are, to a certain extent, approximations of the true decimal, and in cases such as 1/3 or Pi they are approximations of an approximation.
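To make the two parts concrete, here is a small sketch that takes a Python float apart. It assumes the common case where Python floats are IEEE 754 double precision values with base 2, which is not stated in the text but holds on essentially all current platforms.

```python
import math

x = 0.1

# The float is stored as an exact ratio of two integers, and the
# denominator is always a power of 2.
numerator, denominator = x.as_integer_ratio()
print(numerator, denominator)
print(denominator == 2**55)

# frexp splits x into a significand in [0.5, 1) and a power-of-2 exponent,
# such that x == significand * 2**exponent.
significand, exponent = math.frexp(x)
print(significand, exponent)
print(math.ldexp(significand, exponent) == x)
```

The ratio printed first is the value the computer actually holds for 0.1; it is close to, but not equal to, one tenth.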
Types of Numeric Error#
Having briefly covered these problems, we can introduce the numeric errors we are likely to meet.
Representation Error#
Consider 1/10 as a float. As we discussed, it cannot be written exactly in binary, but floats are stored (in part) as binary integers, so the stored value is only the closest available approximation.
Try the following code:
from decimal import Decimal

print(0.1)           # short, rounded display of the float
print(Decimal(0.1))  # exact value that is actually stored
What happened? With print we get a nice 0.1, but with Decimal we get a very odd number. This is because print rounds the float for display to make our life easier, while Decimal shows the exact value that is stored, with no rounding.
Now the difference looks insignificant, but imagine running a calculation many times:
su = 0
for val in range(100):
    su = su + 0.1
print(su)
Oh dear
This is an error due to how the number is represented.
It will also appear when numbers are rounded.
try:
round(2.675, 2)
then
Decimal(2.675)
Again, the way the number is displayed has hidden the fact that the closest floating point representation to 2.675 is slightly smaller than 2.675, so it rounds down to 2.67. This can lead to common logic failures where > or < comparisons don’t give the expected results.
This is why you are advised to use integers in comparisons where possible or, if you must use floats, to compare within a tolerance rather than testing for exact equality.
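A minimal sketch of these workarounds, using only the standard library: `math.isclose` compares within a tolerance, and building a `Decimal` from a string (rather than from a float) keeps the digits exactly as written. The tolerance used is the `math.isclose` default, which may or may not suit a given application.

```python
import math
from decimal import Decimal, ROUND_HALF_UP

a = 0.1 + 0.2
print(a == 0.3)              # exact equality on floats
print(math.isclose(a, 0.3))  # equality within a relative tolerance

# Working in integer units (here, tenths) avoids the issue entirely.
print(1 + 2 == 3)

# A Decimal built from a string holds 2.675 exactly, so rounding
# half up behaves as it would on paper.
rounded = Decimal("2.675").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(rounded)
```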
Round off#
Round-off error is due to the limited size of the floating point representation. For example, pi is stored as 3.141592653589793115997963468544185161590576171875, which is the true expansion cut off and rounded to the nearest value the float can hold. Again, it is only a small difference.
But if you repeat a calculation many times, these small errors can accumulate until they become meaningful.
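One way to limit this accumulation when summing, sketched here with the standard library's `math.fsum`, is to use a summation routine that keeps track of the low-order bits a plain running total would discard. This is one possible mitigation, not the only one.

```python
import math

values = [0.1] * 100

naive = 0.0
for v in values:
    naive += v  # each addition rounds, and the rounding errors add up

accurate = math.fsum(values)  # tracks the lost low-order bits internally

print(naive)
print(accurate)
print(abs(naive - 10.0))  # size of the accumulated error
```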
Round-off can also occur when a big and a small number are added together. Because of how the float is represented, the effect of the small number might be lost, as it cannot be represented within the available precision. For example, if the float holds 8 significant digits and the two numbers differ by more than 8 orders of magnitude, the smaller number has no effect on the sum.
Therefore, where possible it is advised to add the small numbers together before adding them to a bigger number to make sure they get included.
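The effect of summation order can be illustrated with a short sketch. The specific values (1e16 and ten copies of 1.0) are chosen for illustration, assuming the usual double precision floats where neighbouring values near 1e16 are 2.0 apart.

```python
big = 1e16
small = [1.0] * 10

# Adding the small values one at a time to the big one: each 1.0 is less
# than the gap between adjacent floats near 1e16, so each addition rounds
# back to 1e16.
big_first = big
for s in small:
    big_first += s

# Adding the small values together first lets their total (10.0) register.
small_first = big + sum(small)

print(big_first == big)
print(small_first - big)
```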
Precision#
Something to consider is how precise a number can be. We can print out or represent a number to a large number of figures, but are these all meaningful, or is it spurious precision?
The concept of machine epsilon helps to decide. Machine epsilon is the difference between 1 and the next larger floating point number. Adding it to 1 generates a new floating point number, whereas adding a sufficiently smaller value (under the usual round-to-nearest rule, anything up to half of machine epsilon) leaves the result equal to 1.
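In Python, machine epsilon for the built-in float is available as `sys.float_info.epsilon`. The sketch below checks the behaviour described above, assuming the default round-to-nearest mode of double precision floats.

```python
import sys

eps = sys.float_info.epsilon
print(eps)
print(eps == 2**-52)  # for double precision floats

# Adding eps to 1 reaches the next representable float.
print(1.0 + eps > 1.0)

# Adding half of eps is rounded away, so the result is still 1.
print(1.0 + eps / 2 == 1.0)
```

Digits printed beyond roughly 16 significant figures for a float of this kind are therefore not meaningful.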
Conclusion#
If you are aware of these errors then they can be worked around, or included as part of the analysis. But they are always something to consider when running any calculation.
Also, remember that a lot of the workings of a programming language are hidden from you, particularly in Python, which is designed for simplicity.