Functions and Scope

Prerequisites#

Learning Objectives#

To understand the purpose of a function, how they are written, and how they are called.
To define your own functions, and use them in your code.
To understand function arguments and return statements.
To write and use a docstring.

Functions#

We have already seen a number of functions in Python, although you might not have realised that that’s what they were. Examples include the built-in functions print(), range(), type() and len(), as well as some of those from the module math, including math.cos() and math.log10(). Each of these functions does a specific task which would take a long time to write out by hand in your program. Each function takes one or more inputs (known as arguments), which you place inside the brackets. For example, in print("hello") the string "hello" is the argument, and the word ‘hello’ is printed to your console. You don’t need to write a script explaining to the computer how to translate the input and put it onto your screen, because that code is already linked to the name print and runs automatically when the function is called.

Long story short, functions save time.

This is relevant because we will want to use the same lines of code over and over again. For example, the code below takes a list of atomic masses and iterates through it, adding each individual atom mass to output the total molecule’s mass.

ethene_atom_masses = [12.011, 12.011, 1.008, 1.008, 1.008, 1.008]

total_mass = 0

for atom_mass in ethene_atom_masses:
    total_mass = total_mass + atom_mass

print(total_mass)

It works to add up the numbers in the list, and if we are only doing it once, it will serve perfectly well. However, if you had another list you needed to add up this way, you would need to write these lines again. If you had 100 lists, you could be spending a lot of time writing just these few lines of code.

This is where functions come in.

In Python, you can define your own function that will run these lines of code for you, so by just typing a short phrase, like add_atomic_masses(), the program will run these lines of code. The next section describes how to go about defining your function.

Defining functions#

Defining functions, like the rest of Python, always takes a specific syntax. This is:

def function_name(an_argument):
    Code block, e.g. doing maths on an_argument
    return an_output

def indicates that we are about to define a new function.
function_name() is what we will use to call our function later on.
(an_argument) is the variable that the function will act on, known as the function’s argument. It could be a number, string, list, etc., but whatever it is, we need to make sure that we treat it like the correct variable type throughout the main code block.
: indicates the start of the code that will run whenever we call our function.
Code block is a placeholder indicating the main body of the code. This could be a calculation, reading from a file, or many other things. The main body of the code is indented using the <tab> key, just like with ‘if’, ‘for’, and ‘while’. All the code belonging to this function must be indented and placed directly below the line def function_name(an_argument):. You can use ‘if’, ‘for’ and ‘while’ within this code block, but they must respect the first indent.
return ends the function, and must be one indent away from the margin (unless it is nested within another statement - more on that later). The real meaning of return is that the variables which come after it (in this case, an_output) are what the function will spit out (or return) to the code that called it. Because the function stops as soon as return runs, you could also have a return statement halfway down your function.
an_output is what this function will return at the end of its process. You can have multiple outputs (discussed later) separated by commas.

Using this syntax, we will now turn the above code for adding masses into a function.

def add_atomic_masses(list_of_atoms):
    total_mass = 0
    for atom_mass in list_of_atoms:
        total_mass = total_mass + atom_mass
    return total_mass

The function’s argument (what we are putting in) has been named list_of_atoms, instead of the more specific ethene_atom_masses. This is because we want this function to be general, to apply not only to a list of atoms in ethene, but also to other lists containing the masses of atoms in other molecules.

Another thing you will notice is that the variable list_of_atoms is treated like a list even though it is never explicitly defined as such. When you call your function (discussed in the next section), you must make sure your input is the correct variable type.
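As a small illustration of why the input type matters, the sketch below calls add_atomic_masses() first with a list and then, by mistake, with a single number. The methane values and the try/except wrapper are additions for this illustration only; try/except is used just so the example keeps running after the error and is not something this lesson covers.

```python
def add_atomic_masses(list_of_atoms):
    total_mass = 0
    for atom_mass in list_of_atoms:
        total_mass = total_mass + atom_mass
    return total_mass

# Correct: the argument is a list of numbers
methane_mass = add_atomic_masses([12.011, 1.008, 1.008, 1.008, 1.008])
print(methane_mass)

# Incorrect: a single float cannot be looped over, so Python raises a TypeError
try:
    add_atomic_masses(12.011)
    error_raised = False
except TypeError as error:
    error_raised = True
    print("TypeError:", error)
```

The function itself never checks the type of its input. The error only appears when the for loop tries to iterate over something that is not a collection.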

Calling functions#

Now that we have created our function add_atomic_masses(), we need to know how to use it in our program. To run the function on a list, we just need to write add_atomic_masses(my_list). To make the output useable, we then set this to a variable, such as formaldehyde_mass = add_atomic_masses(formaldehyde_atom_masses), as seen below.

formaldehyde_atom_masses = [15.999, 12.011, 1.008, 1.008]

formaldehyde_mass = add_atomic_masses(formaldehyde_atom_masses)

print(formaldehyde_mass, "g mol-1")

The output of this is:
30.025999999999996 g mol-1

An important thing to note is that you can only reference variables from within your function definition if they appear on the return line. In this case, we can only access total_mass (which has been renamed to formaldehyde_mass). If total_mass did not appear on the return line we would not be able to refer to it.

This would work exactly the same with another list of atoms, for example:

glycine_atom_masses = [1.008, 1.008, 14.007, 12.011, 1.008, 1.008, 12.011, 15.999, 15.999, 1.008]

glycine_mass = add_atomic_masses(glycine_atom_masses)

print(glycine_mass)

Try putting all three chunks of code into the same program and running it. Be careful though: your function must be defined before the first line that calls it, so it is best to define your functions at the beginning of the program.
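The short sketch below shows what happens if a function is called before Python has reached its definition. The function name double_mass is made up for this illustration, and try/except is used only so that the example can continue after the error.

```python
# Calling a function before its definition has been run fails
try:
    print(double_mass(1.008))
    called_too_early = False
except NameError as error:
    called_too_early = True
    print("NameError:", error)

def double_mass(mass):
    return 2 * mass

# After the definition has run, the same call works
doubled = double_mass(1.008)
print(doubled)
```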

Below is another example of calling a function. In this example, we are calling the function within a for loop.

many_sets_of_atoms = [
   [12.011,12.011,1.008,1.008,1.008,1.008],
   [12.011,1.008,1.008,1.008,12.011,15.999,12.011,1.008,1.008,1.008],
   [12.011,12.011,12.011,12.011,12.011,12.011]
]

total_masses = []

for atom_mass_list in many_sets_of_atoms:
   calculated_mass = add_atomic_masses(atom_mass_list)
   total_masses.append(calculated_mass)

print(total_masses)

This code takes each nested list within the list many_sets_of_atoms and adds up the values within, then appends each total to a new list containing only the final masses. In fact, .append() is another built-in piece of Python which we have seen before. It is a function that belongs to a particular list (this kind of function is known as a method), signified by the . (in this case, it belongs to the list total_masses), and the argument is what you want to put into the list (in this case calculated_mass).

Exercise: Write a function to convert units

Write a function that will convert eV to J by multiplying by 1.602×10^-19. Do this for:

1. A single value of 4.08 eV. Print the answer, and remember to include units.
2. The following list of workfunctions: metal_workfunctions = [4.08, 5.0, 4.07, 2.9, 4.81, 2.1, 5.0, 4.7, 5.1, 4.5] # eV.

2.1. Print the new list of workfunctions in J.

2.2. Print each item in the list on a new line. Include units.

Click to view answer for Part 1

In part 1, we are creating a function to take a single variable as an argument and return a single variable.

eV_value = 4.08 # eV

def convert_eV_J(value_eV):
    value_J = value_eV * 1.602e-19
    return value_J

print(convert_eV_J(eV_value), "J")

You should have got the output: 6.53616e-19 J .

Instead of creating another variable to describe the new value in Joules, this piece of code executes the function within a print() statement. If we wanted to use this value again later, it would be better to store it in a relevantly named variable.

Click to view answer for Part 2

In Part 2, we need to take the program we wrote above, and convert it to take a list as an argument, and also return a list.

metal_workfunctions = [4.08, 5.0, 4.07, 2.9, 4.81, 2.1, 5.0, 4.7, 5.1, 4.5] # eV

# Converts a list in eV to a list in J
def convert_eV_J(list_value_eV):
    list_value_J = []
    for value in list_value_eV:
        value_J = value * 1.602e-19
        list_value_J.append(value_J)
    return list_value_J

Once you have this, you can then either print the whole list in a similar way to Part 1:

print(convert_eV_J(metal_workfunctions))

Or you can print each value on a new line:

for num in convert_eV_J(metal_workfunctions):
    print(num, "J")

Try both. Although a list is more convenient to use throughout the rest of a program, the values are easier to read when each one is printed on a new line.

Function Scope#

In Python, the word ‘scope’ refers to the parts of the program in which a variable or function name is valid. For example, if you define a variable at the top of one Python file, you cannot directly refer to that variable from another Python file, but you can refer to it from anywhere below that point within the program it belongs to. It is a global variable.

Within a function, there is what is called a ‘local scope’. This means that variables created within the function cannot be referred to outside the function. Their values only become available to the rest of the program if they are passed back by return and assigned to a new variable name. They are local variables.

Here is an example, using the above function add_atomic_masses.

water_atom_masses = [15.999, 1.008, 1.008]

water_mass = add_atomic_masses(water_atom_masses)

print(total_mass)

total_mass is a name we gave to our output when defining our function, but if we then try to print it outside the function, Python will raise a NameError saying that the name 'total_mass' is not defined (assuming no variable called total_mass has been created elsewhere in the program, as it was in the very first ethene example). The name total_mass inside the function belongs to the local scope of the function, and therefore cannot be referred to in the rest of the program. However, we can capture the returned value under a new variable name (in this case water_mass).
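Here is a runnable sketch of the same idea, assuming the program contains nothing except the function and the water example. The try/except wrapper is an addition so that the error can be shown without stopping the program.

```python
def add_atomic_masses(list_of_atoms):
    total_mass = 0  # local variable: exists only while the function runs
    for atom_mass in list_of_atoms:
        total_mass = total_mass + atom_mass
    return total_mass

water_atom_masses = [15.999, 1.008, 1.008]
water_mass = add_atomic_masses(water_atom_masses)

# The returned value is available under the new name
print(water_mass)

# The local name is not available outside the function
try:
    print(total_mass)
    local_name_visible = True
except NameError as error:
    local_name_visible = False
    print("NameError:", error)
```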

Thinking in reverse, however, a function does have access to variables which have not been given to it explicitly as arguments. For example, below is a function which converts pressure in bar to pressure in mmHg. The conversion rate is 1 bar to 750.062 mmHg, which is stored in a variable before the function. The function is still completely valid, and will work just as well as any of the other examples.

bar_mmHg_conversion = 750.062

def pressure_conversion(a_pressure):
    pressure_mmHg = a_pressure * bar_mmHg_conversion
    return pressure_mmHg

The thing to be mindful of when doing this is to keep track of your variables! If you start changing them around, or copy and paste a function from one program into another, your variables might result in errors! It is also possible to reuse a variable inside the function without wanting to.

scount = 5

def sum_list(list_to_sum):
    for entry in list_to_sum:
        scount = scount + entry
    
    return scount

mylist = [1,2,3,4,5]

mysum = sum_list(mylist)

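In this example, scount is probably meant to start at 0 inside the function, but that line has been forgotten. You might expect the function to pick up the global value of 5 and quietly give the wrong answer. In fact, because the function assigns to scount, Python treats scount as a local variable throughout the function, and since it has no value the first time it is used, calling sum_list() raises an UnboundLocalError. Either way, the function does not do what was intended, and the safest habit is to give every variable its starting value inside the function that uses it.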
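A minimal sketch of one way to avoid this kind of clash is to give the variable its starting value inside the function, so that it is clearly local. The global scount is kept here only to show that it is left unaffected.

```python
scount = 5  # an unrelated global variable that happens to share the name

def sum_list(list_to_sum):
    scount = 0  # local starting value; the global scount is left untouched
    for entry in list_to_sum:
        scount = scount + entry
    return scount

mylist = [1, 2, 3, 4, 5]
mysum = sum_list(mylist)

print(mysum)   # 15
print(scount)  # 5
```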

More complex functions#

So far we have covered basic functions, but there are more complex facets to function writing. Some of these complex facets are not often used, especially by beginners, so do not feel you have to memorise all of them. The most important aspects are: empty arguments and return, multiple return conditions, and multiple positional arguments.

Functions without arguments and/or return variables#

It is possible to create a function that takes no arguments and returns no variables. All you do is write the function name with nothing in the brackets, add your code, and write return on its own (this final return is optional, since a function without one simply ends after its last line). When you call the function, do not put in any arguments. An example is shown below.

def write_message():
    print("This is an important message. It could be an error message, or a long output that you want to call multiple times. An interesting thing about it is that it has no inputs and returns no outputs. Strange, right? Whatever it is, it is very important. Important enough and long enough that I want to give it its own function. A function just for this message. Imagine that. Being so important you get your own function. Well this is what this message is. A very, very important one. It is so important that I don't want to store it as a mere string inside a variable, for fear that it may get accidentally overwritten by some other, less important message. No. It needs - no, **deserves** its own function. A function that takes no inputs and returns no outputs. Just this message. This very, very, important, long, message. Have you made it this far? I should hope so. You shouldn't skip such an important message, especially when you know it is so important. I said that at the beginning, didn't I, that this message was important. Do you remember? Seems so long ago now. Like yesterday, last month, last year, last decade, a time before you were born. The world was a different place, and yet still it still spun, endlessly, through the dark void of space, days still passed and the sun and moon rose and set, the stars endlessly identical, passing over the horizon, infinite worlds stretching out into the cosmos, vast and unknowable. Makes you think, doesn't it, about life, the universe, and everything. Hm? What was that? What is the message, you ask? Oh, can't remember now. Couldn't be that important now, could it?")
    return

write_message()

The function write_message() has been called later on in the code simply by typing write_message() on a new line, at which point the message will be printed to the screen. This is the simplest kind of function - no arguments and no returned variables, and it only does one thing. Therefore, these often have limited uses. Keep in mind, however, that some functions can take no arguments but return a variable, and vice versa. It is important to use the correct syntax depending on the function.

Note also, that even though the function takes no argument, you must still supply empty brackets in order to use it.
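The two in-between cases mentioned above can be sketched as follows. Both function names are made up for this illustration: one takes no arguments but returns a value, and the other takes an argument but returns nothing.

```python
def avogadro_constant():
    # No arguments, but a value is returned
    return 6.022e23  # mol^-1

def print_mass(mass):
    # One argument, but nothing is returned
    print(mass, "g")

n_a = avogadro_constant()
result = print_mass(18.015)

print(n_a)
print(result)  # None, because print_mass has no return value
```

If you assign the result of a function that returns nothing, the variable simply holds None.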

Multiple return conditions#

So far, all the functions we have seen have only one return statement, in which only one variable is returned. But not only is it possible to return multiple variables, it is also possible to use multiple return statements within the function.

Returning multiple outputs

So far, our functions have only returned one variable. However, it is possible to return multiple values.

This is a function which returns both the maximum and the minimum of a list of numbers:

def minmax(number_list):
    min_value = min(number_list)
    max_value = max(number_list)
    return min_value, max_value

masses = [1.008, 12.011, 14.007, 6.94]

min_mass, max_mass = minmax(masses)
print(min_mass)
print(max_mass)

This prints:
1.008
14.007

Note the following:

The two variables we want from the function are both on the return line, separated by a comma.
When we use the function, we need to assign it two variables: min_mass and max_mass.
- If we do not do this, we will get the result as a tuple (see variable lesson). For example:
```
mass = minmax(masses)
print(mass)
```
  Will return: (1.008, 14.007), a tuple. This is not a data type you, as a beginner, will often use.
If you only want one of the returned variables, you can assign the unwanted one to a placeholder name to show that you will not use it. A single underscore _ is the usual convention, although a double underscore __ works in the same way: min_mass, __ = minmax(masses).
The second value is still returned by the function, but giving it a placeholder name makes it clear to anyone reading your code that it is being deliberately ignored. This is useful if you want to test or use only one aspect of the function.
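The notes above can be sketched in code as follows. It shows the result being kept as a tuple and read by index, and then unpacked with a placeholder name for the value that is not needed (a single underscore is used here, which is a common convention).

```python
def minmax(number_list):
    min_value = min(number_list)
    max_value = max(number_list)
    return min_value, max_value

masses = [1.008, 12.011, 14.007, 6.94]

# Keep the whole result as a tuple and pick items out by index
both = minmax(masses)
print(both[0])  # smallest value
print(both[1])  # largest value

# Or unpack and ignore the value that is not needed
min_mass, _ = minmax(masses)
print(min_mass)
```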

Using multiple return statements

So far, all our functions have had one return statement, right at the end, but this is not a requirement. A function can have a return statement anywhere, and it can have more than one. This is usually encountered when functions contain if statements, where depending on whether the conditions are met, different sections of the function may execute.

The function below checks whether a given elemental symbol is a noble gas using Boolean logic. There are three different possible return values in this function, depending on the input.

def is_noble_gas(symbol):
    if type(symbol) is not str or len(symbol) > 2:
        print("Input must be a 1 or 2 character elemental symbol")
        return None
    elif symbol in ["He", "Ne", "Ar", "Kr", "Xe", "Rn", "Og"]:
        return True
    else:
        return False

The first conditional statement confirms that a symbol of no more than two characters has been passed to the function as a string. There are more sophisticated ways to handle this sort of error, but these are outside the scope of this lesson. The key thing here is that based on this test, we can have the function return None rather than True or False, if an incompatible input is supplied.

The second conditional statement checks if the symbol is a noble gas. If it is, the function will finish and return True.

The last statement is else:. Now that we have eliminated every other option, the inputted symbol cannot be a noble gas, so the function returns False.

However, it is important to recognise limitations. The first if statement would not catch strings of digits such as "10", empty strings "", or made-up symbols such as "TM". You could fix this by adjusting the comparison to exclude strings of length 0, and adding an ‘elif’ statement checking that the given symbol is within a list containing all the symbols of the periodic table. It is also quite common to store the periodic table in a dictionary, where the key:value pairs are element_name:element_mass. Return to the variables lesson for revision on dictionaries.
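The fix described above might look something like the sketch below. The function name is_noble_gas_checked and the list KNOWN_SYMBOLS are made up for this illustration, and KNOWN_SYMBOLS is deliberately shortened to a handful of elements. A real version would need every symbol in the periodic table.

```python
# Assumption: a deliberately shortened list for illustration only;
# a real version would contain every symbol in the periodic table.
KNOWN_SYMBOLS = ["H", "He", "Li", "C", "N", "O", "Ne", "Ar", "Kr", "Xe", "Rn", "Og"]
NOBLE_GASES = ["He", "Ne", "Ar", "Kr", "Xe", "Rn", "Og"]

def is_noble_gas_checked(symbol):
    # Reject non-strings, empty strings, and strings longer than 2 characters
    if type(symbol) is not str or len(symbol) < 1 or len(symbol) > 2:
        print("Input must be a 1 or 2 character elemental symbol")
        return None
    # Reject strings that are not known elemental symbols
    elif symbol not in KNOWN_SYMBOLS:
        print(symbol, "is not a recognised elemental symbol")
        return None
    elif symbol in NOBLE_GASES:
        return True
    else:
        return False

print(is_noble_gas_checked("Ne"))  # returns True
print(is_noble_gas_checked("O"))   # returns False
print(is_noble_gas_checked(""))    # returns None
print(is_noble_gas_checked("10"))  # returns None
print(is_noble_gas_checked("TM"))  # returns None
```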

Exercise: Returning two values

Write a function that returns both the energy in J and the energy in eV in separate lists, taking a list of wavelengths as an argument.

\( E = \frac{hc}{\lambda} \)

Remember: To convert from J to eV you must divide by 1.602 \(\times\) 10^-19.

wavelengths_light = [276, 59, 0.5, 1183, 52, 0.002, 127, 474] # nm

Click to view answer

def energy_calc(wavelengths):
    # Constants
    h = 6.626e-34 # m^2kgs^-1
    c = 3.00e8 # ms^-1
    
    # Empty lists of energies
    E_J = []
    E_eV = []

    # Iterate through wavelength list, calculate energies in J and eV and append to our lists. 
    for value in wavelengths:
        energy_J = (h * c) / (value * 1e-9) # J
        energy_eV = energy_J / 1.602e-19 # eV
        E_J.append(energy_J)
        E_eV.append(energy_eV)
    
    # Return 2 values. 
    return E_J, E_eV

wavelengths_light = [276, 59, 0.5, 1183, 52, 0.002, 127, 474] # nm

energy_light_J, energy_light_eV = energy_calc(wavelengths_light)
print(f"The energies in J are: {energy_light_J}")
print(f"The energies in eV are: {energy_light_eV}")

In this code, we are defining our constants h and c within the function. This means if you want to copy this function to a new program, you do not need to worry about ensuring the correct variables are defined in the program. It also prevents name clashes (there might, for some reason, be a different variable also called c in another program).

Multiple Arguments#

Your function can also take multiple arguments. You simply add them into the brackets of your function name, separated by a comma, and call them in the same way.

def my_function(argument_1, argument_2):
    Code block
    return an_answer

answer = my_function(20.10, 2003)

Here is an example of a function to calculate the number of mols from a given mass and molecular mass.

def no_mols(mass, molar_mass):
   n_mols = mass / molar_mass
   return n_mols

aspirin_mass = 1.48 # g
aspirin_molar_mass = 180.158 # g mol-1

mols_aspirin = no_mols(aspirin_mass, aspirin_molar_mass)
print(mols_aspirin , "mols of aspirin")

Which will output 0.008215011267887077 mols of aspirin, which you can check on a calculator.

Your arguments do not have to be the same variable type. You could have a number and a list, as long as you treat them as that variable type in your function. You can add as many arguments as you like, but be aware it might make your function hard to use if you add too many.

Furthermore, you need to make sure you input the right number of arguments when calling your function. If your function expects two arguments but only receives one, you will get a TypeError saying that a required positional argument is missing. Similarly, if you put in too many arguments you will get a TypeError saying that the function takes ‘x’ positional arguments but ‘y’ were given.
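The sketch below triggers both errors on purpose using the no_mols() function from above. The errors list and the try/except wrappers are additions for this illustration, so that both messages can be collected and printed without stopping the program.

```python
def no_mols(mass, molar_mass):
    n_mols = mass / molar_mass
    return n_mols

errors = []

# Too few arguments
try:
    no_mols(1.48)
except TypeError as error:
    errors.append(str(error))

# Too many arguments
try:
    no_mols(1.48, 180.158, 3)
except TypeError as error:
    errors.append(str(error))

for message in errors:
    print("TypeError:", message)
```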

Positional Arguments

But be careful! The arguments you see above are called positional arguments (often abbreviated to args), and the order in which you list them when you call your function matters!

Try calling the following using the above function:

mols_aspirin = no_mols(aspirin_molar_mass , aspirin_mass)
print(mols_aspirin, "mols of aspirin")

It will result in the output 121.72837837837837 mols of aspirin. Not the right answer!

Positional arguments are passed into the function in the order they are listed when defining the function (in this case, mass first, then molar_mass). If you put in ‘aspirin_molar_mass’ first, then ‘aspirin_mass’ when you call the function, then ‘aspirin_molar_mass’ will take the place of mass within the function, and ‘aspirin_mass’ will take the place of molar_mass within the function, resulting in the wrong division.

Keyword Arguments

A keyword argument, or kwarg, is the alternative to a positional argument. Keyword arguments do not have to be given in a specific order. Instead, each one is identified by the name given when defining the function, which must be written out when calling the function.

def my_function(kwarg_1, kwarg_2):
    Code block
    return an_answer

answer = my_function(kwarg_2 = 2006, kwarg_1 = 3.01)

Be aware that nothing has actually changed in the function itself, it is just how the arguments are treated that makes them positional or keyword arguments. Using our example of calculating mols:

def no_mols(mass, molar_mass):
    """kwargs: mass, molar_mass """
    n_mols = mass / molar_mass
    return n_mols

# Calculate number of mols of aspirin
aspirin_mass = 1.48 # g
aspirin_molar_mass = 180.158 # g mol-1

# Keyword Argument
mols_aspirin = no_mols(molar_mass = aspirin_molar_mass, mass = aspirin_mass)
print(mols_aspirin, "mols of aspirin")

Even though we have put aspirin_molar_mass and aspirin_mass in a different order than how it is in the function, the code still outputs the correct answer: 0.008215011267887077 mols of aspirin.
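The triple-quoted string on the first line of no_mols() above is a docstring: a short description stored with the function. As a sketch of how a slightly fuller docstring might be written and read back, consider the version below. The wording of the docstring is an illustration only, not a required format.

```python
def no_mols(mass, molar_mass):
    """Return the number of moles in a sample.

    mass: mass of the sample in g
    molar_mass: molar mass of the substance in g mol-1
    """
    n_mols = mass / molar_mass
    return n_mols

# The docstring is stored with the function and can be read back
print(no_mols.__doc__)
```

Typing help(no_mols) in an interactive session displays the same text together with the function's argument list.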

Warning! If you are using both positional and keyword arguments (for example if the function takes many arguments), all the positional arguments must come before the keyword arguments.
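A short sketch of this rule is given below. Putting a positional argument after a keyword argument is a SyntaxError, which would normally stop the whole program from starting, so the built-in compile() is used here purely as a way to show the error safely. That use of compile() is an addition for this illustration and not something you would write in practice.

```python
def no_mols(mass, molar_mass):
    return mass / molar_mass

# Positional first, then keyword: allowed
mols_aspirin = no_mols(1.48, molar_mass=180.158)
print(mols_aspirin, "mols of aspirin")

# The reverse order, no_mols(mass=1.48, 180.158), is a SyntaxError.
# compile() checks the line without running it, so the error can be caught.
try:
    compile("no_mols(mass=1.48, 180.158)", "<example>", "eval")
    reverse_order_allowed = True
except SyntaxError as error:
    reverse_order_allowed = False
    print("SyntaxError:", error.msg)
```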

Default/Optional Arguments

Default arguments are arguments whose values are assigned in the function definition. They are often also referred to as optional arguments. You set one by adding =value after the argument name on the def line. This means that even if your function takes two arguments, if one is a default argument you only need to supply the other when calling the function. The default argument will take its default value, and you won’t receive an error.

def my_function(arg_1, default_argument=2000):
    Code Block
    return an_answer

answer = my_function(18.08)

For example, if you know that throughout your program you will mostly be using aspirin_molar_mass, you can set that as the default value to save time.

def no_mols(mass, molar_mass=180.158):
    n_mols = mass / molar_mass
    return n_mols

aspirin_mass = 1.48 # g

mols_aspirin = no_mols(aspirin_mass)
print(mols_aspirin, "mols of aspirin")

When we’ve called the function, we have only specified one argument (aspirin_mass). The other argument is automatically taken as 180.158 g mol^-1.

You can also set a default value to None. None is a data type of its own (NoneType), and describes the absence of a value.

def no_mols(mass, molar_mass=None):
    if molar_mass is None:
        print("You must have a molecular mass")
        return
    n_mols = mass / molar_mass
    return n_mols

aspirin_mass = 1.48 # g
no_mols(aspirin_mass)

Here, we are expecting that someone might forget the argument ‘molar_mass’, so we have given it the default value None and added a check, so that the function prints a helpful message instead of failing with an error partway through the calculation. You probably will not have much use of None as a beginner.

Warning! Arguments must be listed in a strict order. When you define your function, all the arguments without default values must come first, followed by the default arguments at the end. When you call your function, you must first give your positional arguments, then your keyword arguments, including any default arguments you wish to change.

Exercise: Using positional and keyword arguments

The cosine rule allows you to find the length of one side of a triangle, c, when you know the lengths of the other two sides, a and b, and the angle between them, \(\theta\).

\( c = \sqrt{a^2 + b^2 - 2ab \cos(\theta)} \)

Write a function which will output the length of the final side of a triangle in cm, taking ‘side_a’ and ‘side_b’ as positional arguments and ‘angle’ as a keyword argument.

Test your program works by using side_a = 3 cm, side_b = 4 cm, and the known angle = 15°, which should result in side_c = 1.35 cm.

Hint! Remember that you need to import the module math to use the function cos(). Also consider the units in which the function cos() works.

Click to view answer

Here is a potential answer.

import math

def triangle_side(side_a, side_b, angle):
    angle = math.radians(angle)
    answer = side_a**2 + side_b**2 - (2 * side_a * side_b * math.cos(angle))
    answer **= 1/2
    return answer

print(triangle_side(3, 4, angle = 15), "cm")

Remember! When writing your code you always start by importing relevant modules/libraries, then define all your functions, then write the rest of your code.

Exercise: Using a default argument

Write a function which will output an atomic velocity from its mass and temperature. Set the default temperature to be 20°C. Use the relationship:

\(v = \sqrt{\frac{3k_bT}{m}} \)

where v is velocity in ms^-1, k_b is the Boltzmann constant and is equal to 1.38 \(\times\) 10^-23 JK^-1, T is temperature in Kelvin, and m is mass in kg.

Find the velocity for an N₂ molecule of mass 4.6 \(\times\) 10^-26 kg.

Click to view answer

def molecule_velocity(mass, temperature=20):
    k_b = 1.38e-23  # Boltzmann constant, J K^-1
    temperature += 273  # convert from Celsius to Kelvin
    vel = ((3 * k_b * temperature) / mass) ** (1 / 2)
    return vel

mass_N2 = 4.6e-26
velocity_answer = molecule_velocity(mass_N2)
print(velocity_answer, "ms-1")

The correct answer should be 513.52 ms-1.

Notice that the default argument for temperature is given in Celsius, then is converted to Kelvin inside the function. This is an interesting point in usability. When writing a function, you want it to be as easy to use as possible. If you know the user will usually have their temperatures in Celsius, then it is up to you as a programmer to make the function easy to use in Celsius.

More advanced arguments#

There are some kinds of function arguments that are more complex, and less often used. If you are a beginner, you probably won’t have a reason to use these, and you might want to consider moving straight onto the next section.

Positional-only arguments ``, /`` and keyword-only arguments ``* ,``

In the sections about positional and keyword arguments above, there is nothing in the function line itself that describes whether the arguments are positional or keyword. The way in which you call the function (and the order) is what defines it. In the first example, we treated both as positional, but in the second we treated both as keyword.

But there is a way of forcing the function to only take positional or only take keyword arguments. If you then try to treat the input differently, it will raise an error.

, / after the argument makes it positional-only
* , before the argument makes it keyword-only

For example in:

def my_function(argument_1, /, *, argument_2):

argument_1 is positional only, and argument_2 is keyword-only.

Using our mols example from above

def no_mols(mass, /, *, molar_mass):
    n_mols = mass / molar_mass
    return n_mols

aspirin_molar_mass = 180.158 # g mol-1

mols_aspirin = no_mols(aspirin_molar_mass, aspirin_mass = 1.48)
print(mols_aspirin, "mols of aspirin")

Here, we receive an error. The function expects mass as a positional argument and molar_mass as a keyword argument. In the call above, we have passed the molar mass in the positional slot and used the keyword aspirin_mass, which is not an argument name the function recognises, while the required keyword argument molar_mass is never supplied. If we try it the correct way around:

mols_aspirin = no_mols(1.48, molar_mass = aspirin_molar_mass)

We will get the correct output.

Be aware: you can only use , / and *, once each in each function argument line.
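To see the restrictions in action, the sketch below makes one allowed call and two calls that Python rejects. The rejected_calls counter and the try/except wrappers are additions for this illustration. Note that the / marker requires Python 3.8 or newer.

```python
def no_mols(mass, /, *, molar_mass):
    return mass / molar_mass

# Allowed: mass by position, molar_mass by keyword
mols_aspirin = no_mols(1.48, molar_mass=180.158)
print(mols_aspirin, "mols of aspirin")

rejected_calls = 0

# Not allowed: molar_mass passed by position
try:
    no_mols(1.48, 180.158)
except TypeError as error:
    rejected_calls += 1
    print("TypeError:", error)

# Not allowed: mass passed by keyword
try:
    no_mols(mass=1.48, molar_mass=180.158)
except TypeError as error:
    rejected_calls += 1
    print("TypeError:", error)
```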

Arbitrary positional arguments: ``*args``

Up until this point, the number of arguments passed into the function has been fixed. Whether positional, keyword, or default, the function will only accept a certain number of arguments. But what if we do not know how many arguments we will need in advance? We can use arbitrary positional arguments *args and arbitrary keyword arguments **kwargs. Note single asterisk for positional arguments and double asterisk for keyword arguments. This is defined in the function in the following way:

def my_function(*date, year=2000):

Our above code for calculating the number of mols has been adjusted so that it can take an arbitrary number of masses (each of the type float).

def no_mols(*masses, molar_mass=180.158):
    for mass in masses:
        n = mass / molar_mass
        print(n, "mols")
    return

no_mols(1.48, 1.01, 0.62, 0.21, 0.06)

The output of this is:

0.008215011267887077 mols
0.005606190122004019 mols
0.003441423639249992 mols
0.001165643490713707 mols
0.0003330409973467734 mols

The arbitrary argument *masses represents every mass given when we call the function. Since we have multiple of these, we must iterate through the collection of masses using a for loop and calculate the number of mols for each.

The real meaning of arbitrary positional arguments is that the arbitrary number of arguments you are putting into the function are stored as a tuple, which you then iterate through in the function, but you don’t really need to know this for it to work.
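As a quick check of this, the sketch below prints the type of the collected arguments. It also returns the results in a list rather than printing them, which is one possible variation on the function above. The name no_mols_list is made up for this illustration.

```python
def no_mols_list(*masses, molar_mass=180.158):
    # All the positional arguments arrive bundled together as one tuple
    print(type(masses))
    results = []
    for mass in masses:
        results.append(mass / molar_mass)
    return results

mols = no_mols_list(1.48, 1.01, 0.62)
print(mols)
```

Returning a list means the calculated values can be stored and used later in the program, instead of only appearing on the screen.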

Arbitrary keyword arguments: ``**kwargs``

Arbitrary keyword arguments are a bit more complicated than *args.

Essentially, while *args stores the input as a tuple, **kwargs stores it as a dictionary. This means there are a number of things that need to happen. Have a look at our code, where **masses is now an arbitrary keyword argument (note that **kwargs comes after args, kwargs, and *args in the function definition).

def no_mols(molar_mass = 180.158, **masses):
    for name, mass in masses.items():   
        n_mols = mass / molar_mass
        print(name, "has" , n_mols, "mols")
    return

no_mols(mass1 = 1.48, mass2 = 1.01, mass3 = 0.62)

Which returns:

mass1 has 0.008215011267887077 mols
mass2 has 0.005606190122004019 mols
mass3 has 0.003441423639249992 mols

Let’s talk through what is happening here.

Calling the function: Each of the masses we are trying to find the number of mols for has a name and a value. The names are ‘mass1’, ‘mass2’, ‘mass3’, and each has a corresponding value (1.48, 1.01, 0.62). This is exactly how dictionaries store information (revise data types lesson if you want a reminder).
**masses: When we write **masses into our function, we are saying that we want to take that unknown number of named masses below and store both their names and their value in a dictionary to use in our function.
name, mass in the for loop: Since **kwargs has taken both the name and the value of our input (e.g. both ‘mass1’ and ‘1.48’), we need to be able to iterate through each by assigning them a variable name. It is kind of like when we zipped lists together in the for loop lesson, assigning a variable to items in each list.
masses.items(): This takes all the **kwargs (mass1, mass2, mass3) and presents them as a sequence of (name, value) tuples, allowing the for loop to iterate through them. In this case, it behaves like the list [("mass1", 1.48), ("mass2", 1.01), ("mass3", 0.62)]. You can now see how name, mass extracts the information as each tuple is iterated through.
Once we have all this information, we can use both the name and the mass to calculate the number of mols and print a statement.
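To check that the named inputs really do arrive as a dictionary, here is a minimal sketch. The function name show_named_masses is hypothetical and only exists for this illustration.

```python
def show_named_masses(**masses):
    # Inside the function, masses is an ordinary dictionary
    print(type(masses))
    # .items() pairs each name with its value
    print(list(masses.items()))
    return masses

collected = show_named_masses(mass1=1.48, mass2=1.01, mass3=0.62)
```

The second print shows the (name, value) pairs that the for loop in no_mols() works through one at a time.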

Again, **kwargs are not commonly used, especially amongst beginners, so if this is a bit too complicated, do not worry.

Using the unpacking operator, *, when calling a function

The single asterisk unpacking operator * can be used on any iterable that Python provides (lists, tuples, strings, etc.), while the double asterisk operator, **, can only be used on dictionaries. When placed just before an iterable, * unpacks the items within the iterable to be separate objects.

For example, in a list of integers, the unpacking operator will unpack them into separate variables.

my_list = [2, 3, 5, 7, 11]

print("A list: ", my_list)
print("Separate variables: ", *my_list)

Returns the following:

A list:  [2, 3, 5, 7, 11]
Separate variables:  2 3 5 7 11

An example where this could be used is when a function takes lots of arguments, and you don’t want to remember what each should be every time. You can store your arguments in a list, and then unpack them when calling the function.
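As a rough sketch of this idea, the code below stores the arguments for a two-parameter version of no_mols() in a list and in a dictionary, then unpacks each when calling the function. The variable names sample_args and sample_kwargs are made up for this example.

```python
def no_mols(mass, molar_mass=180.158):
    # Number of mols = mass / molar mass
    n_mols = mass / molar_mass
    return n_mols

# Arguments stored in a list, in the order the function expects them
sample_args = [1.48, 180.158]
# Arguments stored in a dictionary, keyed by parameter name
sample_kwargs = {"mass": 1.48, "molar_mass": 180.158}

from_list = no_mols(*sample_args)     # same as no_mols(1.48, 180.158)
from_dict = no_mols(**sample_kwargs)  # same as no_mols(mass=1.48, molar_mass=180.158)

print(from_list == from_dict)
```

Both calls pass the same values, so the comparison prints True. Note that the dictionary keys must match the parameter names in the function definition exactly.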

Recursion#

We will only touch on recursion briefly, for completeness. Essentially, it means that a function can call itself, looping to get a result.

For example, here is a function that finds the factorial of a number. The factorial of 5 would be 5×4×3×2×1 = 120.

def factorial(x):
    if x == 1:
        return 1
    else:
        ans = x * factorial(x-1)
        return ans

print("The factorial of", 3, "is", factorial(3))

The function takes the number 3. As 3 != 1, it multiplies 3 by the output of factorial(2), which in turn calls factorial(1); each call checks whether it has reached 1. The output is 6. However, you must be careful, as it is easy to accidentally create an infinite loop (just like with while loops). With this program, if you call the function with a number less than 1 (for example 0 or -2), it will keep calling itself until Python raises a RecursionError, because the condition if x == 1 will never be fulfilled.
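One possible way to guard against this is sketched below. The name safe_factorial and the choice to raise a ValueError are assumptions made for this illustration, not part of the original example.

```python
def safe_factorial(x):
    # Reject inputs that would never reach the stopping condition
    if not isinstance(x, int) or x < 0:
        raise ValueError("x must be a non-negative whole number")
    # 0! and 1! are both 1, so the recursion stops here
    if x <= 1:
        return 1
    return x * safe_factorial(x - 1)

print(safe_factorial(5))
```

This prints 120. Calling safe_factorial(-2) now stops immediately with a clear error message instead of recursing until Python gives up.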

Functions within functions#

It is possible to define another function within a function; however, this is generally not useful, as it means the nested function cannot be called anywhere else in the program. This is because the nested function is then local, not global (see above for discussion on scope).
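The short sketch below illustrates the problem. The names outer and inner are placeholders, and the try/except block is only there so that the error is printed rather than stopping the program.

```python
def outer():
    def inner():
        # inner only exists while outer is running
        return "called from inside outer"
    return inner()

print(outer())

try:
    inner()
except NameError as error:
    # inner is local to outer, so it is unknown out here
    print("NameError:", error)
```

The first call works because outer() calls inner() from within its own scope. The second fails with a NameError because inner was never defined at the global level.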

Try to avoid nesting functions in your program. You can, however, call another function you have defined from within your function.

def function_1(arg_1, arg_2):
    total = arg_1 + arg_2
    return total

def function_2(arg_3):
    ans = function_1(arg_3, 6) + 100
    return ans

print(function_2(3))

Here, instead of nesting the functions, we have used the first function within the second function. This means we could call either function to best suit our purpose.

Docstrings#

As you can already see, functions can get pretty confusing pretty fast, especially when they take many arguments and keyword arguments, and return multiple variables. It gets even more confusing if there are multiple functions in your program that do similar things, and therefore have similar names. Someone else’s program, or even your own program after a couple of months, becomes very difficult to understand.

To make it easier to see what is happening in a function, programmers use a convention called a docstring, described in PEP 257 (one of the Python Enhancement Proposals, which include guidelines to make Python programs more readable and easy to use). A docstring is a short description of what the function does, its parameters (arguments), and what it returns.

A single-line docstring for a really obvious function might look like this:

""" Explain what this function does """

As in the program:

def pressure_conversion(pressure_bar):
    """ Convert pressure in bar to pressure in mmHg. """
    pressure_mmHg = pressure_bar * 750.062
    return pressure_mmHg

For more complicated functions, you should use a multi-line docstring. For example:

def no_mols(mass, molar_mass=180.158):
    """
    Calculate the number of mols.

    Parameters
    ----------
    mass : Float
        The mass of the substance
        Units: g
    molar_mass : Float
        The molecular weight of the substance
        Units: g mol-1
        Default: aspirin == 180.158

    Returns
    -------
    n_mols : Float
        The number of mols of the substance
        Units: mols
    """
    n_mols = mass / molar_mass
    return n_mols

Things to note:

The first line of the docstring describes what the function does. This is given as an imperative, and ends in a full stop.
The parameters (arguments) are then listed. For each one, the docstring explains what variable type it is, what units it takes, and whether it has a default value.
Finally, the docstring explains what is returned by the function, in the same format as the parameters above.
If there is any additional information the user requires to use this function, that is also included in the docstring.

The exact format of the docstring is less important than ensuring it is consistent and clear throughout your program.
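A docstring is not just a comment: Python stores it with the function, so it can be read back later. The minimal sketch below reuses the pressure_conversion() function from above and reads its docstring through the __doc__ attribute.

```python
def pressure_conversion(pressure_bar):
    """ Convert pressure in bar to pressure in mmHg. """
    pressure_mmHg = pressure_bar * 750.062
    return pressure_mmHg

# The docstring is stored on the function itself
print(pressure_conversion.__doc__)
```

The built-in help() function shows the same text, so help(pressure_conversion) is a quick way to remind yourself what a function does without opening the source code.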

Note: Why is it called parameters and not arguments? From the function’s perspective, a parameter is the variable listed inside the parentheses in the function definition, whereas an argument is the value that is sent to the function when it is called.

Exercise: Write a docstring for a given function

The following function takes a list of wavelengths in nm and converts each one into energy in Joules and energy in eV using the equation \( E = \frac{h \ c}{\lambda} \). Write a docstring for this function.

def energy_calc(wavelengths):
    h = 6.626e-34 # m^2 kg s^-1
    c = 3.00e8 # m s^-1
    energy_J = []
    energy_eV = []
    for value in wavelengths:
        val_J = (h * c) / (value * 1e-9) # J
        energy_J.append(val_J)
        val_eV = val_J / 1.602e-19
        energy_eV.append(val_eV)
    return energy_J, energy_eV

wavelengths_light = [276, 59, 0.5, 1183, 52, 0.002, 127, 474] # nm

energy_light_J, energy_light_eV = energy_calc(wavelengths_light)
print(f"The energies in J are: {energy_light_J}")
print(f"The energies in eV are: {energy_light_eV}")

Answer:

def energy_calc(wavelengths):
    """
    Calculate energy in J and eV from a list of wavelengths.

    Parameters
    ----------
    wavelengths : list
        A list of wavelengths in nm.

    Returns
    -------
    energy_J : list
        A list of energy values in joules.
    energy_eV : list
        A list of energy values in electronvolts.
    """
    h = 6.626e-34 # m^2kgs^-1
    c = 3.00e8 # ms^-1
    energy_J = []
    energy_eV = []
    for value in wavelengths:
        val_J = (h * c) / (value * 1e-9) # J
        energy_J.append(val_J)
        val_eV = val_J / 1.602e-19
        energy_eV.append(val_eV)
    return energy_J, energy_eV

The first thing in the docstring is a description of what the function does, written in the imperative mood (as an instruction).

Next, the parameters are given. We only have one in this function: wavelengths.

Finally, the returns are given. We have two in this case, energy_J and energy_eV.

For some more complex functions, docstrings can be even longer, and include instructions, examples, and usage explanations.

Further Practice#

Question 1#

Write a function that takes one argument, num, and returns True if it is even and False if it is odd.
Hint: Remember that the modulo operator (%) returns the remainder of the left hand quantity when divided by the right hand quantity.

Answer:

def is_even(num):
    if num % 2 == 0:
        return True
    else:
        return False

print(is_even(9))
print(is_even(42))

Question 2#

Using your function above, write a function which takes a list of integers and returns only the even integers from this list.

Answer:

def keep_evens(num_list):
    evens = []
    for num in num_list:
        if is_even(num):
            evens.append(num)

    return evens

print(keep_evens([1,2,3,4,5,6,7,8,9,10]))

Question 3#

The function add_atomic_masses() defined earlier in the document accepted a list of atomic masses. However, molecules are more generally referred to using formulae rather than lists of masses. Write a series of functions as directed in the comments in the cell below to allow calculation of molecular masses from molecular formulae for simple organic molecules:

The dictionary below can be used to look up an atom’s mass from its symbol. Don’t worry about other elements for the time being; you can assume that these are the only elements that matter (pretend you’re an organic chemist).

atom_masses = {
    "C" : 12.011,
    "H" : 1.008,
    "O" : 15.999,
    "N" : 14.007
}

Answer:

atom_masses = {
    "C" : 12.011,
    "H" : 1.008,
    "O" : 15.999,
    "N" : 14.007
}

def get_masses(atoms):
    mass_list = []
    for atom in atoms:
        mass_list.append(atom_masses[atom])
    return mass_list

print(get_masses(["C","H","H","H","C","H","O"]))
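To go one step further and obtain a molecular mass, the list returned by get_masses() can be added up in the same way as in add_atomic_masses(). The sketch below is one possible approach rather than the intended model answer; the function name molecular_mass is an assumption, and the masses are those from the question's dictionary.

```python
atom_masses = {"C": 12.011, "H": 1.008, "O": 15.999, "N": 14.007}

def get_masses(atoms):
    # Look up the mass of each atom symbol in turn
    mass_list = []
    for atom in atoms:
        mass_list.append(atom_masses[atom])
    return mass_list

def molecular_mass(atoms):
    # Add up the individual atomic masses, one atom at a time
    total_mass = 0
    for atom_mass in get_masses(atoms):
        total_mass = total_mass + atom_mass
    return total_mass

ethene = ["C", "C", "H", "H", "H", "H"]
print(molecular_mass(ethene))
```

For ethene this gives a value of approximately 28.054, matching the total from the list of masses at the start of the lesson.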

Question 4#

The code below has errors. Have a look at the error message and fix the code to get the expected output.

def say_hello()
print("Hello World")

say_hello

#Expected output: Hello World

Answer:

def say_hello():
    print("Hello World")
    return

say_hello()

Corrections:

There must be a colon at the end of the line that defines the function.
Code belonging to the function (in this case, the print statement) must be indented by one level.
You should include return. Technically, this function will work without it, but it is good practice.
When calling the function, you must include brackets “()” even if the function takes no arguments.

Question 5#

After a lab, a student has extracted IR data from a newly-discovered material, Pythonium (‘Py’). The spectrum data is stored in two lists, the first is the peak wavenumbers, and the second is the corresponding transmittance. Only peaks with a wavenumber above 1500 cm^-1 and with a transmittance less than 95% can be analysed by the students. Write a function that can take the two lists below and output only the useful values. You should be able to adapt any code you have written for this purpose before (hint, go back to the for loops lesson).


raw_wavenumbers_Py = [3420.50, 2955.75, 2850.30, 1745.60, 1605.25, 1550.40, 1515.10, 1501.85, 1450.70, 1255.20, 980.55, 750.30] 

raw_transmittance_Py = [60.3, 47.7, 34.6, 96.2, 48.4, 99.1, 95.8, 51.2, 65.3, 50.0, 97.1, 27.5]

Answer:

When thinking about writing a piece of code, it is important to break it down into simple chunks that you can get your mind around. How you will achieve each chunk does not matter yet. For example, with this code, you might write:

# Zip together wavenumbers and transmittance
## Check if wavenumber > 1500
## Check if transmittance < 95
### Add to new lists

This is breaking the code down into its component parts. Also notice that there is no mention of a function at this point. It is normally much easier to write out the code before converting it into a function.

def parse_IR_data(wavenumbers, transmittances):
    """
    Take IR data of wavenumbers and corresponding transmittances.
    Return only peaks which are above 1500 cm-1 and below 95% transmittance.

    Parameters
    ----------
    wavenumbers : List
        Peak values taken from an IR spectrum
        Units: cm-1
    transmittances : List
        Corresponding transmittance for each peak
        Units: %

    Returns
    -------
    maj_peak_wavenumbers : List
        Wavenumbers of the peaks that pass both checks
        Units: cm-1
    maj_peak_transmittance : List
        Corresponding transmittance for each retained peak
        Units: %
    """
    
    # Empty lists of the data we want to keep
    maj_peak_wavenumbers = []
    maj_peak_transmittance = []

    # Iterates through the lists
    for v, tr in zip(wavenumbers, transmittances):
        # Checks the values obey the conditional statement
        if v > 1500 and tr < 95:
            print(f"{v} cm-1, {tr} %")
            maj_peak_wavenumbers.append(v)
            maj_peak_transmittance.append(tr)
    return maj_peak_wavenumbers, maj_peak_transmittance


# Data retrieved from the lab for the substance pythonium.
raw_wavenumbers_Py = [3420.50, 2955.75, 2850.30, 1745.60, 1605.25, 1550.40, 1515.10, 1501.85, 1450.70, 1255.20, 980.55, 750.30] # cm-1
raw_transmittance_Py = [60.3, 47.7, 34.6, 96.2, 48.4, 99.1, 95.8, 51.2, 65.3, 50.0, 97.1, 27.5] # %

# Call function
wavenumbers_Py, transmittance_Py = parse_IR_data(raw_wavenumbers_Py, raw_transmittance_Py)
print(wavenumbers_Py, transmittance_Py)

Question 6#

Now adapt the code from Question 5 so that it parses the data when it is in the format: raw_data = [[wavenumber, transmittance], [wavenumber, transmittance], etc.] for each point.

raw_data_Py = [[3420.50, 60.3], [2955.75, 47.7], [2850.30, 34.6], [1745.60, 96.2], [1605.25, 48.4], [1550.40, 99.1], [1515.10, 95.8], [1501.85, 51.2], [1450.70, 65.3], [1255.20, 50.0], [980.55, 97.1], [750.30, 27.5]]

Answer:

raw_data_Py = [
    [3420.50, 60.3],
    [2955.75, 47.7],
    [2850.30, 34.6],
    [1745.60, 96.2],
    [1605.25, 48.4],
    [1550.40, 99.1],
    [1515.10, 95.8],
    [1501.85, 51.2],
    [1450.70, 65.3],
    [1255.20, 50.0],
    [980.55, 97.1],
    [750.30, 27.5],
]


def parse_IR_data(raw_data):
    """
    Take IR data and only return points with wavenumber > 1500cm-1 AND transmittance < 95%

    Parameters
    -----------
    raw_data : List
        Nested list of raw IR data in the format [[wavenumber, transmittance], [wavenumber, transmittance], etc.]

    Returns
    -----------
    parsed_data : List
        Nested list of IR data with wavenumber > 1500cm-1 AND transmittance < 95%
        In the format [[wavenumber, transmittance], [wavenumber, transmittance], etc.]
    """
    parsed_data = []
    for point in raw_data:
        # Skip points that fail either check (need wavenumber > 1500 and transmittance < 95)
        if point[0] <= 1500 or point[1] >= 95:
            continue
        else:
            print(f"{point[0]} cm-1 {point[1]} %")
            parsed_data.append(point)
    return parsed_data


parse_IR_data(raw_data_Py)

This version is much shorter, and highlights the importance of considering how you present your data. By placing the data into a nested list containing both wavenumber and transmittance, we have made the code significantly shorter.

Summary#

Define a function using the following syntax:

def function_name(argument_1, argument_2):
    Code
    return a_variable, b_variable

Call a function using:
- General syntax: my_function().
- Assign its outputs to variables using: result_1, result_2 = my_function(some_argument, other_argument).
- Ignore an output you do not need by assigning it to a double underscore __, like so: result_1, __ = my_function(some_argument, other_argument).
A variable declared inside a function has ‘local scope’: it cannot be accessed from outside the function. However, from within a function you can access variables from outside without them being passed through as an argument.
Functions can have as many arguments and return statements as you like, or none at all. You can nest return statements in if statements.
There are different kinds of arguments:
- Positional arguments, args. These must be in a certain order when the function is called.
- Keyword arguments, kwargs. These can be in any order (although always after positional arguments), and are called by the name in the function: result = my_function(argument_1 = "He", argument_2 = 4).
- Positional-only arguments, /. Adding a forward slash after a parameter in the function forces it to only take a positional argument: def function_name(argument_1, /).
- Keyword-only arguments, *. Adding an asterisk before a parameter in the function forces it to only take a keyword argument: def function_name(*, argument_1).
- Default/optional arguments, =. If a value is not specified when the function is called, the function will use the default value for that parameter: def function_name(argument_1=21).
- Arbitrary positional arguments, *args. Allows an arbitrary number of arguments to be passed. You must iterate through them: def function_name(*argument_1).
- Arbitrary keyword arguments, **kwargs. Allows an arbitrary number of keyword arguments to be passed. You can loop over the names and values in the dictionary using the method .items(): def my_function(**argument).
Functions can call themselves. This is called recursion.
Avoid defining functions within functions. Instead, define it separately and call that function from inside the first function.
Use docstrings to explain to the user the purpose of the function, the arguments, its returns, limitations, and any other information needed to use it effectively.
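As a compact reminder of the argument kinds listed above, here is a small sketch. The function names describe_sample and count_inputs are invented for this summary, and the positional-only marker / assumes Python 3.8 or later.

```python
def describe_sample(name, /, mass, *, units="g"):
    # name is positional-only, mass can be passed either way,
    # and units is keyword-only with a default value
    return f"{name}: {mass} {units}"

def count_inputs(*args, **kwargs):
    # args arrives as a tuple, kwargs as a dictionary
    return len(args), len(kwargs)

print(describe_sample("aspirin", 1.48))
print(describe_sample("aspirin", mass=1.48, units="mg"))
print(count_inputs(1, 2, 3, a=4, b=5))
```

The first call relies on the default units, the second overrides them by keyword, and the third shows three positional and two keyword inputs being collected.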

Good Practice with Functions#

When using functions, the following is good advice:

Declare all functions at the beginning of the program. Don’t scatter the declarations throughout the program, as this breaks the flow and makes it hard to understand.
Don’t declare a function within a function. This is an unusual coding style, which is only necessary in a few specialist cases.
Pass values into a function through the arguments; don’t rely on variable scope.
Generally, it is easier to first write a version of your code (or part of your code) and then turn it into a function, rather than start writing a function straight away.
