# Introduction to Python

for scientific computing

## Jupyter

• We'll use "Jupyter Notebook" to interact with Python.
• Like Matlab's 'Live Editor'; Maple's and Mathematica's notebooks.
• Runs in a web browser.

To get started:

# mas-jupyter.ncl.ac.uk

• Open language.ipynb

You can edit the code samples from the slides live and run them as you please.

• Double-click a cell to edit it.
• To run a cell's contents, use Control-Enter.
• You can also use Shift-Enter to run and move to the next cell.
In [1]:
x = 1 + 1
10 * x

Out[1]:
20

Today's course has two parts:

### Morning: the Python language

• Why, what, how?
• Basic data types and operations
• Control flow

### Afternoon: Python tools for scientists

• NumPy: working with large data grids
• SciPy: common numerical functions
• matplotlib: in-depth plotting library

# What is Python?

• Interpreted, object-oriented programming language
• Works on PC, Mac and Linux
• Open source: free (speech, lunch)

## Why Python?

• Neat and friendly syntax
In [2]:
print("Hello, world!")

Hello, world!

• Newbie-friendly
• Quick to write code and quick (enough) to run
In [3]:
import json, random
#Data obtained from http://www.imdb.com/interfaces
with open("data/top_250_imdb.json") as data_file:
    films = json.load(data_file)  #assumed: the file maps film titles to ratings

In [4]:
random.sample(list(films.items()), 3)  #sample() needs a sequence, so convert to a list

Out[4]:
[('Yôjinbô (1961)', 8.2),
('Batman Begins (2005)', 8.2),
('Das Leben der Anderen (2006)', 8.4)]
In [5]:
from statistics import mean
#This mean is just from the top 250!
mean(films.values())

Out[5]:
8.2636
In [6]:
max(films.values())

Out[6]:
9.2
In [7]:
print([name for name, score in films.items() if score == 9.2])

['The Shawshank Redemption (1994)', 'The Godfather (1972)']


More pros and cons discussed at the SciPy tutorial.

## What can Python do?

• Work with large datasets (Pandas dataframes and NumPy arrays)
In [8]:
import pandas #Data from Thomas Bland
df.shape

Out[8]:
(450, 1021)
In [9]:
df.head()

Out[9]:
0 0.98 1.96 2.94 3.92 4.9 5.88 6.86 7.84 8.82 ... 990.78 991.76 992.74 993.72 994.7 995.68 996.66 997.64 998.62 999.6
-22.5 1.0 0.99992 0.99991 0.99998 0.99944 0.99935 0.99995 0.99853 1.00030 1.0019 ... 0.99888 1.00010 0.99949 0.99871 0.99616 0.99866 0.99587 0.99769 0.99823 1.0014
-22.4 1.0 0.99994 0.99992 1.00010 0.99947 0.99951 1.00000 0.99873 1.00030 1.0018 ... 0.99885 1.00000 0.99935 0.99860 0.99643 0.99857 0.99613 0.99769 0.99813 1.0015
-22.3 1.0 0.99995 0.99993 0.99976 1.00000 0.99972 0.99986 0.99892 0.99978 1.0019 ... 0.99873 0.99983 0.99903 0.99840 0.99670 0.99842 0.99643 0.99770 0.99792 1.0016
-22.2 1.0 0.99996 0.99994 0.99969 1.00010 1.00010 1.00000 0.99941 0.99972 1.0015 ... 0.99851 0.99944 0.99880 0.99835 0.99725 0.99816 0.99681 0.99766 0.99771 1.0018
-22.1 1.0 0.99997 0.99995 0.99995 1.00040 1.00030 0.99980 0.99974 0.99997 1.0010 ... 0.99824 0.99916 0.99843 0.99825 0.99759 0.99808 0.99702 0.99763 0.99771 1.0018

5 rows × 1021 columns

• Data processing and visualisation (matplotlib and MayaVi)
In [10]:
subset = df[-7:7]

import matplotlib.pyplot as plt
plt.imshow(subset,                 #Like Matlab's pcolor()
           aspect='auto',
           extent=(0, 1000, -7, 7))

colorbar = plt.colorbar()
colorbar.ax.set_ylabel(r'Density $|\psi|^2$', labelpad=20, rotation=270)  #raw string keeps the backslash

plt.xlabel('time $t$')
plt.ylabel('position $z$')
plt.show()

• General purpose programming language (e.g. Python runs websites)
• Got a boring task to do? Automate it!

## How do I get Python?

Won't always have this notebook interface!

### Python 2 or 3?

• Unless you're using someone else's code, use Python 3.
• Some blogs might tell you it's not supported by big packages but that's not true any more.

Can try an IDE e.g. Spyder

## Numeric types

### Integers: indexing or counting

In [11]:
1 + 2

Out[11]:
3
In [12]:
300 - 456

Out[12]:
-156

### Floats: measuring continuous things

In [13]:
0.1 + 0.2    #limited precision

Out[13]:
0.30000000000000004
In [14]:
0.5 - 0.3

Out[14]:
0.2

### Python's numbers are friendly

In [15]:
-2 ** 1000            # No problems with sign or under/overflow

Out[15]:
-10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376
In [16]:
type(-2 ** 1000)

Out[16]:
int
In [17]:
1 + 1.5              # Mix int and float: result is float

Out[17]:
2.5
In [18]:
type(12 + 24.0)      #Can check types explicitly

Out[18]:
float

### Golden rule: if one part of an expression is a float, the entire expression will be a float

#### Other operations

In [19]:
23 - 7.0

Out[19]:
16.0
In [20]:
2 * 4

Out[20]:
8
In [21]:
3 / 2               # division always returns a float in Python 3

Out[21]:
1.5
In [22]:
3 // 2              # double-slashes force integer division

Out[22]:
1
In [23]:
2 ** 3.0

Out[23]:
8.0
In [24]:
2 ^ 6               #Bitwise XOR, not a power -- not very useful for scientists

Out[24]:
4

#### Even more operations

In [25]:
(1 + 2) * (3 + 4)   #Brackets work as normal

Out[25]:
21
In [26]:
3 - 2*4             #Order of operations (BODMAS) as normal

Out[26]:
-5
In [27]:
27 % 5              #Modulo (remainder) operation

Out[27]:
2
In [28]:
abs(-2)             #Modulus (absolute value) function

Out[28]:
2

### Advice for working with floats

In [29]:
x = 0.1 + 0.2
y = 0.15 + 0.15
print("%.20f\n%.20f" % (x, y))
from math import isclose
isclose(x, y)

0.30000000000000004441
0.29999999999999998890

Out[29]:
True
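
The comparison above works because `isclose` uses a relative tolerance by default. The following is a small sketch of how the tolerances behave; the variable names are illustrative and not part of the course notebook.

```python
from math import isclose

x = 0.1 + 0.2
y = 0.15 + 0.15

# Exact comparison fails because the two sums round differently.
exactly_equal = (x == y)

# isclose() defaults to rel_tol=1e-09 and abs_tol=0.0.
close_by_default = isclose(x, y)

# A relative tolerance cannot help when comparing against zero,
# so pass an absolute tolerance explicitly in that case.
tiny = 1e-12
near_zero_default = isclose(tiny, 0.0)
near_zero_abs = isclose(tiny, 0.0, abs_tol=1e-9)

print(exactly_equal, close_by_default, near_zero_default, near_zero_abs)
```

This should print `False True False True`. A sensible `abs_tol` depends on the scale of your data, so treat `1e-9` here as a placeholder.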

### complex type

• Python uses j for the imaginary unit $i$.
• Has to have a number before it, to distinguish from a variable called j.
In [30]:
1j * 1j

Out[30]:
(-1+0j)
In [31]:
z = 2 - 4j
z + z.conjugate()  # Twice the real part

Out[31]:
(4+0j)
• use cmath functions when working with complex numbers.
In [32]:
import cmath
cmath.sin(0.1 + 2j)

Out[32]:
(0.37559284993485376+3.6087412126897433j)
In [33]:
abs(cmath.exp(2j))

Out[33]:
1.0

## Exercises

What are the types and values of the following expressions? Try to work it out by hand; then check in the notebook.

• 23 + 2 * 17 - 9
• 23 + 2 * (17 - 9.0)
• 5 * 6 / 7
• 5 * 6 // 7
• 5 * 6.0 // 7
• 2.0 ** (3 + 7 % 3) // 2
• 2 ** (3 + 7 % 3) / 2
• 4 ** 0.5
• -4 ** 0.5
• (1 + 1/1000) ** 1000

Answers, in the same order as the questions:

• int: 48
• float: 39.0
• float: 30/7 == 4.28571...6
• int: 30 // 7 == 4
• float: 30.0 // 7 == 4.0
• float: 8.0
• float: 8.0
• float: 2.0
• float: -2.0
• float: 2.71692... $\approx e$

## Control flow: variables

Variables are names which refer to values.

In [34]:
x = 10
2 * x + 4

Out[34]:
24
In [35]:
#Prefer descriptive names over shorthand
import math
planck = 6.63e-34          #Planck's constant in J s
red_planck = planck / (2 * math.pi)
print("%.4e" % red_planck)

1.0552e-34
In [36]:
name = 'Dr. John Smith' #not just numbers: more data types later
len(name)

Out[36]:
14
In [37]:
thing1 = 3.142   #numbers okay in variable names
thing2 = 1.618

In [38]:
3rdthing = 2.718 #except at the start

  File "<ipython-input-38-e4d50dee3627>", line 1
3rdthing = 2.718 #except at the start
^
SyntaxError: invalid syntax

In [41]:
del = 'boy'

  File "<ipython-input-41-6e337587edb8>", line 1
del = 'boy'
^
SyntaxError: invalid syntax
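
Both errors above come from Python's naming rules: a name can't start with a digit, and reserved words such as `del` can't be used as names. As a rough sketch, the standard library can check both rules for you (the example names are taken from the cells above):

```python
import keyword

# Reserved words cannot be used as variable names.
print(keyword.iskeyword("del"))      # a keyword
print(keyword.iskeyword("planck"))   # an ordinary name

# str.isidentifier() checks the spelling rules only
# (letters, digits and underscores, with no leading digit).
# It does not know about keywords.
print("thing1".isidentifier())
print("3rdthing".isidentifier())
print("del".isidentifier())
```

This should print `True`, `False`, `True`, `False`, `True`. A usable variable name needs `isidentifier()` to be true and `iskeyword()` to be false.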


To compare variables and/or values, use two equals signs ==. More on this later.

In [39]:
t = 2

In [40]:
t + t = 4

  File "<ipython-input-40-c6ff51bde1a1>", line 1
t + t = 4
^
SyntaxError: can't assign to operator

In [42]:
t + t == 4

Out[42]:
True

### Quick quiz: what happens here?

In [43]:
x = 1
y = x
x = x * 5


What's $y$ equal to: $1$ or $5$?

In [44]:
y

Out[44]:
1

When we say y = x, we mean

• Make y refer to whatever x refers to

and not

• Make y refer to x

If in doubt: try experimenting!

### Control flow: functions

• Packages and the standard library have many useful functions
• Still useful to write your own: reuse code, break program into smaller problems
In [45]:
def discriminant(a, b, c):
    print("a =", a, "b =", b, "c =", c)
    return b ** 2 - 4 * a * c

• def keyword (define)
• function name (same rules as variables)
• argument list
• colon to mark indentation
• statements: indented with four spaces
• return expression
In [46]:
discriminant(2, 3, 4)       #Give arguments values by position...

a = 2 b = 3 c = 4

Out[46]:
-23
In [47]:
discriminant(b=3, c=4, a=2) #...or explicitly by name

a = 2 b = 3 c = 4

Out[47]:
-23

Python will complain if you don't give a function the right arguments.

In [48]:
discriminant()

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-48-dc883d99b76f> in <module>()
----> 1 discriminant()

TypeError: discriminant() missing 3 required positional arguments: 'a', 'b', and 'c'
In [49]:
discriminant(0, 0)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-49-05674ee3aefb> in <module>()
----> 1 discriminant(0, 0)

TypeError: discriminant() missing 1 required positional argument: 'c'
In [50]:
discriminant(a=1, a=2, a=3)

  File "<ipython-input-50-cf03c5c67bda>", line 1
discriminant(a=1, a=2, a=3)
^
SyntaxError: keyword argument repeated


Arguments can be made optional by giving them default values.

In [51]:
def greet(greeting='Hello', name='stranger'):
    print(greeting, 'to you,', name)

In [52]:
greet()

Hello to you, stranger

In [53]:
greet('David')

David to you, stranger

In [54]:
greet(name='David')

Hello to you, David


Can return more than one value at once:

In [55]:
def consecutive_squares(n):
    return n**2, (n + 1)**2

In [56]:
consecutive_squares(5)

Out[56]:
(25, 36)

The function returns a tuple (more on these later). We can unpack it to get at the individual values:

In [57]:
a, b = consecutive_squares(10)
a

Out[57]:
100
In [58]:
b

Out[58]:
121

#### Variable scope: context matters

In [59]:
a = 3
def double(a):
    a = 2 * a
    return a

In [60]:
double(6)

Out[60]:
12

Function arguments and variables defined in a function are local to the function body.

If there's a name conflict, stuff outside is unaffected.

In [61]:
a

Out[61]:
3

See the Python tutorial for more tips, tricks and examples---including functions that take a variable number of arguments.

### Cheeky challenge

Write a function that implements the quadratic formula.

• Arguments: three numbers $a$, $b$, and $c$
• Return both solutions to $ax^2 + bx + c = 0$
• Return the smaller one first

Reminder: the quadratic formula is $$x = \frac {-b \pm \sqrt{b^2 - 4ac}} {2a}$$

• Use math.sqrt for computing square roots. Don't forget to import!

Let's do a few tests.

• $(x-4)(x+2) = x^2 - 2x - 8$ has roots $x=4, x=-2$.
• $2(x-10)^2 = 2x^2 - 40x + 200$ has a repeated root $x=10$.
print( quadratic_roots(1, -2, -8) )
#assert statements will error if the condition is False.
assert quadratic_roots(1, -2, -8) == (-2, 4)
assert quadratic_roots(2, -40, 200) == (10, 10)
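
One possible solution is sketched below. It assumes that $a \neq 0$ and that the roots are real; with a negative discriminant, `math.sqrt` raises a `ValueError`. The local name `disc` is just a choice made for this sketch.

```python
import math

def quadratic_roots(a, b, c):
    """Return the real roots of a*x**2 + b*x + c = 0, smaller one first."""
    disc = b ** 2 - 4 * a * c
    # math.sqrt raises ValueError when disc < 0 (no real roots);
    # this sketch does not handle that case.
    root = math.sqrt(disc)
    x1 = (-b - root) / (2 * a)
    x2 = (-b + root) / (2 * a)
    # min/max keeps the ordering right even when a is negative.
    return min(x1, x2), max(x1, x2)

print(quadratic_roots(1, -3, 2))   # roots of (x - 1)(x - 2)
```

The roots come back as floats, but comparisons such as `(1.0, 2.0) == (1, 2)` still hold because Python compares numbers by value.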


### Control flow: loops

Basic looping has two important parts:

• for variable in ...:
• range function
In [62]:
for i in range(5):
    print("Hello!")

Hello!
Hello!
Hello!
Hello!
Hello!

• loop body indented with four spaces (like functions)
• colon to denote indentation

#### Python's indexing convention

Something of length $N$ uses indices from $0$ to $N-1$ inclusive.

In [63]:
for i in range(5):
    print("Here's a number:")
    print(i)

Here's a number:
0
Here's a number:
1
Here's a number:
2
Here's a number:
3
Here's a number:
4

• unlike Matlab, Fortran or R (where indexing starts from 1).
• like C, C++, Java, Javascript
• EWD831 discusses different indexing systems
• Wikipedia compares across languages.

#### Controlling integer ranges

The most general form of the range function is

range(start, stop, step)

Where step has default value of 1 when it's missing.

In [64]:
for i in range(5, 10):
    print(i)

5
6
7
8
9

In [65]:
for i in range(10, 20, 2):
    print(i)

10
12
14
16
18


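Python assumes that start ≤ stop. With the default positive step, a range whose start is at or beyond its stop is empty, so the loop body never runs.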

In [66]:
for thing in range(50, 40): #can use any loop variable
    print(thing)


If you want a descending loop you need a negative step.

In [67]:
for thing in range(50, 40, -3):
    print(thing)

50
47
44
41


### Cheeky challenge

Use a loop to compute $$5^2 + 10^2 + 15^2 + 20^2 + \dotsb + 200^2$$

#Again here's a template for you
total = 0
for ... in ...:
    total = total + ...
total

#Here's the answer you should have got:
assert total == 553500
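
One way to fill in the template is sketched here; other choices of range, such as looping over 1 to 40 and squaring `5 * n`, work just as well.

```python
total = 0
for n in range(5, 201, 5):   # 5, 10, ..., 200; the stop value 201 is excluded
    total = total + n ** 2
print(total)
```

This should print `553500`.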


We'll see later that we can loop over all sorts of objects---not just ranges.

In [68]:
for character in "David Matthew Robertson":
    print(character, end=".")

D.a.v.i.d. .M.a.t.t.h.e.w. .R.o.b.e.r.t.s.o.n.

This makes looping a really powerful tool in Python.

Just like other languages, there are while loops and break and continue statements which are a bit less intuitive.

There's too much to go over here---but there are links in the notebook if you're curious.
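
For the curious, here is a very brief sketch of those three statements. The numbers and variable names are arbitrary choices for illustration.

```python
# while: keep looping for as long as the condition holds.
n = 100
halvings = 0
while n > 1:
    n = n // 2
    halvings = halvings + 1
print(halvings)

# continue skips to the next iteration; break leaves the loop early.
multiples_total = 0
for i in range(20):
    if i % 3 != 0:
        continue          # not a multiple of 3: skip it
    if i > 10:
        break             # stop at the first multiple of 3 beyond 10
    multiples_total = multiples_total + i
print(multiples_total)
```

This should print `6` (100 → 50 → 25 → 12 → 6 → 3 → 1) and then `18` (0 + 3 + 6 + 9).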

### Control flow: conditionals

A very important tool in the programmer's toolkit is the ability to do different things in different circumstances.

Enter the if statement:

In [69]:
i = 10
if i % 2 == 0:
    print(i, "is even")

10 is even

• Colon, then four spaces before body statements
• Main expression usually a boolean: True or False
• Use comparisons like <, <=, ==, !=, >=, > to make booleans
In [70]:
1 < 2    #less than

Out[70]:
True
In [71]:
2 <= 0.2   #less than or equal

Out[71]:
False
In [72]:
3 == 3.0   #equal

Out[72]:
True
In [73]:
"cat" != "dog" #not equal

Out[73]:
True
In [74]:
x = 10
1 < x < 15 #Mathematical notation for "(1 < x) and (x < 15)"

Out[74]:
True

Let's take our previous if statement and put it in a loop.

Whenever we start a new block (line ending in a colon), we have to indent an extra four spaces.

In [75]:
for i in range(5):
    if i % 2 == 0:
        print(i, "is even")

0 is even
2 is even
4 is even


We can handle the False case with an else statement.

In [76]:
for i in range(5):
    if i % 2 == 0:
        print(i, "is even")
    else:
        print(i, "is odd")

0 is even
1 is odd
2 is even
3 is odd
4 is even


For finer control, use an if... elif... else... chain.

Here elif is short for "else if".

In [77]:
import datetime
now = datetime.datetime.now()
print("The time is", now, "and the hour is:", now.hour)
if 6 <= now.hour < 12:
    print("Good morning!")
elif 12 <= now.hour < 18:   #explicit lower bounds, so hours before 6 fall through to else
    print("Good afternoon!")
elif 18 <= now.hour < 20:
    print("Good evening!")
else:
    print("Good night!")

The time is 2017-04-11 12:48:23.167619 and the hour is: 12
Good afternoon!

• else is optional and always comes last.
• Need to have if before any elifs.
• Can have as many elifs as you like.

### Cheeky challenge

The sign or signum function is defined by $$\operatorname{sign}(x) = \begin{cases} \phantom{-}1 & \text{if } x > 0 \\ \phantom{-}0 & \text{if } x = 0 \\ -1 & \text{if } x < 0 \end{cases}$$

Implement this as a Python function.

#And some tests:
assert sign(10) == 1
assert sign(0) == 0
assert sign(-23.4) == -1
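
A possible solution, sketched as an if... elif... else... chain. It assumes x is an ordinary int or float.

```python
def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    if x > 0:
        return 1
    elif x == 0:
        return 0
    else:
        return -1

print(sign(10), sign(0), sign(-23.4))
```

This should print `1 0 -1`.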

• Quick mention: can perform logical operations on booleans with and, or, and not.
In [78]:
True and False

Out[78]:
False
In [79]:
True or False

Out[79]:
True
In [80]:
not False

Out[80]:
True
In [81]:
not False and False    #careful with order of operations

Out[81]:
False
In [82]:
not (False and False)

Out[82]:
True

## Data types: strings

• Any textual data: plot labels, file names, ...
• Enclosed by single (') or double quotes (")
• Any Unicode character okay
In [83]:
supercal = "Supercalifragilisticexpialidocious"
starwars = 'No, I am your father'  # spaces okay
greeting = "こんにちは (Konnichiwa)" # non-Latin characters okay

• Use \n to stand for a newline
• Use \' or \" for literal quotes
• Use \\ for a literal backslash
• Spaces preserved
In [84]:
print("A short 'quote'\n     a double quote char: \"\n and newlines!")

A short 'quote'
     a double quote char: "
 and newlines!


### Python is pedantic when comparing

In [85]:
'2' == 2            #different types!

Out[85]:
False
In [86]:
type('2'), type(2)

Out[86]:
(str, int)
In [87]:
'True' == True

Out[87]:
False
In [88]:
type('True'), type(True)

Out[88]:
(str, bool)

### String methods

A list of handy functions for working with strings. Full reference online.

In [89]:
vowels = "aeiou"
vowels.upper()

Out[89]:
'AEIOU'
In [90]:
vowels.lower() #already lowercase

Out[90]:
'aeiou'
In [91]:
vowels.capitalize()

Out[91]:
'Aeiou'
In [92]:
len(supercal)   #length function

Out[92]:
34
In [93]:
supercal.count("a")

Out[93]:
3

Silly example: a function which processes a yes/no prompt (y/n)

In [94]:
def handle_response(response):
    if response.startswith("y"):
        return "positive response"
    elif response.startswith("n"):
        return "negative response"
    else:
        return "unclear response"

In [95]:
handle_response("yes")

Out[95]:
'positive response'
In [96]:
handle_response("no way man that's unreasonable")

Out[96]:
'negative response'

What happens when we call with these arguments? Guess, then check in the notebook.

• handle_response()
• handle_response("")
• handle_response("YES")
• handle_response(" yes ")

handle_response()

• TypeError: missing argument

handle_response("")

• Unclear response: the empty string "" doesn't start with anything!

handle_response("YES")

• Unclear response: upper/lowercase matters for comparison
In [97]:
'Y' == 'y'

Out[97]:
False

handle_response(" yes ")

• Unclear response: first char is a space

Often useful to normalise strings to a sensible form, especially if they come from user input.

In [98]:
response = "    YeS   "
response = response.lower()
print( repr(response) )      # explicit representation with repr()
response = response.strip()  # remove whitespace from start and end
print( repr(response) )

'    yes   '
'yes'

In [99]:
x = "The news media reported today that no news is in fact good news"
x.replace("news", "FAKE NEWS!!")

Out[99]:
'The FAKE NEWS!! media reported today that no FAKE NEWS!! is in fact good FAKE NEWS!!'

### Slicing

Remember that indexing works from $0$ to $N - 1$:

In [100]:
supercal[0]

Out[100]:
'S'
In [101]:
supercal[5]

Out[101]:
'c'
In [102]:
supercal[0:5] #like range, slicing excludes upper limit

Out[102]:
'Super'
In [103]:
supercal[-1]  #Last char

Out[103]:
's'
In [104]:
supercal[:5] + "..." + supercal[-4:] #first five, then last 4

Out[104]:
'Super...ious'
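
Like range, a slice also accepts an optional third value, the step. A short sketch, with variable names chosen only for illustration:

```python
word = "Supercalifragilisticexpialidocious"

every_other = word[0:10:2]   # characters at indices 0, 2, 4, 6, 8
backwards = "demo"[::-1]     # a step of -1 walks the string in reverse
last_five = word[-5:]        # negative start counts from the end

print(every_other, backwards, last_five)
```

This should print `Sprai omed cious`.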

### Concatenation

In [105]:
name = "David"
"Good morning, " + name + "."

Out[105]:
'Good morning, David.'
• Use * as shorthand for repetition.
In [106]:
'thank you ' * 10

Out[106]:
'thank you thank you thank you thank you thank you thank you thank you thank you thank you thank you '

Even more complicated string handling is available in the standard library.

### Looping over strings

Awkward way:

In [107]:
example = "demo"
for i in range(len(example)):
    print(example[i])

d
e
m
o


Slick way:

In [108]:
for character in "demo":
    print(character)

d
e
m
o


### Cheeky Challenge

Write a function to count the number of vowels in a string. Assume that we're just working with the Roman alphabet---so don't worry about variants like ë, è, é, and ê.

For bonus points, try using a loop to write this function.

In [109]:
#Here's a space to write your function

#and some tests to run
assert your_function("Hello") == 2
assert your_function(" xyz HEllO") == 2
assert your_function("Hello, sailor") == 5
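
Here is one possible loop-based solution. The name `count_vowels` is our own choice; the tests above call it `your_function`. It lowercases the text first so that "E" and "e" are both counted.

```python
def count_vowels(text):
    """Count the vowels a, e, i, o, u in text, ignoring case."""
    count = 0
    for character in text.lower():
        if character in "aeiou":
            count = count + 1
    return count

print(count_vowels("Hello"), count_vowels(" xyz HEllO"), count_vowels("Hello, sailor"))
```

This should print `2 2 5`.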


## Data types: lists and tuples

• Lists: a sequence of arbitrary Python objects
In [110]:
greek_letters = ["alpha", "beta", "gamma", "delta"]
greek_letters[1] #Index just like strings: 0 to N-1.

Out[110]:
'beta'
• Lists can be modified in-place
In [111]:
greek_letters[1] = "BETA (β)"
greek_letters

Out[111]:
['alpha', 'BETA (β)', 'gamma', 'delta']
• Lists can contain objects of different types
In [112]:
things = ["uno", "dos", 3, supercal, 2.718]

• Unless they're modified, lists have a fixed length
In [113]:
len(things)

Out[113]:
5
• Lists are objects, so lists can even contain lists!
In [114]:
names_by_parts = [ ["David", "Robertson"], ["Cetin", "Can", "Evirgen"] ]
print( names_by_parts[0] )
print( names_by_parts[0][1] )

['David', 'Robertson']
Robertson


### Quick Quiz

What is len(names_by_parts)?

• 2
• 3
• 4
• 5
In [115]:
len(names_by_parts)

Out[115]:
2
• A list doesn't know anything special about what it contains
• Can't access or add new list items by accident
In [117]:
greek_letters[4]

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-117-85a9bd08274d> in <module>()
----> 1 greek_letters[4]

IndexError: list index out of range
In [118]:
greek_letters[4] = 'EPSILON (ε)'

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-118-233d7087fd08> in <module>()
----> 1 greek_letters[4] = 'EPSILON (ε)'

IndexError: list assignment index out of range
In [119]:
greek_letters.append("EPSILON (ε)")
greek_letters

Out[119]:
['alpha', 'BETA (β)', 'gamma', 'delta', 'EPSILON (ε)']
• Other useful list methods and idioms:
In [120]:
empty_list = []
print(empty_list, len(empty_list))

[] 0

In [121]:
numbers = [5, 2, 64, 41, 27, -2, 11, 32]

In [122]:
numbers.sort()     #modifies list in place
numbers

Out[122]:
[-2, 2, 5, 11, 27, 32, 41, 64]
In [123]:
["ab", 1].sort()  # Can't compare text with numbers

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-123-c281d25ec96a> in <module>()
----> 1 ["ab", 1].sort()  # Can't compare text with numbers

TypeError: '<' not supported between instances of 'int' and 'str'
In [124]:
x = list(range(10, 20))
x

Out[124]:
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
In [125]:
del x[2]  #Delete the entry with index 2 (third entry)
x

Out[125]:
[10, 11, 13, 14, 15, 16, 17, 18, 19]
In [126]:
print("POP:", x.pop(), x)

POP: 19 [10, 11, 13, 14, 15, 16, 17, 18]

In [127]:
x.reverse() #modifies in place
x

Out[127]:
[18, 17, 16, 15, 14, 13, 11, 10]
In [128]:
x.insert(4, "surprise")
x

Out[128]:
[18, 17, 16, 15, 'surprise', 14, 13, 11, 10]

NB: It's quick to extend lists at the end, but inserting or deleting near the start is slower. If your list is HUGE then this can become a problem.

Looping over lists is just like strings.

Warning: don't modify list structure when looping! (Modifying list values is fine)

In [129]:
colours = ["red", "orange", "yellow", "green", "blue", "indigo", "violet"]
for colour in colours:
    print(colour, "has", len(colour), "letters" )

red has 3 letters
orange has 6 letters
yellow has 6 letters
green has 5 letters
blue has 4 letters
indigo has 6 letters
violet has 6 letters

In [130]:
for colour in colours:
    colours.pop()
colours

Out[130]:
['red', 'orange', 'yellow']
In [131]:
for i, colour in enumerate(colours): #avoids range(len(colours))
    colours[i] = colour.upper()
colours

Out[131]:
['RED', 'ORANGE', 'YELLOW']
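
If you do need to drop items while looping, two common workarounds are sketched below: build a new list, or loop over a copy. The "more than four letters" rule is an arbitrary example.

```python
colours = ["red", "orange", "yellow", "green", "blue", "indigo", "violet"]

# Option 1: build a new list containing only the items to keep.
long_names = [colour for colour in colours if len(colour) > 4]

# Option 2: loop over a copy (colours[:]) while removing from the original.
for colour in colours[:]:
    if len(colour) <= 4:
        colours.remove(colour)

print(long_names)
print(colours)
```

Both lines should print `['orange', 'yellow', 'green', 'indigo', 'violet']`. Option 1 is usually easier to read and leaves the original list untouched.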

### Cheeky challenge

The following lines will read a list of words from a data file. Use Python to find:

• The first, middle and last word in the list
• The percentage of words containing an e
• Hint: use str.find; or better the in operator
• All two-letter words in the list (good for Scrabble)
In [132]:
with open('data/en-GB-words.txt', 'rt') as f:
    words = [line.strip() for line in f]
print(len(words), "words. Number 2001 is", words[2000])

99156 words. Number 2001 is Booth

In [133]:
N = len(words)
print(words[0], words[N//2], words[-1], sep=", ")

A, harks, études

In [134]:
count = 0
for word in words:
    if 'e' in word:
        count = count + 1
print(count, 100 * count / len(words))

63152 63.689539715196254

In [135]:
two_letter_words = []
for word in words:
    if len(word) == 2:
        two_letter_words.append(word)
print(two_letter_words)

['Ac', 'Ag', 'Al', 'Am', 'Ar', 'As', 'At', 'Au', 'Av', 'Ba', 'Be', 'Bi', 'Bk', 'Br', 'Ca', 'Cd', 'Cf', 'Ci', 'Cl', 'Cm', 'Co', 'Cr', 'Cs', 'Cu', 'Di', 'Dr', 'Ed', 'Er', 'Es', 'Eu', 'Fe', 'Fm', 'Fr', 'GE', 'Ga', 'Gd', 'Ge', 'He', 'Hf', 'Hg', 'Ho', 'Hz', 'In', 'Io', 'Ir', 'It', 'Jo', 'Jr', 'Kr', 'La', 'Le', 'Li', 'Ln', 'Lr', 'Lt', 'Lu', 'Mb', 'Md', 'Mg', 'Mn', 'Mo', 'Mr', 'Ms', 'Mt', 'Na', 'Nb', 'Nd', 'Ne', 'Ni', 'Np', 'OK', 'Ob', 'Os', 'Oz', 'Pa', 'Pb', 'Pd', 'Pl', 'Pm', 'Po', 'Pt', 'Pu', 'Ra', 'Rb', 'Rd', 'Re', 'Rh', 'Rn', 'Ru', 'Rx', 'Sb', 'Sc', 'Se', 'Si', 'Sm', 'Sn', 'Sq', 'Sr', 'St', 'Ta', 'Tb', 'Tc', 'Th', 'Ti', 'Tl', 'Tm', 'Ty', 'Ur', 'Va', 'Wm', 'Wu', 'Xe', 'Yb', 'Zn', 'Zr', 'ad', 'ah', 'am', 'an', 'as', 'at', 'ay', 'be', 'by', 'cs', 'dB', 'do', 'eh', 'em', 'es', 'ex', 'fa', 'go', 'gs', 'ha', 'he', 'hi', 'ho', 'id', 'if', 'in', 'is', 'it', 'kW', 'kc', 'ks', 'la', 'lo', 'ls', 'ma', 'me', 'mi', 'ms', 'mu', 'my', 'no', 'nu', 'of', 'oh', 'on', 'or', 'ow', 'ox', 'pH', 'pa', 'pi', 're', 'rs', 'sh', 'so', 'ti', 'to', 'ts', 'uh', 'um', 'up', 'us', 'vs', 'we', 'ye', 'yo']


### Tuples

• The same as a list, except can't be modified after creation.
• Created with round brackets, not square
• Still indexed from $0$ to $N-1$
In [136]:
coordinate = (1, 2, 3)
coordinate

Out[136]:
(1, 2, 3)
In [137]:
coordinate[0]

Out[137]:
1
In [138]:
coordinate[0] = 10

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-138-9acc16226b5e> in <module>()
----> 1 coordinate[0] = 10

TypeError: 'tuple' object does not support item assignment
In [139]:
x, y, z = coordinate      #tuple unpacking
print(x, y, z, x + y + z)

1 2 3 6


In fact, when you say return a, b from a function, what gets returned is the tuple (a, b)!

## Data types: dictionaries

• Collection of pairs key -> value, looked up by key rather than by position (insertion order is kept since Python 3.7)
• Keys usually strings
• "Hashmap", "Associative array"
In [140]:
david = dict(
    surname = "Robertson",
    given_names = ["David", "Matthew"],
    age = 24,
    dob = "26/06/1992",
    height = 190
)
david

Out[140]:
{'age': 24,
'dob': '26/06/1992',
'given_names': ['David', 'Matthew'],
'height': 190,
'surname': 'Robertson'}
• Index by key to get/set values
In [141]:
david['age'] = "very very very very very very old"
david['age']

Out[141]:
'very very very very very very old'

### Three ways to loop:

In [142]:
for key in david:
    print(key, end=", ")

surname, given_names, age, dob, height,
In [143]:
for value in david.values():
    print(value, end=", ")

Robertson, ['David', 'Matthew'], very very very very very very old, 26/06/1992, 190,
In [144]:
for key, value in david.items():
    print(key, "->", value)

surname -> Robertson
given_names -> ['David', 'Matthew']
age -> very very very very very very old
dob -> 26/06/1992
height -> 190

• Dictionaries have a length too:
In [145]:
len(david)

Out[145]:
5
• Python will complain if you ask for a missing key:
In [146]:
david['weight']

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-146-01a69e534c10> in <module>()
----> 1 david['weight']

KeyError: 'weight'
• Can check if a key is present with in:
In [147]:
'surname' in david

Out[147]:
True
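
When a missing key is expected rather than an error, the dictionary method `get` returns a fallback instead of raising `KeyError`. A small sketch, using a stand-in dictionary rather than the one above:

```python
person = {"surname": "Robertson", "height": 190}   # small stand-in dictionary

missing = person.get("weight")                     # no KeyError: returns None
fallback = person.get("weight", "not recorded")    # or a default of our choosing
height = person.get("height", "not recorded")      # key present: default ignored

print(missing, fallback, height)
```

This should print `None not recorded 190`. Indexing with square brackets is still the better choice when a missing key really is a mistake, since the error points you straight at it.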

### Cheeky challenge

The following data file contains the periodic table as a dictionary. We're going to load it into a list, and each entry of that list will be a dictionary.

In [148]:
import os
import json
with open("data/PeriodicTable.json", "rt") as f:

In [149]:
table[0]

Out[149]:
{'appearance': 'colorless gas',
'atomic_mass': 1.008,
'boil': 20.271,
'category': 'diatomic nonmetal',
'color': None,
'density': 0.08988,
'discovered_by': 'Henry Cavendish',
'melt': 13.99,
'molar_heat': 28.836,
'name': 'Hydrogen',
'named_by': 'Antoine Lavoisier',
'number': 1,
'period': 1,
'phase': 'Gas',
'shells': [1],
'source': 'https://en.wikipedia.org/wiki/Hydrogen',
'spectral_img': 'https://en.wikipedia.org/wiki/File:Hydrogen_Spectra.jpg',
'summary': 'Hydrogen is a chemical element with chemical symbol H and atomic number 1. With an atomic weight of 1.00794 u, hydrogen is the lightest element on the periodic table. Its monatomic form (H) is the most abundant chemical substance in the Universe, constituting roughly 75% of all baryonic mass.',
'symbol': 'H',
'xpos': 1,
'ypos': 1}

• Which element is densest?
• Create a new dictionary mapping elements' symbols to their names. For example, if D is the dictionary, D['H'] == 'Hydrogen'.
• Sorted alphabetically, what's the first and last element symbol?
• Sorted alphabetically, what's the first and last element name?
• How many elements' symbols have a different first letter to their name?
In [150]:
max_density = 0
max_density_name = ""
for element in table:
    if element['density'] is not None and element['density'] > max_density:
        max_density = element['density']
        max_density_name = element['name']
max_density, max_density_name

Out[150]:
(40.7, 'Hassium')
In [151]:
shorthand = {}
for element in table:
    symbol = element['symbol']
    name = element['name']
    shorthand[symbol] = name
shorthand

Out[151]:
{'Ac': 'Actinium',
'Ag': 'Silver',
'Al': 'Aluminium',
'Am': 'Americium',
'Ar': 'Argon',
'As': 'Arsenic',
'At': 'Astatine',
'Au': 'Gold',
'B': 'Boron',
'Ba': 'Barium',
'Be': 'Beryllium',
'Bh': 'Bohrium',
'Bi': 'Bismuth',
'Bk': 'Berkelium',
'Br': 'Bromine',
'C': 'Carbon',
'Ca': 'Calcium',
'Ce': 'Cerium',
'Cf': 'Californium',
'Cl': 'Chlorine',
'Cm': 'Curium',
'Cn': 'Copernicium',
'Co': 'Cobalt',
'Cr': 'Chromium',
'Cs': 'Cesium',
'Cu': 'Copper',
'Db': 'Dubnium',
'Dy': 'Dysprosium',
'Er': 'Erbium',
'Es': 'Einsteinium',
'Eu': 'Europium',
'F': 'Fluorine',
'Fe': 'Iron',
'Fl': 'Flerovium',
'Fm': 'Fermium',
'Fr': 'Francium',
'Ga': 'Gallium',
'Ge': 'Germanium',
'H': 'Hydrogen',
'He': 'Helium',
'Hf': 'Hafnium',
'Hg': 'Mercury (element)',
'Ho': 'Holmium',
'Hs': 'Hassium',
'I': 'Iodine',
'In': 'Indium',
'Ir': 'Iridium',
'K': 'Potassium',
'Kr': 'Krypton',
'La': 'Lanthanum',
'Li': 'Lithium',
'Lr': 'Lawrencium',
'Lu': 'Lutetium',
'Lv': 'Livermorium',
'Mc': 'Moscovium',
'Md': 'Mendelevium',
'Mg': 'Magnesium',
'Mn': 'Manganese',
'Mo': 'Molybdenum',
'Mt': 'Meitnerium',
'N': 'Nitrogen',
'Na': 'Sodium',
'Nb': 'Niobium',
'Nd': 'Neodymium',
'Ne': 'Neon',
'Nh': 'Nihonium',
'Ni': 'Nickel',
'No': 'Nobelium',
'Np': 'Neptunium',
'O': 'Oxygen',
'Og': 'Oganesson',
'Os': 'Osmium',
'P': 'Phosphorus',
'Pa': 'Protactinium',
'Pm': 'Promethium',
'Po': 'Polonium',
'Pr': 'Praseodymium',
'Pt': 'Platinum',
'Pu': 'Plutonium',
'Rb': 'Rubidium',
'Re': 'Rhenium',
'Rf': 'Rutherfordium',
'Rg': 'Roentgenium',
'Rh': 'Rhodium',
'Ru': 'Ruthenium',
'S': 'Sulfur',
'Sb': 'Antimony',
'Sc': 'Scandium',
'Se': 'Selenium',
'Sg': 'Seaborgium',
'Si': 'Silicon',
'Sm': 'Samarium',
'Sn': 'Tin',
'Sr': 'Strontium',
'Ta': 'Tantalum',
'Tb': 'Terbium',
'Tc': 'Technetium',
'Te': 'Tellurium',
'Th': 'Thorium',
'Ti': 'Titanium',
'Tl': 'Thallium',
'Tm': 'Thulium',
'Ts': 'Tennessine',
'U': 'Uranium',
'W': 'Tungsten',
'Xe': 'Xenon',
'Y': 'Yttrium',
'Yb': 'Ytterbium',
'Zn': 'Zinc',
'Zr': 'Zirconium'}
In [152]:
symbols = list(shorthand)
symbols.sort()
symbols[0], symbols[-1]

Out[152]:
('Ac', 'Zr')
In [153]:
names = list(shorthand.values())
names.sort()
names[0], names[-1]

Out[153]:
('Actinium', 'Zirconium')
In [154]:
oddballs = {}
for symbol, name in shorthand.items():
    if name[0] != symbol[0]:
        oddballs[symbol] = name
oddballs

Out[154]:
{'Ag': 'Silver',
'Au': 'Gold',
'Fe': 'Iron',
'Hg': 'Mercury (element)',
'K': 'Potassium',
'Na': 'Sodium',
'Sb': 'Antimony',
'Sn': 'Tin',
'W': 'Tungsten'}

## After lunch:

• Can will give a crash course in Python's scientific libraries
• Some more exercises, chances to practise