Python for physicists

Basic python

To start this tutorial, you will need essentially three things

  • python 2.7 or 3 installed on your computer with a working numpy package
  • a decent text editor (emacs, vi, gedit, kate)
  • a terminal

These minimal reqirements are met by all recent GNU/Linux distributions and iOS. They may be easy to reproduce under Windows.

We will be keeping the text editor and the terminal open all the time, write the code in the text editor and execute it from the command line on the terminal

python hello.py

If you prefer working with an integrated development environment (IDE), please help yourself.

To write a computer code, physicists need basically three things

  • scalars
  • vectors (1d arrays)
  • matrices (2d arrays)

plus the usual structures for loops and conditional statements.

This defines the core set of features on which we will focus on in the first part of the tutorial.

Documentation

A few authoritative sources of information on python:

  • The official documentation available at https://docs.python.org/3/library/
  • The integrated documentation accessible from the interactive python and (even better!) ipython sessions

In addition

A matter of style

The golden rules to write top-quality Python code are crystallized in so-called PEP-8 http://legacy.python.org/dev/peps/pep-0008/.

"One of Guido's key insights is that code is read much more often than it is written. The guidelines provided here are intended to improve the readability of code and make it consistent across the wide spectrum of Python code. As PEP 20 says, Readability counts."

While python tolerates ad-hoc custom conventions on spacings, name conventions etc, the PEP-8 provides the authoritative reference to write good Python code.

The philosophy of Python coding is beautifully described by the Zen of Python (PEP-20), which you should deeply ponder every night before going to sleep

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Scalars

Variables in python behave a bit differently then in statically-types languages like C or Fortran and they are not bound to a type

x = 1
print(x, type(x))
x = 1.0
print(x, type(x))
x = 'hello world'
print(x, type(x))
1 <class 'int'>
1.0 <class 'float'>
hello world <class 'str'>

Under the hoods, x is a reference to an object (in our case, first to 1, then to 1.0, finally to ='hello world'= ) and the above assignements simly change the object the variable x points to.

While that can be confusing at first if you have programmed in C or Fortran, the dynamic character of python variables turns out to be very convenient, especially when writing small scripts.

[ see basics/scalars.py ]

One can transform variables into another very much like in other languages

x_i = 1
x_s = str(x_i)  # returns a string representation of x_i
x_f = float(x_s)  # returns a float from a string

To get more control on the way variables are "formatted" to form strings use the format intrinsic method

print('variable x={} and is of type {}'.format(x, type(x)))
# Same as before, 0 and 1 are the positional arguments passed to format()
print('variable x={0} and is of type {1}'.format(x, type(x)))
# Same as before, but variables are passed as keyword arguments to format()
print('variable x={value} and is of type {type}'.format(value=x, type=type(x)))
variable x=1.0 and is of type <class 'float'>
variable x=1.0 and is of type <class 'float'>

Formatting variables can get fairly sophisticated

# Center string
print('{:^30}'.format('hello world'))
# Center string and fill line with stars
print('{:*^30}'.format('hello world'))
# Float variable y is printed with 6 digits and left justified
print('x={:<28.6f}'.format(1.0))
# Integer variable is right justified
print('y={:>28d}'.format(10))
         hello world          
*********hello world**********
x=1.000000                    
y=                          10

Refer to the python documentation https://docs.python.org/3/library/string.html#format-string-syntax for more details.

Loops and conditionals

Let us introduce some basic control structures. To identify the blocks of code that will be executed withing a for loop or as part of an if construct, python uses a strong convention on code indentation. This is probably one of the biggest shift if you come from a language like C or Fortran.

for i in range(10):
    if i == 0:
        # Is zero even or odd?
        # We do nothing and jump to next iteration
        continue
    elif i % 2 == 1:
        print('{} is even'.format(i))
    else:
        print('{} is odd, we get out of the loop'.format(i))
        break
    # Now the if block is over
# Now the for block is over
1 is even
2 is odd, we get out of the loop

[ see basics/loops_and_conditions.py ]

Have you noticed how the blocks are indented? When you hit tab on a new line after a loop or an if, your text editor will indent the line as needed (if not, change text editor!).

Typically, we use 4 spaces to separate code blocks. When the block is over, we get back to the previous indentation level.

In the previous loop we did 10 iterations using range(10). The range() function allows one to generate custom sequences of integer number, as you can see from this example

# We loop from 2 to 7 with a jump of 2
for i in range(2, 8, 2):
    print(i)
2
4
6

Notice how the value 8 is not generated. This is because range() follows a C-like convention.

range() is typically used to generate the indices of a loop. More "pythonic" idioms can be used instead of range(), but we will mostly stick to the latter in the following.

Lists

What the range() function does is generating a list of integers (*), which are used to assign the values of the loop variable i. This brings us to investigate more closely the notion of list, which behaves under some respects as an array.

Compared to arrays, lists have one main advantage

  • elements in a list can be added and removed dynamically

and one main drawback

  • accessing elements of a list takes way more time than you would expect from an array

(*) technically, in python 3, range() returns an iterator, not a list.

With these ideas in mind, let us have a look at the syntax to manipulate lists

# A list of integers
x = [1, 2, 3]
# Add an element
x.append(4)
print(x)
# Remove the first occurrence of element "2" from the list
x.remove(2)
print(x)
# Find the index of element "3" in the list
print(x.index(3))
[1, 2, 3, 4]
[1, 3, 4]
1

[ see basics/lists.py ]

There is a useful shortcut to generate a list of entries all equal to one another

N = 10
x = [0.0] * N
print('The list x has', len(x), 'elements')
The list x has 10 elements

Accessing individual entries of a list can be done with a familiar "array-like" syntax

x[0] = 1  # first element
x[-1] = 1  # last element
print(x[0] - x[-1])
0

The powerful "list comprehension" syntax allows us to generate or modify lists in a compact way

x = [0.0 for i in range(10)]
print(x)
x = [i**2 for i in range(10)]
print(x)
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

A convenient function is enumerate() which returns the entries of a list along with the index

x = [i**2 for i in range(5)]
y = [i**3 for i in range(5)]
for i, xi in enumerate(x):
    print(i, xi, x[i], y[i])
0 0 0 0
1 1 1 1
2 4 4 8
3 9 9 27
4 16 16 64

It is possible to iterate over multiple lists element by element "in sync" using the zip() intrinsic function

for xi, yi in zip(x, y):
    print(xi + yi)

Arrays

Lists are very powerful and flexible data structures but they are too slow for number crunching calculations: accessing the entries of a list is pretty inefficient.

To see that let us do some simple linear algebra on the elements on two 5000-elements lists

N = 5000
x = [1.0] * N
y = [2.0] * N
z = [0.0] * N
a = 3.0
# Repeat the inner loop many times
for repeat in range(10000):
    for i in range(N):
        z[i] = x[i] + a * y[i]

We have repeated the inner loop many times to increase the computational time.

[ see basics/arrays_list.py ]

Let us do the same with the numpy package, transforming the lists x, y, z into numpy arrays

import numpy
N = 5000
x = [1.0] * N
y = [2.0] * N
z = [0.0] * N
a = 3.0
x = numpy.array(x)
y = numpy.array(y)
z = numpy.array(z)
for repeat in range(10000):
    for i in range(N):
        z[i] = x[i] + a * y[i]

[ see basics/arrays_numpy.py ]

As you might have noticed, things have not improved much… The problem is that numpy arrays are only efficient when operations are done on arrays as a whole (or on arrays slices), not element-wise! Let's try this way:

import numpy
for repeat in range(10000):
    z[:] = x[:] + a * y[:]

This time is much faster! Notice how we have used a "vector syntax" to express the fact that the linear combination of the two arrays can be performed element-wise "in parallel".

We can also use the following syntax to get the same result.

import numpy
for repeat in range(10000):
    z = x + a * y

Modern Fortran programmers will recognize the familiar syntax for array operation.

It is also possible to operate on array subsections using a powerful slicing syntax such as

# Copy the first 10 elements
z[0: 10] = a * y[0: 10]
# Can you guess what this does?
z[0: 10: 2] = a * y[-1: -10: -2]

See the numpy array documentation for more details.

Matrices

A very simple way to define a matrix in python is by constructing a list of lists (very much like in C one would use pointers to pointers).

N = 4
x = []
for i in range(N):
  y = [i] * N
  x.append(y)
# The full matrix
print(x)
# The line (1,:) of the matrix
print(x[1][:])
# Element (1,2) of the matrix
print(x[1][2])
[[0, 0, 0, 0], [1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]]
[1, 1, 1, 1]
1

[ see basics/matrices.py ]

The numpy package provides general n-dimensional arrays to do efficient linear algebra on matrices of arbitrary dimensions. Let us define a 2x3 matrix

import numpy
N = 2
M = 3
# Matrix, elements not initialized
x = numpy.ndarray((N, M))
print(x.shape)
print(x)
# Matrix, all elements set to zero
x = numpy.zeros((N, M))
print(x)
(2, 3)
[[0. 0. 0.]
 [0. 0. 0.]]
[[0. 0. 0.]
 [0. 0. 0.]]

The shape attribute is a list (actually a tuple, but it doesn't matter) that specifies the size of the array along each dimension.

The syntax to access the matrix elements is simple.

# Get the difference between the first and last element of the first row
dx = x[0, 0] - x[0, -1]

Notice the syntax to access the last element. Negative indices allow to get elements going backward from the end of the line, ex. x[0, -2], x[0, -3]

The numpy package provides useful but basic operations on arrays and matrices, such as transpose, dot and outer product and many others.

An even more complete environment for linear algebra is provided by the scipy package. If you haven't done it already, it is the right time to install it via your software manager or via pip!

Exercise: probability density function

As a first application of these basic concepts, write a piece of Python code that produces an histogram and/or the probability density of a sequence of random numbers. The latter can be easily generated using the python random module

import random
# Generate a flat distribution in the interval [0, 1)
N = 1000
random_numbers = [random.random() for i in range(N)]

If you do not want to reinvent the wheel, you can use the numpy.histogram() function, which computes histograms and probability densities. However, make sure you understand how this function handles the bins of the histogram!

Then, extend your code to compute the pdf of a Gaussian random variable (look for the right function in the random module!)

Intermediate python

In the second part of the tutorial, we will have a look at

  • functions
  • I/O
  • dictionaries
  • command line interface

and a very powerful advanced feature: how to interface Python and Fortran with f2py!

Read and write files

Compared to languages like C and Fortran, reading and writing data from/to files is a piece of cake.

Let us start by writing some data to a file. Here we use plain python

N = 5
x = [1.0] * N
y = [2.0] * N
# Open file in writing mode ('w')
with open('/tmp/data.txt', 'w') as fh:
    for i in range(N):
        fh.write('{} {}\n'.format(x[i], y[i]))

Remember to add a newline with \n

[ see intermediate/read_and_write.py ]

Let us read the data back

x = []
y = []
# Default is reading mode ('r')
with open('/tmp/data.txt') as fh:
    for line in fh:
        # Use split() method to separate the elements on the line. 
        # It returns a list
        data = line.split()
        x.append(data[0])
        y.append(data[1])

# Check the data
print('x = {} y = {}'.format(x, y))
x = ['1.0', '1.0', '1.0', '1.0', '1.0'] y = ['2.0', '2.0', '2.0', '2.0', '2.0']

Note that the line variable is a string and so are the elements of data. If we want to store lists of floats, we must cast the variables explicitly.

Reading the file can be done more easily with the loadtxt() function in numpy

import numpy
# The unpack argument is necessary to get back the data as columns
x, y = numpy.loadtxt('/tmp/data.txt', unpack=True)
# Check the data
print('x = {} y = {}'.format(x, y))
x = [1. 1. 1. 1. 1.] y = [2. 2. 2. 2. 2.]

Note how we get back numpy arrays instead of lists. By the way, it is easy to transform lists in numpy arrays and viceversa

print(type(x))
x = list(x)
print(type(x))
x = numpy.array(x)
print(type(x))
<class 'numpy.ndarray'>
<class 'list'>
<class 'numpy.ndarray'>

Functions

Let us compute the harmonic sphere potential $$u(r) = \frac{1}{2}\epsilon [1 - (r/a)^{2}]$$

def harmonic_sphere(r, epsilon=1.0, sigma=1.0):
    """Harmonic sphere potential. By default, sigma and epsilon are set to 1."""
    if r < sigma:
        return 0.5 * epsilon * (1 - (r/sigma)**2)
    else:
        return 0

# Different ways to call the function
print(harmonic_sphere(0.8), harmonic_sphere(0.8, sigma=1.0, epsilon=1.0))
0.17999999999999994 0.17999999999999994

It is good practice to document the purpose of the function with a docstring enclosed in between triple apostrophes ="""=.

[ see intermediate/functions.py ]

If we want to add the derivative, we can return two variables

def harmonic_sphere_1st(r, epsilon=1.0, sigma=1.0):
    if r < sigma:
        u, uu = 0.5 * epsilon * (1 - (r/sigma)**2), - epsilon/sigma * (1 - (r/sigma))
    else:
        u, uu = 0, 0
    return u, uu

When calling these function, we can store the result in a tuple and then discover how many variables have been returned.

# If we are sure the function returns two values
u, uu = harmonic_sphere_1st(0.8)

# Alternatively...
data = harmonic_sphere_1st(0.8)
if len(data) == 2:
    print('potential={u} derivative={uu}'.format(u=data[0], uu=data[1]))

[ see intermediate/functions_derivative.py ]

Let us tabulate the potential on a grid. This time it is ok to get the data back using via a list

def tabulate_harmonic_sphere(r_grid, u_grid, epsilon=1.0, sigma=1.0):
    for r in r_grid:
        u_grid.append(0.5 * epsilon * (1 - (r/sigma))**2)
    return u_grid

# Define grid
N = 5
r_min, r_max = 0.0, 1.0
dr = (r_max - r_min) / (N-1)
r_grid = [r_min + dr * i for i in range(N)]

# Tabulate the potential
u_grid = []
tabulate_harmonic_sphere(r_grid, u_grid)
for r, u in zip(r_grid, u_grid):
    print(r, u)

[ see intermediate/functions_tabulate.py ]

Let us quickly plot the results with gnuplot

function_tabulate.png

Command line interface (CLI)

When a python code is supposed to be executed from the command line, it is good practice to encapsulate an entry point in a main() function.

def tabulate_harmonic_sphere(r_grid, u_grid, epsilon=1.0, sigma=1.0):
    for r in r_grid:
        u_grid.append(0.5 * epsilon * (1 - (r/sigma)**2))
    return u_grid

def main(N=5, r_min=0.9, r_max=1.4):
    # Define grid
    dr = (r_max - r_min) / (N-1)
    r_grid = [r_min + dr * i for i in range(N)]

    # Tabulate the potential
    u_grid = []
    tabulate_harmonic_sphere(r_grid, u_grid)
    for r, u in zip(r_grid, u_grid):
        print(r, u)

The main() function will be called only when the code is executed from the command line

if __name__ == 'main':
    main()

We can use the sys.argv list to parse parameters passed on the command line

if __name__ == '__main__':
    import sys
    print(sys.argv)

However, there is a much better alternative…!

If you want to build a command line interface for your python codes, look no further than argh (https://argh.readthedocs.io/en/latest/tutorial.html) which can be installed from pypi.

pip install --user argh

Then use the dispatch_command() function to create a CLI for your main() function

def main(message, n=1):
    """Print `message` `n` times on the screen"""
    for i in range(n):
        print(message)

if __name__ == '__main__':
    from argh import dispatch_command
    dispatch_command(main)

[ see intermediate/cli_argh.py ]

From the command line, you can now get a nice help message that explians how to pass parameters to your python program

python intermediate/cli_argh.py --help

Mini exercise: add a CLI to cli_harmonic_sphere.py using argh!

Dictionaries

Dictionaries are very useful data structures to collect related variables together in a flexible struct-like data structure

# Create an empty dictionary
particle = {}
# Add entries to the dictionary
particle['position'] = [0.0, 0.0, 0.0]
particle['species'] = 'H'
particle['mass'] = 1.0

An alternative, more compact syntax to create the dictionary

particle = {'position': [0.0, 0.0, 0.0], 'species': 'H', 'mass': 1.0}

[ see intermediate/dictionaries.py ]

To retrieve the entries, we can access them individually

print(particle['position'])
[0.0, 0.0, 0.0]

Or iterate over them one by one

for key in particle:
    print('{} = {}'.format(key, particle[key]))
position = [0.0, 0.0, 0.0]
mass = 1.0
species = H

You may want to try to build a list of dictionaries to store several particles at a time.

Interfacing Python and Fortran

module lib
  implicit none
contains
  subroutine test_dot_product(x, y, f)
      real(8), intent(in) :: x(:), y(:)
      real(8), intent(out) :: f
      f = DOT_PRODUCT(x, y)
  end subroutine test_dot_product

  subroutine test_daxpy(a, x, y, f)
      real(8), intent(in) :: a, x(:), y(:)
      real(8), intent(inout) :: f(:)
      f = a*x + y
  end subroutine test_daxpy

  subroutine test_integer(a, f)
      integer(8), intent(in) :: a(:)
      integer(8), intent(inout) :: f(:)
      f = a * 2
  end subroutine test_integer
end module lib

[ see intermediate/python_and_fortran.f90 ]

We compile and create the python wrapper module using f2py (installed with numpy)

f2py -c -m f90 intermediate/python_and_fortran.f90
/bin/cp intermediate/f90*so ./

We can now call the routine from python

import numpy, f90

ndim = 10
x = numpy.ones(ndim, dtype='float64')
y = numpy.ones(ndim, dtype='float64')
print(f90.lib.test_dot_product(x, y))

a = 2.0
f = numpy.ones(ndim, dtype='float64')
f90.lib.test_daxpy(a, x, y, f)
print(f)

a = numpy.ones(ndim, dtype='int64')
f = numpy.ones(ndim, dtype='int64')
f90.lib.test_integer(a, f)
print(f)
10.0
[3. 3. 3. 3. 3. 3. 3. 3. 3. 3.]
[2 2 2 2 2 2 2 2 2 2]

A few things to notice:

  • Scalar variables with intent(out) are returned as the function result in python.
f = f90.lib.test_dot_product(x, y)
  • If there are output arrays on the Fortran side, we must declare them with the intent(inout) attribute, so that they can be modified in place.
  • There are basic equivalence between numpy and fortran types. Python uses double precision for both float and integer: we must be coherent on the fortran side, thus we use double precision integers.
a = numpy.ones(ndim, dtype='int64')

Alternatively, we could declare the integers as ='int32'= on the Python side

Exercise: energy of a Lennard-Jones cluster

Interface Fortran and Python codes to compute the potential energy of a Lennard-Jones cluster, i.e. isolated particles interacting via the Lennard-Jones potential \(u(r)=4[(\sigma/r)^{12} - (\sigma/r^6)]\)

On the Fortran side, I suggest to read a (ndim,N) position array but you can pass separate arrays for x, y, z coordinates if you prefer

subroutine energy(position,energy)
      real(8), intent(in) :: position(:,:)
      real(8), intent(out) :: energy
      integer :: i, j, ndim, npart
      ndim = size(position,1)
      nnpart = size(position,2)
      do i = 1,npart
        do j = i+1,npart
          ! compute energy
        end do
      end do
end subroutine energy

On the Python side, pay attention to lay down the 2d array with a Fortran layout (use may want to pass order equal to ='F'= when setting up the array)!

Applications

We look now at a few applications:

  • Use the atooms package to quickly deal with particles under periodic boundary conditions
  • Generate random packings of particles
  • Matrix diagonalization with scipy

And two small research projects based on these ideas!

Introduction to atooms

atooms is a Python framework I developed to deal with particle simulations https://gitlab.info-ufr.univ-montp2.fr/atooms

It is well suited to

  • set up and analyze particle-based simulations (ex. classical molecular dynamics, Monte Carlo, lattice models…)
  • implement complex simulation strategies (ex. parallel tempering, transition path sampling) in a high-level expressive language, independent of the actual simulation backend (ex. LAMMPS, RUMD, …)

It can be installed from the official Python package index with pip

pip install --user atooms

or cloning the git repository

git clone https://gitlab.info-ufr.univ-montp2.fr/atooms/atooms.git
cd atooms
make user

The basic idea of atooms is to provide a high-level interface to the main objects of particle simulations. We will start by having a look at the basic objects and how to store them on a file.

Particles' positions are stored as numpy arrays, but we can pass a simple list with x, y, z coordinates when we create them

from atooms.system import Particle
particle = Particle(position=[1.0, 0.0, 0.0])
print(particle.position, type(particle.position))
[1. 0. 0.] <class 'numpy.ndarray'>

By default, they also have a few more properties such as velocity, chemical species, mass and radius. They can all be altered at will and you can add any physical properties to particles at run time: this won't break anything

import numpy
particle = Particle(position=[1.0, 0.0, 0.0], velocity=[1.0, 0.0, 0.0])
particle.species = 'Na'
particle.position += numpy.array([0.0, 1.0, 1.0])
particle.velocity *= 2
particle.radius = None  # point particles have no radius
particle.most_misterious_property = 0.0  # a new particle property
print(particle)
Particle(species=Na, mass=1.0, position=[1. 1. 1.], velocity=[2. 0. 0.], radius=None)

[ see applications/intro_atooms.py ]

To avoid major finite size effects, we enclose particles in a cell with periodic boundary conditions. By convention, the cell origin is in the origin of the reference frame.

from atooms.system import Cell
L = 2.0
cell = Cell(side=[L, L, L])
print('side={} volume={}'.format(cell.side, cell.volume))
side=[2. 2. 2.] volume=8.0

atooms provides means to fold particles back in the "central" simulation cell, i.e. the one centered at the origin at the reference frame. For simplicity, let us work with particles in 2d.

cell = Cell(side=[1.0, 1.0])
particle = Particle(position=[2.0, 0.0])  # particle outside the central cell
particle.fold(cell)

The particle is now folded back at the origin.

Related methods return the nearest periodic image of a given particle with respect to another particle or compute the corresponding distance

particle_1 = Particle(position=[-0.45, 0])
particle_2 = Particle(position=[+0.45, 0])
# Nearest image to particle_1
print(particle_1.nearest_image(particle_2, cell, copy=True))
# Distance with minimum image convention
print(particle_1.distance(particle_2, cell))
Particle(species=A, mass=1.0, position=[0.55 0.  ], velocity=[0. 0. 0.], radius=0.5)
[0.1 0. ]

Random packings with atooms

We will use the atooms package to produce some random packings of disks.

We start by defining the parameters of our code. We avoid particles being to close to each other by setting a lower bound min_distance to their mutual distances.

import random
from atooms.system import Particle, Cell
ndim = 2  # number of spatial dimensions
N = 500  # number of particles
L = 5.0  # cell side
min_distance = 0.1  # minimum distance between particles
max_attempts = 10000  # prevent infinite loop

We try to insert new particles and check the distance with respect to all previously inserted particles

packing = []
cell = Cell([L] * ndim)
for i in range(N):
   for n_attempt in range(max_attempts):
      x = [(random.random() - 0.5) * cell.side[i] for i in range(ndim)]
      new = Particle(position=x)
      insertion = True
      # Test distance wrt to previously inserted particles
      for other in packing:
         distance = sum(other.distance(new, cell)**2)**0.5
         if distance < min_distance:
            # There is an overlap
            insertion = False
            break
      if insertion:
         # Particle can be inserted
         packing.append(new)
         break
print('We inserted {} particles'.format(len(packing)))
We inserted 500 particles

Finally, we write the configuration to disk in xyz format.

with open('applications/random_packing.xyz', 'w') as fh:
    fh.write('{}\n'.format(len(packing)))
    fh.write('cell:{}\n'.format(','.join([str(_) for _ in cell.side])))  # empty comment line
    for particle in packing:
        fh.write('{particle.species} {particle.position[0]} {particle.position[1]}\n'.format(particle=particle))

This can be done in a more general and flexible way via the TrajectoryXYZ class in atooms.

See the full atooms tutorial https://www.coulomb.univ-montp2.fr/perso/daniele.coslovich/atooms/

Let us have a look at the packing with gnuplot

random_packing.png

Matrix diagonalization with scipy

We diagonalize a small symmetric random matrix, using

  • the random module to generate random numbers
  • the eig() function of the scipy package to diagonalize the matrix
import random
import numpy
N = 3
matrix = numpy.zeros(shape=(N, N))
for i in range(matrix.shape[0]):
    for j in range(matrix.shape[1]):
        if j < i:
            continue
        # Random entries between 0 and 1
        matrix[i, j] = random.random()
        matrix[j, i] = matrix[i, j]

[ see applications/matrix_diagonalization.py ]

The eigenvalues will all be real, so we get the real parts

from scipy.linalg import eig
eigenvalues, eigenvectors = eig(matrix)
eigenvalues = numpy.real(eigenvalues)
for i, eigenvector in enumerate(eigenvectors):
    print('eigenvalue={:+.6f} -> eigenvector={}'.format(eigenvalues[i], eigenvector))
eigenvalue=+1.905756 -> eigenvector=[-0.56677074 -0.63101887  0.5297038 ]
eigenvalue=-0.110442 -> eigenvector=[-0.43796409 -0.31381019 -0.84244324]
eigenvalue=+0.056974 -> eigenvector=[-0.69782403  0.70946342  0.09850518]

Let us sort them by eigenvalue. We will use a simple trick: first store the eigenvectors in a dictionary by eigenvalue, then sort the keys of the dictionary

normal_mode = {}
for eigenvalue, eigenvector in zip(eigenvalues, eigenvectors):
    normal_mode[eigenvalue] = eigenvector
for eigenvalue in sorted(eigenvalues):
    print('eigenvalue={:+.6f} -> eigenvector={}'.format(eigenvalue, normal_mode[eigenvalue]))
eigenvalue=-0.110442 -> eigenvector=[-0.43796409 -0.31381019 -0.84244324]
eigenvalue=+0.056974 -> eigenvector=[-0.69782403  0.70946342  0.09850518]
eigenvalue=+1.905756 -> eigenvector=[-0.56677074 -0.63101887  0.5297038 ]

Small project: Wigner semi-circle law

Goal: determine the spectrum of a symmetric random matrix

In the large system limit, the spectrum of symmetric random matrices obeys the Wigner semi-circle law $$ g(\lambda) = \frac{2}{\pi R^2} \sqrt{R^2 - \lambda^2} $$ where radius \(R\) is the radius of the semi-circle.

  • Determine numerically the spectrum of a symmetric random 200x200 matrix using a flat distribution centered around 0
  • What is the role of the mean of the distribution?
  • What is relationship between \(R\) and the variance of the distribution?
  • Study the statistics of level spacings between the eigenvalues, see https://en.wikipedia.org/wiki/Random_matrix

Small project: vibrational density of states

Goal: compute the vibrational density of states of random packings of harmonic spheres.

The proposed strategy is as follows:

  • to initialize the packing, read the xyz file produced by applications/random_packings.py using python
  • compute the Hessian matrix using Fortran for efficiency and return it to the python code as a (3N,3N) array
  • diagonalize the Hessian matrix using scipy
  • compute the probability density function of the eigenvalues

It may be necessary to average over several configurations to obtain a reasonable statistics…

Date: 2019-03-19T13:35+0100

Author: Daniele Coslovich

Org version 7.9.3f with Emacs version 24

Validate XHTML 1.0