SCRIPTING AND PROGRAMMING LABORATORY FOR DATA ANALYSIS

Degree course: 
Second cycle degree in PHYSICS
Academic year when starting the degree: 
2020/2021
Year: 
1
Academic year in which the course will be held: 
2020/2021
Course type: 
Compulsory subjects, characteristic of the class
Credits: 
6
Period: 
Second semester
Standard lecture hours: 
66
Breakdown of lecture hours: 
Laboratory (66 hours)
Requirements: 

Knowledge of statistics and error theory.

Final Examination: 
Oral

Students will be asked to give an oral presentation of about 30 minutes on one of the projects analyzed during the course or on a personal project of their choice. They must present their analysis approach to the other students, discuss its pros and cons, and answer the teacher's questions about the project and its details.

Assessment: 
Final grade

The goal of the course is to improve students' data analysis skills in a physics context.
After an introduction to the Python programming language and its most widely used scientific libraries (such as NumPy, Matplotlib and Pandas), students will be given real data from physics experiments in different fields.
Students are expected to develop their own analysis strategy, while the teacher's role is limited to providing useful hints and suggestions on how to proceed.
First, they should be able to manipulate raw data and files, as provided by the experiments, to prepare them for the main analysis program that follows.
Some scripting techniques and data handling methods will be shown, but it is up to the students to make appropriate choices with the final result in mind.
Next, they have to use their programming and analysis experience (data analysis in the laboratory courses of previous years) to perform an efficient and flexible analysis, focusing on the results and on how to reach them effectively.
Finally, they are also expected to improve their data visualization skills, choosing the proper tool to draw attention to what is important in the analysis.

# The Python programming language:
- Built-in data types (integers, floating-point numbers, booleans, strings, lists, tuples, sets and dictionaries)
- Control flow tools (if/for/while statements)
- File input/output (reading and writing text files and binary files)
- Showing data with Matplotlib (plotting lists of data, 1D and 2D histograms, scatter plots, animations)
- Object-oriented programming: classes, objects, attributes and methods
- Introduction to the NumPy library
- Introduction to the Pandas library
- Image handling
# Basic scripting (in Python and/or in a Unix environment) and data handling:
- File manipulation
- Automation of routine tasks
- Dealing with large files
# Physics cases [examples]:
- Ghost imaging in optics
- Waveform analysis
- Particle physics test beam analysis
- Time series (with Pandas)
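
As an illustration of the "dealing with large files" topic listed above, here is a minimal sketch, not taken from the course material. It assumes a text file with one numeric value per line and reads it one line at a time, so the whole file never has to fit in memory. The function name `running_mean` and the sample data are hypothetical.

```python
import io


def running_mean(lines):
    """Return the mean of the numeric values, consuming one line at a time."""
    count = 0
    total = 0.0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        total += float(line)
        count += 1
    if count == 0:
        raise ValueError("no data lines found")
    return total / count


# Small in-memory stand-in for a large file (hypothetical data).
# With a real file one would write: with open(path) as f: running_mean(f)
sample = io.StringIO("# amplitude\n1.0\n2.0\n\n3.0\n")
mean = running_mean(sample)
print(mean)
```

Because a file object yields its lines lazily, the same function works unchanged on files far larger than the available memory.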

- Learning Python, Mark Lutz, O’Reilly
- Python for Data Analysis, Wes McKinney, O’Reilly

The slides and notebooks for each lesson will be made available to the students.

The course alternates two types of sessions:
- Lessons covering the general introductory part on programming and scripting techniques for physics data analysis.
- Presentation of an exercise (in the introductory part of the course) or a physics case (in the main body of the course), after which the students have time to write code and to discuss their choices and, eventually, their results with the teacher or with other students.
The lessons are mostly delivered through Jupyter Notebooks, an interactive tool that runs in a browser on all major operating systems. In some cases, slides will be shared.

To arrange a meeting to discuss results and reports, email the teacher:
valerio.mascagna@uninsubria.it