SCIENTIFIC PYTHON

Degree course: 
Second cycle degree in PHYSICS
Academic year when starting the degree: 
2024/2025
Year: 
1
Academic year in which the course will be held: 
2024/2025
Course type: 
Compulsory subjects, characteristic of the class
Credits: 
6
Period: 
Second semester
Standard lecture hours: 
66
Breakdown of lecture hours: 
Lessons (66 hours)
Requirements: 

This is an advanced course in data analysis, so previous experience with programming and data analysis is required. In particular, the following prerequisites are assumed:
- Intermediate-level knowledge of, and practice with, a general-purpose programming language
- Ability to perform basic data analysis and visualization in any programming language
- Basic use of a command-line interface
If any of these prerequisites is missing, contact the teacher in advance to discuss a workaround.

Final Examination: 
Oral

The exam mark will be based on the successful completion of homework assignments given during the course, and on a data analysis project presented and discussed during an oral exam. Specifically:
- Short, mandatory homework assignments will be given during the semester. Successful completion of at least 5 of them is required for admission to the exam
- At the end of the course, the students will choose a data analysis project to develop individually or in small groups. They will prepare a short report and a presentation (10 min max per group member, based on slides, a Jupyter notebook or other means), which will be discussed with the teacher

Assessment: 
Final grade

The course aims to provide the students with in-depth skills for the analysis of physical data using the Python programming language.
In the first part of the course, the students will learn to use Python for general programming purposes and to use some of the most widely adopted Python libraries to handle multidimensional numerical data efficiently, perform basic scientific data processing, visualize data and work with tabular data. In the second part of the course, the students will learn how to apply advanced data analysis techniques, both by developing example code in pure Python and by using well-established libraries to run data analysis efficiently.

After the course, the students will be able to:
- Know the basics of the Python programming language and apply it to write scripts for a variety of data analysis tasks
- Install and manage a Python distribution on a computer, including installation of Python libraries
- Perform data manipulation operations, including importing from and exporting to data files, obtaining data from online sources and preprocessing data
- Create graphical visualizations of data
- Use Jupyter notebooks for interactive and descriptive data analysis
- Use Python libraries for efficient and advanced data processing and analysis
- Implement procedures for reproducible analysis, including file versioning, unit testing and the use of Python virtual environments

The advanced topics covered in the second part of the course will be partly selected based on the interests of the students. Depending on which topics are covered, the students will be able to:
- Control instrumentation and automate data acquisition
- Build web-based graphical user interfaces (GUIs)
- Create advanced and interactive data visualizations
- Use machine learning/artificial intelligence for data analysis
- Perform digital signal processing

Throughout the course, the following topics will be covered:
- Introduction to Python: properties of the language, declaring variables, control flow, basic data types (int, float, complex, bool, string) and data type conversions, operators, some built-in functions
- Structured data types (lists, tuples, dictionaries, sets) and their methods, deep copy, defining functions (including lambda), a basic introduction to modules and introspection
- File and directory handling
- The NumPy module: ndarrays, random numbers, input/output
- The matplotlib module for data plotting
- The SciPy module: integration, optimization, fast Fourier transform, linear algebra
- The pandas module: tabular data management
- Use of Jupyter notebooks
- Management of a Python virtual environment for reproducible results

Additional topics could be covered depending on the needs and interests of the students.
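
As an illustration of how the NumPy and SciPy topics listed above fit together, the following is a minimal sketch and not taken from the course material. The model function `decay`, the parameter values and the noise level are all assumptions chosen for the example: synthetic data are generated with NumPy and then fitted with `scipy.optimize.curve_fit`.

```python
import numpy as np
from scipy.optimize import curve_fit


def decay(t, amplitude, rate):
    """Hypothetical model: exponential decay amplitude * exp(-rate * t)."""
    return amplitude * np.exp(-rate * t)


# Synthetic "measurement": a known decay plus Gaussian noise.
# All numerical values here are illustrative assumptions.
rng = np.random.default_rng(seed=0)
t = np.linspace(0.0, 5.0, 50)
true_amplitude, true_rate = 2.0, 1.3
signal = decay(t, true_amplitude, true_rate) + rng.normal(0.0, 0.02, t.size)

# Least-squares fit; p0 is the initial guess for (amplitude, rate).
params, covariance = curve_fit(decay, t, signal, p0=(1.0, 1.0))

# One-sigma parameter uncertainties from the diagonal of the covariance matrix.
uncertainties = np.sqrt(np.diag(covariance))

print(f"amplitude = {params[0]:.3f} +/- {uncertainties[0]:.3f}")
print(f"rate      = {params[1]:.3f} +/- {uncertainties[1]:.3f}")
```

The fitted values should land close to the parameters used to generate the data; the fixed seed only makes the synthetic noise reproducible from run to run.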

The course will favor a hands-on approach, combining several learning techniques:
- Use of Jupyter notebooks, either locally or online, to take notes and test new concepts
- Individual in-classroom exercises followed by group discussion
- Group in-classroom exercises followed by group discussion
- Homework assignments ranging from simple, quick exercises aimed at consolidating the concepts presented during lectures, to small projects aimed at developing programming and data analysis skills

All the material used during the lectures (notebooks, scripts, etc.) will be made available to the students through the e-learning platform.

For questions or comments, or to request an appointment, students can contact the teacher at the following email address: marco.lamperti@uninsubria.it