DATA MINING

Degree course: 
Second cycle degree in COMPUTER SCIENCE
Academic year when starting the degree: 
2018/2019
Year: 
1
Academic year in which the course will be held: 
2018/2019
Course type: 
Compulsory subjects, characteristic of the class
Language: 
English
Credits: 
6
Period: 
Second semester
Standard lectures hours: 
48
Breakdown of lecture hours: 
Lectures (48 hours)
Requirements: 

Basic knowledge of the contents of the Intelligent Systems course, taught in the first year of the MSc program.
The course is recommended for students who already know at least one programming or scripting language.
Students are advised to bring a laptop (Windows, Mac or Linux) that can run the Python interpreter.

Final Examination: 
Oral

The exam consists of a project and an oral interview.
The project is proposed by the student, based on their interests.
If the student has no specific proposal, the teacher assigns the project.
In the project, students are typically asked to implement simple methods of experimental investigation on data made available by websites or on benchmark datasets available in online repositories.
These investigations are meant to assess the students' ability to adapt the methods studied to real cases and, where possible, to understand the specific features of those cases.
The project must be accompanied by a short report describing the contents and the results obtained.
The project is graded out of thirty; a score of at least 18/30 is a pass and gives access to the subsequent oral exam.
The oral exam is an interview that always begins with a discussion of the project results.
During the oral examination, the student must show an understanding of the methods covered in class, with their advantages and disadvantages.
The exam as a whole is passed with a final grade of at least 18/30.
The project grade contributes significantly to the final grade.

Assessment: 
Final grade

The term Data Mining refers to a set of techniques and tools used to explore large amounts of data, with the aim of identifying and extracting significant information and knowledge.
This course aims to provide the fundamentals of the discipline, focusing on the most important Data Mining techniques of current applied and industrial interest.
The course combines theoretical knowledge of Data Mining with the use of open source Python software.
Participants will be guided in the search for patterns within large datasets and, using the tools provided by Python, will learn how to preprocess data and perform clustering, classification and forecasting.
In summary, the objectives of teaching and the expected learning outcomes are the following:
- Acquire basic knowledge of Data Mining methods for large datasets and of the related issues.
- Acquire skills in applying this knowledge to real problems, using the Python language and some of its libraries.
- Develop the ability to learn new methodologies and to compare them with methods the student already knows.
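
As a rough illustration of the kind of clustering exercise mentioned above, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not taken from the course material: the function name `kmeans`, the sample points and the starting centroids are all assumptions chosen for illustration, and in practice one of the Python libraries used in the course would likely replace this hand-written loop.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def kmeans(points, centroids, iterations=10):
    """Plain k-means on tuples of floats; `centroids` are the starting guesses."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for c, members in zip(centroids, clusters):
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(c)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return centroids, clusters


# Hypothetical toy data: two visibly separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

On this toy data the two starting centroids each attract one group of three points, and the loop stops as soon as the centroids no longer move.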

- Introduction to Data Mining
- Introduction to the Python language and some of its libraries, as a tool for trying out first-hand what is covered in class.
- Mining of association rules, decision trees, regression, classification. Aggregation methods (bagging and boosting).
- Problems and methods for learning structured information (ranking, collaborative filtering).
- Data mining problems on large datasets that can be solved with deep learning algorithms.
The topics covered will be accompanied by practical examples and the Python code needed to solve them.
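
To give a flavour of such an example, here is a minimal sketch of association rule mining in plain Python, restricted to rules between single items. It enumerates item pairs by brute force rather than using a dedicated algorithm such as Apriori; the market-basket transactions, the function names and the thresholds are assumptions made for illustration and do not come from the course material.

```python
from itertools import combinations

# Hypothetical market-basket data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(set(itemset) <= t for t in transactions)


def support(itemset):
    """Fraction of transactions that contain `itemset`."""
    return count(itemset) / len(transactions)


def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent), computed from transaction counts."""
    return count(set(antecedent) | set(consequent)) / count(antecedent)


def pair_rules(min_support=0.6, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence) for single-item rules lhs -> rhs."""
    items = sorted(set().union(*transactions))
    found = []
    for a, b in combinations(items, 2):
        if support({a, b}) < min_support:
            continue  # the pair is too rare to yield a useful rule
        for lhs, rhs in ((a, b), (b, a)):
            conf = confidence({lhs}, {rhs})
            if conf >= min_confidence:
                found.append((lhs, rhs, support({a, b}), conf))
    return found


rules = pair_rules()
```

With these thresholds the only rule that survives is beer -> diapers: every transaction containing beer also contains diapers, while the reverse rule holds in only three of the four transactions containing diapers.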

To be identified.
Handouts and teaching materials provided by the teacher, available on the e-learning web site.

Conventional

Lectures are held in the classroom and alternate theory with practical exercises.
The analytical software used will be Python, an open source platform that can be downloaded from the web.
The editing tool will be IPython, an open source, browser-based tool that allows students to create and edit documents containing code, visualizations and text.
During the course, additional analytical packages needed for the individual topics will also be downloaded and installed.
The teacher provides continuous assistance in the classroom.
