
Data Organization and Data Mining

  • Code: 5603
  • Semester: 6th
  • Type: Scientific Field Course (SFC)
  • Category: Expertise Course (EC)
  • Character: Compulsory Selective (CS), Specialization Course (SC)
  • Specialization: Software Engineering

Module Description

Module Content
• Differences between a classical DBMS, an OnLine Transaction Processing System, and a Data Warehouse
• All the individual steps of data processing in a Data Warehouse
• Metadata management of a Data Warehouse
• The data mart concept and its application in practice
• Study of Star, Snowflake and Constellation Schemas for the design and construction of a Data Warehouse
• Data organization for analytical processing
• Basic concepts of OnLine Analytical Processing (OLAP)
• Design and implementation of hierarchies and multidimensional data cubes
• Study of the four OLAP operations, i.e. slice and dice, roll-up, drill-down and pivot
• MOLAP (Multidimensional OLAP) and ROLAP (Relational OLAP)
• Practice with Oracle Warehouse Builder
• Practice with Oracle Analytic Workspace Manager
• Data vs. Information
• Data types
• Information/knowledge representation
• Real-life needs for data mining applications
• Technologies pertaining to the modern data mining paradigm
• Data quality: noise, outliers, missing values, inconsistent data
• Data preparation for data mining: cleaning, integration, transformation, dimensionality reduction
• Classification algorithms
• Clustering algorithms
• Association rule mining algorithms
• Recommender systems
• SQL runtime views in practice
• The WEKA data mining platform
• Data mining with the RStudio/R programming environment
• Data mining on linked data
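
As an illustration of the four OLAP operations listed above, here is a minimal sketch in plain Python. The fact table, the dimension names, and the helper functions (`aggregate`, `slice_cube`, `dice_cube`, `pivot`) are all hypothetical and invented for this example; they are not part of the Oracle tools used in the course, which perform these operations on real multidimensional cubes.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, quarter, region, product, sales).
DIMS = ("year", "quarter", "region", "product")
FACTS = [
    (2023, "Q1", "North", "Laptop", 100),
    (2023, "Q1", "South", "Laptop", 80),
    (2023, "Q2", "North", "Phone", 60),
    (2023, "Q2", "South", "Phone", 40),
    (2024, "Q1", "North", "Laptop", 120),
    (2024, "Q1", "South", "Phone", 50),
]

def aggregate(facts, dims):
    """Sum the sales measure, grouped by the given dimensions."""
    idx = [DIMS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)

def slice_cube(facts, dim, value):
    """Slice: fix a single value on one dimension."""
    i = DIMS.index(dim)
    return [row for row in facts if row[i] == value]

def dice_cube(facts, **allowed):
    """Dice: keep rows whose values fall in the allowed set on each named dimension."""
    return [
        row for row in facts
        if all(row[DIMS.index(d)] in values for d, values in allowed.items())
    ]

def pivot(cells):
    """Pivot: swap the two axes of a two-dimensional aggregate."""
    return {(b, a): total for (a, b), total in cells.items()}

# Roll-up: summarize at the coarser "year" level of the time hierarchy.
by_year = aggregate(FACTS, ["year"])
# Drill-down: return to the finer "year, quarter" level.
by_quarter = aggregate(FACTS, ["year", "quarter"])
# Slice on region = North, then summarize by product.
north = aggregate(slice_cube(FACTS, "region", "North"), ["product"])
# Dice on year in {2023} and region in {South}.
south_2023 = aggregate(dice_cube(FACTS, year={2023}, region={"South"}), ["region"])
# Pivot a region-by-product table into product-by-region.
region_by_product = aggregate(FACTS, ["region", "product"])
product_by_region = pivot(region_by_product)
```

Roll-up and drill-down are shown here as the same aggregation applied at different levels of a time hierarchy (year versus year and quarter), which is one common way to think about them.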

Description
The course introduces modern technology trends in the online analytical processing of data in support of the decision-making process. It considers the technologies pertaining to the data mining and knowledge discovery cycle, namely data preparation, integration, homogenization, and representation, which facilitate the discovery of knowledge inherent in large data corpora and databases.
For hands-on practice, the DBTechNet (www.dbtechnet.org) DebianDB virtual machine is freely available for students to download; it comes with pre-installed free versions of several RDBMS products (e.g. IBM DB2 Express-C, Oracle Express, MySQL, and PostgreSQL) as well as the WEKA and RStudio/R software. For practice with OLAP and Data Warehouses, students will also use (a) Oracle Warehouse Builder and (b) Oracle Analytic Workspace Manager.

Structure
The course structure involves two components:
a) Theory, and
b) Hands-on laboratory practice

Evaluation
Student evaluation is based on the final exam grade. Optional class assignments can add a bonus to that grade, up to a maximum of 35%.

Alternative Evaluation Methods

Possible enrollment for Erasmus students: project work with presentation and/or oral examination (in English)

Module Objectives

On completing this course, the student:
• differentiates a classical DBMS, i.e. an OnLine Transaction Processing System, from a Data Warehouse
• describes all the individual steps of data processing in a Data Warehouse
• understands the usefulness of data marts
• creates a Data Warehouse using fundamental techniques such as the Star, Snowflake or Constellation Schema
• understands the basic concepts of OnLine Analytical Processing (OLAP)
• develops multidimensional cubes
• applies the four basic OLAP operations: slice and dice, roll-up, drill-down and pivot
• uses the MOLAP (Multidimensional OLAP) and ROLAP (Relational OLAP) techniques
• differentiates the concept of data from the concept of information
• appreciates the importance and use of data as facts, as opposed to the representation of knowledge produced by the data analysis stage
• recognizes the importance of dimensionality reduction techniques in particular data analysis tasks
• comprehends and applies in practice basic data mining techniques like association rule mining, classification, and clustering
• develops processing flows and programming code in the WEKA and RStudio/R environments
• evaluates the output of the data mining process
• interprets the results produced by the data mining process
• designs and develops a simple recommender system that facilitates decision making in a typical retail business setup
• describes the qualitative difference between data mining in the semantic web and data mining in the conventional DBMS/OLAP/Data Warehouse environment
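
To make the association rule mining and recommender system objectives concrete, here is a minimal sketch in Python. The shopping baskets, thresholds, and function names (`support`, `confidence`, `pair_rules`, `recommend`) are hypothetical and chosen only for illustration; the course itself uses WEKA and RStudio/R for this kind of work. The sketch covers one-item-to-one-item rules only and enumerates all pairs by brute force, without the candidate pruning that Apriori-style algorithms use.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical retail transactions (one set of items per basket).
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for basket in baskets if itemset <= basket)
    return Fraction(hits, len(baskets))

def confidence(antecedent, consequent, baskets):
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(antecedent, baskets)
    if base == 0:
        return Fraction(0)
    return support(set(antecedent) | set(consequent), baskets) / base

def pair_rules(baskets, min_support=Fraction(2, 5), min_confidence=Fraction(7, 10)):
    """Return (antecedent, consequent, support, confidence) for single-item rules."""
    items = sorted(set().union(*baskets))
    rules = []
    for a, c in permutations(items, 2):
        s = support({a, c}, baskets)
        conf = confidence({a}, {c}, baskets)
        if s >= min_support and conf >= min_confidence:
            rules.append((a, c, s, conf))
    return rules

def recommend(basket, rules):
    """Suggest items implied by the basket's contents, strongest rule first."""
    ranked = sorted(rules, key=lambda rule: rule[3], reverse=True)
    suggestions = []
    for a, c, _, _ in ranked:
        if a in basket and c not in basket and c not in suggestions:
            suggestions.append(c)
    return suggestions

RULES = pair_rules(BASKETS)
```

Exact fractions are used instead of floats so that threshold comparisons such as "support at least 2/5" are not affected by rounding. With these baskets, `recommend({"butter"}, RULES)` suggests bread, because every basket containing butter also contains bread.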

Bibliography

1. T. Connolly, C. Begg, Database Systems: A Practical Approach to Design, Implementation, and Management, 5th Edition, Addison-Wesley, 2010
2. R. Elmasri and S.B. Navathe, Fundamentals of Database Systems, 5th Edition, Addison-Wesley, 2006
3. R. Ramakrishnan and J. Gehrke, Database Management Systems, 3rd Edition, McGraw-Hill, 2002
4. J.D. Ullman, J. Widom, A First Course in Database Systems, Prentice-Hall, 2007
5. G. Antoniou and F. van Harmelen, A Semantic Web Primer, 2nd Edition, MIT Press, 2008
6. B. DuCharme, Learning SPARQL: Querying and Updating with SPARQL 1.1, O’Reilly, 2011
