Data Literacy

Chapter 0: Introduction

Prof. Dr. Michael Bücker

Table of contents

  • 0.1 Introduction
  • 0.2 Data Literacy
  • 0.3 Structure of this lecture on data literacy
  • 0.4 Tools
  • 0.5 Getting help
  • 0.6 Getting started
  • References

0.1 Introduction

0.1.1 Introduction

Your faculty…


Prof. Dr. Michael Bücker
Professor of Data Science, Mathematics and Business Informatics
Office: Room C 521
Phone: 0251 83-65615
E-Mail: michael.buecker@fh-muenster.de
www.buecker.ms

Since 03/2018 Professor at FH Münster
Since 12/2020 Board of Institut für Prozessmanagement und Digitale Transformation (IPD)
Since 03/2022 Program Manager Master "Digital Business and Innovation Management"
Since 01/2020 Co-Founder TradeLink
Since 01/2019 Co-Founder ScaleWork
05/2011 - 02/2018 Expert and Engagement Manager, McKinsey & Company, Inc.
04/2011 Diploma in Statistics, TU Dortmund
06/2008 PhD in Statistics, TU Dortmund

… and you!

Let’s take a survey to get to know each other better!

https://www.menti.com/alv3mt3nuzkx

or join at menti.com using the code 5614 1886

0.1.2 Results from the survey

0.2 Data Literacy

0.2.1 Which competencies are related to data literacy?

Figure 1: Overlap between Data Literacy and related competencies, Source: dataliteracy.uni-jena.de/en/what-is-data-literacy (based on Schüller, Busch, and Hindinger 2019)

0.2.2 Which competencies are required to become data literate?

Figure 2: Data Literacy competencies during the work with data, Source: dataliteracy.uni-jena.de/en/what-is-data-literacy (based on Schüller, Busch, and Hindinger 2019)

0.2.3 Why is data literacy important for you?

I keep saying the sexy job in the next ten years will be statisticians.

The ability to take data – to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it – that’s going to be a hugely important skill in the next decades.

I think statisticians are part of it, but it’s just a part. […] Managers need to be able to access and understand the data themselves.¹

Hal Varian
Chief Economist at Google

  1. Source: Varian (2008)

0.3 Structure of this lecture on data literacy

0.3.1 Learning goals

  • Understand what data is and how it is stored
  • Have an overview of database technologies and how databases are implemented
  • Apply basic data engineering techniques using Python in Jupyter Notebooks
  • Understand and apply basic data analytics techniques to create business value
  • Understand data fallacies and know how to avoid them
  • Reflect on data ethics in a personal and business context

0.3.2 Agenda

  1. Data and Data Bases
  2. Data Insights Process
  3. Data Engineering
  4. Data Analysis
  5. Data Fallacies
  6. Data Storytelling
  7. Data Ethics

0.3.3 Assignment and examination form

  • The examination for this lecture will be a small team project: answering a couple of questions about a given dataset using the methodology learned over the course of this lecture
  • More details will follow

0.3.4 Lecture notes

  • The lecture script for this lecture has been completely revised.
  • Please feel free to send me suggested corrections at any time.
  • If you want to print the script, press ‘e’ in your browser and then print the script via your browser as a PDF (Instructions).
  • The script contains all the code and examples.

0.3.5 Communication

  • For the collaboration during the lecture and the case study, we will use Microsoft Teams
    • Questions on Teams will be answered by me
    • Please use the public channels to ask questions, so other teams can also benefit from the answers
    • For larger difficulties, personal appointments are also possible
  • In the lecture sessions, we will jointly develop the necessary tools (e.g., fundamentals and methods, implementation in the programming language Python)
  • For each topic block, we will do practical exercises using two example datasets
  • In the lecture sessions, we will also discuss frequently asked questions together

Figure 3: MS Teams link

0.4 Tools

0.4.1 Software

  • In this lecture (as well as in the further lectures in the DigiBIM Master’s program), we use the programming language Python and the development environment JupyterHub
  • A fully set up working environment can be found at: jupyter.fh-muenster.de.
  • It is strongly recommended to use this working environment, but there is also the option to install Python and Jupyter yourself.

Figure 4: The user interface of JupyterHub

0.4.2 Python examples throughout this lecture

  • You will find Python code snippets throughout the lecture notes for this lecture.
  • If the code returns an output (numbers, images, tables, etc.), it will be shown directly below the code.
  • You can easily copy and paste the code from this document into your JupyterHub notebook. It should work right away.

# Define the function
def sum_xy(x, y):
    return x + y

# Use the function
sum_xy(5, 6)
11
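Outside a notebook, for example in a plain Python script, the value of the last expression is not displayed automatically. The following minimal sketch reuses the sum_xy example from above and prints the result explicitly; the variable name result is chosen only for illustration.

```python
# Define the function (same as in the lecture example)
def sum_xy(x, y):
    """Return the sum of x and y."""
    return x + y


# Store the return value, then print it explicitly.
# In a notebook cell, writing sum_xy(5, 6) on the last line would be enough.
result = sum_xy(5, 6)
print(result)  # prints 11
```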

0.4.3 Accessing the FH Münster Jupyter Hub (1/4)

Figure 5: The login screen for the JupyterHub at FH Münster
  • FH Münster hosts its own JupyterHub¹: jupyter.fh-muenster.de
  • You can simply log in with your FH Münster credentials, but you need to be within the FH network – either via eduroam or from home via VPN²
  1. see https://jupyter.org/hub

  2. For details on how to use VPN at FH Münster, see the instructions by the DVZ. You need to set up multi-factor authentication (MFA) first.
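If the login page does not load, it can help to check from your own computer whether the hub is reachable at all. The following is only a rough sketch under assumptions: the helper name is_reachable is hypothetical, and it assumes that the hub does not respond to requests from outside the FH network, which may not hold in every setup.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP(S) request to `url` gets any response at all.

    Hypothetical helper for illustration; not part of the lecture material.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status (e.g. 403 or 404).
        return True
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar problems.
        return False
```

For example, is_reachable("https://jupyter.fh-muenster.de") would be expected to return True when you are connected via eduroam or VPN, provided the assumption above holds.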

0.4.3 Accessing the FH Münster Jupyter Hub (2/4)

Figure 6: Selection of the server for this class
  • After logging in, please select the Notebook “FB09 Bücker”

0.4.3 Accessing the FH Münster Jupyter Hub (3/4)

Figure 7: Spinning up the server
  • It will take a moment to start the service

0.4.3 Accessing the FH Münster Jupyter Hub (4/4)

Figure 8: The user interface after the login
  • You will see the user interface of JupyterHub afterwards

0.5 Getting help

0.5.1 Stack Overflow

  • Stack Overflow is a question-and-answer website for programmers
  • Stack Overflow has over 20 million registered users and has received over 24 million questions and 35 million answers
  • You will almost certainly find an answer to your question there; if not, you can post a question yourself

Figure 9: Questions and answers on Stack Overflow with regards to the Python library pandas

0.5.2 YouTube

  • YouTube is a great resource for methodological and coding instructions
  • We will often assign such videos as homework so that we can build on this knowledge in the following lecture

Figure 10: The coreyms channel on YouTube with many Python coding instruction videos

0.5.3 ChatGPT

  • You can and should also use ChatGPT if you are stuck
  • ChatGPT is very good at writing code, in particular Python code
  • With a paid account and using GPT-4, ChatGPT can even run and evaluate code that it has written itself

Tip

When using ChatGPT, please do not just copy and paste the results: also try to understand the code that has been produced – you can even ask ChatGPT to explain what it has done!

Figure 11: ChatGPT writing and testing Python code

0.5.4 Self-learning


  • You can use the (otherwise mostly paid) offerings of dataquest.io to build further skills.
  • If you are interested, send me a short email.

0.6 Getting started

0.6.1 Jupyter Notebooks

Homework

Please watch the following video (chapters 4-7 only, i.e. “Creating a New Notebook” up to “Magic Commands”):

References

Schüller, Katharina, Paulina Busch, and Carina Hindinger. 2019. “Future Skills: Ein Framework für Data Literacy.” https://hochschulforumdigitalisierung.de/sites/default/files/dateien/HFD_AP_Nr_47_DALI_Kompetenzrahmen_WEB.pdf.
Varian, Hal. 2008. “Hal Varian on How the Web Challenges Managers.” https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/hal-varian-on-how-the-web-challenges-managers.
