Data Warehousing Fundamentals
Data mining - showing unknown trends
If insights are not correlated they are useless
BI - reporting, analytics, management information, based on facts
BI - gain insights and make trusted decisions
Advanced analytics - extracts facts
Operational data store / data lake - a staging area
Data warehouse purpose - enhance rate of knowledge acquisition and answering management questions
Can’t do it on an operational system - must be done on a data warehouse.
Don’t want to disturb operational data
Can’t do both. Operations must be independent.
Quick wins with data science scripts, but after a while they will use more resources.
- Components of a data warehouse?
- Managing the relationship and syncing of the operational database to the data warehouse - data duplication.
Single source of truth
KPIs: Cost of Capital, Headcount, Earnings per Share, Cost per Unit
Users in BI:
- Strategic - Executives, make more decisions.
- Tactical - Data scientists, make some decisions.
- Operational - Little to no decision-making.
Data warehouse Enterprise Information Management Framework - Data Sources, Integration, Derived Information, Semantics (Presentation Logic), Delivery / Manipulation / Consumption
A cube is a set of precalculated (pre-aggregated) values stored in memory
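
A minimal sketch of the cube idea, assuming a tiny made-up set of sales rows: every combination of dimension values is aggregated once up front and kept in a dict, so a later question is a lookup rather than a scan. The rows, the dimension order (year, product, branch) and the `build_cube` name are illustrative assumptions, not from any particular product.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (year, product, branch, quantity)
rows = [
    (2023, "widget", "north", 10),
    (2023, "widget", "south", 5),
    (2023, "gadget", "north", 7),
    (2024, "widget", "north", 3),
]

def build_cube(rows):
    """Precompute total quantity for every combination of dimension values."""
    cube = defaultdict(int)
    for *dims, quantity in rows:
        positions = range(len(dims))
        # For each subset of dimensions, keep those values and roll up the rest as "*".
        for size in range(len(dims) + 1):
            for kept in combinations(positions, size):
                key = tuple(dims[i] if i in kept else "*" for i in positions)
                cube[key] += quantity
    return dict(cube)

cube = build_cube(rows)

# Answering a question is now a dictionary lookup.
widgets_sold_north = cube[("*", "widget", "north")]
```

The trade-off is memory: the number of stored cells grows with every dimension added, which is why real cubes are usually limited to the dimensions people actually query.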
Star schema: a central fact table with dimension tables around it
Fact: Quantity, Cost, Total
Dimensions: Time, Customer, Product, Branch
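
A small sketch of that star schema using Python's built-in `sqlite3` module with an in-memory database. The table and column names (`fact_sales`, `dim_time`, and so on) and the sample rows are assumptions made up for illustration. The fact table holds the measures plus one key per dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per member.
CREATE TABLE dim_time     (time_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_product  (product_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_branch   (branch_key INTEGER PRIMARY KEY, city TEXT);

-- Fact table: measures (quantity, cost, total) plus a key into each dimension.
CREATE TABLE fact_sales (
    time_key     INTEGER REFERENCES dim_time(time_key),
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    branch_key   INTEGER REFERENCES dim_branch(branch_key),
    quantity     INTEGER,
    cost         REAL,
    total        REAL
);

INSERT INTO dim_time     VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO dim_customer VALUES (1, 'Acme');
INSERT INTO dim_product  VALUES (1, 'widget'), (2, 'gadget');
INSERT INTO dim_branch   VALUES (1, 'Leeds');
INSERT INTO fact_sales VALUES
    (1, 1, 1, 1, 10, 2.0, 20.0),
    (1, 1, 2, 1,  1, 5.0,  5.0),
    (2, 1, 1, 1,  4, 2.0,  8.0);
""")

# A typical BI question: quantity and total per product, joining fact to one dimension.
totals_by_product = conn.execute("""
    SELECT p.name, SUM(f.quantity), SUM(f.total)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
```

Swapping `dim_product` for `dim_time` or `dim_branch` in the join gives the same measures sliced by a different dimension, which is the point of the layout.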
Delta processing
A surrogate key is an extra key, generated by the warehouse, used for delta processing and for seeing changed records
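
One possible reading of this, sketched under assumptions: each load is compared with the current dimension rows by the source's own key (here a hypothetical `customer_id`), unchanged rows are skipped, and a changed or new record gets a fresh surrogate key while the old version is kept and flagged as no longer current. This is the history-keeping approach often called a type 2 slowly changing dimension. The notes do not say which approach was meant, and all names below are illustrative.

```python
from itertools import count

def apply_delta(dimension, incoming, next_key):
    """Apply one load of source rows to a dimension, keeping history.

    dimension: list of dicts with surrogate_key, customer_id, city, is_current
    incoming:  list of dicts with customer_id and city from the source system
    next_key:  iterator that yields the next unused surrogate key
    Returns the surrogate keys created by this load (the delta).
    """
    current = {row["customer_id"]: row for row in dimension if row["is_current"]}
    created = []
    for src in incoming:
        existing = current.get(src["customer_id"])
        if existing is not None and existing["city"] == src["city"]:
            continue  # unchanged record: not part of the delta
        if existing is not None:
            existing["is_current"] = False  # keep the old version, mark it superseded
        new_row = {
            "surrogate_key": next(next_key),
            "customer_id": src["customer_id"],
            "city": src["city"],
            "is_current": True,
        }
        dimension.append(new_row)
        current[src["customer_id"]] = new_row
        created.append(new_row["surrogate_key"])
    return created

keys = count(1)
dim_customer = []

# First load: both customers are new.
first = apply_delta(
    dim_customer,
    [{"customer_id": "C1", "city": "Leeds"}, {"customer_id": "C2", "city": "York"}],
    keys,
)
# Second load: only C1 changed, so only C1 gets a new surrogate key.
second = apply_delta(
    dim_customer,
    [{"customer_id": "C1", "city": "Hull"}, {"customer_id": "C2", "city": "York"}],
    keys,
)
```

Because the source key `C1` now maps to two surrogate keys, facts loaded before the change can keep pointing at the old version while new facts point at the new one.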