The data warehouse ETL toolkit : practical techniques for extracting, cleaning, conforming, and delivering data / Ralph Kimball, Joe Caserta
Material type: Text
Current library | Collection | Call number | Copy number | Status | Notes | Due date | Barcode |
---|---|---|---|---|---|---|---|
Biblioteca Central | General Collection | 658.4038 K49 2002 | 1 | Available | | | 3560900132531 |
Biblioteca Central | General Collection | 658.4038 K49 2002 | 2 | Available | Donated by Professor Cecilia Reyes. | | 3560900277423 |
Includes index.
Introduction
Overview of the Book: Two Simultaneous Threads
The Planning & Design Thread
The Data Flow Thread
How the Book Is Organized
Who Should Read This Book
Summary
Part I
Requirements, Realities, and Architecture
Chapter 1
Surrounding the Requirements
Requirements
Architecture
The Mission of the Data Warehouse
The Mission of the ETL Team
Chapter 2
ETL Data Structures
To Stage or Not To Stage
Designing the Staging Area
Data Structures in the ETL System
Planning and Design Standards
Summary
Part II
Data Flow
Chapter 3
Extracting
Part 1: The Logical Data Map
Inside the Logical Data Map
Building the Logical Data Map
Integrating Heterogeneous Data Sources
Mainframe Sources
Flat Files
XML Sources
Web Log Sources
ERP System Sources
Part 3: Extracting Changed Data
Summary
Chapter 4
Cleaning and Conforming
Defining Data Quality
Assumptions
Part 1: Design Objectives
Part 2: Cleaning Deliverables
Part 3: Screens and Their Measurements
Part 4: Conforming Deliverables
Summary
Chapter 5
Delivering Dimension Tables
The Basic Structure of a Dimension
The Grain of a Dimension
The Basic Load Plan for a Dimension
Flat Dimensions and Snowflaked Dimensions
Date and Time Dimensions
Big Dimensions
Small Dimensions
One Dimension or Two
Dimensional Roles
Dimensions as Subdimensions of Another Dimension
Degenerate Dimensions
Slowly Changing Dimensions
Type 1 Slowly Changing Dimension (Overwrite)
Type 2 Slowly Changing Dimension (Partitioning History)
Precise Time Stamping of a Type 2 Slowly Changing Dimension
Type 3 Slowly Changing Dimension (Alternate Realities)
Hybrid Slowly Changing Dimensions
Late Arriving Dimension Records and Correcting Bad Data
Multi-Valued Dimensions and Bridge Tables
Ragged Hierarchies and Bridge Tables
Technical Note: Populating Hierarchy Bridge Tables
Using Positional Attributes in a Dimension to Represent Text Facts
Summary
Chapter 6
Delivering Fact Tables
The Basic Structure of a Fact Table
Guaranteeing Referential Integrity
Surrogate Key Pipeline
Fundamental Grains
Preparing for Loading Fact Tables
Factless Fact Tables
Augmenting a Type 1 Fact Table with Type 2 History
Graceful Modifications
Multiple Units of Measure in a Fact Table
Collecting Revenue in Multiple Currencies
Late Arriving Facts
Aggregations
Delivering Dimensional Data to OLAP Cubes
Summary
Part III
Implementation and Operations
Chapter 7
Development
Current Marketplace ETL Tool Suite Offerings
Current Scripting Languages
Time Is of the Essence
Using Database Bulk Loader Utilities to Speed Inserts
Managing Database Features to Improve Performance
Troubleshooting Performance Problems
Increasing ETL Throughput
Summary
Chapter 8
Operations
Scheduling and Support
Migrating to Production
Achieving Optimal ETL Performance
Purging Historic Data
Monitoring the ETL System
Tuning ETL Processes
ETL System Security
Short-Term Archiving and Recovery
Long-Term Archiving and Recovery
Summary
Chapter 9
Metadata
Defining Metadata
Business Metadata
Technical Metadata
ETL-Generated Metadata
Metadata Standards and Practices
Impact Analysis
Summary
Chapter 10
Responsibilities
Planning and Leadership
Managing the Project
Summary
Part IV
Real-Time Streaming ETL Systems
Chapter 11
Real-Time ETL Systems
Why Real-Time ETL?
Defining Real-Time ETL
Challenges and Opportunities of Real-Time Data Warehousing
Real-Time Data Warehousing Review
Categorizing the Requirement
Real-Time ETL Approaches
Summary
Chapter 12
Conclusions
Deepening the Definition of ETL
The Future of Data Warehousing and ETL in Particular