The data warehouse ETL toolkit : practical techniques for extracting, cleaning, conforming, and delivering data / Ralph Kimball, Joe Caserta

By: Kimball, Ralph [author]
Contributor(s): Caserta, Joe, 1965- [author]
Material type: Text
Publisher: Indianapolis, IN : Wiley, 2004
Description: xxxiv, 491 pages : illustrations
Content type: text
Media type: unmediated
Carrier type: volume
ISBN: 0764567578
Subject(s): DATA STORAGE | DATABASE DESIGN
DDC classification: 658.4038
Holdings
Copy 1: General Book, Biblioteca Central, General Collection, call number 658.4038 K49 2002, status: Available, barcode 3560900132531
Copy 2: General Book, Biblioteca Central, General Collection, call number 658.4038 K49 2002, status: Available, note: Donation from Professor Cecilia Reyes, barcode 3560900277423

Includes index.

Introduction
Overview of the Book: Two Simultaneous Threads
The Planning & Design Thread
The Data Flow Thread
How The Book Is Organized
Who Should Read this Book
Summary
Part I
Requirements, Realities, and Architecture
Chapter 1
Surrounding the Requirements
Requirements
Architecture
The Mission of the Data Warehouse
The Mission of the ETL Team
Chapter 2
ETL Data Structures
To Stage or Not To Stage
Designing the Staging Area
Data Structures in the ETL System
Planning and Design Standards
Summary
Part II
Data Flow
Chapter 3
Extracting
Part 1: The Logical Data Map
Inside the Logical Data Map
Building the Logical Data Map
Integrating Heterogeneous Data Sources
Mainframe Sources
Flat Files
XML Sources
Web Log Sources
ERP System Sources
Part 3: Extracting Changed Data
Summary
Chapter 4
Cleaning and Conforming
Defining Data Quality
Assumptions
Part 1: Design Objectives
Part 2: Cleaning Deliverables
Part 3: Screens and Their Measurements
Part 4: Conforming Deliverables
Summary
Chapter 5
Delivering Dimension Tables
The Basic Structure of a Dimension
The Grain of a Dimension
The Basic Load Plan for a Dimension
Flat Dimensions and Snowflaked Dimensions
Date and Time Dimensions
Big Dimensions
Small Dimensions
One Dimension or Two
Dimensional Roles
Dimensions as Subdimensions of Another Dimension
Degenerate Dimensions
Slowly Changing Dimensions
Type 1 Slowly Changing Dimension (Overwrite)
Type 2 Slowly Changing Dimension (Partitioning History)
Precise Time Stamping of a Type 2 Slowly Changing Dimension
Type 3 Slowly Changing Dimension (Alternate Realities)
Hybrid Slowly Changing Dimensions
Late Arriving Dimension Records and Correcting Bad Data
Multi-Valued Dimensions and Bridge Tables
Ragged Hierarchies and Bridge Tables
Technical Note: Populating Hierarchy Bridge Tables
Using Positional Attributes in a Dimension to Represent Text Facts
Summary
Chapter 6
Delivering Fact Tables
The Basic Structure of a Fact Table
Guaranteeing Referential Integrity
Surrogate Key Pipeline
Fundamental Grains
Preparing for Loading Fact Tables
Factless Fact Tables
Augmenting a Type 1 Fact Table With Type 2 History
Graceful Modifications
Multiple Units of Measure In A Fact Table
Collecting Revenue In Multiple Currencies
Late Arriving Facts
Aggregations
Delivering Dimensional Data to OLAP Cubes
Summary
Part III
Implementation and Operations
Chapter 7
Development
Current Marketplace ETL Tool Suite Offerings
Current Scripting Languages
Time Is of the Essence
Using Database Bulk Loader Utilities to Speed Inserts
Managing Database Features to Improve Performance
Troubleshooting Performance Problems
Increasing ETL Throughput
Summary
Chapter 8
Operations
Scheduling and Support
Migrating to Production
Achieving Optimal ETL Performance
Purging Historic Data
Monitoring the ETL System
Tuning ETL Processes
ETL System Security
Short Term Archiving and Recovery
Long Term Archiving and Recovery
Summary
Chapter 9
Metadata
Defining Metadata
Business Metadata
Technical Metadata
ETL-Generated Metadata
Metadata Standards and Practices
Impact Analysis
Summary
Chapter 10
Responsibilities
Planning and Leadership
Managing the Project
Summary
Part IV
Real Time Streaming ETL Systems
Chapter 11
Real Time ETL Systems
Why Real-Time ETL?
Defining Real-Time ETL
Challenges and Opportunities of Real-Time Data Warehousing
Real-Time Data Warehousing Review
Categorizing the Requirement
Real-Time ETL Approaches
Summary
Chapter 12
Conclusions
Deepening the Definition of ETL
The Future of Data Warehousing and ETL in Particular