The Data Warehouse Toolkit 2nd Ed.

The Complete Guide to Dimensional Modeling

Ralph Kimball, Margy Ross

Publisher: Wiley, 2002, 436 pages

ISBN: 0-471-20024-7

Keywords: IT Architecture

Last modified: April 5, 2021, 3:23 p.m.

The latest edition of the single most authoritative guide on dimensional modeling for data warehousing!

Dimensional modeling has become the most widely accepted approach for data warehouse design. Here is a complete library of dimensional modeling techniques - the most comprehensive collection ever written. Greatly expanded to cover both basic and advanced techniques for optimizing data warehouse design, this second edition of Ralph Kimball’s classic guide is more than sixty percent updated.

The authors begin with fundamental design recommendations and progress step by step through increasingly complex scenarios. Clear-cut guidelines for designing dimensional models are illustrated using real-world data warehouse case studies drawn from a variety of business application areas and industries, including:

  • Retail sales and e-commerce
  • Inventory management
  • Procurement
  • Order management
  • Customer relationship management (CRM)
  • Human resources management
  • Accounting
  • Financial services
  • Telecommunications and utilities
  • Education
  • Transportation
  • Health care and insurance

By the end of the book, you will have mastered the full range of powerful techniques for designing dimensional databases that are easy to understand and provide fast query response. You will also learn how to create an architected framework that integrates the distributed data warehouse using standardized dimensions and facts.

  • Chapter 1: Dimensional Modeling Primer
    • Different Information Worlds
    • Goals of a Data Warehouse
      • The Publishing Metaphor
    • Components of a Data Warehouse
      • Operational Source Systems
      • Data Staging Area
      • Data Presentation
      • Data Access Tools
      • Additional Considerations
    • Dimensional Modeling Vocabulary
      • Fact Table
      • Dimension Tables
      • Bringing Together Facts and Dimensions
    • Dimensional Modeling Myths
      • Common Pitfalls to Avoid
    • Summary
  • Chapter 2: Retail Sales
    • Four-Step Dimensional Design Process
    • Retail Case Study
      • Step 1. Select the Business Process
      • Step 2. Declare the Grain
      • Step 3. Choose the Dimensions
      • Step 4. Identify the Facts
    • Dimension Table Attributes
      • Date Dimension
      • Product Dimension
      • Store Dimension
      • Promotion Dimension
      • Degenerate Transaction Number Dimension
    • Retail Schema in Action
    • Retail Schema Extensibility
    • Resisting Comfort Zone Urges
      • Dimension Normalization (Snowflaking)
      • Too Many Dimensions
    • Surrogate Keys
    • Market Basket Analysis
    • Summary
  • Chapter 3: Inventory
    • Introduction to the Value Chain
    • Inventory Models
      • Inventory Periodic Snapshot
      • Inventory Transactions
      • Inventory Accumulating Snapshot
    • Value Chain Integration
    • Data Warehouse Bus Architecture
      • Data Warehouse Bus Matrix
      • Conformed Dimensions
      • Conformed Facts
    • Summary
  • Chapter 4: Procurement
    • Procurement Case Study
    • Procurement Transactions
      • Multiple- versus Single-Transaction Fact Tables
      • Complementary Procurement Snapshot
    • Slowly Changing Dimensions
      • Type 1: Overwrite the Value
      • Type 2: Add a Dimension Row
      • Type 3: Add a Dimension Column
    • Hybrid Slowly Changing Dimension Techniques
      • Predictable Changes with Multiple Version Overlays
      • Unpredictable Changes with Single Version Overlay
    • More Rapidly Changing Dimensions
    • Summary
  • Chapter 5: Order Management
    • Introduction to Order Management
    • Order Transactions
      • Fact Normalization
      • Dimension Role-Playing
      • Product Dimension Revisited
      • Customer Ship-To Dimension
      • Deal Dimension
      • Degenerate Dimension for Order Number
      • Junk Dimensions
      • Multiple Currencies
      • Header and Line Item Facts with Different Granularity
    • Invoice Transactions
      • Profit and Loss Facts
      • Profitability—The Most Powerful Data Mart
      • Profitability Words of Warning
      • Customer Satisfaction Facts
    • Accumulating Snapshot for the Order Fulfillment Pipeline
      • Lag Calculations
      • Multiple Units of Measure
      • Beyond the Rear-View Mirror
    • Fact Table Comparison
      • Transaction Fact Tables
      • Periodic Snapshot Fact Tables
      • Accumulating Snapshot Fact Tables
    • Designing Real-Time Partitions
      • Requirements for the Real-Time Partition
      • Transaction Grain Real-Time Partition
      • Periodic Snapshot Real-Time Partition
      • Accumulating Snapshot Real-Time Partition
    • Summary
  • Chapter 6: Customer Relationship Management
    • CRM Overview
      • Operational and Analytical CRM
      • Packaged CRM
    • Customer Dimension
      • Name and Address Parsing
      • Other Common Customer Attributes
      • Dimension Outriggers for a Low-Cardinality Attribute Set
      • Large Changing Customer Dimensions
      • Implications of Type 2 Customer Dimension Changes
      • Customer Behavior Study Groups
      • Commercial Customer Hierarchies
      • Combining Multiple Sources of Customer Data
    • Analyzing Customer Data from Multiple Business Processes
    • Summary
  • Chapter 7: Accounting
    • Accounting Case Study
    • General Ledger Data
      • General Ledger Periodic Snapshot
      • General Ledger Journal Transactions
      • Financial Statements
    • Budgeting Process
      • Consolidated Fact Tables
    • Role of OLAP and Packaged Analytic Solutions
    • Summary
  • Chapter 8: Human Resources Management
    • Time-Stamped Transaction Tracking in a Dimension
    • Time-Stamped Dimension with Periodic Snapshot Facts
    • Audit Dimension
    • Keyword Outrigger Dimension
      • AND/OR Dilemma
      • Searching for Substrings
    • Survey Questionnaire Data
    • Summary
  • Chapter 9: Financial Services
    • Banking Case Study
    • Dimension Triage
      • Household Dimension
      • Multivalued Dimensions
      • Minidimensions Revisited
    • Arbitrary Value Banding of Facts
    • Point-in-Time Balances
    • Heterogeneous Product Schemas
      • Heterogeneous Products with Transaction Facts
    • Summary
  • Chapter 10: Telecommunications and Utilities
    • Telecommunications Case Study
    • General Design Review Considerations
      • Granularity
      • Date Dimension
      • Degenerate Dimensions
      • Dimension Decodes and Descriptions
      • Surrogate Keys
      • Too Many (or Too Few) Dimensions
    • Draft Design Exercise Discussion
    • Geographic Location Dimension
      • Location Outrigger
      • Leveraging Geographic Information Systems
    • Summary
  • Chapter 11: Transportation
    • Airline Frequent Flyer Case Study
      • Multiple Fact Table Granularities
      • Linking Segments into Trips
    • Extensions to Other Industries
      • Cargo Shipper
      • Travel Services
    • Combining Small Dimensions into a Superdimension
      • Class of Service
      • Origin and Destination
    • More Date and Time Considerations
      • Country-Specific Calendars
      • Time of Day as a Dimension or Fact
      • Date and Time in Multiple Time Zones
    • Summary
  • Chapter 12: Education
    • University Case Study
    • Accumulating Snapshot for Admissions Tracking
    • Factless Fact Tables
      • Student Registration Events
      • Facilities Utilization Coverage
      • Student Attendance Events
    • Other Areas of Analytic Interest
    • Summary
  • Chapter 13: Health Care
    • Health Care Value Circle
    • Health Care Bill
      • Roles Played By the Date Dimension
      • Multivalued Diagnosis Dimension
      • Extending a Billing Fact Table to Show Profitability
      • Dimensions for Billed Hospital Stays
    • Complex Health Care Events
    • Medical Records
      • Fact Dimension for Sparse Facts
    • Going Back in Time
      • Late-Arriving Fact Rows
      • Late-Arriving Dimension Rows
    • Summary
  • Chapter 14: Electronic Commerce
    • Web Client-Server Interactions Tutorial
    • Why the Clickstream Is Not Just Another Data Source
      • Challenges of Tracking with Clickstream Data
      • Specific Dimensions for the Clickstream
    • Clickstream Fact Table for Complete Sessions
    • Clickstream Fact Table for Individual Page Events
    • Aggregate Clickstream Fact Tables
    • Integrating the Clickstream Data Mart into the Enterprise Data Warehouse
    • Electronic Commerce Profitability Data Mart
    • Summary
  • Chapter 15: Insurance
    • Insurance Case Study
      • Insurance Value Chain
      • Draft Insurance Bus Matrix
    • Policy Transactions
      • Dimension Details and Techniques
      • Alternative (or Complementary) Policy Accumulating Snapshot
    • Policy Periodic Snapshot
      • Conformed Dimensions
      • Conformed Facts
      • Heterogeneous Products Again
      • Multivalued Dimensions Again
    • More Insurance Case Study Background
      • Updated Insurance Bus Matrix
    • Claims Transactions
    • Claims Accumulating Snapshot
    • Policy/Claims Consolidated Snapshot
    • Factless Accident Events
    • Common Dimensional Modeling Mistakes to Avoid
    • Summary
  • Chapter 16: Building the Data Warehouse
    • Business Dimensional Lifecycle Road Map
      • Road Map Major Points of Interest
    • Project Planning and Management
      • Assessing Readiness
      • Scoping
      • Justification
      • Staffing
      • Developing and Maintaining the Project Plan
    • Business Requirements Definition
      • Requirements Preplanning
      • Collecting the Business Requirements
      • Postcollection Documentation and Follow-up
    • Lifecycle Technology Track
    • Technical Architecture Design
      • Eight-Step Process for Creating the Technical Architecture
    • Product Selection and Installation
    • Lifecycle Data Track
    • Dimensional Modeling
    • Physical Design
      • Aggregation Strategy
      • Initial Indexing Strategy
    • Data Staging Design and Development
      • Dimension Table Staging
      • Fact Table Staging
    • Lifecycle Analytic Applications Track
      • Analytic Application Specification
      • Analytic Application Development
    • Deployment
    • Maintenance and Growth
    • Common Data Warehousing Mistakes to Avoid
    • Summary
  • Chapter 17: Present Imperatives and Future Outlook
    • Ongoing Technology Advances
    • Political Forces Demanding Security and Affecting Privacy
      • Conflict between Beneficial Uses and Insidious Abuses
      • Who Owns Your Personal Data?
      • What Is Likely to Happen? Watching the Watchers…
      • How Watching the Watchers Affects Data Warehouse Architecture
    • Designing to Avoid Catastrophic Failure
      • Catastrophic Failures
      • Countering Catastrophic Failures
    • Intellectual Property and Fair Use
    • Cultural Trends in Data Warehousing
      • Managing by the Numbers across the Enterprise
      • Increased Reliance on Sophisticated Key Performance Indicators
      • Behavior Is the New Marquee Application
      • Packaged Applications Have Hit Their High Point
      • Application Integration Has to Be Done by Someone
      • Data Warehouse Outsourcing Needs a Sober Risk Assessment
    • In Closing

Reviews

The Data Warehouse Toolkit

Reviewed by Roland Buresund

Very Good ******** (8 out of 10)

Last modified: Oct. 11, 2008, 5:36 p.m.

At last, a book that an old ERP programmer can read and make sense of. A very good and comprehensive book about data warehousing (and also a primer on good design).

Clearly recommended for anyone interested in the finer points of building analytical tools on top of RDBMSs.
