Music and Tech

Share this post
The Data Modeling Series pt.5/5
angelddaz.substack.com

The Data Modeling Series pt.5/5

Digestible Data Modeling Blogging

Angel D'az
Oct 21, 2019
Share this post
The Data Modeling Series pt.5/5
angelddaz.substack.com

Data Modeling is an important skill for any data worker to understand. Data Engineers in particular should understand this topic and be able to apply it to their work.

This post is a continuation of a data or dimensional modeling series. This series will have 5 parts. For pt.5 , we will go over the Benefits of dimensional modeling section in Wikipedia’s page on Dimensional Modeling.


Benefits of dimensional modeling

In part four of this series, we covered the tradeoffs between normalization and de-normalization. In this post, we will finish off with some arguments for dimensional/data modeling your data warehouse.

Understandability.0

Understandability. Compared to the normalized model, the dimensional model is easier to understand and more intuitive. In dimensional models, information is grouped into coherent business categories or dimensions, making it easier to read and interpret. Simplicity also allows software to navigate databases efficiently. In normalized models, data is divided into many discrete entities and even a simple business process might result in dozens of tables joined together in a complex way.

Right off the bat, this paragraph draws a distinction between normalized and dimensional model. This means that most databases don’t reach the higher normal forms. Typically above 3NF — 3rd Normal Form is not done. Meaning, we allow some duplicate data, some de-normalization, in exchange for understanding what how the data in our database is organized.

Going back to our first post in Data Modeling series, the business comes before the database. Our database is organized in such a way that it is grouped into business needs. Highly normalized databases are split into so many tables that it is hard to navigate in queries. Even if you do know how to navigate a highly normalized database, query results can often be unintelligible keys, keys, and more keys.

Query Performance.1

Query performance. Dimensional models are more denormalized and optimized for data querying, while normalized models seek to eliminate data redundancies and are optimized for transaction loading and updating. The predictable framework of a dimensional model allows the database to make strong assumptions about the data which may have a positive impact on performance. Each dimension is an equivalent entry point into the fact table, and this symmetrical structure allows effective handling of complex queries. Query optimization for star-joined databases is simple, predictable, and controllable.

If a database is denormalized, we are duplicating data but making it easier for analysts to query what they need from fewer database objects (schemas, tables, and/or columns).

Extensibility.2

Extensibility. Dimensional models are scalable and easily accommodate unexpected new data. Existing tables can be changed in place either by simply adding new data rows into the table or executing SQL alter table commands. No queries or applications that sit on top of the data warehouse need to be reprogrammed to accommodate changes. Old queries and applications continue to run without yielding different results. But in normalized models each modification should be considered carefully, because of the complex dependencies between database tables.

Maintaining a normalized database requires a lot of work because any modification may break the tier of normal forms (NFs). However, a star schema/dimensional model has organization based on business needs, so alterations to the database should not be a huge effort, as long as the business hasn’t drastically changed.

Finale

This last post was definitely a breeze compared to some of the other 5. Major props to you if you have made it this far. Next week, we will continue on our new series on database normalization, which we started in the part four of this Data Modeling series. Thanks for reading!

Share this post
The Data Modeling Series pt.5/5
angelddaz.substack.com
Comments

Create your profile

0 subscriptions will be displayed on your profile (edit)

Skip for now

Only paid subscribers can comment on this post

Already a paid subscriber? Sign in

Check your email

For your security, we need to re-authenticate you.

Click the link we sent to , or click here to sign in.

TopNewCommunity

No posts

Ready for more?

© 2022 Substack Inc
Privacy ∙ Terms ∙ Collection notice
Publish on Substack Get the app
Substack is the home for great writing