Last week someone asked me if I knew anything about “Star Database”. At the time, I did not. However, I did know about the Star schema/structure of that “data warehouse”. These tech jargons keep making their way into the discussions about business intelligence (BI)/reporting /dashboarding and sometimes they can be confusing. Following is an attempt to shed some light on the use of databases without all the jargon for business intelligence.
From a high level perspective we can put the databases in two categories for business intelligence purposes:
1. Data Collection databases
2. Business Intelligence databases
The first category is all about enabling an organization to function and assist in its operations. Good examples of data collecting databases are ERP, HR, Sales, Marketing applications. This data generates from day-to-day activities and is crucial for the organization to function.
Due to various factors this data is
• stored in multiple databases
• structured a certain way (“Normalized”)
• detailed and may not be BI ready
• often times vendor-specific
Read next: The Data Analyst’s Secret Weapon
The second category is all about enabling the decision making of the organization (at all levels, especially at the executive level). Most of the data in these databases is derived from the first category. The term information is used instead in this category because we are trying to make sense of the data and convert it into pieces of information that can be used in decision making. This information empowers the organization to plan, analyze, grow and better itself. Dashboard, Reporting, Business Analytics, Predictive Analytics and Planning applications are good examples for utilizing this type of database.
Data in this category is
• very concise
• well defined for the decision making process
• structured for easy retrieval
• structured in certain way (“de-normalized”, “Star”, “Data mart (DM)”, “Data warehouse (DW)”, “Summary”, “OLAP”, “Cube”, “Enterprise data warehouse (EDW)”)
• generally aggregated/calculated from the first category
• stored into one or only a few databases
• clean with great quality
For business intelligence tools (Dashboard, Reports, Analytics etc.) the second category is the preferred way to go. These tools pull quality and aggregated data (information) to be used for decision making. However, sometimes you may see exceptions where business intelligence tools are getting the data from the first category database.
This division of databases into two categories has traditionally been the case and still prevalent. However, there are several emerging technologies which are following different patterns to cater to the needs of both the categories e.g.:
• database appliance
• Big Data infrastructure
• database as service/database in Cloud
• database web services
• Self data discovery
Still confused about some jargons or similar topics, leave a comment below.
Zahid Ansari – Principal Consultant, iDashboards