ITMD526-Week12-Blog

Project Process Steps


The key steps involved in building a data warehouse are:
Extracting the data from source systems and placing them in the staging area.
Transformation of data according to the requirements.
Creating a dimensional database
Loading the transformed data into it.
Creating assumptions and developing pre-calculated values to increase the process of report generation.
Developing a front-end tool for reporting.
Let us now discuss in details each of the above mentioned steps.
Extracting Data:

A major contribution of  developing a data warehouse involves in extracting the data from source systems and placing them in the staging area. It can be the most complex part of the entire process depending on the sources available and the knowledge about the database. To overcome this issue, Microsoft has designed an extracting tool called the Data Transformation Services (DTS). The DTS is available as part of MS SQL Server and helps in importing and exporting of data from databases. This tool can be helpful only for small and medium sized data warehouses as most large data warehouses does not provide OLE DB/ ODBC-compliant data. Lots of time and effort needs to be put to develop a custom code that can extract data into the staging area.

Transformation of Data:
Transforming the data according to the requirements is as important as extracting the data. The main issue that arises is the lack of knowledge about the source system. The target database system might be completely different from the source system. Complexity increases in transforming the source data according to the target system needs. Another issue is the availability of data across different locations and platforms. Many organizations contain their data in various platforms like Oracle, MS Access, Sybase, etc. Building a staging area that can hold records from different platforms is a challenging task.
Creating a Dimensional Database Model:
Creating a Dimensional Model for the data warehouse is the third step in building a data warehouse. Relational model is the most common and effective method used in most modern plans. The relational database is exceedingly standardized; when outlining such a framework, you attempt to dispose of rehashing columns and make all columns reliant on the primary key of each table. The relational format is not exceptionally proficient with regards to building reports with synopsis and total qualities. The dimensional model gives an approach to enhance query execution without influencing data integrity. The dimensional model comprises of the reality and dimension tables. The reality tables comprise of foreign keys to each dimension table, and also measures. Dimensions, then again, are what your business clients expect in the reports—the insights about the measures.

Loading Data:
Once the dimensional model is created, it is time to now load the data. Caution needs to be taken while loading the data as complex challenges like splitting a single column into several ones, combining fields, etc., might occur in this phase.
Creating assumptions and developing pre-calculated values:
The next step is creating aggregations or developing per-calculated values. This progression has been massively improved by SQL Server Analysis Services. In the wake of populating the dimensional database, SQL Server Analysis Services does all the generation work. In any case, recollect that relying upon the quantity of dimensions present in the DW, building aggregations can take quite a while. As a dependable guideline, the more dimensions, the additional time it'll take to manufacture aggregations. Be that as it may, the measure of each dimension likewise assumes a critical part.
Developing Front-end Tool:
The front-end reporting tool can be developed once the model in developed and data is populated. Microsoft Excel helps greatly in creating a report using the Pivot Table functionality. Buying a third-party reporting tool will end up much cheaper than creating one. 

Comments

Popular posts from this blog

ITMD 526-Week 6-Blog

ITMD 526-Week 7-Blog

ITMD526-Week1-Blog