ITMD526-Week12-Blog
Project Process Steps
The key steps involved in building a data warehouse are:
•Extracting the data from
source systems and placing them in the staging area.
•Transformation of data
according to the requirements.
•Creating a dimensional
database
•Loading the transformed
data into it.
•Creating assumptions and
developing pre-calculated values to increase the process of report generation.
•Developing a front-end tool
for reporting.
Let us now discuss in details each of the above mentioned steps.
Extracting Data:
A major contribution of
developing a data warehouse involves in extracting the data from source
systems and placing them in the staging area. It can be the most complex part
of the entire process depending on the sources available and the knowledge
about the database. To overcome this issue, Microsoft has designed an
extracting tool called the Data Transformation Services (DTS). The DTS is
available as part of MS SQL Server and helps in importing and exporting of data
from databases. This tool can be helpful only for small and medium sized data
warehouses as most large data warehouses does not provide OLE DB/
ODBC-compliant data. Lots of time and effort needs to be put to develop a
custom code that can extract data into the staging area.
Transformation of Data:
Transforming the data according to the requirements is as important
as extracting the data. The main issue that arises is the lack of knowledge
about the source system. The target database system might be completely
different from the source system. Complexity increases in transforming the
source data according to the target system needs. Another issue is the
availability of data across different locations and platforms. Many
organizations contain their data in various platforms like Oracle, MS Access, Sybase,
etc. Building a staging area that can hold records from different platforms is
a challenging task.
Creating a Dimensional
Database Model:
Creating a Dimensional Model for the data warehouse is the third
step in building a data warehouse. Relational model is the most common and
effective method used in most modern plans. The relational database is
exceedingly standardized; when outlining such a framework, you attempt to
dispose of rehashing columns and make all columns reliant on the primary key of
each table. The relational format is not exceptionally proficient with regards
to building reports with synopsis and total qualities. The dimensional model
gives an approach to enhance query execution without influencing data
integrity. The dimensional model comprises of the reality and dimension tables.
The reality tables comprise of foreign keys to each dimension table, and also
measures. Dimensions, then again, are what your business clients expect in the
reports—the insights about the measures.
Loading Data:
Once the dimensional model is created, it is time to now load the
data. Caution needs to be taken while loading the data as complex challenges
like splitting a single column into several ones, combining fields, etc., might
occur in this phase.
Creating assumptions and
developing pre-calculated values:
The next step is creating aggregations or developing per-calculated
values. This progression has been massively improved by SQL Server Analysis
Services. In the wake of populating the dimensional database, SQL Server
Analysis Services does all the generation work. In any case, recollect that
relying upon the quantity of dimensions present in the DW, building
aggregations can take quite a while. As a dependable guideline, the more
dimensions, the additional time it'll take to manufacture aggregations. Be that
as it may, the measure of each dimension likewise assumes a critical part.
Developing Front-end Tool:
The front-end reporting tool can be developed once the model in
developed and data is populated. Microsoft Excel helps greatly in creating a
report using the Pivot Table functionality. Buying a third-party reporting tool
will end up much cheaper than creating one.
Comments
Post a Comment