consider and other transactions load into one table called the Wrap table, and
invalid records (transaction code missing, null, or spaces) go to an error table.
For each dimension table we create a surrogate key and load the data into the
DWH tables.
SCD2 Mapping:
We implement an SCD2 mapping for the customer dimension or the account
dimension to keep the history of the accounts or customers. We use the SCD2
date method. Before mentioning this, make sure you understand the SCD2 method
clearly, and be careful when explaining it.
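As a rough illustration of the SCD2 date method, here is a minimal Python sketch. It assumes the dimension is held as a list of dictionaries; the column names (customer_sk, customer_id, city, start_date, end_date) and the function apply_scd2 are hypothetical and only stand in for what the Informatica mapping would do. A changed record expires the current version by setting its end date and inserts a new version with a new surrogate key.

```python
from datetime import date

def apply_scd2(dim_rows, incoming, load_date):
    """Apply one incoming source record to a dimension using SCD2 with dates.

    dim_rows: list of dicts with keys customer_sk, customer_id, city,
              start_date, end_date (end_date None marks the current version).
    incoming: dict with keys customer_id, city.
    """
    current = next(
        (r for r in dim_rows
         if r["customer_id"] == incoming["customer_id"] and r["end_date"] is None),
        None,
    )
    if current is not None and current["city"] == incoming["city"]:
        return dim_rows  # nothing changed, keep the current version
    if current is not None:
        current["end_date"] = load_date  # expire the old version
    next_sk = max((r["customer_sk"] for r in dim_rows), default=0) + 1
    dim_rows.append({
        "customer_sk": next_sk,          # new surrogate key for the new version
        "customer_id": incoming["customer_id"],
        "city": incoming["city"],
        "start_date": load_date,
        "end_date": None,
    })
    return dim_rows

dim = []
apply_scd2(dim, {"customer_id": "C1", "city": "Pune"}, date(2020, 1, 1))
apply_scd2(dim, {"customer_id": "C1", "city": "Delhi"}, date(2021, 6, 1))
apply_scd2(dim, {"customer_id": "C1", "city": "Delhi"}, date(2021, 7, 1))
```

After the three calls the dimension holds two versions of customer C1: the expired Pune row and the current Delhi row. The third call changes nothing because the attribute is unchanged.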
Role and Responsibilities
Pick from the Project Architecture post and answer according to your comfort
level. We are responsible only for development and testing. For scheduling we
use third-party tools (Control-M, AutoSys, Job Tracker, Tivoli, etc.). We
simply give the dependencies between the mappings and their run times. Based on
that information, the scheduling tool team schedules the mappings. We do not
schedule in Informatica. That's it.
Please let me know if you need more explanation on any point.
1.Conformed Dimension:
A dimension that is used by more than one fact table is called a conformed dimension.
Ex: Product dimension related to the Order fact and the Sales fact.
2.Junk Dimension:
A "junk" dimension is a collection of random transactional codes, flags and/or text
attributes that are unrelated to any particular dimension.
A good example would be a trade fact in a company that brokers equity trades.
The fact would contain several metrics (principal amount, net amount, price per share,
commission, margin amount, etc.) and would be related to several dimensions such as
account, date, rep, office, exchange, etc.
3.Degenerate Dimension:
In a data warehouse, a degenerate dimension is a dimension which is derived from the
fact table and doesn't have its own dimension table.
Ex: line number in a fact table.
4.Slowly Changing Dimensions:
A Slowly Changing Dimension (SCD) is a dimension that changes over time. It may
change slowly, and it may also change quite rapidly.
Ex: the changes arrive as inserts and updates.
A dimension table typically has two types of columns: primary keys that the fact
tables reference, and textual/descriptive data.
Eg: Time, Customer
Types of Dimensions:
Slowly Changing Dimensions
Rapidly Changing Dimensions
Junk Dimensions
Inferred Dimensions
Conformed Dimensions
Degenerate Dimensions
Role Playing Dimensions
Shrunken Dimensions
Static Dimensions
Slowly Changing Dimensions:
Attributes of a dimension that would undergo changes over time. It depends on the
business requirement whether particular attribute history of changes should be preserved
in the data warehouse. This is called a slowly changing attribute and a dimension
containing such an attribute is called a slowly changing dimension.
Rapidly Changing Dimensions:
A dimension attribute that changes frequently is a rapidly changing attribute. If you don't
need to track the changes, the rapidly changing attribute is no problem, but if you do need
to track the changes, using a standard slowly changing dimension technique can result in
a huge inflation of the size of the dimension. One solution is to move the attribute to its
own dimension, with a separate foreign key in the fact table. This new dimension is
called a rapidly changing dimension.
Junk Dimensions:
A junk dimension is a single table with a combination of different and unrelated attributes
to avoid having a large number of foreign keys in the fact table. Junk dimensions are
often created to manage the foreign keys created by rapidly changing dimensions.
Inferred Dimensions:
While loading fact records, a dimension record may not yet be ready. One solution is to
generate a surrogate key with null for all the other attributes. This should technically be
called an inferred member, but is often called an inferred dimension.
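The inferred-member idea can be sketched as follows. This is a minimal Python illustration, not a definitive implementation: the dimension is modelled as a dictionary keyed by natural key, and the names get_or_infer_key, product_dim and the "inferred" flag column are assumptions added for the example.

```python
def get_or_infer_key(dimension, natural_key):
    """Return the surrogate key for natural_key, creating an inferred member if needed.

    dimension: dict mapping natural key -> row dict.
    """
    row = dimension.get(natural_key)
    if row is None:
        row = {
            "surrogate_key": len(dimension) + 1,
            "name": None,      # unknown until the real dimension record arrives
            "inferred": True,  # hypothetical flag so a later load can fill in attributes
        }
        dimension[natural_key] = row
    return row["surrogate_key"]

product_dim = {"P1": {"surrogate_key": 1, "name": "Widget", "inferred": False}}

# Fact rows arrive for product P9 before its dimension record is available.
fact_rows = [
    {"product_sk": get_or_infer_key(product_dim, code), "amount": amount}
    for code, amount in [("P1", 10.0), ("P9", 25.0), ("P9", 5.0)]
]
```

Both facts for P9 point at the same placeholder row, so nothing needs to be re-keyed when the real product attributes are loaded later.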
Conformed Dimensions:
A dimension that is used in multiple locations is called a conformed dimension. A
conformed dimension may be used with multiple fact tables in a single database, or
across multiple data marts or data warehouses.
Degenerate Dimensions:
A degenerate dimension is when the dimension attribute is stored as part of fact table, and
not in a separate dimension table. These are essentially dimension keys for which there
are no other attributes. In a data warehouse, these are often used as the result of a drill
through query to analyze the source of an aggregated number in a report. You can use
these values to trace back to transactions in the OLTP system.
Role Playing Dimensions:
A role-playing dimension is one where the same dimension key along with its
associated attributes can be joined to more than one foreign key in the fact table. For
example, a fact table may include foreign keys for both ship date and delivery date. But
the same date dimension attributes apply to each foreign key, so you can join the same
dimension table to both foreign keys. Here the date dimension is taking multiple roles to
map ship date as well as delivery date, and hence the name of role playing dimension.
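The ship date and delivery date example can be tried out with a small sketch using Python's built-in sqlite3 module. The table and column names (date_dim, shipment_fact, ship_date_key, delivery_date_key) are hypothetical; the point is only that one physical date dimension is joined twice under different aliases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE date_dim (date_key INTEGER PRIMARY KEY, calendar_date TEXT);
    CREATE TABLE shipment_fact (
        order_id INTEGER,
        ship_date_key INTEGER REFERENCES date_dim(date_key),
        delivery_date_key INTEGER REFERENCES date_dim(date_key)
    );
    INSERT INTO date_dim VALUES (1, '2024-01-05'), (2, '2024-01-09');
    INSERT INTO shipment_fact VALUES (100, 1, 2);
""")

# The same date_dim table is joined twice, once per role, using aliases.
rows = conn.execute("""
    SELECT f.order_id, ship.calendar_date, delivery.calendar_date
    FROM shipment_fact f
    JOIN date_dim ship ON f.ship_date_key = ship.date_key
    JOIN date_dim delivery ON f.delivery_date_key = delivery.date_key
""").fetchall()
```

In practice the two roles are often exposed as views over the single date table, which keeps one copy of the data while giving each role its own name.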
Shrunken Dimensions:
A shrunken dimension is a subset of another dimension. For example, the orders fact
table may include a foreign key for product, but the target fact table may include a
foreign key only for productcategory, which is in the product table, but much less
granular. Creating a smaller dimension table, with productcategory as its primary key, is
one way of dealing with this situation of heterogeneous grain. If the product dimension is
snowflaked, there is probably already a separate table for productcategory, which can
serve as the shrunken dimension.
Static Dimensions:
Static dimensions are not extracted from the original data source, but are created within
the context of the data warehouse. A static dimension can be loaded manually for
example with status codes or it can be generated by a procedure, such as a date or time
dimension.
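A date dimension generated by a procedure might look like the following minimal Python sketch. The function name build_date_dimension and the chosen columns (date_key, calendar_date, year, month, weekday) are assumptions for illustration; a real date dimension usually carries many more attributes.

```python
from datetime import date, timedelta

def build_date_dimension(start, end):
    """Generate one row per calendar day from start to end, inclusive."""
    rows = []
    day = start
    while day <= end:
        rows.append({
            "date_key": int(day.strftime("%Y%m%d")),  # e.g. 20240101
            "calendar_date": day,
            "year": day.year,
            "month": day.month,
            "weekday": day.isoweekday(),              # 1 = Monday ... 7 = Sunday
        })
        day += timedelta(days=1)
    return rows

date_dim = build_date_dimension(date(2024, 1, 1), date(2024, 1, 7))
```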
A complex mapping generally will have the following characteristics:
Difficult requirements
A large number of transformations
Complex business logic
May require a combination of two or more methods
More than 30 unconnected lookups
Star Schema: It has a single fact table connected to numerous dimension tables, so
the layout resembles a star. In a star schema only one join establishes the
relationship between the fact table and any one of the dimension tables.
Decode is much faster than If-Else because Decode already has all the values of
the column that we want to decode built in, whereas in an If-Else statement we need to
explicitly specify
I want to load data into two targets. One is a dimension table and the other is a
fact table. How can I load the data at the same time?
In a data warehouse environment, we generally load data into the dimension table
first and then into the fact table, because the fact table contains the keys of the
dimension tables (as foreign keys) along with the measures.
So first check whether the fact table you are going to load has a foreign key
relationship with the dimension table.
If yes, use a pipeline mapping: load the dimension data in the first pipeline, and in
the second pipeline load the fact table data using a Lookup transformation on the
dimension table that has already been loaded. Return the key value from the Lookup
transformation, then calculate the measures using an Aggregator with "group by" on
the dimension keys, and map the results to the target (fact) ports as required.
Most importantly, specify the "Target Load Plan" with the dimension target first and
the fact table target second.
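The load order described here can be sketched outside Informatica as two passes over the source. This is only an illustrative Python sketch: the names source_rows, customer_dim and fact_rows are hypothetical, a dictionary stands in for the dimension table and its lookup, and a running sum stands in for the Aggregator.

```python
from collections import defaultdict

source_rows = [
    {"customer_id": "C1", "amount": 100.0},
    {"customer_id": "C2", "amount": 40.0},
    {"customer_id": "C1", "amount": 60.0},
]

# Pipeline 1: load the dimension first, assigning surrogate keys.
customer_dim = {}
for row in source_rows:
    if row["customer_id"] not in customer_dim:
        customer_dim[row["customer_id"]] = len(customer_dim) + 1

# Pipeline 2: look up the dimension key, then aggregate measures per key.
totals = defaultdict(float)
for row in source_rows:
    customer_sk = customer_dim[row["customer_id"]]  # lookup on the loaded dimension
    totals[customer_sk] += row["amount"]            # aggregation grouped by the key

fact_rows = [
    {"customer_sk": sk, "total_amount": amount}
    for sk, amount in sorted(totals.items())
]
```

The second pass can only resolve keys because the first pass has finished, which is the same dependency the Target Load Plan enforces.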
Explain different types of modeling
Modeling is the process of converting the requirements of the business
users into technical structures.
1.Conceptual modeling
2.Logical modeling
3.Physical modeling
Example modeling tool: ERwin
If one flat file contains n records and we have to load only records 51 to 100 into
the target, how do we do this with expressions in Informatica?
Use a Sequence Generator to get a row number for each record, then use a Filter with
the condition (row number greater than 50 and less than or equal to 100).
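The same row-number-then-filter idea, for rows 51 to 100 inclusive, can be sketched in Python. The function select_row_range and the sample records are hypothetical; enumerate plays the role of the Sequence Generator and the comparison plays the role of the Filter condition.

```python
def select_row_range(records, first, last):
    """Keep records whose 1-based row number lies between first and last, inclusive."""
    return [
        record
        for row_no, record in enumerate(records, start=1)  # assigns the row number
        if first <= row_no <= last                         # the filter condition
    ]

records = [f"record-{n}" for n in range(1, 201)]
selected = select_row_range(records, 51, 100)
```

Note that both bounds are inclusive, so exactly 50 records pass the filter.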
How will you get the 1st, 3rd and 5th records in a table? What is the query in Oracle?
Select * from
(Select sal,
emp_id,
row_number() over (order by sal) row_num from emp) ref
where row_num in (1, 3, 5)
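One way to check a row_number() query of this kind is to run it against a small in-memory table. The sketch below uses Python's sqlite3 module rather than Oracle, and assumes the bundled SQLite is version 3.25 or later (needed for window functions). The emp table contents are made up for the example, and the filter on row_num is what selects the 1st, 3rd and 5th rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_id INTEGER, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(1, 1000), (2, 2000), (3, 3000), (4, 4000), (5, 5000), (6, 6000)],
)

# Number the rows by salary, then keep only positions 1, 3 and 5.
rows = conn.execute("""
    SELECT emp_id, sal
    FROM (
        SELECT emp_id, sal,
               ROW_NUMBER() OVER (ORDER BY sal) AS row_num
        FROM emp
    ) ref
    WHERE row_num IN (1, 3, 5)
    ORDER BY row_num
""").fetchall()
```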
Performance Improvements:
a) Network Performance
b) Session Performance
c) Database Performance
d) Analyze and if required define the Informatica and DB partitioning
requirements.
Qualitative Testing:
Analyze and validate your transformation business rules. This is mostly
functional testing.
e) You need to review field by field from source to target and ensure that
the required transformation logic is applied.
f) If you are making changes to existing mappings, make use of the data
lineage feature available with Informatica PowerCenter. This will help
you to find the consequences of altering or deleting a port from an existing
mapping.
g) Ensure that appropriate dimension lookups have been used and your
development is in sync with your business requirements.