Based on my experience, I have prepared a comprehensive set of test scenarios and test cases to validate the ETL
process. I will keep updating this content. Thanks.
Each test scenario below is listed together with its test cases.
Structure Validation
Constraint Validation
1. The data type and length of a particular attribute may vary across files or
tables even though the semantic definition is the same.
Example: an account number may be defined as NUMBER(9) in one file or
table and VARCHAR2(11) in another table.
2. Misuse of integrity constraints: when referential integrity constraints are
misused, foreign key values may be left dangling or
inadvertently deleted.
Example: an account record is missing but its dependent records are not
deleted. (Both cases can be checked in SQL; see the sketch after this list.)
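As an illustration, here is a minimal SQL sketch of both checks. The table names ACCOUNTS and ACCOUNT_TXNS, the column ACCOUNT_NO, and the use of Oracle's ALL_TAB_COLUMNS dictionary view are assumptions for the example, not part of the scenarios above.

    -- Case 1: columns that share a name but differ in data type or length
    -- between two tables (ACCOUNTS and ACCOUNT_TXNS are hypothetical names).
    SELECT a.column_name,
           a.data_type  AS accounts_type, a.data_length AS accounts_len,
           t.data_type  AS txns_type,     t.data_length AS txns_len
    FROM   all_tab_columns a
    JOIN   all_tab_columns t ON t.column_name = a.column_name
    WHERE  a.table_name = 'ACCOUNTS'
    AND    t.table_name = 'ACCOUNT_TXNS'
    AND   (a.data_type <> t.data_type OR a.data_length <> t.data_length);

    -- Case 2: dangling foreign keys, i.e. dependent rows whose parent
    -- account record is missing. Expected result: no rows.
    SELECT t.*
    FROM   account_txns t
    WHERE  NOT EXISTS (SELECT 1 FROM accounts a
                       WHERE  a.account_no = t.account_no);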
Data Completeness Issues
Data Transformation
1. Create a spreadsheet of input-data scenarios and expected results, and
validate it with the business customer. This is an excellent requirements-elicitation
step during design and can also be used as part of testing.
2. Create test data that covers all scenarios. Have an ETL developer
automate the process of populating data sets from the scenario
spreadsheet, so the data can be regenerated flexibly, because scenarios
are likely to change.
3. Use data profiling results to compare the range and distribution of values in
each field between the source and target data (see the sketch after this list).
4. Validate accurate processing of ETL-generated fields, for example
surrogate keys.
5. Validate that the data types within the warehouse are the same as
specified in the data model or design.
6. Create data scenarios between tables that test referential integrity.
7. Validate parent-to-child relationships in the data. Create data scenarios
that test the handling of orphaned child records.
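A minimal sketch of the profiling comparison in case 3, assuming hypothetical SRC_ACCOUNTS and TGT_ACCOUNTS tables that both carry an AMOUNT field:

    -- Compare the value range and row count of a field between source
    -- and target; the two result rows should agree.
    SELECT 'SOURCE' AS side,
           MIN(amount) AS min_amount,
           MAX(amount) AS max_amount,
           COUNT(*)    AS row_cnt
    FROM   src_accounts
    UNION ALL
    SELECT 'TARGET',
           MIN(amount),
           MAX(amount),
           COUNT(*)
    FROM   tgt_accounts;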
Data Quality
Null Validation
Verify that no NULL values are present in any column for which NOT NULL is specified.
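A minimal sketch of the check, assuming a hypothetical CUSTOMERS target table with a NOT NULL rule on CUSTOMER_NAME:

    -- Count NULLs in a column that the design declares NOT NULL.
    -- Expected result: 0.
    SELECT COUNT(*) AS null_violations
    FROM   customers
    WHERE  customer_name IS NULL;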
Duplicate Check
1. Validate that the unique key, the primary key, and any other columns
that must be unique per the business requirements do not contain any duplicate
rows.
2. Check whether duplicate values exist in any target column that is built by
extracting multiple columns from the source and combining them into one column.
3. Sometimes, per the client requirements, we need to ensure that there are no
duplicates across a combination of multiple columns within the target only (see the sketch below).
Example: one policyholder can take multiple policies and file multiple claims.
In this case we need to verify that the combination of CLAIM_NO, CLAIMANT_NO,
COVEREGE_NAME, EXPOSURE_TYPE, EXPOSURE_OPEN_DATE,
EXPOSURE_CLOSED_DATE, EXPOSURE_STATUS, and PAYMENT_DATE is not duplicated.
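A minimal sketch of case 3, assuming a hypothetical CLAIM_EXPOSURES target table that carries the columns named in the example:

    -- Combinations that occur more than once in the target are duplicates.
    -- Expected result: no rows.
    SELECT   claim_no, claimant_no, coverege_name, exposure_type,
             exposure_open_date, exposure_closed_date,
             exposure_status, payment_date,
             COUNT(*) AS dup_cnt
    FROM     claim_exposures
    GROUP BY claim_no, claimant_no, coverege_name, exposure_type,
             exposure_open_date, exposure_closed_date,
             exposure_status, payment_date
    HAVING   COUNT(*) > 1;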
Data Validation
1. To validate the complete data set between the source and target tables, a MINUS query
is the best solution.
2. We need to run both source MINUS target and target MINUS source.
3. Any rows returned by a MINUS query should be considered
mismatched rows.
4. We also need to find the matching rows between source and target using an
INTERSECT statement.
5. The count returned by INTERSECT should match the individual counts of the
source and target tables.
6. If the MINUS queries return 0 rows and the INTERSECT count is less than the source
count or the target count, then we can conclude that duplicate rows
exist (see the sketch after this list).
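A minimal sketch of these checks, assuming hypothetical SRC_CLAIMS and TGT_CLAIMS tables with identical column lists. MINUS is Oracle syntax; most other databases spell it EXCEPT.

    -- Mismatched rows in either direction; both queries should
    -- return no rows.
    SELECT * FROM src_claims
    MINUS
    SELECT * FROM tgt_claims;

    SELECT * FROM tgt_claims
    MINUS
    SELECT * FROM src_claims;

    -- Matching rows; this count should equal both table counts.
    SELECT COUNT(*) AS match_cnt
    FROM  (SELECT * FROM src_claims
           INTERSECT
           SELECT * FROM tgt_claims);

Because MINUS and INTERSECT return distinct rows, duplicates are invisible to the MINUS check; that is exactly why case 6 compares the INTERSECT count against the raw table counts.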
1. Verify that the extraction process did not extract duplicate data from the
source. (This usually happens in repeatable processes where at point zero we
need to extract all data from the source file, but during the subsequent
intervals we only need to capture the modified and new rows.)
2. The QA team will maintain a set of SQL statements that are run automatically
at this stage to validate that no duplicate data has been extracted from
the source systems (a sketch of one such statement follows).
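A minimal sketch of one such statement, assuming a hypothetical STG_ORDERS staging table into which each extraction interval loads rows keyed by ORDER_ID:

    -- Any business key landed more than once in staging indicates that
    -- the incremental extract re-pulled existing rows.
    -- Expected result: no rows.
    SELECT   order_id, COUNT(*) AS times_extracted
    FROM     stg_orders
    GROUP BY order_id
    HAVING   COUNT(*) > 1;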
Data Cleanness