Archives of the TeradataForum
Message Posted: Wed, 19 Jun 2002 @ 17:28:50 GMT
| Subj: | | Re: Testing Methodology |
| From: | | Jose Lora |
Hi Thomas.
Some of your reasons for moving ETL away from the Teradata system are not really valid. For example:
| | - batch windows get smaller as the value of the warehouse grows | |
It is always possible, and advisable, to design your data warehouse without batch windows as a requirement. In fact, NCR is
moving in that direction with the Active Data Warehouse concept and the improvements in the loading tools. Your design will also be
affected by the users' data availability requirements.
| | - at the same time data volumes and business logic complexity grows | |
This reason affects both ETL strategies, and I would say that an out-of-Teradata strategy is affected more: more
business logic could mean more lookup transformations, and bigger data volumes involve expensive non-parallel I/O to move the data
required for the transformation outside Teradata.
| | - I purchased the database for the user and not for the ETL process | |
This is a good point, and the best solution is to make the users part of the ETL process and help them understand the benefits of
this approach. In my experience, users are very happy to have (temporary) access to the original data in the database to validate
the changing business rules.
| | - a separate ETL server is cheaper to expand than a Teradata system | |
That's true; however, an increase in CPU power for an ETL process running inside Teradata also increases the CPU power available to
the business processes. On the other hand, an ETL server that only works during batch hours is not a very good use of that box and
does not improve user response time.
| | - there are high performance tools that transform extremely fast, moving more data than I have typically seen | |
Sure there are, and after testing some of them I couldn't find a match for the inherent parallelism of Teradata (for large data
volumes).
However, there is always a good reason to do some transformations outside Teradata (number and date validation are not easy in
Teradata). In that case, I would prefer using INMOD routines, Access Modules or tools like Genio (Hummingbird) that can stream
data directly to FastLoad (doing all sorts of simple row-based transformations on the fly), to avoid the expensive non-parallel
I/O.
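To illustrate what I mean by a simple row-based transformation done on the fly, here is a minimal sketch in Python. It is not an INMOD routine or an Access Module (those have their own interfaces) and it is not how Genio works; it only shows the idea of validating numbers and dates one row at a time in a stream, so nothing has to be landed on disk before the load. The pipe delimiter, the three-column layout (id, amount, date), the YYYY-MM-DD date format and the function names are all assumptions made for the example.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

DELIMITER = "|"  # assumed field separator for the example


def clean_row(line):
    """Return a normalized row, or None if the row fails validation.

    Assumed layout: id | amount | date (YYYY-MM-DD).
    """
    fields = line.rstrip("\n").split(DELIMITER)
    if len(fields) != 3:
        return None
    row_id, amount, sale_date = (f.strip() for f in fields)
    try:
        amount_value = Decimal(amount)  # number validation
        date_value = datetime.strptime(sale_date, "%Y-%m-%d").date()  # date validation
    except (InvalidOperation, ValueError):
        return None
    if not amount_value.is_finite():  # reject NaN and Infinity
        return None
    return DELIMITER.join([row_id, f"{amount_value:.2f}", date_value.isoformat()])


def transform(lines, good, bad):
    """Stream rows one at a time: valid rows go to `good`, rejects to `bad`.

    Only the current row is held in memory. Returns (accepted, rejected).
    """
    accepted = rejected = 0
    for line in lines:
        cleaned = clean_row(line)
        if cleaned is None:
            bad.write(line if line.endswith("\n") else line + "\n")
            rejected += 1
        else:
            good.write(cleaned + "\n")
            accepted += 1
    return accepted, rejected
```

Calling transform(sys.stdin, sys.stdout, sys.stderr) would let a filter like this sit in a pipe between the extract and the load utility, with the rejected rows kept apart for review.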
Regards.
Jose Lora
Systems Architect
Meredith Corporation