Archives of the TeradataForum





Message Posted: Wed, 19 Apr 2006 @ 14:47:56 GMT





     




Subj:   Fastload performance depends on size of
 
From:   Christian Schiefer, makeITdone



Hi,

I am currently facing an interesting situation:

1.) A conversion program prepares the data by converting it into fastload-readable records. It is written in C and takes a few seconds to convert one day of data.

2.) The C program writes each record (about 300 bytes) with an "fprintf".

3.) The output of this program is redirected into fastload via a pipe.

4.) Performance is very poor.
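
For illustration, here is a minimal sketch of how the writing side described in points 2 and 3 could hand larger chunks to the pipe without changing the per-record logic. It assumes standard C stdio; the helper names `use_block_buffer` and `emit_record` and the idea of a 1 MB buffer are hypothetical and not taken from the actual conversion program. On most C libraries stdout is already fully buffered when it is not a terminal, but with a comparatively small default buffer, so this only changes the chunk size. Whether fastload actually benefits is exactly the open question, so treat it as an experiment rather than a fix.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: give a stream a large, fully buffered output buffer.
 * Must be called before any other I/O is done on the stream.
 * The buffer has to stay valid for the lifetime of the stream, so it is
 * deliberately not freed on success. */
int use_block_buffer(FILE *out, size_t size)
{
    assert(out != NULL && size > 0);

    char *buf = malloc(size);
    if (buf == NULL)
        return -1;

    if (setvbuf(out, buf, _IOFBF, size) != 0) {
        free(buf);
        return -1;
    }
    return 0;
}

/* Hypothetical helper: write one already formatted record.
 * The bytes go into the stdio buffer; the C library decides when to
 * pass them on to the pipe (when the buffer fills or on fflush). */
int emit_record(FILE *out, const char *rec, size_t len)
{
    return fwrite(rec, 1, len, out) == len ? 0 : -1;
}
```

In the conversion program this would amount to calling `use_block_buffer(stdout, 1024 * 1024)` once at startup, before the first record is written, and leaving the existing output calls as they are.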


If I gzip the source data first and use an access module in fastload that handles gzip files, the load runs about four times faster.

That sounds strange: I already have my data, but when I gzip it into a Unix pipe, it loads four times faster than without gzip.

I suspect that fastload wants to be fed data in 1 MB parcels in one go, which is what the access module provides, and that is why it performs nicely.

cat SOURCE | convert | fastload

vs.

cat SOURCE | convert | gzip | fastload with gzip-access module

The run times of the two are in a ratio of about 4:1.

What the hell is fastload doing if you deliver only a couple of bytes at a time via a pipe? In my mind, it is wasting a lot of time...

Does the C program doing the conversion have to deliver data in 1 MB parcels instead of one record (300 bytes) after another?
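
To make the "1 MB parcels instead of single records" idea concrete, here is a rough sketch of a writer that collects records in memory and passes them to the pipe in large `write()` calls. It assumes a POSIX system; the type `block_writer` and the `bw_*` functions are hypothetical names made up for this example, and the block size is whatever the caller chooses (1 MB being the guess from above, not a documented fastload requirement).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical block writer: records are appended to an in-memory block
 * and only written to the file descriptor when the block is full. */
typedef struct {
    int fd;                 /* destination, e.g. STDOUT_FILENO */
    char *data;             /* block buffer */
    size_t cap;             /* block size in bytes */
    size_t used;            /* bytes currently held in the block */
    unsigned long flushes;  /* number of blocks written so far */
} block_writer;

/* write() may be interrupted or write less than requested, so loop. */
static int write_all(int fd, const char *p, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, p, n);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += w;
        n -= (size_t)w;
    }
    return 0;
}

int bw_init(block_writer *bw, int fd, size_t cap)
{
    assert(bw != NULL && cap > 0);
    bw->data = malloc(cap);
    if (bw->data == NULL)
        return -1;
    bw->fd = fd;
    bw->cap = cap;
    bw->used = 0;
    bw->flushes = 0;
    return 0;
}

/* Write out whatever is in the block; call once more at end of input. */
int bw_flush(block_writer *bw)
{
    if (bw->used == 0)
        return 0;
    if (write_all(bw->fd, bw->data, bw->used) != 0)
        return -1;
    bw->used = 0;
    bw->flushes++;
    return 0;
}

/* Append one record; flush first if it would not fit into the block. */
int bw_append(block_writer *bw, const void *rec, size_t len)
{
    if (len > bw->cap)
        return -1;  /* records larger than a block are not handled here */
    if (bw->used + len > bw->cap && bw_flush(bw) != 0)
        return -1;
    memcpy(bw->data + bw->used, rec, len);
    bw->used += len;
    return 0;
}

void bw_free(block_writer *bw)
{
    free(bw->data);
    bw->data = NULL;
}
```

Typical use would be `bw_init(&bw, STDOUT_FILENO, 1024 * 1024)`, one `bw_append` per converted record, and a final `bw_flush` before exit. Keep in mind that the kernel's pipe buffer is usually much smaller than 1 MB (64 KB by default on Linux), so the reading side may still receive the data in smaller pieces than the writer sends.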

Any ideas?


Feedback welcome

Christian

--------------------------------
makeITdone IT Services
D.I. Christian Schiefer


P.S.: I know how to fix this, but I would really like to know how fastload treats its input data.





     
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky 
Last Modified: 30 Jun 2008