Archives of the TeradataForum
Message Posted: Fri, 27 Feb 2004 @ 11:09:08 GMT
I am confused by the Teradata functions of Standard Deviations. I even create sample table and data for verification, and results show that the Teradata version of Standard Deviation does not match up. However, I reserve my opinion on this due to the fact that I may need to convert the values of the data sample to "Float" or some other types. I attach the DDL statements at the end of my question, section 2. There was a similiar post before, but the author used simple data samples and so it looked correct. This time, I try to make it more complex.
There are three pictures in the Attachment area:
SD.jpg explains the generic equation of Standard Deviation, while STDDEV_POP.jpg and STDDEV_SAMP.jpg are the equations Teradata use to compute for standard deviation. My question is, which equation in (STDDEV_POP.jpg, STDDEV_SAMP) is equivalent to the SD.jpg ? They looked different. Is Teradata making any assumptions in deriving its own standard deviation formula ? In SD.jpg, the variable "x" with a bar above is the MEAN of "x".
There are 2 sections below. One is the descriptions of what I learned about SD, and another is the DDL I used to create a sample table with values.
Here is what I learn about Standard Deviation (SD)
Terms to know
X = one value in your set of data
Avg (X) = the mean (average) of all values X in the set of data
N = the number of values X in the set of data
Y = Sum of ( Xi - Avg(X) )**2
SD: For each value X, subtract the overall Avg (X) from X, then multiply that result by itself (or the square of that value). Sum up all those squared values, called this Y. Then divide that result by N, and then Square Root of this result.
SD = Square Root ( Y/N )
Here are 6 data samples: (1, 2, 3, 4, 4.5, 2.8) . My SD formlua gives SQRT( 8.208334 / 6 ) = 1.1696. Both Teradata functions give value of "1". Surprise?
CREATE SET TABLE TONGTEST ,NO FALLBACK , NO BEFORE JOURNAL, NO AFTER JOURNAL ( SAMPLE_DATA DECIMAL(18,2) NOT NULL, STATUS CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC ) PRIMARY INDEX ( STATUS ); insert into TONGTEST VALUE (1, 'Y') insert into TONGTEST VALUE (2, 'Y') insert into TONGTEST VALUE (3, 'Y') insert into TONGTEST VALUE (4, 'Y') insert into TONGTEST VALUE (4.5, 'Y') insert into PCDW_EBI_UAT.TONGTEST VALUES (2.8, 'Y')
SELECT STDDEV_SAMP(cast(SAMPLE_DATA as float)) FROM PCDW_EBI_UAT.TONGTEST
|Copyright 2012 - All Rights Reserved|
|Last Modified: 21 Mar 2013|