This is one of my most favourite functionality in Teradata. Since its introduction in V2R5, it has gained popularity considering the enormous space savings and reduced I/O in huge data warehouses.
Top keep upto its promise, Teradata has gone one step further to include more data types for compression.
Previously multivalued compression was limited to fixed length columns types.
In TD13.10, some new features in MVC include
- Support for varchar and varbyte
- Longer columns support upto 510 characters. Previously 255 in Teradata 12, 13
-- Multi
Value Compression in TD12,13
CREATE SET TABLE EMPLOYEE_TBL
(
EMPID INTEGER
NOT NULL,
EMP_NAME_FIRST VARCHAR(20), -- > compression not possible
EMP_NAME_LAST
VARCHAR(20),
DT_BIRTH
DATE FORMAT 'YYYY-MM-DD' NOT NULL,
DT_JOINING
DATE FORMAT 'YYYY-MM-DD' NOT NULL,
DEPT_ID
INTEGER,
LOCATION CHAR(20) COMPRESS ('ALABAMA','CALIFORNIA','COLORADO','FLORIDA','GEORGIA','ILLINOIS','INDIANA')
--> Location needs to be char to
be compressed
)
UNIQUE PRIMARY INDEX (EMPID);
--
Multi Value Compression in TD13.10, more columns can be compressed.
CREATE SET TABLE EMPLOYEE_TBL
(
EMPID INTEGER
NOT NULL,
EMP_NAME_FIRST VARCHAR(20) COMPRESS ('JAMES','JOHN','ROBERT','MICHAEL','WILLIAM','DAVID','RICHARD','CHARLES','JOSEPH','THOMAS','PAUL','MARK','GEORGE','EDWARD','BRIAN'),
-- > compression applied on VARCHAR column with criteria of common first
names
EMP_NAME_LAST
VARCHAR(20),
DT_BIRTH
DATE FORMAT 'YYYY-MM-DD' NOT NULL,
DT_JOINING
DATE FORMAT 'YYYY-MM-DD' NOT NULL,
DEPT_ID
INTEGER ,
LOCATION VARCHAR(20) COMPRESS ('ALABAMA','CALIFORNIA','COLORADO','FLORIDA','GEORGIA','ILLINOIS','INDIANA')
-- > Datatype changed to VARCHAR and Compression applied which results in
5-15% more space savings than CHAR
)
UNIQUE PRIMARY INDEX (EMPID);
To be continued...
No comments:
Post a Comment