It collects the information like:
- total row counts of the table,
- how many distinct values are there in the column,
- how many rows per value, is the column indexed,
- if so unique or non unique etc.
What if collect stats is not done on the table?
Teradata uses a cost based optimizer and cost estimates are done based on statistics.
So if you donot have statistics collected then optimizer will use a Dynamic AMP Sampling method to get the stats. If your table is big and data was unevenly distributed then dynamic sampling may not get right information and your performance will suffer.
How can i know the tables for which the collect stats has been done?
Run the Help Stats command on that table.
e.g HELP STATISTICS TABLE_NAME ;
this will give you Date and time when stats were last collected. You will also see stats for the columns ( for which stats were defined) for the table
Whenever collect stats is done on the particular table(say on index/column) where can I find information regarding these entries ?
Collected statistics are stored in DBC.TVFields or DBC.Indexes tables. However, these two tables cannot be queried .
When to collect stats on tables which have stats ?
1. Typical guideline is roughly 10% of the data has changed. (By measuring delta in perm space since last collected.)
2. Recollect based on stats that have aged 60-90 days. (say last time stats collected was 2 months ago) .
Please note :
Collect stats could be pretty resource consuming for large tables. So it is always advisable to schedule the job at off peak period .