D atabase-Table is good enough to store business data but it is a lack of meta-information needed to search its data more quickly and efficiently. This behavior of table might not affect Search-Performance and might be ignorable if either the table contains minimal set of records, or you always querying table without
WHERE-filters or both. However, in real-world databases none of the scenario is supportive. There could be thousands of records in a table, and you need to extract the information with complex queries having various WHERE-filters including Sub-queries WHERE-filters too. This will take a long time and causes a
significant performance impact. Therefore, Indexes are come into the picture. You can create about 250 indexes over a single table based on columns which frequently used in WHERE-filters of queries. These columns called
Search-Keys.
A table without any Indexes stores the data into an un-organized memory structure called
Heap.
When you execute a query against heap even using WHERE-filter,
SQL-Server searches entire table from 1st row to last-row sequentially.
This searching process is called
Table-Scan where as the search over against an Indexed Table called
Index Lookups. The queries that use Indexed columns in WHERE-filter are called
Covering Queries.
Indexes in database are much similar to Indexes in the Book. They describe book contents in two ways:
Topic-wise (called Book-Contents put at front of the book) and
Keyword-wise
(called Book-Index put at back of book): For example, if you want to
find out the pages of book where particular topic is discussed, then
refer the Index which contains topic name in alphabetic order along with
page Numbers. A book-page can contain multiple topics or a topic can
spread across multiple book-pages. Similarly a
databases-page (a.k.a.,
bucket) can contain multiple records or a single record can spread across multiple pages if too long, the maximum size of
database-Page: 8kb and maximum-record size is based on data types of column e.g., only a single
xml-type column can contain
2Gb of data.
In databases there are two types of Indexes:
Clustered and Non-Clustered Indexes.
The clustered index is similar to Topic-Wise contents of the book in
which topics are sorted sequentially in order of the page numbers; here
Table-rows are sorted in Clustered Index-Key order, typically Primary
Key order.
A book can organize the pages in one sequential order similarly a table
organizes their records according to a Clustered-Index search-key order.
Hence, a table can have at-most one Clustered Index.
The Database-Index also contains Search-keys along with Pages where records are available.
Database-page is a collection of records access as whole for retrieval and persisting data to and from databases.
Database-Index contains Keys [column(s) of Table or
View] and these Keys stored in a tree-like structure called
Balanced-Tree (B+ -Trees). This B-Tree organization of Keys gives two
advantages:
- Key-by-Key search instead of Record-by-record search.
- Dynamic search instead of linear or sequentially search.
The primary advantage as well as disadvantage of Indexes is
performance.
Because, one-hand indexes speed ups data retrieval from database, but
on other-hand they slow-downs INSERT, DELETE and UPDATE operations since
these operations should be reflect on both table as well as
Index-structures. By default SQL server creates Clustered Index on
primary key, this is the reason why you can’t modify he primary
key-value. The reasoning of additional index is depends on 2-factors:
- Frequency of columns that uses in SELECT-WHERE-clauses:
- Frequency of columns that modified or deleted UPDATE-WHERE and DELETE-WHERE.
Note that frequency of large data type columns such as
Binary-data type, Xml data type columns might be zero. Therefore, it is not
advisable to create index on such types of columns.
The
key-role of DBA is to determine and design proper Index that should
optimize for your database. DBA needs to balance the between frequency
of
Data-Retrieval and Data-Modifications.
Index is a performance optimization technique that speeds up the data retrieval process. It is a persistent data structure (Key-Pointer) that associated with a Table (or View) in order to increases performance during retrieving the data from that Table (or View).
The main advantage as well as dis-advantage of Index mechanism is PERFORMANCE. Because, it speed ups the data retrieval process but slow-downs the data modifications process performance. Updating a table with Indexes takes more time than updating without Index, because SQL server needs an additional updates to INDEX entry also. Therefore, before creating explicit index you have to balance the frequency of Data-Retrieval and Data-Modifications on Search-key.
The main difference between Clustered and Non-Clustered Indexes are based on following:
The main advantage as well as disadvantage of Indexing is Performance. There are two reasons that degrades the performance when Indexes are applied?
Some times indexes creates performance overheads. The following are situations when we don't want indexes no more.