Difference between Truncate and Delete, SQL Introduction, SQL Server 2005

Add Computed Column in Table

Wednesday, October 27, 2010 0 comments

A computed column is computed from an expression that can use other columns in the same table. The expression can be a non computed column name, constant, function, and any combination of these connected by one or more operators. The expression cannot be a sub-query. See the below example for details.

CREATE TABLE Authors
(
AuthorId int IDENTITY(1,1) NOT NULL,
FirstName nvarchar(100),
LastName nvarchar(100),
FullName AS (FirstName + SPACE(1) + LastName) -- computed column
)

The above table sample has the FullName computed column defined as the concatenation of two other column values in the same table.
This is a simple expression sample. We define the computed column by "AS" clause and see that we do not define a column type since the type is defined implicitly by the calculation expression.

INSERT INTO Authors (FirstName, LastName) VALUES (N'Roger', N'Wolter')
INSERT INTO Authors (FirstName, LastName) VALUES (N'Dejan', N'Sarka')
SELECT * FROM Authors

We can add computed column after creating the table. Lets take a example, in the same table, we want to add one more column with CASE expression to define the calculated column value.

ALTER TABLE Authors ADD FullName2 AS (CASE WHEN AuthorId <>

Now insert some data into the sql table, and see the results.

INSERT INTO Authors (FirstName, LastName) VALUES (N'Itzik', N'Ben-Gan')
SELECT * FROM Authors

Note: You can add the same functionality with the temporary table also. i.e. you can add computed column in temporary table at the time of creation (with Create table #tmp (T-SQL command) ) or at the time of alteration/modification.

Merge Statement in SQL Server 2008

Monday, August 23, 2010 0 comments

SQL Server 2008 introduces the MERGE statement which will allow users to perform insert, update and delete operations in a single statement. In the earlier versions of SQL Server to achieve the same functionality the database developer or database administrator needed to write separate statements to perform the insert, update or delete of data in one table based on certain conditions in another table.

Using MERGE statement which Microsoft has introduced with SQL Server 2008 database developers or DBA’s can achieve the same functionality by writing very less TSQL code. The code written using this logic will also have performance issue due to the complexity of joins etc. Even though this feature is introduced very late in SQL Server Product, going forward I am very sure that it will be adopted very quickly in many data warehouse projects.

One of the most important advantage of MERGE statement is all the data is read and processed only once. In previous versions three different statement has to be written to process three different activity (INSERT, UPDATE or DELETE), however using MERGE statement all update activity can be done in one pass of database table. This is quite an improvement in performance of database query.

How does MERGE Statement Internally Works
The MERGE statement internally works as an individual insert, update and delete statement within a single Merge statement. You need to specify the SOURCE and the TARGET table or query which should be joined together. Within the MERGE statement you also need to specify the type of the data modification that needs to be performed when the records between the source and target are matched and what actions needs to be performed when they are not matched. With the introduction of MERGE statement the complex TSQL codes which was used earlier to do checks for the existence or inexistence of data within the data warehouse can be replaced with single Merge statement. The use of Merge statement will also improve the query performance.
Below are the three different matched clauses in MERGE:

WHEN MATCHED THEN

Rows that meet the criteria

WHEN [TARGET] NOT MATCHED THEN

Rows that do not match with another row in the target table

WHEN SOURCE NOT MATCHED THEN

Rows that do not match with another row in the source table

Syntax of MERGE statement is as following:
MERGE
[ TOP ( expression ) [ PERCENT ] ]
[ INTO ] target_table [ WITH ( ) ] [ [ AS ] table_alias]
USING
ON
[ WHEN MATCHED [ AND ]
THEN ]
[ WHEN NOT MATCHED [ BY TARGET ] [ AND ]
THEN ]
[ WHEN NOT MATCHED BY SOURCE [ AND ]
THEN ]
[ ]
[ OPTION ( [ ,...n ] ) ]
;

Example

CREATE TABLE dbo.tbl_Source (id INT, name NVARCHAR(100), qty INT);

CREATE TABLE dbo.tbl_Target (id INT, name NVARCHAR(100), qty INT);

--Synchronize source data with target

MERGE INTO dbo.tbl_Target AS t

USING dbo.tbl_Source AS s

ON t.id = s.id

WHEN MATCHED AND (t.name != s.name OR t.qty!= s.qty) THEN

--Row exists and data is different

UPDATE SET t.name = s.name, t.qty = s.qty

WHEN NOT MATCHED THEN

--Row exists in source but not in target

INSERT INTO (id, name, qty)

VALUES (s.id, s.name, s.qty)

WHEN SOURCE NOT MATCHED THEN

--Row exists in target but not in source, Then Delete from Target

DELETE ;

MERGE statement is very handy improvement for T-SQL developers who have to update database tables with complicated logic. MERGE statement also improves the performance of database as it passes through data only once.

Output and Output Into clause in SQL Server 2005

Saturday, August 15, 2009 0 comments

SQL Server 2005 introduced new OUTPUT and OUTPUT INTO Clause for showing the result after the Insert/Update/Delete statement in single go. Earlier (SQL Server 2000), we are using two T-SQL statement for getting result after Insert/Update/Delete statement.
For example we have a table “tbl_Employee” and it has three columns (ID, Name, Salary).
Create table tbl_Employee
(
EmpID Int,
EmpName varchar(50),
Salary Decimal(10,2)
)
Example
In SQL Server 2000, first we execute Insert/Update/Delete statement, and then we execute the select statement to see the result.
Example with Insert Statement.
Insert Into tbl_Employee (EmpID, EmpName, Salary)
Values (1, ‘AAA’, 5000)
Go
Select * from tbl_Employee
Go
The same thing we do with UPDATE and DELETE statement in SQL Server 2000. First we update or Delete the record and then see the output (result).
But in SQL Server 2005, there is an OUTPUT clause for providing output (result) after performing Insert/Update/Delete action in single step (in single query).
OUTPUT Clause with INSERT Statement
Insert into tbl_Employee (EmpID, EmpName, Salary)
OUTPUT Inserted.*
Values (1, 'AAA', 3000)
Go
OUTPUT INTO Clause with INSERT Statement
If you want to insert the output (result) in different table, use OUTPUT INTO clause with other table name. We have another table named tbl_EmployeeLog with same column name and data type of tbl_Employee table.

Insert into tbl_Employee (EmpID, EmpName, Salary)
OUTPUT Inserted.* INTO tbl_EmployeeLog
Values (2, 'BBB', 3000)
Go

If you see the record of tbl_EmployeeLog table, you will get following record.

But don’t start to think that you would use OUTPUT INTO clause for maintaining log. This is good option for inserting output in another table, but for maintaining the log, TRIGGER is the best option. Always use TRIGGER wherever you need to maintain the log of any (Insert/Update/Delete) action.

Varchar(MAX) Datatype and its limitation

Saturday, July 4, 2009 0 comments

In SQL Server 2000 and SQL Server 7, a row cannot exceed 8000 bytes in size. This means that a VARBINARY column can only store 8000 bytes (assuming it is the only column in a table), a VARCHAR column can store up to 8000 characters and an NVARCHAR column can store up to 4000 characters (2 bytes per unicode character). This limitation stems from the 8 KB internal page size SQL Server uses to save data to disk.

To store more data in a single column, you needed to use the TEXT, NTEXT, or IMAGE data types (BLOBs) which are stored in a collection of 8 KB data pages that are separate from the data pages that store the other data in the same table. These data pages are arranged in a B-tree structure. (Binary Large Objects) BLOBs are hard to work with and manipulate. They cannot be used as variables in a procedure or a function and they cannot be used inside string functions such as REPLACE, CHARINDEX or SUBSTRING. In most cases, you have to use READTEXT, WRITETEXT, and UPDATETEXT commands to manipulate BLOBs.

To solve this problem, Microsoft introduced the VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX) data types in SQL Server 2005. These data types can hold the same amount of data BLOBs can hold (2 GB) and they are stored in the same type of data pages used for other data types. When data in a MAX data type exceeds 8 KB, an over-flow page is used. SQL Server 2005 automatically assigns an over-flow indicator to the page and knows how to manipulate data rows the same way it manipulates other data types. You can declare variables of MAX data types inside a stored procedure or function and even pass them as variables. You can also use them inside string functions.

Microsoft recommends using MAX data types instead of BLOBs in SQL Server 2005.
Varchar(Max) Data type can hold more than 8000 characters. But it doesn’t means, we can specify varchar(8001), varchar(9000) or something like this in a table’s field.
If we try to create table’s field with VARCHAR(9000) it gives following error.
Create table tbl_Test
(
Col varchar(9000)
)
Msg 131, Level 15, State 2, Line 3
The size (9000) given to the column 'Col' exceeds the maximum allowed for any data type (8000).

Limitation of varchar(Max)
----------------------------------------------------------
1) If a variable hold more than 8000 character. But in print option it display only 8000 character. We should use select statement instead of print.
2) If you add a literal value in a variable of varchar(max) datatype without casting. It will only store 8000 character. See the below example.

Declare @tst Varchar(max)
Set @tst = space(7998) + 'test'
Select len(@tst)

Output
---------
8000

But when you add literal value after casting, it can hold more than 8000 characters. See the below example.

Set @tst = space(7998) + CAST('test' as varchar(max))
Select len(@tst)
Output
--------
8002

Truncate Table Statement (Difference between Truncate and Delete)

Thursday, August 28, 2008 2 comments

Truncate table statement can delete all record of table. There is no where clause uses with truncate command. Delete command removes record for the table and put entry in transaction log of every deleted record. Truncate table removes the data by de-allocating the data pages and records only the page de-allocations in the transaction log. Table records are store in data pages.
The actual data in your table is stored in Data Pages. The Page is the smallest unit of data storage in Microsoft SQL Server. A page contains the data in the rows. A row can only reside in one page. Each Page can contain 8KB of information, due to this; the maximum size of a Row is 8KB. A group of 8 adjacent pages is called an extent.
For better explanation of data pages, we take one example. Suppose I have 100 records in Employee table. These 100 records divided into 10 pages. Records (rows of a table) keep in data page. After using truncate command, every pages deallocation entry would be maintain in transaction log. Truncate command can also be roll back. When we roll back the Truncate command, de-allocated pages would be roll back

For Example:- tbl_Employee table has 100 records. After executing the first 2 step, record would be zero in tbl_Employee table.
1) Begin Transaction
2) TRUNCATE TABLE tbl_Employee
Run the following select command for verifying the output.
3) Select * from tbl_Employee
Result would be zero record.
Now rollback the transaction
4) Rollback Transaction
Run the following select command again and find the output of 100 records.
5) Select * from tbl_Employee

Truncate command has following advantage over Delete command

->The DELETE command removes rows one at a time and records an entry in the transaction log for each deleted row. TRUNCATE command removes the data by de-allocating the data pages (used to store the table data) and records only the page de-allocations in the transaction log.

->When the DELETE command is executed using a row lock, each row in the table is locked for deletion. TRUNCATE command always locks the table and page, not each row.

->Truncate command de-allocates the pages that why it is faster than delete command.

->After a DELETE statement is executed, the table can still contain empty pages. For example, empty pages in a heap cannot be de-allocated. For indexes, the delete operation can leave empty pages behind, although these pages will be de-allocated quickly by a background cleanup process.

TRUNCATE TABLE command removes all rows from a table, not the structure (columns, constraints, indexes, etc) of the table. Table structure and its columns, constraints, indexes, etc remain as it is. To remove the table definition, structure and its data in one go, use the DROP TABLE statement.

Truncate table has following limitation

->You can’t use TRUNCATE TABLE statement on a table which is referenced by a FOREIGN KEY constraint. (You can truncate a table that has a foreign key that references itself.)

->You can’t use “where” clause with TRUNCATE TABLE statement.

Truncate command de-allocates the pages which hold multiple records that why we can’t use “where” clause with truncate command.

Delete Duplicate Records from Table in SQL Server

Wednesday, August 13, 2008 0 comments

I have found the best way to delete duplicate records in a table which has IDENTITY Column. For example, we have an employee table which has duplicate data of EmployeeName and Salary field.

TableName : tbl_Employee

Field Name -------------- FieldType
---------------------------------------------------
EmployeeID ------------- int (IDENTITY)
EmployeeName ----------varchar(50)
Salary --------------------int

Table Records

EmployeeID EmployeeName Salary
----------------------------------------------------------------
1 ----------- AAA ----------- 15000
2 ----------- BBB ----------- 10000
3 ----------- CCC ------------20000
4 ----------- BBB ----------- 10000
5 ----------- CCC ----------- 20000
6 ----------- AAA ---------- 15000
7 ----------- BBB ----------- 10000

DELETE
FROM tbl_Employee
WHERE EmployeeID NOT IN
(
SELECT MAX(EmployeeID)
FROM tbl_Employee
GROUP BY EmployeeName, Salary)

Output ( After executing this query)
---------------------------------------------------------------------------------------

EmployeeID EmployeeName Salary
----------------------------------------------------------------
5 ----------- CCC ----------- 20000
6 ----------- AAA ---------- 15000
7 ----------- BBB -----------10000

Another way

We can do it in another way when table has not any Identity field. First we have to insert the unique record by using distinct command.

Select distinct * into tempEmployee from tbl_Employee

Now delete all record from tbl_Employee

Truncate table tbl_Employee

Now insert unique record in tbl_Employee table from tmpEmployee.

Insert into tbl_Employee
Select * from tempEmployee

After that we can drop that temporary table.

Drop table tempEmployee

I know only these two ways if anyone know any other way. Please share his/her knowledge.

Maximum Capacity Specifications for SQL Server 2000 and SQL Server 2005

Wednesday, July 30, 2008 0 comments

The following table specifies the maximum sizes and numbers of various objects defined in SQL Server 2000/2005 databases or referenced in Transact-SQL statements. The table does not include Microsoft SQL Server 2000 Windows CE Edition and Microsoft SQL Server 2005 Windows CE Edition.
1 Network Packet Size is the size of the tabular data stream (TDS) packets used to communicate between applications and the relational Database Engine. The default packet size is 4 kilobytes (KB), and is controlled by the network packet size configuration option.

2 The maximum number of bytes in any index key cannot exceed 900 in SQL Server 2005. You can define a key using variable-length columns whose maximum sizes add up to more than 900, provided no row is ever inserted with more than 900 bytes of data in those columns. In SQL Server 2005, you can include nonkey columns in a nonclustered index to avoid the maximum index key size of 900 bytes.

3 Database objects include objects such as tables, views, stored procedures,user-defined functions, triggers, rules, defaults, and constraints.The sum of the number of all objects in a database cannot exceed 2,147,483,647.

4 Although a table can contain an unlimited number of FOREIGN KEY constraints, the recommended maximum is 253. Depending on the hardware configuration hosting SQL Server, specifying additional foreign key constraints may be expensive for the query optimizer to process.

5 This value is for static lock llocation. Dynamic locks are limited only by memory.

6 If a stored procedure accesses more than 8 databases, or more than 2 databases in interleaving, you will receive an error.

7 If the table contains one or more XML indexes, the clustering key of the user table is limited to 15 columns because the XML column is added to the clustering key of the primary XML index. In SQL Server 2005, you can include nonkey columns in a nonclustered index to avoid the limitation of a maximum of 16 key columns.

8 SQL Server 2005 supports row-overflow storage which enables variable length columns to be pushed off-row. Only a 24-byte root is stored in the main record for variable length columns pushed out of row; because of this, the effective row limit is higher than in previous
releases of SQL Server. SQL Server 2005 Books Online.

References:
SQL Server 2005 Books Online