Importance of Statistics and How It Works in SQL Server, Part 1
DatabaseJournal.com

Introduction

Statistics are statistical information about the distribution of values in one or more columns of a table or an index. The SQL Server Query Optimizer uses this information to estimate the cardinality, or number of rows, in the query result, which enables it to create a high-quality query execution plan. For example, based on this information the Query Optimizer might decide whether to use an index seek operator or a more resource-intensive index scan operator in order to provide optimal query performance. In this article series, I am going to talk about statistics in detail.

Basics of Statistics

The SQL Server Query Optimizer uses statistics to estimate the distribution of values in one or more columns of a table or indexed view, and the number of rows (called cardinality), to create a high-quality query execution plan. Statistics are often created on a single column, but it is not uncommon to create them on multiple columns. Each statistics object contains a histogram displaying the distribution of values of the column, or of the first column in the case of multi-column statistics. Multi-column statistics also contain information about the correlation of values among the columns, called densities, which are derived from the number of distinct rows of column values.

There are different ways to view the details of a statistics object. For example, you can use the DBCC SHOW_STATISTICS command, which shows the header, histogram, and density vector based on the data stored in the statistics object:

    -- Shows the header, histogram, and density vector based on data stored in the statistics object
    DBCC SHOW_STATISTICS ('SalesOrderDetail', NCI_SalesOrderDetail_ProductID)
    GO
    -- Shows only the histogram based on data stored in the statistics object
    DBCC SHOW_STATISTICS ('SalesOrderDetail', NCI_SalesOrderDetail_ProductID) WITH HISTOGRAM

You can also view the statistical information on the properties page of the statistics object in SQL Server Management Studio (Statistics Properties, General and Details pages).

The value of all density (1 / number of distinct values for a column) ranges from just above 0 to 1. This helps the Query Optimizer decide whether to use an Index Seek or an Index Scan.

The histogram captures the frequency of occurrence for each distinct value in the first key column of the statistics object. The Query Optimizer creates the histogram by sorting the column values, computing the number of values that match each distinct column value, and then aggregating the column values into a maximum of 200 contiguous histogram steps. Each histogram step includes a range of column values followed by an upper bound column value; the range includes all possible column values between the boundary values, excluding the boundary values themselves.
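As a rough illustration of the two ideas above (and not SQL Server's actual implementation), the following Python sketch computes a column's density as 1 / number of distinct values and produces the sorted per-value row counts from which a histogram is built. The sample ProductID values and the function names are made up for the example.

```python
from collections import Counter


def density(values):
    """Density as described in the text: 1 / number of distinct values."""
    distinct = set(values)
    if not distinct:
        raise ValueError("density is undefined for an empty column")
    return 1.0 / len(distinct)


def value_frequencies(values):
    """Count rows per distinct value and return the counts in sorted order.

    The result is a list of (value, row_count) pairs in ascending value
    order, i.e. the raw material that would later be aggregated into
    histogram steps.
    """
    return sorted(Counter(values).items())


# Hypothetical ProductID column containing duplicates.
product_ids = [707, 707, 707, 708, 711, 711, 712, 712, 712, 712]

print(density(product_ids))
print(value_frequencies(product_ids))
```

A column with many duplicates has few distinct values and therefore a high density; a unique column has the lowest possible density for its row count.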
The lowest of the sorted column values is the upper boundary value for the first histogram step. Each step is described by the following columns:

    RANGE_HI_KEY: Also called a key value; the upper bound column value for a histogram step.
    RANGE_ROWS: The estimated number of rows whose column value falls within a histogram step, excluding the upper bound.
    DISTINCT_RANGE_ROWS: The estimated number of rows with a distinct column value within a histogram step, excluding the upper bound.
    EQ_ROWS: The estimated number of rows whose column value equals the upper bound of the histogram step.
    AVG_RANGE_ROWS: RANGE_ROWS / DISTINCT_RANGE_ROWS for DISTINCT_RANGE_ROWS > 0. The average number of rows with duplicate column values within a histogram step, excluding the upper bound.

When to Create or Update Statistics

When to Create Statistics

Columns used in JOIN, WHERE, ORDER BY, or GROUP BY clauses are often good candidates for up-to-date statistics. The Query Optimizer creates single-column statistics when the AUTO_CREATE_STATISTICS database property is set to ON, and statistics are also created on the key columns of an index when you create indexes on tables or views. Even so, there might be times when you need to create additional statistics with the CREATE STATISTICS command to capture cardinality and statistical correlations, so that the Query Optimizer can create improved query plans.

When you find a query predicate containing multiple columns with cross-column relationships and dependencies, you should create multi-column statistics. These contain cross-column correlation statistics, often referred to as densities, which improve the cardinality estimates when query results depend on data relationships among multiple columns.

When creating multi-column statistics, be sure to put the columns in the right order, as this affects how useful the densities are for cardinality estimates. For example, consider a statistics object created on the columns Name, Age, and Salary, in that order. The statistics object will have densities for the following column prefixes: (Name), (Name, Age), and (Name, Age, Salary). If your query uses Name and Salary without using Age, no density is available for cardinality estimates.

When to Update Statistics

Substantial data change operations such as insert, update, delete, or merge change the data distribution in the table or indexed view and make the statistics stale, or out of date, because they might no longer reflect the correct data distribution in a given column or index. The Query Optimizer identifies stale statistics before compiling a query and before executing a cached query plan. It does this by counting the number of data modifications since the last statistics update and comparing that number to a threshold, as follows:

    A table with no rows gets a row.
    A table had fewer than 500 rows when its statistics were last updated and has since had 500 or more modifications.
    A table had more than 500 rows when its statistics were last updated and has since had more than 500 modifications plus 20 percent of the number of rows it had at that time.

You can find when each statistics object of a table was last updated using the query below:

    SELECT name AS StatisticsName,
           STATS_DATE(object_id, stats_id) AS StatisticsUpdatedDate
    FROM sys.stats
    WHERE OBJECT_NAME(object_id) = 'SalesOrderHeader'
    ORDER BY name;
    GO

You can also use the query below, which uses the dynamic management function sys.dm_db_stats_properties:

    SELECT OBJECT_NAME(stats.object_id) AS TableName,
           stats.name AS StatisticsName,
           stats_properties.last_updated,
           stats_properties.rows,
           stats_properties.rows_sampled,
           stats_properties.modification_counter
    FROM sys.stats stats
    OUTER APPLY sys.dm_db_stats_properties(stats.object_id, stats.stats_id) AS stats_properties
    WHERE OBJECT_NAME(stats.object_id) = 'SalesOrderHeader'
    ORDER BY stats.name;

Importance of Statistics in Query Performance

Let's understand this with an example. Execute the script below to create a new database and set its AUTO_CREATE_STATISTICS and AUTO_UPDATE_STATISTICS properties to OFF, so that statistics are neither created nor updated automatically on the tables of this database. Next, create a table and load data from the Sales.SalesOrderDetail table of the AdventureWorks database.
I have loaded the data ten times over so that we can see the differences clearly.

    CREATE DATABASE StatisticsTest
    GO
    ALTER DATABASE StatisticsTest SET AUTO_CREATE_STATISTICS OFF
    ALTER DATABASE StatisticsTest SET AUTO_UPDATE_STATISTICS OFF
    ALTER DATABASE StatisticsTest SET AUTO_UPDATE_STATISTICS_ASYNC OFF
    GO
    USE StatisticsTest
    GO
    CREATE TABLE SalesOrderDetail (
        SalesOrderID int NOT NULL,
        SalesOrderDetailID int NOT NULL,
        CarrierTrackingNumber nvarchar(25) NULL,
        OrderQty smallint NOT NULL,
        ProductID int NOT NULL,
        SpecialOfferID int NOT NULL,
        UnitPrice money NOT NULL,
        UnitPriceDiscount money NOT NULL,
        LineTotal money NOT NULL,
        rowguid uniqueidentifier ROWGUIDCOL NOT NULL,
        ModifiedDate datetime NOT NULL
    ) ON [PRIMARY]
    GO
    INSERT INTO SalesOrderDetail
    SELECT * FROM AdventureWorks2008R2.Sales.SalesOrderDetail
    GO 10

Now let's run these two queries and look at their execution plans:

    SELECT * FROM SalesOrderDetail WHERE ProductID < 8…
    SELECT * FROM SalesOrderDetail WHERE ProductID = 8…

Notice the yellow exclamation mark on the Table Scan operator; it indicates missing statistics. Notice also the huge difference between the Actual Number of Rows and the Estimated Number of Rows. Without statistics the optimizer could only guess at the row counts, so the execution plan it chose was not optimal.
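To make the link between statistics and the Estimated Number of Rows more concrete, here is a simplified Python sketch of how a histogram could answer an equality predicate such as ProductID = x: if x equals a step's upper bound the estimate is EQ_ROWS, and if x falls inside a step's range it is AVG_RANGE_ROWS. The step values are invented, the class and function names are hypothetical, and returning 0 for values outside the histogram is a simplification of this sketch rather than a description of the optimizer's real behavior.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HistogramStep:
    range_hi_key: int         # RANGE_HI_KEY: upper bound value of the step
    range_rows: int           # RANGE_ROWS: rows inside the step, excluding the upper bound
    distinct_range_rows: int  # DISTINCT_RANGE_ROWS: distinct values inside the step
    eq_rows: int              # EQ_ROWS: rows equal to the upper bound

    @property
    def avg_range_rows(self) -> float:
        # AVG_RANGE_ROWS = RANGE_ROWS / DISTINCT_RANGE_ROWS when DISTINCT_RANGE_ROWS > 0.
        if self.distinct_range_rows > 0:
            return self.range_rows / self.distinct_range_rows
        # Sketch assumption: fall back to 1 when the step has no interior values.
        return 1.0


def estimate_equal_rows(steps: List[HistogramStep], value: int) -> float:
    """Estimate the row count for `column = value`.

    `steps` must be sorted by range_hi_key in ascending order.
    """
    if not steps or value < steps[0].range_hi_key:
        return 0.0  # below the lowest recorded value (simplification)
    for step in steps:
        if value == step.range_hi_key:
            return float(step.eq_rows)
        if value < step.range_hi_key:
            return step.avg_range_rows
    return 0.0  # above the highest recorded value (simplification)


# Invented histogram for a ProductID-like column.
steps = [
    HistogramStep(range_hi_key=707, range_rows=0, distinct_range_rows=0, eq_rows=30),
    HistogramStep(range_hi_key=715, range_rows=48, distinct_range_rows=4, eq_rows=10),
    HistogramStep(range_hi_key=800, range_rows=150, distinct_range_rows=10, eq_rows=5),
]

for probe in (715, 710, 750, 900):
    print(probe, estimate_equal_rows(steps, probe))
```

With no statistics object at all, there is no histogram to consult, which is why the estimates in the plans above drift so far from the actual row counts.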
Now let's create an index on the ProductID column.

Frequently Asked SQL Query Interview Questions

In this article, I am giving some examples of SQL queries that are frequently asked in programming interviews for candidates with one or two years of experience. Whether you are interviewing for a Java developer, QA, BA, support, project manager, or any other technical position, the interviewer may expect you to answer basic questions about databases and SQL. If you have worked on any project for a year or two, there is a good chance you have had to handle a database and write SQL queries to insert, update, delete, and select records. One simple but effective way to check a candidate's SQL skill is to ask these kinds of simple queries. They are neither very complex nor very long, yet they cover the key concepts a programmer should know about SQL.

These queries test your SQL skill on joins (both INNER and OUTER), filtering records with WHERE and HAVING clauses, grouping records with GROUP BY, calculating sums, averages, and counts with aggregate functions such as AVG, SUM, and COUNT, searching records using wildcards with the LIKE operator, searching records within bounds using BETWEEN and IN, date and time queries, and so on. If you have faced an interesting SQL query, or you have a problem and are searching for the solution, you can post it here for everyone's benefit. If you are looking for more challenging SQL query exercises and puzzles, you can also check Joe Celko's SQL Puzzles and Answers, one of the best books for checking and improving your SQL skills.

Question 1: SQL query to find the second highest salary of an Employee.

Answer: There are many ways to find the second highest salary of an Employee in SQL; you can use either a join or a subquery to solve this problem. Here is the SQL query using a subquery:

    SELECT MAX(Salary) FROM Employee
    WHERE Salary NOT IN (SELECT MAX(Salary) FROM Employee);

Question 2: SQL query to find the maximum salary in each department.

Answer: You can find the maximum salary for each department by grouping all records by DeptId and then using the MAX function to calculate the maximum salary in each group.

    SELECT DeptID, MAX(Salary) FROM Employee GROUP BY DeptID;

This question becomes more interesting if the interviewer asks you to print the department name instead of the department id. In that case, you need to join the Employee table with the Department table using the foreign key DeptID. Make sure you use a LEFT or RIGHT OUTER JOIN so that departments without any employees are included as well. Here is the query:

    SELECT DeptName, MAX(Salary)
    FROM Employee e RIGHT JOIN Department d ON e.DeptId = d.DeptID
    GROUP BY DeptName;

In this query, we have used a RIGHT OUTER JOIN because we need the name of the department from the Department table, which is on the right side of the JOIN clause, even if no row in the Employee table references that department.

Question 3: Write an SQL query to display the current date.

Answer: SQL Server has a built-in function called GETDATE() which returns the current timestamp. This works in Microsoft SQL Server; other vendors such as Oracle and MySQL have equivalent functions.

    SELECT GETDATE();

Question 4: Write an SQL query to check whether the date passed to the query is a date of the given format or not.

Answer: SQL Server has the ISDATE() function, which checks whether the passed value is a valid date. It returns 1 (true) or 0 (false) accordingly. Remember that ISDATE() is an MSSQL function and may not work on Oracle, MySQL, or any other database, but there will be something similar.

    SELECT ISDATE('1/08/13') AS "MM/DD/YY";

It returns 0 when the passed value cannot be interpreted as a date under the session's date format settings.

Question 5: Write an SQL query to print the names of the distinct employees whose DOB is between 01/01/1960 and 31/12/1975.

Answer: This SQL query is tricky, but you can use the BETWEEN clause to get all records whose date falls between two dates.

    SELECT DISTINCT EmpName FROM Employees
    WHERE DOB BETWEEN '01/01/1960' AND '31/12/1975';
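Questions 1 and 2 can be tried out without a database server. The sketch below uses Python's built-in sqlite3 module with a small made-up Employee and Department data set; the table contents are assumptions for illustration only. Because older SQLite versions do not support RIGHT JOIN, the department query is written as the equivalent LEFT JOIN with Department on the left side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (DeptID INTEGER PRIMARY KEY, DeptName TEXT NOT NULL);
CREATE TABLE Employee (
    EmpID   INTEGER PRIMARY KEY,
    EmpName TEXT NOT NULL,
    Salary  INTEGER NOT NULL,
    DeptID  INTEGER REFERENCES Department(DeptID)
);
INSERT INTO Department VALUES (1, 'Engineering'), (2, 'Sales'), (3, 'Legal');
INSERT INTO Employee VALUES
    (1, 'Asha', 9000, 1),
    (2, 'Ben', 12000, 1),
    (3, 'Chen', 7000, 2),
    (4, 'Dita', 12000, 2);
""")

# Question 1: second highest salary via a subquery.
# Two employees share the top salary; NOT IN excludes both of them.
second_highest = conn.execute("""
    SELECT MAX(Salary) FROM Employee
    WHERE Salary NOT IN (SELECT MAX(Salary) FROM Employee)
""").fetchone()[0]

# Question 2: maximum salary per department name, keeping departments
# that have no employees (their maximum comes back as NULL / None).
max_by_dept = conn.execute("""
    SELECT d.DeptName, MAX(e.Salary)
    FROM Department d LEFT JOIN Employee e ON e.DeptID = d.DeptID
    GROUP BY d.DeptName
    ORDER BY d.DeptName
""").fetchall()

print(second_highest)
print(max_by_dept)
```

The Legal department has no employees in this data set, so it only appears in the result because of the outer join; with an inner join it would be dropped.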
Question 6: Write an SQL query to find the number of employees by gender whose DOB is between 01/01/1960 and 31/12/1975.

Answer:

    SELECT COUNT(*), sex FROM Employees
    WHERE DOB BETWEEN '01/01/1960' AND '31/12/1975'
    GROUP BY sex;

Question 7: Write an SQL query to find the employees whose salary is equal to or greater than 10000.

Answer:

    SELECT EmpName FROM Employees WHERE Salary >= 10000;

Question 8: Write an SQL query to find the names of employees whose name starts with 'M'.

Answer:

    SELECT * FROM Employees WHERE EmpName LIKE 'M%';

Question 9: Find all Employee records containing the word "Joe", regardless of whether it was stored as JOE, Joe, or joe.

Answer:

    SELECT * FROM Employees WHERE UPPER(EmpName) LIKE '%JOE%';

Question 10: Write an SQL query to find the year from a date.

Answer: Here is how you can find the year from a date in SQL Server 2008:

    SELECT YEAR(GETDATE()) AS "Year";

Question 11: Write an SQL query to find duplicate rows in a database, and then write an SQL query to delete them.

Answer: You can use the following query to select distinct records (rowid is Oracle-specific):

    SELECT * FROM emp a
    WHERE rowid = (SELECT MAX(rowid) FROM emp b WHERE a.empno = b.empno);

To delete the duplicates:

    DELETE FROM emp a
    WHERE rowid != (SELECT MAX(rowid) FROM emp b WHERE a.empno = b.empno);

Question 12: There is a table which contains two columns, Student and Marks. You need to find all the students whose marks are greater than the average marks, i.e. the list of above-average students.

Answer: This query can be written using a subquery as shown below:

    SELECT student, marks FROM table
    WHERE marks > (SELECT AVG(marks) FROM table);

Question 13: How do you find all employees who are also managers? You are given a standard employee table with an additional column mgr_id, which contains the employee id of the manager.

Answer: You need to know about self joins to solve this problem. In a self join, you join two instances of the same table to find out additional details, as shown below:

    SELECT e.name, m.name
    FROM Employee e, Employee m
    WHERE e.mgr_id = m.emp_id;

This returns each employee's name alongside the manager's name, for example John and David. One follow-up is to modify this query to include employees who don't have a manager. To solve that, use a left outer join instead of the inner join; this will also include employees without managers.

Question 14: You have a composite index on three columns, and you provide values for only two of the columns in the WHERE clause of a SELECT query. Will the index be used for this operation? For example, suppose the index is on EmpId, EmpFirstName, and EmpSecondName, and you write a query like:

    SELECT * FROM Employee WHERE EmpId = 2 AND EmpFirstName = 'Radhe';

Answer: If the two given columns are only secondary (non-leading) columns of the index, the index will not be used, but if the two given columns include the leading column, the first one named when the index was created, the index will be used. In this case, the index will be used because EmpId and EmpFirstName are the leading columns.
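The leading-column rule from Question 14 can be observed with a small experiment. The sketch below uses SQLite through Python's sqlite3 module and its EXPLAIN QUERY PLAN statement; other database engines have their own planners and may behave differently, so treat this as an illustration of the rule rather than proof of what SQL Server or Oracle will do. The index name idx_emp_name, the extra Salary column, and the value 'Shyam' are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (
    EmpId         INTEGER,
    EmpFirstName  TEXT,
    EmpSecondName TEXT,
    Salary        INTEGER  -- not in the index, so SELECT * cannot be answered from the index alone
);
CREATE INDEX idx_emp_name ON Employee (EmpId, EmpFirstName, EmpSecondName);
""")


def plan(sql):
    """Return SQLite's query plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail text is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# The predicate covers the first two index columns, including the leading one.
leading = plan(
    "SELECT * FROM Employee WHERE EmpId = 2 AND EmpFirstName = 'Radhe'"
)

# The predicate skips the leading column EmpId.
trailing = plan(
    "SELECT * FROM Employee "
    "WHERE EmpFirstName = 'Radhe' AND EmpSecondName = 'Shyam'"
)

print(leading)
print(trailing)
```

The first plan should report a search using idx_emp_name, while the second should fall back to scanning the table, because the index is ordered by EmpId first and cannot be searched efficiently without a value for it.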