Just one of those things you may not know existed - the ROLLUP clause. It's very useful for totaling. Are you used to doing something like this?
SELECT customer, sum(sales) from table
GROUP BY customer
UNION
SELECT 'Total', sum(sales) from table
The problem here is that on larger datasets this can really hurt performance, since the table gets read twice. ROLLUP can be extremely quick, but the syntax may not be straightforward. Here's an example of summing sales in AdventureWorks2017 with a Grand Total.
USE AdventureWorks2017
SELECT
    SalesOrderID,
    SUM(LineTotal) AS LineTotal
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN (
    '43659',
    '43660',
    '43661',
    '43662',
    '43663'
)
GROUP BY SalesOrderID
WITH ROLLUP
Notice I've added WITH ROLLUP at the end of my GROUP BY. This adds an extra row with the grand total. However, I'm left with a NULL for the SalesOrderID on that row. I wouldn't want to paper over it with COALESCE, because a NULL might be a valid value elsewhere. Instead we should use the GROUPING function, which tells you whether a NULL was produced by the rollup.
USE AdventureWorks2017
SELECT
    CASE
        WHEN GROUPING(SalesOrderID) = 1 THEN 'Total'
        ELSE CAST(SalesOrderID AS nvarchar(10))
    END AS SalesOrderID,
    SUM(LineTotal) AS LineTotal
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN (
    '43659',
    '43660',
    '43661',
    '43662',
    '43663'
)
GROUP BY SalesOrderID
WITH ROLLUP
You can extend this even further...
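For instance (a sketch of my own, not from the original post), adding ProductID to the rollup gives you a per-product breakdown, a subtotal for each order, and the grand total, with GROUPING telling you which row is which:
USE AdventureWorks2017
SELECT
    CASE WHEN GROUPING(SalesOrderID) = 1 THEN 'Total'
         ELSE CAST(SalesOrderID AS nvarchar(10)) END AS SalesOrderID,
    CASE WHEN GROUPING(ProductID) = 1 THEN 'All products'
         ELSE CAST(ProductID AS nvarchar(10)) END AS ProductID,
    SUM(LineTotal) AS LineTotal
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN ('43659', '43660', '43661')
GROUP BY SalesOrderID, ProductID
WITH ROLLUP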
A nifty little bit of code I've used for years and years to find every table and view containing a particular column; very useful if you don't have a third-party add-on installed...
USE AdventureWorks2017
SELECT * FROM (
SELECT t.name AS name,
'table' as type,
SCHEMA_NAME(schema_id) AS schema_name,
c.name AS column_name
FROM sys.tables AS t
INNER JOIN sys.columns c ON t.OBJECT_ID = c.OBJECT_ID
WHERE c.name LIKE '%Employee%'
UNION
SELECT v.name AS name,
'view' as type,
SCHEMA_NAME(schema_id) AS schema_name,
c.name AS column_name
FROM sys.views AS v
INNER JOIN sys.columns c ON v.OBJECT_ID = c.OBJECT_ID
WHERE c.name LIKE '%Employee%'
) a
ORDER BY schema_name, name
...
One fun thing I found out about SSIS is that it doesn't always fail a package if you execute stored procedures as SQL tasks instead of utilizing the SQL itself inside of the package. In other words - a call that would produce an error if you were to execute it in SSMS can fly by successfully within SSIS without you knowing anything went wrong. The best little trick I've found is to surround the entire procedure body with a TRY/CATCH block and re-raise the error in the CATCH. This raises the error up to SSIS and fails the package for you.
CREATE PROCEDURE usp_TestProc as
BEGIN TRY
--SQL goes here...
END TRY
BEGIN CATCH
PRINT 'Error ' + CONVERT(varchar(50), ERROR_NUMBER()) + ' at line ' + CONVERT(varchar(50), ERROR_LINE()) + ': ' + ERROR_MESSAGE();
THROW; -- re-raise so the calling SSIS task actually fails; PRINT alone would swallow the error
END CATCH;
GO
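As a quick sanity check (my own hypothetical test, not from the original post), you can drop a deliberately failing statement into the TRY block and confirm the Execute SQL Task goes red when the package runs:
-- Hypothetical test procedure: the divide-by-zero forces error 8134,
-- which the CATCH block prints and then re-throws to the caller.
CREATE PROCEDURE usp_TestProcFailure as
BEGIN TRY
SELECT 1 / 0;
END TRY
BEGIN CATCH
PRINT 'Error ' + CONVERT(varchar(50), ERROR_NUMBER()) + ': ' + ERROR_MESSAGE();
THROW;
END CATCH;
GO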
...
One problem I've had recently is a SQL Server instance in which view metadata was constantly going stale and the views needed to be refreshed. Normally to do this you would run the built-in procedure:
exec sp_refreshview 'dbo.MyView'
However - I had the need to run this on all my views daily, and we constantly had new views. You can use the stored proc below to simply loop over all the views and refresh each one.
create procedure usp_refreshview as
declare @sqlcmd nvarchar(max) = ''

-- schema-qualify the view name so views outside dbo refresh correctly
select @sqlcmd = @sqlcmd + 'exec sp_refreshview ' + '''' + schema_name(so.schema_id) + '.' + so.name + '''; '
from sys.objects as so
where so.type = 'V'

print @sqlcmd

if len(@sqlcmd) > 0
exec(@sqlcmd)
To run, simply exec:
exec usp_refreshview
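Just to illustrate what gets built (the view names below are made up), the PRINT shows a chain of calls along these lines, which then gets executed as one batch:
exec sp_refreshview 'dbo.vSalesSummary'; exec sp_refreshview 'hr.vEmployeeList';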
...
A useful bit of syntax for when you end up with multiple levels of granularity in a dataset and want to collapse it back down. One solution would be to turn your rows into columns (pivoting); another is to put your row values into a single column separated by commas (or pipes, etc.). Below I'm using the AdventureWorks2017 database to show how to do the latter.
SELECT [CountryRegionCode]
,StateProvinceCode
FROM [AdventureWorks2017].[Person].[StateProvince]
This produces a long list of state/province codes, each with its country code. So let's collapse that down to one row per country...
SELECT outerQ.[CountryRegionCode],
    STUFF((
        SELECT DISTINCT ',' + subQ.StateProvinceCode
        FROM [AdventureWorks2017].[Person].[StateProvince] subQ
        WHERE outerQ.CountryRegionCode = subQ.CountryRegionCode
        FOR XML PATH('')
    ), 1, 1, ''
    ) AS StateProvinceList
FROM [AdventureWorks2017].[Person].[StateProvince] outerQ
GROUP BY outerQ.[CountryRegionCode];
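On SQL Server 2017 and later (likely, given the AdventureWorks2017 database), STRING_AGG does the same collapse with far less ceremony. A sketch of my own, not from the original post; note STRING_AGG has no DISTINCT option, which doesn't matter here because the codes are already unique per country:
SELECT [CountryRegionCode],
    STRING_AGG(StateProvinceCode, ',') WITHIN GROUP (ORDER BY StateProvinceCode) AS StateProvinceList
FROM [AdventureWorks2017].[Person].[StateProvince]
GROUP BY [CountryRegionCode];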
...
The bane of my existence: blank values that should be NULL in a database. This happens frequently when importing Excel or flat files into SQL Server. Running hundreds of UPDATE statements isn't fun, so I use a Stored Procedure with a CURSOR to take care of it.
CREATE PROCEDURE [dbo].[usp_getRidOfTableBlanks](@tableName nvarchar(50))
AS
DECLARE @colName varchar(255)
DECLARE @sql varchar(4000)

DECLARE Table_Cursor CURSOR FOR
select COLUMN_NAME
from INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME = @tableName
and DATA_TYPE in ('char', 'nchar', 'varchar', 'nvarchar') -- only string columns can hold blanks

OPEN Table_Cursor;
FETCH NEXT FROM Table_Cursor INTO @colName

WHILE @@FETCH_STATUS = 0
BEGIN
set @sql = 'UPDATE ' + @tableName + ' SET [' + @colName + '] = NULL WHERE [' + @colName + '] = '''';';
exec(@sql)
FETCH NEXT FROM Table_Cursor INTO @colName
END

CLOSE Table_Cursor;
DEALLOCATE Table_Cursor;
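To clean up an imported table you'd then just call it, passing the table name (the name below is made up for illustration):
EXEC dbo.usp_getRidOfTableBlanks @tableName = N'MyImportedTable';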
...
Something very important to me in particular, as I utilize SQL Server quite a bit: with pyodbc and pandas you can bring your SQL Server data straight into Python.
Required Libraries
conda install pyodbc #Anaconda installation
#pip install pyodbc #Alternative
Here we read from the AdventureWorks2017 database in SQL Server as an example (replacing server_name with your own server):
import pyodbc
import pandas as pd
conn = pyodbc.connect('Driver={SQL Server};'
'Server=server_name;'
'Database=AdventureWorks2017;'
'Trusted_Connection=yes;')
sql_query = pd.read_sql_query('SELECT * FROM Person.person',conn)
sql_query.head()
Notice the result comes back as a pandas DataFrame.
Similarly - we could also write back to SQL Server if we wanted to, using slightly altered logic (abbreviated insert given how many...