Auto-increment without the need of a table.

I came across a situation where I need unique numbers for both a large and small set of data. The standard approach is to create a table with an identity column and do your inserts. For a table with tens of thousands of records, this might be the best approach, but there is another one: ROW_NUMBER().

The key to using the ROW_NUMBER() function is to provide a constant so you will get a contiguous sequence.  Let's look at sample data.

Format
  AB
  AL
  CD
  CE
  GG
  LI

Let's look at the function:

ROW_NUMBER() OVER (Partition by <group> order by <columns>)

I can first construct the function as follows

 select Format,
         ROW_NUMBER() OVER(PARTITION BY Format ORDER BY FORMAT) as id

However this will give us the following results:

id Format
 1   AB
 1   AL
 1   CD
 1   CE
 1   GG
 1   LI

The reason is your PARTITION BY creates a unique number for each format. To solve this problem you need to put a value that will never change over the life of the select. I use 1.

 select Format,
         ROW_NUMBER() OVER(PARTITION BY 1 ORDER BY FORMAT) as id

The new result set is:

id Format
 1   AB
 2   AL
 3   CD
 4   CE
 5   GG
 6   LI

Hope this helps improve your sql. As always, leave comments below.

 

 

 

 

Grouping data by 15 minute intervals

After a rather long course change (I was doing little work with SQL Server and work with other RDBMS systems), I'm back working on Microsoft SQL Server and a few other Microsoft products. Here is an interesting problem presented to me a few days ago.

A colleague of mine need timed data grouped by 15 minute intervals. I found this to be an interesting problem and didn't see much online, so thought I'd post this quick tidbit for others to find. The technique can be used to group by any interval with a little tweaking.

The field in question is a datetime field. The code needed to group data in 15 min intervals is this:

datepart(mi,C001_TimeEnd)/15)*15

Let's break it down. I take the minutes of the datetime field and divide it by 15, which is my interval. That will give me a number between 0 and 3, because it is an integer/integer. I then multiple that by 15. The distribution is as follows:

  • (00-14)/15 = 0*15=0
  • (15-29)/15= 1*15 = 15
  • (30-44)/15=2*15=30
  • (45-59)=3*15=45

Datepart(mi,"2018-5-9 11:48:33")/15) = 48/15 = 3*15 = 45. Now add in the the date and hours.

left(convert(varchar,"2018-5-9 11:48:33",120),14)+RIGHT('00' + CONVERT(VARCHAR(2), (datepart(mi,"2018-5-9 11:48:33")/15)*15), 2) = "2018-5-9 11:45:00"

If you wanted to do a 10 min interval, this: datepart(mi,C001_TimeEnd)/15)*15 becomes this datepart(mi,C001_TimeEnd)/10)*10.

 

Using Row_Number to pick the correct entry.

There have been numerous occasions where I have to pick a one of two records out of a given result set, but how I determine which one has some unique rules.  I have two examples of this and I show how I used Row_Number() to solve this.

The first set of data has these sets of rules:

  • For any given ISBN13, give me the 'Acquisition Editor' if they exist and if they do not exist, provide the 'Editor', known as [Role].
  • If there are multiple persons for the chosen [Role], pick the one with the lowest [SortOrder].
  • If [SortOrder] is null, pick the first name alphabetically.

Here is the solution:

select ISBN13,
       DisplayName,
       [Role]
  from (select ISBN13,
               DisplayName,
               [Role],
               ROW_NUMBER() OVER(Partition by ISBN13 order by [Role],isnull(cast(SortOrder as varchar(40)),DisplayName)) as Row
          from dbo.ProductContact
          where C006_Role in ('Acquisition Editor','Editor')) d
where Row=1

The second set of data had the following rules:

  • For a given ISBN13, provide the sum of the  'First Printing', which can be under either PrintNumber '0000' or '0001'. 
  • PrintNumber '0000' is always chosen over '0001' if it exists.

Here is the solution:

select isbn13,
       Printing_Number,
       First_Run
   from (SELECT isbn13,
                Printing_Number,
                Sum(Print_Run) as First_Run,                
                ROW_NUMBER() OVER(PARTITION by isbn13 order by Printing_Number) as Row
           FROM dbo.PPB
          where Printing_Number in ('0000','0001')
          group by isbn13,Printing_Number) d
  where Row=1

QueryIO Reader for Python V2

I've made some improvements to my QueryIO reader built in Python.  The new version allows you to sort the output in either ascending or descending order.  Prior to V2., the only output option was the output that was available in the statistics io text file fed into the program.  If there are any other features/enhancements you'd like to see, please let me know.  You can download it here.  Enjoy!