T-SQL Tuesday #169 – Thank You

Invitation from Kay Sauter

T-SQL Tuesday No. 169 – December 2023. I’ve got the honor of inviting you to the T-SQL Tuesday blog party, tagged as #tsql2sday, for December 2023. Many thanks to Steve Jones (blog | twitter) for letting me host this month’s blog party!

December is the last month of the year. I believe December is the season of being thankful, or at least looking back. I also believe that we generally do not say “thank you” enough to people. So I invite and encourage you to say thank you to anyone who helped you:

  • Work colleagues or employers
  • Mentors
  • Friends
  • Family

Your post does not have to be personal; it can also be a story where you anonymize the people involved, so that the readers it actually matters to will know that you appreciate them. If you can even incorporate some code, please do! Happy blogging and happy T-SQL Tuesday! Please publish your blog article on December 12, 2023.

T-SQL Tuesday #168 – Mature Window Functions

Invitation and roundup from Steve Jones.

We’ve had window functions in SQL Server for a decade now, since SQL Server 2012.

This month I’m asking you to write on how window functions have made your life easier. A few ideas for you:

  • What problems have you solved with a window function? Bonus points for lead/lag/first_value/last_value
  • Have you used the SQL Server 2022 enhancements in any queries?
  • How has performance improved for you with a window function?
  • Draw a picture of a window with a spatial function – more extra points

Give us some specifics, with real world problems. Obfuscate the data, at least if you have my name in your dev system, but help others understand how they might solve a complex aggregate using a window function. The more specific the examples, the more likely others are to get help from one of the posts.
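To make that concrete, here's a hedged sketch of the kind of problem a window function tidies up, against a hypothetical dbo.Orders table (my own illustration, not part of the invitation): LAG compares each order with the customer's previous one, and a framed SUM keeps a running total.

-- Hedged illustration against a hypothetical dbo.Orders table
SELECT CustomerId,
       OrderDate,
       OrderTotal,
       LAG(OrderTotal) OVER (PARTITION BY CustomerId ORDER BY OrderDate) AS PreviousTotal,
       OrderTotal - LAG(OrderTotal) OVER (PARTITION BY CustomerId ORDER BY OrderDate) AS ChangeFromPrevious,
       SUM(OrderTotal) OVER (PARTITION BY CustomerId ORDER BY OrderDate
                             ROWS UNBOUNDED PRECEDING) AS RunningTotal
FROM dbo.Orders;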

T-SQL Tuesday #167 – Encryption and Data Protection

Invitation and roundup from Matthew McGiffen.

For those of you who don’t know, T-SQL Tuesday is a monthly blog party where people are invited to blog on a chosen subject related to SQL Server. You’re invited to join in whether you’re an experienced blogger, or have never blogged before. It’s a nice way to get started writing with a ready audience who’ll view your post.

Any of you who know me will know I have a keen interest in SQL Server Encryption and Data Protection so that’s the topic for this month’s blog party.

Any contributions relating to that topic are most welcome, and treat it as you see fit. Maybe you have a technical detail or discovery you want to share. Maybe you have a useful “how to”. Equally welcome are stories about encryption gone wrong – or incidents where data wasn’t protected sufficiently, and bad things happened. Or stories with a happy ending. It’s a deliberately wide topic, and I’d love a wide variety of contributions.
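For anyone wanting a concrete starting point, here is a minimal sketch (my own illustration, not part of the invitation) that protects a single value with ENCRYPTBYPASSPHRASE and reads it back with DECRYPTBYPASSPHRASE; the passphrase and value are made up.

-- Minimal sketch: symmetric protection of one value with a passphrase (illustrative only)
DECLARE @secret NVARCHAR(100) = N'4111-1111-1111-1111';  -- hypothetical sensitive value
DECLARE @cipher VARBINARY(256);

SET @cipher = ENCRYPTBYPASSPHRASE(N'a strong passphrase goes here', @secret);

SELECT @cipher AS EncryptedValue,
       CONVERT(NVARCHAR(100), DECRYPTBYPASSPHRASE(N'a strong passphrase goes here', @cipher)) AS DecryptedValue;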

T-SQL Tuesday #166 – Why Not Extended Events?

Invitation from Grant Fritchey.

Out of 165 T-SQL Tuesday events, two, just two, have been on Extended Events: this one, T-SQL Tuesday #166, and another one back in 2018 or 2019 (I forget, and I’m far too lazy to go look).

At conferences I’m frequently the only one doing sessions on Extended Events (although, sometimes, Erin Stellato is there, presenting a better session than mine). I did a session at SQL Konferenz in Germany earlier this week on Extended Events. Hanging out in the hallway at the event (which was great by the way), I was talking with some consultants. Here’s their paraphrased (probably badly) story:

“I was working with an organization just a few weeks back. They found that Trace was truncating the text on some queries they were trying to track. I asked them if they had tried using Extended Events. They responded: What’s that? After explaining it to them, they went away for an hour or so and came back to me saying that had fixed the problem.”

We all smiled and chuckled. But then it struck me. This wasn’t a case of someone who simply had a lot more experience and understanding of Profiler/Trace, so they preferred to use it. They had literally never heard of Extended Events.

Why?

Search Engines

I did a search on Bing, Google and DuckDuckGo. The results were instructive.

The top result on Bing was a 14-year-old Stack Overflow post. To say the least, yeah, it’s not showing anyone how to use Extended Events. It talks about DMVs in addition to Trace/Profiler.

The top result in Google was a site I’ve never even heard of before, Sit24x7.com. It talked about DMVs, and nothing else. I couldn’t find a publish date on the article, but since it didn’t mention sys.dm_exec_procedure_stats and only talked about query_stats, either the person writing it was unaware of more recent DMVs (and, let’s be clear, calling procedure_stats recent is a stretch), or this is a very old article indeed.

The top result in DuckDuckGo was a post on SQLShack written in 2017 using examples from SQL Server 2016. The tools used were, kind of oddly, Activity Monitor within SSMS and Query Store. No mention of Profiler/Trace, Extended Events, or even DMVs. The second result was the 14-year-old Stack Overflow post.

If you were looking to identify a long running query, you might be led to believe the consensus is that the only tools to use are DMVs.

The Whole First Page

The common wisdom is that people never go beyond the first page of search results (I’ve no idea if this is true). So, what’s on the first page?

Well, Bing had eleven results that weren’t ads when I ran the query linked above. It wasn’t until links 7 and 8 that Extended Events were mentioned. Further, links 7 & 8 were the same article, just published in two different places with a few edits between them. Four of the links were to Microsoft, and NONE (zero, 0, zip, nicht) of those mentioned Extended Events, although they did talk about DMVs. Of the top 10, most of the links were old, many 10 years or more. The two links, 7 & 8, were the only ones to mention Extended Events.

With Google I saw 10 non-sponsored links on the first page. Many of them were duplicates of the links in Bing, just in a different order. Link 4 was to the same article I found in Bing. Link 5 was to a new source that did have Extended Events as the #1 tool for gathering query performance metrics. There was only one Microsoft link, duplicated from Bing, and it didn’t list Extended Events. Just like with Bing, most of the links were old. Many of the links from Google were older than the ones from Bing.

DuckDuckGo was just a little better. The 3rd and 4th slots had two different articles talking about Extended Events, with the 3rd slot being the same article from both Bing and Google and the 4th slot being a new one. Three of the slots were Microsoft and, again, no mention of Extended Events. And, once more, many of the links were at minimum 7 or 8 years old, with some being 13 or 14 years old.

Conclusion

We can have a lot of discussion about the technical aspects of Extended Events. We can also talk about whether or not you should, or shouldn’t, use Extended Events. The simple fact of the matter is, there’s a good chance that people aren’t using Extended Events, not because they’re problematic, hard, full of XML, working against muscle memory, or any of the other issues that I, and others, bring up, but instead because they simply don’t know that they exist.

So, if you are #TeamXE, not only do we have to overcome years of bad information and indifference due to a poor launch (2008 XE just wasn’t good, let’s be honest), but also the fact that, because of the way search engines work, Extended Events may be hidden from many people.
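And if you've never seen one, a minimal sketch of an Extended Events session looks something like this (my own example, not from the post): it captures completed statements that run longer than a second, along with the full statement text, and the session name and file name are placeholders.

-- Minimal sketch: capture statements slower than one second, including the full SQL text
CREATE EVENT SESSION [LongRunningQueries] ON SERVER
ADD EVENT sqlserver.sql_statement_completed
(
    ACTION (sqlserver.sql_text, sqlserver.database_name, sqlserver.client_hostname)
    WHERE duration > 1000000  -- duration is reported in microseconds
)
ADD TARGET package0.event_file
(
    SET filename = N'LongRunningQueries.xel'  -- hypothetical file name/path
);
GO

ALTER EVENT SESSION [LongRunningQueries] ON SERVER STATE = START;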

T-SQL Tuesday #165: What Do All The Database Job Titles Actually Mean?

Invitation and roundup from Josephine Bush

Earlier this year, when I was looking for a new job, I realized that there are a lot of conceptions about what different database jobs do. I was perplexed about what should be part of what job. These job titles could include other job title tasks, depending on what a company wants. The more I Googled what each of the job titles would include, the more it seemed there was no real standard for what each job title does. For example, if a company doesn’t have dedicated DBAs, that job function is lumped into another job title. You might be doing all of them at many companies because you are the db person, period.

My Conceptions or Misconceptions

Here’s what each job title means to me, whether accurate or not:

  • DBA – This is backups, restores, HA, DR, index maintenance, integrity checks, patching, upgrades, and maybe some performance tuning (definitely at the server level, maybe at the query level).
  • Database engineer – This is more coding and getting into the meat and bones of db design. This could also include building data pipelines. Also, it would include more query tuning.
  • Database reliability engineer – This is more DevOps-y than DBA, so it’s more about automation and creating CI/CD processes.
  • Database architect – This is designing database systems, such as architecting the infrastructure and making diagrams.
  • Data architect – This is intimately knowing the data inside and out and maybe designing data flows.
  • Data warehouse architect – This is a separate domain because DW OLAP differs greatly from DBA OLTP work. I don’t put this with DBRE or DE because it’s its own domain.
  • Data warehouse engineer – Is this a thing? Or is that lumped into a database engineer job title? But again, as I said with database architect vs. data warehouse architect, data warehouse engineer could be separate from database engineer.
  • Data scientist – This is modeling/predicting with datasets.

Kendra did an amazing post covering some of this last week. It was like she was reading my mind because I was preparing this post as she published hers! I hit her up about her post, and she suggested including data scientists in my post.

If you have other database job titles you want to include, please feel free. I thought up whatever I could based on what I saw or thought should be there. I’m excited to see what you all think!

T-SQL Tuesday #164: Code That Made You Feel A Way

Invitation and roundup from Erik Darling.

This month’s topic challenges you to think back to the last time you saw code that made you feel a thing. Hopefully a positive thing.

I’m not a thesaurus, so the list isn’t exhaustive, but think along the lines of: surprise, awe, inspiration, excitement.

Or maybe it was just code that sort of sunk its teeth into you and made you want to learn a whole lot more.

This won’t be my submission, but I think a great example is a piece of query tuning magic by Paul White. It was one of the first articles I ever read as a young SQL Developer.

It was one of many lightbulb moments (and, crap, I should have said lightbulb moment up there when I was telling you what kind of code I want you to think about) that I’ve had in my career.

The only rule I have (outside of the normal T-SQL Tuesday rules) is that it can’t be your own code.

It should also probably be code that you won’t get sued for showing or talking about. Please be diligent in abiding by copyrights.

Other than that, it’s the normal pack of rabbits: submissions have to be posted next Tuesday, the 11th of July.

T-SQL Tuesday #163 Invitation – What is the best piece of Career Advice you ever received

Invitation and round up from Gethyn Ellis

There is no right or wrong answer here. It might be some technical advice like, “OK, so you’ve got backups. You really need to be able to restore them, though. Have you tested that?”

Or it might be something around the non-technical skills you need in this profession, like “Find a mentor”

We ask this question to our guests on our Putting the Human into Technology podcast, and the answers are always intriguing. Some of the short answers to the question include:

  • Find a mentor
  • Be curious
  • Do what you love
  • Own the customer

These are just some of the answers we have had. There are many more.

Feel the fear and do it anyway

This quote is now the title of a book. I googled it when writing this invite. I’ve not read the book, but the edition I found was published in 2017, and I got this advice circa September 2003. I think it’s a good title for a book; I might need to get a copy and have a read. Update: the book Feel the Fear and Do It Anyway by Susan Jeffers was first published in 1987, so there is every chance that this advice came from that book. You can find the book here.

So, let me give some background. It was September 2003, and I worked for a local authority in Wales. I had just graduated from university. My job title was Community Outreach Worker. Essentially, this was working on an online portal that the authority would use to promote local businesses and community organisations and their activities. It also allowed the council to bring some of their services online.

With this job I was working with technology and people, which was always something I wanted to do, and on the whole I really enjoyed it. However, part of my role was phoning people running community groups and asking them if they wanted to get involved with the portal. It wasn’t quite cold calling, but it was close. I’ll be honest: I didn’t look forward to doing this, I probably did anything else to get out of it, and I think this was obvious to most people on the team.

Let me Introduce Andy Wilson

I sat next to my colleague in the office. We’ve lost touch over the years, but his name was Andy Wilson. Andy doesn’t seem to be on social media, or I’d tag him here. He was probably twenty years older than me with lots more experience. He was a great person, funny and good at his job. He could see I wasn’t having fun. So, he turned to me and said, “Look Geth, I can see you don’t like doing the phone calls. Why?” I explained that I didn’t like phoning people, and that phoning up strangers made me feel a little nervous. Andy turned to me and said, “Well, you have three calls to make. Why don’t you make them now, then they’re done for the day and you can get on with the stuff you do enjoy doing… I got some advice from my old boss years ago when I worked in sales and had to make cold calls. He said to ‘Feel the fear and do it anyway.’ What’s the worst that can happen? They hang up on you? Which they won’t. They’ve signed up for a call back.”

Did I make the calls?

So, I made the calls. Once I’d done one, I’d done two, and I ended up doing several more than the three on my list and feeling quite good about myself.

I’ve used this advice, that mindset, and the memory of how I felt afterwards that morning on many occasions since. Whether it was changing jobs, moving into a DBA role for a large police force, or implementing a technical fix for a problem I’d been called out for at 3 am with no one to escalate to: you have to do something, you’re in the land of do it now and ask for forgiveness in the morning, and you get the business working again, because otherwise it’s stuck until 9 am when the rest of the business wakes up. I used the advice when making the jump and starting my own consulting business. It definitely helped when I was about to teach my first multi-day training class, or deliver my first, second, third, and fiftieth conference sessions. When the nerves (or fear) kick in, I feel the fear and do it anyway! The feeling afterwards, when you have done something out of your comfort zone, is pretty amazing.

I’d say it’s had quite an impact on my career.

Summary – Over to you

I’d love to hear your answers to the question. Whether it’s around a technology choice, a career path, a company to work for, or whatever piece of advice had the most significant impact on your career in the data profession, I’d love to hear about it. You might help someone new to the data profession and give them an extra tool in their armoury to succeed on their path.

T-SQL Tuesday #162 – Data Science in the time of ChatGPT

Invitation from Tomaz Kastrun.

Instead of writing about and asking data science questions, let’s discuss the aspects of data science in the presence of ChatGPT 4.0.

By now, it is known to everyone that ChatGPT is a language model (LLM, Large Language Model) based on the GPT (Generative Pre-trained Transformer) architecture. It uses deep learning, neural networks with billions of weights built on transformers, to generate the sequence of tokens that makes up a piece of text. Transformers introduce the concept of “paying attention” to build better sequences of text. The model operates primarily on the probabilities of words and their sequence, which is why it is good at human-like responses to natural language queries, making it great for a conversation-like experience.

There are many caveats hidden in the processing of the text, the adjustment of the weights, the activation functions (different, tweaked versions of ReLU), and the additional corpora of billions of texts used for model training.

I have prepared two groups of questions. I will not go into a debate about whether the end of data science is near, nor about whether AGI (artificial general intelligence) will completely replace the role of data scientists. What I want to hear from you is simply how you embraced (if at all) the use of ChatGPT, and what your first impressions were. And mostly: how did it help you (if at all), what did you use it for, and have you encountered any traps?

Usage and working alongside ChatGPT

Imagine using SQL, R, Python, Julia, or Scala for your daily data science work. You can ask ChatGPT practically anything, and it will return a relatively coherent and good answer. If you need an explanation, it will excel. Where and what have you used it for? Here is a short list that might get you started:

  1. Explain a data science algorithm
  2. Help tune or create SQL code to query big data
  3. Prepare R, Python, or Scala code for exploring the data
  4. Help you prepare the training of a model in the desired language
  5. Prepare the code for hyperparameter tuning and cross-validation
  6. Ask for a data visualization for a given dataset
  7. Help create a dashboard
  8. Create code for model deployment, model re-training or model consumption
  9. Ask for custom functions and algorithm/function adjustments

Now that you have found the places where and how it helped you, I would like to understand how much it helped. Feel free to make a general comparison and add some explanations. And lastly, of course, add whether this has in any way compromised your work as a data scientist (whether you embraced it in a positive way, or it was a negative experience).

Responsible usage

We have seen many controversies around ChatGPT emerge. Some European Union countries have banned it, and some will soon do so too. And the question is not only about its use (as the end of humanity and empathy) but also about the misuse of personal data, privacy issues and the leaking of relevant corporate information.

Have you considered the responsible usage of ChatGPT? Here, again, is a short list to help you:

  1. The use of personal data retrieved from the model
  2. Inserting sensitive (personal or company) data
  3. Asking it to explain a section of R, Python, or Scala code that is the property of your enterprise

Instead of this, have you tried using it more responsibly?

  1. Using pseudo code for explanation of the algorithm
  2. Using mock data rather than real data
  3. Giving pseudo-code in order to receive documentation
  4. Leaving out sensitive data (SQL schemas, model information, sensitive values)

So which cases have you come across? Did they have any consequences for you? What other responsible uses of ChatGPT have you practiced?

My takeaways

ChatGPT offers interesting answers (based on my experience and searches), and it is the next step up from a Google search or Stack Overflow. In other words, it gives you a more focused answer. When exploring and searching forums, you might find several different solutions to a single problem, whereas here you have to ask for another solution. It can also give you an answer faster than browsing the web. Both approaches have their advantages and disadvantages, but neither will assure you that the answer is correct!

I embrace this technology as an additional learning source, but I personally do not use it as my daily driver, despite trying it out a couple of times (with mixed results; some working and some non-working/useless/meaningless). It can be super helpful for entry-level/junior positions, but the more experienced you are, the more abstract your data science work and the more complicated the topics you cover, the less frequently you will presumably use it.

T-SQL Tuesday #161 – Having Fun with T-SQL

Invitation and writeup from Reitse Eskens.

So, what to blog about this month? Well, it’s just been April Fools’ Day, and as you’re reading this, you survived. Congratulations! But it did spark a question: what fun are we having with our code? And I’m not talking about commit messages in the Git repository or funny comments inside the code. I’m just as guilty of that as the next programmer, but I’d like to focus on something else.

What are the most fun scripts, procedures or statements you’ve written? Let me give two examples to set the stage.

The first fun script I wrote has some history to it. About ten years ago, when my wife was pregnant, we were in the garden discussing the future. We were pulling out some weeds, trimming back some plants and enjoying the spring weather. For some reason we got onto the long-term future, and a long-running joke emerged. Our kid would have 18 years with us; when he turned 18, the main present would be a set of moving boxes. Let’s call it a hint. Every now and then the joke serves its purpose as a lightning rod when things don’t go the way we’d like. The remark “well, only X years to go” relieves some of the stress. Nothing more serious than that. Until some co-workers got wind of the joke and asked for more precision. So, I wrote a very small piece of code that returns the number of years, days, hours, minutes and seconds until his 18th birthday.

CREATE OR ALTER PROCEDURE sp_howlong
AS
BEGIN
    DECLARE @birthdate DATETIME;
    SET @birthdate = '2013-01-01 00:00:00'; -- enter the correct birthday here
    SET @birthdate = DATEADD(YEAR, 18, @birthdate); -- the target: the 18th birthday

    WITH getData AS
    (
        SELECT
              CONVERT(VARCHAR(12), DATEDIFF(SECOND, SYSDATETIME(), @birthdate) / 60 / 60 / 24 / 365) AS [Year(s)]
            , CONVERT(VARCHAR(12), DATEDIFF(SECOND, SYSDATETIME(), @birthdate) / 60 / 60 / 24 % 365) AS [Day(s)]
            , CONVERT(VARCHAR(12), DATEDIFF(SECOND, SYSDATETIME(), @birthdate) / 60 / 60 % 24) AS [Hour(s)]
            , CONVERT(VARCHAR(2), DATEDIFF(SECOND, SYSDATETIME(), @birthdate) / 60 % 60) AS [Minute(s)]
            , CONVERT(VARCHAR(2), DATEDIFF(SECOND, SYSDATETIME(), @birthdate) % 60) AS [Second(s)]
    )
    SELECT CONCAT_WS(':', [Year(s)], [Day(s)], [Hour(s)], [Minute(s)], [Second(s)]) AS [This is how long…]
    FROM getData;
END;

What this procedure does is take the birthdate, add 18 years because that’s the target, and then calculate the differences in the SELECT using integer division and the modulo operator (the % sign). As I’m converting everything to seconds, I can work my way down from years to seconds by changing the divisors and the modulo.

I’ve used this technique in some customer cases as well, to determine whether a certain record had passed its valid date or not.
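For example, a hedged sketch of that kind of validity check, against a hypothetical dbo.Contracts table, might look like this:

-- Hedged illustration: flag records whose valid-until date has already passed
SELECT ContractId,
       ValidUntil,
       CASE WHEN DATEDIFF(SECOND, SYSDATETIME(), ValidUntil) <= 0
            THEN 'Expired'
            ELSE 'Still valid'
       END AS ValidityStatus
FROM dbo.Contracts;  -- hypothetical table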

The second one is more work-related but fun nonetheless. It’s not one I thought of myself; it was heavily inspired by the work of Brent Ozar. I’m a great believer in attribution, and as this is mostly his work, check out the link to get a quick working setup and adjust it to your needs.


The reason for this script came from a customer who wanted to know whether all databases were up and running after a server went through a failover, reboot or whatever. We discussed the issue briefly and, having paid attention in Brent’s classes, I came up with a procedure that runs after startup and checks the state of all the databases. If all the databases are online and running, it sends an email stating everything is OK. If one of the databases didn’t get back to the normal state, the email has a line for each database with the state it was in when the procedure ran. Of course, this isn’t watertight and fails if either the mail server is down or the server never returns to normal running, but that is monitored elsewhere.

Now this script has been running for years and just one simple ‘out of order’ message has been seen: Database Offline. Every other database has resumed without hesitation or error. Yes, some database servers are just summer children.
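Something in the same spirit might look like the sketch below. It’s my own minimal illustration, not the actual script: it assumes SQL Server 2017 or later for STRING_AGG, a Database Mail profile named DBA_Mail, and a made-up recipient address, and the procedure has to live in master for sp_procoption to register it as a startup procedure. A real version would probably also wait a while for databases to finish recovering before checking.

-- Minimal sketch: report database states after startup via Database Mail
CREATE OR ALTER PROCEDURE dbo.usp_CheckDatabasesAfterStartup
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @body NVARCHAR(MAX);

    -- Collect every database that is not ONLINE into a line-per-database message
    SELECT @body = STRING_AGG(CONCAT(name, ' is ', state_desc), CHAR(13) + CHAR(10))
    FROM sys.databases
    WHERE state_desc <> 'ONLINE';

    IF @body IS NULL
        SET @body = N'All databases are online.';

    EXEC msdb.dbo.sp_send_dbmail
        @profile_name = 'DBA_Mail',              -- hypothetical mail profile
        @recipients   = 'dba-team@example.com',  -- hypothetical recipient
        @subject      = 'Database state check after startup',
        @body         = @body;
END;
GO

-- Register the procedure to run automatically when SQL Server starts
EXEC sp_procoption @ProcName = N'dbo.usp_CheckDatabasesAfterStartup',
                   @OptionName = 'startup',
                   @OptionValue = 'on';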

So without further ado, time to hit your keyboard and write about your fun scripts and code.

T-SQL Tuesday #160: Microsoft OpenAI Wishlist

Invitation and round-up from Damien Jones.

Introduction

Artificial Intelligence has been a big deal in recent months. One of the main drivers of this has been OpenAI, whose DALL-E 2 and ChatGPT services have seen extraordinary public interest and participation.

ChatGPT is currently the fastest-growing consumer application in history. It reached 100 million users in its first two months, and has been integrated into numerous applications. One such example is the recent version of DBeaver that I tried out in my previous post.

Microsoft has been one of OpenAI’s most prominent supporters. In July 2019 Microsoft invested $1 billion in OpenAI and became their exclusive cloud provider.

In January 2023 Microsoft announced the latest phase of its multibillion-dollar investment partnership with OpenAI and the general availability of Azure OpenAI Service. Since then, Microsoft announced that it is building AI technology into Microsoft Bing, Edge and Microsoft 365.

My invitation for this month’s T-SQL Tuesday is:

What is on your wishlist for the partnership between Microsoft and OpenAI?

This can include all Microsoft products and services, like: