T-SQL Tuesday #162 – Data Science in the time of ChatGPT

Invitation from Tomaz Kastrun.

Instead of writing and asking Data science questions, let’s discuss the aspects of Data science with the presence of Chat GPT 4.0.

By now, everyone knows that Chat GPT is a language model (LLM – Large Language Model) based on the GPT (Generative Pre-trained Transformer) architecture. It uses deep learning algorithms, such as neural networks with billions of weights and transformers, to generate the sequence of tokens that makes up a piece of text. Transformers introduce the concept of “paying attention” to build better sequences of text in general. The model operates primarily on the probabilities of words and their sequences, and is therefore good at producing human-like responses to natural language queries, making it great for a conversation-like experience.

There are many caveats hidden in the processing of text: adjustments of weights, activation functions (different and tweaked versions of ReLU), additional corpora, and the billions of texts used for model training.

I have prepared two groups of questions. I will not debate whether the end of data science is near, nor whether AGI (artificial general intelligence) will completely replace the role of data scientists. What I want to hear from you is simply how you embraced (if at all) the use of Chat GPT, and what your first impressions were. And mostly, how did it help you (if at all), what did you use it for, and have you encountered any traps?

Usage and working alongside Chat GPT

Imagine using SQL, R, Python, Julia, or Scala for your daily data science work. You can ask Chat GPT practically anything and it will return a relatively coherent and good answer. If you need an explanation, it will excel. Where and what have you used it for? Here is a short list that might get you started:

  1. Explain a data science algorithm
  2. Help tune or create SQL code to query big data
  3. Prepare R, Python, Scala code for exploring the data
  4. Help you prepare the training of the model in desired language
  5. Prepare the code for hyperparameter tuning and cross-validation
  6. Ask for data visualization for given dataset
  7. Help create dashboard
  8. Create code for model deployment, model re-training or model consumption
  9. Prepare custom functions and algorithm/function adjustments

Now that you have gone through the list and found where it helped you, I would like to understand how it helped you. Feel free to make a general comparison and add some explanations. And lastly, of course, add whether this has affected your work as a data scientist in any way (whether by embracing it in a positive way or through a negative experience).

Responsible usage

We have seen many controversies around Chat GPT emerge. Some European Union countries have banned it, and some will soon be doing so too. The question is not only its use (as the end of humanity and empathy) but also the misuse of personal data, privacy issues and the leaking of relevant corporate information.

Have you considered responsible usage of Chat GPT? Here is again a short list to help you:

  1. The use of personal data retrieved from the model
  2. Inserting sensitive (personal or company) data
  3. Explaining the section of R, Python, Scala code, that is the property of your enterprise

Instead of this, have you tried using it more responsibly:

  1. Using pseudo code for explanation of the algorithm
  2. Using mock data rather than real data
  3. Giving pseudo-code in order to receive the documentation
  4. Leaving out sensitive data (SQL schema, model information, sensitive data)

So which cases have you come across? Did it have any consequences for you? What other responsible uses of Chat GPT have you adopted?

My takeaways

ChatGPT offers interesting answers (based on my experience and searches), and it is the next step from a Google search of Stack Overflow. In other words, it gives you a more focused answer. When exploring and searching forums, you might find several different solutions for a single problem, whereas here you have to ask for another solution. It can also give you an answer faster than browsing the web. Both approaches have their advantages and disadvantages, but neither will assure you that the answer is correct!

I embrace this technology as an additional learning source. But I personally do not use it as my daily driver, despite trying it out a couple of times (with mixed results; working and nonworking/useless/meaningless). It can be super helpful for entry/junior positions, but the more experienced you are, the more abstract your data science work and the more complicated the topics you cover, the less frequently you will presumably use it.

T-SQL Tuesday #161 – Having Fun with T-SQL

Invitation and writeup from Reitse Eskens.

So, what to blog about this month? Well, it’s just been April Fools’ Day and as you’re reading this, you survived. Congratulations! But it did spark a question: what fun are we having with our code? And I’m not talking about commit messages in the Git repository or funny comments inside the code. I’m just as guilty of that as the next programmer, but I’d like to focus on something else.

What are the most fun scripts, procedures or statements that you’ve written? Let me give two examples to set a bit of a stage.

The first fun script I wrote is one that has some history with it. About ten years ago, when my wife was pregnant, we were in the garden discussing the future. We were pulling out some weeds, trimming back some plants and enjoying the spring weather. For some reason we got onto the long-term future, and a long-running joke emerged. Our kid would have 18 years with us, and when he turned 18, the main present would be a set of moving boxes. Let’s call it a hint. Every now and then the joke serves its purpose as a lightning rod when things don’t go the way we like. The remark “well, only X years to go” relieves some of the stress. Nothing more serious than that. Until some co-workers got wind of the joke and asked for more precision. So, I wrote a very small piece of code that returns a number of results: the number of years, days, hours, minutes and seconds until his 18th birthday.

CREATE OR ALTER PROCEDURE sp_howlong
AS
BEGIN
    DECLARE @birthdate DATETIME;
    SET @birthdate = '2013-01-01 00:00:00'; -- enter the correct birthday here
    SET @birthdate = DATEADD(YEAR, 18, @birthdate); -- the target is the 18th birthday

    -- Each column divides the remaining seconds down to one unit and takes
    -- the remainder. A year is approximated as 365 days (leap days ignored).
    WITH getData
    AS ( SELECT
          CONVERT(VARCHAR(12), DATEDIFF(SECOND, SYSDATETIME(), @birthdate) / 60 / 60 / 24 / 365) AS [Years]
        , CONVERT(VARCHAR(12), DATEDIFF(SECOND, SYSDATETIME(), @birthdate) / 60 / 60 / 24 % 365) AS [Days]
        , CONVERT(VARCHAR(12), DATEDIFF(SECOND, SYSDATETIME(), @birthdate) / 60 / 60 % 24) AS [Hours]
        , CONVERT(VARCHAR(2), DATEDIFF(SECOND, SYSDATETIME(), @birthdate) / 60 % 60) AS [Minutes]
        , CONVERT(VARCHAR(2), DATEDIFF(SECOND, SYSDATETIME(), @birthdate) % 60) AS [Seconds])
    SELECT CONCAT_WS(':', [Years], [Days], [Hours], [Minutes], [Seconds]) AS [This is how long…]
    FROM getData;
END;

What this procedure does is take the birthdate as it happened and add 18 years, because that’s the target. The select then calculates the differences using integer division and the modulo operator (the % sign). As I’m converting to seconds, I can work my way down from years to seconds by changing the divisor and the modulo.
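
To make the arithmetic concrete, here is a minimal Python sketch of the same division-and-modulo steps applied to a fixed number of seconds. The function name break_down and the sample value are illustrative assumptions and not part of the original procedure. It assumes a non-negative input and, like the procedure, treats a year as 365 days.

```python
def break_down(total_seconds: int) -> tuple[int, int, int, int, int]:
    """Split non-negative seconds into (years, days, hours, minutes, seconds).

    Mirrors the integer division and modulo steps of the procedure;
    a year is approximated as 365 days, so leap days are ignored.
    """
    minutes_total = total_seconds // 60   # whole minutes
    hours_total = minutes_total // 60     # whole hours
    days_total = hours_total // 24        # whole days
    return (
        days_total // 365,    # year(s)
        days_total % 365,     # day(s) left after whole years
        hours_total % 24,     # hour(s) left after whole days
        minutes_total % 60,   # minute(s) left after whole hours
        total_seconds % 60,   # second(s) left after whole minutes
    )


# 1 year + 2 days + 3 hours + 4 minutes + 5 seconds, expressed in seconds
example = ((365 + 2) * 24 + 3) * 3600 + 4 * 60 + 5
print(":".join(str(part) for part in break_down(example)))  # 1:2:3:4:5
```

Each unit is the remainder left over after the next larger unit has been divided out, which is why only the divisor and the modulo change from column to column.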

I’ve used this technique in some customer cases as well to determine if a certain record had expired its valid date or not.

The second one is more work related but fun nonetheless. It’s one I didn’t think of myself; it was heavily inspired by the work of Brent Ozar. I’m a great believer in attribution, and as this is mostly his work, check out the link to get a quick working setup and adjust it to your needs.


The reason for this script came from a customer who wanted to know if all databases were up and running after a server went into a failover, reboot or whatever. We discussed the issue briefly and, having paid attention in Brent’s classes, I came up with a procedure that runs after startup and checks the state of all the databases. If all the databases are online and running, it will send an email stating everything is OK. If one of the databases didn’t get to the normal state, the email will have a line for each database with the state it was in when the procedure ran. Of course, this isn’t watertight and fails if either the mail server is down or the server never returns to normal running, but that is being monitored elsewhere.
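
The decision logic described above can be sketched in a few lines of Python. This is not the actual procedure, which is T-SQL and is not shown in the post. The function name build_status_mail, the subject lines and the sample database names are all hypothetical. The state names follow the state_desc values that sys.databases reports, and actually sending the mail is left out.

```python
def build_status_mail(states: dict[str, str]) -> tuple[str, str]:
    """Return (subject, body) for the post-startup status mail.

    `states` maps a database name to its state description, e.g. "ONLINE".
    """
    problems = [name for name, state in states.items() if state != "ONLINE"]
    if not problems:
        # Every database came back: a single reassuring line is enough.
        return ("All databases online", "Everything is OK.")
    # At least one database is not online: report one line per database.
    lines = [f"{name}: {state}" for name, state in sorted(states.items())]
    return (f"{len(problems)} database(s) not online", "\n".join(lines))


# Hypothetical sample: one database did not come back after the restart.
subject, body = build_status_mail({"Sales": "ONLINE", "Archive": "OFFLINE"})
print(subject)
print(body)
```

Keeping the message building separate from the mail sending makes the rule easy to check without a mail server.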

Now this script has been running for years and just one simple ‘out of order’ message has been seen: Database Offline. Every other database has resumed without hesitation or error. Yes, some database servers are just summer children.

So without further ado, time to hit your keyboard and write about your funny scripts and code.

T-SQL Tuesday #160: Microsoft OpenAI Wishlist

Invitation and round-up from Damien Jones.

Introduction

Artificial Intelligence has been a big deal in recent months. One of the main drivers of this has been OpenAI, whose DALL-E 2 and ChatGPT services have seen extraordinary public interest and participation.

ChatGPT is currently the fastest-growing consumer application in history. It reached 100 million users in its first two months, and has been integrated into numerous applications. One such example is the recent version of DBeaver that I tried out in my previous post.

Microsoft has been one of OpenAI’s most prominent supporters. In July 2019 Microsoft invested $1 billion in OpenAI and became their exclusive cloud provider.

In January 2023 Microsoft announced the latest phase of its multibillion-dollar investment partnership with OpenAI and the general availability of Azure OpenAI Service. Since then, Microsoft announced that it is building AI technology into Microsoft Bing, Edge and Microsoft 365.

My invitation for this month’s T-SQL Tuesday is:

What is on your wishlist for the partnership between Microsoft and OpenAI?

This can include all Microsoft products and services, like:

T-SQL Tuesday #159 – What’s Your Favorite New Feature?

Invitation and wrap up from Deepthi Goguri.

This month, I am inviting everyone to blog about two topics:

  1. Blog about your new favorite feature in SQL Server 2022 or in Azure. Why is it your favorite feature, and what are your experiences and learnings from exploring it? If you have not explored these new features yet, no worries! Blog about the features you are interested in exploring.
  2. New year, new resolutions. What are your new year resolutions, and how do you keep up the discipline day after day? Here are some examples: a new hobby, a plan to spend more time on physical activity, a list of books you want to read (please mention the titles so they may inspire others to read those books too), journaling, or any other resolutions you plan for this year.

Here are my answers to the above questions:

  1. I am looking forward to learning about my favorite feature, Query Store, and its advancements in SQL Server 2022. Query Store now supports read-only replicas in availability groups. The other advancement in Query Store is Query Store hints. I have written a blog post about it here. The other new feature is parameter sensitive plan optimization, where multiple plans are stored in the plan cache for a single stored procedure, reducing parameter sniffing problems.
  2. This year, my resolution is to include exercise in my daily routine and to read David Goggins’s book “Can’t Hurt Me” all over again before I begin his second book, “Never Finished”. It is getting harder to keep up the exercise discipline. I have had my gaps, but I know I will get back on track. I believe it is all about doing your best when you feel the worst. I am looking forward to hearing about your resolutions and your discipline in following them day in and day out.

If you are looking for the latest features in SQL Server 2022, follow this series of videos by Bob Ward and Anna Hoffman explaining the new capabilities and features for SQL Server 2022. For new features in Azure, please check Azure SQL updates here and general overall Azure updates here.

T-SQL Tuesday #158, Implementing Worst Practices

Invitation from Raul Gonzalez.

One of the most repeated answers to almost any question asked within the SQL Server community is that “everything depends”… Can that also apply to known best practices?
 

Furthermore, is it possible that some of the commonly agreed “worst practices” have indeed some use case where they can be useful or suit an edge use case?
 

This month I am asking you to write about those not-so-common practices that you may have implemented at some point and the reasons behind them. I have a few in my pocket that will make more than one person a bit uncomfortable 😀

T-SQL Tuesday #157 – End of Year Activity

This month’s invitation and recap from Garry Bargsley.

Welcome to the final T-SQL Tuesday for 2022. My ask is, what do you have planned for end-of-year activities for your SQL environment? Do you have annual processes or procedures you run? Do you clean up documentation? Do you just take time off and hope someone else does the work?

Some Examples:
  • Purge log data
  • Archive databases for long term
  • Look for orphaned data/log files on your SQL Servers
  • Do Security analysis for no longer needed accounts
  • Add the new year’s dates to dimension tables
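
As an illustration of the last item, here is a minimal Python sketch that generates one row per day of a given year, ready to be loaded into a date dimension. The function name date_dimension_rows and the column names are assumptions for the example. A real dimension table will have its own schema and loading process.

```python
from datetime import date, timedelta


def date_dimension_rows(year: int) -> list[dict]:
    """Build one row per calendar day of `year` for a date dimension."""
    rows = []
    day = date(year, 1, 1)
    while day.year == year:
        rows.append({
            "date_key": int(day.strftime("%Y%m%d")),  # e.g. 20230101
            "full_date": day.isoformat(),             # e.g. "2023-01-01"
            "year": day.year,
            "quarter": (day.month - 1) // 3 + 1,      # 1 to 4
            "month": day.month,
            "day_of_week": day.isoweekday(),          # 1 = Monday, 7 = Sunday
        })
        day += timedelta(days=1)
    return rows


rows_2023 = date_dimension_rows(2023)
print(len(rows_2023), rows_2023[0]["date_key"], rows_2023[-1]["date_key"])
```

Generating the rows from the calendar, rather than typing them in, takes care of leap years without any special handling.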

T-SQL Tuesday #156 – Production Code

Invitation from Tom Zika.

I’m a learner by example, so when I started programming (not so long ago), I tried to find existing solutions on various Q&A sites or blogs, as one might.

After a while, I noticed one sentence repeating often enough that it stuck with me:

“This is not a production-grade code”.

So here’s my invitation: “Which quality makes code production grade?”

You might think: “Production code is code that runs in production, duh.”

But let’s help out the newbies who look for a bit of concrete guidance.
Please be as specific as possible with your examples and include your reasoning.

I’m not limiting the scope to just the SQL; it can be anything.

T-SQL Tuesday #155 – The Dynamic Code Invitation

Invitation from Steve Jones.

I saw a post recently where someone noted they used Excel to help build dynamic SQL for their job. I thought that was a) creative, and b) similar to something I’ve done. In fact, that will be my post for this month.

However, while many of the experts decry dynamic SQL as a poor way of solving problems, it is not going away. In fact, it works really well for many situations and problems, albeit not necessarily with high volumes of data. There are also security concerns.

My invitation this month is to write about producing SQL dynamically in some way. Let us know about any of these things:

  • a problem you solved
  • a creative use of technology to build SQL
  • security concerns
  • a place where dynamic SQL failed you
  • a way to convert dynamic SQL to something cleaner
  • anything else that relates to code producing code
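
As one example of code producing code with the security concern in mind, here is a minimal Python sketch that builds a SELECT statement from a table name and a column list. The names quote_name and build_select, the allow-list and the [id] filter column are hypothetical. The quoting rule follows the same idea as T-SQL’s QUOTENAME (wrap in brackets, double any closing bracket), and the filter value stays a parameter rather than being concatenated into the text.

```python
def quote_name(identifier: str) -> str:
    """Bracket-quote an identifier, doubling any closing bracket."""
    return "[" + identifier.replace("]", "]]") + "]"


def build_select(table: str, columns: list[str], allowed_tables: set[str]) -> str:
    """Generate a parameterized SELECT for an allow-listed table."""
    if table not in allowed_tables:
        raise ValueError(f"table not allowed: {table!r}")
    if not columns:
        raise ValueError("at least one column is required")
    column_list = ", ".join(quote_name(column) for column in columns)
    # The value to filter on is left as the @id parameter, to be supplied
    # by the caller when the statement is executed.
    return f"SELECT {column_list} FROM {quote_name(table)} WHERE [id] = @id;"


print(build_select("Orders", ["id", "total"], {"Orders", "Customers"}))
# SELECT [id], [total] FROM [Orders] WHERE [id] = @id;
```

Identifiers cannot be passed as parameters, so they are checked against an allow-list and quoted, while values are never spliced into the generated text.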

T-SQL Tuesday #154 – SQL Server 2022

Invitation from Glenn Berry.

Your mission, if you choose to accept it, is to write about what you have been doing (if anything) with SQL Server 2022. Maybe you have been doing a lot of testing with the public CTP builds and now RC0. Perhaps you have not had the time (or interest) in doing any SQL Server 2022 testing or research. Whatever you have been doing, now is your chance to talk about it, good or bad.

Maybe you can’t wait for SQL Server 2022 to be released, or perhaps you couldn’t care less about it. I would love to hear what your experience has been and what your opinions are about this release. If SQL Server 2022 is not on your future roadmap, please tell me why.

Just in case you haven’t heard much about SQL Server 2022, here are a few links from Microsoft.

T-SQL Tuesday #153 – The Conference That Changed Everything For Me

Invitation and roundup from Kevin Kline.

My invitation is about the social side of life as a data professional, specifically conferences and events. As one of the original nine founders of PASS and an early president of the association from 2004-2010, I always looked forward to the fall and the yearly PASS Summit. The leadership of PASS always worked hard to make the event feel like not only the best SQL Server and Azure SQL training event, but also like a big family reunion. (Check out the #sqlfamily hashtag on social media. It’s a thing!) Many bloggers have already written about #sqlfamily goodness from their attendance at the PASS Summit. Maybe we will see a couple of those blogs reposted?

With the pandemic lockdowns of the last couple of years behind us, we might not have many recent examples from the past two years. On the other hand, we have so much to look forward to this fall! For many of us in this industry, conferences and events are the high point of our yearly business cycles. And with good reason, because we attended an event that had a lifelong positive impact on us. The invitation –

Tell us the story of how attending an IT conference or event resulted in an amazing career or life opportunity.

Of course, job and career opportunities readily jump to front of mind. But I ask you not to limit yourself to stories solely focused on career opportunities or job changes. I’ve seen so many other great outcomes happen for people who attended events like the PASS Summit, SQLBits, and others. I’ve even seen people meet, fall in love, and begin their journey as a couple because they both attended the same event. How wonderful is that?!?

Here are some other ideas. Did you connect with a new group of friends who are now a constant part of your life? Maybe you found a mentor who helped you in a multitude of ways? Perhaps you attended an event that started as the launch point of your own personal advancement as a speaker, blogger, writer, mentor, or community leader? Or maybe you learned an entirely new set of skills that amplified your success at your current job? It could be something even simpler, like finding out about a new musical act, writer, or artist who now is your favorite! Whatever your story might be, I would love to read your blog post about the conference that changed everything for you.

I hope I have inspired you to participate. Your action might then inspire others, starting a virtuous cycle of continued improvement for our community. Now, read the blog party rules below and get started!