Happy New Year!! Even though the year 2020 was full of surprises, I managed to accomplish a few things:
I can’t believe it’s already been a year since BDCs went GA (general availability). Time sure does fly by!
I don’t recall how I came across this Kubernetes IDE called Lens, but all I know is it’s cool as heck! It connects to a Kubernetes cluster (using the kubeconfig file) and gives you an in-depth view of all the different Kubernetes objects, their associated YAML files, health/metrics, etc. In this blog post I will show you how we can look into a Big Data Cluster’s Kubernetes infrastructure using Lens.
First let’s install Lens.
I remember when I first started deploying Big Data Clusters, they were on Azure Kubernetes Service utilizing the $200 credit for first-time sign-ups. By the time I got around to figuring out how to deploy the BDC, not only was my $200 credit gone, but I had started to incur out-of-pocket costs.
If only there were a feature that would allow me to stop the VMs in AKS whenever I wasn’t using them. Well, I’m excited to share that Microsoft AKS (Azure Kubernetes Service) came out with a neat feature (in preview at the time this post was published) that allows you to stop and start your AKS cluster by running a simple command. Of course I had to try it out on BDCs and to my surprise it worked. Well, sort of. Let me explain…
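To give a rough idea of what that "simple command" looks like, here is a minimal Python sketch that builds the Azure CLI call for stopping or starting a cluster. It assumes the `az` CLI is installed and logged in; the helper name `aks_power_command` and the resource group and cluster names are placeholders I made up for illustration, not anything from a real deployment.

```python
import shlex

# The two Azure CLI subcommands this sketch covers: `az aks stop` and `az aks start`.
VALID_ACTIONS = ("stop", "start")


def aks_power_command(action, resource_group, cluster_name):
    """Build the argument list for `az aks stop` or `az aks start`.

    Hypothetical helper: it only assembles the command and does not run it.
    """
    if action not in VALID_ACTIONS:
        raise ValueError(f"action must be one of {VALID_ACTIONS}, got {action!r}")
    return [
        "az", "aks", action,
        "--name", cluster_name,
        "--resource-group", resource_group,
    ]


# Placeholder names; substitute your own resource group and cluster.
stop_cmd = aks_power_command("stop", "my-bdc-rg", "my-bdc-aks")
print(shlex.join(stop_cmd))

# To actually run it (requires the az CLI on your PATH):
#   import subprocess
#   subprocess.run(stop_cmd, check=True)
```

Swapping `"stop"` for `"start"` builds the command that brings the cluster back up. Keeping the command as a list makes it easy to hand to `subprocess.run` without worrying about shell quoting.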
Are you a data professional and curious about Kubernetes but not quite sure what type of opportunities are available? Maybe you’re hesitant because you think Kubernetes is a “fad”? Or perhaps you’re just starting out in IT and don’t know what path to take?
Whether you are new to IT, or a seasoned IT professional, pondering these questions can be exhausting. In this blog post I will go over the importance of learning Kubernetes and how it can massively level up your career!
First, a little primer.
About two years ago I started coming across a lot of online chatter on “containers” and “Kubernetes”. This was back in 2018 and around that time I had no interest in learning about it because it had no direct connection to SQL Server and my daily job duties as a DBA. Up until that point I had been working with SQL Server for about ten years. So like most people, “containers and Kubernetes” went in one ear and out the other.
That all changed with the hype, and eventual release, of SQL Server 2019. SQL Server 2019 introduced a feature called “Big Data Clusters”. This new feature really intrigued me because it was something completely different. I started to hear those terms again (containers and Kubernetes) because those are the technologies behind Big Data Clusters. Over the next year, I heavily blogged, spoke, and created video content on Big Data Clusters. As a result of my deep passion and promotion of the product, I was awarded Microsoft MVP. My journey didn’t stop there as I have a natural “thirst for knowledge” and had to learn more about the underlying technology that makes Big Data Clusters feasible in the first place: Kubernetes.
So I started to study for the Certified Kubernetes Administrator (CKA) exam.
On July 1, 2020, I was awarded the Microsoft MVP in the Data Platform category. That was a truly elevating moment for me and this blog post is intended to share my journey over the past year.
This whole “social distancing” thing is a perfect time to spruce up the resume. What better way than to add a new certificate? I recently took the Microsoft AZ-900 Azure Fundamentals exam and want to share the two-pronged approach I took to pass.
Before I go into that, I want to talk about the importance of the AZ-900 exam. This is my first certificate in Azure. I have some experience working with Azure through deploying Big Data Clusters on Azure Kubernetes Service. So not every concept was new to me, but I still decided it would be a good idea to expand my fundamental knowledge of Azure.
There are a few server-wide configurations that you cannot set up during initial Big Data Cluster deployment. One of those is enabling the SQL Server Agent service. That’s right. If you have deployed a Big Data Cluster you will notice the SQL Server Agent is disabled by default (see screenshot below):
In my previous posts, I showed you how to deploy a single node cluster and a multi-node cluster. That’s fine and dandy but how do you upgrade to the newest SQL Server CU? This blog post will show you how to easily upgrade a SQL Server Big Data Cluster. This method applies to a single node or multi-node cluster. It does not matter how many nodes your BDC has, this upgrade process will work.
In my previous post, I talked about deploying a Big Data Cluster on a single node Kubernetes cluster. That’s cool and all but what if you’re a business or organization that cannot have your data on the cloud for whatever reason? Is there a way to deploy a Big Data Cluster on-premises? Absolutely! In this blog post I will walk you through deploying a 3-node Kubernetes cluster, then deploying a Big Data Cluster on top of that.
One of my 2020 goals was to start presenting more. The easiest, and most cost-efficient, way to do that is by presenting remotely. Another goal I had was to start creating Big Data Cluster videos for my YouTube channel. These two goals required a different set of tools. I also had no idea how complicated video recording could get. I finally got everything set up the way I want and a few friends asked me about my setup. This blog post will list everything I use and lessons learned.
So you want to play around with a Big Data Cluster but are on a strict budget? No problem! Keep reading or watch the video below.
If you are interested in deploying a Big Data Cluster on a multi-node kubeadm cluster, check out my post here.
One of the key components of a Big Data Cluster is the data pool. Within that single data pool, there are two SQL Server instances. The primary job of the data pool is to provide data persistence and caching for the Big Data Cluster. (At the time of this blog post, there can only be a single data pool in a Big Data Cluster and the maximum supported number of instances in a data pool is eight.) The instances inside the data pool do not communicate with each other and are accessed via the SQL Server Master instance. The data pool instances are also where data marts are created.
HAPPY NEW YEAR!
Wow! Time flies by so fast! I can’t believe it’s 2020. I’ve been posting these “blog stats” for a couple of years now and it’s always been bittersweet when I post them. On one hand it reminds me that time is moving so fast (and in some cases I want it to slow down a bit). On the other hand I get to reflect on the year that passed and be grateful for all the beautiful things that happened in my life.
Below is a recap of 2019: