One of the key components of a Big Data Cluster is the data pool. Within that single data pool, there are two SQL Server instances. The primary job of the data pool is to provide data persistence and caching for the Big Data Cluster. (At the time of this blog post, there can only be a single data pool in a Big Data Cluster, and the maximum supported number of instances in a data pool is eight.) The instances inside the data pool do not communicate with each other and are accessed via the SQL Server master instance. The data pool instances are also where data marts are created.
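Once a cluster is up, each data pool instance runs in its own pod, so you can see them from `kubectl`. A minimal sketch, assuming `kubectl` is pointed at the cluster and the default namespace name `mssql-cluster` (the pod and namespace names below are the defaults; yours may differ):

```shell
# List the data pool pods (named data-0-0, data-0-1, ... by default)
kubectl get pods -n mssql-cluster | grep "^data-"

# Each data pool pod hosts its own SQL Server instance; inspect one:
kubectl describe pod data-0-0 -n mssql-cluster
```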
HAPPY NEW YEAR!
Wow! Time flies by so fast! I can’t believe it’s 2020. I’ve been posting these “blog stats” for a couple of years now, and it’s always bittersweet when I post them. On one hand, it reminds me that time is moving so fast (and in some cases I want it to slow down a bit). On the other hand, I get to reflect on the year that passed and be grateful for all the beautiful things that happened in my life.
Below is a recap of 2019: Continue reading “Blog Stats – 2019”
Instead of boring you with content that you can easily find on Amazon, I will do something different in this review post. I will give you three reasons why I recommend this book for SQL DBAs and data professionals. It doesn’t matter if you are a beginner or an expert. This book will be a great addition to your library regardless of your level of SQL Server knowledge.
A few months ago I posted a blog on deploying a BDC using the built-in ADS notebook. This blog post will go a bit deeper into deploying a Big Data Cluster on AKS (Azure Kubernetes Service) using Azure Data Studio (version 1.13.0). In addition, I’ll go over the pros and cons and dive deeper into the reasons why I recommend going with AKS for your Big Data Cluster deployments.
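For reference, the AKS cluster itself can be stood up from the Azure CLI before Azure Data Studio takes over the BDC deployment. A hedged sketch; the resource group name, cluster name, region, node count, and VM size below are illustrative placeholders, not values from the post:

```shell
# Create a resource group and a 3-node AKS cluster (names and sizes are examples)
az group create --name bdc-rg --location eastus
az aks create --resource-group bdc-rg --name bdc-aks \
  --node-count 3 --node-vm-size Standard_D8s_v3 --generate-ssh-keys

# Merge credentials into ~/.kube/config so kubectl (and ADS) can reach the cluster
az aks get-credentials --resource-group bdc-rg --name bdc-aks
```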
One of the biggest questions I had when I first started diving into Big Data Clusters was, “What about licensing… how will that work?” With so many different instances running on the storage pool, data pool, and compute pool nodes, will licensing cost too much? The answer I got from Microsoft was that it will “be competitive.”
Before you deploy a Big Data Cluster, you must configure the tools below on a Windows or Linux machine that will act as a “base machine” from which you will deploy, manage, and monitor a SQL Server Big Data Cluster. For the example in this blog, I will use a virtual machine running Windows Server 2016 with 4 cores and 8 GB RAM. (This also works on a Windows 10 Pro machine.)
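Once the tools are installed on the base machine, a quick sanity check is to confirm each one is on the PATH. A sketch; the exact version output will vary by release:

```shell
azdata --version          # BDC management CLI (called mssqlctl in earlier CTPs)
kubectl version --client  # Kubernetes CLI
az --version              # Azure CLI (only needed for AKS deployments)
```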
A few months ago I posted a blog on deploying a BDC using the built-in ADS notebook. This blog post will talk about the deployment options available for Big Data Clusters and benefits of going with Azure Kubernetes Service (AKS).
I love to share real-life stories when I give talks. I usually start out my sessions with a story about how I got interested in Big Data Clusters. It all starts with my neighbor Tom (not his real name) last year (2018). I was at the bus stop with my 5-year-old son on his first day of kindergarten. As I waited patiently for the bus to arrive, I heard a voice say,
“Are you in IT?”
I finally presented my first tech session at SQL Saturday Sioux Falls and Baton Rouge. I decided to name my session “Big Data Clusters for the Absolute Beginner” because at the beginning of 2019 I *was* the absolute beginner (and I’m learning every day!). So in a way, my session is “based on a true story” :) I’m super excited because I think it’s the future of SQL Server. Big Data Clusters are new, cutting-edge technology.
— Anthony E. Nocentino (@nocentino) August 17, 2019
I’m super excited to share some information regarding my upcoming SQL Saturday sessions at Sioux Falls and Baton Rouge.
Yesterday was the release of SQL Server 2019 CTP 3.2. The biggest change in CTP 3.2 is that Big Data Clusters is now in public preview. That means anyone can go download and deploy it. Prior to CTP 3.2, you had to sign up for the “Early Adoption Program,” wait until you received an email with your Docker credentials, and so on. With CTP 3.2, Microsoft has done away with Docker credentials entirely. You no longer need them to create your Big Data Cluster, as the images you need are on Microsoft’s public repo.
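In practice, that means the deployment now needs only your cluster admin credentials, not Docker ones. A sketch of a CTP 3.2-era deployment; the flag and profile names changed between CTPs, so treat them as illustrative rather than exact:

```shell
# Credentials for the BDC admin account (no DOCKER_* variables required anymore)
export AZDATA_USERNAME=admin
export AZDATA_PASSWORD='<strong password>'

# Deploy using a built-in configuration profile (profile name is illustrative)
azdata bdc create --config-profile aks-dev-test --accept-eula yes
```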
Yay!! SQL Server on Windows Containers! Hey, wait a minute, I thought you could already install Docker Desktop on Windows? Yes, but behind the scenes, Docker uses Hyper-V to create a Linux VM (called MobyLinuxVM). So even though it ran on a Windows machine, it still used a Linux VM on the backend.
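You can check which backend your Docker install is actually using. A small sketch, assuming the Docker daemon is running:

```shell
# Prints "linux" when Docker Desktop is running Linux containers
# (via the MobyLinuxVM), or "windows" when using Windows containers.
docker info --format '{{.OSType}}'
```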
This is part 4 of the “BDC series.” You can read part 1 here, part 2 here, and part 3 here. This blog post will go into the tools available to monitor the health of your Big Data Cluster. If you’d like to stay updated, without doing the heavy work, feel free to register for my newsletter. I will email out blog posts of my journey down the wonderful road of BDCs. [Updated for CTP 3.2] – There are kubectl commands and azdata commands to check the health of your cluster, but I will focus on the Kubernetes Dashboard for this series. I will blog about some of the useful kubectl and azdata commands in later posts.
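Alongside the Kubernetes Dashboard, a few of the command-line health checks look like this. A sketch, assuming the default `mssql-cluster` namespace; the `azdata` subcommands shifted between releases, so take the last line as illustrative:

```shell
# Pod-level health of everything in the BDC namespace
kubectl get pods -n mssql-cluster

# Logs from a specific pod, e.g. the controller (pod name is the default)
kubectl logs control-0 -n mssql-cluster --all-containers

# Overall BDC health from the management CLI
azdata bdc status show
```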
This is part 3 of the “BDC series.” You can read part 1 here and part 2 here. This blog post will go into creating the Big Data Cluster on top of the Azure Kubernetes Service (AKS) cluster we created in Part 2. If you’d like to stay updated, without doing the heavy work, feel free to register for my newsletter. I will email out blog posts of my journey down the wonderful road of BDCs.
Before I get started, I want to say that there are many ways to deploy a Big Data Cluster. There is a “default configuration” way and a “custom configuration” way. You can read more about the custom config way here. I will be posting blogs on the other ways to deploy a BDC, but for the sake of this series I will be deploying the BDC the default way. The BDC team at Microsoft is constantly revamping and tweaking the BDC deployment process in order to make it more streamlined and easier.
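For the custom route, the general shape is to generate a configuration profile, edit the JSON files it produces, and then deploy from it. A hedged sketch; the profile and file names follow the pattern Microsoft used around this time and may differ in your release:

```shell
# Generate a starting profile into ./custom (produces bdc.json and control.json)
azdata bdc config init --source aks-dev-test --target custom

# Edit custom/bdc.json (replica counts, pool sizes) and custom/control.json
# (storage classes, endpoints), then deploy from the edited profile:
azdata bdc create --config custom --accept-eula yes
```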