SQL Basics and BI Tooling for CS Leaders

Why CS leaders need to understand data tooling

The most common bottleneck in CS reporting is not a lack of data. It is the gap between the people who have questions and the people who can answer them with data. A CS Director who needs to understand why CSAT dropped last month should not have to wait three days for a data analyst to write a query and produce a report. A manager who wants to know which agents are handling the highest volume of escalations this week should not need to submit a ticket to the BI team.

That gap — between operational questions and data answers — is closed in two ways. The first is building a reporting infrastructure that anticipates common questions and makes the answers available through well-designed dashboards. The second is developing enough data literacy in the CS leadership team that they can answer unexpected questions themselves, without depending entirely on analysts or engineers.

This article covers both. It introduces the SQL concepts that CS leaders find most useful without requiring them to become database engineers. It maps the main BI tools available and explains which is appropriate for which context. And it covers how to work effectively with data teams when the question is too complex to answer independently.

The goal is not to turn CS leaders into data analysts. It is to close the gap enough that data becomes a practical tool for operational decision-making rather than a resource that requires specialist intermediation every time it is needed.

What SQL is and why it matters for CS leaders

SQL — Structured Query Language — is the standard language for retrieving and manipulating data stored in relational databases. Most CS operational data — tickets, interactions, agent records, SLA logs, CSAT responses — is stored in relational databases, either in your ticketing system's backend, your data warehouse, or both.

Understanding SQL at a basic level gives CS leaders the ability to ask and answer data questions directly — without waiting for a data analyst, without depending on pre-built reports, and without being limited to the analysis that someone else thought to build into a dashboard.

The level of SQL most useful for CS leaders is not the advanced analytical SQL that data engineers use for complex transformations and pipeline building. It is the foundational retrieval and aggregation SQL that allows you to ask questions of your data in plain terms: how many tickets of type X were created last week, what was the average handle time by tier for the last month, which agents had the highest escalation rate in Q3.

That level of SQL is accessible to anyone who understands the basic structure of a query and a handful of core operations. The concepts below cover the majority of what a CS leader needs for practical data work.

The anatomy of a SQL query

Every SQL query follows the same basic structure. Understanding this structure makes it possible to read, interpret, and write queries without memorising syntax.

sql

SELECT   -- what columns do you want to see?
FROM     -- which table are you reading from?
WHERE    -- which rows do you want to include?
GROUP BY -- how do you want to aggregate the data?
HAVING   -- which groups do you want to filter?
ORDER BY -- how do you want the results sorted?
LIMIT    -- how many rows do you want returned?

Not every query uses every clause. A simple query might use only SELECT, FROM, and WHERE. A more complex aggregation might use all of them. The key is understanding what each clause does so you can construct the query that answers your specific question.

SELECT and FROM: the foundation

SELECT and FROM are the core of every query. SELECT specifies which columns you want to see in the results. FROM specifies which table you are reading from.

sql

SELECT ticket_id, created_at, severity, status, agent_id
FROM tickets

This query returns five columns — ticket ID, creation time, severity, status, and agent ID — for every row in the tickets table. Without a WHERE clause, it returns all rows.

In practice most CS data tables contain many columns. You rarely need all of them. Selecting only the columns relevant to your question makes results easier to read and queries faster to run.

The asterisk (*) is a shorthand for "all columns":

sql

SELECT *
FROM tickets

Useful for exploring a table you haven't worked with before. Not useful for production queries where you know which columns you need.

WHERE: filtering rows

The WHERE clause filters the rows returned by the query to only those matching specific conditions. This is where you specify the time period, severity level, team, channel, or any other dimension you want to focus on.

sql

SELECT ticket_id, created_at, severity, status
FROM tickets
WHERE severity = 'S1'
  AND created_at >= '2026-01-01'
  AND created_at < '2026-04-01'

This query returns S1 tickets created in the first quarter of 2026. The AND operator requires all conditions to be true. The OR operator requires at least one condition to be true.

Common WHERE operators worth knowing:

= and != for exact match and not equal. WHERE status = 'open' or WHERE status != 'closed'.

> and <, >= and <= for numerical and date comparisons. WHERE handle_time_minutes > 15.

IN for matching against a list of values. WHERE severity IN ('S1', 'S2') is equivalent to WHERE severity = 'S1' OR severity = 'S2' but cleaner to write.

BETWEEN for range conditions. WHERE handle_time_minutes BETWEEN 10 AND 20.

LIKE for pattern matching in text fields. WHERE agent_name LIKE 'Smith%' matches any agent name starting with Smith. The % symbol is a wildcard.

IS NULL and IS NOT NULL for identifying missing values. WHERE resolved_at IS NULL returns tickets that have not been resolved.

GROUP BY and aggregate functions: summarising data

The most practically useful SQL for CS leaders involves aggregation — summarising rows of individual records into counts, averages, and totals by category. GROUP BY, combined with aggregate functions, is how this is done.

Aggregate functions perform a calculation across a group of rows and return a single value:

COUNT() counts the number of rows. COUNT(*) counts all rows. COUNT(ticket_id) counts rows where ticket_id is not null.

SUM() adds up the values in a column. SUM(handle_time_minutes) gives total handle time.

AVG() calculates the average. AVG(handle_time_minutes) gives average handle time.

MAX() and MIN() return the highest and lowest values.

GROUP BY divides the rows into groups before the aggregate function is applied, producing one result row per group:

sql

SELECT severity,
       COUNT(*) AS ticket_count,
       AVG(handle_time_minutes) AS avg_handle_time,
       AVG(csat_score) AS avg_csat
FROM tickets
WHERE created_at >= '2026-01-01'
  AND created_at < '2026-04-01'
GROUP BY severity
ORDER BY severity

This query returns one row per severity level, showing the count of tickets, average handle time, and average CSAT score for each. The AS keyword gives each calculated column a readable name in the results. ORDER BY sorts the results by severity.

This type of query answers one of the most common CS reporting questions — how does performance vary by severity, channel, team, or time period — directly from the underlying data.

Practical SQL patterns for CS operations

A small set of query patterns covers the majority of ad hoc analysis needs in a CS operation. These patterns are worth understanding as templates that can be adapted to specific questions.

Counting tickets by time period

sql

SELECT DATE_TRUNC('week', created_at) AS week_start,
       COUNT(*) AS ticket_count
FROM tickets
WHERE created_at >= '2026-01-01'
GROUP BY DATE_TRUNC('week', created_at)
ORDER BY week_start

DATE_TRUNC rounds a timestamp down to the start of the specified period — week, month, day, hour. This pattern produces a week-by-week volume trend that feeds directly into forecasting analysis.

SLA attainment by severity tier

sql

SELECT severity,
       COUNT(*) AS total_tickets,
       SUM(CASE WHEN resolved_within_sla = true THEN 1 ELSE 0 END) AS within_sla,
       ROUND(100.0 * SUM(CASE WHEN resolved_within_sla = true THEN 1 ELSE 0 END) / COUNT(*), 1) AS sla_attainment_pct
FROM tickets
WHERE created_at >= '2026-03-01'
GROUP BY severity
ORDER BY severity

The CASE WHEN expression evaluates a condition for each row and returns a specified value — here converting a boolean field into 1 or 0 so it can be summed. This pattern calculates SLA attainment percentage by severity tier for any time period.

Agent performance summary

sql

SELECT agent_id,
       agent_name,
       COUNT(*) AS tickets_handled,
       AVG(handle_time_minutes) AS avg_handle_time,
       AVG(csat_score) AS avg_csat,
       ROUND(100.0 * SUM(CASE WHEN escalated = true THEN 1 ELSE 0 END) / COUNT(*), 1) AS escalation_rate_pct
FROM tickets
WHERE created_at >= '2026-03-01'
  AND tier = 'T1'
GROUP BY agent_id, agent_name
ORDER BY avg_csat DESC

This pattern produces an agent-level performance summary — ticket volume, average handle time, CSAT, and escalation rate — for a specified period. ORDER BY avg_csat DESC ranks agents from highest to lowest CSAT. Replace DESC with ASC to reverse the order.

Contact driver analysis

sql

SELECT contact_reason,
       COUNT(*) AS ticket_count,
       ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (), 1) AS pct_of_total
FROM tickets
WHERE created_at >= '2026-03-01'
GROUP BY contact_reason
ORDER BY ticket_count DESC
LIMIT 20

The SUM(COUNT(*)) OVER () expression uses a window function to calculate each contact reason's share of total volume — producing a percentage breakdown of contact drivers ranked from most to least frequent. LIMIT 20 returns the top twenty reasons.

Joining tables: combining data sources

Real CS analysis often requires combining data from multiple tables — connecting ticket data with agent data, with customer data, with SLA configuration data. The JOIN operation links rows from two tables based on a shared column value.

sql

SELECT t.ticket_id,
       t.severity,
       t.created_at,
       a.agent_name,
       a.team,
       c.customer_name,
       c.customer_tier
FROM tickets t
JOIN agents a ON t.agent_id = a.agent_id
JOIN customers c ON t.customer_id = c.customer_id
WHERE t.created_at >= '2026-03-01'
  AND t.severity = 'S1'

The table aliases — t for tickets, a for agents, c for customers — make the query more readable when referencing columns from multiple tables. The ON clause specifies which columns link the two tables.

The most common join type is INNER JOIN — which returns only rows where a match exists in both tables. LEFT JOIN returns all rows from the left table and matching rows from the right table, with null values where no match exists — useful when you want to include records even if they don't have a corresponding entry in the joined table.

BI tooling: choosing the right platform

SQL gives you the ability to retrieve and aggregate data. BI tools give you the ability to visualise it, share it, and build dashboards that update automatically without manual query execution. Understanding the main options and their tradeoffs helps CS leaders make sensible decisions about tooling investment.

Google Looker Studio (formerly Data Studio)

Best for: Teams already using Google Workspace, operations with limited BI budget, quick dashboard builds on top of Google Sheets, BigQuery, or standard SaaS connectors.

Looker Studio is free, integrates natively with Google's product suite, and has a low barrier to entry. A CS manager with no prior BI experience can build a functional dashboard in a day. Its limitations are in advanced analytics depth and the number of native data connectors — for complex multi-source dashboards or sophisticated calculated metrics it can feel constraining.

For CS teams using Google Sheets as their primary data store, or those connected to BigQuery, Looker Studio is the natural starting point. It requires no engineering support for basic dashboards and no licensing cost.

Metabase

Best for: Teams that want SQL access alongside visual dashboard building, operations with a moderate technical appetite, organisations that prefer open-source tooling.

Metabase sits in a useful middle ground — it allows non-technical users to build dashboards through a visual query builder while also supporting raw SQL for more complex analysis. Its open-source version can be self-hosted at minimal cost. The cloud-hosted version is moderately priced relative to enterprise BI tools.

For CS operations that want more than Google Sheets integration but don't need enterprise-scale BI infrastructure, Metabase is often the right choice. It is accessible enough for managers to use independently but powerful enough for analytical work that would exceed Looker Studio's capabilities.

Tableau

Best for: Large organisations with dedicated data teams, operations requiring sophisticated visualisation, organisations that need to publish dashboards to wide internal audiences.

Tableau is the most powerful visualisation tool in common use — capable of handling complex data models, advanced calculated fields, and highly polished visual output. Its limitation is cost and complexity. Tableau licenses are expensive, implementation typically requires dedicated analyst support, and the full capability of the tool takes significant time to learn.

For most CS operations, Tableau is more tool than is needed. It is the right choice when the organisation already has a Tableau investment and a data team to support it, or when the reporting requirements are sophisticated enough that simpler tools have been genuinely outgrown.

Looker (the BI platform, not Looker Studio)

Best for: Enterprise organisations with data engineering teams, operations that need a governed semantic layer, organisations requiring strong data access controls and auditability.

Looker — distinct from Looker Studio — is an enterprise BI platform that emphasises a centralised semantic layer: a defined, governed model of what each metric means and how it is calculated, shared across all dashboards and reports. This approach ensures that every dashboard in the organisation uses the same definition of CSAT, the same calculation of SLA attainment, the same customer segmentation logic.

For CS operations in large enterprises where metric definition inconsistency is a real problem — different teams reporting different CSAT numbers because they calculate it differently — Looker's semantic layer approach solves a genuine issue. For smaller operations, the implementation overhead is rarely justified.

Zendesk Explore and native ticketing system analytics

Best for: CS operations that want reporting without leaving the ticketing system, teams with limited technical resources, standard reporting needs that don't require cross-system data integration.

Most major ticketing systems — Zendesk, Intercom, Freshdesk — include native analytics and reporting functionality. Zendesk Explore, for example, provides pre-built dashboards and a query builder that generates reports directly from Zendesk data without requiring external BI tooling.

Native analytics tools are the right starting point for teams that primarily need to report on data that exists within their ticketing system. Their limitation is that they cannot easily incorporate data from other systems — WFM platforms, HR systems, financial data — which makes them insufficient for the financial dimension of the balanced scorecard or for any analysis that requires joining ticketing data with external sources.

Working effectively with data teams

For analysis that goes beyond what a CS leader can build independently — complex multi-source models, automated pipeline builds, sophisticated statistical analysis — working with a data engineering or analytics team is necessary. The quality of that collaboration significantly affects how quickly operational questions get answered.

A few practices that make the collaboration more effective:

Specify the question, not the query. Data analysts are most effective when given a clear business question — "I need to understand whether our CSAT drop last month is concentrated in a specific customer segment or severity tier" — rather than a technical specification. Translating operational questions into specific, answerable form is the CS leader's job. Deciding how to answer them technically is the analyst's job.

Provide context about the decision. Analysts who understand what decision the analysis will inform produce more useful work than those who receive a request without context. "I need this to decide whether to invest in additional T2 headcount for APAC" produces a more targeted analysis than "I need a breakdown of APAC performance."

Agree on definitions upfront. Before any analysis begins, align on how key metrics are defined. What counts as a resolved ticket? Does AHT include after-contact work? How is escalation defined in the data? Misaligned definitions are the most common source of analysis that produces surprising results — and surprises in analytical output usually mean a definition disagreement, not a genuine operational insight.

Review outputs before sharing widely. Analytical outputs should always be reviewed by someone with operational context before being shared with leadership. Analysts can produce technically correct results that are operationally misleading — because they didn't know about a data quality issue, a policy change that affected a metric definition mid-period, or a one-off event that should be excluded from trend analysis.

Build toward self-service. The most effective data partnerships result in CS leaders becoming progressively more independent — not through replacing analysts with automated tools, but through the gradual transfer of knowledge about data structure, metric definitions, and query patterns that allows CS leaders to answer more questions independently over time.

Building reporting independence: a practical roadmap

Developing data and reporting capability in a CS operation is not a single project. It is a progressive investment that builds on itself over time.

Stage 1 — Foundation: Ensure clean, consistent data in your ticketing system. Define your core KPIs and agree on calculation methodology. Build the three dashboard types — operational, performance, strategic — using your ticketing system's native analytics or a simple BI tool. Establish the review cadences that embed reporting in operational decision-making.

Stage 2 — Expansion: Connect ticketing data to other data sources — WFM data, HR data, financial data — to enable the cross-system analysis that supports the financial and people dimensions of the balanced scorecard. Develop SQL proficiency in the CS leadership team to enable ad hoc analysis without analyst dependency.

Stage 3 — Maturity: Build predictive capability — volume forecasting models, leading indicator analysis, churn risk indicators from account-level support patterns. Establish a data governance framework that ensures metric definitions are consistent, data quality is monitored, and reporting outputs are trusted across the organisation.

Each stage builds on the previous one. Attempting Stage 3 without Stage 1 in place — building predictive models on top of inconsistent data and undefined metrics — produces sophisticated-looking analysis that nobody trusts and nobody acts on.