TempDB Querying Language

TempDB offers a simple and flexible query language designed for efficient data aggregation, filtering, and analysis. Aggregating data allows users to derive insights from large datasets by summarizing, grouping, and processing data in meaningful ways. Whether you're analyzing sales performance, monitoring customer trends, or identifying operational bottlenecks, TempDB's query language helps you to uncover actionable insights with ease.

Use Cases for Data Aggregation

Data aggregation is critical for:

  • Business Analytics: Summarizing key performance indicators (KPIs) like revenue, customer count, or average purchase value.

  • Data Exploration: Understanding trends and patterns across different dimensions such as time, location, or category.

  • Reporting: Generating insights for dashboards and reports with summarized data.

  • Filtering Large Datasets: Narrowing down large datasets to specific areas of interest using filters and conditions.


Query Language Basics

Each query in TempDB consists of one or more operations that can be chained together in a logical order.

Basic Syntax

QUERY <operation1> <operation2> ... <operationN>

The query operations are performed sequentially, with the output of one operation serving as the input to the next.


Available Query Operations

Aggregation Operations

These operations process and summarize data fields:

  • COUNT: Counts the number of records matching the query. In this case, simple count will count all the documents in the collection/database.

    COUNT
  • SUM: Calculates the total for a specified numeric field.

    SUM /net_amount
  • AVG (Average): Finds the average value of a specified numeric field.

    AVG /net_amount
  • MIN: Identifies the smallest value in a specified field.

    MIN /price
  • MAX: Identifies the largest value in a specified field.

    MAX /price
  • DISTINCT: Returns unique values from a specified field.

    DISTINCT /category
  • TOPN: Retrieves the top N values for a specified field.

    TOPN 5 /net_amount
  • BOTTOMN: Retrieves the bottom N values for a specified field.

    QUERY BOTTOMN 5 /net_amount

Grouping Operations

Grouping operations allow you to organize data by specific fields and perform aggregations within each group.

  • GROUPBY: Groups data by a specified field. Often used in combination with aggregation functions.

    GROUPBY /category COUNT
    GROUPBY /location SUM /net_amount

Filter Operations

Filters narrow down data based on conditions. Multiple conditions can be combined for precise filtering.

  • FILTER: Applies conditions to data. Supported operators include:

    • eq: Equals

    • neq: Not equals

    • gt: Greater than

    • lt: Less than

    • gte: Greater than or equal

    • lte: Less than or equal

    • contains: Contains substring

    • startsWith: Starts with substring

    • endsWith: Ends with substring

    • in: Value in array

    • notin: Value not in array

    • exists: Field exists

    • notexists: Field does not exist

    • regex: Matches regular expressions

Examples:

FILTER /price gt 1000
FILTER /category eq Electronics
FILTER /location contains Mumbai
FILTER /payment_method in ["Credit Card", "Debit Card"]

Usage Examples

Combining Multiple Operations

  1. Group by category with count and sum:

    GROUPBY /category COUNT SUM /net_amount
  2. Filter by price range and group by payment method:

    FILTER /price gt 1000 FILTER /price lt 5000 GROUPBY /payment_method COUNT
  3. Perform multiple aggregations by location:

    GROUPBY /location COUNT AVG /net_amount MAX /price MIN /price
  4. Identify top spenders by category:

    FILTER /net_amount gt 5000 GROUPBY /category COUNT SUM /net_amount

E-commerce Specific Examples

  1. Analyze payment method preferences:

    GROUPBY /payment_method COUNT SUM /net_amount AVG /net_amount
  2. Evaluate category performance:

    GROUPBY /category COUNT SUM /net_amount AVG /discount
  3. Understand location-based customer trends:

    GROUPBY /location COUNT DISTINCT /customer_id AVG /net_amount
  4. Analyze the impact of discounts:

    FILTER /discount_availed eq Yes GROUPBY /category COUNT SUM /discount
  5. Explore age group purchasing patterns:

    GROUPBY /age_group COUNT AVG /net_amount TOPN 3 /price

Best Practices

  1. Query Structure:

    • Apply filters first to reduce the dataset.

    • Use grouping operations to organize data logically.

    • Follow grouping with aggregations to compute summary statistics.

  2. Field Names:

    • Always prefix fields with / for clarity.

    • Ensure field names match exactly as they appear in the dataset.

    • Remember: field names are case-sensitive.

  3. Performance Tips:

    • Prioritize specific filters over broad ones for faster query execution.

    • Limit the use of regex filters, as they are computationally expensive.

    • Avoid excessive chaining of operations unless necessary.


Error Handling

TempDB provides error messages to help debug issues:

  • INVALID_COMMAND: Syntax error in the query.

  • FIELD_NOT_FOUND: Referenced field does not exist.

  • INVALID_FILTER_FORMAT: Incorrect filter condition.

  • MISSING_FIELD_FOR_*: Required field parameter is missing.


By using this powerful query language, you can efficiently aggregate, analyze, and derive meaningful insights from your data.

Last updated