SOQL – How I Query With Thee, Let Me Count the Ways

I've been wanting to write this post since the new aggregate functions were announced in Spring 10. With the new 18 version of the API in sandboxes I can finally do a high level overview of SOQL (Salesforce Object Query Language) for all of the newcomers to the Force.com platform. This is definitely not an all-encompassing look as SOQL but something to whet your appetite with examples and screenshots. The Salesforce SOQL docs have a lot of great detailed info, but one of the things that I think is missing is what the results of the SOQL queries look like. This article does not cover Dynamic SOQL.

Most developers are familiar with SQL (Structured Query Language) so SOQL isn't much of a stretch. There are some really cool features of SOQL that you will find extremely productive but there are some restrictions that will make you want to pull your hair out. However, as with everything on the Force.com platform, SOQL is evolving and becoming better and better overtime. From a high level, there are a few important differences in the two languages:

With SQL you can retrieve and modify datasets directly. However, with SOQL you can only retrieve datasets. You can then use DML statements with these records to performs inserts, updates, deletes and upserts (creates new records and updates existing records within a single statement).In SQL joins are second-nature to retrieve data from multiple tables. However, with SOQL, one of my favorite features is its support for dot notation to join related objects implicitly. We'll look at both child-to-parent and parent-to-child relationship shortly.The biggest thing to remember is that SOQL is not SQL. SOQL does not support full SQL syntax so you'll have to rethink the way you query for data. With Spring 10 Saleforce.com has added some new aggregate functions (finally!!) and SOQL functions but there will be many times when you'll have to break open the docs to see what you can and can not do with SOQL.

  1. With SQL you can retrieve and modify datasets directly. However, with SOQL you can only retrieve datasets. You can then use DML statements with these records to performs inserts, updates, deletes and upserts (creates new records and updates existing records within a single statement).

  2. In SQL joins are second-nature to retrieve data from multiple tables. However, with SOQL, one of my favorite features is its support for dot notation to join related objects implicitly. We'll look at both child-to-parent and parent-to-child relationship shortly.

  3. The biggest thing to remember is that SOQL is not SQL. SOQL does not support full SQL syntax so you'll have to rethink the way you query for data. With Spring 10 Saleforce.com has added some new aggregate functions (finally!!) and SOQL functions but there will be many times when you'll have to break open the docs to see what you can and can not do with SOQL.

Relationship Queries

Relationships are arguably the most powerful feature of SOQL. They allow you to perform joins on multiple objects and traverse the relationship chain to find and return objects. Relationships are available for standard and custom objects in Salesforce. However, their syntax is slightly different. The classic relationship example is accounts and contacts. An account (parent) can have multiple contacts (children).

Child-to-Parent Relationship

This type of query returns child objects (contact) containing parent (account) information:

</p>

For child-to-parent relationships with custom objects the syntax is slightly different but the results are essentially the same. In the query below there is a custom object (Affiliate__c) with a lookup relationship on the Account object. Notice in the query below we use the __r instead of the object's name since this is the name of the relationship. This will be explained shortly in the parent-to-child section below.

For each relationship you can specify up to 5 levels. This makes for some really cool queries that let you reach way up the relationship chain to fetch data. For instance you can write a query like this which spans only 4 levels.

Parent-to-Child Relationships

These types of queries are almost always one-to-many. You specify the relationship using a subquery, where the initial member of the FROM clause is the subquery is related to the initial member of the outer query FROM clause. Here's where the relationship comes into affect. With standard objects you use the plural name of the object in the subquery. Here's how you would find the name of the relationship using Eclipse:

The following query returns the name for each account and then for each account another collection of contacts containing their first name, last name and email address.

For custom objects the relationships are slightly different. When you create a relationship between two objects, the Force.com platform generates a relationship name for you. So if your relationship field name is Foo__c the relationship name would be Foo__r. I've seen some really strange relationship names generated so you can change the name (here are instructions how) if you need to. So a sample query might look like:

You can get really crazy with relationships and where clauses in the subqueries and run something like the following.

Joins
SOQL also support semi-joins and anti-joins. Some examples are:

Semi-Joins with IN query

ID field Semi-Join

ID field Anti-Join

Multiple Semi-Joins or Anti-Joins

Aggregate Functions

Aggregate Functions are the long-awaited additions to SOQL for Spring 10. Not all fields are supported so check out this list for details.

COUNT() returns the number of rows that match the filtering conditions and COUNT() must be the only element in the select list. The resulting query result size field returns the number of rows and the records will returns null.

COUNT(fieldname) returns the number of rows that match the filtering conditions and have a non-null value. An AggregateResult object in the records field contains the number of rows. Do not use the size field for the resulting records.

COUNT_DISTINCT() returns the number of distinct non-null field values matching your query criteria.

SUM() returns the total sum of a numeric field based up your query criteria.

AVG() returns the average value of a numeric field based up your query criteria.

MIN() returns the minimum value for a field.

MAX() returns the maximum value for a field.

GROUP BY
As in SQL you can use GROUP BY to summarize and roll up your query results instead of processing records individually. There are a number of options, restrictions and conditions for group by, so check out the docs for more info.

You can also calculate subtotals for aggregate data in query results by using GROUP BY ROLLUP.

HAVING
Using HAVING you can filter the results returned by aggregate functions where you might normally want to use a WHERE clause. For instance, this query will fail:

This query will return the correct results:

Date Functions

The Force.com platform provides you with a number of date functions that can be used in SOQL to make your life easier. Check out the docs for a complete list.