Querying CDS

To query CDS, send an authenticated HTTP request to the /v1/documents resource with zero or more query parameters.
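As a rough sketch of how such a request URL might be assembled, the Python below builds the query string with the standard library. The host name is a placeholder (the text names only the /v1/documents resource), build_query_url is a hypothetical helper, and authentication is left out because the scheme is not described here.

```python
from urllib.parse import urlencode

# Placeholder host: an assumption, not a real CDS endpoint.
BASE_URL = "https://cds.example.org"


def build_query_url(params):
    """Build a /v1/documents URL from a list of (name, value) pairs.

    A list of pairs (rather than a dict) lets a parameter appear twice.
    """
    # Keep commas and colons literal so multi-value lists and date-times
    # stay readable in the resulting URL.
    query = urlencode(params, safe=",:")
    url = f"{BASE_URL}/v1/documents"
    return f"{url}?{query}" if query else url


print(build_query_url([
    ("profileIds", "renderable,listenable"),
    ("collectionIds", "1002"),
]))
```

Passing an empty list yields the bare resource URL, which corresponds to a query with zero parameters.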

Valid query parameters for CDS can be broken up into several broad categories:

Filtering

Filtering query parameters limit the results returned by a CDS query to only documents that match a certain set of characteristics. Some examples include profileIds (filters to only documents containing a given profile) and collectionIds (filters to only documents that are part of a given collection).

Filtering query parameters

Filtering query parameters limit the results returned by a CDS query to only those matching a certain set of characteristics.

Boolean logic in queries

Some filtering query parameters allow for multiple values to be specified; an example is profileIds. When multiple values are specified for a single query parameter, this becomes a logical OR operation. For example, see the following parameter:

profileIds=renderable,listenable

This parameter would filter results to documents that contain the renderable profile OR the listenable profile.

When multiple query parameters are present, this is interpreted as a logical AND operation. See the following example:

profileIds=renderable&collectionIds=1002

This query would filter results to documents that contain the renderable profile AND are part of the 1002 collection. Note that query parameters can be present more than once in a single query:

profileIds=renderable&profileIds=listenable

The above query would filter results to documents that have both the renderable profile AND the listenable profile. Any filtering query parameter can be used with an AND operator.

Some query parameters (ex: profileIds) have a corresponding excluded parameter (ex: excludedProfileIds). These allow you to make queries while telling CDS which documents you do NOT want to be returned. For example:

profileIds=story&excludedProfileIds=has-images,has-audio

This query would filter results to documents containing the story profile BUT containing NEITHER the has-images profile NOR the has-audio profile. In other words, each story document returned would NOT contain the has-images profile AND would NOT contain the has-audio profile.

Note that by combining query parameters and multiple values, complex boolean operations can be created:

profileIds=renderable,listenable&collectionIds=1001,1002&profileIds=podcast-episode

The above query can be expressed with the following boolean expression:

(profileIds contains "renderable" OR profileIds contains "listenable") AND
(document in collection 1001 OR document in collection 1002) AND
(profileIds contains "podcast-episode")
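The rules above (OR within one parameter, AND across parameters, and exclusion) can be modelled in a few lines of Python. This is only an illustration of the semantics, not how CDS evaluates queries; the document shape (a dict of sets keyed by parameter name) and both function names are assumptions made for the example.

```python
from urllib.parse import parse_qsl


def parse_filters(query_string):
    """Turn a query string into a list of (name, set_of_values) clauses.

    Each occurrence of a parameter is one clause; the comma-separated
    values inside a clause are alternatives.
    """
    return [(name, set(value.split(",")))
            for name, value in parse_qsl(query_string)]


def matches(document, clauses):
    """Apply OR within a clause and AND across clauses.

    `document` is a hypothetical simplified shape such as
    {"profileIds": {"story"}, "collectionIds": {"1002"}}.
    """
    for name, values in clauses:
        if name.startswith("excluded"):
            # Map e.g. excludedProfileIds -> profileIds.
            field = name[len("excluded"):]
            field = field[0].lower() + field[1:]
            if document.get(field, set()) & values:
                return False  # contains something that was excluded
        elif not (document.get(name, set()) & values):
            return False      # none of the alternatives matched
    return True


clauses = parse_filters(
    "profileIds=renderable,listenable"
    "&collectionIds=1001,1002"
    "&profileIds=podcast-episode"
)
episode = {"profileIds": {"listenable", "podcast-episode"},
           "collectionIds": {"1002"}}
print(matches(episode, clauses))
```

Because each repeated parameter becomes its own clause, profileIds=renderable&profileIds=listenable requires both profiles, while profileIds=renderable,listenable requires either one.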

Date values

Some filtering query parameters require full-date or date-time values. These query parameters will support RFC3339 format date-times (ex: 2022-01-01T00:00:00Z) or full-dates (ex: 2022-01-01).

These parameters can be given as single values, such as in the following examples:

publishDateTime=2022-01-01T00:00:00Z
publishDateTime=2022-01-01

When a date-time is given, the value in the document must match the given value exactly to be returned. When a full-date is given, any document with a value falling between 00:00:00 and 23:59:59 on the given date will be returned.

Date ranges can also be specified using the ellipses (...) operator:

publishDateTime=2022-01-01T00:00:00Z...
publishDateTime=...2022-12-31T23:59:59Z
publishDateTime=2022-01-01T00:00:00Z...2022-12-31T23:59:59Z

When the ellipses are present after a value, any document with a value greater than or equal to the given value is eligible to be returned. When the ellipses precede a value, any document with a value less than or equal to the given value is eligible to be returned.

When full-date values are used with ellipses, they are interpreted as a date-time based on where they are in the range. Full-dates starting a range will be interpreted as having the 00:00:00 time, and full-dates ending a range will be interpreted as having the 23:59:59 time.

IMPORTANT NOTE:

At present, if a request specifies an RFC3339 full-date (e.g. 2022-01-01), CDS will automatically append the Eastern Standard Time (EST) time-zone offset to that date.

This is equivalent to: 2022-01-01T00:00:00-05:00.

However, if a client explicitly defines a valid RFC3339 date-time string (e.g. 2022-01-01T00:00:00-08:00, which translates to exactly midnight in PST), this automatic appending of EST does not occur, and the explicit date-time is used.


publishDateTime=2022-01-01...2022-01-02

In the above example, any document with a publishDateTime value between 2022-01-01T00:00:00-05:00 and 2022-01-02T23:59:59-05:00 is eligible to be returned.

(See above for why and when the -05:00 is appended.)

In all cases, if a parameter accepts date-time values, it will also accept full-date values. Certain parameters will only accept full-date values; see the showDates parameter below.
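The date rules described in this section can be sketched as a small parser that turns a filter value into inclusive lower and upper bounds. This is a client-side illustration only: parse_date_filter and its helper are hypothetical names, and the fixed -05:00 default mirrors the behavior the note above describes as current, which may change.

```python
from datetime import date, datetime, time, timedelta, timezone

# Offset the text says is appended to bare full-dates (-05:00).
DEFAULT_TZ = timezone(timedelta(hours=-5))


def _parse(value, end_of_day):
    """Parse one RFC 3339 full-date or date-time into an aware datetime."""
    if "T" not in value:
        # Full-date: expand to 00:00:00 or 23:59:59 in the default offset.
        clock = time(23, 59, 59) if end_of_day else time(0, 0, 0)
        return datetime.combine(date.fromisoformat(value), clock,
                                tzinfo=DEFAULT_TZ)
    # fromisoformat() rejects a trailing "Z" before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_date_filter(value):
    """Return inclusive (lower, upper) bounds; None means unbounded."""
    if "..." in value:
        start, end = value.split("...", 1)
        lower = _parse(start, end_of_day=False) if start else None
        upper = _parse(end, end_of_day=True) if end else None
        return lower, upper
    # Single value: a date-time matches exactly, a full-date spans its day.
    return _parse(value, end_of_day=False), _parse(value, end_of_day=True)


lower, upper = parse_date_filter("2022-01-01...2022-01-02")
print(lower.isoformat(), upper.isoformat())
```

An explicit date-time keeps whatever offset it carries, so the default offset only ever applies to bare full-dates.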

Valid filtering query parameters

| Name | Supports multiple values? | Date values? | Description |
| --- | --- | --- | --- |
| collectionIds | Yes | No | A list of one or more collection IDs to filter by; only documents present in one or more of the given collections will be returned. Example: collectionIds=1001,1002 |
| editorialLastModifiedDateTime | No | Yes | A date or date range; only documents with an editorialLastModifiedDateTime value within the given range will be returned. Example: editorialLastModifiedDateTime=2021-01-01T00:00:00Z |
| excludedOwnerHrefs | Yes | No | A list of one or more URIs to filter out; documents containing one or more of the given URIs in their owners array will NOT be returned. Example: excludedOwnerHrefs=https://organization.api.npr.org/v4/services/s583 |
| excludedIds | Yes | No | A list of one or more document IDs to filter out; documents whose ID is one of the given document IDs will NOT be returned. Example: excludedIds=1002,1045,1006 |
| excludedProfileIds | Yes | No | A list of one or more profile IDs to filter out; documents containing one or more of the given profiles at the top level will NOT be returned. Example: excludedProfileIds=renderable |
| ids | Yes | No | A list of one or more document IDs to filter by; only documents with IDs in the given set will be returned. Example: ids=1002,1045,1006 |
| ownerHrefs | Yes | No | A list of one or more URIs to filter by; only documents containing one or more of the given URIs in their owners array will be returned. Example: ownerHrefs=https://organization.api.npr.org/v4/services/s583 |
| nprWebsitePaths | Yes | No | A list of one or more website paths; only documents containing the path in their nprWebsitePaths will be returned. Example of a path that will match: nprWebsitePaths=/podcasts/510310/npr-politics-podcast |
| profileIds | Yes | No | A list of one or more profile IDs to filter by; only documents containing one or more of the given profiles at the top level will be returned. Example: profileIds=renderable |
| publishDateTime | No | Yes | A date or date range; only documents with a publishDateTime value within the given range will be returned. Example: publishDateTime=2021-01-01T00:00:00Z |
| recommendUntilDateTime | No | Yes | A date or date range; only documents with a recommendUntilDateTime value within the given range will be returned. Example: recommendUntilDateTime=2021-01-01T00:00:00Z |
| showDates | No | Yes | A date or date range; only documents with a showDates entry within the given range will be returned. This parameter will not accept date-time values. Example: showDates=2022-01-01 |
| seasonNumber | No | No | A single numerical value corresponding to the seasonNumber value found on some CDS documents. Example: seasonNumber=1. See: podcast-episode profile |

Pagination

When CDS determines which documents are “eligible” to be returned for a query, it uses the pagination query parameters to determine which subset of documents are actually returned to the client. By default, CDS will only return the first 20 documents.

Pagination query parameters

By default, a CDS query will return 20 documents; however, there may be more than 20 documents matching a given query. Pagination query parameters allow clients to control how many documents are returned from a query, and which subset are returned.

When using pagination query parameters, it’s important to note that every query is evaluated independently; that is, CDS does not support cursor-based querying. When two queries are made in quick succession, the results may change between the two based on publishing activity.

offset and limit

The two valid pagination query parameters are offset and limit. When a query is made against CDS, a (potentially large) set of documents are “eligible” to be returned. CDS will return the first subset of documents starting at the offset value, up to a maximum number of documents defined by limit. offset is 0-based, so offset=0 indicates the first document in the set.

This is illustrated by the diagram below:

[Figure: Pagination Diagram]
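In code terms, offset and limit behave like a list slice over the eligible documents. The sketch below is only an analogy (the page function and the stand-in document list are made up for illustration), but it shows how the two values interact.

```python
def page(eligible, offset=0, limit=20):
    """Return the slice of `eligible` that one query would yield."""
    return eligible[offset:offset + limit]


# Stand-in for the set of documents that are eligible for a query.
eligible = [f"doc-{n}" for n in range(50)]

first_page = page(eligible)                      # defaults: offset=0, limit=20
last_page = page(eligible, offset=40, limit=20)  # fewer than 20 remain
print(len(first_page), len(last_page))
```

When fewer than limit documents remain past offset, the result is simply shorter, which is why limit is described as a maximum.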

Pagination limits

A single query request may ask for at most 300 documents; requesting more than 300 documents in a query will result in a 400 error. Across the full set of eligible documents, CDS will not return results beyond the 2000th document. That is to say, limit + offset must always be less than 2000.
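A client may want to check these limits before sending a request and to plan a series of pages that stays inside them. The following is a minimal sketch under the limits as worded above (limit at most 300, limit + offset less than 2000); validate_page and page_params are hypothetical helper names, and the server remains the authority on what it accepts.

```python
MAX_LIMIT = 300    # most documents a single request may ask for
MAX_WINDOW = 2000  # limit + offset must stay below this, per the text


def validate_page(offset=0, limit=20):
    """Raise ValueError for a page that the stated limits would reject."""
    if offset < 0 or limit < 0:
        raise ValueError("offset and limit must be non-negative")
    if limit > MAX_LIMIT:
        raise ValueError(f"limit {limit} exceeds {MAX_LIMIT}")
    if offset + limit >= MAX_WINDOW:
        raise ValueError(f"limit + offset must be less than {MAX_WINDOW}")


def page_params(limit=MAX_LIMIT):
    """Yield (offset, limit) pairs that walk the reachable window in order."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    offset = 0
    while offset + 1 < MAX_WINDOW:
        # Shrink the final page so limit + offset stays under the cap.
        size = min(limit, MAX_WINDOW - 1 - offset)
        yield offset, size
        offset += size


for offset, limit in page_params():
    validate_page(offset, limit)
    print(f"offset={offset}&limit={limit}")
```

Since every query is evaluated independently, documents published between two of these requests can shift the pages, so a client walking the window should be prepared for duplicates or gaps.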

Valid pagination query parameters

| Name | Max value | Default | Description |
| --- | --- | --- | --- |
| limit | 300 | 20 | The maximum number of documents to return in this query |
| offset | 2000 | 0 | Where to “start” the subset of documents returned by this query. This value is 0-based. |

Sorting

When documents are returned by a CDS query, they are ordered by their publishDateTime attribute, starting from most recent to least recent (“descending”). The sort attribute can alter this ordering.

Sorting query parameters

The order that documents are returned in a query is determined by the query’s sort. By default, queries are sorted by their publishDateTime attribute, descending (most recent first, least recent last).

This sort can be changed using the sort parameter. The sort parameter takes the following form:

sort=<type>[:<direction>[:<missing>]][,<type2>[:<direction2>[:<missing2>]]]

| Parameter Name | Required? | Function |
| --- | --- | --- |
| type | Yes | The name of the field to be sorted on. (See the valid sort types below.) |
| direction | No | The desired sorting order: asc (ie. smallest → largest) or desc (ie. largest → smallest). Defaults to desc when omitted. |
| missing | No | The desired behavior for documents that are missing the property being sorted on: first or last. (See the optional :first and :last params section below.) |

CDS does support sorting on multiple fields (ie. multi-dimensional sort). The exact details of that behavior are outlined below.

An example of descending publishDateTime sort would be:

sort=publishDateTime:desc

The type value determines the method of sorting the documents. If present, the direction value determines whether the documents will be sorted in ascending or descending order. The direction value is optional, and not all sort types support a direction; see editorial sort below.

When valid and present, the direction value may be either asc or desc for “ascending” or “descending”, respectively. When valid but missing, direction defaults to desc.

An example using the optional missing param would be:

...sort=seasonNumber:desc:first

This means that our results will be sorted by the seasonNumber field, in descending order (ie. highest → lowest), with documents that have no seasonNumber placed first.
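To make the structure of the sort value concrete, here is a small Python sketch that splits it into its parts and applies the rules stated above (direction defaults to desc, editorial takes no direction). SortKey and parse_sort are names invented for this example, and the validation is an assumption about what a careful client might check, not a description of CDS's own parser.

```python
from typing import NamedTuple, Optional


class SortKey(NamedTuple):
    field: str
    direction: Optional[str]  # "asc" or "desc"; None for the editorial sort
    missing: Optional[str]    # "first", "last", or None when unspecified


def parse_sort(value):
    """Parse 'type[:direction[:missing]][,type2...]' into SortKey tuples."""
    keys = []
    for part in value.split(","):
        fields = part.split(":")
        if not fields[0] or len(fields) > 3:
            raise ValueError(f"malformed sort key: {part!r}")
        sort_type = fields[0]
        direction = fields[1] if len(fields) > 1 else None
        missing = fields[2] if len(fields) > 2 else None
        if sort_type == "editorial":
            if direction is not None:
                raise ValueError("editorial sort does not take a direction")
        else:
            direction = direction or "desc"  # documented default
            if direction not in ("asc", "desc"):
                raise ValueError(f"unknown direction: {direction!r}")
        if missing not in (None, "first", "last"):
            raise ValueError(f"unknown missing value: {missing!r}")
        keys.append(SortKey(sort_type, direction, missing))
    return keys


print(parse_sort("seasonNumber:desc:first"))
```

Keeping the keys in a list preserves the order in which they were written, which matters when sorting on more than one field.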

Editorial sort

The editorial sort is a method of ordering CDS documents within a collection. It may not be used with a direction, and must be used in conjunction with a single collectionIds query parameter (see Filtering for more details on this parameter).

When given, editorial sort will place ordered content before unordered content. When querying for a collection, ordered content will be returned first in the order it appears in the items array, followed by all unordered content sorted by publishDateTime.

For more information on collections, see the Collections page.
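The ordering described above can be pictured with a short sketch. The shapes here are assumptions for illustration: items is taken to be a list of document IDs in editorial order, each document is a dict with id and publishDateTime (ISO strings in the same time zone, so they compare correctly as text), and unordered content is assumed to follow the default most-recent-first order, which the text does not state explicitly.

```python
def editorial_order(items, documents):
    """Order `documents` editorially for one collection.

    `items` is the collection's ordered list of document IDs (hypothetical
    shape); `documents` is a list of dicts with "id" and "publishDateTime".
    """
    position = {doc_id: index for index, doc_id in enumerate(items)}
    # Ordered content first, in the order it appears in `items`.
    ordered = sorted((d for d in documents if d["id"] in position),
                     key=lambda d: position[d["id"]])
    # Assumption: unordered content uses the default most-recent-first order.
    unordered = sorted((d for d in documents if d["id"] not in position),
                       key=lambda d: d["publishDateTime"], reverse=True)
    return ordered + unordered


documents = [
    {"id": "a", "publishDateTime": "2022-01-01T00:00:00Z"},
    {"id": "b", "publishDateTime": "2022-03-01T00:00:00Z"},
    {"id": "c", "publishDateTime": "2022-02-01T00:00:00Z"},
    {"id": "d", "publishDateTime": "2022-04-01T00:00:00Z"},
]
print([d["id"] for d in editorial_order(["c", "a"], documents)])
```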

Valid sort types

| Name | Supports direction? | Description |
| --- | --- | --- |
| editorialLastModifiedDateTime | Yes | Sorts by each document’s editorialLastModifiedDateTime value. For this sort, desc means “most recent first” and asc means “least recent first”. |
| editorial | No | Sorts the CDS documents “editorially”; see the “Editorial Sort” section above. |
| publishDateTime | Yes | Sorts by each document’s publishDateTime value. For this sort, desc means “most recent first” and asc means “least recent first”. |
| showDates | Yes | Sorts by each document’s most recent entry in the showDates array. For this sort, desc means “most recent first” and asc means “least recent first”. |
| seasonNumber | Yes | If a document has a seasonNumber field, sort by this field in the chosen direction. For this sort, asc means “oldest first” and desc means “newest first.” (This sort is most useful when querying for podcast-episode docs.) |
| episodeNumber | Yes | If a document has an episodeNumber field, sort by this field in the chosen direction. For this sort, asc means “oldest first” and desc means “newest first.” (This sort is most useful when querying for podcast-episode docs.) |

More on CDS sorting behavior

Here is part of a CDS query that demonstrates the default CDS sorting behavior:

...sort=seasonNumber:asc,episodeNumber:asc

The results of this query will look like this:

seasonNumber episodeNumber
1 1
1 2
1 3
1 (Missing episodeNumber)
2 1
2 2
2 3
2 (Missing episodeNumber)
(Missing seasonNumber) (Missing episodeNumber either)

As we see, CDS has ordered the list of matching documents based on the parameters provided – ie. first by seasonNumber (“ascending”, or smallest → largest), then by episodeNumber (also “ascending”).

IMPORTANT NOTE

The order in which the sort query params are specified in the query URL absolutely matters. CDS will read these query parameters in order. If we were to invert the order of the fields we were sorting on – ie. ...sort=episodeNumber:asc,seasonNumber:asc... – our results would look like this instead:

seasonNumber episodeNumber
1 1
2 1
1 2
2 2
1 3
2 3
(Missing seasonNumber) (Missing episodeNumber)

You can think of this as CDS saying: “episodeNumber is the most important field to sort on (ascending) – and if two episodes have the same episodeNumber, only then will I sort by seasonNumber (ascending).”
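The same precedence rule appears in any multi-key sort, and a few lines of Python can reproduce both orderings. This is an analogy using made-up documents rather than CDS itself; sort_key is a hypothetical helper that sorts ascending and places missing values last, matching the tables above.

```python
def sort_key(*fields):
    """Build a key that sorts ascending by `fields`, missing values last."""
    def key(doc):
        # (True, 0) sorts after (False, n), so a missing field goes last.
        return tuple((doc.get(f) is None, doc.get(f) or 0) for f in fields)
    return key


docs = [
    {"seasonNumber": 2, "episodeNumber": 1},
    {"seasonNumber": 1, "episodeNumber": 2},
    {},  # missing both fields
    {"seasonNumber": 1, "episodeNumber": 1},
    {"seasonNumber": 2, "episodeNumber": 2},
]

# Field order in the key mirrors field order in the sort parameter.
by_season = sorted(docs, key=sort_key("seasonNumber", "episodeNumber"))
by_episode = sorted(docs, key=sort_key("episodeNumber", "seasonNumber"))

for doc in by_episode:
    print(doc.get("seasonNumber"), doc.get("episodeNumber"))
```

Swapping the two field names is the only difference between by_season and by_episode, yet it changes which field groups the results.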

IMPORTANT NOTE

The effects of the optional :first and :last params (detailed below) are independent of sort direction – ie. docs with missing sort fields will show up where you specify regardless of whether you are sorting in ascending or descending order.

Sometimes a client may want to alter the default placement of documents that are missing a sort field. This is where the optional :first and :last params come in.

Optional :first and :last params

These are both additional, optional sort parameters that will explicitly tell CDS how to handle sorting of documents that do not contain the fields being sorted on.

The :first and :last params let CDS clients specify whether documents missing some or all of the fields being sorted on appear at the beginning or the end of the list of results, or of a group of results (if multiple fields are being sorted on).

Example queries using the :first and :last params

Base behavior example: sorting on only seasonNumber with NO :first and :last

Here is an example query where the results are being sorted only by seasonNumber (ascending):

...sort=seasonNumber:asc

The resulting list of CDS docs will look like this:

seasonNumber episodeNumber
1 (Missing episodeNumber)
1 1
1 2
2 1
2 2
(Missing seasonNumber) (Missing episodeNumber)

And now WITH :first specified

...sort=seasonNumber:asc:first

The resulting list of CDS docs will look like this:

seasonNumber episodeNumber
(Missing seasonNumber) 1
1 1
1 2
2 1
2 2

:first and :last when sorting on multiple fields

These two optional params allow us to control the ‘What to do if a field is missing on a doc’ behavior individually for each field being sorted on. Note the difference in these two queries and their resulting CDS results:

...sort=seasonNumber:asc:first,episodeNumber:asc:first

seasonNumber episodeNumber
(Missing seasonNumber) (Missing episodeNumber)
1 (Missing episodeNumber)
1 1
1 2
2 (Missing episodeNumber)
2 1
2 2

...sort=seasonNumber:asc:first,episodeNumber:asc:last

seasonNumber episodeNumber
(Missing seasonNumber) (Missing episodeNumber)
1 1
1 2
1 (Missing episodeNumber)
2 1
2 2
2 (Missing episodeNumber)

:first and :last when sorting on multiple fields – “descending” order

...sort=seasonNumber:desc:first,episodeNumber:desc:first

The resulting list of CDS docs will look like this:

seasonNumber episodeNumber
(Missing seasonNumber) (Missing episodeNumber)
2 (Missing episodeNumber)
2 2
2 1
1 (Missing episodeNumber)
1 2
1 1

Different direction and missing values for each field

...sort=seasonNumber:asc:first,episodeNumber:desc:last

The resulting list of CDS docs will look like this:

seasonNumber episodeNumber
(Missing seasonNumber) (Missing episodeNumber)
1 2
1 1
1 (Missing episodeNumber)
2 2
2 1
2 (Missing episodeNumber)
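The combined behavior in the examples above (a per-field direction plus a per-field placement for missing values that ignores direction) can be modelled with a comparator. This sketch only mimics the documented ordering on in-memory dicts; make_comparator is a hypothetical name, and nothing here reflects how CDS implements sorting.

```python
from functools import cmp_to_key


def make_comparator(keys):
    """Build a comparator from (field, direction, missing) tuples.

    direction is "asc" or "desc"; missing is "first" or "last".
    """
    def compare(a, b):
        for field, direction, missing in keys:
            x, y = a.get(field), b.get(field)
            if x is None and y is None:
                continue  # tie on this field; fall through to the next one
            if x is None or y is None:
                # Placement of missing values ignores the sort direction.
                a_first = (x is None) == (missing == "first")
                return -1 if a_first else 1
            if x != y:
                ascending = -1 if x < y else 1
                return ascending if direction == "asc" else -ascending
        return 0
    return compare


docs = [
    {"seasonNumber": 2, "episodeNumber": 1},
    {"seasonNumber": 1},
    {"seasonNumber": 1, "episodeNumber": 2},
    {},
    {"seasonNumber": 2, "episodeNumber": 2},
    {"seasonNumber": 1, "episodeNumber": 1},
    {"seasonNumber": 2},
]

# Equivalent of sort=seasonNumber:asc:first,episodeNumber:desc:last
compare = make_comparator([("seasonNumber", "asc", "first"),
                           ("episodeNumber", "desc", "last")])
for doc in sorted(docs, key=cmp_to_key(compare)):
    print(doc.get("seasonNumber"), doc.get("episodeNumber"))
```

A comparator is used instead of a plain key function because the direction and the missing-value placement vary independently per field, which is awkward to encode in a single tuple key.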

© 2024 npr