Querying CDS
Querying CDS is done by placing an authenticated HTTP request against the /v1/documents
resource with zero or more query parameters.
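As a rough sketch of what such a request could look like in Python, the snippet below builds (but does not send) a GET request against /v1/documents. The host name cds.example.org, the helper name build_documents_request, and the bearer-token Authorization header are all assumptions made for illustration; this page does not state the real host or authentication scheme.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical host: the real CDS base URL is not given on this page.
BASE_URL = "https://cds.example.org"


def build_documents_request(token, params):
    """Build (but do not send) a GET request against /v1/documents.

    params is a list of (name, value) pairs so a name can repeat.
    """
    # Keep commas unescaped so multi-value parameters stay readable.
    query = urlencode(params, safe=",")
    url = f"{BASE_URL}/v1/documents"
    if query:
        url = f"{url}?{query}"
    # Assumption: a bearer token in the Authorization header.
    return Request(url, headers={"Authorization": f"Bearer {token}"})


request = build_documents_request(
    "my-token",
    [("profileIds", "renderable"), ("collectionIds", "1002")],
)
print(request.full_url)
```

Passing the parameters as a list of pairs rather than a dict keeps it possible to repeat a parameter name, which matters for the AND behavior described under Filtering.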
Valid query parameters for CDS can be broken up into several broad categories:
Filtering
Filtering query parameters limit the results returned by a CDS query to only documents that match a certain set of characteristics. Some examples include profileIds (filters to only documents containing a given profile) and collectionIds (filters to only documents that are part of a given collection).
Filtering query parameters
Filtering query parameters limit the results returned by a CDS query to only those matching a certain set of characteristics.
Boolean logic in queries
Some filtering query parameters allow multiple values to be specified; an example is profileIds. When multiple values are specified for a single query parameter, they are combined with a logical OR operation. For example, see the following parameter:
profileIds=renderable,listenable
This parameter would filter results to documents that contain the renderable profile OR the listenable profile.
When multiple query parameters are present, this is interpreted as a logical AND operation. See the following example:
profileIds=renderable&collectionIds=1002
This query would filter results to documents that contain the renderable profile AND are part of the 1002 collection. Note that query parameters can be present more than once in a single query:
profileIds=renderable&profileIds=listenable
The above query would filter results to documents that have both the renderable profile AND the listenable profile. Any filtering query parameter can be used with an AND operator.
Some query parameters (ex: profileIds) have a corresponding excluded parameter (ex: excludedProfileIds). These allow you to make queries while telling CDS which documents you do NOT want to be returned. For example:
profileIds=story&excludedProfileIds=has-images,has-audio
This query would filter results to documents containing the story profile BUT containing NEITHER the has-images profile NOR the has-audio profile. In other words, each story document returned would NOT contain the has-images profile AND would NOT contain the has-audio profile.
Note that by combining query parameters and multiple values, complex boolean operations can be created:
profileIds=renderable,listenable&collectionIds=1001,1002&profileIds=podcast-episode
The above query can be expressed with the following boolean expression:
(profileIds contains "renderable" OR profileIds contains "listenable") AND
(document in collection 1001 OR document in collection 1002) AND
(profileIds contains podcast-episode)
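The boolean rules above can be sketched as a small local evaluator. This is only an illustration of the semantics described on this page, not how CDS implements them; the function name matches and the use of plain sets for a document's profiles and collections are assumptions.

```python
from urllib.parse import parse_qsl


def matches(query, profiles, collections):
    """Evaluate a filter query string against one document.

    profiles and collections are sets of strings describing the document.
    """
    for name, raw in parse_qsl(query):
        values = set(raw.split(","))
        if name == "profileIds":
            ok = bool(values & profiles)      # OR across comma-separated values
        elif name == "collectionIds":
            ok = bool(values & collections)
        elif name == "excludedProfileIds":
            ok = not (values & profiles)      # none of the listed profiles present
        else:
            raise ValueError(f"unsupported parameter in this sketch: {name}")
        if not ok:                            # AND across parameters
            return False
    return True


query = "profileIds=renderable,listenable&collectionIds=1001,1002&profileIds=podcast-episode"
print(matches(query, {"renderable", "podcast-episode"}, {"1001"}))
print(matches(query, {"renderable"}, {"1001"}))
```

The first call satisfies all three groups of the boolean expression; the second fails because the document lacks the podcast-episode profile.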
Date values
Some filtering query parameters require full-date or date-time values. These query parameters will support RFC3339 format date-times (ex: 2022-01-01T00:00:00Z) or full-dates (ex: 2022-01-01).
These parameters can be given as single values, such as in the following examples:
publishDateTime=2022-01-01T00:00:00Z
publishDateTime=2022-01-01
When a date-time is given, the value in the document must match the given value exactly to be returned. When a full-date is given, any document with a value falling between 00:00:00 and 23:59:59 on the given date will be returned.
Date ranges can also be specified using the ellipsis (...) operator:
publishDateTime=2022-01-01T00:00:00Z...
publishDateTime=...2022-12-31T23:59:59Z
publishDateTime=2022-01-01T00:00:00Z...2022-12-31T23:59:59Z
When the ellipsis follows a value, it indicates that any document with a value greater than or equal to the given value is eligible to be returned. When the ellipsis precedes a value, any document with a value less than or equal to the given value is eligible to be returned.
When full-date values are used with an ellipsis, they are interpreted as a date-time based on where they are in the range. Full-dates starting a range will be interpreted as having the 00:00:00 time, and full-dates ending a range will be interpreted as having the 23:59:59 time.
IMPORTANT NOTE:
At present, if a request specifies an RFC3339 full-date (e.g. 2022-01-01), CDS will automatically append the Eastern Standard Time (EST) time-zone offset to that date. This is equivalent to: 2022-01-01T00:00:00-05:00. However, if a client explicitly defines a valid RFC3339 date-time string (e.g. 2022-01-01T00:00:00-08:00, which translates to exactly midnight in PST), this automatic appending of EST does not occur, and the explicit date-time is used.
publishDateTime=2022-01-01...2022-01-02
In the above example, any document with a publishDateTime value between 2022-01-01T00:00:00-05:00 and 2022-01-02T23:59:59-05:00 is eligible to be returned. (See above for why and when the -05:00 is appended.)
In all cases, if a parameter accepts date-time values, it will also accept full-date values. Certain parameters will only accept full-date values; see the showDates parameter below.
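To make the date rules concrete, here is a minimal sketch that turns a date filter value into the (start, end) bounds described above. It is a client-side illustration only; the function name parse_date_filter is hypothetical, and the fixed -05:00 offset for full-dates simply mirrors the note above.

```python
from datetime import date, datetime, time, timedelta, timezone

EST = timezone(timedelta(hours=-5))  # the fixed -05:00 offset described above


def _parse(value, end_of_day):
    if "T" in value:
        # Explicit date-time: used as given, no offset is appended.
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    # Full-date: expand to the start or end of that day in EST.
    clock = time(23, 59, 59) if end_of_day else time(0, 0, 0)
    return datetime.combine(date.fromisoformat(value), clock, tzinfo=EST)


def parse_date_filter(value):
    """Return (start, end) bounds; None means that side of the range is open."""
    if "..." in value:
        left, right = value.split("...", 1)
        start = _parse(left, end_of_day=False) if left else None
        end = _parse(right, end_of_day=True) if right else None
        return start, end
    if "T" in value:
        exact = _parse(value, end_of_day=False)
        return exact, exact  # a single date-time must match exactly
    # A single full-date covers the whole day.
    return _parse(value, end_of_day=False), _parse(value, end_of_day=True)


start, end = parse_date_filter("2022-01-01...2022-01-02")
print(start.isoformat(), end.isoformat())
```

For the range 2022-01-01...2022-01-02 this yields the same two bounds quoted in the example above.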
Valid filtering query parameters
Name | Supports multiple values? | Date values? | Description |
---|---|---|---|
collectionIds | Yes | No | A list of one or more collection IDs to filter by; only documents present in one or more of the given collections will be returned. Example: collectionIds=1001,1002 |
editorialLastModifiedDateTime | No | Yes | A date or date range; only documents with an editorialLastModifiedDateTime value within the given range will be returned. Example: editorialLastModifiedDateTime=2021-01-01T00:00:00Z |
excludedOwnerHrefs | Yes | No | A list of one or more URIs to filter out; documents containing one or more of the given URIs in their owners array will NOT be returned. Example: excludedOwnerHrefs=https://organization.api.npr.org/v4/services/s583 |
excludedIds | Yes | No | A list of one or more document IDs to filter out; documents with one of the given document IDs will NOT be returned. Example: excludedIds=1002,1045,1006 |
excludedProfileIds | Yes | No | A list of one or more profile IDs to filter out; documents containing one or more of the given profiles at the top level will NOT be returned. Example: excludedProfileIds=renderable |
ids | Yes | No | A list of one or more document IDs to filter by; only documents with IDs in the given set will be returned. Example: ids=1002,1045,1006 |
ownerHrefs | Yes | No | A list of one or more URIs to filter by; only documents containing one or more of the given URIs in their owners array will be returned. Example: ownerHrefs=https://organization.api.npr.org/v4/services/s583 |
nprWebsitePaths | Yes | No | A list of one or more website paths; only documents containing the path in their nprWebsitePaths will be returned. Example of a path that will match: nprWebsitePaths=/podcasts/510310/npr-politics-podcast |
profileIds | Yes | No | A list of one or more profile IDs to filter by; only documents containing one or more of the given profiles at the top level will be returned. Example: profileIds=renderable |
publishDateTime | No | Yes | A date or date range; only documents with a publishDateTime value within the given range will be returned. Example: publishDateTime=2021-01-01T00:00:00Z |
recommendUntilDateTime | No | Yes | A date or date range; only documents with a recommendUntilDateTime value within the given range will be returned. Example: recommendUntilDateTime=2021-01-01T00:00:00Z |
showDates | No | Yes | A date or date range; only documents with a showDates entry within the given range will be returned. This parameter will not accept date-time values. Example: showDates=2022-01-01 |
seasonNumber | No | No | A single numerical value corresponding to the seasonNumber value found on some CDS documents. Example: seasonNumber=1. See: podcast-episode profile |
Pagination
When CDS determines which documents are “eligible” to be returned for a query, it uses the pagination query parameters to determine which subset of documents are actually returned to the client. By default, CDS will only return the first 20 documents.
Pagination query parameters
By default, a CDS query will return 20 documents; however, there may be more than 20 documents matching a given query. Pagination query parameters allow clients to control how many documents are returned from a query, and which subset are returned.
When using pagination query parameters, it’s important to note that every query is evaluated independently; that is, CDS does not support cursor-based querying. When two queries are made in quick succession, the results may change between the two based on publishing activity.
offset and limit
The two valid pagination query parameters are offset and limit. When a query is made against CDS, a (potentially large) set of documents is “eligible” to be returned. CDS will return the first subset of documents starting at the offset value, up to a maximum number of documents defined by limit. offset is 0-based, so offset=0 indicates the first document in the set.
This is illustrated by the diagram below:
Pagination limits
For a single query request, CDS has a hard limit of 300 documents; requesting more than 300 documents in a query will result in a 400 error. For a full set of documents, CDS will not return results beyond the 2000th document. That is, limit + offset must always be less than 2000.
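A client that wants to walk through as many results as these limits allow could generate its offset and limit pairs along the lines of the sketch below. The generator name page_params is hypothetical, and the bounds are taken literally from the text above (limit of at most 300, limit + offset strictly less than 2000).

```python
MAX_LIMIT = 300     # per-request cap stated above
MAX_WINDOW = 2000   # limit + offset must stay below this value


def page_params(limit=20):
    """Yield (offset, limit) pairs that stay within the documented bounds."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    offset = 0
    while True:
        # Shrink the final page so that offset + size stays below MAX_WINDOW.
        size = min(limit, MAX_WINDOW - 1 - offset)
        if size <= 0:
            break
        yield offset, size
        offset += size


pages = list(page_params(300))
print(pages)
```

In practice a client would also stop as soon as a page comes back with fewer documents than requested. Because each query is evaluated independently, documents can shift between pages if publishing activity occurs while paging.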
Valid pagination query parameters
Name | Max value | Default | Description |
---|---|---|---|
limit | 300 | 20 | The maximum number of documents to return in this query |
offset | 2000 | 0 | Where to “start” the subset of documents returned by this query. This value is 0-based. |
Sorting
When documents are returned by a CDS query, they are ordered by their publishDateTime attribute, starting from most recent to least recent (“descending”). The sort parameter can alter this ordering.
Sorting query parameters
The order in which documents are returned by a query is determined by the query’s sort. By default, queries are sorted by their publishDateTime attribute, descending (most recent first, least recent last).
This sort can be changed using the sort parameter. The sort parameter takes the following form:
sort=<type>[:<direction>[:<missing>]][,<type2>[:<direction2>[:<missing2>]]]
Parameter Name | Required? | Function |
---|---|---|
type | Yes | The name of the field to be sorted on. (See the valid sort types below.) |
direction | No | The desired sorting order: asc (ie. smallest → largest) or desc (ie. largest → smallest). Defaults to desc when omitted. |
missing | No | The desired behavior for documents that are missing the property being sorted on: first or last. (See the section on the optional :first and :last params below.) |
CDS does support sorting on multiple fields (ie. multi-dimensional sort). The exact details of that behavior are outlined below.
An example of descending publishDateTime sort would be:
sort=publishDateTime:desc
The type value determines the method of sorting the documents. If present, the direction value determines whether the documents will be sorted in ascending or descending order. The direction value is optional, and not all sort types support a direction; see editorial sort below.
When valid and present, the direction value may be either asc or desc for “ascending” or “descending”, respectively. When valid but missing, direction defaults to desc.
An example using the optional missing param would be:
...sort=seasonNumber:desc:first
This means that our results will be sorted by the seasonNumber field, in descending order (ie. highest → lowest), with documents that are missing a seasonNumber placed first.
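The sort syntax above can be assembled programmatically. The helper below is a small sketch with a hypothetical name (build_sort); it only formats and sanity-checks the value of the sort parameter according to the form shown above and does not validate sort types against CDS.

```python
VALID_DIRECTIONS = {"asc", "desc"}
VALID_MISSING = {"first", "last"}


def build_sort(*fields):
    """Build a sort value from (type,), (type, direction) or
    (type, direction, missing) tuples, in priority order."""
    parts = []
    for field in fields:
        sort_type, *rest = field
        direction = rest[0] if len(rest) > 0 else None
        missing = rest[1] if len(rest) > 1 else None
        if direction is not None and direction not in VALID_DIRECTIONS:
            raise ValueError(f"invalid direction: {direction}")
        if sort_type == "editorial" and direction is not None:
            raise ValueError("editorial sort does not take a direction")
        if missing is not None:
            # The syntax places missing after direction, so direction is needed.
            if direction is None:
                raise ValueError("missing requires an explicit direction")
            if missing not in VALID_MISSING:
                raise ValueError(f"invalid missing value: {missing}")
        parts.append(":".join(p for p in (sort_type, direction, missing) if p))
    return ",".join(parts)


print(build_sort(("seasonNumber", "asc", "first"), ("episodeNumber", "asc")))
```

The order of the tuples matters, since it becomes the order of the fields in the sort value.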
Editorial sort
The editorial sort is a method of ordering CDS documents within a collection. It may not be used with a direction, and must be used in conjunction with a single collectionIds query parameter (see Filtering for more details on this parameter).
When given, editorial sort will place ordered content before unordered content. When querying for a collection, ordered content will be returned first in the order it appears in the items array, followed by all unordered content sorted by publishDateTime.
For more information on collections, see the Collections page.
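The ordering rule for editorial sort can be illustrated locally. The sketch below makes several assumptions that this page does not confirm: the items array is modeled as a plain list of document IDs, documents are dicts with id and publishDateTime keys, and unordered content is sorted most recent first (matching the default publishDateTime direction).

```python
def editorial_order(items, docs):
    """Place documents listed in items first (in items order),
    then the remaining documents by publishDateTime."""
    position = {doc_id: index for index, doc_id in enumerate(items)}
    ordered = sorted(
        (d for d in docs if d["id"] in position),
        key=lambda d: position[d["id"]],
    )
    # Assumption: unordered content is returned most recent first.
    unordered = sorted(
        (d for d in docs if d["id"] not in position),
        key=lambda d: d["publishDateTime"],
        reverse=True,
    )
    return ordered + unordered


docs = [
    {"id": "a", "publishDateTime": "2022-01-01T00:00:00Z"},
    {"id": "b", "publishDateTime": "2022-03-01T00:00:00Z"},
    {"id": "c", "publishDateTime": "2022-02-01T00:00:00Z"},
    {"id": "d", "publishDateTime": "2022-04-01T00:00:00Z"},
]
print([d["id"] for d in editorial_order(["c", "a"], docs)])
```

Here c and a come first because they appear in items, followed by d and b in descending publish order.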
Valid sort types
Name | Supports direction? | Description |
---|---|---|
editorialLastModifiedDateTime | Yes | Sorts by each document’s editorialLastModifiedDateTime value. For this sort, desc means “most recent first” and asc means “least recent first”. |
editorial | No | Sorts the CDS documents “editorially”; see the “Editorial Sort” section above. |
publishDateTime | Yes | Sorts by each document’s publishDateTime value. For this sort, desc means “most recent first” and asc means “least recent first”. |
showDates | Yes | Sorts by each document’s most recent entry in the showDates array. For this sort, desc means “most recent first” and asc means “least recent first”. |
seasonNumber | Yes | If a document has a seasonNumber field, sort by this field in the chosen direction. For this sort, asc means “oldest first” and desc means “newest first.” (This sort is most useful when querying for podcast-episode docs.) |
episodeNumber | Yes | If a document has an episodeNumber field, sort by this field in the chosen direction. For this sort, asc means “oldest first” and desc means “newest first.” (This sort is most useful when querying for podcast-episode docs.) |
More on CDS sorting behavior
Here is part of a CDS query that demonstrates the default CDS sorting behavior:
...sort=seasonNumber:asc,episodeNumber:asc
The results of this query will look like this:
seasonNumber | episodeNumber |
---|---|
1 | 1 |
1 | 2 |
1 | 3 |
1 | (Missing episodeNumber ) |
… | … |
2 | 1 |
2 | 2 |
2 | 3 |
2 | (Missing episodeNumber ) |
… | … |
(Missing seasonNumber ) | (Missing episodeNumber either) |
As we see, CDS has ordered the list of matching documents based on the parameters provided: first by seasonNumber (“ascending”, or smallest → largest), then by episodeNumber (also “ascending”).
IMPORTANT NOTE
The order in which the sort query params are specified in the query URL absolutely matters. CDS will read these query parameters in order. If we were to invert the order of the fields we were sorting on (ie. ...sort=episodeNumber:asc,seasonNumber:asc...), our results would look like this instead:
seasonNumber | episodeNumber |
---|---|
1 | 1 |
2 | 1 |
1 | 2 |
2 | 2 |
1 | 3 |
2 | 3 |
(Missing seasonNumber ) | (Missing episodeNumber ) |
You can think of this as CDS saying: “episodeNumber is the most important field to sort on (ascending), and if two episodes have the same episodeNumber, only then will I sort by seasonNumber (ascending).”
IMPORTANT NOTE
The effects of the optional :first and :last params (detailed below) are independent of sort direction; ie. docs with missing sort fields will show up where you specify regardless of whether you are sorting in ascending or descending order.
Sometimes a client may want to alter this default behavior. This is where the optional :first and :last params come in.
Optional :first and :last params
These are both additional, optional sort parameters that will explicitly tell CDS how to handle sorting of documents that do not contain the fields being sorted on.
The :first and :last params offer CDS clients a way to specify whether documents missing some or all of the fields being sorted on should appear at the beginning or end of the list of results, or of a group of results (if multiple fields are being sorted on).
Example queries using the :first and :last params
Base behavior example: sorting on only seasonNumber with NO :first and :last
Here is an example query where the results are being sorted only by seasonNumber (ascending):
...sort=seasonNumber:asc
The resulting list of CDS docs will look like this:
seasonNumber | episodeNumber |
---|---|
1 | (Missing episodeNumber ) |
1 | 1 |
1 | 2 |
… | … |
2 | 1 |
2 | 2 |
… | … |
(Missing seasonNumber ) | (Missing episodeNumber ) |
And now WITH :first and :last specified
...sort=seasonNumber:asc:first
The resulting list of CDS docs will look like this:
seasonNumber | episodeNumber |
---|---|
(Missing seasonNumber ) | 1 |
1 | 1 |
1 | 2 |
… | … |
2 | 1 |
2 | 2 |
… | … |
:first and :last when sorting on multiple fields
These two optional params allow us to control the ‘What to do if a field is missing on a doc’ behavior individually for each field being sorted on. Note the difference in these two queries and their resulting CDS results:
...sort=seasonNumber:asc:first,episodeNumber:asc:first
seasonNumber | episodeNumber |
---|---|
(Missing seasonNumber ) | (Missing episodeNumber ) |
1 | (Missing episodeNumber ) |
1 | 1 |
1 | 2 |
2 | (Missing episodeNumber ) |
2 | 1 |
2 | 2 |
...sort=seasonNumber:asc:first,episodeNumber:asc:last
seasonNumber | episodeNumber |
---|---|
(Missing seasonNumber ) | (Missing episodeNumber ) |
1 | 1 |
1 | 2 |
1 | (Missing episodeNumber ) |
2 | 1 |
2 | 2 |
2 | (Missing episodeNumber ) |
:first and :last when sorting on multiple fields, “descending” order
...sort=seasonNumber:desc:first,episodeNumber:desc:first
The resulting list of CDS docs will look like this:
seasonNumber | episodeNumber |
---|---|
(Missing seasonNumber ) | (Missing episodeNumber ) |
2 | (Missing episodeNumber ) |
2 | 2 |
2 | 1 |
1 | (Missing episodeNumber ) |
1 | 2 |
1 | 1 |
Different direction and missing values for each field
...sort=seasonNumber:asc:first,episodeNumber:desc:last
The resulting list of CDS docs will look like this:
seasonNumber | episodeNumber |
---|---|
(Missing seasonNumber ) | (Missing episodeNumber ) |
1 | 2 |
1 | 1 |
1 | (Missing episodeNumber ) |
2 | 2 |
2 | 1 |
2 | (Missing episodeNumber ) |
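The tables above can be reproduced with a short local simulation. This is a sketch of the ordering rules as described on this page, not the CDS implementation; the function name cds_like_sort and the dict-shaped documents are assumptions. It applies the least significant field first and relies on Python's stable sort so that earlier passes survive as tie-breakers.

```python
def cds_like_sort(docs, specs):
    """Sort docs by specs, a list of (field, direction, missing) tuples
    given in priority order (most important field first)."""
    result = list(docs)
    # Least significant field first; stable sorting preserves earlier passes.
    for field, direction, missing in reversed(specs):
        present = [d for d in result if d.get(field) is not None]
        absent = [d for d in result if d.get(field) is None]
        present.sort(key=lambda d: d[field], reverse=(direction == "desc"))
        # Placement of docs missing the field is independent of direction.
        result = absent + present if missing == "first" else present + absent
    return result


def rows(docs):
    """Project docs to (seasonNumber, episodeNumber) tuples for display."""
    return [(d.get("seasonNumber"), d.get("episodeNumber")) for d in docs]


pairs = [(1, 1), (2, None), (None, None), (1, 2), (2, 1), (1, None), (2, 2)]
docs = [{"seasonNumber": s, "episodeNumber": e} for s, e in pairs]

result = cds_like_sort(
    docs,
    [("seasonNumber", "asc", "first"), ("episodeNumber", "desc", "last")],
)
for row in rows(result):
    print(row)
```

With sort=seasonNumber:asc:first,episodeNumber:desc:last this prints the rows in the same order as the last table above, with None standing in for a missing field.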