SQL Filtering
The data sources in a collector line offer filtering of datasets with an SQL query. SQL filtering provides a simple way to select datasets on criteria using standard syntax. This syntax can also handle multivalued attributes, called 'lists of values,' later in this chapter.
The operating principle of the SQL query is to consider the dataset that has just been created (for example, from a file) as a table containing a single row of data. The query is executed on each dataset. The query returns either the same dataset or no dataset which means that the values present do not meet the criteria expressed in the query.
SELECT * FROM dataset WHERE dataset.unique_ID <> 'VIP'
In the example above, the query is executed on each dataset created by the source. If the registration number contains the value 'VIP', the dataset is rejected because the criteria in the WHERE clause specifies that the registration number must not contain 'VIP'.
Note that it is possible to address configuration variables of the project or of the collector line by prefixing their name with config as shown in the following example for the reject variable:
SELECT * FROM dataset WHERE dataset.unique_ID <> config.reject
Query Syntax
Only SELECT queries are authorized for SQL filtering. When the query is executed, the current dataset is considered as an SQL table whose columns are the dataset attributes. This table contains only one line consisting of attribute values. The main difference with a real SQL table is that each column may contain a list of values (corresponding to a multivalued attribute of the dataset).
Main source
In the FROM clause of the SQL query, the only allowed table name is 'dataset'. The *
character in the projection part returns all of the attributes of the dataset. However, it is possible to return only a part of the attributes of the dataset as shown in the following example:
SELECT unique_ID, first_name, last_name FROM dataset
The source, when reading the file, creates a dataset with a certain number of attributes and then the SQL query is executed and only the unique ID, first and last name attributes are retained in the dataset transmitted to the next component.
It is also possible to create a brand new attribute in the projection area as shown in the following example with the new attribute full_name
:
SELECT unique_ID, first_name, last_name, concat(first_name, ' ', last_name) AS full_name FROM dataset
This method for adding a calculated attribute to the dataset, however, is not recommended because it bypasses the attribute declaration in the pattern. The direct consequence is that the collector editor will not suggest this name in the attribute selectors.
Secondary source
In a source activated through a join component, two sets of data are handled:
dataset
: dataset created by the secondary sourceparam
: dataset passed as a parameter by the join component
In a secondary source, the tables listed in the FROM clause may be dataset, param, or both. As before, it is also possible to restrict the attributes returned but they may come from dataset or from param as shown in the following query:
SELECT unique_ID, first_name, last_name, param.date_arrival, param.date_departure FROM dataset
The attribute names without a table name (unique_ID
, first_name
, last_name
) come from the table mentioned in the FROM clause (dataset) whereas the two attributes date_arrival
and date_departure
come explicitly from the param dataset. It is possible to retrieve the entire dataset and some attributes of param as shown in the following query:
SELECT *, param.date_arrival, param.date_departure FROM dataset
Finally, if you wish to retrieve all of the attributes (dataset and param), both of the following syntaxes work:
SELECT * FROM dataset, param
SELECT *, param.* FROM dataset
Operators
Operators allow you to make comparisons (equal, less than, greater than, ...) or to combine operations through logical operators (and, or, ...).
Operators work on simple values as well as on lists of values. To obtain a list of values, the attribute must be declared as being multivalued in the pattern and the input file must contain multiple values for a single CSV column or a single LDIF attribute. The result of the comparison depends on the operator used. Here is the explanation for the three main operators with an example of syntax:
- '=' Operator: It is a strict equality operator, which can compare two simple values or two lists. For the operator to return true, the two operands must possess the same number of values, and the values must be identical. If the objective is to compare a value and a list, you must use the IN operator or the LIKE operator. The following example shows the comparison of two multivalued attributes:
SELECT * FROM dataset WHERE dataset.unique_ID = dataset.userid
IN
Operator: The IN operator is useful to verify whether a value is in a list. The following example shows the syntax of the IN operator to test whether one of the values of the unique ID attribute is equal to ‘EXT0001’:
SELECT * FROM dataset WHERE 'EXT0001' IN (dataset.unique_ID)
Note that the left operand of the IN operator may also be a list. In this case, the operation performed is an inclusion test. The result is true if all the values of the left operand are found in the list of values of the right operand. The syntax is as follows:
SELECT * FROM dataset WHERE dataset.userid IN (dataset.unique_ID)
LIKE
Operator: TheLIKE
operator allows you to use wildcards like%
or_
. If the left operand is a list, theLIKE
operator returns true if one of the values of the list respects the regular expression in the right operand. The syntax is:
SELECT * FROM dataset WHERE dataset.unique_ID LIKE 'EXT%'
Important: All the operators are case-sensitive. If the unique ID contains 'EXT0001', none of the following queries returns the record in question:
SELECT * FROM dataset WHERE dataset.unique_ID = 'ext0001'
SELECT * FROM dataset WHERE dataset.unique_ID LIKE 'ext%'
SELECT * FROM dataset WHERE 'ext0001' in (dataset.unique_ID)
To make the query non-case-sensitive in the previous example, you must use the upper or lower functions to convert the unique ID:
SELECT * FROM dataset WHERE lower(dataset.unique_ID) = 'ext0001'
SELECT * FROM dataset WHERE lower(dataset.unique_ID) LIKE 'ext%'
SELECT * FROM dataset WHERE 'ext0001' in (lower(dataset.unique_ID))
Note that most of the functions (such as lower, upper, trim,...) may be applied to single-valued or multivalued attributes.
Arithmetic operator
The list of supported arithmetic operators is:
+
for addition-
for subtraction*
for multiplication/
for division
These operators only work on numerical operands. Lists of values are not supported. To ensure the order of operations, it is possible to use parentheses in the expression. Here are some examples using these operators:
SELECT * FROM dataset WHERE len(dataset.unique_ID) - 1 < 3
SELECT * FROM dataset WHERE 2 + (dataset.counter / 10) > 5
Comparison operator
The comparison operators are:
=
: true if both operands are equal<>
or!=
: true if both operands are different<
: true if the first operand is strictly less than the second operand<=
: true if the first operand is less than or equal to the second operand>
: true if the first operand is strictly greater than the second operand>=
: true if the first operand is greater than or equal to the second operand
The first two operators (=
and <>
) operate on values or lists of values. These strict comparisons return true only if the values of two operands are identical (for the =
operator) or all different (for the <>
operator).
The other operators do not work on lists of values. Both operands must be simple values. If the comparison involves two values of different types, the second operand is automatically converted to the type of the first operand during the comparison. Here are some examples using these operators:
SELECT * FROM dataset WHERE dataset.unique_ID = 'INT0001'
SELECT * FROM dataset WHERE dataset.date_departure < now()
Logical operator
The supported logical operators are:
AND
: true if both operands are trueOR
: true if one of the two operands is trueNOT
: true if the operand is false
These operators only work on Boolean operands. Lists of values are not supported. These operators all return a Boolean result. Here are some examples using these operators:
SELECT * FROM dataset WHERE len(dataset.unique_ID) > 3 AND dataset.unique_ID LIKE 'I%'
SELECT * FROM dataset WHERE NOT (upper(dataset.unique_ID) LIKE 'EXT%' OR dataset.unique_ID= 'VIP')
IS NULL operator
Tests whether a value is null.
Syntax:
<value_or_list> IS NULL
Result:
- true if
value_or_list
is null, false otherwise. Ifvalue_or_list
is a list, the result is true if the list is empty or if the list only contains null values.
Operand:
value_or_list
: String or Number or Boolean or Date or List The operand may be a simple value or a list of values.
Example calls:
SELECT * FROM dataset WHERE dataset.unique_ID IS NULL
SELECT * FROM dataset WHERE dataset.unique_ID IS NOT NULL and trim(upper(dataset.unique_ID)) = 'EXT0001'
LIKE operator
Tests whether a value respects a regular expression.
Syntax:
<left_value_or_list>``LIKE <right_value_or_list> [ESCAPE <escape_value>]
Result:
true
if theleft_value_or_list
respects the regular expression present inright_value_or_list
, otherwisefalse
. Ifleft_value_or_list
is a list, the result is true if at least one value in the list respects this regular expression inright_value_or_list
. Ifright_value_or_list
is also a list, the result is true if at least one value in theleft_value_or_list
respects one of the regular expressions present inright_value_or_list
.
Operand:
left_value_or_list
: String or Number or Boolean or Date or List Single value or list of values which must respect the regular expression. If the operand is a list of values, the result is true if at least one value respects the regular expression.right_value_or_list
: String or Number or Boolean or Date or List Regular expression using wildcards_
,%
and[]
. If the operand is a list of regular expressions, the result is true if at least one of the values of theleft_value_or_list
operand respects one of the regular expressions of the list. The meaning of the wildcards is as follows:_
character: corresponds to any single character. For example, the value'EXT0001'
respects the regular expressions'EX_0001'
and'EXT___1'
%
character: matches any sequence of characters. For example, the value'EXT0001'
respects the regular expressions'EXT%'
,'EXT%1'
,'EXT0001%'
,'%000%'
and'EXT000%1'
[]
characters: correspond to a list of characters. For example, the value'EXT0001'
respects the regular expressions'[AEIOU]XT0001'
,'EXT000[0123456789]'
. Warning, syntax designating a range of characters, such as'[0-9]'
is not supported.
It is of course possible to combine different wildcards in the same regular expression. For example, the value 'EXT0001'
respects the regular expressions '_XT000[0123456789]' and 'EXT___1%'
escape_value
: String String containing the escape character if one of the wildcards of the regular expression must be interpreted as a simple character. For example, if theleft_value_or_list
operand contains the value'jean_pierre'
and you need to test for the presence of the'_'
character, you must indicate to theLIKE
operator that it should consider the'_'
character as a simple character and not as a wildcard. For this, the'_'
character is preceded by an escape character (like'\'
) which you must mark as such to theLIKE
operator with the keywordESCAPE
followed by the escape character.
Example calls:
SELECT * FROM dataset WHERE lower(dataset.unique_ID) LIKE 'ext%'
SELECT * FROM dataset WHERE dataset.unique_ID LIKE '\_' ESCAPE '\'
SELECT * FROM dataset WHERE dataset.unique_ID NOT LIKE 'EXT000[0123456789]'
BETWEEN operator
Test whether a value is in a range.
Syntax:
<value>
BETWEEN<low_limit>
AND<high_limit>
Result:
- true if the value is greater than or equal to
low_limit
and less than or equal tohigh_limit
, false otherwise. The BETWEEN operator does not support lists of values in any of the operands. If the data type is different between one value and one of the limits, an implicit conversion of the limit is performed in order to perform the comparison.
Operands:
value: String
orNumber
orBoolean
orDate
. The value to be compared.low_limit: String
orNumber
orBoolean
orDate
. The lowest authorized limit for the value. If this value is not the same type as the value, an implicit conversion of thelow_limit
is performed before the comparison.high_limit: String
orNumber
orBoolean
orDate
. The highest authorized limit for the value. If this value is not of the same type as the value, an implicit conversion of thehigh_limit
is performed before the comparison.
Example calls:
SELECT * FROM dataset WHERE dataset.arrival BETWEEN '20080101000000' AND now()
SELECT * FROM dataset WHERE len(dataset.unique_ID) NOT BETWEEN 6 AND 10
In operator
Tests whether a value is present in a list of values.
Syntax:
<value_or_list> IN (<value_set>)
Result:
true
if the value ofvalue_or_list
is present invalue_set
, false otherwise. Ifvalue_or_list
is a list, the Result istrue
if all the values in the list are present invalue_set
.
Operands:
-
value_or_list
: String or Number or Boolean or Date or List. Single value or list of values that must be present invalue_set
. If the operand is a list of values, the Result is true if all of the values are present invalue_set
. -
value_set
: String or Number or Boolean or Date or List. List of values separated by commas. Each value may be a single value or a list of values.
Example calls:
SELECT * FROM dataset WHERE dataset.unique_ID IN ('EXT0001', 'EXT0002', 'EXT0003')
SELECT * FROM dataset WHERE dataset.unique_ID NOT IN ('EXT0001', dataset.userid)
SELECT * FROM dataset WHERE 'EXT0001 IN (dataset.userid)
Functions
In some cases, filtering records requires a calculation, conversion or data transformation. The functions in this chapter are mainly used in the where clause. They may be combined in order to combine several operations. The following example shows the transformation of a string to uppercase and the removal of spaces from beginning and end:
SELECT * FROM dataset WHERE trim(upper(dataset.unique_ID)) = 'EXT0001'
An important point to watch is the processing of null values. In the example above, if the unique ID may be null (not given in the input file), then it becomes imperative to test the nullity in the where clause before calling a function, as shown in the following example:
SELECT * FROM dataset WHERE dataset.unique_ID is not null and trim(upper(dataset. unique_ID)) = 'EXT0001'
Without the test of nullity, the line collector stops running with an exception caused by calling a function on a null value.
Also note that most functions treat single values as well as lists. If a unique ID is declared as multivalued, the previous query does not work because of the strict equality comparison. If the unique ID is a list of three values, the equality operator returns false because it would have to have two list type operands with the same values to return true. To pass the test successfully if the value 'EXT0001
' is part of the list, the request must be transformed as follows:
SELECT * FROM dataset WHERE dataset.unique_ID is not null and 'EXT0001' in (trim(upper(dataset.unique_ID)))
In this example, if the unique ID is a list of three values, they are transformed to uppercase by the upper function which returns the list of three values, then the trim function removes the spaces from the three values and returns the list of three values, and finally the 'in' operator searches for the 'EXT0001
' value.
The LIKE operator also works on a list of values. Another way to formulate the query to obtain the same Result is as follows:
SELECT * FROM dataset WHERE dataset.unique_ID is not null and trim(upper(dataset.unique_ID)) LIKE 'EXT0001'
If the desired value is any unique ID starting with EXT, the previous query may be transformed as follows to use wildcards:
SELECT * FROM dataset WHERE dataset.unique_ID is not null and trim(upper(dataset.unique_ID)) LIKE 'EXT%'
Ascii function
Returns the Unicode code of the first character of the string passed as parameter.
Signature:
ascii(value): Number
Return value:
- a string consisting of only one character whose code is the value
Parameters:
- value: String. String whose first character must be transformed into Unicode.
Example call:
SELECT * FROM dataset WHERE ascii('A') = 65
SELECT * FROM dataset WHERE left(dataset.unique_ID, 1) = ascii('E')
Char function
Returns a string consisting of only one character whose Unicode code is passed as a parameter.
Signature:
char(value)
: String
Other signature:
chr(value)
: String
Return value:
- a string consisting of only one character whose code is the value
Parameters:
- value: Number. Numerical value corresponding to the Unicode code of the character to be transformed into a string.
Example call:
SELECT * FROM dataset WHERE char(65) = 'A'
SELECT * FROM dataset WHERE left(dataset.unique_ID, 1) = char(69)
Concat function
Concatenates one or more strings to another string or to a list.
Signature:
concat(value_or_list, str[, ...])
: String or List
Return value:
- The string with the concatenated values if the
value_or_list
parameter is a string. If thevalue_or_list
parameter is a list of strings, a new list is returned, after the concat function has been applied to each value.
Parameters:
-
value_or_list
: String or List. String to be modified. If the parameter is a list of strings, all the values of the list undergo the application of the concat function. -
str
: String (varying number of parameters). String(s) to be concatenated.
Example call:
SELECT * FROM dataset WHERE concat(dataset.unique_ID, 'ab', 'cd') LIKE '%abcd'
Format function
Converts a date to a string, respecting the format given as a parameter.
Signature:
format(value, pattern): String
Return value:
- a string containing the date in the requested format
Parameters:
value: Date.
Date to be transformed into a string.pattern: String.
String of characters giving the conversion format. The format respects the specification of the Java java.text.SimpleDateFormat class. The most frequently used characters are:- y: year
- M: month
- d: day of the month
- H: hours
- m: minutes
- s: seconds
Examples of formats:
'dd/MM/yy HH:mm:ss' gives '23/05/11 15:40:12'
'dd MMM yyyy' gives '23 May 2011'
Example call:
SELECT * FROM dataset WHERE format(dataset.arrival, 'yyyy') = '2011'
Insert function
Inserts a string into another string or into a list.
Signature:
insert(value_or_list, start, len, str): String or List
Return value:
- The string completed with the string to be inserted if the
value_or_list
parameter is a string. If thevalue_or_list
parameter is a list of strings, a new list is returned after the insert function has been applied to each value.
Parameters:
value_or_list: String
orList
. String to be completed. If the parameter is a list of strings, all the values of the list undergo the application of the insert function.start: Number.
Starting index of the insertion in thevalue_or_list
string. Index 1 corresponds to the first character. If this value is greater than or equal to the length of thevalue_or_list
string, the two strings are concatenated.len: Number.
Number of characters to replace in thevalue_or_list
string. The value 0 allows the insertion of the string. A value greater than 0 performs a replacement in thevalue_or_list
string.str: String.
String to be inserted.
Example call:
SELECT * FROM dataset WHERE insert(dataset.unique_ID, 4, 0, 'xx') LIKE 'intxx%'
Instr function
Looks for the index of a certain occurrence of the specified value in a string.
Signature:
instr(full, search[, start[, repeat]])
: Number
Return value:
- The index of the occurrence of the specified value or 0 if it is not found
Parameters:
full
: String. String in which the search is carried out.search
: String. String of characters to search for.start
: Number (optional). Starting index of the search. Index 1 corresponds to the first character.repeat
: Number (optional). Number of the occurrence. Numbering starts at 1 to obtain the first occurrence.
Example call:
SELECT * FROM dataset WHERE instr(dataset.unique_ID, 'xt') = 2
Left function
Extracts a certain number of characters from a string or list of strings starting with the first position.
Signature:
left(value_or_list, len)
: String or List
Return value:
- A string containing the number of characters requested starting from the first position if the
value_or_list
parameter is a string. If thevalue_or_list
parameter is a list of strings, a new list is returned after the left function has been applied to each value.
Parameters:
value_or_list
: String or List. String, of which a part is extracted. If the parameter is a list of strings, all the values of the list undergo the left function.- len: Number. A whole number between 0 and the length of the string.
Example call:
SELECT * FROM dataset WHERE left(dataset.unique_ID, 3) = 'Ext'
Len function
Returns the number of characters in a string.
Signature:
len(value)
: Number
Other signature:
character_length(value)
: Number
Other signature:
char_length(value)
: Number
Return value:
- The number of characters in the string
Parameters:
value
: String. String whose size is to be calculated.
Example call:
SELECT * FROM dataset WHERE len(dataset.unique_ID) > 3
Listget function
Extracts an element from a list of values.
Signature:
listget(value_or_list, index)
: String or Number or Boolean or Date
Return value:
- The value in the list at the specified index. The type of the return value is the same as the type of the list elements.
Parameters:
value_or_list
: String or Number or Boolean or Date or List. If the parameter is a list of values, the value present at the specified index is returned. If the value is a single value, it is returned if the index passed as a second parameter is 1.index
: Number. Index of the desired value in the list. The first value is located at index 1.
Example call:
SELECT * FROM dataset WHERE listget(dataset.unique_ID, 1) = 'EXT0001'
Listgetstring function
Extracts an element from a list of values and converts it to a string.
Signature:
listgetstring(value_or_list, index)
: String
Return value:
- The value in the list at the specified index, converted into a string.
Parameters:
-
value_or_list:
String or Number or Boolean or Date or List. If the parameter is a list of values, the value at the specified index is returned after string conversion. If the value is a single value, it is converted into a string and returned if the index passed as a second parameter is 1. -
index:
Number. Index of the desired value in the list. The first value is located at index 1.
Example call:
SELECT * FROM dataset WHERE listgetstring(dataset.unique_ID, 1) = 'EXT0001'
Listsize function
Returns the number of elements in a list of values.
Signature:
listsize(value_or_list)
: Number
Return value:
- The number of elements if the
value_or_list parameter
is a list. If thevalue_or_list
parameter is not a list, the function returns 1 if the value is not null or 0 if the value is null.
Parameters:
value_or_list
: String or Number or Boolean or Date or List. Parameter whose size is to be calculated. The function is especially useful if the parameter is a list of values (attribute declared as multivalued) but it also works with single or null values.
Example call:
SELECT * FROM dataset WHERE listsize(dataset.unique_ID) = 2
Lower function
Converts a string or a list of strings to lowercase.
Signature:
lower(value_or_list)
: String or List
Other signature:
lcase(value_or_list)
: String or List
Return value:
- The string in lowercase if the
value_or_list
parameter is a string. If thevalue_or_list
parameter is a list of strings, a new list is returned with all values converted to lowercase
Parameters:
value_or_list
: String or List. String to be converted to lowercase. If the parameter is a list of strings, all the values of the list are converted to lowercase.
Example call:
SELECT * FROM dataset WHERE lower(dataset.unique_ID) LIKE 'ext%'
Lpad function
Adds a repetition of characters to the beginning of a string or list of strings.
Signature:
lpad(value_or_list, num[, pattern])
: String or List
Return value:
- string complemented with a repetition of characters at the beginning if the
value_or_list
parameter is a string. If thevalue_or_list
parameter is a list of strings, a new list is returned, after thelpad
function has been applied to each value.
Parameters:
value_or_list
: String or List. String to be completed at the beginning. If the parameter is a list of strings, all the values of the list undergo thelpad
function.num
: Number. Number of repetitions of the pattern string or of a space if the pattern is not specified.pattern
: String (optional). String to be repeated in the beginning. If the parameter is omitted, the repeated pattern is the space character.
Example call:
SELECT * FROM dataset WHERE lpad(dataset.unique_ID, 3, 'Z') LIKE 'ZZZint%'
Ltrim function
Returns a string or a list of string without leading spaces.
Signature:
ltrim(value_or_list)
: String or List
Return value:
- The string without leading spaces if the
value_or_list
parameter is a string. If thevalue_or_list
parameter is a list of strings, a new list is returned after theltrim
function is applied to each value.
Parameters:
value_or_list
: String or List. String from which leading spaces must be removed. If the parameter is a list of strings, all the values of the list undergo theltrim
function.
Example call:
SELECT * FROM dataset WHERE ltrim(dataset.unique_ID) LIKE 'int%'
Now function
Returns the current date.
Signature:
now()
: Date
Other signature:
localtimestamp()
: Date
Return value:
- The current date
Parameters:
- none
Example call:
SELECT * FROM dataset WHERE dataset.departure is null or dataset.departure > now()
Number function
Converts Boolean, date or string data into a number.
Signature:
number(value_or_list)
: Number or List
Return value:
- A number representing the converted data if the
value_or_list
parameter is not a list. If thevalue_or_list
parameter is a list, a new list is returned after the number function has been applied to each value.
Parameters:
value_or_list
: String or Number or Boolean or Date or List. Data of any type to be converted to a number. If the parameter is a list, all the values of the list undergo the number function.
Example call:
SELECT * FROM dataset WHERE number('123') = 123
Position function
Looks for the index of the first occurrence of the specified value within a string.
Signature:
position(search, full[, start])
: Number
Other signature:
locate(search, full[, start])
: Number
Return value:
- The index of the first occurrence of the specified value, or 0 if it is not found
Parameters:
search
: String. String of characters to look for.full
: String. String of characters in which to look.start
: Number (optional). Starting index of the search. Index 1 corresponds to the first character.
Example call:
SELECT * FROM dataset WHERE position('xt', dataset.unique_ID) = 2
Repeat function
Creates a string by repeating another string several times.
Signature:
repeat(value, num)
: String
Return value:
- The string constituted by a repetition of the value parameter
Parameters:
value
: String. String to be repeated several times.num
: Number. Number of repetition of the value string.
Example call:
SELECT * FROM dataset WHERE repeat(dataset.unique_ID, 3) = 'int123int123int123'
Replace function
Replaces all occurrences of a string within another string or in a list.
Signature:
replace(value_or_list, search, str)
: String or List
Return value:
- The string with the values replaced if the
value_or_list
parameter is a string. If thevalue_or_list
parameter is a list of strings, a new list is returned after the replace function has been applied to each value.
Parameters:
value_or_list
: String or List. String to be modified. If the parameter is a list of strings, all the values of the list under the replace function.search
: String. String to search for.str
: String. Replacement string.
Example call:
SELECT * FROM dataset WHERE replace(dataset.unique_ID, 'int', 'xx') LIKE 'xx%'
Right function
Extracts a certain number of characters from a string or a list of strings starting from the last position.
Signature:
right(value_or_list, len)
: String or List
Return value:
- A string containing the number of characters requested starting from the last position if the
value_or_list
parameter is a string. If thevalue_or_list
parameter is a list of strings, a new list is returned after the right function has been applied to each value.
Parameters:
value_or_list
: String or List. String from which a part is extracted. If the parameter is a list of strings, all the values of the list undergo the right function.len
: Number. A whole number between 0 and the length of the chain.
Example call:
SELECT * FROM dataset WHERE right(dataset.unique_ID, 2) = '00'
Rpad function
Adds a repetition of characters at the end of a string or a list of strings.
Signature:
rpad(value_or_list, num[, pattern])
: String
Return value:
- The string completed with a repetition of characters at the end if the
value_or_list
parameter is a string. If thevalue_or_list
parameter is a list of strings, a new list is returned after therpad
function has been applied to each value.
Parameters:
value_or_list
: String or List. String to be completed at the end. If the parameter is a list of strings, all the values of the list undergo the rpad function.num
: Number. Number of repetitions of the pattern string or of a space if the pattern is not specified.value
: String (optional). String to be repeated at the beginning. If this parameter is omitted, the repeated pattern is the space character.
Example call:
SELECT * FROM dataset WHERE rpad(dataset.unique_ID, 3, 'Z') LIKE '%00ZZZ'
Rtrim function
Returns a string or a list of strings without trailing spaces.
Signature:
rtrim(value_or_list)
: String or List
Return value:
- The string without trailing spaces if the
value_or_list
parameter is a string. If thevalue_or_list
parameter is a list of strings, a new list is returned after thertrim
function has been applied to each value.
Parameters:
value_or_list
: String or List. String from which trailing spaces must be removed. If the parameter is a list of strings, all the values of the list undergo thertrim
function.
Example call:
SELECT * FROM dataset WHERE rtrim(dataset.unique_ID) LIKE '%00'
Space function
Create a string consisting of spaces.
Signature:
space(num)
: String
Return value:
- The string consisting of spaces
Parameters:
num
: Number. Number of spaces in the string.
Example call:
SELECT * FROM dataset WHERE space(3) = ' '
String function
Converts numerical, Boolean or date data into a string.
Signature:
string(value_or_list)
: String or List
Return value:
- A string containing the converted data if the
value_or_list
parameter is not a list. If thevalue_or_list
parameter is a list, a new list is returned after the string function has been applied to each value.
Parameters:
value_or_list
: String or Number or Boolean or Date or List. Data of any type to be converted into a string. If the parameter is a list, the string function is applied to all the values of the list.
Example call:
SELECT * FROM dataset WHERE string(123) = '123'
Substring function
Extracts a certain number of characters from a string or from a list of strings starting from a certain position.
Signature:
substring(value_or_list, start[, len])
: String or List
Other signatures:
mid(value_or_list, start[, len])
: String or Listsubstr(value_or_list, start[, len])
: String or List
Return value:
- A string containing the requested number of characters starting from the start position if the
value_or_list
parameter is a string. If thevalue_or_list
parameter is a list of strings, a new list is returned after the substring function has been applied to each value.
Parameters:
value_or_list
: String or List. String from which a part must be extracted. If the parameter is a list of strings, the substring function is applied to all the values in the list.start
: Number. A whole number between 1 and the string length.len
: Number (optional). A whole number between 0 and the string length. If this value is not provided, all the characters from the start position and up until the end of the string are returned.
Example call:
SELECT * FROM dataset WHERE substring(dataset.unique_ID, 1, 2) = 'xt'
The signature of the substring function is very different from the signature of the substring method of JavaScript or of Java. The last parameter is the length of the substring, whereas in JavaScript or Java, the last parameter is the index of the end of the substring. In addition, the position of the first character in the string is 1 in SQL, unlike Java or JavaScript where numbering starts at 0.
Trim function
Returns a string or a list of strings without leading or trailing spaces.
Signature:
trim(value_or_list)
: String or List
Return value:
- The string without leading or trailing spaces if the
value_or_list
parameter is a string. If thevalue_or_list
parameter is a list of strings, a new list is returned after the trim function has been applied to each value.
Parameters:
value_or_list
: String or List. String from which leading and trailing spaces are to be removed. If the parameter is a list of strings, the trim function is applied to all the values of the list.
Example call:
SELECT * FROM dataset WHERE trim(dataset.unique_ID) LIKE 'int%00'
Upper function
Converts a string or list of strings to uppercase.
Signature:
upper(value_or_list)
: String or List
Other signature:
ucase(value_or_list)
: String or List
Return value:
- The string in uppercase if the
value_or_list
parameter is a string. If thevalue_or_list
parameter is a list of strings, a new list is returned with all values converted to uppercase
Parameters:
value_or_list
: String or List. String to be converted to uppercase. If the parameter is a list of strings, all the values of the list are converted to uppercase.
Example call:
SELECT * FROM dataset WHERE upper(dataset.unique_ID) LIKE 'INT%'
Javascript calls
The SQL query may also call a JavaScript function in the script associated with the collector line. It is possible to pass parameters to the function. Here is an example call with a parameter:
SELECT * FROM dataset WHERE checkInternal(upper(dataset.unique_ID)) = 'internal'
The associated function returns the 'internal' string if the registration number is that of an in-house person:
function checkInternal(userId) {
if ((userId == null) || (userId == '')) {
return 'external';
}
var prefix = userId.substring(0, 3).toUpperCase();
if ((prefix == 'VIP') || (prefix == 'EXT')) {
return 'external';
}
return 'internal';
}
The current record being filtered is accessible through the predefined variable dataset.