String Functions

These function operate, as the category name implies, mostly on strings. Strings are made up from Unicode characters.

Function Description
ascii() Returns the numeric value of the left-most character
base64decode() Applies a BASE64 decoding to the given string
base64encode() Applies a BASE64 encoding to the given string
bin() Return a string containing binary representation of a number
btrim() Removes all spaces or given characters from a string
char_length() Returns the number of characters in a string
character_length() Same as char_length()
chr() Returns the character representing the given number
concat() Concatenates all given strings into one
concat_ws() Concatenates all strings with a given divider
find_in_set() Returns the index of a given string in a list of strings
hex() Converts a given number into a hexadecimal string
ilike() / ILIKE Case-insensitive version of like()
initcap() Capitalize all first letters of the words in a string
instr() Returns the index of the first occurrence of the given substring
iregexp() / IREGEXP Case-insensitive version of the regexp() function.
lcase() Same as lower() function
length() Returns the length of a string in bytes
like() / LIKE Simple pattern matching, returns true when match, else returns false
locate() Returns the position of the first occurrence of the given substring
lower() Converts the given string into lowercase characters
lpad() Adds a prefix (left) padding to a string
ltrim() Removes the leading spaces in a string
parse_url() Parses a URL and returns the requested part from it
regexp() / REGEXP Returns true if a string matches a pattern. else returns false
regexp_extract() Returns the n-th math group from the string
regexp_like() Similar to like() but using regular expressions to match
regexp_match_count() Returns the count of matches of a pattern in a string
regexp_replace() Replaces matches of a pattern in a string
repeat() Repeats the given string a number of times
reverse() Reverses the characters in a string
rlike() / RLIKE Same as regexp_like()
rpad() Appends a string the given number of times
rtrim() Removes trailing spaces from the given string
sets_intersect() Returns true if two lists of string intersect, else returns false
space() Returns a string containing a given number of spaces
split_part() Splits a string at given divider, returning the n-th part
strleft() Returns the n leading characters of a string
strright() Returns the n trailing characters of a string
substr() Returns a substring from a given string
substring() Same as the substr() function
translate() Replaces characters in a given string
trim() Removes all leading and trailing spaces of a string
ucase() Same as the upper() function
unhex() Converts a string with hexadecimal values into a string
upper() Returns the given string converted to uppercase characters

The following explains some of the functions and statements in more detail.

  • INT find_in_set(STRING, STRING)

    Returns the index of a given string in a list of strings.

    Example: Using the find_in_set() function

    SELECT find_in_set("ab", "ab,ab,ab,ade,cde"); 1 SELECT find_in_set("ab", "abc,xyz,abc,ade,ab"); 5

String Comparison Functions

The following describes the function related to simple string comparisons.

  • BOOLEAN ilike(STRING, STRING) and ILIKE

    Like the LIKE operator (and its like() function variant), this function and operator compares a column value with a pattern, and if it matches it returns true. Otherwise it will return false instead. Available as operator and function variant.

    The difference is, here the comparison is done case-insensitive.

    Example: Using the ILIKE statemen

    > SELECT * FROM okera_sample.users
    WHERE uid ILIKE "%abf%"
    
    uid                                     dob      gender  ccn
    0001BDD9-EABF-4D0D-81BD-D9EABFCD0D7D    8-Apr-84    F    3771-2680-8616-9487
    0082AA65-5ABF-4733-8FFA-EB44672402AF    8-Feb-77    M    4916-2897-7947-1976
    009723B0-50A8-4E79-ABFC-23D89C21F090    8-Oct-72    U    3447-9051-8148-3340
    ...
    
  • BOOLEAN like(STRING, STRING) and LIKE

    This function and operator compares a column value with a pattern, and if it matches it returns true. Otherwise it will return false instead. Available as operator and function variant.

    The only allowed wildcards are % and _, where the former is the same as a .* and the latter a . in regular expression syntax. You can escape these two characters to make the literals. The comparison is done case-sensitive.

    Example: Using the ILIKE statemen

    > SELECT count(*) FROM okera_sample.users
    WHERE uid LIKE "%abf%"
    0
    
    > SELECT "a" = "a ", "a" LIKE "a ", "a" = "A", "a" LIKE "A"
    [
        {
        "'a' = 'a '": false,
        "'a' like 'a '": false,
        "'a' = 'a'": false,
        "'a' like 'a'": false
        }
    ]
    

Regular Expressions

The following describes the function related to more complex string comparisons, using regular expressions.

  • BOOLEAN regexp(STRING, STRING) and REGEXP

    Returns true if a string matches a pattern. else returns false. Available as operator and function variant.

    Example: Using the special REGEXP operator

    > SELECT * FROM okera_sample.users
    WHERE ccn REGEXP "^37.*56$";
    uid                                     dob       gender  ccn
    006E4AFF-222A-4FE9-AE4A-FF222A0FE9A5    15-Jul-82    M    3771-0858-7088-7956
    01377A7B-01F0-496C-88CD-4935F17B31BA    11-Oct-60    M    3771-3523-1278-5756
    04725AC1-DC68-4CF1-9A88-BEA8C68E425E    31-Oct-90    M    3771-2033-3016-1056
    ...
    
  • STRING regexp_extract(STRING, STRING, BIGINT)

    Returns the n-th math group from the string. The function needs three parameters, which are

    • the string to search in,
    • the pattern to match against, and
    • the group number to return.

    Like regular expressions in programming languages, this function can handle basic, as well as complex patterns, including capturing groups. The third parameter indicates which group should be returned, where this is usually set to zero (0) for basic patterns without groups. For complex patterns, the group zero (0) is the full match across the entire pattern, while specific groups can be returned starting from index 1.

    Example: Using the regular expressions functions to extract data

    > SELECT regexp_extract("abxcy1234a", "a.x", 0);
    abx
    > SELECT regexp_extract("abxcy1234a", "(a.x)(.y.*)(3.*a)", 1)
    abx
    
  • BOOLEAN regexp_like(STRING, STRING)

    Similar to like() but using regular expressions to match. There are two variants of this function, one with two and another with three parameters, comprising

    • the string to search in,
    • the pattern to match, and
    • (optionally) the matching options.

    The following matching options are available and you can specifiy more than one:

    Option Description
    i Match case-insensitive
    c Match case-sensitive
    m Pattern is in POSIX egrep syntax
    n Enable newlines being matched by patterns

    Example: Using the regular expression based like functions

    > SELECT regexp_like("foobar", "[AEIOU]{2}", "i")
    true
    
    > SELECT regexp_like("this\nis\nnewline", "^.*$")
    false
    
    > SELECT regexp_like("this\nis\nnewline", "^.*$", "n")
    true
    
  • INT regexp_match_count(STRING, STRING)

    Returns the count of matches of a pattern in a string.

  • STRING regexp_replace(STRING, STRING, STRING)

    Replaces matches of a pattern in a string.

  • BOOLEAN rlike(STRING, STRING) and RLIKE

    Same as regexp_like().