All functions in this section search case-sensitively by default. Case-insensitive search is usually provided by separate function variants.Documentation Index
Fetch the complete documentation index at: https://private-7c7dfe99-page-updates.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
Case-insensitive search follows the lowercase-uppercase rules of the English language. E.g. Uppercased
i in the English language is
I whereas in the Turkish language it is İ - results for languages other than English may be unexpected.haystack) and the search string (referred to in this section as needle) are single-byte encoded text. If this assumption is
violated, no exception is thrown and results are undefined. Search with UTF-8 encoded strings is usually provided by separate function
variants. Likewise, if a UTF-8 function variant is used and the input strings are not UTF-8 encoded text, no exception is thrown and the
results are undefined. Note that no automatic Unicode normalization is performed, however you can use the
normalizeUTF8*() functions for that.
General strings functions and functions for replacing in strings are described separately.
The documentation below is generated from the
system.functions system table.countMatches
Introduced in: v21.1.0 Returns number of matches of a regular expression in a string.Version dependent behaviorThe behavior of this function depends on the ClickHouse version:
- in versions < v25.6, the function stops counting at the first empty match even if a pattern accepts.
- in versions >= 25.6, the function continues execution when an empty match occurs. The legacy behavior can be restored using setting
count_matches_stop_at_empty_match = true;
UInt64
Examples
Count digit sequences
Query
Response
countMatchesCaseInsensitive
Introduced in: v21.1.0 LikecountMatches but performs case-insensitive matching.
Syntax
haystack— The string to search in.Stringpattern— Regular expression pattern.const String
UInt64
Examples
Case insensitive count
Query
Response
countSubstrings
Introduced in: v21.1.0 Returns how often a substringneedle occurs in a string haystack.
Syntax
haystack— String in which the search is performed. String or Enum. -needle— Substring to be searched. String. -start_pos— Position (1-based) inhaystackat which the search starts. UInt. Optional.
UInt64
Examples
Usage example
Query
Response
Query
Response
countSubstringsCaseInsensitive
Introduced in: v21.1.0 LikecountSubstrings but counts case-insensitively.
Syntax
haystack— String in which the search is performed.StringorEnumneedle— Substring to be searched.Stringstart_pos— Optional. Position (1-based) inhaystackat which the search starts.UInt*
UInt64
Examples
Usage example
Query
Response
Query
Response
countSubstringsCaseInsensitiveUTF8
Introduced in: v21.1.0 LikecountSubstrings but counts case-insensitively and assumes that haystack is a UTF-8 string.
Syntax
haystack— UTF-8 string in which the search is performed.StringorEnumneedle— Substring to be searched.Stringstart_pos— Optional. Position (1-based) inhaystackat which the search starts.UInt*
UInt64
Examples
Usage example
Query
Response
Query
Response
extract
Introduced in: v1.1.0 Extracts the first match of a regular expression in a string. If ‘haystack’ doesn’t match ‘pattern’, an empty string is returned. This function uses the RE2 regular expression library. Please refer to re2 for supported syntax. If the regular expression has capturing groups (sub-patterns), the function matches the input string against the first capturing group. Syntaxhaystack— String from which to extract.Stringpattern— Regular expression, typically containing a capturing group.const String
String
Examples
Extract domain from email
Query
Response
Query
Response
extractAll
Introduced in: v1.1.0 Likeextract, but returns an array of all matches of a regular expression in a string.
If ‘haystack’ doesn’t match the ‘pattern’ regex, an empty array is returned.
If the regular expression has capturing groups (sub-patterns), the function matches the input string against the first capturing group.
Syntax
haystack— String from which to extract fragments.Stringpattern— Regular expression, optionally containing capturing groups.const String
Array(String)
Examples
Extract all numbers
Query
Response
Query
Response
extractAllGroupsHorizontal
Introduced in: v20.5.0 Matches all groups of a string using the provided regular expression and returns an array of arrays, where each array contains all captures from the same capturing group, organized by group number. Syntaxs— Input string to extract from.StringorFixedStringregexp— Regular expression to match by.const Stringorconst FixedString
Array(Array(String))
Examples
Usage example
Query
Response
extractGroups
Introduced in: v20.5.0 Extracts the capturing groups from the first substring matched by a regular expression. To extract groups from all matches, useextractAllGroupsHorizontal or extractAllGroupsVertical.
Syntax
s— Input string to extract from.StringorFixedStringregexp— Regular expression. Must contain at least one capturing group. Constant.const Stringorconst FixedString
1 to N, where N is the number of capturing groups in regexp) of the first match. If there is no match, returns an empty array. Array(String)
Examples
Usage example
Query
Response
hasAllTokens
Introduced in: v25.10.0 LikehasAnyTokens, but returns 1, if all tokens in the needle string or array match the input string, and 0 otherwise. If input is a column, returns all rows that satisfy this condition.
Column
input should have a text index defined for optimal performance.
If no text index is defined, the function performs a brute-force column scan which is orders of magnitude slower than an index lookup.- the
inputargument (always), and - the
needleargument (if given as a String) using the tokenizer specified for the text index. If the column has no text index defined, thesplitByNonAlphatokenizer is used instead. If theneedleargument is of type Array(String), each array element is treated as a token — no additional tokenization takes place.
hasAllToken
Arguments
input— The input column.StringorFixedStringorArray(String)orArray(FixedString)needles— Tokens to be searched.StringorArray(String)tokenizer— The tokenizer to use. Valid arguments aresplitByNonAlpha,splitByString,asciiCJK,ngrams,sparseGrams, andarray. Optional, if not set explicitly, defaults tosplitByNonAlpha.const String
UInt8
Examples
Basic usage with a string needle
Query
Response
Query
Response
tokens function
Query
Response
Query
Response
Query
Response
Query
Response
Query
Response
Query
Response
hasAnyTokens
Introduced in: v25.10.0 Returns 1, if at least one token in theneedle string or array matches the input string, and 0 otherwise. If input is a column, returns all rows that satisfy this condition.
Column
input should have a text index defined for optimal performance.
If no text index is defined, the function performs a brute-force column scan which is orders of magnitude slower than an index lookup.- the
inputargument (always), and - the
needleargument (if given as a String) using the tokenizer specified for the text index. If the column has no text index defined, thesplitByNonAlphatokenizer is used instead. If theneedleargument is of type Array(String), each array element is treated as a token — no additional tokenization takes place.
hasAnyToken
Arguments
input— The input column.StringorFixedStringorNullable(String)orNullable(FixedString)orArray(String)orArray(FixedString)orArray(Nullable(String))orArray(Nullable(FixedString))needles— Tokens to be searched.StringorArray(String)tokenizer— The tokenizer to use. Valid arguments aresplitByNonAlpha,splitByString,asciiCJK,ngrams,sparseGrams, andarray. Optional, if not set explicitly, defaults tosplitByNonAlpha.const String
1, if there was at least one match. 0, otherwise. UInt8
Examples
Basic usage with a string needle
Query
Response
Query
Response
tokens function
Query
Response
Query
Response
Query
Response
Query
Response
Query
Response
hasPhrase
Introduced in: v26.4.0 Checks if the haystack contains all tokens from the phrase in consecutive order. Prior to searching, the function tokenizes both theinput and the phrase arguments using the tokenizer specified as the optional third argument.
The tokenizer argument must be one of splitByNonAlpha, splitByString, ngrams, or asciiCJK.
If no tokenizer is specified, by default the splitByNonAlpha tokenizer would be used.
Unlike hasToken, hasAnyTokens and hasAllTokens, hasPhrase requires the tokens to appear in the same order
and without any intervening tokens. For example, hasPhrase('the quick brown fox', 'quick fox') returns 0
because “brown” appears between “quick” and “fox”.
Syntax
matchPhrase
Arguments
input— The input column.StringorFixedStringphrase— Phrase to search for.const Stringtokenizer— The tokenizer to use. Optional, defaults tosplitByNonAlpha.const String
1 if the phrase is found as a consecutive token sequence, 0 otherwise. UInt8
Examples
Phrase match
Query
Response
Query
Response
hasSubsequence
Introduced in: v23.7.0 Checks if a needle is a subsequence of a haystack. A subsequence of a string is a sequence that can be derived from another string by deleting some or no characters without changing the order of the remaining characters. Syntaxhaystack— String in which to search for the subsequence.Stringneedle— Subsequence to be searched.String
1 if needle is a subsequence of haystack, 0 otherwise. UInt8
Examples
Basic subsequence check
Query
Response
Query
Response
hasSubsequenceCaseInsensitive
Introduced in: v23.7.0 LikehasSubsequence but searches case-insensitively.
Syntax
haystack— String in which the search is performed.Stringneedle— Subsequence to be searched.String
UInt8
Examples
Usage example
Query
Response
hasSubsequenceCaseInsensitiveUTF8
Introduced in: v23.7.0 LikehasSubsequenceUTF8 but searches case-insensitively.
Syntax
haystack— UTF8-encoded string in which the search is performed.Stringneedle— UTF8-encoded subsequence string to be searched.String
UInt8
Examples
Usage example
Query
Response
hasSubsequenceUTF8
Introduced in: v23.7.0 LikehasSubsequence but assumes haystack and needle are UTF-8 encoded strings.
Syntax
1 if needle is a subsequence of haystack, otherwise 0. UInt8
Examples
Usage example
Query
Response
Query
Response
hasToken
Introduced in: v20.1.0 Checks if the given token is present in the haystack. Uses splitByNonAlpha as tokenizer, i.e. a token is defined as the longest possible sub-sequence of consecutive characters[0-9A-Za-z_] (numbers, ASCII characters and underscore).
Syntax
haystack— String to be searched.Stringtoken— Token to search for.const String
1 if the token is found, 0 otherwise. UInt8
Examples
Token search
Query
Response
hasTokenCaseInsensitive
Introduced in: v20.1.0 Performs case insensitive lookup of needle in haystack using tokenbf_v1 index. Syntax- None.
hasTokenCaseInsensitiveOrNull
Introduced in: v23.1.0 Performs case insensitive lookup of needle in haystack using tokenbf_v1 index. Returns null if needle is ill-formed. Syntax- None.
hasTokenOrNull
Introduced in: v20.1.0 LikehasToken but returns null if token is ill-formed.
Syntax
haystack— String to be searched. Must be constant.Stringtoken— Token to search for.const String
1 if the token is found, 0 otherwise, null if token is ill-formed. Nullable(UInt8)
Examples
Usage example
Query
Response
highlight
Introduced in: v26.4.0 Highlights occurrences of search terms in a text string by wrapping them with HTML tags. The function performs ASCII case-insensitive matching. If multiple search terms overlap or are adjacent in the text, the matched regions are merged into a single highlighted span. Syntaxhaystack— The text to search in.StringorFixedStringneedles— An array of search terms to highlight.const Array(String)open_tag— The opening tag to insert before each match. Default:<em>.const Stringclose_tag— The closing tag to insert after each match. Default:</em>.const String
String
Examples
Basic highlight
Query
Response
Query
Response
ilike
Introduced in: v20.6.0 Likelike but searches case-insensitively.
Syntax
haystack— String in which the search is performed.StringorFixedStringpattern— LIKE pattern to match against.String
1 if the string matches the LIKE pattern (case-insensitive), otherwise 0. UInt8
Examples
Usage example
Query
Response
like
Introduced in: v1.1.0 Returns whether stringhaystack matches the LIKE expression pattern.
A LIKE expression can contain normal characters and the following metasymbols:
%indicates an arbitrary number of arbitrary characters (including zero characters)._indicates a single arbitrary character.\is for escaping literals%,_and\.
_ matches the Unicode code point ¥ which is represented in UTF-8 using two bytes.
If the haystack or the LIKE expression are not valid UTF-8, the behavior is undefined.
No automatic Unicode normalization is performed, you can use the normalizeUTF8* functions for that.
To match against literal %, _ and \ (which are LIKE metacharacters), prepend them with a backslash: \%, \_ and \\.
The backslash loses its special meaning (i.e. is interpreted literally) if it prepends a character different than %, _ or \.
ClickHouse requires backslashes in strings to be quoted as well, so you would actually need to write
\\%, \\_ and \\\\.LIKE expressions of the form %needle%, the function is as fast as the position function.
All other LIKE expressions are internally converted to a regular expression and executed with a performance similar to function match.
Syntax
haystack— String in which the search is performed.StringorFixedStringpattern—LIKEpattern to match against. Can contain%(matches any number of characters),_(matches single character), and\for escaping.String
1 if the string matches the LIKE pattern, otherwise 0. UInt8
Examples
Usage example
Query
Response
Query
Response
Query
Response
locate
Introduced in: v18.16.0 Likeposition but with arguments haystack and locate switched.
Version dependent behaviorThe behavior of this function depends on the ClickHouse version:
- in versions < v24.3,
locatewas an alias of functionpositionand accepted arguments(haystack, needle[, start_pos]). - in versions >= 24.3,
locateis an individual function (for better compatibility with MySQL) and accepts arguments(needle, haystack[, start_pos]). The previous behavior can be restored using settingfunction_locate_has_mysql_compatible_argument_order = false.
needle— Substring to be searched.Stringhaystack— String in which the search is performed.StringorEnumstart_pos— Optional. Position (1-based) inhaystackat which the search starts.UInt
0, if the substring was not found. UInt64
Examples
Basic usage
Query
Response
match
Introduced in: v1.1.0 Checks if a provided string matches the provided regular expression pattern. This function uses the RE2 regular expression library. Please refer to re2 for supported syntax. Matching works under UTF-8 assumptions, e.g.¥ uses two bytes internally but matching treats it as a single codepoint.
The regular expression must not contain NULL bytes.
If the haystack or the pattern are not valid UTF-8, the behavior is undefined.
Unlike re2’s default behavior, . matches line breaks. To disable this, prepend the pattern with (?-s).
The pattern is not anchored. To match the entire string, anchor the pattern yourself using ^ and $.
If you just want to search for substrings, you can use functions like or position instead, which work much faster than this function.
Alternative operator syntax: haystack REGEXP pattern.
Syntax
REGEXP_MATCHES
Arguments
haystack— String in which the pattern is searched.Stringpattern— Regular expression pattern.const String
1 if the pattern matches, 0 otherwise. UInt8
Examples
Basic pattern matching
Query
Response
Query
Response
Query
Response
multiFuzzyMatchAllIndices
Introduced in: v20.1.0 LikemultiFuzzyMatchAny but returns the array of all indices in any order that match the haystack within a constant edit distance.
Syntax
haystack— String in which the search is performed.Stringdistance— The maximum edit distance for fuzzy matching.UInt8pattern— Array of patterns to match against.Array(String)
Array(UInt64)
Examples
Usage example
Query
Response
multiFuzzyMatchAny
Introduced in: v20.1.0 LikemultiMatchAny but returns 1 if any pattern matches the haystack within a constant edit distance.
This function relies on the experimental feature of hyperscan library, and can be slow for some edge cases.
The performance depends on the edit distance value and patterns used, but it’s always more expensive compared to non-fuzzy variants.
multiFuzzyMatch*() function family do not support UTF-8 regular expressions (it treats them as a sequence of bytes) due to restrictions of hyperscan.haystack— String in which the search is performed.Stringdistance— The maximum edit distance for fuzzy matching.UInt8pattern— Optional. An array of patterns to match against.Array(String)
1 if any pattern matches the haystack within the specified edit distance, otherwise 0. UInt8
Examples
Usage example
Query
Response
multiFuzzyMatchAnyIndex
Introduced in: v20.1.0 LikemultiFuzzyMatchAny but returns any index that matches the haystack within a constant edit distance.
Syntax
haystack— String in which the search is performed.Stringdistance— The maximum edit distance for fuzzy matching.UInt8pattern— Array of patterns to match against.Array(String)
0. UInt64
Examples
Usage example
Query
Response
multiMatchAllIndices
Introduced in: v20.1.0 LikemultiMatchAny but returns the array of all indices that match the haystack in any order.
Syntax
haystack— String in which the search is performed.Stringpattern— Regular expressions to match against.String
Array(UInt64)
Examples
Usage example
Query
Response
multiMatchAny
Introduced in: v20.1.0 Check if at least one of multiple regular expression patterns matches a haystack. If you only want to search multiple substrings in a string, you can use functionmultiSearchAny instead - it works much faster than this function.
Syntax
haystack— String in which patterns are searched.Stringpattern1[, pattern2, ...]— An array of one or more regular expression patterns.Array(String)
1 if any pattern matches, 0 otherwise. UInt8
Examples
Multiple pattern matching
Query
Response
Query
Response
multiMatchAnyIndex
Introduced in: v20.1.0 LikemultiMatchAny but returns any index that matches the haystack.
Syntax
haystack— String in which the search is performed.Stringpattern— Regular expressions to match against.Array(String)
UInt64
Examples
Usage example
Query
Response
multiSearchAllPositions
Introduced in: v20.1.0 Likeposition but returns an array of positions (in bytes, starting at 1) for multiple needle substrings in a haystack string.
All multiSearch*() functions only support up to 2^8 needles.
Syntax
haystack— String in which the search is performed.Stringneedle1[, needle2, ...]— An array of one or more substrings to be searched.Array(String)
0, if the substring was not found. Array(UInt64)
Examples
Multiple needle search
Query
Response
multiSearchAllPositionsCaseInsensitive
Introduced in: v20.1.0 LikemultiSearchAllPositions but ignores case.
Syntax
haystack— String in which the search is performed.Stringneedle1[, needle2, ...]— An array of one or more substrings to be searched.Array(String)
0 if the substring was not found. Array(UInt64)
Examples
Case insensitive multi-search
Query
Response
multiSearchAllPositionsCaseInsensitiveUTF8
Introduced in: v20.1.0 LikemultiSearchAllPositionsUTF8 but ignores case.
Syntax
haystack— UTF-8 encoded string in which the search is performed.Stringneedle— UTF-8 encoded substrings to be searched.Array(String)
Array
Examples
Case-insensitive UTF-8 search
Query
Response
multiSearchAllPositionsUTF8
Introduced in: v20.1.0 LikemultiSearchAllPositions but assumes haystack and the needle substrings are UTF-8 encoded strings.
Syntax
haystack— UTF-8 encoded string in which the search is performed.Stringneedle1[, needle2, ...]— An array of UTF-8 encoded substrings to be searched.Array(String)
0 if the substring was not found. Array
Examples
UTF-8 multi-search
Query
Response
multiSearchAny
Introduced in: v20.1.0 Checks if at least one of a number of needle strings matches the haystack string. FunctionsmultiSearchAnyCaseInsensitive, multiSearchAnyUTF8 and multiSearchAnyCaseInsensitiveUTF8 provide case-insensitive and/or UTF-8 variants of this function.
Syntax
haystack— String in which the search is performed.Stringneedle1[, needle2, ...]— An array of substrings to be searched.Array(String)
1, if there was at least one match, otherwise 0, if there was not at least one match. UInt8
Examples
Any match search
Query
Response
multiSearchAnyCaseInsensitive
Introduced in: v20.1.0 Like multiSearchAny but ignores case. Syntaxhaystack— String in which the search is performed.Stringneedle— Substrings to be searched.Array(String)
1, if there was at least one case-insensitive match, otherwise 0, if there was not at least one case-insensitive match. UInt8
Examples
Case insensitive search
Query
Response
multiSearchAnyCaseInsensitiveUTF8
Introduced in: v20.1.0 Like multiSearchAnyUTF8 but ignores case. Syntaxhaystack— UTF-8 string in which the search is performed.Stringneedle— UTF-8 substrings to be searched.Array(String)
1, if there was at least one case-insensitive match, otherwise 0, if there was not at least one case-insensitive match. UInt8
Examples
Given a UTF-8 string ‘Здравствуйте’, check if character ‘з’ (lowercase) is present
Query
Response
multiSearchAnyUTF8
Introduced in: v20.1.0 Like multiSearchAny but assumeshaystack and the needle substrings are UTF-8 encoded strings.
Syntax
haystack— UTF-8 string in which the search is performed.Stringneedle— UTF-8 substrings to be searched.Array(String)
1, if there was at least one match, otherwise 0, if there was not at least one match. UInt8
Examples
Given ‘你好,世界’ (‘Hello, world’) as a UTF-8 string, check if there are any 你 or 界 characters in the string
Query
Response
multiSearchFirstIndex
Introduced in: v20.1.0 Searches for multiple needle strings in a haystack string (case-sensitive) and returns the 1-based index of the first needle found. Syntaxhaystack— The string to search in.Stringneedles— Array of strings to search for.Array(String)
UInt64
Examples
Usage example
Query
Response
Query
Response
Query
Response
multiSearchFirstIndexCaseInsensitive
Introduced in: v20.1.0 Returns the indexi (starting from 1) of the leftmost found needle_i in the string haystack and 0 otherwise.
Ignores case.
Syntax
haystack— String in which the search is performed.Stringneedle— Substrings to be searched.Array(String)
0, if there was no match. UInt8
Examples
Usage example
Query
Response
multiSearchFirstIndexCaseInsensitiveUTF8
Introduced in: v20.1.0 Searches for multiple needle strings in a haystack string, case-insensitively with UTF-8 encoding support, and returns the 1-based index of the first needle found. Syntaxhaystack— The string to search in.Stringneedles— Array of strings to search for.Array(String)
UInt64
Examples
Usage example
Query
Response
Query
Response
Query
Response
multiSearchFirstIndexUTF8
Introduced in: v20.1.0 Returns the indexi (starting from 1) of the leftmost found needle_i in the string haystack and 0 otherwise.
Assumes haystack and needle are UTF-8 encoded strings.
Syntax
haystack— UTF-8 string in which the search is performed.Stringneedle— Array of UTF-8 substrings to be searched.Array(String)
UInt8
Examples
Usage example
Query
Response
multiSearchFirstPosition
Introduced in: v20.1.0 Likeposition but returns the leftmost offset in a haystack string which matches any of multiple needle strings.
Functions multiSearchFirstPositionCaseInsensitive, multiSearchFirstPositionUTF8 and multiSearchFirstPositionCaseInsensitiveUTF8 provide case-insensitive and/or UTF-8 variants of this function.
Syntax
haystack— String in which the search is performed.Stringneedle1[, needle2, ...]— An array of one or more substrings to be searched.Array(String)
haystack string which matches any of multiple needle strings, otherwise 0, if there was no match. UInt64
Examples
First position search
Query
Response
multiSearchFirstPositionCaseInsensitive
Introduced in: v20.1.0 Like multiSearchFirstPosition but ignores case. Syntaxhaystack— String in which the search is performed.Stringneedle— Array of substrings to be searched.Array(String)
haystack string which matches any of multiple needle strings. Returns 0, if there was no match. UInt64
Examples
Case insensitive first position
Query
Response
multiSearchFirstPositionCaseInsensitiveUTF8
Introduced in: v20.1.0 Like multiSearchFirstPosition but assumeshaystack and needle to be UTF-8 strings and ignores case.
Syntax
haystack— UTF-8 string in which the search is performed.Stringneedle— Array of UTF-8 substrings to be searched.Array(String)
haystack string which matches any of multiple needle strings, ignoring case. Returns 0, if there was no match. UInt64
Examples
Find the leftmost offset in UTF-8 string ‘Здравствуй, мир’ (‘Hello, world’) which matches any of the given needles
Query
Response
multiSearchFirstPositionUTF8
Introduced in: v20.1.0 Like multiSearchFirstPosition but assumeshaystack and needle to be UTF-8 strings.
Syntax
haystack— UTF-8 string in which the search is performed.Stringneedle— Array of UTF-8 substrings to be searched.Array(String)
haystack string which matches any of multiple needle strings. Returns 0, if there was no match. UInt64
Examples
Find the leftmost offset in UTF-8 string ‘Здравствуй, мир’ (‘Hello, world’) which matches any of the given needles
Query
Response
ngramDistance
Introduced in: v20.1.0 Calculates the 4-gram distance between two strings. For this, it counts the symmetric difference between two multisets of 4-grams and normalizes it by the sum of their cardinalities. The smaller the returned value, the more similar the strings are. For case-insensitive search or/and in UTF8 format use functionsngramDistanceCaseInsensitive, ngramDistanceUTF8, ngramDistanceCaseInsensitiveUTF8.
Syntax
0 and 1. The smaller the returned value, the more similar the strings are. Float32
Examples
Calculate 4-gram distance
Query
Response
ngramDistanceCaseInsensitive
Introduced in: v20.1.0 Provides a case-insensitive variant ofngramDistance.
Calculates the 4-gram distance between two strings, ignoring case.
The smaller the returned value, the more similar the strings are.
Syntax
0 and 1. Float32
Examples
Case-insensitive 4-gram distance
Query
Response
ngramDistanceCaseInsensitiveUTF8
Introduced in: v20.1.0 Provides a case-insensitive UTF-8 variant ofngramDistance.
Assumes that needle and haystack strings are UTF-8 encoded strings and ignores case.
Calculates the 3-gram distance between two UTF-8 strings, ignoring case.
The smaller the returned value, the more similar the strings are.
Syntax
haystack— First UTF-8 encoded comparison string.Stringneedle— Second UTF-8 encoded comparison string.String
0 and 1. Float32
Examples
Case-insensitive UTF-8 3-gram distance
Query
Response
ngramDistanceUTF8
Introduced in: v20.1.0 Provides a UTF-8 variant ofngramDistance.
Assumes that needle and haystack strings are UTF-8 encoded strings.
Calculates the 3-gram distance between two UTF-8 strings.
The smaller the returned value, the more similar the strings are.
Syntax
haystack— First UTF-8 encoded comparison string.Stringneedle— Second UTF-8 encoded comparison string.String
0 and 1. Float32
Examples
UTF-8 3-gram distance
Query
Response
ngramSearch
Introduced in: v20.1.0 Checks if the 4-gram distance between two strings is less than or equal to a given threshold. For case-insensitive search or/and in UTF8 format use functionsngramSearchCaseInsensitive, ngramSearchUTF8, ngramSearchCaseInsensitiveUTF8.
Syntax
1 if the 4-gram distance between the strings is less than or equal to a threshold (1.0 by default), 0 otherwise. UInt8
Examples
Search using 4-grams
Query
Response
ngramSearchCaseInsensitive
Introduced in: v20.1.0 Provides a case-insensitive variant ofngramSearch.
Calculates the non-symmetric difference between a needle string and a haystack string, i.e. the number of n-grams from the needle minus the common number of n-grams normalized by the number of needle n-grams.
Checks if the 4-gram distance between two strings is less than or equal to a given threshold, ignoring case.
Syntax
1 if the 4-gram distance between the strings is less than or equal to a threshold (1.0 by default), 0 otherwise. UInt8
Examples
Case-insensitive search using 4-grams
Query
Response
ngramSearchCaseInsensitiveUTF8
Introduced in: v20.1.0 Provides a case-insensitive UTF-8 variant ofngramSearch.
Assumes haystack and needle to be UTF-8 strings and ignores case.
Checks if the 3-gram distance between two UTF-8 strings is less than or equal to a given threshold, ignoring case.
Syntax
1 if the 3-gram distance between the strings is less than or equal to a threshold (1.0 by default), 0 otherwise. UInt8
Examples
Case-insensitive UTF-8 search using 3-grams
Query
Response
ngramSearchUTF8
Introduced in: v20.1.0 Provides a UTF-8 variant ofngramSearch.
Assumes haystack and needle to be UTF-8 strings.
Checks if the 3-gram distance between two UTF-8 strings is less than or equal to a given threshold.
Syntax
1 if the 3-gram distance between the strings is less than or equal to a threshold (1.0 by default), 0 otherwise. UInt8
Examples
UTF-8 search using 3-grams
Query
Response
notILike
Introduced in: v20.6.0 Checks whether a string does not match a pattern, case-insensitive. The pattern can contain special characters% and _ for SQL LIKE matching.
Syntax
haystack— The input string to search in.StringorFixedStringpattern— The SQL LIKE pattern to match against.%matches any number of characters (including zero),_matches exactly one character.String
1 if the string does not match the pattern (case-insensitive), otherwise 0. UInt8
Examples
Usage example
Query
Response
notLike
Introduced in: v1.1.0 Similar tolike but negates the result.
Syntax
haystack— String in which the search is performed.StringorFixedStringpattern— LIKE pattern to match against.String
1 if the string does not match the LIKE pattern, otherwise 0. UInt8
Examples
Usage example
Query
Response
Query
Response
position
Introduced in: v1.1.0 Returns the position (in bytes, starting at 1) of a substringneedle in a string haystack.
If substring needle is empty, these rules apply:
- if no
start_poswas specified: return1 - if
start_pos = 0: return1 - if
start_pos >= 1andstart_pos <= length(haystack) + 1: returnstart_pos - otherwise: return
0
locate, positionCaseInsensitive, positionUTF8 and positionCaseInsensitiveUTF8.
Syntax
haystack— String in which the search is performed.StringorEnumneedle— Substring to be searched.Stringstart_pos— Position (1-based) inhaystackat which the search starts. Optional.UInt
0, if the substring was not found. UInt64
Examples
Basic usage
Query
Response
Query
Response
Query
Response
Query
Response
positionCaseInsensitive
Introduced in: v1.1.0 Likeposition but case-insensitive.
Syntax
instr
Arguments
haystack— String in which the search is performed.StringorEnumneedle— Substring to be searched.Stringstart_pos— Optional. Position (1-based) inhaystackat which the search starts.UInt*
0, if the substring was not found. UInt64
Examples
Case insensitive search
Query
Response
positionCaseInsensitiveUTF8
Introduced in: v1.1.0 LikepositionUTF8 but searches case-insensitively.
Syntax
haystack— String in which the search is performed.StringorEnumneedle— Substring to be searched.Stringstart_pos— Optional. Position (1-based) inhaystackat which the search starts.UInt*
0, if the substring was not found. UInt64
Examples
Case insensitive UTF-8 search
Query
Response
positionUTF8
Introduced in: v1.1.0 Likeposition but assumes haystack and needle are UTF-8 encoded strings.
Syntax
haystack— String in which the search is performed.StringorEnumneedle— Substring to be searched.Stringstart_pos— Optional. Position (1-based) inhaystackat which the search starts.UInt*
0, if the substring was not found. UInt64
Examples
UTF-8 character counting
Query
Response