splitStringBy
lib.splitStringBy
Splits a string into substrings based on a predicate that examines adjacent characters.
The predicate is called once per character with the preceding character and the current character, so a split can depend on the transition between two characters (for example, lowercase to uppercase) rather than only on a fixed separator.
Inputs
predicate
- Function that takes two arguments (previous character and current character) and returns true when the string should be split at the current position. For the first character, previous will be "" (empty string).
keepSplit
- Boolean that determines whether the splitting character should be kept as part of the result. If true, the character will be included at the beginning of the next substring; if false, it will be discarded.
str
- The input string to split.
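To illustrate how keepSplit changes the result, here is a minimal sketch that runs the same predicate with both settings. The names isLower, isUpper, and atCamelHump are hypothetical helpers for this example, and it assumes <nixpkgs> resolves to a nixpkgs version whose lib provides splitStringBy. The expected results in the comments follow from the implementation listed on this page.

```nix
let
  # Assumption: <nixpkgs> is on NIX_PATH and its lib includes splitStringBy.
  lib = import <nixpkgs/lib>;

  # Hypothetical helpers, not part of lib.
  isLower = c: builtins.match "[a-z]" c != null;
  isUpper = c: builtins.match "[A-Z]" c != null;

  # Split wherever a lowercase character is followed by an uppercase one.
  # For the first character prev is "", which matches neither class.
  atCamelHump = prev: curr: isLower prev && isUpper curr;
in
{
  # The uppercase character starts the next substring.
  kept = lib.strings.splitStringBy atCamelHump true "fooBarBaz";
  # => [ "foo" "Bar" "Baz" ]

  # The uppercase character is dropped from the output.
  discarded = lib.strings.splitStringBy atCamelHump false "fooBarBaz";
  # => [ "foo" "ar" "az" ]
}
```

With keepSplit set to false the character that triggered the split is lost, so that setting suits separator characters better than transition-based predicates.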
Return
A list of substrings from the original string, split according to the predicate.
Type
splitStringBy :: (string -> string -> bool) -> bool -> string -> [string]
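Because the function is curried, as the type suggests, the first two arguments can be fixed to obtain a reusable splitter. The sketch below is an illustration only: splitOnDots is a hypothetical name, and it assumes <nixpkgs> resolves to a nixpkgs version whose lib provides splitStringBy.

```nix
let
  # Assumption: <nixpkgs> is on NIX_PATH and its lib includes splitStringBy.
  lib = import <nixpkgs/lib>;

  # Hypothetical name: predicate and keepSplit are fixed, str is left open.
  splitOnDots = lib.strings.splitStringBy (_prev: curr: curr == ".") false;
in
map splitOnDots [ "a.b" "c" ]
# => [ [ "a" "b" ] [ "c" ] ]
```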
Examples
lib.strings.splitStringBy usage example
Split on periods and hyphens, discarding the separators:
splitStringBy (prev: curr: builtins.elem curr [ "." "-" ]) false "foo.bar-baz"
=> [ "foo" "bar" "baz" ]
Split on transitions from lowercase to uppercase, keeping the uppercase characters:
splitStringBy (prev: curr: builtins.match "[a-z]" prev != null && builtins.match "[A-Z]" curr != null) true "fooBarBaz"
=> [ "foo" "Bar" "Baz" ]
A leading separator yields an empty first element:
splitStringBy (prev: curr: builtins.elem curr [ "." ]) false ".foo.bar.baz"
=> [ "" "foo" "bar" "baz" ]
A trailing separator yields an empty last element:
splitStringBy (prev: curr: builtins.elem curr [ "." ]) false "foo.bar.baz."
=> [ "foo" "bar" "baz" "" ]
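Two further edge cases follow from the same rules: consecutive separators and the empty string. The sketch below is not part of the upstream examples. onDot is a hypothetical name, it assumes <nixpkgs> resolves to a nixpkgs version whose lib provides splitStringBy, and the expected results are derived from the implementation listed on this page.

```nix
let
  # Assumption: <nixpkgs> is on NIX_PATH and its lib includes splitStringBy.
  lib = import <nixpkgs/lib>;

  # Hypothetical name: split at every period, ignoring the previous character.
  onDot = _prev: curr: curr == ".";
in
{
  # Each separator closes the current part, so two in a row leave an empty part.
  consecutive = lib.strings.splitStringBy onDot false "a..b";
  # => [ "a" "" "b" ]

  # The empty string is returned as a single empty element, not an empty list.
  empty = lib.strings.splitStringBy onDot false "";
  # => [ "" ]
}
```

Callers that do not want empty parts would need to filter them out afterwards, for example with builtins.filter (s: s != "").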
Implementation
The following is the current implementation of this function.
splitStringBy =
  predicate: keepSplit: str:
  let
    len = stringLength str;
    # Helper function that processes the string character by character
    go =
      pos: currentPart: result:
      # Base case: reached end of string
      if pos == len then
        result ++ [ currentPart ]
      else
        let
          currChar = substring pos 1 str;
          prevChar = if pos > 0 then substring (pos - 1) 1 str else "";
          isSplit = predicate prevChar currChar;
        in
        if isSplit then
          # Split here - add current part to results and start a new one
          let
            newResult = result ++ [ currentPart ];
            newCurrentPart = if keepSplit then currChar else "";
          in
          go (pos + 1) newCurrentPart newResult
        else
          # Keep building current part
          go (pos + 1) (currentPart + currChar) result;
  in
  if len == 0 then [ (addContextFrom str "") ] else map (addContextFrom str) (go 0 "" [ ]);