Splitting a String by Another String in C++: A Flexible Utility Function
In this post, we will explore a flexible utility function for splitting a string based on a given delimiter using C++ and the standard library.
The C++ Utility Function to Split a String by Another String
Background: regular expressions are an essential tool for text processing and pattern matching. They provide a concise and powerful way to express complex search patterns and are widely used for tasks such as text validation, data extraction, and string manipulation. The C++ Standard Library offers the <regex> library for working with them.
The <regex> library in C++ includes several key components: the std::regex class, which represents a compiled regular expression; std::regex_iterator and its std::string specialization std::sregex_iterator, which are iterators for traversing the matches in a given input; std::regex_search and std::regex_replace, which are algorithms for searching for and replacing patterns within strings; and std::regex_token_iterator and its std::string specialization std::sregex_token_iterator, for tokenizing strings based on a given pattern.
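To make these components concrete, here is a minimal sketch that exercises std::regex_search, std::regex_replace, and std::sregex_iterator on a single pattern. The helper names (contains_number, mask_numbers, find_numbers) and the digit pattern are assumptions chosen for illustration; they are not part of the utility function discussed in this post.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: true if `text` contains at least one run of digits.
bool contains_number(const std::string& text) {
    static const std::regex digits{R"(\d+)"};
    return std::regex_search(text, digits);
}

// Hypothetical helper: replaces every run of digits in `text` with "#".
std::string mask_numbers(const std::string& text) {
    static const std::regex digits{R"(\d+)"};
    return std::regex_replace(text, digits, "#");
}

// Hypothetical helper: collects every run of digits, in order of appearance.
std::vector<std::string> find_numbers(const std::string& text) {
    static const std::regex digits{R"(\d+)"};
    std::vector<std::string> matches;
    // A default-constructed iterator marks the end of the match sequence.
    for (std::sregex_iterator it(text.begin(), text.end(), digits), end;
         it != end; ++it) {
        matches.push_back(it->str());
    }
    return matches;
}
```

For example, mask_numbers("a1b22") returns "a#b#", and find_numbers("a1b22c333") returns the three strings "1", "22", and "333".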
Here’s the code snippet for our utility function, which makes use of the <regex> standard library:
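#include <regex>
#include <string>
#include <vector>

std::vector<std::string>
split_str(const std::string& str, const std::string& delim_str) {
    // delim_str is compiled as a regular expression, not matched literally.
    std::regex delim{delim_str};
    std::vector<std::string> results;
    std::sregex_token_iterator end;  // default-constructed: end of sequence
    // Submatch -1 selects the text between matches of the delimiter.
    std::sregex_token_iterator iter(str.begin(), str.end(), delim, -1);
    for ( ; iter != end; ++iter) {
        std::string split(*iter);
        // Skip empty tokens (e.g. from leading or consecutive delimiters).
        if (split.size()) results.push_back(split);
    }
    return results;
}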
Breaking Down the String Splitting C++ Function Implementation
Let’s go through the code step by step:
- First, we include the <regex> header, which provides us with the necessary tools to work with regular expressions in C++.
- We define a function called split_str that takes two parameters: a const std::string& called str, which is the input string to be split, and a const std::string& called delim_str, which is the delimiter string to be used for splitting.
- We create a std::regex object called delim with the delimiter delim_str. This is the pattern that will be used to split the input string.
- We declare a std::vector<std::string> called results to store the resulting substrings after splitting.
- We define two std::sregex_token_iterator objects: end and iter. The end object serves as a sentinel value indicating the end of the sequence. The iter object is initialized with the beginning and end of the input string, the delimiter pattern, and -1 as the submatch value. The -1 value tells the iterator to return the unmatched parts of the input (i.e., the substrings between the delimiters).
- We use a for loop to iterate through the tokens returned by the iterator. Inside the loop, we create a std::string object called split and initialize it with the current token.
- We check if the size of the split string is non-zero. If it is, we add it to the results vector.
- Finally, we return the results vector containing the substrings.
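The role of the submatch argument is easier to see side by side. The sketch below uses a hypothetical helper named tokens (an assumption for illustration, not part of the original function) that forwards the submatch value to std::sregex_token_iterator and keeps every token, including empty ones.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every token produced by the iterator for the
// given submatch value, without filtering out empty tokens.
std::vector<std::string>
tokens(const std::string& str, const std::string& pattern, int submatch) {
    const std::regex re{pattern};
    std::vector<std::string> out;
    std::sregex_token_iterator end;  // default-constructed: end of sequence
    for (std::sregex_token_iterator it(str.begin(), str.end(), re, submatch);
         it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

With the input "a::b::c" and the pattern "::", a submatch of -1 yields "a", "b", "c" (the text between matches), while a submatch of 0 yields "::", "::" (the matches themselves). With -1, the input "a::::b" yields "a", an empty string, and "b", which is why split_str checks the size of each token before storing it.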
Using the C++ Utility Function to Split a String
Here’s a C++ example of how you can use the split_str function:
#include <iostream>
#include <vector>
#include <string>
#include <regex>

std::vector<std::string>
split_str(const std::string& str, const std::string& delim_str) {
    std::regex delim{delim_str};
    std::vector<std::string> results;
    std::sregex_token_iterator end;
    std::sregex_token_iterator iter(str.begin(), str.end(), delim, -1);
    for ( ; iter != end; ++iter) {
        std::string split(*iter);
        if (split.size()) results.push_back(split);
    }
    return results;
}

int main() {
    std::string input = "Hello::World::from::C++";
    std::string delimiter = "::";
    std::vector<std::string> results = split_str(input, delimiter);
    // Print each substring on its own line.
    for (const auto& word : results) {
        std::cout << word << std::endl;
    }
    return 0;
}
Compiling and running this program produces the following output:
$ g++ -std=c++20 split-string-by-string-example.cpp -o s && ./s
Hello
World
from
C++
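Because split_str compiles delim_str as a regular expression, the delimiter can also be a pattern rather than a fixed string. The sketch below repeats the function from this post so that it compiles on its own; the patterns and inputs mentioned afterwards are illustrative assumptions.

```cpp
#include <regex>
#include <string>
#include <vector>

// Same function as in the post, repeated so this snippet is self-contained.
std::vector<std::string>
split_str(const std::string& str, const std::string& delim_str) {
    std::regex delim{delim_str};
    std::vector<std::string> results;
    std::sregex_token_iterator end;
    std::sregex_token_iterator iter(str.begin(), str.end(), delim, -1);
    for ( ; iter != end; ++iter) {
        std::string split(*iter);
        if (split.size()) results.push_back(split);
    }
    return results;
}
```

For instance, split_str("a , b,c", R"(\s*,\s*)") splits on a comma with optional surrounding whitespace and returns "a", "b", "c", and split_str("one  two\tthree", R"(\s+)") splits on any run of whitespace and returns "one", "two", "three".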
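That’s it! We’ve created a flexible and reusable utility function to split a string using the <regex> library. You can easily modify the delimiter string to fit your needs, making this function highly adaptable for various text processing tasks. Keep in mind that the delimiter is interpreted as a regular expression, so characters with a special meaning in a pattern (such as . or |) must be escaped if you want them matched literally.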
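When the delimiter should always be treated as plain text, one alternative is to avoid <regex> entirely. The following is a minimal sketch under that assumption, built on std::string::find; the name split_str_literal is hypothetical, and the function keeps the original behavior of dropping empty pieces. It is a different technique from the regex-based function above, not a drop-in replacement for pattern delimiters.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical variant: splits `str` on the exact text `delim`, with no
// regular-expression interpretation. Empty pieces are dropped.
std::vector<std::string>
split_str_literal(const std::string& str, const std::string& delim) {
    std::vector<std::string> results;
    if (delim.empty()) {
        // An empty delimiter would never advance; return the input whole.
        if (!str.empty()) results.push_back(str);
        return results;
    }
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = str.find(delim, start);
        // Take everything up to the next delimiter, or to the end of input.
        const std::size_t len =
            (pos == std::string::npos) ? std::string::npos : pos - start;
        std::string piece = str.substr(start, len);
        if (!piece.empty()) results.push_back(std::move(piece));
        if (pos == std::string::npos) break;
        start = pos + delim.size();  // continue after the delimiter
    }
    return results;
}
```

With this version, split_str_literal("a.b.c", ".") returns "a", "b", "c", and split_str_literal("1+2+3", "+") returns "1", "2", "3", without any escaping of the delimiter.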