How to parse a string with raw escape sequences?


Suppose there are 2 strings:

    string parse(const string& s) {
        // how to write this function?
    }

    int main() {
        string s1 = R"(hello\n\"this is a string with escape sequences\"\n)";
        string s2 = "hello\n\"this is a string with escape sequences\"\n";
        assert(parse(s1) == s2);
    }

My question is how to write parse() so that the assertion succeeds, other than with hand-made code that traverses the string and checks against every possible escape sequence. Is there an existing idiom for this?


It's probably easiest to implement the replacement of escape sequences with a regular expression. If you insist on doing it without one, you can write the string into a small, valid C++ program that prints it, compile and run that program with its output redirected to a file, then read the result back from that file. (This can also be improved to work without temp files at all.)

    #include <cassert>
    #include <cstdlib>
    #include <fstream>
    #include <iostream>
    #include <string>

    using std::string;

    string parse(const string& s) {
        // Generate a tiny program whose only job is to print s as a string literal.
        std::ofstream ftmp("tmpsrc.cpp");
        ftmp << "#include <iostream>\nint main(int argc, char* argv[]){\n";
        ftmp << " std::cout << \"" << s << "\";\nreturn 0;}\n\n";
        ftmp.close();

        // Let the compiler interpret the escape sequences, then capture the output.
        std::system("g++ -o tmpprint tmpsrc.cpp");
        std::system("./tmpprint > tmpstr.txt");

        // Read the whole output file back in binary mode.
        std::ifstream fin("tmpstr.txt", std::ios::in | std::ios::binary);
        fin.seekg(0, std::ios::end);
        std::streamsize size = fin.tellg();
        fin.seekg(0);
        string res;
        res.resize(size);
        fin.read(&res[0], size);
        fin.close();
        // Add delete of temp files here
        return res;
    }

    int main() {
        string s1 = R"(hello\n\"this is a string with escape sequences\"\n)";
        string s2 = "hello\n\"this is a string with escape sequences\"\n";
        assert(parse(s1) == s2);
    }
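The regular-expression route mentioned at the start of this answer could look roughly like the sketch below. It is only an illustration under a few assumptions: the name `unescape_regex` is made up for this example, and only `\\`, `\"`, `\n`, `\r` and `\t` are decoded. Since `std::regex_replace` has no callback form, the sketch walks the matches with `std::sregex_iterator` and builds the result in a single pass.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original answer): decodes a small,
// fixed set of escape sequences in one left-to-right pass.
std::string unescape_regex(const std::string& s) {
    // A backslash followed by one of: backslash, double quote, n, r, t.
    static const std::regex escape(R"(\\([\\"nrt]))");

    std::string out;
    auto last = s.cbegin();  // end of the previous match
    for (std::sregex_iterator it(s.cbegin(), s.cend(), escape), end; it != end; ++it) {
        const std::smatch& m = *it;
        out.append(last, m[0].first);  // copy the text before this escape
        switch (*m[1].first) {
            case 'n': out += '\n'; break;
            case 'r': out += '\r'; break;
            case 't': out += '\t'; break;
            default:  out += *m[1].first; break;  // backslash or double quote
        }
        last = m[0].second;
    }
    out.append(last, s.cend());  // copy whatever follows the last escape
    return out;
}
```

Because each match is consumed exactly once, an escaped backslash followed by `n` stays a backslash and an `n` rather than turning into a newline. Unlike the compile-and-run approach, nothing in the input is ever executed.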


IMHO, C++ escape sequences are easy enough to replace manually:

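    #include <iostream>
    #include <string>
    #include <utility>
    #include <vector>

    using namespace std;

    // Replace every occurrence of findS in s with replaceS.
    string string_replace( const string & s, const string & findS, const string & replaceS )
    {
        string result = s;
        if ( findS.empty() )
        {
            return result;
        }
        // Resume the search after each replacement so replaced text is never rescanned.
        for ( auto pos = result.find( findS ); pos != string::npos;
              pos = result.find( findS, pos + replaceS.length() ) )
        {
            result.replace( pos, findS.length(), replaceS );
        }
        return result;
    }

    string parse( const string & s )
    {
        // Note: the passes are independent, so an escaped backslash followed by 'n'
        // (written \\n in the input) still ends up decoded as a newline.
        static const vector< pair< string, string > > patterns = {
            { "\\\\", "\\" },
            { "\\n", "\n" },
            { "\\r", "\r" },
            { "\\t", "\t" },
            { "\\\"", "\"" }
        };
        string result = s;
        for ( const auto & p : patterns )
        {
            result = string_replace( result, p.first, p.second );
        }
        return result;
    }

    int main()
    {
        string s1 = R"(hello\n\"this is a string with escape sequences\"\n)";
        string s2 = "hello\n\"this is a string with escape sequences\"\n";
        cout << parse(s1) << endl;
        cout << ( parse(s1) == s2 ) << endl;
    }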



    hello
    "this is a string with escape sequences"



You could use a string stream. Walk the string character by character looking for a backslash. When you find one, check whether the next character completes a valid escape sequence, and if so write the character it stands for to the stream instead of the two original characters.

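    #include <cstddef>
    #include <sstream>
    #include <string>

    std::string parse(const std::string& s) {
        std::stringstream ss;
        for (std::size_t i = 0; i < s.length(); i++) {
            // A backslash only starts an escape if another character follows it;
            // this also avoids reading past the end of the string.
            if (s[i] == '\\' && i + 1 < s.length()) {
                switch (s[i + 1]) {
                    case 'n':  ss << '\n'; i++; break;
                    case '"':  ss << '"';  i++; break;
                    case '\\': ss << '\\'; i++; break;  // escaped backslash
                    default:   ss << '\\'; break;       // unknown escape: keep the backslash
                }
            } else {
                ss << s[i];
            }
        }
        return ss.str();
    }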
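If more escapes are needed, the same character-by-character loop can be driven by a small lookup function. The following is a minimal sketch rather than a complete decoder: the names `simple_escape` and `unescape_simple` are invented for this example, and it assumes only the single-character escapes of C++ matter (numeric forms such as `\x41`, octal escapes and `\u` sequences are left untouched).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: maps the character after a backslash to the character
// it stands for. Returns false if c is not one of the simple escapes.
static bool simple_escape(char c, char& out) {
    switch (c) {
        case 'a':  out = '\a'; return true;
        case 'b':  out = '\b'; return true;
        case 'f':  out = '\f'; return true;
        case 'n':  out = '\n'; return true;
        case 'r':  out = '\r'; return true;
        case 't':  out = '\t'; return true;
        case 'v':  out = '\v'; return true;
        case '\\': out = '\\'; return true;
        case '\'': out = '\''; return true;
        case '"':  out = '"';  return true;
        case '?':  out = '?';  return true;
        default:   return false;
    }
}

// Hypothetical helper: single-pass decoder built on simple_escape().
std::string unescape_simple(const std::string& s) {
    std::string out;
    out.reserve(s.size());  // the result is never longer than the input
    for (std::size_t i = 0; i < s.size(); ++i) {
        char decoded;
        if (s[i] == '\\' && i + 1 < s.size() && simple_escape(s[i + 1], decoded)) {
            out += decoded;
            ++i;  // skip the character that completed the escape
        } else {
            out += s[i];  // ordinary character, unknown escape, or trailing backslash
        }
    }
    return out;
}
```

Calling `unescape_simple(s1)` with the `s1` from the question should produce a string equal to `s2`. Unknown escapes and a trailing backslash are copied through unchanged, which is a design choice of this sketch; a stricter version could report them as errors.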


