
Regexes work in PHP and don't in Erlang. Why?

Question:

I tried to rewrite a URL-parsing function from PHP in Erlang, and found that these regexes fail to compile in Erlang although they work fine in the PHP code. Can you tell me why, and how to make them work in Erlang?

Loose = "^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/\/?)?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?)(((?:\/(\w:))?(\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)".
re:compile( Loose ).
{error,{"nothing to repeat",166}}

Strict = "^(?:([^:\/?#]+):)?(?:\/\/\/?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?))?(((?:\/(\w:))?((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)".
re:compile( Strict ).
{error,{"nothing to repeat",114}}

But this code works fine:

$url = "http://gazeta.ru/";
$loose = '/^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/\/?)?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?)(((?:\/(\w:))?(\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/';
preg_match($loose, $url, $match);
var_dump( $match );

Answer1:

The backslash ("\") is the escape character in Erlang string literals: it changes the meaning of the character that follows it. Marking special characters this way is called escaping, and characters such as the double quote and the backslash itself must be preceded by a backslash to appear literally in a string. A "\" must therefore always be followed by another character. For example, to put a single backslash into a string you write "\\":

CorrectString = "C:\\windows" %% Correct
WrongString = "C:\windows" %% Wrong
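To make the effect concrete, here is a small escript sketch of what a string literal actually contains once Erlang has processed its backslash sequences. The function names `collapsed/0` and `doubled/0` and the tiny pattern `(?:\?x)` are made up for illustration; the pattern only mimics the `(?:\?...` fragment of the regexes in the question.

```erlang
#!/usr/bin/env escript
%% Illustrative sketch: contents of string literals after escape processing.

%% Unrecognized escapes such as \? \/ \w simply lose their backslash.
%% \d is a recognized Erlang escape: it is the DEL character (127),
%% not the regex "digit" class.
collapsed() ->
    ["\?" =:= "?", "\/" =:= "/", "\w" =:= "w", "\d" =:= [127]].

%% A doubled backslash yields one literal backslash followed by the character.
doubled() ->
    "\\?" =:= [$\\, $?].

main(_) ->
    io:format("collapsed: ~p~n", [collapsed()]),
    io:format("doubled:   ~p~n", [doubled()]),
    %% With a single backslash the regex engine receives "(?:?x)",
    %% a quantifier with nothing in front of it, so compilation fails.
    io:format("single:    ~p~n", [element(1, re:compile("(?:\?x)"))]),
    %% With a doubled backslash it receives "(?:\?x)", a literal question mark.
    io:format("double:    ~p~n", [element(1, re:compile("(?:\\?x)"))]).
```

Every comparison in the two helper functions evaluates to true, and only the doubled pattern compiles. The exact error text depends on the regex library bundled with your OTP release, so the sketch prints only the `ok` / `error` tag.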

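Hence you have to change every single backslash in your regexp to a double backslash. Otherwise the backslash is consumed by the string literal before the regex engine ever sees the pattern: "\?" reaches re:compile as a bare "?", and the offsets 166 and 114 in your errors are exactly where that "?" lands right after "(?:", which is why you get "nothing to repeat". Here is an example in the Erlang shell: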

3> Loose = "^(?:(?![^:@]+:[^:@\\/]*@)([^:\\/?#.]+):)?(?:\\/\\/\\/?)?((?:(([^:@]*):?([^:@]*))?@)?([^:\\/?#]*)(?::(\\d*))?)(((?:\\/(\\w:))?(\\/(?:[^?#](?![^?#\\/]*\\.[^?#\\/.]+(?:[?#]|$)))*\\/?)?([^?#\\/]*))(?:\\?([^#]*))?(?:#(.*))?)".
4> re:compile(Loose).
{ok,{re_pattern,14,0,
     <<69,82,67,80,147,2,0,0,16,0,0,0,1,0,0,0,14,0,0,0,0,0,0,
       ...>>}}
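The same fix should apply to the Strict pattern from the question. The escript below is a sketch, not a tested URL parser: it doubles every backslash in Strict, compiles it, and runs it against the URL used in the PHP snippet. The helper name `strict/0` is made up for illustration, and the pattern is split across two adjacent string literals (which Erlang concatenates) only to keep the lines shorter.

```erlang
#!/usr/bin/env escript
%% Sketch: the Strict pattern from the question with every backslash doubled.

strict() ->
    "^(?:([^:\\/?#]+):)?(?:\\/\\/\\/?((?:(([^:@]*):?([^:@]*))?@)?([^:\\/?#]*)(?::(\\d*))?))?"
    "(((?:\\/(\\w:))?((?:[^?#\\/]*\\/)*)([^?#]*))(?:\\?([^#]*))?(?:#(.*))?)".

main(_) ->
    %% Compile once; the compiled pattern can be reused for many URLs.
    {ok, MP} = re:compile(strict()),
    %% {capture, all, list} returns the whole match and each group as strings,
    %% roughly what preg_match stores in $match on the PHP side.
    Result = re:run("http://gazeta.ru/", MP, [{capture, all, list}]),
    io:format("~p~n", [Result]).
```

Since an Erlang pattern is not wrapped in "/" delimiters the way the PHP one is, the escaped slashes ("\\/") could probably be written as plain "/" as well, but keeping them is harmless.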
