
Unable to get a block of code into my regex match groups

Question:

So yeah, the title is pretty weird, but I have no better idea how to describe my problem. Whatever... let's get to the problem.

<h3><strong>Job to get done</strong></h3>

My boss wants a function that reads all the functions in a Python file and returns a DataTable containing the functions it found. This function should be written in IronPython (a Python implementation that runs on .NET and can use C# libraries).

<h3><strong>The Problem</strong></h3>

I am relatively new to Python and don't know what the language is capable of, so I just started writing my function. It works pretty well, except for one weird problem. I wrote a regular expression to find the functions, and to test it I downloaded a regex tester. The tester showed the results I wanted: Group 1 is the function name, Group 2 is the function's parameters, and Group 3 is the content of the function.

For some reason it doesn't work in live testing. By "doesn't work" I mean that Group 3 has no output at all. When I tested the expression with another (online) <a href="http://regexhero.net/tester/" rel="nofollow">RegEx Tester</a>, it showed me that Group 3 does not hold the whole content of the function, only a small part of it, starting with a newline/return character.

In my test cases the results for Group 3 were all the same: they started with a newline/return character and ended with the function's <em>return</em> (e.g. return objDic).

Question: What the hell is going wrong there? I have no idea what is wrong with my regex.

<h3><strong>The Regex</strong></h3>

<pre><code>objRegex = Regex(r"(?i)def[\s]+([\w]+)\(([\, [\w]+)\)(?:[\:{1}]\s*)([\n].*(?!\ndef[\s]+))+")</code></pre>

<h3><strong>The Data</strong></h3>

<pre><code>def test_function(some_parameter):
    try:
        some_cool_code_goes_here()
        return obj
    except Exception as ex:
        DetailsBox.Show(ex)

def another_cool_function(another_parameter):
    try:
        what_you_want()
        return obj
    except Exception as ex:
        DetailsBox.Show(ex)</code></pre>

<h3><strong>The Result</strong></h3>

<strong>Match:</strong> def test_function(some_parameter):...<br /><strong>Position:</strong> ..<br /><strong>Length:</strong> ..<br /><strong>Group 1:</strong> test_function<br /><strong>Group 2:</strong> some_parameter<br /><strong>Group 3:</strong> <em>(newline/return character)</em> return obj

But Group 3 should be:

<pre><code>try:
    some_cool_code_goes_here()
    return obj
except Exception as ex:
    DetailsBox.Show(ex)</code></pre>

I hope you can help me :3 Thank you guys!

Answer1:

Although @Hamza said in his comment that your regex has several problems, I think most of them are just unneeded complexity. The reason the body is not matched might be that you haven't let the . metacharacter match newlines, so it stops at the first newline character after the first try: statement.
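A related detail may explain why Group 3 held only a single line: when a capturing group is itself repeated, as in `(...)+`, the group ends up holding only its last repetition. Below is a minimal sketch using CPython's `re` module instead of the .NET `Regex` class from the question (an assumption, so the snippet runs without IronPython); the sample text is made up. In .NET, `Group.Value` likewise returns the last capture, and the earlier ones are only reachable through `Group.Captures`.

```python
import re

# Made-up sample: a header line followed by three indented lines.
text = "head:\n  one\n  two\n  three"

# The group (\n.*) is repeated; each repetition overwrites the previous capture.
match = re.search(r"head:(\n.*)+", text)

whole = match.group(0)  # the entire matched text, all four lines
last = match.group(1)   # only the final repetition of the group

print(repr(whole))
print(repr(last))
```

So a pattern that walks the body line by line with a repeated group can match the whole function and still report only one line in the group.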

To fix this you need to let the . match newline characters, for example with the s inline flag or RegexOptions.Singleline. Here is a stripped-down version of your regex that works once that option is enabled:

(?i)def\s+(\w+)\s*\(([\, \w]+)\)(?:\s*:\s*)(.+?)(?=def|$)
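To see what letting `.` match newlines changes, here is a minimal sketch that runs the stripped-down pattern with CPython's `re` module rather than the .NET `Regex` class (an assumption made so it runs anywhere; `re.DOTALL` plays the role of .NET's single-line option). `SAMPLE` reproduces the question's test data with four-space indentation assumed.

```python
import re

# The question's test data; the indentation is an assumption.
SAMPLE = (
    "def test_function(some_parameter):\n"
    "    try:\n"
    "        some_cool_code_goes_here()\n"
    "        return obj\n"
    "    except Exception as ex:\n"
    "        DetailsBox.Show(ex)\n"
    "def another_cool_function(another_parameter):\n"
    "    try:\n"
    "        what_you_want()\n"
    "        return obj\n"
    "    except Exception as ex:\n"
    "        DetailsBox.Show(ex)\n"
)

# The stripped-down pattern, with the inline (?i) replaced by explicit flags.
PATTERN = r"def\s+(\w+)\s*\(([\, \w]+)\)(?:\s*:\s*)(.+?)(?=def|$)"

# Without DOTALL, "." cannot cross a line break, so the body group
# never reaches the lookahead and nothing matches.
without_dotall = re.findall(PATTERN, SAMPLE, re.IGNORECASE)

# With DOTALL, the lazy body group runs up to the next "def" or the end.
with_dotall = re.findall(PATTERN, SAMPLE, re.IGNORECASE | re.DOTALL)

print(without_dotall)
for name, params, body in with_dotall:
    print(name, params, repr(body))
```

Note that the lookahead `(?=def|$)` stops at any occurrence of `def`, including inside a word such as `default`, so a lookahead tied to a line break, such as `(?=\ndef|$)`, is stricter.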

Answer2:

Thanks to <a href="https://stackoverflow.com/users/1401975/hamza" rel="nofollow">HamZa</a> for the quick help (and of course also thanks for all the other helpers), he actually solved the problem. There were just a few adjustments necessary (to make it work for C# :-)) but the main point comes from him, thanks a lot.

Solution for my problem:

Regex(r"(?is)def\s*(?<name>\w+)\s*\((?<parameter>[^)]+)\)\s*:\s*(?:\r?\n)+(?<body>.*?)(?=\r?\ndef|$)")
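For readers who want to try this outside IronPython, here is a rough port to CPython's `re` module. It is a sketch under assumptions, not the code from the thread: named groups use Python's `(?P<name>...)` spelling instead of .NET's `(?<name>...)`, the helper `find_functions` is a hypothetical name, and a list of dicts stands in for the DataTable. `SAMPLE` is the question's test data with four-space indentation assumed.

```python
import re

# Python spelling of the pattern above: (?P<name>...) instead of (?<name>...).
FUNCTION_RE = re.compile(
    r"(?is)def\s*(?P<name>\w+)\s*\((?P<parameter>[^)]+)\)\s*:\s*(?:\r?\n)+"
    r"(?P<body>.*?)(?=\r?\ndef|$)"
)


def find_functions(source):
    """Return one dict per matched function (hypothetical helper)."""
    return [m.groupdict() for m in FUNCTION_RE.finditer(source)]


# The question's test data; the indentation is an assumption.
SAMPLE = (
    "def test_function(some_parameter):\n"
    "    try:\n"
    "        some_cool_code_goes_here()\n"
    "        return obj\n"
    "    except Exception as ex:\n"
    "        DetailsBox.Show(ex)\n"
    "def another_cool_function(another_parameter):\n"
    "    try:\n"
    "        what_you_want()\n"
    "        return obj\n"
    "    except Exception as ex:\n"
    "        DetailsBox.Show(ex)\n"
)

rows = find_functions(SAMPLE)
for row in rows:
    print(row["name"], row["parameter"])
    print(row["body"])
```

The body keeps its leading indentation because the pattern only consumes the line break after the colon. Since `[^)]+` needs at least one character, a function without parameters, such as `def f():`, is skipped; `[^)]*` would include it.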
