Extract a complex String from between two Strings


I have a String which contains the following substring:

[Qual:3] [Text:PIX 1252471471953/YHYF/PPP121.40/10RTY10/NOLXX08X1] [Elem:123]

I'd like to extract the part between [Text: and ] i.e. PIX 1252471471953/YHYF/PPP121.40/10RTY10/NOLXX08X1.

How do I do this?


Pattern p = Pattern.compile("\\[Text:(.*?)\\]");
Matcher m = p.matcher("[Qual:3] [Text:PIX 1252471471953/YHYF/PPP121.40/10RTY10/NOLXX08X1] [Elem:123]");
m.find();
System.out.println(m.group(1));


PIX 1252471471953/YHYF/PPP121.40/10RTY10/NOLXX08X1

The \\[ and \\] escape the brackets, which are special characters in regexes (the backslash is doubled because it must itself be escaped inside a Java string literal). The .*? is a non-greedy (reluctant) quantifier, so it stops consuming characters at the first closing bracket. This part of the regex sits inside a capturing group (), which you can access with m.group(1).
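Note that m.group(1) throws an IllegalStateException if find() did not succeed, so for input that may lack a [Text:...] field it is worth checking the result of find(). A minimal sketch of that, assuming Java 8 or later; the class and method names (TextFieldExtractor, extractText) are made up for illustration:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the answer above.
class TextFieldExtractor {
    // Compile once and reuse; Pattern instances are immutable and thread-safe.
    private static final Pattern TEXT_FIELD = Pattern.compile("\\[Text:(.*?)\\]");

    // Returns the content of the first [Text:...] field, or empty if there is none.
    static Optional<String> extractText(String input) {
        Matcher m = TEXT_FIELD.matcher(input);
        // group(1) may only be read after find() has returned true.
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "[Qual:3] [Text:PIX 1252471471953/YHYF/PPP121.40/10RTY10/NOLXX08X1] [Elem:123]";
        System.out.println(extractText(line).orElse("<no Text field>"));
        System.out.println(extractText("[Qual:3] [Elem:123]").orElse("<no Text field>"));
    }
}
```

Returning an Optional is just one way to signal "no match"; returning null or throwing would work equally well depending on the caller.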


Use the following string as the regex:


The first capture group will give you exactly the substring you want.

The non-greedy match (.*?) is required to make it stop at the first ] rather than also including [Elem:123].
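A pattern consistent with this description is \\[Text:(.*?)\\], the same one quoted elsewhere in this thread. To see why the non-greedy form matters, here is a small sketch comparing it with the greedy (.*) variant; the class and method names are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing a reluctant and a greedy quantifier.
class GreedyVsReluctant {
    // Returns group 1 of the first match, or null when the regex does not match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "[Qual:3] [Text:PIX 1252471471953/YHYF/PPP121.40/10RTY10/NOLXX08X1] [Elem:123]";
        // Reluctant: stops at the first ']' after "[Text:".
        System.out.println(firstGroup("\\[Text:(.*?)\\]", input));
        // Greedy: runs on to the last ']' in the input, so it swallows "] [Elem:123" as well.
        System.out.println(firstGroup("\\[Text:(.*)\\]", input));
    }
}
```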


String.substring(int beginIndex, int endIndex)

Returns a new string that is a substring of this string.

You could use this to strip the unwanted start and end of the string.


You could use

String.indexOf(String str)

to get the indices of the start and end of the match, then copy the contents between them to a new result string.
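Combining the two methods avoids regular expressions entirely. A rough sketch, assuming the delimiters are the literal strings "[Text:" and "]"; the class and method names (SubstringExtractor, between) are invented for this example:

```java
// Hypothetical helper class; not part of the JDK.
class SubstringExtractor {
    // Returns the text between the first occurrence of start and the next
    // occurrence of end after it, or null if either delimiter is missing.
    static String between(String input, String start, String end) {
        int from = input.indexOf(start);
        if (from < 0) {
            return null;
        }
        from += start.length();            // skip past the opening delimiter
        int to = input.indexOf(end, from); // search for the end only after the start
        if (to < 0) {
            return null;
        }
        return input.substring(from, to);
    }

    public static void main(String[] args) {
        String input = "[Qual:3] [Text:PIX 1252471471953/YHYF/PPP121.40/10RTY10/NOLXX08X1] [Elem:123]";
        System.out.println(between(input, "[Text:", "]"));
    }
}
```

Using the two-argument indexOf(String, int) for the closing delimiter matters here: a plain indexOf("]") would find the bracket that closes [Qual:3] instead.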

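You could also use a regular expression. Note that

String.matches(String regex)

only reports whether the entire string matches the pattern and returns a boolean; to pull out the matched text you need java.util.regex.Pattern and Matcher. Writing regular expressions can get difficult, however.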

http://docs.oracle.com/javase/6/docs/api/java/lang/String.html

I hope this helps.


Instead of using "\\[Text:(.*?)\\]", as others have suggested, I'd go one step further and use lookarounds to filter out the text you don't want:


This will match exactly the text you want without having to select a capturing group.
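One lookaround pattern that fits this description is (?<=\\[Text:).*?(?=\\]): a lookbehind for "[Text:" and a lookahead for "]", neither of which becomes part of the match. This is a sketch of the idea rather than necessarily the exact regex originally posted, and the class and method names are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class illustrating lookbehind and lookahead.
class LookaroundExtractor {
    // (?<=\[Text:) requires "[Text:" immediately before the match;
    // (?=\]) requires "]" immediately after it. Neither is consumed.
    private static final Pattern TEXT_FIELD = Pattern.compile("(?<=\\[Text:).*?(?=\\])");

    // Returns the whole match (no capturing group needed), or null if nothing matches.
    static String extract(String input) {
        Matcher m = TEXT_FIELD.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "[Qual:3] [Text:PIX 1252471471953/YHYF/PPP121.40/10RTY10/NOLXX08X1] [Elem:123]";
        System.out.println(extract(input));
    }
}
```

Here m.group() with no argument returns the entire match, which is why no group index is required.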

