
Removing tags using preg_replace

I'm filtering a string (pulled from a text file) and removing all <script> and </script> tags using preg_replace. For some reason, it removes the actual text "script" but leaves the <> and </>. I've tried subbing in /< (to try and treat it as a literal), but that just generates errors. How do I get it to remove the brackets as well? The input is <script>Text</script>. Here's the code:

$file = file_get_contents($directory . "original-" . $name);
$file = htmlentities($file);
$file = preg_replace('<script>', '', $file);
$file = preg_replace('<\script>', '', $file);

And here is the output:

<>TEXT</>

Answer1:

The answer is

$html = preg_replace('#<script(.*?)>(.*?)</script>#is', '', $html);

But you might also want to have a look at the strip_tags function.
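
To compare the two suggestions above, here is a minimal sketch run against the question's sample input (the variable names are only illustrative). Two things are worth noting: the regex removes the element's contents along with the tags, whereas strip_tags keeps the inner text; and either one has to run before htmlentities, because once < and > have been converted to &lt; and &gt; there are no literal tags left to match.

```php
<?php
$raw = '<script>Text</script>';
$pattern = '#<script(.*?)>(.*?)</script>#is';

// The regex removes the opening tag, the closing tag, and everything between them.
$viaRegex = preg_replace($pattern, '', $raw);          // ""

// strip_tags removes the tags but keeps the text between them.
$viaStripTags = strip_tags($raw);                      // "Text"

// After htmlentities the angle brackets are entities, so the pattern no longer matches.
$escaped = htmlentities($raw);                         // "&lt;script&gt;Text&lt;/script&gt;"
$afterEscaping = preg_replace($pattern, '', $escaped); // same as $escaped
```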

Answer2:

The pattern you pass to the preg_* functions has to be enclosed in delimiters. PHP allows many different delimiters, so it treats your angle brackets as the regexp delimiters and not as part of the pattern. I ordinarily use { and } as delimiters; many other people use slashes, hash signs, square brackets, or parentheses. Angle brackets are also permitted as delimiters, which is why your pattern fails.

You can solve this by adding some delimiters around your patterns, e.g.:

$file = preg_replace('/<script>/', '', $file);
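
As a small sketch of the difference, using the question's sample input directly (before any htmlentities call): with no explicit delimiters the angle brackets are consumed as delimiters and only the word "script" is matched. For the closing tag, a delimiter other than the slash (here #) avoids having to escape the / inside the pattern.

```php
<?php
$input = '<script>Text</script>';

// '<script>' is read as the pattern "script" wrapped in < > delimiters,
// so only the word itself is removed.
$withoutDelimiters = preg_replace('<script>', '', $input);   // "<>Text</>"

// With explicit delimiters the angle brackets are part of the pattern.
$openingRemoved = preg_replace('/<script>/', '', $input);    // "Text</script>"

// '#' as delimiter, so the '/' in the closing tag needs no escaping.
$bothRemoved = preg_replace('#</script>#', '', $openingRemoved); // "Text"
```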

Also, note that PHP regular expressions are case-sensitive by default, so your pattern is foiled by a tag that says <SCRIPT> or <Script>. The i modifier after the closing delimiter makes the pattern case-insensitive (/<script>/i). There are also many different ways to write HTML tags that are still interpreted by the browser, e.g.:

<script type="text/javascript">...</script>
<script src="..." />
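
A rough sketch of a pattern that tolerates mixed case and attributes is shown below. The pattern and the sample strings (including the file name a.js) are assumptions made for illustration. It also shows that a variant without a closing tag still slips through, which is the point of the examples above.

```php
<?php
// i: case-insensitive, s: let "." match newlines inside the element.
$pattern = '#<script\b[^>]*>.*?</script>#is';

// Upper-case tag with an attribute: the whole element is removed.
$upper = '<SCRIPT type="text/javascript">alert(1)</SCRIPT>ok';
$cleaned = preg_replace($pattern, '', $upper);          // "ok"

// Self-closing form with no </script>: the pattern finds nothing to remove.
$selfClosing = '<script src="a.js" />after';
$leftAlone = preg_replace($pattern, '', $selfClosing);  // unchanged
```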

On a side note, and maybe I'm reading too much into your question: you should not, I repeat, not use regexps to parse HTML, and especially not to sanitize it.
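
One parser-based alternative, sketched here with PHP's bundled DOM extension, is to load the markup and remove the script elements as nodes. The function name removeScriptElements is hypothetical, and this is only an illustration of the approach, not a complete sanitizer: it does nothing about event-handler attributes, javascript: URLs, and similar vectors.

```php
<?php
// Hypothetical helper: drop every <script> element from an HTML string.
function removeScriptElements(string $html): string
{
    $doc = new DOMDocument();

    // Real-world HTML is often malformed; collect parser warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The node list is live, so copy it to an array before removing nodes.
    foreach (iterator_to_array($doc->getElementsByTagName('script')) as $node) {
        $node->parentNode->removeChild($node);
    }

    // saveHTML() returns a full document (doctype, html and body wrappers included).
    return $doc->saveHTML();
}

$result = removeScriptElements('<div><p>Hello</p><SCRIPT type="text/javascript">alert(1)</SCRIPT></div>');
```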

Answer3:

$html = preg_replace('#<script(.*?)>(.*?)</script>#is', '', $html);
