
How to scrape the DT and DD elements of a DL inside a div with DOMDocument/XPath

Question:

I am trying to get the DT and DD elements of a DL that I select by its class, and to process them in a foreach loop, but I am running into trouble.

<dl class="c-explain2">
<dt>所在地</dt>
<dd> 大阪府大阪市 北区天満1丁目25番1(地番) <br>

Here is my code:

$DOMParser = new \DOMDocument();
$DOMParser->loadHTML($html);
$xpath = new \DOMXPath($DOMParser);
$classname="c-explain2";
$getAllTable = $xpath->query("//dl[contains(@class, '$classname')]//");
foreach($getAllTable as $table){
    $allProperties = [];
    $table->getElementsByTagName('dt')[0]->nodeValue;
    $value = $table->getElementsByTagName('dd')[0]->nodeValue;
    $allProperties[] = [ 'property' => $property, 'value'=> $value];
}
$insertData[$start_id] = $allProperties;
$MyTable = true;

How can I get those dt and dd values and then put them into an array? Any help would be appreciated. Thank you.

Answer1:

There is a problem with your XPath expression: the trailing "//" leaves it incomplete, so the query fails. It should be "//dl[@class='$classname']".
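A side note on the class test, in case the real page differs from the sample: @class='...' only matches when the attribute is exactly that string, while contains(@class, ...) also matches longer class names that merely include it. A common XPath 1.0 idiom matches a whole class token instead. The sketch below compares the three on made-up markup (the "wide" and "c-explain2-old" classes are hypothetical, not from the question):

```php
<?php
// Hypothetical markup: one <dl> carries two class tokens, the other has a
// different class that merely starts with the same text.
$html = '<div><dl class="c-explain2 wide"><dt>A</dt><dd>1</dd></dl>'
      . '<dl class="c-explain2-old"><dt>B</dt><dd>2</dd></dl></div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$classname = 'c-explain2';

// Exact comparison of the whole attribute value.
$exact = $xpath->query("//dl[@class='$classname']");

// Substring test: also matches "c-explain2-old".
$substring = $xpath->query("//dl[contains(@class, '$classname')]");

// Token test: pad the normalized class list with spaces and look for
// the class name surrounded by spaces.
$token = $xpath->query(
    "//dl[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]"
);

printf(
    "exact: %d, substring: %d, token: %d\n",
    $exact->length,
    $substring->length,
    $token->length
);
// prints "exact: 0, substring: 2, token: 1"
```

If the class attribute on your page is always exactly "c-explain2", the simple equality test is enough.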

Also, it looks like you never assign $property in your loop. Try this:

<?php
$html = <<<END
<dl class="c-explain2">
<dt>所在地</dt>
<dd>大阪府大阪市 北区天満1丁目25番1(地番)</dd>
</dl>
END;

$DOMParser = new \DOMDocument();
$DOMParser->loadHTML($html);
$xpath = new \DOMXPath($DOMParser);

$classname = "c-explain2";
$getAllTable = $xpath->query("//dl[@class='$classname']");

// Initialize once, before the loop, so results from every <dl> are kept.
$allProperties = [];
foreach ($getAllTable as $table) {
    // Read the first <dt>/<dd> pair of this list.
    $property = $table->getElementsByTagName('dt')[0]->nodeValue;
    $value = $table->getElementsByTagName('dd')[0]->nodeValue;
    $allProperties[] = [
        'property' => $property,
        'value' => $value,
    ];
}
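If a single dl holds several dt/dd pairs, reading only index [0] picks up the first pair and skips the rest. Here is a minimal sketch of one way to collect every pair, assuming each dt is followed by its dd as a sibling and that the page is UTF-8. The second pair in the sample markup is invented for illustration, and the meta tag is prepended only because this sample fragment does not declare a charset itself:

```php
<?php
// Hypothetical sample markup; the "Price" pair is invented for illustration.
$html = <<<HTML
<div>
<dl class="c-explain2">
<dt>所在地</dt>
<dd> 大阪府大阪市 北区天満1丁目25番1(地番) <br></dd>
<dt>Price</dt>
<dd>TBD</dd>
</dl>
</div>
HTML;

$doc = new DOMDocument();

// Assumption: the markup is UTF-8. Declaring the charset up front keeps
// the parser from guessing a different encoding for the Japanese text.
$meta = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($meta . $html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$allProperties = [];

// Walk every <dt> that is a direct child of the matching <dl>.
foreach ($xpath->query("//dl[@class='c-explain2']/dt") as $dt) {
    // The first <dd> sibling after this <dt> holds its value.
    $dd = $xpath->query('following-sibling::dd[1]', $dt)->item(0);
    if ($dd === null) {
        continue; // a <dt> without a following <dd>
    }
    $allProperties[] = [
        'property' => trim($dt->textContent),
        'value'    => trim($dd->textContent),
    ];
}

print_r($allProperties);
```

trim() is there because the dd in the question has spaces around the address. If your real page already declares its charset, you can drop the $meta prefix.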
