
Ruby CSV - Illegal quoting in line 1. CSV::MalformedCSVError

I have a problem reading a CSV file. The file comes from Windows, so I suspect an encoding issue. My code looks like this:

    CSV.open(path, 'w', headers: :first_row, col_sep: ';', row_sep: "\r\n", encoding: 'utf-8') do |csv|
      CSV.parse(open(doc.file.url), headers: :first_row, col_sep: ';', quote_char: "\"", row_sep: "\r\n", encoding: 'utf-8').each_with_index do |line, index|
        csv << line.headers if index == 0
        # do something with row
        csv << line
      end
    end

I have to open an existing file and fill in some of its columns, so I write the result to a new file. The existing file is stored on Dropbox, so I have to use the open method.

The problem is that I get an error in this line:

CSV.parse(open(doc.file.url), headers: :first_row, col_sep: ';', quote_char: "\"", row_sep: "\r\n", encoding: 'utf-8').each_with_index do |line, index|

The error is:

Illegal quoting in line 1. CSV::MalformedCSVError

I checked, and it seems that there are no BOM characters in the file (though I'm not sure I checked correctly). The problem seems to be with the quote character. The exception is thrown for every line in the file.

This is the file that causes me problems: https://dl.dropboxusercontent.com/u/3900955/geo_bez_adresu_10_do_testow_small.csv

I tried different approaches from StackOverflow, but nothing helps. For example, I changed my code to this:

    CSV.open(path, 'w', headers: :first_row, col_sep: ';', row_sep: "\r\n", encoding: 'utf-8') do |csv|
      open(doc.file.url) do |f|
        f.each_line do |line|
          CSV.parse(line, 'r:bom|utf-8') do |row|
            csv << row
          end
        end
      end
    end

but it doesn't help. I will be grateful for any help with parsing this file.

======= edit =========

When I save the same file on Windows with the encoding "ANSI as UTF-8" (in Notepad++), I can parse the file correctly. According to the discussion "What is "ANSI as UTF-8" and how can I make fputcsv() generate UTF-8 w/BOM?", it seems that the original file does have a BOM. How can I check in Ruby whether my file has a BOM, and how can I parse a CSV file with a BOM?

Answer1:

CSV.parse() requires a string as its first argument, but you're passing a File object instead. As a result, parse() ends up parsing the value of (file object).to_s, and that causes the error.

Update

To read a file with a BOM, you can use this:

    CSV.new(File.open('file.csv', 'r:bom|utf-8'), col_sep: ';').each do |row|
      ...
    end

Reference: https://stackoverflow.com/a/7780559/445221
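As for checking whether a file has a BOM: a UTF-8 BOM is the three bytes EF BB BF at the very start of the file, so one way is to read the first three bytes in binary mode and compare. The snippet below is only a sketch; the helper names `utf8_bom?` and `read_rows` are made up for illustration, and `read_rows` assumes a semicolon-separated file with a header row, like the one in the question.

```ruby
require 'csv'

# The UTF-8 byte order mark as raw bytes.
UTF8_BOM = "\xEF\xBB\xBF".b

# Hypothetical helper: true when the file at +path+ starts with a UTF-8 BOM.
def utf8_bom?(path)
  # Reading with a length returns a binary string (or nil for an empty file).
  File.open(path, 'rb') { |f| f.read(3) } == UTF8_BOM
end

# Hypothetical helper: reads a semicolon-separated file into an array of hashes.
# The 'r:bom|utf-8' mode makes Ruby skip a leading BOM if there is one,
# so the same call works for files with and without a BOM.
def read_rows(path)
  File.open(path, 'r:bom|utf-8') do |f|
    CSV.new(f, col_sep: ';', headers: true).map(&:to_h)
  end
end
```

Because 'r:bom|utf-8' is harmless when no BOM is present, the check is mostly useful for diagnosing a file rather than for deciding how to open it.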

Answer2:

I didn't find any way to read directly from a remote file that contains a BOM. So I use Tempfile to create a temporary local copy, and then I call CSV.open on it with 'r:bom|utf-8':

    doc = Document.find(doc_id)
    path = "#{Rails.root.join('tmp')}/#{doc.name.split('.').first}_#{Time.now.to_i}.csv"

    # Download the remote file into a local temporary file, byte for byte.
    file = Tempfile.new(["#{doc.name.split('.').first}_#{Time.now.to_i}", '.csv'])
    file.binmode
    file << open(doc.file.url).read
    file.close

    CSV.open(path, 'w', headers: :first_row, col_sep: ';', row_sep: "\r\n", encoding: 'utf-8') do |csv|
      # 'r:bom|utf-8' strips the BOM while reading the local copy.
      CSV.open(file.path, 'r:bom|utf-8', headers: :first_row, col_sep: ';', quote_char: "\"", row_sep: "\r\n").each_with_index do |line, index|
        # do something
      end
    end

Now, it seems to parse the file.
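If the temporary file is only there to get rid of the BOM, a possible alternative is to strip the BOM from the downloaded string and hand that string to CSV.parse. This is a sketch rather than a tested replacement for the code above: `parse_csv_string` is a hypothetical helper name, and it assumes the content really is UTF-8 with a header row.

```ruby
require 'csv'

# The UTF-8 byte order mark as raw bytes.
UTF8_BOM_BYTES = "\xEF\xBB\xBF".b

# Hypothetical helper: parses CSV content held in a String,
# dropping a leading UTF-8 BOM if present.
def parse_csv_string(raw, col_sep: ';')
  # Work on a binary copy so the prefix comparison is purely byte-based,
  # then label the result as UTF-8 (assumption: the content is UTF-8).
  text = raw.b.delete_prefix(UTF8_BOM_BYTES).force_encoding(Encoding::UTF_8)
  CSV.parse(text, headers: true, col_sep: col_sep)
end
```

Here `raw` would be whatever the download returns, for example the result of `open(doc.file.url).read` from the code above. String#delete_prefix needs Ruby 2.5 or newer.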
