UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position

When I try to extract a pattern from tagged text in NLTK, I get the error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 79: ordinal not in range(128). At first I did not get this error; it only appeared after I installed some packages.

This is the code:

# -*- coding: utf-8 -*-
import codecs
import sys
import re
import sys
import nltk
from nltk.corpus import *

k = nltk.corpus.brown.tagged_words('myfile')
for (w1,t1), (w2,t2) in nltk.bigrams(k):
    if t1 == 'NN' and t2 == 'AJ':
        print w1, w2

This is the entire output of the code:

Traceback (most recent call last):
  File "/home/fathi/egfe.py", line 12, in <module>
    for (w1,t1), (w2,t2) in nltk.bigrams(k):
  File "/usr/local/lib/python2.7/dist-packages/nltk/util.py", line 442, in bigrams
    for item in ngrams(sequence, 2, **kwargs):
  File "/usr/local/lib/python2.7/dist-packages/nltk/util.py", line 419, in ngrams
    history.append(next(sequence))
  File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/reader/util.py", line 291, in iterate_from
    tokens = self.read_block(self._stream)
  File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/reader/tagged.py", line 241, in read_block
    for para_str in self._para_block_reader(stream):
  File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/reader/util.py", line 564, in read_blankline_block
    line = stream.readline()
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 1095, in readline
    new_chars = self._read(readsize)
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 1322, in _read
    chars, bytes_decoded = self._incr_decode(bytes)
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 1352, in _incr_decode
    return self.decode(bytes, 'strict')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 79: ordinal not in range(128)

Answer1:

The problem is that the installed NLTK version is not compatible with the Python version, so an older version of NLTK is required.
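Whatever the cause turns out to be, the error message itself can be reproduced with the standard library alone, which may help when checking what the reader is choking on. The sketch below is only an illustration, not the fix described above: the sample bytes and the helper name try_decode are hypothetical, and it assumes the corpus file contains UTF-8 text. The byte 0xc3 is the first byte of many two-byte UTF-8 sequences (accented Latin letters, for example), and a strict ASCII decode rejects any byte above 0x7f.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample data: the word "café" encoded as UTF-8.
# 0xc3 0xa9 is the two-byte UTF-8 sequence for "é".
raw = b"caf\xc3\xa9"


def try_decode(data, encoding):
    """Return the decoded text, or the UnicodeDecodeError raised by a strict decode."""
    try:
        return data.decode(encoding, "strict")
    except UnicodeDecodeError as exc:
        return exc


# Strict ASCII decoding fails on the first byte above 0x7f.
ascii_result = try_decode(raw, "ascii")

# Decoding the same bytes as UTF-8 succeeds.
utf8_result = try_decode(raw, "utf-8")

print(repr(ascii_result))
print(repr(utf8_result))
```

Here the ASCII attempt fails at index 3, the position of the 0xc3 byte, just as the traceback reports position 79 for the corpus file. If the same pattern holds for 'myfile', the reader is decoding it as ASCII rather than UTF-8.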
