Weird characters while reading file content


I'm not sure what is wrong:

for line in open(textfile, 'r'): print(line)



The file was created in Notepad++ with Unix line endings and UTF-8 encoding.

It now works properly when I save the file in Notepad++ with the "UTF-8 without BOM" encoding option. But why? And how can I convert all the files I am sent to UTF-8 so that the weird characters do not appear?


Specifying the <a href="https://docs.python.org/3/library/codecs.html#standard-encodings" rel="nofollow">encoding</a> will solve your problem.

for line in open(textfile, 'r', encoding='utf-8-sig'): print(line)

<a href="https://docs.python.org/3/library/codecs.html#module-encodings.utf_8_sig" rel="nofollow">utf_8_sig</a>: UTF-8 codec with BOM signature
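The weird characters are most likely the byte order mark (BOM), the three bytes EF BB BF that some editors write at the start of a UTF-8 file. Without an explicit encoding, open() falls back to the platform default, and under cp1252 those bytes display as "ï»¿". Here is a minimal sketch of the difference between the two codecs, assuming a throwaway file (the sample.txt path and its contents are made up for illustration):

```python
import codecs
import os
import tempfile

# Write a small file that starts with the UTF-8 BOM (bytes EF BB BF),
# as an editor does when saving "UTF-8" rather than "UTF-8 without BOM".
path = os.path.join(tempfile.mkdtemp(), "sample.txt")  # hypothetical file
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF8 + "first line\nsecond line\n".encode("utf-8"))

# Plain utf-8 decodes the BOM as the character U+FEFF at the start of the text.
with open(path, "r", encoding="utf-8") as f:
    with_bom = f.readline()

# utf-8-sig skips a leading BOM if present and otherwise behaves like utf-8.
with open(path, "r", encoding="utf-8-sig") as f:
    without_bom = f.readline()

print(repr(with_bom))
print(repr(without_bom))
```

Because utf-8-sig also reads files that have no BOM, it is a safe choice when you do not know which variant you will receive.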


You must specify the encoding of your file when reading it, in this case UTF-8.

Add a third parameter to the open() call to set the encoding. Change:

for line in open(textfile, 'r'): print(line)

to:

for line in open(textfile, 'r', encoding='utf-8-sig'): print(line)
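As for converting the files you receive, one possible approach is to read each file with utf-8-sig and write it back as plain utf-8, which never emits a BOM. The following is only a sketch: strip_bom is a hypothetical helper name, and it assumes the incoming files really are UTF-8 (a file in another encoding would raise UnicodeDecodeError).

```python
import os
import tempfile

def strip_bom(path):
    """Rewrite the file at `path` as UTF-8 without a BOM (hypothetical helper)."""
    # utf-8-sig drops a leading BOM if there is one.
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        text = f.read()
    # Plain utf-8 never writes a BOM; newline="" leaves line endings unchanged.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)

# Usage on a throwaway file that starts with a BOM.
demo = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo, "wb") as f:
    f.write(b"\xef\xbb\xbfhello\n")
strip_bom(demo)
with open(demo, "rb") as f:
    result = f.read()
print(result)
```

If you only need to read the files, converting them is optional: opening them with encoding='utf-8-sig' handles both variants.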


