
Read through sentences separated by line breaks and parse them

Question:

I have tokenized sentences, one token per line, with sentences separated by a blank line. Each token has two columns: the actual tag and the predicted tag. I want to loop through the tokens and flag every sentence that contains a wrong prediction, i.e. a token whose actual tag is not equal to its predicted tag.

Input:

    #word actual predicted
    James PERSON PERSON
    Washington PERSON LOCATION
    went O O
    home O LOCATION

    He O O
    took O O
    Elsie PERSON PERSON
    along O O

Expected output:

    >James Washington went home: Incorrect
    >He took Elsie along: Correct

Answer1:

Building on my <a href="https://stackoverflow.com/a/23084050/846892" rel="nofollow">previous answer</a>, this version uses <a href="https://docs.python.org/2/library/functions.html#all" rel="nofollow">all()</a> and a list comprehension:

    from itertools import groupby

    d = {True: 'Correct', False: 'Incorrect'}

    with open('text1.txt') as f:
        for k, g in groupby(f, key=str.isspace):
            if not k:
                # Split each line in the current group at whitespaces
                data = [line.split() for line in g]
                # If for each line the second column is equal to third then `all()` will
                # return True.
                predicts_matched = all(line[1] == line[2] for line in data)
                print('{}: {}'.format(' '.join(x[0] for x in data), d[predicts_matched]))

Output:

    James Washington went home: Incorrect
    He took Elsie along: Correct
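For experimenting without a file on disk, the same grouping idea can be sketched on an in-memory string. This is only an illustration: `label_sentences` and `SAMPLE` are hypothetical names that do not appear in the answer above, and it assumes that any line starting with `#` (such as the `#word actual predicted` header in the question) is a comment to be skipped.

```python
from itertools import groupby

# Sample data copied from the question, including the header line.
SAMPLE = """\
#word actual predicted
James PERSON PERSON
Washington PERSON LOCATION
went O O
home O LOCATION

He O O
took O O
Elsie PERSON PERSON
along O O
"""

def label_sentences(text):
    """Return a list of (sentence, all_tags_match) pairs."""
    # Assumption: lines starting with '#' are headers/comments, not tokens.
    lines = [ln for ln in text.splitlines() if not ln.startswith('#')]
    results = []
    # Consecutive non-blank lines form one sentence; blank lines separate them.
    for is_blank, group in groupby(lines, key=lambda ln: not ln.strip()):
        if is_blank:
            continue
        rows = [ln.split() for ln in group]
        sentence = ' '.join(row[0] for row in rows)
        results.append((sentence, all(row[1] == row[2] for row in rows)))
    return results

for sentence, ok in label_sentences(SAMPLE):
    print('{}: {}'.format(sentence, 'Correct' if ok else 'Incorrect'))
```

Returning the pairs instead of printing inside the loop makes the result easy to reuse, for example to count how many sentences were tagged correctly.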

Answer2:

Python's built-in string methods (split() and strip()) are enough to parse this. I wrote this with Python 3.3, but it should work with other versions as well.

    thistext = '''James PERSON PERSON
    Washington PERSON LOCATION
    went O O
    home O LOCATION

    He O O
    took O O
    Elsie PERSON PERSON
    along O O
    '''

    def check_text(text):
        lines = text.split('\n')
        correct = [True]  # a bool wrapped in a list, so we can modify it from a nested function
        words = []

        def print_result():
            # Print the sentence collected so far, then reset the state
            if words:
                print('{}: {}'.format(' '.join(words), "Correct" if correct[0] else "Incorrect"))
                # words.clear()
                del words[:]
                correct[0] = True

        for line in lines:
            if line.strip():  # the line is not empty: it holds a token
                word, a, b = line.split()
                if a != b:
                    correct[0] = False
                words.append(word)
            else:  # a blank line ends the current sentence
                print_result()
        print_result()  # flush the last sentence if the text has no trailing blank line

    check_text(thistext)
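If it is also useful to see which tokens were mispredicted, the same line-by-line loop can collect them. The following is a minimal sketch, not part of the answer above: `find_mismatches` is a hypothetical helper name, and it assumes every non-blank line has exactly three whitespace-separated columns.

```python
def find_mismatches(text):
    """Yield (sentence, mismatches) for each blank-line-separated sentence.

    mismatches is a list of (word, actual, predicted) tuples.
    """
    words, wrong = [], []
    # The extra '' acts as a sentinel so the last sentence is always flushed.
    for line in text.split('\n') + ['']:
        if line.strip():
            # Assumption: exactly three columns per token line.
            word, actual, predicted = line.split()
            words.append(word)
            if actual != predicted:
                wrong.append((word, actual, predicted))
        elif words:
            yield ' '.join(words), wrong
            words, wrong = [], []

sample = (
    "James PERSON PERSON\n"
    "Washington PERSON LOCATION\n"
    "went O O\n"
    "home O LOCATION\n"
    "\n"
    "He O O\n"
    "took O O\n"
    "Elsie PERSON PERSON\n"
    "along O O\n"
)

for sentence, wrong in find_mismatches(sample):
    print('{}: {}'.format(sentence, 'Incorrect' if wrong else 'Correct'))
    for word, actual, predicted in wrong:
        print('    {}: expected {}, got {}'.format(word, actual, predicted))
```

A sentence is reported as correct exactly when its list of mismatches is empty, so no separate flag is needed.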
