Python: how to group similar lists together in a list of lists?

I have a list of lists in Python and want to group similar lists together: lists whose first three elements are the same should go in one group. For example:

[["a", "b", "c", 1, 2], ["d", "f", "g", 8, 9], ["a", "b", "c", 3, 4], ["d","f", "g", 3, 4], ["a", "b", "c", 5, 6]]

I want the result to look like this:

[[["a", "b", "c", 1, 2], ["a", "b", "c", 5, 6], ["a", "b", "c", 3, 4]], [["d","f", "g", 3, 4], ["d", "f", "g", 8, 9]]]

I could do this by iterating over the list and manually comparing the elements of consecutive lists, then deciding whether to group them based on how many of their elements match. But I was wondering whether there is another, more Pythonic way to do this.

Answer1:

You can use itertools.groupby:

>>> from itertools import groupby
>>> from operator import itemgetter
>>> A = [["a", "b", "c", 1, 2],
...      ["d", "f", "g", 8, 9],
...      ["a", "b", "c", 3, 4],
...      ["d", "f", "g", 3, 4],
...      ["a", "b", "c", 5, 6]]
>>> # groupby only merges adjacent items, so sort first
>>> [list(g) for _, g in groupby(sorted(A), itemgetter(0, 1, 2))]
[[['a', 'b', 'c', 1, 2], ['a', 'b', 'c', 3, 4], ['a', 'b', 'c', 5, 6]], [['d', 'f', 'g', 3, 4], ['d', 'f', 'g', 8, 9]]]
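The sort matters because groupby only merges consecutive items that share a key, so on unsorted input the same key can end up in several groups. A minimal sketch of the difference, reusing the list from the question (the variable names here are illustrative, not part of the answer above):

```python
from itertools import groupby
from operator import itemgetter

A = [["a", "b", "c", 1, 2], ["d", "f", "g", 8, 9], ["a", "b", "c", 3, 4],
     ["d", "f", "g", 3, 4], ["a", "b", "c", 5, 6]]

key = itemgetter(0, 1, 2)  # first three elements, returned as a tuple

# Without sorting: every run of consecutive equal keys becomes its own group.
unsorted_groups = [list(g) for _, g in groupby(A, key)]

# Sorting by the same key first places equal keys next to each other.
# sorted() is stable, so rows keep their original relative order within a group.
sorted_groups = [list(g) for _, g in groupby(sorted(A, key=key), key)]

print(len(unsorted_groups))  # 5
print(len(sorted_groups))    # 2
```

Sorting with key=key instead of a bare sorted(A) also avoids comparing the trailing elements, which could matter if they are not mutually comparable.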

Answer2:

You don't need to sort, you can group in a dict using a tuple of the first three elements from each list as the key:

from collections import OrderedDict
from pprint import pprint as pp

l = [
    ["a", "b", "c", 1, 2],
    ["d", "f", "g", 8, 9],
    ["a", "b", "c", 3, 4],
    ["d", "f", "g", 3, 4],
    ["a", "b", "c", 5, 6],
]

od = OrderedDict()
for sub in l:
    k = tuple(sub[:3])  # first three elements as a hashable key
    od.setdefault(k, []).append(sub)

pp(list(od.values()))  # list() so Python 3 prints a plain list

Output:

[[['a', 'b', 'c', 1, 2], ['a', 'b', 'c', 3, 4], ['a', 'b', 'c', 5, 6]],
 [['d', 'f', 'g', 8, 9], ['d', 'f', 'g', 3, 4]]]

This is O(n), as opposed to O(n log n) for the sort-based approach.

If you don't care about order, use a defaultdict:

from collections import defaultdict
from pprint import pprint as pp

od = defaultdict(list)
for sub in l:  # l is the list defined in the previous snippet
    a, b, c, *_ = sub  # python3
    k = a, b, c
    od[k].append(sub)

pp(list(od.values()))

Output:

[[['a', 'b', 'c', 1, 2], ['a', 'b', 'c', 3, 4], ['a', 'b', 'c', 5, 6]],
 [['d', 'f', 'g', 8, 9], ['d', 'f', 'g', 3, 4]]]
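Either dict-based version can be wrapped in a small reusable helper. The following is only a sketch, assuming Python 3.7+ (where a plain dict keeps insertion order); group_by_prefix and its n parameter are hypothetical names, not something from the answers above:

```python
def group_by_prefix(rows, n=3):
    """Group rows whose first n elements are equal, in first-seen order."""
    groups = {}  # plain dict: insertion-ordered on Python 3.7+ (assumption)
    for row in rows:
        # A tuple of the first n elements serves as the hashable group key.
        groups.setdefault(tuple(row[:n]), []).append(row)
    return list(groups.values())


data = [["a", "b", "c", 1, 2], ["d", "f", "g", 8, 9], ["a", "b", "c", 3, 4],
        ["d", "f", "g", 3, 4], ["a", "b", "c", 5, 6]]

result = group_by_prefix(data)
print(result)
```

Making the prefix length a parameter keeps the single O(n) pass while letting the same function group on, say, only the first element with group_by_prefix(data, n=1).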
