50258

Calculating frequency distribution of a collection with .Net/C#

Is there a fast/simple way to calculate the frequency distribution of a .Net collection using Linq or otherwise?

For example: An arbitrarily long List contains many repetitions. What's a clever way of walking the list and counting/tracking repetitions?

Answer1:

The easiest way is to use a hashmap and either use the value as the key and increment the value, or pick a bucket size (bucket 1 = 1 - 10, bucket 2 = 11 - 20, etc), and increment each bucket by the value.

Then you can go through and determine the frequencies.

Answer2:

The simplest way to find duplicate items in a list is to group it, like this:

var dups = list.GroupBy(i => i).Where(g => g.Skip(1).Any());

(Writing Skip(1).Any() should be faster than (Count() > 1) because it won't have to traverse more than two items from each group. However, the difference is probably negligible unless list's enumerator is slow)

Answer3:

The C5 generic collections library has a HashBag implementation that accepts duplicates by counting. The following pseudo-code would get you what you're looking for:

var hash = new HashBag();
hash.AddAll(list);
var mults = hash.ItemMultiplicities();
</pre>

(where K is the type of the items in your list) mults will then contain an IDictionary<K,int> where the list item is the key and the multiplicity is the value.

Recommend

  • Failed to start an instrument test from Firebase gcloud command line
  • AnalyticsReceiver in Google Analytics Tracking
  • Custom variables on product details page in Magento
  • ExtJS 4 Spring 3 file upload. Server sends bad response content type
  • How to use both ga.js and analytics.js?
  • Inno Setup Search for specifc file on a CD, retrieve exact filepath and return value to [Files]-Sect
  • How to clear out the contents of a map when clear() method call throws UnsupportedOperationException
  • UIimage to char* conversion
  • Update Search Results to Lazy Adapter in android
  • ProgressBar Paint Method?
  • Easiest way to covert part of a byte array to uint16
  • How to apply async task into this
  • C: Custom strlen() library function
  • Weighted round robin dns between 2 Cloudfront distributions
  • Can I monitor the progress of an S3 download using the AWS SDK?
  • Google datalab : how to import pickle
  • MySql - get days remaining
  • Merge list of objects into consistent list based on common matching attribute in Python
  • Insert records if not exist SQL Server 2005
  • How to set current CultureUI via XAML binding
  • Unique Permutations - with exceptions
  • Other than Linq to SQL does anything else consume INotifyPropertyChanging?
  • What's the syntax to inherit documentation from another indexer?
  • Cloud Code function running twice
  • Why the SequenceFile is truncated?
  • DIV instruction jumping to random location?
  • Updating both a ConcurrentHashMap and an AtomicInteger safely
  • ListItem.Attributes.Add not working
  • Atlas images wrong size on iPad iOS 9
  • Change multiple background-images with jQuery
  • Why value captured by reference in lambda is broken? [duplicate]
  • Android screen density dpi vs ppi
  • DirectX11 ClearRenderTargetViewback with transparent buffer?
  • Change an a tag attribute in JavaScript based on screen width
  • php design question - will a Helper help here?
  • Why joiner is not used after Sequence generator or Update statergy
  • Recursive/Hierarchical Query Using Postgres
  • UserPrincipal.Current returns apppool on IIS
  • jQuery Masonry / Isotope and fluid images: Momentary overlap on window resize
  • How do I use LINQ to get all the Items that have a particular SubItem?