61215

Compute Most Frequent Occurance of Numbers of A Sorted List in Haskell

The question is to compute the mode (the value that occurs most frequently) of a sorted list of integers.

[1,1,1,1,2,2,3,3] -> 1 [2,2,3,3,3,3,4,4,8,8,8,8] -> 3 or 8 [3,3,3,3,4,4,5,5,6,6] -> 3

Just use the Prelude library.

Are the functions filter, map, foldr in Prelude library?

Answer1:

Starting from the beginning.

You want to make a pass through a sequence and get the maximum frequency of an integer.

This sounds like a job for fold, as fold goes through a sequence aggregating a value along the way before giving you a final result.

foldl :: (a -> b -> a) -> a -> [b] -> a

The type of foldl is shown above. We can fill in some of that already (I find that helps me work out what types I need)

foldl :: (a -> Int -> a) -> a -> [Int] -> a

We need to fold something through that to get the value. We have to keep track of the current run and the current count

data BestRun = BestRun { currentNum :: Int, occurrences :: Int, bestNum :: Int, bestOccurrences :: Int }

So now we can fill in a bit more:

foldl :: (BestRun -> Int -> BestRun) -> BestRun -> [Int] -> BestRun

So we want a function that does the aggregation

f :: BestRun -> Int -> BestRun f (BestRun current occ best bestOcc) x | x == current = (BestRun current (occ + 1) best bestOcc) -- continuing current sequence | occ > bestOcc = (BestRun x 1 current occ) -- a new best sequence | otherwise = (BestRun x 1 best bestOcc) -- new sequence

So now we can write the function using foldl as

bestRun :: [Int] -> Int bestRun xs = bestNum (foldl f (BestRun 0 0 0 0) xs)

Answer2:

Are the functions filter, map, foldr in Prelude library?

Stop...Hoogle time!

Did you know Hoogle tells you which module a function is from? Hoolging map results in this information on the search page:

map :: (a -> b) -> [a] -> [b]

base Prelude, base Data.List

This means map is defined both in Prelude and in Data.List. You can hoogle the other functions and likewise see that they are indeed in Prelude.

You can also look at Haskell 2010 > Standard Prelude or the Prelude hackage docs.

So we are allowed to map, filter, and foldr, as well as anything else in Prelude. That's good. Let's start with Landei's idea, to turn the list into a list of lists.

groupSorted :: [a] -> [[a]] groupSorted = undefined -- groupSorted [1,1,2,2,3,3] ==> [[1,1],[2,2],[3,3]]

How are we supposed to implement groupSorted? Well, I dunno. Let's think about that later. Pretend that we've implemented it. How would we use it to get the correct solution? I'm assuming it is OK to choose just one correct solution, in the event that there is more than one (as in your second example).

mode :: [a] -> a mode xs = doSomething (groupSorted xs) where doSomething :: [[a]] -> a doSomething = undefined -- doSomething [[1],[2],[3,3]] ==> 3 -- mode [1,2,3,3] ==> 3

We need to do something after we use groupSorted on the list. But what? Well...we should find the longest list in the list of lists. Right? That would tell us which element appears the most in the original list. Then, once we find the longest sublist, we want to return the element inside it.

chooseLongest :: [[a]] -> a chooseLongest xs = head $ chooseBy (\ys -> length ys) xs where chooseBy :: ([a] -> b) -> [[a]] -> a chooseBy f zs = undefined -- chooseBy length [[1],[2],[3,3]] ==> [3,3] -- chooseLongest [[1],[2],[3,3]] ==> 3

chooseLongest is the doSomething from before. The idea is that we want to choose the best list in the list of lists xs, and then take one of its elements (its head does just fine). I defined this by creating a more general function, chooseBy, which uses a function (in this case, we use the length function) to determine which choice is best.

Now we're at the "hard" part. Folds. chooseBy and groupSorted are both folds. I'll step you through groupSorted, and leave chooseBy up to you.

<strong>How to write your own folds</strong>

We know groupSorted is a fold, because it consumes the entire list, and produces something entirely new.

groupSorted :: [Int] -> [[Int]] groupSorted xs = foldr step start xs where step :: Int -> [[Int]] -> [[Int]] step = undefined start :: [[Int]] start = undefined

We need to choose an initial value, start, and a stepping function step. We know their types because the type of foldr is (a -> b -> b) -> b -> [a] -> b, and in this case, a is Int (because xs is [Int], which lines up with [a]), and the b we want to end up with is [[Int]].

Now remember, the stepping function will inspect the elements of the list, one by one, and use step to fuse them into an accumulator. I will call the currently inspected element v, and the accumulator acc.

step v acc = undefined

Remember, in theory, foldr works its way from right to left. So suppose we have the list [1,2,3,3]. Let's step through the algorithm, starting with the rightmost 3 and working our way left.

step 3 start = [[3]]

Whatever start is, when we combine it with 3 it should end up as [[3]]. We know this because if the original input list to groupSorted were simply [3], then we would want [[3]] as a result. However, it isn't just [3]. Let's pretend now that it's just [3,3]. [[3]] is the new accumulator, and the result we would want is [[3,3]].

step 3 [[3]] = [[3,3]]

What should we do with these inputs? Well, we should tack the 3 onto that inner list. But what about the next step?

step 2 [[3,3]] = [[2],[3,3]]

In this case, we should create a new list with 2 in it.

step 1 [[2],[3,3]] = [[1],[2],[3,3]]

Just like last time, in this case we should create a new list with 1 inside of it.

At this point we have traversed the entire input list, and have our final result. So how do we define step? There appear to be two cases, depending on a comparison between v and acc.

step v acc@((x:xs):xss) | v == x = (v:x:xs) : xss | otherwise = [v] : acc

In one case, v is the same as the head of the first sublist in acc. In that case we prepend v to that same sublist. But if such is not the case, then we put v in its own list and prepend that to acc. So what should start be? Well, it needs special treatment; let's just use [] and add a special pattern match for it.

step elem [] = [[elem]] start = []

And there you have it. All you have to do to write your on fold is determine what start and step are, and you're done. With some cleanup and eta reduction:

groupSorted = foldr step [] where step v [] = [[v]] step v acc@((x:xs):xss) | v == x = (v:x:xs) : xss | otherwise = [v] : acc

This may not be the most efficient solution, but it works, and if you later need to optimize, you at least have an idea of how this function works.

Answer3:

I don't want to spoil all the fun, but a group function would be helpful. Unfortunately it is defined in Data.List, so you need to write your own. One possible way would be:

-- corrected version, see comments grp [] = [] grp (x:xs) = let a = takeWhile (==x) xs b = dropWhile (==x) xs in (x : a) : grp b

E.g. grp [1,1,2,2,3,3,3] gives [[1,1],[2,2],[3,3,3]]. I think from there you can find the solution yourself.

Answer4:

I'd try the following:

mostFrequent = snd . foldl1 max . map mark . group where mark (a:as) = (1 + length as, a) mark [] = error "cannot happen" -- because made by group

Note that it works for any finite list that contains orderable elements, not just integers.

Recommend

  • Conditional Vlook up without using VBA
  • Should I optimize around reads or CPU time in Google App Engine
  • Equivalent of branch..merge for git-push
  • Accessing local repository in offline mode
  • MarkLogic Node.js Sort on “last-modified”
  • c++ search a vector for element first seen position
  • getelement by class name for clicking
  • PHP UTF-8 to GB2312
  • Using android opencv apps without downloading opencv sdk manager
  • Efficient User-Agent Regex to find Safari in Python
  • Django Haystack Rebuild Index
  • jQuery: How to AJAXify WordPress Search?
  • How to install node-mysql?
  • Selenium to click on a javascript button corresponding to a text
  • Is playing sound in Javascript performance heavy?
  • How do I exclude a dependency in provided scope when running in Maven test scope?
  • Is there a perl module to validate passwords stored in “{crypt}hashedpassword” “{ssha}hashedpassword
  • Google Custom Search with transparent background
  • Meteor helpers not available in Angular template
  • Insert into database using onclick function
  • What is Eclipse's Declaration View used for?
  • DirectX11 ClearRenderTargetViewback with transparent buffer?
  • Does CUDA 5 support STL or THRUST inside the device code?
  • Knitr HTML Loop - Some HTML output, some R output
  • Can I make an Android app that runs a web view in Chrome 39?
  • Can a Chrome extension content script make an jQuery AJAX request for an html file that is itself a
  • Why is the timeout on a windows udp receive socket always 500ms longer than set by SO_RCVTIMEO?
  • How do you troubleshoot character encoding problems?
  • Web-crawler for facebook in python
  • Why winpcap requires both .lib and .dll to run?
  • Unit Testing MVC Web Application in Visual Studio and Problem with QTAgent
  • using HTMLImports.whenReady not working in chrome
  • Understanding cpu registers
  • embed rChart in Markdown
  • Can Visual Studio XAML designer handle font family names with spaces as a resource?
  • How can I remove ASP.NET Designer.cs files?
  • Are Kotlin's Float, Int etc optimised to built-in types in the JVM? [duplicate]
  • How to get NHibernate ISession to cache entity not retrieved by primary key
  • How can I use `wmic` in a Windows PE script?
  • Unable to use reactive element in my shiny app