78300

Given two strings, how do I find number of reoccurences of one in another?

For example, s1='abc', s2='kokoabckokabckoab'. Output should be 3. (number of times s1 appears in s2).

Not allowed to use for or strfind. Can only use reshape,repmat,size.

I thought of reshaping s2, so it would contain all of the possible strings of 3s:

s2 =

kok

oko

koa

oab

.... etc

But I'm having troubles from here..

Answer1:

Assuming you have your matrix reshaped into the format you have in your post, you can replicate s1 and stack the string such that it has as many rows as there are in the reshaped s2 matrix, then do an equality operator. Rows that consist of all 1s means that we have found a match and so you would simply search for those rows where the total sum is equal to the total length of s1. Referring back to my post on dividing up a string into overlapping substrings, we can decompose your string into what you have posted in your question like so:

%// Define s1 and s2 here s1 = 'abc'; len = length(s1); s2 = 'kokoabckokabckoab'; %// Hankel starts here c = (1 : len).'; r = (len : length(s2)).'; nr = length(r); nc = length(c); x = [ c; r((2:nr)') ]; %-- build vector of user data cidx = (1:nc)'; ridx = 0:(nr-1); H = cidx(:,ones(nr,1)) + ridx(ones(nc,1),:); % Hankel subscripts ind = x(H); % actual data %// End Hankel script %// Now get our data subseqs = s2(ind.'); %// Case where string length is 1 if len == 1 subseqs = subseqs.'; end <hr>

subseqs contains the matrix of overlapping characters that you have alluded to in your post. You've noticed a small bug where if the length of the string is 1, then the algorithm won't work. You need to make sure that the reshaped substring matrix consists of a single <strong>column</strong> vector. If we ran the above code without checking the length of s1, we would get a row vector, and so simply transpose the result if this is the case.

Now, simply replicate s1 for as many times as we have rows in subseqs so that all of these strings get stacked into a 2D matrix. After, do an equality operator.

eqs = subseqs == repmat(s1, size(subseqs,1), 1);

Now, find the column-wise sum and see which elements are equal to the length of your string. This will produce a single column vector where 1 indicates that we have found a match, and zero otherwise:

sum(eqs, 2) == len ans = 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0

Finally, to add up <strong>how many times</strong> the substring matched, you just have to add up all elements in this vector:

out = sum(sum(eqs, 2) == len) out = 2 <hr>

As such, we have <strong>two</strong> instances where abc is found in your string.

Answer2:

Here is another one,

s1='abc'; s2='bkcokbacaabcsoabckokabckoabc'; [a,b] = ismember(s2,s1); b = [0 0 b 0 0]; a1=circshift(b,[0 -1]); a2=circshift(b,[0 -2]); sum((b==1)&(a1==2)&(a2==3))

It gives 3 for your input and 4 for my example, and it seems to work well if ismember is okey.

Answer3:

Just for the fun of it: this can be done with nlfilter from the Image Processing Toolbox (I just discovered this function today and am eager to apply it!):

ds1 = double(s1); ds2 = double(s2); result = sum(nlfilter(ds2, [1 numel(ds1)], @(x) all(x==ds1)));

Recommend

  • Resize rectangle in Paper.js
  • Dividing a number into m parts uniformly randomly
  • Group by on XML column field with LINQ
  • Is there a way to make “X() in [X(), Y(), Z()]” return True?
  • How-to: short-circuiting inverted ternary operator implemented in, e.g. C#? Does it matter?
  • How to remove duplicate elements for two numpy arrays in python
  • Why is this jQuery reference '$(“”)' instead of '$(“”)'?
  • What do internal compiler error messages mean, and what can I do?
  • Why does an array get passed by value some times and not others?
  • Which devices/recommended sizes should I target with mediaqueries?
  • reduce/reduce conflicts using ocamlyacc
  • How to write string.Contains(someText) in expression Tree
  • Distribute Range of Numbers between each threads
  • How to get the index of element in the List in c#
  • Rails AREL .where statement
  • SQL: Getting the physical size of a subset of a table
  • Perspective projection, 4 points
  • pip in virtualenv gets ConnectTimeoutError
  • R convert summary result (statistics with all dataframe columns) into dataframe
  • Approximate Order-Preserving Huffman Code
  • RxJava debounce by arbitrary value
  • WPF ICommand CanExecute(): RaiseCanExecuteChanged() or automatic handling via DispatchTimer?
  • How solve “Qt: Untested Windows version 10.0 detected!”
  • Grails calculated field in SQL
  • Marklogic : Query response time is very high
  • Google Custom Search with transparent background
  • Insert into database using onclick function
  • What is Eclipse's Declaration View used for?
  • Is possible to count alias result on mysql
  • Javascript Callbacks with Object constructor
  • Can I make an Android app that runs a web view in Chrome 39?
  • How can I use Kendo UI with Razor?
  • How to model a transition system with SPIN
  • retrieve vertices with no linked edge in arangodb
  • File upload with ng-file-upload throwing error
  • Python: how to group similar lists together in a list of lists?
  • FormattedException instead of throw new Exception(string.Format(…)) in .NET
  • Change div Background jquery
  • apache spark aggregate function using min value
  • Django query for large number of relationships