
For example, s1='abc'
, s2='kokoabckokabckoab'
.
Output should be 3
. (number of times s1 appears in s2).
Not allowed to use for
or strfind
. Can only use reshape
,repmat
,size
.
I thought of reshaping s2, so it would contain all of the possible strings of 3s:
s2 =
kok
oko
koa
oab
.... etc
But I'm having troubles from here..
Answer1:
Assuming you have your matrix reshaped into the format you have in your post, you can replicate s1
and stack the string such that it has as many rows as there are in the reshaped s2
matrix, then do an equality operator. Rows that consist of all 1s means that we have found a match and so you would simply search for those rows where the total sum is equal to the total length of s1
. Referring back to my post on dividing up a string into overlapping substrings, we can decompose your string into what you have posted in your question like so:
%// Define s1 and s2 here
s1 = 'abc';
len = length(s1);
s2 = 'kokoabckokabckoab';
%// Hankel starts here
c = (1 : len).';
r = (len : length(s2)).';
nr = length(r);
nc = length(c);
x = [ c; r((2:nr)') ]; %-- build vector of user data
cidx = (1:nc)';
ridx = 0:(nr-1);
H = cidx(:,ones(nr,1)) + ridx(ones(nc,1),:); % Hankel subscripts
ind = x(H); % actual data
%// End Hankel script
%// Now get our data
subseqs = s2(ind.');
%// Case where string length is 1
if len == 1
subseqs = subseqs.';
end
<hr>
subseqs
contains the matrix of overlapping characters that you have alluded to in your post. You've noticed a small bug where if the length of the string is 1, then the algorithm won't work. You need to make sure that the reshaped substring matrix consists of a single <strong>column</strong> vector. If we ran the above code without checking the length of s1
, we would get a row vector, and so simply transpose the result if this is the case.
Now, simply replicate s1
for as many times as we have rows in subseqs
so that all of these strings get stacked into a 2D matrix. After, do an equality operator.
eqs = subseqs == repmat(s1, size(subseqs,1), 1);
Now, find the column-wise sum and see which elements are equal to the length of your string. This will produce a single column vector where 1
indicates that we have found a match, and zero otherwise:
sum(eqs, 2) == len
ans =
0
0
0
0
1
0
0
0
0
0
1
0
0
0
0
Finally, to add up <strong>how many times</strong> the substring matched, you just have to add up all elements in this vector:
out = sum(sum(eqs, 2) == len)
out =
2
<hr>
As such, we have <strong>two</strong> instances where abc
is found in your string.
Answer2:
Here is another one,
s1='abc';
s2='bkcokbacaabcsoabckokabckoabc';
[a,b] = ismember(s2,s1);
b = [0 0 b 0 0];
a1=circshift(b,[0 -1]);
a2=circshift(b,[0 -2]);
sum((b==1)&(a1==2)&(a2==3))
It gives 3
for your input and 4
for my example, and it seems to work well if ismember
is okey.
Answer3:
Just for the fun of it: this can be done with nlfilter
from the Image Processing Toolbox (I just discovered this function today and am eager to apply it!):
ds1 = double(s1);
ds2 = double(s2);
result = sum(nlfilter(ds2, [1 numel(ds1)], @(x) all(x==ds1)));