How can C# match substrings very fast?

chaffinsjc used Ask the Experts™
In the problem example below, each item in "usableItems" is looked for in the "originalItems" data structure. Each item that is not found is placed in the discardedItems string array.

string[] usableItems = {"1", "b7", "13" };
string[] originalItems = {"1", "3", "5", "b7", "9", "11", "13" };
string[] discardedItems = { "3", "5", "9", "11" }; //final result

How should a C# program structure the data and logic to minimize the time to build the final string array named "discardedItems"?

I think Regex and IndexOf would probably be slow. And I'm not sure about using char[] arrays somehow. Also, there might be consideration of a dictionary for quick lookup, but then establishing the dictionary will also require time. Hmmm.

SOME RULES (just want to explain clearly):
(1) The items numerically represent the notes in common piano chords. There are hundreds of chords, with variations in sharps, flats, and number of items to match and/or be matched. Typical items are "3", "#5", "7", "b7", "#9", "b11", "11", "#11", "b13", etc.
(2) The items always increase numerically from left to right.
(3) Items do not repeat.
(4) There can be from 1 to 7 "originalItems".
(5) There can be from 1 to 7 "usableItems".
(6) There will never be more "usableItems" than "originalItems".
(7) Each item in "usableItems" ALWAYS DOES EXIST in "originalItems".

NOTE: To present this problem, string arrays are used for "usableItems" and "originalItems", but only the final result "discardedItems" must be a string array.

I would sincerely appreciate some code ideas.
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Distinguished Expert 2017
Hi chaffinsjc;

If you are using Visual Studio .Net 2008 or 2010 then you can use the set operator Except to get what you need.

// Making the originals as List<string> will be more effecent
List<string> usableItems = new List<string>() { "1", "b7", "13" };
List<string> originalItems = new List<string>() { "1", "3", "5", "b7", "9", "11", "13" };

// Result string Array
string[] discardedItems = originalItems.Except(usableItems).ToArray();

Open in new window

I would take advantage of rules (2), (3) and (7).

I would iterate through the "usableItems" and "originalItems" in parallel, moving forward through the original items and adding the original items as discarded until they equal the current "usableItem". At that point I would move the usable item forward.

Code attached
(this is Java, but C# should be similar :))

int ius = 0, ior = 0, idis = 0;
while (ius < usableItems.length) {
  if (originalItems[ior].equals(usableItems[ius])) {
    ius++; // found, move forward
  } else {
    discardedItems[idis] = originalItems[ior]; //not found yet

Open in new window


Stunning. I didn't know anything about C# and manipulating sets. Wow!. Thank you Fernando.

The other solution posted in a non-C# language may have at least one flaw: the arrays of items to be compared are unequal length, so the comparison in usableItems[i] will eventually blow up when using a subscript that is OK in originalItems (I know, 'cuz it happened to me...).: .Thank you for contributing.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial