• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 411
  • Last Modified:

a search and find algorithm

Hi there;

I have a bunch of numbers in a text file, around 80000 lines, the first ones are as follows:

1       0.9
268       5.1
46       0.17
4620       0.0
468       0.15
4631       0.15
4673       0.9
46732       1.1

What I am trying to do is finding the corresponding second column e.g. (1.1) by searching for a string e.g., 46-73-212345. I am trying to find the longest option possible meaning,
46732 will be the exact match for the input 46-73-212345, but for another string value 46-73-111134, then 4673 would be the exact match.

Since there are too many records, how can i process this in C# or Java?
This is obvious an algorithm based question.

e.g. should I go per character matching but consider the number of total records which is high..

Regards.
0
jazzIIIlove
Asked:
jazzIIIlove
  • 4
  • 2
  • 2
  • +1
3 Solutions
 
TommySzalapskiCommented:
What if there is a tie?

80,000 is quite small for the computer so you could just go through the list one at a time and see how long the match is.
Since they appear to be sorted, you could use a binary search to find the closest match.
Note that if you look at where your number would fit in the sorted list, either the one right above or right below will be the longest match but there could be dozens that tie for longest match.
0
 
jazzIIIloveAuthor Commented:
If tie, the one encountered first is taken into account.

And yes, left column is sorted.

Regards.
0
 
TommySzalapskiCommented:
So you want to do something like this
string longest;
int max = 0;

foreach string s in leftcolumn
{
  match = Check how long the match is
  if (match > max)
  {
    max = match;
    longest = s;
  }
}

Note that since they are sorted, if match < max, you can quit. You'll never get a better match.

You could play around with different collections in C# or Java, but that's the basic algorithm.
You could also do a binary search for where the search string "belongs" and run up to find the first one.

Of course, you'll probably find it easiest to make a new string that has no '-' in it and use that to search.
0
Never miss a deadline with monday.com

The revolutionary project management tool is here!   Plan visually with a single glance and make sure your projects get done.

 
jazzIIIloveAuthor Commented:
Can you guide me how i can use binary search? Any sample code for this?

Thanks.

Regards.
0
 
yatsCommented:
And yes, left column is sorted.

How the left column is sorted??
0
 
jazzIIIloveAuthor Commented:
left column is sorted per character:

1       0.9
268       5.1
46       0.17
4620 ...

first character is 1 in the first row.
first character is 2 in the second row.
first character is 4 in the third row.
first character is again 4 in the fourth row, second character is 6, again a tie, then 2 is coming in the fourth row.

So, look vertical alignment of numbers in that sense.

Regards.
0
 
yatsCommented:
I thought the same at firsh place, but have a look at 5th and 6th row
468       0.15
4631       0.15

the way you explained it's not seems to be sorted.
0
 
jazzIIIloveAuthor Commented:
yup, good point. I sorted them in horizontal order.

Regards.
0

Featured Post

The new generation of project management tools

With monday.com’s project management tool, you can see what everyone on your team is working in a single glance. Its intuitive dashboards are customizable, so you can create systems that work for you.

  • 4
  • 2
  • 2
  • +1
Tackle projects and never again get stuck behind a technical roadblock.
Join Now