Explain Big O notation, Binary Search and Logarithms, C#

Explain Big O notation, Binary Search and Logarithms, C#

I am curious to learn more about Big O, Binary Search and huge data sets.

What sorting and searching methods maximize performance for millions of records?

Is Binary Search still used here?

Thanks

https://codingticks.wordpress.com/2013/07/21/a-beginners-guide-to-big-o-notation/
newbiewebSr. Software EngineerAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

MishaProgrammerCommented:
Binary Search  is better method then, Bubble sort, for example. In Binary Search Adding one item to a binary search tree is on average an O(log n) process. You can read more information in many articles, for example
https://en.wikipedia.org/wiki/Tree_sort

Also this method have already realization in .Net Framework. In is List<T>.BinarySearch Method (T)

https://msdn.microsoft.com/en-us/library/w4e7fxsh(v=vs.110).aspx
 
If I correct understand this question.
0
newbiewebSr. Software EngineerAuthor Commented:
can you explain this sentence to me?

Adding one item to a binary search tree is on average an O(log n) process (in big O notation).
0
MishaProgrammerCommented:
The binary search tree is a binary tree for which the following additional conditions  are satisfied:
1) Both subtrees — left and right — are binary search trees.
2)   All nodes of the left subtree of arbitrary node X of value of keys of data have less, than value of a key of data of the node X.
3) All nodes of the right subtree of arbitrary node X of value of keys of data have more or are equal, than value of a key of data of the node X.
After these conditions you have 3 basic operation with tree: find, insert, remove.
Find(K): search of a node in which couple (key, value) with key = K is stored.
Algorithm:

    If the tree is empty, to report that the node is not found and to stop.
    Otherwise to compare K to value of a key of root node X.
        If K=X to issue the link to this node and to stop.
        If K> X, recursively to look for K key in the right subtree T.
        If K<X, recursively to look for K key in the left subtree T.

INSERT (K, V) — adding in couple tree (key, value) = (K, V).

If the tree is empty, to replace it with a tree with one root node ((K, V), null, null) and to stop.
Otherwise to compare K to a key of root node X.

    If K > X, recursively to add (K, V) in the right subtree T.
    If K < X, recursively to add (K, V) in the left subtree T.
If K=X to replace the V current node with new value


REMOVE(K) — deleting a node in which couple (key, value) with key = K is stored.

If the tree of T is empty, to stop;
    Otherwise to compare K to a key of the X root node of n.
        If K > X, recursively to delete K from the right subtree T;
        If K <X, recursively to delete k from the left subtree=T;
i K=X then  it is necessary conside three cases. both children  are absent,we delete current node  and nullify link at a parent node; if no one of children, field values child m put instead appropriate root node, rubbing its old values, release memory occupied by m; present, right absent (n-> right->; left)
           We copy from the right node in deleted the K, V fields and the link to the right node of the right descendant.
                Otherwise
                    We will take the most left node m, the right subtree n-> right;
                    We will copy data (except links to child members) from m in n;
                    Recursively we will delete a node m
0
Exploring SQL Server 2016: Fundamentals

Learn the fundamentals of Microsoft SQL Server, a relational database management system that stores and retrieves data when requested by other software applications.

newbiewebSr. Software EngineerAuthor Commented:
can you explain the basics of logarithms as it relates to calculating the times to execute any of these?
0
MishaProgrammerCommented:
The number "b" logarithm is determined by the base of "a" as an exponent in which it is necessary to build a base to receive number "b".
Logarithm  also use  In Big O Notation, where "b" is a count of your records. Base usually is 2.
You can see more infromation about  logarithm , for example, in this article.
https://en.wikipedia.org/wiki/Logarithm
0
MishaProgrammerCommented:
For example, if you have 8 records, then
log 2 (8) = 3.
It is better then 8^2 = 64
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Florian LenschowCommented:
Misha already gave a great explanation of a data structure that is based on the binary search algorithm. Keep in mind that binary search requires your data to be ordered, meaning its a good idea to read your key,value pairs directly into a binary search tree to assure these properties.

A bit more detail for BigO Notation:

The idea is to estimate the computation speed of an algorithm independent from hardware. To do this you consider the growth of a function of n. where n is the size of your input data (number of elements mostly). So having n elements as input you count the worst case of number of  operations that need to be computed to obtain the result.
For example in binary search you halve the search space in each step. You start with the element in the middle, if its bigger than the element you search for, then you can drop the right (greater) side of the inputs and repeat the step with the middle element of the left (smaller) side.
So the maximum number of steps you need to take to find the result is log(n) (with the base 2 of course) because this is the number of times you can halve the input data (e.g 32 -> 16 -> 8 -> 4 -> 2 -> 1, so log(32) = 5).

This is what you do to estimate the worst case runtime of an algorithm. In practice this is much more important than CPU speed, because it shows how many steps you have to compute more when you increase the size of the input. Binary Sort alwys runs in log(n), as long as your input data is ordered (if not you need to add the runtime of a sorting algorithm), Bubble sort however has a runtime that depends on the order of the input data. If  the data happens to be ordered in reverse, the algorithm swaps each element completely through the list unitl it is ordered, so every of the n items is moved n times resulting in O(n^2). This is the worst case however, bubble sort probably runs faster most of the time but it is still very slow.

So considering a runtime of O(n^2) this means that you require 16 operations for list of length 4, 25 for a list of length 5 and so on (In the worst case). Since you asked about big datasets you can imagine how the required number of operations explodes.

Another point for huge datasets is the problem, that they usually do not fit into the ram or are distributed across many servers. For this you should have a look at external memory algorithms.
1
newbiewebSr. Software EngineerAuthor Commented:
thanks
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
.NET Programming

From novice to tech pro — start learning today.