# Explain Big O notation, Binary Search and Logarithms, C#

on
Explain Big O notation, Binary Search and Logarithms, C#

What sorting and searching methods maximize performance for millions of records?

Is Binary Search still used here?

Thanks

https://codingticks.wordpress.com/2013/07/21/a-beginners-guide-to-big-o-notation/
Comment
Watch Question

Do more with

EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Programmer
Commented:
Binary Search  is better method then, Bubble sort, for example. In Binary Search Adding one item to a binary search tree is on average an O(log n) process. You can read more information in many articles, for example
https://en.wikipedia.org/wiki/Tree_sort

Also this method have already realization in .Net Framework. In is List<T>.BinarySearch Method (T)

https://msdn.microsoft.com/en-us/library/w4e7fxsh(v=vs.110).aspx

If I correct understand this question.
Software Engineer

Commented:
can you explain this sentence to me?

Adding one item to a binary search tree is on average an O(log n) process (in big O notation).
Programmer
Commented:
The binary search tree is a binary tree for which the following additional conditions  are satisfied:
1) Both subtrees — left and right — are binary search trees.
2)   All nodes of the left subtree of arbitrary node X of value of keys of data have less, than value of a key of data of the node X.
3) All nodes of the right subtree of arbitrary node X of value of keys of data have more or are equal, than value of a key of data of the node X.
After these conditions you have 3 basic operation with tree: find, insert, remove.
Find(K): search of a node in which couple (key, value) with key = K is stored.
Algorithm:

If the tree is empty, to report that the node is not found and to stop.
Otherwise to compare K to value of a key of root node X.
If K=X to issue the link to this node and to stop.
If K> X, recursively to look for K key in the right subtree T.
If K<X, recursively to look for K key in the left subtree T.

INSERT (K, V) — adding in couple tree (key, value) = (K, V).

If the tree is empty, to replace it with a tree with one root node ((K, V), null, null) and to stop.
Otherwise to compare K to a key of root node X.

If K > X, recursively to add (K, V) in the right subtree T.
If K < X, recursively to add (K, V) in the left subtree T.
If K=X to replace the V current node with new value

REMOVE(K) — deleting a node in which couple (key, value) with key = K is stored.

If the tree of T is empty, to stop;
Otherwise to compare K to a key of the X root node of n.
If K > X, recursively to delete K from the right subtree T;
If K <X, recursively to delete k from the left subtree=T;
i K=X then  it is necessary conside three cases. both children  are absent,we delete current node  and nullify link at a parent node; if no one of children, field values child m put instead appropriate root node, rubbing its old values, release memory occupied by m; present, right absent (n-> right->; left)
We copy from the right node in deleted the K, V fields and the link to the right node of the right descendant.
Otherwise
We will take the most left node m, the right subtree n-> right;
We will copy data (except links to child members) from m in n;
Recursively we will delete a node m
Software Engineer

Commented:
can you explain the basics of logarithms as it relates to calculating the times to execute any of these?
Programmer
Commented:
The number "b" logarithm is determined by the base of "a" as an exponent in which it is necessary to build a base to receive number "b".
Logarithm  also use  In Big O Notation, where "b" is a count of your records. Base usually is 2.
https://en.wikipedia.org/wiki/Logarithm
Programmer
Commented:
For example, if you have 8 records, then
log 2 (8) = 3.
It is better then 8^2 = 64
Commented:
Misha already gave a great explanation of a data structure that is based on the binary search algorithm. Keep in mind that binary search requires your data to be ordered, meaning its a good idea to read your key,value pairs directly into a binary search tree to assure these properties.

A bit more detail for BigO Notation:

The idea is to estimate the computation speed of an algorithm independent from hardware. To do this you consider the growth of a function of n. where n is the size of your input data (number of elements mostly). So having n elements as input you count the worst case of number of  operations that need to be computed to obtain the result.
For example in binary search you halve the search space in each step. You start with the element in the middle, if its bigger than the element you search for, then you can drop the right (greater) side of the inputs and repeat the step with the middle element of the left (smaller) side.
So the maximum number of steps you need to take to find the result is log(n) (with the base 2 of course) because this is the number of times you can halve the input data (e.g 32 -> 16 -> 8 -> 4 -> 2 -> 1, so log(32) = 5).

This is what you do to estimate the worst case runtime of an algorithm. In practice this is much more important than CPU speed, because it shows how many steps you have to compute more when you increase the size of the input. Binary Sort alwys runs in log(n), as long as your input data is ordered (if not you need to add the runtime of a sorting algorithm), Bubble sort however has a runtime that depends on the order of the input data. If  the data happens to be ordered in reverse, the algorithm swaps each element completely through the list unitl it is ordered, so every of the n items is moved n times resulting in O(n^2). This is the worst case however, bubble sort probably runs faster most of the time but it is still very slow.

So considering a runtime of O(n^2) this means that you require 16 operations for list of length 4, 25 for a list of length 5 and so on (In the worst case). Since you asked about big datasets you can imagine how the required number of operations explodes.

Another point for huge datasets is the problem, that they usually do not fit into the ram or are distributed across many servers. For this you should have a look at external memory algorithms.
Software Engineer

Commented:
thanks

Do more with