asgarcymed
asked on
Algorithm to Score Global Quality-Quantity of Files (eBooks)
I would like to create an algorithm to score the "global quality/quantity" of eBooks of each subject.
Each subject has a main subfolder, like:
eBooks\Subject1\(...)
eBooks\Subject2\(...)
eBooks\Subject3\(...)
(...)
I would like to create an algorithm/equation/formula to score the subjects. I think such a formula should rely on:
*) Total Size of each Subfolder (the higher the better)
*) Total Number of Files inside each Subfolder (the higher the better)
*) Total Number of Sub-Subfolders inside each Subfolder (the higher the better)
*) Maximum Folder Depth (the higher the better)
*) The Size of the Largest File Size (the higher the better)
*) The Size of the Largest File Size (the higher the better)
I tried many combinations, but the resulting scores were absurd, except for:
Score = (Total Number of Files inside each Subfolder) * (Total Size of each Subfolder)
Do you know a better one?
Thanks.
Regards.
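For reference, a minimal Python sketch of how the metrics listed in the question could be gathered for one subject folder, together with the product score the asker settled on. The function names `folder_metrics` and `product_score` and the dictionary keys are illustrative assumptions, not part of the original post; depth is counted relative to the subject folder, which is itself depth 0.

```python
import os

def folder_metrics(root):
    """Collect size/count/depth metrics for one subject folder (hypothetical helper)."""
    total_size = file_count = subfolder_count = max_depth = largest_file = 0
    root = os.path.abspath(root)
    for dirpath, dirnames, filenames in os.walk(root):
        # Depth of this directory relative to the subject folder (root itself = 0).
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        max_depth = max(max_depth, depth)
        subfolder_count += len(dirnames)
        for name in filenames:
            size = os.path.getsize(os.path.join(dirpath, name))
            total_size += size
            file_count += 1
            largest_file = max(largest_file, size)
    return {
        "total_size": total_size,            # bytes
        "file_count": file_count,
        "subfolder_count": subfolder_count,  # all nested subfolders
        "max_depth": max_depth,
        "largest_file": largest_file,        # bytes
    }

def product_score(metrics):
    """Score = (total number of files) * (total size), as in the question."""
    return metrics["file_count"] * metrics["total_size"]
```

Calling `product_score(folder_metrics(path))` once per subject folder would give one score per subject; the other metrics are returned so that alternative formulas can be tried.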
ASKER
Please, download my CSV file (inside a ZIP) at:
http://tinyurl.com/ypv8t7
You will find a "Score" Column/Row, which corresponds to:
Score = (Total Number of Files inside each Subfolder) * (Total Size of each Subfolder)
You will also find that all numeric values are preceded by
«(zero or letter) »
because I do not know how to sort the Columns/Rows of a CSV file numerically, and I do not want them sorted alphabetically:
1-10-100-1000-2-20-200-2000
instead of
1-2-10-20-100-200-1000-2000
If you know how to solve this, I would also appreciate your help ;)
Thanks.
Best regards.
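On the sorting question: a CSV file stores every value as text, so the sort has to convert the column to a number first. Below is a minimal Python sketch, assuming the file has a header row and a purely numeric column (for example the "Score" column, without the zero or letter prefix). The function name `sort_csv_numerically` is hypothetical.

```python
import csv
import io

def sort_csv_numerically(csv_text, column, descending=False):
    """Return csv_text with its data rows sorted by `column` as numbers, not text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # float() makes "2" sort before "10", unlike a plain string comparison.
    rows = sorted(reader, key=lambda row: float(row[column]), reverse=descending)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

To use it on a file, read the file's contents into a string, pass it in with the column name, and write the returned text back out.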
ASKER
PS - I do not know why, but "Experts Exchange" sometimes generates illegal characters, which can be awful when posting formulas/equations/algorithms/functions.
«(zero or letter) »
[illegal characters - why are they generated?]
ASKER
ozo - Do you have any news?
Thanks.
Regards.
ASKER CERTIFIED SOLUTION
ASKER
JimFive - Excellent idea! I was over-complicating! I agree with you 100%!!
Thank you very much for your suggestion!!
Best regards.
Didn't I say you could sum the individual scores with a weighting factor?
You never gave examples of scores from which we could determine what weights might work best to produce the desired order or whether interactions between categories would need to be taken into account.
total((size*depth)²) would seem to satisfy your criterion
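One possible reading of that formula, as a short Python sketch: sum, over every file in a subject folder, the square of (file size times the depth at which the file sits). This per-file interpretation is an assumption, since the post does not say what the total runs over. The name `size_depth_score` is hypothetical.

```python
def size_depth_score(files):
    """Sum of (size * depth)**2 over an iterable of (size, depth) pairs.

    Assumes one pair per file: size in bytes, depth as folder levels below
    the subject folder. Both conventions are illustrative choices.
    """
    return sum((size * depth) ** 2 for size, depth in files)
```

Note that under this reading a file at depth 0 contributes nothing, so depth would need to start at 1 if files directly inside the subject folder should count.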
Or you might explicitly evaluate each of
*) Total Size of each Subfolder (the higher the better)
*) Total Number of Files inside each Subfolder (the higher the better)
*) Total Number of Sub-Subfolders inside each Subfolder (the higher the better)
*) Maximum Folder Depth (the higher the better)
*) The Size of the Largest File Size (the higher the better)
*) The Size of the Largest File Size (the higher the better)
(that looks like a duplicate)
and sum those individual scores, perhaps with some weighting factor
Given a few example folders and what you want their relative scores to be, we may be able to fit a function that orders them appropriately.
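The weighted-sum idea could be sketched in Python as follows. The metric names, the weights, and the choice of a logarithm as each metric's individual score are all assumptions made for illustration; the thread does not specify any of them. The logarithm is used here only so that byte counts in the millions do not swamp small values such as folder depth.

```python
import math

# Hypothetical weights; they would need tuning against example folders
# whose desired ordering is known.
WEIGHTS = {
    "total_size": 1.0,
    "file_count": 1.0,
    "subfolder_count": 0.5,
    "max_depth": 0.5,
    "largest_file": 0.25,
}

def weighted_score(metrics, weights=WEIGHTS):
    """Weighted sum of log-scaled metrics (higher is better for every metric)."""
    # log1p(x) = log(1 + x), so a metric of 0 contributes 0 instead of failing.
    return sum(w * math.log1p(metrics[name]) for name, w in weights.items())
```

Ranking the subjects then amounts to sorting them by `weighted_score`, and adjusting the weights until the order matches the examples one has in mind.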