Link to home
Start Free TrialLog in
Avatar of Anthony Matovu
Anthony MatovuFlag for Uganda

asked on

python for analysing data over 300,000,000 rows and 20 columns

i would look like ro analyse/describe data that is big over 300,000,000 rows do you python can be able to help me do this. how efficiently can i do thos
Avatar of Dr. Klahn
Dr. Klahn

Python is a "less than optimal" choice for an application like this.

The reason is that Python is an interpreted language, not a compiled language.  Given the nature of interpreted languages, Python must run slower than an equivalent compiled program that does the same job.  At a guess it would run at best 1/3 the speed of a compiled program.

If the intended application needs to handle a 300 million row table, you probably want to get all the performance possible out of the application.  C or C++ are excellent possibilities if the data involves multiple data types, and if it is purely numeric number crunching it is hard to beat good old FORTRAN.
ASKER CERTIFIED SOLUTION
Avatar of David Favor
David Favor
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Anthony Matovu

ASKER

thank you very much. what is i summaries and reduce the number of rows to not more than 12,000,000 uaing sql server or any DBMS before i do extraction to where i will need python.  i want to use python for analysis.  do you think this will help
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial