pgmerLA

How to speed up reading data from a txt file in R?

Hi,

I have a very large txt file (800 MB) to read into R, and it takes several minutes to do so. I need it to be much quicker.

Are there any functions, packages, or tricks to read in data much quicker than what I am doing?

I have included my code and the dataset.
I use an HP laptop with an Intel i7 and Windows 8.

Thank you for your help.


### time the first way
start.time.data1 <- Sys.time()

data1 <- read.table("XYZ_EE.txt", sep = ",", stringsAsFactors = FALSE, header = FALSE)

time.data1 <- Sys.time() - start.time.data1
time.data1  ## Time difference of 1.523134 secs


### time the second way: declare the column classes up front
start.time.data2 <- Sys.time()

data2 <- read.table("XYZ_EE.txt", sep = ",", stringsAsFactors = FALSE, header = FALSE,
                    colClasses = c("character", "character", "numeric", "numeric"))

time.data2 <- Sys.time() - start.time.data2
time.data2  ## Time difference of 1.497055 secs
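
For readers who want to stay with read.table(), a few of its documented arguments can reduce parsing work. The following is only a minimal sketch: it builds a small stand-in file, because the real XYZ_EE.txt is not reproduced here, and it assumes the same layout as above (two character columns followed by two numeric columns). The names tmp, sample_rows and data3 are made up for the illustration.

```r
# Stand-in for XYZ_EE.txt (assumed layout: character, character, numeric, numeric).
tmp <- tempfile(fileext = ".txt")
n <- 1000
sample_rows <- data.frame(
  id    = sprintf("ID%04d", seq_len(n)),
  label = rep(c("A", "B"), length.out = n),
  x     = seq_len(n) / 10,
  y     = seq_len(n) * 2,
  stringsAsFactors = FALSE
)
write.table(sample_rows, tmp, sep = ",", row.names = FALSE,
            col.names = FALSE, quote = FALSE)

# system.time() wraps the call directly, so no manual Sys.time() arithmetic is needed.
timing <- system.time(
  data3 <- read.table(tmp, sep = ",", header = FALSE,
                      stringsAsFactors = FALSE,
                      colClasses = c("character", "character",
                                     "numeric", "numeric"),
                      nrows = n,          # known (or mildly overestimated) row count
                      comment.char = "",  # do not scan for comment characters
                      quote = "")         # only safe when no field is quoted
)
timing[["elapsed"]]
str(data3)
```

On a file as small as this stand-in the difference will be negligible; whether these arguments help noticeably on the 800 MB file would have to be measured there.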


XYZ-EE.txt
SOLUTION
Pavel Celba

This solution is only available to members.
pgmerLA

ASKER

Thank you all for your answers.

I followed GaryPatterson's advice to use fread() and was able to reduce the read time considerably:

> system.time(data1<-fread("XYZ_EE.txt",sep=",",header=F))
   user  system elapsed 
   0.59    0.00    0.61 
> system.time(data2<-read.table("XYZ_EE.txt",sep=",",header=F))
   user  system elapsed 
   1.64    0.03    1.67 
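
For anyone reproducing the comparison above: fread() comes from the data.table package, which has to be installed separately. The sketch below is an illustration under assumptions, not the exact code from the answers: it uses a tiny made-up file in place of XYZ_EE.txt, and it falls back to read.table() when data.table is not available. The names tmp, dt and df are hypothetical.

```r
# Tiny stand-in file (assumed layout: character, character, numeric, numeric).
tmp <- tempfile(fileext = ".txt")
writeLines(c("a1,b1,1.5,10",
             "a2,b2,2.5,20",
             "a3,b3,3.5,30"), tmp)

col_types <- c("character", "character", "numeric", "numeric")

if (requireNamespace("data.table", quietly = TRUE)) {
  # fread() returns a data.table; convert it if a plain data.frame is wanted.
  dt <- data.table::fread(tmp, sep = ",", header = FALSE,
                          colClasses = col_types)
  df <- as.data.frame(dt)
} else {
  # Fallback so the sketch still runs without the data.table package.
  df <- read.table(tmp, sep = ",", header = FALSE,
                   stringsAsFactors = FALSE, colClasses = col_types)
}

str(df)
```

With header = FALSE both readers name the columns V1 to V4, so downstream code should not need to change when switching between them.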


pgmerLA

ASKER

The system I use:

i7-4700MQ @ 2.4 GHz
16 GB DDR3 RAM
64-bit Windows 8
1TB hard drive
At least some speed-up is good.
pgmerLA

ASKER

Hi pcelba,

How can I follow your advice in R?  "I would recommend to use SSD drive"

Pavel Celba

It was just brainstorming... or rather a hardware solution.

Swapping any HDD for an SSD will speed your computer up. I am using a notebook with an SSD as its C: drive, and it is much faster than any older desktop.

The notebook uses just i7-3610QM but its experience index is 7.
pgmerLA

ASKER

Are you using an external SSD as your C: drive?

How much faster do you think my program will get?

What do you mean by "The notebook uses just i7-3610QM but its experience index is 7"?

Pavel Celba

My SSD is internal. An external one would need a USB 3 connection, and it would not speed the OS up either.

Sorry, I cannot predict the speed improvement.

The i7-3610QM is slower than the 4700MQ: http://ark.intel.com/compare/75117,64899
but thanks to the SSD the notebook's experience index is 7.
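
One rough way to judge how much a faster drive could help is to compare the time needed just to pull the raw bytes off disk with the time needed to parse them. The sketch below is only an illustration on a generated stand-in file (tmp, raw_bytes and parsed are made-up names). A freshly written file is usually served from the operating system's cache, so the raw-read figure here says little about the disk itself; on the real file the comparison is more telling on the first read after a reboot.

```r
# Generated stand-in file; the real XYZ_EE.txt is not reproduced here.
tmp <- tempfile(fileext = ".txt")
i <- 1:5000
writeLines(sprintf("id%d,grp%d,%.1f,%d", i, i %% 7L, i / 3, i), tmp)

size_bytes <- file.size(tmp)

# Step 1: disk (or cache) to memory only, no parsing.
t_raw <- system.time(
  raw_bytes <- readBin(tmp, what = "raw", n = size_bytes)
)

# Step 2: read and parse into a data frame.
t_parse <- system.time(
  parsed <- read.table(tmp, sep = ",", header = FALSE,
                       stringsAsFactors = FALSE)
)

# If the raw read is a small share of the total, parsing dominates and a
# faster drive is unlikely to change the overall time much.
c(raw = t_raw[["elapsed"]], parse = t_parse[["elapsed"]])
```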
pgmerLA

ASKER

Thank you pcelba. I will keep that really good advice in mind!

Pavel Celba

You are welcome.