pgmerLA


How to fix Namespace issues within an R program?

Hi,
I need help solving namespace issues in my program. It works fine when it is run once, but throws errors whenever it has to run more than once, unless I restart RStudio before each run.

I have attached the program, the function file, and the 2 datasets.
Thank you.

Here is what it says:
Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

Loading required package: lattice
Loading required package: ggplot2

Attaching package: ‘MASS’

The following object is masked from ‘package:dplyr’:

    select


My program:
source("FT_functions_EE.R")
##### Read data: 
## data1: Date,Time,Price,Volume,TT (total traded= Price*Volume)
data1<-read.table("XYZ_EE_Long.txt",header=T,sep=" ", stringsAsFactors=F)
## features.dfm has 494 rows with 260 variables
features.dfm<-read.table("features_EE.txt",header=T,sep=" ",stringsAsFactors=F)

N<-5 ##### it works if N=1
for (i in 1:N){
##### Bin the data by bucket size of 5000 shares
volBinIdx5k<-MakeVolumeBinIdx(data1$Volume,5000)
data1.5k.dfm<-MakeBinCandles(data1,volBinIdx5k)

##### Normalize the features
library(caret)
trans<-preProcess(features.dfm,method=c("BoxCox","center","scale"))
transformed<-predict(trans,features.dfm)
}




My functions:
MakeVolumeBinIdx<-function(data1.volume,volBinSize){
  ### PURPOSE: Assign each trade to a volume bin. Trades accumulate in the
  ###            current bin until the next trade would push the running volume
  ###            above volBinSize; that trade then starts a new bin.
  ### INPUT: a vector of trade volumes and the desired volume bin size
  ### OUTPUT: a vector giving the volume bin that each row belongs to
  
  #create index
  volBin<-1
  sumVol<-0
  Volume<-data1.volume
  volBinIdx <- numeric(length(Volume))
  
  #create cutting for each volume bin
  for(i in seq_len(length(Volume))){
    sumVol<-sumVol + Volume[i]  
    if (sumVol<= volBinSize) {
      volBinIdx[i] <- volBin
    } else {
      volBinIdx[i] <-  volBin <- volBin + 1
      sumVol <- Volume[i]
    }
  }
  
  #clean environment
  rm(Volume,i,sumVol,volBinSize,volBin)
  
  return(volBinIdx)
}


MakeBinCandles<-function(data,volBinIdxk){
  ### PURPOSE: Create candles based on bins
  ### INPUT: a new data.frame containing only Date,Time, Price,Volume,TT, AND
  ###          a vector containing the output of MakeVolumeBinIdx
  ### OUTPUT: a data.frame  with Date,Time, OHLC, Volume,
  ###           HighIdx,lowIdx, TT,VWAP (=volume weighted average price)
  
  library(dplyr)
  
  data.return<-data %>%
    mutate(volBinIdxk=volBinIdxk) %>%
    group_by(volBinIdxk) %>%
    summarize(Date=head(Date,1),
              Time=head(Time,1),
              Open=head(Price,1),
              High=max(Price),
              Low=min(Price),
              Close=tail(Price,1),
              Volume=sum(Volume),
              # HighIdx=which.max(Price),
              # LowIdx=which.min(Price),
              TT=sum(TT,na.rm=T),
              VWAP=TT/Volume) %>%
    select(-volBinIdxk) %>%
    as.data.frame()
  
  return(data.return)
  
}


MakeBinCandlesXts<-function(data){
  ### PURPOSE: Turn bin candles from data.frame into xts object
  ### INPUT: data frame output by MakeBinCandles()
  ### OUTPUT: xts object
  library(xts)
  data$Date<-strptime(paste(data$Date,data$Time),"%m/%d/%Y %H:%M:%S")
  
  data<-data[,-2] # if I don't remove it, all columns become characters
  data.xts<-xts(data[,-1],order.by=as.POSIXct(data[,1]))
  
  return (data.xts)
  
}
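To make the binning rule in MakeVolumeBinIdx concrete, here is a small sketch that runs the same loop on made-up volumes. The name volume_bin_idx and the sample numbers are assumptions for illustration only; they are not part of the attached files.

```r
# Condensed restatement of the binning loop (volume_bin_idx is a hypothetical name).
volume_bin_idx <- function(volume, bin_size) {
  idx <- numeric(length(volume))
  bin <- 1
  running <- 0
  for (i in seq_along(volume)) {
    running <- running + volume[i]
    if (running > bin_size) {
      # This trade would overflow the current bin, so it opens the next one.
      bin <- bin + 1
      running <- volume[i]
    }
    idx[i] <- bin
  }
  idx
}

volumes <- c(3000, 2000, 4000, 1000, 6000)  # made-up trade sizes
idx <- volume_bin_idx(volumes, 5000)
bin_totals <- tapply(volumes, idx, sum)     # total volume per bin

print(idx)
print(bin_totals)
```

With these inputs the first two bins total exactly 5000 shares, while the last bin holds a single 6000-share trade. A bin never exceeds the requested size unless one trade alone is larger than it.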


XYZ-EE-Long.txt
features-EE.txt
FT-functions-EE.txt
pgmerLA

ASKER

I think I fixed the issue by adding:

dplyr::select(-volBinIdxk) inside the second function

Should I worry about anything else? I don't want the program to crash unexpectedly.

Here is the complete code:
MakeBinCandles<-function(data,volBinIdxk){

  library(dplyr)
  data.return<-data %>%
    mutate(volBinIdxk=volBinIdxk) %>%
    group_by(volBinIdxk) %>%
    summarize(Date=head(Date,1),
              Time=head(Time,1),
              Open=head(Price,1),
              High=max(Price),
              Low=min(Price),
              Close=tail(Price,1),
              Volume=sum(Volume),
              HighIdx=which.max(Price),
              LowIdx=which.min(Price),
              TT=sum(TT,na.rm=T),
              VWAP=TT/Volume) %>%
    dplyr::select(-volBinIdxk) %>%  #### I was getting the error here
    as.data.frame()
  
  return(data.return)
  
}
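One way to make this less fragile might be to qualify every dplyr verb, not just select, so the function no longer depends on which packages happen to be attached or in what order. The sketch below is only an illustration under that assumption: MakeBinCandlesQualified is a hypothetical name, the toy trades are made up, and it assumes dplyr is installed. It uses intermediate variables instead of %>% so that nothing needs to be attached at all.

```r
# Sketch: same aggregation as MakeBinCandles(), with every dplyr call
# namespace-qualified. MakeBinCandlesQualified is a hypothetical name.
MakeBinCandlesQualified <- function(data, volBinIdxk) {
  stopifnot(requireNamespace("dplyr", quietly = TRUE))  # assumes dplyr is installed

  data$volBinIdxk <- volBinIdxk
  grouped <- dplyr::group_by(data, volBinIdxk)
  candles <- dplyr::summarise(grouped,
                              Date   = head(Date, 1),
                              Time   = head(Time, 1),
                              Open   = head(Price, 1),
                              High   = max(Price),
                              Low    = min(Price),
                              Close  = tail(Price, 1),
                              Volume = sum(Volume),
                              TT     = sum(TT, na.rm = TRUE),
                              VWAP   = TT / Volume)
  # dplyr::select cannot be masked by MASS::select, whatever the attach order.
  candles <- dplyr::select(candles, -volBinIdxk)
  as.data.frame(candles)
}

# Toy input (made-up values) with the same columns as data1.
trades <- data.frame(Date   = rep("1/2/2015", 3),
                     Time   = c("09:30:00", "09:30:01", "09:30:02"),
                     Price  = c(10, 11, 12),
                     Volume = c(3000, 2000, 4000),
                     stringsAsFactors = FALSE)
trades$TT <- trades$Price * trades$Volume

candles <- MakeBinCandlesQualified(trades, c(1, 1, 2))
print(candles)
```

Moving the library() calls to the top of the script, outside the loop, would also make the attach order explicit, though the qualified calls are what actually remove the dependence on it.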


Vitor Montalvão
I'm just curious about R programming, but I wonder how many R experts we have here on EE. Would there be enough experts and questions to create a new topic just for the R programming language?
ASKER CERTIFIED SOLUTION
gheist
The asker answered his own question in the first comment.
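For anyone wondering why the program only failed on the second pass: the log in the question suggests that MASS gets attached after dplyr, which masks select. On the next pass, library(dplyr) does nothing because dplyr is already attached, so the unqualified select resolves to the MASS version. The sketch below imitates that with base R only. The helper attach_once and the two fake "packages" are hypothetical stand-ins, not real package code.

```r
# attach_once (hypothetical helper) mimics library(): it attaches something
# only if it is not already on the search path, and never reorders the path.
attach_once <- function(exports, name) {
  if (!(name %in% search())) {
    attach(exports, name = name, warn.conflicts = FALSE)
  }
  invisible(name)
}

# Two stand-in "packages" that both export a function called select.
first_pkg  <- list(select = function(...) "first package")
second_pkg <- list(select = function(...) "second package")

# Pass 1: only the first package is attached, so its select is found.
attach_once(first_pkg, "demo:first")
first_run <- select()

# The second package is attached later and now masks select.
attach_once(second_pkg, "demo:second")

# Pass 2: attaching the first package again is a no-op, so the mask stays.
attach_once(first_pkg, "demo:first")
second_run <- select()

print(c(first_run, second_run))

# Clean up the search path.
detach("demo:second", character.only = TRUE)
detach("demo:first", character.only = TRUE)
```

Qualifying the call as dplyr::select, as in the first comment, sidesteps the search path entirely, which is why the fix works no matter how many times the loop runs.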