Avatar of pgmerLA
pgmerLA
 asked on

How to fix Namespace issues within an R program?

Hi,
I need help solving namespace issues with my program. The program works great when it is run once, but spits out errors if it has to run more than once, unless I restart Rstudio each time I want to run the program.

I have attached the program, the function file, and the 2 datasets.
Thank you.

Here is what is says:
Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

Loading required package: lattice
Loading required package: ggplot2

Attaching package: ‘MASS’

The following object is masked from ‘package:dplyr’:

    select


My program:
source("FT_functions_EE.R")
##### Read data: 
## data1: Date,Time,Price,Volume,TT (total traded= Price*Volume)
data1<-read.table("XYZ_EE_Long.txt",header=T,sep=" ", stringsAsFactors=F)
## features.dfm has 494 rows with 260 variables
features.dfm<-read.table("features_EE.txt",header=T,sep=" ",stringsAsFactors=F)

N<-5 ##### it works if N=1
for (i in 1:N){
##### Bin the data by bucket size of 5000 shares
volBinIdx5k<-MakeVolumeBinIdx(data1$Volume,5000)
data1.5k.dfm<-MakeBinCandles(data1,volBinIdx5k)

##### Normalize the features
library(caret)
trans<-preProcess(features.dfm,method=c("BoxCox","center","scale"))
transformed<-predict(trans,features.dfm)
}

Open in new window



My functions:
MakeVolumeBinIdx<-function(data1.volume,volBinSize){
  ### PURPOSE: Find the indexes for a given size of volume bin, that is 
  ###            Find the indexes of the data.frame where the sum of the volume is equal
  ###             to the input volBinSize.
  ### INPUT: a vector of trades volume and the desired volume bin size
  ### OUTPUT: a vector indicating where each row belongs to which volume bin
  
  #create index
  volBin<-1
  sumVol<-0
  Volume<-data1.volume
  volBinIdx <- numeric(length(Volume))
  
  #create cutting for each volume bin
  for(i in seq_len(length(Volume))){
    sumVol<-sumVol + Volume[i]  
    if (sumVol<= volBinSize) {
      volBinIdx[i] <- volBin
    } else {
      volBinIdx[i] <-  volBin <- volBin + 1
      sumVol <- Volume[i]
    }
  }
  
  #clean environment
  rm(Volume,i,sumVol,volBinSize,volBin)
  
  return(volBinIdx)
}


MakeBinCandles<-function(data,volBinIdxk){
  ### PURPOSE: Create candles based on bins
  ### INPUT: a new data.frame containing only Date,Time, Price,Volume,TT, AND
  ###          a vector containing the output of MakeVolumeBinIdx
  ### OUTPUT: a data.frame  with Date,Time, OHLC, Volume,
  ###           HighIdx,lowIdx, TT,VWAP (=volume weighted average price)
  
  library(dplyr)
  
  data.return<-data %>%
    mutate(volBinIdxk=volBinIdxk) %>%
    group_by(volBinIdxk) %>%
    summarize(Date=head(Date,1),
              Time=head(Time,1),
              Open=head(Price,1),
              High=max(Price),
              Low=min(Price),
              Close=tail(Price,1),
              Volume=sum(Volume),
              # HighIdx=which.max(Price),
              # LowIdx=which.min(Price),
              TT=sum(TT,na.rm=T),
              VWAP=TT/Volume) %>%
    select(-volBinIdxk) %>%
    as.data.frame()
  
  return(data.return)
  
}


MakeBinCandlesXts<-function(data){
  ### PURPOSE: Turn bin candles from data.frame into xts object
  ### INPUT: data frame outputed by MakeBinCandles()
  ### OUTPUT: xts object
  library(xts)
  data$Date<-strptime(paste(data$Date,data$Time),"%m/%d/%Y %H:%M:%S")
  
  data<-data[,-2] # if I don't remove it, all columns become characters
  data.xts<-xts(data[,-1],order.by=as.POSIXct(data[,1]))
  
  return (data.xts)
  
}

Open in new window

XYZ-EE-Long.txt
features-EE.txt
FT-functions-EE.txt
ProgrammingMath / Science

Avatar of undefined
Last Comment
gheist

8/22/2022 - Mon
pgmerLA

ASKER
I think I fixed the issue by adding:

dplyr::select(-volBinIdxk) inside the second function

Should I worry about something else? I don't want the program to crash unexpectedly

Here is the complete code:
MakeBinCandles<-function(data,volBinIdxk){

    library(dplyr)
   data.return<-data %>%
    mutate(volBinIdxk=volBinIdxk) %>%
    group_by(volBinIdxk) %>%
    summarize(Date=head(Date,1),
              Time=head(Time,1),
              Open=head(Price,1),
              High=max(Price),
              Low=min(Price),
              Close=tail(Price,1),
              Volume=sum(Volume),
              HighIdx=which.max(Price),
              LowIdx=which.min(Price),
              TT=sum(TT,na.rm=T),
              VWAP=TT/Volume) %>%
    dplyr::select(-volBinIdxk) %>%  #### I was getting the error here
    as.data.frame()
  
  return(data.return)
  
}

Open in new window

Vitor Montalvão

I'm just a R programming curious but wondering how many R experts we have here in EE. It would be enough experts and questions to create a new topic just for R programming language?
ASKER CERTIFIED SOLUTION
gheist

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
gheist

Asker answered his question on first comment.
All of life is about relationships, and EE has made a viirtual community a real community. It lifts everyone's boat
William Peck