Statistical Packages

110

Solutions

274

Contributors

Statistical packages are software titles, such as JMP and GNU Octave, and programming languages, such as MATLAB, R and SAS, that are used to discover, explore and analyze data and suggest useful conclusions, either to learn something unexpected or to confirm a hypothesis. The field includes the design and analysis of techniques to give approximate but accurate solutions to hard problems in statistics, econometrics, time-series, optimization and 2D- and 3D-visualization. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.

How do I get the attached data "normalised", that is, transformed so that it follows a normal distribution?

Worksheet "Data" = full data
Worksheet "Filtered 25-65" = data with 25-65 s, filtered from "Data" and copied into this worksheet.

Using the 25-65 range gives the best result (the least skewness), but the data is still not normally distributed.

Any idea how to use that range so that the plotted data follows a normal distribution?
Data-25-65--Check-Normal--EE.xlsx
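
One common option for the question above is a rank-based inverse normal transform. The sketch below is a minimal illustration in Python (standard library only), not a statement about what the attached workbook needs: the sample values are made up, and the function name `rank_inverse_normal` is hypothetical. Note that this forces the marginal shape to look normal and changes how the values should be interpreted, so a log or similar transform may be preferable if the original scale matters.

```python
from statistics import NormalDist

def rank_inverse_normal(values):
    """Map values to normal scores via their ranks (ties share the average rank)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # (rank - 0.5) / n keeps every probability strictly inside (0, 1).
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]

skewed = [25, 26, 27, 29, 33, 40, 52, 65]  # made-up right-skewed sample
scores = rank_inverse_normal(skewed)
print(scores)
```

The scores keep the original ordering of the data but are spaced like quantiles of a standard normal distribution.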
0
Hi,
I would like to prepare data for regression analysis.
I can prepare data in two forms.
a) values
b) rankings.
example:
values  60,30,25,90
rankings 2,3,4,1
Which format would be most suitable or does it not matter ?
many thanks
Ian
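
A small illustration of what is lost when values are replaced by rankings: the sketch below (Python, with the hypothetical helper `to_rankings` and a made-up second data set) shows two very different sets of values collapsing to the same rankings. Whether that loss matters depends on the model; this is only meant to make the trade-off concrete.

```python
def to_rankings(values):
    """Rank 1 = largest value, matching the example in the question."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    rankings = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        rankings[index] = rank
    return rankings

original = [60, 30, 25, 90]    # values from the question
stretched = [61, 30, 29, 500]  # hypothetical data with very different gaps

print(to_rankings(original))   # [2, 3, 4, 1]
print(to_rankings(stretched))  # [2, 3, 4, 1]
```

Rankings keep only the order, so a regression on rankings cannot tell these two data sets apart, while a regression on values can.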
0
How is
model <- lm(
           formula = Petal.Width ~ Petal.Length,
            data = iris
            )
different from
model <- lm(
            formula = iris$Petal.Width ~ iris$Petal.Length,
            data = iris
            )

Both commands produce the same fitted model, but my prediction output differs.
NOTE: I had assumed that Petal.Width is the same as iris$Petal.Width; clearly it is not. I don't understand how they differ.
The attached R script contains the complete code.
0
Hello,

I have  a list of lists.  The lists in the list of lists are file names.  I use lapply to read and merge the contents of each list in the list of lists (3 merged contents in this case  which will be the content of 3 files).  Then, I  have to change the name of the 3 resulting files and finally I have to write the contents of the files to each file.

 lc <- list("test.txt", "test.txt", "test.txt", "test.txt")
 lc1 <- list("test.txt", "test.txt", "test.txt")
 lc2 <- list("test.txt", "test.txt")
#list of lists.  The lists contain file names
 lc <- list(lc, lc1, lc2)
#new names for the three lists in the list of lists
 new_dataFns <- list("name1", "name2", "name3")
 file_paths <- NULL
 new_path <- NULL
#add the file names to the path and read and merge the contents of each list in the list of lists
 lapply(
    lc,
    function(lc) {
     filenames <- file.path(dataFnsDir, lc)
     dataList= lapply(filenames, function (x) read.table(file=x, header=TRUE))
     Reduce(function(x,y) merge(x,y), dataList)
     #   print(dataList)

    }
  )  

#add the new name of the file to the path total will be 3 paths/fille_newname.tsv.  
 lapply(new_path, function(new_path){new_path <- file.path(getwd(), new_dataFns)

The statements above work because lc and  new_dataFns are global and I can pass them to the lapply function

#Finally, I need to write the merged contents to the corresponding file (path/name.tsv).  I tried the following statement, but this …
0
I would like to change the node names of my data.tree object (tjpCPI) from IDs to readable tags. A sample of the tree structure is here:
> print(tjpCPI, "CPI.Tag")
                                levelName                          CPI.Tag
1   378257447                                                          CPI
2    ¦--378257497                                                     Food
3    ¦   ¦--378259447                                              Cereals
4    ¦   ¦   ¦--378259457                                             Rice
5    ¦   ¦   ¦   ¦--378259467                                Non Glutinous
6    ¦   ¦   ¦   ¦   ¦--378259477                                   Rice-A
7    ¦   ¦   ¦   ¦   °--378259487                                   Rice-B
8    ¦   ¦   ¦   °--378259497                                    Glutinous
9    ¦   ¦   ¦--378259507                                            Bread
10   ¦   ¦   ¦   ¦--378259517                                  White Bread
11   ¦   ¦   ¦   ¦--378259527                                Bean Jam Buns
12   ¦   ¦   ¦   °--378259537                                   Curry Buns
13   ¦   ¦   ¦--378259547                                          Noodles

I would like levelName to become CPI.Tag.

How might I do that without needing to iterate through each node?
0
I would like to change the names of the members of the list. The list is of a CPI hierarchy - the trouble is the names are now ID numbers, that are difficult to interpret. Each one of the members has a readable tag, that is "CPI.Tag".

How would I swap the label that is used for the name  currently to the "CPI.Tag"? (the display of a few sample members is attached)
2018-08-14_13-32-00.png

I thought it would be something along the lines of names(jpCPIlist)<-jpCPIlist$CPI.Tag, but evidently not..
0
How do I remove every character from a string that is not part of the "X character set" below?
No space should be left where a character was removed.
X character set:
---------------------
a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
0 1 2 3 4 5 6 7 8 9
/ - ? : ( ) . , ‘ +

Example
     '020?Dome@&++;stic CT out;%going?20测试叙事测试叙事@@@@test'
Expected output is
    020?Dome++stic CT outgoing?20test


I am using the query below, but it also replaces '+'. I don't want '+' replaced, because it is part of the X character set.

select translate(REGEXP_REPLACE ('020?Dome@&++;stic CT out;%going?20测试叙事测试叙事@@@@test','[^' || CHR(32) ||'-' ||CHR (127) || ']', ''),'+=!“%&*<>;{@#_',' ') AS TEST from dual;
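
The usual way to express "delete everything outside an allowed set" is a single negated character class. The sketch below shows the idea in Python rather than Oracle SQL. Two assumptions: the space character is treated as allowed (the expected output keeps the space in "CT out"), and the quote in the set is taken to be a plain apostrophe. The names `DISALLOWED` and `keep_allowed` are hypothetical.

```python
import re

# Matches everything NOT in the allowed set:
# letters, digits, space, and / - ? : ( ) . , ' +
# The space is an assumption taken from the expected output ("CT out").
DISALLOWED = re.compile(r"[^A-Za-z0-9 /\-?:().,'+]")

def keep_allowed(text):
    """Delete disallowed characters outright, leaving no gap behind."""
    return DISALLOWED.sub("", text)

sample = "020?Dome@&++;stic CT out;%going?20测试叙事测试叙事@@@@test"
cleaned = keep_allowed(sample)
print(cleaned)  # 020?Dome++stic CT outgoing?20test
```

Because '+' sits inside the allowed class, it is never touched; there is no second translate-style pass that could replace it.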
0
I need to execute a substring function on each row of a single column dataframe in R Studio, then assign that value to a new dataframe.
0
Hello,
I'm trying to pass a value to a List, but I get an error Cannot implicitly convert System.Collections.Generic.List<char> to System.Collections.Generic.List<string>.
So, here's my code...
public List<string> NomeContrato { get; set; }


BindingSource bs = new BindingSource();
            bs.DataSource = LoadContratos();
            var editForm = new Concelhos_Edit();
            var editFormModel = new Info();
            editFormModel.Id = concelhos_datagrid.CurrentRow.Cells[0].Value.ToString();
            var _nomeContrato = contratosdt.AsEnumerable().FirstOrDefault(a => a.Field<int>("IdContrato") == ((DataRowView)bs.Current).Row.Field<int>("IdContrato")).Field<string>("Designacao");
            editFormModel.NomeContrato = _nomeContrato.ToList(); //--> Here's where the code breaks and get Error!


Changing this
public List<string> NomeContrato { get; set; }

to this
public List<char> NomeContrato { get; set; }

, I get my Combobox with one letter per line, like this...
W
o
r
l
d

Any help?
Thanks.
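
The one-letter-per-line result suggests that the string itself is being enumerated character by character. The Python sketch below is only an analogy for that behaviour (the variable names are hypothetical): converting a string to a list splits it into characters, while wrapping it in a list keeps it whole. In C#, the corresponding idea would be to wrap the string, for example `new List<string> { _nomeContrato }`, instead of calling `ToList()` on it.

```python
name = "World"  # stands in for the Designacao string

per_character = list(name)  # iterates the string, like ToList() on a C# string
single_item = [name]        # a one-element list holding the whole string

print(per_character)  # ['W', 'o', 'r', 'l', 'd']
print(single_item)    # ['World']
```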
0
I would like to perform fractal analysis on a financial series (stock exchange index). What shall I use (prices or log returns) for calculating fractal dimension, Hurst Exponent, for performing R/S Analysis and for predictions? Are there R functions that calculate all of these immediately? Are there any things I need to take care of when analyzing the index? Thank you in advance!
0
First of all, I am writing a fairly simple but long program; here are the full details:

The P-v-T relation for real gases can take many forms. The simplest relations are the ideal gas equation and the Van der Waals equation. These relations are to be applied to superheated steam. The file “pvt.txt” contains the P-v-T data of superheated steam (10 – 800 kPa) for the temperature range of 200 °C through 1200 °C, obtained from the steam tables.

Write a C program to read the steam table data “pvt.txt”. In the C program, estimate the density of steam for the pressure range 10 through 800 kPa, and temperature range 200 °C through 1200 °C,

(1) Using the ideal-gas relation: $v = \dfrac{RT}{P}$ m3/kg, where R = 0.4615 kJ/kgK, T is temperature [K] and P is pressure [kPa].

(2) Using the Van der Waals equation: $\left(P + \dfrac{a}{v^2}\right)(v - b) = RT$

where R = 0.4615 kJ/kgK, T is temperature [K] and P is pressure [kPa]. The constants are obtained from $a = \dfrac{27 R^2 T_{cr}^2}{64 P_{cr}}$ and $b = \dfrac{R T_{cr}}{8 P_{cr}}$, where Pcr = 22060 kPa and Tcr = 647.1 K.

In each case, calculate the resulting percentage error of the estimated density as follows: Error = $\dfrac{\left|\rho_{estimated} - \rho_{table}\right|}{\rho_{table}}$ x 100%

Submit a report which must include:
1. Introduction, algorithm or flowchart, the C program, and the density from steam table.
2. The estimated density table when using the ideal gas equation.
3. The percentage error table when using the ideal gas equation.
4. The estimated density table when using the Van der Waals equation.
5. The percentage error table when using the Van der Waals equation.
6. Discussion and conclusion.

Note: Density…
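
The numerical core of the assignment can be sketched as follows. This is a Python illustration of the calculation only, not the requested C program, and it does not read "pvt.txt". It assumes the standard Van der Waals form (P + a/v^2)(v - b) = RT with a = 27 R^2 Tcr^2 / (64 Pcr) and b = R Tcr / (8 Pcr), solves for v by fixed-point iteration starting from the ideal-gas value, and assumes the percentage error is taken as an absolute relative difference. All function names are hypothetical.

```python
R = 0.4615      # kJ/(kg*K), gas constant for steam (from the assignment)
P_CR = 22060.0  # kPa, critical pressure
T_CR = 647.1    # K, critical temperature

# Van der Waals constants (standard textbook expressions, assumed here).
A = 27 * R**2 * T_CR**2 / (64 * P_CR)
B = R * T_CR / (8 * P_CR)

def ideal_gas_volume(p_kpa, t_kelvin):
    """Specific volume in m3/kg from v = R*T/P."""
    return R * t_kelvin / p_kpa

def van_der_waals_volume(p_kpa, t_kelvin, tol=1e-12, max_iter=200):
    """Solve (P + a/v^2)(v - b) = R*T for v by fixed-point iteration."""
    v = ideal_gas_volume(p_kpa, t_kelvin)  # starting guess
    for _ in range(max_iter):
        v_next = R * t_kelvin / (p_kpa + A / v**2) + B
        if abs(v_next - v) < tol:
            return v_next
        v = v_next
    raise RuntimeError("Van der Waals iteration did not converge")

def percent_error(estimated, table_value):
    """Absolute relative error in percent (the absolute value is an assumption)."""
    return abs(estimated - table_value) / table_value * 100.0

# Example state: 100 kPa and 200 degrees C; density is the reciprocal of v.
t = 200.0 + 273.15
rho_ideal = 1.0 / ideal_gas_volume(100.0, t)
rho_vdw = 1.0 / van_der_waals_volume(100.0, t)
print(rho_ideal, rho_vdw)
```

Each density would then be compared against the steam-table density for the same state with `percent_error`. The iteration is only intended for the superheated (gas-phase) range described in the assignment.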
0
Please help: I want to apply Canny edge detection to the image and then perform fuzzy logic edge detection with this code.

img1=imread('F:\Matlab Project\7 sem\currency\10.jpg')




Igray = 0.2989*img4(:,:,1)+0.5870*img4(:,:,2)+0.1140*img4(:,:,3);
%Igray1=edge(Igray,'Canny');
%figure
%image(Igray,'CDataMapping','scaled');
%colormap('gray')
%title('Input Image in Grayscale')

%Convert Image to Double-Precision Data
I = double(Igray);


%Scaling the factor
classType = class(Igray);
scalingFactor = double(intmax(classType));
I = I/scalingFactor;

%Obtain the Image Gradient

Gx = [-1 1];
Gy = Gx';
Ix = conv2(I,Gx,'same');
Iy = conv2(I,Gy,'same');

%figure
%image(Ix,'CDataMapping','scaled')
%colormap('gray')
%title('Ix')

%figure
%image(Iy,'CDataMapping','scaled')
%colormap('gray')
%title('Iy')

%Define the Fuzzy Inferences System
edgeFIS = newfis('edgeDetection');

%Specify the image gradients, Ix and Iy, as the inputs of edgeFIS
edgeFIS = addvar(edgeFIS,'input','Ix',[-1 1]);
edgeFIS = addvar(edgeFIS,'input','Iy',[-1 1]);

%
sx = 0.1;
sy = 0.1;
edgeFIS = addmf(edgeFIS,'input',1,'zero','gaussmf',[sx 0]);
edgeFIS = addmf(edgeFIS,'input',2,'zero','gaussmf',[sy 0]);
%
edgeFIS = addvar(edgeFIS,'output','Iout',[0 1]);

%Specify the triangular membership functions, white and black, for Iout.
wa = 0.1;
wb = 1;
wc = 1;
ba = 0;
bb = 0;
bc = 0.7;
edgeFIS = addmf(edgeFIS,'output',1,'white','trimf',[wa wb wc]);
edgeFIS = addmf(edgeFIS,'output',1,'black','trimf',[ba …
0
Hi all,
I am not able to extract data with an R package called rdota2. I wanted to run the function get_league_listing from the package, but it shows the following error: "Error in (function (..., deparse.level = 1, make.row.names = TRUE, stringsAsFactors = default.stringsAsFactors())  :
  numbers of columns of arguments do not match". Please help me out. Thanks in advance.
0
I am trying the code below so that, if the path contains neither "/" nor "\", an error is raised and the application stops, but it does not work. I cannot see anything wrong with the code. Does anyone see anything wrong?

 if (grepl("/", OUTpath, fixed=TRUE)) { # mac style
     OUTpath<- paste(paste(unlist(strsplit(OUTpath, "/", fixed=TRUE)), collapse="/"), "/", sep="")
   } else
     if (grepl("\\", OUTpath, fixed=TRUE)) { # windows style
       OUTpath<- paste(paste(unlist(strsplit(OUTpath, "\\", fixed=TRUE)), collapse="\\"), "\\", sep="")
     } else
       if(!grepl("/", OUTpath, fixed=TRUE) || !grepl("\\", OUTpath, fixed=TRUE)){
       trueFalse = FALSE
       errorMessage("Unrecognized path separator in OUTpath or no path specification in PARAMS file. Cannot open connection\n
                             You can edit your input file and save the changes. Afterwards, stop and restart glycoPipe and upload file again")
       stop("Unrecognized path separator in OUTpath\n")
     }
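
For comparison, the intended control flow can be written more compactly. The sketch below is in Python, not R, and the function name `normalize_out_path` is hypothetical. It tries each separator in turn and raises an error only when neither is present. In the R code above, the final branch is reached only after both `grepl` tests have already failed, so its extra `if` condition appears to be redundant there; a plain `else` would express the same thing.

```python
def normalize_out_path(out_path):
    """Return out_path with exactly one trailing separator, or raise."""
    for sep in ("/", "\\"):  # mac/unix style first, then windows style
        if sep in out_path:
            return out_path.rstrip(sep) + sep
    # Reached only when neither separator occurs in the path.
    raise ValueError("Unrecognized path separator in OUTpath")

print(normalize_out_path("results/run1"))  # results/run1/
```

Passing a value with no separator at all, such as "results", raises the error and stops the caller unless it is caught further up.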
0
Hi

I am looking for a way to capture standard error and redirect it to standard output in R (Shiny). I cannot find any information anywhere on the web. Is there a way to do this? The error is displayed in the console, but I would like it displayed in the Shiny GUI.

Thanks
0
Hi,
 Although the program used to run smoothly on Win98 and VB5, it can no longer read the pixel info of an image (i.e. jpg or gif) and therefore cannot proceed
 to calculate the Lab pixel values from RGB. In particular, I have test pictures of 100x100 pixels and the program now reads 585x1890! Any ideas?

 Following is the part of the program

 Cheers
 Public Sub Command1_Click()

 Dim OFName As OPENFILENAME
 OFName.lStructSize = Len(OFName)
 'Set the parent window
 OFName.hwndOwner = Me.hwnd
 'Set the application's instance
 OFName.hInstance = App.hInstance
 'Select a filter
 OFName.lpstrFilter = "Image Files (*.bmp;*.jpg;*.png)" + Chr$(0) + "*.bmp;*.jpg;*.png" + Chr$(0) + "All Files (*.*)" + Chr$(0) + "*.*" + Chr$(0)
 'create a buffer for the file
 OFName.lpstrFile = Space$(254)
 'set the maximum length of a returned file
 OFName.nMaxFile = 255
 'Create a buffer for the file title
 OFName.lpstrFileTitle = Space$(254)
 'Set the maximum length of a returned file title
 OFName.nMaxFileTitle = 255
 'Set the initial directory
 OFName.lpstrInitialDir = "C:\"
 'Set the title
 OFName.lpstrTitle = "Open File"
 'No flags
 OFName.flags = 0

 'Show the 'Open File'-dialog
 If GetOpenFileName(OFName) Then
 Label2 = Trim$(OFName.lpstrFile)

 End If

 End Sub

 Private Sub Command2_Click()
 Dim PicInfo As BITMAP
 Dim pic As Picture
 Dim X, Y As Long
 Dim height, width As Long
 Dim R As Long
 Dim imagesource As String
 Dim hFile As Long, FileInfo As BY_HANDLE_FILE_INFORMATION
 Dim Red, 

0
Hi

So if I manually connect using sftp from a centos box to my Proftpd server and issue a get command to grab a file, all's good.

If I do it in a script, it fails after getting the file  (next step would be to delete the file which it never does)

It's driving me round the bend a bit, so any help would be greatly appreciated.

on the sftp client side
Log from Scripted version
2018-03-15 12:34:35,029 [30711] <sftp:6>: received READ (5) SFTP request (request ID 11, channel ID 0)
2018-03-15 12:34:35,030 [30711] <sftp:7>: received request: READ 8fc9867310df242f 0 32768
2018-03-15 12:34:35,030 [30711] <sftp:8>: sending response: STATUS 1 'End of file' ('End of file' [-1])
2018-03-15 12:34:35,030 [30711] <ssh2:9>: sending CHANNEL_DATA (remote channel ID 0, 37 data bytes)
2018-03-15 12:34:35,030 [30711] <ssh2:19>: waiting for max of 600 secs while polling socket 1 using select(2)
2018-03-15 12:34:35,030 [30711] <ssh2:3>: sent SSH_MSG_CHANNEL_DATA (94) packet (80 bytes)
2018-03-15 12:34:35,031 [30711] <ssh2:11>: channel ID 0 remote window size currently at 2096633 bytes
2018-03-15 12:34:35,031 [30711] <ssh2:19>: waiting for max of 600 secs while polling socket 0 using select(2)
2018-03-15 12:34:35,031 [30711] <ssh2:20>: SSH2 packet len = 44 bytes
2018-03-15 12:34:35,031 [30711] <ssh2:20>: SSH2 packet padding len = 5 bytes
2018-03-15 12:34:35,031 [30711] <ssh2:20>: SSH2 packet payload len = 38 bytes
2018-03-15 12:34:35,031 [30711] <ssh2:19>: waiting for max of …
0
Hello All,

I hope someone can clarify the error I get in my STIDF data plot.
I have read through related questions, but no solution fixed my error.

I'm working with an STIDF object and I want to use stplot and spplot, but it seems spplot is not suitable for STIDF.

When I use stplot I always get this error:

    Error in `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels,  :
      factor level [2] is duplicated

Here's how my data in the STIDF data type looks:

           Lat         Long       sp.ID       time                                 endTime              TimeIndex     Speed    Station_ID    
    41.71268  -87.64341    1      2017-07-01 00:00:00   2017-07-01 18:00:00       1                    86           2
    41.47268  -87.35281    2      2017-07-01 00:00:00   2017-07-01 18:00:00       1                    35           5
    41.71268  -87.64341    3      2017-07-01 01:00:00   2017-07-01 18:01:00       2                    43           2
    41.47268  -87.35281    4      2017-07-01 01:00:00   2017-07-01 18:01:00       2                    55           5

I think it's related to my ID variable, but I have duplicated station IDs because I have hourly readings for each location, so the ID is repeated in my dataset.

I tried this code but I still have the error message,

 STIDF_jour$Station_ID <- factor(STIDF_jour $Station_ID, levels = rev(unique(STIDF_jour $Station_ID)), ordered=TRUE)
0
1) I have sales volumes which I can sum by customer and month and I have interest rates by month.
So just imagine months as rows, the avg monthly interest rates in column 1, monthly sales volume customer A in column 2, monthly sales volume customer B in column 3...

2) Typically I use Excel to analyze.

3) Goal: I'd like to find how interest rates impact our sales volume.

4) Thought & Concerns:

a) I don't think I should use monthly sales totals for all customers. I think I need to somehow factor in the customers, since an increase/decrease in a specific customer's sales may be attributable to a new/lost customer or a change in a customer's volume that may not be related to interest rates. This is why I suggested the data columns described above.
b) I am concerned that this may be too much for Excel, since we have a few hundred customers with sales (columns) in the past 3-year period I am reviewing.
c) There is likely a lag between the month of the interest rate change and the change in sales.
d) There is likely an extreme initial reaction to a rate change.
e) The interest rate impact could vary depending upon the amount of change and/or where the change started. Meaning a jump from 3.5% to 4% may have less impact than a rate increase from 4% to 4.5%.

Maybe this involves more than a multiple regression analysis? In either case, it would help if you could describe how to set up the data and how to use the appropriate analysis tool.
0
Can you please help me with the below query:

SELECT
                                 SM_SEC_GROUP,
                                 SM_SEC_TYPE,            
                                 MAX (ABS(MKT_VALUE * r.rate)) AS 'MKV_USD',
                                 MAX (ABS(MKT_NOTION * r.rate)) AS 'NOTIONAL_USD',
                                 COUNT (*) AS 'NUM_OF_HOLDINGS'  ,
                        MAX(FI.MATURITY) AS MATURITY
                                               
                        --      INTO #FI_Summary
                              FROM
                                 iim_risk_point.dbo.FI_PORT_SEC_CHAR_LOAD FI,
                                 dbo.Account ACT,
                                 dbo.fx_rate r
                              WHERE
                                 ASOF_DATE = '12/29/2017' AND
                       FI.MATURITY < dateadd(dd, 30, '12/29/2017') AND
                                 FI.PORTF_LIST = ACT.ID_ALADDIN AND
                                 ACT.STATUS = 'A' AND
                                 ACT.NME_GC_LVL1 IN('Fixed Income','Liquidity' )   AND
                                 MKT_VALUE <> 0  AND
                                 FI.PORT_CURRENCY = r.curr_sold AND
                                 r.curr_bought = 'USD' AND
                                 FI.ASOF_DATE = r.date
                              GROUP BY
                                 SM_SEC_GROUP,
                                 SM_SEC_TYPE
                              ORDER BY 1,2


      The dateadd function is not giving correct results. What am I doing wrong?
0
So I have a dataset wherein I have account number and "days past due" with every observation. So for every account number, as soon as the "days past due" column hits a code like "DLQ3" , I want to remove rest of the rows for that account(even if DLQ3 is the first observation for that account).

My dataset looks like :

Observation date Account num   Days past due

2016-09                           200056              DLQ1
2016-09                           200048              DLQ2
2016-09                           389490              NORM
2016-09                           383984              DLQ3.....

So for account 383984, I want to remove all the rows after 2016-09, as it is now in default.

So, in all, I want to find when each account first hits DLQ3 and remove all of that account's rows after the first DLQ3 observation.
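
The rule described above (keep each account's rows up to and including its first DLQ3, drop everything later) can be sketched as a single pass over the rows sorted by account and date. The example below is in Python with made-up rows; the function name `truncate_after_default` and the tuple layout are hypothetical, and it assumes observation dates in "YYYY-MM" form so that they sort correctly as text.

```python
def truncate_after_default(rows, default_code="DLQ3"):
    """Keep each account's rows up to and including its first default_code row.

    rows: iterable of (observation_date, account, status) tuples.
    """
    defaulted = set()
    kept = []
    # Sort by account, then by date, so each account is seen in time order.
    for date, account, status in sorted(rows, key=lambda r: (r[1], r[0])):
        if account in defaulted:
            continue  # this account already hit the default code earlier
        kept.append((date, account, status))
        if status == default_code:
            defaulted.add(account)
    return kept

rows = [  # made-up observations
    ("2016-09", 383984, "DLQ3"),
    ("2016-10", 383984, "DLQ3"),
    ("2016-11", 383984, "NORM"),
    ("2016-09", 200056, "DLQ1"),
    ("2016-10", 200056, "DLQ2"),
    ("2016-11", 200056, "DLQ3"),
    ("2016-12", 200056, "NORM"),
]
for row in truncate_after_default(rows):
    print(row)
```

An account whose very first observation is DLQ3 keeps only that one row, which matches the "even if DLQ3 is the first observation" case in the question.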
0
I have two Java projects for a replication exercise: RMIReplication and the publisher. In RMIReplication I create an ArrayList of Subjects, and in the publisher I need to access this ArrayList to do the attach and setstate. How can I do that? Below is the code of the two classes from the different projects.
public class Replication {

	 //static ArrayList<Subject> theList ;
	static ArrayList<Subject> theList;
	public static void main(String args[]){
		//System.setProperty("java.rmi.server.hostname","192.168.1.92");
		
		//System.setProperty("java.rmi.server.hostname","192.168.1.92");
				theList =new ArrayList<Subject>();
				Registry r=null;
				Registry r1=null;
				Registry r2=null;
				
				try{
					r = LocateRegistry.createRegistry(2023);
					r1 = LocateRegistry.createRegistry(2024);
					r2 = LocateRegistry.createRegistry(2025);
					
				}catch(RemoteException a){}
				//System.setSecurityManager(new RMISecurityManager());
				 try{
					 
					Subject list = new Event();
					
					Subject list1 = new Event();
					Subject list2 = new Event();
					System.out.println("1");
		            	Naming.rebind("//localhost:2023/Subject", (Remote) list );
		            	System.out.println("2");
		            	Naming.rebind("//localhost:2024/Subject1", (Remote) list1 );
		            	Naming.rebind("//localhost:2025/Subject2", (Remote) list2 );
		            	System.out.println("3");
		            	theList.add( list);
		            	theList.add(list1);
		            	

0
I have a data frame with some names (rows), and I have some positions that I want to use, for instance the 35th and 145th rows. How do I get the names of the rows at these positions? I uploaded a screenshot that may help. Thanks!

I tried something like

names1 <- row.names(which(size_96 < median(size_96, na.rm= T)))
0
Hi, I have one data frame (df1) with 20 observations (one for each year) and 597 variables (each one is a stock). The values are book-to-market ratios. I need to build two portfolios for each year: one consisting of the stocks with values below the median and one of the stocks with values above the median. The names of the stocks are the column names of df1. So I need to check whether each value in each row (each year) is below or above the median and identify the corresponding stock name (column in df1). Then I need to match it with the columns of another data frame (df2), which holds the return of each stock in each year (20x597). The end result would be a vector with 20 entries, which are the differences in average returns between the two portfolios. I hope this is clear enough. Thanks for any answer; I'm happy to explain further.
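
The procedure described above can be sketched as follows. This is a Python illustration with made-up numbers, not R code for df1 and df2. It assumes the median is taken across stocks within each year, that missing values are skipped, that stocks exactly at the median belong to neither portfolio, and that the difference is "above-median minus below-median"; the question leaves these points open. The function name `median_split_spread` is hypothetical.

```python
from statistics import mean, median

def median_split_spread(book_to_market, returns):
    """Per year: mean return of above-median stocks minus below-median stocks.

    Both arguments are lists with one dict per year, mapping stock name -> value.
    Missing values are represented by None and skipped.
    """
    spreads = []
    for btm_year, ret_year in zip(book_to_market, returns):
        # Keep only stocks with both a ratio and a return in this year.
        valid = {s: v for s, v in btm_year.items()
                 if v is not None and ret_year.get(s) is not None}
        cutoff = median(valid.values())
        high = [ret_year[s] for s, v in valid.items() if v > cutoff]
        low = [ret_year[s] for s, v in valid.items() if v < cutoff]
        spreads.append(mean(high) - mean(low))
    return spreads

btm = [  # made-up book-to-market ratios, one dict per year
    {"A": 0.5, "B": 1.0, "C": 1.5, "D": 2.0},
    {"A": 2.0, "B": None, "C": 0.4, "D": 1.0},
]
ret = [  # made-up returns for the same stocks and years
    {"A": 0.10, "B": 0.20, "C": 0.30, "D": 0.40},
    {"A": 0.05, "B": 0.50, "C": 0.15, "D": 0.99},
]
spreads = median_split_spread(btm, ret)
print(spreads)
```

With 20 years of data the result is a list of 20 differences. If a year had no stock strictly above or below the median, `mean` would raise an error, so real data may need an explicit check for that case.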
0
Hi,

I've logged into a Microsoft R Server using mrsdeploy::remoteLogin()

Test with session:

REMOTE> result <- system("gpg --yes --batch -r [e-mail] --passphrase=[youPassphrase] --armor --utf8-strings --decrypt youFile", intern = TRUE)

REMOTE> result
character(0)
attr(,"status")
[1] 2

REMOTE> exit
>Logout from remote R session complete


Test without session:

result <- system("gpg --yes --batch -r [e-mail] --passphrase=[youPassphrase] --armor --utf8-strings --decrypt youFile", intern = TRUE)

gpg: encrypted with 2048-bit RSA key, ID XXXXXXX, created 2017-11-20 "name<e-mail>"

>result
[1] "Esta es la frase\r"
[2] "que he encriptado\r"


I need this to work in the remote session, because that is how it will run as a service.


Thanks for your reply.

Kind regards,
0
