Statistical Packages

140 Solutions, 319 Contributors

Statistical packages are software titles, such as JMP and GNU Octave, and programming languages, such as MATLAB, R and SAS, that are used to discover, explore and analyze data and suggest useful conclusions, either to learn something unexpected or to confirm a hypothesis. The field includes the design and analysis of techniques to give approximate but accurate solutions to hard problems in statistics, econometrics, time-series, optimization and 2D- and 3D-visualization. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.

Hi,

My Dell PowerEdge R510 is finally running Windows Server 2019 (it was actually a straightforward installation).

Now I want to install virtual machines on this server.

As with Dell servers, my experience with virtual machines is limited.
Can somebody tell me what I need to do to set up Hyper-V on Server 2019?

Thanks,
tjie
0
Hi,

I am working in RStudio.

My code:
library(caTools)

set.seed(123)

split = sample.split(Customer_Churn$tenure, SplitRatio = 0.7)
split


training_set = subset(Customer_Churn, split == TRUE)
test_set = subset(Customer_Churn, split == FALSE)


# Fitting Simple Linear Regression to the Training Set

regressor = lm(formula = tenure ~ Contract,
               data = training_set)

summary(regressor)

options(scipen = 999)

# Predicting the test set results

y_pred = predict(regressor, newdata = test_set)

head(y_pred)

cbind(Actual=test_set$tenure,predicted=y_pred) -> final_data

as.data.frame(final_data) -> final_data

head(final_data)

final_data$ACtual - final_data$Predicted -> error

cbind(final_data,error) -> final_data

head(final_data)

It gives me this error:
> final_data$Actual - final_data$Predicted -> error
Error in final_data$Actual : $ operator is invalid for atomic vectors
> final_data$ACtual - final_data$Predicted -> error
Error in final_data$ACtual : $ operator is invalid for atomic vectors

Please advise
0
The attached CSV is a list of dates. Using R, I imported the CSV into a data frame as follows.

x <- read.csv("date.csv")

From x, I would like to exclude days off such as Saturdays and Sundays,
but I'm not sure how to do that in R. I would appreciate any pointers.
date.csv
0
I thought I knew how to do this after my last post, but apparently not. I have the following code, which works fine, but the output needs to list groups from the profile table that have no entries at all. When I run the code currently, I get all the groups that have registered individuals. Groups that have no registered individuals should show up as 0, but they don't appear.

With TOTAL_REGISTERED as 
(select r.regdate, r.Agency
FROM   tblOrgProfile p 

LEFT JOIN tblOrgRegistrations r
ON p.AgencyID = r.AgencyID
and r.fiscal = 2020

 where active = 1 and
 r.agency <> 'Administrator')

select Agency, 
SUM(CASE when regdate >= '7/1/2019' And regdate < '10/01/2019' then 1 end) as [1st Quarter],
SUM(CASE when regdate >= '10/01/2019' And regdate < '01/01/2020' then 1 ELSE 0 end) as [2nd Quarter],
SUM(CASE when regdate >= '01/01/2020' And regdate < '04/01/2020' then 1 ELSE 0 end) as [3rd Quarter],
SUM(CASE when regdate >= '04/01/2020' And regdate < '07/01/2020' then 1 ELSE 0 end) as [4th Quarter]
from TOTAL_REGISTERED T group by Agency order by agency

Please note that the tables attached do not reflect exactly all the data in the actual tables.
tblOrgProfile.xlsx
tblOrgRegistrations.xlsx
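
One possible reason the empty groups vanish: in the query above the WHERE clause tests r.agency, which is NULL for profile rows that have no matching registration, so those rows are dropped after the LEFT JOIN, and the grouping column is taken from the registrations table. The sketch below reproduces the idea with Python's standard-library sqlite3 module. The sample rows, the Agency column on tblOrgProfile, and the ISO-formatted dates are assumptions made for illustration, not the poster's actual schema.

```python
import sqlite3

# In-memory database with made-up rows; the real tables may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblOrgProfile (AgencyID INTEGER, Agency TEXT, active INTEGER);
CREATE TABLE tblOrgRegistrations (AgencyID INTEGER, regdate TEXT, fiscal INTEGER);
INSERT INTO tblOrgProfile VALUES
    (1, 'North', 1), (2, 'South', 1), (3, 'Administrator', 1);
INSERT INTO tblOrgRegistrations VALUES
    (1, '2019-07-15', 2020), (1, '2019-11-02', 2020), (3, '2019-08-01', 2020);
""")

# Filters on the registrations side stay in the ON clause, filters on the
# profile side go in WHERE, and the group name comes from the profile table.
query = """
SELECT p.Agency,
       SUM(CASE WHEN r.regdate >= '2019-07-01' AND r.regdate < '2019-10-01' THEN 1 ELSE 0 END) AS q1,
       SUM(CASE WHEN r.regdate >= '2019-10-01' AND r.regdate < '2020-01-01' THEN 1 ELSE 0 END) AS q2,
       SUM(CASE WHEN r.regdate >= '2020-01-01' AND r.regdate < '2020-04-01' THEN 1 ELSE 0 END) AS q3,
       SUM(CASE WHEN r.regdate >= '2020-04-01' AND r.regdate < '2020-07-01' THEN 1 ELSE 0 END) AS q4
FROM tblOrgProfile p
LEFT JOIN tblOrgRegistrations r
       ON p.AgencyID = r.AgencyID AND r.fiscal = 2020
WHERE p.active = 1 AND p.Agency <> 'Administrator'
GROUP BY p.Agency
ORDER BY p.Agency
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With these sample rows, 'South' has no registrations and still appears, with 0 in every quarter.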
0
I want to merge 12 historical datasets from a library (named FINAL_201812, FINAL_201901, FINAL_201902, etc.) and the current dataset FINAL_201908 into one dataset. Each dataset has the same number of columns and the same column names. For each row I want to add a new column in the first position that designates which dataset it came from. Each file has 15 specific locations, so my final dataset with ALL 13 datasets should have a new column with the dataset name:
dataset_name   location     field3   field4   field5
FINAL_201812   Dallas       Item1    Item2    Item3
FINAL_201812   Pittsburgh   Item1    Item2    Item3
FINAL_201901   Dallas       Item1    Item2    Item3
FINAL_201901   Pittsburgh   Item1    Item2    Item3

libname sasdata "mydirectorypath";

Dataset names: Final_201808.sas7bdat, Final_201809.sas7bdat, ... Final_201908.sas7bdat
0
I'm stumbling here. I have a SAS program that creates a final dataset, and I use ODS Excel with PROC REPORT to create a formatted spreadsheet. I also have 11 other SAS datasets under a libname that need to be added to this single Excel file as well. I understand that ODS Excel has limitations when it comes to easily creating separate sheets in one file.

What I'm struggling with is how to do this: whether I need to loop through the historical datasets in a single ODS statement that puts them on separate tabs, or something else.

/* so my libname statement is */
libname sasdata "mypath" access=readonly;

/* Then I have my macro at the end to create the excel file: */
%macro createReport(utildt);

options missing=' ';
ODS LISTING CLOSE;

ODS PATH (prepend) STD.template99(READ) SASHELP.TEMPLMST(READ);
ODS ESCAPECHAR='^';

TITLE;

ods excel file = "&rptpath./&filenm..xlsx" style=stdXLSX

/* CRITERIA PAGE - first sheet; specifies criteria for report */

options (orientation='portrait'
                   sheet_name="Criteria"
                           );

proc report data=CritReport nowindows headline headskip spacing = 2 missing split='|';

column description;

define description /display "REPORT DESCRIPTION";

run;

/* REPORT SHEET - second sheet; actual report data in a formatted sheet */

TITLE;

ods excel style =stdXLSX
options (orientation='portrait'
                   sheet_name="&utildt."
                    );
proc report data=FINAL_&utildt. 

0
Hi everyone,

In SAS, I am trying to sort a dataset by a certain variable (var1) and then write all records to SAS datasets (whose names include the var1 value) and ASCII files (whose names include the var1 value). I am thinking of doing everything in two steps as below, but I am not sure whether Step 2 should be done in PROC SQL or in a DATA step with macros. If anyone knows how to do Step 2, please let me know. Any suggestions or examples would be greatly appreciated.

Step 1. Sort the olddata.sas7bdat SAS dataset by var1
Step 2. Write all records from olddata.sas7bdat for each unique var1 to its corresponding data<var1>.sas7bdat SAS dataset and data<var1>.in ASCII file

/*********************************
 * Step 1 - sorting initial dataset
 *********************************/
proc sort data=olddata;
   by var1;
run;

/***************************************
 * Step 2 - creating data<var1>.sas7bdat
 ***************************************/
proc sql noprint;
  select distinct var1 into : x separated by ' ' from olddata;
quit;
%macro create;
  %do i = 1 %to %eval(%sysfunc(count(&x, %str( )))+1);
  data data%scan(&x,&i.);
  set olddata;
  if var1 = "%scan(&x,&i.)";
  run;
  %end;
%mend;
%create

0
I am trying to compare 2 columns in a Word table and, if they match, return a specific value. There are 13 rows in the table and I am starting at row 3. If the entire column 4 is equal to the entire column 34, then add text to table 6; otherwise do nothing.

The first problem is that it shows no match at all, even though it should.

For example:


Sub CompareColumns()
    Dim tbl1 As Table
    Dim tbl2 As Table
    Dim r As Integer
    Dim cl As cell
    Dim rw As Row
    Dim c As Integer
    Dim d As Integer
  

    Set tbl1 = ActiveDocument.Tables(9)
    Set tbl2 = ActiveDocument.Tables(9)
    Set tbl3 = ActiveDocument.Tables(6)
    
    c = 4 'Column No
    d = 34 'THIS IS A CONTENT CONTROL ITEM

    For r = 8 To 8
        If tbl1.cell(r, c).Range.Text = tbl2.cell(r, d).Range Then
                   tbl1.cell(r, c).Range = "Yes"
        End If

    Next r

End Sub

0
Private Sub GetEmployee()
        'Clear COMBOBOX...
        cmbEmployee.Items.Clear()
        OLEDBControls.ExecQuery("SELECT empId,empName FROM Employees;")
        'If Records are found, then add them to COMBOBOX....
        If OLEDBControls.RecordCount > 0 Then
            For Each r As DataRow In OLEDBControls.OledbDS.Tables(0).Rows
                cmbEmployee.Items.Add(r("empName"))
            Next
            cmbEmployee.SelectedIndex = 0
            cmbEmployee.MaxDropDownItems = 5
            cmbEmployee.ValueMember = "empId"
            cmbEmployee.DisplayMember = "Name"

        ElseIf OLEDBControls.Exception <> "" Then
            'Report Error..
            MsgBox(OLEDBControls.Exception)
        End If
    End Sub

    Private Sub GetEmpID()
        Dim id As Integer
        id = Me.cmbSite.SelectedValue
        MsgBox(id)
    End Sub

    Private Sub cmbEmployee_DropDown(sender As Object, e As EventArgs) Handles cmbEmployee.DropDown
        GetEmployee()
    End Sub

    Private Sub CmbEmployee_SelectedIndexChanged(sender As Object, e As EventArgs) Handles cmbEmployee.SelectedIndexChanged
        GetEmpID()
    End Sub

I want to return the employee ID for the selected text, but it returns zero. Please help.
0
Hi Experts

I want to change the following line of R script in my current code so that the script finds the minimum and maximum dates in the column "RecordDate" instead of using the hard-coded start=c(2018, 5), end=c(2019, 5), frequency=12. The second part is to use the column "Count" to get the data points for those dates instead of the hard-coded data2 <- ts(c(9,9,14,4,14,15,4,14,17,17,19,16)). I need to make the R script more dynamic.

#Begin first set of commands 
library(ggplot2)
library(trend)
library(zoo)
library(dplyr)
library(Kendall)
#End of first set of commands


#Begin second set of commands
data2 <- ts(c(9,9,14,4,14,15,4,14,17,17,19,16), start=c(2018, 5), end=c(2019, 5), frequency=12)
  mk.test(data2)
  data1X <- c(1:length(data1))
  data1Fit <- lm(data1~data1X)
  data1df <- data.frame(date=as.Date(time(data1)), Y=as.matrix(data1))
  ggplot(data=data1df, mapping=aes(x=date, y=Y, ymin = 0))+geom_point() +
    geom_line(color='blue') +
    stat_smooth(method = "lm", col = "red") +
    xlab("Months") + 
    ylab("Complaints") +
    scale_x_date(date_breaks = "1 month", date_labels = '%b %y') +
    labs(title = paste("Adj R2 = ",signif(summary(data1Fit)$adj.r.squared, 5),
                       " Slope =",signif(data1Fit$coef[[2]], 5)))
  #End of second set of commands
  

0
Hi,
I have finally confronted my fear of learning SQL Server and have managed to export my entire UKHR database to it. Up until now I have used Excel exclusively, both as a database and to run regressions. Because of the excessive number of files (around 20), some of them well and truly exceeding Excel's limits, I felt it was well overdue for me to, shall we say, move on.
I'm hoping some of you can guide me towards a suitable alternative to Excel, using SQL Server as the database management system. R springs to mind, along with a couple of statistical software packages. One that I have a licensed copy of is Stata, which is now 13 years old; I never got my head around it.
If someone can point me in the right direction I would be most appreciative.
Many thanks,
Ian
0
I am running the following query in SSMS:

UPDATE LS.FLIB2.S0647D36.LIB2.MADR SET ADUUCB = 'YES'
 FROM LS_LIB2.S0647D36.LIB2.MADR MBAD
	INNER JOIN V_READY_TO_SHIP R
		ON MBAD.ADCVNB = R.[CO#] AND MBAD.ADDRNB = R.RELEASE_NUMBER
		AND MBAD.ADFCNB = R.LINE_NUMBER AND MBAD.ADAASZ = R.KIT_RELEASE_NUMBER

and I am getting the following error message:

Msg 7352, Level 16, State 1, Line 1
The OLE DB provider "IBMDASQL" for linked server "XA_AMFLIB2" supplied inconsistent metadata. The object "(user generated expression)" was missing the expected column "Bmk1000".

V_READY_TO_SHIP is a SQL Server table and LS.FLIB2.S0647D36.LIB2.MADR is a file on an AS400. My only other option would be to send the contents of V_READY_TO_SHIP to an AS400 file and run the update query from there, but I would prefer to run it on the SQL Server instance because it is part of a longer process happening in SQL Server.

Any ideas?
0
When setting up DHCP-PD on a Cisco router, is the right method to set the 64-bit hex IPv6 prefix on the WAN side and leave the rest to be identified from the local MAC layer?

The wired Cisco router I am using only allows a name, for example 2002, whereas on the RV042G I set the DHCP-PD prefix to 2002::. Is this caused by the settings of the previous RV042G being stored on Cisco's server? Or will the change simply not be applied?
0
I am trying to figure out what is causing my laptop to freeze at random points during the day. It seems to happen more often when I am connected to an external monitor via HDMI, but I can't pinpoint what is actually causing it. What happens is that no .exe will start for 15-60 seconds, then all of a sudden the system "wakes up" and runs all the queued .exe files.
When I realize it has stopped allowing new things, I hit Windows+R (I DO get the Run box), type cmd and hit Enter. The Run box goes away, but cmd won't launch for 15-60 seconds, and then everything I've done in those 15-60 seconds happens in an instant (the command prompt opens, the calculator opens, tabs change in Chrome, etc.). Then everything is fine and works for another hour or more (it is random when it does or doesn't happen). I CAN Alt+Tab and switch between windows, but if I try to do anything in a window it won't work (until it wakes up and does all the stored tasks). I CAN'T Ctrl+Alt+Del; it is another exe that gets "put in the queue" and, again, it pops up along with my command prompt, etc.
I have run memory checks, sfc, CCleaner and MBAM, and the primary OS is on an SSD; no errors that I can tell.
I did just install Process Explorer but am not sure how to pinpoint the "queue". Once everything is in "freeze mode", what do I look for in Process Explorer?
0
Hi, I can get Postman to POST to create a VM, but when I run the code in PyCharm I encounter this error.

code 1: obtain a token
==========================================================================
import requests
url = "https://IP/silvan/apigateway/v1.0/"
get_apis = "apis_include_throttles"

##API URLs
get_token_url = "https://iam-apigateway-proxy.domain.com/v3/auth/tokens"

create_volume_url = "https://evs.sitc-1.domain.com/v2/6d321dd88c7143ba8d6daf3e15f14be9/volumes"
delete_volume_url = "https://evs.stic-1.domain.com/v2/6d321dd88c7143ba8d6daf3e15f14be9/volumes/"
create_vm_url = "https://ecs.sitc-1.domain.com/v2/6d321dd88c7143ba8d6daf3e15f14be9/servers"



##Images Dictionary
images = {
    "Ubuntu 18.10": "5313ace4-4573-404b-abc6-8548ed14c4f7",
    "RHEL7.5-40G": "aa9d05f3-cb90-4776-9c02-617a9906b271",
    "WindowsServer2016WithGUI": "c54d05fa-5ad8-425e-be56-e60ede395230",
    "Windows10Pro": "29caef55-0617-4813-8a17-cb0bef19de16",
    "RHEL7.5": "c5ccd8a7-d8f3-4a4c-91c3-9d93303aee58",
    "Ubuntu16.04LTS": "3f8948fd-c108-48db-9951-1d617e8e5b03",
    "image-kvm-euler": "298e2912-5a7a-4178-8ac4-b260712d514c",
    "image-ManageOne": "80d9b0ee-a5b3-42fe-99ed-fc32c57da5b3",
    "esight_image": "e1e94234-f3e7-4793-8bdb-0cef9e3194cf"
}


def get_token():
    body = {
        "auth": {
            "identity": {
                "methods": [
                    "password"
                ],
                "password": {
                    "user": {
    …
0
Hi,

Why can't I delete my system volumes? The deletion fails with an error. I have checked that there are no more VMs, snapshots, etc.

Deleting with Postman fails with an error. Please check my Postman request:
2.jpg
Deleting with a Python script also fails with an error. Please check my Python code:

import requests

get_token_url = "https://iam-apigateway-proxy.domain.com/v3/auth/tokens"

body = {
    "auth": {
        "identity": {
            "methods": [
                "password"
            ],
            "password": {
                "user": {
                    "domain": {
                        "name": "XXXXX"
                    },
                    "name": "XXXXX",
                    "password": "XXXXX"
                }
            }
        },
        "scope": {
            "project": {
                "id": "cd088007d3b84e7fa894478e6fe667c4",
                "domain": {
                    "name": "XXXXX"
                }
            }
        }
    }
}

# POST to the API
results = requests.post(get_token_url, json=body, verify=False)


token = results.headers['X-Subject-Token']

#Deletion

volume_id = [
    'c3b803ee-f1ab-428d-bc55-01b380c36d49',
    '065ba0b3-959a-4dc0-af23-17b8a1099367',
    'b282ea13-5e24-48fb-b7e2-84fce0973621',
    'aa87fc71-ae1b-47ee-82fc-bcd73a59538d',
    'a72cc934-473e-4cde-8eb0-743ca058e350'

]

delete_url = "https://evs.domain.com/v2/cd088007d3b84e7fa894478e6fe667c4/volumes/"
headers = {
    'content-type': "application/json",
    …
0
How can I copy and paste data, including its conditional formatting, from a Minitab worksheet into Excel?
I tried, but only the values are copied; the formatting I had in Minitab is not reflected in Excel.
0
How is
model <- lm(
           formula = Petal.Width ~ Petal.Length,
            data = iris
            )
different from
model <- lm(
            formula = iris$Petal.Width ~ iris$Petal.Length,
            data = iris
            )

The output of both commands is the same, but my prediction output differs.
NOTE: I had assumed Petal.Width is the same as iris$Petal.Width; clearly it is not. I don't understand how they are different.
The attached R script contains the complete code.
0
I would like to change the node names of my data.tree object (tjpCPI) from IDs to readable tags. A sample of the tree structure is here:
> print(tjpCPI, "CPI.Tag")
                                levelName                          CPI.Tag
1   378257447                                                          CPI
2    ¦--378257497                                                     Food
3    ¦   ¦--378259447                                              Cereals
4    ¦   ¦   ¦--378259457                                             Rice
5    ¦   ¦   ¦   ¦--378259467                                Non Glutinous
6    ¦   ¦   ¦   ¦   ¦--378259477                                   Rice-A
7    ¦   ¦   ¦   ¦   °--378259487                                   Rice-B
8    ¦   ¦   ¦   °--378259497                                    Glutinous
9    ¦   ¦   ¦--378259507                                            Bread
10   ¦   ¦   ¦   ¦--378259517                                  White Bread
11   ¦   ¦   ¦   ¦--378259527                                Bean Jam Buns
12   ¦   ¦   ¦   °--378259537                                   Curry Buns
13   ¦   ¦   ¦--378259547                                          Noodles

I would like levelName to become CPI.Tag.

How might I do that without needing to iterate through each node?
0
I would like to change the names of the members of a list. The list holds a CPI hierarchy; the trouble is that the names are currently ID numbers, which are difficult to interpret. Each member has a readable tag, "CPI.Tag".

How would I swap the label currently used for the name to the "CPI.Tag"? (A display of a few sample members is attached.)
2018-08-14_13-32-00.png

I thought it would be something along the lines of names(jpCPIlist)<-jpCPIlist$CPI.Tag, but evidently not.
0
How can I remove any character from a string if it is not part of the "x character set" below? After a character is removed, no space should be left in its place.
X character set:
---------------------
a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
0 1 2 3 4 5 6 7 8 9
/ - ? : ( ) . , ‘ +

Example
     '020?Dome@&++;stic CT out;%going?20测试叙事测试叙事@@@@test'
Expected output is
    020?Dome++stic CT outgoing?20test


I am using the query below, but it replaces '+'. I don't want '+' to be replaced because it is part of the x character set.

select translate(REGEXP_REPLACE ('020?Dome@&++;stic CT out;%going?20测试叙事测试叙事@@@@test','[^' || CHR(32) ||'-' ||CHR (127) || ']', ''),'+=!“%&*<>;{@#_',' ') AS TEST from dual;
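
A likely cause, for what it is worth: Oracle's TRANSLATE maps each character of its second argument to the character at the same position in the third, and removes characters that have no counterpart. Since '+' is the first character of the second argument and the third argument is a single space, '+' is turned into a space. The whitelist idea itself can be sketched in Python as below. This is an illustration under assumptions rather than a drop-in Oracle fix: the curly quote in the posted set is treated as a plain apostrophe, and the space is treated as allowed because the expected output keeps it.

```python
import re

# Allowed "x character set": letters, digits, space and / - ? : ( ) . , ' +
# Assumption: the curly quote in the post stands for a plain apostrophe.
_DISALLOWED = re.compile(r"[^A-Za-z0-9/\-?:().,'+ ]")


def keep_x_charset(text: str) -> str:
    """Drop every character outside the allowed set, leaving no gap behind."""
    return _DISALLOWED.sub("", text)


sample = "020?Dome@&++;stic CT out;%going?20测试叙事测试叙事@@@@test"
cleaned = keep_x_charset(sample)
print(cleaned)  # 020?Dome++stic CT outgoing?20test
```

The same negated character class could in principle be passed to REGEXP_REPLACE on its own, without the extra TRANSLATE step.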
0
I need to execute a substring function on each row of a single-column data frame in RStudio, then assign the result to a new data frame.
0
Hello,
I'm trying to pass a value to a List, but I get the error "Cannot implicitly convert System.Collections.Generic.List<char> to System.Collections.Generic.List<string>".
Here's my code:
public List<string> NomeContrato { get; set; }

BindingSource bs = new BindingSource();
            bs.DataSource = LoadContratos();
            var editForm = new Concelhos_Edit();
            var editFormModel = new Info();
            editFormModel.Id = concelhos_datagrid.CurrentRow.Cells[0].Value.ToString();
            var _nomeContrato = contratosdt.AsEnumerable().FirstOrDefault(a => a.Field<int>("IdContrato") == ((DataRowView)bs.Current).Row.Field<int>("IdContrato")).Field<string>("Designacao");
            editFormModel.NomeContrato = _nomeContrato.ToList(); //--> Here's where the code breaks and get Error!

If I change this
public List<string> NomeContrato { get; set; }
to this
public List<char> NomeContrato { get; set; }
I get my ComboBox with one letter per line, like this:
W
o
r
l
d

Any help?
Thanks.
0
I would like to perform fractal analysis on a financial series (a stock exchange index). Which should I use (prices or log returns) for calculating the fractal dimension and the Hurst exponent, for performing R/S analysis, and for predictions? Are there R functions that calculate all of these directly? Is there anything I need to take care of when analyzing the index? Thank you in advance!
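
As a rough illustration of the quantities involved, here is a minimal Python sketch of log returns and the rescaled-range (R/S) statistic for a single window. R/S analysis is often applied to log returns rather than to raw price levels, but that choice is an assumption here, and the prices are made-up numbers, not real index data. A Hurst exponent estimate would additionally need R/S values for several window sizes and a regression of log(R/S) on log(window size), which is not shown.

```python
import math


def log_returns(prices):
    """Log returns ln(p[t] / p[t-1]) of a price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def rescaled_range(series):
    """R/S statistic of one window: range of the cumulative mean-adjusted
    sums divided by the population standard deviation.
    Raises ZeroDivisionError for a constant series."""
    n = len(series)
    mean = sum(series) / n
    cumulative, running = [], 0.0
    for x in series:
        running += x - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return spread / std


prices = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0]  # hypothetical values
returns = log_returns(prices)
rs = rescaled_range(returns)
print(len(returns), rs)
```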
0
I am working on a fairly simple but long program. Here are the full details:

The P-v-T relation for real gases can take many forms. The simplest relations are the ideal gas equation and the Van der Waals equation. These relations are to be applied to superheated steam. The file “pvt.txt” contains the P-v-T data of superheated steam (10 – 800 kPa) for the temperature range of 200 °C through 1200 °C, obtained from the steam tables.

Write a C program to read the steam table data “pvt.txt”. In the C program, estimate the density of steam for the pressure range 10 through 800 kPa, and temperature range 200 °C through 1200 °C,

(1) Using the ideal-gas relation: v = RT/P [m3/kg], where R = 0.4615 kJ/kgK, T is temperature [K] and P is pressure [kPa].

(2) Using the Van der Waals equation: (P + a/v^2)(v - b) = RT,

where R = 0.4615 kJ/kgK, T is temperature [K] and P is pressure [kPa]. The constants are obtained from a = 27 R^2 Tcr^2 / (64 Pcr) and b = R Tcr / (8 Pcr), where Pcr = 22060 kPa and Tcr = 647.1 K.

In each case, calculate the resulting percentage error of the estimated density as follows: Error = x 100%

Submit a report which must include:
1. Introduction, algorithm or flowchart, the C program, and the density from steam table.
2. The estimated density table when using the ideal gas equation.
3. The percentage error table when using the ideal gas equation.
4. The estimated density table when using the Van der Waals equation.
5. The percentage error table when using the Van der Waals equation.
6. Discussion and conclusion.

Note: Density…
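
The assignment asks for C; the Python sketch below only illustrates the arithmetic for one state point. It assumes the standard textbook critical-point expressions for the Van der Waals constants, a = 27 R^2 Tcr^2 / (64 Pcr) and b = R Tcr / (8 Pcr), and it assumes the percentage error is the absolute relative difference, because the formula is missing from the post. The function names are hypothetical, and reading "pvt.txt" is left out because its layout is not given.

```python
R = 0.4615      # kJ/(kg*K), from the assignment
P_CR = 22060.0  # kPa, critical pressure
T_CR = 647.1    # K, critical temperature

# Van der Waals constants from the critical point (assumed textbook forms).
A = 27.0 * R**2 * T_CR**2 / (64.0 * P_CR)  # kPa*m^6/kg^2
B = R * T_CR / (8.0 * P_CR)                # m^3/kg


def ideal_density(p_kpa, t_k):
    """Density in kg/m^3 from v = R*T/P."""
    return p_kpa / (R * t_k)


def vdw_specific_volume(p_kpa, t_k, tol=1e-12, max_iter=200):
    """Solve (P + a/v^2)(v - b) = R*T for v by fixed-point iteration,
    starting from the ideal-gas volume (adequate for the gas phase)."""
    v = R * t_k / p_kpa
    for _ in range(max_iter):
        v_next = R * t_k / (p_kpa + A / v**2) + B
        if abs(v_next - v) < tol * v_next:
            return v_next
        v = v_next
    return v


def vdw_density(p_kpa, t_k):
    return 1.0 / vdw_specific_volume(p_kpa, t_k)


def percent_error(estimated, table_value):
    """Assumed form: absolute relative difference in percent."""
    return abs(estimated - table_value) / table_value * 100.0


t_k = 200.0 + 273.15  # 200 degrees C in kelvin
for p_kpa in (10.0, 800.0):
    print(p_kpa, ideal_density(p_kpa, t_k), vdw_density(p_kpa, t_k))
```

In a full solution the same three functions would be applied to every pressure and temperature pair read from the steam table, with the tabulated density passed to percent_error.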
0
