Statistical Packages

141

Solutions

319

Contributors

Statistical packages are software titles, such as JMP and GNU Octave, and programming languages, such as MATLAB, R and SAS, that are used to discover, explore and analyze data and suggest useful conclusions, either to learn something unexpected or to confirm a hypothesis. The field includes the design and analysis of techniques to give approximate but accurate solutions to hard problems in statistics, econometrics, time-series, optimization and 2D- and 3D-visualization. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.

Hi,

Finally, my Dell PowerEdge R510 is running Windows Server 2019 (it was actually a straightforward installation).

Now I want to create virtual machines on this server.

As with Dell servers, my experience with virtual machines is limited.
Can somebody tell me what I need to do to set up Hyper-V on Server 2019?

Thanks,
tjie
0
If I clock the speed of an object for six different trials, is there a way to predict what the approximate mean and standard deviation would be for 100 trials over the same distance?

Also is my sample of six trials too small?

Example
Trial 1:  .607 seconds
Trial 2:  .653 seconds
Trial 3:  .618 seconds
Trial 4:  .809 seconds
Trial 5:  .63 seconds
Trial 6:  .619 seconds

Mean is .656
Standard deviation is .076562
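
The summary statistics for the six trials can be checked with a short Python sketch. It assumes the trials are independent measurements of the same process; under that assumption, the best available guess for the mean and standard deviation of 100 trials is simply the sample mean and sample standard deviation, while the standard error shows how uncertain the mean still is with n = 6. The constant 2.571 is the two-sided 95% t critical value for 5 degrees of freedom; the variable names are illustrative only.

```python
import math
import statistics

# Trial times in seconds, copied from the example above.
trials = [0.607, 0.653, 0.618, 0.809, 0.63, 0.619]

n = len(trials)
mean = statistics.mean(trials)   # sample mean
sd = statistics.stdev(trials)    # sample standard deviation (n - 1 denominator)
sem = sd / math.sqrt(n)          # standard error of the mean

# Two-sided 95% t critical value for n - 1 = 5 degrees of freedom.
T_CRIT_95_DF5 = 2.571
ci_low = mean - T_CRIT_95_DF5 * sem
ci_high = mean + T_CRIT_95_DF5 * sem

print(f"mean = {mean:.3f} s, sd = {sd:.6f} s")
print(f"95% CI for the mean: {ci_low:.3f} s to {ci_high:.3f} s")
```

With only six trials the interval around the mean is wide, and the single .809 second trial has a large influence on both statistics, so more trials would give a noticeably more stable estimate.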
0
I have the code below for a SelectFolderDialog in my previous VB.NET application. I'm trying to port it to C# but haven't succeeded.

Can anyone help me convert it to C#? Any alternative solution would also be appreciated.

Public Class SelectFolderDialog
    Implements IDisposable

    ' Wrapped dialog
    Private OFD As System.Windows.Forms.OpenFileDialog = Nothing

    ''' <summary>
    ''' Initialize our Open File Dialog object
    ''' </summary>
    Public Sub New()
        OFD = New System.Windows.Forms.OpenFileDialog()

        With OFD
            .Filter = "Folders|" & vbLf
            .AddExtension = False
            .CheckFileExists = False
            .DereferenceLinks = True
            .Multiselect = False
        End With
    End Sub

#Region "Properties"

    ''' <summary>
    ''' Gets/Sets the initial folder to be selected. A value of Nothing or Empty selects the current directory.
    ''' </summary>
    Public Property InitialDirectory() As String
        Get
            Return OFD.InitialDirectory
        End Get
        Set(value As String)
            OFD.InitialDirectory = CType(IIf(value Is Nothing OrElse value.Length = 0, Environment.CurrentDirectory, value), String)
        End Set
    End Property

    ''' <summary>
    ''' Gets/Sets the title to show in the dialog. A value of Nothing or Empty defaults to 'Select a folder'
    ''' </summary>
    Public Property Title() As String
        Get
            Return OFD.Title
        End Get
        

0
Starting with a string representing a range:
string sortRange = "C10:H24";

This then works:
worksheet.Range[sortRange].Sort(worksheet.Range[sortRange].Columns[1], xlSortOrder.xlDescending);

I saw worksheet.Range[sortRange] was listed twice, so I decided to refactor that out:
Excel.Range r = worksheet.Range[sortRange];
Excel.Range c = r.Columns[1];
r.Sort(c, xlSortOrder.xlDescending);

My refactored version gives a much different result. Why is that?
1
Q: How to unite interaction columns from R emmeans package dynamically generated in R-Shiny?

I am building an R-Shiny app where I need to wrangle the output from the 'emmeans' package. However, in this interactive environment where many factors may be entered by the user, the single-tibble 'emmeans' output structure will vary with each run depending on the selections made. It could go from having only a single main effect to having multiple 3-way interactions (mixed with main effects and 2-way interactions) arranged in a wide format way.

For instance, assuming the user selects FctrA (with levels A and B) and FctrB (with levels C, D, and E), the interaction FctrA_FctrB will be automatically considered as well. When (~FctrA, ~FctrB, ~FctrA+FctrB) are submitted to 'emmeans', the output tibble is structured as follows:

- the leftmost side of the tibble contains FctrA results (levels, estimates, SE, df, CLs);
- the centermost block contains FctrB results;
- the rightmost side of the tibble contains the interaction results

So far so good, except that the FctrA levels are in a single column and the FctrB levels are also in a single column, but the interaction portion has its levels split into two columns, one with FctrA and one with FctrB.

The above issue impairs gathering, spreading, stacking of the separate blocks owing to the dimensional discrepancy.

My question is: How can I tell Shiny ('tidyr') to find those split interaction columns and concatenate them …
0
Hi, I have a Dell PERC 740p RAID card that I just put in a computer. The computer recognizes it, but I can't get in with Ctrl + R. I see the POST message, but this one doesn't say to press Ctrl + R to get into the BIOS. It says...

"PowerEdge Expandable RAID controller BIOS copyright AVAGO Technologies"

Then "Initializing virtual drives", and then it goes on its merry way.

Question: is there another way to get into the BIOS? Am I doing something wrong?

Thanks all
0
hi,

I am studying ETL tools and have recently heard a lot of people say that ETL should just be done with scripts or code, e.g. R and Qview. Does this mean that ETL tools like MS SSIS are now useless?

What are the pros and cons of doing ETL logic in code versus using ETL tools?

It seems that running ETL code at the container level with a RESTful API already lets the ETL process do load balancing, parallel execution and scale-out (by container). Is that correct? If so, is there no need for ETL tools any more?

The new MariaDB X3 platform seems to be able to skip the ETL process altogether, as it can stream data directly from OLTP to OLAP. So is ETL not needed anymore?
0
Hello,

You can see I have commented out a couple of tables. The person with B.EMPLID = '4373198' does not have a residency, and even with a left outer join he does not appear. What needs to be done for him to appear in the result set?

SELECT DISTINCT B.EMPLID  
 , T.FIRST_NAME_SRCH  
 , T.LAST_NAME_SRCH  
 , B.STRM  
 , B.DESCR  
 , B.CLASS_NBR  
 , B.SUBJECT  
 , B.CATALOG_NBR  
 , B.CLASS_SECTION  
 , B.ENRL_STATUS_REASON  AS STATUS
 , X.XLATSHORTNAME AS ENRL_STATUS_REASON  
 , B.ENRL_ACTN_RSN_LAST AS ActionReasonLastStatus  
  , T.PHONE  
 , U.EMAIL_ADDR  
 , B.ENRL_ADD_DT  
 , B.ENRL_DROP_DT  
 , A.ACCOUNT_BALANCE
  , VW.DESCR
 , VW.REF1_DESCR
 , (  
 SELECT O.COMMENTS  
  FROM PS_PERSON_COMMENT O  
 WHERE B.EMPLID = COMMON_ID  
   AND ADMIN_FUNCTION = 'SFAC'  
   AND CMNT_CATEGORY = 'FYI'  
   AND COMMENT_DT <= GETDATE()  
   AND COMMENTS IS NOT NULL  
   AND SEQ_3C = (  
 SELECT (MAX(SEQ_3C))  
  FROM PS_PERSON_COMMENT O2  
 WHERE O2.COMMON_ID = O.COMMON_ID  
   AND ADMIN_FUNCTION = 'SFAC'  
   AND CMNT_CATEGORY = 'FYI'  
   AND COMMENT_DT <= GETDATE() ))
   --,R.RESIDENCY
  -- ,S.SRVC_IND_CD
  -- ,MAX(SRVC_IND_DTTM)
  FROM PS_CLASS_TBL_SE_VW  B
LEFT OUTER JOIN XLATTABLE_VW X ON B.ENRL_STATUS_REASON = X.FIELDVALUE
LEFT OUTER JOIN PS_PERSONAL_DATA T ON B.EMPLID = T.EMPLID
LEFT OUTER JOIN PS_EMAIL_ADDRESSES U ON B.EMPLID = U.EMPLID  
LEFT OUTER JOIN PS_ACCOUNT_TOT_VW A ON B.EMPLID = A.EMPLID
LEFT OUTER JOIN PS_ITEM_SF_VW VW  …
0
Hi Experts

Could you give me an overview of how to use the R language to obtain data from Facebook?

Thanks in advance.
0
/usr/local/sbin/smsbox -v 4 /home/admin/web/mysite.com/kannel/kannel.conf
2019-05-29 08:17:42 [2871] [0] PANIC: Failed to open HTTP socket
2019-05-29 08:17:42 [2871] [0] PANIC: /usr/local/sbin/smsbox(gw_panic+0x145) [0x438f05]
2019-05-29 08:17:42 [2871] [0] PANIC: /usr/local/sbin/smsbox(main+0x128d) [0x40e7cd]
2019-05-29 08:17:42 [2871] [0] PANIC: /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f7d88b9e445]
2019-05-29 08:17:42 [2871] [0] PANIC: /usr/local/sbin/smsbox() [0x40ead2]
[root@sms-api ~]# /usr/local/sbin/bearerbox -v 4 /home/admin/web/mysite.com/kannel/kannel.conf
2019-05-29 08:18:03 [2872] [3] PANIC: Could not open smsbox port 13001
2019-05-29 08:18:03 [2872] [3] PANIC: /usr/local/sbin/bearerbox(gw_panic+0x145) [0x47d8c5]
2019-05-29 08:18:03 [2872] [3] PANIC: /usr/local/sbin/bearerbox() [0x41b638]
2019-05-29 08:18:03 [2872] [3] PANIC: /usr/local/sbin/bearerbox() [0x47b41f]
2019-05-29 08:18:03 [2872] [3] PANIC: /lib64/libpthread.so.0(+0x7e25) [0x7fe3f9e54e25]
2019-05-29 08:18:03 [2872] [3] PANIC: /lib64/libc.so.6(clone+0x6d) [0x7fe3f8f99bad]
I ran a command to see the active processes. Why did this happen?

tcp        0      0 0.0.0.0:2525            0.0.0.0:*               LISTEN      779/exim
tcp        0      0 0.0.0.0:13000           0.0.0.0:*               LISTEN      2182/bearerbox
tcp        0      0 0.0.0.0:13001           0.0.0.0:*               LISTEN      2182/bearerbox
tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN   …
0
Hi Experts

I am new to R scripting and coding, and have just inherited some R code which plots perfectly fine and gives the correct results. I want to be able to show the x and y values when a user hovers the mouse cursor over the line.

I have found an article that describes how to carry out the necessary steps, but I'm having no luck at all:
https://www.r-graph-gallery.com/124-change-hover-text-in-plotly/

My R code:
library(dplyr)
library(ggplot2)
library(survival)
library(survminer)
library(grid)
library(gridExtra)
library(plotly)
pc <- dataset
fstat <- pc %>% mutate(fstatus = case_when(OUTCOMETYPE=="Revised" ~ 1,TRUE ~ 0))
pmpa <- fstat %>% select(PRIMARYPROCEDUREID,PRIMARYTOOUTCOMEYEARS,fstatus,OUTCOMETYPE)
if(nrow(pmpa) < 4){
d <- pmpa %>% select(PRIMARYPROCEDUREID,PRIMARYTOOUTCOMEYEARS,OUTCOMETYPE) %>% mutate(INSUFFICIENTDATA = "Summary Results")
h = head(d[,2:4])
grid.table(h)
}else{
fit <- survfit(Surv(PRIMARYTOOUTCOMEYEARS,fstatus)~1,data = pmpa) 
ggsurv <- ggsurvplot(fit,
           ylab="Patient Analysis",
           xlab="Time (Years)",
           break.time.by = 1,
           xlim = c(0,max(fit$time)),
           surv.scale = "percent",
           legend.title = "Kaplan-Meier",
           legend.labs = "",
           risk.table = TRUE,
           fontsize = 3,
           font.tickslab = c(10, "plain"),
           risk.table.y.text = FALSE,
           fun = "event"
           )
ggsurv$plot <- ggsurv$plot + theme(plot.title = 

0
If you had to consider all the myriad of stock trading indicators out there that many novice and advanced traders alike base their trading on, and you had to group that huge number of indicators into the broadest possible major categories, what would those categories be?

I'm thinking there may only be two (2) categories off the top of my head:  Price and Volume.

What do you think?
0
Guys,
I'd like to know how, and in what scenarios, the Python and R support embedded in SQL Server 2017 can be put to meaningful use. I'm still using SQL Server 2014 and running SQL Server Reporting Services to produce end reports for customers, and I can say that 100% of the data analysis we have performed was generated from T-SQL.
I wonder how R and Python in SQL Server 2017 would help me speed things up or create more meaningful data for customers, as in my experience T-SQL does more than enough for me to provide even complex reports to end users.
If anyone here is able to shed some light, I may have more justification to upgrade to SQL Server 2017.
0
I've installed the rattle package and run this code.
library(rattle)
test <- c(1,2,3,4,5,6)
test
test2 <- binning(test,4,method = "quantile",ordered = FALSE)
test2

This is the output I get.

[1] 1.000000 1.916667 3.500000 5.083333 6.000000
Levels: [1,1.92] (1.92,3.5] (3.5,5.08] (5.08,6]

I understand that 3.5 is the median.  Where do 1.92 and 5.08 come from?
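
The breakpoints in the output are consistent with the Hyndman-Fan "type 8" quantile definition (available in R as `quantile(x, type = 8)`) rather than R's default type 7, which would give 2.25 and 4.75 for this vector. That is an inference from the numbers, not a statement about rattle's source code. A minimal Python sketch of the type 8 rule, with the hypothetical helper name `quantile_type8`, is:

```python
import math

def quantile_type8(values, p):
    """Hyndman-Fan type 8 quantile: interpolate at h = (n + 1/3) * p + 1/3."""
    xs = sorted(values)
    n = len(xs)
    h = (n + 1 / 3) * p + 1 / 3   # 1-based fractional position in the sorted data
    lo = math.floor(h)
    frac = h - lo
    # Clamp to the smallest and largest observations at the extremes.
    if lo < 1:
        return float(xs[0])
    if lo >= n:
        return float(xs[-1])
    # Linear interpolation between the two neighbouring order statistics.
    return xs[lo - 1] + frac * (xs[lo] - xs[lo - 1])

test = [1, 2, 3, 4, 5, 6]
# Four bins need five breakpoints: the 0%, 25%, 50%, 75% and 100% quantiles.
breaks = [quantile_type8(test, k / 4) for k in range(5)]
print([round(b, 6) for b in breaks])
```

For the 25% point, h = (6 + 1/3) * 0.25 + 1/3 = 1.916667, so the value lies 0.916667 of the way from 1 to 2; the 75% point works the same way between 5 and 6, giving 5.083333.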
0
Hi,

I have a file with extension .dta

I need to analyse the data in Excel.

I've downloaded Stata and would like to know how to convert the .dta to a .xls or .csv

Thanks
Seamus
0
Microsoft R and SQL Server 2016

I installed Microsoft R on an existing instance of SQL Server 2016 and I cannot get the SQL Server Launchpad service to start.  Each time I try, I get "The request failed or the service did not respond in a timely fashion. Consult the event log or other applicable error logs for details."  In the application error log it says "A timeout was reached (30000 milliseconds) while waiting for the SQL Server Launchpad (MSSQLSERVER) service to connect."

I have tried everything I can find and no joy.  I read that if the R library is out of sync with SQL Server this can happen.  How can I tell if this is the case and if so how can I fix it?

The version number of the Launchpad.exe file is 13.0.1601.5
The version number of SQLSVC.dll is 13.0.5216.0

The @@VERSION for SQL Server is Microsoft SQL Server 2016 (SP2-CU3) (KB4458871) - 13.0.5216.0 (X64)   Sep 13 2018 22:16:01  

Any help would be greatly appreciated.

Jim
0
I ran into a problem in RStudio. I am trying to run a t-test and it says "grouping data must have exactly 2 levels". Does anyone know how to do this t-test with our values? We are trying to present the results in Excel using bar charts showing means with error bars. We also need to know how to segment results using ANOVA.
rstudio.png
0
Difficulty with one DC in a multi-site AD setup - Naming Context is in the process of being removed or is not replicated from the specified server
It appears that syncing FROM the master DC (schema, FSMO roles holder) TO the out-of-sync DC works without error, however the receiving DC cannot initiate a sync via GUI in AD Sites and Services nor can it via repadmin /replicate.

Promoted another server in the remote site to DC and was able to successfully get it working, so WAN / VPN / DNS appears to be working as expected.

Is there a way I can force the sync from the main DC to the out-of-sync DC and get it to pick back up again?
0
Hi there.

I'm having trouble working out how to do this in Stata: I have a dataset with the response (BOR) for the patients. The variable BOR can be CR, PR, SD, PD or NE. Each patient can be in either arm A or B. I need to make a table in Stata with the distribution of BOR in arms A and B, and I also need to calculate a p-value (log-rank) and hazard ratio for each value of BOR.
Can anyone tell me how to do this in Stata? If not Stata, then maybe the code for an alternative statistics package? It's probably very similar.
Thank you in advance.

Best regards

Ulrich
0
Are there any good machine learning libraries for .NET that may make it an alternative to Python and R?  I've heard of ML.NET, but am not sure how good it is and how committed Microsoft are to it.
0
I am trying to find a task that automatically starts an application when a user logs in to a computer running Windows Server 2012 R.
But I do not know where it is located.  Can we use Event Viewer to find out how and where the task starts?
0
Hi Experts!

I am running into an error when using CPYFRMIMPF with numeric fields. When I run the CPYFRMIMPF command I get this error: "The copy did not complete for reason code 9." When I define all the DDS fields as alphanumeric, it works.

This is my CPYFRMIMPF code:

CPYFRMIMPF FROMSTMF('/ZIP/ZIPFILE.CSV') +
             TOFILE(*CURLIB/&EDTF) MBROPT(*REPLACE) +
             RCDDLM(*LF) STRDLM(*NONE) +
             RMVBLANK(*TRAILING) RPLNULLVAL(*FLDDFT) +
             RMVCOLNAM(*YES)
 
This is a sample of the data from the ZIPFILE.CSV

zip_code,distance,city,state
15090,99.162,"Wexford","PA"
15084,99.649,"Tarentum","PA"
15006,98.913,"Bairdford","PA"
15015,98.329,"Bradfordwoods","PA"

This is my DDS:

A          R ZIPCREC                   TEXT('ZIP RADIUS')
A            ZIP              5A         COLHDG('ZIP CODE')
A            ZDIST          4P 3     COLHDG('DISTANCE')
A            ZCITY        30A         COLHDG('CITY')
A            ZSTATE       4A         COLHDG('STATE')

Thanks for your help!!
0
Hi,

I'm currently working in AWS and trying to use a Lambda function to automate the creation of my AMIs. I'm doing this via the use of the Python script below, but when I test it it returns an error. Can anyone shed any light on what I should be looking at please?

Script:

import boto3
import collections
import datetime
import sys
import pprint

ec = boto3.client('ec2')
#image = ec.Image('id')

def lambda_handler(event, context):
   
    reservations = ec.describe_instances(
        Filters=[
            {'Name': 'tag-key', 'Values': ['backup', 'Backup']},
        ]
    ).get(
        'Reservations', []
    )

    instances = sum(
        [
            [i for i in r['Instances']]
            for r in reservations
        ], [])

    print "Found %d instances that need backing up" % len(instances)

    to_tag = collections.defaultdict(list)

for instance in instances:
    try:
        retention_days = [
            int(t.get('Value')) for t in instance['Tags']
            if t['Key'] == 'Retention'][0]
    except IndexError:
        retention_days = 7

    finally:

        #for dev in instance['BlockDeviceMappings']:
        #    if dev.get('Ebs', None) is None:
        #        continue
        #    vol_id = dev['Ebs']['VolumeId']
        #    print "Found EBS volume %s on instance %s" % (
        #        vol_id, instance['InstanceId'])

            #snap = ec.create_snapshot(
            #    VolumeId=vol_id,
      …
0
How do I calculate a linear regression in Oracle PL/SQL?
Please see the file attached.
C--Tanuja-Lake_IL-BRDs-linear-regre.docx
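
Since the attached file is not reproduced here, the following is only a sketch of the ordinary least-squares calculation itself, written in Python with made-up sample data. The same sums translate directly into a PL/SQL loop, and Oracle SQL also offers the aggregate functions REGR_SLOPE and REGR_INTERCEPT for the same fit. The function name `fit_line` and the data values are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical illustration data, not taken from the attached file.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

The slope is the ratio of the x-y cross-deviation sum to the x squared-deviation sum, and the intercept follows from requiring the line to pass through the point of means.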
0
How do I get the attached data "normalised", i.e. made into a normal distribution?

Worksheet "Data" = full data
"Filtered 25-65" = data with 25-65 s filtered from "Data" and copied into this worksheet.

If I use 25-65, it gives the best result, i.e. less skewness, but it is still not a normal distribution.

Any idea how to use that range so that the plotted data follows a normal distribution?
Data-25-65--Check-Normal--EE.xlsx
0
