Statistical Packages

125

Solutions

297

Contributors

Statistical packages are software titles, such as JMP and GNU Octave, and programming languages, such as MATLAB, R and SAS, that are used to discover, explore and analyze data and suggest useful conclusions, either to learn something unexpected or to confirm a hypothesis. The field includes the design and analysis of techniques to give approximate but accurate solutions to hard problems in statistics, econometrics, time-series, optimization and 2D- and 3D-visualization. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.

I've installed the rattle package and run this code.
library(rattle)
test <- c(1,2,3,4,5,6)
test
test2 <- binning(test,4,method = "quantile",ordered = FALSE)
test2


This is the output I get.

[1] 1.000000 1.916667 3.500000 5.083333 6.000000
Levels: [1,1.92] (1.92,3.5] (3.5,5.08] (5.08,6]


I understand that 3.5 is the median.  Where do 1.92 and 5.08 come from?
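
The cut points in that output are consistent with quantile breaks computed under the "type 8" definition from Hyndman and Fan, rather than R's default type 7 (which would give 2.25 and 4.75 for this vector). That rattle's binning uses type 8 internally is an inference from the numbers, not something stated in the post. A minimal Python sketch of the type 8 rule follows; the function name quantile_type8 is made up for illustration.

```python
import math

def quantile_type8(values, p):
    """Hyndman-Fan type 8 sample quantile (hypothetical helper for illustration)."""
    xs = sorted(values)
    n = len(xs)
    # Fractional 1-based position of the quantile in the sorted sample.
    h = (n + 1 / 3) * p + 1 / 3
    # Clamp the lower index so p = 0 and p = 1 map to the min and max.
    lo = min(max(math.floor(h), 1), n)
    hi = min(lo + 1, n)
    frac = min(max(h - lo, 0.0), 1.0)
    # Linear interpolation between the two neighbouring order statistics.
    return xs[lo - 1] + frac * (xs[hi - 1] - xs[lo - 1])

test = [1, 2, 3, 4, 5, 6]
breaks = [round(quantile_type8(test, k / 4), 6) for k in range(5)]
print(breaks)  # [1.0, 1.916667, 3.5, 5.083333, 6.0]
```

With six values, the 25% position is (6 + 1/3) * 0.25 + 1/3 = 1.9167, which falls between the first and second values and interpolates to 1.92. The 75% break lands at 5.08 in the same way.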
0
Hi,

I have a file with extension .dta

I need to analyse the data in excel

I've downloaded Stata and would like to know how to convert the .dta to a .xls or .csv

Thanks
Seamus
0
I ran into a problem in R Studio. I am trying to run a t-test and it says "grouping data must have exactly 2 levels". Anyone know how to do this t-test with our values? We are trying to use bar charts showing means with error bars by presenting the results in Excel. Also need to know how to segment results using ANOVA.
rstudio.png
0
Difficulty with one DC in a multi-site AD setup - Naming Context is in the process of being removed or is not replicated from the specified server
It appears that syncing FROM the master DC (schema, FSMO roles holder) TO the out-of-sync DC works without error, however the receiving DC cannot initiate a sync via GUI in AD Sites and Services nor can it via repadmin /replicate.

Promoted another server in the remote site to DC and was able to successfully get it working, so WAN / VPN / DNS appears to be working as expected.

Is there a way I can force the sync From the main to the out-of-sync DC and get it to pick back up again?
0
Hi there.

I am having trouble doing this in Stata: I have a dataset with the response (BOR) for each patient. The variable BOR can be CR, PR, SD, PD or NE. Each patient is in either arm A or arm B. I need to make a table in Stata with the distribution of BOR in arms A and B, and I also need to calculate a p-value (log-rank) and hazard ratio for each value of BOR.
Can anyone tell me how to do this in Stata? If not Stata, then maybe the code for an alternative statistics package? It's probably very similar.
Thank you in advance.

Best regards

Ulrich
0
Are there any good machine learning libraries for .NET that may make it an alternative to Python and R?  I've heard of ML.NET, but am not sure how good it is and how committed Microsoft are to it.
0
I am trying to find the task that automatically starts an application when a user logs in to a computer running Windows Server 2012 R.
But I do not know where it is located. Can Event Viewer show how or where the task is started from?
0
Hi Experts!

I am running into an error when using CPYFRMIMPF with numeric fields. When I run the CPYFRMIMPF command I get this error: The copy did not complete for reason code 9. When all the DDS fields are alphanumeric, it works.

This is my CPYFRMIMPF code:

CPYFRMIMPF FROMSTMF('/ZIP/ZIPFILE.CSV') +
             TOFILE(*CURLIB/&EDTF) MBROPT(*REPLACE) +
             RCDDLM(*LF) STRDLM(*NONE) +
             RMVBLANK(*TRAILING) RPLNULLVAL(*FLDDFT) +
             RMVCOLNAM(*YES)
 
This is a sample of the data from the ZIPFILE.CSV

zip_code,distance,city,state
15090,99.162,"Wexford","PA"
15084,99.649,"Tarentum","PA"
15006,98.913,"Bairdford","PA"
15015,98.329,"Bradfordwoods","PA"

This is my DDS:

A          R ZIPCREC                   TEXT('ZIP RADIUS')
A            ZIP              5A         COLHDG('ZIP CODE')
A            ZDIST          4P 3     COLHDG('DISTANCE')
A            ZCITY        30A         COLHDG('CITY')
A            ZSTATE       4A         COLHDG('STATE')

Thanks for your help!!
0
Hi,

I'm currently working in AWS and trying to use a Lambda function to automate the creation of my AMIs. I'm doing this via the use of the Python script below, but when I test it it returns an error. Can anyone shed any light on what I should be looking at please?

Script:

import boto3
import collections
import datetime
import sys
import pprint

ec = boto3.client('ec2')
#image = ec.Image('id')

def lambda_handler(event, context):
   
    reservations = ec.describe_instances(
        Filters=[
            {'Name': 'tag-key', 'Values': ['backup', 'Backup']},
        ]
    ).get(
        'Reservations', []
    )

    instances = sum(
        [
            [i for i in r['Instances']]
            for r in reservations
        ], [])

    print "Found %d instances that need backing up" % len(instances)

    to_tag = collections.defaultdict(list)

for instance in instances:
    try:
        retention_days = [
            int(t.get('Value')) for t in instance['Tags']
            if t['Key'] == 'Retention'][0]
    except IndexError:
        retention_days = 7

    finally:

        #for dev in instance['BlockDeviceMappings']:
        #    if dev.get('Ebs', None) is None:
        #        continue
        #    vol_id = dev['Ebs']['VolumeId']
        #    print "Found EBS volume %s on instance %s" % (
        #        vol_id, instance['InstanceId'])

            #snap = ec.create_snapshot(
            #    VolumeId=vol_id,
      …
0
How to calculate linear regression in oracle plsql.
Please see the file attached.
C--Tanuja-Lake_IL-BRDs-linear-regre.docx
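
The attachment is not available here, so the following is only a sketch of the ordinary least-squares arithmetic for a single predictor, written in Python so the steps are easy to port into a PL/SQL loop. The function name and sample data are made up for illustration. Oracle SQL also provides the aggregate functions REGR_SLOPE and REGR_INTERCEPT, which may remove the need for hand-written code.

```python
def simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Hypothetical data lying exactly on y = 1 + 2x.
slope, intercept = simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)  # 2.0 1.0
```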
0
How do I get the attached data "normalised", i.e. made into a normal distribution?

Worksheet "Data" = full data
"Filtered 25-65" = the rows from "Data" in the range 25-65 s, filtered and copied into this worksheet.

If I use 25-65, it gives the best result, i.e. less skewness, but still not a normal distribution.

Any ideas on how to use that range so that the plotted data follow a normal distribution?
Data-25-65--Check-Normal--EE.xlsx
0
Hi,
I would like to prepare data for regression analysis.
I can prepare data in two forms.
a) values
b) rankings.
example:
values  60,30,25,90
rankings 2,3,4,1
Which format would be most suitable or does it not matter ?
many thanks
Ian
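
The two forms are not interchangeable: rankings keep only the order of the values and discard the spacing between them, so which one is suitable depends on whether the regression should use the magnitudes. A minimal Python sketch of how the rankings in the example follow from the values (rank 1 = largest; the function name is made up, and tied values would simply receive consecutive ranks in this sketch):

```python
def descending_ranks(values):
    """Rank values so that the largest gets rank 1 (ties get consecutive ranks)."""
    # Indices of the values, ordered from largest to smallest value.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

values = [60, 30, 25, 90]
print(descending_ranks(values))  # [2, 3, 4, 1]
```

Note that 90 and 60 differ by 30 while 30 and 25 differ by only 5, yet each pair is exactly one rank apart. That difference in spacing is what is lost when only the rankings are used.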
0
Hello,

I have  a list of lists.  The lists in the list of lists are file names.  I use lapply to read and merge the contents of each list in the list of lists (3 merged contents in this case  which will be the content of 3 files).  Then, I  have to change the name of the 3 resulting files and finally I have to write the contents of the files to each file.

 lc <- list("test.txt", "test.txt", "test.txt", "test.txt")
 lc1 <- list("test.txt", "test.txt", "test.txt")
 lc2 <- list("test.txt", "test.txt")
#list of lists.  The lists contain file names
 lc <- list(lc, lc1, lc2)
#new names for the three lists in the list of lists
 new_dataFns <- list("name1", "name2", "name3")
 file_paths <- NULL
 new_path <- NULL
#add the file names to the path and read and merge the contents of each list in the list of lists
 lapply(
    lc,
    function(lc) {
     filenames <- file.path(dataFnsDir, lc)
     dataList= lapply(filenames, function (x) read.table(file=x, header=TRUE))
     Reduce(function(x,y) merge(x,y), dataList)
     #   print(dataList)

    }
  )  

#add the new name of the file to the path total will be 3 paths/fille_newname.tsv.  
 lapply(new_path, function(new_path){new_path <- file.path(getwd(), new_dataFns)

The statements above work because lc and  new_dataFns are global and I can pass them to the lapply function

#Finally, I need to write the merged contents to the corresponding file (path/name.tsv).  I tried the following statement, but this …
0
Hi
I have one PDC server 2008R2 (D2R03Q02) holding all FSMO roles and a second PDC server 2012R2 (PowerT130) that has not been replicating for more than a month.

On PDC1 the command repadmin /showrepl shows no errors.
On PDC2 the command repadmin /showrepl shows several errors.

the netdom query FSMO shows all roles on PDC D2R03Q02

Connectivity: I can ping both servers

If I try to transfer FSMO to the second PDC PowerT130 I get the  ERROR The current Operations master is offline. The role cannot be transferred.
But the PDC D2R03Q02 is up and running and I can ping it from the second PDC.

Dcdiag shows many errors and warnings on both PDCs.

Errors related to Ldap for example

or warnings like :

Warning: DcGetDcName(PDC_REQUIRED) call failed, error 1355

Warning: DcGetDcName(TIME_SERVER) call failed, error 1355
A Time Server could not be located.
The server holding the PDC role is down.
Warning: DcGetDcName(GOOD_TIME_SERVER_PREFERRED) call failed,
error 1355
A Good Time Server could not be located.

I attached the complete Dcdiag report, dcdia.txt.

The DNSLINT output looks good:

DNSLint Report

System Date: Fri Jul 20 23:42:26 2018

Command run:

dnslint /ad 192.168.1.6 /s 192.168.1.7 /v

 Root of Active Directory Forest:

    Ecole.Schulz.Local
Active Directory Forest Replication GUIDs Found:
 
DC: D2R03Q02
GUID: 1a3677e0-7a77-413b-b70d-f0ede03ff7af

DC: POWERT130
GUID: …
0
All, I am preparing for MFE (financial engineering). I could not think of any other forum where I could seek help clearing up my doubts about financial products.

Dear experts, is my below understanding correct?

Is  r>-1  coming from this statement?

1 + r

if 1+r>0, then r>-1
*****

Extract from Stochastic Calculus for Finance 1

We introduce also an interest rate r. One dollar invested in the money market at time zero will yield 1 + r dollars at time one. Conversely, one dollar borrowed from the money market at time zero will result in a debt of 1 + r at time one. In particular, the interest rate for borrowing is the same as the interest rate for investing. It is almost always true that r >= 0, and this is the case to keep in mind. However, the mathematics we develop requires only that r > -1.

Kindly guide
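
The reading in the question looks right, assuming the purpose of the condition is that one dollar invested at time zero must grow to a strictly positive amount at time one. Under that assumption, the two inequalities are the same statement:

```latex
1 + r > 0 \iff r > -1,
\qquad\text{so the discount factor } \frac{1}{1+r} \text{ is well defined and positive.}
```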
0
hi,

Anyone know how to scale out R server/service in MS SQL 2016 and later?
0
Trying to run

        Dim RetVal As Long
        RetVal = Shell("for %%i in (C D E F G H I J K L M N O P Q R S T U V W X Y Z) DO @if exist %%i: dir %%i:\*.* /s /b >E:\zzz\Zip-Bak\DayBkUp\DirectoryListHistory\%%i.txt")



Get error run time error 53 file not found?

It runs fine in a dos batch file

Trying to create a series of text files with a list of all files on each drive on my computer

Thanks
DIR-ALL-Drives.txt
0
Enabling R commands on T-SQL of MS SQL Server
I have installed Microsoft SQL Server 2017 and SQL Server Management Studio on my laptop computer with Windows 10 Professional. I am able to connect to the database from Management Studio. I have also installed R, version 3.5.0 (The R Foundation for Statistical Computing).
I learnt that from SQL server 2016 onwards, MS SQL server permits R commands on T-SQL.
I am familiar with both T-SQL and R commands.
I want to know how to enable execution of R commands on T-SQL.
0
Hi,

Does anyone know if the R server in SQL 2016 and the R and Python feature in SQL 2017 need an additional license?

I expect the answer to be no, as MS always licenses SQL Server at a single cost with all features already included.
0
As the administrator for our MS Server 2016 Essentials network, I sometimes need to view a coworker's C: drive.

I simply use WinKey R  > \\JanesCpu\c$   and supply my network admin credentials.  

For the rest of the day I can view  JanesCpu without supplying the credentials.

Clearly the admin credentials are being saved somewhere, but where are they, and how can I delete them when I am done?
0
I'm not able to send an image with a text message on my iPhone 6. After selecting the recipient from contacts, I select the camera icon to the left of the text field and select an image from my local images. The image appears at the top of the screen. Then I enter a text message and hit the green up arrow. In a second I see a red message: "Not Delivered." This is all repeatable. The person I'm sending the message to has an Android. Also, I'm in range of a WiFi signal, so I assume this is all going through WiFi and not cellular. Does anyone know what might be going on?

Thanks.
0
I am trying the code below so that if there is neither "/" nor "\" the output is an error and the application stops, but it does not work. I cannot see anything wrong with the code. Does anyone see anything wrong?

 if (grepl("/", OUTpath, fixed=TRUE)) { # mac style
     OUTpath<- paste(paste(unlist(strsplit(OUTpath, "/", fixed=TRUE)), collapse="/"), "/", sep="")
   } else
     if (grepl("\\", OUTpath, fixed=TRUE)) { # windows style
       OUTpath<- paste(paste(unlist(strsplit(OUTpath, "\\", fixed=TRUE)), collapse="\\"), "\\", sep="")
     } else
       if(!grepl("/", OUTpath, fixed=TRUE) || !grepl("\\", OUTpath, fixed=TRUE)){
       trueFalse = FALSE
       errorMessage("Unrecognized path separator in OUTpath or no path specification in PARAMS file. Cannot open connection\n
                             You can edit your input file and save the changes. Afterwards, stop and restart glycoPipe and upload file again")
       stop("Unrecognized path separator in OUTpath\n")
     }
0
Hi Experts,

I have a Windows 7 Home 64-bit machine that won't boot in normal or safe mode.

I have pulled the drive out and backed up the data, and run all tests on the memory and hard drive (all OK). While I had it out I ran chkdsk /r /f; no issues found. I'm currently downloading a Windows 7 disk so I can attempt to repair from that. Or am I best off with a W & R? Or is there something else that may assist in repairing?

Windows repair from boot-up does not fix it, although I did come across a message about a driver missing.

The user was hacked by someone who called, and he let them onto his PC. One thing I noticed is that Windows 7 is installed on the D drive.

cheers.
0
SQL Join Query.

I am trying to query 3 tables in a database and I am not getting the results back that I am expecting.

Here is the query:

SELECT DISTINCT m.ADName, i.FirstName, i.LastName, r.Description
FROM         Membership AS m INNER JOIN
                         user_information AS i ON m.ADName = i.NT_Username INNER JOIN
                         RolesDescription AS r ON m.RoleID = r.RoleID
WHERE     (m.ApplicationID = 6)

I think it's because a record does not exist in the "User_Information" table (I would like the result to show a blank field if this is the case).

I am expecting 120 records but I am getting only 106.
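
If the missing rows really are memberships with no matching row in user_information, an INNER JOIN drops them, while a LEFT JOIN keeps them and leaves the name columns NULL. The sketch below reproduces that behaviour in an in-memory SQLite database from Python. The table and column names come from the query above, but the sample rows are invented, and the same gap could also be caused by RoleIDs missing from RolesDescription.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Membership (ADName TEXT, RoleID INTEGER, ApplicationID INTEGER);
CREATE TABLE user_information (NT_Username TEXT, FirstName TEXT, LastName TEXT);
CREATE TABLE RolesDescription (RoleID INTEGER, Description TEXT);
-- Hypothetical sample data: 'carol' has no row in user_information.
INSERT INTO Membership VALUES ('alice', 1, 6), ('bob', 2, 6), ('carol', 1, 6);
INSERT INTO user_information VALUES ('alice', 'Alice', 'Adams'), ('bob', 'Bob', 'Brown');
INSERT INTO RolesDescription VALUES (1, 'Reader'), (2, 'Editor');
""")

inner_sql = """
SELECT DISTINCT m.ADName, i.FirstName, i.LastName, r.Description
FROM Membership AS m
INNER JOIN user_information AS i ON m.ADName = i.NT_Username
INNER JOIN RolesDescription AS r ON m.RoleID = r.RoleID
WHERE m.ApplicationID = 6
ORDER BY m.ADName
"""

# LEFT JOIN keeps members without user details; COALESCE turns NULL into a blank.
left_sql = """
SELECT DISTINCT m.ADName,
       COALESCE(i.FirstName, '') AS FirstName,
       COALESCE(i.LastName, '') AS LastName,
       r.Description
FROM Membership AS m
LEFT JOIN user_information AS i ON m.ADName = i.NT_Username
INNER JOIN RolesDescription AS r ON m.RoleID = r.RoleID
WHERE m.ApplicationID = 6
ORDER BY m.ADName
"""

inner_rows = conn.execute(inner_sql).fetchall()
left_rows = conn.execute(left_sql).fetchall()
print(len(inner_rows), len(left_rows))  # 2 3
print(left_rows[-1])  # ('carol', '', '', 'Reader')
```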
0
hi,

Has anyone tried to upgrade R (In-Database) in SQL 2016? I have a SQL Server 2016 AOG group with 3 SQL nodes, which I am upgrading one by one, and only the one node that has the R (In-Database) service can't be upgraded. Any ideas?
0
