Statistical Packages

Statistical packages are software titles, such as JMP and GNU Octave, and programming languages, such as MATLAB, R and SAS, that are used to discover, explore and analyze data and suggest useful conclusions, either to learn something unexpected or to confirm a hypothesis. The field includes the design and analysis of techniques to give approximate but accurate solutions to hard problems in statistics, econometrics, time-series, optimization and 2D- and 3D-visualization. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.

I have attached a spreadsheet of "Ring Numbers".
The problem is that values starting with "R4" or "R5" need a "1" inserted between the R and the 4, or between the R and the 5.
For example, "R4142" needs to become "R14142" and "R5507" needs to become "R15507" so that the data can be joined with another dataset.
I am having trouble isolating only these values and inserting the "1".
RingUtilization.csv
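One way to isolate only the "R4" and "R5" values is a regular expression with a lookahead, so that values already starting with "R1" are left alone. This is a minimal sketch in Python; the function name `fix_ring_number` is made up for illustration, and it assumes each ring number is a plain string with the "R" as its first character.

```python
import re

# Matches a leading "R" only when the very next character is 4 or 5.
_RING_PREFIX = re.compile(r"^R(?=[45])")


def fix_ring_number(ring: str) -> str:
    """Insert "1" after the leading R for values that start with R4 or R5."""
    return _RING_PREFIX.sub("R1", ring.strip())


print(fix_ring_number("R4142"))   # R14142
print(fix_ring_number("R5507"))   # R15507
print(fix_ring_number("R14142"))  # unchanged: R14142
```

The same idea carries over to a spreadsheet formula or an SQL `CASE`: test the first two characters, and only then rebuild the value as "R1" plus everything after the "R".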
0
Following earlier help, I have the query below. I added a column 'text1' and changed the Comp5 heading to Comp15, but I can't get the 'text1' and Comp15 values in the result:
DECLARE @TEMP4 TABLE
    (
        PID INT ,
        TEXT1 VARCHAR(10),
Comp1 VARCHAR(10),
 Comp2 VARCHAR(10),
Comp3 VARCHAR(10),
Comp4 VARCHAR(10),
Comp15 VARCHAR(10)
    );

INSERT INTO @TEMP4
VALUES ( 11122, '1212', NULL, NULL, NULL, NULL, NULL ) ,
         ( 12345, NULL, NULL, NULL, '123', NULL, NULL ) ,
       ( 23456, NULL, '234', NULL, 'ewr', NULL, NULL ) ,
       ( 34567, NULL, NULL, 'acc', NULL, NULL, 'def' ) ,
       ( 45678, NULL, NULL, NULL, 'jkl', NULL, NULL ) ,
       ( 56789, NULL, NULL, NULL, NULL, NULL, 'we1' ) ,
       ( 23450, NULL, 'abc', 'acc', 'exy', 'ert', 'def' );

WITH Unpivoted
AS ( SELECT *
     FROM   @TEMP4 T UNPIVOT(CompValue FOR CompType IN(Comp1, Comp2, Comp3, Comp4, Comp15)) U ) ,
     Reordered  
AS ( SELECT U.PID ,
                  U.TEXT1,
            U.CompValue ,
            'Comp' + CAST(ROW_NUMBER() OVER ( PARTITION BY U.PID
                                              ORDER BY U.CompType ASC ) AS VARCHAR(255)) AS NewCompType
     FROM   Unpivoted U )
SELECT *
FROM   Reordered R
    PIVOT (   MIN(CompValue)
              FOR NewCompType IN ( Comp1, Comp2, Comp3, Comp4, Comp15 )) P;
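Two things in the query above look like probable causes, though this is a reading of the posted code rather than a tested fix. First, `ROW_NUMBER()` only ever generates the names 'Comp1' through 'Comp5', so a pivot column called Comp15 never receives a value. Second, `UNPIVOT` skips NULL inputs, so PID 11122, which has a TEXT1 value but no Comp values, drops out completely. The intended result seems to be "shift the non-NULL Comp values to the left and keep every other column"; the Python sketch below shows that logic on one row. The function name `compact_left` is hypothetical.

```python
COMP_COLUMNS = ["Comp1", "Comp2", "Comp3", "Comp4", "Comp15"]


def compact_left(row, comp_columns=COMP_COLUMNS):
    """Shift non-None Comp values to the left; keep all other columns as they are."""
    values = [row[c] for c in comp_columns if row[c] is not None]
    padded = values + [None] * (len(comp_columns) - len(values))
    # Copy the non-Comp columns (PID, TEXT1) through untouched.
    result = {k: v for k, v in row.items() if k not in comp_columns}
    result.update(zip(comp_columns, padded))
    return result


row = {"PID": 34567, "TEXT1": None, "Comp1": None, "Comp2": "acc",
       "Comp3": None, "Comp4": None, "Comp15": "def"}
print(compact_left(row))
```

Because the row is processed as a whole, a row whose Comp values are all empty still comes out with its PID and TEXT1 intact.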
0
I have a dataset in which every observation has an account number and a "days past due" code. For each account, as soon as the "days past due" column hits a code like "DLQ3", I want to remove the rest of the rows for that account (even if DLQ3 is the first observation for that account).

My dataset looks like :

Observation date Account num   Days past due

2016-09                           200056              DLQ1
2016-09                           200048              DLQ2
2016-09                           389490              NORM
2016-09                           383984              DLQ3.....

So for account 383984, I want to remove all the rows after 2016-09, because the account is now in default.

In short, I want to see when each account first hits DLQ3 and remove all of that account's rows after that first DLQ3 observation.
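The rule described above (keep each account's rows up to and including its first DLQ3, drop everything later) can be sketched as follows. This is only an illustration in Python: the function name `truncate_after_default` and the sample rows are made up, and it assumes the observation dates are "YYYY-MM" strings, which sort correctly as text.

```python
def truncate_after_default(rows, default_code="DLQ3"):
    """Keep each account's rows up to and including its first default_code row.

    rows is an iterable of (obs_date, account, status) tuples.
    """
    defaulted = set()
    kept = []
    # Sort by account, then by date, so "first DLQ3" means earliest in time.
    for obs_date, account, status in sorted(rows, key=lambda r: (r[1], r[0])):
        if account in defaulted:
            continue  # this account already hit the default code earlier
        kept.append((obs_date, account, status))
        if status == default_code:
            defaulted.add(account)
    return kept


sample = [
    ("2016-09", 383984, "DLQ3"),
    ("2016-10", 383984, "DLQ3"),
    ("2016-11", 383984, "NORM"),
    ("2016-09", 200056, "DLQ1"),
    ("2016-10", 200056, "DLQ2"),
]
for row in truncate_after_default(sample):
    print(row)
```

In a statistical package the same effect usually comes from a grouped running flag: within each account, count how many DLQ3 rows have been seen so far and keep only the rows seen before the count passes one.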
0
I have two Java projects that together do replication: an RMIReplication project and a publisher. In RMIReplication I create an ArrayList of Subjects, and in the publisher I need to access this ArrayList to do the attach and setstate calls. How can I do this? Below is the code of the two classes from the different projects.
public class Replication {

	 //static ArrayList<Subject> theList ;
	static ArrayList<Subject> theList;
	public static void main(String args[]){
		//System.setProperty("java.rmi.server.hostname","192.168.1.92");
		
		//System.setProperty("java.rmi.server.hostname","192.168.1.92");
				theList =new ArrayList<Subject>();
				Registry r=null;
				Registry r1=null;
				Registry r2=null;
				
				try{
					r = LocateRegistry.createRegistry(2023);
					r1 = LocateRegistry.createRegistry(2024);
					r2 = LocateRegistry.createRegistry(2025);
					
				}catch(RemoteException a){}
				//System.setSecurityManager(new RMISecurityManager());
				 try{
					 
					Subject list = new Event();
					
					Subject list1 = new Event();
					Subject list2 = new Event();
					System.out.println("1");
		            	Naming.rebind("//localhost:2023/Subject", (Remote) list );
		            	System.out.println("2");
		            	Naming.rebind("//localhost:2024/Subject1", (Remote) list1 );
		            	Naming.rebind("//localhost:2025/Subject2", (Remote) list2 );
		            	System.out.println("3");
		            	theList.add( list);
		            	theList.add(list1);
		            	

0
I can upload a file into RStudio Server using Shiny. However, I cannot open the file for editing and saving so that it can be processed by the R code. Apparently, the rhandsontable library function rhandsontable() takes a data frame as input. I am using the code below:

    DF = as.data.frame(read.delim(inFile$datapath))
    rhandsontable(DF, width = 550, height = 300)

but I get the message:

Warning: Error in as.data.frame.default: cannot coerce class "c("rhandsontable", "htmlwidget")" to a data.frame

Is there a solution for this, or is there any other way to open and edit file using shiny?

EDIT:
My code was completely wrong. If anyone wants to find out more about the R rhandsontable package, please post a question. I will be happy to help.
0
A user I support needs me to purchase the JMP software (which is available from the https://www.sas.com/jmpstore/software/jmp/prodJMP.html website).

I have visited this website and have found that this software is being sold for $1,700.

Are there any other places or vendors where the full version of this software can be purchased for less?
0
I have an Excel spreadsheet that has a column containing either b, g or r.
I would like to create a formula in a separate column that converts this to Blue, Green or Red.

Gordon
0
While installing SQL Server 2017 we were asked to "Accept" the terms for "Microsoft R Open". We looked it up and it appears to be some sort of open source software, but what is it to us as Microsoft product users? Why do we have to click "Accept" for it in this SQL install, and how can we take advantage of "Microsoft R Open"?
0
I have a data frame with some names (rows) and some positions that I want to use, for instance the 35th row and the 145th row. How do I get the names of the rows at these positions? I uploaded a screenshot that may help. Thanks!

I tried something like

names1 <- row.names(which(size_96 < median(size_96, na.rm= T)))
0
Hi, I have one data frame (df1) with 20 observations (one for each year) and 597 variables (each one is one stock). The values are book-to-market ratios. For each year I need to build two portfolios: one of the stocks with values below the median and one of the stocks with values above the median. The names of the stocks are the column names of df1. So I need to check whether each value in each row (each year) is below or above that row's median and identify the corresponding stock name (column in df1). Then I need to match it with the columns of another data frame (df2), which holds the return of each stock in each year (20x597). The end result would be a vector with 20 entries, which are the differences in average returns between the two portfolios. I hope that was clear enough. Thanks for any answers, and I'm here for any explanation.
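The year-by-year procedure described above can be sketched as below. This is a rough illustration in Python rather than R, with several assumptions: each year is a dict from stock name to value, missing values are `None`, stocks exactly at the median go into neither portfolio, and the difference is taken as "low minus high" (the post does not say which direction is wanted). The function name `low_minus_high` and the sample numbers are made up.

```python
from statistics import mean, median


def low_minus_high(btm_by_year, ret_by_year):
    """Return one spread per year: mean return of low-B/M stocks minus high-B/M stocks.

    Both arguments are lists (one entry per year) of dicts {stock_name: value}.
    Raises statistics.StatisticsError if a year has no usable stocks on one side.
    """
    spreads = []
    for btm, ret in zip(btm_by_year, ret_by_year):
        # Use only stocks that have both a ratio and a return in this year.
        valid = {s: v for s, v in btm.items()
                 if v is not None and ret.get(s) is not None}
        cut = median(valid.values())
        low = [ret[s] for s, v in valid.items() if v < cut]
        high = [ret[s] for s, v in valid.items() if v > cut]
        spreads.append(mean(low) - mean(high))
    return spreads


btm = [{"A": 0.2, "B": 0.5, "C": 0.9, "D": 1.4}]
ret = [{"A": 0.10, "B": 0.06, "C": 0.02, "D": 0.04}]
print(low_minus_high(btm, ret))
```

In R the same steps would typically be a loop or `sapply` over the rows of df1, using the row's median to build a logical index into the matching row of df2.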
0
I have a table with data for R. Virgo (3 rows). I want to query only the most recent row, not all of the data, with this code:

    Private Sub DISPLAY()
        conn.Close()
        Dim cmd2 As New MySqlCommand
        Dim myDA2 As New MySqlDataAdapter(cmd2)
        Dim myDT2 As New DataTable
        cmd2.Connection = conn
        cmd2.CommandText = "SELECT * FROM esd_reco where IDNumber = '" & txtG.Text & "' ORDER BY Date DESC"
        myDA2.Fill(myDT2)
        dghistory.DataSource = myDT2
        conn.Close()

txtG.text is the ID Number
Help please, thank you.
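Since the query already sorts by Date in descending order, the usual way to get only the newest row is to add `LIMIT 1`, which MySQL supports. The sketch below shows the idea in Python with the standard-library `sqlite3` module standing in for MySQL; the `Remarks` column, the sample rows and the helper names are invented for the demo. It also passes the ID as a query parameter instead of concatenating the text box value into the SQL string.

```python
import sqlite3


def latest_record(conn, id_number):
    """Return the newest row for one ID number, or None if nothing matches."""
    cursor = conn.execute(
        "SELECT * FROM esd_reco WHERE IDNumber = ? ORDER BY Date DESC LIMIT 1",
        (id_number,),
    )
    return cursor.fetchone()


def demo_connection():
    """Build a throwaway in-memory table (hypothetical columns and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE esd_reco (IDNumber TEXT, Date TEXT, Remarks TEXT)")
    conn.executemany(
        "INSERT INTO esd_reco VALUES (?, ?, ?)",
        [
            ("1001", "2017-01-15", "first"),
            ("1001", "2017-03-01", "latest"),
            ("1001", "2017-02-10", "second"),
        ],
    )
    return conn


print(latest_record(demo_connection(), "1001"))
```

In the VB.NET code the equivalent change would be to append `LIMIT 1` to the command text and supply the ID through a command parameter.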
0
Hi,

I've logged into a Microsoft R Server using mrsdeploy::remoteLogin()

Test with session:

REMOTE> result <- system("gpg --yes --batch -r [e-mail] --passphrase=[youPassphrase] --armor --utf8-strings --decrypt youFile", intern = TRUE)

REMOTE> result
character(0)
attr(,"status")
[1] 2

REMOTE> exit
>Logout from remote R session complete

Test without session:

result <- system("gpg --yes --batch -r [e-mail] --passphrase=[youPassphrase] --armor --utf8-strings --decrypt youFile", intern = TRUE)

gpg: encrypted with 2048-bit RSA key, ID XXXXXXX, created 2017-11-20 "name<e-mail>"

>result
[1] "Esta es la frase\r"
[2] "que he encriptado\r"

I need this to work in the remote session, because that is how it will run as a service.


Thanks for your reply.

Kind regards,
0
Below is a VBA macro that writes a formula for subtotalling filtered data. It works fine if I hard-code the column number in the section of the formula towards the right.
However, I would like to define the column number as a variable, e.g. column number 2 or column number 6.
I have seen some suggestions for how this might work, but VBA doesn't seem to like the formula I am using.
I know what I have put in is wrong, but I am trying to figure out the correct version, if there is one.

Sub Analysis_Testdev()
 Dim ColNumberx As Integer
 ColNumberx = 2


   
    Sheets("WsheetA").Select
    Range("B10").Select
    Application.CutCopyMode = False
'
    ActiveCell.FormulaR1C1 = _
        "=IF('123WsheetA'!RC[-1]>-.1,SUMPRODUCT(SUBTOTAL(109,OFFSET('123WsheetA'!R48C5,ROW('WsheetA'!R49C5:R20000C5)-ROW('WsheetA'!R48C5),,1)),--('WsheetA'![R49C&ColNumberx]:[R20000C&ColNumberx]='WsheetA'!RC[-1])),"""")"
           
           
    ' my problem is getting this bit of the above equation to work in that I want the Column Number to be a variable ColNumberx and i cant seem to get it to work
       '   --('WsheetA'![R49C&ColNumberx]:[R20000C&ColNumberx]='WsheetA'!RC[-1])),"""")"
       ' the aim is to create a valid formula in B10 based on whatever column I choose and then copy it to cells B10:B29 but I cant get the column number based on a variable right
           
       
    Range("B10").Select
    Selection.Copy
    Range("B10:B29").Select
    ActiveSheet.Paste
   
   
 
End Sub
0
Why the R Programming Language Will Become Your Go To Language
When you discover the power of the R programming language, you are going to wonder how you ever lived without it! Learn why the language merits a place in your programming arsenal.
0
I had this question after viewing How to change this macro to find last column instead of A to J?.

Sub Commas2Rows()
  ' hiker95, 05/18/2017, ME1006027
  Dim lr As Long, lc As Long, r As Long, s, i As Long
  Application.ScreenUpdating = False
  With ActiveSheet
    lr = .Cells(Rows.Count, 1).End(xlUp).Row
    lc = .Cells(2, Columns.Count).End(xlToLeft).Column
    For r = lr To 2 Step -1
      If InStr(.Range("G" & r), ", ") Then
        s = Split(.Range("G" & r), ", ")
        .Rows(r + 1).Resize(UBound(s)).Insert
        .Range("G" & r).Resize(UBound(s) + 1) = Application.Transpose(s)
        .Range("A" & r + 1 & ":F" & r + 1).Resize(UBound(s)).Value = .Range("A" & r & ":F" & r).Value
'        .Range("H" & r + 1 & ":J" & r + 1).Resize(UBound(s)).Value = .Range("H" & r & ":J" & r).Value
        .Range("H" & r + 1 & ":" & Chr(64 + lc) & r + 1).Resize(UBound(s)).Value = .Range("H" & r & ":" & Chr(64 + lc) & r).Value
      End If
    Next r
  End With
  Application.ScreenUpdating = True
End Sub
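One thing to watch in the macro above: `Chr(64 + lc)` produces a single character, so it only gives a valid column letter for columns 1 to 26 (A to Z). If the last used column is AA or beyond, the range address breaks. The conversion from a column number to its letters is a base-26 calculation without a zero digit; the Python sketch below shows the arithmetic (the function name `column_letters` is made up). In VBA itself, addressing by number with `.Cells(row, lc)` avoids the conversion altogether.

```python
def column_letters(n: int) -> str:
    """Convert a 1-based column number to spreadsheet letters (1 -> A, 27 -> AA)."""
    if n < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while n > 0:
        # Subtract 1 first because there is no "zero" letter in this numbering.
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


for number in (1, 26, 27, 52, 703):
    print(number, column_letters(number))
```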
0
I have to install an R program and several open source proteomics programs on a Linux box. The R program calls the open source proteomics programs. The pipeline starts with one of the programs taking a raw data file as input, and all the programs produce output files, some of which may be input to other programs in the pipeline. The users will be able to run the R program online.
1. Is it possible to do this? I am using the open source version of RStudio, which is single threaded (a user's responses to program requests wait until the previous user in the chain finishes running the pipeline). This implies that the proteomics programs called by R will be called by a single user at a time.

2. Is there any way of synchronizing the Linux box and the users' PCs so that the files can be created on both the Linux server and the PCs? Otherwise the users will have to send the raw data file to the Linux box and import data files from the Linux box to their PCs.
0
The R program that we have calls other open source proteomics software installed on the users' PCs. I am going to install RStudio Server and Shiny Server on Linux so that everyone can access the program remotely. The idea is to keep the proteomics software on the users' PCs and install only the R program on the server. Is this possible? Will the server be able to locate the software installed on the users' PCs?
0
Guys, I have code here that works with numeric values only.
I need code that generates the next number from whatever the last number is, whether or not I changed the last one. Here is my code for numeric values, but I need to generate the next number for alphanumeric values.

Dim Ws As Worksheet
    Dim r As Long
    Dim ReqNo As String
    Set Ws = Worksheets("MainRecord")
getNum:
    r = Ws.Cells(Ws.Rows.Count, "D").End(xlUp)
    ReqNo = CODES.Text & "-" & r + 1
    If IsNumeric(Application.Match(ReqNo, Worksheets("MainRecord").Range("D:D"), False)) Then GoTo getNum
    On Error Resume Next
    Me.REQUESTNO.Value = ReqNo

I don't know what code to use to generate an alphanumeric number.
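A common approach for alphanumeric IDs is to split off the trailing digits, add one, and put the prefix back, keeping any leading zeros. The sketch below shows that logic in Python. The function name `next_request_no` and the sample IDs are hypothetical, and the rule for an ID with no trailing digits (append "1") is just an assumption.

```python
import re


def next_request_no(last: str) -> str:
    """Increment the trailing number of an alphanumeric ID, keeping zero padding."""
    match = re.search(r"(\d+)$", last)
    if match is None:
        # Assumption: an ID without trailing digits starts its counter at 1.
        return last + "1"
    digits = match.group(1)
    incremented = str(int(digits) + 1).zfill(len(digits))
    return last[: match.start()] + incremented


print(next_request_no("ABC-009"))  # ABC-010
print(next_request_no("REQ-99"))   # REQ-100
print(next_request_no("X7"))       # X8
```

In VBA the same steps would be: walk back from the end of the string while the characters are digits, convert that part with `CLng`, add one, and rebuild the string with `Format` to restore the padding.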
0
In process of rebuilding...installed SSD...Win 7 / 64 Pro...new RAM...
All drivers installed...correctly...

HOWEVER...this bugger will NOT update BIOS...

According to Belarc which I ran before the rebuild...I have BIOS v2.7...
The latest BIOS on Toshiba download site is BIOS v4.10
Every time I try to install the BIOS I get the error message..."This computer is not supported"...

Any suggestions appreciated...
0
We have external consultants who will be stationed at our office to do
data warehouse statistical analysis using R & Python.
What are the risks to watch out for? We provide hardened PCs.

Should we disallow Internet access?
Are any patches needed?
Is there any secure coding standard to adhere to?
0
I have this query that returns the count within each dollar range of quotes. I would like to add the sum of the dollar value of those quotes as well. Is that possible within this one query? If so, how would you do it?

SELECT T.RANGE AS [Range_Value],COUNT(*) AS [Number_of_Quotes]
FROM (
      SELECT CASE
            WHEN CONT_AMNT >=0 AND CONT_AMNT <= 1000 THEN ' 1) $0-$1000'
            WHEN CONT_AMNT >1000   AND CONT_AMNT <= 5000 THEN ' 2) $1000-$5000'
            WHEN CONT_AMNT >5000   AND CONT_AMNT <= 10000 THEN ' 3) $5000-$10000'
            WHEN CONT_AMNT >10000  AND CONT_AMNT <= 50000 THEN ' 4) $10000-$50000'
            WHEN CONT_AMNT >50000  AND CONT_AMNT <= 100000 THEN ' 5) $50000-$100000'
            WHEN CONT_AMNT >100000 AND CONT_AMNT <= 150000 THEN ' 6) $100000-$150000'
            WHEN CONT_AMNT >150000 AND CONT_AMNT <= 200000 THEN ' 7) $150000-$200000'
            WHEN CONT_AMNT >200000 AND CONT_AMNT <= 500000 THEN ' 8) $200000-$500000'
            WHEN CONT_AMNT >500000 AND CONT_AMNT <= 1000000 THEN ' 9) $500000-$1000000'
            WHEN CONT_AMNT >1000000 THEN '10) $1000000+'
     END AS RANGE
       FROM (
         SELECT SUM(TOTAL_SELL_PRC) AS CONT_AMNT,CUSTOMER_NUMBER AS CUSTNMBR
   FROM [CSTQUTHD]
   WHERE DATE_OF_QUOTE>= '2016/01/01' AND DATE_OF_QUOTE<'2016/12/31'
   GROUP BY CUSTOMER_NUMBER) R ) T
   GROUP BY T.RANGE
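The result being asked for is a count and a sum per range in the same pass. In the query above that probably means also selecting CONT_AMNT in the derived table T, so that the outer query can add `SUM(T.CONT_AMNT)` next to `COUNT(*)`; that is a reading of the posted query, not a tested change. The Python sketch below shows the same bucketing logic with both aggregates. The function name `summarize` and the sample totals are made up, and negative totals are skipped here (the SQL `CASE` would give them a NULL range instead).

```python
from bisect import bisect_left

# Inclusive upper bound of each range, mirroring the CASE expression.
UPPER_BOUNDS = [1000, 5000, 10000, 50000, 100000, 150000, 200000, 500000, 1000000]
LABELS = [
    " 1) $0-$1000", " 2) $1000-$5000", " 3) $5000-$10000",
    " 4) $10000-$50000", " 5) $50000-$100000", " 6) $100000-$150000",
    " 7) $150000-$200000", " 8) $200000-$500000", " 9) $500000-$1000000",
    "10) $1000000+",
]


def summarize(customer_totals):
    """Return {range_label: (number_of_quotes, sum_of_amounts)}."""
    summary = {}
    for amount in customer_totals:
        if amount < 0:
            continue  # assumption: ignore negative totals
        # bisect_left finds the first bound >= amount, i.e. the "<= bound" bucket.
        label = LABELS[bisect_left(UPPER_BOUNDS, amount)]
        count, total = summary.get(label, (0, 0))
        summary[label] = (count + 1, total + amount)
    return summary


for label, (count, total) in sorted(summarize([250, 1000, 1200, 75000, 2500000]).items()):
    print(label, count, total)
```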
0
I don't think SQL Server can store a table with 190,005 columns and 5000 rows.

What I have now.
Right now what I have is a table with 5001 observations. Each observation has 190,005 possible scenarios.  That's 950,215,005 rows.

My objective.
I would like to pivot this table so that the observation is the first column and all the possible outcomes are the remaining columns.
The observation is an int and the possible outcomes are bit (basically yes or no for each outcome).

I am using Microsoft SQL server 2016.

My goal.
I am trying to create a matrix to do correlation analysis. I will be bringing the data into R somehow; I am sure that's going to be a challenge as well.

If SQL can't do it that's fine, but how then can I get this data into a text file from SQL?

Basically Type becomes columns and the isTrue flag now goes horizontally, so there is 1 row per Key. Remember there are 190,005 different Types for each member. Yes, I do know every possible Type value. For the sake of this example I didn't think it would be wise to list all 190,005.
Here are my columns
Key, Type, isTrue
1,V3201,1
1,F123,0
1,S20341,0
2,99080,1
2,R4570,1
2,S26995,0
3,D4567,0
3,SX34526,0
3,F5678,1
3,E9807,1
3,ST5688,0

Thank you in advance for your help.
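SQL Server limits a table to 1,024 columns (30,000 with sparse columns), so a 190,005-column table is indeed out of reach, and the pivot has to happen outside the database. One option is to export the long Key, Type, isTrue rows sorted by Key and stream them into a wide text file one Key at a time, so only a single output row is held in memory. The sketch below shows this in Python. The function name `write_wide` is made up, it assumes the input rows arrive grouped by Key, and it assumes a Type that is missing for a Key should be written as 0.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def write_wide(rows, types, out):
    """Write one CSV line per Key with one 0/1 column per Type.

    rows must be (key, type_code, is_true) tuples already grouped by key.
    """
    position = {t: i for i, t in enumerate(types)}
    writer = csv.writer(out)
    writer.writerow(["Key", *types])
    for key, group in groupby(rows, key=itemgetter(0)):
        flags = [0] * len(types)  # assumption: a missing Type counts as 0
        for _, type_code, is_true in group:
            flags[position[type_code]] = int(is_true)
        writer.writerow([key, *flags])


long_rows = [(1, "V3201", 1), (1, "F123", 0), (2, "R4570", 1)]
buffer = io.StringIO()
write_wide(long_rows, ["V3201", "F123", "R4570"], buffer)
print(buffer.getvalue())
```

For a correlation analysis on mostly-zero flags, it may also be worth reading the long format straight into a sparse matrix in R instead of materializing the wide file at all.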
0
I am getting the following error when running make while trying to cross-compile OpenSSL:

make[1]: mips-linuxar: Command not found
Makefile:652: recipe for target 'libcrypto.a' failed
make[1]: *** [libcrypto.a] Error 127
make[1]: Leaving directory '/home/dev/openwrt/package/openssl-1.1.0f'
Makefile:128: recipe for target 'all' failed
make: *** [all] Error 2


From the Makefile:

PLATFORM=linux-generic32
OPTIONS=--cross-compile-prefix=mips-linux no-asan no-crypto-mdebug no-crypto-mdebug-backtrace no-ec_nistp_64_gcc_128 no-egd no-fuzz-afl no-fuzz-libfuzzer no-heartbeats no-md2 no-msan no-rc5 no-sctp no-ssl-trace no-ssl3 no-ssl3-method no-ubsan no-unit-test no-weak-ssl-ciphers no-zlib no-zlib-dynamic
CONFIGURE_ARGS=("linux-generic32", "--cross-compile-prefix=mips-linux")

ARFLAGS=
AR=$(CROSS_COMPILE)ar $(ARFLAGS) r


Line 652
      $(AR) $@ $?

I am new to this. Please tell me how I can make this question more useful.

Thank you.
Makefile
Configure
0
How do I find the return on multiple, irregular inflows and outflows over a short period, let's say 4 months?

I don't want to use XIRR, as it gives the compounded annualized return, which would be misleading. I want to calculate the return for the duration of the investment only.

For example, please refer to the attached Excel sheet, which lists 5 buy and sell transactions over a 4-month period.

Thanks and regards

R
Question.xlsx
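One widely used way to get a non-annualized return for the holding period with irregular cash flows is the Modified Dietz method. It is named here as a possible alternative to XIRR, not as something taken from the attached workbook. Each flow is weighted by the fraction of the period it was invested:

return = (end value - start value - net flows) / (start value + sum of weighted flows)

The Python sketch below is a minimal illustration; the function name `modified_dietz` and the sample numbers are made up, and flows are signed so that money put in is positive and money taken out is negative.

```python
from datetime import date


def modified_dietz(start_value, end_value, flows, start, end):
    """Holding-period return (not annualized) using the Modified Dietz formula.

    flows is a list of (date, amount); amount > 0 is money invested,
    amount < 0 is money withdrawn.
    """
    total_days = (end - start).days
    net_flow = sum(amount for _, amount in flows)
    # Weight each flow by the share of the period it was actually invested.
    weighted_flow = sum(
        amount * (end - when).days / total_days for when, amount in flows
    )
    return (end_value - start_value - net_flow) / (start_value + weighted_flow)


period_return = modified_dietz(
    start_value=0,
    end_value=1100,
    flows=[(date(2017, 1, 1), 1000)],
    start=date(2017, 1, 1),
    end=date(2017, 5, 1),
)
print(period_return)
```

Another option is to keep XIRR and convert it back to the holding period with (1 + XIRR)^(days / 365) - 1.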
0
Hi,
I received a new computer last week but cannot get Task Scheduler to successfully run a VBS script. It says the script is running, but the spreadsheets are not updating. I created a new script to open Notepad and type a few words; Task Scheduler says it is running and ran successfully, but it did not: Notepad never opened.

I can manually run both scripts and they work fine.  Spreadsheets get updated and Notepad opens, types a few words, and stays open until I close Notepad.

My company's computer support has not been able to help at this time and I am currently waiting for the next level of help.  However, that could be more than a week until someone contacts me.

Attached are some screen shots I hope will help, with some identifying text blacked out with a description of what info was there.

Would any of you have any ideas as to why the scripts work when I double click on them manually to get them to run but Task Scheduler will not successfully get them to run, even though it says they are running?

Thanks.
0
