Python

Python is a widely used general-purpose, high-level programming language. Its design philosophy emphasizes code readability, and its syntax allows programmers to express concepts in fewer lines of code than would be possible in other languages. Python supports multiple programming paradigms, including object-oriented, imperative, functional, and procedural styles. It features a dynamic type system and automatic memory management, and has a large and comprehensive standard library complemented by third-party packages such as NumPy, SciPy, Django, PyQuery, and PyLibrary.


OK, EE members, I appreciate that EE doesn't have as many programmers as Stack Overflow, but I joined in the hope that the programmers here will assist me.


I was presented with the following Apache Spark Python question:

Sample and Randomly split data to create training and test datasets and persist the training dataset to disk.

In order to successfully complete the question I need to write a function that achieves the following:

  1. CC1_TrainTestSpit.csv
  2. Samples 20 percent of the DataFrame without replacement
  3. Randomly splits the sampled data to create train and test datasets with weights .8 and .3 respectively
  4. Persists the training dataset using DISK_ONLY
  5. Uses the summary function on the training dataset to return a DataFrame of the statistics mean, min, max
  6. Returns, in order, the training dataset, the test dataset, and the summary DataFrame

The function should take the form of:

def traintestSplit(df: DataFrame):
    return (trainDF, testDF, statsDF)

The dataset is attached.

Can someone assist me?
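The steps listed in the question could be sketched roughly as follows, assuming the PySpark DataFrame methods sample, randomSplit, persist and summary. The snake_case function name and the seed and storage_level parameters are illustrative additions that do not appear in the question. The weights 0.8 and 0.3 are copied from the task as written; randomSplit normalizes weights that do not sum to 1.

```python
def train_test_split(df, seed=42, storage_level=None):
    """Sample 20% of df, split it, persist the training part, summarize it.

    df is expected to be a pyspark.sql.DataFrame. seed and storage_level are
    hypothetical parameters added for this sketch.
    """
    if storage_level is None:
        # Imported lazily so the default is only resolved when Spark is present.
        from pyspark import StorageLevel
        storage_level = StorageLevel.DISK_ONLY

    # Step 2: sample 20 percent without replacement.
    sampled = df.sample(withReplacement=False, fraction=0.2, seed=seed)

    # Step 3: random split using the weights given in the task.
    train_df, test_df = sampled.randomSplit([0.8, 0.3], seed=seed)

    # Step 4: keep the training set on disk only.
    train_df.persist(storage_level)

    # Step 5: summary restricted to mean, min and max.
    stats_df = train_df.summary("mean", "min", "max")

    # Step 6: return in the requested order.
    return train_df, test_df, stats_df
```

With a live SparkSession, the input would typically be loaded with something like spark.read.csv(path, header=True, inferSchema=True) before being passed in; the exact options depend on the attached file.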
0

I've got a Python script that traverses S3 Buckets and prints out what folders and files have public permissions. This can be handy when auditing AWS for potential security issues.

Right now, the script runs fine, but takes a long time to run, due to a CDN that has known public permissions. I would like to exclude that bucket when I run the script.

Can someone please help me create a line in the script that allows me to EXCLUDE a particular bucket? Let's call the bucket I want to exclude "cdn-twt" for the sake of this script.

Thanks in advance for your assistance.

#This Script will use Paginator to print result for each bucket, executed in multiple threads
import boto3
import threading
import os.path

ACCESS_KEY = 'A*****************A'
SECRET_ACCESS_KEY = 'P******************************2'

session = boto3.Session(aws_access_key_id = ACCESS_KEY, aws_secret_access_key = SECRET_ACCESS_KEY)

maxthreads = 5
sema = threading.Semaphore(value=maxthreads)

def list_object(bucket):
    try:
        s3 = session.client('s3')
        flag1 = objcount = 0
        paginator = s3.get_paginator('list_objects')
        page_iterator = paginator.paginate(Bucket= bucket)
        for page in page_iterator:
            if 'Contents' in page:
                for obj in page['Contents']:
                    uniobj = obj['Key'].encode('ascii', 'ignore').decode('ascii')
                    objAcl = s3.get_object_acl(Bucket=bucket, Key=obj['Key'])
                    flag2 = 0
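One way to skip a bucket is to filter the list of bucket names before any threads are started. The sketch below assumes the names come from the S3 client's list_buckets() call, whose response holds a Buckets list of dicts with a Name key. EXCLUDED_BUCKETS and buckets_to_scan are hypothetical names, since the excerpt is cut off before the part that enumerates buckets.

```python
# Hypothetical names introduced for illustration; not part of the original script.
EXCLUDED_BUCKETS = {"cdn-twt"}

def buckets_to_scan(list_buckets_response, excluded=EXCLUDED_BUCKETS):
    """Return bucket names from a list_buckets() response, minus excluded ones."""
    return [
        bucket["Name"]
        for bucket in list_buckets_response.get("Buckets", [])
        if bucket["Name"] not in excluded
    ]

# Stand-in for session.client('s3').list_buckets(), so the sketch runs offline.
response = {"Buckets": [{"Name": "art-bucket"}, {"Name": "cdn-twt"}, {"Name": "logs"}]}
print(buckets_to_scan(response))  # ['art-bucket', 'logs']
```

Each remaining name could then be handed to list_object() exactly as the script already does for every bucket.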
   


0
I have a Python script, running on Python 3.7.2, that I have pieced together. The script first accepts user input for a string, stores it in a variable, and then passes it to each function call in turn.

Depicted here:

while True:
    uin = input("Please enter a uin or 'enter' to quit: ")
    if not uin:
        break
    startTime = datetime.now()
    selectCustomer(uin)
    selectVehicle(uin)
    selectHRO(uin)
    selectHLABOR(uin)
    selectHPARTS(uin)
    selectINV(uin)
    selectSUPPLIER(uin)
    selectLABOR_OP(uin)
    selectSOURCE(uin)
    selectRO(uin)
    selectLABOR(uin)
    selectPARTS(uin)
    selectEHRO(uin)
    selectEHLABOR(uin)
    selectEHPARTS(uin)
    elapsed = datetime.now() - startTime  # a timedelta, not a number of seconds
    print('It took', elapsed, 'to run...')
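Because every call in the loop above takes the same argument, the sequence could also be driven from a list of functions. This is only a sketch: run_all, select_a and select_b are placeholder names, and time.perf_counter is used here in place of datetime so that the result is a plain number of seconds.

```python
import time

def run_all(uin, steps):
    """Call every function in steps with uin and return the elapsed seconds."""
    start = time.perf_counter()
    for step in steps:
        step(uin)
    return time.perf_counter() - start

# Placeholder steps; in the script these would be selectCustomer, selectVehicle, ...
def select_a(uin):
    pass

def select_b(uin):
    pass

elapsed = run_all("CO004", [select_a, select_b])
print("It took {:.3f} seconds to run...".format(elapsed))
```

Adding or removing an export then means editing one list rather than the body of the loop.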



Each function is, for the most part, identical and consists of:

  • opening up a SQL connection
  • running a query
  • storing the output in an ordered dictionary
  • creating a JSON file and naming it with the user input (in our case UIN)
  • closing connection and file

Depicted here:

def selectEHPARTS(uin, output_dir=r"C:\Code\TestData"):
    driver = 'DRIVER={SQL Server};'
    host = 'SERVER=SQL2;'
    database = 'DATABASE=ROWSTAGE;'

    connectionstr = driver + host + database
    conn = pyodbc.connect(connectionstr)
    cursor = conn.cursor()

    cursor.execute("SELECT STORE_UIN, RO_NO, PARTNO, PARTDESC, ITEM_NO, 


0
I'm using Python 3.7.4

classdemo.py
#Base Class
class Staff:
    def __init__ (self, pPosition, pName, pPay):
        self._position = pPosition
        self.name = pName
        self.pay = pPay
        print('Creating Staff object')

    def __str__(self):
        return "Position = %s, Name = %s, Pay = %d" %(self._position, self.name, self.pay)

    @property
    def position(self):
        print("Getter Method")
        return self._position

    @position.setter
    def position(self, value):
        if value == 'Manager' or value == 'Basic':
            self._position = value
        else:
            print('Position is invalid.  No changes made.')
    
    def calculatePay(self):
        prompt = '\nEnter number of hours worked for %s: ' %(self.name)
        hours = input(prompt)
        prompt = 'Enter the hourly rate for %s: ' %(self.name)
        hourlyRate = input(prompt)
        self.pay = int(hours)*int(hourlyRate)
        return self.pay

#Derived class
class ManagementStaff(Staff):
    def __init__ (self, pName, pPay, pAllowance, pBonus):
        super().__init__('Manager', pName, pPay)
        self.allowance = pAllowance
        self.bonus = pBonus

    def calculatePay(self):
        basicPay = super().calculatePay()
        self.pay = basicPay + self.allowance  # was self.Pay, which created a separate attribute
        return self.pay

    def calculatePerfBonus(self):
        prompt = 'Enter performance grade for %s: ' %(self.name)
        grade = input(prompt)
        if (grade == 'A'):
           


0
I've written some simple Python scripts and now I would like to write a Python script that can be installed and run as a service.

I know how to take a Python program and run it as a service but how would I write a Python program that can be installed?

My operating system is Ubuntu 18.04.
0
How does the Python 3 string method endswith() operate on the following tuple of suffixes?

# Using a tuple of suffixes (check from index 2 to 6-1)
'Postman'.endswith( ('man', 'ma'), 2, 6)

Does 'Postman' have to end with 'man' or 'ma' in order for this to be True?
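The optional start and end arguments restrict the test to the slice 'Postman'[2:6], so the whole string does not have to end with either suffix. A short check that can be run in any Python 3 interpreter:

```python
word = 'Postman'

# start=2 and end=6 mean that only word[2:6] is examined.
window = word[2:6]
print(window)                                # stma

# True, because the window 'stma' ends with 'ma' (it does not end with 'man').
print(word.endswith(('man', 'ma'), 2, 6))    # True

# Without the bounds, the full string is tested and 'man' matches.
print(word.endswith(('man', 'ma')))          # True

# With only 'man' in the tuple, the bounded test fails.
print(word.endswith(('man',), 2, 6))         # False
```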
0
I connect to a remote Linux VM from my Windows PC, either through a PuTTY terminal or through Remote Desktop.

The Linux VM distribution is Ubuntu 18.04.2 LTS.  It is 64-bit.

What is python3-pexpect? It isn't installed. Will 'sudo apt-get install python3-pexpect' install it?
0
I've got a Python script that traverses S3 Buckets and prints out what folders and files have public permissions. This can be handy when auditing for security issues.

Right now, the script runs fine, but times out by the time it hits the third bucket.

Can someone please help me find a way to "hard code" this on a per-bucket basis? In other words, if I have a bucket called "art-bucket", how could I get the script to traverse JUST that bucket and provide me the results.

BTW - I've installed both boto3 & Paginator

Thanks for your help.

#This Script will use Paginator to print result for each bucket, executed in multiple threads
import boto3
import threading
import os.path

ACCESS_KEY = 'AKIAIXXXXXXXXXXX'
SECRET_ACCESS_KEY = 'XXUPJIsSXXxxXXxxXXo9Fl5TzSxXXxxXX3ly2XXlxjXXxxXX'

session = boto3.Session(aws_access_key_id = ACCESS_KEY, aws_secret_access_key = SECRET_ACCESS_KEY)

maxthreads = 5
sema = threading.Semaphore(value=maxthreads)

def list_object(bucket):
    try:
        s3 = session.client('s3')
        flag1 = objcount = 0
        paginator = s3.get_paginator('list_objects')
        page_iterator = paginator.paginate(Bucket= bucket)
        for page in page_iterator:
            if 'Contents' in page:
                for obj in page['Contents']:
                    uniobj = obj['Key'].encode('ascii', 'ignore').decode('ascii')
                    objAcl = s3.get_object_acl(Bucket=bucket, Key=obj['Key'])
                    flag2 = 0
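A possible way to restrict a run to one bucket without editing the script each time is a command-line option. The sketch below is an assumption about how that could look: the --bucket option and the select_buckets helper are hypothetical, and the bucket list is a stand-in for whatever the truncated part of the script enumerates.

```python
import argparse

def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Audit S3 object ACLs")
    # Hypothetical option: when given, only this bucket is scanned.
    parser.add_argument("--bucket", help="name of the single bucket to scan")
    return parser.parse_args(argv)

def select_buckets(all_buckets, only=None):
    """Return [only] if a bucket was requested, otherwise every bucket."""
    if only is None:
        return list(all_buckets)
    if only not in all_buckets:
        raise ValueError("unknown bucket: {}".format(only))
    return [only]

args = parse_args(["--bucket", "art-bucket"])
print(select_buckets(["art-bucket", "cdn-twt"], args.bucket))  # ['art-bucket']
```

The hard-coded equivalent would be a direct call to list_object('art-bucket'), using the function already defined in the script.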
                   


0
Hello experts,
I'm new to Python.
I wrote the code below.
The code runs.
But when I enter the "pop" command, it gives an error,
because the "pop" command consists of only one word.
So it gives an "IndexError: list index out of range" error for the number variable.
How can I fix this error?

z=int(input())
liste=set(map(int,input().split()))
for _ in range(0,int(input())):
       line=input().split()
       deger=line[0]
       sayi=line[1]  # this line gives the error for the 'pop' command
       if deger=='pop':
           liste.pop()
       elif deger=='remove':
           liste.remove(int(sayi))
       elif deger=='discard':
            liste.discard(int(sayi))
       else:
           pass

for xx in liste:
    print(str(xx),sep="\n")
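The IndexError comes from reading line[1] before checking which command was entered. One possible fix, sketched here as a function so it can be tried without typing input, is to read the number only when the line actually has a second word. The function name apply_commands is an illustrative choice; deger and sayi are kept from the original code.

```python
def apply_commands(values, commands):
    """Apply 'pop', 'remove N' and 'discard N' command lines to a set."""
    result = set(values)
    for command in commands:
        parts = command.split()
        deger = parts[0]
        # Only read the number when the command actually carries one.
        sayi = int(parts[1]) if len(parts) > 1 else None
        if deger == 'pop':
            result.pop()
        elif deger == 'remove':
            result.remove(sayi)
        elif deger == 'discard':
            result.discard(sayi)
    return result

print(apply_commands([1, 2, 3, 4], ['remove 4', 'discard 9', 'pop']))
```

Since set.pop() removes an arbitrary element, the two values left over in this example are not guaranteed to be the same on every Python implementation.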


Thanks a lot
0
I need to build an embedded Linux image using the Yocto Project.
https://www.yoctoproject.org/docs/2.7.1/brief-yoctoprojectqs/brief-yoctoprojectqs.html

Yocto requires Python 3.4.0 or greater.
I have Python 2.7.15+.

How do I upgrade Python on a Linux VM that I connect to from my Windows PC with a secure shell client such as PuTTY? I can also connect to the Linux VM by Remote Desktop from my Windows PC.
0

Experts!

I am an entry-level programmer and am quickly learning Python. I am having difficulty understanding a few concepts. In the code below, I have created a script that builds a connection string, queries a database, and outputs the data as pretty-printed JSON.

I am looking to understand how I would incorporate:

  • iterating over strings taken from user input (I understand how to ask for user input and how to put those items in an array or object), and using each item from the array in place of the hard-coded store_UIN value, so that the array's contents are inserted instead of CO004
  • repeating the select statements for every item in the array
  • writing each result to a new output JSON file instead of overwriting a single one, with the file name being export_name from array_date.JSON

import pyodbc
import json
import collections
def selectHRO():
    driver = 'DRIVER={SQL Server};'
    host = 'SERVER=hostname;'
    database = 'DATABASE=dbname;'

    connectionstr = driver+host+database
    conn = pyodbc.connect(connectionstr)
    cursor = conn.cursor()

    cursor.execute("SELECT cust_NO, FNAME, LNAME, store_UIN from Customer where store_UIN = 'CO004'")

    rows = cursor.fetchall()

    objects_list = []
    for row in rows:
        d = collections.OrderedDict()
        d['cust_NO'] = row.cust_NO
        d['FirstName'] = row.FNAME
        d['LastName'] =
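The three points above might be combined along the following lines. This is a sketch under stated assumptions: sqlite3 from the standard library stands in for pyodbc so the example runs without SQL Server (both accept ? placeholders), the output name pattern of UIN, underscore, date is one reading of the file name described in the question, and export_customers, export_all and the store_UIN output key are hypothetical.

```python
import collections
import datetime
import json
import os
import sqlite3  # stand-in for pyodbc in this sketch

def export_customers(conn, store_uin, output_dir="."):
    """Write the customers of one store to <store_uin>_<YYYYMMDD>.json."""
    cursor = conn.cursor()
    # The ? placeholder lets the driver substitute the value safely.
    cursor.execute(
        "SELECT cust_NO, FNAME, LNAME, store_UIN FROM Customer WHERE store_UIN = ?",
        (store_uin,),
    )
    objects_list = []
    for cust_no, fname, lname, uin in cursor.fetchall():
        d = collections.OrderedDict()
        d['cust_NO'] = cust_no
        d['FirstName'] = fname
        d['LastName'] = lname
        d['store_UIN'] = uin  # assumed key name
        objects_list.append(d)

    # One file per UIN, so earlier exports are not overwritten.
    stamp = datetime.date.today().strftime("%Y%m%d")
    path = os.path.join(output_dir, "{}_{}.json".format(store_uin, stamp))
    with open(path, "w") as handle:
        json.dump(objects_list, handle, indent=4)
    return path

def export_all(conn, store_uins, output_dir="."):
    """Repeat the export for every UIN collected from the user."""
    return [export_customers(conn, uin, output_dir) for uin in store_uins]
```

The list passed to export_all could be filled by the same kind of input loop the poster already knows how to write, for example by appending each entered UIN until an empty line is given.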


0
When I wanted to learn ASP or ASP.NET I obtained a MS hosting account.
I went to Bluehost to set up an account to learn PHP.

How would I go about this with Python?
0
I want to name images I'm inserting into an Excel workbook using Python.

I've tried using both openpyxl and xlsxwriter to insert the images but both give default names to the inserted image, e.g. Image 1, Picture 1.

Is there a way to name the inserted images, preferably taking the name from the filename?

Here's my code so far.
import xlsxwriter
import openpyxl

wb = openpyxl.Workbook()
ws = wb.worksheets[0]

workbook = xlsxwriter.Workbook('Maps.xlsx')
worksheet = workbook.add_worksheet('Maps')

images = ['overview.png','Site 1.png','Site 2.png','Site 3.png','Site 4.png','Site 5.png','Site 6.png','Site 7.png','Site 8.png']

for row, image in enumerate(images):

  img = openpyxl.drawing.image.Image(image)
  img.anchor = 'A'+str(row*30+1)

  ws.add_image(img)

  worksheet.insert_image(30*row,1,image)
  

workbook.close()

wb.save('out.xlsx')

wb.close()


0
Hi,

I have a task that is completely new to me.  My boss asked me to grab data from an XML file that is sent to us on a daily basis.  I'm completely new to XML.  The web scraping tools I found on the internet seem relevant to the task, and I also found that lxml with Python may help in my case.

Are there any short, concise notes or materials (not as detailed as an encyclopedia) that come with sample Python code, or at least a code skeleton, demonstrating how to scrape XML data with the lxml library in Python?

If other Python libraries can do the job better than lxml, those are also welcome.  Currently, due to licensing issues, I can only use Python or Excel VBA as the programming language.

Kindly please help.

Cheers!
Stanley
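A minimal skeleton of the kind asked for might look like the following. It uses xml.etree.ElementTree from the standard library rather than lxml; lxml.etree offers a largely compatible interface for the calls shown, so swapping the import is often enough. The element and attribute names (report, record, name, value, id) are invented, because the structure of the real daily file is not shown.

```python
import xml.etree.ElementTree as ET

# Invented sample; the real daily file's structure is not shown in the question.
SAMPLE = """
<report date="2019-07-22">
  <record id="1"><name>alpha</name><value>10.5</value></record>
  <record id="2"><name>beta</name><value>7</value></record>
</report>
"""

def extract_records(xml_text):
    """Return one dict per <record> element."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for record in root.iter("record"):
        rows.append({
            "id": record.get("id"),                    # attribute value
            "name": record.findtext("name"),           # text of a child element
            "value": float(record.findtext("value")),
        })
    return rows

for row in extract_records(SAMPLE):
    print(row)
```

For a file on disk, ET.parse(path).getroot() gives the same kind of root element as ET.fromstring does for a string.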
0
Hi Experts,

I get the following error when I convert TIFF to JPEG with Wand (the Python library for ImageMagick) on an Ubuntu Linux machine:

wand.exceptions.CoderError: Read error at scanline 3373; got 27894 bytes, expected 30000. `TIFFFillStrip' @ error/tiff.c/TIFFErrors/568



When I open the image with Paint and convert it to JPEG, it works fine.  But when I do it programmatically, it throws these errors.  Please help in resolving this issue.

Thank you

Bharath
0
I want to extract contact information related to real estate companies from the following link: https://www.metrocuadrado.com/directorio-inmobiliarias/bogota/
The link has pagination, so I included in my code automated clicks to access pages no. 2 and page no. 3 from the link provided since it is not possible from the URL (the URL is the same regardless of the page no).

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains

nombre_path = "//*[@class='col-xs-6 nombre-inmobil']/h2[1]"
ciudad_path = "//*[@class='col-xs-6 nombre-inmobil']/span[1]"
direccion_path = "//*[@class='col-xs-12']/ul/li[1]/span[1]"
telefono_path = "//*[@class='col-xs-12']/ul/li[2]/span[1]"
email_path = "//*[@class='col-xs-12']/ul/li[3]/span[1]"
link_path = "//*[@class='col-xs-12']/a[1]"

nombre = []
ciudad = []
direccion = []
telefono = []
email = []
link = []

driver = webdriver.Chrome('/Users/racho/Documents/chromedriver')
driver.implicitly_wait(30)

url = 'https://www.metrocuadrado.com/directorio-inmobiliarias/bogota/'
driver.get(url)

for i in range(3,5):
    
    page_path = "//*[@class='col-xs-12 col-sm-4 col-md-4 col-lg-4 pager']/a" + str([i])
    
    next_page = driver.find_element_by_xpath(page_path)
    actions = ActionChains(driver)
    actions.move_to_element(next_page)
    actions.click(next_page)
    actions.perform()
 
    el_nombre = driver.find_elements_by_xpath(nombre_path)
    nombre.extend(element.text for element in el_nombre)

    el_ciudad = driver.find_elements_by_xpath(ciudad_path)
    


0
Python: Moving files into folder based on excel spreadsheet.

I have a spreadsheet in which column A is the name of the folder and column B is the name of the file. The files in column B are not yet sorted into folders, and I am trying to write a Python script that will go through the list of files and move each one into its designated folder. Below is what I have built so far:

import os
import shutil
import openpyxl

src = "S:\\Compliance Dept\\Vendor Management\\VCM\\VCM Attachments\\"
dst = "S:\\Compliance Dept\\Vendor Management\\VCM\\VCM Attachments\\"
exFile = "C:\\Users\\mfigur\\Desktop\\Attachments.xlsx"

wb = openpyxl.load_workbook(exFile)
sheet = wb['Sheet1']  # get_sheet_by_name() is deprecated in openpyxl
ven = sheet['A']  # folder names
fil = sheet['B']  # file names

for cell in fil:
    folderName = cell.value
    os.path.join(src, folderName)
    shutil.copy()  # still needs source and destination arguments
    print("\nFolder created in: ", os.path.join(src, folderName))
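The move itself could be sketched with the standard library alone, taking the spreadsheet rows as plain pairs of folder name and file name. The function name move_into_folders and its return value are illustrative choices rather than anything from the post.

```python
import os
import shutil

def move_into_folders(rows, src, dst):
    """Move each file named in rows from src into dst/<folder>.

    rows is an iterable of (folder_name, file_name) pairs, e.g. columns A and B.
    Returns the list of file names that were not found in src.
    """
    missing = []
    for folder_name, file_name in rows:
        if not folder_name or not file_name:
            continue  # skip blank spreadsheet rows
        source = os.path.join(src, str(file_name))
        if not os.path.isfile(source):
            missing.append(file_name)
            continue
        target_dir = os.path.join(dst, str(folder_name))
        os.makedirs(target_dir, exist_ok=True)  # create the folder on first use
        shutil.move(source, os.path.join(target_dir, str(file_name)))
    return missing
```

With openpyxl, the pairs could be read with sheet.iter_rows(min_col=1, max_col=2, values_only=True), assuming a version recent enough to support values_only. Returning the missing names makes it easy to see which spreadsheet entries had no matching file.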



Also here is a screenshot of the folder structure.
0
I am trying to write Python code to read data (scraping strike price, LTP, OI, change in OI, IV, and expiry date) from the NSE option chain (nseindia.com). But when I run the code, the only message displayed is "Process finished with exit code 0". Can anyone find the error in the code?
My code:


import requests
import pandas as pd
from bs4 import BeautifulSoup



def get_option_chain(symbol):

    if symbol == 'NIFTY':
        Base_url =('https://www.nseindia.com/live_market/dynaContent/live_watch/option_chain/optionKeys.jsp?'
                   'symbolCode=-10003&symbol=NIFTY&symbol=NIFTY&instrument=OPTIDX&date=-&segmentLink=17&segmentLink=17')
    elif symbol == 'BANKNIFTY':
        Base_url =('https://www.nseindia.com/live_market/dynaContent/live_watch/option_chain/optionKeys.jsp?'
                   'symbolCode=-9999&symbol=BANKNIFTY&symbol=BANKNIFTY&instrument=OPTIDX&date=-&segmentLink=17&segmentLink=17')
    else:
        Base_url = ('https://www.nseindia.com//marketinfo/sym_map/symbolMapping.jsp?symbol={}&'
                    'instrument=OPTSTK&date=-&segmentLink=17'.format(symbol))
    try:
        page = requests.get(Base_url)
    except Exception as e:
        print ("Exception {} in getting data  for symbol {}".format(e, symbol))
        return None  # page is undefined at this point, so stop instead of parsing it

    #print(page.status_code)
    #print(page.content)

    soup = BeautifulSoup(page.content, 'html.parser')
    # print (soup.prettify())

    table_it = soup.find_all("div", {"class": …
0
Hi,

I am a beginner in Python. Can you please recommend a few websites that teach Python basics and fundamentals for the layman, in an easily digestible format that is quick to pick up?

thanks
0

The environment is

$uname -m
x86_64

$cat /etc/*release
skip
VERSION="16.04.4 LTS (Xenial Xerus)"
VERSION_ID="16.04"
UBUNTU_CODENAME=xenial

$python --version
Python 2.7.12

I have 2 Python scripts: one purges a SQL database, the other dynamically updates the database.

script1.py takes a long time to complete. I need help in writing a new python script3.py to run

script1.py

make sure it completes, successfully, and then run

script2.py

and log and email status.

Your help is appreciated!!!
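A rough outline of such a script3.py is sketched below. It relies on subprocess.check_call, which is available in Python 2.7 as well as Python 3, and on the logging module. The function name run_in_order is hypothetical, and the two commands are stand-ins so that the sketch runs without the real scripts; in practice they would be [sys.executable, "script1.py"] and [sys.executable, "script2.py"].

```python
import logging
import subprocess
import sys

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("script3")

def run_in_order(commands):
    """Run each command in turn and stop at the first failure.

    Returns (True, None) when all succeed, else (False, failed_command).
    """
    for command in commands:
        log.info("starting %s", command)
        try:
            # check_call blocks until the command exits and raises on a non-zero status.
            subprocess.check_call(command)
        except subprocess.CalledProcessError as error:
            log.error("%s failed with exit code %s", command, error.returncode)
            return False, command
        log.info("finished %s", command)
    return True, None

# Stand-ins for script1.py and script2.py so the sketch is runnable as written.
ok, failed = run_in_order([
    [sys.executable, "-c", "print('stand-in for script1.py')"],
    [sys.executable, "-c", "print('stand-in for script2.py')"],
])
status = "both scripts completed" if ok else "failed at {}".format(failed)
log.info(status)
```

Emailing the status string, for example with smtplib, is left out here because the mail server details are not given in the question.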
0
Hi Experts,

I get the following error when I run the python application.

The code which is causing error is as follows:

 current = Item.objects.filter(pk=str(properties['id'])).first()



where Item is a Django model and id is the primary key. The error I get is as follows:

KeyError: 'id'

Traceback (most recent call last):
  File "/home/ubuntu/workarea/dev-harvestor/harvestor-2/harvest-territory-stories/harvest/models.py", line 283, in _add_item
    current = Item.objects.filter(pk=str(properties['id'])).first()
KeyError: 'id'


I do not understand what is causing this error.  Please help in resolving this issue.

Thank you,
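The traceback points at the dictionary lookup properties['id'], which is evaluated before Django's filter is called, so the error suggests that the properties dict has no 'id' key for the record being processed. A small sketch of the failure and one way to guard against it; lookup_pk is a hypothetical helper and the sample dicts are invented:

```python
def lookup_pk(properties):
    """Return the primary key as a string, or None when 'id' is absent."""
    value = properties.get('id')   # .get returns None instead of raising KeyError
    return None if value is None else str(value)

print(lookup_pk({'id': 42, 'title': 'example'}))   # 42
print(lookup_pk({'title': 'no id here'}))          # None

# The original expression fails on the second dict:
try:
    str({'title': 'no id here'}['id'])
except KeyError as error:
    print('KeyError:', error)                      # KeyError: 'id'
```

Printing or logging the keys of properties just before the failing line would show which records arrive without an id.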
0
Hi,

I have a view in SQL Server that creates a pivoted result set. I need to reformat the column titles so that each title is split over 2 rows. The two header values would be derived from the existing column name by splitting the string on the '|' character.
I realise this is really a presentation-layer concern, but I wanted to know whether it is possible to use T-SQL and the existing view to produce the required format, or whether this is better done using, say, Python in SQL Server Machine Learning Services.

Below is some T-SQL just to demonstrate the output of the existing view:

DECLARE @TempTable TABLE (
	Parameter NVARCHAR(30)
	,[ID01|20190720] FLOAT
	,[ID01|20190721] FLOAT
	,[ID01|20190722] FLOAT
	,[ID01|20190723] FLOAT
	)

INSERT INTO @TempTable
VALUES (
	'pH'
	,'5'
	,'6'
	,'6'
	,'5'
	)

INSERT INTO @TempTable
VALUES (
	'Alkalinity'
	,'166'
	,'168'
	,'168'
	,'163'
	)

SELECT *
FROM @TempTable



Below is an example of the required format:
Example output
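If the splitting were done in Python, the header handling could be as small as the sketch below. It works on the column names from the sample table and returns the two header rows as plain lists; split_headers is a hypothetical name, and how the rows are then rendered depends on the reporting tool.

```python
def split_headers(columns, sep='|'):
    """Split 'ID01|20190720'-style names into two header rows.

    Columns without the separator (such as 'Parameter') stay in the first
    row and get an empty cell in the second.
    """
    top, bottom = [], []
    for name in columns:
        first, _, second = name.partition(sep)
        top.append(first)
        bottom.append(second)
    return top, bottom

columns = ['Parameter', 'ID01|20190720', 'ID01|20190721', 'ID01|20190722', 'ID01|20190723']
top, bottom = split_headers(columns)
print(top)     # ['Parameter', 'ID01', 'ID01', 'ID01', 'ID01']
print(bottom)  # ['', '20190720', '20190721', '20190722', '20190723']
```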
0
Hi Experts,

I get the following error when I run the following Python application. I run the command python run.py pull and get the error below. I tried removing the migrations folder, creating a new migrations folder, copying in __init__.py, and running python manage.py migrate; that produced no errors.


root@ip-10-252-14-11:/home/ubuntu/workarea/dev-harvestor/harvestor-2/harvest-territory-stories# python run.py pull
2019-07-22 22:52:37,108 [MainThread  ] [INFO ]  Found credentials in shared credentials file: ~/.aws/credentials
_initializeapp run.py
PID 6264 opening the lock file/home/ubuntu/workarea/dev-harvestor/harvestor-2/harvest-territory-stories/harvest/../lockfile
PID 6264 locking the lock file /home/ubuntu/workarea/dev-harvestor/harvestor-2/harvest-territory-stories/harvest/../lockfile
2019-07-22 22:52:37,357 INFO fetching from offset 50360
bitstreams -  true  -  hasoriginals -  true  - hasthumbnail -  false
tiffimage -  false  - jpegimage -  true  - pdf -  false  - audio -  false  video -  false  parentrecordc -  false
i.name.lower() -  image viewer
Traceback (most recent call last):
  File "/home/ubuntu/workarea/dev-harvestor/harvestor-2/harvest-territory-stories/harvest/models.py", line 639, in _add_item
    item.save()
  File "/usr/local/lib/python3.6/dist-packages/django/db/models/base.py", line 806, in save
    force_update=force_update, update_fields=update_fields)
  File 


0
Hello Experts,

We have an encrypted PDF that may contain multiple embedded objects, such as .csv, Word, or other .pdf files. We need a Python script that extracts the embedded objects, creates a directory named after the PDF, and saves all the extracted objects in that directory.

Python version  # 3.7.4
0
Attachment: sacramento_real_estate_transactions.csv

Hi,
I have a data set. I did some cleaning on it and got an acceptable result, but I cannot use it to predict my values for the following reason:

The price, which is the target, has a high standard deviation, in the thousands.

The features, which are the number of beds, the number of baths, and the square feet, have an acceptable standard deviation (less than one).

I understand there are outlier values. My question is: what is the best procedure to fix the standard deviation of the price after the cleaning I did?

Note: for cleaning I removed the negative values, the zeros, and the NaNs. Please, I need an expert to help with this :(
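One common approach, sketched here under assumptions, is to drop prices that lie more than 1.5 interquartile ranges outside the quartiles and then standardize what remains, so that the target is on a scale comparable to the features. The prices below are invented, the helper names are illustrative, and statistics.quantiles requires Python 3.8 or later.

```python
import statistics

def iqr_filter(values, k=1.5):
    """Keep values within k * IQR of the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if low <= v <= high]

def standardize(values):
    """Rescale to zero mean and unit (sample) standard deviation."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [(v - mean) / stdev for v in values]

# Invented prices, with one extreme value standing in for an outlier.
prices = [120000, 135000, 150000, 142000, 128000, 131000, 2500000]
kept = iqr_filter(prices)
scaled = standardize(kept)
print(kept)
print(round(statistics.stdev(scaled), 6))
```

A large standard deviation in raw price units is partly a matter of scale, so whether outlier removal, standardization, or a transform such as a logarithm is appropriate depends on the model being fitted.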
0
