Solved

Using Python To Iterate Through All Website Pages And Paste Changed Content Back

Posted on 2016-08-10
Last Modified: 2016-08-19
I had this question after viewing What regex will remove duplicate rel="nofolow" tags?.

What came out of the previous questions works great for an individual page, as long as I manually paste the updated page back to the site. The site is on Blogger, and the API will only let you change a very limited number of pages per day. Done manually, I would open the page, copy its source, update the links, paste the source back, and then save the page. Is there a way to do this for all the pages in the site, pasting the changed content back into the blog as I just described?

By the way, I tried Adam Lewis's find-and-replace tool, http://www.adamwlewis.com/articles/blogger-find-replace, but since it uses the API it only handles a limited number of pages per day, and it starts at the top again each time it is run. I have thousands of pages and many more links to change, which is why I am hoping to get this to iterate through the website and make the appropriate changes.

Thanks,
Question by:sharingsunshine
 
LVL 15

Expert Comment

by:Walter Ritzel
ID: 41753294
You can use python in 2 ways here:
1) In the same way as the tool Blogger Find and Replace, but with some kind of control on what page to restart;
2) Create a python script that will use Selenium Webdriver to do what you want.
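The "control on what page to restart" in option 1 could be as simple as remembering which posts have already been updated, so that each day's run continues where the previous one stopped instead of starting at the top again. A minimal sketch, assuming a hypothetical `update_post` callback that wraps the real API call, a checkpoint file name chosen by the caller, and a daily limit passed in as a number:

```python
import json
import os


def load_done(checkpoint_path):
    """Return the set of post IDs already processed (empty on the first run)."""
    if not os.path.exists(checkpoint_path):
        return set()
    with open(checkpoint_path) as f:
        return set(json.load(f))


def run_batch(post_ids, update_post, checkpoint_path, daily_limit):
    """Update at most daily_limit unprocessed posts, recording progress after each."""
    done = load_done(checkpoint_path)
    processed = 0
    for post_id in post_ids:
        if processed >= daily_limit:
            break
        if post_id in done:
            continue  # handled on an earlier run
        update_post(post_id)  # hypothetical: the real API call goes here
        done.add(post_id)
        processed += 1
        with open(checkpoint_path, "w") as f:
            json.dump(sorted(done), f)  # save after every post so a crash loses nothing
    return processed
```

Running the same call once a day would then work through the whole list in order, and an interrupted run loses at most the post that was in progress.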
0
 

Author Comment

by:sharingsunshine
ID: 41754254
I appreciate the idea of pursuing Selenium WebDriver, but being new to Python and not seeing anything specific to my issue in the Selenium docs, I am still at a standstill.

Can you provide more specifics, some snippets of code, or a link to something similar?

Thanks,
0
 
LVL 15

Expert Comment

by:Walter Ritzel
ID: 41754358
With Selenium WebDriver, you control a web browser through Python code. This means you can write code that interacts with the blog on Blogger, does all the replacements, and then puts the changed text back, fully automated and without the limitation of the API.
Here is a small example:
from selenium import webdriver
from selenium.webdriver.common.proxy import *
# from pyvirtualdisplay import Display
import traceback
import random


def random_line(afile):
    line = next(afile)
    for num, aline in enumerate(afile):
        if random.randrange(num + 2):
            continue
        line = aline
    return line

browser = None
try:
    #    display = Display(visible=0, size=(800, 600))
    #    display.start()

    proxy = None
    with open('../data/proxies.txt', 'r') as f:
        myProxy = '177.130.59.66:3128' #random_line(f).replace('\n','').replace('http://','').replace('https://','')
        proxy = Proxy({
            'proxyType': ProxyType.MANUAL,
            'httpProxy': myProxy,
            'ftpProxy': myProxy,
            'sslProxy': myProxy,
            'noProxy': ''})

    browser = webdriver.Firefox(proxy=proxy)
    browser.get('https://www.yell.com/connectscan')

    print(browser.page_source)
    elem = browser.find_element_by_name('company.name')  # Company name field
    elem.send_keys('Reconditioned Ranges Ltd')
    elem = browser.find_element_by_name('company.phoneNumber')  # Phone number field
    elem.send_keys('01209214774')
    elem = browser.find_element_by_name('company.email')  # Email field
    elem.send_keys('aaaaaa@gmail.com')
    # A compound class name needs a CSS selector; find_element_by_class_name accepts a single class only
    elem = browser.find_element_by_css_selector('.js-show-manual-address.utils-btnLink')
    elem.click()  # Reveal the manual address fields
    elem = browser.find_element_by_name('company.address.buildingNumber')  # Building field
    elem.send_keys('Aga House')
    elem = browser.find_element_by_name('company.address.streetAddress')  # Street field
    elem.send_keys('Scorrier Road')
    elem = browser.find_element_by_name('company.address.locality')  # Locality field (left empty)
    elem.send_keys('')
    elem = browser.find_element_by_name('company.address.town')  # Town field
    elem.send_keys('Redruth')
    elem = browser.find_element_by_name('company.address.county')  # County field
    elem.send_keys('Cornwall')
    elem = browser.find_element_by_name('company.address.postcode')  # Postcode field
    elem.send_keys('TR16 5AA')

    elem.submit()  # Submit the form that contains the last field

    print('no_errors')
except Exception:
    print(traceback.format_exc())
finally:
    if browser:
        browser.quit()
    # display.stop()  # Only needed if the pyvirtualdisplay lines above are uncommented


0
 

Author Comment

by:sharingsunshine
ID: 41754412
This is the Python code I have now:

import urllib2
import re

website = urllib2.urlopen('http://www.theherbsplacenews.com/')
html = website.read()   # the content of the page

with open('original_document3.html', 'w') as f:
    f.write(html)

rexURL = re.compile(r'("http://www\.theherbsplace\.com/.*?")')
result = rexURL.sub(r'\1 rel="nofollow"', html)

rexDoubledNofollow = re.compile(r'(rel="nofollow"\s*)+')
result = rexDoubledNofollow.sub(r'\1', result)

with open('new_document3.html', 'w') as f:
    f.write(result)



I just need to open a webpage and paste this as the source code instead of writing it to new_document3.html.

I tried changing the open statement to use an http URL, but I got an error saying there is no such file or directory:
Traceback (most recent call last):
  File "/Users/rjw/Documents/Python/expertsPepr2.py", line 15, in <module>
    with open('http://www.theherbsplacenews.com/2015/06/save-up-to-18-on-lbs-ii-aloe-vera-and.html', 'a') as f:
IOError: [Errno 2] No such file or directory: 'http://www.theherbsplacenews.com/2015/06/save-up-to-18-on-lbs-ii-aloe-vera-and.html'
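
The traceback above shows that `open()` treats its argument as a path on the local disk, so it can neither fetch nor update a web page. Reading a page goes through an HTTP client (as `urllib2.urlopen` does in the script), and writing it back requires an authenticated request or a logged-in browser session. A small sketch of the distinction, written for Python 3 (where `IOError` is an alias of `OSError`); the helper names are made up for illustration:

```python
from urllib.parse import urlparse


def looks_like_url(target):
    """Return True when target is an http(s) URL rather than a local path."""
    return urlparse(target).scheme in ("http", "https")


def try_local_open(target):
    """Try to open target as a local file; return 'opened' or the error's class name."""
    try:
        with open(target, "r"):
            return "opened"
    except OSError as exc:
        return type(exc).__name__
```

Checking `looks_like_url(...)` before calling `open()` makes the mistake visible early instead of surfacing as a "No such file or directory" error.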


0
 

Author Comment

by:sharingsunshine
ID: 41754418
I have seen several examples similar to the one you gave me, and I can tell it took some time to put together. But I am so close: as I indicated above, I just need to transfer the value back, as source code, to the same webpage I started with.

So can you tell me how to make that connection?
0
 
LVL 15

Expert Comment

by:Walter Ritzel
ID: 41754783
The only way to use your code is to add the API call to send the page back. Otherwise, anyone could hack any website.
0
 

Author Comment

by:sharingsunshine
ID: 41755150
I am trying to run your code and I am getting this error:

Traceback (most recent call last):
  File "/Users/rjw/Documents/Python/expertsBrazilwebdriver.py", line 31, in <module>
    browser = webdriver.Firefox(proxy=proxy)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/selenium/webdriver/firefox/webdriver.py", line 80, in __init__
    self.binary, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/selenium/webdriver/firefox/extension_connection.py", line 52, in __init__
    self.binary.launch_browser(self.profile, timeout=timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/selenium/webdriver/firefox/firefox_binary.py", line 68, in launch_browser
    self._wait_until_connectable(timeout=timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/selenium/webdriver/firefox/firefox_binary.py", line 108, in _wait_until_connectable
    % (self.profile.path))
selenium.common.exceptions.WebDriverException: Message: Can't load the profile. Profile Dir: /var/folders/vg/lzbgw_fx4k90zdjn3zy95qt80000gp/T/tmpf63fqcrk If you specified a log_file in the FirefoxBinary constructor, check it for details.


Traceback (most recent call last):
  File "/Users/rjw/Documents/Python/expertsBrazilwebdriver.py", line 64, in <module>
    display.stop()
NameError: name 'display' is not defined



this is the code
from selenium import webdriver
from selenium.webdriver.common.proxy import *
# from pyvirtualdisplay import Display
import traceback
import random


def random_line(afile):
    line = next(afile)
    for num, aline in enumerate(afile):
        if random.randrange(num + 2):
            continue
        line = aline
    return line

browser = None
try:
    #    display = Display(visible=0, size=(800, 600))
    #    display.start()

    proxy = None
    with open('new_document3wd.html', 'r') as f:
        myProxy = '177.130.59.66:3128' #random_line(f).replace('\n','').replace('http://','').replace('https://','')
        proxy = Proxy({
            'proxyType': ProxyType.MANUAL,
            'httpProxy': myProxy,
            'ftpProxy': myProxy,
            'sslProxy': myProxy,
            'noProxy': ''})

    browser = webdriver.Firefox(proxy=proxy)
    browser.get('https://www.yell.com/connectscan')

    print(browser.page_source)
    elem = browser.find_element_by_name('company.name')  # Find the search box
    elem.send_keys('Reconditioned Ranges Ltd')
    elem = browser.find_element_by_name('company.phoneNumber')  # Find the search box
    elem.send_keys('01209214774')
    elem = browser.find_element_by_name('company.email')  # Find the search box
    elem.send_keys('aaaaaa@gmail.com')
    elem = browser.find_element_by_class_name("js-show-manual-address utils-btnLink")
    elem.click()
    elem = browser.find_element_by_name('company.address.buildingNumber')  # Find the search box
    elem.send_keys('Aga House')
    elem = browser.find_element_by_name('company.address.streetAddress')  # Find the search box
    elem.send_keys('Scorrier Road')
    elem = browser.find_element_by_name('company.address.locality')  # Find the search box
    elem.send_keys('')
    elem = browser.find_element_by_name('company.address.town')  # Find the search box
    elem.send_keys('Redruth')
    elem = browser.find_element_by_name('company.address.county')  # Find the search box
    elem.send_keys('Cornwall')
    elem = browser.find_element_by_name('company.address.postcode')  # Find the search box
    elem.send_keys('TR16 5AA')

    elem.submit()

    print('no_errors')
except:
    print(traceback.format_exc())
finally:
    if browser:
        browser.quit()
display.stop()



Please advise.
0
 
LVL 15

Expert Comment

by:Walter Ritzel
ID: 41755434
I don't have much to advise, as you just put the code in the middle of yours without thinking about how to use it.
Anyway, the Selenium error comes from a mismatch between the Firefox version and the Selenium WebDriver version. You may need to install the appropriate WebDriver for your Firefox version, or upgrade your Firefox to the version used by the WebDriver.

The last display.stop() line needs to be commented out, like all the other lines referring to display.
0
 

Author Comment

by:sharingsunshine
ID: 41756852
I disagree: the code above your comment is an exact copy of what you provided initially. I have no code of my own in there.

I am sure you know a lot about Selenium and WebDriver, but it seems we aren't communicating. I have asked how I can marry the two, my regex code and the WebDriver, and you have yet to provide an answer.

I have no issue using the API, but there must be some way to bridge the Python regex code to the WebDriver.

We have been going at this since the 11th and we are not any closer to a solution, so I am going to request that the moderators get more experts involved.
0
 
LVL 15

Expert Comment

by:Walter Ritzel
ID: 41757006
Well, in fact, if you don't see a problem with using the API, then Selenium is not needed and there is no complication for you.
I can show you what to change in your code. Please see the comments below:
import re
# import the Google API client here --- please check its documentation for that.

# -----
# here you'll initialize the Google API to use it
# -----

# compile the two patterns once, outside the loops
rexURL = re.compile(r'("http://www\.theherbsplace\.com/.*?")')
rexDoubledNofollow = re.compile(r'(rel="nofollow"\s*)+')

# list_of_blogs = call the method to retrieve the list of blogs
# for blog in list_of_blogs:
#     pages_blog = get the list of pages of that blog
#     for page in pages_blog:
#         website = get the page content
#         result = rexURL.sub(r'\1 rel="nofollow"', website)
#         result = rexDoubledNofollow.sub(r'\1', result)
#         save the page back.



Now, if you can use this, try to write some code and show us where the problem is; then we can help you more.
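
To make the loop body concrete, the two regex steps can be wrapped in one function and tried on in-memory data before any API is involved. This is only a sketch: the `pages` dictionary is a hypothetical stand-in for whatever the API returns, and the sample HTML is invented for illustration.

```python
import re

REX_URL = re.compile(r'("http://www\.theherbsplace\.com/.*?")')
REX_DOUBLED = re.compile(r'(rel="nofollow"\s*)+')


def add_nofollow(html):
    """Add rel="nofollow" after each matching link, then collapse repeats."""
    result = REX_URL.sub(r'\1 rel="nofollow"', html)
    return REX_DOUBLED.sub(r'\1', result)


# Hypothetical stand-in for "get the list of pages": page ID -> HTML content.
pages = {
    "post-1": '<a href="http://www.theherbsplace.com/aloe.html">Aloe</a>',
    "post-2": '<a href="http://example.com/">Elsewhere</a>',
}

changed = {}
for page_id, html in pages.items():
    updated = add_nofollow(html)
    if updated != html:
        changed[page_id] = updated  # only these pages need to be saved back
```

Skipping pages whose content did not change keeps the number of writes down, which matters when the number of updates per day is limited. Because the second pattern collapses repeated attributes, running `add_nofollow` again on already updated HTML should leave it unchanged.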
0
 

Author Comment

by:sharingsunshine
ID: 41757039
I thought you were calling Selenium and WebDriver the API. My mistake for not clarifying.

However, the Google API only allows 50 blog posts a day to be changed. I have over 1700 posts in one blog alone, and I have 4 blogs to change with equal or more posts.

This is why I wanted to use webdriver and selenium after you mentioned it.

I will be logged into the blogger dashboard and I have every right to copy and paste content into each post.  I just wanted to automate it (even a portion of it) rather than having to do it manually which will take a long time.
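
For a sense of scale, the figures quoted above can be turned into a rough lower bound on how long the API route would take. The sketch assumes the 50-posts-per-day limit is shared across all four blogs and that each blog has exactly 1700 posts; the post says "over 1700" and "equal or more", so the real numbers would be higher.

```python
import math

DAILY_LIMIT = 50        # posts the API allows per day, as stated above
POSTS_PER_BLOG = 1700   # "over 1700", so this is a lower bound
BLOGS = 4

days_per_blog = math.ceil(POSTS_PER_BLOG / DAILY_LIMIT)
total_days = days_per_blog * BLOGS
print(days_per_blog, total_days)  # 34 136
```

That is at least 34 days for one blog and at least 136 days for all four under the shared-limit assumption.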
0
 
LVL 15

Accepted Solution

by:
Walter Ritzel earned 500 total points
ID: 41757130
Ok, let's go back to selenium.

The objective in selenium is to mimic each step that you do manually in order to do what you need.
So, let's get the initial code I gave you and split into pieces:
from selenium import webdriver
from selenium.webdriver.common.proxy import *
import traceback
import random

browser = None
try:
    browser = webdriver.Firefox()
    browser.get('http://www.theherbsplacenews.com/')   # navigate to your blog
except:
    print(traceback.format_exc())
finally:
    if browser:
        browser.quit()




The code above, once the Firefox/WebDriver version problem I mentioned earlier is solved, opens Firefox and navigates to your blog without you hitting a key.

As the next step, you need to change this code to log you in. To do that, the code will be something like this:
from selenium import webdriver
from selenium.webdriver.common.proxy import *
import traceback
import random

browser = None
try:
    browser = webdriver.Firefox()
    browser.get('http://www.theherbsplacenews.com/')   # navigate to your blog
    elem = browser.find_element_by_name('login')  # Find the login textbox. You need to inspect the page source code to get the correct name
    elem.send_keys('user')
    elem = browser.find_element_by_name('password')  # Find the password textbox
    elem.send_keys('pass')
    elem.submit()  # submit the form
except:
    print(traceback.format_exc())
finally:
    if browser:
        browser.quit()




The code above, if the element names are correct and the credentials entered are also correct, will log you in and show the first page after login.

Based on this, you should build your automation script. Since you want to change every post in your blog, part of this code will probably be a loop of some sort.
Inside this loop you will probably execute the following actions (see the code just below):
  - select all the content of the post and assign it to a Python variable;
  - apply the regex you have created to this variable;
  - use a command to input the changed content back into the post on the blog.
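import re  # add this next to the other imports at the top of the script

# 'post' is a placeholder; inspect the editor page to find the real name of the text area
blog_post = browser.find_element_by_name('post')
rexURL = re.compile(r'("http://www\.theherbsplace\.com/.*?")')
# a WebElement has no .content attribute; a text area's current text is read from 'value'
result = rexURL.sub(r'\1 rel="nofollow"', blog_post.get_attribute('value'))
rexDoubledNofollow = re.compile(r'(rel="nofollow"\s*)+')
result = rexDoubledNofollow.sub(r'\1', result)
blog_post.clear()  # empty the box first so the new text replaces the old text
blog_post.send_keys(result)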



I encourage you to take a look at the Selenium documentation for more detailed information on coding the script.

I hope everything is now as clear as possible.
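
The loop described above might be organized roughly as follows. This is a sketch under several assumptions, not a tested Blogger script: `edit_urls` is a list of post edit-page addresses you would have to collect yourself, `"post"` and `"save"` are placeholder element names, and `transform` is whatever function rewrites the HTML (for example, the regex steps). It uses the `find_element_by_name` style seen throughout this thread; newer Selenium releases spell this `find_element(By.NAME, ...)`.

```python
def update_post(browser, edit_url, transform, field_name="post", save_name="save"):
    """Open one post's edit page, rewrite its HTML, and save only if it changed.

    field_name and save_name are placeholders; the real element names
    must be read from the editor page's HTML.
    """
    browser.get(edit_url)
    box = browser.find_element_by_name(field_name)
    original = box.get_attribute("value")
    updated = transform(original)
    if updated == original:
        return False  # nothing to do, leave the post untouched
    box.clear()
    box.send_keys(updated)
    browser.find_element_by_name(save_name).click()
    return True


def update_all(browser, edit_urls, transform):
    """Apply update_post to every URL and return how many posts were changed."""
    return sum(update_post(browser, url, transform) for url in edit_urls)
```

Passing the browser in as a parameter keeps the loop independent of which driver (Firefox, Safari) ends up working on your machine.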
0
 

Author Comment

by:sharingsunshine
ID: 41758669
I don't have the correct version of Python for Selenium 3.0, which is the version I need. So I will post another question to find out how to downgrade my 3.5 to 3.3, and then I will be back to work on this.
0
 
LVL 15

Expert Comment

by:Walter Ritzel
ID: 41760764
Please let me know your Firefox version.
0

 

Author Comment

by:sharingsunshine
ID: 41760773
FireFox version 48.
0
 
LVL 15

Expert Comment

by:Walter Ritzel
ID: 41760855
OK. So, let's get rid of Firefox in this code and use the more obvious browser for a Mac.

Replace Firefox with Safari on this line:

browser = webdriver.Safari()


0
 

Author Comment

by:sharingsunshine
ID: 41760866
Thanks for sticking with me on this

Traceback (most recent call last):
  File "/Users/rjw/.pyenv/versions/test_env/lib/python3.3/site-packages/selenium/webdriver/safari/webdriver.py", line 50, in __init__
    executable_path = os.environ["SELENIUM_SERVER_JAR"]
  File "/Users/rjw/.pyenv/versions/3.3.6/lib/python3.3/os.py", line 656, in __getitem__
    raise KeyError(key) from None
KeyError: 'SELENIUM_SERVER_JAR'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "expertsBrazil2webdriver.py", line 9, in <module>
    browser = webdriver.Safari()
  File "/Users/rjw/.pyenv/versions/test_env/lib/python3.3/site-packages/selenium/webdriver/safari/webdriver.py", line 53, in __init__
    'SELENIUM_SERVER_JAR'")
Exception: No executable path given, please add one to Environment Variable                 'SELENIUM_SERVER_JAR'

(test_env) rjw python -V
Python 3.3.6


0
 
LVL 15

Expert Comment

by:Walter Ritzel
ID: 41760969
OK, so you need to download the jar driver.
Go to http://docs.seleniumhq.org/download/ and find the line that says:

Download version 3.0.0-beta2

Click on the link and download the file to your computer.

As the last step, add the following lines to your script, after the last import line:
import os

os.environ["SELENIUM_SERVER_JAR"] = "<path to your download jar with the file name>"
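
The angle brackets in the placeholder above are not part of the value; the string should contain only the path itself. A small sketch that checks the path before setting the variable, so that a wrong path fails with a clear message rather than when the browser starts. The helper name is made up for illustration:

```python
import os


def set_selenium_jar(jar_path):
    """Point SELENIUM_SERVER_JAR at jar_path after checking it is a real file."""
    if not os.path.isfile(jar_path):
        raise FileNotFoundError("No jar file at: " + jar_path)
    os.environ["SELENIUM_SERVER_JAR"] = jar_path
    return jar_path
```

It would be called once, before `webdriver.Safari()`, with the full path to the downloaded jar.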


0
 

Author Comment

by:sharingsunshine
ID: 41761016
I am getting this error

Error: Unable to access jarfile <'/Users/rjw/Downloads/selenium-server-standalone-3.0.0-beta2.jar'>
Traceback (most recent call last):
  File "expertsBrazil2webdriver.py", line 12, in <module>
    browser = webdriver.Safari()
  File "/Users/rjw/.pyenv/versions/test_env/lib/python3.3/site-packages/selenium/webdriver/safari/webdriver.py", line 55, in __init__
    self.service.start()
  File "/Users/rjw/.pyenv/versions/test_env/lib/python3.3/site-packages/selenium/webdriver/safari/service.py", line 69, in start
    raise WebDriverException("Can not connect to the SafariDriver")
selenium.common.exceptions.WebDriverException: Message: Can not connect to the SafariDriver



I see this but I am not clear which option to pick

https://gyazo.com/0a02f733c36e1b780686c7ae1baa6dd7
0
 

Author Comment

by:sharingsunshine
ID: 41761150
Since this is turning out to be so difficult and time-consuming for both of us, is there a way (after I have manually opened the URL in Blogger and clicked on a post) to have Python copy the source code, run it through the regex routine I have, paste it back, press the update button, and iterate to the next post?

I have 177,000 links that need to be changed, so any automation would be helpful.
0
 
LVL 15

Expert Comment

by:Walter Ritzel
ID: 41761225
The error now seems to be with permissions on the folder. Can you copy the jar file to your script folder, adjust the environment variable, and try again?
Also, please check that the jar is accessible by anyone. If OS X is similar to Linux, you can type this command in the terminal:
chmod 777 <jar file name>
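
Note that `chmod 777` also makes the file writable and executable by everyone, which is more than is needed here; read access should be enough for Java to open a jar. A narrower alternative, sketched in Python for a POSIX system such as OS X or Linux (the function name is made up for illustration):

```python
import os
import stat


def make_readable(path):
    """Add read permission for owner, group, and others, leaving other bits as they are."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return os.access(path, os.R_OK)  # True if the current user can now read the file
```

The terminal equivalent would be `chmod a+r` followed by the jar's file name.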


0
 

Author Comment

by:sharingsunshine
ID: 41761248
it opened -  Hurray!

(test_env) rjw python expertsBrazil2webdriver.py
12:25:09.171 INFO - Selenium build info: version: '3.0.0-beta2', revision: '2aa21c1'
12:25:09.173 INFO - Launching a standalone Selenium Server
2016-08-18 12:25:09.335:INFO::main: Logging initialized @5823ms
12:25:09.692 INFO - Driver provider org.openqa.selenium.ie.InternetExplorerDriver registration is skipped:
registration capabilities Capabilities [{ensureCleanSession=true, browserName=internet explorer, version=, platform=WINDOWS}] does not match the current platform MAC
12:25:09.694 INFO - Driver provider org.openqa.selenium.edge.EdgeDriver registration is skipped:
registration capabilities Capabilities [{browserName=MicrosoftEdge, version=, platform=WINDOWS}] does not match the current platform MAC
12:25:09.695 INFO - Driver class not found: com.opera.core.systems.OperaDriver
12:25:09.695 INFO - Driver provider com.opera.core.systems.OperaDriver is not registered
2016-08-18 12:25:11.491:INFO:osjs.Server:main: jetty-9.2.15.v20160210
2016-08-18 12:25:12.028:INFO:osjsh.ContextHandler:main: Started o.s.j.s.ServletContextHandler@3a82f6ef{/,null,AVAILABLE}
2016-08-18 12:25:12.899:INFO:osjs.ServerConnector:main: Started ServerConnector@4cc0edeb{HTTP/1.1}{0.0.0.0:56842}
2016-08-18 12:25:12.900:INFO:osjs.Server:main: Started @9388ms
12:25:12.901 INFO - Selenium Server is up and running
12:25:15.640 INFO - SessionCleaner initialized with insideBrowserTimeout 0 and clientGoneTimeout 1800000 polling every 180000
12:25:16.583 INFO - Executing: [new session: Capabilities [{browserName=safari, javascriptEnabled=true, version=, platform=MAC}]])
12:25:17.221 INFO - Creating a new session for Capabilities [{browserName=safari, javascriptEnabled=true, version=, platform=MAC}]
12:25:18.637 INFO - Server started on port 46981
12:25:18.684 INFO - Launching Safari
12:25:18.758 INFO - Waiting for SafariDriver to connect
12:25:25.816 INFO - Connection opened
12:25:25.878 INFO - Driver connected in 7119 ms
12:25:26.146 INFO - Done: [new session: Capabilities [{browserName=safari, javascriptEnabled=true, version=, platform=MAC}]]
12:25:26.209 INFO - Executing: [get: http://www.theherbsplacenews.com/])
12:25:35.139 INFO - Done: [get: http://www.theherbsplacenews.com/]
12:25:35.209 INFO - Executing: [delete session: 52897159-dfd1-4df5-a75f-0f0fd3029a31])
12:25:35.211 INFO - Shutting down
12:25:35.211 INFO - Closing connection
12:25:35.219 INFO - Stopping Safari
12:25:35.296 INFO - Stopping server
12:25:35.296 INFO - Stopping server
12:25:35.375 INFO - Shutdown complete
12:25:35.375 INFO - Done: [delete session: 52897159-dfd1-4df5-a75f-0f0fd3029a31]



The problem wasn't what you said; it was my ignorance. When I saw your command
chmod 777 <jar file name>

it occurred to me that you were using the <> to mark a placeholder, not as part of the syntax. I moved the jar to the script folder, removed the <> from the path, and it worked.

Now where do I go from here?
0
 
LVL 15

Expert Comment

by:Walter Ritzel
ID: 41761268
That's good!
Ok, let's move on.

The next step will depend on how the page is displayed in your Safari.
If the page shows that you are already logged in, your next step will be to identify the link that goes to the list of posts and click on it. The code below does exactly that (the class name is only an example; use the one from your page):
    elem = browser.find_element_by_class_name("btn_list_posts")
    elem.click()



And you'll write a pair of commands like that for each step of your task.

To know which commands to use, please check the documentation at:
http://www.seleniumhq.org/docs/03_webdriver.jsp#selenium-webdriver-api-commands-and-operations
0
 

Author Comment

by:sharingsunshine
ID: 41761355
to get to my dashboard I changed the link to
https://www.blogger.com/blogger.g?blogID=2213276582068581739#allposts

It never opened safari but it showed the safari launcher briefly.  Does something need to be different for https?

(test_env) rjw python expertsBrazil2webdriver.py
13:35:54.471 INFO - Selenium build info: version: '3.0.0-beta2', revision: '2aa21c1'
13:35:54.473 INFO - Launching a standalone Selenium Server
2016-08-18 13:35:54.517:INFO::main: Logging initialized @611ms
13:35:54.644 INFO - Driver provider org.openqa.selenium.ie.InternetExplorerDriver registration is skipped:
registration capabilities Capabilities [{ensureCleanSession=true, browserName=internet explorer, version=, platform=WINDOWS}] does not match the current platform MAC
13:35:54.644 INFO - Driver provider org.openqa.selenium.edge.EdgeDriver registration is skipped:
registration capabilities Capabilities [{browserName=MicrosoftEdge, version=, platform=WINDOWS}] does not match the current platform MAC
13:35:54.645 INFO - Driver class not found: com.opera.core.systems.OperaDriver
13:35:54.645 INFO - Driver provider com.opera.core.systems.OperaDriver is not registered
2016-08-18 13:35:54.756:INFO:osjs.Server:main: jetty-9.2.15.v20160210
2016-08-18 13:35:54.805:INFO:osjsh.ContextHandler:main: Started o.s.j.s.ServletContextHandler@3a82f6ef{/,null,AVAILABLE}
2016-08-18 13:35:54.897:INFO:osjs.ServerConnector:main: Started ServerConnector@35e71ec3{HTTP/1.1}{0.0.0.0:57784}
2016-08-18 13:35:54.898:INFO:osjs.Server:main: Started @992ms
13:35:54.899 INFO - Selenium Server is up and running
13:36:04.049 INFO - SessionCleaner initialized with insideBrowserTimeout 0 and clientGoneTimeout 1800000 polling every 180000
13:36:04.096 INFO - Executing: [new session: Capabilities [{browserName=safari, javascriptEnabled=true, version=, platform=MAC}]])
13:36:04.122 INFO - Creating a new session for Capabilities [{browserName=safari, javascriptEnabled=true, version=, platform=MAC}]
13:36:04.211 INFO - Server started on port 24036
13:36:04.221 INFO - Launching Safari
13:36:04.238 INFO - Waiting for SafariDriver to connect
13:36:07.139 INFO - Connection opened
13:36:07.143 INFO - Driver connected in 2904 ms
13:36:07.260 INFO - Done: [new session: Capabilities [{browserName=safari, javascriptEnabled=true, version=, platform=MAC}]]
13:36:07.278 INFO - Executing: [get: https://www.blogger.com/blogger.g\?blogID=2213276582068581739#allposts])
13:36:07.870 INFO - Done: [get: https://www.blogger.com/blogger.g\?blogID=2213276582068581739#allposts]
13:36:07.901 INFO - Executing: [delete session: a8558bd3-3176-4f5b-994b-8a9b6c805477])
13:36:07.906 INFO - Shutting down
13:36:07.906 INFO - Closing connection
13:36:07.907 INFO - Stopping Safari
13:36:07.974 INFO - Stopping server
13:36:07.975 INFO - Stopping server
13:36:07.985 INFO - Shutdown complete
13:36:07.986 INFO - Done: [delete session: a8558bd3-3176-4f5b-994b-8a9b6c805477]


0
 
LVL 15

Expert Comment

by:Walter Ritzel
ID: 41761365
From the logs, I'm not seeing any errors, so let's try this: first, open the http address, then add a small delay (time.sleep(5) pauses for 5 seconds; you may need to add import time to the import section of the script), and then run the get command for the https address.
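
The suggestion above could be written as a small helper. This is a sketch only; the function name and the injectable `sleep` parameter are illustrative choices, not part of Selenium.

```python
import time


def open_with_pause(browser, first_url, second_url, delay_seconds=5, sleep=time.sleep):
    """Load first_url, wait, then load second_url in the same browser session."""
    browser.get(first_url)
    sleep(delay_seconds)  # fixed pause between the two page loads
    browser.get(second_url)
```

Also note that the script's `finally` block calls `browser.quit()` as soon as the last command returns, and the log shows the session being deleted less than a second after the `get` completed. That may be why Safari only appeared briefly; a pause after the final `get` would keep the window open long enough to see the page.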
0
 

Author Comment

by:sharingsunshine
ID: 41761379
I logged out and then put in the main blogger page
https://www.blogger.com/about/

and it worked fine, looking at the docs - is that how to do the login?
https://gyazo.com/e733118cd9a108c3ff99b992821dc1ec - but substituting the word login in place of the word cheese?
0
 
LVL 15

Expert Comment

by:Walter Ritzel
ID: 41761387
Yes. You'll need to inspect the HTML code to discover the name of the objects on the page, class that you can use to identify the element, etc... It is boring, but it is an effort that you make only once.
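
One way to make that inspection less tedious is to list the form controls found in the page source instead of reading the HTML by eye. A minimal sketch using only the standard library; the class and function names are made up for illustration, and pages that build their forms with JavaScript may need the source taken after the page has finished loading.

```python
from html.parser import HTMLParser


class FieldLister(HTMLParser):
    """Collect (tag, name, id, class) for form controls found in a page."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag in ("input", "textarea", "button", "select"):
            a = dict(attrs)
            self.fields.append((tag, a.get("name"), a.get("id"), a.get("class")))


def list_fields(page_source):
    """Return the form controls in page_source, in document order."""
    lister = FieldLister()
    lister.feed(page_source)
    return lister.fields
```

With a running session, `list_fields(browser.page_source)` would print candidates for the names to pass to the `find_element_by_...` calls.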
0
 

Author Comment

by:sharingsunshine
ID: 41762720
You have been a great help and if I run into any more issues I will post another question.
0
