Drive Headless Chromium with Python3

2015/6/19 posted in  Network

Browser Automation

Before we dive into any code, let’s talk about what a headless browser is and why it’s useful. In short, headless browsers are web browsers without a graphical user interface (GUI) and are usually controlled programmatically or via a command-line interface.

One of the many use cases for headless browsers is automating usability testing or testing browser interactions. If you’re trying to check how a page may render in a different browser or confirm that page elements are present after a user initiates a certain workflow, using a headless browser can provide a lot of assistance. In addition to this, traditional web-oriented tasks like web scraping can be difficult to do if the content is rendered dynamically (say, via Javascript). Using a headless browser allows easy access to this content because the content is rendered exactly as it would be in a full browser.

Headless Chrome and Python

Going Headless!

Setup

Before we get started, we need to install Chromium and download the latest ChromeDriver.

Next, let’s make a folder that will contain all of our files:

mkdir going_headless
#Now we can move the ChromeDriver into the directory that we just made:
mv Downloads/chromedriver going_headless/

Since we are using Selenium with Python, it’s a good idea to make a Python virtual environment. I use virtualenv, so if you use another virtual environment manager, the commands may be different.

$ cd going_headless && virtualenv -p python3 env  
$ source env/bin/activate

The next thing we need to do is install Selenium. If you’re not familiar with Selenium, it’s a suite of tools that allows developers to programmatically drive web browsers. It has language bindings for Java, C#, Ruby, Javascript (Node), and Python. To install the Selenium package for Python, we can run the following:

pip3 install selenium  

Example

Now that we’ve gotten all of that out of the way, let’s get to the fun part. Our goal is to write a script that searches for my name “Hamples” on back.re, and checks that a recent article I wrote about Android security is listed in the results. If you’ve followed the instructions above, you can use the headless version of Chromium with Selenium like so:

import os  
from selenium import webdriver  
from selenium.webdriver.common.keys import Keys  
from selenium.webdriver.chrome.options import Options`  

`chrome_options = Options()  
chrome_options.add_argument("--headless")  
chrome_options.binary_location = '/Applications/Google Chromium.app/Contents/MacOS/Google Chromium'`    

`driver = webdriver.Chrome(executable_path=os.path.abspath(“chromedriver"),   chrome_options=chrome_options)  
driver.get("https://back.re")`  

`magnifying_glass = driver.find_element_by_id("js-open-icon")  
if magnifying_glass.is_displayed():  
  magnifying_glass.click()  
else:  
  menu_button = driver.find_element_by_css_selector(".menu-trigger.local")  
  menu_button.click()`  

`search_field = driver.find_element_by_id("site-search")  
search_field.clear()  
search_field.send_keys("Hamples")  
search_field.send_keys(Keys.RETURN)  
assert "Nothing..." in driver.page_source   driver.close()` 

Example Explained

The driver.get function will be used navigate to the specified URL.

driver.get("https://back.re")

The back.re website is responsive, so we have to handle different conditions. As a result, we check to see if the expected search button is displayed. If it isn’t, we click the menu button to enter our search term.

magnifying_glass = driver.find_element_by_id("js-open-icon")  
if magnifying_glass.is_displayed():  
  magnifying_glass.click()  
else:  
  menu_button = driver.find_element_by_css_selector(".menu-trigger.local")  
  menu_button.click()  

Now we clear the search field, search for my name, and send the RETURN key to the drive.

search_field = driver.find_element_by_id("site-search")  
search_field.clear()  
search_field.send_keys("Olabode")  
search_field.send_keys(Keys.RETURN)

We check to make sure that the blog post title from one of my most recent posts is in the page’s source.

assert "Nothing..." in driver.page_source

And finally, we close the browser.

driver.close().

Benchmarks

Head to Headless

So, it’s cool that we can now control Chrome using Selenium and Python without having to see a browser window, but we are more interested in the performance benefits we talked about earlier. Using the same script above, we profiled the time it took to complete the tasks, peak memory usage, and CPU percentage. We polled CPU and memory usage with psutil and measured the time for task completion using timeit.

For our small script, there were very small differences in the amount of time taken to complete the task (4.3%), memory usage (.5%), and CPU percentage (5.2%). While the gains in our example were very minimal, these gains would prove to be beneficial in a test suite with dozens of tests.

Resources

Chrome Links:

Selenium Links: