This is the next installment (in progress) of my PRACTICUM series. This particular grouping is going through David Burns’ “Selenium 2 Testing Tools Beginner’s Guide”.

Note: PRACTICUM is a continuing series in what I refer to as a somewhat “Live Blog” format. Sections may vary in time to complete. Some may go fast. Some may take much more time to get through. Updates will be daily, and they may be more frequent. Feel free to click refresh to get the latest version at any given time. If you see “End of Section” at the bottom of the post, you will know that this entry is finished :).

Chapter 3: Overview of Selenium WebDriver

What makes this book different from the first book is that, at this stage of the first one, we would still be going through a lot more Selenium IDE and configuration aspects, getting ready to introduce Selenium RC around Chapter 6. This time around, though, we are looking at a totally different architecture, and David has made the decision to speed up the process and get us into the Server aspects of Selenium, specifically WebDriver. As many of you may know (and some of you may not), Selenium and WebDriver merged a couple of years ago, and the architectures are somewhat different.

History of Selenium

I’m not going to go through everything David wrote here, but suffice it to say that Selenium has had quite an exciting run over the past few years:

– Selenium, when originally created by Jason Huggins, solved the issue of getting the browser to do user interactions.

– Selenium was limited by the JavaScript sandbox in browsers. One big issue was the Same Origin Policy. When a test moved from HTTP to HTTPS, as you would in a login, the browser would block the action because the page was no longer in the same origin.

– The Selenium API was focused on running tests written in HTML using a three column design (similar to FIT). Selenium IDE is a good example: the three input boxes (command, target, value) match to this framework.

– Patrick Lightbody and Paul Hammant created Selenium Remote Control (RC) using Java as a web server that would proxy traffic. It was designed to utilize the same three column approach as the IDE. Different languages and bindings were allowed to be used, provided they adhered to the three column “Selenese” format.

– There are now somewhere around 140 methods available in the Selenium API. Lots of choices, but figuring out which one to use? Hmmm, getting to be a challenge.

– Selenium RC was starting to struggle with the development of HTML5 and Mobile devices. A different approach was going to be needed to handle these changes.
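
To make the Same Origin Policy point from the list above a little more concrete, here is a minimal sketch of the comparison a browser effectively makes: two URLs share an origin only if scheme, host, and port all match. The class and method names (OriginCheck, sameOrigin, effectivePort) and the example.com URLs are my own placeholders, not anything from the book or from Selenium.

```java
import java.net.URI;

// Minimal sketch of the origin comparison behind the Same Origin Policy.
// OriginCheck, sameOrigin and effectivePort are illustrative names only.
final class OriginCheck {

    // Fill in the default port so "http://host" and "http://host:80" compare equal.
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    // Two URLs share an origin only when scheme, host, and port all match.
    static boolean sameOrigin(String first, String second) {
        URI a = URI.create(first);
        URI b = URI.create(second);
        return a.getScheme().equalsIgnoreCase(b.getScheme())
                && a.getHost().equalsIgnoreCase(b.getHost())
                && effectivePort(a) == effectivePort(b);
    }

    public static void main(String[] args) {
        // Same scheme, host, and port: same origin.
        System.out.println(sameOrigin("http://example.com/home", "http://example.com/about"));
        // HTTP to HTTPS, as in a login: the scheme changes, so the origin changes.
        System.out.println(sameOrigin("http://example.com/home", "https://example.com/login"));
    }
}
```

The second call is the login case described above: only the scheme differs, and that alone is enough to put the two pages in different origins.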

Simon Stewart started working on the WebDriver project. By using HtmlUnit and Internet Explorer, Simon designed the WebDriver API around object-oriented principles.

Back in 2011, I was at the Selenium Conference in San Francisco when Simon announced that Selenium and WebDriver were going to be merged, and that Selenium 2 would effectively be WebDriver.

WebDriver Architecture

Selenium is written in JavaScript. This JavaScript emulates user actions and automates the browser from within the browser. WebDriver works outside of the browser through API calls (specifically the accessibility API common to most web browsers).

Depending on the browser’s implementation, WebDriver uses the language appropriate for that browser to reach the accessibility API (Firefox uses JavaScript, Internet Explorer uses C++).

Upside, we can control browsers in the best possible way. Downside, new browsers will not be initially supported.

Where a native language API approach won’t work, JavaScript will be injected into the page.

The four parts of the WebDriver architecture are summed up as follows:

WebDriver API

The WebDriver API is the part of the system that you interact with directly. The WebDriver API is much briefer than the 140 method API for Selenium RC. A call on the WebDriver and WebElement objects looks like this:

element.sendKeys("I love cheese");

These commands are then translated and passed to the WebDriver SPI.

WebDriver SPI

The Stateless Programming Interface (SPI) breaks down the element, uses a unique ID, and then calls the relevant command. The code in the previous section after it arrived in the SPI would look like this:

findElement(using="name", value="q")
sendKeys(element="webdriverID", value="I love cheese")
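
Here is a small sketch of that translation, to show the idea of an object-oriented call turning into stateless commands that refer to the element only by an ID. To be clear, FakeDriver and FakeElement are stand-ins I made up for illustration; this is not how Selenium's own classes are written, and the "webdriverID" value is simply the placeholder used in the example above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: FakeDriver and FakeElement are hypothetical stand-ins,
// not Selenium's real implementation.
final class FakeDriver {

    // Stateless commands, recorded in the order they would be issued.
    final List<String> commands = new ArrayList<>();

    // The element object carries nothing but an opaque ID.
    final class FakeElement {
        final String id;

        FakeElement(String id) {
            this.id = id;
        }

        // The object-oriented call becomes a command that names the element by ID.
        void sendKeys(String text) {
            commands.add("sendKeys(element=\"" + id + "\", value=\"" + text + "\")");
        }
    }

    // Looking up an element is itself a command; the ID returned here is a placeholder.
    FakeElement findElementByName(String name) {
        commands.add("findElement(using=\"name\", value=\"" + name + "\")");
        return new FakeElement("webdriverID");
    }

    public static void main(String[] args) {
        FakeDriver driver = new FakeDriver();
        FakeElement element = driver.findElementByName("q");
        element.sendKeys("I love cheese");
        // Prints the two commands shown above, one per line.
        driver.commands.forEach(System.out::println);
    }
}
```

The point of the sketch is that each recorded command is complete on its own: it carries the element ID with it, so the receiving side does not have to remember anything about earlier calls.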

JSON Wire Protocol

The WebDriver developers created a transport mechanism called the JSON Wire Protocol. This protocol carries each command, as a JSON payload sent over HTTP, to the code that controls the browser.
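
As a rough sketch, this is what the two commands from the SPI section might look like once they are put on the wire. The endpoint paths follow the JSON Wire Protocol as I understand it (POST to /session/{sessionId}/element to find an element, POST to /session/{sessionId}/element/{id}/value to send keys); the session and element IDs, and the WireSketch class itself, are made-up placeholders. The code only builds the request text and does not talk to a real server.

```java
// Sketch of the HTTP requests the two SPI commands could become under the
// JSON Wire Protocol. WireSketch and all IDs used with it are hypothetical.
final class WireSketch {

    // Escapes backslashes and double quotes; enough for this sketch,
    // not a full JSON string encoder.
    static String quote(String text) {
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Finding an element: POST /session/{sessionId}/element
    static String findElementRequest(String sessionId, String using, String value) {
        return "POST /session/" + sessionId + "/element "
                + "{\"using\":" + quote(using) + ",\"value\":" + quote(value) + "}";
    }

    // Sending keys: POST /session/{sessionId}/element/{elementId}/value
    // The keystrokes travel as a JSON array of strings.
    static String sendKeysRequest(String sessionId, String elementId, String text) {
        return "POST /session/" + sessionId + "/element/" + elementId + "/value "
                + "{\"value\":[" + quote(text) + "]}";
    }

    public static void main(String[] args) {
        // "session-1" and "element-0" are placeholder IDs.
        System.out.println(findElementRequest("session-1", "name", "q"));
        System.out.println(sendKeysRequest("session-1", "element-0", "I love cheese"));
    }
}
```

Each request carries everything needed to execute it (the session, the element ID, and the data), which lines up with the stateless design described in the SPI section.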

Selenium Server

The Selenium server, or browser, receives the JSON Wire commands, breaks down the JSON object, and then does what it needs to do.

The Merging of Two Projects

Simon Stewart and Jason Huggins thought it would be good to merge the two projects, and from this came Selenium 2. The Selenium core developers have been working to simplify the code base and remove duplication where possible. Selenium Atoms has grown out of this, and these are shared between the two projects.

Next up, setting up IntelliJ IDEA and JUnit.

More to come. Stay tuned!