Hi everyone,
We've been making good progress on WebDriver and stabilising the code and improving performance. We're about to land some pretty major changes to the InternetExplorerDriver that make it safe to use from multiple (Java) threads. With all this progress, I think it makes sense to start the process of merging the projects. The following major technical work items are left on my list:
Add support to WebDriver for handling JavaScript alerts, prompts and confirmations. I have a desired API, which I can cover on another thread.
Complete the WebDriverBackedSelenium. This is an implementation of the "Selenium" interface in Java that uses WebDriver for the actual work (so no need for a Selenium RC server)
Start a SeleniumBackedWebDriver implementation. This will allow us to use the WebDriver interfaces on browsers where we haven't written, or can't write, a native driver.
Add a "TableRunner", allowing us to run the normal Selenium Core tests with WebDriver (actually the Selenium RC interfaces and a bunch of reflection)
Add more language support to WebDriver. There are branches for .Net and Python and, once those are done, we also want support for Ruby.
Integrating with Selenium Grid is something that I would love to see. Philippe, let's get started on this!
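To make the WebDriverBackedSelenium item a little more concrete, here is a minimal sketch of the adapter shape, written in Python for brevity rather than the Java of the real class. Every name in it (`FakeElement`, `FakeDriver`, `DriverBackedSelenium`, and the way `id=`/`name=` locator strings are split) is a hypothetical stand-in and not the actual WebDriver or Selenium API. It only shows locator-string commands being translated into calls on driver and element objects, with no server in between.

```python
class FakeElement:
    """Hypothetical stand-in for an element handle returned by a driver."""

    def __init__(self):
        self.text = ""
        self.clicks = 0

    def click(self):
        self.clicks += 1

    def send_keys(self, value):
        self.text += value


class FakeDriver:
    """Hypothetical in-memory driver; a real one would talk to a browser."""

    def __init__(self, elements):
        # Maps (strategy, value) pairs such as ("id", "q") to elements.
        self._elements = elements

    def find_element(self, strategy, value):
        return self._elements[(strategy, value)]


class DriverBackedSelenium:
    """Selenium-style locator-string commands implemented on a driver object."""

    def __init__(self, driver):
        self._driver = driver

    def _locate(self, locator):
        # Split "id=q" into ("id", "q"). Treating a bare locator as an id
        # is an assumption made for this sketch only.
        strategy, separator, value = locator.partition("=")
        if not separator:
            strategy, value = "id", locator
        return self._driver.find_element(strategy, value)

    def click(self, locator):
        self._locate(locator).click()

    def type(self, locator, value):
        self._locate(locator).send_keys(value)


search_box, button = FakeElement(), FakeElement()
driver = FakeDriver({("id", "q"): search_box, ("name", "go"): button})
selenium = DriverBackedSelenium(driver)
selenium.type("id=q", "cheese")
selenium.type("q", " please")
selenium.click("name=go")
print(search_box.text, button.clicks)
```

Existing tests written against the string-based commands would keep calling `click` and `type`; only the object behind them changes.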
Open questions include:
When do we move the WebDriver source from Google Code to Open QA? Obviously, after 1.0 is a good idea, so perhaps this can wait until then.
What will happen to the Google Code website? I'm quite fond of it, but there's probably a discussion about how best to import the wiki and issues from Google Code to Open QA's infrastructure. I may just sit up one night and copy the issues across. This is something that we've chewed over before, so I'm happy to continue those discussions outside of this thread.
There are a number of WebDriver contributors. It seems reasonable to me if they became Selenium contributors too, especially once we merge the subversion repositories.
I'd deeply appreciate some help with getting all this sorted out, because that's actually a fairly large piece of work. It's also an interesting piece of work! If anyone wants to get involved, now's the perfect time to do so. Who's interested in joining in?
Simon
When do we move the WebDriver source from Google Code to Open QA? Obviously, after 1.0 is a good idea, so perhaps this can wait until then.
I say we leave it as is until it's time to actually merge them. At that point, we won't need to create a new project but instead can just create a branch in one of the selenium projects (probably RC) and use that moving forward.
What will happen to the Google Code website? I'm quite fond of it, but there's probably a discussion about how best to import the wiki and issues from Google Code to Open QA's infrastructure. I may just sit up one night and copy the issues across. This is something that we've chewed over before, so I'm happy to continue those discussions outside of this thread.
This probably relates to the overall rebranding and update of openqa.org. We all pretty much agree that once Watir leaves OpenQA (they are in the process of doing that), that rebranding OpenQA to Selenium makes sense. In doing that, we'd make http://selenium.openqa.org be the main website and likely merge the content from the other projects in to that template. At that point, we'll likely have a good opportunity to upgrade/update the infrastructure bits (website, bug tracker, forums, etc), including pulling in anything we like from Google Code or even adopting parts of it.
There are a number of WebDriver contributors. It seems reasonable to me if they became Selenium contributors too, especially once we merge the subversion repositories.
Yes, I agree. Just email me their info and I can take care of that.
In general, I'm really excited about the work you're doing. I know we discussed in the past different ways to merge WebDriver in to Selenium, but perhaps you can reiterate your vision for how this will all work. Keep in mind the recent work Dan and Philippe have been doing. Specifically, they modified the default modes of RC to use HTA mode, Chrome mode, and a privileged Safari mode.
My feeling is that WebDriver integration comes in two parts:
Improved browser automation (speed, reliability, additional automation outside of the DOM). Because chrome mode is pretty reliable already and WebDriver doesn't have great Safari support, my instinct is that the first candidate to replace is HTA mode and get WebDriver to become the IE automation engine for RC.
Improved APIs for all languages. I think at a high level, everyone agrees that there are two ways to run Selenium today: using the simple language in IDE or using a programming language with RC. Unfortunately, the RC API sucks, so adopting the WebDriver-style API would be a big win for RC.
I understand that RC isn't the ideal place to do the integration, since part of what WebDriver brings to the table is not being "remote", so we need to figure out a way to reconcile that in to the overall vision. I don't have a great solution, but I figure it probably involves a new project as well as possibly merging RC and Grid together to a single "remote" solution.
Anyone else have thoughts?
Simon, how are you finding the work distribution for WebDriver work? Not being much tapped in to the state of the WebDriver community / participation, it does sometimes seem like you're waging a one man battle.
I'd personally like to get a bit familiar with the WebDriver code - are there discrete, relatively small pieces of work you think you can section off and dole out?
Unfortunately, the RC API sucks, so adopting the WebDriver-style API would be a big win for RC.
Perhaps more could be done with our client drivers to make the API seem less awkward ... for example I see Philippe has recently been doing some work to enhance the usability of the Ruby driver.
Haw-Bin
Unfortunately, the RC API sucks, so adopting the WebDriver-style API would be a big win for RC.
Perhaps more could be done with our client drivers to make the API seem less awkward ... for example I see Philippe has recently been doing some work to enhance the usability of the Ruby driver.
Yes, I think the general idea is to get the WebDriver API and the RC client drivers more in line. The only disconnect I see that we need to resolve is Simon's desire to allow WebDriver to work without any need to communicate over a wire protocol like SRC currently does. Other than that, I envision that "Selenium 2.0" (purposely defined as a broad and forward looking name - don't know what it'll really be yet) is very similar in my mind and Simon's mind: good APIs and solid IE, FF, and Safari support.
What will happen to the Google Code website? I'm quite fond of it, but there's probably a discussion about how best to import the wiki and issues from Google Code to Open QA's infrastructure. I may just sit up one night and copy the issues across. This is something that we've chewed over before, so I'm happy to continue those discussions outside of this thread.
This probably relates to the overall rebranding and update of openqa.org. We all pretty much agree that once Watir leaves OpenQA (they are in the process of doing that), that rebranding OpenQA to Selenium makes sense. In doing that, we'd make http://selenium.openqa.org be the main website and likely merge the content from the other projects in to that template. At that point, we'll likely have a good opportunity to upgrade/update the infrastructure bits (website, bug tracker, forums, etc), including pulling in anything we like from Google Code or even adopting parts of it.
OK. I'll wait until the Great Rebranding before trying to move anything. Sounds like the easiest thing to do as well.
In general, I'm really excited about the work you're doing. I know we discussed in the past different ways to merge WebDriver in to Selenium, but perhaps you can reiterate your vision for how this will all work. Keep in mind the recent work Dan and Philippe have been doing. Specifically, they modified the default modes of RC to use HTA mode, Chrome mode, and a privileged Safari mode.
Providing as flawless an experience as possible for our users is essential: Dan and Philippe have done entirely the right thing.
I see the process taking a reasonable amount of time to complete and being a gradual one. The plan in my head looks a bit like this:
Selenium 1.5: Merge the source trees and include the WebDriver interfaces and classes in the Selenium RC downloads. This is probably the crudest possible way to "merge" the projects, but it's a start. I would expect the "WebDriverBackedSelenium" class to pass the Selenium tests, but I'd be amazed if it was flawless! I would love to see a "SeleniumBackedWebDriver" implementation too (anyone want to write that?)
Selenium 1.6+: Start the process of merging the projects more closely. The initial focus would be to make the default implementation of the Selenium RC IE implementation use the WebDriver tech. We can also start to expose some of the WebDriver capabilities to Selenium, in particular our keyboard and mouse input methods (which will be completely native on Mac, Linux and Windows by this point)
Selenium 2.0: A major revision number is our chance to review both the dictionary and object-based APIs, reassessing them in the light of how people use them and ensuring that they meet the original vision.
My feeling is that WebDriver integration comes in two parts:
Improved browser automation (speed, reliability, additional automation outside of the DOM). Because chrome mode is pretty reliable already and WebDriver doesn't have great Safari support, my instinct is that the first candidate to replace is HTA mode and get WebDriver to become the IE automation engine for RC.
Agreed. HTA mode is the first candidate for moving over to the WebDriver tech. I suspect that the Firefox Chrome and FirefoxDriver will borrow aspects from each other: we're going to end up with a hybrid core of functionality and specific parts tailored to each of the APIs.
Improved APIs for all languages. I think at a high level, everyone agrees that there are two ways to run Selenium today: using the simple language in IDE or using a programming language with RC. Unfortunately, the RC API sucks, so adopting the WebDriver-style API would be a big win for RC.
I agree that adopting the WebDriver API would be a big win for RC
I'm looking to support Java, Python, Ruby and C# as "first class citizens" of the WebDriver API. We've made a start with the Python bindings, as well as with the C# binding. I'm happy to go into detail about how we're planning on facilitating this if anyone's keen to join in the fun!
I understand that RC isn't the ideal place to do the integration, since part of what WebDriver brings to the table is not being "remote", so we need to figure out a way to reconcile that in to the overall vision. I don't have a great solution, but I figure it probably involves a new project as well as possibly merging RC and Grid together to a single "remote" solution.
You'll be pleased to hear that we've started addressing this issue already: there's a WebDriver Remote implementation that's already in trunk. It should be borne in mind that eventually we're always going to be "local" to a browser, even if there's a chain of servers between it and the user, and we take advantage of this. Having said that, in the common case, where tests are running on the local machine, removing the server process is a win for the user, if only because it reduces complexity and there's one less thing to forget to start. In the case where there's a central server (or grid), having the remote interfaces is a real boon.
gyrm wrote:
Simon, how are you finding the work distribution for WebDriver work? Not being much tapped in to the state of the WebDriver community / participation, it does sometimes seem like you're waging a one man battle.
Here's the number of commits by developer from last month:
1 : alexis.j.vuillemin
2 : michael.tamm2
3 : amorujao
13 : Jiayao.Yu
15 : simon.m.stewart
Michael and Jiayao are particularly active, and Alexis has made a major contribution with his reworking of practically the entire IE codebase. I should also add that there's another person from Oz who regularly sends patches too.
I'd personally like to get a bit familiar with the WebDriver code - are there discrete, relatively small pieces of work you think you can section off and dole out?
Absolutely! There's a list of issues that provide an "easy in" with the Getting Involved list. To be honest, I think you'll be able to pick up any of the Firefox issues too, if there's something that interests you. The internals of the FirefoxDriver have grown organically, and could probably do with a good clean up: some help there would be deeply appreciated, as would identifying how we can add support for the WebDriver API to the Selenium IDE.
We're working on improving the developer documentation, so if you find it hard to get started, please let me know because I'll view it as a bug that needs fixing
Unfortunately, the RC API sucks, so adopting the WebDriver-style API would be a big win for RC.
Perhaps more could be done with our client drivers to make the API seem less awkward ... for example I see Philippe has recently been doing some work to enhance the usability of the Ruby driver.
We're in a difficult position because people have got a large amount of code that relies on the existing APIs (particularly the Java one IME). Having said that, I'd love to help slim the dictionary-based API down. Perhaps that's a topic for another thread?
Hi Simon,
What do you mean by merging? Merging as a replacement, as an option, or as a real merge that takes the best of the two frameworks? Are they complementary, competing or partially overlapping solutions?
I've searched the Selenium forum for WebDriver, but I found fewer than 20 matching entries, and apart from this thread I hadn't heard about it, so I started researching.
I found a funny discussion about Selenium RC vs WebDriver at GTAC 2007 between you and Jason, but I'm still not sure what the main benefits are over Selenium RC, which I use in several projects, so I also don't know whether it would be beneficial to switch.
What I read in the FAQ as the main difference is that WebDriver controls the browser via the native API, I guess in a similar way to Watir or WinRunner. Both had drawbacks too, as does Selenium, though its limitations are different.
To make a long story short: I'd love to read a feature comparison matrix to be able to decide which one to use in different projects.
I hope I haven't missed too much of the information that's currently available; if I have, I'd appreciate links to existing content.
Many thanks in advance,
Andras
Simon,
Glad you and I are on the same page. Philippe is on vacation for 2 weeks so I don't think we'll hear from him right away. But hopefully the rest of the team can tell us what they think.
For me, I think the trickiest thing will be getting a non-server solution working for all the major languages and browsers. The reason we went with the "SeleniumServer" module was because it allowed us to extract 99% of the complexity of automating the browsers via a proxy to a standard runtime. Perhaps initially we'd use your native hooks for IE and then for other browsers we'd have a Ruby/PHP/Python/etc wrapper that just kicks off a SeleniumServer as needed and sends HTTP calls to localhost?
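A rough sketch of the "kick off a server only as needed" part of that idea, in Python. The helper names `is_listening` and `ensure_server`, and the `start_server` callback, are hypothetical; how a real wrapper would actually launch SeleniumServer is deliberately left out, and the demonstration uses a throwaway local socket in place of a server.

```python
import socket


def is_listening(host, port, timeout=0.5):
    """Return True if something accepts TCP connections on (host, port)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def ensure_server(host, port, start_server):
    """Call start_server() only when nothing is listening yet.

    Returns True if the server had to be started, False otherwise.
    """
    if is_listening(host, port):
        return False
    start_server()
    return True


# Demonstration: a plain listening socket plays the part of a running server.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

started = []
while_running = ensure_server("127.0.0.1", port, lambda: started.append("started"))
listener.close()
after_shutdown = ensure_server("127.0.0.1", port, lambda: started.append("started"))
print(while_running, after_shutdown, started)
```

The language-specific wrapper would then send its usual HTTP calls to localhost, whether or not it was the one that started the process.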
Patrick
andras.hatvani wrote:
Hi Simon,
What do you mean by merging? Merging as a replacement, as an option, or as a real merge that takes the best of the two frameworks? Are they complementary, competing or partially overlapping solutions?
I'm talking about doing a real merge. There are cases where the current Selenium approach is "better" than the WebDriver approach (such as running on the iPhone!), whereas there are cases where the WebDriver approach is more appropriate or capable than Selenium's (such as the IE driver). The strengths of one address the weaknesses of the other. I really believe that it's going to be great for our users.
What I read in the FAQ as the main difference is that WebDriver controls the browser via the native API, I guess in a similar way to Watir or WinRunner. Both had drawbacks too, as does Selenium, though its limitations are different.
To make a long story short: I'd love to read a feature comparison matrix to be able to decide which one to use in different projects.
The differences can be listed like so:
WebDriver drives the browser natively, and can take advantage of the extended capabilities that this offers (such as avoiding the single host origin problem)
It is a lot easier to support new browsers using Selenium's codebase --- all browsers run Javascript.
The WebDriver API takes a fundamentally different approach to Selenium's. Which you prefer is a choice you and your team can make.
Selenium RC uses codegen to create the language bindings. This is fast, but not necessarily particularly idiomatic. We're hand-writing the other language bindings for WebDriver.
When running the browser on the localhost, WebDriver doesn't require an external process
I will put together a more complete feature matrix. Is there anything in particular that you're keen on seeing?
plightbo wrote:
For me, I think the trickiest thing will be getting a non-server solution working for all the major languages and browsers. The reason we went with the "SeleniumServer" module was because it allowed us to extract 99% of the complexity of automating the browsers via a proxy to a standard runtime. Perhaps initially we'd use your native hooks for IE and then for other browsers we'd have a Ruby/PHP/Python/etc wrapper that just kicks off a SeleniumServer as needed and sends HTTP calls to localhost?
The approach we're taking with WebDriver is to push as much of the smarts as we can into the native driver. This means that the Java bindings tend to be skinny wrappers that prepare messages, shunt them to the actual driver, and then read the results. The plan for supporting other languages looks like:
IE: C++ DLL accessed via DL (Ruby), ctypes (Python), pinvoke (C#) and JNA (Java)
Firefox: XPI listening on port, accessed via sockets
Writing a new language binding falls into two parts. First of all, there's the matter of writing a suitably idiomatic API that feels natural to long-time users of the language. Secondly, there's actually implementing this to support the different browsers. The approach we've taken means that although the first part will require a lot of effort, the second should be a lot easier than it might otherwise be.
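To illustrate the split, here is a minimal Python sketch of a "skinny wrapper": an idiomatic surface (`get`, `title`) on top of a single method that prepares a message, shunts it to the driver over a socket, and reads the result. The message fields, the newline-delimited JSON framing, the `ThinDriver` class and the `fake_driver` function that stands in for the browser side are all assumptions made for the example, not the real Firefox extension protocol.

```python
import json
import socket
import threading


def fake_driver(conn):
    """Hypothetical browser-side driver: one JSON command per line in,
    one JSON result per line out."""
    with conn, conn.makefile("r", encoding="utf-8") as reader, \
            conn.makefile("w", encoding="utf-8") as writer:
        for line in reader:
            command = json.loads(line)
            if command["name"] == "get":
                result = {"status": "ok", "value": None}
            elif command["name"] == "title":
                result = {"status": "ok", "value": "Example page"}
            else:
                result = {"status": "error", "value": "unknown command"}
            writer.write(json.dumps(result) + "\n")
            writer.flush()


class ThinDriver:
    """Language-side binding: prepares a message, ships it, reads the reply."""

    def __init__(self, sock):
        self._sock = sock
        self._reader = sock.makefile("r", encoding="utf-8")
        self._writer = sock.makefile("w", encoding="utf-8")

    def _execute(self, name, **parameters):
        message = {"name": name, "parameters": parameters}
        self._writer.write(json.dumps(message) + "\n")
        self._writer.flush()
        reply = json.loads(self._reader.readline())
        if reply["status"] != "ok":
            raise RuntimeError(reply["value"])
        return reply["value"]

    # The idiomatic surface that users of the language actually see.
    def get(self, url):
        self._execute("get", url=url)

    @property
    def title(self):
        return self._execute("title")

    def close(self):
        self._reader.close()
        self._writer.close()
        self._sock.close()


# A connected socket pair replaces a real listening port for the demo.
client_sock, driver_sock = socket.socketpair()
worker = threading.Thread(target=fake_driver, args=(driver_sock,))
worker.start()

driver = ThinDriver(client_sock)
driver.get("http://example.com/")
page_title = driver.title
driver.close()
worker.join()
print(page_title)
```

Only the public methods need to be rewritten to feel natural in each language; `_execute` is the small part that has to be ported per transport.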
One way we're planning on reducing the overhead still further is to ensure that both the Firefox and Remote WebDrivers speak the same "wire protocol": that is, that they both accept the same JSON structures sent over HTTP. This is something I'd like to see completed before we release 2.0. The advantages to this are many, obvious and very much the same as the SeleniumServer approach.
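As a rough illustration of what sharing a wire protocol buys us, the Python sketch below has two unrelated back ends accept the same JSON command structure through one decoding function. The field names (`command`, `parameters`), the function names and the two back-end classes are invented for the example and are not the actual WebDriver protocol; the HTTP transport is left out so that only the shared message shape is shown.

```python
import json


def encode_command(command, **parameters):
    """Client side: build the JSON body that would be sent over HTTP."""
    return json.dumps({"command": command, "parameters": parameters},
                      sort_keys=True)


def decode_command(body):
    """Server side: validate the shared structure once, for every back end."""
    message = json.loads(body)
    if not isinstance(message.get("command"), str):
        raise ValueError("missing command name")
    return message["command"], message.get("parameters", {})


class FirefoxLikeBackend:
    """Hypothetical back end that would live inside the browser extension."""

    def handle(self, body):
        command, parameters = decode_command(body)
        return json.dumps({"handled_by": "firefox", "command": command,
                           "parameters": parameters}, sort_keys=True)


class RemoteLikeBackend:
    """Hypothetical back end that would forward the command to another machine."""

    def handle(self, body):
        command, parameters = decode_command(body)
        return json.dumps({"handled_by": "remote", "command": command,
                           "parameters": parameters}, sort_keys=True)


# One encoded body, understood unchanged by both back ends.
body = encode_command("get", url="http://example.com/")
replies = [json.loads(backend.handle(body))
           for backend in (FirefoxLikeBackend(), RemoteLikeBackend())]
print(body)
print([reply["handled_by"] for reply in replies])
```

A client binding written against `encode_command` never needs to know which of the two back ends it is talking to.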
Simon,
Thanks for your reply. An additional benefit you pointed out in the F.A.Q. is the way of handling JS.
Selenium's keypress event handling mechanism is still unreliable, so this would be an area where we could benefit from using WebDriver.
I'll experiment with WebDriver and study the code base, and should questions arise I'll contact you.
Cheers,
Andras
