Leveraging Artificial Intelligence for Appium Test Automation


Perhaps the most popular technology buzzword these days is “AI” (Artificial Intelligence) or “ML” (Machine Learning). The term Artificial Intelligence, or just AI, is used broadly to refer to almost any sort of Machine Learning program. Machine Learning is a powerful technique within AI that uses data and examples to learn patterns and structure in the environment.

Let’s delve deeper into how one can use AI with Appium! It might sound unusual, but the Appium project has developed an AI-powered element finding plugin specifically for use with Appium.

First, let’s discuss what an “element finding plugin” is. Appium recently added the ability for third-party developers to create “plugins” that can use an Appium driver together with their own unique capabilities to find elements. As shown later in this article, users can access these plugins simply by installing the plugin as an NPM module in their Appium directory, and then using the customFindModules capability to register the plugin with the Appium server.

The first plugin that worked within this new structure is one that incorporates a Machine Learning model designed to classify app icons, the training data for which was recently open-sourced. Given an icon as input, this model can tell you what the icon represents (for example, a shopping cart button or a back arrow button). The application built on this model is the Appium Classifier Plugin, which conforms to the new element finding plugin format.

Basically, you can use this plugin to find icons on the screen based on their appearance, without knowing anything about the structure of the app or needing to ask developers for internal identifiers to use as selectors. As of now, the plugin is limited to finding elements by their visual appearance, so it only works for elements which display a single icon. Fortunately, these kinds of elements are pretty common in mobile apps.

This approach is more flexible than existing locator strategies (like accessibility id or image) in many cases, because the AI model is trained to recognize icons without needing any context and without requiring them to match one precise image style. This means that using the plugin to find a “cart” icon will work across apps and platforms, without worrying about minor visual differences.

Let’s take a look at a concrete example by demonstrating the simplest possible use case. If you fire up an iOS simulator, you have access to the Photos application.

Notice the little magnifying glass icon near the top of the app which, when clicked, opens up a search bar.

Now, let’s write a test that uses the new plugin to find and then click on that icon. First, you need to follow the set-up instructions to make sure everything works. Then, you can set up the desired capabilities for running a test against the Photos app:

DesiredCapabilities caps = new DesiredCapabilities();
caps.setCapability("platformName", "iOS");
caps.setCapability("platformVersion", "11.4");
caps.setCapability("deviceName", "iPhone 6");
caps.setCapability("bundleId", "");

Now, you need to add some new capabilities: customFindModules (so that Appium knows about the AI plugin you want to use), and shouldUseCompactResponses (because the plugin’s set-up instructions require you to set this capability):

HashMap<String, String> customFindModules = new HashMap<>();
customFindModules.put("ai", "test-ai-classifier");
caps.setCapability("customFindModules", customFindModules);
caps.setCapability("shouldUseCompactResponses", false);

You can see that customFindModules is a capability which has some internal structure. In this case, “ai” is the shortcut name for the plugin, which you use inside the test as the selector prefix. “test-ai-classifier” is the fully-qualified reference, that is, the name of the installed NPM module, which Appium needs in order to locate and load the plugin when you request elements with it.
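The nested shape of these capabilities can be sketched with plain Java collections. This is a minimal illustration, not the plugin's own code: the class name FindPluginCaps and the helper withFindPlugin are hypothetical, and the resulting map would still have to be copied into the DesiredCapabilities object used in the test.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FindPluginCaps {

    /**
     * Builds the extra capabilities for a single find plugin (hypothetical helper).
     *
     * @param shortcut   short name used as the selector prefix in tests, e.g. "ai"
     * @param moduleName name of the installed NPM module, e.g. "test-ai-classifier"
     */
    public static Map<String, Object> withFindPlugin(String shortcut, String moduleName) {
        // customFindModules is itself a map: shortcut name -> module reference
        Map<String, String> customFindModules = new LinkedHashMap<>();
        customFindModules.put(shortcut, moduleName);

        Map<String, Object> caps = new LinkedHashMap<>();
        caps.put("customFindModules", customFindModules);
        // Set to false, as in the capabilities shown in the article
        caps.put("shouldUseCompactResponses", false);
        return caps;
    }

    public static void main(String[] args) {
        System.out.println(withFindPlugin("ai", "test-ai-classifier"));
    }
}
```

Running it prints {customFindModules={ai=test-ai-classifier}, shouldUseCompactResponses=false}, which makes the structure visible: the shortcut name is the key and the module name is the value.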

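Once you are done with this, finding the element takes a single find call that uses Appium’s custom locator strategy together with a selector naming the icon you want.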



The custom locator strategy tells Appium that you want a plugin, and not one of its built-in locator strategies. You then prefix the selector with ai: to let Appium know which plugin you want to use for this request (because more than one could be registered). Since this test registers only one plugin, you could do away with the prefix (and, for good measure, you could use the alternative find command style, too).
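The prefix rule can be sketched in plain Java. This illustrates the rule as described in this article and is not Appium's actual implementation: the class CustomSelectorResolver and its resolve method are hypothetical, and the label "search" for the magnifying glass icon is an assumption. With the Appium Java client of that period, the two find styles would likely be driver.findElement(MobileBy.custom("ai:search")).click() and driver.findElementByCustom("search").click(); treat both as assumptions and check them against the client version you use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CustomSelectorResolver {

    /**
     * Splits a custom selector into {plugin shortcut, selector passed to the plugin}.
     * Hypothetical helper that mirrors the rule described in the text only.
     */
    public static String[] resolve(String selector, Map<String, String> plugins) {
        if (plugins == null || plugins.isEmpty()) {
            throw new IllegalArgumentException("No find plugins are registered");
        }
        int colon = selector.indexOf(':');
        if (colon >= 0) {
            // Explicit prefix, e.g. "ai:search"
            String shortcut = selector.substring(0, colon);
            if (!plugins.containsKey(shortcut)) {
                throw new IllegalArgumentException("Unknown plugin shortcut: " + shortcut);
            }
            return new String[] {shortcut, selector.substring(colon + 1)};
        }
        if (plugins.size() > 1) {
            // Without a prefix there is no way to choose between several plugins
            throw new IllegalArgumentException("Selector must name one of: " + plugins.keySet());
        }
        // Exactly one plugin is registered, so the prefix is optional
        return new String[] {plugins.keySet().iterator().next(), selector};
    }

    public static void main(String[] args) {
        Map<String, String> plugins = new LinkedHashMap<>();
        plugins.put("ai", "test-ai-classifier");

        // Both selectors resolve to the same plugin and label
        System.out.println(String.join(" -> ", resolve("ai:search", plugins)));
        System.out.println(String.join(" -> ", resolve("search", plugins)));
    }
}
```

Both calls print ai -> search, because only one plugin is registered. With a second plugin in the map, the unprefixed selector would be rejected as ambiguous.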


In Conclusion

As mentioned above, this technology has some significant limitations as of now. For example, the plugin has to be registered with Appium through capabilities, and it can only find icons that its AI model has been trained to detect. On top of that, the process is fairly slow, both in the plugin code (since it has to retrieve every element on screen in order to send information to the model) and in the model itself. All of these areas should improve in the future, however. And even if this particular plugin isn’t useful for your day-to-day tasks, it still demonstrates that a concrete application of AI in the testing space is not only possible, but already real!


Sanoj S, Test Architect, RapidValue
