26 April 2019

There are different ways to handle data tables in Cucumber scenarios. Which one you use depends on your project and the complexity of your test. In this article I will show several methods with examples and discuss the advantages and disadvantages of each. In the end, it is up to you to decide which method best suits your needs.

Method 1: List

You can add data to your Gherkin as a list. This method works best when you have a small data set of a single data type and the order of the data is unlikely to change or does not matter (for example, a list of numbers that you want to loop through).
For example:

Given some strings that I want to pass on
  | Lists   |
  | In      |
  | Gherkin |

In your step file, this data can be read as:

@Given("^some strings that I want to pass on$")
public void myMethodToDoSomethingWithTheList(List<String> myList) {
    // Here I can access the different list elements
    System.out.println(myList.get(0));
    System.out.println(myList.get(1));
    System.out.println(myList.get(2));
}

This is probably the simplest way to implement data tables. In this case we have three string values. Instead of strings, you can switch the element type of the list to another data type, such as integers. The disadvantage of this method is that the list holds a single data type: only strings, only integers, and so on. You cannot mix data types in the same list. Furthermore, you depend on the order of the data in the list. If each data point has its own purpose and you insert an item anywhere other than at the end of the list, you will need to update the indexes of all the items that follow it.
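When every item in the list is handled the same way, one way to reduce the dependence on positions is to iterate instead of indexing. The sketch below is plain Java with no Cucumber dependency. The class and method names (ListStepSketch, describeAll) are hypothetical, and the list is built by hand to stand in for the List<String> that the step would receive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Cucumber: stands in for the body of a
// step definition that receives the data table as a List<String>.
class ListStepSketch {

    // Treats every item the same way, so inserting or reordering rows
    // in the feature file does not require changing any index here.
    static List<String> describeAll(List<String> items) {
        List<String> lines = new ArrayList<>();
        for (String item : items) {
            lines.add("Received: " + item);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Built by hand here; in a real step this list comes from the table.
        List<String> fromTable = Arrays.asList("Lists", "In", "Gherkin");
        describeAll(fromTable).forEach(System.out::println);
    }
}
```

This only helps when the items are interchangeable. If each position has its own meaning, a map (method 2) is usually the better fit.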

Summary:
+ Simple to implement
- Limited to single data type in list
- You have to keep track of the order of your data in the list

Method 2: Map

Instead of a list, we can also use a map. This solves the problem of keeping track of the order of the data in your list. This method is best used when you have a small/medium data set of the same data type and each data value has its own function.
In Gherkin this would look like:

When I fill in some personal data
  | Name | Mazin Inaad                |
  | Job  | Test Automation Engineer   |

In your step file, this data can be read as:

@When("^I fill in some personal data$")
public void myMethodToDoSomethingWithTheMap(Map<String, String> myMap) {
    // Here I can access the different map elements
    System.out.println(myMap.get("Job"));
    System.out.println(myMap.get("Name"));
}

Because we are now using a map, the order of the data is no longer an issue. We can access a value by its key, which is the first column of the table. However, the whole map still has a single data type, in this case <String, String>. Keep in mind that the keys in the map need to be unique.
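Because every value in a Map<String, String> arrives as text, a value of another type has to be converted by hand in the step. Here is a minimal sketch in plain Java, assuming a hypothetical "Team size" row that is not part of the example above and hypothetical helper names (MapStepSketch, required, requiredInt). It also fails with a readable message when a key is missing or misspelled in the feature file, where a plain get would quietly return null.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Cucumber.
class MapStepSketch {

    // Returns the value for the key, or fails loudly if the table has no such row.
    static String required(Map<String, String> row, String key) {
        String value = row.get(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing key in data table: " + key);
        }
        return value;
    }

    // Converts the text value to an int, tolerating surrounding whitespace.
    static int requiredInt(Map<String, String> row, String key) {
        return Integer.parseInt(required(row, key).trim());
    }

    public static void main(String[] args) {
        // Built by hand here; in a real step this map comes from the table.
        Map<String, String> personalData = new LinkedHashMap<>();
        personalData.put("Name", "Mazin Inaad");
        personalData.put("Team size", "5"); // hypothetical row for illustration

        System.out.println(required(personalData, "Name"));
        System.out.println(requiredInt(personalData, "Team size") + 1);
    }
}
```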

Summary:
+ Fairly simple to implement
+ The order of your data is not an issue
- Limited to single data type in map

Method 3: Custom object

You can also create a custom object for your data table. A great advantage of using this method is that you can handle different data types in your data table. For example: if we want to create data for a product in a store we can create an object called Product.

public class Product {
    private String name;
    private int amountInStock;
    private double price;
}

In Gherkin we can give values to each field of the Product object:

Given a product
  | Name    | AmountInStock | Price |
  | Cookies | 200           | 1.42  |

It is important that the names in the top row exactly match the field names of your custom object. Cucumber is only forgiving about the capitalization of the first letter. Also note that the top row now holds the "keys" and the second row holds the data, whereas in method 2 the first column held the keys and the second column the values.
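To make the header-matching rule concrete, here is a rough illustration of the idea in plain Java. It is not Cucumber's own implementation, and Cucumber's actual behavior may differ in details. All names in it (HeaderMappingSketch, toFieldName, fill) are hypothetical. The sketch lowercases the first letter of each header, looks up a field with that exact name through reflection, and converts the cell text to the field's type.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.List;

// Illustration only: a hand-written stand-in for header-to-field matching.
class HeaderMappingSketch {

    static class Product {
        String name;
        int amountInStock;
        double price;
    }

    // "AmountInStock" -> "amountInStock"; only the first letter is adjusted.
    static String toFieldName(String header) {
        return Character.toLowerCase(header.charAt(0)) + header.substring(1);
    }

    // Copies one table row onto the target object, column by column.
    static <T> T fill(T target, List<String> headers, List<String> cells) {
        try {
            for (int i = 0; i < headers.size(); i++) {
                Field field = target.getClass().getDeclaredField(toFieldName(headers.get(i)));
                field.setAccessible(true);
                String cell = cells.get(i);
                if (field.getType() == int.class) {
                    field.setInt(target, Integer.parseInt(cell));
                } else if (field.getType() == double.class) {
                    field.setDouble(target, Double.parseDouble(cell));
                } else {
                    field.set(target, cell); // this sketch assumes a String field otherwise
                }
            }
        } catch (ReflectiveOperationException e) {
            // Thrown, for example, when a header matches no field name.
            throw new IllegalArgumentException("Header does not match a field", e);
        }
        return target;
    }

    public static void main(String[] args) {
        Product product = fill(new Product(),
                Arrays.asList("Name", "AmountInStock", "Price"),
                Arrays.asList("Cookies", "200", "1.42"));
        System.out.println(product.name + " / " + product.amountInStock + " / " + product.price);
    }
}
```

In this sketch a header such as "Amountinstock" fails with an exception, because only the first letter is normalized before the lookup.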

In your step file, this data can be read as follows:

@Given("^a product$")
public void methodName(List<Product> myData) {
    // Since myData is a list of products, you need to get the first
    // product in order to access it
    Product myProduct = myData.get(0);
    // Now you can do something with your product
}

In order to access the fields we need to add getters to the Product class. So the final Java file looks like:

public class Product {
    private String name;
    private int amountInStock;
    private double price;

    public String getName() {
        return name;
    }
    public int getAmountInStock() {
        return amountInStock;
    }
    public double getPrice() {
        return price;
    }
}

With the getters in place, we can now access the fields in our steps. For example:

@Given("^a product$")
public void methodName(List<Product> myData) {
    // Since myData is a list of products, you need to get the first
    // product in order to access it
    Product myProduct = myData.get(0);
    System.out.println(myProduct.getName());
}

As you can see we can now use multiple data types in our data input (strings, integers and doubles in our case). Another advantage is that you can work with multiple data objects like this:

Given a product
  | Name    | AmountInStock | Price |
  | Cookies | 200           | 1.42  |
  | Milk    | 150           | 0.66  |
  | Nuts    | 70            | 5.99  |

You can then access them all in the steps file, for example with a loop:

@Given("^a product$")
public void methodName(List<Product> allProducts) {
    for (Product singleProduct : allProducts) {
        System.out.println(singleProduct.getName());
    }
}

Summary:
+ Multiple data types possible
+ The order of your data is not an issue
+ Multiple data objects in a single step
- Bit more complex to implement

Method 4: Custom objects with fake data

This is a more complex use of data tables, and it is actually an extension of method 3. I started using it because I no longer wanted to write out new test data by hand every time I ran my test. It is highly recommended when your test data has to be unique on each run (for example when filling in forms) or when you work with large data sets. It lets you control the values you care about and randomize the rest.

As in method 3, we need to create an object, but we now add a constructor that generates our fake data. I use the Java Faker package to generate the fake data. For example:

import com.github.javafaker.Faker;

public class ContactFormData {
    private String name;
    private String email;
    private String message;

    public ContactFormData() {
        Faker faker = new Faker();
        name = faker.name().fullName();
        email = faker.internet().safeEmailAddress();
        message = faker.gameOfThrones().quote();
    }

    public String getName() {
        return name;
    }
    public String getEmail() {
        return email;
    }
    public String getMessage() {
        return message;
    }
}

In Gherkin we give values only to the fields whose values we want to control:

When I fill out the contact form
  | Name        |
  | Mazin Inaad |

And then in our steps file, we generate the rest of the data:

@When("^I fill out the contact form$")
public void methodName(List<ContactFormData> formDataList) {
    ContactFormData initialData = formDataList.get(0);
    ContactFormData finalData = ContactFormData.generateRemainingData(initialData);
    // Now we can do something with the final data
}

As you can see I am calling the Java method generateRemainingData in ContactFormData. You can add this method to the ContactFormData class with the following code:

public static ContactFormData generateRemainingData(ContactFormData initialData) {
    return DataHelper.generateRemainingData(initialData, () -> new ContactFormData());
}

This method calls a generic method in my DataHelper class (which we have not created yet). It passes on the initialData that I filled in from the Gherkin table, together with a supplier that creates a new data object through the constructor that generates the fake data. All we need now is the generic generateRemainingData method itself. In my case it lives in a class called DataHelper, but you can place it anywhere you want. Because it works with generic class types, it only needs to be defined once for your entire project.

// Needs java.lang.reflect.Field and java.util.function.Supplier.
// Assert is assumed to come from your test framework (for example JUnit).
/**
 * Fills in the fields that were not defined in the initial data at the Gherkin level.
 * The data is not generated in this method itself, but by the supplier, which should
 * supply a new object filled with generated Faker data.
 *
 * Created by: Mazin Inaad
 *
 * @param initialData the initial data.
 * @param newDataSupplier the new data supplier.
 * @param <T> the type.
 * @return object with generated fake data.
 */
public static <T> T generateRemainingData(final T initialData, final Supplier<T> newDataSupplier) {
    // Start from an object in which every field holds generated data
    final T result = newDataSupplier.get();
    final Field[] fields = initialData.getClass().getDeclaredFields();
    try {
        for (Field f : fields) {
            f.setAccessible(true);
            // Copy only the values that were set in the initial data
            if (f.get(initialData) != null) {
                f.set(result, f.get(initialData));
            }
        }
    } catch (IllegalAccessException e) {
        Assert.fail("Error occurred in generateRemainingData while trying to copy data from initial data to generated data");
    }
    return result;
}
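To see the merge at work without Cucumber or Java Faker on the classpath, here is a self-contained sketch under a few assumptions. FormData is a hypothetical stand-in for ContactFormData, random UUID fragments replace the Faker calls, and fromTable mimics an object in which only the field from the Gherkin table is set while the others are still null. The merge logic is the same as in generateRemainingData above, except that it throws an unchecked exception instead of calling Assert.fail.

```java
import java.lang.reflect.Field;
import java.util.UUID;
import java.util.function.Supplier;

// Dependency-free sketch; all class and method names here are hypothetical.
class RemainingDataSketch {

    static class FormData {
        String name;
        String email;
        String message;

        // Mimics the object built from the data table: only "name" is set.
        static FormData fromTable(String name) {
            FormData data = new FormData();
            data.name = name;
            return data;
        }

        // Mimics the constructor that generates fake data (UUIDs instead of Faker).
        static FormData generated() {
            String id = UUID.randomUUID().toString().substring(0, 8);
            FormData data = new FormData();
            data.name = "name-" + id;
            data.email = id + "@example.com";
            data.message = "message-" + id;
            return data;
        }
    }

    // Same idea as the article's method: generated values are the default,
    // and any non-null value from the initial data overrides them.
    static <T> T generateRemainingData(T initialData, Supplier<T> newDataSupplier) {
        T result = newDataSupplier.get();
        try {
            for (Field f : initialData.getClass().getDeclaredFields()) {
                f.setAccessible(true);
                Object value = f.get(initialData);
                if (value != null) {
                    f.set(result, value);
                }
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException("Could not copy table values onto generated data", e);
        }
        return result;
    }

    public static void main(String[] args) {
        FormData initial = FormData.fromTable("Mazin Inaad");
        FormData merged = generateRemainingData(initial, FormData::generated);
        System.out.println(merged.name);  // the name given in the table
        System.out.println(merged.email); // a generated value, different on each run
    }
}
```

One limitation worth knowing: a primitive field such as int is never null when read through reflection, so with this approach a primitive in the initial data always overwrites the generated value. Using wrapper types such as Integer avoids that.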

This method for data tables in Cucumber is also very handy when you have large data sets. Since you only fill in some fields in your Gherkin and generate data for the rest of the fields, your feature file is not cluttered with unnecessary information.

Summary:
+ Possibility to generate fake data
+ Multiple data types possible
+ The order of your data is not an issue
+ Multiple data objects in a single step
- Complexity of code

Expert: Mazin Inaad, trainer Capgemini Academy