Writing unit tests with AI

Developers often find writing comprehensive unit tests a tedious task. The iterative nature of the process – write, review, refine, re-run, and debug – can be time-consuming and demotivating. While Test-Driven Development (TDD) can mitigate some of this burden, it doesn’t eliminate the need for extensive test coverage, especially for complex validation logic.

In 2025, Artificial Intelligence offers a compelling solution to this challenge. AI can generate broad unit test suites in minutes, freeing developers to focus on higher-level tasks.

Consider this scenario: we needed to implement a BigQuery table name validator with specific rules: snake_case or camelCase, lowercase first letter, alphanumeric characters only, and a maximum length of 60 characters. Manually crafting tests for all possible edge cases would have been a significant undertaking.

How many scenarios would you consider? How much time would you allocate for planning and review?

By leveraging AI, I generated a comprehensive suite of unit tests in just 15 minutes.

Here is the conversation:

Prompt:

Does this statement with description meet the regexp pattern? “The name of the field. Must be either alphanumeric in camelCase starting with a lowercase letter or snake_case and be no longer than 60 characters.” -> “^(?:[a-z][a-z0-9_]*|[a-z]+(?:[A-Z][a-z0-9]*)*)$”

AI Answer:

Let’s break down the statement and the regex pattern to see if they match.

Statement Description:

Name of the field: This implies we’re dealing with a string.
Must be either alphanumeric in camelCase starting with a lowercase letter:
- Starts with a lowercase letter.
- Can contain alphanumeric characters (letters and numbers).
- Follows camelCase convention (e.g., fieldName, anotherVariable).
or snake_case:
- Uses underscores to separate words (e.g., field_name, another_variable).
and be no longer than 60 characters: This is a length constraint.

Regex Pattern:

^: Matches the beginning of the string.
(?: … ): A non-capturing group.
[a-z][a-z0-9_]*: Matches a lowercase letter followed by zero or more alphanumeric characters or underscores. This part is meant to match snake_case, but it also allows camelCase with underscores, which is not what the description specifies.
|: OR operator.
[a-z]+(?:[A-Z][a-z0-9]*)*: Matches one or more lowercase letters, followed by zero or more occurrences of an uppercase letter followed by zero or more alphanumeric characters. This part is meant to match camelCase.
$: Matches the end of the string.

Issues and Mismatches:

Underscores in camelCase: The regex [a-z][a-z0-9_]* allows underscores in what’s supposed to be snake_case. However, it also allows underscores in what is supposed to be camelCase, which is not correct according to the description.
Lack of length constraint: The regex does not enforce the “no longer than 60 characters” rule.

Conclusion:

The provided regex pattern does not fully meet the requirements of the statement description. It has the following shortcomings:

It allows underscores in camelCase.
It does not enforce the maximum length constraint.

Therefore, the statement and the regexp pattern do not fully match.
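
Claims in an answer like this are cheap to verify by running the pattern against a few inputs. The snippet below is a minimal sketch; the sample names are illustrative choices, not part of the original conversation.

```python
import re

# The pattern from the first prompt, unchanged
ORIGINAL = re.compile(r"^(?:[a-z][a-z0-9_]*|[a-z]+(?:[A-Z][a-z0-9]*)*)$")

# Illustrative inputs (hypothetical, picked to probe each reported issue)
samples = {
    "fieldName": "plain camelCase",
    "field_name": "plain snake_case",
    "field_Name": "underscore mixed with a capital letter",
    "field__name": "double underscore",
    "a" * 61: "61 characters, over the limit",
}

# Map each sample to whether the pattern accepts it
results = {name: bool(ORIGINAL.match(name)) for name in samples}

for name, note in samples.items():
    print(f"{results[name]!s:5}  {note}")
```

Running it suggests the missing length limit is real, since the 61-character name is accepted. The mixed name `field_Name` is rejected, though, because neither alternative permits both an underscore and a capital letter. What the first alternative does let through is repeated underscores such as `field__name`.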

Prompt:

Is it possible to write a regexp that will allow underscores only if there is no capital letter?

AI Answer:

Yes, it’s absolutely possible to write a regular expression that allows underscores only when there are no capital letters. Achieving this requires careful construction of the pattern to enforce those conditions. Here’s a breakdown of how we can approach this, along with considerations:

Understanding the Requirements

Snake Case(Underscores):
- Must consist of lowercase letters, numbers, and underscores.
- No uppercase letters allowed.
Camel Case(No Underscores):
- Must start with a lowercase letter.
- Can contain uppercase and lowercase letters, and numbers.
- No underscores allowed.

Regex Construction

Here’s a regex that aims to fulfill those requirements:

^(?:[a-z0-9]+(?:_[a-z0-9]+)*|[a-z][a-zA-Z0-9]*)$

Explanation:
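
The pattern is easier to read when written out in verbose form. The annotated version below is a sketch of how its two alternatives break down; the comments are an interpretation of the regex, not text from the AI's answer.

```python
import re

# Same pattern as above, spelled out with re.VERBOSE so each part can be commented
PATTERN = re.compile(
    r"""
    ^(?:
        [a-z0-9]+           # snake_case: a first word of lowercase letters or digits
        (?:_[a-z0-9]+)*     # followed by any number of _word groups
      |
        [a-z]               # camelCase: a lowercase first letter
        [a-zA-Z0-9]*        # followed by letters and digits only, no underscore
    )$
    """,
    re.VERBOSE,
)

print(bool(PATTERN.match("field_name")))   # snake_case input
print(bool(PATTERN.match("fieldName")))    # camelCase input
print(bool(PATTERN.match("field_Name")))   # mixed styles
```

Note that the first alternative starts with `[a-z0-9]+`, so a name beginning with a digit still satisfies it.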

Prompt:

Please, generate python script using pytest to validate regexp pattern above with corner cases

AI Response:

import pytest
import re

def validate_field_name(field_name):
    """
    Validates a field name against the specified regex pattern.
    """
    pattern = r"^(?:[a-z0-9]{1,60}(?:_[a-z0-9]{1,60})*|[a-z][a-zA-Z0-9]{0,59})$"
    return bool(re.match(pattern, field_name))

def test_valid_camel_case():
    assert validate_field_name("fieldName")
    assert validate_field_name("anotherVariable")
    assert validate_field_name("shortName")
    assert validate_field_name("aVeryLongVariableNameThatIsWithin60Characters")
    assert validate_field_name("variable123")
    assert validate_field_name("a1")

def test_valid_snake_case():
    assert validate_field_name("field_name")
    assert validate_field_name("another_variable")
    assert validate_field_name("short_name")
    assert validate_field_name("a_very_long_variable_name_that_is_within_60_characters")
    assert validate_field_name("variable_123")
    assert validate_field_name("a_1")
    assert validate_field_name("a")
    assert validate_field_name("a_b_c_d_e_f_g_h_i_j_k_l_m_n_o_p_q_r_s_t_u_v_w_x_y_z_1234567890")

def test_invalid_camel_case():
    assert not validate_field_name("FieldName")  # Starts with uppercase
    assert not validate_field_name("field_Name") # Underscore and uppercase
    assert not validate_field_name("field_nameVariable") # mixing snake and camel
    assert not validate_field_name("123variable") # starts with a number.
    assert not validate_field_name("_variable") # starts with underscore.

def test_invalid_snake_case():
    assert not validate_field_name("Field_name") # Uppercase
    assert not validate_field_name("field__name") # Double underscore
    assert not validate_field_name("field_Name") # Uppercase inside snake
    assert not validate_field_name("field_nameVariable") # mixing snake and camel
    assert not validate_field_name("123_variable") # starts with a number.

def test_empty_string():
    assert not validate_field_name("")

def test_long_string():
    long_camel = "a" + "a" * 59
    long_snake = "a" + "_a" * 29
    too_long_camel = "a" + "a" * 60
    too_long_snake = "a" + "_a" * 30
    assert validate_field_name(long_camel)
    assert validate_field_name(long_snake)
    assert not validate_field_name(too_long_camel)
    assert not validate_field_name(too_long_snake)

def test_edge_cases():
    assert validate_field_name("a")
    assert validate_field_name("a1")
    assert validate_field_name("a_1")
    assert validate_field_name("a_b")
    assert validate_field_name("variable1")
    assert validate_field_name("variableA")
    assert not validate_field_name("A")
    assert not validate_field_name("1")
    assert not validate_field_name("_")
    assert not validate_field_name("1_a")
    assert not validate_field_name("A_a")

Ha! As you can see, it generated 40(!) scenarios to test!

However, the result is not perfect:

Prompt:

That shows failed scenarios:
E AssertionError: assert not True
E + where True = validate_field_name('123variable')

AI Answer:

You are absolutely correct. My apologies! The issue stems from the regex pattern still allowing numbers at the beginning of the camelCase portion. The original requirement was that camelCase should start with a lowercase letter.

Here’s the corrected Python script with the adjusted regex, along with explanations and the updated tests:

import pytest
import re

def validate_field_name(field_name):
    """
    Validates a field name against the specified regex pattern.
    """

    pattern = r"^(?:[a-z0-9]{1,60}(?:_[a-z0-9]{1,60})*|[a-z][a-zA-Z0-9]{0,59})$"

    return bool(re.match(pattern, field_name))

So even though the AI failed to fix the regexp (the "corrected" pattern is identical to the first one), the test coverage did its job and caught the bug!
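
For completeness, here is one way the pattern could be repaired so that the failing cases pass. This is a sketch under stated assumptions, not the fix from the conversation: it requires a lowercase first letter in both alternatives, and it moves the 60-character limit into a single lookahead, because per-word `{1,60}` quantifiers do not bound the total length of a snake_case name. The constant name `FIELD_NAME_RE` is hypothetical.

```python
import re

# Hypothetical corrected pattern (an assumption, not the pattern from the conversation)
FIELD_NAME_RE = re.compile(
    r"(?=.{1,60}\Z)"                     # total length between 1 and 60 characters
    r"(?:[a-z][a-z0-9]*(?:_[a-z0-9]+)*"  # snake_case, lowercase first letter
    r"|[a-z][a-zA-Z0-9]*)"               # camelCase, no underscores
)

def validate_field_name(field_name):
    """Return True if field_name is valid snake_case or camelCase, at most 60 chars."""
    # fullmatch anchors both ends, so no ^ or $ is needed in the pattern
    return FIELD_NAME_RE.fullmatch(field_name) is not None

# Names that the pattern above let through or that sit on the length boundary
for candidate in ["123variable", "1_a", "a" + "_a" * 30, "a" * 60]:
    print(len(candidate), validate_field_name(candidate))
```

Under this pattern the 62-character name in the first `test_valid_snake_case` is rejected as well, which is what the 60-character limit requires.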

However, refining the result is still your job:

Prompt:

Fix the pytest tests below by switching them to parametrized unit tests so that one failure will not block the other tests

AI Answer:

import pytest

def validate_field_name(name):
    ...  # put your implementation here

@pytest.mark.parametrize("name", [
    "fieldName",
    "anotherVariable",
    "shortName",
    "aVeryLongVariableNameThatIsWithin60Characters",
    "variable123",
    "a1",
])
def test_valid_camel_case(name):
    assert validate_field_name(name)

@pytest.mark.parametrize("name", [
    "field_name",
    "another_variable",
    "short_name",
    "a_very_long_variable_name_that_is_within_60_characters",
    "variable_123",
    "a_1",
    "a",
    "a_b_c_d_e_f_g_h_i_j_k_l_m_n_o_p_q_r_s_t_u_v_w_x_y_1234567890",
])
def test_valid_snake_case(name):
    assert validate_field_name(name)

@pytest.mark.parametrize("name", [
    "FieldName",
    "field_Name",
    "field_nameVariable",
    "123variable",
    "_variable",
])
def test_invalid_camel_case(name):
    assert not validate_field_name(name)

@pytest.mark.parametrize("name", [
    "Field_name",
    "field__name",
    "field_Name",
    "field_nameVariable",
    "123_variable",
])
def test_invalid_snake_case(name):
    assert not validate_field_name(name)

def test_empty_string():
    assert not validate_field_name("")

@pytest.mark.parametrize("name, expected", [
    ("a" + "a" * 59, True),
    ("a" + "_a" * 29, True),
    ("a" + "a" * 60, False),
    ("a" + "_a" * 30, False),
])
def test_long_string(name, expected):
    assert validate_field_name(name) == expected, f"name: {name}, expected: {expected}, length: {len(name)}"

@pytest.mark.parametrize("name, expected", [
    ("a", True),
    ("a1", True),
    ("a_1", True),
    ("a_b", True),
    ("variable1", True),
    ("variableA", True),
    ("A", False),
    ("1", False),
    ("_", False),
    ("1_a", False),
    ("A_a", False),
])
def test_edge_cases(name, expected):
    assert validate_field_name(name) == expected

Here we go!

That is already a production-ready set of unit tests that we can accept and commit to our codebase!
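
One possible further refinement: the snake_case entries in `test_long_string` are 59 and 61 characters long, so the exact 60-character boundary is exercised only for camelCase. The helper below is a hypothetical sketch for building snake_case names of an exact length; its name and the `boundary_cases` list are illustrative assumptions.

```python
def snake_name_of_length(n):
    """Build a snake_case name of exactly n characters (hypothetical helper, n >= 1)."""
    # "a_a_a..." always has odd length, so start from the largest odd length <= n
    name = "a" + "_a" * ((n - 1) // 2)
    # For even n, lengthen the first word by one letter to reach n exactly
    if len(name) < n:
        name = "a" + name
    return name

# Candidate (name, expected) pairs around a 60-character limit
boundary_cases = [
    (snake_name_of_length(60), True),
    (snake_name_of_length(61), False),
]

for name, expected in boundary_cases:
    print(len(name), expected)
```

These pairs could be appended to the parametrized list of `test_long_string` so that both naming styles are checked at exactly 60 and 61 characters.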
