Morteza Zakeri's Ph.D. dissertation
This website contains a summary of my Ph.D. dissertation and provides complementary online materials. This is the final version of my Ph.D. dissertation (version 5.0, June 2024). Materials may be updated in the coming months.
Abstract
Testing in software development can be demanding and time-consuming, particularly within agile processes that prioritize tests to drive design and code production. The associated costs of testing present a significant hindrance to maintaining agility. It is therefore advisable to assess the testability of the code before initiating any testing activities and, if required, refactor the code to facilitate efficient and effective testing. From a software engineering standpoint, the need for automatic refactoring to enhance testability and other quality criteria arises from the identification of source code problems, commonly known as code smells. Analyzing these code smells enables the assessment of their impact on quality attributes. During our exploration of automated approaches to measure source code similarity, we concluded that measuring code similarity is a foundational method for automating numerous software engineering tasks, such as testability, coverageability, and code smell prediction. For example, machine learning models can automate software engineering tasks by learning how to compare source code and predict the characteristics of a software entity based on the features of similar entities. Incorporating lexical and statistical metrics alongside common source code metrics enriches the feature space for machine learning. Our research has demonstrated that shallow learning methods are more accurate and faster than deep learning methods when using this feature space to learn similarity. An advantage of these machine learning models is that they are static: there is no need to run the code. Indeed, by exploiting the runtime experience that these models learn, the levels of testability and coverageability can be determined without the code even being executable. Using a search-based refactoring approach driven by code smells, we improved the testability of 1,234 classes in six well-known Java projects within agile processes. The approach uses an evolutionary process with an initial population of refactorings to remove the smells when the testability level is unacceptable. As a result, testability increased by 30% on average. The results of the methods proposed in this thesis have been integrated into an agile software development process called Testability-Driven Development (TsDD).
Research motivation, importance, questions, and challenges
Software testing is a vital activity in software development, but it also consumes a significant amount of resources. It is estimated that software testing accounts for 50% of the total cost of software development. To achieve high-quality, low-cost testing, the developed artifacts (requirements, architecture, design, and code) must be designed to be testable. Testability is the degree to which a software system or component facilitates the establishment of test criteria and the performance of tests to determine whether those criteria have been met. Improving the testability of software can make testing easier and more efficient by reducing the effort, time, and complexity of testing activities. Software testability analysis helps quantify the testability of software and identify the factors that affect it. One challenge in software testability analysis is that testability itself is a non-functional requirement, which is hard to measure directly. Therefore, various models and metrics have been proposed to estimate testability indirectly based on different attributes and perspectives. Another challenge is to refactor a software system to improve its testability without compromising its functionality or quality. Refactoring is the process of improving the internal structure of software while preserving its external behavior. Refactoring for testability requires applying specific techniques and principles that can enhance the modularity, cohesion, coupling, readability, and maintainability of software.
Two major research questions are raised here: (1) what are the best approaches to quantify the testability of a software system, and (2) what are the best refactoring actions to improve testability? To answer these questions, we need to look at testability in different stages of the software development life cycle (SDLC).
There is a vast need for high-quality software in today's cyber-physical systems, IoT systems, and safety-critical systems. On the other hand, the lack of automatic methods for preparing such newly developed systems for testing is evident. Our problem is to automatically identify the weak points, or smells, in SDLC artifacts and refactor them with the aim of increasing testability.
Contributions
Figure 1 shows the contribution tree of my Ph.D. thesis and its enabling techniques, which are described in detail in the subsequent sections.
Figure 1: Zakeri Ph.D. thesis contributions tree
Code embedding
Enabler technique: Code embedding aims to represent given code snippets as fixed-length vectors that can be used as the input to different analytical and machine learning models. Various code embedding techniques are based on different aspects of the source code, such as tokens, graphs, trees, or paths.
This thesis uses a code embedding technique based on an extended set of existing and newly proposed source code metrics.
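To make the idea concrete, here is a minimal sketch, assuming a small, hypothetical subset of metrics (the thesis uses a much larger, extended metric suite): a class is mapped to a fixed-length numeric vector that downstream models can consume.

```python
from dataclasses import dataclass

# Illustrative subset of source code metrics; names are conventional
# but hypothetical here (loc, wmc, cbo, lcom, nom).
@dataclass
class ClassMetrics:
    loc: int      # lines of code
    wmc: int      # weighted methods per class
    cbo: int      # coupling between objects
    lcom: float   # lack of cohesion of methods
    nom: int      # number of methods

def embed(m: ClassMetrics) -> list[float]:
    """Represent a class as a fixed-length metric vector."""
    return [m.loc, m.wmc, m.cbo, m.lcom, m.nom]

vector = embed(ClassMetrics(loc=420, wmc=35, cbo=12, lcom=0.8, nom=18))
print(vector)  # input to analytical and machine learning models
```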
Relevant publications: Learning to predict software testability, Learning to predict test effectiveness
Supporting tools: ADAFEST
Source code similarity measurement
Contribution: Source code similarity measurement is a technique underlying a wide range of difficult software engineering tasks, including but not limited to code clone detection, smell prediction, code recommendation, and testability measurement. The last of these is discussed in this thesis.
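As a simple illustration (the vectors below are made up), once entities are embedded as metric vectors, their similarity can be computed with a standard cosine measure:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two metric-based code embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

# Two classes represented by metric vectors from the embedding step;
# a score near 1.0 suggests structurally similar code.
class_a = [420.0, 35.0, 12.0, 0.8, 18.0]
class_b = [390.0, 30.0, 10.0, 0.7, 16.0]
print(f"{cosine_similarity(class_a, class_b):.3f}")
```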
Relevant publications: Method name recommendation based on source code metrics
Supporting tools: SENSA
Source code testability measurement
Contribution: The main goal of this contribution is to provide a novel and practical framework to measure testability at the implementation level. Connecting runtime information to the static properties of the program is a key point in measuring software quality, including testability. Despite a large body of research on software testability, we observed that the relationship between testability and test adequacy criteria had not been studied, and testability metrics are still far from measuring the actual testing effort. We hypothesize that testability has a significant impact on automatic testing tools. Therefore, we propose a new methodology to measure and quantify software testability by exploiting both runtime information and static properties of the source code. The proposed method uses machine learning to predict testability based on source code metrics mapped to the test criteria during the training process. An ablation study is used to determine the most influential source code metrics affecting testability at the implementation level.
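The sketch below shows the general shape of this idea, under stated assumptions: a scikit-learn ensemble regressor (not necessarily the exact meta-estimator of the thesis) is trained on synthetic metric vectors labeled with a test-adequacy score, after which testability is predicted purely statically.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Hypothetical training data: rows are classes described by metric vectors;
# labels are test-adequacy scores (e.g., coverage reached by an automated
# test data generation tool on the executable version of each class).
rng = np.random.default_rng(0)
X = rng.random((200, 5))  # 5 illustrative metrics per class
y = 1.0 - 0.5 * X[:, 2] - 0.3 * X[:, 3] + rng.normal(0, 0.05, 200)  # synthetic

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Static prediction: no need to execute the class under analysis.
print("R^2 on held-out classes:", round(model.score(X_test, y_test), 3))
```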
Relevant publications: An ensemble meta-estimator to predict source code testability
Supporting tools: ADAFEST
Automated refactoring
Contribution: Automated refactoring refers to a collection of behavior-preserving program transformations automatically applied to software entities at different abstraction levels, such as code, design, or models. Search-based refactoring can handle complex and multi-objective refactoring scenarios by exploring a large and diverse search space of possible refactorings. It is the main apparatus used in this thesis to improve software testability.
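For instance, the classic Extract Method refactoring, one of the operations studied in the publications below, splits a long method into smaller, separately testable units while preserving behavior. A minimal sketch (the example code is illustrative):

```python
# Before: a "long method" mixing parsing and aggregation.
def report(lines):
    total = 0
    for line in lines:
        parts = line.split(",")
        total += int(parts[1])
    return f"total={total}"

# After Extract Method: the parsing loop becomes a separately testable unit.
def sum_second_column(lines):
    return sum(int(line.split(",")[1]) for line in lines)

def report_refactored(lines):
    return f"total={sum_second_column(lines)}"

# Behavior preservation check.
assert report(["a,1", "b,2"]) == report_refactored(["a,1", "b,2"])
```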
Relevant publications: Flipped boosting of automatic test data generation frameworks through a many-objective program transformation approach, An automated extract method refactoring approach to correct the long method code smell, Supporting single responsibility through automated extract method refactoring
Supporting tools: CodART, ExtractMethod, SBSRE
Source code testability improvement
Contribution: This contribution proposes the use of automated refactoring to improve source code testability. An appropriate source code testability measurement approach is required to verify the improvement, since not all refactoring operations improve testability. To this end, the proposed testability prediction method is used as a fitness function of the improvement process.
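A minimal single-objective sketch of the idea, assuming a hypothetical learned predictor as the fitness function (the thesis uses a many-objective evolutionary search rather than this simplified random search, and real operators are applied by a refactoring engine):

```python
import random

# Hypothetical catalogue of behavior-preserving refactoring operations.
OPERATORS = ["extract_method", "move_method", "extract_class", "pull_up_method"]

def predicted_testability(sequence):
    """Stand-in fitness: in the improvement process, a learned testability
    model scores the program obtained after applying the sequence."""
    rng = random.Random(hash(tuple(sequence)))  # deterministic stub score
    return rng.random()

def random_search(generations=100, length=5):
    """Simplified stand-in for the evolutionary search over sequences."""
    best, best_score = None, float("-inf")
    for _ in range(generations):
        candidate = random.choices(OPERATORS, k=length)
        score = predicted_testability(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

sequence, score = random_search()
print("best refactoring sequence:", sequence, "fitness:", round(score, 3))
```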
Relevant publications: An ensemble meta-estimator to predict source code testability, Flipped boosting of automatic test data generation frameworks through a many-objective program transformation approach
Supporting tools: CodART
Graph embedding
Enabler technique: Graph embedding is the process of transforming a graph into a low-dimensional vector representation that preserves key information about the graph, such as its structure, topology, or semantics. Graph embedding can be used for various tasks such as graph analysis, visualization, clustering, classification, or recommendation. It is used to represent graphs, nodes, and edges as fixed-length vectors. Different types of graph embedding techniques are based on different aspects of the graph, such as nodes, edges, subgraphs, or paths. Graph embedding enables us to process software design artifacts such as UML class diagrams, which are mainly treated as directed labeled graphs.
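A minimal sketch, assuming a toy class diagram treated as a directed graph: a truncated SVD of the adjacency matrix yields a simple spectral embedding of the nodes (real graph embedding techniques, e.g., random-walk methods, are more elaborate).

```python
import numpy as np

# Toy directed graph standing in for a UML class diagram: nodes are
# classes, edges are dependencies/associations (names are made up).
nodes = ["Order", "Customer", "Invoice", "Payment"]
edges = [(0, 1), (0, 2), (2, 3), (1, 3)]

A = np.zeros((len(nodes), len(nodes)))
for src, dst in edges:
    A[src, dst] = 1.0

# Project nodes onto the top-k singular vectors of the adjacency
# matrix to obtain fixed-length node vectors.
k = 2
U, S, _ = np.linalg.svd(A)
embedding = U[:, :k] * S[:k]
for name, vec in zip(nodes, embedding):
    print(f"{name}: {np.round(vec, 3)}")
```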
Relevant publications: Measuring and improving software testability at the design level
Supporting tools: D4T
Design testability measurement
Contribution: Design testability measurement is the process of evaluating how easy or difficult it is to test a software product at the design phase. Source code metrics are not available at the design phase. The proposed source code testability prediction method is modified to work at the design level based on design metrics.
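As an illustration with hypothetical metrics, simple design-level measures such as fan-in and fan-out can be extracted directly from the class diagram graph and fed to the predictor in place of source code metrics:

```python
import numpy as np

# Adjacency matrix of a toy class diagram (rows: source, cols: target).
A = np.array([
    [0, 1, 1, 0],   # Order -> Customer, Invoice
    [0, 0, 0, 1],   # Customer -> Payment
    [0, 0, 0, 1],   # Invoice -> Payment
    [0, 0, 0, 0],   # Payment
])

# Illustrative design metrics usable as features at design time.
fan_out = A.sum(axis=1)  # outgoing dependencies per class
fan_in = A.sum(axis=0)   # incoming dependencies per class
print("fan-out:", fan_out, "fan-in:", fan_in)
```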
Relevant publications: Measuring and improving software testability at the design level
Supporting tools: D4T, ADAFEST
Automated refactoring to patterns
Contribution: Automated refactoring to software design patterns is the process of applying code transformations that improve the design quality of software by introducing well-known design patterns. We show how refactoring to creational design patterns improves testability at the design level.
Relevant publications: Measuring and improving software testability at the design level
Design testability improvement
Contribution: Design patterns are important from the viewpoint of a software designer. Some patterns mainly simplify module isolation and facilitate testing at the design level. This thesis proposes the automatic identification and application of two well-known patterns, Dependency Injection (Constructor Injection) and Factory Method, to improve testability at the design level.
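A minimal sketch of why these two patterns help (all names below are illustrative, not from the thesis): with Constructor Injection the collaborator is passed in, so tests can substitute a deterministic fake, and a Factory Method centralizes the choice of concrete implementation.

```python
from abc import ABC, abstractmethod

class Database(ABC):
    @abstractmethod
    def fetch(self, key): ...

class ProductionDatabase(Database):
    def fetch(self, key):
        return f"row for {key}"  # would hit a real backend

class FakeDatabase(Database):
    def fetch(self, key):
        return "test row"        # deterministic stand-in for tests

class OrderService:
    # Constructor Injection: the dependency is passed in rather than
    # created internally, so the class can be isolated under test.
    def __init__(self, db: Database):
        self.db = db

    def describe(self, order_id):
        return self.db.fetch(order_id).upper()

def database_factory(env: str) -> Database:
    """Factory Method: centralizes creation, another testability enabler."""
    return FakeDatabase() if env == "test" else ProductionDatabase()

# In tests, inject the fake directly or via the factory.
assert OrderService(FakeDatabase()).describe(42) == "TEST ROW"
```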
Relevant publications: Measuring and improving software testability at the design level
Word-context embedding
Enabler technique: Word embedding is the counterpart of code embedding in the context of natural language texts. Traditional word embeddings (e.g., Word2Vec, GloVe) create a fixed vector for each word, regardless of context. We trained a separate model for each frequent word according to its domain, enabling traditional word embeddings to preserve context.
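A hedged sketch of the per-domain idea using gensim's Word2Vec (the corpora and domains below are made up): training one model per domain yields different, context-dependent vectors for the same word.

```python
from gensim.models import Word2Vec

# Hypothetical requirement sentences grouped by domain; training a
# separate model per domain lets a traditional embedding reflect context.
corpora = {
    "automotive": [["the", "brake", "system", "shall", "respond", "fast"]],
    "banking":    [["the", "system", "shall", "log", "every", "transaction"]],
}

models = {
    domain: Word2Vec(sentences, vector_size=50, window=3, min_count=1, seed=1)
    for domain, sentences in corpora.items()
}

# The same word gets a domain-specific vector from each model.
print(models["automotive"].wv["system"][:5])
print(models["banking"].wv["system"][:5])
```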
Relevant publications: Natural language requirements testability measurement based on requirement smells
Supporting tools: ARTA
Requirement smell detection
Contribution: Requirement smells are a symptom of poor requirements, just as code and design smells are symptoms of poor implementation and design. Whereas code smells indicate suboptimal coding practices and design smells highlight flaws in the overall system architecture, requirement smells signal issues within the initial specification of a software system. These smells manifest as imprecision, ambiguity, incompleteness, or other quality deficiencies in the natural language descriptions of system requirements. Left unaddressed, requirement smells can lead to delays, rework, and dissatisfaction among stakeholders, so identifying and mitigating them early in the development process is crucial for successful project outcomes. Automated techniques for detecting requirement smells play an important role in improving the quality of software specifications; they leverage natural language processing (NLP) and other methods to identify and address issues within requirement documents.
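A minimal keyword-based detector sketch (the smell lexicon below is illustrative; the actual dictionaries are built automatically and richer NLP is used):

```python
import re

# Small, illustrative lexicon of smell indicators.
SMELLS = {
    "ambiguous adverb": r"\b(quickly|easily|appropriately|efficiently)\b",
    "vague pronoun":    r"\b(it|this|that)\b",
    "open-ended":       r"\b(etc|and so on|as a minimum)\b",
}

def detect_smells(requirement: str):
    """Return (smell, matched text) pairs found in a requirement sentence."""
    found = []
    for smell, pattern in SMELLS.items():
        for match in re.finditer(pattern, requirement, re.IGNORECASE):
            found.append((smell, match.group()))
    return found

req = "The system shall respond quickly to user requests, logging errors etc."
print(detect_smells(req))
```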
Relevant publications: Natural language requirements testability measurement based on requirement smells
Supporting tools: ARTA
Requirement testability measurement
Contribution: Software requirements are crucial artifacts in developing quality and testable software systems. Requirements specifications are used in both functional and acceptance testing to ensure that a program meets its requirements. A testable requirement increases the effectiveness of testing while decreasing the cost and time. We define requirements testability in terms of requirements smells, size, and complexity and propose a measurement method.
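Purely as an illustration of the shape of such a measure (the actual model is derived via a GQM analysis; the weights below are hypothetical): smells, size, and structural complexity all penalize the testability score.

```python
def requirement_testability(num_smells: int, num_words: int, depth: int) -> float:
    """Illustrative score in (0, 1]: more smells, longer text, and deeper
    sentence structure all reduce testability. Weights are hypothetical."""
    penalty = num_smells + 0.01 * num_words + 0.1 * depth
    return 1.0 / (1.0 + penalty)

print(round(requirement_testability(num_smells=2, num_words=30, depth=4), 3))
```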
Relevant publications: Natural language requirements testability measurement based on requirement smells
Supporting tools: ARTA
Requirement testability improvement
Contribution: Requirement testability improvement is not deeply investigated in this thesis. However, the identification of requirement smells enables finding requirements refactoring opportunities. Fully automated requirements testability improvement is a challenging task that requires many example requirements as ground-truth samples to construct and evaluate appropriate models.
Relevant publications: Natural language requirements testability measurement based on requirement smells
Supporting tools: ARTA
----
Testability-driven development (TsDD)
Sum-up work
Contribution: In conclusion, the techniques and approaches proposed in this thesis are connected to form a new software development approach that supports both agility and testability. The rationale behind testability-driven development is to effectively use testability measurement information in the software development life cycle (SDLC) to separate the complexity of testing from the main application logic and purpose. This approach encourages developers to measure a product's testability before testing it. However, the measurement by itself does not provide a considerable advantage; it must be accompanied by improvement of the components with poor testability. Hence, an improvement mechanism relying on software refactoring techniques is embedded in TsDD. The most interesting part of TsDD is that the overall process of testability measurement and improvement can be performed automatically. Unlike TDD, where the early development of test cases diminishes the benefit of automated test data generation tools, TsDD maximizes the usage of such tools by preparing the source code before testing.
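A schematic sketch of the TsDD loop (the threshold and stubs are hypothetical): testability is measured first, the component is refactored until it is acceptably testable, and only then are automated tests generated.

```python
THRESHOLD = 0.7  # hypothetical acceptable testability level

def tsdd_cycle(component, measure, refactor, generate_tests):
    """Measure testability first; refactor until acceptable; then hand the
    component to an automated test data generation tool."""
    while measure(component) < THRESHOLD:
        component = refactor(component)
    return generate_tests(component)

# Toy demo: each (stub) refactoring step raises the testability score.
def measure(c): return c["testability"]
def refactor(c): return {"testability": c["testability"] + 0.2}
def generate_tests(c): return f"tests generated (testability={c['testability']:.1f})"

print(tsdd_cycle({"testability": 0.4}, measure, refactor, generate_tests))
```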
Relevant publications: Testability-driven development: An improvement to the TDD efficiency
Empirical evaluations and supporting tools: TsDD
Dissertation structure
The dissertation contents are organized into nine chapters, described as follows:
Chapter 1: Introduction
The chapter sections are problem statement, motivation, goals and challenges, proposed method and contributions, and the organization of the following chapters.
Chapter 2: Background
This chapter describes the background topics used in the rest of the thesis, including testing and testability in software systems, software refactoring, and software analytics techniques.
Chapter 3: Related works
The chapter sections are a systematic literature review on source code similarity measurement techniques, source code testability, design testability, and requirements testability.
Based on the following publications:
- A systematic literature review on source code similarity measurement and clone detection: Techniques, applications, and challenges
- A systematic literature review on the code smells datasets and validation mechanisms
Chapter 4: Distributed representation of source code based on extended software metrics and its applications
The chapter sections are vector space models of software programs, method representation, class representation, package representation, and three code-related tasks, including method name recommendation, smell detection, and test effectiveness prediction.
Based on the following publications:
- Method name recommendation based on source code metrics
- Learning to predict test effectiveness
Chapter 5: Source code testability measurement and improvement
The chapter sections are GQM mapping to source code testability, testability mathematical model, testability prediction, improving source code testability, and empirical evaluations.
Based on the following publications:
- Learning to predict software testability
- An ensemble meta-estimator to predict source code testability
Chapter 6: Design testability measurement and improvement
The chapter sections are extended class diagram, design testability prediction model, design testability improvement, and empirical evaluations.
Based on the following publications:
- Measuring and improving software testability at the design level
Chapter 7: Flipped boosting of automated test data generation framework with many-objective program transformation
The chapter sections are CodART refactoring engine, many-objective program transformation, and empirical evaluations.
Based on the following publications:
- Flipped boosting of automatic test data generation frameworks through a many-objective program transformation approach
Chapter 8: Requirements testability measurement and identification of improvement opportunities
The chapter sections are GQM mapping to requirement testability measurement, requirement testability mathematical model, requirement smells, automatic dictionary building, ARTA design and implementation, testability-driven development (TsDD), and empirical evaluations.
Based on the following publications:
- Natural language requirements testability measurement based on requirement smells
- Testability-driven development: An improvement to the TDD efficiency
Chapter 9: Conclusion and future work
The chapter sections are the conclusion and remarks, developed tools and datasets, threats to validity, the future of software engineering, and a list of future works.
Publications
To access the list of my Ph.D. thesis publications, refer to the publications page.
Tools
To access the list of tools supporting my Ph.D. publications, refer to the tools page.
--- This page was last updated on June 2, 2024.