This replicated study examines whether software testers’ opinions (such as preferred techniques, perceived complexity, and self-assessed performance) influence their

A Replication Study on Software Testing Perception vs Effectiveness

Table Of Links

Abstract

1 Introduction

2 Original Study: Research Questions and Methodology

3 Original Study: Validity Threats

4 Original Study: Results

5 Replicated Study: Research Questions and Methodology

6 Replicated Study: Validity Threats

7 Replicated Study: Results

8 Discussion

9 Related Work

10 Conclusions And References


5 Replicated Study: Research Questions And Methodology

We decide to further investigate the results of the original study in search of possible drivers behind the misperceptions. Psychology holds that people’s perceptions can be affected by personal characteristics such as attitudes, personal interests and expectations. Therefore, we examine participants’ opinions by conducting a differentiated replication of the original study [47] that extends its goal as follows:

  1. The survey of effectiveness perception is extended to include questions on programs.

  2. We want to find out whether participants’ perceptions might be conditioned by their opinions. More precisely: their preferences (favourite technique), their performance (the technique that they think they applied best) and technique or program complexity (the technique that they think is easiest to apply, or the simplest program to be tested).

Therefore, the replicated study re-examines RQ1 as stated in the original study (this time the survey taken by participants also includes questions regarding programs), and addresses the following new research questions:

– RQ1.6: Are participants’ perceptions related to the number of defects reported by participants? We want to assess whether participants perceive as the most effective technique the one with which they have reported the most defects.

RQ2: Can participants’ opinions be used as predictors for testing effectiveness?

– RQ2.1: What are participants’ opinions about techniques and programs? We want to know if participants have different opinions about techniques or programs.

– RQ2.2: Do participants’ opinions predict their effectiveness? We want to assess whether the opinions that participants have about techniques (or programs) predict which one is the most effective for them.

RQ3: Is there a relationship between participants’ perceptions and opinions?

– RQ3.1: Is there a relationship between participants’ perceptions and opinions? We want to assess whether the opinions that participants have about techniques (or programs) are related to their perceptions.

– RQ3.2: Is there a relationship between participants’ opinions? We want to assess whether a certain opinion that participants have about techniques is related to their other opinions.

To answer these questions, we replicate the original study with students of the same course in the following academic year. This time we have 46 students. The changes made to the replication of the experiment are as follows:

– The questionnaire to be completed by participants at the end of the experiment is extended to include new questions. The information we want to capture with the opinion questions is:

  – Participants’ performance on techniques. This question refers to process conformance. The best applied technique is the technique each participant thinks (s)he applied most thoroughly. It corresponds to OT1: Which technique did you apply best?

  – Participants’ preferences. We want to know the favourite technique of each participant, that is, the one (s)he felt most comfortable with when applying it. It corresponds to OT2: Which technique do you like best?

  – Technique complexity. We want to know the technique with which each participant thinks process conformance was easiest to achieve. It corresponds to OT3: Which technique is the easiest to apply?

  – Program testability. We want to know which program was easiest to test, that is, the program for which process conformance could be achieved most easily. It corresponds to OP1: Which is the simplest program?

  Table 16 summarizes the survey questions. We have chosen these questions because we need simple questions that participants can easily understand and that are, at the same time, meaningful. We do not want to overwhelm participants with complex questions that need lengthy explanations. A complex questionnaire might discourage students from submitting it.

– The program faults are changed. The original study is designed so that all techniques are effective at finding all the defects injected. We chose faults detectable by all techniques so that the techniques could be compared fairly. The replicated study is designed to cover the situation in which some faults cannot be detected by all techniques. Therefore, we inject some faults that certain techniques are not effective at detecting. For example, BT cannot detect a non-implemented feature (as participants are required to generate test cases from the source code only). Likewise, EP cannot find a fault whose detection depends on the combination of two invalid equivalence classes. In the replicated study, we therefore inject into each program some faults that can be detected by BT but not by EP, and some faults that can be detected by EP but not by BT (each program is seeded with six faults). Note that the design is balanced: we inject the same number of faults that BT can detect but EP cannot as faults that EP can detect but BT cannot. This change is expected to affect the effectiveness of EP and BT, which might be lower than in the original study. It should not affect the effectiveness of CR.

– We change the program application order to further study maturation issues. The order is now: cmdline, ntree, nametbl. This change should not affect the results.

– Participants run their own test cases. It could be that the misperceptions obtained in the original study are due to the fact that participants are not running their own test cases.

– There is now only one version instead of two. Faults and failures are not the goal of this study, so this helps to simplify the experiment.

Table 17 shows a summary of the changes made to the study.
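The balanced fault-seeding change described above can be illustrated with a small sketch. It is not the authors' tooling: the `Fault` class, the fault identifiers, and the 2/2/2 split are hypothetical. The text only states that each program is seeded with six faults and that the BT-only and EP-only counts are equal. The sketch also assumes that CR can detect every seeded fault, which is one reading of the statement that CR's effectiveness should not be affected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fault:
    fault_id: str             # hypothetical identifier
    detectable_by: frozenset  # subset of {"CR", "BT", "EP"}


def is_balanced(faults):
    """True when as many faults are BT-only (vs EP) as EP-only (vs BT)."""
    bt_not_ep = sum(1 for f in faults
                    if "BT" in f.detectable_by and "EP" not in f.detectable_by)
    ep_not_bt = sum(1 for f in faults
                    if "EP" in f.detectable_by and "BT" not in f.detectable_by)
    return bt_not_ep == ep_not_bt


def max_detectable(faults, technique):
    """Upper bound on the faults a technique can find in one program."""
    return sum(1 for f in faults if technique in f.detectable_by)


# Assumed seeding for one program: six faults (from the text), 2/2/2 split (assumed).
example_program = [
    Fault("F1", frozenset({"CR", "BT", "EP"})),
    Fault("F2", frozenset({"CR", "BT", "EP"})),
    Fault("F3", frozenset({"CR", "BT"})),   # e.g. needs two invalid classes combined
    Fault("F4", frozenset({"CR", "BT"})),
    Fault("F5", frozenset({"CR", "EP"})),   # e.g. non-implemented feature
    Fault("F6", frozenset({"CR", "EP"})),
]

print(is_balanced(example_program))
print({t: max_detectable(example_program, t) for t in ("CR", "BT", "EP")})
```

Under these assumptions, BT and EP each have a lower ceiling than in a design where every fault is detectable by every technique, which is why their effectiveness might drop relative to the original study.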

To measure technique effectiveness we proceed in the same way as in the original study. We do not rely on the reported failures, as participants could:

  1. Report false positives (non-real failures).
  2. Report the same failure more than once (although they were asked not to do so).
  3. Miss failures corresponding to faults that have been exercised by the technique, but for some reason have not been seen.
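The three problems listed above can be made concrete with a short sketch that compares what a participant reports against the faults actually exercised by his or her test cases. The function name, the data layout, and the fault identifiers are hypothetical; the sketch also assumes that each report has already been mapped by hand to a seeded fault, or to `None` when it matches no real failure.

```python
from collections import Counter


def audit_reports(reported_fault_ids, exercised_faults):
    """Compare a participant's reports with the faults their tests exercised.

    reported_fault_ids: one entry per report; None marks a report that
        corresponds to no seeded fault (a false positive).
    exercised_faults: fault ids exercised by the participant's test cases.
    """
    false_positives = sum(1 for r in reported_fault_ids if r is None)
    counts = Counter(r for r in reported_fault_ids if r is not None)
    duplicates = sum(n - 1 for n in counts.values())
    missed = sorted(set(exercised_faults) - set(counts))
    return {
        "false_positives": false_positives,
        "duplicates": duplicates,
        "missed": missed,
    }


# Hypothetical participant: reports F1 twice, one spurious failure, and F3,
# while the test cases actually exercised F1, F2 and F3.
result = audit_reports(["F1", "F1", None, "F3"], {"F1", "F2", "F3"})
print(result)
```

In this example the participant wrote four reports, but only two distinct seeded faults were reported and one exercised fault went unnoticed, so a raw report count would misstate effectiveness in both directions.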

We measure the new response variable (reported defects) by counting the number of faults/failures reported by each participant. We analyse RQ2.1 in the same manner as RQ1.1, and RQ1.6, RQ2.2, RQ3.1 and RQ3.2 like RQ1.2. Table 18 summarises the statistical tests used to answer each research question.
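As a rough illustration of the RQ1.6 comparison, the sketch below derives, for each participant, the technique with the most reported defects and measures its agreement with the technique perceived as most effective. The actual tests are deferred to Table 18, which is not reproduced here, so Cohen's kappa is used purely as an illustrative agreement measure. The data, the function names, and the handling of ties (returned as `None`) are assumptions.

```python
from collections import Counter


def most_reported(defects_by_technique):
    """Technique with the most reported defects, or None on a tie (assumed policy)."""
    best = max(defects_by_technique.values())
    winners = [t for t, n in defects_by_technique.items() if n == best]
    return winners[0] if len(winners) == 1 else None


def cohen_kappa(a, b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k]
                   for k in set(count_a) | set(count_b)) / (n * n)
    if expected == 1:
        return 1.0  # both sequences use a single identical label
    return (observed - expected) / (1 - expected)


# Hypothetical data for four participants.
perceived = ["EP", "BT", "CR", "EP"]
reported_counts = [
    {"CR": 1, "BT": 2, "EP": 4},
    {"CR": 0, "BT": 3, "EP": 1},
    {"CR": 2, "BT": 1, "EP": 3},
    {"CR": 1, "BT": 1, "EP": 2},
]
by_reports = [most_reported(c) for c in reported_counts]

kappa = cohen_kappa(perceived, by_reports)
print(by_reports, round(kappa, 3))
```

A kappa near 1 would suggest that participants perceive as most effective the technique with which they reported the most defects, while a value near 0 would suggest agreement no better than chance.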


6 Replicated Study: Validity Threats

The threats to validity listed in the original study apply to this replicated study. Additionally, we have identified the following ones:

6.1 Conclusion Validity

  1. Reliability of treatment implementation. The replicated experiment is run by the same researchers that performed the original experiment. This assures that the two groups of participants do not implement the treatments differently.

6.2 Internal Validity

  1. Evaluation Apprehension. Using students and tying their performance in the experiment to their grade in the course might explain why participants consider that their own performance, and not the weaknesses of the techniques, explains the effectiveness of a technique.

6.3 Construct Validity

  1. Inadequate preoperational explanation of effect constructs. Since opinions are hard constructs to operationalize, it is possible that participants do not interpret the questions in the questionnaire the way we intended.

6.4 External Validity

  1. Reproducibility of results. It is not clear to what extent the results obtained here are reproducible. Therefore, more replications of the study are needed.

The steps that should be followed are:

  (a) Replicate the study capturing the reasons for the answers given by participants.

  (b) Perform the study with practitioners with the same characteristics as the students used in this study (people with little or no experience in software testing).

  (c) Explore and define what types of experience could be influencing the results (academic, professional, programming, testing, etc.).

  (d) Run new studies taking into consideration increasing levels of experience.

Again, of all the threats affecting the replicated study, the only one that could affect the validity of the results of this study in an industrial context is the one related to generalisation to other subject types.


:::info Authors:

  1. Sira Vegas
  2. Patricia Riofrío
  3. Esperanza Marcos
  4. Natalia Juristo

:::

:::info This paper is available on arxiv under CC BY-NC-ND 4.0 license.

:::
