
Duplication management

"Duplication may be the root of all evil in software" (Clean Code Collection, Robert C. Martin). Keep software quality healthy by managing duplicate code, which can cause project delays and bugs.


Analyze daily source code updates and detect duplicate code

Almost every developer has copied and pasted existing source code at some point. However, as copy-and-paste duplicates accumulate, changing one piece of code means finding every related duplicate, without omission, and deciding for each one whether the same modification must be applied. As the software grows, this task becomes sharply more difficult, leading to frequent bugs and project delays.

On its first analysis, Siderscan examines the duplicate code already present across the entire project and generates a summary report. After that, it analyzes the code added or updated during daily development and notifies users of any newly introduced duplicate code.
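The "baseline first, then report only what is new" workflow can be pictured as a set difference. The following is a minimal sketch, assuming each detected duplicate can be identified by a string key; the function name `newDuplicates` and the key format are hypothetical and are not taken from Siderscan.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Assumption: every detected duplicate has a stable string key
// (for example "file:startLine-endLine"). The key format is hypothetical.
// Returns the keys present in the current analysis but not in the baseline.
std::vector<std::string> newDuplicates(const std::set<std::string>& baseline,
                                       const std::set<std::string>& current) {
    std::vector<std::string> added;
    // std::set iterates in sorted order, which std::set_difference requires.
    std::set_difference(current.begin(), current.end(),
                        baseline.begin(), baseline.end(),
                        std::back_inserter(added));
    return added;
}
```

Only the keys returned by `newDuplicates` would need to be brought to a developer's attention after a daily run; everything in the baseline has already been reported once.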

Left file:

        nominal_D_overlap[i] = LAYERS[i]->getDIM_D() - offs; //
    }
    DIM_D += LAYERS[N_LAYERS-1]->getDIM_D();
}

- int MultiSliceVolume(int i, int j) {
    if (j==0)
        return LAYERS[i]->getDIM_V();
    else if(j==1)
        return LAYERS[i]->getDIM_H();
    else if(j==2)
        return LAYERS[i]->getDIM_D();

Right file:

        nominal_D_overlap[i] = LAYERS[i]->getDIM_D() - offs; //
    }
    DIM_D += LAYERS[N_LAYERS-1]->getDIM_D();
}

+ int MultiLayerVolume(int i, int j) {
    if (j==0)
        return LAYERS[i]->getDIM_V();
    else if(j==1)
        return LAYERS[i]->getDIM_H();
    else if(j==2)
        return LAYERS[i]->getDIM_D();

* This code is detected as duplicate code because the logic in the left and right files is the same, even though the names differ.
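Once a pair like this is reported, one common remedy is to route both call sites through a single shared helper. The sketch below is hypothetical: the `Layer` struct, the `layerDim` helper, and the fallback value -1 are assumptions made for illustration, since the page shows only a fragment of the original functions.

```cpp
#include <vector>

// Hypothetical stand-in for the real layer class, which is not shown on the page.
struct Layer {
    int v, h, d;
    int getDIM_V() const { return v; }
    int getDIM_H() const { return h; }
    int getDIM_D() const { return d; }
};

// One shared implementation of the selector logic that appears in both files.
// Returning -1 for an unknown axis is an assumption; the original fragment
// does not show what happens when j is not 0, 1, or 2.
int layerDim(const std::vector<Layer*>& layers, int i, int j) {
    switch (j) {
        case 0:  return layers[i]->getDIM_V();
        case 1:  return layers[i]->getDIM_H();
        case 2:  return layers[i]->getDIM_D();
        default: return -1;
    }
}
```

With a helper of this kind, a later change to the selection logic is made in one place instead of in every copy.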

Patent-pending duplication detection algorithm

Various algorithms for detecting duplicate code (also called “clone code”) have been proposed to improve processing speed and detection accuracy. In the research literature, the duplicate code an algorithm can detect is classified into four types, based on how much the similar code fragments differ from each other.

Siderscan's duplicate code detection employs a patent-pending proprietary algorithm that can detect Type 3 duplicate code at high speed, even in large projects. In other words, Siderscan detects not only exact textual duplicates but also code that was edited after being copied and pasted, for example through partial renaming of variables or functions, or through changed or inserted statements.

  • Type 1

    Identical code fragments, apart from variations in whitespace, tabs, line breaks, and similar layout.
  • Type 2

    In addition to Type 1, code fragments that match except for differences in variable names, function names, type names, and other lexical units (tokens).
  • Type 3

    In addition to Type 2, syntactically similar code with inserted, deleted, or updated statements.
  • Type 4

    Semantically equivalent but syntactically different code.
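To make the difference between Type 1 and Type 2 concrete, here is a minimal sketch of identifier normalization, a common textbook way to match code that differs only in layout and names. It is not Siderscan's patent-pending algorithm; the function names `normalizeTokens` and `sameUpToRenaming` and the tiny keyword list are assumptions made for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Splits source text into tokens and replaces every identifier that is not in
// a small keyword list with the placeholder "ID". Whitespace is dropped, so
// layout differences (Type 1) disappear along with name differences (Type 2).
// Simplification: all identifiers map to the same placeholder, so this sketch
// does not check that a renaming is applied consistently.
std::vector<std::string> normalizeTokens(const std::string& src) {
    static const std::set<std::string> keywords = {
        "if", "else", "return", "int", "for", "while"};  // deliberately tiny
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;  // skip layout
        } else if (std::isalpha(c) || c == '_') {
            std::size_t j = i;
            while (j < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[j])) || src[j] == '_'))
                ++j;
            std::string word = src.substr(i, j - i);
            out.push_back(keywords.count(word) ? word : "ID");
            i = j;
        } else if (std::isdigit(c)) {
            std::size_t j = i;
            while (j < src.size() && std::isdigit(static_cast<unsigned char>(src[j])))
                ++j;
            out.push_back(src.substr(i, j - i));  // keep numeric literals as written
            i = j;
        } else {
            out.push_back(std::string(1, src[i]));  // one-character punctuation token
            ++i;
        }
    }
    return out;
}

// True when two fragments are identical after layout and names are normalized.
bool sameUpToRenaming(const std::string& a, const std::string& b) {
    return normalizeTokens(a) == normalizeTokens(b);
}
```

Under this sketch, two functions that differ only in their names and spacing compare as equal, while a changed literal or an inserted statement makes them differ; tolerating the latter is what Type 3 detection adds.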

Code viewer for detailed analysis

Siderscan provides a code viewer for a detailed examination of each duplicate code reported. By clicking on the link in the report sent to you by email, you can view the details of the duplicate code in a standard browser.

Duplicates are not always one-to-one. Siderscan groups multiple duplicate code blocks together if they have “Type 3” similarity to one another.
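Grouping pairwise matches can be pictured as finding connected components. The following is a minimal sketch using a union-find structure, assuming each code block has an integer ID and that a detector has already produced pairs of similar blocks. The name `groupClones` and the ID scheme are hypothetical; how Siderscan actually forms its groups is not described on this page.

```cpp
#include <cstddef>
#include <map>
#include <numeric>
#include <utility>
#include <vector>

// Union-find (disjoint set) over block IDs 0..n-1.
struct DisjointSet {
    std::vector<int> parent;
    explicit DisjointSet(int n) : parent(static_cast<std::size_t>(n)) {
        std::iota(parent.begin(), parent.end(), 0);  // each block starts alone
    }
    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }
    void unite(int a, int b) { parent[find(a)] = find(b); }
};

// Merges similar pairs into groups. Only groups with at least two members are
// returned; blocks that have no duplicate are omitted.
std::vector<std::vector<int>> groupClones(
        int blockCount, const std::vector<std::pair<int, int>>& similarPairs) {
    DisjointSet ds(blockCount);
    for (const auto& p : similarPairs) ds.unite(p.first, p.second);

    std::map<int, std::vector<int>> byRoot;
    for (int id = 0; id < blockCount; ++id) byRoot[ds.find(id)].push_back(id);

    std::vector<std::vector<int>> groups;
    for (auto& entry : byRoot)
        if (entry.second.size() > 1) groups.push_back(std::move(entry.second));
    return groups;
}
```

Note that this grouping is transitive: if block 0 resembles block 1 and block 1 resembles block 2, all three land in one group even if blocks 0 and 2 were never matched directly.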

Siderscan's Code Viewer displays each pair of duplicate code blocks in a group side by side and highlights the differences between the two. Statistics such as "similarity" and "duplicate code size" are also displayed to help you decide whether a detected duplication is acceptable for now or should be removed immediately.
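A "similarity" figure for a pair of blocks can be understood through a simple line-based measure. The sketch below computes the longest common subsequence (LCS) of two line sequences and reports 2 × LCS / (total number of lines). This formula and the names `lcsLength` and `lineSimilarity` are assumptions chosen for illustration, not Siderscan's definition of similarity.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Length of the longest common subsequence of two line sequences,
// computed with the classic dynamic-programming table.
std::size_t lcsLength(const std::vector<std::string>& a,
                      const std::vector<std::string>& b) {
    std::vector<std::vector<std::size_t>> dp(
        a.size() + 1, std::vector<std::size_t>(b.size() + 1, 0));
    for (std::size_t i = 1; i <= a.size(); ++i)
        for (std::size_t j = 1; j <= b.size(); ++j)
            dp[i][j] = (a[i - 1] == b[j - 1])
                           ? dp[i - 1][j - 1] + 1
                           : std::max(dp[i - 1][j], dp[i][j - 1]);
    return dp[a.size()][b.size()];
}

// Similarity in [0, 1]: 1.0 for identical sequences, lower as lines are
// inserted, deleted, or changed. Two empty blocks count as identical.
double lineSimilarity(const std::vector<std::string>& a,
                      const std::vector<std::string>& b) {
    if (a.empty() && b.empty()) return 1.0;
    return 2.0 * static_cast<double>(lcsLength(a, b)) /
           static_cast<double>(a.size() + b.size());
}
```

For example, a four-line block compared with a copy that has one extra line inserted shares four lines out of nine in total, giving 8/9, or roughly 0.89.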

Manage duplication and share it with your team

Siderscan not only detects duplication but also provides a dashboard that shows how the amount of duplicate code in the entire project increases and decreases over time. Share the current status of duplication with your team and use it for refactoring planning.

If you decide to leave a duplicate as it is for now, or if a duplicate is a false positive of the algorithm, you can remove it from the list managed by Siderscan. Instead of relying on individual memory, use the tool to manage duplication properly.


Start duplicate analysis for free

You can analyze one project for free.