In practice, why do different compilers compute different values for int x = ++i + ++i;?

Asked by: 小点点

Consider the following code:

int i = 1;
int x = ++i + ++i;

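Assuming a compiler accepts this code, we can guess at what it might do with it.

Both ++i return 2, resulting in x = 4.
One ++i returns 2 and the other returns 3, resulting in x = 5.
Both ++i return 3, resulting in x = 6.

To me, the second seems most likely. One of the two ++ operators executes with i = 1, increments i, and returns the result 2. Then the second ++ operator executes with i = 2, increments i, and returns the result 3. Then 2 and 3 are added to give 5.

However, I ran this code in Visual Studio, and the result was 6. I am trying to understand compilers better, and I wonder what could possibly lead to a result of 6. My only guess is that the code is executed with some kind of "built-in" concurrency: both ++ operators are called, each increments i before the other returns, and then both return 3. That would contradict my understanding of the call stack and would need explaining.

My question: what (reasonable) things could a C++ compiler do that would lead to a result of 4 or a result of 6?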

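3 answers

Anonymous user

The compiler takes your code, splits it into very simple instructions, and then recombines and arranges them in whatever way it considers optimal.

The code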

int i = 1;
int x = ++i + ++i;

consists of the following instructions:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
5. read i as tmp2
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x

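But although this is written as a numbered list, there are only a few ordering dependencies here: 1->2->3->4->5->10->11 and 1->6->7->8->9->10->11 must keep their relative order. Other than that, the compiler is free to reorder the steps and perhaps eliminate redundancy.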
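
As a baseline, the eleven steps can be transcribed one by one into well-defined C++ and run in exactly the order listed. This is only a sketch of the list, not of what any particular compiler emits; the names Result, run_in_listed_order and verify_listed_order are invented for the illustration.

```cpp
#include <cassert>

// Final values of i and x after the steps have run (illustrative type).
struct Result { int i; int x; };

// Executes steps 1..11 in the order they are listed.
Result run_in_listed_order() {
    int i = 1;               // 1. store 1 in i
    int tmp1 = i;            // 2. read i as tmp1
    tmp1 += 1;               // 3. add 1 to tmp1
    i = tmp1;                // 4. store tmp1 in i
    int tmp2 = i;            // 5. read i as tmp2
    int tmp3 = i;            // 6. read i as tmp3
    tmp3 += 1;               // 7. add 1 to tmp3
    i = tmp3;                // 8. store tmp3 in i
    int tmp4 = i;            // 9. read i as tmp4
    int tmp5 = tmp2 + tmp4;  // 10. add tmp2 and tmp4, as tmp5
    int x = tmp5;            // 11. store tmp5 in x
    return {i, x};
}

// Usage: the listed order corresponds to the "x = 5" guess from the question.
void verify_listed_order() {
    Result r = run_in_listed_order();
    assert(r.x == 5);
    assert(r.i == 3);
}
```

Run in this order, the steps give 5. The other results only appear once the steps are rearranged.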

For example, you could order the list like this:

1. store 1 in i
2. read i as tmp1
6. read i as tmp3
3. add 1 to tmp1
7. add 1 to tmp3
4. store tmp1 in i
8. store tmp3 in i
5. read i as tmp2
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x

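Why can the compiler do this? Because the side effects of the two increments are not sequenced relative to each other. But now the compiler can simplify: for example, step 4 is a dead store, since the value is immediately overwritten. Also, tmp2 and tmp4 are really the same thing.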

1. store 1 in i
2. read i as tmp1
6. read i as tmp3
3. add 1 to tmp1
7. add 1 to tmp3
8. store tmp3 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

Now everything to do with tmp1 is dead code: it is never used. And the re-read of i can be eliminated too:

1. store 1 in i
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
10. add tmp3 and tmp3, as tmp5
11. store tmp5 in x

Look, this code is much shorter. The optimizer is happy. The programmer is not, because i was incremented only once. Oops.
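
To check this path, the reordered list and the final six-step list can each be written out as ordinary, well-defined C++. This is a sketch of the step lists only; the names Result, reordered_first_path, simplified_first_path and verify_first_path are made up here.

```cpp
#include <cassert>

// Final values of i and x (illustrative type).
struct Result { int i; int x; };

// The reordered list: both reads of i happen before either store.
Result reordered_first_path() {
    int i = 1;               // 1
    int tmp1 = i;            // 2
    int tmp3 = i;            // 6
    tmp1 += 1;               // 3
    tmp3 += 1;               // 7
    i = tmp1;                // 4 (dead store, overwritten by 8)
    i = tmp3;                // 8
    int tmp2 = i;            // 5
    int tmp4 = i;            // 9
    int tmp5 = tmp2 + tmp4;  // 10
    return {i, tmp5};        // 11: tmp5 becomes x
}

// The list after dead-store and dead-code elimination.
Result simplified_first_path() {
    int i = 1;               // 1
    int tmp3 = i;            // 6
    tmp3 += 1;               // 7
    i = tmp3;                // 8
    int tmp5 = tmp3 + tmp3;  // 10
    return {i, tmp5};        // 11: tmp5 becomes x
}

// Usage: both versions agree, so the simplification preserved the result.
void verify_first_path() {
    Result a = reordered_first_path();
    Result b = simplified_first_path();
    assert(a.x == 4 && a.i == 2);
    assert(b.x == 4 && b.i == 2);
}
```

Both versions end with x equal to 4 and i equal to 2, which is the "both ++i return 2" outcome from the question.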

Let's look at something else the compiler could do instead. Go back to the original version:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
5. read i as tmp2
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x

The compiler could reorder it like this:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
5. read i as tmp2
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x

and then notice again that i is read twice, so one of the reads can be eliminated:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

That is nice, but it can go further: it can reuse tmp1:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
6. read i as tmp1
7. add 1 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

Then it can eliminate the re-read of i in step 6:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
7. add 1 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

Now step 4 is a dead store:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
7. add 1 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

And now steps 3 and 7 can be merged into one instruction:

1. store 1 in i
2. read i as tmp1
3+7. add 2 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

Eliminate the last temporary:

1. store 1 in i
2. read i as tmp1
3+7. add 2 to tmp1
8. store tmp1 in i
10. add tmp1 and tmp1, as tmp5
11. store tmp5 in x

And now you get the result that Visual C++ gives you: x = 6.
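
The second path can be checked the same way, by transcribing its first reordering and its final six-step list into well-defined C++. Again this is a sketch of the lists and makes no claim about the code Visual C++ actually generates; Result, reordered_second_path, simplified_second_path and verify_second_path are invented names.

```cpp
#include <cassert>

// Final values of i and x (illustrative type).
struct Result { int i; int x; };

// The reordered list: both increments complete before i is read for the sum.
Result reordered_second_path() {
    int i = 1;               // 1
    int tmp1 = i;            // 2
    tmp1 += 1;               // 3
    i = tmp1;                // 4
    int tmp3 = i;            // 6
    tmp3 += 1;               // 7
    i = tmp3;                // 8
    int tmp2 = i;            // 5
    int tmp4 = i;            // 9
    int tmp5 = tmp2 + tmp4;  // 10
    return {i, tmp5};        // 11: tmp5 becomes x
}

// The final list: one read, one "add 2", one store, one doubling.
Result simplified_second_path() {
    int i = 1;               // 1
    int tmp1 = i;            // 2
    tmp1 += 2;               // 3+7
    i = tmp1;                // 8
    int tmp5 = tmp1 + tmp1;  // 10
    return {i, tmp5};        // 11: tmp5 becomes x
}

// Usage: both versions agree on x = 6 and i = 3.
void verify_second_path() {
    Result a = reordered_second_path();
    Result b = simplified_second_path();
    assert(a.x == 6 && a.i == 3);
    assert(b.x == 6 && b.i == 3);
}
```

Here i really is incremented twice, and the sum reads the final value of i for both operands.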

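Note that in both optimization paths the important ordering dependencies were preserved, insofar as the instructions were not removed for doing nothing.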

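Anonymous user

Although this is UB (as the OP implied), the following are hypothetical ways a compiler could arrive at the three results. All three would give the same correct result for x if used with distinct variables, int i = 1, j = 1;, instead of one and the same i.

Both ++i return 2, resulting in x = 4.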

int i = 1;
int i1 = i, i2 = i;   // i1 = i2 = 1
++i1;                 // i1 = 2
++i2;                 // i2 = 2
int x = i1 + i2;      // x = 4

One ++i returns 2 and the other returns 3, resulting in x = 5.

int i = 1;
int i1 = ++i;           // i1 = 2
int i2 = ++i;           // i2 = 3
int x = i1 + i2;        // x = 5

Both ++i return 3, resulting in x = 6.

int i = 1;
int &i1 = i, &i2 = i;
++i1;                   // i = 2
++i2;                   // i = 3
int x = i1 + i2;        // x = 6
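
The claim at the top of this answer, that the three strategies agree once the operands are distinct variables, can be checked with a small sketch. The function names below are invented for the illustration; each one applies one of the three strategies to int i = 1, j = 1 and returns x.

```cpp
#include <cassert>

// Strategy 1: copy both operands first, then increment the copies.
int copies_first() {
    int i = 1, j = 1;
    int i1 = i, j1 = j;   // i1 = j1 = 1
    ++i1;                 // i1 = 2
    ++j1;                 // j1 = 2
    return i1 + j1;
}

// Strategy 2: increment and capture one operand at a time.
int one_at_a_time() {
    int i = 1, j = 1;
    int i1 = ++i;         // i1 = 2
    int j1 = ++j;         // j1 = 2
    return i1 + j1;
}

// Strategy 3: increment through references, read the variables afterwards.
int through_references() {
    int i = 1, j = 1;
    int &i1 = i, &j1 = j;
    ++i1;                 // i = 2
    ++j1;                 // j = 2
    return i1 + j1;
}

// Usage: with two separate variables every strategy yields the same x.
void verify_distinct_variables() {
    assert(copies_first() == 4);
    assert(one_at_a_time() == 4);
    assert(through_references() == 4);
}
```

The strategies only diverge when both operands name the same object, which is exactly the case where the original expression has undefined behavior.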

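Anonymous user

I think a simpler answer can address this part of the OP's question:

One ++i returns 2 and the other returns 3...

This is probably a false premise, because for a built-in type like int the increment operator does not really "return" a value in the first place. It is not a function call; it does not store an intermediate result on the stack; it simply increments the variable i in place. Here, all such increments are carried out before the addition is evaluated. So the operations are just:

increment i
increment i
add i + i

After i has been incremented twice its value is 3, and adding it to itself gives 6.

To check, wrap the expression in a C++ function: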

int dblInc ()
{
    int i = 1;
    int x = ++i + ++i;
    return x;   
}

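Here is the assembly code I got by compiling that function with an old version of the GNU C++ compiler (win32, gcc version 3.4.2 (mingw-special)). There are no fancy optimizations here: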

__Z6dblIncv:
    push    ebp
    mov ebp, esp
    sub esp, 8
    mov DWORD PTR [ebp-4], 1
    lea eax, [ebp-4]
    inc DWORD PTR [eax]
    lea eax, [ebp-4]
    inc DWORD PTR [eax]
    mov eax, DWORD PTR [ebp-4]
    add eax, DWORD PTR [ebp-4]
    mov DWORD PTR [ebp-8], eax
    mov eax, DWORD PTR [ebp-8]
    leave
    ret

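Note that the local variable i lives on the stack in a single location: address [ebp-4]. That location is incremented twice (lines 6 and 8 of the assembly function, counting from push ebp, each preceded by an apparently redundant load of the address into eax). Then, on lines 9 and 10, the value is loaded into eax and then added to eax again (that is, the current i + i is computed). The sum is then redundantly copied to the stack and back into eax as the return value (which is obviously 6).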
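
For readers who do not read x86 assembly, the listing can be mirrored roughly line by line in well-defined C++. This is a hand-written paraphrase of the assembly above, not compiler output; the names dbl_inc_as_compiled, p and sum are invented, with p and sum standing in for the two roles eax plays.

```cpp
#include <cassert>

// Paraphrase of the gcc 3.4.2 listing; each statement cites the
// instruction it stands for.
int dbl_inc_as_compiled() {
    int i = 1;     // mov DWORD PTR [ebp-4], 1
    int* p = &i;   // lea eax, [ebp-4]
    ++*p;          // inc DWORD PTR [eax]
    p = &i;        // lea eax, [ebp-4]   (redundant reload of the address)
    ++*p;          // inc DWORD PTR [eax]
    int sum = i;   // mov eax, DWORD PTR [ebp-4]
    sum += i;      // add eax, DWORD PTR [ebp-4]
    int x = sum;   // mov DWORD PTR [ebp-8], eax
    return x;      // mov eax, DWORD PTR [ebp-8]
}

// Usage: both increments land in memory before either read of i.
void verify_dbl_inc() {
    assert(dbl_inc_as_compiled() == 6);
}
```

Written this way there is no undefined behavior left, because every increment and every read is its own statement.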

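Now, if you like, you can make an integer wrapper class (like a Java Integer), overload operator+ and operator++ so that they return intermediate value objects, and then write ++iobj + ++iobj and get back an object holding 5. (For brevity, the full code is not included here.) But for a primitive type like int it does not work that way; it just increments in place, possibly using the assembly-language inc instruction.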
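
The wrapper idea might look roughly like the sketch below. It is not the code the answer left out: the class name IntBox, its members and the helper functions are all invented here. One deliberate assumption is that operator++ returns a copy rather than the usual IntBox&, since returning a reference would make both operands refer to the same object again.

```cpp
#include <cassert>

// Hypothetical integer wrapper; every name here is illustrative.
class IntBox {
public:
    explicit IntBox(int v) : value_(v) {}

    // Prefix increment that returns a snapshot by value (not IntBox&),
    // so each ++ hands back its own intermediate object.
    IntBox operator++() {
        ++value_;
        return *this;
    }

    // Addition produces a new object holding the sum.
    friend IntBox operator+(const IntBox& a, const IntBox& b) {
        return IntBox(a.value_ + b.value_);
    }

    int get() const { return value_; }

private:
    int value_;
};

// The two operator++ calls are function calls, so they cannot interleave:
// one yields 2 and the other 3, in whichever order they run.
int sum_of_two_increments() {
    IntBox iobj(1);
    IntBox x = ++iobj + ++iobj;
    return x.get();
}

// Usage: the sum is 5 regardless of which operand is evaluated first.
void verify_wrapper() {
    assert(sum_of_two_increments() == 5);
}
```

Because overloaded operators are function calls, the two increments are indeterminately sequenced rather than unsequenced, and addition is commutative, so the result does not depend on the order the compiler picks.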
