The Python Data Analysis Trio, NumPy (Part 3): Array Iteration and Bitwise Operations

First published: 2020-04-06 | Word count: 4.3k | Reading time: 19 min | NumPy, Data Analysis

Table of Contents

【1x00】The numpy.nditer Iterator Object
【2x00】NumPy Bitwise Operations

Columns:

NumPy column: https://itrhx.blog.csdn.net/category_9780393.html
Pandas column: https://itrhx.blog.csdn.net/category_9780397.html
Matplotlib column: https://itrhx.blog.csdn.net/category_9780418.html

Recommended learning resources and websites:

NumPy official Chinese site: https://www.numpy.org.cn/
Pandas official Chinese site: https://www.pypandas.cn/
Matplotlib official Chinese site: https://www.matplotlib.org.cn/
NumPy, Matplotlib, and Pandas cheat sheets: https://github.com/TRHX/Python-quick-reference-table

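The following is anti-scraping text; readers may ignore it.
This article was originally published on CSDN by ITBOB.
Blog home: https://itrhx.blog.csdn.net/
Article link: https://itrhx.blog.csdn.net/article/details/105185337
Do not repost without authorization. Malicious reposting is at your own risk. Respect original work and stay away from plagiarism.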

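【1x00】The numpy.nditer Iterator Object

numpy.nditer is NumPy's iterator object. It provides flexible ways to visit every element of one or more arrays. At its simplest, its job is to give access to each array element in turn.

【1x01】Iterating over a Single Array

Example of iterating over a single array: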

>>> import numpy as np
>>> a = np.arange(10).reshape(2, 5)
>>> print(a)
[[0 1 2 3 4]
 [5 6 7 8 9]]
>>> for i in np.nditer(a):
    print(i, end=' ')

    
0 1 2 3 4 5 6 7 8 9

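Note: by default, the order in which elements are visited is not fixed to standard C (row-major) order or to Fortran (column-major) order. The iterator follows the array's memory layout instead. This makes access more efficient and reflects the idea that, by default, you only need to visit every element without caring about a particular order. The transposed array below illustrates this behavior.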

>>> import numpy as np
>>> a = np.arange(10).reshape(2, 5)
>>> print(a)
[[0 1 2 3 4]
 [5 6 7 8 9]]
>>> 
>>> b = a.T
>>> print(b)
[[0 5]
 [1 6]
 [2 7]
 [3 8]
 [4 9]]
>>> 
>>> c = a.T.copy(order='C')
>>> print(c)
[[0 5]
 [1 6]
 [2 7]
 [3 8]
 [4 9]]
>>> 
>>> for i in np.nditer(a):
    print(i, end=' ')

    
0 1 2 3 4 5 6 7 8 9 
>>> 
>>> for i in np.nditer(b):
    print(i, end=' ')

    
0 1 2 3 4 5 6 7 8 9 
>>> 
>>> for i in np.nditer(c):
    print(i, end=' ')

    
0 5 1 6 2 7 3 8 4 9

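In this example, a is an array with 2 rows and 5 columns, b is the transpose of a, and c is the transpose of a copied into new memory in C order (row-major). Although b is transposed, its elements are still stored in memory in the same order as those of a, so iterating over b gives the same result as iterating over a. The elements of c are stored in the new memory in a different order from a and b, so iterating over c gives a different result.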
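The layout argument above can be checked directly. The following is a minimal sketch, not part of the original walkthrough; the names order_a, order_b, and order_c are illustrative. It inspects the contiguity flags and memory sharing of the three arrays and records the order in which nditer visits them.

```python
import numpy as np

a = np.arange(10).reshape(2, 5)
b = a.T                    # transposed view, shares memory with a
c = a.T.copy(order='C')    # transposed copy, stored row-major in new memory

# The transposed view of a C-contiguous array is Fortran-contiguous.
print(a.flags['C_CONTIGUOUS'], b.flags['F_CONTIGUOUS'], c.flags['C_CONTIGUOUS'])

# b reuses the buffer of a, c does not.
print(np.shares_memory(a, b), np.shares_memory(a, c))

# Record the visiting order of the default iterator for each array.
order_a = [int(x) for x in np.nditer(a)]
order_b = [int(x) for x in np.nditer(b)]
order_c = [int(x) for x in np.nditer(c)]
print(order_a)
print(order_b)
print(order_c)
```

Because b is only a view, the default iterator walks the same buffer in the same order as for a, while c has its own buffer with a different element order.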

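【1x02】Controlling the Iteration Order

To iterate over an array in a specific order, nditer provides the order parameter, which accepts C, F, A, or K:

numpy.nditer(a, order='C'): standard C order, that is, row-major;
numpy.nditer(a, order='F'): Fortran order, that is, column-major;
numpy.nditer(a, order='A'): F order if all arrays are Fortran-contiguous, otherwise C order;
numpy.nditer(a, order='K'): the default; stays as close as possible to the order in which the elements are stored in memory.

Example: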

>>> import numpy as np
>>> a = np.arange(12).reshape(3, 4)
>>> print(a)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
>>> 
>>> for i in np.nditer(a, order='C'):
    print(i, end= ' ')

    
0 1 2 3 4 5 6 7 8 9 10 11 
>>> 
>>> for i in np.nditer(a, order='F'):
    print(i, end= ' ')

    
0 4 8 1 5 9 2 6 10 3 7 11 
>>> 
>>> for i in np.nditer(a, order='K'):
    print(i, end= ' ')

    
0 1 2 3 4 5 6 7 8 9 10 11
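The example above does not exercise order='A'. The sketch below is an illustration under the assumption that np.asfortranarray is used to obtain a Fortran-contiguous copy; the names c_arr, f_arr, a_on_c, and a_on_f are made up for this example.

```python
import numpy as np

c_arr = np.arange(6).reshape(2, 3)   # C-contiguous (row-major)
f_arr = np.asfortranarray(c_arr)     # same values, Fortran-contiguous copy

# order='A' picks F order only when the operand is Fortran-contiguous.
a_on_c = [int(x) for x in np.nditer(c_arr, order='A')]
a_on_f = [int(x) for x in np.nditer(f_arr, order='A')]

print(a_on_c)
print(a_on_f)
```

Both arrays hold the same logical values; only the memory layout differs, and that is what order='A' reacts to.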

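【1x03】Modifying Array Elements

The nditer object accepts an optional op_flags parameter, which defaults to readonly. To modify element values while iterating over an array, set op_flags to readwrite (read and write) or writeonly (write only).

Example: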

>>> import numpy as np
>>> a = np.arange(10).reshape(2, 5)
>>> print(a)
[[0 1 2 3 4]
 [5 6 7 8 9]]
>>> 
>>> for i in np.nditer(a, op_flags=['readwrite']):
        i[...] = i+1

    
>>> print(a)
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]

>>> import numpy as np
>>> li = []
>>> a = np.arange(12).reshape(3, 4)
>>> print(a)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
>>> 
>>> for i in np.nditer(a, op_flags=['readwrite']):
        li.append(i*2)

    
>>> print(li)
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22]
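A variant worth knowing is the context-manager form. This is a sketch that assumes a reasonably recent NumPy in which nditer can be used in a with statement; it is not taken from the original article.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# Using nditer as a context manager ensures that any pending write-back
# to the operand has completed by the time the with-block exits.
with np.nditer(a, op_flags=['readwrite']) as it:
    for x in it:
        # Assign through the element view; a plain "x = ..." would only
        # rebind the loop variable and leave the array unchanged.
        x[...] = 2 * x

print(a)
```

The key point in both forms is the x[...] = ... assignment, which writes into the array rather than replacing the loop variable.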

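【1x04】Using an External Loop

The nditer object accepts a flags parameter. Its most commonly used value is external_loop, which makes the iterator yield one-dimensional arrays holding multiple values instead of zero-dimensional arrays holding a single value each.

Put simply, when the memory order of the ndarray matches the iteration order, all elements come back as one single one-dimensional array; when the two differ, the iterator yields a separate one-dimensional array on each pass.

Official documentation: https://docs.scipy.org/doc/numpy/reference/arrays.nditer.html#using-an-external-loop

Example: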

>>> import numpy as np
>>> a = np.array([[0,1,2,3], [4,5,6,7], [8,9,10,11]], order='C')
>>> for i in np.nditer(a, flags=['external_loop'], order='C' ):
        print(i, end=' ')

    
[ 0  1  2  3  4  5  6  7  8  9 10 11] 
>>> for i in np.nditer(a, flags=['external_loop'], order='F' ):
        print(i, end=' ')

    
[0 4 8] [1 5 9] [ 2  6 10] [ 3  7 11]
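To see how the chunking depends on both the memory layout and the requested order, here is a small sketch. The helper chunk_sizes is hypothetical and exists only for this illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # C-contiguous
f = np.asfortranarray(a)          # same values, Fortran-contiguous copy

def chunk_sizes(arr, order):
    """Return the length of every 1-D chunk yielded with external_loop."""
    it = np.nditer(arr, flags=['external_loop'], order=order)
    return [chunk.size for chunk in it]

sizes = {
    ('C layout', 'C'): chunk_sizes(a, 'C'),
    ('C layout', 'F'): chunk_sizes(a, 'F'),
    ('F layout', 'F'): chunk_sizes(f, 'F'),
    ('F layout', 'C'): chunk_sizes(f, 'C'),
}
for key, value in sizes.items():
    print(key, value)
```

When layout and order agree, the iterator can hand back the whole buffer at once; otherwise it yields one chunk per column or per row.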

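【1x05】Tracking Element Indices

During iteration you may want to use the index of the current element in a computation. This is also done through the flags parameter:

Flag	Description
`c_index`	Track a C-order index
`f_index`	Track a Fortran-order index
`multi_index`	Track a multi-index, that is, a tuple with one index per iteration dimension

In the examples below:

With c_index or f_index, it.index gives the index of the current element.

With multi_index, it.multi_index gives the index of the current element.

it.iternext() advances to the next iteration until iteration is complete. It returns True while elements remain and False once the iterator is exhausted, which is why the interactive shell echoes True or False after each printed line.

multi_index can be understood as indexing the iterated object with multiple indices.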

>>> import numpy as np
>>> a = np.arange(6).reshape(2, 3)
>>> it = np.nditer(a, flags=['c_index'])
>>> while not it.finished:
        print('%d <%s>' %(it[0], it.index))
        it.iternext()

    
0 <0>
True
1 <1>
True
2 <2>
True
3 <3>
True
4 <4>
True
5 <5>
False

>>> import numpy as np
>>> a = np.arange(6).reshape(2, 3)
>>> it = np.nditer(a, flags=['f_index'])
>>> while not it.finished:
        print('%d <%s>' %(it[0], it.index))
        it.iternext()

    
0 <0>
True
1 <2>
True
2 <4>
True
3 <1>
True
4 <3>
True
5 <5>
False

>>> import numpy as np
>>> a = np.arange(6).reshape(2, 3)
>>> it = np.nditer(a, flags=['multi_index'])
>>> while not it.finished:
        print('%d <%s>' %(it[0], it.multi_index))
        it.iternext()

    
0 <(0, 0)>
True
1 <(0, 1)>
True
2 <(0, 2)>
True
3 <(1, 0)>
True
4 <(1, 1)>
True
5 <(1, 2)>
False
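The same index tracking also works with an ordinary for loop, which avoids the explicit it.iternext() calls. A minimal sketch follows; the dictionary name index_to_value is illustrative.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

it = np.nditer(a, flags=['multi_index'])
index_to_value = {}
for x in it:
    # it.multi_index is the (row, column) tuple of the element held in x.
    index_to_value[it.multi_index] = int(x)

print(index_to_value)
```

Each recorded tuple can be used to index the original array directly, for example a[1, 2].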

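【1x06】Iterating over Broadcast Arrays

If two arrays satisfy the broadcasting rules, an nditer object can iterate over them at the same time. This is broadcast array iteration (iteration over multiple arrays).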

>>> import numpy as np
>>> a = np.arange(3)
>>> b = np.arange(6).reshape(2,3)
>>> print(a)
[0 1 2]
>>> print(b)
[[0 1 2]
 [3 4 5]]
>>> for m, n in np.nditer([a,b]):
    print(m,n)

    
0 0
1 1
2 2
0 3
1 4
2 5
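One way to convince yourself of the pairing is to broadcast the arrays explicitly and compare. This sketch uses np.broadcast_arrays for the comparison; the names pairs and expected are illustrative.

```python
import numpy as np

a = np.arange(3)
b = np.arange(6).reshape(2, 3)

# Pairs produced by iterating over both arrays together.
pairs = [(int(m), int(n)) for m, n in np.nditer([a, b])]

# Broadcasting a to the shape of b explicitly yields the same pairing.
a_full, b_full = np.broadcast_arrays(a, b)
expected = list(zip(a_full.ravel().tolist(), b_full.ravel().tolist()))

print(pairs)
```

The one-dimensional array a is repeated once per row of b, which is why its values cycle through 0, 1, 2 twice.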

If two arrays do not satisfy the broadcasting rules, an exception is raised:

>>> import numpy as np
>>> a = np.arange(4)
>>> b = np.arange(6).reshape(2,3)
>>> print(a)
[0 1 2 3]
>>> print(b)
[[0 1 2]
 [3 4 5]]
>>> for m, n in np.nditer([a,b]):
        print(m,n)

    
Traceback (most recent call last):
  File "<pyshell#55>", line 1, in <module>
    for m, n in np.nditer([a,b]):
ValueError: operands could not be broadcast together with shapes (4,) (2,3)
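In a script, the failure can be handled rather than left to terminate the program. A minimal sketch, with the flag name compatible chosen only for illustration:

```python
import numpy as np

a = np.arange(4)
b = np.arange(6).reshape(2, 3)

try:
    # The shapes (4,) and (2, 3) cannot be broadcast, so constructing
    # the iterator already fails.
    it = np.nditer([a, b])
    compatible = True
except ValueError as exc:
    compatible = False
    print('cannot iterate together:', exc)
```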

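【2x00】NumPy Bitwise Operations

Bitwise operations act directly on the binary digits of an integer as stored in memory, so it helps to first look at how to obtain the binary representation of a number.

Python provides the built-in function bin(), which converts an integer to a binary string prefixed with 0b. To drop the 0b prefix you can use format() or an f-string. Because bin() returns a string, you can also remove the prefix with slicing or other string methods.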

>>> bin(3)
'0b11'
>>> format(3, 'b')
'11'
>>> f'{3:b}'
'11'
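The slicing approach mentioned above is not shown in the session, so here is a short sketch of it, together with zero padding and the negative-number case where slicing stops being reliable. Variable names are illustrative.

```python
n = 3

sliced = bin(n)[2:]           # drop the leading '0b' by slicing
formatted = format(n, 'b')    # same digits without a prefix
padded = format(n, '08b')     # zero-padded to 8 digits

# For negative numbers bin() places the minus sign before the prefix,
# so a fixed [2:] slice no longer strips it cleanly; format() does.
negative_bin = bin(-3)
negative_formatted = format(-3, 'b')

print(sliced, formatted, padded, negative_bin, negative_formatted)
```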

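Besides the built-ins, NumPy provides numpy.binary_repr, which likewise returns the binary representation of the input number as a string.

Basic syntax: numpy.binary_repr(num, width=None)

Parameters:

Parameter	Description
num	The number to represent; must be an integer
width	Optional. For a negative number, a minus sign is prepended if width is not given; if width is given, the two's complement of the number at that width is returned. A non-negative number is padded with leading zeros to that width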

>>> import numpy as np
>>> np.binary_repr(3)
'11'
>>> np.binary_repr(-3)
'-11'
>>> np.binary_repr(-3, width=4)
'1101'
>>> np.binary_repr(3, width=4)
'0011'
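The two's-complement result for a negative number can be reproduced with plain Python by masking the value to the chosen width. This is a sketch for illustration; by_mask and by_numpy are made-up names.

```python
import numpy as np

width = 4
value = -3

# Two's complement at a given width equals the value modulo 2**width,
# which the mask below computes for a Python integer.
by_mask = format(value & (2**width - 1), f'0{width}b')
by_numpy = np.binary_repr(value, width=width)

print(by_mask, by_numpy)
```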

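The table below lists the bitwise functions that NumPy provides for arrays. Each function behaves the same as the corresponding operator.

Function	Description	Operator
`bitwise_and`	Element-wise bitwise AND	&
`bitwise_or`	Element-wise bitwise OR	|
`bitwise_xor`	Element-wise bitwise XOR	^
`invert`	Element-wise bitwise NOT (inversion)	~
`left_shift`	Shift the binary representation of each element left by the given number of bits, appending the same number of 0s on the right	<<
`right_shift`	Shift the binary representation of each element right by the given number of bits; for non-negative values the vacated bits on the left are filled with 0s	>>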
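The equivalence between the functions and the operators can be checked in a few lines. The arrays x and y below are arbitrary sample values chosen for this sketch.

```python
import numpy as np

x = np.array([10, 13, 255])
y = np.array([3, 6, 4])

# Compare every function in the table with its operator form.
checks = {
    '&':  np.array_equal(np.bitwise_and(x, y), x & y),
    '|':  np.array_equal(np.bitwise_or(x, y), x | y),
    '^':  np.array_equal(np.bitwise_xor(x, y), x ^ y),
    '~':  np.array_equal(np.invert(x), ~x),
    '<<': np.array_equal(np.left_shift(x, y), x << y),
    '>>': np.array_equal(np.right_shift(x, y), x >> y),
}
print(checks)
```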

【2x01】numpy.bitwise_and()

numpy.bitwise_and() performs an element-wise bitwise AND on array elements.

>>> import numpy as np
>>> np.binary_repr(10), np.binary_repr(15)
('1010', '1111')
>>> np.bitwise_and(10, 15)
10
>>> np.binary_repr(10)
'1010'

numpy.bitwise_and() also operates on several elements at once:

>>> import numpy as np
>>> np.bitwise_and([14,3], 13)
array([12,  1], dtype=int32)

>>> import numpy as np
>>> np.bitwise_and([11,7], [4,25])
array([0, 1], dtype=int32)
>>>
>>> np.array([11,7]) & np.array([4,25])    # the function behaves the same as the & operator
array([0, 1], dtype=int32)

Boolean values can also be passed in:

>>> import numpy as np
>>> np.bitwise_and([True,False,True],[True,True,True])
array([ True, False,  True])
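A common use of bitwise AND is masking, that is, keeping only selected bits. The following sketch is an illustration with arbitrary sample values; it is not from the original article.

```python
import numpy as np

values = np.array([10, 15, 7, 32])

# The lowest bit is set exactly for odd numbers.
is_odd = np.bitwise_and(values, 1).astype(bool)

# Masking with 0b1111 keeps only the lowest four bits of each element.
low_nibble = np.bitwise_and(values, 0b1111)

print(is_odd)
print(low_nibble)
```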

【2x02】numpy.bitwise_or()

numpy.bitwise_or() performs an element-wise bitwise OR on array elements.

>>> import numpy as np
>>> np.binary_repr(10), np.binary_repr(14)
('1010', '1110')
>>> np.bitwise_or(10, 14)
14
>>> np.binary_repr(14)
'1110'

Like bitwise AND, numpy.bitwise_or() also accepts Boolean values and operates on several elements at once:

>>> import numpy as np
>>> np.bitwise_or([33,4], 1)
array([33,  5], dtype=int32)
>>>
>>> np.bitwise_or([33,4], [1,2])
array([33,  6], dtype=int32)
>>>
>>> np.bitwise_or(np.array([2,5,255]), np.array([4,4,4]))
array([  6,   5, 255], dtype=int32)
>>>
>>> np.array([2,5,255]) | np.array([4,4,4])   # the function behaves the same as the | operator
array([  6,   5, 255], dtype=int32)
>>>
>>> np.bitwise_or([True, True], [False,True])
array([ True,  True])
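Bitwise OR is often used to switch individual bits on. The sketch below uses hypothetical flag constants (READ, WRITE, EXEC) that are invented purely for illustration.

```python
import numpy as np

# Hypothetical flag values, one bit each (assumption for this example).
READ, WRITE, EXEC = 0b001, 0b010, 0b100

perms = np.array([READ, WRITE, READ | EXEC])

# OR with WRITE sets the WRITE bit on every element and leaves
# all other bits as they were.
with_write = np.bitwise_or(perms, WRITE)

print([np.binary_repr(int(p), width=3) for p in with_write])
```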

【2x03】numpy.bitwise_xor()

numpy.bitwise_xor() performs an element-wise bitwise XOR on array elements.

>>> import numpy as np
>>> bin(13), bin(17)
('0b1101', '0b10001')
>>> np.bitwise_xor(13,17)
28
>>> bin(28)
'0b11100'

>>> import numpy as np
>>> np.bitwise_xor([31,3], 5)
array([26,  6], dtype=int32)
>>> 
>>> np.bitwise_xor([31,3], [5,6])
array([26,  5], dtype=int32)
>>> 
>>> np.array([31,3]) ^ np.array([5,6])    # the function behaves the same as the ^ operator
array([26,  5], dtype=int32)
>>>
>>> np.bitwise_xor([True, True], [False, True])
array([ True, False])
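A useful property of XOR is that applying the same value twice restores the original. A minimal sketch with arbitrary sample values; the names data, key, scrambled, and restored are illustrative.

```python
import numpy as np

data = np.array([31, 3, 13])
key = 5

scrambled = np.bitwise_xor(data, key)       # flip the bits selected by key
restored = np.bitwise_xor(scrambled, key)   # flipping them again undoes it

print(scrambled)
print(restored)
```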

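【2x04】numpy.invert()

numpy.invert() performs an element-wise bitwise NOT on array elements. Note that bitwise inversion is not the same as negation.

General formula for bitwise inversion: ~x = -(x+1)

Call the original number A and the inverted number B. Bitwise inversion then proceeds as follows:
take the two's complement of A and invert every bit of it (including the sign bit); the result is the two's complement of B, which is then converted back to the sign-magnitude form of B to obtain the final result.

The two cases in detail:

Steps for a positive number

1. Convert it to binary;
2. Take its two's complement (for a positive number the sign-magnitude form, ones' complement, and two's complement are identical);
3. Invert every bit of the two's complement (including the sign bit);
[The result of step 3 is the two's complement of a negative number in binary. The next steps convert it back to sign-magnitude form, which is the reverse of going from sign-magnitude to two's complement.]
4. Subtract 1 from the negative number obtained in step 3 to get its ones' complement;
5. Invert every bit of the ones' complement from step 4 except the sign bit to get the sign-magnitude form;
6. Convert the sign-magnitude form from step 5 back to decimal; this is the final result.

Steps for a negative number

1. Convert it to binary;
2. Take its two's complement (first take the ones' complement, leaving the sign bit unchanged, then add 1 at the lowest bit);
3. Invert every bit of the two's complement (including the sign bit);
[The result of step 3 is a positive number in binary, so it only remains to read it as sign-magnitude.]
4. Because the sign-magnitude form, ones' complement, and two's complement of a positive number are identical, convert it directly to decimal to get the final result.

Note: the inversion in step 3 flips every bit, including the sign bit. This differs from taking the ones' complement, where the sign bit is left unchanged.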

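Worked examples (8-bit representation; the highest bit is the sign bit):

Bitwise inversion of 9

① Sign-magnitude: 0000 1001
② Ones' complement: 0000 1001
③ Two's complement: 0000 1001
④ Inverted: 1111 0110 (every bit inverted, including the sign bit, giving a new two's complement value)
⑤ Ones' complement: 1111 0101 (the new two's complement minus 1)
⑥ Sign-magnitude: 1000 1010 (every bit of the ones' complement inverted except the sign bit)
⑦ In decimal: -10

Bitwise inversion of -9

① Sign-magnitude: 1000 1001
② Ones' complement: 1111 0110
③ Two's complement: 1111 0111
④ Inverted: 0000 1000 (every bit inverted, including the sign bit, giving a new two's complement value)
⑤ Sign-magnitude: 0000 1000 (the new value is positive, so sign-magnitude and two's complement are identical)
⑥ In decimal: 8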

Python example:

>>> import numpy as np
>>> np.binary_repr(9, width=8)
'00001001'
>>> np.invert(9)
-10
>>> np.invert(-9)
8
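The formula ~x = -(x+1) and the 8-bit derivation can both be checked in code. The sketch below is an illustration; the unsigned case at the end is an extra observation, included to show that the formula as stated applies to signed integer types.

```python
import numpy as np

# Signed integers: the result matches -(x + 1) element by element.
x = np.array([9, -9, 0])
formula_holds = np.array_equal(np.invert(x), -(x + 1))

# Flip every bit of the 8-bit pattern of 9 by hand and compare it with
# the 8-bit two's complement pattern of -10.
bits_of_9 = np.binary_repr(9, width=8)
flipped = bits_of_9.translate(str.maketrans('01', '10'))
bits_of_minus_10 = np.binary_repr(-10, width=8)

# Unsigned integers have no sign bit, so the same bit flip yields 255 - x.
unsigned_result = int(np.invert(np.uint8(9)))

print(formula_holds, flipped, bits_of_minus_10, unsigned_result)
```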

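【2x05】numpy.left_shift()

numpy.left_shift() shifts the binary representation of each array element left by the given number of bits, appending the same number of 0s on the right.

Example: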

>>> import numpy as np
>>> np.binary_repr(10, width=8)
'00001010'
>>> np.left_shift(10, 2)
40
>>> 10 << 2         # numpy.left_shift is equivalent to Python's << operator
40
>>> np.binary_repr(40, width=8)
'00101000'
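Shifting left by n bits multiplies by 2**n as long as the result still fits in the integer type. A short sketch on arrays, with illustrative names:

```python
import numpy as np

x = np.array([1, 2, 3, 10])

shifted = np.left_shift(x, 2)        # every element shifted by 2 bits
same_as_multiply = np.array_equal(shifted, x * 2**2)

# The shift amount can itself be an array: one amount per element.
powers_of_two = np.left_shift(1, np.array([0, 1, 2, 3]))

print(shifted, same_as_multiply, powers_of_two)
```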

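【2x06】numpy.right_shift()

numpy.right_shift() shifts the binary representation of each array element right by the given number of bits. For non-negative values the vacated bits on the left are filled with 0s.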

>>> import numpy as np
>>> np.binary_repr(10, width=8)
'00001010'
>>> np.right_shift(10, 2)
2
>>> 10 >> 2         # numpy.right_shift is equivalent to Python's >> operator
2
>>> np.binary_repr(2, width=8)
'00000010'
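Shifting right by n bits corresponds to floor division by 2**n. The sketch below also includes negative values, assuming the usual behavior on common platforms where right shifts of signed integers preserve the sign.

```python
import numpy as np

x = np.array([10, 40, 255])
shifted = np.right_shift(x, 2)
same_as_floor_div = np.array_equal(shifted, x // 2**2)

# Negative signed values: the sign is kept, so the result is still
# the floor of the division rather than a large positive number.
neg = np.array([-8, -9])
neg_shifted = np.right_shift(neg, 1)

print(shifted, same_as_floor_div, neg_shifted)
```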
