<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>BOB&#39;S BLOG</title>
  
  <subtitle>Once you enter IT, the sea runs deep; from then on, the learning never ends</subtitle>
  <link href="https://www.itbob.cn/atom.xml" rel="self"/>
  
  <link href="https://www.itbob.cn/"/>
  <updated>2023-05-04T15:20:00.000Z</updated>
  <id>https://www.itbob.cn/</id>
  
  <author>
    <name>BOB</name>
    
  </author>
  
  <generator uri="https://hexo.io/">Hexo</generator>
  
  <entry>
    <title>Reverse Engineering the Douyin Slider CAPTCHA captchaBody: Pure-Algorithm Reconstruction of the JSVMP</title>
    <link href="https://www.itbob.cn/article/070/"/>
    <id>https://www.itbob.cn/article/070/</id>
    <published>2023-05-04T15:20:00.000Z</published>
    <updated>2023-05-04T15:20:00.000Z</updated>
    
    <content type="html"><![CDATA[<!-- ![captcha_reverse](https://static.wukongsec.com/itbob/images/cover/captcha_reverse.png) --><p><strong><center><font color="red" size="5px" weight="bolder">This article has received a lawyer's letter from Douyin, and the copies on CSDN and on this site have both been taken down.</font></center><br></strong></p><p><strong><center><font color="red" size="5px" weight="bolder">You are welcome to join the WeChat discussion group on web scraping and reverse engineering: add the WeChat account IT-BOB (with the note "discussion group")</font></center></strong></p><p><img src="https://static.wukongsec.com/itbob/images/article/070/Lawyer'sLetter1.png" alt="Lawyer's Letter1"></p><p><img src="https://static.wukongsec.com/itbob/images/article/070/Lawyer'sLetter2.png" alt="Lawyer's Letter2"></p>]]></content>
    
    
      
      
    <summary type="html">&lt;!-- ![captcha_reverse](https://static.wukongsec.com/itbob/images/cover/captcha_reverse.png) --&gt;
&lt;p&gt;&lt;strong&gt;&lt;center&gt;&lt;font color=&quot;red&quot; size=&quot;</summary>
      
    
    
    
    <category term="验证码逆向实战" scheme="https://www.itbob.cn/categories/%E9%AA%8C%E8%AF%81%E7%A0%81%E9%80%86%E5%90%91%E5%AE%9E%E6%88%98/"/>
    
    
    <category term="爬虫" scheme="https://www.itbob.cn/tags/%E7%88%AC%E8%99%AB/"/>
    
    <category term="验证码逆向实战" scheme="https://www.itbob.cn/tags/%E9%AA%8C%E8%AF%81%E7%A0%81%E9%80%86%E5%90%91%E5%AE%9E%E6%88%98/"/>
    
  </entry>
  
  <entry>
    <title>Python Data Structures: Implementing a Stack</title>
    <link href="https://www.itbob.cn/article/038/"/>
    <id>https://www.itbob.cn/article/038/</id>
    <published>2020-11-30T06:48:05.000Z</published>
    <updated>2022-05-22T12:49:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">Table of Contents</span></h2><!-- toc --><ul><li><a href="#zhan-de-gai-nian">The Concept of a Stack</a></li><li><a href="#zhan-de-te-dian">Characteristics of a Stack</a></li><li><a href="#zhan-de-cao-zuo">Stack Operations</a></li><li><a href="#python-shi-xian-zhan">Implementing a Stack in Python</a></li><li><a href="#zhan-de-jian-dan-ying-yong-gua-hao-pi-pei-wen-ti">A Simple Stack Application: Bracket Matching</a></li><li><a href="#zhan-de-jian-dan-ying-yong-dao-xu-shu-chu-yi-zu-yuan-su">A Simple Stack Application: Outputting Elements in Reverse Order</a></li></ul><!-- tocstop --><hr><h2><span id="zhan-de-gai-nian">The Concept of a Stack</span></h2><p>A stack is a linear data structure that stores data in first-in, last-out (equivalently, last-in, first-out) order. Insertions and deletions both happen at the same end, called the top of the stack; the other end is called the bottom. Inserting a new element is called pushing: the new element is placed above the current top element and becomes the new top. Removing an element is called popping: the top element is removed and the element next to it becomes the new top.</p><p><img src="https://static.wukongsec.com/itbob/images/article/038/01.png" alt="01"></p><hr><h2><span id="zhan-de-te-dian">Characteristics of a Stack</span></h2><p>Elements are last in, first out (Last In First Out, LIFO).</p><hr><h2><span id="zhan-de-cao-zuo">Stack Operations</span></h2><ul><li><font color="#FF0000"><strong>push(item)</strong></font>: push (add an element to the top of the stack)</li><li><font color="#FF0000"><strong>pop()</strong></font>: pop (remove the top element)</li><li><font color="#FF0000"><strong>top()</strong></font>: look at the top element</li><li><font color="#FF0000"><strong>empty()</strong></font>: check whether the stack is empty</li></ul><hr><h2><span id="python-shi-xian-zhan">Implementing a Stack in Python</span></h2><p>A stack is not a built-in Python type, but a list can simulate an array-based stack when needed. If the end of the list is treated as the top of the stack, the list method <code>append()</code> pushes an element, the list method <code>pop()</code> removes and returns the top element, and the index expression <code>arr[-1]</code> reads the top element. The implementation is as follows:</p><pre><code class="hljs python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Stack</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>):</span>
        self.stack = []

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">push</span>(<span class="hljs-params">self, item</span>):</span>        
        self.stack.append(item)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">pop</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">if</span> self.empty():
            <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>
        <span class="hljs-keyword">else</span>:
            <span class="hljs-keyword">return</span> self.stack.pop()

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">top</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">if</span> self.empty():
            <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>
        <span class="hljs-keyword">else</span>:
            <span class="hljs-keyword">return</span> self.stack[-<span class="hljs-number">1</span>]

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">empty</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">return</span> <span class="hljs-built_in">len</span>(self.stack) == <span class="hljs-number">0</span></code></pre><hr><h2><span id="zhan-de-jian-dan-ying-yong-gua-hao-pi-pei-wen-ti">A Simple Stack Application: Bracket Matching</span></h2><p><font color="#2DAF8B"><strong>Problem statement:</strong></font></p><p>Given a string that contains only parentheses <code>()</code>, square brackets <code>[]</code>, and curly braces <code>&#123;&#125;</code>, determine whether the brackets in the string match. Matching rule: brackets appear in adjacent pairs or are symmetrically nested, for example:</p><p><font color="#FF0000"><strong>()[]{}</strong></font>: match; <font color="#FF0000"><strong>{[()]}</strong></font>: match; <font color="#FF0000"><strong>({}]</strong></font>: no match; <font color="#FF0000"><strong>()]</strong></font>: no match; <font color="#FF0000"><strong>({)}</strong></font>: no match</p><p><font color="#2DAF8B"><strong>Solving it with a stack:</strong></font></p><p>Take the string <font color="#FF0000"><strong>()[{}]</strong></font> and read its brackets one at a time: push every left bracket, and for every right bracket check whether the top of the stack is the corresponding left bracket. The steps are:</p><ul><li><strong>①</strong> On the left parenthesis <font 
color="#FF0000"><strong>(</strong></font>, push it;</li><li><strong>②</strong> On the right parenthesis <font color="#FF0000"><strong>)</strong></font>, check whether the top of the stack is the left parenthesis <font color="#FF0000"><strong>(</strong></font>; if so, pop the left parenthesis <font color="#FF0000"><strong>(</strong></font>, and the stack is now empty;</li><li><strong>③</strong> On the left square bracket <font color="#FF0000"><strong>[</strong></font>, push it;</li><li><strong>④</strong> On the left curly brace <font color="#FF0000"><strong>{</strong></font>, push it;</li><li><strong>⑤</strong> On the right curly brace <font color="#FF0000"><strong>}</strong></font>, check whether the top of the stack is the left curly brace <font color="#FF0000"><strong>{</strong></font>; if so, pop the left curly brace <font color="#FF0000"><strong>{</strong></font>, which leaves only the left square bracket <font color="#FF0000"><strong>[</strong></font> on the stack;</li><li><strong>⑥</strong> On the right square bracket <font color="#FF0000"><strong>]</strong></font>, check whether the top of the stack is the left square bracket <font color="#FF0000"><strong>[</strong></font>; if so, pop the left square bracket <font color="#FF0000"><strong>[</strong></font>, and the stack is now empty;</li><li><strong>⑦</strong> Check whether the final stack is empty: if it is, the brackets match; if not, they do not. In steps <strong>② ⑤ ⑥</strong>, a failed check means the brackets do not match and the process can stop there.</li></ul><p><font color="#2DAF8B"><strong>Python implementation:</strong></font></p><pre><code class="hljs python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Stack</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>):</span>
        self.stack = []

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">push</span>(<span class="hljs-params">self, item</span>):</span>
        self.stack.append(item)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">pop</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">if</span> self.empty():
            <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>
        <span class="hljs-keyword">else</span>:
            <span class="hljs-keyword">return</span> self.stack.pop()

    <span 
class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">top</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">if</span> self.empty():
            <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>
        <span class="hljs-keyword">else</span>:
            <span class="hljs-keyword">return</span> self.stack[-<span class="hljs-number">1</span>]

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">empty</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">return</span> <span class="hljs-built_in">len</span>(self.stack) == <span class="hljs-number">0</span>


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">brackets_match</span>(<span class="hljs-params">s</span>):</span>
    match_dict = &#123;<span class="hljs-string">&#x27;&#125;&#x27;</span>: <span class="hljs-string">&#x27;&#123;&#x27;</span>, <span class="hljs-string">&#x27;]&#x27;</span>: <span class="hljs-string">&quot;[&quot;</span>, <span class="hljs-string">&#x27;)&#x27;</span>: <span class="hljs-string">&#x27;(&#x27;</span>&#125;
    stack = Stack()
    <span class="hljs-keyword">for</span> ch <span class="hljs-keyword">in</span> s:
        <span class="hljs-keyword">if</span> ch <span class="hljs-keyword">in</span> [<span class="hljs-string">&#x27;(&#x27;</span>, <span class="hljs-string">&#x27;[&#x27;</span>, <span class="hljs-string">&#x27;&#123;&#x27;</span>]:  <span class="hljs-comment"># Left bracket: push it</span>
            stack.push(ch)
        <span class="hljs-keyword">else</span>:  <span class="hljs-comment"># Right bracket</span>
            <span class="hljs-keyword">if</span> stack.empty():  <span class="hljs-comment"># Empty stack: no match, an extra right bracket has no left bracket to pair with</span>
                <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>
            <span 
class="hljs-keyword">elif</span> stack.top() == match_dict[ch]:  <span class="hljs-comment"># Top of the stack is the corresponding left bracket: pop it</span>
                stack.pop()
            <span class="hljs-keyword">else</span>:  <span class="hljs-comment"># Top of the stack is not the corresponding left bracket: no match</span>
                <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>
    <span class="hljs-keyword">if</span> stack.empty():  <span class="hljs-comment"># The brackets match only if the stack ends up empty</span>
        <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>


<span class="hljs-built_in">print</span>(brackets_match(<span class="hljs-string">&#x27;[&#123;()&#125;()&#123;()&#125;[](&#123;&#125;)&#123;&#125;]&#x27;</span>))
<span class="hljs-built_in">print</span>(brackets_match(<span class="hljs-string">&#x27;()[&#123;&#125;]&#x27;</span>))
<span class="hljs-built_in">print</span>(brackets_match(<span class="hljs-string">&#x27;(&#123;)&#125;&#x27;</span>))
<span class="hljs-built_in">print</span>(brackets_match(<span class="hljs-string">&#x27;[]&#125;&#x27;</span>))</code></pre><p>Output:</p><pre><code class="hljs python"><span class="hljs-literal">True</span>
<span class="hljs-literal">True</span>
<span class="hljs-literal">False</span>
<span class="hljs-literal">False</span></code></pre><hr><h2><span id="zhan-de-jian-dan-ying-yong-dao-xu-shu-chu-yi-zu-yuan-su">A Simple Stack Application: Outputting Elements in Reverse Order</span></h2><p>Push the elements onto the stack, then pop them off one by one:</p><pre><code class="hljs python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Stack</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>):</span>
        self.stack = []

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span 
class="hljs-title">push</span>(<span class="hljs-params">self, item</span>):</span>
        self.stack.append(item)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">pop</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">if</span> self.empty():
            <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>
        <span class="hljs-keyword">else</span>:
            <span class="hljs-keyword">return</span> self.stack.pop()

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">top</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">if</span> self.empty():
            <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>
        <span class="hljs-keyword">else</span>:
            <span class="hljs-keyword">return</span> self.stack[-<span class="hljs-number">1</span>]

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">empty</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">return</span> <span class="hljs-built_in">len</span>(self.stack) == <span class="hljs-number">0</span>


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">reverse_list</span>(<span class="hljs-params">s</span>):</span>
    stack = Stack()
    <span class="hljs-keyword">for</span> ch <span class="hljs-keyword">in</span> s:
        stack.push(ch)
    new_list = []
    <span class="hljs-keyword">while</span> <span class="hljs-keyword">not</span> stack.empty():
        new_list.append(stack.pop())
    <span class="hljs-keyword">return</span> new_list


<span class="hljs-built_in">print</span>(reverse_list([<span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>, <span class="hljs-string">&#x27;C&#x27;</span>, <span class="hljs-string">&#x27;D&#x27;</span>, 
<span class="hljs-string">&#x27;E&#x27;</span>]))</code></pre><p>Output:</p><pre><code class="hljs python">[<span class="hljs-string">&#x27;E&#x27;</span>, <span class="hljs-string">&#x27;D&#x27;</span>, <span class="hljs-string">&#x27;C&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>, <span class="hljs-string">&#x27;A&#x27;</span>]</code></pre>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;Table of Contents&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#zhan-de-gai-nian&quot;&gt;The Concept of a Stack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#zhan-de-te-dia</summary>
      
    
    
    
    <category term="数据结构" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E7%BB%93%E6%9E%84/"/>
    
    
    <category term="Python" scheme="https://www.itbob.cn/tags/Python/"/>
    
    <category term="数据结构" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E7%BB%93%E6%9E%84/"/>
    
  </entry>
  
  <entry>
    <title>Python Algorithms: Recursion and Tail Recursion, with Implementations of the Fibonacci Sequence and the Tower of Hanoi</title>
    <link href="https://www.itbob.cn/article/037/"/>
    <id>https://www.itbob.cn/article/037/</id>
    <published>2020-10-28T14:05:13.000Z</published>
    <updated>2022-05-22T12:48:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">Table of Contents</span></h2><!-- toc --><ul><li><a href="#di-gui-gai-nian">The Concept of Recursion</a></li><li><a href="#di-gui-yao-su">Elements of Recursion</a></li><li><a href="#di-gui-yu-die-dai-de-qu-bie">Recursion vs. Iteration</a></li><li><a href="#shi-li-yi-jie-cheng">Example 1: Factorial</a></li><li><a href="#shi-li-er-fei-bo-na-qi-shu-lie">Example 2: The Fibonacci Sequence</a></li><li><a href="#shi-li-san-han-nuo-ta-wen-ti">Example 3: The Tower of Hanoi</a></li><li><a href="#wei-di-gui">Tail Recursion</a></li><li><a href="#python-zhong-wei-di-gui-de-jie-jue-fang-an">Handling Tail Recursion in Python</a></li></ul><!-- tocstop --><hr><h2><span id="di-gui-gai-nian">The Concept of Recursion</span></h2><p><font color="#ff0000"><strong>Recursion</strong></font>: The programming technique in which a program calls itself is called recursion. Put simply, it is code calling itself. It typically reduces a large, complex problem, layer by layer, to a smaller problem similar to the original, and once the problem is small enough a base case (the recursion exit) returns a result. A recursive strategy needs only a small amount of code to describe the many repeated computations the solution requires, which greatly reduces the amount of code. The power of recursion lies in defining an infinite set of objects with a finite number of statements.</p><p><font color="#ff0000"><strong>Recursive function</strong></font>: In a programming language, a function that calls itself directly or indirectly is called a recursive function. The mathematical definition is as follows: for a function <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mo stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">f(x)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.10764em;">f</span><span class="mopen">(</span><span class="mord mathnormal">x</span><span class="mclose">)</span></span></span></span> whose domain is the set A, if there is some value in A, say <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>X</mi><mn>0</mn></msub></mrow><annotation encoding="application/x-tex">X_0</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" 
style="margin-right:0.07847em;">X</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0785em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">0</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>, whose function value <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mo stretchy="false">(</mo><msub><mi>x</mi><mn>0</mn></msub><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">f(x_0)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.10764em;">f</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">0</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span> is determined by <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mo stretchy="false">(</mo><mi>f</mi><mo stretchy="false">(</mo><msub><mi>x</mi><mn>0</mn></msub><mo stretchy="false">)</mo><mo stretchy="false">)</mo></mrow><annotation 
encoding="application/x-tex">f(f(x_0))</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.10764em;">f</span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.10764em;">f</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">0</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">))</span></span></span></span>, then <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mo stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">f(x)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.10764em;">f</span><span class="mopen">(</span><span class="mord mathnormal">x</span><span class="mclose">)</span></span></span></span> is called a recursive function.</p><hr><h2><span id="di-gui-yao-su">Elements of Recursion</span></h2><ul><li><p>A recursion must have a base case (a termination condition); otherwise it recurses without end and eventually overflows the stack;</p></li><li><p>A recursion must contain a problem that can be decomposed; for example, to compute <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mi>a</mi><mi>c</mi><mi>t</mi><mo stretchy="false">(</mo><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation 
encoding="application/x-tex">fact(n)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.10764em;">f</span><span class="mord mathnormal">a</span><span class="mord mathnormal">c</span><span class="mord mathnormal">t</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span>, you need <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo>∗</mo><mi>f</mi><mi>a</mi><mi>c</mi><mi>t</mi><mo stretchy="false">(</mo><mi>n</mi><mo>−</mo><mn>1</mn><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">n * fact(n-1)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4653em;"></span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">∗</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.10764em;">f</span><span class="mord mathnormal">a</span><span class="mord mathnormal">c</span><span class="mord mathnormal">t</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">1</span><span class="mclose">)</span></span></span></span>;</p></li><li><p>A recursion must move toward its base case; for example, each recursive call passes <span class="katex"><span class="katex-mathml"><math 
xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">n-1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6667em;vertical-align:-0.0833em;"></span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">1</span></span></span></span>, moving closer to the base case <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo>=</mo><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">n == 0</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">==</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">0</span></span></span></span>.</p></li></ul><hr><h2><span id="di-gui-yu-die-dai-de-qu-bie">Recursion vs. Iteration</span></h2><ul><li><p><font color="#ff0000"><strong>Recursion</strong></font>: Recursion descends step by step until it reaches the base case, establishing a path, and then computes the results back along that path. (A calls A.)</p></li><li><p><font color="#ff0000"><strong>Iteration</strong></font>: Iteration repeats a feedback process, usually to approach a desired goal or result. Each repetition of the process is called one "iteration", and the result of each iteration serves as the initial value of the next, so iteration computes from front to back. (A repeatedly calls B.)</p></li></ul><hr><h2><span id="shi-li-yi-jie-cheng">Example 1: Factorial</span></h2><p>The factorial of a positive integer is the product of all positive integers less than or equal to it, and the factorial of 0 is 1. That is, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo 
stretchy="false">!</mo><mo>=</mo><mn>1</mn><mo>×</mo><mn>2</mn><mo>×</mo><mn>3</mn><mo>×</mo><mi mathvariant="normal">.</mi><mi mathvariant="normal">.</mi><mi mathvariant="normal">.</mi><mo>×</mo><mo stretchy="false">(</mo><mi>n</mi><mo>−</mo><mn>1</mn><mo stretchy="false">)</mo><mo>×</mo><mi>n</mi></mrow><annotation encoding="application/x-tex">n!=1×2×3×...×(n-1)×n</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal">n</span><span class="mclose">!</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.7278em;vertical-align:-0.0833em;"></span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">×</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.7278em;vertical-align:-0.0833em;"></span><span class="mord">2</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">×</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.7278em;vertical-align:-0.0833em;"></span><span class="mord">3</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">×</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.6667em;vertical-align:-0.0833em;"></span><span class="mord">...</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">×</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">(</span><span class="mord 
mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">1</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">×</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">n</span></span></span></span>; defined recursively: <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo stretchy="false">!</mo><mo>=</mo><mo stretchy="false">(</mo><mi>n</mi><mo>−</mo><mn>1</mn><mo stretchy="false">)</mo><mo stretchy="false">!</mo><mo>×</mo><mi>n</mi></mrow><annotation encoding="application/x-tex">n!=(n-1)!×n</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal">n</span><span class="mclose">!</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">1</span><span class="mclose">)!</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">×</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" 
style="height:0.4306em;"></span><span class="mord mathnormal">n</span></span></span></span></p><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">factorial</span>(<span class="hljs-params">n</span>):</span>
    <span class="hljs-keyword">if</span> n == <span class="hljs-number">0</span>:
        <span class="hljs-keyword">return</span> <span class="hljs-number">1</span>
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">return</span> n * factorial(n-<span class="hljs-number">1</span>)</code></pre><hr><h2><span id="shi-li-er-fei-bo-na-qi-shu-lie">Example 2: The Fibonacci Sequence</span></h2><p>The Fibonacci sequence, also known as the golden-ratio sequence, was introduced by the mathematician Leonardo Fibonacci using rabbit breeding as an example, so it is also called the "rabbit sequence".</p><p>Consider the sequence 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …; from the third term onward, each term equals the sum of the two preceding terms. The code below numbers the terms so that F(1) = F(2) = 1, with the leading 0 as F(0). Defined by recurrence: <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>F</mi><mo stretchy="false">(</mo><mi>n</mi><mo stretchy="false">)</mo><mo>=</mo><mi>F</mi><mo stretchy="false">(</mo><mi>n</mi><mo>−</mo><mn>1</mn><mo stretchy="false">)</mo><mo>+</mo><mi>F</mi><mo stretchy="false">(</mo><mi>n</mi><mo>−</mo><mn>2</mn><mo stretchy="false">)</mo><mo stretchy="false">(</mo><mi>n</mi><mo>≥</mo><mn>3</mn><mo separator="true">,</mo><mi>n</mi><mo>∈</mo><msup><mi>N</mi><mo>∗</mo></msup><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">F(n)=F(n - 1)+F(n - 2) (n ≥ 3, n ∈ N^*)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.13889em;">F</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" 
style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.13889em;">F</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">1</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.13889em;">F</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">2</span><span class="mclose">)</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">≥</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.8389em;vertical-align:-0.1944em;"></span><span class="mord">3</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">∈</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.10903em;">N</span><span class="msupsub"><span 
class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6887em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin mtight">∗</span></span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span></p><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fibonacc</span>(<span class="hljs-params">n</span>):</span>
    <span class="hljs-keyword">if</span> n == <span class="hljs-number">1</span> <span class="hljs-keyword">or</span> n == <span class="hljs-number">2</span>:
        <span class="hljs-keyword">return</span> <span class="hljs-number">1</span>
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">return</span> fibonacc(n-<span class="hljs-number">1</span>) + fibonacc(n-<span class="hljs-number">2</span>)</code></pre><p>The time complexity of the approach above is <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><msup><mn>2</mn><mi>n</mi></msup><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(2^n)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord"><span class="mord">2</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6644em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span>, so even moderately large inputs take a very long time to compute. A simple remedy is the <code>lru_cache</code> decorator, which caches intermediate results:</p><pre><code class="hljs python"><span class="hljs-keyword">from</span> functools <span class="hljs-keyword">import</span> lru_cache

<span class="hljs-comment"># Cache results fibonacc has already computed; maxsize=1024 keeps at most 1024 entries (not bytes)</span>
<span class="hljs-meta">@lru_cache(<span class="hljs-params">maxsize=<span class="hljs-number">1024</span></span>)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fibonacc</span>(<span class="hljs-params">n</span>):</span>
    <span class="hljs-keyword">if</span> n == <span class="hljs-number">1</span> <span class="hljs-keyword">or</span> n == <span class="hljs-number">2</span>:
        <span class="hljs-keyword">return</span> <span class="hljs-number">1</span>
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">return</span> fibonacc(n-<span class="hljs-number">1</span>) + fibonacc(n-<span class="hljs-number">2</span>)</code></pre><p>There is also an approach that saves both time and space by carrying the two running values along as accumulator arguments:</p><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fibonacc</span>(<span class="hljs-params">n, current=<span class="hljs-number">0</span>, <span class="hljs-built_in">next</span>=<span class="hljs-number">1</span></span>):</span>
    <span class="hljs-keyword">if</span> n == <span class="hljs-number">0</span>:
        <span class="hljs-keyword">return</span> current
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">return</span> fibonacc(n-<span class="hljs-number">1</span>, <span class="hljs-built_in">next</span>, current+<span class="hljs-built_in">next</span>)</code></pre><hr><h2><span id="shi-li-san-han-nuo-ta-wen-ti">Example 3: The Tower of Hanoi</span></h2><p>The Tower of Hanoi is a puzzle that originates from an ancient Indian legend. When Brahma created the world, he made three diamond pillars and stacked 64 golden discs on one of them, ordered by size with the largest at the bottom. Brahma ordered the Brahmins to restack the discs in the same order on another pillar, under two rules: a larger disc may never be placed on a smaller one, and only one disc may be moved at a time between the three pillars. The day all 64 discs have been moved is the day the world ends.</p><p><img 
src="https://static.wukongsec.com/itbob/images/article/037/01.gif" alt="01 Tower of Hanoi"></p><p>For n discs stacked on pillar A, the moves are:</p><ul><li>Move the top n-1 discs from A to B, using C as the intermediate pillar</li><li>Move the last (largest) disc from A to C</li><li>Move the n-1 discs from B to C, using A as the intermediate pillar</li></ul><p><img src="https://static.wukongsec.com/itbob/images/article/037/02.png" alt="02 Tower of Hanoi"></p><p>Recursive implementation:</p><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">hanoi</span>(<span class="hljs-params">n, a, b, c</span>):</span>                                <span class="hljs-comment"># n discs; a, b, c are the three pillars</span>
    <span class="hljs-keyword">if</span> n &gt; <span class="hljs-number">0</span>:
        hanoi(n-<span class="hljs-number">1</span>, a, c, b)                           <span class="hljs-comment"># move n-1 discs from a to b via c</span>
        <span class="hljs-built_in">print</span>(<span class="hljs-string">&#x27;moving from &#123;0&#125; to &#123;1&#125;&#x27;</span>.<span class="hljs-built_in">format</span>(a, c))  <span class="hljs-comment"># move the last disc from a to c</span>
        hanoi(n-<span class="hljs-number">1</span>, b, a, c)                           <span class="hljs-comment"># move n-1 discs from b to c via a</span></code></pre><p>Example:</p><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">hanoi</span>(<span class="hljs-params">n, a, b, c</span>):</span>
    <span class="hljs-keyword">if</span> n &gt; <span class="hljs-number">0</span>:
        hanoi(n-<span class="hljs-number">1</span>, a, c, b)
        <span class="hljs-built_in">print</span>(<span class="hljs-string">&#x27;moving from &#123;0&#125; to &#123;1&#125;&#x27;</span>.<span class="hljs-built_in">format</span>(a, c))
        hanoi(n-<span class="hljs-number">1</span>, b, a, c)

hanoi(<span class="hljs-number">3</span>, <span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>, <span class="hljs-string">&#x27;C&#x27;</span>)</code></pre><pre><code 
class="hljs python">moving <span class="hljs-keyword">from</span> A to C
moving <span class="hljs-keyword">from</span> A to B
moving <span class="hljs-keyword">from</span> C to B
moving <span class="hljs-keyword">from</span> A to C
moving <span class="hljs-keyword">from</span> B to A
moving <span class="hljs-keyword">from</span> B to C
moving <span class="hljs-keyword">from</span> A to C</code></pre><hr><h2><span id="wei-di-gui">Tail Recursion</span></h2><p>A recursive function is tail-recursive if every recursive call in it is a tail call: the recursive call is the last operation the function performs, and its result is returned directly with no further computation. Put simply, the recursive call comes at the very end of the function.</p><pre><code class="hljs python"><span class="hljs-comment"># Ordinary recursion: print(n) still runs after the recursive call returns</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">func</span>(<span class="hljs-params">n</span>):</span>
    <span class="hljs-keyword">if</span> n &gt; <span class="hljs-number">0</span>:
        func(n-<span class="hljs-number">1</span>)
        <span class="hljs-built_in">print</span>(n)

<span class="hljs-comment"># Ordinary recursion: the addition still runs after the recursive call returns</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">func</span>(<span class="hljs-params">n</span>):</span>
    <span class="hljs-keyword">if</span> n &gt; <span class="hljs-number">0</span>:
        <span class="hljs-keyword">return</span> func(n-<span class="hljs-number">1</span>) + n

<span class="hljs-comment"># Tail recursion: the recursive call is the final operation</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">func</span>(<span class="hljs-params">n</span>):</span>
    a = n
    <span class="hljs-keyword">if</span> n &gt; <span class="hljs-number">0</span>:
        a += <span class="hljs-number">1</span>
        <span class="hljs-built_in">print</span>(a, n)
        <span class="hljs-keyword">return</span> func(n-<span class="hljs-number">1</span>)</code></pre><p>In ordinary recursion, each level has to keep its own local variables, so every call needs a new stack frame. As the recursion deepens, more and more frames accumulate, which can easily overflow the stack.</p><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">normal_recursion</span>(<span 
class="hljs-params">n</span>):</span>
    <span class="hljs-keyword">if</span> n == <span class="hljs-number">1</span>:
        <span class="hljs-keyword">return</span> <span class="hljs-number">1</span>
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">return</span> n + normal_recursion(n-<span class="hljs-number">1</span>)</code></pre><p>Evaluating <code>normal_recursion(5)</code>:</p><pre><code class="hljs python">normal_recursion(<span class="hljs-number">5</span>)
<span class="hljs-number">5</span> + normal_recursion(<span class="hljs-number">4</span>)
<span class="hljs-number">5</span> + <span class="hljs-number">4</span> + normal_recursion(<span class="hljs-number">3</span>)
<span class="hljs-number">5</span> + <span class="hljs-number">4</span> + <span class="hljs-number">3</span> + normal_recursion(<span class="hljs-number">2</span>)
<span class="hljs-number">5</span> + <span class="hljs-number">4</span> + <span class="hljs-number">3</span> + <span class="hljs-number">2</span> + normal_recursion(<span class="hljs-number">1</span>)
<span class="hljs-number">5</span> + <span class="hljs-number">4</span> + <span class="hljs-number">3</span> + <span class="hljs-number">3</span>
<span class="hljs-number">5</span> + <span class="hljs-number">4</span> + <span class="hljs-number">6</span>
<span class="hljs-number">5</span> + <span class="hljs-number">10</span>
<span class="hljs-number">15</span></code></pre><p>Tail recursion is built on tail calls: each level directly returns the result of the next recursive call, with nothing left to do afterwards. The caller's local variables are no longer needed, so an implementation that optimizes tail calls can reuse the current stack frame instead of creating a new one, which makes the process behave like iteration.</p><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">tail_recursion</span>(<span class="hljs-params">n, total=<span class="hljs-number">0</span></span>):</span>
    <span class="hljs-keyword">if</span> n == <span class="hljs-number">0</span>:
        <span class="hljs-keyword">return</span> total
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">return</span> tail_recursion(n-<span 
class="hljs-number">1</span>, total+n)</code></pre><p>Evaluating <code>tail_recursion(5)</code>:</p><pre><code class="hljs python">tail_recursion(<span class="hljs-number">5</span>, <span class="hljs-number">0</span>)
tail_recursion(<span class="hljs-number">4</span>, <span class="hljs-number">5</span>)
tail_recursion(<span class="hljs-number">3</span>, <span class="hljs-number">9</span>)
tail_recursion(<span class="hljs-number">2</span>, <span class="hljs-number">12</span>)
tail_recursion(<span class="hljs-number">1</span>, <span class="hljs-number">14</span>)
tail_recursion(<span class="hljs-number">0</span>, <span class="hljs-number">15</span>)
<span class="hljs-number">15</span></code></pre><p>Languages such as Python, Java, and Pascal generally do not guarantee tail-call optimization, so tail recursion is usually replaced with iteration using constructs such as for, while, or goto.</p><hr><h2><span id="python-zhong-wei-di-gui-de-jie-jue-fang-an">Workarounds for Tail Recursion in Python</span></h2><p>Compute a Fibonacci number with the accumulator-style recursive function, written in tail-recursive form but run without any optimization:</p><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fibonacc</span>(<span class="hljs-params">n, current=<span class="hljs-number">0</span>, <span class="hljs-built_in">next</span>=<span class="hljs-number">1</span></span>):</span>
    <span class="hljs-keyword">if</span> n == <span class="hljs-number">0</span>:
        <span class="hljs-keyword">return</span> current
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">return</span> fibonacc(n-<span class="hljs-number">1</span>, <span class="hljs-built_in">next</span>, current+<span class="hljs-built_in">next</span>)

a = fibonacc(<span class="hljs-number">1000</span>)
<span class="hljs-built_in">print</span>(a)</code></pre><p>This raises an error because the maximum recursion depth is exceeded (the default limit in CPython is 1000, and the usable depth is slightly lower because frames already on the stack count toward it):</p><pre><code class="hljs python">Traceback (most recent call last):
  File <span class="hljs-string">&quot;F:/PycharmProjects/algorithm/fibonacc_test.py&quot;</span>, line <span class="hljs-number">57</span>, <span class="hljs-keyword">in</span> &lt;module&gt;
    a = fibonacc(<span 
class="hljs-number">1000</span>)
  File <span class="hljs-string">&quot;F:/PycharmProjects/algorithm/fibonacc_test.py&quot;</span>, line <span class="hljs-number">47</span>, <span class="hljs-keyword">in</span> fibonacc
    <span class="hljs-keyword">return</span> fibonacc(n-<span class="hljs-number">1</span>, <span class="hljs-built_in">next</span>, current+<span class="hljs-built_in">next</span>)
  File <span class="hljs-string">&quot;F:/PycharmProjects/algorithm/fibonacc_test.py&quot;</span>, line <span class="hljs-number">47</span>, <span class="hljs-keyword">in</span> fibonacc
    <span class="hljs-keyword">return</span> fibonacc(n-<span class="hljs-number">1</span>, <span class="hljs-built_in">next</span>, current+<span class="hljs-built_in">next</span>)
  File <span class="hljs-string">&quot;F:/PycharmProjects/algorithm/fibonacc_test.py&quot;</span>, line <span class="hljs-number">47</span>, <span class="hljs-keyword">in</span> fibonacc
    <span class="hljs-keyword">return</span> fibonacc(n-<span class="hljs-number">1</span>, <span class="hljs-built_in">next</span>, current+<span class="hljs-built_in">next</span>)
  [Previous line repeated <span class="hljs-number">995</span> more times]
  File <span class="hljs-string">&quot;F:/PycharmProjects/algorithm/fibonacc_test.py&quot;</span>, line <span class="hljs-number">44</span>, <span class="hljs-keyword">in</span> fibonacc
    <span class="hljs-keyword">if</span> n == <span class="hljs-number">0</span>:
RecursionError: maximum recursion depth exceeded <span class="hljs-keyword">in</span> comparison</code></pre><p>If the recursion is not very deep, you can raise the recursion limit manually (a limit that is set too high can crash the interpreter through a genuine stack overflow):</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> sys

sys.setrecursionlimit(<span class="hljs-number">10000</span>)  <span class="hljs-comment"># raise the recursion limit to 10000</span></code></pre><p>If the recursion is very deep, tail-call optimization would be the natural solution, but Python does not officially support it (a deliberate design decision, partly to keep tracebacks complete). That did not stop programmers, though: back in 2006, <a href="https://code.activestate.com/recipes/users/2792865/">Crutcher Dunnavant</a> came up with a workaround, a <code>tail_call_optimized</code> decorator. The original recipe is at <a href="https://code.activestate.com/recipes/474088/">https://code.activestate.com/recipes/474088/</a>; it was written for Python 2.4, and a Python 3.x version looks like this:</p><pre><code class="hljs python"><span class="hljs-comment"># This program shows off a python decorator</span>
<span class="hljs-comment"># which implements tail call optimization. It</span>
<span class="hljs-comment"># does this by throwing an exception if it is</span>
<span class="hljs-comment"># its own grandparent, and catching such</span>
<span class="hljs-comment"># exceptions to recall the stack.</span>
<span class="hljs-keyword">import</span> sys

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">TailRecurseException</span>(<span class="hljs-params">BaseException</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, args, kwargs</span>):</span>
        self.args = args
        self.kwargs = kwargs

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">tail_call_optimized</span>(<span class="hljs-params">g</span>):</span>
    <span class="hljs-string">&quot;&quot;&quot;</span>
<span class="hljs-string">    This function decorates a function with tail call</span>
<span class="hljs-string">    optimization. 
It does this by throwing an exception</span>
<span class="hljs-string">    if it is its own grandparent, and catching such</span>
<span class="hljs-string">    exceptions to fake the tail call optimization.</span>
<span class="hljs-string"></span>
<span class="hljs-string">    This function fails if the decorated</span>
<span class="hljs-string">    function recurses in a non-tail context.</span>
<span class="hljs-string">    &quot;&quot;&quot;</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">func</span>(<span class="hljs-params">*args, **kwargs</span>):</span>
        f = sys._getframe()
        <span class="hljs-keyword">if</span> f.f_back <span class="hljs-keyword">and</span> f.f_back.f_back <span class="hljs-keyword">and</span> f.f_back.f_back.f_code == f.f_code:
            <span class="hljs-keyword">raise</span> TailRecurseException(args, kwargs)
        <span class="hljs-keyword">else</span>:
            <span class="hljs-keyword">while</span> <span class="hljs-number">1</span>:
                <span class="hljs-keyword">try</span>:
                    <span class="hljs-keyword">return</span> g(*args, **kwargs)
                <span class="hljs-keyword">except</span> TailRecurseException <span class="hljs-keyword">as</span> e:
                    args = e.args
                    kwargs = e.kwargs
    func.__doc__ = g.__doc__
    <span class="hljs-keyword">return</span> func</code></pre><p>With this decorator applied, a large Fibonacci number can be computed:</p><pre><code class="hljs python"><span class="hljs-meta">@tail_call_optimized</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fibonacc</span>(<span class="hljs-params">n, current=<span class="hljs-number">0</span>, <span class="hljs-built_in">next</span>=<span class="hljs-number">1</span></span>):</span>
    <span class="hljs-keyword">if</span> n == <span class="hljs-number">0</span>:
        <span class="hljs-keyword">return</span> current
    <span 
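class="hljs-keyword">else</span>:
        <span class="hljs-keyword">return</span> fibonacc(n-<span class="hljs-number">1</span>, <span class="hljs-built_in">next</span>, current+<span class="hljs-built_in">next</span>)

a = fibonacc(<span class="hljs-number">1000</span>)
<span class="hljs-built_in">print</span>(a)</code></pre><p>Output:</p><pre><code class="hljs python"><span class="hljs-number">43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875</span></code></pre><p>How <code>tail_call_optimized</code> works: once a recursive function is wrapped by the decorator, every call to it goes through the wrapper <code>func</code>. The outermost call enters the wrapper's while loop and invokes the original function. When that function makes its tail call, the wrapper runs again, and the check <code>f.f_back.f_back.f_code == f.f_code</code> detects that the wrapper is its own grandparent frame, which means this is a recursive call. Instead of recursing deeper, the wrapper raises <code>TailRecurseException</code> carrying the arguments of the tail call. The exception unwinds the intermediate frames back to the outer while loop, which catches it and calls the function again with the captured arguments. The stack therefore stays at a small, constant number of frames no matter how many recursive steps are taken, which is the whole point of the optimization.</p><hr><pre><code class="hljs yaml"><span class="hljs-string">This is a block of anti-scraping text; readers may ignore it.</span>
<span class="hljs-string">This article was first published on CSDN by TRHX•鲍勃.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/109322815</span>
<span class="hljs-string">Reproduction without authorization is prohibited. Respect original work and do not plagiarize.</span></code></pre>]]></content>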
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;Table of Contents&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#di-gui-gai-nian&quot;&gt;The Concept of Recursion&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#di-gui-yao-su&quot;&gt;</summary>
      
    
    
    
    <category term="算法" scheme="https://www.itbob.cn/categories/%E7%AE%97%E6%B3%95/"/>
    
    
    <category term="Python" scheme="https://www.itbob.cn/tags/Python/"/>
    
    <category term="算法" scheme="https://www.itbob.cn/tags/%E7%AE%97%E6%B3%95/"/>
    
  </entry>
  
  <entry>
    <title>Ten Classic Sorting Algorithms Implemented in Python</title>
    <link href="https://www.itbob.cn/article/036/"/>
    <id>https://www.itbob.cn/article/036/</id>
    <published>2020-10-23T16:12:00.000Z</published>
    <updated>2022-05-22T12:47:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">Table of Contents</span></h2><!-- toc --><ul><li><a href="#font-color-ff0000-pai-xu-suan-fa-fen-lei-font"><font color="#FF0000">Classification of Sorting Algorithms</font></a></li><li><a href="#font-color-ff0000-yi-mou-pao-pai-xu-bubble-sort-font"><font color="#FF0000">I. Bubble Sort</font></a><ul><li><a href="#1-yuan-li">1. Principle</a></li><li><a href="#2-bu-zou">2. Steps</a></li><li><a href="#3-dong-hua-yan-shi">3. Animation</a></li><li><a href="#4-dai-ma-shi-xian">4. Implementation</a></li><li><a href="#5-ju-ti-shi-li">5. Worked Example</a></li></ul></li><li><a href="#font-color-ff0000-er-xuan-ze-pai-xu-selection-sort-font"><font color="#FF0000">II. Selection Sort</font></a><ul><li><a href="#1-yuan-li-1">1. Principle</a></li><li><a href="#2-bu-zou-1">2. Steps</a></li><li><a href="#3-dong-hua-yan-shi-1">3. Animation</a></li><li><a href="#4-dai-ma-shi-xian-1">4. Implementation</a></li><li><a href="#5-ju-ti-shi-li-1">5. Worked Example</a></li></ul></li><li><a href="#font-color-ff0000-san-cha-ru-pai-xu-insertion-sort-font"><font color="#FF0000">III. Insertion Sort</font></a><ul><li><a href="#1-yuan-li-2">1. Principle</a></li><li><a href="#2-bu-zou-2">2. Steps</a></li><li><a href="#3-dong-hua-yan-shi-2">3. Animation</a></li><li><a href="#4-dai-ma-shi-xian-2">4. Implementation</a></li><li><a href="#5-ju-ti-shi-li-2">5. Worked Example</a></li></ul></li><li><a href="#font-color-ff0000-si-xi-er-pai-xu-shell-sort-font"><font color="#FF0000">IV. Shell Sort</font></a><ul><li><a href="#1-yuan-li-3">1. Principle</a></li><li><a href="#2-bu-zou-3">2. Steps</a></li><li><a href="#3-dong-hua-yan-shi-3">3. Animation</a></li><li><a href="#4-dai-ma-shi-xian-3">4. Implementation</a></li><li><a href="#5-ju-ti-shi-li-3">5. Worked Example</a></li></ul></li><li><a href="#font-color-ff0000-wu-gui-bing-pai-xu-merge-sort-font"><font color="#FF0000">V. Merge Sort</font></a><ul><li><a href="#1-yuan-li-4">1. Principle</a></li><li><a href="#2-bu-zou-4">2. Steps</a></li><li><a href="#3-dong-hua-yan-shi-4">3. Animation</a></li><li><a href="#4-dai-ma-shi-xian-4">4. Implementation</a></li><li><a 
href="#5-ju-ti-shi-li-4">5. Worked Example</a></li></ul></li><li><a href="#font-color-ff0000-liu-kuai-su-pai-xu-quick-sort-font"><font color="#FF0000">VI. Quick Sort</font></a><ul><li><a href="#1-yuan-li-5">1. Principle</a></li><li><a href="#2-bu-zou-5">2. Steps</a></li><li><a href="#3-dong-hua-yan-shi-5">3. Animation</a></li><li><a href="#4-dai-ma-shi-xian-5">4. Implementation</a></li><li><a href="#5-ju-ti-shi-li-5">5. Worked Example</a></li></ul></li><li><a href="#font-color-ff0000-qi-dui-pai-xu-heap-sort-font"><font color="#FF0000">VII. Heap Sort</font></a><ul><li><a href="#1-yuan-li-6">1. Principle</a></li><li><a href="#2-bu-zou-6">2. Steps</a></li><li><a href="#3-dong-hua-yan-shi-6">3. Animation</a></li><li><a href="#4-dai-ma-shi-xian-6">4. Implementation</a></li><li><a href="#5-ju-ti-shi-li-6">5. Worked Example</a></li></ul></li><li><a href="#font-color-ff0000-ba-ji-shu-pai-xu-counting-sort-font"><font color="#FF0000">VIII. Counting Sort</font></a><ul><li><a href="#1-yuan-li-7">1. Principle</a></li><li><a href="#2-bu-zou-7">2. Steps</a></li><li><a href="#3-dong-hua-yan-shi-7">3. Animation</a></li><li><a href="#4-dai-ma-shi-xian-7">4. Implementation</a></li><li><a href="#5-ju-ti-shi-li-7">5. Worked Example</a></li></ul></li><li><a href="#font-color-ff0000-jiu-tong-pai-xu-bucket-sort-font"><font color="#FF0000">IX. Bucket Sort</font></a><ul><li><a href="#1-yuan-li-8">1. Principle</a></li><li><a href="#2-bu-zou-8">2. Steps</a></li><li><a href="#3-dong-hua-yan-shi-8">3. Animation</a></li><li><a href="#4-dai-ma-shi-xian-8">4. Implementation</a></li><li><a href="#5-ju-ti-shi-li-8">5. Worked Example</a></li></ul></li><li><a href="#font-color-ff0000-shi-ji-shu-pai-xu-radix-sort-font"><font color="#FF0000">X. Radix Sort</font></a><ul><li><a href="#1-yuan-li-9">1. Principle</a></li><li><a href="#2-bu-zou-9">2. Steps</a></li><li><a href="#3-dong-hua-yan-shi-9">3. Animation</a></li><li><a href="#4-dai-ma-shi-xian-9">4. Implementation</a></li><li><a href="#5-ju-ti-shi-li-9">5. Worked Example</a></li></ul></li></ul><!-- tocstop --><hr><ul><li>Reference: <a 
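href="https://www.bilibili.com/video/BV1mp4y1D7UP">https://www.bilibili.com/video/BV1mp4y1D7UP</a></li><li>Source of the animations in this article: <a href="https://visualgo.net/">https://visualgo.net/</a></li></ul><pre><code class="hljs yaml"><span class="hljs-string">This is a block of anti-scraping text; readers may ignore it.</span>
<span class="hljs-string">This article was first published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/108987300</span>
<span class="hljs-string">Reproduction without authorization is prohibited. Respect original work and do not plagiarize.</span></code></pre><hr><h2><span id="pai-xu-suan-fa-fen-lei"><font color="#FF0000">Classification of Sorting Algorithms</font></span></h2><ul><li><font color="#FF0000"><strong>Internal sorting</strong></font>: sorting in which all elements are held in memory for the duration of the sort. Common internal sorting algorithms include <strong>bubble sort, selection sort, insertion sort, Shell sort, merge sort, quick sort, heap sort, and radix sort</strong>.</li><li><font color="#FF0000"><strong>External sorting</strong></font>: sorting in which the elements cannot all be held in memory at once, so data has to be moved between memory and external storage as the sort proceeds.</li><li><font color="#FF0000"><strong>Comparison sorts</strong></font>: determine the relative order of elements by comparing them. Their worst-case time complexity cannot do better than O(nlogn), so they are also called nonlinear-time comparison sorts.</li><li><font color="#FF0000"><strong>Non-comparison sorts</strong></font>: do not determine the relative order of elements by comparing them. They can beat the lower bound of comparison-based sorting and run in linear time, so they are also called linear-time non-comparison sorts. Common non-comparison sorts include <strong>radix sort, counting sort, and bucket sort</strong>.</li></ul><hr><p>In general, an internal sorting algorithm performs two kinds of operations while it runs: comparisons and moves. It compares the keys of two elements to determine their relative order, then moves elements to put them in order. Not every internal sorting algorithm is based on comparison, however.</p><p>Every sorting algorithm has its own strengths and weaknesses and suits different situations; in terms of overall performance, it is hard to single out one as the best. <font color="#FF0000"><strong>Sorting algorithms are usually grouped into five categories: insertion sorts, exchange sorts, selection sorts, merge sorts, and radix sorts</strong></font>. The performance of an internal sorting algorithm depends on its time complexity and space complexity, and the time complexity is generally determined by the number of comparisons and moves.</p><hr><p><img src="https://static.wukongsec.com/itbob/images/article/036/01.png" alt="01"></p><table><thead><tr><th>Algorithm</th><th>Time complexity (average)</th><th>Time complexity (best)</th><th>Time complexity (worst)</th><th>Space complexity</th><th>Stability</th></tr></thead><tbody><tr><td>Bubble sort</td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><msup><mi>n</mi><mn>2</mn></msup><mo 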
stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n^2)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0641em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">n</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><msup><mi>n</mi><mn>2</mn></msup><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n^2)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0641em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord"><span class="mord 
mathnormal">n</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mn>1</mn><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(1)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord">1</span><span class="mclose">)</span></span></span></span></td><td>Stable</td></tr><tr><td>Selection sort</td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><msup><mi>n</mi><mn>2</mn></msup><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n^2)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0641em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">n</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span 
class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><msup><mi>n</mi><mn>2</mn></msup><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n^2)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0641em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">n</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><msup><mi>n</mi><mn>2</mn></msup><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n^2)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0641em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">n</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span 
class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mn>1</mn><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(1)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord">1</span><span class="mclose">)</span></span></span></span></td><td>Unstable</td></tr><tr><td>Insertion sort</td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><msup><mi>n</mi><mn>2</mn></msup><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n^2)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0641em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">n</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n)</annotation></semantics></math></span><span class="katex-html" 
aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><msup><mi>n</mi><mn>2</mn></msup><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n^2)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0641em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">n</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mn>1</mn><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(1)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord">1</span><span class="mclose">)</span></span></span></span></td><td>稳定</td></tr><tr><td>希尔排序</td><td><span class="katex"><span class="katex-mathml"><math 
xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mi>l</mi><mi>o</mi><mi>g</mi><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(nlogn)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">o</span><span class="mord mathnormal" style="margin-right:0.03588em;">g</span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mi>l</mi><mi>o</mi><msup><mi>g</mi><mn>2</mn></msup><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(nlog^2n)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0641em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">o</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.03588em;">g</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span class="mord 
mathnormal">n</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mi>l</mi><mi>o</mi><msup><mi>g</mi><mn>2</mn></msup><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(nlog^2n)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0641em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">o</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.03588em;">g</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mn>1</mn><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(1)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord">1</span><span class="mclose">)</span></span></span></span></td><td>不稳定</td></tr><tr><td>归并排序</td><td><span class="katex"><span 
class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mi>l</mi><mi>o</mi><mi>g</mi><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(nlogn)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">o</span><span class="mord mathnormal" style="margin-right:0.03588em;">g</span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mi>l</mi><mi>o</mi><mi>g</mi><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(nlogn)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">o</span><span class="mord mathnormal" style="margin-right:0.03588em;">g</span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mi>l</mi><mi>o</mi><mi>g</mi><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation 
encoding="application/x-tex">O(nlogn)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">o</span><span class="mord mathnormal" style="margin-right:0.03588em;">g</span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span></td><td>稳定</td></tr><tr><td>快速排序</td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mi>l</mi><mi>o</mi><mi>g</mi><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(nlogn)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">o</span><span class="mord mathnormal" 
style="margin-right:0.03588em;">g</span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mi>l</mi><mi>o</mi><mi>g</mi><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(nlogn)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">o</span><span class="mord mathnormal" style="margin-right:0.03588em;">g</span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><msup><mi>n</mi><mn>2</mn></msup><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n^2)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0641em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">n</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span 
class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>l</mi><mi>o</mi><mi>g</mi><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(logn)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">o</span><span class="mord mathnormal" style="margin-right:0.03588em;">g</span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span></td><td>不稳定</td></tr><tr><td>堆排序</td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mi>l</mi><mi>o</mi><mi>g</mi><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(nlogn)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">o</span><span class="mord mathnormal" style="margin-right:0.03588em;">g</span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mi>l</mi><mi>o</mi><mi>g</mi><mi>n</mi><mo 
stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(nlogn)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">o</span><span class="mord mathnormal" style="margin-right:0.03588em;">g</span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mi>l</mi><mi>o</mi><mi>g</mi><mi>n</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(nlogn)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">o</span><span class="mord mathnormal" style="margin-right:0.03588em;">g</span><span class="mord mathnormal">n</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mn>1</mn><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(1)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" 
style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord">1</span><span class="mclose">)</span></span></span></span></td><td>不稳定</td></tr><tr><td>计数排序</td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo>+</mo><mi>k</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n+k)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.03148em;">k</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo>+</mo><mi>k</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n+k)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" 
style="margin-right:0.03148em;">k</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo>+</mo><mi>k</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n+k)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.03148em;">k</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>k</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(k)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.03148em;">k</span><span class="mclose">)</span></span></span></span></td><td>稳定</td></tr><tr><td>桶排序</td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo>+</mo><mi>k</mi><mo stretchy="false">)</mo></mrow><annotation 
encoding="application/x-tex">O(n+k)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.03148em;">k</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo>+</mo><mi>k</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n+k)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.03148em;">k</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><msup><mi>n</mi><mn>2</mn></msup><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n^2)</annotation></semantics></math></span><span class="katex-html" 
aria-hidden="true"><span class="base"><span class="strut" style="height:1.0641em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">n</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo>+</mo><mi>k</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n+k)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.03148em;">k</span><span class="mclose">)</span></span></span></span></td><td>稳定</td></tr><tr><td>基数排序</td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo>∗</mo><mi>k</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n*k)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span 
class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">∗</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.03148em;">k</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo>∗</mo><mi>k</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n*k)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">∗</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.03148em;">k</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo>∗</mo><mi>k</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n*k)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" 
style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">∗</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.03148em;">k</span><span class="mclose">)</span></span></span></span></td><td><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo>+</mo><mi>k</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(n+k)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.03148em;">k</span><span class="mclose">)</span></span></span></span></td><td>稳定</td></tr></tbody></table><p><strong>Stability</strong> (the 稳定 "stable" / 不稳定 "unstable" column above): whether two elements with equal keys keep the same relative order after sorting as they had before it. For example, if a comes before b in the input and a = b, a stable sort still leaves a before b in the output.</p><p>Common time complexities, ordered from smallest to largest:</p><p><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>O</mi><mo stretchy="false">(</mo><mn>1</mn><mo stretchy="false">)</mo><mo>&lt;</mo><mi>O</mi><mo stretchy="false">(</mo><mi>l</mi><mi>o</mi><mi>g</mi><mi>n</mi><mo stretchy="false">)</mo><mo>&lt;</mo><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo stretchy="false">)</mo><mo>&lt;</mo><mi>O</mi><mo 
stretchy="false">(</mo><mi>n</mi><mi>l</mi><mi>o</mi><mi>g</mi><mi>n</mi><mo stretchy="false">)</mo><mo>&lt;</mo><mi>O</mi><mo stretchy="false">(</mo><msup><mi>n</mi><mn>2</mn></msup><mo stretchy="false">)</mo><mo>&lt;</mo><mi mathvariant="normal">.</mi><mi mathvariant="normal">.</mi><mi mathvariant="normal">.</mi><mo>&lt;</mo><mi>O</mi><mo stretchy="false">(</mo><msup><mn>2</mn><mi>n</mi></msup><mo stretchy="false">)</mo><mo>&lt;</mo><mi>O</mi><mo stretchy="false">(</mo><mi>n</mi><mo stretchy="false">!</mo><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">O(1) &lt; O(logn) &lt; O(n) &lt; O(nlogn) &lt; O(n^2) &lt;...&lt; O(2^n)&lt;O (n!)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord">1</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&lt;</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">o</span><span class="mord mathnormal" style="margin-right:0.03588em;">g</span><span class="mord mathnormal">n</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&lt;</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mclose">)</span><span 
class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&lt;</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">o</span><span class="mord mathnormal" style="margin-right:0.03588em;">g</span><span class="mord mathnormal">n</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&lt;</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.0641em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">n</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&lt;</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.5782em;vertical-align:-0.0391em;"></span><span class="mord">...</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&lt;</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span 
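class="mopen">(</span><span class="mord"><span class="mord">2</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6644em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&lt;</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord mathnormal">n</span><span class="mclose">!)</span></span></span></span></p><hr><h2><span id="yi-mou-pao-pai-xu-bubble-sort"><font color="#FF0000">I. Bubble Sort</font></span></h2><h3><span id="1-yuan-li">1. Principle</span></h3><p>Bubble sort repeatedly walks through the sequence, comparing each pair of adjacent elements and swapping them whenever they are in the wrong order (for an ascending sort, whenever the left element is larger than the right one). The passes repeat until no adjacent pair needs swapping, at which point the sequence is sorted. The name comes from the way each pass, through successive comparisons and swaps of neighbours, "bubbles" the largest remaining element to the rightmost unsorted position.</p><p>Imagine 10 children of different heights standing in a row from left to right. The teacher wants them lined up from shortest to tallest and starts calling out commands. On each call, starting from the first child, any two neighbours who are out of height order swap places. After the first round the tallest child has moved to the far right (bubbled to the right end). The second round proceeds the same way, starting again from the first child, and leaves the second-tallest child in the second-to-last position. The remaining rounds continue in the same manner.</p><h3><span id="2-bu-zou">2. Steps</span></h3><ul><li><strong>①</strong> Compare adjacent elements. If the first is larger than the second, swap them;</li><li><strong>②</strong> Do the same for every adjacent pair, from the first pair to the last pair; after this pass the last element is the largest;</li><li><strong>③</strong> Repeat steps <strong>①</strong> ~ <strong>②</strong> on the remaining elements, each time leaving out the last element, which is already in place, until the sort is complete.</li></ul><h3><span id="3-dong-hua-yan-shi">3. Animation</span></h3><p><img src="https://static.wukongsec.com/itbob/images/article/036/02%E5%86%92%E6%B3%A1%E6%8E%92%E5%BA%8F.gif" alt="Bubble sort animation"></p><h3><span id="4-dai-ma-shi-xian">4. Code</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">bubbleSort</span>(<span 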
class="hljs-params">arr</span>):</span>
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(arr)-<span class="hljs-number">1</span>):         <span class="hljs-comment"># pass number i</span>
        <span class="hljs-keyword">for</span> j <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(arr)-i-<span class="hljs-number">1</span>):   <span class="hljs-comment"># j is the index</span>
            <span class="hljs-keyword">if</span> arr[j] &gt; arr[j+<span class="hljs-number">1</span>]:       <span class="hljs-comment"># swap when this element is larger than the next one</span>
                arr[j], arr[j+<span class="hljs-number">1</span>] = arr[j+<span class="hljs-number">1</span>], arr[j]
    <span class="hljs-keyword">return</span> arr</code></pre><p>Bubble sort can be optimized with a flag: if a full pass over the sequence makes no swaps, the sequence is already sorted and the remaining passes are skipped. The animation above shows this improved algorithm; the improved code is as follows:</p><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">bubbleSort</span>(<span class="hljs-params">arr</span>):</span>
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(arr)-<span class="hljs-number">1</span>):         <span class="hljs-comment"># pass number i</span>
        flag = <span class="hljs-literal">False</span>                    <span class="hljs-comment"># set to True if this pass swaps anything</span>
        <span class="hljs-keyword">for</span> j <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(arr)-i-<span class="hljs-number">1</span>):   <span class="hljs-comment"># j is the index</span>
            <span class="hljs-keyword">if</span> arr[j] &gt; arr[j+<span class="hljs-number">1</span>]:       <span class="hljs-comment"># swap when this element is larger than the next one</span>
                arr[j], arr[j+<span class="hljs-number">1</span>] = arr[j+<span class="hljs-number">1</span>], arr[j]
                flag = <span class="hljs-literal">True</span>
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> flag:                    <span class="hljs-comment"># no swaps in this pass: already sorted</span>
            <span class="hljs-keyword">return</span> arr
    <span class="hljs-keyword">return</span> arr</code></pre><p>With the flag optimization, bubble sort is fastest when the input is already in order (a single pass with no swaps) and slowest when the input is in reverse order.</p><h3><span id="5-ju-ti-shi-li">5. Worked Example</span></h3><p>Unoptimized version:</p><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">bubble_sort</span>(<span class="hljs-params">arr</span>):</span>
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(arr)-<span class="hljs-number">1</span>):         <span class="hljs-comment"># pass number i</span>
        <span class="hljs-keyword">for</span> j <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(arr)-i-<span class="hljs-number">1</span>):   <span class="hljs-comment"># j is the index</span>
            <span class="hljs-keyword">if</span> arr[j] &gt; arr[j+<span class="hljs-number">1</span>]:       <span class="hljs-comment"># swap when this element is larger than the next one</span>
                arr[j], arr[j+<span class="hljs-number">1</span>] = arr[j+<span class="hljs-number">1</span>], arr[j]
        <span class="hljs-built_in">print</span>(arr)                      <span class="hljs-comment"># print once after each pass</span>

arr = [<span class="hljs-number">3</span>, <span class="hljs-number">44</span>, <span class="hljs-number">38</span>, <span class="hljs-number">5</span>, <span class="hljs-number">47</span>, <span class="hljs-number">15</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">2</span>, <span class="hljs-number">46</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">50</span>, <span 
class="hljs-number">48</span>]bubble_sort(arr)</code></pre><pre><code class="hljs python">[<span class="hljs-number">3</span>, <span class="hljs-number">38</span>, <span class="hljs-number">5</span>, <span class="hljs-number">44</span>, <span class="hljs-number">15</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">2</span>, <span class="hljs-number">46</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">38</span>, <span class="hljs-number">15</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">2</span>, <span class="hljs-number">44</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">2</span>, <span class="hljs-number">38</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">2</span>, <span class="hljs-number">36</span>, 
<span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">26</span>, <span class="hljs-number">2</span>, <span class="hljs-number">27</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">2</span>, <span class="hljs-number">26</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">15</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">3</span>, <span 
class="hljs-number">2</span>, <span class="hljs-number">5</span>, <span class="hljs-number">4</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span 
class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>]
[<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>]
[<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>]
[<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>]</code></pre><p>Optimized version:</p><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">bubble_sort</span>(<span class="hljs-params">arr</span>):</span>
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(arr)-<span class="hljs-number">1</span>):         <span class="hljs-comment"># pass number i</span>
        flag = <span class="hljs-literal">False</span>
        <span class="hljs-keyword">for</span> j <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(arr)-i-<span class="hljs-number">1</span>):   <span class="hljs-comment"># j is the index</span>
            <span class="hljs-keyword">if</span> arr[j] &gt; arr[j+<span class="hljs-number">1</span>]:       <span class="hljs-comment"># swap when this element is larger than the next one</span>
                arr[j], arr[j+<span class="hljs-number">1</span>] = arr[j+<span class="hljs-number">1</span>], arr[j]
                flag = <span class="hljs-literal">True</span>
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> flag:
            <span class="hljs-keyword">return</span>
        <span class="hljs-built_in">print</span>(arr)                      <span class="hljs-comment"># print once after each pass</span>

arr = [<span class="hljs-number">3</span>, <span class="hljs-number">44</span>, <span class="hljs-number">38</span>, <span class="hljs-number">5</span>, <span class="hljs-number">47</span>, <span class="hljs-number">15</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">2</span>, <span class="hljs-number">46</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">50</span>, <span class="hljs-number">48</span>]
bubble_sort(arr)</code></pre><pre><code class="hljs python">[<span class="hljs-number">3</span>, <span class="hljs-number">38</span>, <span class="hljs-number">5</span>, <span 
class="hljs-number">44</span>, <span class="hljs-number">15</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">2</span>, <span class="hljs-number">46</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">38</span>, <span class="hljs-number">15</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">2</span>, <span class="hljs-number">44</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">2</span>, <span class="hljs-number">38</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">2</span>, <span class="hljs-number">36</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span 
class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">26</span>, <span class="hljs-number">2</span>, <span class="hljs-number">27</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">2</span>, <span class="hljs-number">26</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">15</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>][<span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">5</span>, <span class="hljs-number">4</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span 
class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>]
[<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>]</code></pre><hr><h2><span id="er-xuan-ze-pai-xu-selection-sort"><font color="#FF0000">II. Selection Sort</font></span></h2><h3><span id="1-yuan-li">1. Principle</span></h3><p>First, select the smallest (or largest) element from the data to be sorted and place it at the start of the sequence; then find the smallest (largest) element among the remaining unsorted elements and place it at the end of the sorted part. Repeat until no unsorted elements remain. It can be viewed as an iteration from 0 to n-1 in which each step scans forward and selects the smallest remaining element. Selection sort is an unstable sorting method, because a swap can move an element past another element equal to it.</p><p>Suppose 10 children of different heights stand in a row from left to right, and the teacher wants them ordered from shortest to tallest. Starting from the first child, we scan the whole row for the shortest child and swap that child with the first one. We then apply the same strategy starting from the second child, and once we have worked through the whole row the children are in order.</p><h3><span id="2-bu-zou">2. Steps</span></h3><ul><li><strong>①</strong> First find the smallest (largest) element in the unsorted sequence and place it at the start of the sorted sequence;</li><li><strong>②</strong> Then find the smallest (largest) element among the remaining unsorted elements and place it at the end of the sorted sequence;</li><li><strong>③</strong> Repeat step <strong>②</strong> until all elements are sorted.</li></ul><h3><span id="3-dong-hua-yan-shi">3. Animation</span></h3><p><img src="https://static.wukongsec.com/itbob/images/article/036/03%E9%80%89%E6%8B%A9%E6%8E%92%E5%BA%8F.gif" alt="03 Selection sort"></p><h3><span id="4-dai-ma-shi-xian">4. Implementation</span></h3><p>Python code:</p><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">selection_sort</span>(<span class="hljs-params">arr</span>):</span>
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(arr)-<span class="hljs-number">1</span>):          <span class="hljs-comment"># pass number i</span>
        min_index = i                    <span class="hljs-comment"># index of the smallest value seen so far</span>
        <span class="hljs-keyword">for</span> j <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(i+<span class="hljs-number">1</span>, <span class="hljs-built_in">len</span>(arr)):   <span class="hljs-comment"># j is the index</span>
            <span class="hljs-keyword">if</span> arr[j] &lt; arr[min_index]:  <span class="hljs-comment"># found a smaller value, so update the index of the minimum</span>
                min_index = j
        arr[i], arr[min_index] = arr[min_index], arr[i]  <span class="hljs-comment"># swap the value at position i (the end of the sorted part) with the minimum</span>
    <span class="hljs-keyword">return</span> arr</code></pre><h3><span id="5-ju-ti-shi-li">5. Worked Example</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">selection_sort</span>(<span class="hljs-params">arr</span>):</span>
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(arr)-<span class="hljs-number">1</span>):          <span class="hljs-comment"># pass number i</span>
        min_index = i                    <span class="hljs-comment"># index of the smallest value seen so far</span>
        <span class="hljs-keyword">for</span> j <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(i+<span class="hljs-number">1</span>, <span class="hljs-built_in">len</span>(arr)):   <span class="hljs-comment"># j is the index</span>
            <span class="hljs-keyword">if</span> arr[j] &lt; arr[min_index]:  <span class="hljs-comment"># found a smaller value, so update the index of the minimum</span>
                min_index = j
        arr[i], arr[min_index] = arr[min_index], arr[i]  <span class="hljs-comment"># swap the value at position i (the end of the sorted part) with the minimum</span>
        <span class="hljs-built_in">print</span>(arr)                       <span class="hljs-comment"># print once after each pass</span>

arr = [<span class="hljs-number">3</span>, <span class="hljs-number">44</span>, <span class="hljs-number">38</span>, <span class="hljs-number">5</span>, <span class="hljs-number">47</span>, <span class="hljs-number">15</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">2</span>, <span class="hljs-number">46</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">50</span>, <span class="hljs-number">48</span>]
selection_sort(arr)</code></pre><pre><code class="hljs python">[<span class="hljs-number">2</span>, <span class="hljs-number">44</span>, <span class="hljs-number">38</span>, <span class="hljs-number">5</span>, <span class="hljs-number">47</span>, <span class="hljs-number">15</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">3</span>, <span class="hljs-number">46</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">50</span>, <span class="hljs-number">48</span>]
[<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">38</span>, <span class="hljs-number">5</span>, <span class="hljs-number">47</span>, <span class="hljs-number">15</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">4</span>, <span class="hljs-number">19</span>, <span class="hljs-number">50</span>, <span class="hljs-number">48</span>]
[<span class="hljs-number">2</span>, <span 
class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">47</span>, <span class="hljs-number">15</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">38</span>, <span class="hljs-number">19</span>, <span class="hljs-number">50</span>, <span class="hljs-number">48</span>][<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">47</span>, <span class="hljs-number">15</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">38</span>, <span class="hljs-number">19</span>, <span class="hljs-number">50</span>, <span class="hljs-number">48</span>][<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">47</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">38</span>, <span class="hljs-number">19</span>, <span class="hljs-number">50</span>, <span class="hljs-number">48</span>][<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">44</span>, <span 
class="hljs-number">46</span>, <span class="hljs-number">38</span>, <span class="hljs-number">47</span>, <span class="hljs-number">50</span>, <span class="hljs-number">48</span>][<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">36</span>, <span class="hljs-number">27</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">38</span>, <span class="hljs-number">47</span>, <span class="hljs-number">50</span>, <span class="hljs-number">48</span>][<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">38</span>, <span class="hljs-number">47</span>, <span class="hljs-number">50</span>, <span class="hljs-number">48</span>][<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">38</span>, <span class="hljs-number">47</span>, <span class="hljs-number">50</span>, <span class="hljs-number">48</span>][<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span 
class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">46</span>, <span class="hljs-number">44</span>, <span class="hljs-number">47</span>, <span class="hljs-number">50</span>, <span class="hljs-number">48</span>][<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">50</span>, <span class="hljs-number">48</span>][<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">50</span>, <span class="hljs-number">48</span>][<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span 
class="hljs-number">50</span>, <span class="hljs-number">48</span>][<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">15</span>, <span class="hljs-number">19</span>, <span class="hljs-number">26</span>, <span class="hljs-number">27</span>, <span class="hljs-number">36</span>, <span class="hljs-number">38</span>, <span class="hljs-number">44</span>, <span class="hljs-number">46</span>, <span class="hljs-number">47</span>, <span class="hljs-number">48</span>, <span class="hljs-number">50</span>]</code></pre><hr><h2><span id="san-cha-ru-pai-xu-insertion-sort"><font color="#FF0000">三、插入排序（Insertion Sort）</font></span></h2><h3><span id="1-yuan-li">1、原理</span></h3><p>插入排序一般也被称为直接插入排序。对于少量元素的排序，它是一个有效的算法。它的基本思想是将一个记录插入到已经排好序的有序表中，从而形成一个新的有序表。在其实现过程使用双层循环，外层循环对除了第一个元素之外的所有元素进行遍历，内层循环对当前元素前面有序表进行待插入位置查找，并进行移动。</p><p>插入排序的工作方式像许多人排序一手扑克牌。开始时，我们的左手为空并且桌子上的牌面向下。然后，我们每次从桌子上拿走一张牌并将它插入左手中正确的位置。为了找到一张牌的正确位置，我们从右到左将它与已在手中的每张牌进行比较。拿在左手上的牌总是排序好的，原来这些牌是桌子上牌堆中顶部的牌。</p><h3><span id="2-bu-zou">2、步骤</span></h3><ul><li><strong>①</strong> 从第一个元素开始，该元素可以认为已经被排序；</li><li><strong>②</strong> 取出下一个元素，在已经排序的元素序列中从后向前扫描；</li><li><strong>③</strong> 如果该元素（已排序的）大于新元素，将该元素往右移到下一位置，重复该步骤，直到找到已排序的元素小于或者等于新元素的位置；</li><li><strong>④</strong> 将新元素插入到步骤 <strong>③</strong> 找到的位置的后面；</li><li><strong>⑤</strong> 重复步骤 <strong>②</strong> ~ <strong>④</strong>。</li></ul><h3><span id="3-dong-hua-yan-shi">3、动画演示</span></h3><p><img src="https://static.wukongsec.com/itbob/images/article/036/04%E6%8F%92%E5%85%A5%E6%8E%92%E5%BA%8F.gif" alt="04插入排序"></p><h3><span id="4-dai-ma-shi-xian">4、代码实现</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">insertion_sort</span>(<span class="hljs-params">arr</span>):</span>    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span 
class="hljs-built_in">range</span>(<span class="hljs-number">1</span>, <span class="hljs-built_in">len</span>(arr)):    <span class="hljs-comment"># i is the index of the card just drawn</span>
        tmp = arr[i]                <span class="hljs-comment"># store the drawn card in tmp</span>
        j = i-<span class="hljs-number">1</span>                     <span class="hljs-comment"># j is the index of a card in hand</span>
        <span class="hljs-keyword">while</span> j &gt;= <span class="hljs-number">0</span> <span class="hljs-keyword">and</span> arr[j] &gt; tmp:  <span class="hljs-comment"># while the card in hand is greater than the drawn card</span>
            arr[j+<span class="hljs-number">1</span>] = arr[j]       <span class="hljs-comment"># shift the card in hand one position right (copy it into the next slot)</span>
            j -= <span class="hljs-number">1</span>                  <span class="hljs-comment"># step j one card to the left, ready for the next comparison</span>
        arr[j+<span class="hljs-number">1</span>] = tmp              <span class="hljs-comment"># insert the drawn card at position j+1</span>
    <span class="hljs-keyword">return</span> arr</code></pre><h3><span id="5-ju-ti-shi-li">5. Worked Example</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">insertion_sort</span>(<span class="hljs-params">arr</span>):</span>
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-number">1</span>, <span class="hljs-built_in">len</span>(arr)):    <span class="hljs-comment"># i is the index of the card just drawn</span>
        tmp = arr[i]                <span class="hljs-comment"># store the drawn card in tmp</span>
        j = i-<span class="hljs-number">1</span>                     <span class="hljs-comment"># j is the index of a card in hand</span>
        <span class="hljs-keyword">while</span> j &gt;= <span class="hljs-number">0</span> <span class="hljs-keyword">and</span> arr[j] &gt; tmp:  <span class="hljs-comment"># while the card in hand is greater than the drawn card</span>
            arr[j+<span class="hljs-number">1</span>] = arr[j]       <span class="hljs-comment"># shift the card in hand one position right (copy it into the next slot)</span>
            j -= <span class="hljs-number">1</span>                  <span class="hljs-comment"># step j one card to the left, ready for the next comparison</span>
        arr[j+<span class="hljs-number">1</span>] = tmp              <span class="hljs-comment"># insert the drawn card at position j+1</span>
        <span class="hljs-built_in">print</span>(arr)                  <span class="hljs-comment"># print once after each pass</span>

arr = [<span class="hljs-number">0</span>, <span class="hljs-number">9</span>, <span class="hljs-number">8</span>, <span class="hljs-number">7</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>]
insertion_sort(arr)</code></pre><pre><code class="hljs python">[<span class="hljs-number">0</span>, <span class="hljs-number">9</span>, <span class="hljs-number">8</span>, <span class="hljs-number">7</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>]  <span class="hljs-comment"># hand holds 0, draw 9: i=1, j=0; 0 is less than 9, so 9 is placed at index j+1=1</span>
[<span class="hljs-number">0</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">7</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>]  <span class="hljs-comment"># hand holds 0, 9, draw 8: i=2, j=1; 9 is greater than 8, so 9 shifts one position right and j becomes 0; 8 is placed at index j+1=1</span>
[<span class="hljs-number">0</span>, <span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, 
<span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>]
[<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>]
[<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>]
[<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>]
[<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>]
[<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">6</span>]
[<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span 
class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>]</code></pre><hr><h2><span id="si-xi-er-pai-xu-shell-sort"><font color="#FF0000">IV. Shell Sort</font></span></h2><h3><span id="1-yuan-li">1. Principle</span></h3><p>Shell sort is a more efficient refinement of insertion sort. It is a grouped insertion sort, also called diminishing increment sort, and it is not a stable sorting algorithm. The method is named after D. L. Shell, who proposed it in 1959.</p><p>Shell sort improves on insertion sort based on two of its properties:</p><ul><li>Insertion sort is efficient on data that is already almost sorted, where it approaches linear time;</li><li>In general, however, insertion sort is inefficient, because it moves an element only one position at a time.</li></ul><p>The basic idea of Shell sort is to first split the whole sequence of records into several subsequences and run straight insertion sort on each of them; once the records in the whole sequence are “roughly sorted”, one final straight insertion sort is run over all the records.</p><h3><span id="2-bu-zou">2. Steps</span></h3><ul><li><strong>①</strong> Let n be the length of the array. First take an integer <strong>d1=n/2</strong> (integer division) and split the elements into <strong>d1</strong> groups, where adjacent elements of a group are <strong>d1</strong> positions apart in the array, with <strong>d1-1</strong> other elements between them; run straight insertion sort within each group;</li><li><strong>②</strong> Take a second integer <strong>d2=d1/2</strong> and repeat the grouping and sorting of step <strong>①</strong>, until <strong>di=1</strong>, at which point all elements form a single group and are sorted by one straight insertion sort.</li></ul><p>PS: a pass of Shell sort does not put any particular elements into their final order; it brings the data as a whole closer and closer to sorted. Only the last pass leaves all the data sorted.</p><h3><span id="3-dong-hua-yan-shi">3. Animation</span></h3><p><img src="https://static.wukongsec.com/itbob/images/article/036/05%E5%B8%8C%E5%B0%94%E6%8E%92%E5%BA%8F.gif" alt="05希尔排序"></p><p><img src="https://static.wukongsec.com/itbob/images/article/036/06%E5%B8%8C%E5%B0%94%E6%8E%92%E5%BA%8F.gif" alt="06希尔排序"></p><h3><span id="4-dai-ma-shi-xian">4. Implementation</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">insertion_sort_gap</span>(<span class="hljs-params">arr, gap</span>):</span>     <span class="hljs-comment"># cards are compared gap positions apart instead of one after another</span>
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(gap, <span class="hljs-built_in">len</span>(arr)):    <span class="hljs-comment"># i is the index of the card just drawn</span>
        tmp = arr[i]                  <span class="hljs-comment"># store the drawn card in tmp</span>
        j = i-gap                     <span class="hljs-comment"># j is the index of a card in hand</span>
        <span class="hljs-keyword">while</span> j &gt;= <span class="hljs-number">0</span> <span class="hljs-keyword">and</span> arr[j] &gt; tmp:  <span class="hljs-comment"># while the card in hand is greater than the drawn card</span>
            arr[j+gap] = arr[j]         <span class="hljs-comment"># shift the card in hand gap positions right (copy it into slot j+gap)</span>
            j -= gap                    <span class="hljs-comment"># step j back by gap, ready for the next comparison</span>
        arr[j+gap] = tmp                <span class="hljs-comment"># insert the drawn card at position j+gap</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">shell_sort</span>(<span class="hljs-params">arr</span>):</span>
    d = <span class="hljs-built_in">len</span>(arr) // <span class="hljs-number">2</span>                   <span class="hljs-comment"># initial gap</span>
    <span class="hljs-keyword">while</span> d &gt;= <span class="hljs-number">1</span>:
        insertion_sort_gap(arr, d)      <span class="hljs-comment"># gapped insertion sort with gap d</span>
        d //= <span class="hljs-number">2</span>                         <span class="hljs-comment"># halve the gap and regroup</span>
    <span class="hljs-keyword">return</span> arr</code></pre><p>The two functions can also be merged into one:</p><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">shell_sort</span>(<span class="hljs-params">arr</span>):</span>
    d = <span class="hljs-built_in">len</span>(arr) // <span class="hljs-number">2</span>                   <span class="hljs-comment"># initial gap</span>
    <span class="hljs-keyword">while</span> d &gt;= <span class="hljs-number">1</span>:                       <span class="hljs-comment"># cards are compared d positions apart instead of one after another</span>
        <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(d, 
<span class="hljs-built_in">len</span>(arr)):    <span class="hljs-comment"># i is the index of the card just drawn</span>
            tmp = arr[i]                <span class="hljs-comment"># store the drawn card in tmp</span>
            j = i - d                   <span class="hljs-comment"># j is the index of a card in hand</span>
            <span class="hljs-keyword">while</span> j &gt;= <span class="hljs-number">0</span> <span class="hljs-keyword">and</span> arr[j] &gt; tmp:   <span class="hljs-comment"># while the card in hand is greater than the drawn card</span>
                arr[j + d] = arr[j]          <span class="hljs-comment"># shift the card in hand d positions right (copy it into slot j+d)</span>
                j -= d                       <span class="hljs-comment"># step j back by d, ready for the next comparison</span>
            arr[j + d] = tmp                 <span class="hljs-comment"># insert the drawn card at position j+d</span>
        d //= <span class="hljs-number">2</span>                              <span class="hljs-comment"># halve the gap and regroup</span>
    <span class="hljs-keyword">return</span> arr</code></pre><h3><span id="5-ju-ti-shi-li">5. Worked Example</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">insertion_sort_gap</span>(<span class="hljs-params">arr, gap</span>):</span>     <span class="hljs-comment"># cards are compared gap positions apart instead of one after another</span>
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(gap, <span class="hljs-built_in">len</span>(arr)):    <span class="hljs-comment"># i is the index of the card just drawn</span>
        tmp = arr[i]                  <span class="hljs-comment"># store the drawn card in tmp</span>
        j = i-gap                     <span class="hljs-comment"># j is the index of a card in hand</span>
        <span class="hljs-keyword">while</span> j &gt;= <span class="hljs-number">0</span> <span class="hljs-keyword">and</span> arr[j] &gt; tmp:  <span class="hljs-comment"># while the card in hand is greater than the drawn card</span>
            arr[j+gap] = arr[j]         <span class="hljs-comment"># shift the card in hand gap positions right (copy it into slot j+gap)</span>
            j -= gap                    <span class="hljs-comment"># step j back by gap, ready for the next comparison</span>
        arr[j+gap] = tmp                <span class="hljs-comment"># insert the drawn card at position j+gap</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">shell_sort</span>(<span class="hljs-params">arr</span>):</span>
    d = <span class="hljs-built_in">len</span>(arr) // <span class="hljs-number">2</span>                   <span class="hljs-comment"># initial gap</span>
    <span class="hljs-keyword">while</span> d &gt;= <span class="hljs-number">1</span>:
        insertion_sort_gap(arr, d)      <span class="hljs-comment"># gapped insertion sort with gap d</span>
        <span class="hljs-built_in">print</span>(arr)                      <span class="hljs-comment"># print once after each pass</span>
        d //= <span class="hljs-number">2</span>                         <span class="hljs-comment"># halve the gap and regroup</span>

arr = [<span class="hljs-number">5</span>, <span class="hljs-number">7</span>, <span class="hljs-number">4</span>, <span class="hljs-number">6</span>, <span class="hljs-number">3</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">9</span>, <span class="hljs-number">8</span>]
shell_sort(arr)</code></pre><pre><code class="hljs python">[<span class="hljs-number">3</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">6</span>, <span class="hljs-number">5</span>, <span class="hljs-number">7</span>, <span class="hljs-number">4</span>, <span class="hljs-number">9</span>, <span class="hljs-number">8</span>]
[<span class="hljs-number">2</span>, <span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">6</span>, <span class="hljs-number">4</span>, <span class="hljs-number">7</span>, <span class="hljs-number">5</span>, <span class="hljs-number">9</span>, <span class="hljs-number">8</span>]
[<span class="hljs-number">1</span>, <span 
class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>]</code></pre><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">shell_sort</span>(<span class="hljs-params">arr</span>):</span>
    d = <span class="hljs-built_in">len</span>(arr) // <span class="hljs-number">2</span>                   <span class="hljs-comment"># initial gap</span>
    <span class="hljs-keyword">while</span> d &gt;= <span class="hljs-number">1</span>:                       <span class="hljs-comment"># cards are compared d positions apart instead of one after another</span>
        <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(d, <span class="hljs-built_in">len</span>(arr)):    <span class="hljs-comment"># i is the index of the card just drawn</span>
            tmp = arr[i]                <span class="hljs-comment"># store the drawn card in tmp</span>
            j = i - d                   <span class="hljs-comment"># j is the index of a card in hand</span>
            <span class="hljs-keyword">while</span> j &gt;= <span class="hljs-number">0</span> <span class="hljs-keyword">and</span> arr[j] &gt; tmp:   <span class="hljs-comment"># while the card in hand is greater than the drawn card</span>
                arr[j + d] = arr[j]          <span class="hljs-comment"># shift the card in hand d positions right (copy it into slot j+d)</span>
                j -= d                       <span class="hljs-comment"># step j back by d, ready for the next comparison</span>
            arr[j + d] = tmp                 <span class="hljs-comment"># insert the drawn card at position j+d</span>
        <span class="hljs-built_in">print</span>(arr)                           <span class="hljs-comment"># print once after each pass</span>
        d //= <span class="hljs-number">2</span>                              <span class="hljs-comment"># halve the gap and regroup</span>

arr = [<span 
class="hljs-number">5</span>, <span class="hljs-number">7</span>, <span class="hljs-number">4</span>, <span class="hljs-number">6</span>, <span class="hljs-number">3</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">9</span>, <span class="hljs-number">8</span>]
shell_sort(arr)</code></pre><pre><code class="hljs python">[<span class="hljs-number">3</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">6</span>, <span class="hljs-number">5</span>, <span class="hljs-number">7</span>, <span class="hljs-number">4</span>, <span class="hljs-number">9</span>, <span class="hljs-number">8</span>]
[<span class="hljs-number">2</span>, <span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">6</span>, <span class="hljs-number">4</span>, <span class="hljs-number">7</span>, <span class="hljs-number">5</span>, <span class="hljs-number">9</span>, <span class="hljs-number">8</span>]
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>]</code></pre><hr><h2><span id="wu-gui-bing-pai-xu-merge-sort"><font color="#FF0000">V. Merge Sort</font></span></h2><h3><span id="1-yuan-li">1. Principle</span></h3><p>The concept of merging: suppose a list is split into two segments, each of which is itself sorted. Combining the two segments into a single sorted list is called one merge.</p><p>Merge sort is an efficient, stable sorting algorithm built on the merge operation, and a very typical application of divide and conquer. Already sorted subsequences are merged to obtain a fully sorted sequence; that is, each subsequence is sorted first, and then the subsequences are put in order relative to each other. Merging two sorted lists into one sorted list is called a two-way merge.</p><p><img src="https://static.wukongsec.com/itbob/images/article/036/08.png" alt="08"></p><h3><span id="2-bu-zou">2. Steps</span></h3><p><font color="#ff0000"><strong>Basic steps of a merge:</strong></font></p><ul><li><strong>①</strong> 
Allocate space whose size is <strong>the sum of the lengths of the two sorted sequences</strong>; it holds the merged sequence;</li><li><strong>②</strong> Set up two pointers, initially at the start of each of the two sorted sequences;</li><li><strong>③</strong> Compare the elements the two pointers point to, put the smaller one into the merge space, and advance that pointer (on a tie, take the element from the left sequence so that the sort stays stable);</li><li><strong>④</strong> Repeat step <strong>③</strong> until one pointer reaches the end of its sequence;</li><li><strong>⑤</strong> Copy all remaining elements of the other sequence directly to the end of the merged sequence.</li></ul><p><font color="#ff0000"><strong>Steps of merge sort:</strong></font></p><ul><li><strong>①</strong> Divide: split the list into smaller and smaller pieces until each piece holds one element; the termination condition is that a single element is already sorted.</li><li><strong>②</strong> Merge: repeatedly merge pairs of sorted lists into larger and larger lists until everything has been merged.</li></ul><h3><span id="3-dong-hua-yan-shi">3. Animation</span></h3><p><img src="https://static.wukongsec.com/itbob/images/article/036/07%E5%BD%92%E5%B9%B6%E6%8E%92%E5%BA%8F.gif" alt="07归并排序"></p><h3><span id="4-dai-ma-shi-xian">4. Implementation</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">merge</span>(<span class="hljs-params">arr, low, mid, high</span>):</span>
    <span class="hljs-comment"># low and high are the first and last indices of the range to merge; mid is the last index of the left half</span>
    <span class="hljs-comment"># i and j are pointers, starting at the first element of each sorted half</span>
    <span class="hljs-comment"># ltmp holds the merged sequence</span>
    i = low
    j = mid+<span class="hljs-number">1</span>
    ltmp = []
    <span class="hljs-keyword">while</span> i &lt;= mid <span class="hljs-keyword">and</span> j &lt;= high:  <span class="hljs-comment"># while both halves still have elements</span>
        <span class="hljs-keyword">if</span> arr[i] &lt;= arr[j]:       <span class="hljs-comment"># left value is not greater (ties go left, which keeps the sort stable)</span>
            ltmp.append(arr[i])    <span class="hljs-comment"># append the left value to ltmp</span>
            i += <span class="hljs-number">1</span>                 <span class="hljs-comment"># advance the left pointer</span>
        <span class="hljs-keyword">else</span>:                      <span class="hljs-comment"># right value is smaller</span>
            ltmp.append(arr[j])    <span class="hljs-comment"># append the right value to ltmp</span>
            j += <span class="hljs-number">1</span>                 <span class="hljs-comment"># advance the right pointer</span>
    <span 
class="hljs-comment"># after the loop above, one of the two halves is exhausted</span>
    <span class="hljs-keyword">while</span> i &lt;= mid:                <span class="hljs-comment"># while the left half still has elements</span>
        ltmp.append(arr[i])        <span class="hljs-comment"># append what is left of the left half to ltmp</span>
        i += <span class="hljs-number">1</span>
    <span class="hljs-keyword">while</span> j &lt;= high:               <span class="hljs-comment"># while the right half still has elements</span>
        ltmp.append(arr[j])        <span class="hljs-comment"># append what is left of the right half to ltmp</span>
        j += <span class="hljs-number">1</span>
    arr[low:high+<span class="hljs-number">1</span>] = ltmp         <span class="hljs-comment"># write the merged result back into arr</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">merge_sort</span>(<span class="hljs-params">arr, low, high</span>):</span>       <span class="hljs-comment"># low and high are the first and last indices of the range to sort</span>
    <span class="hljs-keyword">if</span> low &lt; high:                    <span class="hljs-comment"># at least two elements</span>
        mid = (low + high) // <span class="hljs-number">2</span>
        merge_sort(arr, low, mid)     <span class="hljs-comment"># recursively sort the left half</span>
        merge_sort(arr, mid+<span class="hljs-number">1</span>, high)  <span class="hljs-comment"># recursively sort the right half</span>
        merge(arr, low, mid, high)    <span class="hljs-comment"># merge the two sorted halves</span></code></pre><h3><span id="5-ju-ti-shi-li">5. Worked Example</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">merge</span>(<span class="hljs-params">arr, low, mid, high</span>):</span>
    <span class="hljs-comment"># low and high are the first and last indices of the range to merge; mid is the last index of the left half</span>
    <span class="hljs-comment"># i and j are pointers, starting at the first element of each sorted half</span>
    <span class="hljs-comment"># ltmp holds the merged sequence</span>
    i = low
    j = mid+<span class="hljs-number">1</span>
    ltmp = []
    <span class="hljs-keyword">while</span> i &lt;= mid <span class="hljs-keyword">and</span> j &lt;= high:  <span class="hljs-comment"># while both halves still have elements</span>
        <span class="hljs-keyword">if</span> arr[i] &lt;= arr[j]:       <span class="hljs-comment"># left value is not greater (ties go left, which keeps the sort stable)</span>
            ltmp.append(arr[i])    <span class="hljs-comment"># append the left value to ltmp</span>
            i += <span class="hljs-number">1</span>                 <span class="hljs-comment"># advance the left pointer</span>
        <span class="hljs-keyword">else</span>:                      <span class="hljs-comment"># right value is smaller</span>
            ltmp.append(arr[j])    <span class="hljs-comment"># append the right value to ltmp</span>
            j += <span class="hljs-number">1</span>                 <span class="hljs-comment"># advance the right pointer</span>
    <span class="hljs-comment"># after the loop above, one of the two halves is exhausted</span>
    <span class="hljs-keyword">while</span> i &lt;= mid:                <span class="hljs-comment"># while the left half still has elements</span>
        ltmp.append(arr[i])        <span class="hljs-comment"># append what is left of the left half to ltmp</span>
        i += <span class="hljs-number">1</span>
    <span class="hljs-keyword">while</span> j &lt;= high:               <span class="hljs-comment"># while the right half still has elements</span>
        ltmp.append(arr[j])        <span class="hljs-comment"># append what is left of the right half to ltmp</span>
        j += <span class="hljs-number">1</span>
    arr[low:high+<span class="hljs-number">1</span>] = ltmp         <span class="hljs-comment"># write the merged result back into arr</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">merge_sort</span>(<span class="hljs-params">arr, low, high</span>):</span>       <span class="hljs-comment"># low and high are the first and last indices of the range to sort</span>
    <span class="hljs-keyword">if</span> low &lt; high:                    <span class="hljs-comment"># at least two elements</span>
        mid = (low + high) // <span class="hljs-number">2</span>
        merge_sort(arr, low, mid)     <span class="hljs-comment"># recursively sort the left half</span>
        merge_sort(arr, mid+<span class="hljs-number">1</span>, high)  <span class="hljs-comment"># recursively sort the right half</span>
        merge(arr, 
low, mid, high)    <span class="hljs-comment"># merge the two sorted halves</span>
        <span class="hljs-built_in">print</span>(arr)                    <span class="hljs-comment"># print once after each merge</span>

arr = [<span class="hljs-number">7</span>, <span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">6</span>, <span class="hljs-number">9</span>, <span class="hljs-number">4</span>]
merge_sort(arr, <span class="hljs-number">0</span>, <span class="hljs-built_in">len</span>(arr)-<span class="hljs-number">1</span>)</code></pre><pre><code class="hljs python">[<span class="hljs-number">1</span>, <span class="hljs-number">7</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">6</span>, <span class="hljs-number">9</span>, <span class="hljs-number">4</span>]
[<span class="hljs-number">1</span>, <span class="hljs-number">7</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">6</span>, <span class="hljs-number">9</span>, <span class="hljs-number">4</span>]
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">7</span>, <span class="hljs-number">6</span>, <span class="hljs-number">9</span>, <span class="hljs-number">4</span>]
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">7</span>, <span class="hljs-number">6</span>, <span class="hljs-number">9</span>, <span class="hljs-number">4</span>]
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">7</span>, <span class="hljs-number">4</span>, <span class="hljs-number">6</span>, <span class="hljs-number">9</span>]
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span 
class="hljs-number">4</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>, <span class="hljs-number">9</span>]</code></pre><hr><pre><code class="hljs yaml"><span class="hljs-string">This is a block of anti-scraping text; readers can ignore it.</span>
<span class="hljs-string">This article was first published on CSDN, author ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/108987300</span>
<span class="hljs-string">Reposting without permission is prohibited! Malicious reposting is at your own risk! Respect original work and stay away from plagiarism!</span></code></pre><hr><h2><span id="liu-kuai-su-pai-xu-quick-sort"><font color="#FF0000">VI. Quick Sort</font></span></h2><h3><span id="1-yuan-li">1. Principle</span></h3><p>Quick sort is an improvement on bubble sort. As the name suggests, it is fast: with an average time complexity of O(n log n), it is one of the fastest sorting algorithms for large data sets. The basic idea is to split the data in one pass into two independent parts such that every element of one part is no larger than every element of the other, and then to quick-sort each part in the same way. The whole process can be carried out recursively until the entire data set is sorted.</p><h3><span id="2-bu-zou">2. Steps</span></h3><ul><li><strong>①</strong> Pick an element from the sequence, called the “pivot”;</li><li><strong>②</strong> Reorder the sequence so that all elements smaller than the pivot are placed to its left and all elements larger than the pivot to its right (equal elements may go to either side). When this pass finishes, the pivot is in its final position, between the two parts. This is called the partition operation; the animation below shows how one partition pass works;</li><li><strong>③</strong> Recursively sort the subsequence of elements smaller than the pivot and the subsequence of elements larger than the pivot by steps <strong>① ②</strong>.</li></ul><p><img src="https://static.wukongsec.com/itbob/images/article/036/09%E5%BF%AB%E9%80%9F%E6%8E%92%E5%BA%8F.gif" alt="09快速排序"></p><h3><span id="3-dong-hua-yan-shi">3. Animation</span></h3><p><img src="https://static.wukongsec.com/itbob/images/article/036/10%E5%BF%AB%E9%80%9F%E6%8E%92%E5%BA%8F.gif" alt="10快速排序"></p><h3><span id="4-dai-ma-shi-xian">4. Implementation</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">partition</span>(<span class="hljs-params">arr, left, right</span>):</span>
    <span class="hljs-comment"># partition step: left and right are the first and last indices of the range; the first element is the pivot</span>
    tmp = arr[left]
    <span class="hljs-keyword">while</span> left &lt; right:
        <span class="hljs-keyword">while</span> left &lt; right <span class="hljs-keyword">and</span> arr[right] &gt;= tmp:  <span class="hljs-comment"># scan from the right for a value smaller than tmp, skipping values that are not smaller</span>
            right -= <span class="hljs-number">1</span>                             <span class="hljs-comment"># move the right pointer one step left</span>
        arr[left] = arr[right]                     <span class="hljs-comment"># copy the right value into the free slot on the left</span>
        <span class="hljs-keyword">while</span> left &lt; right <span class="hljs-keyword">and</span> arr[left] &lt;= tmp:   <span class="hljs-comment"># scan from the left for a value larger than tmp, skipping values that are not larger</span>
            left += <span class="hljs-number">1</span>                              <span class="hljs-comment"># move the left pointer one step right</span>
        arr[right] = arr[left]                     <span class="hljs-comment"># copy the left value into the free slot on the right</span>
    arr[left] = tmp                                <span class="hljs-comment"># put tmp into its final position</span>
    <span class="hljs-keyword">return</span> left                   <span class="hljs-comment"># left == right here, so either can be returned; the index lets the caller recurse on both sides</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">quick_sort</span>(<span class="hljs-params">arr, left, right</span>):</span>          <span class="hljs-comment"># quick sort</span>
    <span class="hljs-keyword">if</span> left &lt; right:
        mid = partition(arr, left, right)
        quick_sort(arr, left, mid-<span class="hljs-number">1</span>)       <span class="hljs-comment"># sort the part to the left of the pivot</span>
        quick_sort(arr, mid+<span class="hljs-number">1</span>, right)      <span class="hljs-comment"># sort the part to the right of the pivot</span>
    <span class="hljs-keyword">return</span> arr</code></pre><h3><span id="5-ju-ti-shi-li">5. Worked Example</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">partition</span>(<span class="hljs-params">arr, left, right</span>):</span>
    <span class="hljs-comment"># partition step: left and right are the first and last indices of the range; the first element is the pivot</span>
    tmp = arr[left]
    <span 
class="hljs-keyword">while</span> left &lt; right:
        <span class="hljs-keyword">while</span> left &lt; right <span class="hljs-keyword">and</span> arr[right] &gt;= tmp:  <span class="hljs-comment"># scan from the right for a value smaller than tmp, skipping values that are not smaller</span>
            right -= <span class="hljs-number">1</span>                             <span class="hljs-comment"># move the right pointer one step left</span>
        arr[left] = arr[right]                     <span class="hljs-comment"># copy the right value into the free slot on the left</span>
        <span class="hljs-keyword">while</span> left &lt; right <span class="hljs-keyword">and</span> arr[left] &lt;= tmp:   <span class="hljs-comment"># scan from the left for a value larger than tmp, skipping values that are not larger</span>
            left += <span class="hljs-number">1</span>                              <span class="hljs-comment"># move the left pointer one step right</span>
        arr[right] = arr[left]                     <span class="hljs-comment"># copy the left value into the free slot on the right</span>
    arr[left] = tmp                                <span class="hljs-comment"># put tmp into its final position</span>
    <span class="hljs-keyword">return</span> left                   <span class="hljs-comment"># left == right here, so either can be returned; the index lets the caller recurse on both sides</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">quick_sort</span>(<span class="hljs-params">arr, left, right</span>):</span>
    <span class="hljs-keyword">if</span> left &lt; right:
        mid = partition(arr, left, right)
        <span class="hljs-built_in">print</span>(arr)                         <span class="hljs-comment"># print once after each partition pass</span>
        quick_sort(arr, left, mid-<span class="hljs-number">1</span>)       <span class="hljs-comment"># sort the part to the left of the pivot</span>
        quick_sort(arr, mid+<span class="hljs-number">1</span>, right)      <span class="hljs-comment"># sort the part to the right of the pivot</span>

arr = [<span class="hljs-number">5</span>, <span class="hljs-number">7</span>, <span class="hljs-number">4</span>, <span class="hljs-number">6</span>, <span class="hljs-number">3</span>, <span class="hljs-number">1</span>, <span 
class="hljs-number">2</span>, <span class="hljs-number">9</span>, <span class="hljs-number">8</span>]
quick_sort(arr, <span class="hljs-number">0</span>, <span class="hljs-built_in">len</span>(arr)-<span class="hljs-number">1</span>)</code></pre><pre><code class="hljs python">[<span class="hljs-number">2</span>, <span class="hljs-number">1</span>, <span class="hljs-number">4</span>, <span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>, <span class="hljs-number">9</span>, <span class="hljs-number">8</span>]
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">4</span>, <span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>, <span class="hljs-number">9</span>, <span class="hljs-number">8</span>]
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>, <span class="hljs-number">9</span>, <span class="hljs-number">8</span>]
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>, <span class="hljs-number">9</span>, <span class="hljs-number">8</span>]
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>, <span class="hljs-number">9</span>, <span class="hljs-number">8</span>]
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span 
class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>]</code></pre><hr><h2><span id="qi-dui-pai-xu-heap-sort"><font color="#FF0000">VII. Heap Sort</font></span></h2><h3><span id="1-yuan-li">1. Principle</span></h3><p>Heap sort is a sorting algorithm designed around the heap data structure. A heap is a nearly complete binary tree that also satisfies the heap property: the key of every child node is always less than or equal to (or always greater than or equal to) the key of its parent.</p><ul><li>Heap: a special kind of complete binary tree</li><li>Max-heap: a complete binary tree in which every node is no smaller than its children</li><li>Min-heap: a complete binary tree in which every node is no larger than its children</li></ul><p><img src="https://static.wukongsec.com/itbob/images/article/036/11%E5%A0%86%E6%8E%92%E5%BA%8F.png" alt="11堆排序"></p><h3><span id="2-bu-zou">2. Steps</span></h3><ul><li>① Build the heap: turn the sequence to be sorted into a heap H[0……n-1] by sifting down each node in turn, starting from the last non-leaf node and moving right to left, bottom to top. Choose a max-heap for ascending order or a min-heap for descending order;</li><li>② The top of the heap is now the largest (or smallest) element;</li><li>③ Swap the top element with the last element of the heap, shrink the heap by one, and sift down to restore the heap property;</li><li>④ The top of the heap is now the second largest (or second smallest) element;</li><li>⑤ Repeat the steps above until the heap is empty.</li></ul><h3><span id="3-dong-hua-yan-shi">3. Animation</span></h3><p><img src="https://static.wukongsec.com/itbob/images/article/036/12%E6%9E%84%E5%BB%BA%E5%A0%86.gif" alt="12构建堆"></p><p>Once the heap has been built, the heap sort itself runs:</p><p><img src="https://static.wukongsec.com/itbob/images/article/036/13%E5%A0%86%E6%8E%92%E5%BA%8F.gif" alt="13堆排序"></p><h3><span id="4-dai-ma-shi-xian">4. Implementation</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sift</span>(<span class="hljs-params">arr, low, high</span>):</span>
    <span class="hljs-string">&quot;&quot;&quot;</span>
<span class="hljs-string">    :param arr: the list</span>
<span class="hljs-string">    :param low: index of the root of the heap</span>
<span class="hljs-string">    :param high: index of the last element of the heap</span>
<span class="hljs-string">    &quot;&quot;&quot;</span>
    i = low                 <span class="hljs-comment"># i starts at the root</span>
    j = <span class="hljs-number">2</span> * i + <span class="hljs-number">1</span>           <span class="hljs-comment"># j starts as the left child</span>
    tmp = arr[low]          <span class="hljs-comment"># save the value at the root</span>
    <span class="hljs-keyword">while</span> j &lt;= high:        <span class="hljs-comment"># while position j is inside the heap</span>
        <span class="hljs-keyword">if</span> j + <span class="hljs-number">1</span> &lt;= high <span class="hljs-keyword">and</span> arr[j+<span class="hljs-number">1</span>] &gt; arr[j]:   <span class="hljs-comment"># if the right child exists and is larger</span>
            j = j + <span class="hljs-number">1</span>       <span class="hljs-comment"># point j at the right child</span>
        <span class="hljs-keyword">if</span> arr[j] &gt; tmp:
            arr[i] = arr[j]
            i = j           <span class="hljs-comment"># go down one level</span>
            j = <span class="hljs-number">2</span> * i + <span class="hljs-number">1</span>
        <span class="hljs-keyword">else</span>:               <span class="hljs-comment"># tmp is not smaller, so it belongs at position i</span>
            arr[i] = tmp    <span class="hljs-comment"># place tmp at this internal node</span>
            <span class="hljs-keyword">break</span>
    <span class="hljs-keyword">else</span>:
        arr[i] = tmp        <span class="hljs-comment"># the loop ran off the bottom: place tmp at the leaf</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">heap_sort</span>(<span class="hljs-params">arr</span>):</span>
    n = <span class="hljs-built_in">len</span>(arr)
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>((n-<span class="hljs-number">2</span>)//<span class="hljs-number">2</span>, -<span class="hljs-number">1</span>, -<span class="hljs-number">1</span>):   <span class="hljs-comment"># i is the root of the subtree being sifted while building the heap</span>
        sift(arr, i, n-<span class="hljs-number">1</span>)
    <span class="hljs-comment"># heap construction done</span>
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(n-<span class="hljs-number">1</span>, -<span class="hljs-number">1</span>, -<span class="hljs-number">1</span>):        
<span class="hljs-comment"># i 指向当前堆的最后一个元素</span>        arr[<span class="hljs-number">0</span>], arr[i] = arr[i], arr[<span class="hljs-number">0</span>]        sift(arr, <span class="hljs-number">0</span>, i - <span class="hljs-number">1</span>)             <span class="hljs-comment"># i-1是新的high</span>    <span class="hljs-keyword">return</span> arr</code></pre><h3><span id="5-ju-ti-shi-li">5、具体示例</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sift</span>(<span class="hljs-params">arr, low, high</span>):</span>    <span class="hljs-string">&quot;&quot;&quot;</span><span class="hljs-string">    :param li: 列表</span><span class="hljs-string">    :param low: 堆的根节点位置</span><span class="hljs-string">    :param high: 堆的最后一个元素的位置</span><span class="hljs-string">    &quot;&quot;&quot;</span>    i = low                 <span class="hljs-comment"># i最开始指向根节点</span>    j = <span class="hljs-number">2</span> * i + <span class="hljs-number">1</span>           <span class="hljs-comment"># j开始是左孩子</span>    tmp = arr[low]          <span class="hljs-comment"># 把堆顶存起来</span>    <span class="hljs-keyword">while</span> j &lt;= high:        <span class="hljs-comment"># 只要j位置有数</span>        <span class="hljs-keyword">if</span> j + <span class="hljs-number">1</span> &lt;= high <span class="hljs-keyword">and</span> arr[j+<span class="hljs-number">1</span>] &gt; arr[j]:   <span class="hljs-comment"># 如果右孩子有并且比较大</span>            j = j + <span class="hljs-number">1</span>       <span class="hljs-comment"># j指向右孩子</span>        <span class="hljs-keyword">if</span> arr[j] &gt; tmp:            arr[i] = arr[j]            i = j           <span class="hljs-comment"># 往下看一层</span>            j = <span class="hljs-number">2</span> * i + <span class="hljs-number">1</span>        <span class="hljs-keyword">else</span>:               <span class="hljs-comment"># tmp更大，把tmp放到i的位置上</span>            arr[i] = tmp  
  <span class="hljs-comment"># 把tmp放到某一级领导位置上</span>            <span class="hljs-keyword">break</span>    <span class="hljs-keyword">else</span>:        arr[i] = tmp        <span class="hljs-comment"># 把tmp放到叶子节点上</span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">heap_sort</span>(<span class="hljs-params">arr</span>):</span>    n = <span class="hljs-built_in">len</span>(arr)    <span class="hljs-built_in">print</span>(<span class="hljs-string">&#x27;建堆过程：&#x27;</span>)    <span class="hljs-built_in">print</span>(arr)    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>((n-<span class="hljs-number">2</span>)//<span class="hljs-number">2</span>, -<span class="hljs-number">1</span>, -<span class="hljs-number">1</span>):   <span class="hljs-comment"># i表示建堆的时候调整的部分的根的下标</span>        sift(arr, i, n-<span class="hljs-number">1</span>)        <span class="hljs-built_in">print</span>(arr)    <span class="hljs-comment"># 建堆完成</span>    <span class="hljs-built_in">print</span>(<span class="hljs-string">&#x27;堆排序过程：&#x27;</span>)    <span class="hljs-built_in">print</span>(arr)    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(n-<span class="hljs-number">1</span>, -<span class="hljs-number">1</span>, -<span class="hljs-number">1</span>):        <span class="hljs-comment"># i 指向当前堆的最后一个元素</span>        arr[<span class="hljs-number">0</span>], arr[i] = arr[i], arr[<span class="hljs-number">0</span>]        sift(arr, <span class="hljs-number">0</span>, i - <span class="hljs-number">1</span>)             <span class="hljs-comment"># i-1是新的high</span>        <span class="hljs-built_in">print</span>(arr)arr = [<span class="hljs-number">2</span>, <span class="hljs-number">7</span>, <span class="hljs-number">26</span>, <span class="hljs-number">25</span>, <span class="hljs-number">19</span>, <span 
class="hljs-number">17</span>, <span class="hljs-number">1</span>, <span class="hljs-number">90</span>, <span class="hljs-number">3</span>, <span class="hljs-number">36</span>]heap_sort(arr)</code></pre><pre><code class="hljs python">建堆过程：[<span class="hljs-number">2</span>, <span class="hljs-number">7</span>, <span class="hljs-number">26</span>, <span class="hljs-number">25</span>, <span class="hljs-number">19</span>, <span class="hljs-number">17</span>, <span class="hljs-number">1</span>, <span class="hljs-number">90</span>, <span class="hljs-number">3</span>, <span class="hljs-number">36</span>][<span class="hljs-number">2</span>, <span class="hljs-number">7</span>, <span class="hljs-number">26</span>, <span class="hljs-number">25</span>, <span class="hljs-number">36</span>, <span class="hljs-number">17</span>, <span class="hljs-number">1</span>, <span class="hljs-number">90</span>, <span class="hljs-number">3</span>, <span class="hljs-number">19</span>][<span class="hljs-number">2</span>, <span class="hljs-number">7</span>, <span class="hljs-number">26</span>, <span class="hljs-number">90</span>, <span class="hljs-number">36</span>, <span class="hljs-number">17</span>, <span class="hljs-number">1</span>, <span class="hljs-number">25</span>, <span class="hljs-number">3</span>, <span class="hljs-number">19</span>][<span class="hljs-number">2</span>, <span class="hljs-number">7</span>, <span class="hljs-number">26</span>, <span class="hljs-number">90</span>, <span class="hljs-number">36</span>, <span class="hljs-number">17</span>, <span class="hljs-number">1</span>, <span class="hljs-number">25</span>, <span class="hljs-number">3</span>, <span class="hljs-number">19</span>][<span class="hljs-number">2</span>, <span class="hljs-number">90</span>, <span class="hljs-number">26</span>, <span class="hljs-number">25</span>, <span class="hljs-number">36</span>, <span class="hljs-number">17</span>, <span class="hljs-number">1</span>, <span class="hljs-number">7</span>, 
<span class="hljs-number">3</span>, <span class="hljs-number">19</span>][<span class="hljs-number">90</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">25</span>, <span class="hljs-number">19</span>, <span class="hljs-number">17</span>, <span class="hljs-number">1</span>, <span class="hljs-number">7</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>]堆排序过程：[<span class="hljs-number">90</span>, <span class="hljs-number">36</span>, <span class="hljs-number">26</span>, <span class="hljs-number">25</span>, <span class="hljs-number">19</span>, <span class="hljs-number">17</span>, <span class="hljs-number">1</span>, <span class="hljs-number">7</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>][<span class="hljs-number">36</span>, <span class="hljs-number">25</span>, <span class="hljs-number">26</span>, <span class="hljs-number">7</span>, <span class="hljs-number">19</span>, <span class="hljs-number">17</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">90</span>][<span class="hljs-number">26</span>, <span class="hljs-number">25</span>, <span class="hljs-number">17</span>, <span class="hljs-number">7</span>, <span class="hljs-number">19</span>, <span class="hljs-number">3</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">36</span>, <span class="hljs-number">90</span>][<span class="hljs-number">25</span>, <span class="hljs-number">19</span>, <span class="hljs-number">17</span>, <span class="hljs-number">7</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">1</span>, <span class="hljs-number">26</span>, <span class="hljs-number">36</span>, <span class="hljs-number">90</span>][<span class="hljs-number">19</span>, <span class="hljs-number">7</span>, <span 
class="hljs-number">17</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">25</span>, <span class="hljs-number">26</span>, <span class="hljs-number">36</span>, <span class="hljs-number">90</span>][<span class="hljs-number">17</span>, <span class="hljs-number">7</span>, <span class="hljs-number">3</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">19</span>, <span class="hljs-number">25</span>, <span class="hljs-number">26</span>, <span class="hljs-number">36</span>, <span class="hljs-number">90</span>][<span class="hljs-number">7</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">1</span>, <span class="hljs-number">17</span>, <span class="hljs-number">19</span>, <span class="hljs-number">25</span>, <span class="hljs-number">26</span>, <span class="hljs-number">36</span>, <span class="hljs-number">90</span>][<span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">1</span>, <span class="hljs-number">7</span>, <span class="hljs-number">17</span>, <span class="hljs-number">19</span>, <span class="hljs-number">25</span>, <span class="hljs-number">26</span>, <span class="hljs-number">36</span>, <span class="hljs-number">90</span>][<span class="hljs-number">2</span>, <span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">7</span>, <span class="hljs-number">17</span>, <span class="hljs-number">19</span>, <span class="hljs-number">25</span>, <span class="hljs-number">26</span>, <span class="hljs-number">36</span>, <span class="hljs-number">90</span>][<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">7</span>, <span class="hljs-number">17</span>, <span class="hljs-number">19</span>, <span 
class="hljs-number">25</span>, <span class="hljs-number">26</span>, <span class="hljs-number">36</span>, <span class="hljs-number">90</span>]
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">7</span>, <span class="hljs-number">17</span>, <span class="hljs-number">19</span>, <span class="hljs-number">25</span>, <span class="hljs-number">26</span>, <span class="hljs-number">36</span>, <span class="hljs-number">90</span>]</code></pre><hr><h2><span id="ba-ji-shu-pai-xu-counting-sort"><font color="#FF0000">八、计数排序（Counting Sort）</font></span></h2><h3><span id="1-yuan-li">1. Principle</span></h3><p>Counting sort is a non-comparison sorting algorithm. Its advantage is that, when sorting integers within a known range, its complexity is Ο(n+k), where k is the size of the integer range; when k is not much larger than n, this beats the O(n log n) lower bound of comparison-based sorts. Counting sort trades space for time. Its core idea is to use each input value as a key (an index) into an additionally allocated array. As a linear-time sort, <strong>counting sort requires the input to be integers within a known range.</strong></p><h3><span id="2-bu-zou">2. Steps</span></h3><ul><li><strong>①</strong> Find the maximum value k in the list to be sorted (the values are assumed to be non-negative integers) and allocate a counting list of length k+1 with every entry set to 0.</li><li><strong>②</strong> Traverse the list to be sorted; for each element with value i, add 1 to the entry at index i of the counting list.</li><li><strong>③</strong> Once the whole list has been traversed, the value j at index i of the counting list means that i occurs j times, so the count of every value is known.</li><li><strong>④</strong> Create a new list (or clear the original list and append to it), traverse the counting list, and append j copies of each i in turn. The new list is the sorted result, and at no point were the elements of the input compared with each other.</li></ul><h3><span id="3-dong-hua-yan-shi">3. Animation</span></h3><p><img src="https://static.wukongsec.com/itbob/images/article/036/14%E8%AE%A1%E6%95%B0%E6%8E%92%E5%BA%8F.gif" alt="14计数排序"></p><h3><span id="4-dai-ma-shi-xian">4. Implementation</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">count_sort</span>(<span class="hljs-params">arr</span>):</span>
    <span class="hljs-keyword">if</span> <span class="hljs-built_in">len</span>(arr) &lt; <span class="hljs-number">2</span>:                       <span class="hljs-comment"># fewer than 2 elements: return the list as is</span>
        <span class="hljs-keyword">return</span> arr
    max_num = <span class="hljs-built_in">max</span>(arr)
    count = [<span 
class="hljs-number">0</span> <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(max_num+<span class="hljs-number">1</span>)]  <span class="hljs-comment"># 开辟一个计数列表</span>    <span class="hljs-keyword">for</span> val <span class="hljs-keyword">in</span> arr:        count[val] += <span class="hljs-number">1</span>    arr.clear()                        <span class="hljs-comment"># 原数组清空</span>    <span class="hljs-keyword">for</span> ind, val <span class="hljs-keyword">in</span> <span class="hljs-built_in">enumerate</span>(count):  <span class="hljs-comment"># 遍历值和下标（值的数量）</span>        <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(val):            arr.append(ind)    <span class="hljs-keyword">return</span> arr</code></pre><h3><span id="5-ju-ti-shi-li">5、具体示例</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">count_sort</span>(<span class="hljs-params">arr</span>):</span>    <span class="hljs-keyword">if</span> <span class="hljs-built_in">len</span>(arr) &lt; <span class="hljs-number">2</span>:                       <span class="hljs-comment"># 如果数组长度小于 2 则直接返回</span>        <span class="hljs-keyword">return</span> arr    max_num = <span class="hljs-built_in">max</span>(arr)    count = [<span class="hljs-number">0</span> <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(max_num+<span class="hljs-number">1</span>)]  <span class="hljs-comment"># 开辟一个计数列表</span>    <span class="hljs-keyword">for</span> val <span class="hljs-keyword">in</span> arr:        count[val] += <span class="hljs-number">1</span>    arr.clear()                        <span class="hljs-comment"># 原数组清空</span>    <span class="hljs-keyword">for</span> ind, val <span class="hljs-keyword">in</span> <span 
class="hljs-built_in">enumerate</span>(count):  <span class="hljs-comment"># 遍历值和下标（值的数量）</span>        <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(val):            arr.append(ind)    <span class="hljs-keyword">return</span> arrarr = [<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">8</span>, <span class="hljs-number">7</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-number">7</span>, <span class="hljs-number">3</span>, <span class="hljs-number">9</span>, <span class="hljs-number">8</span>, <span class="hljs-number">2</span>, <span class="hljs-number">1</span>, <span class="hljs-number">4</span>, <span class="hljs-number">2</span>, <span class="hljs-number">4</span>, <span class="hljs-number">6</span>, <span class="hljs-number">9</span>, <span class="hljs-number">2</span>]sorted_arr = count_sort(arr)<span class="hljs-built_in">print</span>(sorted_arr)</code></pre><pre><code class="hljs python">[<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">4</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>, <span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">9</span>]</code></pre><hr><h2><span id="jiu-tong-pai-xu-bucket-sort"><font color="#FF0000">九、桶排序（Bucket 
Sort）</font></span></h2><h3><span id="1-yuan-li">1. Principle</span></h3><p>Bucket sort, also called bin sort, works by distributing the elements of an array into a finite number of buckets. Each bucket is then sorted individually, either with a different sorting algorithm or by applying bucket sort recursively. Bucket sort is a generalization of pigeonhole sort.</p><p>Bucket sort can also be seen as an upgraded version of counting sort. It relies on a mapping function from values to buckets, and its efficiency depends on how that mapping function is chosen. To make bucket sort more efficient, two things matter:</p><ul><li>When enough extra space is available, use as many buckets as practical;</li><li>The mapping function should distribute the N input items evenly across the K buckets.</li></ul><p>The comparison sort chosen for the elements inside each bucket also has a major impact on performance.</p><ul><li>Best case: the input is distributed evenly across all buckets;</li><li>Worst case: all of the input lands in the same bucket.</li></ul><h3><span id="2-bu-zou">2. Steps</span></h3><ul><li><strong>①</strong> Create a fixed number of empty buckets, for example a list of lists;</li><li><strong>②</strong> Traverse the sequence and put each element into its bucket;</li><li><strong>③</strong> Sort every non-empty bucket;</li><li><strong>④</strong> Put the elements of the non-empty buckets back into the sequence, bucket by bucket.</li></ul><h3><span id="3-dong-hua-yan-shi">3. Animation</span></h3><p>(Animation from @五分钟学算法; it will be removed on request if it infringes.)</p><p><img src="https://static.wukongsec.com/itbob/images/article/036/15%E6%A1%B6%E6%8E%92%E5%BA%8F.gif" alt="15桶排序"></p><h3><span id="4-dai-ma-shi-xian">4. Implementation</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">bucket_sort</span>(<span class="hljs-params">arr</span>):</span>
    max_num = <span class="hljs-built_in">max</span>(arr)
    n = <span class="hljs-built_in">len</span>(arr)
    bucket_size = max_num // n + <span class="hljs-number">1</span>         <span class="hljs-comment"># width of one bucket; the +1 avoids dividing by zero when max_num &lt; n</span>
    buckets = [[] <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(n)]         <span class="hljs-comment"># create the buckets</span>
    <span class="hljs-keyword">for</span> var <span class="hljs-keyword">in</span> arr:                          <span class="hljs-comment"># values are assumed to be non-negative integers</span>
        i = <span class="hljs-built_in">min</span>(var // bucket_size, n-<span class="hljs-number">1</span>)     <span class="hljs-comment"># i is the index of the bucket that var goes into</span>
        buckets[i].append(var)               <span class="hljs-comment"># add var to that bucket</span>
        <span class="hljs-comment"># keep the bucket sorted (one insertion-sort step)</span>
        <span class="hljs-keyword">for</span> j <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(buckets[i])-<span class="hljs-number">1</span>, <span class="hljs-number">0</span>, -<span class="hljs-number">1</span>):
            <span class="hljs-keyword">if</span> buckets[i][j] &lt; buckets[i][j-<span class="hljs-number">1</span>]:
                buckets[i][j], buckets[i][j-<span class="hljs-number">1</span>] = buckets[i][j-<span class="hljs-number">1</span>], buckets[i][j]
            <span class="hljs-keyword">else</span>:
                <span class="hljs-keyword">break</span>
    sorted_arr = []
    <span class="hljs-keyword">for</span> buc <span class="hljs-keyword">in</span> buckets:
        sorted_arr.extend(buc)
    <span class="hljs-keyword">return</span> sorted_arr</code></pre><h3><span id="5-ju-ti-shi-li">5. Example</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">bucket_sort</span>(<span class="hljs-params">arr</span>):</span>
    max_num = <span class="hljs-built_in">max</span>(arr)
    n = <span class="hljs-built_in">len</span>(arr)
    bucket_size = max_num // n + <span class="hljs-number">1</span>         <span class="hljs-comment"># width of one bucket; the +1 avoids dividing by zero when max_num &lt; n</span>
    buckets = [[] <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(n)]         <span class="hljs-comment"># create the buckets</span>
    <span class="hljs-keyword">for</span> var <span class="hljs-keyword">in</span> arr:                          <span class="hljs-comment"># values are assumed to be non-negative integers</span>
        i = <span class="hljs-built_in">min</span>(var // bucket_size, n-<span class="hljs-number">1</span>)     <span class="hljs-comment"># i is the index of the bucket that var goes into</span>
        buckets[i].append(var)               <span class="hljs-comment"># add var to that bucket</span>
        <span class="hljs-comment"># keep the bucket sorted (one insertion-sort step)</span>
        <span class="hljs-keyword">for</span> j <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(buckets[i])-<span class="hljs-number">1</span>, <span class="hljs-number">0</span>, -<span class="hljs-number">1</span>):
            <span class="hljs-keyword">if</span> buckets[i][j] &lt; buckets[i][j-<span class="hljs-number">1</span>]:
                buckets[i][j], 
buckets[i][j-<span class="hljs-number">1</span>] = buckets[i][j-<span class="hljs-number">1</span>], buckets[i][j]            <span class="hljs-keyword">else</span>:                <span class="hljs-keyword">break</span>    sorted_arr = []    <span class="hljs-keyword">for</span> buc <span class="hljs-keyword">in</span> buckets:        sorted_arr.extend(buc)    <span class="hljs-keyword">return</span> sorted_arrarr = [<span class="hljs-number">7</span>, <span class="hljs-number">12</span>, <span class="hljs-number">56</span>, <span class="hljs-number">23</span>, <span class="hljs-number">19</span>, <span class="hljs-number">33</span>, <span class="hljs-number">35</span>, <span class="hljs-number">42</span>, <span class="hljs-number">42</span>, <span class="hljs-number">2</span>, <span class="hljs-number">8</span>, <span class="hljs-number">22</span>, <span class="hljs-number">39</span>, <span class="hljs-number">26</span>, <span class="hljs-number">17</span>]sorted_arr = bucket_sort(arr)<span class="hljs-built_in">print</span>(sorted_arr)</code></pre><pre><code class="hljs python">[<span class="hljs-number">2</span>, <span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">12</span>, <span class="hljs-number">17</span>, <span class="hljs-number">19</span>, <span class="hljs-number">22</span>, <span class="hljs-number">23</span>, <span class="hljs-number">26</span>, <span class="hljs-number">33</span>, <span class="hljs-number">35</span>, <span class="hljs-number">39</span>, <span class="hljs-number">42</span>, <span class="hljs-number">42</span>, <span class="hljs-number">56</span>]</code></pre><hr><h2><span id="shi-ji-shu-pai-xu-radix-sort"><font color="#FF0000">十、基数排序（Radix Sort）</font></span></h2><h3><span 
id="1-yuan-li">1. Principle</span></h3><p>Radix sort is a distribution sort and a non-comparison integer sorting algorithm. It splits each integer into its individual digits and then processes the numbers one digit position at a time. Because strings (such as names or dates) and floating-point numbers in specific formats can also be represented as integers, radix sort is not limited to integers.</p><p>Radix sort, counting sort, and bucket sort all use the idea of buckets, but they use them in clearly different ways:</p><ul><li>Radix sort: assigns buckets according to each digit of the key;</li><li>Counting sort: each bucket stores a single key value;</li><li>Bucket sort: each bucket stores a range of values.</li></ul><h3><span id="2-bu-zou">2. Steps</span></h3><ul><li><strong>①</strong> Find the largest number in the array and determine how many digits it has;</li><li><strong>②</strong> Starting from the lowest digit, sort the array once per digit position (each pass must be stable);</li><li><strong>③</strong> Once every pass from the lowest digit to the highest digit is done, the sequence is sorted.</li></ul><h3><span id="3-dong-hua-yan-shi">3. Animation</span></h3><p><img src="https://static.wukongsec.com/itbob/images/article/036/16%E5%9F%BA%E6%95%B0%E6%8E%92%E5%BA%8F.gif" alt="16基数排序"></p><h3><span id="4-dai-ma-shi-xian">4. Implementation</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">radix_sort</span>(<span class="hljs-params">li</span>):</span>
    max_num = <span class="hljs-built_in">max</span>(li)      <span class="hljs-comment"># largest value: 9-&gt;1 pass, 99-&gt;2 passes, 888-&gt;3 passes, 10000-&gt;5 passes</span>
    it = <span class="hljs-number">0</span>
    <span class="hljs-keyword">while</span> <span class="hljs-number">10</span> ** it &lt;= max_num:
        buckets = [[] <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-number">10</span>)]
        <span class="hljs-keyword">for</span> var <span class="hljs-keyword">in</span> li:
            <span class="hljs-comment"># var=987, it=1, 987//10-&gt;98, 98%10-&gt;8; it=2, 987//100-&gt;9, 9%10=9</span>
            digit = (var // <span class="hljs-number">10</span> ** it) % <span class="hljs-number">10</span>   <span class="hljs-comment"># take one digit per pass</span>
            buckets[digit].append(var)
        <span class="hljs-comment"># bucketing done</span>
        li.clear()
        <span class="hljs-keyword">for</span> buc <span class="hljs-keyword">in</span> buckets:
            li.extend(buc)
        it += <span class="hljs-number">1</span>            <span class="hljs-comment"># the numbers have been written back to li; move on to the next digit</span>
    <span class="hljs-keyword">return</span> li</code></pre><h3><span id="5-ju-ti-shi-li">5. Example</span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">radix_sort</span>(<span class="hljs-params">li</span>):</span>
    max_num = <span class="hljs-built_in">max</span>(li)      <span class="hljs-comment"># largest value: 9-&gt;1 pass, 99-&gt;2 passes, 888-&gt;3 passes, 10000-&gt;5 passes</span>
    it = <span class="hljs-number">0</span>
    <span class="hljs-keyword">while</span> <span class="hljs-number">10</span> ** it &lt;= max_num:
        buckets = [[] <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-number">10</span>)]
        <span class="hljs-keyword">for</span> var <span class="hljs-keyword">in</span> li:
            <span class="hljs-comment"># var=987, it=1, 987//10-&gt;98, 98%10-&gt;8; it=2, 987//100-&gt;9, 9%10=9</span>
            digit = (var // <span class="hljs-number">10</span> ** it) % <span class="hljs-number">10</span>   <span class="hljs-comment"># take one digit per pass</span>
            buckets[digit].append(var)
        <span class="hljs-comment"># bucketing done</span>
        li.clear()
        <span class="hljs-keyword">for</span> buc <span class="hljs-keyword">in</span> buckets:
            li.extend(buc)
        it += <span class="hljs-number">1</span>            <span class="hljs-comment"># the numbers have been written back to li; move on to the next digit</span>
    <span class="hljs-keyword">return</span> li

arr = [<span class="hljs-number">3221</span>, <span class="hljs-number">1</span>, <span class="hljs-number">10</span>, <span class="hljs-number">9680</span>, <span class="hljs-number">577</span>, <span class="hljs-number">9420</span>, <span class="hljs-number">7</span>, <span class="hljs-number">5622</span>, <span class="hljs-number">4793</span>, <span class="hljs-number">2030</span>, <span 
class="hljs-number">3138</span>, <span class="hljs-number">82</span>, <span class="hljs-number">2599</span>, <span class="hljs-number">743</span>, <span class="hljs-number">4127</span>]sorted_arr = radix_sort(arr)<span class="hljs-built_in">print</span>(sorted_arr)</code></pre><pre><code class="hljs python">[<span class="hljs-number">1</span>, <span class="hljs-number">7</span>, <span class="hljs-number">10</span>, <span class="hljs-number">82</span>, <span class="hljs-number">577</span>, <span class="hljs-number">743</span>, <span class="hljs-number">2030</span>, <span class="hljs-number">2599</span>, <span class="hljs-number">3138</span>, <span class="hljs-number">3221</span>, <span class="hljs-number">4127</span>, <span class="hljs-number">4793</span>, <span class="hljs-number">5622</span>, <span class="hljs-number">9420</span>, <span class="hljs-number">9680</span>]</code></pre><hr><pre><code class="hljs yaml"><span class="hljs-string">这里是一段物理防爬虫文本，请读者忽略。</span><span class="hljs-string">本文原创首发于</span> <span class="hljs-string">CSDN，作者</span> <span class="hljs-string">ITBOB。</span><span class="hljs-string">博客首页：https://itrhx.blog.csdn.net/</span><span class="hljs-string">本文链接：https://itrhx.blog.csdn.net/article/details/108987300</span><span class="hljs-string">未经授权，禁止转载！恶意转载，后果自负！尊重原创，远离剽窃！</span></code></pre><hr>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;文章目录&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-pai-xu-suan-fa-fen-lei-font&quot;&gt;&lt;font color=&quot;</summary>
      
    
    
    
    <category term="算法" scheme="https://www.itbob.cn/categories/%E7%AE%97%E6%B3%95/"/>
    
    
    <category term="Python" scheme="https://www.itbob.cn/tags/Python/"/>
    
    <category term="算法" scheme="https://www.itbob.cn/tags/%E7%AE%97%E6%B3%95/"/>
    
  </entry>
  
  <entry>
    <title>COVID-19 肺炎疫情数据实时监控（python 爬虫 + pyecharts 数据可视化 + wordcloud 词云图）</title>
    <link href="https://www.itbob.cn/article/035/"/>
    <id>https://www.itbob.cn/article/035/</id>
    <published>2020-07-06T04:49:35.000Z</published>
    <updated>2022-05-22T12:46:00.000Z</updated>
    
    <content type="html"><![CDATA[<p><strong><center><font color="red" size="5px" weight="bolder">欢迎加入爬虫逆向微信交流群：添加微信 IT-BOB（备注交流群）</font></center></strong></p><h2><span id="wen-zhang-mu-lu">文章目录</span></h2><!-- toc --><ul><li><a href="#font-color-ff0000-1x00-qian-yan-font"><font color="#FF0000">【1x00】前言</font></a></li><li><a href="#font-color-ff0000-2x00-si-wei-dao-tu-font"><font color="#FF0000">【2x00】思维导图</font></a></li><li><a href="#font-color-ff0000-3x00-shu-ju-jie-gou-fen-xi-font"><font color="#FF0000">【3x00】数据结构分析</font></a></li><li><a href="#font-color-ff0000-4x00-zhu-han-shu-main-font"><font color="#FF0000">【4x00】主函数 main()</font></a></li><li><a href="#font-color-ff0000-5x00-shu-ju-huo-qu-mo-kuai-data-get-font"><font color="#FF0000">【5x00】数据获取模块 data_get</font></a><ul><li><a href="#font-color-4876ff-5x01-chu-shi-hua-han-shu-init-font"><font color="#4876FF">【5x01】初始化函数 init()</font></a></li><li><a href="#font-color-4876ff-5x02-zhong-guo-zong-shu-ju-china-total-data-font"><font color="#4876FF">【5x02】中国总数据 china_total_data()</font></a></li><li><a href="#font-color-4876ff-5x03-quan-qiu-zong-shu-ju-global-total-data-font"><font color="#4876FF">【5x03】全球总数据 global_total_data()</font></a></li><li><a href="#font-color-4876ff-5x04-zhong-guo-mei-ri-shu-ju-china-daily-data-font"><font color="#4876FF">【5x04】中国每日数据 china_daily_data()</font></a></li><li><a href="#font-color-4876ff-5x05-jing-wai-mei-ri-shu-ju-foreign-daily-data-font"><font color="#4876FF">【5x05】境外每日数据 foreign_daily_data()</font></a></li></ul></li><li><a href="#font-color-ff0000-6x00-ci-yun-tu-hui-zhi-mo-kuai-data-wordcloud-font"><font color="#FF0000">【6x00】词云图绘制模块 data_wordcloud</font></a><ul><li><a href="#font-color-4876ff-6x01-zhong-guo-lei-ji-que-zhen-ci-yun-tu-foreign-daily-data-font"><font color="#4876FF">【6x01】中国累计确诊词云图 foreign_daily_data()</font></a></li><li><a href="#font-color-4876ff-6x02-quan-qiu-lei-ji-que-zhen-ci-yun-tu-foreign-daily-data-font"><font color="#4876FF">【6x02】全球累计确诊词云图 
foreign_daily_data()</font></a></li></ul></li><li><a href="#font-color-ff0000-7x00-di-tu-hui-zhi-mo-kuai-data-map-font"><font color="#FF0000">【7x00】地图绘制模块 data_map</font></a><ul><li><a href="#font-color-4876ff-7x01-zhong-guo-lei-ji-que-zhen-di-tu-china-total-map-font"><font color="#4876FF">【7x01】中国累计确诊地图 china_total_map()</font></a></li><li><a href="#font-color-4876ff-7x02-quan-qiu-lei-ji-que-zhen-di-tu-global-total-map-font"><font color="#4876FF">【7x02】全球累计确诊地图 global_total_map()</font></a></li><li><a href="#font-color-4876ff-7x03-zhong-guo-mei-ri-shu-ju-zhe-xian-tu-china-daily-map-font"><font color="#4876FF">【7x03】中国每日数据折线图 china_daily_map()</font></a></li><li><a href="#font-color-4876ff-7x04-jing-wai-mei-ri-shu-ju-zhe-xian-tu-foreign-daily-map-font"><font color="#4876FF">【7x04】境外每日数据折线图 foreign_daily_map()</font></a></li></ul></li><li><a href="#font-color-ff0000-8x00-jie-guo-jie-tu-font"><font color="#FF0000">【8x00】结果截图</font></a><ul><li><a href="#font-color-4876ff-8x01-shu-ju-chu-cun-excel-font"><font color="#4876FF">【8x01】数据储存 Excel</font></a></li><li><a href="#font-color-4876ff-8x02-ci-yun-tu-font"><font color="#4876FF">【8x02】词云图</font></a></li><li><a href="#font-color-4876ff-8x03-di-tu-zhe-xian-tu-font"><font color="#4876FF">【8x03】地图 + 折线图</font></a></li></ul></li><li><a href="#font-color-ff0000-9x00-wan-zheng-dai-ma-font"><font color="#FF0000">【9x00】完整代码</font></a></li></ul><!-- tocstop --><hr><pre><code class="hljs yaml"><span class="hljs-string">这里是一段物理防爬虫文本，请读者忽略。</span><span class="hljs-string">本文原创首发于</span> <span class="hljs-string">CSDN，作者</span> <span class="hljs-string">ITBOB。</span><span class="hljs-string">博客首页：https://itrhx.blog.csdn.net/</span><span class="hljs-string">本文链接：https://itrhx.blog.csdn.net/article/details/107140534</span><span class="hljs-string">未经授权，禁止转载！恶意转载，后果自负！尊重原创，远离剽窃！</span></code></pre><hr><h2><span id="1x00-qian-yan"><font 
color="#FF0000">【1x00】Preface</font></span></h2><p>I had meant to build a real-time display of epidemic data two or three months ago, but kept putting it off for one reason or another, and only recently found the time to write one. The data is scraped with Python from the <a href="https://voice.baidu.com/act/newpneumonia/newpneumonia/">Baidu real-time epidemic big data report</a>: requests sends the HTTP requests, XPath parses the page, the wordcloud library draws the word clouds, pyecharts draws the maps and line charts, and the data is stored in Excel workbooks that are handled with openpyxl.</p><p>The program renders maps of cumulative confirmed cases and line charts of daily changes; fetching and displaying further data can be added in the same way. Deploy it on a server and run it on a schedule to keep the display up to date. The pyecharts charts can also be integrated into a web framework such as Django or Flask.</p><p>Two terms come up when fetching the data: <font color="#FF0000"><strong>global</strong></font> and <font color="#FF0000"><strong>overseas</strong></font>. Global includes China; overseas does not. The four charts drawn later are: a map of cumulative confirmed cases in China, a map of cumulative confirmed cases worldwide (including China), a line chart of daily data for China, and a line chart of daily data overseas (excluding China).</p><p><font color="#FF0000"><strong>Note: the response returned by requesting this page directly does not include daily data for each country. That data is served from: <a href="https://voice.baidu.com/newpneumonia/get?target=trend&amp;isCaseIn=1&amp;stage=publish">https://voice.baidu.com/newpneumonia/get?target=trend&amp;isCaseIn=1&amp;stage=publish</a></strong></font></p><ul><li><strong>Preview</strong>: <s><a href="http://cov.itrhx.com/">http://cov.itrhx.com/</a></s> (no longer available)</li><li><strong>Data source</strong>: <a href="https://voice.baidu.com/act/newpneumonia/newpneumonia/">https://voice.baidu.com/act/newpneumonia/newpneumonia/</a></li><li><strong>pyecharts documentation</strong>: <a href="https://pyecharts.org/">https://pyecharts.org/</a></li><li><strong>openpyxl documentation</strong>: <a href="https://openpyxl.readthedocs.io/">https://openpyxl.readthedocs.io/</a></li><li><strong>wordcloud documentation</strong>: <a href="http://amueller.github.io/word_cloud/">http://amueller.github.io/word_cloud/</a></li></ul><h2><span id="2x00-si-wei-dao-tu"><font color="#FF0000">【2x00】Mind Map</font></span></h2><p><img src="https://static.wukongsec.com/itbob/images/article/035/01.png" alt="01"></p><h2><span id="3x00-shu-ju-jie-gou-fen-xi"><font color="#FF0000">【3x00】Data Structure Analysis</font></span></h2><p>The source of Baidu's epidemic data page contains a lot of neatly structured data, presumably the epidemic figures. After saving the page and formatting it, it is easy to see that all of the data sits inside <code>&lt;script type=&quot;application/json&quot; id=&quot;captain-config&quot;&gt;&lt;/script&gt;</code>. The title fields there are Unicode escape sequences; converting them to Chinese makes it easier to tell the different categories of data apart.</p><p><img src="https://static.wukongsec.com/itbob/images/article/035/02.png" alt="02"></p><p>Because there is so much data, extract the main body, delete duplicates and other clutter, and keep only the general layout, so that the structure can be analyzed for later extraction. After this cleanup the data looks roughly like this:</p><pre><code class="hljs json">&lt;script type=<span class="hljs-string">&quot;application/json&quot;</span> id=<span class="hljs-string">&quot;captain-config&quot;</span>&gt;
    &#123;
        <span class="hljs-attr">&quot;component&quot;</span>: [
            &#123;
                <span class="hljs-attr">&quot;mapLastUpdatedTime&quot;</span>: <span class="hljs-string">&quot;2020.07.05 16:13&quot;</span>,        <span class="hljs-comment">// time the data for China was last updated</span>
                <span class="hljs-attr">&quot;caseList&quot;</span>: [                                    <span class="hljs-comment">// caseList: a list in which every element is a dict</span>
                    &#123;
                        <span class="hljs-attr">&quot;confirmed&quot;</span>: <span class="hljs-string">&quot;1&quot;</span>,                        <span class="hljs-comment">// each dict holds every statistic for one Chinese province</span>
                        <span class="hljs-attr">&quot;died&quot;</span>: <span class="hljs-string">&quot;0&quot;</span>,
                        <span class="hljs-attr">&quot;crued&quot;</span>: <span class="hljs-string">&quot;1&quot;</span>,
                        <span class="hljs-attr">&quot;relativeTime&quot;</span>: <span class="hljs-string">&quot;1593792000&quot;</span>,
                        <span class="hljs-attr">&quot;confirmedRelative&quot;</span>: <span class="hljs-string">&quot;0&quot;</span>,
                        <span class="hljs-attr">&quot;diedRelative&quot;</span>: <span class="hljs-string">&quot;0&quot;</span>,
                        <span class="hljs-attr">&quot;curedRelative&quot;</span>: <span class="hljs-string">&quot;0&quot;</span>,
                        <span class="hljs-attr">&quot;curConfirm&quot;</span>: <span class="hljs-string">&quot;0&quot;</span>,
                        <span 
class="hljs-attr">&quot;curConfirmRelative&quot;</span>: <span class="hljs-string">&quot;0&quot;</span>,
                        <span class="hljs-attr">&quot;icuDisable&quot;</span>: <span class="hljs-string">&quot;1&quot;</span>,
                        <span class="hljs-attr">&quot;area&quot;</span>: <span class="hljs-string">&quot;西藏&quot;</span>,
                        <span class="hljs-attr">&quot;subList&quot;</span>: [                            <span class="hljs-comment">// subList: a list in which every element is a dict</span>
                            &#123;
                                <span class="hljs-attr">&quot;city&quot;</span>: <span class="hljs-string">&quot;拉萨&quot;</span>,                 <span class="hljs-comment">// each dict holds the data for one city of this province</span>
                                <span class="hljs-attr">&quot;confirmed&quot;</span>: <span class="hljs-string">&quot;1&quot;</span>,
                                <span class="hljs-attr">&quot;died&quot;</span>: <span class="hljs-string">&quot;0&quot;</span>,
                                <span class="hljs-attr">&quot;crued&quot;</span>: <span class="hljs-string">&quot;1&quot;</span>,
                                <span class="hljs-attr">&quot;confirmedRelative&quot;</span>: <span class="hljs-string">&quot;0&quot;</span>,
                                <span class="hljs-attr">&quot;curConfirm&quot;</span>: <span class="hljs-string">&quot;0&quot;</span>,
                                <span class="hljs-attr">&quot;cityCode&quot;</span>: <span class="hljs-string">&quot;100&quot;</span>
                            &#125;
                        ]
                    &#125;
                ],
                <span class="hljs-attr">&quot;caseOutsideList&quot;</span>: [                           <span class="hljs-comment">// caseOutsideList: a list in which every element is a dict</span>
                    &#123;
                        <span class="hljs-attr">&quot;confirmed&quot;</span>: <span class="hljs-string">&quot;241419&quot;</span>,                 <span class="hljs-comment">// each dict holds every statistic for one country</span>
                        <span class="hljs-attr">&quot;died&quot;</span>: <span class="hljs-string">&quot;34854&quot;</span>,
                        <span class="hljs-attr">&quot;crued&quot;</span>: <span class="hljs-string">&quot;191944&quot;</span>,
                        <span class="hljs-attr">&quot;relativeTime&quot;</span>: <span class="hljs-string">&quot;1593792000&quot;</span>,
                        <span class="hljs-attr">&quot;confirmedRelative&quot;</span>: <span class="hljs-string">&quot;223&quot;</span>,
                        <span class="hljs-attr">&quot;curConfirm&quot;</span>: <span class="hljs-string">&quot;14621&quot;</span>,
                        <span class="hljs-attr">&quot;icuDisable&quot;</span>: <span class="hljs-string">&quot;1&quot;</span>,
                        <span class="hljs-attr">&quot;area&quot;</span>: <span class="hljs-string">&quot;意大利&quot;</span>,
                        <span class="hljs-attr">&quot;subList&quot;</span>: [                          <span class="hljs-comment">// subList: a list in which every element is a dict</span>
                            &#123;
                                <span class="hljs-attr">&quot;city&quot;</span>: <span class="hljs-string">&quot;伦巴第&quot;</span>,              <span class="hljs-comment">// each dict holds the data for one city or region of this country</span>
                                <span class="hljs-attr">&quot;confirmed&quot;</span>: <span class="hljs-string">&quot;94318&quot;</span>,
                                <span class="hljs-attr">&quot;died&quot;</span>: <span class="hljs-string">&quot;16691&quot;</span>,
                                <span class="hljs-attr">&quot;crued&quot;</span>: <span class="hljs-string">&quot;68201&quot;</span>,
                                <span class="hljs-attr">&quot;curConfirm&quot;</span>: <span class="hljs-string">&quot;9426&quot;</span>
                            &#125;
                        ]
                    &#125;
                ],
                <span class="hljs-attr">&quot;summaryDataIn&quot;</span>: &#123;                           <span class="hljs-comment">// summaryDataIn: overall totals for China</span>
                    <span class="hljs-attr">&quot;confirmed&quot;</span>: <span class="hljs-string">&quot;85307&quot;</span>,
                    <span class="hljs-attr">&quot;died&quot;</span>: <span class="hljs-string">&quot;4648&quot;</span>,
                    <span class="hljs-attr">&quot;cured&quot;</span>: <span class="hljs-string">&quot;80144&quot;</span>,
                    <span class="hljs-attr">&quot;asymptomatic&quot;</span>: <span class="hljs-string">&quot;99&quot;</span>,
                    <span class="hljs-attr">&quot;asymptomaticRelative&quot;</span>: <span class="hljs-string">&quot;7&quot;</span>,
                    <span class="hljs-attr">&quot;unconfirmed&quot;</span>: <span class="hljs-string">&quot;7&quot;</span>,
                    <span class="hljs-attr">&quot;relativeTime&quot;</span>: <span class="hljs-string">&quot;1593792000&quot;</span>,
                    <span class="hljs-attr">&quot;confirmedRelative&quot;</span>: <span class="hljs-string">&quot;19&quot;</span>,
                    <span class="hljs-attr">&quot;unconfirmedRelative&quot;</span>: <span class="hljs-string">&quot;1&quot;</span>,
                    <span class="hljs-attr">&quot;curedRelative&quot;</span>: <span class="hljs-string">&quot;27&quot;</span>,
                    <span class="hljs-attr">&quot;diedRelative&quot;</span>: <span class="hljs-string">&quot;0&quot;</span>,
                    <span class="hljs-attr">&quot;icu&quot;</span>: <span class="hljs-string">&quot;6&quot;</span>,
                    <span class="hljs-attr">&quot;icuRelative&quot;</span>: <span class="hljs-string">&quot;0&quot;</span>,
                    <span class="hljs-attr">&quot;overseasInput&quot;</span>: <span class="hljs-string">&quot;1931&quot;</span>,
                    <span class="hljs-attr">&quot;unOverseasInputCumulative&quot;</span>: <span class="hljs-string">&quot;83375&quot;</span>,
                    <span class="hljs-attr">&quot;overseasInputRelative&quot;</span>: <span class="hljs-string">&quot;6&quot;</span>,
                    <span class="hljs-attr">&quot;unOverseasInputNewAdd&quot;</span>: <span class="hljs-string">&quot;13&quot;</span>,
                    <span class="hljs-attr">&quot;curConfirm&quot;</span>: <span class="hljs-string">&quot;515&quot;</span>,
                    <span class="hljs-attr">&quot;curConfirmRelative&quot;</span>: <span class="hljs-string">&quot;-8&quot;</span>,
                    <span class="hljs-attr">&quot;icuDisable&quot;</span>: <span class="hljs-string">&quot;1&quot;</span>
                &#125;,
                <span class="hljs-attr">&quot;summaryDataOut&quot;</span>: &#123;                           <span class="hljs-comment">// summaryDataOut: overall totals outside China</span>
                    <span class="hljs-attr">&quot;confirmed&quot;</span>: <span class="hljs-string">&quot;11302569&quot;</span>,
                    <span class="hljs-attr">&quot;died&quot;</span>: <span class="hljs-string">&quot;528977&quot;</span>,
                    <span class="hljs-attr">&quot;curConfirm&quot;</span>: <span class="hljs-string">&quot;4410601&quot;</span>,
                    <span class="hljs-attr">&quot;cured&quot;</span>: <span class="hljs-string">&quot;6362991&quot;</span>,
                    <span class="hljs-attr">&quot;confirmedRelative&quot;</span>: <span class="hljs-string">&quot;206165&quot;</span>,
                    <span class="hljs-attr">&quot;curedRelative&quot;</span>: <span class="hljs-string">&quot;190018&quot;</span>,
                    <span class="hljs-attr">&quot;diedRelative&quot;</span>: <span class="hljs-string">&quot;4876&quot;</span>,
                    <span class="hljs-attr">&quot;curConfirmRelative&quot;</span>: <span class="hljs-string">&quot;11271&quot;</span>,
                    <span class="hljs-attr">&quot;relativeTime&quot;</span>: <span class="hljs-string">&quot;1593792000&quot;</span>
                &#125;,
                <span class="hljs-attr">&quot;trend&quot;</span>: &#123;                                    <span class="hljs-comment">// trend: a dict with the daily data for China</span>
                    <span class="hljs-attr">&quot;updateDate&quot;</span>: [],                         <span class="hljs-comment">// dates</span>
                    <span class="hljs-attr">&quot;list&quot;</span>: [                                 <span class="hljs-comment">// list: each series name and its values</span>
                        &#123;
                            <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;确诊&quot;</span>,
                            <span class="hljs-attr">&quot;data&quot;</span>: []
                        &#125;,
                        &#123;
                            <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;疑似&quot;</span>,
                            <span class="hljs-attr">&quot;data&quot;</span>: []
                        &#125;,
                        &#123;
                            <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;治愈&quot;</span>,
                            <span class="hljs-attr">&quot;data&quot;</span>: []
                        &#125;,
                        &#123;
                            <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;死亡&quot;</span>,
                            <span class="hljs-attr">&quot;data&quot;</span>: []
                        &#125;,
                        &#123;
                            <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;新增确诊&quot;</span>,
                            <span class="hljs-attr">&quot;data&quot;</span>: []
                        &#125;,
                        &#123;
                            <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;新增疑似&quot;</span>,
                            <span class="hljs-attr">&quot;data&quot;</span>: []
                        &#125;,
                        &#123;
                            <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;新增治愈&quot;</span>,
                            <span class="hljs-attr">&quot;data&quot;</span>: []
                        &#125;,
                        &#123;
                            <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;新增死亡&quot;</span>,
                            <span class="hljs-attr">&quot;data&quot;</span>: []
                        &#125;,
                        &#123;
                            <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;累计境外输入&quot;</span>,
                            <span class="hljs-attr">&quot;data&quot;</span>: []
                        &#125;,
                        &#123;
                            <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;新增境外输入&quot;</span>,
                            <span class="hljs-attr">&quot;data&quot;</span>: []
                        &#125;
                    ]
                &#125;,
                <span class="hljs-attr">&quot;foreignLastUpdatedTime&quot;</span>: <span class="hljs-string">&quot;2020.07.05 16:13&quot;</span>,       <span class="hljs-comment">// time the overseas data was last updated</span>
                <span class="hljs-attr">&quot;globalList&quot;</span>: [                                     <span class="hljs-comment">// globalList: a list in which every element is a dict</span>
                    &#123;
                        <span class="hljs-attr">&quot;area&quot;</span>: <span class="hljs-string">&quot;亚洲&quot;</span>,                              <span class="hljs-comment">// grouped by continent</span>
                        <span class="hljs-attr">&quot;subList&quot;</span>: [                                <span class="hljs-comment">// subList: data for each country on this continent</span>
                            &#123;
                                <span class="hljs-attr">&quot;died&quot;</span>: <span class="hljs-string">&quot;52&quot;</span>,
                                <span class="hljs-attr">&quot;confirmed&quot;</span>: <span class="hljs-string">&quot;6159&quot;</span>,
                                <span class="hljs-attr">&quot;crued&quot;</span>: <span class="hljs-string">&quot;4809&quot;</span>,
                                <span class="hljs-attr">&quot;curConfirm&quot;</span>: <span class="hljs-string">&quot;1298&quot;</span>,
                                <span class="hljs-attr">&quot;confirmedRelative&quot;</span>: <span class="hljs-string">&quot;0&quot;</span>,
                                <span class="hljs-attr">&quot;relativeTime&quot;</span>: <span class="hljs-string">&quot;1593792000&quot;</span>,
                                <span class="hljs-attr">&quot;country&quot;</span>: <span class="hljs-string">&quot;塔吉克斯坦&quot;</span>
                            &#125;
                        ],
                        <span class="hljs-attr">&quot;died&quot;</span>: <span class="hljs-string">&quot;56556&quot;</span>,                            <span class="hljs-comment">// totals for the continent</span>
                        <span class="hljs-attr">&quot;crued&quot;</span>: <span class="hljs-string">&quot;1625562&quot;</span>,
                        <span class="hljs-attr">&quot;confirmed&quot;</span>: <span class="hljs-string">&quot;2447873&quot;</span>,
                        <span class="hljs-attr">&quot;curConfirm&quot;</span>: <span class="hljs-string">&quot;765755&quot;</span>,
                        <span class="hljs-attr">&quot;confirmedRelative&quot;</span>: <span class="hljs-string">&quot;60574&quot;</span>
                    &#125;,
                    &#123;
                        <span class="hljs-attr">&quot;area&quot;</span>: <span class="hljs-string">&quot;其他&quot;</span>,                             <span class="hljs-comment">// data for other special areas</span>
                        <span class="hljs-attr">&quot;subList&quot;</span>: [
                            &#123;
                                <span class="hljs-attr">&quot;died&quot;</span>: <span class="hljs-string">&quot;13&quot;</span>,
                                <span class="hljs-attr">&quot;confirmed&quot;</span>: <span class="hljs-string">&quot;712&quot;</span>,
                                <span class="hljs-attr">&quot;crued&quot;</span>: <span class="hljs-string">&quot;651&quot;</span>,
                                <span class="hljs-attr">&quot;curConfirm&quot;</span>: <span class="hljs-string">&quot;48&quot;</span>,
                                <span class="hljs-attr">&quot;confirmedRelative&quot;</span>: <span class="hljs-string">&quot;0&quot;</span>,
                                <span class="hljs-attr">&quot;relativeTime&quot;</span>: <span class="hljs-string">&quot;1593792000&quot;</span>,
                                <span class="hljs-attr">&quot;country&quot;</span>: <span class="hljs-string">&quot;钻石公主号邮轮&quot;</span>
                            &#125;
                        ],
                        <span class="hljs-attr">&quot;died&quot;</span>: <span class="hljs-string">&quot;13&quot;</span>,                              <span class="hljs-comment">// totals for the other special areas</span>
                        <span class="hljs-attr">&quot;crued&quot;</span>: <span class="hljs-string">&quot;651&quot;</span>,
                        <span class="hljs-attr">&quot;confirmed&quot;</span>: <span class="hljs-string">&quot;712&quot;</span>,
                        <span class="hljs-attr">&quot;curConfirm&quot;</span>: <span class="hljs-string">&quot;48&quot;</span>,
                        <span class="hljs-attr">&quot;confirmedRelative&quot;</span>: <span class="hljs-string">&quot;0&quot;</span>
                    &#125;,
                    &#123;
                        <span class="hljs-attr">&quot;area&quot;</span>: <span class="hljs-string">&quot;热门&quot;</span>,                            <span class="hljs-comment">// data for the most-watched countries</span>
                        <span class="hljs-attr">&quot;subList&quot;</span>: [
                            &#123;
                                <span class="hljs-attr">&quot;died&quot;</span>: <span class="hljs-string">&quot;5206&quot;</span>,
                                <span class="hljs-attr">&quot;confirmed&quot;</span>: <span class="hljs-string">&quot;204610&quot;</span>,
                                <span class="hljs-attr">&quot;crued&quot;</span>: <span class="hljs-string">&quot;179492&quot;</span>,
                                <span class="hljs-attr">&quot;curConfirm&quot;</span>: <span class="hljs-string">&quot;19912&quot;</span>,
                                <span class="hljs-attr">&quot;confirmedRelative&quot;</span>: <span class="hljs-string">&quot;1172&quot;</span>,
                                <span class="hljs-attr">&quot;relativeTime&quot;</span>: <span class="hljs-string">&quot;1593792000&quot;</span>,
                                <span class="hljs-attr">&quot;country&quot;</span>: <span class="hljs-string">&quot;土耳其&quot;</span>
                            &#125;
                        ],
                        <span class="hljs-attr">&quot;died&quot;</span>: <span class="hljs-string">&quot;528967&quot;</span>,                         <span class="hljs-comment">// totals for the most-watched countries</span>
                        <span class="hljs-attr">&quot;crued&quot;</span>: <span class="hljs-string">&quot;6362924&quot;</span>,
                        <span class="hljs-attr">&quot;confirmed&quot;</span>: <span class="hljs-string">&quot;11302357&quot;</span>,
                        <span class="hljs-attr">&quot;confirmedRelative&quot;</span>: <span class="hljs-string">&quot;216478&quot;</span>,
                        <span class="hljs-attr">&quot;curConfirm&quot;</span>: <span class="hljs-string">&quot;4410466&quot;</span>
                    &#125;],
                <span class="hljs-attr">&quot;allForeignTrend&quot;</span>: &#123;                            <span class="hljs-comment">// allForeignTrend: a dict with the daily data outside China</span>
                        <span class="hljs-attr">&quot;updateDate&quot;</span>: [],                       <span class="hljs-comment">// dates</span>
                        <span class="hljs-attr">&quot;list&quot;</span>: [                               <span class="hljs-comment">// list: each series name and its values</span>
                            &#123;
                                <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;累计确诊&quot;</span>,
                                <span class="hljs-attr">&quot;data&quot;</span>: []
                            &#125;,
                            &#123;
                                <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;治愈&quot;</span>,
                                <span class="hljs-attr">&quot;data&quot;</span>: []
                            &#125;,
                            &#123;
                                <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;死亡&quot;</span>,
                                <span class="hljs-attr">&quot;data&quot;</span>: []
                            &#125;,
                            &#123;
                                <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;现有确诊&quot;</span>,
                                <span class="hljs-attr">&quot;data&quot;</span>: []
                            &#125;,
                            &#123;
                                <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;新增确诊&quot;</span>,
                                <span class="hljs-attr">&quot;data&quot;</span>: []
                            &#125;
                        ]
                    &#125;,
                <span class="hljs-attr">&quot;topAddCountry&quot;</span>: [                    <span class="hljs-comment">// countries with the largest increase in confirmed cases</span>
                        &#123;
                            <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;美国&quot;</span>,
                            <span class="hljs-attr">&quot;value&quot;</span>: <span class="hljs-number">53162</span>
                        &#125;
                    ],
                <span class="hljs-attr">&quot;topOverseasInput&quot;</span>: [                <span class="hljs-comment">// provinces with the most imported cases</span>
                    &#123;
                        <span class="hljs-attr">&quot;name&quot;</span>: <span class="hljs-string">&quot;黑龙江&quot;</span>,
                        <span class="hljs-attr">&quot;value&quot;</span>: <span class="hljs-number">386</span>
                    &#125;
                ]
            &#125;
        ]
    &#125;&lt;/script&gt;</code></pre><h2><span id="4x00-zhu-han-shu-main"><font color="#FF0000">【4x00】Main Function main()</font></span></h2><p>Data acquisition, word cloud drawing, and map drawing are split into three files, <code>data_get.py</code>, <code>data_wordcloud.py</code>, and <code>data_map.py</code>, and a main file, <code>main.py</code>, calls the functions in them.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> data_get
<span class="hljs-keyword">import</span> data_wordcloud
<span class="hljs-keyword">import</span> data_map

<span class="hljs-comment"># Fetch the page once, then extract each data set from the parsed dict</span>
data_dict = data_get.init()
data_get.china_total_data(data_dict)
data_get.global_total_data(data_dict)
data_get.china_daily_data(data_dict)
data_get.foreign_daily_data(data_dict)

<span class="hljs-comment"># Draw the word clouds, then the maps and line charts</span>
data_wordcloud.china_wordcloud()
data_wordcloud.global_wordcloud()
data_map.all_map()</code></pre><h2><span id="5x00-shu-ju-huo-qu-mo-kuai-data-get"><font color="#FF0000">【5x00】Data Acquisition Module data_get</font></span></h2><h3><span id="5x01-chu-shi-hua-han-shu-init"><font color="#4876FF">【5x01】Initialization Function init()</font></span></h3><p>The XPath expression <code>//script[@id=&quot;captain-config&quot;]/text()</code> extracts the contents of the script tag, and <code>json.loads</code> converts that text into a dictionary for the other functions to use.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> json

<span class="hljs-keyword">import</span> openpyxl  <span class="hljs-comment"># used by the functions that follow in this module</span>
<span class="hljs-keyword">import</span> requests
<span class="hljs-keyword">from</span> lxml <span class="hljs-keyword">import</span> etree


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">init</span>():</span>
    headers = &#123;
        <span class="hljs-string">&#x27;user-agent&#x27;</span>: <span class="hljs-string">&#x27;Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.13 Safari/537.36&#x27;</span>
    &#125;
    url = <span class="hljs-string">&#x27;https://voice.baidu.com/act/newpneumonia/newpneumonia/&#x27;</span>
    response = requests.get(url=url, headers=headers)
    tree = etree.HTML(response.text)
    <span class="hljs-comment"># xpath() returns a list; its first item is the JSON text inside the script tag</span>
    dict1 = tree.xpath(<span class="hljs-string">&#x27;//script[@id=&quot;captain-config&quot;]/text()&#x27;</span>)
    <span class="hljs-built_in">print</span>(<span class="hljs-built_in">type</span>(dict1[<span class="hljs-number">0</span>]))
    dict2 = json.loads(dict1[<span class="hljs-number">0</span>])
    <span class="hljs-keyword">return</span> dict2</code></pre><h3><span id="5x02-zhong-guo-zong-shu-ju-china-total-data"><font color="#4876FF">【5x02】China Totals china_total_data()</font></span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">china_total_data</span>(<span class="hljs-params">data</span>):</span>
    <span class="hljs-string">&quot;&quot;&quot;</span>
<span class="hljs-string">    1. Epidemic data for China&#x27;s provinces / municipalities / autonomous regions / administrative regions</span>
<span class="hljs-string">    Province / municipality / autonomous region / administrative region: area</span>
<span class="hljs-string">    Current confirmed:             curConfirm</span>
<span class="hljs-string">    Cumulative confirmed:          confirmed</span>
<span class="hljs-string">    Cumulative recovered:          crued</span>
<span class="hljs-string">    Cumulative deaths:             died</span>
<span class="hljs-string">    Current confirmed increment:   curConfirmRelative</span>
<span class="hljs-string">    Cumulative confirmed increment: confirmedRelative</span>
<span class="hljs-string">    Cumulative recovered increment: curedRelative</span>
<span 
class="hljs-string">    Cumulative deaths increment:   diedRelative</span>
<span class="hljs-string">    &quot;&quot;&quot;</span>
    wb = openpyxl.Workbook()            <span class="hljs-comment"># create the workbook</span>
    ws_china = wb.active                <span class="hljs-comment"># get the active worksheet</span>
    ws_china.title = <span class="hljs-string">&quot;中国省份疫情数据&quot;</span>   <span class="hljs-comment"># name the worksheet (&quot;China province epidemic data&quot;)</span>
    <span class="hljs-comment"># Header row: region, current confirmed, cumulative confirmed / recovered / deaths, then the four increments</span>
    ws_china.append([<span class="hljs-string">&#x27;省/直辖市/自治区/行政区&#x27;</span>, <span class="hljs-string">&#x27;现有确诊&#x27;</span>, <span class="hljs-string">&#x27;累计确诊&#x27;</span>, <span class="hljs-string">&#x27;累计治愈&#x27;</span>,
                     <span class="hljs-string">&#x27;累计死亡&#x27;</span>, <span class="hljs-string">&#x27;现有确诊增量&#x27;</span>, <span class="hljs-string">&#x27;累计确诊增量&#x27;</span>,
                     <span class="hljs-string">&#x27;累计治愈增量&#x27;</span>, <span class="hljs-string">&#x27;累计死亡增量&#x27;</span>])
    china = data[<span class="hljs-string">&#x27;component&#x27;</span>][<span class="hljs-number">0</span>][<span class="hljs-string">&#x27;caseList&#x27;</span>]
    <span class="hljs-keyword">for</span> province <span class="hljs-keyword">in</span> china:
        ws_china.append([province[<span class="hljs-string">&#x27;area&#x27;</span>],
                        province[<span class="hljs-string">&#x27;curConfirm&#x27;</span>],
                        province[<span class="hljs-string">&#x27;confirmed&#x27;</span>],
                        province[<span class="hljs-string">&#x27;crued&#x27;</span>],
                        province[<span class="hljs-string">&#x27;died&#x27;</span>],
                        province[<span class="hljs-string">&#x27;curConfirmRelative&#x27;</span>],
                        province[<span class="hljs-string">&#x27;confirmedRelative&#x27;</span>],
                        province[<span class="hljs-string">&#x27;curedRelative&#x27;</span>],
                        province[<span class="hljs-string">&#x27;diedRelative&#x27;</span>]])

    <span class="hljs-string">&quot;&quot;&quot;</span>
<span class="hljs-string">    2. Epidemic data for Chinese cities</span>
<span class="hljs-string">    City: city</span>
<span class="hljs-string">    Current confirmed: curConfirm</span>
<span class="hljs-string">    Cumulative confirmed: confirmed</span>
<span class="hljs-string">    Cumulative recovered: crued</span>
<span class="hljs-string">    Cumulative deaths: died</span>
<span class="hljs-string">    Cumulative confirmed increment: confirmedRelative</span>
<span class="hljs-string">    &quot;&quot;&quot;</span>
    ws_city = wb.create_sheet(<span class="hljs-string">&#x27;中国城市疫情数据&#x27;</span>)
    ws_city.append([<span class="hljs-string">&#x27;城市&#x27;</span>, <span class="hljs-string">&#x27;现有确诊&#x27;</span>, <span class="hljs-string">&#x27;累计确诊&#x27;</span>,
                    <span class="hljs-string">&#x27;累计治愈&#x27;</span>, <span class="hljs-string">&#x27;累计死亡&#x27;</span>, <span class="hljs-string">&#x27;累计确诊增量&#x27;</span>])
    <span class="hljs-keyword">for</span> province <span class="hljs-keyword">in</span> china:
        <span class="hljs-keyword">for</span> city <span class="hljs-keyword">in</span> province[<span class="hljs-string">&#x27;subList&#x27;</span>]:
            <span class="hljs-comment"># Some cities have no curConfirm field, so default it to 0; replace empty crued and died values with 0 as well</span>
            <span class="hljs-keyword">if</span> <span class="hljs-string">&#x27;curConfirm&#x27;</span> <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> city:
                city[<span class="hljs-string">&#x27;curConfirm&#x27;</span>] = <span class="hljs-string">&#x27;0&#x27;</span>
            <span class="hljs-keyword">if</span> city[<span class="hljs-string">&#x27;crued&#x27;</span>] == <span class="hljs-string">&#x27;&#x27;</span>:
                city[<span class="hljs-string">&#x27;crued&#x27;</span>] = <span class="hljs-string">&#x27;0&#x27;</span>
            <span class="hljs-keyword">if</span> city[<span class="hljs-string">&#x27;died&#x27;</span>] == <span class="hljs-string">&#x27;&#x27;</span>:
                city[<span 
class="hljs-string">&#x27;died&#x27;</span>] = <span class="hljs-string">&#x27;0&#x27;</span>
            <span class="hljs-comment"># curConfirm was defaulted to &#x27;0&#x27; above when the field is missing</span>
            ws_city.append([city[<span class="hljs-string">&#x27;city&#x27;</span>], city[<span class="hljs-string">&#x27;curConfirm&#x27;</span>], city[<span class="hljs-string">&#x27;confirmed&#x27;</span>],
                           city[<span class="hljs-string">&#x27;crued&#x27;</span>], city[<span class="hljs-string">&#x27;died&#x27;</span>], city[<span class="hljs-string">&#x27;confirmedRelative&#x27;</span>]])

    <span class="hljs-string">&quot;&quot;&quot;</span>
<span class="hljs-string">    3. Time the China data was last updated: mapLastUpdatedTime</span>
<span class="hljs-string">    &quot;&quot;&quot;</span>
    time_domestic = data[<span class="hljs-string">&#x27;component&#x27;</span>][<span class="hljs-number">0</span>][<span class="hljs-string">&#x27;mapLastUpdatedTime&#x27;</span>]
    ws_time = wb.create_sheet(<span class="hljs-string">&#x27;中国疫情数据更新时间&#x27;</span>)
    ws_time.column_dimensions[<span class="hljs-string">&#x27;A&#x27;</span>].width = <span class="hljs-number">22</span>  <span class="hljs-comment"># widen the column</span>
    ws_time.append([<span class="hljs-string">&#x27;中国疫情数据更新时间&#x27;</span>])
    ws_time.append([time_domestic])

    wb.save(<span class="hljs-string">&#x27;COVID-19-China.xlsx&#x27;</span>)
    <span class="hljs-built_in">print</span>(<span class="hljs-string">&#x27;中国疫情数据已保存至 COVID-19-China.xlsx！&#x27;</span>)</code></pre><h3><span id="5x03-quan-qiu-zong-shu-ju-global-total-data"><font color="#4876FF">【5x03】Global Totals global_total_data()</font></span></h3><p>After extracting the global totals, I found while drawing the map that they contain no data for China, so China's figures have to be inserted into the Excel file separately when the global data is written.</p><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">global_total_data</span>(<span class="hljs-params">data</span>):</span>
    <span class="hljs-string">&quot;&quot;&quot;</span>
<span class="hljs-string">    1. Epidemic data for every country</span>
<span class="hljs-string">    Country: country (area in caseOutsideList)</span>
<span class="hljs-string">    Current confirmed: curConfirm</span>
<span class="hljs-string">    Cumulative confirmed: confirmed</span>
<span class="hljs-string">    Cumulative recovered: crued</span>
<span class="hljs-string">    Cumulative deaths: died</span>
<span class="hljs-string">    Cumulative confirmed increment: confirmedRelative</span>
<span class="hljs-string">    &quot;&quot;&quot;</span>
    wb = openpyxl.Workbook()
    ws_global = wb.active
    ws_global.title = <span class="hljs-string">&quot;全球各国疫情数据&quot;</span>

    <span class="hljs-comment"># Save the data by country</span>
    countries = data[<span class="hljs-string">&#x27;component&#x27;</span>][<span class="hljs-number">0</span>][<span class="hljs-string">&#x27;caseOutsideList&#x27;</span>]
    ws_global.append([<span class="hljs-string">&#x27;国家&#x27;</span>, <span class="hljs-string">&#x27;现有确诊&#x27;</span>, <span class="hljs-string">&#x27;累计确诊&#x27;</span>, <span class="hljs-string">&#x27;累计治愈&#x27;</span>, <span class="hljs-string">&#x27;累计死亡&#x27;</span>, <span class="hljs-string">&#x27;累计确诊增量&#x27;</span>])
    <span class="hljs-keyword">for</span> country <span class="hljs-keyword">in</span> countries:
        ws_global.append([country[<span class="hljs-string">&#x27;area&#x27;</span>],
                          country[<span class="hljs-string">&#x27;curConfirm&#x27;</span>],
                          country[<span class="hljs-string">&#x27;confirmed&#x27;</span>],
                          country[<span class="hljs-string">&#x27;crued&#x27;</span>],
                          country[<span class="hljs-string">&#x27;died&#x27;</span>],
                          country[<span class="hljs-string">&#x27;confirmedRelative&#x27;</span>]])

    <span class="hljs-comment"># Save the data by continent</span>
    continent = data[<span class="hljs-string">&#x27;component&#x27;</span>][<span class="hljs-number">0</span>][<span class="hljs-string">&#x27;globalList&#x27;</span>]
    <span class="hljs-keyword">for</span> area <span class="hljs-keyword">in</span> continent:
        ws_foreign = wb.create_sheet(area[<span class="hljs-string">&#x27;area&#x27;</span>] 
+ <span class="hljs-string">&#x27;疫情数据&#x27;</span>)        ws_foreign.append([<span class="hljs-string">&#x27;国家&#x27;</span>, <span class="hljs-string">&#x27;现有确诊&#x27;</span>, <span class="hljs-string">&#x27;累计确诊&#x27;</span>, <span class="hljs-string">&#x27;累计治愈&#x27;</span>, <span class="hljs-string">&#x27;累计死亡&#x27;</span>, <span class="hljs-string">&#x27;累计确诊增量&#x27;</span>])        <span class="hljs-keyword">for</span> country <span class="hljs-keyword">in</span> area[<span class="hljs-string">&#x27;subList&#x27;</span>]:            ws_foreign.append([country[<span class="hljs-string">&#x27;country&#x27;</span>],                               country[<span class="hljs-string">&#x27;curConfirm&#x27;</span>],                               country[<span class="hljs-string">&#x27;confirmed&#x27;</span>],                               country[<span class="hljs-string">&#x27;crued&#x27;</span>],                               country[<span class="hljs-string">&#x27;died&#x27;</span>],                               country[<span class="hljs-string">&#x27;confirmedRelative&#x27;</span>]])    <span class="hljs-comment"># 在“全球各国疫情数据”和“亚洲疫情数据”两张表中写入中国疫情数据</span>    ws1, ws2 = wb[<span class="hljs-string">&#x27;全球各国疫情数据&#x27;</span>], wb[<span class="hljs-string">&#x27;亚洲疫情数据&#x27;</span>]    original_data = data[<span class="hljs-string">&#x27;component&#x27;</span>][<span class="hljs-number">0</span>][<span class="hljs-string">&#x27;summaryDataIn&#x27;</span>]    add_china_data = [<span class="hljs-string">&#x27;中国&#x27;</span>,                      original_data[<span class="hljs-string">&#x27;curConfirm&#x27;</span>],                      original_data[<span class="hljs-string">&#x27;confirmed&#x27;</span>],                      original_data[<span class="hljs-string">&#x27;cured&#x27;</span>],                      original_data[<span class="hljs-string">&#x27;died&#x27;</span>],                      original_data[<span 
class="hljs-string">&#x27;confirmedRelative&#x27;</span>]]    ws1.append(add_china_data)    ws2.append(add_china_data)    <span class="hljs-string">&quot;&quot;&quot;</span><span class="hljs-string">    2、全球疫情数据更新时间：foreignLastUpdatedTime</span><span class="hljs-string">    &quot;&quot;&quot;</span>    time_foreign = data[<span class="hljs-string">&#x27;component&#x27;</span>][<span class="hljs-number">0</span>][<span class="hljs-string">&#x27;foreignLastUpdatedTime&#x27;</span>]    ws_time = wb.create_sheet(<span class="hljs-string">&#x27;全球疫情数据更新时间&#x27;</span>)    ws_time.column_dimensions[<span class="hljs-string">&#x27;A&#x27;</span>].width = <span class="hljs-number">22</span>  <span class="hljs-comment"># 调整列宽</span>    ws_time.append([<span class="hljs-string">&#x27;全球疫情数据更新时间&#x27;</span>])    ws_time.append([time_foreign])    wb.save(<span class="hljs-string">&#x27;COVID-19-Global.xlsx&#x27;</span>)    <span class="hljs-built_in">print</span>(<span class="hljs-string">&#x27;全球疫情数据已保存至 COVID-19-Global.xlsx！&#x27;</span>)</code></pre><h3><span id="5x04-zhong-guo-mei-ri-shu-ju-china-daily-data"><font color="#4876FF">【5x04】China daily data: china_daily_data()</font></span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">china_daily_data</span>(<span class="hljs-params">data</span>):</span>    <span class="hljs-string">&quot;&quot;&quot;</span><span class="hljs-string">    i_dict = data[&#x27;component&#x27;][0][&#x27;trend&#x27;]</span><span class="hljs-string">    i_dict[&#x27;updateDate&#x27;]：日期</span><span class="hljs-string">    i_dict[&#x27;list&#x27;][0]：确诊</span><span class="hljs-string">    i_dict[&#x27;list&#x27;][1]：疑似</span><span class="hljs-string">    i_dict[&#x27;list&#x27;][2]：治愈</span><span class="hljs-string">    i_dict[&#x27;list&#x27;][3]：死亡</span><span class="hljs-string">    i_dict[&#x27;list&#x27;][4]：新增确诊</span><span class="hljs-string">    
i_dict[&#x27;list&#x27;][5]：新增疑似</span><span class="hljs-string">    i_dict[&#x27;list&#x27;][6]：新增治愈</span><span class="hljs-string">    i_dict[&#x27;list&#x27;][7]：新增死亡</span><span class="hljs-string">    i_dict[&#x27;list&#x27;][8]：累计境外输入</span><span class="hljs-string">    i_dict[&#x27;list&#x27;][9]：新增境外输入</span><span class="hljs-string">    &quot;&quot;&quot;</span>    ccd_dict = data[<span class="hljs-string">&#x27;component&#x27;</span>][<span class="hljs-number">0</span>][<span class="hljs-string">&#x27;trend&#x27;</span>]    update_date = ccd_dict[<span class="hljs-string">&#x27;updateDate&#x27;</span>]              <span class="hljs-comment"># 日期</span>    china_confirmed = ccd_dict[<span class="hljs-string">&#x27;list&#x27;</span>][<span class="hljs-number">0</span>][<span class="hljs-string">&#x27;data&#x27;</span>]     <span class="hljs-comment"># 每日累计确诊数据</span>    china_crued = ccd_dict[<span class="hljs-string">&#x27;list&#x27;</span>][<span class="hljs-number">2</span>][<span class="hljs-string">&#x27;data&#x27;</span>]         <span class="hljs-comment"># 每日累计治愈数据</span>    china_died = ccd_dict[<span class="hljs-string">&#x27;list&#x27;</span>][<span class="hljs-number">3</span>][<span class="hljs-string">&#x27;data&#x27;</span>]          <span class="hljs-comment"># 每日累计死亡数据</span>    wb = openpyxl.load_workbook(<span class="hljs-string">&#x27;COVID-19-China.xlsx&#x27;</span>)    <span class="hljs-comment"># 写入每日累计确诊数据</span>    ws_china_confirmed = wb.create_sheet(<span class="hljs-string">&#x27;中国每日累计确诊数据&#x27;</span>)    ws_china_confirmed.append([<span class="hljs-string">&#x27;日期&#x27;</span>, <span class="hljs-string">&#x27;数据&#x27;</span>])    <span class="hljs-keyword">for</span> data <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(update_date, china_confirmed):        ws_china_confirmed.append(data)    <span class="hljs-comment"># 写入每日累计治愈数据</span>    ws_china_crued = wb.create_sheet(<span 
class="hljs-string">&#x27;中国每日累计治愈数据&#x27;</span>)    ws_china_crued.append([<span class="hljs-string">&#x27;日期&#x27;</span>, <span class="hljs-string">&#x27;数据&#x27;</span>])    <span class="hljs-keyword">for</span> data <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(update_date, china_crued):        ws_china_crued.append(data)    <span class="hljs-comment"># 写入每日累计死亡数据</span>    ws_china_died = wb.create_sheet(<span class="hljs-string">&#x27;中国每日累计死亡数据&#x27;</span>)    ws_china_died.append([<span class="hljs-string">&#x27;日期&#x27;</span>, <span class="hljs-string">&#x27;数据&#x27;</span>])    <span class="hljs-keyword">for</span> data <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(update_date, china_died):        ws_china_died.append(data)    wb.save(<span class="hljs-string">&#x27;COVID-19-China.xlsx&#x27;</span>)    <span class="hljs-built_in">print</span>(<span class="hljs-string">&#x27;中国每日累计确诊/治愈/死亡数据已保存至 COVID-19-China.xlsx！&#x27;</span>)</code></pre><h3><span id="5x05-jing-wai-mei-ri-shu-ju-foreign-daily-data"><font color="#4876FF">【5x05】Overseas daily data: foreign_daily_data()</font></span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">foreign_daily_data</span>(<span class="hljs-params">data</span>):</span>    <span class="hljs-string">&quot;&quot;&quot;</span><span class="hljs-string">    te_dict = data[&#x27;component&#x27;][0][&#x27;allForeignTrend&#x27;]</span><span class="hljs-string">    te_dict[&#x27;updateDate&#x27;]：日期</span><span class="hljs-string">    te_dict[&#x27;list&#x27;][0]：累计确诊</span><span class="hljs-string">    te_dict[&#x27;list&#x27;][1]：治愈</span><span class="hljs-string">    te_dict[&#x27;list&#x27;][2]：死亡</span><span class="hljs-string">    te_dict[&#x27;list&#x27;][3]：现有确诊</span><span class="hljs-string">    te_dict[&#x27;list&#x27;][4]：新增确诊</span><span class="hljs-string">    &quot;&quot;&quot;</span>    
te_dict = data[<span class="hljs-string">&#x27;component&#x27;</span>][<span class="hljs-number">0</span>][<span class="hljs-string">&#x27;allForeignTrend&#x27;</span>]    update_date = te_dict[<span class="hljs-string">&#x27;updateDate&#x27;</span>]                <span class="hljs-comment"># 日期</span>    foreign_confirmed = te_dict[<span class="hljs-string">&#x27;list&#x27;</span>][<span class="hljs-number">0</span>][<span class="hljs-string">&#x27;data&#x27;</span>]     <span class="hljs-comment"># 每日累计确诊数据</span>    foreign_crued = te_dict[<span class="hljs-string">&#x27;list&#x27;</span>][<span class="hljs-number">1</span>][<span class="hljs-string">&#x27;data&#x27;</span>]         <span class="hljs-comment"># 每日累计治愈数据</span>    foreign_died = te_dict[<span class="hljs-string">&#x27;list&#x27;</span>][<span class="hljs-number">2</span>][<span class="hljs-string">&#x27;data&#x27;</span>]          <span class="hljs-comment"># 每日累计死亡数据</span>    wb = openpyxl.load_workbook(<span class="hljs-string">&#x27;COVID-19-Global.xlsx&#x27;</span>)    <span class="hljs-comment"># 写入每日累计确诊数据</span>    ws_foreign_confirmed = wb.create_sheet(<span class="hljs-string">&#x27;境外每日累计确诊数据&#x27;</span>)    ws_foreign_confirmed.append([<span class="hljs-string">&#x27;日期&#x27;</span>, <span class="hljs-string">&#x27;数据&#x27;</span>])    <span class="hljs-keyword">for</span> data <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(update_date, foreign_confirmed):        ws_foreign_confirmed.append(data)    <span class="hljs-comment"># 写入累计治愈数据</span>    ws_foreign_crued = wb.create_sheet(<span class="hljs-string">&#x27;境外每日累计治愈数据&#x27;</span>)    ws_foreign_crued.append([<span class="hljs-string">&#x27;日期&#x27;</span>, <span class="hljs-string">&#x27;数据&#x27;</span>])    <span class="hljs-keyword">for</span> data <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(update_date, foreign_crued):        ws_foreign_crued.append(data)    <span 
class="hljs-comment"># 写入累计死亡数据</span>    ws_foreign_died = wb.create_sheet(<span class="hljs-string">&#x27;境外每日累计死亡数据&#x27;</span>)    ws_foreign_died.append([<span class="hljs-string">&#x27;日期&#x27;</span>, <span class="hljs-string">&#x27;数据&#x27;</span>])    <span class="hljs-keyword">for</span> data <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(update_date, foreign_died):        ws_foreign_died.append(data)    wb.save(<span class="hljs-string">&#x27;COVID-19-Global.xlsx&#x27;</span>)    <span class="hljs-built_in">print</span>(<span class="hljs-string">&#x27;境外每日累计确诊/治愈/死亡数据已保存至 COVID-19-Global.xlsx！&#x27;</span>)</code></pre><h2><span id="6x00-ci-yun-tu-hui-zhi-mo-kuai-data-wordcloud"><font color="#FF0000">【6x00】Word cloud module: data_wordcloud</font></span></h2><h3><span id="6x01-zhong-guo-lei-ji-que-zhen-ci-yun-tu-foreign-daily-data"><font color="#4876FF">【6x01】Word cloud of cumulative confirmed cases in China: china_wordcloud()</font></span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">china_wordcloud</span>():</span>    wb = openpyxl.load_workbook(<span class="hljs-string">&#x27;COVID-19-China.xlsx&#x27;</span>)  <span class="hljs-comment"># 获取已有的xlsx文件</span>    ws_china = wb[<span class="hljs-string">&#x27;中国省份疫情数据&#x27;</span>]                     <span class="hljs-comment"># 获取中国省份疫情数据表</span>    ws_china.delete_rows(<span class="hljs-number">1</span>)                             <span class="hljs-comment"># 删除第一行</span>    china_dict = &#123;&#125;                                     <span class="hljs-comment"># 将省份及其累计确诊按照键值对形式储存在字典中</span>    <span class="hljs-keyword">for</span> data <span class="hljs-keyword">in</span> ws_china.values:        china_dict[data[<span class="hljs-number">0</span>]] = <span class="hljs-built_in">int</span>(data[<span class="hljs-number">2</span>])    word_cloud = wordcloud.WordCloud(font_path=<span 
class="hljs-string">&#x27;C:/Windows/Fonts/simsun.ttc&#x27;</span>,                                     background_color=<span class="hljs-string">&#x27;#CDC9C9&#x27;</span>,                                     min_font_size=<span class="hljs-number">15</span>,                                     width=<span class="hljs-number">900</span>, height=<span class="hljs-number">500</span>)    word_cloud.generate_from_frequencies(china_dict)    word_cloud.to_file(<span class="hljs-string">&#x27;WordCloud-China.png&#x27;</span>)    <span class="hljs-built_in">print</span>(<span class="hljs-string">&#x27;中国省份疫情词云图绘制完毕！&#x27;</span>)</code></pre><h3><span id="6x02-quan-qiu-lei-ji-que-zhen-ci-yun-tu-foreign-daily-data"><font color="#4876FF">【6x02】Word cloud of cumulative confirmed cases worldwide: global_wordcloud()</font></span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">global_wordcloud</span>():</span>    wb = openpyxl.load_workbook(<span class="hljs-string">&#x27;COVID-19-Global.xlsx&#x27;</span>)    ws_global = wb[<span class="hljs-string">&#x27;全球各国疫情数据&#x27;</span>]    ws_global.delete_rows(<span class="hljs-number">1</span>)    global_dict = &#123;&#125;    <span class="hljs-keyword">for</span> data <span class="hljs-keyword">in</span> ws_global.values:        global_dict[data[<span class="hljs-number">0</span>]] = <span class="hljs-built_in">int</span>(data[<span class="hljs-number">2</span>])    word_cloud = wordcloud.WordCloud(font_path=<span class="hljs-string">&#x27;C:/Windows/Fonts/simsun.ttc&#x27;</span>,                                     background_color=<span class="hljs-string">&#x27;#CDC9C9&#x27;</span>,                                     width=<span class="hljs-number">900</span>, height=<span class="hljs-number">500</span>)    word_cloud.generate_from_frequencies(global_dict)    word_cloud.to_file(<span class="hljs-string">&#x27;WordCloud-Global.png&#x27;</span>)    <span 
class="hljs-built_in">print</span>(<span class="hljs-string">&#x27;全球各国疫情词云图绘制完毕！&#x27;</span>)</code></pre><hr><pre><code class="hljs yaml"><span class="hljs-string">This is anti-scraping filler text; readers can ignore it. </span><span class="hljs-string">This article was first published on CSDN by ITBOB. </span><span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/ </span><span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/107140534 </span><span class="hljs-string">Reposting without authorization is prohibited. Malicious reposters bear the consequences. Respect original work and do not plagiarize.</span></code></pre><hr><h2><span id="7x00-di-tu-hui-zhi-mo-kuai-data-map"><font color="#FF0000">【7x00】Map module: data_map</font></span></h2><h3><span id="7x01-zhong-guo-lei-ji-que-zhen-di-tu-china-total-map"><font color="#4876FF">【7x01】Map of cumulative confirmed cases in China: china_total_map()</font></span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">china_total_map</span>():</span>    wb = openpyxl.load_workbook(<span class="hljs-string">&#x27;COVID-19-China.xlsx&#x27;</span>)  <span class="hljs-comment"># 获取已有的xlsx文件</span>    ws_time = wb[<span class="hljs-string">&#x27;中国疫情数据更新时间&#x27;</span>]                   <span class="hljs-comment"># 获取文件中中国疫情数据更新时间表</span>    ws_data = wb[<span class="hljs-string">&#x27;中国省份疫情数据&#x27;</span>]                      <span class="hljs-comment"># 获取文件中中国省份疫情数据表</span>    ws_data.delete_rows(<span class="hljs-number">1</span>)                              <span class="hljs-comment"># 删除第一行</span>    province = []                                       <span class="hljs-comment"># 省份</span>    curconfirm = []                                     <span class="hljs-comment"># 累计确诊</span>    <span class="hljs-keyword">for</span> data <span class="hljs-keyword">in</span> ws_data.values:        province.append(data[<span class="hljs-number">0</span>])        curconfirm.append(data[<span class="hljs-number">2</span>])    time_china = ws_time[<span 
class="hljs-string">&#x27;A2&#x27;</span>].value                    <span class="hljs-comment"># 更新时间</span>    <span class="hljs-comment"># 设置分级颜色</span>    pieces = [        &#123;<span class="hljs-string">&#x27;max&#x27;</span>: <span class="hljs-number">0</span>, <span class="hljs-string">&#x27;min&#x27;</span>: <span class="hljs-number">0</span>, <span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;0&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;#FFFFFF&#x27;</span>&#125;,        &#123;<span class="hljs-string">&#x27;max&#x27;</span>: <span class="hljs-number">9</span>, <span class="hljs-string">&#x27;min&#x27;</span>: <span class="hljs-number">1</span>, <span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;1-9&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;#FFE5DB&#x27;</span>&#125;,        &#123;<span class="hljs-string">&#x27;max&#x27;</span>: <span class="hljs-number">99</span>, <span class="hljs-string">&#x27;min&#x27;</span>: <span class="hljs-number">10</span>, <span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;10-99&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;#FF9985&#x27;</span>&#125;,        &#123;<span class="hljs-string">&#x27;max&#x27;</span>: <span class="hljs-number">999</span>, <span class="hljs-string">&#x27;min&#x27;</span>: <span class="hljs-number">100</span>, <span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;100-999&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;#F57567&#x27;</span>&#125;,        &#123;<span class="hljs-string">&#x27;max&#x27;</span>: <span class="hljs-number">9999</span>, <span class="hljs-string">&#x27;min&#x27;</span>: <span class="hljs-number">1000</span>, <span 
class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;1000-9999&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;#E64546&#x27;</span>&#125;,        &#123;<span class="hljs-string">&#x27;max&#x27;</span>: <span class="hljs-number">99999</span>, <span class="hljs-string">&#x27;min&#x27;</span>: <span class="hljs-number">10000</span>, <span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;≧10000&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;#B80909&#x27;</span>&#125;    ]    <span class="hljs-comment"># 绘制地图</span>    ct_map = (        Map()        .add(series_name=<span class="hljs-string">&#x27;累计确诊人数&#x27;</span>, data_pair=[<span class="hljs-built_in">list</span>(z) <span class="hljs-keyword">for</span> z <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(province, curconfirm)], maptype=<span class="hljs-string">&quot;china&quot;</span>)        .set_global_opts(            title_opts=opts.TitleOpts(title=<span class="hljs-string">&quot;中国疫情数据（累计确诊）&quot;</span>,                                      subtitle=<span class="hljs-string">&#x27;数据更新至：&#x27;</span> + time_china + <span class="hljs-string">&#x27;\n\n来源：百度疫情实时大数据报告&#x27;</span>),            visualmap_opts=opts.VisualMapOpts(max_=<span class="hljs-number">300</span>, is_piecewise=<span class="hljs-literal">True</span>, pieces=pieces)        )    )    <span class="hljs-keyword">return</span> ct_map</code></pre><h3><span id="7x02-quan-qiu-lei-ji-que-zhen-di-tu-global-total-map"><font color="#4876FF">【7x02】Map of cumulative confirmed cases worldwide: global_total_map()</font></span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">global_total_map</span>():</span>    wb = openpyxl.load_workbook(<span class="hljs-string">&#x27;COVID-19-Global.xlsx&#x27;</span>)    ws_time = wb[<span 
class="hljs-string">&#x27;全球疫情数据更新时间&#x27;</span>]    ws_data = wb[<span class="hljs-string">&#x27;全球各国疫情数据&#x27;</span>]    ws_data.delete_rows(<span class="hljs-number">1</span>)    country = []                        <span class="hljs-comment"># 国家</span>    curconfirm = []                     <span class="hljs-comment"># 累计确诊</span>    <span class="hljs-keyword">for</span> data <span class="hljs-keyword">in</span> ws_data.values:        country.append(data[<span class="hljs-number">0</span>])        curconfirm.append(data[<span class="hljs-number">2</span>])    time_global = ws_time[<span class="hljs-string">&#x27;A2&#x27;</span>].value   <span class="hljs-comment"># 更新时间</span>    <span class="hljs-comment"># 国家名称中英文映射表</span>    name_map = &#123;          <span class="hljs-string">&quot;Somalia&quot;</span>: <span class="hljs-string">&quot;索马里&quot;</span>,          <span class="hljs-string">&quot;Liechtenstein&quot;</span>: <span class="hljs-string">&quot;列支敦士登&quot;</span>,          <span class="hljs-string">&quot;Morocco&quot;</span>: <span class="hljs-string">&quot;摩洛哥&quot;</span>,          <span class="hljs-string">&quot;W. 
Sahara&quot;</span>: <span class="hljs-string">&quot;西撒哈拉&quot;</span>,          <span class="hljs-string">&quot;Serbia&quot;</span>: <span class="hljs-string">&quot;塞尔维亚&quot;</span>,          <span class="hljs-string">&quot;Afghanistan&quot;</span>: <span class="hljs-string">&quot;阿富汗&quot;</span>,          <span class="hljs-string">&quot;Angola&quot;</span>: <span class="hljs-string">&quot;安哥拉&quot;</span>,          <span class="hljs-string">&quot;Albania&quot;</span>: <span class="hljs-string">&quot;阿尔巴尼亚&quot;</span>,          <span class="hljs-string">&quot;Andorra&quot;</span>: <span class="hljs-string">&quot;安道尔共和国&quot;</span>,          <span class="hljs-string">&quot;United Arab Emirates&quot;</span>: <span class="hljs-string">&quot;阿拉伯联合酋长国&quot;</span>,          <span class="hljs-string">&quot;Argentina&quot;</span>: <span class="hljs-string">&quot;阿根廷&quot;</span>,          <span class="hljs-string">&quot;Armenia&quot;</span>: <span class="hljs-string">&quot;亚美尼亚&quot;</span>,          <span class="hljs-string">&quot;Australia&quot;</span>: <span class="hljs-string">&quot;澳大利亚&quot;</span>,          <span class="hljs-string">&quot;Austria&quot;</span>: <span class="hljs-string">&quot;奥地利&quot;</span>,          <span class="hljs-string">&quot;Azerbaijan&quot;</span>: <span class="hljs-string">&quot;阿塞拜疆&quot;</span>,          <span class="hljs-string">&quot;Burundi&quot;</span>: <span class="hljs-string">&quot;布隆迪&quot;</span>,          <span class="hljs-string">&quot;Belgium&quot;</span>: <span class="hljs-string">&quot;比利时&quot;</span>,          <span class="hljs-string">&quot;Benin&quot;</span>: <span class="hljs-string">&quot;贝宁&quot;</span>,          <span class="hljs-string">&quot;Burkina Faso&quot;</span>: <span class="hljs-string">&quot;布基纳法索&quot;</span>,          <span class="hljs-string">&quot;Bangladesh&quot;</span>: <span class="hljs-string">&quot;孟加拉国&quot;</span>,          <span class="hljs-string">&quot;Bulgaria&quot;</span>: <span 
class="hljs-string">&quot;保加利亚&quot;</span>,          <span class="hljs-string">&quot;Bahrain&quot;</span>: <span class="hljs-string">&quot;巴林&quot;</span>,          <span class="hljs-string">&quot;Bahamas&quot;</span>: <span class="hljs-string">&quot;巴哈马&quot;</span>,          <span class="hljs-string">&quot;Bosnia and Herz.&quot;</span>: <span class="hljs-string">&quot;波斯尼亚和黑塞哥维那&quot;</span>,          <span class="hljs-string">&quot;Belarus&quot;</span>: <span class="hljs-string">&quot;白俄罗斯&quot;</span>,          <span class="hljs-string">&quot;Belize&quot;</span>: <span class="hljs-string">&quot;伯利兹&quot;</span>,          <span class="hljs-string">&quot;Bermuda&quot;</span>: <span class="hljs-string">&quot;百慕大&quot;</span>,          <span class="hljs-string">&quot;Bolivia&quot;</span>: <span class="hljs-string">&quot;玻利维亚&quot;</span>,          <span class="hljs-string">&quot;Brazil&quot;</span>: <span class="hljs-string">&quot;巴西&quot;</span>,          <span class="hljs-string">&quot;Barbados&quot;</span>: <span class="hljs-string">&quot;巴巴多斯&quot;</span>,          <span class="hljs-string">&quot;Brunei&quot;</span>: <span class="hljs-string">&quot;文莱&quot;</span>,          <span class="hljs-string">&quot;Bhutan&quot;</span>: <span class="hljs-string">&quot;不丹&quot;</span>,          <span class="hljs-string">&quot;Botswana&quot;</span>: <span class="hljs-string">&quot;博茨瓦纳&quot;</span>,          <span class="hljs-string">&quot;Central African Rep.&quot;</span>: <span class="hljs-string">&quot;中非共和国&quot;</span>,          <span class="hljs-string">&quot;Canada&quot;</span>: <span class="hljs-string">&quot;加拿大&quot;</span>,          <span class="hljs-string">&quot;Switzerland&quot;</span>: <span class="hljs-string">&quot;瑞士&quot;</span>,          <span class="hljs-string">&quot;Chile&quot;</span>: <span class="hljs-string">&quot;智利&quot;</span>,          <span class="hljs-string">&quot;China&quot;</span>: <span class="hljs-string">&quot;中国&quot;</span>,          
<span class="hljs-string">&quot;Côte d&#x27;Ivoire&quot;</span>: <span class="hljs-string">&quot;科特迪瓦&quot;</span>,          <span class="hljs-string">&quot;Cameroon&quot;</span>: <span class="hljs-string">&quot;喀麦隆&quot;</span>,          <span class="hljs-string">&quot;Dem. Rep. Congo&quot;</span>: <span class="hljs-string">&quot;刚果（金）&quot;</span>,          <span class="hljs-string">&quot;Congo&quot;</span>: <span class="hljs-string">&quot;刚果（布）&quot;</span>,          <span class="hljs-string">&quot;Colombia&quot;</span>: <span class="hljs-string">&quot;哥伦比亚&quot;</span>,          <span class="hljs-string">&quot;Cape Verde&quot;</span>: <span class="hljs-string">&quot;佛得角&quot;</span>,          <span class="hljs-string">&quot;Costa Rica&quot;</span>: <span class="hljs-string">&quot;哥斯达黎加&quot;</span>,          <span class="hljs-string">&quot;Cuba&quot;</span>: <span class="hljs-string">&quot;古巴&quot;</span>,          <span class="hljs-string">&quot;N. Cyprus&quot;</span>: <span class="hljs-string">&quot;北塞浦路斯&quot;</span>,          <span class="hljs-string">&quot;Cyprus&quot;</span>: <span class="hljs-string">&quot;塞浦路斯&quot;</span>,          <span class="hljs-string">&quot;Czech Rep.&quot;</span>: <span class="hljs-string">&quot;捷克&quot;</span>,          <span class="hljs-string">&quot;Germany&quot;</span>: <span class="hljs-string">&quot;德国&quot;</span>,          <span class="hljs-string">&quot;Djibouti&quot;</span>: <span class="hljs-string">&quot;吉布提&quot;</span>,          <span class="hljs-string">&quot;Denmark&quot;</span>: <span class="hljs-string">&quot;丹麦&quot;</span>,          <span class="hljs-string">&quot;Dominican Rep.&quot;</span>: <span class="hljs-string">&quot;多米尼加&quot;</span>,          <span class="hljs-string">&quot;Algeria&quot;</span>: <span class="hljs-string">&quot;阿尔及利亚&quot;</span>,          <span class="hljs-string">&quot;Ecuador&quot;</span>: <span class="hljs-string">&quot;厄瓜多尔&quot;</span>,          <span 
class="hljs-string">&quot;Egypt&quot;</span>: <span class="hljs-string">&quot;埃及&quot;</span>,          <span class="hljs-string">&quot;Eritrea&quot;</span>: <span class="hljs-string">&quot;厄立特里亚&quot;</span>,          <span class="hljs-string">&quot;Spain&quot;</span>: <span class="hljs-string">&quot;西班牙&quot;</span>,          <span class="hljs-string">&quot;Estonia&quot;</span>: <span class="hljs-string">&quot;爱沙尼亚&quot;</span>,          <span class="hljs-string">&quot;Ethiopia&quot;</span>: <span class="hljs-string">&quot;埃塞俄比亚&quot;</span>,          <span class="hljs-string">&quot;Finland&quot;</span>: <span class="hljs-string">&quot;芬兰&quot;</span>,          <span class="hljs-string">&quot;Fiji&quot;</span>: <span class="hljs-string">&quot;斐济&quot;</span>,          <span class="hljs-string">&quot;France&quot;</span>: <span class="hljs-string">&quot;法国&quot;</span>,          <span class="hljs-string">&quot;Gabon&quot;</span>: <span class="hljs-string">&quot;加蓬&quot;</span>,          <span class="hljs-string">&quot;United Kingdom&quot;</span>: <span class="hljs-string">&quot;英国&quot;</span>,          <span class="hljs-string">&quot;Georgia&quot;</span>: <span class="hljs-string">&quot;格鲁吉亚&quot;</span>,          <span class="hljs-string">&quot;Ghana&quot;</span>: <span class="hljs-string">&quot;加纳&quot;</span>,          <span class="hljs-string">&quot;Guinea&quot;</span>: <span class="hljs-string">&quot;几内亚&quot;</span>,          <span class="hljs-string">&quot;Gambia&quot;</span>: <span class="hljs-string">&quot;冈比亚&quot;</span>,          <span class="hljs-string">&quot;Guinea-Bissau&quot;</span>: <span class="hljs-string">&quot;几内亚比绍&quot;</span>,          <span class="hljs-string">&quot;Eq. 
Guinea&quot;</span>: <span class="hljs-string">&quot;赤道几内亚&quot;</span>,          <span class="hljs-string">&quot;Greece&quot;</span>: <span class="hljs-string">&quot;希腊&quot;</span>,          <span class="hljs-string">&quot;Grenada&quot;</span>: <span class="hljs-string">&quot;格林纳达&quot;</span>,          <span class="hljs-string">&quot;Greenland&quot;</span>: <span class="hljs-string">&quot;格陵兰岛&quot;</span>,          <span class="hljs-string">&quot;Guatemala&quot;</span>: <span class="hljs-string">&quot;危地马拉&quot;</span>,          <span class="hljs-string">&quot;Guam&quot;</span>: <span class="hljs-string">&quot;关岛&quot;</span>,          <span class="hljs-string">&quot;Guyana&quot;</span>: <span class="hljs-string">&quot;圭亚那合作共和国&quot;</span>,          <span class="hljs-string">&quot;Honduras&quot;</span>: <span class="hljs-string">&quot;洪都拉斯&quot;</span>,          <span class="hljs-string">&quot;Croatia&quot;</span>: <span class="hljs-string">&quot;克罗地亚&quot;</span>,          <span class="hljs-string">&quot;Haiti&quot;</span>: <span class="hljs-string">&quot;海地&quot;</span>,          <span class="hljs-string">&quot;Hungary&quot;</span>: <span class="hljs-string">&quot;匈牙利&quot;</span>,          <span class="hljs-string">&quot;Indonesia&quot;</span>: <span class="hljs-string">&quot;印度尼西亚&quot;</span>,          <span class="hljs-string">&quot;India&quot;</span>: <span class="hljs-string">&quot;印度&quot;</span>,          <span class="hljs-string">&quot;Br. 
Indian Ocean Ter.&quot;</span>: <span class="hljs-string">&quot;英属印度洋领土&quot;</span>,          <span class="hljs-string">&quot;Ireland&quot;</span>: <span class="hljs-string">&quot;爱尔兰&quot;</span>,          <span class="hljs-string">&quot;Iran&quot;</span>: <span class="hljs-string">&quot;伊朗&quot;</span>,          <span class="hljs-string">&quot;Iraq&quot;</span>: <span class="hljs-string">&quot;伊拉克&quot;</span>,          <span class="hljs-string">&quot;Iceland&quot;</span>: <span class="hljs-string">&quot;冰岛&quot;</span>,          <span class="hljs-string">&quot;Israel&quot;</span>: <span class="hljs-string">&quot;以色列&quot;</span>,          <span class="hljs-string">&quot;Italy&quot;</span>: <span class="hljs-string">&quot;意大利&quot;</span>,          <span class="hljs-string">&quot;Jamaica&quot;</span>: <span class="hljs-string">&quot;牙买加&quot;</span>,          <span class="hljs-string">&quot;Jordan&quot;</span>: <span class="hljs-string">&quot;约旦&quot;</span>,          <span class="hljs-string">&quot;Japan&quot;</span>: <span class="hljs-string">&quot;日本&quot;</span>,          <span class="hljs-string">&quot;Siachen Glacier&quot;</span>: <span class="hljs-string">&quot;锡亚琴冰川&quot;</span>,          <span class="hljs-string">&quot;Kazakhstan&quot;</span>: <span class="hljs-string">&quot;哈萨克斯坦&quot;</span>,          <span class="hljs-string">&quot;Kenya&quot;</span>: <span class="hljs-string">&quot;肯尼亚&quot;</span>,          <span class="hljs-string">&quot;Kyrgyzstan&quot;</span>: <span class="hljs-string">&quot;吉尔吉斯斯坦&quot;</span>,          <span class="hljs-string">&quot;Cambodia&quot;</span>: <span class="hljs-string">&quot;柬埔寨&quot;</span>,          <span class="hljs-string">&quot;Korea&quot;</span>: <span class="hljs-string">&quot;韩国&quot;</span>,          <span class="hljs-string">&quot;Kuwait&quot;</span>: <span class="hljs-string">&quot;科威特&quot;</span>,          <span class="hljs-string">&quot;Lao PDR&quot;</span>: <span 
class="hljs-string">&quot;老挝&quot;</span>,          <span class="hljs-string">&quot;Lebanon&quot;</span>: <span class="hljs-string">&quot;黎巴嫩&quot;</span>,          <span class="hljs-string">&quot;Liberia&quot;</span>: <span class="hljs-string">&quot;利比里亚&quot;</span>,          <span class="hljs-string">&quot;Libya&quot;</span>: <span class="hljs-string">&quot;利比亚&quot;</span>,          <span class="hljs-string">&quot;Sri Lanka&quot;</span>: <span class="hljs-string">&quot;斯里兰卡&quot;</span>,          <span class="hljs-string">&quot;Lesotho&quot;</span>: <span class="hljs-string">&quot;莱索托&quot;</span>,          <span class="hljs-string">&quot;Lithuania&quot;</span>: <span class="hljs-string">&quot;立陶宛&quot;</span>,          <span class="hljs-string">&quot;Luxembourg&quot;</span>: <span class="hljs-string">&quot;卢森堡&quot;</span>,          <span class="hljs-string">&quot;Latvia&quot;</span>: <span class="hljs-string">&quot;拉脱维亚&quot;</span>,          <span class="hljs-string">&quot;Moldova&quot;</span>: <span class="hljs-string">&quot;摩尔多瓦&quot;</span>,          <span class="hljs-string">&quot;Madagascar&quot;</span>: <span class="hljs-string">&quot;马达加斯加&quot;</span>,          <span class="hljs-string">&quot;Mexico&quot;</span>: <span class="hljs-string">&quot;墨西哥&quot;</span>,          <span class="hljs-string">&quot;Macedonia&quot;</span>: <span class="hljs-string">&quot;马其顿&quot;</span>,          <span class="hljs-string">&quot;Mali&quot;</span>: <span class="hljs-string">&quot;马里&quot;</span>,          <span class="hljs-string">&quot;Malta&quot;</span>: <span class="hljs-string">&quot;马耳他&quot;</span>,          <span class="hljs-string">&quot;Myanmar&quot;</span>: <span class="hljs-string">&quot;缅甸&quot;</span>,          <span class="hljs-string">&quot;Montenegro&quot;</span>: <span class="hljs-string">&quot;黑山&quot;</span>,          <span class="hljs-string">&quot;Mongolia&quot;</span>: <span class="hljs-string">&quot;蒙古国&quot;</span>,          <span 
class="hljs-string">&quot;Mozambique&quot;</span>: <span class="hljs-string">&quot;莫桑比克&quot;</span>,          <span class="hljs-string">&quot;Mauritania&quot;</span>: <span class="hljs-string">&quot;毛里塔尼亚&quot;</span>,          <span class="hljs-string">&quot;Mauritius&quot;</span>: <span class="hljs-string">&quot;毛里求斯&quot;</span>,          <span class="hljs-string">&quot;Malawi&quot;</span>: <span class="hljs-string">&quot;马拉维&quot;</span>,          <span class="hljs-string">&quot;Malaysia&quot;</span>: <span class="hljs-string">&quot;马来西亚&quot;</span>,          <span class="hljs-string">&quot;Namibia&quot;</span>: <span class="hljs-string">&quot;纳米比亚&quot;</span>,          <span class="hljs-string">&quot;New Caledonia&quot;</span>: <span class="hljs-string">&quot;新喀里多尼亚&quot;</span>,          <span class="hljs-string">&quot;Niger&quot;</span>: <span class="hljs-string">&quot;尼日尔&quot;</span>,          <span class="hljs-string">&quot;Nigeria&quot;</span>: <span class="hljs-string">&quot;尼日利亚&quot;</span>,          <span class="hljs-string">&quot;Nicaragua&quot;</span>: <span class="hljs-string">&quot;尼加拉瓜&quot;</span>,          <span class="hljs-string">&quot;Netherlands&quot;</span>: <span class="hljs-string">&quot;荷兰&quot;</span>,          <span class="hljs-string">&quot;Norway&quot;</span>: <span class="hljs-string">&quot;挪威&quot;</span>,          <span class="hljs-string">&quot;Nepal&quot;</span>: <span class="hljs-string">&quot;尼泊尔&quot;</span>,          <span class="hljs-string">&quot;New Zealand&quot;</span>: <span class="hljs-string">&quot;新西兰&quot;</span>,          <span class="hljs-string">&quot;Oman&quot;</span>: <span class="hljs-string">&quot;阿曼&quot;</span>,          <span class="hljs-string">&quot;Pakistan&quot;</span>: <span class="hljs-string">&quot;巴基斯坦&quot;</span>,          <span class="hljs-string">&quot;Panama&quot;</span>: <span class="hljs-string">&quot;巴拿马&quot;</span>,          <span class="hljs-string">&quot;Peru&quot;</span>: <span 
class="hljs-string">&quot;秘鲁&quot;</span>,          <span class="hljs-string">&quot;Philippines&quot;</span>: <span class="hljs-string">&quot;菲律宾&quot;</span>,          <span class="hljs-string">&quot;Papua New Guinea&quot;</span>: <span class="hljs-string">&quot;巴布亚新几内亚&quot;</span>,          <span class="hljs-string">&quot;Poland&quot;</span>: <span class="hljs-string">&quot;波兰&quot;</span>,          <span class="hljs-string">&quot;Puerto Rico&quot;</span>: <span class="hljs-string">&quot;波多黎各&quot;</span>,          <span class="hljs-string">&quot;Dem. Rep. Korea&quot;</span>: <span class="hljs-string">&quot;朝鲜&quot;</span>,          <span class="hljs-string">&quot;Portugal&quot;</span>: <span class="hljs-string">&quot;葡萄牙&quot;</span>,          <span class="hljs-string">&quot;Paraguay&quot;</span>: <span class="hljs-string">&quot;巴拉圭&quot;</span>,          <span class="hljs-string">&quot;Palestine&quot;</span>: <span class="hljs-string">&quot;巴勒斯坦&quot;</span>,          <span class="hljs-string">&quot;Qatar&quot;</span>: <span class="hljs-string">&quot;卡塔尔&quot;</span>,          <span class="hljs-string">&quot;Romania&quot;</span>: <span class="hljs-string">&quot;罗马尼亚&quot;</span>,          <span class="hljs-string">&quot;Russia&quot;</span>: <span class="hljs-string">&quot;俄罗斯&quot;</span>,          <span class="hljs-string">&quot;Rwanda&quot;</span>: <span class="hljs-string">&quot;卢旺达&quot;</span>,          <span class="hljs-string">&quot;Saudi Arabia&quot;</span>: <span class="hljs-string">&quot;沙特阿拉伯&quot;</span>,          <span class="hljs-string">&quot;Sudan&quot;</span>: <span class="hljs-string">&quot;苏丹&quot;</span>,          <span class="hljs-string">&quot;S. 
Sudan&quot;</span>: <span class="hljs-string">&quot;南苏丹&quot;</span>,          <span class="hljs-string">&quot;Senegal&quot;</span>: <span class="hljs-string">&quot;塞内加尔&quot;</span>,          <span class="hljs-string">&quot;Singapore&quot;</span>: <span class="hljs-string">&quot;新加坡&quot;</span>,          <span class="hljs-string">&quot;Solomon Is.&quot;</span>: <span class="hljs-string">&quot;所罗门群岛&quot;</span>,          <span class="hljs-string">&quot;Sierra Leone&quot;</span>: <span class="hljs-string">&quot;塞拉利昂&quot;</span>,          <span class="hljs-string">&quot;El Salvador&quot;</span>: <span class="hljs-string">&quot;萨尔瓦多&quot;</span>,          <span class="hljs-string">&quot;Suriname&quot;</span>: <span class="hljs-string">&quot;苏里南&quot;</span>,          <span class="hljs-string">&quot;Slovakia&quot;</span>: <span class="hljs-string">&quot;斯洛伐克&quot;</span>,          <span class="hljs-string">&quot;Slovenia&quot;</span>: <span class="hljs-string">&quot;斯洛文尼亚&quot;</span>,          <span class="hljs-string">&quot;Sweden&quot;</span>: <span class="hljs-string">&quot;瑞典&quot;</span>,          <span class="hljs-string">&quot;Swaziland&quot;</span>: <span class="hljs-string">&quot;斯威士兰&quot;</span>,          <span class="hljs-string">&quot;Seychelles&quot;</span>: <span class="hljs-string">&quot;塞舌尔&quot;</span>,          <span class="hljs-string">&quot;Syria&quot;</span>: <span class="hljs-string">&quot;叙利亚&quot;</span>,          <span class="hljs-string">&quot;Chad&quot;</span>: <span class="hljs-string">&quot;乍得&quot;</span>,          <span class="hljs-string">&quot;Togo&quot;</span>: <span class="hljs-string">&quot;多哥&quot;</span>,          <span class="hljs-string">&quot;Thailand&quot;</span>: <span class="hljs-string">&quot;泰国&quot;</span>,          <span class="hljs-string">&quot;Tajikistan&quot;</span>: <span class="hljs-string">&quot;塔吉克斯坦&quot;</span>,          <span class="hljs-string">&quot;Turkmenistan&quot;</span>: <span 
class="hljs-string">&quot;土库曼斯坦&quot;</span>,          <span class="hljs-string">&quot;Timor-Leste&quot;</span>: <span class="hljs-string">&quot;东帝汶&quot;</span>,          <span class="hljs-string">&quot;Tonga&quot;</span>: <span class="hljs-string">&quot;汤加&quot;</span>,          <span class="hljs-string">&quot;Trinidad and Tobago&quot;</span>: <span class="hljs-string">&quot;特立尼达和多巴哥&quot;</span>,          <span class="hljs-string">&quot;Tunisia&quot;</span>: <span class="hljs-string">&quot;突尼斯&quot;</span>,          <span class="hljs-string">&quot;Turkey&quot;</span>: <span class="hljs-string">&quot;土耳其&quot;</span>,          <span class="hljs-string">&quot;Tanzania&quot;</span>: <span class="hljs-string">&quot;坦桑尼亚&quot;</span>,          <span class="hljs-string">&quot;Uganda&quot;</span>: <span class="hljs-string">&quot;乌干达&quot;</span>,          <span class="hljs-string">&quot;Ukraine&quot;</span>: <span class="hljs-string">&quot;乌克兰&quot;</span>,          <span class="hljs-string">&quot;Uruguay&quot;</span>: <span class="hljs-string">&quot;乌拉圭&quot;</span>,          <span class="hljs-string">&quot;United States&quot;</span>: <span class="hljs-string">&quot;美国&quot;</span>,          <span class="hljs-string">&quot;Uzbekistan&quot;</span>: <span class="hljs-string">&quot;乌兹别克斯坦&quot;</span>,          <span class="hljs-string">&quot;Venezuela&quot;</span>: <span class="hljs-string">&quot;委内瑞拉&quot;</span>,          <span class="hljs-string">&quot;Vietnam&quot;</span>: <span class="hljs-string">&quot;越南&quot;</span>,          <span class="hljs-string">&quot;Vanuatu&quot;</span>: <span class="hljs-string">&quot;瓦努阿图&quot;</span>,          <span class="hljs-string">&quot;Yemen&quot;</span>: <span class="hljs-string">&quot;也门&quot;</span>,          <span class="hljs-string">&quot;South Africa&quot;</span>: <span class="hljs-string">&quot;南非&quot;</span>,          <span class="hljs-string">&quot;Zambia&quot;</span>: <span class="hljs-string">&quot;赞比亚&quot;</span>,  
        <span class="hljs-string">&quot;Zimbabwe&quot;</span>: <span class="hljs-string">&quot;津巴布韦&quot;</span>,          <span class="hljs-string">&quot;Aland&quot;</span>: <span class="hljs-string">&quot;奥兰群岛&quot;</span>,          <span class="hljs-string">&quot;American Samoa&quot;</span>: <span class="hljs-string">&quot;美属萨摩亚&quot;</span>,          <span class="hljs-string">&quot;Fr. S. Antarctic Lands&quot;</span>: <span class="hljs-string">&quot;南极洲&quot;</span>,          <span class="hljs-string">&quot;Antigua and Barb.&quot;</span>: <span class="hljs-string">&quot;安提瓜和巴布达&quot;</span>,          <span class="hljs-string">&quot;Comoros&quot;</span>: <span class="hljs-string">&quot;科摩罗&quot;</span>,          <span class="hljs-string">&quot;Curaçao&quot;</span>: <span class="hljs-string">&quot;库拉索岛&quot;</span>,          <span class="hljs-string">&quot;Cayman Is.&quot;</span>: <span class="hljs-string">&quot;开曼群岛&quot;</span>,          <span class="hljs-string">&quot;Dominica&quot;</span>: <span class="hljs-string">&quot;多米尼加&quot;</span>,          <span class="hljs-string">&quot;Falkland Is.&quot;</span>: <span class="hljs-string">&quot;福克兰群岛马尔维纳斯&quot;</span>,          <span class="hljs-string">&quot;Faeroe Is.&quot;</span>: <span class="hljs-string">&quot;法罗群岛&quot;</span>,          <span class="hljs-string">&quot;Micronesia&quot;</span>: <span class="hljs-string">&quot;密克罗尼西亚&quot;</span>,          <span class="hljs-string">&quot;Heard I. 
and McDonald Is.&quot;</span>: <span class="hljs-string">&quot;赫德岛和麦克唐纳群岛&quot;</span>,          <span class="hljs-string">&quot;Isle of Man&quot;</span>: <span class="hljs-string">&quot;曼岛&quot;</span>,          <span class="hljs-string">&quot;Jersey&quot;</span>: <span class="hljs-string">&quot;泽西岛&quot;</span>,          <span class="hljs-string">&quot;Kiribati&quot;</span>: <span class="hljs-string">&quot;基里巴斯&quot;</span>,          <span class="hljs-string">&quot;Saint Lucia&quot;</span>: <span class="hljs-string">&quot;圣卢西亚&quot;</span>,          <span class="hljs-string">&quot;N. Mariana Is.&quot;</span>: <span class="hljs-string">&quot;北马里亚纳群岛&quot;</span>,          <span class="hljs-string">&quot;Montserrat&quot;</span>: <span class="hljs-string">&quot;蒙特塞拉特&quot;</span>,          <span class="hljs-string">&quot;Niue&quot;</span>: <span class="hljs-string">&quot;纽埃&quot;</span>,          <span class="hljs-string">&quot;Palau&quot;</span>: <span class="hljs-string">&quot;帕劳&quot;</span>,          <span class="hljs-string">&quot;Fr. Polynesia&quot;</span>: <span class="hljs-string">&quot;法属波利尼西亚&quot;</span>,          <span class="hljs-string">&quot;S. Geo. and S. Sandw. Is.&quot;</span>: <span class="hljs-string">&quot;南乔治亚岛和南桑威奇群岛&quot;</span>,          <span class="hljs-string">&quot;Saint Helena&quot;</span>: <span class="hljs-string">&quot;圣赫勒拿&quot;</span>,          <span class="hljs-string">&quot;St. Pierre and Miquelon&quot;</span>: <span class="hljs-string">&quot;圣皮埃尔和密克隆群岛&quot;</span>,          <span class="hljs-string">&quot;São Tomé and Principe&quot;</span>: <span class="hljs-string">&quot;圣多美和普林西比&quot;</span>,          <span class="hljs-string">&quot;Turks and Caicos Is.&quot;</span>: <span class="hljs-string">&quot;特克斯和凯科斯群岛&quot;</span>,          <span class="hljs-string">&quot;St. Vin. and Gren.&quot;</span>: <span class="hljs-string">&quot;圣文森特和格林纳丁斯&quot;</span>,          <span class="hljs-string">&quot;U.S. 
Virgin Is.&quot;</span>: <span class="hljs-string">&quot;美属维尔京群岛&quot;</span>,          <span class="hljs-string">&quot;Samoa&quot;</span>: <span class="hljs-string">&quot;萨摩亚&quot;</span>        &#125;    pieces = [        &#123;<span class="hljs-string">&#x27;max&#x27;</span>: <span class="hljs-number">0</span>, <span class="hljs-string">&#x27;min&#x27;</span>: <span class="hljs-number">0</span>, <span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;0&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;#FFFFFF&#x27;</span>&#125;,        &#123;<span class="hljs-string">&#x27;max&#x27;</span>: <span class="hljs-number">49</span>, <span class="hljs-string">&#x27;min&#x27;</span>: <span class="hljs-number">1</span>, <span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;1-49&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;#FFE5DB&#x27;</span>&#125;,        &#123;<span class="hljs-string">&#x27;max&#x27;</span>: <span class="hljs-number">99</span>, <span class="hljs-string">&#x27;min&#x27;</span>: <span class="hljs-number">50</span>, <span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;50-99&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;#FFC4B3&#x27;</span>&#125;,        &#123;<span class="hljs-string">&#x27;max&#x27;</span>: <span class="hljs-number">999</span>, <span class="hljs-string">&#x27;min&#x27;</span>: <span class="hljs-number">100</span>, <span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;100-999&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;#FF9985&#x27;</span>&#125;,        &#123;<span class="hljs-string">&#x27;max&#x27;</span>: <span class="hljs-number">9999</span>, <span class="hljs-string">&#x27;min&#x27;</span>: <span 
class="hljs-number">1000</span>, <span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;1000-9999&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;#F57567&#x27;</span>&#125;,        &#123;<span class="hljs-string">&#x27;max&#x27;</span>: <span class="hljs-number">99999</span>, <span class="hljs-string">&#x27;min&#x27;</span>: <span class="hljs-number">10000</span>, <span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;10000-99999&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;#E64546&#x27;</span>&#125;,        &#123;<span class="hljs-string">&#x27;max&#x27;</span>: <span class="hljs-number">999999</span>, <span class="hljs-string">&#x27;min&#x27;</span>: <span class="hljs-number">100000</span>, <span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;100000-999999&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;#B80909&#x27;</span>&#125;,        &#123;<span class="hljs-string">&#x27;max&#x27;</span>: <span class="hljs-number">9999999</span>, <span class="hljs-string">&#x27;min&#x27;</span>: <span class="hljs-number">1000000</span>, <span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;≧1000000&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;#8A0808&#x27;</span>&#125;    ]    gt_map = (        Map()        .add(series_name=<span class="hljs-string">&#x27;累计确诊人数&#x27;</span>, data_pair=[<span class="hljs-built_in">list</span>(z) <span class="hljs-keyword">for</span> z <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(country, curconfirm)], maptype=<span class="hljs-string">&quot;world&quot;</span>, name_map=name_map, is_map_symbol_show=<span class="hljs-literal">False</span>)        
        .set_series_opts(label_opts=opts.LabelOpts(is_show=<span class="hljs-literal">False</span>))
        .set_global_opts(
            title_opts=opts.TitleOpts(title=<span class="hljs-string">&quot;全球疫情数据（累计确诊）&quot;</span>,
                                      subtitle=<span class="hljs-string">&#x27;数据更新至：&#x27;</span> + time_global + <span class="hljs-string">&#x27;\n\n来源：百度疫情实时大数据报告&#x27;</span>),
            visualmap_opts=opts.VisualMapOpts(max_=<span class="hljs-number">300</span>, is_piecewise=<span class="hljs-literal">True</span>, pieces=pieces),
        )
    )
    <span class="hljs-keyword">return</span> gt_map</code></pre><h3><span id="7x03-zhong-guo-mei-ri-shu-ju-zhe-xian-tu-china-daily-map"><font color="#4876FF">【7x03】China daily data line chart china_daily_map()</font></span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">china_daily_map</span>():</span>
    wb = openpyxl.load_workbook(<span class="hljs-string">&#x27;COVID-19-China.xlsx&#x27;</span>)
    ws_china_confirmed = wb[<span class="hljs-string">&#x27;中国每日累计确诊数据&#x27;</span>]
    ws_china_crued = wb[<span class="hljs-string">&#x27;中国每日累计治愈数据&#x27;</span>]
    ws_china_died = wb[<span class="hljs-string">&#x27;中国每日累计死亡数据&#x27;</span>]
    <span class="hljs-comment"># drop the header row of each sheet so that only data rows remain</span>
    ws_china_confirmed.delete_rows(<span class="hljs-number">1</span>)
    ws_china_crued.delete_rows(<span class="hljs-number">1</span>)
    ws_china_died.delete_rows(<span class="hljs-number">1</span>)
    x_date = []               <span class="hljs-comment"># dates</span>
    y_china_confirmed = []    <span class="hljs-comment"># daily cumulative confirmed cases</span>
    y_china_crued = []        <span class="hljs-comment"># daily cumulative recovered cases</span>
    y_china_died = []         <span class="hljs-comment"># daily cumulative deaths</span>
    <span class="hljs-keyword">for</span> china_confirmed <span class="hljs-keyword">in</span> ws_china_confirmed.values:
        y_china_confirmed.append(china_confirmed[<span class="hljs-number">1</span>])
    <span 
class="hljs-keyword">for</span> china_crued <span class="hljs-keyword">in</span> ws_china_crued.values:
        x_date.append(china_crued[<span class="hljs-number">0</span>])
        y_china_crued.append(china_crued[<span class="hljs-number">1</span>])
    <span class="hljs-keyword">for</span> china_died <span class="hljs-keyword">in</span> ws_china_died.values:
        y_china_died.append(china_died[<span class="hljs-number">1</span>])
    fi_map = (
        Line(init_opts=opts.InitOpts(height=<span class="hljs-string">&#x27;420px&#x27;</span>))
        .add_xaxis(xaxis_data=x_date)
        .add_yaxis(
            series_name=<span class="hljs-string">&quot;中国累计确诊数据&quot;</span>,
            y_axis=y_china_confirmed,
            label_opts=opts.LabelOpts(is_show=<span class="hljs-literal">False</span>),
        )
        .add_yaxis(
            series_name=<span class="hljs-string">&quot;中国累计治愈趋势&quot;</span>,
            y_axis=y_china_crued,
            label_opts=opts.LabelOpts(is_show=<span class="hljs-literal">False</span>),
        )
        .add_yaxis(
            series_name=<span class="hljs-string">&quot;中国累计死亡趋势&quot;</span>,
            y_axis=y_china_died,
            label_opts=opts.LabelOpts(is_show=<span class="hljs-literal">False</span>),
        )
        .set_global_opts(
            title_opts=opts.TitleOpts(title=<span class="hljs-string">&quot;中国每日累计确诊/治愈/死亡趋势&quot;</span>),
            legend_opts=opts.LegendOpts(pos_bottom=<span class="hljs-string">&quot;bottom&quot;</span>, orient=<span class="hljs-string">&#x27;horizontal&#x27;</span>),
            tooltip_opts=opts.TooltipOpts(trigger=<span class="hljs-string">&quot;axis&quot;</span>),
            yaxis_opts=opts.AxisOpts(
                type_=<span class="hljs-string">&quot;value&quot;</span>,
                axistick_opts=opts.AxisTickOpts(is_show=<span class="hljs-literal">True</span>),
                splitline_opts=opts.SplitLineOpts(is_show=<span class="hljs-literal">True</span>),
            ),
            xaxis_opts=opts.AxisOpts(type_=<span class="hljs-string">&quot;category&quot;</span>, boundary_gap=<span class="hljs-literal">False</span>),
        )
    )
    <span class="hljs-keyword">return</span> fi_map</code></pre><h3><span id="7x04-jing-wai-mei-ri-shu-ju-zhe-xian-tu-foreign-daily-map"><font color="#4876FF">【7x04】Overseas daily data line chart foreign_daily_map()</font></span></h3><pre><code class="hljs python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">foreign_daily_map</span>():</span>
    wb = openpyxl.load_workbook(<span class="hljs-string">&#x27;COVID-19-Global.xlsx&#x27;</span>)
    ws_foreign_confirmed = wb[<span class="hljs-string">&#x27;境外每日累计确诊数据&#x27;</span>]
    ws_foreign_crued = wb[<span class="hljs-string">&#x27;境外每日累计治愈数据&#x27;</span>]
    ws_foreign_died = wb[<span class="hljs-string">&#x27;境外每日累计死亡数据&#x27;</span>]
    <span class="hljs-comment"># drop the header row of each sheet so that only data rows remain</span>
    ws_foreign_confirmed.delete_rows(<span class="hljs-number">1</span>)
    ws_foreign_crued.delete_rows(<span class="hljs-number">1</span>)
    ws_foreign_died.delete_rows(<span class="hljs-number">1</span>)
    x_date = []                <span class="hljs-comment"># dates</span>
    y_foreign_confirmed = []   <span class="hljs-comment"># cumulative confirmed cases</span>
    y_foreign_crued = []       <span class="hljs-comment"># cumulative recovered cases</span>
    y_foreign_died = []        <span class="hljs-comment"># cumulative deaths</span>
    <span class="hljs-keyword">for</span> foreign_confirmed <span class="hljs-keyword">in</span> ws_foreign_confirmed.values:
        y_foreign_confirmed.append(foreign_confirmed[<span class="hljs-number">1</span>])
    <span class="hljs-keyword">for</span> foreign_crued <span class="hljs-keyword">in</span> ws_foreign_crued.values:
        x_date.append(foreign_crued[<span class="hljs-number">0</span>])
        y_foreign_crued.append(foreign_crued[<span class="hljs-number">1</span>])
    <span class="hljs-keyword">for</span> foreign_died <span class="hljs-keyword">in</span> ws_foreign_died.values:
        y_foreign_died.append(foreign_died[<span class="hljs-number">1</span>])
    fte_map = (
        Line(init_opts=opts.InitOpts(height=<span class="hljs-string">&#x27;420px&#x27;</span>))
        .add_xaxis(xaxis_data=x_date)
        .add_yaxis(
            series_name=<span class="hljs-string">&quot;境外累计确诊趋势&quot;</span>,
            y_axis=y_foreign_confirmed,
            label_opts=opts.LabelOpts(is_show=<span class="hljs-literal">False</span>),
        )
        .add_yaxis(
            series_name=<span class="hljs-string">&quot;境外累计治愈趋势&quot;</span>,
            y_axis=y_foreign_crued,
            label_opts=opts.LabelOpts(is_show=<span class="hljs-literal">False</span>),
        )
        .add_yaxis(
            series_name=<span class="hljs-string">&quot;境外累计死亡趋势&quot;</span>,
            y_axis=y_foreign_died,
            label_opts=opts.LabelOpts(is_show=<span class="hljs-literal">False</span>),
        )
        .set_global_opts(
            title_opts=opts.TitleOpts(title=<span class="hljs-string">&quot;境外每日累计确诊/治愈/死亡趋势&quot;</span>),
            legend_opts=opts.LegendOpts(pos_bottom=<span class="hljs-string">&quot;bottom&quot;</span>, orient=<span class="hljs-string">&#x27;horizontal&#x27;</span>),
            tooltip_opts=opts.TooltipOpts(trigger=<span class="hljs-string">&quot;axis&quot;</span>),
            yaxis_opts=opts.AxisOpts(
                type_=<span class="hljs-string">&quot;value&quot;</span>,
                axistick_opts=opts.AxisTickOpts(is_show=<span class="hljs-literal">True</span>),
                splitline_opts=opts.SplitLineOpts(is_show=<span class="hljs-literal">True</span>),
            ),
            xaxis_opts=opts.AxisOpts(type_=<span class="hljs-string">&quot;category&quot;</span>, boundary_gap=<span class="hljs-literal">False</span>),
        )
    )
    <span class="hljs-keyword">return</span> fte_map</code></pre><h2><span id="8x00-jie-guo-jie-tu"><font color="#FF0000">【8x00】Result screenshots</font></span></h2><h3><span 
id="8x01-shu-ju-chu-cun-excel"><font color="#4876FF">【8x01】Data stored in Excel</font></span></h3><p><img src="https://static.wukongsec.com/itbob/images/article/035/03.png" alt="03"></p><p><img src="https://static.wukongsec.com/itbob/images/article/035/04.png" alt="04"></p><h3><span id="8x02-ci-yun-tu"><font color="#4876FF">【8x02】Word clouds</font></span></h3><p><img src="https://static.wukongsec.com/itbob/images/article/035/05.png" alt="05"></p><p><img src="https://static.wukongsec.com/itbob/images/article/035/06.png" alt="06"></p><h3><span id="8x03-di-tu-zhe-xian-tu"><font color="#4876FF">【8x03】Maps + line charts</font></span></h3><p><img src="https://static.wukongsec.com/itbob/images/article/035/07.png" alt="07"></p><h2><span id="9x00-wan-zheng-dai-ma"><font color="#FF0000">【9x00】Complete code</font></span></h2><p>Preview: <s><a href="http://cov.itrhx.com/">http://cov.itrhx.com/</a></s> (no longer available)<br>Complete code (a star brings you a buff): <a href="https://github.com/TRHX/Python3-Spider-Practice">https://github.com/TRHX/Python3-Spider-Practice</a><br>Web scraping practice column (continuously updated): <a href="https://itrhx.blog.csdn.net/article/category/9351278">https://itrhx.blog.csdn.net/article/category/9351278</a></p><hr><pre><code class="hljs yaml"><span class="hljs-string">This is a block of anti-scraping text; readers may ignore it.</span>
<span class="hljs-string">This article was originally published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/107140534</span>
<span class="hljs-string">Reproduction without authorization is prohibited! Malicious reposters bear the consequences! Respect original work and stay away from plagiarism!</span></code></pre><hr>]]></content>
    
    
      
      
    <summary type="html">&lt;p&gt;&lt;strong&gt;&lt;center&gt;&lt;font color=&quot;red&quot; size=&quot;5px&quot; weight=&quot;bolder&quot;&gt;Welcome to the web scraping and reverse engineering WeChat group: add WeChat ID IT-BOB (mention the group in your request)&lt;/font&gt;&lt;/center&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span id=&quot;</summary>
      
    
    
    
    <category term="爬虫" scheme="https://www.itbob.cn/categories/%E7%88%AC%E8%99%AB/"/>
    
    
    <category term="爬虫" scheme="https://www.itbob.cn/tags/%E7%88%AC%E8%99%AB/"/>
    
    <category term="数据可视化" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%8F%AF%E8%A7%86%E5%8C%96/"/>
    
  </entry>
  
  <entry>
    <title>The Python Data Analysis Trio, Pandas (Part 10): Reading and Writing Data</title>
    <link href="https://www.itbob.cn/article/034/"/>
    <id>https://www.itbob.cn/article/034/</id>
    <published>2020-06-26T14:54:56.000Z</published>
    <updated>2022-05-22T12:45:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">Contents</span></h2><!-- toc --><ul><li><a href="#font-color-ff0000-01x00-du-qu-shu-ju-font"><font color="#FF0000">【01x00】Reading data</font></a><ul><li><a href="#font-color-4876ff-01x01-jian-dan-shi-li-font"><font color="#4876FF">【01x01】A simple example</font></a></li><li><a href="#font-color-4876ff-01x02-header-names-ding-zhi-lie-biao-qian-font"><font color="#4876FF">【01x02】header / names: customizing column labels</font></a></li><li><a href="#font-color-4876ff-01x03-index-col-zhi-ding-lie-wei-xing-suo-yin-font"><font color="#4876FF">【01x03】index_col: using a column as the row index</font></a></li><li><a href="#font-color-4876ff-01x04-sep-zhi-ding-fen-ge-fu-font"><font color="#4876FF">【01x04】sep: specifying the delimiter</font></a></li><li><a href="#font-color-4876ff-01x05-skiprows-hu-lue-xing-font"><font color="#4876FF">【01x05】skiprows: skipping rows</font></a></li><li><a href="#font-color-4876ff-01x06-na-values-she-zhi-que-shi-zhi-font"><font color="#4876FF">【01x06】na_values: marking missing values</font></a></li><li><a href="#font-color-4876ff-01x07-nrows-chunksize-xing-yu-kuai-font"><font color="#4876FF">【01x07】nrows / chunksize: rows and chunks</font></a></li></ul></li><li><a href="#font-color-ff0000-02x00-xie-ru-shu-ju-font"><font color="#FF0000">【02x00】Writing data</font></a><ul><li><a href="#font-color-4876ff-02x01-jian-dan-shi-li-font"><font color="#4876FF">【02x01】A simple example</font></a></li><li><a href="#font-color-4876ff-02x02-sep-zhi-ding-fen-ge-fu-font"><font color="#4876FF">【02x02】sep: specifying the delimiter</font></a></li><li><a href="#font-color-4876ff-02x03-na-rep-ti-huan-que-shi-zhi-font"><font color="#4876FF">【02x03】na_rep: replacing missing values</font></a></li><li><a href="#font-color-4876ff-02x04-index-header-xing-yu-lie-biao-qian-font"><font color="#4876FF">【02x04】index / header: row and column labels</font></a></li><li><a href="#font-color-4876ff-02x05-columns-zhi-ding-lie-font"><font color="#4876FF">【02x05】columns: selecting columns</font></a></li></ul></li></ul><!-- tocstop --><hr><p>Pandas article series:</p><ul><li><a href="https://www.itbob.cn/article/025/">The Python Data Analysis Trio, Pandas (Part 1): Getting to know Pandas and its Series and DataFrame objects</a></li><li><a href="https://www.itbob.cn/article/026/">The Python Data Analysis Trio, Pandas (Part 2): The Index object and indexing operations</a></li><li><a href="https://www.itbob.cn/article/027/">The Python Data Analysis Trio, Pandas (Part 3): Arithmetic operations and handling missing values</a></li><li><a href="https://www.itbob.cn/article/028/">The Python Data Analysis Trio, Pandas (Part 4): Function application, mapping, sorting, and hierarchical indexing</a></li><li><a href="https://www.itbob.cn/article/029/">The Python Data Analysis Trio, Pandas (Part 5): Statistical computation and descriptive statistics</a></li><li><a href="https://www.itbob.cn/article/030/">The Python Data Analysis Trio, Pandas (Part 6): GroupBy split, apply, and combine</a></li><li><a href="https://www.itbob.cn/article/031/">The Python Data Analysis Trio, Pandas (Part 7): Merging datasets</a></li><li><a href="https://www.itbob.cn/article/032/">The Python Data Analysis Trio, Pandas (Part 8): Reshaping data, handling duplicates, and replacing values</a></li><li><a href="https://www.itbob.cn/article/033/">The Python Data Analysis Trio, Pandas (Part 9): Time series</a></li><li><a href="https://www.itbob.cn/article/034/">The Python Data Analysis Trio, Pandas (Part 10): Reading and writing data</a></li></ul><hr><p>Columns:</p><ul><li>NumPy column: <a href="https://itrhx.blog.csdn.net/category_9780393.html">https://itrhx.blog.csdn.net/category_9780393.html</a></li><li>Pandas column: <a href="https://itrhx.blog.csdn.net/category_9780397.html">https://itrhx.blog.csdn.net/category_9780397.html</a></li><li>Matplotlib column: <a href="https://itrhx.blog.csdn.net/category_9780418.html">https://itrhx.blog.csdn.net/category_9780418.html</a></li></ul><br>Recommended learning resources and websites:<br><br><ul><li>NumPy Chinese-language site: <a href="https://www.numpy.org.cn/">https://www.numpy.org.cn/</a></li><li>Pandas Chinese-language site: <a href="https://www.pypandas.cn/">https://www.pypandas.cn/</a></li><li>Matplotlib Chinese-language site: <a href="https://www.matplotlib.org.cn/">https://www.matplotlib.org.cn/</a></li><li>NumPy, Matplotlib, and Pandas cheat sheets: <a href="https://github.com/TRHX/Python-quick-reference-table">https://github.com/TRHX/Python-quick-reference-table</a></li></ul><hr><pre><code class="hljs yaml"><span class="hljs-string">This is a block of anti-scraping text; readers may ignore it.</span>
<span class="hljs-string">This article was originally published on CSDN by ITBOB.</span>
<span 
class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106963135</span>
<span class="hljs-string">No reposting without permission! Malicious reposting is at your own risk! Respect original work and reject plagiarism!</span></code></pre><hr><h2><span id="01x00-du-qu-shu-ju"><font color="#FF0000">【01x00】Reading Data</font></span></h2><p>Pandas provides a number of functions for reading tabular data into a DataFrame object. The most common ones are listed in the table below.</p><p>The official pandas guide to IO tools: <a href="https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html">https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html</a></p><table><thead><tr><th>Function</th><th>Description</th></tr></thead><tbody><tr><td>read_csv</td><td>Loads delimited data from a file, URL, or file-like object. The default delimiter is a comma</td></tr><tr><td>read_table</td><td>Loads delimited data from a file, URL, or file-like object. The default delimiter is a tab (<code>'\t'</code>)</td></tr><tr><td>read_fwf</td><td>Reads data in fixed-width column format (no delimiters)</td></tr><tr><td>read_clipboard</td><td>Reads data from the clipboard; can be thought of as the clipboard version of read_table. Useful when converting web pages into tables</td></tr><tr><td>read_excel</td><td>Reads tabular data from an Excel XLS or XLSX file</td></tr><tr><td>read_hdf</td><td>Reads HDF5 files written by pandas</td></tr><tr><td>read_html</td><td>Reads all tables in an HTML document</td></tr><tr><td>read_json</td><td>Reads data from a JSON (JavaScript Object Notation) string</td></tr><tr><td>read_msgpack</td><td>Reads pandas data encoded in a binary format (msgpack support was removed in Pandas v1.0.0; <a href="https://pandas.pydata.org/docs/user_guide/io.html#io-msgpack">pyarrow</a> is recommended instead)</td></tr><tr><td>read_pickle</td><td>Reads an arbitrary object stored in Python pickle format</td></tr><tr><td>read_sas</td><td>Reads a SAS dataset stored in one of the SAS system's custom storage formats</td></tr><tr><td>read_sql</td><td>Reads the result of a SQL query (using SQLAlchemy) into a pandas DataFrame</td></tr><tr><td>read_stata</td><td>Reads a dataset in Stata file format</td></tr><tr><td>read_feather</td><td>Reads a Feather binary format file</td></tr></tbody></table><p>The examples below use read_csv and read_table. Each accepts roughly 50 parameters; see the official documentation for details:</p><p>read_csv: <a href="https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html">https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html</a></p><p>read_table: <a 
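href="https://pandas.pydata.org/docs/reference/api/pandas.read_table.html">https://pandas.pydata.org/docs/reference/api/pandas.read_table.html</a></p><p>Commonly used parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>path</td><td>String giving a file system location, URL, or file-like object (this is the first argument, named filepath_or_buffer in the signature)</td></tr><tr><td>sep / delimiter</td><td>Character sequence or regular expression used to split the fields in each row</td></tr><tr><td>header</td><td>Row number to use as the column names; defaults to 0 (the first row). Set it to None if there is no header row</td></tr><tr><td>index_col</td><td>Column number or column name to use as the row index. Can be a single name or number, or a list of names or numbers (for a hierarchical index)</td></tr><tr><td>names</td><td>List of column names for the result; combine with header=None</td></tr><tr><td>skiprows</td><td>Number of rows to skip (counted from the start of the file), or a list of row numbers to skip (starting from 0)</td></tr><tr><td>na_values</td><td>A set of values to be treated as NaN (missing values)</td></tr><tr><td>comment</td><td>A single character used to split comments off the end of lines</td></tr><tr><td>parse_dates</td><td>Attempts to parse data as dates; defaults to False. If True, attempts to parse the index. A list of column numbers or names to parse can also be given.<br>If an element of the list is itself a list or tuple, the listed columns are combined before date parsing (for example, when the date and time are stored in two separate columns)</td></tr><tr><td>keep_date_col</td><td>If multiple columns are joined to parse a date, keep the columns that were joined. Defaults to False</td></tr><tr><td>converters</td><td>Dict mapping column numbers / column names to functions. For example, <code>&#123;'foo': f&#125;</code> applies the function f to every value in the foo column</td></tr><tr><td>dayfirst</td><td>When parsing ambiguous dates, treat them as international format (for example, 7/6/2012 -&gt; June 7, 2012); defaults to False</td></tr><tr><td>date_parser</td><td>Function used to parse dates</td></tr><tr><td>nrows</td><td>Number of rows to read (counted from the start of the file)</td></tr><tr><td>iterator</td><td>Returns a TextFileReader so the file can be read chunk by chunk</td></tr><tr><td>chunksize</td><td>Size of each file chunk in rows (for iteration)</td></tr><tr><td>skipfooter</td><td>Number of rows to skip (counted from the end of the file)</td></tr><tr><td>verbose</td><td>Prints various parser output, such as "the number of missing values in non-numeric columns"</td></tr><tr><td>encoding</td><td>Text encoding for Unicode. For example, "utf-8" means UTF-8 encoded text</td></tr><tr><td>squeeze</td><td>If the parsed data contains only one column, return a Series</td></tr><tr><td>thousands</td><td>Thousands separator, such as <code>,</code> or <code>.</code></td></tr></tbody></table><h3><span id="01x01-jian-dan-shi-li"><font color="#4876FF">【01x01】Simple Example</font></span></h3><p>First, create a test1.csv file:</p><p><img src="https://static.wukongsec.com/itbob/images/article/034/01.png" alt="01"></p><p>Use read_csv to read it into a DataFrame object:</p><pre><code class="hljs 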
python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test1.csv&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>obj   a   b   c   d message<span class="hljs-number">0</span>  <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3</span>   <span class="hljs-number">4</span>   hello<span class="hljs-number">1</span>  <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   <span class="hljs-number">7</span>   <span class="hljs-number">8</span>   world<span class="hljs-number">2</span>  <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11</span>  <span class="hljs-number">12</span>  python<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-built_in">type</span>(obj)&lt;<span class="hljs-class"><span class="hljs-keyword">class</span> &#x27;<span class="hljs-title">pandas</span>.<span class="hljs-title">core</span>.<span class="hljs-title">frame</span>.<span class="hljs-title">DataFrame</span>&#x27;&gt;</span></code></pre><p>前面的 csv 文件是以逗号分隔的，可以使用 read_table 方法并指定分隔符来读取：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.read_table(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test1.csv&#x27;</span>, sep=<span class="hljs-string">&#x27;,&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>obj   a   b   c   d message<span class="hljs-number">0</span>  <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3</span>   <span class="hljs-number">4</span>   hello<span 
class="hljs-number">1</span>  <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   <span class="hljs-number">7</span>   <span class="hljs-number">8</span>   world<span class="hljs-number">2</span>  <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11</span>  <span class="hljs-number">12</span>  python</code></pre><h3><span id="01x02-header-names-ding-zhi-lie-biao-qian"><font color="#4876FF">【01x02】header / names 定制列标签</font></span></h3><p>以上示例中第一行为列标签，如果没有单独定义列标签，使用 read_csv 方法也会默认将第一行当作列标签：</p><p><img src="https://static.wukongsec.com/itbob/images/article/034/02.png" alt="02"></p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test2.csv&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>obj   <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3</span>   <span class="hljs-number">4</span>   hello<span class="hljs-number">0</span>  <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   <span class="hljs-number">7</span>   <span class="hljs-number">8</span>   world<span class="hljs-number">1</span>  <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11</span>  <span class="hljs-number">12</span>  python</code></pre><p>避免以上情况，可以设置 <code>header=None</code>，Pandas 会为其自动分配列标签：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test2.csv&#x27;</span>, header=<span class="hljs-literal">None</span>)   <span class="hljs-number">0</span>  
 <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3</span>       <span class="hljs-number">4</span><span class="hljs-number">0</span>  <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3</span>   <span class="hljs-number">4</span>   hello<span class="hljs-number">1</span>  <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   <span class="hljs-number">7</span>   <span class="hljs-number">8</span>   world<span class="hljs-number">2</span>  <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11</span>  <span class="hljs-number">12</span>  python</code></pre><p>也可以使用 <code>names</code> 参数自定义列标签，传递的是一个列表：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test2.csv&#x27;</span>, names=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;message&#x27;</span>])   a   b   c   d message<span class="hljs-number">0</span>  <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3</span>   <span class="hljs-number">4</span>   hello<span class="hljs-number">1</span>  <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   <span class="hljs-number">7</span>   <span class="hljs-number">8</span>   world<span class="hljs-number">2</span>  <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11</span>  <span class="hljs-number">12</span>  python</code></pre><h3><span 
id="01x03-index-col-zhi-ding-lie-wei-xing-suo-yin"><font color="#4876FF">【01x03】index_col 指定列为行索引</font></span></h3><p><code>index_col</code> 参数可以指定某一列作为 DataFrame 的行索引，传递的参数是列名称，在以下示例中，会将列名为 <code>message</code> 的列作为 DataFrame 的行索引：</p><p><img src="https://static.wukongsec.com/itbob/images/article/034/03.png" alt="03"></p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span>pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test2.csv&#x27;</span>,                 names=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;message&#x27;</span>],                 index_col=<span class="hljs-string">&#x27;message&#x27;</span>)         a   b   c   dmessage               hello    <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3</span>   <span class="hljs-number">4</span>world    <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   <span class="hljs-number">7</span>   <span class="hljs-number">8</span>python   <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11</span>  <span class="hljs-number">12</span></code></pre><p>如果需要构造多层索引的 DataFrame 对象，则只需传入由列编号或列名组成的列表即可：</p><p><img src="https://static.wukongsec.com/itbob/images/article/034/04.png" alt="04"></p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test3.csv&#x27;</span>, index_col=[<span class="hljs-string">&#x27;key1&#x27;</span>, <span class="hljs-string">&#x27;key2&#x27;</span>])           value1  value2key1 key2                one  a          <span 
class="hljs-number">1</span>       <span class="hljs-number">2</span>     b          <span class="hljs-number">3</span>       <span class="hljs-number">4</span>     c          <span class="hljs-number">5</span>       <span class="hljs-number">6</span>     d          <span class="hljs-number">7</span>       <span class="hljs-number">8</span>two  a          <span class="hljs-number">9</span>      <span class="hljs-number">10</span>     b         <span class="hljs-number">11</span>      <span class="hljs-number">12</span>     c         <span class="hljs-number">13</span>      <span class="hljs-number">14</span>     d         <span class="hljs-number">15</span>      <span class="hljs-number">16</span></code></pre><h3><span id="01x04-sep-zhi-ding-fen-ge-fu"><font color="#4876FF">【01x04】sep 指定分隔符</font></span></h3><p>在 read_table 中，sep 参数用于接收分隔符，如果遇到不是用固定的分隔符去分隔字段的，也可以传递一个正则表达式作为 read_table 的分隔符，如下面的 txt 文件数据之间是由不同的空白字符间隔开的：</p><p><img src="https://static.wukongsec.com/itbob/images/article/034/05.png" alt="05"></p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.read_table(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test1.txt&#x27;</span>, sep=<span class="hljs-string">&#x27;\s+&#x27;</span>)            A         B         Caaa -<span class="hljs-number">0.264438</span> -<span class="hljs-number">1.026059</span> -<span class="hljs-number">0.619500</span>bbb  <span class="hljs-number">0.927272</span>  <span class="hljs-number">0.302904</span> -<span class="hljs-number">0.032399</span>ccc -<span class="hljs-number">0.264273</span> -<span class="hljs-number">0.386314</span> -<span class="hljs-number">0.217601</span>ddd -<span class="hljs-number">0.871858</span> -<span class="hljs-number">0.348382</span>  <span class="hljs-number">1.100491</span></code></pre><h3><span id="01x05-skiprows-hu-lue-xing"><font 
color="#4876FF">【01x05】skiprows: Skipping Rows</font></span></h3><p>The skiprows parameter sets the number of rows to skip, or a list of row numbers to skip. In the example below, rows 1, 3, and 4 (indices 0, 2, and 3) are skipped when the file is read:</p><p><img src="https://static.wukongsec.com/itbob/images/article/034/06.png" alt="06"></p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test4.csv&#x27;</span>, skiprows=[<span class="hljs-number">0</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>])
   a   b   c   d message
<span class="hljs-number">0</span>  <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3</span>   <span class="hljs-number">4</span>   hello
<span class="hljs-number">1</span>  <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   <span class="hljs-number">7</span>   <span class="hljs-number">8</span>   world
<span class="hljs-number">2</span>  <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11</span>  <span class="hljs-number">12</span>  python</code></pre><h3><span id="01x06-na-values-she-zhi-que-shi-zhi"><font color="#4876FF">【01x06】na_values: Setting Missing Values</font></span></h3><p>When a file contains empty strings or NA values, Pandas marks them as NaN (missing values). You can then use the <code>isnull</code> method to check which of the resulting values are missing:</p><p><img src="https://static.wukongsec.com/itbob/images/article/034/07.png" alt="07"></p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test5.csv&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
  something  a   b     c   d message
<span class="hljs-number">0</span>       one  <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3.0</span>   <span class="hljs-number">4</span>     NaN
<span class="hljs-number">1</span>       two  <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   NaN   <span class="hljs-number">8</span>   world
<span class="hljs-number">2</span>     three  <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11.0</span>  <span class="hljs-number">12</span>  python
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.isnull(obj)
   something      a      b      c      d  message
<span class="hljs-number">0</span>      <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">1</span>      <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span>   <span class="hljs-literal">True</span>  <span class="hljs-literal">False</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">2</span>      <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span>    <span class="hljs-literal">False</span></code></pre><p>The <code>na_values</code> parameter accepts a set of values to be treated as missing. If a dict is passed, each key is a column name, and the values listed under that key are set to NaN in that column only:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test5.csv&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
  something  a   b     c   d 
message<span class="hljs-number">0</span>       one  <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3.0</span>   <span class="hljs-number">4</span>     NaN<span class="hljs-number">1</span>       two  <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   NaN   <span class="hljs-number">8</span>   world<span class="hljs-number">2</span>     three  <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11.0</span>  <span class="hljs-number">12</span>  python<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test5.csv&#x27;</span>, na_values=[<span class="hljs-string">&#x27;1&#x27;</span>, <span class="hljs-string">&#x27;12&#x27;</span>])<span class="hljs-meta">&gt;&gt;&gt; </span>obj2  something    a   b     c    d message<span class="hljs-number">0</span>       one  NaN   <span class="hljs-number">2</span>   <span class="hljs-number">3.0</span>  <span class="hljs-number">4.0</span>     NaN<span class="hljs-number">1</span>       two  <span class="hljs-number">5.0</span>   <span class="hljs-number">6</span>   NaN  <span class="hljs-number">8.0</span>   world<span class="hljs-number">2</span>     three  <span class="hljs-number">9.0</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11.0</span>  NaN  python<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>sentinels = &#123;<span class="hljs-string">&#x27;message&#x27;</span>: [<span class="hljs-string">&#x27;python&#x27;</span>, <span class="hljs-string">&#x27;world&#x27;</span>], <span class="hljs-string">&#x27;something&#x27;</span>: [<span class="hljs-string">&#x27;two&#x27;</span>]&#125;<span class="hljs-meta">&gt;&gt;&gt; </span>obj3 = pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test5.csv&#x27;</span>, 
na_values=sentinels)<span class="hljs-meta">&gt;&gt;&gt; </span>obj3  something  a   b     c   d  message<span class="hljs-number">0</span>       one  <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3.0</span>   <span class="hljs-number">4</span>      NaN<span class="hljs-number">1</span>       NaN  <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   NaN   <span class="hljs-number">8</span>      NaN<span class="hljs-number">2</span>     three  <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11.0</span>  <span class="hljs-number">12</span>      NaN</code></pre><h3><span id="01x07-nrows-chunksize-xing-yu-kuai"><font color="#4876FF">【01x07】nrows / chunksize 行与块</font></span></h3><p>以下 test6.csv 文件中包含 50 行数据：</p><p><img src="https://static.wukongsec.com/itbob/images/article/034/08.png" alt="08"></p><p>可以设置 <code>pd.options.display.max_rows</code> 来紧凑地显示指定行数的数据：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.options.display.max_rows = <span class="hljs-number">10</span><span class="hljs-meta">&gt;&gt;&gt; </span>pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test6.csv&#x27;</span>)         one       two     three      four key<span class="hljs-number">0</span>   <span class="hljs-number">0.467976</span> -<span class="hljs-number">0.038649</span> -<span class="hljs-number">0.295344</span> -<span class="hljs-number">1.824726</span>   L<span class="hljs-number">1</span>  -<span class="hljs-number">0.358893</span>  <span class="hljs-number">1.404453</span>  <span class="hljs-number">0.704965</span> -<span class="hljs-number">0.200638</span>   B<span class="hljs-number">2</span>  -<span class="hljs-number">0.501840</span>  <span class="hljs-number">0.659254</span> 
-<span class="hljs-number">0.421691</span> -<span class="hljs-number">0.057688</span>   G<span class="hljs-number">3</span>   <span class="hljs-number">0.204886</span>  <span class="hljs-number">1.074134</span>  <span class="hljs-number">1.388361</span> -<span class="hljs-number">0.982404</span>   R<span class="hljs-number">4</span>   <span class="hljs-number">0.354628</span> -<span class="hljs-number">0.133116</span>  <span class="hljs-number">0.283763</span> -<span class="hljs-number">0.837063</span>   Q..       ...       ...       ...       ...  ..<span class="hljs-number">45</span>  <span class="hljs-number">2.311896</span> -<span class="hljs-number">0.417070</span> -<span class="hljs-number">1.409599</span> -<span class="hljs-number">0.515821</span>   L<span class="hljs-number">46</span> -<span class="hljs-number">0.479893</span> -<span class="hljs-number">0.633419</span>  <span class="hljs-number">0.745152</span> -<span class="hljs-number">0.646038</span>   E<span class="hljs-number">47</span>  <span class="hljs-number">0.523331</span>  <span class="hljs-number">0.787112</span>  <span class="hljs-number">0.486066</span>  <span class="hljs-number">1.093156</span>   K<span class="hljs-number">48</span> -<span class="hljs-number">0.362559</span>  <span class="hljs-number">0.598894</span> -<span class="hljs-number">1.843201</span>  <span class="hljs-number">0.887292</span>   G<span class="hljs-number">49</span> -<span class="hljs-number">0.096376</span> -<span class="hljs-number">1.012999</span> -<span class="hljs-number">0.657431</span> -<span class="hljs-number">0.573315</span>   <span class="hljs-number">0</span>[<span class="hljs-number">50</span> rows x <span class="hljs-number">5</span> columns]</code></pre><p>通过 nrows 参数可以读取指定行数：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.read_csv(<span 
class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test6.csv&#x27;</span>, nrows=<span class="hljs-number">5</span>)
        one       two     three      four key
<span class="hljs-number">0</span>  <span class="hljs-number">0.467976</span> -<span class="hljs-number">0.038649</span> -<span class="hljs-number">0.295344</span> -<span class="hljs-number">1.824726</span>   L
<span class="hljs-number">1</span> -<span class="hljs-number">0.358893</span>  <span class="hljs-number">1.404453</span>  <span class="hljs-number">0.704965</span> -<span class="hljs-number">0.200638</span>   B
<span class="hljs-number">2</span> -<span class="hljs-number">0.501840</span>  <span class="hljs-number">0.659254</span> -<span class="hljs-number">0.421691</span> -<span class="hljs-number">0.057688</span>   G
<span class="hljs-number">3</span>  <span class="hljs-number">0.204886</span>  <span class="hljs-number">1.074134</span>  <span class="hljs-number">1.388361</span> -<span class="hljs-number">0.982404</span>   R
<span class="hljs-number">4</span>  <span class="hljs-number">0.354628</span> -<span class="hljs-number">0.133116</span>  <span class="hljs-number">0.283763</span> -<span class="hljs-number">0.837063</span>   Q</code></pre><p>To read a file in chunks, specify chunksize (a number of rows):</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>chunker = pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test6.csv&#x27;</span>, chunksize=<span class="hljs-number">50</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>chunker
&lt;pandas.io.parsers.TextFileReader <span class="hljs-built_in">object</span> at <span class="hljs-number">0x07A20D60</span>&gt;</code></pre><p>The returned TextFileReader object can be iterated over to read the file in chunks of chunksize rows. The following example iterates over the data in test6.csv and aggregates the value counts of the "key" column:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span 
class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>chunker = pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test6.csv&#x27;</span>, chunksize=<span class="hljs-number">50</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>tot = pd.Series([], dtype=<span class="hljs-string">&#x27;float64&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">for</span> piece <span class="hljs-keyword">in</span> chunker:
    tot = tot.add(piece[<span class="hljs-string">&#x27;key&#x27;</span>].value_counts(), fill_value=<span class="hljs-number">0</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>tot = tot.sort_values(ascending=<span class="hljs-literal">False</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>tot[:<span class="hljs-number">10</span>]
G    <span class="hljs-number">6.0</span>
E    <span class="hljs-number">5.0</span>
B    <span class="hljs-number">5.0</span>
L    <span class="hljs-number">5.0</span>
<span class="hljs-number">0</span>    <span class="hljs-number">5.0</span>
K    <span class="hljs-number">4.0</span>
A    <span class="hljs-number">4.0</span>
R    <span class="hljs-number">4.0</span>
C    <span class="hljs-number">2.0</span>
Q    <span class="hljs-number">2.0</span>
dtype: float64</code></pre><hr><h2><span id="02x00-xie-ru-shu-ju"><font color="#FF0000">【02x00】Writing Data</font></span></h2><p>Pandas also provides a number of functions for writing a DataFrame object out to files and other targets. The most common ones are listed below:</p><table><thead><tr><th>Function</th><th>Description</th></tr></thead><tbody><tr><td>to_csv</td><td>Writes the object to a comma-separated values (csv) file</td></tr><tr><td>to_clipboard</td><td>Copies the object to the system clipboard</td></tr><tr><td>to_excel</td><td>Writes the object to an Excel sheet</td></tr><tr><td>to_hdf</td><td>Writes the contained data to an HDF5 file using HDFStore</td></tr><tr><td>to_html</td><td>Renders a DataFrame as an HTML table</td></tr><tr><td>to_json</td><td>Converts the object to a JSON (JavaScript Object Notation) string</td></tr><tr><td>to_msgpack</td><td>Writes the object to a file encoded in a binary format (msgpack support was removed in Pandas v1.0.0; <a href="https://pandas.pydata.org/docs/user_guide/io.html#io-msgpack">pyarrow</a> is recommended instead)</td></tr><tr><td>to_pickle</td><td>Pickles (serializes) the object to a file</td></tr><tr><td>to_sql</td><td>Writes the data stored in a DataFrame to a SQL database</td></tr><tr><td>to_stata</td><td>Exports the DataFrame object to Stata format</td></tr><tr><td>to_feather</td><td>Writes a DataFrame to a Feather binary format file</td></tr></tbody></table><p>The examples below use to_csv. It accepts roughly 20 parameters; see the official documentation for details:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html</a></p></li><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.to_csv.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.to_csv.html</a></p></li></ul><h3><span id="02x01-jian-dan-shi-li"><font color="#4876FF">【02x01】Simple Example</font></span></h3><p>Using the earlier test5.csv file as an example, first read the data, then write it to another file:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>data = pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test5.csv&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>data
  something  a   b     c   d message
<span class="hljs-number">0</span>       one  <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3.0</span>   <span class="hljs-number">4</span>     NaN
<span 
class="hljs-number">1</span>       two  <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   NaN   <span class="hljs-number">8</span>   world<span class="hljs-number">2</span>     three  <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11.0</span>  <span class="hljs-number">12</span>  python<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>data.to_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\out1.csv&#x27;</span>)</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/034/09.png" alt="09"></p><h3><span id="02x02-sep-zhi-ding-fen-ge-fu"><font color="#4876FF">【02x02】sep 指定分隔符</font></span></h3><p>sep 参数可用于其他分隔符：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>data = pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test5.csv&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>data  something  a   b     c   d message<span class="hljs-number">0</span>       one  <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3.0</span>   <span class="hljs-number">4</span>     NaN<span class="hljs-number">1</span>       two  <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   NaN   <span class="hljs-number">8</span>   world<span class="hljs-number">2</span>     three  <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11.0</span>  <span class="hljs-number">12</span>  python&gt;&gt;&gt;<span class="hljs-meta">&gt;&gt;&gt; </span>data.to_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\out2.csv&#x27;</span>, sep=<span class="hljs-string">&#x27;|&#x27;</span>)</code></pre><p><img 
src="https://static.wukongsec.com/itbob/images/article/034/10.png" alt="10"></p><h3><span id="02x03-na-rep-ti-huan-que-shi-zhi"><font color="#4876FF">【02x03】na_rep 替换缺失值</font></span></h3><p>na_rep 参数可将缺失值（NaN）替换成其他字符串：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>data = pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test5.csv&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>data  something  a   b     c   d message<span class="hljs-number">0</span>       one  <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3.0</span>   <span class="hljs-number">4</span>     NaN<span class="hljs-number">1</span>       two  <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   NaN   <span class="hljs-number">8</span>   world<span class="hljs-number">2</span>     three  <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11.0</span>  <span class="hljs-number">12</span>  python<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>data.to_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\out3.csv&#x27;</span>, na_rep=<span class="hljs-string">&#x27;X&#x27;</span>)</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/034/11.png" alt="11"></p><h3><span id="02x04-index-header-xing-yu-lie-biao-qian"><font color="#4876FF">【02x04】index / header 行与列标签</font></span></h3><p>设置 <code>index=False</code>, <code>header=False</code>，可以禁用行标签与列标签：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>data = pd.read_csv(<span 
class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test5.csv&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>data  something  a   b     c   d message<span class="hljs-number">0</span>       one  <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3.0</span>   <span class="hljs-number">4</span>     NaN<span class="hljs-number">1</span>       two  <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   NaN   <span class="hljs-number">8</span>   world<span class="hljs-number">2</span>     three  <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11.0</span>  <span class="hljs-number">12</span>  python<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>data.to_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\out4.csv&#x27;</span>, index=<span class="hljs-literal">False</span>, header=<span class="hljs-literal">False</span>)</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/034/12.png" alt="12"></p><p>还可以传入列表来重新设置列标签：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>data = pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test5.csv&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>data  something  a   b     c   d message<span class="hljs-number">0</span>       one  <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3.0</span>   <span class="hljs-number">4</span>     NaN<span class="hljs-number">1</span>       two  <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   NaN   <span class="hljs-number">8</span>   world<span class="hljs-number">2</span>     three  <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  
<span class="hljs-number">11.0</span>  <span class="hljs-number">12</span>  python<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>data.to_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\out5.csv&#x27;</span>, header=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;e&#x27;</span>, <span class="hljs-string">&#x27;f&#x27;</span>])</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/034/13.png" alt="13"></p><h3><span id="02x05-columns-zhi-ding-lie"><font color="#4876FF">【02x05】columns 指定列</font></span></h3><p>可以通过设置 columns 参数，只写入部分列，并按照指定顺序排序：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>data = pd.read_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\test5.csv&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>data  something  a   b     c   d message<span class="hljs-number">0</span>       one  <span class="hljs-number">1</span>   <span class="hljs-number">2</span>   <span class="hljs-number">3.0</span>   <span class="hljs-number">4</span>     NaN<span class="hljs-number">1</span>       two  <span class="hljs-number">5</span>   <span class="hljs-number">6</span>   NaN   <span class="hljs-number">8</span>   world<span class="hljs-number">2</span>     three  <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">11.0</span>  <span class="hljs-number">12</span>  python&gt;&gt;&gt;<span class="hljs-meta">&gt;&gt;&gt; </span>data.to_csv(<span class="hljs-string">r&#x27;C:\Users\TanRe\Desktop\out6.csv&#x27;</span>, columns=[<span class="hljs-string">&#x27;c&#x27;</span>, <span 
class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>])</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/034/14.png" alt="14"></p><hr><pre><code class="hljs yaml"><span class="hljs-string">这里是一段物理防爬虫文本，请读者忽略。</span><span class="hljs-string">本文原创首发于</span> <span class="hljs-string">CSDN，作者</span> <span class="hljs-string">ITBOB。</span><span class="hljs-string">博客首页：https://itrhx.blog.csdn.net/</span><span class="hljs-string">本文链接：https://itrhx.blog.csdn.net/article/details/106963135</span><span class="hljs-string">未经授权，禁止转载！恶意转载，后果自负！尊重原创，远离剽窃！</span></code></pre><hr>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;文章目录&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-01x00-du-qu-shu-ju-font&quot;&gt;&lt;font color=&quot;#FF0</summary>
      
    
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    <category term="Pandas" scheme="https://www.itbob.cn/tags/Pandas/"/>
    
  </entry>
  
  <entry>
    <title>The Python Data Analysis Trio, Pandas (Part 9): Time Series</title>
    <link href="https://www.itbob.cn/article/033/"/>
    <id>https://www.itbob.cn/article/033/</id>
    <published>2020-06-25T13:55:49.000Z</published>
    <updated>2022-05-22T12:44:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">文章目录</span></h2><!-- toc --><ul><li><a href="#font-color-ff0000-01x00-shi-jian-xu-lie-font"><font color="#FF0000">【01x00】时间序列</font></a></li><li><a href="#font-color-ff0000-02x00-timestamp-shi-jian-chuo-font"><font color="#FF0000">【02x00】Timestamp 时间戳</font></a><ul><li><a href="#font-color-4876ff-02x01-pandas-timestamp-font"><font color="#4876FF">【02x01】pandas.Timestamp</font></a></li><li><a href="#font-color-4876ff-02x02-freq-pin-lu-bu-fen-qu-zhi-font"><font color="#4876FF">【02x02】freq 频率部分取值</font></a></li><li><a href="#font-color-4876ff-02x03-to-datetime-font"><font color="#4876FF">【02x03】to_datetime</font></a></li><li><a href="#font-color-4876ff-02x04-date-range-font"><font color="#4876FF">【02x04】date_range</font></a></li><li><a href="#font-color-4876ff-02x05-suo-yin-yu-qie-pian-font"><font color="#4876FF">【02x05】索引与切片</font></a></li><li><a href="#font-color-4876ff-02x06-yi-dong-shu-ju-yu-shu-ju-pian-yi-font"><font color="#4876FF">【02x06】移动数据与数据偏移</font></a></li><li><a href="#font-color-4876ff-02x07-shi-qu-chu-li-font"><font color="#4876FF">【02x07】时区处理</font></a></li></ul></li><li><a href="#font-color-ff0000-03x00-period-gu-ding-shi-qi-font"><font color="#FF0000">【03x00】period 固定时期</font></a><ul><li><a href="#font-color-4876ff-03x01-pandas-period-font"><font color="#4876FF">【03x01】pandas.Period</font></a></li><li><a href="#font-color-4876ff-03x02-period-range-font"><font color="#4876FF">【03x02】period_range</font></a></li><li><a href="#font-color-4876ff-03x03-asfreq-shi-qi-pin-lu-zhuan-huan-font"><font color="#4876FF">【03x03】asfreq 时期频率转换</font></a></li><li><a href="#font-color-4876ff-03x04-to-period-yu-to-timestamp-font"><font color="#4876FF">【03x04】to_period 与 to_timestamp()</font></a></li></ul></li><li><a href="#font-color-ff0000-04x00-timedelta-shi-jian-jian-ge-font"><font color="#FF0000">【04x00】timedelta 时间间隔</font></a><ul><li><a href="#font-color-4876ff-04x01-pandas-timedelta-font"><font 
color="#4876FF">【04x01】pandas.Timedelta</font></a></li><li><a href="#font-color-4876ff-04x02-to-timedelta-font"><font color="#4876FF">【04x02】to_timedelta</font></a></li><li><a href="#font-color-4876ff-04x03-timedelta-range-font"><font color="#4876FF">【04x03】timedelta_range</font></a></li></ul></li><li><a href="#font-color-ff0000-05x00-chong-cai-yang-ji-pin-lu-zhuan-huan-font"><font color="#FF0000">【05x00】重采样及频率转换</font></a></li></ul><!-- tocstop --><hr><p>Pandas 系列文章：</p><ul><li><a href="https://www.itbob.cn/article/025/">Python 数据分析三剑客之 Pandas（一）：认识 Pandas 及其 Series、DataFrame 对象</a></li><li><a href="https://www.itbob.cn/article/026/">Python 数据分析三剑客之 Pandas（二）：Index 索引对象以及各种索引操作</a></li><li><a href="https://www.itbob.cn/article/027/">Python 数据分析三剑客之 Pandas（三）：算术运算与缺失值的处理</a></li><li><a href="https://www.itbob.cn/article/028/">Python 数据分析三剑客之 Pandas（四）：函数应用、映射、排序和层级索引</a></li><li><a href="https://www.itbob.cn/article/029/">Python 数据分析三剑客之 Pandas（五）：统计计算与统计描述</a></li><li><a href="https://www.itbob.cn/article/030/">Python 数据分析三剑客之 Pandas（六）：GroupBy 数据分裂、应用与合并</a></li><li><a href="https://www.itbob.cn/article/031/">Python 数据分析三剑客之 Pandas（七）：合并数据集</a></li><li><a href="https://www.itbob.cn/article/032/">Python 数据分析三剑客之 Pandas（八）：数据重塑、重复数据处理与数据替换</a></li><li><a href="https://www.itbob.cn/article/033/">Python 数据分析三剑客之 Pandas（九）：时间序列</a></li><li><a href="https://www.itbob.cn/article/034/">Python 数据分析三剑客之 Pandas（十）：数据读写</a></li></ul><hr><p>专栏：</p><ul><li>NumPy 专栏：<a href="https://itrhx.blog.csdn.net/category_9780393.html">https://itrhx.blog.csdn.net/category_9780393.html</a></li><li>Pandas 专栏：<a href="https://itrhx.blog.csdn.net/category_9780397.html">https://itrhx.blog.csdn.net/category_9780397.html</a></li><li>Matplotlib 专栏：<a href="https://itrhx.blog.csdn.net/category_9780418.html">https://itrhx.blog.csdn.net/category_9780418.html</a></li></ul><br>推荐学习资料与网站：<br><br><ul><li>NumPy 官方中文网：<a href="https://www.numpy.org.cn/">https://www.numpy.org.cn/</a></li><li>Pandas 
official Chinese-language site: <a href="https://www.pypandas.cn/">https://www.pypandas.cn/</a></li><li>Matplotlib official Chinese-language site: <a href="https://www.matplotlib.org.cn/">https://www.matplotlib.org.cn/</a></li><li>NumPy, Matplotlib, and Pandas cheat sheets: <a href="https://github.com/TRHX/Python-quick-reference-table">https://github.com/TRHX/Python-quick-reference-table</a></li></ul><hr>
<pre><code class="hljs yaml"><span class="hljs-string">This is anti-scraping text; readers can ignore it.</span>
<span class="hljs-string">This article was first published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106947061</span>
<span class="hljs-string">Reposting without authorization is prohibited! Malicious reposters bear the consequences! Respect original work and stay away from plagiarism!</span></code></pre><hr>
<h2><span id="01x00-shi-jian-xu-lie"><font color="#FF0000">【01x00】Time Series</font></span></h2>
<p>The official pandas introduction to time series: <a href="https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html">https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html</a></p>
<p>A time series is an important form of structured data used in many fields, including finance, economics, ecology, neuroscience, and physics. Anything observed or measured at multiple points in time forms a time series. Many time series have a fixed frequency, meaning the data points occur at regular intervals according to some rule (for example every 15 seconds, every 5 minutes, or once a month). A time series can also be irregular, with no fixed unit of time or offset between units. What time series data means depends on the application; the main kinds are:</p>
<ul><li><p><font color="#4169E1"><strong>Timestamp: a specific point in time, for example 2020-6-24 15:30;</strong></font></p></li><li><p><font color="#4169E1"><strong>Period: a fixed span of time, for example 2020-01;</strong></font></p></li><li><p><font color="#4169E1"><strong>Timedelta: a duration, that is, the difference between two dates or times.</strong></font></p></li><li><p><font color="#FFA500"><strong>For timestamp data, Pandas provides the Timestamp type. It is essentially a replacement for Python's native datetime type, but it is built on the more efficient numpy.datetime64 type. The corresponding index structure is DatetimeIndex.</strong></font></p></li><li><p><font color="#FFA500"><strong>For period data, Pandas provides the Period type. It encodes a fixed-frequency interval based on numpy.datetime64. The corresponding index structure is PeriodIndex.</strong></font></p></li><li><p><font color="#FFA500"><strong>For time deltas or durations, Pandas provides the Timedelta type. Timedelta is a high-performance replacement for Python's native datetime.timedelta type and is likewise based on numpy.timedelta64. The corresponding index structure is TimedeltaIndex.</strong></font></p></li></ul>
<h2><span id="02x00-timestamp-shi-jian-chuo"><font color="#FF0000">【02x00】Timestamp</font></span></h2>
<h3><span id="02x01-pandas-timestamp"><font color="#4876FF">【02x01】pandas.Timestamp</font></span></h3>
<p>In pandas, the <code>pandas.Timestamp</code> class is the counterpart of Python's <code>datetime.datetime</code> class.</p>
<p>Timestamp is the pandas equivalent of Python's datetime and is interchangeable with it in most cases. It is the type used for the entries that make up a DatetimeIndex and the other time-series-oriented data structures in Pandas.</p>
<p>Official documentation: <a href="https://pandas.pydata.org/docs/reference/api/pandas.Timestamp.html">https://pandas.pydata.org/docs/reference/api/pandas.Timestamp.html</a></p>
<p>Basic syntax:</p>
<pre><code class="hljs python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">pandas</span>.<span class="hljs-title">Timestamp</span>(<span class="hljs-params">ts_input=&lt;<span class="hljs-built_in">object</span> <span class="hljs-built_in">object</span>&gt;, </span></span>
<span class="hljs-params"><span class="hljs-class">   freq=<span class="hljs-literal">None</span>, tz=<span class="hljs-literal">None</span>, unit=<span class="hljs-literal">None</span>, </span></span>
<span class="hljs-params"><span class="hljs-class">   year=<span class="hljs-literal">None</span>, month=<span class="hljs-literal">None</span>, day=<span class="hljs-literal">None</span>, </span></span>
<span class="hljs-params"><span class="hljs-class">   hour=<span class="hljs-literal">None</span>, minute=<span class="hljs-literal">None</span>, second=<span class="hljs-literal">None</span>, </span></span>
<span class="hljs-params"><span class="hljs-class">   microsecond=<span class="hljs-literal">None</span>, nanosecond=<span class="hljs-literal">None</span>, tzinfo=<span class="hljs-literal">None</span></span>)</span></code></pre>
<p>Common parameters:</p>
<table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>ts_input</td><td>Value to convert to a Timestamp; may be datetime-like, str, int, or float</td></tr><tr><td>freq</td><td>Offset the timestamp will have; a str or a date offset object, see <a href="#t4">【02x02】Selected freq values</a> (this argument was removed from the constructor in pandas 2.0)</td></tr><tr><td>tz</td><td>Time zone the timestamp will have</td></tr><tr><td>unit</td><td>Unit of ts_input when it is an integer or float (D, s, ms, us, ns)</td></tr></tbody></table>
<p>A simple example:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Timestamp(<span class="hljs-string">&#x27;2017-01-01T12&#x27;</span>)
Timestamp(<span class="hljs-string">&#x27;2017-01-01 12:00:00&#x27;</span>)</code></pre>
<p>Setting <code>unit='s'</code> means the value being converted is in seconds:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Timestamp(<span class="hljs-number">1513393355.5</span>, unit=<span class="hljs-string">&#x27;s&#x27;</span>)
Timestamp(<span class="hljs-string">&#x27;2017-12-16 03:02:35.500000&#x27;</span>)</code></pre>
<p>Use the <code>tz</code> parameter to set the time zone:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Timestamp(<span class="hljs-number">1513393355</span>, unit=<span class="hljs-string">&#x27;s&#x27;</span>, tz=<span class="hljs-string">&#x27;US/Pacific&#x27;</span>)
Timestamp(<span class="hljs-string">&#x27;2017-12-15 19:02:35-0800&#x27;</span>, tz=<span class="hljs-string">&#x27;US/Pacific&#x27;</span>)</code></pre>
<p>Setting the year, month, day, and hour individually:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Timestamp(year=<span class="hljs-number">2020</span>, month=<span class="hljs-number">6</span>, day=<span class="hljs-number">24</span>, 
hour=<span class="hljs-number">12</span>)
Timestamp(<span class="hljs-string">&#x27;2020-06-24 12:00:00&#x27;</span>)</code></pre>
<h3><span id="02x02-freq-pin-lu-bu-fen-qu-zhi"><font color="#4876FF">【02x02】Selected freq values</font></span></h3>
<p>For the complete list, see the official documentation: <a href="https://pandas.pydata.org/docs/user_guide/timeseries.html#timeseries-offset-aliases">https://pandas.pydata.org/docs/user_guide/timeseries.html#timeseries-offset-aliases</a></p>
<p>The table lists the aliases as they were spelled when this article was written. pandas 2.2 deprecated several of them (for example <code>M</code> in favor of <code>ME</code>, and <code>H</code> in favor of <code>h</code>), so check the official documentation for the spellings your version accepts.</p>
<table><thead><tr><th>Alias</th><th>Offset type</th><th>Description</th></tr></thead><tbody>
<tr><td>D</td><td>Day</td><td>Every calendar day</td></tr>
<tr><td>B</td><td>BusinessDay</td><td>Every business day</td></tr>
<tr><td>H</td><td>Hour</td><td>Every hour</td></tr>
<tr><td>T or min</td><td>Minute</td><td>Every minute</td></tr>
<tr><td>S</td><td>Second</td><td>Every second</td></tr>
<tr><td>L or ms</td><td>Milli</td><td>Every millisecond (one thousandth of a second)</td></tr>
<tr><td>U</td><td>Micro</td><td>Every microsecond (one millionth of a second)</td></tr>
<tr><td>M</td><td>MonthEnd</td><td>Last calendar day of each month</td></tr>
<tr><td>BM</td><td>BusinessMonthEnd</td><td>Last business day of each month</td></tr>
<tr><td>MS</td><td>MonthBegin</td><td>First calendar day of each month</td></tr>
<tr><td>BMS</td><td>BusinessMonthBegin</td><td>First business day of each month</td></tr>
<tr><td>W-MON, W-TUE…</td><td>Week</td><td>Weekly, on the given day of the week (MON, TUE, WED, THU, FRI, SAT, SUN)</td></tr>
<tr><td>WOM-1MON, WOM-2MON…</td><td>WeekOfMonth</td><td>The given weekday in the first, second, third, or fourth week of each month. For example, WOM-3FRI is the third Friday of each month</td></tr>
<tr><td>Q-JAN, Q-FEB…</td><td>QuarterEnd</td><td>For a year ending in the given month (JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC), the last calendar day of the last month of each quarter</td></tr>
<tr><td>BQ-JAN, BQ-FEB…</td><td>BusinessQuarterEnd</td><td>For a year ending in the given month, the last business day of the last month of each quarter</td></tr>
<tr><td>QS-JAN, QS-FEB…</td><td>QuarterBegin</td><td>For a year ending in the given month, the first calendar day of the last month of each quarter</td></tr>
<tr><td>BQS-JAN, BQS-FEB…</td><td>BusinessQuarterBegin</td><td>For a year ending in the given month, the first business day of the last month of each quarter</td></tr>
<tr><td>A-JAN, A-FEB…</td><td>YearEnd</td><td>Last calendar day of the given month (JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC) each year</td></tr>
<tr><td>BA-JAN, BA-FEB…</td><td>BusinessYearEnd</td><td>Last business day of the given month each year</td></tr>
<tr><td>AS-JAN, AS-FEB…</td><td>YearBegin</td><td>First calendar day of the given month each year</td></tr>
<tr><td>BAS-JAN, BAS-FEB…</td><td>BusinessYearBegin</td><td>First business day of the given month each year</td></tr></tbody></table>
<h3><span id="02x03-to-datetime"><font color="#4876FF">【02x03】to_datetime</font></span></h3>
<p>In Python, the datetime module provides date and time handling. A datetime object can be converted to a string with <code>str</code> or its <code>strftime</code> method; for details see <a href="https://blog.csdn.net/qq_36759224/article/details/104427220">【Python Standard Library】datetime, the date and time handling library</a>.</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime
<span class="hljs-meta">&gt;&gt;&gt; </span>stamp = datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">6</span>, <span class="hljs-number">24</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>stamp
datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">6</span>, <span class="hljs-number">24</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-built_in">str</span>(stamp)
<span class="hljs-string">&#x27;2020-06-24 00:00:00&#x27;</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>stamp.strftime(<span class="hljs-string">&#x27;%Y-%m-%d&#x27;</span>)
<span class="hljs-string">&#x27;2020-06-24&#x27;</span></code></pre>
<p><font color="#FF0000"><strong>In pandas, the to_datetime function parses strings in many different formats into Timestamp objects:</strong></font></p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>datestrs = <span class="hljs-string">&#x27;2011-07-06 12:00:00&#x27;</span>
<span 
class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-built_in">type</span>(datestrs)&lt;<span class="hljs-class"><span class="hljs-keyword">class</span> &#x27;<span class="hljs-title">str</span>&#x27;&gt;</span><span class="hljs-class">&gt;&gt;&gt; </span><span class="hljs-class">&gt;&gt;&gt; <span class="hljs-title">pd</span>.<span class="hljs-title">to_datetime</span>(<span class="hljs-params">datestrs</span>)</span><span class="hljs-class"><span class="hljs-title">Timestamp</span>(<span class="hljs-params"><span class="hljs-string">&#x27;2011-07-06 12:00:00&#x27;</span></span>)</span></code></pre><p>基本语法：</p><pre><code class="hljs python">pandas.to_datetime(arg, errors=<span class="hljs-string">&#x27;raise&#x27;</span>, dayfirst=<span class="hljs-literal">False</span>,                    yearfirst=<span class="hljs-literal">False</span>, utc=<span class="hljs-literal">None</span>, <span class="hljs-built_in">format</span>=<span class="hljs-literal">None</span>,                    exact=<span class="hljs-literal">True</span>, unit=<span class="hljs-literal">None</span>, infer_datetime_format=<span class="hljs-literal">False</span>,                    origin=<span class="hljs-string">&#x27;unix&#x27;</span>, cache=<span class="hljs-literal">True</span>)</code></pre><p>官方文档：<a href="https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html">https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html</a></p><p>常用参数：</p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>arg</td><td>要转换为日期时间的对象，可以接受 int, float, str, datetime, list, tuple, 1-d array, Series DataFrame/dict-like 类型</td></tr><tr><td>errors</td><td>如果字符串不满足时间戳的形式，是否会发生异常<br><code>ignore</code>：不引发异常，返回原始输入；<code>raise</code>：无效解析将引发异常（默认）；<code>coerce</code>：无效解析将被设置为NaT</td></tr><tr><td>dayfirst</td><td>bool 类型，默认 False，如果 arg 是 str 或列表，是否首先解析为日期<br>例如 dayfirst 为 True，<code>10/11/12</code> 被解析为 <code>2012-11-10</code>，为 False 则解析为 
<code>2012-10-11</code></td></tr><tr><td>yearfirst</td><td>bool 类型，默认 False，如果 arg 是 str 或列表，是否首先解析为年份<br>例如 dayfirst 为 True，<code>10/11/12</code> 被解析为 <code>2010-11-12</code>，为 False 则解析为 <code>2012-10-11</code><br>如果 dayfirst 和 yearfirst 都为 True，则优先 yearfirst</td></tr><tr><td>utc</td><td>bool 类型，是否转换为协调世界时，默认 None</td></tr><tr><td>format</td><td>格式化时间，如 <code>21/2/20 16:10</code> 使用 <code>%d/%m/%y %H:%M</code> 会被解析为 <code>2020-02-21 16:10:00</code><br>符号含义常见文章：<a href="https://blog.csdn.net/qq_36759224/article/details/104427220">【Python 标准库学习】日期和时间处理库 — datetime</a> 或者<a href="https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior">官方文档</a></td></tr><tr><td>exact</td><td>如果为 True，则需要精确的格式匹配。如果为 False，则允许格式与目标字符串中的任何位置匹配</td></tr><tr><td>unit</td><td>如果 arg 是整数或浮点数，该参数用于设置其单位（D、s、ms、us、ns）</td></tr></tbody></table><p>简单应用：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(&#123;<span class="hljs-string">&#x27;year&#x27;</span>: [<span class="hljs-number">2015</span>, <span class="hljs-number">2016</span>], <span class="hljs-string">&#x27;month&#x27;</span>: [<span class="hljs-number">2</span>, <span class="hljs-number">3</span>], <span class="hljs-string">&#x27;day&#x27;</span>: [<span class="hljs-number">4</span>, <span class="hljs-number">5</span>]&#125;)<span class="hljs-meta">&gt;&gt;&gt; </span>obj   year  month  day<span class="hljs-number">0</span>  <span class="hljs-number">2015</span>      <span class="hljs-number">2</span>    <span class="hljs-number">4</span><span class="hljs-number">1</span>  <span class="hljs-number">2016</span>      <span class="hljs-number">3</span>    <span class="hljs-number">5</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>pd.to_datetime(obj)<span class="hljs-number">0</span>   <span 
class="hljs-number">2015</span>-02-04
<span class="hljs-number">1</span>   <span class="hljs-number">2016</span>-03-05
dtype: datetime64[ns]</code></pre>
<p>Setting the <code>format</code> and <code>errors</code> parameters:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.to_datetime(<span class="hljs-string">&#x27;13000101&#x27;</span>, <span class="hljs-built_in">format</span>=<span class="hljs-string">&#x27;%Y%m%d&#x27;</span>, errors=<span class="hljs-string">&#x27;ignore&#x27;</span>)
datetime.datetime(<span class="hljs-number">1300</span>, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.to_datetime(<span class="hljs-string">&#x27;13000101&#x27;</span>, <span class="hljs-built_in">format</span>=<span class="hljs-string">&#x27;%Y%m%d&#x27;</span>, errors=<span class="hljs-string">&#x27;coerce&#x27;</span>)
NaT
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.to_datetime(<span class="hljs-string">&#x27;13000101&#x27;</span>, <span class="hljs-built_in">format</span>=<span class="hljs-string">&#x27;%Y%m%d&#x27;</span>, errors=<span class="hljs-string">&#x27;raise&#x27;</span>)
Traceback (most recent call last):
...
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: <span class="hljs-number">1300</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span></code></pre>
<p>Setting the <code>unit</code> parameter:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.to_datetime(<span class="hljs-number">1490195805</span>, unit=<span class="hljs-string">&#x27;s&#x27;</span>)
Timestamp(<span class="hljs-string">&#x27;2017-03-22 15:16:45&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.to_datetime(<span class="hljs-number">1490195805433502912</span>, unit=<span class="hljs-string">&#x27;ns&#x27;</span>)
Timestamp(<span class="hljs-string">&#x27;2017-03-22 15:16:45.433502912&#x27;</span>)</code></pre>
<h3><span id="02x04-date-range"><font color="#4876FF">【02x04】date_range</font></span></h3>
<p>The <code>pandas.date_range</code> function generates a DatetimeIndex of a given length at a given frequency.</p>
<p>Basic syntax:</p>
<pre><code class="hljs python">pandas.date_range(start=<span class="hljs-literal">None</span>, end=<span class="hljs-literal">None</span>, periods=<span class="hljs-literal">None</span>, freq=<span class="hljs-literal">None</span>,
                  tz=<span class="hljs-literal">None</span>, normalize=<span class="hljs-literal">False</span>, name=<span class="hljs-literal">None</span>, closed=<span class="hljs-literal">None</span>,
                  **kwargs) → pandas.core.indexes.datetimes.DatetimeIndex</code></pre>
<p>Official documentation: <a href="https://pandas.pydata.org/docs/reference/api/pandas.date_range.html">https://pandas.pydata.org/docs/reference/api/pandas.date_range.html</a></p>
<table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody>
<tr><td>start</td><td>Start date</td></tr>
<tr><td>end</td><td>End date</td></tr>
<tr><td>periods</td><td>int; number of periods to generate (the length of each period depends on freq, which defaults to one day)</td></tr>
<tr><td>freq</td><td>Frequency string, that is, the frequency at which dates are generated; see <a href="#t4">【02x02】Selected freq values</a></td></tr>
<tr><td>tz</td><td>Time zone, for example "Asia/Hong_Kong"</td></tr>
<tr><td>normalize</td><td>bool, default False; whether to normalize the start and end dates to midnight before generating the range (keeping only year, month, and day)</td></tr>
<tr><td>name</td><td>Name of the resulting DatetimeIndex</td></tr>
<tr><td>closed</td><td><code>None</code>: the default, keeps both the start date and the end date<br><code>'left'</code>: keeps the start date but not the end date<br><code>'right'</code>: keeps the end date but not the start date<br>(pandas 1.4 deprecated this parameter in favor of <code>inclusive</code>, and pandas 2.0 removed it)</td></tr></tbody></table>
<p>A simple example:</p><pre><code 
class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.date_range(start=<span class="hljs-string">&#x27;1/1/2018&#x27;</span>, end=<span class="hljs-string">&#x27;1/08/2018&#x27;</span>)DatetimeIndex([<span class="hljs-string">&#x27;2018-01-01&#x27;</span>, <span class="hljs-string">&#x27;2018-01-02&#x27;</span>, <span class="hljs-string">&#x27;2018-01-03&#x27;</span>, <span class="hljs-string">&#x27;2018-01-04&#x27;</span>,               <span class="hljs-string">&#x27;2018-01-05&#x27;</span>, <span class="hljs-string">&#x27;2018-01-06&#x27;</span>, <span class="hljs-string">&#x27;2018-01-07&#x27;</span>, <span class="hljs-string">&#x27;2018-01-08&#x27;</span>],              dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;D&#x27;</span>)</code></pre><p>指定 <code>periods</code> 参数：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.date_range(start=<span class="hljs-string">&#x27;2012-04-01&#x27;</span>, periods=<span class="hljs-number">20</span>)DatetimeIndex([<span class="hljs-string">&#x27;2012-04-01&#x27;</span>, <span class="hljs-string">&#x27;2012-04-02&#x27;</span>, <span class="hljs-string">&#x27;2012-04-03&#x27;</span>, <span class="hljs-string">&#x27;2012-04-04&#x27;</span>,               <span class="hljs-string">&#x27;2012-04-05&#x27;</span>, <span class="hljs-string">&#x27;2012-04-06&#x27;</span>, <span class="hljs-string">&#x27;2012-04-07&#x27;</span>, <span class="hljs-string">&#x27;2012-04-08&#x27;</span>,               <span class="hljs-string">&#x27;2012-04-09&#x27;</span>, <span class="hljs-string">&#x27;2012-04-10&#x27;</span>, <span class="hljs-string">&#x27;2012-04-11&#x27;</span>, <span 
class="hljs-string">&#x27;2012-04-12&#x27;</span>,               <span class="hljs-string">&#x27;2012-04-13&#x27;</span>, <span class="hljs-string">&#x27;2012-04-14&#x27;</span>, <span class="hljs-string">&#x27;2012-04-15&#x27;</span>, <span class="hljs-string">&#x27;2012-04-16&#x27;</span>,               <span class="hljs-string">&#x27;2012-04-17&#x27;</span>, <span class="hljs-string">&#x27;2012-04-18&#x27;</span>, <span class="hljs-string">&#x27;2012-04-19&#x27;</span>, <span class="hljs-string">&#x27;2012-04-20&#x27;</span>],              dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;D&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>pd.date_range(end=<span class="hljs-string">&#x27;2012-06-01&#x27;</span>, periods=<span class="hljs-number">20</span>)DatetimeIndex([<span class="hljs-string">&#x27;2012-05-13&#x27;</span>, <span class="hljs-string">&#x27;2012-05-14&#x27;</span>, <span class="hljs-string">&#x27;2012-05-15&#x27;</span>, <span class="hljs-string">&#x27;2012-05-16&#x27;</span>,               <span class="hljs-string">&#x27;2012-05-17&#x27;</span>, <span class="hljs-string">&#x27;2012-05-18&#x27;</span>, <span class="hljs-string">&#x27;2012-05-19&#x27;</span>, <span class="hljs-string">&#x27;2012-05-20&#x27;</span>,               <span class="hljs-string">&#x27;2012-05-21&#x27;</span>, <span class="hljs-string">&#x27;2012-05-22&#x27;</span>, <span class="hljs-string">&#x27;2012-05-23&#x27;</span>, <span class="hljs-string">&#x27;2012-05-24&#x27;</span>,               <span class="hljs-string">&#x27;2012-05-25&#x27;</span>, <span class="hljs-string">&#x27;2012-05-26&#x27;</span>, <span class="hljs-string">&#x27;2012-05-27&#x27;</span>, <span class="hljs-string">&#x27;2012-05-28&#x27;</span>,               <span class="hljs-string">&#x27;2012-05-29&#x27;</span>, <span class="hljs-string">&#x27;2012-05-30&#x27;</span>, <span 
class="hljs-string">&#x27;2012-05-31&#x27;</span>, <span class="hljs-string">&#x27;2012-06-01&#x27;</span>],              dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;D&#x27;</span>)&gt;&gt;&gt;<span class="hljs-meta">&gt;&gt;&gt; </span>pd.date_range(start=<span class="hljs-string">&#x27;2018-04-24&#x27;</span>, end=<span class="hljs-string">&#x27;2018-04-27&#x27;</span>, periods=<span class="hljs-number">3</span>)DatetimeIndex([<span class="hljs-string">&#x27;2018-04-24 00:00:00&#x27;</span>, <span class="hljs-string">&#x27;2018-04-25 12:00:00&#x27;</span>, <span class="hljs-string">&#x27;2018-04-27 00:00:00&#x27;</span>],              dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-literal">None</span>)&gt;&gt;&gt;<span class="hljs-meta">&gt;&gt;&gt; </span>pd.date_range(start=<span class="hljs-string">&#x27;2018-04-24&#x27;</span>, end=<span class="hljs-string">&#x27;2018-04-28&#x27;</span>, periods=<span class="hljs-number">3</span>)DatetimeIndex([<span class="hljs-string">&#x27;2018-04-24&#x27;</span>, <span class="hljs-string">&#x27;2018-04-26&#x27;</span>, <span class="hljs-string">&#x27;2018-04-28&#x27;</span>], dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-literal">None</span>)</code></pre><p>指定 <code>freq='M'</code> 会按照每月最后一个日历日的频率生成日期，指定 <code>freq='3M'</code> 会每隔3个月按照每月最后一个日历日的频率生成日期：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.date_range(start=<span class="hljs-string">&#x27;1/1/2018&#x27;</span>, periods=<span class="hljs-number">5</span>, freq=<span class="hljs-string">&#x27;M&#x27;</span>)DatetimeIndex([<span class="hljs-string">&#x27;2018-01-31&#x27;</span>, <span class="hljs-string">&#x27;2018-02-28&#x27;</span>, <span 
class="hljs-string">&#x27;2018-03-31&#x27;</span>, <span class="hljs-string">&#x27;2018-04-30&#x27;</span>,               <span class="hljs-string">&#x27;2018-05-31&#x27;</span>],              dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;M&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>pd.date_range(start=<span class="hljs-string">&#x27;1/1/2018&#x27;</span>, periods=<span class="hljs-number">5</span>, freq=<span class="hljs-string">&#x27;3M&#x27;</span>)DatetimeIndex([<span class="hljs-string">&#x27;2018-01-31&#x27;</span>, <span class="hljs-string">&#x27;2018-04-30&#x27;</span>, <span class="hljs-string">&#x27;2018-07-31&#x27;</span>, <span class="hljs-string">&#x27;2018-10-31&#x27;</span>,               <span class="hljs-string">&#x27;2019-01-31&#x27;</span>],              dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;3M&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span></code></pre><p>Use the <code>tz</code> parameter to set the time zone:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.date_range(start=<span class="hljs-string">&#x27;1/1/2018&#x27;</span>, periods=<span class="hljs-number">5</span>, tz=<span class="hljs-string">&#x27;Asia/Tokyo&#x27;</span>)DatetimeIndex([<span class="hljs-string">&#x27;2018-01-01 00:00:00+09:00&#x27;</span>, <span class="hljs-string">&#x27;2018-01-02 00:00:00+09:00&#x27;</span>,               <span class="hljs-string">&#x27;2018-01-03 00:00:00+09:00&#x27;</span>, <span class="hljs-string">&#x27;2018-01-04 00:00:00+09:00&#x27;</span>,               <span class="hljs-string">&#x27;2018-01-05 00:00:00+09:00&#x27;</span>],              dtype=<span class="hljs-string">&#x27;datetime64[ns, Asia/Tokyo]&#x27;</span>, 
freq=<span class="hljs-string">&#x27;D&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>pd.date_range(start=<span class="hljs-string">&#x27;6/24/2020&#x27;</span>, periods=<span class="hljs-number">5</span>, tz=<span class="hljs-string">&#x27;Asia/Hong_Kong&#x27;</span>)DatetimeIndex([<span class="hljs-string">&#x27;2020-06-24 00:00:00+08:00&#x27;</span>, <span class="hljs-string">&#x27;2020-06-25 00:00:00+08:00&#x27;</span>,               <span class="hljs-string">&#x27;2020-06-26 00:00:00+08:00&#x27;</span>, <span class="hljs-string">&#x27;2020-06-27 00:00:00+08:00&#x27;</span>,               <span class="hljs-string">&#x27;2020-06-28 00:00:00+08:00&#x27;</span>],              dtype=<span class="hljs-string">&#x27;datetime64[ns, Asia/Hong_Kong]&#x27;</span>, freq=<span class="hljs-string">&#x27;D&#x27;</span>)</code></pre><p>Set <code>normalize=True</code> to normalize the start and end dates to midnight before the range is generated:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.date_range(<span class="hljs-string">&#x27;2020-06-24 12:56:31&#x27;</span>, periods=<span class="hljs-number">5</span>, normalize=<span class="hljs-literal">True</span>)DatetimeIndex([<span class="hljs-string">&#x27;2020-06-24&#x27;</span>, <span class="hljs-string">&#x27;2020-06-25&#x27;</span>, <span class="hljs-string">&#x27;2020-06-26&#x27;</span>, <span class="hljs-string">&#x27;2020-06-27&#x27;</span>,               <span class="hljs-string">&#x27;2020-06-28&#x27;</span>],              dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;D&#x27;</span>)</code></pre><p>The <code>closed</code> parameter controls whether the endpoints are included: <code>None</code> keeps both, <code>'left'</code> drops the end date, and <code>'right'</code> drops the start date. pandas 1.4 deprecated <code>closed</code> in favor of <code>inclusive</code>, and pandas 2.0 removed it:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span 
class="hljs-meta">&gt;&gt;&gt; </span>pd.date_range(start=<span class="hljs-string">&#x27;2020-06-20&#x27;</span>, end=<span class="hljs-string">&#x27;2020-06-24&#x27;</span>, closed=<span class="hljs-literal">None</span>)DatetimeIndex([<span class="hljs-string">&#x27;2020-06-20&#x27;</span>, <span class="hljs-string">&#x27;2020-06-21&#x27;</span>, <span class="hljs-string">&#x27;2020-06-22&#x27;</span>, <span class="hljs-string">&#x27;2020-06-23&#x27;</span>,               <span class="hljs-string">&#x27;2020-06-24&#x27;</span>],              dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;D&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>pd.date_range(start=<span class="hljs-string">&#x27;2020-06-20&#x27;</span>, end=<span class="hljs-string">&#x27;2020-06-24&#x27;</span>, closed=<span class="hljs-string">&#x27;left&#x27;</span>)DatetimeIndex([<span class="hljs-string">&#x27;2020-06-20&#x27;</span>, <span class="hljs-string">&#x27;2020-06-21&#x27;</span>, <span class="hljs-string">&#x27;2020-06-22&#x27;</span>, <span class="hljs-string">&#x27;2020-06-23&#x27;</span>], dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;D&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>pd.date_range(start=<span class="hljs-string">&#x27;2020-06-20&#x27;</span>, end=<span class="hljs-string">&#x27;2020-06-24&#x27;</span>, closed=<span class="hljs-string">&#x27;right&#x27;</span>)DatetimeIndex([<span class="hljs-string">&#x27;2020-06-21&#x27;</span>, <span class="hljs-string">&#x27;2020-06-22&#x27;</span>, <span class="hljs-string">&#x27;2020-06-23&#x27;</span>, <span class="hljs-string">&#x27;2020-06-24&#x27;</span>], dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;D&#x27;</span>)</code></pre><h3><span 
id="02x05-suo-yin-yu-qie-pian"><font color="#4876FF">【02x05】Indexing and Slicing</font></span></h3><p>The most basic time series type in Pandas is a Series indexed by timestamps (usually given as Python strings or datetime objects). These datetime objects are actually stored in a DatetimeIndex, which can be indexed and sliced in the same way as a pandas.Series object:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>dates = [datetime(<span class="hljs-number">2011</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>), datetime(<span class="hljs-number">2011</span>, <span class="hljs-number">1</span>, <span class="hljs-number">5</span>),             datetime(<span class="hljs-number">2011</span>, <span class="hljs-number">1</span>, <span class="hljs-number">7</span>), datetime(<span class="hljs-number">2011</span>, <span class="hljs-number">1</span>, <span class="hljs-number">8</span>),             datetime(<span class="hljs-number">2011</span>, <span class="hljs-number">1</span>, <span class="hljs-number">10</span>), datetime(<span class="hljs-number">2011</span>, <span class="hljs-number">1</span>, <span class="hljs-number">12</span>)]<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(np.random.randn(<span class="hljs-number">6</span>), index=dates)<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj<span class="hljs-number">2011</span>-01-02   -<span class="hljs-number">0.407110</span><span class="hljs-number">2011</span>-01-05   -<span class="hljs-number">0.186661</span><span class="hljs-number">2011</span>-01-07   -<span class="hljs-number">0.731080</span><span class="hljs-number">2011</span>-01-08    <span class="hljs-number">0.860970</span><span class="hljs-number">2011</span>-01-<span class="hljs-number">10</span>    <span 
class="hljs-number">1.929973</span><span class="hljs-number">2011</span>-01-<span class="hljs-number">12</span>   -<span class="hljs-number">0.168599</span>dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.indexDatetimeIndex([<span class="hljs-string">&#x27;2011-01-02&#x27;</span>, <span class="hljs-string">&#x27;2011-01-05&#x27;</span>, <span class="hljs-string">&#x27;2011-01-07&#x27;</span>, <span class="hljs-string">&#x27;2011-01-08&#x27;</span>,               <span class="hljs-string">&#x27;2011-01-10&#x27;</span>, <span class="hljs-string">&#x27;2011-01-12&#x27;</span>],              dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-literal">None</span>)&gt;&gt;&gt;<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index[<span class="hljs-number">0</span>]Timestamp(<span class="hljs-string">&#x27;2011-01-02 00:00:00&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.index[<span class="hljs-number">0</span>:<span class="hljs-number">3</span>]DatetimeIndex([<span class="hljs-string">&#x27;2011-01-02&#x27;</span>, <span class="hljs-string">&#x27;2011-01-05&#x27;</span>, <span class="hljs-string">&#x27;2011-01-07&#x27;</span>], dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-literal">None</span>)</code></pre><p>You can also pass a string that can be interpreted as a date, or pass just a year or a year and month, to select a slice of the data:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(np.random.randn(<span class="hljs-number">1000</span>), index=pd.date_range(<span class="hljs-string">&#x27;1/1/2000&#x27;</span>, periods=<span 
class="hljs-number">1000</span>))<span class="hljs-meta">&gt;&gt;&gt; </span>obj<span class="hljs-number">2000</span>-01-01   -<span class="hljs-number">1.142284</span><span class="hljs-number">2000</span>-01-02    <span class="hljs-number">1.198785</span><span class="hljs-number">2000</span>-01-03    <span class="hljs-number">2.466909</span><span class="hljs-number">2000</span>-01-04   -<span class="hljs-number">0.086728</span><span class="hljs-number">2000</span>-01-05   -<span class="hljs-number">0.978437</span>                ...   <span class="hljs-number">2002</span>-09-<span class="hljs-number">22</span>   -<span class="hljs-number">0.252240</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">23</span>    <span class="hljs-number">0.148561</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">24</span>   -<span class="hljs-number">1.330409</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">25</span>   -<span class="hljs-number">0.673471</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">26</span>   -<span class="hljs-number">0.253271</span>Freq: D, Length: <span class="hljs-number">1000</span>, dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;26/9/2002&#x27;</span>]-<span class="hljs-number">0.25327100684233356</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;2002&#x27;</span>]<span class="hljs-number">2002</span>-01-01    <span class="hljs-number">1.058715</span><span class="hljs-number">2002</span>-01-02    <span class="hljs-number">0.900859</span><span class="hljs-number">2002</span>-01-03    <span class="hljs-number">1.993508</span><span class="hljs-number">2002</span>-01-04   -<span class="hljs-number">0.103211</span><span class="hljs-number">2002</span>-01-05   -<span class="hljs-number">0.950090</span> 
               ...   <span class="hljs-number">2002</span>-09-<span class="hljs-number">22</span>   -<span class="hljs-number">0.252240</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">23</span>    <span class="hljs-number">0.148561</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">24</span>   -<span class="hljs-number">1.330409</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">25</span>   -<span class="hljs-number">0.673471</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">26</span>   -<span class="hljs-number">0.253271</span>Freq: D, Length: <span class="hljs-number">269</span>, dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;2002-09&#x27;</span>]<span class="hljs-number">2002</span>-09-01   -<span class="hljs-number">0.995528</span><span class="hljs-number">2002</span>-09-02    <span class="hljs-number">0.501528</span><span class="hljs-number">2002</span>-09-03   -<span class="hljs-number">0.486753</span><span class="hljs-number">2002</span>-09-04   -<span class="hljs-number">1.083906</span><span class="hljs-number">2002</span>-09-05    <span class="hljs-number">1.458975</span><span class="hljs-number">2002</span>-09-06   -<span class="hljs-number">1.331685</span><span class="hljs-number">2002</span>-09-07    <span class="hljs-number">0.195338</span><span class="hljs-number">2002</span>-09-08   -<span class="hljs-number">0.429613</span><span class="hljs-number">2002</span>-09-09    <span class="hljs-number">1.125823</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">10</span>    <span class="hljs-number">1.607051</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">11</span>    <span class="hljs-number">0.530387</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">12</span>   -<span 
class="hljs-number">0.015938</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">13</span>    <span class="hljs-number">1.781043</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">14</span>   -<span class="hljs-number">0.277123</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">15</span>    <span class="hljs-number">0.344569</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">16</span>   -<span class="hljs-number">1.010810</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">17</span>    <span class="hljs-number">0.463001</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">18</span>    <span class="hljs-number">1.883636</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">19</span>    <span class="hljs-number">0.274520</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">20</span>    <span class="hljs-number">0.624184</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">21</span>   -<span class="hljs-number">1.203057</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">22</span>   -<span class="hljs-number">0.252240</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">23</span>    <span class="hljs-number">0.148561</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">24</span>   -<span class="hljs-number">1.330409</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">25</span>   -<span class="hljs-number">0.673471</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">26</span>   -<span class="hljs-number">0.253271</span>Freq: D, dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;20/9/2002&#x27;</span>:<span class="hljs-string">&#x27;26/9/2002&#x27;</span>]<span 
class="hljs-number">2002</span>-09-<span class="hljs-number">20</span>    <span class="hljs-number">0.624184</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">21</span>   -<span class="hljs-number">1.203057</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">22</span>   -<span class="hljs-number">0.252240</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">23</span>    <span class="hljs-number">0.148561</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">24</span>   -<span class="hljs-number">1.330409</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">25</span>   -<span class="hljs-number">0.673471</span><span class="hljs-number">2002</span>-09-<span class="hljs-number">26</span>   -<span class="hljs-number">0.253271</span>Freq: D, dtype: float64</code></pre><h3><span id="02x06-yi-dong-shu-ju-yu-shu-ju-pian-yi"><font color="#4876FF">【02x06】Shifting Data and Date Offsets</font></span></h3><p>Shifting means moving data forward or backward along the time axis. Both Series and DataFrame have a shift method that performs a plain forward or backward shift and leaves the index unchanged:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(np.random.randn(<span class="hljs-number">4</span>),            index=pd.date_range(<span class="hljs-string">&#x27;1/1/2000&#x27;</span>, periods=<span class="hljs-number">4</span>, freq=<span class="hljs-string">&#x27;M&#x27;</span>))<span class="hljs-meta">&gt;&gt;&gt; </span>obj<span class="hljs-number">2000</span>-01-<span class="hljs-number">31</span>   -<span class="hljs-number">0.100217</span><span class="hljs-number">2000</span>-02-<span class="hljs-number">29</span>    <span class="hljs-number">1.177834</span><span 
class="hljs-number">2000</span>-03-<span class="hljs-number">31</span>   -<span class="hljs-number">0.644353</span><span class="hljs-number">2000</span>-04-<span class="hljs-number">30</span>   -<span class="hljs-number">1.954679</span>Freq: M, dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.shift(<span class="hljs-number">2</span>)<span class="hljs-number">2000</span>-01-<span class="hljs-number">31</span>         NaN<span class="hljs-number">2000</span>-02-<span class="hljs-number">29</span>         NaN<span class="hljs-number">2000</span>-03-<span class="hljs-number">31</span>   -<span class="hljs-number">0.100217</span><span class="hljs-number">2000</span>-04-<span class="hljs-number">30</span>    <span class="hljs-number">1.177834</span>Freq: M, dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.shift(-<span class="hljs-number">2</span>)<span class="hljs-number">2000</span>-01-<span class="hljs-number">31</span>   -<span class="hljs-number">0.644353</span><span class="hljs-number">2000</span>-02-<span class="hljs-number">29</span>   -<span class="hljs-number">1.954679</span><span class="hljs-number">2000</span>-03-<span class="hljs-number">31</span>         NaN<span class="hljs-number">2000</span>-04-<span class="hljs-number">30</span>         NaNFreq: M, dtype: float64</code></pre><p>Because a plain shift does not modify the index, some data is discarded and NaN (missing values) is introduced. If the frequency is known, you can pass it to shift so that the timestamps are shifted instead of the data:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(np.random.randn(<span class="hljs-number">4</span>),            index=pd.date_range(<span 
class="hljs-string">&#x27;1/1/2000&#x27;</span>, periods=<span class="hljs-number">4</span>, freq=<span class="hljs-string">&#x27;M&#x27;</span>))<span class="hljs-meta">&gt;&gt;&gt; </span>obj<span class="hljs-number">2000</span>-01-<span class="hljs-number">31</span>   -<span class="hljs-number">0.100217</span><span class="hljs-number">2000</span>-02-<span class="hljs-number">29</span>    <span class="hljs-number">1.177834</span><span class="hljs-number">2000</span>-03-<span class="hljs-number">31</span>   -<span class="hljs-number">0.644353</span><span class="hljs-number">2000</span>-04-<span class="hljs-number">30</span>   -<span class="hljs-number">1.954679</span>Freq: M, dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.shift(<span class="hljs-number">2</span>, freq=<span class="hljs-string">&#x27;M&#x27;</span>)<span class="hljs-number">2000</span>-03-<span class="hljs-number">31</span>   -<span class="hljs-number">0.100217</span><span class="hljs-number">2000</span>-04-<span class="hljs-number">30</span>    <span class="hljs-number">1.177834</span><span class="hljs-number">2000</span>-05-<span class="hljs-number">31</span>   -<span class="hljs-number">0.644353</span><span class="hljs-number">2000</span>-06-<span class="hljs-number">30</span>   -<span class="hljs-number">1.954679</span>Freq: M, dtype: float64</code></pre><p>A frequency in Pandas is made up of a base frequency and a multiplier. The base frequency is usually written as a string alias, such as <code>&quot;M&quot;</code> for monthly or <code>&quot;H&quot;</code> for hourly. Each base frequency has a corresponding object called a date offset. For example, an hourly frequency can be represented with the <code>Hour</code> class:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pandas.tseries.offsets <span class="hljs-keyword">import</span> Hour, Minute<span class="hljs-meta">&gt;&gt;&gt; </span>hour = Hour()<span class="hljs-meta">&gt;&gt;&gt; </span>hour&lt;Hour&gt;<span class="hljs-meta">&gt;&gt;&gt; </span><span 
class="hljs-meta">&gt;&gt;&gt; </span>four_hours = Hour(<span class="hljs-number">4</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>four_hours&lt;<span class="hljs-number">4</span> * Hours&gt;</code></pre><p>In most cases you do not need to create these objects explicitly; a string alias such as <code>&quot;H&quot;</code> or <code>&quot;4H&quot;</code> is enough. Putting an integer in front of the base frequency creates a multiple:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.date_range(<span class="hljs-string">&#x27;2000-01-01&#x27;</span>, <span class="hljs-string">&#x27;2000-01-03 23:59&#x27;</span>, freq=<span class="hljs-string">&#x27;4h&#x27;</span>)DatetimeIndex([<span class="hljs-string">&#x27;2000-01-01 00:00:00&#x27;</span>, <span class="hljs-string">&#x27;2000-01-01 04:00:00&#x27;</span>,               <span class="hljs-string">&#x27;2000-01-01 08:00:00&#x27;</span>, <span class="hljs-string">&#x27;2000-01-01 12:00:00&#x27;</span>,               <span class="hljs-string">&#x27;2000-01-01 16:00:00&#x27;</span>, <span class="hljs-string">&#x27;2000-01-01 20:00:00&#x27;</span>,               <span class="hljs-string">&#x27;2000-01-02 00:00:00&#x27;</span>, <span class="hljs-string">&#x27;2000-01-02 04:00:00&#x27;</span>,               <span class="hljs-string">&#x27;2000-01-02 08:00:00&#x27;</span>, <span class="hljs-string">&#x27;2000-01-02 12:00:00&#x27;</span>,               <span class="hljs-string">&#x27;2000-01-02 16:00:00&#x27;</span>, <span class="hljs-string">&#x27;2000-01-02 20:00:00&#x27;</span>,               <span class="hljs-string">&#x27;2000-01-03 00:00:00&#x27;</span>, <span class="hljs-string">&#x27;2000-01-03 04:00:00&#x27;</span>,               <span class="hljs-string">&#x27;2000-01-03 08:00:00&#x27;</span>, <span class="hljs-string">&#x27;2000-01-03 12:00:00&#x27;</span>,               <span class="hljs-string">&#x27;2000-01-03 16:00:00&#x27;</span>, <span 
class="hljs-string">&#x27;2000-01-03 20:00:00&#x27;</span>],              dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;4H&#x27;</span>)</code></pre><p>Most offset objects can be combined by addition:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pandas.tseries.offsets <span class="hljs-keyword">import</span> Hour, Minute<span class="hljs-meta">&gt;&gt;&gt; </span>Hour(<span class="hljs-number">2</span>) + Minute(<span class="hljs-number">30</span>)&lt;<span class="hljs-number">150</span> * Minutes&gt;</code></pre><p>The <code>freq</code> parameter also accepts a frequency string (such as <code>&quot;2h30min&quot;</code>), which is parsed efficiently into the equivalent expression:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.date_range(<span class="hljs-string">&#x27;2000-01-01&#x27;</span>, periods=<span class="hljs-number">10</span>, freq=<span class="hljs-string">&#x27;1h30min&#x27;</span>)DatetimeIndex([<span class="hljs-string">&#x27;2000-01-01 00:00:00&#x27;</span>, <span class="hljs-string">&#x27;2000-01-01 01:30:00&#x27;</span>,               <span class="hljs-string">&#x27;2000-01-01 03:00:00&#x27;</span>, <span class="hljs-string">&#x27;2000-01-01 04:30:00&#x27;</span>,               <span class="hljs-string">&#x27;2000-01-01 06:00:00&#x27;</span>, <span class="hljs-string">&#x27;2000-01-01 07:30:00&#x27;</span>,               <span class="hljs-string">&#x27;2000-01-01 09:00:00&#x27;</span>, <span class="hljs-string">&#x27;2000-01-01 10:30:00&#x27;</span>,               <span class="hljs-string">&#x27;2000-01-01 12:00:00&#x27;</span>, <span class="hljs-string">&#x27;2000-01-01 13:30:00&#x27;</span>],              dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;90T&#x27;</span>)</code></pre><p>These offsets can also be applied to datetime 
or Timestamp objects:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pandas.tseries.offsets <span class="hljs-keyword">import</span> Day, MonthEnd<span class="hljs-meta">&gt;&gt;&gt; </span>now = datetime(<span class="hljs-number">2011</span>, <span class="hljs-number">11</span>, <span class="hljs-number">17</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>now + <span class="hljs-number">3</span> * Day()Timestamp(<span class="hljs-string">&#x27;2011-11-20 00:00:00&#x27;</span>)</code></pre><p>If you add an anchored offset such as MonthEnd, the first increment rolls the date forward to the next date that matches the frequency rule:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pandas.tseries.offsets <span class="hljs-keyword">import</span> Day, MonthEnd<span class="hljs-meta">&gt;&gt;&gt; </span>now = datetime(<span class="hljs-number">2011</span>, <span class="hljs-number">11</span>, <span class="hljs-number">17</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>now + MonthEnd()Timestamp(<span class="hljs-string">&#x27;2011-11-30 00:00:00&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>now + MonthEnd(<span class="hljs-number">2</span>)Timestamp(<span class="hljs-string">&#x27;2011-12-31 00:00:00&#x27;</span>)</code></pre><p>The rollforward and rollback methods of an anchored offset roll a date forward or backward explicitly:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pandas.tseries.offsets <span class="hljs-keyword">import</span> Day, MonthEnd<span class="hljs-meta">&gt;&gt;&gt; </span>now = datetime(<span class="hljs-number">2011</span>, <span class="hljs-number">11</span>, <span class="hljs-number">17</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>offset = MonthEnd()<span class="hljs-meta">&gt;&gt;&gt; </span>offset.rollforward(now)Timestamp(<span class="hljs-string">&#x27;2011-11-30 00:00:00&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; 
</span>offset.rollback(now)Timestamp(<span class="hljs-string">&#x27;2011-10-31 00:00:00&#x27;</span>)</code></pre><p>Passing <code>rollforward</code> to the <code>groupby</code> method groups the values by month end:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pandas.tseries.offsets <span class="hljs-keyword">import</span> Day, MonthEnd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(np.random.randn(<span class="hljs-number">20</span>),            index=pd.date_range(<span class="hljs-string">&#x27;1/15/2000&#x27;</span>, periods=<span class="hljs-number">20</span>, freq=<span class="hljs-string">&#x27;4d&#x27;</span>))<span class="hljs-meta">&gt;&gt;&gt; </span>obj<span class="hljs-number">2000</span>-01-<span class="hljs-number">15</span>   -<span class="hljs-number">0.591729</span><span class="hljs-number">2000</span>-01-<span class="hljs-number">19</span>   -<span class="hljs-number">0.775844</span><span class="hljs-number">2000</span>-01-<span class="hljs-number">23</span>   -<span class="hljs-number">0.745603</span><span class="hljs-number">2000</span>-01-<span class="hljs-number">27</span>   -<span class="hljs-number">0.076439</span><span class="hljs-number">2000</span>-01-<span class="hljs-number">31</span>    <span class="hljs-number">1.796417</span><span class="hljs-number">2000</span>-02-04   -<span class="hljs-number">0.500349</span><span class="hljs-number">2000</span>-02-08    <span class="hljs-number">0.515851</span><span class="hljs-number">2000</span>-02-<span class="hljs-number">12</span>   -<span class="hljs-number">0.344171</span><span class="hljs-number">2000</span>-02-<span class="hljs-number">16</span>    <span class="hljs-number">0.419657</span><span 
class="hljs-number">2000</span>-02-<span class="hljs-number">20</span>    <span class="hljs-number">0.307288</span><span class="hljs-number">2000</span>-02-<span class="hljs-number">24</span>    <span class="hljs-number">0.115113</span><span class="hljs-number">2000</span>-02-<span class="hljs-number">28</span>   -<span class="hljs-number">0.362585</span><span class="hljs-number">2000</span>-03-03    <span class="hljs-number">1.074892</span><span class="hljs-number">2000</span>-03-07    <span class="hljs-number">1.111366</span><span class="hljs-number">2000</span>-03-<span class="hljs-number">11</span>    <span class="hljs-number">0.949910</span><span class="hljs-number">2000</span>-03-<span class="hljs-number">15</span>   -<span class="hljs-number">1.535727</span><span class="hljs-number">2000</span>-03-<span class="hljs-number">19</span>    <span class="hljs-number">0.545944</span><span class="hljs-number">2000</span>-03-<span class="hljs-number">23</span>   -<span class="hljs-number">0.810139</span><span class="hljs-number">2000</span>-03-<span class="hljs-number">27</span>   -<span class="hljs-number">1.260627</span><span class="hljs-number">2000</span>-03-<span class="hljs-number">31</span>   -<span class="hljs-number">0.128403</span>Freq: 4D, dtype: float64&gt;&gt;&gt;<span class="hljs-meta">&gt;&gt;&gt; </span>offset = MonthEnd()<span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(offset.rollforward).mean()<span class="hljs-number">2000</span>-01-<span class="hljs-number">31</span>   -<span class="hljs-number">0.078640</span><span class="hljs-number">2000</span>-02-<span class="hljs-number">29</span>    <span class="hljs-number">0.021543</span><span class="hljs-number">2000</span>-03-<span class="hljs-number">31</span>   -<span class="hljs-number">0.006598</span>dtype: float64</code></pre><h3><span id="02x07-shi-qu-chu-li"><font color="#4876FF">【02x07】Time Zone Handling</font></span></h3><p>In Python, time zone information is commonly provided by the third-party library pytz. The <code>pytz.common_timezones</code> 
list contains the commonly used time zone names, and the <code>pytz.timezone</code> function returns a time zone object from pytz:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pytz<span class="hljs-meta">&gt;&gt;&gt; </span>pytz.common_timezones[<span class="hljs-string">&#x27;Africa/Abidjan&#x27;</span>, <span class="hljs-string">&#x27;Africa/Accra&#x27;</span>, <span class="hljs-string">&#x27;Africa/Addis_Ababa&#x27;</span>, ..., <span class="hljs-string">&#x27;UTC&#x27;</span>]&gt;&gt;&gt;<span class="hljs-meta">&gt;&gt;&gt; </span>tz = pytz.timezone(<span class="hljs-string">&#x27;Asia/Shanghai&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>tz&lt;DstTzInfo <span class="hljs-string">&#x27;Asia/Shanghai&#x27;</span> LMT+<span class="hljs-number">8</span>:06:<span class="hljs-number">00</span> STD&gt;  <span class="hljs-comment"># LMT (local mean time) is 8 h 6 min ahead of UTC; the standard offset is UTC+8</span></code></pre><p>In the <code>date_range</code> method, the <code>tz</code> parameter specifies the time zone and defaults to None. A time series without a time zone can be localized with the <code>tz_localize</code> method; the example below localizes a time-zone-naive series to UTC:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>rng = pd.date_range(<span class="hljs-string">&#x27;3/9/2012 9:30&#x27;</span>, periods=<span class="hljs-number">6</span>, freq=<span class="hljs-string">&#x27;D&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>ts = pd.Series(np.random.randn(<span class="hljs-built_in">len</span>(rng)), index=rng)<span class="hljs-meta">&gt;&gt;&gt; </span>ts<span class="hljs-number">2012</span>-03-09 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>   -<span class="hljs-number">1.527913</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">10</span> 09:<span 
class="hljs-number">30</span>:<span class="hljs-number">00</span>   -<span class="hljs-number">1.116101</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">11</span> 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>    <span class="hljs-number">0.359358</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">12</span> 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>   -<span class="hljs-number">0.475920</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">13</span> 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>   -<span class="hljs-number">0.336570</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">14</span> 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>   -<span class="hljs-number">1.075952</span>Freq: D, dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-built_in">print</span>(ts.index.tz)<span class="hljs-literal">None</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>ts_utc = ts.tz_localize(<span class="hljs-string">&#x27;UTC&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>ts_utc<span class="hljs-number">2012</span>-03-09 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>+<span class="hljs-number">00</span>:<span class="hljs-number">00</span>   -<span class="hljs-number">1.527913</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">10</span> 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>+<span class="hljs-number">00</span>:<span class="hljs-number">00</span>   -<span class="hljs-number">1.116101</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">11</span> 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>+<span 
class="hljs-number">00</span>:<span class="hljs-number">00</span>    <span class="hljs-number">0.359358</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">12</span> 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>+<span class="hljs-number">00</span>:<span class="hljs-number">00</span>   -<span class="hljs-number">0.475920</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">13</span> 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>+<span class="hljs-number">00</span>:<span class="hljs-number">00</span>   -<span class="hljs-number">0.336570</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">14</span> 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>+<span class="hljs-number">00</span>:<span class="hljs-number">00</span>   -<span class="hljs-number">1.075952</span>Freq: D, dtype: float64&gt;&gt;&gt;<span class="hljs-meta">&gt;&gt;&gt; </span>ts_utc.indexDatetimeIndex([<span class="hljs-string">&#x27;2012-03-09 09:30:00+00:00&#x27;</span>, <span class="hljs-string">&#x27;2012-03-10 09:30:00+00:00&#x27;</span>,               <span class="hljs-string">&#x27;2012-03-11 09:30:00+00:00&#x27;</span>, <span class="hljs-string">&#x27;2012-03-12 09:30:00+00:00&#x27;</span>,               <span class="hljs-string">&#x27;2012-03-13 09:30:00+00:00&#x27;</span>, <span class="hljs-string">&#x27;2012-03-14 09:30:00+00:00&#x27;</span>],              dtype=<span class="hljs-string">&#x27;datetime64[ns, UTC]&#x27;</span>, freq=<span class="hljs-string">&#x27;D&#x27;</span>)</code></pre><p>时间序列被本地化到某个特定时区后，就可以用 <code>tz_convert</code> 方法将其转换到别的时区了：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span 
class="hljs-meta">&gt;&gt;&gt; </span>rng = pd.date_range(<span class="hljs-string">&#x27;3/9/2012 9:30&#x27;</span>, periods=<span class="hljs-number">6</span>, freq=<span class="hljs-string">&#x27;D&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>ts = pd.Series(np.random.randn(<span class="hljs-built_in">len</span>(rng)), index=rng)<span class="hljs-meta">&gt;&gt;&gt; </span>ts<span class="hljs-number">2012</span>-03-09 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>    <span class="hljs-number">0.480303</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">10</span> 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>   -<span class="hljs-number">1.461039</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">11</span> 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>   -<span class="hljs-number">1.512749</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">12</span> 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>   -<span class="hljs-number">2.185421</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">13</span> 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>    <span class="hljs-number">1.657845</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">14</span> 09:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>    <span class="hljs-number">0.175633</span>Freq: D, dtype: float64&gt;&gt;&gt;<span class="hljs-meta">&gt;&gt;&gt; </span>ts.tz_localize(<span class="hljs-string">&#x27;UTC&#x27;</span>).tz_convert(<span class="hljs-string">&#x27;Asia/Shanghai&#x27;</span>)<span class="hljs-number">2012</span>-03-09 <span class="hljs-number">17</span>:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>+08:<span class="hljs-number">00</span>    <span class="hljs-number">0.480303</span><span 
class="hljs-number">2012</span>-03-<span class="hljs-number">10</span> <span class="hljs-number">17</span>:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>+08:<span class="hljs-number">00</span>   -<span class="hljs-number">1.461039</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">11</span> <span class="hljs-number">17</span>:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>+08:<span class="hljs-number">00</span>   -<span class="hljs-number">1.512749</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">12</span> <span class="hljs-number">17</span>:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>+08:<span class="hljs-number">00</span>   -<span class="hljs-number">2.185421</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">13</span> <span class="hljs-number">17</span>:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>+08:<span class="hljs-number">00</span>    <span class="hljs-number">1.657845</span><span class="hljs-number">2012</span>-03-<span class="hljs-number">14</span> <span class="hljs-number">17</span>:<span class="hljs-number">30</span>:<span class="hljs-number">00</span>+08:<span class="hljs-number">00</span>    <span class="hljs-number">0.175633</span>Freq: D, dtype: float64</code></pre><hr><h2><span id="03x00-period-gu-ding-shi-qi"><font color="#FF0000">【03x00】period 固定时期</font></span></h2><h3><span id="03x01-pandas-period"><font 
color="#4876FF">【03x01】pandas.Period</font></span></h3><p>A period represents a span of time, such as a number of days, months, quarters, or years. The Period class represents this data type; its constructor takes a string or an integer.</p><p>Basic syntax:</p><pre><code class="hljs python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">pandas</span>.<span class="hljs-title">Period</span>(<span class="hljs-params">value=<span class="hljs-literal">None</span>, freq=<span class="hljs-literal">None</span>, ordinal=<span class="hljs-literal">None</span>, </span></span>
<span class="hljs-params"><span class="hljs-class">year=<span class="hljs-literal">None</span>, month=<span class="hljs-literal">None</span>, quarter=<span class="hljs-literal">None</span>, </span></span>
<span class="hljs-params"><span class="hljs-class">day=<span class="hljs-literal">None</span>, hour=<span class="hljs-literal">None</span>, minute=<span class="hljs-literal">None</span>, second=<span class="hljs-literal">None</span></span>)</span></code></pre><p>Official documentation: <a href="https://pandas.pydata.org/docs/reference/api/pandas.Period.html">https://pandas.pydata.org/docs/reference/api/pandas.Period.html</a></p><p>Common parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>value</td><td>The time period to represent</td></tr><tr><td>freq</td><td>The frequency of the period; a str or a date offset object. For values, see <a href="#t4">【02x02】Partial list of freq values</a></td></tr></tbody></table><p>In the example below, the Period object represents the whole span from January 1, 2020 through December 31, 2020:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Period(<span class="hljs-number">2020</span>, freq=<span class="hljs-string">&#x27;A-DEC&#x27;</span>)
Period(<span class="hljs-string">&#x27;2020&#x27;</span>, <span class="hljs-string">&#x27;A-DEC&#x27;</span>)</code></pre><p>Adding or subtracting an integer shifts the period by that many units of its frequency:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span 
class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Period(<span class="hljs-number">2020</span>, freq=<span class="hljs-string">&#x27;A-DEC&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>objPeriod(<span class="hljs-string">&#x27;2020&#x27;</span>, <span class="hljs-string">&#x27;A-DEC&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj + <span class="hljs-number">5</span>Period(<span class="hljs-string">&#x27;2025&#x27;</span>, <span class="hljs-string">&#x27;A-DEC&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj - <span class="hljs-number">5</span>Period(<span class="hljs-string">&#x27;2015&#x27;</span>, <span class="hljs-string">&#x27;A-DEC&#x27;</span>)</code></pre><p>PeriodIndex 类保存了一组 Period，它可以在任何 pandas 数据结构中被用作轴索引：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>rng = [pd.Period(<span class="hljs-string">&#x27;2000-01&#x27;</span>), pd.Period(<span class="hljs-string">&#x27;2000-02&#x27;</span>), pd.Period(<span class="hljs-string">&#x27;2000-03&#x27;</span>),            pd.Period(<span class="hljs-string">&#x27;2000-04&#x27;</span>), pd.Period(<span class="hljs-string">&#x27;2000-05&#x27;</span>), pd.Period(<span class="hljs-string">&#x27;2000-06&#x27;</span>)]<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(np.random.randn(<span class="hljs-number">6</span>), index=rng)<span class="hljs-meta">&gt;&gt;&gt; </span>obj<span class="hljs-number">2000</span>-01    <span class="hljs-number">0.229092</span><span class="hljs-number">2000</span>-02    <span class="hljs-number">1.515498</span><span 
class="hljs-number">2000</span>-03   -<span class="hljs-number">0.334401</span><span class="hljs-number">2000</span>-04   -<span class="hljs-number">0.492681</span><span class="hljs-number">2000</span>-05   -<span class="hljs-number">2.012818</span><span class="hljs-number">2000</span>-06    <span class="hljs-number">0.338804</span>Freq: M, dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.indexPeriodIndex([<span class="hljs-string">&#x27;2000-01&#x27;</span>, <span class="hljs-string">&#x27;2000-02&#x27;</span>, <span class="hljs-string">&#x27;2000-03&#x27;</span>, <span class="hljs-string">&#x27;2000-04&#x27;</span>, <span class="hljs-string">&#x27;2000-05&#x27;</span>, <span class="hljs-string">&#x27;2000-06&#x27;</span>], dtype=<span class="hljs-string">&#x27;period[M]&#x27;</span>, freq=<span class="hljs-string">&#x27;M&#x27;</span>)</code></pre><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>values = [<span class="hljs-string">&#x27;2001Q3&#x27;</span>, <span class="hljs-string">&#x27;2002Q2&#x27;</span>, <span class="hljs-string">&#x27;2003Q1&#x27;</span>]<span class="hljs-meta">&gt;&gt;&gt; </span>index = pd.PeriodIndex(values, freq=<span class="hljs-string">&#x27;Q-DEC&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>indexPeriodIndex([<span class="hljs-string">&#x27;2001Q3&#x27;</span>, <span class="hljs-string">&#x27;2002Q2&#x27;</span>, <span class="hljs-string">&#x27;2003Q1&#x27;</span>], dtype=<span class="hljs-string">&#x27;period[Q-DEC]&#x27;</span>, freq=<span class="hljs-string">&#x27;Q-DEC&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span></code></pre><h3><span id="03x02-period-range"><font color="#4876FF">【03x02】period_range</font></span></h3><p><code>pandas.period_range</code> 方法可根据指定的频率生成指定长度的 
PeriodIndex。</p><p>基本语法：</p><p><code>pandas.period_range(start=None, end=None, periods=None, freq=None, name=None) → pandas.core.indexes.period.PeriodIndex</code></p><p>官方文档：<a href="https://pandas.pydata.org/docs/reference/api/pandas.period_range.html">https://pandas.pydata.org/docs/reference/api/pandas.period_range.html</a></p><p>常用参数：</p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>start</td><td>起始日期</td></tr><tr><td>end</td><td>结束日期</td></tr><tr><td>periods</td><td>要生成的时段数</td></tr><tr><td>freq</td><td>时间戳将具有的偏移量，可以是 str，日期偏移量类型，取值参见<a href="#t4">【02x02】freq 频率部分取值</a></td></tr><tr><td>name</td><td>结果 PeriodIndex 对象名称</td></tr></tbody></table><p>简单应用：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.period_range(start=<span class="hljs-string">&#x27;2019-01-01&#x27;</span>, end=<span class="hljs-string">&#x27;2020-01-01&#x27;</span>, freq=<span class="hljs-string">&#x27;M&#x27;</span>)PeriodIndex([<span class="hljs-string">&#x27;2019-01&#x27;</span>, <span class="hljs-string">&#x27;2019-02&#x27;</span>, <span class="hljs-string">&#x27;2019-03&#x27;</span>, <span class="hljs-string">&#x27;2019-04&#x27;</span>, <span class="hljs-string">&#x27;2019-05&#x27;</span>, <span class="hljs-string">&#x27;2019-06&#x27;</span>,             <span class="hljs-string">&#x27;2019-07&#x27;</span>, <span class="hljs-string">&#x27;2019-08&#x27;</span>, <span class="hljs-string">&#x27;2019-09&#x27;</span>, <span class="hljs-string">&#x27;2019-10&#x27;</span>, <span class="hljs-string">&#x27;2019-11&#x27;</span>, <span class="hljs-string">&#x27;2019-12&#x27;</span>,             <span class="hljs-string">&#x27;2020-01&#x27;</span>],            dtype=<span class="hljs-string">&#x27;period[M]&#x27;</span>, freq=<span class="hljs-string">&#x27;M&#x27;</span>)&gt;&gt;&gt;<span 
class="hljs-meta">&gt;&gt;&gt; </span>pd.period_range(start=pd.Period(<span class="hljs-string">&#x27;2017Q1&#x27;</span>, freq=<span class="hljs-string">&#x27;Q&#x27;</span>),                end=pd.Period(<span class="hljs-string">&#x27;2017Q2&#x27;</span>, freq=<span class="hljs-string">&#x27;Q&#x27;</span>), freq=<span class="hljs-string">&#x27;M&#x27;</span>)PeriodIndex([<span class="hljs-string">&#x27;2017-03&#x27;</span>, <span class="hljs-string">&#x27;2017-04&#x27;</span>, <span class="hljs-string">&#x27;2017-05&#x27;</span>, <span class="hljs-string">&#x27;2017-06&#x27;</span>], dtype=<span class="hljs-string">&#x27;period[M]&#x27;</span>, freq=<span class="hljs-string">&#x27;M&#x27;</span>)</code></pre><h3><span id="03x03-asfreq-shi-qi-pin-lu-zhuan-huan"><font color="#4876FF">【03x03】asfreq 时期频率转换</font></span></h3><p>Period 和 PeriodIndex 对象都可以通过 asfreq 方法被转换成别的频率。</p><p>基本语法：<code>PeriodIndex.asfreq(self, *args, **kwargs)</code></p><p>常用参数：</p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>freq</td><td>新的频率（偏移量），取值参见<a href="#t4">【02x02】freq 频率部分取值</a></td></tr><tr><td>how</td><td>按照开始或者结束对齐，<code>'E'</code> or <code>'END'</code> or <code>'FINISH'</code>；<code>'S'</code> or <code>'START'</code> or <code>'BEGIN'</code></td></tr></tbody></table><p>应用示例：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pidx = pd.period_range(<span class="hljs-string">&#x27;2010-01-01&#x27;</span>, <span class="hljs-string">&#x27;2015-01-01&#x27;</span>, freq=<span class="hljs-string">&#x27;A&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>pidxPeriodIndex([<span class="hljs-string">&#x27;2010&#x27;</span>, <span class="hljs-string">&#x27;2011&#x27;</span>, <span class="hljs-string">&#x27;2012&#x27;</span>, <span class="hljs-string">&#x27;2013&#x27;</span>, <span 
class="hljs-string">&#x27;2014&#x27;</span>, <span class="hljs-string">&#x27;2015&#x27;</span>], dtype=<span class="hljs-string">&#x27;period[A-DEC]&#x27;</span>, freq=<span class="hljs-string">&#x27;A-DEC&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pidx.asfreq(<span class="hljs-string">&#x27;M&#x27;</span>)
PeriodIndex([<span class="hljs-string">&#x27;2010-12&#x27;</span>, <span class="hljs-string">&#x27;2011-12&#x27;</span>, <span class="hljs-string">&#x27;2012-12&#x27;</span>, <span class="hljs-string">&#x27;2013-12&#x27;</span>, <span class="hljs-string">&#x27;2014-12&#x27;</span>, <span class="hljs-string">&#x27;2015-12&#x27;</span>], dtype=<span class="hljs-string">&#x27;period[M]&#x27;</span>, freq=<span class="hljs-string">&#x27;M&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pidx.asfreq(<span class="hljs-string">&#x27;M&#x27;</span>, how=<span class="hljs-string">&#x27;S&#x27;</span>)
PeriodIndex([<span class="hljs-string">&#x27;2010-01&#x27;</span>, <span class="hljs-string">&#x27;2011-01&#x27;</span>, <span class="hljs-string">&#x27;2012-01&#x27;</span>, <span class="hljs-string">&#x27;2013-01&#x27;</span>, <span class="hljs-string">&#x27;2014-01&#x27;</span>, <span class="hljs-string">&#x27;2015-01&#x27;</span>], dtype=<span class="hljs-string">&#x27;period[M]&#x27;</span>, freq=<span class="hljs-string">&#x27;M&#x27;</span>)</code></pre><h3><span id="03x04-to-period-yu-to-timestamp"><font color="#4876FF">【03x04】to_period and to_timestamp</font></span></h3><p>The <code>to_period</code> method converts timestamps (Timestamp) to periods (Period);</p><p>the <code>to_timestamp</code> method converts periods (Period) back to timestamps (Timestamp).</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>rng = pd.date_range(<span 
class="hljs-string">&#x27;2000-01-01&#x27;</span>, periods=<span class="hljs-number">3</span>, freq=<span class="hljs-string">&#x27;M&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>ts = pd.Series(np.random.randn(<span class="hljs-number">3</span>), index=rng)<span class="hljs-meta">&gt;&gt;&gt; </span>ts<span class="hljs-number">2000</span>-01-<span class="hljs-number">31</span>    <span class="hljs-number">0.220759</span><span class="hljs-number">2000</span>-02-<span class="hljs-number">29</span>   -<span class="hljs-number">0.108221</span><span class="hljs-number">2000</span>-03-<span class="hljs-number">31</span>    <span class="hljs-number">0.819433</span>Freq: M, dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>pts = ts.to_period()<span class="hljs-meta">&gt;&gt;&gt; </span>pts<span class="hljs-number">2000</span>-01    <span class="hljs-number">0.220759</span><span class="hljs-number">2000</span>-02   -<span class="hljs-number">0.108221</span><span class="hljs-number">2000</span>-03    <span class="hljs-number">0.819433</span>Freq: M, dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>pts2 = pts.to_timestamp()<span class="hljs-meta">&gt;&gt;&gt; </span>pts2<span class="hljs-number">2000</span>-01-01    <span class="hljs-number">0.220759</span><span class="hljs-number">2000</span>-02-01   -<span class="hljs-number">0.108221</span><span class="hljs-number">2000</span>-03-01    <span class="hljs-number">0.819433</span>Freq: MS, dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>ts.indexDatetimeIndex([<span class="hljs-string">&#x27;2000-01-31&#x27;</span>, <span class="hljs-string">&#x27;2000-02-29&#x27;</span>, <span class="hljs-string">&#x27;2000-03-31&#x27;</span>], dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;M&#x27;</span>)<span 
class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>pts.indexPeriodIndex([<span class="hljs-string">&#x27;2000-01&#x27;</span>, <span class="hljs-string">&#x27;2000-02&#x27;</span>, <span class="hljs-string">&#x27;2000-03&#x27;</span>], dtype=<span class="hljs-string">&#x27;period[M]&#x27;</span>, freq=<span class="hljs-string">&#x27;M&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>pts2.indexDatetimeIndex([<span class="hljs-string">&#x27;2000-01-01&#x27;</span>, <span class="hljs-string">&#x27;2000-02-01&#x27;</span>, <span class="hljs-string">&#x27;2000-03-01&#x27;</span>], dtype=<span class="hljs-string">&#x27;datetime64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;MS&#x27;</span>)</code></pre><h2><span id="04x00-timedelta-shi-jian-jian-ge"><font color="#FF0000">【04x00】timedelta 时间间隔</font></span></h2><h3><span id="04x01-pandas-timedelta"><font color="#4876FF">【04x01】pandas.Timedelta</font></span></h3><p>Timedelta 表示持续时间，即两个日期或时间之间的差。</p><p>Timedelta 相当于 Python 的 datetime.timedelta，在大多数情况下两者可以互换。</p><p>基本语法：<code>class pandas.Timedelta(value=&lt;object object&gt;, unit=None, **kwargs)</code></p><p>官方文档：<a href="https://pandas.pydata.org/docs/reference/api/pandas.Timedelta.html">https://pandas.pydata.org/docs/reference/api/pandas.Timedelta.html</a></p><p>常用参数：</p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>value</td><td>传入的值，可以是 Timedelta，timedelta，np.timedelta64，string 或 integer 对象</td></tr><tr><td>unit</td><td>用于设置 value 的单位，具体取值参见官方文档</td></tr></tbody></table><p>表示两个 datetime 对象之间的时间差：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.to_datetime(<span class="hljs-string">&#x27;2020-6-24&#x27;</span>) - pd.to_datetime(<span class="hljs-string">&#x27;2016-1-1&#x27;</span>)Timedelta(<span 
class="hljs-string">&#x27;1636 days 00:00:00&#x27;</span>)</code></pre><p>通过字符串传递参数：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Timedelta(<span class="hljs-string">&#x27;3 days 3 hours 3 minutes 30 seconds&#x27;</span>)Timedelta(<span class="hljs-string">&#x27;3 days 03:03:30&#x27;</span>)</code></pre><p>通过整数传递参数：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Timedelta(<span class="hljs-number">5</span>,unit=<span class="hljs-string">&#x27;h&#x27;</span>)Timedelta(<span class="hljs-string">&#x27;0 days 05:00:00&#x27;</span>)</code></pre><p>获取属性：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Timedelta(<span class="hljs-string">&#x27;3 days 3 hours 3 minutes 30 seconds&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>objTimedelta(<span class="hljs-string">&#x27;3 days 03:03:30&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.days<span class="hljs-number">3</span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.seconds<span class="hljs-number">11010</span></code></pre><h3><span id="04x02-to-timedelta"><font color="#4876FF">【04x02】to_timedelta</font></span></h3><p>to_timedelta 方法可以将传入的对象转换成 timedelta 对象。</p><p>基本语法：<code>pandas.to_timedelta(arg, unit='ns', errors='raise')</code></p><p>官方文档：<a 
href="https://pandas.pydata.org/docs/reference/api/pandas.to_timedelta.html">https://pandas.pydata.org/docs/reference/api/pandas.to_timedelta.html</a></p><p>Common parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>arg</td><td>The object to convert to a timedelta; can be a str, timedelta, list-like, or Series</td></tr><tr><td>unit</td><td>The unit of arg when it is numeric; see the official documentation for the accepted values</td></tr><tr><td>errors</td><td>How to handle input that cannot be parsed as a timedelta<br><code>ignore</code>: no exception is raised and the original input is returned; <code>raise</code>: invalid parsing raises an exception (default); <code>coerce</code>: invalid parsing is set to NaT</td></tr></tbody></table><p>Parse a single string into a Timedelta object:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.to_timedelta(<span class="hljs-string">&#x27;1 days 06:05:01.00003&#x27;</span>)
Timedelta(<span class="hljs-string">&#x27;1 days 06:05:01.000030&#x27;</span>)
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.to_timedelta(<span class="hljs-string">&#x27;15.5us&#x27;</span>)
Timedelta(<span class="hljs-string">&#x27;0 days 00:00:00.000015&#x27;</span>)</code></pre><p>Parse a list or array of strings into a TimedeltaIndex:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.to_timedelta([<span class="hljs-string">&#x27;1 days 06:05:01.00003&#x27;</span>, <span class="hljs-string">&#x27;15.5us&#x27;</span>, <span class="hljs-string">&#x27;nan&#x27;</span>])
TimedeltaIndex([<span class="hljs-string">&#x27;1 days 06:05:01.000030&#x27;</span>, <span class="hljs-string">&#x27;0 days 00:00:00.000015&#x27;</span>, NaT], dtype=<span class="hljs-string">&#x27;timedelta64[ns]&#x27;</span>, freq=<span class="hljs-literal">None</span>)</code></pre><p>Specify the <code>unit</code> parameter:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span 
class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.to_timedelta(np.arange(<span class="hljs-number">5</span>), unit=<span class="hljs-string">&#x27;s&#x27;</span>)
TimedeltaIndex([<span class="hljs-string">&#x27;00:00:00&#x27;</span>, <span class="hljs-string">&#x27;00:00:01&#x27;</span>, <span class="hljs-string">&#x27;00:00:02&#x27;</span>, <span class="hljs-string">&#x27;00:00:03&#x27;</span>, <span class="hljs-string">&#x27;00:00:04&#x27;</span>], dtype=<span class="hljs-string">&#x27;timedelta64[ns]&#x27;</span>, freq=<span class="hljs-literal">None</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.to_timedelta(np.arange(<span class="hljs-number">5</span>), unit=<span class="hljs-string">&#x27;d&#x27;</span>)
TimedeltaIndex([<span class="hljs-string">&#x27;0 days&#x27;</span>, <span class="hljs-string">&#x27;1 days&#x27;</span>, <span class="hljs-string">&#x27;2 days&#x27;</span>, <span class="hljs-string">&#x27;3 days&#x27;</span>, <span class="hljs-string">&#x27;4 days&#x27;</span>], dtype=<span class="hljs-string">&#x27;timedelta64[ns]&#x27;</span>, freq=<span class="hljs-literal">None</span>)</code></pre><h3><span id="04x03-timedelta-range"><font color="#4876FF">【04x03】timedelta_range</font></span></h3><p>The <code>timedelta_range</code> function generates a TimedeltaIndex of a given length at a given frequency.</p><p>Basic syntax:</p><pre><code class="hljs python">pandas.timedelta_range(start=<span class="hljs-literal">None</span>, end=<span class="hljs-literal">None</span>, periods=<span class="hljs-literal">None</span>,
                       freq=<span class="hljs-literal">None</span>, name=<span class="hljs-literal">None</span>, closed=<span class="hljs-literal">None</span>) → pandas.core.indexes.timedeltas.TimedeltaIndex</code></pre><p>Official documentation: <a 
href="https://pandas.pydata.org/docs/reference/api/pandas.timedelta_range.html">https://pandas.pydata.org/docs/reference/api/pandas.timedelta_range.html</a></p><p>Common parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>start</td><td>Start of the range, a timedelta-like value such as <code>'1 day'</code></td></tr><tr><td>end</td><td>End of the range, a timedelta-like value</td></tr><tr><td>periods</td><td>int, the number of periods to generate</td></tr><tr><td>freq</td><td>Frequency string, the interval at which values are generated. For values, see <a href="#t4">【02x02】Partial list of freq values</a></td></tr><tr><td>name</td><td>Name of the resulting TimedeltaIndex</td></tr><tr><td>closed</td><td><code>None</code>: default, keeps both the start and the end value<br><code>'left'</code>: keeps the start value and drops the end value<br><code>'right'</code>: keeps the end value and drops the start value</td></tr></tbody></table><p>Example:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.timedelta_range(start=<span class="hljs-string">&#x27;1 day&#x27;</span>, periods=<span class="hljs-number">4</span>)
TimedeltaIndex([<span class="hljs-string">&#x27;1 days&#x27;</span>, <span class="hljs-string">&#x27;2 days&#x27;</span>, <span class="hljs-string">&#x27;3 days&#x27;</span>, <span class="hljs-string">&#x27;4 days&#x27;</span>], dtype=<span class="hljs-string">&#x27;timedelta64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;D&#x27;</span>)</code></pre><p>The closed parameter specifies which endpoint is kept; by default both endpoints are kept:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.timedelta_range(start=<span class="hljs-string">&#x27;1 day&#x27;</span>, periods=<span class="hljs-number">4</span>, closed=<span class="hljs-string">&#x27;right&#x27;</span>)
TimedeltaIndex([<span class="hljs-string">&#x27;2 days&#x27;</span>, <span class="hljs-string">&#x27;3 days&#x27;</span>, <span class="hljs-string">&#x27;4 days&#x27;</span>], dtype=<span 
class="hljs-string">&#x27;timedelta64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;D&#x27;</span>)</code></pre><p>freq 参数指定 TimedeltaIndex 的频率。只接受固定频率，非固定频率如 <code>'M'</code> 将会报错：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>pd.timedelta_range(start=<span class="hljs-string">&#x27;1 day&#x27;</span>, end=<span class="hljs-string">&#x27;2 days&#x27;</span>, freq=<span class="hljs-string">&#x27;6H&#x27;</span>)TimedeltaIndex([<span class="hljs-string">&#x27;1 days 00:00:00&#x27;</span>, <span class="hljs-string">&#x27;1 days 06:00:00&#x27;</span>, <span class="hljs-string">&#x27;1 days 12:00:00&#x27;</span>,                <span class="hljs-string">&#x27;1 days 18:00:00&#x27;</span>, <span class="hljs-string">&#x27;2 days 00:00:00&#x27;</span>],               dtype=<span class="hljs-string">&#x27;timedelta64[ns]&#x27;</span>, freq=<span class="hljs-string">&#x27;6H&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>pd.timedelta_range(start=<span class="hljs-string">&#x27;1 day&#x27;</span>, end=<span class="hljs-string">&#x27;2 days&#x27;</span>, freq=<span class="hljs-string">&#x27;M&#x27;</span>)Traceback (most recent call last):...ValueError: &lt;MonthEnd&gt; <span class="hljs-keyword">is</span> a non-fixed frequency</code></pre><h2><span id="05x00-chong-cai-yang-ji-pin-lu-zhuan-huan"><font color="#FF0000">【05x00】重采样及频率转换</font></span></h2><p>重采样（resampling）指的是将时间序列从一个频率转换到另一个频率的处理过程。将高频率数据聚合到低频率称为降采样（downsampling），而将低频率数据转换到高频率则称为升采样（upsampling）。并不是所有的重采样都能被划分到这两个大类中。例如，将 W-WED（每周三）转换为 W-FRI 既不是降采样也不是升采样。</p><p>Pandas 中提供了 resample 方法来帮助我们实现重采样。Pandas 对象都带有一个 resample 方法，它是各种频率转换工作的主力函数。</p><p>基本语法：</p><pre><code class="hljs python">Series.resample(self, rule, axis=<span class="hljs-number">0</span>,                 closed: <span 
class="hljs-type">Union</span>[<span class="hljs-built_in">str</span>, NoneType] = <span class="hljs-literal">None</span>,
                 label: <span class="hljs-type">Union</span>[<span class="hljs-built_in">str</span>, NoneType] = <span class="hljs-literal">None</span>,
                 convention: <span class="hljs-built_in">str</span> = <span class="hljs-string">&#x27;start&#x27;</span>,
                 kind: <span class="hljs-type">Union</span>[<span class="hljs-built_in">str</span>, NoneType] = <span class="hljs-literal">None</span>,
                 loffset=<span class="hljs-literal">None</span>, base: <span class="hljs-built_in">int</span> = <span class="hljs-number">0</span>,
                 on=<span class="hljs-literal">None</span>, level=<span class="hljs-literal">None</span>)</code></pre><pre><code class="hljs python">DataFrame.resample(self, rule, axis=<span class="hljs-number">0</span>,
                    closed: <span class="hljs-type">Union</span>[<span class="hljs-built_in">str</span>, NoneType] = <span class="hljs-literal">None</span>,
                    label: <span class="hljs-type">Union</span>[<span class="hljs-built_in">str</span>, NoneType] = <span class="hljs-literal">None</span>,
                    convention: <span class="hljs-built_in">str</span> = <span class="hljs-string">&#x27;start&#x27;</span>,
                    kind: <span class="hljs-type">Union</span>[<span class="hljs-built_in">str</span>, NoneType] = <span class="hljs-literal">None</span>,
                    loffset=<span class="hljs-literal">None</span>, base: <span class="hljs-built_in">int</span> = <span class="hljs-number">0</span>,
                    on=<span class="hljs-literal">None</span>, level=<span class="hljs-literal">None</span>)</code></pre><p>Common parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>rule</td><td>Offset string or object representing the target frequency, e.g. <code>'3T'</code></td></tr><tr><td>axis</td><td>Axis to resample along, default 0</td></tr><tr><td>closed</td><td>Which side of each bin interval is closed (inclusive).<br>The default is <code>'left'</code> for all frequencies except <code>'M'</code>, <code>'A'</code>, <code>'Q'</code>, <code>'BM'</code>, <code>'BA'</code>, <code>'BQ'</code> and <code>'W'</code>, which default to <code>'right'</code></td></tr><tr><td>label</td><td>Which bin edge is used to label the aggregated value, <code>'right'</code> or <code>'left'</code>; default None.<br>For example, the 5 minutes from 9:30 to 9:35 can be labeled either 9:30 or 9:35</td></tr><tr><td>convention</td><td>For PeriodIndex only: whether to use the start or the end of each period when resampling, <code>'start'</code> or <code>'s'</code>, <code>'end'</code> or <code>'e'</code></td></tr><tr><td>on</td><td>For a DataFrame, the column to resample on instead of the index; it becomes the row index of the resampled data</td></tr><tr><td>level</td><td>For a DataFrame with a MultiIndex, the level on which to resample</td></tr></tbody></table><p>Downsample the series into 3-minute bins and sum the values in each bin:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>index = pd.date_range(<span class="hljs-string">&#x27;1/1/2000&#x27;</span>, periods=<span class="hljs-number">9</span>, freq=<span class="hljs-string">&#x27;T&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>series = pd.Series(<span class="hljs-built_in">range</span>(<span class="hljs-number">9</span>), index=index)
<span class="hljs-meta">&gt;&gt;&gt; </span>series
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>    <span class="hljs-number">0</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:01:<span class="hljs-number">00</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:02:<span class="hljs-number">00</span>    <span class="hljs-number">2</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:03:<span class="hljs-number">00</span>    <span class="hljs-number">3</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:04:<span 
class="hljs-number">00</span>    <span class="hljs-number">4</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:05:<span class="hljs-number">00</span>    <span class="hljs-number">5</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:06:<span class="hljs-number">00</span>    <span class="hljs-number">6</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:07:<span class="hljs-number">00</span>    <span class="hljs-number">7</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:08:<span class="hljs-number">00</span>    <span class="hljs-number">8</span>
Freq: T, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>series.resample(<span class="hljs-string">&#x27;3T&#x27;</span>).<span class="hljs-built_in">sum</span>()
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>     <span class="hljs-number">3</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:03:<span class="hljs-number">00</span>    <span class="hljs-number">12</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:06:<span class="hljs-number">00</span>    <span class="hljs-number">21</span>
Freq: 3T, dtype: int64</code></pre><p>Set <code>label='right'</code> so that each bin is labeled with its right (later) edge:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>index = pd.date_range(<span class="hljs-string">&#x27;1/1/2000&#x27;</span>, periods=<span class="hljs-number">9</span>, freq=<span class="hljs-string">&#x27;T&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>series = pd.Series(<span class="hljs-built_in">range</span>(<span 
class="hljs-number">9</span>), index=index)
<span class="hljs-meta">&gt;&gt;&gt; </span>series
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>    <span class="hljs-number">0</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:01:<span class="hljs-number">00</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:02:<span class="hljs-number">00</span>    <span class="hljs-number">2</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:03:<span class="hljs-number">00</span>    <span class="hljs-number">3</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:04:<span class="hljs-number">00</span>    <span class="hljs-number">4</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:05:<span class="hljs-number">00</span>    <span class="hljs-number">5</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:06:<span class="hljs-number">00</span>    <span class="hljs-number">6</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:07:<span class="hljs-number">00</span>    <span class="hljs-number">7</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:08:<span class="hljs-number">00</span>    <span class="hljs-number">8</span>
Freq: T, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>series.resample(<span class="hljs-string">&#x27;3T&#x27;</span>, label=<span class="hljs-string">&#x27;right&#x27;</span>).<span class="hljs-built_in">sum</span>()
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:03:<span class="hljs-number">00</span>     <span class="hljs-number">3</span>
<span 
class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:06:<span class="hljs-number">00</span>    <span class="hljs-number">12</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:09:<span class="hljs-number">00</span>    <span class="hljs-number">21</span>
Freq: 3T, dtype: int64</code></pre><p>Set <code>closed='right'</code> so that the right edge of each bin is inclusive, i.e. a value that falls exactly on a bin boundary is counted in the bin ending there:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>index = pd.date_range(<span class="hljs-string">&#x27;1/1/2000&#x27;</span>, periods=<span class="hljs-number">9</span>, freq=<span class="hljs-string">&#x27;T&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>series = pd.Series(<span class="hljs-built_in">range</span>(<span class="hljs-number">9</span>), index=index)
<span class="hljs-meta">&gt;&gt;&gt; </span>series
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>    <span class="hljs-number">0</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:01:<span class="hljs-number">00</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:02:<span class="hljs-number">00</span>    <span class="hljs-number">2</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:03:<span class="hljs-number">00</span>    <span class="hljs-number">3</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:04:<span class="hljs-number">00</span>    <span class="hljs-number">4</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:05:<span class="hljs-number">00</span>    <span class="hljs-number">5</span>
<span class="hljs-number">2000</span>-01-01 
<span class="hljs-number">00</span>:06:<span class="hljs-number">00</span>    <span class="hljs-number">6</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:07:<span class="hljs-number">00</span>    <span class="hljs-number">7</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:08:<span class="hljs-number">00</span>    <span class="hljs-number">8</span>
Freq: T, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>series.resample(<span class="hljs-string">&#x27;3T&#x27;</span>, label=<span class="hljs-string">&#x27;right&#x27;</span>, closed=<span class="hljs-string">&#x27;right&#x27;</span>).<span class="hljs-built_in">sum</span>()
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>     <span class="hljs-number">0</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:03:<span class="hljs-number">00</span>     <span class="hljs-number">6</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:06:<span class="hljs-number">00</span>    <span class="hljs-number">15</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:09:<span class="hljs-number">00</span>    <span class="hljs-number">15</span>
Freq: 3T, dtype: int64</code></pre><p>The following example upsamples the series to a 30-second frequency; <code>asfreq()[0:5]</code> selects the first 5 rows:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>index = pd.date_range(<span class="hljs-string">&#x27;1/1/2000&#x27;</span>, periods=<span class="hljs-number">9</span>, freq=<span class="hljs-string">&#x27;T&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>series = pd.Series(<span 
class="hljs-built_in">range</span>(<span class="hljs-number">9</span>), index=index)
<span class="hljs-meta">&gt;&gt;&gt; </span>series
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>    <span class="hljs-number">0</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:01:<span class="hljs-number">00</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:02:<span class="hljs-number">00</span>    <span class="hljs-number">2</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:03:<span class="hljs-number">00</span>    <span class="hljs-number">3</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:04:<span class="hljs-number">00</span>    <span class="hljs-number">4</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:05:<span class="hljs-number">00</span>    <span class="hljs-number">5</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:06:<span class="hljs-number">00</span>    <span class="hljs-number">6</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:07:<span class="hljs-number">00</span>    <span class="hljs-number">7</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:08:<span class="hljs-number">00</span>    <span class="hljs-number">8</span>
Freq: T, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>series.resample(<span class="hljs-string">&#x27;30S&#x27;</span>).asfreq()[<span class="hljs-number">0</span>:<span class="hljs-number">5</span>]
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>    <span 
class="hljs-number">0.0</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">30</span>    NaN
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:01:<span class="hljs-number">00</span>    <span class="hljs-number">1.0</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:01:<span class="hljs-number">30</span>    NaN
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:02:<span class="hljs-number">00</span>    <span class="hljs-number">2.0</span>
Freq: 30S, dtype: float64</code></pre><p>Use the <code>pad</code> method to forward-fill the missing values (NaN), so that each gap takes the value that precedes it (<code>pad</code> is the older name of <code>ffill</code>; recent pandas releases provide only <code>ffill</code>):</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>index = pd.date_range(<span class="hljs-string">&#x27;1/1/2000&#x27;</span>, periods=<span class="hljs-number">9</span>, freq=<span class="hljs-string">&#x27;T&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>series = pd.Series(<span class="hljs-built_in">range</span>(<span class="hljs-number">9</span>), index=index)
<span class="hljs-meta">&gt;&gt;&gt; </span>series
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>    <span class="hljs-number">0</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:01:<span class="hljs-number">00</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:02:<span class="hljs-number">00</span>    <span class="hljs-number">2</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:03:<span class="hljs-number">00</span>    <span class="hljs-number">3</span>
<span class="hljs-number">2000</span>-01-01 <span 
class="hljs-number">00</span>:04:<span class="hljs-number">00</span>    <span class="hljs-number">4</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:05:<span class="hljs-number">00</span>    <span class="hljs-number">5</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:06:<span class="hljs-number">00</span>    <span class="hljs-number">6</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:07:<span class="hljs-number">00</span>    <span class="hljs-number">7</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:08:<span class="hljs-number">00</span>    <span class="hljs-number">8</span>
Freq: T, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>series.resample(<span class="hljs-string">&#x27;30S&#x27;</span>).pad()[<span class="hljs-number">0</span>:<span class="hljs-number">5</span>]
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>    <span class="hljs-number">0</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">30</span>    <span class="hljs-number">0</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:01:<span class="hljs-number">00</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:01:<span class="hljs-number">30</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:02:<span class="hljs-number">00</span>    <span class="hljs-number">2</span>
Freq: 30S, dtype: int64</code></pre><p>Use the <code>bfill</code> method to backward-fill the missing values (NaN), so that each gap takes the value that follows it:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span 
class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>index = pd.date_range(<span class="hljs-string">&#x27;1/1/2000&#x27;</span>, periods=<span class="hljs-number">9</span>, freq=<span class="hljs-string">&#x27;T&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>series = pd.Series(<span class="hljs-built_in">range</span>(<span class="hljs-number">9</span>), index=index)
<span class="hljs-meta">&gt;&gt;&gt; </span>series
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>    <span class="hljs-number">0</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:01:<span class="hljs-number">00</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:02:<span class="hljs-number">00</span>    <span class="hljs-number">2</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:03:<span class="hljs-number">00</span>    <span class="hljs-number">3</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:04:<span class="hljs-number">00</span>    <span class="hljs-number">4</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:05:<span class="hljs-number">00</span>    <span class="hljs-number">5</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:06:<span class="hljs-number">00</span>    <span class="hljs-number">6</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:07:<span class="hljs-number">00</span>    <span class="hljs-number">7</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:08:<span class="hljs-number">00</span>    <span class="hljs-number">8</span>
Freq: T, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; 
</span><span class="hljs-meta">&gt;&gt;&gt; </span>series.resample(<span class="hljs-string">&#x27;30S&#x27;</span>).bfill()[<span class="hljs-number">0</span>:<span class="hljs-number">5</span>]
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>    <span class="hljs-number">0</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">30</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:01:<span class="hljs-number">00</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:01:<span class="hljs-number">30</span>    <span class="hljs-number">2</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:02:<span class="hljs-number">00</span>    <span class="hljs-number">2</span>
Freq: 30S, dtype: int64</code></pre><p>Pass a custom function via the <code>apply</code> method:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>index = pd.date_range(<span class="hljs-string">&#x27;1/1/2000&#x27;</span>, periods=<span class="hljs-number">9</span>, freq=<span class="hljs-string">&#x27;T&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>series = pd.Series(<span class="hljs-built_in">range</span>(<span class="hljs-number">9</span>), index=index)
<span class="hljs-meta">&gt;&gt;&gt; </span>series
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>    <span class="hljs-number">0</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:01:<span 
class="hljs-number">00</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:02:<span class="hljs-number">00</span>    <span class="hljs-number">2</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:03:<span class="hljs-number">00</span>    <span class="hljs-number">3</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:04:<span class="hljs-number">00</span>    <span class="hljs-number">4</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:05:<span class="hljs-number">00</span>    <span class="hljs-number">5</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:06:<span class="hljs-number">00</span>    <span class="hljs-number">6</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:07:<span class="hljs-number">00</span>    <span class="hljs-number">7</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:08:<span class="hljs-number">00</span>    <span class="hljs-number">8</span>
Freq: T, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">custom_resampler</span>(<span class="hljs-params">array_like</span>):</span>
    <span class="hljs-keyword">return</span> np.<span class="hljs-built_in">sum</span>(array_like) + <span class="hljs-number">5</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>series.resample(<span class="hljs-string">&#x27;3T&#x27;</span>).apply(custom_resampler)
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>     <span class="hljs-number">8</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:03:<span 
class="hljs-number">00</span>    <span class="hljs-number">17</span>
<span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:06:<span class="hljs-number">00</span>    <span class="hljs-number">26</span>
Freq: 3T, dtype: int64</code></pre><p>Using the convention parameter. When a PeriodIndex is upsampled, it decides whether each value is placed at the start or at the end of its original period:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>s = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], index=pd.period_range(<span class="hljs-string">&#x27;2012-01-01&#x27;</span>, freq=<span class="hljs-string">&#x27;A&#x27;</span>, periods=<span class="hljs-number">2</span>))
<span class="hljs-meta">&gt;&gt;&gt; </span>s
<span class="hljs-number">2012</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2013</span>    <span class="hljs-number">2</span>
Freq: A-DEC, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>s.resample(<span class="hljs-string">&#x27;Q&#x27;</span>, convention=<span class="hljs-string">&#x27;start&#x27;</span>).asfreq()
2012Q1    <span class="hljs-number">1.0</span>
2012Q2    NaN
2012Q3    NaN
2012Q4    NaN
2013Q1    <span class="hljs-number">2.0</span>
2013Q2    NaN
2013Q3    NaN
2013Q4    NaN
Freq: Q-DEC, dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>s.resample(<span class="hljs-string">&#x27;Q&#x27;</span>, convention=<span class="hljs-string">&#x27;end&#x27;</span>).asfreq()
2012Q4    <span class="hljs-number">1.0</span>
2013Q1    NaN
2013Q2    NaN
2013Q3    NaN
2013Q4    <span class="hljs-number">2.0</span>
Freq: Q-DEC, dtype: float64</code></pre><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>q = 
pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>], index=pd.period_range(<span class="hljs-string">&#x27;2018-01-01&#x27;</span>, freq=<span class="hljs-string">&#x27;Q&#x27;</span>, periods=<span class="hljs-number">4</span>))
<span class="hljs-meta">&gt;&gt;&gt; </span>q
2018Q1    <span class="hljs-number">1</span>
2018Q2    <span class="hljs-number">2</span>
2018Q3    <span class="hljs-number">3</span>
2018Q4    <span class="hljs-number">4</span>
Freq: Q-DEC, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>q.resample(<span class="hljs-string">&#x27;M&#x27;</span>, convention=<span class="hljs-string">&#x27;end&#x27;</span>).asfreq()
<span class="hljs-number">2018</span>-03    <span class="hljs-number">1.0</span>
<span class="hljs-number">2018</span>-04    NaN
<span class="hljs-number">2018</span>-05    NaN
<span class="hljs-number">2018</span>-06    <span class="hljs-number">2.0</span>
<span class="hljs-number">2018</span>-07    NaN
<span class="hljs-number">2018</span>-08    NaN
<span class="hljs-number">2018</span>-09    <span class="hljs-number">3.0</span>
<span class="hljs-number">2018</span>-<span class="hljs-number">10</span>    NaN
<span class="hljs-number">2018</span>-<span class="hljs-number">11</span>    NaN
<span class="hljs-number">2018</span>-<span class="hljs-number">12</span>    <span class="hljs-number">4.0</span>
Freq: M, dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>q.resample(<span class="hljs-string">&#x27;M&#x27;</span>, convention=<span class="hljs-string">&#x27;start&#x27;</span>).asfreq()
<span class="hljs-number">2018</span>-01    <span class="hljs-number">1.0</span>
<span class="hljs-number">2018</span>-02    NaN
<span class="hljs-number">2018</span>-03    NaN
<span class="hljs-number">2018</span>-04    <span class="hljs-number">2.0</span>
<span 
class="hljs-number">2018</span>-05    NaN
<span class="hljs-number">2018</span>-06    NaN
<span class="hljs-number">2018</span>-07    <span class="hljs-number">3.0</span>
<span class="hljs-number">2018</span>-08    NaN
<span class="hljs-number">2018</span>-09    NaN
<span class="hljs-number">2018</span>-<span class="hljs-number">10</span>    <span class="hljs-number">4.0</span>
<span class="hljs-number">2018</span>-<span class="hljs-number">11</span>    NaN
<span class="hljs-number">2018</span>-<span class="hljs-number">12</span>    NaN
Freq: M, dtype: float64</code></pre><p>For a DataFrame, the on keyword selects a column of the original data to resample on; that column becomes the row index of the resampled data:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>d = <span class="hljs-built_in">dict</span>(&#123;<span class="hljs-string">&#x27;price&#x27;</span>: [<span class="hljs-number">10</span>, <span class="hljs-number">11</span>, <span class="hljs-number">9</span>, <span class="hljs-number">13</span>, <span class="hljs-number">14</span>, <span class="hljs-number">18</span>, <span class="hljs-number">17</span>, <span class="hljs-number">19</span>],
            <span class="hljs-string">&#x27;volume&#x27;</span>: [<span class="hljs-number">50</span>, <span class="hljs-number">60</span>, <span class="hljs-number">40</span>, <span class="hljs-number">100</span>, <span class="hljs-number">50</span>, <span class="hljs-number">100</span>, <span class="hljs-number">40</span>, <span class="hljs-number">50</span>]&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>df = pd.DataFrame(d)
<span class="hljs-meta">&gt;&gt;&gt; </span>df[<span class="hljs-string">&#x27;week_starting&#x27;</span>] = pd.date_range(<span class="hljs-string">&#x27;01/01/2018&#x27;</span>, periods=<span class="hljs-number">8</span>, freq=<span class="hljs-string">&#x27;W&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>df
   price  
volume week_starting
<span class="hljs-number">0</span>     <span class="hljs-number">10</span>      <span class="hljs-number">50</span>    <span class="hljs-number">2018</span>-01-07
<span class="hljs-number">1</span>     <span class="hljs-number">11</span>      <span class="hljs-number">60</span>    <span class="hljs-number">2018</span>-01-<span class="hljs-number">14</span>
<span class="hljs-number">2</span>      <span class="hljs-number">9</span>      <span class="hljs-number">40</span>    <span class="hljs-number">2018</span>-01-<span class="hljs-number">21</span>
<span class="hljs-number">3</span>     <span class="hljs-number">13</span>     <span class="hljs-number">100</span>    <span class="hljs-number">2018</span>-01-<span class="hljs-number">28</span>
<span class="hljs-number">4</span>     <span class="hljs-number">14</span>      <span class="hljs-number">50</span>    <span class="hljs-number">2018</span>-02-04
<span class="hljs-number">5</span>     <span class="hljs-number">18</span>     <span class="hljs-number">100</span>    <span class="hljs-number">2018</span>-02-<span class="hljs-number">11</span>
<span class="hljs-number">6</span>     <span class="hljs-number">17</span>      <span class="hljs-number">40</span>    <span class="hljs-number">2018</span>-02-<span class="hljs-number">18</span>
<span class="hljs-number">7</span>     <span class="hljs-number">19</span>      <span class="hljs-number">50</span>    <span class="hljs-number">2018</span>-02-<span class="hljs-number">25</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>df.resample(<span class="hljs-string">&#x27;M&#x27;</span>, on=<span class="hljs-string">&#x27;week_starting&#x27;</span>).mean()
               price  volume
week_starting               
<span class="hljs-number">2018</span>-01-<span class="hljs-number">31</span>     <span class="hljs-number">10.75</span>    <span class="hljs-number">62.5</span>
<span class="hljs-number">2018</span>-02-<span 
class="hljs-number">28</span>     <span class="hljs-number">17.00</span>    <span class="hljs-number">60.0</span></code></pre><p>For a DataFrame with a hierarchical index (MultiIndex), the <code>level</code> keyword specifies which level to resample on:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>days = pd.date_range(<span class="hljs-string">&#x27;1/1/2000&#x27;</span>, periods=<span class="hljs-number">4</span>, freq=<span class="hljs-string">&#x27;D&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>d2 = <span class="hljs-built_in">dict</span>(&#123;<span class="hljs-string">&#x27;price&#x27;</span>: [<span class="hljs-number">10</span>, <span class="hljs-number">11</span>, <span class="hljs-number">9</span>, <span class="hljs-number">13</span>, <span class="hljs-number">14</span>, <span class="hljs-number">18</span>, <span class="hljs-number">17</span>, <span class="hljs-number">19</span>],
              <span class="hljs-string">&#x27;volume&#x27;</span>: [<span class="hljs-number">50</span>, <span class="hljs-number">60</span>, <span class="hljs-number">40</span>, <span class="hljs-number">100</span>, <span class="hljs-number">50</span>, <span class="hljs-number">100</span>, <span class="hljs-number">40</span>, <span class="hljs-number">50</span>]&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>df2 = pd.DataFrame(d2, index=pd.MultiIndex.from_product([days, [<span class="hljs-string">&#x27;morning&#x27;</span>, <span class="hljs-string">&#x27;afternoon&#x27;</span>]]))
<span class="hljs-meta">&gt;&gt;&gt; </span>df2
                      price  volume
<span class="hljs-number">2000</span>-01-01 morning       <span class="hljs-number">10</span>      <span class="hljs-number">50</span>
           afternoon     <span class="hljs-number">11</span>      <span class="hljs-number">60</span>
<span class="hljs-number">2000</span>-01-02 morning      
  <span class="hljs-number">9</span>      <span class="hljs-number">40</span>           afternoon     <span class="hljs-number">13</span>     <span class="hljs-number">100</span><span class="hljs-number">2000</span>-01-03 morning       <span class="hljs-number">14</span>      <span class="hljs-number">50</span>           afternoon     <span class="hljs-number">18</span>     <span class="hljs-number">100</span><span class="hljs-number">2000</span>-01-04 morning       <span class="hljs-number">17</span>      <span class="hljs-number">40</span>           afternoon     <span class="hljs-number">19</span>      <span class="hljs-number">50</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>df2.resample(<span class="hljs-string">&#x27;D&#x27;</span>, level=<span class="hljs-number">0</span>).<span class="hljs-built_in">sum</span>()            price  volume<span class="hljs-number">2000</span>-01-01     <span class="hljs-number">21</span>     <span class="hljs-number">110</span><span class="hljs-number">2000</span>-01-02     <span class="hljs-number">22</span>     <span class="hljs-number">140</span><span class="hljs-number">2000</span>-01-03     <span class="hljs-number">32</span>     <span class="hljs-number">150</span><span class="hljs-number">2000</span>-01-04     <span class="hljs-number">36</span>      <span class="hljs-number">90</span></code></pre><hr><pre><code class="hljs yaml"><span class="hljs-string">这里是一段物理防爬虫文本，请读者忽略。</span><span class="hljs-string">本文原创首发于</span> <span class="hljs-string">CSDN，作者</span> <span class="hljs-string">ITBOB。</span><span class="hljs-string">博客首页：https://itrhx.blog.csdn.net/</span><span class="hljs-string">本文链接：https://itrhx.blog.csdn.net/article/details/106947061</span><span class="hljs-string">未经授权，禁止转载！恶意转载，后果自负！尊重原创，远离剽窃！</span></code></pre><hr>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;文章目录&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-01x00-shi-jian-xu-lie-font&quot;&gt;&lt;font color=&quot;#</summary>
      
    
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    <category term="Pandas" scheme="https://www.itbob.cn/tags/Pandas/"/>
    
  </entry>
  
  <entry>
    <title>The Python Data Analysis Trio, Pandas (Part 8): Data Reshaping, Duplicate Handling, and Data Replacement</title>
    <link href="https://www.itbob.cn/article/032/"/>
    <id>https://www.itbob.cn/article/032/</id>
    <published>2020-06-22T12:59:15.000Z</published>
    <updated>2022-05-22T12:43:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">Table of Contents</span></h2><!-- toc --><ul><li><a href="#font-color-ff0000-01x00-shu-ju-chong-su-font"><font color="#FF0000">【01x00】Data Reshaping</font></a><ul><li><a href="#font-color-4876ff-01x01-stack-font"><font color="#4876FF">【01x01】stack</font></a></li><li><a href="#font-color-4876ff-01x02-unstack-font"><font color="#4876FF">【01x02】unstack</font></a></li></ul></li><li><a href="#font-color-ff0000-02x00-chong-fu-shu-ju-chu-li-font"><font color="#FF0000">【02x00】Handling Duplicate Data</font></a><ul><li><a href="#font-color-4876ff-02x01-duplicated-font"><font color="#4876FF">【02x01】duplicated</font></a></li><li><a href="#font-color-4876ff-02x02-drop-duplicates-font"><font color="#4876FF">【02x02】drop_duplicates</font></a></li></ul></li><li><a href="#font-color-ff0000-03x00-shu-ju-ti-huan-font"><font color="#FF0000">【03x00】Data Replacement</font></a><ul><li><a href="#font-color-4876ff-03x01-replace-font"><font color="#4876FF">【03x01】replace</font></a></li><li><a href="#font-color-4876ff-03x02-where-font"><font color="#4876FF">【03x02】where</font></a></li><li><a href="#font-color-4876ff-03x03-mask-font"><font color="#4876FF">【03x03】mask</font></a></li></ul></li></ul><!-- tocstop --><hr><p>Articles in the Pandas series:</p><ul><li><a href="https://www.itbob.cn/article/025/">The Python Data Analysis Trio, Pandas (Part 1): Getting to Know Pandas and Its Series and DataFrame Objects</a></li><li><a href="https://www.itbob.cn/article/026/">The Python Data Analysis Trio, Pandas (Part 2): The Index Object and Indexing Operations</a></li><li><a href="https://www.itbob.cn/article/027/">The Python Data Analysis Trio, Pandas (Part 3): Arithmetic Operations and Missing-Value Handling</a></li><li><a href="https://www.itbob.cn/article/028/">The Python Data Analysis Trio, Pandas (Part 4): Function Application, Mapping, Sorting, and Hierarchical Indexing</a></li><li><a href="https://www.itbob.cn/article/029/">The Python Data Analysis Trio, Pandas (Part 5): Statistical Computation and Descriptive Statistics</a></li><li><a href="https://www.itbob.cn/article/030/">The Python Data Analysis Trio, Pandas (Part 6): GroupBy Split, Apply, and Combine</a></li><li><a href="https://www.itbob.cn/article/031/">The Python Data Analysis Trio, Pandas (Part 7): Merging Datasets</a></li><li><a href="https://www.itbob.cn/article/032/">The Python Data Analysis Trio, Pandas (Part 8): Data Reshaping, Duplicate Handling, and Data Replacement</a></li><li><a href="https://www.itbob.cn/article/033/">The Python Data Analysis Trio, Pandas (Part 9): Time Series</a></li><li><a href="https://www.itbob.cn/article/034/">The Python Data Analysis Trio, Pandas (Part 10): Reading and Writing Data</a></li></ul><hr><p>Columns:</p><ul><li>NumPy column: <a href="https://itrhx.blog.csdn.net/category_9780393.html">https://itrhx.blog.csdn.net/category_9780393.html</a></li><li>Pandas column: <a href="https://itrhx.blog.csdn.net/category_9780397.html">https://itrhx.blog.csdn.net/category_9780397.html</a></li><li>Matplotlib column: <a href="https://itrhx.blog.csdn.net/category_9780418.html">https://itrhx.blog.csdn.net/category_9780418.html</a></li></ul><br>Recommended learning resources and sites:<br><br><ul><li>NumPy Chinese documentation site: <a href="https://www.numpy.org.cn/">https://www.numpy.org.cn/</a></li><li>Pandas Chinese documentation site: <a href="https://www.pypandas.cn/">https://www.pypandas.cn/</a></li><li>Matplotlib Chinese documentation site: <a href="https://www.matplotlib.org.cn/">https://www.matplotlib.org.cn/</a></li><li>NumPy, Matplotlib, and Pandas cheat sheets: <a href="https://github.com/TRHX/Python-quick-reference-table">https://github.com/TRHX/Python-quick-reference-table</a></li></ul><hr><pre><code class="hljs yaml"><span class="hljs-string">This is anti-scraping text; readers may ignore it.</span>
<span class="hljs-string">This article was first published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106900748</span>
<span class="hljs-string">Reproduction without authorization is prohibited. Respect original work and do not plagiarize.</span></code></pre><hr><h2><span id="01x00-shu-ju-chong-su"><font color="#FF0000">【01x00】Data Reshaping</font></span></h2><p>Pandas offers a number of basic operations for rearranging tabular data. These operations are known as reshaping or pivoting. Two methods do most of the work when reshaping a hierarchical index:</p><ul><li><p><code>stack</code>: pivots the columns of the data into rows;</p></li><li><p><code>unstack</code>: pivots the rows of the data into columns.</p></li></ul><h3><span id="01x01-stack"><font color="#4876FF">【01x01】stack</font></span></h3><p>The <code>stack</code> method pivots the columns of the data into rows.</p><p>Basic syntax: <code>DataFrame.stack(self, level=-1, 
dropna=True)</code></p><p>Official documentation: <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.stack.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.stack.html</a></p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>level</td><td>The column level(s) to pivot into rows: a level index or label, or a list of level indexes or labels; default -1</td></tr><tr><td>dropna</td><td>bool; whether to drop rows of the reshaped result whose values are all NaN; default True</td></tr></tbody></table><p>Single level columns:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">0</span>, <span class="hljs-number">1</span>], [<span class="hljs-number">2</span>, <span class="hljs-number">3</span>]], index=[<span class="hljs-string">&#x27;cat&#x27;</span>, <span class="hljs-string">&#x27;dog&#x27;</span>], columns=[<span class="hljs-string">&#x27;weight&#x27;</span>, <span class="hljs-string">&#x27;height&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
     weight  height
cat       <span class="hljs-number">0</span>       <span class="hljs-number">1</span>
dog       <span class="hljs-number">2</span>       <span class="hljs-number">3</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.stack()
cat  weight    <span class="hljs-number">0</span>
     height    <span class="hljs-number">1</span>
dog  weight    <span class="hljs-number">2</span>
     height    <span class="hljs-number">3</span>
dtype: int64</code></pre><p>Multi level columns:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>multicol = pd.MultiIndex.from_tuples([(<span class="hljs-string">&#x27;weight&#x27;</span>, <span class="hljs-string">&#x27;kg&#x27;</span>), (<span 
class="hljs-string">&#x27;weight&#x27;</span>, <span class="hljs-string">&#x27;pounds&#x27;</span>)])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], [<span class="hljs-number">2</span>, <span class="hljs-number">4</span>]], index=[<span class="hljs-string">&#x27;cat&#x27;</span>, <span class="hljs-string">&#x27;dog&#x27;</span>], columns=multicol)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
    weight
        kg pounds
cat      <span class="hljs-number">1</span>      <span class="hljs-number">2</span>
dog      <span class="hljs-number">2</span>      <span class="hljs-number">4</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.stack()
            weight
cat kg           <span class="hljs-number">1</span>
    pounds       <span class="hljs-number">2</span>
dog kg           <span class="hljs-number">2</span>
    pounds       <span class="hljs-number">4</span></code></pre><p>When the column levels do not line up, the missing positions are filled with NaN:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>multicol = pd.MultiIndex.from_tuples([(<span class="hljs-string">&#x27;weight&#x27;</span>, <span class="hljs-string">&#x27;kg&#x27;</span>), (<span class="hljs-string">&#x27;height&#x27;</span>, <span class="hljs-string">&#x27;m&#x27;</span>)])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1.0</span>, <span class="hljs-number">2.0</span>], [<span class="hljs-number">3.0</span>, <span class="hljs-number">4.0</span>]], index=[<span class="hljs-string">&#x27;cat&#x27;</span>, <span class="hljs-string">&#x27;dog&#x27;</span>], columns=multicol)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
    weight height
        kg      m
cat    <span class="hljs-number">1.0</span>    <span class="hljs-number">2.0</span>
dog    <span class="hljs-number">3.0</span>    <span class="hljs-number">4.0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.stack()
        height  weight
cat kg     NaN     <span class="hljs-number">1.0</span>
    m      <span class="hljs-number">2.0</span>     NaN
dog kg     NaN     <span class="hljs-number">3.0</span>
    m      <span class="hljs-number">4.0</span>     NaN</code></pre><p>Use the <code>level</code> parameter to choose which column level(s) to stack:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>multicol = pd.MultiIndex.from_tuples([(<span class="hljs-string">&#x27;weight&#x27;</span>, <span class="hljs-string">&#x27;kg&#x27;</span>), (<span class="hljs-string">&#x27;height&#x27;</span>, <span class="hljs-string">&#x27;m&#x27;</span>)])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1.0</span>, <span class="hljs-number">2.0</span>], [<span class="hljs-number">3.0</span>, <span class="hljs-number">4.0</span>]], index=[<span class="hljs-string">&#x27;cat&#x27;</span>, <span class="hljs-string">&#x27;dog&#x27;</span>], columns=multicol)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
    weight height
        kg      m
cat    <span class="hljs-number">1.0</span>    <span class="hljs-number">2.0</span>
dog    <span class="hljs-number">3.0</span>    <span class="hljs-number">4.0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.stack(level=<span class="hljs-number">0</span>)
             kg    m
cat height  NaN  <span class="hljs-number">2.0</span>
    weight  <span class="hljs-number">1.0</span>  NaN
dog height  NaN  <span class="hljs-number">4.0</span>
    weight  <span class="hljs-number">3.0</span>  NaN
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.stack(level=<span 
class="hljs-number">1</span>)
        height  weight
cat kg     NaN     <span class="hljs-number">1.0</span>
    m      <span class="hljs-number">2.0</span>     NaN
dog kg     NaN     <span class="hljs-number">3.0</span>
    m      <span class="hljs-number">4.0</span>     NaN
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.stack(level=[<span class="hljs-number">0</span>, <span class="hljs-number">1</span>])
cat  height  m     <span class="hljs-number">2.0</span>
     weight  kg    <span class="hljs-number">1.0</span>
dog  height  m     <span class="hljs-number">4.0</span>
     weight  kg    <span class="hljs-number">3.0</span>
dtype: float64</code></pre><p>By default, any row of the reshaped result whose values are all NaN is dropped; set <code>dropna=False</code> to keep such rows:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>multicol = pd.MultiIndex.from_tuples([(<span class="hljs-string">&#x27;weight&#x27;</span>, <span class="hljs-string">&#x27;kg&#x27;</span>), (<span class="hljs-string">&#x27;height&#x27;</span>, <span class="hljs-string">&#x27;m&#x27;</span>)])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-literal">None</span>, <span class="hljs-number">1.0</span>], [<span class="hljs-number">2.0</span>, <span class="hljs-number">3.0</span>]], index=[<span class="hljs-string">&#x27;cat&#x27;</span>, <span class="hljs-string">&#x27;dog&#x27;</span>], columns=multicol)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
    weight height
        kg      m
cat    NaN    <span class="hljs-number">1.0</span>
dog    <span class="hljs-number">2.0</span>    <span class="hljs-number">3.0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.stack(dropna=<span class="hljs-literal">False</span>)
        height  weight
cat kg     NaN     NaN
    m      <span class="hljs-number">1.0</span>     NaN
dog kg     NaN     <span class="hljs-number">2.0</span>
    m      <span class="hljs-number">3.0</span>     NaN
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.stack(dropna=<span class="hljs-literal">True</span>)
        height  weight
cat m      <span class="hljs-number">1.0</span>     NaN
dog kg     NaN     <span class="hljs-number">2.0</span>
    m      <span class="hljs-number">3.0</span>     NaN</code></pre><h3><span id="01x02-unstack"><font color="#4876FF">【01x02】unstack</font></span></h3><p>The <code>unstack</code> method pivots the rows of the data into columns.</p><p>Basic syntax:</p><ul><li><p><code>Series.unstack(self, level=-1, fill_value=None)</code></p></li><li><p><code>DataFrame.unstack(self, level=-1, fill_value=None)</code></p></li></ul><p>Official documentation:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.unstack.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.unstack.html</a></p></li><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.unstack.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.unstack.html</a></p></li></ul><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>level</td><td>The row index level(s) to pivot into columns; default -1</td></tr><tr><td>fill_value</td><td>Value used in place of the NaN entries that unstacking produces</td></tr></tbody></table><p>Usage on a Series:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>], index=pd.MultiIndex.from_product([[<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>], [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>]]))
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
one  a    <span 
class="hljs-number">1</span>
     b    <span class="hljs-number">2</span>
two  a    <span class="hljs-number">3</span>
     b    <span class="hljs-number">4</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.unstack()
     a  b
one  <span class="hljs-number">1</span>  <span class="hljs-number">2</span>
two  <span class="hljs-number">3</span>  <span class="hljs-number">4</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.unstack(level=<span class="hljs-number">0</span>)
   one  two
a    <span class="hljs-number">1</span>    <span class="hljs-number">3</span>
b    <span class="hljs-number">2</span>    <span class="hljs-number">4</span></code></pre><p>As with <code>stack</code>, missing values (NaN) are introduced wherever a value does not exist:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.Series([<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.Series([<span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>], index=[<span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;e&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj3 = pd.concat([obj1, obj2], keys=[<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj3
one  a    <span class="hljs-number">0</span>
     b    <span 
class="hljs-number">1</span>
     c    <span class="hljs-number">2</span>
     d    <span class="hljs-number">3</span>
two  c    <span class="hljs-number">4</span>
     d    <span class="hljs-number">5</span>
     e    <span class="hljs-number">6</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj3.unstack()
       a    b    c    d    e
one  <span class="hljs-number">0.0</span>  <span class="hljs-number">1.0</span>  <span class="hljs-number">2.0</span>  <span class="hljs-number">3.0</span>  NaN
two  NaN  NaN  <span class="hljs-number">4.0</span>  <span class="hljs-number">5.0</span>  <span class="hljs-number">6.0</span></code></pre><p>Usage on a DataFrame:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(np.arange(<span class="hljs-number">6</span>).reshape((<span class="hljs-number">2</span>, <span class="hljs-number">3</span>)),
               index=pd.Index([<span class="hljs-string">&#x27;Ohio&#x27;</span>, <span class="hljs-string">&#x27;Colorado&#x27;</span>], name=<span class="hljs-string">&#x27;state&#x27;</span>),
               columns=pd.Index([<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>],
               name=<span class="hljs-string">&#x27;number&#x27;</span>))
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
number    one  two  three
state
Ohio        <span class="hljs-number">0</span>    <span class="hljs-number">1</span>      <span class="hljs-number">2</span>
Colorado    <span class="hljs-number">3</span>    <span class="hljs-number">4</span>      <span 
class="hljs-number">5</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = obj.stack()
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
state     number
Ohio      one       <span class="hljs-number">0</span>
          two       <span class="hljs-number">1</span>
          three     <span class="hljs-number">2</span>
Colorado  one       <span class="hljs-number">3</span>
          two       <span class="hljs-number">4</span>
          three     <span class="hljs-number">5</span>
dtype: int32
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj3 = pd.DataFrame(&#123;<span class="hljs-string">&#x27;left&#x27;</span>: obj2, <span class="hljs-string">&#x27;right&#x27;</span>: obj2 + <span class="hljs-number">5</span>&#125;,
            columns=pd.Index([<span class="hljs-string">&#x27;left&#x27;</span>, <span class="hljs-string">&#x27;right&#x27;</span>], name=<span class="hljs-string">&#x27;side&#x27;</span>))
<span class="hljs-meta">&gt;&gt;&gt; </span>obj3
side             left  right
state    number
Ohio     one        <span class="hljs-number">0</span>      <span class="hljs-number">5</span>
         two        <span class="hljs-number">1</span>      <span class="hljs-number">6</span>
         three      <span class="hljs-number">2</span>      <span class="hljs-number">7</span>
Colorado one        <span class="hljs-number">3</span>      <span class="hljs-number">8</span>
         two        <span class="hljs-number">4</span>      <span class="hljs-number">9</span>
         three      <span class="hljs-number">5</span>     <span class="hljs-number">10</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj3.unstack(<span class="hljs-string">&#x27;state&#x27;</span>)
side   left          right
state  Ohio Colorado  Ohio Colorado
number
one       <span class="hljs-number">0</span>        <span 
class="hljs-number">3</span>     <span class="hljs-number">5</span>        <span class="hljs-number">8</span>
two       <span class="hljs-number">1</span>        <span class="hljs-number">4</span>     <span class="hljs-number">6</span>        <span class="hljs-number">9</span>
three     <span class="hljs-number">2</span>        <span class="hljs-number">5</span>     <span class="hljs-number">7</span>       <span class="hljs-number">10</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj3.unstack(<span class="hljs-string">&#x27;state&#x27;</span>).stack(<span class="hljs-string">&#x27;side&#x27;</span>)
state         Colorado  Ohio
number side
one    left          <span class="hljs-number">3</span>     <span class="hljs-number">0</span>
       right         <span class="hljs-number">8</span>     <span class="hljs-number">5</span>
two    left          <span class="hljs-number">4</span>     <span class="hljs-number">1</span>
       right         <span class="hljs-number">9</span>     <span class="hljs-number">6</span>
three  left          <span class="hljs-number">5</span>     <span class="hljs-number">2</span>
       right        <span class="hljs-number">10</span>     <span class="hljs-number">7</span></code></pre><hr><h2><span id="02x00-chong-fu-shu-ju-chu-li"><font color="#FF0000">【02x00】Handling Duplicate Data</font></span></h2><ul><li><p><code>duplicated</code>: reports whether each value is a duplicate;</p></li><li><p><code>drop_duplicates</code>: removes duplicate values.</p></li></ul><h3><span id="02x01-duplicated"><font 
color="#4876FF">【02x01】duplicated</font></span></h3><p>The <code>duplicated</code> method reports whether each value is a duplicate.</p><p>Basic syntax:</p><ul><li><p><code>Series.duplicated(self, keep='first')</code></p></li><li><p><code>DataFrame.duplicated(self, subset: Union[Hashable, Sequence[Hashable], NoneType] = None, keep: Union[str, bool] = 'first') → 'Series'</code></p></li></ul><p>Official documentation:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.duplicated.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.duplicated.html</a></p></li><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.duplicated.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.duplicated.html</a></p></li></ul><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>keep</td><td>How duplicates are marked; default <code>'first'</code><br><code>'first'</code>: mark non-duplicates and the first occurrence of each duplicate as False, and all other duplicates as True<br><code>'last'</code>: mark non-duplicates and the last occurrence of each duplicate as False, and all other duplicates as True<br><code>False</code>: mark all duplicates as True and non-duplicates as False</td></tr><tr><td>subset</td><td>Column label or sequence of labels; only available on DataFrame.<br>Restricts duplicate detection to the given column(s); by default all columns are considered</td></tr></tbody></table><p>By default, within each group of duplicated values the first occurrence is marked False and the remaining occurrences are marked True, while non-duplicated values are marked False. This is equivalent to <code>keep='first'</code>:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-string">&#x27;lama&#x27;</span>, <span class="hljs-string">&#x27;cow&#x27;</span>, <span class="hljs-string">&#x27;lama&#x27;</span>, <span class="hljs-string">&#x27;beetle&#x27;</span>, <span class="hljs-string">&#x27;lama&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
<span class="hljs-number">0</span>      lama
<span class="hljs-number">1</span>       cow
<span class="hljs-number">2</span>      lama
<span class="hljs-number">3</span>    beetle
<span class="hljs-number">4</span>      lama
dtype: <span 
class="hljs-built_in">object</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.duplicated()
<span class="hljs-number">0</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">1</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">2</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">3</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">4</span>     <span class="hljs-literal">True</span>
dtype: <span class="hljs-built_in">bool</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.duplicated(keep=<span class="hljs-string">&#x27;first&#x27;</span>)
<span class="hljs-number">0</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">1</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">2</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">3</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">4</span>     <span class="hljs-literal">True</span>
dtype: <span class="hljs-built_in">bool</span></code></pre><p>With <code>keep='last'</code>, non-duplicates and the last occurrence in each group of duplicates are marked False, and the other duplicates are marked True. With <code>keep=False</code>, every duplicate is marked True and all other values False:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-string">&#x27;lama&#x27;</span>, <span class="hljs-string">&#x27;cow&#x27;</span>, <span class="hljs-string">&#x27;lama&#x27;</span>, <span class="hljs-string">&#x27;beetle&#x27;</span>, <span class="hljs-string">&#x27;lama&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
<span class="hljs-number">0</span>      lama
<span class="hljs-number">1</span>       cow
<span class="hljs-number">2</span>      lama
<span class="hljs-number">3</span>    beetle
<span 
class="hljs-number">4</span>      lama
dtype: <span class="hljs-built_in">object</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.duplicated(keep=<span class="hljs-string">&#x27;last&#x27;</span>)
<span class="hljs-number">0</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">1</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">2</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">3</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">4</span>    <span class="hljs-literal">False</span>
dtype: <span class="hljs-built_in">bool</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.duplicated(keep=<span class="hljs-literal">False</span>)
<span class="hljs-number">0</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">1</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">2</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">3</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">4</span>     <span class="hljs-literal">True</span>
dtype: <span class="hljs-built_in">bool</span></code></pre><p>On a DataFrame, the <code>subset</code> parameter restricts duplicate detection to the given column(s); by default all columns are considered. Because <code>data2</code> is generated randomly here, your values will differ from those shown:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(&#123;<span class="hljs-string">&#x27;data1&#x27;</span> : [<span class="hljs-string">&#x27;a&#x27;</span>] * <span class="hljs-number">4</span> + [<span class="hljs-string">&#x27;b&#x27;</span>] * <span class="hljs-number">4</span>,
                       <span class="hljs-string">&#x27;data2&#x27;</span> : np.random.randint(<span class="hljs-number">0</span>, <span class="hljs-number">4</span>, <span class="hljs-number">8</span>)&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
  data1  data2
<span class="hljs-number">0</span>     a      <span class="hljs-number">0</span>
<span class="hljs-number">1</span>     a      <span class="hljs-number">0</span>
<span class="hljs-number">2</span>     a      <span class="hljs-number">0</span>
<span class="hljs-number">3</span>     a      <span class="hljs-number">3</span>
<span class="hljs-number">4</span>     b      <span class="hljs-number">3</span>
<span class="hljs-number">5</span>     b      <span class="hljs-number">3</span>
<span class="hljs-number">6</span>     b      <span class="hljs-number">0</span>
<span class="hljs-number">7</span>     b      <span class="hljs-number">2</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.duplicated()
<span class="hljs-number">0</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">1</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">2</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">3</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">4</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">5</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">6</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">7</span>    <span class="hljs-literal">False</span>
dtype: <span class="hljs-built_in">bool</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.duplicated(subset=<span class="hljs-string">&#x27;data1&#x27;</span>)
<span class="hljs-number">0</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">1</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">2</span>     <span class="hljs-literal">True</span>
<span 
class="hljs-number">3</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">4</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">5</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">6</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">7</span>     <span class="hljs-literal">True</span>
dtype: <span class="hljs-built_in">bool</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.duplicated(subset=<span class="hljs-string">&#x27;data2&#x27;</span>, keep=<span class="hljs-string">&#x27;last&#x27;</span>)
<span class="hljs-number">0</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">1</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">2</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">3</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">4</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">5</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">6</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">7</span>    <span class="hljs-literal">False</span>
dtype: <span class="hljs-built_in">bool</span></code></pre><h3><span id="02x02-drop-duplicates"><font color="#4876FF">【02x02】drop_duplicates</font></span></h3><p>The <code>drop_duplicates</code> method returns a copy of the object with duplicate values removed.</p><p>Basic syntax:</p><pre><code class="hljs python">Series.drop_duplicates(self, keep=<span class="hljs-string">&#x27;first&#x27;</span>, inplace=<span class="hljs-literal">False</span>)</code></pre><pre><code class="hljs python">DataFrame.drop_duplicates(self,
                          subset: <span class="hljs-type">Union</span>[Hashable, <span class="hljs-type">Sequence</span>[Hashable], NoneType] = <span class="hljs-literal">None</span>,
                          keep: <span class="hljs-type">Union</span>[<span 
class="hljs-built_in">str</span>, <span class="hljs-built_in">bool</span>] = <span class="hljs-string">&#x27;first&#x27;</span>,
                          inplace: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>,
                          ignore_index: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>) → <span class="hljs-type">Union</span>[ForwardRef('DataFrame'), NoneType]</code></pre><p>Official documentation:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.drop_duplicates.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.drop_duplicates.html</a></p></li><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop_duplicates.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop_duplicates.html</a></p></li></ul><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>keep</td><td>Which duplicates to drop, default <code>'first'</code><br><code>'first'</code>: keep non-duplicated values and the first occurrence of each duplicate; drop the other occurrences<br><code>'last'</code>: keep non-duplicated values and the last occurrence of each duplicate; drop the other occurrences<br><code>False</code>: drop every duplicated value and keep only the non-duplicated ones</td></tr><tr><td>inplace</td><td>bool, default False, which returns a new object with the duplicates removed; if True, the original data is modified in place and nothing is returned</td></tr><tr><td>subset</td><td>Column label or sequence of labels, available for DataFrame only;<br>only these columns are considered when identifying duplicates, by default all columns are used</td></tr><tr><td>ignore_index</td><td>bool, available for DataFrame only; whether to discard the original index labels,<br>default False; if True, the index of the result is 0, 1, 2, …, n-1</td></tr></tbody></table><p>Using the keep parameter:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-string">&#x27;lama&#x27;</span>, <span class="hljs-string">&#x27;cow&#x27;</span>, <span class="hljs-string">&#x27;lama&#x27;</span>, <span class="hljs-string">&#x27;beetle&#x27;</span>, <span class="hljs-string">&#x27;lama&#x27;</span>, <span class="hljs-string">&#x27;hippo&#x27;</span>], name=<span 
class="hljs-string">&#x27;animal&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>obj<span class="hljs-number">0</span>      lama<span class="hljs-number">1</span>       cow<span class="hljs-number">2</span>      lama<span class="hljs-number">3</span>    beetle<span class="hljs-number">4</span>      lama<span class="hljs-number">5</span>     hippoName: animal, dtype: <span class="hljs-built_in">object</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.drop_duplicates()<span class="hljs-number">0</span>      lama<span class="hljs-number">1</span>       cow<span class="hljs-number">3</span>    beetle<span class="hljs-number">5</span>     hippoName: animal, dtype: <span class="hljs-built_in">object</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.drop_duplicates(keep=<span class="hljs-string">&#x27;last&#x27;</span>)<span class="hljs-number">1</span>       cow<span class="hljs-number">3</span>    beetle<span class="hljs-number">4</span>      lama<span class="hljs-number">5</span>     hippoName: animal, dtype: <span class="hljs-built_in">object</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.drop_duplicates(keep=<span class="hljs-literal">False</span>)<span class="hljs-number">1</span>       cow<span class="hljs-number">3</span>    beetle<span class="hljs-number">5</span>     hippoName: animal, dtype: <span class="hljs-built_in">object</span></code></pre><p>如果设置 <code>inplace=True</code>，则不会返回任何值，但原对象的值已被改变：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.Series([<span class="hljs-string">&#x27;lama&#x27;</span>, <span class="hljs-string">&#x27;cow&#x27;</span>, <span class="hljs-string">&#x27;lama&#x27;</span>, <span 
class="hljs-string">&#x27;beetle&#x27;</span>, <span class="hljs-string">&#x27;lama&#x27;</span>, <span class="hljs-string">&#x27;hippo&#x27;</span>], name=<span class="hljs-string">&#x27;animal&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>obj1<span class="hljs-number">0</span>      lama<span class="hljs-number">1</span>       cow<span class="hljs-number">2</span>      lama<span class="hljs-number">3</span>    beetle<span class="hljs-number">4</span>      lama<span class="hljs-number">5</span>     hippoName: animal, dtype: <span class="hljs-built_in">object</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = obj1.drop_duplicates()<span class="hljs-meta">&gt;&gt;&gt; </span>obj2          <span class="hljs-comment"># 有返回值</span><span class="hljs-number">0</span>      lama<span class="hljs-number">1</span>       cow<span class="hljs-number">3</span>    beetle<span class="hljs-number">5</span>     hippoName: animal, dtype: <span class="hljs-built_in">object</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj3 = obj1.drop_duplicates(inplace=<span class="hljs-literal">True</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>obj3         <span class="hljs-comment"># 无返回值</span>&gt;&gt;&gt;<span class="hljs-meta">&gt;&gt;&gt; </span>obj1         <span class="hljs-comment"># 原对象的值已改变</span><span class="hljs-number">0</span>      lama<span class="hljs-number">1</span>       cow<span class="hljs-number">3</span>    beetle<span class="hljs-number">5</span>     hippoName: animal, dtype: <span class="hljs-built_in">object</span></code></pre><p>在 DataFrame 对象中的使用：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span 
class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(&#123;<span class="hljs-string">&#x27;data1&#x27;</span> : [<span class="hljs-string">&#x27;a&#x27;</span>] * <span class="hljs-number">4</span> + [<span class="hljs-string">&#x27;b&#x27;</span>] * <span class="hljs-number">4</span>,                       <span class="hljs-string">&#x27;data2&#x27;</span> : np.random.randint(<span class="hljs-number">0</span>, <span class="hljs-number">4</span>, <span class="hljs-number">8</span>)&#125;)<span class="hljs-meta">&gt;&gt;&gt; </span>obj  data1  data2<span class="hljs-number">0</span>     a      <span class="hljs-number">2</span><span class="hljs-number">1</span>     a      <span class="hljs-number">1</span><span class="hljs-number">2</span>     a      <span class="hljs-number">1</span><span class="hljs-number">3</span>     a      <span class="hljs-number">2</span><span class="hljs-number">4</span>     b      <span class="hljs-number">1</span><span class="hljs-number">5</span>     b      <span class="hljs-number">2</span><span class="hljs-number">6</span>     b      <span class="hljs-number">0</span><span class="hljs-number">7</span>     b      <span class="hljs-number">0</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.drop_duplicates()  data1  data2<span class="hljs-number">0</span>     a      <span class="hljs-number">2</span><span class="hljs-number">1</span>     a      <span class="hljs-number">1</span><span class="hljs-number">4</span>     b      <span class="hljs-number">1</span><span class="hljs-number">5</span>     b      <span class="hljs-number">2</span><span class="hljs-number">6</span>     b      <span class="hljs-number">0</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.drop_duplicates(subset=<span class="hljs-string">&#x27;data2&#x27;</span>)  data1  data2<span class="hljs-number">0</span>     a      <span class="hljs-number">2</span><span 
class="hljs-number">1</span>     a      <span class="hljs-number">1</span>
<span class="hljs-number">6</span>     b      <span class="hljs-number">0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.drop_duplicates(subset=<span class="hljs-string">&#x27;data2&#x27;</span>, ignore_index=<span class="hljs-literal">True</span>)
  data1  data2
<span class="hljs-number">0</span>     a      <span class="hljs-number">2</span>
<span class="hljs-number">1</span>     a      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>     b      <span class="hljs-number">0</span></code></pre><h2><span id="03x00-shu-ju-ti-huan"><font color="#FF0000">【03x00】Replacing Data</font></span></h2><h3><span id="03x01-replace"><font color="#4876FF">【03x01】replace</font></span></h3><p>The <code>replace</code> method replaces values based on their content.</p><p>Basic syntax:</p><ul><li><p><code>Series.replace(self, to_replace=None, value=None, inplace=False, limit=None, regex=False, method='pad')</code></p></li><li><p><code>DataFrame.replace(self, to_replace=None, value=None, inplace=False, limit=None, regex=False, method='pad')</code></p></li></ul><p>Official documentation:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.replace.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.replace.html</a></p></li><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.replace.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.replace.html</a></p></li></ul><p>Common parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>to_replace</td><td>How to find the values to replace; may be a string, regular expression, list, dict, integer, float, Series, or None<br>See the official documentation for how each type behaves</td></tr><tr><td>value</td><td>The value that replaces each match. For a DataFrame, a dict can specify which value to use for each column;<br>regular expressions, strings, and lists or dicts of such objects are also allowed</td></tr><tr><td>inplace</td><td>bool, whether to modify the original data in place without returning a value, default False</td></tr><tr><td>regex</td><td>bool, or the same types as to_replace;<br>when to_replace is a regular expression, regex should be True, or the pattern can be passed through this parameter instead of 
to_replace</td></tr></tbody></table><p>Passing a single value for both <code>to_replace</code> and <code>value</code> replaces one value with another:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
<span class="hljs-number">0</span>    <span class="hljs-number">0</span>
<span class="hljs-number">1</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2</span>    <span class="hljs-number">2</span>
<span class="hljs-number">3</span>    <span class="hljs-number">3</span>
<span class="hljs-number">4</span>    <span class="hljs-number">4</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.replace(<span class="hljs-number">0</span>, <span class="hljs-number">5</span>)
<span class="hljs-number">0</span>    <span class="hljs-number">5</span>
<span class="hljs-number">1</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2</span>    <span class="hljs-number">2</span>
<span class="hljs-number">3</span>    <span class="hljs-number">3</span>
<span class="hljs-number">4</span>    <span class="hljs-number">4</span>
dtype: int64</code></pre><p>Passing several values to <code>to_replace</code> and a single value to <code>value</code> replaces all of them with that one value:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>])
<span class="hljs-meta">&gt;&gt;&gt; 
</span>obj
<span class="hljs-number">0</span>    <span class="hljs-number">0</span>
<span class="hljs-number">1</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2</span>    <span class="hljs-number">2</span>
<span class="hljs-number">3</span>    <span class="hljs-number">3</span>
<span class="hljs-number">4</span>    <span class="hljs-number">4</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.replace([<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>], <span class="hljs-number">4</span>)
<span class="hljs-number">0</span>    <span class="hljs-number">4</span>
<span class="hljs-number">1</span>    <span class="hljs-number">4</span>
<span class="hljs-number">2</span>    <span class="hljs-number">4</span>
<span class="hljs-number">3</span>    <span class="hljs-number">4</span>
<span class="hljs-number">4</span>    <span class="hljs-number">4</span>
dtype: int64</code></pre><p>Passing several values to both <code>to_replace</code> and <code>value</code> replaces each value with its counterpart at the same position:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
<span class="hljs-number">0</span>    <span class="hljs-number">0</span>
<span class="hljs-number">1</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2</span>    <span class="hljs-number">2</span>
<span class="hljs-number">3</span>    <span class="hljs-number">3</span>
<span class="hljs-number">4</span>    <span class="hljs-number">4</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span 
class="hljs-meta">&gt;&gt;&gt; </span>obj.replace([<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>], [<span class="hljs-number">4</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">1</span>])<span class="hljs-number">0</span>    <span class="hljs-number">4</span><span class="hljs-number">1</span>    <span class="hljs-number">3</span><span class="hljs-number">2</span>    <span class="hljs-number">2</span><span class="hljs-number">3</span>    <span class="hljs-number">1</span><span class="hljs-number">4</span>    <span class="hljs-number">4</span>dtype: int64</code></pre><p><code>to_replace</code> 传入字典：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(&#123;<span class="hljs-string">&#x27;A&#x27;</span>: [<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>],            <span class="hljs-string">&#x27;B&#x27;</span>: [<span class="hljs-number">5</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>],            <span class="hljs-string">&#x27;C&#x27;</span>: [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;e&#x27;</span>]&#125;)<span class="hljs-meta">&gt;&gt;&gt; </span>obj   A  B  C<span class="hljs-number">0</span>  <span class="hljs-number">0</span>  <span class="hljs-number">5</span>  a<span class="hljs-number">1</span>  <span 
class="hljs-number">1</span>  <span class="hljs-number">6</span>  b
<span class="hljs-number">2</span>  <span class="hljs-number">2</span>  <span class="hljs-number">7</span>  c
<span class="hljs-number">3</span>  <span class="hljs-number">3</span>  <span class="hljs-number">8</span>  d
<span class="hljs-number">4</span>  <span class="hljs-number">4</span>  <span class="hljs-number">9</span>  e
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.replace(<span class="hljs-number">0</span>, <span class="hljs-number">5</span>)
   A  B  C
<span class="hljs-number">0</span>  <span class="hljs-number">5</span>  <span class="hljs-number">5</span>  a
<span class="hljs-number">1</span>  <span class="hljs-number">1</span>  <span class="hljs-number">6</span>  b
<span class="hljs-number">2</span>  <span class="hljs-number">2</span>  <span class="hljs-number">7</span>  c
<span class="hljs-number">3</span>  <span class="hljs-number">3</span>  <span class="hljs-number">8</span>  d
<span class="hljs-number">4</span>  <span class="hljs-number">4</span>  <span class="hljs-number">9</span>  e
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.replace(&#123;<span class="hljs-number">0</span>: <span class="hljs-number">10</span>, <span class="hljs-number">1</span>: <span class="hljs-number">100</span>&#125;)
     A  B  C
<span class="hljs-number">0</span>   <span class="hljs-number">10</span>  <span class="hljs-number">5</span>  a
<span class="hljs-number">1</span>  <span class="hljs-number">100</span>  <span class="hljs-number">6</span>  b
<span class="hljs-number">2</span>    <span class="hljs-number">2</span>  <span class="hljs-number">7</span>  c
<span class="hljs-number">3</span>    <span class="hljs-number">3</span>  <span class="hljs-number">8</span>  d
<span class="hljs-number">4</span>    <span class="hljs-number">4</span>  <span class="hljs-number">9</span>  e
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span 
class="hljs-meta">&gt;&gt;&gt; </span>obj.replace(&#123;<span class="hljs-string">&#x27;A&#x27;</span>: <span class="hljs-number">0</span>, <span class="hljs-string">&#x27;B&#x27;</span>: <span class="hljs-number">5</span>&#125;, <span class="hljs-number">100</span>)     A    B  C<span class="hljs-number">0</span>  <span class="hljs-number">100</span>  <span class="hljs-number">100</span>  a<span class="hljs-number">1</span>    <span class="hljs-number">1</span>    <span class="hljs-number">6</span>  b<span class="hljs-number">2</span>    <span class="hljs-number">2</span>    <span class="hljs-number">7</span>  c<span class="hljs-number">3</span>    <span class="hljs-number">3</span>    <span class="hljs-number">8</span>  d<span class="hljs-number">4</span>    <span class="hljs-number">4</span>    <span class="hljs-number">9</span>  e<span class="hljs-meta">&gt;&gt;&gt; </span>obj.replace(&#123;<span class="hljs-string">&#x27;A&#x27;</span>: &#123;<span class="hljs-number">0</span>: <span class="hljs-number">100</span>, <span class="hljs-number">4</span>: <span class="hljs-number">400</span>&#125;&#125;)     A  B  C<span class="hljs-number">0</span>  <span class="hljs-number">100</span>  <span class="hljs-number">5</span>  a<span class="hljs-number">1</span>    <span class="hljs-number">1</span>  <span class="hljs-number">6</span>  b<span class="hljs-number">2</span>    <span class="hljs-number">2</span>  <span class="hljs-number">7</span>  c<span class="hljs-number">3</span>    <span class="hljs-number">3</span>  <span class="hljs-number">8</span>  d<span class="hljs-number">4</span>  <span class="hljs-number">400</span>  <span class="hljs-number">9</span>  e</code></pre><p><code>to_replace</code> 传入正则表达式：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(&#123;<span 
class="hljs-string">&#x27;A&#x27;</span>: [<span class="hljs-string">&#x27;bat&#x27;</span>, <span class="hljs-string">&#x27;foo&#x27;</span>, <span class="hljs-string">&#x27;bait&#x27;</span>],            <span class="hljs-string">&#x27;B&#x27;</span>: [<span class="hljs-string">&#x27;abc&#x27;</span>, <span class="hljs-string">&#x27;bar&#x27;</span>, <span class="hljs-string">&#x27;xyz&#x27;</span>]&#125;)<span class="hljs-meta">&gt;&gt;&gt; </span>obj      A    B<span class="hljs-number">0</span>   bat  abc<span class="hljs-number">1</span>   foo  bar<span class="hljs-number">2</span>  bait  xyz<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.replace(to_replace=<span class="hljs-string">r&#x27;^ba.$&#x27;</span>, value=<span class="hljs-string">&#x27;new&#x27;</span>, regex=<span class="hljs-literal">True</span>)      A    B<span class="hljs-number">0</span>   new  abc<span class="hljs-number">1</span>   foo  new<span class="hljs-number">2</span>  bait  xyz<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.replace(&#123;<span class="hljs-string">&#x27;A&#x27;</span>: <span class="hljs-string">r&#x27;^ba.$&#x27;</span>&#125;, &#123;<span class="hljs-string">&#x27;A&#x27;</span>: <span class="hljs-string">&#x27;new&#x27;</span>&#125;, regex=<span class="hljs-literal">True</span>)      A    B<span class="hljs-number">0</span>   new  abc<span class="hljs-number">1</span>   foo  bar<span class="hljs-number">2</span>  bait  xyz<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.replace(regex=<span class="hljs-string">r&#x27;^ba.$&#x27;</span>, value=<span class="hljs-string">&#x27;new&#x27;</span>)      A    B<span class="hljs-number">0</span>   new  abc<span class="hljs-number">1</span>   foo  new<span class="hljs-number">2</span>  bait  xyz<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; 
</span>obj.replace(regex=&#123;<span class="hljs-string">r&#x27;^ba.$&#x27;</span>: <span class="hljs-string">&#x27;new&#x27;</span>, <span class="hljs-string">&#x27;foo&#x27;</span>: <span class="hljs-string">&#x27;xyz&#x27;</span>&#125;)      A    B<span class="hljs-number">0</span>   new  abc<span class="hljs-number">1</span>   xyz  new<span class="hljs-number">2</span>  bait  xyz<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.replace(regex=[<span class="hljs-string">r&#x27;^ba.$&#x27;</span>, <span class="hljs-string">&#x27;foo&#x27;</span>], value=<span class="hljs-string">&#x27;new&#x27;</span>)      A    B<span class="hljs-number">0</span>   new  abc<span class="hljs-number">1</span>   new  new<span class="hljs-number">2</span>  bait  xyz</code></pre><h3><span id="03x02-where"><font color="#4876FF">【03x02】where</font></span></h3><p><code>where</code> 方法用于替换条件为 False 的值。</p><p>基本语法：</p><ul><li><p><code>Series.where(self, cond, other=nan, inplace=False, axis=None, level=None, errors='raise', try_cast=False)</code></p></li><li><p><code>DataFrame.where(self, cond, other=nan, inplace=False, axis=None, level=None, errors='raise', try_cast=False)</code></p></li></ul><p>官方文档：</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.where.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.where.html</a></p></li><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.where.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.where.html</a></p></li></ul><p>常用参数：</p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>cond</td><td>替换条件，如果 cond 为 True，则保留原始值。如果为 False，则替换为来自 other 的相应值</td></tr><tr><td>other</td><td>替换值，如果 cond 为 False，则替换为来自该参数的相应值</td></tr><tr><td>inplace</td><td>bool 类型，是否直接改变原数据且不返回值，默认 False</td></tr></tbody></table><p>在 Series 中的应用：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span 
class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(<span class="hljs-built_in">range</span>(<span class="hljs-number">5</span>))<span class="hljs-meta">&gt;&gt;&gt; </span>obj<span class="hljs-number">0</span>    <span class="hljs-number">0</span><span class="hljs-number">1</span>    <span class="hljs-number">1</span><span class="hljs-number">2</span>    <span class="hljs-number">2</span><span class="hljs-number">3</span>    <span class="hljs-number">3</span><span class="hljs-number">4</span>    <span class="hljs-number">4</span>dtype: int64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.where(obj &gt; <span class="hljs-number">0</span>)<span class="hljs-number">0</span>    NaN<span class="hljs-number">1</span>    <span class="hljs-number">1.0</span><span class="hljs-number">2</span>    <span class="hljs-number">2.0</span><span class="hljs-number">3</span>    <span class="hljs-number">3.0</span><span class="hljs-number">4</span>    <span class="hljs-number">4.0</span>dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.where(obj &gt; <span class="hljs-number">1</span>, <span class="hljs-number">10</span>)<span class="hljs-number">0</span>    <span class="hljs-number">10</span><span class="hljs-number">1</span>    <span class="hljs-number">10</span><span class="hljs-number">2</span>     <span class="hljs-number">2</span><span class="hljs-number">3</span>     <span class="hljs-number">3</span><span class="hljs-number">4</span>     <span class="hljs-number">4</span>dtype: int64</code></pre><p>在 DataFrame 中的应用：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(np.arange(<span 
class="hljs-number">10</span>).reshape(-<span class="hljs-number">1</span>, <span class="hljs-number">2</span>), columns=[<span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
   A  B
<span class="hljs-number">0</span>  <span class="hljs-number">0</span>  <span class="hljs-number">1</span>
<span class="hljs-number">1</span>  <span class="hljs-number">2</span>  <span class="hljs-number">3</span>
<span class="hljs-number">2</span>  <span class="hljs-number">4</span>  <span class="hljs-number">5</span>
<span class="hljs-number">3</span>  <span class="hljs-number">6</span>  <span class="hljs-number">7</span>
<span class="hljs-number">4</span>  <span class="hljs-number">8</span>  <span class="hljs-number">9</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>m = obj % <span class="hljs-number">3</span> == <span class="hljs-number">0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.where(m, -obj)
   A  B
<span class="hljs-number">0</span>  <span class="hljs-number">0</span> -<span class="hljs-number">1</span>
<span class="hljs-number">1</span> -<span class="hljs-number">2</span>  <span class="hljs-number">3</span>
<span class="hljs-number">2</span> -<span class="hljs-number">4</span> -<span class="hljs-number">5</span>
<span class="hljs-number">3</span>  <span class="hljs-number">6</span> -<span class="hljs-number">7</span>
<span class="hljs-number">4</span> -<span class="hljs-number">8</span>  <span class="hljs-number">9</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.where(m, -obj) == np.where(m, obj, -obj)
      A     B
<span class="hljs-number">0</span>  <span class="hljs-literal">True</span>  <span class="hljs-literal">True</span>
<span class="hljs-number">1</span>  <span class="hljs-literal">True</span>  <span class="hljs-literal">True</span>
<span class="hljs-number">2</span>  <span 
class="hljs-literal">True</span>  <span class="hljs-literal">True</span>
<span class="hljs-number">3</span>  <span class="hljs-literal">True</span>  <span class="hljs-literal">True</span>
<span class="hljs-number">4</span>  <span class="hljs-literal">True</span>  <span class="hljs-literal">True</span></code></pre><h3><span id="03x03-mask"><font color="#4876FF">【03x03】mask</font></span></h3><p>The <code>mask</code> method is the inverse of <code>where</code>: <code>mask</code> replaces the values for which the condition is True.</p><p>Basic syntax:</p><ul><li><p><code>Series.mask(self, cond, other=nan, inplace=False, axis=None, level=None, errors='raise', try_cast=False)</code></p></li><li><p><code>DataFrame.mask(self, cond, other=nan, inplace=False, axis=None, level=None, errors='raise', try_cast=False)</code></p></li></ul><p>Official documentation:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.mask.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.mask.html</a></p></li><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.mask.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.mask.html</a></p></li></ul><p>Common parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>cond</td><td>The condition. Where cond is False, the original value is kept; where it is True, the value is replaced with the corresponding value from other</td></tr><tr><td>other</td><td>The replacement value, used wherever cond is True; defaults to NaN</td></tr><tr><td>inplace</td><td>bool, whether to modify the original data in place without returning a value, default False</td></tr></tbody></table><p>Usage with a Series:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(<span class="hljs-built_in">range</span>(<span class="hljs-number">5</span>))
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
<span class="hljs-number">0</span>    <span class="hljs-number">0</span>
<span class="hljs-number">1</span>    <span class="hljs-number">1</span>
<span class="hljs-number">2</span>    <span 
class="hljs-number">2</span>
<span class="hljs-number">3</span>    <span class="hljs-number">3</span>
<span class="hljs-number">4</span>    <span class="hljs-number">4</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.mask(obj &gt; <span class="hljs-number">0</span>)
<span class="hljs-number">0</span>    <span class="hljs-number">0.0</span>
<span class="hljs-number">1</span>    NaN
<span class="hljs-number">2</span>    NaN
<span class="hljs-number">3</span>    NaN
<span class="hljs-number">4</span>    NaN
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.mask(obj &gt; <span class="hljs-number">1</span>, <span class="hljs-number">10</span>)
<span class="hljs-number">0</span>     <span class="hljs-number">0</span>
<span class="hljs-number">1</span>     <span class="hljs-number">1</span>
<span class="hljs-number">2</span>    <span class="hljs-number">10</span>
<span class="hljs-number">3</span>    <span class="hljs-number">10</span>
<span class="hljs-number">4</span>    <span class="hljs-number">10</span>
dtype: int64</code></pre><p>Usage with a DataFrame:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(np.arange(<span class="hljs-number">10</span>).reshape(-<span class="hljs-number">1</span>, <span class="hljs-number">2</span>), columns=[<span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
   A  B
<span class="hljs-number">0</span>  <span class="hljs-number">0</span>  <span class="hljs-number">1</span>
<span class="hljs-number">1</span>  <span class="hljs-number">2</span>  <span class="hljs-number">3</span>
<span class="hljs-number">2</span>  <span class="hljs-number">4</span>  <span 
class="hljs-number">5</span>
<span class="hljs-number">3</span>  <span class="hljs-number">6</span>  <span class="hljs-number">7</span>
<span class="hljs-number">4</span>  <span class="hljs-number">8</span>  <span class="hljs-number">9</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>m = obj % <span class="hljs-number">3</span> == <span class="hljs-number">0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.mask(m, -obj)
   A  B
<span class="hljs-number">0</span>  <span class="hljs-number">0</span>  <span class="hljs-number">1</span>
<span class="hljs-number">1</span>  <span class="hljs-number">2</span> -<span class="hljs-number">3</span>
<span class="hljs-number">2</span>  <span class="hljs-number">4</span>  <span class="hljs-number">5</span>
<span class="hljs-number">3</span> -<span class="hljs-number">6</span>  <span class="hljs-number">7</span>
<span class="hljs-number">4</span>  <span class="hljs-number">8</span> -<span class="hljs-number">9</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.where(m, -obj) == obj.mask(~m, -obj)
      A     B
<span class="hljs-number">0</span>  <span class="hljs-literal">True</span>  <span class="hljs-literal">True</span>
<span class="hljs-number">1</span>  <span class="hljs-literal">True</span>  <span class="hljs-literal">True</span>
<span class="hljs-number">2</span>  <span class="hljs-literal">True</span>  <span class="hljs-literal">True</span>
<span class="hljs-number">3</span>  <span class="hljs-literal">True</span>  <span class="hljs-literal">True</span>
<span class="hljs-number">4</span>  <span class="hljs-literal">True</span>  <span class="hljs-literal">True</span></code></pre><hr><pre><code class="hljs yaml"><span class="hljs-string">This is a block of anti-scraping text; readers may ignore it.</span>
<span class="hljs-string">This article was first published on CSDN. Author: ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106900748</span>
<span class="hljs-string">Reproduction without authorization is prohibited. Malicious reposting is at your own risk. Respect original work and do not plagiarize.</span></code></pre><hr>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;文章目录&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-01x00-shu-ju-chong-su-font&quot;&gt;&lt;font color=&quot;#</summary>
      
    
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    <category term="Pandas" scheme="https://www.itbob.cn/tags/Pandas/"/>
    
  </entry>
  
  <entry>
    <title>Python Data Analysis Trio, Pandas (7): Merging Datasets</title>
    <link href="https://www.itbob.cn/article/031/"/>
    <id>https://www.itbob.cn/article/031/</id>
    <published>2020-06-21T12:58:52.000Z</published>
    <updated>2022-05-22T12:42:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">Table of Contents</span></h2><!-- toc --><ul><li><a href="#font-color-ff0000-01x00-concat-font"><font color="#FF0000">【01x00】concat</font></a></li><li><a href="#font-color-ff0000-02x00-append-font"><font color="#FF0000">【02x00】append</font></a></li><li><a href="#font-color-ff0000-03x00-merge-font"><font color="#FF0000">【03x00】merge</font></a><ul><li><a href="#font-color-4876ff-03x01-yi-dui-yi-lian-jie-font"><font color="#4876FF">【03x01】One-to-one joins</font></a></li><li><a href="#font-color-4876ff-03x02-duo-dui-yi-lian-jie-font"><font color="#4876FF">【03x02】Many-to-one joins</font></a></li><li><a href="#font-color-4876ff-03x03-duo-dui-duo-lian-jie-font"><font color="#4876FF">【03x03】Many-to-many joins</font></a></li><li><a href="#font-color-4876ff-03x04-can-shu-on-left-on-right-on-font"><font color="#4876FF">【03x04】The on / left_on / right_on parameters</font></a></li><li><a href="#font-color-4876ff-03x05-can-shu-how-font"><font color="#4876FF">【03x05】The how parameter</font></a></li><li><a href="#font-color-4876ff-03x06-can-shu-suffixes-font"><font color="#4876FF">【03x06】The suffixes parameter</font></a></li><li><a href="#font-color-4876ff-03x07-can-shu-left-index-right-index-font"><font color="#4876FF">【03x07】The left_index / right_index parameters</font></a></li></ul></li><li><a href="#font-color-ff0000-04x00-join-font"><font color="#FF0000">【04x00】join</font></a></li><li><a href="#font-color-ff0000-05x00-si-chong-fang-fa-de-qu-bie-font"><font color="#FF0000">【05x00】Differences between the four methods</font></a></li></ul><!-- tocstop --><hr><p>Articles in the Pandas series:</p><ul><li><a href="https://www.itbob.cn/article/025/">Python Data Analysis Trio, Pandas (1): Introducing Pandas and its Series and DataFrame objects</a></li><li><a href="https://www.itbob.cn/article/026/">Python Data Analysis Trio, Pandas (2): The Index object and indexing operations</a></li><li><a href="https://www.itbob.cn/article/027/">Python Data Analysis Trio, Pandas (3): Arithmetic operations and handling missing values</a></li><li><a href="https://www.itbob.cn/article/028/">Python Data Analysis Trio, Pandas (4): Function application, mapping, sorting, and hierarchical indexing</a></li><li><a href="https://www.itbob.cn/article/029/">Python Data Analysis Trio, Pandas (5): Statistical computation and descriptive statistics</a></li><li><a href="https://www.itbob.cn/article/030/">Python Data Analysis Trio, Pandas (6): GroupBy split, apply, and combine</a></li><li><a href="https://www.itbob.cn/article/031/">Python Data Analysis Trio, Pandas (7): Merging datasets</a></li><li><a href="https://www.itbob.cn/article/032/">Python Data Analysis Trio, Pandas (8): Reshaping data, handling duplicates, and replacing data</a></li><li><a href="https://www.itbob.cn/article/033/">Python Data Analysis Trio, Pandas (9): Time series</a></li><li><a href="https://www.itbob.cn/article/034/">Python Data Analysis Trio, Pandas (10): Reading and writing data</a></li></ul><hr><p>Columns:</p><ul><li>NumPy column: <a href="https://itrhx.blog.csdn.net/category_9780393.html">https://itrhx.blog.csdn.net/category_9780393.html</a></li><li>Pandas column: <a href="https://itrhx.blog.csdn.net/category_9780397.html">https://itrhx.blog.csdn.net/category_9780397.html</a></li><li>Matplotlib column: <a href="https://itrhx.blog.csdn.net/category_9780418.html">https://itrhx.blog.csdn.net/category_9780418.html</a></li></ul><br>Recommended learning resources and sites:<br><br><ul><li>NumPy Chinese documentation site: <a href="https://www.numpy.org.cn/">https://www.numpy.org.cn/</a></li><li>Pandas Chinese documentation site: <a href="https://www.pypandas.cn/">https://www.pypandas.cn/</a></li><li>Matplotlib Chinese documentation site: <a href="https://www.matplotlib.org.cn/">https://www.matplotlib.org.cn/</a></li><li>NumPy, Matplotlib, and Pandas cheat sheets: <a href="https://github.com/TRHX/Python-quick-reference-table">https://github.com/TRHX/Python-quick-reference-table</a></li></ul><hr><pre><code class="hljs yaml"><span class="hljs-string">This is a block of anti-scraping text; readers may ignore it.</span>
<span class="hljs-string">This article was first published on CSDN. Author: ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106830112</span>
<span class="hljs-string">Reproduction without authorization is prohibited. Malicious reposting is at your own risk. Respect original work and do not plagiarize.</span></code></pre><hr><h2><span id="01x00-concat"><font color="#FF0000">【01x00】concat</font></span></h2><p><code>pandas.concat</code> concatenates multiple objects along a specified axis.</p><p>Official documentation: <a 
href="https://pandas.pydata.org/docs/reference/api/pandas.concat.html">https://pandas.pydata.org/docs/reference/api/pandas.concat.html</a></p><p>Basic syntax:</p><pre><code class="hljs python">pandas.concat(objs: <span class="hljs-type">Union</span>[Iterable[<span class="hljs-string">&#x27;DataFrame&#x27;</span>], Mapping[<span class="hljs-type">Optional</span>[Hashable], <span class="hljs-string">&#x27;DataFrame&#x27;</span>]],
              axis=<span class="hljs-number">0</span>,
              join: <span class="hljs-built_in">str</span> = <span class="hljs-string">&#x27;outer&#x27;</span>,
              ignore_index: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>,
              keys=<span class="hljs-literal">None</span>,
              levels=<span class="hljs-literal">None</span>,
              names=<span class="hljs-literal">None</span>,
              verify_integrity: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>,
              sort: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>,
              copy: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">True</span>) -&gt; <span class="hljs-string">&#x27;DataFrame&#x27;</span></code></pre><pre><code class="hljs python">pandas.concat(objs: <span class="hljs-type">Union</span>[Iterable[FrameOrSeriesUnion], Mapping[<span class="hljs-type">Optional</span>[Hashable], FrameOrSeriesUnion]],
              axis=<span class="hljs-number">0</span>,
              join: <span class="hljs-built_in">str</span> = <span class="hljs-string">&#x27;outer&#x27;</span>,
              ignore_index: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>,
              keys=<span class="hljs-literal">None</span>,
              levels=<span class="hljs-literal">None</span>,
              names=<span class="hljs-literal">None</span>,
              verify_integrity: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>,
              sort: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>,
              copy: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">True</span>) -&gt; FrameOrSeriesUnion</code></pre><p>Common parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>objs</td><td>A sequence or mapping of Series or DataFrame objects to concatenate</td></tr><tr><td>axis</td><td>The axis to concatenate along: <code>0</code> or <code>'index'</code> (default), <code>1</code> or <code>'columns'</code>. Concatenating Series along axis 1 produces a DataFrame</td></tr><tr><td>join</td><td>How to handle the indexes on the other axis (or axes). Accepts <code>'inner'</code> or <code>'outer'</code> (default)<br><code>'outer'</code>: with axis = 0, columns with the same name are aligned and all other columns are kept (union), with missing values filled with NaN;<br><code>'inner'</code>: with axis = 0, columns with the same name are aligned and all other columns are dropped (intersection)</td></tr><tr><td>ignore_index</td><td>bool, default False; whether to discard the original index values. If True, the resulting index is 0, 1, …, n-1</td></tr><tr><td>keys</td><td>A sequence, default None. When keys are passed, a hierarchical index (a MultiIndex) is built with the keys as the outermost level</td></tr><tr><td>levels</td><td>Specific levels (unique values) to use for constructing the MultiIndex. If not given, they are inferred from the keys</td></tr><tr><td>names</td><td>A list of names for the levels of the resulting hierarchical index</td></tr><tr><td>verify_integrity</td><td>bool; whether to check the concatenated index for duplicates. With <code>True</code>, duplicates raise an error</td></tr><tr><td>sort</td><td>Sorts the column index when <code>join='outer'</code>. Has no effect when <code>join='inner'</code></td></tr></tbody></table><p>Concatenating two Series objects:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.Series([<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.Series([<span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.concat([obj1, obj2])
<span class="hljs-number">0</span>    a
<span class="hljs-number">1</span>    b
<span class="hljs-number">0</span>    c
<span class="hljs-number">1</span>    d
dtype: <span class="hljs-built_in">object</span></code></pre><p>Setting <code>ignore_index=True</code> discards the original index values:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.Series([<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.Series([<span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.concat([obj1, obj2], ignore_index=<span class="hljs-literal">True</span>)
<span class="hljs-number">0</span>    a
<span class="hljs-number">1</span>    b
<span class="hljs-number">2</span>    c
<span class="hljs-number">3</span>    d
dtype: <span class="hljs-built_in">object</span></code></pre><p>Passing <code>keys</code> adds an outermost index level:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.Series([<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.Series([<span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.concat([obj1, obj2], keys=[<span class="hljs-string">&#x27;s1&#x27;</span>, <span class="hljs-string">&#x27;s2&#x27;</span>])
s1  <span class="hljs-number">0</span>    a
    <span class="hljs-number">1</span>    b
s2  <span class="hljs-number">0</span>    c
    <span class="hljs-number">1</span>    d
dtype: <span class="hljs-built_in">object</span></code></pre><p>Passing <code>names</code> labels the index levels:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.Series([<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.Series([<span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.concat([obj1, obj2], keys=[<span class="hljs-string">&#x27;s1&#x27;</span>, <span class="hljs-string">&#x27;s2&#x27;</span>], names=[<span class="hljs-string">&#x27;Series name&#x27;</span>, <span class="hljs-string">&#x27;Row ID&#x27;</span>])
Series name  Row ID
s1           <span class="hljs-number">0</span>         a
             <span class="hljs-number">1</span>         b
s2           <span class="hljs-number">0</span>         c
             <span class="hljs-number">1</span>         d
dtype: <span class="hljs-built_in">object</span></code></pre><p>Concatenating <code>DataFrame</code> objects:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.DataFrame([[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-number">1</span>], [<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-number">2</span>]], columns=[<span class="hljs-string">&#x27;letter&#x27;</span>, <span class="hljs-string">&#x27;number&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.DataFrame([[<span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-number">3</span>], [<span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-number">4</span>]], columns=[<span class="hljs-string">&#x27;letter&#x27;</span>, <span 
class="hljs-string">&#x27;number&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
  letter  number
<span class="hljs-number">0</span>      a       <span class="hljs-number">1</span>
<span class="hljs-number">1</span>      b       <span class="hljs-number">2</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
  letter  number
<span class="hljs-number">0</span>      c       <span class="hljs-number">3</span>
<span class="hljs-number">1</span>      d       <span class="hljs-number">4</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.concat([obj1, obj2])
  letter  number
<span class="hljs-number">0</span>      a       <span class="hljs-number">1</span>
<span class="hljs-number">1</span>      b       <span class="hljs-number">2</span>
<span class="hljs-number">0</span>      c       <span class="hljs-number">3</span>
<span class="hljs-number">1</span>      d       <span class="hljs-number">4</span></code></pre><p>Concatenating <code>DataFrame</code> objects; values that do not exist are filled with NaN:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.DataFrame([[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-number">1</span>], [<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-number">2</span>]], columns=[<span class="hljs-string">&#x27;letter&#x27;</span>, <span class="hljs-string">&#x27;number&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.DataFrame([[<span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-number">3</span>, <span class="hljs-string">&#x27;cat&#x27;</span>], [<span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-number">4</span>, <span class="hljs-string">&#x27;dog&#x27;</span>]], columns=[<span class="hljs-string">&#x27;letter&#x27;</span>, <span 
class="hljs-string">&#x27;number&#x27;</span>, <span class="hljs-string">&#x27;animal&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
  letter  number
<span class="hljs-number">0</span>      a       <span class="hljs-number">1</span>
<span class="hljs-number">1</span>      b       <span class="hljs-number">2</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
  letter  number animal
<span class="hljs-number">0</span>      c       <span class="hljs-number">3</span>    cat
<span class="hljs-number">1</span>      d       <span class="hljs-number">4</span>    dog
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.concat([obj1, obj2])
  letter  number animal
<span class="hljs-number">0</span>      a       <span class="hljs-number">1</span>    NaN
<span class="hljs-number">1</span>      b       <span class="hljs-number">2</span>    NaN
<span class="hljs-number">0</span>      c       <span class="hljs-number">3</span>    cat
<span class="hljs-number">1</span>      d       <span class="hljs-number">4</span>    dog</code></pre><p>Concatenating <code>DataFrame</code> objects with <code>join=&quot;inner&quot;</code>; columns not present in every object are dropped:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.DataFrame([[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-number">1</span>], [<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-number">2</span>]], columns=[<span class="hljs-string">&#x27;letter&#x27;</span>, <span class="hljs-string">&#x27;number&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.DataFrame([[<span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-number">3</span>, <span class="hljs-string">&#x27;cat&#x27;</span>], [<span class="hljs-string">&#x27;d&#x27;</span>, <span 
class="hljs-number">4</span>, <span class="hljs-string">&#x27;dog&#x27;</span>]], columns=[<span class="hljs-string">&#x27;letter&#x27;</span>, <span class="hljs-string">&#x27;number&#x27;</span>, <span class="hljs-string">&#x27;animal&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
  letter  number
<span class="hljs-number">0</span>      a       <span class="hljs-number">1</span>
<span class="hljs-number">1</span>      b       <span class="hljs-number">2</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
  letter  number animal
<span class="hljs-number">0</span>      c       <span class="hljs-number">3</span>    cat
<span class="hljs-number">1</span>      d       <span class="hljs-number">4</span>    dog
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.concat([obj1, obj2], join=<span class="hljs-string">&quot;inner&quot;</span>)
  letter  number
<span class="hljs-number">0</span>      a       <span class="hljs-number">1</span>
<span class="hljs-number">1</span>      b       <span class="hljs-number">2</span>
<span class="hljs-number">0</span>      c       <span class="hljs-number">3</span>
<span class="hljs-number">1</span>      d       <span class="hljs-number">4</span></code></pre><p>Concatenating <code>DataFrame</code> objects with <code>axis=1</code>, that is, along the column axis (adding columns):</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.DataFrame([[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-number">1</span>], [<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-number">2</span>]], columns=[<span class="hljs-string">&#x27;letter&#x27;</span>, <span class="hljs-string">&#x27;number&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.DataFrame([[<span class="hljs-string">&#x27;bird&#x27;</span>, <span class="hljs-string">&#x27;polly&#x27;</span>], [<span class="hljs-string">&#x27;monkey&#x27;</span>, <span class="hljs-string">&#x27;george&#x27;</span>]], columns=[<span class="hljs-string">&#x27;animal&#x27;</span>, <span class="hljs-string">&#x27;name&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
  letter  number
<span class="hljs-number">0</span>      a       <span class="hljs-number">1</span>
<span class="hljs-number">1</span>      b       <span class="hljs-number">2</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
   animal    name
<span class="hljs-number">0</span>    bird   polly
<span class="hljs-number">1</span>  monkey  george
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.concat([obj1, obj2], axis=<span class="hljs-number">1</span>)
  letter  number  animal    name
<span class="hljs-number">0</span>      a       <span class="hljs-number">1</span>    bird   polly
<span class="hljs-number">1</span>      b       <span class="hljs-number">2</span>  monkey  george</code></pre><p>Setting <code>verify_integrity=True</code> checks the new index for duplicates and raises an error if any are found:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.DataFrame([<span class="hljs-number">1</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.DataFrame([<span class="hljs-number">2</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
   <span class="hljs-number">0</span>
a  <span class="hljs-number">1</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
   <span class="hljs-number">0</span>
a  <span class="hljs-number">2</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span 
class="hljs-meta">&gt;&gt;&gt; </span>pd.concat([obj1, obj2], verify_integrity=<span class="hljs-literal">True</span>)
Traceback (most recent call last):
    ...
ValueError: Indexes have overlapping values: [<span class="hljs-string">&#x27;a&#x27;</span>]</code></pre><p>Setting <code>sort=True</code> sorts the column index in the output:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.DataFrame([[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-number">3</span>], [<span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-number">2</span>]], columns=[<span class="hljs-string">&#x27;letter&#x27;</span>, <span class="hljs-string">&#x27;number&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.DataFrame([[<span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-number">1</span>, <span class="hljs-string">&#x27;cat&#x27;</span>], [<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-number">4</span>, <span class="hljs-string">&#x27;dog&#x27;</span>]], columns=[<span class="hljs-string">&#x27;letter&#x27;</span>, <span class="hljs-string">&#x27;number&#x27;</span>, <span class="hljs-string">&#x27;animal&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
  letter  number
<span class="hljs-number">0</span>      a       <span class="hljs-number">3</span>
<span class="hljs-number">1</span>      d       <span class="hljs-number">2</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
  letter  number animal
<span class="hljs-number">0</span>      c       <span class="hljs-number">1</span>    cat
<span class="hljs-number">1</span>      b       <span class="hljs-number">4</span>    dog
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.concat([obj1, obj2], sort=<span class="hljs-literal">True</span>)
  animal letter  number
<span class="hljs-number">0</span>    NaN      a       <span class="hljs-number">3</span>
<span 
class="hljs-number">1</span>    NaN      d       <span class="hljs-number">2</span>
<span class="hljs-number">0</span>    cat      c       <span class="hljs-number">1</span>
<span class="hljs-number">1</span>    dog      b       <span class="hljs-number">4</span></code></pre><h2><span id="02x00-append"><font color="#FF0000">【02x00】append</font></span></h2><p>The <code>append</code> method appends another Series / DataFrame to the end of a Series / DataFrame and returns a new object; the original object is left unchanged. Note that <code>Series.append</code> and <code>DataFrame.append</code> were deprecated in pandas 1.4 and removed in pandas 2.0, where <code>pandas.concat</code> should be used instead; the examples in this section apply to earlier versions.</p><p>Basic syntax:</p><ul><li><p><code>Series.append(self, to_append, ignore_index=False, verify_integrity=False)</code></p></li><li><p><code>DataFrame.append(self, other, ignore_index=False, verify_integrity=False, sort=False)</code></p></li></ul><p>Official documentation:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.append.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.append.html</a></p></li><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.append.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.append.html</a></p></li></ul><p>Parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>to_append / other</td><td>The data to append</td></tr><tr><td>ignore_index</td><td>bool, default False; whether to discard the original index values. If True, the resulting index is 0, 1, …, n-1</td></tr><tr><td>verify_integrity</td><td>bool; whether to check the combined index for duplicates. With <code>True</code>, duplicates raise an error</td></tr><tr><td>sort</td><td>bool; whether to sort the column index (columns). Defaults to False</td></tr></tbody></table><p>Appending Series objects:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.Series([<span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>])
<span 
class="hljs-meta">&gt;&gt;&gt; </span>obj3 = pd.Series([<span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>], index=[<span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
<span class="hljs-number">0</span>    <span class="hljs-number">1</span>
<span class="hljs-number">1</span>    <span class="hljs-number">2</span>
<span class="hljs-number">2</span>    <span class="hljs-number">3</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
<span class="hljs-number">0</span>    <span class="hljs-number">4</span>
<span class="hljs-number">1</span>    <span class="hljs-number">5</span>
<span class="hljs-number">2</span>    <span class="hljs-number">6</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj3
<span class="hljs-number">3</span>    <span class="hljs-number">4</span>
<span class="hljs-number">4</span>    <span class="hljs-number">5</span>
<span class="hljs-number">5</span>    <span class="hljs-number">6</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1.append(obj2)
<span class="hljs-number">0</span>    <span class="hljs-number">1</span>
<span class="hljs-number">1</span>    <span class="hljs-number">2</span>
<span class="hljs-number">2</span>    <span class="hljs-number">3</span>
<span class="hljs-number">0</span>    <span class="hljs-number">4</span>
<span class="hljs-number">1</span>    <span class="hljs-number">5</span>
<span class="hljs-number">2</span>    <span class="hljs-number">6</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1.append(obj3)
<span class="hljs-number">0</span>    <span class="hljs-number">1</span>
<span class="hljs-number">1</span>    <span class="hljs-number">2</span>
<span 
class="hljs-number">2</span>    <span class="hljs-number">3</span>
<span class="hljs-number">3</span>    <span class="hljs-number">4</span>
<span class="hljs-number">4</span>    <span class="hljs-number">5</span>
<span class="hljs-number">5</span>    <span class="hljs-number">6</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1.append(obj2, ignore_index=<span class="hljs-literal">True</span>)
<span class="hljs-number">0</span>    <span class="hljs-number">1</span>
<span class="hljs-number">1</span>    <span class="hljs-number">2</span>
<span class="hljs-number">2</span>    <span class="hljs-number">3</span>
<span class="hljs-number">3</span>    <span class="hljs-number">4</span>
<span class="hljs-number">4</span>    <span class="hljs-number">5</span>
<span class="hljs-number">5</span>    <span class="hljs-number">6</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1.append(obj2, verify_integrity=<span class="hljs-literal">True</span>)
Traceback (most recent call last):
...
ValueError: Indexes have overlapping values: Int64Index([<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>], dtype=<span class="hljs-string">&#x27;int64&#x27;</span>)</code></pre><p>Appending DataFrame objects:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.DataFrame([[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], [<span class="hljs-number">3</span>, <span class="hljs-number">4</span>]], columns=<span class="hljs-built_in">list</span>(<span class="hljs-string">&#x27;AB&#x27;</span>))
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.DataFrame([[<span class="hljs-number">5</span>, <span class="hljs-number">6</span>], [<span 
class="hljs-number">7</span>, <span class="hljs-number">8</span>]], columns=<span class="hljs-built_in">list</span>(<span class="hljs-string">&#x27;AB&#x27;</span>))
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
   A  B
<span class="hljs-number">0</span>  <span class="hljs-number">1</span>  <span class="hljs-number">2</span>
<span class="hljs-number">1</span>  <span class="hljs-number">3</span>  <span class="hljs-number">4</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
   A  B
<span class="hljs-number">0</span>  <span class="hljs-number">5</span>  <span class="hljs-number">6</span>
<span class="hljs-number">1</span>  <span class="hljs-number">7</span>  <span class="hljs-number">8</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1.append(obj2)
   A  B
<span class="hljs-number">0</span>  <span class="hljs-number">1</span>  <span class="hljs-number">2</span>
<span class="hljs-number">1</span>  <span class="hljs-number">3</span>  <span class="hljs-number">4</span>
<span class="hljs-number">0</span>  <span class="hljs-number">5</span>  <span class="hljs-number">6</span>
<span class="hljs-number">1</span>  <span class="hljs-number">7</span>  <span class="hljs-number">8</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1.append(obj2, ignore_index=<span class="hljs-literal">True</span>)
   A  B
<span class="hljs-number">0</span>  <span class="hljs-number">1</span>  <span class="hljs-number">2</span>
<span class="hljs-number">1</span>  <span class="hljs-number">3</span>  <span class="hljs-number">4</span>
<span class="hljs-number">2</span>  <span class="hljs-number">5</span>  <span class="hljs-number">6</span>
<span class="hljs-number">3</span>  <span class="hljs-number">7</span>  <span class="hljs-number">8</span></code></pre><p>The following are not recommended ways to build DataFrames, but they show two ways to generate a DataFrame from multiple data sources. The first, calling <code>append</code> inside a loop, is the less efficient of the two; the second, a single <code>pandas.concat</code> call, is the more efficient:</p><pre><code 
class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(columns=[<span class="hljs-string">&#x27;A&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-number">5</span>):
    obj = obj.append(&#123;<span class="hljs-string">&#x27;A&#x27;</span>: i&#125;, ignore_index=<span class="hljs-literal">True</span>)
    
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
   A
<span class="hljs-number">0</span>  <span class="hljs-number">0</span>
<span class="hljs-number">1</span>  <span class="hljs-number">1</span>
<span class="hljs-number">2</span>  <span class="hljs-number">2</span>
<span class="hljs-number">3</span>  <span class="hljs-number">3</span>
<span class="hljs-number">4</span>  <span class="hljs-number">4</span></code></pre><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.concat([pd.DataFrame([i], columns=[<span class="hljs-string">&#x27;A&#x27;</span>]) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-number">5</span>)], ignore_index=<span class="hljs-literal">True</span>)
   A
<span class="hljs-number">0</span>  <span class="hljs-number">0</span>
<span class="hljs-number">1</span>  <span class="hljs-number">1</span>
<span class="hljs-number">2</span>  <span class="hljs-number">2</span>
<span class="hljs-number">3</span>  <span class="hljs-number">3</span>
<span class="hljs-number">4</span>  <span class="hljs-number">4</span></code></pre><h2><span id="03x00-merge"><font 
color="#FF0000">【03x00】merge</font></span></h2><p>Combining different data sources is a common operation in data science. It ranges from simply concatenating two datasets to database-style join and merge operations on datasets with overlapping fields. Both Series and DataFrame support these operations, and Pandas functions and methods make combining data fast and simple.</p><p>Merge and join operations combine datasets by linking rows on one or more keys. These operations are at the core of relational (SQL-based) databases. The Pandas merge function is the main entry point for applying these algorithms to your data.</p><p><font color="#FF0000"> <strong><code>pandas.merge</code> links rows from different DataFrames on one or more join keys.</strong></font></p><p>Basic syntax:</p><pre><code class="hljs python">pandas.merge(left,
             right,
             how: <span class="hljs-built_in">str</span> = <span class="hljs-string">&#x27;inner&#x27;</span>,
             on=<span class="hljs-literal">None</span>,
             left_on=<span class="hljs-literal">None</span>,
             right_on=<span class="hljs-literal">None</span>,
             left_index: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>,
             right_index: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>,
             sort: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>,
             suffixes=(<span class="hljs-string">&#x27;_x&#x27;</span>, <span class="hljs-string">&#x27;_y&#x27;</span>),
             copy: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">True</span>,
             indicator: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>,
             validate=<span class="hljs-literal">None</span>) -&gt; &#x27;DataFrame&#x27;</code></pre><p>Official documentation: <a href="https://pandas.pydata.org/docs/reference/api/pandas.merge.html">https://pandas.pydata.org/docs/reference/api/pandas.merge.html</a></p><p>Common parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>left</td><td>The left DataFrame taking part in the merge</td></tr><tr><td>right</td><td>The right DataFrame taking part in the merge</td></tr><tr><td>how</td><td>Type of merge, defaults to <code>'inner'</code><br><code>'inner'</code>: inner join, uses only the keys present in <font color="#FF0000"><strong>both</strong></font> objects (intersection);<br><code>'outer'</code>: outer join, uses <font color="#FF0000"><strong>all</strong></font> keys from both objects (union);<br><code>'left'</code>: left join, uses all keys from the <font color="#FF0000"><strong>left</strong></font> object;<br><code>'right'</code>: right join, uses all keys from the <font color="#FF0000"><strong>right</strong></font> object;</td></tr><tr><td>on</td><td>Column name(s) to join on. Must exist in both the left and right DataFrame<br>If not specified, and no other join key is given, the intersection of the column names in left and right is used as the join key</td></tr><tr><td>left_on</td><td>Column(s) in the left DataFrame to use as the join key</td></tr><tr><td>right_on</td><td>Column(s) in the right DataFrame to use as the join key</td></tr><tr><td>left_index</td><td>bool, whether to use the index of the left DataFrame as the join key, defaults to False</td></tr><tr><td>right_index</td><td>bool, whether to use the index of the right DataFrame as the join key, defaults to False</td></tr><tr><td>sort</td><td>bool, whether to sort the join keys in the result, defaults to False.<br>If False, the order of the join keys depends on the join type (the how keyword)</td></tr><tr><td>suffixes</td><td>Tuple of strings appended to overlapping column names, defaults to <code>('_x', '_y')</code>.<br>For example, if both the left and right DataFrame have a <code>data</code> column, the result contains <code>data_x</code> and <code>data_y</code></td></tr></tbody></table><h3><span id="03x01-yi-dui-yi-lian-jie"><font color="#4876FF">【03x01】One-to-one joins</font></span></h3><p><font color="#FF0000"><strong>A one-to-one join is one in which the key column contains no duplicate values in either DataFrame.</strong></font></p><p>If you call <code>merge</code> without specifying any join arguments, <code>merge</code> uses the overlapping column names as the keys.</p><p>In the example below, both DataFrames have a column named <code>key</code>. Since no join column is specified, <code>merge</code> joins on <code>key</code> by default:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key&#x27;</span>: [<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>], <span class="hljs-string">&#x27;data1&#x27;</span>: <span class="hljs-built_in">range</span>(<span class="hljs-number">3</span>)&#125;)
<span class="hljs-meta">&gt;&gt;&gt; 
</span>obj2 = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key&#x27;</span>: [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>], <span class="hljs-string">&#x27;data2&#x27;</span>: <span class="hljs-built_in">range</span>(<span class="hljs-number">3</span>)&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
  key  data1
<span class="hljs-number">0</span>   b      <span class="hljs-number">0</span>
<span class="hljs-number">1</span>   a      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>   c      <span class="hljs-number">2</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
  key  data2
<span class="hljs-number">0</span>   a      <span class="hljs-number">0</span>
<span class="hljs-number">1</span>   c      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>   b      <span class="hljs-number">2</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.merge(obj1, obj2)
  key  data1  data2
<span class="hljs-number">0</span>   b      <span class="hljs-number">0</span>      <span class="hljs-number">2</span>
<span class="hljs-number">1</span>   a      <span class="hljs-number">1</span>      <span class="hljs-number">0</span>
<span class="hljs-number">2</span>   c      <span class="hljs-number">2</span>      <span class="hljs-number">1</span></code></pre><h3><span id="03x02-duo-dui-yi-lian-jie"><font color="#4876FF">【03x02】Many-to-one joins</font></span></h3><p><font color="#FF0000"><strong>A many-to-one join is one in which the key column of one of the two DataFrames contains duplicate values.</strong></font> The DataFrame produced by a many-to-one join preserves those duplicates.</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key&#x27;</span>: [<span 
class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>], <span class="hljs-string">&#x27;data1&#x27;</span>: <span class="hljs-built_in">range</span>(<span class="hljs-number">7</span>)&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key&#x27;</span>: [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], <span class="hljs-string">&#x27;data2&#x27;</span>: <span class="hljs-built_in">range</span>(<span class="hljs-number">3</span>)&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
  key  data1
<span class="hljs-number">0</span>   b      <span class="hljs-number">0</span>
<span class="hljs-number">1</span>   b      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>   a      <span class="hljs-number">2</span>
<span class="hljs-number">3</span>   c      <span class="hljs-number">3</span>
<span class="hljs-number">4</span>   a      <span class="hljs-number">4</span>
<span class="hljs-number">5</span>   a      <span class="hljs-number">5</span>
<span class="hljs-number">6</span>   b      <span class="hljs-number">6</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
  key  data2
<span class="hljs-number">0</span>   a      <span class="hljs-number">0</span>
<span class="hljs-number">1</span>   b      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>   d      <span class="hljs-number">2</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.merge(obj1, obj2)
  key  data1  data2
<span 
class="hljs-number">0</span>   b      <span class="hljs-number">0</span>      <span class="hljs-number">1</span>
<span class="hljs-number">1</span>   b      <span class="hljs-number">1</span>      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>   b      <span class="hljs-number">6</span>      <span class="hljs-number">1</span>
<span class="hljs-number">3</span>   a      <span class="hljs-number">2</span>      <span class="hljs-number">0</span>
<span class="hljs-number">4</span>   a      <span class="hljs-number">4</span>      <span class="hljs-number">0</span>
<span class="hljs-number">5</span>   a      <span class="hljs-number">5</span>      <span class="hljs-number">0</span></code></pre><h3><span id="03x03-duo-dui-duo-lian-jie"><font color="#4876FF">【03x03】Many-to-many joins</font></span></h3><p><font color="#FF0000"><strong>A many-to-many join is one in which the key columns of both DataFrames contain duplicate values.</strong></font> For each key, the result contains every pairing of the matching rows from the two sides.</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key&#x27;</span>: [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>], <span class="hljs-string">&#x27;data1&#x27;</span>: <span class="hljs-built_in">range</span>(<span class="hljs-number">4</span>)&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key&#x27;</span>: [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>], <span class="hljs-string">&#x27;data2&#x27;</span>: <span 
class="hljs-built_in">range</span>(<span class="hljs-number">6</span>)&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
  key  data1
<span class="hljs-number">0</span>   a      <span class="hljs-number">0</span>
<span class="hljs-number">1</span>   b      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>   b      <span class="hljs-number">2</span>
<span class="hljs-number">3</span>   c      <span class="hljs-number">3</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
  key  data2
<span class="hljs-number">0</span>   a      <span class="hljs-number">0</span>
<span class="hljs-number">1</span>   a      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>   b      <span class="hljs-number">2</span>
<span class="hljs-number">3</span>   b      <span class="hljs-number">3</span>
<span class="hljs-number">4</span>   c      <span class="hljs-number">4</span>
<span class="hljs-number">5</span>   c      <span class="hljs-number">5</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.merge(obj1, obj2)
  key  data1  data2
<span class="hljs-number">0</span>   a      <span class="hljs-number">0</span>      <span class="hljs-number">0</span>
<span class="hljs-number">1</span>   a      <span class="hljs-number">0</span>      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>   b      <span class="hljs-number">1</span>      <span class="hljs-number">2</span>
<span class="hljs-number">3</span>   b      <span class="hljs-number">1</span>      <span class="hljs-number">3</span>
<span class="hljs-number">4</span>   b      <span class="hljs-number">2</span>      <span class="hljs-number">2</span>
<span class="hljs-number">5</span>   b      <span class="hljs-number">2</span>      <span class="hljs-number">3</span>
<span class="hljs-number">6</span>   c      <span class="hljs-number">3</span>      <span class="hljs-number">4</span>
<span 
class="hljs-number">7</span>   c      <span class="hljs-number">3</span>      <span class="hljs-number">5</span></code></pre><hr><pre><code class="hljs yaml"><span class="hljs-string">This is anti-scraping text; readers may ignore it.</span>
<span class="hljs-string">This article was first published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106830112</span>
<span class="hljs-string">Reproduction without authorization is prohibited. Respect original work and do not plagiarize.</span></code></pre><hr><h3><span id="03x04-can-shu-on-left-on-right-on"><font color="#4876FF">【03x04】The on / left_on / right_on parameters</font></span></h3><p>The <code>on</code> parameter specifies which column to merge on. If it is omitted, the overlapping column names are used as the keys by default:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key&#x27;</span>: [<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>], <span class="hljs-string">&#x27;data1&#x27;</span>: <span class="hljs-built_in">range</span>(<span class="hljs-number">3</span>)&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key&#x27;</span>: [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>], <span class="hljs-string">&#x27;data2&#x27;</span>: <span class="hljs-built_in">range</span>(<span class="hljs-number">3</span>)&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
  key  data1
<span class="hljs-number">0</span>   b      <span class="hljs-number">0</span>
<span class="hljs-number">1</span>   a      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>   c      <span 
class="hljs-number">2</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
  key  data2
<span class="hljs-number">0</span>   a      <span class="hljs-number">0</span>
<span class="hljs-number">1</span>   c      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>   b      <span class="hljs-number">2</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.merge(obj1, obj2, on=<span class="hljs-string">&#x27;key&#x27;</span>)
  key  data1  data2
<span class="hljs-number">0</span>   b      <span class="hljs-number">0</span>      <span class="hljs-number">2</span>
<span class="hljs-number">1</span>   a      <span class="hljs-number">1</span>      <span class="hljs-number">0</span>
<span class="hljs-number">2</span>   c      <span class="hljs-number">2</span>      <span class="hljs-number">1</span></code></pre><p>To merge on multiple keys, pass a list of column names:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>left = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key1&#x27;</span>: [<span class="hljs-string">&#x27;foo&#x27;</span>, <span class="hljs-string">&#x27;foo&#x27;</span>, <span class="hljs-string">&#x27;bar&#x27;</span>],
             <span class="hljs-string">&#x27;key2&#x27;</span>: [<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>],
             <span class="hljs-string">&#x27;lval&#x27;</span>: [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>]&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>right = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key1&#x27;</span>: [<span class="hljs-string">&#x27;foo&#x27;</span>, <span class="hljs-string">&#x27;foo&#x27;</span>, 
<span class="hljs-string">&#x27;bar&#x27;</span>, <span class="hljs-string">&#x27;bar&#x27;</span>],
              <span class="hljs-string">&#x27;key2&#x27;</span>: [<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>],
              <span class="hljs-string">&#x27;rval&#x27;</span>: [<span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>]&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>left
  key1 key2  lval
<span class="hljs-number">0</span>  foo  one     <span class="hljs-number">1</span>
<span class="hljs-number">1</span>  foo  two     <span class="hljs-number">2</span>
<span class="hljs-number">2</span>  bar  one     <span class="hljs-number">3</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>right
  key1 key2  rval
<span class="hljs-number">0</span>  foo  one     <span class="hljs-number">4</span>
<span class="hljs-number">1</span>  foo  one     <span class="hljs-number">5</span>
<span class="hljs-number">2</span>  bar  one     <span class="hljs-number">6</span>
<span class="hljs-number">3</span>  bar  two     <span class="hljs-number">7</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.merge(left, right, on=[<span class="hljs-string">&#x27;key1&#x27;</span>, <span class="hljs-string">&#x27;key2&#x27;</span>])
  key1 key2  lval  rval
<span class="hljs-number">0</span>  foo  one     <span class="hljs-number">1</span>     <span class="hljs-number">4</span>
<span class="hljs-number">1</span>  foo  one     <span class="hljs-number">1</span>     <span class="hljs-number">5</span>
<span class="hljs-number">2</span>  bar  one     <span class="hljs-number">3</span>     <span class="hljs-number">6</span></code></pre><p>If the key columns have different names in the two objects, use the <code>left_on</code> and <code>right_on</code> parameters to specify each one separately:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.DataFrame(&#123;<span class="hljs-string">&#x27;lkey&#x27;</span>: [<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>], <span class="hljs-string">&#x27;data1&#x27;</span>: <span class="hljs-built_in">range</span>(<span class="hljs-number">7</span>)&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.DataFrame(&#123;<span class="hljs-string">&#x27;rkey&#x27;</span>: [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], <span class="hljs-string">&#x27;data2&#x27;</span>: <span class="hljs-built_in">range</span>(<span class="hljs-number">3</span>)&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
  lkey  data1
<span class="hljs-number">0</span>    b      <span class="hljs-number">0</span>
<span class="hljs-number">1</span>    b      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>    a      <span class="hljs-number">2</span>
<span class="hljs-number">3</span>    c      <span class="hljs-number">3</span>
<span class="hljs-number">4</span>    a      <span class="hljs-number">4</span>
<span class="hljs-number">5</span>    a      <span class="hljs-number">5</span>
<span class="hljs-number">6</span>    b      <span class="hljs-number">6</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
  rkey  data2
<span class="hljs-number">0</span>    a      <span 
class="hljs-number">0</span>
<span class="hljs-number">1</span>    b      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>    d      <span class="hljs-number">2</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.merge(obj1, obj2, left_on=<span class="hljs-string">&#x27;lkey&#x27;</span>, right_on=<span class="hljs-string">&#x27;rkey&#x27;</span>)
  lkey  data1 rkey  data2
<span class="hljs-number">0</span>    b      <span class="hljs-number">0</span>    b      <span class="hljs-number">1</span>
<span class="hljs-number">1</span>    b      <span class="hljs-number">1</span>    b      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>    b      <span class="hljs-number">6</span>    b      <span class="hljs-number">1</span>
<span class="hljs-number">3</span>    a      <span class="hljs-number">2</span>    a      <span class="hljs-number">0</span>
<span class="hljs-number">4</span>    a      <span class="hljs-number">4</span>    a      <span class="hljs-number">0</span>
<span class="hljs-number">5</span>    a      <span class="hljs-number">5</span>    a      <span class="hljs-number">0</span></code></pre><h3><span id="03x05-can-shu-how"><font color="#4876FF">【03x05】The how parameter</font></span></h3><p>In the previous example, c and d and their associated data are missing from the result. By default, <code>merge</code> performs an inner join (<code>'inner'</code>), so the keys in the result are the intersection of the two key sets. The other options are <code>'left'</code>, <code>'right'</code>, and <code>'outer'</code>. Their meanings are:</p><ul><li><code>'inner'</code>: inner join, uses only the keys present in <font color="#FF0000"><strong>both</strong></font> objects (intersection);</li><li><code>'outer'</code>: outer join, uses <font color="#FF0000"><strong>all</strong></font> keys from both objects (union);</li><li><code>'left'</code>: left join, uses all keys from the <font color="#FF0000"><strong>left</strong></font> object;</li><li><code>'right'</code>: right join, uses all keys from the <font color="#FF0000"><strong>right</strong></font> object;</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span 
class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key&#x27;</span>: [<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>], <span class="hljs-string">&#x27;data1&#x27;</span>: <span class="hljs-built_in">range</span>(<span class="hljs-number">7</span>)&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key&#x27;</span>: [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], <span class="hljs-string">&#x27;data2&#x27;</span>: <span class="hljs-built_in">range</span>(<span class="hljs-number">3</span>)&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
  key  data1
<span class="hljs-number">0</span>   b      <span class="hljs-number">0</span>
<span class="hljs-number">1</span>   b      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>   a      <span class="hljs-number">2</span>
<span class="hljs-number">3</span>   c      <span class="hljs-number">3</span>
<span class="hljs-number">4</span>   a      <span class="hljs-number">4</span>
<span class="hljs-number">5</span>   a      <span class="hljs-number">5</span>
<span class="hljs-number">6</span>   b      <span class="hljs-number">6</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
  key  data2
<span class="hljs-number">0</span>   a      <span class="hljs-number">0</span>
<span class="hljs-number">1</span>   b      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>   d      <span class="hljs-number">2</span>
<span 
class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.merge(obj1, obj2, on=<span class="hljs-string">&#x27;key&#x27;</span>, how=<span class="hljs-string">&#x27;inner&#x27;</span>)
  key  data1  data2
<span class="hljs-number">0</span>   b      <span class="hljs-number">0</span>      <span class="hljs-number">1</span>
<span class="hljs-number">1</span>   b      <span class="hljs-number">1</span>      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>   b      <span class="hljs-number">6</span>      <span class="hljs-number">1</span>
<span class="hljs-number">3</span>   a      <span class="hljs-number">2</span>      <span class="hljs-number">0</span>
<span class="hljs-number">4</span>   a      <span class="hljs-number">4</span>      <span class="hljs-number">0</span>
<span class="hljs-number">5</span>   a      <span class="hljs-number">5</span>      <span class="hljs-number">0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.merge(obj1, obj2, on=<span class="hljs-string">&#x27;key&#x27;</span>, how=<span class="hljs-string">&#x27;outer&#x27;</span>)
  key  data1  data2
<span class="hljs-number">0</span>   b    <span class="hljs-number">0.0</span>    <span class="hljs-number">1.0</span>
<span class="hljs-number">1</span>   b    <span class="hljs-number">1.0</span>    <span class="hljs-number">1.0</span>
<span class="hljs-number">2</span>   b    <span class="hljs-number">6.0</span>    <span class="hljs-number">1.0</span>
<span class="hljs-number">3</span>   a    <span class="hljs-number">2.0</span>    <span class="hljs-number">0.0</span>
<span class="hljs-number">4</span>   a    <span class="hljs-number">4.0</span>    <span class="hljs-number">0.0</span>
<span class="hljs-number">5</span>   a    <span class="hljs-number">5.0</span>    <span class="hljs-number">0.0</span>
<span class="hljs-number">6</span>   c    <span class="hljs-number">3.0</span>    NaN
<span 
class="hljs-number">7</span>   d    NaN    <span class="hljs-number">2.0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.merge(obj1, obj2, on=<span class="hljs-string">&#x27;key&#x27;</span>, how=<span class="hljs-string">&#x27;left&#x27;</span>)
  key  data1  data2
<span class="hljs-number">0</span>   b      <span class="hljs-number">0</span>    <span class="hljs-number">1.0</span>
<span class="hljs-number">1</span>   b      <span class="hljs-number">1</span>    <span class="hljs-number">1.0</span>
<span class="hljs-number">2</span>   a      <span class="hljs-number">2</span>    <span class="hljs-number">0.0</span>
<span class="hljs-number">3</span>   c      <span class="hljs-number">3</span>    NaN
<span class="hljs-number">4</span>   a      <span class="hljs-number">4</span>    <span class="hljs-number">0.0</span>
<span class="hljs-number">5</span>   a      <span class="hljs-number">5</span>    <span class="hljs-number">0.0</span>
<span class="hljs-number">6</span>   b      <span class="hljs-number">6</span>    <span class="hljs-number">1.0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.merge(obj1, obj2, on=<span class="hljs-string">&#x27;key&#x27;</span>, how=<span class="hljs-string">&#x27;right&#x27;</span>)
  key  data1  data2
<span class="hljs-number">0</span>   b    <span class="hljs-number">0.0</span>      <span class="hljs-number">1</span>
<span class="hljs-number">1</span>   b    <span class="hljs-number">1.0</span>      <span class="hljs-number">1</span>
<span class="hljs-number">2</span>   b    <span class="hljs-number">6.0</span>      <span class="hljs-number">1</span>
<span class="hljs-number">3</span>   a    <span class="hljs-number">2.0</span>      <span class="hljs-number">0</span>
<span class="hljs-number">4</span>   a    <span class="hljs-number">4.0</span>      <span class="hljs-number">0</span>
<span class="hljs-number">5</span>   a    <span 
class="hljs-number">5.0</span>      <span class="hljs-number">0</span>
<span class="hljs-number">6</span>   d    NaN      <span class="hljs-number">2</span></code></pre><h3><span id="03x06-can-shu-suffixes"><font color="#4876FF">【03x06】The suffixes parameter</font></span></h3><p>The <code>suffixes</code> parameter specifies the strings appended to overlapping column names from the left and right DataFrames.</p><p>In the following example, the merge is on <code>key1</code>, and both DataFrames also contain a <code>key2</code> column. If <code>suffixes</code> is not specified, <code>_x</code> and <code>_y</code> are appended to the two <code>key2</code> columns by default to tell them apart. If <code>suffixes</code> is specified, the given suffixes are appended instead:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>left = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key1&#x27;</span>: [<span class="hljs-string">&#x27;foo&#x27;</span>, <span class="hljs-string">&#x27;foo&#x27;</span>, <span class="hljs-string">&#x27;bar&#x27;</span>],
             <span class="hljs-string">&#x27;key2&#x27;</span>: [<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>],
             <span class="hljs-string">&#x27;lval&#x27;</span>: [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>]&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>right = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key1&#x27;</span>: [<span class="hljs-string">&#x27;foo&#x27;</span>, <span class="hljs-string">&#x27;foo&#x27;</span>, <span class="hljs-string">&#x27;bar&#x27;</span>, <span class="hljs-string">&#x27;bar&#x27;</span>],
              <span class="hljs-string">&#x27;key2&#x27;</span>: [<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>],
              <span 
class="hljs-string">&#x27;rval&#x27;</span>: [<span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>]&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>left
  key1 key2  lval
<span class="hljs-number">0</span>  foo  one     <span class="hljs-number">1</span>
<span class="hljs-number">1</span>  foo  two     <span class="hljs-number">2</span>
<span class="hljs-number">2</span>  bar  one     <span class="hljs-number">3</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>right
  key1 key2  rval
<span class="hljs-number">0</span>  foo  one     <span class="hljs-number">4</span>
<span class="hljs-number">1</span>  foo  one     <span class="hljs-number">5</span>
<span class="hljs-number">2</span>  bar  one     <span class="hljs-number">6</span>
<span class="hljs-number">3</span>  bar  two     <span class="hljs-number">7</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.merge(left, right, on=<span class="hljs-string">&#x27;key1&#x27;</span>)
  key1 key2_x  lval key2_y  rval
<span class="hljs-number">0</span>  foo    one     <span class="hljs-number">1</span>    one     <span class="hljs-number">4</span>
<span class="hljs-number">1</span>  foo    one     <span class="hljs-number">1</span>    one     <span class="hljs-number">5</span>
<span class="hljs-number">2</span>  foo    two     <span class="hljs-number">2</span>    one     <span class="hljs-number">4</span>
<span class="hljs-number">3</span>  foo    two     <span class="hljs-number">2</span>    one     <span class="hljs-number">5</span>
<span class="hljs-number">4</span>  bar    one     <span class="hljs-number">3</span>    one     <span class="hljs-number">6</span>
<span class="hljs-number">5</span>  bar    one     <span class="hljs-number">3</span>    two     <span class="hljs-number">7</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span 
class="hljs-meta">&gt;&gt;&gt; </span>pd.merge(left, right, on=<span class="hljs-string">&#x27;key1&#x27;</span>, suffixes=(<span class="hljs-string">&#x27;_left&#x27;</span>, <span class="hljs-string">&#x27;_right&#x27;</span>))  key1 key2_left  lval key2_right  rval<span class="hljs-number">0</span>  foo       one     <span class="hljs-number">1</span>        one     <span class="hljs-number">4</span><span class="hljs-number">1</span>  foo       one     <span class="hljs-number">1</span>        one     <span class="hljs-number">5</span><span class="hljs-number">2</span>  foo       two     <span class="hljs-number">2</span>        one     <span class="hljs-number">4</span><span class="hljs-number">3</span>  foo       two     <span class="hljs-number">2</span>        one     <span class="hljs-number">5</span><span class="hljs-number">4</span>  bar       one     <span class="hljs-number">3</span>        one     <span class="hljs-number">6</span><span class="hljs-number">5</span>  bar       one     <span class="hljs-number">3</span>        two     <span class="hljs-number">7</span></code></pre><h3><span id="03x07-can-shu-left-index-right-index"><font color="#4876FF">【03x07】The left_index / right_index Parameters</font></span></h3><p>Sometimes the join key of a DataFrame is in its index. In that case, pass <code>left_index=True</code> or <code>right_index=True</code> (or both) to indicate that the index should be used as the join key. This is known as joining on the index, and Pandas also provides a <code>join</code> method for it.</p><p>In the following example, the join uses the key column of left, while the join key of right is in its index, so <code>right_index=True</code> must be specified:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>left = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key&#x27;</span>: [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span 
class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>], <span class="hljs-string">&#x27;value&#x27;</span>: <span class="hljs-built_in">range</span>(<span class="hljs-number">6</span>)&#125;)<span class="hljs-meta">&gt;&gt;&gt; </span>right = pd.DataFrame(&#123;<span class="hljs-string">&#x27;group_val&#x27;</span>: [<span class="hljs-number">3.5</span>, <span class="hljs-number">7</span>]&#125;, index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>])<span class="hljs-meta">&gt;&gt;&gt; </span>left  key  value<span class="hljs-number">0</span>   a      <span class="hljs-number">0</span><span class="hljs-number">1</span>   b      <span class="hljs-number">1</span><span class="hljs-number">2</span>   a      <span class="hljs-number">2</span><span class="hljs-number">3</span>   a      <span class="hljs-number">3</span><span class="hljs-number">4</span>   b      <span class="hljs-number">4</span><span class="hljs-number">5</span>   c      <span class="hljs-number">5</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>right   group_vala        <span class="hljs-number">3.5</span>b        <span class="hljs-number">7.0</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>pd.merge(left, right, left_on=<span class="hljs-string">&#x27;key&#x27;</span>, right_index=<span class="hljs-literal">True</span>)  key  value  group_val<span class="hljs-number">0</span>   a      <span class="hljs-number">0</span>        <span class="hljs-number">3.5</span><span class="hljs-number">2</span>   a      <span class="hljs-number">2</span>        <span class="hljs-number">3.5</span><span class="hljs-number">3</span>   a      <span class="hljs-number">3</span>        <span class="hljs-number">3.5</span><span class="hljs-number">1</span>   b      <span class="hljs-number">1</span>        <span class="hljs-number">7.0</span><span 
class="hljs-number">4</span>   b      <span class="hljs-number">4</span>        <span class="hljs-number">7.0</span></code></pre><h2><span id="04x00-join"><font color="#FF0000">【04x00】join</font></span></h2><p>The join method is available only on DataFrame objects; Series objects do not have it. It joins the columns of another DataFrame object onto the caller.</p><p>Basic syntax: <code>DataFrame.join(self, other, on=None, how='left', lsuffix='', rsuffix='', sort=False) → 'DataFrame'</code></p><p>Parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>other</td><td>Another DataFrame, a Series, or a list of DataFrame objects</td></tr><tr><td>on</td><td>A column name, or a list or tuple of column names, in the caller to join on</td></tr><tr><td>how</td><td>Join type, default <code>'left'</code><br><code>'inner'</code>: inner join, uses the keys present in <font color="#FF0000"><strong>both</strong></font> objects (intersection);<br><code>'outer'</code>: outer join, uses <font color="#FF0000"><strong>all</strong></font> keys from both objects (union);<br><code>'left'</code>: left join, uses all keys from the <font color="#FF0000"><strong>left</strong></font> object;<br><code>'right'</code>: right join, uses all keys from the <font color="#FF0000"><strong>right</strong></font> object</td></tr><tr><td>lsuffix</td><td>Suffix added to the left object's column names when both objects share a column name</td></tr><tr><td>rsuffix</td><td>Suffix added to the right object's column names when both objects share a column name</td></tr><tr><td>sort</td><td>bool, default False. Whether to sort the result by the join key.<br>If False, the order of the join key depends on the join type (the how keyword)</td></tr></tbody></table><p>Using the <code>lsuffix</code> and <code>rsuffix</code> parameters:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key&#x27;</span>: [<span class="hljs-string">&#x27;K0&#x27;</span>, <span class="hljs-string">&#x27;K1&#x27;</span>, <span class="hljs-string">&#x27;K2&#x27;</span>, <span class="hljs-string">&#x27;K3&#x27;</span>, <span class="hljs-string">&#x27;K4&#x27;</span>, <span class="hljs-string">&#x27;K5&#x27;</span>],            <span class="hljs-string">&#x27;A&#x27;</span>: [<span class="hljs-string">&#x27;A0&#x27;</span>, 
<span class="hljs-string">&#x27;A1&#x27;</span>, <span class="hljs-string">&#x27;A2&#x27;</span>, <span class="hljs-string">&#x27;A3&#x27;</span>, <span class="hljs-string">&#x27;A4&#x27;</span>, <span class="hljs-string">&#x27;A5&#x27;</span>]&#125;)<span class="hljs-meta">&gt;&gt;&gt; </span>other = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key&#x27;</span>: [<span class="hljs-string">&#x27;K0&#x27;</span>, <span class="hljs-string">&#x27;K1&#x27;</span>, <span class="hljs-string">&#x27;K2&#x27;</span>],              <span class="hljs-string">&#x27;B&#x27;</span>: [<span class="hljs-string">&#x27;B0&#x27;</span>, <span class="hljs-string">&#x27;B1&#x27;</span>, <span class="hljs-string">&#x27;B2&#x27;</span>]&#125;)<span class="hljs-meta">&gt;&gt;&gt; </span>obj  key   A<span class="hljs-number">0</span>  K0  A0<span class="hljs-number">1</span>  K1  A1<span class="hljs-number">2</span>  K2  A2<span class="hljs-number">3</span>  K3  A3<span class="hljs-number">4</span>  K4  A4<span class="hljs-number">5</span>  K5  A5<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>other  key   B<span class="hljs-number">0</span>  K0  B0<span class="hljs-number">1</span>  K1  B1<span class="hljs-number">2</span>  K2  B2<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.join(other, lsuffix=<span class="hljs-string">&#x27;_1&#x27;</span>, rsuffix=<span class="hljs-string">&#x27;_2&#x27;</span>)  key_1   A key_2    B<span class="hljs-number">0</span>    K0  A0    K0   B0<span class="hljs-number">1</span>    K1  A1    K1   B1<span class="hljs-number">2</span>    K2  A2    K2   B2<span class="hljs-number">3</span>    K3  A3   NaN  NaN<span class="hljs-number">4</span>    K4  A4   NaN  NaN<span class="hljs-number">5</span>    K5  A5   NaN  NaN</code></pre><p>If the right table's index holds the values of one of the left table's columns, <code>join</code> can align the right table's index with that column through the <code>on</code> parameter:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span 
class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(&#123;<span class="hljs-string">&#x27;A&#x27;</span>: [<span class="hljs-string">&#x27;A0&#x27;</span>, <span class="hljs-string">&#x27;A1&#x27;</span>, <span class="hljs-string">&#x27;A2&#x27;</span>, <span class="hljs-string">&#x27;A3&#x27;</span>], <span class="hljs-string">&#x27;B&#x27;</span>: [<span class="hljs-string">&#x27;B0&#x27;</span>, <span class="hljs-string">&#x27;B1&#x27;</span>, <span class="hljs-string">&#x27;B2&#x27;</span>, <span class="hljs-string">&#x27;B3&#x27;</span>],<span class="hljs-string">&#x27;key&#x27;</span>: [<span class="hljs-string">&#x27;K0&#x27;</span>, <span class="hljs-string">&#x27;K1&#x27;</span>, <span class="hljs-string">&#x27;K0&#x27;</span>, <span class="hljs-string">&#x27;K1&#x27;</span>]&#125;)<span class="hljs-meta">&gt;&gt;&gt; </span>other = pd.DataFrame(&#123;<span class="hljs-string">&#x27;C&#x27;</span>: [<span class="hljs-string">&#x27;C0&#x27;</span>, <span class="hljs-string">&#x27;C1&#x27;</span>],<span class="hljs-string">&#x27;D&#x27;</span>: [<span class="hljs-string">&#x27;D0&#x27;</span>, <span class="hljs-string">&#x27;D1&#x27;</span>]&#125;,index=[<span class="hljs-string">&#x27;K0&#x27;</span>, <span class="hljs-string">&#x27;K1&#x27;</span>])<span class="hljs-meta">&gt;&gt;&gt; </span>obj    A   B key<span class="hljs-number">0</span>  A0  B0  K0<span class="hljs-number">1</span>  A1  B1  K1<span class="hljs-number">2</span>  A2  B2  K0<span class="hljs-number">3</span>  A3  B3  K1<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>other     C   DK0  C0  D0K1  C1  D1<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.join(other, on=<span class="hljs-string">&#x27;key&#x27;</span>)    A   B key   C   D<span class="hljs-number">0</span>  A0  B0  K0  C0  D0<span 
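class="hljs-number">1</span>  A1  B1  K1  C1  D1<span class="hljs-number">2</span>  A2  B2  K0  C0  D0<span class="hljs-number">3</span>  A3  B3  K1  C1  D1</code></pre><h2><span id="05x00-si-chong-fang-fa-de-qu-bie"><font color="#FF0000">【05x00】Differences Between the Four Methods</font></span></h2><ul><li><p><code>concat</code>: works on two or more Series or DataFrame objects. The <code>axis</code> parameter selects whether to combine along rows (adding rows) or along columns (adding columns); the default is row-wise (adding rows), taking the union of the labels on the other axis;</p></li><li><p><code>append</code>: appends another Series or DataFrame object to the end of a Series or DataFrame and returns a new object, leaving the original unchanged. It can only combine row-wise (adding rows). Note that <code>append</code> was deprecated in pandas 1.4 and removed in pandas 2.0, where <code>concat</code> should be used instead;</p></li><li><p><code>merge</code>: combines two DataFrame objects at a time, normally column-wise (adding columns) on key columns; joins on the index are usually done with the join method instead. The default is an inner join (intersection of keys);</p></li><li><p><code>join</code>: available only on DataFrame objects; combines them column-wise (adding columns). The default is a left join.</p></li></ul><hr><pre><code class="hljs yaml"><span class="hljs-string">This is anti-scraping filler text; readers may ignore it. </span><span class="hljs-string">This article was first published on CSDN. Author: ITBOB. </span><span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/ </span><span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106830112 </span><span class="hljs-string">Do not repost without authorization! Malicious reposting is at your own risk! Respect original work and stay away from plagiarism!</span></code></pre><hr>]]></content>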
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;Table of Contents&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-01x00-concat-font&quot;&gt;&lt;font color=&quot;#FF0000&quot;&gt;【</summary>
      
    
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    <category term="Pandas" scheme="https://www.itbob.cn/tags/Pandas/"/>
    
  </entry>
  
  <entry>
    <title>The Python Data Analysis Trio, Pandas (Part 6): GroupBy Split/Apply/Combine</title>
    <link href="https://www.itbob.cn/article/030/"/>
    <id>https://www.itbob.cn/article/030/</id>
    <published>2020-06-17T15:02:33.000Z</published>
    <updated>2022-05-22T12:41:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">Table of Contents</span></h2><!-- toc --><ul><li><a href="#font-color-ff0000-01x00-groupby-ji-zhi-font"><font color="#FF0000">【01x00】The GroupBy Mechanism</font></a></li><li><a href="#font-color-ff0000-02x00-groupby-dui-xiang-font"><font color="#FF0000">【02x00】GroupBy Objects</font></a></li><li><a href="#font-color-ff0000-03x00-groupby-split-shu-ju-fen-lie-font"><font color="#FF0000">【03x00】GroupBy Split: Splitting Data</font></a><ul><li><a href="#font-color-4876ff-03x01-fen-zu-yun-suan-font"><font color="#4876FF">【03x01】Group Operations</font></a></li><li><a href="#font-color-4876ff-03x02-an-lei-xing-an-lie-fen-zu-font"><font color="#4876FF">【03x02】Grouping by dtype and by Column</font></a></li><li><a href="#font-color-4876ff-03x03-zi-ding-yi-fen-zu-font"><font color="#4876FF">【03x03】Custom Grouping</font></a><ul><li><a href="#font-color-ffa500-03x03x01-zi-dian-fen-zu-font"><font color="#FFA500">【03x03x01】Grouping with a Dict</font></a></li><li><a href="#font-color-ffa500-03x03x02-han-shu-fen-zu-font"><font color="#FFA500">【03x03x02】Grouping with a Function</font></a></li><li><a href="#font-color-ffa500-03x03x03-suo-yin-ceng-ji-fen-zu-font"><font color="#FFA500">【03x03x03】Grouping by Index Level</font></a></li></ul></li><li><a href="#font-color-4876ff-03x04-fen-zu-die-dai-font"><font color="#4876FF">【03x04】Iterating over Groups</font></a></li><li><a href="#font-color-4876ff-03x05-dui-xiang-zhuan-huan-font"><font color="#4876FF">【03x05】Object Conversion</font></a></li></ul></li><li><a href="#font-color-ff0000-04x00-groupby-apply-shu-ju-ying-yong-font"><font color="#FF0000">【04x00】GroupBy Apply: Applying Functions</font></a><ul><li><a href="#font-color-4876ff-04x01-ju-he-han-shu-font"><font color="#4876FF">【04x01】Aggregation Functions</font></a></li><li><a href="#font-color-4876ff-04x02-zi-ding-yi-han-shu-font"><font color="#4876FF">【04x02】Custom Functions</font></a></li><li><a href="#font-color-4876ff-04x03-dui-bu-tong-lie-zuo-yong-bu-tong-han-shu-font"><font color="#4876FF">【04x03】Applying Different Functions to Different Columns</font></a></li><li><a href="#font-color-4876ff-04x04-groupby-apply-font"><font 
color="#4876FF">【04x04】GroupBy.apply()</font></a></li></ul></li></ul><!-- tocstop --><hr><p>Pandas series articles:</p><ul><li><a href="https://www.itbob.cn/article/025/">The Python Data Analysis Trio, Pandas (Part 1): Introducing Pandas and Its Series and DataFrame Objects</a></li><li><a href="https://www.itbob.cn/article/026/">The Python Data Analysis Trio, Pandas (Part 2): The Index Object and Indexing Operations</a></li><li><a href="https://www.itbob.cn/article/027/">The Python Data Analysis Trio, Pandas (Part 3): Arithmetic Operations and Handling Missing Values</a></li><li><a href="https://www.itbob.cn/article/028/">The Python Data Analysis Trio, Pandas (Part 4): Function Application, Mapping, Sorting, and Hierarchical Indexing</a></li><li><a href="https://www.itbob.cn/article/029/">The Python Data Analysis Trio, Pandas (Part 5): Statistical Computation and Descriptive Statistics</a></li><li><a href="https://www.itbob.cn/article/030/">The Python Data Analysis Trio, Pandas (Part 6): GroupBy Split, Apply, and Combine</a></li><li><a href="https://www.itbob.cn/article/031/">The Python Data Analysis Trio, Pandas (Part 7): Merging Datasets</a></li><li><a href="https://www.itbob.cn/article/032/">The Python Data Analysis Trio, Pandas (Part 8): Reshaping Data, Handling Duplicates, and Replacing Values</a></li><li><a href="https://www.itbob.cn/article/033/">The Python Data Analysis Trio, Pandas (Part 9): Time Series</a></li><li><a href="https://www.itbob.cn/article/034/">The Python Data Analysis Trio, Pandas (Part 10): Reading and Writing Data</a></li></ul><hr><p>CSDN columns:</p><ul><li>NumPy column: <a href="https://itrhx.blog.csdn.net/category_9780393.html">https://itrhx.blog.csdn.net/category_9780393.html</a></li><li>Pandas column: <a href="https://itrhx.blog.csdn.net/category_9780397.html">https://itrhx.blog.csdn.net/category_9780397.html</a></li><li>Matplotlib column: <a href="https://itrhx.blog.csdn.net/category_9780418.html">https://itrhx.blog.csdn.net/category_9780418.html</a></li></ul><br>Recommended learning resources and sites:<br><br><ul><li>NumPy Chinese documentation site: <a href="https://www.numpy.org.cn/">https://www.numpy.org.cn/</a></li><li>Pandas Chinese documentation site: <a href="https://www.pypandas.cn/">https://www.pypandas.cn/</a></li><li>Matplotlib Chinese documentation site: <a href="https://www.matplotlib.org.cn/">https://www.matplotlib.org.cn/</a></li><li>NumPy, Matplotlib, and Pandas cheat sheets: <a href="https://github.com/TRHX/Python-quick-reference-table">https://github.com/TRHX/Python-quick-reference-table</a></li></ul><hr><pre><code class="hljs yaml"><span 
class="hljs-string">This is anti-scraping filler text; readers may ignore it. </span><span class="hljs-string">This article was first published on CSDN. Author: ITBOB. </span><span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/ </span><span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106804881 </span><span class="hljs-string">Do not repost without authorization! Malicious reposting is at your own risk! Respect original work and stay away from plagiarism!</span></code></pre><hr><h2><span id="01x00-groupby-ji-zhi"><font color="#FF0000">【01x00】The GroupBy Mechanism</font></span></h2><p>Grouping a dataset and applying a function to each group, whether an aggregation or a transformation, is often a key step in data analysis. Once a dataset has been loaded, merged, and prepared, the next task is usually to compute group statistics or build pivot tables. Pandas provides a flexible and efficient GroupBy facility. Although the name "group by" is borrowed from the SQL command, the idea is better described in the terms coined by Hadley Wickham, the author of many popular R packages: split, apply, and combine.</p><p>The group operation process: Split → Apply → Combine</p><ul><li>Split: divide the data into groups according to some criterion;</li><li>Apply: apply a function to each group independently;</li><li>Combine: merge the results computed for each group into a single result.</li></ul><p>Official user guide: <a href="https://pandas.pydata.org/docs/user_guide/groupby.html">https://pandas.pydata.org/docs/user_guide/groupby.html</a></p><p><img src="https://static.wukongsec.com/itbob/images/article/030/01.png" alt="01"></p><h2><span id="02x00-groupby-dui-xiang"><font color="#FF0000">【02x00】GroupBy Objects</font></span></h2><p>GroupBy objects are most commonly created with Series.groupby and DataFrame.groupby. Their basic signatures are as follows:</p><pre><code class="hljs python">Series.groupby(self,               by=<span class="hljs-literal">None</span>,               axis=<span class="hljs-number">0</span>,               level=<span class="hljs-literal">None</span>,               as_index: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">True</span>,               sort: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">True</span>,               group_keys: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">True</span>,               squeeze: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>,               observed: <span class="hljs-built_in">bool</span> = <span 
class="hljs-literal">False</span>) → 'groupby_generic.SeriesGroupBy'</code></pre><pre><code class="hljs python">DataFrame.groupby(self,                  by=<span class="hljs-literal">None</span>,                  axis=<span class="hljs-number">0</span>,                  level=<span class="hljs-literal">None</span>,                  as_index: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">True</span>,                  sort: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">True</span>,                  group_keys: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">True</span>,                  squeeze: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>,                  observed: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>) → 'groupby_generic.DataFrameGroupBy'</code></pre><p>Official documentation:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.groupby.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.groupby.html</a></p></li><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html</a></p></li></ul><p>The commonly used parameters are explained below:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>by</td><td>A mapping, function, label, or list of labels used to determine the groups. If by is a function, it is called on each value of the object's index. <br>If a dict or Series is passed, the values of the Series or dict are used to determine the groups (the Series values are aligned first; see the .align() method).<br> If an ndarray is passed, its values are used as-is to determine the groups. A label or list of labels may be passed to group by columns of the object itself. Note that a tuple is interpreted as a (single) key</td></tr><tr><td>axis</td><td>Axis to split along, default <code>0</code>. <code>0</code> or <code>'index'</code>, <code>1</code> or <code>'columns'</code>; <code>1</code> or <code>'columns'</code> is available only for DataFrame</td></tr><tr><td>level</td><td>If the axis is a MultiIndex (hierarchical), group by a particular level, default None</td></tr><tr><td>as_index</td><td>bool, default True. For aggregated output, return an object with the group labels as its index. Only relevant for DataFrame input.<br><code>as_index=False</code> is effectively "SQL-style" grouped output</td></tr><tr><td>sort</td><td>bool, default True. Sort the group keys. Turn this off for better performance. Note: this does not affect the order of observations within each group; groupby preserves the order of rows within each group</td></tr><tr><td>group_keys</td><td>bool, default True. When calling the apply method, whether to add the group keys to the index to identify the pieces</td></tr><tr><td>squeeze</td><td>bool, default False. Reduce the dimensionality of the return type if possible; otherwise return a consistent type</td></tr></tbody></table><p>Calling groupby() performs the grouping, but the resulting GroupBy object has not carried out any computation yet; it only holds intermediate data about the groups. For example:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;key1&#x27;</span> : [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>],    <span class="hljs-string">&#x27;key2&#x27;</span> : [<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>],    <span class="hljs-string">&#x27;data1&#x27;</span>: np.random.randn(<span class="hljs-number">8</span>),    <span class="hljs-string">&#x27;data2&#x27;</span>: np.random.randn(<span class="hljs-number">8</span>)&#125;<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data)<span class="hljs-meta">&gt;&gt;&gt; </span>obj  
key1   key2     data1     data2<span class="hljs-number">0</span>    a    one -<span class="hljs-number">0.804160</span> -<span class="hljs-number">0.868905</span><span class="hljs-number">1</span>    b    one -<span class="hljs-number">0.086990</span>  <span class="hljs-number">0.325741</span><span class="hljs-number">2</span>    a    two  <span class="hljs-number">0.757992</span>  <span class="hljs-number">0.541101</span><span class="hljs-number">3</span>    b  three -<span class="hljs-number">0.281435</span>  <span class="hljs-number">0.097841</span><span class="hljs-number">4</span>    a    two  <span class="hljs-number">0.817757</span> -<span class="hljs-number">0.643699</span><span class="hljs-number">5</span>    b    two -<span class="hljs-number">0.462760</span> -<span class="hljs-number">0.321196</span><span class="hljs-number">6</span>    a    one -<span class="hljs-number">0.403699</span>  <span class="hljs-number">0.602138</span><span class="hljs-number">7</span>    a  three  <span class="hljs-number">0.883940</span> -<span class="hljs-number">0.850526</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(<span class="hljs-string">&#x27;key1&#x27;</span>)&lt;pandas.core.groupby.generic.DataFrameGroupBy <span class="hljs-built_in">object</span> at <span class="hljs-number">0x03CDB7C0</span>&gt;<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;data1&#x27;</span>].groupby(obj[<span class="hljs-string">&#x27;key1&#x27;</span>])&lt;pandas.core.groupby.generic.SeriesGroupBy <span class="hljs-built_in">object</span> at <span class="hljs-number">0x03CDB748</span>&gt;</code></pre><h2><span id="03x00-groupby-split-shu-ju-fen-lie"><font color="#FF0000">【03x00】GroupBy Split: Splitting Data</font></span></h2><h3><span id="03x01-fen-zu-yun-suan"><font color="#4876FF">【03x01】Group Operations</font></span></h3><p>The <code>groupby()</code> call above returned a GroupBy object. It has not actually computed anything yet; it only holds intermediate data about the group key <code>obj['key1']</code>. In other words, the object already has everything needed to perform operations on each group. For example, calling the GroupBy <code>mean()</code> method computes the group means, and the <code>size()</code> method returns the number of elements in each group:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;key1&#x27;</span> : [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>],    <span class="hljs-string">&#x27;key2&#x27;</span> : [<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>],    <span class="hljs-string">&#x27;data1&#x27;</span>: np.random.randn(<span class="hljs-number">8</span>),    <span class="hljs-string">&#x27;data2&#x27;</span>: np.random.randn(<span class="hljs-number">8</span>)&#125;<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data)<span class="hljs-meta">&gt;&gt;&gt; </span>obj  key1   key2     data1     data2<span class="hljs-number">0</span>    a    one -<span class="hljs-number">0.544099</span> -<span class="hljs-number">0.614079</span><span 
class="hljs-number">1</span>    b    one  <span class="hljs-number">2.193712</span>  <span class="hljs-number">0.101005</span><span class="hljs-number">2</span>    a    two -<span class="hljs-number">0.004683</span>  <span class="hljs-number">0.882770</span><span class="hljs-number">3</span>    b  three  <span class="hljs-number">0.312858</span>  <span class="hljs-number">1.732105</span><span class="hljs-number">4</span>    a    two  <span class="hljs-number">0.011089</span>  <span class="hljs-number">0.089587</span><span class="hljs-number">5</span>    b    two  <span class="hljs-number">0.292165</span>  <span class="hljs-number">1.327638</span><span class="hljs-number">6</span>    a    one -<span class="hljs-number">1.433291</span> -<span class="hljs-number">0.238971</span><span class="hljs-number">7</span>    a  three -<span class="hljs-number">0.004724</span> -<span class="hljs-number">2.117326</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>grouped1 = obj.groupby(<span class="hljs-string">&#x27;key1&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span>grouped2 = obj[<span class="hljs-string">&#x27;data1&#x27;</span>].groupby(obj[<span class="hljs-string">&#x27;key1&#x27;</span>])<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>grouped1.mean()         data1     data2key1                    a    -<span class="hljs-number">0.395142</span> -<span class="hljs-number">0.399604</span>b     <span class="hljs-number">0.932912</span>  <span class="hljs-number">1.053583</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>grouped2.mean()key1a   -<span class="hljs-number">0.395142</span>b    <span class="hljs-number">0.932912</span>Name: data1, dtype: float64&gt;&gt;&gt;<span class="hljs-meta">&gt;&gt;&gt; </span>grouped1.size()key1a    <span class="hljs-number">5</span>b    <span class="hljs-number">3</span>dtype: int64<span 
class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>grouped2.size()key1a    <span class="hljs-number">5</span>b    <span class="hljs-number">3</span>Name: data1, dtype: int64</code></pre><h3><span id="03x02-an-lei-xing-an-lie-fen-zu"><font color="#4876FF">【03x02】Grouping by dtype and by Column</font></span></h3><p>The <code>axis</code> parameter of <code>groupby()</code> defaults to 0; setting it lets you group along any other axis. Grouping by data type (dtype) is also supported:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;key1&#x27;</span> : [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>],    <span class="hljs-string">&#x27;key2&#x27;</span> : [<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>],    <span class="hljs-string">&#x27;data1&#x27;</span>: np.random.randn(<span class="hljs-number">8</span>),    <span class="hljs-string">&#x27;data2&#x27;</span>: np.random.randn(<span class="hljs-number">8</span>)&#125;<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data)<span class="hljs-meta">&gt;&gt;&gt; </span>obj 
 key1   key2     data1     data2<span class="hljs-number">0</span>    a    one -<span class="hljs-number">0.607009</span>  <span class="hljs-number">1.948301</span><span class="hljs-number">1</span>    b    one  <span class="hljs-number">0.150818</span> -<span class="hljs-number">0.025095</span><span class="hljs-number">2</span>    a    two -<span class="hljs-number">2.086024</span>  <span class="hljs-number">0.358164</span><span class="hljs-number">3</span>    b  three  <span class="hljs-number">0.446061</span>  <span class="hljs-number">1.708797</span><span class="hljs-number">4</span>    a    two  <span class="hljs-number">0.745457</span> -<span class="hljs-number">0.980948</span><span class="hljs-number">5</span>    b    two  <span class="hljs-number">0.981877</span>  <span class="hljs-number">2.159327</span><span class="hljs-number">6</span>    a    one  <span class="hljs-number">0.804480</span> -<span class="hljs-number">0.499661</span><span class="hljs-number">7</span>    a  three  <span class="hljs-number">0.112884</span>  <span class="hljs-number">0.004367</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.dtypeskey1      <span class="hljs-built_in">object</span>key2      <span class="hljs-built_in">object</span>data1    float64data2    float64dtype: <span class="hljs-built_in">object</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(obj.dtypes, axis=<span class="hljs-number">1</span>).size()float64    <span class="hljs-number">2</span><span class="hljs-built_in">object</span>     <span class="hljs-number">2</span>dtype: int64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(obj.dtypes, axis=<span class="hljs-number">1</span>).<span class="hljs-built_in">sum</span>()    float64  <span class="hljs-built_in">object</span><span class="hljs-number">0</span>  <span class="hljs-number">1.341291</span>    
aone<span class="hljs-number">1</span>  <span class="hljs-number">0.125723</span>    bone<span class="hljs-number">2</span> -<span class="hljs-number">1.727860</span>    atwo<span class="hljs-number">3</span>  <span class="hljs-number">2.154858</span>  bthree<span class="hljs-number">4</span> -<span class="hljs-number">0.235491</span>    atwo<span class="hljs-number">5</span>  <span class="hljs-number">3.141203</span>    btwo<span class="hljs-number">6</span>  <span class="hljs-number">0.304819</span>    aone<span class="hljs-number">7</span>  <span class="hljs-number">0.117251</span>  athree</code></pre><h3><span id="03x03-zi-ding-yi-fen-zu"><font color="#4876FF">【03x03】Custom Grouping</font></span></h3><p>You can pass <code>groupby()</code> a list of several arrays at once, or supply a custom set of group keys. You can also group with a dict, with a function, or by index level.</p><p>Passing a list of arrays:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;key1&#x27;</span> : [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>],    <span class="hljs-string">&#x27;key2&#x27;</span> : [<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span 
class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>],    <span class="hljs-string">&#x27;data1&#x27;</span>: np.random.randn(<span class="hljs-number">8</span>),    <span class="hljs-string">&#x27;data2&#x27;</span>: np.random.randn(<span class="hljs-number">8</span>)&#125;<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data)<span class="hljs-meta">&gt;&gt;&gt; </span>obj  key1   key2     data1     data2<span class="hljs-number">0</span>    a    one -<span class="hljs-number">0.841652</span>  <span class="hljs-number">0.688055</span><span class="hljs-number">1</span>    b    one  <span class="hljs-number">0.510042</span> -<span class="hljs-number">0.561171</span><span class="hljs-number">2</span>    a    two -<span class="hljs-number">0.418862</span> -<span class="hljs-number">0.145983</span><span class="hljs-number">3</span>    b  three -<span class="hljs-number">1.104698</span>  <span class="hljs-number">0.563158</span><span class="hljs-number">4</span>    a    two  <span class="hljs-number">0.329527</span> -<span class="hljs-number">0.893108</span><span class="hljs-number">5</span>    b    two  <span class="hljs-number">0.753653</span> -<span class="hljs-number">0.342520</span><span class="hljs-number">6</span>    a    one -<span class="hljs-number">0.882527</span> -<span class="hljs-number">1.121329</span><span class="hljs-number">7</span>    a  three  <span class="hljs-number">1.726794</span>  <span class="hljs-number">0.160244</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>means = obj[<span class="hljs-string">&#x27;data1&#x27;</span>].groupby([obj[<span class="hljs-string">&#x27;key1&#x27;</span>], obj[<span class="hljs-string">&#x27;key2&#x27;</span>]]).mean()<span class="hljs-meta">&gt;&gt;&gt; </span>meanskey1  key2 a     one     -<span class="hljs-number">0.862090</span>      three    <span class="hljs-number">1.726794</span>      two     -<span 
class="hljs-number">0.044667</span>b     one      <span class="hljs-number">0.510042</span>      three   -<span class="hljs-number">1.104698</span>      two      <span class="hljs-number">0.753653</span>Name: data1, dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>means.unstack()key2       one     three       twokey1                              a    -<span class="hljs-number">0.862090</span>  <span class="hljs-number">1.726794</span> -<span class="hljs-number">0.044667</span>b     <span class="hljs-number">0.510042</span> -<span class="hljs-number">1.104698</span>  <span class="hljs-number">0.753653</span></code></pre><p>Custom grouping keys:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(&#123;<span class="hljs-string">&#x27;key1&#x27;</span> : [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>],    <span class="hljs-string">&#x27;key2&#x27;</span> : [<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>],    <span class="hljs-string">&#x27;data1&#x27;</span> : np.random.randn(<span class="hljs-number">5</span>),    <span class="hljs-string">&#x27;data2&#x27;</span> : np.random.randn(<span class="hljs-number">5</span>)&#125;)<span class="hljs-meta">&gt;&gt;&gt; </span>obj  key1 key2     data1     data2<span class="hljs-number">0</span>    a  
one -<span class="hljs-number">0.024003</span>  <span class="hljs-number">0.350480</span><span class="hljs-number">1</span>    a  two -<span class="hljs-number">0.767534</span> -<span class="hljs-number">0.100426</span><span class="hljs-number">2</span>    b  one -<span class="hljs-number">0.594983</span> -<span class="hljs-number">1.945580</span><span class="hljs-number">3</span>    b  two -<span class="hljs-number">0.374482</span>  <span class="hljs-number">0.817592</span><span class="hljs-number">4</span>    a  one  <span class="hljs-number">0.755452</span> -<span class="hljs-number">0.137759</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>states = np.array([<span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>])<span class="hljs-meta">&gt;&gt;&gt; </span>years = np.array([<span class="hljs-number">2005</span>, <span class="hljs-number">2005</span>, <span class="hljs-number">2006</span>, <span class="hljs-number">2005</span>, <span class="hljs-number">2006</span>])<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;data1&#x27;</span>].groupby([states, years]).mean()Beijing  <span class="hljs-number">2005</span>   -<span class="hljs-number">0.767534</span>         <span class="hljs-number">2006</span>   -<span class="hljs-number">0.594983</span>Wuhan    <span class="hljs-number">2005</span>   -<span class="hljs-number">0.199242</span>         <span class="hljs-number">2006</span>    <span class="hljs-number">0.755452</span>Name: data1, dtype: float64</code></pre><h4><span id="03x03x01-zi-dian-fen-zu"><font color="#FFA500">【03x03x01】Grouping by a Dict</font></span></h4><p>Grouping with a dict:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; 
</span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(np.random.randint(<span class="hljs-number">1</span>, <span class="hljs-number">10</span>, (<span class="hljs-number">5</span>,<span class="hljs-number">5</span>)),    columns=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;e&#x27;</span>],    index=[<span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>, <span class="hljs-string">&#x27;C&#x27;</span>, <span class="hljs-string">&#x27;D&#x27;</span>, <span class="hljs-string">&#x27;E&#x27;</span>])<span class="hljs-meta">&gt;&gt;&gt; </span>obj   a  b  c  d  eA  <span class="hljs-number">1</span>  <span class="hljs-number">4</span>  <span class="hljs-number">7</span>  <span class="hljs-number">1</span>  <span class="hljs-number">9</span>B  <span class="hljs-number">8</span>  <span class="hljs-number">2</span>  <span class="hljs-number">4</span>  <span class="hljs-number">7</span>  <span class="hljs-number">8</span>C  <span class="hljs-number">9</span>  <span class="hljs-number">8</span>  <span class="hljs-number">2</span>  <span class="hljs-number">5</span>  <span class="hljs-number">1</span>D  <span class="hljs-number">2</span>  <span class="hljs-number">4</span>  <span class="hljs-number">2</span>  <span class="hljs-number">8</span>  <span class="hljs-number">3</span>E  <span class="hljs-number">7</span>  <span class="hljs-number">5</span>  <span class="hljs-number">7</span>  <span class="hljs-number">2</span>  <span class="hljs-number">3</span><span class="hljs-meta">&gt;&gt;&gt; </span><span 
class="hljs-meta">&gt;&gt;&gt; </span>obj_dict = &#123;<span class="hljs-string">&#x27;a&#x27;</span>:<span class="hljs-string">&#x27;Python&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>:<span class="hljs-string">&#x27;Python&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>:<span class="hljs-string">&#x27;Java&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>:<span class="hljs-string">&#x27;C++&#x27;</span>, <span class="hljs-string">&#x27;e&#x27;</span>:<span class="hljs-string">&#x27;Java&#x27;</span>&#125;<span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(obj_dict, axis=<span class="hljs-number">1</span>).size()C++       <span class="hljs-number">1</span>Java      <span class="hljs-number">2</span>Python    <span class="hljs-number">2</span>dtype: int64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(obj_dict, axis=<span class="hljs-number">1</span>).count()   C++  Java  PythonA    <span class="hljs-number">1</span>     <span class="hljs-number">2</span>       <span class="hljs-number">2</span>B    <span class="hljs-number">1</span>     <span class="hljs-number">2</span>       <span class="hljs-number">2</span>C    <span class="hljs-number">1</span>     <span class="hljs-number">2</span>       <span class="hljs-number">2</span>D    <span class="hljs-number">1</span>     <span class="hljs-number">2</span>       <span class="hljs-number">2</span>E    <span class="hljs-number">1</span>     <span class="hljs-number">2</span>       <span class="hljs-number">2</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(obj_dict, axis=<span class="hljs-number">1</span>).<span class="hljs-built_in">sum</span>()   C++  Java  PythonA    <span class="hljs-number">1</span>    <span class="hljs-number">16</span>       <span class="hljs-number">5</span>B    <span class="hljs-number">7</span>    <span class="hljs-number">12</span>      
<span class="hljs-number">10</span>C    <span class="hljs-number">5</span>     <span class="hljs-number">3</span>      <span class="hljs-number">17</span>D    <span class="hljs-number">8</span>     <span class="hljs-number">5</span>       <span class="hljs-number">6</span>E    <span class="hljs-number">2</span>    <span class="hljs-number">10</span>      <span class="hljs-number">12</span></code></pre><h4><span id="03x03x02-han-shu-fen-zu"><font color="#FFA500">【03x03x02】Grouping by a Function</font></span></h4><p>Grouping with a function:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(np.random.randint(<span class="hljs-number">1</span>, <span class="hljs-number">10</span>, (<span class="hljs-number">5</span>,<span class="hljs-number">5</span>)),        columns=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;e&#x27;</span>],        index=[<span class="hljs-string">&#x27;AA&#x27;</span>, <span class="hljs-string">&#x27;BBB&#x27;</span>, <span class="hljs-string">&#x27;CC&#x27;</span>, <span class="hljs-string">&#x27;D&#x27;</span>, <span class="hljs-string">&#x27;EE&#x27;</span>])<span class="hljs-meta">&gt;&gt;&gt; </span>obj     a  b  c  d  eAA   <span class="hljs-number">3</span>  <span class="hljs-number">9</span>  <span class="hljs-number">5</span>  <span class="hljs-number">8</span>  <span class="hljs-number">2</span>BBB  <span class="hljs-number">1</span>  <span class="hljs-number">4</span>  <span class="hljs-number">2</span>  <span class="hljs-number">2</span>  <span class="hljs-number">6</span>CC   <span 
class="hljs-number">9</span>  <span class="hljs-number">2</span>  <span class="hljs-number">4</span>  <span class="hljs-number">7</span>  <span class="hljs-number">6</span>D    <span class="hljs-number">2</span>  <span class="hljs-number">5</span>  <span class="hljs-number">5</span>  <span class="hljs-number">7</span>  <span class="hljs-number">1</span>EE   <span class="hljs-number">8</span>  <span class="hljs-number">8</span>  <span class="hljs-number">8</span>  <span class="hljs-number">2</span>  <span class="hljs-number">2</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">group_key</span>(<span class="hljs-params">idx</span>):</span>        <span class="hljs-string">&quot;&quot;&quot;</span><span class="hljs-string">        idx is a column label or a row label</span><span class="hljs-string">    &quot;&quot;&quot;</span>        <span class="hljs-keyword">return</span> <span class="hljs-built_in">len</span>(idx)<span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(group_key).size()    <span class="hljs-comment"># equivalent to obj.groupby(len).size()</span><span class="hljs-number">1</span>    <span class="hljs-number">1</span><span class="hljs-number">2</span>    <span class="hljs-number">3</span><span class="hljs-number">3</span>    <span class="hljs-number">1</span>dtype: int64</code></pre><h4><span id="03x03x03-suo-yin-ceng-ji-fen-zu"><font color="#FFA500">【03x03x03】Grouping by Index Level</font></span></h4><p>Grouping by different index levels:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>columns = pd.MultiIndex.from_arrays([[<span class="hljs-string">&#x27;Python&#x27;</span>, <span 
class="hljs-string">&#x27;Java&#x27;</span>, <span class="hljs-string">&#x27;Python&#x27;</span>, <span class="hljs-string">&#x27;Java&#x27;</span>, <span class="hljs-string">&#x27;Python&#x27;</span>],    [<span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>, <span class="hljs-string">&#x27;C&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>]], names=[<span class="hljs-string">&#x27;language&#x27;</span>, <span class="hljs-string">&#x27;index&#x27;</span>])<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(np.random.randint(<span class="hljs-number">1</span>, <span class="hljs-number">10</span>, (<span class="hljs-number">5</span>, <span class="hljs-number">5</span>)), columns=columns)<span class="hljs-meta">&gt;&gt;&gt; </span>objlanguage Python Java Python Java Pythonindex         A    A      B    C      B<span class="hljs-number">0</span>             <span class="hljs-number">7</span>    <span class="hljs-number">1</span>      <span class="hljs-number">9</span>    <span class="hljs-number">8</span>      <span class="hljs-number">5</span><span class="hljs-number">1</span>             <span class="hljs-number">4</span>    <span class="hljs-number">5</span>      <span class="hljs-number">4</span>    <span class="hljs-number">5</span>      <span class="hljs-number">6</span><span class="hljs-number">2</span>             <span class="hljs-number">4</span>    <span class="hljs-number">3</span>      <span class="hljs-number">1</span>    <span class="hljs-number">9</span>      <span class="hljs-number">5</span><span class="hljs-number">3</span>             <span class="hljs-number">6</span>    <span class="hljs-number">6</span>      <span class="hljs-number">3</span>    <span class="hljs-number">8</span>      <span class="hljs-number">1</span><span class="hljs-number">4</span>             <span class="hljs-number">7</span>    <span 
class="hljs-number">9</span>      <span class="hljs-number">2</span>    <span class="hljs-number">8</span>      <span class="hljs-number">2</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(level=<span class="hljs-string">&#x27;language&#x27;</span>, axis=<span class="hljs-number">1</span>).<span class="hljs-built_in">sum</span>()language  Java  Python<span class="hljs-number">0</span>            <span class="hljs-number">9</span>      <span class="hljs-number">21</span><span class="hljs-number">1</span>           <span class="hljs-number">10</span>      <span class="hljs-number">14</span><span class="hljs-number">2</span>           <span class="hljs-number">12</span>      <span class="hljs-number">10</span><span class="hljs-number">3</span>           <span class="hljs-number">14</span>      <span class="hljs-number">10</span><span class="hljs-number">4</span>           <span class="hljs-number">17</span>      <span class="hljs-number">11</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(level=<span class="hljs-string">&#x27;index&#x27;</span>, axis=<span class="hljs-number">1</span>).<span class="hljs-built_in">sum</span>()index   A   B  C<span class="hljs-number">0</span>       <span class="hljs-number">8</span>  <span class="hljs-number">14</span>  <span class="hljs-number">8</span><span class="hljs-number">1</span>       <span class="hljs-number">9</span>  <span class="hljs-number">10</span>  <span class="hljs-number">5</span><span class="hljs-number">2</span>       <span class="hljs-number">7</span>   <span class="hljs-number">6</span>  <span class="hljs-number">9</span><span class="hljs-number">3</span>      <span class="hljs-number">12</span>   <span class="hljs-number">4</span>  <span class="hljs-number">8</span><span class="hljs-number">4</span>      <span class="hljs-number">16</span>   <span class="hljs-number">4</span>  <span 
class="hljs-number">8</span></code></pre><h3><span id="03x04-fen-zu-die-dai"><font color="#4876FF">【03x04】Iterating over Groups</font></span></h3><p>A GroupBy object supports iteration. For grouping by a single key, it yields 2-tuples, each made up of the group name and the corresponding chunk of data:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;key1&#x27;</span> : [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>],    <span class="hljs-string">&#x27;key2&#x27;</span> : [<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>],    <span class="hljs-string">&#x27;data1&#x27;</span>: np.random.randn(<span class="hljs-number">8</span>),    <span class="hljs-string">&#x27;data2&#x27;</span>: np.random.randn(<span class="hljs-number">8</span>)&#125;<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data)<span class="hljs-meta">&gt;&gt;&gt; </span>obj  key1   key2     data1     data2<span class="hljs-number">0</span>    a    one -<span class="hljs-number">1.088762</span>  <span class="hljs-number">0.668504</span><span class="hljs-number">1</span>    b    one  <span 
class="hljs-number">0.275500</span>  <span class="hljs-number">0.787844</span><span class="hljs-number">2</span>    a    two -<span class="hljs-number">0.108417</span> -<span class="hljs-number">0.491296</span><span class="hljs-number">3</span>    b  three  <span class="hljs-number">0.019524</span> -<span class="hljs-number">0.363390</span><span class="hljs-number">4</span>    a    two  <span class="hljs-number">0.453612</span>  <span class="hljs-number">0.796999</span><span class="hljs-number">5</span>    b    two  <span class="hljs-number">1.982858</span>  <span class="hljs-number">1.501877</span><span class="hljs-number">6</span>    a    one  <span class="hljs-number">1.101132</span> -<span class="hljs-number">1.928362</span><span class="hljs-number">7</span>    a  three  <span class="hljs-number">0.524775</span> -<span class="hljs-number">1.205842</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">for</span> group_name, group_data <span class="hljs-keyword">in</span> obj.groupby(<span class="hljs-string">&#x27;key1&#x27;</span>):    <span class="hljs-built_in">print</span>(group_name)    <span class="hljs-built_in">print</span>(group_data)    a  key1   key2     data1     data2<span class="hljs-number">0</span>    a    one -<span class="hljs-number">1.088762</span>  <span class="hljs-number">0.668504</span><span class="hljs-number">2</span>    a    two -<span class="hljs-number">0.108417</span> -<span class="hljs-number">0.491296</span><span class="hljs-number">4</span>    a    two  <span class="hljs-number">0.453612</span>  <span class="hljs-number">0.796999</span><span class="hljs-number">6</span>    a    one  <span class="hljs-number">1.101132</span> -<span class="hljs-number">1.928362</span><span class="hljs-number">7</span>    a  three  <span class="hljs-number">0.524775</span> -<span class="hljs-number">1.205842</span>b  key1   key2     data1     data2<span class="hljs-number">1</span>   
 b    one  <span class="hljs-number">0.275500</span>  <span class="hljs-number">0.787844</span><span class="hljs-number">3</span>    b  three  <span class="hljs-number">0.019524</span> -<span class="hljs-number">0.363390</span><span class="hljs-number">5</span>    b    two  <span class="hljs-number">1.982858</span>  <span class="hljs-number">1.501877</span></code></pre><p>For grouping by multiple keys, the first element of each tuple is itself a tuple of the key values, and the second element is the chunk of data:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;key1&#x27;</span> : [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>],    <span class="hljs-string">&#x27;key2&#x27;</span> : [<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>],    <span class="hljs-string">&#x27;data1&#x27;</span>: np.random.randn(<span class="hljs-number">8</span>),    <span class="hljs-string">&#x27;data2&#x27;</span>: np.random.randn(<span class="hljs-number">8</span>)&#125;<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data)<span class="hljs-meta">&gt;&gt;&gt; </span>obj 
 key1   key2     data1     data2<span class="hljs-number">0</span>    a    one -<span class="hljs-number">1.088762</span>  <span class="hljs-number">0.668504</span><span class="hljs-number">1</span>    b    one  <span class="hljs-number">0.275500</span>  <span class="hljs-number">0.787844</span><span class="hljs-number">2</span>    a    two -<span class="hljs-number">0.108417</span> -<span class="hljs-number">0.491296</span><span class="hljs-number">3</span>    b  three  <span class="hljs-number">0.019524</span> -<span class="hljs-number">0.363390</span><span class="hljs-number">4</span>    a    two  <span class="hljs-number">0.453612</span>  <span class="hljs-number">0.796999</span><span class="hljs-number">5</span>    b    two  <span class="hljs-number">1.982858</span>  <span class="hljs-number">1.501877</span><span class="hljs-number">6</span>    a    one  <span class="hljs-number">1.101132</span> -<span class="hljs-number">1.928362</span><span class="hljs-number">7</span>    a  three  <span class="hljs-number">0.524775</span> -<span class="hljs-number">1.205842</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">for</span> group_name, group_data <span class="hljs-keyword">in</span> obj.groupby([<span class="hljs-string">&#x27;key1&#x27;</span>, <span class="hljs-string">&#x27;key2&#x27;</span>]):    <span class="hljs-built_in">print</span>(group_name)    <span class="hljs-built_in">print</span>(group_data)    (<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>)  key1 key2     data1     data2<span class="hljs-number">0</span>    a  one -<span class="hljs-number">1.088762</span>  <span class="hljs-number">0.668504</span><span class="hljs-number">6</span>    a  one  <span class="hljs-number">1.101132</span> -<span class="hljs-number">1.928362</span>(<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>)  
key1   key2     data1     data2<span class="hljs-number">7</span>    a  three  <span class="hljs-number">0.524775</span> -<span class="hljs-number">1.205842</span>(<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>)  key1 key2     data1     data2<span class="hljs-number">2</span>    a  two -<span class="hljs-number">0.108417</span> -<span class="hljs-number">0.491296</span><span class="hljs-number">4</span>    a  two  <span class="hljs-number">0.453612</span>  <span class="hljs-number">0.796999</span>(<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>)  key1 key2   data1     data2<span class="hljs-number">1</span>    b  one  <span class="hljs-number">0.2755</span>  <span class="hljs-number">0.787844</span>(<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>)  key1   key2     data1    data2<span class="hljs-number">3</span>    b  three  <span class="hljs-number">0.019524</span> -<span class="hljs-number">0.36339</span>(<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>)  key1 key2     data1     data2<span class="hljs-number">5</span>    b  two  <span class="hljs-number">1.982858</span>  <span class="hljs-number">1.501877</span></code></pre><h3><span id="03x05-dui-xiang-zhuan-huan"><font color="#4876FF">【03x05】Converting GroupBy Objects</font></span></h3><p>A GroupBy object can be converted to a list or a dict:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;key1&#x27;</span> : [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span 
class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>],    <span class="hljs-string">&#x27;key2&#x27;</span> : [<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>],    <span class="hljs-string">&#x27;data1&#x27;</span>: np.random.randn(<span class="hljs-number">8</span>),    <span class="hljs-string">&#x27;data2&#x27;</span>: np.random.randn(<span class="hljs-number">8</span>)&#125;<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data)<span class="hljs-meta">&gt;&gt;&gt; </span>obj  key1   key2     data1     data2<span class="hljs-number">0</span>    a    one -<span class="hljs-number">0.607009</span>  <span class="hljs-number">1.948301</span><span class="hljs-number">1</span>    b    one  <span class="hljs-number">0.150818</span> -<span class="hljs-number">0.025095</span><span class="hljs-number">2</span>    a    two -<span class="hljs-number">2.086024</span>  <span class="hljs-number">0.358164</span><span class="hljs-number">3</span>    b  three  <span class="hljs-number">0.446061</span>  <span class="hljs-number">1.708797</span><span class="hljs-number">4</span>    a    two  <span class="hljs-number">0.745457</span> -<span class="hljs-number">0.980948</span><span class="hljs-number">5</span>    b    two  <span class="hljs-number">0.981877</span>  <span class="hljs-number">2.159327</span><span class="hljs-number">6</span>    a    one  <span class="hljs-number">0.804480</span> 
-<span class="hljs-number">0.499661</span><span class="hljs-number">7</span>    a  three  <span class="hljs-number">0.112884</span>  <span class="hljs-number">0.004367</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>grouped = obj.groupby(<span class="hljs-string">&#x27;key1&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-built_in">list</span>(grouped)[(<span class="hljs-string">&#x27;a&#x27;</span>,   key1   key2     data1     data2<span class="hljs-number">0</span>    a    one -<span class="hljs-number">0.607009</span>  <span class="hljs-number">1.948301</span><span class="hljs-number">2</span>    a    two -<span class="hljs-number">2.086024</span>  <span class="hljs-number">0.358164</span><span class="hljs-number">4</span>    a    two  <span class="hljs-number">0.745457</span> -<span class="hljs-number">0.980948</span><span class="hljs-number">6</span>    a    one  <span class="hljs-number">0.804480</span> -<span class="hljs-number">0.499661</span><span class="hljs-number">7</span>    a  three  <span class="hljs-number">0.112884</span>  <span class="hljs-number">0.004367</span>),(<span class="hljs-string">&#x27;b&#x27;</span>,   key1   key2     data1     data2<span class="hljs-number">1</span>    b    one  <span class="hljs-number">0.150818</span> -<span class="hljs-number">0.025095</span><span class="hljs-number">3</span>    b  three  <span class="hljs-number">0.446061</span>  <span class="hljs-number">1.708797</span><span class="hljs-number">5</span>    b    two  <span class="hljs-number">0.981877</span>  <span class="hljs-number">2.159327</span>)]&gt;&gt;&gt;<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-built_in">dict</span>(<span class="hljs-built_in">list</span>(grouped))&#123;<span class="hljs-string">&#x27;a&#x27;</span>:   key1   key2     data1     data2<span class="hljs-number">0</span>    a    one -<span class="hljs-number">0.607009</span>  <span 
class="hljs-number">1.948301</span><span class="hljs-number">2</span>    a    two -<span class="hljs-number">2.086024</span>  <span class="hljs-number">0.358164</span><span class="hljs-number">4</span>    a    two  <span class="hljs-number">0.745457</span> -<span class="hljs-number">0.980948</span><span class="hljs-number">6</span>    a    one  <span class="hljs-number">0.804480</span> -<span class="hljs-number">0.499661</span><span class="hljs-number">7</span>    a  three  <span class="hljs-number">0.112884</span>  <span class="hljs-number">0.004367</span>,<span class="hljs-string">&#x27;b&#x27;</span>:   key1   key2     data1     data2<span class="hljs-number">1</span>    b    one  <span class="hljs-number">0.150818</span> -<span class="hljs-number">0.025095</span><span class="hljs-number">3</span>    b  three  <span class="hljs-number">0.446061</span>  <span class="hljs-number">1.708797</span><span class="hljs-number">5</span>    b    two  <span class="hljs-number">0.981877</span>  <span class="hljs-number">2.159327</span>&#125;</code></pre><h2><span id="04x00-groupby-apply-shu-ju-ying-yong"><font color="#FF0000">【04x00】GroupBy Apply: Applying Operations to Data</font></span></h2><p>Aggregation is any data transformation that produces a scalar value from an array. It is commonly used to compute statistics on grouped data.</p><h3><span id="04x01-ju-he-han-shu"><font color="#4876FF">【04x01】Aggregation Functions</font></span></h3><p>The earlier examples already used several built-in aggregation functions, such as mean, count, min, and sum. Common aggregation operations are listed in the table below:</p><p>Official documentation: <a 
href="https://pandas.pydata.org/docs/reference/groupby.html">https://pandas.pydata.org/docs/reference/groupby.html</a></p><table><thead><tr><th>方法</th><th>描述</th></tr></thead><tbody><tr><td>count</td><td>非NA值的数量</td></tr><tr><td>describe</td><td>针对Series或各DataFrame列计算汇总统计</td></tr><tr><td>min</td><td>计算最小值</td></tr><tr><td>max</td><td>计算最大值</td></tr><tr><td>argmin</td><td>计算能够获取到最小值的索引位置（整数）</td></tr><tr><td>argmax</td><td>计算能够获取到最大值的索引位置（整数）</td></tr><tr><td>idxmin</td><td>计算能够获取到最小值的索引值</td></tr><tr><td>idxmax</td><td>计算能够获取到最大值的索引值</td></tr><tr><td>quantile</td><td>计算样本的分位数（0到1）</td></tr><tr><td>sum</td><td>值的总和</td></tr><tr><td>mean</td><td>值的平均数</td></tr><tr><td>median</td><td>值的算术中位数（50%分位数）</td></tr><tr><td>mad</td><td>根据平均值计算平均绝对离差</td></tr><tr><td>var</td><td>样本值的方差</td></tr><tr><td>std</td><td>样本值的标准差</td></tr></tbody></table><p>应用示例：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = &#123;<span class="hljs-string">&#x27;key1&#x27;</span> : [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>],    <span class="hljs-string">&#x27;key2&#x27;</span> : [<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span 
class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>],    <span class="hljs-string">&#x27;data1&#x27;</span>: np.random.randint(<span class="hljs-number">1</span>,<span class="hljs-number">10</span>, <span class="hljs-number">8</span>),    <span class="hljs-string">&#x27;data2&#x27;</span>: np.random.randint(<span class="hljs-number">1</span>,<span class="hljs-number">10</span>, <span class="hljs-number">8</span>)&#125;<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(obj)<span class="hljs-meta">&gt;&gt;&gt; </span>obj  key1   key2  data1  data2<span class="hljs-number">0</span>    a    one      <span class="hljs-number">9</span>      <span class="hljs-number">7</span><span class="hljs-number">1</span>    b    one      <span class="hljs-number">5</span>      <span class="hljs-number">9</span><span class="hljs-number">2</span>    a    two      <span class="hljs-number">2</span>      <span class="hljs-number">4</span><span class="hljs-number">3</span>    b  three      <span class="hljs-number">3</span>      <span class="hljs-number">4</span><span class="hljs-number">4</span>    a    two      <span class="hljs-number">5</span>      <span class="hljs-number">1</span><span class="hljs-number">5</span>    b    two      <span class="hljs-number">5</span>      <span class="hljs-number">9</span><span class="hljs-number">6</span>    a    one      <span class="hljs-number">1</span>      <span class="hljs-number">8</span><span class="hljs-number">7</span>    a  three      <span class="hljs-number">2</span>      <span class="hljs-number">4</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(<span class="hljs-string">&#x27;key1&#x27;</span>).<span class="hljs-built_in">sum</span>()      data1  data2key1              a        <span class="hljs-number">19</span>     <span class="hljs-number">24</span>b        <span 
class="hljs-number">13</span>     <span class="hljs-number">22</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(<span class="hljs-string">&#x27;key1&#x27;</span>).<span class="hljs-built_in">max</span>()     key2  data1  data2key1                   a     two      <span class="hljs-number">9</span>      <span class="hljs-number">8</span>b     two      <span class="hljs-number">5</span>      <span class="hljs-number">9</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(<span class="hljs-string">&#x27;key1&#x27;</span>).<span class="hljs-built_in">min</span>()     key2  data1  data2key1                   a     one      <span class="hljs-number">1</span>      <span class="hljs-number">1</span>b     one      <span class="hljs-number">3</span>      <span class="hljs-number">4</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(<span class="hljs-string">&#x27;key1&#x27;</span>).mean()         data1     data2key1                    a     <span class="hljs-number">3.800000</span>  <span class="hljs-number">4.800000</span>b     <span class="hljs-number">4.333333</span>  <span class="hljs-number">7.333333</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(<span class="hljs-string">&#x27;key1&#x27;</span>).size()key1a    <span class="hljs-number">5</span>b    <span class="hljs-number">3</span>dtype: int64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(<span class="hljs-string">&#x27;key1&#x27;</span>).count()      key2  data1  data2key1                    a        <span class="hljs-number">5</span>      <span class="hljs-number">5</span>      <span class="hljs-number">5</span>b        <span class="hljs-number">3</span>      <span class="hljs-number">3</span>      <span class="hljs-number">3</span><span 
class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(<span class="hljs-string">&#x27;key1&#x27;</span>).describe()     data1                                ... data2                         count      mean       std  <span class="hljs-built_in">min</span>  <span class="hljs-number">25</span>%  ...   <span class="hljs-built_in">min</span>  <span class="hljs-number">25</span>%  <span class="hljs-number">50</span>%  <span class="hljs-number">75</span>%  <span class="hljs-built_in">max</span>key1                                      ...                          a      <span class="hljs-number">5.0</span>  <span class="hljs-number">3.800000</span>  <span class="hljs-number">3.271085</span>  <span class="hljs-number">1.0</span>  <span class="hljs-number">2.0</span>  ...   <span class="hljs-number">1.0</span>  <span class="hljs-number">4.0</span>  <span class="hljs-number">4.0</span>  <span class="hljs-number">7.0</span>  <span class="hljs-number">8.0</span>b      <span class="hljs-number">3.0</span>  <span class="hljs-number">4.333333</span>  <span class="hljs-number">1.154701</span>  <span class="hljs-number">3.0</span>  <span class="hljs-number">4.0</span>  ...   
<span class="hljs-number">4.0</span>  <span class="hljs-number">6.5</span>  <span class="hljs-number">9.0</span>  <span class="hljs-number">9.0</span>  <span class="hljs-number">9.0</span>
[<span class="hljs-number">2</span> rows x <span class="hljs-number">16</span> columns]</code></pre><h3><span id="04x02-zi-ding-yi-han-shu"><font color="#4876FF">【04x02】Custom Functions</font></span></h3><p>If the built-in functions do not meet your needs, you can define your own aggregation function and pass it to <code>GroupBy.agg(func)</code> or <code>GroupBy.aggregate(func)</code>. <code>func</code> is called with the records belonging to each group key and must return a single aggregated value for them.</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = &#123;<span class="hljs-string">&#x27;key1&#x27;</span> : [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>],
    <span class="hljs-string">&#x27;key2&#x27;</span> : [<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>],
    <span class="hljs-string">&#x27;data1&#x27;</span>: np.random.randint(<span class="hljs-number">1</span>,<span class="hljs-number">10</span>, <span class="hljs-number">8</span>),
    <span 
class="hljs-string">&#x27;data2&#x27;</span>: np.random.randint(<span class="hljs-number">1</span>,<span class="hljs-number">10</span>, <span class="hljs-number">8</span>)&#125;<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(obj)<span class="hljs-meta">&gt;&gt;&gt; </span>obj  key1   key2  data1  data2<span class="hljs-number">0</span>    a    one      <span class="hljs-number">9</span>      <span class="hljs-number">7</span><span class="hljs-number">1</span>    b    one      <span class="hljs-number">5</span>      <span class="hljs-number">9</span><span class="hljs-number">2</span>    a    two      <span class="hljs-number">2</span>      <span class="hljs-number">4</span><span class="hljs-number">3</span>    b  three      <span class="hljs-number">3</span>      <span class="hljs-number">4</span><span class="hljs-number">4</span>    a    two      <span class="hljs-number">5</span>      <span class="hljs-number">1</span><span class="hljs-number">5</span>    b    two      <span class="hljs-number">5</span>      <span class="hljs-number">9</span><span class="hljs-number">6</span>    a    one      <span class="hljs-number">1</span>      <span class="hljs-number">8</span><span class="hljs-number">7</span>    a  three      <span class="hljs-number">2</span>      <span class="hljs-number">4</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">peak_range</span>(<span class="hljs-params">df</span>):</span>    <span class="hljs-keyword">return</span> df.<span class="hljs-built_in">max</span>() - df.<span class="hljs-built_in">min</span>()<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(<span class="hljs-string">&#x27;key1&#x27;</span>).agg(peak_range)      data1  data2key1              a         <span class="hljs-number">8</span>      <span class="hljs-number">7</span>b         
<span class="hljs-number">2</span>      <span class="hljs-number">5</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(<span class="hljs-string">&#x27;key1&#x27;</span>).agg(<span class="hljs-keyword">lambda</span> df : df.<span class="hljs-built_in">max</span>() - df.<span class="hljs-built_in">min</span>())      data1  data2key1              a         <span class="hljs-number">8</span>      <span class="hljs-number">7</span>b         <span class="hljs-number">2</span>      <span class="hljs-number">5</span></code></pre><h3><span id="04x03-dui-bu-tong-lie-zuo-yong-bu-tong-han-shu"><font color="#4876FF">【04x03】对不同列作用不同函数</font></span></h3><p>使用字典可以对不同列作用不同的聚合函数：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = &#123;<span class="hljs-string">&#x27;key1&#x27;</span> : [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>],    <span class="hljs-string">&#x27;key2&#x27;</span> : [<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;three&#x27;</span>],    <span 
class="hljs-string">&#x27;data1&#x27;</span>: np.random.randint(<span class="hljs-number">1</span>,<span class="hljs-number">10</span>, <span class="hljs-number">8</span>),    <span class="hljs-string">&#x27;data2&#x27;</span>: np.random.randint(<span class="hljs-number">1</span>,<span class="hljs-number">10</span>, <span class="hljs-number">8</span>)&#125;<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(obj)<span class="hljs-meta">&gt;&gt;&gt; </span>obj  key1   key2  data1  data2<span class="hljs-number">0</span>    a    one      <span class="hljs-number">9</span>      <span class="hljs-number">7</span><span class="hljs-number">1</span>    b    one      <span class="hljs-number">5</span>      <span class="hljs-number">9</span><span class="hljs-number">2</span>    a    two      <span class="hljs-number">2</span>      <span class="hljs-number">4</span><span class="hljs-number">3</span>    b  three      <span class="hljs-number">3</span>      <span class="hljs-number">4</span><span class="hljs-number">4</span>    a    two      <span class="hljs-number">5</span>      <span class="hljs-number">1</span><span class="hljs-number">5</span>    b    two      <span class="hljs-number">5</span>      <span class="hljs-number">9</span><span class="hljs-number">6</span>    a    one      <span class="hljs-number">1</span>      <span class="hljs-number">8</span><span class="hljs-number">7</span>    a  three      <span class="hljs-number">2</span>      <span class="hljs-number">4</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>dict1 = &#123;<span class="hljs-string">&#x27;data1&#x27;</span>:<span class="hljs-string">&#x27;mean&#x27;</span>, <span class="hljs-string">&#x27;data2&#x27;</span>:<span class="hljs-string">&#x27;sum&#x27;</span>&#125;<span class="hljs-meta">&gt;&gt;&gt; </span>dict2 = &#123;<span class="hljs-string">&#x27;data1&#x27;</span>:[<span class="hljs-string">&#x27;mean&#x27;</span>,<span 
class="hljs-string">&#x27;max&#x27;</span>], <span class="hljs-string">&#x27;data2&#x27;</span>:<span class="hljs-string">&#x27;sum&#x27;</span>&#125;<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(<span class="hljs-string">&#x27;key1&#x27;</span>).agg(dict1)         data1  data2key1                 a     <span class="hljs-number">3.800000</span>     <span class="hljs-number">24</span>b     <span class="hljs-number">4.333333</span>     <span class="hljs-number">22</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.groupby(<span class="hljs-string">&#x27;key1&#x27;</span>).agg(dict2)         data1     data2          mean <span class="hljs-built_in">max</span>   <span class="hljs-built_in">sum</span>key1                    a     <span class="hljs-number">3.800000</span>   <span class="hljs-number">9</span>    <span class="hljs-number">24</span>b     <span class="hljs-number">4.333333</span>   <span class="hljs-number">5</span>    <span class="hljs-number">22</span></code></pre><h3><span id="04x04-groupby-apply"><font color="#4876FF">【04x04】GroupBy.apply()</font></span></h3><p><code>apply()</code> 方法会将待处理的对象拆分成多个片段，然后对各片段调用传入的函数，最后尝试将各片段组合到一起。</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(&#123;<span class="hljs-string">&#x27;A&#x27;</span>:[<span class="hljs-string">&#x27;bob&#x27;</span>,<span class="hljs-string">&#x27;sos&#x27;</span>,<span class="hljs-string">&#x27;bob&#x27;</span>,<span class="hljs-string">&#x27;sos&#x27;</span>,<span class="hljs-string">&#x27;bob&#x27;</span>,<span class="hljs-string">&#x27;sos&#x27;</span>,<span class="hljs-string">&#x27;bob&#x27;</span>,<span class="hljs-string">&#x27;bob&#x27;</span>],              <span 
class="hljs-string">&#x27;B&#x27;</span>:[<span class="hljs-string">&#x27;one&#x27;</span>,<span class="hljs-string">&#x27;one&#x27;</span>,<span class="hljs-string">&#x27;two&#x27;</span>,<span class="hljs-string">&#x27;three&#x27;</span>,<span class="hljs-string">&#x27;two&#x27;</span>,<span class="hljs-string">&#x27;two&#x27;</span>,<span class="hljs-string">&#x27;one&#x27;</span>,<span class="hljs-string">&#x27;three&#x27;</span>],              <span class="hljs-string">&#x27;C&#x27;</span>:[<span class="hljs-number">3</span>,<span class="hljs-number">1</span>,<span class="hljs-number">4</span>,<span class="hljs-number">1</span>,<span class="hljs-number">5</span>,<span class="hljs-number">9</span>,<span class="hljs-number">2</span>,<span class="hljs-number">6</span>],              <span class="hljs-string">&#x27;D&#x27;</span>:[<span class="hljs-number">1</span>,<span class="hljs-number">2</span>,<span class="hljs-number">3</span>,<span class="hljs-number">4</span>,<span class="hljs-number">5</span>,<span class="hljs-number">6</span>,<span class="hljs-number">7</span>,<span class="hljs-number">8</span>]&#125;)<span class="hljs-meta">&gt;&gt;&gt; </span>obj     A      B  C  D<span class="hljs-number">0</span>  bob    one  <span class="hljs-number">3</span>  <span class="hljs-number">1</span><span class="hljs-number">1</span>  sos    one  <span class="hljs-number">1</span>  <span class="hljs-number">2</span><span class="hljs-number">2</span>  bob    two  <span class="hljs-number">4</span>  <span class="hljs-number">3</span><span class="hljs-number">3</span>  sos  three  <span class="hljs-number">1</span>  <span class="hljs-number">4</span><span class="hljs-number">4</span>  bob    two  <span class="hljs-number">5</span>  <span class="hljs-number">5</span><span class="hljs-number">5</span>  sos    two  <span class="hljs-number">9</span>  <span class="hljs-number">6</span><span class="hljs-number">6</span>  bob    one  <span class="hljs-number">2</span>  <span 
class="hljs-number">7</span><span class="hljs-number">7</span>  bob  three  <span class="hljs-number">6</span>  <span class="hljs-number">8</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>grouped = obj.groupby(<span class="hljs-string">&#x27;A&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">for</span> name, group <span class="hljs-keyword">in</span> grouped:    <span class="hljs-built_in">print</span>(name)    <span class="hljs-built_in">print</span>(group)    bob     A      B  C  D<span class="hljs-number">0</span>  bob    one  <span class="hljs-number">3</span>  <span class="hljs-number">1</span><span class="hljs-number">2</span>  bob    two  <span class="hljs-number">4</span>  <span class="hljs-number">3</span><span class="hljs-number">4</span>  bob    two  <span class="hljs-number">5</span>  <span class="hljs-number">5</span><span class="hljs-number">6</span>  bob    one  <span class="hljs-number">2</span>  <span class="hljs-number">7</span><span class="hljs-number">7</span>  bob  three  <span class="hljs-number">6</span>  <span class="hljs-number">8</span>sos     A      B  C  D<span class="hljs-number">1</span>  sos    one  <span class="hljs-number">1</span>  <span class="hljs-number">2</span><span class="hljs-number">3</span>  sos  three  <span class="hljs-number">1</span>  <span class="hljs-number">4</span><span class="hljs-number">5</span>  sos    two  <span class="hljs-number">9</span>  <span class="hljs-number">6</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>grouped.apply(<span class="hljs-keyword">lambda</span> x:x.describe())  <span class="hljs-comment"># 对 bob 和 sos 两组数据使用 describe 方法</span>                  C         DA                            bob count  <span class="hljs-number">5.000000</span>  <span class="hljs-number">5.000000</span>    mean   <span class="hljs-number">4.000000</span>  <span 
class="hljs-number">4.800000</span>
    std    <span class="hljs-number">1.581139</span>  <span class="hljs-number">2.863564</span>
    <span class="hljs-built_in">min</span>    <span class="hljs-number">2.000000</span>  <span class="hljs-number">1.000000</span>
    <span class="hljs-number">25</span>%    <span class="hljs-number">3.000000</span>  <span class="hljs-number">3.000000</span>
    <span class="hljs-number">50</span>%    <span class="hljs-number">4.000000</span>  <span class="hljs-number">5.000000</span>
    <span class="hljs-number">75</span>%    <span class="hljs-number">5.000000</span>  <span class="hljs-number">7.000000</span>
    <span class="hljs-built_in">max</span>    <span class="hljs-number">6.000000</span>  <span class="hljs-number">8.000000</span>
sos count  <span class="hljs-number">3.000000</span>  <span class="hljs-number">3.000000</span>
    mean   <span class="hljs-number">3.666667</span>  <span class="hljs-number">4.000000</span>
    std    <span class="hljs-number">4.618802</span>  <span class="hljs-number">2.000000</span>
    <span class="hljs-built_in">min</span>    <span class="hljs-number">1.000000</span>  <span class="hljs-number">2.000000</span>
    <span class="hljs-number">25</span>%    <span class="hljs-number">1.000000</span>  <span class="hljs-number">3.000000</span>
    <span class="hljs-number">50</span>%    <span class="hljs-number">1.000000</span>  <span class="hljs-number">4.000000</span>
    <span class="hljs-number">75</span>%    <span class="hljs-number">5.000000</span>  <span class="hljs-number">5.000000</span>
    <span class="hljs-built_in">max</span>    <span class="hljs-number">9.000000</span>  <span class="hljs-number">6.000000</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>grouped.apply(<span class="hljs-keyword">lambda</span> x:x.<span class="hljs-built_in">min</span>())  <span class="hljs-comment"># apply min to both the bob and sos groups</span>
       A    B  C  D
A                  
bob  bob  one  <span 
class="hljs-number">2</span>  <span class="hljs-number">1</span>sos  sos  one  <span class="hljs-number">1</span>  <span class="hljs-number">2</span></code></pre><hr><pre><code class="hljs yaml"><span class="hljs-string">这里是一段物理防爬虫文本，请读者忽略。</span><span class="hljs-string">本文原创首发于</span> <span class="hljs-string">CSDN，作者</span> <span class="hljs-string">ITBOB。</span><span class="hljs-string">博客首页：https://itrhx.blog.csdn.net/</span><span class="hljs-string">本文链接：https://itrhx.blog.csdn.net/article/details/106804881</span><span class="hljs-string">未经授权，禁止转载！恶意转载，后果自负！尊重原创，远离剽窃！</span></code></pre><hr>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;文章目录&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-01x00-groupby-ji-zhi-font&quot;&gt;&lt;font color=&quot;#F</summary>
      
    
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    <category term="Pandas" scheme="https://www.itbob.cn/tags/Pandas/"/>
    
  </entry>
  
  <entry>
    <title>Python 数据分析三剑客之 Pandas（五）：统计计算与统计描述</title>
    <link href="https://www.itbob.cn/article/029/"/>
    <id>https://www.itbob.cn/article/029/</id>
    <published>2020-06-16T13:15:28.000Z</published>
    <updated>2022-05-22T12:40:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">文章目录</span></h2><!-- toc --><ul><li><a href="#font-color-ff0000-01x00-tong-ji-ji-suan-font"><font color="#FF0000">【01x00】统计计算</font></a><ul><li><a href="#font-color-4876ff-01x01-sum-qiu-he-font"><font color="#4876FF">【01x01】sum() 求和</font></a></li><li><a href="#font-color-4876ff-01x02-min-zui-xiao-zhi-font"><font color="#4876FF">【01x02】min() 最小值</font></a></li><li><a href="#font-color-4876ff-01x03-max-zui-da-zhi-font"><font color="#4876FF">【01x03】max() 最大值</font></a></li><li><a href="#font-color-4876ff-01x04-mean-ping-jun-zhi-font"><font color="#4876FF">【01x04】mean() 平均值</font></a></li><li><a href="#font-color-4876ff-01x05-idxmin-zui-xiao-zhi-suo-yin-font"><font color="#4876FF">【01x05】idxmin() 最小值索引</font></a></li><li><a href="#font-color-4876ff-01x06-idxmax-zui-da-zhi-suo-yin-font"><font color="#4876FF">【01x06】idxmax() 最大值索引</font></a></li></ul></li><li><a href="#font-color-ff0000-02x00-tong-ji-miao-shu-font"><font color="#FF0000">【02x00】统计描述</font></a></li><li><a href="#font-color-ff0000-03x00-chang-yong-tong-ji-fang-fa-font"><font color="#FF0000">【03x00】常用统计方法</font></a></li></ul><!-- tocstop --><hr><p>Pandas 系列文章：</p><ul><li><a href="https://www.itbob.cn/article/025/">Python 数据分析三剑客之 Pandas（一）：认识 Pandas 及其 Series、DataFrame 对象</a></li><li><a href="https://www.itbob.cn/article/026/">Python 数据分析三剑客之 Pandas（二）：Index 索引对象以及各种索引操作</a></li><li><a href="https://www.itbob.cn/article/027/">Python 数据分析三剑客之 Pandas（三）：算术运算与缺失值的处理</a></li><li><a href="https://www.itbob.cn/article/028/">Python 数据分析三剑客之 Pandas（四）：函数应用、映射、排序和层级索引</a></li><li><a href="https://www.itbob.cn/article/029/">Python 数据分析三剑客之 Pandas（五）：统计计算与统计描述</a></li><li><a href="https://www.itbob.cn/article/030/">Python 数据分析三剑客之 Pandas（六）：GroupBy 数据分裂、应用与合并</a></li><li><a href="https://www.itbob.cn/article/031/">Python 数据分析三剑客之 Pandas（七）：合并数据集</a></li><li><a href="https://www.itbob.cn/article/032/">Python 数据分析三剑客之 
Pandas（八）：数据重塑、重复数据处理与数据替换</a></li><li><a href="https://www.itbob.cn/article/033/">Python 数据分析三剑客之 Pandas（九）：时间序列</a></li><li><a href="https://www.itbob.cn/article/034/">Python 数据分析三剑客之 Pandas（十）：数据读写</a></li></ul><hr><p>Columns:</p><ul><li>NumPy column: <a href="https://itrhx.blog.csdn.net/category_9780393.html">https://itrhx.blog.csdn.net/category_9780393.html</a></li><li>Pandas column: <a href="https://itrhx.blog.csdn.net/category_9780397.html">https://itrhx.blog.csdn.net/category_9780397.html</a></li><li>Matplotlib column: <a href="https://itrhx.blog.csdn.net/category_9780418.html">https://itrhx.blog.csdn.net/category_9780418.html</a></li></ul><br>Recommended learning resources and websites:<br><br><ul><li>NumPy Chinese documentation site: <a href="https://www.numpy.org.cn/">https://www.numpy.org.cn/</a></li><li>Pandas Chinese documentation site: <a href="https://www.pypandas.cn/">https://www.pypandas.cn/</a></li><li>Matplotlib Chinese documentation site: <a href="https://www.matplotlib.org.cn/">https://www.matplotlib.org.cn/</a></li><li>NumPy, Matplotlib and Pandas cheat sheets: <a href="https://github.com/TRHX/Python-quick-reference-table">https://github.com/TRHX/Python-quick-reference-table</a></li></ul><hr><pre><code class="hljs yaml"><span class="hljs-string">This is a block of anti-scraping text; readers may ignore it.</span>
<span class="hljs-string">This article was first published on</span> <span class="hljs-string">CSDN by</span> <span class="hljs-string">ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106788501</span>
<span class="hljs-string">Reproduction without permission is prohibited! Malicious reposting is at your own risk! Respect original work and stay away from plagiarism!</span></code></pre><hr><h2><span id="01x00-tong-ji-ji-suan"><font color="#FF0000">【01x00】Statistical Computation</font></span></h2><p>Pandas objects come with a set of common mathematical and statistical methods. Most of them are reductions or summary statistics: they extract a single value (such as the sum or mean) from a Series, or a Series of values from the rows or columns of a DataFrame. Unlike the corresponding NumPy array methods, they have built-in handling for missing data: NaN values are skipped by default.</p><h3><span id="01x01-sum-qiu-he"><font color="#4876FF">【01x01】sum(): Sum</font></span></h3><p>The <code>sum()</code> method returns the sum over the specified axis, analogous to <code>numpy.sum()</code>.</p><p>The basic syntax for Series and DataFrame is:</p><ul><li><p><code>Series.sum(self, 
axis=None, skipna=None, level=None, numeric_only=None, min_count=0, **kwargs)</code></p></li><li><p><code>DataFrame.sum(self, axis=None, skipna=None, level=None, numeric_only=None, min_count=0, **kwargs)</code></p></li></ul><p>Official documentation:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.sum.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.sum.html</a></p></li><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sum.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sum.html</a></p></li></ul><p>Commonly used parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>axis</td><td>Axis to sum along: <code>0</code> or <code>'index'</code>, <code>1</code> or <code>'columns'</code>; <code>1</code> or <code>'columns'</code> is available only for a DataFrame</td></tr><tr><td>skipna</td><td>bool; whether to exclude missing values (NA/null) when summing. Defaults to True</td></tr><tr><td>level</td><td>If the axis is a MultiIndex (hierarchical), sum along the given level. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0; on those versions use <code>obj.groupby(level=...).sum()</code> instead</td></tr></tbody></table><p>Usage with a Series:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>idx = pd.MultiIndex.from_arrays([
    [<span class="hljs-string">&#x27;warm&#x27;</span>, <span class="hljs-string">&#x27;warm&#x27;</span>, <span class="hljs-string">&#x27;cold&#x27;</span>, <span class="hljs-string">&#x27;cold&#x27;</span>],
    [<span class="hljs-string">&#x27;dog&#x27;</span>, <span class="hljs-string">&#x27;falcon&#x27;</span>, <span class="hljs-string">&#x27;fish&#x27;</span>, <span class="hljs-string">&#x27;spider&#x27;</span>]],
    names=[<span class="hljs-string">&#x27;blooded&#x27;</span>, <span class="hljs-string">&#x27;animal&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">4</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">8</span>], name=<span 
class="hljs-string">&#x27;legs&#x27;</span>, index=idx)<span class="hljs-meta">&gt;&gt;&gt; </span>objblooded  animalwarm     dog       <span class="hljs-number">4</span>         falcon    <span class="hljs-number">2</span>cold     fish      <span class="hljs-number">0</span>         spider    <span class="hljs-number">8</span>Name: legs, dtype: int64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">sum</span>()<span class="hljs-number">14</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">sum</span>(level=<span class="hljs-string">&#x27;blooded&#x27;</span>)bloodedwarm    <span class="hljs-number">6</span>cold    <span class="hljs-number">8</span>Name: legs, dtype: int64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">sum</span>(level=<span class="hljs-number">0</span>)bloodedwarm    <span class="hljs-number">6</span>cold    <span class="hljs-number">8</span>Name: legs, dtype: int64</code></pre><p>在 DataFrame 中的应用：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1.4</span>, np.nan], [<span class="hljs-number">7.1</span>, -<span class="hljs-number">4.5</span>],    [np.nan, np.nan], [<span class="hljs-number">0.75</span>, -<span class="hljs-number">1.3</span>]],    index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>],    columns=[<span class="hljs-string">&#x27;one&#x27;</span>, 
<span class="hljs-string">&#x27;two&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
    one  two
a  <span class="hljs-number">1.40</span>  NaN
b  <span class="hljs-number">7.10</span> -<span class="hljs-number">4.5</span>
c   NaN  NaN
d  <span class="hljs-number">0.75</span> -<span class="hljs-number">1.3</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">sum</span>()
one    <span class="hljs-number">9.25</span>
two   -<span class="hljs-number">5.80</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">sum</span>(axis=<span class="hljs-number">1</span>)
a    <span class="hljs-number">1.40</span>
b    <span class="hljs-number">2.60</span>
c    <span class="hljs-number">0.00</span>
d   -<span class="hljs-number">0.55</span>
dtype: float64</code></pre><h3><span id="01x02-min-zui-xiao-zhi"><font color="#4876FF">【01x02】min() Minimum</font></span></h3><p><code>min()</code> returns the minimum value along the specified axis.</p><p>The basic syntax for Series and DataFrame is:</p><ul><li><p><code>Series.min(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)</code></p></li><li><p><code>DataFrame.min(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)</code></p></li></ul><p>Official documentation:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.min.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.min.html</a></p></li><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.min.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.min.html</a></p></li></ul><p>Commonly used parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>axis</td><td>Axis along which to find the minimum: <code>0</code> or <code>'index'</code>, <code>1</code> or <code>'columns'</code>; only a DataFrame supports <code>1</code> or <code>'columns'</code></td></tr><tr><td>skipna</td><td>bool; whether to exclude missing values (NA/null) when finding the minimum; default 
True</td></tr><tr><td>level</td><td>If the axis is a MultiIndex (hierarchical), find the minimum along the specified level (the <code>level</code> parameter was deprecated in pandas 1.3 and removed in 2.0)</td></tr></tbody></table><p>Applied to a Series:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>idx = pd.MultiIndex.from_arrays([
    [<span class="hljs-string">&#x27;warm&#x27;</span>, <span class="hljs-string">&#x27;warm&#x27;</span>, <span class="hljs-string">&#x27;cold&#x27;</span>, <span class="hljs-string">&#x27;cold&#x27;</span>],
    [<span class="hljs-string">&#x27;dog&#x27;</span>, <span class="hljs-string">&#x27;falcon&#x27;</span>, <span class="hljs-string">&#x27;fish&#x27;</span>, <span class="hljs-string">&#x27;spider&#x27;</span>]],
    names=[<span class="hljs-string">&#x27;blooded&#x27;</span>, <span class="hljs-string">&#x27;animal&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">4</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">8</span>], name=<span class="hljs-string">&#x27;legs&#x27;</span>, index=idx)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
blooded  animal
warm     dog       <span class="hljs-number">4</span>
         falcon    <span class="hljs-number">2</span>
cold     fish      <span class="hljs-number">0</span>
         spider    <span class="hljs-number">8</span>
Name: legs, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">min</span>()
<span class="hljs-number">0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">min</span>(level=<span class="hljs-string">&#x27;blooded&#x27;</span>)
blooded
warm    <span class="hljs-number">2</span>
cold    <span class="hljs-number">0</span>
Name: legs, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span 
class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">min</span>(level=<span class="hljs-number">0</span>)
blooded
warm    <span class="hljs-number">2</span>
cold    <span class="hljs-number">0</span>
Name: legs, dtype: int64</code></pre><p>Applied to a DataFrame:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1.4</span>, np.nan], [<span class="hljs-number">7.1</span>, -<span class="hljs-number">4.5</span>],
    [np.nan, np.nan], [<span class="hljs-number">0.75</span>, -<span class="hljs-number">1.3</span>]],
    index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], columns=[<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
    one  two
a  <span class="hljs-number">1.40</span>  NaN
b  <span class="hljs-number">7.10</span> -<span class="hljs-number">4.5</span>
c   NaN  NaN
d  <span class="hljs-number">0.75</span> -<span class="hljs-number">1.3</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">min</span>()
one    <span class="hljs-number">0.75</span>
two   -<span class="hljs-number">4.50</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">min</span>(axis=<span class="hljs-number">1</span>)
a    <span class="hljs-number">1.4</span>
b   -<span class="hljs-number">4.5</span>
c    NaN
d   -<span class="hljs-number">1.3</span>
dtype: float64
<span 
class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">min</span>(axis=<span class="hljs-string">&#x27;columns&#x27;</span>, skipna=<span class="hljs-literal">False</span>)
a    NaN
b   -<span class="hljs-number">4.5</span>
c    NaN
d   -<span class="hljs-number">1.3</span>
dtype: float64</code></pre><h3><span id="01x03-max-zui-da-zhi"><font color="#4876FF">【01x03】max() Maximum</font></span></h3><p><code>max()</code> returns the maximum value along the specified axis.</p><p>The basic syntax for Series and DataFrame is:</p><ul><li><p><code>Series.max(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)</code></p></li><li><p><code>DataFrame.max(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)</code></p></li></ul><p>Official documentation:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.max.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.max.html</a></p></li><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.max.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.max.html</a></p></li></ul><p>Commonly used parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>axis</td><td>Axis along which to find the maximum: <code>0</code> or <code>'index'</code>, <code>1</code> or <code>'columns'</code>; only a DataFrame supports <code>1</code> or <code>'columns'</code></td></tr><tr><td>skipna</td><td>bool; whether to exclude missing values (NA/null) when finding the maximum; default True</td></tr><tr><td>level</td><td>If the axis is a MultiIndex (hierarchical), find the maximum along the specified level (the <code>level</code> parameter was deprecated in pandas 1.3 and removed in 2.0)</td></tr></tbody></table><p>Applied to a Series:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>idx = pd.MultiIndex.from_arrays([
    [<span class="hljs-string">&#x27;warm&#x27;</span>, <span class="hljs-string">&#x27;warm&#x27;</span>, <span class="hljs-string">&#x27;cold&#x27;</span>, <span class="hljs-string">&#x27;cold&#x27;</span>],
    [<span 
class="hljs-string">&#x27;dog&#x27;</span>, <span class="hljs-string">&#x27;falcon&#x27;</span>, <span class="hljs-string">&#x27;fish&#x27;</span>, <span class="hljs-string">&#x27;spider&#x27;</span>]],
    names=[<span class="hljs-string">&#x27;blooded&#x27;</span>, <span class="hljs-string">&#x27;animal&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">4</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">8</span>], name=<span class="hljs-string">&#x27;legs&#x27;</span>, index=idx)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
blooded  animal
warm     dog       <span class="hljs-number">4</span>
         falcon    <span class="hljs-number">2</span>
cold     fish      <span class="hljs-number">0</span>
         spider    <span class="hljs-number">8</span>
Name: legs, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">max</span>()
<span class="hljs-number">8</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">max</span>(level=<span class="hljs-string">&#x27;blooded&#x27;</span>)
blooded
warm    <span class="hljs-number">4</span>
cold    <span class="hljs-number">8</span>
Name: legs, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">max</span>(level=<span class="hljs-number">0</span>)
blooded
warm    <span class="hljs-number">4</span>
cold    <span class="hljs-number">8</span>
Name: legs, dtype: int64</code></pre><p>Applied to a DataFrame:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span 
class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1.4</span>, np.nan], [<span class="hljs-number">7.1</span>, -<span class="hljs-number">4.5</span>],
    [np.nan, np.nan], [<span class="hljs-number">0.75</span>, -<span class="hljs-number">1.3</span>]],
    index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], columns=[<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
    one  two
a  <span class="hljs-number">1.40</span>  NaN
b  <span class="hljs-number">7.10</span> -<span class="hljs-number">4.5</span>
c   NaN  NaN
d  <span class="hljs-number">0.75</span> -<span class="hljs-number">1.3</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">max</span>()
one    <span class="hljs-number">7.1</span>
two   -<span class="hljs-number">1.3</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">max</span>(axis=<span class="hljs-number">1</span>)
a    <span class="hljs-number">1.40</span>
b    <span class="hljs-number">7.10</span>
c     NaN
d    <span class="hljs-number">0.75</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.<span class="hljs-built_in">max</span>(axis=<span class="hljs-string">&#x27;columns&#x27;</span>, skipna=<span class="hljs-literal">False</span>)
a     NaN
b    <span class="hljs-number">7.10</span>
c     NaN
d    <span class="hljs-number">0.75</span>
dtype: float64</code></pre><h3><span id="01x04-mean-ping-jun-zhi"><font color="#4876FF">【01x04】mean() Mean</font></span></h3><p><code>mean()</code> returns the mean of the values along the specified axis.</p><p>The basic syntax for Series and DataFrame is:</p><ul><li><p><code>Series.mean(self, 
axis=None, skipna=None, level=None, numeric_only=None, **kwargs)</code></p></li><li><p><code>DataFrame.mean(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)</code></p></li></ul><p>Official documentation:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.mean.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.mean.html</a></p></li><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.mean.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.mean.html</a></p></li></ul><p>Commonly used parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>axis</td><td>Axis along which to compute the mean: <code>0</code> or <code>'index'</code>, <code>1</code> or <code>'columns'</code>; only a DataFrame supports <code>1</code> or <code>'columns'</code></td></tr><tr><td>skipna</td><td>bool; whether to exclude missing values (NA/null) when computing the mean; default True</td></tr><tr><td>level</td><td>If the axis is a MultiIndex (hierarchical), compute the mean along the specified level (the <code>level</code> parameter was deprecated in pandas 1.3 and removed in 2.0)</td></tr></tbody></table><p>Applied to a Series:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>idx = pd.MultiIndex.from_arrays([
    [<span class="hljs-string">&#x27;warm&#x27;</span>, <span class="hljs-string">&#x27;warm&#x27;</span>, <span class="hljs-string">&#x27;cold&#x27;</span>, <span class="hljs-string">&#x27;cold&#x27;</span>],
    [<span class="hljs-string">&#x27;dog&#x27;</span>, <span class="hljs-string">&#x27;falcon&#x27;</span>, <span class="hljs-string">&#x27;fish&#x27;</span>, <span class="hljs-string">&#x27;spider&#x27;</span>]],
    names=[<span class="hljs-string">&#x27;blooded&#x27;</span>, <span class="hljs-string">&#x27;animal&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">4</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">8</span>], name=<span 
class="hljs-string">&#x27;legs&#x27;</span>, index=idx)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
blooded  animal
warm     dog       <span class="hljs-number">4</span>
         falcon    <span class="hljs-number">2</span>
cold     fish      <span class="hljs-number">0</span>
         spider    <span class="hljs-number">8</span>
Name: legs, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.mean()
<span class="hljs-number">3.5</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.mean(level=<span class="hljs-string">&#x27;blooded&#x27;</span>)
blooded
warm    <span class="hljs-number">3</span>
cold    <span class="hljs-number">4</span>
Name: legs, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.mean(level=<span class="hljs-number">0</span>)
blooded
warm    <span class="hljs-number">3</span>
cold    <span class="hljs-number">4</span>
Name: legs, dtype: int64</code></pre><p>Applied to a DataFrame:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1.4</span>, np.nan], [<span class="hljs-number">7.1</span>, -<span class="hljs-number">4.5</span>],
    [np.nan, np.nan], [<span class="hljs-number">0.75</span>, -<span class="hljs-number">1.3</span>]],
    index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], columns=[<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
    one  
two
a  <span class="hljs-number">1.40</span>  NaN
b  <span class="hljs-number">7.10</span> -<span class="hljs-number">4.5</span>
c   NaN  NaN
d  <span class="hljs-number">0.75</span> -<span class="hljs-number">1.3</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.mean()
one    <span class="hljs-number">3.083333</span>
two   -<span class="hljs-number">2.900000</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.mean(axis=<span class="hljs-number">1</span>)
a    <span class="hljs-number">1.400</span>
b    <span class="hljs-number">1.300</span>
c      NaN
d   -<span class="hljs-number">0.275</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.mean(axis=<span class="hljs-string">&#x27;columns&#x27;</span>, skipna=<span class="hljs-literal">False</span>)
a      NaN
b    <span class="hljs-number">1.300</span>
c      NaN
d   -<span class="hljs-number">0.275</span>
dtype: float64</code></pre><h3><span id="01x05-idxmin-zui-xiao-zhi-suo-yin"><font color="#4876FF">【01x05】idxmin() Index of the Minimum</font></span></h3><p><code>idxmin()</code> returns the index label of the minimum value.</p><p>The basic syntax for Series and DataFrame is:</p><ul><li><p><code>Series.idxmin(self, axis=0, skipna=True, *args, **kwargs)</code></p></li><li><p><code>DataFrame.idxmin(self, axis=0, skipna=True)</code></p></li></ul><p>Official documentation:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.idxmin.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.idxmin.html</a></p></li><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.idxmin.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.idxmin.html</a></p></li></ul><p>Commonly used parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>axis</td><td>Axis to use: <code>0</code> or <code>'index'</code>, <code>1</code> or <code>'columns'</code>; only a DataFrame supports <code>1</code> or 
<code>'columns'</code></td></tr><tr><td>skipna</td><td>bool; whether to exclude missing values (NA/null); default True</td></tr></tbody></table><p>Applied to a Series:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>idx = pd.MultiIndex.from_arrays([
    [<span class="hljs-string">&#x27;warm&#x27;</span>, <span class="hljs-string">&#x27;warm&#x27;</span>, <span class="hljs-string">&#x27;cold&#x27;</span>, <span class="hljs-string">&#x27;cold&#x27;</span>],
    [<span class="hljs-string">&#x27;dog&#x27;</span>, <span class="hljs-string">&#x27;falcon&#x27;</span>, <span class="hljs-string">&#x27;fish&#x27;</span>, <span class="hljs-string">&#x27;spider&#x27;</span>]],
    names=[<span class="hljs-string">&#x27;blooded&#x27;</span>, <span class="hljs-string">&#x27;animal&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">4</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">8</span>], name=<span class="hljs-string">&#x27;legs&#x27;</span>, index=idx)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
blooded  animal
warm     dog       <span class="hljs-number">4</span>
         falcon    <span class="hljs-number">2</span>
cold     fish      <span class="hljs-number">0</span>
         spider    <span class="hljs-number">8</span>
Name: legs, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.idxmin()
(<span class="hljs-string">&#x27;cold&#x27;</span>, <span class="hljs-string">&#x27;fish&#x27;</span>)</code></pre><p>Applied to a DataFrame:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span 
class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1.4</span>, np.nan], [<span class="hljs-number">7.1</span>, -<span class="hljs-number">4.5</span>],
    [np.nan, np.nan], [<span class="hljs-number">0.75</span>, -<span class="hljs-number">1.3</span>]],
    index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], columns=[<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
    one  two
a  <span class="hljs-number">1.40</span>  NaN
b  <span class="hljs-number">7.10</span> -<span class="hljs-number">4.5</span>
c   NaN  NaN
d  <span class="hljs-number">0.75</span> -<span class="hljs-number">1.3</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.idxmin()
one    d
two    b
dtype: <span class="hljs-built_in">object</span></code></pre><h3><span id="01x06-idxmax-zui-da-zhi-suo-yin"><font color="#4876FF">【01x06】idxmax() Index of the Maximum</font></span></h3><p><code>idxmax()</code> returns the index label of the maximum value.</p><p>The basic syntax for Series and DataFrame is:</p><ul><li><p><code>Series.idxmax(self, axis=0, skipna=True, *args, **kwargs)</code></p></li><li><p><code>DataFrame.idxmax(self, axis=0, skipna=True)</code></p></li></ul><p>Official documentation:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.idxmax.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.idxmax.html</a></p></li><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.idxmax.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.idxmax.html</a></p></li></ul><p>Commonly used parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>axis</td><td>Axis to use: <code>0</code> or <code>'index'</code>, <code>1</code> or <code>'columns'</code>; only a 
DataFrame supports <code>1</code> or <code>'columns'</code></td></tr><tr><td>skipna</td><td>bool; whether to exclude missing values (NA/null); default True</td></tr></tbody></table><p>Applied to a Series:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>idx = pd.MultiIndex.from_arrays([
    [<span class="hljs-string">&#x27;warm&#x27;</span>, <span class="hljs-string">&#x27;warm&#x27;</span>, <span class="hljs-string">&#x27;cold&#x27;</span>, <span class="hljs-string">&#x27;cold&#x27;</span>],
    [<span class="hljs-string">&#x27;dog&#x27;</span>, <span class="hljs-string">&#x27;falcon&#x27;</span>, <span class="hljs-string">&#x27;fish&#x27;</span>, <span class="hljs-string">&#x27;spider&#x27;</span>]],
    names=[<span class="hljs-string">&#x27;blooded&#x27;</span>, <span class="hljs-string">&#x27;animal&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">4</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">8</span>], name=<span class="hljs-string">&#x27;legs&#x27;</span>, index=idx)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
blooded  animal
warm     dog       <span class="hljs-number">4</span>
         falcon    <span class="hljs-number">2</span>
cold     fish      <span class="hljs-number">0</span>
         spider    <span class="hljs-number">8</span>
Name: legs, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.idxmax()
(<span class="hljs-string">&#x27;cold&#x27;</span>, <span class="hljs-string">&#x27;spider&#x27;</span>)</code></pre><p>Applied to a DataFrame:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span 
class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1.4</span>, np.nan], [<span class="hljs-number">7.1</span>, -<span class="hljs-number">4.5</span>],
    [np.nan, np.nan], [<span class="hljs-number">0.75</span>, -<span class="hljs-number">1.3</span>]],
    index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], columns=[<span class="hljs-string">&#x27;one&#x27;</span>, <span class="hljs-string">&#x27;two&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
    one  two
a  <span class="hljs-number">1.40</span>  NaN
b  <span class="hljs-number">7.10</span> -<span class="hljs-number">4.5</span>
c   NaN  NaN
d  <span class="hljs-number">0.75</span> -<span class="hljs-number">1.3</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.idxmax()
one    b
two    d
dtype: <span class="hljs-built_in">object</span></code></pre><h2><span id="02x00-tong-ji-miao-shu"><font color="#FF0000">【02x00】Descriptive Statistics</font></span></h2><p><code>describe()</code> quickly produces summary statistics: count, mean, standard deviation, minimum and maximum, quartiles, and so on. Its parameters control what is included in or excluded from the summary.</p><p>The basic syntax for Series and DataFrame is:</p><ul><li><p><code>Series.describe(self: ~ FrameOrSeries, percentiles=None, include=None, exclude=None)</code></p></li><li><p><code>DataFrame.describe(self: ~ FrameOrSeries, percentiles=None, include=None, exclude=None)</code></p></li></ul><p>Official documentation:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.describe.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.describe.html</a></p></li><li><p><a 
href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.describe.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.describe.html</a></p></li></ul><p>Commonly used parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>percentiles</td><td>List of numbers, optional; the percentiles to include in the output. All values should be between 0 and 1. The default is [.25, .5, .75], which returns the 25th, 50th, and 75th percentiles</td></tr><tr><td>include</td><td>Data types to include in the result; a list of dtypes, default None. See the official documentation for the accepted values</td></tr><tr><td>exclude</td><td>Data types to omit from the result; a list of dtypes, default None. See the official documentation for the accepted values</td></tr></tbody></table><p>Describing a numeric Series:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
<span class="hljs-number">0</span>    <span class="hljs-number">1</span>
<span class="hljs-number">1</span>    <span class="hljs-number">2</span>
<span class="hljs-number">2</span>    <span class="hljs-number">3</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.describe()
count    <span class="hljs-number">3.0</span>
mean     <span class="hljs-number">2.0</span>
std      <span class="hljs-number">1.0</span>
<span class="hljs-built_in">min</span>      <span class="hljs-number">1.0</span>
<span class="hljs-number">25</span>%      <span class="hljs-number">1.5</span>
<span class="hljs-number">50</span>%      <span class="hljs-number">2.0</span>
<span class="hljs-number">75</span>%      <span class="hljs-number">2.5</span>
<span class="hljs-built_in">max</span>      <span class="hljs-number">3.0</span>
dtype: float64</code></pre><p>Describing a categorical (object) Series:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span 
class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
<span class="hljs-number">0</span>    a
<span class="hljs-number">1</span>    a
<span class="hljs-number">2</span>    b
<span class="hljs-number">3</span>    c
dtype: <span class="hljs-built_in">object</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.describe()
count     <span class="hljs-number">4</span>
unique    <span class="hljs-number">3</span>
top       a
freq      <span class="hljs-number">2</span>
dtype: <span class="hljs-built_in">object</span></code></pre><p>Describing timestamps:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([
    np.datetime64(<span class="hljs-string">&quot;2000-01-01&quot;</span>),
    np.datetime64(<span class="hljs-string">&quot;2010-01-01&quot;</span>),
    np.datetime64(<span class="hljs-string">&quot;2010-01-01&quot;</span>)
    ])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
<span class="hljs-number">0</span>   <span class="hljs-number">2000</span>-01-01
<span class="hljs-number">1</span>   <span class="hljs-number">2010</span>-01-01
<span class="hljs-number">2</span>   <span class="hljs-number">2010</span>-01-01
dtype: datetime64[ns]
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.describe()
count                       <span class="hljs-number">3</span>
unique                      <span class="hljs-number">2</span>
top       <span class="hljs-number">2010</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>
freq                        <span 
class="hljs-number">2</span>
first     <span class="hljs-number">2000</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>
last      <span class="hljs-number">2010</span>-01-01 <span class="hljs-number">00</span>:<span class="hljs-number">00</span>:<span class="hljs-number">00</span>
dtype: <span class="hljs-built_in">object</span></code></pre><p>Describing a DataFrame:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(&#123;<span class="hljs-string">&#x27;categorical&#x27;</span>: pd.Categorical([<span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;e&#x27;</span>, <span class="hljs-string">&#x27;f&#x27;</span>]), <span class="hljs-string">&#x27;numeric&#x27;</span>: [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>], <span class="hljs-string">&#x27;object&#x27;</span>: [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>]&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
  categorical  numeric <span class="hljs-built_in">object</span>
<span class="hljs-number">0</span>           d        <span class="hljs-number">1</span>      a
<span class="hljs-number">1</span>           e        <span class="hljs-number">2</span>      b
<span class="hljs-number">2</span>           f        <span class="hljs-number">3</span>      c
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.describe()
       numeric
count      <span class="hljs-number">3.0</span>
mean       <span class="hljs-number">2.0</span>
std        <span class="hljs-number">1.0</span>
<span class="hljs-built_in">min</span>        <span class="hljs-number">1.0</span>
<span 
class="hljs-number">25</span>%        <span class="hljs-number">1.5</span>
<span class="hljs-number">50</span>%        <span class="hljs-number">2.0</span>
<span class="hljs-number">75</span>%        <span class="hljs-number">2.5</span>
<span class="hljs-built_in">max</span>        <span class="hljs-number">3.0</span></code></pre><p>Describing all columns regardless of data type:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(&#123;<span class="hljs-string">&#x27;categorical&#x27;</span>: pd.Categorical([<span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;e&#x27;</span>, <span class="hljs-string">&#x27;f&#x27;</span>]), <span class="hljs-string">&#x27;numeric&#x27;</span>: [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>], <span class="hljs-string">&#x27;object&#x27;</span>: [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>]&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
  categorical  numeric <span class="hljs-built_in">object</span>
<span class="hljs-number">0</span>           d        <span class="hljs-number">1</span>      a
<span class="hljs-number">1</span>           e        <span class="hljs-number">2</span>      b
<span class="hljs-number">2</span>           f        <span class="hljs-number">3</span>      c
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.describe(include=<span class="hljs-string">&#x27;all&#x27;</span>)
       categorical  numeric <span class="hljs-built_in">object</span>
count            <span class="hljs-number">3</span>      <span class="hljs-number">3.0</span>      <span class="hljs-number">3</span>
unique           <span class="hljs-number">3</span>      NaN      <span 
class="hljs-number">3</span>
top              f      NaN      c
freq             <span class="hljs-number">1</span>      NaN      <span class="hljs-number">1</span>
mean           NaN      <span class="hljs-number">2.0</span>    NaN
std            NaN      <span class="hljs-number">1.0</span>    NaN
<span class="hljs-built_in">min</span>            NaN      <span class="hljs-number">1.0</span>    NaN
<span class="hljs-number">25</span>%            NaN      <span class="hljs-number">1.5</span>    NaN
<span class="hljs-number">50</span>%            NaN      <span class="hljs-number">2.0</span>    NaN
<span class="hljs-number">75</span>%            NaN      <span class="hljs-number">2.5</span>    NaN
<span class="hljs-built_in">max</span>            NaN      <span class="hljs-number">3.0</span>    NaN</code></pre><p>Including only the category columns:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(&#123;<span class="hljs-string">&#x27;categorical&#x27;</span>: pd.Categorical([<span class="hljs-string">&#x27;d&#x27;</span>,<span class="hljs-string">&#x27;e&#x27;</span>,<span class="hljs-string">&#x27;f&#x27;</span>]), <span class="hljs-string">&#x27;numeric&#x27;</span>: [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>], <span class="hljs-string">&#x27;object&#x27;</span>: [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>]&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
  categorical  numeric <span class="hljs-built_in">object</span>
<span class="hljs-number">0</span>           d        <span class="hljs-number">1</span>      a
<span class="hljs-number">1</span>           e        <span class="hljs-number">2</span>      b
<span class="hljs-number">2</span>           f        <span class="hljs-number">3</span>      c
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.describe(include=[<span class="hljs-string">&#x27;category&#x27;</span>])
       categorical
count            <span class="hljs-number">3</span>
unique           <span class="hljs-number">3</span>
top              f
freq             <span class="hljs-number">1</span></code></pre><h2><span id="03x00-chang-yong-tong-ji-fang-fa"><font color="#FF0000">【03x00】Common Statistical Methods</font></span></h2><p>Other commonly used statistical methods are listed in the table below:</p><table><thead><tr><th>Method</th><th>Description</th><th>Official documentation</th></tr></thead><tbody>
<tr><td>count</td><td>Number of non-NA values</td><td><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.count.html">Series</a>丨<a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.count.html">DataFrame</a></td></tr>
<tr><td>describe</td><td>Summary statistics for a Series or for each DataFrame column</td><td><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.describe.html">Series</a>丨<a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.describe.html">DataFrame</a></td></tr>
<tr><td>min</td><td>Minimum value</td><td><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.min.html">Series</a>丨<a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.min.html">DataFrame</a></td></tr>
<tr><td>max</td><td>Maximum value</td><td><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.max.html">Series</a>丨<a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.max.html">DataFrame</a></td></tr>
<tr><td>argmin</td><td>Integer position of the minimum value</td><td><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.argmin.html">Series</a></td></tr>
<tr><td>argmax</td><td>Integer position of the maximum value</td><td><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.argmax.html">Series</a></td></tr>
<tr><td>idxmin</td><td>Index label of the minimum value</td><td><a 
href="https://pandas.pydata.org/docs/reference/api/pandas.Series.idxmin.html">Series</a>丨<a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.idxmin.html">DataFrame</a></td></tr>
<tr><td>idxmax</td><td>Index label of the maximum value</td><td><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.idxmax.html">Series</a>丨<a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.idxmax.html">DataFrame</a></td></tr>
<tr><td>quantile</td><td>Sample quantile (for a quantile level between 0 and 1)</td><td><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.quantile.html">Series</a>丨<a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.quantile.html">DataFrame</a></td></tr>
<tr><td>sum</td><td>Sum of the values</td><td><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.sum.html">Series</a>丨<a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sum.html">DataFrame</a></td></tr>
<tr><td>mean</td><td>Mean of the values</td><td><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.mean.html">Series</a>丨<a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.mean.html">DataFrame</a></td></tr>
<tr><td>median</td><td>Arithmetic median of the values (50% quantile)</td><td><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.median.html">Series</a>丨<a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.median.html">DataFrame</a></td></tr>
<tr><td>mad</td><td>Mean absolute deviation from the mean (deprecated in pandas 1.5 and removed in pandas 2.0)</td><td><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.mad.html">Series</a>丨<a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.mad.html">DataFrame</a></td></tr>
<tr><td>var</td><td>Sample variance of the values</td><td><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.var.html">Series</a>丨<a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.var.html">DataFrame</a></td></tr>
<tr><td>std</td><td>Sample standard deviation of the values</td><td><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.std.html">Series</a>丨<a 
href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.std.html">DataFrame</a></td></tr></tbody></table><hr><pre><code class="hljs yaml"><span class="hljs-string">This is anti-scraping text; readers can ignore it.</span>
<span class="hljs-string">This article was originally published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home page: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106788501</span>
<span class="hljs-string">Do not repost without authorization. Please respect original work.</span></code></pre><hr>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;文章目录&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-01x00-tong-ji-ji-suan-font&quot;&gt;&lt;font color=&quot;#</summary>
      
    
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    <category term="Pandas" scheme="https://www.itbob.cn/tags/Pandas/"/>
    
  </entry>
  
  <entry>
    <title>Python Data Analysis Trio, Pandas (4): Function Application, Mapping, Sorting, and Hierarchical Indexing</title>
    <link href="https://www.itbob.cn/article/028/"/>
    <id>https://www.itbob.cn/article/028/</id>
    <published>2020-06-15T12:27:32.000Z</published>
    <updated>2022-05-22T12:39:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">Table of Contents</span></h2><!-- toc --><ul>
<li><a href="#font-color-ff0000-01x00-han-shu-ying-yong-he-ying-she-font"><font color="#FF0000">【01x00】Function Application and Mapping</font></a></li>
<li><a href="#font-color-ff0000-02x00-pai-xu-font"><font color="#FF0000">【02x00】Sorting</font></a><ul>
<li><a href="#font-color-4876ff-02x01-sort-index-suo-yin-pai-xu-font"><font color="#4876FF">【02x01】sort_index(): Sorting by Index</font></a></li>
<li><a href="#font-color-4876ff-02x02-sort-values-an-zhi-pai-xu-font"><font color="#4876FF">【02x02】sort_values(): Sorting by Value</font></a></li>
<li><a href="#font-color-4876ff-02x03-rank-fan-hui-pai-xu-hou-yuan-su-suo-yin-font"><font color="#4876FF">【02x03】rank(): Computing Element Ranks</font></a></li></ul></li>
<li><a href="#font-color-ff0000-03x00-ceng-ji-suo-yin-font"><font color="#FF0000">【03x00】Hierarchical Indexing</font></a><ul>
<li><a href="#font-color-4876ff-03x01-ren-shi-ceng-ji-suo-yin-font"><font color="#4876FF">【03x01】Introducing Hierarchical Indexing</font></a></li>
<li><a href="#font-color-4876ff-03x02-multiindex-suo-yin-dui-xiang-font"><font color="#4876FF">【03x02】The MultiIndex Object</font></a></li>
<li><a href="#font-color-4876ff-03x03-ti-qu-zhi-font"><font color="#4876FF">【03x03】Extracting Values</font></a></li>
<li><a href="#font-color-4876ff-03x04-jiao-huan-fen-ceng-yu-pai-xu-font"><font color="#4876FF">【03x04】Swapping Levels and Sorting</font></a></li></ul></li></ul><!-- tocstop --><hr><p>Articles in the Pandas series:</p><ul>
<li><a href="https://www.itbob.cn/article/025/">Python Data Analysis Trio, Pandas (1): Introducing Pandas and Its Series and DataFrame Objects</a></li>
<li><a href="https://www.itbob.cn/article/026/">Python Data Analysis Trio, Pandas (2): The Index Object and Indexing Operations</a></li>
<li><a href="https://www.itbob.cn/article/027/">Python Data Analysis Trio, Pandas (3): Arithmetic Operations and Handling Missing Values</a></li>
<li><a href="https://www.itbob.cn/article/028/">Python Data Analysis Trio, Pandas (4): Function Application, Mapping, Sorting, and Hierarchical Indexing</a></li>
<li><a href="https://www.itbob.cn/article/029/">Python Data Analysis Trio, Pandas (5): Statistical Computation and Descriptive Statistics</a></li>
<li><a href="https://www.itbob.cn/article/030/">Python Data Analysis Trio, Pandas (6): GroupBy Split, Apply, and Combine</a></li>
<li><a 
href="https://www.itbob.cn/article/031/">Python Data Analysis Trio, Pandas (7): Combining Datasets</a></li>
<li><a href="https://www.itbob.cn/article/032/">Python Data Analysis Trio, Pandas (8): Reshaping Data, Handling Duplicates, and Replacing Values</a></li>
<li><a href="https://www.itbob.cn/article/033/">Python Data Analysis Trio, Pandas (9): Time Series</a></li>
<li><a href="https://www.itbob.cn/article/034/">Python Data Analysis Trio, Pandas (10): Reading and Writing Data</a></li></ul><hr><p>Column collections:</p><ul>
<li>NumPy column: <a href="https://itrhx.blog.csdn.net/category_9780393.html">https://itrhx.blog.csdn.net/category_9780393.html</a></li>
<li>Pandas column: <a href="https://itrhx.blog.csdn.net/category_9780397.html">https://itrhx.blog.csdn.net/category_9780397.html</a></li>
<li>Matplotlib column: <a href="https://itrhx.blog.csdn.net/category_9780418.html">https://itrhx.blog.csdn.net/category_9780418.html</a></li></ul><br>Recommended learning resources and sites:<br><br><ul>
<li>NumPy Chinese documentation site: <a href="https://www.numpy.org.cn/">https://www.numpy.org.cn/</a></li>
<li>Pandas Chinese documentation site: <a href="https://www.pypandas.cn/">https://www.pypandas.cn/</a></li>
<li>Matplotlib Chinese documentation site: <a href="https://www.matplotlib.org.cn/">https://www.matplotlib.org.cn/</a></li>
<li>NumPy, Matplotlib, and Pandas cheat sheets: <a href="https://github.com/TRHX/Python-quick-reference-table">https://github.com/TRHX/Python-quick-reference-table</a></li></ul><hr><pre><code class="hljs yaml"><span class="hljs-string">This is anti-scraping text; readers can ignore it.</span>
<span class="hljs-string">This article was originally published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home page: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106758103</span>
<span class="hljs-string">Do not repost without authorization. Please respect original work.</span></code></pre><hr><h2><span id="01x00-han-shu-ying-yong-he-ying-she"><font color="#FF0000">【01x00】Function Application and Mapping</font></span></h2><p>NumPy ufuncs (element-wise array functions) can be applied directly to Pandas objects:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(np.random.randn(<span class="hljs-number">5</span>,<span class="hljs-number">4</span>) - <span class="hljs-number">1</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
          <span class="hljs-number">0</span>         <span class="hljs-number">1</span>         <span class="hljs-number">2</span>         <span class="hljs-number">3</span>
<span class="hljs-number">0</span> -<span class="hljs-number">0.228107</span>  <span class="hljs-number">1.377709</span> -<span class="hljs-number">1.096528</span> -<span class="hljs-number">2.051001</span>
<span class="hljs-number">1</span> -<span class="hljs-number">2.477144</span> -<span class="hljs-number">0.500013</span> -<span class="hljs-number">0.040695</span> -<span class="hljs-number">0.267452</span>
<span class="hljs-number">2</span> -<span class="hljs-number">0.485999</span> -<span class="hljs-number">1.232930</span> -<span class="hljs-number">0.390701</span> -<span class="hljs-number">1.947984</span>
<span class="hljs-number">3</span> -<span class="hljs-number">0.839161</span> -<span class="hljs-number">0.702802</span> -<span class="hljs-number">1.756359</span> -<span class="hljs-number">1.873149</span>
<span class="hljs-number">4</span>  <span class="hljs-number">0.853121</span> -<span class="hljs-number">1.540105</span>  <span class="hljs-number">0.621614</span> -<span class="hljs-number">0.583360</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>np.<span class="hljs-built_in">abs</span>(obj)
          <span class="hljs-number">0</span>         <span class="hljs-number">1</span>         <span class="hljs-number">2</span>         <span class="hljs-number">3</span>
<span class="hljs-number">0</span>  <span class="hljs-number">0.228107</span>  <span class="hljs-number">1.377709</span>  <span class="hljs-number">1.096528</span>  <span class="hljs-number">2.051001</span>
<span class="hljs-number">1</span>  <span class="hljs-number">2.477144</span>  <span class="hljs-number">0.500013</span>  <span class="hljs-number">0.040695</span>  <span class="hljs-number">0.267452</span>
<span class="hljs-number">2</span>  <span class="hljs-number">0.485999</span>  <span class="hljs-number">1.232930</span>  <span class="hljs-number">0.390701</span>  <span class="hljs-number">1.947984</span>
<span class="hljs-number">3</span>  <span class="hljs-number">0.839161</span>  <span class="hljs-number">0.702802</span>  <span class="hljs-number">1.756359</span>  <span class="hljs-number">1.873149</span>
<span class="hljs-number">4</span>  <span class="hljs-number">0.853121</span>  <span class="hljs-number">1.540105</span>  <span class="hljs-number">0.621614</span>  <span class="hljs-number">0.583360</span></code></pre><p>Function mapping: the <code>apply</code> method applies a function along the columns or rows of a DataFrame. The <code>axis</code> parameter selects the direction; the default, <code>axis=0</code>, applies the function to each column:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(np.random.randn(<span class="hljs-number">5</span>,<span class="hljs-number">4</span>) - <span class="hljs-number">1</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
          <span class="hljs-number">0</span>         <span class="hljs-number">1</span>         <span class="hljs-number">2</span>         <span class="hljs-number">3</span>
<span class="hljs-number">0</span> -<span class="hljs-number">0.707028</span> -<span class="hljs-number">0.755552</span> -<span class="hljs-number">2.196480</span> -<span class="hljs-number">0.529676</span>
<span class="hljs-number">1</span> -<span class="hljs-number">0.772668</span>  <span class="hljs-number">0.127485</span> -<span class="hljs-number">2.015699</span> -<span class="hljs-number">0.283654</span>
<span class="hljs-number">2</span>  <span class="hljs-number">0.248200</span> -<span class="hljs-number">1.940189</span> -<span class="hljs-number">1.068028</span> -<span class="hljs-number">1.751737</span>
<span class="hljs-number">3</span> -<span class="hljs-number">0.872904</span> -<span class="hljs-number">0.465371</span> -<span class="hljs-number">1.327951</span> -<span class="hljs-number">2.883160</span>
<span class="hljs-number">4</span> -<span class="hljs-number">0.092664</span>  <span class="hljs-number">0.258351</span> -<span class="hljs-number">1.010747</span> -<span class="hljs-number">2.313039</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.apply(<span class="hljs-keyword">lambda</span> x : x.<span class="hljs-built_in">max</span>())
<span class="hljs-number">0</span>    <span class="hljs-number">0.248200</span>
<span class="hljs-number">1</span>    <span class="hljs-number">0.258351</span>
<span class="hljs-number">2</span>   -<span class="hljs-number">1.010747</span>
<span class="hljs-number">3</span>   -<span class="hljs-number">0.283654</span>
dtype: float64
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.apply(<span class="hljs-keyword">lambda</span> x : x.<span class="hljs-built_in">max</span>(), axis=<span class="hljs-number">1</span>)
<span class="hljs-number">0</span>   -<span class="hljs-number">0.529676</span>
<span class="hljs-number">1</span>    <span class="hljs-number">0.127485</span>
<span class="hljs-number">2</span>    <span class="hljs-number">0.248200</span>
<span class="hljs-number">3</span>   -<span class="hljs-number">0.465371</span>
<span class="hljs-number">4</span>    <span class="hljs-number">0.258351</span>
dtype: float64</code></pre><p>You can also use <code>applymap</code> to apply a function to every individual element (note that in pandas 2.1 and later <code>applymap</code> is deprecated in favor of <code>DataFrame.map</code>):</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(np.random.randn(<span class="hljs-number">5</span>,<span class="hljs-number">4</span>) - <span class="hljs-number">1</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
          <span class="hljs-number">0</span>         <span class="hljs-number">1</span>         <span class="hljs-number">2</span>         <span class="hljs-number">3</span>
<span class="hljs-number">0</span> -<span class="hljs-number">0.772463</span> -<span class="hljs-number">1.597008</span> -<span class="hljs-number">3.196100</span> -<span class="hljs-number">1.948486</span>
<span class="hljs-number">1</span> -<span class="hljs-number">1.765108</span> -<span class="hljs-number">1.646421</span> -<span class="hljs-number">0.687175</span> -<span class="hljs-number">0.401782</span>
<span class="hljs-number">2</span>  <span class="hljs-number">0.275699</span> -<span class="hljs-number">3.115184</span> -<span class="hljs-number">1.429063</span> -<span class="hljs-number">1.075610</span>
<span class="hljs-number">3</span> -<span class="hljs-number">0.251734</span> -<span class="hljs-number">0.448399</span> -<span class="hljs-number">3.077677</span> -<span class="hljs-number">0.294674</span>
<span class="hljs-number">4</span> -<span class="hljs-number">1.495896</span> -<span class="hljs-number">1.689729</span> -<span class="hljs-number">0.560376</span> -<span class="hljs-number">1.808794</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.applymap(<span class="hljs-keyword">lambda</span> x : <span class="hljs-string">&#x27;%.2f&#x27;</span> % x)
       <span class="hljs-number">0</span>      <span class="hljs-number">1</span>      <span class="hljs-number">2</span>      <span 
class="hljs-number">3</span>
<span class="hljs-number">0</span>  -<span class="hljs-number">0.77</span>  -<span class="hljs-number">1.60</span>  -<span class="hljs-number">3.20</span>  -<span class="hljs-number">1.95</span>
<span class="hljs-number">1</span>  -<span class="hljs-number">1.77</span>  -<span class="hljs-number">1.65</span>  -<span class="hljs-number">0.69</span>  -<span class="hljs-number">0.40</span>
<span class="hljs-number">2</span>   <span class="hljs-number">0.28</span>  -<span class="hljs-number">3.12</span>  -<span class="hljs-number">1.43</span>  -<span class="hljs-number">1.08</span>
<span class="hljs-number">3</span>  -<span class="hljs-number">0.25</span>  -<span class="hljs-number">0.45</span>  -<span class="hljs-number">3.08</span>  -<span class="hljs-number">0.29</span>
<span class="hljs-number">4</span>  -<span class="hljs-number">1.50</span>  -<span class="hljs-number">1.69</span>  -<span class="hljs-number">0.56</span>  -<span class="hljs-number">1.81</span></code></pre><h2><span id="02x00-pai-xu"><font color="#FF0000">【02x00】Sorting</font></span></h2><h3><span id="02x01-sort-index-suo-yin-pai-xu"><font color="#4876FF">【02x01】sort_index(): Sorting by Index</font></span></h3><p>Sorting a dataset by some criterion is another important built-in operation. To sort the row or column index lexicographically, use the <code>sort_index</code> method, which returns a new, sorted object.</p><p>The basic syntax for Series and DataFrame is as follows:</p><pre><code class="hljs python">Series.sort_index(self,
                  axis=<span class="hljs-number">0</span>,
                  level=<span class="hljs-literal">None</span>,
                  ascending=<span class="hljs-literal">True</span>,
                  inplace=<span class="hljs-literal">False</span>,
                  kind=<span class="hljs-string">&#x27;quicksort&#x27;</span>,
                  na_position=<span class="hljs-string">&#x27;last&#x27;</span>,
                  sort_remaining=<span class="hljs-literal">True</span>,
                  ignore_index: <span class="hljs-built_in">bool</span> = <span 
class="hljs-literal">False</span>)</code></pre><pre><code class="hljs python">DataFrame.sort_index(self,
                     axis=<span class="hljs-number">0</span>,
                     level=<span class="hljs-literal">None</span>,
                     ascending=<span class="hljs-literal">True</span>,
                     inplace=<span class="hljs-literal">False</span>,
                     kind=<span class="hljs-string">&#x27;quicksort&#x27;</span>,
                     na_position=<span class="hljs-string">&#x27;last&#x27;</span>,
                     sort_remaining=<span class="hljs-literal">True</span>,
                     ignore_index: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>)</code></pre><p>Official documentation:</p><ul>
<li><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.sort_index.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.sort_index.html</a></li>
<li><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sort_index.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sort_index.html</a></li></ul><p>The commonly used parameters are described below:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody>
<tr><td>axis</td><td>Axis to sort along: <code>0</code> or <code>'index'</code>, <code>1</code> or <code>'columns'</code>; <code>1</code> or <code>'columns'</code> is available only for DataFrame</td></tr>
<tr><td>ascending</td><td><code>True</code> (default) sorts in ascending order; <code>False</code> sorts in descending order</td></tr>
<tr><td>kind</td><td>Sorting algorithm: <code>'quicksort'</code> (default), <code>'mergesort'</code> (merge sort), or <code>'heapsort'</code> (heap sort); see <a href="https://numpy.org/doc/stable/reference/generated/numpy.sort.html">numpy.sort()</a> for details</td></tr></tbody></table><p>Usage with a Series (sorting by the index):</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(<span class="hljs-built_in">range</span>(<span 
class="hljs-number">4</span>), index=[<span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
d    <span class="hljs-number">0</span>
a    <span class="hljs-number">1</span>
b    <span class="hljs-number">2</span>
c    <span class="hljs-number">3</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.sort_index()
a    <span class="hljs-number">1</span>
b    <span class="hljs-number">2</span>
c    <span class="hljs-number">3</span>
d    <span class="hljs-number">0</span>
dtype: int64</code></pre><p>Usage with a DataFrame (sorting by the row index or by the column labels):</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(np.arange(<span class="hljs-number">8</span>).reshape((<span class="hljs-number">2</span>, <span class="hljs-number">4</span>)), index=[<span class="hljs-string">&#x27;three&#x27;</span>, <span class="hljs-string">&#x27;one&#x27;</span>], columns=[<span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
       d  a  b  c
three  <span class="hljs-number">0</span>  <span class="hljs-number">1</span>  <span class="hljs-number">2</span>  <span class="hljs-number">3</span>
one    <span class="hljs-number">4</span>  <span class="hljs-number">5</span>  <span class="hljs-number">6</span>  <span class="hljs-number">7</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.sort_index()
       d  a  b  c
one    <span class="hljs-number">4</span>  <span 
class="hljs-number">5</span>  <span class="hljs-number">6</span>  <span class="hljs-number">7</span>
three  <span class="hljs-number">0</span>  <span class="hljs-number">1</span>  <span class="hljs-number">2</span>  <span class="hljs-number">3</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.sort_index(axis=<span class="hljs-number">1</span>)
       a  b  c  d
three  <span class="hljs-number">1</span>  <span class="hljs-number">2</span>  <span class="hljs-number">3</span>  <span class="hljs-number">0</span>
one    <span class="hljs-number">5</span>  <span class="hljs-number">6</span>  <span class="hljs-number">7</span>  <span class="hljs-number">4</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.sort_index(axis=<span class="hljs-number">1</span>, ascending=<span class="hljs-literal">False</span>)
       d  c  b  a
three  <span class="hljs-number">0</span>  <span class="hljs-number">3</span>  <span class="hljs-number">2</span>  <span class="hljs-number">1</span>
one    <span class="hljs-number">4</span>  <span class="hljs-number">7</span>  <span class="hljs-number">6</span>  <span class="hljs-number">5</span></code></pre><h3><span id="02x02-sort-values-an-zhi-pai-xu"><font color="#4876FF">【02x02】sort_values(): Sorting by Value</font></span></h3><p>The basic syntax for Series and DataFrame is as follows:</p><pre><code class="hljs python">Series.sort_values(self,
                   axis=<span class="hljs-number">0</span>,
                   ascending=<span class="hljs-literal">True</span>,
                   inplace=<span class="hljs-literal">False</span>,
                   kind=<span class="hljs-string">&#x27;quicksort&#x27;</span>,
                   na_position=<span class="hljs-string">&#x27;last&#x27;</span>,
                   ignore_index=<span class="hljs-literal">False</span>)</code></pre><pre><code class="hljs python">DataFrame.sort_values(self,
                      by,
                      axis=<span 
class="hljs-number">0</span>,
                      ascending=<span class="hljs-literal">True</span>,
                      inplace=<span class="hljs-literal">False</span>,
                      kind=<span class="hljs-string">&#x27;quicksort&#x27;</span>,
                      na_position=<span class="hljs-string">&#x27;last&#x27;</span>,
                      ignore_index=<span class="hljs-literal">False</span>)</code></pre><p>Official documentation:</p><ul>
<li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.sort_values.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.sort_values.html</a></p></li>
<li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sort_values.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sort_values.html</a></p></li></ul><p>The commonly used parameters are described below:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody>
<tr><td>by</td><td>Required for DataFrame: the column name or list of column names whose values determine the order; Series has no such parameter</td></tr>
<tr><td>axis</td><td>Axis to sort along: <code>0</code> or <code>'index'</code>, <code>1</code> or <code>'columns'</code>; <code>1</code> or <code>'columns'</code> is available only for DataFrame</td></tr>
<tr><td>ascending</td><td><code>True</code> (default) sorts in ascending order; <code>False</code> sorts in descending order</td></tr>
<tr><td>kind</td><td>Sorting algorithm: <code>'quicksort'</code> (default), <code>'mergesort'</code> (merge sort), or <code>'heapsort'</code> (heap sort); see <a href="https://numpy.org/doc/stable/reference/generated/numpy.sort.html">numpy.sort()</a> for details</td></tr></tbody></table><p>Usage with a Series: the values are sorted, and any missing values are placed at the end of the Series by default:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">4</span>, <span class="hljs-number">7</span>, -<span class="hljs-number">3</span>, <span class="hljs-number">2</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
<span class="hljs-number">0</span>    <span 
class="hljs-number">4</span>
<span class="hljs-number">1</span>    <span class="hljs-number">7</span>
<span class="hljs-number">2</span>   -<span class="hljs-number">3</span>
<span class="hljs-number">3</span>    <span class="hljs-number">2</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.sort_values()
<span class="hljs-number">2</span>   -<span class="hljs-number">3</span>
<span class="hljs-number">3</span>    <span class="hljs-number">2</span>
<span class="hljs-number">0</span>    <span class="hljs-number">4</span>
<span class="hljs-number">1</span>    <span class="hljs-number">7</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">4</span>, np.nan, <span class="hljs-number">7</span>, np.nan, -<span class="hljs-number">3</span>, <span class="hljs-number">2</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
<span class="hljs-number">0</span>    <span class="hljs-number">4.0</span>
<span class="hljs-number">1</span>    NaN
<span class="hljs-number">2</span>    <span class="hljs-number">7.0</span>
<span class="hljs-number">3</span>    NaN
<span class="hljs-number">4</span>   -<span class="hljs-number">3.0</span>
<span class="hljs-number">5</span>    <span class="hljs-number">2.0</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.sort_values()
<span class="hljs-number">4</span>   -<span class="hljs-number">3.0</span>
<span class="hljs-number">5</span>    <span class="hljs-number">2.0</span>
<span class="hljs-number">0</span>    <span class="hljs-number">4.0</span>
<span class="hljs-number">2</span>    <span class="hljs-number">7.0</span>
<span class="hljs-number">1</span>    NaN
<span class="hljs-number">3</span>    NaN
dtype: float64</code></pre><p>Usage with a DataFrame: to sort by the values in one or more columns, pass the column name or a list of column names to the <code>by</code> parameter of <code>sort_values()</code>. When several columns are given, rows are sorted by the first column; rows with equal values in the first column are then ordered by the second column, and so on:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(&#123;<span class="hljs-string">&#x27;a&#x27;</span>: [<span class="hljs-number">4</span>, <span class="hljs-number">4</span>, -<span class="hljs-number">3</span>, <span class="hljs-number">2</span>], <span class="hljs-string">&#x27;b&#x27;</span>: [<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>], <span class="hljs-string">&#x27;c&#x27;</span>: [<span class="hljs-number">6</span>, <span class="hljs-number">4</span>, <span class="hljs-number">1</span>, <span class="hljs-number">3</span>]&#125;)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
   a  b  c
<span class="hljs-number">0</span>  <span class="hljs-number">4</span>  <span class="hljs-number">0</span>  <span class="hljs-number">6</span>
<span class="hljs-number">1</span>  <span class="hljs-number">4</span>  <span class="hljs-number">1</span>  <span class="hljs-number">4</span>
<span class="hljs-number">2</span> -<span class="hljs-number">3</span>  <span class="hljs-number">0</span>  <span class="hljs-number">1</span>
<span class="hljs-number">3</span>  <span class="hljs-number">2</span>  <span class="hljs-number">1</span>  <span class="hljs-number">3</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.sort_values(by=<span class="hljs-string">&#x27;c&#x27;</span>)
   a  b  c
<span class="hljs-number">2</span> -<span class="hljs-number">3</span>  <span class="hljs-number">0</span>  <span class="hljs-number">1</span>
<span class="hljs-number">3</span>  <span class="hljs-number">2</span>  <span class="hljs-number">1</span>  <span class="hljs-number">3</span>
<span class="hljs-number">1</span>  <span 
class="hljs-number">4</span>  <span class="hljs-number">1</span>  <span class="hljs-number">4</span><span class="hljs-number">0</span>  <span class="hljs-number">4</span>  <span class="hljs-number">0</span>  <span class="hljs-number">6</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.sort_values(by=<span class="hljs-string">&#x27;c&#x27;</span>, ascending=<span class="hljs-literal">False</span>)   a  b  c<span class="hljs-number">0</span>  <span class="hljs-number">4</span>  <span class="hljs-number">0</span>  <span class="hljs-number">6</span><span class="hljs-number">1</span>  <span class="hljs-number">4</span>  <span class="hljs-number">1</span>  <span class="hljs-number">4</span><span class="hljs-number">3</span>  <span class="hljs-number">2</span>  <span class="hljs-number">1</span>  <span class="hljs-number">3</span><span class="hljs-number">2</span> -<span class="hljs-number">3</span>  <span class="hljs-number">0</span>  <span class="hljs-number">1</span>&gt;&gt;&gt;<span class="hljs-meta">&gt;&gt;&gt; </span>obj.sort_values(by=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>])   a  b  c<span class="hljs-number">2</span> -<span class="hljs-number">3</span>  <span class="hljs-number">0</span>  <span class="hljs-number">1</span><span class="hljs-number">3</span>  <span class="hljs-number">2</span>  <span class="hljs-number">1</span>  <span class="hljs-number">3</span><span class="hljs-number">0</span>  <span class="hljs-number">4</span>  <span class="hljs-number">0</span>  <span class="hljs-number">6</span><span class="hljs-number">1</span>  <span class="hljs-number">4</span>  <span class="hljs-number">1</span>  <span class="hljs-number">4</span></code></pre><p>With <code>axis=1</code> the columns are reordered instead, and <code>by</code> takes a row label rather than a column name:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = 
pd.DataFrame(&#123;<span class="hljs-string">&#x27;a&#x27;</span>: [<span class="hljs-number">4</span>, <span class="hljs-number">4</span>, -<span class="hljs-number">3</span>, <span class="hljs-number">2</span>], <span class="hljs-string">&#x27;b&#x27;</span>: [<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>], <span class="hljs-string">&#x27;c&#x27;</span>: [<span class="hljs-number">6</span>, <span class="hljs-number">4</span>, <span class="hljs-number">1</span>, <span class="hljs-number">3</span>]&#125;, index=[<span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>, <span class="hljs-string">&#x27;C&#x27;</span>, <span class="hljs-string">&#x27;D&#x27;</span>])<span class="hljs-meta">&gt;&gt;&gt; </span>obj   a  b  cA  <span class="hljs-number">4</span>  <span class="hljs-number">0</span>  <span class="hljs-number">6</span>B  <span class="hljs-number">4</span>  <span class="hljs-number">1</span>  <span class="hljs-number">4</span>C -<span class="hljs-number">3</span>  <span class="hljs-number">0</span>  <span class="hljs-number">1</span>D  <span class="hljs-number">2</span>  <span class="hljs-number">1</span>  <span class="hljs-number">3</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.sort_values(by=<span class="hljs-string">&#x27;B&#x27;</span>, axis=<span class="hljs-number">1</span>)   b  a  cA  <span class="hljs-number">0</span>  <span class="hljs-number">4</span>  <span class="hljs-number">6</span>B  <span class="hljs-number">1</span>  <span class="hljs-number">4</span>  <span class="hljs-number">4</span>C  <span class="hljs-number">0</span> -<span class="hljs-number">3</span>  <span class="hljs-number">1</span>D  <span class="hljs-number">1</span>  <span class="hljs-number">2</span>  <span class="hljs-number">3</span></code></pre><h3><span 
id="02x03-rank-fan-hui-pai-xu-hou-yuan-su-suo-yin"><font color="#4876FF">【02x03】rank(): Return the Rank of Each Element</font></span></h3><p>rank() returns an object of the same shape as the original whose values are the ranks of the original elements, that is, the 1-based position each element would occupy after sorting (not a positional index).</p><p>The basic syntax for Series and DataFrame is as follows:</p><pre><code class="hljs python">Series.rank(self: ~ FrameOrSeries,            axis=<span class="hljs-number">0</span>,            method: <span class="hljs-built_in">str</span> = <span class="hljs-string">&#x27;average&#x27;</span>,            numeric_only: <span class="hljs-type">Union</span>[<span class="hljs-built_in">bool</span>, NoneType] = <span class="hljs-literal">None</span>,            na_option: <span class="hljs-built_in">str</span> = <span class="hljs-string">&#x27;keep&#x27;</span>,            ascending: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">True</span>,            pct: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>)</code></pre><pre><code class="hljs python">DataFrame.rank(self: ~ FrameOrSeries,               axis=<span class="hljs-number">0</span>,               method: <span class="hljs-built_in">str</span> = <span class="hljs-string">&#x27;average&#x27;</span>,               numeric_only: <span class="hljs-type">Union</span>[<span class="hljs-built_in">bool</span>, NoneType] = <span class="hljs-literal">None</span>,               na_option: <span class="hljs-built_in">str</span> = <span class="hljs-string">&#x27;keep&#x27;</span>,               ascending: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">True</span>,               pct: <span class="hljs-built_in">bool</span> = <span class="hljs-literal">False</span>)</code></pre><p>Official documentation:</p><ul><li><p><a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.rank.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.rank.html</a></p></li><li><p><a 
href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rank.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rank.html</a></p></li></ul><p>The commonly used parameters are described below:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>axis</td><td>Axis to rank along: <code>0</code> or <code>'index'</code>, <code>1</code> or <code>'columns'</code>; <code>1</code> or <code>'columns'</code> is available only for DataFrame</td></tr><tr><td>method</td><td>How to rank tied (equal) values:<br><code>'average'</code>: the default, the average of the ranks in the tied group; <code>'min'</code>: the lowest rank in the group;<br><code>'max'</code>: the highest rank in the group; <code>'first'</code>: ranks assigned in order of appearance;<br><code>'dense'</code>: like <code>'min'</code>, but the rank always increases by exactly 1 from one group to the next (see the examples below)</td></tr><tr><td>ascending</td><td><code>True</code> (default) ranks in ascending order; <code>False</code> ranks in descending order</td></tr></tbody></table><p>Applied to a Series, elements are ranked by value; tied values share a rank according to <code>method</code>, and missing values keep a rank of NaN by default (<code>na_option='keep'</code>):</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">7</span>, -<span class="hljs-number">5</span>, <span class="hljs-number">7</span>, <span class="hljs-number">4</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">4</span>])<span class="hljs-meta">&gt;&gt;&gt; </span>obj<span class="hljs-number">0</span>    <span class="hljs-number">7</span><span class="hljs-number">1</span>   -<span class="hljs-number">5</span><span class="hljs-number">2</span>    <span class="hljs-number">7</span><span class="hljs-number">3</span>    <span class="hljs-number">4</span><span class="hljs-number">4</span>    <span class="hljs-number">2</span><span class="hljs-number">5</span>    <span class="hljs-number">0</span><span class="hljs-number">6</span>    <span class="hljs-number">4</span>dtype: int64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.rank()<span 
class="hljs-number">0</span>    <span class="hljs-number">6.5</span>  <span class="hljs-comment"># Elements 0 and 2 occupy ranks 6 and 7 in ascending order; the default takes their average, 6.5</span><span class="hljs-number">1</span>    <span class="hljs-number">1.0</span><span class="hljs-number">2</span>    <span class="hljs-number">6.5</span><span class="hljs-number">3</span>    <span class="hljs-number">4.5</span>  <span class="hljs-comment"># Elements 3 and 6 occupy ranks 4 and 5 in ascending order; the default takes their average, 4.5</span><span class="hljs-number">4</span>    <span class="hljs-number">3.0</span><span class="hljs-number">5</span>    <span class="hljs-number">2.0</span><span class="hljs-number">6</span>    <span class="hljs-number">4.5</span>dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.rank(method=<span class="hljs-string">&#x27;first&#x27;</span>)<span class="hljs-number">0</span>    <span class="hljs-number">6.0</span>  <span class="hljs-comment"># Elements 0 and 2 occupy ranks 6 and 7; &#x27;first&#x27; assigns them in order of appearance, giving 6 and 7</span><span class="hljs-number">1</span>    <span class="hljs-number">1.0</span><span class="hljs-number">2</span>    <span class="hljs-number">7.0</span><span class="hljs-number">3</span>    <span class="hljs-number">4.0</span>  <span class="hljs-comment"># Elements 3 and 6 occupy ranks 4 and 5; &#x27;first&#x27; assigns them in order of appearance, giving 4 and 5</span><span class="hljs-number">4</span>    <span class="hljs-number">3.0</span><span class="hljs-number">5</span>    <span class="hljs-number">2.0</span><span class="hljs-number">6</span>    <span class="hljs-number">5.0</span>dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.rank(method=<span class="hljs-string">&#x27;dense&#x27;</span>)<span class="hljs-number">0</span>    <span class="hljs-number">5.0</span>  <span class="hljs-comment"># Elements 0 and 2 occupy ranks 6 and 7; &#x27;dense&#x27; takes the lowest rank but keeps consecutive groups exactly 1 apart, giving 5</span><span class="hljs-number">1</span>    <span class="hljs-number">1.0</span><span class="hljs-number">2</span>    <span 
class="hljs-number">5.0</span><span class="hljs-number">3</span>    <span class="hljs-number">4.0</span>  <span class="hljs-comment"># Elements 3 and 6 occupy ranks 4 and 5; the lowest rank is taken, giving 4</span><span class="hljs-number">4</span>    <span class="hljs-number">3.0</span><span class="hljs-number">5</span>    <span class="hljs-number">2.0</span><span class="hljs-number">6</span>    <span class="hljs-number">4.0</span>dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.rank(method=<span class="hljs-string">&#x27;min&#x27;</span>)<span class="hljs-number">0</span>    <span class="hljs-number">6.0</span>  <span class="hljs-comment"># Elements 0 and 2 occupy ranks 6 and 7; the lowest rank is taken, giving 6</span><span class="hljs-number">1</span>    <span class="hljs-number">1.0</span><span class="hljs-number">2</span>    <span class="hljs-number">6.0</span><span class="hljs-number">3</span>    <span class="hljs-number">4.0</span>  <span class="hljs-comment"># Elements 3 and 6 occupy ranks 4 and 5; the lowest rank is taken, giving 4</span><span class="hljs-number">4</span>    <span class="hljs-number">3.0</span><span class="hljs-number">5</span>    <span class="hljs-number">2.0</span><span class="hljs-number">6</span>    <span class="hljs-number">4.0</span>dtype: float64</code></pre><p>With a DataFrame, the axis parameter selects the axis along which to rank:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(&#123;<span class="hljs-string">&#x27;b&#x27;</span>: [<span class="hljs-number">4.3</span>, <span class="hljs-number">7</span>, -<span class="hljs-number">3</span>, <span class="hljs-number">2</span>], <span class="hljs-string">&#x27;a&#x27;</span>: [<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>], <span class="hljs-string">&#x27;c&#x27;</span>: [-<span 
class="hljs-number">2</span>, <span class="hljs-number">5</span>, <span class="hljs-number">8</span>, -<span class="hljs-number">2.5</span>]&#125;)<span class="hljs-meta">&gt;&gt;&gt; </span>obj     b  a    c<span class="hljs-number">0</span>  <span class="hljs-number">4.3</span>  <span class="hljs-number">0</span> -<span class="hljs-number">2.0</span><span class="hljs-number">1</span>  <span class="hljs-number">7.0</span>  <span class="hljs-number">1</span>  <span class="hljs-number">5.0</span><span class="hljs-number">2</span> -<span class="hljs-number">3.0</span>  <span class="hljs-number">0</span>  <span class="hljs-number">8.0</span><span class="hljs-number">3</span>  <span class="hljs-number">2.0</span>  <span class="hljs-number">1</span> -<span class="hljs-number">2.5</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.rank()     b    a    c<span class="hljs-number">0</span>  <span class="hljs-number">3.0</span>  <span class="hljs-number">1.5</span>  <span class="hljs-number">2.0</span><span class="hljs-number">1</span>  <span class="hljs-number">4.0</span>  <span class="hljs-number">3.5</span>  <span class="hljs-number">3.0</span><span class="hljs-number">2</span>  <span class="hljs-number">1.0</span>  <span class="hljs-number">1.5</span>  <span class="hljs-number">4.0</span><span class="hljs-number">3</span>  <span class="hljs-number">2.0</span>  <span class="hljs-number">3.5</span>  <span class="hljs-number">1.0</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.rank(axis=<span class="hljs-string">&#x27;columns&#x27;</span>)     b    a    c<span class="hljs-number">0</span>  <span class="hljs-number">3.0</span>  <span class="hljs-number">2.0</span>  <span class="hljs-number">1.0</span><span class="hljs-number">1</span>  <span class="hljs-number">3.0</span>  <span class="hljs-number">1.0</span>  <span class="hljs-number">2.0</span><span class="hljs-number">2</span> 
 <span class="hljs-number">1.0</span>  <span class="hljs-number">2.0</span>  <span class="hljs-number">3.0</span><span class="hljs-number">3</span>  <span class="hljs-number">3.0</span>  <span class="hljs-number">2.0</span>  <span class="hljs-number">1.0</span></code></pre><hr><pre><code class="hljs yaml"><span class="hljs-string">This is a block of anti-scraping text; readers may ignore it.</span> <span class="hljs-string">This article was first published on CSDN by ITBOB.</span> <span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span> <span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106758103</span> <span class="hljs-string">Reproduction without authorization is prohibited. Respect original work.</span></code></pre><hr><h2><span id="03x00-ceng-ji-suo-yin"><font color="#FF0000">【03x00】Hierarchical Indexing</font></span></h2><h3><span id="03x01-ren-shi-ceng-ji-suo-yin"><font color="#4876FF">【03x01】Introducing Hierarchical Indexes</font></span></h3><p>The following example creates a Series whose index is built from two lists: the first list is the outer index level and the second list is the inner index level:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(np.random.randn(<span class="hljs-number">12</span>),index=[[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], [<span 
class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>]])<span class="hljs-meta">&gt;&gt;&gt; </span>obja  <span class="hljs-number">0</span>   -<span class="hljs-number">0.201536</span>   <span class="hljs-number">1</span>   -<span class="hljs-number">0.629058</span>   <span class="hljs-number">2</span>    <span class="hljs-number">0.766716</span>b  <span class="hljs-number">0</span>   -<span class="hljs-number">1.255831</span>   <span class="hljs-number">1</span>   -<span class="hljs-number">0.483727</span>   <span class="hljs-number">2</span>   -<span class="hljs-number">0.018653</span>c  <span class="hljs-number">0</span>    <span class="hljs-number">0.788787</span>   <span class="hljs-number">1</span>    <span class="hljs-number">1.010097</span>   <span class="hljs-number">2</span>   -<span class="hljs-number">0.187258</span>d  <span class="hljs-number">0</span>    <span class="hljs-number">1.242363</span>   <span class="hljs-number">1</span>   -<span class="hljs-number">0.822011</span>   <span class="hljs-number">2</span>   -<span class="hljs-number">0.085682</span>dtype: float64</code></pre><h3><span id="03x02-multiindex-suo-yin-dui-xiang"><font color="#4876FF">【03x02】The MultiIndex Object</font></span></h3><p>Official documentation: <a href="https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.html">https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.html</a></p><p>Printing the type of the index of the Series from the example above shows that it is a MultiIndex object. Its <font color="#FF0000">levels</font> attribute lists the unique labels in each of the two levels, and its <font color="#FF0000">codes</font> attribute gives, for each position, the integer position of its label within the corresponding level, as shown below:</p><pre><code class="hljs python"><span 
class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(np.random.randn(<span class="hljs-number">12</span>),index=[[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], [<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>]])<span class="hljs-meta">&gt;&gt;&gt; </span>obja  <span class="hljs-number">0</span>    <span class="hljs-number">0.035946</span>   <span class="hljs-number">1</span>   -<span class="hljs-number">0.867215</span>   <span class="hljs-number">2</span>   -<span class="hljs-number">0.053355</span>b  <span class="hljs-number">0</span>   -<span class="hljs-number">0.986616</span>   <span class="hljs-number">1</span>    <span class="hljs-number">0.026071</span>   <span class="hljs-number">2</span>   -<span class="hljs-number">0.048394</span>c  <span class="hljs-number">0</span>    <span 
class="hljs-number">0.251274</span>   <span class="hljs-number">1</span>    <span class="hljs-number">0.217790</span>   <span class="hljs-number">2</span>    <span class="hljs-number">1.137674</span>d  <span class="hljs-number">0</span>   -<span class="hljs-number">1.245178</span>   <span class="hljs-number">1</span>    <span class="hljs-number">1.234972</span>   <span class="hljs-number">2</span>   -<span class="hljs-number">0.035624</span>dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-built_in">type</span>(obj.index)&lt;<span class="hljs-class"><span class="hljs-keyword">class</span> &#x27;<span class="hljs-title">pandas</span>.<span class="hljs-title">core</span>.<span class="hljs-title">indexes</span>.<span class="hljs-title">multi</span>.<span class="hljs-title">MultiIndex</span>&#x27;&gt;</span><span class="hljs-class">&gt;&gt;&gt; </span><span class="hljs-class">&gt;&gt;&gt; <span class="hljs-title">obj</span>.<span class="hljs-title">index</span></span><span class="hljs-class"><span class="hljs-title">MultiIndex</span>(<span class="hljs-params">[(<span class="hljs-params"><span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-number">0</span></span>),</span></span><span class="hljs-params"><span class="hljs-class">            (<span class="hljs-params"><span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-number">1</span></span>),</span></span><span class="hljs-params"><span class="hljs-class">            (<span class="hljs-params"><span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-number">2</span></span>),</span></span><span class="hljs-params"><span class="hljs-class">            (<span class="hljs-params"><span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-number">0</span></span>),</span></span><span class="hljs-params"><span class="hljs-class">            (<span class="hljs-params"><span class="hljs-string">&#x27;b&#x27;</span>, 
<span class="hljs-number">1</span></span>),</span></span><span class="hljs-params"><span class="hljs-class">            (<span class="hljs-params"><span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-number">2</span></span>),</span></span><span class="hljs-params"><span class="hljs-class">            (<span class="hljs-params"><span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-number">0</span></span>),</span></span><span class="hljs-params"><span class="hljs-class">            (<span class="hljs-params"><span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-number">1</span></span>),</span></span><span class="hljs-params"><span class="hljs-class">            (<span class="hljs-params"><span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-number">2</span></span>),</span></span><span class="hljs-params"><span class="hljs-class">            (<span class="hljs-params"><span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-number">0</span></span>),</span></span><span class="hljs-params"><span class="hljs-class">            (<span class="hljs-params"><span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-number">1</span></span>),</span></span><span class="hljs-params"><span class="hljs-class">            (<span class="hljs-params"><span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-number">2</span></span>)],</span></span><span class="hljs-params"><span class="hljs-class">           </span>)</span><span class="hljs-class">&gt;&gt;&gt; <span class="hljs-title">obj</span>.<span class="hljs-title">index</span>.<span class="hljs-title">levels</span></span><span class="hljs-class"><span class="hljs-title">FrozenList</span>(<span class="hljs-params">[[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], [<span class="hljs-number">0</span>, <span 
class="hljs-number">1</span>, <span class="hljs-number">2</span>]]</span>)</span><span class="hljs-class">&gt;&gt;&gt;</span><span class="hljs-class">&gt;&gt;&gt; <span class="hljs-title">obj</span>.<span class="hljs-title">index</span>.<span class="hljs-title">codes</span></span><span class="hljs-class"><span class="hljs-title">FrozenList</span>(<span class="hljs-params">[[<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">3</span>, <span class="hljs-number">3</span>], [<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>]]</span>)</span></code></pre><p>A MultiIndex is commonly built from a list of arrays with the <code>from_arrays()</code> method:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span>arrays = [[<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>], [<span class="hljs-string">&#x27;red&#x27;</span>, <span class="hljs-string">&#x27;blue&#x27;</span>, <span class="hljs-string">&#x27;red&#x27;</span>, <span class="hljs-string">&#x27;blue&#x27;</span>]]<span class="hljs-meta">&gt;&gt;&gt; </span>pd.MultiIndex.from_arrays(arrays, names=(<span class="hljs-string">&#x27;number&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>))MultiIndex([(<span 
class="hljs-number">1</span>,  <span class="hljs-string">&#x27;red&#x27;</span>),            (<span class="hljs-number">1</span>, <span class="hljs-string">&#x27;blue&#x27;</span>),            (<span class="hljs-number">2</span>,  <span class="hljs-string">&#x27;red&#x27;</span>),            (<span class="hljs-number">2</span>, <span class="hljs-string">&#x27;blue&#x27;</span>)],           names=[<span class="hljs-string">&#x27;number&#x27;</span>, <span class="hljs-string">&#x27;color&#x27;</span>])</code></pre><p>Other commonly used methods are listed below (see the official documentation for more):</p><table><thead><tr><th>Method</th><th>Description</th></tr></thead><tbody><tr><td>from_arrays(arrays[, sortorder, names])</td><td>Convert arrays to a MultiIndex</td></tr><tr><td>from_tuples(tuples[, sortorder, names])</td><td>Convert a list of tuples to a MultiIndex</td></tr><tr><td>from_product(iterables[, sortorder, names])</td><td>Build a MultiIndex from the Cartesian product of several iterables</td></tr><tr><td>from_frame(df[, sortorder, names])</td><td>Convert a DataFrame to a MultiIndex</td></tr><tr><td>set_levels(self, levels[, level, inplace, …])</td><td>Set new levels on the MultiIndex</td></tr><tr><td>set_codes(self, codes[, level, inplace, …])</td><td>Set new codes on the MultiIndex</td></tr><tr><td>sortlevel(self[, level, ascending, …])</td><td>Sort by level</td></tr><tr><td>droplevel(self[, level])</td><td>Remove the specified level</td></tr><tr><td>swaplevel(self[, i, j])</td><td>Swap level i with level j, for example the outer and inner index levels</td></tr></tbody></table><h3><span id="03x03-ti-qu-zhi"><font color="#4876FF">【03x03】Selecting Values</font></span></h3><p>For an object with a multi-level index, passing a single key selects on the outer level and returns all of the inner-level entries under it; passing two keys treats the first as the outer-level label and the second as the inner-level label. For example:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(np.random.randn(<span class="hljs-number">12</span>),index=[[<span 
class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], [<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>]])<span class="hljs-meta">&gt;&gt;&gt; </span>obja  <span class="hljs-number">0</span>    <span class="hljs-number">0.550202</span>   <span class="hljs-number">1</span>    <span class="hljs-number">0.328784</span>   <span class="hljs-number">2</span>    <span class="hljs-number">1.422690</span>b  <span class="hljs-number">0</span>   -<span class="hljs-number">1.333477</span>   <span class="hljs-number">1</span>   -<span class="hljs-number">0.933809</span>   <span class="hljs-number">2</span>   -<span class="hljs-number">0.326541</span>c  <span class="hljs-number">0</span>    <span class="hljs-number">0.663686</span>   <span class="hljs-number">1</span>    <span class="hljs-number">0.943393</span>   <span class="hljs-number">2</span>    <span class="hljs-number">0.273106</span>d  <span class="hljs-number">0</span>    <span class="hljs-number">1.354037</span>   <span class="hljs-number">1</span>   -<span class="hljs-number">2.312847</span>   <span class="hljs-number">2</span>   
-<span class="hljs-number">2.343777</span>dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;b&#x27;</span>]<span class="hljs-number">0</span>   -<span class="hljs-number">1.333477</span><span class="hljs-number">1</span>   -<span class="hljs-number">0.933809</span><span class="hljs-number">2</span>   -<span class="hljs-number">0.326541</span>dtype: float64&gt;&gt;&gt;<span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-number">1</span>]-<span class="hljs-number">0.9338094811708413</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj[:, <span class="hljs-number">2</span>]a    <span class="hljs-number">1.422690</span>b   -<span class="hljs-number">0.326541</span>c    <span class="hljs-number">0.273106</span>d   -<span class="hljs-number">2.343777</span>dtype: float64</code></pre><h3><span id="03x04-jiao-huan-fen-ceng-yu-pai-xu"><font color="#4876FF">【03x04】Swapping Levels and Sorting</font></span></h3><p>The <code>swaplevel()</code> method swaps the outer and inner index levels; it is available on the MultiIndex itself and, as in the example, on the Series that carries it. The <code>sortlevel()</code> method of a MultiIndex sorts by the outer level first and then by the inner level, and returns both the sorted index and the indexer that produces it. The order is ascending by default; setting the <code>ascending</code> parameter to False sorts in descending order. For example:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(np.random.randn(<span class="hljs-number">12</span>),index=[[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span 
class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], [<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>]])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
a  <span class="hljs-number">0</span>   -<span class="hljs-number">0.110215</span>
   <span class="hljs-number">1</span>    <span class="hljs-number">0.193075</span>
   <span class="hljs-number">2</span>   -<span class="hljs-number">1.101706</span>
b  <span class="hljs-number">0</span>   -<span class="hljs-number">1.325743</span>
   <span class="hljs-number">1</span>    <span class="hljs-number">0.528418</span>
   <span class="hljs-number">2</span>   -<span class="hljs-number">0.127081</span>
c  <span class="hljs-number">0</span>   -<span class="hljs-number">0.733822</span>
   <span class="hljs-number">1</span>    <span class="hljs-number">1.665262</span>
   <span class="hljs-number">2</span>    <span class="hljs-number">0.127073</span>
d  <span class="hljs-number">0</span>    <span class="hljs-number">1.262022</span>
   <span class="hljs-number">1</span>   -<span class="hljs-number">1.170518</span>
   <span class="hljs-number">2</span>    <span class="hljs-number">0.966334</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.swaplevel()
<span class="hljs-number">0</span>  a   -<span class="hljs-number">0.110215</span>
<span class="hljs-number">1</span>  a    <span class="hljs-number">0.193075</span>
<span class="hljs-number">2</span>  a   -<span class="hljs-number">1.101706</span>
<span class="hljs-number">0</span>  b   -<span class="hljs-number">1.325743</span>
<span class="hljs-number">1</span>  b    <span class="hljs-number">0.528418</span>
<span class="hljs-number">2</span>  b   -<span class="hljs-number">0.127081</span>
<span class="hljs-number">0</span>  c   -<span class="hljs-number">0.733822</span>
<span class="hljs-number">1</span>  c    <span class="hljs-number">1.665262</span>
<span class="hljs-number">2</span>  c    <span class="hljs-number">0.127073</span>
<span class="hljs-number">0</span>  d    <span class="hljs-number">1.262022</span>
<span class="hljs-number">1</span>  d   -<span class="hljs-number">1.170518</span>
<span class="hljs-number">2</span>  d    <span class="hljs-number">0.966334</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.swaplevel().index.sortlevel()
(MultiIndex([(<span class="hljs-number">0</span>, <span class="hljs-string">&#x27;a&#x27;</span>),
            (<span class="hljs-number">0</span>, <span class="hljs-string">&#x27;b&#x27;</span>),
            (<span class="hljs-number">0</span>, <span class="hljs-string">&#x27;c&#x27;</span>),
            (<span class="hljs-number">0</span>, <span class="hljs-string">&#x27;d&#x27;</span>),
            (<span class="hljs-number">1</span>, <span class="hljs-string">&#x27;a&#x27;</span>),
            (<span class="hljs-number">1</span>, <span class="hljs-string">&#x27;b&#x27;</span>),
            (<span class="hljs-number">1</span>, <span class="hljs-string">&#x27;c&#x27;</span>),
            (<span class="hljs-number">1</span>, <span class="hljs-string">&#x27;d&#x27;</span>),
            (<span class="hljs-number">2</span>, <span class="hljs-string">&#x27;a&#x27;</span>),
            (<span class="hljs-number">2</span>, <span class="hljs-string">&#x27;b&#x27;</span>),
            (<span 
class="hljs-number">2</span>, <span class="hljs-string">&#x27;c&#x27;</span>),
            (<span class="hljs-number">2</span>, <span class="hljs-string">&#x27;d&#x27;</span>)],
           ), array([ <span class="hljs-number">0</span>,  <span class="hljs-number">3</span>,  <span class="hljs-number">6</span>,  <span class="hljs-number">9</span>,  <span class="hljs-number">1</span>,  <span class="hljs-number">4</span>,  <span class="hljs-number">7</span>, <span class="hljs-number">10</span>,  <span class="hljs-number">2</span>,  <span class="hljs-number">5</span>,  <span class="hljs-number">8</span>, <span class="hljs-number">11</span>], dtype=int32))</code></pre><hr><pre><code class="hljs yaml"><span class="hljs-string">This is anti-scraping text; readers can ignore it.</span>
<span class="hljs-string">This article was first published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106758103</span>
<span class="hljs-string">Reposting without permission is prohibited. Malicious reposters bear the consequences. Respect original work and stay away from plagiarism.</span></code></pre><hr>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;Table of Contents&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-01x00-han-shu-ying-yong-he-ying-she-font&quot;&gt;</summary>
      
    
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    <category term="Pandas" scheme="https://www.itbob.cn/tags/Pandas/"/>
    
  </entry>
  
  <entry>
    <title>The Python Data Analysis Trio, Pandas (3): Arithmetic Operations and Handling Missing Values</title>
    <link href="https://www.itbob.cn/article/027/"/>
    <id>https://www.itbob.cn/article/027/</id>
    <published>2020-06-14T14:42:53.000Z</published>
    <updated>2022-05-22T12:38:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">Table of Contents</span></h2><!-- toc --><ul><li><a href="#font-color-ff0000-01x00-pandas-suan-zhu-yun-suan-font"><font color="#FF0000">【01x00】Pandas Arithmetic Operations</font></a><ul><li><a href="#font-color-4876ff-01x01-shi-yong-numpy-tong-yong-han-shu-font"><font color="#4876FF">【01x01】Using NumPy Universal Functions</font></a></li><li><a href="#font-color-4876ff-01x02-shu-ju-dui-qi-font"><font color="#4876FF">【01x02】Data Alignment</font></a></li><li><a href="#font-color-4876ff-01x03-dataframe-yu-series-zhi-jian-de-yun-suan-font"><font color="#4876FF">【01x03】Operations Between DataFrame and Series</font></a></li><li><a href="#font-color-4876ff-01x04-pandas-suan-zhu-fang-fa-font"><font color="#4876FF">【01x04】Pandas Arithmetic Methods</font></a></li></ul></li><li><a href="#font-color-ff0000-02x00-chu-li-que-shi-zhi-font"><font color="#FF0000">【02x00】Handling Missing Values</font></a><ul><li><a href="#font-color-4876ff-02x01-fill-value-zhi-ding-zhi-yu-que-shi-zhi-jin-xing-yun-suan-font"><font color="#4876FF">【02x01】fill_value: Arithmetic with a Fill Value for Missing Data</font></a></li><li><a href="#font-color-4876ff-02x02-isnull-notnull-pan-duan-que-shi-zhi-font"><font color="#4876FF">【02x02】isnull() / notnull(): Detecting Missing Values</font></a></li><li><a href="#font-color-4876ff-02x03-dropna-shan-chu-que-shi-zhi-font"><font color="#4876FF">【02x03】dropna(): Dropping Missing Values</font></a></li><li><a href="#font-color-4876ff-02x04-fillna-tian-chong-que-shi-zhi-font"><font color="#4876FF">【02x04】fillna(): Filling Missing Values</font></a></li></ul></li></ul><!-- tocstop --><hr>
<p>Pandas series articles:</p>
<ul><li><a href="https://www.itbob.cn/article/025/">The Python Data Analysis Trio, Pandas (1): Getting to Know Pandas and Its Series and DataFrame Objects</a></li><li><a href="https://www.itbob.cn/article/026/">The Python Data Analysis Trio, Pandas (2): Index Objects and Indexing Operations</a></li><li><a href="https://www.itbob.cn/article/027/">The Python Data Analysis Trio, Pandas (3): Arithmetic Operations and Handling Missing Values</a></li><li><a href="https://www.itbob.cn/article/028/">The Python Data Analysis Trio, Pandas (4): Function Application, Mapping, Sorting, and Hierarchical Indexing</a></li><li><a href="https://www.itbob.cn/article/029/">The Python Data Analysis Trio, Pandas (5): Statistical Computation and Descriptive Statistics</a></li><li><a href="https://www.itbob.cn/article/030/">The Python Data Analysis Trio, Pandas (6): GroupBy Split, Apply, and Combine</a></li><li><a href="https://www.itbob.cn/article/031/">The Python Data Analysis Trio, Pandas (7): Merging Datasets</a></li><li><a href="https://www.itbob.cn/article/032/">The Python Data Analysis Trio, Pandas (8): Reshaping Data, Handling Duplicates, and Replacing Values</a></li><li><a href="https://www.itbob.cn/article/033/">The Python Data Analysis Trio, Pandas (9): Time Series</a></li><li><a href="https://www.itbob.cn/article/034/">The Python Data Analysis Trio, Pandas (10): Reading and Writing Data</a></li></ul><hr>
<p>Columns:</p>
<ul><li>NumPy column: <a href="https://itrhx.blog.csdn.net/category_9780393.html">https://itrhx.blog.csdn.net/category_9780393.html</a></li><li>Pandas column: <a href="https://itrhx.blog.csdn.net/category_9780397.html">https://itrhx.blog.csdn.net/category_9780397.html</a></li><li>Matplotlib column: <a href="https://itrhx.blog.csdn.net/category_9780418.html">https://itrhx.blog.csdn.net/category_9780418.html</a></li></ul>
<br>Recommended learning resources and sites:<br><br>
<ul><li>NumPy Chinese-language documentation site: <a href="https://www.numpy.org.cn/">https://www.numpy.org.cn/</a></li><li>Pandas Chinese-language documentation site: <a href="https://www.pypandas.cn/">https://www.pypandas.cn/</a></li><li>Matplotlib Chinese-language documentation site: <a href="https://www.matplotlib.org.cn/">https://www.matplotlib.org.cn/</a></li><li>NumPy, Matplotlib, and Pandas cheat sheets: <a href="https://github.com/TRHX/Python-quick-reference-table">https://github.com/TRHX/Python-quick-reference-table</a></li></ul><hr><pre><code class="hljs yaml"><span class="hljs-string">This is anti-scraping text; readers can ignore it.</span>
<span class="hljs-string">This article was first published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106743778</span>
<span class="hljs-string">Reposting without permission is prohibited. Malicious reposters bear the consequences. Respect original work and stay away from plagiarism.</span></code></pre><hr>
<h2><span id="01x00-pandas-suan-zhu-yun-suan"><font color="#FF0000">【01x00】Pandas Arithmetic Operations</font></span></h2>
<p>Pandas builds on NumPy's functionality. One of NumPy's core capabilities is fast element-wise computation, covering both basic arithmetic (addition, subtraction, multiplication, division) and more complex operations (trigonometric, exponential, and logarithmic functions, among others). See the NumPy series of articles for details.</p>
<h3><span id="01x01-shi-yong-numpy-tong-yong-han-shu"><font color="#4876FF">【01x01】Using NumPy Universal Functions</font></span></h3>
<p>Because Pandas is built on top of NumPy, NumPy's universal functions (ufuncs) also work on Pandas Series and DataFrame objects, as shown below:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>rng = np.random.RandomState(<span class="hljs-number">42</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>ser = pd.Series(rng.randint(<span class="hljs-number">0</span>, <span class="hljs-number">10</span>, <span class="hljs-number">4</span>))
<span class="hljs-meta">&gt;&gt;&gt; </span>ser
<span class="hljs-number">0</span>    <span class="hljs-number">6</span>
<span class="hljs-number">1</span>    <span class="hljs-number">3</span>
<span class="hljs-number">2</span>    <span class="hljs-number">7</span>
<span class="hljs-number">3</span>    <span class="hljs-number">4</span>
dtype: int32
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(rng.randint(<span class="hljs-number">0</span>, <span class="hljs-number">10</span>, (<span class="hljs-number">3</span>, <span class="hljs-number">4</span>)), columns=[<span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>, <span class="hljs-string">&#x27;C&#x27;</span>, <span class="hljs-string">&#x27;D&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
   A  B  C  D
<span class="hljs-number">0</span>  <span class="hljs-number">6</span>  <span class="hljs-number">9</span>  <span class="hljs-number">2</span>  <span class="hljs-number">6</span>
<span class="hljs-number">1</span>  <span class="hljs-number">7</span>  <span class="hljs-number">4</span>  <span class="hljs-number">3</span>  <span class="hljs-number">7</span>
<span class="hljs-number">2</span>  <span class="hljs-number">7</span>  <span class="hljs-number">2</span>  <span class="hljs-number">5</span>  <span class="hljs-number">4</span></code></pre>
<p>Applying a NumPy ufunc produces another Pandas object with the index preserved:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>rng = np.random.RandomState(<span class="hljs-number">42</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>ser = pd.Series(rng.randint(<span class="hljs-number">0</span>, <span class="hljs-number">10</span>, <span class="hljs-number">4</span>))
<span class="hljs-meta">&gt;&gt;&gt; </span>ser
<span class="hljs-number">0</span>    <span class="hljs-number">6</span>
<span class="hljs-number">1</span>    <span class="hljs-number">3</span>
<span class="hljs-number">2</span>    <span class="hljs-number">7</span>
<span class="hljs-number">3</span>    <span class="hljs-number">4</span>
dtype: int32
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>np.exp(ser)
<span class="hljs-number">0</span>     <span class="hljs-number">403.428793</span>
<span class="hljs-number">1</span>      <span class="hljs-number">20.085537</span>
<span class="hljs-number">2</span>    <span class="hljs-number">1096.633158</span>
<span class="hljs-number">3</span>      <span class="hljs-number">54.598150</span>
dtype: float64</code></pre>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span 
class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-comment"># rng is the RandomState(42) generator created above; ser has already been drawn from it</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(rng.randint(<span class="hljs-number">0</span>, <span class="hljs-number">10</span>, (<span class="hljs-number">3</span>, <span class="hljs-number">4</span>)), columns=[<span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>, <span class="hljs-string">&#x27;C&#x27;</span>, <span class="hljs-string">&#x27;D&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>np.sin(obj * np.pi / <span class="hljs-number">4</span>)
          A             B         C             D
<span class="hljs-number">0</span> -<span class="hljs-number">1.000000</span>  <span class="hljs-number">7.071068e-01</span>  <span class="hljs-number">1.000000</span> -<span class="hljs-number">1.000000e+00</span>
<span class="hljs-number">1</span> -<span class="hljs-number">0.707107</span>  <span class="hljs-number">1.224647e-16</span>  <span class="hljs-number">0.707107</span> -<span class="hljs-number">7.071068e-01</span>
<span class="hljs-number">2</span> -<span class="hljs-number">0.707107</span>  <span class="hljs-number">1.000000e+00</span> -<span class="hljs-number">0.707107</span>  <span class="hljs-number">1.224647e-16</span></code></pre>
<h3><span id="01x02-shu-ju-dui-qi"><font color="#4876FF">【01x02】Data Alignment</font></span></h3>
<p>One of Pandas' most important features is that it can perform arithmetic on objects with different indexes. When two objects are added and their indexes differ, the index of the result is the union of the two indexes. Automatic data alignment introduces missing values, i.e. <font color="#FF0000">NaN</font>, at index labels that do not overlap, and missing values propagate through subsequent arithmetic.</p>
<p>Data alignment with Series objects:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.Series([<span class="hljs-number">7.3</span>, -<span class="hljs-number">2.5</span>, <span class="hljs-number">3.4</span>, <span class="hljs-number">1.5</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span 
class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;e&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.Series([-<span class="hljs-number">2.1</span>, <span class="hljs-number">3.6</span>, -<span class="hljs-number">1.5</span>, <span class="hljs-number">4</span>, <span class="hljs-number">3.1</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;e&#x27;</span>, <span class="hljs-string">&#x27;f&#x27;</span>, <span class="hljs-string">&#x27;g&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
a    <span class="hljs-number">7.3</span>
c   -<span class="hljs-number">2.5</span>
d    <span class="hljs-number">3.4</span>
e    <span class="hljs-number">1.5</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
a   -<span class="hljs-number">2.1</span>
c    <span class="hljs-number">3.6</span>
e   -<span class="hljs-number">1.5</span>
f    <span class="hljs-number">4.0</span>
g    <span class="hljs-number">3.1</span>
dtype: float64
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 + obj2
a    <span class="hljs-number">5.2</span>
c    <span class="hljs-number">1.1</span>
d    NaN
e    <span class="hljs-number">0.0</span>
f    NaN
g    NaN
dtype: float64</code></pre>
<p>For DataFrame objects, alignment happens on both rows and columns:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.DataFrame(np.arange(<span class="hljs-number">9.</span>).reshape((<span class="hljs-number">3</span>, <span class="hljs-number">3</span>)), columns=<span class="hljs-built_in">list</span>(<span class="hljs-string">&#x27;bcd&#x27;</span>), index=[<span class="hljs-string">&#x27;Ohio&#x27;</span>, <span 
class="hljs-string">&#x27;Texas&#x27;</span>, <span class="hljs-string">&#x27;Colorado&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.DataFrame(np.arange(<span class="hljs-number">12.</span>).reshape((<span class="hljs-number">4</span>, <span class="hljs-number">3</span>)), columns=<span class="hljs-built_in">list</span>(<span class="hljs-string">&#x27;bde&#x27;</span>), index=[<span class="hljs-string">&#x27;Utah&#x27;</span>, <span class="hljs-string">&#x27;Ohio&#x27;</span>, <span class="hljs-string">&#x27;Texas&#x27;</span>, <span class="hljs-string">&#x27;Oregon&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
            b    c    d
Ohio      <span class="hljs-number">0.0</span>  <span class="hljs-number">1.0</span>  <span class="hljs-number">2.0</span>
Texas     <span class="hljs-number">3.0</span>  <span class="hljs-number">4.0</span>  <span class="hljs-number">5.0</span>
Colorado  <span class="hljs-number">6.0</span>  <span class="hljs-number">7.0</span>  <span class="hljs-number">8.0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
          b     d     e
Utah    <span class="hljs-number">0.0</span>   <span class="hljs-number">1.0</span>   <span class="hljs-number">2.0</span>
Ohio    <span class="hljs-number">3.0</span>   <span class="hljs-number">4.0</span>   <span class="hljs-number">5.0</span>
Texas   <span class="hljs-number">6.0</span>   <span class="hljs-number">7.0</span>   <span class="hljs-number">8.0</span>
Oregon  <span class="hljs-number">9.0</span>  <span class="hljs-number">10.0</span>  <span class="hljs-number">11.0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 + obj2
            b   c     d   e
Colorado  NaN NaN   NaN NaN
Ohio      <span class="hljs-number">3.0</span> NaN   <span class="hljs-number">6.0</span> NaN
Oregon    NaN NaN   NaN NaN
Texas     <span class="hljs-number">9.0</span> NaN  <span 
class="hljs-number">12.0</span> NaN
Utah      NaN NaN   NaN NaN</code></pre>
<h3><span id="01x03-dataframe-yu-series-zhi-jian-de-yun-suan"><font color="#4876FF">【01x03】Operations Between DataFrame and Series</font></span></h3>
<p>As with NumPy arrays of different dimensions, arithmetic between a DataFrame and a Series follows well-defined rules based on broadcasting (see <a href="https://itrhx.blog.csdn.net/article/details/104988137">The Python Data Analysis Trio, NumPy (2): Array Indexing / Slicing / Broadcasting / Concatenation / Splitting</a>). First, recall how NumPy handles operations between arrays of different dimensions:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>arr = np.arange(<span class="hljs-number">12.</span>).reshape((<span class="hljs-number">3</span>, <span class="hljs-number">4</span>))
<span class="hljs-meta">&gt;&gt;&gt; </span>arr
array([[ <span class="hljs-number">0.</span>,  <span class="hljs-number">1.</span>,  <span class="hljs-number">2.</span>,  <span class="hljs-number">3.</span>],
       [ <span class="hljs-number">4.</span>,  <span class="hljs-number">5.</span>,  <span class="hljs-number">6.</span>,  <span class="hljs-number">7.</span>],
       [ <span class="hljs-number">8.</span>,  <span class="hljs-number">9.</span>, <span class="hljs-number">10.</span>, <span class="hljs-number">11.</span>]])
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>arr[<span class="hljs-number">0</span>]
array([<span class="hljs-number">0.</span>, <span class="hljs-number">1.</span>, <span class="hljs-number">2.</span>, <span class="hljs-number">3.</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>arr - arr[<span class="hljs-number">0</span>]
array([[<span class="hljs-number">0.</span>, <span class="hljs-number">0.</span>, <span class="hljs-number">0.</span>, <span class="hljs-number">0.</span>],
       [<span class="hljs-number">4.</span>, <span class="hljs-number">4.</span>, <span class="hljs-number">4.</span>, <span 
class="hljs-number">4.</span>],
       [<span class="hljs-number">8.</span>, <span class="hljs-number">8.</span>, <span class="hljs-number">8.</span>, <span class="hljs-number">8.</span>]])</code></pre>
<p>The subtraction is applied to every row; this is NumPy broadcasting. Operations between a DataFrame and a Series work similarly: by default, arithmetic between them matches the index of the Series against the columns of the DataFrame and broadcasts down the rows:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>frame = pd.DataFrame(np.arange(<span class="hljs-number">12.</span>).reshape((<span class="hljs-number">4</span>, <span class="hljs-number">3</span>)), columns=<span class="hljs-built_in">list</span>(<span class="hljs-string">&#x27;bde&#x27;</span>), index=[<span class="hljs-string">&#x27;AA&#x27;</span>, <span class="hljs-string">&#x27;BB&#x27;</span>, <span class="hljs-string">&#x27;CC&#x27;</span>, <span class="hljs-string">&#x27;DD&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>frame
      b     d     e
AA  <span class="hljs-number">0.0</span>   <span class="hljs-number">1.0</span>   <span class="hljs-number">2.0</span>
BB  <span class="hljs-number">3.0</span>   <span class="hljs-number">4.0</span>   <span class="hljs-number">5.0</span>
CC  <span class="hljs-number">6.0</span>   <span class="hljs-number">7.0</span>   <span class="hljs-number">8.0</span>
DD  <span class="hljs-number">9.0</span>  <span class="hljs-number">10.0</span>  <span class="hljs-number">11.0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>series = frame.iloc[<span class="hljs-number">0</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>series
b    <span class="hljs-number">0.0</span>
d    <span class="hljs-number">1.0</span>
e    <span class="hljs-number">2.0</span>
Name: AA, dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>frame - series
      b    d    e
AA  <span class="hljs-number">0.0</span>  <span class="hljs-number">0.0</span>  <span class="hljs-number">0.0</span>
BB  <span class="hljs-number">3.0</span>  <span class="hljs-number">3.0</span>  <span class="hljs-number">3.0</span>
CC  <span class="hljs-number">6.0</span>  <span class="hljs-number">6.0</span>  <span class="hljs-number">6.0</span>
DD  <span class="hljs-number">9.0</span>  <span class="hljs-number">9.0</span>  <span class="hljs-number">9.0</span></code></pre>
<p>If an index value is not found in either the DataFrame's columns or the Series' index, both objects are reindexed to form the union:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>frame = pd.DataFrame(np.arange(<span class="hljs-number">12.</span>).reshape((<span class="hljs-number">4</span>, <span class="hljs-number">3</span>)), columns=<span class="hljs-built_in">list</span>(<span class="hljs-string">&#x27;bde&#x27;</span>), index=[<span class="hljs-string">&#x27;AA&#x27;</span>, <span class="hljs-string">&#x27;BB&#x27;</span>, <span class="hljs-string">&#x27;CC&#x27;</span>, <span class="hljs-string">&#x27;DD&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>frame
      b     d     e
AA  <span class="hljs-number">0.0</span>   <span class="hljs-number">1.0</span>   <span class="hljs-number">2.0</span>
BB  <span class="hljs-number">3.0</span>   <span class="hljs-number">4.0</span>   <span class="hljs-number">5.0</span>
CC  <span class="hljs-number">6.0</span>   <span class="hljs-number">7.0</span>   <span class="hljs-number">8.0</span>
DD  <span class="hljs-number">9.0</span>  <span class="hljs-number">10.0</span>  <span 
class="hljs-number">11.0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>series = pd.Series(<span class="hljs-built_in">range</span>(<span class="hljs-number">3</span>), index=[<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;e&#x27;</span>, <span class="hljs-string">&#x27;f&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>series
b    <span class="hljs-number">0</span>
e    <span class="hljs-number">1</span>
f    <span class="hljs-number">2</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>frame + series
      b   d     e   f
AA  <span class="hljs-number">0.0</span> NaN   <span class="hljs-number">3.0</span> NaN
BB  <span class="hljs-number">3.0</span> NaN   <span class="hljs-number">6.0</span> NaN
CC  <span class="hljs-number">6.0</span> NaN   <span class="hljs-number">9.0</span> NaN
DD  <span class="hljs-number">9.0</span> NaN  <span class="hljs-number">12.0</span> NaN</code></pre>
<p>To match on rows and broadcast across columns instead, you must use one of the arithmetic methods; the <code>axis</code> argument passed to the method is the axis to match on. In the example below, the goal is to match on the DataFrame's row index (<code>axis=&#x27;index&#x27;</code> or <code>axis=0</code>) and broadcast:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>frame = pd.DataFrame(np.arange(<span class="hljs-number">12.</span>).reshape((<span class="hljs-number">4</span>, <span class="hljs-number">3</span>)), columns=<span class="hljs-built_in">list</span>(<span class="hljs-string">&#x27;bde&#x27;</span>), index=[<span class="hljs-string">&#x27;AA&#x27;</span>, <span class="hljs-string">&#x27;BB&#x27;</span>, <span class="hljs-string">&#x27;CC&#x27;</span>, <span class="hljs-string">&#x27;DD&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>frame
      b     d     e
AA  <span class="hljs-number">0.0</span>   <span class="hljs-number">1.0</span>   <span class="hljs-number">2.0</span>
BB  <span class="hljs-number">3.0</span>   <span class="hljs-number">4.0</span>   <span class="hljs-number">5.0</span>
CC  <span class="hljs-number">6.0</span>   <span class="hljs-number">7.0</span>   <span class="hljs-number">8.0</span>
DD  <span class="hljs-number">9.0</span>  <span class="hljs-number">10.0</span>  <span class="hljs-number">11.0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>series = frame[<span class="hljs-string">&#x27;d&#x27;</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>series
AA     <span class="hljs-number">1.0</span>
BB     <span class="hljs-number">4.0</span>
CC     <span class="hljs-number">7.0</span>
DD    <span class="hljs-number">10.0</span>
Name: d, dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>frame.sub(series, axis=<span class="hljs-string">&#x27;index&#x27;</span>)
      b    d    e
AA -<span class="hljs-number">1.0</span>  <span class="hljs-number">0.0</span>  <span class="hljs-number">1.0</span>
BB -<span class="hljs-number">1.0</span>  <span class="hljs-number">0.0</span>  <span class="hljs-number">1.0</span>
CC -<span class="hljs-number">1.0</span>  <span class="hljs-number">0.0</span>  <span class="hljs-number">1.0</span>
DD -<span class="hljs-number">1.0</span>  <span class="hljs-number">0.0</span>  <span class="hljs-number">1.0</span></code></pre>
<h3><span id="01x04-pandas-suan-zhu-fang-fa"><font color="#4876FF">【01x04】Pandas Arithmetic Methods</font></span></h3>
<p>The full set of Pandas arithmetic methods is listed in the table below:</p>
<table><thead><tr><th>Method</th><th>Reversed counterpart</th><th>Description</th></tr></thead><tbody><tr><td>add()</td><td>radd()</td><td>Addition (+)</td></tr><tr><td>sub(), subtract()</td><td>rsub()</td><td>Subtraction (-)</td></tr><tr><td>mul(), multiply()</td><td>rmul()</td><td>Multiplication (*)</td></tr><tr><td>pow()</td><td>rpow()</td><td>Exponentiation (**)</td></tr><tr><td>truediv(), div(), divide()</td><td>rtruediv(), rdiv()</td><td>Division (/)</td></tr><tr><td>floordiv()</td><td>rfloordiv()</td><td>Floor division (//)</td></tr><tr><td>mod()</td><td>rmod()</td><td>Modulo (%)</td></tr></tbody></table>
<p>Each reversed counterpart is the original method name prefixed with <code>r</code>, and it swaps the operands, so <code>obj.rdiv(1)</code> is equivalent to <code>1 / obj</code>:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(np.arange(<span class="hljs-number">12.</span>).reshape((<span class="hljs-number">3</span>, <span class="hljs-number">4</span>)), columns=<span class="hljs-built_in">list</span>(<span class="hljs-string">&#x27;abcd&#x27;</span>))
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
     a    b     c     d
<span class="hljs-number">0</span>  <span class="hljs-number">0.0</span>  <span class="hljs-number">1.0</span>   <span class="hljs-number">2.0</span>   <span class="hljs-number">3.0</span>
<span class="hljs-number">1</span>  <span class="hljs-number">4.0</span>  <span class="hljs-number">5.0</span>   <span class="hljs-number">6.0</span>   <span class="hljs-number">7.0</span>
<span class="hljs-number">2</span>  <span class="hljs-number">8.0</span>  <span class="hljs-number">9.0</span>  <span class="hljs-number">10.0</span>  <span class="hljs-number">11.0</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-number">1</span> / obj
       a         b         c         d
<span class="hljs-number">0</span>    inf  <span class="hljs-number">1.000000</span>  <span class="hljs-number">0.500000</span>  <span class="hljs-number">0.333333</span>
<span 
class="hljs-number">1</span>  <span class="hljs-number">0.250</span>  <span class="hljs-number">0.200000</span>  <span class="hljs-number">0.166667</span>  <span class="hljs-number">0.142857</span>
<span class="hljs-number">2</span>  <span class="hljs-number">0.125</span>  <span class="hljs-number">0.111111</span>  <span class="hljs-number">0.100000</span>  <span class="hljs-number">0.090909</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.rdiv(<span class="hljs-number">1</span>)
       a         b         c         d
<span class="hljs-number">0</span>    inf  <span class="hljs-number">1.000000</span>  <span class="hljs-number">0.500000</span>  <span class="hljs-number">0.333333</span>
<span class="hljs-number">1</span>  <span class="hljs-number">0.250</span>  <span class="hljs-number">0.200000</span>  <span class="hljs-number">0.166667</span>  <span class="hljs-number">0.142857</span>
<span class="hljs-number">2</span>  <span class="hljs-number">0.125</span>  <span class="hljs-number">0.111111</span>  <span class="hljs-number">0.100000</span>  <span class="hljs-number">0.090909</span></code></pre><hr>
<h2><span id="02x00-chu-li-que-shi-zhi"><font color="#FF0000">【02x00】Handling Missing Values</font></span></h2>
<p>Real-world data is rarely clean and tidy, and many datasets have missing data. Missing values mainly appear in three forms: null, NaN (also written NAN or nan), or NA.</p>
<h3><span id="02x01-fill-value-zhi-ding-zhi-yu-que-shi-zhi-jin-xing-yun-suan"><font color="#4876FF">【02x01】fill_value: Arithmetic with a Fill Value for Missing Data</font></span></h3>
<p>When using arithmetic methods such as <code>add</code>, <code>sub</code>, <code>div</code>, and <code>mul</code>, the <code>fill_value</code> parameter specifies a fill value: data that is missing on one side, including labels that do not align, is replaced by the fill value before the operation. If a value is missing on both sides, the result is still NaN.</p>
<p>Usage with Series:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.Series([<span class="hljs-number">6</span>, <span class="hljs-number">7</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1
<span class="hljs-number">0</span>    <span class="hljs-number">1</span>
<span class="hljs-number">1</span>    <span class="hljs-number">2</span>
<span class="hljs-number">2</span>    <span class="hljs-number">3</span>
<span class="hljs-number">3</span>    <span class="hljs-number">4</span>
<span class="hljs-number">4</span>    <span class="hljs-number">5</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
<span class="hljs-number">0</span>    <span class="hljs-number">6</span>
<span class="hljs-number">1</span>    <span class="hljs-number">7</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1.add(obj2)
<span class="hljs-number">0</span>    <span class="hljs-number">7.0</span>
<span class="hljs-number">1</span>    <span class="hljs-number">9.0</span>
<span class="hljs-number">2</span>    NaN
<span class="hljs-number">3</span>    NaN
<span class="hljs-number">4</span>    NaN
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj1.add(obj2, fill_value=-<span class="hljs-number">1</span>)
<span class="hljs-number">0</span>    <span class="hljs-number">7.0</span>
<span class="hljs-number">1</span>    <span 
class="hljs-number">9.0</span><span class="hljs-number">2</span>    <span class="hljs-number">2.0</span><span class="hljs-number">3</span>    <span class="hljs-number">3.0</span><span class="hljs-number">4</span>    <span class="hljs-number">4.0</span>dtype: float64</code></pre><p>DataFrame 中的应用：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj1 = pd.DataFrame(np.arange(<span class="hljs-number">12.</span>).reshape((<span class="hljs-number">3</span>, <span class="hljs-number">4</span>)), columns=<span class="hljs-built_in">list</span>(<span class="hljs-string">&#x27;abcd&#x27;</span>))<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = pd.DataFrame(np.arange(<span class="hljs-number">20.</span>).reshape((<span class="hljs-number">4</span>, <span class="hljs-number">5</span>)), columns=<span class="hljs-built_in">list</span>(<span class="hljs-string">&#x27;abcde&#x27;</span>))<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj2.loc[<span class="hljs-number">1</span>, <span class="hljs-string">&#x27;b&#x27;</span>] = np.nan<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj1     a    b     c     d<span class="hljs-number">0</span>  <span class="hljs-number">0.0</span>  <span class="hljs-number">1.0</span>   <span class="hljs-number">2.0</span>   <span class="hljs-number">3.0</span><span class="hljs-number">1</span>  <span class="hljs-number">4.0</span>  <span class="hljs-number">5.0</span>   <span class="hljs-number">6.0</span>   <span class="hljs-number">7.0</span><span class="hljs-number">2</span>  <span class="hljs-number">8.0</span>  <span class="hljs-number">9.0</span>  <span 
class="hljs-number">10.0</span>  <span class="hljs-number">11.0</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj2      a     b     c     d     e<span class="hljs-number">0</span>   <span class="hljs-number">0.0</span>   <span class="hljs-number">1.0</span>   <span class="hljs-number">2.0</span>   <span class="hljs-number">3.0</span>   <span class="hljs-number">4.0</span><span class="hljs-number">1</span>   <span class="hljs-number">5.0</span>   NaN   <span class="hljs-number">7.0</span>   <span class="hljs-number">8.0</span>   <span class="hljs-number">9.0</span><span class="hljs-number">2</span>  <span class="hljs-number">10.0</span>  <span class="hljs-number">11.0</span>  <span class="hljs-number">12.0</span>  <span class="hljs-number">13.0</span>  <span class="hljs-number">14.0</span><span class="hljs-number">3</span>  <span class="hljs-number">15.0</span>  <span class="hljs-number">16.0</span>  <span class="hljs-number">17.0</span>  <span class="hljs-number">18.0</span>  <span class="hljs-number">19.0</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj1 + obj2      a     b     c     d   e<span class="hljs-number">0</span>   <span class="hljs-number">0.0</span>   <span class="hljs-number">2.0</span>   <span class="hljs-number">4.0</span>   <span class="hljs-number">6.0</span> NaN<span class="hljs-number">1</span>   <span class="hljs-number">9.0</span>   NaN  <span class="hljs-number">13.0</span>  <span class="hljs-number">15.0</span> NaN<span class="hljs-number">2</span>  <span class="hljs-number">18.0</span>  <span class="hljs-number">20.0</span>  <span class="hljs-number">22.0</span>  <span class="hljs-number">24.0</span> NaN<span class="hljs-number">3</span>   NaN   NaN   NaN   NaN NaN<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj1.add(obj2, fill_value=<span class="hljs-number">10</span>)      a     b     c     d     
e
<span class="hljs-number">0</span>   <span class="hljs-number">0.0</span>   <span class="hljs-number">2.0</span>   <span class="hljs-number">4.0</span>   <span class="hljs-number">6.0</span>  <span class="hljs-number">14.0</span>
<span class="hljs-number">1</span>   <span class="hljs-number">9.0</span>  <span class="hljs-number">15.0</span>  <span class="hljs-number">13.0</span>  <span class="hljs-number">15.0</span>  <span class="hljs-number">19.0</span>
<span class="hljs-number">2</span>  <span class="hljs-number">18.0</span>  <span class="hljs-number">20.0</span>  <span class="hljs-number">22.0</span>  <span class="hljs-number">24.0</span>  <span class="hljs-number">24.0</span>
<span class="hljs-number">3</span>  <span class="hljs-number">25.0</span>  <span class="hljs-number">26.0</span>  <span class="hljs-number">27.0</span>  <span class="hljs-number">28.0</span>  <span class="hljs-number">29.0</span></code></pre>
<h3><span id="02x02-isnull-notnull-pan-duan-que-shi-zhi"><font color="#4876FF">【02x02】isnull() / notnull() 判断缺失值</font></span></h3><p><code>isnull()</code>：为缺失值时为 <code>True</code>，否则为 <code>False</code>；</p><p><code>notnull()</code>：为缺失值时为 <code>False</code>，否则为 <code>True</code>。</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, np.nan, <span class="hljs-string">&#x27;hello&#x27;</span>, <span class="hljs-literal">None</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
<span class="hljs-number">0</span>        <span class="hljs-number">1</span>
<span class="hljs-number">1</span>      NaN
<span class="hljs-number">2</span>    hello
<span class="hljs-number">3</span>     <span class="hljs-literal">None</span>
dtype: <span class="hljs-built_in">object</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.isnull()
<span class="hljs-number">0</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">1</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">2</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">3</span>     <span class="hljs-literal">True</span>
dtype: <span class="hljs-built_in">bool</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.notnull()
<span class="hljs-number">0</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">1</span>    <span class="hljs-literal">False</span>
<span class="hljs-number">2</span>     <span class="hljs-literal">True</span>
<span class="hljs-number">3</span>    <span class="hljs-literal">False</span>
dtype: <span class="hljs-built_in">bool</span></code></pre>
<h3><span id="02x03-dropna-shan-chu-que-shi-zhi"><font color="#4876FF">【02x03】dropna() 删除缺失值</font></span></h3><p><code>dropna()</code> 方法用于返回一个删除了缺失值的新 Series 或 DataFrame 对象。</p><p>在 Series 对象当中，<code>dropna()</code> 方法的语法如下（其他参数用法可参考在 DataFrame 中的应用）：</p><p><code>Series.dropna(self, axis=0, inplace=False, how=None)</code></p><p>官方文档：<a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.dropna.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.dropna.html</a></p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, np.nan, <span class="hljs-string">&#x27;hello&#x27;</span>, <span class="hljs-literal">None</span>])
<span class="hljs-meta">&gt;&gt;&gt; 
</span>obj<span class="hljs-number">0</span>        <span class="hljs-number">1</span><span class="hljs-number">1</span>      NaN<span class="hljs-number">2</span>    hello<span class="hljs-number">3</span>     <span class="hljs-literal">None</span>dtype: <span class="hljs-built_in">object</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.dropna()<span class="hljs-number">0</span>        <span class="hljs-number">1</span><span class="hljs-number">2</span>    hellodtype: <span class="hljs-built_in">object</span></code></pre><p>在 DataFrame 对象中，<code>dropna()</code> 方法的语法如下：</p><p><code>DataFrame.dropna(self, axis=0, how='any', thresh=None, subset=None, inplace=False)</code></p><p>官方文档：<a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.dropna.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.dropna.html</a></p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>axis</td><td>确定是否删除包含缺失值的行或列<br><code>0</code> 或 <code>'index'</code>：删除包含缺失值的行。<code>1</code> 或 <code>'columns'</code>：删除包含缺失值的列</td></tr><tr><td>how</td><td><code>'any'</code>：如果存在任何NA值，则删除该行或列。<code>'all'</code>：如果所有值都是NA，则删除该行或列</td></tr><tr><td>thresh</td><td>设置行或列中<strong>非缺失值</strong>的最小数量</td></tr></tbody></table><p>不传递任何参数，将会删除任何包含缺失值的整行数据：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1</span>, np.nan, <span class="hljs-number">2</span>], [<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">5</span>], [np.nan, <span class="hljs-number">4</span>, <span class="hljs-number">6</span>]])<span class="hljs-meta">&gt;&gt;&gt; 
</span>obj     <span class="hljs-number">0</span>    <span class="hljs-number">1</span>  <span class="hljs-number">2</span><span class="hljs-number">0</span>  <span class="hljs-number">1.0</span>  NaN  <span class="hljs-number">2</span><span class="hljs-number">1</span>  <span class="hljs-number">2.0</span>  <span class="hljs-number">3.0</span>  <span class="hljs-number">5</span><span class="hljs-number">2</span>  NaN  <span class="hljs-number">4.0</span>  <span class="hljs-number">6</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.dropna()     <span class="hljs-number">0</span>    <span class="hljs-number">1</span>  <span class="hljs-number">2</span><span class="hljs-number">1</span>  <span class="hljs-number">2.0</span>  <span class="hljs-number">3.0</span>  <span class="hljs-number">5</span></code></pre><p>指定 axis 参数，删除包含缺失值的行或列：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1</span>, np.nan, <span class="hljs-number">2</span>], [<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">5</span>], [np.nan, <span class="hljs-number">4</span>, <span class="hljs-number">6</span>]])<span class="hljs-meta">&gt;&gt;&gt; </span>obj     <span class="hljs-number">0</span>    <span class="hljs-number">1</span>  <span class="hljs-number">2</span><span class="hljs-number">0</span>  <span class="hljs-number">1.0</span>  NaN  <span class="hljs-number">2</span><span class="hljs-number">1</span>  <span class="hljs-number">2.0</span>  <span class="hljs-number">3.0</span>  <span class="hljs-number">5</span><span class="hljs-number">2</span>  NaN  <span 
class="hljs-number">4.0</span>  <span class="hljs-number">6</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.dropna(axis=<span class="hljs-string">&#x27;columns&#x27;</span>)   <span class="hljs-number">2</span><span class="hljs-number">0</span>  <span class="hljs-number">2</span><span class="hljs-number">1</span>  <span class="hljs-number">5</span><span class="hljs-number">2</span>  <span class="hljs-number">6</span></code></pre><p>指定 how 参数，<code>'any'</code>：如果存在任何NA值，则删除该行或列。<code>'all'</code>：如果所有值都是NA，则删除该行或列：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1</span>, np.nan, <span class="hljs-number">2</span>, np.nan], [<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">5</span>, np.nan], [np.nan, <span class="hljs-number">4</span>, <span class="hljs-number">6</span>, np.nan]])<span class="hljs-meta">&gt;&gt;&gt; </span>obj     <span class="hljs-number">0</span>    <span class="hljs-number">1</span>  <span class="hljs-number">2</span>   <span class="hljs-number">3</span><span class="hljs-number">0</span>  <span class="hljs-number">1.0</span>  NaN  <span class="hljs-number">2</span> NaN<span class="hljs-number">1</span>  <span class="hljs-number">2.0</span>  <span class="hljs-number">3.0</span>  <span class="hljs-number">5</span> NaN<span class="hljs-number">2</span>  NaN  <span class="hljs-number">4.0</span>  <span class="hljs-number">6</span> NaN<span class="hljs-meta">&gt;&gt;&gt; </span>obj.dropna(axis=<span class="hljs-string">&#x27;columns&#x27;</span>, how=<span class="hljs-string">&#x27;all&#x27;</span>)     <span 
class="hljs-number">0</span>    <span class="hljs-number">1</span>  <span class="hljs-number">2</span><span class="hljs-number">0</span>  <span class="hljs-number">1.0</span>  NaN  <span class="hljs-number">2</span><span class="hljs-number">1</span>  <span class="hljs-number">2.0</span>  <span class="hljs-number">3.0</span>  <span class="hljs-number">5</span><span class="hljs-number">2</span>  NaN  <span class="hljs-number">4.0</span>  <span class="hljs-number">6</span></code></pre><p>指定 thresh 参数，设置行或列中<font color="#FF0000"><strong>非缺失值</strong></font>的最小数量，以下示例中，第一行和第三行只有两个非缺失值，所以会被删除：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1</span>, np.nan, <span class="hljs-number">2</span>, np.nan], [<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">5</span>, np.nan], [np.nan, <span class="hljs-number">4</span>, <span class="hljs-number">6</span>, np.nan]])<span class="hljs-meta">&gt;&gt;&gt; </span>obj     <span class="hljs-number">0</span>    <span class="hljs-number">1</span>  <span class="hljs-number">2</span>   <span class="hljs-number">3</span><span class="hljs-number">0</span>  <span class="hljs-number">1.0</span>  NaN  <span class="hljs-number">2</span> NaN<span class="hljs-number">1</span>  <span class="hljs-number">2.0</span>  <span class="hljs-number">3.0</span>  <span class="hljs-number">5</span> NaN<span class="hljs-number">2</span>  NaN  <span class="hljs-number">4.0</span>  <span class="hljs-number">6</span> NaN&gt;&gt;&gt;<span class="hljs-meta">&gt;&gt;&gt; </span>obj.dropna(axis=<span class="hljs-string">&#x27;rows&#x27;</span>, thresh=<span class="hljs-number">3</span>)  
   <span class="hljs-number">0</span>    <span class="hljs-number">1</span>  <span class="hljs-number">2</span>   <span class="hljs-number">3</span>
<span class="hljs-number">1</span>  <span class="hljs-number">2.0</span>  <span class="hljs-number">3.0</span>  <span class="hljs-number">5</span> NaN</code></pre>
<h3><span id="02x04-fillna-tian-chong-que-shi-zhi"><font color="#4876FF">【02x04】fillna() 填充缺失值</font></span></h3><p><code>fillna()</code> 方法可以将缺失值替换成有效的数值。</p><p>在 Series 对象中，<code>fillna()</code> 方法的语法如下：</p><p><code>Series.fillna(self, value=None, method=None, axis=None, inplace=False, limit=None, downcast=None)</code></p><p>官方文档：<a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.fillna.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.fillna.html</a></p>
<table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>value</td><td>用于填充的值（例如 0），或者是一个 dict / Series / DataFrame 值<br>指定要用于每个 index（对于 Series）或 column（对于 DataFrame）的值<br>不在 dict / Series / DataFrame 中的值将不被填充。此值不能是列表</td></tr><tr><td>method</td><td>填充方法，默认为 <code>None</code><br><code>‘pad’</code> / <code>‘ffill’</code>：将上一个有效观测值向前传播到下一个有效观测值<br><code>‘backfill’</code> / <code>‘bfill’</code>：使用下一个有效观测值来填补空白</td></tr><tr><td>axis</td><td><code>0</code> or <code>‘index’</code>，要填充缺失值的轴</td></tr></tbody></table>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, np.nan, <span class="hljs-number">2</span>, <span class="hljs-literal">None</span>, <span class="hljs-number">3</span>], index=<span class="hljs-built_in">list</span>(<span class="hljs-string">&#x27;abcde&#x27;</span>))
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
a    <span class="hljs-number">1.0</span>
b    NaN
c    <span class="hljs-number">2.0</span>
d    NaN
e    <span class="hljs-number">3.0</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.fillna(<span class="hljs-number">0</span>)
a    <span class="hljs-number">1.0</span>
b    <span class="hljs-number">0.0</span>
c    <span class="hljs-number">2.0</span>
d    <span class="hljs-number">0.0</span>
e    <span class="hljs-number">3.0</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.fillna(method=<span class="hljs-string">&#x27;ffill&#x27;</span>)
a    <span class="hljs-number">1.0</span>
b    <span class="hljs-number">1.0</span>
c    <span class="hljs-number">2.0</span>
d    <span class="hljs-number">2.0</span>
e    <span class="hljs-number">3.0</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.fillna(method=<span class="hljs-string">&#x27;bfill&#x27;</span>)
a    <span class="hljs-number">1.0</span>
b    <span class="hljs-number">2.0</span>
c    <span class="hljs-number">2.0</span>
d    <span class="hljs-number">3.0</span>
e    <span class="hljs-number">3.0</span>
dtype: float64</code></pre>
<p>在 DataFrame 对象中，<code>fillna()</code> 方法的语法如下：</p><p><code>DataFrame.fillna(self, value=None, method=None, axis=None, inplace=False, limit=None, downcast=None)</code></p><p>官方文档：<a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.fillna.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.fillna.html</a></p>
<table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>value</td><td>用于填充的值（例如 0），或者是一个 dict / Series / DataFrame 值<br>指定要用于每个 index（对于 Series）或 column（对于 DataFrame）的值<br>不在 dict / Series / DataFrame 中的值将不被填充。此值不能是列表</td></tr><tr><td>method</td><td>填充方法，默认为 <code>None</code><br><code>‘pad’</code> / <code>‘ffill’</code>：将上一个有效观测值向前传播到下一个有效观测值<br><code>‘backfill’</code> / <code>‘bfill’</code>：使用下一个有效观测值来填补空白</td></tr><tr><td>axis</td><td><code>0</code> or <code>‘index’</code>，<code>1</code> or 
<code>‘columns’</code>，要填充缺失值的轴</td></tr></tbody></table><p>在 DataFrame 对象中的用法和在 Series 对象中的用法大同小异，只不过 axis 参数多了一个选择：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1</span>, np.nan, <span class="hljs-number">2</span>, np.nan], [<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">5</span>, np.nan], [np.nan, <span class="hljs-number">4</span>, <span class="hljs-number">6</span>, np.nan]])<span class="hljs-meta">&gt;&gt;&gt; </span>obj     <span class="hljs-number">0</span>    <span class="hljs-number">1</span>  <span class="hljs-number">2</span>   <span class="hljs-number">3</span><span class="hljs-number">0</span>  <span class="hljs-number">1.0</span>  NaN  <span class="hljs-number">2</span> NaN<span class="hljs-number">1</span>  <span class="hljs-number">2.0</span>  <span class="hljs-number">3.0</span>  <span class="hljs-number">5</span> NaN<span class="hljs-number">2</span>  NaN  <span class="hljs-number">4.0</span>  <span class="hljs-number">6</span> NaN<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.fillna(method=<span class="hljs-string">&#x27;ffill&#x27;</span>, axis=<span class="hljs-number">1</span>)     <span class="hljs-number">0</span>    <span class="hljs-number">1</span>    <span class="hljs-number">2</span>    <span class="hljs-number">3</span><span class="hljs-number">0</span>  <span class="hljs-number">1.0</span>  <span class="hljs-number">1.0</span>  <span class="hljs-number">2.0</span>  <span class="hljs-number">2.0</span><span class="hljs-number">1</span>  <span class="hljs-number">2.0</span>  <span 
class="hljs-number">3.0</span>  <span class="hljs-number">5.0</span>  <span class="hljs-number">5.0</span><span class="hljs-number">2</span>  NaN  <span class="hljs-number">4.0</span>  <span class="hljs-number">6.0</span>  <span class="hljs-number">6.0</span></code></pre><hr>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;文章目录&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-01x00-pandas-suan-zhu-yun-suan-font&quot;&gt;&lt;font</summary>
      
    
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    <category term="Pandas" scheme="https://www.itbob.cn/tags/Pandas/"/>
    
  </entry>
  
  <entry>
    <title>Python 数据分析三剑客之 Pandas（二）：Index 索引对象以及各种索引操作</title>
    <link href="https://www.itbob.cn/article/026/"/>
    <id>https://www.itbob.cn/article/026/</id>
    <published>2020-06-13T14:19:53.000Z</published>
    <updated>2022-05-22T12:38:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">文章目录</span></h2><!-- toc --><ul><li><a href="#font-color-ff0000-1-index-suo-yin-dui-xiang-font"><font color="#FF0000">【1】Index 索引对象</font></a></li><li><a href="#font-color-ff0000-2-pandas-yi-ban-suo-yin-font"><font color="#FF0000">【2】Pandas 一般索引</font></a><ul><li><a href="#font-color-4876ff-2-1-series-suo-yin-font"><font color="#4876FF">【2.1】Series 索引</font></a><ul><li><a href="#font-color-ffa500-2-1-1-head-tail-font"><font color="#FFA500">【2.1.1】head() / tail()</font></a></li><li><a href="#font-color-ffa500-2-1-2-xing-suo-yin-font"><font color="#FFA500">【2.1.2】行索引</font></a></li><li><a href="#font-color-ffa500-2-1-3-qie-pian-suo-yin-font"><font color="#FFA500">【2.1.3】切片索引</font></a></li><li><a href="#font-color-ffa500-2-1-4-hua-shi-suo-yin-font"><font color="#FFA500">【2.1.4】花式索引</font></a></li><li><a href="#font-color-ffa500-2-1-5-bu-er-suo-yin-font"><font color="#FFA500">【2.1.5】布尔索引</font></a></li></ul></li><li><a href="#font-color-4876ff-2-2-dataframe-suo-yin-font"><font color="#4876FF">【2.2】DataFrame 索引</font></a><ul><li><a href="#font-color-ffa500-2-2-1-head-tail-font"><font color="#FFA500">【2.2.1】head() / tail()</font></a></li><li><a href="#font-color-ffa500-2-2-2-lie-suo-yin-font"><font color="#FFA500">【2.2.2】列索引</font></a></li><li><a href="#font-color-ffa500-2-2-3-qie-pian-suo-yin-font"><font color="#FFA500">【2.2.3】切片索引</font></a></li><li><a href="#font-color-ffa500-2-2-4-hua-shi-suo-yin-font"><font color="#FFA500">【2.2.4】花式索引</font></a></li><li><a href="#font-color-ffa500-2-2-5-bu-er-suo-yin-font"><font color="#FFA500">【2.2.5】布尔索引</font></a></li></ul></li></ul></li><li><a href="#font-color-ff0000-3-suo-yin-qi-loc-he-iloc-font"><font color="#FF0000">【3】索引器：loc 和 iloc</font></a><ul><li><a href="#font-color-4876ff-3-1-loc-biao-qian-suo-yin-font"><font color="#4876FF">【3.1】loc 标签索引</font></a><ul><li><a href="#font-color-ffa500-3-1-1-series-loc-font"><font 
color="#FFA500">【3.1.1】Series.loc</font></a></li><li><a href="#font-color-ffa500-3-1-2-dataframe-loc-font"><font color="#FFA500">【3.1.2】DataFrame.loc</font></a></li></ul></li><li><a href="#font-color-4876ff-3-2-iloc-wei-zhi-suo-yin-font"><font color="#4876FF">【3.2】iloc 位置索引</font></a><ul><li><a href="#font-color-ffa500-3-2-1-series-iloc-font"><font color="#FFA500">【3.2.1】Series.iloc</font></a></li><li><a href="#font-color-ffa500-3-2-2-dataframe-iloc-font"><font color="#FFA500">【3.2.2】DataFrame.iloc</font></a></li></ul></li></ul></li><li><a href="#font-color-ff0000-4-pandas-chong-xin-suo-yin-font"><font color="#FF0000">【4】Pandas 重新索引</font></a></li></ul><!-- tocstop --><hr><p>Pandas 系列文章：</p><ul><li><a href="https://www.itbob.cn/article/025/">Python 数据分析三剑客之 Pandas（一）：认识 Pandas 及其 Series、DataFrame 对象</a></li><li><a href="https://www.itbob.cn/article/026/">Python 数据分析三剑客之 Pandas（二）：Index 索引对象以及各种索引操作</a></li><li><a href="https://www.itbob.cn/article/027/">Python 数据分析三剑客之 Pandas（三）：算术运算与缺失值的处理</a></li><li><a href="https://www.itbob.cn/article/028/">Python 数据分析三剑客之 Pandas（四）：函数应用、映射、排序和层级索引</a></li><li><a href="https://www.itbob.cn/article/029/">Python 数据分析三剑客之 Pandas（五）：统计计算与统计描述</a></li><li><a href="https://www.itbob.cn/article/030/">Python 数据分析三剑客之 Pandas（六）：GroupBy 数据分裂、应用与合并</a></li><li><a href="https://www.itbob.cn/article/031/">Python 数据分析三剑客之 Pandas（七）：合并数据集</a></li><li><a href="https://www.itbob.cn/article/032/">Python 数据分析三剑客之 Pandas（八）：数据重塑、重复数据处理与数据替换</a></li><li><a href="https://www.itbob.cn/article/033/">Python 数据分析三剑客之 Pandas（九）：时间序列</a></li><li><a href="https://www.itbob.cn/article/034/">Python 数据分析三剑客之 Pandas（十）：数据读写</a></li></ul><hr><p>专栏：</p><ul><li>NumPy 专栏：<a href="https://itrhx.blog.csdn.net/category_9780393.html">https://itrhx.blog.csdn.net/category_9780393.html</a></li><li>Pandas 专栏：<a href="https://itrhx.blog.csdn.net/category_9780397.html">https://itrhx.blog.csdn.net/category_9780397.html</a></li><li>Matplotlib 专栏：<a 
href="https://itrhx.blog.csdn.net/category_9780418.html">https://itrhx.blog.csdn.net/category_9780418.html</a></li></ul><br>推荐学习资料与网站：<br><br><ul><li>NumPy 官方中文网：<a href="https://www.numpy.org.cn/">https://www.numpy.org.cn/</a></li><li>Pandas 官方中文网：<a href="https://www.pypandas.cn/">https://www.pypandas.cn/</a></li><li>Matplotlib 官方中文网：<a href="https://www.matplotlib.org.cn/">https://www.matplotlib.org.cn/</a></li><li>NumPy、Matplotlib、Pandas 速查表：<a href="https://github.com/TRHX/Python-quick-reference-table">https://github.com/TRHX/Python-quick-reference-table</a></li></ul><hr><pre><code class="hljs yaml"><span class="hljs-string">这里是一段物理防爬虫文本，请读者忽略。</span><span class="hljs-string">本文原创首发于</span> <span class="hljs-string">CSDN，作者</span> <span class="hljs-string">ITBOB。</span><span class="hljs-string">博客首页：https://itrhx.blog.csdn.net/</span><span class="hljs-string">本文链接：https://itrhx.blog.csdn.net/article/details/106698307</span><span class="hljs-string">未经授权，禁止转载！恶意转载，后果自负！尊重原创，远离剽窃！</span></code></pre><hr>
<h1><span id="1-index-suo-yin-dui-xiang"><font color="#FF0000">【1】Index 索引对象</font></span></h1><p>Series 和 DataFrame 中的索引都是 Index 对象，为了保证数据的安全，索引对象是不可变的，如果尝试更改索引就会报错；常见的 Index 种类有：索引（Index），整数索引（Int64Index），层级索引（MultiIndex），时间戳类型（DatetimeIndex）。</p><p>以下代码演示了 Index 索引对象及其不可变的性质：</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index
Index([<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], dtype=<span class="hljs-string">&#x27;object&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-built_in">type</span>(obj.index)
&lt;<span class="hljs-class"><span class="hljs-keyword">class</span> &#x27;<span class="hljs-title">pandas</span>.<span class="hljs-title">core</span>.<span class="hljs-title">indexes</span>.<span class="hljs-title">base</span>.<span class="hljs-title">Index</span>&#x27;&gt;</span>
<span class="hljs-class">&gt;&gt;&gt; <span class="hljs-title">obj</span>.<span class="hljs-title">index</span>[0] = &#x27;<span class="hljs-title">e</span>&#x27;</span>
<span class="hljs-class"><span class="hljs-title">Traceback</span> (<span class="hljs-params">most recent call last</span>):</span>
  File <span class="hljs-string">&quot;&lt;pyshell#28&gt;&quot;</span>, line <span class="hljs-number">1</span>, <span class="hljs-keyword">in</span> &lt;module&gt;
    obj.index[<span class="hljs-number">0</span>] = <span class="hljs-string">&#x27;e&#x27;</span>
  File <span class="hljs-string">&quot;C:\Users\...\base.py&quot;</span>, line <span class="hljs-number">3909</span>, <span class="hljs-keyword">in</span> __setitem__
    <span class="hljs-keyword">raise</span> TypeError(<span class="hljs-string">&quot;Index does not support mutable operations&quot;</span>)
TypeError: Index does <span class="hljs-keyword">not</span> support mutable operations</code></pre>
<table><tr><td bgcolor="#FFA500"><font size="5" color="#fff">index 索引对象常用属性</font></td></tr></table><p>官方文档：<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.html">https://pandas.pydata.org/docs/reference/api/pandas.Index.html</a></p>
<table><thead><tr><th>属性</th><th>描述</th></tr></thead><tbody><tr><td>T</td><td>转置</td></tr><tr><td>array</td><td>index 的数组形式，参见<a 
href="https://pandas.pydata.org/docs/reference/api/pandas.Index.array.html">官方文档</a></td></tr><tr><td>dtype</td><td>返回基础数据的 dtype 对象</td></tr><tr><td>hasnans</td><td>是否有 NaN（缺失值）</td></tr><tr><td>inferred_type</td><td>返回一个字符串，表示 index 的类型</td></tr><tr><td>is_monotonic</td><td>判断 index 是否是递增的</td></tr><tr><td>is_monotonic_decreasing</td><td>判断 index 是否单调递减</td></tr><tr><td>is_monotonic_increasing</td><td>判断 index 是否单调递增</td></tr><tr><td>is_unique</td><td>index 是否没有重复值</td></tr><tr><td>nbytes</td><td>返回 index 中的字节数</td></tr><tr><td>ndim</td><td>index 的维度</td></tr><tr><td>nlevels</td><td>Number of levels.</td></tr><tr><td>shape</td><td>返回一个元组，表示 index 的形状</td></tr><tr><td>size</td><td>index 的大小</td></tr><tr><td>values</td><td>返回 index 中的值 / 数组</td></tr></tbody></table><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])<span class="hljs-meta">&gt;&gt;&gt; </span>obj.indexIndex([<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], dtype=<span class="hljs-string">&#x27;object&#x27;</span>)<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.array&lt;PandasArray&gt;[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>]Length: <span 
class="hljs-number">4</span>, dtype: <span class="hljs-built_in">object</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.dtype
dtype(<span class="hljs-string">&#x27;O&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.hasnans
<span class="hljs-literal">False</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.inferred_type
<span class="hljs-string">&#x27;string&#x27;</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.is_monotonic
<span class="hljs-literal">True</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.is_monotonic_decreasing
<span class="hljs-literal">False</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.is_monotonic_increasing
<span class="hljs-literal">True</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.is_unique
<span class="hljs-literal">True</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.nbytes
<span class="hljs-number">16</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.ndim
<span class="hljs-number">1</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.nlevels
<span class="hljs-number">1</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.shape
(<span class="hljs-number">4</span>,)
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.size
<span class="hljs-number">4</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.values
array([<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], dtype=<span 
class="hljs-built_in">object</span>)</code></pre><table><tr><td bgcolor="#FFA500"><font size="5" color="#fff">Common methods of the Index object</font></td></tr></table><p>Official documentation: <a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.html">https://pandas.pydata.org/docs/reference/api/pandas.Index.html</a></p><table><thead><tr><th>Method</th><th>Description</th></tr></thead><tbody><tr><td>all(self, *args, **kwargs)</td><td>Whether all elements are truthy; any 0 makes the result False</td></tr><tr><td>any(self, *args, **kwargs)</td><td>Whether at least one element is truthy; the result is False only if every element is 0</td></tr><tr><td>append(self, other)</td><td>Concatenates another index, producing a new index</td></tr><tr><td>argmax(self[, axis, skipna])</td><td>Returns the integer position of the largest value in the index</td></tr><tr><td>argmin(self[, axis, skipna])</td><td>Returns the integer position of the smallest value in the index</td></tr><tr><td>argsort(self, *args, **kwargs)</td><td>Returns the integer positions that would sort the index in ascending order</td></tr><tr><td>delete(self, loc)</td><td>Removes the element at the given position and returns a new index</td></tr><tr><td>difference(self, other[, sort])</td><td>Returns the elements of the first index that are not in the second, i.e. the set difference</td></tr><tr><td>drop(self, labels[, errors])</td><td>Returns a new index with the passed labels removed</td></tr><tr><td>drop_duplicates(self[, keep])</td><td>Removes duplicate values; the keep parameter accepts:<br><code>&#x27;first&#x27;</code>: keep the first occurrence of each duplicated value;<br><code>&#x27;last&#x27;</code>: keep the last occurrence of each duplicated value;<br><code>False</code>: drop every duplicated value</td></tr><tr><td>duplicated(self[, keep])</td><td>Flags duplicate values; the keep parameter accepts:<br><code>&#x27;first&#x27;</code>: the first occurrence is False, the others are True;<br><code>&#x27;last&#x27;</code>: the last occurrence is False, the others are True;<br><code>False</code>: every duplicated value is True</td></tr><tr><td>dropna(self[, how])</td><td>Removes missing values (NaN)</td></tr><tr><td>fillna(self[, value, downcast])</td><td>Fills missing values (NaN) with the specified value</td></tr><tr><td>equals(self, other)</td><td>Whether two indexes are equal</td></tr><tr><td>insert(self, loc, item)</td><td>Inserts an element at the given position and returns a new index</td></tr><tr><td>intersection(self, other[, sort])</td><td>Returns the intersection of two indexes</td></tr><tr><td>isna(self)</td><td>Detects which index elements are missing values (NaN)</td></tr><tr><td>isnull(self)</td><td>Alias of isna; detects which index elements are missing values (NaN)</td></tr><tr><td>max(self[, axis, skipna])</td><td>Returns the maximum value of the index</td></tr><tr><td>min(self[, axis, skipna])</td><td>Returns the minimum value of the index</td></tr><tr><td>union(self, other[, sort])</td><td>Returns the union of two indexes</td></tr><tr><td>unique(self[, level])</td><td>Returns the unique values of the index, i.e. with duplicates removed</td></tr></tbody></table><ul><li><code>all(self, *args, **kwargs)</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.all.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>]).<span class="hljs-built_in">all</span>()
<span class="hljs-literal">True</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>]).<span class="hljs-built_in">all</span>()
<span class="hljs-literal">False</span></code></pre><ul><li><code>any(self, *args, **kwargs)</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.any.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>]).<span class="hljs-built_in">any</span>()
<span class="hljs-literal">True</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>]).<span class="hljs-built_in">any</span>()
<span class="hljs-literal">False</span></code></pre><ul><li><code>append(self, other)</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.append.html">official docs</a>】</li></ul><pre><code class="hljs 
python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>]).append(pd.Index([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>]))
Index([<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>], dtype=<span class="hljs-string">&#x27;object&#x27;</span>)</code></pre><ul><li><code>argmax(self[, axis, skipna])</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.argmax.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>]).argmax()
<span class="hljs-number">3</span></code></pre><ul><li><code>argmin(self[, axis, skipna])</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.argmin.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>]).argmin()
<span 
class="hljs-number">4</span></code></pre><ul><li><code>argsort(self, *args, **kwargs)</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.argsort.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>]).argsort()
array([<span class="hljs-number">4</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">3</span>], dtype=int32)</code></pre><ul><li><code>delete(self, loc)</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.delete.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>]).delete(<span class="hljs-number">0</span>)
Int64Index([<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>], dtype=<span class="hljs-string">&#x27;int64&#x27;</span>)</code></pre><ul><li><code>difference(self, other[, sort])</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.difference.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>idx1 = 
pd.Index([<span class="hljs-number">2</span>, <span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>idx2 = pd.Index([<span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>idx1.difference(idx2)
Int64Index([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], dtype=<span class="hljs-string">&#x27;int64&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>idx1.difference(idx2, sort=<span class="hljs-literal">False</span>)
Int64Index([<span class="hljs-number">2</span>, <span class="hljs-number">1</span>], dtype=<span class="hljs-string">&#x27;int64&#x27;</span>)</code></pre><ul><li><code>drop(self, labels[, errors])</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.drop.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>]).drop([<span class="hljs-number">2</span>, <span class="hljs-number">1</span>])
Int64Index([<span class="hljs-number">5</span>, <span class="hljs-number">3</span>, <span class="hljs-number">9</span>], dtype=<span class="hljs-string">&#x27;int64&#x27;</span>)</code></pre><ul><li><code>drop_duplicates(self[, keep])</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.drop_duplicates.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span 
class="hljs-meta">&gt;&gt;&gt; </span>idx = pd.Index([<span class="hljs-string">&#x27;lama&#x27;</span>, <span class="hljs-string">&#x27;cow&#x27;</span>, <span class="hljs-string">&#x27;lama&#x27;</span>, <span class="hljs-string">&#x27;beetle&#x27;</span>, <span class="hljs-string">&#x27;lama&#x27;</span>, <span class="hljs-string">&#x27;hippo&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>idx.drop_duplicates(keep=<span class="hljs-string">&#x27;first&#x27;</span>)
Index([<span class="hljs-string">&#x27;lama&#x27;</span>, <span class="hljs-string">&#x27;cow&#x27;</span>, <span class="hljs-string">&#x27;beetle&#x27;</span>, <span class="hljs-string">&#x27;hippo&#x27;</span>], dtype=<span class="hljs-string">&#x27;object&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>idx.drop_duplicates(keep=<span class="hljs-string">&#x27;last&#x27;</span>)
Index([<span class="hljs-string">&#x27;cow&#x27;</span>, <span class="hljs-string">&#x27;beetle&#x27;</span>, <span class="hljs-string">&#x27;lama&#x27;</span>, <span class="hljs-string">&#x27;hippo&#x27;</span>], dtype=<span class="hljs-string">&#x27;object&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>idx.drop_duplicates(keep=<span class="hljs-literal">False</span>)
Index([<span class="hljs-string">&#x27;cow&#x27;</span>, <span class="hljs-string">&#x27;beetle&#x27;</span>, <span class="hljs-string">&#x27;hippo&#x27;</span>], dtype=<span class="hljs-string">&#x27;object&#x27;</span>)</code></pre><ul><li><code>duplicated(self[, keep])</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.duplicated.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>idx = pd.Index([<span class="hljs-string">&#x27;lama&#x27;</span>, <span class="hljs-string">&#x27;cow&#x27;</span>, <span class="hljs-string">&#x27;lama&#x27;</span>, 
<span class="hljs-string">&#x27;beetle&#x27;</span>, <span class="hljs-string">&#x27;lama&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>idx.duplicated()
array([<span class="hljs-literal">False</span>, <span class="hljs-literal">False</span>,  <span class="hljs-literal">True</span>, <span class="hljs-literal">False</span>,  <span class="hljs-literal">True</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>idx.duplicated(keep=<span class="hljs-string">&#x27;first&#x27;</span>)
array([<span class="hljs-literal">False</span>, <span class="hljs-literal">False</span>,  <span class="hljs-literal">True</span>, <span class="hljs-literal">False</span>,  <span class="hljs-literal">True</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>idx.duplicated(keep=<span class="hljs-string">&#x27;last&#x27;</span>)
array([ <span class="hljs-literal">True</span>, <span class="hljs-literal">False</span>,  <span class="hljs-literal">True</span>, <span class="hljs-literal">False</span>, <span class="hljs-literal">False</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>idx.duplicated(keep=<span class="hljs-literal">False</span>)
array([ <span class="hljs-literal">True</span>, <span class="hljs-literal">False</span>,  <span class="hljs-literal">True</span>, <span class="hljs-literal">False</span>,  <span class="hljs-literal">True</span>])</code></pre><ul><li><code>dropna(self[, how])</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.dropna.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">2</span>, <span class="hljs-number">5</span>, np.NaN, <span class="hljs-number">6</span>, np.NaN, 
np.NaN]).dropna()
Float64Index([<span class="hljs-number">2.0</span>, <span class="hljs-number">5.0</span>, <span class="hljs-number">6.0</span>], dtype=<span class="hljs-string">&#x27;float64&#x27;</span>)</code></pre><ul><li><code>fillna(self[, value, downcast])</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.fillna.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">2</span>, <span class="hljs-number">5</span>, np.NaN, <span class="hljs-number">6</span>, np.NaN, np.NaN]).fillna(<span class="hljs-number">5</span>)
Float64Index([<span class="hljs-number">2.0</span>, <span class="hljs-number">5.0</span>, <span class="hljs-number">5.0</span>, <span class="hljs-number">6.0</span>, <span class="hljs-number">5.0</span>, <span class="hljs-number">5.0</span>], dtype=<span class="hljs-string">&#x27;float64&#x27;</span>)</code></pre><ul><li><code>equals(self, other)</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.equals.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>idx1 = pd.Index([<span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>idx2 = pd.Index([<span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>])
<span 
class="hljs-meta">&gt;&gt;&gt; </span>idx1.equals(idx2)
<span class="hljs-literal">True</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>idx1 = pd.Index([<span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>idx2 = pd.Index([<span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">4</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>idx1.equals(idx2)
<span class="hljs-literal">False</span></code></pre><ul><li><code>intersection(self, other[, sort])</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.intersection.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>idx1 = pd.Index([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>idx2 = pd.Index([<span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>idx1.intersection(idx2)
Int64Index([<span class="hljs-number">3</span>, <span class="hljs-number">4</span>], dtype=<span class="hljs-string">&#x27;int64&#x27;</span>)</code></pre><ul><li><code>insert(self, loc, item)</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.insert.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span 
class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>]).insert(<span class="hljs-number">2</span>, <span class="hljs-string">&#x27;A&#x27;</span>)
Index([<span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-number">3</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>], dtype=<span class="hljs-string">&#x27;object&#x27;</span>)</code></pre><ul><li><code>isna(self)</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.isna.html">official docs</a>】, <code>isnull(self)</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.isnull.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">2</span>, <span class="hljs-number">5</span>, np.NaN, <span class="hljs-number">6</span>, np.NaN, np.NaN]).isna()
array([<span class="hljs-literal">False</span>, <span class="hljs-literal">False</span>,  <span class="hljs-literal">True</span>, <span class="hljs-literal">False</span>,  <span class="hljs-literal">True</span>,  <span class="hljs-literal">True</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">2</span>, <span class="hljs-number">5</span>, np.NaN, <span class="hljs-number">6</span>, np.NaN, np.NaN]).isnull()
array([<span class="hljs-literal">False</span>, <span class="hljs-literal">False</span>,  <span class="hljs-literal">True</span>, <span class="hljs-literal">False</span>,  <span 
class="hljs-literal">True</span>,  <span class="hljs-literal">True</span>])</code></pre><ul><li><code>max(self[, axis, skipna])</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.max.html">official docs</a>】, <code>min(self[, axis, skipna])</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.min.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>]).<span class="hljs-built_in">max</span>()
<span class="hljs-number">9</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>]).<span class="hljs-built_in">min</span>()
<span class="hljs-number">1</span></code></pre><ul><li><code>union(self, other[, sort])</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.union.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>idx1 = pd.Index([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>idx2 = pd.Index([<span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>idx1.union(idx2)
Int64Index([<span class="hljs-number">1</span>, <span 
class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>], dtype=<span class="hljs-string">&#x27;int64&#x27;</span>)</code></pre><ul><li><code>unique(self[, level])</code> 【<a href="https://pandas.pydata.org/docs/reference/api/pandas.Index.unique.html">official docs</a>】</li></ul><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.Index([<span class="hljs-number">5</span>, <span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">1</span>]).unique()
Int64Index([<span class="hljs-number">5</span>, <span class="hljs-number">1</span>, <span class="hljs-number">3</span>], dtype=<span class="hljs-string">&#x27;int64&#x27;</span>)</code></pre><h1><span id="2-pandas-yi-ban-suo-yin"><font color="#FF0000">【2】General Indexing in Pandas</font></span></h1><p>Pandas also offers more advanced indexing operations, such as reindexing and hierarchical indexing, so the ordinary slice, fancy, and boolean indexing covered here are grouped together as general indexing.</p><h2><span id="2-1-series-suo-yin"><font color="#4876FF">【2.1】Series Indexing</font></span></h2><h3><span id="2-1-1-head-tail"><font color="#FFA500">【2.1.1】head() / tail()</font></span></h3><p><code>Series.head()</code> and <code>Series.tail()</code> return the first five and the last five rows of a Series, respectively; passing an integer n to head() / tail() returns the first or last n rows instead:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(np.random.randn(<span class="hljs-number">8</span>))
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
<span class="hljs-number">0</span>   -<span class="hljs-number">0.643437</span>
<span 
class="hljs-number">1</span>   -<span class="hljs-number">0.365652</span>
<span class="hljs-number">2</span>   -<span class="hljs-number">0.966554</span>
<span class="hljs-number">3</span>   -<span class="hljs-number">0.036127</span>
<span class="hljs-number">4</span>    <span class="hljs-number">1.046095</span>
<span class="hljs-number">5</span>   -<span class="hljs-number">2.048362</span>
<span class="hljs-number">6</span>   -<span class="hljs-number">1.865551</span>
<span class="hljs-number">7</span>    <span class="hljs-number">1.344728</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.head()
<span class="hljs-number">0</span>   -<span class="hljs-number">0.643437</span>
<span class="hljs-number">1</span>   -<span class="hljs-number">0.365652</span>
<span class="hljs-number">2</span>   -<span class="hljs-number">0.966554</span>
<span class="hljs-number">3</span>   -<span class="hljs-number">0.036127</span>
<span class="hljs-number">4</span>    <span class="hljs-number">1.046095</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.head(<span class="hljs-number">3</span>)
<span class="hljs-number">0</span>   -<span class="hljs-number">0.643437</span>
<span class="hljs-number">1</span>   -<span class="hljs-number">0.365652</span>
<span class="hljs-number">2</span>   -<span class="hljs-number">0.966554</span>
dtype: float64
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.tail()
<span class="hljs-number">3</span>   -<span class="hljs-number">0.036127</span>
<span class="hljs-number">4</span>    <span class="hljs-number">1.046095</span>
<span class="hljs-number">5</span>   -<span class="hljs-number">2.048362</span>
<span class="hljs-number">6</span>   -<span class="hljs-number">1.865551</span>
<span class="hljs-number">7</span>    <span class="hljs-number">1.344728</span>
dtype: float64
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.tail(<span class="hljs-number">3</span>)
<span class="hljs-number">5</span>   -<span class="hljs-number">2.048362</span>
<span class="hljs-number">6</span>   -<span class="hljs-number">1.865551</span>
<span class="hljs-number">7</span>    <span class="hljs-number">1.344728</span>
dtype: float64</code></pre><h3><span id="2-1-2-xing-suo-yin"><font color="#FFA500">【2.1.2】Row Indexing</font></span></h3><p>A Series can be indexed by position or by index label, and its values can also be retrieved with Python dict-style expressions and methods:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
a    <span class="hljs-number">1</span>
b    <span class="hljs-number">5</span>
c   -<span class="hljs-number">8</span>
d    <span class="hljs-number">2</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;c&#x27;</span>]
-<span class="hljs-number">8</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-number">2</span>]
-<span class="hljs-number">8</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">&#x27;b&#x27;</span> <span class="hljs-keyword">in</span> obj
<span class="hljs-literal">True</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.keys()
Index([<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], dtype=<span class="hljs-string">&#x27;object&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; 
</span><span class="hljs-built_in">list</span>(obj.items())
[(<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-number">1</span>), (<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-number">5</span>), (<span class="hljs-string">&#x27;c&#x27;</span>, -<span class="hljs-number">8</span>), (<span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-number">2</span>)]</code></pre><h3><span id="2-1-3-qie-pian-suo-yin"><font color="#FFA500">【2.1.3】Slice Indexing</font></span></h3><p>There are two ways to slice: by position and by index label. Note that slicing by position <font color="#FF0000"><strong>excludes</strong></font> the end position, whereas slicing by index label <font color="#FF0000"><strong>includes</strong></font> the end label.</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
a    <span class="hljs-number">1</span>
b    <span class="hljs-number">5</span>
c   -<span class="hljs-number">8</span>
d    <span class="hljs-number">2</span>
dtype: int64
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-number">1</span>:<span class="hljs-number">3</span>]
b    <span class="hljs-number">5</span>
c   -<span class="hljs-number">8</span>
dtype: int64
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-number">0</span>:<span class="hljs-number">3</span>:<span class="hljs-number">2</span>]
a    <span class="hljs-number">1</span>
c   -<span class="hljs-number">8</span>
dtype: int64
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span 
class="hljs-string">&#x27;b&#x27;</span>:<span class="hljs-string">&#x27;d&#x27;</span>]
b    <span class="hljs-number">5</span>
c   -<span class="hljs-number">8</span>
d    <span class="hljs-number">2</span>
dtype: int64</code></pre><h3><span id="2-1-4-hua-shi-suo-yin"><font color="#FFA500">【2.1.4】Fancy Indexing</font></span></h3><p>Fancy indexing selects non-contiguous elements at arbitrary intervals: pass a <font color="#FF0000"><strong>list</strong></font> of index labels or positions to retrieve several elements at once:</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
a    <span class="hljs-number">1</span>
b    <span class="hljs-number">5</span>
c   -<span class="hljs-number">8</span>
d    <span class="hljs-number">2</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[[<span class="hljs-number">0</span>, <span class="hljs-number">2</span>]]
a    <span class="hljs-number">1</span>
c   -<span class="hljs-number">8</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>]]
a    <span class="hljs-number">1</span>
c   -<span class="hljs-number">8</span>
d    <span class="hljs-number">2</span>
dtype: int64</code></pre><h3><span id="2-1-5-bu-er-suo-yin"><font 
color="#FFA500">【2.1.5】Boolean indexing</font></span></h3>
<p>A boolean array can be used to index the target array: a boolean operation (such as a comparison operator) produces a mask, and indexing with that mask returns the elements that satisfy the condition.</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>, -<span class="hljs-number">3</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;e&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
a    <span class="hljs-number">1</span>
b    <span class="hljs-number">5</span>
c   -<span class="hljs-number">8</span>
d    <span class="hljs-number">2</span>
e   -<span class="hljs-number">3</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[obj &gt; <span class="hljs-number">0</span>]
a    <span class="hljs-number">1</span>
b    <span class="hljs-number">5</span>
d    <span class="hljs-number">2</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj &gt; <span class="hljs-number">0</span>
a     <span class="hljs-literal">True</span>
b     <span class="hljs-literal">True</span>
c    <span class="hljs-literal">False</span>
d     <span class="hljs-literal">True</span>
e    <span class="hljs-literal">False</span>
dtype: <span class="hljs-built_in">bool</span></code></pre>
<h2><span id="2-2-dataframe-suo-yin"><font color="#4876FF">【2.2】DataFrame indexing</font></span></h2>
<h3><span id="2-2-1-head-tail"><font color="#FFA500">【2.2.1】head() / tail()</font></span></h3>
<p>As with Series, <code>DataFrame.head()</code> and <code>DataFrame.tail()</code> return the first five and the last five rows of a DataFrame by default; passing an integer <code>n</code> to head() / tail() returns the first / last <code>n</code> rows instead:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(np.random.randn(<span class="hljs-number">8</span>,<span class="hljs-number">4</span>), columns = [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
          a         b         c         d
<span class="hljs-number">0</span> -<span class="hljs-number">1.399390</span>  <span class="hljs-number">0.521596</span> -<span class="hljs-number">0.869613</span>  <span class="hljs-number">0.506621</span>
<span class="hljs-number">1</span> -<span class="hljs-number">0.748562</span> -<span class="hljs-number">0.364952</span>  <span class="hljs-number">0.188399</span> -<span class="hljs-number">1.402566</span>
<span class="hljs-number">2</span>  <span class="hljs-number">1.378776</span> -<span class="hljs-number">1.476480</span>  <span class="hljs-number">0.361635</span>  <span class="hljs-number">0.451134</span>
<span class="hljs-number">3</span> -<span class="hljs-number">0.206405</span> -<span class="hljs-number">1.188609</span>  <span class="hljs-number">3.002599</span>  <span class="hljs-number">0.563650</span>
<span class="hljs-number">4</span>  <span class="hljs-number">0.993289</span>  <span class="hljs-number">1.133748</span>  <span class="hljs-number">1.177549</span> -<span class="hljs-number">2.562286</span>
<span class="hljs-number">5</span> -<span class="hljs-number">0.482157</span>  <span class="hljs-number">1.069293</span>  <span class="hljs-number">1.143983</span> -<span 
class="hljs-number">1.303079</span>
<span class="hljs-number">6</span> -<span class="hljs-number">1.199154</span>  <span class="hljs-number">0.220360</span>  <span class="hljs-number">0.801838</span> -<span class="hljs-number">0.104533</span>
<span class="hljs-number">7</span> -<span class="hljs-number">1.359816</span> -<span class="hljs-number">2.092035</span>  <span class="hljs-number">2.003530</span> -<span class="hljs-number">0.151812</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.head()
          a         b         c         d
<span class="hljs-number">0</span> -<span class="hljs-number">1.399390</span>  <span class="hljs-number">0.521596</span> -<span class="hljs-number">0.869613</span>  <span class="hljs-number">0.506621</span>
<span class="hljs-number">1</span> -<span class="hljs-number">0.748562</span> -<span class="hljs-number">0.364952</span>  <span class="hljs-number">0.188399</span> -<span class="hljs-number">1.402566</span>
<span class="hljs-number">2</span>  <span class="hljs-number">1.378776</span> -<span class="hljs-number">1.476480</span>  <span class="hljs-number">0.361635</span>  <span class="hljs-number">0.451134</span>
<span class="hljs-number">3</span> -<span class="hljs-number">0.206405</span> -<span class="hljs-number">1.188609</span>  <span class="hljs-number">3.002599</span>  <span class="hljs-number">0.563650</span>
<span class="hljs-number">4</span>  <span class="hljs-number">0.993289</span>  <span class="hljs-number">1.133748</span>  <span class="hljs-number">1.177549</span> -<span class="hljs-number">2.562286</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.head(<span class="hljs-number">3</span>)
          a         b         c         d
<span class="hljs-number">0</span> -<span class="hljs-number">1.399390</span>  <span class="hljs-number">0.521596</span> -<span class="hljs-number">0.869613</span>  <span 
class="hljs-number">0.506621</span>
<span class="hljs-number">1</span> -<span class="hljs-number">0.748562</span> -<span class="hljs-number">0.364952</span>  <span class="hljs-number">0.188399</span> -<span class="hljs-number">1.402566</span>
<span class="hljs-number">2</span>  <span class="hljs-number">1.378776</span> -<span class="hljs-number">1.476480</span>  <span class="hljs-number">0.361635</span>  <span class="hljs-number">0.451134</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.tail()
          a         b         c         d
<span class="hljs-number">3</span> -<span class="hljs-number">0.206405</span> -<span class="hljs-number">1.188609</span>  <span class="hljs-number">3.002599</span>  <span class="hljs-number">0.563650</span>
<span class="hljs-number">4</span>  <span class="hljs-number">0.993289</span>  <span class="hljs-number">1.133748</span>  <span class="hljs-number">1.177549</span> -<span class="hljs-number">2.562286</span>
<span class="hljs-number">5</span> -<span class="hljs-number">0.482157</span>  <span class="hljs-number">1.069293</span>  <span class="hljs-number">1.143983</span> -<span class="hljs-number">1.303079</span>
<span class="hljs-number">6</span> -<span class="hljs-number">1.199154</span>  <span class="hljs-number">0.220360</span>  <span class="hljs-number">0.801838</span> -<span class="hljs-number">0.104533</span>
<span class="hljs-number">7</span> -<span class="hljs-number">1.359816</span> -<span class="hljs-number">2.092035</span>  <span class="hljs-number">2.003530</span> -<span class="hljs-number">0.151812</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.tail(<span class="hljs-number">3</span>)
          a         b         c         d
<span class="hljs-number">5</span> -<span class="hljs-number">0.482157</span>  <span class="hljs-number">1.069293</span>  <span class="hljs-number">1.143983</span> -<span class="hljs-number">1.303079</span>
<span class="hljs-number">6</span> -<span class="hljs-number">1.199154</span>  <span class="hljs-number">0.220360</span>  <span class="hljs-number">0.801838</span> -<span class="hljs-number">0.104533</span>
<span class="hljs-number">7</span> -<span class="hljs-number">1.359816</span> -<span class="hljs-number">2.092035</span>  <span class="hljs-number">2.003530</span> -<span class="hljs-number">0.151812</span></code></pre>
<h3><span id="2-2-2-lie-suo-yin"><font color="#FFA500">【2.2.2】Column indexing</font></span></h3>
<p>A DataFrame can be indexed by column label (columns). Passing a single label returns a Series, while passing a list of labels (even a one-element list) returns a DataFrame:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(np.random.randn(<span class="hljs-number">7</span>,<span class="hljs-number">2</span>), columns = [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
          a         b
<span class="hljs-number">0</span> -<span class="hljs-number">1.198795</span>  <span class="hljs-number">0.928378</span>
<span class="hljs-number">1</span> -<span class="hljs-number">2.878230</span>  <span class="hljs-number">0.014650</span>
<span class="hljs-number">2</span>  <span class="hljs-number">2.267475</span>  <span class="hljs-number">0.370952</span>
<span class="hljs-number">3</span>  <span class="hljs-number">0.639340</span> -<span class="hljs-number">1.301041</span>
<span class="hljs-number">4</span> -<span class="hljs-number">1.953444</span>  <span class="hljs-number">0.148934</span>
<span class="hljs-number">5</span> -<span class="hljs-number">0.445225</span>  <span class="hljs-number">0.459632</span>
<span class="hljs-number">6</span>  <span class="hljs-number">0.097109</span> -<span class="hljs-number">2.592833</span>
&gt;&gt;&gt;
<span 
class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;a&#x27;</span>]
<span class="hljs-number">0</span>   -<span class="hljs-number">1.198795</span>
<span class="hljs-number">1</span>   -<span class="hljs-number">2.878230</span>
<span class="hljs-number">2</span>    <span class="hljs-number">2.267475</span>
<span class="hljs-number">3</span>    <span class="hljs-number">0.639340</span>
<span class="hljs-number">4</span>   -<span class="hljs-number">1.953444</span>
<span class="hljs-number">5</span>   -<span class="hljs-number">0.445225</span>
<span class="hljs-number">6</span>    <span class="hljs-number">0.097109</span>
Name: a, dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[[<span class="hljs-string">&#x27;a&#x27;</span>]]
          a
<span class="hljs-number">0</span> -<span class="hljs-number">1.198795</span>
<span class="hljs-number">1</span> -<span class="hljs-number">2.878230</span>
<span class="hljs-number">2</span>  <span class="hljs-number">2.267475</span>
<span class="hljs-number">3</span>  <span class="hljs-number">0.639340</span>
<span class="hljs-number">4</span> -<span class="hljs-number">1.953444</span>
<span class="hljs-number">5</span> -<span class="hljs-number">0.445225</span>
<span class="hljs-number">6</span>  <span class="hljs-number">0.097109</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-built_in">type</span>(obj[<span class="hljs-string">&#x27;a&#x27;</span>])
&lt;<span class="hljs-class"><span class="hljs-keyword">class</span> &#x27;<span class="hljs-title">pandas</span>.<span class="hljs-title">core</span>.<span class="hljs-title">series</span>.<span class="hljs-title">Series</span>&#x27;&gt;</span>
<span class="hljs-class">&gt;&gt;&gt; <span class="hljs-title">type</span>(<span class="hljs-params">obj[[<span class="hljs-string">&#x27;a&#x27;</span>]]</span>)</span>
<span class="hljs-class">&lt;<span 
class="hljs-title">class</span> &#x27;<span class="hljs-title">pandas</span>.<span class="hljs-title">core</span>.<span class="hljs-title">frame</span>.<span class="hljs-title">DataFrame</span>&#x27;&gt;</span></code></pre>
<h3><span id="2-2-3-qie-pian-suo-yin"><font color="#FFA500">【2.2.3】Slice indexing</font></span></h3>
<p>Slicing a DataFrame with <code>[]</code> operates on rows. There are two ways to slice: by position or by index label. Note that a positional slice excludes the stop position, whereas a label slice includes the stop label.</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>data = np.random.randn(<span class="hljs-number">5</span>,<span class="hljs-number">4</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>index = [<span class="hljs-string">&#x27;I1&#x27;</span>, <span class="hljs-string">&#x27;I2&#x27;</span>, <span class="hljs-string">&#x27;I3&#x27;</span>, <span class="hljs-string">&#x27;I4&#x27;</span>, <span class="hljs-string">&#x27;I5&#x27;</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>columns = [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data, index, columns)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
           a         b         c         d
I1  <span class="hljs-number">0.828676</span> -<span class="hljs-number">1.663337</span>  <span class="hljs-number">1.753632</span>  <span class="hljs-number">1.432487</span>
I2  <span class="hljs-number">0.368138</span>  <span class="hljs-number">0.222166</span>  <span class="hljs-number">0.902764</span> -<span class="hljs-number">1.436186</span>
I3  <span class="hljs-number">2.285615</span> -<span class="hljs-number">2.415175</span> -<span class="hljs-number">1.344456</span> -<span class="hljs-number">0.502214</span>
I4  <span class="hljs-number">3.224288</span> -<span class="hljs-number">0.500268</span>  <span class="hljs-number">1.293596</span> -<span class="hljs-number">1.235549</span>
I5 -<span class="hljs-number">0.938833</span> -<span class="hljs-number">0.804433</span> -<span class="hljs-number">0.170047</span> -<span class="hljs-number">0.566766</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-number">0</span>:<span class="hljs-number">3</span>]
           a         b         c         d
I1  <span class="hljs-number">0.828676</span> -<span class="hljs-number">1.663337</span>  <span class="hljs-number">1.753632</span>  <span class="hljs-number">1.432487</span>
I2  <span class="hljs-number">0.368138</span>  <span class="hljs-number">0.222166</span>  <span class="hljs-number">0.902764</span> -<span class="hljs-number">1.436186</span>
I3  <span class="hljs-number">2.285615</span> -<span class="hljs-number">2.415175</span> -<span class="hljs-number">1.344456</span> -<span class="hljs-number">0.502214</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-number">0</span>:<span class="hljs-number">4</span>:<span class="hljs-number">2</span>]
           a         b         c         d
I1  <span class="hljs-number">0.828676</span> -<span class="hljs-number">1.663337</span>  <span class="hljs-number">1.753632</span>  <span class="hljs-number">1.432487</span>
I3  <span class="hljs-number">2.285615</span> -<span class="hljs-number">2.415175</span> -<span class="hljs-number">1.344456</span> -<span class="hljs-number">0.502214</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;I2&#x27;</span>:<span class="hljs-string">&#x27;I4&#x27;</span>]
           a         b         c         d
I2  <span class="hljs-number">0.368138</span>  <span 
class="hljs-number">0.222166</span>  <span class="hljs-number">0.902764</span> -<span class="hljs-number">1.436186</span>
I3  <span class="hljs-number">2.285615</span> -<span class="hljs-number">2.415175</span> -<span class="hljs-number">1.344456</span> -<span class="hljs-number">0.502214</span>
I4  <span class="hljs-number">3.224288</span> -<span class="hljs-number">0.500268</span>  <span class="hljs-number">1.293596</span> -<span class="hljs-number">1.235549</span></code></pre>
<h3><span id="2-2-4-hua-shi-suo-yin"><font color="#FFA500">【2.2.4】Fancy indexing</font></span></h3>
<p>As with Series, fancy indexing selects items that are spaced apart or non-contiguous: pass a <font color="#FF0000"><strong>list</strong></font> of column names (columns) to retrieve several columns at once:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>data = np.random.randn(<span class="hljs-number">5</span>,<span class="hljs-number">4</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>index = [<span class="hljs-string">&#x27;I1&#x27;</span>, <span class="hljs-string">&#x27;I2&#x27;</span>, <span class="hljs-string">&#x27;I3&#x27;</span>, <span class="hljs-string">&#x27;I4&#x27;</span>, <span class="hljs-string">&#x27;I5&#x27;</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>columns = [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data, index, columns)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
           a         b         c         d
I1 -<span class="hljs-number">1.083223</span> -<span class="hljs-number">0.182874</span> -<span class="hljs-number">0.348460</span> -<span 
class="hljs-number">1.572120</span>
I2 -<span class="hljs-number">0.205206</span> -<span class="hljs-number">0.251931</span>  <span class="hljs-number">1.180131</span>  <span class="hljs-number">0.847720</span>
I3 -<span class="hljs-number">0.980379</span>  <span class="hljs-number">0.325553</span> -<span class="hljs-number">0.847566</span> -<span class="hljs-number">0.882343</span>
I4 -<span class="hljs-number">0.638228</span> -<span class="hljs-number">0.282882</span> -<span class="hljs-number">0.624997</span> -<span class="hljs-number">0.245980</span>
I5 -<span class="hljs-number">0.229769</span>  <span class="hljs-number">1.002930</span> -<span class="hljs-number">0.226715</span> -<span class="hljs-number">0.916591</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>]]
           a         d
I1 -<span class="hljs-number">1.083223</span> -<span class="hljs-number">1.572120</span>
I2 -<span class="hljs-number">0.205206</span>  <span class="hljs-number">0.847720</span>
I3 -<span class="hljs-number">0.980379</span> -<span class="hljs-number">0.882343</span>
I4 -<span class="hljs-number">0.638228</span> -<span class="hljs-number">0.245980</span>
I5 -<span class="hljs-number">0.229769</span> -<span class="hljs-number">0.916591</span></code></pre>
<h3><span id="2-2-5-bu-er-suo-yin"><font color="#FFA500">【2.2.5】Boolean indexing</font></span></h3>
<p>A boolean array can be used to index the target array: a boolean operation (such as a comparison operator) produces a mask that selects the elements satisfying the condition. For a DataFrame, <code>obj[obj &gt; 0]</code> keeps the original shape and replaces every entry that fails the condition with NaN.</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>data = np.random.randn(<span class="hljs-number">5</span>,<span class="hljs-number">4</span>)
<span 
class="hljs-meta">&gt;&gt;&gt; </span>index = [<span class="hljs-string">&#x27;I1&#x27;</span>, <span class="hljs-string">&#x27;I2&#x27;</span>, <span class="hljs-string">&#x27;I3&#x27;</span>, <span class="hljs-string">&#x27;I4&#x27;</span>, <span class="hljs-string">&#x27;I5&#x27;</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>columns = [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data, index, columns)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
           a         b         c         d
I1 -<span class="hljs-number">0.602984</span> -<span class="hljs-number">0.135716</span>  <span class="hljs-number">0.999689</span> -<span class="hljs-number">0.339786</span>
I2  <span class="hljs-number">0.911130</span> -<span class="hljs-number">0.092485</span> -<span class="hljs-number">0.914074</span> -<span class="hljs-number">0.279588</span>
I3  <span class="hljs-number">0.849606</span> -<span class="hljs-number">0.420055</span> -<span class="hljs-number">1.240389</span> -<span class="hljs-number">0.179297</span>
I4  <span class="hljs-number">0.249986</span> -<span class="hljs-number">1.250668</span>  <span class="hljs-number">0.329416</span> -<span class="hljs-number">1.105774</span>
I5 -<span class="hljs-number">0.743816</span>  <span class="hljs-number">0.430647</span> -<span class="hljs-number">0.058126</span> -<span class="hljs-number">0.337319</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[obj &gt; <span class="hljs-number">0</span>]
           a         b         c   d
I1       NaN       NaN  <span class="hljs-number">0.999689</span> NaN
I2  <span class="hljs-number">0.911130</span>       NaN       NaN NaN
I3  <span class="hljs-number">0.849606</span>       NaN       NaN NaN
I4  <span 
class="hljs-number">0.249986</span>       NaN  <span class="hljs-number">0.329416</span> NaN
I5       NaN  <span class="hljs-number">0.430647</span>       NaN NaN
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj &gt; <span class="hljs-number">0</span>
        a      b      c      d
I1  <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span>   <span class="hljs-literal">True</span>  <span class="hljs-literal">False</span>
I2   <span class="hljs-literal">True</span>  <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span>
I3   <span class="hljs-literal">True</span>  <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span>
I4   <span class="hljs-literal">True</span>  <span class="hljs-literal">False</span>   <span class="hljs-literal">True</span>  <span class="hljs-literal">False</span>
I5  <span class="hljs-literal">False</span>   <span class="hljs-literal">True</span>  <span class="hljs-literal">False</span>  <span class="hljs-literal">False</span></code></pre>
<hr><pre><code class="hljs yaml"><span class="hljs-string">This is a block of anti-scraping text; readers may ignore it.</span>
<span class="hljs-string">This article was first published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106698307</span>
<span class="hljs-string">Reproduction without permission is prohibited. Respect original work and do not plagiarize.</span></code></pre><hr>
<h1><span id="3-suo-yin-qi-loc-he-iloc"><font color="#FF0000">【3】Indexers: loc and iloc</font></span></h1>
<p>loc indexes by label and iloc indexes by position. Note: before pandas 1.0.0 there was also the ix indexer, which accepted both labels and positions; it was deprecated in pandas 0.20.0 and removed in pandas 1.0.0.</p>
<h2><span id="3-1-loc-biao-qian-suo-yin"><font color="#4876FF">【3.1】loc: label-based indexing</font></span></h2>
<p>loc indexes by label, that is, it selects data according to the labels in index and columns.</p>
<h3><span id="3-1-1-series-loc"><font 
color="#FFA500">【3.1.1】Series.loc</font></span></h3>
<p>For a Series, the allowed inputs are:</p>
<ul><li>a single label, e.g. <code>5</code> or <code>'a'</code> (note that <code>5</code> is interpreted as an index label, not as an integer position);</li><li>a list or array of labels, e.g. <code>['a', 'b', 'c']</code>;</li><li>a slice object with labels, e.g. <code>'a':'f'</code>.</li></ul>
<p>Official documentation: <a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.loc.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.loc.html</a></p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
a    <span class="hljs-number">1</span>
b    <span class="hljs-number">5</span>
c   -<span class="hljs-number">8</span>
d    <span class="hljs-number">2</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.loc[<span class="hljs-string">&#x27;a&#x27;</span>]
<span class="hljs-number">1</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.loc[<span class="hljs-string">&#x27;a&#x27;</span>:<span class="hljs-string">&#x27;c&#x27;</span>]
a    <span class="hljs-number">1</span>
b    <span class="hljs-number">5</span>
c   -<span class="hljs-number">8</span>
dtype: int64
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.loc[[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>]]
a    <span class="hljs-number">1</span>
d    <span class="hljs-number">2</span>
dtype: int64</code></pre>
<h3><span 
id="3-1-2-dataframe-loc"><font color="#FFA500">【3.1.2】DataFrame.loc</font></span></h3>
<p>For a DataFrame, the first argument selects <font color="#FF0000"><strong>rows</strong></font> and the second argument selects <font color="#FF0000"><strong>columns</strong></font>; the accepted input formats are largely the same as for Series.</p>
<p>Official documentation: <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html</a></p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>], [<span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>], [<span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>]], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>], columns=[<span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>, <span class="hljs-string">&#x27;C&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
   A  B  C
a  <span class="hljs-number">1</span>  <span class="hljs-number">2</span>  <span class="hljs-number">3</span>
b  <span class="hljs-number">4</span>  <span class="hljs-number">5</span>  <span class="hljs-number">6</span>
c  <span class="hljs-number">7</span>  <span class="hljs-number">8</span>  <span class="hljs-number">9</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.loc[<span class="hljs-string">&#x27;a&#x27;</span>]
A    <span class="hljs-number">1</span>
B    <span class="hljs-number">2</span>
C    <span class="hljs-number">3</span>
Name: a, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span 
class="hljs-meta">&gt;&gt;&gt; </span>obj.loc[<span class="hljs-string">&#x27;a&#x27;</span>:<span class="hljs-string">&#x27;c&#x27;</span>]
   A  B  C
a  <span class="hljs-number">1</span>  <span class="hljs-number">2</span>  <span class="hljs-number">3</span>
b  <span class="hljs-number">4</span>  <span class="hljs-number">5</span>  <span class="hljs-number">6</span>
c  <span class="hljs-number">7</span>  <span class="hljs-number">8</span>  <span class="hljs-number">9</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.loc[[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>]]
   A  B  C
a  <span class="hljs-number">1</span>  <span class="hljs-number">2</span>  <span class="hljs-number">3</span>
c  <span class="hljs-number">7</span>  <span class="hljs-number">8</span>  <span class="hljs-number">9</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.loc[<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>]
<span class="hljs-number">5</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.loc[<span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;A&#x27;</span>:<span class="hljs-string">&#x27;C&#x27;</span>]
A    <span class="hljs-number">4</span>
B    <span class="hljs-number">5</span>
C    <span class="hljs-number">6</span>
Name: b, dtype: int64</code></pre>
<h2><span id="3-2-iloc-wei-zhi-suo-yin"><font color="#4876FF">【3.2】iloc: position-based indexing</font></span></h2>
<p>iloc serves the same purpose as loc but selects by integer position, that is, by the positional numbers of the index and columns rather than by their labels.</p>
<h3><span id="3-2-1-series-iloc"><font color="#FFA500">【3.2.1】Series.iloc</font></span></h3>
<p>Official documentation: <a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.iloc.html">https://pandas.pydata.org/docs/reference/api/pandas.Series.iloc.html</a></p>
<p>For a Series, the allowed inputs are:</p>
<ul><li>an integer, e.g. <code>5</code>;</li><li>a list or array of integers, e.g. <code>[4, 3, 0]</code>;</li><li>a slice object with integers, e.g. 
<code>1:7</code>.</li></ul>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
a    <span class="hljs-number">1</span>
b    <span class="hljs-number">5</span>
c   -<span class="hljs-number">8</span>
d    <span class="hljs-number">2</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.iloc[<span class="hljs-number">1</span>]
<span class="hljs-number">5</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.iloc[<span class="hljs-number">0</span>:<span class="hljs-number">2</span>]
a    <span class="hljs-number">1</span>
b    <span class="hljs-number">5</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.iloc[[<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">3</span>]]
a    <span class="hljs-number">1</span>
b    <span class="hljs-number">5</span>
d    <span class="hljs-number">2</span>
dtype: int64</code></pre>
<h3><span id="3-2-2-dataframe-iloc"><font color="#FFA500">【3.2.2】DataFrame.iloc</font></span></h3>
<p>Official documentation: <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iloc.html">https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iloc.html</a></p>
<p>For a DataFrame, the first argument selects <font color="#FF0000"><strong>rows</strong></font> and the second argument selects <font color="#FF0000"><strong>columns</strong></font>; the accepted input formats are largely the same as for Series:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame([[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>], [<span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>], [<span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>]], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>], columns=[<span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>, <span class="hljs-string">&#x27;C&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
   A  B  C
a  <span class="hljs-number">1</span>  <span class="hljs-number">2</span>  <span class="hljs-number">3</span>
b  <span class="hljs-number">4</span>  <span class="hljs-number">5</span>  <span class="hljs-number">6</span>
c  <span class="hljs-number">7</span>  <span class="hljs-number">8</span>  <span class="hljs-number">9</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.iloc[<span class="hljs-number">1</span>]
A    <span class="hljs-number">4</span>
B    <span class="hljs-number">5</span>
C    <span class="hljs-number">6</span>
Name: b, dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.iloc[<span class="hljs-number">0</span>:<span class="hljs-number">2</span>]
   A  B  C
a  <span class="hljs-number">1</span>  <span class="hljs-number">2</span>  <span class="hljs-number">3</span>
b  <span class="hljs-number">4</span>  <span class="hljs-number">5</span>  <span class="hljs-number">6</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span 
class="hljs-meta">&gt;&gt;&gt; </span>obj.iloc[[<span class="hljs-number">0</span>, <span class="hljs-number">2</span>]]   A  B  Ca  <span class="hljs-number">1</span>  <span class="hljs-number">2</span>  <span class="hljs-number">3</span>c  <span class="hljs-number">7</span>  <span class="hljs-number">8</span>  <span class="hljs-number">9</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.iloc[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>]<span class="hljs-number">6</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.iloc[<span class="hljs-number">1</span>, <span class="hljs-number">0</span>:<span class="hljs-number">2</span>]A    <span class="hljs-number">4</span>B    <span class="hljs-number">5</span>Name: b, dtype: int64</code></pre><h1><span id="4-pandas-chong-xin-suo-yin"><font color="#FF0000">【4】Pandas 重新索引</font></span></h1><p>Pandas 对象的一个重要方法是 reindex，其作用是创建一个新对象，它的数据符合新的索引。以 <code>DataFrame.reindex</code> 为例（Series 类似），基本语法如下：</p><p><code>DataFrame.reindex(self, labels=None, index=None, columns=None, axis=None, method=None, copy=True, level=None, fill_value=nan, limit=None, tolerance=None)</code></p><p>部分参数描述如下：（完整参数解释参见<a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.reindex.html">官方文档</a>）</p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>index</td><td>用作索引的新序列，既可以是 index 实例，也可以是其他序列型的 Python 数据结构</td></tr><tr><td>method</td><td>插值（填充）方式，取值如下：<br><code>None</code>：不填补空白；<br><code>pad / ffill</code>：将上一个有效的观测值向前传播到下一个有效的观测值；<br><code>backfill / bfill</code>：使用下一个有效观察值来填补空白；<br><code>nearest</code>：使用最近的有效观测值来填补空白。</td></tr><tr><td>fill_value</td><td>在重新索引的过程中，需要引入缺失值时使用的替代值</td></tr><tr><td>limit</td><td>前向或后向填充时的最大填充量</td></tr><tr><td>tolerance</td><td>向前或向后填充时，填充不准确匹配项的最大间距（绝对值距离）</td></tr><tr><td>level</td><td>在 Multilndex 的指定级别上匹配简单索引，否则选其子集</td></tr><tr><td>copy</td><td>默认为 True，无论如何都复制；如果为 
False，则新旧相等就不复制</td></tr></tbody></table><p>reindex 将会根据新索引进行重排。如果某个索引值当前不存在，就引入缺失值：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">4.5</span>, <span class="hljs-number">7.2</span>, -<span class="hljs-number">5.3</span>, <span class="hljs-number">3.6</span>], index=[<span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>])<span class="hljs-meta">&gt;&gt;&gt; </span>objd    <span class="hljs-number">4.5</span>b    <span class="hljs-number">7.2</span>a   -<span class="hljs-number">5.3</span>c    <span class="hljs-number">3.6</span>dtype: float64<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = obj.reindex([<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;e&#x27;</span>])<span class="hljs-meta">&gt;&gt;&gt; </span>obj2a   -<span class="hljs-number">5.3</span>b    <span class="hljs-number">7.2</span>c    <span class="hljs-number">3.6</span>d    <span class="hljs-number">4.5</span>e    NaNdtype: float64</code></pre><p>对于时间序列这样的有序数据，重新索引时可能需要做一些插值处理。method 选项即可达到此目的，例如，使用 ffill 可以实现前向值填充：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-string">&#x27;blue&#x27;</span>, <span class="hljs-string">&#x27;purple&#x27;</span>, <span class="hljs-string">&#x27;yellow&#x27;</span>], index=[<span class="hljs-number">0</span>, <span 
class="hljs-number">2</span>, <span class="hljs-number">4</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
<span class="hljs-number">0</span>      blue
<span class="hljs-number">2</span>    purple
<span class="hljs-number">4</span>    yellow
dtype: <span class="hljs-built_in">object</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = obj.reindex(<span class="hljs-built_in">range</span>(<span class="hljs-number">6</span>), method=<span class="hljs-string">&#x27;ffill&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
<span class="hljs-number">0</span>      blue
<span class="hljs-number">1</span>      blue
<span class="hljs-number">2</span>    purple
<span class="hljs-number">3</span>    purple
<span class="hljs-number">4</span>    yellow
<span class="hljs-number">5</span>    yellow
dtype: <span class="hljs-built_in">object</span></code></pre>
<p>With a DataFrame, reindex can alter the (row) index, the columns, or both. When passed only a sequence, it reindexes the rows:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(np.arange(<span class="hljs-number">9</span>).reshape((<span class="hljs-number">3</span>, <span class="hljs-number">3</span>)), index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], columns=[<span class="hljs-string">&#x27;Ohio&#x27;</span>, <span class="hljs-string">&#x27;Texas&#x27;</span>, <span class="hljs-string">&#x27;California&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
   Ohio  Texas  California
a     <span class="hljs-number">0</span>      <span class="hljs-number">1</span>           <span class="hljs-number">2</span>
c     <span class="hljs-number">3</span>      <span class="hljs-number">4</span>           <span class="hljs-number">5</span>
d     <span class="hljs-number">6</span>      <span class="hljs-number">7</span>           <span class="hljs-number">8</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2 = obj.reindex([<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj2
   Ohio  Texas  California
a   <span class="hljs-number">0.0</span>    <span class="hljs-number">1.0</span>         <span class="hljs-number">2.0</span>
b   NaN    NaN         NaN
c   <span class="hljs-number">3.0</span>    <span class="hljs-number">4.0</span>         <span class="hljs-number">5.0</span>
d   <span class="hljs-number">6.0</span>    <span class="hljs-number">7.0</span>         <span class="hljs-number">8.0</span></code></pre>
<p>Columns can be reindexed with the columns keyword:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(np.arange(<span class="hljs-number">9</span>).reshape((<span class="hljs-number">3</span>, <span class="hljs-number">3</span>)), index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], columns=[<span class="hljs-string">&#x27;Ohio&#x27;</span>, <span class="hljs-string">&#x27;Texas&#x27;</span>, <span class="hljs-string">&#x27;California&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
   Ohio  Texas  California
a     <span class="hljs-number">0</span>      <span class="hljs-number">1</span>           <span class="hljs-number">2</span>
c     <span class="hljs-number">3</span>      <span class="hljs-number">4</span>           <span class="hljs-number">5</span>
d     <span class="hljs-number">6</span>      <span class="hljs-number">7</span>           <span class="hljs-number">8</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>states = [<span class="hljs-string">&#x27;Texas&#x27;</span>, <span class="hljs-string">&#x27;Utah&#x27;</span>, <span class="hljs-string">&#x27;California&#x27;</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.reindex(columns=states)
   Texas  Utah  California
a      <span class="hljs-number">1</span>   NaN           <span class="hljs-number">2</span>
c      <span class="hljs-number">4</span>   NaN           <span class="hljs-number">5</span>
d      <span class="hljs-number">7</span>   NaN           <span class="hljs-number">8</span></code></pre>
<hr>
<pre><code class="hljs yaml"><span class="hljs-string">This is a block of anti-scraping text; readers can ignore it.</span>
<span class="hljs-string">This article was first published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106698307</span>
<span class="hljs-string">Do not repost without authorization! Malicious reposters bear the consequences! Respect original work and stay away from plagiarism!</span></code></pre>
<hr>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;Table of Contents&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-1-index-suo-yin-dui-xiang-font&quot;&gt;&lt;font colo</summary>
      
    
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    <category term="Pandas" scheme="https://www.itbob.cn/tags/Pandas/"/>
    
  </entry>
  
  <entry>
    <title>The Python Data Analysis Trio, Pandas (1): Getting to Know Pandas and Its Series and DataFrame Objects</title>
    <link href="https://www.itbob.cn/article/025/"/>
    <id>https://www.itbob.cn/article/025/</id>
    <published>2020-06-11T12:39:54.000Z</published>
    <updated>2022-05-22T12:37:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">Table of Contents</span></h2><!-- toc -->
<ul>
<li><a href="#font-color-ff0000-01x00-liao-jie-pandas-font"><font color="#FF0000">【01x00】About Pandas</font></a></li>
<li><a href="#font-color-ff0000-02x00-pandas-shu-ju-jie-gou-font"><font color="#FF0000">【02x00】Pandas Data Structures</font></a></li>
<li><a href="#font-color-ff0000-03x00-series-dui-xiang-font"><font color="#FF0000">【03x00】The Series Object</font></a><ul>
<li><a href="#font-color-4876ff-03x01-tong-guo-list-gou-jian-series-font"><font color="#4876FF">【03x01】Building a Series from a list</font></a></li>
<li><a href="#font-color-4876ff-03x02-tong-guo-dict-gou-jian-series-font"><font color="#4876FF">【03x02】Building a Series from a dict</font></a></li>
<li><a href="#font-color-4876ff-03x03-huo-qu-qi-shu-ju-he-suo-yin-font"><font color="#4876FF">【03x03】Getting Its Data and Index</font></a></li>
<li><a href="#font-color-4876ff-03x04-tong-guo-suo-yin-huo-qu-shu-ju-font"><font color="#4876FF">【03x04】Getting Data by Index</font></a></li>
<li><a href="#font-color-4876ff-03x05-shi-yong-han-shu-yun-suan-font"><font color="#4876FF">【03x05】Operations with Functions</font></a></li>
<li><a href="#font-color-4876ff-03x06-name-shu-xing-font"><font color="#4876FF">【03x06】The name Attribute</font></a></li>
</ul></li>
<li><a href="#font-color-ff0000-04x00-dataframe-dui-xiang-font"><font color="#FF0000">【04x00】The DataFrame Object</font></a><ul>
<li><a href="#font-color-4876ff-03x01-tong-guo-ndarray-gou-jian-dataframe-font"><font color="#4876FF">【03x01】Building a DataFrame from an ndarray</font></a></li>
<li><a href="#font-color-4876ff-03x02-tong-guo-dict-gou-jian-dataframe-font"><font color="#4876FF">【03x02】Building a DataFrame from a dict</font></a></li>
<li><a href="#font-color-4876ff-03x03-huo-qu-qi-shu-ju-he-suo-yin-font-1"><font color="#4876FF">【03x03】Getting Its Data and Index</font></a></li>
<li><a href="#font-color-4876ff-03x04-tong-guo-suo-yin-huo-qu-shu-ju-font-1"><font color="#4876FF">【03x04】Getting Data by Index</font></a></li>
<li><a href="#font-color-4876ff-03x05-xiu-gai-lie-de-zhi-font"><font color="#4876FF">【03x05】Modifying Column Values</font></a></li>
<li><a href="#font-color-4876ff-03x06-zeng-jia-shan-chu-lie-font"><font color="#4876FF">【03x06】Adding / Removing Columns</font></a></li>
<li><a href="#font-color-4876ff-03x07-name-shu-xing-font"><font color="#4876FF">【03x07】The name Attribute</font></a></li>
</ul></li>
</ul><!-- tocstop -->
<hr>
<p>Articles in the Pandas series:</p>
<ul>
<li><a href="https://www.itbob.cn/article/025/">The Python Data Analysis Trio, Pandas (1): Getting to Know Pandas and Its Series and DataFrame Objects</a></li>
<li><a href="https://www.itbob.cn/article/026/">The Python Data Analysis Trio, Pandas (2): The Index Object and Indexing Operations</a></li>
<li><a href="https://www.itbob.cn/article/027/">The Python Data Analysis Trio, Pandas (3): Arithmetic and Handling Missing Values</a></li>
<li><a href="https://www.itbob.cn/article/028/">The Python Data Analysis Trio, Pandas (4): Function Application, Mapping, Sorting, and Hierarchical Indexing</a></li>
<li><a href="https://www.itbob.cn/article/029/">The Python Data Analysis Trio, Pandas (5): Statistical Computation and Descriptive Statistics</a></li>
<li><a href="https://www.itbob.cn/article/030/">The Python Data Analysis Trio, Pandas (6): GroupBy Split, Apply, and Combine</a></li>
<li><a href="https://www.itbob.cn/article/031/">The Python Data Analysis Trio, Pandas (7): Merging Datasets</a></li>
<li><a href="https://www.itbob.cn/article/032/">The Python Data Analysis Trio, Pandas (8): Reshaping, Handling Duplicates, and Replacing Data</a></li>
<li><a href="https://www.itbob.cn/article/033/">The Python Data Analysis Trio, Pandas (9): Time Series</a></li>
<li><a href="https://www.itbob.cn/article/034/">The Python Data Analysis Trio, Pandas (10): Reading and Writing Data</a></li>
</ul>
<hr>
<p>Columns:</p>
<ul>
<li>NumPy column: <a href="https://itrhx.blog.csdn.net/category_9780393.html">https://itrhx.blog.csdn.net/category_9780393.html</a></li>
<li>Pandas column: <a href="https://itrhx.blog.csdn.net/category_9780397.html">https://itrhx.blog.csdn.net/category_9780397.html</a></li>
<li>Matplotlib column: <a href="https://itrhx.blog.csdn.net/category_9780418.html">https://itrhx.blog.csdn.net/category_9780418.html</a></li>
</ul>
<br>Recommended learning resources and sites:<br><br>
<ul>
<li>NumPy Chinese-language site: <a href="https://www.numpy.org.cn/">https://www.numpy.org.cn/</a></li>
<li>Pandas Chinese-language site: <a href="https://www.pypandas.cn/">https://www.pypandas.cn/</a></li>
<li>Matplotlib Chinese-language site: <a
href="https://www.matplotlib.org.cn/">https://www.matplotlib.org.cn/</a></li>
<li>NumPy, Matplotlib, and Pandas cheat sheets: <a href="https://github.com/TRHX/Python-quick-reference-table">https://github.com/TRHX/Python-quick-reference-table</a></li>
</ul>
<hr>
<pre><code class="hljs yaml"><span class="hljs-string">This is a block of anti-scraping text; readers can ignore it.</span>
<span class="hljs-string">This article was first published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106676693</span>
<span class="hljs-string">Do not repost without authorization! Malicious reposters bear the consequences! Respect original work and stay away from plagiarism!</span></code></pre>
<hr>
<h2><span id="01x00-liao-jie-pandas"><font color="#FF0000">【01x00】About Pandas</font></span></h2>
<p><a href="https://pandas.pydata.org/">Pandas</a> is a data analysis package for <a href="https://www.python.org/">Python</a>, built on <a href="https://numpy.org/">NumPy</a>. It was first developed at <a href="https://www.aqr.com/">AQR Capital Management</a> in April 2008 and open-sourced at the end of 2009. It is now developed and maintained by the <a href="https://pydata.org/">PyData</a> development team, which focuses on <a href="https://www.python.org/">Python</a> data packages, and is part of the <a href="https://pydata.org/">PyData</a> project.</p>
<p><a href="https://pandas.pydata.org/">Pandas</a> was originally developed as a tool for financial data analysis, so <a href="https://pandas.pydata.org/">Pandas</a> has strong support for time series analysis. <font color="#FFA500"><strong>The name Pandas comes from panel data and Python data analysis</strong></font>. Panel data is an econometrics term for multidimensional data sets; early versions of <a href="https://pandas.pydata.org/">Pandas</a> also provided a Panel data type, which was removed in version 0.25.0.</p>
<p><a href="https://pandas.pydata.org/">Pandas</a> is often used alongside other tools, such as the numerical computing tools <a href="https://numpy.org/">NumPy</a> and <a href="https://www.scipy.org/">SciPy</a>, the analysis libraries <a href="https://www.statsmodels.org/">statsmodels</a> and <a href="https://scikit-learn.org/">scikit-learn</a>, and the visualization library <a href="https://matplotlib.org/">Matplotlib</a>. Although <a href="https://pandas.pydata.org/">Pandas</a> adopts much of NumPy's coding style, the biggest difference between the two is that <font color="#FFA500"><strong>Pandas is designed for tabular and heterogeneous data, whereas NumPy is better suited to homogeneous numerical array data.</strong></font></p>
<hr>
<p>【The following description of Pandas is translated from the official documentation: <a href="https://pandas.pydata.org/docs/getting_started/overview.html#package-overview">https://pandas.pydata.org/docs/getting_started/overview.html#package-overview</a>】</p>
<hr>
<p>Pandas is a core data analysis library for Python. It provides fast, flexible, and expressive data structures designed to make working with relational or labeled data easy and intuitive. It aims to be the fundamental high-level building block for practical, real-world data analysis in Python, and its broader goal is to become <strong>the most powerful and flexible open source data analysis tool available in any language</strong>. After years of work, Pandas is well on its way toward that goal.</p>
<p>Pandas is well suited to the following kinds of data:</p>
<ul>
<li>Tabular data with heterogeneously-typed columns, as in an SQL table or Excel spreadsheet;</li>
<li>Ordered and unordered (not necessarily fixed-frequency) time series data;</li>
<li>Matrix data with row and column labels, either homogeneously or heterogeneously typed;</li>
<li>Any other form of observational or statistical data set. The data does not need to be labeled before being placed into a Pandas data structure.</li>
</ul>
<p>The primary data structures of Pandas are <a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.html#pandas.Series">Series</a> (one-dimensional) and <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html#pandas.DataFrame">DataFrame</a> (two-dimensional). These two structures handle most typical use cases in finance, statistics, social science, engineering, and other fields. For R users, <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html#pandas.DataFrame">DataFrame</a> provides everything that R's <code>data.frame</code> provides and more. Pandas is built on <a href="https://www.numpy.org/">NumPy</a> and integrates well with other third-party scientific computing libraries.</p>
<p>Pandas is like a Swiss Army knife. Here are just a few of the things it does well:</p>
<ul>
<li>Handling <strong>missing data</strong>, represented as NaN, in floating point and non-floating point data;</li>
<li>Size mutability: columns can be <strong>inserted into or deleted from</strong> DataFrame and higher-dimensional objects;</li>
<li>Automatic and explicit <strong>data alignment</strong>: objects can be explicitly aligned to a set of labels, or you can ignore the labels and let Series and DataFrame align the data automatically in computations;</li>
<li>Powerful, flexible <strong>group by</strong> functionality: <strong>split-apply-combine</strong> operations on data sets, for both aggregating and transforming data;</li>
<li><strong>Easy conversion</strong> of ragged, differently-indexed data in Python and NumPy data structures into DataFrame objects;</li>
<li>Intelligent label-based <strong>slicing, fancy indexing, and subsetting</strong> of large data sets;</li>
<li>Intuitive <strong>merging</strong> and <strong>joining</strong> of data sets;</li>
<li>Flexible <strong>reshaping</strong> and <strong>pivoting</strong> of data sets;</li>
<li><strong>Hierarchical</strong> labeling of axes (each tick may have multiple labels);</li>
<li>Robust IO tools for reading data from flat files (CSV and other delimited files), Excel files, and databases, and for saving / loading data in the ultrafast <strong>HDF5 format</strong>;</li>
<li><strong>Time series</strong>: date range generation, frequency conversion, moving window statistics, moving window linear regression, date shifting, and other time series functionality.</li>
</ul>
<p>Many of these features address pain points found in other programming languages and research environments. Working with data usually involves several stages: munging and cleaning the data, analyzing and modeling it, then visualizing and tabulating the results. Pandas is an ideal tool for all of these stages.</p>
<p>Some other notes:</p>
<ul>
<li>Pandas is fast. Many of its low-level algorithms have been tuned in <a href="https://cython.org/">Cython</a>. However, generality inevitably costs some performance, so a specialized tool that focuses on a single feature can be faster than Pandas.</li>
<li>Pandas is a dependency of <a href="https://www.statsmodels.org/stable/index.html">statsmodels</a>, which makes it an important part of the statistical computing ecosystem in Python.</li>
<li>Pandas is widely used in finance.</li>
</ul>
<h2><span id="02x00-pandas-shu-ju-jie-gou"><font color="#FF0000">【02x00】Pandas Data Structures</font></span></h2>
<p>The primary data structures of Pandas are <a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.html#pandas.Series">Series</a> (a labeled, one-dimensional, homogeneously-typed array) and <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html#pandas.DataFrame">DataFrame</a> (a labeled, size-mutable, two-dimensional table with potentially heterogeneously-typed columns).</p>
<p>Pandas data structures are best thought of as containers for lower-dimensional data. For example, a DataFrame is a container for Series, and a Series is a container for scalars. This makes it possible to insert and remove objects from these containers in a dictionary-like fashion.</p>
<p>In addition, the default behavior of the common API functions takes into account the typical orientation of time series and cross-sectional data sets. When an ndarray is used to store two- or three-dimensional data, the user has to keep the orientation of the data set in mind when writing functions, which is a burden; setting aside the performance impact of C or Fortran contiguity, the axes are otherwise more or less equivalent in a program. In Pandas, axes are meant to give the data more intuitive semantics, that is, a more appropriate way to express the orientation of a data set. The goal is to reduce the mental effort needed to write data transformation functions.</p>
<p>For tabular data such as a DataFrame, thinking in terms of the <strong>index</strong> (the rows) and the <strong>columns</strong> is more intuitive than NumPy's <strong>axis 0</strong> and <strong>axis 1</strong>. Iterating over the columns of a DataFrame this way produces more readable code:</p>
<pre><code class="hljs python"><span class="hljs-keyword">for</span> col <span class="hljs-keyword">in</span> df.columns:
    series = df[col]
    <span class="hljs-comment"># do something with series</span></code></pre>
<h2><span id="03x00-series-dui-xiang"><font color="#FF0000">【03x00】The Series Object</font></span></h2>
<p>A Series is a labeled one-dimensional array that can hold integers, floats, strings, Python objects, and other data types. The axis labels are collectively called the index. Call the pandas.Series constructor to create a Series; the basic syntax is:</p>
<p><code>pandas.Series(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)</code></p>
<table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody>
<tr><td>data</td><td>Array-like, iterable, dict, or scalar value; the data stored in the Series</td></tr>
<tr><td>index</td><td>Index (data labels). Values must be hashable and have the same length as the data;<br>non-unique index values are allowed. Defaults to RangeIndex (0, 1, 2, …, n) if not provided</td></tr>
<tr><td>dtype</td><td>Data type of the output Series. Optional; if not specified, it is inferred from the data. See the official <a href="https://pandas.pydata.org/docs/getting_started/basics.html#dtypes">dtypes</a> documentation for details</td></tr>
<tr><td>name</td><td>str, optional; the name to give to the Series</td></tr>
<tr><td>copy</td><td>bool, optional, default False; whether to copy the input data</td></tr>
</tbody></table>
<p><img src="https://static.wukongsec.com/itbob/images/article/025/01.png" alt="01"></p>
<h3><span id="03x01-tong-guo-list-gou-jian-series"><font color="#4876FF">【03x01】Building a Series from a list</font></span></h3>
<p>In most cases only the data and index parameters are needed. A Series can be built from a list, for example:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
<span class="hljs-number">0</span>    <span class="hljs-number">1</span>
<span class="hljs-number">1</span>    <span class="hljs-number">5</span>
<span class="hljs-number">2</span>   -<span class="hljs-number">8</span>
<span class="hljs-number">3</span>    <span class="hljs-number">2</span>
dtype: int64</code></pre>
<p>Because no index was specified for the data, an integer index from 0 to N-1 (where N is the length of the data) is created automatically. The left column is the automatically created index and the right column is the data.</p>
<p>You can also specify a custom index:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
a    <span class="hljs-number">1</span>
b    <span class="hljs-number">5</span>
c   -<span class="hljs-number">8</span>
d    <span class="hljs-number">2</span>
dtype: int64</code></pre>
<p>The index can also be modified in place by assignment:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
a    <span class="hljs-number">1</span>
b    <span class="hljs-number">5</span>
c   -<span class="hljs-number">8</span>
d    <span class="hljs-number">2</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index = [<span class="hljs-string">&#x27;Bob&#x27;</span>, <span class="hljs-string">&#x27;Steve&#x27;</span>, <span class="hljs-string">&#x27;Jeff&#x27;</span>, <span class="hljs-string">&#x27;Ryan&#x27;</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
Bob      <span class="hljs-number">1</span>
Steve    <span class="hljs-number">5</span>
Jeff    -<span class="hljs-number">8</span>
Ryan     <span class="hljs-number">2</span>
dtype: int64</code></pre>
<h3><span id="03x02-tong-guo-dict-gou-jian-series"><font color="#4876FF">【03x02】Building a Series from a dict</font></span></h3>
<p>When a Series is built from a dict, the dict's keys become the index and its values become the data, for example:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;Beijing&#x27;</span>: <span class="hljs-number">21530000</span>, <span
class="hljs-string">&#x27;Shanghai&#x27;</span>: <span class="hljs-number">24280000</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>: <span class="hljs-number">11210000</span>, <span class="hljs-string">&#x27;Zhejiang&#x27;</span>: <span class="hljs-number">58500000</span>&#125;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(data)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
Beijing     <span class="hljs-number">21530000</span>
Shanghai    <span class="hljs-number">24280000</span>
Wuhan       <span class="hljs-number">11210000</span>
Zhejiang    <span class="hljs-number">58500000</span>
dtype: int64</code></pre>
<p>To get the result in a particular order, pass the keys in the desired order as the index. A label with no matching key in the dict (here Guangzhou) gets NaN, and a key that is not listed (here Beijing) is dropped:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;Beijing&#x27;</span>: <span class="hljs-number">21530000</span>, <span class="hljs-string">&#x27;Shanghai&#x27;</span>: <span class="hljs-number">24280000</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>: <span class="hljs-number">11210000</span>, <span class="hljs-string">&#x27;Zhejiang&#x27;</span>: <span class="hljs-number">58500000</span>&#125;
<span class="hljs-meta">&gt;&gt;&gt; </span>cities = [<span class="hljs-string">&#x27;Guangzhou&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Zhejiang&#x27;</span>, <span class="hljs-string">&#x27;Shanghai&#x27;</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(data, index=cities)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
Guangzhou           NaN
Wuhan        <span class="hljs-number">11210000.0</span>
Zhejiang     <span class="hljs-number">58500000.0</span>
Shanghai     <span class="hljs-number">24280000.0</span>
dtype: float64</code></pre>
<p><font color="#FF0000"><strong>Note: when data is a dict and the index parameter is not set:</strong></font></p>
<ul>
<li><font color="#FF0000"><strong>If Python &gt;= 3.6 and Pandas &gt;= 0.23, the Series index follows the dict's insertion order.</strong></font></li>
<li><font color="#FF0000"><strong>If Python &lt; 3.6 or Pandas &lt; 0.23, the Series index is sorted alphabetically.</strong></font></li>
</ul>
<h3><span id="03x03-huo-qu-qi-shu-ju-he-suo-yin"><font color="#4876FF">【03x03】Getting Its Data and Index</font></span></h3>
<p>The values and index attributes of a Series return its data and its index object:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.values
array([ <span class="hljs-number">1</span>,  <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>,  <span class="hljs-number">2</span>], dtype=int64)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index
Index([<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>], dtype=<span class="hljs-string">&#x27;object&#x27;</span>)</code></pre>
<h3><span id="03x04-tong-guo-suo-yin-huo-qu-shu-ju"><font color="#4876FF">【03x04】Getting Data by Index</font></span></h3>
<p>Unlike a plain NumPy array, a Series lets you select a single value or a set of values by index label. To select a set of values, pass a list whose elements are index labels. You can also assign through a label to modify the corresponding value:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
a    <span class="hljs-number">1</span>
b    <span class="hljs-number">5</span>
c   -<span class="hljs-number">8</span>
d    <span class="hljs-number">2</span>
dtype: int64
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;a&#x27;</span>]
<span class="hljs-number">1</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;a&#x27;</span>] = <span class="hljs-number">3</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>]]
a    <span class="hljs-number">3</span>
b    <span class="hljs-number">5</span>
c   -<span class="hljs-number">8</span>
dtype: int64</code></pre>
<h3><span id="03x05-shi-yong-han-shu-yun-suan"><font color="#4876FF">【03x05】Operations with Functions</font></span></h3>
<p>NumPy functions and NumPy-like operations (such as filtering with a boolean array, scalar multiplication, or applying math functions) can be used with Pandas:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span
class="hljs-meta">&gt;&gt;&gt; </span>obj[obj &gt; <span class="hljs-number">0</span>]a    <span class="hljs-number">1</span>b    <span class="hljs-number">5</span>d    <span class="hljs-number">2</span>dtype: int64<span class="hljs-meta">&gt;&gt;&gt; </span>obj * <span class="hljs-number">2</span>a     <span class="hljs-number">2</span>b    <span class="hljs-number">10</span>c   -<span class="hljs-number">16</span>d     <span class="hljs-number">4</span>dtype: int64<span class="hljs-meta">&gt;&gt;&gt; </span>np.exp(obj)a      <span class="hljs-number">2.718282</span>b    <span class="hljs-number">148.413159</span>c      <span class="hljs-number">0.000335</span>d      <span class="hljs-number">7.389056</span>dtype: float64</code></pre><p>除了这些运算函数以外，还可以将 Series 看成是一个定长的有序字典，因为它是索引值到数据值的一个映射。它可以用在许多原本需要字典参数的函数中：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">&#x27;a&#x27;</span> <span class="hljs-keyword">in</span> obj<span class="hljs-literal">True</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">&#x27;e&#x27;</span> <span class="hljs-keyword">in</span> obj<span class="hljs-literal">False</span></code></pre><p>和 NumPy 类似，Pandas 中也有 NaN（即非数字，not a number），在 Pandas 中，它用于表示缺失值，Pandas 的 isnull 和 notnull 函数可用于检测缺失数据：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span 
class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series([np.nan, <span class="hljs-number">5</span>, -<span class="hljs-number">8</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
a    NaN
b    <span class="hljs-number">5.0</span>
c   -<span class="hljs-number">8.0</span>
d    <span class="hljs-number">2.0</span>
dtype: float64
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.isnull(obj)
a     <span class="hljs-literal">True</span>
b    <span class="hljs-literal">False</span>
c    <span class="hljs-literal">False</span>
d    <span class="hljs-literal">False</span>
dtype: <span class="hljs-built_in">bool</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.notnull(obj)
a    <span class="hljs-literal">False</span>
b     <span class="hljs-literal">True</span>
c     <span class="hljs-literal">True</span>
d     <span class="hljs-literal">True</span>
dtype: <span class="hljs-built_in">bool</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.isnull()
a     <span class="hljs-literal">True</span>
b    <span class="hljs-literal">False</span>
c    <span class="hljs-literal">False</span>
d    <span class="hljs-literal">False</span>
dtype: <span class="hljs-built_in">bool</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.notnull()
a    <span class="hljs-literal">False</span>
b     <span class="hljs-literal">True</span>
c     <span class="hljs-literal">True</span>
d     <span class="hljs-literal">True</span>
dtype: <span class="hljs-built_in">bool</span></code></pre>
<h3><span id="03x06-name-shu-xing"><font color="#4876FF">【03x06】The name attribute</font></span></h3>
<p>A name can be assigned to a Series object in the <code>pandas.Series</code> constructor:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;Beijing&#x27;</span>: <span class="hljs-number">21530000</span>, <span class="hljs-string">&#x27;Shanghai&#x27;</span>: <span class="hljs-number">24280000</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>: <span class="hljs-number">11210000</span>, <span class="hljs-string">&#x27;Zhejiang&#x27;</span>: <span class="hljs-number">58500000</span>&#125;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(data, name=<span class="hljs-string">&#x27;population&#x27;</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
Beijing     <span class="hljs-number">21530000</span>
Shanghai    <span class="hljs-number">24280000</span>
Wuhan       <span class="hljs-number">11210000</span>
Zhejiang    <span class="hljs-number">58500000</span>
Name: population, dtype: int64</code></pre>
<p>The <code>name</code> and <code>index.name</code> attributes can also be used to name the Series object and its index:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;Beijing&#x27;</span>: <span class="hljs-number">21530000</span>, <span class="hljs-string">&#x27;Shanghai&#x27;</span>: <span class="hljs-number">24280000</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>: <span class="hljs-number">11210000</span>, <span class="hljs-string">&#x27;Zhejiang&#x27;</span>: <span class="hljs-number">58500000</span>&#125;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.Series(data)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.name = <span class="hljs-string">&#x27;population&#x27;</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.name = <span class="hljs-string">&#x27;cities&#x27;</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
cities
Beijing     <span class="hljs-number">21530000</span>
Shanghai    <span class="hljs-number">24280000</span>
Wuhan       <span class="hljs-number">11210000</span>
Zhejiang    <span class="hljs-number">58500000</span>
Name: population, dtype: int64</code></pre>
<hr>
<pre><code class="hljs yaml"><span class="hljs-string">This is an anti-scraping notice; readers can ignore it.</span>
<span class="hljs-string">This article was first published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106676693</span>
<span class="hljs-string">Reposting without authorization is prohibited! Malicious reposters bear the consequences. Respect original work and stay away from plagiarism!</span></code></pre>
<hr>
<h2><span id="04x00-dataframe-dui-xiang"><font color="#FF0000">【04x00】The DataFrame object</font></span></h2>
<p>A DataFrame is a tabular data structure containing an ordered collection of columns, each of which can hold a different value type (numeric, string, boolean, and so on). A DataFrame has both a row index and a column index, and it can be thought of as a dict of Series that all share the same index. Its data is stored as one or more two-dimensional blocks rather than as a list, dict, or other one-dimensional structure.</p>
<ul><li>Similar to a multidimensional array or tabular data (such as Excel, or data.frame in R);</li><li>Each column can hold a different data type;</li><li>The index comprises both a column index and a row index.</li></ul>
<p>The basic syntax is:</p>
<p><code>pandas.DataFrame(data=None, index: Optional[Collection] = None, columns: Optional[Collection] = None, dtype: Union[str, numpy.dtype, ExtensionDtype, None] = None, copy: bool = False)</code></p>
<table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>data</td><td>An ndarray (structured or homogeneous), an iterable, or a dict; the data to store in the DataFrame</td></tr><tr><td>index</td><td>Array-like; the row index (data labels). Defaults to RangeIndex (0, 1, 2, …, n) if not provided</td></tr><tr><td>columns</td><td>Column labels. Defaults to RangeIndex (0, 1, 2, …, n) if not provided</td></tr><tr><td>dtype</td><td>Data type of the output. Optional; if not specified, it is inferred from the data. See the official <a href="https://pandas.pydata.org/docs/getting_started/basics.html#dtypes">dtypes</a> documentation for details</td></tr><tr><td>copy</td><td>bool, optional, default False; whether to copy the input data. Only affects DataFrame / 2d ndarray input</td></tr></tbody></table>
<p><img src="https://static.wukongsec.com/itbob/images/article/025/02.png" 
alt="02"></p>
<h3><span id="03x01-tong-guo-ndarray-gou-jian-dataframe"><font color="#4876FF">【03x01】Building a DataFrame from an ndarray</font></span></h3>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>data = np.random.randn(<span class="hljs-number">5</span>,<span class="hljs-number">3</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>data
array([[-<span class="hljs-number">2.16231157</span>,  <span class="hljs-number">0.44967198</span>, -<span class="hljs-number">0.73131523</span>],
       [ <span class="hljs-number">1.18982913</span>,  <span class="hljs-number">0.94670798</span>,  <span class="hljs-number">0.82973421</span>],
       [-<span class="hljs-number">1.57680831</span>, -<span class="hljs-number">0.99732066</span>,  <span class="hljs-number">0.96432</span>   ],
       [-<span class="hljs-number">0.77483149</span>, -<span class="hljs-number">1.23802881</span>,  <span class="hljs-number">0.44061227</span>],
       [ <span class="hljs-number">1.77666419</span>,  <span class="hljs-number">0.24931983</span>, -<span class="hljs-number">1.12960153</span>]])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
          <span class="hljs-number">0</span>         <span class="hljs-number">1</span>         <span class="hljs-number">2</span>
<span class="hljs-number">0</span> -<span class="hljs-number">2.162312</span>  <span class="hljs-number">0.449672</span> -<span class="hljs-number">0.731315</span>
<span class="hljs-number">1</span>  <span class="hljs-number">1.189829</span>  <span class="hljs-number">0.946708</span>  <span class="hljs-number">0.829734</span>
<span class="hljs-number">2</span> -<span class="hljs-number">1.576808</span> -<span class="hljs-number">0.997321</span>  <span class="hljs-number">0.964320</span>
<span class="hljs-number">3</span> -<span class="hljs-number">0.774831</span> -<span class="hljs-number">1.238029</span>  <span class="hljs-number">0.440612</span>
<span class="hljs-number">4</span>  <span class="hljs-number">1.776664</span>  <span class="hljs-number">0.249320</span> -<span class="hljs-number">1.129602</span></code></pre>
<p>As with a Series, the index and the column labels (columns) can be supplied at construction time or modified in place by assignment:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>data = np.random.randn(<span class="hljs-number">5</span>,<span class="hljs-number">3</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>index = [<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>, <span class="hljs-string">&#x27;e&#x27;</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>columns = [<span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>, <span class="hljs-string">&#x27;C&#x27;</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data, index, columns)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
          A         B         C
a -<span class="hljs-number">1.042909</span> -<span class="hljs-number">0.238236</span> -<span class="hljs-number">1.050308</span>
b  <span class="hljs-number">0.587079</span>  <span class="hljs-number">0.739683</span> -<span class="hljs-number">0.233624</span>
c -<span class="hljs-number">0.451254</span> -<span class="hljs-number">0.638496</span>  <span class="hljs-number">1.708807</span>
d -<span 
class="hljs-number">0.620158</span> -<span class="hljs-number">1.875929</span> -<span class="hljs-number">0.432382</span>
e -<span class="hljs-number">1.093815</span>  <span class="hljs-number">0.396965</span> -<span class="hljs-number">0.759479</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index = [<span class="hljs-string">&#x27;A1&#x27;</span>, <span class="hljs-string">&#x27;A2&#x27;</span>, <span class="hljs-string">&#x27;A3&#x27;</span>, <span class="hljs-string">&#x27;A4&#x27;</span>, <span class="hljs-string">&#x27;A5&#x27;</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.columns = [<span class="hljs-string">&#x27;B1&#x27;</span>, <span class="hljs-string">&#x27;B2&#x27;</span>, <span class="hljs-string">&#x27;B3&#x27;</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
          B1        B2        B3
A1 -<span class="hljs-number">1.042909</span> -<span class="hljs-number">0.238236</span> -<span class="hljs-number">1.050308</span>
A2  <span class="hljs-number">0.587079</span>  <span class="hljs-number">0.739683</span> -<span class="hljs-number">0.233624</span>
A3 -<span class="hljs-number">0.451254</span> -<span class="hljs-number">0.638496</span>  <span class="hljs-number">1.708807</span>
A4 -<span class="hljs-number">0.620158</span> -<span class="hljs-number">1.875929</span> -<span class="hljs-number">0.432382</span>
A5 -<span class="hljs-number">1.093815</span>  <span class="hljs-number">0.396965</span> -<span class="hljs-number">0.759479</span></code></pre>
<h3><span id="03x02-tong-guo-dict-gou-jian-dataframe"><font color="#4876FF">【03x02】Building a DataFrame from a dict</font></span></h3>
<p>When a DataFrame is built from a dict, the dict keys become the column labels (columns) and the dict values become the data. For example:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;city&#x27;</span>: [<span 
class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>],
        <span class="hljs-string">&#x27;year&#x27;</span>: [<span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>, <span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>],
        <span class="hljs-string">&#x27;people&#x27;</span>: [<span class="hljs-number">10892900</span>, <span class="hljs-number">11081000</span>, <span class="hljs-number">11212000</span>, <span class="hljs-number">21707000</span>, <span class="hljs-number">21542000</span>, <span class="hljs-number">21536000</span>]&#125;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
      city  year    people
<span class="hljs-number">0</span>    Wuhan  <span class="hljs-number">2017</span>  <span class="hljs-number">10892900</span>
<span class="hljs-number">1</span>    Wuhan  <span class="hljs-number">2018</span>  <span class="hljs-number">11081000</span>
<span class="hljs-number">2</span>    Wuhan  <span class="hljs-number">2019</span>  <span class="hljs-number">11212000</span>
<span class="hljs-number">3</span>  Beijing  <span class="hljs-number">2017</span>  <span class="hljs-number">21707000</span>
<span class="hljs-number">4</span>  Beijing  <span class="hljs-number">2018</span>  <span class="hljs-number">21542000</span>
<span class="hljs-number">5</span>  Beijing  <span class="hljs-number">2019</span>  <span class="hljs-number">21536000</span></code></pre>
<p>If a sequence of columns is specified, the DataFrame's columns are arranged in that order. If a requested column is not found in the data, it appears in the result filled with missing values (NaN):</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span 
class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;city&#x27;</span>: [<span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>],
        <span class="hljs-string">&#x27;year&#x27;</span>: [<span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>, <span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>],
        <span class="hljs-string">&#x27;people&#x27;</span>: [<span class="hljs-number">10892900</span>, <span class="hljs-number">11081000</span>, <span class="hljs-number">11212000</span>, <span class="hljs-number">21707000</span>, <span class="hljs-number">21542000</span>, <span class="hljs-number">21536000</span>]&#125;
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.DataFrame(data)
      city  year    people
<span class="hljs-number">0</span>    Wuhan  <span class="hljs-number">2017</span>  <span class="hljs-number">10892900</span>
<span class="hljs-number">1</span>    Wuhan  <span class="hljs-number">2018</span>  <span class="hljs-number">11081000</span>
<span class="hljs-number">2</span>    Wuhan  <span class="hljs-number">2019</span>  <span class="hljs-number">11212000</span>
<span class="hljs-number">3</span>  Beijing  <span class="hljs-number">2017</span>  <span class="hljs-number">21707000</span>
<span class="hljs-number">4</span>  Beijing  <span class="hljs-number">2018</span>  <span class="hljs-number">21542000</span>
<span class="hljs-number">5</span>  Beijing  <span class="hljs-number">2019</span>  <span class="hljs-number">21536000</span>
<span 
class="hljs-meta">&gt;&gt;&gt; </span>pd.DataFrame(data, columns=[<span class="hljs-string">&#x27;year&#x27;</span>, <span class="hljs-string">&#x27;city&#x27;</span>, <span class="hljs-string">&#x27;people&#x27;</span>])
   year     city    people
<span class="hljs-number">0</span>  <span class="hljs-number">2017</span>    Wuhan  <span class="hljs-number">10892900</span>
<span class="hljs-number">1</span>  <span class="hljs-number">2018</span>    Wuhan  <span class="hljs-number">11081000</span>
<span class="hljs-number">2</span>  <span class="hljs-number">2019</span>    Wuhan  <span class="hljs-number">11212000</span>
<span class="hljs-number">3</span>  <span class="hljs-number">2017</span>  Beijing  <span class="hljs-number">21707000</span>
<span class="hljs-number">4</span>  <span class="hljs-number">2018</span>  Beijing  <span class="hljs-number">21542000</span>
<span class="hljs-number">5</span>  <span class="hljs-number">2019</span>  Beijing  <span class="hljs-number">21536000</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>pd.DataFrame(data, columns=[<span class="hljs-string">&#x27;year&#x27;</span>, <span class="hljs-string">&#x27;city&#x27;</span>, <span class="hljs-string">&#x27;people&#x27;</span>, <span class="hljs-string">&#x27;money&#x27;</span>])
   year     city    people money
<span class="hljs-number">0</span>  <span class="hljs-number">2017</span>    Wuhan  <span class="hljs-number">10892900</span>   NaN
<span class="hljs-number">1</span>  <span class="hljs-number">2018</span>    Wuhan  <span class="hljs-number">11081000</span>   NaN
<span class="hljs-number">2</span>  <span class="hljs-number">2019</span>    Wuhan  <span class="hljs-number">11212000</span>   NaN
<span class="hljs-number">3</span>  <span class="hljs-number">2017</span>  Beijing  <span class="hljs-number">21707000</span>   NaN
<span class="hljs-number">4</span>  <span class="hljs-number">2018</span>  Beijing  <span class="hljs-number">21542000</span>   NaN
<span class="hljs-number">5</span>  <span class="hljs-number">2019</span>  Beijing  <span class="hljs-number">21536000</span>   NaN</code></pre>
<p><font color="#FF0000"><strong>Note: when data is a dict and the columns parameter is not set:</strong></font></p>
<ul><li><p><font color="#FF0000"><strong>With Python &gt;= 3.6 and Pandas &gt;= 0.23, the DataFrame's columns follow the dict's insertion order.</strong></font></p></li><li><p><font color="#FF0000"><strong>With Python &lt; 3.6 or Pandas &lt; 0.23, the DataFrame's columns are sorted alphabetically by dict key.</strong></font></p></li></ul>
<h3><span id="03x03-huo-qu-qi-shu-ju-he-suo-yin"><font color="#4876FF">【03x03】Getting the data and the index</font></span></h3>
<p>As with a Series, a DataFrame's data and index object are available through its values and index attributes:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;city&#x27;</span>: [<span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>],
        <span class="hljs-string">&#x27;year&#x27;</span>: [<span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>, <span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>],
        <span class="hljs-string">&#x27;people&#x27;</span>: [<span class="hljs-number">10892900</span>, <span class="hljs-number">11081000</span>, <span class="hljs-number">11212000</span>, <span class="hljs-number">21707000</span>, <span class="hljs-number">21542000</span>, <span class="hljs-number">21536000</span>]&#125;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index
RangeIndex(start=<span class="hljs-number">0</span>, stop=<span class="hljs-number">6</span>, step=<span class="hljs-number">1</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.values
array([[<span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-number">2017</span>, <span class="hljs-number">10892900</span>],
       [<span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">11081000</span>],
       [<span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-number">2019</span>, <span class="hljs-number">11212000</span>],
       [<span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-number">2017</span>, <span class="hljs-number">21707000</span>],
       [<span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">21542000</span>],
       [<span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-number">2019</span>, <span class="hljs-number">21536000</span>]], dtype=<span class="hljs-built_in">object</span>)</code></pre>
<h3><span id="03x04-tong-guo-suo-yin-huo-qu-shu-ju"><font color="#4876FF">【03x04】Retrieving data by index</font></span></h3>
<p>A DataFrame column can be retrieved as a Series object using either dict-like notation or attribute access;</p>
<p>Rows can also be retrieved by label or by position, for example with the loc attribute (loc selects by label; iloc is its position-based counterpart);</p>
<p>For a very large DataFrame, the head method selects the first five rows by default.</p>
<p>Usage example:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;city&#x27;</span>: [<span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, 
<span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>],
        <span class="hljs-string">&#x27;year&#x27;</span>: [<span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>, <span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>],
        <span class="hljs-string">&#x27;people&#x27;</span>: [<span class="hljs-number">10892900</span>, <span class="hljs-number">11081000</span>, <span class="hljs-number">11212000</span>, <span class="hljs-number">21707000</span>, <span class="hljs-number">21542000</span>, <span class="hljs-number">21536000</span>]&#125;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
      city  year    people
<span class="hljs-number">0</span>    Wuhan  <span class="hljs-number">2017</span>  <span class="hljs-number">10892900</span>
<span class="hljs-number">1</span>    Wuhan  <span class="hljs-number">2018</span>  <span class="hljs-number">11081000</span>
<span class="hljs-number">2</span>    Wuhan  <span class="hljs-number">2019</span>  <span class="hljs-number">11212000</span>
<span class="hljs-number">3</span>  Beijing  <span class="hljs-number">2017</span>  <span class="hljs-number">21707000</span>
<span class="hljs-number">4</span>  Beijing  <span class="hljs-number">2018</span>  <span class="hljs-number">21542000</span>
<span class="hljs-number">5</span>  Beijing  <span class="hljs-number">2019</span>  <span class="hljs-number">21536000</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;city&#x27;</span>]
<span class="hljs-number">0</span>      Wuhan
<span class="hljs-number">1</span>      Wuhan
<span class="hljs-number">2</span>      Wuhan
<span class="hljs-number">3</span>    Beijing
<span class="hljs-number">4</span>    Beijing
<span class="hljs-number">5</span>    Beijing
Name: city, dtype: <span class="hljs-built_in">object</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.year
<span class="hljs-number">0</span>    <span class="hljs-number">2017</span>
<span class="hljs-number">1</span>    <span class="hljs-number">2018</span>
<span class="hljs-number">2</span>    <span class="hljs-number">2019</span>
<span class="hljs-number">3</span>    <span class="hljs-number">2017</span>
<span class="hljs-number">4</span>    <span class="hljs-number">2018</span>
<span class="hljs-number">5</span>    <span class="hljs-number">2019</span>
Name: year, dtype: int64
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-built_in">type</span>(obj.year)
&lt;class &#x27;pandas.core.series.Series&#x27;&gt;
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.loc[<span class="hljs-number">2</span>]
city         Wuhan
year          <span class="hljs-number">2019</span>
people    <span class="hljs-number">11212000</span>
Name: <span class="hljs-number">2</span>, dtype: <span class="hljs-built_in">object</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj.head()
      city  year    people
<span class="hljs-number">0</span>    Wuhan  <span class="hljs-number">2017</span>  <span class="hljs-number">10892900</span>
<span class="hljs-number">1</span>    Wuhan  <span class="hljs-number">2018</span>  <span class="hljs-number">11081000</span>
<span class="hljs-number">2</span>    Wuhan  <span 
class="hljs-number">2019</span>  <span class="hljs-number">11212000</span>
<span class="hljs-number">3</span>  Beijing  <span class="hljs-number">2017</span>  <span class="hljs-number">21707000</span>
<span class="hljs-number">4</span>  Beijing  <span class="hljs-number">2018</span>  <span class="hljs-number">21542000</span></code></pre>
<h3><span id="03x05-xiu-gai-lie-de-zhi"><font color="#4876FF">【03x05】Modifying column values</font></span></h3>
<p>Columns can be modified by assignment. The example below assigns first a scalar value and then an array of values to the &quot;money&quot; column:</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;city&#x27;</span>: [<span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>],
        <span class="hljs-string">&#x27;year&#x27;</span>: [<span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>, <span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>],
        <span class="hljs-string">&#x27;people&#x27;</span>: [<span class="hljs-number">10892900</span>, <span class="hljs-number">11081000</span>, <span class="hljs-number">11212000</span>, <span class="hljs-number">21707000</span>, <span class="hljs-number">21542000</span>, <span class="hljs-number">21536000</span>],
        <span class="hljs-string">&#x27;money&#x27;</span>:[np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]&#125;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data, index=[<span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>, <span class="hljs-string">&#x27;C&#x27;</span>, <span class="hljs-string">&#x27;D&#x27;</span>, <span class="hljs-string">&#x27;E&#x27;</span>, <span class="hljs-string">&#x27;F&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
      city  year    people  money
A    Wuhan  <span class="hljs-number">2017</span>  <span class="hljs-number">10892900</span>    NaN
B    Wuhan  <span class="hljs-number">2018</span>  <span class="hljs-number">11081000</span>    NaN
C    Wuhan  <span class="hljs-number">2019</span>  <span class="hljs-number">11212000</span>    NaN
D  Beijing  <span class="hljs-number">2017</span>  <span class="hljs-number">21707000</span>    NaN
E  Beijing  <span class="hljs-number">2018</span>  <span class="hljs-number">21542000</span>    NaN
F  Beijing  <span class="hljs-number">2019</span>  <span class="hljs-number">21536000</span>    NaN
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;money&#x27;</span>] = <span class="hljs-number">6666666666</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
      city  year    people       money
A    Wuhan  <span class="hljs-number">2017</span>  <span class="hljs-number">10892900</span>  <span class="hljs-number">6666666666</span>
B    Wuhan  <span class="hljs-number">2018</span>  <span class="hljs-number">11081000</span>  <span class="hljs-number">6666666666</span>
C    Wuhan  <span class="hljs-number">2019</span>  <span class="hljs-number">11212000</span>  <span class="hljs-number">6666666666</span>
D  Beijing  <span class="hljs-number">2017</span>  <span class="hljs-number">21707000</span>  <span class="hljs-number">6666666666</span>
E  Beijing  <span class="hljs-number">2018</span>  <span class="hljs-number">21542000</span>  <span class="hljs-number">6666666666</span>
F  Beijing  <span class="hljs-number">2019</span>  <span 
class="hljs-number">21536000</span>  <span class="hljs-number">6666666666</span>
&gt;&gt;&gt;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;money&#x27;</span>] = np.arange(<span class="hljs-number">100000000</span>, <span class="hljs-number">700000000</span>, <span class="hljs-number">100000000</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
      city  year    people      money
A    Wuhan  <span class="hljs-number">2017</span>  <span class="hljs-number">10892900</span>  <span class="hljs-number">100000000</span>
B    Wuhan  <span class="hljs-number">2018</span>  <span class="hljs-number">11081000</span>  <span class="hljs-number">200000000</span>
C    Wuhan  <span class="hljs-number">2019</span>  <span class="hljs-number">11212000</span>  <span class="hljs-number">300000000</span>
D  Beijing  <span class="hljs-number">2017</span>  <span class="hljs-number">21707000</span>  <span class="hljs-number">400000000</span>
E  Beijing  <span class="hljs-number">2018</span>  <span class="hljs-number">21542000</span>  <span class="hljs-number">500000000</span>
F  Beijing  <span class="hljs-number">2019</span>  <span class="hljs-number">21536000</span>  <span class="hljs-number">600000000</span></code></pre>
<p>When a list or array is assigned to a column, its length must match the length of the DataFrame. When a Series is assigned, its labels are aligned exactly to the DataFrame's index, and any index label missing from the Series receives a missing value (NaN):</p>
<pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;city&#x27;</span>: [<span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, 
<span class="hljs-string">&#x27;Beijing&#x27;</span>],
        <span class="hljs-string">&#x27;year&#x27;</span>: [<span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>, <span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>],
        <span class="hljs-string">&#x27;people&#x27;</span>: [<span class="hljs-number">10892900</span>, <span class="hljs-number">11081000</span>, <span class="hljs-number">11212000</span>, <span class="hljs-number">21707000</span>, <span class="hljs-number">21542000</span>, <span class="hljs-number">21536000</span>],
        <span class="hljs-string">&#x27;money&#x27;</span>:[np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]&#125;
<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data, index=[<span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;B&#x27;</span>, <span class="hljs-string">&#x27;C&#x27;</span>, <span class="hljs-string">&#x27;D&#x27;</span>, <span class="hljs-string">&#x27;E&#x27;</span>, <span class="hljs-string">&#x27;F&#x27;</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>obj
      city  year    people  money
A    Wuhan  <span class="hljs-number">2017</span>  <span class="hljs-number">10892900</span>    NaN
B    Wuhan  <span class="hljs-number">2018</span>  <span class="hljs-number">11081000</span>    NaN
C    Wuhan  <span class="hljs-number">2019</span>  <span class="hljs-number">11212000</span>    NaN
D  Beijing  <span class="hljs-number">2017</span>  <span class="hljs-number">21707000</span>    NaN
E  Beijing  <span class="hljs-number">2018</span>  <span class="hljs-number">21542000</span>    NaN
F  Beijing  <span class="hljs-number">2019</span>  <span class="hljs-number">21536000</span>    NaN
<span class="hljs-meta">&gt;&gt;&gt; </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>new_data = pd.Series([<span class="hljs-number">5670000000</span>, <span 
class="hljs-number">6890000000</span>, <span class="hljs-number">7890000000</span>], index=[<span class="hljs-string">&#x27;A&#x27;</span>, <span class="hljs-string">&#x27;C&#x27;</span>, <span class="hljs-string">&#x27;E&#x27;</span>])<span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;money&#x27;</span>] = new_data<span class="hljs-meta">&gt;&gt;&gt; </span>obj      city  year    people         moneyA    Wuhan  <span class="hljs-number">2017</span>  <span class="hljs-number">10892900</span>  <span class="hljs-number">5.670000e+09</span>B    Wuhan  <span class="hljs-number">2018</span>  <span class="hljs-number">11081000</span>           NaNC    Wuhan  <span class="hljs-number">2019</span>  <span class="hljs-number">11212000</span>  <span class="hljs-number">6.890000e+09</span>D  Beijing  <span class="hljs-number">2017</span>  <span class="hljs-number">21707000</span>           NaNE  Beijing  <span class="hljs-number">2018</span>  <span class="hljs-number">21542000</span>  <span class="hljs-number">7.890000e+09</span>F  Beijing  <span class="hljs-number">2019</span>  <span class="hljs-number">21536000</span>           NaN</code></pre><h3><span id="03x06-zeng-jia-shan-chu-lie"><font color="#4876FF">【03x06】增加 / 删除列</font></span></h3><p>为不存在的列赋值会创建出一个新列，关键字 del 用于删除列：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;city&#x27;</span>: [<span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>],    <span class="hljs-string">&#x27;year&#x27;</span>: [<span 
class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>, <span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>],        <span class="hljs-string">&#x27;people&#x27;</span>: [<span class="hljs-number">10892900</span>, <span class="hljs-number">11081000</span>, <span class="hljs-number">11212000</span>, <span class="hljs-number">21707000</span>, <span class="hljs-number">21542000</span>, <span class="hljs-number">21536000</span>]&#125;<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data)<span class="hljs-meta">&gt;&gt;&gt; </span>obj      city  year    people<span class="hljs-number">0</span>    Wuhan  <span class="hljs-number">2017</span>  <span class="hljs-number">10892900</span><span class="hljs-number">1</span>    Wuhan  <span class="hljs-number">2018</span>  <span class="hljs-number">11081000</span><span class="hljs-number">2</span>    Wuhan  <span class="hljs-number">2019</span>  <span class="hljs-number">11212000</span><span class="hljs-number">3</span>  Beijing  <span class="hljs-number">2017</span>  <span class="hljs-number">21707000</span><span class="hljs-number">4</span>  Beijing  <span class="hljs-number">2018</span>  <span class="hljs-number">21542000</span><span class="hljs-number">5</span>  Beijing  <span class="hljs-number">2019</span>  <span class="hljs-number">21536000</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span>obj[<span class="hljs-string">&#x27;northern&#x27;</span>] = obj[<span class="hljs-string">&#x27;city&#x27;</span>] == <span class="hljs-string">&#x27;Beijing&#x27;</span><span class="hljs-meta">&gt;&gt;&gt; </span>obj      city  year    people  northern<span class="hljs-number">0</span>    Wuhan  <span class="hljs-number">2017</span>  <span class="hljs-number">10892900</span>     <span class="hljs-literal">False</span><span class="hljs-number">1</span>    Wuhan  
<span class="hljs-number">2018</span>  <span class="hljs-number">11081000</span>     <span class="hljs-literal">False</span><span class="hljs-number">2</span>    Wuhan  <span class="hljs-number">2019</span>  <span class="hljs-number">11212000</span>     <span class="hljs-literal">False</span><span class="hljs-number">3</span>  Beijing  <span class="hljs-number">2017</span>  <span class="hljs-number">21707000</span>      <span class="hljs-literal">True</span><span class="hljs-number">4</span>  Beijing  <span class="hljs-number">2018</span>  <span class="hljs-number">21542000</span>      <span class="hljs-literal">True</span><span class="hljs-number">5</span>  Beijing  <span class="hljs-number">2019</span>  <span class="hljs-number">21536000</span>      <span class="hljs-literal">True</span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">del</span> obj[<span class="hljs-string">&#x27;northern&#x27;</span>]<span class="hljs-meta">&gt;&gt;&gt; </span>obj      city  year    people<span class="hljs-number">0</span>    Wuhan  <span class="hljs-number">2017</span>  <span class="hljs-number">10892900</span><span class="hljs-number">1</span>    Wuhan  <span class="hljs-number">2018</span>  <span class="hljs-number">11081000</span><span class="hljs-number">2</span>    Wuhan  <span class="hljs-number">2019</span>  <span class="hljs-number">11212000</span><span class="hljs-number">3</span>  Beijing  <span class="hljs-number">2017</span>  <span class="hljs-number">21707000</span><span class="hljs-number">4</span>  Beijing  <span class="hljs-number">2018</span>  <span class="hljs-number">21542000</span><span class="hljs-number">5</span>  Beijing  <span class="hljs-number">2019</span>  <span class="hljs-number">21536000</span></code></pre><h3><span id="03x07-name-shu-xing"><font color="##4876FF">【03x07】name 属性</font></span></h3><p>可以通过 <a href="http://index.name">index.name</a> 和 <a 
href="http://columns.name">columns.name</a> 属性设置索引（index）和列标签（columns）的 name，注意 DataFrame 对象是没有 name 属性的：</p><pre><code class="hljs python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd<span class="hljs-meta">&gt;&gt;&gt; </span>data = &#123;<span class="hljs-string">&#x27;city&#x27;</span>: [<span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Wuhan&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>, <span class="hljs-string">&#x27;Beijing&#x27;</span>],        <span class="hljs-string">&#x27;year&#x27;</span>: [<span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>, <span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>],        <span class="hljs-string">&#x27;people&#x27;</span>: [<span class="hljs-number">10892900</span>, <span class="hljs-number">11081000</span>, <span class="hljs-number">11212000</span>, <span class="hljs-number">21707000</span>, <span class="hljs-number">21542000</span>, <span class="hljs-number">21536000</span>]&#125;<span class="hljs-meta">&gt;&gt;&gt; </span>obj = pd.DataFrame(data)<span class="hljs-meta">&gt;&gt;&gt; </span>obj.index.name = <span class="hljs-string">&#x27;index&#x27;</span><span class="hljs-meta">&gt;&gt;&gt; </span>obj.columns.name = <span class="hljs-string">&#x27;columns&#x27;</span><span class="hljs-meta">&gt;&gt;&gt; </span>objcolumns     city  year    peopleindex                           <span class="hljs-number">0</span>          Wuhan  <span class="hljs-number">2017</span>  <span class="hljs-number">10892900</span><span class="hljs-number">1</span>          Wuhan  <span class="hljs-number">2018</span>  <span class="hljs-number">11081000</span><span 
class="hljs-number">2</span>          Wuhan  <span class="hljs-number">2019</span>  <span class="hljs-number">11212000</span><span class="hljs-number">3</span>        Beijing  <span class="hljs-number">2017</span>  <span class="hljs-number">21707000</span><span class="hljs-number">4</span>        Beijing  <span class="hljs-number">2018</span>  <span class="hljs-number">21542000</span><span class="hljs-number">5</span>        Beijing  <span class="hljs-number">2019</span>  <span class="hljs-number">21536000</span></code></pre><hr><pre><code class="hljs yaml"><span class="hljs-string">这里是一段物理防爬虫文本，请读者忽略。</span><span class="hljs-string">本文原创首发于</span> <span class="hljs-string">CSDN，作者</span> <span class="hljs-string">ITBOB。</span><span class="hljs-string">博客首页：https://itrhx.blog.csdn.net/</span><span class="hljs-string">本文链接：https://itrhx.blog.csdn.net/article/details/106676693</span><span class="hljs-string">未经授权，禁止转载！恶意转载，后果自负！尊重原创，远离剽窃！</span></code></pre><hr>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;文章目录&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-01x00-liao-jie-pandas-font&quot;&gt;&lt;font color=&quot;#</summary>
      
    
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    <category term="Pandas" scheme="https://www.itbob.cn/tags/Pandas/"/>
    
  </entry>
  
  <entry>
    <title>The Python Data Analysis Trio, Matplotlib (Part 11): The 50 Most Useful and Valuable Charts [Translation]</title>
    <link href="https://www.itbob.cn/article/024/"/>
    <id>https://www.itbob.cn/article/024/</id>
    <published>2020-06-09T08:13:49.000Z</published>
    <updated>2022-05-22T12:36:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">文章目录</span></h2><!-- toc --><ul><li><a href="#font-color-ff0000-1x00-jie-shao-introduction-font"><font color="#FF0000">【1x00】介绍（Introduction）</font></a></li><li><a href="#font-color-ff0000-2x00-zhun-bei-gong-zuo-setup-font"><font color="#FF0000">【2x00】准备工作（Setup）</font></a></li><li><a href="#font-color-ff0000-3x00-guan-lian-correlation-font"><font color="#FF0000">【3x00】关联（Correlation） </font></a><ul><li><a href="#font-color-4876ff-01-san-dian-tu-scatter-plot-font"><font color="##4876FF">【01】散点图（Scatter plot）</font></a></li><li><a href="#font-color-4876ff-02-dai-bian-jie-de-qi-pao-tu-bubble-plot-with-encircling-font"><font color="##4876FF">【02】带边界的气泡图（Bubble plot with Encircling）</font></a></li><li><a href="#font-color-4876ff-03-dai-xian-xing-hui-gui-zui-jia-ni-he-xian-de-san-dian-tu-scatter-plot-with-linear-regression-line-of-best-fit-font"><font color="##4876FF">【03】带线性回归最佳拟合线的散点图（Scatter plot with linear regression line of best fit）</font></a></li><li><a href="#font-color-4876ff-04-dou-dong-tu-jittering-with-stripplot-font"><font color="##4876FF">【04】抖动图（Jittering with stripplot）</font></a></li><li><a href="#font-color-4876ff-05-ji-shu-tu-counts-plot-font"><font color="##4876FF">【05】计数图（Counts Plot）</font></a></li><li><a href="#font-color-4876ff-06-bian-yuan-zhi-fang-tu-marginal-histogram-font"><font color="##4876FF">【06】边缘直方图（Marginal Histogram）</font></a></li><li><a href="#font-color-4876ff-07-bian-yuan-xiang-xing-tu-marginal-boxplot-font"><font color="##4876FF">【07】边缘箱形图（Marginal Boxplot）</font></a></li><li><a href="#font-color-4876ff-08-xiang-guan-tu-correllogram-font"><font color="##4876FF">【08】相关图（Correllogram）</font></a></li><li><a href="#font-color-4876ff-09-cheng-dui-tu-pairwise-plot-font"><font color="##4876FF">【09】成对图（Pairwise Plot）</font></a></li></ul></li><li><a href="#font-color-ff0000-4x00-pian-chai-deviation-font"><font color="#FF0000">【4x00】偏差（Deviation）</font></a><ul><li><a 
href="#font-color-4876ff-10-fa-san-xing-tiao-xing-tu-diverging-bars-font"><font color="##4876FF">【10】发散型条形图（Diverging Bars）</font></a></li><li><a href="#font-color-4876ff-11-fa-san-xing-wen-ben-tu-diverging-texts-font"><font color="##4876FF">【11】发散型文本图（Diverging Texts）</font></a></li><li><a href="#font-color-4876ff-12-fa-san-xing-san-dian-tu-diverging-dot-plot-font"><font color="##4876FF">【12】发散型散点图（Diverging Dot Plot）</font></a></li><li><a href="#font-color-4876ff-13-dai-biao-ji-de-fa-san-xing-bang-bang-tang-tu-diverging-lollipop-chart-with-markers-font"><font color="##4876FF">【13】带标记的发散型棒棒糖图（Diverging Lollipop Chart with Markers）</font></a></li><li><a href="#font-color-4876ff-14-mian-ji-tu-area-chart-font"><font color="##4876FF">【14】面积图（Area Chart）</font></a></li></ul></li><li><a href="#font-color-ff0000-5x00-pai-xu-ranking-font"><font color="#FF0000">【5x00】排序（Ranking）</font></a><ul><li><a href="#font-color-4876ff-15-you-xu-tiao-xing-tu-ordered-bar-chart-font"><font color="##4876FF">【15】有序条形图（Ordered Bar Chart）</font></a></li><li><a href="#font-color-4876ff-16-bang-bang-tang-tu-lollipop-chart-font"><font color="##4876FF">【16】棒棒糖图（Lollipop Chart）</font></a></li><li><a href="#font-color-4876ff-17-dian-tu-dot-plot-font"><font color="##4876FF">【17】点图（Dot Plot）</font></a></li><li><a href="#font-color-4876ff-18-po-du-tu-slope-chart-font"><font color="##4876FF">【18】坡度图（Slope Chart）</font></a></li><li><a href="#font-color-4876ff-19-ya-ling-tu-dumbbell-plot-font"><font color="##4876FF">【19】哑铃图（Dumbbell Plot）</font></a></li></ul></li><li><a href="#font-color-ff0000-6x00-fen-bu-distribution-font"><font color="#FF0000">【6x00】分布（Distribution）</font></a><ul><li><a href="#font-color-4876ff-20-lian-xu-bian-liang-de-zhi-fang-tu-histogram-for-continuous-variable-font"><font color="##4876FF">【20】连续变量的直方图（Histogram for Continuous Variable）</font></a></li><li><a href="#font-color-4876ff-21-fen-lei-bian-liang-de-zhi-fang-tu-histogram-for-categorical-variable-font"><font 
color="##4876FF">【21】分类变量的直方图（Histogram for Categorical Variable）</font></a></li><li><a href="#font-color-4876ff-22-mi-du-tu-density-plot-font"><font color="##4876FF">【22】密度图（Density Plot）</font></a></li><li><a href="#font-color-4876ff-23-zhi-fang-tu-mi-du-qu-xian-density-curves-with-histogram-font"><font color="##4876FF">【23】直方图密度曲线（Density Curves with Histogram）</font></a></li><li><a href="#font-color-4876ff-24-shan-feng-die-luan-tu-huan-le-tu-joy-plot-font"><font color="##4876FF">【24】山峰叠峦图 / 欢乐图（Joy Plot）</font></a></li><li><a href="#font-color-4876ff-25-fen-bu-shi-dian-tu-distributed-dot-plot-font"><font color="##4876FF">【25】分布式点图（Distributed Dot Plot）</font></a></li><li><a href="#font-color-4876ff-26-xiang-xing-tu-box-plot-font"><font color="##4876FF">【26】箱形图（Box Plot）</font></a></li><li><a href="#font-color-4876ff-27-dian-xiang-xing-tu-dot-box-plot-font"><font color="##4876FF">【27】点 + 箱形图（Dot + Box Plot）</font></a></li><li><a href="#font-color-4876ff-28-xiao-ti-qin-tu-violin-plot-font"><font color="##4876FF">【28】小提琴图（Violin Plot）</font></a></li><li><a href="#font-color-4876ff-29-ren-kou-jin-zi-ta-tu-population-pyramid-font"><font color="##4876FF">【29】人口金字塔图（Population Pyramid）</font></a></li><li><a href="#font-color-4876ff-30-fen-lei-tu-categorical-plots-font"><font color="##4876FF">【30】分类图（Categorical Plots）</font></a></li></ul></li><li><a href="#font-color-ff0000-7x00-zu-cheng-composition-font"><font color="#FF0000">【7x00】组成（Composition）</font></a><ul><li><a href="#font-color-4876ff-31-hua-fu-bing-tu-waffle-chart-font"><font color="##4876FF">【31】华夫饼图（Waffle Chart）</font></a></li><li><a href="#font-color-4876ff-32-bing-tu-pie-chart-font"><font color="##4876FF">【32】饼图（Pie Chart）</font></a></li><li><a href="#font-color-4876ff-33-ju-zhen-shu-xing-tu-treemap-font"><font color="##4876FF">【33】矩阵树形图（Treemap）</font></a></li><li><a href="#font-color-4876ff-34-tiao-xing-tu-bar-chart-font"><font color="##4876FF">【34】条形图（Bar Chart）</font></a></li></ul></li><li><a 
href="#font-color-ff0000-8x00-bian-hua-change-font"><font color="#FF0000">【8x00】变化（Change）</font></a><ul><li><a href="#font-color-4876ff-35-shi-jian-xu-lie-tu-time-series-plot-font"><font color="##4876FF">【35】时间序列图（Time Series Plot）</font></a></li><li><a href="#font-color-4876ff-36-dai-bo-feng-he-bo-gu-zhu-shi-de-shi-jian-xu-lie-tu-time-series-with-peaks-and-troughs-annotated-font"><font color="##4876FF">【36】带波峰和波谷注释的时间序列图（Time Series with Peaks and Troughs Annotated）</font></a></li><li><a href="#font-color-4876ff-37-zi-xiang-guan-acf-he-bu-fen-zi-xiang-guan-pacf-tu-autocorrelation-acf-and-partial-autocorrelation-pacf-plot-font"><font color="##4876FF">【37】自相关 (ACF) 和部分自相关 (PACF) 图（Autocorrelation (ACF) and Partial Autocorrelation (PACF) Plot）</font></a></li><li><a href="#font-color-4876ff-38-jiao-cha-xiang-guan-tu-cross-correlation-plot-font"><font color="##4876FF">【38】交叉相关图（Cross Correlation plot）</font></a></li><li><a href="#font-color-4876ff-39-shi-jian-xu-lie-fen-jie-tu-time-series-decomposition-plot-font"><font color="##4876FF">【39】时间序列分解图（Time Series Decomposition Plot）</font></a></li><li><a href="#font-color-4876ff-40-duo-chong-shi-jian-xu-lie-multiple-time-series-font"><font color="##4876FF">【40】多重时间序列（Multiple Time Series）</font></a></li><li><a href="#font-color-4876ff-41-shi-yong-ci-yao-de-y-zhou-lai-hui-zhi-bu-tong-fan-wei-de-tu-xing-plotting-with-different-scales-using-secondary-y-axis-font"><font color="##4876FF">【41】使用次要的 Y 轴来绘制不同范围的图形（Plotting with different scales using secondary Y axis）</font></a></li><li><a href="#font-color-4876ff-42-dai-wu-chai-dai-de-shi-jian-xu-lie-time-series-with-error-bands-font"><font color="##4876FF">【42】带误差带的时间序列（Time Series with Error Bands）</font></a></li><li><a href="#font-color-4876ff-43-dui-ji-mian-ji-tu-stacked-area-chart-font"><font color="##4876FF">【43】堆积面积图（Stacked Area Chart）</font></a></li><li><a href="#font-color-4876ff-44-wei-dui-ji-mian-ji-tu-area-chart-unstacked-font"><font color="##4876FF">【44】未堆积面积图（Area 
Chart UnStacked）</font></a></li><li><a href="#font-color-4876ff-45-ri-li-re-li-tu-calendar-heat-map-font"><font color="##4876FF">【45】日历热力图（Calendar Heat Map）</font></a></li><li><a href="#font-color-4876ff-46-ji-jie-tu-seasonal-plot-font"><font color="##4876FF">【46】季节图（Seasonal Plot）</font></a></li></ul></li><li><a href="#font-color-ff0000-9x00-fen-zu-groups-font"><font color="#FF0000">【9x00】分组（ Groups）</font></a><ul><li><a href="#font-color-4876ff-47-shu-zhuang-tu-dendrogram-font"><font color="##4876FF">【47】树状图（Dendrogram）</font></a></li><li><a href="#font-color-4876ff-48-ju-lei-tu-cluster-plot-font"><font color="##4876FF">【48】聚类图（Cluster Plot）</font></a></li><li><a href="#font-color-4876ff-49-an-de-lu-si-qu-xian-andrews-curve-font"><font color="##4876FF">【49】安德鲁斯曲线（Andrews Curve）</font></a></li><li><a href="#font-color-4876ff-50-ping-xing-zuo-biao-tu-parallel-coordinates-font"><font color="##4876FF">【50】平行坐标图（Parallel Coordinates）</font></a></li></ul></li></ul><!-- tocstop --><hr><p>Matplotlib 系列文章：</p><ul><li><a href="https://www.itbob.cn/article/014/">Python 数据分析三剑客之 Matplotlib（一）：初识 Matplotlib 与其 matplotibrc 配置文件</a></li><li><a href="https://www.itbob.cn/article/015/">Python 数据分析三剑客之 Matplotlib（二）：文本描述 / 中文支持 / 画布 / 网格等基本图像属性</a></li><li><a href="https://www.itbob.cn/article/016/">Python 数据分析三剑客之 Matplotlib（三）：图例 / LaTeX / 刻度 / 子图 / 补丁等基本图像属性</a></li><li><a href="https://www.itbob.cn/article/017/">Python 数据分析三剑客之 Matplotlib（四）：线性图的绘制</a></li><li><a href="https://www.itbob.cn/article/018/">Python 数据分析三剑客之 Matplotlib（五）：散点图的绘制</a></li><li><a href="https://www.itbob.cn/article/019/">Python 数据分析三剑客之 Matplotlib（六）：直方图 / 柱状图 / 条形图的绘制</a></li><li><a href="https://www.itbob.cn/article/020/">Python 数据分析三剑客之 Matplotlib（七）：饼状图的绘制</a></li><li><a href="https://www.itbob.cn/article/021/">Python 数据分析三剑客之 Matplotlib（八）：等高线 / 等值线图的绘制</a></li><li><a href="https://www.itbob.cn/article/022/">Python 数据分析三剑客之 Matplotlib（九）：极区图 / 极坐标图 / 雷达图的绘制</a></li><li><a 
href="https://www.itbob.cn/article/023/">Python 数据分析三剑客之 Matplotlib（十）：3D 图的绘制</a></li><li><a href="https://www.itbob.cn/article/024/">Python 数据分析三剑客之 Matplotlib（十一）：最热门最常用的 50 个图表</a>【译文】</li></ul><hr><p>专栏：</p><ul><li>NumPy 专栏：<a href="https://itrhx.blog.csdn.net/category_9780393.html">https://itrhx.blog.csdn.net/category_9780393.html</a></li><li>Pandas 专栏：<a href="https://itrhx.blog.csdn.net/category_9780397.html">https://itrhx.blog.csdn.net/category_9780397.html</a></li><li>Matplotlib 专栏：<a href="https://itrhx.blog.csdn.net/category_9780418.html">https://itrhx.blog.csdn.net/category_9780418.html</a></li></ul><br>推荐学习资料与网站：<br><br><ul><li>NumPy 官方中文网：<a href="https://www.numpy.org.cn/">https://www.numpy.org.cn/</a></li><li>Pandas 官方中文网：<a href="https://www.pypandas.cn/">https://www.pypandas.cn/</a></li><li>Matplotlib 官方中文网：<a href="https://www.matplotlib.org.cn/">https://www.matplotlib.org.cn/</a></li><li>NumPy、Matplotlib、Pandas 速查表：<a href="https://github.com/TRHX/Python-quick-reference-table">https://github.com/TRHX/Python-quick-reference-table</a></li></ul><hr><blockquote><p>翻译丨<a href="https://itrhx.blog.csdn.net/">TRHX</a><br>作者丨<a href="https://www.machinelearningplus.com/author/selva86/">Selva Prabhakaran</a><br>原文丨<a href="https://www.machinelearningplus.com/plots/top-50-matplotlib-visualizations-the-master-plots-python/">《Top 50 matplotlib Visualizations – The Master Plots (with full python code)》</a></p></blockquote><hr><blockquote><p>★ 本文中的示例原作者使用的编辑器为 Jupyter Notebook；<br>★ 译者使用 PyCharm 测试原文中有部分代码不太准确，部分已进行修改，对应有注释说明；<br>★ 运行本文代码，需要安装 Matplotlib 和 Seaborn 等可视化库，其他的一些辅助可视化库已在代码部分作标注；<br>★ 示例中用到的数据均储存在作者的 GitHub：<a href="https://github.com/selva86/datasets">https://github.com/selva86/datasets</a>，因此运行程序可能需要FQ；<br>★ 译者英文水平有限，若遇到翻译模糊的词建议参考原文来理解。<br>★ 本文50个示例代码已打包为 .py 文件，可直接下载：<a href="https://download.csdn.net/download/qq_36759224/12507219">https://download.csdn.net/download/qq_36759224/12507219</a></p></blockquote><hr><pre><code class="hljs yaml"><span 
class="hljs-string">This is anti-scraping filler text; readers can ignore it.</span>
<span class="hljs-string">This translation was first published on CSDN. Author: Selva Prabhakaran; translator: TRHX.</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106558131</span>
<span class="hljs-string">Original link: https://www.machinelearningplus.com/plots/top-50-matplotlib-visualizations-the-master-plots-python/</span></code></pre><hr><h2><span id="1x00-jie-shao-introduction"><font color="#FF0000">【1x00】介绍（Introduction）</font></span></h2><p>This is a collection of the 50 most commonly used and most valuable Matplotlib charts for data analysis and visualization. It shows how to use Python's Matplotlib and Seaborn libraries sensibly to visualize data in different situations.</p><p>The charts are grouped by 7 different visualization objectives. For example, to visualize the relationship between two variables, see the charts in the "Correlation" section; to show how a value changes over time, see the "Change" section, and so on.</p><p>Key characteristics of an effective chart:</p><ul><li>It conveys the right and necessary information without distorting facts;</li><li>It is simple in design, so it takes little effort to understand;</li><li>Its aesthetics support the information rather than overshadow it;</li><li>It is not overloaded with information.</li></ul><h2><span id="2x00-zhun-bei-gong-zuo-setup"><font color="#FF0000">【2x00】准备工作（Setup）</font></span></h2><p>Apply the basic settings below before running the code; individual charts may, of course, redefine some display elements.</p><pre><code class="hljs python"><span class="hljs-comment"># !pip install brewer2mpl</span>
<span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-keyword">import</span> matplotlib <span class="hljs-keyword">as</span> mpl
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
<span class="hljs-keyword">import</span> seaborn <span class="hljs-keyword">as</span> sns
<span class="hljs-keyword">import</span> warnings; warnings.filterwarnings(action=<span class="hljs-string">&#x27;once&#x27;</span>)

large = <span class="hljs-number">22</span>; med = <span class="hljs-number">16</span>; small = <span class="hljs-number">12</span>
params = &#123;<span class="hljs-string">&#x27;axes.titlesize&#x27;</span>: large,
          <span 
class="hljs-string">&#x27;legend.fontsize&#x27;</span>: med,
          <span class="hljs-string">&#x27;figure.figsize&#x27;</span>: (<span class="hljs-number">16</span>, <span class="hljs-number">10</span>),
          <span class="hljs-string">&#x27;axes.labelsize&#x27;</span>: med,
          <span class="hljs-string">&#x27;axes.titlesize&#x27;</span>: med,
          <span class="hljs-string">&#x27;xtick.labelsize&#x27;</span>: med,
          <span class="hljs-string">&#x27;ytick.labelsize&#x27;</span>: med,
          <span class="hljs-string">&#x27;figure.titlesize&#x27;</span>: large&#125;
plt.rcParams.update(params)
<span class="hljs-comment"># On Matplotlib &gt;= 3.6 this style is named &#x27;seaborn-v0_8-whitegrid&#x27;</span>
plt.style.use(<span class="hljs-string">&#x27;seaborn-whitegrid&#x27;</span>)
sns.set_style(<span class="hljs-string">&quot;white&quot;</span>)
<span class="hljs-comment"># Jupyter magic; remove this line when running as a plain .py script</span>
%matplotlib inline

<span class="hljs-comment"># Version</span>
<span class="hljs-built_in">print</span>(mpl.__version__)  <span class="hljs-comment">#&gt; 3.0.0</span>
<span class="hljs-built_in">print</span>(sns.__version__)  <span class="hljs-comment">#&gt; 0.9.0</span></code></pre><h2><span id="3x00-guan-lian-correlation"><font color="#FF0000">【3x00】关联（Correlation） </font></span></h2><p>Correlation plots are used to visualize the relationship between two or more variables, that is, how one variable changes with respect to another.</p><h3><span id="01-san-dian-tu-scatter-plot"><font color="#4876FF">【01】散点图（Scatter plot）</font></span></h3><p>A scatter plot is the classic, fundamental plot for studying the relationship between two variables. If the data contains multiple groups, you may want to show each group in a different color. In Matplotlib, you can do this conveniently with <code>plt.scatter()</code>.</p><pre><code class="hljs python"><span class="hljs-comment"># Import dataset</span>
midwest = pd.read_csv(<span class="hljs-string">&quot;https://raw.githubusercontent.com/selva86/datasets/master/midwest_filter.csv&quot;</span>)

<span class="hljs-comment"># Prepare Data</span>
<span class="hljs-comment"># Create as many colors as there are unique midwest[&#x27;category&#x27;]</span>
categories = np.unique(midwest[<span class="hljs-string">&#x27;category&#x27;</span>])
colors = [plt.cm.tab10(i/<span class="hljs-built_in">float</span>(<span 
class="hljs-built_in">len</span>(categories)-<span class="hljs-number">1</span>)) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(categories))]

<span class="hljs-comment"># Draw Plot for Each Category</span>
plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi= <span class="hljs-number">80</span>, facecolor=<span class="hljs-string">&#x27;w&#x27;</span>, edgecolor=<span class="hljs-string">&#x27;k&#x27;</span>)

<span class="hljs-keyword">for</span> i, category <span class="hljs-keyword">in</span> <span class="hljs-built_in">enumerate</span>(categories):
    plt.scatter(<span class="hljs-string">&#x27;area&#x27;</span>, <span class="hljs-string">&#x27;poptotal&#x27;</span>,
                data=midwest.loc[midwest.category==category, :],
                s=<span class="hljs-number">20</span>, color=colors[i], label=<span class="hljs-built_in">str</span>(category))
<span class="hljs-comment"># Changed from the original c=colors[i]: color= passes the RGBA tuple as one color,</span>
<span class="hljs-comment"># avoiding ambiguity with value mapping (cmap= expects a colormap, not a color)</span>

<span class="hljs-comment"># Decorations</span>
plt.gca().<span class="hljs-built_in">set</span>(xlim=(<span class="hljs-number">0.0</span>, <span class="hljs-number">0.1</span>), ylim=(<span class="hljs-number">0</span>, <span class="hljs-number">90000</span>),
              xlabel=<span class="hljs-string">&#x27;Area&#x27;</span>, ylabel=<span class="hljs-string">&#x27;Population&#x27;</span>)

plt.xticks(fontsize=<span class="hljs-number">12</span>); plt.yticks(fontsize=<span class="hljs-number">12</span>)
plt.title(<span class="hljs-string">&quot;Scatterplot of Midwest Area vs Population&quot;</span>, fontsize=<span class="hljs-number">22</span>)
plt.legend(fontsize=<span class="hljs-number">12</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/01.png" alt="01"></p><h3><span id="02-dai-bian-jie-de-qi-pao-tu-bubble-plot-with-encircling"><font 
color="#4876FF">【02】带边界的气泡图（Bubble plot with Encircling）</font></span></h3><p>Sometimes you want to show a group of points inside a boundary to emphasize their importance. In this example, you take the records to be encircled from the data and pass them to the <code>encircle()</code> function defined in the code below.</p><pre><code class="hljs python"><span class="hljs-keyword">from</span> matplotlib <span class="hljs-keyword">import</span> patches
<span class="hljs-keyword">from</span> scipy.spatial <span class="hljs-keyword">import</span> ConvexHull
<span class="hljs-keyword">import</span> warnings; warnings.simplefilter(<span class="hljs-string">&#x27;ignore&#x27;</span>)
sns.set_style(<span class="hljs-string">&quot;white&quot;</span>)

<span class="hljs-comment"># Step 1: Prepare Data</span>
midwest = pd.read_csv(<span class="hljs-string">&quot;https://raw.githubusercontent.com/selva86/datasets/master/midwest_filter.csv&quot;</span>)

<span class="hljs-comment"># As many colors as there are unique midwest[&#x27;category&#x27;]</span>
categories = np.unique(midwest[<span class="hljs-string">&#x27;category&#x27;</span>])
colors = [plt.cm.tab10(i/<span class="hljs-built_in">float</span>(<span class="hljs-built_in">len</span>(categories)-<span class="hljs-number">1</span>)) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(categories))]

<span class="hljs-comment"># Step 2: Draw Scatterplot with unique color for each category</span>
fig = plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>, facecolor=<span class="hljs-string">&#x27;w&#x27;</span>, edgecolor=<span class="hljs-string">&#x27;k&#x27;</span>)

<span class="hljs-keyword">for</span> i, category <span class="hljs-keyword">in</span> <span class="hljs-built_in">enumerate</span>(categories):
    plt.scatter(<span class="hljs-string">&#x27;area&#x27;</span>, <span class="hljs-string">&#x27;poptotal&#x27;</span>, data=midwest.loc[midwest.category == category, :], s=<span 
class="hljs-string">&#x27;dot_size&#x27;</span>, color=colors[i], label=<span class="hljs-built_in">str</span>(category), edgecolors=<span class="hljs-string">&#x27;black&#x27;</span>, linewidths=<span class="hljs-number">.5</span>)
<span class="hljs-comment"># Changed from the original c=colors[i]: color= passes the RGBA tuple as one color,</span>
<span class="hljs-comment"># avoiding ambiguity with value mapping (cmap= expects a colormap, not a color)</span>

<span class="hljs-comment"># Step 3: Encircling</span>
<span class="hljs-comment"># https://stackoverflow.com/questions/44575681/how-do-i-encircle-different-data-sets-in-scatter-plot</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">encircle</span>(<span class="hljs-params">x,y, ax=<span class="hljs-literal">None</span>, **kw</span>):</span>
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> ax: ax = plt.gca()
    p = np.c_[x, y]
    hull = ConvexHull(p)
    poly = plt.Polygon(p[hull.vertices, :], **kw)
    ax.add_patch(poly)

<span class="hljs-comment"># Select data to be encircled</span>
midwest_encircle_data = midwest.loc[midwest.state==<span class="hljs-string">&#x27;IN&#x27;</span>, :]

<span class="hljs-comment"># Draw polygon surrounding vertices</span>
encircle(midwest_encircle_data.area, midwest_encircle_data.poptotal, ec=<span class="hljs-string">&quot;k&quot;</span>, fc=<span class="hljs-string">&quot;gold&quot;</span>, alpha=<span class="hljs-number">0.1</span>)
encircle(midwest_encircle_data.area, midwest_encircle_data.poptotal, ec=<span class="hljs-string">&quot;firebrick&quot;</span>, fc=<span class="hljs-string">&quot;none&quot;</span>, linewidth=<span class="hljs-number">1.5</span>)

<span class="hljs-comment"># Step 4: Decorations</span>
plt.gca().<span class="hljs-built_in">set</span>(xlim=(<span class="hljs-number">0.0</span>, <span class="hljs-number">0.1</span>), ylim=(<span class="hljs-number">0</span>, <span class="hljs-number">90000</span>),
              xlabel=<span class="hljs-string">&#x27;Area&#x27;</span>, ylabel=<span 
class="hljs-string">&#x27;Population&#x27;</span>)

plt.xticks(fontsize=<span class="hljs-number">12</span>); plt.yticks(fontsize=<span class="hljs-number">12</span>)
plt.title(<span class="hljs-string">&quot;Bubble Plot with Encircling&quot;</span>, fontsize=<span class="hljs-number">22</span>)
plt.legend(fontsize=<span class="hljs-number">12</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/02.png" alt="02"></p><h3><span id="03-dai-xian-xing-hui-gui-zui-jia-ni-he-xian-de-san-dian-tu-scatter-plot-with-linear-regression-line-of-best-fit"><font color="#4876FF">【03】带线性回归最佳拟合线的散点图（Scatter plot with linear regression line of best fit）</font></span></h3><p>A line of best fit is the usual way to understand how two variables change with respect to each other. The plot below shows how the line of best fit differs across groups in the data. To disable the grouping and draw a single line of best fit for the whole dataset, remove the <code>hue="cyl"</code> argument from the <code>sns.lmplot()</code> call.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv&quot;</span>)
df_select = df.loc[df.cyl.isin([<span class="hljs-number">4</span>, <span class="hljs-number">8</span>]), :]

<span class="hljs-comment"># Plot (robust=True requires the statsmodels package)</span>
sns.set_style(<span class="hljs-string">&quot;white&quot;</span>)
gridobj = sns.lmplot(x=<span class="hljs-string">&quot;displ&quot;</span>, y=<span class="hljs-string">&quot;hwy&quot;</span>, hue=<span class="hljs-string">&quot;cyl&quot;</span>, data=df_select,
                     height=<span class="hljs-number">7</span>, aspect=<span class="hljs-number">1.6</span>, robust=<span class="hljs-literal">True</span>, palette=<span class="hljs-string">&#x27;tab10&#x27;</span>,
                     scatter_kws=<span class="hljs-built_in">dict</span>(s=<span class="hljs-number">60</span>, linewidths=<span class="hljs-number">.7</span>, edgecolors=<span class="hljs-string">&#x27;black&#x27;</span>))

<span class="hljs-comment"># Decorations</span>
gridobj.<span 
class="hljs-built_in">set</span>(xlim=(<span class="hljs-number">0.5</span>, <span class="hljs-number">7.5</span>), ylim=(<span class="hljs-number">0</span>, <span class="hljs-number">50</span>))
plt.title(<span class="hljs-string">&quot;Scatterplot with line of best fit grouped by number of cylinders&quot;</span>, fontsize=<span class="hljs-number">20</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/03.png" alt="03"></p><p>To draw each regression line in its own column, set the <code>col=groupingcolumn</code> argument in <code>sns.lmplot()</code>, as follows:</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv&quot;</span>)
df_select = df.loc[df.cyl.isin([<span class="hljs-number">4</span>, <span class="hljs-number">8</span>]), :]
<span class="hljs-comment"># Each line in its own column</span>
sns.set_style(<span class="hljs-string">&quot;white&quot;</span>)
gridobj = sns.lmplot(x=<span class="hljs-string">&quot;displ&quot;</span>, y=<span class="hljs-string">&quot;hwy&quot;</span>,
                     data=df_select,
                     height=<span class="hljs-number">7</span>,
                     robust=<span class="hljs-literal">True</span>,
                     palette=<span class="hljs-string">&#x27;Set1&#x27;</span>,
                     col=<span class="hljs-string">&quot;cyl&quot;</span>,
                     scatter_kws=<span class="hljs-built_in">dict</span>(s=<span class="hljs-number">60</span>, linewidths=<span class="hljs-number">.7</span>, edgecolors=<span class="hljs-string">&#x27;black&#x27;</span>))
<span class="hljs-comment"># Decorations</span>
gridobj.<span class="hljs-built_in">set</span>(xlim=(<span class="hljs-number">0.5</span>, <span class="hljs-number">7.5</span>), ylim=(<span class="hljs-number">0</span>, <span 
class="hljs-number">50</span>))
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/04.png" alt="04"></p><h3><span id="04-dou-dong-tu-jittering-with-stripplot"><font color="##4876FF">【04】Jittering with stripplot</font></span></h3><p>Often several data points share exactly the same X and Y values, so they are drawn on top of one another and hidden. To avoid this, jitter the points slightly so that each one is visible. The <code>stripplot()</code> function in <code>seaborn</code> makes this convenient.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv&quot;</span>)
<span class="hljs-comment"># Draw Stripplot (x and y are passed by keyword, as recent seaborn releases require)</span>
fig, ax = plt.subplots(figsize=(<span class="hljs-number">16</span>,<span class="hljs-number">10</span>), dpi= <span class="hljs-number">80</span>)
sns.stripplot(x=df.cty, y=df.hwy, jitter=<span class="hljs-number">0.25</span>, size=<span class="hljs-number">8</span>, ax=ax, linewidth=<span class="hljs-number">.5</span>)
<span class="hljs-comment"># Decorations</span>
plt.title(<span class="hljs-string">&#x27;Use jittered plots to avoid overlapping of points&#x27;</span>, fontsize=<span class="hljs-number">22</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/05.png" alt="05"></p><h3><span id="05-ji-shu-tu-counts-plot"><font color="##4876FF">【05】Counts Plot</font></span></h3><p>Another way to avoid overlapping points is to scale the size of each point by the number of observations at that position. The larger the point, the more observations are concentrated there.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv&quot;</span>)
df_counts = df.groupby([<span class="hljs-string">&#x27;hwy&#x27;</span>, <span class="hljs-string">&#x27;cty&#x27;</span>]).size().reset_index(name=<span class="hljs-string">&#x27;counts&#x27;</span>)
<span class="hljs-comment"># Draw Stripplot</span>
fig, ax = plt.subplots(figsize=(<span 
class="hljs-number">16</span>,<span class="hljs-number">10</span>), dpi= <span class="hljs-number">80</span>)
<span class="hljs-comment"># Original code</span>
<span class="hljs-comment"># sns.stripplot(df_counts.cty, df_counts.hwy, size=df_counts.counts*2, ax=ax)</span>
<span class="hljs-comment"># Corrected code</span>
sns.stripplot(x=df_counts.cty, y=df_counts.hwy, sizes=df_counts.counts*<span class="hljs-number">2</span>, ax=ax)
<span class="hljs-comment"># Decorations</span>
plt.title(<span class="hljs-string">&#x27;Counts Plot - Size of circle is bigger as more points overlap&#x27;</span>, fontsize=<span class="hljs-number">22</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/06.png" alt="06"></p><h3><span id="06-bian-yuan-zhi-fang-tu-marginal-histogram"><font color="##4876FF">【06】Marginal Histogram</font></span></h3><p>A marginal histogram places histograms of the X and Y variables along the axes of a scatter plot. It shows the relationship between X and Y together with the univariate distribution of each. This plot is often used in exploratory data analysis (EDA).</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv&quot;</span>)
<span class="hljs-comment"># Create Fig and gridspec</span>
fig = plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi= <span class="hljs-number">80</span>)
grid = plt.GridSpec(<span class="hljs-number">4</span>, <span class="hljs-number">4</span>, hspace=<span class="hljs-number">0.5</span>, wspace=<span class="hljs-number">0.2</span>)
<span class="hljs-comment"># Define the axes</span>
ax_main = fig.add_subplot(grid[:-<span class="hljs-number">1</span>, :-<span class="hljs-number">1</span>])
ax_right = fig.add_subplot(grid[:-<span class="hljs-number">1</span>, -<span class="hljs-number">1</span>], xticklabels=[], yticklabels=[])
ax_bottom = fig.add_subplot(grid[-<span class="hljs-number">1</span>, <span class="hljs-number">0</span>:-<span class="hljs-number">1</span>], 
xticklabels=[], yticklabels=[])
<span class="hljs-comment"># Scatterplot on main ax</span>
ax_main.scatter(<span class="hljs-string">&#x27;displ&#x27;</span>, <span class="hljs-string">&#x27;hwy&#x27;</span>, s=df.cty*<span class="hljs-number">4</span>, c=df.manufacturer.astype(<span class="hljs-string">&#x27;category&#x27;</span>).cat.codes, alpha=<span class="hljs-number">.9</span>, data=df, cmap=<span class="hljs-string">&quot;tab10&quot;</span>, edgecolors=<span class="hljs-string">&#x27;gray&#x27;</span>, linewidths=<span class="hljs-number">.5</span>)
<span class="hljs-comment"># histogram on the bottom</span>
ax_bottom.hist(df.displ, <span class="hljs-number">40</span>, histtype=<span class="hljs-string">&#x27;stepfilled&#x27;</span>, orientation=<span class="hljs-string">&#x27;vertical&#x27;</span>, color=<span class="hljs-string">&#x27;deeppink&#x27;</span>)
ax_bottom.invert_yaxis()
<span class="hljs-comment"># histogram on the right</span>
ax_right.hist(df.hwy, <span class="hljs-number">40</span>, histtype=<span class="hljs-string">&#x27;stepfilled&#x27;</span>, orientation=<span class="hljs-string">&#x27;horizontal&#x27;</span>, color=<span class="hljs-string">&#x27;deeppink&#x27;</span>)
<span class="hljs-comment"># Decorations</span>
ax_main.<span class="hljs-built_in">set</span>(title=<span class="hljs-string">&#x27;Scatterplot with Histograms \n displ vs hwy&#x27;</span>, xlabel=<span class="hljs-string">&#x27;displ&#x27;</span>, ylabel=<span class="hljs-string">&#x27;hwy&#x27;</span>)
ax_main.title.set_fontsize(<span class="hljs-number">20</span>)
<span class="hljs-keyword">for</span> item <span class="hljs-keyword">in</span> ([ax_main.xaxis.label, ax_main.yaxis.label] + ax_main.get_xticklabels() + ax_main.get_yticklabels()):
    item.set_fontsize(<span class="hljs-number">14</span>)
xlabels = ax_main.get_xticks().tolist()
ax_main.set_xticklabels(xlabels)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/07.png" 
alt="07"></p><h3><span id="07-bian-yuan-xiang-xing-tu-marginal-boxplot"><font color="##4876FF">【07】Marginal Boxplot</font></span></h3><p>A marginal boxplot serves a similar purpose to a marginal histogram. However, the boxplots help pinpoint the median and the 25th and 75th percentiles of X and Y.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv&quot;</span>)
<span class="hljs-comment"># Create Fig and gridspec</span>
fig = plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi= <span class="hljs-number">80</span>)
grid = plt.GridSpec(<span class="hljs-number">4</span>, <span class="hljs-number">4</span>, hspace=<span class="hljs-number">0.5</span>, wspace=<span class="hljs-number">0.2</span>)
<span class="hljs-comment"># Define the axes</span>
ax_main = fig.add_subplot(grid[:-<span class="hljs-number">1</span>, :-<span class="hljs-number">1</span>])
ax_right = fig.add_subplot(grid[:-<span class="hljs-number">1</span>, -<span class="hljs-number">1</span>], xticklabels=[], yticklabels=[])
ax_bottom = fig.add_subplot(grid[-<span class="hljs-number">1</span>, <span class="hljs-number">0</span>:-<span class="hljs-number">1</span>], xticklabels=[], yticklabels=[])
<span class="hljs-comment"># Scatterplot on main ax</span>
ax_main.scatter(<span class="hljs-string">&#x27;displ&#x27;</span>, <span class="hljs-string">&#x27;hwy&#x27;</span>, s=df.cty*<span class="hljs-number">5</span>, c=df.manufacturer.astype(<span class="hljs-string">&#x27;category&#x27;</span>).cat.codes, alpha=<span class="hljs-number">.9</span>, data=df, cmap=<span class="hljs-string">&quot;Set1&quot;</span>, edgecolors=<span class="hljs-string">&#x27;black&#x27;</span>, linewidths=<span class="hljs-number">.5</span>)
<span class="hljs-comment"># Add a graph in each part</span>
sns.boxplot(df.hwy, ax=ax_right, orient=<span class="hljs-string">&quot;v&quot;</span>)
sns.boxplot(df.displ, ax=ax_bottom, orient=<span 
class="hljs-string">&quot;h&quot;</span>)
<span class="hljs-comment"># Decorations ------------------</span>
<span class="hljs-comment"># Remove x axis name for the boxplot</span>
ax_bottom.<span class="hljs-built_in">set</span>(xlabel=<span class="hljs-string">&#x27;&#x27;</span>)
ax_right.<span class="hljs-built_in">set</span>(ylabel=<span class="hljs-string">&#x27;&#x27;</span>)
<span class="hljs-comment"># Main Title, Xlabel and YLabel</span>
ax_main.<span class="hljs-built_in">set</span>(title=<span class="hljs-string">&#x27;Scatterplot with Histograms \n displ vs hwy&#x27;</span>, xlabel=<span class="hljs-string">&#x27;displ&#x27;</span>, ylabel=<span class="hljs-string">&#x27;hwy&#x27;</span>)
<span class="hljs-comment"># Set font size of different components</span>
ax_main.title.set_fontsize(<span class="hljs-number">20</span>)
<span class="hljs-keyword">for</span> item <span class="hljs-keyword">in</span> ([ax_main.xaxis.label, ax_main.yaxis.label] + ax_main.get_xticklabels() + ax_main.get_yticklabels()):
    item.set_fontsize(<span class="hljs-number">14</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/08.png" alt="08"></p><h3><span id="08-xiang-guan-tu-correllogram"><font color="##4876FF">【08】Correlogram</font></span></h3><p>A correlogram visualizes the correlation between every possible pair of numeric variables in a given DataFrame (or 2D array).</p><pre><code class="hljs python"><span class="hljs-comment"># Import Dataset</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mtcars.csv&quot;</span>)
<span class="hljs-comment"># Plot (numeric_only=True skips the text columns, which pandas 2.x no longer drops silently)</span>
plt.figure(figsize=(<span class="hljs-number">12</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)
sns.heatmap(df.corr(numeric_only=<span class="hljs-literal">True</span>), xticklabels=df.corr(numeric_only=<span class="hljs-literal">True</span>).columns, yticklabels=df.corr(numeric_only=<span class="hljs-literal">True</span>).columns, cmap=<span class="hljs-string">&#x27;RdYlGn&#x27;</span>, center=<span class="hljs-number">0</span>, annot=<span class="hljs-literal">True</span>)
<span class="hljs-comment"># 
Decorations</span>
plt.title(<span class="hljs-string">&#x27;Correlogram of mtcars&#x27;</span>, fontsize=<span class="hljs-number">22</span>)
plt.xticks(fontsize=<span class="hljs-number">12</span>)
plt.yticks(fontsize=<span class="hljs-number">12</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/09.png" alt="09"></p><h3><span id="09-cheng-dui-tu-pairwise-plot"><font color="##4876FF">【09】Pairwise Plot</font></span></h3><p>The pairwise plot is a favorite in exploratory analysis for understanding the relationship between every possible pair of numeric variables. It is a must-have tool for bivariate analysis.</p><pre><code class="hljs python"><span class="hljs-comment"># Load Dataset</span>
df = sns.load_dataset(<span class="hljs-string">&#x27;iris&#x27;</span>)
<span class="hljs-comment"># Plot</span>
plt.figure(figsize=(<span class="hljs-number">10</span>, <span class="hljs-number">8</span>), dpi=<span class="hljs-number">80</span>)
sns.pairplot(df, kind=<span class="hljs-string">&quot;scatter&quot;</span>, hue=<span class="hljs-string">&quot;species&quot;</span>, plot_kws=<span class="hljs-built_in">dict</span>(s=<span class="hljs-number">80</span>, edgecolor=<span class="hljs-string">&quot;white&quot;</span>, linewidth=<span class="hljs-number">2.5</span>))
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/10.png" alt="10"></p><pre><code class="hljs python"><span class="hljs-comment"># Load Dataset</span>
df = sns.load_dataset(<span class="hljs-string">&#x27;iris&#x27;</span>)
<span class="hljs-comment"># Plot</span>
plt.figure(figsize=(<span class="hljs-number">10</span>, <span class="hljs-number">8</span>), dpi=<span class="hljs-number">80</span>)
sns.pairplot(df, kind=<span class="hljs-string">&quot;reg&quot;</span>, hue=<span class="hljs-string">&quot;species&quot;</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/11.png" alt="11"></p><h2><span id="4x00-pian-chai-deviation"><font color="#FF0000">【4x00】Deviation</font></span></h2><h3><span 
id="10-fa-san-xing-tiao-xing-tu-diverging-bars"><font color="##4876FF">【10】Diverging Bars</font></span></h3><p>If you want to see how items vary on a single metric and visualize the order and magnitude of that variance, diverging bars are a great tool. They make it quick to tell apart the performance of groups in your data, are intuitive, and convey the point immediately.</p><pre><code class="hljs python"><span class="hljs-comment"># Prepare Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mtcars.csv&quot;</span>)
x = df.loc[:, [<span class="hljs-string">&#x27;mpg&#x27;</span>]]
df[<span class="hljs-string">&#x27;mpg_z&#x27;</span>] = (x - x.mean())/x.std()
df[<span class="hljs-string">&#x27;colors&#x27;</span>] = [<span class="hljs-string">&#x27;red&#x27;</span> <span class="hljs-keyword">if</span> x &lt; <span class="hljs-number">0</span> <span class="hljs-keyword">else</span> <span class="hljs-string">&#x27;green&#x27;</span> <span class="hljs-keyword">for</span> x <span class="hljs-keyword">in</span> df[<span class="hljs-string">&#x27;mpg_z&#x27;</span>]]
df.sort_values(<span class="hljs-string">&#x27;mpg_z&#x27;</span>, inplace=<span class="hljs-literal">True</span>)
df.reset_index(inplace=<span class="hljs-literal">True</span>)
<span class="hljs-comment"># Draw plot</span>
plt.figure(figsize=(<span class="hljs-number">14</span>,<span class="hljs-number">10</span>), dpi= <span class="hljs-number">80</span>)
plt.hlines(y=df.index, xmin=<span class="hljs-number">0</span>, xmax=df.mpg_z, color=df.colors, alpha=<span class="hljs-number">0.4</span>, linewidth=<span class="hljs-number">5</span>)
<span class="hljs-comment"># Decorations</span>
plt.gca().<span class="hljs-built_in">set</span>(ylabel=<span class="hljs-string">&#x27;$Model$&#x27;</span>, xlabel=<span class="hljs-string">&#x27;$Mileage$&#x27;</span>)
plt.yticks(df.index, df.cars, fontsize=<span class="hljs-number">12</span>)
plt.title(<span class="hljs-string">&#x27;Diverging Bars of Car Mileage&#x27;</span>, fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>:<span 
class="hljs-number">20</span>&#125;)
plt.grid(linestyle=<span class="hljs-string">&#x27;--&#x27;</span>, alpha=<span class="hljs-number">0.5</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/12.png" alt="12"></p><h3><span id="11-fa-san-xing-wen-ben-tu-diverging-texts"><font color="##4876FF">【11】Diverging Texts</font></span></h3><p>Diverging texts are similar to diverging bars, and are preferred when you want to show the value of each item within the chart in a neat, presentable way.</p><pre><code class="hljs python"><span class="hljs-comment"># Prepare Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mtcars.csv&quot;</span>)
x = df.loc[:, [<span class="hljs-string">&#x27;mpg&#x27;</span>]]
df[<span class="hljs-string">&#x27;mpg_z&#x27;</span>] = (x - x.mean())/x.std()
df[<span class="hljs-string">&#x27;colors&#x27;</span>] = [<span class="hljs-string">&#x27;red&#x27;</span> <span class="hljs-keyword">if</span> x &lt; <span class="hljs-number">0</span> <span class="hljs-keyword">else</span> <span class="hljs-string">&#x27;green&#x27;</span> <span class="hljs-keyword">for</span> x <span class="hljs-keyword">in</span> df[<span class="hljs-string">&#x27;mpg_z&#x27;</span>]]
df.sort_values(<span class="hljs-string">&#x27;mpg_z&#x27;</span>, inplace=<span class="hljs-literal">True</span>)
df.reset_index(inplace=<span class="hljs-literal">True</span>)
<span class="hljs-comment"># Draw plot</span>
plt.figure(figsize=(<span class="hljs-number">14</span>, <span class="hljs-number">14</span>), dpi=<span class="hljs-number">80</span>)
plt.hlines(y=df.index, xmin=<span class="hljs-number">0</span>, xmax=df.mpg_z)
<span class="hljs-keyword">for</span> x, y, tex <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(df.mpg_z, df.index, df.mpg_z):
    t = plt.text(x, y, <span class="hljs-built_in">round</span>(tex, <span class="hljs-number">2</span>), horizontalalignment=<span class="hljs-string">&#x27;right&#x27;</span> <span class="hljs-keyword">if</span> x 
&lt; <span class="hljs-number">0</span> <span class="hljs-keyword">else</span> <span class="hljs-string">&#x27;left&#x27;</span>,
                 verticalalignment=<span class="hljs-string">&#x27;center&#x27;</span>, fontdict=&#123;<span class="hljs-string">&#x27;color&#x27;</span>:<span class="hljs-string">&#x27;red&#x27;</span> <span class="hljs-keyword">if</span> x &lt; <span class="hljs-number">0</span> <span class="hljs-keyword">else</span> <span class="hljs-string">&#x27;green&#x27;</span>, <span class="hljs-string">&#x27;size&#x27;</span>:<span class="hljs-number">14</span>&#125;)
<span class="hljs-comment"># Decorations</span>
plt.yticks(df.index, df.cars, fontsize=<span class="hljs-number">12</span>)
plt.title(<span class="hljs-string">&#x27;Diverging Text Bars of Car Mileage&#x27;</span>, fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>:<span class="hljs-number">20</span>&#125;)
plt.grid(linestyle=<span class="hljs-string">&#x27;--&#x27;</span>, alpha=<span class="hljs-number">0.5</span>)
plt.xlim(-<span class="hljs-number">2.5</span>, <span class="hljs-number">2.5</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/13.png" alt="13"></p><h3><span id="12-fa-san-xing-san-dian-tu-diverging-dot-plot"><font color="##4876FF">【12】Diverging Dot Plot</font></span></h3><p>A diverging dot plot is also similar to diverging bars. However, compared with diverging bars, the absence of bars reduces the contrast and disparity between groups.</p><pre><code class="hljs python"><span class="hljs-comment"># Prepare Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mtcars.csv&quot;</span>)
x = df.loc[:, [<span class="hljs-string">&#x27;mpg&#x27;</span>]]
df[<span class="hljs-string">&#x27;mpg_z&#x27;</span>] = (x - x.mean())/x.std()
df[<span class="hljs-string">&#x27;colors&#x27;</span>] = [<span class="hljs-string">&#x27;red&#x27;</span> <span class="hljs-keyword">if</span> x &lt; <span class="hljs-number">0</span> <span class="hljs-keyword">else</span> <span 
class="hljs-string">&#x27;darkgreen&#x27;</span> <span class="hljs-keyword">for</span> x <span class="hljs-keyword">in</span> df[<span class="hljs-string">&#x27;mpg_z&#x27;</span>]]
df.sort_values(<span class="hljs-string">&#x27;mpg_z&#x27;</span>, inplace=<span class="hljs-literal">True</span>)
df.reset_index(inplace=<span class="hljs-literal">True</span>)
<span class="hljs-comment"># Draw plot</span>
plt.figure(figsize=(<span class="hljs-number">14</span>, <span class="hljs-number">16</span>), dpi=<span class="hljs-number">80</span>)
plt.scatter(df.mpg_z, df.index, s=<span class="hljs-number">450</span>, alpha=<span class="hljs-number">.6</span>, color=df.colors)
<span class="hljs-keyword">for</span> x, y, tex <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(df.mpg_z, df.index, df.mpg_z):
    t = plt.text(x, y, <span class="hljs-built_in">round</span>(tex, <span class="hljs-number">1</span>), horizontalalignment=<span class="hljs-string">&#x27;center&#x27;</span>,
                 verticalalignment=<span class="hljs-string">&#x27;center&#x27;</span>, fontdict=&#123;<span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;white&#x27;</span>&#125;)
<span class="hljs-comment"># Decorations</span>
<span class="hljs-comment"># Lighten borders</span>
plt.gca().spines[<span class="hljs-string">&quot;top&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
plt.gca().spines[<span class="hljs-string">&quot;bottom&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
plt.gca().spines[<span class="hljs-string">&quot;right&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
plt.gca().spines[<span class="hljs-string">&quot;left&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
plt.yticks(df.index, df.cars)
plt.title(<span class="hljs-string">&#x27;Diverging Dotplot of Car Mileage&#x27;</span>, fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>: <span 
class="hljs-number">20</span>&#125;)
plt.xlabel(<span class="hljs-string">&#x27;$Mileage$&#x27;</span>)
plt.grid(linestyle=<span class="hljs-string">&#x27;--&#x27;</span>, alpha=<span class="hljs-number">0.5</span>)
plt.xlim(-<span class="hljs-number">2.5</span>, <span class="hljs-number">2.5</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/14.png" alt="14"></p><h3><span id="13-dai-biao-ji-de-fa-san-xing-bang-bang-tang-tu-diverging-lollipop-chart-with-markers"><font color="##4876FF">【13】Diverging Lollipop Chart with Markers</font></span></h3><p>A lollipop chart with markers provides a flexible way to emphasize any significant data points you want to draw attention to and to annotate the reasoning within the chart.</p><pre><code class="hljs python"><span class="hljs-comment"># Prepare Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mtcars.csv&quot;</span>)
x = df.loc[:, [<span class="hljs-string">&#x27;mpg&#x27;</span>]]
df[<span class="hljs-string">&#x27;mpg_z&#x27;</span>] = (x - x.mean())/x.std()
df[<span class="hljs-string">&#x27;colors&#x27;</span>] = <span class="hljs-string">&#x27;black&#x27;</span>
<span class="hljs-comment"># color fiat differently</span>
df.loc[df.cars == <span class="hljs-string">&#x27;Fiat X1-9&#x27;</span>, <span class="hljs-string">&#x27;colors&#x27;</span>] = <span class="hljs-string">&#x27;darkorange&#x27;</span>
df.sort_values(<span class="hljs-string">&#x27;mpg_z&#x27;</span>, inplace=<span class="hljs-literal">True</span>)
df.reset_index(inplace=<span class="hljs-literal">True</span>)
<span class="hljs-comment"># Draw plot</span>
<span class="hljs-keyword">import</span> matplotlib.patches <span class="hljs-keyword">as</span> patches
plt.figure(figsize=(<span class="hljs-number">14</span>, <span class="hljs-number">16</span>), dpi=<span class="hljs-number">80</span>)
plt.hlines(y=df.index, xmin=<span class="hljs-number">0</span>, xmax=df.mpg_z, color=df.colors, alpha=<span class="hljs-number">0.4</span>, linewidth=<span 
class="hljs-number">1</span>)
plt.scatter(df.mpg_z, df.index, color=df.colors, s=[<span class="hljs-number">600</span> <span class="hljs-keyword">if</span> x == <span class="hljs-string">&#x27;Fiat X1-9&#x27;</span> <span class="hljs-keyword">else</span> <span class="hljs-number">300</span> <span class="hljs-keyword">for</span> x <span class="hljs-keyword">in</span> df.cars], alpha=<span class="hljs-number">0.6</span>)
plt.yticks(df.index, df.cars)
plt.xticks(fontsize=<span class="hljs-number">12</span>)
<span class="hljs-comment"># Annotate</span>
plt.annotate(<span class="hljs-string">&#x27;Mercedes Models&#x27;</span>, xy=(<span class="hljs-number">0.0</span>, <span class="hljs-number">11.0</span>), xytext=(<span class="hljs-number">1.0</span>, <span class="hljs-number">11</span>), xycoords=<span class="hljs-string">&#x27;data&#x27;</span>,
            fontsize=<span class="hljs-number">15</span>, ha=<span class="hljs-string">&#x27;center&#x27;</span>, va=<span class="hljs-string">&#x27;center&#x27;</span>,
            bbox=<span class="hljs-built_in">dict</span>(boxstyle=<span class="hljs-string">&#x27;square&#x27;</span>, fc=<span class="hljs-string">&#x27;firebrick&#x27;</span>),
            arrowprops=<span class="hljs-built_in">dict</span>(arrowstyle=<span class="hljs-string">&#x27;-[, widthB=2.0, lengthB=1.5&#x27;</span>, lw=<span class="hljs-number">2.0</span>, color=<span class="hljs-string">&#x27;steelblue&#x27;</span>), color=<span class="hljs-string">&#x27;white&#x27;</span>)
<span class="hljs-comment"># Add Patches</span>
p1 = patches.Rectangle((-<span class="hljs-number">2.0</span>, -<span class="hljs-number">1</span>), width=<span class="hljs-number">.3</span>, height=<span class="hljs-number">3</span>, alpha=<span class="hljs-number">.2</span>, facecolor=<span class="hljs-string">&#x27;red&#x27;</span>)
p2 = patches.Rectangle((<span class="hljs-number">1.5</span>, <span class="hljs-number">27</span>), width=<span class="hljs-number">.8</span>, height=<span 
class="hljs-number">5</span>, alpha=<span class="hljs-number">.2</span>, facecolor=<span class="hljs-string">&#x27;green&#x27;</span>)
plt.gca().add_patch(p1)
plt.gca().add_patch(p2)
<span class="hljs-comment"># Decorate</span>
plt.title(<span class="hljs-string">&#x27;Diverging Bars of Car Mileage&#x27;</span>, fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>: <span class="hljs-number">20</span>&#125;)
plt.grid(linestyle=<span class="hljs-string">&#x27;--&#x27;</span>, alpha=<span class="hljs-number">0.5</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/15.png" alt="15"></p><h3><span id="14-mian-ji-tu-area-chart"><font color="##4876FF">【14】Area Chart</font></span></h3><p>By coloring the area between the axis and the line, an area chart emphasizes not just the peaks and troughs but also how long they last. The longer the highs persist, the larger the area under the line.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-comment"># Prepare Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/economics.csv&quot;</span>, parse_dates=[<span class="hljs-string">&#x27;date&#x27;</span>]).head(<span class="hljs-number">100</span>)
x = np.arange(df.shape[<span class="hljs-number">0</span>])
y_returns = (df.psavert.diff().fillna(<span class="hljs-number">0</span>)/df.psavert.shift(<span class="hljs-number">1</span>)).fillna(<span class="hljs-number">0</span>) * <span class="hljs-number">100</span>
<span class="hljs-comment"># Plot</span>
plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)
plt.fill_between(x[<span class="hljs-number">1</span>:], y_returns[<span class="hljs-number">1</span>:], <span class="hljs-number">0</span>, where=y_returns[<span class="hljs-number">1</span>:] &gt;= <span class="hljs-number">0</span>, facecolor=<span 
class="hljs-string">&#x27;green&#x27;</span>, interpolate=<span class="hljs-literal">True</span>, alpha=<span class="hljs-number">0.7</span>)
plt.fill_between(x[<span class="hljs-number">1</span>:], y_returns[<span class="hljs-number">1</span>:], <span class="hljs-number">0</span>, where=y_returns[<span class="hljs-number">1</span>:] &lt;= <span class="hljs-number">0</span>, facecolor=<span class="hljs-string">&#x27;red&#x27;</span>, interpolate=<span class="hljs-literal">True</span>, alpha=<span class="hljs-number">0.7</span>)
<span class="hljs-comment"># Annotate</span>
plt.annotate(<span class="hljs-string">&#x27;Peak \n1975&#x27;</span>, xy=(<span class="hljs-number">94.0</span>, <span class="hljs-number">21.0</span>), xytext=(<span class="hljs-number">88.0</span>, <span class="hljs-number">28</span>),
             bbox=<span class="hljs-built_in">dict</span>(boxstyle=<span class="hljs-string">&#x27;square&#x27;</span>, fc=<span class="hljs-string">&#x27;firebrick&#x27;</span>),
             arrowprops=<span class="hljs-built_in">dict</span>(facecolor=<span class="hljs-string">&#x27;steelblue&#x27;</span>, shrink=<span class="hljs-number">0.05</span>), fontsize=<span class="hljs-number">15</span>, color=<span class="hljs-string">&#x27;white&#x27;</span>)
<span class="hljs-comment"># Decorations</span>
xtickvals = [<span class="hljs-built_in">str</span>(m)[:<span class="hljs-number">3</span>].upper()+<span class="hljs-string">&quot;-&quot;</span>+<span class="hljs-built_in">str</span>(y) <span class="hljs-keyword">for</span> y, m <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(df.date.dt.year, df.date.dt.month_name())]
plt.gca().set_xticks(x[::<span class="hljs-number">6</span>])
plt.gca().set_xticklabels(xtickvals[::<span class="hljs-number">6</span>], rotation=<span class="hljs-number">90</span>, fontdict=&#123;<span class="hljs-string">&#x27;horizontalalignment&#x27;</span>: <span class="hljs-string">&#x27;center&#x27;</span>, <span 
class="hljs-string">&#x27;verticalalignment&#x27;</span>: <span class="hljs-string">&#x27;center_baseline&#x27;</span>&#125;)
plt.ylim(-<span class="hljs-number">35</span>, <span class="hljs-number">35</span>)
plt.xlim(<span class="hljs-number">1</span>, <span class="hljs-number">100</span>)
plt.title(<span class="hljs-string">&quot;Month Economics Return %&quot;</span>, fontsize=<span class="hljs-number">22</span>)
plt.ylabel(<span class="hljs-string">&#x27;Monthly returns %&#x27;</span>)
plt.grid(alpha=<span class="hljs-number">0.5</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/16.png" alt="16"></p><h2><span id="5x00-pai-xu-ranking"><font color="#FF0000">【5x00】Ranking</font></span></h2><h3><span id="15-you-xu-tiao-xing-tu-ordered-bar-chart"><font color="##4876FF">【15】Ordered Bar Chart</font></span></h3><p>An ordered bar chart conveys the rank order of items effectively. Adding the metric's value above each bar lets the reader get precise information from the chart itself.</p><pre><code class="hljs python"><span class="hljs-comment"># Prepare Data (mean cty per manufacturer; .mean() on the groupby avoids averaging the text column)</span>
df_raw = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)
df = df_raw[[<span class="hljs-string">&#x27;cty&#x27;</span>, <span class="hljs-string">&#x27;manufacturer&#x27;</span>]].groupby(<span class="hljs-string">&#x27;manufacturer&#x27;</span>).mean()
df.sort_values(<span class="hljs-string">&#x27;cty&#x27;</span>, inplace=<span class="hljs-literal">True</span>)
df.reset_index(inplace=<span class="hljs-literal">True</span>)
<span class="hljs-comment"># Draw plot</span>
<span class="hljs-keyword">import</span> matplotlib.patches <span class="hljs-keyword">as</span> patches
fig, ax = plt.subplots(figsize=(<span class="hljs-number">16</span>,<span class="hljs-number">10</span>), facecolor=<span class="hljs-string">&#x27;white&#x27;</span>, dpi= <span class="hljs-number">80</span>)
ax.vlines(x=df.index, ymin=<span class="hljs-number">0</span>, ymax=df.cty, 
color=<span class="hljs-string">&#x27;firebrick&#x27;</span>, alpha=<span class="hljs-number">0.7</span>, linewidth=<span class="hljs-number">20</span>)
<span class="hljs-comment"># Annotate Text</span>
<span class="hljs-keyword">for</span> i, cty <span class="hljs-keyword">in</span> <span class="hljs-built_in">enumerate</span>(df.cty):
    ax.text(i, cty+<span class="hljs-number">0.5</span>, <span class="hljs-built_in">round</span>(cty, <span class="hljs-number">1</span>), horizontalalignment=<span class="hljs-string">&#x27;center&#x27;</span>)
<span class="hljs-comment"># Title, Label, Ticks and Ylim</span>
ax.set_title(<span class="hljs-string">&#x27;Bar Chart for Highway Mileage&#x27;</span>, fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>:<span class="hljs-number">22</span>&#125;)
ax.<span class="hljs-built_in">set</span>(ylabel=<span class="hljs-string">&#x27;Miles Per Gallon&#x27;</span>, ylim=(<span class="hljs-number">0</span>, <span class="hljs-number">30</span>))
plt.xticks(df.index, df.manufacturer.<span class="hljs-built_in">str</span>.upper(), rotation=<span class="hljs-number">60</span>, horizontalalignment=<span class="hljs-string">&#x27;right&#x27;</span>, fontsize=<span class="hljs-number">12</span>)
<span class="hljs-comment"># Add patches to color the X axis labels</span>
p1 = patches.Rectangle((<span class="hljs-number">.57</span>, -<span class="hljs-number">0.005</span>), width=<span class="hljs-number">.33</span>, height=<span class="hljs-number">.13</span>, alpha=<span class="hljs-number">.1</span>, facecolor=<span class="hljs-string">&#x27;green&#x27;</span>, transform=fig.transFigure)
p2 = patches.Rectangle((<span class="hljs-number">.124</span>, -<span class="hljs-number">0.005</span>), width=<span class="hljs-number">.446</span>, height=<span class="hljs-number">.13</span>, alpha=<span class="hljs-number">.1</span>, facecolor=<span class="hljs-string">&#x27;red&#x27;</span>, 
transform=fig.transFigure)
fig.add_artist(p1)
fig.add_artist(p2)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/17.png" alt="17"></p><h3><span id="16-bang-bang-tang-tu-lollipop-chart"><font color="##4876FF">【16】Lollipop Chart</font></span></h3><p>A lollipop chart serves a similar purpose to an ordered bar chart, in a visually pleasing way.</p><pre><code class="hljs python"><span class="hljs-comment"># Prepare Data</span>
df_raw = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)
df = df_raw[[<span class="hljs-string">&#x27;cty&#x27;</span>, <span class="hljs-string">&#x27;manufacturer&#x27;</span>]].groupby(<span class="hljs-string">&#x27;manufacturer&#x27;</span>).mean()
df.sort_values(<span class="hljs-string">&#x27;cty&#x27;</span>, inplace=<span class="hljs-literal">True</span>)
df.reset_index(inplace=<span class="hljs-literal">True</span>)
<span class="hljs-comment"># Draw plot</span>
fig, ax = plt.subplots(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)
ax.vlines(x=df.index, ymin=<span class="hljs-number">0</span>, ymax=df.cty, color=<span class="hljs-string">&#x27;firebrick&#x27;</span>, alpha=<span class="hljs-number">0.7</span>, linewidth=<span class="hljs-number">2</span>)
ax.scatter(x=df.index, y=df.cty, s=<span class="hljs-number">75</span>, color=<span class="hljs-string">&#x27;firebrick&#x27;</span>, alpha=<span class="hljs-number">0.7</span>)
<span class="hljs-comment"># Title, Label, Ticks and Ylim</span>
ax.set_title(<span class="hljs-string">&#x27;Lollipop Chart for City Mileage&#x27;</span>, fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>: <span class="hljs-number">22</span>&#125;)
ax.set_ylabel(<span class="hljs-string">&#x27;Miles Per Gallon&#x27;</span>)
ax.set_xticks(df.index)
ax.set_xticklabels(df.manufacturer.<span 
class="hljs-built_in">str</span>.upper(), rotation=<span class="hljs-number">60</span>, fontdict=&#123;<span class="hljs-string">&#x27;horizontalalignment&#x27;</span>: <span class="hljs-string">&#x27;right&#x27;</span>, <span class="hljs-string">&#x27;size&#x27;</span>: <span class="hljs-number">12</span>&#125;)
ax.set_ylim(<span class="hljs-number">0</span>, <span class="hljs-number">30</span>)
<span class="hljs-comment"># Annotate</span>
<span class="hljs-keyword">for</span> row <span class="hljs-keyword">in</span> df.itertuples():
    ax.text(row.Index, row.cty+<span class="hljs-number">.5</span>, s=<span class="hljs-built_in">round</span>(row.cty, <span class="hljs-number">2</span>), horizontalalignment=<span class="hljs-string">&#x27;center&#x27;</span>, verticalalignment=<span class="hljs-string">&#x27;bottom&#x27;</span>, fontsize=<span class="hljs-number">14</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/18.png" alt="18"></p><h3><span id="17-dian-tu-dot-plot"><font color="##4876FF">【17】Dot Plot</font></span></h3><p>A dot plot conveys the rank order of the items. Because the points are aligned along the horizontal axis, it is easier to see how far apart they are.</p><pre><code class="hljs python"><span class="hljs-comment"># Prepare Data</span>
df_raw = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)
df = df_raw[[<span class="hljs-string">&#x27;cty&#x27;</span>, <span class="hljs-string">&#x27;manufacturer&#x27;</span>]].groupby(<span class="hljs-string">&#x27;manufacturer&#x27;</span>).mean()
df.sort_values(<span class="hljs-string">&#x27;cty&#x27;</span>, inplace=<span class="hljs-literal">True</span>)
df.reset_index(inplace=<span class="hljs-literal">True</span>)
<span class="hljs-comment"># Draw plot</span>
fig, ax = plt.subplots(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)
ax.hlines(y=df.index, xmin=<span 
class="hljs-number">11</span>, xmax=<span class="hljs-number">26</span>, color=<span class="hljs-string">&#x27;gray&#x27;</span>, alpha=<span class="hljs-number">0.7</span>, linewidth=<span class="hljs-number">1</span>, linestyles=<span class="hljs-string">&#x27;dashdot&#x27;</span>)
ax.scatter(y=df.index, x=df.cty, s=<span class="hljs-number">75</span>, color=<span class="hljs-string">&#x27;firebrick&#x27;</span>, alpha=<span class="hljs-number">0.7</span>)
<span class="hljs-comment"># Title, Label, Ticks and Ylim</span>
ax.set_title(<span class="hljs-string">&#x27;Dot Plot for City Mileage&#x27;</span>, fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>: <span class="hljs-number">22</span>&#125;)
ax.set_xlabel(<span class="hljs-string">&#x27;Miles Per Gallon&#x27;</span>)
ax.set_yticks(df.index)
ax.set_yticklabels(df.manufacturer.<span class="hljs-built_in">str</span>.title(), fontdict=&#123;<span class="hljs-string">&#x27;horizontalalignment&#x27;</span>: <span class="hljs-string">&#x27;right&#x27;</span>&#125;)
ax.set_xlim(<span class="hljs-number">10</span>, <span class="hljs-number">27</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/19.png" alt="19"></p><h3><span id="18-po-du-tu-slope-chart"><font color="##4876FF">【18】Slope Chart</font></span></h3><p>A slope chart is best suited to comparing the "before" and "after" positions of a given person or item.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.lines <span class="hljs-keyword">as</span> mlines
<span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://raw.githubusercontent.com/selva86/datasets/master/gdppercap.csv&quot;</span>)
left_label = [<span class="hljs-built_in">str</span>(c) + <span class="hljs-string">&#x27;, &#x27;</span> + <span class="hljs-built_in">str</span>(<span class="hljs-built_in">round</span>(y)) <span class="hljs-keyword">for</span> c, y <span class="hljs-keyword">in</span> <span 
class="hljs-built_in">zip</span>(df.continent, df[<span class="hljs-string">&#x27;1952&#x27;</span>])]
right_label = [<span class="hljs-built_in">str</span>(c) + <span class="hljs-string">&#x27;, &#x27;</span> + <span class="hljs-built_in">str</span>(<span class="hljs-built_in">round</span>(y)) <span class="hljs-keyword">for</span> c, y <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(df.continent, df[<span class="hljs-string">&#x27;1957&#x27;</span>])]
klass = [<span class="hljs-string">&#x27;red&#x27;</span> <span class="hljs-keyword">if</span> (y1 - y2) &lt; <span class="hljs-number">0</span> <span class="hljs-keyword">else</span> <span class="hljs-string">&#x27;green&#x27;</span> <span class="hljs-keyword">for</span> y1, y2 <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(df[<span class="hljs-string">&#x27;1952&#x27;</span>], df[<span class="hljs-string">&#x27;1957&#x27;</span>])]
<span class="hljs-comment"># draw line</span>
<span class="hljs-comment"># https://stackoverflow.com/questions/36470343/how-to-draw-a-line-with-matplotlib/36479941</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">newline</span>(<span class="hljs-params">p1, p2, color=<span class="hljs-string">&#x27;black&#x27;</span></span>):</span>
    ax = plt.gca()
    l = mlines.Line2D([p1[<span class="hljs-number">0</span>], p2[<span class="hljs-number">0</span>]], [p1[<span class="hljs-number">1</span>], p2[<span class="hljs-number">1</span>]], color=<span class="hljs-string">&#x27;red&#x27;</span> <span class="hljs-keyword">if</span> p1[<span class="hljs-number">1</span>] - p2[<span class="hljs-number">1</span>] &gt; <span class="hljs-number">0</span> <span class="hljs-keyword">else</span> <span class="hljs-string">&#x27;green&#x27;</span>, marker=<span class="hljs-string">&#x27;o&#x27;</span>,
                      markersize=<span class="hljs-number">6</span>)
    ax.add_line(l)
    <span 
class="hljs-keyword">return</span> l
fig, ax = plt.subplots(<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, figsize=(<span class="hljs-number">14</span>, <span class="hljs-number">14</span>), dpi=<span class="hljs-number">80</span>)
<span class="hljs-comment"># Vertical Lines</span>
ax.vlines(x=<span class="hljs-number">1</span>, ymin=<span class="hljs-number">500</span>, ymax=<span class="hljs-number">13000</span>, color=<span class="hljs-string">&#x27;black&#x27;</span>, alpha=<span class="hljs-number">0.7</span>, linewidth=<span class="hljs-number">1</span>, linestyles=<span class="hljs-string">&#x27;dotted&#x27;</span>)
ax.vlines(x=<span class="hljs-number">3</span>, ymin=<span class="hljs-number">500</span>, ymax=<span class="hljs-number">13000</span>, color=<span class="hljs-string">&#x27;black&#x27;</span>, alpha=<span class="hljs-number">0.7</span>, linewidth=<span class="hljs-number">1</span>, linestyles=<span class="hljs-string">&#x27;dotted&#x27;</span>)
<span class="hljs-comment"># Points</span>
ax.scatter(y=df[<span class="hljs-string">&#x27;1952&#x27;</span>], x=np.repeat(<span class="hljs-number">1</span>, df.shape[<span class="hljs-number">0</span>]), s=<span class="hljs-number">10</span>, color=<span class="hljs-string">&#x27;black&#x27;</span>, alpha=<span class="hljs-number">0.7</span>)
ax.scatter(y=df[<span class="hljs-string">&#x27;1957&#x27;</span>], x=np.repeat(<span class="hljs-number">3</span>, df.shape[<span class="hljs-number">0</span>]), s=<span class="hljs-number">10</span>, color=<span class="hljs-string">&#x27;black&#x27;</span>, alpha=<span class="hljs-number">0.7</span>)
<span class="hljs-comment"># Line Segments and Annotation</span>
<span class="hljs-keyword">for</span> p1, p2, c <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(df[<span class="hljs-string">&#x27;1952&#x27;</span>], df[<span class="hljs-string">&#x27;1957&#x27;</span>], df[<span 
class="hljs-string">&#x27;continent&#x27;</span>]):
    newline([<span class="hljs-number">1</span>, p1], [<span class="hljs-number">3</span>, p2])
    ax.text(<span class="hljs-number">1</span> - <span class="hljs-number">0.05</span>, p1, c + <span class="hljs-string">&#x27;, &#x27;</span> + <span class="hljs-built_in">str</span>(<span class="hljs-built_in">round</span>(p1)), horizontalalignment=<span class="hljs-string">&#x27;right&#x27;</span>, verticalalignment=<span class="hljs-string">&#x27;center&#x27;</span>,
            fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>: <span class="hljs-number">14</span>&#125;)
    ax.text(<span class="hljs-number">3</span> + <span class="hljs-number">0.05</span>, p2, c + <span class="hljs-string">&#x27;, &#x27;</span> + <span class="hljs-built_in">str</span>(<span class="hljs-built_in">round</span>(p2)), horizontalalignment=<span class="hljs-string">&#x27;left&#x27;</span>, verticalalignment=<span class="hljs-string">&#x27;center&#x27;</span>,
            fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>: <span class="hljs-number">14</span>&#125;)
<span class="hljs-comment"># &#x27;Before&#x27; and &#x27;After&#x27; Annotations</span>
ax.text(<span class="hljs-number">1</span> - <span class="hljs-number">0.05</span>, <span class="hljs-number">13000</span>, <span class="hljs-string">&#x27;BEFORE&#x27;</span>, horizontalalignment=<span class="hljs-string">&#x27;right&#x27;</span>, verticalalignment=<span class="hljs-string">&#x27;center&#x27;</span>,
        fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>: <span class="hljs-number">18</span>, <span class="hljs-string">&#x27;weight&#x27;</span>: <span class="hljs-number">700</span>&#125;)
ax.text(<span class="hljs-number">3</span> + <span class="hljs-number">0.05</span>, <span class="hljs-number">13000</span>, <span class="hljs-string">&#x27;AFTER&#x27;</span>, horizontalalignment=<span class="hljs-string">&#x27;left&#x27;</span>, 
verticalalignment=<span class="hljs-string">&#x27;center&#x27;</span>,
        fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>: <span class="hljs-number">18</span>, <span class="hljs-string">&#x27;weight&#x27;</span>: <span class="hljs-number">700</span>&#125;)
<span class="hljs-comment"># Decoration</span>
ax.set_title(<span class="hljs-string">&quot;Slopechart: Comparing GDP Per Capita between 1952 vs 1957&quot;</span>, fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>: <span class="hljs-number">22</span>&#125;)
ax.<span class="hljs-built_in">set</span>(xlim=(<span class="hljs-number">0</span>, <span class="hljs-number">4</span>), ylim=(<span class="hljs-number">0</span>, <span class="hljs-number">14000</span>), ylabel=<span class="hljs-string">&#x27;Mean GDP Per Capita&#x27;</span>)
ax.set_xticks([<span class="hljs-number">1</span>, <span class="hljs-number">3</span>])
ax.set_xticklabels([<span class="hljs-string">&quot;1952&quot;</span>, <span class="hljs-string">&quot;1957&quot;</span>])
plt.yticks(np.arange(<span class="hljs-number">500</span>, <span class="hljs-number">13000</span>, <span class="hljs-number">2000</span>), fontsize=<span class="hljs-number">12</span>)
<span class="hljs-comment"># Lighten borders</span>
plt.gca().spines[<span class="hljs-string">&quot;top&quot;</span>].set_alpha(<span class="hljs-number">.0</span>)
plt.gca().spines[<span class="hljs-string">&quot;bottom&quot;</span>].set_alpha(<span class="hljs-number">.0</span>)
plt.gca().spines[<span class="hljs-string">&quot;right&quot;</span>].set_alpha(<span class="hljs-number">.0</span>)
plt.gca().spines[<span class="hljs-string">&quot;left&quot;</span>].set_alpha(<span class="hljs-number">.0</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/20.png" alt="20"></p><h3><span id="19-ya-ling-tu-dumbbell-plot"><font color="##4876FF">【19】Dumbbell Plot</font></span></h3><p>A dumbbell plot conveys the "before" and "after" positions of various items along with their rank order. It is very useful when you want to visualize the effect of a particular project or initiative on different objects.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.lines <span class="hljs-keyword">as</span> mlines
<span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://raw.githubusercontent.com/selva86/datasets/master/health.csv&quot;</span>)
df.sort_values(<span class="hljs-string">&#x27;pct_2014&#x27;</span>, inplace=<span class="hljs-literal">True</span>)
df.reset_index(inplace=<span class="hljs-literal">True</span>)
<span class="hljs-comment"># Func to draw line segment</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">newline</span>(<span class="hljs-params">p1, p2, color=<span class="hljs-string">&#x27;black&#x27;</span></span>):</span>
    ax = plt.gca()
    l = mlines.Line2D([p1[<span class="hljs-number">0</span>], p2[<span class="hljs-number">0</span>]], [p1[<span class="hljs-number">1</span>], p2[<span class="hljs-number">1</span>]], color=<span class="hljs-string">&#x27;skyblue&#x27;</span>)
    ax.add_line(l)
    <span class="hljs-keyword">return</span> l
<span class="hljs-comment"># Figure and Axes</span>
fig, ax = plt.subplots(<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, figsize=(<span class="hljs-number">14</span>, <span class="hljs-number">14</span>), facecolor=<span class="hljs-string">&#x27;#f7f7f7&#x27;</span>, dpi=<span class="hljs-number">80</span>)
<span class="hljs-comment"># Vertical Lines</span>
ax.vlines(x=<span class="hljs-number">.05</span>, ymin=<span class="hljs-number">0</span>, ymax=<span class="hljs-number">26</span>, color=<span class="hljs-string">&#x27;black&#x27;</span>, alpha=<span class="hljs-number">1</span>, linewidth=<span class="hljs-number">1</span>, linestyles=<span class="hljs-string">&#x27;dotted&#x27;</span>)
ax.vlines(x=<span class="hljs-number">.10</span>, ymin=<span class="hljs-number">0</span>, 
ymax=<span class="hljs-number">26</span>, color=<span class="hljs-string">&#x27;black&#x27;</span>, alpha=<span class="hljs-number">1</span>, linewidth=<span class="hljs-number">1</span>, linestyles=<span class="hljs-string">&#x27;dotted&#x27;</span>)
ax.vlines(x=<span class="hljs-number">.15</span>, ymin=<span class="hljs-number">0</span>, ymax=<span class="hljs-number">26</span>, color=<span class="hljs-string">&#x27;black&#x27;</span>, alpha=<span class="hljs-number">1</span>, linewidth=<span class="hljs-number">1</span>, linestyles=<span class="hljs-string">&#x27;dotted&#x27;</span>)
ax.vlines(x=<span class="hljs-number">.20</span>, ymin=<span class="hljs-number">0</span>, ymax=<span class="hljs-number">26</span>, color=<span class="hljs-string">&#x27;black&#x27;</span>, alpha=<span class="hljs-number">1</span>, linewidth=<span class="hljs-number">1</span>, linestyles=<span class="hljs-string">&#x27;dotted&#x27;</span>)
<span class="hljs-comment"># Points</span>
ax.scatter(y=df[<span class="hljs-string">&#x27;index&#x27;</span>], x=df[<span class="hljs-string">&#x27;pct_2013&#x27;</span>], s=<span class="hljs-number">50</span>, color=<span class="hljs-string">&#x27;#0e668b&#x27;</span>, alpha=<span class="hljs-number">0.7</span>)
ax.scatter(y=df[<span class="hljs-string">&#x27;index&#x27;</span>], x=df[<span class="hljs-string">&#x27;pct_2014&#x27;</span>], s=<span class="hljs-number">50</span>, color=<span class="hljs-string">&#x27;#a3c4dc&#x27;</span>, alpha=<span class="hljs-number">0.7</span>)
<span class="hljs-comment"># Line Segments</span>
<span class="hljs-keyword">for</span> i, p1, p2 <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(df[<span class="hljs-string">&#x27;index&#x27;</span>], df[<span class="hljs-string">&#x27;pct_2013&#x27;</span>], df[<span class="hljs-string">&#x27;pct_2014&#x27;</span>]):
    newline([p1, i], [p2, i])
<span class="hljs-comment"># Decoration</span>
ax.set_facecolor(<span 
class="hljs-string">&#x27;#f7f7f7&#x27;</span>)
ax.set_title(<span class="hljs-string">&quot;Dumbbell Chart: Pct Change - 2013 vs 2014&quot;</span>, fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>: <span class="hljs-number">22</span>&#125;)
ax.<span class="hljs-built_in">set</span>(xlim=(<span class="hljs-number">0</span>, <span class="hljs-number">.25</span>), ylim=(-<span class="hljs-number">1</span>, <span class="hljs-number">27</span>))
ax.set_xticks([<span class="hljs-number">.05</span>, <span class="hljs-number">.1</span>, <span class="hljs-number">.15</span>, <span class="hljs-number">.20</span>])
ax.set_xticklabels([<span class="hljs-string">&#x27;5%&#x27;</span>, <span class="hljs-string">&#x27;10%&#x27;</span>, <span class="hljs-string">&#x27;15%&#x27;</span>, <span class="hljs-string">&#x27;20%&#x27;</span>])
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/21.png" alt="21"></p><h2><span id="6x00-fen-bu-distribution"><font color="#FF0000">【6x00】Distribution</font></span></h2><h3><span id="20-lian-xu-bian-liang-de-zhi-fang-tu-histogram-for-continuous-variable"><font color="##4876FF">【20】Histogram for Continuous Variable</font></span></h3><p>A histogram of a continuous variable shows the frequency distribution of that variable. The chart below groups the frequency bars by a categorical variable, giving deeper insight into both the continuous and the categorical variable.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)
<span class="hljs-comment"># Prepare data</span>
x_var = <span class="hljs-string">&#x27;displ&#x27;</span>
groupby_var = <span class="hljs-string">&#x27;class&#x27;</span>
df_agg = df.loc[:, [x_var, 
groupby_var]].groupby(groupby_var)
vals = [df[x_var].values.tolist() <span class="hljs-keyword">for</span> i, df <span class="hljs-keyword">in</span> df_agg]
<span class="hljs-comment"># Draw</span>
plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">9</span>), dpi=<span class="hljs-number">80</span>)
colors = [plt.cm.Spectral(i / <span class="hljs-built_in">float</span>(<span class="hljs-built_in">len</span>(vals) - <span class="hljs-number">1</span>)) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(vals))]
n, bins, patches = plt.hist(vals, <span class="hljs-number">30</span>, stacked=<span class="hljs-literal">True</span>, density=<span class="hljs-literal">False</span>, color=colors[:<span class="hljs-built_in">len</span>(vals)])
<span class="hljs-comment"># Decoration</span>
plt.legend(&#123;group: col <span class="hljs-keyword">for</span> group, col <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(np.unique(df[groupby_var]).tolist(), colors[:<span class="hljs-built_in">len</span>(vals)])&#125;)
plt.title(<span class="hljs-string">f&quot;Stacked Histogram of $<span class="hljs-subst">&#123;x_var&#125;</span>$ colored by $<span class="hljs-subst">&#123;groupby_var&#125;</span>$&quot;</span>, fontsize=<span class="hljs-number">22</span>)
plt.xlabel(x_var)
plt.ylabel(<span class="hljs-string">&quot;Frequency&quot;</span>)
plt.ylim(<span class="hljs-number">0</span>, <span class="hljs-number">25</span>)
plt.xticks(ticks=bins[::<span class="hljs-number">3</span>], labels=[<span class="hljs-built_in">round</span>(b, <span class="hljs-number">1</span>) <span class="hljs-keyword">for</span> b <span class="hljs-keyword">in</span> bins[::<span class="hljs-number">3</span>]])
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/22.png" alt="22"></p><h3><span 
id="21-fen-lei-bian-liang-de-zhi-fang-tu-histogram-for-categorical-variable"><font color="##4876FF">【21】Histogram for Categorical Variable</font></span></h3><p>A histogram of a categorical variable shows the frequency distribution of that variable. By coloring the bars, you can relate the distribution to a second categorical variable that the colors represent.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)
<span class="hljs-comment"># Prepare data</span>
x_var = <span class="hljs-string">&#x27;manufacturer&#x27;</span>
groupby_var = <span class="hljs-string">&#x27;class&#x27;</span>
df_agg = df.loc[:, [x_var, groupby_var]].groupby(groupby_var)
vals = [df[x_var].values.tolist() <span class="hljs-keyword">for</span> i, df <span class="hljs-keyword">in</span> df_agg]
<span class="hljs-comment"># Draw</span>
plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">9</span>), dpi=<span class="hljs-number">80</span>)
colors = [plt.cm.Spectral(i / <span class="hljs-built_in">float</span>(<span class="hljs-built_in">len</span>(vals) - <span class="hljs-number">1</span>)) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(vals))]
n, bins, patches = plt.hist(vals, df[x_var].unique().__len__(), stacked=<span class="hljs-literal">True</span>, density=<span class="hljs-literal">False</span>, color=colors[:<span class="hljs-built_in">len</span>(vals)])
<span class="hljs-comment"># Decoration</span>
plt.legend(&#123;group: col <span class="hljs-keyword">for</span> group, col <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(np.unique(df[groupby_var]).tolist(), colors[:<span class="hljs-built_in">len</span>(vals)])&#125;)
plt.title(<span class="hljs-string">f&quot;Stacked Histogram of $<span class="hljs-subst">&#123;x_var&#125;</span>$ colored by $<span class="hljs-subst">&#123;groupby_var&#125;</span>$&quot;</span>, 
fontsize=<span class="hljs-number">22</span>)
plt.xlabel(x_var)
plt.ylabel(<span class="hljs-string">&quot;Frequency&quot;</span>)
plt.ylim(<span class="hljs-number">0</span>, <span class="hljs-number">40</span>)
<span class="hljs-comment"># bins holds one more edge than there are labels, so drop the last edge</span>
plt.xticks(ticks=bins[:-<span class="hljs-number">1</span>], labels=np.unique(df[x_var]).tolist(), rotation=<span class="hljs-number">90</span>, horizontalalignment=<span class="hljs-string">&#x27;left&#x27;</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/23.png" alt="23"></p><h3><span id="22-mi-du-tu-density-plot"><font color="##4876FF">【22】Density Plot</font></span></h3><p>A density plot is a common tool for visualizing the distribution of a continuous variable. By grouping the curves by a "response" variable, you can inspect the relationship between X and Y. The example below is for illustration: it shows how the distribution of city mileage varies with the number of cylinders.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)
<span class="hljs-comment"># Draw Plot</span>
plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)
sns.kdeplot(df.loc[df[<span class="hljs-string">&#x27;cyl&#x27;</span>] == <span class="hljs-number">4</span>, <span class="hljs-string">&quot;cty&quot;</span>], shade=<span class="hljs-literal">True</span>, color=<span class="hljs-string">&quot;g&quot;</span>, label=<span class="hljs-string">&quot;Cyl=4&quot;</span>, alpha=<span class="hljs-number">.7</span>)
sns.kdeplot(df.loc[df[<span class="hljs-string">&#x27;cyl&#x27;</span>] == <span class="hljs-number">5</span>, <span class="hljs-string">&quot;cty&quot;</span>], shade=<span class="hljs-literal">True</span>, color=<span class="hljs-string">&quot;deeppink&quot;</span>, label=<span class="hljs-string">&quot;Cyl=5&quot;</span>, alpha=<span class="hljs-number">.7</span>)
sns.kdeplot(df.loc[df[<span class="hljs-string">&#x27;cyl&#x27;</span>] == <span class="hljs-number">6</span>, <span class="hljs-string">&quot;cty&quot;</span>], shade=<span 
class="hljs-literal">True</span>, color=<span class="hljs-string">&quot;dodgerblue&quot;</span>, label=<span class="hljs-string">&quot;Cyl=6&quot;</span>, alpha=<span class="hljs-number">.7</span>)
sns.kdeplot(df.loc[df[<span class="hljs-string">&#x27;cyl&#x27;</span>] == <span class="hljs-number">8</span>, <span class="hljs-string">&quot;cty&quot;</span>], shade=<span class="hljs-literal">True</span>, color=<span class="hljs-string">&quot;orange&quot;</span>, label=<span class="hljs-string">&quot;Cyl=8&quot;</span>, alpha=<span class="hljs-number">.7</span>)
<span class="hljs-comment"># Decoration</span>
plt.title(<span class="hljs-string">&#x27;Density Plot of City Mileage by n_Cylinders&#x27;</span>, fontsize=<span class="hljs-number">22</span>)
plt.legend()
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/24.png" alt="24"></p><h3><span id="23-zhi-fang-tu-mi-du-qu-xian-density-curves-with-histogram"><font color="##4876FF">【23】Density Curves with Histogram</font></span></h3><p>A density curve with a histogram combines the information conveyed by the two plots, so you can show both in a single figure instead of two.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)
<span class="hljs-comment"># Draw Plot</span>
plt.figure(figsize=(<span class="hljs-number">13</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)
sns.distplot(df.loc[df[<span class="hljs-string">&#x27;class&#x27;</span>] == <span class="hljs-string">&#x27;compact&#x27;</span>, <span class="hljs-string">&quot;cty&quot;</span>], color=<span class="hljs-string">&quot;dodgerblue&quot;</span>, label=<span class="hljs-string">&quot;Compact&quot;</span>, hist_kws=&#123;<span class="hljs-string">&#x27;alpha&#x27;</span>: <span class="hljs-number">.7</span>&#125;,
             kde_kws=&#123;<span class="hljs-string">&#x27;linewidth&#x27;</span>: <span 
class="hljs-number">3</span>&#125;)
sns.distplot(df.loc[df[<span class="hljs-string">&#x27;class&#x27;</span>] == <span class="hljs-string">&#x27;suv&#x27;</span>, <span class="hljs-string">&quot;cty&quot;</span>], color=<span class="hljs-string">&quot;orange&quot;</span>, label=<span class="hljs-string">&quot;SUV&quot;</span>, hist_kws=&#123;<span class="hljs-string">&#x27;alpha&#x27;</span>: <span class="hljs-number">.7</span>&#125;,
             kde_kws=&#123;<span class="hljs-string">&#x27;linewidth&#x27;</span>: <span class="hljs-number">3</span>&#125;)
sns.distplot(df.loc[df[<span class="hljs-string">&#x27;class&#x27;</span>] == <span class="hljs-string">&#x27;minivan&#x27;</span>, <span class="hljs-string">&quot;cty&quot;</span>], color=<span class="hljs-string">&quot;g&quot;</span>, label=<span class="hljs-string">&quot;minivan&quot;</span>, hist_kws=&#123;<span class="hljs-string">&#x27;alpha&#x27;</span>: <span class="hljs-number">.7</span>&#125;,
             kde_kws=&#123;<span class="hljs-string">&#x27;linewidth&#x27;</span>: <span class="hljs-number">3</span>&#125;)
plt.ylim(<span class="hljs-number">0</span>, <span class="hljs-number">0.35</span>)
<span class="hljs-comment"># Decoration</span>
plt.title(<span class="hljs-string">&#x27;Density Plot of City Mileage by Vehicle Type&#x27;</span>, fontsize=<span class="hljs-number">22</span>)
plt.legend()
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/25.png" alt="25"></p><h3><span id="24-shan-feng-die-luan-tu-huan-le-tu-joy-plot"><font color="##4876FF">【24】Joy Plot</font></span></h3><p>A Joy Plot lets the density curves of different groups overlap. It is a great way to visualize how a large number of groups are distributed relative to one another. It is pleasing to the eye and conveys the right information clearly. It can be built easily with the <code>joypy</code> package, which is based on <code>matplotlib</code>.</p><p>【Translator TRHX's note: a Joy Plot looks like layered, rolling mountain ridges, but it is named "Joy", which makes the name hard to translate into Chinese. Install the joypy library before using this method.】</p><pre><code class="hljs python"><span class="hljs-comment"># !pip install joypy</span>
<span class="hljs-comment"># Import Data</span>
<span 
class="hljs-keyword">import</span> joypy  <span class="hljs-comment"># Not imported in the original article; added by translator TRHX</span>
mpg = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)
<span class="hljs-comment"># Draw Plot</span>
plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)
fig, axes = joypy.joyplot(mpg, column=[<span class="hljs-string">&#x27;hwy&#x27;</span>, <span class="hljs-string">&#x27;cty&#x27;</span>], by=<span class="hljs-string">&quot;class&quot;</span>, ylim=<span class="hljs-string">&#x27;own&#x27;</span>, figsize=(<span class="hljs-number">14</span>, <span class="hljs-number">10</span>))
<span class="hljs-comment"># Decoration</span>
plt.title(<span class="hljs-string">&#x27;Joy Plot of City and Highway Mileage by Class&#x27;</span>, fontsize=<span class="hljs-number">22</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/26.png" alt="26"></p><h3><span id="25-fen-bu-shi-dian-tu-distributed-dot-plot"><font color="##4876FF">【25】Distributed Dot Plot</font></span></h3><p>A distributed dot plot shows the univariate distribution of points split by group. The darker the points, the higher the concentration of data points in that region. Coloring the median differently makes the real position of each group apparent at once.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.patches <span class="hljs-keyword">as</span> mpatches
<span class="hljs-comment"># Prepare Data</span>
df_raw = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)
cyl_colors = &#123;<span class="hljs-number">4</span>: <span class="hljs-string">&#x27;tab:red&#x27;</span>, <span class="hljs-number">5</span>: <span class="hljs-string">&#x27;tab:green&#x27;</span>, <span class="hljs-number">6</span>: <span class="hljs-string">&#x27;tab:blue&#x27;</span>, <span class="hljs-number">8</span>: <span class="hljs-string">&#x27;tab:orange&#x27;</span>&#125;
df_raw[<span 
class="hljs-string">&#x27;cyl_color&#x27;</span>] = df_raw.cyl.<span class="hljs-built_in">map</span>(cyl_colors)
<span class="hljs-comment"># Mean and Median city mileage by make</span>
df = df_raw[[<span class="hljs-string">&#x27;cty&#x27;</span>, <span class="hljs-string">&#x27;manufacturer&#x27;</span>]].groupby(<span class="hljs-string">&#x27;manufacturer&#x27;</span>).mean()
df.sort_values(<span class="hljs-string">&#x27;cty&#x27;</span>, ascending=<span class="hljs-literal">False</span>, inplace=<span class="hljs-literal">True</span>)
df.reset_index(inplace=<span class="hljs-literal">True</span>)
df_median = df_raw[[<span class="hljs-string">&#x27;cty&#x27;</span>, <span class="hljs-string">&#x27;manufacturer&#x27;</span>]].groupby(<span class="hljs-string">&#x27;manufacturer&#x27;</span>).median()
<span class="hljs-comment"># Draw horizontal lines</span>
fig, ax = plt.subplots(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)
ax.hlines(y=df.index, xmin=<span class="hljs-number">0</span>, xmax=<span class="hljs-number">40</span>, color=<span class="hljs-string">&#x27;gray&#x27;</span>, alpha=<span class="hljs-number">0.5</span>, linewidth=<span class="hljs-number">.5</span>, linestyles=<span class="hljs-string">&#x27;dashdot&#x27;</span>)
<span class="hljs-comment"># Draw the Dots</span>
<span class="hljs-keyword">for</span> i, make <span class="hljs-keyword">in</span> <span class="hljs-built_in">enumerate</span>(df.manufacturer):
    df_make = df_raw.loc[df_raw.manufacturer == make, :]
    <span class="hljs-comment"># Original code</span>
    <span class="hljs-comment"># ax.scatter(y=np.repeat(i, df_make.shape[0]), x=&#x27;cty&#x27;, data=df_make, s=75, edgecolors=&#x27;gray&#x27;, c=&#x27;w&#x27;, alpha=0.5)</span>
    <span class="hljs-comment"># Corrected code</span>
    ax.scatter(y=<span 
class="hljs-built_in">list</span>(np.repeat(i, df_make.shape[<span class="hljs-number">0</span>])), x=<span class="hljs-string">&#x27;cty&#x27;</span>, data=df_make, s=<span class="hljs-number">75</span>, edgecolors=<span class="hljs-string">&#x27;gray&#x27;</span>, c=<span class="hljs-string">&#x27;w&#x27;</span>, alpha=<span class="hljs-number">0.5</span>)    ax.scatter(y=i, x=<span class="hljs-string">&#x27;cty&#x27;</span>, data=df_median.loc[df_median.index == make, :], s=<span class="hljs-number">75</span>, c=<span class="hljs-string">&#x27;firebrick&#x27;</span>)<span class="hljs-comment"># Annotate</span>ax.text(<span class="hljs-number">33</span>, <span class="hljs-number">13</span>, <span class="hljs-string">&quot;$red \; dots \; are \; the \: median$&quot;</span>, fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>: <span class="hljs-number">12</span>&#125;, color=<span class="hljs-string">&#x27;firebrick&#x27;</span>)<span class="hljs-comment"># Decorations</span>red_patch = plt.plot([], [], marker=<span class="hljs-string">&quot;o&quot;</span>, ms=<span class="hljs-number">10</span>, ls=<span class="hljs-string">&quot;&quot;</span>, mec=<span class="hljs-literal">None</span>, color=<span class="hljs-string">&#x27;firebrick&#x27;</span>, label=<span class="hljs-string">&quot;Median&quot;</span>)plt.legend(handles=red_patch)ax.set_title(<span class="hljs-string">&#x27;Distribution of City Mileage by Make&#x27;</span>, fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>: <span class="hljs-number">22</span>&#125;)ax.set_xlabel(<span class="hljs-string">&#x27;Miles Per Gallon (City)&#x27;</span>, alpha=<span class="hljs-number">0.7</span>)ax.set_yticks(df.index)ax.set_yticklabels(df.manufacturer.<span class="hljs-built_in">str</span>.title(), fontdict=&#123;<span class="hljs-string">&#x27;horizontalalignment&#x27;</span>: <span class="hljs-string">&#x27;right&#x27;</span>&#125;, alpha=<span 
class="hljs-number">0.7</span>)ax.set_xlim(<span class="hljs-number">1</span>, <span class="hljs-number">40</span>)plt.xticks(alpha=<span class="hljs-number">0.7</span>)plt.gca().spines[<span class="hljs-string">&quot;top&quot;</span>].set_visible(<span class="hljs-literal">False</span>)plt.gca().spines[<span class="hljs-string">&quot;bottom&quot;</span>].set_visible(<span class="hljs-literal">False</span>)plt.gca().spines[<span class="hljs-string">&quot;right&quot;</span>].set_visible(<span class="hljs-literal">False</span>)plt.gca().spines[<span class="hljs-string">&quot;left&quot;</span>].set_visible(<span class="hljs-literal">False</span>)plt.grid(axis=<span class="hljs-string">&#x27;both&#x27;</span>, alpha=<span class="hljs-number">.4</span>, linewidth=<span class="hljs-number">.1</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/27.png" alt="27"></p><h3><span id="26-xiang-xing-tu-box-plot"><font color="##4876FF">【26】Box Plot</font></span></h3><p>A box plot is a good way to visualize a distribution while keeping the median, the 25th and 75th percentiles, and the outliers in view. However, take care when interpreting the size of a box: it can give a distorted impression of how many points the group contains. Manually annotating each box with its number of observations helps overcome this drawback.</p><p>For example, the first two boxes on the left are the same size even though they hold 5 and 47 observations respectively, so it is worth writing the number of observations on each group.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)<span class="hljs-comment"># Draw Plot</span>plt.figure(figsize=(<span class="hljs-number">13</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)sns.boxplot(x=<span class="hljs-string">&#x27;class&#x27;</span>, y=<span class="hljs-string">&#x27;hwy&#x27;</span>, data=df, notch=<span class="hljs-literal">False</span>)<span class="hljs-comment"># Add N Obs inside boxplot (optional)</span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">add_n_obs</span>(<span class="hljs-params">df, group_col, y</span>):</span>    
medians_dict = &#123;grp[<span class="hljs-number">0</span>]: grp[<span class="hljs-number">1</span>][y].median() <span class="hljs-keyword">for</span> grp <span class="hljs-keyword">in</span> df.groupby(group_col)&#125;    xticklabels = [x.get_text() <span class="hljs-keyword">for</span> x <span class="hljs-keyword">in</span> plt.gca().get_xticklabels()]    n_obs = df.groupby(group_col)[y].size().values    <span class="hljs-keyword">for</span> (x, xticklabel), n_ob <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(<span class="hljs-built_in">enumerate</span>(xticklabels), n_obs):        plt.text(x, medians_dict[xticklabel] * <span class="hljs-number">1.01</span>, <span class="hljs-string">&quot;#obs : &quot;</span> + <span class="hljs-built_in">str</span>(n_ob), horizontalalignment=<span class="hljs-string">&#x27;center&#x27;</span>,                 fontdict=&#123;<span class="hljs-string">&#x27;size&#x27;</span>: <span class="hljs-number">14</span>&#125;, color=<span class="hljs-string">&#x27;white&#x27;</span>)add_n_obs(df, group_col=<span class="hljs-string">&#x27;class&#x27;</span>, y=<span class="hljs-string">&#x27;hwy&#x27;</span>)<span class="hljs-comment"># Decoration</span>plt.title(<span class="hljs-string">&#x27;Box Plot of Highway Mileage by Vehicle Class&#x27;</span>, fontsize=<span class="hljs-number">22</span>)plt.ylim(<span class="hljs-number">10</span>, <span class="hljs-number">40</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/28.png" alt="28"></p><h3><span id="27-dian-xiang-xing-tu-dot-box-plot"><font color="##4876FF">【27】Dot + Box Plot</font></span></h3><p>A dot + box plot conveys the same information as a grouped box plot. In addition, the dots give a sense of how many data points lie in each group.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)<span class="hljs-comment"># Draw 
Plot</span>plt.figure(figsize=(<span class="hljs-number">13</span>,<span class="hljs-number">10</span>), dpi= <span class="hljs-number">80</span>)sns.boxplot(x=<span class="hljs-string">&#x27;class&#x27;</span>, y=<span class="hljs-string">&#x27;hwy&#x27;</span>, data=df, hue=<span class="hljs-string">&#x27;cyl&#x27;</span>)sns.stripplot(x=<span class="hljs-string">&#x27;class&#x27;</span>, y=<span class="hljs-string">&#x27;hwy&#x27;</span>, data=df, color=<span class="hljs-string">&#x27;black&#x27;</span>, size=<span class="hljs-number">3</span>, jitter=<span class="hljs-number">1</span>)<span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(df[<span class="hljs-string">&#x27;class&#x27;</span>].unique())-<span class="hljs-number">1</span>):    plt.vlines(i+<span class="hljs-number">.5</span>, <span class="hljs-number">10</span>, <span class="hljs-number">45</span>, linestyles=<span class="hljs-string">&#x27;solid&#x27;</span>, colors=<span class="hljs-string">&#x27;gray&#x27;</span>, alpha=<span class="hljs-number">0.2</span>)<span class="hljs-comment"># Decoration</span>plt.title(<span class="hljs-string">&#x27;Box Plot of Highway Mileage by Vehicle Class&#x27;</span>, fontsize=<span class="hljs-number">22</span>)plt.legend(title=<span class="hljs-string">&#x27;Cylinders&#x27;</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/29.png" alt="29"></p><h3><span id="28-xiao-ti-qin-tu-violin-plot"><font color="##4876FF">【28】Violin Plot</font></span></h3><p>A violin plot is a visually pleasing alternative to a box plot. The shape or area of each violin depends on the number of observations it holds. However, violin plots can be harder to read and are not commonly used in professional settings.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)<span class="hljs-comment"># Draw Plot</span>plt.figure(figsize=(<span 
class="hljs-number">13</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)sns.violinplot(x=<span class="hljs-string">&#x27;class&#x27;</span>, y=<span class="hljs-string">&#x27;hwy&#x27;</span>, data=df, scale=<span class="hljs-string">&#x27;width&#x27;</span>, inner=<span class="hljs-string">&#x27;quartile&#x27;</span>)<span class="hljs-comment"># Decoration</span>plt.title(<span class="hljs-string">&#x27;Violin Plot of Highway Mileage by Vehicle Class&#x27;</span>, fontsize=<span class="hljs-number">22</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/30.png" alt="30"></p><h3><span id="29-ren-kou-jin-zi-ta-tu-population-pyramid"><font color="##4876FF">【29】Population Pyramid</font></span></h3><p>A population pyramid can be used to show the distribution of groups ordered by volume. It can also show the stage-by-stage filtering of a population, as it does below to show how many people pass through each stage of a marketing funnel.</p><pre><code class="hljs python"><span class="hljs-comment"># Read data</span>df = pd.read_csv(<span class="hljs-string">&quot;https://raw.githubusercontent.com/selva86/datasets/master/email_campaign_funnel.csv&quot;</span>)<span class="hljs-comment"># Draw Plot</span>plt.figure(figsize=(<span class="hljs-number">13</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)group_col = <span class="hljs-string">&#x27;Gender&#x27;</span>order_of_bars = df.Stage.unique()[::-<span class="hljs-number">1</span>]colors = [plt.cm.Spectral(i / <span class="hljs-built_in">float</span>(<span class="hljs-built_in">len</span>(df[group_col].unique()) - <span class="hljs-number">1</span>)) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(df[group_col].unique()))]<span class="hljs-keyword">for</span> c, group <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(colors, df[group_col].unique()):    sns.barplot(x=<span class="hljs-string">&#x27;Users&#x27;</span>, 
y=<span class="hljs-string">&#x27;Stage&#x27;</span>, data=df.loc[df[group_col] == group, :], order=order_of_bars, color=c, label=group)<span class="hljs-comment"># Decorations</span>plt.xlabel(<span class="hljs-string">&quot;$Users$&quot;</span>)plt.ylabel(<span class="hljs-string">&quot;Stage of Purchase&quot;</span>)plt.yticks(fontsize=<span class="hljs-number">12</span>)plt.title(<span class="hljs-string">&quot;Population Pyramid of the Marketing Funnel&quot;</span>, fontsize=<span class="hljs-number">22</span>)plt.legend()plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/31.png" alt="31"></p><h3><span id="30-fen-lei-tu-categorical-plots"><font color="##4876FF">【30】Categorical Plots</font></span></h3><p>Categorical plots provided by the <code>seaborn</code> library can be used to visualize the count distribution of two or more categorical variables in relation to each other.</p><pre><code class="hljs python"><span class="hljs-comment"># Load Dataset</span>titanic = sns.load_dataset(<span class="hljs-string">&quot;titanic&quot;</span>)<span class="hljs-comment"># Plot</span>g = sns.catplot(x=<span class="hljs-string">&quot;alive&quot;</span>, col=<span class="hljs-string">&quot;deck&quot;</span>, col_wrap=<span class="hljs-number">4</span>,                data=titanic[titanic.deck.notnull()],                kind=<span class="hljs-string">&quot;count&quot;</span>, height=<span class="hljs-number">3.5</span>, aspect=<span class="hljs-number">.8</span>,                palette=<span class="hljs-string">&#x27;tab20&#x27;</span>)<span class="hljs-comment"># Translator TRHX commented out the following line</span><span class="hljs-comment"># fig.suptitle(&#x27;sf&#x27;)</span>plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/32.png" alt="32"></p><pre><code class="hljs python"><span class="hljs-comment"># Load Dataset</span>titanic = sns.load_dataset(<span class="hljs-string">&quot;titanic&quot;</span>)<span class="hljs-comment"># Plot</span>sns.catplot(x=<span class="hljs-string">&quot;age&quot;</span>, y=<span 
class="hljs-string">&quot;embark_town&quot;</span>,            hue=<span class="hljs-string">&quot;sex&quot;</span>, col=<span class="hljs-string">&quot;class&quot;</span>,            data=titanic[titanic.embark_town.notnull()],            orient=<span class="hljs-string">&quot;h&quot;</span>, height=<span class="hljs-number">5</span>, aspect=<span class="hljs-number">1</span>, palette=<span class="hljs-string">&quot;tab10&quot;</span>,            kind=<span class="hljs-string">&quot;violin&quot;</span>, dodge=<span class="hljs-literal">True</span>, cut=<span class="hljs-number">0</span>, bw=<span class="hljs-number">.2</span>)<span class="hljs-comment"># Added by translator TRHX</span>plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/33.png" alt="33"></p><h2><span id="7x00-zu-cheng-composition"><font color="#FF0000">【7x00】Composition</font></span></h2><h3><span id="31-hua-fu-bing-tu-waffle-chart"><font color="##4876FF">【31】Waffle Chart</font></span></h3><p>A waffle chart can be created with the <code>pywaffle</code> package and is used to show the composition of groups within a larger population.</p><p>【Translator TRHX's note: install the pywaffle library before using this method.】</p><pre><code class="hljs python"><span class="hljs-comment"># ! 
pip install pywaffle</span><span class="hljs-comment"># Reference: https://stackoverflow.com/questions/41400136/how-to-do-waffle-charts-in-python-square-piechart</span><span class="hljs-keyword">from</span> pywaffle <span class="hljs-keyword">import</span> Waffle<span class="hljs-comment"># Import</span>df_raw = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)<span class="hljs-comment"># Prepare Data</span>df = df_raw.groupby(<span class="hljs-string">&#x27;class&#x27;</span>).size().reset_index(name=<span class="hljs-string">&#x27;counts&#x27;</span>)n_categories = df.shape[<span class="hljs-number">0</span>]colors = [plt.cm.inferno_r(i / <span class="hljs-built_in">float</span>(n_categories)) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(n_categories)]<span class="hljs-comment"># Draw Plot and Decorate</span>fig = plt.figure(    FigureClass=Waffle,    plots=&#123;        <span class="hljs-string">&#x27;111&#x27;</span>: &#123;            <span class="hljs-string">&#x27;values&#x27;</span>: df[<span class="hljs-string">&#x27;counts&#x27;</span>],            <span class="hljs-string">&#x27;labels&#x27;</span>: [<span class="hljs-string">&quot;&#123;0&#125; (&#123;1&#125;)&quot;</span>.<span class="hljs-built_in">format</span>(n[<span class="hljs-number">0</span>], n[<span class="hljs-number">1</span>]) <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> df[[<span class="hljs-string">&#x27;class&#x27;</span>, <span class="hljs-string">&#x27;counts&#x27;</span>]].itertuples()],            <span class="hljs-string">&#x27;legend&#x27;</span>: &#123;<span class="hljs-string">&#x27;loc&#x27;</span>: <span class="hljs-string">&#x27;upper left&#x27;</span>, <span class="hljs-string">&#x27;bbox_to_anchor&#x27;</span>: (<span class="hljs-number">1.05</span>, <span class="hljs-number">1</span>), <span 
class="hljs-string">&#x27;fontsize&#x27;</span>: <span class="hljs-number">12</span>&#125;,            <span class="hljs-string">&#x27;title&#x27;</span>: &#123;<span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;# Vehicles by Class&#x27;</span>, <span class="hljs-string">&#x27;loc&#x27;</span>: <span class="hljs-string">&#x27;center&#x27;</span>, <span class="hljs-string">&#x27;fontsize&#x27;</span>: <span class="hljs-number">18</span>&#125;        &#125;,    &#125;,    rows=<span class="hljs-number">7</span>,    colors=colors,    figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">9</span>))<span class="hljs-comment"># 译者 TRHX 添加了这行代码</span>plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/34.png" alt="34"></p><pre><code class="hljs python"><span class="hljs-comment"># ! pip install pywaffle</span><span class="hljs-keyword">from</span> pywaffle <span class="hljs-keyword">import</span> Waffle<span class="hljs-comment"># Import</span><span class="hljs-comment"># 译者 TRHX 取消注释了这行代码</span>df_raw = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)<span class="hljs-comment"># Prepare Data</span><span class="hljs-comment"># By Class Data</span>df_class = df_raw.groupby(<span class="hljs-string">&#x27;class&#x27;</span>).size().reset_index(name=<span class="hljs-string">&#x27;counts_class&#x27;</span>)n_categories = df_class.shape[<span class="hljs-number">0</span>]colors_class = [plt.cm.Set3(i / <span class="hljs-built_in">float</span>(n_categories)) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(n_categories)]<span class="hljs-comment"># By Cylinders Data</span>df_cyl = df_raw.groupby(<span class="hljs-string">&#x27;cyl&#x27;</span>).size().reset_index(name=<span class="hljs-string">&#x27;counts_cyl&#x27;</span>)n_categories = 
df_cyl.shape[<span class="hljs-number">0</span>]colors_cyl = [plt.cm.Spectral(i / <span class="hljs-built_in">float</span>(n_categories)) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(n_categories)]<span class="hljs-comment"># By Make Data</span>df_make = df_raw.groupby(<span class="hljs-string">&#x27;manufacturer&#x27;</span>).size().reset_index(name=<span class="hljs-string">&#x27;counts_make&#x27;</span>)n_categories = df_make.shape[<span class="hljs-number">0</span>]colors_make = [plt.cm.tab20b(i / <span class="hljs-built_in">float</span>(n_categories)) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(n_categories)]<span class="hljs-comment"># Draw Plot and Decorate</span>fig = plt.figure(    FigureClass=Waffle,    plots=&#123;        <span class="hljs-string">&#x27;311&#x27;</span>: &#123;            <span class="hljs-string">&#x27;values&#x27;</span>: df_class[<span class="hljs-string">&#x27;counts_class&#x27;</span>],            <span class="hljs-string">&#x27;labels&#x27;</span>: [<span class="hljs-string">&quot;&#123;1&#125;&quot;</span>.<span class="hljs-built_in">format</span>(n[<span class="hljs-number">0</span>], n[<span class="hljs-number">1</span>]) <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> df_class[[<span class="hljs-string">&#x27;class&#x27;</span>, <span class="hljs-string">&#x27;counts_class&#x27;</span>]].itertuples()],            <span class="hljs-string">&#x27;legend&#x27;</span>: &#123;<span class="hljs-string">&#x27;loc&#x27;</span>: <span class="hljs-string">&#x27;upper left&#x27;</span>, <span class="hljs-string">&#x27;bbox_to_anchor&#x27;</span>: (<span class="hljs-number">1.05</span>, <span class="hljs-number">1</span>), <span class="hljs-string">&#x27;fontsize&#x27;</span>: <span class="hljs-number">12</span>, <span class="hljs-string">&#x27;title&#x27;</span>: 
<span class="hljs-string">&#x27;Class&#x27;</span>&#125;,            <span class="hljs-string">&#x27;title&#x27;</span>: &#123;<span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;# Vehicles by Class&#x27;</span>, <span class="hljs-string">&#x27;loc&#x27;</span>: <span class="hljs-string">&#x27;center&#x27;</span>, <span class="hljs-string">&#x27;fontsize&#x27;</span>: <span class="hljs-number">18</span>&#125;,            <span class="hljs-string">&#x27;colors&#x27;</span>: colors_class        &#125;,        <span class="hljs-string">&#x27;312&#x27;</span>: &#123;            <span class="hljs-string">&#x27;values&#x27;</span>: df_cyl[<span class="hljs-string">&#x27;counts_cyl&#x27;</span>],            <span class="hljs-string">&#x27;labels&#x27;</span>: [<span class="hljs-string">&quot;&#123;1&#125;&quot;</span>.<span class="hljs-built_in">format</span>(n[<span class="hljs-number">0</span>], n[<span class="hljs-number">1</span>]) <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> df_cyl[[<span class="hljs-string">&#x27;cyl&#x27;</span>, <span class="hljs-string">&#x27;counts_cyl&#x27;</span>]].itertuples()],            <span class="hljs-string">&#x27;legend&#x27;</span>: &#123;<span class="hljs-string">&#x27;loc&#x27;</span>: <span class="hljs-string">&#x27;upper left&#x27;</span>, <span class="hljs-string">&#x27;bbox_to_anchor&#x27;</span>: (<span class="hljs-number">1.05</span>, <span class="hljs-number">1</span>), <span class="hljs-string">&#x27;fontsize&#x27;</span>: <span class="hljs-number">12</span>, <span class="hljs-string">&#x27;title&#x27;</span>: <span class="hljs-string">&#x27;Cyl&#x27;</span>&#125;,            <span class="hljs-string">&#x27;title&#x27;</span>: &#123;<span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;# Vehicles by Cyl&#x27;</span>, <span class="hljs-string">&#x27;loc&#x27;</span>: <span class="hljs-string">&#x27;center&#x27;</span>, <span 
class="hljs-string">&#x27;fontsize&#x27;</span>: <span class="hljs-number">18</span>&#125;,            <span class="hljs-string">&#x27;colors&#x27;</span>: colors_cyl        &#125;,        <span class="hljs-string">&#x27;313&#x27;</span>: &#123;            <span class="hljs-string">&#x27;values&#x27;</span>: df_make[<span class="hljs-string">&#x27;counts_make&#x27;</span>],            <span class="hljs-string">&#x27;labels&#x27;</span>: [<span class="hljs-string">&quot;&#123;1&#125;&quot;</span>.<span class="hljs-built_in">format</span>(n[<span class="hljs-number">0</span>], n[<span class="hljs-number">1</span>]) <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> df_make[[<span class="hljs-string">&#x27;manufacturer&#x27;</span>, <span class="hljs-string">&#x27;counts_make&#x27;</span>]].itertuples()],            <span class="hljs-string">&#x27;legend&#x27;</span>: &#123;<span class="hljs-string">&#x27;loc&#x27;</span>: <span class="hljs-string">&#x27;upper left&#x27;</span>, <span class="hljs-string">&#x27;bbox_to_anchor&#x27;</span>: (<span class="hljs-number">1.05</span>, <span class="hljs-number">1</span>), <span class="hljs-string">&#x27;fontsize&#x27;</span>: <span class="hljs-number">12</span>, <span class="hljs-string">&#x27;title&#x27;</span>: <span class="hljs-string">&#x27;Manufacturer&#x27;</span>&#125;,            <span class="hljs-string">&#x27;title&#x27;</span>: &#123;<span class="hljs-string">&#x27;label&#x27;</span>: <span class="hljs-string">&#x27;# Vehicles by Make&#x27;</span>, <span class="hljs-string">&#x27;loc&#x27;</span>: <span class="hljs-string">&#x27;center&#x27;</span>, <span class="hljs-string">&#x27;fontsize&#x27;</span>: <span class="hljs-number">18</span>&#125;,            <span class="hljs-string">&#x27;colors&#x27;</span>: colors_make        &#125;    &#125;,    rows=<span class="hljs-number">9</span>,    figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">14</span>))<span 
class="hljs-comment"># Added by translator TRHX</span>plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/35.png" alt="35"></p><h3><span id="32-bing-tu-pie-chart"><font color="##4876FF">【32】Pie Chart</font></span></h3><p>A pie chart is a classic way to show the composition of groups. However, it is generally not advisable nowadays, because the area of the pie slices can be misleading. If you do use a pie chart, it is strongly recommended to write down the percentage or count for each slice explicitly.</p><pre><code class="hljs python"><span class="hljs-comment"># Import</span>df_raw = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)<span class="hljs-comment"># Prepare Data</span>df = df_raw.groupby(<span class="hljs-string">&#x27;class&#x27;</span>).size()<span class="hljs-comment"># Make the plot with pandas</span><span class="hljs-string">&#x27;&#x27;&#x27;</span><span class="hljs-string">Original code: df.plot(kind=&#x27;pie&#x27;, subplots=True, figsize=(8, 8), dpi=80)</span><span class="hljs-string">Translator TRHX removed dpi=80</span><span class="hljs-string">&#x27;&#x27;&#x27;</span>df.plot(kind=<span class="hljs-string">&#x27;pie&#x27;</span>, subplots=<span class="hljs-literal">True</span>, figsize=(<span class="hljs-number">8</span>, <span class="hljs-number">8</span>))plt.title(<span class="hljs-string">&quot;Pie Chart of Vehicle Class - Bad&quot;</span>)plt.ylabel(<span class="hljs-string">&quot;&quot;</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/36.png" alt="36"></p><pre><code class="hljs python"><span class="hljs-comment"># Import</span>df_raw = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)<span class="hljs-comment"># Prepare Data</span>df = df_raw.groupby(<span class="hljs-string">&#x27;class&#x27;</span>).size().reset_index(name=<span class="hljs-string">&#x27;counts&#x27;</span>)<span class="hljs-comment"># Draw Plot</span>fig, ax = plt.subplots(figsize=(<span class="hljs-number">12</span>, <span class="hljs-number">7</span>), subplot_kw=<span 
class="hljs-built_in">dict</span>(aspect=<span class="hljs-string">&quot;equal&quot;</span>), dpi=<span class="hljs-number">80</span>)data = df[<span class="hljs-string">&#x27;counts&#x27;</span>]categories = df[<span class="hljs-string">&#x27;class&#x27;</span>]explode = [<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0.1</span>, <span class="hljs-number">0</span>]<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">func</span>(<span class="hljs-params">pct, allvals</span>):</span>    absolute = <span class="hljs-built_in">int</span>(pct / <span class="hljs-number">100.</span> * np.<span class="hljs-built_in">sum</span>(allvals))    <span class="hljs-keyword">return</span> <span class="hljs-string">&quot;&#123;:.1f&#125;% (&#123;:d&#125; )&quot;</span>.<span class="hljs-built_in">format</span>(pct, absolute)wedges, texts, autotexts = ax.pie(data,                                  autopct=<span class="hljs-keyword">lambda</span> pct: func(pct, data),                                  textprops=<span class="hljs-built_in">dict</span>(color=<span class="hljs-string">&quot;w&quot;</span>),                                  colors=plt.cm.Dark2.colors,                                  startangle=<span class="hljs-number">140</span>,                                  explode=explode)<span class="hljs-comment"># Decoration</span>ax.legend(wedges, categories, title=<span class="hljs-string">&quot;Vehicle Class&quot;</span>, loc=<span class="hljs-string">&quot;center left&quot;</span>, bbox_to_anchor=(<span class="hljs-number">1</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0.5</span>, <span class="hljs-number">1</span>))plt.setp(autotexts, size=<span class="hljs-number">10</span>, weight=<span class="hljs-number">700</span>)ax.set_title(<span 
class="hljs-string">&quot;Class of Vehicles: Pie Chart&quot;</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/37.png" alt="37"></p><h3><span id="33-ju-zhen-shu-xing-tu-treemap"><font color="##4876FF">【33】Treemap</font></span></h3><p>A treemap is similar to a pie chart, and it does a better job without misrepresenting the contribution of each group.</p><p>【Translator TRHX's note: install the squarify library before using this method.】</p><pre><code class="hljs python"><span class="hljs-comment"># pip install squarify</span><span class="hljs-keyword">import</span> squarify<span class="hljs-comment"># Import Data</span>df_raw = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)<span class="hljs-comment"># Prepare Data</span>df = df_raw.groupby(<span class="hljs-string">&#x27;class&#x27;</span>).size().reset_index(name=<span class="hljs-string">&#x27;counts&#x27;</span>)labels = df.apply(<span class="hljs-keyword">lambda</span> x: <span class="hljs-built_in">str</span>(x[<span class="hljs-number">0</span>]) + <span class="hljs-string">&quot;\n (&quot;</span> + <span class="hljs-built_in">str</span>(x[<span class="hljs-number">1</span>]) + <span class="hljs-string">&quot;)&quot;</span>, axis=<span class="hljs-number">1</span>)sizes = df[<span class="hljs-string">&#x27;counts&#x27;</span>].values.tolist()colors = [plt.cm.Spectral(i / <span class="hljs-built_in">float</span>(<span class="hljs-built_in">len</span>(labels))) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-built_in">len</span>(labels))]<span class="hljs-comment"># Draw Plot</span>plt.figure(figsize=(<span class="hljs-number">12</span>, <span class="hljs-number">8</span>), dpi=<span class="hljs-number">80</span>)squarify.plot(sizes=sizes, label=labels, color=colors, alpha=<span class="hljs-number">.8</span>)<span class="hljs-comment"># Decorate</span>plt.title(<span class="hljs-string">&#x27;Treemap of Vehicle 
Class&#x27;</span>)plt.axis(<span class="hljs-string">&#x27;off&#x27;</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/38.png" alt="38"></p><h3><span id="34-tiao-xing-tu-bar-chart"><font color="##4876FF">【34】Bar Chart</font></span></h3><p>A bar chart is a classic way to visualize items by count or by any other given metric. In the chart below I used a different color for each item, but you would typically pick one color for all items unless you are coloring them by group. The color names are stored in <code>all_colors</code> in the code below. You can change the color of the bars by setting the <code>color</code> parameter in <code>plt.bar()</code>.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> random<span class="hljs-comment"># Import Data</span>df_raw = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv&quot;</span>)<span class="hljs-comment"># Prepare Data</span>df = df_raw.groupby(<span class="hljs-string">&#x27;manufacturer&#x27;</span>).size().reset_index(name=<span class="hljs-string">&#x27;counts&#x27;</span>)n = df[<span class="hljs-string">&#x27;manufacturer&#x27;</span>].unique().__len__()+<span class="hljs-number">1</span>all_colors = <span class="hljs-built_in">list</span>(plt.cm.colors.cnames.keys())random.seed(<span class="hljs-number">100</span>)c = random.choices(all_colors, k=n)<span class="hljs-comment"># Plot Bars</span>plt.figure(figsize=(<span class="hljs-number">16</span>,<span class="hljs-number">10</span>), dpi= <span class="hljs-number">80</span>)plt.bar(df[<span class="hljs-string">&#x27;manufacturer&#x27;</span>], df[<span class="hljs-string">&#x27;counts&#x27;</span>], color=c, width=<span class="hljs-number">.5</span>)<span class="hljs-keyword">for</span> i, val <span class="hljs-keyword">in</span> <span class="hljs-built_in">enumerate</span>(df[<span class="hljs-string">&#x27;counts&#x27;</span>].values):    plt.text(i, val, <span class="hljs-built_in">float</span>(val), horizontalalignment=<span class="hljs-string">&#x27;center&#x27;</span>, verticalalignment=<span class="hljs-string">&#x27;bottom&#x27;</span>, 
fontdict=&#123;<span class="hljs-string">&#x27;fontweight&#x27;</span>:<span class="hljs-number">500</span>, <span class="hljs-string">&#x27;size&#x27;</span>:<span class="hljs-number">12</span>&#125;)<span class="hljs-comment"># Decoration</span>plt.gca().set_xticklabels(df[<span class="hljs-string">&#x27;manufacturer&#x27;</span>], rotation=<span class="hljs-number">60</span>, horizontalalignment= <span class="hljs-string">&#x27;right&#x27;</span>)plt.title(<span class="hljs-string">&quot;Number of Vehicles by Manufacturers&quot;</span>, fontsize=<span class="hljs-number">22</span>)plt.ylabel(<span class="hljs-string">&#x27;# Vehicles&#x27;</span>)plt.ylim(<span class="hljs-number">0</span>, <span class="hljs-number">45</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/39.png" alt="39"></p><h2><span id="8x00-bian-hua-change"><font color="#FF0000">【8x00】Change</font></span></h2><h3><span id="35-shi-jian-xu-lie-tu-time-series-plot"><font color="##4876FF">【35】Time Series Plot</font></span></h3><p>A time series plot is used to visualize how a given metric changes over time. Here you can see how air passenger traffic changed between 1949 and 1960.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>df = pd.read_csv(<span class="hljs-string">&#x27;https://github.com/selva86/datasets/raw/master/AirPassengers.csv&#x27;</span>)<span class="hljs-comment"># Draw Plot</span>plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)plt.plot(<span class="hljs-string">&#x27;date&#x27;</span>, <span class="hljs-string">&#x27;traffic&#x27;</span>, data=df, color=<span class="hljs-string">&#x27;tab:red&#x27;</span>)<span class="hljs-comment"># Decoration</span>plt.ylim(<span class="hljs-number">50</span>, <span class="hljs-number">750</span>)xtick_location = df.index.tolist()[::<span class="hljs-number">12</span>]xtick_labels = [x[-<span class="hljs-number">4</span>:] <span class="hljs-keyword">for</span> x 
<span class="hljs-keyword">in</span> df.date.tolist()[::<span class="hljs-number">12</span>]]plt.xticks(ticks=xtick_location, labels=xtick_labels, rotation=<span class="hljs-number">0</span>, fontsize=<span class="hljs-number">12</span>, horizontalalignment=<span class="hljs-string">&#x27;center&#x27;</span>, alpha=<span class="hljs-number">.7</span>)plt.yticks(fontsize=<span class="hljs-number">12</span>, alpha=<span class="hljs-number">.7</span>)plt.title(<span class="hljs-string">&quot;Air Passengers Traffic (1949 - 1969)&quot;</span>, fontsize=<span class="hljs-number">22</span>)plt.grid(axis=<span class="hljs-string">&#x27;both&#x27;</span>, alpha=<span class="hljs-number">.3</span>)<span class="hljs-comment"># Remove borders</span>plt.gca().spines[<span class="hljs-string">&quot;top&quot;</span>].set_alpha(<span class="hljs-number">0.0</span>)plt.gca().spines[<span class="hljs-string">&quot;bottom&quot;</span>].set_alpha(<span class="hljs-number">0.3</span>)plt.gca().spines[<span class="hljs-string">&quot;right&quot;</span>].set_alpha(<span class="hljs-number">0.0</span>)plt.gca().spines[<span class="hljs-string">&quot;left&quot;</span>].set_alpha(<span class="hljs-number">0.3</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/40.png" alt="40"></p><h3><span id="36-dai-bo-feng-he-bo-gu-zhu-shi-de-shi-jian-xu-lie-tu-time-series-with-peaks-and-troughs-annotated"><font color="##4876FF">【36】带波峰和波谷注释的时间序列图（Time Series with Peaks and Troughs Annotated）</font></span></h3><p>下面的时间序列绘制了所有的波峰和波谷，并注释了所选特殊事件的发生。</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>df = pd.read_csv(<span class="hljs-string">&#x27;https://github.com/selva86/datasets/raw/master/AirPassengers.csv&#x27;</span>)<span class="hljs-comment"># Get the Peaks and Troughs</span>data = df[<span class="hljs-string">&#x27;traffic&#x27;</span>].valuesdoublediff = np.diff(np.sign(np.diff(data)))peak_locations = np.where(doublediff == 
-<span class="hljs-number">2</span>)[<span class="hljs-number">0</span>] + <span class="hljs-number">1</span>
doublediff2 = np.diff(np.sign(np.diff(-<span class="hljs-number">1</span> * data)))
trough_locations = np.where(doublediff2 == -<span class="hljs-number">2</span>)[<span class="hljs-number">0</span>] + <span class="hljs-number">1</span>

<span class="hljs-comment"># Draw Plot</span>
plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)
plt.plot(<span class="hljs-string">&#x27;date&#x27;</span>, <span class="hljs-string">&#x27;traffic&#x27;</span>, data=df, color=<span class="hljs-string">&#x27;tab:blue&#x27;</span>, label=<span class="hljs-string">&#x27;Air Traffic&#x27;</span>)
plt.scatter(df.date[peak_locations], df.traffic[peak_locations], marker=mpl.markers.CARETUPBASE, color=<span class="hljs-string">&#x27;tab:green&#x27;</span>,
            s=<span class="hljs-number">100</span>, label=<span class="hljs-string">&#x27;Peaks&#x27;</span>)
plt.scatter(df.date[trough_locations], df.traffic[trough_locations], marker=mpl.markers.CARETDOWNBASE, color=<span class="hljs-string">&#x27;tab:red&#x27;</span>,
            s=<span class="hljs-number">100</span>, label=<span class="hljs-string">&#x27;Troughs&#x27;</span>)

<span class="hljs-comment"># Annotate</span>
<span class="hljs-keyword">for</span> t, p <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(trough_locations[<span class="hljs-number">1</span>::<span class="hljs-number">5</span>], peak_locations[::<span class="hljs-number">3</span>]):
    plt.text(df.date[p], df.traffic[p] + <span class="hljs-number">15</span>, df.date[p], horizontalalignment=<span class="hljs-string">&#x27;center&#x27;</span>, color=<span class="hljs-string">&#x27;darkgreen&#x27;</span>)
    plt.text(df.date[t], df.traffic[t] - <span class="hljs-number">35</span>, df.date[t], horizontalalignment=<span 
class="hljs-string">&#x27;center&#x27;</span>, color=<span class="hljs-string">&#x27;darkred&#x27;</span>)

<span class="hljs-comment"># Decoration</span>
plt.ylim(<span class="hljs-number">50</span>, <span class="hljs-number">750</span>)
xtick_location = df.index.tolist()[::<span class="hljs-number">6</span>]
xtick_labels = df.date.tolist()[::<span class="hljs-number">6</span>]
plt.xticks(ticks=xtick_location, labels=xtick_labels, rotation=<span class="hljs-number">90</span>, fontsize=<span class="hljs-number">12</span>, alpha=<span class="hljs-number">.7</span>)
plt.title(<span class="hljs-string">&quot;Peak and Troughs of Air Passengers Traffic (1949 - 1960)&quot;</span>, fontsize=<span class="hljs-number">22</span>)
plt.yticks(fontsize=<span class="hljs-number">12</span>, alpha=<span class="hljs-number">.7</span>)

<span class="hljs-comment"># Lighten borders</span>
plt.gca().spines[<span class="hljs-string">&quot;top&quot;</span>].set_alpha(<span class="hljs-number">.0</span>)
plt.gca().spines[<span class="hljs-string">&quot;bottom&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
plt.gca().spines[<span class="hljs-string">&quot;right&quot;</span>].set_alpha(<span class="hljs-number">.0</span>)
plt.gca().spines[<span class="hljs-string">&quot;left&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
plt.legend(loc=<span class="hljs-string">&#x27;upper left&#x27;</span>)
plt.grid(axis=<span class="hljs-string">&#x27;y&#x27;</span>, alpha=<span class="hljs-number">.3</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/41.png" alt="41"></p><h3><span id="37-zi-xiang-guan-acf-he-bu-fen-zi-xiang-guan-pacf-tu-autocorrelation-acf-and-partial-autocorrelation-pacf-plot"><font color="##4876FF">【37】Autocorrelation (ACF) and Partial Autocorrelation (PACF) Plot</font></span></h3><p>The ACF plot shows the correlation of the time series with its own lags. Each vertical line (on the autocorrelation plot) represents the correlation between the series and its lagged copy, starting from lag 0. The blue shaded region in the plot is the significance level. Lags that extend beyond the blue region are the significant lags.</p><p>So how do you interpret this?</p><p>For AirPassengers, we see more than 14 lags crossing the blue band, so they are significant. Because the data is monthly, this means that air passenger traffic from as far as 14 months back still influences the traffic seen today.</p><p>The PACF plot, on the other hand, shows the autocorrelation of any given lag (of the time series) against the current series, with the contributions of the lags in between removed.</p><pre><code class="hljs python"><span class="hljs-keyword">from</span> statsmodels.graphics.tsaplots <span class="hljs-keyword">import</span> plot_acf, plot_pacf

<span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&#x27;https://github.com/selva86/datasets/raw/master/AirPassengers.csv&#x27;</span>)

<span class="hljs-comment"># Draw Plot</span>
fig, (ax1, ax2) = plt.subplots(<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">6</span>), dpi=<span class="hljs-number">80</span>)
plot_acf(df.traffic.tolist(), ax=ax1, lags=<span class="hljs-number">50</span>)
plot_pacf(df.traffic.tolist(), ax=ax2, lags=<span class="hljs-number">20</span>)

<span class="hljs-comment"># Decorate</span>
<span class="hljs-comment"># lighten the borders</span>
ax1.spines[<span class="hljs-string">&quot;top&quot;</span>].set_alpha(<span class="hljs-number">.3</span>); ax2.spines[<span class="hljs-string">&quot;top&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
ax1.spines[<span class="hljs-string">&quot;bottom&quot;</span>].set_alpha(<span class="hljs-number">.3</span>); ax2.spines[<span class="hljs-string">&quot;bottom&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
ax1.spines[<span class="hljs-string">&quot;right&quot;</span>].set_alpha(<span class="hljs-number">.3</span>); ax2.spines[<span class="hljs-string">&quot;right&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
ax1.spines[<span class="hljs-string">&quot;left&quot;</span>].set_alpha(<span class="hljs-number">.3</span>); ax2.spines[<span class="hljs-string">&quot;left&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)

<span class="hljs-comment"># font size of tick labels</span>
ax1.tick_params(axis=<span 
class="hljs-string">&#x27;both&#x27;</span>, labelsize=<span class="hljs-number">12</span>)
ax2.tick_params(axis=<span class="hljs-string">&#x27;both&#x27;</span>, labelsize=<span class="hljs-number">12</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/42.png" alt="42"></p><h3><span id="38-jiao-cha-xiang-guan-tu-cross-correlation-plot"><font color="##4876FF">【38】Cross Correlation Plot</font></span></h3><p>The cross correlation plot shows how strongly two time series are correlated with each other at different lags.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> statsmodels.tsa.stattools <span class="hljs-keyword">as</span> stattools

<span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&#x27;https://github.com/selva86/datasets/raw/master/mortality.csv&#x27;</span>)
x = df[<span class="hljs-string">&#x27;mdeaths&#x27;</span>]
y = df[<span class="hljs-string">&#x27;fdeaths&#x27;</span>]

<span class="hljs-comment"># Compute Cross Correlations</span>
ccs = stattools.ccf(x, y)[:<span class="hljs-number">100</span>]
nlags = <span class="hljs-built_in">len</span>(ccs)

<span class="hljs-comment"># Compute the Significance level</span>
<span class="hljs-comment"># ref: https://stats.stackexchange.com/questions/3115/cross-correlation-significance-in-r/3128#3128</span>
conf_level = <span class="hljs-number">2</span> / np.sqrt(nlags)

<span class="hljs-comment"># Draw Plot</span>
plt.figure(figsize=(<span class="hljs-number">12</span>, <span class="hljs-number">7</span>), dpi=<span class="hljs-number">80</span>)
plt.hlines(<span class="hljs-number">0</span>, xmin=<span class="hljs-number">0</span>, xmax=<span class="hljs-number">100</span>, color=<span class="hljs-string">&#x27;gray&#x27;</span>)  <span class="hljs-comment"># 0 axis</span>
plt.hlines(conf_level, xmin=<span class="hljs-number">0</span>, xmax=<span class="hljs-number">100</span>, color=<span class="hljs-string">&#x27;gray&#x27;</span>)
plt.hlines(-conf_level, xmin=<span class="hljs-number">0</span>, 
xmax=<span class="hljs-number">100</span>, color=<span class="hljs-string">&#x27;gray&#x27;</span>)
plt.bar(x=np.arange(<span class="hljs-built_in">len</span>(ccs)), height=ccs, width=<span class="hljs-number">.3</span>)

<span class="hljs-comment"># Decoration (raw string so that "\;" reaches mathtext unchanged)</span>
plt.title(<span class="hljs-string">r&#x27;$Cross\; Correlation\; Plot:\; mdeaths\; vs\; fdeaths$&#x27;</span>, fontsize=<span class="hljs-number">22</span>)
plt.xlim(<span class="hljs-number">0</span>, <span class="hljs-built_in">len</span>(ccs))
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/43.png" alt="43"></p><h3><span id="39-shi-jian-xu-lie-fen-jie-tu-time-series-decomposition-plot"><font color="##4876FF">【39】Time Series Decomposition Plot</font></span></h3><p>A time series decomposition plot breaks the time series down into trend, seasonal, and residual components.</p><pre><code class="hljs python"><span class="hljs-keyword">from</span> statsmodels.tsa.seasonal <span class="hljs-keyword">import</span> seasonal_decompose
<span class="hljs-keyword">from</span> dateutil.parser <span class="hljs-keyword">import</span> parse

<span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&#x27;https://github.com/selva86/datasets/raw/master/AirPassengers.csv&#x27;</span>)
dates = pd.DatetimeIndex([parse(d).strftime(<span class="hljs-string">&#x27;%Y-%m-01&#x27;</span>) <span class="hljs-keyword">for</span> d <span class="hljs-keyword">in</span> df[<span class="hljs-string">&#x27;date&#x27;</span>]])
df.set_index(dates, inplace=<span class="hljs-literal">True</span>)

<span class="hljs-comment"># Decompose</span>
result = seasonal_decompose(df[<span class="hljs-string">&#x27;traffic&#x27;</span>], model=<span class="hljs-string">&#x27;multiplicative&#x27;</span>)

<span class="hljs-comment"># Plot</span>
plt.rcParams.update(&#123;<span class="hljs-string">&#x27;figure.figsize&#x27;</span>: (<span class="hljs-number">10</span>, <span class="hljs-number">10</span>)&#125;)
result.plot().suptitle(<span 
class="hljs-string">&#x27;Time Series Decomposition of Air Passengers&#x27;</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/44.png" alt="44"></p><h3><span id="40-duo-chong-shi-jian-xu-lie-multiple-time-series"><font color="##4876FF">【40】Multiple Time Series</font></span></h3><p>You can plot multiple time series that measure the same quantity on the same chart, as shown below.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&#x27;https://github.com/selva86/datasets/raw/master/mortality.csv&#x27;</span>)

<span class="hljs-comment"># Define the upper limit, lower limit, interval of Y axis and colors</span>
y_LL = <span class="hljs-number">100</span>
y_UL = <span class="hljs-built_in">int</span>(df.iloc[:, <span class="hljs-number">1</span>:].<span class="hljs-built_in">max</span>().<span class="hljs-built_in">max</span>() * <span class="hljs-number">1.1</span>)
y_interval = <span class="hljs-number">400</span>
mycolors = [<span class="hljs-string">&#x27;tab:red&#x27;</span>, <span class="hljs-string">&#x27;tab:blue&#x27;</span>, <span class="hljs-string">&#x27;tab:green&#x27;</span>, <span class="hljs-string">&#x27;tab:orange&#x27;</span>]

<span class="hljs-comment"># Draw Plot and Annotate</span>
fig, ax = plt.subplots(<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">9</span>), dpi=<span class="hljs-number">80</span>)
columns = df.columns[<span class="hljs-number">1</span>:]
<span class="hljs-keyword">for</span> i, column <span class="hljs-keyword">in</span> <span class="hljs-built_in">enumerate</span>(columns):
    plt.plot(df.date.values, df[column].values, lw=<span class="hljs-number">1.5</span>, color=mycolors[i])
    plt.text(df.shape[<span class="hljs-number">0</span>] + <span class="hljs-number">1</span>, df[column].values[-<span class="hljs-number">1</span>], column, fontsize=<span class="hljs-number">14</span>, 
color=mycolors[i])

<span class="hljs-comment"># Draw Tick lines</span>
<span class="hljs-keyword">for</span> y <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(y_LL, y_UL, y_interval):
    plt.hlines(y, xmin=<span class="hljs-number">0</span>, xmax=<span class="hljs-number">71</span>, colors=<span class="hljs-string">&#x27;black&#x27;</span>, alpha=<span class="hljs-number">0.3</span>, linestyles=<span class="hljs-string">&quot;--&quot;</span>, lw=<span class="hljs-number">0.5</span>)

<span class="hljs-comment"># Decorations</span>
plt.tick_params(axis=<span class="hljs-string">&quot;both&quot;</span>, which=<span class="hljs-string">&quot;both&quot;</span>, bottom=<span class="hljs-literal">False</span>, top=<span class="hljs-literal">False</span>,
                labelbottom=<span class="hljs-literal">True</span>, left=<span class="hljs-literal">False</span>, right=<span class="hljs-literal">False</span>, labelleft=<span class="hljs-literal">True</span>)

<span class="hljs-comment"># Lighten borders</span>
plt.gca().spines[<span class="hljs-string">&quot;top&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
plt.gca().spines[<span class="hljs-string">&quot;bottom&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
plt.gca().spines[<span class="hljs-string">&quot;right&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
plt.gca().spines[<span class="hljs-string">&quot;left&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
plt.title(<span class="hljs-string">&#x27;Number of Deaths from Lung Diseases in the UK (1974-1979)&#x27;</span>, fontsize=<span class="hljs-number">22</span>)
plt.yticks(<span class="hljs-built_in">range</span>(y_LL, y_UL, y_interval), [<span class="hljs-built_in">str</span>(y) <span class="hljs-keyword">for</span> y <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(y_LL, y_UL, y_interval)], fontsize=<span class="hljs-number">12</span>)
plt.xticks(<span 
class="hljs-built_in">range</span>(<span class="hljs-number">0</span>, df.shape[<span class="hljs-number">0</span>], <span class="hljs-number">12</span>), df.date.values[::<span class="hljs-number">12</span>], horizontalalignment=<span class="hljs-string">&#x27;left&#x27;</span>, fontsize=<span class="hljs-number">12</span>)
plt.ylim(y_LL, y_UL)
plt.xlim(-<span class="hljs-number">2</span>, <span class="hljs-number">80</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/45.png" alt="45"></p><h3><span id="41-shi-yong-ci-yao-de-y-zhou-lai-hui-zhi-bu-tong-fan-wei-de-tu-xing-plotting-with-different-scales-using-secondary-y-axis"><font color="##4876FF">【41】Plotting with Different Scales Using a Secondary Y Axis</font></span></h3><p>If you want to show two time series that measure two different quantities at the same points in time, you can plot the second series against a secondary Y axis on the right.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/economics.csv&quot;</span>)
x = df[<span class="hljs-string">&#x27;date&#x27;</span>]
y1 = df[<span class="hljs-string">&#x27;psavert&#x27;</span>]
y2 = df[<span class="hljs-string">&#x27;unemploy&#x27;</span>]

<span class="hljs-comment"># Plot Line1 (Left Y Axis)</span>
fig, ax1 = plt.subplots(<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">9</span>), dpi=<span class="hljs-number">80</span>)
ax1.plot(x, y1, color=<span class="hljs-string">&#x27;tab:red&#x27;</span>)

<span class="hljs-comment"># Plot Line2 (Right Y Axis)</span>
ax2 = ax1.twinx()  <span class="hljs-comment"># instantiate a second axes that shares the same x-axis</span>
ax2.plot(x, y2, color=<span class="hljs-string">&#x27;tab:blue&#x27;</span>)

<span class="hljs-comment"># Decorations</span>
<span class="hljs-comment"># ax1 (left Y axis)</span>
ax1.set_xlabel(<span class="hljs-string">&#x27;Year&#x27;</span>, 
fontsize=<span class="hljs-number">20</span>)
ax1.tick_params(axis=<span class="hljs-string">&#x27;x&#x27;</span>, rotation=<span class="hljs-number">0</span>, labelsize=<span class="hljs-number">12</span>)
ax1.set_ylabel(<span class="hljs-string">&#x27;Personal Savings Rate&#x27;</span>, color=<span class="hljs-string">&#x27;tab:red&#x27;</span>, fontsize=<span class="hljs-number">20</span>)
ax1.tick_params(axis=<span class="hljs-string">&#x27;y&#x27;</span>, rotation=<span class="hljs-number">0</span>, labelcolor=<span class="hljs-string">&#x27;tab:red&#x27;</span>)
ax1.grid(alpha=<span class="hljs-number">.4</span>)

<span class="hljs-comment"># ax2 (right Y axis)</span>
ax2.set_ylabel(<span class="hljs-string">&quot;# Unemployed (1000&#x27;s)&quot;</span>, color=<span class="hljs-string">&#x27;tab:blue&#x27;</span>, fontsize=<span class="hljs-number">20</span>)
ax2.tick_params(axis=<span class="hljs-string">&#x27;y&#x27;</span>, labelcolor=<span class="hljs-string">&#x27;tab:blue&#x27;</span>)
ax2.set_xticks(np.arange(<span class="hljs-number">0</span>, <span class="hljs-built_in">len</span>(x), <span class="hljs-number">60</span>))
ax2.set_xticklabels(x[::<span class="hljs-number">60</span>], rotation=<span class="hljs-number">90</span>, fontdict=&#123;<span class="hljs-string">&#x27;fontsize&#x27;</span>: <span class="hljs-number">10</span>&#125;)
ax2.set_title(<span class="hljs-string">&quot;Personal Savings Rate vs Unemployed: Plotting in Secondary Y Axis&quot;</span>, fontsize=<span class="hljs-number">22</span>)
fig.tight_layout()
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/46.png" alt="46"></p><h3><span id="42-dai-wu-chai-dai-de-shi-jian-xu-lie-time-series-with-error-bands"><font color="##4876FF">【42】Time Series with Error Bands</font></span></h3><p>If you have a time series dataset with multiple observations for each time point (date / timestamp), you can construct a time series with error bands. Below are two such examples: one based on orders placed at different hours of the day, and another on the number of orders arriving over a period of 45 days.</p><p>In this approach, the mean number of orders is drawn as a white line, and a 95% confidence interval is computed and drawn around the mean.</p><pre><code 
class="hljs python"><span class="hljs-keyword">from</span> scipy.stats <span class="hljs-keyword">import</span> sem

<span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://raw.githubusercontent.com/selva86/datasets/master/user_orders_hourofday.csv&quot;</span>)
df_mean = df.groupby(<span class="hljs-string">&#x27;order_hour_of_day&#x27;</span>).quantity.mean()
df_se = df.groupby(<span class="hljs-string">&#x27;order_hour_of_day&#x27;</span>).quantity.apply(sem).mul(<span class="hljs-number">1.96</span>)

<span class="hljs-comment"># Plot</span>
plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)
plt.ylabel(<span class="hljs-string">&quot;# Orders&quot;</span>, fontsize=<span class="hljs-number">16</span>)
x = df_mean.index
plt.plot(x, df_mean, color=<span class="hljs-string">&quot;white&quot;</span>, lw=<span class="hljs-number">2</span>)
plt.fill_between(x, df_mean - df_se, df_mean + df_se, color=<span class="hljs-string">&quot;#3F5D7D&quot;</span>)

<span class="hljs-comment"># Decorations</span>
<span class="hljs-comment"># Lighten borders</span>
plt.gca().spines[<span class="hljs-string">&quot;top&quot;</span>].set_alpha(<span class="hljs-number">0</span>)
plt.gca().spines[<span class="hljs-string">&quot;bottom&quot;</span>].set_alpha(<span class="hljs-number">1</span>)
plt.gca().spines[<span class="hljs-string">&quot;right&quot;</span>].set_alpha(<span class="hljs-number">0</span>)
plt.gca().spines[<span class="hljs-string">&quot;left&quot;</span>].set_alpha(<span class="hljs-number">1</span>)
plt.xticks(x[::<span class="hljs-number">2</span>], [<span class="hljs-built_in">str</span>(d) <span class="hljs-keyword">for</span> d <span class="hljs-keyword">in</span> x[::<span class="hljs-number">2</span>]], fontsize=<span class="hljs-number">12</span>)
plt.title(<span class="hljs-string">&quot;User Orders by Hour of Day (95% confidence)&quot;</span>, 
fontsize=<span class="hljs-number">22</span>)
plt.xlabel(<span class="hljs-string">&quot;Hour of Day&quot;</span>)
s, e = plt.gca().get_xlim()
plt.xlim(s, e)

<span class="hljs-comment"># Draw Horizontal Tick lines</span>
<span class="hljs-keyword">for</span> y <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-number">8</span>, <span class="hljs-number">20</span>, <span class="hljs-number">2</span>):
    plt.hlines(y, xmin=s, xmax=e, colors=<span class="hljs-string">&#x27;black&#x27;</span>, alpha=<span class="hljs-number">0.5</span>, linestyles=<span class="hljs-string">&quot;--&quot;</span>, lw=<span class="hljs-number">0.5</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/47.png" alt="47"></p><pre><code class="hljs python"><span class="hljs-string">&quot;Data Source: https://www.kaggle.com/olistbr/brazilian-ecommerce#olist_orders_dataset.csv&quot;</span>
<span class="hljs-keyword">from</span> dateutil.parser <span class="hljs-keyword">import</span> parse
<span class="hljs-keyword">from</span> scipy.stats <span class="hljs-keyword">import</span> sem

<span class="hljs-comment"># Import Data</span>
df_raw = pd.read_csv(<span class="hljs-string">&#x27;https://raw.githubusercontent.com/selva86/datasets/master/orders_45d.csv&#x27;</span>,
                     parse_dates=[<span class="hljs-string">&#x27;purchase_time&#x27;</span>, <span class="hljs-string">&#x27;purchase_date&#x27;</span>])

<span class="hljs-comment"># Prepare Data: Daily Mean and SE Bands</span>
df_mean = df_raw.groupby(<span class="hljs-string">&#x27;purchase_date&#x27;</span>).quantity.mean()
df_se = df_raw.groupby(<span class="hljs-string">&#x27;purchase_date&#x27;</span>).quantity.apply(sem).mul(<span class="hljs-number">1.96</span>)

<span class="hljs-comment"># Plot</span>
plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi=<span 
class="hljs-number">80</span>)
plt.ylabel(<span class="hljs-string">&quot;# Daily Orders&quot;</span>, fontsize=<span class="hljs-number">16</span>)
x = [d.date().strftime(<span class="hljs-string">&#x27;%Y-%m-%d&#x27;</span>) <span class="hljs-keyword">for</span> d <span class="hljs-keyword">in</span> df_mean.index]
plt.plot(x, df_mean, color=<span class="hljs-string">&quot;white&quot;</span>, lw=<span class="hljs-number">2</span>)
plt.fill_between(x, df_mean - df_se, df_mean + df_se, color=<span class="hljs-string">&quot;#3F5D7D&quot;</span>)

<span class="hljs-comment"># Decorations</span>
<span class="hljs-comment"># Lighten borders</span>
plt.gca().spines[<span class="hljs-string">&quot;top&quot;</span>].set_alpha(<span class="hljs-number">0</span>)
plt.gca().spines[<span class="hljs-string">&quot;bottom&quot;</span>].set_alpha(<span class="hljs-number">1</span>)
plt.gca().spines[<span class="hljs-string">&quot;right&quot;</span>].set_alpha(<span class="hljs-number">0</span>)
plt.gca().spines[<span class="hljs-string">&quot;left&quot;</span>].set_alpha(<span class="hljs-number">1</span>)
plt.xticks(x[::<span class="hljs-number">6</span>], [<span class="hljs-built_in">str</span>(d) <span class="hljs-keyword">for</span> d <span class="hljs-keyword">in</span> x[::<span class="hljs-number">6</span>]], fontsize=<span class="hljs-number">12</span>)
plt.title(<span class="hljs-string">&quot;Daily Order Quantity of Brazilian Retail with Error Bands (95% confidence)&quot;</span>, fontsize=<span class="hljs-number">20</span>)

<span class="hljs-comment"># Axis limits</span>
s, e = plt.gca().get_xlim()
plt.xlim(s, e - <span class="hljs-number">2</span>)
plt.ylim(<span class="hljs-number">4</span>, <span class="hljs-number">10</span>)

<span class="hljs-comment"># Draw Horizontal Tick lines</span>
<span class="hljs-keyword">for</span> y <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-number">5</span>, <span class="hljs-number">10</span>, <span 
class="hljs-number">1</span>):
    plt.hlines(y, xmin=s, xmax=e, colors=<span class="hljs-string">&#x27;black&#x27;</span>, alpha=<span class="hljs-number">0.5</span>, linestyles=<span class="hljs-string">&quot;--&quot;</span>, lw=<span class="hljs-number">0.5</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/48.png" alt="48"></p><h3><span id="43-dui-ji-mian-ji-tu-stacked-area-chart"><font color="##4876FF">【43】Stacked Area Chart</font></span></h3><p>A stacked area chart gives a visual representation of how much each of several time series contributes to the total, so that they are easy to compare against each other.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&#x27;https://raw.githubusercontent.com/selva86/datasets/master/nightvisitors.csv&#x27;</span>)

<span class="hljs-comment"># Decide Colors</span>
mycolors = [<span class="hljs-string">&#x27;tab:red&#x27;</span>, <span class="hljs-string">&#x27;tab:blue&#x27;</span>, <span class="hljs-string">&#x27;tab:green&#x27;</span>, <span class="hljs-string">&#x27;tab:orange&#x27;</span>, <span class="hljs-string">&#x27;tab:brown&#x27;</span>, <span class="hljs-string">&#x27;tab:grey&#x27;</span>, <span class="hljs-string">&#x27;tab:pink&#x27;</span>, <span class="hljs-string">&#x27;tab:olive&#x27;</span>]

<span class="hljs-comment"># Draw Plot and Annotate</span>
fig, ax = plt.subplots(<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">9</span>), dpi=<span class="hljs-number">80</span>)
columns = df.columns[<span class="hljs-number">1</span>:]
labs = columns.values.tolist()

<span class="hljs-comment"># Prepare data</span>
x = df[<span class="hljs-string">&#x27;yearmon&#x27;</span>].values.tolist()
y0 = df[columns[<span class="hljs-number">0</span>]].values.tolist()
y1 = df[columns[<span class="hljs-number">1</span>]].values.tolist()
y2 = df[columns[<span class="hljs-number">2</span>]].values.tolist()
y3 = df[columns[<span 
class="hljs-number">3</span>]].values.tolist()
y4 = df[columns[<span class="hljs-number">4</span>]].values.tolist()
y5 = df[columns[<span class="hljs-number">5</span>]].values.tolist()
y6 = df[columns[<span class="hljs-number">6</span>]].values.tolist()
y7 = df[columns[<span class="hljs-number">7</span>]].values.tolist()
<span class="hljs-comment"># Stack the series in column order so that each band matches its legend label</span>
y = np.vstack([y0, y1, y2, y3, y4, y5, y6, y7])

<span class="hljs-comment"># Plot for each column</span>
labs = columns.values.tolist()
ax = plt.gca()
ax.stackplot(x, y, labels=labs, colors=mycolors, alpha=<span class="hljs-number">0.8</span>)

<span class="hljs-comment"># Decorations</span>
ax.set_title(<span class="hljs-string">&#x27;Night Visitors in Australian Regions&#x27;</span>, fontsize=<span class="hljs-number">18</span>)
ax.<span class="hljs-built_in">set</span>(ylim=[<span class="hljs-number">0</span>, <span class="hljs-number">100000</span>])
ax.legend(fontsize=<span class="hljs-number">10</span>, ncol=<span class="hljs-number">4</span>)
plt.xticks(x[::<span class="hljs-number">5</span>], fontsize=<span class="hljs-number">10</span>, horizontalalignment=<span class="hljs-string">&#x27;center&#x27;</span>)
plt.yticks(np.arange(<span class="hljs-number">10000</span>, <span class="hljs-number">100000</span>, <span class="hljs-number">20000</span>), fontsize=<span class="hljs-number">10</span>)
plt.xlim(x[<span class="hljs-number">0</span>], x[-<span class="hljs-number">1</span>])

<span class="hljs-comment"># Lighten borders</span>
plt.gca().spines[<span class="hljs-string">&quot;top&quot;</span>].set_alpha(<span class="hljs-number">0</span>)
plt.gca().spines[<span class="hljs-string">&quot;bottom&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
plt.gca().spines[<span class="hljs-string">&quot;right&quot;</span>].set_alpha(<span class="hljs-number">0</span>)
plt.gca().spines[<span class="hljs-string">&quot;left&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
plt.show()</code></pre><p><img 
src="https://static.wukongsec.com/itbob/images/article/024/49.png" alt="49"></p><h3><span id="44-wei-dui-ji-mian-ji-tu-area-chart-unstacked"><font color="##4876FF">【44】Area Chart UnStacked</font></span></h3><p>An unstacked area chart is used to visualize the progress (ups and downs) of two or more series relative to each other. In the chart below, you can clearly see how the personal savings rate comes down as the median duration of unemployment increases. The unstacked area chart brings out this phenomenon nicely.</p><pre><code class="hljs python"><span class="hljs-comment"># Import Data</span>
df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/economics.csv&quot;</span>)

<span class="hljs-comment"># Prepare Data</span>
x = df[<span class="hljs-string">&#x27;date&#x27;</span>].values.tolist()
y1 = df[<span class="hljs-string">&#x27;psavert&#x27;</span>].values.tolist()
y2 = df[<span class="hljs-string">&#x27;uempmed&#x27;</span>].values.tolist()
mycolors = [<span class="hljs-string">&#x27;tab:red&#x27;</span>, <span class="hljs-string">&#x27;tab:blue&#x27;</span>, <span class="hljs-string">&#x27;tab:green&#x27;</span>, <span class="hljs-string">&#x27;tab:orange&#x27;</span>, <span class="hljs-string">&#x27;tab:brown&#x27;</span>, <span class="hljs-string">&#x27;tab:grey&#x27;</span>, <span class="hljs-string">&#x27;tab:pink&#x27;</span>, <span class="hljs-string">&#x27;tab:olive&#x27;</span>]
columns = [<span class="hljs-string">&#x27;psavert&#x27;</span>, <span class="hljs-string">&#x27;uempmed&#x27;</span>]

<span class="hljs-comment"># Draw Plot (y1 is psavert -> columns[0], y2 is uempmed -> columns[1])</span>
fig, ax = plt.subplots(<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">9</span>), dpi=<span class="hljs-number">80</span>)
ax.fill_between(x, y1=y1, y2=<span class="hljs-number">0</span>, label=columns[<span class="hljs-number">0</span>], alpha=<span class="hljs-number">0.5</span>, color=mycolors[<span class="hljs-number">1</span>], linewidth=<span class="hljs-number">2</span>)
ax.fill_between(x, y1=y2, y2=<span class="hljs-number">0</span>, label=columns[<span class="hljs-number">1</span>], 
alpha=<span class="hljs-number">0.5</span>, color=mycolors[<span class="hljs-number">0</span>], linewidth=<span class="hljs-number">2</span>)

<span class="hljs-comment"># Decorations</span>
ax.set_title(<span class="hljs-string">&#x27;Personal Savings Rate vs Median Duration of Unemployment&#x27;</span>, fontsize=<span class="hljs-number">18</span>)
ax.<span class="hljs-built_in">set</span>(ylim=[<span class="hljs-number">0</span>, <span class="hljs-number">30</span>])
ax.legend(loc=<span class="hljs-string">&#x27;best&#x27;</span>, fontsize=<span class="hljs-number">12</span>)
plt.xticks(x[::<span class="hljs-number">50</span>], fontsize=<span class="hljs-number">10</span>, horizontalalignment=<span class="hljs-string">&#x27;center&#x27;</span>)
plt.yticks(np.arange(<span class="hljs-number">2.5</span>, <span class="hljs-number">30.0</span>, <span class="hljs-number">2.5</span>), fontsize=<span class="hljs-number">10</span>)
plt.xlim(-<span class="hljs-number">10</span>, x[-<span class="hljs-number">1</span>])

<span class="hljs-comment"># Draw Tick lines</span>
<span class="hljs-keyword">for</span> y <span class="hljs-keyword">in</span> np.arange(<span class="hljs-number">2.5</span>, <span class="hljs-number">30.0</span>, <span class="hljs-number">2.5</span>):
    plt.hlines(y, xmin=<span class="hljs-number">0</span>, xmax=<span class="hljs-built_in">len</span>(x), colors=<span class="hljs-string">&#x27;black&#x27;</span>, alpha=<span class="hljs-number">0.3</span>, linestyles=<span class="hljs-string">&quot;--&quot;</span>, lw=<span class="hljs-number">0.5</span>)

<span class="hljs-comment"># Lighten borders</span>
plt.gca().spines[<span class="hljs-string">&quot;top&quot;</span>].set_alpha(<span class="hljs-number">0</span>)
plt.gca().spines[<span class="hljs-string">&quot;bottom&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)
plt.gca().spines[<span class="hljs-string">&quot;right&quot;</span>].set_alpha(<span class="hljs-number">0</span>)
plt.gca().spines[<span 
class="hljs-string">&quot;left&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/50.png" alt="50"></p><h3><span id="45-ri-li-re-li-tu-calendar-heat-map"><font color="##4876FF">【45】日历热力图（Calendar Heat Map）</font></span></h3><p>与时间序列图相比，日历热力图是一种较少使用的时间数据可视化方式。它在视觉上很吸引人，但难以直接读出具体数值；不过，它能很好地展现极端值和节假日效应。</p><p>【译者 TRHX 注：在使用该方法时要先安装 calmap 库】</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib <span class="hljs-keyword">as</span> mpl<span class="hljs-keyword">import</span> calmap<span class="hljs-comment"># Import Data</span>df = pd.read_csv(<span class="hljs-string">&quot;https://raw.githubusercontent.com/selva86/datasets/master/yahoo.csv&quot;</span>, parse_dates=[<span class="hljs-string">&#x27;date&#x27;</span>])df.set_index(<span class="hljs-string">&#x27;date&#x27;</span>, inplace=<span class="hljs-literal">True</span>)<span class="hljs-comment"># Plot</span>plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)calmap.calendarplot(df.loc[<span class="hljs-string">&#x27;2014&#x27;</span>, <span class="hljs-string">&#x27;VIX.Close&#x27;</span>], fig_kws=&#123;<span class="hljs-string">&#x27;figsize&#x27;</span>: (<span class="hljs-number">16</span>, <span class="hljs-number">10</span>)&#125;,                    yearlabel_kws=&#123;<span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;black&#x27;</span>, <span class="hljs-string">&#x27;fontsize&#x27;</span>: <span class="hljs-number">14</span>&#125;, subplot_kws=&#123;<span class="hljs-string">&#x27;title&#x27;</span>: <span class="hljs-string">&#x27;Yahoo Stock Prices&#x27;</span>&#125;)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/51.png" alt="51"></p><h3><span id="46-ji-jie-tu-seasonal-plot"><font color="##4876FF">【46】季节图（Seasonal 
Plot）</font></span></h3><p>季节图可用于比较上一季度同一天（年/月/周等）时间序列的表现。</p><pre><code class="hljs python"><span class="hljs-keyword">from</span> dateutil.parser <span class="hljs-keyword">import</span> parse<span class="hljs-comment"># Import Data</span>df = pd.read_csv(<span class="hljs-string">&#x27;https://github.com/selva86/datasets/raw/master/AirPassengers.csv&#x27;</span>)<span class="hljs-comment"># Prepare data</span>df[<span class="hljs-string">&#x27;year&#x27;</span>] = [parse(d).year <span class="hljs-keyword">for</span> d <span class="hljs-keyword">in</span> df.date]df[<span class="hljs-string">&#x27;month&#x27;</span>] = [parse(d).strftime(<span class="hljs-string">&#x27;%b&#x27;</span>) <span class="hljs-keyword">for</span> d <span class="hljs-keyword">in</span> df.date]years = df[<span class="hljs-string">&#x27;year&#x27;</span>].unique()<span class="hljs-comment"># 译者 TRHX 添加了该行代码</span>df.rename(columns=&#123;<span class="hljs-string">&#x27;value&#x27;</span>: <span class="hljs-string">&#x27;traffic&#x27;</span>&#125;, inplace=<span class="hljs-literal">True</span>)<span class="hljs-comment"># Draw Plot</span>mycolors = [<span class="hljs-string">&#x27;tab:red&#x27;</span>, <span class="hljs-string">&#x27;tab:blue&#x27;</span>, <span class="hljs-string">&#x27;tab:green&#x27;</span>, <span class="hljs-string">&#x27;tab:orange&#x27;</span>, <span class="hljs-string">&#x27;tab:brown&#x27;</span>, <span class="hljs-string">&#x27;tab:grey&#x27;</span>, <span class="hljs-string">&#x27;tab:pink&#x27;</span>, <span class="hljs-string">&#x27;tab:olive&#x27;</span>,            <span class="hljs-string">&#x27;deeppink&#x27;</span>, <span class="hljs-string">&#x27;steelblue&#x27;</span>, <span class="hljs-string">&#x27;firebrick&#x27;</span>, <span class="hljs-string">&#x27;mediumseagreen&#x27;</span>]plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)<span class="hljs-keyword">for</span> 
i, y <span class="hljs-keyword">in</span> <span class="hljs-built_in">enumerate</span>(years):    plt.plot(<span class="hljs-string">&#x27;month&#x27;</span>, <span class="hljs-string">&#x27;traffic&#x27;</span>, data=df.loc[df.year == y, :], color=mycolors[i], label=y)    plt.text(df.loc[df.year == y, :].shape[<span class="hljs-number">0</span>] - <span class="hljs-number">.9</span>, df.loc[df.year == y, <span class="hljs-string">&#x27;traffic&#x27;</span>][-<span class="hljs-number">1</span>:].values[<span class="hljs-number">0</span>], y, fontsize=<span class="hljs-number">12</span>,             color=mycolors[i])<span class="hljs-comment"># Decoration</span>plt.ylim(<span class="hljs-number">50</span>, <span class="hljs-number">750</span>)plt.xlim(-<span class="hljs-number">0.3</span>, <span class="hljs-number">11</span>)plt.ylabel(<span class="hljs-string">&#x27;$Air Traffic$&#x27;</span>)plt.yticks(fontsize=<span class="hljs-number">12</span>, alpha=<span class="hljs-number">.7</span>)plt.title(<span class="hljs-string">&quot;Monthly Seasonal Plot: Air Passengers Traffic (1949 - 1960)&quot;</span>, fontsize=<span class="hljs-number">22</span>)plt.grid(axis=<span class="hljs-string">&#x27;y&#x27;</span>, alpha=<span class="hljs-number">.3</span>)<span class="hljs-comment"># Remove borders</span>plt.gca().spines[<span class="hljs-string">&quot;top&quot;</span>].set_alpha(<span class="hljs-number">0.0</span>)plt.gca().spines[<span class="hljs-string">&quot;bottom&quot;</span>].set_alpha(<span class="hljs-number">0.5</span>)plt.gca().spines[<span class="hljs-string">&quot;right&quot;</span>].set_alpha(<span class="hljs-number">0.0</span>)plt.gca().spines[<span class="hljs-string">&quot;left&quot;</span>].set_alpha(<span class="hljs-number">0.5</span>)<span class="hljs-comment"># plt.legend(loc=&#x27;upper right&#x27;, ncol=2, fontsize=12)</span>plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/52.png" alt="52"></p><h2><span 
id="9x00-fen-zu-groups"><font color="#FF0000">【9x00】分组（ Groups）</font></span></h2><h3><span id="47-shu-zhuang-tu-dendrogram"><font color="##4876FF">【47】树状图（Dendrogram）</font></span></h3><p>树状图根据给定的距离度量将相似的点组合在一起，并根据点的相似性将它们组织成树状链接。</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> scipy.cluster.hierarchy <span class="hljs-keyword">as</span> shc<span class="hljs-comment"># Import Data</span>df = pd.read_csv(<span class="hljs-string">&#x27;https://raw.githubusercontent.com/selva86/datasets/master/USArrests.csv&#x27;</span>)<span class="hljs-comment"># Plot</span>plt.figure(figsize=(<span class="hljs-number">16</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)plt.title(<span class="hljs-string">&quot;USArrests Dendrograms&quot;</span>, fontsize=<span class="hljs-number">22</span>)dend = shc.dendrogram(shc.linkage(df[[<span class="hljs-string">&#x27;Murder&#x27;</span>, <span class="hljs-string">&#x27;Assault&#x27;</span>, <span class="hljs-string">&#x27;UrbanPop&#x27;</span>, <span class="hljs-string">&#x27;Rape&#x27;</span>]], method=<span class="hljs-string">&#x27;ward&#x27;</span>), labels=df.State.values,                      color_threshold=<span class="hljs-number">100</span>)plt.xticks(fontsize=<span class="hljs-number">12</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/53.png" alt="53"></p><h3><span id="48-ju-lei-tu-cluster-plot"><font color="##4876FF">【48】聚类图（Cluster Plot）</font></span></h3><p>聚类图可以用来圈出属于同一个聚类的点。下面是一个基于 USArrests 数据集将美国各州分成 5 组的代表性示例。这个聚类图使用 ‘Murder’ 和 ‘Assault’ 两列作为 X 轴和 Y 轴。或者，您也可以将前两个主成分用作 X 轴和 Y 轴。</p><p>【译者 TRHX 注：在使用该方法时要先安装 scikit-learn 库（导入名为 sklearn）】</p><pre><code class="hljs python"><span class="hljs-keyword">from</span> sklearn.cluster <span class="hljs-keyword">import</span> AgglomerativeClustering<span class="hljs-keyword">from</span> scipy.spatial <span class="hljs-keyword">import</span> ConvexHull<span class="hljs-comment"># Import 
Data</span>df = pd.read_csv(<span class="hljs-string">&#x27;https://raw.githubusercontent.com/selva86/datasets/master/USArrests.csv&#x27;</span>)<span class="hljs-comment"># Agglomerative Clustering (ward linkage always uses Euclidean distance)</span>cluster = AgglomerativeClustering(n_clusters=<span class="hljs-number">5</span>, linkage=<span class="hljs-string">&#x27;ward&#x27;</span>)cluster.fit_predict(df[[<span class="hljs-string">&#x27;Murder&#x27;</span>, <span class="hljs-string">&#x27;Assault&#x27;</span>, <span class="hljs-string">&#x27;UrbanPop&#x27;</span>, <span class="hljs-string">&#x27;Rape&#x27;</span>]])<span class="hljs-comment"># Plot</span>plt.figure(figsize=(<span class="hljs-number">14</span>, <span class="hljs-number">10</span>), dpi=<span class="hljs-number">80</span>)plt.scatter(df.iloc[:, <span class="hljs-number">0</span>], df.iloc[:, <span class="hljs-number">1</span>], c=cluster.labels_, cmap=<span class="hljs-string">&#x27;tab10&#x27;</span>)<span class="hljs-comment"># Encircle</span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">encircle</span>(<span class="hljs-params">x, y, ax=<span class="hljs-literal">None</span>, **kw</span>):</span>    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> ax: ax = plt.gca()    p = np.c_[x, y]    hull = ConvexHull(p)    poly = plt.Polygon(p[hull.vertices,:], **kw)    ax.add_patch(poly)<span class="hljs-comment"># Draw polygon surrounding vertices</span>encircle(df.loc[cluster.labels_ == <span class="hljs-number">0</span>, <span class="hljs-string">&#x27;Murder&#x27;</span>], df.loc[cluster.labels_ == <span class="hljs-number">0</span>, <span class="hljs-string">&#x27;Assault&#x27;</span>], ec=<span class="hljs-string">&quot;k&quot;</span>, fc=<span class="hljs-string">&quot;gold&quot;</span>, alpha=<span class="hljs-number">0.2</span>, linewidth=<span class="hljs-number">0</span>)encircle(df.loc[cluster.labels_ 
== <span class="hljs-number">1</span>, <span class="hljs-string">&#x27;Murder&#x27;</span>], df.loc[cluster.labels_ == <span class="hljs-number">1</span>, <span class="hljs-string">&#x27;Assault&#x27;</span>], ec=<span class="hljs-string">&quot;k&quot;</span>, fc=<span class="hljs-string">&quot;tab:blue&quot;</span>, alpha=<span class="hljs-number">0.2</span>, linewidth=<span class="hljs-number">0</span>)encircle(df.loc[cluster.labels_ == <span class="hljs-number">2</span>, <span class="hljs-string">&#x27;Murder&#x27;</span>], df.loc[cluster.labels_ == <span class="hljs-number">2</span>, <span class="hljs-string">&#x27;Assault&#x27;</span>], ec=<span class="hljs-string">&quot;k&quot;</span>, fc=<span class="hljs-string">&quot;tab:red&quot;</span>, alpha=<span class="hljs-number">0.2</span>, linewidth=<span class="hljs-number">0</span>)encircle(df.loc[cluster.labels_ == <span class="hljs-number">3</span>, <span class="hljs-string">&#x27;Murder&#x27;</span>], df.loc[cluster.labels_ == <span class="hljs-number">3</span>, <span class="hljs-string">&#x27;Assault&#x27;</span>], ec=<span class="hljs-string">&quot;k&quot;</span>, fc=<span class="hljs-string">&quot;tab:green&quot;</span>, alpha=<span class="hljs-number">0.2</span>, linewidth=<span class="hljs-number">0</span>)encircle(df.loc[cluster.labels_ == <span class="hljs-number">4</span>, <span class="hljs-string">&#x27;Murder&#x27;</span>], df.loc[cluster.labels_ == <span class="hljs-number">4</span>, <span class="hljs-string">&#x27;Assault&#x27;</span>], ec=<span class="hljs-string">&quot;k&quot;</span>, fc=<span class="hljs-string">&quot;tab:orange&quot;</span>, alpha=<span class="hljs-number">0.2</span>, linewidth=<span class="hljs-number">0</span>)<span class="hljs-comment"># Decorations</span>plt.xlabel(<span class="hljs-string">&#x27;Murder&#x27;</span>); plt.xticks(fontsize=<span class="hljs-number">12</span>)plt.ylabel(<span class="hljs-string">&#x27;Assault&#x27;</span>); plt.yticks(fontsize=<span 
class="hljs-number">12</span>)plt.title(<span class="hljs-string">&#x27;Agglomerative Clustering of USArrests (5 Groups)&#x27;</span>, fontsize=<span class="hljs-number">22</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/54.png" alt="54"></p><h3><span id="49-an-de-lu-si-qu-xian-andrews-curve"><font color="##4876FF">【49】安德鲁斯曲线（Andrews Curve）</font></span></h3><p>安德鲁斯曲线有助于可视化是否存在基于给定分组的数值特征的固有分组。如果特征（数据集中的列）不能帮助区分组（cyl），则行将不会像下图所示被很好地分隔开。</p><pre><code class="hljs python"><span class="hljs-keyword">from</span> pandas.plotting <span class="hljs-keyword">import</span> andrews_curves<span class="hljs-comment"># Import</span>df = pd.read_csv(<span class="hljs-string">&quot;https://github.com/selva86/datasets/raw/master/mtcars.csv&quot;</span>)df.drop([<span class="hljs-string">&#x27;cars&#x27;</span>, <span class="hljs-string">&#x27;carname&#x27;</span>], axis=<span class="hljs-number">1</span>, inplace=<span class="hljs-literal">True</span>)<span class="hljs-comment"># Plot</span>plt.figure(figsize=(<span class="hljs-number">12</span>, <span class="hljs-number">9</span>), dpi=<span class="hljs-number">80</span>)andrews_curves(df, <span class="hljs-string">&#x27;cyl&#x27;</span>, colormap=<span class="hljs-string">&#x27;Set1&#x27;</span>)<span class="hljs-comment"># Lighten borders</span>plt.gca().spines[<span class="hljs-string">&quot;top&quot;</span>].set_alpha(<span class="hljs-number">0</span>)plt.gca().spines[<span class="hljs-string">&quot;bottom&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)plt.gca().spines[<span class="hljs-string">&quot;right&quot;</span>].set_alpha(<span class="hljs-number">0</span>)plt.gca().spines[<span class="hljs-string">&quot;left&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)plt.title(<span class="hljs-string">&#x27;Andrews Curves of mtcars&#x27;</span>, fontsize=<span class="hljs-number">22</span>)plt.xlim(-<span class="hljs-number">3</span>, <span 
class="hljs-number">3</span>)plt.grid(alpha=<span class="hljs-number">0.3</span>)plt.xticks(fontsize=<span class="hljs-number">12</span>)plt.yticks(fontsize=<span class="hljs-number">12</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/024/55.png" alt="55"></p><h3><span id="50-ping-xing-zuo-biao-tu-parallel-coordinates"><font color="##4876FF">【50】平行坐标图（Parallel Coordinates）</font></span></h3><p>平行坐标图有助于直观判断某个特征能否有效地区分各组。如果某个特征能让各组明显分开，那么该特征在预测分组时很可能非常有用。</p><pre><code class="hljs python"><span class="hljs-keyword">from</span> pandas.plotting <span class="hljs-keyword">import</span> parallel_coordinates<span class="hljs-comment"># Import Data</span>df_final = pd.read_csv(<span class="hljs-string">&quot;https://raw.githubusercontent.com/selva86/datasets/master/diamonds_filter.csv&quot;</span>)<span class="hljs-comment"># Plot</span>plt.figure(figsize=(<span class="hljs-number">12</span>, <span class="hljs-number">9</span>), dpi=<span class="hljs-number">80</span>)parallel_coordinates(df_final, <span class="hljs-string">&#x27;cut&#x27;</span>, colormap=<span class="hljs-string">&#x27;Dark2&#x27;</span>)<span class="hljs-comment"># Lighten borders</span>plt.gca().spines[<span class="hljs-string">&quot;top&quot;</span>].set_alpha(<span class="hljs-number">0</span>)plt.gca().spines[<span class="hljs-string">&quot;bottom&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)plt.gca().spines[<span class="hljs-string">&quot;right&quot;</span>].set_alpha(<span class="hljs-number">0</span>)plt.gca().spines[<span class="hljs-string">&quot;left&quot;</span>].set_alpha(<span class="hljs-number">.3</span>)plt.title(<span class="hljs-string">&#x27;Parallel Coordinates of Diamonds&#x27;</span>, fontsize=<span class="hljs-number">22</span>)plt.grid(alpha=<span class="hljs-number">0.3</span>)plt.xticks(fontsize=<span class="hljs-number">12</span>)plt.yticks(fontsize=<span class="hljs-number">12</span>)plt.show()</code></pre><p><img 
src="https://static.wukongsec.com/itbob/images/article/024/56.png" alt="56"></p><hr><pre><code class="hljs yaml"><span class="hljs-string">这里是一段物理防爬虫文本，请读者忽略。</span><span class="hljs-string">本译文首发于</span> <span class="hljs-string">CSDN，作者</span> <span class="hljs-string">Selva</span> <span class="hljs-string">Prabhakaran，译者</span> <span class="hljs-string">TRHX。</span><span class="hljs-string">本文链接：https://itrhx.blog.csdn.net/article/details/106558131</span><span class="hljs-string">原文链接：https://www.machinelearningplus.com/plots/top-50-matplotlib-visualizations-the-master-plots-python/</span></code></pre><hr>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;文章目录&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-1x00-jie-shao-introduction-font&quot;&gt;&lt;font col</summary>
      
    
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    <category term="Matplotlib" scheme="https://www.itbob.cn/tags/Matplotlib/"/>
    
  </entry>
  
  <entry>
    <title>Python 数据分析三剑客之 Matplotlib（十）：3D 图的绘制</title>
    <link href="https://www.itbob.cn/article/023/"/>
    <id>https://www.itbob.cn/article/023/</id>
    <published>2020-06-07T16:00:08.000Z</published>
    <updated>2022-05-22T12:35:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">文章目录</span></h2><!-- toc --><ul><li><a href="#font-color-ff0000-01x00-liao-jie-mplot3d-toolkit-font"><font color="#FF0000">【01x00】了解 mplot3d Toolkit</font></a><ul><li><a href="#font-color-4876ff-01x01-axes3d-dui-xiang-chuang-jian-fang-fa-yi-axes3d-fig-font"><font color="##4876FF">【01x01】Axes3D 对象创建方法一：Axes3D(fig)</font></a></li><li><a href="#font-color-4876ff-01x02-axes3d-dui-xiang-chuang-jian-fang-fa-er-add-subplot-font"><font color="##4876FF">【01x02】Axes3D 对象创建方法二：add_subplot</font></a></li><li><a href="#font-color-4876ff-01x03-axes3d-dui-xiang-chuang-jian-fang-fa-san-gca-font"><font color="##4876FF">【01x03】Axes3D 对象创建方法三：gca</font></a></li></ul></li><li><a href="#font-color-ff0000-02x00-cmap-yu-colorbar-font"><font color="#FF0000">【02x00】cmap 与 colorbar</font></a></li><li><a href="#font-color-ff0000-03x00-3d-xian-xing-tu-axes3d-plot-font"><font color="#FF0000">【03x00】3D 线性图：Axes3D.plot</font></a></li><li><a href="#font-color-ff0000-04x00-3d-san-dian-tu-axes3d-scatter-font"><font color="#FF0000">【04x00】3D 散点图：Axes3D.scatter</font></a></li><li><a href="#font-color-ff0000-05x00-3d-xian-kuang-tu-axes3d-plot-wireframe-font"><font color="#FF0000">【05x00】3D 线框图：Axes3D.plot_wireframe</font></a></li><li><a href="#font-color-ff0000-06x00-3d-qu-mian-tu-axes3d-plot-surface-font"><font color="#FF0000">【06x00】3D 曲面图：Axes3D.plot_surface</font></a></li><li><a href="#font-color-ff0000-07x00-3d-zhu-zhuang-tu-axes3d-bar-font"><font color="#FF0000">【07x00】3D 柱状图：Axes3D.bar</font></a></li><li><a href="#font-color-ff0000-08x00-3d-jian-tou-tu-axes3d-quiver-font"><font color="#FF0000">【08x00】3D 箭头图：Axes3D.quiver</font></a></li><li><a href="#font-color-ff0000-09x00-3d-deng-gao-xian-tu-axes3d-contour-font"><font color="#FF0000">【09x00】3D 等高线图：Axes3D.contour</font></a></li><li><a href="#font-color-ff0000-10x00-3d-deng-gao-xian-tian-chong-tu-axes3d-contourf-font"><font color="#FF0000">【10x00】3D 
等高线填充图：Axes3D.contourf</font></a></li><li><a href="#font-color-ff0000-11x00-3d-san-jiao-qu-mian-tu-axes3d-plot-trisurf-font"><font color="#FF0000">【11x00】3D 三角曲面图：Axes3D.plot_trisurf</font></a></li><li><a href="#font-color-ff0000-12x00-jiang-2d-tu-xiang-ju-he-dao-3d-tu-xiang-zhong-axes3d-add-collection3d-font"><font color="#FF0000">【12x00】将 2D 图像聚合到 3D 图像中：Axes3D.add_collection3d</font></a></li><li><a href="#font-color-ff0000-13x00-3d-tu-tian-jia-wen-ben-miao-shu-axes3d-text-font"><font color="#FF0000">【13x00】3D 图添加文本描述：Axes3D.text</font></a></li></ul><!-- tocstop --><hr><p>Matplotlib 系列文章：</p><ul><li><a href="https://www.itbob.cn/article/014/">Python 数据分析三剑客之 Matplotlib（一）：初识 Matplotlib 与其 matplotibrc 配置文件</a></li><li><a href="https://www.itbob.cn/article/015/">Python 数据分析三剑客之 Matplotlib（二）：文本描述 / 中文支持 / 画布 / 网格等基本图像属性</a></li><li><a href="https://www.itbob.cn/article/016/">Python 数据分析三剑客之 Matplotlib（三）：图例 / LaTeX / 刻度 / 子图 / 补丁等基本图像属性</a></li><li><a href="https://www.itbob.cn/article/017/">Python 数据分析三剑客之 Matplotlib（四）：线性图的绘制</a></li><li><a href="https://www.itbob.cn/article/018/">Python 数据分析三剑客之 Matplotlib（五）：散点图的绘制</a></li><li><a href="https://www.itbob.cn/article/019/">Python 数据分析三剑客之 Matplotlib（六）：直方图 / 柱状图 / 条形图的绘制</a></li><li><a href="https://www.itbob.cn/article/020/">Python 数据分析三剑客之 Matplotlib（七）：饼状图的绘制</a></li><li><a href="https://www.itbob.cn/article/021/">Python 数据分析三剑客之 Matplotlib（八）：等高线 / 等值线图的绘制</a></li><li><a href="https://www.itbob.cn/article/022/">Python 数据分析三剑客之 Matplotlib（九）：极区图 / 极坐标图 / 雷达图的绘制</a></li><li><a href="https://www.itbob.cn/article/023/">Python 数据分析三剑客之 Matplotlib（十）：3D 图的绘制</a></li><li><a href="https://www.itbob.cn/article/024/">Python 数据分析三剑客之 Matplotlib（十一）：最热门最常用的 50 个图表</a>【译文】</li></ul><hr><p>专栏：</p><ul><li>NumPy 专栏：<a href="https://itrhx.blog.csdn.net/category_9780393.html">https://itrhx.blog.csdn.net/category_9780393.html</a></li><li>Pandas 专栏：<a 
href="https://itrhx.blog.csdn.net/category_9780397.html">https://itrhx.blog.csdn.net/category_9780397.html</a></li><li>Matplotlib 专栏：<a href="https://itrhx.blog.csdn.net/category_9780418.html">https://itrhx.blog.csdn.net/category_9780418.html</a></li></ul><br>推荐学习资料与网站：<br><br><ul><li>NumPy 官方中文网：<a href="https://www.numpy.org.cn/">https://www.numpy.org.cn/</a></li><li>Pandas 官方中文网：<a href="https://www.pypandas.cn/">https://www.pypandas.cn/</a></li><li>Matplotlib 官方中文网：<a href="https://www.matplotlib.org.cn/">https://www.matplotlib.org.cn/</a></li><li>NumPy、Matplotlib、Pandas 速查表：<a href="https://github.com/TRHX/Python-quick-reference-table">https://github.com/TRHX/Python-quick-reference-table</a></li></ul><hr><pre><code class="hljs yaml"><span class="hljs-string">这里是一段物理防爬虫文本，请读者忽略。</span><span class="hljs-string">本文原创首发于</span> <span class="hljs-string">CSDN，作者</span> <span class="hljs-string">ITBOB。</span><span class="hljs-string">博客首页：https://itrhx.blog.csdn.net/</span><span class="hljs-string">本文链接：https://itrhx.blog.csdn.net/article/details/106558131</span><span class="hljs-string">未经授权，禁止转载！恶意转载，后果自负！尊重原创，远离剽窃！</span></code></pre><hr><h2><span id="01x00-liao-jie-mplot3d-toolkit"><font color="#FF0000">【01x00】了解 mplot3d Toolkit</font></span></h2><p>mplot3d Toolkit 即 mplot3d 工具包，在 matplotlib 中使用 mplot3d 工具包可以绘制 3D 图。</p><p>mplot3d 官方文档：<a href="https://matplotlib.org/tutorials/toolkits/mplot3d.html">https://matplotlib.org/tutorials/toolkits/mplot3d.html</a></p><p>在 matplotlib 中，figure 为画布，axes 为绘图区，<code>fig.add_subplot()</code>、<code>plt.subplot()</code> 方法均可以创建子图，在绘制 3D 图时，某些 2D 图的参数也适用于 3D 图，在本文的示例中，可能会用到的一些没有具体解释的函数或者参数，其用法均可在前面的系列文章中找到：</p><ul><li><p><a href="https://itrhx.blog.csdn.net/article/details/105638122">《Python 数据分析三剑客之 Matplotlib（一）：初识 Matplotlib 与其 matplotibrc 配置文件》</a></p></li><li><p><a href="https://itrhx.blog.csdn.net/article/details/105828049">《Python 数据分析三剑客之 Matplotlib（二）：文本描述 / 中文支持 / 画布 / 网格等基本图像属性》</a></p></li><li><p><a 
href="https://itrhx.blog.csdn.net/article/details/105828143">《Python 数据分析三剑客之 Matplotlib（三）：图例 / LaTeX / 刻度 / 子图 / 补丁等基本图像属性》</a></p></li></ul><p><font color="#FF0000"><strong>绘制 3D 图的步骤：创建 Axes3D 对象，然后调用 Axes3D 的不同方法来绘制不同类型的 3D 图。以下介绍三种 Axes3D 对象创建的方法。</strong></font></p><h3><span id="01x01-axes3d-dui-xiang-chuang-jian-fang-fa-yi-axes3d-fig"><font color="##4876FF">【01x01】Axes3D 对象创建方法一：Axes3D(fig)</font></span></h3><p>在 Matplotlib 1.0.0 版本中，绘制 3D 图需要先导入 Axes3D 类，获取 figure 画布对象 fig 后，通过 Axes3D(fig) 来创建 Axes3D 对象。自 Matplotlib 3.4 起，Axes3D(fig) 自动将自身添加到画布的行为已被弃用，较新的版本需要再调用 <code>fig.add_axes(ax)</code>，否则画布上不会显示该绘图区，具体方法如下：</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt<span class="hljs-keyword">from</span> mpl_toolkits.mplot3d <span class="hljs-keyword">import</span> Axes3D<span class="hljs-comment"># 获取 figure 画布并创建 Axes3D 对象，再显式添加到画布</span>fig = plt.figure()ax = Axes3D(fig)fig.add_axes(ax)<span class="hljs-comment"># 数据坐标</span>z = np.linspace(<span class="hljs-number">0</span>, <span class="hljs-number">15</span>, <span class="hljs-number">1000</span>)x = np.sin(z)y = np.cos(z)<span class="hljs-comment"># 绘制线性图</span>ax.plot(x, y, z)plt.show()</code></pre><h3><span id="01x02-axes3d-dui-xiang-chuang-jian-fang-fa-er-add-subplot"><font color="##4876FF">【01x02】Axes3D 对象创建方法二：add_subplot</font></span></h3><p>在 Matplotlib 3.2.0 版本中，绘制 3D 图可以通过创建子图，然后指定 projection 参数为 3d 即可，返回的 ax 为 Axes3D 对象，以下两种方法均可：</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt<span class="hljs-comment"># 获取 figure 画布并通过子图创建 Axes3D 对象</span>fig = plt.figure()ax = fig.add_subplot(<span class="hljs-number">111</span>, projection=<span class="hljs-string">&#x27;3d&#x27;</span>)<span class="hljs-comment"># 数据坐标</span>z = np.linspace(<span class="hljs-number">0</span>, <span 
class="hljs-number">15</span>, <span class="hljs-number">1000</span>)x = np.sin(z)y = np.cos(z)<span class="hljs-comment"># 绘制线性图</span>ax.plot(x, y, z)plt.show()</code></pre><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt<span class="hljs-comment"># 通过子图创建 Axes3D 对象</span>ax = plt.subplot(<span class="hljs-number">111</span>, projection=<span class="hljs-string">&#x27;3d&#x27;</span>)<span class="hljs-comment"># 数据坐标</span>z = np.linspace(<span class="hljs-number">0</span>, <span class="hljs-number">15</span>, <span class="hljs-number">1000</span>)x = np.sin(z)y = np.cos(z)<span class="hljs-comment"># 绘制线性图</span>ax.plot(x, y, z)plt.show()</code></pre><h3><span id="01x03-axes3d-dui-xiang-chuang-jian-fang-fa-san-gca"><font color="##4876FF">【01x03】Axes3D 对象创建方法三：gca</font></span></h3><p>除了以上两种方法以外，还可以先获取画布对象 fig，再通过 <code>fig.gca()</code> 方法获取当前绘图区（gca = Get Current Axes），然后指定 projection 参数为 3d 即可，返回的 ax 为 Axes3D 对象。需要注意：向 <code>gca()</code> 传递关键字参数的用法自 Matplotlib 3.4 起已弃用，并在 3.6 中移除，因此该方法仅适用于旧版本，新版本请使用方法二。</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt<span class="hljs-comment"># 依次获取画布和绘图区并创建 Axes3D 对象（仅适用于 Matplotlib 3.6 之前的版本）</span>fig = plt.figure()ax = fig.gca(projection=<span class="hljs-string">&#x27;3d&#x27;</span>)<span class="hljs-comment"># 数据坐标</span>z = np.linspace(<span class="hljs-number">0</span>, <span class="hljs-number">15</span>, <span class="hljs-number">1000</span>)x = np.sin(z)y = np.cos(z)<span class="hljs-comment"># 绘制线性图</span>ax.plot(x, y, z)plt.show()</code></pre><p>以上三种方法运行结果均为下图：</p><p><img src="https://static.wukongsec.com/itbob/images/article/023/01.png" alt="01"></p><h2><span id="02x00-cmap-yu-colorbar"><font color="#FF0000">【02x00】cmap 与 colorbar</font></span></h2><p>默认情况下，散点图、线性图、曲面图等将以纯色着色，但可以通过提供 cmap 
参数支持颜色映射。cmap 参数用于设置一些特殊的颜色组合，如渐变色等，参数取值通常为 Colormap 中的值，具体取值可参见下图：</p><p>官方文档：<a href="https://matplotlib.org/tutorials/colors/colormaps.html">https://matplotlib.org/tutorials/colors/colormaps.html</a></p><p><img src="https://static.wukongsec.com/itbob/images/article/023/02.jpg" alt="02"></p><p>如果使用了 cmap 参数，则可以使用 <code>pyplot.colorbar()</code> 函数来绘制一个色条，即颜色对照条。</p><p>基本语法：<code>matplotlib.pyplot.colorbar([mappable=None, cax=None, ax=None, **kw])</code></p><p>部分参数解释如下表，其他参数，如长度，宽度等请参考<a href="https://matplotlib.org/api/_as_gen/matplotlib.pyplot.colorbar.html">官方文档</a>。</p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>mappable</td><td>要设置色条的图像对象，该参数对于 <code>Figure.colorbar</code> 方法是必需的，但对于 <code>pyplot.colorbar</code> 函数是可选的</td></tr><tr><td>cax</td><td>可选项，要绘制色条的轴</td></tr><tr><td>ax</td><td>可选项，设置色条的显示位置，通常在一个画布上有多个子图时使用</td></tr><tr><td>**kw</td><td>可选项，其他关键字参数，参考<a href="https://matplotlib.org/api/_as_gen/matplotlib.pyplot.colorbar.html">官方文档</a></td></tr></tbody></table><h2><span id="03x00-3d-xian-xing-tu-axes3d-plot"><font color="#FF0000">【03x00】3D 线性图：Axes3D.plot</font></span></h2><p>基本方法：<code>Axes3D.plot(xs, ys[, zs, zdir='z', *args, **kwargs])</code></p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>xs</td><td>一维数组，点的 x 轴坐标</td></tr><tr><td>ys</td><td>一维数组，点的 y 轴坐标</td></tr><tr><td>zs</td><td>一维数组，可选项，点的 z 轴坐标</td></tr><tr><td>zdir</td><td>可选项，在 3D 轴上绘制 2D 数据时，数据必须以 xs，ys 的形式传递，<br>若此时将 zdir 设置为 ‘y’，数据将会被绘制到 x-z 轴平面上，默认为 ‘z’</td></tr><tr><td>**kwargs</td><td>其他关键字参数，可选项，可参见 <a href="https://matplotlib.org/api/_as_gen/matplotlib.axes.Axes.plot.html">matplotlib.axes.Axes.plot</a></td></tr></tbody></table><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt<span class="hljs-comment"># 设置中文显示</span>plt.rcParams[<span 
class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]<span class="hljs-comment"># 获取 figure 画布并通过子图创建 Axes3D 对象</span>fig = plt.figure()ax = fig.add_subplot(<span class="hljs-number">111</span>, projection=<span class="hljs-string">&#x27;3d&#x27;</span>)<span class="hljs-comment"># 第一条3D线性图数据</span>theta = np.linspace(-<span class="hljs-number">4</span> * np.pi, <span class="hljs-number">4</span> * np.pi, <span class="hljs-number">100</span>)z1 = np.linspace(-<span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-number">100</span>)r = z1**<span class="hljs-number">2</span> + <span class="hljs-number">1</span>x1 = r * np.sin(theta)y1 = r * np.cos(theta)<span class="hljs-comment"># 第二条3D线性图数据</span>z2 = np.linspace(-<span class="hljs-number">3</span>, <span class="hljs-number">3</span>, <span class="hljs-number">100</span>)x2 = np.sin(z2)y2 = np.cos(z2)<span class="hljs-comment"># 绘制3D线性图</span>ax.plot(x1, y1, z1, color=<span class="hljs-string">&#x27;b&#x27;</span>, label=<span class="hljs-string">&#x27;3D 线性图一&#x27;</span>)ax.plot(x2, y2, z2, color=<span class="hljs-string">&#x27;r&#x27;</span>, label=<span class="hljs-string">&#x27;3D 线性图二&#x27;</span>)<span class="hljs-comment"># 设置标题、轴标签、图例，也可以直接使用 plt.title、plt.xlabel、plt.legend...</span>ax.set_title(<span class="hljs-string">&#x27;绘制 3D 线性图示例&#x27;</span>, pad=<span class="hljs-number">15</span>, fontsize=<span class="hljs-string">&#x27;12&#x27;</span>)ax.set_xlabel(<span class="hljs-string">&#x27;x 轴&#x27;</span>, color=<span class="hljs-string">&#x27;r&#x27;</span>, fontsize=<span class="hljs-string">&#x27;12&#x27;</span>)ax.set_ylabel(<span class="hljs-string">&#x27;y 轴&#x27;</span>, color=<span class="hljs-string">&#x27;g&#x27;</span>, fontsize=<span class="hljs-string">&#x27;12&#x27;</span>)ax.set_zlabel(<span class="hljs-string">&#x27;z 轴&#x27;</span>, color=<span class="hljs-string">&#x27;b&#x27;</span>, fontsize=<span 
class="hljs-string">&#x27;12&#x27;</span>)ax.legend()plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/023/03.png" alt="03"></p><h2><span id="04x00-3d-san-dian-tu-axes3d-scatter"><font color="#FF0000">【04x00】3D 散点图：Axes3D.scatter</font></span></h2><p>基本方法：<code>Axes3D.scatter(xs, ys[, zs=0, zdir='z', s=20, c=None, depthshade=True, *args, **kwargs])</code></p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>xs</td><td>一维数组，点的 x 轴坐标</td></tr><tr><td>ys</td><td>一维数组，点的 y 轴坐标</td></tr><tr><td>zs</td><td>一维数组，可选项，点的 z 轴坐标</td></tr><tr><td>zdir</td><td>可选项，在 3D 轴上绘制 2D 数据时，数据必须以 xs，ys 的形式传递，<br>若此时将 zdir 设置为 ‘y’，数据将会被绘制到 x-z 轴平面上，默认为 ‘z’</td></tr><tr><td>s</td><td>标量或数组类型，可选项，标记的大小，默认 20</td></tr><tr><td>c</td><td>标记的颜色，可选项，可以是单个颜色或者一个颜色列表<br>支持英文颜色名称及其简写、十六进制颜色码等，更多颜色示例参见官网 <a href="https://matplotlib.org/gallery/color/color_demo.html">Color Demo</a></td></tr><tr><td>depthshade</td><td>bool 值，可选项，默认 True，是否为散点标记着色以提供深度外观</td></tr><tr><td>**kwargs</td><td>其他关键字参数，可选项，可参见 <a href="https://matplotlib.org/api/_as_gen/matplotlib.axes.Axes.scatter.html">scatter</a></td></tr></tbody></table><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> pltplt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]<span class="hljs-comment"># 获取 figure 画布并通过子图创建 Axes3D 对象</span>fig = plt.figure()ax = fig.add_subplot(<span class="hljs-number">111</span>, projection=<span class="hljs-string">&#x27;3d&#x27;</span>)n = <span class="hljs-number">100</span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">randrange</span>(<span class="hljs-params">n, vmin, vmax</span>):</span>    <span class="hljs-keyword">return</span> (vmax - vmin)*np.random.rand(n) + vmin<span class="hljs-string">&#x27;&#x27;&#x27;</span><span 
class="hljs-string">定义绘制 n 个随机点，设置每一组数据点的样式和范围</span><span class="hljs-string">x轴数据位于[23，32]区间，y轴数据位于[0，100]区间，z轴数据位于[zlow，zhigh]区间</span><span class="hljs-string">&#x27;&#x27;&#x27;</span><span class="hljs-keyword">for</span> m, zlow, zhigh <span class="hljs-keyword">in</span> [(<span class="hljs-string">&#x27;o&#x27;</span>, -<span class="hljs-number">50</span>, -<span class="hljs-number">25</span>), (<span class="hljs-string">&#x27;^&#x27;</span>, -<span class="hljs-number">30</span>, -<span class="hljs-number">5</span>)]:    xs = randrange(n, <span class="hljs-number">23</span>, <span class="hljs-number">32</span>)    ys = randrange(n, <span class="hljs-number">0</span>, <span class="hljs-number">100</span>)    zs = randrange(n, zlow, zhigh)    ax.scatter(xs, ys, zs, marker=m)<span class="hljs-comment"># 设置标题、轴标签、图例，也可以直接使用 plt.title、plt.xlabel...</span>ax.set_title(<span class="hljs-string">&#x27;绘制 3D 散点图示例&#x27;</span>, pad=<span class="hljs-number">15</span>, fontsize=<span class="hljs-string">&#x27;12&#x27;</span>)ax.set_xlabel(<span class="hljs-string">&#x27;x 轴&#x27;</span>, color=<span class="hljs-string">&#x27;b&#x27;</span>)ax.set_ylabel(<span class="hljs-string">&#x27;y 轴&#x27;</span>, color=<span class="hljs-string">&#x27;b&#x27;</span>)ax.set_zlabel(<span class="hljs-string">&#x27;z 轴&#x27;</span>, color=<span class="hljs-string">&#x27;b&#x27;</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/023/04.png" alt="04"></p><h2><span id="05x00-3d-xian-kuang-tu-axes3d-plot-wireframe"><font color="#FF0000">【05x00】3D 线框图：Axes3D.plot_wireframe</font></span></h2><p>基本方法：<code>Axes3D.plot_wireframe(X, Y, Z[, *args, **kwargs])</code></p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>X</td><td>二维数组，x 轴数据</td></tr><tr><td>Y</td><td>二维数组，y 轴数据</td></tr><tr><td>Z</td><td>二维数组，z 轴数据</td></tr><tr><td>**kwargs</td><td>其他关键字参数，可选项，如线条样式颜色等，可参见 <a 
href="https://matplotlib.org/api/_as_gen/mpl_toolkits.mplot3d.art3d.Line3DCollection.html">Line3DCollection</a></td></tr></tbody></table><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> pltplt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]<span class="hljs-comment"># 获取 figure 画布并通过子图创建 Axes3D 对象</span>fig = plt.figure()ax = fig.add_subplot(<span class="hljs-number">111</span>, projection=<span class="hljs-string">&#x27;3d&#x27;</span>)<span class="hljs-comment"># 定义Z轴坐标的生成方法</span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">f</span>(<span class="hljs-params">m, n</span>):</span>    <span class="hljs-keyword">return</span> np.sin(np.sqrt(m ** <span class="hljs-number">2</span> + n ** <span class="hljs-number">2</span>))<span class="hljs-comment"># 设置3D线框图数据</span>x = np.linspace(-<span class="hljs-number">6</span>, <span class="hljs-number">6</span>, <span class="hljs-number">30</span>)y = np.linspace(-<span class="hljs-number">6</span>, <span class="hljs-number">6</span>, <span class="hljs-number">30</span>)<span class="hljs-comment"># 生成网格点坐标矩阵，该方法在系列文章八中有具体介绍</span>X, Y = np.meshgrid(x, y)Z = f(X, Y)<span class="hljs-comment"># 绘制3D线框图</span>ax.plot_wireframe(X, Y, Z, color=<span class="hljs-string">&#x27;c&#x27;</span>)<span class="hljs-comment"># 设置标题、轴标签、图例，也可以直接使用 plt.title、plt.xlabel...</span>ax.set_title(<span class="hljs-string">&#x27;绘制 3D 线框图示例&#x27;</span>, pad=<span class="hljs-number">15</span>, fontsize=<span class="hljs-string">&#x27;12&#x27;</span>)ax.set_xlabel(<span class="hljs-string">&#x27;x 轴&#x27;</span>)ax.set_ylabel(<span class="hljs-string">&#x27;y 轴&#x27;</span>)ax.set_zlabel(<span class="hljs-string">&#x27;z 
轴&#x27;</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/023/05.png" alt="05"></p><h2><span id="06x00-3d-qu-mian-tu-axes3d-plot-surface"><font color="#FF0000">【06x00】3D 曲面图：Axes3D.plot_surface</font></span></h2><p>基本方法：<code>Axes3D.plot_surface(X, Y, Z[, *args, vmin=None, vmax=None, **kwargs])</code></p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>X</td><td>二维数组，x 轴数据</td></tr><tr><td>Y</td><td>二维数组，y 轴数据</td></tr><tr><td>Z</td><td>二维数组，z 轴数据</td></tr><tr><td>vmin / vmax</td><td>可选项，颜色映射归一化时使用的数据下限 / 上限</td></tr><tr><td>**kwargs</td><td>其他关键字参数，可选项，如曲面颜色、透明度等，可参见 <a href="https://matplotlib.org/api/_as_gen/mpl_toolkits.mplot3d.art3d.Poly3DCollection.html">Poly3DCollection</a></td></tr></tbody></table><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
<span class="hljs-comment"># 获取 figure 画布并通过子图创建 Axes3D 对象</span>
fig = plt.figure()
ax = fig.add_subplot(<span class="hljs-number">111</span>, projection=<span class="hljs-string">&#x27;3d&#x27;</span>)
<span class="hljs-comment"># 设置3D曲面图数据</span>
X = np.arange(-<span class="hljs-number">5</span>, <span class="hljs-number">5</span>, <span class="hljs-number">0.25</span>)
Y = np.arange(-<span class="hljs-number">5</span>, <span class="hljs-number">5</span>, <span class="hljs-number">0.25</span>)
<span class="hljs-comment"># 生成网格点坐标矩阵，该方法在系列文章八中有具体介绍</span>
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X**<span class="hljs-number">2</span> + Y**<span class="hljs-number">2</span>)
Z = np.sin(R)
<span class="hljs-comment"># 绘制3D曲面图并添加色条（长度0.8）</span>
surface = ax.plot_surface(X, Y, Z, cmap=<span class="hljs-string">&#x27;rainbow&#x27;</span>, antialiased=<span
class="hljs-literal">False</span>)fig.colorbar(surface, shrink=<span class="hljs-number">0.8</span>)<span class="hljs-comment"># 设置标题、轴标签、图例，也可以直接使用 plt.title、plt.xlabel...</span>ax.set_title(<span class="hljs-string">&#x27;绘制 3D 曲面图示例&#x27;</span>, pad=<span class="hljs-number">15</span>, fontsize=<span class="hljs-string">&#x27;12&#x27;</span>)ax.set_xlabel(<span class="hljs-string">&#x27;x 轴&#x27;</span>)ax.set_ylabel(<span class="hljs-string">&#x27;y 轴&#x27;</span>)ax.set_zlabel(<span class="hljs-string">&#x27;z 轴&#x27;</span>)<span class="hljs-comment"># 调整观察角度和方位角，俯仰角25度，方位角40度</span>ax.view_init(<span class="hljs-number">25</span>, <span class="hljs-number">40</span>)<span class="hljs-comment"># 设置Z轴刻度界限</span>ax.set_zlim(-<span class="hljs-number">2</span>, <span class="hljs-number">2</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/023/06.png" alt="06"></p><hr><pre><code class="hljs yaml"><span class="hljs-string">这里是一段物理防爬虫文本，请读者忽略。</span><span class="hljs-string">本文原创首发于</span> <span class="hljs-string">CSDN，作者</span> <span class="hljs-string">ITBOB。</span><span class="hljs-string">博客首页：https://itrhx.blog.csdn.net/</span><span class="hljs-string">本文链接：https://itrhx.blog.csdn.net/article/details/106558131</span><span class="hljs-string">未经授权，禁止转载！恶意转载，后果自负！尊重原创，远离剽窃！</span></code></pre><hr><h2><span id="07x00-3d-zhu-zhuang-tu-axes3d-bar"><font color="#FF0000">【07x00】3D 柱状图：Axes3D.bar</font></span></h2><p>基本方法：<code>Axes3D.bar(left, height, zs=0, zdir='z', *args, **kwargs)</code></p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>left</td><td>一维数组，柱状图最左侧位置的 x 坐标</td></tr><tr><td>height</td><td>一维数组，柱状图的高度（y 坐标）</td></tr><tr><td>zs</td><td>第 i 个多边形将出现在平面 y=zs[i] 上</td></tr><tr><td>zdir</td><td>可选项，在 3D 轴上绘制 2D 数据时，数据必须以 xs，ys 的形式传递，<br>若此时将 zdir 设置为 ‘y’，数据将会被绘制到 x-z 轴平面上，默认为 ‘z’</td></tr><tr><td>**kwargs</td><td>其他关键字参数，参见 <a 
href="https://matplotlib.org/api/_as_gen/matplotlib.axes.Axes.bar.html">matplotlib.axes.Axes.bar</a></td></tr></tbody></table><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt<span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> npplt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]<span class="hljs-comment"># 获取 figure 画布并通过子图创建 Axes3D 对象</span>fig = plt.figure()ax = fig.add_subplot(<span class="hljs-number">111</span>, projection=<span class="hljs-string">&#x27;3d&#x27;</span>)colors = [<span class="hljs-string">&#x27;r&#x27;</span>, <span class="hljs-string">&#x27;g&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;y&#x27;</span>]yticks = [<span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">1</span>, <span class="hljs-number">0</span>]<span class="hljs-comment"># 设置3D柱状图数据并绘制图像</span><span class="hljs-keyword">for</span> c, k <span class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(colors, yticks):    xs = np.arange(<span class="hljs-number">20</span>)    ys = np.random.rand(<span class="hljs-number">20</span>)    cs = [c] * <span class="hljs-built_in">len</span>(xs)    ax.bar(xs, ys, zs=k, zdir=<span class="hljs-string">&#x27;y&#x27;</span>, color=cs, alpha=<span class="hljs-number">0.8</span>)<span class="hljs-comment"># 设置图像标题、坐标标签以及范围</span>ax.set_title(<span class="hljs-string">&#x27;绘制 3D 柱状图示例&#x27;</span>, pad=<span class="hljs-number">15</span>, fontsize=<span class="hljs-string">&#x27;12&#x27;</span>)ax.set_xlabel(<span class="hljs-string">&#x27;X 轴&#x27;</span>)ax.set_ylabel(<span class="hljs-string">&#x27;Y 轴&#x27;</span>)ax.set_zlabel(<span class="hljs-string">&#x27;Z 轴&#x27;</span>)ax.set_yticks(yticks)plt.show()</code></pre><p><img 
src="https://static.wukongsec.com/itbob/images/article/023/07.png" alt="07"></p><h2><span id="08x00-3d-jian-tou-tu-axes3d-quiver"><font color="#FF0000">【08x00】3D 箭头图：Axes3D.quiver</font></span></h2><p>基本方法：<code>Axes3D.quiver(X, Y, Z, U, V, W, length=1, arrow_length_ratio=0.3, pivot='tail', normalize=False, **kwargs)</code></p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>X, Y, Z</td><td>数组形式，箭头位置的 x、y 和 z 轴坐标（默认为箭头尾部）</td></tr><tr><td>U, V, W</td><td>数组形式，箭头向量的 x、y 和 z 轴分量</td></tr><tr><td>length</td><td>float 类型，每个箭头的长度，默认为 1.0</td></tr><tr><td>arrow_length_ratio</td><td>float 类型，箭头头部相对于箭身的比率，默认为 0.3</td></tr><tr><td>pivot</td><td>箭头在网格点上的位置；箭头围绕该点旋转，因此命名为 pivot，默认为 ‘tail’<br>可选项：<code>'tail'</code>：尾部；<code>'middle'</code>：中间；<code>'tip'</code>：尖端</td></tr><tr><td>normalize</td><td>bool 类型，如果为 True，则所有箭头的长度都将相同<br>默认为 False，即箭头的长度取决于 U、V、W 的值</td></tr><tr><td>**kwargs</td><td>其他关键字参数，参见 <a href="https://matplotlib.org/api/collections_api.html#matplotlib.collections.LineCollection">LineCollection</a></td></tr></tbody></table><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
<span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
<span class="hljs-comment"># 获取 figure 画布并通过子图创建 Axes3D 对象</span>
fig = plt.figure()
ax = fig.add_subplot(<span class="hljs-number">111</span>, projection=<span class="hljs-string">&#x27;3d&#x27;</span>)
<span class="hljs-comment"># 设置箭头位置</span>
x, y, z = np.meshgrid(np.arange(-<span class="hljs-number">0.8</span>, <span class="hljs-number">1</span>, <span class="hljs-number">0.2</span>),
                      np.arange(-<span class="hljs-number">0.8</span>, <span class="hljs-number">1</span>, <span class="hljs-number">0.2</span>),
                      np.arange(-<span class="hljs-number">0.8</span>, <span
class="hljs-number">1</span>, <span class="hljs-number">0.8</span>))
<span class="hljs-comment"># 设置箭头数据</span>
u = np.sin(np.pi * x) * np.cos(np.pi * y) * np.cos(np.pi * z)
v = -np.cos(np.pi * x) * np.sin(np.pi * y) * np.cos(np.pi * z)
w = (np.sqrt(<span class="hljs-number">2.0</span> / <span class="hljs-number">3.0</span>) * np.cos(np.pi * x) * np.cos(np.pi * y) * np.sin(np.pi * z))
<span class="hljs-comment"># 绘制 3D 箭头图</span>
ax.quiver(x, y, z, u, v, w, length=<span class="hljs-number">0.1</span>, normalize=<span class="hljs-literal">True</span>)
<span class="hljs-comment"># 设置图像标题、坐标标签</span>
ax.set_title(<span class="hljs-string">&#x27;绘制 3D 箭头图示例&#x27;</span>, pad=<span class="hljs-number">15</span>, fontsize=<span class="hljs-string">&#x27;12&#x27;</span>)
ax.set_xlabel(<span class="hljs-string">&#x27;X 轴&#x27;</span>)
ax.set_ylabel(<span class="hljs-string">&#x27;Y 轴&#x27;</span>)
ax.set_zlabel(<span class="hljs-string">&#x27;Z 轴&#x27;</span>)
<span class="hljs-comment"># 调整观察角度，俯仰角20度</span>
ax.view_init(<span class="hljs-number">20</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/023/08.png" alt="08"></p><h2><span id="09x00-3d-deng-gao-xian-tu-axes3d-contour"><font color="#FF0000">【09x00】3D 等高线图：Axes3D.contour</font></span></h2><p>基本方法：<code>Axes3D.contour(X, Y, Z[, *args, extend3d=False, stride=5, zdir='z', offset=None, **kwargs])</code></p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>X</td><td>一维或二维数组，x 轴数据</td></tr><tr><td>Y</td><td>一维或二维数组，y 轴数据</td></tr><tr><td>Z</td><td>二维数组，z 轴数据（每个网格点的高度）</td></tr><tr><td>extend3d</td><td>bool 值，可选项，是否以 3D 延伸轮廓，默认 False</td></tr><tr><td>stride</td><td>int 类型，可选项，用于延伸轮廓的步长</td></tr><tr><td>zdir</td><td>可选项，在 3D 轴上绘制 2D 数据时，数据必须以 xs，ys 的形式传递，<br>若此时将 zdir 设置为 ‘y’，数据将会被绘制到 x-z 轴平面上，默认为 ‘z’</td></tr><tr><td>offset</td><td>标量，可选项，如果指定，则在垂直于 zdir 的平面上的位置绘制轮廓线的投影</td></tr><tr><td>**kwargs</td><td>其他关键字参数，可选项，可参见 <a
href="https://matplotlib.org/api/_as_gen/matplotlib.axes.Axes.contour.html">matplotlib.axes.Axes.contour</a></td></tr></tbody></table><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> pltplt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]<span class="hljs-comment"># 获取 figure 画布并通过子图创建 Axes3D 对象</span>fig = plt.figure(figsize=(<span class="hljs-number">8</span>, <span class="hljs-number">4.8</span>))ax = fig.add_subplot(<span class="hljs-number">111</span>, projection=<span class="hljs-string">&#x27;3d&#x27;</span>)<span class="hljs-comment"># 设置等高线数据</span>X = np.arange(-<span class="hljs-number">2.0</span>, <span class="hljs-number">2.0</span>, <span class="hljs-number">0.01</span>)Y = np.arange(-<span class="hljs-number">2.0</span>, <span class="hljs-number">2.0</span>, <span class="hljs-number">0.01</span>)<span class="hljs-comment"># 生成网格点坐标矩阵</span>m, n = np.meshgrid(X, Y)<span class="hljs-comment"># 指定一个函数用于计算每个点的高度，也可以直接使用二维数组储存每个点的高度</span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">f</span>(<span class="hljs-params">a, b</span>):</span>    <span class="hljs-keyword">return</span> (<span class="hljs-number">1</span> - b ** <span class="hljs-number">5</span> + a ** <span class="hljs-number">5</span>) * np.exp(-a ** <span class="hljs-number">2</span> - b ** <span class="hljs-number">2</span>)<span class="hljs-comment"># 绘制3D等高线图并添加色条图（长度0.8）</span>contour = ax.contour(X, Y, f(m, n), cmap=<span class="hljs-string">&#x27;rainbow&#x27;</span>)fig.colorbar(contour, shrink=<span class="hljs-number">0.8</span>)<span class="hljs-comment"># 设置标题、轴标签、图例，也可以直接使用 plt.title、plt.xlabel...</span>ax.set_title(<span class="hljs-string">&#x27;绘制 3D 等高线图示例&#x27;</span>, pad=<span 
class="hljs-number">15</span>, fontsize=<span class="hljs-string">&#x27;12&#x27;</span>)
ax.set_xlabel(<span class="hljs-string">&#x27;x 轴&#x27;</span>)
ax.set_ylabel(<span class="hljs-string">&#x27;y 轴&#x27;</span>)
ax.set_zlabel(<span class="hljs-string">&#x27;z 轴&#x27;</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/023/09.png" alt="09"></p><h2><span id="10x00-3d-deng-gao-xian-tian-chong-tu-axes3d-contourf"><font color="#FF0000">【10x00】3D 等高线填充图：Axes3D.contourf</font></span></h2><p>基本方法：<code>Axes3D.contourf(X, Y, Z[, *args, zdir='z', offset=None, **kwargs])</code></p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>X</td><td>一维或二维数组，x 轴数据</td></tr><tr><td>Y</td><td>一维或二维数组，y 轴数据</td></tr><tr><td>Z</td><td>二维数组，z 轴数据（每个网格点的高度）</td></tr><tr><td>zdir</td><td>可选项，在 3D 轴上绘制 2D 数据时，数据必须以 xs，ys 的形式传递，<br>若此时将 zdir 设置为 ‘y’，数据将会被绘制到 x-z 轴平面上，默认为 ‘z’</td></tr><tr><td>offset</td><td>标量，可选项，如果指定，则在垂直于 zdir 的平面上的位置绘制轮廓线的投影</td></tr><tr><td>**kwargs</td><td>其他关键字参数，可选项，可参见 <a href="https://matplotlib.org/api/_as_gen/matplotlib.axes.Axes.contourf.html">matplotlib.axes.Axes.contourf</a></td></tr></tbody></table><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
<span class="hljs-comment"># 获取 figure 画布并通过子图创建 Axes3D 对象</span>
fig = plt.figure(figsize=(<span class="hljs-number">8</span>, <span class="hljs-number">4.8</span>))
ax = fig.add_subplot(<span class="hljs-number">111</span>, projection=<span class="hljs-string">&#x27;3d&#x27;</span>)
<span class="hljs-comment"># 设置等高线数据</span>
X = np.arange(-<span class="hljs-number">2.0</span>, <span class="hljs-number">2.0</span>, <span class="hljs-number">0.01</span>)
Y = np.arange(-<span
class="hljs-number">2.0</span>, <span class="hljs-number">2.0</span>, <span class="hljs-number">0.01</span>)<span class="hljs-comment"># 生成网格点坐标矩阵</span>m, n = np.meshgrid(X, Y)<span class="hljs-comment"># 指定一个函数用于计算每个点的高度，也可以直接使用二维数组储存每个点的高度</span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">f</span>(<span class="hljs-params">a, b</span>):</span>    <span class="hljs-keyword">return</span> (<span class="hljs-number">1</span> - b ** <span class="hljs-number">5</span> + a ** <span class="hljs-number">5</span>) * np.exp(-a ** <span class="hljs-number">2</span> - b ** <span class="hljs-number">2</span>)<span class="hljs-comment"># 绘制3D等高线图并添加色条图（长度0.8）</span>contourf = ax.contourf(X, Y, f(m, n), cmap=<span class="hljs-string">&#x27;rainbow&#x27;</span>)fig.colorbar(contourf, shrink=<span class="hljs-number">0.8</span>)<span class="hljs-comment"># 设置标题、轴标签、图例，也可以直接使用 plt.title、plt.xlabel...</span>ax.set_title(<span class="hljs-string">&#x27;绘制 3D 等高线填充图示例&#x27;</span>, pad=<span class="hljs-number">15</span>, fontsize=<span class="hljs-string">&#x27;12&#x27;</span>)ax.set_xlabel(<span class="hljs-string">&#x27;x 轴&#x27;</span>)ax.set_ylabel(<span class="hljs-string">&#x27;y 轴&#x27;</span>)ax.set_zlabel(<span class="hljs-string">&#x27;z 轴&#x27;</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/023/10.png" alt="10"></p><h2><span id="11x00-3d-san-jiao-qu-mian-tu-axes3d-plot-trisurf"><font color="#FF0000">【11x00】3D 三角曲面图：Axes3D.plot_trisurf</font></span></h2><p>基本方法：<code>Axes3D.plot_trisurf(X, Y, Z[, *args, color=None, vmin=None, vmax=None, **kwargs])</code></p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>X</td><td>一维数组，x 轴数据</td></tr><tr><td>Y</td><td>一维数组，y 轴数据</td></tr><tr><td>Z</td><td>一维数组，z 轴数据</td></tr><tr><td>color</td><td>曲面表面的颜色</td></tr><tr><td>vmin / vmax</td><td>规定数据界限</td></tr><tr><td>**kwargs</td><td>可选项，其他关键字参数，可参见 <a 
href="https://matplotlib.org/api/_as_gen/mpl_toolkits.mplot3d.art3d.Poly3DCollection.html">Poly3DCollection</a></td></tr></tbody></table><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt<span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> npplt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]<span class="hljs-comment"># 获取 figure 画布并通过子图创建 Axes3D 对象</span>fig = plt.figure()ax = fig.add_subplot(<span class="hljs-number">111</span>, projection=<span class="hljs-string">&#x27;3d&#x27;</span>)n_radii = <span class="hljs-number">8</span>n_angles = <span class="hljs-number">36</span>radii = np.linspace(<span class="hljs-number">0.125</span>, <span class="hljs-number">1.0</span>, n_radii)angles = np.linspace(<span class="hljs-number">0</span>, <span class="hljs-number">2</span>*np.pi, n_angles, endpoint=<span class="hljs-literal">False</span>)[..., np.newaxis]x = np.append(<span class="hljs-number">0</span>, (radii*np.cos(angles)).flatten())y = np.append(<span class="hljs-number">0</span>, (radii*np.sin(angles)).flatten())z = np.sin(-x*y)<span class="hljs-comment"># 绘制3D三角曲面图并添加色条（长度0.8）</span>trisurf = ax.plot_trisurf(x, y, z, cmap=<span class="hljs-string">&#x27;rainbow&#x27;</span>)fig.colorbar(trisurf, shrink=<span class="hljs-number">0.8</span>)<span class="hljs-comment"># 设置标题、轴标签、图例，也可以直接使用 plt.title、plt.xlabel...</span>ax.set_title(<span class="hljs-string">&#x27;绘制 3D 三角曲面图示例&#x27;</span>, pad=<span class="hljs-number">15</span>, fontsize=<span class="hljs-string">&#x27;12&#x27;</span>)ax.set_xlabel(<span class="hljs-string">&#x27;x 轴&#x27;</span>)ax.set_ylabel(<span class="hljs-string">&#x27;y 轴&#x27;</span>)ax.set_zlabel(<span class="hljs-string">&#x27;z 轴&#x27;</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/023/11.png" 
alt="11"></p><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt<span class="hljs-keyword">import</span> matplotlib.tri <span class="hljs-keyword">as</span> mtriplt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]fig = plt.figure(figsize=(<span class="hljs-number">15</span>, <span class="hljs-number">6</span>))<span class="hljs-comment"># ============ 第一个示例图 ============ #</span>ax = fig.add_subplot(<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">1</span>, projection=<span class="hljs-string">&#x27;3d&#x27;</span>)u = np.linspace(<span class="hljs-number">0</span>, <span class="hljs-number">2.0</span> * np.pi, endpoint=<span class="hljs-literal">True</span>, num=<span class="hljs-number">50</span>)v = np.linspace(-<span class="hljs-number">0.5</span>, <span class="hljs-number">0.5</span>, endpoint=<span class="hljs-literal">True</span>, num=<span class="hljs-number">10</span>)u, v = np.meshgrid(u, v)u, v = u.flatten(), v.flatten()x = (<span class="hljs-number">1</span> + <span class="hljs-number">0.5</span> * v * np.cos(u / <span class="hljs-number">2.0</span>)) * np.cos(u)y = (<span class="hljs-number">1</span> + <span class="hljs-number">0.5</span> * v * np.cos(u / <span class="hljs-number">2.0</span>)) * np.sin(u)z = <span class="hljs-number">0.5</span> * v * np.sin(u / <span class="hljs-number">2.0</span>)tri = mtri.Triangulation(u, v)trisurf_1 = ax.plot_trisurf(x, y, z, triangles=tri.triangles, cmap=<span class="hljs-string">&#x27;cool&#x27;</span>)fig.colorbar(trisurf_1, shrink=<span class="hljs-number">0.8</span>)ax.set_zlim(-<span class="hljs-number">1</span>, <span class="hljs-number">1</span>)ax.set_title(<span class="hljs-string">&#x27;绘制 3D 
三角曲面图示例一&#x27;</span>, pad=<span class="hljs-number">15</span>, fontsize=<span class="hljs-string">&#x27;12&#x27;</span>)ax.set_xlabel(<span class="hljs-string">&#x27;x 轴&#x27;</span>)ax.set_ylabel(<span class="hljs-string">&#x27;y 轴&#x27;</span>)ax.set_zlabel(<span class="hljs-string">&#x27;z 轴&#x27;</span>)<span class="hljs-comment"># ============ 第二个示例图 ============ #</span>ax = fig.add_subplot(<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>, projection=<span class="hljs-string">&#x27;3d&#x27;</span>)n_angles = <span class="hljs-number">36</span>n_radii = <span class="hljs-number">8</span>min_radius = <span class="hljs-number">0.25</span>radii = np.linspace(min_radius, <span class="hljs-number">0.95</span>, n_radii)angles = np.linspace(<span class="hljs-number">0</span>, <span class="hljs-number">2</span>*np.pi, n_angles, endpoint=<span class="hljs-literal">False</span>)angles = np.repeat(angles[..., np.newaxis], n_radii, axis=<span class="hljs-number">1</span>)angles[:, <span class="hljs-number">1</span>::<span class="hljs-number">2</span>] += np.pi/n_anglesx = (radii*np.cos(angles)).flatten()y = (radii*np.sin(angles)).flatten()z = (np.cos(radii)*np.cos(<span class="hljs-number">3</span>*angles)).flatten()triang = mtri.Triangulation(x, y)xmid = x[triang.triangles].mean(axis=<span class="hljs-number">1</span>)ymid = y[triang.triangles].mean(axis=<span class="hljs-number">1</span>)mask = xmid**<span class="hljs-number">2</span> + ymid**<span class="hljs-number">2</span> &lt; min_radius**<span class="hljs-number">2</span>triang.set_mask(mask)trisurf_2 = ax.plot_trisurf(triang, z, cmap=<span class="hljs-string">&#x27;hsv&#x27;</span>)fig.colorbar(trisurf_2, shrink=<span class="hljs-number">0.8</span>)ax.set_title(<span class="hljs-string">&#x27;绘制 3D 三角曲面图示例二&#x27;</span>, pad=<span class="hljs-number">15</span>, fontsize=<span class="hljs-string">&#x27;12&#x27;</span>)ax.set_xlabel(<span 
class="hljs-string">&#x27;x 轴&#x27;</span>)
ax.set_ylabel(<span class="hljs-string">&#x27;y 轴&#x27;</span>)
ax.set_zlabel(<span class="hljs-string">&#x27;z 轴&#x27;</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/023/12.png" alt="12"></p><h2><span id="12x00-jiang-2d-tu-xiang-ju-he-dao-3d-tu-xiang-zhong-axes3d-add-collection3d"><font color="#FF0000">【12x00】将 2D 图像聚合到 3D 图像中：Axes3D.add_collection3d</font></span></h2><p>基本方法：<code>Axes3D.add_collection3d(col, zs=0, zdir='z')</code></p><table><thead><tr><th>参数</th><th>描述</th></tr></thead><tbody><tr><td>col</td><td><a href="https://matplotlib.org/api/collections_api.html?highlight=polycollection#matplotlib.collections.PolyCollection">PolyCollection</a> / <a href="https://matplotlib.org/api/collections_api.html?highlight=linecollection#matplotlib.collections.LineCollection">LineCollection</a> / <a href="https://matplotlib.org/api/collections_api.html?highlight=patchcollection#matplotlib.collections.PatchCollection">PatchCollection</a> 对象</td></tr><tr><td>zs</td><td>标量或数组，可选项，集合对象在 zdir 方向上的位置；<br>例如 zdir 为 ‘y’ 时，第 i 个多边形将出现在平面 y=zs[i] 上</td></tr><tr><td>zdir</td><td>可选项，在 3D 轴上绘制 2D 数据时，数据必须以 xs，ys 的形式传递，<br>若此时将 zdir 设置为 ‘y’，数据将会被绘制到 x-z 轴平面上，默认为 ‘z’</td></tr></tbody></table><p>该函数一般用来向图形中添加 3D 集合对象，以下用一个示例来展示某个地区在不同年份和不同月份的降水量：</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
<span class="hljs-keyword">from</span> matplotlib.collections <span class="hljs-keyword">import</span> PolyCollection
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
fig = plt.figure()
ax = fig.add_subplot(<span class="hljs-number">111</span>, projection=<span class="hljs-string">&#x27;3d&#x27;</span>)
np.random.seed(<span class="hljs-number">59</span>)
month = np.arange(<span class="hljs-number">0</span>, <span
class="hljs-number">13</span>)years = [<span class="hljs-number">2017</span>, <span class="hljs-number">2018</span>, <span class="hljs-number">2019</span>, <span class="hljs-number">2020</span>]precipitation = []<span class="hljs-keyword">for</span> year <span class="hljs-keyword">in</span> years:    value = np.random.rand(<span class="hljs-built_in">len</span>(month)) * <span class="hljs-number">300</span>    value[<span class="hljs-number">0</span>], value[-<span class="hljs-number">1</span>] = <span class="hljs-number">0</span>, <span class="hljs-number">0</span>    precipitation.append(<span class="hljs-built_in">list</span>(<span class="hljs-built_in">zip</span>(month, value)))poly = PolyCollection(precipitation, facecolors=[<span class="hljs-string">&#x27;r&#x27;</span>, <span class="hljs-string">&#x27;g&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;y&#x27;</span>], alpha=<span class="hljs-number">.6</span>)ax.add_collection3d(poly, zs=years, zdir=<span class="hljs-string">&#x27;y&#x27;</span>)ax.set_title(<span class="hljs-string">&#x27;2D 图像聚合到 3D 图像示例&#x27;</span>, pad=<span class="hljs-number">15</span>, fontsize=<span class="hljs-string">&#x27;12&#x27;</span>)ax.set_xlabel(<span class="hljs-string">&#x27;月份&#x27;</span>)ax.set_ylabel(<span class="hljs-string">&#x27;年份&#x27;</span>)ax.set_zlabel(<span class="hljs-string">&#x27;降水量&#x27;</span>)ax.set_xlim3d(<span class="hljs-number">0</span>, <span class="hljs-number">12</span>)ax.set_ylim3d(<span class="hljs-number">2016</span>, <span class="hljs-number">2021</span>)ax.set_zlim3d(<span class="hljs-number">0</span>, <span class="hljs-number">300</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/023/13.png" alt="13"></p><p>此外，该方法也常被用于绘制 3D 多边形图，即多边体，示例如下：</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt<span 
class="hljs-keyword">from</span> mpl_toolkits.mplot3d.art3d <span class="hljs-keyword">import</span> Poly3DCollection, Line3DCollection
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
fig = plt.figure()
ax = fig.add_subplot(<span class="hljs-number">111</span>, projection=<span class="hljs-string">&#x27;3d&#x27;</span>)
<span class="hljs-comment"># 六面体顶点和面</span>
verts = [(<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>), (<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">0</span>), (<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, <span class="hljs-number">0</span>), (<span class="hljs-number">1</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>), (<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>), (<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>), (<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>), (<span class="hljs-number">1</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>)]
faces = [[<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>], [<span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>], [<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">5</span>, <span class="hljs-number">4</span>], [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">6</span>, <span class="hljs-number">5</span>], [<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">7</span>,
<span class="hljs-number">6</span>], [<span class="hljs-number">0</span>, <span class="hljs-number">3</span>, <span class="hljs-number">7</span>, <span class="hljs-number">4</span>]]<span class="hljs-comment"># 获取每个面的顶点</span>poly3d = [[verts[vert_id] <span class="hljs-keyword">for</span> vert_id <span class="hljs-keyword">in</span> face] <span class="hljs-keyword">for</span> face <span class="hljs-keyword">in</span> faces]<span class="hljs-comment"># 绘制顶点</span>x, y, z = <span class="hljs-built_in">zip</span>(*verts)ax.scatter(x, y, z)<span class="hljs-comment"># 绘制多边形面</span>ax.add_collection3d(Poly3DCollection(poly3d, facecolors=<span class="hljs-string">&#x27;w&#x27;</span>, linewidths=<span class="hljs-number">1</span>, alpha=<span class="hljs-number">0.5</span>))<span class="hljs-comment"># 绘制多边形的边</span>ax.add_collection3d(Line3DCollection(poly3d, colors=<span class="hljs-string">&#x27;k&#x27;</span>, linewidths=<span class="hljs-number">0.5</span>, linestyles=<span class="hljs-string">&#x27;:&#x27;</span>))<span class="hljs-comment"># 设置图像标题、坐标标签以及范围</span>ax.set_title(<span class="hljs-string">&#x27;绘制多边体示例&#x27;</span>, pad=<span class="hljs-number">15</span>, fontsize=<span class="hljs-string">&#x27;12&#x27;</span>)ax.set_xlabel(<span class="hljs-string">&#x27;X 轴&#x27;</span>)ax.set_ylabel(<span class="hljs-string">&#x27;Y 轴&#x27;</span>)ax.set_zlabel(<span class="hljs-string">&#x27;Z 轴&#x27;</span>)ax.set_xlim3d(-<span class="hljs-number">0.5</span>, <span class="hljs-number">1.5</span>)ax.set_ylim3d(-<span class="hljs-number">0.5</span>, <span class="hljs-number">1.5</span>)ax.set_zlim3d(-<span class="hljs-number">0.5</span>, <span class="hljs-number">1.5</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/023/14.png" alt="14"></p><h2><span id="13x00-3d-tu-tian-jia-wen-ben-miao-shu-axes3d-text"><font color="#FF0000">【13x00】3D 图添加文本描述：Axes3D.text</font></span></h2><p>基本方法：<code>Axes3D.text(x, y, z, s[, 
zdir=None, **kwargs])</code></p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>x, y, z</td><td>The x, y, and z coordinates of the text position</td></tr><tr><td>s</td><td>The text to add</td></tr><tr><td>zdir</td><td>Optional. If zdir is set to ‘y’, the text is projected onto the x-z plane. Defaults to None</td></tr><tr><td>**kwargs</td><td>Other keyword arguments; see <a href="https://matplotlib.org/api/text_api.html">matplotlib.text</a></td></tr></tbody></table><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
<span class="hljs-comment"># Create the figure, then add a 3D Axes (Axes3D) to it;</span>
<span class="hljs-comment"># recent Matplotlib releases no longer accept fig.gca(projection=&#x27;3d&#x27;)</span>
fig = plt.figure()
ax = fig.add_subplot(projection=<span class="hljs-string">&#x27;3d&#x27;</span>)
<span class="hljs-comment"># Demo 1: using the zdir parameter</span>
zdirs = (<span class="hljs-literal">None</span>, <span class="hljs-string">&#x27;x&#x27;</span>, <span class="hljs-string">&#x27;y&#x27;</span>, <span class="hljs-string">&#x27;z&#x27;</span>, (<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, <span class="hljs-number">0</span>), (<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>))
xs = (<span class="hljs-number">1</span>, <span class="hljs-number">4</span>, <span class="hljs-number">4</span>, <span class="hljs-number">9</span>, <span class="hljs-number">4</span>, <span class="hljs-number">1</span>)
ys = (<span class="hljs-number">2</span>, <span class="hljs-number">5</span>, <span class="hljs-number">8</span>, <span class="hljs-number">10</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>)
zs = (<span class="hljs-number">10</span>, <span class="hljs-number">3</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">1</span>, <span class="hljs-number">8</span>)
<span class="hljs-keyword">for</span> zdir, x, y, z <span 
class="hljs-keyword">in</span> <span class="hljs-built_in">zip</span>(zdirs, xs, ys, zs):
    label = <span class="hljs-string">&#x27;(%d, %d, %d), dir=%s&#x27;</span> % (x, y, z, zdir)
    ax.text(x, y, z, label, zdir)
<span class="hljs-comment"># Demo 2: setting the color</span>
ax.text(<span class="hljs-number">9</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-string">&quot;red&quot;</span>, color=<span class="hljs-string">&#x27;red&#x27;</span>)
<span class="hljs-comment"># Demo 3: text2D, where (0, 0) is the lower-left corner and (1, 1) the upper-right corner</span>
ax.text2D(<span class="hljs-number">0.05</span>, <span class="hljs-number">0.95</span>, <span class="hljs-string">&quot;2D Text&quot;</span>, transform=ax.transAxes)
<span class="hljs-comment"># Set the axis limits and labels</span>
ax.set_xlim(<span class="hljs-number">0</span>, <span class="hljs-number">10</span>)
ax.set_ylim(<span class="hljs-number">0</span>, <span class="hljs-number">10</span>)
ax.set_zlim(<span class="hljs-number">0</span>, <span class="hljs-number">10</span>)
ax.set_xlabel(<span class="hljs-string">&#x27;X 轴&#x27;</span>)
ax.set_ylabel(<span class="hljs-string">&#x27;Y 轴&#x27;</span>)
ax.set_zlabel(<span class="hljs-string">&#x27;Z 轴&#x27;</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/023/15.png" alt="15"></p><hr><pre><code class="hljs yaml"><span class="hljs-string">This is a block of anti-scraping text; readers can ignore it.</span>
<span class="hljs-string">This article was originally published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106558131</span>
<span class="hljs-string">Do not repost without permission. Respect original work and stay away from plagiarism.</span></code></pre><hr>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;Table of Contents&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-01x00-liao-jie-mplot3d-toolkit-font&quot;&gt;&lt;font</summary>
      
    
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    <category term="Matplotlib" scheme="https://www.itbob.cn/tags/Matplotlib/"/>
    
  </entry>
  
  <entry>
    <title>The Python Data Analysis Trio, Matplotlib (Part 9): Drawing Polar Area Charts, Polar Plots, and Radar Charts</title>
    <link href="https://www.itbob.cn/article/022/"/>
    <id>https://www.itbob.cn/article/022/</id>
    <published>2020-06-03T10:58:43.000Z</published>
    <updated>2022-05-22T12:34:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">Table of Contents</span></h2><!-- toc --><ul><li><a href="#font-color-ff0000-1x00-liao-jie-ji-zuo-biao-font"><font color="#FF0000">【1x00】Understanding Polar Coordinates</font></a></li><li><a href="#font-color-ff0000-2x00-ji-ben-fang-fa-matplotlib-pyplot-polar-font"><font color="#FF0000">【2x00】The Basic Method: matplotlib.pyplot.polar()</font></a></li><li><a href="#font-color-ff0000-3x00-hui-zhi-ji-zuo-biao-font"><font color="#FF0000">【3x00】Drawing a Polar Plot</font></a></li><li><a href="#font-color-ff0000-4x00-hui-zhi-lei-da-tu-font"><font color="#FF0000">【4x00】Drawing a Radar Chart</font></a><ul><li><a href="#font-color-4876ff-4x01-li-jie-numpy-concatenate-font"><font color="##4876FF">【4x01】Understanding numpy.concatenate()</font></a></li><li><a href="#font-color-4876ff-4x02-li-jie-pyplot-thetagrids-font"><font color="##4876FF">【4x02】Understanding pyplot.thetagrids()</font></a></li><li><a href="#font-color-4876ff-4x03-hui-zhi-lei-da-tu-font"><font color="##4876FF">【4x03】Drawing the Radar Chart</font></a></li></ul></li><li><a href="#font-color-ff0000-5x00-gao-ji-yong-fa-hui-zhi-ji-zuo-biao-san-dian-tu-font"><font color="#FF0000">【5x00】Advanced Usage: Polar Scatter Plots</font></a><ul><li><a href="#font-color-4876ff-5x01-fang-fa-yi-pyplot-scatter-yu-pyplot-polar-font"><font color="##4876FF">【5x01】Method 1: pyplot.scatter() with pyplot.polar()</font></a></li><li><a href="#font-color-4876ff-5x02-fang-fa-er-pyplot-scatter-yu-pyplot-subplot-font"><font color="##4876FF">【5x02】Method 2: pyplot.scatter() with pyplot.subplot()</font></a></li><li><a href="#font-color-4876ff-5x03-fang-fa-san-pyplot-scatter-yu-pyplot-axes-font"><font color="##4876FF">【5x03】Method 3: pyplot.scatter() with pyplot.axes()</font></a></li></ul></li><li><a href="#font-color-ff0000-6x00-gao-ji-yong-fa-hui-zhi-ji-zuo-biao-zhu-zhuang-tu-font"><font color="#FF0000">【6x00】Advanced Usage: Polar Bar Charts</font></a><ul><li><a href="#font-color-4876ff-6x01-fang-fa-yi-pyplot-bar-yu-pyplot-polar-font"><font color="##4876FF">【6x01】Method 1: pyplot.bar() with pyplot.polar()</font></a></li><li><a 
href="#font-color-4876ff-6x02-fang-fa-er-pyplot-bar-yu-pyplot-subplot-font"><font color="##4876FF">【6x02】Method 2: pyplot.bar() with pyplot.subplot()</font></a></li><li><a href="#font-color-4876ff-6x03-fang-fa-san-pyplot-bar-yu-pyplot-axes-font"><font color="##4876FF">【6x03】Method 3: pyplot.bar() with pyplot.axes()</font></a></li></ul></li></ul><!-- tocstop --><hr><p>Articles in the Matplotlib series:</p><ul><li><a href="https://www.itbob.cn/article/014/">The Python Data Analysis Trio, Matplotlib (Part 1): Getting Started with Matplotlib and Its matplotlibrc Configuration File</a></li><li><a href="https://www.itbob.cn/article/015/">The Python Data Analysis Trio, Matplotlib (Part 2): Text, Chinese Font Support, Figures, Grids, and Other Basic Plot Properties</a></li><li><a href="https://www.itbob.cn/article/016/">The Python Data Analysis Trio, Matplotlib (Part 3): Legends, LaTeX, Ticks, Subplots, Patches, and Other Basic Plot Properties</a></li><li><a href="https://www.itbob.cn/article/017/">The Python Data Analysis Trio, Matplotlib (Part 4): Drawing Line Plots</a></li><li><a href="https://www.itbob.cn/article/018/">The Python Data Analysis Trio, Matplotlib (Part 5): Drawing Scatter Plots</a></li><li><a href="https://www.itbob.cn/article/019/">The Python Data Analysis Trio, Matplotlib (Part 6): Drawing Histograms, Bar Charts, and Horizontal Bar Charts</a></li><li><a href="https://www.itbob.cn/article/020/">The Python Data Analysis Trio, Matplotlib (Part 7): Drawing Pie Charts</a></li><li><a href="https://www.itbob.cn/article/021/">The Python Data Analysis Trio, Matplotlib (Part 8): Drawing Contour Plots</a></li><li><a href="https://www.itbob.cn/article/022/">The Python Data Analysis Trio, Matplotlib (Part 9): Drawing Polar Area Charts, Polar Plots, and Radar Charts</a></li><li><a href="https://www.itbob.cn/article/023/">The Python Data Analysis Trio, Matplotlib (Part 10): Drawing 3D Plots</a></li><li><a href="https://www.itbob.cn/article/024/">The Python Data Analysis Trio, Matplotlib (Part 11): The 50 Most Popular and Commonly Used Charts</a> (translated article)</li></ul><hr><p>Columns:</p><ul><li>NumPy column: <a href="https://itrhx.blog.csdn.net/category_9780393.html">https://itrhx.blog.csdn.net/category_9780393.html</a></li><li>Pandas column: <a href="https://itrhx.blog.csdn.net/category_9780397.html">https://itrhx.blog.csdn.net/category_9780397.html</a></li><li>Matplotlib column: <a href="https://itrhx.blog.csdn.net/category_9780418.html">https://itrhx.blog.csdn.net/category_9780418.html</a></li></ul><br>Recommended learning resources and sites:<br><br><ul><li>NumPy Chinese-language site: <a 
href="https://www.numpy.org.cn/">https://www.numpy.org.cn/</a></li><li>Pandas Chinese-language site: <a href="https://www.pypandas.cn/">https://www.pypandas.cn/</a></li><li>Matplotlib Chinese-language site: <a href="https://www.matplotlib.org.cn/">https://www.matplotlib.org.cn/</a></li><li>NumPy, Matplotlib, and Pandas cheat sheets: <a href="https://github.com/TRHX/Python-quick-reference-table">https://github.com/TRHX/Python-quick-reference-table</a></li></ul><hr><pre><code class="hljs yaml"><span class="hljs-string">This is a block of anti-scraping text; readers can ignore it.</span>
<span class="hljs-string">This article was originally published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106162412</span>
<span class="hljs-string">Do not repost without permission. Respect original work and stay away from plagiarism.</span></code></pre><hr><h2><span id="1x00-liao-jie-ji-zuo-biao"><font color="#FF0000">【1x00】Understanding Polar Coordinates</font></span></h2><p>Following Baidu Baike: polar coordinates are a two-dimensional coordinate system used mainly in mathematics. Baidu Baike credits Newton as the originator, although earlier 17th-century mathematicians such as Saint-Vincent and Cavalieri are also commonly credited with the idea. In the plane, pick a fixed point <font color="#FF0000">O</font>, called the pole, and draw a ray <font color="#FF0000">Ox</font> from it, called the polar axis; then choose a unit of length and a positive direction for angles (usually counterclockwise). For any point <font color="#FF0000">M</font> in the plane, <font color="#FF0000">ρ</font> denotes the length of the segment <font color="#FF0000">OM</font> (sometimes written <font color="#FF0000">r</font>) and <font color="#FF0000">θ</font> denotes the angle from <font color="#FF0000">Ox</font> to <font color="#FF0000">OM</font>. <font color="#FF0000">ρ</font> is called the radial coordinate of <font color="#FF0000">M</font>, <font color="#FF0000">θ</font> is called the polar angle of <font color="#FF0000">M</font>, and the ordered pair <font color="#FF0000">(ρ,θ)</font> is the polar coordinates of <font color="#FF0000">M</font>. A coordinate system built this way is called a polar coordinate system. Usually the radial coordinate of <font color="#FF0000">M</font> is measured in units of <font color="#FF0000">1</font> (the chosen length unit) and the polar angle in <font color="#FF0000">rad</font> (or <font color="#FF0000">°</font>).</p><p><img src="https://static.wukongsec.com/itbob/images/article/022/01.png" alt="01"></p><h2><span id="2x00-ji-ben-fang-fa-matplotlib-pyplot-polar"><font color="#FF0000">【2x00】The Basic Method: 
matplotlib.pyplot.polar()</font></span></h2><p>The <code>matplotlib.pyplot.polar()</code> function draws a polar plot.</p><p>Basic syntax: <code>polar(theta, r, **kwargs)</code></p><ul><li><font color="#FF0000"><strong>theta</strong></font>: the angular coordinates of the points, given in radians;</li><li><font color="#FF0000"><strong>r</strong></font>: the radial coordinates of the points;</li><li><font color="#FF0000">**<strong>kwargs</strong></font>: optional, other Line2D properties; the common ones are listed in <font color="#FF0000"><strong>Table 1</strong></font>.</li></ul><p>Aside: mathematics usually works in radians rather than degrees. The radian is abbreviated rad; 2π rad = 360°, 1° ≈ 0.0174533 rad, and 1 rad ≈ 57.29578°.</p><ul><li>Degrees to radians: radians = degrees ÷ 180 × π</li><li>Radians to degrees: degrees = radians × 180 ÷ π</li></ul><table><tr><td bgcolor="#7FFFD4" colspan="2"><strong><font color="FF0000" size="3px">Table 1: selected Line2D properties; for the full list see the official documentation:<br>https://matplotlib.org/api/_as_gen/matplotlib.lines.Line2D.html</font></strong></td></tr></table><table><thead><tr><th>Property</th><th>Description</th></tr></thead><tbody><tr><td>alpha</td><td>Line transparency, a float in the range <code>[0, 1]</code>; fully opaque (1.0) unless set</td></tr><tr><td>antialiased / aa</td><td>Whether to use antialiased rendering; defaults to True</td></tr><tr><td>color / c</td><td>Line color; accepts English color names and their abbreviations, hexadecimal color codes, and so on. For more examples see the official <a href="https://matplotlib.org/gallery/color/color_demo.html">Color Demo</a></td></tr><tr><td>fillstyle</td><td>Marker fill style: <code>'full'</code>, <code>'left'</code>, <code>'right'</code>, <code>'bottom'</code>, <code>'top'</code>, <code>'none'</code></td></tr><tr><td>label</td><td>Legend label; for details see:<br><a href="https://itrhx.blog.csdn.net/article/details/105828143">The Python Data Analysis Trio, Matplotlib (Part 3): Legends, LaTeX, Ticks, Subplots, Patches, and Other Basic Plot Properties</a></td></tr><tr><td>linestyle / ls</td><td>Style of the connecting line: <code>'-'</code> or <code>'solid'</code>, <code>'--'</code> or <code>'dashed'</code>, <code>'-.'</code> or <code>'dashdot'</code> <br> <code>':'</code> or <code>'dotted'</code>, <code>'none'</code> or <code>' '</code> or <code>''</code></td></tr><tr><td>linewidth / lw</td><td>Width of the connecting line, a float; the default comes from <code>rcParams['lines.linewidth']</code>, which is 1.5 in a stock configuration</td></tr><tr><td>marker</td><td>Marker style; see <font color="#FF0000"><strong>Table 2</strong></font> for the available styles</td></tr><tr><td>markeredgecolor / 
mec</td><td>Marker edge color</td></tr><tr><td>markeredgewidth / mew</td><td>Marker edge width</td></tr><tr><td>markerfacecolor / mfc</td><td>Marker face color</td></tr><tr><td>markerfacecoloralt / mfcalt</td><td>Alternate marker face color</td></tr><tr><td>markersize / ms</td><td>Marker size</td></tr></tbody></table><table><tr><td bgcolor="#7FFFD4" colspan="2"><strong><font color="FF0000" size="3px">Table 2: marker styles; official documentation:<br>https://matplotlib.org/api/markers_api.html</font></strong></td></tr></table><table><thead><tr><th>Marker</th><th>Description</th></tr></thead><tbody><tr><td><code>&quot;.&quot;</code></td><td>Point</td></tr><tr><td><code>&quot;,&quot;</code></td><td>Pixel</td></tr><tr><td><code>&quot;o&quot;</code></td><td>Circle</td></tr><tr><td><code>&quot;v&quot;</code></td><td>Triangle pointing down</td></tr><tr><td><code>&quot;^&quot;</code></td><td>Triangle pointing up</td></tr><tr><td><code>&quot;&lt;&quot;</code></td><td>Triangle pointing left</td></tr><tr><td><code>&quot;&gt;&quot;</code></td><td>Triangle pointing right</td></tr><tr><td><code>&quot;1&quot;</code></td><td>Three-pronged star pointing down</td></tr><tr><td><code>&quot;2&quot;</code></td><td>Three-pronged star pointing up (resembles the Mercedes-Benz logo)</td></tr><tr><td><code>&quot;3&quot;</code></td><td>Three-pronged star pointing left</td></tr><tr><td><code>&quot;4&quot;</code></td><td>Three-pronged star pointing right</td></tr><tr><td><code>&quot;8&quot;</code></td><td>Octagon</td></tr><tr><td><code>&quot;s&quot;</code></td><td>Square</td></tr><tr><td><code>&quot;p&quot;</code></td><td>Pentagon</td></tr><tr><td><code>&quot;P&quot;</code></td><td>Filled plus (thick plus sign)</td></tr><tr><td><code>&quot;+&quot;</code></td><td>Plus</td></tr><tr><td><code>&quot;*&quot;</code></td><td>Star</td></tr><tr><td><code>&quot;h&quot;</code></td><td>Hexagon (vertex at the bottom)</td></tr><tr><td><code>&quot;H&quot;</code></td><td>Hexagon (edge at the bottom)</td></tr><tr><td><code>&quot;x&quot;</code></td><td>x</td></tr><tr><td><code>&quot;X&quot;</code></td><td>Filled x (thick x)</td></tr><tr><td><code>&quot;D&quot;</code></td><td>Diamond (equal diagonals)</td></tr><tr><td><code>&quot;d&quot;</code></td><td>Thin diamond (unequal diagonals)</td></tr><tr><td><code>&quot;|&quot;</code></td><td>Vertical line</td></tr><tr><td><code>&quot;_&quot;</code></td><td>Horizontal line</td></tr><tr><td><code>0</code></td><td>Horizontal tick to the left</td></tr><tr><td><code>1</code></td><td>Horizontal tick to the right</td></tr><tr><td><code>2</code></td><td>Vertical tick upward</td></tr><tr><td><code>3</code></td><td>Vertical tick downward</td></tr><tr><td><code>4</code></td><td>Left caret (thinner than <code>&quot;&lt;&quot;</code>)</td></tr><tr><td><code>5</code></td><td>Right caret (thinner than <code>&quot;&gt;&quot;</code>)</td></tr><tr><td><code>6</code></td><td>Up caret (thinner than <code>&quot;^&quot;</code>)</td></tr><tr><td><code>7</code></td><td>Down caret (thinner than <code>&quot;v&quot;</code>)</td></tr><tr><td><code>8</code></td><td>Left caret (thinner than <code>&quot;&lt;&quot;</code>, centered at its base)</td></tr><tr><td><code>9</code></td><td>Right caret (thinner than <code>&quot;&gt;&quot;</code>, centered at its base)</td></tr><tr><td><code>10</code></td><td>Up caret (thinner than <code>&quot;^&quot;</code>, centered at its base)</td></tr><tr><td><code>11</code></td><td>Down caret (thinner than <code>&quot;v&quot;</code>, centered at its base)</td></tr><tr><td><code>&quot;None&quot;</code> / <code>&quot; &quot;</code> / <code>&quot;&quot;</code></td><td>No marker</td></tr><tr><td><code>'$...$'</code></td><td>Renders a LaTeX-style math expression; wrap the expression in dollar signs</td></tr></tbody></table><h2><span id="3x00-hui-zhi-ji-zuo-biao"><font color="#FF0000">【3x00】Drawing a Polar Plot</font></span></h2><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
<span class="hljs-comment"># Enable Chinese text rendering</span>
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
<span class="hljs-comment"># Set the figure size</span>
plt.figure(figsize=(<span class="hljs-number">8.0</span>, <span class="hljs-number">6.0</span>))
<span class="hljs-comment"># Define three data sets: theta holds the angles in radians, r the radial coordinates</span>
theta1 = np.array([<span 
class="hljs-number">1.25</span>*np.pi, np.pi/<span class="hljs-number">2</span>, <span class="hljs-number">0</span>])
theta2 = np.array([-np.pi/<span class="hljs-number">6</span>, -np.pi/<span class="hljs-number">2</span>, <span class="hljs-number">0</span>, np.pi/<span class="hljs-number">2</span>, np.pi])
theta3 = np.arange(<span class="hljs-number">0.</span>, <span class="hljs-number">2</span>*np.pi, <span class="hljs-number">0.5</span>)
r1 = np.array([<span class="hljs-number">4</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>])
r2 = np.array([<span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">3</span>])
r3 = np.random.randint(<span class="hljs-number">0</span>, <span class="hljs-number">5</span>, <span class="hljs-number">13</span>)
<span class="hljs-comment"># Plot the first data set: thin-diamond markers of size 8, joined by a dotted line (&#x27;:&#x27;)</span>
plt.polar(theta1, r1, marker=<span class="hljs-string">&#x27;d&#x27;</span>, ms=<span class="hljs-number">8</span>, ls=<span class="hljs-string">&#x27;:&#x27;</span>, label=<span class="hljs-string">&#x27;数据一&#x27;</span>)
<span class="hljs-comment"># Fill the first data set in blue with alpha 0.3</span>
plt.fill(theta1, r1, color=<span class="hljs-string">&#x27;b&#x27;</span>, alpha=<span class="hljs-number">0.3</span>)
<span class="hljs-comment"># Plot the second data set; marker, linestyle, and color can be combined into one format string</span>
plt.polar(theta2, r2, <span class="hljs-string">&#x27;*-g&#x27;</span>, ms=<span class="hljs-number">10</span>, label=<span class="hljs-string">&#x27;数据二&#x27;</span>)
<span class="hljs-comment"># Plot the third data set with linestyle &#x27;none&#x27;, so the points are not connected</span>
plt.polar(theta3, r3, marker=<span class="hljs-string">&#x27;o&#x27;</span>, ls=<span class="hljs-string">&#x27;none&#x27;</span>, ms=<span class="hljs-number">8</span>, color=<span class="hljs-string">&#x27;r&#x27;</span>, label=<span class="hljs-string">&#x27;数据三&#x27;</span>)
plt.title(<span 
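class="hljs-string">&#x27;matplotlib.pyplot.polar 用法示例&#x27;</span>, pad=<span class="hljs-number">25</span>, fontsize=<span class="hljs-number">15</span>)
plt.legend(bbox_to_anchor=(<span class="hljs-number">1.3</span>, <span class="hljs-number">1</span>))
plt.show()</code></pre><p>For explanations of figure, title, legend, and the other functions used in this example, see the earlier articles in this series:</p><ul><li><a href="https://itrhx.blog.csdn.net/article/details/105638122">The Python Data Analysis Trio, Matplotlib (Part 1): Getting Started with Matplotlib and Its matplotlibrc Configuration File</a></li><li><a href="https://itrhx.blog.csdn.net/article/details/105828049">The Python Data Analysis Trio, Matplotlib (Part 2): Text, Chinese Font Support, Figures, Grids, and Other Basic Plot Properties</a></li><li><a href="https://itrhx.blog.csdn.net/article/details/105828143">The Python Data Analysis Trio, Matplotlib (Part 3): Legends, LaTeX, Ticks, Subplots, Patches, and Other Basic Plot Properties</a></li></ul><p>The result looks like this:</p><p><img src="https://static.wukongsec.com/itbob/images/article/022/02.png" alt="02"></p><h2><span id="4x00-hui-zhi-lei-da-tu"><font color="#FF0000">【4x00】Drawing a Radar Chart</font></span></h2><p>A radar chart is a graphical method of displaying multivariate data as a two-dimensional chart of three or more quantitative variables represented on axes starting from the same point. The relative position and angle of the axes is typically uninformative. A radar chart is also known as a web chart, spider chart, star chart, cobweb chart, irregular polygon, polar chart, or Kiviat diagram. It is equivalent to a parallel coordinates plot with the axes arranged radially.</p><p>The previous example used <code>matplotlib.pyplot.fill()</code> to fill the shape enclosed by three polar points, which already comes close to a radar chart. Look closely, though: no line joins the first and last points, so the outline is not closed. A proper radar chart should be closed, and its outer ring should carry text labels rather than degrees.</p><p>Before drawing one, it helps to know a few functions that let us close the outline and set custom text labels.</p><hr><h3><span id="4x01-li-jie-numpy-concatenate"><font color="##4876FF">【4x01】Understanding numpy.concatenate()</font></span></h3><p><code>numpy.concatenate()</code> joins a sequence of arrays along an existing axis; we can use it to close the outline.</p><p>Basic syntax: <code>numpy.concatenate((a1, a2, ...)[, axis=0, out=None])</code></p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>a1, a2, …</td><td>The arrays to join; they must have the same shape except along the dimension given by axis</td></tr><tr><td>axis</td><td>The axis along which to join, optional; if axis is None, the arrays are flattened before use. Defaults to 0</td></tr><tr><td>out</td><td>Optional destination array that receives the result</td></tr></tbody></table><p>Usage example:</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
a = np.array([<span class="hljs-number">1</span>, <span 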
class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>])
b = np.array([<span class="hljs-string">&#x27;a&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>, <span class="hljs-string">&#x27;c&#x27;</span>, <span class="hljs-string">&#x27;d&#x27;</span>])
<span class="hljs-built_in">print</span>(np.concatenate((a, b)))</code></pre><p>Output:</p><pre><code class="hljs python">[<span class="hljs-string">&#x27;1&#x27;</span> <span class="hljs-string">&#x27;2&#x27;</span> <span class="hljs-string">&#x27;3&#x27;</span> <span class="hljs-string">&#x27;4&#x27;</span> <span class="hljs-string">&#x27;a&#x27;</span> <span class="hljs-string">&#x27;b&#x27;</span> <span class="hljs-string">&#x27;c&#x27;</span> <span class="hljs-string">&#x27;d&#x27;</span>]</code></pre><p>To close an array, pass in the original array together with a new one-element array that holds the original's first element:</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
a = np.array([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>])
<span class="hljs-built_in">print</span>(np.concatenate((a, [a[<span class="hljs-number">0</span>]])))</code></pre><p>Output:</p><pre><code class="hljs python">[<span class="hljs-number">1</span> <span class="hljs-number">2</span> <span class="hljs-number">3</span> <span class="hljs-number">4</span> <span class="hljs-number">1</span>]</code></pre><hr><h3><span id="4x02-li-jie-pyplot-thetagrids"><font color="##4876FF">【4x02】Understanding pyplot.thetagrids()</font></span></h3><p><code>matplotlib.pyplot.thetagrids()</code> gets or sets the theta gridlines (the angular grid) on the current polar plot.</p><p>Basic syntax: <code>matplotlib.pyplot.thetagrids(angles, labels=None, fmt=None, **kwargs)</code></p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>angles</td><td>The angles of the gridlines: a tuple of floats, in degrees</td></tr><tr><td>labels</td><td>The text label to use at each gridline: a tuple of strings</td></tr><tr><td>fmt</td><td>Format string for the angle values, e.g. <code>'%1.2f'</code> keeps two decimal places; note that the angle in radians is what gets formatted</td></tr><tr><td>**kwargs</td><td>Other keyword arguments; see the <a href="https://matplotlib.org/api/text_api.html?highlight=text#matplotlib.text.Text">official documentation</a></td></tr></tbody></table><p>Example:</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
plt.polar()
angles = <span class="hljs-built_in">range</span>(<span class="hljs-number">0</span>, <span class="hljs-number">360</span>, <span class="hljs-number">45</span>)
labels = (<span class="hljs-string">&#x27;东&#x27;</span>, <span class="hljs-string">&#x27;东北&#x27;</span>, <span class="hljs-string">&#x27;北&#x27;</span>, <span class="hljs-string">&#x27;西北&#x27;</span>, <span class="hljs-string">&#x27;西&#x27;</span>, <span class="hljs-string">&#x27;西南&#x27;</span>, <span class="hljs-string">&#x27;南&#x27;</span>, <span class="hljs-string">&#x27;东南&#x27;</span>)
plt.thetagrids(angles, labels)
plt.title(<span class="hljs-string">&#x27;matplotlib.pyplot.thetagrids() 用法示例&#x27;</span>, pad=<span class="hljs-number">15</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/022/03.png" alt="03"></p><hr><h3><span id="4x03-hui-zhi-lei-da-tu"><font color="##4876FF">【4x03】Drawing the Radar Chart</font></span></h3><p><code>numpy.concatenate()</code> takes care of closing the outline, and <code>matplotlib.pyplot.thetagrids()</code> takes care of custom gridline angles and their text labels, so we can now draw a standard radar chart:</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
<span class="hljs-comment"># Enable Chinese text rendering and set the figure size</span>
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
plt.figure(figsize=(<span 
class="hljs-number">8.0</span>, <span class="hljs-number">6.0</span>))
<span class="hljs-comment"># Split the circle and close the outline (six evenly spaced angles in [0, 2π): 0, π/3, 2π/3, π, 4π/3, 5π/3)</span>
theta = np.linspace(<span class="hljs-number">0</span>, <span class="hljs-number">2</span>*np.pi, <span class="hljs-number">6</span>, endpoint=<span class="hljs-literal">False</span>)
theta = np.concatenate((theta, [theta[<span class="hljs-number">0</span>]]))
<span class="hljs-comment"># Define two data sets and close each one</span>
data1 = np.array([<span class="hljs-number">9</span>, <span class="hljs-number">4</span>, <span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">2</span>, <span class="hljs-number">8</span>])
data2 = np.array([<span class="hljs-number">3</span>, <span class="hljs-number">6</span>, <span class="hljs-number">9</span>, <span class="hljs-number">6</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>])
data1 = np.concatenate((data1, [data1[<span class="hljs-number">0</span>]]))
data2 = np.concatenate((data2, [data2[<span class="hljs-number">0</span>]]))
<span class="hljs-comment"># Plot and fill both data sets</span>
plt.polar(theta, data1, <span class="hljs-string">&#x27;bo-&#x27;</span>, label=<span class="hljs-string">&#x27;小王&#x27;</span>)
plt.polar(theta, data2, <span class="hljs-string">&#x27;ro:&#x27;</span>, label=<span class="hljs-string">&#x27;小张&#x27;</span>)
plt.fill(theta, data1, color=<span class="hljs-string">&#x27;b&#x27;</span>, alpha=<span class="hljs-number">0.3</span>)
plt.fill(theta, data2, color=<span class="hljs-string">&#x27;r&#x27;</span>, alpha=<span class="hljs-number">0.3</span>)
<span class="hljs-comment"># Convert the six angles (0, π/3, 2π/3, π, 4π/3, 5π/3) to degrees and give each one a label</span>
labels = np.array([<span class="hljs-string">&#x27;Python&#x27;</span>, <span class="hljs-string">&#x27;Golang&#x27;</span>, <span class="hljs-string">&#x27;Java&#x27;</span>, <span class="hljs-string">&#x27;C++&#x27;</span>, <span class="hljs-string">&#x27;PHP&#x27;</span>, <span 
class="hljs-string">&#x27;JavaScript&#x27;</span>])
<span class="hljs-comment"># theta[:-1] drops the duplicated closing angle so the tick count matches the six labels</span>
plt.thetagrids(theta[:-<span class="hljs-number">1</span>] * <span class="hljs-number">180</span>/np.pi, labels)
<span class="hljs-comment"># Set the radial range, title, and legend</span>
plt.ylim(<span class="hljs-number">0</span>, <span class="hljs-number">10</span>)
plt.title(<span class="hljs-string">&#x27;编程语言掌握程度&#x27;</span>)
plt.legend(bbox_to_anchor=(<span class="hljs-number">1.3</span>, <span class="hljs-number">1</span>))
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/022/04.png" alt="04"></p><hr><h2><span id="5x00-gao-ji-yong-fa-hui-zhi-ji-zuo-biao-san-dian-tu"><font color="#FF0000">【5x00】Advanced Usage: Polar Scatter Plots</font></span></h2><p><code>matplotlib.pyplot.polar()</code> can produce a polar scatter plot, but on its own it offers limited styling. Here are three other ways to draw one:</p><ul><li><p>Combine <code>matplotlib.pyplot.polar()</code> with <code>matplotlib.pyplot.scatter()</code>: the former creates the polar plot and the latter draws the scatter points on it;</p></li><li><p>Combine <code>matplotlib.pyplot.subplot()</code> with <code>matplotlib.pyplot.scatter()</code>: the former adds a subplot, which becomes a polar plot when <code>projection='polar'</code> is given, and the latter draws the scatter points on it;</p></li><li><p>Combine <code>matplotlib.pyplot.axes()</code> with <code>matplotlib.pyplot.scatter()</code>: the former configures the axes, which become polar when <code>projection='polar'</code> or <code>polar=True</code> is given, and the latter draws the scatter points on them.</p></li></ul><hr><h3><span id="5x01-fang-fa-yi-pyplot-scatter-yu-pyplot-polar"><font color="##4876FF">【5x01】Method 1: pyplot.scatter() with pyplot.polar()</font></span></h3><p>For the meaning of each parameter of <code>matplotlib.pyplot.scatter()</code>, used below, and the other parameters it supports, see these earlier articles:</p><ul><li><p><a 
href="https://itrhx.blog.csdn.net/article/details/105828049">The Python Data Analysis Trio, Matplotlib (Part 2): Text, Chinese Font Support, Figures, Grids, and Other Basic Plot Properties</a></p></li><li><p><a href="https://itrhx.blog.csdn.net/article/details/105914929">The Python Data Analysis Trio, Matplotlib (Part 5): Drawing Scatter Plots</a></p></li></ul><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
N = <span class="hljs-number">50</span>
r = <span class="hljs-number">2</span> * np.random.rand(N)
theta = <span class="hljs-number">2</span> * np.pi * np.random.rand(N)
size = <span class="hljs-number">200</span> * r ** <span class="hljs-number">2</span>
colors = N * np.random.rand(N)
plt.polar()
plt.scatter(theta, r, s=size, c=colors, alpha=<span class="hljs-number">0.8</span>)
plt.title(<span class="hljs-string">&#x27;极坐标散点图示例一&#x27;</span>, pad=<span class="hljs-number">15</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/022/05.png" alt="05"></p><hr><h3><span id="5x02-fang-fa-er-pyplot-scatter-yu-pyplot-subplot"><font color="##4876FF">【5x02】Method 2: pyplot.scatter() with pyplot.subplot()</font></span></h3><p><code>matplotlib.pyplot.subplot()</code> adds a subplot; to make that subplot a polar plot, set the <code>projection</code> parameter to <code>polar</code>. For details on this function see the <a href="https://matplotlib.org/api/_as_gen/matplotlib.pyplot.subplot.html">official documentation</a>. For the parameters of the other functions, see these earlier articles:</p><ul><li><p><a href="https://itrhx.blog.csdn.net/article/details/105828143">The Python Data Analysis Trio, Matplotlib (Part 3): Legends, LaTeX, Ticks, Subplots, Patches, and Other Basic Plot Properties</a></p></li><li><p><a href="https://itrhx.blog.csdn.net/article/details/105914929">The Python Data Analysis Trio, Matplotlib (Part 5): Drawing Scatter Plots</a></p></li></ul><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> 
np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
N = <span class="hljs-number">50</span>
r = <span class="hljs-number">2</span> * np.random.rand(N)
theta = <span class="hljs-number">2</span> * np.pi * np.random.rand(N)
size = <span class="hljs-number">200</span> * r ** <span class="hljs-number">2</span>
colors = N * np.random.rand(N)
<span class="hljs-comment"># First subplot in a grid of one row and one column</span>
plt.subplot(<span class="hljs-number">111</span>, projection=<span class="hljs-string">&#x27;polar&#x27;</span>)
plt.scatter(theta, r, s=size, c=colors, alpha=<span class="hljs-number">0.8</span>)
plt.title(<span class="hljs-string">&#x27;极坐标散点图示例二&#x27;</span>, pad=<span class="hljs-number">15</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/022/06.png" alt="06"></p><hr><h3><span id="5x03-fang-fa-san-pyplot-scatter-yu-pyplot-axes"><font color="##4876FF">【5x03】Method 3: pyplot.scatter() with pyplot.axes()</font></span></h3><p>The axes are the plotting area of a Matplotlib figure, and <code>matplotlib.pyplot.axes()</code> configures them. Here too, setting the <code>projection</code> parameter to <code>polar</code> produces a polar plot, and <code>polar=True</code> works as well. For the parameters of the other functions in the example, see these earlier articles:</p><ul><li><p><a href="https://itrhx.blog.csdn.net/article/details/105638122">The Python Data Analysis Trio, Matplotlib (Part 1): Getting Started with Matplotlib and Its matplotlibrc Configuration File</a></p></li><li><p><a href="https://itrhx.blog.csdn.net/article/details/105914929">The Python Data Analysis Trio, Matplotlib (Part 5): Drawing Scatter Plots</a></p></li></ul><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
N = <span 
class="hljs-number">50</span>
r = <span class="hljs-number">2</span> * np.random.rand(N)
theta = <span class="hljs-number">2</span> * np.pi * np.random.rand(N)
size = <span class="hljs-number">200</span> * r ** <span class="hljs-number">2</span>
colors = N * np.random.rand(N)
<span class="hljs-comment"># Equivalent: plt.axes(polar=True)</span>
plt.axes(projection=<span class="hljs-string">&#x27;polar&#x27;</span>)
plt.scatter(theta, r, s=size, c=colors, alpha=<span class="hljs-number">0.8</span>)
plt.title(<span class="hljs-string">&#x27;极坐标散点图示例三&#x27;</span>, pad=<span class="hljs-number">15</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/022/07.png" alt="07"></p><h2><span id="6x00-gao-ji-yong-fa-hui-zhi-ji-zuo-biao-zhu-zhuang-tu"><font color="#FF0000">【6x00】Advanced Usage: Drawing Polar Bar Charts</font></span></h2><p>As with polar scatter plots, <code>matplotlib.pyplot.polar()</code> can produce a polar plot, but on its own it offers few styling options. Three ways to draw a polar bar chart follow:</p><ul><li><p><code>matplotlib.pyplot.polar()</code> combined with <code>matplotlib.pyplot.bar()</code>: the former sets up the polar plot, and the latter draws the bars on it;</p></li><li><p><code>matplotlib.pyplot.subplot()</code> combined with <code>matplotlib.pyplot.bar()</code>: the former adds a subplot, which is polar when <code>projection='polar'</code> is specified, and the latter draws the bars on it;</p></li><li><p><code>matplotlib.pyplot.axes()</code> combined with <code>matplotlib.pyplot.bar()</code>: the former configures the plotting area, which is polar when <code>projection='polar'</code> or <code>polar=True</code> is specified, and the latter draws the bars on it.</p></li></ul><hr><h3><span id="6x01-fang-fa-yi-pyplot-bar-yu-pyplot-polar"><font color="#4876FF">【6x01】Method 1: pyplot.bar() with pyplot.polar()</font></span></h3><p>For the meaning of each parameter of <code>matplotlib.pyplot.bar()</code> used below, and the other parameters it supports, see the earlier articles:</p><ul><li><p><a href="https://itrhx.blog.csdn.net/article/details/105828049">Python Data Analysis Trio, Matplotlib (2): Text Annotations / Chinese Font Support / Figure / Grid and Other Basic Plot Properties</a></p></li><li><p><a href="https://itrhx.blog.csdn.net/article/details/105952856">Python Data Analysis Trio, Matplotlib (6): Histograms / Vertical Bar Charts / Horizontal Bar Charts</a></p></li></ul><pre><code class="hljs python"><span class="hljs-keyword">import</span> 
numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
r = np.random.rand(<span class="hljs-number">8</span>)
theta = np.arange(<span class="hljs-number">0</span>, <span class="hljs-number">2</span> * np.pi, <span class="hljs-number">2</span> * np.pi / <span class="hljs-number">8</span>)
colors = np.array([<span class="hljs-string">&#x27;#4bb2c5&#x27;</span>, <span class="hljs-string">&#x27;#c5b47f&#x27;</span>, <span class="hljs-string">&#x27;#EAA228&#x27;</span>, <span class="hljs-string">&#x27;#579575&#x27;</span>, <span class="hljs-string">&#x27;#839557&#x27;</span>, <span class="hljs-string">&#x27;#958c12&#x27;</span>, <span class="hljs-string">&#x27;#953579&#x27;</span>, <span class="hljs-string">&#x27;#4b5de4&#x27;</span>])
plt.polar()
plt.bar(theta, r, color=colors, alpha=<span class="hljs-number">0.8</span>)
plt.title(<span class="hljs-string">&#x27;极坐标柱状图示例一&#x27;</span>, pad=<span class="hljs-number">15</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/022/08.png" alt="08"></p><hr><h3><span id="6x02-fang-fa-er-pyplot-bar-yu-pyplot-subplot"><font color="#4876FF">【6x02】Method 2: pyplot.bar() with pyplot.subplot()</font></span></h3><p><code>matplotlib.pyplot.subplot()</code> adds a subplot. To make the subplot polar, set its <code>projection</code> parameter to <code>polar</code>; see the <a href="https://matplotlib.org/api/_as_gen/matplotlib.pyplot.subplot.html">official documentation</a> for details on this function. For the parameters of the other functions, see the earlier articles:</p><ul><li><p><a href="https://itrhx.blog.csdn.net/article/details/105828143">Python Data Analysis Trio, Matplotlib (3): Legends / LaTeX / Ticks / Subplots / Patches and Other Basic Plot Properties</a></p></li><li><p><a href="https://itrhx.blog.csdn.net/article/details/105952856">Python Data Analysis Trio, Matplotlib (6): Histograms / Vertical Bar Charts / Horizontal Bar Charts</a></p></li></ul><pre><code class="hljs python"><span 
class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
r = np.random.rand(<span class="hljs-number">8</span>)
theta = np.arange(<span class="hljs-number">0</span>, <span class="hljs-number">2</span> * np.pi, <span class="hljs-number">2</span> * np.pi / <span class="hljs-number">8</span>)
colors = np.array([<span class="hljs-string">&#x27;#4bb2c5&#x27;</span>, <span class="hljs-string">&#x27;#c5b47f&#x27;</span>, <span class="hljs-string">&#x27;#EAA228&#x27;</span>, <span class="hljs-string">&#x27;#579575&#x27;</span>, <span class="hljs-string">&#x27;#839557&#x27;</span>, <span class="hljs-string">&#x27;#958c12&#x27;</span>, <span class="hljs-string">&#x27;#953579&#x27;</span>, <span class="hljs-string">&#x27;#4b5de4&#x27;</span>])
plt.subplot(<span class="hljs-number">111</span>, projection=<span class="hljs-string">&#x27;polar&#x27;</span>)
plt.bar(theta, r, color=colors, alpha=<span class="hljs-number">0.8</span>)
plt.title(<span class="hljs-string">&#x27;极坐标柱状图示例二&#x27;</span>, pad=<span class="hljs-number">15</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/022/09.png" alt="09"></p><hr><h3><span id="6x03-fang-fa-san-pyplot-bar-yu-pyplot-axes"><font color="#4876FF">【6x03】Method 3: pyplot.bar() with pyplot.axes()</font></span></h3><p>An axes is the plotting area of a Matplotlib figure. <code>matplotlib.pyplot.axes()</code> configures that area, and here too you can set the <code>projection</code> parameter to <code>polar</code> to get a polar plot; <code>polar=True</code> works as well. For the parameters of the other functions in the example, see the earlier articles:</p><ul><li><p><a href="https://itrhx.blog.csdn.net/article/details/105638122">Python Data Analysis Trio, Matplotlib (1): Getting Started with Matplotlib and Its matplotlibrc Configuration File</a></p></li><li><p><a href="https://itrhx.blog.csdn.net/article/details/105952856">Python Data Analysis Trio, Matplotlib (6): Histograms / Vertical Bar Charts / Horizontal Bar Charts</a></p></li></ul><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
r = np.random.rand(<span class="hljs-number">8</span>)
theta = np.arange(<span class="hljs-number">0</span>, <span class="hljs-number">2</span> * np.pi, <span class="hljs-number">2</span> * np.pi / <span class="hljs-number">8</span>)
colors = np.array([<span class="hljs-string">&#x27;#4bb2c5&#x27;</span>, <span class="hljs-string">&#x27;#c5b47f&#x27;</span>, <span class="hljs-string">&#x27;#EAA228&#x27;</span>, <span class="hljs-string">&#x27;#579575&#x27;</span>, <span class="hljs-string">&#x27;#839557&#x27;</span>, <span class="hljs-string">&#x27;#958c12&#x27;</span>, <span class="hljs-string">&#x27;#953579&#x27;</span>, <span class="hljs-string">&#x27;#4b5de4&#x27;</span>])
<span class="hljs-comment"># Equivalent: plt.axes(polar=True)</span>
plt.axes(projection=<span class="hljs-string">&#x27;polar&#x27;</span>)
plt.bar(theta, r, color=colors, alpha=<span class="hljs-number">0.8</span>)
plt.title(<span class="hljs-string">&#x27;极坐标柱状图示例三&#x27;</span>, pad=<span class="hljs-number">15</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/022/10.png" alt="10"></p><hr><pre><code class="hljs yaml"><span class="hljs-string">This is anti-scraping text; readers can ignore it.</span>
<span class="hljs-string">This article was first published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106162412</span>
<span class="hljs-string">Do not repost without permission. Malicious reposting is at your own risk. Respect original work and stay away from plagiarism.</span></code></pre><hr>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;Table of Contents&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-1x00-liao-jie-ji-zuo-biao-font&quot;&gt;&lt;font colo</summary>
      
    
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    <category term="Matplotlib" scheme="https://www.itbob.cn/tags/Matplotlib/"/>
    
  </entry>
  
  <entry>
    <title>Python Data Analysis Trio, Matplotlib (8): Contour / Isoline Plots</title>
    <link href="https://www.itbob.cn/article/021/"/>
    <id>https://www.itbob.cn/article/021/</id>
    <published>2020-05-12T14:35:53.000Z</published>
    <updated>2022-05-22T12:33:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">Table of Contents</span></h2><!-- toc --><ul><li><a href="#font-color-ff0000-1x00-deng-gao-xian-gai-nian-font"><font color="#FF0000">【1x00】The Concept of Contour Lines</font></a></li><li><a href="#font-color-ff0000-2x00-li-jie-numpy-meshgrid-font"><font color="#FF0000">【2x00】Understanding numpy.meshgrid()</font></a></li><li><a href="#font-color-ff0000-3x00-hui-zhi-fang-fa-matplotlib-pyplot-contour-font"><font color="#FF0000">【3x00】Drawing Method: matplotlib.pyplot.contour()</font></a></li><li><a href="#font-color-ff0000-4x00-tian-chong-fang-fa-matplotlib-pyplot-contourf-font"><font color="#FF0000">【4x00】Filling Method: matplotlib.pyplot.contourf()</font></a></li><li><a href="#font-color-ff0000-5x00-biao-ji-fang-fa-matplotlib-pyplot-clabel-font"><font color="#FF0000">【5x00】Labeling Method: matplotlib.pyplot.clabel()</font></a></li><li><a href="#font-color-ff0000-6x00-colormap-qu-zhi-font"><font color="#FF0000">【6x00】Colormap Values</font></a></li><li><a href="#font-color-ff0000-7x00-jian-dan-shi-li-font"><font color="#FF0000">【7x00】A Simple Example</font></a></li><li><a href="#font-color-ff0000-8x00-tian-jia-biao-ji-font"><font color="#FF0000">【8x00】Adding Labels</font></a></li><li><a href="#font-color-ff0000-9x00-lun-kuo-xian-yan-se-he-yang-shi-font"><font color="#FF0000">【9x00】Contour Line Colors and Styles</font></a></li><li><a href="#font-color-ff0000-10x00-yan-se-tian-chong-font"><font color="#FF0000">【10x00】Color Fill</font></a></li></ul><!-- tocstop --><hr><p>Matplotlib series:</p><ul><li><a href="https://www.itbob.cn/article/014/">Python Data Analysis Trio, Matplotlib (1): Getting Started with Matplotlib and Its matplotlibrc Configuration File</a></li><li><a href="https://www.itbob.cn/article/015/">Python Data Analysis Trio, Matplotlib (2): Text Annotations / Chinese Font Support / Figure / Grid and Other Basic Plot Properties</a></li><li><a href="https://www.itbob.cn/article/016/">Python Data Analysis Trio, Matplotlib (3): Legends / LaTeX / Ticks / Subplots / Patches and Other Basic Plot Properties</a></li><li><a href="https://www.itbob.cn/article/017/">Python Data Analysis Trio, Matplotlib (4): Line Plots</a></li><li><a href="https://www.itbob.cn/article/018/">Python Data Analysis Trio, Matplotlib (5): Scatter Plots</a></li><li><a 
href="https://www.itbob.cn/article/019/">Python Data Analysis Trio, Matplotlib (6): Histograms / Vertical Bar Charts / Horizontal Bar Charts</a></li><li><a href="https://www.itbob.cn/article/020/">Python Data Analysis Trio, Matplotlib (7): Pie Charts</a></li><li><a href="https://www.itbob.cn/article/021/">Python Data Analysis Trio, Matplotlib (8): Contour / Isoline Plots</a></li><li><a href="https://www.itbob.cn/article/022/">Python Data Analysis Trio, Matplotlib (9): Polar Area Charts / Polar Plots / Radar Charts</a></li><li><a href="https://www.itbob.cn/article/023/">Python Data Analysis Trio, Matplotlib (10): 3D Plots</a></li><li><a href="https://www.itbob.cn/article/024/">Python Data Analysis Trio, Matplotlib (11): The 50 Most Popular and Most Used Charts</a> [translation]</li></ul><hr><p>Columns:</p><ul><li>NumPy column: <a href="https://itrhx.blog.csdn.net/category_9780393.html">https://itrhx.blog.csdn.net/category_9780393.html</a></li><li>Pandas column: <a href="https://itrhx.blog.csdn.net/category_9780397.html">https://itrhx.blog.csdn.net/category_9780397.html</a></li><li>Matplotlib column: <a href="https://itrhx.blog.csdn.net/category_9780418.html">https://itrhx.blog.csdn.net/category_9780418.html</a></li></ul><br>Recommended learning resources and sites:<br><br><ul><li>NumPy Chinese-language site: <a href="https://www.numpy.org.cn/">https://www.numpy.org.cn/</a></li><li>Pandas Chinese-language site: <a href="https://www.pypandas.cn/">https://www.pypandas.cn/</a></li><li>Matplotlib Chinese-language site: <a href="https://www.matplotlib.org.cn/">https://www.matplotlib.org.cn/</a></li><li>NumPy, Matplotlib, and Pandas quick-reference tables: <a href="https://github.com/TRHX/Python-quick-reference-table">https://github.com/TRHX/Python-quick-reference-table</a></li></ul><hr><pre><code class="hljs yaml"><span class="hljs-string">This is anti-scraping text; readers can ignore it.</span>
<span class="hljs-string">This article was first published on CSDN by ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106066852</span>
<span class="hljs-string">Do not repost without permission. Malicious reposting is at your own risk. Respect original work and stay away from plagiarism.</span></code></pre><hr><h2><span id="1x00-deng-gao-xian-gai-nian"><font 
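color="#FF0000">【1x00】The Concept of Contour Lines</font></span></h2><p>Drawing on Baidu Baike, the concept of contour lines can be summarized as follows. A contour line is a closed curve on a topographic map that connects adjacent points of equal elevation. Points on the ground at the same altitude are joined into closed curves, projected vertically onto a horizontal plane, and drawn to scale on the map; the result is a set of contour lines. A contour line can also be viewed as the intersection of the actual terrain with a horizontal plane at a given altitude, which is why contour lines are closed curves. The number marked on a contour line is its altitude.</p><ul><li>Ground points on the same contour line have the same altitude, but points with the same altitude do not necessarily lie on the same contour line;</li><li>Within one map, contour lines of different elevations cannot intersect, except at cliffs;</li><li>Within the map frame, the height difference between adjacent contour lines is normally constant, so the ground slope is inversely proportional to the horizontal distance between contour lines: the smaller the distance and the denser the lines, the steeper the slope; the larger the distance and the sparser the lines, the gentler the slope;</li><li>A contour line is a closed curve; if it does not close within one map sheet, it must close in an adjacent or other sheet;</li><li>Contour lines change direction where they cross a ridge or valley, so the ridge line or valley line is perpendicular to the tangent of the contour at its turning point, i.e. contour lines are orthogonal to ridge and valley lines.</li></ul><p><font color="#FF0000"><strong>To draw contours in Matplotlib you pass three basic arguments: the x and y coordinates of each point and its height.</strong></font></p><p><img src="https://static.wukongsec.com/itbob/images/article/021/01.png" alt="01"></p><p><img src="https://static.wukongsec.com/itbob/images/article/021/02.jpg" alt="02"></p><h2><span id="2x00-li-jie-numpy-meshgrid"><font color="#FF0000">【2x00】Understanding numpy.meshgrid()</font></span></h2><p><code>numpy.meshgrid()</code> generates the coordinate matrices of a grid of points.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
a = np.array([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>])
b = np.array([<span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>])
res = np.meshgrid(a, b)
<span class="hljs-built_in">print</span>(res)</code></pre><p>Output:</p><pre><code class="hljs python">[array([[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>],
       [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>],
       [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>]]), array([[<span class="hljs-number">7</span>, <span class="hljs-number">7</span>, <span class="hljs-number">7</span>],
       [<span class="hljs-number">8</span>, <span class="hljs-number">8</span>, <span 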
class="hljs-number">8</span>],
       [<span class="hljs-number">9</span>, <span class="hljs-number">9</span>, <span class="hljs-number">9</span>]])]</code></pre><p>Take two arrays, <code>a = [1, 2, 3]</code> and <code>b = [7, 8, 9]</code>, with a as the x-axis data and b as the y-axis data. Together they define nine points: (1,7), (1,8), (1,9), (2,7), (2,8), (2,9), (3,7), (3,8), (3,9). That is exactly what <code>numpy.meshgrid()</code> produces: it returns two 2D arrays, and each element of the x-coordinate matrix pairs with the element at the same position in the y-coordinate matrix to form the full coordinates of one point.</p><p><font color="#FF0000"><strong>Because <code>matplotlib.pyplot.contour()</code> needs the heights as a 2D array evaluated over a grid of points, the raw data is usually passed through <code>numpy.meshgrid()</code> before a contour plot is drawn; you can also build equivalent 2D arrays yourself.</strong></font></p><p><img src="https://img-blog.csdnimg.cn/20200512112427932.png" alt="Divider"></p><h2><span id="3x00-hui-zhi-fang-fa-matplotlib-pyplot-contour"><font color="#FF0000">【3x00】Drawing Method: matplotlib.pyplot.contour()</font></span></h2><p><code>matplotlib.pyplot.contour()</code> draws contour plots.</p><p>Basic syntax: <code>matplotlib.pyplot.contour(*args, data=None, **kwargs)</code></p><p>General form: <code>matplotlib.pyplot.contour([X, Y,] Z, [levels], **kwargs)</code></p><p>Basic parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>X, Y</td><td>Array-like x and y coordinates of the points. Either both 2D with the same shape as Z (for example from <code>numpy.meshgrid()</code>), or both 1D, with len(X) equal to the number of columns of Z and len(Y) equal to the number of rows</td></tr><tr><td>Z</td><td>Height values over which the contours are drawn: a 2D array in which each element is the height at the corresponding point</td></tr><tr><td>levels</td><td>Determines the number and positions of the contour lines. If an integer N, N data intervals are used, i.e. up to N+1 contour lines are drawn at automatically chosen levels<br>If array-like, contour lines are drawn at the specified levels, which must be in increasing order</td></tr></tbody></table><p>Other parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>colors</td><td>Colors of the contour lines: a color string or a sequence of colors</td></tr><tr><td>cmap</td><td>Colormap for the contour lines: a string or a <a href="https://matplotlib.org/tutorials/colors/colormaps.html">Colormap</a><br>Usually a range of gradient colors or another color combination; for values see <font color="#FF0000"><strong>【6x00】Colormap Values</strong></font></td></tr><tr><td>alpha</td><td>Transparency, between 0 (transparent) and 1 (opaque)</td></tr><tr><td>origin</td><td>Determines the orientation and exact position of Z by specifying the position of Z[0, 0]; only relevant when X and Y are not given<br><code>None</code>: Z[0, 0] is at X=0, Y=0 in the lower-left corner<br><code>'lower'</code>: Z[0, 0] is at X=0.5, Y=0.5 in the lower-left corner<br><code>'upper'</code>: Z[0, 0] is at X=N+0.5, Y=0.5 in the upper-left corner<br><code>'image'</code>: uses the value of <code>rcParams["image.origin"]</code> (default <code>'upper'</code>)</td></tr><tr><td>antialiased</td><td>Whether to enable antialiased rendering; default True</td></tr><tr><td>linewidths</td><td>Line width of the contour lines. If a number, all contour lines use this width<br>If a sequence, the levels in ascending order are drawn with the widths in the order given<br>Defaults to <code>rcParams["lines.linewidth"]</code> (default 1.5)</td></tr><tr><td>linestyles</td><td>Line style of the contour lines. If the lines are a single color, negative contours are dashed by default<br><code>'-'</code> or <code>'solid'</code>, <code>'--'</code> or <code>'dashed'</code>, <code>'-.'</code> or <code>'dashdot'</code>, <code>':'</code> or <code>'dotted'</code>, <code>'none'</code> or <code>' '</code> or <code>''</code></td></tr></tbody></table><p><img src="https://img-blog.csdnimg.cn/20200512112427932.png" alt="Divider"></p><h2><span id="4x00-tian-chong-fang-fa-matplotlib-pyplot-contourf"><font color="#FF0000">【4x00】Filling Method: matplotlib.pyplot.contourf()</font></span></h2><p><code>matplotlib.pyplot.contourf()</code> differs from <code>matplotlib.pyplot.contour()</code> in that <code>contourf()</code> fills the regions between contour lines with color (filled contours). Apart from that, the two share the same function signature and return value.</p><p>Basic syntax: <code>matplotlib.pyplot.contourf(*args, data=None, **kwargs)</code></p><p>General form: <code>matplotlib.pyplot.contourf([X, Y,] Z, [levels], **kwargs)</code></p><p>Basic parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>X, Y</td><td>Array-like x and y coordinates of the points. Either both 2D with the same shape as Z (for example from <code>numpy.meshgrid()</code>), or both 1D, with len(X) equal to the number of columns of Z and len(Y) equal to the number of rows</td></tr><tr><td>Z</td><td>Height values over which the contours are drawn: a 2D array in which each element is the height at the corresponding point</td></tr><tr><td>levels</td><td>Determines the number and positions of the contour levels. If an integer N, N data intervals are used, i.e. up to N+1 levels are chosen automatically<br>If array-like, the specified levels are used; they must be in increasing order</td></tr></tbody></table><p>Other parameters:</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>colors</td><td>Fill colors of the contour regions: a color string or a sequence of colors</td></tr><tr><td>cmap</td><td>Colormap for the fill: a string or a <a href="https://matplotlib.org/tutorials/colors/colormaps.html">Colormap</a><br>Usually a range of gradient colors or another color combination; for values see <font color="#FF0000"><strong>【6x00】Colormap Values</strong></font></td></tr><tr><td>alpha</td><td>Transparency, between 0 (transparent) and 1 (opaque)</td></tr><tr><td>origin</td><td>Determines the orientation and exact position of Z by specifying the position of Z[0, 0]; only relevant when X and Y are not given<br><code>None</code>: Z[0, 0] is at X=0, Y=0 in the lower-left corner<br><code>'lower'</code>: Z[0, 0] is at X=0.5, Y=0.5 in the lower-left corner<br><code>'upper'</code>: Z[0, 0] is at X=N+0.5, Y=0.5 in the upper-left corner<br><code>'image'</code>: uses the value of <code>rcParams["image.origin"]</code> (default <code>'upper'</code>)</td></tr><tr><td>antialiased</td><td>Whether to enable antialiased rendering; default True</td></tr><tr><td>linewidths</td><td>Line width of the contour lines (applies to <code>contour()</code> only). If a number, all contour lines use this width<br>If a sequence, the levels in ascending order are drawn with the widths in the order given<br>Defaults to <code>rcParams["lines.linewidth"]</code> (default 1.5)</td></tr><tr><td>linestyles</td><td>Line style of the contour lines (applies to <code>contour()</code> only). If the lines are a single color, negative contours are dashed by default<br><code>'-'</code> or <code>'solid'</code>, <code>'--'</code> or <code>'dashed'</code>, <code>'-.'</code> or <code>'dashdot'</code>, <code>':'</code> or <code>'dotted'</code>, <code>'none'</code> or <code>' '</code> or <code>''</code></td></tr></tbody></table><p><img src="https://img-blog.csdnimg.cn/20200512180336350.png" alt="Divider"></p><h2><span id="5x00-biao-ji-fang-fa-matplotlib-pyplot-clabel"><font color="#FF0000">【5x00】Labeling Method: matplotlib.pyplot.clabel()</font></span></h2><p><code>matplotlib.pyplot.clabel(CS, *args, **kwargs)</code> adds labels to a contour plot.</p><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>CS</td><td>The ContourSet object to label, i.e. the object returned by <code>pyplot.contour()</code></td></tr><tr><td>levels</td><td>The contour levels to label, array-like; if omitted, all levels are labeled</td></tr><tr><td>fontsize</td><td>Font size of the labels: a size in points, or one of<br><code>'xx-small'</code>, <code>'x-small'</code>, <code>'small'</code>, <code>'medium'</code>, <code>'large'</code>, <code>'x-large'</code>, <code>'xx-large'</code></td></tr><tr><td>colors</td><td>Label colors: a color string or a sequence of colors</td></tr><tr><td>inline</td><td>Whether to remove the contour line underneath the label; bool, default True</td></tr><tr><td>inline_spacing</td><td>Width of contour line removed on each side of an inline label, in pixels; float, default 5</td></tr><tr><td>fmt</td><td>Format for the labels; str or dict. The default listed here is <code>%1.3f</code> (newer Matplotlib releases fall back to a ScalarFormatter when <code>fmt</code> is omitted)</td></tr><tr><td>rightside_up</td><td>Whether label rotations are always kept within plus or minus 90 degrees of horizontal; bool, default True</td></tr><tr><td>use_clabeltext</td><td>Default False. If True, labels are created with the <a href="https://matplotlib.org/api/contour_api.html#matplotlib.contour.ClabelText">ClabelText</a> class (instead of <a href="https://matplotlib.org/api/text_api.html#matplotlib.text.Text">Text</a>)<br>ClabelText recalculates the text rotation angle at draw time, which is useful if the angle of the axes changes</td></tr></tbody></table><p><img src="https://img-blog.csdnimg.cn/20200512112427932.png" alt="Divider"></p><hr><h2><span id="6x00-colormap-qu-zhi"><font color="#FF0000">【6x00】Colormap Values</font></span></h2><p>In <code>matplotlib.pyplot.contour()</code> and <code>matplotlib.pyplot.contourf()</code>, the <code>cmap</code> parameter sets the colors of the contours. Its value is usually the name of a Colormap, which typically contains a range of gradient colors or another color combination. See the figure below.</p><p>Official documentation: <a href="https://matplotlib.org/tutorials/colors/colormaps.html">https://matplotlib.org/tutorials/colors/colormaps.html</a></p><p><img src="https://static.wukongsec.com/itbob/images/article/021/03.png" alt="03"></p><p><img src="https://img-blog.csdnimg.cn/20200512112427932.png" alt="Divider"></p><h2><span id="7x00-jian-dan-shi-li"><font color="#FF0000">【7x00】A Simple Example</font></span></h2><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
x = np.arange(-<span class="hljs-number">2.0</span>, <span class="hljs-number">2.0</span>, <span class="hljs-number">0.01</span>)
y = np.arange(-<span class="hljs-number">2.0</span>, <span class="hljs-number">2.0</span>, <span class="hljs-number">0.01</span>)
m, n = np.meshgrid(x, y)        <span 
class="hljs-comment"># Build the grid-point coordinate matrices</span>
<span class="hljs-comment"># Define a function that computes the height at each point; a 2D array of heights could be used directly instead</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">f</span>(<span class="hljs-params">a, b</span>):</span>
    <span class="hljs-keyword">return</span> (<span class="hljs-number">1</span> - b ** <span class="hljs-number">5</span> + a ** <span class="hljs-number">5</span>) * np.exp(-a ** <span class="hljs-number">2</span> - b ** <span class="hljs-number">2</span>)
<span class="hljs-comment"># Draw the contour plot: 8 data intervals, black lines</span>
plt.contour(m, n, f(m, n), <span class="hljs-number">8</span>, colors=<span class="hljs-string">&#x27;k&#x27;</span>)
plt.title(<span class="hljs-string">&#x27;等高线图简单示例&#x27;</span>)
plt.xlabel(<span class="hljs-string">&#x27;x axis label&#x27;</span>)
plt.ylabel(<span class="hljs-string">&#x27;y axis label&#x27;</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/021/04.png" alt="04"></p><p><img src="https://img-blog.csdnimg.cn/20200512112427932.png" alt="Divider"></p><h2><span id="8x00-tian-jia-biao-ji"><font color="#FF0000">【8x00】Adding Labels</font></span></h2><p><code>matplotlib.pyplot.clabel()</code> adds labels to contour lines.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
x = np.arange(-<span class="hljs-number">2.0</span>, <span class="hljs-number">2.0</span>, <span class="hljs-number">0.01</span>)
y = np.arange(-<span class="hljs-number">2.0</span>, <span class="hljs-number">2.0</span>, <span class="hljs-number">0.01</span>)
m, n = np.meshgrid(x, y)        <span class="hljs-comment"># Build the grid-point coordinate matrices</span>
<span class="hljs-comment"># Define a function that computes the height at each point; a 2D array of heights could be used directly instead</span>
<span 
class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">f</span>(<span class="hljs-params">a, b</span>):</span>
    <span class="hljs-keyword">return</span> (<span class="hljs-number">1</span> - b ** <span class="hljs-number">5</span> + a ** <span class="hljs-number">5</span>) * np.exp(-a ** <span class="hljs-number">2</span> - b ** <span class="hljs-number">2</span>)
<span class="hljs-comment"># Draw the contour plot: 8 data intervals, black lines</span>
C = plt.contour(m, n, f(m, n), <span class="hljs-number">8</span>, colors=<span class="hljs-string">&#x27;k&#x27;</span>)
<span class="hljs-comment"># Add labels: hide the contour line under each label, cycle through black, red, green, and blue, two decimal places</span>
plt.clabel(C, inline=<span class="hljs-literal">True</span>, colors=[<span class="hljs-string">&#x27;k&#x27;</span>, <span class="hljs-string">&#x27;r&#x27;</span>, <span class="hljs-string">&#x27;g&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>], fmt=<span class="hljs-string">&#x27;%1.2f&#x27;</span>)
plt.title(<span class="hljs-string">&#x27;等高线图添加标记示例&#x27;</span>)
plt.xlabel(<span class="hljs-string">&#x27;x axis label&#x27;</span>)
plt.ylabel(<span class="hljs-string">&#x27;y axis label&#x27;</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/021/05.png" alt="05"></p><p><img src="https://img-blog.csdnimg.cn/20200512112427932.png" alt="Divider"></p><h2><span id="9x00-lun-kuo-xian-yan-se-he-yang-shi"><font color="#FF0000">【9x00】Contour Line Colors and Styles</font></span></h2><p>In <code>matplotlib.pyplot.contour()</code>, the <code>colors</code> parameter sets the color of the contour lines, either a single color or a list of colors, and the <code>linestyles</code> parameter sets the line style. Note that if the lines are a single color, negative contours (negative height values) are dashed by default.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
x = 
np.arange(-<span class="hljs-number">2.0</span>, <span class="hljs-number">2.0</span>, <span class="hljs-number">0.01</span>)
y = np.arange(-<span class="hljs-number">2.0</span>, <span class="hljs-number">2.0</span>, <span class="hljs-number">0.01</span>)
m, n = np.meshgrid(x, y)        <span class="hljs-comment"># Build the grid-point coordinate matrices</span>
<span class="hljs-comment"># Define a function that computes the height at each point; a 2D array of heights could be used directly instead</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">f</span>(<span class="hljs-params">a, b</span>):</span>
    <span class="hljs-keyword">return</span> (<span class="hljs-number">1</span> - b ** <span class="hljs-number">5</span> + a ** <span class="hljs-number">5</span>) * np.exp(-a ** <span class="hljs-number">2</span> - b ** <span class="hljs-number">2</span>)
colors = [<span class="hljs-string">&#x27;k&#x27;</span>, <span class="hljs-string">&#x27;r&#x27;</span>, <span class="hljs-string">&#x27;g&#x27;</span>, <span class="hljs-string">&#x27;b&#x27;</span>]
<span class="hljs-comment"># Draw the contour plot: 8 data intervals, lines cycling through black, red, green, and blue, line style --</span>
C = plt.contour(m, n, f(m, n), <span class="hljs-number">8</span>, colors=colors, linestyles=<span class="hljs-string">&#x27;--&#x27;</span>)
<span class="hljs-comment"># Add labels: hide the contour line under each label, same four colors, two decimal places</span>
plt.clabel(C, inline=<span class="hljs-literal">True</span>, colors=colors, fmt=<span class="hljs-string">&#x27;%1.2f&#x27;</span>)
plt.title(<span class="hljs-string">&#x27;等高线图设置颜色/样式示例&#x27;</span>)
plt.xlabel(<span class="hljs-string">&#x27;x axis label&#x27;</span>)
plt.ylabel(<span class="hljs-string">&#x27;y axis label&#x27;</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/021/06.png" alt="06"></p><p>To use gradient colors instead, set <code>cmap</code> (for values see <font color="#FF0000"><strong>【6x00】Colormap Values</strong></font>); the <code>colorbar()</code> method displays a color bar for reference.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span 
class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
x = np.arange(-<span class="hljs-number">2.0</span>, <span class="hljs-number">2.0</span>, <span class="hljs-number">0.01</span>)
y = np.arange(-<span class="hljs-number">2.0</span>, <span class="hljs-number">2.0</span>, <span class="hljs-number">0.01</span>)
m, n = np.meshgrid(x, y)        <span class="hljs-comment"># Build the grid-point coordinate matrices</span>
<span class="hljs-comment"># Define a function that computes the height at each point; a 2D array of heights could be used directly instead</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">f</span>(<span class="hljs-params">a, b</span>):</span>
    <span class="hljs-keyword">return</span> (<span class="hljs-number">1</span> - b ** <span class="hljs-number">5</span> + a ** <span class="hljs-number">5</span>) * np.exp(-a ** <span class="hljs-number">2</span> - b ** <span class="hljs-number">2</span>)
<span class="hljs-comment"># Draw the contour plot: 8 data intervals, plasma colormap</span>
C = plt.contour(m, n, f(m, n), <span class="hljs-number">8</span>, cmap=<span class="hljs-string">&#x27;plasma&#x27;</span>)
<span class="hljs-comment"># Add labels: hide the contour line under each label, black text, two decimal places</span>
plt.clabel(C, inline=<span class="hljs-literal">True</span>, colors=<span class="hljs-string">&#x27;k&#x27;</span>, fmt=<span class="hljs-string">&#x27;%1.2f&#x27;</span>)
<span class="hljs-comment"># Show the color bar</span>
plt.colorbar()
plt.title(<span class="hljs-string">&#x27;等高线图设置渐变色示例&#x27;</span>)
plt.xlabel(<span class="hljs-string">&#x27;x axis label&#x27;</span>)
plt.ylabel(<span class="hljs-string">&#x27;y axis label&#x27;</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/021/07.png" alt="07"></p><p><img src="https://img-blog.csdnimg.cn/20200512112427932.png" alt="Divider"></p><h2><span id="10x00-yan-se-tian-chong"><font 
color="#FF0000">【10x00】Color Fill</font></span></h2><p><code>matplotlib.pyplot.contourf()</code> fills the regions between contour lines with color.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
x = np.arange(-<span class="hljs-number">2.0</span>, <span class="hljs-number">2.0</span>, <span class="hljs-number">0.01</span>)
y = np.arange(-<span class="hljs-number">2.0</span>, <span class="hljs-number">2.0</span>, <span class="hljs-number">0.01</span>)
m, n = np.meshgrid(x, y)        <span class="hljs-comment"># Build the grid-point coordinate matrices</span>
<span class="hljs-comment"># Define a function that computes the height at each point; a 2D array of heights could be used directly instead</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">f</span>(<span class="hljs-params">a, b</span>):</span>
    <span class="hljs-keyword">return</span> (<span class="hljs-number">1</span> - b ** <span class="hljs-number">5</span> + a ** <span class="hljs-number">5</span>) * np.exp(-a ** <span class="hljs-number">2</span> - b ** <span class="hljs-number">2</span>)
<span class="hljs-comment"># Fill the regions, then draw the contour lines on top: 8 data intervals, plasma colormap</span>
plt.contourf(m, n, f(m, n), <span class="hljs-number">8</span>, cmap=<span class="hljs-string">&#x27;plasma&#x27;</span>)
C = plt.contour(m, n, f(m, n), <span class="hljs-number">8</span>, cmap=<span class="hljs-string">&#x27;plasma&#x27;</span>)
<span class="hljs-comment"># Add labels: hide the contour line under each label, black text, two decimal places</span>
plt.clabel(C, inline=<span class="hljs-literal">True</span>, colors=<span class="hljs-string">&#x27;k&#x27;</span>, fmt=<span class="hljs-string">&#x27;%1.2f&#x27;</span>)
<span class="hljs-comment"># Show the color bar</span>
plt.colorbar()
plt.title(<span class="hljs-string">&#x27;等高线图颜色填充示例&#x27;</span>)
plt.xlabel(<span class="hljs-string">&#x27;x axis 
label&#x27;</span>)plt.ylabel(<span class="hljs-string">&#x27;y axis label&#x27;</span>)plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/021/08.png" alt="08"></p><hr><pre><code class="hljs yaml"><span class="hljs-string">这里是一段物理防爬虫文本，请读者忽略。</span><span class="hljs-string">本文原创首发于</span> <span class="hljs-string">CSDN，作者</span> <span class="hljs-string">ITBOB。</span><span class="hljs-string">博客首页：https://itrhx.blog.csdn.net/</span><span class="hljs-string">本文链接：https://itrhx.blog.csdn.net/article/details/106066852</span><span class="hljs-string">未经授权，禁止转载！恶意转载，后果自负！尊重原创，远离剽窃！</span></code></pre><hr>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;Table of Contents&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-1x00-deng-gao-xian-gai-nian-font&quot;&gt;&lt;font co</summary>
      
    
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    <category term="Matplotlib" scheme="https://www.itbob.cn/tags/Matplotlib/"/>
    
  </entry>
  
  <entry>
    <title>Python Data Analysis Trio: Matplotlib (7): Drawing Pie Charts</title>
    <link href="https://www.itbob.cn/article/020/"/>
    <id>https://www.itbob.cn/article/020/</id>
    <published>2020-05-11T16:43:20.000Z</published>
    <updated>2022-05-22T12:32:00.000Z</updated>
    
    <content type="html"><![CDATA[<h2><span id="wen-zhang-mu-lu">Table of Contents</span></h2><!-- toc --><ul><li><a href="#font-color-ff0000-1x00-fang-fa-miao-shu-font"><font color="#FF0000">【1x00】Method Description</font></a></li><li><a href="#font-color-ff0000-2x00-jian-dan-shi-li-font"><font color="#FF0000">【2x00】Simple Example</font></a></li><li><a href="#font-color-ff0000-3x00-an-jiao-du-diao-zheng-shan-xing-biao-qian-font"><font color="#FF0000">【3x00】Rotating Wedge Labels by Angle</font></a></li><li><a href="#font-color-ff0000-4x00-xian-shi-tu-li-font"><font color="#FF0000">【4x00】Showing a Legend</font></a></li><li><a href="#font-color-ff0000-5x00-tu-chu-xian-shi-shan-xing-kuai-font"><font color="#FF0000">【5x00】Highlighting a Wedge</font></a></li><li><a href="#font-color-ff0000-6x00-xian-shi-ge-shan-qu-suo-zhan-bai-fen-bi-font"><font color="#FF0000">【6x00】Showing Each Wedge's Percentage</font></a></li><li><a href="#font-color-ff0000-7x00-xuan-zhuan-bing-zhuang-tu-font"><font color="#FF0000">【7x00】Rotating the Pie Chart</font></a></li><li><a href="#font-color-ff0000-8x00-zi-ding-yi-mei-ge-shan-xing-he-wen-zi-shu-xing-font"><font color="#FF0000">【8x00】Customizing Wedge and Text Properties</font></a></li></ul><!-- tocstop --><hr><p>Matplotlib series:</p><ul><li><a href="https://www.itbob.cn/article/014/">Python Data Analysis Trio: Matplotlib (1): Getting to Know Matplotlib and Its matplotlibrc Configuration File</a></li><li><a href="https://www.itbob.cn/article/015/">Python Data Analysis Trio: Matplotlib (2): Text, Chinese Font Support, Canvas, Grid, and Other Basic Figure Properties</a></li><li><a href="https://www.itbob.cn/article/016/">Python Data Analysis Trio: Matplotlib (3): Legends, LaTeX, Ticks, Subplots, Patches, and Other Basic Figure Properties</a></li><li><a href="https://www.itbob.cn/article/017/">Python Data Analysis Trio: Matplotlib (4): Drawing Line Charts</a></li><li><a href="https://www.itbob.cn/article/018/">Python Data Analysis Trio: Matplotlib (5): Drawing Scatter Plots</a></li><li><a href="https://www.itbob.cn/article/019/">Python Data Analysis Trio: Matplotlib (6): Drawing Histograms, Bar Charts, and Horizontal Bar Charts</a></li><li><a href="https://www.itbob.cn/article/020/">Python Data Analysis Trio: Matplotlib (7): Drawing Pie Charts</a></li><li><a href="https://www.itbob.cn/article/021/">Python Data Analysis Trio: Matplotlib (8): Drawing Contour and Isoline Plots</a></li><li><a 
href="https://www.itbob.cn/article/022/">Python Data Analysis Trio: Matplotlib (9): Drawing Polar Area Charts, Polar Plots, and Radar Charts</a></li><li><a href="https://www.itbob.cn/article/023/">Python Data Analysis Trio: Matplotlib (10): Drawing 3D Plots</a></li><li><a href="https://www.itbob.cn/article/024/">Python Data Analysis Trio: Matplotlib (11): The 50 Most Popular and Commonly Used Charts</a> (translation)</li></ul><hr><p>Columns:</p><ul><li>NumPy column: <a href="https://itrhx.blog.csdn.net/category_9780393.html">https://itrhx.blog.csdn.net/category_9780393.html</a></li><li>Pandas column: <a href="https://itrhx.blog.csdn.net/category_9780397.html">https://itrhx.blog.csdn.net/category_9780397.html</a></li><li>Matplotlib column: <a href="https://itrhx.blog.csdn.net/category_9780418.html">https://itrhx.blog.csdn.net/category_9780418.html</a></li></ul><br>Recommended learning resources and sites:<br><br><ul><li>NumPy Chinese documentation site: <a href="https://www.numpy.org.cn/">https://www.numpy.org.cn/</a></li><li>Pandas Chinese documentation site: <a href="https://www.pypandas.cn/">https://www.pypandas.cn/</a></li><li>Matplotlib Chinese documentation site: <a href="https://www.matplotlib.org.cn/">https://www.matplotlib.org.cn/</a></li><li>NumPy, Matplotlib, and Pandas cheat sheets: <a href="https://github.com/TRHX/Python-quick-reference-table">https://github.com/TRHX/Python-quick-reference-table</a></li></ul><hr><pre><code class="hljs yaml"><span class="hljs-string">This is anti-scraping text; readers can ignore it.</span>
<span class="hljs-string">This article was first published on CSDN. Author: ITBOB.</span>
<span class="hljs-string">Blog home: https://itrhx.blog.csdn.net/</span>
<span class="hljs-string">Article link: https://itrhx.blog.csdn.net/article/details/106025845</span>
<span class="hljs-string">Do not repost without authorization. Malicious reposting is at your own risk. Respect original work and do not plagiarize.</span></code></pre><hr><h2><span id="1x00-fang-fa-miao-shu"><font color="#FF0000">【1x00】Method Description</font></span></h2><p>The <code>matplotlib.pyplot.pie()</code> method draws a pie chart.</p><p>Basic syntax:</p><pre><code class="hljs python">matplotlib.pyplot.pie(
        x[, explode=<span class="hljs-literal">None</span>, labels=<span class="hljs-literal">None</span>, colors=<span class="hljs-literal">None</span>,
        autopct=<span class="hljs-literal">None</span>, pctdistance=<span class="hljs-number">0.6</span>, shadow=<span class="hljs-literal">False</span>,
        labeldistance=<span class="hljs-number">1.1</span>, startangle=<span class="hljs-literal">None</span>, radius=<span class="hljs-literal">None</span>,
        counterclock=<span class="hljs-literal">True</span>, wedgeprops=<span class="hljs-literal">None</span>, textprops=<span class="hljs-literal">None</span>,
        center=(<span class="hljs-number">0</span>, <span class="hljs-number">0</span>), frame=<span class="hljs-literal">False</span>, rotatelabels=<span class="hljs-literal">False</span>, *, data=<span class="hljs-literal">None</span>]
        )</code></pre><table><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td>x</td><td>Size of each wedge, array-like; the values are relative proportions (each wedge's share is x / sum(x))</td></tr><tr><td>explode</td><td>Fraction of the radius by which to offset each wedge from the pie, array-like with len(x) elements</td></tr><tr><td>labels</td><td>Text label for each wedge, as a list</td></tr><tr><td>labeldistance</td><td>Distance from the pie center at which each label is drawn, relative to the radius; float, default 1.1</td></tr><tr><td>colors</td><td>Color of each wedge, array-like</td></tr><tr><td>autopct</td><td>Labels each wedge with its percentage; a format string or a function<br>For example, <code>autopct='%1.1f%%'</code> formats the value as a float with one decimal place followed by a percent sign</td></tr><tr><td>pctdistance</td><td>Distance from the pie center at which the text generated by autopct is drawn, relative to the radius; float, default 0.6</td></tr><tr><td>shadow</td><td>Whether to draw a shadow beneath the wedges</td></tr><tr><td>startangle</td><td>Angle by which the start of the pie is rotated counterclockwise; float</td></tr><tr><td>radius</td><td>Radius of the pie; float. None is treated as 1</td></tr><tr><td>counterclock</td><td>Whether the wedges are laid out counterclockwise; bool, default True</td></tr><tr><td>wedgeprops</td><td>Dict of arguments passed to each wedge object; for the available keys see <a href="https://matplotlib.org/api/_as_gen/matplotlib.patches.Wedge.html#matplotlib.patches.Wedge">Wedge</a><br>For example, <code>wedgeprops = &#123;'linewidth': 3&#125;</code> sets the wedge edge line width to 3</td></tr><tr><td>textprops</td><td>Dict of arguments passed to the text objects<br>For example, <code>textprops=&#123;'color': 'r', 'fontsize': 15&#125;</code> makes the text red with a font size of 15</td></tr><tr><td>center</td><td>Coordinates of the center of the pie, default (0, 0)</td></tr><tr><td>frame</td><td>Whether to draw the x and y axes frame around the chart, default 
False</td></tr><tr><td>rotatelabels</td><td>Whether to rotate each wedge's label to the angle of its wedge, default False</td></tr></tbody></table><h2><span id="2x00-jian-dan-shi-li"><font color="#FF0000">【2x00】Simple Example</font></span></h2><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt

plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
x = [<span class="hljs-number">10</span>, <span class="hljs-number">30</span>, <span class="hljs-number">45</span>, <span class="hljs-number">15</span>]
labels = [<span class="hljs-string">&#x27;Java&#x27;</span>, <span class="hljs-string">&#x27;Golang&#x27;</span>, <span class="hljs-string">&#x27;Python&#x27;</span>, <span class="hljs-string">&#x27;C++&#x27;</span>]
colors = [<span class="hljs-string">&#x27;red&#x27;</span>, <span class="hljs-string">&#x27;yellow&#x27;</span>, <span class="hljs-string">&#x27;blue&#x27;</span>, <span class="hljs-string">&#x27;green&#x27;</span>]
<span class="hljs-comment"># Set the proportions and colors of the 4 wedges; draw the labels at 1.1 x radius from the center</span>
plt.pie(x, labels=labels, colors=colors, labeldistance=<span class="hljs-number">1.1</span>)
plt.title(<span class="hljs-string">&#x27;饼状图简单示例&#x27;</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/020/01.png" alt="01"></p><h2><span id="3x00-an-jiao-du-diao-zheng-shan-xing-biao-qian"><font color="#FF0000">【3x00】Rotating Wedge Labels by Angle</font></span></h2><p>The <code>rotatelabels</code> parameter controls whether each wedge's label is rotated to the angle of its wedge.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt

plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
x = [<span class="hljs-number">10</span>, <span class="hljs-number">30</span>, <span class="hljs-number">45</span>, <span class="hljs-number">15</span>]
labels = [<span 
class="hljs-string">&#x27;Java&#x27;</span>, <span class="hljs-string">&#x27;Go&#x27;</span>, <span class="hljs-string">&#x27;Python&#x27;</span>, <span class="hljs-string">&#x27;C++&#x27;</span>]
colors = [<span class="hljs-string">&#x27;red&#x27;</span>, <span class="hljs-string">&#x27;yellow&#x27;</span>, <span class="hljs-string">&#x27;blue&#x27;</span>, <span class="hljs-string">&#x27;green&#x27;</span>]
<span class="hljs-comment"># Set the proportions and colors of the 4 wedges, draw the labels at 1.1 x radius, and rotate the labels by angle</span>
plt.pie(x, labels=labels, colors=colors, labeldistance=<span class="hljs-number">1.1</span>, rotatelabels=<span class="hljs-literal">True</span>)
plt.title(<span class="hljs-string">&#x27;饼状图按角度调整 labels 示例&#x27;</span>)
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/020/02.png" alt="02"></p><h2><span id="4x00-xian-shi-tu-li"><font color="#FF0000">【4x00】Showing a Legend</font></span></h2><p>As with the line, scatter, and bar charts in earlier articles, calling <code>matplotlib.pyplot.legend()</code> draws a legend. For its parameters, see the earlier article <a href="https://itrhx.blog.csdn.net/article/details/105828143">Python Data Analysis Trio: Matplotlib (3): Legends, LaTeX, Ticks, Subplots, Patches, and Other Basic Figure Properties</a>.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt

plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
x = [<span class="hljs-number">10</span>, <span class="hljs-number">30</span>, <span class="hljs-number">45</span>, <span class="hljs-number">15</span>]
labels = [<span class="hljs-string">&#x27;Java&#x27;</span>, <span class="hljs-string">&#x27;Go&#x27;</span>, <span class="hljs-string">&#x27;Python&#x27;</span>, <span class="hljs-string">&#x27;C++&#x27;</span>]
colors = [<span class="hljs-string">&#x27;red&#x27;</span>, <span class="hljs-string">&#x27;yellow&#x27;</span>, <span class="hljs-string">&#x27;blue&#x27;</span>, <span class="hljs-string">&#x27;green&#x27;</span>]
plt.pie(x, 
labels=labels, colors=colors, labeldistance=<span class="hljs-number">1.1</span>)
plt.title(<span class="hljs-string">&#x27;饼状图显示图例示例&#x27;</span>)
plt.legend(bbox_to_anchor=(<span class="hljs-number">1</span>, <span class="hljs-number">1</span>))
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/020/03.png" alt="03"></p><h2><span id="5x00-tu-chu-xian-shi-shan-xing-kuai"><font color="#FF0000">【5x00】Highlighting a Wedge</font></span></h2><p>The <code>explode</code> parameter offsets a wedge from the pie to highlight it. It takes an array with len(x) elements, one per wedge.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt

plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
x = [<span class="hljs-number">10</span>, <span class="hljs-number">30</span>, <span class="hljs-number">45</span>, <span class="hljs-number">15</span>]
labels = [<span class="hljs-string">&#x27;Java&#x27;</span>, <span class="hljs-string">&#x27;Golang&#x27;</span>, <span class="hljs-string">&#x27;Python&#x27;</span>, <span class="hljs-string">&#x27;C++&#x27;</span>]
colors = [<span class="hljs-string">&#x27;red&#x27;</span>, <span class="hljs-string">&#x27;yellow&#x27;</span>, <span class="hljs-string">&#x27;blue&#x27;</span>, <span class="hljs-string">&#x27;green&#x27;</span>]
<span class="hljs-comment"># Offset the first wedge by 0.3 x radius; leave the other wedges in place</span>
plt.pie(x, labels=labels, colors=colors, labeldistance=<span class="hljs-number">1.1</span>, 
explode=[<span class="hljs-number">0.3</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>])
plt.title(<span class="hljs-string">&#x27;饼状图突出显示扇形块示例&#x27;</span>)
plt.legend(bbox_to_anchor=(<span class="hljs-number">1</span>, <span class="hljs-number">1</span>))
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/020/04.png" alt="04"></p><h2><span id="6x00-xian-shi-ge-shan-qu-suo-zhan-bai-fen-bi"><font color="#FF0000">【6x00】Showing Each Wedge's Percentage</font></span></h2><p>The <code>autopct</code> parameter labels each wedge with its percentage. It accepts a format string or a function; for example, <code>autopct='%1.1f%%'</code> formats the value as a float with one decimal place followed by a percent sign. The <code>pctdistance</code> parameter sets how far from the pie center the text generated by <code>autopct</code> is drawn, relative to the radius; it is a float and defaults to 0.6.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt

plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
x = [<span class="hljs-number">10</span>, <span class="hljs-number">30</span>, <span class="hljs-number">45</span>, <span class="hljs-number">15</span>]
labels = [<span class="hljs-string">&#x27;Java&#x27;</span>, <span class="hljs-string">&#x27;Golang&#x27;</span>, <span class="hljs-string">&#x27;Python&#x27;</span>, <span class="hljs-string">&#x27;C++&#x27;</span>]
colors = [<span class="hljs-string">&#x27;red&#x27;</span>, <span class="hljs-string">&#x27;yellow&#x27;</span>, <span class="hljs-string">&#x27;blue&#x27;</span>, <span class="hljs-string">&#x27;green&#x27;</span>]
plt.pie(
    x,                        <span class="hljs-comment"># Proportion of each wedge</span>
    labels=labels,            <span class="hljs-comment"># Wedge text labels</span>
    colors=colors,            <span class="hljs-comment"># Wedge colors</span>
    labeldistance=<span class="hljs-number">1.1</span>,        <span class="hljs-comment"># Distance of the labels from the center</span>
    explode=[<span class="hljs-number">0.3</span>, <span 
class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>],   <span class="hljs-comment"># Offset the first wedge to highlight it</span>
    autopct=<span class="hljs-string">&#x27;%1.1f%%&#x27;</span>,        <span class="hljs-comment"># Show percentages with one decimal place</span>
    pctdistance=<span class="hljs-number">0.5</span>           <span class="hljs-comment"># Distance of the percentage text from the pie center</span>
)
plt.title(<span class="hljs-string">&#x27;饼状图显示各扇区所占百分比示例&#x27;</span>)
plt.legend(bbox_to_anchor=(<span class="hljs-number">1</span>, <span class="hljs-number">1</span>))  <span class="hljs-comment"># Show the legend</span>
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/020/05.png" alt="05"></p><h2><span id="7x00-xuan-zhuan-bing-zhuang-tu"><font color="#FF0000">【7x00】Rotating the Pie Chart</font></span></h2><p>The <code>startangle</code> parameter rotates the pie chart, changing the angle at which it is drawn. Note that the rotation is counterclockwise.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt

plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
x = [<span class="hljs-number">10</span>, <span class="hljs-number">30</span>, <span class="hljs-number">45</span>, <span class="hljs-number">15</span>]
labels = [<span class="hljs-string">&#x27;Java&#x27;</span>, <span class="hljs-string">&#x27;Golang&#x27;</span>, <span class="hljs-string">&#x27;Python&#x27;</span>, <span class="hljs-string">&#x27;C++&#x27;</span>]
colors = [<span class="hljs-string">&#x27;red&#x27;</span>, <span class="hljs-string">&#x27;yellow&#x27;</span>, <span class="hljs-string">&#x27;blue&#x27;</span>, <span class="hljs-string">&#x27;green&#x27;</span>]
plt.pie(
    x,                        <span class="hljs-comment"># Proportion of each wedge</span>
    labels=labels,            <span class="hljs-comment"># Wedge text labels</span>
    colors=colors,            <span class="hljs-comment"># Wedge colors</span>
    labeldistance=<span 
class="hljs-number">1.1</span>,        <span class="hljs-comment"># Distance of the labels from the center</span>
    explode=[<span class="hljs-number">0.3</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>],   <span class="hljs-comment"># Offset the first wedge to highlight it</span>
    autopct=<span class="hljs-string">&#x27;%1.1f%%&#x27;</span>,        <span class="hljs-comment"># Show percentages with one decimal place</span>
    pctdistance=<span class="hljs-number">0.5</span>,          <span class="hljs-comment"># Distance of the percentage text from the pie center</span>
    startangle=-<span class="hljs-number">90</span>            <span class="hljs-comment"># Rotate counterclockwise by -90°, i.e. clockwise by 90°</span>
)
plt.title(<span class="hljs-string">&#x27;饼状图旋转角度示例&#x27;</span>)
plt.legend(bbox_to_anchor=(<span class="hljs-number">1</span>, <span class="hljs-number">1</span>))  <span class="hljs-comment"># Show the legend</span>
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/020/06.png" alt="06"></p><h2><span id="8x00-zi-ding-yi-mei-ge-shan-xing-he-wen-zi-shu-xing"><font color="#FF0000">【8x00】Customizing Wedge and Text Properties</font></span></h2><p>The <code>wedgeprops</code> parameter takes a dict of custom properties applied to each wedge. For example, <code>wedgeprops = &#123;'linewidth': 3&#125;</code> sets the wedge edge line width to 3; for other options see <a href="https://matplotlib.org/api/_as_gen/matplotlib.patches.Wedge.html#matplotlib.patches.Wedge">Wedge</a>.</p><p>The <code>textprops</code> parameter likewise takes a dict of custom properties applied to the text objects. For example, <code>textprops=&#123;'color': 'r', 'fontsize': 15&#125;</code> makes the text red with a font size of 15; for other options see <a href="https://matplotlib.org/api/text_api.html?highlight=text#matplotlib.text.Text">Text</a>.</p><pre><code class="hljs python"><span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt

plt.rcParams[<span class="hljs-string">&#x27;font.sans-serif&#x27;</span>] = [<span class="hljs-string">&#x27;Microsoft YaHei&#x27;</span>]
x = [<span class="hljs-number">10</span>, <span class="hljs-number">30</span>, <span class="hljs-number">45</span>, <span class="hljs-number">15</span>]
labels = [<span class="hljs-string">&#x27;Java&#x27;</span>, <span class="hljs-string">&#x27;Golang&#x27;</span>, <span class="hljs-string">&#x27;Python&#x27;</span>, <span class="hljs-string">&#x27;C++&#x27;</span>]
colors = [<span class="hljs-string">&#x27;red&#x27;</span>, <span class="hljs-string">&#x27;yellow&#x27;</span>, <span class="hljs-string">&#x27;blue&#x27;</span>, <span class="hljs-string">&#x27;green&#x27;</span>]
plt.pie(
    x,                           <span class="hljs-comment"># Proportion of each wedge</span>
    labels=labels,               <span class="hljs-comment"># Wedge text labels</span>
    colors=colors,               <span class="hljs-comment"># Wedge colors</span>
    labeldistance=<span class="hljs-number">1.1</span>,           <span class="hljs-comment"># Distance of the labels from the center</span>
    explode=[<span class="hljs-number">0.3</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>],      <span class="hljs-comment"># Offset the first wedge to highlight it</span>
    autopct=<span class="hljs-string">&#x27;%1.1f%%&#x27;</span>,           <span class="hljs-comment"># Show percentages with one decimal place</span>
    pctdistance=<span class="hljs-number">0.6</span>,             <span class="hljs-comment"># Distance of the percentage text from the pie center</span>
    shadow=<span class="hljs-literal">True</span>,                 <span class="hljs-comment"># Draw a shadow</span>
    wedgeprops=&#123;                 <span class="hljs-comment"># Properties applied to each wedge</span>
        <span class="hljs-string">&#x27;width&#x27;</span>: <span class="hljs-number">0.7</span>,            <span class="hljs-comment"># Wedge width of 0.7</span>
        <span class="hljs-string">&#x27;edgecolor&#x27;</span>: <span class="hljs-string">&#x27;#98F5FF&#x27;</span>,  <span class="hljs-comment"># Wedge edge line color</span>
        <span class="hljs-string">&#x27;linewidth&#x27;</span>: <span class="hljs-number">3</span>           <span class="hljs-comment"># Wedge edge line width</span>
    &#125;,
    textprops=&#123;                  <span class="hljs-comment"># Properties applied to the text</span>
        <span 
class="hljs-string">&#x27;fontsize&#x27;</span>: <span class="hljs-number">13</span>,          <span class="hljs-comment"># Font size</span>
        <span class="hljs-string">&#x27;fontweight&#x27;</span>: <span class="hljs-string">&#x27;bold&#x27;</span>,    <span class="hljs-comment"># Font weight</span>
        <span class="hljs-string">&#x27;color&#x27;</span>: <span class="hljs-string">&#x27;k&#x27;</span>             <span class="hljs-comment"># Text color: black</span>
    &#125;
)
plt.title(<span class="hljs-string">&#x27;饼状图自定义每个扇形和文字属性示例&#x27;</span>, fontweight=<span class="hljs-string">&#x27;bold&#x27;</span>)
plt.legend(bbox_to_anchor=(<span class="hljs-number">1</span>, <span class="hljs-number">1</span>), borderpad=<span class="hljs-number">0.6</span>)  <span class="hljs-comment"># Show the legend</span>
plt.show()</code></pre><p><img src="https://static.wukongsec.com/itbob/images/article/020/07.png" alt="07"></p>]]></content>
    
    
      
      
    <summary type="html">&lt;h2&gt;&lt;span id=&quot;wen-zhang-mu-lu&quot;&gt;Table of Contents&lt;/span&gt;&lt;/h2&gt;
&lt;!-- toc --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#font-color-ff0000-1x00-fang-fa-miao-shu-font&quot;&gt;&lt;font color=&quot;#</summary>
      
    
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/categories/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    
    <category term="数据分析" scheme="https://www.itbob.cn/tags/%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    
    <category term="Matplotlib" scheme="https://www.itbob.cn/tags/Matplotlib/"/>
    
  </entry>
  
</feed>
