40  Graphing

40.1 Scatterplots

40.1.1 A simple example

sysuse auto, clear
scatter price weight

(1978 automobile data)

40.1.2 Advanced examples

https://psychstatistics.com/2020/07/23/stata-scatterplots-and-histograms/

40.1.2.1 Creating a Scatterplot with title

The two commands below are equivalent; scatter is shorthand for graph twoway scatter.

graph twoway scatter price length, title("Scatterplot of price and length")

scatter price length, title("Scatterplot of price and length")

40.1.2.2 Adding a Lowess Smoother

Adding a lowess smoother is just as easy. To do this we overlay two graph twoway plots, scatter and lowess, in a single command. The two plots are joined with a double pipe (||).

graph twoway scatter price length || lowess price length, title("Scatterplot of price and length")
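
The same overlay can also be written with parentheses, which makes it a little clearer that title() applies to the whole graph rather than to the lowess plot alone. The following is a minimal sketch; the bwidth(0.5) value is an arbitrary choice for illustration (twoway lowess defaults to a bandwidth of 0.8), not a recommendation from the source.

```stata
// Load the example data used throughout this chapter
sysuse auto, clear

// Overlay a lowess smoother on the scatterplot.
// bwidth(0.5) is an illustrative value; smaller bandwidths follow the data more closely.
twoway (scatter price length) (lowess price length, bwidth(0.5)), title("Scatterplot of price and length")
```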

40.2 Histograms

To create a histogram, type histogram followed by the variable name. For example, to look at miles per gallon (mpg), type:

histogram mpg
(bin=8, start=12, width=3.625)

The default settings of histogram are often not the best representation of your data. The command has a number of useful options, including width(), which sets the bin width; frequency, which changes the y-axis to show frequency instead of density; and normal, which overlays a normal curve on the graph. You can also modify the title and axes of the graph with further options.

histogram mpg, width(2) frequency normal title(mpg histogram)
(bin=15, start=12, width=2)
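
The title and axis options mentioned above can be sketched as follows. The label text is made up for illustration; title(), xtitle(), and ytitle() are standard twoway options that histogram accepts.

```stata
sysuse auto, clear

// Frequency histogram with 2-mpg-wide bins and custom title and axis labels
histogram mpg, width(2) frequency title("Distribution of mpg") xtitle("Miles per gallon") ytitle("Number of cars")
```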

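40.3 ROC curves

// Load the built-in dataset
sysuse auto, clear

// Fit a logistic regression model
logit foreign mpg weight

// Obtain the predicted probabilities
predict p

// Compute the ROC statistics and plot the ROC curve
roctab foreign p, graph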
(1978 automobile data)

Iteration 0:   log likelihood =  -45.03321  
Iteration 1:   log likelihood = -29.238536  
Iteration 2:   log likelihood = -27.244139  
Iteration 3:   log likelihood = -27.175277  
Iteration 4:   log likelihood = -27.175156  
Iteration 5:   log likelihood = -27.175156  

Logistic regression                                     Number of obs =     74
                                                        LR chi2(2)    =  35.72
                                                        Prob > chi2   = 0.0000
Log likelihood = -27.175156                             Pseudo R2     = 0.3966

------------------------------------------------------------------------------
     foreign | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         mpg |  -.1685869   .0919175    -1.83   0.067    -.3487418     .011568
      weight |  -.0039067   .0010116    -3.86   0.000    -.0058894    -.001924
       _cons |   13.70837   4.518709     3.03   0.002     4.851859    22.56487
------------------------------------------------------------------------------
(option pr assumed; Pr(foreign))
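
As an alternative to saving the predicted probabilities and calling roctab, Stata's lroc postestimation command can be run directly after logit. The sketch below is not part of the original example; it assumes the same auto dataset and model.

```stata
sysuse auto, clear

// Fit the same logistic regression model as above
logit foreign mpg weight

// Plot the ROC curve for the fitted model and report the area under the curve
lroc

// Classification table (sensitivity, specificity) at the default 0.5 cutoff
estat classification
```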

41 Statistical analysis

41.1 Regression

sysuse auto, clear
(1978 automobile data)
regress price mpg

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(1, 72)        =     20.26
       Model |   139449474         1   139449474   Prob > F        =    0.0000
    Residual |   495615923        72  6883554.48   R-squared       =    0.2196
-------------+----------------------------------   Adj R-squared   =    0.2087
       Total |   635065396        73  8699525.97   Root MSE        =    2623.7

------------------------------------------------------------------------------
       price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         mpg |  -238.8943   53.07669    -4.50   0.000    -344.7008   -133.0879
       _cons |   11253.06   1170.813     9.61   0.000     8919.088    13587.03
------------------------------------------------------------------------------

To get standardized coefficients we add the beta option to our command.

regress price mpg, beta

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(1, 72)        =     20.26
       Model |   139449474         1   139449474   Prob > F        =    0.0000
    Residual |   495615923        72  6883554.48   R-squared       =    0.2196
-------------+----------------------------------   Adj R-squared   =    0.2087
       Total |   635065396        73  8699525.97   Root MSE        =    2623.7

------------------------------------------------------------------------------
       price | Coefficient  Std. err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
         mpg |  -238.8943   53.07669    -4.50   0.000                -.4685967
       _cons |   11253.06   1170.813     9.61   0.000                        .
------------------------------------------------------------------------------
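
One way to see what the Beta column reports is to standardize both variables by hand and rerun the regression; the slope from that regression should reproduce the standardized coefficient. This is a sketch for illustration, and the variable names zprice and zmpg are made up here.

```stata
sysuse auto, clear

// Standardize outcome and predictor to mean 0 and standard deviation 1
egen zprice = std(price)
egen zmpg = std(mpg)

// The slope on zmpg should match the Beta value reported by regress price mpg, beta
regress zprice zmpg
```

With a single predictor, this standardized slope is also the Pearson correlation between price and mpg.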

Visualizing Regression Lines

graph twoway scatter price mpg

Next, add the regression line to the plot. The lfit graph command does this (lfit stands for linear fit). We don’t want the regression line in isolation, though; we want it drawn together with the scatterplot. Stata lets you combine twoway graphs in one of two ways: (1) using parentheses or (2) using pipes (||). The second command below uses lfitci instead of lfit, which also draws a confidence band around the fitted line.

graph twoway (lfit price mpg) (scatter price mpg)

graph twoway lfitci price mpg || scatter price mpg
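
To make explicit what lfit draws, the fitted line can also be built by hand from the regression's predicted values. A minimal sketch, assuming the model from earlier in this section; yhat is a made-up variable name.

```stata
sysuse auto, clear

// Fit the regression and store the linear prediction
regress price mpg
predict yhat, xb

// Draw the scatterplot with the predicted values as a line; sort orders the points by mpg
twoway (scatter price mpg) (line yhat mpg, sort)
```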

41.2 Correlation

sysuse auto, clear 
(1978 automobile data)
correlate price mpg weight length
(obs=74)

             |    price      mpg   weight   length
-------------+------------------------------------
       price |   1.0000
         mpg |  -0.4686   1.0000
      weight |   0.5386  -0.8072   1.0000
      length |   0.4318  -0.7958   0.9460   1.0000
corr price mpg weight length, covariance
(obs=74)

             |    price      mpg   weight   length
-------------+------------------------------------
       price |  8.7e+06
         mpg | -7996.28   33.472
      weight |  1.2e+06 -3629.43   604030
      length |  28360.3 -102.514  16370.9   495.79
pwcorr price mpg weight length, sig

             |    price      mpg   weight   length
-------------+------------------------------------
       price |   1.0000 
             |
             |
         mpg |  -0.4686   1.0000 
             |   0.0000
             |
      weight |   0.5386  -0.8072   1.0000 
             |   0.0000   0.0000
             |
      length |   0.4318  -0.7958   0.9460   1.0000 
             |   0.0001   0.0000   0.0000
             |
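
pwcorr accepts further options beyond sig. The sketch below, which is not from the original notes, adds the number of observations for each pair and stars correlations significant at the 5% level. It also includes rep78, a variable in the auto dataset with some missing values, since correlate uses only observations complete on all listed variables whereas pwcorr uses all available observations for each pair.

```stata
sysuse auto, clear

// Pairwise correlations with p-values, per-pair sample sizes, and stars at the 5% level
pwcorr price mpg rep78, obs sig star(0.05)

// correlate drops any observation with a missing value in any listed variable
correlate price mpg rep78
```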

41.3 HTML format

%help histogram
Failed to fetch HTML help.
HTTP Error 403: Forbidden

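Workaround:

  1. Locate the file that raises the error
sudo grep -r "Failed to fetch HTML help." /

/usr/local/lib/python3.8/dist-packages/nbstata/magics.py

  2. Open it and set the SSL context manually
# Added manually to suppress SSL-related errors (disables HTTPS certificate verification)
import ssl
ssl._create_default_https_context = ssl._create_unverified_context
# End of manual addition
  3. Close the JupyterLab notebook file and reopen it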
%help histogram
Failed to fetch HTML help.
HTTP Error 403: Forbidden