TABULATE入門2

Last-modified: 2019-04-26 (金) 05:45:43

**************************************************************;
* TAB030329s2.sas                                            *;
* proc tabulate を使おう                                     *;
* session 2 : table文で表を指定                              *;
* 2003.3.29, 翔                                          *;
* 2008.3.13, 改定2                                           *;
**************************************************************;

options nocenter;

libname mydata "c:\";

/*
前セッションと同じく
1年間にクレジットカードを利用したホテル利用者の特徴を学習させることを目的としたのデータ。1,450人分を使って,実行例を示します.
*/

data creditcard;
  infile "c:\train_space.txt"  firstobs=2 dlm='09'x;; /*外部ファイルから入力*/

          input id      hotel   xgender $       xage    xincome  xmarital $     xusage  xretail xrest   xenter  xtrans  xother;  /*読み込むデータを指定*/
run;

proc print data=creditcard(obs=5);run;
/*
 OBS   id   hotel   xgender   xage   xincome   xmarital   xusage   xretail   xrest   xenter   xtrans   xother

   1    1     0     female     60       .       single    300000    100000       1        0        0    30000
   2    2     1     male       40       .       single     30000     10000       0        0        0    30000
   3    3     0     male       40       .       single    100000    100000       1        0    10000   100000
   4    4     0     male       30       .       single    300000    300000   10000   100000    30000    10000
   5    5     0     male       40       .       single     10000     10000       1        0        0        1

*/

/*****************************************************
このセッションでは,表構成を指定するtable文の書き方を
説明します.
性別X収入の度数表ををつくります.
*/


proc tabulate data=creditcard;
  class xgender xincome;
  table xgender,xincome;
run;

/*
----------------------------------------------------------------------------------------------------------
|                          |                                   xincome                                   |
|                          |-----------------------------------------------------------------------------|
|                          |     1      |    100     |    300     |    500     |    700     |    1000    |
|                          |------------+------------+------------+------------+------------+------------|
|                          |     N      |     N      |     N      |     N      |     N      |     N      |
|--------------------------+------------+------------+------------+------------+------------+------------|
|xgender                   |            |            |            |            |            |            |
|--------------------------|            |            |            |            |            |            |
|female                    |      128.00|       59.00|       50.00|       25.00|       19.00|       19.00|
|--------------------------+------------+------------+------------+------------+------------+------------|
|male                      |      545.00|       55.00|      108.00|      129.00|      119.00|      160.00|
----------------------------------------------------------------------------------------------------------

*/

/*
要点は,
1.分類する変数を,class文で宣言する.
2.table文で,表側 カンマ(,) 表頭 を指定する.
です.
table文で,表の構成を指定するわけですが,ここで最低限必要なものは,
分類変数と3つの表制御記号のカンマ(,),アスタリスク(*),空白(" ")です.
カンマは表の次元を指定し,アスタリスクは交差を,空白は並列を示します.
*/

/********************************************************
次元指定
カンマの前に表側の変数を,カンマの後ろに表頭の変数を指定します.
freqのtables文のアスタリスクが,このカンマに相当します.
このように,カンマが1つで,2次元の表が指定できます.
カンマなしだと,1次元,カンマ2個だと3次元の表となります.
カンマ個数 次元 指定
0     1  表頭
1     2  表側, 表頭
2     3  シート,表側, 表頭
(3個以上は,指定できません)
*/

proc tabulate data=creditcard;
  class xgender xincome xmarital;
  table xmarital;  /* 1次元*/
run;
/*

---------------------------
|        xmarital         |
|-------------------------|
|  married   |   single   |
|------------+------------|
|     N      |     N      |
|------------+------------|
|      929.00|      487.00|
---------------------------

*/

proc tabulate data=creditcard;
  class xgender xincome xmarital;
  table xincome,xmarital;  /* 2次元*/
run;
/*
------------------------------------------------------
|                          |        xmarital         |
|                          |-------------------------|
|                          |  married   |   single   |
|                          |------------+------------|
|                          |     N      |     N      |
|--------------------------+------------+------------|
|xincome                   |            |            |
|--------------------------|            |            |
|1                         |      448.00|      225.00|
|--------------------------+------------+------------|
|100                       |       36.00|       78.00|
|--------------------------+------------+------------|
|300                       |       76.00|       82.00|
|--------------------------+------------+------------|
|500                       |      101.00|       53.00|
|--------------------------+------------+------------|
|700                       |      108.00|       30.00|
|--------------------------+------------+------------|
|1000                      |      160.00|       19.00|
------------------------------------------------------
*/

proc tabulate data=creditcard;
  class xgender xincome xmarital;;
  table xgender,xincome,xmarital;  /* 3次元*/
run;

/*


xgender female
------------------------------------------------------
|                          |        xmarital         |
|                          |-------------------------|
|                          |  married   |   single   |
|                          |------------+------------|
|                          |     N      |     N      |
|--------------------------+------------+------------|
|xincome                   |            |            |
|--------------------------|            |            |
|1                         |       58.00|       70.00|
|--------------------------+------------+------------|
|100                       |       25.00|       34.00|
|--------------------------+------------+------------|
|300                       |       24.00|       26.00|
|--------------------------+------------+------------|
|500                       |       14.00|       11.00|
|--------------------------+------------+------------|
|700                       |       13.00|        6.00|
|--------------------------+------------+------------|
|1000                      |       12.00|        7.00|
------------------------------------------------------
xgender male
------------------------------------------------------
|                          |        xmarital         |
|                          |-------------------------|
|                          |  married   |   single   |
|                          |------------+------------|
|                          |     N      |     N      |
|--------------------------+------------+------------|
|xincome                   |            |            |
|--------------------------|            |            |
|1                         |      390.00|      155.00|
|--------------------------+------------+------------|
|100                       |       11.00|       44.00|
|--------------------------+------------+------------|
|300                       |       52.00|       56.00|
|--------------------------+------------+------------|
|500                       |       87.00|       42.00|
|--------------------------+------------+------------|
|700                       |       95.00|       24.00|
|--------------------------+------------+------------|
|1000                      |      148.00|       12.00|
------------------------------------------------------
*/

/***************************************************
交差指定
1つの次元の中で2変数以上のクロスを指定します.
交差したい変数同士をアスタリスクでつなぎます.
freq の table var1*var2/LIST; に相当します.
*/



proc tabulate data=creditcard;
  class xgender xincome xmarital;
  table xgender*xincome,xmarital;  /* 表側で交差 */
run;

/*
------------------------------------------------------
|                          |        xmarital         |
|                          |-------------------------|
|                          |  married   |   single   |
|                          |------------+------------|
|                          |     N      |     N      |
|--------------------------+------------+------------|
|xgender     |xincome      |            |            |
|------------+-------------|            |            |
|female      |1            |       58.00|       70.00|
|            |-------------+------------+------------|
|            |100          |       25.00|       34.00|
|            |-------------+------------+------------|
|            |300          |       24.00|       26.00|
|            |-------------+------------+------------|
|            |500          |       14.00|       11.00|
|            |-------------+------------+------------|
|            |700          |       13.00|        6.00|
|            |-------------+------------+------------|
|            |1000         |       12.00|        7.00|
|------------+-------------+------------+------------|
|male        |1            |      390.00|      155.00|
|            |-------------+------------+------------|
|            |100          |       11.00|       44.00|
|            |-------------+------------+------------|
|            |300          |       52.00|       56.00|
|            |-------------+------------+------------|
|            |500          |       87.00|       42.00|
|            |-------------+------------+------------|
|            |700          |       95.00|       24.00|
|            |-------------+------------+------------|
|            |1000         |      148.00|       12.00|
------------------------------------------------------
*/

proc tabulate data=creditcard;
  class xgender xincome xmarital xretail;
  table xgender*xincome,xretail*xmarital;  /* 表側と表頭で交差 */
run;

/*
----------------------------------------------------------------------------------------------------------
|                          |                                   xretail                                   |
|                          |-----------------------------------------------------------------------------|
|                          |            0            |            1            |          10000          |
|                          |-------------------------+-------------------------+-------------------------|
|                          |        xmarital         |        xmarital         |        xmarital         |
|                          |-------------------------+-------------------------+-------------------------|
|                          |  married   |   single   |  married   |   single   |  married   |   single   |
|                          |------------+------------+------------+------------+------------+------------|
|                          |     N      |     N      |     N      |     N      |     N      |     N      |
|--------------------------+------------+------------+------------+------------+------------+------------|
|xgender     |xincome      |            |            |            |            |            |            |
|------------+-------------|            |            |            |            |            |            |
|female      |1            |        6.00|        9.00|        1.00|        3.00|        6.00|        4.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |100          |        2.00|        4.00|        1.00|           .|        3.00|        6.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |300          |        4.00|        4.00|           .|           .|           .|        2.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |500          |        2.00|        1.00|        1.00|           .|        1.00|           .|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |700          |        3.00|           .|        2.00|           .|           .|        1.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |1000         |        3.00|        1.00|           .|           .|        1.00|        1.00|
|------------+-------------+------------+------------+------------+------------+------------+------------|
|male        |1            |      119.00|       47.00|       21.00|        8.00|       33.00|       25.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |100          |        1.00|       19.00|        1.00|        1.00|        1.00|        3.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |300          |       13.00|       17.00|        3.00|        2.00|        6.00|        4.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |500          |       22.00|       10.00|        4.00|        1.00|       14.00|        5.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |700          |       22.00|        9.00|        7.00|        2.00|       15.00|        1.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |1000         |       47.00|        3.00|        8.00|        1.00|       15.00|           .|
----------------------------------------------------------------------------------------------------------
(Continued)


----------------------------------------------------------------------------------------------------------
|                          |                                   xretail                                   |
|                          |-----------------------------------------------------------------------------|
|                          |          30000          |         100000          |         300000          |
|                          |-------------------------+-------------------------+-------------------------|
|                          |        xmarital         |        xmarital         |        xmarital         |
|                          |-------------------------+-------------------------+-------------------------|
|                          |  married   |   single   |  married   |   single   |  married   |   single   |
|                          |------------+------------+------------+------------+------------+------------|
|                          |     N      |     N      |     N      |     N      |     N      |     N      |
|--------------------------+------------+------------+------------+------------+------------+------------|
|xgender     |xincome      |            |            |            |            |            |            |
|------------+-------------|            |            |            |            |            |            |
|female      |1            |       13.00|       21.00|       15.00|       16.00|       17.00|       17.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |100          |        5.00|        7.00|        5.00|       13.00|        9.00|        4.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |300          |        6.00|        5.00|        8.00|        7.00|        6.00|        8.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |500          |        2.00|        3.00|        4.00|        4.00|        4.00|        3.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |700          |        2.00|        1.00|        5.00|        4.00|        1.00|           .|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |1000         |        3.00|        1.00|        3.00|        2.00|        2.00|        2.00|
|------------+-------------+------------+------------+------------+------------+------------+------------|
|male        |1            |       84.00|       25.00|       93.00|       39.00|       40.00|       11.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |100          |        1.00|        5.00|        7.00|       10.00|           .|        6.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |300          |       14.00|       15.00|       14.00|       14.00|        2.00|        4.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |500          |       21.00|        7.00|       16.00|       11.00|       10.00|        8.00|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |700          |       12.00|        5.00|       29.00|        7.00|       10.00|           .|
|            |-------------+------------+------------+------------+------------+------------+------------|
|            |1000         |       29.00|        1.00|       32.00|        2.00|       17.00|        5.00|
----------------------------------------------------------------------------------------------------------
*/

/***************************************************
並列指定
1つの次元の中で2変数以上を並列させます.
指定のしかたは,変数を空白で区切って並べればよく,
freq の table var1 var2; と同じです.
変数だけでなく,交差指定した変数の組を並列することもできます.
*/


proc tabulate data=creditcard;
  class xgender xincome xmarital xretail;
  table xgender xretail xgender*xretail,xmarital;
run;


/*
------------------------------------------------------
|                          |        xmarital         |
|                          |-------------------------|
|                          |  married   |   single   |
|                          |------------+------------|
|                          |     N      |     N      |
|--------------------------+------------+------------|
|xgender                   |            |            |
|--------------------------|            |            |
|female                    |      146.00|      154.00|
|--------------------------+------------+------------|
|male                      |      783.00|      333.00|
|--------------------------+------------+------------|
|xretail                   |            |            |
|--------------------------|            |            |
|0                         |      244.00|      124.00|
|--------------------------+------------+------------|
|1                         |       49.00|       18.00|
|--------------------------+------------+------------|
|10000                     |       95.00|       52.00|
|--------------------------+------------+------------|
|30000                     |      192.00|       96.00|
|--------------------------+------------+------------|
|100000                    |      231.00|      129.00|
|--------------------------+------------+------------|
|300000                    |      118.00|       68.00|
|--------------------------+------------+------------|
|xgender     |xretail      |            |            |
|------------+-------------|            |            |
|female      |0            |       20.00|       19.00|
|            |-------------+------------+------------|
|            |1            |        5.00|        3.00|
|            |-------------+------------+------------|
|            |10000        |       11.00|       14.00|
|            |-------------+------------+------------|
|            |30000        |       31.00|       38.00|
|            |-------------+------------+------------|
|            |100000       |       40.00|       46.00|
|            |-------------+------------+------------|
|            |300000       |       39.00|       34.00|
|------------+-------------+------------+------------|
|male        |0            |      224.00|      105.00|
|            |-------------+------------+------------|
|            |1            |       44.00|       15.00|
|            |-------------+------------+------------|
|            |10000        |       84.00|       38.00|
|            |-------------+------------+------------|
|            |30000        |      161.00|       58.00|
|            |-------------+------------+------------|
|            |100000       |      191.00|       83.00|
|            |-------------+------------+------------|
|            |300000       |       79.00|       34.00|
------------------------------------------------------
*/

/*
上の例では,性別と小売の年間利用料と性別X小売の年間利用料の3つを表側に並列させています.
ダミー変数のALLを並列させることによって,計をつくることを覚えておきましょう.
*/


proc tabulate data=creditcard;
  class xgender xincome xmarital xretail;
  table all xgender,all xretail;
run;


/*
----------------------------------------------------------------------------------------------------------
|                          |            |                            xretail                             |
|                          |            |----------------------------------------------------------------|
|                          |    All     |     0      |     1      |   10000    |   30000    |   100000   |
|                          |------------+------------+------------+------------+------------+------------|
|                          |     N      |     N      |     N      |     N      |     N      |     N      |
|--------------------------+------------+------------+------------+------------+------------+------------|
|All                       |     1416.00|      368.00|       67.00|      147.00|      288.00|      360.00|
|--------------------------+------------+------------+------------+------------+------------+------------|
|xgender                   |            |            |            |            |            |            |
|--------------------------|            |            |            |            |            |            |
|female                    |      300.00|       39.00|        8.00|       25.00|       69.00|       86.00|
|--------------------------+------------+------------+------------+------------+------------+------------|
|male                      |     1116.00|      329.00|       59.00|      122.00|      219.00|      274.00|
----------------------------------------------------------------------------------------------------------

(Continued)

-----------------------------------------
|                          |  xretail   |
|                          |------------|
|                          |   300000   |
|                          |------------|
|                          |     N      |
|--------------------------+------------|
|All                       |      186.00|
|--------------------------+------------|
|xgender                   |            |
|--------------------------|            |
|female                    |       73.00|
|--------------------------+------------|
|male                      |      113.00|
-----------------------------------------
*/

/*********************************************************
括弧()を使って,交差を簡略表記することもできます.
*/


proc tabulate data=creditcard;
  class xgender xincome xmarital xretail;
  table (all xgender)*(all xretail),xmarital;
run;

/*
------------------------------------------------------
|                          |        xmarital         |
|                          |-------------------------|
|                          |  married   |   single   |
|                          |------------+------------|
|                          |     N      |     N      |
|--------------------------+------------+------------|
|All         |All          |      929.00|      487.00|
|            |-------------+------------+------------|
|            |xretail      |            |            |
|            |-------------|            |            |
|            |0            |      244.00|      124.00|
|            |-------------+------------+------------|
|            |1            |       49.00|       18.00|
|            |-------------+------------+------------|
|            |10000        |       95.00|       52.00|
|            |-------------+------------+------------|
|            |30000        |      192.00|       96.00|
|            |-------------+------------+------------|
|            |100000       |      231.00|      129.00|
|            |-------------+------------+------------|
|            |300000       |      118.00|       68.00|
|------------+-------------+------------+------------|
|xgender     |All          |      146.00|      154.00|
|------------+-------------+------------+------------|
|female      |xretail      |            |            |
|            |-------------|            |            |
|            |0            |       20.00|       19.00|
|            |-------------+------------+------------|
|            |1            |        5.00|        3.00|
|            |-------------+------------+------------|
|            |10000        |       11.00|       14.00|
|            |-------------+------------+------------|
|            |30000        |       31.00|       38.00|
|            |-------------+------------+------------|
|            |100000       |       40.00|       46.00|
|            |-------------+------------+------------|
|            |300000       |       39.00|       34.00|
|------------+-------------+------------+------------|
|male        |All          |      783.00|      333.00|
|            |-------------+------------+------------|
|            |xretail      |            |            |
|            |-------------|            |            |
|            |0            |      224.00|      105.00|
|            |-------------+------------+------------|
|            |1            |       44.00|       15.00|
|            |-------------+------------+------------|
|            |10000        |       84.00|       38.00|
|            |-------------+------------+------------|
|            |30000        |      161.00|       58.00|
|            |-------------+------------+------------|
|            |100000       |      191.00|       83.00|
|            |-------------+------------+------------|
|            |300000       |       79.00|       34.00|
------------------------------------------------------
*/


/*
表側の (all xgender)*(all xretail) は,
all*all all*xretail xgender*all xgender*xretail
すなわち
all xretail xgender xgender*xretail
と指定したことと実質同じです.

表指定の階層構造は,
1.カンマ区切りによる次元指定
2.空白による変数または交差変数組の並列指定
3.アスタリスクによる交差変数指定
となります.
*/

/***********************************************************
このセッションの最後に表の見栄えを変えるオプションのいくつかを紹介します.

1.罫線をすべてはずす.
慣れると,罫線はじゃまなことが多いです.次のオプションで罫線は消せます.
proc tabulate formchar="           " noseps;
formcharで,罫線に使う文字を全11種とも空白に指定します.
nosepsは,縦のセパレータを無しにするという意味です.

2.欠損値も数える
SASは,欠損値を除外するのが基本なので,特別なことがない限り,
必ず欠損表示をさせましょう.
proc tabulate missing;

3.統計量フォーマットのデフォルトを指定
proc tabulate format=7.;
度数に小数点2桁は変ですから,デフォルトで変えておくと便利です.
comma9.なんかもお勧めです.
個別にも指定できますが,この話は後日に.

4.表側の幅を小さくしたい
table ../RTS=10;
rts=で,表側の幅を,ページ幅の何パーセントにするかを指定できます.
あまり短くすると,1カテゴリが折り返して,複数行になってしまうので,
うまく調整する必要があります.

5.分類変数のカテゴリ値のフォーマット
freqなどと同じく変数にフォーマットをあてれば,自由にカテゴリ分類を変更できます.
format xage agef.;
*/


proc format;
  value agef low-<20 = "under20"
             20-<30  = "20s"
             30-<40  = "30s"
             40-<50  = "40s"
             50-<60  = "50s"
             60-high = "60s+"
             ;
run;

proc tabulate data=creditcard  formchar="           " noseps missing format=7.;
  class xage xrest;
  table all xrest
       ,all xrest
       /rts=10;
  format xage agef.;
run;


/*                                       xrest

            All      0       1     10000   30000  100000  300000

             N       N       N       N       N       N       N

 All         1450    1108      60     102     106      56      18
 xrest
 0           1108    1108       .       .       .       .       .
 1             60       .      60       .       .       .       .
 10000        102       .       .     102       .       .       .
 30000        106       .       .       .     106       .       .
 100000        56       .       .       .       .      56       .
 300000        18       .       .       .       .       .      18

*/

/*
tabulateを使うなら,
proc tabulate formchar="           " noseps missing format=
この1行は,暗記するべきです.損はしませんよ.
*/

**********************************************************************;
************** E N D *************************************************;
**********************************************************************;

質問・意見

サマリ 質問の要旨を簡潔にお願いします。
お名前
状態
メッセージ

質問・意見一覧