パラメータ調整で多々あるみたいですが効率の良いSQLについて検証してみます。
前回のテストテーブルpostgresで連番のテストデータを作成のデータをもとにします。
検証する条件は以下です。
1.思いつくパターンを複数SQLを記載.
2.検証には10回のデータを用います.
3.\timingの値を検証結果とします.
4.limit offsetを利用回数毎に移動させます.
5.WHERE句の有無での変化を確認.
上記条件が主なないようになります。
site,site_viewこちらの2テーブルの検索です。
\timingと呼ばれるコマンドを利用することで簡単に実行時間を
計ることができるのでこちらを利用.
検証1
サイト毎の合計view数,conversion数を計算
SQL1 | SELECT site_id, site_name, SUM(view) AS view, SUM(conversion) AS conversion FROM site AS T1 INNER JOIN site_view AS T2 USING(site_id) GROUP BY site_id, site_name ORDER BY site_id LIMIT 10 OFFSET 0; |
SQL2 | SELECT site_id, site_name, view, conversion FROM site AS T1 INNER JOIN (SELECT site_id, SUM(view) AS view, SUM(conversion) AS conversion FROM site_view GROUP BY site_id) AS T2 USING(site_id) ORDER BY site_id LIMIT 10 OFFSET 0; |
検証用のシェルを作成.
sql1.sql
\timing SELECT site_id, site_name, SUM(view) AS view, SUM(conversion) AS conversion FROM site AS T1 INNER JOIN site_view AS T2 USING(site_id) GROUP BY site_id, site_name ORDER BY site_id LIMIT 10 OFFSET 0; SELECT site_id, site_name, SUM(view) AS view, SUM(conversion) AS conversion FROM site AS T1 INNER JOIN site_view AS T2 USING(site_id) GROUP BY site_id, site_name ORDER BY site_id LIMIT 10 OFFSET 10; ・・・ SELECT site_id, site_name, SUM(view) AS view, SUM(conversion) AS conversion FROM site AS T1 INNER JOIN site_view AS T2 USING(site_id) GROUP BY site_id, site_name ORDER BY site_id LIMIT 10 OFFSET 90;sql2.sql
\timing SELECT site_id, site_name, view, conversion FROM site AS T1 INNER JOIN (SELECT site_id, SUM(view) AS view, SUM(conversion) AS conversion FROM site_view GROUP BY site_id) AS T2 USING(site_id) ORDER BY site_id LIMIT 10 OFFSET 0; SELECT site_id, site_name, view, conversion FROM site AS T1 INNER JOIN (SELECT site_id, SUM(view) AS view, SUM(conversion) AS conversion FROM site_view GROUP BY site_id) AS T2 USING(site_id) ORDER BY site_id LIMIT 10 OFFSET 10; ・・・ SELECT site_id, site_name, view, conversion FROM site AS T1 INNER JOIN (SELECT site_id, SUM(view) AS view, SUM(conversion) AS conversion FROM site_view GROUP BY site_id) AS T2 USING(site_id) ORDER BY site_id LIMIT 10 OFFSET 90;
それぞれの時間を以下のコマンドで検証します.
psql -U postgres -f sql1.sql analytics | grep Time: | cut -d " " -f 2
psql -U postgres -f sql2.sql analytics | grep Time: | cut -d " " -f 2
以下が検証結果です。
実行ファイル | 1回目 | 2回目 | 3回目 | 4回目 | 5回目 | 6回目 | 7回目 | 8回目 | 9回目 | 10回目 | 最小 | 最大 | 平均 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
sql1.sql | 2314.365 | 2247.267 | 2331.978 | 2235.310 | 2244.249 | 2457.474 | 2531.733 | 2445.966 | 2614.957 | 2483.380 | 2235.31 | 2614.957 | 2390.6679 |
sql2.sql | 1080.565 | 1041.092 | 1062.615 | 1019.743 | 1037.495 | 1031.203 | 1020.304 | 1049.538 | 1021.847 | 1043.459 | 1019.743 | 1080.565 | 1040.7861 |
sql2のほうが圧倒的です。(ここまででるのかw
ここからさらにWHERE句を付けより検証していきたいと思います。
とりあえずこの記事はここまで次回part2を記事化していきます。