2008 г.
СУБД с хранением данных по столбцами и по строкам: насколько они отличаются в действительности?
Дэниэль Абади, Сэмюэль Мэдден, Набил Хачем
Пересказ: Сергей Кузнецов
Назад Оглавление
Приложение 1. Запросы тестового набора SSBM
Звено 1
Запрос 1.1
select sum(lo_extendedprice*lo_discount) as revenue
from lineorder, date
where lo_orderdate = d_datekey
and d_year = 1993
and lo_discount betweenl
and 3 and lo_quantity < 25;
Запрос 1.2
select sum(lo_extendedprice*lo_discount) as revenue
from lineorder, date
where lo_orderdate = d_datekey
and d_yearmonth = 199401
and lo_discount between 4 and 6
and lo_quantity between 26 and 35;
Запрос 1.3
select sum(lo_extendedprice*lo_discount) as revenue
from lineorder, date
where lo_orderdate = d_datekey
and d_weeknuminyear = 6
and d_year = 1994
and lo_discount between 5 and 7
and lo_quantity between 26 and 35;
Звено 2
Запрос 2.1
select sum(lo_revenue), d_year, p_brandl
from lineorder, date, part, supplier
where lo_orderdate = d_datekey
and lo_partkey = p_partkey
and lo_suppkey = s_suppkey
and p_category = ‘MFGR#12’
and s_region = ‘AMERICA’
group by d_year, p_brandl
order by d_year, p_brandl;
Запрос 2.2
select sum(lo_revenue), d_year, p_brandl
from lineorder, date, part, supplier
where lo_orderdate = d_datekey
and lo_partkey = p_partkey
and lo_suppkey = s_suppkey
and p_brandl between ‘MFGR#2221’ and ‘MFGR#2228’
and s_region = ‘ASIA’
group by d_year, p_brandl
order by d_year, p_brandl;
Запрос 2.3
select sum(lo_revenue), d_year, p_brandl
from lineorder, date, part, supplier
where lo_orderdate = d_datekey
and lo_partkey = p_partkey
and lo_suppkey = s_suppkey
and p_brandl= ‘MFGR#2239’
and s_region = ‘EUROPE’
group by d_year, p_brandl
order by d_year, p_brandl;
Звено 3
Запрос 3.1
select c_nation, s_nation, d_year, sum(lo_revenue) as revenue
from customer, lineorder, supplier, date
where lo_custkey = c_custkey
and lo_suppkey = s_suppkey
and lo_orderdate = d_datekey
and c_region = ‘ASIA’
and s_region = ‘ASIA’
and d_year >= 1992 and d_year <= 1997
group by c_nation, s_nation, d_year
order by d_year asc, revenue desc;
Запрос 3.2
select c_city, s_city, d_year, sum(lo_revenue) as revenue
from customer, lineorder, supplier, date
where lo_custkey = c_custkey
and lo_suppkey = s_suppkey
and lo_orderdate = d_datekey
and c_nation = ‘UNITED STATES’
and s_nation = ‘UNITED STATES’
and d_year >= 1992 and d_year <= 1997
group by c_city, s_city, d_year
order by d_year asc, revenue desc;
Запрос 3.3
select c_city, s_city, d_year, sum(lo_revenue) as revenue
from customer, lineorder, supplier, date
where lo_custkey = c_custkey
and lo_suppkey = s_suppkey
and lo_orderdate = d_datekey
and (c_city = ‘UNITED КI1’ or c_city = ‘UNlTED KI5’)
and (s_city = ‘UNlTED KI1’ or s_city = ‘UNITED KI5’)
and d_year >= 1992 and d_year <= 1997
group by c_city, s_city, d_year
order by d_year asc, revenue desc;
Запрос 3.4
select c_city, s_city, d_year, sum(lo_revenue) as revenue
from customer, lineorder, supplier, date
where lo_custkey = c_custkey
and lo_suppkey = s_suppkey
and lo_orderdate = d_datekey
and (c_city = ‘UNITED КI1’ or c_city = ‘UNlTED KI5’)
and (s_city = ‘UNlTED KI1’ or s_city = ‘UNlTED KI5’)
and d_yearmonth = ‘Decl997’
group by c_city, s_city, d_year
order by d_year asc, revenue desc;
Звено 4
Запрос 4.1
select d_year, c_nation,
sum(lo_revenue - lo_supplycost) as profit
from date, customer, supplier, part, lineorder
where lo_custkey = c_custkey
and lo_suppkey = s_suppkey
and lo_partkey = p_partkey
and lo_orderdate = d_datekey
and c_region = ‘AMERICA’
and s_region = ‘AMERICA’
and (p_mfgr = ‘MFGR#1’ or p_mfgr = ‘MFGR#2’)
group by d_year, c_nation
order by d_year, c_nation;
Запрос 4.2
select d_year, s_nation, p_category,
sum(lo_revenue - lo_supplycost) as profit
from date, customer, supplier, part, lineorder
where lo_custkey = c_custkey
and lo_suppkey = s_suppkey
and lo_partkey = p_partkey
and lo_orderdate = d_datekey
and c_region = ‘AMERICA’
and s_region = ‘AMERICA’
and (d_year = 1997 or d_year = 1998)
and (p_mfgr = ‘MFGR#1’ or p_mfgr = ‘MFGR#2’)
group by d_year, s_nation, p_category
order by d_year, s_nation, p_category;
Запрос 4.3
select d_year, s_city, p_brandl,
sum(lo_revenue - lo_supplycost) as profit
from date, customer, supplier, part, lineorder
where lo_custkey = c_custkey
and lo_suppkey = s_suppkey
and lo_partkey = p_partkey
and lo_orderdate = d_datekey
and s_nation = ‘UNITED STATES’
and (d_year = 1997 or d_year = 1998)
and p_category = ‘MFGR#14’
group by d_year, s_city, p_brandl
order by d_year, s_city, p_brandl;
Приложение 2. Определение таблиц базы данных SSBM
LINEORDER Table Layout (SF*6,000,000 are populated)
ORDERKEY numeric (int up to SF 300) first 8 of each 32 keys used
LINENUMBER numeric 1-7
CUSTKEY numeric identifier foreign key reference to C_CUSTKEY
PARTKEY identifier foreign key reference to P_PARTKEY
SUPPKEY numeric identifier foreign key reference to S_SUPPKEY
ORDERDATE identifier foreign key reference to D_DATEKEY
ORDERPRIORITY fixed text, size 15 (5 Priorities: 1-URGENT, etc.)
SHIPPRIORITY fixed text, size 1
QUANTITY numeric 1-50 (for PART)
EXTENDEDPRICE numeric, MAX about 55,450 (for PART)
ORDTOTALPRICE numeric, MAX about 388,000 (for ORDER)
DISCOUNT numeric 0-10 (for PART) - (Represents PERCENT)
REVENUE numeric (for PART: (extendedprice*(100-discount))/100)
SUPPLYCOST numeric (for PART, cost from supplier, max = ?)
TAX numeric 0-8 (for PART)
COMMITDATE Foreign Key reference to D_DATEKEY
SHIPMODE fixed text, size 10 (Modes: REG AIR, AIR, etc.)
Compound Primary Key: ORDERKEY, LINENUMBER
PART Table Layout (200,000*[l+log2SF] populated)
PARTKEY identifier
NAME variable text, size 22 (Not unique per PART but never was)
MFGR fixed text, size 6 (MFGR#l-5, CARD = 5)
CATEGORY fixed text, size 7 (‘MFGR#’||l-5||l-5: CARD = 25)
BRAND1 fixed text, size 9 (CATEGORY||1-40: CARD = 1000)
COLOR variable text, size 11 (CARD = 94)
TYPE variable text, size 25 (CARD = 150)
SIZE numeric 1-50 (CARD = 50)
CONTAINER fixed text(10) (CARD = 40)
Primary Key: PARTKEY
SUPPLIER Table Layout (SF*10,000 are populated)
SUPPKEY identifier
NAME fixed text, size 25: ‘Supplier’||SUPPKEY
ADDRESS variable text, size 25 (city below)
CITY fixed text, size 10 (10/nation: nation_prefix||(0-9))
NATION fixed text(15) (25 values, longest UNITED KINGDOM)
REGION fixed text, size 12 (5 values: longest MIDDLE EAST)
PHONE fixed text, size 15 (many values, format: 43-617-354-1222)
Primary Key: SUPPKEY
CUSTOMER Table Layout (SF*30,000 are populated)
CUSTKEY numeric identifier
NAME variable text, size 25 ‘Customer’||CUSTKEY
ADDRESS variable text, size 25 (city below)
CITY fixed text, size 10 (10/nation: NATION_PREFIX||(0-9)
NATION fixed text(15) (25 values, longest UNITED KINGDOM)
REGION fixed text, size 12 (5 values: longest MIDDLE EAST)
PHONE fixed text, size 15 (many values, format: 43-617-354-1222)
MKTSEGMENT fixed text, size 10 (longest is AUTOMOBILE)
Primary Key: CUSTKEY
DATE Table Layout (7 years of days: 7366 days)
DATEKEY identifier, unique id - e.g. 19980327 (what we use)
DATE fixed text, size 18, longest: December 22, 1998
DAYOFWEEK fixed text, size 8, Sunday, Monday Saturday)
MONTH fixed text, size 9: January, …, December
YEAR unique value 1992-1998
YEARMONTHNUM numeric (YYYYMM) - e.g. 199803
YEARMONTH fixed text, size 7: Marl998 for example
DAYNUMINWEEK numeric 1-7
DAYNUMINMONTH numeric 1-31
DAYNUMINYEAR numeric 1-366
MONTHNUMINYEAR numeric 1-12
WEEKNUMINYEAR numeric 1-53
SELLINGSEASON text, size 12 (Christmas, Summer,...)
LASTDAYINWEEKFL 1 bit
LASTDAYINMONTHFL 1 bit
HOLIDAYFL 1 bit
WEEKDAYFL 1 bit
Primary Key: DATEKEY