hadoop - Creating an Impala external table from a partitioned file structure
Given the following partitioned filesystem structure:
logs
└── log_type
    └── 2013
        ├── 07
        │   ├── 28
        │   │   ├── host1
        │   │   │   └── log_file_1.csv
        │   │   └── host2
        │   │       ├── log_file_1.csv
        │   │       └── log_file_2.csv
        │   └── 29
        │       ├── host1
        │       │   └── log_file_1.csv
        │       └── host2
        │           └── log_file_1.csv
        └── 08
I've been trying to create an external table in Impala:
create external table log_type ( field1 string, field2 string, ... ) row format delimited fields terminated by '|' location '/logs/log_type/2013/08';
I wanted Impala to recurse into the subdirectories and load the CSV files, but no cigar: no errors are thrown, yet no data is loaded into the table.
Different globs such as /logs/log_type/2013/08/*/*
or /logs/log_type/2013/08/*/*/*
did not work either.
Is there a way to do this? Or should I restructure the filesystem instead, and if so, any advice on that?
In case you are still searching for an answer: you need to register each individual partition manually.
See here for details on registering an external table.
Your table schema needs to be adjusted:
create external table log_type ( field1 string, field2 string, ...) partitioned by (year int, month int, day int, host string) row format delimited fields terminated by '|';
After you have changed the schema to include year, month, day, and host, you have to add each partition to the table individually.
Something like this:
alter table log_type add partition (year=2013, month=07, day=28, host="host1") location '/logs/log_type/2013/07/28/host1';
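If you have many leaf directories, you repeat that statement for each one. As a rough sketch, the remaining directories from the example tree above would be registered like this (adjust paths and values to your actual layout):

alter table log_type add partition (year=2013, month=07, day=28, host="host2") location '/logs/log_type/2013/07/28/host2';
alter table log_type add partition (year=2013, month=07, day=29, host="host1") location '/logs/log_type/2013/07/29/host1';
alter table log_type add partition (year=2013, month=07, day=29, host="host2") location '/logs/log_type/2013/07/29/host2';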
Afterwards, you need to refresh the table in Impala:
invalidate metadata log_type; refresh log_type;
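As a quick sanity check (not required), you can list the registered partitions and count the rows in one of them, for example:

show partitions log_type;
select count(*) from log_type where year=2013 and month=7 and day=28;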