hadoop - Creating Impala external table from a partitioned file structure


Given the following partitioned filesystem structure:

logs
└── log_type
    └── 2013
        ├── 07
        │   ├── 28
        │   │   ├── host1
        │   │   │   └── log_file_1.csv
        │   │   └── host2
        │   │       ├── log_file_1.csv
        │   │       └── log_file_2.csv
        │   └── 29
        │       ├── host1
        │       │   └── log_file_1.csv
        │       └── host2
        │           └── log_file_1.csv
        └── 08

I've been trying to create an external table in Impala:

create external table log_type (
    field1    string,
    field2    string,
    ...
)
row format delimited fields terminated by '|'
location '/logs/log_type/2013/08';

I wanted Impala to recurse into the subdirectories and load all the CSV files, but no cigar: no errors are thrown, yet no data ends up in the table.

Different glob patterns such as /logs/log_type/2013/08/*/* or /logs/log_type/2013/08/*/*/* did not work either.

Is there a way to do this? Or should I restructure the filesystem instead? Any advice on that?

In case you are still searching for an answer: you need to register each individual partition manually.

See the Impala documentation for details on registering an external table.

Your table schema needs to be adjusted:

create external table log_type (
    field1    string,
    field2    string,
    ...
)
partitioned by (year int, month int, day int, host string)
row format delimited fields terminated by '|';

After the schema has been changed to include year, month, day, and host, you have to add each partition in the tree to the table.

Something like this:

alter table log_type add partition (year=2013, month=07, day=28, host="host1")
    location '/logs/log_type/2013/07/28/host1';
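
The other leaf directories in the tree at the top are registered the same way, one alter table statement per partition (the 08 directory has no host subdirectories listed yet, so there is nothing to add for it):

alter table log_type add partition (year=2013, month=07, day=28, host="host2")
    location '/logs/log_type/2013/07/28/host2';

alter table log_type add partition (year=2013, month=07, day=29, host="host1")
    location '/logs/log_type/2013/07/29/host1';

alter table log_type add partition (year=2013, month=07, day=29, host="host2")
    location '/logs/log_type/2013/07/29/host2';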

Afterwards, you need to refresh the table in Impala:

invalidate metadata log_type;
refresh log_type;
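
On the question of restructuring the filesystem: if you are free to rename the directories into Hive-style key=value paths, recent Impala versions can discover every partition in one step instead of needing one alter table per directory. A sketch, assuming Impala 2.3 or later and a renamed layout such as /logs/log_type/year=2013/month=7/day=28/host=host1:

alter table log_type recover partitions;
show partitions log_type;  -- lists the partitions that were picked up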
