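jazz wrote:
It looks like the problem is caused by the double quotes (" ").
Someone here mentions that CSV-SerDe can be used to handle the double-quote problem:
http://stackoverflow.com/questions/13628658/hive-text-delimiter
Personally, I think the simpler approach is to preprocess the CSV file and strip the double quotes.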
- Jazz
I tried removing the double quotes (""), and the import then works correctly.
Code:
jazz@yarn:~$ cat test.csv
276725;"034545104X";"0"
876777;"023456404X";"3"
jazz@yarn:~$ sed -i 's#"##g' test.csv
jazz@yarn:~$ cat test.csv
276725;034545104X;0
876777;023456404X;3
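A slightly more cautious variant of the same preprocessing, as a rough sketch only: it writes the stripped data to a new file instead of editing in place with -i, then checks that every row still has three semicolon-separated fields. The file names sample.csv and sample_noquotes.csv are made up for illustration, and the two rows are the ones shown above. This assumes no quoted field contains a ';' or an embedded quote; if one does, blindly stripping quotes will corrupt the row and a quote-aware parser is the safer route.

```shell
# Hypothetical file names; the two sample rows are copied from the post above.
printf '%s\n' '276725;"034545104X";"0"' '876777;"023456404X";"3"' > sample.csv

# Strip every double quote, writing to a new file so the original is kept.
sed 's/"//g' sample.csv > sample_noquotes.csv

# Sanity check: every row should still have exactly 3 fields separated by ';'.
awk -F';' 'NF != 3 { bad++ } END { exit (bad > 0) }' sample_noquotes.csv \
  && echo "field count OK"
```

Keeping the original file around makes it easy to compare the two if a row fails the field-count check.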
Code:
hive> CREATE TABLE bxbookratings (UserID BIGINT,ISBN STRING,BookRating INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\073' LINES TERMINATED BY '\n' STORED AS TEXTFILE;
OK
Time taken: 10.518 seconds
hive> LOAD DATA LOCAL INPATH '/home/jazz/test.csv' INTO TABLE bxbookratings;
Copying data from file:/home/jazz/test.csv
Copying file: file:/home/jazz/test.csv
Loading data to table default.bxbookratings
Table default.bxbookratings stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 40, raw_data_size: 0]
OK
Time taken: 1.718 seconds
hive> select * from bxbookratings;
OK
276725 034545104X 0
876777 023456404X 3
Time taken: 0.571 seconds
- Jazz