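The reducer starts doing the actual reduce work at 66% (0-33% is shuffle, 33-66% is sort). In a Hive join, the reducer is effectively computing, for each join key, a Cartesian product of the matching rows from the data sets being joined.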
I'm going to guess that there is at least one foreign key that shows up a lot in all of the data sets. Watch out for NULL and default values.
For example, in a join, imagine the key "abc" appears ten times in each of six tables. That's 10^6, or a million output records for that one key. If "abc" appears 1000 times in each of three tables and twice in each of the other three, you get 8 billion records (1000^3 * 2^3). You can see how this gets out of hand. I'm guessing there is at least one key that is resulting in a massive number of output records.
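To make the arithmetic concrete, here is a rough Python sketch that estimates how many rows an inner join would emit per key. It is not how Hive computes anything; the function name `join_output_per_key` and the sample tables are made up for illustration, and each "table" is just a list of join-key values.

```python
from collections import Counter
from math import prod


def join_output_per_key(tables):
    """Return {key: number of rows an inner join on that key would emit}.

    `tables` is a list of lists; each inner list holds the join-key value
    of every row in one (hypothetical) table.
    """
    counts = [Counter(keys) for keys in tables]
    # A key survives an inner join only if it appears in every table.
    shared = set(counts[0])
    for c in counts[1:]:
        shared &= set(c)
    # Rows emitted for a key = product of its per-table occurrence counts.
    return {key: prod(c[key] for c in counts) for key in shared}


# Hypothetical data: "abc" appears ten times in each of six tables.
even = [["abc"] * 10 + ["xyz"] for _ in range(6)]
# Hypothetical data: 1000 times in three tables, twice in the other three.
skewed = [["abc"] * 1000 + ["xyz"]] * 3 + [["abc"] * 2 + ["xyz"]] * 3

print(join_output_per_key(even)["abc"])    # 1000000
print(join_output_per_key(skewed)["abc"])  # 8000000000
```

Running the same kind of per-key count against your real join columns (for instance with a `GROUP BY` on the key in each table) should point at the key, if there is one, that is blowing up the reducer.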
This is something to avoid in any RDBMS, not just Hive. Chaining multiple inner joins across many-to-many relationships can get you into a lot of trouble.