Hive scripts that are scheduled to run in production almost always take variables. The values are set dynamically at run time, so you need a way to pass them to your queries or scripts. Let’s see how to set and refer to these variables.
Here is how you set a variable. It is good practice to specify the namespace in which you are setting it. We are using the hivevar namespace here, and the variable name is date-ymd.
hive> set hivevar:date-ymd = '2019-11-15';
Here is how we refer to the variable that is set in hivevar namespace.
hive> select * from foo where day >= ${hivevar:date-ymd};
The above works if you are using the Hive interactive prompt, but what if you want to pass the variable to a Hive script? Here is how you pass the variable dynamically to a Hive script. Inside the Hive script, sales.hql, you refer to the variable just like above: ${hivevar:date-ymd}
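hive -hivevar date-ymd="'2019-11-15'" -f sales.hql
Note that there is no space around the = sign. The outer double quotes are consumed by the shell; the inner single quotes stay in the value, so the substituted query reads day >= '2019-11-15', just as it does at the interactive prompt.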
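As a minimal sketch of how this might look end to end, the shell snippet below writes a small sales.hql and builds the hive command with a date computed at run time. The table foo and the column day come from the example above; the run_date and hivevar_arg shell variables, the use of the date command, and the guard around the hive call are assumptions made for illustration.

```shell
#!/bin/sh
# Write a small sales.hql (hypothetical contents, based on the query above).
# The quoted 'EOF' stops the shell from expanding ${...} inside the file,
# so Hive sees the placeholder and performs the substitution itself.
cat > sales.hql <<'EOF'
select * from foo where day >= ${hivevar:date-ymd};
EOF

# Compute the value at run time; here, today's date as YYYY-MM-DD.
run_date=$(date +%Y-%m-%d)

# Keep the single quotes inside the value so Hive substitutes a string literal.
hivevar_arg="date-ymd='${run_date}'"

# Only call hive if it is installed on this machine.
if command -v hive >/dev/null 2>&1; then
  hive -hivevar "$hivevar_arg" -f sales.hql
else
  echo "would run: hive -hivevar \"$hivevar_arg\" -f sales.hql"
fi
```

Hive replaces the placeholder as plain text before the query runs, which is why the quotes have to travel with the value (or be written around the placeholder inside the script).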
You can see the list of all the Hive variables with the below command.
hive -e 'set;'
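The full output of set; is long because it also lists configuration properties. Here is a small sketch of narrowing it down, assuming the output has one name=value pair per line and that variables in the hivevar namespace are printed with a hivevar: prefix. The helper name only_hivevars and the sample lines are made up for this example.

```shell
# Keep only the lines that belong to the hivevar namespace.
only_hivevars() {
  grep '^hivevar:'
}

# Typical use (needs a working Hive install):
#   hive -hivevar date-ymd="'2019-11-15'" -e 'set;' | only_hivevars

# Stand-in input (illustrative, not real Hive output) so the filter
# can be tried without Hive.
sample_output="hive.exec.parallel=false
hivevar:date-ymd='2019-11-15'
system:user.name=etl"

filtered=$(printf '%s\n' "$sample_output" | only_hivevars)
echo "$filtered"
```

At the interactive prompt, running set hivevar:date-ymd; with no value should print just that one variable, which is often quicker than filtering the whole list.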
If your goal is to set and refer to variables in Hive, what you have seen so far in this post will work. We have used the hivevar namespace in this post, and it is the right one for user-defined variables. The hivevar namespace didn’t exist when Hive first came out, so if you are looking at Hive scripts written in those early days, you will see references to the hiveconf namespace instead.
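For comparison, here is a rough sketch of what the older style tends to look like next to the hivevar style. The file names old_sales.hql and new_sales.hql are made up for this example, and the hive commands are shown as comments because they need a Hive install to run.

```shell
# Older style: the value travels in the hiveconf namespace.
cat > old_sales.hql <<'EOF'
select * from foo where day >= ${hiveconf:date-ymd};
EOF
# hive -hiveconf date-ymd="'2019-11-15'" -f old_sales.hql

# Newer style: the same query using the hivevar namespace.
cat > new_sales.hql <<'EOF'
select * from foo where day >= ${hivevar:date-ymd};
EOF
# hive -hivevar date-ymd="'2019-11-15'" -f new_sales.hql
```

The only differences are the namespace prefix inside the script and the matching command-line option.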