
Mathematical Functions

The built-in mathematical functions are listed below. Use these functions in Hive SQL queries.
Return Type Name(Signature) Description

BIGINT round(double a) Returns the rounded BIGINT value of the double

DOUBLE round(double a, int d) Returns the double rounded to d decimal places

BIGINT floor(double a) Returns the maximum BIGINT value that is equal to or less than the double

BIGINT ceil(double a), ceiling(double a) Returns the minimum BIGINT value that is equal to or greater than the double

double rand(), rand(int seed) Returns a random number (that changes from row to row) that is distributed uniformly from 0 to 1. Specifying the seed makes the generated random number sequence deterministic.

double exp(double a) Returns e^a, where e is the base of the natural logarithm

double ln(double a) Returns the natural logarithm of the argument



double log10(double a) Returns the base-10 logarithm of the argument

double log2(double a) Returns the base-2 logarithm of the argument

double log(double base, double a) Returns the base-“base” logarithm of the argument

double pow(double a, double p), power(double a, double p) Returns a raised to the power p

double sqrt(double a) Returns the square root of a

string bin(BIGINT a) Returns the number in binary format

string hex(BIGINT a), hex(string a) If the argument is an int, hex returns the number as a string in hex format. Otherwise, if the argument is a string, it converts each character into its hex representation and returns the resulting string.

string unhex(string a) Inverse of hex. Interprets each pair of characters as a hexadecimal number and converts it to the character represented by the number.



string conv(BIGINT num, int from_base, int to_base), conv(STRING num, int from_base, int to_base) Converts a number from a given base to another

double abs(double a) Returns the absolute value

int, double pmod(int a, int b), pmod(double a, double b) Returns the positive value of a mod b

double sin(double a) Returns the sine of a (a is in radians)

double asin(double a) Returns the arc sine of a if -1<=a<=1, or null otherwise

double cos(double a) Returns the cosine of a (a is in radians)

double acos(double a) Returns the arc cosine of a if -1<=a<=1, or null otherwise

double tan(double a) Returns the tangent of a (a is in radians)

double atan(double a) Returns the arctangent of a

double degrees(double a) Converts value of a from radians to degrees

double radians(double a) Converts value of a from degrees to radians

int, double positive(int a), positive(double a) Returns a

int, double negative(int a), negative(double a) Returns -a

float sign(double a) Returns the sign of a as ‘1.0’ or ‘-1.0’

double e() Returns the value of e

double ABS( double n ) Returns the absolute value of a number

double pi() Returns the value of pi
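
The queries below sketch how a few of the functions in this table behave on literal inputs. They assume a Hive version that accepts SELECT without a FROM clause (0.13 or later); on older versions, select from any one-row table instead. The column aliases are illustrative only and are not part of Hive.

```sql
-- Rounding: round() to d decimal places, floor() down, ceil() up.
-- Expected: 3.14, 2, 3
SELECT round(3.14159, 2) AS rounded,
       floor(2.7)        AS floored,
       ceil(2.1)         AS ceiled;

-- Logarithms and powers: log() takes the base first, then the argument.
-- Expected: approximately 3, and 1024.0
SELECT log(2, 8)  AS log2_of_8,
       pow(2, 10) AS two_to_ten;

-- Base conversion: bin() and hex() format a number, conv() changes its base.
-- Expected: '101', 'FF', '255'
SELECT bin(5)             AS five_in_binary,
       hex(255)           AS ff_in_hex,
       conv('ff', 16, 10) AS ff_in_decimal;

-- pmod() returns a non-negative remainder, unlike the % operator.
-- Expected: 2, -1
SELECT pmod(-7, 3) AS positive_mod,
       -7 % 3      AS plain_mod;
```

The pmod() case is the one most likely to matter in practice, for example when bucketing rows by a hash value that can be negative.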


String Functions
Return Type Name(Signature) Description

int ascii(string str) Returns the numeric value of the first character of str

string concat(string|binary A, string|binary B…) Returns the string or bytes resulting from concatenating the strings or bytes passed in as parameters in order. e.g. concat(‘foo’, ‘bar’) results in ‘foobar’. Note that this function can take any number of input strings.

array<struct<string,double>> context_ngrams(array<array>, array, int K, int pf) Returns the top-k contextual N-grams from a set of tokenized sentences, given a string of “context”. See StatisticsAndDataMining for more information.

string concat_ws(string SEP, string A, string B…) Like concat() above, but with custom separator SEP.

string concat_ws(string SEP, array) Like concat_ws() above, but taking an array of strings. (as of Hive 0.9.0)

int find_in_set(string str, string strList) Returns the position of the first occurrence of str in strList, where strList is a comma-delimited string. Returns null if either argument is null. Returns 0 if the first argument contains any commas. e.g. find_in_set(‘ab’, ‘abc,b,ab,c,def’) returns 3

string format_number(number x, int d) Formats the number X to a format like ‘#,###,###.##’, rounded to D decimal places, and returns the result as a string. If D is 0, the result has no decimal point or fractional part. (as of Hive 0.10.0)

string get_json_object(string json_string, string path) Extracts a json object from a json string based on the json path specified, and returns the json string of the extracted json object. It returns null if the input json string is invalid. NOTE: The json path can only have the characters [0-9a-z_], i.e., no upper-case or special characters. Also, the keys *cannot start with numbers.* This is due to restrictions on Hive column names.

boolean in_file(string str, string filename) Returns true if the string str appears as an entire line in filename.

int instr(string str, string substr) Returns the position of the first occurrence of substr in str

int length(string A) Returns the length of the string

int locate(string substr, string str[, int pos]) Returns the position of the first occurrence of substr in str after position pos

string lower(string A), lcase(string A) Returns the string resulting from converting all characters of A to lower case

string lpad(string str, int len, string pad) Returns str, left-padded with pad to a length of len

string ltrim(string A) Returns the string resulting from trimming spaces from the beginning (left hand side) of A e.g. ltrim(‘ foobar ‘) results in ‘foobar ’

array<struct<string,double>> ngrams(array<array>, int N, int K, int pf) Returns the top-k N-grams from a set of tokenized sentences, such as those returned by the sentences() UDAF. See StatisticsAndDataMining for more information.

string parse_url(string urlString, string partToExtract [, string keyToExtract]) Returns the specified part from the URL. Valid values for partToExtract include HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, and USERINFO. e.g. parse_url(‘http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1’, ‘HOST’) returns ‘facebook.com’. Also, the value of a particular key in QUERY can be extracted by providing the key as the third argument, e.g. parse_url(‘http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1’, ‘QUERY’, ‘k1’) returns ‘v1’.

string printf(String format, Obj… args) Returns the input formatted according to printf-style format strings (as of Hive 0.9.0)

string regexp_extract(string subject, string pattern, int index) Returns the string extracted using the pattern. e.g. regexp_extract(‘foothebar’, ‘foo(.*?)(bar)’, 2) returns ‘bar’. Note that some care is necessary in using predefined character classes: using ‘\s’ as the second argument will match the letter s; ‘\\s’ is necessary to match whitespace, etc. The ‘index’ parameter is the Java regex Matcher group() method index. See docs/api/java/util/regex/Matcher.html for more information on the ‘index’ or Java regex group() method.



string regexp_replace(string INITIAL_STRING, string PATTERN, string REPLACEMENT) Returns the string resulting from replacing all substrings in INITIAL_STRING that match the Java regular expression syntax defined in PATTERN with instances of REPLACEMENT, e.g. regexp_replace(“foobar”, “oo|ar”, “”) returns ‘fb’. Note that some care is necessary in using predefined character classes: using ‘\s’ as the second argument will match the letter s; ‘\\s’ is necessary to match whitespace, etc.

string repeat(string str, int n) Repeats str n times

string reverse(string A) Returns the reversed string

string rpad(string str, int len, string pad) Returns str, right-padded with pad to a length of len

string rtrim(string A) Returns the string resulting from trimming spaces from the end (right hand side) of A e.g. rtrim(‘ foobar ‘) results in ‘ foobar’

array<array> sentences(string str, string lang, string locale) Tokenizes a string of natural language text into words and sentences, where each sentence is broken at the appropriate sentence boundary and returned as an array of words. The ‘lang’ and ‘locale’ are optional arguments. e.g. sentences(‘Hello there! How are you?’) returns ( (“Hello”, “there”), (“How”, “are”, “you”) )

string space(int n) Returns a string of n spaces

array split(string str, string pat) Splits str around pat (pat is a regular expression)

map<string,string> str_to_map(text[, delimiter1, delimiter2]) Splits text into key-value pairs using two delimiters. Delimiter1 separates text into K-V pairs, and Delimiter2 splits each K-V pair. Default delimiters are ‘,’ for delimiter1 and ‘=’ for delimiter2.

string substr(string|binary A, int start), substring(string|binary A, int start) Returns the substring or slice of the byte array of A starting from start position till the end of string A e.g. substr(‘foobar’, 4) results in ‘bar’

string substr(string|binary A, int start, int len), substring(string|binary A, int start, int len) Returns the substring or slice of the byte array of A starting from start position with length len e.g. substr(‘foobar’, 4, 1) results in ‘b’

string translate(string input, string from, string to) Translates the input string by replacing the characters present in the from string with the corresponding characters in the to string. This is similar to the translate function in PostgreSQL. If any of the parameters to this UDF are NULL, the result is NULL as well.



string trim(string A) Returns the string resulting from trimming spaces from both ends of A e.g. trim(‘ foobar ‘) results in ‘foobar’

string upper(string A), ucase(string A) Returns the string resulting from converting all characters of A to upper case e.g. upper(‘fOoBaR’) results in ‘FOOBAR’
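
As a rough illustration of the string functions above, the following queries run on literal inputs. They assume a Hive version that accepts SELECT without a FROM clause (0.13 or later); the JSON document, the sample strings, and the column aliases are made up for this sketch.

```sql
-- Joining strings with a custom separator.
-- Expected: 'a-b-c'
SELECT concat_ws('-', 'a', 'b', 'c') AS joined;

-- Positions are 1-based. Note that instr() and locate() take their
-- arguments in opposite order.
-- Expected: 4, 7
SELECT instr('foobar', 'bar')        AS first_pos,
       locate('bar', 'foobarbar', 5) AS pos_from_5;

-- Padding to a fixed width.
-- Expected: '***hi', 'hi***'
SELECT lpad('hi', 5, '*') AS left_padded,
       rpad('hi', 5, '*') AS right_padded;

-- Predefined regex character classes need a doubled backslash
-- inside a string literal.
-- Expected: 'foo bar'
SELECT regexp_replace('foo   bar', '\\s+', ' ') AS squeezed;

-- Passing both delimiters explicitly avoids relying on the defaults.
SELECT str_to_map('a=1,b=2', ',', '=') AS kv;

-- translate() maps characters one for one: e -> i, l -> p.
-- Expected: 'hippo'
SELECT translate('hello', 'el', 'ip') AS translated;

-- The JSON path starts at the root object, written as $.
-- Expected: 'Hive'
SELECT get_json_object('{"tool":{"name":"Hive"}}', '$.tool.name') AS tool_name;
```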

Conditional Functions
Hive supports three types of conditional functions. These functions are listed below.

• IF( Test Condition, True Value, False Value )

The IF condition evaluates the “Test Condition” and if the “Test Condition” is
true, then it returns the “True Value”. Otherwise, it returns the False Value.
Example: IF(1=1, ‘working’, ‘not working’) returns ‘working’

• COALESCE( value1,value2,… )

The COALESCE function returns the first non-NULL value from the list of
values. If all the values in the list are NULL, then it returns NULL.
Example: COALESCE(NULL,NULL,5,NULL,4) returns 5

• CASE Statement

The syntax for the case statement is:

CASE [ expression ]
  WHEN condition1 THEN result1
  WHEN condition2 THEN result2
  ...
  WHEN conditionn THEN resultn
  ELSE result
END

Here expression is optional. It is the value that you are comparing to the list of
conditions (i.e., condition1, condition2, … conditionn).
All the conditions must be of the same datatype. Conditions are evaluated in the
order listed. Once a condition is found to be true, the case statement
returns the result and does not evaluate the conditions any further.
All the results must be of the same datatype. The result is the value returned once a
condition is found to be true.
If no condition is found to be true, the case statement returns the
value in the ELSE clause. If the ELSE clause is omitted and no condition is
found to be true, the case statement returns NULL.
Example:

CASE Fruit
  WHEN 'APPLE' THEN 'The owner is APPLE'
  WHEN 'ORANGE' THEN 'The owner is ORANGE'
  ELSE 'It is another Fruit'
END

• The other form of CASE is

CASE
  WHEN Fruit = 'APPLE' THEN 'The owner is APPLE'
  WHEN Fruit = 'ORANGE' THEN 'The owner is ORANGE'
  ELSE 'It is another Fruit'
END
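
The conditional functions described in this section can be combined in a single query. The sketch below assumes a hypothetical table fruits(fruit STRING, owner STRING, price DOUBLE); the table, its columns, the price threshold, and the aliases are invented for illustration.

```sql
-- Hypothetical table: fruits(fruit STRING, owner STRING, price DOUBLE)
SELECT fruit,
       -- IF: pick one of two values based on a test condition.
       IF(price > 1.0, 'expensive', 'cheap') AS price_band,
       -- COALESCE: fall back to a default when owner is NULL.
       COALESCE(owner, 'unknown')            AS owner_or_default,
       -- CASE with an expression: compare fruit against each WHEN value.
       CASE fruit
         WHEN 'APPLE'  THEN 'The owner is APPLE'
         WHEN 'ORANGE' THEN 'The owner is ORANGE'
         ELSE 'It is another Fruit'
       END                                   AS owner_note
FROM fruits;
```

Note that a NULL price makes the IF test evaluate as not true, so such rows fall into the 'cheap' branch; wrap the column in COALESCE first if that is not the intended behavior.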

Collection Functions
The following built-in collection functions are supported in Hive:
Return Type Name(Signature) Description

int size(Map) Returns the number of elements in the map type

int size(Array) Returns the number of elements in the array type

array map_keys(Map) Returns an unordered array containing the keys of the input map

array map_values(Map) Returns an unordered array containing the values of the input map

boolean array_contains(Array, value) Returns TRUE if the array contains value

array sort_array(Array) Sorts the input array in ascending order according to the natural ordering of the array elements and returns it (as of version 0.9.0)
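
A short sketch of the collection functions on literal values follows. It uses Hive's array() and map() constructors to build the inputs and assumes a Hive version that accepts SELECT without a FROM clause (0.13 or later); the aliases are illustrative only.

```sql
-- Element count and membership test on an array.
-- Expected: 3, true
SELECT size(array(1, 2, 3))              AS n_items,
       array_contains(array(1, 2, 3), 2) AS has_two;

-- Ascending sort of the array elements.
-- Expected: [1,2,3]
SELECT sort_array(array(3, 1, 2)) AS sorted;

-- Keys and values of a map; the order of the returned arrays is not guaranteed.
SELECT map_keys(map('a', 1, 'b', 2))   AS key_list,
       map_values(map('a', 1, 'b', 2)) AS value_list;
```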

Type Conversion Functions


The following type conversion functions are supported in Hive:

• binary(string|binary)

Casts the parameter into a binary.

• cast(expr as <type>)

Converts the result of the expression expr to <type>, e.g. cast(‘1’ as BIGINT) will
convert the string ‘1’ to its integral representation. A null is returned if the
conversion does not succeed.
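
The two outcomes of cast described above, a successful conversion and a NULL on failure, can be seen side by side in a minimal query. It assumes a Hive version that accepts SELECT without a FROM clause (0.13 or later); the aliases are illustrative.

```sql
-- A numeric string converts cleanly; a non-numeric string yields NULL.
-- Expected: 1, NULL
SELECT cast('1' AS BIGINT) AS parsed,
       cast('abc' AS INT)  AS not_a_number;
```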
