NDArrayNumericExpression

class hail.expr.NDArrayNumericExpression[source]

Expression of type tndarray with a numeric element type.

Numeric ndarrays support arithmetic both with scalar values and other arrays. Arithmetic between two numeric ndarrays requires that the shapes of each ndarray be either identical or compatible for broadcasting. Operations are applied positionally (nd1 * nd2 will multiply the first element of nd1 by the first element of nd2, the second element of nd1 by the second element of nd2, and so on). Arithmetic with a scalar will apply the operation to each element of the ndarray.
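As a rough illustration of what "compatible for broadcasting" means, the following plain-Python sketch computes the common shape of two operands. It assumes the NumPy-style rule (shapes are aligned from the right, and each pair of lengths must match or contain a 1). The helper name `broadcast_shape` is hypothetical and is not part of Hail.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the common shape of two operands, or raise ValueError.

    Hypothetical helper modelling NumPy-style broadcasting; not Hail's code.
    """
    result = []
    # Walk both shapes from the last dimension; pad the shorter one with 1s.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(
                f"shapes {shape_a} and {shape_b} are not broadcastable"
            )
    return tuple(reversed(result))


print(broadcast_shape((2, 3), (2, 3)))  # identical shapes
print(broadcast_shape((2, 3), (3,)))    # a vector stretched across each row
print(broadcast_shape((4, 1), (1, 5)))  # both operands stretched
```

Under this model, a scalar behaves like an operand of shape `()`, which is why arithmetic with a scalar touches every element.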

Attributes

T

Reverse the dimensions of this ndarray.

dtype

The data type of the expression.

ndim

The number of dimensions of this ndarray.

shape

The shape of this ndarray.

Methods

sum

Sum out one or more axes of an ndarray.

property T

Reverse the dimensions of this ndarray. For an n-dimensional array a, a[i_0, …, i_n-1, i_n] = a.T[i_n, i_n-1, …, i_0]. Same as self.transpose().

See also transpose().

Returns:

NDArrayExpression

__add__(other)[source]

Positionally add an array or a scalar.

Parameters:

other (NumericExpression or NDArrayNumericExpression) – Value or ndarray to add.

Returns:

NDArrayNumericExpression – NDArray of positional sums.

__eq__(other)

Returns True if the two expressions are equal.

Examples

>>> x = hl.literal(5)
>>> y = hl.literal(5)
>>> z = hl.literal(1)
>>> hl.eval(x == y)
True
>>> hl.eval(x == z)
False

Notes

This method will fail with an error if the two expressions are not of comparable types.

Parameters:

other (Expression) – Expression for equality comparison.

Returns:

BooleanExpression – True if the two expressions are equal.

__floordiv__(other)[source]

Positionally divide by an ndarray or a scalar using floor division.

Parameters:

other (NumericExpression or NDArrayNumericExpression)

Returns:

NDArrayNumericExpression

__ge__(other)

Return self>=value.

__gt__(other)

Return self>value.

__le__(other)

Return self<=value.

__lt__(other)

Return self<value.

__matmul__(other)[source]

Matrix multiplication: a @ b, semantically equivalent to NumPy matmul. If a and b are vectors, the vector dot product is performed, returning a NumericExpression. If a and b are both 2-dimensional matrices, this performs normal matrix multiplication. If a and b have more than 2 dimensions, they are treated as multi-dimensional stacks of 2-dimensional matrices. Matrix multiplication is applied element-wise across the higher dimensions. E.g. if a has shape (3, 4, 5) and b has shape (3, 5, 6), a is treated as a stack of three matrices of shape (4, 5) and b as a stack of three matrices of shape (5, 6). a @ b would then have shape (3, 4, 6).

Notes

The last dimension of a and the second to last dimension of b (or only dimension if b is a vector) must have the same length. The dimensions to the left of the last two dimensions of a and b (for NDArrays of dimensionality > 2) must be equal or be compatible for broadcasting. Number of dimensions of both NDArrays must be at least 1.

Parameters:

other (numpy.ndarray or NDArrayNumericExpression)

Returns:

NDArrayNumericExpression or NumericExpression
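The shape rules above can be sketched in plain Python. This is only a model of the rules as stated in the Notes, written under the assumption that they match NumPy `matmul`; the helper name `matmul_shape` is hypothetical and is not part of Hail.

```python
from itertools import zip_longest


def matmul_shape(a_shape, b_shape):
    """Predict the shape of a @ b from the operand shapes.

    Hypothetical helper; an empty tuple stands for a scalar result.
    """
    if len(a_shape) < 1 or len(b_shape) < 1:
        raise ValueError("both operands need at least 1 dimension")

    # The last dimension of a is contracted with the second-to-last
    # dimension of b (or with b's only dimension if b is a vector).
    a_inner = a_shape[-1]
    b_inner = b_shape[-2] if len(b_shape) >= 2 else b_shape[0]
    if a_inner != b_inner:
        raise ValueError(f"inner dimensions differ: {a_inner} vs {b_inner}")

    # Dimensions left of the last two are "stack" dimensions and broadcast.
    batch = []
    for x, y in zip_longest(
        reversed(a_shape[:-2]), reversed(b_shape[:-2]), fillvalue=1
    ):
        if x != y and 1 not in (x, y):
            raise ValueError(f"stack dimensions {x} and {y} do not broadcast")
        batch.append(y if x == 1 else x)
    batch.reverse()

    # A vector operand contributes no output dimension.
    tail = []
    if len(a_shape) >= 2:
        tail.append(a_shape[-2])
    if len(b_shape) >= 2:
        tail.append(b_shape[-1])
    return tuple(batch + tail)


print(matmul_shape((3, 4, 5), (3, 5, 6)))  # the stacked example from the text
print(matmul_shape((5,), (5,)))            # dot product of two vectors
```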

__mul__(other)[source]

Positionally multiply by an ndarray or a scalar.

Parameters:

other (NumericExpression or NDArrayNumericExpression) – Value or ndarray to multiply by.

Returns:

NDArrayNumericExpression – NDArray of positional products.

__ne__(other)

Returns True if the two expressions are not equal.

Examples

>>> x = hl.literal(5)
>>> y = hl.literal(5)
>>> z = hl.literal(1)
>>> hl.eval(x != y)
False
>>> hl.eval(x != z)
True

Notes

This method will fail with an error if the two expressions are not of comparable types.

Parameters:

other (Expression) – Expression for inequality comparison.

Returns:

BooleanExpression – True if the two expressions are not equal.

__neg__()[source]

Negate elements of the ndarray.

Returns:

NDArrayNumericExpression – Array expression of the same type.

__sub__(other)[source]

Positionally subtract an ndarray or a scalar.

Parameters:

other (NumericExpression or NDArrayNumericExpression) – Value or ndarray to subtract.

Returns:

NDArrayNumericExpression – NDArray of positional differences.

__truediv__(other)[source]

Positionally divide by an ndarray or a scalar.

Parameters:

other (NumericExpression or NDArrayNumericExpression) – Value or ndarray to divide by.

Returns:

NDArrayNumericExpression – NDArray of positional quotients.

collect(_localize=True)

Collect all records of an expression into a local list.

Examples

Collect all the values from C1:

>>> table1.C1.collect()
[2, 2, 10, 11]

Warning

Extremely experimental.

Warning

The list of records may be very large.

Returns:

list

describe(handler=<built-in function print>)

Print information about type, index, and dependencies.

property dtype

The data type of the expression.

Returns:

HailType

export(path, delimiter='\t', missing='NA', header=True)

Export a field to a text file.

Examples

>>> small_mt.GT.export('output/gt.tsv')
>>> with open('output/gt.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
locus   alleles 0       1       2       3
1:1     ["A","C"]       0/1     0/0     0/1     0/0
1:2     ["A","C"]       1/1     0/1     0/1     0/1
1:3     ["A","C"]       0/0     0/1     0/0     0/0
1:4     ["A","C"]       0/1     1/1     0/1     0/1
>>> small_mt.GT.export('output/gt-no-header.tsv', header=False)
>>> with open('output/gt-no-header.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
1:1     ["A","C"]       0/1     0/0     0/1     0/0
1:2     ["A","C"]       1/1     0/1     0/1     0/1
1:3     ["A","C"]       0/0     0/1     0/0     0/0
1:4     ["A","C"]       0/1     1/1     0/1     0/1
>>> small_mt.pop.export('output/pops.tsv')
>>> with open('output/pops.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
sample_idx      pop
0       1
1       2
2       2
3       2
>>> small_mt.ancestral_af.export('output/ancestral_af.tsv')
>>> with open('output/ancestral_af.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
locus   alleles ancestral_af
1:1     ["A","C"]       3.8152e-01
1:2     ["A","C"]       7.0588e-01
1:3     ["A","C"]       4.9991e-01
1:4     ["A","C"]       3.9616e-01
>>> small_mt.bn.export('output/bn.tsv')
>>> with open('output/bn.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
bn
{"n_populations":3,"n_samples":4,"n_variants":4,"n_partitions":4,"pop_dist":[1,1,1],"fst":[0.1,0.1,0.1],"mixture":false}

Notes

For entry-indexed expressions, if there is one column key field, the result of calling str() on that field is used as the column header. Otherwise, each compound column key is converted to JSON and used as a column header. For example:

>>> small_mt = small_mt.key_cols_by(s=small_mt.sample_idx, family='fam1')
>>> small_mt.GT.export('output/gt-no-header.tsv')
>>> with open('output/gt-no-header.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
locus   alleles {"s":0,"family":"fam1"} {"s":1,"family":"fam1"} {"s":2,"family":"fam1"} {"s":3,"family":"fam1"}
1:1     ["A","C"]       0/1     0/0     0/1     0/0
1:2     ["A","C"]       1/1     0/1     0/1     0/1
1:3     ["A","C"]       0/0     0/1     0/0     0/0
1:4     ["A","C"]       0/1     1/1     0/1     0/1

Parameters:
  • path (str) – The path to which to export.

  • delimiter (str) – The string for delimiting columns.

  • missing (str) – The string to output for missing values.

  • header (bool) – When True include a header line.

map(f)

Applies an element-wise operation on an NDArray.

Parameters:

f (function ( (arg) -> Expression)) – Function to transform each element of the NDArray.

Returns:

NDArrayExpression – NDArray where each element has been transformed according to f.

map2(other, f)

Applies an element-wise binary operation on two NDArrays.

Parameters:
  • other (NDArrayExpression, ArrayExpression, numpy ndarray, or nested Python lists/tuples) – Both NDArrays must be the same shape or broadcastable into a common shape.

  • f (function ((arg1, arg2)-> Expression)) – Function to be applied to each element of both NDArrays.

Returns:

NDArrayExpression – Element-wise result of applying f to each index in NDArrays.

property ndim

The number of dimensions of this ndarray.

Examples

>>> nd.ndim
2

Returns:

int

reshape(*shape)

Reshape this ndarray to a new shape.

Parameters:

shape (Expression of type tint64 or tuple of Expression of type tint64)

Examples

>>> v = hl.nd.array([1, 2, 3, 4])
>>> m = v.reshape((2, 2))

Returns:

NDArrayExpression
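To make the reshape in the example above concrete, here is a plain-Python sketch of turning a flat list into a 2-D nested list. It assumes row-major (C-style) element order, as in NumPy; that ordering is an assumption here, not something this page states. The helper name `reshape_2d` is hypothetical.

```python
def reshape_2d(flat, shape):
    """Reshape a flat list into nested rows, filling row by row.

    Hypothetical helper; assumes row-major order.
    """
    rows, cols = shape
    # A reshape can only succeed when the element count is preserved.
    if rows * cols != len(flat):
        raise ValueError(
            f"cannot reshape {len(flat)} elements into shape {shape}"
        )
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


print(reshape_2d([1, 2, 3, 4], (2, 2)))  # [[1, 2], [3, 4]]
```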

property shape

The shape of this ndarray.

Examples

>>> hl.eval(nd.shape)
(2, 2)

Returns:

TupleExpression

show(n=None, width=None, truncate=None, types=True, handler=None, n_rows=None, n_cols=None)

Print the first few records of the expression to the console.

If the expression refers to a value on a keyed axis of a table or matrix table, then the accompanying keys will be shown along with the records.

Examples

>>> table1.SEX.show()
+-------+-----+
|    ID | SEX |
+-------+-----+
| int32 | str |
+-------+-----+
|     1 | "M" |
|     2 | "M" |
|     3 | "F" |
|     4 | "F" |
+-------+-----+
>>> hl.literal(123).show()
+--------+
| <expr> |
+--------+
|  int32 |
+--------+
|    123 |
+--------+

Notes

The output can be piped to another output source using the handler argument:

>>> ht.foo.show(handler=lambda x: logging.info(x))

Parameters:
  • n (int) – Maximum number of rows to show.

  • width (int) – Horizontal width at which to break columns.

  • truncate (int, optional) – Truncate each field to the given number of characters. If None, truncate fields to the given width.

  • types (bool) – Print an extra header line with the type of each field.

sum(axis=None)[source]

Sum out one or more axes of an ndarray.

Parameters:

axis (int or tuple of int) – The axis or axes to sum out.

Returns:

NDArrayNumericExpression or NumericExpression
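What "summing out" an axis means can be seen on a small 2-D nested list. The sketch below is plain Python, not Hail. It assumes the NumPy convention that `axis=None` sums every element down to a scalar, which is consistent with the two possible return types listed above. The helper name `sum_axis` is hypothetical.

```python
def sum_axis(matrix, axis=None):
    """Sum a 2-D nested list over `axis` (0, 1, or None for all elements).

    Hypothetical helper for illustration only.
    """
    if axis is None:
        # No axis given: collapse everything to a single number.
        return sum(sum(row) for row in matrix)
    if axis == 0:
        # Summing out axis 0 removes the row dimension: one total per column.
        return [sum(col) for col in zip(*matrix)]
    if axis == 1:
        # Summing out axis 1 removes the column dimension: one total per row.
        return [sum(row) for row in matrix]
    raise ValueError("axis must be 0, 1, or None for a 2-D input")


m = [[1, 2, 3],
     [4, 5, 6]]
print(sum_axis(m))          # 21
print(sum_axis(m, axis=0))  # [5, 7, 9]
print(sum_axis(m, axis=1))  # [6, 15]
```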

summarize(handler=None)

Compute and print summary information about the expression.

Danger

This functionality is experimental. It may not be tested as well as other parts of Hail and the interface is subject to change.

take(n, _localize=True)

Collect the first n records of an expression.

Examples

Take the first three rows:

>>> table1.X.take(3)
[5, 6, 7]

Warning

Extremely experimental.

Parameters:

n (int) – Number of records to take.

Returns:

list

transpose(axes=None)

Permute the dimensions of this ndarray according to the ordering of axes. Axis j in the ith index of axes maps the jth dimension of the ndarray to the ith dimension of the output ndarray.

Parameters:

axes (tuple of int, optional) – The new ordering of the ndarray’s dimensions.

Notes

Does nothing on ndarrays of dimensionality 0 or 1.

Returns:

NDArrayExpression
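The axis mapping described above can be modelled on shapes alone: dimension `i` of the output takes its length from dimension `axes[i]` of the input. The plain-Python sketch below assumes that omitting `axes` reverses the dimensions, in line with the description of `T` as the same as `self.transpose()`. The helper name `transposed_shape` is hypothetical and is not part of Hail.

```python
def transposed_shape(shape, axes=None):
    """Return the shape produced by permuting dimensions according to `axes`.

    Hypothetical helper; operates on shape tuples only.
    """
    ndim = len(shape)
    if axes is None:
        # Default: reverse the order of the dimensions.
        axes = tuple(reversed(range(ndim)))
    if sorted(axes) != list(range(ndim)):
        raise ValueError("axes must be a permutation of the dimensions")
    # Output dimension i is input dimension axes[i].
    return tuple(shape[j] for j in axes)


print(transposed_shape((2, 3, 4)))             # (4, 3, 2)
print(transposed_shape((2, 3, 4), (1, 0, 2)))  # (3, 2, 4)
print(transposed_shape((5,)))                  # (5,)
```

The last call shows why a 1-dimensional ndarray is unchanged: reversing a single dimension leaves the shape as it was.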