开源软件名称(OpenSource Name):breandan/kotlingrad开源软件地址(OpenSource Url):https://github.com/breandan/kotlingrad开源编程语言(OpenSource Language):Kotlin 100.0%开源软件介绍(OpenSource Introduction):Kotlin∇: Type-safe Symbolic Differentiation for the JVMKotlin∇ is a type-safe automatic differentiation framework written in Kotlin. It allows users to express differentiable programs with higher-dimensional data structures and operators. We attempt to restrict syntactically valid constructions to those which are algebraically valid and can be checked at compile-time. By enforcing these constraints in the type system, it eliminates certain classes of runtime errors that may occur during the execution of a differentiable program. Due to type-inference, most type declarations may be safely omitted by the end-user. Kotlin∇ strives to be expressive, safe, and notationally similar to mathematics. Table of contents
IntroductionInspired by Stalin∇, Autograd, DiffSharp, Myia, Nexus, Tangent, Lantern et al., Kotlin∇ attempts to port recent advancements in automatic differentiation (AD) to the Kotlin language. AD is useful for gradient descent and has a variety of applications in numerical optimization and machine learning. Our implementation adds a number of experimental ideas, including compile-time shape-safety, algebraic simplification and numerical stability checking with property-based testing. We aim to provide an algebraically-grounded implementation of AD for shape-safe tensor operations. Tensors in Kotlin∇ are represented as multidimensional arrays. FeaturesKotlin∇ currently supports the following features:
Additionally, it aims to support:
All of these features are implemented without access to bytecode or special compiler tricks - just using higher-order functions and lambdas as shown in Lambda the Ultimate Backpropogator, embedded DSLs a la Lightweight Modular Staging, and ordinary generics. Please see below for a more detailed feature comparison. UsageInstallationKotlin∇ is hosted on Maven Central. An example project is provided here. Gradledependencies {
implementation("ai.hypergraph:kotlingrad:0.4.7")
} Maven<dependency>
<groupId>ai.hypergraph</groupId>
<artifactId>kotlingrad</artifactId>
<version>0.4.7</version>
</dependency> Jupyter NotebookTo access Kotlin∇'s notebook support, use the following line magic: @file:DependsOn("ai.hypergraph:kotlingrad:0.4.7") For more information, explore the tutorial. NotationKotlin∇ operators are higher-order functions, which take at most two inputs and return a single output, all of which are functions with the same numerical type, and whose shape is denoted using superscript in the rightmost column below. ℝ can be a † ‡ For infix notation, § Matrix division is defined iff B is invertible, although it could be possible to redefine this operator using the Moore-Penrose inverse. ∗ Where C(ℝm) is the space of all continuous functions over ℝ. If the function is not over ℝ, it will fail at compile-time. If the function is over ℝ but not continuous differentiable at the point under consideration, it will fail at runtime. ? The input shape is tracked at runtime, but not at the type level. While it would be nice to infer a union type bound over the inputs of binary functions, it is likely impossible using the Kotlin type system without great effort. If the user desires type checking when invoking higher order functions with literal values, they will need to specify the combined input type explicitly or do so at runtime. τ, λ, π, ω Arbitrary products. Higher-Rank DerivativesKotlin∇ supports derivatives between tensors of up to rank 2. The shape of a tensor derivative depends on (1) the shape of the function under differentiation and (2) the shape of the variable with respect to which we are differentiating.
Matrix-by-vector, vector-by-matrix, and matrix-by-matrix derivatives require rank 3+ tensors and are currently unsupported. Higher-order derivativesKotlin∇ supports arbitrary order derivatives on scalar functions, and up to 2nd order derivatives on vector functions. Higher-order derivatives on matrix functions are unsupported. Shape safetyShape safety is an important concept in Kotlin∇. There are three broad strategies for handling shape errors:
In Kotlin∇, we use the last strategy to check the shape of tensor operations. Consider the following program: // Inferred type: Vec<Double, D2>
val a = Vec(1.0, 2.0)
// Inferred type: Vec<Double, D3>
val b = Vec(1.0, 2.0, 3.0)
val c = b + b
// Does not compile, shape mismatch
// a + b Attempting to sum two vectors whose shapes do not match will fail to compile, and they must be explicitly resized. // Inferred type: Mat<Double, D1, D4>
val a = Mat1x4(1.0, 2.0, 3.0, 4.0)
// Inferred type: Mat<Double, D4, D1>
val b = Mat4x1(1.0, 2.0, 3.0, 4.0)
val c = a * b
// Does not compile, inner dimension mismatch
// a * a
// b * b Similarly, attempting to multiply two matrices whose inner dimensions do not match will fail to compile. val a = Mat2x4(
1.0, 2.0, 3.0, 4.0,
5.0, 6.0, 7.0, 8.0
)
val b = Mat4x2(
1.0, 2.0,
3.0, 4.0,
5.0, 6.0,
7.0, 8.0
)
// Types are optional, but encouraged
val c: Mat<Double, D2, D2> = a * b
val d = Mat2x1(1.0, 2.0)
val e = c * d
val f = Mat3x1(1.0, 2.0, 3.0)
// Does not compile, inner dimension mismatch
// e * f Explicit types are optional but encouraged. Type inference helps preserve shape information over long programs. fun someMatFun(m: Mat<Double, D3, D1>): Mat<Double, D3, D3> = ...
fun someMatFun(m: Mat<Double, D2, D2>) = ... When writing a function, it is mandatory to declare the input type(s), but the return type may be omitted. Shape-safety is currently supported up to rank-2 tensors, i.e. matrices. ExampleThe following example shows how to derive higher-order partials of a function val z = x * (-sin(x * y) + y) * 4 // Infix notation
val `∂z∕∂x` = d(z) / d(x) // Leibniz notation [Christianson, 2012]
val `∂z∕∂y` = d(z) / d(y) // Partial derivatives
val `∂²z∕∂x²` = d(`∂z∕∂x`) / d(x) // Higher-order derivatives
val `∂²z∕∂x∂y` = d(`∂z∕∂x`) / d(y) // Higher-order partials
val `∇z` = z.grad() // Gradient operator
val values = arrayOf(x to 0, y to 1)
println("z(x, y) \t= $z\n" +
"z(${values.map { it.second }.joinToString()}) \t\t= ${z(*values)}\n" +
"∂z/∂x \t\t= $`∂z∕∂x` \n\t\t= " + `∂z∕∂x`(*values) + "\n" +
"∂z/∂y \t\t= $`∂z∕∂y` \n\t\t= " + `∂z∕∂y`(*values) + "\n" +
"∂²z/∂x² \t= $`∂z∕∂y` \n\t\t= " + `∂²z∕∂x²`(*values) + "\n" +
"∂²z/∂x∂y \t= $`∂²z∕∂x∂y` \n\t\t= " + `∂²z∕∂x∂y`(*values) + "\n" +
"∇z \t\t= $`∇z` \n\t\t= [${`∇z`[x]!!(*values)}, ${`∇z`[y]!!(*values)}]ᵀ") Any backticks and unicode characters above are simply for readability and have no effect on the behavior. Running this program via
Variable captureNot only does Kotlin∇'s type system encode output shape, it is also capable of tracking free and bound variables, for order-independent name binding and partial application. Expressions inhabited by free variables are typed as functions until fully bound, at which time they return a concrete value. Consider the following example: val q = X + Y * Z + Y + 0.0
val p0 = q(X to 1.0, Y to 2.0, Z to 3.0) // Name binding
val p1 = q(X to 1.0, Y to 1.0)(Z to 1.0) // Variadic currying
val p3 = q(Z to 1.0)(X to 1.0, Y to 1.0) // Any order is possible
val p4 = q(Z to 1.0)(X to 1.0)(Y to 1.0) // Proper currying
val p5 = q(Z to 1.0)(X to 1.0) // Returns a partially applied function
val p6 = (X + Z + 0)(Y to 1.0) // Does not compile This feature is made possible by encoding a type-level Hasse diagram over a small set of predefined variable names, with skip-connections for variadic combination and partial application. Curious readers may glean further details by referring to the implementation and usage example. Visualization toolsKotlin∇ provides various graphical tools that can be used for visual debugging. Dataflow graphsKotlin∇ functions are a type of directed acyclic graph, called dataflow graphs (DFGs). For example, running the expression Red and blue edges indicate the right and left inputs to a binary operator, respectively. Consider the DFG for a batch of stochastic gradients on linear regression, which can be written in matrix form as : Thetas represent the hidden parameters under differentiation and the constants are the batch inputs (X) and targets (Y). When all the free variables are bound to numerical values, the graph collapses into a single node, which can be unwrapped into a Kotlin PlottingTo generate the sample 2D plots below, run Plotting is also possible in higher dimensions, for example in 3D via Loss curvesGradient descent is one application for Kotlin∇. Below, is a typical loss curve of SGD on a multilayer perceptron: To train the model, execute TestingTo run the tests, execute Kotlin∇ claims to eliminate certain runtime errors, but how do we know the proposed implementation is not incorrect? One method, borrowed from the Haskell community, is called property-based testing (PBT), closely related to metamorphic testing. Notable implementations include QuickCheck, Hypothesis and ScalaTest (ported to Kotlin in Kotest). PBT uses algebraic properties to verify the result of an operation by constructing semantically equivalent but syntactically distinct expressions, which should produce the same answer. Kotlin∇ uses two such equivalences to validate its AD implementation:
For example, consider the following test, which checks whether the analytical derivative and the automatic derivative, when evaluated at a given point, are equal to each other within the limits of numerical precision: val x by Var()
val y by Var()
|
2023-10-27
2022-08-15
2022-08-17
2022-09-23
2022-08-13
请发表评论