Recently, I have been working on a very interesting and sophisticated project called typerepmap
. A lot of advanced features and tricks were used during the development process and I have discovered many amusing and new sides of Haskell. So, I decided to share the ideas, steps, issues, etc. in this blog post.
If you want to skip all the funny parts, here is the link to the code itself:
What it’s all about🔗
The basic idea behind typerepmap
is to have a data structure like Map
, but where types serve as keys, and values stored in the map are of the type specified in the corresponding key.
An example image of this structure:
Int: 42
Bool: True
String: "Haskell"
There can be only one keyvalue pair for each type.
And here is an example written in pseudocode for better understanding:
42 :: Int) empty = [(Int, 42)]
insert (^ ^
key value
42 :: Int) (insert True empty) = [(Int, 42), (Bool, True)]
insert (^ ^ ^ ^
key value key value
We also want our values to be indexed by a polymorphic type, but that will be explained later.
Existing Solutions🔗
There already exist some libraries that implement ideas similar to typerepmap
:
typemap
appears to resemble our project, however the interface is different. They track the elements in the types and don’t provide the desired parametrization.dependentmap
is closer to our goal in terms of the interface but the package has a complete reimplementation ofData.Map.Lazy
, and the goal of thetyperepmap
project is to have an efficient implementation based on primitive unboxed arrays.
Motivation🔗
You might wonder what typerepmap
brings to the table if there are other packages that aim to fulfil the same purpose. The primary goal is to use it in the caps
library instead of the DMap
type from dependentmap
parametrized by TypeRep
.
In caps
the performance of lookups is extremely important so it makes sense to prioritize its performance above that of other functions.
Implementing TypeRepMap🔗
Sections below describe the details of the implementation phases and the general concepts.
NOTE: in this blog post I am talking about
ghc8.0.2
or higher.
Extensions used🔗
The code snippets in this blog post assume that the following extensions are enabled:
XTypeApplications
XScopedTypeVariables
XGADTs
XTypeInType
XAllowAmbiguousTypes
Mapbased implementation🔗
The reference implementation is more or less straightforward. It uses a lazy Map
from containers
as an internal representation of the desired data type.
Keys🔗
Normally, types in Haskell are only present at compiletime: after typechecking, they are completely erased from the program. And yet we want to use types as keys in a map. This requires a runtime representation for types. Luckily, the standard library provides a runtime representation for types in the form of TypeRep
. But there are actually two different definitions of TypeRep
in base
:
The one in Type.Reflection
was introduced in GHC 8.2 and the old one was rewritten to be based on it. Type.Reflection.TypeRep
has kind TypeRep :: k > *
while the old one has kind TypeRep :: *
.
To have the basic idea of what actually TypeRep
is, you can think of the old TypeRep
as an infinite ADT with all types enumerated as tag constructors:
data TypeRep = Int  Bool  Char  String  ...
and the new TypeRep
is an equivalent to the infinite GADT:
data TypeRep (a :: k) where
Int :: TypeRep Int
Bool :: TypeRep Bool
Char :: TypeRep Char
String :: TypeRep String
...
If you are interested in the actual difference between old and new versions of the TypeRep
and motivation for this change, here is a nice ICFP video by Simon Peyton Jones:
I use the old TypeRep
that comes from Data.Typeable
. And I have an explanation for that: there is a limitation in regular Map
that all keys must be of the same type and this is not possible to achieve with parameterized TypeRep
. Also, the old TypeRep
will never be deprecated (from 8.2
it is just a different interface to the new TypeRep
, so it’s not obsolete), and it is sufficient for our goal to support older GHC versions.
Here is a usage example of basic TypeRep
interface:
> :t typeRep
ghcitypeRep :: Typeable a => proxy a > TypeRep
> typeRep (Proxy @Int)
ghciInt
> :t it
ghciit :: TypeRep
Values🔗
For the first prototype, I decided to use Dynamic
as values in our TypeRepMap
.
> :t toDyn
ghcitoDyn :: Typeable a => a > Dynamic
> toDyn True
ghci<<Bool>>
> fromDynamic (toDyn "Haskell") :: Maybe String
ghciJust "Haskell"
So we’ve got:
newtype TypeRepMap = TypeRepMap { unMap :: Map TypeRep Dynamic }
and the initial interface looks like this:
insert :: forall a . Typeable a => a > TypeRepMap > TypeRepMap
= TypeRepMap . LMap.insert (typeRep (Proxy @a)) (toDyn val) . unMap
insert val
lookup :: forall a . Typeable a => TypeRepMap > Maybe a
lookup = fromDynamic <=< LMap.lookup (typeRep $ Proxy @a) . unMap
When looking at the Dynamic
data type implementation
data Dynamic = Dynamic TypeRep Obj
type Obj = Any
you can notice that it already stores TypeRep
inside, so it seems like it’s a bit suboptimal decision due to redundancy. And we can safely use Any
as our value type.
According to the Dynamic
implementation, we can use unsafeCoerce
function for the conversion to Any
and from Any
.
So we get:
newtype TypeRepMap = TypeRepMap { unMap :: LMap.Map TypeRep Any }
=
insert val TypeRepMap
. LMap.insert (typeRep $ Proxy @a) (unsafeCoerce val)
. unMap
lookup = fmap unsafeCoerce . LMap.lookup (typeRep $ Proxy @a) . unMap
Let’s check how it’s all working:
> let x = lookup $ insert (11 :: Int) empty
ghci
> x :: Maybe Int
ghciJust 11
> x :: Maybe ()
ghciNothing
All right, we have a simple working version. But there are ways to improve it.
Parameterization🔗
The next step is to parametrize our data type by type variable f
with kind f :: k > *
. This f
will be the interpretation of our keys. Such parameterization allows us to encode additional structure common between all elements, making it possible to use TypeRepMap
to model a variety of things from extensible records to monadic effects. This sort of parametrization may be familiar to users of vinyl
records.
Note that the input kind is k
— we want to support arbitrary kinds as well. Since TypeRep
is polykinded, the interpretation can use any kind for the keys (see some examples in documentation).
newtype TypeRepMap (f :: k > *) = TypeRepMap
unMap :: LMap.Map TypeRep Any
{ }
The implementation of the functions stays the same, but the types are different:
insert :: forall a f . Typeable a => f a > TypeRepMap f > TypeRepMap f
lookup :: forall a f . Typeable a => TypeRepMap f > Maybe (f a)
Our previous implementation is just equivalent to TypeRepMap Identity
in the context of the described design.
NOTE: Another reason to get rid of the
Dynamic
: if we keep it then we have to specifyTypeable (f a)
constraint instead ofTypeable a
in the type declarations. And havingTypeable a
constraint would let us implement the following function efficiently:hoist :: (forall x. f x > g x) > TypeRepMap f > TypeRepMap g
Vectorbased implementation🔗
The next step is to write an alternative implementation based on unboxed vectors, which is expected to be faster.
We want to use Vector (TypeRep, Any)
instead of our lazy map. This vector is going to be sorted by TypeRep
. insert
/lookup
algorithms should be implemented manually in the following way:
insert
: allocate a new vector ofn + 1
element, copy everything from the initial vector adding the new element and don’t forget to keep the sorting.lookup
: the simple binary search.
The point of the unboxed vector is that it helps to get rid of the pointer indirection. If we take just Vector
we will observe this picture (Ty
stands for TypeRep
and El
stands for an element):
Pair₁, Pair₂, Pair₃, Pair₄ ]
[ / \ / \ / \ / \
Ty₁ El₁ Ty₂ El₂ Ty₃ El₃ Ty₄ El₄
Instead of this what we would like to see is:
Ty₁, El₁, Ty₂, El₂, Ty₃, El₃, Ty₄, El₄ ] [
In this way, as the result, the access to the Ty
or El
is shorter for exactly one pointer dereference.
However, turned out that it’s more efficient to store keys and values in separate vectors under corresponding indices:
Ty₁, Ty₂, Ty₃, Ty₄ ]
[    
El₁, El₂, El₃, El₄ ] [
Unfortunately, TypeRep
doesn’t have the Unbox
instance and it looks like it’s not possible to write it. So instead of storing TypeRep
we will be storing a Fingerprint
. Basically, Fingerprint
is the hash for TypeRep
, so it makes sense to move in this direction.
> :t typeRepFingerprint
ghcitypeRepFingerprint :: TypeRep > Fingerprint
> typeRepFingerprint $ typeRep $ Proxy @Int
ghci
b1460030427ac0fa458cbf347f168b53
> typeRepFingerprint $ typeRep $ Proxy @Bool
ghci ebf3a8541b05453b8bcac4a38e8b80a4
TypeRep
from Data.Typeable
module is defined as
type TypeRep = SomeTypeRep
If we take a look at the Ord
instance of SomeTypeRep
in base
we’ll see that it’s confirmed that Fingerprint
is unique for each TypeRep
. That means it’s okay to use Fingerprint
as a key instead of TypeRep
.
Vector🔗
This is initial vectorbased implementation:
data TypeRepMap (f :: k > *) = TypeRepMap
fingerprints :: Vector Fingerprint
{ anys :: Vector Any
, }
We want to use unboxed vector as a type for the fingerprints
field of TypeRepMap
.
Every unboxed vector is the newtype wrapper over some primitive vector. In order to use an unboxed vector of Fingerprint
we need to implement an instance of the Prim
typeclass from the primitive
package for Fingerprint
. It was proposed to add this instance under this issue in primitive
library (having this instance inside library would simplify implementation a lot):
As the reference for Prim
instance implementation, we can use the Storable
type class which contains similar functions. There is already the instance Storable
for Fingerprint
. An assumption is that there is no significant difference between Storable
and Prim
for our lookup
checks and we can use storable vector instead of unboxed one. For more information about the difference between those typeclasses see this SO answer.
Though our initial assumptions were false and turned out that Storable
doesn’t give the desired performance boost as shown with benchmarks.
Optimal Vector🔗
According to the source code, Fingerprint
is a pair of (Word64, Word64)
. So instead of having a single vector of Fingerprint
s we can have a vector of Word64
where Fingerprint
with number i
stored on 2 * i
and 2 * i + 1
indices.
But actually, it’s better to split it into two separate vectors of Word64
where one vector stores the first halves of Fingerprint
and the other one stores the second halves correspondingly. It makes the implementation easier and also faster (checked with benchmarks) because of the assumption that it should be almost always enough to compare only the first part and it makes key comparison faster.
After all described optimizations were done our structure took the following form:
data TypeRepMap (f :: k > *) = TypeRepMap
fingerprintAs :: Unboxed.Vector Word64
{ fingerprintBs :: Unboxed.Vector Word64
, anys :: Boxed.Vector Any
, }
And the lookup
function was implemented like this:
lookup :: forall a f . Typeable a => TypeRepVector f > Maybe (f a)
lookup tVect =
. (anys tVect V.!)
fromAny <$> binarySearch (typeRepFingerprint $ typeRep $ Proxy @a)
(fingerprintAs tVect) (fingerprintBs tVect)
It uses a manually implemented version of the binary search algorithm optimized for unboxed vectors. The algorithm initially performs a binary search using the fingerprintAs
vector only. And then, after finding the first half, walks through the fingerprintBs
vector.
At first, a simple naive binary search was implemented but later it was rewritten into a cacheoptimized binary search (see the description here) which boosted the performance significantly.
Arraybased implementation🔗
But that’s not all. Later we noticed that every vector has the following definition:
data Vector a = Vector {# UNPACK #} !Int
{# UNPACK #} !Int
{# UNPACK #} !(Array a)
As you can see it contains two Int
fields. So we can make our representation more optimal by using Array
instead of boxed vector and PrimArray
instead of unboxed vector directly in the TypeRepMap
data type.
After all optimizations the final shape of the TypeRepMap
is following:
data TypeRepMap (f :: k > Type) = TypeRepMap
fingerprintAs :: {# UNPACK #} !(PrimArray Word64)
{ fingerprintBs :: {# UNPACK #} !(PrimArray Word64)
, anys :: {# UNPACK #} !(Array Any)
, }
Benchmarking🔗
Initially, I was frustrated about this part because I had no idea how to create the Map
of 1000 elements as that means I needed to somehow generate 1000 types. But there was actually a pretty elegant solution for this puzzle — polymorphic recursion.
Let’s introduce the following data types as typelevel natural numbers:
data Z
data S a
Using these data types we can now implement the function which builds TypeRepMap
of the desired size.
buildBigMap :: forall a . Typeable a
=> Int
> Proxy a
> TypeRepMap Proxy
> TypeRepMap Proxy
so when I run buildBigMap
with size n
and Proxy a
, it calls itself recursively with n  1
and Proxy (S a)
at each step, so the types are growing on each step.
But this wasn’t the only challenge in benchmarking TypeRepMap
. There were also a few interesting things with benchmarks to keep in mind:
 We should force maps to normal form before benchmarking.
 We can’t use
rnf
function. DerivingNFData
instance forTypeRepMap
is not possible because there can be noNFData
forAny
. We won’t be able to usernf
because it would try to force both the keys and the values, as our values areAny
(can’t force them), but since evaluating the values is not important at all for the benchmark, we could try to define a function likernf
but without touching the values.
For Map
based implementation we need to benchmark the lookup
function on different depths of our tree (as Map
is internally a tree). But the key can be very close to the root so our benchmarks won’t be honest enough. Thus we need to test on different Proxy
s with different types.
Here is the diagram of how the tree’s been constructed. You can notice that the Char
element is the direct child of the root:
16
size:
tree:Proxy * (S (S (S (S (S (S (S (S (S (S (S (S (S (S Z))))))))))))))
+

+Char
 Proxy * (S (S (S (S (S (S Z))))))
  +
  Proxy * (S (S (S (S (S (S (S (S (S (S (S Z)))))))))))
 +
 
 +
Proxy * (S (S (S (S (S (S (S (S (S (S (S (S (S Z)))))))))))))
Proxy * (S (S (S (S (S (S (S (S (S (S Z))))))))))
 +
 Proxy * (S (S (S (S (S (S (S (S (S Z)))))))))
 +
  Proxy * (S (S (S (S (S (S (S (S Z))))))))
  +
 Proxy * (S (S (S (S (S (S (S Z)))))))
+
Proxy * (S Z)
 +
 Proxy * (S (S (S Z)))
 +
  Proxy * (S (S (S (S (S (S (S (S (S (S (S (S Z))))))))))))
  +
 Proxy * (S (S (S (S Z))))
+
Proxy * (S (S Z))
 +
 Proxy * (S (S (S (S (S Z)))))
+
Proxy * Z +
Since we can’t predict how Ord
on TypeRep
will behave we need to select a Proxy
from our range randomly, however, because our types were huge we introduced the following type family to solve that issue:
type family BigProxy (n :: Nat) :: * where
BigProxy 0 = Z
BigProxy n = S (BigProxy (n  1))
While running this version of benchmarks it turned out that rnf
function was taking a lot of time mostly on normalisation of the enormous TypeRep
keys which consisted of tall nested types like S (S (S ...))
.
So, eventually, I end up using the ghc plugin ghctypelitsknownnat and the type of the buildBigMap
became:
buildBigMap :: forall (a :: Nat) . KnownNat a
=> Int
> Proxy a
> TypeRepMap (Proxy :: Nat > *)
> TypeRepMap (Proxy :: Nat > *)
In order to benchmark lookup
function we implemented a special function fromList
to use in place of the bunch of inserts, so we will be able to see the real time measures of lookup
operation itself.
data TF f where
TF :: Typeable a => f a > TF f
fromList :: [TF f] > TypeRepMap f
Now the buildBigMap
function will have type
buildBigMap :: forall (a :: Nat) . KnownNat a
=> Int
> Proxy a
> [TF (Proxy :: Nat > *)]
> [TF (Proxy :: Nat > *)]
Benchmarks make 10 lookups to collect average performance statistics:
tenLookups :: TypeRepMap (Proxy :: Nat > *)
> ( Proxy 10, Proxy 20, Proxy 30, Proxy 40
Proxy 50, Proxy 60, Proxy 70, Proxy 80
,
)= (lp, lp, lp, lp, lp, lp, lp, lp)
tenLookups tmap where
lp :: forall (a :: Nat) . Typeable a => Proxy a
= fromJust $ lookup tmap lp
and compare the work of mapbased implementation with optimal arraybased implementation. Here are the achieved results:
NOTE: time in the report is for 10 lookups. To get the average time of single
lookup
you need to divide time by 10.
 Benches GHC8.4.3
mapbased/lookup
benchmarking 2.198 μs (2.195 μs .. 2.202 μs)
time 1.000 R² (1.000 R² .. 1.000 R²)
2.196 μs (2.193 μs .. 2.199 μs)
mean 10.46 ns (8.436 ns .. 12.67 ns)
std dev
map/lookup
benchmarking dependent 819.0 ns (810.7 ns .. 829.1 ns)
time 0.999 R² (0.999 R² .. 1.000 R²)
815.8 ns (812.1 ns .. 822.5 ns)
mean 16.11 ns (9.371 ns .. 23.09 ns)
std dev
binarysearch/lookup
benchmarking vector370.7 ns (368.9 ns .. 372.2 ns)
time 1.000 R² (1.000 R² .. 1.000 R²)
368.9 ns (368.2 ns .. 369.7 ns)
mean 2.512 ns (1.938 ns .. 3.474 ns)
std dev
cacheoptimizedbinarysearch/lookup
benchmarking array183.5 ns (183.2 ns .. 183.8 ns)
time 1.000 R² (1.000 R² .. 1.000 R²)
183.6 ns (183.3 ns .. 184.4 ns)
mean 1.535 ns (958.3 ps .. 2.631 ns) std dev
Conclusion🔗
In this blog post, I wanted to show the difficulties, tricks, and useful information which I personally learned during the implementation of an optimized version of TypeRepMap
. Also, I needed to somehow structure the knowledge I’ve gained while working on this project. You can say that some parts of the post can be skipped or might be irrelevant but I wrote it in such a way on purpose to highlight the topics that I find very hard to find and understand quickly. So I hope you too will find this knowledge useful!
Acknowledgments🔗
Many thanks to Vladislav Zavialov (@intindex) for mentoring this project! It was the great experience for me.
Bonus🔗
A few more challenges on the way to the release typerepmap
:
KindOf🔗
During interface enhancement I ran into some weird issue described below.
It’s nice to have the member
function and it makes sense to implement it using already written lookup function:
member :: forall a f . Typeable a => TypeRepMap f > Bool
= case lookup @a trMap of
member trMap Nothing > False
Just _ > True
Type of the lookup
function is the following:
lookup :: forall a f . Typeable a => TypeRepMap f > Maybe (f a)
Unfortunately, this implementation of member
doesn’t compile! The problem is in the fact that the compiler can’t infer that type variable a
and the argument to f
have the same kind. These two functions have the following type with implicitly inferred kinds:
lookup :: forall {k} (a :: k) (f :: k > *) . Typeable a
=> TypeRepMap f > Maybe (f a)
member :: forall {k1} {k2} (a :: k1) (f :: k2 > *) . Typeable a
=> TypeRepMap f > Bool
After this ghc proposal is implemented, it should be possible to write such type signatures directly in code. The current workaround is to use this trick with KindOf
type:
type KindOf (a :: k) = k
member :: forall a (f :: KindOf a > Type) . Typeable a => TypeRepMap f > Bool
New TypeRep performance🔗
During benchmarking the Map
based implementation of TypeRepMap
, very perceptible performance degradation was noticed. Here is the comparison table with the results we have with our Map
based implementation.
ghc version  containers  performance 

8.0.2  0.5.7.1  556.9 ns 
8.2.2  0.5.10.2  2.076 μs 
8.4.3  0.5.11.0  2.464 μs 
We didn’t observe this performance degradation when we used Fingerprint
as keys, so it’s probably an issue with the new TypeRep
.
KnownNat and Typeable🔗
Initial version of buildBigMap
function had this type signature:
buildBigMap :: forall (a :: Nat) . (Typeable a, KnownNat a) => ...
But, unfortunately, it became broken on GHC8.4.3! Turned out that Typeable
and KnownNat
constraints don’t play well together. This observation resulted in the following ghc ticket with quite an interesting discussion: