The duckling package

Tags: library, program, systems

Duckling is a library for parsing text into structured data.



Properties

Versions 0.1.0.0, 0.1.1.0, 0.1.2.0, 0.1.3.0
Dependencies aeson (>=0.11.3.0 && <1.1), array (>=0.5.1.1 && <0.6), attoparsec (>=0.13.1.0 && <0.14), base (>=4.8.2 && <5.0), bytestring (>=0.10.6.0 && <0.11), containers (>=0.5.6.2 && <0.6), deepseq (>=1.4.1.1 && <1.5), dependent-sum (>=0.3.2.2 && <0.5), directory (>=1.2.2.0 && <1.4), duckling, extra (>=1.4.10 && <1.6), filepath (>=1.4.0.0 && <1.5), hashable (>=1.2.4.0 && <1.3), haskell-src-exts (==1.18.*), regex-base (>=0.93.2 && <0.94), regex-pcre (>=0.94.4 && <0.95), snap-core (>=1.0.2.0 && <1.1), snap-server (>=1.0.1.1 && <1.1), text (>=1.2.2.1 && <1.3), text-show (>=2.1.2 && <3.7), time (>=1.5.0.1 && <1.9), timezone-olson (>=0.1.7 && <0.2), timezone-series (>=0.1.5.1 && <0.2), unordered-containers (>=0.2.7.2 && <0.3) [details]
License OtherLicense [multiple license files]
Copyright Copyright (c) 2014-present, Facebook, Inc.
Author Facebook, Inc.
Maintainer duckling-team@fb.com
Category Systems
Home page https://github.com/facebookincubator/duckling#readme
Bug tracker https://github.com/facebookincubator/duckling/issues
Source repository head: git clone https://github.com/facebookincubator/duckling
Uploaded Mon Oct 16 23:20:51 UTC 2017 by patapizza
Distributions NixOS:0.1.3.0
Executables duckling-expensive, duckling-request-sample, duckling-example-exe, duckling-regen-exe
Downloads 702 total (283 in the last 30 days)
Rating (no votes yet) [estimated by rule of succession]
Status Docs uploaded by user
Build status unknown [no reports yet]


Readme for duckling-0.1.3.0


Duckling is a Haskell library that parses text into structured data.

"the first Tuesday of October"
=> {"value":"2017-10-03T00:00:00.000-07:00","grain":"day"}
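The date in this example can be checked by hand. The sketch below does not use Duckling at all; it relies only on the time package, and `isoWeekday` and `firstWeekdayOfMonth` are hypothetical helpers introduced for illustration. The `-07:00` offset in the output above comes from the time zone of the parsing context and is not modelled here.

```haskell
import Data.Time.Calendar (Day, addDays, fromGregorian)
import Data.Time.Calendar.WeekDate (toWeekDate)

-- | ISO day-of-week number: 1 = Monday .. 7 = Sunday.
isoWeekday :: Day -> Int
isoWeekday d = let (_, _, w) = toWeekDate d in w

-- | First day of the given month that falls on the requested ISO weekday.
-- Hypothetical helper for illustration; not a Duckling function.
firstWeekdayOfMonth :: Int -> Integer -> Int -> Day
firstWeekdayOfMonth weekday year month =
  addDays (fromIntegral offset) firstOfMonth
  where
    firstOfMonth = fromGregorian year month 1
    -- `mod` with a positive divisor is never negative in Haskell,
    -- so offset is always in the range 0..6.
    offset = (weekday - isoWeekday firstOfMonth) `mod` 7

main :: IO ()
main = print (firstWeekdayOfMonth 2 2017 10)
```

Running it prints `2017-10-03`, the day that appears in the `value` field above, at `day` grain.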

Requirements

A Haskell environment is required. We recommend using stack.

On macOS you'll need to install PCRE development headers. The easiest way to do that is with Homebrew:

brew install pcre

If that doesn't help, run brew doctor and fix the issues it reports.

Quickstart

To compile and run the binary:

$ stack build
$ stack exec duckling-example-exe

The first time you run it, it will download all required packages.

This runs a basic HTTP server. Example request:

$ curl -XPOST http://0.0.0.0:8000/parse --data 'locale=en_GB&text=tomorrow at eight'

See exe/ExampleMain.hs for an example of how to integrate Duckling into your project.

Supported dimensions

Duckling supports many languages, but most don't support all dimensions yet (we need your help!).

| Dimension | Example input | Example value output |
| --------- | ------------- | -------------------- |
| AmountOfMoney | "42€" | {"value":42,"type":"value","unit":"EUR"} |
| Distance | "6 miles" | {"value":6,"type":"value","unit":"mile"} |
| Duration | "3 mins" | {"value":3,"minute":3,"unit":"minute","normalized":{"value":180,"unit":"second"}} |
| Email | "duckling-team@fb.com" | {"value":"duckling-team@fb.com"} |
| Numeral | "eighty eight" | {"value":88,"type":"value"} |
| Ordinal | "33rd" | {"value":33,"type":"value"} |
| PhoneNumber | "+1 (650) 123-4567" | {"value":"(+1) 6501234567"} |
| Quantity | "3 cups of sugar" | {"value":3,"type":"value","product":"sugar","unit":"cup"} |
| Temperature | "80F" | {"value":80,"type":"value","unit":"fahrenheit"} |
| Time | "today at 9am" | {"values":[{"value":"2016-12-14T09:00:00.000-08:00","grain":"hour","type":"value"}],"value":"2016-12-14T09:00:00.000-08:00","grain":"hour","type":"value"} |
| Url | "https://api.wit.ai/message?q=hi" | {"value":"https://api.wit.ai/message?q=hi","domain":"api.wit.ai"} |
| Volume | "4 gallons" | {"value":4,"type":"value","unit":"gallon"} |
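The `normalized` field in the Duration row expresses the same duration in seconds. As a rough sketch of that arithmetic (the `Unit` type and both functions are hypothetical, and this is not how Duckling implements it internally), assuming only fixed-length units:

```haskell
-- | Units of duration used in this sketch (a hypothetical subset).
data Unit = Second | Minute | Hour | Day | Week
  deriving (Eq, Show)

-- | Length of one unit in seconds. Only fixed-length units are covered;
-- months and years vary in length and are left out on purpose.
secondsPer :: Unit -> Integer
secondsPer Second = 1
secondsPer Minute = 60
secondsPer Hour = 60 * 60
secondsPer Day = 24 * 60 * 60
secondsPer Week = 7 * 24 * 60 * 60

-- | Convert a (value, unit) pair to a number of seconds.
normalizeToSeconds :: Integer -> Unit -> Integer
normalizeToSeconds value unit = value * secondsPer unit

main :: IO ()
main = print (normalizeToSeconds 3 Minute)
```

This prints `180`, matching `"normalized":{"value":180,"unit":"second"}` for the input "3 mins".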

Extending Duckling

To regenerate the classifiers and run the test suite:

$ stack build :duckling-regen-exe && stack exec duckling-regen-exe && stack test

It's important to regenerate the classifiers after updating the code and before running the test suite.

To extend Duckling's support for a dimension in a given language, two files typically need to be updated:

  • Duckling/<dimension>/<language>/Rules.hs
  • Duckling/<dimension>/<language>/Corpus.hs

To add a new language:

Rules have a name, a pattern and a production. Patterns are used to perform character-level matching (regexes on input) and concept-level matching (predicates on tokens). Productions are arbitrary functions that take a list of tokens and return a new token.
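To make the name / pattern / production split concrete, here is a toy model in plain Haskell. Every type and function in it (`Token`, `PatternItem`, `Rule`, `applyRule`, and the three rules) is a simplified assumption made for illustration and is not Duckling's actual API. In particular, a plain `String -> Bool` test stands in for a regex, and productions are assumed to return `Maybe Token` so that they can fail.

```haskell
import Data.Char (toLower)

-- | Tokens in this toy model: raw input words plus a few derived concepts.
data Token
  = Raw String
  | Numeral Integer
  | Grain String
  | Duration Integer String
  deriving (Eq, Show)

-- | One slot of a pattern.
data PatternItem
  = RawText (String -> Bool)  -- character-level: test on raw text (regex stand-in)
  | Concept (Token -> Bool)   -- concept-level: predicate on a token

data Rule = Rule
  { ruleName :: String
  , rulePattern :: [PatternItem]
  , ruleProduction :: [Token] -> Maybe Token
  }

matches :: PatternItem -> Token -> Bool
matches (RawText p) (Raw s) = p s
matches (RawText _) _ = False
matches (Concept p) t = p t

-- | Run the production only if every pattern slot matches its token.
applyRule :: Rule -> [Token] -> Maybe Token
applyRule rule tokens
  | length tokens == length (rulePattern rule)
  , and (zipWith matches (rulePattern rule) tokens) = ruleProduction rule tokens
  | otherwise = Nothing

lower :: String -> String
lower = map toLower

-- Only a few entries are listed, for brevity.
smallNumbers :: [(String, Integer)]
smallNumbers = [("one", 1), ("two", 2), ("three", 3)]

isNumeral, isGrain :: Token -> Bool
isNumeral (Numeral _) = True
isNumeral _ = False
isGrain (Grain _) = True
isGrain _ = False

ruleInteger :: Rule
ruleInteger = Rule
  { ruleName = "integer (0..19)"
  , rulePattern = [RawText (\s -> lower s `elem` map fst smallNumbers)]
  , ruleProduction = \ts -> case ts of
      [Raw s] -> Numeral <$> lookup (lower s) smallNumbers
      _ -> Nothing
  }

ruleMinute :: Rule
ruleMinute = Rule
  { ruleName = "minute (grain)"
  , rulePattern = [RawText (\s -> lower s `elem` ["minute", "minutes"])]
  , ruleProduction = \ts -> case ts of
      [Raw _] -> Just (Grain "minute")
      _ -> Nothing
  }

ruleDuration :: Rule
ruleDuration = Rule
  { ruleName = "<integer> <unit-of-duration>"
  , rulePattern = [Concept isNumeral, Concept isGrain]
  , ruleProduction = \ts -> case ts of
      [Numeral n, Grain g] -> Just (Duration n g)
      _ -> Nothing
  }

main :: IO ()
main = do
  let n = applyRule ruleInteger [Raw "two"]
      g = applyRule ruleMinute [Raw "minutes"]
  -- Feed the two produced tokens into the concept-level rule.
  print (sequence [n, g] >>= applyRule ruleDuration)
```

Running it prints `Just (Duration 2 "minute")`: two character-level rules each turn a raw word into a token, and a concept-level rule combines those tokens into a new one.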

The corpus (resp. negative corpus) is a list of examples that should (resp. shouldn't) parse. The reference time for the corpus is Tuesday Feb 12, 2013 at 4:30am.
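The idea behind a corpus and a negative corpus can be sketched without Duckling. In the example below, `toyParse` is a hypothetical stand-in parser that understands only "in <n> minutes", and the corpus entries are made up for illustration. The reference time is taken to be in UTC, which is an assumption, since the text above does not name a time zone.

```haskell
import Data.Char (isDigit)
import Data.Time.Calendar (fromGregorian)
import Data.Time.Clock (UTCTime (..), addUTCTime)

-- | A time of day on the corpus reference day, Tuesday Feb 12, 2013.
-- UTC is an assumption made for this sketch.
onRefDay :: Integer -> Integer -> UTCTime
onRefDay h m = UTCTime (fromGregorian 2013 2 12) (fromInteger (h * 3600 + m * 60))

-- | Reference time for the corpus: 4:30am.
refTime :: UTCTime
refTime = onRefDay 4 30

-- | Toy parser standing in for Duckling: understands only "in <n> minutes".
toyParse :: UTCTime -> String -> Maybe UTCTime
toyParse ref input = case words input of
  ["in", n, "minutes"]
    | not (null n), all isDigit n ->
        Just (addUTCTime (fromInteger (read n * 60)) ref)
  _ -> Nothing

-- | Examples that should parse, with the expected value.
corpus :: [(String, UTCTime)]
corpus =
  [ ("in 2 minutes", onRefDay 4 32)
  , ("in 15 minutes", onRefDay 4 45)
  ]

-- | Examples that should not parse.
negativeCorpus :: [String]
negativeCorpus = ["in minutes", "2 minutes ago"]

-- | Inputs whose behaviour disagrees with the corpus.
failures :: [String]
failures =
  [input | (input, expected) <- corpus, toyParse refTime input /= Just expected]
    ++ [input | input <- negativeCorpus, toyParse refTime input /= Nothing]

main :: IO ()
main = print failures
```

An empty list means every positive example parsed to its expected value and no negative example parsed at all. Fixing the reference time is what makes relative inputs such as "in 2 minutes" have a single expected answer.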

Duckling.Debug provides a few debugging tools:

$ stack repl --no-load
> :l Duckling.Debug
> debug (makeLocale EN $ Just US) "in two minutes" [This Time]
in|within|after <duration> (in two minutes)
-- regex (in)
-- <integer> <unit-of-duration> (two minutes)
-- -- integer (0..19) (two)
-- -- -- regex (two)
-- -- minute (grain) (minutes)
-- -- -- regex (minutes)
[Entity {dim = "time", body = "in two minutes", value = "{\"values\":[{\"value\":\"2013-02-12T04:32:00.000-02:00\",\"grain\":\"second\",\"type\":\"value\"}],\"value\":\"2013-02-12T04:32:00.000-02:00\",\"grain\":\"second\",\"type\":\"value\"}", start = 0, end = 14}]
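The lines before the final `Entity` list form a tree: each line is a rule name followed by the text it matched, and each extra `-- ` prefix means one more level of nesting. The sketch below rebuilds that layout from a hand-written tree. The `Node` type and `render` function are hypothetical and say nothing about how Duckling.Debug is implemented; they are only meant to help with reading the output.

```haskell
-- | A node of the parse tree: rule name, matched text, child nodes.
data Node = Node String String [Node]

-- | Render a tree with one "-- " prefix per level of depth.
render :: Int -> Node -> [String]
render depth (Node rule body children) =
  (concat (replicate depth "-- ") ++ rule ++ " (" ++ body ++ ")")
    : concatMap (render (depth + 1)) children

-- | The tree from the debug session above, written out by hand.
tree :: Node
tree =
  Node "in|within|after <duration>" "in two minutes"
    [ Node "regex" "in" []
    , Node "<integer> <unit-of-duration>" "two minutes"
        [ Node "integer (0..19)" "two" [Node "regex" "two" []]
        , Node "minute (grain)" "minutes" [Node "regex" "minutes" []]
        ]
    ]

main :: IO ()
main = mapM_ putStrLn (render 0 tree)
```

Read this way, the top-level rule matched the whole input by combining a regex match on "in" with a duration, which was itself built from an integer and a grain.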

License

Duckling is BSD-licensed. We also provide an additional patent grant.