| | News & Press Releases |
Wednesday, 27 January 2010 | English - Swedish OLIF dictionary released
An English - Swedish OLIF dictionary has been added to the list of OLIF lexicons distributed by Digital Sonata. The dictionary is available for download from http://www.digitalsonata.com/download.aspx?type=linguisticData. |
Sunday, 10 January 2010 | Bilingual OLIF dictionaries released
Digital Sonata has released a set of low-cost, royalty-free bilingual dictionaries in OLIF format, optimized for use in NLP and content management applications. Each entry includes a translation, the part of speech, and a thesaurus article. The dictionaries are available at http://www.digitalsonata.com/download.aspx?type=linguisticData. The following dictionaries are currently available:
- English -> Finnish
- English -> French
- English -> German
- English -> Japanese
- English -> Korean
- English -> Russian
- English -> Spanish
|
Tuesday, 5 January 2010 | Carabao Language Kit 1.6.2.1 released
Version 1.6.2.1 is now available for download.
Fixed:
- Transliteration to empty string
- Partial transliteration
Added:
- A change log that enables distributed collaboration
Improved:
- Processing speed
- Entry matching accuracy
|
Monday, 1 June 2009 | No more direct sales
Please be advised that we no longer license our products off-the-shelf. If you would like a quote for our services, leave us a message. |
Wednesday, 4 March 2009 | Guide to sequences uploaded
We have uploaded a short guide to building and debugging sequences. It is available on our whitepaper download page. |
Friday, 20 February 2009 | Carabao Language Kit 1.5.0.1 released
Version 1.5.0.1 is now available for download. It brings many changes and enhancements, thanks to the ongoing development of Chinese support (not included in the default database, though).
Fixed:
- Regression: "phantom capitalization" of re-used words
- Regression: sequence style forcing / avoiding
- Repositioning errors in sentences with attached tokens
- Sequence processing in languages not using white spaces
- Regression: single member sequence processing
Added:
- Lattice-based processing for use in speech recognition and OCR applications
- Optional and unmapped members in sequences
- Members in sequences which are validated but not mapped
- Possibility to get a crosslingual representation (components only: DeepAnalyzer and Translation Server)
- Possibility to load content from a disambiguated crosslingual representation
- GUI in Translation Console to enable lattice-based processing
- GUI in Translation Console to enable loading crosslingual representation
- GUI in Translation Console to hint the system about the expected domains in the text
- Analysis mode in Translation Console, when the source and target languages are the same and no styles are enforced / avoided
- Capability of using the white space as a delimiter in languages that don't have white spaces
- Smart quotes and other delimiters
Improved:
- Dictionary GUI - presents the thesaurus from another language if it is missing in the current language
- Sequence builder GUI - color coding of members which are not mapped, or contain conditions producing empty sets
|
Monday, 8 December 2008 | Carabao Language Kit 1.2.3.0 released
Version 1.2.3.0 is now available for download.
Fixed:
- Handling of single quotes as syntax delimiters in English
Added:
- A segmentation mode that handles languages without white spaces (e.g. Chinese, Japanese, Korean, Thai) more effectively. In this mode, runs of different character classes are broken into separate tokens (e.g. Chinese immediately followed by English). The remaining unidentified part is run through the unknown heuristic identifier.
- Automatic conversion for Unicode clipboard data into the currently active encoding in tokens table
- Better warning when attempting to overwrite the current token
- A utility to rebuild semantic links cache
Improved:
- On some systems, every update of the table of tokens added a new set of system icons (minimize, restore, maximize) to the MDI frame window. The maximize option now resizes the window to roughly the full client area without entering maximized mode
|
Monday, 8 September 2008 | Free source code section
We added a small source code section on our Download page, where we will post freebies for developers. |
Monday, 8 September 2008 | Carabao Language Kit 1.2.0.0 released
Version 1.2.0.0 is now available for download.
Fixed:
- Unknown patterns were translated as hypernyms
- Regression: certain category-based sequences were omitted on second execution because of a malfunctioning guess scan caching mechanism
- In analytical mode (Carabao DeepAnalyzer), there was a mismatch between word index number and an idiom member index, in sentences with attached tokens such as 'em, 'm
- When copying a token with one rule unit or fewer, the text was always reset to the original
Added:
- Capability to match numbers as patterns
- When a translation is not found, the engine tries to fall back to a matching hypernym instead
- New methods to Carabao DeepAnalyzer that enable accessing the members of the detected idioms
- New methods to Carabao CDA that enable accessing the unknown heuristics table
- New sequences
- Russian morphological exceptions
Improved:
- If an "unknown pattern" is forced to match a known word, it will not create a new guess if a guess with the same hypernym already exists. For example, if you force a check of whether a known word can be a city, no new record is created if a guess with a known city already exists
- Automatic input language switching in locator fields
- Locator fields are pre-filled with the list of all existing languages in the database, eliminating the need to jump to the next language
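The hypernym fallback added in this release can be pictured with a short Python sketch. It is an illustration under stated assumptions, not the engine's actual logic: the dictionaries `translations` and `hypernyms`, the function name `translate_with_fallback`, and the `max_depth` limit are hypothetical, and walking up more than one level of hypernyms is an assumption that the release note does not confirm.

```python
from typing import Optional

def translate_with_fallback(
    word: str,
    translations: dict[str, str],
    hypernyms: dict[str, str],
    max_depth: int = 5,
) -> Optional[str]:
    """Return a translation for word, falling back to its hypernym.

    translations maps a source word to a target word (hypothetical data).
    hypernyms maps a word to its more general term (hypothetical data).
    max_depth bounds the walk so that a cyclic table cannot loop forever.
    """
    current = word
    for _ in range(max_depth + 1):
        if current in translations:
            return translations[current]
        parent = hypernyms.get(current)
        if parent is None:
            return None  # no translation and no more general term
        current = parent
    return None

if __name__ == "__main__":
    translations = {"dog": "Hund"}
    hypernyms = {"poodle": "dog"}
    print(translate_with_fallback("poodle", translations, hypernyms))
```

Here "poodle" has no translation of its own, so the lookup falls back to the translation of its hypernym "dog".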
|
Wednesday, 23 April 2008 | We are published at ELRA
After a few months of evaluations, agreements, and inspections, our linguistic data is now published on the website of the European Language Resources Association (ELRA). The Russian - English OLIF dictionary is sold at a substantial price, while the Swahili, Czech, and Cebuano dictionaries are distributed free of charge (although ELRA charges for postage and media).
It is important to mention that all this data can be created from (usually free) ASCII dictionaries on the net using Carabao Linguist Edition.
Clarification: OLIF is the Open Lexicon Interchange Format, backed by SAP and created specifically for NLP-oriented lexica. The official website is www.olif.net. |
Tuesday, 1 April 2008 | Server transition
We have just moved to a new server. Performance is much better, but there might be some minor technical glitches in the next few days. Thank you for your patience. |
Tuesday, 11 March 2008 | Carabao Language Kit 1.1.0.1 released
Version 1.1.0.1 is now available for download, mostly to fix the regressions reported in 1.1.0.0.
Fixed:
- Crash when using sequence extraction option (regression from 1.1.0.0)
Added:
- Capability to import sequences by data entry directly from the Sequence Sheet
- Capability to manually set sequence descriptions
- Some sequences for multi-word entity extraction
- More morphological exceptions for Russian
Improved:
- Processing speed and memory consumption - further boost
- Token Sheet (words & sequences) GUI
|
Thursday, 28 February 2008 | Carabao Language Kit 1.1.0.0 released
Version 1.1.0.0 is now available for download.
Fixed:
- Volatility of newly assigned rule units in late sequences
- Inconsistencies in the generation of inflected forms at design time
Added:
- Friendly GUI for meta-rules such as lemmatized forms and generation of inflected forms
- MorphoLogic now inspects the design time data generation meta-rules when generating inflected forms
Improved:
- Processing speed and memory consumption
- Increased maximum length of the meta-rule content field
- Increased some fields to accommodate large sequences and a lot of grammatical data
- Concurrency during long processing
NOTE: if you are upgrading from 1.0 and would like to keep your data, please run the convertTo11.exe executable on your data. |
Sunday, 24 February 2008 | Our products are now available at ComponentSource
ComponentSource, the largest online reseller of software components, is now selling Carabao DeepAnalyzer, with Carabao MorphoLogic and Carabao Translation Server on the way. Here is a direct link to our page: http://www.componentsource.com/features/digital-sonata/index.html
It took us a while (over 2 months) to sign up, with all the checks, examinations, questions, and reviews. ComponentSource gives corporate customers a more convenient way to purchase that complies with their supply chain procedures, and it gives our products higher visibility. |
|