Page last modified: Friday 12th December 2008
This page tracks all released versions of the VARD 2 software, detailing bug fixes, changes made and functionality added. The user guide provides details of the current version's functions and how to use the software.
Current Version
2.2
2.2 is a major upgrade to VARD, offering many improvements and new features, including:
- An extended dictionary based upon the Spell Checker Oriented Word Lists (SCOWL) has been added. The dictionary no longer takes words from the replacements in the known variants list, as some of these were erroneous.
- Greatly improved XML provision. VARD now uses regular expressions to deal with all XML tags, which means that fully valid XML files are no longer required (broken XML tags will still cause problems, however; this is virtually unavoidable). When exporting, XML tags are only added to reflect changes made by the system or the user.
- Variant tags are now only added if the user indicates a word should be marked as a variant in the interactive version. This allows VARD to use the current dictionary to decide whether a word should be a variant.
- Correct tags are now added for words which the user indicates as being modern forms (not variants).
- The uncommon words group has been removed as many words were being placed in this group incorrectly. In future versions of VARD it is planned to use contextual information to detect real-word errors.
- A command line interface has been added to allow VARD to be potentially run on a network machine or as part of a sequence of tools in a script.
- VARD now requires Java 6. Please ensure you have the latest version of Java installed on your machine before running VARD.
- Many bugs have been fixed, including the four on the bugs page.
- Text produced in any version of the software (interactive, batch or command line) can be opened and processed with any other version. VARD XML tags will be parsed appropriately and in the same fashion in each version.
- Pasted text (with or without XML tags) is processed in exactly the same way as if it had been opened in a fresh file, except that it is appended to the end of the currently displayed text.
- As previously stated, XML is dealt with much more successfully.
- The delay when switching between word lists when dealing with large texts in the interactive version has been greatly reduced.
- Various problems when undoing and redoing edits in the interactive version have been cleaned up.
- Previously, when processing large corpora, system memory was being depleted due to poor "garbage collection" of previously processed text. This issue has now been resolved.
- A problem occurred occasionally when the user clicked on the text in the interactive version whilst the words were being evaluated. This has now been resolved.
- The stats file produced during batch processing (and now in the command line interface) now includes tokens as well as types.
- Text within XML tags (<...>) (except VARD's own tags), square brackets ([...]) and curly brackets ({...}) is now ignored (not processed) by VARD and coloured grey in the interactive version. This behaviour can be edited by the user in "saved data/text_to_ignore.txt". Regular expressions are used to declare which portions of text to ignore, so a user can state that text between certain tags, such as headers, should be ignored.
- HTML/XML/Unicode entities are now dealt with much better. Unicode entities in the form &(#x)1234; are now converted into their equivalent Java Unicode characters and so are processed by VARD like any other character. They are converted back into their original format upon saving. XML/HTML special entities such as &amp; and &quot; are converted to their equivalent characters. This behaviour can be edited in "saved data/entities.txt". Regular expressions are used to declare which character sequences should be detected, and a replacement is given for each.
- A user can now search through the instances of a selected word using the previous/next instance buttons located under the word list in the interactive version of the tool.
- When joining words in the interactive version, the newly created word is evaluated like any other word; if the new word forms a dictionary entry it will be marked as a modern form, otherwise it will be marked as a variant.
- New icons have been added in the interactive version courtesy of famfamfam.
- VARD now runs more efficiently, improving the overall speed when processing texts.
- Various other minor improvements and bug fixes.
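As an illustration of the two regular-expression driven items above (ignored text and numeric entities), the following is a minimal Java sketch. It is not VARD's source code: the class name, method names and patterns are assumptions made for this example. The real rules live in "saved data/text_to_ignore.txt" and "saved data/entities.txt", whose file format is not reproduced here. The sketch does not exempt VARD's own tags, does not convert characters back on saving, and assumes numeric entities take the standard decimal (&#383;) or hexadecimal (&#x17F;) form.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: names and patterns are assumptions, not VARD's own code.
class IgnoreAndEntitySketch {

    // Hypothetical ignore rule: any <...>, [...] or {...} span.
    static final Pattern IGNORE =
            Pattern.compile("<[^>]*>|\\[[^\\]]*\\]|\\{[^}]*\\}");

    // Simplified notion of a word: a run of letters and apostrophes.
    static final Pattern WORD = Pattern.compile("[\\p{L}']+");

    // Assumed numeric entity forms: &#383; (decimal) or &#x17F; (hexadecimal).
    static final Pattern NUMERIC_ENTITY =
            Pattern.compile("&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));");

    /** Returns the words that remain once ignored spans are blanked out. */
    static List<String> wordsToProcess(String text) {
        // Replace each ignored span with a space so neighbouring words stay separate.
        String visible = IGNORE.matcher(text).replaceAll(" ");
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(visible);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    /** Replaces each numeric entity with the character it denotes. */
    static String decodeNumericEntities(String text) {
        Matcher m = NUMERIC_ENTITY.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Group 1 holds hexadecimal digits, group 2 holds decimal digits.
            int codePoint = (m.group(1) != null)
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2));
            String ch = new String(Character.toChars(codePoint));
            m.appendReplacement(out, Matcher.quoteReplacement(ch));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(wordsToProcess("<p>Goe [sic] to {note} bedde</p>"));
        System.out.println(decodeNumericEntities("&#x17F;ay &#383;o"));
    }
}
```

For the first sample string only the words Goe, to and bedde are left for processing. In the second, both entities decode to the long s character (U+017F). Out-of-range code points are not handled in this sketch and would throw an exception.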
The User Guide has been updated to reflect these changes. If you have any questions please use the form on the FAQ page. VARD 2.2 should still be considered beta and some bugs may remain; please report any you find.
VARD is free for academic use. If you would like to download VARD, please make a request on the availability page.
Previous Versions
2.1.2
A small update to fix a couple of bugs. Firstly, when the Batch Version outputs texts, the folder structure of the original files is now retained. A bug has also been fixed where manually editing the output folder would cause a system error. Secondly, a bug where closing XML tags were being processed by VARD has been fixed. Reading existing XML files can still cause problems; it is hoped that this will be rectified in version 2.1.3.
As before, minor bugs may remain; please report any you find.
2.1.1
A minor update to allow the software to be used on other platforms. Use "vard-windows-run.bat" on Windows and "vard-run.command" on other platforms. Java 1.5.0 should also now be sufficient to run VARD 2.
As before, minor bugs may remain; please report any you find.
2.1.0
The first version publicly released (on this website) in order to gain user feedback for the evaluation section of my PhD.
This version has greatly improved processing speed and stability, although minor bugs may remain; please report any you find.
2.0.0 - 2.0.9
A series of non-publicly released versions produced during an undergraduate project and the first year of my PhD.
1
The original version of VARD was non-interactive and only used a known variants list to search for and replace variants found within a text.
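The search-and-replace approach of the original version can be pictured as a simple lookup. The following is a minimal Java sketch under stated assumptions, not the original code: the class and method names, the tokenisation and the sample entries are all hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: not the original VARD code.
class KnownVariantsSketch {

    // Simplified tokenisation (an assumption): a word is a run of letters.
    static final Pattern WORD = Pattern.compile("\\p{L}+");

    /** Replaces every word found in the known variants list with its modern form. */
    static String normalise(String text, Map<String, String> knownVariants) {
        Matcher m = WORD.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Look the word up in lower case; words not in the list are kept unchanged.
            String modern = knownVariants.get(m.group().toLowerCase(Locale.ROOT));
            String result = (modern != null) ? modern : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(result));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical entries for demonstration.
        Map<String, String> knownVariants = new HashMap<>();
        knownVariants.put("goe", "go");
        knownVariants.put("bedde", "bed");
        System.out.println(normalise("goe to bedde", knownVariants));
    }
}
```

With the two sample entries, "goe to bedde" becomes "go to bed". A lookup like this can only handle variants that are already in the list, and the sketch does not preserve the capitalisation of replaced words.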