Corpus redundancy manager - Download




About Corpus redundancy manager

Redundancy caused by cut-and-paste operations in text biases machine learning models for NLP.
This module takes a directory and returns a list containing a subset of the files in that directory, chosen so that the similarity between any two files in the subset stays below an upper bound.
Features
  • Identify copy-paste redundancy in a document corpus
  • Input: a folder of text documents and a similarity threshold
  • Output (a): a list of non-redundant documents (a non-redundant subset of the corpus)
  • Output (b): a list of document pairs found to be redundant, with the amount of redundancy for each pair
  • Python script (2.6), tested on various Linux flavours and Windows XP/7
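
The input and the two outputs listed above can be illustrated with a minimal sketch. This is not the tool's own code. The listing does not say how similarity is measured, so the sketch assumes Jaccard overlap of word trigrams, and the names shingles, similarity, filter_redundant and read_folder are hypothetical. It is written for Python 3 rather than the 2.6 script described here.

```python
import os
from itertools import combinations


def shingles(text, n=3):
    """Return the set of word n-grams in text (assumed similarity unit)."""
    words = text.lower().split()
    if len(words) < n:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def similarity(a, b):
    """Jaccard overlap of two shingle sets, a value in [0, 1] (assumed measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def filter_redundant(docs, threshold, n=3):
    """docs maps file name -> text. Returns (kept_names, redundant_pairs)."""
    sets = {name: shingles(text, n) for name, text in docs.items()}

    # Output (b): every pair whose similarity exceeds the threshold,
    # together with the amount of overlap for that pair.
    pairs = []
    for x, y in combinations(sorted(sets), 2):
        score = similarity(sets[x], sets[y])
        if score > threshold:
            pairs.append((x, y, score))

    # Output (a): greedily keep a document only if it is not redundant
    # with any document that has already been kept.
    redundant = {(x, y) for x, y, _ in pairs}
    kept = []
    for name in sorted(sets):
        if all((k, name) not in redundant for k in kept):
            kept.append(name)
    return kept, pairs


def read_folder(folder):
    """Read every regular file in folder as text (hypothetical helper)."""
    docs = {}
    for fname in sorted(os.listdir(folder)):
        path = os.path.join(folder, fname)
        if os.path.isfile(path):
            with open(path, encoding="utf-8", errors="replace") as fh:
                docs[fname] = fh.read()
    return docs
```

A call such as filter_redundant(read_folder("corpus"), 0.5) would return both lists for a folder named corpus. The greedy pass keeps the alphabetically first file of each redundant pair, which is a design choice of this sketch and not something the listing states about the tool.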



Previous Versions

Here you can find the changelog of Corpus redundancy manager since it was posted on our website on 2015-04-25 03:00:00. The latest version was updated on 2024-04-22 15:43:37. The changes in each version are listed below.

Corpus redundancy manager version
Updated At: 2011-05-09
Changes: Several fixes and updates


Disclaimer

External Download


We do not host Corpus redundancy manager on our servers, and we did not scan it for viruses, adware, spyware or other types of malware. This app is hosted by the software publisher and passed their terms and conditions to be listed there. We recommend caution when installing it.

The external download link for Corpus redundancy manager is provided to you by apps112.com without any warranties, representations or guarantees of any kind, so access it at your own risk.

If you have questions regarding this particular app, contact the publisher directly. For questions about the functionality of apps112.com, contact us.

Users Rating: 5.0/5 (1)
Downloads: 104
Updated At: 2024-04-22 15:43:37
Publisher: cohenrap
Operating System: Mac, Windows, Linux
License Type: Free