Data Processing & Review Web Portal
The goal of this project is to provide a friendly collaborative environment for users interested in processing, analysis and review of heterogeneous electronic data.
Implemented as a web portal, the system provides the following core functions:
Document uploading
The system accepts documents in any popular format (text, HTML, MS Office, PDF, and emails in various formats, including Outlook and Lotus Notes). These documents are then parsed, indexed, and prepared for later analysis. In addition to the web uploader, the system provides a desktop client.
Data organization
Data is stored in an intuitive, easy-to-use manner: documents are kept in folders, which are tied to projects.
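The project, folder, and document hierarchy described above could be modeled roughly as follows. This is only an illustrative sketch: the class names, fields, and the `all_documents` helper are assumptions made for the example, not the portal's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str
    tags: set = field(default_factory=set)  # hypothetical field for review tags


@dataclass
class Folder:
    name: str
    documents: list = field(default_factory=list)


@dataclass
class Project:
    name: str
    folders: list = field(default_factory=list)

    def all_documents(self):
        """Return every document in the project, folder by folder."""
        return [doc for folder in self.folders for doc in folder.documents]


project = Project(
    "Case A",
    [
        Folder("Inbox", [Document("memo.pdf")]),
        Folder("Archive", [Document("old.doc"), Document("note.txt")]),
    ],
)
```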
Tagging and review
Documents can be viewed and tagged directly in a web browser. Advanced search, filtering, and grouping capabilities make review more efficient.
Collaboration with other users
To organize virtual workgroups and work on the same data together, you can assign data to other users for review.
Social features
As in a social network, you can connect with other people, send private messages, and manage your profile.
Dashboard and reports
Statistics on your data and your collaborators' work are displayed in various ways, from graphical charts to hierarchical, exportable reports.
The backend of the system is powered by a complex data processing engine. It performs the following tasks:
Pre-processing:
- File extension filtering
- File type identification
- Hash calculation
- Redundancy removal (exact duplicate identification)
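The pre-processing steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the engine's actual implementation: the extension allow-list, the short table of magic-byte signatures, the choice of SHA-256 as the hash, and the function names are all hypothetical.

```python
import hashlib
from pathlib import Path

# Hypothetical allow-list; the extensions the real system accepts are not documented here.
ALLOWED_EXTENSIONS = {".txt", ".html", ".doc", ".docx", ".pdf", ".msg", ".eml"}

# A few well-known magic-byte signatures for content-based type identification.
MAGIC_SIGNATURES = {
    b"%PDF": "pdf",
    b"PK\x03\x04": "zip-container",  # e.g. .docx, .xlsx
    b"\xd0\xcf\x11\xe0": "ole2",     # e.g. legacy .doc, .msg
}


def identify_type(data: bytes) -> str:
    """Identify a file type from its leading bytes, ignoring the extension."""
    for signature, name in MAGIC_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return "unknown"


def preprocess(files):
    """Filter, identify, hash, and de-duplicate an iterable of (name, bytes)."""
    seen_hashes = set()
    kept = []
    for name, data in files:
        # 1. File extension filtering
        if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
            continue
        # 2. Hash calculation (SHA-256 is an assumption)
        digest = hashlib.sha256(data).hexdigest()
        # 3. Redundancy removal: skip content that was already seen
        if digest in seen_hashes:
            continue
        seen_hashes.add(digest)
        # 4. File type identification
        kept.append({"name": name, "type": identify_type(data), "sha256": digest})
    return kept
```

Hashing the content rather than comparing file names means that two copies of the same file uploaded under different names are still recognized as exact duplicates.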
Processing:
- Recursive text data and metadata extraction from various types of e-mails and files
- Recursive embedded object extraction
- Data staging and indexing
- Data analysis: file type analysis, near-duplicate identification, e-mail inclusion/redundancy analysis
- TIFF generation
- OCR processing
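
Near-duplicate identification, mentioned in the data analysis step above, is commonly approached by comparing sets of overlapping word sequences (shingles) with Jaccard similarity. The sketch below shows that general technique only; whether the engine uses it is not stated, and the shingle size, the threshold, and all function names are assumptions.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in the text (lowercased)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs, threshold=0.6):
    """Return pairs of document ids whose shingle sets meet the threshold.

    docs maps a document id to its extracted text.
    """
    ids = list(docs)
    sets = {doc_id: shingles(docs[doc_id]) for doc_id in ids}
    pairs = []
    for i, left in enumerate(ids):
        for right in ids[i + 1:]:
            if jaccard(sets[left], sets[right]) >= threshold:
                pairs.append((left, right))
    return pairs


docs = {
    "a": "the quarterly report is attached for your review today",
    "b": "the quarterly report is attached for your review now",
    "c": "lunch menu for friday",
}
pairs = near_duplicates(docs)
```

The pairwise comparison is quadratic in the number of documents, so a large collection would normally need an approximate scheme such as MinHash in front of it; that refinement is omitted here for brevity.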