GtkSort
Last release 0.3.3

About

GtkSort is a multiprocessor external disk sorter and data manipulator for systems that support GTK+. Its source code has been ported to, and tested to produce valid results on, Linux/x86, Linux/amd64, Linux/alpha, HP-UX/hp-pa 11.11i, Tru64 5.1B, Solaris 10/x86 and Win32 NT-class operating systems.

GtkSort processes files with multiple threads in order to implement parallel algorithms. It overlaps disk I/O with sorting, which reduces I/O waits. By using only sequential files, it makes the most of the disk's I/O bandwidth. By giving exclusive read or write permission on each processed file to only one thread, it minimizes the conflicting I/O requests that reduce the hard disk's efficiency. To sort records, GtkSort uses either the standard library quick sort or its own implementation of Most Significant Byte (MSB) radix sort, depending on the data type of the sort keys. GtkSort uses cache-efficient algorithms that keep the cache miss rate very low (0.4% in version 0.2.0).
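To illustrate the general idea behind MSB radix sort on fixed-size binary keys, here is a minimal sketch in Python. It is not GtkSort's implementation: the function name `msb_radix_sort`, its parameters, and the `cutoff` heuristic are assumptions made for this example only.

```python
def msb_radix_sort(records, key_len, key_offset=0, cutoff=16):
    """Sort fixed-length byte records by a fixed-size key, most significant byte first.

    Illustrative sketch only; names and the cutoff value are hypothetical.
    """
    key_end = key_offset + key_len

    def sort_bucket(items, depth):
        # Every key byte has been examined: all records in this bucket have
        # equal keys, so their relative order is left unchanged.
        if depth >= key_len:
            return items
        # Small buckets are cheaper to finish with a comparison sort on the
        # key bytes that have not been examined yet.
        if len(items) <= cutoff:
            return sorted(items, key=lambda r: r[key_offset + depth:key_end])
        # Distribute the records into 256 buckets by the current key byte.
        buckets = [[] for _ in range(256)]
        for rec in items:
            buckets[rec[key_offset + depth]].append(rec)
        # Recurse into each non-empty bucket on the next, less significant byte.
        out = []
        for bucket in buckets:
            if bucket:
                out.extend(sort_bucket(bucket, depth + 1))
        return out

    return sort_bucket(list(records), 0)
```

Because each pass only looks at one byte of the key, the sort compares raw bytes rather than interpreting the key, which is why this approach suits binary and fixed-size text keys.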

GtkSort integrates a Graphical User Interface (GUI) based on GTK+ to make it friendlier to the end user. It also integrates a Command Line Interface (CLI) so that it can be used in shell scripts.

GtkSort is free and open source software distributed under the terms of the GNU General Public License version 2.


What GtkSort Can Do
  • Sort large data sets much faster than the standard sort utility of the operating system.

  • Sort by binary data or text data keys.

  • Sort on up to twelve keys.

  • Support Fixed Length Record (FLR) text or binary files with fixed size keys.

  • Support Variable Length Record (VLR) text files (lines of text) with delimited or fixed size keys.

  • Use multiple processors and disks in parallel.

  • Perform ascending or descending sort of each key.

  • Preview the keys of the unsorted input and the sorted output.

  • Use memory dynamically according to the limitations set by the user.

  • Exploit CPU and file system specific characteristics such as the L2 cache size, the D cache size and the disk I/O block size.
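
As a rough illustration of what sorting Variable Length Record text on several delimited keys, each ascending or descending, amounts to, here is a small Python sketch. It does not reflect GtkSort's actual key syntax or internals: the function `sort_delimited_lines`, the `(field_index, descending)` key format, and the comma delimiter are assumptions made for this example. Only the twelve-key limit comes from the list above.

```python
from operator import itemgetter

MAX_KEYS = 12  # key limit stated in the feature list above


def sort_delimited_lines(lines, keys, delimiter=","):
    """Sort text lines on several delimited fields.

    keys is a list of (field_index, descending) pairs, most significant
    key first. This key format is hypothetical, chosen for the example.
    """
    if len(keys) > MAX_KEYS:
        raise ValueError("at most %d keys are supported" % MAX_KEYS)
    rows = [line.split(delimiter) for line in lines]
    # Apply stable sorts from the least significant key to the most
    # significant one, so that earlier keys take precedence.
    for field_index, descending in reversed(keys):
        rows.sort(key=itemgetter(field_index), reverse=descending)
    return [delimiter.join(row) for row in rows]


lines = ["smith,anna,30", "jones,bob,25", "smith,carl,41", "jones,amy,25"]
# Field 0 ascending, then field 2 descending (fields are compared as text).
result = sort_delimited_lines(lines, [(0, False), (2, True)])
# result == ["jones,bob,25", "jones,amy,25", "smith,carl,41", "smith,anna,30"]
```

Records whose keys are all equal keep their input order here, because Python's list sort is stable; whether GtkSort gives the same guarantee is not stated above.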