pySplit - a Python file splitter

File archives are one of the things you tend to come across on the Internet. Because bandwidth is expensive, files get compressed so that they can be transferred more quickly.

Files are also split into chunks for various reasons. One of the more common reasons is to post them to Usenet via NNTP. Usenet predates the WWW and can be thought of as a massive bulletin board (Google Groups is a giant catalog of Usenet posts). Messages are posted to a news server and propagate to, in theory, every other news server.

A file can be attached to a post by converting the file into text. There are different ways to do this conversion, but uuencode and yEnc are the most common. The newsreader program converts the file (say, an image) into message text.
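To make the "file into text" step concrete, here is a minimal sketch of uuencoding using Python's standard binascii module. The helper names uu_encode_lines and uu_decode_lines are my own for illustration, and the sketch only produces the encoded body lines, not the begin/end wrapper a newsreader would add.

```python
import binascii


def uu_encode_lines(data: bytes) -> list[str]:
    """Return the uuencoded body lines for data (no begin/end wrapper)."""
    lines = []
    # uuencode packs at most 45 input bytes into each output line.
    for offset in range(0, len(data), 45):
        chunk = data[offset:offset + 45]
        # b2a_uu returns one encoded line ending in a newline; drop the newline.
        lines.append(binascii.b2a_uu(chunk).decode("ascii").rstrip("\n"))
    return lines


def uu_decode_lines(lines: list[str]) -> bytes:
    """Reverse uu_encode_lines: decode each line and glue the bytes together."""
    return b"".join(binascii.a2b_uu(line.encode("ascii")) for line in lines)
```

Every 3 bytes of input become 4 printable characters, which is why the encoded text ends up larger than the original file.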

Take this JPEG file: [image: met_enkeph icon]

Encoded with uuencode, the result is a 5k text file. Encoded with yEnc, it is a slightly smaller text file at 4k.

Now imagine a 100 MB file. Because of message size limits, it's common to split the file into multiple chunks. Anyone who wants the file downloads all the parts and combines them. With the addition of parity volumes, missing chunks can be recreated on the receiving end.

Using some of the common file splitters we get a bunch of files named:
  filename.ext.001
  filename.ext.002
  filename.ext.003
  filename.ext.004
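The splitting itself is simple. The following is only a sketch of the idea, not pySplit's actual code: the function name split_file and its parameters are made up for illustration, and it reads one whole chunk into memory at a time.

```python
from pathlib import Path


def split_file(source: Path, chunk_size: int) -> list[Path]:
    """Write source out as source.001, source.002, ... and return the parts."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    parts = []
    with source.open("rb") as src:
        index = 1
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break  # end of file reached
            # Zero-padded three-digit suffix, matching the naming shown above.
            part = source.with_name(f"{source.name}.{index:03d}")
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts
```

Calling split_file(Path("filename.ext"), 1_000_000) would write roughly 1 MB parts next to the original file; the last part holds whatever is left over.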

Unfortunately, both rar archives and arj archives can use exactly the same naming format.
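One way to tell them apart is to look at the first few bytes of a part instead of its name. This is a rough sketch, assuming the usual signatures: rar files (format 1.5 and later) begin with "Rar!" followed by 0x1A 0x07, and arj files begin with the two bytes 0x60 0xEA. The names sniff_archive and sniff_file are hypothetical, and a two-byte signature is weak, so an "arj" answer is a hint rather than proof.

```python
RAR_MAGIC = b"Rar!\x1a\x07"  # shared prefix of the RAR 1.5-4.x and RAR 5 signatures
ARJ_MAGIC = b"\x60\xea"      # ARJ header id; only two bytes, so false positives are possible


def sniff_archive(first_bytes: bytes) -> str:
    """Guess the archive type from the leading bytes of a file."""
    if first_bytes.startswith(RAR_MAGIC):
        return "rar"
    if first_bytes.startswith(ARJ_MAGIC):
        return "arj"
    return "unknown"


def sniff_file(path) -> str:
    """Read just enough of the file at path to run sniff_archive on it."""
    with open(path, "rb") as fh:
        return sniff_archive(fh.read(8))
```

A plain split of some other file will usually come back as "unknown". If one big archive was chopped up by a generic splitter, only the first part carries the signature.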

Using cat you can combine the parts:

cat filename.ext.00* > filename.ext
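The same join can be done in Python, which also works on Windows where cat is not usually available. This is a sketch under my own assumptions: the function name join_parts is invented, parts are matched by a three-digit suffix, and a .000 file is skipped on the assumption that it is a header file rather than data.

```python
import re
import shutil
from pathlib import Path


def join_parts(base: Path, output: Path) -> int:
    """Concatenate base.001, base.002, ... into output; return the part count."""
    pattern = re.compile(re.escape(base.name) + r"\.(\d{3})")
    numbered = []
    for candidate in base.parent.iterdir():
        match = pattern.fullmatch(candidate.name)
        # Part 000 is assumed to be a splitter header file, not file data.
        if match and int(match.group(1)) > 0:
            numbered.append((int(match.group(1)), candidate))
    numbered.sort()  # join in numeric order: 001, 002, ..., 010, ...
    with output.open("wb") as dst:
        for _, part in numbered:
            with part.open("rb") as src:
                shutil.copyfileobj(src, dst)  # streams, so large parts are fine
    return len(numbered)
```

Unlike the 00* glob, which stops matching after part 009, this picks up every three-digit part.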

But I wanted something nicer. My wish list:

  • run on Linux and Windows
  • tell whether something is an arj or a rar archive
  • read the header files created from other file splitters (.000 files)
  • split files based on file size or file count
  • create .000 files, .crc files and batch files/shell scripts to rejoin the split files without a specific joiner
  • A Pony
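Two items on that list come down to a little arithmetic. The sketch below shows how a requested file count could be turned into a chunk size, and how a CRC-32 could be computed for a file with the standard zlib module. Both function names are hypothetical, and the sketch says nothing about the layout of .000 or .crc files; it only computes the numbers that might go into them.

```python
import zlib


def chunk_size_for_count(total_size: int, part_count: int) -> int:
    """Chunk size needed so total_size bytes fit in at most part_count parts."""
    if part_count <= 0:
        raise ValueError("part_count must be positive")
    # Ceiling division, so the last part is the short one.
    return max(1, -(-total_size // part_count))


def crc32_of_file(path, block_size: int = 1 << 20) -> int:
    """CRC-32 of a file, read in blocks so large files need little memory."""
    crc = 0
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            crc = zlib.crc32(block, crc)  # feed the running value back in
    return crc
```

For example, a 10-byte file split into 3 parts needs 4-byte chunks, giving parts of 4, 4 and 2 bytes.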


