JustDATA File Embedder
Posted: Tue Mar 05, 2013 12:50 am
JustDATA is a little utility for automatically embedding any and all types of files into JustBASIC\Liberty BASIC programs through DATA statements, and then ReGenerating them, either individually or together, exactly when and where you need them. I've been using these Embed and ReGenerate routines (originally based on Rutger's All2Bas utility) to embed files in my projects for quite some time now, but I've finally decided to polish up the routines and flesh out the application so that others may have a chance to use it.
Here are some of its features:
Allows you to embed one file, several files, or a whole directory of files.
Choose to include the ReGeneration routine as a GOSUB, FUNCTION, or SUB... or not at all.
All settings needed to ReGenerate the files are contained inside of the generated subroutine.
Automatically sorts the folder contents alphabetically.
Displays percentage complete, Filename, number of Files, current\total File Size, Execution Time and Bytes per Second.
Build the target file either to the Source Folder or the Default Folder.
Select a FileName or use the suggested default.
Append the routine to the end of existing files.
Allows aborting an operation, where you may choose to Save, Delete, or Continue.
Choose to keep the Absolute Path for the ReGenerated files.
Choose different ByteOffset or Control Characters for Substitution, Replication or CR+LF.
Change the maximum allowed line length of the generated DATA statements.
Change several other settings like which characters to substitute, byte offset, etc...
When calling any of the ReGeneration routines, you may provide an optional FileName as an argument (JDfile$). If provided, the embedded files will be searched and only that one file will be rebuilt if found; if it is not found, no files are regenerated. If a FileName is not provided, all of the embedded files will be regenerated. You may also set the GLOBAL variable JDpath$ if you'd like to specify a target path other than DefaultDir$. The routines either return or set the number of files regenerated using the variable JustDATA.
I've tested this utility on literally thousands of different files with a 100% success rate for byte-accurate ReGeneration. Only really big files (roughly 8-10 MegaBytes or more) seem to have any problems... they take forever! My 2.1 GHz QuadCore pulls about 15 KiBytes/sec on most encodes and about 43 KiBytes/sec on most decodes.
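For anyone who wants the calling rules for the ReGeneration routines spelled out, here is a rough Python model of them (Python only because it is easy to run and check; the generated routines themselves are BASIC). The function name `regenerate`, the `embedded` dictionary, and the exact-match filename comparison are all assumptions made for illustration, not how the generated subroutine is actually written.

```python
import os

def regenerate(embedded, jd_file="", jd_path=None):
    """Model of the calling rules: returns the number of files rebuilt.

    embedded : dict mapping file name -> bytes (stand-in for the DATA statements)
    jd_file  : optional single file to rebuild (plays the role of JDfile$)
    jd_path  : optional target folder (plays the role of JDpath$);
               the current directory stands in for DefaultDir$
    """
    target = jd_path if jd_path else os.getcwd()
    count = 0
    for name, payload in embedded.items():
        # With a file name given, skip everything else (exact match assumed).
        if jd_file and name != jd_file:
            continue
        with open(os.path.join(target, name), "wb") as fh:
            fh.write(payload)
        count += 1
    return count  # 0 when jd_file was given but not found
```

Calling `regenerate(files)` rebuilds everything, `regenerate(files, "readme.txt")` rebuilds at most one file, and a name that is not embedded rebuilds nothing and returns 0.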
JustDATA uses a simple byte-wise form of RLE encoding for chains of repeating characters and exchanges unprintable characters on a 1-for-2 basis, both by using a character flag. The repeat flag is only used if it will actually shorten the sequence, which means runs of more than 4 printable characters or more than 2 unprintable characters. Since TextFiles are most of what I use JustDATA for, I've also included a third character flag that exchanges out the very common (and after encoding, rather wordy 4-byte long) CarriageReturn + LineFeed combo... all in an effort to reduce the size of the generated routines. If you'd like, you can de-activate the CR+LF flag by setting it to "0" in the options; the other two flags are mandatory.
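To make the three-flag idea above concrete, here is a minimal Python sketch of an encoder and decoder in the same spirit. It is not JustDATA's actual format: which of the control characters 3, 4 and 5 plays which role, the byte offset of 64, and the two-hex-digit repeat count are all assumptions. The count width was picked so that a repeat costs 4 bytes for a printable character and 5 for an unprintable one, which reproduces the "more than 4" and "more than 2" thresholds.

```python
SUB, REP, CRLF = 3, 4, 5    # assumed roles for the three control characters
OFFSET = 64                 # hypothetical byte offset for substituted bytes
UNSAFE = {0, 9, 10, 13, 34, SUB, REP, CRLF}

def _unit(b):
    """One byte as it appears in the output: unsafe bytes become SUB + shifted byte."""
    return bytes([SUB, b + OFFSET]) if b in UNSAFE else bytes([b])

def encode(data):
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i:i + 2] == b"\r\n":        # CR+LF collapses to a single flag
            out.append(CRLF)
            i += 2
            continue
        b = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == b and run < 255:
            run += 1
        unit = _unit(b)
        packed = bytes([REP]) + b"%02X" % run + unit
        # Use the repeat flag only when it is strictly shorter than writing the run out.
        out += packed if len(packed) < run * len(unit) else unit * run
        i += run
    return bytes(out)

def _read_unit(enc, i):
    """Read one (possibly substituted) byte at position i; return (byte, next index)."""
    if enc[i] == SUB:
        return enc[i + 1] - OFFSET, i + 2
    return enc[i], i + 1

def decode(enc):
    out = bytearray()
    i = 0
    while i < len(enc):
        if enc[i] == CRLF:
            out += b"\r\n"
            i += 1
        elif enc[i] == REP:
            run = int(enc[i + 1:i + 3], 16)  # two hex digits give the run length
            b, i = _read_unit(enc, i + 3)
            out += bytes([b]) * run
        else:
            b, i = _read_unit(enc, i)
            out.append(b)
    return bytes(out)
```

With these assumptions, four identical printable characters are written out as-is while five are packed, and two NULs are substituted individually while three are packed. The encoded output never contains Null, Tab, LineFeed, CarriageReturn or Double-Quote, so it could sit inside a quoted DATA string.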
JustDATA has the potential to actually compress filesizes (and it will if working with uncompressed BitMaps, heavily formatted TextFiles, or the like), but for executables and pre-compressed data (zip, rar, 7z) it usually increases the filesize to under 110% of its original size... still pretty good for BASIC DATA statements. This is typical of the default settings, as the JustBASIC editor (which was written in Liberty BASIC, right?) seems to have only 5 characters that cannot be contained in a DATA statement: Null, Tab, LineFeed, CarriageReturn, and Double-Quote (ASCII 0, 9, 10, 13, 34). Add in the 3 default control characters (ASCII 3, 4, 5) and there are only 8 values out of 256 that serve to inflate the filesize. Since the DATA output is completely linear, this allows the close regulation of the DATA statement length, which results in fewer DATA statements and fewer repetitions of the required 7 characters per line (i.e. DATA "")... in fact, the output is so linear that you can even edit the DATA statements after they've been generated and they will still ReGenerate without any issues (excepting your edits, of course).
Please download, give it a try, and leave a post if you find any bugs, glaring omissions, or obvious errors. Hopefully others may find this utility to be as useful as I have.
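The size figures above can be roughed out with a little arithmetic. The Python sketch below assumes the input bytes are uniformly random (a fair stand-in for zip, rar or 7z data, where run-length encoding rarely helps), that every DATA line is filled to its maximum length, and that each line ends with a 2-byte CR+LF in the .bas file. The 255-character line length used in the note afterwards is only an example value, not necessarily JustDATA's default.

```python
def bytes_per_input_byte(unsafe_values=8):
    """Average encoded size of one random byte: unsafe values cost 2 bytes, the rest 1."""
    return 1 + unsafe_values / 256

def expansion_ratio(line_len, unsafe_values=8, wrapper=7, newline=2):
    """Estimated output size divided by input size.

    line_len : full length of one DATA line, including the 7-character wrapper
    wrapper  : the fixed characters of  DATA ""  on every line
    newline  : assumed line terminator size in the generated source file
    """
    payload = line_len - wrapper
    if payload <= 0:
        raise ValueError("line_len must be longer than the DATA wrapper")
    return bytes_per_input_byte(unsafe_values) * (line_len + newline) / payload

if __name__ == "__main__":
    for length in (80, 255, 1000):
        print(length, round(expansion_ratio(length), 4))
```

Under these assumptions the substitution alone costs 8/256, or about 3%, and a 255-character line adds roughly another 3.6% for the wrapper and line ending, which lands comfortably under the 110% quoted above. Shorter lines repeat the wrapper more often, so the ratio climbs as the line length drops.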
Edit: Mere hours after uploading, I found an error in the retention of variables selected by the "Options" button, causing the generated subroutines to always have default values... which works great, as long as you never change any of the settings. ^_^
I've corrected it and re-uploaded the archive.