Oracle: Zipped Files

  • ian / 400 / Fri, 05 Jun 2009 11:50:00 GMT / Comments (9)
  • Hello

    Is there any way to extract data from a text file that has been zipped, without unzipping it first.

    I have been using UTL_FILE to parse through normal text files and transfer the data it into an APEX table, which works fine. However there is the likelihood that we could receive files in that are zipped and I was hoping to keep the process as automated as possible.

    Cheers ian
  • Keywords:

    zipped, files, oracle

  • http://database.itags.org/oracle/291708/«« Last Thread - Next Thread »»
    1. Hi Ian,

      Well...it's a bit of a loaded question really, since I guess you're asked if you can access a file without *specifically* unzipping it first, since by definition in order to read the original data there has to be some form of unzipping, even if it happens 'automagically' in-memory without you having to do it yourself.

      My first stab at your problem would probably be to load the content of your file into a BLOB and then use the UTL_COMPRESS package to uncompress it and then pass the resulting uncompressed blob through to your parsing routine (with whatever modifications you need to make).

      Hope this helps.

      jes | Sat, 23 Feb 2008 13:21:00 GMT |

    2. Thanks Jes

      I have just tried using utl_compress. I have imported the zip file into a blob field then tried to uncompress the blob field using utl_compress.lz_uncompress(blob).
      Unfortunately this then failed with the error;

      ORA-29294: A data error occurred during compression or uncompression.

      Any ideas?

      Not sure whether this is relevent but the source file I am trying to uncompress is a text file.

      Cheers ian

      ian | Sat, 23 Feb 2008 13:22:00 GMT |

    3. Hi Ian,

      The input file would have been in the same zipped format that the utl_compress package is expecting it to be in (i.e. can you try using a file that you have *compressed* using the same package to rule out different compression algorithm problems).

      I'll try and knock up an example later if I get a chance.

      jes | Sat, 23 Feb 2008 13:23:00 GMT |

    4. Sorry Jes I am not quite sure how to compress a blob and export it as a zip. I have been able to compress a blob and then uncompressed the blob without issue if that helps.

      Apologies my PL/SQL skills are a bit weak at the moment, I am picking up things as I go along!!!

      So far I have used both winzip and Window XPs own zipping process to zip the text files and both formats I am unable to uncompress the blobs once imported.

      Cheers again.

      ian | Sat, 23 Feb 2008 13:24:00 GMT |

    5. Hi Ian,

      Ok here's "rough" example, note that I knocked it up while eating lunch so you may need to adapt it to how you need yours to work -


      On my laptop I copied a dictionary file into the /tmp directory to use as an example text file
      [jes...powerbook tmp]$ cp /usr/share/dict/propernames data.txt
      check the size of the uncompressed text file
      [jes...powerbook tmp]$ ls -al data.txt
      -r--r--r-- 1 jes wheel 8640 Aug 30 14:30 data.txt
      show the first few lines of the file
      [jes...powerbook tmp]$ head data.txt
      Aaron
      Adam
      Adlai
      Adrian
      Agatha
      Ahmed
      Ahmet
      Aimee
      Amy
      Ami

      check what type of file the OS thinks it is
      [jes...powerbook tmp]$ file data.txt
      data.txt: ASCII English text

      Create a table to store the blob
      jes...10gR2> create table blob_files (id int primary key, data blob);

      Table created.

      Create the directory to be used with the bfile loading
      jes...10gR2> create or replace directory file_dir as '/tmp';

      Directory created.

      1 declare
      2 l_blob blob;

      3 l_compressed_blob blob;

      4 l_buffer RAW(32767);

      5 l_amount BINARY_INTEGER := 32767;

      6 l_pos INTEGER := 1;

      7 l_blob_length INTEGER;

      8 l_bfile bfile;

      9 l_output_file UTL_FILE.FILE_TYPE;

      10 begin
      11 -- insert a record into the blob_files table and return back the empty blobs
      12 insert into blob_files values (1, empty_blob(), empty_blob() )
      returning data, compressed_data into l_blob, l_compressed_blob;

      13 -- bfile pointer to the file on the filesystem
      14 l_bfile := bfilename( 'FILE_DIR', 'data.txt' );

      15 -- open the bfile and load it into l_blob
      16 dbms_lob.fileopen(l_bfile );

      17 dbms_lob.loadfromfile(l_blob, l_bfile, dbms_lob.getlength( l_bfile ) );

      18 dbms_lob.fileclose(l_bfile );

      19 -- compress the blob
      20 UTL_COMPRESS.lz_compress (src => l_blob, dst => l_compressed_blob);

      21 -- pointer to our output file on the filesystem
      22 l_output_file := UTL_FILE.fopen('FILE_DIR', 'data.zip', 'wb', 32767);

      23 -- grab the length of the compressed blob
      24 l_blob_length := DBMS_LOB.getlength(l_compressed_blob);

      25 -- loop round the compressed blob and output it to the file
      26 while l_pos <
      l_blob_length LOOP
      27 DBMS_LOB.read(l_compressed_blob, l_amount, l_pos, l_buffer);

      28 UTL_FILE.put_raw(l_output_file, l_buffer, TRUE);

      29 utl_file.fflush(l_output_file);

      30 l_pos := l_pos + l_amount;

      31 END LOOP;

      32 -- flush the output and close the file
      33 utl_file.fflush(l_output_file);

      34 UTL_FILE.fclose(l_output_file);

      35* end;

      36 /

      PL/SQL procedure successfully completed.

      Compare the sizes of the data
      1* select dbms_lob.getlength(data) original_length,
      dbms_lob.getlength(compressed_data) compressed_length
      from blob_files

      ORIGINAL_LENGTH COMPRESSED_LENGTH
      -- --
      8640 3926

      Back in the filesystem, check the zipfile has been created
      [jes...powerbook tmp]$ ls -al data*
      -r--r--r-- 1 jes wheel 8640 Aug 30 14:30 data.txt
      -rw-r--r-- 1 oracle wheel 3926 Aug 30 14:31 data.zip
      OS recognises it as a zipfile
      [jes...powerbook tmp]$ file data.zip
      data.zip: gzip compressed data, from Unix
      It's definitely a zipfile
      [jes...powerbook tmp]$ gzip -l data.zip
      compressed uncompressed ratio uncompressed_name
      3926 8640 54.8% data.zip

      copy the zipfile to use the 'gz' suffix to make it work with gunzip easier
      [jes...powerbook tmp]$ cp data.zip test.gz

      [jes...powerbook tmp]$ gunzip test.gz

      confirm the contents are still the same as the original file
      [jes...powerbook tmp]$ head test
      Aaron
      Adam
      Adlai
      Adrian
      Agatha
      Ahmed
      Ahmet
      Aimee
      Amy
      Ami


      Hope this helps,

      John.

      jes | Sat, 23 Feb 2008 13:25:00 GMT |

    6. Thanks John

      Got that to work fine, to import a text file, zip it then import that zip file back in and unzip it.

      However, what does not seem to work is when I zip up a file using e.g Winzip and try and import and uncompress that file. Which is really what I want to try and do.

      One thing I also noticed was that when you produce a zip file using utl_compress and try and open that file using Winzip, it fails, unless you change the file extension to ".gz"

      Cheers ian

      ian | Sat, 23 Feb 2008 13:26:00 GMT |

    7. Hi Ian,

      >However, what does not seem to work is when I zip up a file using e.g Winzip and try and import and uncompress that file. Which is really what I want to try and do.

      Hmmm works fine for me if I adapt my example to import a file that I have already gzipped (my laptop is a Mac, so I can't speak for Winzip). I believe that Winzip can handle decompressing gzipped files, but you need to give it the correct suffix under windows for it to recognise it as such (it doesn't examine the file header to determine the compression type).

      So I would say that the utl_compress package routines are fully compatible with gzip compression, but not with the compression used with Winzip, unless someone knows different?

      jes | Sat, 23 Feb 2008 13:27:00 GMT |

    8. John - the developers we had in for a piece of work last year used a Java class to unzip files.

      The application allows the user to upload a file zipped either with Winzip or XP compressed folders, which is then unzipped through a Java call and the data processed.

      I have to say that although this works pretty well, very occasionally it fails and you have to re-zip the file to get it to work.

      Hope this helps.

      John.

      johnvaughan | Sat, 23 Feb 2008 13:28:00 GMT |

    9. Hi John

      I starting to think that using java maybe the only option, unfortunately, I have never used java before. Have you got any example of the java code you used? I am assuming the java function is used to uncompresses the blob field?

      Cheers ian

      ian | Sat, 23 Feb 2008 13:29:00 GMT |